Inference module

Inference module for PCM toolbox with main functionality for model fitting and evaluation. @author: jdiedrichsen

inference.fit_model_group(Data, M, fixed_effect='block', fit_scale=False, scale_prior=1000.0, noise_cov=None, algorithm=None, optim_param={}, theta0=None, verbose=True, return_second_deriv=False, add_prior=False)

Fits PCM model(s) to a group of subjects.

The model parameters are (by default) shared across subjects. Scale and noise parameters are individual for each subject. Some model parameters can also be made individual by setting the corresponding entries of M.common_param to False.

Parameters:

Data (list of pcm.Datasets) – List of data sets, each with partition and condition descriptors
M (pcm.Model or list of pcm.Models) – Models to be fitted on the data sets. Optional field M.common_param indicates which model parameters are common to the group (True) and which ones are fit individually (False)
fixed_effect – None, ‘block’, or nd-array / list of nd-arrays. The default (‘block’) adds an intercept for each partition
fit_scale (bool) – Fit an additional scale parameter for each subject? Default is False.
scale_prior (float) – Prior variance for log-normal prior on scale parameter
algorithm (string) – Either ‘newton’ or ‘minimize’; overrides the model-specific algorithm
noise_cov – None (i.i.d), ‘block’, or optional specific covariance structure of the noise
optim_param (dict) – Additional parameters to be passed to the optimizer
theta0 (list of np.arrays) – List of starting values (same format as return argument theta)
verbose (bool) – Provide printout of progress? Default: True
return_second_deriv (bool) – Returns final Hessian of the loss function
add_prior (bool) – If set to true, optimizes likelihood + prior function

Returns:

T (pandas.dataframe) – Dataframe with the fields: SN: subject number; likelihood: log-likelihood; scale: scale parameter exp(theta_s) (only if fit_scale is True); noise: noise parameter exp(theta_eps); iterations: number of iterations for the model fit; time: elapsed time in sec
theta (list of np.arrays) – List of estimated model parameters, one vector per model, each with #num_commonparams + #num_singleparams x #numSubj elements
Hessian (list of np.arrays) – If return_second_deriv is True, a (n_params,n_params) ndarray for each model

inference.fit_model_group_crossval(Data, M, fixed_effect='block', fit_scale=False, scale_prior=1000.0, noise_cov=None, algorithm=None, optim_param={}, theta0=None, verbose=True)

Fits PCM model(s) to N-1 subjects and evaluates the likelihood on the Nth subject.

Only the common model parameters are shared across subjects. The scale and noise parameters are still fitted to each subject. Some model parameters can also be made individual by setting the corresponding entries of M.common_param to False.

Parameters:

Data (list of pcm.Datasets) – List of data sets, each with partition and condition descriptors
M (pcm.Model or list of pcm.Models) – Models to be fitted on the data sets. Optional field M.common_param indicates which model parameters are common to the group (True) and which ones are fit individually (False)
fixed_effect – None, ‘block’, or nd-array. The default (‘block’) adds an intercept for each partition
fit_scale (bool) – Fit an additional scale parameter for each subject? Default is False.
scale_prior (float) – Prior variance for log-normal prior on scale parameter
algorithm (string) – Either ‘newton’ or ‘minimize’; overrides the model-specific algorithm
noise_cov – None (i.i.d), ‘block’, or optional specific covariance structure of the noise
optim_param (dict) – Additional parameters to be passed to the optimizer
theta0 (list of np.arrays) – List of starting values (same format as return argument theta)
verbose (bool) – Provide printout of progress? Default: True

Returns:

T (pandas.dataframe) – Dataframe with the fields: SN: subject number; likelihood: log-likelihood; scale: scale parameter exp(theta_s) (only if fit_scale is True); noise: noise parameter exp(theta_eps); iterations: number of iterations for the model fit; time: elapsed time in sec
theta (list of np.arrays) – List of estimated model parameters - common group parameters come from the training data, individual parameters from the testing data
G_pred (list of np.arrays) – List of estimated G-matrices under the model
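The leave-one-subject-out scheme described for fit_model_group_crossval can be illustrated without the toolbox. The following is a minimal sketch only: the helper `leave_one_subject_out`, the toy `values`, and the "fit" (a mean over the training subjects) are all hypothetical stand-ins, not the PcmPy implementation.

```python
import numpy as np

def leave_one_subject_out(n_subj):
    """Yield (train_indices, test_index) pairs, one per held-out subject."""
    subjects = np.arange(n_subj)
    for test in subjects:
        train = subjects[subjects != test]  # the remaining N-1 subjects
        yield train, int(test)

# Toy stand-in for a model fit: the "common parameter" is the mean of the
# training subjects' values, and the evaluation is the squared error on the
# held-out subject. Both are assumptions made for illustration only.
values = np.array([1.0, 2.0, 3.0, 6.0])  # hypothetical per-subject summaries
errors = []
for train, test in leave_one_subject_out(len(values)):
    common = values[train].mean()                 # "fit" on N-1 subjects
    errors.append((values[test] - common) ** 2)   # evaluate on the Nth subject
errors = np.array(errors)
```

In the real routine the common group parameters would come from the training subjects while scale and noise parameters are still fitted to the held-out subject, as stated above.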

inference.fit_model_individ(Data, M, fixed_effect='block', fit_scale=False, scale_prior=1000.0, noise_cov=None, algorithm=None, optim_param={}, theta0=None, verbose=True, return_second_deriv=False, add_prior=False)

Fits Models to a data set individually.

The model parameters are all individually fit.

Parameters:

Data (pcm.Dataset or list of pcm.Datasets) – Data set or list of data sets, each with partition and condition descriptors
M (pcm.Model or list of pcm.Models) – Models to be fitted on the data sets
fixed_effect – None, ‘block’, or nd-array. The default (‘block’) adds an intercept for each partition
fit_scale (bool) – Fit an additional scale parameter for each subject? Default is False.
scale_prior (float) – Prior variance for log-normal prior on scale parameter
algorithm (string) – Either ‘newton’ or ‘minimize’; overrides the model-specific algorithm
noise_cov – None (i.i.d), ‘block’, or optional specific covariance structure of the noise
optim_param (dict) – Additional parameters to be passed to the optimizer
theta0 (np.array or list of np.arrays) – Starting values (n_param x n_subj - same as fitted theta)
verbose (bool) – Provide printout of progress? Default: True
return_second_deriv (bool) – Returns final Hessian of the loss function
add_prior (bool) – If set to true, optimizes likelihood + prior function

Returns:

T (pandas.dataframe) – Dataframe with the fields: SN: subject number; likelihood: log-likelihood; scale: scale parameter exp(theta_s) (only if fit_scale is True); noise: noise parameter exp(theta_eps); run: run parameter (if run = ‘random’); iterations: number of iterations for the model fit; time: elapsed time in sec
theta (list of np.arrays) – List of estimated model parameters, each an n_param x n_subj np.array
Hessian (list of np.arrays) – If return_second_deriv is True, a (nsubj,n_params,n_params) ndarray for each model

inference.get_scale0(G, G_hat)

Get approximate (log-)scaling parameter between predicted G and estimated G_hat

Parameters:

G (numpy.ndarray) – G matrix predicted by the model
G_hat (numpy.ndarray) – G estimated directly from the data

Returns:

scale0 – log-scaling parameter
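One plausible way to obtain such a starting value is a least-squares fit of G_hat onto G, followed by a logarithm. The sketch below assumes exactly that; the function name `approx_log_scale` and the floor `eps` are hypothetical, and the toolbox may compute scale0 differently.

```python
import numpy as np

def approx_log_scale(G, G_hat, eps=1e-5):
    """Least-squares estimate of s in G_hat ~ s * G, returned as log(s).

    Illustrative only: the floor `eps` is an assumption that keeps the
    logarithm finite when the estimated slope is zero or negative.
    """
    g = np.asarray(G, dtype=float).ravel()
    g_hat = np.asarray(G_hat, dtype=float).ravel()
    s = (g @ g_hat) / (g @ g)       # closed-form slope of a regression through the origin
    return np.log(max(s, eps))      # clip before taking the log

G = np.array([[1.0, 0.5], [0.5, 1.0]])
scale0 = approx_log_scale(G, 3.0 * G)   # G_hat is exactly three times G
```

Because the scale parameter is handled on the log scale (exp(theta_s) in the fitting routines), returning log(s) gives a value that can be used directly as a starting point.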

inference.group_to_individ_param(theta, M, n_subj, return_group_indices=False)

Takes a vector of group parameters and rearranges it so that it conforms to the theta you would get back from an individual fit

Parameters:

theta (ndarray) – n_gparam vector of group parameters
M (pcm.Model) – PCM model
n_subj (int) – Number of subjects

Returns:

theta_indiv (ndarray) – n_params x n_subj Matrix of individual parameters
indx_indiv (ndarray) – n_params x n_subj Matrix of indices into group parameters
indx_group (ndarray) – n_gparam Vector of indices into original model parameters
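The rearrangement can be pictured as an index lookup. The sketch below is an assumption-laden illustration: `expand_group_theta` is a hypothetical helper, and the layout of the group vector (all common parameters first, then the individual parameters subject by subject) is assumed rather than taken from the library.

```python
import numpy as np

def expand_group_theta(theta, common_param, n_subj):
    """Rearrange a group vector into an (n_params, n_subj) matrix.

    Assumed layout of `theta`: all common parameters first, followed by the
    individual parameters of subject 0, subject 1, and so on.
    """
    common_param = np.asarray(common_param, dtype=bool)
    n_params = common_param.size
    n_common = int(common_param.sum())
    n_single = n_params - n_common

    indx = np.zeros((n_params, n_subj), dtype=int)
    # Every subject points at the same entries for the common parameters
    indx[common_param, :] = np.arange(n_common)[:, None]
    # Individual parameters point at subject-specific entries
    indx[~common_param, :] = (
        n_common
        + np.arange(n_subj)[None, :] * n_single
        + np.arange(n_single)[:, None]
    )
    return np.asarray(theta)[indx], indx

theta = np.array([10.0, 20.0, 1.0, 2.0, 3.0])   # 2 common + 1 individual x 3 subjects
theta_indiv, indx_indiv = expand_group_theta(theta, [True, False, True], n_subj=3)
```

Rows of `theta_indiv` that belong to common parameters repeat the same value for every subject, while rows for individual parameters differ across columns.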

inference.likelihood_group(theta, M, YY, Z, X=None, Noise=<PcmPy.model.IndependentNoise object>, n_channel=1, fit_scale=True, scale_prior=1000.0, return_deriv=0, return_individ=False)

Negative log-likelihood of group data and its derivatives with respect to the parameters

Parameters:

theta (np.array) – Vector of (log-)model parameters consisting of the common model parameters (M.n_param or sum of M.common_param) followed by participant-specific parameters (iterated by subject): individual model parameters (not in common_param), scale parameter, noise parameters
M (pcm.Model) – Model object
YY (List of np.arrays) – List of NxN Matrix of outer product of the activity data (Y*Y’)
Z (List of 2d-np.array) – NxQ Design matrix - relating the trials (N) to the random effects (Q)
X (List of np.array) – Fixed effects design matrix - will be accounted for by ReML
Noise (List of pcm.Noisemodel) – Pcm-noise model (default: IndependentNoise)
n_channel (List of int) – Number of channels
fit_scale (bool) – Fit a scaling parameter for the model (default is True)
scale_prior (float) – Prior variance for log-normal prior on scale parameter
return_deriv (int) – 0: Only return negative log-likelihood (default); 1: Return first derivative; 2: Return first and second derivative
return_individ (bool) – Return individual likelihoods instead of group likelihood

Returns:

negloglike – Negative log-likelihood of the data under a model
dLdtheta (1d-np.array) – First derivative of negloglike with respect to the parameters
ddLdtheta2 (2d-np.array) – Second derivative of negloglike with respect to the parameters
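The composition of the group parameter vector described for theta can be made concrete with a small splitter. This is a sketch under stated assumptions: `split_group_theta` is a hypothetical helper, and the per-subject order (individual model parameters, then scale, then noise) follows the description above but has not been checked against the library's internal layout.

```python
import numpy as np

def split_group_theta(theta, n_common, n_single, n_subj, fit_scale=True, n_noise=1):
    """Split a group parameter vector into common and per-subject blocks.

    Assumed per-subject order: individual model parameters, then the scale
    parameter (only if fit_scale), then the noise parameter(s).
    """
    theta = np.asarray(theta, dtype=float)
    per_subj = n_single + int(fit_scale) + n_noise
    if theta.size != n_common + per_subj * n_subj:
        raise ValueError("theta has an unexpected length")

    common = theta[:n_common]
    blocks = theta[n_common:].reshape(n_subj, per_subj)
    subjects = []
    for b in blocks:
        subjects.append({
            "model": b[:n_single],
            "scale": b[n_single] if fit_scale else None,
            "noise": b[n_single + int(fit_scale):],
        })
    return common, subjects

theta = np.arange(8.0)  # 2 common + 3 subjects x (0 individual + scale + noise)
common, subjects = split_group_theta(theta, n_common=2, n_single=0, n_subj=3)
```

With return_individ set, one would expect one likelihood value per entry of `subjects` rather than a single sum.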

inference.likelihood_individ(theta, M, YY, Z, X=None, Noise=<PcmPy.model.IndependentNoise object>, n_channel=1, fit_scale=False, scale_prior=1000.0, return_deriv=0)

Negative log-likelihood of the data and its derivatives with respect to the parameters

Parameters:

theta (np.array) – Vector of (log-)model parameters - these include model, signal, and noise parameters
M (PcmPy.model.Model) – Model object with predict function
YY (2d-np.array) – NxN Matrix of outer product of the activity data (Y*Y’)
Z (2d-np.array) – NxQ Design matrix - relating the trials (N) to the random effects (Q)
X (np.array) – Fixed effects design matrix - will be accounted for by ReML
Noise (pcm.Noisemodel) – PCM noise model used to model block effects (default: IndependentNoise)
n_channel (int) – Number of channels
fit_scale (bool) – Fit a scaling parameter for the model (default is False)
scale_prior (float) – Prior variance for log-normal prior on scale parameter
return_deriv (int) – 0: Only return negative loglikelihood (default) 1: Return first derivative 2: Return first and second derivative

Returns:

negloglike (double) – Negative log-likelihood of the data under a model
dLdtheta (1d-np.array) – First derivative of negloglike with respect to the fitted parameters
ddLdtheta2 (2d-np.array) – Second derivative of negloglike with respect to the fitted parameters
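To show how the inputs YY, Z, and the log-parameters fit together, here is a minimal sketch of a Gaussian negative log-likelihood for the case without fixed effects (X=None) and with independent noise. The function `negloglike_sketch` is hypothetical; it omits the ReML correction and the derivatives, and its additive constant may differ from the one used by the toolbox.

```python
import numpy as np

def negloglike_sketch(theta_s, theta_eps, G, Z, YY, n_channel):
    """Gaussian negative log-likelihood without fixed effects (X=None).

    Assumed covariance: V = exp(theta_s) * Z G Z' + exp(theta_eps) * I
    """
    N = Z.shape[0]
    V = np.exp(theta_s) * (Z @ G @ Z.T) + np.exp(theta_eps) * np.eye(N)
    _, logdet = np.linalg.slogdet(V)            # numerically stable log-determinant
    quad = np.trace(np.linalg.solve(V, YY))     # trace(V^-1 Y Y')
    return 0.5 * (n_channel * (N * np.log(2 * np.pi) + logdet) + quad)

Z = np.eye(2)                               # 2 trials, 2 conditions
G = np.zeros((2, 2))                        # no signal: V reduces to the noise term
Y = np.array([[1.0, 0.0], [0.0, 2.0]])      # 2 trials x 2 channels
nll = negloglike_sketch(0.0, 0.0, G, Z, Y @ Y.T, n_channel=Y.shape[1])
```

Working with exp(theta_s) and exp(theta_eps) keeps the scale and noise variances positive for any real-valued theta, which is why the parameters are described as (log-)model parameters.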

inference.posterior_group(theta, M, YY, Z, X=None, Noise=<PcmPy.model.IndependentNoise object>, n_channel=1, fit_scale=True, scale_prior=1000.0, return_deriv=0, return_individ=False)

Returns the non-normalized negative log-posterior of the group model parameters. See likelihood_group for input and output parameters. Assignment of the prior still needs to be implemented.

inference.posterior_individ(theta, M, YY, Z, X=None, Noise=<PcmPy.model.IndependentNoise object>, n_channel=1, fit_scale=False, scale_prior=1000.0, return_deriv=0)

Returns the non-normalized negative log-posterior of the individual model parameters. See likelihood_individ for input and output parameters.
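A non-normalized negative log-posterior is the negative log-likelihood plus the negative log-prior, with constants dropped. The sketch below illustrates only that relationship; `neg_log_posterior`, the zero-mean Gaussian prior, and the toy likelihood are hypothetical placeholders and say nothing about which prior the toolbox assigns.

```python
import numpy as np

def neg_log_posterior(theta, negloglike, prior_var):
    """Non-normalized negative log-posterior: negloglike(theta) - log prior(theta).

    The zero-mean Gaussian prior with variance `prior_var` is a placeholder
    chosen for illustration; normalizing constants are dropped.
    """
    theta = np.asarray(theta, dtype=float)
    neg_log_prior = 0.5 * np.sum(theta ** 2 / prior_var)
    return negloglike(theta) + neg_log_prior

def toy_nll(th):
    """Toy negative log-likelihood with its minimum at theta = 2."""
    return 0.5 * np.sum((th - 2.0) ** 2)

value = neg_log_posterior(np.array([2.0]), toy_nll, prior_var=1000.0)
```

With a large prior variance such as 1000.0 the prior term is small, so the posterior mode stays close to the maximum-likelihood estimate.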

inference.sample_model_group(Data, M, fixed_effect='block', fit_scale=False, scale_prior=1000.0, noise_cov=None, n_mcmc_samples=10000, n_mcmc_chains=1, theta0=None, verbose=True, proposal_sd=None)

Approximates the posterior of the parameters of a group model using MCMC sampling

The model parameters are (by default) shared across subjects. Scale and noise parameters are individual for each subject. Some model parameters can also be made individual by setting the corresponding entries of M.common_param to False. The starting values for the group parameters are provided separately for the different chains.

Parameters:

Data (list of pcm.Datasets) – List of data sets, each with partition and condition descriptors
M (pcm.Model) – Model to be sampled
fixed_effect – None, ‘block’, or nd-array / list of nd-arrays. The default (‘block’) adds an intercept for each partition
fit_scale (bool) – Fit an additional scale parameter for each subject? Default is False.
scale_prior (float) – Prior variance for log-normal prior on scale parameter
noise_cov – None (i.i.d), ‘block’, or optional specific covariance structure of the noise
sample_param (dict) – Additional parameters to be passed to the MCMC sampler
theta0 (np.array) – starting values

Returns:

theta (np.array) – Sampled parameters
l (np.array) – Log-likelihood corresponding to the sampled parameters
G_pred (list of np.arrays) – List of estimated G-matrices under the model
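For readers unfamiliar with MCMC, the following is a minimal random-walk Metropolis sketch. It is a generic textbook sampler, not the sampler used by sample_model_group; the function `metropolis`, its `seed` argument, and the toy target are assumptions, and only the names n_samples and proposal_sd echo the signature above.

```python
import numpy as np

def metropolis(neg_log_post, theta0, n_samples=10000, proposal_sd=1.0, seed=0):
    """Random-walk Metropolis sampler for a non-normalized negative log-posterior."""
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    current = neg_log_post(theta)
    samples = np.empty((n_samples, theta.size))
    log_p = np.empty(n_samples)
    for i in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state
        proposal = theta + proposal_sd * rng.standard_normal(theta.size)
        candidate = neg_log_post(proposal)
        # Accept with probability min(1, exp(current - candidate))
        if np.log(rng.uniform()) < current - candidate:
            theta, current = proposal, candidate
        samples[i] = theta
        log_p[i] = -current   # non-normalized log-posterior of the stored sample
    return samples, log_p

def target(th):
    """Toy target: unit-variance normal centred at 1."""
    return 0.5 * np.sum((th - 1.0) ** 2)

samples, log_p = metropolis(target, theta0=[0.0], n_samples=5000, proposal_sd=1.0)
```

The proposal standard deviation trades off acceptance rate against step size: very small steps are almost always accepted but explore slowly, while very large steps are mostly rejected.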

inference.sample_model_individ(Data, M, fixed_effect='block', fit_scale=False, scale_prior=1000.0, noise_cov=None, n_mcmc_samples=10000, n_local_samples=2000)

Approximates the posterior of the parameters of an individual model using MCMC sampling. If requested, it also tries to approximate the marginal likelihood using the method outlined in Chib & Jeliazkov (2011).

Parameters:

Data (list of pcm.Datasets) – List of data sets, each with partition and condition descriptors
M (pcm.Model) – Model to be sampled
fixed_effect – None, ‘block’, or nd-array / list of nd-arrays. The default (‘block’) adds an intercept for each partition
fit_scale (bool) – Fit an additional scale parameter for each subject? Default is False.
scale_prior (float) – Prior variance for log-normal prior on scale parameter
noise_cov – None (i.i.d), ‘block’, or optional specific covariance structure of the noise
sample_param (dict) – Additional parameters to be passed to the MCMC sampler
theta0 (np.array) – starting values

Returns:

theta (np.array) – Sampled parameters
l (np.array) – Log-likelihood corresponding to the sampled parameters

inference.set_up_fit(Data, fixed_effect='block', noise_cov=None)

Utility routine that pre-calculates and sets up the design matrices, etc., for the PCM fit

Parameters:

Data (pcm.dataset) – Contains activity data (measurement), and obs_descriptors partition and condition
fixed_effect – Can be None, ‘block’, or a design matrix. ‘block’ includes an intercept for each partition.
noise_cov – Can be None (i.i.d. noise), ‘block’ (a common noise parameter), or a list of noise covariances for the different partitions

Returns:

Z (np.array) – Design matrix for random effects
X (np.array) – Design matrix for fixed effects
YY (np.array) – Quadratic form of the data (Y Y’)
Noise (pcm.model.NoiseModel) – Noise model
G_hat (np.array) – Crossvalidated estimate of second moment of U
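The first three return values can be pictured with plain NumPy. This is a sketch under assumptions: the helper `indicator` and the variables `cond_vec`, `part_vec`, and `Y` are hypothetical stand-ins for the condition and partition descriptors and the measurements of a data set, and the noise model and the crossvalidated G_hat are not covered.

```python
import numpy as np

def indicator(labels):
    """One column per unique label, with a 1 wherever a row carries that label."""
    labels = np.asarray(labels)
    return (labels[:, None] == np.unique(labels)[None, :]).astype(float)

# Hypothetical design: 3 conditions measured in 2 partitions, 4 channels
cond_vec = np.tile([0, 1, 2], 2)     # condition label of each trial
part_vec = np.repeat([0, 1], 3)      # partition label of each trial
Y = np.random.default_rng(0).standard_normal((6, 4))   # trials x channels

Z = indicator(cond_vec)   # random-effects design matrix, trials x conditions
X = indicator(part_vec)   # 'block' fixed effect: one intercept per partition
YY = Y @ Y.T              # quadratic form of the data, trials x trials
```

Pre-computing YY once is useful because the likelihood depends on the data only through this trials x trials matrix, so it does not need to be rebuilt at every optimizer step.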

inference.set_up_fit_group(Data, fixed_effect='block', noise_cov=None)

Pre-calculates and sets design matrices, etc for the PCM fit for a full group

Parameters:

Data (list of pcm.dataset) – Contains activity data (measurement), and obs_descriptors partition and condition
fixed_effect – Can be None, ‘block’, or a design matrix. ‘block’ includes an intercept for each partition.
noise_cov – Can be None (i.i.d. noise), ‘block’ (a common noise parameter), or a list of noise covariances for the different partitions

Returns:

Z (np.array) – Design matrix for random effects
X (np.array) – Design matrix for fixed effects
YY (np.array) – Quadratic form of the data (Y Y’)
Noise (NoiseModel) – Noise model
G_hat (np.array) – Crossvalidated estimate of second moment of U