API Reference

Classes and functions in the idcempy package. For a tutorial on how to use the package to estimate the ZiOP(C) and MiOP(C) models, see IDCeMPy Package.

For a tutorial on the GiMNL module, see gimnl_tutorial.

The zmiopc module

Classes and functions for the zmiopc module.

class zmiopc.OpModel(llik, coef, aic, vcov, data, xs, ts, x_, yx_, yncat, xstr, ystr)

Store model results from opmod().

class zmiopc.IopModel(modeltype, llik, coef, aic, vcov, data, xs, zs, ts, x_, yx_, z_, yncat, xstr, ystr, zstr)

Store model results from iopmod().

class zmiopc.IopCModel(modeltype, llik, coef, aic, vcov, data, xs, zs, ts, x_, yx_, z_, rho, yncat, xstr, ystr, zstr)

Store model results from iopcmod().

class zmiopc.FittedVals(responsefull, responseordered, responseinflation, linear)

Store fitted values for iOP models.

zmiopc.op(pstart, x, y, data, weights, offsetx)

Calculate likelihood function for Ordered Probit Model.

Parameters
  • pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.

  • x (pandas.DataFrame) – Covariates for the ordered stage.

  • y (pandas.DataFrame) – The dependent variable (DV).

  • data (pandas.DataFrame) – Dataset.

  • weights – Weights.

  • offsetx – Offset for covariates in the ordered stage.
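As a rough illustration of what an ordered probit likelihood computes, the sketch below evaluates category probabilities and a negative log-likelihood using only the standard library. It is not the package's implementation: the function names are hypothetical, the cutpoints are passed directly (how zmiopc.op() packs them into pstart is not described here), and weights and offsets are left out.

```python
from math import log
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def op_probabilities(xb, cutpoints):
    """Return P(y = j) for j = 0..J-1, given a linear predictor and increasing cutpoints."""
    bounds = [0.0] + [_PHI(c - xb) for c in cutpoints] + [1.0]
    return [bounds[j + 1] - bounds[j] for j in range(len(bounds) - 1)]


def op_negative_loglik(beta, cutpoints, rows, outcomes):
    """Negative log-likelihood; rows are covariate lists, outcomes are ints 0..J-1."""
    total = 0.0
    for x_row, y in zip(rows, outcomes):
        xb = sum(b * v for b, v in zip(beta, x_row))
        total -= log(op_probabilities(xb, cutpoints)[y])
    return total
```

A function of this shape is what an optimizer would minimize over the coefficients and cutpoints.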

zmiopc.ziop(pstart, x, y, z, data, weights, offsetx, offsetz)

Calculate likelihood function for Zero-inflated Ordered Probit Model.

Parameters
  • pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.

  • x (pandas.DataFrame) – Covariates for the ordered probit stage.

  • y (pandas.DataFrame) – The ordinal dependent variable (DV).

  • z (pandas.DataFrame) – Covariates for the inflation stage.

  • data (pandas.DataFrame) – Dataset with missing values listwise deleted.

  • weights (float) – Weights.

  • offsetx (float) – Offset for the ordered stage covariates (X).

  • offsetz (float) – Offset for the inflation stage covariates (Z).
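The two-stage structure can be sketched as follows, using the textbook zero-inflated ordered probit with independent errors: a probit on the inflation covariates decides whether an observation enters the ordered-probit regime, and the remaining mass is added to category 0. The function name and the direct use of linear predictors (xb, zg) are assumptions for illustration; the package's own parameterization of pstart, weights, and offsets is not reproduced.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def ziop_probabilities(xb, zg, cutpoints):
    """P(y = j), j = 0..J-1, for a zero-inflated ordered probit (independent errors)."""
    participate = _PHI(zg)  # probability of entering the ordered-probit regime
    cdf = [0.0] + [_PHI(c - xb) for c in cutpoints] + [1.0]
    ordered = [cdf[j + 1] - cdf[j] for j in range(len(cdf) - 1)]
    probs = [participate * p for p in ordered]
    probs[0] += 1.0 - participate  # the inflated mass lands on category 0
    return probs
```

With a very large zg the participation probability approaches 1 and the result collapses to the plain ordered probit probabilities.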

zmiopc.ziopc(pstart, x, y, z, data, weights, offsetx, offsetz)

Calculate likelihood function for Zero-inflated Correlated-Errors Model.

Parameters
  • pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.

  • x (pandas.DataFrame) – Covariates for the ordered probit stage.

  • y (pandas.DataFrame) – The dependent variable (DV).

  • z (pandas.DataFrame) – Covariates for the inflation stage.

  • data (pandas.DataFrame) – Dataset with missing values listwise deleted.

  • weights (float) – Weights.

  • offsetx (float) – Offset for the ordered probit stage covariates (X).

  • offsetz (float) – Offset for the inflation stage (Z).

zmiopc.miop(pstart, x, y, z, data, weights, offsetx, offsetz)

Likelihood function for Middle-inflated Ordered Probit Model “without” correlated errors.

The number of categories in the dependent variable must be odd.

Parameters
  • pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.

  • x (pandas.DataFrame) – Covariates in the ordered probit stage.

  • y (pandas.DataFrame) – The dependent variable (DV).

  • z (pandas.DataFrame) – Covariates in the inflation stage.

  • data (pandas.DataFrame) – Dataset with missing values listwise deleted.

  • weights (float) – Weights.

  • offsetx (float) – Offset for covariates in the ordered probit stage (X).

  • offsetz (float) – Offset for covariates in the inflation stage.
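The odd-category requirement follows from the need for a single middle category to receive the inflated mass. A minimal sketch under the textbook middle-inflated ordered probit with independent errors is shown below; the function name and the use of precomputed linear predictors are assumptions, not the package's code.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def miop_probabilities(xb, zg, cutpoints):
    """P(y = j), j = 0..J-1, for a middle-inflated ordered probit (independent errors)."""
    n_categories = len(cutpoints) + 1
    if n_categories % 2 == 0:
        raise ValueError("MiOP needs an odd number of outcome categories")
    middle = n_categories // 2  # index of the single middle category
    participate = _PHI(zg)
    cdf = [0.0] + [_PHI(c - xb) for c in cutpoints] + [1.0]
    probs = [participate * (cdf[j + 1] - cdf[j]) for j in range(n_categories)]
    probs[middle] += 1.0 - participate  # the inflated mass lands on the middle category
    return probs
```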

zmiopc.miopc(pstart, x, y, z, data, weights, offsetx, offsetz)

Likelihood function for Middle-inflated Correlated-Errors Model.

The number of categories in the dependent variable must be odd.

Parameters
  • pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.

  • x (pandas.DataFrame) – Covariates in the ordered probit stage.

  • y (pandas.DataFrame) – The dependent variable (DV).

  • z (pandas.DataFrame) – Covariates in the inflation stage.

  • data (pandas.DataFrame) – Dataset with missing values listwise deleted.

  • weights (float) – Weights.

  • offsetx (float) – Offset for the ordered probit stage covariates (X).

  • offsetz (float) – Offset for the covariates in the inflation stage (Z).

zmiopc.opresults(model, data, x, y)

Produce estimation results, part of opmod().

Parameters
  • model – Model estimation results obtained from minimization.

  • data (pandas.DataFrame) – Dataset.

  • x (list of str) – List of names for independent variables matching column names in data.

  • y – List of names for the dependent variable matching column names in data.

zmiopc.opmod(data, x, y, pstart=None, method='BFGS', weights=1, offsetx=0)

Estimate an Ordered Probit model and return an OpModel class object.

Parameters
  • pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.

  • data (pandas.DataFrame) – Full dataset used for estimation.

  • y (list of str) – Dependent Variable (DV).

  • method – Method for optimization, default is ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.

  • weights – Weights.

  • offsetx – Offset for covariates (X).

Returns

OpModel

zmiopc.iopresults(model, data, x, y, z, modeltype)

Produce estimation results, part of iopmod().

Parameters
  • model – Model estimation results obtained from minimization.

  • data (pandas.DataFrame) – Full dataset used for estimation.

  • x (list of str) – List of names for covariates in the ordered probit stage.

  • y (list of str) – List of names for the dependent variable.

  • z (list of str) – List of names for covariates in the inflation stage.

  • modeltype – Type of model. Options are: ‘ziop’ or ‘miop’.

zmiopc.iopcresults(model, data, x, y, z, modeltype)

Produce estimation results, part of iopcmod().

Parameters
  • model – Model estimation results obtained from minimization.

  • data (pandas.DataFrame) – Full dataset used for estimation.

  • x (list of str) – List of names for covariates in the ordered probit stage.

  • y (list of str) – List of names for the dependent variable.

  • z (list of str) – List of names for covariates in the inflation stage.

  • modeltype – Type of model. Options are: ‘ziopc’ or ‘miopc’.

zmiopc.iopmod(modeltype, data, x, y, z, pstart=None, method='BFGS', weights=1, offsetx=0, offsetz=0)

Estimate an iOP model (ZiOP or MiOP) and return an IopModel class object as output.

Parameters
  • pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.

  • data (pandas.DataFrame) – Full dataset used for estimation.

  • x (list of str) – Covariates in the ordered probit stage. Elements must match column names of data.

  • y (list of str) – Dependent variable (DV). Element must match column names of data.

  • z (list of str) – Inflation stage variable. Elements must match column names of data.

  • modeltype – Type of model to be estimated. Options are: ‘ziop’ or ‘miop’.

  • method – Method for optimization, default ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.

  • weights – Weights.

  • offsetx – Offset for ordered probit stage covariates (X).

  • offsetz – Offset for inflation stage covariates (Z).

Returns

IopModel
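Because x, y, and z are lists of column names rather than data, a small pre-check can catch naming mistakes before estimation starts. The sketch below is illustrative only: check_columns and fit_ziop are hypothetical helpers, the import path "from idcempy import zmiopc" is an assumption, the requirement of exactly one name in y is an assumption, and the iopmod() call simply mirrors the signature documented above.

```python
def check_columns(columns, x, y, z):
    """Raise if any name in x, y, or z is not among the dataset's column names."""
    missing = [name for name in (*x, *y, *z) if name not in columns]
    if missing:
        raise KeyError(f"not in data: {missing}")
    if len(y) != 1:
        # Assumption: the dependent variable is given as a one-element list.
        raise ValueError("y should name exactly one dependent variable")


def fit_ziop(data, x, y, z):
    """Validate the names, then estimate a ZiOP model with the documented defaults."""
    # Assumed import path; imported lazily so check_columns works without the package.
    from idcempy import zmiopc

    check_columns(list(data.columns), x, y, z)
    return zmiopc.iopmod("ziop", data, x, y, z, method="BFGS",
                         weights=1, offsetx=0, offsetz=0)
```

Passing 'miop' as the first argument would request the middle-inflated variant instead.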

zmiopc.iopcmod(modeltype, data, x, y, z, pstart=None, method='BFGS', weights=1, offsetx=0, offsetz=0)

Estimate an iOPC model (ZiOPC or MiOPC) and return an IopCModel class object.

Parameters
  • pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.

  • data (pandas.DataFrame) – Dataset used for estimation.

  • x (list of str) – Covariates for the ordered probit stage. Elements must match column names of data.

  • y (list of str) – The dependent variable (DV). Element must match column names of data.

  • z (list of str) – Covariates for the inflation stage. Elements must match column names of data.

  • modeltype – Type of model to be estimated. Options are: ‘ziopc’ or ‘miopc’.

  • method – Method for optimization, default is ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.

  • weights – Weights.

  • offsetx – Offset for variables in ordered probit stage (X).

  • offsetz – Offset for variables in inflation stage (Z).

Returns

IopCModel

zmiopc.iopfit(model)

Calculate probabilities from iopmod().

Parameters

model – IopModel object from iopmod().

Returns

FittedVals object with fitted values.

zmiopc.iopcfit(model)

Calculate fitted probabilities from iopcmod().

Parameters

model – IopCModel object from iopcmod().

Returns

FittedVals object with fitted values.

zmiopc.vuong_opiop(opmodel, iopmodel)

Run the Vuong test to compare the performance of the OP and iOP models.

Parameters
  • opmodel – The OP model from OpModel.

  • iopmodel – The iOP model from IopModel.

Returns

vuongopiop: Result of the Vuong test.
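For orientation, the classic Vuong statistic for non-nested models can be sketched from each model's probability of the observed outcome. This is a generic textbook version with hypothetical function names; whether the package applies a degrees-of-freedom correction, or which sign it assigns to which model, is not stated in this reference.

```python
from math import log, sqrt
from statistics import NormalDist, fmean, pstdev


def vuong_statistic(probs_a, probs_b):
    """Vuong z-statistic from per-observation probabilities of the observed outcome."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models need the same observations")
    # Pointwise log-likelihood ratios of model A against model B.
    m = [log(pa) - log(pb) for pa, pb in zip(probs_a, probs_b)]
    return sqrt(len(m)) * fmean(m) / pstdev(m)


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal reference distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))
```

In this sketch a large positive value favors the first model and a large negative value favors the second.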

zmiopc.vuong_opiopc(opmodel, iopcmodel)

Run the Vuong test to compare the performance of the OP and iOPC models.

Parameters
  • opmodel – The OP model from OpModel.

  • iopcmodel – The iOPC model from IopCModel.

Returns

vuongopiopc: Result of the Vuong test.

zmiopc.split_effects(model, inflvar, nsims=10000)

Calculate change in probability of being 0 in the split-probit stage.

This function calculates the predicted probabilities when the value of a variable in the split-probit equation changes. A chosen dummy variable is changed from 0 to 1, and a chosen numerical variable is set to its mean plus one standard deviation. Other variables are held at 0 or their mean value (Note: the current version of the function treats ordinal variables as numerical).

Parameters
  • model – IopModel or IopCModel.

  • inflvar (int) – Number representing the location of variable in the split-probit equation. (attribute .inflate of IopModel or IopCModel)

  • nsims (int) – Number of simulated observations, default is 10000.

Returns

changeprobs: A dataframe of the predicted probabilities when there is change in the variable (1) versus original values (0).
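The counterfactual described above can be sketched at the point estimates as follows. This is a simplified illustration, not the package's procedure: it skips the nsims simulation step entirely, the function name is hypothetical, the sample standard deviation is an assumption, and it returns the probit probability Φ(zγ) without taking a position on whether the package reports that quantity or its complement. An intercept, if the model has one, can be passed as a constant column of ones.

```python
from statistics import NormalDist, fmean, stdev

_PHI = NormalDist().cdf  # standard normal CDF


def split_stage_change(gamma, columns, is_dummy, index):
    """Return (baseline, changed) split-stage probabilities for one covariate.

    gamma    -- split-stage coefficients, one per column
    columns  -- covariate columns, each a list of observed values
    is_dummy -- per-column flags, True for 0/1 variables
    index    -- position of the covariate being changed
    """
    # Baseline profile: dummies at 0, numerical variables at their mean.
    base = [0.0 if dummy else fmean(col) for col, dummy in zip(columns, is_dummy)]
    changed = list(base)
    # Dummy goes 0 -> 1; numerical goes mean -> mean + 1 sample SD.
    changed[index] = 1.0 if is_dummy[index] else base[index] + stdev(columns[index])

    def probability(values):
        return _PHI(sum(g * v for g, v in zip(gamma, values)))

    return probability(base), probability(changed)
```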

zmiopc.ordered_effects(model, ordvar, nsims=10000)

Calculate the changes in probability of each outcome in the OP stage.

This function calculates predicted probabilities when the value of a covariate in the ordered probit equation changes. A chosen dummy variable is changed from 0 to 1, and a chosen numerical variable is set to its mean plus one standard deviation. Other variables are held at 0 or their mean value (Note: the current version of the function treats ordinal variables as numerical).

Parameters
  • model – IopModel or IopCModel.

  • ordvar (int) – Number representing the location of variable in the ordered probit equation. (attribute .ordered of IopModel or IopCModel)

  • nsims (int) – Number of simulated observations, default is 10000.

Returns

changeprobs: A dataframe of the predicted probabilities when there is change in the variable for each outcome (1) versus original values (0).

The gimnl module

Classes and Functions for the gimnl module.

class gimnl.GimnlModel(modeltype, reference, inflatecat, llik, coef, aic, vcov, data, xs, zs, x_, yx_, z_, ycatu, xstr, ystr, zstr)

Store model results from gimnlmod().

class gimnl.MnlModel(modeltype, reference, llik, coef, aic, vcov, data, xs, x_, yx_, ycatu, xstr, ystr)

Store model results from mnlmod().

gimnl.mnl3(pstart, x2, x3, y, reference)

Calculate likelihood function for the three-category MNL model.

Parameters
  • pstart (list) – A list of starting parameters.

  • x2 (pandas.DataFrame) – Multinomial Logit variables. Data subsetted to selected variables.

  • x3 (pandas.DataFrame) – Multinomial Logit variables (should be identical to x2).

  • y (pandas.DataFrame) – Dependent variable (DV). Data subsetted to selected variable.

  • reference – List of order of categories.
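The role of the two identical covariate sets can be illustrated with the standard three-category multinomial logit: the baseline category has its utility fixed at 0, and x2 and x3 carry the same covariates with separate coefficient vectors for the second and third categories. The function below is a hypothetical sketch taking the two linear predictors directly; it is not the package's likelihood code.

```python
from math import exp


def mnl3_probabilities(xb2, xb3):
    """Three-category MNL probabilities, ordered as [baseline, second, third]."""
    denom = 1.0 + exp(xb2) + exp(xb3)  # baseline contributes exp(0) = 1
    return [1.0 / denom, exp(xb2) / denom, exp(xb3) / denom]
```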

gimnl.bimnl3(pstart, x2, x3, y, z, reference)

Calculate likelihood function for the baseline inflated three-category MNL model.

Parameters
  • pstart (list) – A list of starting parameters.

  • x2 (pandas.DataFrame) – Multinomial Logit variables. Data subsetted to selected variables.

  • x3 (pandas.DataFrame) – Multinomial Logit variables (should be identical to x2).

  • y (pandas.DataFrame) – Dependent variable (DV). Data subsetted to selected variable.

  • z (pandas.DataFrame) – Logit Split stage variables. Data subsetted to selected variables.

  • reference – List of order of categories (first element will be the inflated category).
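A generic inflated MNL mixes a logit split stage with the MNL probabilities and adds the leftover mass to one category. The sketch below assumes that structure; the function name, the `inflated` index argument, and the sign convention of the logit are illustrative assumptions rather than the package's parameterization. Setting `inflated` to 1 or 2 gives the second- and third-category variants.

```python
from math import exp


def inflated_mnl3_probabilities(xb2, xb3, zg, inflated=0):
    """Probabilities [baseline, second, third] with extra mass on one category."""
    denom = 1.0 + exp(xb2) + exp(xb3)
    mnl = [1.0 / denom, exp(xb2) / denom, exp(xb3) / denom]
    participate = 1.0 / (1.0 + exp(-zg))  # logit split stage
    probs = [participate * p for p in mnl]
    probs[inflated] += 1.0 - participate  # inflated mass lands on the chosen category
    return probs
```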

gimnl.simnl3(pstart, x2, x3, y, z, reference)

Calculate likelihood function for the second category inflated three-category MNL model.

Parameters
  • pstart (list) – A list of starting parameters.

  • x2 (pandas.DataFrame) – Multinomial Logit variables. Data subsetted to selected variables.

  • x3 (pandas.DataFrame) – Multinomial Logit variables (should be identical to x2).

  • y (pandas.DataFrame) – Dependent variable (DV). Data subsetted to selected variable.

  • z (pandas.DataFrame) – Logit Split stage variables. Data subsetted to selected variables.

  • reference – List of order of categories (second element will be the inflated category).

gimnl.timnl3(pstart, x2, x3, y, z, reference)

Calculate likelihood function for the third category inflated three-category MNL model.

Parameters
  • pstart (list) – A list of starting parameters.

  • x2 (pandas.DataFrame) – Covariates in the MNL.

  • x3 (pandas.DataFrame) – Multinomial Logit variables (should be identical to x2).

  • y (pandas.DataFrame) – Dependent variable (DV).

  • z (pandas.DataFrame) – Covariates in the Logit model (i.e., the split stage).

  • reference – List of order of categories (third element will be the inflated category).

gimnl.gimnlresults(model, data, x, y, z, modeltype, reference, inflatecat)

Produce estimation results, part of gimnlmod().

Store estimates, model AIC, and other information to GimnlModel.

Parameters
  • model – Estimation results.

  • data (pandas.DataFrame) – Model data used for estimation, subsetted to selected variables and missing values listwise deleted.

  • x (list of str) – List of names for covariates in Multinomial Logit stage.

  • y (list of str) – List of names for the dependent variable.

  • z (list of str) – List of names for covariates in split-stage.

  • modeltype – Type of inflated MNL model. One of ‘bimnl3’, ‘simnl3’, or ‘timnl3’.

  • reference – List of order of categories. First element is the baseline/reference category.

  • inflatecat – Inflated category. One of ‘baseline’, ‘second’, or ‘third’.

gimnl.mnlresults(model, data, x, y, modeltype, reference)

Produce estimation results, part of mnlmod().

Store estimates, model AIC, and other information to MnlModel.

Parameters
  • model – Estimation results.

  • data (pandas.DataFrame) – Model data used for estimation, subsetted to selected variables and missing values listwise deleted.

  • x (list of str) – List of names for covariates in Multinomial Logit stage.

  • y (list of str) – List of names for the dependent variable.

  • modeltype – Three-category MNL model (‘mnl3’).

  • reference – List of order of categories. First element is the baseline/reference category.

gimnl.gimnlmod(data, x, y, z, reference, inflatecat, method='BFGS', pstart=None)

Estimate three-category inflated Multinomial Logit model.

Parameters
  • data (pandas.DataFrame) – Full dataset.

  • x (list of str) – Covariates in the Multinomial Logit stage. The elements must match column names of data.

  • y (list of str) – Dependent variable. Values should be integers, with a number from 0-2 representing each category. The element must match column names of data.

  • z (list of str) – Covariates in split-stage. The elements must match column names of data.

  • reference (list of int) – List of three elements specifying the order of categories (e.g., [0, 1, 2] or [2, 1, 0]). The first element is the baseline/reference category. The parameter inflatecat then specifies which category in the list is inflated.

  • inflatecat – A string specifying the inflated category: “baseline” for the first, “second” for the second, or “third” for the third element of the reference list.

  • method – Optimization method. Default is ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.

  • pstart (list) – A list of starting parameters.
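The interaction between reference and inflatecat can be made concrete with a small lookup. The helper below is hypothetical and not part of the package; it only encodes what this reference states, namely that inflatecat picks a position in the reference list, paired with the likelihood names ‘bimnl3’, ‘simnl3’, and ‘timnl3’ documented for the baseline-, second-, and third-category inflated models.

```python
_POSITION = {"baseline": 0, "second": 1, "third": 2}
_MODELTYPE = {"baseline": "bimnl3", "second": "simnl3", "third": "timnl3"}


def resolve_inflation(reference, inflatecat):
    """Return (inflated category value, model type name) for a reference ordering."""
    if sorted(reference) != [0, 1, 2]:
        raise ValueError("reference must be an ordering of the categories 0, 1, 2")
    if inflatecat not in _POSITION:
        raise ValueError("inflatecat must be 'baseline', 'second', or 'third'")
    return reference[_POSITION[inflatecat]], _MODELTYPE[inflatecat]
```

For example, with reference [2, 1, 0] and inflatecat "baseline", the inflated outcome is the one coded 2 in the data.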

gimnl.mnlmod(data, x, y, reference, method='BFGS', pstart=None)

Estimate three-category Multinomial Logit model.

Parameters
  • data (pandas.DataFrame) – Full dataset.

  • x (list of str) – Covariates in MNL stage. The elements must match column names of data.

  • y – Dependent variable. Values should be integers, with a number from 0-2 representing each category.

  • reference (list of int) – List of three elements specifying the order of categories (e.g., [0, 1, 2] or [2, 1, 0]). The first element is the baseline/reference category.

  • method – Optimization method. Default is ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.

  • pstart (list) – A list of starting parameters.

gimnl.vuong_gimnl(modelmnl, modelgimnl)

Run the Vuong test to compare the performance of the MNL and GIMNL models.

For the function to run properly, the models need to have the same X covariates and same number of observations.

Parameters