API Reference
Classes and functions in the idcempy package. For a tutorial on estimating the ZiOP(C) and MiOP(C) models with the package, see IDCeMPy Package.
For a tutorial on the GiMNL module, see gimnl_tutorial.
The zmiopc module
Classes and functions for the zmiopc module.
- class zmiopc.OpModel(llik, coef, aic, vcov, data, xs, ts, x_, yx_, yncat, xstr, ystr)
Store model results from opmod().
- class zmiopc.IopModel(modeltype, llik, coef, aic, vcov, data, xs, zs, ts, x_, yx_, z_, yncat, xstr, ystr, zstr)
Store model results from iopmod().
- class zmiopc.IopCModel(modeltype, llik, coef, aic, vcov, data, xs, zs, ts, x_, yx_, z_, rho, yncat, xstr, ystr, zstr)
Store model results from iopcmod().
- class zmiopc.FittedVals(responsefull, responseordered, responseinflation, linear)
Store fitted values for iOP models.
- zmiopc.op(pstart, x, y, data, weights, offsetx)
Calculate likelihood function for Ordered Probit Model.
- Parameters
pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.
x (pandas.DataFrame) – Covariates for the ordered stage.
y (pandas.DataFrame) – The dependent variable (DV).
data (pandas.DataFrame) – Dataset.
weights – Weights.
offsetx – Offset for covariates in the ordered stage.
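As a rough illustration of what an ordered probit likelihood computes, the sketch below evaluates outcome probabilities and a negative log-likelihood using only the standard library. It is not the package's implementation: the function names `op_probs` and `op_negloglik`, the direct use of sorted cutpoints, and the omission of weights and offsets are all assumptions made for the example.

```python
from math import log
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def op_probs(xb, cutpoints):
    """Return P(y = j) for j = 0..J-1 given a linear index xb and sorted cutpoints."""
    cdf = [0.0] + [_PHI(c - xb) for c in cutpoints] + [1.0]
    return [cdf[j + 1] - cdf[j] for j in range(len(cdf) - 1)]


def op_negloglik(beta, cutpoints, rows, ys):
    """Negative log-likelihood (hypothetical parameterization, no weights or offsets).

    rows is a list of covariate lists; ys holds integer outcomes coded 0..J-1.
    """
    total = 0.0
    for x, y in zip(rows, ys):
        xb = sum(b * v for b, v in zip(beta, x))
        total -= log(op_probs(xb, cutpoints)[y])
    return total
```

A function of this shape is what an optimizer such as scipy.optimize.minimize would minimize over the coefficient and cutpoint values.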
- zmiopc.ziop(pstart, x, y, z, data, weights, offsetx, offsetz)
Calculate the likelihood function for the Zero-inflated Ordered Probit (ZiOP) model.
- Parameters
pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.
x (pandas.DataFrame) – Covariates for the ordered probit stage.
y (pandas.DataFrame) – The ordinal dependent variable (DV).
z (pandas.DataFrame) – Covariates for the inflation stage.
data (pandas.DataFrame) – Dataset with missing values listwise deleted.
weights (float) – Weights.
offsetx (float) – Offset for the ordered stage covariates (X).
offsetz (float) – Offset for the inflation stage covariates (Z).
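The two-stage structure behind a zero-inflated ordered probit can be sketched as follows, assuming independent errors between the stages. The function name `ziop_probs` and the sign convention (a larger inflation index means a higher chance of entering the ordered stage) are assumptions for illustration, not taken from the package source.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def ziop_probs(xb, zg, cutpoints):
    """P(y = j) for a zero-inflated ordered probit with independent errors.

    xb: ordered-stage linear index; zg: inflation-stage linear index.
    Phi(zg) is taken to be the probability of entering the ordered stage
    (an assumed sign convention).
    """
    p_ordered = _PHI(zg)
    cdf = [0.0] + [_PHI(c - xb) for c in cutpoints] + [1.0]
    probs = [p_ordered * (cdf[j + 1] - cdf[j]) for j in range(len(cdf) - 1)]
    probs[0] += 1.0 - p_ordered  # excess zeros from the "always zero" regime
    return probs
```

With both indices at zero and a single cutpoint at zero, `ziop_probs(0.0, 0.0, [0.0])` puts 0.75 on the zero category: 0.5 from the inflation regime plus half of the remaining 0.5.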
- zmiopc.ziopc(pstart, x, y, z, data, weights, offsetx, offsetz)
Calculate the likelihood function for the Zero-inflated Ordered Probit model with correlated errors (ZiOPC).
- Parameters
pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.
x (pandas.DataFrame) – Covariates for the ordered probit stage.
y (pandas.DataFrame) – The dependent variable (DV).
z (pandas.DataFrame) – Covariates for the inflation stage.
data (pandas.DataFrame) – Dataset with missing values listwise deleted.
weights (float) – Weights.
offsetx (float) – Offset for the ordered probit stage covariates (X).
offsetz (float) – Offset for the inflation stage (Z).
- zmiopc.miop(pstart, x, y, z, data, weights, offsetx, offsetz)
Likelihood function for the Middle-inflated Ordered Probit (MiOP) model without correlated errors.
The number of categories in the dependent variable must be odd.
- Parameters
pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.
x (pandas.DataFrame) – Covariates in the ordered probit stage.
y (pandas.DataFrame) – The dependent variable (DV).
z (pandas.DataFrame) – Covariates in the inflation stage.
data (pandas.DataFrame) – Dataset with missing values listwise deleted.
weights (float) – Weights.
offsetx (float) – Offset for covariates in the ordered probit stage (X).
offsetz (float) – Offset for covariates in the inflation stage.
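The odd-category requirement exists because the model needs a single middle category to inflate. A small pre-check along the following lines can catch an unsuitable dependent variable before estimation. The helper `middle_category` is hypothetical, and whether the package itself raises an error in this case is not stated here.

```python
def middle_category(categories):
    """Return the middle value of the sorted distinct categories.

    Raises ValueError when the number of distinct categories is even,
    since no single middle category exists in that case.
    """
    levels = sorted(set(categories))
    if len(levels) % 2 == 0:
        raise ValueError("expected an odd number of outcome categories")
    return levels[len(levels) // 2]
```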
- zmiopc.miopc(pstart, x, y, z, data, weights, offsetx, offsetz)
Likelihood function for the Middle-inflated Ordered Probit model with correlated errors (MiOPC).
The number of categories in the dependent variable must be odd.
- Parameters
pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.
x (pandas.DataFrame) – Covariates in the ordered probit stage.
y (pandas.DataFrame) – The dependent variable (DV).
z (pandas.DataFrame) – Covariates in the inflation stage.
data (pandas.DataFrame) – Dataset with missing values listwise deleted.
weights (float) – Weights.
offsetx (float) – Offset for the ordered probit stage covariates (X).
offsetz (float) – Offset for the covariates in the inflation stage (Z).
- zmiopc.opresults(model, data, x, y)
Produce estimation results; part of opmod().
- Parameters
model – Model estimation results obtained from minimization.
data (pandas.DataFrame) – Dataset.
x (list of str) – List of names for independent variables matching column names in data.
y – List of names for the dependent variable matching column names in data.
- zmiopc.opmod(data, x, y, pstart=None, method='BFGS', weights=1, offsetx=0)
Estimate an Ordered Probit model and return an OpModel class object.
- Parameters
pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.
data (pandas.DataFrame) – Full dataset used for estimation.
y (list of str) – Dependent Variable (DV).
method – Method for optimization, default is ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.
weights – Weights.
offsetx – Offset for covariates (X).
- Returns
OpModel
- zmiopc.iopresults(model, data, x, y, z, modeltype)
Produce estimation results; part of iopmod().
- Parameters
model – Model estimation results obtained from minimization.
data (pandas.DataFrame) – Full dataset used for estimation.
x (list of str) – List of names for covariates in the ordered probit stage.
y (list of str) – List of names for the dependent variable.
z (list of str) – List of names for covariates in the inflation stage.
modeltype – Type of model. Options are: ‘ziop’ or ‘miop’.
- zmiopc.iopcresults(model, data, x, y, z, modeltype)
Produce estimation results; part of iopcmod().
- Parameters
model – Model estimation results obtained from minimization.
data (pandas.DataFrame) – Full dataset used for estimation.
x (list of str) – List of names for covariates in the ordered probit stage.
y (list of str) – List of names for the dependent variable.
z (list of str) – List of names for covariates in the inflation stage.
modeltype – Type of model. Options are: ‘ziopc’ or ‘miopc’.
- zmiopc.iopmod(modeltype, data, x, y, z, pstart=None, method='BFGS', weights=1, offsetx=0, offsetz=0)
Estimate an iOP model (ZiOP or MiOP) and return an IopModel class object as output.
- Parameters
pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.
data (pandas.DataFrame) – Full dataset used for estimation.
x (list of str) – Covariates in the ordered probit stage. Elements must match column names of data.
y (list of str) – Dependent variable (DV). Element must match a column name of data.
z (list of str) – Covariates in the inflation stage. Elements must match column names of data.
modeltype – Type of model to be estimated. Options are: ‘ziop’ or ‘miop’.
method – Method for optimization, default ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.
weights – Weights.
offsetx – Offset for ordered probit stage covariates (X).
offsetz – Offset for inflation stage covariates (Z).
- Returns
IopModel
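A call to iopmod() might look like the sketch below. The column names are hypothetical, the import path `from idcempy import zmiopc` is an assumption, and reading `coef`, `llik`, and `aic` as attributes of the returned object is inferred from the IopModel constructor arguments rather than confirmed. The import sits inside the function so the snippet loads even where the package is not installed.

```python
def fit_ziop(data):
    """Fit a ZiOP model on a pandas.DataFrame (hypothetical column names)."""
    # Assumed import path; imported lazily so this sketch loads without idcempy.
    from idcempy import zmiopc

    x = ["age", "income"]   # hypothetical ordered-stage covariates
    z = ["education"]       # hypothetical inflation-stage covariate
    y = ["response"]        # hypothetical ordinal dependent variable

    # Keyword arguments mirror the defaults listed in the signature above.
    model = zmiopc.iopmod("ziop", data, x, y, z,
                          method="BFGS", weights=1, offsetx=0, offsetz=0)
    return model
```

If the attribute names match the constructor arguments, `fit_ziop(df).coef` and `fit_ziop(df).aic` would give the estimates and the model AIC.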
- zmiopc.iopcmod(modeltype, data, x, y, z, pstart=None, method='BFGS', weights=1, offsetx=0, offsetz=0)
Estimate an iOPC model (ZiOPC or MiOPC) and return an IopCModel class object.
- Parameters
pstart (list) – A list of starting values for the estimation. Length of the number of parameters to be estimated.
data (pandas.DataFrame) – Dataset used for estimation.
x (list of str) – Covariates for the ordered probit stage. Elements must match column names of data.
y (list of str) – The dependent variable (DV). Element must match a column name of data.
z (list of str) – Covariates for the inflation stage. Elements must match column names of data.
modeltype – Type of model to be estimated. Options are: ‘ziopc’ or ‘miopc’.
method – Method for optimization, default is ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.
weights – Weights.
offsetx – Offset for variables in ordered probit stage (X).
offsetz – Offset for variables in inflation stage (Z).
- Returns
IopCModel
- zmiopc.iopfit(model)
Calculate probabilities from iopmod().
- Parameters
model – IopModel object from iopmod().
- Returns
FittedVals object with fitted values.
- zmiopc.iopcfit(model)
Calculate fitted probabilities from iopcmod().
- Parameters
model – IopCModel object from iopcmod().
- Returns
FittedVals object with fitted values.
- zmiopc.vuong_opiop(opmodel, iopmodel)
Run the Vuong test to compare the performance of the OP and iOP model.
- zmiopc.vuong_opiopc(opmodel, iopcmodel)
Run the Vuong test to compare the performance of the OP and iOPC model.
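For orientation, the basic (uncorrected) Vuong statistic compares two non-nested models through the per-observation log-likelihood ratios. The sketch below is a generic textbook version, not the package's code: the function name `vuong_statistic` is hypothetical, no degrees-of-freedom correction is applied, and the sign convention used by vuong_opiop() and vuong_opiopc() is not documented here.

```python
from math import log, sqrt
from statistics import mean, stdev


def vuong_statistic(probs_a, probs_b):
    """Uncorrected Vuong z-statistic.

    probs_a and probs_b hold, for each observation, the probability each
    model assigns to the outcome actually observed. Positive values favor
    model A, negative values favor model B.
    """
    m = [log(pa) - log(pb) for pa, pb in zip(probs_a, probs_b)]
    # stdev() uses the n - 1 denominator; some presentations divide by n.
    return sqrt(len(m)) * mean(m) / stdev(m)
```

Under the null hypothesis that the two models fit equally well, the statistic is asymptotically standard normal, so values beyond roughly ±1.96 indicate a preference at the 5% level.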
- zmiopc.split_effects(model, inflvar, nsims=10000)
Calculate the change in the probability of being 0 in the split-probit stage.
This function calculates the predicted probabilities when the value of a variable in the split-probit equation changes. A chosen dummy variable is changed from 0 to 1, and a chosen numerical variable is set to its mean plus one standard deviation. Other variables are held at 0 or at their mean (note: the current version of the function treats ordinal variables as numerical).
- Parameters
- Returns
changeprobs: A dataframe of the predicted probabilities when there is change in the variable (1) versus original values (0).
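The covariate scenarios described above can be sketched with plain dictionaries. This is an illustration of the stated rule only: the helper names are hypothetical, holding dummies at 0 and numerical variables at their mean is an assumed reading of "0 or mean value", and the sample standard deviation is an assumption as well. The simulation step implied by `nsims` is not reproduced.

```python
from statistics import mean, stdev


def is_dummy(values):
    """Treat a column as a dummy when it only contains 0 and 1."""
    return set(values) <= {0, 1}


def build_profiles(columns, target):
    """Return (baseline, changed) covariate profiles keyed by column name.

    columns maps each covariate name to its list of observed values;
    target is the covariate whose effect is being evaluated.
    """
    baseline = {name: 0 if is_dummy(vals) else mean(vals)
                for name, vals in columns.items()}
    changed = dict(baseline)
    if is_dummy(columns[target]):
        changed[target] = 1  # dummy: 0 -> 1
    else:
        changed[target] = baseline[target] + stdev(columns[target])  # mean + 1 SD
    return baseline, changed
```

Predicted probabilities evaluated at `changed` and at `baseline` then correspond to the (1) and (0) cases in the returned dataframe.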
- zmiopc.ordered_effects(model, ordvar, nsims=10000)
Calculate the change in the probability of each outcome in the OP stage.
This function calculates predicted probabilities when the value of a covariate in the ordered probit equation changes. A chosen dummy variable is changed from 0 to 1, and a chosen numerical variable is set to its mean plus one standard deviation. Other variables are held at 0 or at their mean (note: the current version of the function treats ordinal variables as numerical).
- Parameters
- Returns
changeprobs: A dataframe of the predicted probabilities when there is change in the variable for each outcome (1) versus original values (0).
The gimnl module
Classes and functions for the gimnl module.
- class gimnl.GimnlModel(modeltype, reference, inflatecat, llik, coef, aic, vcov, data, xs, zs, x_, yx_, z_, ycatu, xstr, ystr, zstr)
Store model results from gimnlmod().
- class gimnl.MnlModel(modeltype, reference, llik, coef, aic, vcov, data, xs, x_, yx_, ycatu, xstr, ystr)
Store model results from mnlmod().
- gimnl.mnl3(pstart, x2, x3, y, reference)
Calculate likelihood function for the three-category MNL model.
- Parameters
pstart (list) – A list of starting parameters.
x2 (pandas.DataFrame) – Multinomial Logit variables. Data subsetted to selected variables.
x3 (pandas.DataFrame) – Multinomial Logit variables (should be identical to x2).
y (pandas.DataFrame) – Dependent variable (DV). Data subsetted to selected variable.
reference – List of order of categories.
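The separate x2 and x3 arguments reflect that a three-category MNL has one coefficient vector per non-baseline category. A minimal sketch of the resulting probabilities, with the baseline index normalized to zero, is shown below. The function name `mnl3_probs` is hypothetical and the code is a generic MNL formula rather than the package's implementation.

```python
from math import exp


def mnl3_probs(xb2, xb3):
    """Probabilities for (baseline, second, third) categories.

    xb2 and xb3 are the linear indices for the second and third categories;
    the baseline category's index is fixed at 0 for identification.
    """
    denom = 1.0 + exp(xb2) + exp(xb3)
    return [1.0 / denom, exp(xb2) / denom, exp(xb3) / denom]
```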
- gimnl.bimnl3(pstart, x2, x3, y, z, reference)
Calculate likelihood function for the baseline inflated three-category MNL model.
- Parameters
pstart (list) – A list of starting parameters.
x2 (pandas.DataFrame) – Multinomial Logit variables. Data subsetted to selected variables.
x3 (pandas.DataFrame) – Multinomial Logit variables (should be identical to x2).
y (pandas.DataFrame) – Dependent variable (DV). Data subsetted to selected variable.
z (pandas.DataFrame) – Logit split-stage variables. Data subsetted to selected variables.
reference – List of order of categories (first element will be the inflated category).
- gimnl.simnl3(pstart, x2, x3, y, z, reference)
Calculate likelihood function for the second category inflated three-category MNL model.
- Parameters
pstart (list) – A list of starting parameters.
x2 (pandas.DataFrame) – Multinomial Logit variables. Data subsetted to selected variables.
x3 (pandas.DataFrame) – Multinomial Logit variables (should be identical to x2).
y (pandas.DataFrame) – Dependent variable (DV). Data subsetted to selected variable.
z (pandas.DataFrame) – Logit split-stage variables. Data subsetted to selected variables.
reference – List of order of categories (second element will be the inflated category).
- gimnl.timnl3(pstart, x2, x3, y, z, reference)
Calculate likelihood function for the third category inflated three-category MNL model.
- Parameters
pstart (list) – A list of starting parameters.
x2 (pandas.DataFrame) – Covariates in the MNL.
x3 (pandas.DataFrame) – Multinomial Logit variables (should be identical to x2).
y (pandas.DataFrame) – Dependent variable (DV).
z (pandas.DataFrame) – Covariates in the Logit model (i.e., the split stage).
reference – List of order of categories (third element will be the inflated category).
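The three inflated variants differ only in which category receives the extra probability mass from the logit split stage. The sketch below shows that mixture for an arbitrary inflated position. The function name `gimnl3_probs`, the `inflated` position argument, and the sign convention of the split-stage index are assumptions for illustration.

```python
from math import exp


def gimnl3_probs(xb2, xb3, zg, inflated=0):
    """Outcome probabilities for an inflated three-category MNL.

    inflated is the position (0, 1, or 2) of the inflated category in the
    reference ordering. The logistic of zg is taken as the probability that
    an observation comes from the MNL regime (assumed sign convention).
    """
    denom = 1.0 + exp(xb2) + exp(xb3)
    mnl = [1.0 / denom, exp(xb2) / denom, exp(xb3) / denom]
    p_mnl = 1.0 / (1.0 + exp(-zg))
    probs = [p_mnl * p for p in mnl]
    probs[inflated] += 1.0 - p_mnl  # extra mass on the inflated category
    return probs
```

Setting `inflated` to 0, 1, or 2 corresponds to the baseline-, second-, and third-category inflated models respectively.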
- gimnl.gimnlresults(model, data, x, y, z, modeltype, reference, inflatecat)
Produce estimation results; part of gimnlmod().
Store estimates, model AIC, and other information to GimnlModel.
- Parameters
model – Estimation results.
data (pandas.DataFrame) – Model data used for estimation, subsetted to selected variables and missing values listwise deleted.
x (list of str) – List of names for covariates in Multinomial Logit stage.
y (list of str) – List of names for dependent variable.
z (list of str) – List of names for covariates in split-stage.
modeltype – Type of inflated MNL model. One of ‘bimnl3’, ‘simnl3’, or ‘timnl3’.
reference – List of order of categories. First element is the baseline/reference category.
inflatecat – Inflated category. One of ‘baseline’, ‘second’, or ‘third’.
- gimnl.mnlresults(model, data, x, y, modeltype, reference)
Produce estimation results; part of mnlmod().
Store estimates, model AIC, and other information to MnlModel.
- Parameters
model – Estimation results.
data (pandas.DataFrame) – Model data used for estimation, subsetted to selected variables and missing values listwise deleted.
x (list of str) – List of names for covariates in Multinomial Logit stage.
y (list of str) – List of names for dependent variable.
modeltype – Three-category MNL model (‘mnl3’).
reference – List of order of categories. First element is the baseline/reference category.
- gimnl.gimnlmod(data, x, y, z, reference, inflatecat, method='BFGS', pstart=None)
Estimate three-category inflated Multinomial Logit model.
- Parameters
data (pandas.DataFrame) – Full dataset.
x (list of str) – Covariates in Multinomial Logit stage. The elements must match column names of data.
y (list of str) – Dependent variable. Values should be integers, with a number from 0-2 representing each category. The element must match a column name of data.
z (list of str) – Covariates in split-stage. The elements must match column names of data.
reference (list of int) – List of three elements specifying the order of categories (e.g., [0, 1, 2], [2, 1, 0], etc.). The first element is the baseline/reference category. The parameter inflatecat then specifies which category in the list is inflated.
inflatecat – A string specifying the inflated category: “baseline” for the first, “second” for the second, or “third” for the third element of the reference list.
method – Optimization method. Default is ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.
pstart (list) – A list of starting parameters.
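How reference and inflatecat combine can be made concrete with a small lookup. The helper `inflated_category` below is hypothetical and only restates the mapping described in the parameter list. The check that reference is a permutation of 0, 1, 2 is an assumption based on the stated coding of the dependent variable.

```python
def inflated_category(reference, inflatecat):
    """Return the outcome value that is inflated for a given specification."""
    positions = {"baseline": 0, "second": 1, "third": 2}
    if sorted(reference) != [0, 1, 2]:
        raise ValueError("reference should be an ordering of the categories 0, 1, 2")
    if inflatecat not in positions:
        raise ValueError("inflatecat should be 'baseline', 'second', or 'third'")
    return reference[positions[inflatecat]]
```

For example, with `reference=[2, 1, 0]` and `inflatecat="baseline"`, category 2 is both the reference category and the inflated one.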
- gimnl.mnlmod(data, x, y, reference, method='BFGS', pstart=None)
Estimate three-category Multinomial Logit model.
- Parameters
data (pandas.DataFrame) – Full dataset.
x (list of str) – Covariates in MNL stage. The elements must match column names of data.
y – Dependent variable. Values should be integers, with a number from 0-2 representing each category.
reference (list of int) – List of three elements specifying the order of categories (e.g., [0, 1, 2], [2, 1, 0], etc.). The first element is the baseline/reference category.
method – Optimization method. Default is ‘BFGS’. For other available methods, see scipy.optimize.minimize documentation.
pstart (list) – A list of starting parameters.
- gimnl.vuong_gimnl(modelmnl, modelgimnl)
Run the Vuong test to compare the performance of the MNL and GIMNL model.
For the function to run properly, the models need to have the same X covariates and same number of observations.
- Parameters
modelmnl – A MnlModel object.
modelgimnl – A GimnlModel object.