Title | Production Function Estimation
---|---
Description | TFP estimation with the control function approach.
Authors | Gabriele Rovigatti [aut,cre]
Maintainer | Gabriele Rovigatti <[email protected]>
License | GPL-3
Version | 0.1.1
Built | 2025-02-24 03:16:45 UTC
Source | https://github.com/gabrielerovigatti/prodest
Sectoral subsample of Chilean firm-level production data 1986-1996.
data("chilean")
A data.frame object containing 9 variables with production-related data.
Y | vector of log(outcome) - Value added.
sX | vector of log(capital).
fX | matrix of log(skilled labor) and log(unskilled labor).
cX | vector of log(water).
pX | vector of log(electricity).
inv | vector of log(investment).
idvar | vector of panel identifier.
timevar | vector of time.
panelSim() produces an N*T balanced panel dataset of firms' production. In particular, it returns a data.frame with free, state and proxy variables for running Monte Carlo simulations of productivity-related models.
panelSim(N = 1000, T = 100, alphaL = .6, alphaK = .4, DGP = 1, rho = .7, sigeps = .1, sigomg = .3, rholnw = .3, seed = 123456)
N | the number of firms. By default N = 1000.
T | the total time span to be simulated. Only a fraction (the last 10% of observations) will be returned. By default T = 100.
alphaL | the parameter of the free variable. By default alphaL = .6.
alphaK | the parameter of the state variable. By default alphaK = .4.
DGP | Type of DGP; accepts 1, 2 or 3. They differ, among other things, in the shock to wages (0 or 0.1). By default DGP = 1.
rho | the AR(1) coefficient for omega. By default rho = .7.
sigeps | the standard deviation of epsilon. By default sigeps = .1.
sigomg | the standard deviation of the innovation to productivity. By default sigomg = .3.
rholnw | AR(1) coefficient for log(wage). By default rholnw = .3.
seed | seed set when the routine starts. By default seed = 123456.
panelSim() is the R implementation of the DGP written by Ackerberg, Caves and Frazer (2015).
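panelSim() returns a data.frame with 9 variables:
idvar | ID codes from 1 to N (by default N = 1000).
timevar | time variable ranging from 1 to the number of retained periods, i.e. the last 10% of T (by default T = 100, so 10 periods).
Y | log output value added variable.
sX | log state variable.
fX | log free variable.
pX1 | log proxy variable - no measurement error.
pX2 | log proxy variable - with measurement error.
pX3 | log proxy variable - with measurement error.
pX4 | log proxy variable - with measurement error.
The magnitude of the measurement error differs across pX2, pX3 and pX4.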
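To make the roles of rho, sigomg and the "last 10% of T" rule concrete, the following base R sketch simulates only the AR(1) productivity process. It is an illustration under stated assumptions, not the panelSim() source: the function name simulate_omega and the stationary draw for the first period are hypothetical choices, and the wage, input and output equations of the Ackerberg, Caves and Frazer DGP are left out.

```r
# Hypothetical sketch, not prodest code: AR(1) log-productivity for N firms,
# keeping only the last 10% of the T simulated periods.
simulate_omega <- function(N = 1000, T = 100, rho = 0.7, sigomg = 0.3,
                           seed = 123456) {
  set.seed(seed)
  keep <- round(T * 0.1)
  omega <- matrix(0, nrow = N, ncol = T)
  # assumption: first period drawn from the stationary distribution
  omega[, 1] <- rnorm(N, sd = sigomg / sqrt(1 - rho^2))
  for (period in 2:T) {
    # sigomg is the standard deviation of the innovation
    omega[, period] <- rho * omega[, period - 1] + rnorm(N, sd = sigomg)
  }
  kept <- omega[, (T - keep + 1):T, drop = FALSE]
  data.frame(
    idvar   = rep(seq_len(N), each = keep),
    timevar = rep(seq_len(keep), times = N),
    omega   = as.vector(t(kept))  # firm 1's periods first, then firm 2's, ...
  )
}

sim_omega <- simulate_omega(N = 50, T = 100)
head(sim_omega)
```

Discarding the early periods reduces the influence of the starting values on the returned panel.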
Gabriele Rovigatti
Ackerberg, D., Caves, K. and Frazer, G. (2015). "Identification properties of recent production function estimators." Econometrica, 83(6), 2411-2451.
require(prodest)

## Simulate a dataset with 1000 firms (T = 100). panelSim() delivers the last 10% of usable time per panel.
panel.data <- panelSim()
attach(panel.data)

## Estimate various models
ACF.fit <- prodestACF(Y, fX, sX, pX2, idvar, timevar, theta0 = c(.5,.5))
LP.fit <- prodestLP(Y, fX, sX, pX2, idvar, timevar)
WRDG.fit <- prodestWRDG(Y, fX, sX, pX3, idvar, timevar, R = 5)

## print results in LaTeX tabular format
printProd(list(LP.fit, ACF.fit, WRDG.fit))
The printProd() function accepts a list of prod class objects and prints the results to screen as a tabular in LaTeX format.
printProd(mods, modnames = NULL, parnames = NULL, outfile = NULL, ptime = FALSE, nboot = FALSE)
mods | a list of prod class objects.
modnames | an optional vector of model names. By default modnames = NULL.
parnames | an optional vector of parameter names. By default parnames = NULL.
outfile | optional string with the path and directory to store a text file (.txt, .tex, etc. depending on the specified extension) with the tabular. By default outfile = NULL.
ptime | add a row showing the computational time. By default ptime = FALSE.
nboot | add a row showing the number of bootstrap repetitions. By default nboot = FALSE.
The output of the function printProd is a tabular in LaTeX format of the prod object results, either printed to screen or written to a text file.
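As a rough picture of what a LaTeX tabular of estimates looks like, the base R sketch below assembles one from plain vectors. The helper to_latex_tabular and the numbers passed to it are purely illustrative assumptions; the layout actually produced by printProd() may differ.

```r
# Hypothetical helper, not part of prodest: build a LaTeX tabular from
# per-model coefficient and standard-error vectors.
to_latex_tabular <- function(pars, ses, modnames, parnames) {
  stopifnot(length(pars) == length(ses), length(pars) == length(modnames))
  fmt_row <- function(label, cells) {
    paste0(label, " & ", paste(cells, collapse = " & "), " \\\\")
  }
  # one row of estimates and one row of standard errors per parameter
  body <- unlist(lapply(seq_along(parnames), function(j) {
    est <- vapply(pars, function(p) sprintf("%.3f", p[j]), character(1))
    se  <- vapply(ses,  function(s) sprintf("(%.3f)", s[j]), character(1))
    c(fmt_row(parnames[j], est), fmt_row("", se))
  }))
  c(paste0("\\begin{tabular}{l", strrep("c", length(modnames)), "}"),
    "\\hline", fmt_row("", modnames), "\\hline",
    body,
    "\\hline", "\\end{tabular}")
}

# illustrative numbers only, not estimation results
tab <- to_latex_tabular(
  pars = list(c(0.61, 0.39), c(0.58, 0.42)),
  ses = list(c(0.02, 0.03), c(0.04, 0.05)),
  modnames = c("Model A", "Model B"),
  parnames = c("Free", "State")
)
writeLines(tab)
```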
Gabriele Rovigatti
data("chilean")
WRDGfit <- prodestWRDG_GMM(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar)
OPfit <- prodestOP(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar)
printProd(list(OPfit, WRDGfit), modnames = c('Olley-Pakes', 'Wooldridge'), parnames = c('bunsk', 'bsk', 'bk'))
Class for prodest fitted objects.
A virtual Class: No objects may be created from it.
Model: Object of class list. Contains information about the model and the optimization procedure:
  method (string): The method used in estimation.
  FSbetas (numeric): First-stage estimated parameters.
  boot.repetitions (numeric): Number of bootstrap repetitions.
  elapsed.time (numeric): Time - in seconds - required for estimation.
  theta0 (numeric): Vector of second-stage optimization starting points.
  opt (string): Optimizer used for the second stage.
  seed (numeric): Seed set.
  opt.outcome (list): Optimization outcome (depends on optimizer choice).
Data: Object of class list. Contains:
  Y (numeric): Dependent variable - Value added.
  free (matrix): Free variable(s).
  state (matrix): State variable(s).
  proxy (matrix): Proxy variable(s).
  control (matrix): Control variable(s).
  idvar (numeric): Panel identifiers.
  timevar (numeric): Time identifiers.
  FSresiduals (numeric): First-stage residuals.
Estimates: Object of class list. Contains:
  pars (numeric): Estimated parameters for the variables of interest.
  std.errors (numeric): Estimated standard errors for the variables of interest.
show, signature(object = 'prod'): Show table with the method, the estimated parameters and their standard errors.
summary, signature(object = 'prod'): Show table with method, parameters, std.errors and auxiliary information on model and optimization.
FSres, signature(object = 'prod'): Extract First-Stage residual vector.
omega, signature(object = 'prod'): Extract estimated productivity vector.
coef, signature(object = 'prod'): Extract estimated coefficients.
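The slot and method layout above can be mimicked with a few lines of S4 code. This is a minimal sketch, assuming a hypothetical class name prodSketch and made-up values; it is not the class definition shipped with prodest and only shows how a coef method can read from the Estimates slot.

```r
library(methods)

# Hypothetical class name; the slot layout mirrors the Model/Data/Estimates
# slots described above, but this is not the prodest class definition.
setClass("prodSketch",
         representation(Model = "list", Data = "list", Estimates = "list"))

# coef() method: return the estimated parameters stored in the Estimates slot
setMethod("coef", signature(object = "prodSketch"),
          function(object, ...) object@Estimates$pars)

# illustrative values only
fit <- new("prodSketch",
           Model = list(method = "sketch", boot.repetitions = 20),
           Data = list(Y = c(1.2, 1.5, 1.9)),
           Estimates = list(pars = c(fX = 0.6, sX = 0.4),
                            std.errors = c(fX = 0.05, sX = 0.07)))
coef(fit)
```

Slots are accessed with the @ operator (for example fit@Model$method), which is how S4 objects differ from plain lists in everyday use.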
Gabriele Rovigatti
The prodestACF() function accepts at least 6 objects (id, time, output, free, state and proxy variables), and returns a prod object of class S4 with three elements: (i) a list of model-related objects, (ii) a list with the data used in the estimation and estimated vectors of first-stage residuals, and (iii) a list with the estimated parameters and their bootstrapped standard errors.
prodestACF(Y, fX, sX, pX, idvar, timevar, zX = NULL, control = c('none','fs','2s'), dum = F, G = 3, A = 3, R = 20, orth = F, opt = 'optim', theta0 = NULL, seed = 123456, cluster = NULL)
Y | the vector of value added log output.
fX | the vector/matrix/dataframe of log free variables.
sX | the vector/matrix/dataframe of log state variables.
pX | the vector/matrix/dataframe of log proxy variables.
idvar | the vector/matrix/dataframe identifying individual panels.
timevar | the vector/matrix/dataframe identifying time.
zX | the vector/matrix/dataframe of (input price) control variables. By default zX = NULL.
control | the way in which the control variables should be included: one of 'none', 'fs' or '2s'. By default the first option, 'none'.
dum | whether time dummies should be included in the first stage. By default dum = F.
G | the degree of the first-stage polynomial in fX, sX and pX. By default G = 3.
A | the degree of the polynomial for the Markov productivity process. By default A = 3.
R | the number of block bootstrap repetitions to be performed in the standard error estimation. By default R = 20.
orth | a Boolean that determines whether first-stage polynomial should be orthogonal or raw. By default, orth = F.
opt | a string with the optimization algorithm to be used during the estimation. By default opt = 'optim'.
theta0 | a vector with the second stage optimization starting points. By default theta0 = NULL.
cluster | an object of class cluster (as created by makeCluster() in the examples) for parallel execution. By default cluster = NULL.
seed | seed set when the routine starts. By default seed = 123456.
Consider a Cobb-Douglas production technology for firm $i$ at time $t$

$$y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + \omega_{it} + \epsilon_{it}$$

where $y_{it}$ is the (log) output, $w_{it}$ is a 1xJ vector of (log) free variables, $k_{it}$ is a 1xK vector of state variables and $\epsilon_{it}$ is a normally distributed idiosyncratic error term.
The unobserved technical efficiency parameter $\omega_{it}$ evolves according to a first-order Markov process:

$$\omega_{it} = E(\omega_{it} | \omega_{it-1}) + u_{it} = g(\omega_{it-1}) + u_{it}$$

and $u_{it}$ is a random shock component assumed to be uncorrelated with the technical efficiency, the state variables in $k_{it}$ and the lagged free variables $w_{it-1}$.
ACF propose an estimation algorithm alternative to the OP and LP procedures, arguing that the labour demand and the control function are partially collinear.
It is based on the following set of assumptions:
a) $p_{it} = p(k_{it}, l_{it}, \omega_{it})$ is the proxy variable policy function;
b) $p_{it}$ is strictly monotone in $\omega_{it}$;
c) $\omega_{it}$ is scalar unobservable in $p_{it} = p(.)$;
d) The state variables are decided at time t-1. The less variable labor input, $l_{it}$, is chosen at t-b, where $0 < b < 1$. The free variables, $w_{it}$, are chosen in t when the firm productivity shock is realized.
Under this set of assumptions, the first stage is meant to remove the shock $\epsilon_{it}$ from the output, $y_{it}$. As in the OP/LP case, the inverted policy function replaces the productivity term $\omega_{it}$ in the production function:

$$y_{it} = \alpha + k_{it}\gamma + w_{it}\beta + l_{it}\mu + h(p_{it}, k_{it}, w_{it}, l_{it}) + \epsilon_{it}$$

where $\mu$ is the coefficient on $l_{it}$ and $h(.)$ is the inverted policy function. This equation is estimated by a non-parametric approach - First Stage. Exploiting the Markovian nature of the productivity process one can use assumption d) in order to set up the relevant moment conditions and estimate the production function parameters - Second stage.
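The non-parametric first stage can be pictured as a regression of output on a polynomial of degree G in the free, state and proxy variables. The sketch below uses base R on simulated data; the data-generating lines and the mapping of orth = F to raw = TRUE in poly() are assumptions made for illustration, not a description of prodestACF() internals.

```r
# Illustrative first stage on simulated data (assumed DGP, not prodest code)
set.seed(1)
n  <- 500
sX <- rnorm(n)                      # state variable
fX <- rnorm(n)                      # free variable
pX <- 0.5 * sX + rnorm(n)           # proxy variable
Y  <- 0.6 * fX + 0.4 * sX + 0.3 * pX + rnorm(n, sd = 0.1)

G <- 3                              # degree of the first-stage polynomial
# raw = TRUE gives a raw polynomial; raw = FALSE would give an orthogonal one
first_stage <- lm(Y ~ poly(cbind(fX, sX, pX), degree = G, raw = TRUE))

phi_hat <- fitted(first_stage)      # output net of the idiosyncratic shock
fs_res  <- residuals(first_stage)   # first-stage residuals
length(coef(first_stage))           # intercept plus all terms up to degree G
```

With three variables and G = 3 the polynomial has 19 terms, so the number of first-stage parameters grows quickly with G and with the number of inputs.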
The output of the function prodestACF is a member of the S4 class prod. More precisely, it contains the following three elements:
Model, a list with elements:
  method: a string describing the method ('ACF').
  boot.repetitions: the number of bootstrap repetitions used for standard errors' computation.
  elapsed.time: time elapsed during the estimation.
  theta0: numeric object with the optimization starting points - second stage.
  opt: string with the optimization routine used - 'optim', 'solnp' or 'DEoptim'.
  seed: the seed set at the beginning of the estimation.
  opt.outcome: optimization outcome.
  FSbetas: first stage estimated parameters.
Data, a list with elements:
  Y: the vector of value added log output.
  free: the vector/matrix/dataframe of log free variables.
  state: the vector/matrix/dataframe of log state variables.
  proxy: the vector/matrix/dataframe of log proxy variables.
  control: the vector/matrix/dataframe of log control variables.
  idvar: the vector/matrix/dataframe identifying individual panels.
  timevar: the vector/matrix/dataframe identifying time.
  FSresiduals: numeric object with the residuals of the first stage.
Estimates, a list with elements:
  pars: the vector of estimated coefficients.
  std.errors: the vector of bootstrapped standard errors.
Members of class prod have an omega method returning a numeric object with the estimated productivity - that is: $\hat{\omega}_{it} = y_{it} - (\hat{\alpha} + w_{it}\hat{\beta} + k_{it}\hat{\gamma})$. The FSres method returns a numeric object with the residuals of the first stage regression, while summary, show and coef methods are implemented and work as usual.
Gabriele Rovigatti
Ackerberg, D., Caves, K. and Frazer, G. (2015). "Identification properties of recent production function estimators." Econometrica, 83(6), 2411-2451.
De Loecker, J., Goldberg, P. K., Khandelwal, A. K. and Pavcnik, N. (2016). "Prices, markups, and trade reform." Econometrica, 84(2), 445-510.
De Loecker, J. and Warzynski, F. (2012). "Markups and firm-level export status." American Economic Review, 102(6), 2437-71.
require(prodest)
require(parallel)  # provides makeCluster(), stopCluster() and detectCores()

## Chilean data on production. The full version is publicly available at
## http://www.ine.cl/canales/chile_estadistico/estadisticas_economicas/industria/series_estadisticas/series_estadisticas_enia.php
data(chilean)

# we fit a model with two free (skilled and unskilled), one state (capital) and one proxy variable (electricity)
ACF.fit <- prodestACF(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar, theta0 = c(.5,.5,.5), seed = 154673)
ACF.fit.solnp <- prodestACF(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar, theta0 = c(.5,.5,.5), opt = 'solnp', seed = 154673)

# run the same regression in parallel
# detectCores() replaces the Windows-only NUMBER_OF_PROCESSORS environment variable
nCores <- detectCores()
cl <- makeCluster(getOption("cl.cores", max(1, nCores - 1)))
ACF.fit.par <- prodestACF(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar, theta0 = c(.5,.5,.5), cluster = cl, seed = 154673)
stopCluster(cl)

# show results
coef(ACF.fit)
coef(ACF.fit.solnp)
coef(ACF.fit.par)

# show results in .tex tabular format
printProd(list(ACF.fit, ACF.fit.solnp, ACF.fit.par))
The prodestLP() function accepts at least 6 objects (id, time, output, free, state and proxy variables), and returns a prod object of class S4 with three elements: (i) a list of model-related objects, (ii) a list with the data used in the estimation and estimated vectors of first-stage residuals, and (iii) a list with the estimated parameters and their bootstrapped standard errors.
prodestLP(Y, fX, sX, pX, idvar, timevar, R = 20, G = 3, orth = F, cX = NULL, opt = 'optim', theta0 = NULL, seed = 123456, cluster = NULL, tol = 1e-100)
Y | the vector of value added log output.
fX | the vector/matrix/dataframe of log free variables.
sX | the vector/matrix/dataframe of log state variables.
pX | the vector/matrix/dataframe of log proxy variables.
cX | the vector/matrix/dataframe of control variables. By default cX = NULL.
idvar | the vector/matrix/dataframe identifying individual panels.
timevar | the vector/matrix/dataframe identifying time.
R | the number of block bootstrap repetitions to be performed in the standard error estimation. By default R = 20.
G | the degree of the first-stage polynomial in fX, sX and pX. By default G = 3.
orth | a Boolean that determines whether first-stage polynomial should be orthogonal or raw. By default, orth = F.
opt | a string with the optimization algorithm to be used during the estimation. By default opt = 'optim'.
theta0 | a vector with the second stage optimization starting points. By default theta0 = NULL.
cluster | an object of class cluster (as created by makeCluster() in the examples) for parallel execution. By default cluster = NULL.
seed | seed set when the routine starts. By default seed = 123456.
tol | optimizer tolerance. By default tol = 1e-100.
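The R argument refers to block bootstrap repetitions. A minimal sketch of the idea, assuming that whole firms (all periods of a panel unit) are resampled with replacement, is given below. The function block_bootstrap_se, the toy panel and the lm()-based estimator are hypothetical and stand in for the actual estimation routine; the resampling scheme used inside prodest may differ in its details.

```r
# Hypothetical helper, not prodest code: standard errors from a block
# bootstrap that resamples entire panel units.
block_bootstrap_se <- function(data, idvar, estimator, R = 20, seed = 123456) {
  set.seed(seed)
  ids <- unique(data[[idvar]])
  rows_by_id <- split(seq_len(nrow(data)), data[[idvar]])
  draws <- replicate(R, {
    # draw firms with replacement and keep all of their observations
    sampled <- sample(as.character(ids), length(ids), replace = TRUE)
    boot <- data[unlist(rows_by_id[sampled], use.names = FALSE), , drop = FALSE]
    estimator(boot)
  })
  draws <- matrix(draws, ncol = R)  # parameters in rows, repetitions in columns
  apply(draws, 1, sd)
}

# toy panel: 40 firms observed for 5 periods
set.seed(1)
panel <- data.frame(id = rep(1:40, each = 5), x = rnorm(200))
panel$y <- 1 + 0.5 * panel$x + rnorm(200, sd = 0.2)
se <- block_bootstrap_se(panel, "id",
                         function(d) coef(lm(y ~ x, data = d)), R = 20)
se
```

Resampling firms rather than single observations preserves the within-firm time dependence that the lag-based second stage relies on.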
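Consider a Cobb-Douglas production technology for firm $i$ at time $t$

$$y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + \omega_{it} + \epsilon_{it}$$

where $y_{it}$ is the (log) output, $w_{it}$ is a 1xJ vector of (log) free variables, $k_{it}$ is a 1xK vector of state variables and $\epsilon_{it}$ is a normally distributed idiosyncratic error term.
The unobserved technical efficiency parameter $\omega_{it}$ evolves according to a first-order Markov process:

$$\omega_{it} = E(\omega_{it} | \omega_{it-1}) + u_{it} = g(\omega_{it-1}) + u_{it}$$

and $u_{it}$ is a random shock component assumed to be uncorrelated with the technical efficiency, the state variables in $k_{it}$ and the lagged free variables $w_{it-1}$.
The LP method relies on the following set of assumptions, where $m_{it}$ denotes the proxy variable:
a) firms immediately adjust the level of inputs according to the demand function $m(\omega_{it}, k_{it})$ after the technical efficiency shock realizes;
b) $m_{it}$ is strictly monotone in $\omega_{it}$;
c) $\omega_{it}$ is scalar unobservable in $m_{it} = m(.)$;
d) the levels of $k_{it}$ are decided at time $t-1$; the level of the free variable, $w_{it}$, is decided after the shock $u_{it}$ realizes.
Assumptions a)-d) ensure the invertibility of $m_{it}$ in $\omega_{it}$ and lead to the partially identified model:

$$y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + h(m_{it}, k_{it}) + \epsilon_{it} = \alpha + w_{it}\beta + \phi(m_{it}, k_{it}) + \epsilon_{it}$$

which is estimated by a non-parametric approach - First Stage.
Exploiting the Markovian nature of the productivity process one can use assumption d) in order to set up the relevant moment conditions and estimate the production function parameters - Second stage.
The second stage exploits the residual $e_{it}$ of:

$$y_{it} - w_{it}\hat{\beta} = \alpha + k_{it}\gamma + g(\omega_{it-1}, \chi_{it}) + \epsilon_{it}$$

where $g(.)$ is typically left unspecified and approximated by an $n^{th}$ order polynomial and $\chi_{it}$ is an indicator function for the attrition in the market.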
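The second stage can be written out step by step. The following is a sketch of one common formulation of the Levinsohn-Petrin second stage, under the assumptions that attrition is ignored, that $\hat{\beta}$ and $\hat{\phi}_{it}$ (the fitted first-stage function of proxy and state variables) come from the first stage, and that the constant is absorbed into $\hat{g}(.)$; the exact objective minimised by prodestLP() is not stated in this text.

```latex
\begin{aligned}
\hat{\omega}_{it-1}(\gamma) &= \hat{\phi}_{it-1} - k_{it-1}\gamma
  && \text{(implied lagged productivity for a candidate } \gamma\text{)} \\
e_{it}(\gamma) &= y_{it} - w_{it}\hat{\beta} - k_{it}\gamma
  - \hat{g}\big(\hat{\omega}_{it-1}(\gamma)\big)
  && \text{(residual, } \hat{g} \text{ a polynomial fit)} \\
\hat{\gamma} &= \arg\min_{\gamma} \sum_{i}\sum_{t} e_{it}(\gamma)^{2}
  && \text{(second-stage estimate)}
\end{aligned}
```

The starting values passed through theta0 correspond to the initial candidate for the parameters in this minimisation.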
The output of the function prodestLP is a member of the S4 class prod. More precisely, it contains the following three elements:
Model, a list containing:
  method: a string describing the method ('LP').
  boot.repetitions: the number of bootstrap repetitions used for standard errors' computation.
  elapsed.time: time elapsed during the estimation.
  theta0: numeric object with the optimization starting points - second stage.
  opt: string with the optimization routine used - 'optim', 'solnp' or 'DEoptim'.
  seed: the seed set at the beginning of the estimation.
  opt.outcome: optimization outcome.
  FSbetas: first stage estimated parameters.
Data, a list containing:
  Y: the vector of value added log output.
  free: the vector/matrix/dataframe of log free variables.
  state: the vector/matrix/dataframe of log state variables.
  proxy: the vector/matrix/dataframe of log proxy variables.
  control: the vector/matrix/dataframe of log control variables.
  idvar: the vector/matrix/dataframe identifying individual panels.
  timevar: the vector/matrix/dataframe identifying time.
  FSresiduals: numeric object with the residuals of the first stage.
Estimates, a list containing:
  pars: the vector of estimated coefficients.
  std.errors: the vector of bootstrapped standard errors.
Members of class prod have an omega method returning a numeric object with the estimated productivity - that is: $\hat{\omega}_{it} = y_{it} - (\hat{\alpha} + w_{it}\hat{\beta} + k_{it}\hat{\gamma})$. The FSres method returns a numeric object with the residuals of the first stage regression, while summary, show and coef methods are implemented and work as usual.
Gabriele Rovigatti
Levinsohn, J. and Petrin, A. (2003). "Estimating production functions using inputs to control for unobservables." The Review of Economic Studies, 70(2), 317-341.
require(prodest)
require(parallel)  # provides makeCluster(), stopCluster() and detectCores()

## Chilean data on production.
## Publicly available at http://www.ine.cl/canales/chile_estadistico/estadisticas_economicas/industria/series_estadisticas/series_estadisticas_enia.php
data(chilean)

## we fit a model with two free (skilled and unskilled), one state (capital) and one proxy variable (electricity)
LP.fit <- prodestLP(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar, seed = 154673)
LP.fit.solnp <- prodestLP(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar, opt = 'solnp')

# run the same model in parallel
# detectCores() replaces the Windows-only NUMBER_OF_PROCESSORS environment variable
nCores <- detectCores()
cl <- makeCluster(getOption("cl.cores", max(1, nCores - 1)))
LP.fit.par <- prodestLP(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar, cluster = cl, seed = 154673)
stopCluster(cl)

# show results
summary(LP.fit)
summary(LP.fit.solnp)
summary(LP.fit.par)

# show results in .tex tabular format
printProd(list(LP.fit, LP.fit.solnp, LP.fit.par))
The prodestOP() function accepts at least 6 objects (id, time, output, free, state and proxy variables), and returns a prod object of class S4 with three elements: (i) a list of model-related objects, (ii) a list with the data used in the estimation and estimated vectors of first-stage residuals, and (iii) a list with the estimated parameters and their bootstrapped standard errors.
prodestOP(Y, fX, sX, pX, idvar, timevar, R = 20, G = 3, orth = F, cX = NULL, opt = 'optim', theta0 = NULL, seed = 123456, cluster = NULL, tol = 1e-100)
Y | the vector of value added log output.
fX | the vector/matrix/dataframe of log free variables.
sX | the vector/matrix/dataframe of log state variables.
pX | the vector/matrix/dataframe of log proxy variables.
cX | the vector/matrix/dataframe of control variables. By default cX = NULL.
idvar | the vector/matrix/dataframe identifying individual panels.
timevar | the vector/matrix/dataframe identifying time.
R | the number of block bootstrap repetitions to be performed in the standard error estimation. By default R = 20.
G | the degree of the first-stage polynomial in fX, sX and pX. By default, G = 3.
orth | a Boolean that determines whether first-stage polynomial should be orthogonal or raw. By default, orth = F.
opt | a string with the optimization algorithm to be used during the estimation. By default opt = 'optim'.
theta0 | a vector with the second stage optimization starting points. By default theta0 = NULL.
cluster | an object of class cluster (as created by makeCluster() in the examples) for parallel execution. By default cluster = NULL.
seed | seed set when the routine starts. By default seed = 123456.
tol | optimizer tolerance. By default tol = 1e-100.
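Consider a Cobb-Douglas production technology for firm $i$ at time $t$

$$y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + \omega_{it} + \epsilon_{it}$$

where $y_{it}$ is the (log) output, $w_{it}$ is a 1xJ vector of (log) free variables, $k_{it}$ is a 1xK vector of state variables and $\epsilon_{it}$ is a normally distributed idiosyncratic error term.
The unobserved technical efficiency parameter $\omega_{it}$ evolves according to a first-order Markov process:

$$\omega_{it} = E(\omega_{it} | \omega_{it-1}) + u_{it} = g(\omega_{it-1}) + u_{it}$$

and $u_{it}$ is a random shock component assumed to be uncorrelated with the technical efficiency, the state variables in $k_{it}$ and the lagged free variables $w_{it-1}$.
The OP method relies on the following set of assumptions:
a) $i_{it} = i(k_{it}, \omega_{it})$ - investments are a function of both the state variable and the technical efficiency parameter;
b) $i_{it}$ is strictly monotone in $\omega_{it}$;
c) $\omega_{it}$ is scalar unobservable in $i_{it} = i(.)$;
d) the levels of $i_{it}$ and $k_{it}$ are decided at time $t-1$; the level of the free variable, $w_{it}$, is decided after the shock $u_{it}$ realizes.
Assumptions a)-d) ensure the invertibility of $i_{it}$ in $\omega_{it}$ and lead to the partially identified model:

$$y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + h(i_{it}, k_{it}) + \epsilon_{it} = \alpha + w_{it}\beta + \phi(i_{it}, k_{it}) + \epsilon_{it}$$

which is estimated by a non-parametric approach - First Stage.
Exploiting the Markovian nature of the productivity process one can use assumption d) in order to set up the relevant moment conditions and estimate the production function parameters - Second stage.
The second stage exploits the residual $e_{it}$ of:

$$y_{it} - w_{it}\hat{\beta} = \alpha + k_{it}\gamma + g(\omega_{it-1}, \chi_{it}) + \epsilon_{it}$$

where $g(.)$ is typically left unspecified and approximated by an $n^{th}$ order polynomial and $\chi_{it}$ is an indicator function for the attrition in the market.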
The output of the function prodestOP is a member of the S4 class prod. More precisely, it contains the following three elements:
Model, a list containing:
  method: a string describing the method ('OP').
  boot.repetitions: the number of bootstrap repetitions used for standard errors' computation.
  elapsed.time: time elapsed during the estimation.
  theta0: numeric object with the optimization starting points - second stage.
  opt: string with the optimization routine used - 'optim', 'solnp' or 'DEoptim'.
  seed: the seed set at the beginning of the estimation.
  opt.outcome: optimization outcome.
  FSbetas: first stage estimated parameters.
Data, a list containing:
  Y: the vector of value added log output.
  free: the vector/matrix/dataframe of log free variables.
  state: the vector/matrix/dataframe of log state variables.
  proxy: the vector/matrix/dataframe of log proxy variables.
  control: the vector/matrix/dataframe of log control variables.
  idvar: the vector/matrix/dataframe identifying individual panels.
  timevar: the vector/matrix/dataframe identifying time.
  FSresiduals: numeric object with the residuals of the first stage.
Estimates, a list containing:
  pars: the vector of estimated coefficients.
  std.errors: the vector of bootstrapped standard errors.
Members of class prod have an omega method returning a numeric object with the estimated productivity - that is: $\hat{\omega}_{it} = y_{it} - (\hat{\alpha} + w_{it}\hat{\beta} + k_{it}\hat{\gamma})$. The FSres method returns a numeric object with the residuals of the first stage regression, while summary, show and coef methods are implemented and work as usual.
Gabriele Rovigatti
Olley, S G and Pakes, A (1996). "The dynamics of productivity in the telecommunications equipment industry." Econometrica, 64(6), 1263-1297.
require(prodest)

## Chilean data on production. The full version is publicly available at
## http://www.ine.cl/canales/chile_estadistico/estadisticas_economicas/industria/series_estadisticas/series_estadisticas_enia.php
data(chilean)

# we fit a model with two free (skilled and unskilled), one state (capital) and one proxy variable (investment)
OP.fit <- prodestOP(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$inv, d$idvar, d$timevar)
OP.fit.solnp <- prodestOP(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$inv, d$idvar, d$timevar, opt='solnp')
# same model with a control variable (water)
OP.fit.control <- prodestOP(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$inv, d$idvar, d$timevar, cX = d$cX)

# show results
summary(OP.fit)
summary(OP.fit.solnp)
summary(OP.fit.control)

# show results in .tex tabular format
printProd(list(OP.fit, OP.fit.solnp, OP.fit.control))
The prodestWRDG() function accepts at least 6 objects (id, time, output, free, state and proxy variables), and returns a prod object of class S4 with three elements: (i) a list of model-related objects, (ii) a list with the data used in the estimation and estimated vectors of first-stage residuals, and (iii) a list with the estimated parameters and their bootstrapped standard errors.
prodestWRDG(Y, fX, sX, pX, idvar, timevar, R = 20, G = 3, orth = F, cX = NULL, seed = 123456, tol = 1e-100, theta0 = NULL, cluster = NULL)
Y | the vector of value added log output.
fX | the vector/matrix/dataframe of log free variables.
sX | the vector/matrix/dataframe of log state variables.
pX | the vector/matrix/dataframe of log proxy variables.
cX | the vector/matrix/dataframe of control variables. By default cX = NULL.
idvar | the vector/matrix/dataframe identifying individual panels.
timevar | the vector/matrix/dataframe identifying time.
R | the number of block bootstrap repetitions to be performed in the standard error estimation. By default R = 20.
G | the degree of the polynomial for productivity in sX and pX. By default, G = 3.
orth | a Boolean that determines whether first-stage polynomial should be orthogonal or raw. By default, orth = F.
theta0 | a vector with the second stage optimization starting points. By default theta0 = NULL.
cluster | an object of class cluster (as created by makeCluster() in the examples) for parallel execution. By default cluster = NULL.
seed | seed set when the routine starts. By default seed = 123456.
tol | optimizer tolerance. By default tol = 1e-100.
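Consider a Cobb-Douglas production technology for firm $i$ at time $t$

$$y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + \omega_{it} + \epsilon_{it}$$

where $y_{it}$ is the (log) output, $w_{it}$ is a 1xJ vector of (log) free variables, $k_{it}$ is a 1xK vector of state variables and $\epsilon_{it}$ is a normally distributed idiosyncratic error term.
The unobserved technical efficiency parameter $\omega_{it}$ evolves according to a first-order Markov process:

$$\omega_{it} = E(\omega_{it} | \omega_{it-1}) + u_{it}$$

and $u_{it}$ is a random shock component assumed to be uncorrelated with the technical efficiency, the state variables in $k_{it}$ and the lagged free variables $w_{it-1}$.
The Wooldridge method estimates the two OP/LP stages jointly in a system of two equations. It relies on the following set of assumptions, where $p_{it}$ denotes the proxy variable:
a) $\omega_{it} = g(k_{it}, p_{it})$: productivity is an unknown function $g(.)$ of state and proxy variables;
b) $E(\omega_{it} | \omega_{it-1}) = f[\omega_{it-1}]$, productivity is an unknown function $f[.]$ of lagged productivity, $\omega_{it-1}$.
Under the above set of assumptions, it is possible to construct a system GMM using the vector of residuals from

$$r_{1it} = y_{it} - \alpha - w_{it}\beta - k_{it}\gamma - g(k_{it}, p_{it})$$
$$r_{2it} = y_{it} - \alpha - w_{it}\beta - k_{it}\gamma - f[g(k_{it-1}, p_{it-1})]$$

where the unknown function $f(.)$ is approximated by an n-th order polynomial and $g(k_{it}, p_{it}) = \lambda_0 + c(k_{it}, p_{it})\lambda$. In particular, $g(k_{it}, p_{it})$ is a linear combination of functions in $(k_{it}, p_{it})$ and $c_{it}$ are the addends of this linear combination. The residuals $r_{it}$ are used to set the moment conditions

$$E(Z_{it} r_{it}) = 0$$

with a set of instruments $Z_{it}$ built from current and lagged inputs (see Wooldridge, 2009).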
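The way moment conditions of this kind turn into an estimator can be shown on a deliberately simplified case. The sketch below is a single-equation linear GMM with an identity weighting matrix, written in base R on simulated data; the function gmm_objective, the data and the choice Z = X are assumptions for illustration and do not reproduce the two-equation Wooldridge system or the prodestWRDG() code.

```r
# Hypothetical sketch: GMM objective built from the sample analogue of E(Z * r) = 0
gmm_objective <- function(theta, y, X, Z, W = diag(ncol(Z))) {
  r <- as.numeric(y - X %*% theta)        # residuals at candidate parameters
  gbar <- crossprod(Z, r) / length(r)     # sample moments Z'r / n
  as.numeric(t(gbar) %*% W %*% gbar)      # quadratic form to be minimised
}

# simulated data (assumed DGP): intercept, one free and one state variable
set.seed(42)
n <- 300
X <- cbind(1, rnorm(n), rnorm(n))
y <- as.numeric(X %*% c(0.2, 0.6, 0.4) + rnorm(n, sd = 0.1))
Z <- X  # exactly identified case: the regressors serve as instruments

fit <- optim(c(0, 0, 0), gmm_objective, y = y, X = X, Z = Z, method = "BFGS")
ols <- qr.solve(X, y)
cbind(gmm = fit$par, ols = ols)
```

In this exactly identified case the GMM solution coincides with least squares, which makes the sketch easy to check; with more instruments than parameters the weighting matrix W starts to matter.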
The output of the function prodestWRDG is a member of the S4 class prod. More precisely, it contains the following three elements:
Model, a list containing:
  method: a string describing the method ('WRDG').
  elapsed.time: time elapsed during the estimation.
  seed: the seed set at the beginning of the estimation.
  opt.outcome: optimization outcome.
Data, a list containing:
  Y: the vector of value added log output.
  free: the vector/matrix/dataframe of log free variables.
  state: the vector/matrix/dataframe of log state variables.
  proxy: the vector/matrix/dataframe of log proxy variables.
  control: the vector/matrix/dataframe of log control variables.
  idvar: the vector/matrix/dataframe identifying individual panels.
  timevar: the vector/matrix/dataframe identifying time.
Estimates, a list containing:
  pars: the vector of estimated coefficients.
  std.errors: the vector of bootstrapped standard errors.
Members of class prod have an omega method returning a numeric object with the estimated productivity - that is: $\hat{\omega}_{it} = y_{it} - (\hat{\alpha} + w_{it}\hat{\beta} + k_{it}\hat{\gamma})$. The FSres method returns a numeric object with the residuals of the first stage regression, while summary, show and coef methods are implemented and work as usual.
Gabriele Rovigatti
Wooldridge, J M (2009). "On estimating firm-level production functions using proxy variables to control for unobservables." Economics Letters, 104, 112-114.
data("chilean")

# we fit a model with two free (skilled and unskilled), one state (capital) and one proxy variable (electricity)
WRDG.fit <- prodestWRDG(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar)

# show results
WRDG.fit

## Not run:
# simulate a panel dataset - DGP1, various measurement errors - and run the estimation
sim <- panelSim()
WRDG.sim1 <- prodestWRDG(sim$Y, sim$fX, sim$sX, sim$pX1, sim$idvar, sim$timevar)
WRDG.sim2 <- prodestWRDG(sim$Y, sim$fX, sim$sX, sim$pX2, sim$idvar, sim$timevar)
WRDG.sim3 <- prodestWRDG(sim$Y, sim$fX, sim$sX, sim$pX3, sim$idvar, sim$timevar)
WRDG.sim4 <- prodestWRDG(sim$Y, sim$fX, sim$sX, sim$pX4, sim$idvar, sim$timevar)

# show results in .tex tabular format
printProd(list(WRDG.sim1, WRDG.sim2, WRDG.sim3, WRDG.sim4), parnames = c('Free','State'))
## End(Not run)
The prodestWRDG_GMM() function accepts at least 6 objects (id, time, output, free, state and proxy variables), and returns a prod object of class S3 with three elements: (i) a list of model-related objects, (ii) a list with the data used in the estimation and estimated vectors of first-stage residuals, and (iii) a list with the estimated parameters and their bootstrapped standard errors.
prodestWRDG_GMM(Y, fX, sX, pX, idvar, timevar, G = 3, orth = F, cX = NULL, seed = 123456, tol = 1e-100)
Y | the vector of value added log output. |
fX | the vector/matrix/dataframe of log free variables. |
sX | the vector/matrix/dataframe of log state variables. |
pX | the vector/matrix/dataframe of log proxy variables. |
cX | the vector/matrix/dataframe of control variables. By default cX = NULL. |
G | the degree of the polynomial for productivity in sX and pX. By default, G = 3. |
orth | a Boolean that determines whether the first-stage polynomial should be orthogonal or raw. By default, orth = F, i.e. a raw polynomial. |
idvar | the vector/matrix/dataframe identifying individual panels. |
timevar | the vector/matrix/dataframe identifying time. |
seed | seed set when the routine starts. By default seed = 123456. |
tol | optimizer tolerance. By default tol = 1e-100. |
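To make the roles of G and orth more concrete, the sketch below builds a raw and an orthogonal polynomial of degree G in a state and a proxy variable with base R's `poly()`. It assumes, without confirmation from the package source, that the estimator constructs its polynomial in a comparable way; the simulated variables `k`, `p` and `y` are hypothetical.

```r
set.seed(123456)
n <- 200
k <- rnorm(n)                  # hypothetical log state variable
p <- 0.5 * k + rnorm(n)        # hypothetical log proxy variable
y <- 1 + 0.4 * k + 0.2 * p^2 + rnorm(n, sd = 0.1)

G <- 3
# Raw polynomial: monomials k^a * p^b with 1 <= a + b <= G
P_raw  <- poly(k, p, degree = G, raw = TRUE)
# Orthogonal polynomial: products of univariate orthogonal polynomials
P_orth <- poly(k, p, degree = G, raw = FALSE)

n_terms <- ncol(P_raw)         # two variables and G = 3 give 9 terms

# Both bases span the same space once an intercept is included,
# so the fitted values coincide up to numerical error
fit_raw  <- lm(y ~ P_raw)
fit_orth <- lm(y ~ P_orth)
max_diff <- max(abs(fitted(fit_raw) - fitted(fit_orth)))
```

The two bases differ mainly in numerical conditioning: orthogonal terms are less collinear, which can matter for higher values of G.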
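Consider a Cobb-Douglas production technology for firm $i$ at time $t$

$$y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + \omega_{it} + \epsilon_{it}$$

where $y_{it}$ is the (log) output, $w_{it}$ a 1xJ vector of (log) free variables, $k_{it}$ a 1xK vector of state variables and $\epsilon_{it}$ is a normally distributed idiosyncratic error term.
The unobserved technical efficiency parameter $\omega_{it}$ evolves according to a first-order Markov process:

$$\omega_{it} = E(\omega_{it} \mid \omega_{it-1}) + u_{it} = f[\omega_{it-1}] + u_{it}$$

and $u_{it}$ is a random shock component assumed to be uncorrelated with the technical efficiency, the state variables in $k_{it}$ and the lagged free variables $w_{it-1}$.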
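A first-order Markov process for productivity can be illustrated with a linear AR(1) special case. The sketch below uses the values rho = .7 and sigomg = .3 that appear as panelSim() defaults; the linear form of the transition function and all variable names are assumptions made for the example.

```r
set.seed(123456)
n_periods <- 5000
rho    <- 0.7     # assumed AR(1) coefficient of productivity
sigomg <- 0.3     # assumed s.d. of the productivity innovation u

u <- rnorm(n_periods, sd = sigomg)
omega <- numeric(n_periods)
omega[1] <- u[1]
for (s in 2:n_periods) {
  # linear special case of the Markov process: E(omega_t | omega_{t-1}) = rho * omega_{t-1}
  omega[s] <- rho * omega[s - 1] + u[s]
}

# Regressing omega_t on omega_{t-1} should recover rho approximately
rho_hat <- unname(coef(lm(omega[-1] ~ omega[-n_periods]))[2])
```

In the general case the conditional expectation is an unknown function of lagged productivity, which is why the estimators approximate it with a polynomial.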
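The Wooldridge method estimates the two OP/LP stages jointly, in a system of two equations. It relies on the following set of assumptions:
a) $\omega_{it} = g(k_{it}, p_{it})$: productivity is an unknown function $g(\cdot)$ of the state variables and a proxy variable $p_{it}$;
b) $E(\omega_{it} \mid \omega_{it-1}) = f[\omega_{it-1}]$: productivity is an unknown function $f[\cdot]$ of lagged productivity, $\omega_{it-1}$.
Under the above set of assumptions, it is possible to construct a system GMM using the vector of residuals from

$$r_{1it} = y_{it} - \alpha - w_{it}\beta - k_{it}\gamma - g(k_{it}, p_{it})$$
$$r_{2it} = y_{it} - \alpha - w_{it}\beta - k_{it}\gamma - f[g(k_{it-1}, p_{it-1})]$$

where the unknown function $f[\cdot]$ is approximated by an n-th order polynomial and $g(k_{it}, p_{it}) = \lambda_0 + c(k_{it}, p_{it})\lambda$. In particular, $g(k_{it}, p_{it})$ is a linear combination of functions in $(k_{it}, p_{it})$ and $c_{it}$ are the addends of this linear combination. The residuals $r_{1it}$ and $r_{2it}$ are used to set the moment conditions

$$E(Z_{1it}' r_{1it}) = 0, \qquad E(Z_{2it}' r_{2it}) = 0$$

with the following set of instruments:

$$Z_{1it} = (1, w_{it}, k_{it}, c_{it}), \qquad Z_{2it} = (1, k_{it}, w_{it-1}, c_{it-1}, q_{it-1})$$

where $q_{it-1}$ is a set of non-linear functions of $c_{it-1}$.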
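The moment conditions can be sketched in base R on a small simulated panel. This is a simplified illustration, not the package's implementation: it assumes a second-order polynomial for the terms in state and proxy, a linear approximation of the productivity transition (a single coefficient `rho`), the constant of the polynomial absorbed into `alpha`, no extra non-linear instruments, and identity weighting. All names (`lag_by_id`, `moments`, `Q`, `theta0`) are hypothetical.

```r
set.seed(123456)
N <- 50; TT <- 6                       # firms and periods (balanced panel)
id <- rep(seq_len(N), each = TT)       # rows sorted by firm, then time
w <- rnorm(N * TT); k <- rnorm(N * TT); p <- rnorm(N * TT)
y <- 0.6 * w + 0.4 * k + rnorm(N * TT, sd = 0.1)

# Polynomial terms in state and proxy (second order; first column is k itself)
cmat <- cbind(k, p, k^2, k * p, p^2)

# Within-firm one-period lag; the first period of each firm has no lag
lag_by_id <- function(x, id) {
  x <- as.matrix(x)
  out <- rbind(NA, x[-nrow(x), , drop = FALSE])
  out[!duplicated(id), ] <- NA
  out
}
w_lag <- lag_by_id(w, id)
c_lag <- lag_by_id(cmat, id)

Z1 <- cbind(1, w, cmat)                # instruments, equation 1 (k is inside cmat)
Z2 <- cbind(1, k, w_lag, c_lag)        # instruments, equation 2
ok <- complete.cases(Z2)               # drop rows without a lag

# Stacked sample moments for theta = (alpha, beta, gamma, lambda[1:5], rho)
moments <- function(theta) {
  lambda <- theta[4:8]
  base <- y - theta[1] - theta[2] * w - theta[3] * k
  r1 <- base - as.numeric(cmat %*% lambda)
  r2 <- base - theta[9] * as.numeric(c_lag %*% lambda)
  c(colMeans(Z1[ok, ] * r1[ok]), colMeans(Z2[ok, ] * r2[ok]))
}

# GMM objective with identity weighting
Q <- function(theta) sum(moments(theta)^2)

theta0 <- c(0, 0.6, 0.4, rep(0, 5), 0)
m0 <- moments(theta0)
```

A minimizer of `Q` over `theta` (for instance via `optim()`) would give point estimates under these simplifications; an efficient weighting matrix and the additional non-linear instruments are left out to keep the sketch short.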
The output of the function prodestWRDG_GMM is a member of the S3 class prod. More precisely, it is a list (of length 3) containing the following elements:
Model, a list containing:
method: a string describing the method ('WRDG').
elapsed.time: time elapsed during the estimation.
seed: the seed set at the beginning of the estimation.
opt.outcome: optimization outcome.
Data, a list containing:
Y: the vector of value added log output.
free: the vector/matrix/dataframe of log free variables.
state: the vector/matrix/dataframe of log state variables.
proxy: the vector/matrix/dataframe of log proxy variables.
control: the vector/matrix/dataframe of log control variables.
idvar: the vector/matrix/dataframe identifying individual panels.
timevar: the vector/matrix/dataframe identifying time.
Estimates, a list containing:
pars: the vector of estimated coefficients.
std.errors: the vector of bootstrapped standard errors.
Members of class prod have an omega method returning a numeric object with the estimated productivity - that is: $\omega_{it} = y_{it} - (\alpha + w_{it}\beta + k_{it}\gamma)$. The FSres method returns a numeric object with the residuals of the first stage regression, while the summary, show and coef methods are implemented and work as usual.
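For intuition on what first-stage residuals look like, the sketch below runs an OP/LP-style first stage in base R: log output is regressed on the free variable and a third-degree polynomial in state and proxy, and the residuals are kept. Whether FSres returns exactly this quantity for the Wooldridge estimators is an assumption here; the data and names are hypothetical.

```r
set.seed(123456)
n <- 300
k <- rnorm(n)                        # hypothetical log state variable
p <- 0.5 * k + rnorm(n)              # hypothetical log proxy variable
w <- 0.3 * k + rnorm(n)              # hypothetical log free variable
omega <- 0.5 * k + 0.8 * p           # productivity as a function of (k, p)
y <- 0.6 * w + 0.4 * k + omega + rnorm(n, sd = 0.1)

# First stage: output on the free variable plus a polynomial in (k, p)
first_stage <- lm(y ~ w + poly(k, p, degree = 3, raw = TRUE))

fs_res   <- unname(residuals(first_stage))    # first-stage residuals
beta_hat <- unname(coef(first_stage)["w"])    # free-variable coefficient
```

Because productivity is fully captured by the polynomial in this simulated setup, the residuals approximate the idiosyncratic error and the coefficient on `w` is close to its assumed value of 0.6.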
Gabriele Rovigatti
Wooldridge, J M (2009). "On estimating firm-level production functions using proxy variables to control for unobservables." Economics Letters, 104, 112-114.
data("chilean")
# we fit a model with two free (skilled and unskilled), one state (capital) and one proxy variable (electricity)
WRDG.GMM.fit <- prodestWRDG_GMM(d$Y, fX = cbind(d$fX1, d$fX2), d$sX, d$pX, d$idvar, d$timevar)
# show results
WRDG.GMM.fit
# simulate a panel dataset - DGP1, various measurement errors - and run the estimation
sim <- panelSim()
WRDG.GMM.sim1 <- prodestWRDG_GMM(sim$Y, sim$fX, sim$sX, sim$pX1, sim$idvar, sim$timevar)
WRDG.GMM.sim2 <- prodestWRDG_GMM(sim$Y, sim$fX, sim$sX, sim$pX2, sim$idvar, sim$timevar)
WRDG.GMM.sim3 <- prodestWRDG_GMM(sim$Y, sim$fX, sim$sX, sim$pX3, sim$idvar, sim$timevar)
WRDG.GMM.sim4 <- prodestWRDG_GMM(sim$Y, sim$fX, sim$sX, sim$pX4, sim$idvar, sim$timevar)
# show results in .tex tabular format
printProd(list(WRDG.GMM.sim1, WRDG.GMM.sim2, WRDG.GMM.sim3, WRDG.GMM.sim4), parnames = c('Free','State'))