Brief Introduction SEEMINGLY UNRELATED REGRESSION (SUR)
Brief Introduction SEEMINGLY UNRELATED REGRESSION
(SUR) APPLICATION and DEVELOPMENT
ALDON MHP SINAGA
Yogyakarta, 2015
CONTENTS
INTRODUCTION
About SUR
SUR Model Specification and Estimation
IMPLEMENTATION OF SUR ON RESEARCH
SUR in Environmental Research
SUR in Animal Husbandry Research
SUR in Agricultural Production Research
SUR in Agricultural Economics Research
Applied SUR Overview
Application Approaches
Application Benefit
END NOTES AND CONCLUSION
Development of SUR Model
Conclusion
REFERENCES
INTRODUCTION
About SUR
In applied research we often deal with several equations that share some of the
same regressors, where those regressors are known to influence different regressands.
Suppose we want to study investment in two large companies, as illustrated by Zellner
(1962), say Westinghouse and General Electric. We describe the situation with two
separate equations and assume that, in each equation, real gross investment is
determined by the firm's outstanding shares at the beginning of the period and the
opening value of the company's capital stock. Because the data for both equations come
from the same periods, we can expect the error terms of the two equations to be
contemporaneously correlated (Zellner, 1962).
The error terms are correlated because both firms face common market conditions
and because similar omitted factors are likely to end up in the error term of each
regression. For instance, if the error term of the first equation reflects some
unobservable variable that also affects the error term of the other equation, the
error terms of the two equations will be correlated. This is why the two equations
are only “seemingly” unrelated.
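The idea of contemporaneous correlation can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the shock weight 0.8, the sample size and all variable names are hypothetical and are not taken from Zellner's data. It builds two error series that share one unobserved common shock and compares their same-period correlation with their correlation one period apart.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000  # number of periods (hypothetical)

# One unobserved common shock (e.g. market conditions) enters both error terms.
common = rng.normal(size=T)
e1 = 0.8 * common + rng.normal(size=T)  # error term of equation 1
e2 = 0.8 * common + rng.normal(size=T)  # error term of equation 2

# Contemporaneous correlation: same period, different equations.
contemporaneous = np.corrcoef(e1, e2)[0, 1]
# Correlation between different periods (e1 at t, e2 at t-1).
lagged = np.corrcoef(e1[1:], e2[:-1])[0, 1]

print(f"same-period correlation:      {contemporaneous:.2f}")
print(f"one-period-apart correlation: {lagged:.2f}")
```

Under these assumed values the same-period correlation is clearly positive while the correlation across different periods stays near zero, which is the error pattern that SUR is designed to exploit.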
James Thornton (2013) notes that Seemingly Unrelated Regression (SUR) is one of
four types of equation systems that contain two or more related equations. Besides
SUR, there are (1) the simultaneous equation system, (2) the recursive equation
system, and (3) the block recursive system. In several ways SUR resembles a
simultaneous equation system, but the two differ fundamentally in how the equations
are related to one another.
In a SUR equation system, the equations are related in one or both of the
following ways:
1. The error terms of the equations are related. The error terms are
correlated if common unobserved factors influence the dependent variables
of the equations.
2. The parameters of the different equations are related. This occurs if
(a) the same parameter appears in more than one equation, or (b) a
parameter in one equation is a linear or nonlinear function of parameters
in the other equations.
(Thornton, 2013)
Mark Beasley (2008) states clearly that SUR models are often applied when we
face several equations that appear to be unrelated but may in fact be related because:
1. Some coefficients are assumed to be the same or zero;
2. The disturbances are known to be correlated across equations; and/or
3. A subset of the right-hand-side variables (independent variables /
regressors) is the same across equations.
Many economic arguments and phenomena are best described by a Seemingly
Unrelated Regression equation system. Examples include: (1) investment demand
equations for several firms in the same industry; (2) consumer demand functions
implied by utility-maximizing behavior; and (3) input demand functions implied by
firms' cost-minimizing and profit-maximizing behavior (Thornton, 2013).
SUR Model Specification and Estimation
Building on the background above, Thornton (2013) describes the specification
and estimation of SUR. Assume there are M equations that are related to one another
because their error terms are correlated. These M seemingly unrelated regression
equations can be written in matrix format:
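\[
\begin{aligned}
y_1 &= X_1 \beta_1 + \varepsilon_1 \\
y_2 &= X_2 \beta_2 + \varepsilon_2 \\
y_3 &= X_3 \beta_3 + \varepsilon_3 \\
&\;\;\vdots \\
y_M &= X_M \beta_M + \varepsilon_M
\end{aligned}
\]
or, more compactly,
\[
y_i = X_i \beta_i + \varepsilon_i, \qquad i = 1, 2, 3, \dots, M.
\]
If the number of time periods is represented by $T$, then in matrix form:
$y_i$ is the $T \times 1$ column vector of observations on the $i$th dependent variable;
$X_i$ is the $T \times K$ matrix of observations on the $K-1$ explanatory variables plus a constant column;
$\beta_i$ is the $K \times 1$ column vector of parameters for the $i$th equation;
and $\varepsilon_i$ is the $T \times 1$ column vector of disturbances for the $i$th equation.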
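We can now combine these M equations into one large equation, which can be
written in matrix form as follows:
\[
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_M \end{bmatrix}}_{(MT) \times 1}
=
\underbrace{\begin{bmatrix}
X_1 & 0 & 0 & \cdots & 0 \\
0 & X_2 & 0 & \cdots & 0 \\
0 & 0 & X_3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & X_M
\end{bmatrix}}_{(MT) \times (MK)}
\underbrace{\begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \vdots \\ \beta_M \end{bmatrix}}_{(MK) \times 1}
+
\underbrace{\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_M \end{bmatrix}}_{(MT) \times 1}
\]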
This stacked system of equations can be written simply as
\[
y = X\beta + \varepsilon
\]
where:
$y$ = $(MT) \times 1$ column vector of observations on the regressands of the M equations
$X$ = $(MT) \times (MK)$ matrix of observations on the regressors
$\beta$ = $(MK) \times 1$ column vector of parameters for the M equations
$\varepsilon$ = $(MT) \times 1$ column vector of disturbances for the M equations
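The stacking step can be sketched in a few lines of NumPy. This is a minimal illustration, not code from Thornton (2013): the helper name `stack_sur` and the random data are hypothetical, and the only purpose is to show how the block-diagonal regressor matrix and the stacked regressand get their dimensions.

```python
import numpy as np


def stack_sur(ys, Xs):
    """Stack M equations into y (MT,) and a block-diagonal X (MT x sum of K_i)."""
    M, T = len(ys), len(ys[0])
    X = np.zeros((M * T, sum(Xi.shape[1] for Xi in Xs)))
    col = 0
    for i, Xi in enumerate(Xs):
        k = Xi.shape[1]
        # Equation i occupies rows i*T..(i+1)*T and its own block of columns.
        X[i * T:(i + 1) * T, col:col + k] = Xi
        col += k
    return np.concatenate(ys), X


# Hypothetical example: M = 3 equations, T = 4 periods, K = 2 columns each.
rng = np.random.default_rng(1)
T, K, M = 4, 2, 3
Xs = [np.column_stack([np.ones(T), rng.normal(size=T)]) for _ in range(M)]
ys = [rng.normal(size=T) for _ in range(M)]

y, X = stack_sur(ys, Xs)
print(y.shape, X.shape)  # (12,) (12, 6)
```

The off-diagonal blocks stay zero, so each equation keeps its own parameters; the equations are linked only through the error covariance.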
The specification above rests on the following assumptions:
1. The functional form of the SUR equation is linear in parameters.
\[ y = X\beta + \varepsilon \]
2. The mean of the error term in the SUR equation is equal to zero.
\[ E(\varepsilon) = 0 \]
3. The errors of the SUR equation have the variance-covariance matrix
\[ \mathrm{Cov}(\varepsilon) = E(\varepsilon\varepsilon^T) = W = \Sigma \otimes I, \]
where $\Sigma$ is the $M \times M$ matrix of error variances and covariances across
equations and $I$ is the $T \times T$ identity matrix. This requires that:
a. The error variance of each individual equation in the SUR system is constant (no heteroscedasticity).
b. The error variance may differ between individual equations.
c. The errors within each individual equation in the SUR system are uncorrelated (no autocorrelation).
d. The errors of different individual equations are contemporaneously correlated, following this pattern:
For time series data, the errors in different equations at the same time are correlated, while the errors in different equations at different times are not correlated.
For cross-section data, the errors in different equations for the same decision-making unit (sampled respondent) are correlated, while the errors in different equations for different decision-making units are not correlated.
4. The error term of the SUR equation has a normal distribution.
\[ \varepsilon \sim N(0, W) \]
5. The error term of the SUR equation is uncorrelated with every independent
variable in the SUR equation.
\[ \mathrm{Cov}(\varepsilon, X) = 0 \]
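The Kronecker structure of the error covariance in assumption 3 can be made concrete with a tiny numerical sketch. The values in `Sigma` and the choice of two equations and three periods are hypothetical; the snippet only shows which entries of W are non-zero when the data are stacked equation by equation.

```python
import numpy as np

T = 3  # periods (hypothetical)
# Hypothetical cross-equation error covariance for M = 2 equations.
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

# Stacked error covariance: W = Sigma (Kronecker product) I_T.
W = np.kron(Sigma, np.eye(T))
print(W.shape)  # (6, 6)

# Row 0 is equation 1 at period 0; column 3 is equation 2 at period 0.
print(W[0, 3])  # same period, different equations: 0.5
# Column 4 is equation 2 at period 1.
print(W[0, 4])  # different periods: 0.0

# The inverse keeps the same structure, which GLS exploits.
W_inv = np.kron(np.linalg.inv(Sigma), np.eye(T))
```

Because the inverse of W is again a Kronecker product, only the small M x M matrix has to be inverted in practice.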
Finally, to estimate the parameters ($\beta_i$) of the SUR model, we need
to choose an estimator. Four estimators can be employed to estimate $\beta_i$:
1. Ordinary least squares (OLS). The estimator follows the rule
\[ \hat{\beta} = (X^T X)^{-1} X^T y, \]
which can easily be employed to estimate the $(MK) \times 1$ vector $\beta$ in the
single stacked equation. The OLS estimator is unbiased but inefficient, because
the errors of the SUR equation are non-spherical and OLS fails to use the
information in this error structure when estimating the parameters ($\beta_i$).
2. Generalized least squares (GLS). GLS estimates the parameters ($\beta_i$)
following the rule
\[ \hat{\beta}_{GLS} = (X^T W^{-1} X)^{-1} X^T W^{-1} y = [X^T (\Sigma^{-1} \otimes I) X]^{-1} X^T (\Sigma^{-1} \otimes I) y. \]
The rule can be applied directly to the SUR equation, where W is the $(MT) \times (MT)$
error variance-covariance matrix of the SUR equation. The GLS estimator is
unbiased and efficient, and under the normality assumption it is also the
maximum likelihood estimator. GLS is more precise than OLS because it uses the
information about the disturbances contained in W when estimating the
parameters. Its major limitation is that it is not a feasible estimator: in
practice the elements of the error variance-covariance matrix W are unknown.
3. Feasible generalized least squares (FGLS). FGLS makes the GLS estimator
feasible and follows the rule
\[ \hat{\beta}_{FGLS} = (X^T \hat{W}^{-1} X)^{-1} X^T \hat{W}^{-1} y = [X^T (\hat{\Sigma}^{-1} \otimes I) X]^{-1} X^T (\hat{\Sigma}^{-1} \otimes I) y. \]
As mentioned above, the limitation of GLS is its feasibility. To overcome
it, we estimate W from the sample data and replace W with the estimate
$\hat{W}$, which yields the FGLS estimator. The most popular method for
estimating W is Zellner's method. FGLS is known to be consistent and
asymptotically efficient. Some studies also suggest that FGLS has a smaller
variance than the OLS estimator.
4. Iterated feasible generalized least squares (ITGLS). This method is an
alternative to FGLS. Instead of estimating W only once, as in FGLS, the
estimate $\hat{W}$ is obtained with Zellner's iterated SUR (ISUR) method.
Like FGLS, ITGLS yields consistent and asymptotically efficient estimates.
Although this is still debated, ITGLS is also reported to yield better
estimates in small samples.
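The four estimators above can be compared in a short NumPy sketch. This is an illustration under stated assumptions rather than a reference implementation: the data are simulated, the function names `stack_system` and `sur_fgls` are hypothetical, the residual covariance is divided by T (other divisors are also used in practice), and the iteration simply runs a fixed number of rounds instead of checking convergence.

```python
import numpy as np


def stack_system(ys, Xs):
    """Stack M equations into y (MT,) and a block-diagonal X."""
    M, T = len(ys), len(ys[0])
    X = np.zeros((M * T, sum(Xi.shape[1] for Xi in Xs)))
    col = 0
    for i, Xi in enumerate(Xs):
        X[i * T:(i + 1) * T, col:col + Xi.shape[1]] = Xi
        col += Xi.shape[1]
    return np.concatenate(ys), X


def sur_fgls(ys, Xs, n_iter=1):
    """FGLS for a SUR system: n_iter=1 is two-step FGLS, n_iter>1 iterates."""
    M, T = len(ys), len(ys[0])
    y, X = stack_system(ys, Xs)
    # Start from OLS on the stacked system (equal to equation-by-equation OLS
    # because X is block diagonal).
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        resid = (y - X @ beta).reshape(M, T)   # row i = residuals of equation i
        sigma_hat = resid @ resid.T / T        # M x M estimate of Sigma
        w_inv = np.kron(np.linalg.inv(sigma_hat), np.eye(T))
        cov_beta = np.linalg.inv(X.T @ w_inv @ X)
        beta = cov_beta @ X.T @ w_inv @ y
    return beta, cov_beta, sigma_hat


# Simulated two-equation system with a shared shock in the errors (hypothetical).
rng = np.random.default_rng(42)
T = 200
common = rng.normal(size=T)
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
y1 = X1 @ np.array([1.0, 2.0]) + common + 0.5 * rng.normal(size=T)
y2 = X2 @ np.array([-1.0, 0.5]) + common + 0.5 * rng.normal(size=T)

y, X = stack_system([y1, y2], [X1, X2])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_fgls, cov_fgls, sigma_hat = sur_fgls([y1, y2], [X1, X2])
beta_itgls, _, _ = sur_fgls([y1, y2], [X1, X2], n_iter=25)

print("OLS  :", np.round(beta_ols, 3))
print("FGLS :", np.round(beta_fgls, 3))
print("ITGLS:", np.round(beta_itgls, 3))
print("FGLS standard errors:", np.round(np.sqrt(np.diag(cov_fgls)), 3))
```

The explicit Kronecker product keeps the code close to the formulas; for large T one would avoid forming the full (MT) x (MT) matrix.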
IMPLEMENTATION OF SUR ON RESEARCH
As described above, many research problems and phenomena are best described by
SUR. When two or more equations look unrelated but are linked through their error
terms, SUR can estimate them accurately and efficiently. The studies below have
employed SUR and can be used to learn about and illustrate its implementation.
SUR in Environmental Research
Zaman et al. (2011) employed SUR in their study of the impact of
population on environmental degradation in South Asia. The study was conducted to
provide empirical evidence on the relationship between population and environmental
degradation over 1985-2009 in three countries: India, Pakistan and Sri Lanka.
The researchers used a balanced panel of the three countries over the 25-year
period 1985-2009. The data were taken from the World Development Indicators
published by the World Bank (2009). The study used four variables: arable land (AL),
carbon dioxide emissions (CO2), population growth (PG) and population density (PD).
To study the implications of the population indicators (growth and density) for
the environment, the researchers used the two environmental indicators (CO2 and AL)
separately as dependent variables over the period of study. They thus estimated two
simple nonlinear population-environment models, specified as follows:
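\[
\begin{aligned}
\log AL_{it} &= \alpha + \beta_{1r} \log PG_{it} + \beta_{2r} \log PD_{it} + \varepsilon_{it} \\
\log \mathrm{CO2}_{it} &= \omega + \gamma_{1r} \log PG_{it} + \gamma_{2r} \log PD_{it} + \mu_{it}
\end{aligned}
\]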
where AL is the area of arable land (hectares); CO2 is the level of carbon dioxide
emissions (kt); PG is the rate of population growth (annual %); PD is population
density (people per km2); t = 1, 2, …, 25 indexes the time periods; and
i = 1, …, 3 indexes the countries.
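Because both sides of each model are in logarithms, the slope coefficients can be read as elasticities. As a brief worked step, with the coefficient on log PG in the arable land equation written here as $\beta_{1r}$ (notation assumed for this sketch):

```latex
\[
\frac{\partial \log AL_{it}}{\partial \log PG_{it}} = \beta_{1r}
\quad\Longrightarrow\quad
\%\Delta AL_{it} \approx \beta_{1r} \times \%\Delta PG_{it}
\]
```

So a 1% rise in population growth is associated with roughly a $\beta_{1r}$% change in arable land, holding population density fixed. The same reading applies to the coefficients of the CO2 equation.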
To analyze the empirical model described above, Zaman et al. carried out the
following sequence of procedures:
1. Panel unit root test, used to check whether the data are stationary. It is a
common procedure when working with time series data. To run the test, the
researchers employed the Im, Pesaran and Shin (IPS) test (2003).
2. Panel cointegration test, used to check for long-run cointegration between
the variables. This test shows whether it is appropriate to employ SUR
estimation.
3. Finally, SUR estimation, employed to estimate the relationship between
the population and environmental indicators.
The results of the study reveal that population density significantly increases
arable land in all three countries. Population growth and population density both
increase carbon dioxide emissions in Pakistan, while only population growth
significantly increases carbon dioxide emissions in India and Sri Lanka.
Apart from describing the model as “non-linear” (the log-log specification is
linear in its parameters), the researchers conducted the study properly and applied
each analysis procedure appropriately.
SUR in Animal Husbandry Research
The study titled “The Use of Seemingly Unrelated Regression (SUR) to
Predict the Carcass Composition of Lambs” shows a different type of SUR
application. It aimed to develop and evaluate models for predicting the
carcass composition of lambs (Cadavez and Henningsen, 2011).
The study specified the five equations below:
\begin{align}
LMP &= \alpha_0 + \alpha_1 HCW + \alpha_2 C12 + \alpha_3 E2 + \varepsilon_1 \tag{1} \\
SFP &= \beta_0 + \beta_1 HCW + \beta_2 C12 + \beta_3 E2 + \varepsilon_2 \tag{2} \\
IFP &= \gamma_0 + \gamma_1 HCW + \gamma_2 C12 + \gamma_3 E2 + \varepsilon_3 \tag{3} \\
BP &= \delta_0 + \delta_1 HCW + \delta_2 C12 + \delta_3 E2 + \varepsilon_4 \tag{4} \\
KCFP &= \theta_0 + \theta_1 HCW + \theta_2 C12 + \theta_3 E2 + \varepsilon_5 \tag{5}
\end{align}
where LMP = lean meat proportion, SFP = subcutaneous fat proportion, IFP =
intermuscular fat proportion, BP = bone plus remainders proportion, and KCFP =
kidney knob and channel fat proportion, each assumed to be determined by HCW = hot
carcass weight. The equations also include C12 = the subcutaneous fat thickness (mm)
measured between the 12th and 13th ribs, and E2 = the total breast bone tissue
thickness (mm), taken with a sharpened steel rule in the middle of the 2nd
sternebra.
The study compared OLS estimates of the five equations above with SUR
estimates obtained through generalized least squares (GLS). The analysis
gave the results below:
The results show that the SUR technique can generate results that are
relevant for implementing objective carcass classification systems. An important
finding of the study is that hot carcass weight (HCW) and total breast bone tissue
thickness (E2) are the most relevant predictors of the carcass tissue indicators,
although the performance of the predictors might differ for other lamb populations
(other breeds or production systems).
Statistically, compared with OLS, the SUR estimator provides the lowest standard
errors of the estimated parameters and thus the highest precision of estimation.
SUR in Agricultural Production Research
“Determinants of profitability in rain-fed paddy rice production in Ikenne
Agricultural Zone, Ogun State, Nigeria” is the title of a study conducted by O.O.
Olubanjo and O. Oyebanjo (2005). The study analyzed the effect of farm input use on
the profitability of rain-fed paddy rice production in this agricultural zone of
Ogun State, Nigeria.
The study fitted a Cobb-Douglas type profit function to the data. SUR was
employed to estimate the factors that determine the profitability of paddy
production across 6 different rice production communities. The profit function is
given by the equation below:
\[
\ln \Pi_i^* = \ln A_i^* + \mu_{i1}^* \ln p + \mu_{i2}^* \ln r + \mu_{i3}^* \ln w + b_{i1}^* \ln K + b_{i2}^* \ln T + b_{i3}^* \ln D
\]
where:
Πi* = the UOP (unit output price) profit, i.e. revenue less variable costs, divided by the unit price of output,
p = price per kg of seed,
r = price per kg of fertiliser,
w = wage rate per man-day,
K = interest on fixed capital per farm,
T = farm size cultivated in hectares per farm,
D = dummy for the variety of seed cultivated (0 = local and 1 = improved), and
i = 1, 2, 3, …, 6 (the different rice production communities).
The results of the study are described in the table below. They reveal that
profit responds positively to the quantity of fertiliser applied and to the farm
size cultivated, and negatively to labor costs and seed expenses. Farmers also tend
to increase the amount of fertiliser they use, since this enhances productivity and
raises the profitability of rain-fed paddy rice production.
SUR in Agricultural Economics Research
Another study comes from Nidhiya Menon (2007), who investigated rainfall
uncertainty and occupational choice in agricultural households of rural Nepal. The
study aimed to explain why households strive to diversify their sources of income
even though agriculture is the main occupation in Nepal.
Menon (2007) modeled household members' choice of occupation with the
equation described below:
where :
Pijw = the probability that non-head member i in household j in ward (region) w is engaged in the same occupation as his/her household head.
X1ijw and X2jw = exogenous variables, specific to member i and household j, that influence the choice of occupation. These variables are: age, gender, land ownership, the number of males and females over 10 years of age in the household, and the total number of dependents.
X3w = variables specific to the ward that affect occupational choice.
COVw = coefficient of variation of rain.
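The coefficient of variation used as the rainfall-uncertainty measure is the standard deviation divided by the mean. A minimal sketch follows, with made-up rainfall figures for two hypothetical wards; the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which version Menon used.

```python
import numpy as np


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()


# Hypothetical annual rainfall (mm) for two wards with the same mean.
rain_ward_a = [100, 110, 90, 105, 95]   # stable rainfall
rain_ward_b = [100, 160, 40, 150, 50]   # erratic rainfall

cv_a = coefficient_of_variation(rain_ward_a)
cv_b = coefficient_of_variation(rain_ward_b)
print(f"ward A: {cv_a:.3f}")  # ward A: 0.079
print(f"ward B: {cv_b:.3f}")  # ward B: 0.552
```

Both wards receive the same average rainfall, but ward B's higher coefficient of variation marks it as the more uncertain location.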
The researcher distinguished two versions of P: the probability for household
workers classified as self-employed (Pse) and the probability for household workers
classified as wage-employed (Pwe). Because both dependent variables depend on the
same regressors, Menon found the two equations suitable for Zellner's seemingly
unrelated regression (SUR) model.
The research has found that occupational choice is highly correlated to the
uncertainty associated with historical rainfall patterns. Where the head is employed
in agriculture, other family members are less likely to choose agriculture as an
occupation in districts where rain is more uncertain. Estimates indicate that for a 1%
increase in the coefficient of variation of rain, there is a 0.61% decrease in the
probability of choosing the same occupation as the household head, where the head is
classified as self-employed in agriculture. The negative effect of rainfall uncertainty
on occupational choice is less evident in households that have access to credit, and in
households with relatively high levels of human capital.
A similar study comes from Feng and Heerink (2008). Their research examined the
factors that determine the participation of farm households in land renting and
migration, and investigated whether participation in land renting and participation
in migration influence each other. The study assumed that land renting and migration
are determined by household characteristics, other fixed factors (such as number of
dependants, number of durable assets, number of cattle, age of household head, age
of adults, education of household head, education of adults, female-male ratio,
irrigated land per adult, possession of land contracts, and land transfer rights),
labor endowments, and household land endowments. On this basis the study specified
the models below:
Where;
R = dummy variable for land renting (1 if the household rented land)
M = dummy variable for migration (1 if at least one household member was involved in migration)
ZH = a vector of household characteristics
ZF = a vector of fixed factors
L = household labour endowment
A = household land endowment
w = wage rate
r = land rent
Z = a vector of institutional factors affecting land renting and migration
ε = error terms
Because the dependent variables are two binary (qualitative) variables, the
models above can be called seemingly unrelated bivariate probit models. In this
study, the model proved able to capture household decision making over land and
labor and its dependence on several independent variables.
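The logic of a seemingly unrelated bivariate probit can be sketched by simulation: two binary decisions are each driven by a latent propensity, and the two latent errors are correlated. Everything below is hypothetical (coefficients, the error correlation of -0.6, the single regressor `z`); it is not Feng and Heerink's model or data, and it only simulates outcomes rather than estimating the model.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20000    # households (hypothetical)
rho = -0.6   # assumed correlation between the two latent errors

z = rng.normal(size=n)  # one household characteristic (hypothetical)
errors = rng.multivariate_normal(mean=[0.0, 0.0],
                                 cov=[[1.0, rho], [rho, 1.0]],
                                 size=n)

# Latent propensities for renting land and for migration.
rent_latent = 0.2 + 0.5 * z + errors[:, 0]
migrate_latent = -0.1 + 0.3 * z + errors[:, 1]

# Observed binary outcomes: 1 if the latent propensity is positive.
R = (rent_latent > 0).astype(int)
M = (migrate_latent > 0).astype(int)

p_rent_given_migrate = R[M == 1].mean()
p_rent_given_stay = R[M == 0].mean()
print(f"P(rent | migration)    = {p_rent_given_migrate:.2f}")
print(f"P(rent | no migration) = {p_rent_given_stay:.2f}")
```

With a negative error correlation, households that migrate rent land less often than those that do not, even though both decisions respond positively to the same characteristic. Estimating the two probit equations separately would ignore this link.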
This study has become a classic example of the use of SUR. Many studies on the
same topic failed to find the interrelation between the reforms in rural China and
the development of land and labor markets. Feng and Heerink (2008) found that the
error terms of the land renting equation and the migration equation were strongly
correlated, and that there is a negative relationship between land renting and
migration.
Applied SUR Overview
Application Approaches
Most of the studies described above used SUR without significant
modification. They tell us several things about applied SUR.
1. The application of SUR differs according to the type of dependent variable in
the SUR equations. The first type uses a quantitative dependent variable and
the second a qualitative dependent variable. Few adjustments to the SUR
approach are needed when the dependent variable is quantitative. With a
qualitative dependent variable, we are introduced to the use of SUR with
bivariate probit models (Feng and Heerink, 2008).
2. We are also introduced to the use of SUR for time series data and cross-section
data. For time series data there are additional procedures: a panel
cointegration test is employed to check for long-run cointegration between
the variables. This test is needed to show whether it is appropriate to apply
SUR estimation to the data (Zaman et al., 2011; Menon, 2007).
3. We are also introduced to the use of SUR to estimate coefficients from
panel data sets (Zaman et al., 2011).
Application Benefit
SUR delivers particular benefits depending on the research objective, so the
benefit is not the same from one study to another. Most of the studies above show
that the standard errors obtained with SUR were smaller than those generated by OLS.
The study by Rumelan and Setiawan (2009) specifically aimed to compare
whether OLS or SUR delivers the better estimators for household food
security. The results are given in the following table:
These results are also supported by Zaman et al. (2011) and Cadavez and Henningsen
(2011), described above. They show that SUR is a more efficient regression method
than OLS, owing to its smaller standard errors.
A further benefit of using SUR is shown by Feng and Heerink (2008).
Their study disentangled the relationship between the various household
characteristics that determine whether farmers decide to rent land or to migrate.
Where many previous studies failed to reveal this interaction, Feng and Heerink
found that the error terms of the two equations were correlated. Although the
equations are seemingly unrelated, the SUR model brought out the underlying
relationship and also found significant effects of several regressors.
END NOTES AND CONCLUSION
Development of SUR Model
Like many statistical models, SUR has undergone modification and development.
Most of the modifications were intended to improve the application of SUR to
particular situations. Some of these developments are as follows:
1. Dynamic Seemingly Unrelated Regression (DSUR). This is a parameter
estimator proposed by Nelson C. Mark, Masao Ogaki and Donggyu Sul (2004).
DSUR has proven feasible especially for balanced panel data in which the
number of cointegrating equations (N) is substantially smaller than the
number of time series observations (T). DSUR is applicable both when the
cointegrating vectors are homogeneous across equations and when they are
not. The model has been used to estimate the relation between investment
rates and saving rates in European countries (Mark et al., 2004).
2. Sparse Seemingly Unrelated Regression (SSUR). Because the conventional,
unconstrained SUR model is known to be over-parameterized, Wang (2010)
introduced the Bayesian analysis of SSUR. Its main innovations include:
a. Inference via Markov chain Monte Carlo (MCMC) simulation for
specific constraints on the regression coefficients and errors,
b. Evaluation of the marginal likelihoods of restrictions using coupled
Candidate's formula approximations, and
c. The extension of sparse modelling to dynamic SUR models.
Conclusion
Having discussed the studies and articles above, we can conclude that:
o Seemingly Unrelated Regression estimated by GLS is known to be unbiased and
efficient and to satisfy the maximum likelihood criterion when applied to
multiple equations that contain similar regressors.
o With a few adjustments in particular studies, SUR can be applied properly to a
wide range of research.
o Seemingly Unrelated Regression has also seen development in model specification
and estimation. Dynamic SUR and Sparse SUR are two such modifications.
REFERENCES
Cadavez, V.A.P. and A. Henningsen, 2011. The Use of Seemingly Unrelated Regression (SUR) to Predict the Carcass Composition of Lambs. Meat Science 92, 2012, p. 548–553. DOI: 10.1016/j.meatsci.2012.05.0257.
Feng, S. and N. Heerink, 2008. Are farm households' land renting and migration decisions inter-related in rural China?. NJAS - Wageningen Journal of Life Sciences Volume 55 issue 4 May 2008 pp: 345-362.
Beasley, Mark, 2008. Seemingly Unrelated Regression (SUR) Models as a Solution to Path Analytic Models with Correlated Errors. Multiple Linear Regression Viewpoints, 2008, Vol. 34(1). University of Alabama at Birmingham.
Menon, Nidhiya, 2007. Rainfall Uncertainty and Occupational Choice in Agricultural Households of Rural Nepal. 2006 North East Universities Development Conference at Cornell University. November 1st, 2006.
Olubanjo O.O. and O. Oyebanjo, 2005. Determinants of profitability in rain-fed paddy rice production in Ikenne Agricultural Zone, Ogun State, Nigeria. African Crop Science Conference Proceedings, Vol. 7. pp. 901-903. ISSN 1023-070X/2005.
Rumelan and Setiawan, 2009. Pemodelan Ketahanan Pangan Rumah Tangga Indonesia dengan Pendekatan Seemingly Unrelated Regression. Repository - Institut Teknologi Sepuluh Nopember – Surabaya.
Thornton, James, 2013. Econometrics – SUR Models Text Files. Eastern Michigan University, USA. Downloaded from http://people.emich.edu/jthornton/text-files/Econ515_out_sur_model.doc. July 7th, 2015.
Wang, H., 2010. Sparse seemingly unrelated regression modelling: Applications in finance and econometrics. Computational Statistics and Data Analysis 54 (2010) 2866–2877. DOI: 10.1016/j.csda.2010.03.028.
Zaman K., Himayatullah Khan, Muhammad Mushtaq Khan, Zohra Saleem, and Muhammad Nawaz, 2011. The impact of population on environmental degradation in South Asia: application of seemingly unrelated regression equation model. Environmental Economics, Volume 2, Issue 2, 2011.
Zellner, Arnold, 1962. An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Statistical Association, 57, 348-368