

Statistically valid ways of estimating the parameters of the linear regression model, and of testing hypotheses about those parameters when the data are heteroskedastic, are also explored.

The least squares estimator can be used to estimate the linear model even when the errors are heteroskedastic; it is unbiased and consistent even when assumption MR3, var(y_i) = var(e_i) = sigma^2, is violated. There are several ways to tackle this problem. The first is to use least squares along with an estimator of its covariance that is consistent whether the errors are heteroskedastic or not. This is the so-called robust estimator of covariance that Stata uses.

This is discussed below. Another approach is to model the heteroskedasticity and use weighted least squares; that option is discussed later in this chapter. In the first example, the food expenditure data are used to estimate the model by least squares. Change your working directory to the one containing the data set and load the data.
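As a minimal sketch of the robust-covariance approach, assuming the food expenditure variables are named food_exp and income (the names may differ in your data set), the model can be estimated both ways:

```stata
* Least squares with conventional standard errors
regress food_exp income

* Least squares with heteroskedasticity-robust standard errors
regress food_exp income, vce(robust)
```

The point estimates are identical in the two runs; only the standard errors differ.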

The commands to generate each plot are contained in the two sets of parentheses. In this section, several diagnostic tools are discussed.

Residual Plots
One way to get a feeling for whether the errors are heteroskedastic is to plot the residuals against the sorted values of the independent variable. A couple of examples were given in the preceding section. From the graph it appears that the residuals are larger for larger values of income.

This can be confirmed statistically using one or more of the tests below. Then these are plotted against income, both as a scatter and as a locally weighted, smoothed scatterplot estimated by a process called lowess. From the Stata documentation we learn that the basic idea behind lowess is to create a new variable, newvar, that, for each value of the dependent variable y_i, contains the corresponding smoothed value.

The smoothed values are obtained by running a regression of y on x using only the data point (x_i, y_i) and a few of the data points near it. In lowess, the regression is weighted so that the central point (x_i, y_i) gets the highest weight and points that are farther away, based on the distance |x_j − x_i|, receive less weight. The procedure is repeated to obtain the remaining smoothed values, which means that a separate weighted regression is performed for every point in the data. Obviously, if your data set is large, this can take a while.

Lowess is said to be a desirable smoother because it is local: it tends to follow the data. Polynomial smoothing methods, for instance, are global in that what happens on the extreme left of a scatterplot can affect the fitted values on the extreme right. One can see from the graph that the residuals tend to get larger as income rises, reaching a maximum before the very largest incomes. The residual for the observation having the largest income is relatively small, and the locally smoothed prediction causes the line to start trending downward.
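The residual-versus-income lowess plot described above can be produced along these lines (a sketch; food_exp and income are assumed variable names):

```stata
* Estimate the model and save the least squares residuals
regress food_exp income
predict ehat, residuals

* Scatter of residuals against income with a lowess smooth overlaid
lowess ehat income
```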

Lagrange Multiplier Tests
There are many tests of the null hypothesis of homoskedasticity that have been proposed in the literature. Two of these, based on Lagrange multipliers, are particularly simple to do and useful. The first is the Breusch-Pagan (BP) test; the second is credited to White.

The function h(·) is not specified; it could be any function that depends on its argument. The null and alternative hypotheses are H0: sigma_i^2 = sigma^2 for all i, and H1: sigma_i^2 ≠ sigma_j^2 for at least one i ≠ j. This is a composite alternative that captures every possibility other than the one covered by the null. If you know nothing about the nature of heteroskedasticity in your data, then this is a good place to start.

The White test is very similar to the BP test. In this test, the heteroskedasticity-related variables z1, z2, … are the regressors, their squares, and their cross products; see your text for details. In the food expenditure model there is only one continuous regressor and an intercept. So, the constant squared and the cross product between the constant and income are redundant. This leaves only one unique variable to add to the model: income squared.

In Stata, generate the squared value of income and regress the squared residuals from the model on income and its square. As is the case in all the LM tests considered in this book, N is the number of observations in the second, or auxiliary, regression. To illustrate this test, an example is used in which average wages are estimated as a linear function of education and experience.
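A sketch of the White-test computation, assuming the food expenditure variable names used earlier (food_exp, income):

```stata
* Auxiliary regression for the White test
regress food_exp income
predict ehat, residuals
gen ehat2 = ehat^2
gen income2 = income^2
regress ehat2 income income2

* LM statistic: N times the R-squared of the auxiliary regression,
* compared with a chi-square with 2 degrees of freedom
scalar LM = e(N)*e(r2)
scalar pvalue = chi2tail(2, LM)
scalar list LM pvalue
```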

In addition, a dummy variable is included that is equal to one if a person lives in a metropolitan area. The Goldfeld-Quandt test compares the estimated variances from two partitions of the data. In this example it is hypothesized that the error variance for the metro subsample is equal to that of the rural one. First, the entire sample from the data set is used to estimate the wage model using education, experience, and the metro dummy variable as regressors. Stata is then instructed to use only the observations for which metro is equal to zero or one using an if qualifier; the if qualifier appears after the regression's variable list and before any options.

In Stata, a single equal sign performs assignment; that is not what is wanted here, so use two equal signs (==) to test for equality.

Food Expenditure Example
Another example uses the food expenditure model. In this example the variance is thought to be an increasing function of income. So, we first sort the data by income, ascending, and then repeat the Goldfeld-Quandt test. The forty observations are broken into two equal-size partitions. Then the same steps used above are repeated to obtain the result. As mentioned above, the problem with using least squares in a heteroskedastic model is that the usual estimator of its precision (the estimated variance-covariance matrix) is not consistent.
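For the wage example, the Goldfeld-Quandt calculation can be sketched as follows (wage, educ, exper, and metro are assumed variable names; the statistic is the ratio of the two subsample variances):

```stata
* Estimate the model separately for the metro and rural subsamples
regress wage educ exper if metro == 1
scalar s2_metro = e(rmse)^2
scalar df_metro = e(df_r)

regress wage educ exper if metro == 0
scalar s2_rural = e(rmse)^2
scalar df_rural = e(df_r)

* Goldfeld-Quandt statistic and its upper-tail p-value
scalar GQ = s2_metro/s2_rural
scalar pvalue = Ftail(df_metro, df_rural, GQ)
scalar list GQ pvalue
```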

The simplest way to tackle this problem is to use least squares to estimate the intercept and slopes, together with an estimator of the least squares covariance that is consistent whether the errors are heteroskedastic or not. This is the so-called heteroskedasticity-robust estimator of covariance that Stata uses.

In this example, the food expenditure data are used to estimate the model using least squares. Change your working directory to the one containing the data set and load the data. Re-estimate the model using the vce(robust) option and store the results using estimates store White. Then use the estimates table command to print both sets of results to the screen. Interestingly enough, the robust standard errors are actually smaller than the usual ones!
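These steps can be sketched as (variable names assumed as before):

```stata
* Conventional least squares, results stored as LS
regress food_exp income
estimates store LS

* Robust (White) standard errors, results stored as White
regress food_exp income, vce(robust)
estimates store White

* Side-by-side comparison of coefficients and standard errors
estimates table LS White, b se
```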

The dialog boxes can be used to obtain the same results. Fill in the dependent and independent variables as you usually would. Here, we have left the remaining settings at their default values. Several variants of the robust estimator are available; all are consistent, but each gives slightly different results in small samples. Now click OK. The robust standard errors are obtained from what is often referred to as the heteroskedasticity-consistent covariance matrix estimator (HCCME), which was proposed by Huber and rediscovered by White.

Fortunately, a more efficient alternative exists. The generalized least squares (GLS) estimator is, at least in principle, easy to obtain. Essentially, the GLS estimator of the heteroskedastic model uses the different error variances to reweight the data so that all observations have the same (homoskedastic) variance.

If the data are equally variable, then least squares is efficient! It sounds complicated, but it is rather easy to do in Stata, provided you know the part of the error variance that varies across observations. Stata includes a way to work with weighted data in a number of its procedures, including linear regression.

The analytic weights are inversely proportional to the variance of an observation. There is no need to take the square root of the weight to get a standard deviation; Stata expects the variance. Before leaving the dialog, select the Weights tab, click the Analytic weights button, and enter the desired analytic weight in the box as shown below.
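From the command line, weighted least squares with analytic weights looks like this (a sketch; here the error variance is assumed proportional to income, so the weight is 1/income):

```stata
* Analytic weights should be inversely proportional to the
* error variance of each observation
gen w = 1/income
regress food_exp income [aweight = w]
```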

GLS using Grouped Data
The example consists of estimating wages as a function of education and experience and is based on the data used in the Goldfeld-Quandt test example. The strategy for combining the two partitions and estimating the parameters using generalized least squares is fairly simple.

Each subsample will be used to estimate the model, and the standard error of the regression, e(rmse), will be saved. Then each subsample is weighted by its estimated variance, which is the squared value of e(rmse). There are a couple of ways to estimate each subsample.

Grouped GLS using this method can be found in the do-file at the end of this chapter. The other approach uses a trick whereby subsamples of the data can be taken using analytic weights. Weighting variables by 0 or 1 is a handy way of taking subsamples: weighting an observation by 0 drops it from the computation of the estimator, whereas observations weighted by 1 are included in its computation.

After loading the data, create an indicator variable for rural households (1 if rural, 0 otherwise) by subtracting metro from one. Then run the two subset regressions using the analytic weights, saving the root mean squared error of each.
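A sketch of the grouped-GLS procedure (variable names as in the wage example; the final weight combines the two estimated subsample variances):

```stata
* Indicator for rural households
gen rural = 1 - metro

* Subsample regressions via 0/1 analytic weights, saving each rmse
regress wage educ exper [aweight = metro]
scalar s2_metro = e(rmse)^2
regress wage educ exper [aweight = rural]
scalar s2_rural = e(rmse)^2

* FGLS: weight each observation by the inverse of its group variance
gen wt = metro/s2_metro + rural/s2_rural
regress wage educ exper metro [aweight = wt]
```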

This produces the results shown below. Modeling the variance function turns generalized least squares (GLS) into something slightly different, namely estimated, or feasible, generalized least squares (FGLS). The first step is to choose a model for the variance that is a function of some independent variables. The logs of the squared least squares residuals are then regressed on z and a constant. From this regression, get the linear predictions (lnsig2) and generate weights using the exponential function, exp(lnsig2).
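A sketch of this FGLS procedure, assuming the multiplicative variance model sigma_i^2 = exp(a1 + a2*z_i) with z taken to be ln(income) (an assumption made here for illustration):

```stata
* Step 1: least squares residuals
regress food_exp income
predict ehat, residuals

* Step 2: regress the log of the squared residuals on z
gen lnehat2 = ln(ehat^2)
gen z = ln(income)
regress lnehat2 z

* Step 3: predicted log-variance, converted to analytic weights
predict lnsig2, xb
gen wt = 1/exp(lnsig2)

* Step 4: feasible GLS
regress food_exp income [aweight = wt]
```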

This choice can be represented by an indicator variable y that takes the value one with probability p if the first alternative is chosen, and the value zero with probability 1 − p if the second alternative is chosen. It can be shown that var(y_i) = p_i(1 − p_i), which makes the model heteroskedastic. The feasible GLS estimator is easy to compute. Sometimes this fails because one or more of the predicted probabilities lies outside the (0, 1) interval.

The example is based on data on cola purchases. The dependent variable, coke, takes the value 1 if the individual purchases Coca-Cola and 0 if not. The decision to purchase Coca-Cola depends on the price of Coke relative to Pepsi and on whether displays for Coca-Cola or Pepsi were present. Summarizing coke and the predicted probabilities p shows that there are 16 values of p that fall below the threshold.

The final possibility is to estimate the model using least squares with HCCME standard errors. Inferences will be valid, if not efficient. The next column contains the least squares estimates with heteroskedasticity-consistent standard errors delivered via the vce(robust) option. The column labeled Trunc contains the estimates where predicted probabilities below the threshold were truncated at it. The last column shows the results when the observations producing negative predictions are omitted from the model.
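The feasible GLS estimator for the linear probability model can be sketched as follows (pratio, disp_coke, and disp_pepsi are assumed variable names, and 0.01 is an assumed truncation threshold):

```stata
* Linear probability model by least squares
regress coke pratio disp_coke disp_pepsi
predict p, xb

* Truncate predicted probabilities at an assumed lower bound of 0.01
gen pt = max(p, 0.01)

* FGLS weights based on var(y) = p(1 - p)
gen wt = 1/(pt*(1 - pt))
regress coke pratio disp_coke disp_pepsi [aweight = wt]
```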

The results are reasonably consistent across models except for Trunc. In time-series regressions the data need to be stationary in order for the usual econometric procedures to have the proper statistical properties. Basically, this requires that the means, variances, and covariances of the time-series data cannot depend on the time period in which they are observed. For instance, the mean and variance of GDP in one quarter cannot differ from those in another. Methods to deal with nonstationarity have provided a rich field of research for econometricians in recent years, and several of these techniques are explored later in Chapter 12. A time-series plot will reveal potential problems with the data and suggest ways to proceed statistically.

As seen in earlier chapters, time-series plots are simple to generate in Stata, and a few new tricks will be explored below. Finally, since this chapter deals with time-series observations, the usual number of observations, N, is replaced by the more commonly used T.

In later chapters, where both time-series and cross-sectional data are used, both N and T appear. Since time series are ordered in time, their position relative to the other observations must be maintained. It is, after all, their temporal relationships that make the analysis of this kind of data different from cross-sectional analysis.

If the data you have do not already contain a proper date identifying the time period in which each observation was collected, then adding one is a good idea. This makes identification of historical periods easier and considerably enhances the information content of graphs. The data sets distributed with your book have not been declared to be time series, and most do not contain the relevant dates in the set of variables. So, the first order of business is to add this information to the data set and then to use the dates to declare the observations as time series and to indicate the period of time that separates the individual observations.

In analyzing the time dependencies in the data, this is vital information, as will be explained. Before getting to the specific examples from the text, something should be said about how Stata handles dates and times. Basically, Stata treats each time period as an integer. The integer records the number of time units (whatever you define them to be) that have passed from an agreed-upon base, which for Stata is the first quarter of 1960 for quarterly data. The date functions are called pseudofunctions because they translate what you type into integer equivalents.

The integer equivalent of 1961q1 is 4; that is how many quarters have passed since the first quarter of 1960. The second quarter of 1961 is set to 5, and so on. Listing the first 5 observations of date reveals the raw integers. To make this meaningful for people, these need to be formatted as dates, so that it is easy to tell which period a given integer represents. Finally, the observations are declared to be time series using the tsset command followed by the name of the variable that identifies the time periods.

It identifies the name of the time variable, the dates it covers, and the delta or the period of time that elapses between observations. Check this carefully whenever generating dates to make sure that those created match what is desired.

Stata includes other functions and pseudofunctions for defining weekly (tw), monthly (tm), yearly (ty), and other frequencies. Again, these create sets of integers that indicate the number of time periods elapsed since the 1960 base. To see other options and to learn more about how they operate, type help dates and times in the Command window; Stata will open a Viewer window and carry you to the relevant information. Once the dates have been created and the data set declared to be time series, save the data set so that this process will not have to be repeated for these data.

Stata saves the new variable, the desired display format, and the time-series information along with the data set. Once the data are loaded, a date is assigned using the generate command. Stata includes special functions for creating dates, which translate between the way Stata treats dates (integers) and the way people do (days, months, years, etc.). The quarterly data begin mid-year, and the generated integers record how many quarters each observation lies beyond the base period.

The format command tells Stata to display the integer date in a human-readable quarterly format.

Time-Series Plots
Once the data are loaded, the time variable generated and formatted, and the variables declared as time series, you are ready to begin the initial phases of analysis. With time series, there is no better place to start than plotting the variables against time. This will reveal important features of the data. To plot the unemployment rate and GDP growth rates, the tsline plot is used.
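The date setup described above can be sketched as follows (the starting quarter 1960q1 is illustrative; substitute your data's first period):

```stata
* Create an integer quarterly date, format it, and declare the data
* to be time series (assumes observations are in temporal order)
gen date = tq(1960q1) + _n - 1
format date %tq
tsset date
```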

In order to get the labels of both plots on the same graph, the labels are shortened using label var commands. Then tsline, which is an abbreviation of graph twoway tsline, plots both series in the same graph. Other options can be used, but we will keep it simple at this point. There are no obvious trends, breaks, or other features that would suggest that either of the variables is nonstationary.

Therefore, these variables are probably well-suited for the traditional regression techniques discussed in this chapter. In Chapter 12 more formal tests are developed to explore the possible nonstationarity of the series. For now it is assumed that they are stationary. Stata includes special unary operators that can be used to make taking lags and differences of time-series data very easy and efficient.

Here is a partial list of operators and their meanings:

Operator  Meaning
L.        one-period lag
D.        one-period difference

For instance, L.u takes the variable u and lags it one period. Similarly, D.u takes the one-period time difference, u_t − u_{t−1}. The lag and difference operators are linear and can be used together in any order.

For instance, to take the lagged difference of the observations in u, use L.D.u (or LD.u). This works right to left: take the difference of u and then lag it one period. Linearity implies this is equivalent to D.L.u, that is, lag u one period and then difference. To lag the variable u two periods, use L.L.u or, more simply, L2.u. The number following L indicates how many periods in the past to lag the variable.

Thus L2.u lags u two periods. Just as in the case of the unary operators for factor variables, these time-series operators save one from having to separately generate variables to include in a model. There are several other shortcuts that will be discussed below. To demonstrate the use of these operators, the variables, lags, and differences are listed below for observations at the beginning and end of the data set. In general, it is good practice to print a few observations to ensure that the contents of the series make sense and that the time periods have been assigned to the correct variables.

Below, the date, u, the change in u, g, and several lags are printed using the time-series operators. Stata also understands operator(numlist). A numlist is a list of numbers with blanks or commas in between. There are a number of shorthand conventions to reduce the amount of typing necessary.

For instance:

2          just one number
1 2 3      three numbers
3 2 1      three numbers in reversed order

A numlist allows you to specify ranges and sequences as well as lists of specific numbers. These can include negative numbers, and their order can easily be reversed. In this model, the change in the unemployment rate from one period to the next depends on the rate of growth of output in the economy. Re-estimating the model using a lag length of two produces the output shown below.
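Using the time-series operators, this model and its two-lag variant can be estimated along these lines (u and g are the unemployment and growth variables from the text; the exact specification is a sketch):

```stata
* Change in unemployment regressed on current and one lag of growth
regress D.u L(0/1).g

* Re-estimate with a lag length of two
regress D.u L(0/2).g
```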

Serial correlation violates one of the basic assumptions of the Gauss-Markov theorem and has a substantial effect on the properties of least squares estimation of the parameters. In economics, serial correlation happens when the duration of economic shocks exceeds the sampling frequency of the data; the shock bleeds over into subsequent time periods, causing the errors to be positively correlated. In most cases this implies a failure to model the time structure of the regression properly: either lagged variables that are correlated with included regressors have been omitted, or there is some persistence in the dependent variable that has not been properly modeled.

The solution is to specify the regression function properly, so that E(e_t | all regressors) = 0. That satisfies the condition needed for least squares to be consistent for the intercept and slopes. Detecting autocorrelation in the least squares residuals is important because least squares may be inconsistent in this case. The first tool used is a scatter graph of g against its lag, with vertical lines placed approximately at the means.

The mean of g is among the saved results, which can be viewed using the usual return list command. A numerical approach is to look at the computed sample autocorrelations.

Phillips Curve
The second example is based on the Phillips curve, which expresses the relationship between inflation and unemployment. The simple regression relating inflation to the change in unemployment is INF_t = β1 + β2·DU_t + e_t. The model is estimated using a data set that contains quarterly inflation and unemployment rates for Australia.

Load the data, generate a date, format the date, and declare the data set as time series. After the regression of inf on the change in u is estimated, the sample autocorrelations of the residuals are saved in a variable called rk; the first five are printed and then dropped from the data set, since they are no longer needed.

Stata contains a corrgram command, of which the ac command is a subset. It also displays a character-based plot of the autocorrelations. Another feature of corrgram is that each of these statistics is saved in r(). To save and print the first five autocorrelations, use corrgram ehat, lags(5). The test statistic is based on T·R² from an auxiliary regression.

For autocorrelation, this test is based on an auxiliary regression in which the least squares residuals are regressed on their lags and on the original regressors; include all of the independent variables from the original regression as well. Rejection leads to the conclusion that there is significant autocorrelation. The missing value for the first lagged residual can be replaced with a zero, which is permissible in the current context because zero is its expected value.

Testing for higher-order autocorrelation is simple. To test for AR(4) errors, include 4 lagged least squares residuals as regressors and compute T·R². The degrees of freedom for the chi-square equal the order of the autocorrelation under the alternative (in this case, 4). The missing values of ehat that occur from taking lags are set to zero, i.e., their expected value.

This allows use of the entire sample. It turns out that this is not particularly straightforward to program in Stata, so we will skip that discussion here. However, the code to do so can be found in the do-file at the end of this chapter. The results can be replicated easily using the built-in post-estimation command:

estat bgodfrey, lags(1)
estat bgodfrey, lags(4)

The lags() option indicates how many lagged residuals to include as regressors in the auxiliary regression.
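A manual version of the LM computation can be sketched as follows (inf and u as in the Phillips curve example; the zero-replacement step is the one described above):

```stata
* Least squares residuals from the Phillips curve regression
regress inf D.u
predict ehat, residuals

* Replace the missing first lagged residual with zero
gen ehat_1 = L.ehat
replace ehat_1 = 0 if missing(ehat_1)

* Auxiliary regression and LM = T * R-squared
regress ehat D.u ehat_1
scalar LM = e(N)*e(r2)
scalar pvalue = chi2tail(1, LM)
scalar list LM pvalue
```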

Least squares is no longer asymptotically efficient when the assumption MR4, cov(e_t, e_s) = 0 for t ≠ s, is violated. Unfortunately, the usual standard errors are no longer correct, leading to statistically invalid hypothesis tests and confidence intervals.

Least squares and HAC standard errors
Although the usual least squares standard errors are not correct, we can compute consistent standard errors, just as we did in heteroskedastic models, using an estimator proposed by Newey and West.

Newey-West standard errors, also known as HAC (heteroskedasticity and autocorrelation consistent) standard errors, are analogous to the heteroskedasticity-consistent standard errors introduced in Chapter 8. They have the advantage of being consistent for autocorrelated errors that are not necessarily AR(1), and they do not require specification of the dynamic error model that would be needed to obtain an estimator with a lower variance.

HAC is not as automatic to use as the heteroskedasticity-robust standard error estimator of Chapter 8. To be robust with respect to autocorrelation, one has to specify how far away in time the residual autocorrelation is likely to be significant. Essentially, the autocorrelated errors over the chosen time window are averaged in the computation of HAC; the number of periods over which to average, and how much weight to assign each residual in that average, have to be set by the user. The weighted average is accomplished using what is called a kernel, and the number of errors averaged by the weighting scheme is called the bandwidth.

To be quite honest, these terms reveal little about what they do to the average user. Just think of the kernel as another name for a weighted average, and of the bandwidth as the number of terms to average. Stata offers no way to choose a kernel; the Bartlett is the only one available.

However, a bandwidth must be selected. There are several methods to help choose a suitable bandwidth, and two are given here; the first appears to be the default in other programs like EViews and is the one used here to obtain the results in the text. Implicitly there is a trade-off to consider. A larger bandwidth reduces bias (good) as well as precision (bad). A smaller bandwidth excludes more relevant autocorrelations and hence is more biased, but has a smaller variance.

The general principle is to choose a bandwidth large enough to contain the largest autocorrelations. The only kernel available in Stata is the Bartlett, which is the one used by Newey and West in their research on this issue. Consequently, Stata refers to the procedure that computes HAC standard errors as newey. It is basically a replacement for regress, and it requires the specification of a bandwidth.

Then, to estimate the model by least squares with Newey-West standard errors and a bandwidth of 4, use the following command:

newey inf , lag(4)

In the example, the model is estimated using least squares with the usual least squares standard errors and with the HAC standard errors.

The results appear below, with the HAC standard errors shown beneath the estimates in the right-hand column. In addition, the mtitle option is used to give each column a meaningful name; when this option is used, the default column name, which is the name of the dependent variable, is replaced by whatever you place in each set of double quotes. The title option is used to let readers know that the dependent variable used in each case is inf.

In this example, the HAC standard errors are substantially larger than the usual, inconsistent ones.

Nonlinear Least Squares
As you can see, HAC standard errors suffer at least two disadvantages: (1) they are not automatic, since they require specification of a bandwidth, and (2) they are larger than the standard errors of estimators that are more efficient than ordinary least squares.

In this section, nonlinear least squares is used to estimate the parameters of the AR(1) model efficiently. In your textbook the authors start with the AR(1) regression model and, using a little algebra, arrive at

y_t = β1(1 − ρ) + β2·x_t + ρ·y_{t−1} − ρ·β2·x_{t−1} + v_t

This model is nonlinear in the parameters, but has an additive white-noise error. These features make the model suitable for nonlinear least squares estimation.

Nonlinear least squares uses numerical methods to find the values of the parameters that minimize the sum of squared errors. The if, in, and weight statements are used in the same way as in a linear regression. However, because variables have been lagged, missing values will be created for the first observation on the lagged variables in the data set.

For this to work, the sample must be limited to only those observations that are complete. There are two ways to do this. One is to restrict the estimation sample explicitly; the other is to list the variables, as done here, using the variables() option. The minimum of the sum of squares function is reached at the same parameter estimates. There are some small differences in the estimated standard errors, though. This happens because there are different consistent ways of estimating these in nonlinear models; in small samples like the one in this example, those differences may be exaggerated.
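A sketch of the nonlinear least squares estimation of the transformed AR(1) model with Stata's nl command, using inf as the dependent variable and the change in u as the regressor to match the Phillips curve example (the parameter names b1, b2, and rho are illustrative):

```stata
* Generate explicit lags so nl works with ordinary variables
gen du   = D.u
gen linf = L.inf
gen ldu  = L.du

* Nonlinear least squares on the transformed AR(1) model
nl (inf = {b1}*(1 - {rho}) + {b2}*du + {rho}*linf - {rho}*{b2}*ldu), ///
    variables(inf du linf ldu)
```

The variables() option restricts estimation to observations with no missing values in the listed variables, which handles the lag-induced missings.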

In larger samples the differences will usually be small and, in fact, vanish (according to theory) as the sample size grows. The t-ratio on the parameter ρ has a very small p-value, so ρ is significantly different from zero at any reasonable level of significance. After estimating the model, a couple of scalars are computed to be used in the next section; the reasons for these will be discussed there. Note, however, that the estimates are referred to a bit differently than in a linear regression.

The coeflegend option can be used after the nl command to find the proper names for the parameters. To verify that you have identified the parameters correctly, run the nonlinear least squares regression again using the coeflegend option. There is an example of this contained in the do-file at the end of the chapter.

A More General Model

A more general form of the model is

  y_t = δ + δ0 x_t + δ1 x_{t−1} + θ1 y_{t−1} + v_t

which is linear in the parameters and can be estimated by linear regression.

This model is related to the previous model by the relationships δ = β1(1 − ρ), δ0 = β2, δ1 = −ρβ2, and θ1 = ρ. The linear model can be estimated by linear least squares and a hypothesis test of the implied restriction can be conducted. The null hypothesis implied by the restriction is H0: δ1 = −θ1δ0, against the alternative that it is not equal.
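One way to test the restriction δ1 = −θ1δ0 is with testnl after the linear regression; the variable names y and x below are placeholders:

```stata
* Estimate the unrestricted linear model (placeholder names y, x)
regress y L.y x L.x

* Test the nonlinear restriction implied by the AR(1) model:
*   delta1 = -theta1 * delta0
testnl _b[L.x] = -_b[L.y]*_b[x]
```

testnl uses the delta method, so the result is asymptotic; in small samples the outcome can differ from a test based on the nonlinear model itself.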

The first step is to estimate the model using least squares, regressing inf on its own lag and the other regressors. The various linear specifications of the models considered are compared using the esttab command. This is the so-called autoregressive distributed lag (ARDL) model. The ARDL(p,q) model has the general form

  y_t = δ + θ1 y_{t−1} + … + θp y_{t−p} + δ0 x_t + δ1 x_{t−1} + … + δq x_{t−q} + v_t

As regressors, it has p lags of the dependent variable, y, and q lags of the independent variable, x. ARDL(1,1) and ARDL(1,0) models of inflation can be estimated using least squares.

The estimates are stored and printed in a table below. First, if the t-ratio on the lagged term DU_{t−1} is insignificant, then the evidence suggests that omitting it may not adversely impact the properties of the least squares estimator of the restricted model. Another possibility is to use one of the model selection rules discussed in Chapter 6.

Refer to Chapter 6 for more details on the program structure in Stata. This produces the output shown below. One problem with this analysis is that the residuals may still be autocorrelated, or that longer lags than the ones considered here have been omitted.

In the next section this is considered more carefully. The latter problem suggests that the model selection rules should be applied to a wider set of models that include more autocorrelation terms. This means estimating twelve models; the AR terms are varied from 1 to 6 and the DL terms from 0 to 1, with every combination estimated. The Stata code to estimate each of these models for time periods after q3 is provided at the end of this chapter in a do-file.

The complete code can be used to reproduce the results found in the corresponding table of POE4. Below, a code snippet is given and its syntax explained. The following code estimates the ARDL(1,1) model for data beginning in the third quarter. The regression is estimated using the quietly command, abbreviated qui, to suppress the actual regression results; our interest is in the values of the model selection rules only at this point.

To limit the sample to certain dates, the pseudofunction tq() is used. Recall from earlier in this chapter that this pseudofunction translates a date like q3 into a number that Stata understands. The first variable in this statement is the zero lag of inflation, L0.inf. The result from this snippet follows. A nested loop can be formed using the forvalues command, which loops over consecutive values. In this form the values of p and q will increment in steps of 1. Braces must be specified with forvalues: the open brace must appear on the same line as forvalues; nothing may follow the open brace except, of course, comments; the first command to be executed must appear on a new line; and the close brace must appear on a line by itself.
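The nested forvalues loop described above might be sketched as follows; here x stands in for the actual explanatory variable, and modelsel is the chapter's user-written model selection program:

```stata
* Loop over ARDL(p,q) orders: q = 0,1 and p = 1,...,6 (x is a placeholder)
forvalues q = 0/1 {
    forvalues p = 1/6 {
        display "p = `p'   q = `q'"
        qui regress inf L(1/`p').inf L(0/`q').x
        modelsel        // user-written program that computes AIC/SC
    }
}
```

Note how the macro names p and q are wrapped in left and right single quotes inside the loop body, exactly as the text describes.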

First, p and q are now referred to by their macro names. That means that when they are referred to, they need to be enclosed in left and right single quotes, as in `p', as we did above. As p increments from 1 to 6, lags are added and the modelsel program is executed after printing the current values of p and q to the screen. Hence, in a few short statements many models can be considered, and the orders of the autoregressive and distributed lags can easily be changed.

When loops are nested this way, the q loop starts at zero and then the p loop iterates from 1 to 6. Once the p loop is finished, the q loop increments by 1 and the p loop starts over again. You can change the order of these if desired. Load the data, generate dates beginning at q2, format them to be printed as strings, and declare the data to be time series. Below, the model is estimated by least squares, the correlogram is obtained, and LM statistics for models containing up to 5 autocorrelated residuals are produced.

This suggests that the ARDL(0,2) is misspecified. In the do-file at the end of the chapter, code is given to estimate a series of models using the Okun data set. The sample is limited as in the previous example, this time to observations beginning in the first quarter. This model is estimated using the entire sample and the errors are checked for any remaining autocorrelation using the LM statistic.

The data on GDP growth was examined for autocorrelation earlier in the chapter. In the correlogram of g, there was evidence of correlation among the observations of the time series. To examine this further, an AR(2) model is estimated for GDP growth and the correlogram of the residuals is drawn. This is well known and understood among practitioners. The examples focus on short-term forecasting, typically up to 3 periods into the future. In this section the use of an AR(2) model to forecast the next three periods is discussed and forecast confidence intervals are generated.

Similarly, g[97] refers to the observation on g from q2. The forecast intervals are centered at the forecast and extend approximately 2 standard deviations in either direction. The results follow. Like forecasting with an AR model, forecasting using exponential smoothing does not use information from any other variable.

The basic idea is that the forecast for next period is a weighted average of the forecast for the current period and the actual realized value in the current period. Stata contains a routine, tssmooth, that performs various forms of smoothing for time series. It creates the new variable newvar and fills it in by passing the variable through the requested smoother.

There are several smoothers available, including the exponential. First the data are opened, the dates generated and reformatted, and the variables are set as time series. Here, we choose the exponential smoother. This is followed by some options. The first, parms(), sets the smoothing parameter. If this option is not specified, then tssmooth chooses the value that minimizes the sum of squared forecast errors, and produces the output shown. Once the smoothed series is generated, it can be compared to the unsmoothed version in a time-series plot. In the line that follows, the two series are plotted and the legend is relabeled so that everything fits on the graph a little better.
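A sketch of the commands just described, assuming the series is called g and using 0.4 as a purely illustrative smoothing parameter:

```stata
* Exponential smoothing with a fixed parameter (0.4 is illustrative)
tssmooth exponential sm1 = g, parms(0.4)

* Let Stata choose the parameter by minimizing squared forecast errors
tssmooth exponential sm2 = g
display r(alpha)

* Compare the smoothed and unsmoothed series
twoway (tsline g) (tsline sm2), legend(label(1 "actual") label(2 "smoothed"))
```

The chosen parameter is recovered from r(alpha), as noted below.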

Also, the one-period-ahead forecast is computed automatically by tssmooth. If more periods are desired, Stata offers options for that. The manually generated forecast and the automatic forecast from Stata match. Note that the value of the smoothing parameter is saved as r(alpha) after smoothing, and that it can be used to generate forecasts just as easily as a fixed value.

The impact multiplier is the impact of a one-unit change in x on the mean of y. Since x_t and y_t are in the same time period, the effect is contemporaneous and therefore equal to the initial impact of the change.

If x is increased by 1 unit and then maintained at its new level in subsequent periods t+1, t+2, …, the effect accumulates. An interim multiplier simply adds the immediate effect (the impact multiplier, β0) to subsequent delay multipliers to measure the cumulative effect: in period t+1 it will be β0 + β1; in period t+2, it will be β0 + β1 + β2; and so on. The total multiplier is the final effect on y of the sustained increase after q or more periods have elapsed; it is given by the sum β0 + β1 + … + βq.

Basically, this needs to be transformed into an infinite distributed lag model using the properties of the lag operator, L, which works just as the Stata commands based on it do. That is, L^i x_t = x_{t−i}. This puts the model into the familiar form, and the usual definitions of the multipliers can be applied. This is discussed in detail in POE4 and will not be replicated here. The coefficients for the multipliers involve the βs, which must be solved for in terms of the estimated parameters of the ARDL.

Stata provides a slick way to get these into a data set so that they can be graphed. Create a new variable called lag that contains the integers 1 to 8 to be used for the lag weights. Then you can plot them.

By the sixth period the effect of a one-unit change in GDP growth on unemployment is virtually zero. The exact p-value, obtained by integrating the distribution function of the DW statistic, is not computed at this point. The prais command operates much like regress and uses similar syntax.

There are a few additional options that may be worth exploring if you are interested. The biggest limitation of prais is that it will only estimate models with first-order autocorrelation. For more complex models, see the arima command, which estimates more general models using maximum likelihood. Both estimators have the same asymptotic properties, so there is really no need to iterate.
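The main prais variants mentioned above might be sketched as follows, with y and x as placeholder variable names:

```stata
* Prais-Winsten FGLS for a model with AR(1) errors (placeholder names)
prais y x

* Cochrane-Orcutt variant, which drops the first observation
prais y x, corc

* Two-step estimation, without iterating to convergence
prais y x, twostep
```

Since the iterated and two-step estimators share the same asymptotic properties, the choice among these is largely a matter of taste in moderate to large samples.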

The results are very similar.

The variable identifying labor force participation is lfp, which is 1 if a woman is in the labor force and 0 if she is not. Then summarize the key variables: wage, educ, and experience (exper). Instrumental variables estimation is also known as two-stage least squares because the estimates can be obtained in two steps.

Estimate the first-stage equation for education, educ, including on the right-hand side as explanatory variables the included exogenous variables exper and exper2 and the instrumental variable mothereduc, which is not included in the model. More will be said about critical values for the F-test later in this chapter. The F-test value is obtained using test mothereduc.
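The first-stage regression and the instrument F-test described above read as:

```stata
* First stage: educ on the included exogenous variables and the instrument
regress educ exper exper2 mothereduc

* F-test of the excluded instrument mothereduc
test mothereduc
```

Because there is a single excluded instrument here, this F statistic is simply the square of the t-ratio on mothereduc.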

In Stata 11 this command is ivregress. For a full description of the capabilities of this powerful command, enter help ivregress. Fill in the dialog as shown and press OK. The dependent variable lwage comes first, followed by the explanatory variables. Using the dialog-box approach, the instrument specification is placed at the end of the command, but it can appear anywhere after the dependent variable. The coefficient estimates are the IV estimates, and the standard errors are properly computed.
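Typed directly rather than through the dialog box, the command described might look like:

```stata
* 2SLS: educ is endogenous, instrumented by mothereduc
ivregress 2sls lwage exper exper2 (educ = mothereduc)
```

The parenthesized term names the endogenous regressor and, after the equals sign, its excluded instrument(s); additional instruments such as fathereduc can be listed there as well.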

In the dialog box use the Reporting tab and choose the option for degrees-of-freedom adjustments. As noted, this placement is at the discretion of the programmer. The usual formulas for the explained sum of squares due to regression do not hold with IV estimation. Such details may be found by reading the full Stata documentation; this material is advanced and uses matrix algebra. For cross-sectional data, such as the Mroz data, we may also be concerned about heteroskedasticity.

Note that the robust standard errors are slightly larger than the usual standard errors, which is the usual outcome. The overall F-test is also based on the robust covariance matrix. Suppose that in addition to mothereduc we use fathereduc as an instrument. To test whether our instruments are adequately correlated with education estimate the first-stage equation. Test the significance of the instruments from outside the model.

Because we have only one endogenous explanatory variable, we require only one instrumental variable. If we consider mothereduc and fathereduc individually, we can use t-tests to test their significance. Recall that mere significance is not enough: for t-tests we look for values well in excess of the usual critical values.

For an F-test, the minimum threshold value for an adequate instrument is about 10. The post-estimation command estat firststage produces the first-stage F-statistic value. In the ivregress dialog box, select this option on the Reporting tab.

This terminology and the usefulness of the critical values given below the statistic will be explained later in this chapter. To simplify, let us consider the case in which we have a single instrumental variable, mothereduc. Examine part of the output of estat firststage following ivregress. Instrument strength can be measured by the partial correlation between the endogenous variable and a single instrument.

The effects of exper and exper2 are removed by regressing educ and mothereduc on these variables and computing the least squares residuals. The residuals contain what is left after removing the effects of exper and exper2. The correlation between these residuals is obtained using correlate v1 v2. Why is it called an R-squared? Regress v1 on v2, with no constant, since the average value of the residuals v1 is zero. The relation between correlations and covariances helps us understand the regression coefficient above.
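The partialling-out steps described in this passage can be written out as:

```stata
* Remove the effects of exper and exper2 from educ and from mothereduc
regress educ exper exper2
predict v1, residuals

regress mothereduc exper exper2
predict v2, residuals

* Partial correlation between educ and the instrument
correlate v1 v2

* Regression through the origin; its R-squared is the squared partial correlation
regress v1 v2, noconstant
```

The squared correlation between v1 and v2 is the "partial R-squared" reported by estat firststage.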

The sample covariance between v1 and v2 is obtained using correlate v1 v2, covariance, followed by return list. The Hausman procedure is a way to empirically test whether an explanatory variable is endogenous or not. In the regression y = β1 + β2 x + e, we wish to know whether x is correlated with e. Let z1 and z2 be instrumental variables for x. At a minimum, one instrument is required for each variable that might be correlated with the error term. Then carry out the following steps: estimate the model x = γ1 + θ1 z1 + θ2 z2 + v by ordinary least squares.

If there is more than one explanatory variable being tested for endogeneity, repeat this estimation for each one, using all available instrumental variables in each regression. Then add the first-stage residuals to the original model, estimate this "artificial regression" by least squares, and employ the usual t-test of significance: H0: δ = 0 (no correlation between x and e) against H1: δ ≠ 0 (correlation between x and e). To test whether educ is endogenous, and correlated with the regression error term, we use the regression-based Hausman test described above.

To implement the test, estimate the first-stage equation for educ using least squares, including all exogenous variables, including the instrumental variables mothereduc and fathereduc, on the right-hand side, and save the residuals: reg educ exper exper2 mothereduc fathereduc, then predict vhat, residuals. Add the computed residuals to the ln(wage) equation as an additional explanatory variable, and test its significance using a standard t-test.
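The full sequence, including the artificial regression, reads as:

```stata
* First stage: educ on all exogenous variables and both instruments
reg educ exper exper2 mothereduc fathereduc
predict vhat, residuals

* Artificial regression: add the first-stage residuals to the wage equation
reg lwage educ exper exper2 vhat

* A significant coefficient on vhat is evidence that educ is endogenous
test vhat
```

The t-test (or equivalent F-test) on vhat is the regression-based Hausman test.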

We prefer the regression-based test in most circumstances. Using help hausman we find the syntax. The automatic test is a contrast between the least squares estimator, which is best linear unbiased and efficient if the usual assumptions listed in POE4 hold, and the instrumental variables estimator.

If a regressor is endogenous, then the least squares estimator is inconsistent, but the instrumental variables estimator is consistent. This contrast test is not valid under heteroskedasticity, because the test is predicated upon the least squares estimator being efficient. If heteroskedasticity is present least squares is not efficient because the Gauss-Markov theorem does not hold. This is one advantage of the regression based test, which can be applied with heteroskedastic data.

The other choices we show are so that this contrast test will work as well as possible. Include the intercept in the comparison and, most importantly, base the estimator variances on a common estimate of the error variance: the estimate of σ² based on the least squares estimates and residuals. In the Stata Results window there are lots of words you may not understand, and which are beyond the scope of this book.

The key result from your point of view is that the Hausman test is a chi-square statistic with 1 degree of freedom. The chi-square value is given, along with its p-value. The option sigmamore is included to force Stata to use the least squares residuals in the estimation of the error variance for both estimators. This ensures that Stata will calculate the correct number of degrees of freedom for the Hausman test, which is the number of endogenous variables on the right-hand side of the regression.

Now regress ehat on all exogenous variables and instrumental variables. The number of degrees of freedom here is 1 because there is one surplus instrument. We will not discuss those examples, although complete code is provided in the do-file listed at the end of this chapter. For example, suppose we have two endogenous variables and two instrumental variables. For instrumental variables estimation we require two external instrumental variables. Using the first-stage F-test approach, we would estimate two first-stage equations and test the joint significance of the two instrumental variables.

The first-stage F-test has as its alternative hypothesis that at least one of the instruments is a relevant, strong instrument. Suppose, however, that of our two instruments only one is actually related to the endogenous variables, so that in truth we have one instrument. The F-test will reject the joint null hypothesis, leading us to believe we have two instruments when we do not.

Canonical correlations offer a solution to the problem of identifying weak instruments when an equation has more than one endogenous variable. They are a generalization of the usual concept of a correlation between two variables and describe the association between two sets of variables.

A detailed discussion of canonical correlations is beyond the scope of this work; consult a book on multivariate analysis, though explanations will involve matrix algebra. If we have two variables in the first set and two variables in the second set, then there are two canonical correlations, r1 and r2. More generally, if we have B variables in the first group (the endogenous variables, with the effects of the exogenous variables x1, x2, …, xG removed) and L ≥ B variables in the second group (the group of instruments, with the effects of x1, x2, …, xG removed), then there are B possible canonical correlations, r1 ≥ r2 ≥ … ≥ rB, with rB being the minimum canonical correlation.

Critical values for this test statistic have been tabulated by James Stock and Motohiro Yogo, so that we can test the null hypothesis that the instruments are weak, against the alternative that they are not, for two particular consequences of weak instruments. Relative Bias: In the presence of weak instruments the amount of bias in the IV estimator can become large.

Stock and Yogo consider the bias when estimating the coefficients of the endogenous variables. They examine the maximum IV estimator bias relative to the bias of the least squares estimator. Stock and Yogo give the illustration of estimating the return to education.

Rejection Rate (Test Size): When estimating a model with endogenous regressors, testing hypotheses about the coefficients of the endogenous variables is frequently of interest. If instruments are weak, then the actual rejection rate of the null hypothesis, also known as the test size, may be larger than the nominal level. (The Stock and Yogo tables appear in a volume edited by Andrews and Stock, Cambridge University Press, Chapter 5.) To test the null hypothesis that instruments are weak, against the alternative that they are not, we compare the Cragg-Donald F-test statistic to a critical value.

When estat firststage is used after ivregress these critical values are reported, as shown earlier in this chapter. The steps are: choose either the maximum relative bias or maximum test size criterion, and choose the maximum relative bias or test size you are willing to accept. If the F-test statistic is not larger than the critical value, then do not reject the null hypothesis that the instruments are weak. The variable MTR is the marginal tax rate facing the wife, including social security taxes.

To begin, open the data and create the required variables. For MTR these two instruments are less strong. Stata shows the critical value for the weak instrument test when using 2SLS; ignore the critical values for LIML, which will be explained in a later chapter. See also the corresponding table in POE4.

We cannot reject the null hypothesis that the instruments are weak, despite the favorable first-stage F-test values. The estimates of the HOURS supply equation show parameter estimates that are wildly different from those in Model 1 and Model 2, given in the corresponding table of POE4, and the very small t-statistic values imply very large standard errors, another consequence of instrumental variables estimation in the presence of weak instruments.

Other models are illustrated in the Chapter 10 do-file at the end of this chapter. Save the degrees of freedom, N − G − B, using ereturn list to show which results are saved post-estimation. See help canon. In the simulation we use a data generation process in which the intercept parameter is 0 and the slope parameter is 1. The parameter π controls the instrument strength.

The larger π becomes, the stronger the instruments become. Finally, we create the random errors e and v to have standard normal distributions with correlation ρ, which controls the endogeneity of x. If ρ = 0, then x is not endogenous. The larger ρ becomes, the stronger the endogeneity.

We let ρ = 0 (x exogenous) and a large ρ (x highly endogenous). The simulation begins by clearing all memory and specifying global constants that control the simulation. A key component in the simulation experiment is the correlation between the error terms e and v. Creating correlated random numbers is achieved using the Stata command drawnorm. From help drawnorm we find the basic syntax and options. Among the options are n(#), the number of observations to be generated, and cov(matrix), the covariance matrix. That is, we can specify the number of observations, and the covariances and variances, to be anything we choose.

We will use a covariance matrix in which e and v each have variance 1 and covariance ρ:

  [ 1  ρ ]
  [ ρ  1 ]

In the first portion of the program we have the same data generation process, controlled by global macros. The value of rols is 1 if the null hypothesis is rejected, and rols is 0 otherwise. The test outcome r2sls is 1 if the null hypothesis is rejected and is zero otherwise. There are 10,000 experimental replications of the program ch10sim.
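Taking ρ = 0.8 and 100 observations purely for illustration, the correlated errors could be generated as:

```stata
* Covariance matrix with unit variances and covariance rho = 0.8 (illustrative)
matrix C = (1, 0.8 \ 0.8, 1)

* Draw 100 correlated standard normal pairs (e, v)
drawnorm e v, n(100) cov(C)

* Check the realized correlation
correlate e v
```

Because the variances are 1, the covariance and the correlation coincide here.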

After the simulation we display the global parameter values, for record-keeping purposes. This is a convenient alternative to summarize, as it permits specification of the statistics to report in a nice table. The average estimate should be close to the true value, 1, if the estimator is unbiased. The average of the rejection rate variable rols indicates the actual rejection rate of the true null hypothesis. The mean squared error is the empirical analog of

  MSE(b) = E[(b − β)²] = var(b) + [bias(b)]²

Mean squared error answers the question of how close, on average, the estimates are to the true parameter value.

Finally, we examine the average rejection rate of the Hausman test. The supply equation contains the market price and quantity supplied. It also includes the price of a factor of production, PF, which in this case is the hourly rental price of truffle-pigs used in the search process. In this model we assume that P and Q are endogenous variables. The data for this example are in the file truffles.dta. Execute the usual beginning commands, start a log, and open the data file: use truffles, clear, then describe. Examine the data by listing the first 5 observations and computing summary statistics.

Name the variable phat, to remind us that it contains predicted P. It is always better to use software commands for 2SLS. Enter help ivregress for Stata help; the command is also available through the pull-down menus.

This is a handy way to make sure that your ordering involves multiple variables, but Stata will only perform the command on the first set of variables. First, we want to make sure we eliminate the repeated deaths from Patient 8. We can do this using the bysort command and summing the values of Death.

Now we have a data set without the unnecessary death values for Patient 8. Therefore, Patient 8 will not be counted in months 6 and 7 because they are no longer contributing to the denominator. Suppose we want to perform a single group time series analysis. We would want to sum up the number of deaths across the months.

We can do this using the bysort command. First, we have to think about how we want to count death. Initially, we were worried that Death would be counted two more times for Patient 8, but we solved this problem by removing these events from Patient 8. The following command will yield the above results in a long format. We use the egen command because we are using a more complex function.
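The monthly totals described above might be computed like this; the variable names Month and Death are taken from the example, while total_deaths and n_obs are hypothetical:

```stata
* Total deaths per month across patients (after removing the duplicates)
bysort Month: egen total_deaths = total(Death)

* Number of patient observations contributing to each month (the denominator)
bysort Month: gen n_obs = _N
```

egen is needed for the total() function, while the observation count uses the built-in _N within each by-group.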

Details on when to use gen versus egen are located at this site. Next, we want to determine the number of patient observations contributed to each month. To do this, we can use the bysort command again. Currently, the data are set up at the patient level.

We want to change this to the single-group, or aggregate monthly, level. To do this, we have to eliminate the repeated month measurements for our total deaths (numerator) and total observations (denominator). We can visualize this by plotting two separate lines connected at the values for each month. Using the bysort command can help us fix a variety of data issues with time series analysis.

In this example, we have patient-level data that contained multiple deaths for one patient and a patient who was observed at different sites. Using the bysort command to distinguish between sites allowed us to properly identify the patient as unique to the site. Additionally, we used bysort to identify the patient with multiple deaths and eliminated these values from the aggregate monthly values.

Then we finalized our single-group data set by summing the total deaths and observations per month and removing the duplicates. You can download the Stata code from my Github site.

If you type a condition such as age > 65, anyone with a missing value for age is also included, because Stata treats missing values as larger than any real number. Assuming you're interested in people who are known to be older than 65, you should exclude the people with missing values for age with a second condition.
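Such a second condition exploits the fact that missing values sort above all numbers; two equivalent sketches:

```stata
* Only people known to be older than 65; "< ." excludes all missing values
sum age if age > 65 & age < .

* Equivalent, using the missing() function
sum age if age > 65 & !missing(age)
```

The `< .` form also excludes extended missing values (.a through .z), since those sort above the plain missing value.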

For the age variable, the GSS uses an extended missing value. Other variables use different extended missing values, and some use more than one. If you have a binary variable coded as 0 or 1, you can take advantage of the fact that to Stata 1 is true and 0 is false. Then you could do things like summarizing only the observations where the variable is 1. Just one thing to be careful of: to Stata everything except 0 is true, including missing. If female had missing values, you would need to add a condition that also excludes them.

Options change how a command works. They go after any variable list or if condition, following a comma. The comma means "everything after this is options," so you only type one comma no matter how many options you're using. The detail option tells summarize to calculate percentiles (including the 50th percentile, or median) and some additional moments. Many options can be abbreviated like commands can be; in this case just d would do.

Some options require additional information, like the name of a variable or a number. Any additional information an option needs goes in parentheses directly after the option itself. Recall that when we did sum all by itself and it gave us summary statistics for all the variables, it put a separator line after every five variables. You can change that with the separator or just sep option:. The 10 in parentheses tells the separator option to put a separator between every ten variables.

You'll learn more useful options that need additional information in the articles on statistical commands. This gives you summary statistics for years on the job for males and females, calculated separately. by is a prefix, so it comes before the command itself. It's followed by the variable or variables that identify the subgroups of interest, then a colon. The data must be sorted for by to work, so bysort is a shortcut that first sorts the data and then executes the by command.
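The by-prefix usage described above reads as:

```stata
* Summary statistics for yearsjob, computed separately by sex;
* bysort sorts the data by sex first, then runs sum within each group
bysort sex: sum yearsjob
```

Once the data are sorted, a plain by sex: prefix works for subsequent commands without re-sorting.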

Now that the data set is sorted by sex, you can just use by in subsequent commands. University of Wisconsin–Madison. To start with it should contain: capture log close and log using syntax. Commands: Most Stata commands are verbs. Variable Lists: A list of variables after a command tells the command which variables to act on. First try sum (summarize) all by itself, and then followed by age: sum, then sum age. If you don't specify which variables sum should act on, it will give you summary statistics for all the variables in the data set.

If you list more than one variable, the command will act on all of them: sum age yearsjob prestg10 gives you summary statistics for age, years on the job, and a rating of the prestige of the respondent's job. If Conditions: An if condition tells a command which observations it should act on. It makes a difference! Binary Variables: If you have a binary variable coded as 0 or 1, you can take advantage of the fact that to Stata 1 is true and 0 is false.

Then you could do things like: sum yearsjob if female, or sum yearsjob if !female.
