***************************************************
* Demo for using PWMSE
***************************************************

* Prerequisite: install form_norms.ado and get_pwmse.ado in the relevant ado folder

* 1st step: cd to the folder that holds the high-frequency data for forming norms
cd ...
* - the program requires that you prepare the high-frequency data by appending the
*   high-frequency data for the projected year to the historical high-frequency data

* 2nd step: run "form_norms":
* - specify the input high-frequency data (a dta file) in data()
* - specify the projection year in tau()
* - specify the cross-sectional unit indicator in unit()
* - specify the time dimension for the later panel regression (e.g., year) in dim_0()
* - specify the second lowest time frequency (e.g., month) in dim_1()
* - specify the highest time frequency (e.g., day) in dim_2(); you can just type "dim_2()"
*   if your data do not have this time frequency
form_norms tAvg, data("DataMY.dta") tau(2050) unit(fips) dim_0(year) dim_1(month)
* - in this demo, the input data have only year and month dimensions (no daily level),
*   so we leave out dim_2() here

* 3rd step: save the formed norms as a dta file in the local folder (it will be used later)
save saved_norms, replace

* 4th step: load the panel data for the regression and run get_pwmse to get the PWMSEs
*           for a given specification
* - specify the norms generated by form_norms with using ...
* - specify the outcome variable in yvar()
* - specify the explanatory variables in xvar()
* - specify additional controls such as time trends in trends() [optional]
* - specify the cross-sectional units for the regressions in unit()
* - specify the time dimension for the regressions in time()
* - specify the last time period in the regression sample in t()
* - specify the training-to-full ratio in train_ratio()
* - specify the number of cross-validation simulations in num_simulations()
* - specify which norms to report in norms(); options include N, D1, D2, M1, M2, Y1, Y2;
*   the default reports all of them
* - specify the tuning parameter of the proximity weights in h()
* - set the seed for replication in seed()
* - the option quiet hides the reporting of iterations

* evaluate 1st model
use DataReg, clear
get_pwmse using "saved_norms.dta", yvar(lyield_corn) xvar(tAvg prec prec2) ///
    trends(i.stateansi#c.year##c.year) unit(fips) time(year) t(2015) ///
    train_ratio(0.75) num_simulations(100) norms(N Y1 M1) h(1) seed(10309) quiet

* evaluate 2nd model
use DataReg, clear
get_pwmse using "saved_norms.dta", yvar(lyield_corn) xvar(tAvg_avg_m* prec prec2) ///
    trends(i.stateansi#c.year##c.year) unit(fips) time(year) t(2015) ///
    train_ratio(0.75) num_simulations(100) norms(N Y1 M1) h(1) seed(10309) quiet

* evaluate 3rd model
use DataReg, clear
get_pwmse using "saved_norms.dta", yvar(lyield_corn) xvar(dday10_29C dday29C prec prec2) ///
    trends(i.stateansi#c.year##c.year) unit(fips) time(year) t(2015) ///
    train_ratio(0.75) num_simulations(100) norms(N Y1 M1) h(1) seed(10309) quiet

* evaluate 4th model
use DataReg, clear
get_pwmse using "saved_norms.dta", yvar(lyield_corn) xvar(dd3bin* prec prec2) ///
    trends(i.stateansi#c.year##c.year) unit(fips) time(year) t(2015) ///
    train_ratio(0.75) num_simulations(100) norms(N Y1 M1) h(1) seed(10309) quiet

* Notes: In this example, we evaluate four different model specifications of how
* temperature affects corn yields using US data.
* All models include linear and quadratic precipitation terms and state-level linear
* and quadratic time trends. The tuning parameter h is set to 1. Cross-validation is
* repeated 100 times, and the training-to-full ratio is 0.75. The seed is set to 10309
* for replicability.
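
* Optional sketch (not part of the original demo): the four get_pwmse calls differ
* only in the temperature terms passed to xvar(), so they could be run in a loop.
* The local macro names temp1-temp4 and the loop index m are introduced here for
* illustration only; every get_pwmse option is copied from the calls in the 4th step.
local temp1 "tAvg"
local temp2 "tAvg_avg_m*"
local temp3 "dday10_29C dday29C"
local temp4 "dd3bin*"

forvalues m = 1/4 {
    * reload the regression panel before each evaluation, as in the demo
    use DataReg, clear
    display as text "Model `m': xvar = `temp`m'' prec prec2"
    get_pwmse using "saved_norms.dta", yvar(lyield_corn) xvar(`temp`m'' prec prec2) ///
        trends(i.stateansi#c.year##c.year) unit(fips) time(year) t(2015) ///
        train_ratio(0.75) num_simulations(100) norms(N Y1 M1) h(1) seed(10309) quiet
}
* The nested macro `temp`m'' expands to the contents of temp1, ..., temp4 in turn, so
* each iteration should issue the same command as the corresponding call written out
* explicitly.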