/*
UMD Econ-626
PS 3 QUESTION 4: Regression discontinuity with manipulated running variables.
*/
/* --------------------------------------------------------------------------------- */
/* We begin by generating data for a hypothetical poverty-targeted social protection program. */
clear all
version 11.2
version 11.2: set seed 9876
set obs 10000
/* Initial wealth will vary on a scale from -1 to 1. */
gen truewealth=2*uniform()-1
/* Noisy measure of partial assets */
gen someassets=truewealth*0.5+0.1*uniform()
/* What wealth do program applicants report at the time of the poverty targeting exercise? */
/* In one scenario, program applicants report their true wealth. */
gen reportwealth1=truewealth
/* In one scenario, program applicants report their true wealth with some noise. */
gen reportwealth2=truewealth+0.1*invnorm(uniform())
/* In another scenario, program applicants with true wealth above 0 but below 0.5
sometimes misrepresent themselves as having wealth just below the cutoff. */
gen reportwealth3=cond(truewealth>0 & uniform()<0.8 & truewealth<0.5*uniform(),-0.5*abs(uniform()+uniform()-1),truewealth)
/* In another scenario, program applicants in general
sometimes misrepresent themselves as having lower wealth, but not with any precision. */
gen reportwealth4=cond(uniform()<0.5 & truewealth>-0.5*uniform(),truewealth-0.5*uniform(),truewealth)
/* --------------------------------------------------------------------------------- */
/*
4.1 Density
Take a look at the densities of each of the three reported wealth variables.
Make sure that you can see the density separately on both sides of zero.
You might try the histogram command, forcing a starting point and a bin
width that goes evenly into 1, such as 0.1, 0.01, or something in between.
What do you see?
*/
/* ******** YOUR COMMANDS HERE ******** */
/*
4.2 Density test
Justin McCrary and Brian Kovak have developed code to test for a density
change at the cutoff; McCrary's website has the code and an example of its use:
http://emlab.berkeley.edu/~jmccrary/DCdensity/
Be sure the DCdensity.ado is in one of your Stata ado directories, probably
the "PERSONAL" one. To find out where this is, you can type: sysdir
Having done that, the basic McCrary command in this setting is of the form:
DCdensity , breakpoint(0) generate(Xj Yj r0 fhat se_fhat)
You can try this for each of the three running variables. Running the command
will generate the variables listed, so in between running this for a new variable,
be sure to drop the generated variables. You might use this command to do that:
capture drop Xj Yj r0 fhat se_fhat
The McCrary routine will report, among other things, the estimated
log difference in density height, and the standard error of that estimate.
For which of the three variables can you reject the null hypothesis of
a zero change in density?
*/
/* ******** YOUR COMMANDS HERE ******** */
/* Several more lines of data generating process: */
/* Now we add the treament. */
/* In each case, treatment in the social safety net program is determined by reported wealth being below the cutoff. */
gen t1=cond(reportwealth1<0,1,0)
gen t2=cond(reportwealth2<0,1,0)
gen t3=cond(reportwealth3<0,1,0)
gen t4=cond(reportwealth4<0,1,0)
/* here are the interactions between treatment and the running variable: */
gen t1r=reportwealth1*t1
gen t2r=reportwealth2*t2
gen t3r=reportwealth3*t3
gen t4r=reportwealth4*t4
/* add a treatment effect to unobserved true wealth */
gen newwealth1=truewealth+t1*0.2+0.25*invnorm(uniform())
gen newwealth2=truewealth+t2*0.2+0.25*invnorm(uniform())
gen newwealth3=truewealth+t3*0.2+0.25*invnorm(uniform())
gen newwealth4=truewealth+t4*0.2+0.25*invnorm(uniform())
/*
4.3 Biased estimates of treatment effect under manipulation
Assume we do *not* observe the initial true wealth, but we do observe
the reported initial wealth, the treatment status, and the new wealth.
Using the regress command to conduct a regression discontinuity analysis,
estimate the effect of the program in each scenario (t1, t2, t3, t4) on
the outcome, (newwealth1/2/3/4). Control for the running variable on both
sides of the cutoff. Try relatively wide and narrow bandwidths.
After exploring with the regress command, use the rd or rdrobust command.
(Note that the estimates of treatment effects that "rd" finds will be
negative because treatment goes from one to zero at the cutoff rather
than from zero to one.)
What do you find? What is the true effect?
When do the estimated effects differ from the true effect, and why?
*/
/* ******** YOUR COMMANDS HERE ******** */
/*
4.4 Baseline covariates
The "someassets" variable is highly correlated with true wealth.
The regression discontinuity design depends on unobserved covariates
not changing at the cutoff score. Indeed, "someassets" is a pre-program
measure, so it *should* not change at the cutoff score. For each of
the four versions of reported wealth and the associated cutoff, does
"someassets" change discretely at the cutoff? If it does, we should
worry that the RD design would not yield credible estimates of program
impact.
*/
/* ******** YOUR COMMANDS HERE ******** */
/*
4.5 Continuation
Consider what would happen if the program were targeted based on true initial
wealth (data generating process 1), but instead of observing true initial wealth,
we only observed wealth with substantial noise. For example, type:
gen reportwealth1b = reportwealth1 + 0.5*invnorm(uniform())
This is mis-reporting, rather than manipulation, since the program is actually
targeted correctly. Now run the analysis for t1 but observing only reportwealth1b,
not reportwealth1. What goes wrong? Is there a jump in treatment status at
reported wealth equal to zero? (you might use rdplot or other commands to see this.)
*/
/* ******** YOUR COMMANDS HERE ******** */