Due at classtime, Tuesday 14 March 2000
Set up a Stata program to provide the empirical results requested. Hand in a copy of the program, annotated with your comments as warranted. The comments may be handwritten on the printout if they are clearly legible.
Use the Wooldridge GPA2 dataset, available from within Stata via the Stata command
use http://fmwww.bc.edu/ec-p/data/wooldridge/GPA2
This dataset contains 4,137 observations on the following variables:
1. sat        combined SAT score
2. tothrs     total hours through fall semest
3. colgpa     GPA after fall semester
4. athlete    =1 if athlete
5. verbmath   verbal/math SAT score
6. hsize      size graduating class, 100s
7. hsrank     rank in graduating class
8. hsperc     100*(hsrank/hssize)
9. female     =1 if female
10. white     =1 if white
11. black     =1 if black
12. hsizesq   hsize^2
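As a minimal sketch, the opening lines of the program might load the data and check it against the variable listing above. The log file name ps_gpa2.log is a placeholder, not something the assignment specifies, and the -use- command assumes a working internet connection.

```stata
* Sketch only: the log file name is hypothetical
log using ps_gpa2.log, replace

* Fetch the dataset over the web, replacing any data in memory
use http://fmwww.bc.edu/ec-p/data/wooldridge/GPA2, clear

* Confirm variable names, labels, and the observation count
describe
summarize

log close
```

Keeping a log of the run is one convenient way to produce a printout that can be annotated by hand.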
Test the following hypotheses:
1. Assuming that students' SAT scores are drawn from a distribution with a common variance, test that athletes have lower SAT scores than non-athletes. (hint: see -ttest-).
2. Attained fall semester GPA can be adequately predicted by the student's SAT score and the number of credit hours completed.
3. This model can be improved significantly by taking into account the student's race and sex.
4. This model (with race and sex) can be augmented to show that high-achieving students (those with a high 'hsperc') are more likely to do poorly in college.
5. This model (with race, sex, and hsperc) can be augmented to show that, controlling for these factors, athletes earn higher college grades.
6. In this context, the effects of race and athlete status are not additive (they interact); the effect on GPA of being a black athlete is not merely the sum of the individual effects of those two factors.
7. In a constant-elasticity relationship between GPA, SAT, and size of the student's high school, after controlling for race and sex, students from larger schools do more poorly in college. Can this model be improved by taking account of "hsizesq"?
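The following is a hedged sketch of the kinds of Stata commands these items call for. It is not a complete answer. The generated variable names (blkath, lcolgpa, lsat, lhsize) are illustrative assumptions, and the specifications shown are only one reading of each item.

```stata
use http://fmwww.bc.edu/ec-p/data/wooldridge/GPA2, clear

* Item 1: two-sample t test of sat by athlete status.
* -ttest- with by() assumes equal variances unless -unequal- is given;
* the output includes one-sided as well as two-sided p-values.
ttest sat, by(athlete)

* Items 2-3: baseline regression, then add race and sex
* and test the added variables jointly.
regress colgpa sat tothrs
regress colgpa sat tothrs white black female
test white black female

* Item 6: an interaction term lets the effect of being a black athlete
* differ from the sum of the two separate effects.
generate blkath = black*athlete
regress colgpa sat tothrs white black female hsperc athlete blkath

* Item 7: a constant-elasticity form uses logs of the continuous variables.
* log() of a nonpositive value is missing, so those observations are dropped.
generate lcolgpa = log(colgpa)
generate lsat = log(sat)
generate lhsize = log(hsize)
regress lcolgpa lsat lhsize white black female
```

In each case the comments on the printout should state the null and alternative hypotheses and say which line of the output supports the conclusion.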