EC 771B Spring 2000 Problem Set 1

Due at class time, Tuesday 14 March 2000

Set up a Stata program to provide the empirical results requested. Hand in a copy of the program, annotated with your comments as warranted. The comments may be handwritten on the printout if they are clearly legible.

Use the Wooldridge GPA2 dataset, available from within Stata via the Stata command

use http://fmwww.bc.edu/ec-p/data/wooldridge/GPA2
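A program for this assignment might be organized along the following lines. This is only a minimal sketch: the file name pset1 is a hypothetical choice, and the describe and summarize steps are included merely as a sanity check that the data loaded as expected.

```stata
* pset1.do  (hypothetical file name)
* Close any log left open by an earlier run, then start a fresh one.
capture log close
log using pset1.log, replace

* Load the dataset; -clear- discards any data already in memory.
use http://fmwww.bc.edu/ec-p/data/wooldridge/GPA2, clear

* Confirm the variable list and the number of observations.
describe
summarize

* ... commands for hypotheses 1-7 go here, each preceded by a comment ...

log close
```

Running the file with "do pset1" leaves a log (pset1.log) that can be printed and annotated alongside the program itself.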

This dataset contains 4,137 observations on the following variables:

  1. sat                      combined SAT score
  2. tothrs                   total hours through fall semester
  3. colgpa                   GPA after fall semester
  4. athlete                  =1 if athlete
  5. verbmath                 verbal/math SAT score
  6. hsize                    size graduating class, 100s
  7. hsrank                   rank in graduating class
  8. hsperc                   100*(hsrank/hssize)
  9. female                   =1 if female
 10. white                    =1 if white
 11. black                    =1 if black
 12. hsizesq                  hsize^2

Test the following hypotheses:

1. Assuming that students' SAT scores are drawn from a distribution with a common variance, test the hypothesis that athletes have lower SAT scores than non-athletes. (Hint: see -ttest-.)

2. Attained fall semester GPA can be adequately predicted by the student's SAT score and the number of credit hours completed.

3. This model can be improved significantly by taking into account the student's race and sex.

4. This model (with race and sex) can be augmented to show that students who ranked lower in their high school class (those with a high 'hsperc', which measures percentile rank counted from the top of the class) are more likely to do poorly in college.

5. This model (with race, sex, and hsperc) can be augmented to show that, controlling for these factors, athletes earn higher college grades.

6. In this context, the effects of race and athlete status are not additive; the effect on GPA of being a black athlete is not merely the sum of the individual effects of those two factors.

7. In a constant-elasticity relationship between GPA, SAT, and size of the student's high school, after controlling for race and sex, students from larger schools do more poorly in college. Can this model be improved by taking account of "hsizesq"?
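The hypotheses above draw on a handful of Stata command patterns, sketched below under stated assumptions. The names blackath, lcolgpa, lsat and lhsize are hypothetical, introduced here for illustration only, and the regression shown is just one possible specification. Which regressors belong in each model, which alternative hypothesis applies, and how the results should be read are left to the problem itself. A constant-elasticity relationship is taken here to mean a model that is linear in the logarithms of the variables.

```stata
* Two-sample t test of sat across the groups defined by athlete.
* -ttest- assumes equal variances unless the -unequal- option is given.
ttest sat, by(athlete)

* Fit a regression, then test the joint significance of a set of regressors.
regress colgpa sat tothrs female white black
test female white black

* An interaction term lets the effect of one factor depend on another.
* blackath is a hypothetical variable name.
generate blackath = black*athlete

* Logarithms for a constant-elasticity (log-log) specification.
* Observations with a zero value become missing after taking logs.
generate lcolgpa = log(colgpa)
generate lsat    = log(sat)
generate lhsize  = log(hsize)
```

After -regress-, a -test- command naming several variables reports an F test of the joint hypothesis that all of their coefficients are zero, which is the natural tool when a question asks whether adding a group of variables improves a model significantly.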