{smcl}
{* *! Version 1 31jul2018}{...}
{cmd:help stndzxage}
{findalias asfradohelp}{...}
{vieweralsosee "" "--"}{...}
{vieweralsosee "[R] help" "help help"}{...}
{viewerjumpto "Syntax" "examplehelpfile##syntax"}{...}
{viewerjumpto "Description" "examplehelpfile##description"}{...}
{viewerjumpto "Options" "examplehelpfile##options"}{...}
{viewerjumpto "Remarks" "examplehelpfile##remarks"}{...}
{viewerjumpto "Examples" "examplehelpfile##examples"}{...}
{hline}

{title:Title}

{p2colset 5 11 13 2}{...}
{p2col :{hi:stndzxage} {hline 2} STaNDardiZe byX AGE}{p_end}
{p2colreset}{...}

{p 5 5 0}
This command standardizes test scores with respect to the mean across age (a running variable), assuming normal distributions given age. Additional variables can be specified over which the test variable is age-standardized. For example, a test score could be standardized over age and sex. Additionally, a subpopulation can serve as a reference population for standardizing the scores of other observations in the same dataset. The program creates a new z-score variable, stx_{it:testvar}.
{p_end}

{title:Syntax}

{p 5 5 2}
{cmdab:stndzxage} testvar agevar [{it:varlist}] [if] [{cmd:,} {it:options}]
{p_end}

{p 8 8 8}
{it:testvar} is the variable (test score) to be standardized.
{p_end}

{p 8 8 8}
{it:agevar} is the running variable over which standardization is done. In the child development context, this is typically age, but in other contexts, different variables could be used. For example, to calculate a measure of height-standardized basketball skill, the test variable could be the number of baskets made in 1 minute and the running variable could be height. The age variable should be integer-valued when {it:continuous} is not specified.
{p_end}

{p 8 8 8}
[{it:varlist}] are additional categorical variables over which to standardize, such as sex or language.
{p_end}

{synoptset 20 tabbed}{...}
{synopthdr}
{synoptline}
{syntab:Main}
{synopt:{opt binw:idth(#)}} number of units of {it:agevar} that define the interval within which children's test scores are grouped for standardization. Default is 1. Cannot be used with the option continuous.
{p_end}
{synopt:{opt minb:insize(#)}} minimum number of observations in a bin. For bins with fewer than # observations, stndzxage sets the values of stx_testvar to missing. Default is 30. When continuous is specified, minbinsize refers to the minimum number of observations required in the bins defined by {it:varlist} (and the reference variable, if indicated).
{p_end}
{synopt:{opt cont:inuous}} Instead of standardizing within age bins, the entire age span is used to fit means and standard deviations with a polynomial whose degree is set by the option {it:polynomial}. The default is a third-degree polynomial. Cannot be used with binwidth. Default is to standardize using the discrete age bins.
{p_end}
{synopt:{opt poly:nomial(#)}} A polynomial of degree # is used to generate the age-specific means. Option {it:continuous} must be specified. Default is 3.
{p_end}
{synopt:{opt ref:erence(varname)}} The means and standard deviations used in standardizing are generated using only observations for which {it:varname}=1, and these reference means and standard deviations are applied to the entire population. {it:varname} must take only the values 0 and 1. Default is to standardize using the entire population.
{p_end}
{synopt:{opt ce:iling}}applies a Tobit estimation of the mean using the maximum value of {it:testvar} as the ceiling.
{p_end}
{synopt:{opt fl:oor}}applies a Tobit estimation of the mean using the minimum value of {it:testvar} as the floor.
{p_end}
{synopt:{opt mean(#)}} adjusts the standardization to the indicated mean # instead of the default 0.
{p_end}
{synopt:{opt sd(#)}} adjusts the standardization to the indicated standard deviation # instead of the default 1.
{p_end}
{synopt:{opt med:ian}}standardizes with respect to the median rather than the mean. Default is mean. Cannot be used with continuous, floor, or ceiling.
{p_end}
{synopt:{opt gr:aph}}generates diagnostic graphs. Requires the user-written command {it:grc1leg} to be installed.
{p_end}
{synoptline}

{title:Description}
{marker description}{...}

{p 5 5 2}
Child ability is often age dependent; standardizing assessments of child development by age can be helpful for comparisons. For example, children's test scores may depend on both age and a wealth index, but a two-dimensional graph showing age-standardized scores by wealth is simpler to present. Similarly, standardized scores can be helpful in comparing results on different tests.
{p_end}

{p 5 5 2}
When external norms for standardizing scores are not available or are not recommended, as is the case with cross-cultural applications of tests, the researcher can standardize the test within the population.
{p_end}

{p 5 5 2}
A number of child development assessments vary the questions given based on the child's age (Ages & Stages Questionnaire, for example). The standardization of these scores should therefore maintain these age groupings to obtain comparability. The score of a 3-month-old child should not be standardized with the scores of a 4-month-old child because the questions the parent answers are different. In this case, a categorical variable can be specified to indicate the question set.
When this variable is used in addition to the age variable, {it:stndzxage} ensures that the standardization is done separately for each subgroup (question set).
{p_end}

{p 5 5 2}
If a researcher wishes to standardize by question set, but each question set is administered at several different ages, the categorical variable indicating which question set was administered can substitute for the age variable in the command line.
{p_end}

{p 5 5 2}
{it:stndzxage} provides discrete and continuous methods by which within-population age-standardized scores are computed.
{p_end}

{p 5 5 2}
{bf:1. Discrete intervals} With this method, the population is divided into discrete groups. The grouping criteria can be chosen to account for the density of the data over age. Large populations can group fewer age units per bin, while smaller populations may require wider bins.
{p_end}

{p 5 5 2}
To ensure sufficient density of the population for standardization, use {it:minbinsize} to indicate the minimum number of observations in each bin. Then use the option {it:binwidth} to indicate how many units of agevar are combined in an age bin. Bin formation always starts with the youngest age. All age levels are included even if there are no individuals of a given age in the data. If the last group contains just one age unit, it is combined with the previous group, unless binwidth is 1. See examples below:
{p_end}

     {bf:Values of {it:agevar} in data set} | {bf:binwidth} | {bf:Resulting intervals}
     -----------------------------|----------|---------------------
     4, 5, 6, 7, 8, 9, 10         | 2        | 4-5, 6-7, 8-10
     4, 5, 7, 8, 10, 11, 12, 13   | 4        | 4-7, 8-11, 12-13
     4, 5, 7, 8, 10, 11, 13       | 4        | 4-7, 8-11, 12-13
     4, 5, 7, 8, 9, 10, 12        | 4        | 4-7, 8-12

{p 5 5 2}
{it:stx_testvar} is assigned a value of missing if the number of observations in a bin (defined by {it:binwidth} and categorical variables var1, var2, etc.) is less than the specified {it:minbinsize}. The default minimum bin size is 30.
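{p_end}

{p 5 5 2}
The binning idea can be sketched by hand in plain Stata. This is only a rough illustration under stated assumptions, not the code that {it:stndzxage} runs: the variables {it:ppvt} and {it:agemonths} are borrowed from the examples in this help file, the new variable names ({it:agebin}, {it:bin_mean}, {it:bin_sd}, {it:bin_n}, {it:z_manual}) are hypothetical, a bin width of 3 and a minimum bin size of 30 are chosen arbitrarily, and the rule that merges a final one-unit bin into the previous bin is left out.
{p_end}

{phang2}{cmd:. * hypothetical variable names; bin width 3 and minimum bin size 30 are assumptions}{p_end}
{phang2}{cmd:. summarize agemonths, meanonly}{p_end}
{phang2}{cmd:. generate agebin = floor((agemonths - r(min)) / 3)}{p_end}
{phang2}{cmd:. bysort agebin: egen bin_mean = mean(ppvt)}{p_end}
{phang2}{cmd:. bysort agebin: egen bin_sd = sd(ppvt)}{p_end}
{phang2}{cmd:. bysort agebin: egen bin_n = count(ppvt)}{p_end}
{phang2}{cmd:. generate z_manual = (ppvt - bin_mean) / bin_sd if bin_n >= 30 & !missing(agebin)}{p_end}

{p 5 5 2}
In this sketch, bins start at the youngest observed age, and observations in bins with fewer than 30 nonmissing scores are left missing in {it:z_manual}, in the spirit of the {it:minbinsize} rule described above.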
{p_end}

{p 5 5 2}
{bf:2. Continuous} Data from children across the entire age range may be used to determine the mean and standard deviation. {it:stndzxage} fits a polynomial in age, with the degree chosen using the option {it:poly(#)}. The default is a third-degree polynomial, as has been used before in the literature (Rubio-Codina). The age-dependent standard deviation is calculated similarly by fitting a polynomial (of the same degree indicated by {it:poly(#)}) to the absolute value of the residuals. A kernel smoothing option (e.g. lowess) is currently not provided because this does not allow for the Tobit adjustment.
{p_end}

{p 5 5 2}
{bf:3. Other features} Sometimes subpopulations require different standardizations, such as males and females. Categorical variables can be listed after the age variable to indicate further divisions of the data. In the discrete process, the age bins align for all subpopulations.
{p_end}

{p 5 5 2}
A {it:reference} population may be chosen, indicated by a binary variable. In this case, the standardization procedures described above are performed only on the reference population, and then these norms are used to calculate standardized scores for the rest of the population. Age distributions of the two populations might not always match, so depending on the criteria for bin width and size, some individuals may be left without standardized scores.
{p_end}

{p 5 5 2}
One can standardize using the {it:median} instead of the mean, but this is available only in the discrete case. The standard deviation is still generated around the mean, however. The {it:mean} and {it:sd} options are useful for situations where child development scores are standardized with a mean of 100 and a standard deviation of 15, as is common in the literature.
{p_end}

{p 5 5 2}
A diagnostic {it:graph} can be produced to assess whether the standardization choices were appropriate. If a reference group was chosen, only the data from the reference group are presented.
For each integer of age, the raw data graph(s) show the means (or medians) used in standardization along with the actual test scores. The standardized graph shows the standardized test scores and their means. If the standardized graph does not appear relatively flat, the researcher may wish to consider other standardization options (e.g. continuous instead of bins).
{p_end}

{marker examples}{...}
{title:Examples}

{phang}{cmd:. stndzxage ppvt agemonths}{p_end}

{phang}{cmd:. stndzxage ppvt agemonths, continuous graph}{p_end}

{phang}{cmd:. stndzxage ppvt agemonths SES, ref(boy) minbinsize(30) binwidth(3) mean(100) sd(15)}{p_end}

{hline}

{title:Acknowledgements}

Thanks to Soledad Martinez, Ann Weber, Beth Prado, and Lia Fernald.

{title:Author}

{pstd}Sarah Anne Reynolds{p_end}
{pstd}School of Public Health{p_end}
{pstd}University of California, Berkeley{p_end}
{pstd}sar48@berkeley.edu{p_end}