{smcl}
{* 3-Aug2004 rev 8-11-2004, 4-11-2005, 1-27-2012, 2013feb24 & Mar8, May4, July25, 2014may16, 2016jan15}
{hline}
help for {hi:carryforward}
{hline}

{title:Carry values forward, filling in missing values.}

{p 8 17 2}
{cmd:carryforward} {it:varlist} [{cmd:if} {it:exp}] [{cmd:in} {it:range}]{cmd:, }{c -(}{cmd:gen(}{it:newvarlist1}{cmd:)} | {cmd:replace}{c )-}
[{cmd:cfindic(}{it:newvarlist2}{cmd:) back carryalong(}{it:varlist2}{cmd:)}
{cmd:strict} {cmd:nonotes} {cmd:dynamic_condition(}{it:dyncond}{cmd:)} {cmd:extmiss}]

{p 4 4 2}
{cmd:by} {it:...} {cmd::} may be used with {cmd:carryforward}; see {help by}.

{title:Description}

{p 4 4 2}
{cmd:carryforward} will carry non-missing values forward from one observation to
the next, filling in missing values with the previous value. Thus, if you
consider a sequence of missing values as a gap in the overall sequence, this
operation will fill the gaps with values that appear before the gap.

{p 4 4 2}
It is important to understand that this is not appropriate for imputing missing
values; more on this later, under "Additional Remarks".

{p 4 4 2}
The value-carrying action proceeds sequentially in the existing order of
observations, or as sorted by {help bysort}, cascading values from one
observation to the next, potentially carrying a given value through many
observations. The process stops upon encountering a non-missing value, an
excluded observation, or the end of a {cmd:by} group (that is, a change in the
value of the primary sort variable, when used with {help by}). The process
resumes when another missing value is encountered.

{p 4 4 2}
An example will illustrate:

    {cmd:. carryforward x, gen(y)}
    {txt}(6 real changes made)

    {cmd:. list, noobs sep(0)}
{txt}
    {c TLC}{hline 4}{c -}{hline 4}{c TRC}
    {c |} {res} x    y {txt}{c |}
    {c LT}{hline 4}{c -}{hline 4}{c RT}
    {c |} {res}12   12 {txt}{c |}
    {c |} {res} 4    4 {txt}{c |}
    {c |} {res} .    4 {txt}{c |}
    {c |} {res} .    4 {txt}{c |}
    {c |} {res} .    4 {txt}{c |}
    {c |} {res} 3    3 {txt}{c |}
    {c |} {res} .    3 {txt}{c |}
    {c |} {res} 7    7 {txt}{c |}
    {c |} {res} .    7 {txt}{c |}
    {c |} {res} .    7 {txt}{c |}
    {c BLC}{hline 4}{c -}{hline 4}{c BRC}

{p 4 4 2}
Notice that each value is carried until a non-missing value of x is encountered.

{title:Options}

{p 4 4 2}
{cmd:gen(}{it:newvarlist1}{cmd:)} specifies the new variable(s) that will
receive the values. If it is specified, then {it:newvarlist1} must have exactly
as many names as there are in {it:varlist}; the variable names in the two lists
will correspond in the order presented. The variables in {it:newvarlist1} will
equal their corresponding variables in {it:varlist} wherever the latter are
non-missing.

{p 4 4 2}
{cmd:replace} specifies that the new values are to be replaced directly in the
variables of {it:varlist}. Under this option, {cmd:carryforward} functions as a
{help replace} operation.

{p 4 4 2}
You must use either {cmd:gen()} or {cmd:replace}, but not both.

{p 4 4 2}
{cmd:cfindic(}{it:newvarlist2}{cmd:)} specifies indicator variable(s) that will
be generated, indicating which observations received carry-forward values, that
is, which observations were altered by the process. This is probably more useful
under the {cmd:replace} option, since with {cmd:gen()}, this information is
discernible by comparing the original and generated values. If
{cmd:cfindic(}{it:newvarlist2}{cmd:)} is specified, then {it:newvarlist2} must
have exactly as many names as there are in {it:varlist}; the variable names in
the two lists will correspond in the order presented. Furthermore,
{it:newvarlist2} may not have any names in common with {it:newvarlist1}.

{p 4 4 2}
{cmd:carryalong(}{it:varlist2}{cmd:)} specifies additional variables that will
have their values carried along in concert with {it:varlist}. These variables
get their values carried forward, but the set of observations that are affected
is determined by {it:varlist} rather than the variables in {it:varlist2}
themselves.
This may be specified only if {it:varlist} consists of a single name. Be aware
that this is essentially a {help replace} operation, with no regard for the
original values in {it:varlist2}. Whereas {it:varlist} (with {cmd:replace})
never has non-missing values overwritten, the variables in {it:varlist2} can,
indeed, have non-missing values overwritten. (If you are concerned about
overwriting values, keep a copy in a separate variable. But typically, you would
use this option to carry values into what were originally missing values.)

{p 4 4 2}
{cmd:back} merely affects the wording of labels and notes, and has no effect on
the data. It inserts text into labels and notes, indicating that the operation
was performed backward. Typically, you would use it when you "fool"
{cmd:carryforward} into carrying values backward (see example).

{p 4 4 2}
{cmd:strict} imposes an additional constraint on the treatment of excluded
observations which result from {cmd:if} or {cmd:in} qualifiers. Such
observations are always excluded from having missing values filled in (with
values from the previous observation). With the {cmd:strict} option, they are
also excluded from having non-missing values carried forward (into the next
observation). This will be illustrated below.

{p 4 4 2}
{cmd:nonotes} prevents the setting of notes on the generated or replaced
variable. This pertains to the note stating that the variable was subjected to a
carryforward operation; it does not affect the transfer of existing notes to the
new variable under the {cmd:gen()} option. This option is provided for instances
where the notes may not be appropriate, such as when {cmd:carryforward} is used
as a tool for constructing a summary measure, rather than for modifying existing
data. (For example, you derive a new variable to detect a condition; the new
variable initially may be sparsely populated; you do a carryforward, followed by
a reduction to the last observation per group.)
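
{p 4 4 2}
As a minimal sketch of that summary-measure workflow (the variable names
{cmd:person_id}, {cmd:date}, {cmd:score}, and {cmd:ever_high}, and the threshold
of 100, are hypothetical and serve only as illustration), one might flag the
condition where it occurs, carry the flag forward without notes, and then keep
the last observation per group:

{p 6 8 2}{cmd:. generate byte ever_high = 1 if score > 100 & !missing(score)}{p_end}
{p 6 8 2}{cmd:. sort person_id date}{p_end}
{p 6 8 2}{cmd:. by person_id (date): carryforward ever_high, replace nonotes}{p_end}
{p 6 8 2}{cmd:. by person_id: keep if _n == _N}{p_end}

{p 4 4 2}
Under these assumptions, {cmd:ever_high} in the one remaining observation per
person is 1 if the condition was met on any date for that person, and missing
otherwise.
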
{p 4 4 2}
{cmd:dynamic_condition(}{it:dyncond}{cmd:)} specifies a restricting condition
which may include references to the value being carried. It is a more-capable
alternative to the {cmd:if} {it:exp} qualifier (though the two can be combined
as well). The difference is that the {cmd:if} {it:exp} qualifier operates only
on conditions that are "static" in that they must be computable at the start of
the process; by contrast, the {cmd:dynamic_condition()} option allows for
references to values as they get propagated during the carryforward process.

{p 4 4 2}
Another limitation of the {cmd:if} {it:exp} qualifier {c -} a consequence of its
static nature {c -} is that, when there are multiple variables in {it:varlist},
the {cmd:if} {it:exp} qualifier establishes a restriction pattern that is the
same for all the variables; the {cmd:dynamic_condition} option can affect each
variable differently.

{p 4 4 2}
Note that a reference to the value being carried would be
{it:var}{cmd:[_n-1]}, where {it:var} is the variable being operated on. You can
specify such a reference in {cmd:if} {it:exp}, but it may not work as you would
want, since {it:var}{cmd:[_n-1]} will likely refer to observations that do not
yet have the desired values in them at the start of the carrying process (in
instances where the value would be carried more than once). That is, such a
reference in {cmd:if} {it:exp} is allowed, but it refers to values {it:before}
the carrying operation begins {c -} not as they get carried. The
{cmd:dynamic_condition} option enables you to reference these values during the
process of being carried. Thus, for example, you might write,{p_end}
{p 6 8 2}{cmd:. by person_id (date): carryforward a, dynamic_condition(a[_n-1]