{smcl}
{* 28jun2016}{...}
{cmd:help dataex}
{hline}

{title:Title}

{phang}
{cmd:dataex} {hline 2} Generate a properly formatted data example for Statalist


{title:Syntax}

        {cmd:dataex} [{varlist}] {ifin} [, {opt v:arlabel} {opt e:lsewhere} {opt c:ount(#)} ]


{title:Description}

{pstd}
{cmd:dataex} is for producing a data example to include in a post on
Statalist. Make sure that you have read the
{browse "http://www.statalist.org/forums/help":FAQ} before posting.
Users who read your post will be able to copy the code generated by
{cmd:dataex} and recreate the dataset shown.

{pstd}
The {cmd:input} command is used to enter the data into Stata variables
of the same type as the original variables in memory.
All numeric {help datetime} variables will be correctly formatted and
all numeric variables with associated value {help labels} will also be
recreated.
If the {opt v:arlabel} option is specified, the results will include
commands to regenerate all variable labels.

{pstd}
Copy what is produced by {cmd:dataex} in the Stata Results window to
your post on Statalist.
Make sure to include the {bf:[CODE]} and {bf:[/CODE]} lines.
You can use the {bf:Preview} button just to the left of the
{bf:Post Reply} button to verify within Statalist that the data example
is correctly formatted.


{title:Remarks}

{pstd}
The output produced by {cmd:dataex} may also be useful outside Statalist
in other forums, or even privately, say in communicating with StataCorp
technical support.
In other forums or privately, the {bf:[CODE]} and {bf:[/CODE]} lines
will not be useful and may be omitted.
As a convenience the option {opt e:lsewhere} may be used to suppress
display of such lines.

{pstd}
General advice on posting example data includes the following.

{pmore}
0. It should be evident that readers can understand your dataset only to
the extent that you explain it clearly.
A detailed verbal explanation is likely to be too long to read and too
hard for readers to absorb.
So, use examples!

{pmore}
1.
Aim for a
{browse "http://stackoverflow.com/help/mcve":minimal, complete and verifiable example}.

{pmore}
2. The word {bf:minimal} underlines that small examples (say 5 to 10
observations) may be quite sufficient to explain your data structure,
variable types and names.
It is also true that your example should be {bf:complete} enough to make
your question clear.
By providing data that you have used, you make your question
{bf:verifiable} too.

{pmore}
3. Even if you use a mutually accessible dataset (say, one read in with
{help sysuse} or {help webuse}), providing code that others can run
quickly will be very helpful.

{pstd}
{cmd:dataex} is not offered as a "one size fits all" solution to
providing example data.
Depending on your problem, explaining other facts about your dataset may
be crucial, say on its size, what you have {help tsset} or {help xtset},
and so forth.


{title:Options}

{phang}{opt v:arlabel} specifies that commands to produce variable
labels are also to be shown.

{phang}{opt e:lsewhere} indicates that your example is for use somewhere
other than Statalist.
Display of the {bf:[CODE]} and {bf:[/CODE]} delimiters intended for
Statalist will therefore be suppressed.

{phang}{opt c:ount(#)} specifies a limit to the number of observations
listed.
The default is 100.


{title:Examples}

{pstd}
Prepare a small example from the standard auto dataset

        {cmd:.} {stata sysuse auto}
        {cmd:.} {stata dataex make price mpg rep78 in 1/5}

{pstd}
You present the variables in the order you want.
If some variables have value labels, the results will include commands
to recreate them.

        {cmd:.} {stata dataex make rep78 price foreign if rep78 == 5}

{pstd}
You can use the {opt v:arlabel} option to include commands to regenerate
variable labels.

        {cmd:.} {stata dataex make rep78 price foreign if rep78 == 5, var}

{pstd}
Numeric {help datetime} variables will also be correctly formatted.
In the following example, the daily date variable {bf:date} is
regenerated using Stata's internal numeric values and then formatted
using the {bf:%td} format.
The next example shows a quarterly date variable.

        {cmd:.} {stata sysuse sp500}
        {cmd:.} {stata dataex in 1/5}
        {cmd:.} {stata sysuse gnp96}
        {cmd:.} {stata dataex in 1/5}

{pstd}
If the dataset is large, consider choosing a random sample.
The following example uses {stata "ssc des randomtag":randomtag}
(from SSC) to select 10 random observations.

        {cmd:.} {stata ssc install randomtag}
        {cmd:.} {stata sysuse icd9_cod.dta, clear}
        {cmd:.} {stata randomtag if length(__code9) == 4, count(10) gen(pick)}
        {cmd:.} {stata dataex __code9 __desc9 if pick}


{title:Acknowledgements}

{pstd}
Many thanks to William Lisowski for his observation that some users may
inadvertently trigger a large data dump and for his thoughtful
suggestions on how to handle the issue.


{title:Authors}

{pstd}Robert Picard{p_end}
{pstd}picard@netbox.com{p_end}

{pstd}Nicholas J. Cox, Durham University, UK{p_end}
{pstd}n.j.cox@durham.ac.uk{p_end}


{title:Also see}

{psee}
SSC: {stata "ssc des listsome":listsome},
{stata "ssc des randomtag":randomtag}
{p_end}

{psee}
Help: {manhelp input D:input}, {manhelp data_types D:data types},
{manhelp datetime D}, {manhelp label D}, {manhelp encode D},
{manhelp list D}
{p_end}