-------------------------------------------------------------------------------
help for usesas                                               manual:  [R] none
                                                             dialog:   none    
-------------------------------------------------------------------------------

Use a SAS dataset

usesas using filename [, formats char2lab check clear float xport describe listnot keep(variable names) if(SAS if statement) in(firstobs/lastobs) quotes messy ]

Description

NOTE: Before you use usesas for the first time, you may need to edit your sasexe.ado file to set the location of your SAS executable file (sas.exe) and your savastata SAS macro file (savastata.sas). usesas may also be able to run with the default settings in sasexe.ado.

usesas loads a SAS datafile into memory. This usually occurs by supplying usesas a SAS dataset (*.sas7bdat, *.sd7, *.sd2, *.ssd01, *.xpt, *.cport) or an SPSS portable file (*.por), but usesas can also load a SAS datafile into memory via a SAS program (*.sas) that creates a SAS dataset. The last dataset created by the SAS program will be the SAS dataset processed by usesas.

usesas assumes the most common SAS datafile extension .sas7bdat if no file extension/suffix is specified.

usesas uses the savastata SAS macro to create the Stata dataset from the SAS dataset. usesas downloads the savastata SAS macro and stores it in the directory that holds user-written Stata ado-files beginning with the letter "s". This macro can also be used on its own in SAS. Learn about savastata here: http://www.cpc.unc.edu/research/tools/data_analysis/savastata.html

usesas figures out how much memory the SAS dataset will require to be loaded into Stata and sets Stata's memory for you if your memory setting is less than is required.

usesas indicates that it has finished running by reporting to you how many observations and variables are in your dataset now in memory. For example:

Stata reports that the dataset has 200 observations and 11 variables.

NOTE: usesas calls SAS to run a SAS program. This requires the ability to run SAS on your computer.

Options

formats creates value labels from SAS user-defined formats stored in a SAS formats catalog file that has the same name as the SAS dataset and is in the same directory. For example: MySasData.sas7bcat. If this file does not exist, usesas looks for the file formats.sas7bcat in the same directory as the dataset.

char2lab specifies to encode long SAS character variables like the Stata command encode. Character variables that are too long for a Stata string variable are maintained in value labels. This is all done with the char2fmt SAS macro.

check generates basic statistics for both datasets so that you can compare the newly created Stata dataset with the imported SAS dataset and make sure usesas converted the data correctly. This comparison should be done whenever a datafile is converted to another type of datafile by any software. The SAS check file is created in the same directory as the input SAS datafile and is named with the name of the datafile followed by "_SAScheck.lst", e.g. "mySASdata_SAScheck.lst".

clear specifies to clear the data currently in memory before running usesas.

float specifies that numeric variables that would otherwise be stored as numeric type double be stored with numeric type float. This option should only be used if you are certain you have no integer variables that have more than 7 digits (like an ID variable).
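As a quick illustration of the 7-digit limit, Stata's float() function rounds a value to float precision. The 8-digit number below is only an illustrative value, not one taken from any particular dataset:

. display %12.0f float(16777217)

This displays 16777216 rather than 16777217, so an 8-digit ID stored as float can silently change value.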

xport specifies that the input dataset is a SAS Transport/Xport dataset. Since there is no standard file extension for SAS Xport datasets, this option is required. Datasets created by SAS's PROC CPORT are also allowed.

describe makes usesas act somewhat like the Stata command describe using. Instead of loading the actual dataset, usesas loads only the descriptive information about the using dataset (variable names, variable types, and the variable labels and formats associated with the variables) into Stata's memory as a Stata dataset and prints it. You can clear the descriptive data out of Stata's memory or use them however you like, for example to create variable lists for your actual invocation of usesas. This may be helpful when the SAS dataset has more variables than your version of Stata can handle: you can build a variable list from the variable called "name" and use it in another invocation of usesas to read in only the variables you need.

If you do not want to have the describe option list the descriptive information of the imported dataset, you can use the option listnot with describe. The descriptive information will still be loaded into Stata as a Stata dataset.

The descriptive data are sorted in the variable order of the using dataset so a variable list for usesas could be created like so:

. display "`= trim(name[1])'--`= name[2047]'"

id--income88

which could then be used like so to keep the first 2,047 variables in the using dataset (2,047 is the maximum number of variables that Intercooled Stata can handle):

. usesas using "mySASdata.sas7bdat", clear keep(`= trim(name[1])'--`= name[2047]')

A SAS variable list using two dashes "--" tells SAS to use the variables that exist positionally between the first variable and the last variable in the using dataset, inclusive. Read more about this under the documentation of the keep option.

The describe option makes usesas return the following in r():

Scalars
    r(N)          number of observations in using dataset
    r(k)          number of variables in using dataset

Macros
    r(varlist)    variables in using dataset
    r(sortlist)   variables by which using data are sorted

The above scalars and macros contain information about the using dataset that was described, not about the dataset of descriptive information that usesas loaded into Stata with the describe option.
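A minimal sketch of inspecting these results right after a describe run, assuming a dataset named mySASdata.sas7bdat as in the examples in this help file:

. usesas using "mySASdata.sas7bdat", describe

. display "observations: " r(N) "   variables: " r(k)

. display "sorted by: `r(sortlist)'"

Because r() results are replaced by the next r-class command, it is probably safest to display or copy them immediately after usesas finishes.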

keep allows for a list of variables from the imported dataset to be read in. This list is used in the SAS code portion of usesas so must be written in the SAS variable list style. SAS does not allow for variable lists to contain stars (*) or question marks (?). For example:

keep(var1-var20) includes only vars that start with "var" and end in a number between 1 and 20.

keep(var1--var20) includes only vars in the dataset between var1 and var20. This is like Stata's varlist style var1-var20.
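Both list styles can appear in one keep() list together with individual variable names, much as the describe example in this help file mixes `r(sortlist)' with a "--" range. A sketch, where id, var1-var5, income80, and income88 are hypothetical variable names:

. usesas using "mySASdata.sas7bdat", keep(id var1-var5 income80--income88)

Here var1-var5 selects by numeric suffix, while income80--income88 selects by position in the using dataset.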

if allows for a SAS if statement to subset the data before it's read in. Any valid SAS style if statement will work.

in allows for subsetting the data before it's read in. Use only #/# where both numbers are positive, for example 1/30 for the first 30 observations.
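A sketch combining both options, assuming the using dataset has numeric variables named year and sex (hypothetical names). The if() expression is written in SAS syntax, so equality is tested with "=" (or "eq") rather than Stata's "==":

. usesas using "mySASdata.sas7bdat", if(year ge 1990 and sex = 1) in(1/500)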

quotes specifies that double quotes that exist in string variables are to be replaced with single quotes. Since the data are written out to an ASCII file and then read into Stata, there are rare instances when double quotes are not allowed inside string variables.

messy specifies that all the intermediary files created by usesas during its operation are not to be deleted. The messy option prevents usesas from cleaning up after it has finished. This option is mostly useful for debugging purposes in order to find out where something went wrong. All intermediary files have a name starting with an underscore "_" followed by the process ID and are located in Stata's temp directory.

Examples

. usesas using "mySASdata.sas7bdat"

. usesas using "c:\data\mySASdata.ssd01", check

. usesas using "mySASdata.xpt", xport

. usesas using "mySASdata.sas7bdat", formats

. usesas using "mySASdata.sd2", quotes

. usesas using "mySASdata.sas7bdat", messy

. usesas using "mySASdata.sas7bdat", keep(id--qvm203a) if(1980<year<2000) in(1/500)

. usesas using "mySASdata.sas7bdat", describe

. usesas using "mySASdata.sas7bdat", describe listnot

// then submit the following actual invocation of usesas:

. usesas using "mySASdata.sas7bdat", clear keep(`r(sortlist)' `= trim(name[1])'--`= name[2047]')

NOTE: If you are setting up this program on your computer for the first time, please edit sasexe.ado to set the location of your SAS executable file (sas.exe). If you do not, usesas will try to set it for you. The sasexe.ado file is an ASCII text file and should be saved as such after editing. Stata's do-file editor will do the trick.

Setting up usesas

edit sasexe.ado (click to edit the sasexe.ado file; remember to save when done)

Author

Dan Blanchette
The Carolina Population Center
University of North Carolina - Chapel Hill, USA
dan_blanchette@unc.edu

Also see

On-line: use, fdause, savasas (if installed)