/************************************************************** * Savastata should work for you as is, but you may need * to edit this file to set the values for the macro variables: * ustata -- if you are using savastata in a unix or linux environment * wstata -- if you are using savastata in a windows environment * * There are instructions further down in this file explaining where * and how to edit the settings of these macro variables. * Do a find for "let ustata" or "let wstata". **************************************************************/ run; ** Because the world needs more run statements. **; %MACRO savastata(out_dir,options,sortedby,tfns,nosave,u_dir,U_SE,version=8); run; ** Because the world needs more run statements. **; ** If an error has occurred before call to savastata then fail immediately. **; %if &syserr^=0 %then %goto nevrmind; /********************************************************************************************* ** Macro: savastata ** Input: Most recently created SAS work dataset, directory path of where to save the output ** Stata dataset file, and any options. ** Output: A Stata 6, 7, 7-SE, 8, or 8-SE dataset, or an ASCII data file with ** other files created by savastata to input the data into Stata. ** ** ** Programmer: dan blanchette dan_blanchette@unc.edu ** Developed at The Carolina Population Center at The University of North Carolina at Chapel Hill ** Date: 20October2003 ** Modified: 07Aug2006 - Added ability to handle non-alphanumeric characters in directory names. ** - Added ability for batch invocation of SAS to set location of ustata ** with the sysparm macro variable. ** - Added checking HOSTNAME when discerning where to log usage. ** Modified: 29May2006 - Fixed char2lab option problem with usesas & savas when ** formats were stored in formats.sas7bcat ** Modified: 04Apr2006 - Fixed it so that locations of the Stata executable file ** that have spaces in the directory name no longer crash ** savastata. 
The %sysexec macro is not supposed to require ** double quotes in such situations but adding double quotes ** fixed the problem. ** Modified: 14Mar2006 - no longer searches for stata.exe when run by -usesas- ** Modified: 13Feb2006 - fixed formatting of special missings issues. ** Modified: 10Nov2005 - fixed when usagelog not set issues. ** Modified: 29Sep2005 - fixed situation where usesas calls savastata and savastata closed user's log ** - added max str var length to 244 if Stata version is >= 9.1 ** since Stata 9.1 does not allow you to save as Stata 8 or 9.0 any special missing ** will be lost since -saveold- saves as Stata 7. ** Modified: 09Aug2005 - If using/saving to Stata 9 and using SAS 9, then SAS user-defined formats ** that have a max length of 32,767 characters can have up to the first ** 32,000 (Stata 9's limit) saved as Stata value labels. ** - new option char2lab which runs the SAS macro char2fmt is only helpful ** if using Stata 9 and only allowed if savastata is being run by savas ** or usesas since it changes data dramatically ** - check users path for stata executable, For *nix, "stata" or "stata-se" ** seem to just work so no need to go hunting. ** - does not save data if Stata does not report same number of obs and vars ** - no longer searches for stata.exe when run by -usesas- ** - spaces in directory names (where intermediary files put) in *nix seems to be fixed. ** - for when -usesas- runs savastata, the maxvar setting is not changed if ** savastata suggests a number lower than the user has already specified. ** otherwise savastata sets maxvar to about 10% of the difference of ** (32,766 - (all variables)) ** savastata has successfully worked with 32,766 variables! ** Modified: 24Jun2005 - made sure that if unix/linux directory names had spaces all is still fine ** Modified: 14Jun2005 - fixed it so that directory names with commas in them are okay and problem ** with changing back to pwd that was introduced in 15May2005 version. 
** Modified: 15May2005 - fixed it so that directory names that have commas in them or
**                       start with forward slashes (universal filenaming convention)
**                       are okay.
**                     - if user has a profile.do that cd's to another directory,
**                       that is okay.
** Modified: 10May2005 - updated search for windows stata executable file to find up to
**                       version 12 Stata (if directory naming conventions stay predictable)
**                     - made it work when user does not have Stata and wants to save to ascii file
**                     - made it work when user has profile.do changing directories on savastata
**                       This introduces problems when Unix/Linux directory contains a double quote...
**                       but what good Unix/Linux user would name a directory containing a double quote?!
** Modified: 22Mar2005 - fixed problems related to when a new dataset name is provided
** Modified: 08Dec2004 - The new name of the Stata datafile can be upper case or mixed case.
** Modified: 09Nov2004 - User can now specify the name of the Stata datafile they want.
**                         %savastata("c:\MyProject\My{98}Stata!Data.dta", replace );
**                       This is helpful if they want to use an invalid SAS filename but valid Stata file name.
**                       !The Stata dataset name must end in ".dta"!
**                     - Thanks to West, fixed a bug that was related to formats using the "Other" category.
** Modified: 26Oct2004 - Stopped allowing user to set what version of Stata they are using. Savastata will always figure it out.
**                     - also changed note in log to report Stata version as integer instead of 8.2 since the version of
**                       a dataset is not that specific.
** Modified: 22Jul2004 - fixed bug with numeric variables that had both positive and negative
**                       formatted values, and made a few tweaks, one of which is that the raw
**                       ascii data file in the work
**                       directory is deleted when savastata successfully completes.
** Modified: 27Apr2004 - fixed macro so that setting use8SE=1 does not generate error messages
**                       about macro variable VER.
**                     - fixed warning message about script macro var when usagelog not used.
** - added more locations to find the windows Stata executable. ** Modified: 26Feb2004 - memory setting fix for -usesas- ** Modified: 03Feb2004 1) when -usesas- calls macro memory will not be reset if not needed. ** 2) no longer closes existing logs when called by -usesas- . ** ** Disclaimer: This program is free to use and to distribute as long as credit is given ** to Dan Blanchette and The Carolina Population Center at UNC-CH. ** The University of North Carolina at Chapel Hill is not responsible ** for datasets created by this macro. It is the responsibility of the user ** to check the quality of the output dataset with the original dataset. ** There is no warranty on this software either expressed or implied. ** This program is released under the terms and conditions of ** GNU General Public License. ** Comments: ** Savastata SAS macro when implemented by the SAS System saves the most recently created ** SAS dataset in the work library to a Stata dataset. Savastata requires that you have a ** working copy of SAS and a working copy of Stata Intercooled or SE on your computer to run ** successfully. If your SAS dataset is small enough savastata may work on Stata Small ** (Student Version). ** If your SAS dataset is using formats that are in a formats catalog (work.formats or ** library.formats), savastata will make an attempt to preserve them as value labels in ** Stata. Stata does not allow all the variations of user-defined formats that SAS offers. ** ** Savastata may take a few minutes to save your dataset. ** ** Savastata will work in SAS interactive mode or in SAS batch mode. ** Savastata will run on various operating systems including: Windows, Red Hat Linux, AIX. ** It may work fine on others as well. 
**
** Savastata uses the most recently created dataset in the work directory,
** figures out how best to store the data as a Stata dataset, keeps most date
** formats and other basic formats for numeric variables, checks for invalid
** Stata variable names, if using or outputting to Stata 6: checks to see if long variable
** names need to be shortened, prints them to the SAS log and shortens them, writes out
** the SAS dataset to an ASCII data file, attempts to preserve all user-defined formats for
** numeric variables, writes out Stata do-files, and submits them to Stata in batch mode
** in order to have Stata read the data in and save it as a Stata dataset.
**
** Restrictions of each version of Stata:
**  1. Variable names can be no more than 8 characters long: Stata 6
**  2. Variable names can be up to 32 characters long: Stata 7 and 8
**  3. String variables can contain a maximum of 80 characters: Stata 6, 7 and 8 Intercooled.
**  4. String variables can contain a maximum of 244 characters: Stata 7 and 8 SE.
**  5. The maximum number of variables is 2,047: Stata 6, 7, and 8 Intercooled
**  6. The maximum number of variables is 32,766: Stata 7
**  7. The maximum number of variables is 32,767: Stata 8 and 9 SE
**  8. Special missings (.a through .z) for numeric data are allowed: Stata 8
**
**
**  -- Savastata can run on SAS ver 7, 8 or 9
**  -- Savastata can run in the following environments: Windows, RS6000, SUN, LINUX
**     and maybe on many others.
**
** REQUIRED INPUT TO SAVASTATA:
**  -- Savastata needs to have the most recently created dataset by SAS to be in the
**     work directory.
**  -- Savastata needs to know what directory to put the Stata dataset and possibly
**     the files used to input the data to Stata that are written by savastata.
**  These are the only required inputs, but you may choose to make use of the
**  following options.
**
** LIST OF OPTIONS:
**  NOTE: Options can be used in any order. You can specify as many as you want
**        or none at all.
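**
**  EXAMPLE (a sketch only; c:\mydata\ is a placeholder directory chosen for illustration):
**        Because options are matched wherever they appear in the option string,
**        the following two calls should behave the same way:
**           %savastata(c:\mydata\, -replace -check);
**           %savastata(c:\mydata\, -check -replace);
**        Leaving the option string empty should also be acceptable:
**           %savastata(c:\mydata\,);
**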
** The following three options work as they do in Stata:
**  -replace  -- If the Stata dataset you want to output already exists, then overwrite it with
**               the dataset generated by savastata.
**
**  -old  -- Outputs a Stata 6 dataset if using Stata 7 or Stata 7-SE,
**           or a Stata 7 Intercooled dataset if using Stata 8 Intercooled
**           or a Stata 7-SE dataset if you are using Stata 8-SE.
**
**  -intercooled  -- Outputs a Stata Intercooled dataset if using Stata 8-SE or Stata 7-SE.
**
**  NOTE: If you do not specify what version of Stata to save in, Stata will save the
**        dataset in the current version.
**
**  -float  -- Numeric variables that contain decimals will be stored as float instead of the
**             default of double. This may result in a loss of precision, but float is the
**             default storage type that Stata uses. This will help decrease your filesize.
**
**  -quotes  -- Replace double quotes ( " ) occurring in character variables with single quotes ( ' )
**              and replace compound quotes ( `" or "' ) occurring in variable labels or formats
**              with single quotes ( ' ).
**              Savastata cannot process character variables with double quotes or variable
**              labels or formats with compound quotes.
**
**  -messy  -- Puts the files generated by savastata used to create the Stata dataset in the directory
**             named in the pathname provided in the call to the savastata macro.
**
**  -check  -- Creates two check files for the user to compare the SAS input dataset with
**             the Stata output dataset to make sure savastata created the files correctly.
**             This is a comparison that should be done after any data file is converted
**             to any other type of data file by any software. The files are
**             created in the same directory as the output Stata data file and are named
**             starting with the name of the data file followed by either "_SAScheck.lst"
**             or "_STATAcheck.log", e.g. "mydata_SAScheck.lst" and "mydata_STATAcheck.log".
**             The SAS check file contains output from proc means,
**             proc contents (by position), and a proc print of the first 5 observations.
**             The Stata check file contains the equivalent output by the commands summarize,
**             describe, and a list of the first 5 observations.
**
**  -ascii  -- Outputs only the ASCII data file and does not save your dataset in Stata format.
**             It also turns on the "-messy" switch so that the Stata input files are
**             not deleted after your program has run. Use this switch to make your own
**             edits to the input of these data into Stata. Use this option if you do not
**             have Stata on the same computer on which you have SAS. The files generated by
**             savastata can be moved to another computer that does have Stata and run there to
**             create the Stata dataset.
**
**  -char2lab  -- Runs the CHAR2FMT macro but only if savastata is invoked by the savas script or usesas
**                because CHAR2FMT changes the user's dataset in a dramatic way.
**                CHAR2FMT converts long character variables to numeric vars and stores character data
**                in user-defined formats which get translated into Stata value labels which have
**                a maximum length of 32,000 characters (new feature in Stata 9).
**
** SETTING UP SAVASTATA
**  These are instructions to edit the savastata.mac file.
**
**  NOTE: If you are setting up this macro on your computer for the first time,
**        please choose which version of Stata you are going to have savastata use.
**        If you do not choose to set one of the following switches, savastata will
**        figure out what version of Stata you are running for you. This may
**        add a noticeable amount of time to processing so you may want to set these
**        switches to the correct version of Stata. You can easily figure out what
**        version of Stata you are using by looking at the top of your results window
**        in Stata or by typing in the command "about" at the Stata command line.
** One advantage of leaving savastata to figure out what version of Stata is ** being used is that when you upgrade your version of Stata you will not have to ** update savastata. ** ** NOTE: ** -- If you are running savastata on the Unix or Linux platforms then ** you need to be able to start a Stata batch job by: ** stata -b do mydofile.do ** If not then change the setting of the ustata macro variable. **********************************************************************************************/ /** This may work: **/ %let ustata=/usr/local/stata/stata; ** tfns is only submitted by savas and usesas and savas may set the location of stata **; ** savas may invoke SAS like so: sas -sysparm /alt_location/stata my_sas.sas **; %if "&tfns"^="" and "&sysparm"^="" and %sysfunc(fileexist("&sysparm")) %then %do; %let ustata=%nrbquote(&sysparm); %end; /********************************************************************************************** ** ** -- If you are running savastata on the Windows platform, you need to ** tell savastata where the stata executable file is located. ** If you do not know where your stata executable file is located, find your Stata ** short-cut icon, right click on it, choose "properties", and look in the "target" field. ** This will show you where the stata executable file is located on your hard drive. 
** ***********************************************************************************************/ ** Change what is inside the parentheses to the location of your stata executable file **; /** This may work: **/ %let wstata=%nrstr(c:\Stata9\wstata.exe); /** savastata will try the following (plus up to ver 12) if the first does not work: %let wstata=%nrstr(c:\Stata8\wsestata.exe); ** Stata 8 SE **; %let wstata=%nrstr(c:\Program Files\Stata8\wsestata.exe); ** Stata 8 SE **; %let wstata=%nrstr(d:\Stata8\wsestata.exe); ** Stata 8 SE **; %let wstata=%nrstr(c:\Program Files\Stata8\wstata.exe); ** Stata 8 **; %let wstata=%nrstr(d:\Stata8\wstata.exe); ** Stata 8 **; %let wstata=%nrstr(c:\Stata8\wsestata.exe); ** Stata 8 SE **; %let wstata=%nrstr(c:\Stata7\wstata.exe); ** Stata 7 or 7 SE **; %let wstata=%nrstr(c:\Stata\wstata.exe); ** Stata 6 or 7 or 8, the catcher **; %let wstata=%nrstr(j:\afs\isis.unc.edu\pc-pkg\stata-90\program\wsestata.exe); ** Stata 9 SE ** %let wstata=%nrstr(j:\afs\isis.unc.edu\pc-pkg\stata-80\program\wsestata.exe); ** Stata 8 SE ** ** then savastata searches your path ** ************************************************/ /********************************************************************************************* ** ** HOW TO USE THE SAVASTATA MACRO: ** Using the savastata macro requires that you understand how to use the "%include" statement ** and that you know how to call a SAS macro. ** ** %include'LOCATION AND NAME OF A FILE THAT CONTAINS SAS CODE'; ** ** For example, if you have copied this file to "c:\SASmacro", then you tell SAS ** about this macro by adding the following line to your SAS program: ** ** %include 'c:\SASmacro\savastata.mac'; ** ** This statement makes SAS aware of the savastata macro which is in the file savastata.mac. ** To use the macro you have to make a call to it. 
To do that, add a line like the
** following to your SAS program:
**
**   %savastata(c:\mySASdir\,-old);
**
** The information inside the parentheses is passed on to the savastata macro. The first
** string of information is the location where you want to save your SAS dataset as a Stata dataset.
** This is somewhat like a libname statement. The second string of information is the options
** you wish to pass on to the savastata macro. You can use as many options as you like or none at all.
**
**
** EXAMPLE USE OF THE SAVASTATA MACRO:
**   %include 'c:\SASmacro\savastata.mac';  ** Include macro once in a SAS session and call it
**                                          ** as many times as you like in that session. **;
**
**   data work.ToBeStata;       ** This makes a copy of the SAS dataset in
**                              ** the work library. **;
**    set in.mySASfile;
**   run;
**
**
**   %savastata(c:\mydata\,);   ** Saves the dataset in the c:\mydata\ directory if it does not
**                              ** already exist in that directory. **;
**
** OTHER EXAMPLE CALLS:
**
**   %savastata(c:\mydata\,-replace);  ** Saves the dataset in c:\mydata\, overwriting it if
**                                     ** it already exists. **;
**
**   %savastata(c:\data\,-old);  ** Saves the dataset as the previous version of Stata
**                               ** in the c:\data\ directory **;
**
**   %savastata(c:\data\,-old -replace);  ** Saves the dataset as the previous version of Stata
**                                        ** in the c:\data\ directory, overwriting it if it
**                                        ** already exists. **;
**
**   %savastata(c:\data\,-intercooled);  ** Saves the dataset as the Intercooled version of Stata
**                                       ** in the c:\data\ directory. This is only possible if
**                                       ** your version of Stata is an SE edition.
**; ** ** %savastata(/project/data/,-old -intercooled); ** Saves the dataset as previous version of ** Stata Intercooled in the /project/data/ ** directory **; ** ** ***********************************************************************************************/ ** SET LOCATION OF USAGE LOG FILE **; %let usagelog="specify what file name and location you want here"; %if "&sysscp"="WIN" %then %do; %let usagelog="x:\software\temp\savas_usage.log"; /* windoze */ %end; %if "&sysscp"="RS6000" %then %do; %let usagelog="/backups/usage/savas_usage.log"; /* gromit */ %end; %if "&sysscp"="LINUX" %then %do; /* linux boxes */ %if %index(%qlowcase(%qcmpres(%sysget(HOSTNAME))),"gromit") = 1 or %index(%qlowcase(%qcmpres(%sysget(HOSTNAME))),"sig") = 1 or %index(%qlowcase(%qcmpres(%sysget(HOSTNAME))),".cpc.") ^= 0 %then /* %let usagelog = "/backups/usage/savas_usage.log"; */ %let usagelog="/afs/isis.unc.edu/home/d/a/danb/usage/savas_usage.log"; %else %let usagelog="/afs/isis.unc.edu/home/d/a/danb/usage/savas_usage.log"; %end; %if %index(%qupcase(&sysscp),SUN) %then %do; /* %let usagelog="/tmp/savas_usage.log"; */ /* sunny */ %let usagelog="/afs/isis.unc.edu/home/d/a/danb/usage/savas_usage.log"; %end; /***************************************************************************/ /****** !NO MORE EDITS TO THE MACRO SHOULD BE MADE BEYOND THIS POINT! ******/ /***************************************************************************/ ** Save option settings so they can be restored at the end of this macro. **; %let notes=%sysfunc(getoption(notes)); %let obs=%sysfunc(getoption(obs)); options obs=MAX; *** Reason for maximizing it is because user could have * set it lower than the number of variables in the dataset. **; options nonotes; ** Shut off notes while program is running in order to reduce log size. 
**; ** Time how long savastata takes to run **; data _null_; call symput("startdat",datetime()); run; ** initialize macro vars **; %let diffhour=0; %let diffmin=0; %let diffsec=0; %let fail=0; %let success=0; %if %nrbquote(%index(%nrbquote(&out_dir),%str(%")))=1 or %nrbquote(%index(%nrbquote(&out_dir),%str(%')))=1 %then %let out_dir=%nrbquote(%substr(%nrbquote(&out_dir),2,%length(%nrbquote(&out_dir))-2)); %if %nrbquote(%index(%nrbquote(&u_dir),%str(%")))=1 or %nrbquote(%index(%nrbquote(&u_dir),%str(%')))=1 %then %let u_dir=%nrbquote(%substr(%nrbquote(&u_dir),2,%length(%nrbquote(&u_dir))-2)); ** initialize var **; %let newname=; ** check to see if new dataset name provided in %nrbquote(&out_dir) **; %if %nrbquote(%length(&out_dir)) > 0 %then %do; %if %nrbquote(%index(%qlowcase(&out_dir.),.dta)) %then %do; ** if Stata dataset name provided with directory info **; %if "%qlowcase(%substr(%nrbquote(&out_dir),%length(%nrbquote(&out_dir))-3,4))"=".dta" %then %do; ** if no backslash provided: savstata(d:mydata.dta) then add in the backslash **; %if "&sysscp"="WIN" and %nrbquote(%index(%nrbquote(&out_dir),:)) = 2 and %nrbquote(%index(%nrbquote(&out_dir),\)) ^= 3 %then %let out_dir = %nrbquote(%substr(%nrbquote(&out_dir.),1,2)\%substr(%nrbquote(&out_dir.),3,%length(%nrbquote(&out_dir.))-2)); %let newname=%nrbquote(%substr(%nrbquote(&out_dir.),1,%length(%nrbquote(&out_dir.))-4)); %if %index(%nrbquote(&newname),\) %then %do; %do %while(%nrbquote(%index(%nrbquote(&newname),\))); %let newname=%nrbquote(%substr(%nrbquote(&newname),%index(%nrbquote(&newname),\)+1,%length(&newname)-(%index(%nrbquote(&newname),\)))); %end; %let out_dir=%nrbquote(%substr(%nrbquote(&out_dir.),1,%index(%nrbquote(&out_dir),%nrbquote(&newname))-1)); %end; %else %if %index(%nrbquote(&newname),/) %then %do; %do %while(%nrbquote(%index(%nrbquote(&newname),/))); %let newname=%nrbquote(%substr(%nrbquote(&newname),%index(%nrbquote(&newname),/)+1,%length(&newname)-(%index(%nrbquote(&newname),/)))); %end; %let 
out_dir=%nrbquote(%substr(%nrbquote(&out_dir.),1,%index(%nrbquote(&out_dir.),%nrbquote(&newname))-1)); %end; %else %let out_dir=; ** only new name provided **; %end; %end; ** end of if %index(%nrbquote(&out_dir),.dta) do loop **; %end; ** end of if length(%nrbquote(&out_dir)) = 0 do loop **; %let s_dsn=&sysdsn; ** preserve these to restore after setting up usagelog **; ** script var is definitive way to determine how savastata was called, * if empty then not called by either usesas or savas **; %let script=; %if "&tfns" ^= "" and "&nosave"="nosave" %then %do; %let script=usesas; %end; %else %if "&tfns"^="" and "&nosave"="" %then %do; %let script=savas; %end; ** log usage of savastata if usage log file exists **; %if %sysfunc(fileexist(&usagelog)) %then %do; data _null_; file &usagelog mod; %if "&script"="" %then %do; put " "; date=datetime() ; put " savastata macro " date dateampm. ; %end; put " &sysuserid. savastata( &out_dir,&options,&sortedby,&tfns,&nosave,&u_dir,&U_SE )"; %end; %let sysdsn=&s_dsn; ** restore after setting up usagelog **; %let noisily = ; %let quietly = ; %if "&script" = "usesas" %then %do; options nonotes nodate; %let noisily =noisily; %let quietly =quietly; ** proc printto prints any weird error messages that SAS has to this log file in a nice, readable format because usesas looks for this file and prints it to the results window **; proc printto log="&out_dir._&tfns._report.log" new ; run; %end; ** current website address for savastata help: used in fail messages at end of macro **; %let http=%nrstr(http://www.cpc.unc.edu/services/computer/presentations/sas_to_stata/savastata.html); %let char2fmthttp=%nrstr(http://www.cpc.unc.edu/services/computer/presentations/sas_to_stata/char2fmt.html); ** Find out what directory SAS currently is using as the present working directory so that it can be restored at end of macro. 
**; libname ________ " "; ** ________ is a very unlikely libname **; %let pwdir=%nrbquote(%sysfunc(pathname(________))); %let pwdrive= ; %IF %index(%nrbquote(&pwdir),\) %THEN %do; %let pwdrive= %qsubstr(%nrbquote(&pwdir),1,2); ** get drive info eg. "d:" **; %end; ** if no temporary filenames are supplied then use sysjobid macro var **; %if %length(&tfns)=0 %then %let tfns=&sysjobid.&sysindex; ** Have macro var that will increase each time macro run for times when one SAS session runs savastata macro multiple times and -messy option specified. ***; *** Use the most recently created SAS work dataset. ***; %let s_last=&syslast; %let ldset=%length(&syslast); %let decpos=%index(&syslast,.); %let dset=%substr(&syslast,&decpos.+1,&ldset.-&decpos); ** use the work directory to store the SAS files that this program creates ***; %let temp_dir = %nrbquote(%sysfunc(pathname(work))); %let work_dir = %nrbquote(%sysfunc(pathname(work))); ** this is first time program goes to a fail label **; %if %index(%nrbquote(&work_dir),\)=1 %then %goto fail18; ** Work directory cannot start with a back slash * because savastata needs to cd to it. **; ** Figure out whether the operating system uses forward slashes or back slashes in directory paths and make sure that out_dir has the appropriate slash at the end. **; %let unix=0; %let drive= ; %IF %index(%nrbquote(&temp_dir),\) %THEN %do; %let unix=0; ** unix=0 implies windows platform **; %let drive= %qsubstr("&work_dir",2,2); ** get drive info eg. "d:" **; %let temp_dir = %nrbquote(&temp_dir.)\; ** tack on a back slash **; %if "&out_dir"="" %then %goto fail4; %else %if "&out_dir"=" " %then %goto fail4; %else %if "&out_dir"="." 
    %then %goto fail4;
  %let slash= %qsubstr("&out_dir",%length("&out_dir")-1,1);  ** check if back slash at end **;
  %if "&slash"^="\" %THEN %do;
    %let out_dir= %nrbquote(&out_dir.)\;  ** add a back slash at end if it is not there already **;
  %end;
%end;
%ELSE %IF %index(%nrbquote(&temp_dir),/) %THEN %do;
  %let unix=1;  ** unix or unix-like platform **;
  %let temp_dir = %nrbquote(&temp_dir.)/;  ** tack on a forward slash **;
  ** make sure that out_dir is not a relative directory name like: ../mydata/ **;
  libname ________ "&out_dir";  ** ________ is a very unlikely libname **;
  %let out_dir=%nrbquote(%sysfunc(pathname(________)));
  %let slash= %qsubstr("&out_dir",%length("&out_dir")-1,1);  ** check if forward slash at end **;
  %if "&slash"^="/" %THEN %do;
    %let out_dir= %nrbquote(&out_dir.)/;  ** add a forward slash at end if it is not there already **;
  %end;
%end;  ** ELSE IF index("temp_dir",/) THEN do loop **;

** Make sure the dataset name and any option passed to savastata is in lower case. **;
%let dset=%lowcase(%nrbquote(&dset));
*%if %length(&newname)>0 %then %let fdset=%lowcase(%nrbquote(&newname)); ** force lowercase on newname **;
%if %length(&newname)>0 %then %let fdset=%nrbquote(&newname);
%else %let fdset=%lowcase(%nrbquote(&dset));
%let options=%lowcase(%nrbquote(&options));

%let udset=%qupcase(&fdset);

%if %index(&syslast,WORK)^=1 %then %goto fail1;

** if obs are set to zero, error in program previous to savastata **;
%if &obs=0 %then %goto fail13;

%if &udset=_CONTEN or &udset=_CONTEN1 or &udset=_CONTEN2 or &udset=_CONTEN3 %then %goto fail3;

** initialize macro var **;
%let workfmts=0;
%let librfmts=0;
%let vlabels=0;

** check to see if format libraries exist **;
%let workfmts=%sysfunc(cexist(WORK.FORMATS));
%let librfmts=%sysfunc(cexist(LIBRARY.FORMATS));
%if &workfmts=1 or &librfmts=1 %then %let vlabels=1;

** Initialize macro vars for savastata options **;
%let ascii=0;
%let float=0;
%let quotes=0;
%let messy=0;
%let intrcool= ;
%let old= ;
%let replace= ;
%let check=0;
%let char2lab=0; %if %nrstr(&options)^=%nrstr() %then %do; ** Find out what options were specified **; %if %index(&options,asci) %then %let ascii=1; ** set ascii option **; %if %index(&options,fl) %then %let float=1; ** set float option **; %if %index(&options,qu) %then %let quotes=1; ** set quote option **; %if %index(&options,mes) %then %let messy=1; ** set messy option **; %if %index(&options,old) %then %let old=old; ** set old option **; %if %index(&options,int) %then %let intrcool=intercooled; ** set intrcool option **; %if %index(&options,rep) %then %let replace=replace; ** set replace option **; %if %index(&options,rpl) %then %let replace=replace; ** set replace option **; %if %index(&options,che) %then %let check=1; ** set check option **; %if %index(&options,cha) %then %let char2lab=1; ** set char2lab option **; %end; ** Need to save all the files if ascii specified. **; %if &ascii=1 and &messy=0 %then %let messy=1; ** check to see if user has set up windows stata correctly. **; ** if not then check other likely places the Stata executable would be. **; %let fail20=0; %if not (&ascii=1 or "&script." 
= "usesas") %then %do; ** do not need the stata.exe when ascii or usesas running it *; %if &unix=0 and %sysfunc(fileexist("&wstata"))=0 %then %do %while(&fail20=0); %let drives =c d; %let exe =wsestata wstata; %let versions =12 11 10 9 8 7 6; %do i = 1 %to 2; ** one for each drive **; %let ii =%scan(&drives.,&i.,%nrstr( )); %do j = 1 %to 2; ** one for each exe of stata **; %let jj =%scan(&exe.,&j.,%nrstr( )); %do k = 1 %to 7; ** one for each version of stata **; %let kk =%scan(&versions.,&k.,%nrstr( )); %let wstata=%str(&ii.:\Stata&kk.\&jj..exe); %if %sysfunc(fileexist("&wstata"))=0 %then %do; %let wstata=%str(&ii.:\Program Files\Stata-&kk.\&jj..exe); ** Stata-9 **; %end; %else %goto exist; ** file exists **; %if %sysfunc(fileexist("&wstata"))=0 %then %do; %let wstata=%str(&ii.:\Program Files\Stata&kk.\&jj..exe); ** Stata9 **; %end; %else %goto exist; ** file exists **; %if %sysfunc(fileexist("&wstata"))=0 %then %do; %let wstata=%str(&ii.:\Stata-&kk.\&jj..exe); ** Stata-9 **; %end; %else %goto exist; ** file exists **; %if %sysfunc(fileexist("&wstata"))=0 %then %do; %let wstata=%str(&ii.:\Stata&kk.\&jj..exe); ** Stata9 **; %end; %else %goto exist; ** file exists **; %if %sysfunc(fileexist("&wstata"))=0 %then %do; %let wstata=%str(&ii.:\Stata\&jj..exe); %end; %else %goto exist; ** file exists **; %if %sysfunc(fileexist("&wstata"))=0 %then %do; %let wstata=%str(j:\isis.unc.edu\pc-pkg\stata-&kk.\program\&jj..exe); /** UNC Stata **/ %end; %else %goto exist; ** file exists **; %if %sysfunc(fileexist("&wstata"))=0 %then %do; ** nothing **; %end; %else %goto exist; ** file exists **; %end; ** of k loop **; %end; ** of j loop **; %end; ** of i loop **; %do; ** check path for Stata executable **; %let i=1; %let delim=%str(;); %do %until (%qscan(%sysget(PATH),&i.,%str(&delim.)) = ); %let wstata="%qscan(%sysget(PATH),&i.,%str(&delim.))\wsestata.exe"; %if %sysfunc(fileexist("&wstata"))=0 %then %let wstata="%qscan(%sysget(PATH),&i.,%str(&delim.))\wstata.exe"; %if 
          %sysfunc(fileexist("&wstata")) %then %do;
          %let fail20=2;
          %let i=200000;  ** break loop if found it **;
        %end;
        %let i=%eval(&i.+1);
      %end;
      %if &fail20=2 %then %goto exist;  ** file exists **;
    %end;  ** of checking path for Stata executable **;

    %if %sysfunc(fileexist("&wstata"))=0 %then %do;
      %let fail20=1;  ** give up **;
    %end;
    %exist: ;
    %if &fail20=0 %then %let fail20=2;  ** found file so break while loop **;
  %end;  ** end of if unix=0 then do while loop **;
%end;  ** end of if &ascii = 0 and &script = "usesas" **;

%if &fail20=1 %then %goto fail20;

libname ________ "&out_dir";  ** ________ is a very unlikely libname **;
%if &syslibrc^=0 %then %do;
  libname ________ clear;  ** do away with it now **;
  %goto fail5;  ** exit if not a valid pathname **;
%end;
libname ________ clear;  ** do away with it now **;

%if &messy=1 %then %do;
  ** Use the output directory to store the SAS program files that this macro creates. ***;
  %let temp_dir = %nrbquote(&out_dir);
  %let work_dir = %nrbquote(&out_dir);
  %if &unix=0 %then %do;
    %let windrive= %qsubstr("&out_dir",2,2);  ** get drive info eg. "d:" **;
    %sysexec &windrive;  ** change to whatever drive files are going **;
  %end;
  ** sysexec requires no quotes even when changing to dirs with spaces in windows or unix **;
  %sysexec cd %nrbquote(&out_dir) ;  ** change to the drive and directory where the Stata do files are.
**; %end; %if "&u_dir"^="" %then %do; ** this happens when usesas or savas script call savastata **; %let out_dir = %nrbquote(&u_dir.); %end; ** Initialize macro vars **; %let useSE=0; %let SE=0; ** initialize vars **; %let use6=0; %let use7=0; %let use7SE=0; %let use8=0; %let use8SE=0; %if %eval(&use6 + &use7 + &use8)>1 %then %goto fail2; %if %eval(&use6 + &use7 + &use7SE + &use8 + &use8SE)=0 and &ascii=0 %then %do; data _null_; file "&temp_dir._&tfns._version.do"; put " capture program drop stata_v"; put " program define stata_v, nclass"; put " capture log close "; put " quietly log using ""&temp_dir._&tfns._ver.log"" "; put ' display "%let version=" _caller() " ; " '; put ' display "%let SE=$S_StataSE ;" '; put " quietly capture log close " ; put " end "; put " stata_v"; run; %let uspace=0; ** flag if in non-Windows and directory name has a space in it **; %if "&nosave."="" %then %do; ** if not run by usesas **; %if &unix=1 %then %do; ** sysexec requires no quotes even when changing to dirs with spaces in windows or unix **; %sysexec cd %nrbquote(&work_dir) ; ** change to the drive and directory where the Stata do files are. **; %if %index(%nrbquote(&temp_dir.),%str( )) %then %do; /* dirname has at least one space in it, likely only a problem when using messy because * not likely a workdir will have spaces in *nix environ. * problem is that SAS/*nix(?) does not maintain the double quotes around dir and filename. * stata -b do "/some dumb name/_28492_version.do" * ends up being: * stata -b do /some dumb name/_28492_version.do * which creates some.log but since savastata has cd-ed to /some dumb name/ some.log * is in that directory. The solution of putting single quotes around the double quotes * seems to work but will create a file firstwordofdirnamewithaspace.log which seems okay * since usesas runs the stata programs itself and so does savas and SAS deletes workdirs * and if user runs in messy they'll figure it out. 
* if user does not have a profile.do command that always changes them to their fav directory * then savastata should work just fine. **/ %if &messy=1 %then %do; ** only warn if using messy option **; %put You have at least one space in your directory: *; %put "&temp_dir." * ; %put Spaces in directory names cause problems for savastata. *; %put The file: *; %put "&temp_dir.firstwordindirnamewithaspace.log" *; %put will be created thanks to that pesky space. *; %put Consider_naming_directories_without_using_spaces. *; %let uspace=1; %end; %sysexec %nrbquote(&ustata) -b do %str(%')"&temp_dir._&tfns._version.do"%str(%'); ** Run Stata in batch. **; %end; /* end of if dirname has space in it */ %else %do; %sysexec %nrbquote(&ustata) -b do "&temp_dir._&tfns._version.do"; ** Run Stata in batch. **; %end; %end; ** if unix=1 do loop **; %if &unix=1 and &sysrc^=0 %then %goto fail21; %if &unix=0 %then %do; %sysexec &drive; ** sysexec requires no quotes even when changing to dirs with spaces in windows or unix **; %sysexec cd %nrbquote(&work_dir) ; %sysexec "&wstata" /e do "&temp_dir._&tfns._version.do"; ** Run Stata in batch. **; %end; ** if unix=0 then do loop **; %if %sysfunc(fileexist("_&tfns._ver.log")) %then %do; %include "_&tfns._ver.log"; %end; %else %goto fail22; %if &version=6 %then %let use6=1; %else %if &version=7 %then %let use7=1; %else %if &version>=8 %then %let use8=1; ** use >= 8 to make it still work with * future versions of Stata **; %if &SE=SE %then %let useSE=1; %end; /** end of "&nosave." ^= "" (not being run by usesas) **/ %end; ** if eval(use6 + use7 + use7SE + use8 + use8SE)=0 and ascii=0 then do loop **; ** Set useSE macro based on preset switches at top of macro. **; %if &use7SE=1 %then %do; %let use7=1; %let useSE=1; %end; %if &use8SE=1 %then %do; %let use8=1; %let useSE=1; %end; %if "&script." 
= "usesas" %then %do; %let use8=1; ** use8=1 because usesas does not work on previous versions **; %let useSE=&U_SE; %end; ** Have to be using Stata SE to specify that you are saving a Stata Intercooled dataset. **; %if &useSE=0 %then %let intrcool= ; /****** okay to have ascii plus further specification of the Stata datafile ** Can only choose to save one dataset type. **; %if &ascii=1 and &old=old %then %goto fail6 ; %if &ascii=1 and &intrcool=intercooled %then %goto fail6 ; ******end of "what was I thinking" ******/ ** set what version to save to **; %let save6=0; %let save7=0; %if &use7=1 and &old=old %then %let save6=1; %if &use8=1 and &old=old %then %let save7=1; ** make version be what version stata file to be created **; %if &save6=1 and &version. = 7 %then %let version=6; %if &save7=1 and &version. >= 8 %then %let version=7; %if &char2lab=1 %then %do; %if &version. < 9 %then %do; %put WARNING: The option char2lab not beneficial prior to Stata 9. *; %let char2lab=0; %end; %if ("&script." = "" ) %then %do; /* The option char2lab is not allowed when SAVASTATA is not run by the savas script * or the Stata command -usesas-. */ %put WARNING: The option char2lab is not allowed. *; %put Consider running the SAS macro CHAR2FMT before running SAVASTATA. *; %put For more help check here: &char2fmthttp. ; %let char2lab=0; %end; %if &char2lab=0 %then %do; %put WARNING: char2lab option will be ignored. * ; %end; %end; %if &save6=1 %then %do; %let maxstrvarlen=80; %let maxvallablen=80; %end; %else %if "&intrcool." = "intercooled" %then %do; %let maxstrvarlen=80; %let maxvallablen=80; %end; %else %if &useSE=0 %then %do; %let maxstrvarlen=80; ** this changes a few lines down to 244 if version >= 9.1 **; %let maxvallablen=80; ** this changes a few lines down to 32000 if version >= 9 **; %end; %else %do; %let maxstrvarlen=244; %let maxvallablen=244; %end; %if &version. >= 9 and &sysver < 9 %then %let maxvallablen=256; %if &version. 
>= 9 and &sysver >= 9 %then %let maxvallablen=32000; %if &version. >= 9.1 %then %let maxstrvarlen=244; ** Stata 9.1 now allows Student and Intercooled ** to have str244 vars **; ** make sure version is an integer. **; %let version=%sysfunc(int(&version.)); %let dta_exists=0; %if "&script." ^= "usesas" %then %do; %if %sysfunc(fileexist("&out_dir.&fdset..dta")) %then %do; %let dta_exists=1; %if &replace^=replace and &ascii=0 %then %do; %goto fail7; %end; %end; %end; ** check for long character variables only if using Stata 9 or higher **; %if &version. >= 9 %then %do; proc contents data=&dset. out=_conten noprint; run; proc sort data=_conten; by type; run; %let max_clen_count=0; data _null_; set _conten (where=(type=2)) end=lastobs ; retain max_clen_count 0; by type; if length > &maxstrvarlen. then do; max_clen_count = max_clen_count + 1; %if &char2lab. = 0 and "&script." ^= "" %then %do; put "WARNING: This is a list of character variables that are going to be truncated by savastata *"; put " because they are longer than &maxstrvarlen. characters. *"; put "Consider using the char2lab option to encode them to numeric with value labels in Stata. *"; %end; %else %if &char2lab. = 0 and "&script." = "" %then %do; put "WARNING: This is a list of character variables that are going to be truncated by savastata *"; put " to the first &maxstrvarlen. characters. *"; put "Consider using the SAS macro CHAR2FMT to convert them to numeric variables with *"; put " SAS formats containing their character data. SAVASTATA saves formats as value labels in Stata. *"; %end; %else %if &char2lab. = 1 %then %do; put "This is a list of character variables that are going to be made into numeric variables *"; put " but have value labels containing their character contents because they *"; put " contain more than &maxstrvarlen. characters. *"; %end; put " " name " *"; end; if last.type and type=2 then do; call symput( 'max_clen_count', compress(left( put( max_clen_count, 10. 
) ) ) ) ; end; run; ** only do char2fmt if user asked for it and have long character vars that need * to be made into labels **; * char2fmt creates and deletes temporary dataset _conten2 **; %if (&char2lab.=1) and (&max_clen_count. > 0 ) %then %do; %char2fmt(dset=&dset. , maxlen=&maxstrvarlen. , temp_dir=%nrbquote(&temp_dir.) , tfns=&tfns. ); %let workfmts=%sysfunc(cexist(WORK.FORMATS)); %end; ** end begin char2lab process **; %end; ** of if version >=9 **; ** run proc means to check data after potential change in data by char2fmt **; %if &check=1 %then %do; proc printto print="&out_dir.&fdset._SAScheck.lst" new ; run; %if "&sortedby"^="" %then %do; title "Data are sorted by: &sortedby"; %end; proc means data=&dset; proc contents data=&dset position; proc print data=&dset (obs=5); run; proc printto; ** ends printing to means.lst and returns printing to normal **; run; %end; ** Here starts the processing of the dataset. **; ** Create a dataset of the dataset info of the input dataset. **; proc contents data=&dset. out=_conten noprint; run; ** Initialize macro vars **; %let nv=0; %let cv=0; %let ch=0; %let ln=0; %let ob=0; %let lo=0; %let dq=0; %let sorted=0; %let bign=0; %let bign1=0; %let name=0; ** Find out if data are sorted and check for vars named _N and _______N. 
**; data _null_; set _conten(keep=name sorted) end=lastobs; retain bign bign1 0; name=lowcase(name); if name="_n" then bign=1; if name="_______n" then bign1=1; if name="___nv___" then call symput("nv",1); if name="___cv___" then call symput("cv",1); if name="___ch___" then call symput("ch",1); if name="___ln___" then call symput("ln",1); if name="___ob___" then call symput("ob",1); if name="___lo___" then call symput("lo",1); if name="___dq___" then call symput("dq",1); if name="_name_" then call symput("name",1); ** make macro var sorted equal to 1 if data are sorted **; if sorted=1 then call symput("sorted",1); if lastobs then do; call symput("bign",bign); call symput("bign1",bign1); end; run; /** DR's little fix **/ %let crdate=; data _null_; date=put(date(),date9.); call symput("crdate",date); run; %let crdate=Savastata created this dataset on &crdate; ** An attempt to rename _N to _______N will fail * because both vars exist in the dataset. **; %if &bign=1 and &bign1=1 %then %goto fail8; %let bvar=________; %if &nv=1 %then %do; %let bvar=___nv___; %goto fail9; %end; %if &cv=1 %then %do; %let bvar=___cv___; %goto fail9; %end; %if &ch=1 %then %do; %let bvar=___ch___; %goto fail9; %end; %if &ln=1 %then %do; %let bvar=___ln___; %goto fail9; %end; %if &ob=1 %then %do; %let bvar=___ob___; %goto fail9; %end; %if &lo=1 %then %do; %let bvar=___lo___; %goto fail9; %end; %if &dq=1 %then %do; %let bvar=___dq___; %goto fail9; %end; %if &name=1 %then %do; %let bvar=_name_; %goto fail9; %end; %let cq=0; %if "&sortedby"^="" %then %do; data _null_; sortedby=trim(lowcase("&sortedby")); call symput("sortedby",sortedby); run; %let sorted=1; %end; data _conten; set _conten; name=upcase(name); ** make sure all variable names are upper case **; length __strvar 8; if (substr(name,1,3)='STR') then do; ** Look for variables named like "str14" which is an invalid variable name in Stata if it was in lower case. Leave them in upper case. 
**; __strvar=substr(name,4,length(name)); _error_=0; ** SAS creates _error_=15 if _strvar evaluates to ., so clear it. **; if (__strvar in(.,0)) then name=lowcase(name); end; ** Check for variable names that are invalid variable names in Stata if they were in lower case and leave them in upper case. *; else if name ~in('_ALL','_B','BYTE','_COEF','_CONS','DOUBLE','FLOAT','IF', 'IN','INT','LONG','_N','_PI','_PRED','_RC','_SE','_SKIP','USING','WITH') then name=lowcase(name); %if &bign=1 and &bign1=0 %then %do; if (name='_N') then do; put 'WARNING: Savastata has renamed Stata invalid variable _N to _______N * '; name='_______N'; end; %end; if index(label,compress("`"||' " ')) or index(label,compress('"'||" ' ")) then do; call symput("cq",1); ** Variable label contains a compound quote. **; %if &quotes=1 %then %do; ** replace the double quote portion of the compound quote with a space if user wants **; label=translate(label,compress(" '' "),compress(' `" ')); label=translate(label,compress(" '' "),compress(' "'' ')); %end; end; ** of if compound quote found in label **; if length(label)>80 then do; label=substr(label,1,80); put 'WARNING: Savastata has truncated the variable label for ' name ' to 80 characters. 
* '; end; length w d $5; w=formatl; d=formatd; format=upcase(format); orig_fmt=format; if (format='' & formatl>0) then format = compress('%'||w||'.'||d||'f',' '); else if (format='F') then format = compress('%'||w||'.'||d||'f',' '); else if (format='BEST') then format = compress('%'||formatl||'.0g'); else if (format='DATE' & formatl<9) then format = '%d'; else if (format='DATE' & formatl>=9) then format = '%dDlCY'; else if (format='DDMMYY' & formatl<10) then format = '%dD/N/Y'; else if (format='DDMMYY' & formatl>=10) then format = '%dD/N/CY'; else if (format='MMDDYY' & formatl<10) then format = '%dN/D/Y'; else if (format='MMDDYY' & formatl>=10) then format = '%dN/D/CY'; else if (format='YYMMDD' & formatl<8) then format = '%dYND'; else if (format='YYMMDD' & 8<=formatl<10) then format = '%dY-N-D'; else if (format='YYMMDD' & formatl>=10) then format = '%dCY-N-D'; else if (format='DAY') then format = '%dD'; else if (format='MONTH') then format = '%dl'; else if (format='YEAR' & formatl<4) then format = '%dY'; else if (format='YEAR' & formatl>=4) then format = '%dCY'; else if (format='MONNAME') then format = '%dM'; else if (format='MONYY' & formatl<7) then format = '%dlY'; else if (format='MONYY' & formatl>=7) then format = '%dlCY'; else if (format='WEEKDAY') then format = '%dd'; else if (format='WORDDATE') then format = '%d'; else if (format='WORDDATX') then format = '%d'; else if (format='YYMM') then format = '%d'; else if (format='YYMON' & formatl<7) then format = '%dYl'; else if (format='YYMON' & formatl>=7) then format = '%dCYl'; else format='default'; if type=2 then format='default'; ** make all string vars be default format **; run; %if &cq=1 and &quotes=0 %then %goto fail10; ** Variable label contains a compound quote **; ** initialize macro vars **; %let VAR_N = 0 ; * number of numeric variables *; %let VAR_C = 0 ; * number of character variables *; %if &sysver<7 %then %goto skip6; %if &use6=1 or (&use7=1 and &old=old) %then %do; ** check for varnames longer than 8 
characters **; %let long=0; ** and rename them **; data _conten; length longname $32; set _conten; longname = name; if (length(name)>8) then do; s_name=right(substr(name,1,4)); call symput("long",1); end; run; %if &long=1 %then %do; ** only do if there is at least one varname > 8 **; ** Check that variables have not been renamed to names that already exist in the dataset. **; proc sort data=_conten; by s_name name; run; data _conten; set _conten; by s_name; retain count start 0; if first.s_name then count=0; if (length(name)>8) then do; start=start+1; count=count+1; name=compress(s_name||count); if start=1 then do; ** print to log **; put ' '; put 'WARNING: Stata 6 does not allow variable names longer than 8 characters. * '; put ' '; put 'WARNING: List of long variable names that savastata has renamed: * '; put ' '; put ' Original long name' @35 ' New short name * '; put ' '; end; put ' ' longname @33 ' = ' @36 name ' * '; end; ** end of if (length(name)>8) do loop **; run; proc sort data=_conten; by name; run; %let fail11=0; ** initialize fail11 macro var **; data _null_; set _conten; by name; if not (first.name and last.name) then do; ** means there is a repeat in varnames **; call symput("fail11",1); end; run; %if &fail11=1 %then %goto fail11; %end; ** end of if long=1 the do loop **; %end; ** end of if use6=1 the do loop **; %skip6: ; ** Skip fixing stuff for Stata 6 if using SAS 6 **; %if "&sortedby"="" %then %do; ** only do this when sort order not passed into savastata **; %let sortedby=a; ** initialize macro var **; %if &sorted=1 %then %do; /**************************************************** The problem with this way of figuring out the sort order of the dataset is that the variables have not been checked for invalid Stata variable names. Using the _conten file is fairly quick since it is subsetted to just the sort vars. 
data _null_; dsid=open("work.&dset","i"); sortedby=attrc(dsid,"SORTEDBY"); call symput("sortedby",sortedby); rc=close(dsid); run; ****************************************************/ ** If data are sorted then get variable name(s). **; proc sort data=_conten (keep=name sortedby where=(sortedby>0)) out=_conten1; by sortedby; ** This puts sort vars in sort order. **; run; data _conten1; set _conten1 end=lastobs; length sdby $260; retain sdby ' '; if _n_=1 then sdby=trim(name); ** If more than one sort var then concatenate them together. **; if _n_>1 then sdby=trim(sdby)||trim(" ")||trim(name); if lastobs then call symput("sortedby",trim(sdby)); run; %end; ** end of if data are sorted **; %end; ** end of if sort order already passed to savastata **; proc sort data=_conten; by type; run; %let var_n=0; %let var_c=0; %let avars=0; ** Count up number of numeric and number of character variables. **; data _null_ ; set _conten end=lastobs; by type; if first.type then do; var_non=0; var_noc=0; end; if type=1 then do; * numeric vars **; var_non + 1 ; end; if type=2 then do; * character vars **; var_noc + 1 ; end; ** Create macro vars containing final number of vars. **; if last.type and type=1 then call symput( 'VAR_N', left( put( var_non , 5. ) ) ) ; if last.type and type=2 then do; call symput( 'VAR_C', left( put( var_noc, 5. ) ) ) ; end; ** AVARS is total number of variables **; if lastobs then call symput( 'AVARS', compress(left( put( _n_, 5. ) ) ) ) ; run ; %if &AVARS=0 %then %do ; %goto fail12 ; %end ; %if &use8=1 %then %do; ** Initialize macro vars **; %do i=1 %to &VAR_N; %let m&i=0; %end; %end; *** Figure out minimum safe storage type for each variable. 
***; %if &use8=0 %then %do; %let bytemin = -127; %let bytemax = 126; %let intmin = -32767; %let intmax = 32766; %let longmin = -2147483647; %let longmax = 2147483646; %end; ** of if use8=0 then do loop **; %if &use8=1 %then %do; ** Stata 8 has a smaller range due to storage of special missings **; %let bytemin = -127; %let bytemax = 100; %let intmin = -32767; %let intmax = 32740; %let longmin = -2147483647; %let longmax = 2147483620; %end; ** of if use8=1 then do loop **; ** Initialize macro vars **; %let nobs=0; %let dq_fail=0; %let sm=0; data _conten1; set work.&dset end=___lo___; format _all_; ** remove all formats and informats **; informat _all_; ** Count up observations since using _n_ to step through arrays. **; array ___ob___[1] _temporary_; if _n_=1 then ___ob___[1]=0; ___ob___[1]=___ob___[1]+1; %if &VAR_N>0 %then %do; * process numeric vars *; array ___nv___ [&VAR_N] _numeric_; ** all numeric variables in dataset **; array ___ln___ [&VAR_N] _temporary_; if _n_=1 then do; ** use the temporary variable _n_ to step through the arrays **; do _n_ = 1 to &VAR_N; ___ln___[_n_]=3; ** initialize temp vars to min numeric length *; end; _n_=1; ** return the value of _n_ back to 1 **; end; do _n_ = 1 to &VAR_N; %if &use8=1 %then %do; ** check to see if any numeric var has special missing values **; if .a<=___nv___[_n_]<=.z then call symput("sm",1); %end; if ___ln___[_n_] ne 8 and ___nv___[_n_] ne . 
then do ; if ___nv___[_n_] ne int(___nv___[_n_]) then ___ln___(_n_)=8; ** all decimal vars length 8 *; else /* check numeric variables that are integers */ if &BYTEMIN<=___nv___[_n_]<=&BYTEMAX then ___ln___(_n_)= max( ___ln___(_n_), 3 ) ; else if &INTMIN<=___nv___[_n_]<=&INTMAX then ___ln___(_n_)= max( ___ln___(_n_), 4 ) ; else if &LONGMIN<=___nv___[_n_]<=&LONGMAX then ___ln___(_n_)= max( ___ln___(_n_), 6 ) ; else ___ln___(_n_)=8; end ; end ; *** end of _n_=1 to &VAR_N ***; %end; ** end of processing numeric vars **; %if &VAR_C>0 %then %do; * now process the character variables *; array ___cv___( &VAR_C ) _character_ ; ** all character variables in dataset **; array ___ch___( &VAR_C ) _temporary_ ; array ___dq___(1) _temporary_ ; if _n_=1 then do; ** use the temporary variable _n_ to step through the arrays **; ___dq___[1]=0; ** initialize temp var to 0 *; do _n_ = 1 to &VAR_C; ___ch___[_n_]=1; ** initialize temp vars to min character length *; end; _n_=1; ** return the value of _n_ back to 1 **; end; * increase character length until the maximum needed *; do _n_ = 1 to &VAR_C ; ** check for double quotes in character variables **; if index(___cv___[_n_],compress(' " ')) then ___dq___[1]=1; if ___ch___[_n_] < length(___cv___[_n_]) then ___ch___[_n_]=length(___cv___[_n_]); end ; %end; * end of processing character vars *; if ___lo___ then do ; call symput("nobs",compress(___ob___[1])); %if &VAR_N>0 %then %do; do _n_ = 1 to &VAR_N ; ___nv___[_n_]=___ln___[_n_]; ** replace values of variables with their length **; end; %end; %if &VAR_C>0 %then %do; if ___dq___[1]=1 then call symput("dq_fail",1); do _n_ = 1 to &VAR_C ; ** this converts the character data to numeric data *; ___cv___[_n_]=___ch___[_n_]; ** replace values of variables with their length **; end; %end; output; end; run; %if %sysfunc(fileexist(&usagelog)) %then %do; data _null_; file &usagelog mod; put " Input SAS dataset has &nobs obs and &AVARS vars" ; %end; %if &nobs=0 %then %goto fail13; %if &dq_fail=1 
and &quotes=0 %then %goto fail14; %if &sm=1 and &use8=1 and (&save7=1 or &old=old) %then %do; %put WARNING: The dataset WORK.&dset contains special missing data that will be converted to missing (.) * ; %let sm=0; %end; ** Since _conten1 is one obs in the dataset transpose to create variable _name_. **; proc transpose data =_conten1 out=_conten1; var _all_; run; ** Put _conten dataset in the variable order of the original dataset. **; proc sort data=_conten; by varnum; run; ** Figure out the minimum required length for accurate storage of the col1 variable. **; data _conten1; set _conten1 end=___lo___; varnum=_n_; ** make the variable order be the order they are in dataset **; run; data _conten; merge _conten(keep=name varnum type label orig_fmt format) _conten1(keep=_name_ varnum col1); by varnum; run; data _conten; length c_len $10 stype $10 oformat $10; set _conten; c_len=compress(col1); n_len=input(c_len,8.); ** name has been fixed if too long or left upper case and _name_ is untouched **; if type=1 /*** and format="default" ***/ then do; ** numeric variables **; if c_len="3" then do; stype="byte"; oformat="best4."; end; else if c_len="4" then do; stype="int"; oformat="best6."; end; else if c_len="6" then do; stype="long"; oformat="best11."; end; else if c_len="8" and &float=1 then do; stype="float"; oformat="best18."; end; else if c_len="8" and &float=0 then do; stype="double"; oformat="best18."; end; end; /****************** if type=1 and format^="default" then do; ** numeric variables with formats **; stype="double"; oformat="best17."; end; ******************/ if type=2 then do; ** character variables **; if n_len>&maxstrvarlen. then do; c_len="&maxstrvarlen."; if &char2lab. = 0 then put "WARNING: Savastata has truncated the contents of variable " name " to &maxstrvarlen. characters. 
* "; end; period='.'; oformat=compress("$char"||c_len); oformat=compress(oformat||period); stype=compress("str"||c_len); stype=left(stype); end; ** end of if type=2 do loop **; run; ** Figure out the record length. **; data _null_; set _conten end=lastobs; retain bpos lepos 0; if type=1 then do; if stype="byte" then len=1; if stype="int" then len=2; if stype="long" then len=4; if stype="float" then len=4; if stype="double" then len=8; end; if type=2 then len=input(c_len,8.); bpos=lepos+1; lepos=bpos+len-1; if lastobs then do; call symput("reclen",lepos); end; run; %if &use8=1 %then %do; proc format; value ___mi___ .a=".a" .b=".b" .c=".c" .d=".d" .e=".e" .f=".f" .g=".g" .h=".h" .i=".i" .j=".j" .k=".k" .l=".l" .m=".m" .n=".n" .o=".o" .p=".p" .q=".q" .r=".r" .s=".s" .t=".t" .u=".u" .v=".v" .w=".w" .x=".x" .y=".y" .z=".z" ;;; run; %end; ** end of if use8=1 do loop **; ** Write a SAS program to output the data to an ascii file. **; data _null_; set _conten end=lastobs; ** Write to a SAS program to be inserted in this program later. **; file "&temp_dir._&tfns._ascii.sas"; retain lvar ivar 0; if _n_=1 then do; *** this fileref is used after success to delete this raw file **; put "filename ________ ""&temp_dir._&tfns._.raw""; "; ** write out to an ascii file **; put "data _null_; "; put " set work.&dset. end=___lo___; "; put " file ________ ls=2000; "; ** write out to an ascii file **; end; lvar=lvar+1; ivar=ivar+1; %if &dq_fail=1 and &quotes=1 %then %do; ** check for double quotes in character variables and replace with single quotes **; if type=2 then put " if index(" _name_ ",compress(' "" ')) then "; if type=2 then put _name_ "=translate(" _name_ ",compress("" ' ""),compress(' "" '));" ; %end; ** if dq_fail=1 and quotes=1 then do loop **; %if &use8=0 or &sm=0 %then %do; ** make special missings equal to missing **; if type=1 then put " if .<" _name_ "<=.z then " _name_ " =. 
; "; %end; ** if use8=0 or sm=0 then do loop **; %if &use8=1 and &sm=1 %then %do; ** make invalid special missings equal to missing **; ** Stata can only handle special missing between .a and .z, SAS also has " ._ " ***; if type=1 then put " if " _name_ "<.a then " _name_ " =. ; "; %end; ** if use8=1 and sm=1 then do loop **; if (lvar < 5 and ivar < &avars.) then do; ** put the variable with the output format and put a space after each variable **; %if &sm=1 %then %do; ** keep special missings special **; if type=1 then put 'if .a<=' _name_ '<=.z then put ' _name_ ' ___mi___. " " @;'; %end; ** if sm=1 then do loop **; if type=1 then put 'if ' _name_ '<.a or ' _name_ '>.z then put ' _name_ oformat ' " " @;'; /** would work if Stata could infile compound quotes: if type=2 then put ' put " `""" ' _name_ oformat ' """'' " " " @ ; '; ******/ if type=2 then put ' put " """ ' _name_ oformat ' """ " " " @ ; '; end; ** of if (lvar < 5 and ivar < avars.) then do loop **; else do; %if &sm=1 %then %do; ** keep special missings special **; if type=1 then put 'if .a<=' _name_ '<=.z then put ' _name_ ' ___mi___. ;'; %end; ** if sm=1 then do loop **; if type=1 then put 'if ' _name_ '<.a or ' _name_ '>.z then put ' _name_ oformat ' ;'; /** would work if Stata could infile compound quotes: if type=2 then put ' put " `""" ' _name_ oformat ' """'' " ; '; **/ if type=2 then put ' put " """ ' _name_ oformat ' """ " ; '; lvar=0; end; ** of if else (the 5th var) do loop **; put " "; if lastobs then do; put "run; "; end; run; %include "&temp_dir._&tfns._ascii.sas"; ** initialize maxvar **; %let maxvar =0; %let reclen=%eval(&reclen.); %if (&intrcool=intercooled or &use6=1 or &useSE=0) %then %do; %if &avars.>2047 %then %goto fail15; ** &avars. 
is number of all variables **; %if (&use6=1 or &save6=1) and &reclen>8192 %then %goto fail16; %if (&useSE=0 or &save7=1) and &reclen>24576 %then %goto fail16; %end; %else %do; ** using and saving SE **; ** The maximum width of a dataset in Stata SE is 12*maxvar. **; ** The maximum number of variables for Stata SE is 32,766. **; ** The maximum number of variables savastata can handle varies based on type of variables **; %let maxreclen = %eval(&avars * 12 ); ** The real max width for Stata is 393204 (=32767 * 12) not 393192 (= 32766 * 12) **; %if &maxreclen > 393204 %then %goto fail16; %if &reclen > 393204 %then %goto fail16; %if &avars > 32767 %then %goto fail24; ** Starting with SAS 9, SAS now allows more than 32,767 vars **; %if &avars < 32756 %then %do; ** allow for more vars (10 percent of the difference of total amount of vars you can have) **; %let maxvar = %eval(&avars. + %sysfunc(int((32766-&avars.) / 10))); %end; %else %if &avars. >= 32756 %then %do; %let maxvar = %eval(&avars. + 1); %end; %if &avars > 32767 %then %goto fail24; ** This should not happen, but check it again **; %if &maxvar<=5000 %then %let maxvar=5000; %if &maxvar>32767 %then %goto fail16; ** This really means the dataset is too wide. **; %end; %let SE=0; ** Initialize macro var **; %let flavor=0; ** Initialize macro var **; ** Write do-file to read in data. **; data _null_; set _conten end=lastobs; file "&temp_dir._&tfns._infile.do"; ** Write to a Stata do-file. **; if _n_=1 then do; /**************************************************************** # Figure out memory requirements. # Use a slightly modified version of the formula Stata suggests to figure # out how much memory is needed. 
# STATA_s formula: N*V*W + 4*N # M = -------------- # 1024 * 1024 # N = number observations # V = number of variables # W = average width in bytes of a variable # M = number of megabytes # savastata_s formula: ****************************************************************/ ** record length is number of variables times average variable width **; statamem= &nobs*(&reclen + (&reclen/4)) + (4*&nobs) ; ** "+ (&reclen/4)" adds about 25 percent for good measure **; %if %eval(&use8 + &use8se) >= 1 %then %do; put " if `c(memory)' < " statamem " { "; ** c(memory) is in bytes and statamem is also at this point **; %end; ** Convert STATAMEM bytes to megabytes **; statamem=int(statamem / 1024**2); ** Make sure memory is set to at least 20 megabytes **; if (statamem< 20) then statamem=20; mem=compress(statamem||"m"); put " set memory " mem; %if %eval(&use8 + &use8se) >= 1 %then %do; put " } "; ** end of if memory setting is less than needed **; %end; %if &maxvar>5000 %then %do; ** but only reset if not high enough (for usesas) **; put " if `c(max_k_current)' < &maxvar. set maxvar &maxvar. "; %end; put " #delimit ; "; put " infile "; end; ** end of if _n_=1 do loop **; put ' ' stype name ; if lastobs then do; put " using ""&temp_dir._&tfns._.raw"" "; put ';;;'; put ' '; put ' #delimit cr '; put " do ""&temp_dir._&tfns._labels.do"""; ** This will call the next do-file. **; put " do ""&temp_dir._&tfns._fix.do"""; ** This will call the next do-file. **; put " do ""&temp_dir._&tfns._formats.do"""; ** This will call the next do-file. **; %if &vlabels=1 %then %do; put " do ""&temp_dir._&tfns._dlabels.do"""; ** This will call the next do-file. **; put " do ""&temp_dir._&tfns._vlabels.do"""; ** This will call the next do-file. **; %end; put " "; %if &sorted=1 %then %do; put " sort &sortedby. "; %end; put " label data ""&crdate."" "; ******* save Stata dataset *********************************; *----------------------------------------------------------*; %if "&script." 
^= "usesas" %then %do; put 'if _caller()<8 { '; put ' quietly describe '; put ' local obs=`r(N)'' '; put ' local vars=`r(k)'' '; put " } "; put " else { "; put ' local obs=`c(N)'' '; put ' local vars=`c(k)'' '; put " } "; put " if `obs' == &nobs & `vars' == &AVARS { "; ** only save if successful **; ** not saving will still allow checking data **; ** add another back slash to directories that start with a back slash, as Stata * drops the first back slash for some reason. **; %if %index(%nrbquote(&out_dir),\)=1 %then %do; if &use8=0 then put " save ""\&out_dir.&fdset..dta"", &old &intrcool &replace "; else if &use8=1 then put " save&old ""\&out_dir.&fdset..dta"", &intrcool &replace "; %end; %else %do; if &use8=0 then put " save ""&out_dir.&fdset..dta"", &old &intrcool &replace "; else if &use8=1 then put " save&old ""&out_dir.&fdset..dta"", &intrcool &replace "; %end; put " } " ; ** end of if obs and vars match SAS dataset **; %if &ascii=0 %then %do; ** usesas should not reset linesize **; put " set linesize 100 "; %end; %end; ** of script ^= "usesas" do loop **; *** end save Stata dataset *********************************; *----------------------------------------------------------*; %if "&script." = "usesas" %then %do; put "capture program drop savastata_report "; put "program savastata_report "; put " if `c(N)' != &nobs | `c(k)' != &AVARS { "; put " noisily { "; put ' di _n "{txt}SAS reports that the input dataset has {res}&nobs. {txt}observations " /* '; put ' */ "and {res}&AVARS. {txt}variables " '; put ' di "{txt}but Stata reports that the dataset has {res}`c(N)'' {txt}observations " /* '; put ' */ "and {res}`c(k)'' {txt}variables " '; put ' di as err "{help usesas:usesas} was unable to read in your SAS data correctly." '; put ' di as err "Does your data contain non-roman characters?" '; put ' di as err "Savastata writes the data to ascii and non-roman characters mess that up." 
'; put ' di as err "If you want to check out the intermediary files generated " '; put ' di as err "by usesas in order to check out why usesas failed, " '; put ' di as err "try usesas again using the {text}messy {error}option. " '; put " } "; put " drop _all // clear data from memory "; put " } "; put ' else noi di _n "{txt}Stata reports that the dataset has {res}`c(N)'' {txt}observations " /* '; put ' */ "and {res}`c(k)'' {txt}variables. " '; put "end"; %end; ** of script = "usesas" do loop **; %if &ascii=0 %then %do; ** usesas also runs this only to check data **; %if "&script." ^= "usesas" %then %do; ** usesas already knows if it loaded data correctly or not **; put " capture program drop __save__ "; put " program define __save__, nclass "; put " args obs vars "; put " capture log close "; %if &check=1 %then %do; ** If a directory starts with a back slash, Stata likes to remove it. **; %if %index(%nrbquote(&out_dir),\)=1 %then %do; put "&quietly. log using ""\&out_dir.&fdset._STATAcheck.log"", replace "; %end; %else %do; put "&quietly. log using ""&out_dir.&fdset._STATAcheck.log"", replace "; %end; put " &noisily. display "" "" "; put " &noisily. display as res ""** Compare these results with the results provided by SAS **"" "; put " &noisily. display ""** in the file &out_dir.&fdset._SAScheck.lst. **"" "; put " &noisily. display "" "" "; put " &noisily. summarize "; put " &noisily. describe "; put " &noisily. 
list in 1/5"; put " capture log close "; %end; ** of if check=1 do loop **; put " ** If this file exists then Stata successfully saved &fdset..dta ** "; put " quietly log using ""&temp_dir._&tfns._done.log"" "; put " ** If this file exists then Stata successfully saved &fdset..dta ** "; put ' display "%macro _______v;" '; put ' display " %let SE=$S_StataSE ; " '; put ' display " %let flavor=$S_FLAVOR; " '; put ' display " %let version=%sysfunc(int(" _caller() ")); " '; put ' display " %if &SE=SE and &intrcool=intercooled %then %let flavor=Intercooled; " '; put ' display " %if &SE=SE and &intrcool^=intercooled and &save6=0 %then %let flavor=SE; " '; put ' display " %if &old=old %then %let version=%eval(&version.-1);" '; put ' display " options notes; " '; put " if `obs' != &nobs | `vars' != &AVARS { "; ** failed possibly due to non-roman characters **; put ' di "%put ERROR: Savastata was unable to save your SAS data correctly. *;" '; put ' di "%put Does your data contain non-roman characters? *;" '; put " di ""%nrstr(%put) SAS reports that the input dataset has &nobs. observations and &AVARS. variables. *; "" "; put ' di "%put but Stata reports that the dataset has `obs'' observations and `vars'' variables. *; " '; put ' di "%let fail23=1;" '; put ' } '; put ' else { '; /** success! **/ put ' di "%put NOTE: Savastata has successfully saved the *; " '; put ' di "%put Stata &version. &flavor data file &out_dir.&fdset..dta. *; " '; put ' di "%put Stata reports that the dataset has `obs'' observations and `vars'' variables. 
*; " '; put ' } '; put "capture which usagelog"; put " if _rc==0 {"; put " usagelog , type(savas) message(Output Stata dataset has `obs' obs and `vars' vars)"; put " }"; %if &check=1 %then %do; put ' di "%put *; " '; put ' di "%put You have requested to have savastata provide 2 check files: *; " '; put ' di "%put ""&out_dir.&fdset._SAScheck.lst"" and *; " '; put ' di "%put ""&out_dir.&fdset._STATAcheck.log"" *; " '; put ' di "%put *; " '; %end; ** if check=1 then do loop **; put ' display " %mend; " '; put ' display " %_______v; " '; put "quietly capture log close"; put "end "; ** end of defining program __save__ **; put "&noisily. __save__ `obs' `vars'"; %end; ** of script ^= "usesas" do loop **; %end; ** of if &ascii=0 do loop **; end; ** if last_obs do loop **; run; ** Write the Stata label do-file ***; data _null_; set _conten end=lastobs; file "&temp_dir._&tfns._labels.do"; ** start a new Stata do-file **; if _n_=1 then do; put " ** This do-file assigns variable labels ** "; end; if label^="" then put ' label var ' name ' `"' label '"'' '; run; ** Write the Stata fix do-file. ***; data _null_; set _conten end=lastobs; file "&temp_dir._&tfns._fix.do"; ** start a new Stata do-file **; if _n_=1 then do; ** If a directory starts with a back slash, Stata likes to remove it. **; put " ** This do-file replaces empty strings with null values. ** "; end; if type=2 then put ' replace ' name '="" if ltrim(' name ')=="" '; run; ** SAS defined format do-file ***; data _null_; set _conten end=lastobs; file "&temp_dir._&tfns._formats.do"; ** start a new Stata do-file **; if _n_=1 then do; put " ** This do-file assigns variables formats. ** "; end; if format^="default" then put ' format ' name format; if lastobs then do; put ' '; end; run; %if &vlabels=1 %then %do; ** If there are user defined formats try to save as many as possible. 
**; %if &workfmts=1 %then %do; proc format library=work cntlout=_conten1(keep=type fmtname start end label); run; proc sort data=_conten1(rename=(fmtname=format label=fmtlabel)); by format; ** Stata can only handle formats of integers. **; %if &sm=0 %then %do; where not index(start,".") and not index(end,".") and start^='**OTHER**' and type="N"; ** Stata can only handle formats of numeric vars. **; run; %end; ** if sm=0 then do loop **; %if &sm=1 %then %do; where type="N"; ** Stata can only handle formats of numeric vars. **; run; data _conten1; set _conten1; fine=0; ** If both start and end are special missing that is okay, ranges now okay. **; if (compress(start)>=".A" and compress(start)<=".Z") and (compress(end)>=".A" and compress(end)<=".Z") then do; start=lowcase(start); end=lowcase(end); fine=1; end; else if not index(start,".") and not index(end,".") then fine=1; if fine=1; run; %end; ** if sm=1 then do loop **; %end; ** if workfmts=1 then do loop **; %if &librfmts=1 %then %do; proc format library=library cntlout=_conten2(keep=type fmtname start end label); run; proc sort data=_conten2(rename=(fmtname=format label=fmtlabel)); by format; ** Stata can only handle formats of integers. **; %if &sm=0 %then %do; where not index(start,".") and not index(end,".") and start^='**OTHER**' and type="N"; ** Stata can only handle formats of numeric vars. **; run; %end; ** if sm=0 then do loop **; %if &sm=1 %then %do; where type="N"; ** Stata can only handle formats of numeric vars. **; run; data _conten2; set _conten2; fine=0; ** If both start and end are special missing that is okay, ranges now okay. 
**; if (compress(start)>=".A" and compress(start)<=".Z") and (compress(end)>=".A" and compress(end)<=".Z") then do; start=lowcase(start); end=lowcase(end); fine=1; end; else if not index(start,".") and not index(end,".") then fine=1; if fine=1; run; %end; ** if sm=1 then do loop **; %end; ** if librfmts=1 then do loop **; %if &workfmts=1 and &librfmts=1 %then %do; ** If there are both work formats and library formats, choose work over library. **; data _conten3; merge _conten1(in=a keep=format) _conten2(in=b keep=format); by format; if a and b; ** Find ones in common so that formats in library.formats can be excluded. **; run; data _conten3; set _conten3; by format; if first.format; ** Reduce to just one obs per format. **; run; data _conten2; merge _conten3(in=a keep=format) _conten2(in=b); by format; if not a; ** Exclude formats in library.formats that are also in work.formats. **; run; ** Concatenate the two sets of formats. **; proc sort data=_conten1; by format start; run; proc sort data=_conten2; by format start; run; data _conten1; set _conten1 _conten2; by format start; run; %end; ** if workfmts=1 and librfmts=1 then do loop **; %if &workfmts=0 and &librfmts=1 %then %do; data _conten1; set _conten2; run; %end; proc sort data=_conten(keep=type varnum format name orig_fmt where=(format="default")) out=_conten(drop=format rename=(orig_fmt=format)); by orig_fmt; run; data _conten3; merge _conten(in=a keep=format) _conten1(in=b keep=format); by format; if a and b; ** Only keep the formats used in the dataset. **; run; data _conten3; set _conten3; by format; if first.format; ** Reduce to one obs per format. **; run; data _conten1; merge _conten3(in=a keep=format) _conten1(in=b); by format; if a and b; ** Only keep the formats used in the dataset. **; run; ** Make an attempt to save some of the other formats. **; ** first check to see if any will be truncated **; %let labtrunc = 0; data _null_; set _conten1; ** Stata can only handle value labels up to &maxvallablen.
characters. **; if length(fmtlabel) > &maxvallablen. then call symput('labtrunc',1); run; %if &labtrunc=1 %then %do; %put WARNING: Savastata truncated at least one format label because it contained more than &maxvallablen. characters. *; %end; data _null_; length fmtlabel $&maxvallablen; ** To allow for quotes to be attached later. **; set _conten1 end=lastobs; by format; file "&temp_dir._&tfns._dlabels.do" ls=32200; ** start a new Stata do-file **; if _n_=1 then do; put " ** This do-file defines value labels. ** "; end; ** nstart has to be numeric and Stata can only handle up to 11 digits **; nstart=input(start,best12.); nend=input(end,best12.); format=lowcase(format); if index(fmtlabel,compress("`"||' " ')) or index(fmtlabel,compress('"'||" ' ")) then do; call symput("cq",1); ** Format contains a compound quote. **; %if &quotes=1 %then %do; ** replace the double quote portion of the compound quote with a space if user wants **; fmtlabel=translate(fmtlabel,compress(" '' "),compress(' `" ')); fmtlabel=translate(fmtlabel,compress(" '' "),compress(' "'' ')); %end; end; ** of if compound quote found in fmtlabel **; ** One format can be assigned to many variables. **; /* new way of defining each value and adding if more than 1 * because long value labels can make Stata unable to find end quote.
* the added bonus is that a semicolon in a label will not throw off Stata *****/ if start = end then do; if first.format then put " label define " format start " `""" fmtlabel """' " ; else put " label define " format start " `""" fmtlabel """' , add " ; end; else if ( nstart > .z and nend > .z ) then do i=nstart to nend; if first.format and i=nstart then do; put " label define " format i " `""" fmtlabel """' " ; end; else do; put " label define " format i " `""" fmtlabel """' , add " ; end; end; else if (compress(start)>=".a" and compress(start)<=".z") and (compress(end)>=".a" and compress(end)<=".z") then do; cstart = substr(lowcase(compress(start)),2,1); ** Stata does not allow special missing ._ **; nstart = index("abcdefghijklmnopqrstuvwxyz",compress(cstart)); cend = substr(lowcase(compress(end)),2,1); nend = index("abcdefghijklmnopqrstuvwxyz",compress(cend)); do i = nstart to nend; _sm_ = compress("." || substr("abcdefghijklmnopqrstuvwxyz",i,1)); if first.format and i=nstart then do; put " label define " format _sm_ " `""" fmtlabel """' " ; end; else do; put " label define " format _sm_ " `""" fmtlabel """' , add " ; end; end; end; run; %if &cq=1 and &quotes=0 %then %goto fail17; ** at least one format contains a compound quote. **; data _conten; merge _conten(in=a keep=type varnum name format) _conten3(in=b keep=format); by format; if a and b; ** Keep only the variables assigned to these formats. **; run; data _null_; set _conten (where=(type=1)) end=lastobs; ** can only format numeric variables **; file "&temp_dir._&tfns._vlabels.do"; ** start a new Stata do-file **; if _n_=1 then do; put " ** This do-file assigns value labels. ** "; end; format=lowcase(format); put 'label value ' name format; run; %end; ** of if vlabels=1 then do loop **; %if "&script."
^= "usesas" %then %do; %let u_sysrc=0; ** initialize macro var **; %let w_sysrc=0; ** initialize macro var **; %if &ascii=0 %then %do; %if &unix=1 %then %do; ** This submits the Stata do-file that reads in the ascii dataset that becomes the Stata dataset. ***; ** sysexec requires no quotes even when changing to dirs with spaces in windows or unix **; %sysexec cd %nrbquote(&work_dir) ; ** change to the drive and directory where the Stata do files are. **; %if %index(%nrbquote(&temp_dir.),%str( )) %then %do; /* dirname has at least one space in it, likely only a problem when using messy because * not likely a workdir will have spaces in *nix environ. * problem is that SAS/*nix(?) does not maintain the double quotes around dir and filename. * stata -b do "/some dumb name/_28492_version.do" * ends up being: * stata -b do /some dumb name/_28492_version.do * which creates some.log but since savastata has cd-ed to /some dumb name/ some.log * is in that directory. The solution of putting single quotes around the double quotes * seems to work but will create a file firstwordofdirnamewithaspace.log which seems okay * since usesas runs the stata programs itself and so does savas and SAS deletes workdirs * and if user runs in messy they'll figure it out. * if user does not have a profile.do command that always changes them to their fav directory * then savastata should work just fine. **/ %if &uspace.=0 and &messy.=1 %then %do; ** only warn if using messy option **; %put You have at least one space in your directory: *; %put "&temp_dir." * ; %put Spaces in directory names cause problems for savastata. *; %put The file: *; %put "&temp_dir.firstwordindirnamewithaspace.log" *; %put will be created thanks to that pesky space. *; %put Consider_naming_directories_without_using_spaces. *; %end; %sysexec %nrbquote(&ustata) -b do %str(%')"&temp_dir._&tfns._infile.do"%str(%'); ** Run Stata in batch. 
**; %end; /** end of if dir has space in name **/ %else %do; %sysexec %nrbquote(&ustata) -b do "&temp_dir._&tfns._infile.do"; ** Run Stata in batch. **; %end; %let u_sysrc=&sysrc; ** store value of sysrc until after if ascii=0 do loop **; %end; /** if unix=1 then do loop **/ %if &unix=0 %then %do; %sysexec &drive.; ** sysexec requires no quotes even when changing to dirs with spaces in windows or unix **; %sysexec cd %nrbquote(&work_dir) ; %sysexec "&wstata" /e do "&temp_dir._&tfns._infile.do"; ** Run Stata in batch. **; %let w_sysrc=&sysrc; ** store value of sysrc until after if ascii=0 do loop **; %end; ** if unix=0 then do loop **; %end; ** if ascii=0 then do loop. Do not run Stata if only want ascii dataset **; %if &u_sysrc^=0 %then %goto fail21; %if &w_sysrc^=0 %then %goto fail20; %end; /** of script ^= "usesas" do loop **/ %let fail23=0; %if (&sysver<7 and &ascii=0) or (&ascii=0 and %sysfunc(fileexist("&temp_dir._&tfns._done.log"))) %then %do; %include"&temp_dir._&tfns._done.log"; ** Run report program written by Stata. **; %if &fail23=1 %then %goto fail23; %let success=1; %end; %if &ascii=1 and %sysfunc(fileexist("&out_dir._&tfns._.raw")) %then %do; %let success=1; %end; %if &success=0 and "&nosave"="" %then %goto fail19; ********* delete intermediary files when successful *********************; options nonotes; %if &success=1 and &ascii=0 and &messy=0 %then %do; ** at least delete the biggest file which is the ascii data file **; data _null_; fname="________"; rc =0; if rc = 0 and fexist(fname) then rc=fdelete(fname); run; filename ________ clear; %end; ** if ascii=0 do loop **; options notes; ********* (end of) delete intermediary files when successful **************; %goto okay; ** The following are all the failure messages returned to the user when the macro is unable to process the input SAS dataset. **; %fail1: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The input dataset &dset. has to be in the work library. 
* ; %put For more help check here: &http ; ** SAS 8 will put a note at the end of the log stating on what page errors occurred but earlier versions do not. So insert bad code to generate an error. **; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=1; %goto okay; %fail2: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put You can only choose to use one version of Stata. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=2; %goto okay; %fail3: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The dataset cannot be named &fdset. * ; %put Savastata uses this dataset name. * ; %put Consider renaming the dataset. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=3; %goto okay; %fail4: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put You did not tell savastata what directory you want to save the Stata dataset. * ; %put Your call to savastata should look something like this: * ; %put %nrstr(%%)savastata(c:\mydata\,-replace) * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=4; %goto okay; %fail5: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The directory %nrbquote(&out_dir.) does not exist. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=5; %goto okay; %fail6: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put You can only choose to save one type of dataset. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=6; %goto okay; %fail7: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The dataset %nrbquote(&out_dir.)&fdset..dta already exists.
* ; %put If you want to overwrite this file, then use the savastata option "-replace". * ; %put Like: %nrstr(%%)savastata(%nrbquote(&out_dir.),-replace &options)%nrstr(;) ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=7; %goto okay; %fail8: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put Your dataset &dset contains a variable named _N which is not a valid * ; %put variable name in Stata. Consider renaming it to a valid Stata * ; %put variable name or dropping the variable from the SAS dataset. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put ; %let fail=8; %goto okay; %fail9: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The input dataset &dset cannot contain a variable named &bvar * ; %put because savastata uses this name. * ; %put Either rename the variable or drop it. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=9; %goto okay; %fail10: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put At least one variable label in input dataset &dset contains a compound quote ( %nrquote(%str(`%") or %str(%")%nrstr(%')) ). * ; %put This is not allowed by savastata. * ; %put Consider replacing any compound quotes with single quotes ( %nrquote(%str(%')) ) using * ; %put the savastata option "-quotes". * ; %put Like: %nrstr(%%)savastata(%nrbquote(&out_dir.),-quotes &options)%nrstr(;) ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=10; %goto okay; %fail11: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put Savastata tried to rename the variable names that are 9 or more characters * ; %put to valid Stata 6 variable names (8 or less characters). Savastata was * ; %put unable to come up with unique variable names.
* ; %put Rename the long variable names to 8 characters or less and try savastata again. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=11; %goto okay; %fail12: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The input dataset &dset has no variables. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=12; %goto okay; %fail13: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The input dataset &dset. has no observations. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=13; %goto okay; %fail14: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put At least one character variable in the input dataset &dset contains a double quote ( %nrquote(%str(%")) ). * ; %put This is not allowed by savastata. * ; %put Consider replacing any double quotes with single quotes ( %nrquote(%str(%')) ) by using * ; %put the savastata option "-quotes". 
* ; %put Like: %nrstr(%%)savastata(%nrbquote(&out_dir.),-quotes &options)%nrstr(;) ; %put The following is a list of all character variables in your dataset that * ; %put contain double quotes: * ; data _conten1; set &dset end=___lo___; keep _character_; ** keep only character variables **; array ___ch___ ( &VAR_C ) _temporary_; array ___cv___( &VAR_C ) _character_ ; ** all character variables in dataset **; do _n_=1 to &VAR_C; if index(___cv___[_n_],compress(' " ')) then ___ch___[_n_]=1; end; if ___lo___; do _n_=1 to &VAR_C; ___cv___[_n_]=___ch___[_n_]; end; run; proc transpose data =_conten1 out=_conten1; var _all_; run; data _null_; set _conten1; if compress(col1)="1"; _name_=lowcase(_name_); put " " _name_ " * "; run; %put *; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=14; %goto okay; %fail15: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put Only Stata SE can handle more than 2047 variables. * ; %put You have &avars. variables in dataset &dset. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=15; %goto okay; %fail16: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The input dataset &dset exceeds the width limit of the version of Stata you * ; %put are using or saving to. * ; %if &float=0 %then %do; %put Consider using the savastata option "-float" to save space. The "-float" option * ; %put saves numeric variables that contain decimal values as float instead of the default * ; %put of double. Float is %nrquote(Stata%str(%'s)) default storage type for numeric variables. * ; %end; %else %if &float=1 %then %do; %put Consider dropping some variables. * ; %end; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=16; %goto okay; %fail17: %put * ; %put ERROR: Savastata did not save your dataset. 
* ; %put At least one of the formats for a numeric variable contains a compound quote ( %nrquote(%str(`%") or %str(%")%nrstr(%')) ). * ; %put This is not allowed by savastata. * ; %put Consider replacing any compound quotes with single quotes ( %nrquote(%str(%')) ) using * ; %put the savastata option "-quotes". * ; %put Like: %nrstr(%%)savastata(%nrbquote(&out_dir.),-quotes &options)%nrstr(;) ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=17; %goto okay; %fail18: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The work directory cannot start with a back slash ( \ ) because * ; %put savastata needs to cd to it. * ; %put Is it possible to reassign your work directory to a drive like C: or D: ? * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=18; %goto okay; %fail19: %put * ; %put ERROR: Savastata did not save your dataset. * ; %if &tfns^=&sysjobid.&sysindex %then %do; %put Run savastata again using the "messy" option * ; %put Like: %nrstr(%%)savastata(%nrbquote(&out_dir.),-messy &options)%nrstr(;) ; %put and then check the files "&out_dir._SomeNumber_infile.log" and * ; %put "&out_dir._SomeNumber_con.log" for errors. * ; %end; %else %do; %put Run savastata again using the "messy" option * ; %put Like: %nrstr(%%)savastata(%nrbquote(&out_dir.),-messy &options)%nrstr(;) ; %put and then check the file "&out_dir._SomeNumber_infile.log" for errors. * ; %end; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=19; %goto okay; %fail20: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put Savastata was not able to get Stata to execute.
* ; %put This is not where your stata executable file is located: * ; %put &wstata * ; %put There are instructions in the top section of the savastata.mac file * ; %put that explain how to edit the savastata.mac file to set the * ; %put macro variable wstata to the correct location of your stata executable file. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=20; %goto okay; %fail21: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put Savastata was not able to get Stata to execute. * ; %put This is not how to call Stata: * ; %put &ustata *; %put There are instructions in the top section of the savastata.mac file * ; %put that explain how to edit the savastata.mac file to set the * ; %put macro variable ustata to the correct way to call Stata. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=21; %goto okay; %fail22: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put For reasons unknown savastata was not able to get Stata to execute. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=22; %goto okay; %fail23: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=23; %goto okay; %fail24: %put * ; %put ERROR: Savastata did not save your dataset. * ; %put The input dataset &dset has &avars. variables which exceeds the 32,767 limit of Stata SE * ; %put Consider dropping at least %eval(&avars - 32766) variables. * ; %put For more help check here: &http ; %if &sysver<8 %then savastata did not save your dataset %put * ; %let fail=24; %goto okay; %okay: %put ; %if &success=0 and "&nosave"="" %then %goto done; %if &fail > 0 and "&script." 
= "usesas" %then %do; ** erase _infile.do so that usesas will not attempt to run it **; %if %sysfunc(fileexist(%nrbquote("&temp_dir._&tfns._infile.do"))) %then %do; data _null_; fname="tempfile"; rc=filename(fname,"&temp_dir._&tfns._infile.do"); if rc = 0 and fexist(fname) then rc=fdelete(fname); run; %end; %end; %if &ascii=1 %then %do; options notes; %put * ; %put NOTE: Savastata has successfully created the ascii data file %nrbquote(&out_dir.)_&tfns._.raw. * ; %put * ; %end; options nonotes; ** Make sure notes are shut off while deleting temp files and figuring out run time. **; ****** clean up ***************; %if "&script." ^= "usesas" %then %do; proc datasets library=work nodetails nolist nowarn; ** if a dataset does not exist, SAS does not give error message * so delete all of them. **; delete _conten _conten1 _conten2 _conten3; run; %end; ** if script ^= usesas **; ** Figure out how much time savastata took to process the dataset. **; ** initialize macro vars **; data _null_; call symput("diffhour",compress(hour(datetime()-&startdat))); call symput("diffmin",compress(minute(datetime()-&startdat))); call symput("diffsec",compress(second(datetime()-&startdat))); run; options notes; %if &diffhour>0 %then %put NOTE: Savastata took about &diffhour hours and &diffmin minutes to run. * ; %else %if &diffmin=0 %then %put NOTE: Savastata took less than a minute to run. * ; %else %put NOTE: Savastata took about &diffmin mins to run. * ; %if &sysindex=10 and &check=1 %then %do; %put ; %put Michael says, %str(%"Checking data is cool!%") ; %put ; %end; %done: ; options nonotes; %if %sysfunc(fileexist(&usagelog)) %then %do; data _null_; difftime=compress("&diffhour"||":"||"&diffmin"||":"||round(&diffsec,0.1)); file &usagelog mod; put " SAS dataset name: &dset pwd=&pwdir "; put " Elapsed time for savastata macro is " difftime; put " fail=&fail "; run; %end; options obs=&obs. &notes.; ** Restore options. **; ** Make sure that the last dataset created in work is the users dataset.
**; %let sysdsn=&s_dsn; %let syslast=&s_last; %if &unix=0 %then %do; %sysexec &pwdrive ; ** change to the drive that SAS started out in. **; %end; ** end of if unix=0 do loop **; ** sysexec requires no quotes even when changing to dirs with spaces in windows or unix **; %sysexec cd &pwdir ; ** change to the directory that SAS started out in. **; %nevrmind: ; ** Go to here if an error occurred before savastata started **; %MEND savastata;
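/**************************************************************
 * Illustrative sketch only (not part of the original macro).
 * Everything below stays inside this comment so that nothing
 * runs when this file is included. The dataset name, library
 * name and numbers are hypothetical.
 *
 * 1. Calling the macro. Savastata saves the most recently
 *    created dataset in the work library, so a call might
 *    look like this, using the call form printed by the
 *    fail4 message:
 *
 *      data work.mydata;
 *       set mylib.mydata;
 *      run;
 *      %savastata(c:\mydata\,-replace);
 *
 * 2. The memory setting written to the infile do-file.
 *    Assuming nobs=100000 and reclen=200 bytes:
 *
 *      statamem = 100000*(200 + 200/4) + 4*100000
 *               = 25,400,000 bytes
 *      statamem = int(25400000 / 1024**2) = 24
 *
 *    24 is not below the 20 megabyte floor, so the do-file
 *    would contain:  set memory 24m
 *    With nobs=1000 and the same reclen the result is 0,
 *    so the floor applies:  set memory 20m
 **************************************************************/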