.-
help for ^stak^                                                (Version 1.2.1)
.-

Simple data stacker
-------------------

    ^stak^ varlist [^if^ exp] [^in^ range]^,^ [ ^i^nto^(^newvar^)^ ^g^id^(^newvar^)^
                           ^l^abels ^r^etain { ^w^ide | ^d^ummy } ^clear^ ]


Description
-----------

^stak^ is a painless program to do simple ^reshape^s or ^stack^s.  It vertically
stacks the variables in varlist into a single new variable and has options for
retaining the varlist and other variables in the dataset.  Use ^stack^ if you
need more than a single stacked variable or ^reshape^ for more complicated
restructuring.
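
For comparison, a minimal sketch of the ^stack^ alternative, assuming four
numeric variables a, b, c and d (hypothetical names) are in memory.  Official
^stack^ would put a followed by c into x, and b followed by d into y, and
record the group number in ^_stack^:

 ^. stack a b  c d, into(x y) clear^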

Variables to be stacked must be either all numeric or all string.  Variables 
may be repeated in varlist.

Consider the following data:

            a         b         c 
  1.        1         4         7  
  2.        2         5         8  
  3.        3         6         9  

^stak a b c^ creates a new dataset containing:

        _into      _gid 
  1.        1         a  
  2.        2         a  
  3.        3         a  
  4.        4         b  
  5.        5         b  
  6.        6         b  
  7.        7         c  
  8.        8         c  
  9.        9         c  

The new dataset has 2 variables (^_into^ and ^_gid^) with k*n observations (k is
the number of variables in varlist and n equals _N from the old dataset).
The first n observations of ^_into^ are the data from variable a, the second n
observations are the data from variable b, and the third n observations are 
the data from variable c.

Variable ^_gid^ identifies the groups.  ^_gid^ is a labeled numeric variable that
incrementally numbers the groups starting from 1.  The value labels are the 
names of the stacked variables.
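
To see the numeric codes behind the labels, the standard ^nolabel^ option of
^list^ and the ^label list^ command can be used.  A sketch, assuming the example
data above and the default names ^_into^ and ^_gid^ (so that the value label is
also named ^_gid^):

 ^. stak a b c, clear^
 ^. list, nolabel^
 ^. label list _gid^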


Options
-------

^into(^newvar^)^ specifies the name of the stacked variable to be created.  The
default name is ^_into^.  The name cannot be the same as any varname in varlist;
when ^retain^ is specified, it cannot be the same as any varname in the dataset.

^gid(^newvar^)^ specifies the name of the group id variable to be created.  The
default name is ^_gid^.  The name cannot be the same as any varname in varlist;
when ^retain^ is specified, it cannot be the same as any varname in the dataset.

^labels^ specifies that the variable labels of the varlist variables are to be 
used as value labels of the ^gid()^ variable.  The default is to use the 
variable names of the varlist variables.

^retain^ includes a stacked copy of each variable not specified in varlist.  
A "stacked copy" is k stacked replicates of the original data.  

^wide^ includes a wide copy of each variable in varlist.  A "wide copy" of the
jth variable in varlist replicates the original data of that variable when the
^gid()^ variable has value j; the wide copy has missing values otherwise.  ^wide^
and ^dummy^ are alternatives; you can specify one but not both.

^dummy^ includes a dummy (or indicator) variable for each variable in varlist.
The dummy for the jth variable in varlist has value 1 when the ^gid()^ variable
has value j; it has value 0 otherwise.  Dummies are returned in the original
variable names.  ^dummy^ and ^wide^ are alternatives; you can specify one but not
both.

^clear^ indicates your understanding that the data in memory will be lost; if 
this option is not specified, you will be asked to confirm your intentions.


Comments
--------

The new, stacked data file will be unnamed and the changed flag will be set.
The data label will be a modified version of the data label (if any) of the
original data. The modified version has "stak: " added as a prefix.

The treatment of existing value labels, variable labels, and formats depends
on the type of variable:

 ^into()^ variable: none are transferred.
  ^gid()^ variable: none are transferred.
  ^wide^ variables: all are transferred.
^retain^ variables: all are transferred.
 ^dummy^ variables: all are modified... 
           - existing value labels are dropped;
           - the variable label has " dummy" appended; 
           - the format is set to %8.0g.

By default, the name of the value label assigned to the ^gid()^ variable is the
same as the name of the ^gid()^ variable itself.  If that value label name is in
use by another variable (and that variable will be transferred to the new data
file), then the value label name ^_gid^ is used instead.  If a value label named
^_gid^ is already in use, it will be deleted, without notice, and reused for the
^gid()^ variable.

The storage type for the ^into()^ variable is automatically set to that of the
"largest" type among the varlist variables so no precision is lost.

All variables in the new dataset are ^compress^ed.  Although storage types of 
existing variables may change, precision will be maintained.
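
One way to check the resulting storage types is the standard ^describe^ command.
A sketch, assuming a is stored as an int and b as a float (hypothetical types):

 ^. stak a b, clear^
 ^. describe^

Under the rules above, ^_into^ would be expected to start as a float; ^compress^
may then store it in a smaller type only if no precision is lost.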

Characteristics:  the _dta[] characteristics are retained.  Variable-specific
characteristics are retained when the variable is in the new dataset.  ^dummy^
variables inherit the characteristics of their "parent" variables.

No attempt is made to determine whether the current ^set memory^ allocation is
large enough to contain the new dataset.  Nonetheless, the data are preserved
and will be restored if a memory shortage occurs.  The required memory
allocation will depend on the number of variables to be stacked and whether
the ^retain^, {^wide^|^dummy^}, and ^labels^ options are specified.  The "worst
case" scenario, for k stacked variables and options ^retain^, ^wide^ and ^labels^
specified, will be a new dataset of size slightly larger than k times the size
of the original dataset.
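
A rough worked case, assuming a hypothetical dataset of 10,000 observations and
three stacked variables (k = 3): with ^retain^ and ^wide^ the new dataset has
30,000 observations, so a little more than three times the original size is a
reasonable allocation.  Because ^set memory^ cannot be changed while data are in
memory, one possible sequence (the file name and the 8m figure are illustrative
only) is:

 ^. save mydata^
 ^. clear^
 ^. set memory 8m^
 ^. use mydata^
 ^. stak a b c, retain wide clear^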


Examples
--------

Given data:

            a         b         c
  1.        1         4         7
  2.        2         5         8
  3.        3         6         9


 ^. stak a b, into(value) gid(group) retain wide clear^

 ^. list^

        value     group         a         b         c
  1.        1         a         1         .         7
  2.        2         a         2         .         8
  3.        3         a         3         .         9
  4.        4         b         .         4         7
  5.        5         b         .         5         8
  6.        6         b         .         6         9

Variable value contains the stacked values of variables a and b.  The varname 
"value" was specified in the ^into()^ option.

Variable group is a labeled numeric variable identifying the stacked groups 
of data. The varname "group" was specified in the ^gid()^ option.

Variables a and b result from the ^wide^ option.  

Variable c results from the ^retain^ option.

Option ^clear^ suppressed the request to confirm that the existing dataset will
be cleared.


Alternatively, we could have specified:

 ^. stak a b, i(value) g(group) r dummy labels clear^

 ^. list^

        value     group         a         b         c
  1.        1   'a' var         1         0         7
  2.        2   'a' var         1         0         8
  3.        3   'a' var         1         0         9
  4.        4   'b' var         0         1         7
  5.        5   'b' var         0         1         8
  6.        6   'b' var         0         1         9

There are two differences from the first example.
  
Specifying ^labels^ caused the value labels of group to be the variable labels
of the stacked variables (rather than the variable names).

Specifying ^dummy^ (rather than ^wide^) caused variables a and b to be dummy
(indicator) variables.


Author
------

  Thomas J. Steichen, RJRT, steicht@@rjrt.com


Acknowledgments
---------------

  Nicholas J. Cox made helpful comments (and was the original author  
                  of some code and code ideas that I willfully pilfered).


Also see
--------

 Manual:  ^[R] stack^
On-line:  help for @contract@, @reshape@, @xpose@