{smcl}
{hline}
help for {cmd:linuxlsd1}{right:(Roger Newson)}
{hline}

{title:Create a dataset of file records from the output of a Linux {cmd:ls -d1} command}

{p 8 21 2}
{cmd:linuxlsd1} [ , {cmdab:fs:pec}{cmd:(}{it:file_spec}{cmd:)} {cmdab:lso:ptions}{cmd:(}{it:ls_options}{cmd:)}
{opth fi:nfo(newvarname)}
{opt clear} 
]

{p 8 21 2}
{cmd:llinuxlsd1} [ , {cmdab:fs:pec}{cmd:(}{it:file_spec}{cmd:)}{cmd:)} {cmdab:lso:ptions}{cmd:(}{it:ls_options}{cmd:)}
{opth fi:nfo(name)}
]

{pstd}
where {it:file_spec} is a Linux or Mac OS X file list specification
and {it:ls_options} is a list of extra options for the Linux {cmd:ls} command.
For more about these specifications and options, see the Linux online help under {cmd:ls --help}.


{title:Description}

{pstd}
{cmd:linuxlsd1} inputs a file list specification recognised by Linux, Unix or Mac OS X,
and generates in the memory a new Stata dataset, with 1 observation per file in the list,
and data on information about the file,
including the file name and possibly other file-specific information.
This new Stata dataset can then be used for mass-processing of the specified files.
{cmd:llinuxlsd1} inputs a file list specification recognised by Linux or Mac OS X,
and generates a {help macro:local macro} containing a list,
with 1 list item per file in the list,
containing information about the file, including the file name and possibly other items.
{cmd:linuxlsd1} and {cmd:llinuxlsd1} work by using the Linux {cmd:ls} command,
with the options {cmd:-d1} to specify that directories in the file list will be listed as files (and not expanded),
and that the output will contain 1 line per file.
They are designed to work only under a Linux, Unix or Mac OS X operating environment.


{title:Options for {cmd:linuxlsd1} and {cmd:llinuxlsd1}}

{phang}
{cmd:fspec(}{it:file_spec}{cmd:)} specifies a list of files,
in the language of Linux or Mac OS X.
If not specified, then it is set to {cmd:.},
specifying the local directory under Linux or Mac OS X.
These file specifications may contain wildcards such as {cmd:*}.
For instance, {cmd:filespec("*.txt")} specifies a list of all files in the current directory with the extension {cmd:.txt}.
It is usually a good idea to enclose the {cmd:fspec()} option in quotes,
because Linux file specifications often contain forward slashes ({cmd:/}),
which Stata uses for an escape character, at least in {help do:do-files}.

{phang}
{cmd:lsoptions(}{it:ls_options}{cmd:)} specifies a list of options acceptable to the {cmd:ls} command under Linux.
For instance, {cmd:lsoptions(-t)} specifies that the files will be sorted by the time when they were created,
and {cmd:lsoptions(l)} specifies that the output variable or macro will contain list items for each file,
instead of just the file name.

{title:Options for {cmd:linuxlsd1} only}

{phang}
{opth finfo(newvarname)} specifies the name for a variable to be generated in the output dataset,
containing the file information output by the Linux {cmd:ls -d1} command,
with the files specified by the {cmd:fspec()} option.
If {cmd:finfo()} is unset, then the variable will have the name {cmd:finfo}.

{phang}
{cmd:clear} specifies that the output dataset will overwrite any existing dataset that may be present in the memory.
If {cmd:clear} is not specified,
then {cmd:linuxlsd1} will refuse to create an output dataset if data are already in memory.
This convention protects the user from deleting important data.


{title:Options for {cmd:llinuxlsd1} only}

{phang}
{opth finfo(name)} specifies the name for a {help macro:local macro} to be generated,
containing the list of file information records output by the Linux {cmd:ls -d1} command
for the files specified by the {cmd:fspec()} option.
If {cmd:finfo()} is unset, then the macro will have the name {cmd:finfo}.


{title:Remarks}

{pstd}
The {cmd:linuxlsd1} package is the Linux, Unix or Mac OS X equivalent of the {helpb msdirb} package,
which creates a dataset in memory, or a local macro, containing a user-specified list of Microsoft Windows file names,
and which can be downloaded from {help ssc:SSC}.


{title:Examples}

{pstd}
The following example creates in memory a dataset of all filenames with the extension {cmd:.txt} in the directory {cmd:./mysub1},
with a single variable {cmd:finfo} containing the file name. Note that the file specification is enclosed in quotes:

{p 8 12 2}{cmd:. linuxlsd1, fspec("./mysub1/*.txt") clear}{p_end}
{p 8 12 2}{cmd:. describe, full}{p_end}
{p 8 12 2}{cmd:. list}{p_end}

{pstd}
The following example creates in memory a dataset of all filenames with the extension {cmd:.txt} in the directory {cmd:./mysub1},
but this time the a single variable containing the file name is named {cmd:filepath}:

{p 8 12 2}{cmd:. linuxlsd1, fspec("./mysub1/*.txt") finfo(filepath) clear}{p_end}
{p 8 12 2}{cmd:. describe, full}{p_end}
{p 8 12 2}{cmd:. list}{p_end}

{pstd}
The following example creates in memory a dataset with 1 observation per file with the extension {cmd:.txt} in the directory {cmd:./mysub1}.
However, this time the file-information variable is named {cmd:filedata},
and the {cmd:lsoptions} option ensures that it contains not only the filename but also a list of other information (specified by the {cmd {ls} option {cmd:-l}),
 and the observations are sorted in descending order of file size (specified by the {cmd:ls} option {cmd:-S}).

{p 8 12 2}{cmd:. linuxlsd1, fspec("./mysub1/*.txt") lsoptions(-lS) finfo(filedata) clear}{p_end}
{p 8 12 2}{cmd:. describe, full}{p_end}
{p 8 12 2}{cmd:. list}{p_end}

{pstd}
The following example uses the {cmd:llinuxlsd1} module to create a list of all filenames with the extension {cmd:.txt} in the directory {cmd:./mysub1},
storing their names in a {help local macro} named {cmd:foobar}.
It then uses the {helpb tfconcat} module of the {help ssc:SSC} package {helpb intext}
to concatenate all the files in the list into a dataset in memory,
with a string variable {cmd:line} containing the file line,
and a string variable {cmd:filename} containing the file from which the line came:

{p 8 12 2}{cmd:. llinuxlsd1, fspec("./mysub1/*.txt") finfo(foobar)}{p_end}
{p 8 12 2}{cmd:. macro list _foobar}{p_end}
{p 8 12 2}{cmd:. tfconcat `foobar', tfname(filename) gene(line)}{p_end}
{p 8 12 2}{cmd:. describe, full}{p_end}
{p 8 12 2}{cmd:. list}{p_end}


{title:Author}

{pstd}
Roger Newson, Imperial College London, UK.{break}
Email: {browse "mailto:r.newson@imperial.ac.uk":r.newson@imperial.ac.uk}


{title:Also see}

{p 4 13 2}
{bind: }Manual: {hi:[D] shell}, {hi:[D] insheet}, {hi:[D] cd}
{p_end}
{p 4 13 2}
On-line: help for {helpb shell}, {helpb insheet}, {helpb pwd}, {helpb cd}
{break}
help for {helpb msdirb}, {helpb intext}, {helpb tfconcat} if installed
{break}
Linux help for the {cmd:ls} command under {cmd:ls --help} in Linux
{p_end}