help parallel -------------------------------------------------------------------------------


parallel -- Stata module for Parallel computing


Setting the number of clusters (data blocks)

parallel setclusters # [, force]

Parallelizing a do-file

parallel do filename [, by(varlist) force keep keeplast programs mata noglobal seeds(numlist) nodata]

Parallelizing a Stata command

parallel [, by(varlist) force keep keeplast programs mata noglobal seeds(numlist) nodata]: stata_cmd

Removing auxiliary files

parallel clean [, event(pll_id) all]

Setting the right Stata directory

parallel setstatadir stata_path

options               Description
-------------------------------------------------------------------------
Setting the number of clusters
  force               To protect the user's computer, setting more than 8
                      clusters is not permitted (see the WARNINGS in the
                      description). This option overrides the restriction.

Byable parallelization
  by                  Varlist. Tells the command which groups of
                      observations must stay together when the current
                      dataset is split, so that, for example, one unit's
                      story is not split across two or more clusters.
  force               When by is used, parallel checks whether the dataset
                      is properly sorted. With force the command skips
                      this check.

Keeping auxiliary files
  keep                Keeps the auxiliary files generated by parallel.
  keeplast            Keeps the auxiliary files and removes those saved by
                      the previous run in the current session.

Passing Stata/Mata objects
  programs            If temporary programs are loaded in the session,
                      this option makes parallel pass them to the clusters.
  mata                If the algorithm needs Mata objects, this option
                      passes every Mata object loaded in the current
                      session (including functions) to each cluster.
  noglobal            By default parallel passes every global macro loaded
                      in the session to the clusters. Use this option to
                      leave globals out of the clusters.

Simulation options
  seeds               Numlist. Passes a specific seed to be used within
                      each cluster.
  nodata              Tells parallel that no data is loaded, so it should
                      not try to split or append anything.

Removing auxiliary files
  event               Integer. Specifies which executed (and stored)
                      event's files should be removed.
  all                 Tells parallel to remove every remaining auxiliary
                      file it generated in the current directory.

Setting the right Stata directory
  stata_path          Filename. Required. Sets the path to Stata's
                      executable.
-------------------------------------------------------------------------


parallel implements a data parallelism algorithm in order to substantially speed up computations.

To use parallel it is necessary to set the number of clusters the user wants to work with. To do this, use the parallel setclusters # syntax, replacing # with the desired number of clusters. Setting more clusters than the computer has physical cores is not recommended (see the WARNINGS in the description).

parallel do is the equivalent of do. With this syntax, the loaded dataset is split into the number of clusters specified by parallel setclusters and the do-file is executed independently over each of the data clusters. After all the parallel instances stop, the datasets are appended.

parallel : (as a prefix) splits the loaded dataset and then executes a stata_cmd over the specified number of data clusters in order to speed up computations. As with parallel do, after all the parallel instances stop, the datasets are appended.

Every time parallel runs, several auxiliary files are generated and, once it finishes, automatically deleted. If the user sets keep or keeplast, the auxiliary files are kept, and this is where parallel clean comes in handy. With parallel clean the user can remove the most recently generated auxiliary files (the default), the files of a specific parallel instance (using event(pll_id)), or every kept auxiliary file (using all).
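As a minimal sketch of the keep and clean workflow described above, the following session keeps the auxiliary files of one run and then removes them by id. It uses only the syntax documented in this help file; the bplong dataset and the variable max_bp are borrowed from Example 1, and the choice of two clusters is arbitrary.

```stata
* Sketch only: assumes parallel is installed and can locate Stata.
sysuse bplong, clear
sort patient
parallel setclusters 2

* keep: leave this run's auxiliary files in place
parallel, by(patient) keep: by patient: egen max_bp = max(bp)

* r(pll_id) identifies the run that just finished
local id = r(pll_id)

* Remove the auxiliary files of that run only
parallel clean, event(`id')

* Alternatively, remove every auxiliary file left in the current directory
* parallel clean, all
```

Storing r(pll_id) in a local right away is a precaution, since later r-class commands may overwrite the saved results.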

Given N clusters, within each cluster parallel creates the local macros pll_id (equal for all the clusters) and pll_instance (ranging from 1 up to N, equalling 1 inside the first and N inside the last). The global macros PLL_CLUSTERS and PLL_DIR are also available within each cluster.
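To illustrate how these macros might be used, the sketch below tags every observation with the cluster that processed it. The do-file name tagblock.do and the variables block and nblocks are hypothetical, and the sketch assumes that pll_instance and PLL_CLUSTERS are visible inside the do-file exactly as described above.

```stata
* Sketch only: assumes parallel is installed and can locate Stata.
* Write a hypothetical do-file, tagblock.do. The backslashes delay macro
* expansion so that `pll_instance' and $PLL_CLUSTERS reach the file literally.
tempname fh
file open `fh' using "tagblock.do", write text replace
file write `fh' "generate int block = \`pll_instance'" _n
file write `fh' "generate int nblocks = \$PLL_CLUSTERS" _n
file close `fh'

sysuse bplong, clear
sort patient
parallel setclusters 2
parallel do tagblock.do, by(patient)

* After appending, block records which cluster handled each observation
tabulate block
```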

At present, parallel by default automatically identifies the path of Stata's executable file. This is necessary because the path is used to run Stata in batch mode (the core of the command). However, according to some reports, that path is not always identified correctly; in that case parallel setstatadir can be used to set it manually.

WARNINGS

(a) For each cluster parallel starts a new Stata instance (thus running as many processes as clusters); therefore, if the user sets more clusters than the computer has cores, the computer may collapse.

(b) At present parallel cannot stop running clusters by itself, which implies that, if any of the clusters enters an endless loop, the user has to stop the clusters manually by killing them from the OS's task manager.


Inspired by the R library ``snow'' and intended for multicore CPUs, parallel implements parallel computing methods through the OS's shell scripting (using Stata in batch mode) to speed up computations by splitting the dataset into a given number of clusters, thereby implementing a data parallelism algorithm.

The efficient number of computing clusters depends on the number of physical cores (CPUs) in your computer; for example, on a quad-core computer the correct cluster setting is four. With simultaneous multithreading, such as Intel's hyper-threading technology (HTT), setting parallel according to the number of processor threads hardly ever results in perfect speedup scaling, as expected. Even so, after several tests on HTT-capable architectures, the results of running parallel according to the machine's physical cores versus its logical cores show small though significant differences.

parallel is especially handy for implementing loop-based simulation models (or simply loops), for Stata commands such as reshape, and for any job that (a) can be repeated over data blocks and (b) processes big datasets.
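For simulation jobs that need no loaded data, the nodata and seeds() options documented above can be combined with parallel do. The following is a rough sketch, not a recommended design: the do-file mysim.do, the file names sim_1.dta and sim_2.dta, and the seed values are all hypothetical. It also assumes that pll_instance is visible inside the do-file and that each cluster saves into the calling session's working directory.

```stata
* Sketch only: assumes parallel is installed and can locate Stata.
* Write a hypothetical do-file, mysim.do. The backslash delays macro
* expansion so that `pll_instance' reaches the file literally.
tempname fh
file open `fh' using "mysim.do", write text replace
file write `fh' "clear" _n
file write `fh' "set obs 1000" _n
file write `fh' "generate double x = rnormal()" _n
file write `fh' "generate int instance = \`pll_instance'" _n
file write `fh' "save sim_\`pll_instance'.dta, replace" _n
file close `fh'

clear
parallel setclusters 2

* nodata: nothing to split or append; seeds(): one seed per cluster
parallel do mysim.do, nodata seeds(1234 5678)

* Collect the per-cluster files by hand (assumed to be in the current directory)
use sim_1.dta, clear
append using sim_2.dta
tabulate instance, summarize(x)
```

Because nodata tells parallel not to append anything, each cluster writes its own file and the results are gathered manually afterwards.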

At this time parallel has been successfully tested on Windows and Unix machines. Tests on Mac OS are still pending.

After several tests, it has been shown that, thanks to how parallel has been written, the algorithm can be used to implement other parallel computing techniques, such as Monte Carlo simulations, simultaneously running models, etc. An extensive example through Monte Carlo simulations is provided here


If the stata_cmd or do-file saves results, none of those results will be kept, because parallel runs Stata in batch mode. This is also true for matrices, scalars and Mata objects.
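One possible way around this limitation, offered here as an assumption rather than as a documented feature, is to copy the results of interest into variables, since the datasets themselves are appended after the run. In the sketch below the do-file est_block.do and the variables b_when and block are hypothetical, and pll_instance is assumed to be visible inside the do-file as described in this help file.

```stata
* Sketch only: assumes parallel is installed and can locate Stata.
* Write a hypothetical do-file, est_block.do. The backslash delays macro
* expansion so that `pll_instance' reaches the file literally.
tempname fh
file open `fh' using "est_block.do", write text replace
file write `fh' "quietly regress bp when" _n
file write `fh' "generate double b_when = _b[when]" _n
file write `fh' "generate int block = \`pll_instance'" _n
file close `fh'

sysuse bplong, clear
sort patient
parallel setclusters 2
parallel do est_block.do, by(patient)

* e(b) from each cluster is gone, but the copies stored as variables survive
tabulate block, summarize(b_when)
```

Each cluster fits the regression on its own block only, so b_when holds one coefficient per block, not an estimate for the whole dataset.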

Although parallel passes programs, macros and Mata objects through to the clusters, at present it is not capable of doing the same with matrices or scalars.

Using parallel within ado-files that contain locally defined programs is not recommended due to known issues. At present parallel cannot correctly interpret this kind of program during the loading process, which causes errors.

Example 1: using prefix syntax

In this example we'll generate a variable containing the maximum blood-pressure measurement (bp) by patient.

Setup for a quad-core computer

    . sysuse bplong.dta
    . sort patient
    . parallel setclusters 4

Computes the maximum of bp for each patient. We add the option by(patient) to tell parallel not to split stories.

    . parallel, by(patient): by patient: egen max_bp = max(bp)

Which is the ``parallel way'' to do:

    . by patient: egen max_bp = max(bp)

Giving you the same result.

Example 2: using parallel do syntax

Another use that can benefit greatly from parallel is implementing loop-based simulations. Imagine that we have a model that requires looping over each and every record of a panel dataset.

Using parallel, the proper way to do this would be using the ``parallel do'' syntax:

    . use mybigpanel.dta, clear
    . parallel setclusters 4
    . parallel do filename
    . collapse ...

where the do-file would look something like this:

----------------------------------- begin of do-file ------------
local maxiter = _N
forval i = 1/`maxiter' {
    ...some routine...
}
----------------------------------- end of the do-file ----------

Example 3: setting the right path

If parallel sets the path of stata.exe wrongly, parallel setstatadir will correct the situation. So, if "C:\Archivos de programa\Stata12/stata.exe" is the right path, we only have to write:

. parallel setstatadir "C:\Archivos de programa\Stata12/stata.exe"

Saved results

parallel saves the following in r():

Scalars
  r(pll_n)            Number of parallel clusters last used
  r(pll_id)           Id of the last parallel instance executed (needed to
                      use parallel clean)
  r(pll_t_setu)       Time taken to set up (before the parallelization) and
                      to finish the job (after the parallelization)
  r(pll_t_calc)       Time taken to complete the parallel job
  r(pll_t_fini)       Time taken to append and clean
  r(pll_t_reps)       In the case of keeptime, the number of time
                      measurements performed

Macros
  r(pll_dir)          Directory where parallel ran and stored the auxiliary
                      files
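The saved results listed above can be inspected right after a run. The short session below is only a sketch under the assumption that parallel is installed; it reuses the data and command from Example 1 with an arbitrary choice of two clusters.

```stata
* Sketch only: assumes parallel is installed and can locate Stata.
sysuse bplong, clear
sort patient
parallel setclusters 2
parallel, by(patient): by patient: egen max_bp = max(bp)

* Scalars are read with r(name); the macro needs macro quotes
display "clusters used:  " r(pll_n)
display "instance id:    " r(pll_id)
display "parallel time:  " r(pll_t_calc)
display "directory:      `r(pll_dir)'"
```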


Luke Tierney, A. J. Rossini, Na Li and H. Sevcikova (2012). snow: Simple Network of Workstations. R package version 0.3-9.

R Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.

George Vega Y (2012). Introducing PARALLEL: Stata Module for Parallel Computing. Chilean Pension Supervisor, Santiago de Chile.


George Vega Yon, Superintendencia de Pensiones.


Damian C. Clarke (Oxford University, England), Felix Villatoro (Superintendencia de Pensiones, Chile), Eduardo Fajnzylber (Universidad Adolfo Ibanez, Chile), Eric Melse (CAREM, Netherlands), Research Division (Superintendencia de Pensiones, Chile)

Also see

Manual: [GSM] Advanced Stata usage (Mac), [GSU] Advanced Stata usage (Unix), [GSW] Advanced Stata usage (Windows)

Online: Running Stata batch-mode in Mac, Unix and Windows