Simulation of IRT models
simirt [, nbobs(#) dim(# [#]) mu(# [#]) cov(# [# #]) diff(list_of_values_or_expression) disc(list_of_values) pmin(list_of_values) pmax(list_of_values) acc(list_of_values) rsm1(list_of_values) rsm2(list_of_values) threshold clear store(filename) replace prefix(string [string]) draw group(#) deltagroup(#)]
Description
simirt creates a new dataset of responses to items simulated by a unidimensional IRT model. The model can be dichotomous (Rasch, OPLM, Birnbaum, 3PLM, 4PLM, 5PAM) or polytomous (Rating Scale Model, RSM). It is also possible to simulate two sets of items, each linked to its own latent trait; the two latent traits can be correlated.
Options
nbobs(#) specifies the number of individuals to simulate. By default, 2000 individuals are simulated.
dim(# [#]) specifies the number of items linked to the first latent trait (and optionally to the second one). If this option is not defined, the simirt command simulates only one latent trait with a number of items equal to the number of values defined in the diff option (at least one of these two options must be defined).
mu(# [#]) specifies the mean of each simulated latent trait.
cov(# [# #]) defines the covariance matrix of the latent trait(s). With a single latent trait, cov contains only the variance of that trait. With two latent traits, cov contains the variance of the first latent trait, followed by the variance of the second latent trait, followed by their covariance.
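As a minimal sketch of the ordering described for cov, the following Python helper (hypothetical, not part of simirt) rebuilds the 2x2 covariance matrix from three values and derives the implied correlation. The validity check on the covariance is an assumption about what a sensible input looks like; this help file does not say what simirt itself checks.

```python
import math


def cov_matrix(values):
    """Build the covariance matrix from the values passed to cov().

    Assumed ordering (taken from the option description): one value is the
    variance of a single trait; three values are var1, var2, covariance.
    """
    if len(values) == 1:
        return [[values[0]]]
    if len(values) != 3:
        raise ValueError("cov() takes one or three values")
    var1, var2, cov12 = values
    # A covariance larger than sqrt(var1 * var2) would imply |rho| > 1.
    if cov12 ** 2 > var1 * var2:
        raise ValueError("covariance is too large for these variances")
    return [[var1, cov12], [cov12, var2]]


def correlation(values):
    """Correlation implied by the three values var1, var2, covariance."""
    var1, var2, cov12 = values
    return cov12 / math.sqrt(var1 * var2)


# Values taken from the bidimensional example: cov(2 4 1)
matrix = cov_matrix([2, 4, 1])
rho = correlation([2, 4, 1])
```

With cov(2 4 1), the theoretical correlation between the two latent traits is 1/sqrt(8), about 0.35; the empirical value returned in r(rho) should be close to it for large nbobs.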
diff(list_of_values_or_expression) defines the values of the difficulty parameters. These can be given as a list of values (with a number of elements equal to the total number of items), as an expression like uniform #A #B (to define these parameters as uniformly distributed in ]#A;#B[), or as an expression like gauss #M #V (to define these parameters as the percentiles of the Gaussian distribution with mean #M and variance #V). If there are two latent traits, the expressions are written uniform #A1 #B1 #A2 #B2 and gauss #M1 #V1 #M2 #V2. If this option is not defined (but the dim option is), these parameters are defined from a standard Gaussian distribution.
disc(list_of_values) defines the discrimination parameters of the items (by default, these parameters are fixed to 1).
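The two expressions accepted by diff can be illustrated with a short Python sketch. It rests on an assumption that this help file does not confirm: that the J difficulties are placed at the evenly spaced levels j/(J+1), j = 1..J, of the requested distribution. The function names are hypothetical and the grid actually used by simirt may differ.

```python
from statistics import NormalDist


def gauss_difficulties(n_items, mean, variance):
    """Difficulties as percentiles of a Gaussian with the given mean and variance.

    Assumption: percentile levels are j / (n_items + 1), j = 1..n_items.
    """
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    return [dist.inv_cdf(j / (n_items + 1)) for j in range(1, n_items + 1)]


def uniform_difficulties(n_items, low, high):
    """Difficulties evenly spread over the open interval ]low; high[.

    Assumption: same levels j / (n_items + 1), so neither bound is reached.
    """
    step = (high - low) / (n_items + 1)
    return [low + j * step for j in range(1, n_items + 1)]


# Seven items, as in diff(gauss 0 1) dim(7) and diff(uniform -2 3) dim(7)
gauss_diffs = gauss_difficulties(7, 0, 1)
uniform_diffs = uniform_difficulties(7, -2, 3)
```

Under this assumption the Gaussian difficulties are symmetric around the mean, with the middle item exactly at #M, and the uniform difficulties never touch the bounds #A and #B.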
pmin(list_of_values) defines the minimal probability of positive responses for each item (by default, these parameters are fixed to 0).
pmax(list_of_values) defines the maximal probability of positive responses for each item (by default, these parameters are fixed to 1).
acc(list_of_values) defines the accelerating parameters for each item (by default, these parameters are fixed to 1).
rsm1(list_of_values) and rsm2(list_of_values) define the parameters corresponding to the modalities 2 to K (the highest modality) for each item of the first scale (rsm1) or of the second scale (rsm2). If neither of these two options is specified, the data are dichotomous. The rsm1 and rsm2 options cannot be combined with the disc, pmin, pmax, acc and draw options.
threshold simulates the responses of each individual directly from the latent trait. In a dichotomous model (the disc, pmin, pmax and acc options are not allowed), the response 1 is given as soon as the latent trait of the individual is greater than the difficulty parameter of the item (defined with the diff option). In a polytomous model, a modality of response is given as soon as the latent trait of the individual is greater than the difficulty of this modality. The difficulty of a modality is computed as the difficulty of the item plus the sum of the tau parameters (defined with the rsm1 and rsm2 options) from tau2 up to the tau parameter corresponding to that modality. In this case, the tau parameters must be positive real numbers.
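The threshold rule can be sketched in a few lines of Python. This is an illustration of the description above, not the code of simirt: the function name is hypothetical, and it assumes that the difficulty of modality 1 is the item difficulty itself (an empty sum of tau parameters) and that an individual receives the highest modality whose difficulty is exceeded.

```python
from itertools import accumulate


def threshold_response(theta, diff, taus=()):
    """Deterministic response of an individual with latent trait theta.

    diff  -- difficulty of the item (diff option)
    taus  -- tau2..tauK (rsm1/rsm2 options); empty for a dichotomous item
    """
    if any(tau <= 0 for tau in taus):
        raise ValueError("tau parameters must be positive with threshold")
    # Difficulty of modality 1 is diff; modality k adds tau2 + ... + tauk.
    cutoffs = list(accumulate(taus, initial=diff))
    # Positive taus make the cutoffs increasing, so counting the cutoffs
    # below theta gives the highest modality reached.
    return sum(theta > cutoff for cutoff in cutoffs)


# Dichotomous item of difficulty 0
dich_low = threshold_response(-0.3, 0.0)
dich_high = threshold_response(0.3, 0.0)

# Polytomous item with taus 1, .5, .2: cutoffs at 0, 1, 1.5 and 1.7
poly_mid = threshold_response(1.2, 0.0, (1, 0.5, 0.2))
poly_top = threshold_response(2.0, 0.0, (1, 0.5, 0.2))
```

The requirement that the tau parameters be positive is what keeps the modality difficulties ordered, so that each individual falls into exactly one modality.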
clear does not restore the initial dataset at the end of the command (at least one of the clear and store options must be defined).
store(filename) defines the file where the new dataset will be stored (at least one of the clear and store options must be defined).
replace, used with store, allows replacing the file defined by store if it already exists.
prefix(string [string]) defines the prefix to use for the names of the items. The string cannot contain spaces. By default, the prefix is "item" in the unidimensional case, and "itemA" and "itemB" in the bidimensional case. A number follows these prefixes.
draw draws, in the unidimensional case, the Item Characteristic Curves on a graph.
group(#) defines, in the unidimensional case, two groups of individuals, for example a "treated" group (coded 1) and a "reference" group (coded 0). The value given is the expected proportion of individuals in the first group.
deltagroup(#) defines, in the unidimensional case, the difference in the mean of the latent trait between the two groups defined by the group option. This option is ignored if the group option is not defined. The variance of the latent trait is assumed to be equal in the two groups.
Outputs
r(nbobs): Number of simulated individuals.
r(mean_#): Empirical mean of the #th latent trait.
r(var_#): Empirical variance of the #th latent trait.
r(cov_12): Empirical covariance between the two latent traits (if two dimensions are simulated).
r(rho): Empirical correlation coefficient between the two latent traits (if two dimensions are simulated).
Examples
. simirt , dim(7) clear /*simulates data by a Rasch model*/
. simirt , diff(gauss 0 1) dim(7) disc(.8 1.2 1.4 .6 1.4 1.0 1.1) clear /*simulates data by a Birnbaum model*/
. simirt , diff(uniform -2 3 0 1) dim(7 7) cov(2 4 1) clear /*simulates data with a bidimensional latent trait*/
. simirt , dim(7) clear group(.5) deltagroup(1) /*simulates data by a Rasch model, with two groups of approximately equal size and a difference of 1 between the means of the latent trait in the two groups*/
. simirt , dim(7) clear rsm1(1 .5 .2) /*simulates data by a RSM; each item has up to 5 modalities*/
Notes about the models
Rasch model: Only the diff option (or the dim option) needs to be defined.
Birnbaum model and OPLM: The diff and disc options must be defined.
3-PLM: The diff, disc and pmin options must be defined.
4-PLM: The diff, disc, pmin and pmax options must be defined.
5-PAM: The diff, disc, pmin, pmax and acc options must be defined.
RSM: The rsm1 option (and optionally the rsm2 option) must be defined. The disc, pmin, pmax and acc options cannot be defined.
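The dichotomous models listed above can be read as one response function in which each model frees one more parameter from its default value. The Python sketch below uses the usual textbook form of the five-parameter accelerating model; this help file does not print the formula used by simirt, so the expression and the function name are assumptions made for illustration.

```python
import math


def response_probability(theta, diff, disc=1.0, pmin=0.0, pmax=1.0, acc=1.0):
    """Probability of a positive response for latent trait theta.

    Assumed form: pmin + (pmax - pmin) * logistic(disc * (theta - diff)) ** acc.
    Defaults mirror the option defaults (disc=1, pmin=0, pmax=1, acc=1),
    in which case the expression reduces to the Rasch model.
    """
    logistic = 1.0 / (1.0 + math.exp(-disc * (theta - diff)))
    return pmin + (pmax - pmin) * logistic ** acc


# Rasch: only the difficulty is given
p_rasch = response_probability(0.0, 0.0)

# 3-PLM: difficulty, discrimination and a lower asymptote of 0.2
p_3plm = response_probability(0.0, 0.0, disc=1.2, pmin=0.2)

# 5-PAM: all five parameters set
p_5pam = response_probability(0.0, 0.0, disc=1.2, pmin=0.2, pmax=0.9, acc=2.0)
```

When the latent trait equals the difficulty, the logistic term is 0.5, so the three calls return 0.5, 0.2 + 0.8 x 0.5 = 0.6 and 0.2 + 0.7 x 0.25 = 0.375 respectively.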
Author
Jean-Benoit Hardouin, Regional Health Observatory (ORS) - 1, rue Porte Madeleine - BP 2439 - 45032 Orleans Cedex 1 - France. You can contact the author at jean-benoit.hardouin@orscentre.org and visit the websites AnaQol and FreeIRT