Title
group_id -- Groups identifiers when values for specified variables match
Syntax
group_id id_var , matchby(match_vars)
Description
group_id consolidates values of the identifier variable id_var when observations match on match_vars. When two observations with different id_var values match, all observations carrying either of those id_var values are updated to the consolidated value, even those that do not themselves match on match_vars.
If an observation has a missing value in any variable in match_vars, it is excluded from matching. Its id_var value can still be updated when another observation with the same id_var value is matched.
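As a quick check of the missing-value rule, one could count the observations that would be left out of matching before running the command. The sketch below assumes the variables phone and email from the Example section and a hypothetical call of the form matchby(phone email); it uses only the built-in missing() function, which is true when any of its arguments is missing.

```stata
* Count observations with a missing value in phone or email.
* Under the rule above, these would be excluded from matching
* if one ran: group_id id, matchby(phone email)
count if missing(phone, email)
```

In the example dataset every observation lacks either phone or email, so under this rule no matches would be expected for that combination of variables.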
Example
Suppose we have the following dataset:
+-----------------------------------------------------------+
| id   contact          phone      loc    email             |
|-----------------------------------------------------------|
|  1   Picard, Michel   555-2222   home                     |
|  1   Picard, Michel   555-3333   work                     |
|  2   Picard, Robert   555-0001   home                     |
|  2   Picard, Robert   555-1234   work                     |
|  2   Picard, Robert                     picard@netbox.com |
|  3   Pickard, John    555-5555   home                     |
|  3   Pickard, John                      Pickard@here.com  |
|  4   Robert Picard    555-9999   cell                     |
|  4   Robert Picard                      picard@netbox.com |
+-----------------------------------------------------------+
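To follow along, the dataset above could be entered with Stata's input command. This is a minimal sketch: the storage types (byte, str14, and so on) are assumptions chosen to fit the values shown, and empty cells are entered as empty strings.

```stata
* Recreate the example dataset; storage types are assumed, not from the original.
clear
input byte id str14 contact str8 phone str4 loc str17 email
1 "Picard, Michel" "555-2222" "home" ""
1 "Picard, Michel" "555-3333" "work" ""
2 "Picard, Robert" "555-0001" "home" ""
2 "Picard, Robert" "555-1234" "work" ""
2 "Picard, Robert" ""         ""     "picard@netbox.com"
3 "Pickard, John"  "555-5555" "home" ""
3 "Pickard, John"  ""         ""     "Pickard@here.com"
4 "Robert Picard"  "555-9999" "cell" ""
4 "Robert Picard"  ""         ""     "picard@netbox.com"
end
```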
Each contact is meant to have a unique id, but id values 2 and 4 appear to refer to the same contact because the email is identical. Consolidating id values 2 and 4 by hand is a bit tricky because some observations with those id values are not part of the email match. group_id handles this directly:
. group_id id, matchby(email)
. list, noobs sep(0)
+-----------------------------------------------------------+
| id   contact          phone      loc    email             |
|-----------------------------------------------------------|
|  1   Picard, Michel   555-2222   home                     |
|  1   Picard, Michel   555-3333   work                     |
|  2   Picard, Robert   555-0001   home                     |
|  2   Picard, Robert   555-1234   work                     |
|  2   Picard, Robert                     picard@netbox.com |
|  3   Pickard, John    555-5555   home                     |
|  3   Pickard, John                      Pickard@here.com  |
|  2   Robert Picard    555-9999   cell                     |
|  2   Robert Picard                      picard@netbox.com |
+-----------------------------------------------------------+
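The listing shows that group_id has consolidated the original id values 2 and 4 into the single value 2. This includes the observation with original id 4 that has no email and therefore took no part in the match. Other id values are unaffected.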
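Because group_id overwrites id, it may help to keep a copy of the original identifier so that changed observations can be inspected afterwards. The sketch below assumes the original example data are in memory (before group_id is run); id_orig is a hypothetical variable name introduced only for this illustration.

```stata
* Keep a copy of the identifier before consolidation.
* id_orig is a hypothetical name, not part of group_id.
clonevar id_orig = id

group_id id, matchby(email)

* List only the observations whose id value was changed.
list id_orig id contact if id != id_orig, noobs sep(0)
```

Given the listing above, this should show the two observations whose original id was 4.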
Author
Robert Picard picard@netbox.com