help group_id                                                 dialog:  group_id
-------------------------------------------------------------------------------

Title

group_id -- Groups identifiers when values for specified variables match

Syntax

group_id id_var , matchby(match_vars)

Description

group_id consolidates values of the identifier variable id_var across observations that share the same values of match_vars. When two observations with different id_var values match, every observation carrying either of those id_var values is updated to a single consolidated value, including observations that do not themselves match on match_vars.

An observation with a missing value in any variable of match_vars is ignored for matching purposes. Its id_var value can still be updated if another observation with the same id_var value is matched.

Example

Suppose we have the following dataset:

+-----------------------------------------------------------+
| id   contact          phone      loc    email             |
|-----------------------------------------------------------|
|  1   Picard, Michel   555-2222   home                     |
|  1   Picard, Michel   555-3333   work                     |
|  2   Picard, Robert   555-0001   home                     |
|  2   Picard, Robert   555-1234   work                     |
|  2   Picard, Robert                     picard@netbox.com |
|  3   Pickard, John    555-5555   home                     |
|  3   Pickard, John                      Pickard@here.com  |
|  4   Robert Picard    555-9999   cell                     |
|  4   Robert Picard                      picard@netbox.com |
+-----------------------------------------------------------+

Each contact has a unique id, but id values 2 and 4 appear to refer to the same contact because the email is identical. Consolidating id values 2 and 4 by hand is a bit tricky because some observations with those id values have no email and so are not part of the email match. group_id handles this in one step:

. group_id id, matchby(email)

. list, noobs sep(0)

+-----------------------------------------------------------+
| id   contact          phone      loc    email             |
|-----------------------------------------------------------|
|  1   Picard, Michel   555-2222   home                     |
|  1   Picard, Michel   555-3333   work                     |
|  2   Picard, Robert   555-0001   home                     |
|  2   Picard, Robert   555-1234   work                     |
|  2   Picard, Robert                     picard@netbox.com |
|  3   Pickard, John    555-5555   home                     |
|  3   Pickard, John                      Pickard@here.com  |
|  2   Robert Picard    555-9999   cell                     |
|  2   Robert Picard                      picard@netbox.com |
+-----------------------------------------------------------+

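The listing shows that group_id has consolidated the original id values 2 and 4 into the single value 2. The observation with id 4 and no email was updated as well, even though it took no part in the email match. Other id values are unaffected.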
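
To try the example, the dataset can be entered by hand. The following is a minimal sketch, assuming group_id is already installed. The storage types (byte, str14, str8, str4, str17) are choices made here to fit the values shown and are not part of the original example. Empty strings stand for the blank cells in the listing.

```stata
* Sketch only: storage types are assumptions chosen to fit the listed values
clear
input byte id str14 contact str8 phone str4 loc str17 email
1 "Picard, Michel" "555-2222" "home" ""
1 "Picard, Michel" "555-3333" "work" ""
2 "Picard, Robert" "555-0001" "home" ""
2 "Picard, Robert" "555-1234" "work" ""
2 "Picard, Robert" ""         ""     "picard@netbox.com"
3 "Pickard, John"  "555-5555" "home" ""
3 "Pickard, John"  ""         ""     "Pickard@here.com"
4 "Robert Picard"  "555-9999" "cell" ""
4 "Robert Picard"  ""         ""     "picard@netbox.com"
end

* Consolidate id values that share a nonmissing email
group_id id, matchby(email)
list, noobs sep(0)
```

Run from a do-file, this should end with a listing in which the two observations originally carrying id 4 now carry id 2.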

Author

Robert Picard picard@netbox.com