{smcl}
{* revised 15aug2015}{...}
{cmd:help mergepoly}
{hline}

{title:Title}

{phang}
{bf:mergepoly} {hline 2} Merge adjacent polygons from a shapefile


{title:Syntax}

{p 4 16 2}
{cmd:mergepoly} [{it:featureid}] {ifin} using {it:coord_filename} {cmd:,}
{opt coor:dinates(save_filename)}
[{opt replace} {opt by(byvarlist)} {opt f:ail(newvarname)}]


{title:Description}

{pstd}
{cmd:mergepoly} removes shared borders between adjacent polygons and
reconnects the remaining line segments to form polygon(s) that describe
the outer border of the original polygons. For example, if the input
shapefile contains three features described by the following polygons:

{space 10}{c TLC}{hline 16}{c TRC}
{space 10}{c LT}{hline 10}{c TRC}{space 5}{c |}
{space 10}{c |}{space 10}{c |}{space 2}{it:2}{space 2}{c |}
{space 10}{c |}{space 4}{it:1}{space 5}{c LT}{hline 5}{c RT}
{space 10}{c BLC}{hline 2}{c TRC}{space 7}{c |}{space 2}{it:3}{space 2}{c |}
{space 10}{space 3}{c BLC}{hline 7}{c RT}{space 5}{c |}
{space 10}{space 11}{c BLC}{hline 5}{c BRC}

{pstd}
{cmd:mergepoly} will produce a new shapefile that contains a single
feature described by the outer border of the original polygons:

{space 10}{c TLC}{hline 16}{c TRC}
{space 10}{c |}{space 16}{c |}
{space 10}{c |}{space 16}{c |}
{space 10}{c |}{space 8}{it:1}{space 7}{c |}
{space 10}{c BLC}{hline 2}{c TRC}{space 13}{c |}
{space 10}{space 3}{c BLC}{hline 7}{c TRC}{space 5}{c |}
{space 10}{space 11}{c BLC}{hline 5}{c BRC}

{pstd}
{cmd:mergepoly} inputs a shapefile in the dual Stata dataset format
generated by {stata ssc des shp2dta:shp2dta} (SSC).
The dataset in memory holds the database of feature attributes.
It has one observation per geographic feature.
If the variable that identifies each feature is not called {hi:_ID},
you must also specify {it:featureid}, whose values will be matched
to the variable {hi:_ID} in the coordinates dataset.
The Stata dataset {it:coord_filename} contains the coordinates
(variables {hi:_X _Y}) of points that make up polygons describing
the boundaries of each feature, identified by the numeric variable {hi:_ID}.
Note that a feature can be described using more than one polygon
(e.g., the islands of Hawaii).

{pstd}
You can use {ifin} to merge polygons for a subset of features.

{pstd}
{cmd:mergepoly} outputs a dual Stata dataset shapefile that is ready
to be visualized by {stata ssc des spmap:spmap} (SSC).
The database of feature attributes is left in memory.
Attributes that are not constant per feature are dropped.
The coordinates of the polygons that describe the feature(s)
are saved in the Stata dataset {it:save_filename}.

{pstd}
If {opt by(byvarlist)} is specified, {cmd:mergepoly} creates
a new feature for each distinct value of {it:byvarlist}.
See the examples below that merge the boundaries of U.S. states
by Census regions and divisions.

{pstd}
{cmd:mergepoly} will fail to merge polygons if the shapefile was
constructed by combining polygons from other shapefiles without
redefining polygons to reflect shared boundaries between adjacent polygons.
In the example above, if polygon {it:1} is combined with polygons {it:2}
and {it:3} without adjustments, only the horizontal line segments
between {it:1} and {it:2} and between {it:2} and {it:3} are shared
and dropped by {cmd:mergepoly}.

{space 10}{c TLC}{hline 16}{c TRC}
{space 10}{c |}{space 16}{c |}
{space 10}{c |}{space 10}{c |}{space 5}{c |}
{space 10}{c |}{space 4}{it:1}{space 5}{c |}{space 5}{c |}
{space 10}{c BLC}{hline 2}{c TRC}{space 7}{c |}{space 5}{c |}
{space 10}{space 3}{c BLC}{hline 7}{c RT}{space 5}{c |}
{space 10}{space 11}{c BLC}{hline 5}{c BRC}

{pstd}
In such cases, {cmd:mergepoly} will issue a warning that it failed
to reconnect all line segments.


{title:Options}

{phang}{opt coor:dinates(save_filename)} is required and specifies
the name of the Stata dataset for the merged coordinates.{p_end}
{phang}{opt replace} overwrites the existing {it:save_filename} Stata dataset.

{phang}{opt by(byvarlist)} specifies that polygons are to be merged
by groups of features that share the same attributes identified
by {it:byvarlist}.
The database of feature attributes is reduced to one feature
per distinct value of {it:byvarlist}.
Attributes that are not constant per feature are dropped.
See below for examples that merge U.S. state boundaries
by Census regions and divisions.

{phang}{opt f:ail(newvarname)} specifies a variable name that
{cmd:mergepoly} will use to indicate which coordinates are not part
of the reconnected polygons.
If not specified, the variable will be called {hi:_fail}.


{title:Examples}

{pstd}
Download a shapefile of U.S. state boundaries from the U.S. Census Bureau.

        {cmd:.} {stata `"copy "http://www2.census.gov/geo/tiger/GENZ2010/gz_2010_us_040_00_500k.zip" "gz_2010_us_040_00_500k.zip""'}
        {cmd:.} {stata `"unzipfile "gz_2010_us_040_00_500k.zip""'}

{pstd}
Use {cmd:shp2dta} (from SSC, click {stata ssc install shp2dta:here to install})
to convert the shapefile to Stata datasets.

        {cmd:.} {stata shp2dta using "gz_2010_us_040_00_500k", data("dbf.dta") coor("coor.dta") genid(_ID)}

{pstd}
Merge all polygons to form a shapefile of the contiguous U.S.
(the {cmd:if} condition excludes Alaska, Hawaii, and Puerto Rico).
Since there is only one feature in the resulting shapefile,
we can simply plot the coordinates.

        {cmd:.} {stata `"use "dbf.dta", clear"'}
        {cmd:.} {stata `"mergepoly if !inlist(STATE,"02","15","72") using "coor.dta", coor("usa_coor.dta")"'}
        {cmd:.} {stata `"use "usa_coor.dta", clear"'}
        {cmd:.} {stata `"line _Y _X, lwidth(vthin) cmissing(n)"'}

{pstd}
Go back to the shapefile's database of features and merge it with
a dataset of U.S. Census regions and divisions.
Further reduce to states in the contiguous U.S.{p_end}
        {cmd:.} {stata `"copy "http://robertpicard.com/stata/us_regions.dta" "us_regions.dta""'}
        {cmd:.} {stata `"use "dbf.dta", clear"'}
        {cmd:.} {stata `"merge 1:1 STATE using "us_regions.dta", nogen"'}
        {cmd:.} {stata `"drop if inlist(STATE,"02","15","72")"'}
        {cmd:.} {stata `"save "contiguous.dta""'}
        {cmd:.} {stata `"tab division region"'}

{pstd}
Merge state boundaries by division.
Use {cmd:spmap} (from SSC, click {stata ssc install spmap:here to install})
to visualize each division in terms of the number of states it represents.

        {cmd:.} {stata `"use "contiguous.dta", clear"'}
        {cmd:.} {stata `"mergepoly using "coor.dta", by(division) coor("div_coor.dta")"'}
        {cmd:.} {stata `"spmap dstates using "div_coor.dta", id(_ID)"'}

{pstd}
Repeat, this time merging state boundaries by U.S. Census regions.

        {cmd:.} {stata `"use "contiguous.dta", clear"'}
        {cmd:.} {stata `"mergepoly using "coor.dta", by(region) coor("reg_coor.dta")"'}
        {cmd:.} {stata `"spmap rstates using "reg_coor.dta", id(_ID)"'}


{title:Authors}

{pstd}Robert Picard{p_end}
{pstd}picard@netbox.com{p_end}

{pstd}Michael Stepner{p_end}
{pstd}michaelstepner@gmail.com{p_end}


{title:Acknowledgments}

{pstd}A question on Statalist from Karsten Pfaff was the stimulus
for writing this program.{p_end}