help geodist
-------------------------------------------------------------------------------

Title

geodist -- Computes geodetic distances.

Syntax

If one or more lat/lon coordinates are variables

geodist lat1 lon1 lat2 lon2 [if] [in] , generate(newvar) [options]

If all lat/lon coordinates are scalars or strings of numerical coordinates

geodist lat1 lon1 lat2 lon2 , [options]

options Description ------------------------------------------------------------------------- Main miles report distances in miles

Ellipsoid ellipsoid(#1,#2) alternative ellipsoid parameters (a,f) maxiter(#) maximum iterations #

Sphere sphere calculate great-circle distances radius(#) custom radius # (in km) -------------------------------------------------------------------------

Description

geodist computes geodetic distances, i.e. the length of the shortest curve between two points along the surface of a mathematical model of the earth. By default, the input coordinates are assumed to be based on the WGS 1984 datum (the same used by Google Earth/Map and GPS devices) and geodist calculates ellipsoidal distances using Vincenty's (1975) equations.

If the sphere option is selected, geodist uses a sphere with a radius of 6371km to approximate the shape of the earth and computes great-circle distances. This is significantly faster than computing ellipsoidal distances.

Coordinates may be specified in any combination of variables, scalars, or strings of numerical coordinates. If one or more lat/lon coordinates are variables, then generate(newvar) must be specified and newvar will be created to store the computed distances.

Coordinates must be in signed decimal degrees, positive for north and east, and negative for south and west. Latitudes range from -90 to 90 and longitudes from -180 to 180.

Options +------+ ----+ Main +------------------------------------------------------------- miles indicates that distances are to be reported in miles; if omitted, distances are in kilometers.

+-----------+ ----+ Ellipsoid +-------------------------------------------------------- ellipsoid(#1,#2) is used to specify a reference ellipsoid other than the one used by default (WGS 1984). A reference ellipsoid is defined by two constants, (a,f). These are the length of the semi-major axis in meters (i.e. radius to the equator) and the flattening ratio. For example, the Airy 1830 reference ellipsoid can be specified with ellipsoid(6377563.396,299.3249646).

maxiter(#) indicates the maximum number of iterations to perform when computing ellipsoidal distances. The default is 25. Note that virtually all cases require less than 10 iterations.

+--------+ ----+ Sphere +----------------------------------------------------------- sphere requests that great-circle distances be computed instead of ellipsoidal distances. By default, a radius of 6371km is used; this is the earth's mean radius (see http://en.wikipedia.org/wiki/Earth_radius#Mean_radii).

radius(#) specifies that great-circle distances are to be computed on a sphere with a radius of # km.

Accuracy and limitations

geodist uses the Vincenty (1975) equations when computing distances on a reference ellipsoid. According to Thomas and Featherstone (2005), these are thought to maintain submillimeter accuracy between all locations. However, the Vincenty algorithm cannot accurately calculate distances for near-antipodal points. In such cases, geodist will replace the distance with the extended missing value .a. See http://geographiclib.sourceforge.net/cgi-bin/Geod for a calculator that does not have this limitation.

Obviously, most real-life data points will not be located on the reference ellipsoid. Since the reference ellipsoid is designed to approximate mean sea level (or more precisely the geoid), expect distances at sea level to be particularly accurate. However, when computing distances on Lake Titicaca, at 3.2km above the reference ellipsoid, expect an error in proportion to the increase in radius of the earth (6374.2/6371) or 0.05%.

Most longitude and latitude datasets are based on the WGS 1984 datum, which is what GPS devices and Google Earth/Map use. There are other datums and it is important to understand that distances will not be accurate if points are not based on the same datum. For example, here is a photo of my GPS lying on the Prime Meridian line at the Royal Observatory in Greenwich, England. It is part of the OSGB 1936 datum and is about 100 meters to the west of the Reference Meridian of WGS 1984 ( geodist 51.47803 -0.00145 51.47803 0).

When a sphere is used to approximate the shape of the earth, geodist computes great-circle distances using two formulas that are equivalent and mathematically exact. However, each is susceptible to computational errors. As Sinnott (1984) pointed out, the cos function is incapable of distinguishing small angles. This makes the spherical law of cosines formula unsuitable for distances between points that are near each other. On the other hand, the haversine formula is incapable of distinguishing small differences at angles around sin(pi/2), which makes it unsuitable for near-antipodal points. geodist makes the best of both worlds by using the haversine formula for almost all calculations and the law of cosines formula for near-antipodal points. This ensures submillimeter accuracy for all spherical distances computed by geodist.

Examples

Compute the distance between Ann Arbor, MI and Montreal, QC. The third line computes the distance on an ellipsoid with all flattening removed. Obviously, these are significantly different from driving distances, in this case 981km according to Google Maps.

. geodist 42.270873 -83.726329 45.545448 -73.639075 . geodist 42.270873 -83.726329 45.545448 -73.639075, sphere . geodist 42.270873 -83.726329 45.545448 -73.639075, ellipsoid(6371000, > 1e20) With near anti-podal points, Vincenty's solution for distances along an ellipsoid will fail. Note that increasing the number of iterations may not make geodist converge:

. geodist 0 0 0 179.3 . geodist 0 0 0 179.4 . geodist 0 0 0 179.4, maxiter(1000) You can use the Karney (2011) calculator if geodist fails to return a distance.

When computing spherical distances, the average earth radius is just that, and may not be the best to approximate ellipsoidal distances depending on the location of the points on earth. For example, at the poles, the radius of a sphere that matches the curvature of the earth is 6399.6km.

. geodist 88 0 88 180 . geodist 88 0 88 180, radius(6399.6) . geodist 88 0 88 180, sphere

Let's illustrate the same point for locations distributed evenly over the State of Colorado. The northeast corner is at 41,-102 and the southwest corner is at 37,-109.

. clear . set obs 100 . set seed 1234 . gen double lat1 = 37 + (41 - 37) * uniform() . gen double lon1 = -109 + (109 - 102) * uniform() . gen double lat2 = 37 + (41 - 37) * uniform() . gen double lon2 =-109 + (109 - 102) * uniform() . sum lat1 lon1 lat2 lon2

. geodist lat1 lon1 lat2 lon2, gen(v) . geodist lat1 lon1 lat2 lon2, gen(r6371) sphere . geodist lat1 lon1 lat2 lon2, gen(r6377) radius(6377) . gen diff1 = abs(r6371-v) . gen pcerr1 = 100 * diff1 / v . gen diff2 = abs(r6377-v) . gen pcerr2 = 100 * diff2 / v . sum

Any combination of variables and single lat/lon will work

. geodist lat1 lon1 41 -102, gen(v2) . geodist lat1 lon1 lat2 -102, gen(v3) . geodist lat1 -109 lat2 -102, gen(v4)

If you want to compute the nearest neighbors, you can use cross to create all pairwise combinations of points and then sort.

. clear . set obs 100 . gen id2 = 2000 + _n . gen double lat2 = 37 + (41 - 37) * uniform() . gen double lon2 =-109 + (109 - 102) * uniform() . tempfile f . save "`f'" . clear . set obs 100 . gen id1 = 1000 + _n . gen double lat1 = 37 + (41 - 37) * uniform() . gen double lon1 = -109 + (109 - 102) * uniform() . cross using "`f'" . geodist lat1 lon1 lat2 lon2, gen(d) . sort id1 d id2 . by id1: keep if _n == 1 Alternatively, you can use geonear, available from SSC, to find the nearest neighbors:

. geonear id1 lat1 lon1 using "`f'", n(id2 lat2 lon2) ell

Saved results

geodist saves the following in r() when no variable is specified:

Scalars r(iterations) number of iterations for ellipsoidal distances r(distance) distance

References and acknowledgements

Many thanks to Chris Veness for the best web pages on the subject:

http://www.movable-type.co.uk/scripts/latlong-vincenty.html http://www.movable-type.co.uk/scripts/latlong.html The definition of the World Geodetic System 1984 is available from

http://earth-info.nga.mil/GandG/publications/tr8350.2/wgs84fin.pdf See Appendix A.1 for a list of reference ellipsoids.

C. F. F. Karney, Geodesics on an ellipsoid of revolution, Feb. 2011; preprint arxiv:1102.1215.

R. W. Sinnott, "Virtues of the Haversine", Sky and Telescope 68 (2), 159 (1984). Thanks to the University of Michigan's Shapiro Science Library.

C. M. Thomas and W. E. Featherstone, Validation of Vincenty's Formulas for the Geodesic Using a New Fourth-Order Extension of Kivioja's Formula, J. Surv. Engrg. Volume 131, Issue 1, pp. 20-26 (February 2005). Available from:

http://www.cage.curtin.edu.au/~will/thomas-featherstone.pdf

Vincenty, T. (1975) Direct and inverse solutions of geodesics on the ellipsoid with application of nested equations, Survey Review 22(176): 88-93. Available from:

http://www.ngs.noaa.gov/PUBS_LIB/inverse.pdf

Author

Robert Picard picard@netbox.com