{smcl}
{* 3feb2009}{...}
{cmd:help spkde}{right:Version 1.0.0}
{hline}

{title:Title}

{p 4 13 2}
{hi:spkde} {hline 2} Kernel estimation of density and intensity functions
                     for two-dimensional spatial point patterns {p_end}


{marker syntax}{title:Syntax}

{p 8 15 2}
{cmd:spkde} [{varlist}] {ifin} {help using}
{help spkde##section04:{it:gridpoints}}
[{cmd:,}
{help spkde##options1:{it:options}}]


{synoptset 35 tabbed}{...}
{marker options1}{synopthdr:{help spkde##options2:options}{col 41}}
{synoptline}
{marker main}{syntab: Main}
{p2coldent :* {opt x:coord(xvar)}}variable containing the x-coordinate
   of data points {p_end}
{p2coldent :* {opt y:coord(yvar)}}variable containing the y-coordinate
   of data points {p_end}
{synopt :{opt k:ernel(kf)}}kernel function, where {it:kf} is one of the
   following: {cmdab:qu:artic} (default), {cmdab:un:iform},
   {cmdab:no:rmal}, {cmdab:ne:gexp}, {cmdab:tr:iangular},
   {cmdab:ep:anechnikov} {p_end}
{synopt :{opt trunc:ated(t)}}truncation parameter for kernel functions
   {cmd:normal} and {cmd:negexp}, where {it:t} is a positive number {p_end}
{p2coldent :* {opt b:andwidth(method)}}method for setting the value of the
   kernel bandwidth, where {it:method} is one of the following: {cmdab:fbw},
   {cmdab:ndp}, {cmdab:mix:ed} {p_end}

{marker estpar}{syntab: Estimation parameters}
{synopt :{cmdab:fbw(ad}{it:q}|{it:b}{cmd:)}}fixed kernel bandwidth, where
   {it:q} is positive integer and {it:b} is a positive number {p_end}
{synopt :{opt ndp(d)}}minimum (weighted) number of data points to be used for
   kernel estimation at each grid point, where {it:d} is a positive number
   {p_end}
{synopt :{opt ndpw(ndpwvar)}}weight data points by variable {it:ndpwvar}
   when searching for the minimum number of data points to be used for
   kernel estimation {p_end}
{synopt :{opt edge:correction}}apply approximate edge correction {p_end}

{syntab: Reporting}
{synopt :{opt d:ots}}display job progression dots {p_end}
{synopt :{opt noverb:ose}}suppress display of job progression {p_end}

{syntab: Saving results}
{p2coldent :* {cmdab:sav:ing(}{help spkde##section05:{it:kernest}} [{cmd:, replace}]{cmd:)}}save
   results to Stata dataset {it:kernest} {p_end}
{synoptline}
{p 4 6 2}* Required option {p_end}


{marker desc}{title:Description}

{pstd} {cmd:spkde} implements a variety of kernel estimators of both the
       probability density function and the intensity function of
       two-dimensional spatial point patterns. {p_end}

{pstd} A two-dimensional spatial point pattern {bf:S} can be defined as a
       set of {it:data points} {bf:s}_{it:i} ({it:i} = 1, ..., {it:n})
       located in a two-dimensional study region {it:R} at coordinates
       ({it:s_i}1, {it:s_i}2). Each data point {bf:s}_{it:i} represents
       the location in {it:R} of one or more “objects” of some kind: people,
       events, sites, buildings, plants, cases of a disease, etc. {p_end}

{pstd} In the analysis of spatial point patterns we are often interested
       in determining whether the distribution of the objects of interest
       within {it:R} exhibits some form of clustering, as opposed to being
       random. To explore the possibility of clustering, it may be useful
       to describe the spatial point pattern of interest by means of its
       {it:probability density function} {it:p}({bf:s}) and/or its
       {it:intensity function} {it:l}({bf:s}). {p_end}

{pstd} The probability density function {it:p}({bf:s}) defines the probability
       of observing an object at location {bf:s} in {it:R}. In turn, the
       intensity function {it:l}({bf:s}) defines the expected number of objects
       per unit area at location {bf:s} in {it:R}. Therefore, {it:p}({bf:s})
       and {it:l}({bf:s}) differ only by a constant of proportionality (Bailey
       and Gatrell 1995; Waller and Gotway 2004). {p_end}

{pstd} Both the probability density function {it:p}({bf:s}) and the intensity
       function {it:l}({bf:s}) of a given two-dimensional spatial point pattern
       can be easily estimated by means of nonparametric estimators, e.g.,
       kernel estimators. {it:Kernel estimators} are used to generate a
       spatially smooth estimate of {it:p}({bf:s}) and/or {it:l}({bf:s}) at a
       fine grid of points {bf:s}_{it:g} ({it:g} = 1, ..., {it:G}) covering
       the study region {it:R} (Bailey and Gatrell 1995; Waller and Gotway
       2004). {p_end}

{pstd} {cmd:spkde} computes kernel estimates of both {it:p}({bf:s}) and
       {it:l}({bf:s}) at a grid of points generated by the user-written
       Stata program {help spgrid}. Expressly, for each grid point
       {bf:s}_{it:g}, {cmd:spkde} computes first the kernel estimate of
       the intensity {it:l}({bf:s}_{it:g}), and then the kernel estimate
       of the density {it:p}({bf:s}_{it:g}). {p_end}

{marker eq1}{pstd} The intensity {it:l}({bf:s}_{it:g}) at each grid point
       {bf:s}_{it:g} is estimated as follows: {p_end}

{space 12}^{space 10}{it:c}{space 6}{it:n}{space 5}
{space 4}(1){space 5}{it:l}({bf:s}_{it:g}) = {hline 5} · SUM {it:k}({it:d_ig} / {it:h_g}) · {it:y_i}
{space 22}{it:A_g}{space 4}{it:i}=1

{pstd} where {it:k}(·) is the kernel function {hline 1} usually a unimodal
       symmetrical bivariate probability density function; {it:d_ig} is the
       Euclidean distance between data point {bf:s}_{it:i} and grid point
       {bf:s}_{it:g}; {it:h_g} is the kernel bandwidth {hline 1} i.e., the
       radius of the kernel function {hline 1} at grid point {bf:s}_{it:g};
       {it:y_i} is the number of objects located at data point {bf:s}_{it:i};
       {it:A_g} is the area of the subregion of {it:R} over which the kernel
       function is evaluated at grid point {bf:s}_{it:g}, possibly corrected
       for edge effects; and {it:c} is a constant of proportionality. {p_end}

{pstd} In turn, the density {it:p}({bf:s}_{it:g}) at each grid point
       {bf:s}_{it:g} is estimated as follows: {p_end}

{space 24}^
{space 12}^{space 11}{it:l}({bf:s}_{it:g})
{marker eq2}{space 4}(2){space 5}{it:p}({bf:s}_{it:g}) = {hline 12}
{space 23}{it:G}{space 2}^
{space 22}SUM {it:l}({bf:s}_{it:j})
{space 22}{it:j}=1

{pstd} The estimates of {it:l}({bf:s}_{it:g}) and {it:p}({bf:s}_{it:g}),
       along with several other auxiliary variables, are saved to Stata
       dataset {help spkde##section05:{it:kernest}} and can be visualized
       using the user-written Stata program {help spmap}. {p_end}


{marker section01}{title:Kernel function}

{pstd} As shown by {help spkde##eq1:Equation 1} above, for each grid point
       {bf:s}_{it:g}, kernel estimators compute a weighted sum of the objects
       making up the spatial point pattern of interest. The weight
       {it:k}({it:d_ig} / {it:h_g}) applied to each object is a function of
       the ratio between the object's distance from {bf:s}_{it:g} ({it:d_ig})
       and the value of the kernel bandwidth {it:h_g}. In turn, the way
       weights depend on {it:d_ig} and {it:h_g} is determined by the form of
       the kernel function {it:k}(·) used in the analysis. {p_end}

{pstd} {cmd:spkde} allows to choose among six basic types of kernel
       function: uniform, normal, negative exponential, quartic (the default),
       triangular, and Epanechnikov. Moreover, through specification of
       option {opt truncated(t)}, it is possible to use a truncated version
       of the normal and negative exponential kernel functions.

{pstd} Let {it:z} = {it:d_ig} / {it:h_g} denote the argument of the kernel
       function, and {it:t}>0 denote a truncation parameter specified with
       option {opt truncated(t)}. Then, the kernel functions made available
       by {cmd:spkde} can be described as follows: {p_end}

{col 9}{hline 71}
{col 9}Name                          {col 44}Formula
{col 9}{hline 71}
{col 9}Uniform                       {col 44}{it:k}({it:z}) = 1            {col 64}if {it:z} < 1
                                     {col 44}{it:k}({it:z}) = 0            {col 64}otherwise

{col 9}Normal                        {col 44}{it:k}({it:z}) = exp(-{it:z}²/2)

{col 9}Truncated normal              {col 44}{it:k}({it:z}) = exp(-{it:z}²/2){col 64}if {it:d_ig} < {it:h_g}·{it:t}
                                     {col 44}{it:k}({it:z}) = 0            {col 64}otherwise

{col 9}Negative exponential          {col 44}{it:k}({it:z}) = exp(-3{it:z})

{col 9}Truncated negative exponential{col 44}{it:k}({it:z}) = exp(-3{it:z}){col 64}if {it:d_ig} < {it:h_g}·{it:t}
                                     {col 44}{it:k}({it:z}) = 0            {col 64}otherwise

{col 9}Quartic                       {col 44}{it:k}({it:z}) = (1-{it:z}²)² {col 64}if {it:z} < 1
                                     {col 44}{it:k}({it:z}) = 0            {col 64}otherwise

{col 9}Triangular                    {col 44}{it:k}({it:z}) = (1-{it:z})   {col 64}if {it:z} < 1
                                     {col 44}{it:k}({it:z}) = 0            {col 64}otherwise

{col 9}Epanechnikov                  {col 44}{it:k}({it:z}) = (1-{it:z}²)  {col 64}if {it:z} < 1
                                     {col 44}{it:k}({it:z}) = 0            {col 64}otherwise
{col 9}{hline 71}


{marker section02}{title:Kernel bandwidth}

{pstd} While the precise form of the kernel function has a moderate influence
       on the estimates of {it:p}({bf:s}) and {it:l}({bf:s}), the kernel
       bandwidth plays a major role in kernel estimation, since it determines
       the degree of smoothing with which {it:p}({bf:s}) and {it:l}({bf:s})
       are estimated: large bandwidths may result in oversmoothing, small
       bandwidths may retain too much local deatail and exhibit spikes at
       isolated event locations (Bailey and Gatrell 1995; Waller and Gotway
       2004). {p_end}

{pstd} {cmd:spkde} allows to choose among three methods for setting the value
       of the kernel bandwidth {it:h_g} at each grid point {bf:s}_{it:g}: fixed
       bandwidth, minimum (weighted) number of data points, and mixed. {p_end}

{pstd} The {it:fixed bandwidth} method sets {it:h_g} = {it:h} at all grid
       points, where {it:h} is a positive number expressed in the same unit
       of measurement (miles, kilometers, meters, pixels, etc.) as the grid
       and data points coordinates. {p_end}

{pstd} The {it:minimum (weighted) number of data points} method sets
       {it:h_g} = {it:r_g}({it:ndp}), where {it:r_g}({it:ndp}) is the radius
       of the smallest circle centered on {bf:s}_{it:g} that circumscribes at
       least {it:ndp} (weighted) data points. The quantity {it:ndp} can
       express either an unweighted or a weighted number of data points. In
       the latter case, the weight associated with each data point must be
       stored in variable {it:ndpwvar} and specified with option
       {opt ndpw(ndpwvar)}. For discussion and illustration of this
       adaptive method and its applications see, e.g., Bailey and Gatrell
       (1995), Brundson (1995), Talbot {it:et al.} (2000). {p_end}

{pstd} The {it:mixed} method is a combination of the other two
       methods. Expressly, it sets {it:h_g} = {it:h} if the circle centered
       on {bf:s}_{it:g} and having radius {it:h} circumscribes at least
       {it:ndp} (weighted) data points, and {it:h_g} = {it:r_g}({it:ndp})
       otherwise. {p_end}


{marker section03}{title:Edge correction}

{pstd} For bounded and truncated {help spkde##section01:kernel functions}, the
       nominal value of the area over which the kernel function is evaluated at
       each grid point {bf:s}_{it:g} (see {help spkde##eq1:Equation 1} above)
       equals the area of the kernel window, i.e., of the circle centered on
       {bf:s}_{it:g} and having radius {it:h_g} (for bounded functions) or
       {it:h_g·t} (for truncated functions, {it:t} being the truncation
       parameter). Formally: {it:A_g} = 3.1416 · {it:h_g}² for bounded
       functions, and {it:A_g}{space 1}= 3.1416 · ({it:h_g·t})² for truncated
       functions. {p_end}

{pstd} For grid points located near the edges of the study region, a greater
       or lesser portion of their kernel window may lie outside the study
       region, so that the densities/intensities computed with the
       standard formula at such grid points may result underestimated
       (Bailey and Gatrell 1995). {p_end}

{pstd} Although an accurate assessment of the impact of edge effects and
       the development of proper corrections remain an open area of
       research (Waller and Gotway 2004), some adjustments are
       possible. {cmd:spkde} implements an approximate edge correction
       that consists in rescaling the area of the kernel window at each
       grid point {bf:s}_{it:g} by a factor {it:ec_g} that approximately
       equals the proportion of the kernel window lying within the study
       region. After this correction, {it:A_g} = 3.1416 · {it:h_g}² · {it:ec_g}
       for bounded functions, and {it:A_g} = 3.1416 · ({it:h_g·t})² · {it:ec_g}
       for truncated functions. {p_end}

{pstd} In {cmd:spkde}, the approximate edge correction described above can
       be applied to the estimation of {it:l}({bf:s}) and {it:p}({bf:s})
       only when a bounded or truncated kernel function is used. This
       includes the uniform, quartic, triangular, Epanechnicov, truncated
       normal, and truncated negative exponential functions. {p_end}


{marker section04}{title:Input datasets}

{pstd} {cmd:spkde} makes use of two input datasets: {it:datapoints} and
       {it:gridpoints}. {p_end}

{pstd} {it:datapoints} is the dataset that resides in memory when {cmd:spkde}
       is run. It consists of {it:n} observations, one for each data point
       making up the spatial point pattern of interest. At the minimum, it
       must include {help spkde##main:{it:xvar}}, a numeric variable that
       contains the x-coordinate of data points, and
       {help spkde##main:{it:yvar}}, a numeric variable that contains the
       y-coordinate of data points. When each data point “hosts” more than
       one object of {it:J}>0 different types, then {cmd:spkde} must be run
       specifying a {help spkde##syntax:{it:varlist}} made of {it:J} numeric
       variables, each containing the number of objects of a given type
       located at each data point; in this case, dataset {it:datapoints} must
       also include the {it:J} variables specified in {it:varlist}. Finally,
       dataset {it:datapoints} can optionally include
       {help spkde##estpar:{it:ndpwvar}}, a numeric variable sometimes
       used to set the value of the {help spkde##section02:kernel bandwidth}.
       {p_end}

{pstd} {it:gridpoints} is the {help spkde##syntax:using} dataset invoked by
       {cmd:spkde} at run time. It is generated by the user-written Stata
       program {help spgrid} and contains the spatial grid used by {cmd:spkde}
       to compute the kernel estimates of {it:p}({bf:s}_{it:g}) and
       {it:l}({bf:s}_{it:g}). {p_end}


{marker section05}{title:Output dataset}

{pstd} {cmd:spkde} routinely generates {it:kernest}, a Stata dataset that
       consists of {it:G} observations {hline 1} one for each grid point
       {bf:s}_{it:g} {hline 1} and includes the following variables: {p_end}

{phang2}{space 1}o{space 2}{bf:spgrid_id} is a numeric variable that uniquely
                           identifies the cells making up the grid. {p_end}

{phang2}{space 1}o{space 2}{bf:spgrid_xcoord} is a numeric variable that
                           contains the x-coordinate of each grid point.
                           {p_end}

{phang2}{space 1}o{space 2}{bf:spgrid_ycoord} is a numeric variable that
                           contains the y-coordinate of each grid point.
                           {p_end}

{phang2}{space 1}o{space 2}{bf:bandwidth} is a numeric variable that
                           contains the values of {it:h_g} (see
                           {help spkde##eq1:above}). {p_end}

{phang2}{space 1}o{space 2}{bf:ndp} is a numeric variable that contains
                           the number of data points used for kernel
                           estimation at each grid point. {p_end}

{phang2}{space 1}o{space 2}{bf:A} is a numeric variable that contains
                           the values of {it:A_g} (see
                           {help spkde##eq1:above}). {p_end}

{pstd} If no {help spkde##syntax:{it:varlist}} is specified, three additional
       variables are included in dataset {it:kernest}: {p_end}

{phang2}{space 1}o{space 2}{bf:c} is a numeric variable that contains the
                           value of {it:c} (see {help spkde##eq1:above}).
                           {p_end}

{phang2}{space 1}o{space 2}{bf:lambda} is a numeric variable that contains
                           the kernel estimate of {it:l}({bf:s}_{it:g})
                           (see {help spkde##eq1:above}). {p_end}

{phang2}{space 1}o{space 2}{bf:p} is a numeric variable that contains
                           the kernel estimate of {it:p}({bf:s}_{it:g})
                           (see {help spkde##eq2:above}). {p_end}

{pstd} On the other hand, if a {help spkde##syntax:{it:varlist}} is specified,
       for each variable {it:varname} in {it:varlist} three additional
       variables are included in dataset {it:kernest}: {p_end}

{phang2}{space 1}o{space 2}{it:varname}{bf:_c} is a numeric variable that
                           contains the value of {it:c} for objects of type
                           {it:varname}. {p_end}

{phang2}{space 1}o{space 2}{it:varname}{bf:_lambda} is a numeric variable
                           that contains the kernel estimate of
                           {it:l}({bf:s}_{it:g}) for objects of type
                           {it:varname}. {p_end}

{phang2}{space 1}o{space 2}{it:varname}{bf:_p} is a numeric variable
                           that contains the kernel estimate of
                           {it:p}({bf:s}_{it:g}) for objects of type
                           {it:varname}. {p_end}

{pstd} When option {opt ndpw(ndpwvar)} is specified, an additional
       variable is included in dataset {it:kernest}: {p_end}

{phang2}{space 1}o{space 2}{bf:wndp} is a numeric variable that contains
                           the weighted number of data points used for kernel
                           estimation at each grid point. {p_end}

{pstd} Finally, when option {opt edgecorrection} is specified, an additional
       variable is included in dataset {it:kernest}: {p_end}

{phang2}{space 1}o{space 2}{bf:edgecorrect} is a numeric variable that
                           contains the edge correction factor {it:ec_g}
                           (see {help spkde##section03:above}). {p_end}


{marker section06}{title:Visualization of kernel estimates}

{pstd} The kernel estimates of {it:l}({bf:s}_{it:g}) and {it:p}({bf:s}_{it:g})
       generated by {cmd:spkde} can be visualized using the user-written Stata
       program {help spmap}. {p_end}

{pstd} To this purpose, two datasets are needed: {p_end}

{phang2}{space 1}o{space 2}{it:kernest} is the output dataset routinely
                           generated by {cmd:spkde} (see section
                           {help spkde##section05:Output dataset}
                           above). {p_end}

{phang2}{space 1}o{space 2}{it:gridcells} is one of the two output datasets
                           generated by the user-written Stata program
                           {help spgrid##section04:spgrid}. {p_end}

{pstd} To visualize the kernel estimates of interest, {help spmap} must be run
       with {it:kernest} as the {help spmap##sd_master:{it:master} dataset},
       and {it:gridcells} as the {help spmap##sd_basemap:{it:basemap} dataset}.
       {p_end}


{marker section07}{title:Alternative applications}

{pstd} Although {cmd:spkde} has been designed to carry out kernel estimation
       of density and intensity functions for two-dimensional spatial point
       patterns, it can be used also for estimating the joint probability
       density function {it:p}({it:x},{it:y}) of any pair of quantitative
       variables {it:X} and {it:Y} (see section 
       {help spkde##ex2:Examples 2 {hline 1} Alternative applications} below).
       {p_end}


{marker options2}{title:Options}

{dlgtab:Main}

{phang}
{opt xcoord(xvar)} specifies the name of the variable containing the
   x-coordinate of each data point {bf:s}_{it:i}. {p_end}

{phang}
{opt ycoord(yvar)} specifies the name of the variable containing the
   y-coordinate of each data point {bf:s}_{it:i}. {p_end}

{phang}
{opt kernel(kf)} specifies the basic type of kernel function
   to be used in the estimation of {it:p}({bf:s}) and
   {it:l}({bf:s}) (for details, see section
   {help spkde##section01:Kernel function} above). {p_end}

{phang2}{cmd:kernel(quartic)} is the default and requests that the
   quartic kernel function be used. {p_end}

{phang2}{cmd:kernel(uniform)} requests that the uniform kernel function
   be used. {p_end}

{phang2}{cmd:kernel(normal)} requests that the normal kernel function
   be used. {p_end}

{phang2}{cmd:kernel(negexp)} requests that the negative exponential kernel
   function be used. {p_end}

{phang2}{cmd:kernel(triangular)} requests that the triangular kernel function
   be used. {p_end}

{phang2}{cmd:kernel(epanechnikov)} requests that the Epanechnikov kernel
   function be used. {p_end}

{phang}
{opt truncated(t)} applies only when option {cmd:kernel(normal)} or
   {cmd:kernel(negexp)} is specified. It requests that, for each grid
   point {bf:s}_{it:g}, the selected kernel function be evaluated only
   within a distance {it:h_g·t} from {bf:s}_{it:g}, where {it:t} is
   a positive number. {p_end}

{phang}
{opt bandwidth(method)} specifies the method to be used for setting the value
   of the kernel bandwidth {it:h_g} at each grid point {bf:s}_{it:g} (for
   details, see section {help spkde##section02:Kernel bandwidth} above).
   {p_end}

{phang2}{cmd:bandwidth(fbw)} requests that the {bf:f}ixed {bf:b}and{bf:w}idth
   method be used. {p_end}

{phang2}{cmd:bandwidth(ndp)} requests that the minimum (weighted) {bf:n}umber
   of {bf:d}ata {bf:p}oints method be used. {p_end}

{phang2}{cmd:bandwidth(mixed)} requests that the mixed method be used. {p_end}

{dlgtab:Estimation parameters}

{phang}
{cmd:fbw(ad}{it:q}|{it:b}{cmd:)} sets the value of {it:h} as
   defined in section {help spkde##section02:Kernel bandwidth} above. {p_end}

{phang2}{cmd:fbw(ad}{it:q}{cmd:)} sets {it:h} = {it:adq}, where
   {it:adq} equals the average distance between each data point
   {bf:s}_{it:i} and its {it:q} nearest data points. {p_end}

{phang2}{opt fbw(b)} sets {it:h} = {it:b}. {p_end}

{phang}
{opt ndp(d)} sets the value of {it:ndp} as defined in section
   {help spkde##section02:Kernel bandwidth} above, namely
   {it:ndp} = {it:d}. {p_end}

{phang}
{opt ndpw(ndpwvar)} specifies that {it:ndp} denotes a weighted number of
   data points and the weights are stored in variable {it:ndpwvar}.
   {p_end}

{phang}
{cmd:edgecorrection} requests that an approximate edge correction
   be applied to the estimation of {it:p}({bf:s}) and
   {it:l}({bf:s}) (for details, see section
   {help spkde##section03:Edge correction} above). {p_end}

{dlgtab:Reporting}

{phang}
{cmd:dots} requests that job progression dots be displayed. {p_end}

{phang}
{cmd:noverbose} requests that the display of every indicator of job
   progression be suppressed. {p_end}

{dlgtab:Saving results}

{phang}
{cmdab:sav:ing(}{it:kernest} [{cmd:, replace}]{cmd:)}} requests that the
   kernel estimates of {it:p}({bf:s}_{it:g}) and {it:l}({bf:s}_{it:g}) 
   ({it:g} = 1, ..., {it:G}), along with a set of auxiliary variables, be
   saved to dataset {help spkde##section05:{it:kernest}}. If specified,
   suboption {cmd:replace} requests that dataset
   {help spkde##section05:{it:kernest}} be overwritten if already existing.
   {p_end}


{marker ex1}{title:Examples 1 {hline 1} Standard applications}
{cmd}
    . spgrid using "Italy-OutlineCoordinates.dta",   ///
        resolution(w10) unit(kilometers)             ///
        cells("GridCells.dta")                       ///
        points("GridPoints.dta")                     ///
        replace compress dots

    . use "Italy-DataPoints.dta", clear
    . spkde using "GridPoints.dta",     ///
        xcoord(xcoord) ycoord(ycoord)   ///
        bandwidth(fbw) fbw(100) dots    ///
        saving("Kde.dta", replace)
    . use "Kde.dta", clear
    . spmap lambda using "GridCells.dta",     ///
         id(spgrid_id) clnum(20)              ///
         fcolor(Rainbow) ocolor(none ..)      ///
         legend(off)                          ///
         point(data("Italy-DataPoints.dta")   ///
         x(xcoord) y(ycoord))

    . use "Italy-DataPoints.dta", clear
    . spkde using "GridPoints.dta",     ///
        xcoord(xcoord) ycoord(ycoord)   ///
        bandwidth(fbw) fbw(100) dots    ///
        edgecorrect                     ///
        saving("Kde.dta", replace)
    . use "Kde.dta", clear
    . spmap lambda using "GridCells.dta",     ///
         id(spgrid_id) clnum(20)              ///
         fcolor(Rainbow) ocolor(none ..)      ///
         legend(off)                          ///
         point(data("Italy-DataPoints.dta")   ///
         x(xcoord) y(ycoord))

    . use "Italy-DataPoints.dta", clear
    . spkde using "GridPoints.dta",     ///
        xcoord(xcoord) ycoord(ycoord)   ///
        kernel(normal)                  ///
        bandwidth(fbw) fbw(ad5) dots    ///
        saving("Kde.dta", replace)
    . use "Kde.dta", clear
    . spmap lambda using "GridCells.dta",     ///
         id(spgrid_id) clnum(20)              ///
         fcolor(Rainbow) ocolor(none ..)      ///
         legend(off)                          ///
         point(data("Italy-DataPoints.dta")   ///
         x(xcoord) y(ycoord))

    . use "Italy-DataPoints.dta", clear
    . spkde using "GridPoints.dta",     ///
        xcoord(xcoord) ycoord(ycoord)   ///
        bandwidth(ndp) ndp(4) dots      ///
        saving("Kde.dta", replace)
    . use "Kde.dta", clear
    . spmap lambda using "GridCells.dta",     ///
         id(spgrid_id) clnum(20)              ///
         fcolor(Rainbow) ocolor(none ..)      ///
         legend(off)                          ///
         point(data("Italy-DataPoints.dta")   ///
         x(xcoord) y(ycoord))

    . use "Italy-DataPoints.dta", clear
    . spkde dcvd95 pop95 using "GridPoints.dta",   ///
        xcoord(xcoord) ycoord(ycoord)              ///
        bandwidth(fbw) fbw(100) dots               ///
        saving("Kde.dta", replace)
    . use "Kde.dta", clear
    . generate ratio = dcvd95_lambda / pop95_lambda * 1000
    . spmap ratio using "GridCells.dta",      ///
         id(spgrid_id) clnum(20)              ///
         fcolor(Rainbow) ocolor(none ..)      ///
         legend(off)
{txt}

{marker ex2}{title:Examples 2 {hline 1} Alternative applications}

{pstd} As mentioned {help spkde##section07:above}, {cmd:spkde} can be used
       also for estimating the joint probability density function of any
       pair of quantitative variables (for an alternative, see Stata
       program {help kdens2}, written by Christopher F. Baum and available
       from the Boston SSC Archive). {help spmap} can then be used to
       generate the corresponding density plot. To this purpose, it is
       advised to make use of {help mylabels}, a Stata program written
       by Nicholas J. Cox and available from the Boston SSC Archive. {p_end}

{pstd} As an example, let us estimate and plot the bivariate probability
       density function for two of the variables included in the {bf:auto}
       dataset: {bf:mpg} and {bf:price}. This can be done in four steps as
       follows: {p_end}

{pstd} 1. Normalize variables in the range [0,1] {p_end}
{cmd}
    . sysuse "auto.dta", clear
    . summarize price mpg
    . clonevar x = mpg
    . clonevar y = price
    . replace x = (x-0) / (50-0)
    . replace y = (y-0) / (20000-0)
    . mylabels 0(10)50, myscale((@-0) / (50-0)) local(XLAB)
    . mylabels 0(5000)20000, myscale((@-0) / (20000-0)) local(YLAB)
    . keep x y
    . save "xy.dta", replace
{txt}
{pstd} 2. Generate a 100x100 grid {p_end}
{cmd}
    . spgrid, shape(hexagonal) xdim(100)   ///
        xrange(0 1) yrange(0 1)            ///
        dots replace                       ///
        cells("2D-GridCells.dta")          ///
        points("2D-GridPoints.dta")
{txt}
{pstd} 3. Estimate the bivariate probability density function {p_end}
{cmd}
    . spkde using "2D-GridPoints.dta",   /// 
        xcoord(x) ycoord(y)              ///
        bandwidth(fbw) fbw(0.1) dots     ///
        saving("2D-Kde.dta", replace)
{txt}
{pstd} 4. Display the density plot {p_end}
{cmd}
    . use "2D-Kde.dta", clear
    . recode lambda (.=0)
    . spmap lambda using "2D-GridCells.dta",      ///
        id(spgrid_id) clnum(20) fcolor(Rainbow)   ///
        ocolor(none ..) legend(off)               ///
        point(data("xy.dta") x(x) y(y))           ///
        freestyle aspectratio(1)                  ///
        xtitle(" " "Mileage (mpg)")               ///
        xlab(`XLAB')                              ///
        ytitle("Price" " ")                       ///
        ylab(`YLAB', angle(0))
{txt}

{title:Acknowledgments}

{p 4 4 2} I wish to thank Nick Cox for helpful suggestions.


{title:Author}

{p 4} Maurizio Pisati {p_end}
{p 4} Department of Sociology and Social Research {p_end}
{p 4} University of Milano Bicocca - Italy {p_end}
{p 4} {browse "mailto:maurizio.pisati@unimib.it":maurizio.pisati@unimib.it}


{title:References}

{p 4 8 2}Bailey, T.C. and A.C. Gatrell. 1995. {it:Interactive Spatial Data}
         {it:Analysis}. Harlow: Longman. {p_end}

{p 4 8 2}Brundson, C. 1995. Estimating Probability Surfaces for Geographical
         Point Data: An Adaptive Kernel
         Algorithm.{it: Computers & Geosciences} 21: 877{c -}894. {p_end}

{p 4 8 2}Talbot, T.O., Kulldorff, M., Forand, S.P. and
         V.B. Haley. 2007. Evaluation of Spatial Filters To Create Smoothed
         Maps of Health Data. {it:Statistics in Medicine} 19: 2399{c -}2408.
         {p_end}

{p 4 8 2}Waller, L.A. and C.A. Gotway. 2004. {it:Applied Spatial Statistics}
         {it:for Public Health Data}. Hoboken NJ: Wiley. {p_end}


{title:Also see}

{psee}
Online:  {helpb spgrid} (if installed), {helpb spmap} (if installed),
         {helpb mylabels} (if installed)
{p_end}