
The whole point of dataex is to prepare a small data example that people can play with in view of presenting a self-contained solution, with everything needed to understand it in the post. There is no need to use Stata or any additional downloading steps to understand the following solution:

* reduce to observations within the desired distance
bysort id: keep if inrange(x_koord_0, x_koord-1, x_koord+1) & ///
drop *_0

Again however, the solution does not include all the information needed to understand it, because the nature of the data is not revealed without the extra step of downloading a copy of the data, running the example, or running the part of the example that loads the data in memory.

If these are projected coordinates (as opposed to latitude and longitude in decimal degrees), you can indeed convert them back to lat/lon (using some GIS software) and then use geonear (from SSC) to find the nearest neighbors. That would require that you know what map projection was used to generate your coordinates.

If you can't get the coordinates back to lat/lon, then you will essentially be measuring distances on a map and not on the surface of the earth. These are invariably inaccurate, as you cannot represent a spheroid on a plane in a way that respects distances between points. If you end up having to work with projected coordinates, however, you can still use geonear to find the nearest neighbors. The trick is to scale your coordinates so that they appear to be near the equator. At those latitudes, the distance covered by one degree of latitude is about the same as the distance covered by one degree of longitude.

Here's an example of how this would work with points that have coordinates with values that are similar to those you showed in your example.

Code:
* generate 10 points with coordinates similar to original post
* form all pairwise combinations to be able to calculate all distances
* now rescale so that coordinates are near the equator
* calculate geographic distances (-geodist- is from SSC)
* since these are not meaningful coordinates, use a radius of 1
geodist lat1 lon1 lat2 lon2, gen(dg) radius(1)
* use a point in the middle of the map to calculate a scaling factor
* to convert back the geographic distance to units that are proportional
* locate a point that is 1 unit away from the midpoint in both directions
* the cartesian distance between the two points
* repeat the scaling and calculate geographic distances
geodist `ymid1' `xmid1' `ymid2' `xmid2', radius(1)
* show that the distances are very close once scaled properly
gen error = abs(dgscaled - dc)

The use of cross to form all pairwise combinations shows the brute force approach to finding the nearest neighbors. For each point, you need to calculate the distance to every other point in the data to determine the nearest neighbors of that point. This works well for small problems but quickly becomes unmanageable if there are more than 10K points (100 million distances). geonear was designed to tackle large problems with millions of points (which would require more than a trillion distances to be calculated). It uses a divide and conquer approach that dramatically reduces the number of distances that need to be calculated.

Code:
* generate points with coordinates similar to the following post:
* reduce to neighbors within a specific distance
geonear2d id y x using "`f'", n(id1 y1 x1) within(100)
* the number of neighbors within 10 units of distance
* confirm that -geonear2d- results match exactly those from using -cross-
assert dc == d_to_id1

I reran the geonear2d part of the example above with 800K points and it took less than 3 minutes to finish on my computer.

Number of observations (_N) was 0, now 800,000
* get all neighbors within 10 units of distance using cartesian distances
geonear2d id y x using "`f'", n(id1 y1 x1) within(10)
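For what it's worth, here is a minimal sketch of the scaling trick and the brute force -cross- comparison described above. It is not the code from the original post. The data are made up (10 hypothetical points with arbitrary planar coordinates), and the local names xmid, ymid and span, the choice of shrink factor, and the conversion by 180/pi are all my assumptions. It also assumes that -geodist- with radius(1) returns the great-circle angle in radians.

```stata
* hypothetical data: 10 points with projected coordinates (arbitrary units)
clear
set seed 1357
set obs 10
gen long id = _n
gen double x = 450000 + runiform() * 1000
gen double y = 6200000 + runiform() * 1000

* brute force: pair every point with every other point using -cross-
preserve
rename (id x y) (id1 x1 y1)
tempfile f
save "`f'"
restore
cross using "`f'"
drop if id == id1

* cartesian distance between the two points of each pair
gen double dc = sqrt((x - x1)^2 + (y - y1)^2)

* shift and shrink both axes by the same factor (an assumption of this
* sketch) so that every point lands within half a degree of lat 0, lon 0
summarize x, meanonly
local xmid = (r(min) + r(max)) / 2
local span = r(max) - r(min)
summarize y, meanonly
local ymid = (r(min) + r(max)) / 2
local span = max(`span', r(max) - r(min))
gen double lat1 = (y  - `ymid') / `span'
gen double lon1 = (x  - `xmid') / `span'
gen double lat2 = (y1 - `ymid') / `span'
gen double lon2 = (x1 - `xmid') / `span'

* geographic distances on a sphere of radius 1 (-geodist- is from SSC)
geodist lat1 lon1 lat2 lon2, gen(dg) radius(1)

* assuming the result is an angle in radians: convert it to degrees
* and undo the shrink factor to get back to the original units
gen double dgscaled = dg * (180 / c(pi)) * `span'

* the two sets of distances should agree closely
gen double error = abs(dgscaled - dc)
summarize dc dgscaled error
```

If the scaling is right, error should be tiny relative to dc, because all rescaled points sit within half a degree of the equator, where a degree of longitude and a degree of latitude cover nearly the same distance.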
