 A high-performance general-purpose compute library
nearestNeighbour

Determine the nearest neighbouring points to a given set of points. More...

## Functions

AFAPI void nearestNeighbour (array &idx, array &dist, const array &query, const array &train, const dim_t dist_dim=0, const unsigned n_dist=1, const af_match_type dist_type=AF_SSD)
C++ interface wrapper for determining the nearest neighbouring points to a given set of points. More...

AFAPI af_err af_nearest_neighbour (af_array *idx, af_array *dist, const af_array query, const af_array train, const dim_t dist_dim, const unsigned n_dist, const af_match_type dist_type)
C++ interface wrapper for determining the nearest neighbouring points to a given set of points. More...

## Detailed Description

Determine the nearest neighbouring points to a given set of points.

A "point" is simply a geometric point's coordinates in an n-dimensional space, which can be specified along the dimension specified by dist_dim (can be 0 or 1). A list of such points can be enumerated along the dimension other than the one specified by dist_dim (excluding dim2 and dim3). By default, dist_dim is 0, so a point's coordinates in this case must be specified along dim0, and the list of points must be enumerated along dim1. Consequently, if dist_dim is 1, then a point's coordinates must be specified along dim1, and the list must be enumerated along dim0.

The arrays train and query are both a list of points, and one must have the same data layout as the other. This function calculates which points in the train are nearest to each point in query, based on the distance metric specified by dist_type: AF_SAD (sum of absolute differences), AF_SSD (sum of squared differences, the default option), or AF_SHD (hamming distance). The resulting n_dist nearest neighboring points are described in two output arrays:

• idx: contains the index of each result that corresponds to the point in train
• dist: contains the distance from the query point to the result's corresponding point in train

In both the output arrays idx and dist, the nearest neighbor results for a single query are enumerated along dim0, in which the $$ith$$ result is the $$ith$$ nearest point to the query point. The result set for each query point is placed along dim1 (columns) of idx and dist, in the order that the queries appear in query. Therefore, the output arrays will have a shape of n_dist $$\times$$ the number of queries (regardless of the data layout of the input arrays, or the value of dist_dim).

For illustration, a simple example is given below for 1 query in 1-dimensional space. There are 6 points in train, and 3 nearest neighbors are queried for (n_dist is 3), so there are 3 elements in the results for this single query, enumerated along dim0. The results idx and dist contain the 3 points closest to 1.25, ordered from nearest to farthest: point 0 in train (1.) with an SSD distance of 0.0625 from the query, point 1 (2.) with a distance of 0.5625, and point 2 (3.) with a distance of 3.0625.

float h_pts = {1.f, 2.f, 3.f, 8.f, 9.f, 10.f};
array pts(dim4(1, 6), h_pts);
// 1. 2. 3. 8. 9. 10.
float h_query = 1.25f;
array query(dim4(1), &h_query);
// 1.25
array idx;
array dist;
nearestNeighbour(idx, dist, query, pts, 0, 3);
// idx
// 0.
// 1.
// 2.
//
// dist
// 0.0625
// 0.5625
// 3.0625

A slightly more complicated example is given below. There are 2 query points and 6 train points, and they are in 3-dimensional space (each point's coordinates are specified along dim0, and the list of points is enumerated along dim1). Note that in the output arrays idx and dist, there are 2 sets of results now, one for each query. The result set located on the the first column of idx and dist correspond to the first query (the first column in query), and the result set on the second column of idx and dist correspond to the second query (second column in query). Thus, for example, the second query point is (7.5, 9., 1.), and the point closest to it in train is point 3, which is (8., 9., 1.), which has a SSD distance of 0.25 from the query point.

float h_pts = {0.f, 0.f, 0.f, 1.f, 0.f, 0.f, 0.f, 1.f, 0.f,
8.f, 9.f, 1.f, 9.f, 8.f, 1.f, 9.f, 9.f, 1.f};
array pts(dim4(3, 6), h_pts);
// 0. 1. 0. 8. 9. 9.
// 0. 0. 1. 9. 8. 9.
// 0. 0. 0. 1. 1. 1.
float h_query = {1.5f, 0.f, 0.f, 7.5f, 9.f, 1.f};
array query(dim4(3, 2), h_query);
// 1.5 7.5
// 0. 9.
// 0. 1.
array idx;
array dist;
nearestNeighbour(idx, dist, query, pts, 0, 3);
// idx
// 1 3
// 0 5
// 2 4
//
// dist
// 0.25 0.25
// 2.25 2.25
// 3.25 3.25
AFAPI void nearestNeighbour(array &idx, array &dist, const array &query, const array &train, const dim_t dist_dim=0, const unsigned n_dist=1, const af_match_type dist_type=AF_SSD)
C++ interface wrapper for determining the nearest neighbouring points to a given set of points.

Note that it does not make sense for the train and query array shapes to have a third and fourth dimension, because a 2-dimensional array is sufficient to describe a list of points, no matter how long the list is or how many dimensions in space do the points span. Therefore, this function requires both input arrays to be at most 2-dimensional.

## ◆ af_nearest_neighbour()

 AFAPI af_err af_nearest_neighbour ( af_array * idx, af_array * dist, const af_array query, const af_array train, const dim_t dist_dim, const unsigned n_dist, const af_match_type dist_type )

C++ interface wrapper for determining the nearest neighbouring points to a given set of points.

Parameters
 [out] idx is an array of $$M \times N$$ size, where $$M$$ is n_dist and $$N$$ is the number of queries. The value at position $$i,j$$ is the index of the point in train along dim1 (if dist_dim is 0) or along dim 0 (if dist_dim is 1), with the $$ith$$ smallest distance to the $$jth$$ query point. [out] dist is an array of $$M \times N$$ size, where $$M$$ is n_dist and $$N$$ is the number of queries. The value at position $$i,j$$ is the distance from the $$jth$$ query point to the point in train referred to by idx( $$i,j$$). This distance is computed according to the dist_type chosen. [in] query is the array containing the points to be queried. The points must be described along dim0 and listed along dim1 if dist_dim is 0, or vice versa if dist_dim is 1. [in] train is the array containing the points used as training data. The points must be described along dim0 and listed along dim1 if dist_dim is 0, or vice versa if dist_dim is 1. [in] dist_dim indicates the dimension that the distance computation will use to determine a point's coordinates. The train and query arrays must both use this dimension for describing a point's coordinates [in] n_dist is the number of nearest neighbour points to return (currently only values <= 256 are supported) [in] dist_type is the distance computation type. Currently AF_SAD (sum of absolute differences), AF_SSD (sum of squared differences), and AF_SHD (hamming distances) are supported.

## ◆ nearestNeighbour()

 AFAPI void nearestNeighbour ( array & idx, array & dist, const array & query, const array & train, const dim_t dist_dim = 0, const unsigned n_dist = 1, const af_match_type dist_type = AF_SSD )

C++ interface wrapper for determining the nearest neighbouring points to a given set of points.

Parameters
 [out] idx is an array of $$M \times N$$ size, where $$M$$ is n_dist and $$N$$ is the number of queries. The value at position $$i,j$$ is the index of the point in train along dim1 (if dist_dim is 0) or along dim 0 (if dist_dim is 1), with the $$ith$$ smallest distance to the $$jth$$ query point. [out] dist is an array of $$M \times N$$ size, where $$M$$ is n_dist and $$N$$ is the number of queries. The value at position $$i,j$$ is the distance from the $$jth$$ query point to the point in train referred to by idx( $$i,j$$). This distance is computed according to the dist_type chosen. [in] query is the array containing the points to be queried. The points must be described along dim0 and listed along dim1 if dist_dim is 0, or vice versa if dist_dim is 1. [in] train is the array containing the points used as training data. The points must be described along dim0 and listed along dim1 if dist_dim is 0, or vice versa if dist_dim is 1. [in] dist_dim indicates the dimension that the distance computation will use to determine a point's coordinates. The train and query arrays must both use this dimension for describing a point's coordinates [in] n_dist is the number of nearest neighbour points to return (currently only values <= 256 are supported) [in] dist_type is the distance computation type. Currently AF_SAD (sum of absolute differences), AF_SSD (sum of squared differences), and AF_SHD (hamming distances) are supported.