A high-performance general-purpose compute library

Determine the nearest neighbouring points to a given set of points. More...

Functions

AFAPI void nearestNeighbour (array &idx, array &dist, const array &query, const array &train, const dim_t dist_dim=0, const unsigned n_dist=1, const af_match_type dist_type=AF_SSD)
 C++ interface wrapper for determining the nearest neighbouring points to a given set of points. More...
 
AFAPI af_err af_nearest_neighbour (af_array *idx, af_array *dist, const af_array query, const af_array train, const dim_t dist_dim, const unsigned n_dist, const af_match_type dist_type)
 C++ interface wrapper for determining the nearest neighbouring points to a given set of points. More...
 

Detailed Description

Determine the nearest neighbouring points to a given set of points.

A "point" is simply a geometric point's coordinates in an n-dimensional space, which can be specified along the dimension specified by dist_dim (can be 0 or 1). A list of such points can be enumerated along the dimension other than the one specified by dist_dim (excluding dim2 and dim3). By default, dist_dim is 0, so a point's coordinates in this case must be specified along dim0, and the list of points must be enumerated along dim1. Consequently, if dist_dim is 1, then a point's coordinates must be specified along dim1, and the list must be enumerated along dim0.

The arrays train and query are both a list of points, and one must have the same data layout as the other. This function calculates which points in the train are nearest to each point in query, based on the distance metric specified by dist_type: AF_SAD (sum of absolute differences), AF_SSD (sum of squared differences, the default option), or AF_SHD (hamming distance). The resulting n_dist nearest neighboring points are described in two output arrays:

In both the output arrays idx and dist, the nearest neighbor results for a single query are enumerated along dim0, in which the \(ith\) result is the \(ith\) nearest point to the query point. The result set for each query point is placed along dim1 (columns) of idx and dist, in the order that the queries appear in query. Therefore, the output arrays will have a shape of n_dist \(\times\) the number of queries (regardless of the data layout of the input arrays, or the value of dist_dim).

For illustration, a simple example is given below for 1 query in 1-dimensional space. There are 6 points in train, and 3 nearest neighbors are queried for (n_dist is 3), so there are 3 elements in the results for this single query, enumerated along dim0. The results idx and dist contain the 3 points closest to 1.25, ordered from nearest to farthest: point 0 in train (1.) with an SSD distance of 0.0625 from the query, point 1 (2.) with a distance of 0.5625, and point 2 (3.) with a distance of 3.0625.

float h_pts[6] = {1.f, 2.f, 3.f, 8.f, 9.f, 10.f};
array pts(dim4(1, 6), h_pts);
// 1. 2. 3. 8. 9. 10.
float h_query = 1.25f;
array query(dim4(1), &h_query);
// 1.25
array idx;
array dist;
nearestNeighbour(idx, dist, query, pts, 0, 3);
// idx
// 0.
// 1.
// 2.
//
// dist
// 0.0625
// 0.5625
// 3.0625

A slightly more complicated example is given below. There are 2 query points and 6 train points, and they are in 3-dimensional space (each point's coordinates are specified along dim0, and the list of points is enumerated along dim1). Note that in the output arrays idx and dist, there are 2 sets of results now, one for each query. The result set located on the the first column of idx and dist correspond to the first query (the first column in query), and the result set on the second column of idx and dist correspond to the second query (second column in query). Thus, for example, the second query point is (7.5, 9., 1.), and the point closest to it in train is point 3, which is (8., 9., 1.), which has a SSD distance of 0.25 from the query point.

float h_pts[18] = {0.f, 0.f, 0.f, 1.f, 0.f, 0.f, 0.f, 1.f, 0.f,
8.f, 9.f, 1.f, 9.f, 8.f, 1.f, 9.f, 9.f, 1.f};
array pts(dim4(3, 6), h_pts);
// 0. 1. 0. 8. 9. 9.
// 0. 0. 1. 9. 8. 9.
// 0. 0. 0. 1. 1. 1.
float h_query[6] = {1.5f, 0.f, 0.f, 7.5f, 9.f, 1.f};
array query(dim4(3, 2), h_query);
// 1.5 7.5
// 0. 9.
// 0. 1.
array idx;
array dist;
nearestNeighbour(idx, dist, query, pts, 0, 3);
// idx
// 1 3
// 0 5
// 2 4
//
// dist
// 0.25 0.25
// 2.25 2.25
// 3.25 3.25
AFAPI void nearestNeighbour(array &idx, array &dist, const array &query, const array &train, const dim_t dist_dim=0, const unsigned n_dist=1, const af_match_type dist_type=AF_SSD)
C++ interface wrapper for determining the nearest neighbouring points to a given set of points.

Note that it does not make sense for the train and query array shapes to have a third and fourth dimension, because a 2-dimensional array is sufficient to describe a list of points, no matter how long the list is or how many dimensions in space do the points span. Therefore, this function requires both input arrays to be at most 2-dimensional.


Function Documentation

◆ af_nearest_neighbour()

AFAPI af_err af_nearest_neighbour ( af_array idx,
af_array dist,
const af_array  query,
const af_array  train,
const dim_t  dist_dim,
const unsigned  n_dist,
const af_match_type  dist_type 
)

C++ interface wrapper for determining the nearest neighbouring points to a given set of points.

Parameters
[out]idxis an array of \(M \times N\) size, where \(M\) is n_dist and \(N\) is the number of queries. The value at position \(i,j\) is the index of the point in train along dim1 (if dist_dim is 0) or along dim 0 (if dist_dim is 1), with the \(ith\) smallest distance to the \(jth\) query point.
[out]distis an array of \(M \times N\) size, where \(M\) is n_dist and \(N\) is the number of queries. The value at position \(i,j\) is the distance from the \(jth\) query point to the point in train referred to by idx( \(i,j\)). This distance is computed according to the dist_type chosen.
[in]queryis the array containing the points to be queried. The points must be described along dim0 and listed along dim1 if dist_dim is 0, or vice versa if dist_dim is 1.
[in]trainis the array containing the points used as training data. The points must be described along dim0 and listed along dim1 if dist_dim is 0, or vice versa if dist_dim is 1.
[in]dist_dimindicates the dimension that the distance computation will use to determine a point's coordinates. The train and query arrays must both use this dimension for describing a point's coordinates
[in]n_distis the number of nearest neighbour points to return (currently only values <= 256 are supported)
[in]dist_typeis the distance computation type. Currently AF_SAD (sum of absolute differences), AF_SSD (sum of squared differences), and AF_SHD (hamming distances) are supported.

◆ nearestNeighbour()

AFAPI void nearestNeighbour ( array idx,
array dist,
const array query,
const array train,
const dim_t  dist_dim = 0,
const unsigned  n_dist = 1,
const af_match_type  dist_type = AF_SSD 
)

C++ interface wrapper for determining the nearest neighbouring points to a given set of points.

Parameters
[out]idxis an array of \(M \times N\) size, where \(M\) is n_dist and \(N\) is the number of queries. The value at position \(i,j\) is the index of the point in train along dim1 (if dist_dim is 0) or along dim 0 (if dist_dim is 1), with the \(ith\) smallest distance to the \(jth\) query point.
[out]distis an array of \(M \times N\) size, where \(M\) is n_dist and \(N\) is the number of queries. The value at position \(i,j\) is the distance from the \(jth\) query point to the point in train referred to by idx( \(i,j\)). This distance is computed according to the dist_type chosen.
[in]queryis the array containing the points to be queried. The points must be described along dim0 and listed along dim1 if dist_dim is 0, or vice versa if dist_dim is 1.
[in]trainis the array containing the points used as training data. The points must be described along dim0 and listed along dim1 if dist_dim is 0, or vice versa if dist_dim is 1.
[in]dist_dimindicates the dimension that the distance computation will use to determine a point's coordinates. The train and query arrays must both use this dimension for describing a point's coordinates
[in]n_distis the number of nearest neighbour points to return (currently only values <= 256 are supported)
[in]dist_typeis the distance computation type. Currently AF_SAD (sum of absolute differences), AF_SSD (sum of squared differences), and AF_SHD (hamming distances) are supported.