ArrayFire is a high performance software library for parallel computing with an easy-to-use API. ArrayFire abstracts away many of the details of programming parallel architectures by providing a high-level container object, the array, that represents data stored on a CPU, GPU, FPGA, or other type of accelerator. This abstraction permits developers to write massively parallel applications in a high-level language where they need not be concerned about low-level optimizations that are frequently required to achieve high throughput on most parallel architectures.
Most of these data types are supported on all modern GPUs; however, some older devices may lack support for double precision arrays. In this case, a runtime error will be generated when the array is constructed.
If not specified otherwise, arrays are created as single precision floating point numbers (f32).
ArrayFire arrays represent memory stored on the device. As such, creation and population of an array will consume memory on the device, which cannot be freed until the
array object goes out of scope. As device memory allocation can be expensive, ArrayFire also includes a memory manager which will re-use device memory whenever possible.
Arrays can be created using one of the array constructors. Below we show how to create 1D, 2D, and 3D arrays with uninitialized values:
However, uninitialized memory is likely not useful in your application. ArrayFire provides several convenient functions for creating arrays that contain pre-populated values including constants, uniform random numbers, normally distributed numbers, and the identity matrix:
A complete list of ArrayFire functions that automatically generate data on the device may be found on the functions to create arrays page. As stated above, the default data type for arrays is f32 (a 32-bit floating point number) unless specified otherwise.
Arrays may also be populated from data found on the host. For example:
ArrayFire also supports array initialization from memory already on the GPU. For example, with CUDA one can populate an array directly using a call to cudaMemcpy.
Similar functionality exists for OpenCL too. If you wish to intermingle ArrayFire with CUDA or OpenCL code, we suggest you consult the CUDA interoperability or OpenCL interoperability pages for detailed instructions.
ArrayFire provides several functions to inspect arrays, including functions to print their contents, query their dimensions, and determine various other properties.
The af_print function can be used to print arrays that have already been generated or any expression involving arrays:
In addition to dimensions, arrays also carry several properties including methods to determine the underlying type and size (in bytes). You can even determine whether the array is empty, real/complex, a row/column, or a scalar or a vector:
For further information on these capabilities, we suggest you consult the full documentation on the array.
ArrayFire features an intelligent Just-In-Time (JIT) compilation engine that converts expressions using arrays into the smallest number of CUDA/OpenCL kernels. For most operations on arrays, ArrayFire functions like a vector library: an element-wise operation such as c[i] = a[i] + b[i] in C is written more concisely, without indexing, as c = a + b. When there are multiple expressions involving arrays, ArrayFire's JIT engine will merge them together. This "kernel fusion" technology not only decreases the number of kernel calls, but, more importantly, avoids extraneous global memory operations. Our JIT functionality extends across C/C++ function boundaries and only ends when a non-JIT function is encountered or a synchronization operation is explicitly called by the code.
ArrayFire provides hundreds of functions for element-wise operations. All of the standard operators (e.g. +, -, *, /) are supported, as are most transcendental functions (sin, cos, log, sqrt, etc.). Here are a few examples:
Constants can be used in all of ArrayFire's functions. Below we demonstrate their use in element selection and a mathematical expression:
Please note that our constants may, at times, conflict with macro definitions in standard header files. When this occurs, please refer to our constants using the af:: namespace.
Like all functions in ArrayFire, indexing is also executed in parallel on the OpenCL/CUDA device. Because of this, indexing becomes part of a JIT operation and is accomplished using parentheses instead of square brackets (i.e. A(0) instead of A[0]). To index af::arrays you may use one or a combination of the following functions:
Please see the indexing page for several examples of how to use these functions.
af::arrays may be accessed using the host() and device() functions. The host function copies the data from the device and makes it available in a C-style array on the host. As such, it is up to the developer to manage any memory returned by host. The device function returns a pointer/reference to device memory for interoperability with external CUDA/OpenCL kernels. As this memory belongs to ArrayFire, the programmer should not attempt to free/deallocate the pointer. For example, here is how we can interact with both OpenCL and CUDA:
In addition to supporting standard mathematical functions, arrays that contain integer data types also support bitwise operators including and, or, and shift:
The ArrayFire API is wrapped into a unified C/C++ header. To use the library, simply include the arrayfire.h header file and start coding!
Now that you have a general introduction to ArrayFire, where do you go from here? In particular, you might find these documents useful: