A high-performance general-purpose compute library

Matrix multiplication using array.

Functions

AFAPI array matmul (const array &lhs, const array &rhs, const matProp optLhs=AF_MAT_NONE, const matProp optRhs=AF_MAT_NONE)
 Matrix multiply of two arrays.
 
AFAPI array matmulNT (const array &lhs, const array &rhs)
 Matrix multiply of two arrays.
 
AFAPI array matmulTN (const array &lhs, const array &rhs)
 Matrix multiply of two arrays.
 
AFAPI array matmulTT (const array &lhs, const array &rhs)
 Matrix multiply of two arrays.
 
AFAPI array matmul (const array &a, const array &b, const array &c)
 Chain 2 matrix multiplications.
 
AFAPI array matmul (const array &a, const array &b, const array &c, const array &d)
 Chain 3 matrix multiplications.
 
AFAPI af_err af_gemm (af_array *C, const af_mat_prop opA, const af_mat_prop opB, const void *alpha, const af_array A, const af_array B, const void *beta)
 BLAS general matrix multiply (GEMM) of two af_array objects.
 
AFAPI af_err af_matmul (af_array *out, const af_array lhs, const af_array rhs, const af_mat_prop optLhs, const af_mat_prop optRhs)
 Matrix multiply of two af_array.
 

Detailed Description

Matrix multiplication using array.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. Given below are the supported types of batch operations for any given set of two matrices A and B.

Size of Input Matrix A | Size of Input Matrix B | Output Matrix Size
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcast to match the dimensions of the 3D/4D array. This broadcast doesn't involve any additional memory allocations on either the host or the device.
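The shape rules in the table can be sketched as a small standalone helper. This is only an illustration in plain C++: `Dims` and `batchedMatmulDims` are hypothetical names that are not part of ArrayFire, and the helper accepts only the four combinations listed in the table.

```cpp
#include <array>
#include <cassert>
#include <stdexcept>

// Four-element shape, mirroring the {d0, d1, d2, d3} layout in the table.
using Dims = std::array<long long, 4>;

// Hypothetical helper (not an ArrayFire function): computes the output shape
// of a batched matrix multiply for the four supported cases in the table.
Dims batchedMatmulDims(const Dims &a, const Dims &b) {
    // Inner dimensions must agree: A is M x K, B is K x N.
    if (a[1] != b[0]) throw std::invalid_argument("inner dimensions differ");

    const bool aIs2D     = (a[2] == 1 && a[3] == 1);
    const bool bIs2D     = (b[2] == 1 && b[3] == 1);
    const bool sameBatch = (a[2] == b[2] && a[3] == b[3]);

    // Either the batch dimensions match exactly, or one operand is a plain
    // 2D matrix that is reused for every batch of the other operand.
    if (!sameBatch && !aIs2D && !bIs2D)
        throw std::invalid_argument("unsupported batch combination");

    // The batch sizes come from whichever operand is actually batched.
    const Dims &batched = aIs2D ? b : a;
    return {a[0], b[1], batched[2], batched[3]};
}
```

For instance, under these assumptions `batchedMatmulDims({5, 3, 1, 1}, {3, 4, 2, 6})` yields `{5, 4, 2, 6}`, matching the third row of the table.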

Note
Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

Function Documentation

◆ af_gemm()

AFAPI af_err af_gemm ( af_array * C,
const af_mat_prop  opA,
const af_mat_prop  opB,
const void *  alpha,
const af_array  A,
const af_array  B,
const void *  beta 
)

BLAS general matrix multiply (GEMM) of two af_array objects.

This provides a general interface to the BLAS level 3 general matrix multiply (GEMM), which is generally defined as:

\[ C = \alpha * opA(A)opB(B) + \beta * C \]

where \(\alpha\) (alpha) and \(\beta\) (beta) are both scalars; \(A\) and \(B\) are the matrix multiply operands; and \(opA\) and \(opB\) are no-op (if AF_MAT_NONE) or transpose (if AF_MAT_TRANS) operations applied to \(A\) or \(B\) before the actual GEMM operation. Batched GEMM is supported if at least one of \(A\) and \(B\) has more than two dimensions (see af::matmul for more details on broadcasting). However, only one alpha and one beta can be used for all of the batched matrix operands.
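To make the formula concrete, here is a minimal reference sketch of the GEMM definition in plain C++ for a single (non-batched) pair of column-major matrices. It is not how ArrayFire or any BLAS backend implements the operation; `Op`, `at`, and `gemmReference` are hypothetical names used only for illustration. The transpose is applied by swapping indices at read time, so no transposed copy is created.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for AF_MAT_NONE / AF_MAT_TRANS in this sketch.
enum class Op { None, Trans };

// Reads element (row, col) of op(X), where X is stored column-major with
// `rows` stored rows. A transpose only swaps the indices used for the read.
inline float at(const std::vector<float> &x, std::size_t rows, Op op,
                std::size_t row, std::size_t col) {
    return op == Op::None ? x[row + col * rows] : x[col + row * rows];
}

// Reference GEMM: C = alpha * opA(A) * opB(B) + beta * C.
// opA(A) is M x K, opB(B) is K x N, and C is M x N (all column-major).
void gemmReference(std::vector<float> &C, Op opA, Op opB, float alpha,
                   const std::vector<float> &A, const std::vector<float> &B,
                   float beta, std::size_t M, std::size_t N, std::size_t K) {
    assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
    // Stored row counts depend on whether each operand is transposed.
    const std::size_t aRows = (opA == Op::None) ? M : K;
    const std::size_t bRows = (opB == Op::None) ? K : N;
    for (std::size_t j = 0; j < N; ++j) {
        for (std::size_t i = 0; i < M; ++i) {
            float acc = 0.f;
            for (std::size_t k = 0; k < K; ++k)
                acc += at(A, aRows, opA, i, k) * at(B, bRows, opB, k, j);
            // beta scales the existing contents of C before accumulation.
            C[i + j * M] = alpha * acc + beta * C[i + j * M];
        }
    }
}
```

With `beta` set to zero the previous contents of `C` do not contribute to the result; with a non-zero `beta` the product is accumulated into `C`.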

The af_array that C points to can be used as both an input and an output. An allocation will be performed if you pass a null af_array handle (i.e. af_array C = 0;). If a valid af_array is passed as \(C\), the operation will be performed on that af_array itself. The C af_array must be the correct type and shape; otherwise, an error will be thrown.

Note
Passing an af_array that has not been initialized as the C array will cause undefined behavior.

This example demonstrates the usage of the af_gemm function on two matrices. The \(C\) af_array handle is initialized to zero here, so af_gemm will perform an allocation.

af_array A, B;
dim_t adims[] = {5, 3, 2};
dim_t bdims[] = {3, 5, 2};
af_constant(&A, 1, 3, adims, f32);
af_constant(&B, 1, 3, bdims, f32);
float alpha = 1.f;
float beta = 0.f;
// Undefined behavior!
// af_array undef;
// af_gemm(&undef, AF_MAT_NONE, AF_MAT_NONE, &alpha, A, B, &beta);
af_array C = 0;
af_gemm(&C, AF_MAT_NONE, AF_MAT_NONE, &alpha, A, B, &beta);
// C =
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
//
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.

The following example shows how you can write to a previously allocated af_array using the af_gemm call. Here we use the af_arrays from the previous example and index into the first slice. Only the first slice of the original \(C\) af_array will be modified by this operation.

alpha = 1.f;
beta = 1.f;
af_seq first_slice[] = {af_span, af_span, {0., 0., 1.}};
af_array Asub, Bsub, Csub;
af_index(&Asub, A, 3, first_slice);
af_index(&Bsub, B, 3, first_slice);
af_index(&Csub, C, 3, first_slice);
af_gemm(&Csub, AF_MAT_NONE, AF_MAT_NONE, &alpha, Asub, Bsub, &beta);
// C =
// 6. 6. 6. 6. 6.
// 6. 6. 6. 6. 6.
// 6. 6. 6. 6. 6.
// 6. 6. 6. 6. 6.
// 6. 6. 6. 6. 6.
//
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
// 3. 3. 3. 3. 3.
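
The values printed in the two examples follow directly from the GEMM definition. As a worked check (a sketch that uses only the inputs shown above: every entry of \(A\) and \(B\) is 1 and the inner dimension is 3), the first call with \(\alpha = 1\) and \(\beta = 0\) gives, for every entry,

\[ C_{ij} = \alpha \sum_{k=1}^{3} A_{ik} B_{kj} + \beta C_{ij} = 1 \cdot (1 \cdot 1 + 1 \cdot 1 + 1 \cdot 1) + 0 = 3 \]

The second call uses \(\alpha = 1\) and \(\beta = 1\) on the first slice only, so the existing value of 3 is added to the new product:

\[ C_{ij} = 1 \cdot 3 + 1 \cdot 3 = 6 \]

The second slice is not touched by the second call and keeps the value 3.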
Parameters
[in,out] C: Pointer to the output af_array
[in] opA: Operation to perform on A before the multiplication
[in] opB: Operation to perform on B before the multiplication
[in] alpha: The alpha value; must be the same type as A and B
[in] A: Left-hand side operand
[in] B: Right-hand side operand
[in] beta: The beta value; must be the same type as A and B
Returns
AF_SUCCESS if the operation is successful.

◆ af_matmul()

AFAPI af_err af_matmul ( af_array * out,
const af_array  lhs,
const af_array  rhs,
const af_mat_prop  optLhs,
const af_mat_prop  optRhs 
)

Matrix multiply of two af_array.

Performs a matrix multiplication on two arrays (lhs, rhs).

Parameters
[out] out: Pointer to the output af_array
[in] lhs: A 2D matrix af_array object
[in] rhs: A 2D matrix af_array object
[in] optLhs: Transpose left hand side before the function is performed
[in] optRhs: Transpose right hand side before the function is performed
Returns
AF_SUCCESS if the process is successful.
Note
The following applies for Sparse-Dense matrix multiplication.
This function can be used with one sparse input. The sparse input must always be the lhs and the dense matrix must be rhs.
The sparse array can only be of AF_STORAGE_CSR format.
The returned array is always dense.
optLhs can only be one of AF_MAT_NONE, AF_MAT_TRANS, AF_MAT_CTRANS.
optRhs can only be AF_MAT_NONE.
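
As an illustration of the sparse-dense case described in the note (sparse CSR on the left, dense on the right, dense result), here is a minimal standalone sketch in plain C++. It does not use ArrayFire and does not reflect its internal storage or kernels; `CsrMatrix` and `csrTimesDense` are hypothetical names, and the dense matrices are assumed to be column-major.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal CSR (compressed sparse row) container for this sketch only.
struct CsrMatrix {
    std::size_t rows, cols;
    std::vector<std::size_t> rowPtr;  // size rows + 1; row i spans [rowPtr[i], rowPtr[i+1])
    std::vector<std::size_t> colIdx;  // column index of each stored value
    std::vector<float> values;        // non-zero values, stored row by row
};

// Computes out = lhs * rhs, where lhs is sparse (CSR) and rhs is a dense,
// column-major matrix of size lhs.cols x n. The result is a dense,
// column-major matrix of size lhs.rows x n.
std::vector<float> csrTimesDense(const CsrMatrix &lhs,
                                 const std::vector<float> &rhs, std::size_t n) {
    assert(lhs.rowPtr.size() == lhs.rows + 1);
    assert(rhs.size() == lhs.cols * n);
    std::vector<float> out(lhs.rows * n, 0.f);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < lhs.rows; ++i)
            // Only the stored (non-zero) entries of row i contribute.
            for (std::size_t p = lhs.rowPtr[i]; p < lhs.rowPtr[i + 1]; ++p)
                out[i + j * lhs.rows] +=
                    lhs.values[p] * rhs[lhs.colIdx[p] + j * lhs.cols];
    return out;
}
```

Every output entry is written, whether or not the corresponding sparse row has stored values, which is one way to see why the result of such a product is held as a dense array.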

◆ matmul() [1/3]

AFAPI array matmul ( const array & a,
const array & b,
const array & c
)

Chain 2 matrix multiplications.

The matrix multiplications are performed in a way that reduces temporary memory usage.

Parameters
[in] a: The first array
[in] b: The second array
[in] c: The third array
Returns
out = a x b x c
Note
This function is not supported in GFOR
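
The text above does not say how the multiplication order is chosen. One plausible heuristic, shown here purely as an assumption, is to compare the sizes of the two possible intermediate products and compute the smaller one first. `Shape`, `ChainOrder`, and `chooseChainOrder` are hypothetical names in a standalone C++ sketch.

```cpp
#include <cassert>
#include <cstddef>

// Shape of a 2D matrix; a hypothetical type for this sketch.
struct Shape {
    std::size_t rows, cols;
};

// LeftFirst means (a * b) * c; RightFirst means a * (b * c).
enum class ChainOrder { LeftFirst, RightFirst };

// Assumed heuristic: pick the grouping whose temporary has fewer elements.
// (a * b) is a.rows x b.cols, and (b * c) is b.rows x c.cols.
ChainOrder chooseChainOrder(Shape a, Shape b, Shape c) {
    assert(a.cols == b.rows && b.cols == c.rows);
    const std::size_t leftTemp  = a.rows * b.cols;
    const std::size_t rightTemp = b.rows * c.cols;
    return leftTemp < rightTemp ? ChainOrder::LeftFirst
                                : ChainOrder::RightFirst;
}
```

For shapes 1000 x 2, 2 x 1000, and 1000 x 3, the left grouping would need a 1000 x 1000 temporary while the right grouping needs only 2 x 3, so this sketch picks `RightFirst`. Both groupings produce the same mathematical result because matrix multiplication is associative.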

◆ matmul() [2/3]

AFAPI array matmul ( const array & a,
const array & b,
const array & c,
const array & d
)

Chain 3 matrix multiplications.

The matrix multiplications are performed in a way that reduces temporary memory usage.

Parameters
[in] a: The first array
[in] b: The second array
[in] c: The third array
[in] d: The fourth array
Returns
out = a x b x c x d
Note
This function is not supported in GFOR

◆ matmul() [3/3]

AFAPI array matmul ( const array & lhs,
const array & rhs,
const matProp  optLhs = AF_MAT_NONE,
const matProp  optRhs = AF_MAT_NONE 
)

Matrix multiply of two arrays.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. Given below are the supported types of batch operations for any given set of two matrices A and B.

Size of Input Matrix A | Size of Input Matrix B | Output Matrix Size
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcast to match the dimensions of the 3D/4D array. This broadcast doesn't involve any additional memory allocations on either the host or the device.

Note
Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

Parameters
[in] lhs: The array object on the left hand side
[in] rhs: The array object on the right hand side
[in] optLhs: Transpose left hand side before the function is performed
[in] optRhs: Transpose right hand side before the function is performed
Returns
The result of the matrix multiplication of lhs, rhs
Note
optLhs and optRhs can only be one of AF_MAT_NONE, AF_MAT_TRANS, AF_MAT_CTRANS
This function is not supported in GFOR
The following applies for Sparse-Dense matrix multiplication.
This function can be used with one sparse input. The sparse input must always be the lhs and the dense matrix must be rhs.
The sparse array can only be of AF_STORAGE_CSR format.
The returned array is always dense.
optLhs can only be one of AF_MAT_NONE, AF_MAT_TRANS, AF_MAT_CTRANS.
optRhs can only be AF_MAT_NONE.

◆ matmulNT()

AFAPI array matmulNT ( const array & lhs,
const array & rhs
)

Matrix multiply of two arrays.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. Given below are the supported types of batch operations for any given set of two matrices A and B.

Size of Input Matrix A | Size of Input Matrix B | Output Matrix Size
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcast to match the dimensions of the 3D/4D array. This broadcast doesn't involve any additional memory allocations on either the host or the device.

Note
Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

Parameters
[in] lhs: The array object on the left hand side
[in] rhs: The array object on the right hand side
Returns
The result of the matrix multiplication of lhs, transpose(rhs)
Note
This function is not supported in GFOR

◆ matmulTN()

AFAPI array matmulTN ( const array & lhs,
const array & rhs
)

Matrix multiply of two arrays.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. Given below are the supported types of batch operations for any given set of two matrices A and B.

Size of Input Matrix A | Size of Input Matrix B | Output Matrix Size
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcast to match the dimensions of the 3D/4D array. This broadcast doesn't involve any additional memory allocations on either the host or the device.

Note
Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

Parameters
[in] lhs: The array object on the left hand side
[in] rhs: The array object on the right hand side
Returns
The result of the matrix multiplication of transpose(lhs), rhs
Note
This function is not supported in GFOR

◆ matmulTT()

AFAPI array matmulTT ( const array & lhs,
const array & rhs
)

Matrix multiply of two arrays.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. Given below are the supported types of batch operations for any given set of two matrices A and B.

Size of Input Matrix A | Size of Input Matrix B | Output Matrix Size
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcast to match the dimensions of the 3D/4D array. This broadcast doesn't involve any additional memory allocations on either the host or the device.

Note
Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

Parameters
[in] lhs: The array object on the left hand side
[in] rhs: The array object on the right hand side
Returns
The result of the matrix multiplication of transpose(lhs), transpose(rhs)
Note
This function is not supported in GFOR