Matrix multiplication using array.
Functions | |
AFAPI array | matmul (const array &lhs, const array &rhs, const matProp optLhs=AF_MAT_NONE, const matProp optRhs=AF_MAT_NONE) |
Matrix multiply of two arrays. | |
AFAPI array | matmulNT (const array &lhs, const array &rhs) |
Matrix multiply of lhs and transpose(rhs). | |
AFAPI array | matmulTN (const array &lhs, const array &rhs) |
Matrix multiply of transpose(lhs) and rhs. | |
AFAPI array | matmulTT (const array &lhs, const array &rhs) |
Matrix multiply of transpose(lhs) and transpose(rhs). | |
AFAPI array | matmul (const array &a, const array &b, const array &c) |
Chain 2 matrix multiplications. | |
AFAPI array | matmul (const array &a, const array &b, const array &c, const array &d) |
Chain 3 matrix multiplications. | |
AFAPI af_err | af_gemm (af_array *C, const af_mat_prop opA, const af_mat_prop opB, const void *alpha, const af_array A, const af_array B, const void *beta) |
BLAS general matrix multiply (GEMM) of two af_array objects. | |
AFAPI af_err | af_matmul (af_array *out, const af_array lhs, const af_array rhs, const af_mat_prop optLhs, const af_mat_prop optRhs) |
Matrix multiply of two af_array. | |
Performs a matrix multiplication on the two input arrays after applying the operations specified in the options. These operations are applied while the data is read from memory, so no additional memory is used for temporary buffers.
Batched matrix multiplications are supported. The supported batch configurations for two matrices A and B are listed below.
Size of Input Matrix A | Size of Input Matrix B | Output Matrix Size |
---|---|---|
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, 1, 1 \} \) |
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \) |
\( \{ M, K, 1, 1 \} \) | \( \{ K, N, b2, b3 \} \) | \( \{ M, N, b2, b3 \} \) |
\( \{ M, K, b2, b3 \} \) | \( \{ K, N, 1, 1 \} \) | \( \{ M, N, b2, b3 \} \) |
where M, K, and N are the matrix dimensions, and b2 and b3 are the batch sizes along the respective dimensions.
For the last two rows of the table above, the 2D matrix is broadcast to match the dimensions of the 3D/4D array. This broadcast does not involve any additional memory allocation on either the host or the device.
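The shape rules in the table can be pictured with a small standalone sketch. This is plain C++, not ArrayFire code: the `Dims` alias and the `batchedMatmulDims` helper are hypothetical names introduced only for illustration, and the function encodes just the four rows listed above.

```cpp
#include <array>
#include <cassert>

// Hypothetical 4-element shape {d0, d1, d2, d3}, mirroring the table above.
using Dims = std::array<long long, 4>;

// Computes the output shape for A {M, K, b2, b3} times B {K, N, b2, b3}.
// Returns false when the shapes match none of the four rows in the table.
bool batchedMatmulDims(const Dims &a, const Dims &b, Dims &out) {
    for (int i = 0; i < 4; ++i) assert(a[i] >= 1 && b[i] >= 1);

    if (a[1] != b[0]) return false;  // inner dimension K must agree

    const bool aIs2D     = (a[2] == 1 && a[3] == 1);
    const bool bIs2D     = (b[2] == 1 && b[3] == 1);
    const bool sameBatch = (a[2] == b[2] && a[3] == b[3]);
    if (!sameBatch && !aIs2D && !bIs2D) return false;

    // A 2D operand is broadcast, so the batch sizes come from the other one.
    const Dims &batch = aIs2D ? b : a;
    out = Dims{{a[0], b[1], batch[2], batch[3]}};
    return true;
}
```

For example, a `{4, 3, 1, 1}` matrix multiplied with a `{3, 5, 2, 6}` array yields `{4, 5, 2, 6}`, which corresponds to the third row of the table.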
AFAPI af_err af_gemm(af_array *C, const af_mat_prop opA, const af_mat_prop opB, const void *alpha, const af_array A, const af_array B, const void *beta)
BLAS general matrix multiply (GEMM) of two af_array objects.
This provides a general interface to the BLAS level 3 general matrix multiply (GEMM), which is generally defined as:
\[ C = \alpha * opA(A)opB(B) + \beta * C \]
where \(\alpha\) (alpha) and \(\beta\) (beta) are both scalars; \(A\) and \(B\) are the matrix multiply operands; and \(opA\) and \(opB\) are no-op (if AF_MAT_NONE) or transpose (if AF_MAT_TRANS) operations applied to \(A\) or \(B\) before the actual GEMM operation. Batched GEMM is supported if at least one of \(A\) or \(B\) has more than two dimensions (see af::matmul for more details on broadcasting). However, only one alpha and one beta can be used for all of the batched matrix operands.
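To make the GEMM definition concrete, here is a minimal reference sketch in plain C++ for a single (non-batched) pair of matrices. It is not the ArrayFire implementation and calls no ArrayFire function: `Matrix`, `Op`, `opAt`, and `referenceGemm` are hypothetical names, and column-major storage is assumed. The transpose is applied while reading the elements, so no transposed copy is created.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for AF_MAT_NONE / AF_MAT_TRANS.
enum class Op { None, Trans };

// Column-major matrix: element (r, c) lives at data[r + c * rows].
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
    float at(std::size_t r, std::size_t c) const { return data[r + c * rows]; }
};

// Reads element (r, c) of op(X) without materializing a transposed copy.
static float opAt(const Matrix &x, Op op, std::size_t r, std::size_t c) {
    return op == Op::None ? x.at(r, c) : x.at(c, r);
}

// C = alpha * opA(A) * opB(B) + beta * C, updating C in place.
void referenceGemm(Matrix &C, Op opA, Op opB, float alpha,
                   const Matrix &A, const Matrix &B, float beta) {
    const std::size_t M  = (opA == Op::None) ? A.rows : A.cols;
    const std::size_t K  = (opA == Op::None) ? A.cols : A.rows;
    const std::size_t Kb = (opB == Op::None) ? B.rows : B.cols;
    const std::size_t N  = (opB == Op::None) ? B.cols : B.rows;
    assert(K == Kb);                      // inner dimensions must agree
    assert(C.rows == M && C.cols == N);   // C must already have the right shape

    for (std::size_t j = 0; j < N; ++j) {
        for (std::size_t i = 0; i < M; ++i) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += opAt(A, opA, i, k) * opAt(B, opB, k, j);
            }
            float &c = C.data[i + j * M];
            c = alpha * acc + beta * c;
        }
    }
}
```

With `beta` set to zero the previous contents of `C` do not contribute to the result; with a nonzero `beta` the product is accumulated onto the scaled existing values.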
The af_array that C points to can be used as both an input and an output. An allocation will be performed if you pass a null af_array handle (i.e. af_array c = 0;). If a valid af_array is passed as \(C\), the operation will be performed on that af_array itself. The C af_array must be of the correct type and shape; otherwise, an error will be thrown.
This example demonstrates the usage of the af_gemm function on two matrices. The \(C\) af_array handle is initialized to zero here, so af_gemm will perform an allocation.
The following example shows how you can write to a previously allocated af_array using the af_gemm call. Here we use the af_arrays from the previous example and index into the first slice. Only the first slice of the original \(C\) af_array will be modified by this operation.
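The effect of writing into a single slice of an existing output can be sketched with plain buffers. This is a standalone C++ illustration under assumptions, not an af_gemm call: `gemmIntoSlice` is a hypothetical helper, and a contiguous column-major layout is assumed with slices stored one after another.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Updates one slice of a batched output in place:
//   C[:, :, slice] = alpha * A * B + beta * C[:, :, slice]
// A is M x K, B is K x N, and C holds several M x N slices back to back.
void gemmIntoSlice(std::vector<float> &C, std::size_t M, std::size_t N,
                   std::size_t slice, float alpha,
                   const std::vector<float> &A, const std::vector<float> &B,
                   std::size_t K, float beta) {
    assert(A.size() == M * K && B.size() == K * N);
    assert(C.size() >= (slice + 1) * M * N);  // the slice must already exist

    float *out = C.data() + slice * M * N;    // start of the target slice
    for (std::size_t j = 0; j < N; ++j) {
        for (std::size_t i = 0; i < M; ++i) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[i + k * M] * B[k + j * K];
            }
            out[i + j * M] = alpha * acc + beta * out[i + j * M];
        }
    }
}
```

Calling this with `slice` equal to 0 on a two-slice buffer rewrites only the first M x N block; the second slice keeps its previous values.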
[in,out] | C | Pointer to the output af_array |
[in] | opA | Operation to perform on A before the multiplication |
[in] | opB | Operation to perform on B before the multiplication |
[in] | alpha | The alpha value; must be the same type as A and B |
[in] | A | Left-hand side operand |
[in] | B | Right-hand side operand |
[in] | beta | The beta value; must be the same type as A and B |
AFAPI af_err af_matmul(af_array *out, const af_array lhs, const af_array rhs, const af_mat_prop optLhs, const af_mat_prop optRhs)
Matrix multiply of two af_array.
Performs a matrix multiplication on two arrays (lhs, rhs).
[out] | out | Pointer to the output af_array |
[in] | lhs | A 2D matrix af_array object |
[in] | rhs | A 2D matrix af_array object |
[in] | optLhs | Transpose left hand side before the function is performed |
[in] | optRhs | Transpose right hand side before the function is performed |
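For sparse-dense matrix multiplication, the sparse input must be lhs and the dense matrix must be rhs. optLhs can only be one of AF_MAT_NONE, AF_MAT_TRANS, AF_MAT_CTRANS. optRhs can only be AF_MAT_NONE.
AFAPI array matmul(const array &a, const array &b, const array &c)
Chain 2 matrix multiplications.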
The matrix multiplications are performed in a way that reduces temporary memory use.
[in] | a | The first array |
[in] | b | The second array |
[in] | c | The third array |
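One way to reduce temporary memory in a chain of two multiplications is to evaluate first whichever product gives the smaller intermediate matrix. The sketch below illustrates that idea in plain C++. It is an assumption about how such a choice could be made, not a description of the library's actual logic, and `Shape`, `ChainOrder`, and `chooseChainOrder` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct Shape { std::size_t rows, cols; };

enum class ChainOrder {
    LeftFirst,   // evaluate (a * b) * c
    RightFirst   // evaluate a * (b * c)
};

// Picks the evaluation order whose intermediate result has fewer elements.
ChainOrder chooseChainOrder(Shape a, Shape b, Shape c) {
    assert(a.cols == b.rows && b.cols == c.rows);  // shapes must be chainable

    const std::size_t abElems = a.rows * b.cols;   // size of the (a * b) temporary
    const std::size_t bcElems = b.rows * c.cols;   // size of the (b * c) temporary
    return abElems <= bcElems ? ChainOrder::LeftFirst : ChainOrder::RightFirst;
}
```

For a 100 x 50, b 50 x 100, and c 100 x 2, the temporary for (a * b) would hold 10000 elements while (b * c) needs only 100, so the sketch selects `RightFirst`.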
AFAPI array matmul(const array &a, const array &b, const array &c, const array &d)
Chain 3 matrix multiplications.
The matrix multiplications are performed in a way that reduces temporary memory use.
[in] | a | The first array |
[in] | b | The second array |
[in] | c | The third array |
[in] | d | The fourth array |
AFAPI array matmul(const array &lhs, const array &rhs, const matProp optLhs = AF_MAT_NONE, const matProp optRhs = AF_MAT_NONE)
Matrix multiply of two arrays.
[in] | lhs | The array object on the left hand side |
[in] | rhs | The array object on the right hand side |
[in] | optLhs | Transpose left hand side before the function is performed |
[in] | optRhs | Transpose right hand side before the function is performed |
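For sparse-dense matrix multiplication, the sparse input must be lhs and the dense matrix must be rhs. optLhs can only be one of AF_MAT_NONE, AF_MAT_TRANS, AF_MAT_CTRANS. optRhs can only be AF_MAT_NONE.
AFAPI array matmulNT(const array &lhs, const array &rhs)
Matrix multiply of two arrays.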
[in] | lhs | The array object on the left hand side |
[in] | rhs | The array object on the right hand side |
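Returns the result of the matrix multiplication of lhs and transpose(rhs).
AFAPI array matmulTN(const array &lhs, const array &rhs)
Matrix multiply of two arrays.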
[in] | lhs | The array object on the left hand side |
[in] | rhs | The array object on the right hand side |
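Returns the result of the matrix multiplication of transpose(lhs) and rhs.
AFAPI array matmulTT(const array &lhs, const array &rhs)
Matrix multiply of two arrays.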
[in] | lhs | The array object on the left hand side |
[in] | rhs | The array object on the right hand side |
Returns the result of the matrix multiplication of transpose(lhs) and transpose(rhs).