Matrix multiplication. More...

Functions
AFAPI array	matmul (const array &lhs, const array &rhs, const matProp optLhs=AF_MAT_NONE, const matProp optRhs=AF_MAT_NONE)
	C++ Interface to multiply two matrices. More...

AFAPI array	matmulNT (const array &lhs, const array &rhs)
	C++ Interface to multiply two matrices. More...

AFAPI array	matmulTN (const array &lhs, const array &rhs)
	C++ Interface to multiply two matrices. More...

AFAPI array	matmulTT (const array &lhs, const array &rhs)
	C++ Interface to multiply two matrices. More...

AFAPI array	matmul (const array &a, const array &b, const array &c)
	C++ Interface to chain multiply three matrices. More...

AFAPI array	matmul (const array &a, const array &b, const array &c, const array &d)
	C++ Interface to chain multiply three matrices. More...

AFAPI af_err	af_gemm (af_array C, const af_mat_prop opA, const af_mat_prop opB, const void alpha, const af_array A, const af_array B, const void *beta)
	C Interface to multiply two matrices. More...

AFAPI af_err	af_matmul (af_array *out, const af_array lhs, const af_array rhs, const af_mat_prop optLhs, const af_mat_prop optRhs)
	C Interface to multiply two matrices. More...

Detailed Description

Matrix multiplication.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. The supported types of batch operations for any given set of two matrices A and B are given below,

Size of Input Matrix A	Size of Input Matrix B	Output Matrix Size
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcasted to match the dimensions of 3D/4D array. This broadcast doesn't involve any additional memory allocations either on host or device.

Note: Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

Function Documentation

◆ af_gemm()

AFAPI af_err af_gemm	(	af_array *	C,
		const af_mat_prop	opA,
		const af_mat_prop	opB,
		const void *	alpha,
		const af_array	A,
		const af_array	B,
		const void *	beta
	)

C Interface to multiply two matrices.

This provides an interface to the BLAS level 3 general matrix multiply (GEMM) of two af_array objects, which is generally defined as:

\[ C = \alpha * opA(A)opB(B) + \beta * C \]

where \(\alpha\) (alpha) and \(\beta\) (beta) are both scalars; \(A\) and \(B\) are the matrix multiply operands; and \(opA\) and \(opB\) are noop (if AF_MAT_NONE) or transpose (if AF_MAT_TRANS) operations on \(A\) or \(B\) before the actual GEMM operation. Batched GEMM is supported if at least either \(A\) or \(B\) have more than two dimensions (see af::matmul for more details on broadcasting). However, only one alpha and one beta can be used for all of the batched matrix operands.

The af_array that out points to can be used both as an input and output. An allocation will be performed if you pass a null af_array handle (i.e. af_array c = 0;). If a valid af_array is passed as \(C\), the operation will be performed on that af_array itself. The C af_array must be the correct type and shape; otherwise, an error will be thrown.

Note: Passing an af_array that has not been initialized to the C array is will cause undefined behavior.

This example demonstrates the usage of the af_gemm function on two matrices. The \(C\) af_array handle is initialized to zero here, so af_gemm will perform an allocation.

    af_array A, B;
 
    dim_t adims[] = {5, 3, 2};
    dim_t bdims[] = {3, 5, 2};
    af_constant(&A, 1, 3, adims, f32);
    af_constant(&B, 1, 3, bdims, f32);
 
    float alpha = 1.f;
    float beta  = 0.f;
 
    // Undefined behavior!
    // af_array undef;
    // af_gemm(&undef, AF_MAT_NONE, AF_MAT_NONE, &alpha, a.get(), b.get(),
    // &beta);
 
    af_array C = 0;
    af_gemm(&C, AF_MAT_NONE, AF_MAT_NONE, &alpha, A, B, &beta);
    // C =
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
 

The following example shows how you can write to a previously allocated af_array using the af_gemm call. Here we are going to use the af_array s from the previous example and index into the first slice. Only the first slice of the original \(C\) af_array will be modified by this operation.

    alpha                = 1.f;
    beta                 = 1.f;
    af_seq first_slice[] = {af_span, af_span, {0., 0., 1.}};
    af_array Asub, Bsub, Csub;
    af_index(&Asub, A, 3, first_slice);
    af_index(&Bsub, B, 3, first_slice);
    af_index(&Csub, C, 3, first_slice);
    af_gemm(&Csub, AF_MAT_NONE, AF_MAT_NONE, &alpha, Asub, Bsub, &beta);
    // C =
    //  6.   6.   6.   6.   6.
    //  6.   6.   6.   6.   6.
    //  6.   6.   6.   6.   6.
    //  6.   6.   6.   6.   6.
    //  6.   6.   6.   6.   6.
    //
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.
    //  3.   3.   3.   3.   3.

Parameters

[in,out]	C	`A` * `B` = `C`
[in]	opA	operation to perform on A before the multiplication
[in]	opB	operation to perform on B before the multiplication
[in]	alpha	alpha value; must be the same type as `A` and `B`
[in]	A	input array on the left-hand side
[in]	B	input array on the right-hand side
[in]	beta	beta value; must be the same type as `A` and `B`

Returns: AF_SUCCESS, if function returns successfully, else an af_err code is given

◆ af_matmul()

AFAPI af_err af_matmul	(	af_array *	out,
		const af_array	lhs,
		const af_array	rhs,
		const af_mat_prop	optLhs,
		const af_mat_prop	optRhs
	)

C Interface to multiply two matrices.

Performs matrix multiplication on two arrays.

Note: The following applies for Sparse-Dense matrix multiplication.; This function can be used with one sparse input. The sparse input must always be the lhs and the dense matrix must be rhs.; The sparse array can only be of AF_STORAGE_CSR format.; The returned array is always dense.; optLhs an only be one of AF_MAT_NONE, AF_MAT_TRANS, AF_MAT_CTRANS.; optRhs can only be AF_MAT_NONE.

Parameters

[out]	out	`lhs` * `rhs` = `out`
[in]	lhs	input array on the left-hand side
[in]	rhs	input array on the right-hand side
[in]	optLhs	transpose `lhs` before the function is performed
[in]	optRhs	transpose `rhs` before the function is performed

Returns: AF_SUCCESS, if function returns successfully, else an af_err code is given

◆ matmul() [1/3]

AFAPI array matmul	(	const array &	a,
		const array &	b,
		const array &	c
	)

C++ Interface to chain multiply three matrices.

The matrix multiplications are done in a way to reduce temporary memory.

This function is not supported in GFOR.

Parameters

[in]	a	The first array
[in]	b	The second array
[in]	c	The third array

Returns: a x b x c

◆ matmul() [2/3]

AFAPI array matmul	(	const array &	a,
		const array &	b,
		const array &	c,
		const array &	d
	)

C++ Interface to chain multiply three matrices.

The matrix multiplications are done in a way to reduce temporary memory.

This function is not supported in GFOR.

Parameters

[in]	a	The first array
[in]	b	The second array
[in]	c	The third array
[in]	d	The fourth array

Returns: a x b x c x d

◆ matmul() [3/3]

AFAPI array matmul	(	const array &	lhs,
		const array &	rhs,
		const matProp	optLhs = `AF_MAT_NONE`,
		const matProp	optRhs = `AF_MAT_NONE`
	)

C++ Interface to multiply two matrices.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. The supported types of batch operations for any given set of two matrices A and B are given below,

Size of Input Matrix A	Size of Input Matrix B	Output Matrix Size
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcasted to match the dimensions of 3D/4D array. This broadcast doesn't involve any additional memory allocations either on host or device.

Note: Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

optLhs and optRhs can only be one of AF_MAT_NONE, AF_MAT_TRANS, AF_MAT_CTRANS.

This function is not supported in GFOR.

Note: The following applies for Sparse-Dense matrix multiplication.; This function can be used with one sparse input. The sparse input must always be the lhs and the dense matrix must be rhs.; The sparse array can only be of AF_STORAGE_CSR format.; The returned array is always dense.; optLhs an only be one of AF_MAT_NONE, AF_MAT_TRANS, AF_MAT_CTRANS.; optRhs can only be AF_MAT_NONE.

Parameters

[in]	lhs	input array on the left-hand side
[in]	rhs	input array on the right-hand side
[in]	optLhs	transpose the left-hand side prior to multiplication
[in]	optRhs	transpose the right-hand side prior to multiplication

Returns: lhs * rhs

◆ matmulNT()

AFAPI array matmulNT	(	const array &	lhs,
		const array &	rhs
	)

C++ Interface to multiply two matrices.

The second matrix will be transposed.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. The supported types of batch operations for any given set of two matrices A and B are given below,

Size of Input Matrix A	Size of Input Matrix B	Output Matrix Size
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcasted to match the dimensions of 3D/4D array. This broadcast doesn't involve any additional memory allocations either on host or device.

Note: Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

This function is not supported in GFOR.

Parameters

[in]	lhs	input array on the left-hand side
[in]	rhs	input array on the right-hand side

Returns: lhs * transpose(rhs)

◆ matmulTN()

AFAPI array matmulTN	(	const array &	lhs,
		const array &	rhs
	)

C++ Interface to multiply two matrices.

The first matrix will be transposed.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. The supported types of batch operations for any given set of two matrices A and B are given below,

Size of Input Matrix A	Size of Input Matrix B	Output Matrix Size
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcasted to match the dimensions of 3D/4D array. This broadcast doesn't involve any additional memory allocations either on host or device.

Note: Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

This function is not supported in GFOR.

Parameters

[in]	lhs	input array on the left-hand side
[in]	rhs	input array on the right-hand side

Returns: transpose(lhs) * rhs

◆ matmulTT()

AFAPI array matmulTT	(	const array &	lhs,
		const array &	rhs
	)

C++ Interface to multiply two matrices.

Both matrices will be transposed.

Performs a matrix multiplication on the two input arrays after performing the operations specified in the options. The operations are done while reading the data from memory. This results in no additional memory being used for temporary buffers.

Batched matrix multiplications are supported. The supported types of batch operations for any given set of two matrices A and B are given below,

Size of Input Matrix A	Size of Input Matrix B	Output Matrix Size
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, 1, 1 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, 1, 1 \} \)	\( \{ K, N, b2, b3 \} \)	\( \{ M, N, b2, b3 \} \)
\( \{ M, K, b2, b3 \} \)	\( \{ K, N, 1, 1 \} \)	\( \{ M, N, b2, b3 \} \)

where M, K, N are dimensions of the matrix and b2, b3 indicate batch size along the respective dimension.

For the last two entries in the above table, the 2D matrix is broadcasted to match the dimensions of 3D/4D array. This broadcast doesn't involve any additional memory allocations either on host or device.

Note: Sparse support was added to ArrayFire in v3.4.0. This function can be used for Sparse-Dense matrix multiplication. See the notes of the function for usage and restrictions.

This function is not supported in GFOR.

Parameters

[in]	lhs	input array on the left-hand side
[in]	rhs	input array on the right-hand side

Returns: transpose(lhs) * transpose(rhs)

Functions

Detailed Description

Function Documentation

◆ af_gemm()

◆ af_matmul()

◆ matmul() [1/3]

◆ matmul() [2/3]

◆ matmul() [3/3]

◆ matmulNT()

◆ matmulTN()

◆ matmulTT()