scikits.cuda.cublas.cublasCgeam

scikits.cuda.cublas.cublasCgeam(handle, transa, transb, m, n, alpha, A, lda, beta, B, ldb, C, ldc)

Matrix-matrix addition/transposition (single-precision complex).

Computes the sum of two scaled and possibly (conjugate-)transposed single-precision complex matrices: C = alpha*op(A) + beta*op(B), where op(X) is X, its transpose, or its conjugate transpose, as selected by transa and transb.

Parameters:

handle : int

CUBLAS context

transa, transb : char

‘n’ if the corresponding matrix is to be used as-is, ‘t’ if it is to be transposed, ‘c’ if it is to be conjugate-transposed.

m : int

Number of rows in C and in A after applying transa.

n : int

Number of columns in C and in B after applying transb.

alpha : numpy.complex64

Constant by which to scale A.

A : ctypes.c_void_p

Pointer to first matrix operand (A).

lda : int

Leading dimension of A.

beta : numpy.complex64

Constant by which to scale B.

B : ctypes.c_void_p

Pointer to second matrix operand (B).

ldb : int

Leading dimension of B.

C : ctypes.c_void_p

Pointer to result matrix (C).

ldc : int

Leading dimension of C.
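In NumPy terms, the operation described by the parameters above can be modeled as follows. This is a reference sketch for clarity only: geam_reference is a hypothetical helper, not part of scikits.cuda, and it ignores the device-pointer and leading-dimension plumbing of the real call.

```python
import numpy as np

def geam_reference(transa, transb, alpha, A, beta, B):
    """NumPy model of GEAM: C = alpha*op(A) + beta*op(B).

    op(X) is X for 'n', X.T for 't', and X.conj().T for 'c'.
    Hypothetical helper for illustration; cublasCgeam itself works on
    column-major device memory through raw pointers.
    """
    def op(X, trans):
        if trans == 'n':
            return X
        if trans == 't':
            return X.T
        if trans == 'c':
            return X.conj().T
        raise ValueError("trans must be 'n', 't', or 'c'")
    return alpha*op(A, transa) + beta*op(B, transb)
```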

Examples

>>> import pycuda.autoinit
>>> import pycuda.gpuarray as gpuarray
>>> import numpy as np
>>> from scikits.cuda.cublas import cublasCreate, cublasCgeam, cublasDestroy
>>> alpha = np.complex64(np.random.rand()+1j*np.random.rand())
>>> beta = np.complex64(np.random.rand()+1j*np.random.rand())
>>> a = (np.random.rand(2, 3)+1j*np.random.rand(2, 3)).astype(np.complex64)
>>> b = (np.random.rand(2, 3)+1j*np.random.rand(2, 3)).astype(np.complex64)
>>> c = alpha*a+beta*b
>>> a_gpu = gpuarray.to_gpu(a)
>>> b_gpu = gpuarray.to_gpu(b)
>>> c_gpu = gpuarray.empty(c.shape, c.dtype)
>>> h = cublasCreate()
>>> cublasCgeam(h, 'n', 'n', c.shape[0], c.shape[1], alpha, a_gpu.gpudata, a.shape[0], beta, b_gpu.gpudata, b.shape[0], c_gpu.gpudata, c.shape[0])
>>> np.allclose(c_gpu.get(), c)
True
>>> a = (np.random.rand(2, 3)+1j*np.random.rand(2, 3)).astype(np.complex64)
>>> b = (np.random.rand(3, 2)+1j*np.random.rand(3, 2)).astype(np.complex64)
>>> c = alpha*np.conj(a).T+beta*b
>>> a_gpu = gpuarray.to_gpu(a.T.copy())
>>> b_gpu = gpuarray.to_gpu(b.T.copy())
>>> c_gpu = gpuarray.empty(c.T.shape, c.dtype)
>>> transa = 'c' if np.iscomplexobj(a) else 't'
>>> cublasCgeam(h, transa, 'n', c.shape[0], c.shape[1], alpha, a_gpu.gpudata, a.shape[0], beta, b_gpu.gpudata, b.shape[0], c_gpu.gpudata, c.shape[0])
>>> np.allclose(c_gpu.get().T, c)
True
>>> cublasDestroy(h)
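A note on the a.T.copy() pattern in the second example: cuBLAS assumes column-major (Fortran-order) storage, while NumPy arrays are row-major (C-order) by default. A C-order copy of a.T has exactly the same memory layout as a Fortran-order copy of a, which is why uploading the transposed copy (and transposing the result back on the host) works. A small NumPy sketch of this correspondence:

```python
import numpy as np

# cuBLAS reads buffers column-major; NumPy writes them row-major by default.
# A C-order copy of a.T is byte-for-byte identical to a Fortran-order copy
# of a, so cuBLAS sees the same matrix either way.
a = (np.random.rand(2, 3) + 1j*np.random.rand(2, 3)).astype(np.complex64)

col_major_a = np.asfortranarray(a)          # column-major a
c_order_of_aT = np.ascontiguousarray(a.T)   # row-major copy of a.T

# order='A' serializes each buffer in its own memory order.
assert col_major_a.tobytes(order='A') == c_order_of_aT.tobytes(order='A')
```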