SWAN API
General
All API functions are prefixed swan. All errors are fatal.
Device Management
int swanGetDeviceCount( void )
- Return the number of available GPU devices.
void swanSetDeviceNumber( int n )
Set the GPU to use. Should be called before swanInit() or any other swan function.
void swanInit( void )
- Initialise the runtime. It is usually not necessary to call this explicitly.
void swanFinalize( void )
- Shut down the runtime. Only really needed to tidy up messy OpenCL runtimes.
int swanDeviceVersion( void )
- Returns one of:
- SWAN_DEVICE_OPENCL_10
- SWAN_DEVICE_CUDA_100
- SWAN_DEVICE_CUDA_110
- SWAN_DEVICE_CUDA_120
- SWAN_DEVICE_CUDA_130
- SWAN_DEVICE_CUDA_200
Memory Management
void* swanMalloc( size_t len )
Allocate len bytes of device memory.
void* swanMallocHost( size_t len )
Allocate len bytes of pinned host memory.
void swanFree( void *ptr )
Free device memory allocated with swanMalloc()
void swanFreeHost( void *ptr )
Free pinned host memory allocated with swanMallocHost()
void swanMemcpyDtoH( void *ptrd, void *ptrh, size_t len )
Copy len bytes from device memory region ptrd to host memory region ptrh.
void swanMemcpyHtoD( void *ptrh, void *ptrd, size_t len )
Copy len bytes from host memory region ptrh to device memory region ptrd.
void swanMemcpyDtoD( void *ptrd1, void *ptrd2, size_t len )
Copy len bytes from device memory region ptrd1 to device memory region ptrd2.
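The copy functions above take the source pointer first and the destination second, which is the reverse of the standard memcpy() argument order. The following is a minimal sketch of a host-to-device-to-host round trip. It assumes no SWAN installation: the swan* functions defined here are host-only stand-ins built on malloc/memcpy so that the example compiles by itself, and sketch_roundtrip is a hypothetical helper name. Because all errors are fatal, the calls return nothing to check.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Host-only stand-ins (assumption): they mimic the documented signatures
   so this sketch builds without SWAN. Remove them when linking against
   the real library. */
void *swanMalloc(size_t len) { void *p = malloc(len); assert(p != NULL); return p; }
void swanFree(void *ptr) { free(ptr); }
void swanMemcpyHtoD(void *ptrh, void *ptrd, size_t len) { memcpy(ptrd, ptrh, len); }
void swanMemcpyDtoH(void *ptrd, void *ptrh, size_t len) { memcpy(ptrh, ptrd, len); }
void swanMemcpyDtoD(void *ptrd1, void *ptrd2, size_t len) { memcpy(ptrd2, ptrd1, len); }

/* Copy host -> device -> device -> host; return 1 if the data survived. */
int sketch_roundtrip(void)
{
    float in[4]  = {1.0f, 2.0f, 3.0f, 4.0f};
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};

    void *d1 = swanMalloc(sizeof in);
    void *d2 = swanMalloc(sizeof in);

    swanMemcpyHtoD(in, d1, sizeof in);   /* source (host) first, destination (device) second */
    swanMemcpyDtoD(d1, d2, sizeof in);   /* ptrd1 is the source, ptrd2 the destination */
    swanMemcpyDtoH(d2, out, sizeof in);  /* source (device) first, destination (host) second */

    swanFree(d1);
    swanFree(d2);
    return memcmp(in, out, sizeof in) == 0;
}
```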
void* swanMallocPitch( size_t *pitch_in_bytes, size_t width_in_bytes, size_t height)
Allocate an aligned 2D region of A x height bytes, where A is width_in_bytes rounded up to something suitable for the hardware (typically 256).
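The rounding described above can be illustrated with a small sketch. It assumes the alignment is a fixed multiple such as 256, whereas the real value is chosen by the hardware. sketch_pitch and sketch_row_offset are hypothetical helper names, not part of SWAN.

```c
#include <assert.h>
#include <stddef.h>

/* Round width_in_bytes up to the next multiple of alignment.
   The alignment value is an assumption; the hardware decides it. */
size_t sketch_pitch(size_t width_in_bytes, size_t alignment)
{
    assert(alignment > 0);
    return ((width_in_bytes + alignment - 1) / alignment) * alignment;
}

/* Byte offset of the start of a row inside a pitched allocation:
   rows are spaced pitch bytes apart, not width_in_bytes apart. */
size_t sketch_row_offset(size_t pitch, size_t row)
{
    return pitch * row;
}
```

With an alignment of 256, a row of 1000 bytes occupies 1024 bytes, so row 3 starts at byte offset 3072.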
void swanBindToGlobal( const char *varname, size_t len, void *ptrh )
Copy len bytes from host pointer ptrh to the global device variable named varname. This function has static scope.
Texturing
void swanBindToTexture1D( const char *texname, size_t width, void *ptrd, size_t typesize, int flags )
Bind a texture reference to the device pointer ptrd. The allocation should be width x typesize bytes. The texture reference name is extracted from the source code. typesize gives the size of each tuple in the texture, e.g. sizeof(float4). flags should be a bitwise OR of:
- TEXTURE_FLOAT - texture contains data of type float.
- TEXTURE_INT - texture contains data of type int.
- TEXTURE_UINT - texture contains data of type unsigned int.
- TEXTURE_NORMALISE - texture addressing will be normalised to between 0 and 1.
- TEXTURE_INTERPOLATE - linear interpolation will be used. Requires TEXTURE_NORMALISE.
This function has static scope.
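The flag rule above (TEXTURE_INTERPOLATE requires TEXTURE_NORMALISE) can be checked before binding. This is a sketch only: the numeric values of the real TEXTURE_* constants are not given here, so the SKETCH_TEXTURE_* values below are invented placeholders, and sketch_texture_flags_ok is a hypothetical helper.

```c
#include <assert.h>

/* Placeholder bit values (assumption): the real TEXTURE_* constants
   come from the SWAN headers and may differ. */
enum {
    SKETCH_TEXTURE_FLOAT       = 1 << 0,
    SKETCH_TEXTURE_INT         = 1 << 1,
    SKETCH_TEXTURE_UINT        = 1 << 2,
    SKETCH_TEXTURE_NORMALISE   = 1 << 3,
    SKETCH_TEXTURE_INTERPOLATE = 1 << 4
};

/* Return 1 if the combination respects the documented rule that
   interpolation requires normalised addressing, 0 otherwise. */
int sketch_texture_flags_ok(int flags)
{
    if ((flags & SKETCH_TEXTURE_INTERPOLATE) && !(flags & SKETCH_TEXTURE_NORMALISE)) {
        return 0;
    }
    return 1;
}
```

For a float texture with linear interpolation, the flags argument would combine three values with bitwise OR: the float type, normalised addressing, and interpolation.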
void swanMakeTexture1D( const char *texname, size_t width, void *ptrh, size_t typesize, int flags )
As swanBindToTexture1D, but copies data from the host memory region ptrh.
void swanMakeTexture2D( const char *texname, size_t width, size_t height, void *ptrh, size_t typesize, int flags )
Create a 2D texture reference. Arguments as per swanBindToTexture1D.
Note: TEXTURE_NORMALISE and TEXTURE_INTERPOLATE are not yet supported.
Kernel Execution
int swanMaxThreadCount( void )
- Return the maximum number of threads that may be used in a single block.
int swanGetNumberOfComputeElements( void )
- Return the number of compute elements/multiprocessors in the device.
void swanDecompose( block_config_t *grid, block_config_t *block, int thread_count, int threads_per_block )
Create a launch configuration based on the number of threads and threads per block. grid and block will be 1D.
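A 1D decomposition of this kind is commonly computed by rounding the block count up, so that every thread is covered. The sketch below shows that arithmetic under stated assumptions: the layout of block_config_t is not given in this document, so sketch_block_config_t with x, y, z fields is a hypothetical stand-in, and sketch_decompose may not match what swanDecompose does internally.

```c
#include <assert.h>

/* Hypothetical stand-in for block_config_t (assumption: x, y, z extents). */
typedef struct {
    int x, y, z;
} sketch_block_config_t;

/* Build a 1D launch configuration: enough blocks of threads_per_block
   threads to cover thread_count threads (ceiling division). */
void sketch_decompose(sketch_block_config_t *grid, sketch_block_config_t *block,
                      int thread_count, int threads_per_block)
{
    assert(thread_count > 0 && threads_per_block > 0);

    block->x = threads_per_block;
    block->y = 1;
    block->z = 1;

    grid->x = (thread_count + threads_per_block - 1) / threads_per_block;
    grid->y = 1;
    grid->z = 1;
}
```

If the decomposition rounds up in this way, 1000 threads at 128 per block yield 8 blocks, i.e. 1024 launched threads, so a kernel would need to ignore indices at or beyond thread_count.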
void swanSynchronize( void )
- Block until all asynchronous operations are completed.
All kernel entry points have the following prototype format:
void k_kernel_name ( block_config_t grid, block_config_t block, int shmem, args,... )
Blocking launch of kernel kernel_name.
void k_kernel_name_async ( block_config_t grid, block_config_t block, int shmem, args,... )
Non-blocking launch of kernel kernel_name.
args are the formal arguments defined in the kernel source itself. All kernel entry points have static scope.