Core API

oneAPI Level Zero Specification - Version 0.91

Common

Common Enums

ze_result_t

enum ze_result_t

Defines Return/Error codes.

Values:

ZE_RESULT_SUCCESS = 0

[Core] success

ZE_RESULT_NOT_READY = 1

[Core] synchronization primitive not signaled

ZE_RESULT_ERROR_DEVICE_LOST = 0x70000001

[Core] device hung, reset, was removed, or driver update occurred

ZE_RESULT_ERROR_OUT_OF_HOST_MEMORY

[Core] insufficient host memory to satisfy call

ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY

[Core] insufficient device memory to satisfy call

ZE_RESULT_ERROR_MODULE_BUILD_FAILURE

[Core] error occurred when building module, see build log for details

ZE_RESULT_ERROR_INSUFFICIENT_PERMISSIONS = 0x70010000

[Tools] access denied due to permission level

ZE_RESULT_ERROR_NOT_AVAILABLE

[Tools] resource already in use and simultaneous access not allowed

ZE_RESULT_ERROR_UNINITIALIZED = 0x78000001

[Validation] driver is not initialized

ZE_RESULT_ERROR_UNSUPPORTED_VERSION

[Validation] generic error code for unsupported versions

ZE_RESULT_ERROR_UNSUPPORTED_FEATURE

[Validation] generic error code for unsupported features

ZE_RESULT_ERROR_INVALID_ARGUMENT

[Validation] generic error code for invalid arguments

ZE_RESULT_ERROR_INVALID_NULL_HANDLE

[Validation] handle argument is not valid

ZE_RESULT_ERROR_HANDLE_OBJECT_IN_USE

[Validation] object pointed to by handle still in-use by device

ZE_RESULT_ERROR_INVALID_NULL_POINTER

[Validation] pointer argument may not be nullptr

ZE_RESULT_ERROR_INVALID_SIZE

[Validation] size argument is invalid (e.g., must not be zero)

ZE_RESULT_ERROR_UNSUPPORTED_SIZE

[Validation] size argument is not supported by the device (e.g., too large)

ZE_RESULT_ERROR_UNSUPPORTED_ALIGNMENT

[Validation] alignment argument is not supported by the device (e.g., too small)

ZE_RESULT_ERROR_INVALID_SYNCHRONIZATION_OBJECT

[Validation] synchronization object in invalid state

ZE_RESULT_ERROR_INVALID_ENUMERATION

[Validation] enumerator argument is not valid

ZE_RESULT_ERROR_UNSUPPORTED_ENUMERATION

[Validation] enumerator argument is not supported by the device

ZE_RESULT_ERROR_UNSUPPORTED_IMAGE_FORMAT

[Validation] image format is not supported by the device

ZE_RESULT_ERROR_INVALID_NATIVE_BINARY

[Validation] native binary is not supported by the device

ZE_RESULT_ERROR_INVALID_GLOBAL_NAME

[Validation] global variable is not found in the module

ZE_RESULT_ERROR_INVALID_KERNEL_NAME

[Validation] kernel name is not found in the module

ZE_RESULT_ERROR_INVALID_FUNCTION_NAME

[Validation] function name is not found in the module

ZE_RESULT_ERROR_INVALID_GROUP_SIZE_DIMENSION

[Validation] group size dimension is not valid for the kernel or device

ZE_RESULT_ERROR_INVALID_GLOBAL_WIDTH_DIMENSION

[Validation] global width dimension is not valid for the kernel or device

ZE_RESULT_ERROR_INVALID_KERNEL_ARGUMENT_INDEX

[Validation] kernel argument index is not valid for kernel

ZE_RESULT_ERROR_INVALID_KERNEL_ARGUMENT_SIZE

[Validation] kernel argument size does not match kernel

ZE_RESULT_ERROR_INVALID_KERNEL_ATTRIBUTE_VALUE

[Validation] value of kernel attribute is not valid for the kernel or device

ZE_RESULT_ERROR_INVALID_COMMAND_LIST_TYPE

[Validation] command list type does not match command queue type

ZE_RESULT_ERROR_OVERLAPPING_REGIONS

[Validation] copy operations do not support overlapping regions of memory

ZE_RESULT_ERROR_UNKNOWN = 0x7fffffff

[Core] unknown or internal error
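
The explicit values in the listing suggest that result codes are grouped into numeric ranges by category. The sketch below is only an illustration of that reading: the `DEMO_*` constants and both helpers are hypothetical stand-ins (a real program would use the definitions from the Level Zero header), and the range logic assumes that enumerators without an explicit value continue counting from the previous one, as C enums do.

```c
#include <assert.h>

/* Local stand-ins for the explicitly valued codes listed above. */
enum {
    DEMO_RESULT_SUCCESS = 0,
    DEMO_RESULT_NOT_READY = 1,
    DEMO_RESULT_ERROR_DEVICE_LOST = 0x70000001,
    DEMO_RESULT_ERROR_INSUFFICIENT_PERMISSIONS = 0x70010000,
    DEMO_RESULT_ERROR_UNINITIALIZED = 0x78000001,
    DEMO_RESULT_ERROR_UNKNOWN = 0x7fffffff
};

/* Hypothetical helper: maps a code to the bracketed tag used in the listing. */
static const char *demo_result_category(int code) {
    assert(code >= 0); /* every code listed above is non-negative */
    if (code <= DEMO_RESULT_NOT_READY) return "Core";
    if (code < DEMO_RESULT_ERROR_DEVICE_LOST) return "unlisted";
    if (code == DEMO_RESULT_ERROR_UNKNOWN) return "Core";
    if (code >= DEMO_RESULT_ERROR_UNINITIALIZED) return "Validation";
    if (code >= DEMO_RESULT_ERROR_INSUFFICIENT_PERMISSIONS) return "Tools";
    return "Core";
}

/* Treats every code whose name contains ERROR as a failure. */
static int demo_result_is_error(int code) {
    return code != DEMO_RESULT_SUCCESS && code != DEMO_RESULT_NOT_READY;
}
```

A caller would typically branch on `demo_result_is_error` first and use the category only for logging.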

Common Structures

ze_ipc_mem_handle_t

struct ze_ipc_mem_handle_t

IPC handle to a memory allocation.

Public Members

char data[ZE_MAX_IPC_HANDLE_SIZE]

Opaque data representing an IPC handle.

ze_ipc_event_pool_handle_t

struct ze_ipc_event_pool_handle_t

IPC handle to an event pool allocation.

Public Members

char data[ZE_MAX_IPC_HANDLE_SIZE]

Opaque data representing an IPC handle.

Driver

Driver Functions

zeInit

__ze_api_export ze_result_t __zecall zeInit(ze_init_flag_t flags)

Initializes the ‘One API’ driver. This function must be called before any other API function.

Parameters
  • flags: initialization flags

  • If this function is not called then all other functions will return ZE_RESULT_ERROR_UNINITIALIZED.

  • Only one instance of a driver per process will be initialized.

  • This function is thread-safe for scenarios where multiple libraries may initialize the driver simultaneously.

Return

zeDriverGet

__ze_api_export ze_result_t __zecall zeDriverGet(uint32_t *pCount, ze_driver_handle_t *phDrivers)

Retrieves driver instances.

Parameters
  • pCount: pointer to the number of driver instances. If count is zero, then the loader will update the value with the total number of drivers available. If count is non-zero, then the loader will only retrieve that number of drivers. If count is larger than the number of drivers available, then the loader will update the value with the correct number of drivers available.

  • phDrivers: [optional][range(0, *pCount)] array of driver instance handles

  • A driver represents a collection of physical devices.

  • The application may pass nullptr for phDrivers when only querying the number of drivers.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.
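
The count semantics described for pCount lend themselves to a two-call pattern: query the count with a null array, allocate, then retrieve. The sketch below illustrates this against a mock, not the real driver. `demo_get`, `demo_handle_t`, `DEMO_AVAILABLE` and `demo_enumerate` are hypothetical names; the mock only imitates the rules stated above and returns 0 for success, mirroring ZE_RESULT_SUCCESS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef int demo_handle_t;   /* stand-in for ze_driver_handle_t */
#define DEMO_AVAILABLE 3u    /* number of items the mock exposes */

/* Hypothetical mock that follows the pCount rules described above. */
static int demo_get(uint32_t *pCount, demo_handle_t *phItems) {
    if (pCount == NULL) return -1;
    if (*pCount == 0) {                 /* zero: report the total only */
        *pCount = DEMO_AVAILABLE;
        return 0;
    }
    if (*pCount > DEMO_AVAILABLE) {     /* too large: correct the count */
        *pCount = DEMO_AVAILABLE;
    }
    if (phItems != NULL) {              /* retrieve at most *pCount items */
        for (uint32_t i = 0; i < *pCount; ++i) {
            phItems[i] = (demo_handle_t)(100 + i);
        }
    }
    return 0;
}

/* Two-call pattern: query the count, allocate, then retrieve.
 * The caller owns the returned array and releases it with free(). */
static demo_handle_t *demo_enumerate(uint32_t *outCount) {
    uint32_t count = 0;
    assert(outCount != NULL);
    *outCount = 0;
    if (demo_get(&count, NULL) != 0 || count == 0) return NULL;
    demo_handle_t *items = malloc(count * sizeof *items);
    if (items == NULL) return NULL;
    if (demo_get(&count, items) != 0) {
        free(items);
        return NULL;
    }
    *outCount = count;
    return items;
}
```

The same shape should carry over to the other count-based queries in this section, such as the device and sub-device enumerations, since they describe pCount in the same terms.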

Remark

Analogues

  • clGetPlatformIDs

Return

zeDriverGetApiVersion

__ze_api_export ze_result_t __zecall zeDriverGetApiVersion(ze_driver_handle_t hDriver, ze_api_version_t *version)

Returns the API version supported by the specified driver.

Parameters
  • hDriver: handle of the driver instance

  • version: api version

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDriverGetProperties

__ze_api_export ze_result_t __zecall zeDriverGetProperties(ze_driver_handle_t hDriver, ze_driver_properties_t *pDriverProperties)

Retrieves properties of the driver.

Parameters
  • hDriver: handle of the driver instance

  • pDriverProperties: query result for driver properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clGetPlatformInfo

Return

zeDriverGetIPCProperties

__ze_api_export ze_result_t __zecall zeDriverGetIPCProperties(ze_driver_handle_t hDriver, ze_driver_ipc_properties_t *pIPCProperties)

Retrieves IPC attributes of the driver.

Parameters
  • hDriver: handle of the driver instance

  • pIPCProperties: query result for IPC properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDriverGetExtensionFunctionAddress

__ze_api_export ze_result_t __zecall zeDriverGetExtensionFunctionAddress(ze_driver_handle_t hDriver, const char *pFuncName, void **pfunc)

Retrieves an extension function for the specified driver.

Parameters
  • hDriver: handle of the driver instance

  • pFuncName: name of the extension function

  • pfunc: pointer to extension function

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clGetExtensionFunctionAddressForPlatform

Return

Driver Enums

ze_init_flag_t

enum ze_init_flag_t

Supported initialization flags.

Values:

ZE_INIT_FLAG_NONE = 0

default behavior

ZE_INIT_FLAG_GPU_ONLY = ZE_BIT(0)

only initialize GPU drivers

ze_api_version_t

enum ze_api_version_t

Supported API versions.

  • API versions contain major and minor attributes; use ZE_MAJOR_VERSION and ZE_MINOR_VERSION to extract them.

Values:

ZE_API_VERSION_1_0 = ZE_MAKE_VERSION(0, 91)

0.91
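
As an illustration of how a packed version value can be built and taken apart, the sketch below defines hypothetical `DEMO_*` macros. The bit layout (major in the upper 16 bits, minor in the lower 16 bits) is an assumption made for this example; the actual ZE_MAKE_VERSION, ZE_MAJOR_VERSION and ZE_MINOR_VERSION definitions should be taken from the Level Zero header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: major in the upper 16 bits, minor in the lower 16 bits. */
#define DEMO_MAKE_VERSION(major, minor) \
    ((uint32_t)(((uint32_t)(major) << 16) | ((uint32_t)(minor) & 0x0000ffffu)))
#define DEMO_MAJOR_VERSION(ver) ((uint32_t)(ver) >> 16)
#define DEMO_MINOR_VERSION(ver) ((uint32_t)(ver) & 0x0000ffffu)

/* Hypothetical helper: true if ver is at least major.minor. */
static int demo_version_at_least(uint32_t ver, uint32_t major, uint32_t minor) {
    assert(major <= 0xffffu && minor <= 0xffffu); /* must fit the 16-bit fields */
    uint32_t vmaj = DEMO_MAJOR_VERSION(ver);
    uint32_t vmin = DEMO_MINOR_VERSION(ver);
    return vmaj > major || (vmaj == major && vmin >= minor);
}
```

An application could compare the value returned by zeDriverGetApiVersion against a minimum version in this way before relying on newer features.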

ze_driver_properties_version_t

enum ze_driver_properties_version_t

API version of ze_driver_properties_t.

Values:

ZE_DRIVER_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_driver_ipc_properties_version_t

enum ze_driver_ipc_properties_version_t

API version of ze_driver_ipc_properties_t.

Values:

ZE_DRIVER_IPC_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

Driver Structures

ze_driver_uuid_t

struct ze_driver_uuid_t

Driver universal unique id (UUID)

Public Members

uint8_t id[ZE_MAX_DRIVER_UUID_SIZE]

Opaque data representing a driver UUID.

ze_driver_properties_t

struct ze_driver_properties_t

Driver properties queried using zeDriverGetProperties.

Public Members

ze_driver_properties_version_t version

[in] ZE_DRIVER_PROPERTIES_VERSION_CURRENT

ze_driver_uuid_t uuid

[out] universal unique identifier.

uint32_t driverVersion

[out] driver version. The driver version is a non-zero, monotonically increasing value where higher values always indicate a more recent version.

ze_driver_ipc_properties_t

struct ze_driver_ipc_properties_t

IPC properties queried using zeDriverGetIPCProperties.

Public Members

ze_driver_ipc_properties_version_t version

[in] ZE_DRIVER_IPC_PROPERTIES_VERSION_CURRENT

ze_bool_t memsSupported

[out] Supports passing memory allocations between processes. See zeDriverGetMemIpcHandle.

ze_bool_t eventsSupported

[out] Supports passing events between processes. See zeEventPoolGetIpcHandle.

Device

Device Functions

zeDeviceGet

__ze_api_export ze_result_t __zecall zeDeviceGet(ze_driver_handle_t hDriver, uint32_t *pCount, ze_device_handle_t *phDevices)

Retrieves devices within a driver.

Parameters
  • hDriver: handle of the driver instance

  • pCount: pointer to the number of devices. If count is zero, then the driver will update the value with the total number of devices available. If count is non-zero, then the driver will only retrieve that number of devices. If count is larger than the number of devices available, then the driver will update the value with the correct number of devices available.

  • phDevices: [optional][range(0, *pCount)] array of device handles

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDeviceGetSubDevices

__ze_api_export ze_result_t __zecall zeDeviceGetSubDevices(ze_device_handle_t hDevice, uint32_t *pCount, ze_device_handle_t *phSubdevices)

Retrieves a sub-device from a device.

Parameters
  • hDevice: handle of the device object

  • pCount: pointer to the number of sub-devices. If count is zero, then the driver will update the value with the total number of sub-devices available. If count is non-zero, then the driver will only retrieve that number of sub-devices. If count is larger than the number of sub-devices available, then the driver will update the value with the correct number of sub-devices available.

  • phSubdevices: [optional][range(0, *pCount)] array of sub-device handles

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clCreateSubDevices

Return

zeDeviceGetProperties

__ze_api_export ze_result_t __zecall zeDeviceGetProperties(ze_device_handle_t hDevice, ze_device_properties_t *pDeviceProperties)

Retrieves properties of the device.

Parameters
  • hDevice: handle of the device

  • pDeviceProperties: query result for device properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clGetDeviceInfo

Return

zeDeviceGetComputeProperties

__ze_api_export ze_result_t __zecall zeDeviceGetComputeProperties(ze_device_handle_t hDevice, ze_device_compute_properties_t *pComputeProperties)

Retrieves compute properties of the device.

Parameters
  • hDevice: handle of the device

  • pComputeProperties: query result for compute properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clGetDeviceInfo

Return

zeDeviceGetKernelProperties

__ze_api_export ze_result_t __zecall zeDeviceGetKernelProperties(ze_device_handle_t hDevice, ze_device_kernel_properties_t *pKernelProperties)

Retrieves kernel properties of the device.

Parameters
  • hDevice: handle of the device

  • pKernelProperties: query result for kernel properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDeviceGetMemoryProperties

__ze_api_export ze_result_t __zecall zeDeviceGetMemoryProperties(ze_device_handle_t hDevice, uint32_t *pCount, ze_device_memory_properties_t *pMemProperties)

Retrieves local memory properties of the device.

Parameters
  • hDevice: handle of the device

  • pCount: pointer to the number of memory properties. If count is zero, then the driver will update the value with the total number of memory properties available. If count is non-zero, then the driver will only retrieve that number of memory properties. If count is larger than the number of memory properties available, then the driver will update the value with the correct number of memory properties available.

  • pMemProperties: [optional][range(0, *pCount)] array of query results for memory properties

  • Properties are reported for each physical memory type supported by the device.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clGetDeviceInfo

Return

zeDeviceGetMemoryAccessProperties

__ze_api_export ze_result_t __zecall zeDeviceGetMemoryAccessProperties(ze_device_handle_t hDevice, ze_device_memory_access_properties_t *pMemAccessProperties)

Retrieves memory access properties of the device.

Parameters
  • hDevice: handle of the device

  • pMemAccessProperties: query result for memory access properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clGetDeviceInfo

Return

zeDeviceGetCacheProperties

__ze_api_export ze_result_t __zecall zeDeviceGetCacheProperties(ze_device_handle_t hDevice, ze_device_cache_properties_t *pCacheProperties)

Retrieves cache properties of the device.

Parameters
  • hDevice: handle of the device

  • pCacheProperties: query result for cache properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clGetDeviceInfo

Return

zeDeviceGetImageProperties

__ze_api_export ze_result_t __zecall zeDeviceGetImageProperties(ze_device_handle_t hDevice, ze_device_image_properties_t *pImageProperties)

Retrieves image properties of the device.

Parameters
  • hDevice: handle of the device

  • pImageProperties: query result for image properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDeviceGetP2PProperties

__ze_api_export ze_result_t __zecall zeDeviceGetP2PProperties(ze_device_handle_t hDevice, ze_device_handle_t hPeerDevice, ze_device_p2p_properties_t *pP2PProperties)

Retrieves Peer-to-Peer properties between one device and a peer device.

Parameters
  • hDevice: handle of the device performing the access

  • hPeerDevice: handle of the peer device with the allocation

  • pP2PProperties: Peer-to-Peer properties between source and peer device

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDeviceCanAccessPeer

__ze_api_export ze_result_t __zecall zeDeviceCanAccessPeer(ze_device_handle_t hDevice, ze_device_handle_t hPeerDevice, ze_bool_t *value)

Queries if one device can directly access peer device allocations.

Parameters
  • hDevice: handle of the device performing the access

  • hPeerDevice: handle of the peer device with the allocation

  • value: returned access capability

  • Any device can access any other device within a node through a scale-up fabric.

  • The following are conditions for CanAccessPeer query.

    • If both device and peer device are the same then return true.

    • If both sub-device and peer sub-device are the same then return true.

    • If both are sub-devices and share the same parent device then return true.

    • If both device and remote device are connected by a direct or indirect scale-up fabric or over PCIe (same root complex or shared PCIe switch) then return true.

    • If both sub-device and remote parent device (and vice-versa) are connected by a direct or indirect scale-up fabric or over PCIe (same root complex or shared PCIe switch) then return true.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.
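
The conditions listed for the CanAccessPeer query can be read as a short decision procedure. The sketch below models them on a deliberately simplified, hypothetical device description: `demo_device`, its fields and `demo_can_access_peer` are inventions for this example and do not correspond to any real driver data structure. Connectivity by fabric or PCIe is reduced to a shared `linkDomain` number, which is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a device or sub-device. */
typedef struct demo_device {
    int id;          /* unique id of this device or sub-device */
    int parentId;    /* id of the parent device, or -1 for a root device */
    int linkDomain;  /* equal values model a fabric or PCIe connection */
} demo_device;

/* Applies the listed conditions in order; returns 1 for true, 0 for false. */
static int demo_can_access_peer(const demo_device *dev, const demo_device *peer) {
    assert(dev != NULL && peer != NULL);
    /* Same device, or same sub-device. */
    if (dev->id == peer->id) return 1;
    /* Both are sub-devices sharing the same parent device. */
    if (dev->parentId >= 0 && dev->parentId == peer->parentId) return 1;
    /* Connected by a scale-up fabric or over PCIe (modelled by linkDomain). */
    return dev->linkDomain == peer->linkDomain;
}
```

In an application, the answer comes from zeDeviceCanAccessPeer itself; the model only makes the order of the checks explicit.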

Return

zeDeviceSetLastLevelCacheConfig

__ze_api_export ze_result_t __zecall zeDeviceSetLastLevelCacheConfig(ze_device_handle_t hDevice, ze_cache_config_t CacheConfig)

Sets the preferred Last Level cache configuration for a device.

Parameters
  • hDevice: handle of the device

  • CacheConfig: the preferred cache configuration

  • The application may not call this function from simultaneous threads with the same device handle.

Return

Device Enums

ze_device_properties_version_t

enum ze_device_properties_version_t

API version of ze_device_properties_t.

Values:

ZE_DEVICE_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_device_type_t

enum ze_device_type_t

Supported device types.

Values:

ZE_DEVICE_TYPE_GPU = 1

Graphics Processing Unit.

ZE_DEVICE_TYPE_FPGA

Field Programmable Gate Array.

ze_device_compute_properties_version_t

enum ze_device_compute_properties_version_t

API version of ze_device_compute_properties_t.

Values:

ZE_DEVICE_COMPUTE_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_device_kernel_properties_version_t

enum ze_device_kernel_properties_version_t

API version of ze_device_kernel_properties_t.

Values:

ZE_DEVICE_KERNEL_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_fp_capabilities_t

enum ze_fp_capabilities_t

Floating Point capabilities.

  • floating-point capabilities of the device.

Values:

ZE_FP_CAPS_NONE = 0

None.

ZE_FP_CAPS_DENORM = ZE_BIT(0)

Supports denorms.

ZE_FP_CAPS_INF_NAN = ZE_BIT(1)

Supports INF and quiet NaNs.

ZE_FP_CAPS_ROUND_TO_NEAREST = ZE_BIT(2)

Supports rounding to nearest even rounding mode.

ZE_FP_CAPS_ROUND_TO_ZERO = ZE_BIT(3)

Supports rounding to zero.

ZE_FP_CAPS_ROUND_TO_INF = ZE_BIT(4)

Supports rounding to both positive and negative INF.

ZE_FP_CAPS_FMA = ZE_BIT(5)

Supports IEEE754-2008 fused multiply-add.

ZE_FP_CAPS_ROUNDED_DIVIDE_SQRT = ZE_BIT(6)

Supports rounding as defined by IEEE754 for divide and sqrt operations.

ZE_FP_CAPS_SOFT_FLOAT = ZE_BIT(7)

Uses software implementation for basic floating-point operations.
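
Because each capability occupies its own bit, a set of capabilities can be tested with a single mask comparison. The sketch below shows one way to do that. The `DEMO_*` names are hypothetical stand-ins for the values listed above, and `DEMO_BIT(i)` is assumed to behave like ZE_BIT, that is, to produce a value with only bit i set.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BIT(i) (1u << (i))  /* assumed equivalent of ZE_BIT */

/* Local stand-ins mirroring the bit positions listed above. */
enum {
    DEMO_FP_CAPS_NONE = 0,
    DEMO_FP_CAPS_DENORM = DEMO_BIT(0),
    DEMO_FP_CAPS_INF_NAN = DEMO_BIT(1),
    DEMO_FP_CAPS_ROUND_TO_NEAREST = DEMO_BIT(2),
    DEMO_FP_CAPS_ROUND_TO_ZERO = DEMO_BIT(3),
    DEMO_FP_CAPS_ROUND_TO_INF = DEMO_BIT(4),
    DEMO_FP_CAPS_FMA = DEMO_BIT(5),
    DEMO_FP_CAPS_ROUNDED_DIVIDE_SQRT = DEMO_BIT(6),
    DEMO_FP_CAPS_SOFT_FLOAT = DEMO_BIT(7)
};

/* Hypothetical helper: true only if every bit in required is set in caps. */
static int demo_fp_has(uint32_t caps, uint32_t required) {
    assert(required != 0); /* asking for no capability is likely a mistake */
    return (caps & required) == required;
}
```

A typical use would be to check a field such as singleFpCapabilities for the combination of flags a kernel depends on before selecting a code path.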

ze_device_memory_properties_version_t

enum ze_device_memory_properties_version_t

API version of ze_device_memory_properties_t.

Values:

ZE_DEVICE_MEMORY_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_device_memory_access_properties_version_t

enum ze_device_memory_access_properties_version_t

API version of ze_device_memory_access_properties_t.

Values:

ZE_DEVICE_MEMORY_ACCESS_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_memory_access_capabilities_t

enum ze_memory_access_capabilities_t

Memory access capabilities.

  • Supported access capabilities for different types of memory allocations

Values:

ZE_MEMORY_ACCESS_NONE = 0

Access not supported.

ZE_MEMORY_ACCESS = ZE_BIT(0)

Supports load/store access.

ZE_MEMORY_ATOMIC_ACCESS = ZE_BIT(1)

Supports atomic access.

ZE_MEMORY_CONCURRENT_ACCESS = ZE_BIT(2)

Supports concurrent access.

ZE_MEMORY_CONCURRENT_ATOMIC_ACCESS = ZE_BIT(3)

Supports concurrent atomic access.

ze_device_cache_properties_version_t

enum ze_device_cache_properties_version_t

API version of ze_device_cache_properties_t.

Values:

ZE_DEVICE_CACHE_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_device_image_properties_version_t

enum ze_device_image_properties_version_t

API version of ze_device_image_properties_t.

Values:

ZE_DEVICE_IMAGE_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_device_p2p_properties_version_t

enum ze_device_p2p_properties_version_t

API version of ze_device_p2p_properties_t.

Values:

ZE_DEVICE_P2P_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_cache_config_t

enum ze_cache_config_t

Supported Cache Config.

  • Supported Cache Config (Default, Large SLM, Large Data Cache)

Values:

ZE_CACHE_CONFIG_DEFAULT = ZE_BIT(0)

Default Config.

ZE_CACHE_CONFIG_LARGE_SLM = ZE_BIT(1)

Large SLM size.

ZE_CACHE_CONFIG_LARGE_DATA = ZE_BIT(2)

Large General Data size.

Device Structures

ze_device_uuid_t

struct ze_device_uuid_t

Device universal unique id (UUID)

Public Members

uint8_t id[ZE_MAX_DEVICE_UUID_SIZE]

Opaque data representing a device UUID.

ze_device_properties_t

struct ze_device_properties_t

Device properties queried using zeDeviceGetProperties.

Public Members

ze_device_properties_version_t version

[in] ZE_DEVICE_PROPERTIES_VERSION_CURRENT

ze_device_type_t type

[out] generic device type

uint32_t vendorId

[out] vendor id from PCI configuration

uint32_t deviceId

[out] device id from PCI configuration

ze_device_uuid_t uuid

[out] universal unique identifier.

ze_bool_t isSubdevice

[out] If the device handle used for query represents a sub-device.

uint32_t subdeviceId

[out] sub-device id. Only valid if isSubdevice is true.

uint32_t coreClockRate

[out] Clock rate for device core.

ze_bool_t unifiedMemorySupported

[out] Supports unified physical memory between Host and device.

ze_bool_t eccMemorySupported

[out] Supports error correction memory access.

ze_bool_t onDemandPageFaultsSupported

[out] Supports on-demand page-faulting.

uint32_t maxCommandQueues

[out] Maximum number of logical command queues.

uint32_t numAsyncComputeEngines

[out] Number of asynchronous compute engines

uint32_t numAsyncCopyEngines

[out] Number of asynchronous copy engines

uint32_t maxCommandQueuePriority

[out] Maximum priority for command queues. Higher value is higher priority.

uint32_t numThreadsPerEU

[out] Number of threads per EU.

uint32_t physicalEUSimdWidth

[out] The physical EU simd width.

uint32_t numEUsPerSubslice

[out] Number of EUs per sub-slice.

uint32_t numSubslicesPerSlice

[out] Number of sub-slices per slice.

uint32_t numSlices

[out] Number of slices.

uint64_t timerResolution

[out] Returns the resolution of device timer in nanoseconds used for profiling, timestamps, etc.

char name[ZE_MAX_DEVICE_NAME]

[out] Device name
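
The topology fields above (slices, sub-slices per slice, EUs per sub-slice, threads per EU) appear to describe a strict hierarchy, in which case device-wide totals follow by multiplication. The sketch below rests on that reading, which is inferred from the field names and is not stated in the listing. `demo_topology` is a hypothetical struct carrying only the four relevant fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of ze_device_properties_t used for this example. */
typedef struct demo_topology {
    uint32_t numThreadsPerEU;
    uint32_t numEUsPerSubslice;
    uint32_t numSubslicesPerSlice;
    uint32_t numSlices;
} demo_topology;

/* Total EUs, assuming every slice and sub-slice is fully populated. */
static uint64_t demo_total_eus(const demo_topology *t) {
    assert(t != NULL);
    return (uint64_t)t->numSlices * t->numSubslicesPerSlice * t->numEUsPerSubslice;
}

/* Total hardware threads under the same assumption. */
static uint64_t demo_total_threads(const demo_topology *t) {
    return demo_total_eus(t) * t->numThreadsPerEU;
}
```

The results are upper bounds under the stated assumption and are mainly useful for sizing work, not as guaranteed hardware counts.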

ze_device_compute_properties_t

struct ze_device_compute_properties_t

Device compute properties queried using zeDeviceGetComputeProperties.

Public Members

ze_device_compute_properties_version_t version

[in] ZE_DEVICE_COMPUTE_PROPERTIES_VERSION_CURRENT

uint32_t maxTotalGroupSize

[out] Maximum items per compute group. (maxGroupSizeX * maxGroupSizeY * maxGroupSizeZ) <= maxTotalGroupSize

uint32_t maxGroupSizeX

[out] Maximum items for X dimension in group

uint32_t maxGroupSizeY

[out] Maximum items for Y dimension in group

uint32_t maxGroupSizeZ

[out] Maximum items for Z dimension in group

uint32_t maxGroupCountX

[out] Maximum groups that can be launched for x dimension

uint32_t maxGroupCountY

[out] Maximum groups that can be launched for y dimension

uint32_t maxGroupCountZ

[out] Maximum groups that can be launched for z dimension

uint32_t maxSharedLocalMemory

[out] Maximum shared local memory per group.

uint32_t numSubGroupSizes

[out] Number of subgroup sizes supported. This indicates number of entries in subGroupSizes.

uint32_t subGroupSizes[ZE_SUBGROUPSIZE_COUNT]

[out] Subgroup sizes supported.
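
Reading the limits in this structure as constraints on a chosen group size, a launch configuration can be checked on the host before it is submitted. The sketch below does this with a hypothetical `demo_compute_limits` struct that carries only the four relevant fields. It assumes each dimension must be at least 1, must not exceed its per-dimension maximum, and that the product of the three must not exceed maxTotalGroupSize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of ze_device_compute_properties_t. */
typedef struct demo_compute_limits {
    uint32_t maxTotalGroupSize;
    uint32_t maxGroupSizeX;
    uint32_t maxGroupSizeY;
    uint32_t maxGroupSizeZ;
} demo_compute_limits;

/* Returns 1 if the group size x*y*z fits the limits, 0 otherwise. */
static int demo_group_size_ok(const demo_compute_limits *lim,
                              uint32_t x, uint32_t y, uint32_t z) {
    assert(lim != NULL);
    if (x == 0 || y == 0 || z == 0) return 0;
    if (x > lim->maxGroupSizeX) return 0;
    if (y > lim->maxGroupSizeY) return 0;
    if (z > lim->maxGroupSizeZ) return 0;
    /* Multiply in 64 bits and check after each step to avoid overflow. */
    uint64_t total = (uint64_t)x * y;
    if (total > lim->maxTotalGroupSize) return 0;
    total *= z;
    return total <= lim->maxTotalGroupSize;
}
```

A failing check here corresponds to the situation that ZE_RESULT_ERROR_INVALID_GROUP_SIZE_DIMENSION describes, although the driver remains the authority on what is accepted.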

ze_native_kernel_uuid_t

struct ze_native_kernel_uuid_t

Native kernel universal unique id (UUID)

Public Members

uint8_t id[ZE_MAX_NATIVE_KERNEL_UUID_SIZE]

Opaque data representing a native kernel UUID.

ze_device_kernel_properties_t

struct ze_device_kernel_properties_t

Device properties queried using zeDeviceGetKernelProperties.

Public Members

ze_device_kernel_properties_version_t version

[in] ZE_DEVICE_KERNEL_PROPERTIES_VERSION_CURRENT

uint32_t spirvVersionSupported

[out] Maximum supported SPIR-V version. Returns zero if SPIR-V is not supported. Contains major and minor attributes, use ZE_MAJOR_VERSION and ZE_MINOR_VERSION.

ze_native_kernel_uuid_t nativeKernelSupported

[out] Compatibility UUID of supported native kernel. UUID may or may not be the same across driver releases, devices, or operating systems. Application is responsible for ensuring UUID matches before creating module using previously created native kernel.

ze_bool_t fp16Supported

[out] Supports 16-bit floating-point operations

ze_bool_t fp64Supported

[out] Supports 64-bit floating-point operations

ze_bool_t int64AtomicsSupported

[out] Supports 64-bit atomic operations

ze_bool_t dp4aSupported

[out] Supports four component dot product and accumulate operations

ze_fp_capabilities_t halfFpCapabilities

[out] Capabilities for half-precision floating-point operations.

ze_fp_capabilities_t singleFpCapabilities

[out] Capabilities for single-precision floating-point operations.

ze_fp_capabilities_t doubleFpCapabilities

[out] Capabilities for double-precision floating-point operations.

uint32_t maxArgumentsSize

[out] Maximum kernel argument size that is supported.

uint32_t printfBufferSize

[out] Maximum size of internal buffer that holds output of printf calls from kernel.

ze_device_memory_properties_t

struct ze_device_memory_properties_t

Device local memory properties queried using zeDeviceGetMemoryProperties.

Public Members

ze_device_memory_properties_version_t version

[in] ZE_DEVICE_MEMORY_PROPERTIES_VERSION_CURRENT

uint32_t maxClockRate

[out] Maximum clock rate for device memory.

uint32_t maxBusWidth

[out] Maximum bus width between device and memory.

uint64_t totalSize

[out] Total memory size in bytes.

ze_device_memory_access_properties_t

struct ze_device_memory_access_properties_t

Device memory access properties queried using zeDeviceGetMemoryAccessProperties.

Public Members

ze_device_memory_access_properties_version_t version

[in] ZE_DEVICE_MEMORY_ACCESS_PROPERTIES_VERSION_CURRENT

ze_memory_access_capabilities_t hostAllocCapabilities

[out] Bitfield describing host memory capabilities

ze_memory_access_capabilities_t deviceAllocCapabilities

[out] Bitfield describing device memory capabilities

ze_memory_access_capabilities_t sharedSingleDeviceAllocCapabilities

[out] Bitfield describing shared (single-device) memory capabilities

ze_memory_access_capabilities_t sharedCrossDeviceAllocCapabilities

[out] Bitfield describing shared (cross-device) memory capabilities

ze_memory_access_capabilities_t sharedSystemAllocCapabilities

[out] Bitfield describing shared (system) memory capabilities

ze_device_cache_properties_t

struct ze_device_cache_properties_t

Device cache properties queried using zeDeviceGetCacheProperties.

Public Members

ze_device_cache_properties_version_t version

[in] ZE_DEVICE_CACHE_PROPERTIES_VERSION_CURRENT

ze_bool_t intermediateCacheControlSupported

[out] Supports user control of the Intermediate Cache (i.e., resizing the SLM section vs. the Generic Cache)

size_t intermediateCacheSize

[out] Per-cache Intermediate Cache (L1/L2) size, in bytes

uint32_t intermediateCachelineSize

[out] Cacheline size in bytes for intermediate cacheline (L1/L2).

ze_bool_t lastLevelCacheSizeControlSupported

[out] Supports user control of the Last Level Cache (i.e., resizing the SLM section vs. the Generic Cache).

size_t lastLevelCacheSize

[out] Per-cache Last Level Cache (L3) size, in bytes

uint32_t lastLevelCachelineSize

[out] Cacheline size in bytes for last-level cacheline (L3).

ze_device_image_properties_t

struct ze_device_image_properties_t

Device image properties queried using zeDeviceGetImageProperties.

Public Members

ze_device_image_properties_version_t version

[in] ZE_DEVICE_IMAGE_PROPERTIES_VERSION_CURRENT

ze_bool_t supported

[out] Supports reading and writing of images. See zeImageGetProperties for format-specific capabilities.

uint32_t maxImageDims1D

[out] Maximum image dimensions for 1D resources.

uint32_t maxImageDims2D

[out] Maximum image dimensions for 2D resources.

uint32_t maxImageDims3D

[out] Maximum image dimensions for 3D resources.

uint64_t maxImageBufferSize

[out] Maximum image buffer size in bytes.

uint32_t maxImageArraySlices

[out] Maximum image array slices

uint32_t maxSamplers

[out] Max samplers that can be used in kernel.

uint32_t maxReadImageArgs

[out] Returns the maximum number of simultaneous image objects that can be read from by a kernel.

uint32_t maxWriteImageArgs

[out] Returns the maximum number of simultaneous image objects that can be written to by a kernel.

ze_device_p2p_properties_t

struct ze_device_p2p_properties_t

Device properties queried using zeDeviceGetP2PProperties.

Public Members

ze_device_p2p_properties_version_t version

[in] ZE_DEVICE_P2P_PROPERTIES_VERSION_CURRENT

ze_bool_t accessSupported

[out] Supports access between peer devices.

ze_bool_t atomicsSupported

[out] Supports atomics between peer devices.

Cmdqueue

Cmdqueue Functions

zeCommandQueueCreate

__ze_api_export ze_result_t __zecall zeCommandQueueCreate(ze_device_handle_t hDevice, const ze_command_queue_desc_t *desc, ze_command_queue_handle_t *phCommandQueue)

Creates a command queue on the device.

Parameters
  • hDevice: handle of the device object

  • desc: pointer to command queue descriptor

  • phCommandQueue: pointer to handle of command queue object created

  • The command queue can only be used on the device on which it was created.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clCreateCommandQueue

Return

zeCommandQueueDestroy

__ze_api_export ze_result_t __zecall zeCommandQueueDestroy(ze_command_queue_handle_t hCommandQueue)

Destroys a command queue.

Parameters
  • hCommandQueue: [release] handle of command queue object to destroy

  • The application is responsible for making sure the device is not currently referencing the command queue before it is deleted

  • The implementation of this function will immediately free all Host and Device allocations associated with this command queue

  • The application may not call this function from simultaneous threads with the same command queue handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clReleaseCommandQueue

Return

zeCommandQueueExecuteCommandLists

__ze_api_export ze_result_t __zecall zeCommandQueueExecuteCommandLists(ze_command_queue_handle_t hCommandQueue, uint32_t numCommandLists, ze_command_list_handle_t *phCommandLists, ze_fence_handle_t hFence)

Executes a command list in a command queue.

Parameters
  • hCommandQueue: handle of the command queue

  • numCommandLists: number of command lists to execute

  • phCommandLists: [range(0, numCommandLists)] list of handles of the command lists to execute

  • hFence: [optional] handle of the fence to signal on completion

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • vkQueueSubmit

Return

zeCommandQueueSynchronize

__ze_api_export ze_result_t __zecall zeCommandQueueSynchronize(ze_command_queue_handle_t hCommandQueue, uint32_t timeout)

Synchronizes a command queue by waiting on the host.

Parameters
  • hCommandQueue: handle of the command queue

  • timeout: if non-zero, then indicates the maximum time to yield before returning ZE_RESULT_SUCCESS or ZE_RESULT_NOT_READY; if zero, then operates exactly like zeFenceQueryStatus; if UINT32_MAX, then function will not return until complete or device is lost.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

Cmdqueue Enums

ze_command_queue_desc_version_t

enum ze_command_queue_desc_version_t

API version of ze_command_queue_desc_t.

Values:

ZE_COMMAND_QUEUE_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91
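
The descriptor version values in this reference are built with ZE_MAKE_VERSION. The following is a minimal sketch of how such a value could be packed and unpacked, assuming the major number occupies the upper 16 bits and the minor number the lower 16 bits. This section does not show the macro definition, so the layout and the demo_* function names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: major in bits 31..16, minor in bits 15..0.
 * The demo_* names are hypothetical stand-ins, not part of the API. */
uint32_t demo_make_version(uint32_t major, uint32_t minor)
{
    assert(major <= 0xffffu && minor <= 0xffffu);
    return (major << 16) | minor;
}

/* Extracts the major number from a packed version value. */
uint32_t demo_major_version(uint32_t ver)
{
    return ver >> 16;
}

/* Extracts the minor number from a packed version value. */
uint32_t demo_minor_version(uint32_t ver)
{
    return ver & 0xffffu;
}
```

Under this assumed layout, version 0.91 packs to major 0 and minor 91, so two descriptor versions can be compared as plain integers.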

ze_command_queue_flag_t

enum ze_command_queue_flag_t

Supported command queue flags.

Values:

ZE_COMMAND_QUEUE_FLAG_NONE = 0

default behavior

ZE_COMMAND_QUEUE_FLAG_COPY_ONLY = ZE_BIT(0)

command queue only supports enqueuing copy-only command lists

ZE_COMMAND_QUEUE_FLAG_LOGICAL_ONLY = ZE_BIT(1)

command queue is not tied to a physical command queue; driver may dynamically assign based on usage

ZE_COMMAND_QUEUE_FLAG_SINGLE_SLICE_ONLY = ZE_BIT(2)

command queue reserves and cannot consume more than a single slice. ‘slice’ size is device-specific. cannot be combined with COPY_ONLY.

ZE_COMMAND_QUEUE_FLAG_SUPPORTS_COOPERATIVE_KERNELS = ZE_BIT(3)

command queue supports command list with cooperative kernels. See zeCommandListAppendLaunchCooperativeKernel for more details. cannot be combined with COPY_ONLY.

ze_command_queue_mode_t

enum ze_command_queue_mode_t

Supported command queue modes.

Values:

ZE_COMMAND_QUEUE_MODE_DEFAULT = 0

implicit default behavior; uses driver-based heuristics

ZE_COMMAND_QUEUE_MODE_SYNCHRONOUS

Device execution always completes immediately on execute; Host thread is blocked using wait on implicit synchronization object

ZE_COMMAND_QUEUE_MODE_ASYNCHRONOUS

Device execution is scheduled and will complete in future; explicit synchronization object must be used to determine completeness

ze_command_queue_priority_t

enum ze_command_queue_priority_t

Supported command queue priorities.

Values:

ZE_COMMAND_QUEUE_PRIORITY_NORMAL = 0

[default] normal priority

ZE_COMMAND_QUEUE_PRIORITY_LOW

lower priority than normal

ZE_COMMAND_QUEUE_PRIORITY_HIGH

higher priority than normal

Cmdqueue Structures

ze_command_queue_desc_t

struct ze_command_queue_desc_t

Command Queue descriptor.

Public Members

ze_command_queue_desc_version_t version

[in] ZE_COMMAND_QUEUE_DESC_VERSION_CURRENT

ze_command_queue_flag_t flags

[in] creation flags

ze_command_queue_mode_t mode

[in] operation mode

ze_command_queue_priority_t priority

[in] priority

uint32_t ordinal

[in] if logical-only flag is set, then will be ignored; if supports-cooperative-kernels is set, then may be ignored; else-if copy-only flag is set, then must be less than ze_device_properties_t.numAsyncCopyEngines; otherwise must be less than ze_device_properties_t.numAsyncComputeEngines. When using sub-devices the ze_device_properties_t.numAsyncComputeEngines must be queried from the sub-device being used.
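
The flag exclusions and the ordinal rules above can be checked on the host before a queue is created. The following is a minimal sketch, not the driver's validation logic: the demo_* types and DEMO_Q_* constants are hypothetical mirrors of the documented names, ZE_BIT(n) is assumed to mean (1u << n), and the "may be ignored" case for cooperative kernels is treated as "any ordinal accepted".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the documented flag bits. */
#define DEMO_Q_COPY_ONLY            (1u << 0)
#define DEMO_Q_LOGICAL_ONLY         (1u << 1)
#define DEMO_Q_SINGLE_SLICE_ONLY    (1u << 2)
#define DEMO_Q_COOPERATIVE_KERNELS  (1u << 3)

/* Only the descriptor fields this sketch needs. */
typedef struct {
    uint32_t flags;
    uint32_t ordinal;
} demo_queue_desc_t;

/* Engine counts as reported by the device (or sub-device) in use. */
typedef struct {
    uint32_t numAsyncComputeEngines;
    uint32_t numAsyncCopyEngines;
} demo_device_properties_t;

/* Returns true when flags and ordinal respect the documented rules. */
bool demo_queue_desc_valid(const demo_queue_desc_t *desc,
                           const demo_device_properties_t *props)
{
    assert(desc != NULL && props != NULL);
    const uint32_t flags = desc->flags;

    /* SINGLE_SLICE_ONLY and SUPPORTS_COOPERATIVE_KERNELS exclude COPY_ONLY. */
    if ((flags & DEMO_Q_COPY_ONLY) &&
        (flags & (DEMO_Q_SINGLE_SLICE_ONLY | DEMO_Q_COOPERATIVE_KERNELS)))
        return false;

    if (flags & DEMO_Q_LOGICAL_ONLY)
        return true;   /* ordinal is ignored */
    if (flags & DEMO_Q_COOPERATIVE_KERNELS)
        return true;   /* ordinal may be ignored; accepted here */
    if (flags & DEMO_Q_COPY_ONLY)
        return desc->ordinal < props->numAsyncCopyEngines;
    return desc->ordinal < props->numAsyncComputeEngines;
}
```

When sub-devices are used, the engine counts passed in should be those queried from the sub-device, as the ordinal description requires.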

Cmdlist

Cmdlist Functions

zeCommandListCreate

__ze_api_export ze_result_t __zecall zeCommandListCreate(ze_device_handle_t hDevice, const ze_command_list_desc_t *desc, ze_command_list_handle_t *phCommandList)

Creates a command list on the device for submitting commands to any command queue.

Parameters
  • hDevice: handle of the device object

  • desc: pointer to command list descriptor

  • phCommandList: pointer to handle of command list object created

  • The command list can only be used on the device on which it was created.

  • The command list is created in the ‘open’ state.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeCommandListCreateImmediate

__ze_api_export ze_result_t __zecall zeCommandListCreateImmediate(ze_device_handle_t hDevice, const ze_command_queue_desc_t *altdesc, ze_command_list_handle_t *phCommandList)

Creates a command list on the device with an implicit command queue for immediate submission of commands.

Parameters
  • hDevice: handle of the device object

  • altdesc: pointer to command queue descriptor

  • phCommandList: pointer to handle of command list object created

  • The command list can only be used on the device on which it was created.

  • The command list is created in the ‘open’ state and never needs to be closed.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeCommandListDestroy

__ze_api_export ze_result_t __zecall zeCommandListDestroy(ze_command_list_handle_t hCommandList)

Destroys a command list.

Parameters
  • hCommandList: [release] handle of command list object to destroy

  • The application is responsible for making sure the device is not currently referencing the command list before it is deleted

  • The implementation of this function will immediately free all Host and Device allocations associated with this command list.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Return

zeCommandListClose

__ze_api_export ze_result_t __zecall zeCommandListClose(ze_command_list_handle_t hCommandList)

Closes a command list; ready to be executed by a command queue.

Parameters
  • hCommandList: handle of command list object to close

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Return

zeCommandListReset

__ze_api_export ze_result_t __zecall zeCommandListReset(ze_command_list_handle_t hCommandList)

Reset a command list to initial (empty) state; ready for appending commands.

Parameters
  • hCommandList: handle of command list object to reset

  • The application is responsible for making sure the device is not currently referencing the command list before it is reset

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.
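
The create, close, and reset functions above imply a small life cycle for a regular command list. The following host-only model is a sketch of that life cycle, not the driver's implementation. The demo_* names are hypothetical, and two rules are inferred rather than stated in this section: commands are appended only while the list is open, and only a closed list is handed to a queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEMO_LIST_OPEN, DEMO_LIST_CLOSED } demo_list_state_t;

typedef struct {
    demo_list_state_t state;
    size_t num_commands;
} demo_command_list_t;

/* Create: a new list starts in the 'open' state with no commands. */
void demo_list_create(demo_command_list_t *list)
{
    assert(list != NULL);
    list->state = DEMO_LIST_OPEN;
    list->num_commands = 0;
}

/* Append: accepted only while the list is open (inferred rule). */
bool demo_list_append(demo_command_list_t *list)
{
    assert(list != NULL);
    if (list->state != DEMO_LIST_OPEN)
        return false;
    list->num_commands++;
    return true;
}

/* Close: the list becomes ready to be executed by a command queue. */
void demo_list_close(demo_command_list_t *list)
{
    assert(list != NULL);
    list->state = DEMO_LIST_CLOSED;
}

/* Executable: only a closed list is submitted (inferred rule). */
bool demo_list_executable(const demo_command_list_t *list)
{
    assert(list != NULL);
    return list->state == DEMO_LIST_CLOSED;
}

/* Reset: back to the initial (empty) state, ready for appending. */
void demo_list_reset(demo_command_list_t *list)
{
    demo_list_create(list);
}
```

The model leaves out immediate command lists, which are created open and never need to be closed.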

Return

Cmdlist Enums

ze_command_list_desc_version_t

enum ze_command_list_desc_version_t

API version of ze_command_list_desc_t.

Values:

ZE_COMMAND_LIST_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_command_list_flag_t

enum ze_command_list_flag_t

Supported command list creation flags.

Values:

ZE_COMMAND_LIST_FLAG_NONE = 0

default behavior

ZE_COMMAND_LIST_FLAG_COPY_ONLY = ZE_BIT(0)

command list only contains copy operations (and synchronization primitives). this command list may only be submitted to a command queue created with ZE_COMMAND_QUEUE_FLAG_COPY_ONLY.

ZE_COMMAND_LIST_FLAG_RELAXED_ORDERING = ZE_BIT(1)

driver may reorder programs and copies between barriers and synchronization primitives. using this flag may increase Host overhead of zeCommandListClose. therefore, this flag should not be set for low-latency usage-models.

ZE_COMMAND_LIST_FLAG_MAXIMIZE_THROUGHPUT = ZE_BIT(2)

driver may perform additional optimizations that increase execution throughput. using this flag may increase Host overhead of zeCommandListClose and zeCommandQueueExecuteCommandLists. therefore, this flag should not be set for low-latency usage-models.

ZE_COMMAND_LIST_FLAG_EXPLICIT_ONLY = ZE_BIT(3)

command list should be optimized for submission to a single command queue and device engine. driver must disable any implicit optimizations for distributing work across multiple engines. this flag should be used when applications want full control over multi-engine submission and scheduling.

Cmdlist Structures

ze_command_list_desc_t

struct ze_command_list_desc_t

Command List descriptor.

Barrier

Barrier Functions

zeCommandListAppendBarrier

__ze_api_export ze_result_t __zecall zeCommandListAppendBarrier(ze_command_list_handle_t hCommandList, ze_event_handle_t hSignalEvent, uint32_t numWaitEvents, ze_event_handle_t *phWaitEvents)

Appends an execution and global memory barrier into a command list.

Parameters
  • hCommandList: handle of the command list

  • hSignalEvent: [optional] handle of the event to signal on completion

  • numWaitEvents: [optional] number of events to wait on before executing barrier

  • phWaitEvents: [optional][range(0, numWaitEvents)] handle of the events to wait on before executing barrier

  • If numWaitEvents is zero, then all previous commands are completed prior to the execution of the barrier.

  • If numWaitEvents is non-zero, then all phWaitEvents must be signaled prior to the execution of the barrier.

  • This command blocks all following commands from beginning until the execution of the barrier completes.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • vkCmdPipelineBarrier

  • clEnqueueBarrierWithWaitList

Return

zeCommandListAppendMemoryRangesBarrier

__ze_api_export ze_result_t __zecall zeCommandListAppendMemoryRangesBarrier(ze_command_list_handle_t hCommandList, uint32_t numRanges, const size_t *pRangeSizes, const void **pRanges, ze_event_handle_t hSignalEvent, uint32_t numWaitEvents, ze_event_handle_t *phWaitEvents)

Appends a global memory ranges barrier into a command list.

Parameters
  • hCommandList: handle of the command list

  • numRanges: number of memory ranges

  • pRangeSizes: [range(0, numRanges)] array of sizes of memory range

  • pRanges: [range(0, numRanges)] array of memory ranges

  • hSignalEvent: [optional] handle of the event to signal on completion

  • numWaitEvents: [optional] number of events to wait on before executing barrier

  • phWaitEvents: [optional][range(0, numWaitEvents)] handle of the events to wait on before executing barrier

  • If numWaitEvents is zero, then all previous commands are completed prior to the execution of the barrier.

  • If numWaitEvents is non-zero, then all phWaitEvents must be signaled prior to the execution of the barrier.

  • This command blocks all following commands from beginning until the execution of the barrier completes.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Return

zeDeviceSystemBarrier

__ze_api_export ze_result_t __zecall zeDeviceSystemBarrier(ze_device_handle_t hDevice)

Ensures in-bound writes to the device are globally observable.

Parameters
  • hDevice: handle of the device

  • This is a special-case system level barrier that can be used to ensure global observability of writes; typically needed after a producer (e.g., NIC) performs direct writes to the device’s memory (e.g., Direct RDMA writes). This is typically required when the memory corresponding to the writes is subsequently accessed from a remote device.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

Copy

Copy Functions

zeCommandListAppendMemoryCopy

__ze_api_export ze_result_t __zecall zeCommandListAppendMemoryCopy(ze_command_list_handle_t hCommandList, void *dstptr, const void *srcptr, size_t size, ze_event_handle_t hEvent)

Copies host, device, or shared memory.

Parameters
  • hCommandList: handle of command list

  • dstptr: pointer to destination memory to copy to

  • srcptr: pointer to source memory to copy from

  • size: size in bytes to copy

  • hEvent: [optional] handle of the event to signal on completion

  • The memory pointed to by both srcptr and dstptr must be accessible by the device on which the command list is created.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clEnqueueCopyBuffer

  • clEnqueueReadBuffer

  • clEnqueueWriteBuffer

  • clEnqueueSVMMemcpy

Return

zeCommandListAppendMemoryFill

__ze_api_export ze_result_t __zecall zeCommandListAppendMemoryFill(ze_command_list_handle_t hCommandList, void *ptr, const void *pattern, size_t pattern_size, size_t size, ze_event_handle_t hEvent)

Initializes host, device, or shared memory.

Parameters
  • hCommandList: handle of command list

  • ptr: pointer to memory to initialize

  • pattern: pointer to value to initialize memory to

  • pattern_size: size in bytes of the value to initialize memory to

  • size: size in bytes to initialize

  • hEvent: [optional] handle of the event to signal on completion

  • The memory pointed to by ptr must be accessible by the device on which the command list is created.

  • The value to initialize memory to is described by the pattern and the pattern size.

  • The pattern size must be a power of two.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.
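
The fill semantics above (a value described by pattern and pattern_size, repeated across size bytes) can be illustrated with a host-side reference. This is a sketch of the documented behavior on ordinary host memory, not the device implementation. demo_memory_fill is a hypothetical name, and truncating a trailing partial repeat when size is not a multiple of pattern_size is an assumption, since this section does not cover that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills size bytes at ptr with repeats of the pattern.
 * Returns 0 on success, -1 if pattern_size is zero or not a power of two. */
int demo_memory_fill(void *ptr, const void *pattern,
                     size_t pattern_size, size_t size)
{
    assert(ptr != NULL && pattern != NULL);

    /* The pattern size must be a power of two. */
    if (pattern_size == 0 || (pattern_size & (pattern_size - 1)) != 0)
        return -1;

    unsigned char *dst = ptr;
    size_t offset = 0;
    while (offset < size) {
        /* Assumption: a trailing partial repeat is truncated to fit. */
        size_t remaining = size - offset;
        size_t chunk = remaining < pattern_size ? remaining : pattern_size;
        memcpy(dst + offset, pattern, chunk);
        offset += chunk;
    }
    return 0;
}
```

A two-byte pattern {0xAA, 0xBB} applied to a six-byte buffer yields AA BB AA BB AA BB, while a three-byte pattern is rejected.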

Remark

Analogues

  • clEnqueueFillBuffer

  • clEnqueueSVMMemFill

Return

zeCommandListAppendMemoryCopyRegion

__ze_api_export ze_result_t __zecall zeCommandListAppendMemoryCopyRegion(ze_command_list_handle_t hCommandList, void *dstptr, const ze_copy_region_t *dstRegion, uint32_t dstPitch, uint32_t dstSlicePitch, const void *srcptr, const ze_copy_region_t *srcRegion, uint32_t srcPitch, uint32_t srcSlicePitch, ze_event_handle_t hEvent)

Copies a region from a 2D or 3D array of host, device, or shared memory.

Parameters
  • hCommandList: handle of command list

  • dstptr: pointer to destination memory to copy to

  • dstRegion: pointer to destination region to copy to

  • dstPitch: destination pitch in bytes

  • dstSlicePitch: destination slice pitch in bytes. This is required for 3D region copies where ze_copy_region_t::depth is not 0, otherwise it’s ignored.

  • srcptr: pointer to source memory to copy from

  • srcRegion: pointer to source region to copy from

  • srcPitch: source pitch in bytes

  • srcSlicePitch: source slice pitch in bytes. This is required for 3D region copies where ze_copy_region_t::depth is not 0, otherwise it’s ignored.

  • hEvent: [optional] handle of the event to signal on completion

  • The memory pointed to by both srcptr and dstptr must be accessible by the device on which the command list is created.

  • The region width, height, and depth for both src and dst must be same. The origins can be different.

  • The src and dst regions cannot be overlapping.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.
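
The addressing implied by the pitch, slice pitch, and region parameters can be made concrete with a host-side reference copy. This is a sketch on ordinary host memory under stated assumptions, not the device implementation: demo_copy_region_t is a hypothetical mirror of ze_copy_region_t, origins and width are taken in bytes, rows, and slices as that structure documents, and overlap between the two regions is not checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of ze_copy_region_t. */
typedef struct {
    uint32_t originX, originY, originZ;   /* bytes, rows, slices */
    uint32_t width, height, depth;        /* bytes, rows, slices (0 = 2D) */
} demo_copy_region_t;

/* Returns 0 on success, -1 if the two regions differ in extent. */
int demo_copy_region(void *dstptr, const demo_copy_region_t *dstRegion,
                     uint32_t dstPitch, uint32_t dstSlicePitch,
                     const void *srcptr, const demo_copy_region_t *srcRegion,
                     uint32_t srcPitch, uint32_t srcSlicePitch)
{
    assert(dstptr != NULL && dstRegion != NULL);
    assert(srcptr != NULL && srcRegion != NULL);

    /* Width, height, and depth must match; origins may differ. */
    if (dstRegion->width != srcRegion->width ||
        dstRegion->height != srcRegion->height ||
        dstRegion->depth != srcRegion->depth)
        return -1;

    /* depth == 0 marks a 2D copy: one slice, slice pitches ignored. */
    const int is3d = srcRegion->depth != 0;
    const uint32_t slices = is3d ? srcRegion->depth : 1;
    const size_t dstSlice = is3d ? dstSlicePitch : 0;
    const size_t srcSlice = is3d ? srcSlicePitch : 0;
    unsigned char *dst = dstptr;
    const unsigned char *src = srcptr;

    for (uint32_t z = 0; z < slices; ++z) {
        for (uint32_t y = 0; y < srcRegion->height; ++y) {
            size_t dstOff = (size_t)(dstRegion->originZ + z) * dstSlice +
                            (size_t)(dstRegion->originY + y) * dstPitch +
                            dstRegion->originX;
            size_t srcOff = (size_t)(srcRegion->originZ + z) * srcSlice +
                            (size_t)(srcRegion->originY + y) * srcPitch +
                            srcRegion->originX;
            memcpy(dst + dstOff, src + srcOff, srcRegion->width);
        }
    }
    return 0;
}
```

Each row is copied as one contiguous run of width bytes, so the pitch values only determine where a row starts.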

Return

zeCommandListAppendImageCopy

__ze_api_export ze_result_t __zecall zeCommandListAppendImageCopy(ze_command_list_handle_t hCommandList, ze_image_handle_t hDstImage, ze_image_handle_t hSrcImage, ze_event_handle_t hEvent)

Copies an image.

Parameters
  • hCommandList: handle of command list

  • hDstImage: handle of destination image to copy to

  • hSrcImage: handle of source image to copy from

  • hEvent: [optional] handle of the event to signal on completion

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clEnqueueCopyImage

Return

zeCommandListAppendImageCopyRegion

__ze_api_export ze_result_t __zecall zeCommandListAppendImageCopyRegion(ze_command_list_handle_t hCommandList, ze_image_handle_t hDstImage, ze_image_handle_t hSrcImage, const ze_image_region_t *pDstRegion, const ze_image_region_t *pSrcRegion, ze_event_handle_t hEvent)

Copies a region of an image to another image.

Parameters
  • hCommandList: handle of command list

  • hDstImage: handle of destination image to copy to

  • hSrcImage: handle of source image to copy from

  • pDstRegion: [optional] destination region descriptor

  • pSrcRegion: [optional] source region descriptor

  • hEvent: [optional] handle of the event to signal on completion

  • The region width and height for both src and dst must be same. The origins can be different.

  • The src and dst regions cannot be overlapping.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Return

zeCommandListAppendImageCopyToMemory

__ze_api_export ze_result_t __zecall zeCommandListAppendImageCopyToMemory(ze_command_list_handle_t hCommandList, void *dstptr, ze_image_handle_t hSrcImage, const ze_image_region_t *pSrcRegion, ze_event_handle_t hEvent)

Copies from an image to device or shared memory.

Parameters
  • hCommandList: handle of command list

  • dstptr: pointer to destination memory to copy to

  • hSrcImage: handle of source image to copy from

  • pSrcRegion: [optional] source region descriptor

  • hEvent: [optional] handle of the event to signal on completion

  • The memory pointed to by dstptr must be accessible by the device on which the command list is created.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clEnqueueReadImage

Return

zeCommandListAppendImageCopyFromMemory

__ze_api_export ze_result_t __zecall zeCommandListAppendImageCopyFromMemory(ze_command_list_handle_t hCommandList, ze_image_handle_t hDstImage, const void *srcptr, const ze_image_region_t *pDstRegion, ze_event_handle_t hEvent)

Copies to an image from device or shared memory.

Parameters
  • hCommandList: handle of command list

  • hDstImage: handle of destination image to copy to

  • srcptr: pointer to source memory to copy from

  • pDstRegion: [optional] destination region descriptor

  • hEvent: [optional] handle of the event to signal on completion

  • The memory pointed to by srcptr must be accessible by the device on which the command list is created.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clEnqueueWriteImage

Return

zeCommandListAppendMemoryPrefetch

__ze_api_export ze_result_t __zecall zeCommandListAppendMemoryPrefetch(ze_command_list_handle_t hCommandList, const void *ptr, size_t size)

Asynchronously prefetches shared memory to the device associated with the specified command list.

Parameters
  • hCommandList: handle of command list

  • ptr: pointer to start of the memory range to prefetch

  • size: size in bytes of the memory range to prefetch

  • This is a hint to improve performance only and is not required for correctness.

  • Only prefetching to the device associated with the specified command list is supported. Prefetching to the host or to a peer device is not supported.

  • Prefetching may not be supported for all allocation types for all devices. If memory prefetching is not supported for the specified memory range the prefetch hint may be ignored.

  • Prefetching may only be supported at a device-specific granularity, such as at a page boundary. In this case, the memory range may be expanded such that the start and end of the range satisfy granularity requirements.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.
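
The range expansion described above, where the start and end of a range are adjusted to satisfy a device-specific granularity, can be sketched as follows. The function name is hypothetical, and the granularity is assumed to be a power of two (for example a 4096-byte page). The actual granularity and rounding policy are device-specific and not defined in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expands [start, start + size) outward to multiples of granularity.
 * Assumption: granularity is a non-zero power of two. */
void demo_expand_range(uintptr_t start, size_t size, size_t granularity,
                       uintptr_t *out_start, size_t *out_size)
{
    assert(out_start != NULL && out_size != NULL);
    assert(granularity != 0 && (granularity & (granularity - 1)) == 0);

    const uintptr_t mask = (uintptr_t)granularity - 1;
    const uintptr_t lo = start & ~mask;                  /* round start down */
    const uintptr_t hi = (start + size + mask) & ~mask;  /* round end up */

    *out_start = lo;
    *out_size = (size_t)(hi - lo);
}
```

With a 0x1000-byte granularity, a request for 0x20 bytes starting at 0x1FF0 straddles a boundary and expands to the 0x2000 bytes starting at 0x1000.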

Remark

Analogues

  • clEnqueueSVMMigrateMem

Return

zeCommandListAppendMemAdvise

__ze_api_export ze_result_t __zecall zeCommandListAppendMemAdvise(ze_command_list_handle_t hCommandList, ze_device_handle_t hDevice, const void *ptr, size_t size, ze_memory_advice_t advice)

Provides advice about the use of a shared memory range.

Parameters
  • hCommandList: handle of command list

  • hDevice: device associated with the memory advice

  • ptr: Pointer to the start of the memory range

  • size: Size in bytes of the memory range

  • advice: Memory advice for the memory range

  • Memory advice is a performance hint only and is not required for functional correctness.

  • Memory advice can be used to override driver heuristics to explicitly control shared memory behavior.

  • Not all memory advice hints may be supported for all allocation types for all devices. If a memory advice hint is not supported by the device it will be ignored.

  • Memory advice may only be supported at a device-specific granularity, such as at a page boundary. In this case, the memory range may be expanded such that the start and end of the range satisfy granularity requirements.

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Return

Copy Enums

ze_memory_advice_t

enum ze_memory_advice_t

Supported memory advice hints.

Values:

ZE_MEMORY_ADVICE_SET_READ_MOSTLY = 0

hint that memory will be read from frequently and written to rarely

ZE_MEMORY_ADVICE_CLEAR_READ_MOSTLY

removes the effect of ZE_MEMORY_ADVICE_SET_READ_MOSTLY

ZE_MEMORY_ADVICE_SET_PREFERRED_LOCATION

hint that the preferred memory location is the specified device

ZE_MEMORY_ADVICE_CLEAR_PREFERRED_LOCATION

removes the effect of ZE_MEMORY_ADVICE_SET_PREFERRED_LOCATION

ZE_MEMORY_ADVICE_SET_ACCESSED_BY

hint that memory will be accessed by the specified device

ZE_MEMORY_ADVICE_CLEAR_ACCESSED_BY

removes the effect of ZE_MEMORY_ADVICE_SET_ACCESSED_BY

ZE_MEMORY_ADVICE_SET_NON_ATOMIC_MOSTLY

hints that memory will mostly be accessed non-atomically

ZE_MEMORY_ADVICE_CLEAR_NON_ATOMIC_MOSTLY

removes the effect of ZE_MEMORY_ADVICE_SET_NON_ATOMIC_MOSTLY

ZE_MEMORY_ADVICE_BIAS_CACHED

hints that memory should be cached

ZE_MEMORY_ADVICE_BIAS_UNCACHED

hints that memory should not be cached

Copy Structures

ze_copy_region_t

struct ze_copy_region_t

Copy region descriptor.

Public Members

uint32_t originX

[in] The origin x offset for region in bytes

uint32_t originY

[in] The origin y offset for region in rows

uint32_t originZ

[in] The origin z offset for region in slices

uint32_t width

[in] The region width relative to origin in bytes

uint32_t height

[in] The region height relative to origin in rows

uint32_t depth

[in] The region depth relative to origin in slices. Set this to 0 for 2D copy.

ze_image_region_t

struct ze_image_region_t

Region descriptor.

Public Members

uint32_t originX

[in] The origin x offset for region in pixels

uint32_t originY

[in] The origin y offset for region in pixels

uint32_t originZ

[in] The origin z offset for region in pixels

uint32_t width

[in] The region width relative to origin in pixels

uint32_t height

[in] The region height relative to origin in pixels

uint32_t depth

[in] The region depth relative to origin. For 1D or 2D images, set this to 1.

Event

Event Functions

zeEventPoolCreate

__ze_api_export ze_result_t __zecall zeEventPoolCreate(ze_driver_handle_t hDriver, const ze_event_pool_desc_t *desc, uint32_t numDevices, ze_device_handle_t *phDevices, ze_event_pool_handle_t *phEventPool)

Creates a pool for a set of event(s) for the driver.

Parameters
  • hDriver: handle of the driver instance

  • desc: pointer to event pool descriptor

  • numDevices: number of device handles

  • phDevices: [optional][range(0, numDevices)] array of device handles which have visibility to the event pool. if nullptr, then event pool is visible to all devices supported by the driver instance.

  • phEventPool: pointer to handle of event pool object created

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeEventPoolDestroy

__ze_api_export ze_result_t __zecall zeEventPoolDestroy(ze_event_pool_handle_t hEventPool)

Deletes an event pool object.

Parameters
  • hEventPool: [release] handle of event pool object to destroy

  • The application is responsible for destroying all event handles created from the pool before destroying the pool itself

  • The application is responsible for making sure the device is not currently referencing any event within the pool before it is deleted

  • The implementation of this function will immediately free all Host and Device allocations associated with this event pool

  • The application may not call this function from simultaneous threads with the same event pool handle.

  • The implementation of this function should be lock-free.

Return

zeEventCreate

__ze_api_export ze_result_t __zecall zeEventCreate(ze_event_pool_handle_t hEventPool, const ze_event_desc_t *desc, ze_event_handle_t *phEvent)

Creates an event on the device.

Parameters
  • hEventPool: handle of the event pool

  • desc: pointer to event descriptor

  • phEvent: pointer to handle of event object created

  • Multiple events cannot be created using the same location within the same pool.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clCreateUserEvent

  • vkCreateEvent

Return

zeEventDestroy

__ze_api_export ze_result_t __zecall zeEventDestroy(ze_event_handle_t hEvent)

Deletes an event object.

Parameters
  • hEvent: [release] handle of event object to destroy

  • The application is responsible for making sure the device is not currently referencing the event before it is deleted

  • The implementation of this function will immediately free all Host and Device allocations associated with this event

  • The application may not call this function from simultaneous threads with the same event handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clReleaseEvent

  • vkDestroyEvent

Return

zeEventPoolGetIpcHandle

__ze_api_export ze_result_t __zecall zeEventPoolGetIpcHandle(ze_event_pool_handle_t hEventPool, ze_ipc_event_pool_handle_t *phIpc)

Gets an IPC event pool handle for the specified event pool that can be shared with another process.

Parameters
  • hEventPool: handle of event pool object

  • phIpc: Returned IPC event pool handle

  • The application may call this function from simultaneous threads.

Return

zeEventPoolOpenIpcHandle

__ze_api_export ze_result_t __zecall zeEventPoolOpenIpcHandle(ze_driver_handle_t hDriver, ze_ipc_event_pool_handle_t hIpc, ze_event_pool_handle_t *phEventPool)

Opens an IPC event pool handle to retrieve an event pool handle from another process.

Parameters
  • hDriver: handle of the driver to associate with the IPC event pool handle

  • hIpc: IPC event pool handle

  • phEventPool: pointer to handle of event pool object created

Return

zeEventPoolCloseIpcHandle

__ze_api_export ze_result_t __zecall zeEventPoolCloseIpcHandle(ze_event_pool_handle_t hEventPool)

Closes an IPC event handle in the current process.

Parameters
  • hEventPool: [release] handle of event pool object

  • Closes an IPC event handle by destroying events that were opened in this process using zeEventPoolOpenIpcHandle.

  • The application may not call this function from simultaneous threads with the same event pool handle.

Return

zeCommandListAppendSignalEvent

__ze_api_export ze_result_t __zecall zeCommandListAppendSignalEvent(ze_command_list_handle_t hCommandList, ze_event_handle_t hEvent)

Appends a signal of the event from the device into a command list.

Parameters
  • hCommandList: handle of the command list

  • hEvent: handle of the event

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clSetUserEventStatus

  • vkCmdSetEvent

Return

zeCommandListAppendWaitOnEvents

__ze_api_export ze_result_t __zecall zeCommandListAppendWaitOnEvents(ze_command_list_handle_t hCommandList, uint32_t numEvents, ze_event_handle_t *phEvents)

Appends wait on event(s) on the device into a command list.

Parameters
  • hCommandList: handle of the command list

  • numEvents: number of events to wait on before continuing

  • phEvents: [range(0, numEvents)] handle of the events to wait on before continuing

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Return

zeEventHostSignal

__ze_api_export ze_result_t __zecall zeEventHostSignal(ze_event_handle_t hEvent)

Signals an event from the host.

Parameters
  • hEvent: handle of the event

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clSetUserEventStatus

Return

zeEventHostSynchronize

__ze_api_export ze_result_t __zecall zeEventHostSynchronize(ze_event_handle_t hEvent, uint32_t timeout)

The current host thread waits on an event to be signaled.

Parameters
  • hEvent: handle of the event

  • timeout: if non-zero, then indicates the maximum time (in nanoseconds) to yield before returning ZE_RESULT_SUCCESS or ZE_RESULT_NOT_READY; if zero, then operates exactly like zeEventQueryStatus; if UINT32_MAX, then function will not return until complete or device is lost.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.
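
The three timeout regimes described above (zero, a finite budget in nanoseconds, and UINT32_MAX) can be sketched against a simulated event. This is a host-only model, not the driver's wait loop: the demo_* names are hypothetical, a fake clock that advances by a fixed step per poll stands in for real time, and the device-lost exit is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values match ZE_RESULT_SUCCESS and ZE_RESULT_NOT_READY. */
typedef enum { DEMO_SUCCESS = 0, DEMO_NOT_READY = 1 } demo_result_t;

/* Simulated event: signaled once the fake clock reaches signal_at_ns. */
typedef struct {
    uint64_t now_ns;        /* fake clock */
    uint64_t step_ns;       /* clock advance per poll, must be non-zero */
    uint64_t signal_at_ns;  /* time at which the event becomes signaled */
} demo_event_t;

/* Non-blocking status query. */
demo_result_t demo_event_query(const demo_event_t *ev)
{
    assert(ev != NULL);
    return ev->now_ns >= ev->signal_at_ns ? DEMO_SUCCESS : DEMO_NOT_READY;
}

/* Waits according to the documented timeout rules. */
demo_result_t demo_event_host_synchronize(demo_event_t *ev, uint32_t timeout)
{
    assert(ev != NULL && ev->step_ns != 0);

    if (timeout == 0)
        return demo_event_query(ev);     /* behaves like a status query */

    const uint64_t start = ev->now_ns;
    for (;;) {
        if (demo_event_query(ev) == DEMO_SUCCESS)
            return DEMO_SUCCESS;
        if (timeout != UINT32_MAX && ev->now_ns - start >= timeout)
            return DEMO_NOT_READY;       /* time budget exhausted */
        ev->now_ns += ev->step_ns;       /* yield, then poll again */
    }
}
```

In this model a UINT32_MAX timeout never gives up, so it returns only once the simulated event is signaled.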

Remark

Analogues

  • clWaitForEvents

Return

zeEventQueryStatus

__ze_api_export ze_result_t __zecall zeEventQueryStatus(ze_event_handle_t hEvent)

Queries an event object’s status.

Parameters
  • hEvent: handle of the event

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clGetEventInfo

  • vkGetEventStatus

Return

zeCommandListAppendEventReset

__ze_api_export ze_result_t __zecall zeCommandListAppendEventReset(ze_command_list_handle_t hCommandList, ze_event_handle_t hEvent)

Reset an event back to not signaled state.

Parameters
  • hCommandList: handle of the command list

  • hEvent: handle of the event

  • The application may not call this function from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • vkResetEvent

Return

zeEventHostReset

__ze_api_export ze_result_t __zecall zeEventHostReset(ze_event_handle_t hEvent)

Reset an event back to not signaled state.

Parameters
  • hEvent: handle of the event

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • vkResetEvent

Return

zeEventGetTimestamp

__ze_api_export ze_result_t __zecall zeEventGetTimestamp(ze_event_handle_t hEvent, ze_event_timestamp_type_t timestampType, void *dstptr)

Query timestamp information associated with an event. Event must come from an event pool that was created using ZE_EVENT_POOL_FLAG_TIMESTAMP flag.

Parameters
  • hEvent: handle of the event

  • timestampType: specifies timestamp type to query for that is associated with hEvent.

  • dstptr: pointer to memory for where timestamp will be written to. The size of timestamp is specified in the ze_event_timestamp_type_t definition.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

Event Enums

ze_event_pool_desc_version_t

enum ze_event_pool_desc_version_t

API version of ze_event_pool_desc_t.

Values:

ZE_EVENT_POOL_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_event_pool_flag_t

enum ze_event_pool_flag_t

Supported event pool creation flags.

Values:

ZE_EVENT_POOL_FLAG_DEFAULT = 0

signals and waits visible to the entire device and peer devices

ZE_EVENT_POOL_FLAG_HOST_VISIBLE = ZE_BIT(0)

signals and waits are also visible to host

ZE_EVENT_POOL_FLAG_IPC = ZE_BIT(1)

signals and waits may be shared across processes

ZE_EVENT_POOL_FLAG_TIMESTAMP = ZE_BIT(2)

Indicates all events in pool will contain timestamp information that can be queried using zeEventGetTimestamp
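
Because the pool flags are individual bits, they can be combined with bitwise OR and tested with bitwise AND. The sketch below mirrors the values listed above with hypothetical `EX_` names and assumes ZE_BIT(i) expands to `1 << i`; it does not include the real header.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the flag values listed above; EX_BIT is a stand-in for
   ZE_BIT, which is assumed to be 1 << i. */
#define EX_BIT(i) (1u << (i))
enum {
    EX_EVENT_POOL_FLAG_DEFAULT      = 0,
    EX_EVENT_POOL_FLAG_HOST_VISIBLE = EX_BIT(0),
    EX_EVENT_POOL_FLAG_IPC          = EX_BIT(1),
    EX_EVENT_POOL_FLAG_TIMESTAMP    = EX_BIT(2)
};

/* Returns nonzero when every bit of `wanted` is set in `flags`. */
static int ex_has_flags(uint32_t flags, uint32_t wanted)
{
    /* DEFAULT is the absence of bits, so it cannot be tested with AND. */
    assert(wanted != 0);
    return (flags & wanted) == wanted;
}
```

A pool meant for host-side profiling would, under these assumptions, combine the HOST_VISIBLE and TIMESTAMP bits in its descriptor's flags field.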

ze_event_desc_version_t

enum ze_event_desc_version_t

API version of ze_event_desc_t.

Values:

ZE_EVENT_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_event_scope_flag_t

enum ze_event_scope_flag_t

Supported event scope flags.

Values:

ZE_EVENT_SCOPE_FLAG_NONE = 0

execution synchronization only; no cache hierarchies are flushed or invalidated

ZE_EVENT_SCOPE_FLAG_SUBDEVICE = ZE_BIT(0)

cache hierarchies are flushed or invalidated sufficient for local sub-device access

ZE_EVENT_SCOPE_FLAG_DEVICE = ZE_BIT(1)

cache hierarchies are flushed or invalidated sufficient for global device access and peer device access

ZE_EVENT_SCOPE_FLAG_HOST = ZE_BIT(2)

cache hierarchies are flushed or invalidated sufficient for device and host access

ze_event_timestamp_type_t

enum ze_event_timestamp_type_t

Supported timestamp types.

Values:

ZE_EVENT_TIMESTAMP_GLOBAL_START = 0

wall-clock time start in GPU clocks for event. Data is uint64_t.

ZE_EVENT_TIMESTAMP_GLOBAL_END

wall-clock time end in GPU clocks for event. Data is uint64_t.

ZE_EVENT_TIMESTAMP_CONTEXT_START

context time start in GPU clocks for event. Only includes time while HW context is actively running on GPU. Data is uint64_t.

ZE_EVENT_TIMESTAMP_CONTEXT_END

context time end in GPU clocks for event. Only includes time while HW context is actively running on GPU. Data is uint64_t.
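
Since all four timestamps are uint64_t values in GPU clocks, a duration is the difference between an END value and the START value of the same kind. The helpers below are a minimal sketch of that arithmetic; the names are hypothetical, the counter is assumed not to wrap between the two samples, and `ns_per_clock` is a device-specific resolution that the caller has to obtain elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed GPU clocks between a START and an END timestamp of the same kind
   (both GLOBAL or both CONTEXT). Assumes the counter did not wrap. */
static uint64_t ex_elapsed_clocks(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return end - start;
}

/* Converts clocks to nanoseconds. `ns_per_clock` is a hypothetical,
   device-specific timer resolution supplied by the caller. */
static uint64_t ex_clocks_to_ns(uint64_t clocks, uint64_t ns_per_clock)
{
    /* Guard against overflow in the multiplication below. */
    assert(ns_per_clock == 0 || clocks <= UINT64_MAX / ns_per_clock);
    return clocks * ns_per_clock;
}
```

Comparing the GLOBAL and CONTEXT durations for the same event gives a rough idea of how long the hardware context was not actively running.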

Event Structures

ze_event_pool_desc_t

struct ze_event_pool_desc_t

Event pool descriptor.

Public Members

ze_event_pool_desc_version_t version

[in] ZE_EVENT_POOL_DESC_VERSION_CURRENT

ze_event_pool_flag_t flags

[in] creation flags

uint32_t count

[in] number of events within the pool

ze_event_desc_t

struct ze_event_desc_t

Event descriptor.

Public Members

ze_event_desc_version_t version

[in] ZE_EVENT_DESC_VERSION_CURRENT

uint32_t index

[in] index of the event within the pool; must be less than the count specified during pool creation

ze_event_scope_flag_t signal

[in] defines the scope of relevant cache hierarchies to flush on a signal action before the event is triggered

ze_event_scope_flag_t wait

[in] defines the scope of relevant cache hierarchies to invalidate on a wait action after the event is complete

Fence

Fence Functions

zeFenceCreate

__ze_api_export ze_result_t __zecall zeFenceCreate(ze_command_queue_handle_t hCommandQueue, const ze_fence_desc_t *desc, ze_fence_handle_t *phFence)

Creates a fence object on the device’s command queue.

Parameters
  • hCommandQueue: handle of command queue

  • desc: pointer to fence descriptor

  • phFence: pointer to handle of fence object created

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • vkCreateFence

Return

zeFenceDestroy

__ze_api_export ze_result_t __zecall zeFenceDestroy(ze_fence_handle_t hFence)

Deletes a fence object.

Parameters
  • hFence: [release] handle of fence object to destroy

  • The application is responsible for making sure the device is not currently referencing the fence before it is deleted

  • The implementation of this function will immediately free all Host and Device allocations associated with this fence

  • The application may not call this function from simultaneous threads with the same fence handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • vkDestroyFence

Return

zeFenceHostSynchronize

__ze_api_export ze_result_t __zecall zeFenceHostSynchronize(ze_fence_handle_t hFence, uint32_t timeout)

The current host thread waits on a fence to be signaled.

Parameters
  • hFence: handle of the fence

  • timeout: if non-zero, then indicates the maximum time (in nanoseconds) to yield before returning ZE_RESULT_SUCCESS or ZE_RESULT_NOT_READY; if zero, then operates exactly like zeFenceQueryStatus; if UINT32_MAX, then function will not return until complete or device is lost.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.
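
The timeout is a uint32_t count of nanoseconds, so the largest finite wait is a little under 4.3 seconds, and UINT32_MAX itself means "wait until complete". The helper below is a sketch (hypothetical name) of converting a millisecond budget into such a value while avoiding an accidental infinite wait.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a millisecond budget into the uint32_t nanosecond timeout
   described above. Values that do not fit are clamped to UINT32_MAX - 1 so
   that a large finite budget is never turned into the "wait forever" value. */
static uint32_t ex_timeout_ns_from_ms(uint64_t ms)
{
    const uint64_t max_finite = (uint64_t)UINT32_MAX - 1u;
    assert(ms <= UINT64_MAX / 1000000u); /* guard the multiplication below */
    uint64_t ns = ms * 1000000u;
    return (uint32_t)(ns > max_finite ? max_finite : ns);
}
```

A budget of zero milliseconds maps to a timeout of zero, which the description above equates with a plain status query. Callers that need longer waits can call the synchronize function repeatedly while it returns ZE_RESULT_NOT_READY.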

Remark

Analogues

  • vkWaitForFences

Return

zeFenceQueryStatus

__ze_api_export ze_result_t __zecall zeFenceQueryStatus(ze_fence_handle_t hFence)

Queries a fence object’s status.

Parameters
  • hFence: handle of the fence

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • vkGetFenceStatus

Return

zeFenceReset

__ze_api_export ze_result_t __zecall zeFenceReset(ze_fence_handle_t hFence)

Reset a fence back to the not signaled state.

Parameters
  • hFence: handle of the fence

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • vkResetFences

Return

Fence Enums

ze_fence_desc_version_t

enum ze_fence_desc_version_t

API version of ze_fence_desc_t.

Values:

ZE_FENCE_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_fence_flag_t

enum ze_fence_flag_t

Supported fence creation flags.

Values:

ZE_FENCE_FLAG_NONE = 0

default behavior

Fence Structures

ze_fence_desc_t

struct ze_fence_desc_t

Fence descriptor.

Public Members

ze_fence_desc_version_t version

[in] ZE_FENCE_DESC_VERSION_CURRENT

ze_fence_flag_t flags

[in] creation flags

Image

Image Functions

zeImageGetProperties

__ze_api_export ze_result_t __zecall zeImageGetProperties(ze_device_handle_t hDevice, const ze_image_desc_t *desc, ze_image_properties_t *pImageProperties)

Retrieves supported properties of an image.

Parameters
  • hDevice: handle of the device

  • desc: pointer to image descriptor

  • pImageProperties: pointer to image properties

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeImageCreate

__ze_api_export ze_result_t __zecall zeImageCreate(ze_device_handle_t hDevice, const ze_image_desc_t *desc, ze_image_handle_t *phImage)

Creates an image object on the device.

Parameters
  • hDevice: handle of the device

  • desc: pointer to image descriptor

  • phImage: pointer to handle of image object created

  • The image is only visible to the device on which it was created.

  • The image can be copied to another device using the zeCommandListAppendImageCopy functions.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clCreateImage

Return

zeImageDestroy

__ze_api_export ze_result_t __zecall zeImageDestroy(ze_image_handle_t hImage)

Deletes an image object.

Parameters
  • hImage: [release] handle of image object to destroy

  • The application is responsible for making sure the device is not currently referencing the image before it is deleted

  • The implementation of this function will immediately free all Host and Device allocations associated with this image

  • The application may not call this function from simultaneous threads with the same image handle.

  • The implementation of this function should be lock-free.

Return

Image Enums

ze_image_desc_version_t

enum ze_image_desc_version_t

API version of ze_image_desc_t.

Values:

ZE_IMAGE_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_image_flag_t

enum ze_image_flag_t

Supported image creation flags.

Values:

ZE_IMAGE_FLAG_PROGRAM_READ = ZE_BIT(0)

programs will read contents

ZE_IMAGE_FLAG_PROGRAM_WRITE = ZE_BIT(1)

programs will write contents

ZE_IMAGE_FLAG_BIAS_CACHED = ZE_BIT(2)

device should cache contents

ZE_IMAGE_FLAG_BIAS_UNCACHED = ZE_BIT(3)

device should not cache contents

ze_image_type_t

enum ze_image_type_t

Supported image types.

Values:

ZE_IMAGE_TYPE_1D

1D

ZE_IMAGE_TYPE_1DARRAY

1D array

ZE_IMAGE_TYPE_2D

2D

ZE_IMAGE_TYPE_2DARRAY

2D array

ZE_IMAGE_TYPE_3D

3D

ZE_IMAGE_TYPE_BUFFER

Buffer.

ze_image_format_layout_t

enum ze_image_format_layout_t

Supported image format layouts.

Values:

ZE_IMAGE_FORMAT_LAYOUT_8

8-bit single component layout

ZE_IMAGE_FORMAT_LAYOUT_16

16-bit single component layout

ZE_IMAGE_FORMAT_LAYOUT_32

32-bit single component layout

ZE_IMAGE_FORMAT_LAYOUT_8_8

2-component 8-bit layout

ZE_IMAGE_FORMAT_LAYOUT_8_8_8_8

4-component 8-bit layout

ZE_IMAGE_FORMAT_LAYOUT_16_16

2-component 16-bit layout

ZE_IMAGE_FORMAT_LAYOUT_16_16_16_16

4-component 16-bit layout

ZE_IMAGE_FORMAT_LAYOUT_32_32

2-component 32-bit layout

ZE_IMAGE_FORMAT_LAYOUT_32_32_32_32

4-component 32-bit layout

ZE_IMAGE_FORMAT_LAYOUT_10_10_10_2

4-component 10_10_10_2 layout

ZE_IMAGE_FORMAT_LAYOUT_11_11_10

3-component 11_11_10 layout

ZE_IMAGE_FORMAT_LAYOUT_5_6_5

3-component 5_6_5 layout

ZE_IMAGE_FORMAT_LAYOUT_5_5_5_1

4-component 5_5_5_1 layout

ZE_IMAGE_FORMAT_LAYOUT_4_4_4_4

4-component 4_4_4_4 layout

ZE_IMAGE_FORMAT_LAYOUT_Y8

Media Format: Y8. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_NV12

Media Format: NV12. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_YUYV

Media Format: YUYV. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_VYUY

Media Format: VYUY. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_YVYU

Media Format: YVYU. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_UYVY

Media Format: UYVY. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_AYUV

Media Format: AYUV. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_P010

Media Format: P010. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_Y410

Media Format: Y410. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_P012

Media Format: P012. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_Y16

Media Format: Y16. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_P016

Media Format: P016. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_Y216

Media Format: Y216. Format type and swizzle are ignored for this layout.

ZE_IMAGE_FORMAT_LAYOUT_P216

Media Format: P216. Format type and swizzle are ignored for this layout.
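
For the non-media layouts, the name spells out the bit width of each component, so the size of one texel can be estimated by summing those widths. The sketch below uses hypothetical helper names and assumes components are packed with no padding; the media layouts are out of scope here because their plane and subsampling arrangement is not described in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the per-component bit widths of a non-media layout, e.g. {10,10,10,2}. */
static unsigned ex_layout_bits(const unsigned *component_bits, size_t count)
{
    assert(component_bits != NULL && count >= 1 && count <= 4);
    unsigned total = 0;
    for (size_t i = 0; i < count; ++i)
        total += component_bits[i];
    return total;
}

/* Bytes per texel, assuming tightly packed components whose total width is a
   whole number of bytes. */
static unsigned ex_layout_bytes(const unsigned *component_bits, size_t count)
{
    unsigned bits = ex_layout_bits(component_bits, count);
    assert(bits % 8 == 0);
    return bits / 8;
}
```

Under that assumption, the 10_10_10_2 and 11_11_10 layouts both come to four bytes per texel, while 5_6_5 comes to two.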

ze_image_format_type_t

enum ze_image_format_type_t

Supported image format types.

Values:

ZE_IMAGE_FORMAT_TYPE_UINT

Unsigned integer.

ZE_IMAGE_FORMAT_TYPE_SINT

Signed integer.

ZE_IMAGE_FORMAT_TYPE_UNORM

Unsigned normalized integer.

ZE_IMAGE_FORMAT_TYPE_SNORM

Signed normalized integer.

ZE_IMAGE_FORMAT_TYPE_FLOAT

Float.

ze_image_format_swizzle_t

enum ze_image_format_swizzle_t

Supported image format component swizzle into channel.

Values:

ZE_IMAGE_FORMAT_SWIZZLE_R

Red component.

ZE_IMAGE_FORMAT_SWIZZLE_G

Green component.

ZE_IMAGE_FORMAT_SWIZZLE_B

Blue component.

ZE_IMAGE_FORMAT_SWIZZLE_A

Alpha component.

ZE_IMAGE_FORMAT_SWIZZLE_0

Zero.

ZE_IMAGE_FORMAT_SWIZZLE_1

One.

ZE_IMAGE_FORMAT_SWIZZLE_X

Don’t care.

ze_image_properties_version_t

enum ze_image_properties_version_t

API version of ze_image_properties_t.

Values:

ZE_IMAGE_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_image_sampler_filter_flags_t

enum ze_image_sampler_filter_flags_t

Supported sampler filtering flags.

Values:

ZE_IMAGE_SAMPLER_FILTER_FLAGS_NONE = 0

device does not support filtering

ZE_IMAGE_SAMPLER_FILTER_FLAGS_POINT = ZE_BIT(0)

device supports point filtering

ZE_IMAGE_SAMPLER_FILTER_FLAGS_LINEAR = ZE_BIT(1)

device supports linear filtering

Image Structures

ze_image_format_desc_t

struct ze_image_format_desc_t

Image format descriptor.

Public Members

ze_image_format_layout_t layout

[in] image format component layout

ze_image_format_type_t type

[in] image format type. Media formats can’t be used for ZE_IMAGE_TYPE_BUFFER.

ze_image_format_swizzle_t x

[in] image component swizzle into channel x

ze_image_format_swizzle_t y

[in] image component swizzle into channel y

ze_image_format_swizzle_t z

[in] image component swizzle into channel z

ze_image_format_swizzle_t w

[in] image component swizzle into channel w

ze_image_desc_t

struct ze_image_desc_t

Image descriptor.

Public Members

ze_image_desc_version_t version

[in] ZE_IMAGE_DESC_VERSION_CURRENT

ze_image_flag_t flags

[in] creation flags

ze_image_type_t type

[in] image type

ze_image_format_desc_t format

[in] image format

uint64_t width

[in] width in pixels for ze_image_type_t::1D/2D/3D and bytes for Buffer, see ze_device_image_properties_t::maxImageDims1D/2D/3D and maxImageBufferSize.

uint32_t height

[in] height in pixels (2D or 3D only), see ze_device_image_properties_t::maxImageDims2D/3D

uint32_t depth

[in] depth in pixels (3D only), see ze_device_image_properties_t::maxImageDims3D

uint32_t arraylevels

[in] array levels (array types only), see ze_device_image_properties_t::maxImageArraySlices

uint32_t miplevels

[in] mipmap levels (must be 0)

ze_image_properties_t

struct ze_image_properties_t

Image properties.

Public Members

ze_image_properties_version_t version

[in] ZE_IMAGE_PROPERTIES_VERSION_CURRENT

ze_image_sampler_filter_flags_t samplerFilterFlags

[out] supported sampler filtering

Memory

Memory Functions

zeDriverAllocSharedMem

__ze_api_export ze_result_t __zecall zeDriverAllocSharedMem(ze_driver_handle_t hDriver, const ze_device_mem_alloc_desc_t *device_desc, const ze_host_mem_alloc_desc_t *host_desc, size_t size, size_t alignment, ze_device_handle_t hDevice, void **pptr)

Allocates memory that is shared between the host and one or more devices.

Parameters
  • hDriver: handle of the driver instance

  • device_desc: pointer to device mem alloc descriptor

  • host_desc: pointer to host mem alloc descriptor

  • size: size in bytes to allocate

  • alignment: minimum alignment in bytes for the allocation

  • hDevice: [optional] device handle to associate with

  • pptr: pointer to shared allocation

  • Shared allocations share ownership between the host and one or more devices.

  • Shared allocations may optionally be associated with a device by passing a handle to the device.

  • Devices supporting only single-device shared access capabilities may access shared memory associated with the device. For these devices, ownership of the allocation is shared between the host and the associated device only.

  • Passing nullptr as the device handle does not associate the shared allocation with any device. For allocations with no associated device, ownership of the allocation is shared between the host and all devices supporting cross-device shared access capabilities.

  • The application may call this function from simultaneous threads.

Return

zeDriverAllocDeviceMem

__ze_api_export ze_result_t __zecall zeDriverAllocDeviceMem(ze_driver_handle_t hDriver, const ze_device_mem_alloc_desc_t *device_desc, size_t size, size_t alignment, ze_device_handle_t hDevice, void **pptr)

Allocates memory specific to a device.

Parameters
  • hDriver: handle of the driver instance

  • device_desc: pointer to device mem alloc descriptor

  • size: size in bytes to allocate

  • alignment: minimum alignment in bytes for the allocation

  • hDevice: handle of the device

  • pptr: pointer to device allocation

  • A device allocation is owned by a specific device.

  • In general, a device allocation may only be accessed by the device that owns it.

  • The application may call this function from simultaneous threads.

Return

zeDriverAllocHostMem

__ze_api_export ze_result_t __zecall zeDriverAllocHostMem(ze_driver_handle_t hDriver, const ze_host_mem_alloc_desc_t *host_desc, size_t size, size_t alignment, void **pptr)

Allocates host memory.

Parameters
  • hDriver: handle of the driver instance

  • host_desc: pointer to host mem alloc descriptor

  • size: size in bytes to allocate

  • alignment: minimum alignment in bytes for the allocation

  • pptr: pointer to host allocation

  • A host allocation is owned by the host process.

  • Host allocations are accessible by the host and all devices within the driver.

  • Host allocations are frequently used as staging areas to transfer data to or from devices.

  • The application may call this function from simultaneous threads.

Return

zeDriverFreeMem

__ze_api_export ze_result_t __zecall zeDriverFreeMem(ze_driver_handle_t hDriver, void *ptr)

Frees allocated host memory, device memory, or shared memory.

Parameters
  • hDriver: handle of the driver instance

  • ptr: [release] pointer to memory to free

  • The application is responsible for making sure the device is not currently referencing the memory before it is freed

  • The implementation of this function will immediately free all Host and Device allocations associated with this memory

  • The application may not call this function from simultaneous threads with the same pointer.

Return

zeDriverGetMemAllocProperties

__ze_api_export ze_result_t __zecall zeDriverGetMemAllocProperties(ze_driver_handle_t hDriver, const void *ptr, ze_memory_allocation_properties_t *pMemAllocProperties, ze_device_handle_t *phDevice)

Retrieves attributes of a memory allocation.

Parameters
  • hDriver: handle of the driver instance

  • ptr: memory pointer to query

  • pMemAllocProperties: query result for memory allocation properties

  • phDevice: [optional] device associated with this allocation

  • The application may call this function from simultaneous threads.

Return

zeDriverGetMemAddressRange

__ze_api_export ze_result_t __zecall zeDriverGetMemAddressRange(ze_driver_handle_t hDriver, const void *ptr, void **pBase, size_t *pSize)

Retrieves the base address and/or size of an allocation.

Parameters
  • hDriver: handle of the driver instance

  • ptr: memory pointer to query

  • pBase: [optional] base address of the allocation

  • pSize: [optional] size of the allocation

  • The application may call this function from simultaneous threads.
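
Once the base address and size of an allocation are known, the position of any pointer inside it follows from simple pointer arithmetic. The helper below is a sketch with a hypothetical name that works on plain host pointers; it takes the base and size as inputs rather than calling the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given the base and size reported for an allocation, computes the byte
   offset of `ptr` from the base. Returns 1 and writes *offset when ptr lies
   inside [base, base + size); returns 0 otherwise. */
static int ex_offset_in_allocation(const void *base, size_t size,
                                   const void *ptr, size_t *offset)
{
    assert(base != NULL && offset != NULL);
    uintptr_t b = (uintptr_t)base;
    uintptr_t p = (uintptr_t)ptr;
    if (p < b || p - b >= size)
        return 0;
    *offset = (size_t)(p - b);
    return 1;
}
```

This kind of check is useful when a pointer into the middle of an allocation is handed around and the code needs to recover the base before freeing or sharing it.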

Return

zeDriverGetMemIpcHandle

__ze_api_export ze_result_t __zecall zeDriverGetMemIpcHandle(ze_driver_handle_t hDriver, const void *ptr, ze_ipc_mem_handle_t *pIpcHandle)

Creates an IPC memory handle for the specified allocation in the sending process.

Parameters
  • hDriver: handle of the driver instance

  • ptr: pointer to the device memory allocation

  • pIpcHandle: Returned IPC memory handle

  • Takes a pointer to the base of a device memory allocation and exports it for use in another process.

  • The application may call this function from simultaneous threads.

Return

zeDriverOpenMemIpcHandle

__ze_api_export ze_result_t __zecall zeDriverOpenMemIpcHandle(ze_driver_handle_t hDriver, ze_device_handle_t hDevice, ze_ipc_mem_handle_t handle, ze_ipc_memory_flag_t flags, void **pptr)

Opens an IPC memory handle to retrieve a device pointer in a receiving process.

Parameters
  • hDriver: handle of the driver instance

  • hDevice: handle of the device to associate with the IPC memory handle

  • handle: IPC memory handle

  • flags: flags controlling the operation

  • pptr: pointer to device allocation in this process

  • Takes an IPC memory handle from a sending process and associates it with a device pointer usable in this process.

  • The device pointer in this process should not be freed with zeDriverFreeMem, but rather with zeDriverCloseMemIpcHandle.

  • The application may call this function from simultaneous threads.

Return

zeDriverCloseMemIpcHandle

__ze_api_export ze_result_t __zecall zeDriverCloseMemIpcHandle(ze_driver_handle_t hDriver, const void *ptr)

Closes an IPC memory handle in a receiving process.

Parameters
  • hDriver: handle of the driver instance

  • ptr: [release] pointer to device allocation in this process

  • Closes an IPC memory handle by unmapping memory that was opened in this process using zeDriverOpenMemIpcHandle.

  • The application may not call this function from simultaneous threads with the same pointer.

Return

Memory Enums

ze_device_mem_alloc_desc_version_t

enum ze_device_mem_alloc_desc_version_t

API version of ze_device_mem_alloc_desc_t.

Values:

ZE_DEVICE_MEM_ALLOC_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_device_mem_alloc_flag_t

enum ze_device_mem_alloc_flag_t

Supported memory allocation flags.

Values:

ZE_DEVICE_MEM_ALLOC_FLAG_DEFAULT = 0

implicit default behavior; uses driver-based heuristics

ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_CACHED = ZE_BIT(0)

device should cache allocation

ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_UNCACHED = ZE_BIT(1)

device should not cache allocation (UC)

ze_host_mem_alloc_desc_version_t

enum ze_host_mem_alloc_desc_version_t

API version of ze_host_mem_alloc_desc_t.

Values:

ZE_HOST_MEM_ALLOC_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_host_mem_alloc_flag_t

enum ze_host_mem_alloc_flag_t

Supported host memory allocation flags.

Values:

ZE_HOST_MEM_ALLOC_FLAG_DEFAULT = 0

implicit default behavior; uses driver-based heuristics

ZE_HOST_MEM_ALLOC_FLAG_BIAS_CACHED = ZE_BIT(0)

host should cache allocation

ZE_HOST_MEM_ALLOC_FLAG_BIAS_UNCACHED = ZE_BIT(1)

host should not cache allocation (UC)

ZE_HOST_MEM_ALLOC_FLAG_BIAS_WRITE_COMBINED = ZE_BIT(2)

host memory should be allocated write-combined (WC)

ze_memory_allocation_properties_version_t

enum ze_memory_allocation_properties_version_t

API version of ze_memory_allocation_properties_t.

Values:

ZE_MEMORY_ALLOCATION_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_memory_type_t

enum ze_memory_type_t

Memory allocation type.

Values:

ZE_MEMORY_TYPE_UNKNOWN = 0

the memory pointed to is of unknown type

ZE_MEMORY_TYPE_HOST

the memory pointed to is a host allocation

ZE_MEMORY_TYPE_DEVICE

the memory pointed to is a device allocation

ZE_MEMORY_TYPE_SHARED

the memory pointed to is a shared ownership allocation

ze_ipc_memory_flag_t

enum ze_ipc_memory_flag_t

Supported IPC memory flags.

Values:

ZE_IPC_MEMORY_FLAG_NONE = 0

No special flags.

Memory Structures

ze_device_mem_alloc_desc_t

struct ze_device_mem_alloc_desc_t

Device mem alloc descriptor.

Public Members

ze_device_mem_alloc_desc_version_t version

[in] ZE_DEVICE_MEM_ALLOC_DESC_VERSION_CURRENT

ze_device_mem_alloc_flag_t flags

[in] flags specifying additional allocation controls

uint32_t ordinal

[in] ordinal of the device’s local memory to allocate from; must be less than the count returned from zeDeviceGetMemoryProperties

ze_host_mem_alloc_desc_t

struct ze_host_mem_alloc_desc_t

Host mem alloc descriptor.

Public Members

ze_host_mem_alloc_desc_version_t version

[in] ZE_HOST_MEM_ALLOC_DESC_VERSION_CURRENT

ze_host_mem_alloc_flag_t flags

[in] flags specifying additional allocation controls

ze_memory_allocation_properties_t

struct ze_memory_allocation_properties_t

Memory allocation properties queried using zeDriverGetMemAllocProperties.

Public Members

ze_memory_allocation_properties_version_t version

[in] ZE_MEMORY_ALLOCATION_PROPERTIES_VERSION_CURRENT

ze_memory_type_t type

[out] type of allocated memory

uint64_t id

[out] identifier for this allocation

Module

Module Functions

zeModuleCreate

__ze_api_export ze_result_t __zecall zeModuleCreate(ze_device_handle_t hDevice, const ze_module_desc_t *desc, ze_module_handle_t *phModule, ze_module_build_log_handle_t *phBuildLog)

Creates module object from an input IL or native binary.

Parameters
  • hDevice: handle of the device

  • desc: pointer to module descriptor

  • phModule: pointer to handle of module object created

  • phBuildLog: [optional] pointer to handle of module’s build log.

  • Compiles the module for execution on the device.

  • The module can only be used on the device on which it was created.

  • The module can be copied to other devices within the same driver instance by using zeModuleGetNativeBinary.

  • The following build options are supported:

    • "-ze-opt-disable" - Disable optimizations

    • "-ze-opt-greater-than-4GB-buffer-required" - Use 64-bit offset calculations for buffers.

    • "-ze-opt-large-register-file" - Increase number of registers available to threads.

  • A build log can optionally be returned to the caller. The caller is responsible for destroying build log using zeModuleBuildLogDestroy.

  • The module descriptor constants are only supported for SPIR-V specialization constants.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.
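
When more than one build option is needed, the options have to be joined into a single string. The helper below is a sketch with a hypothetical name; it assumes options are separated by single spaces, which this section does not state explicitly, and it leaves open how the resulting string is passed to the module descriptor.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Joins build options with single spaces into `out`. Returns the resulting
   string length, or -1 if `out` is too small. Space separation is an
   assumption, not something stated by the specification text above. */
static int ex_join_build_flags(const char *const *opts, size_t count,
                               char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < count; ++i) {
        int n = snprintf(out + used, out_size - used, "%s%s",
                         i == 0 ? "" : " ", opts[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1; /* truncated or encoding error */
        used += (size_t)n;
    }
    return (int)used;
}
```

Returning -1 on truncation, rather than passing a clipped option to the compiler, keeps a too-small buffer from silently changing the build.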

Return

zeModuleDestroy

__ze_api_export ze_result_t __zecall zeModuleDestroy(ze_module_handle_t hModule)

Destroys module.

Parameters
  • hModule: [release] handle of the module

  • The application is responsible for making sure the device is not currently referencing the module before it is deleted

  • The implementation of this function will immediately free all Host and Device allocations associated with this module

  • The application may not call this function from simultaneous threads with the same module handle.

  • The implementation of this function should be lock-free.

Return

zeModuleBuildLogDestroy

__ze_api_export ze_result_t __zecall zeModuleBuildLogDestroy(ze_module_build_log_handle_t hModuleBuildLog)

Destroys module build log object.

Parameters
  • hModuleBuildLog: [release] handle of the module build log object.

  • The implementation of this function will immediately free all Host allocations associated with this object

  • The application may not call this function from simultaneous threads with the same build log handle.

  • The implementation of this function should be lock-free.

  • This function can be called before or after zeModuleDestroy for the associated module.

Return

zeModuleBuildLogGetString

__ze_api_export ze_result_t __zecall zeModuleBuildLogGetString(ze_module_build_log_handle_t hModuleBuildLog, size_t *pSize, char *pBuildLog)

Retrieves text string for build log.

Parameters
  • hModuleBuildLog: handle of the module build log object.

  • pSize: size of build log string.

  • pBuildLog: [optional] pointer to null-terminated string of the log.

  • The caller can pass nullptr for pBuildLog when querying only for size.

  • The caller must provide memory for build log.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.
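
The size query and the caller-provided buffer suggest the usual two-call pattern: ask for the size, allocate, then ask again for the text. The sketch below shows the pattern against a hypothetical stand-in, `ex_get_log`, which is not the real driver function; it assumes the reported size includes the null terminator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in that follows the contract described above: it reports
   the required size (assumed to include the terminator) in *pSize and copies
   the log only when a buffer is supplied. It is NOT the real driver call. */
static int ex_get_log(size_t *pSize, char *pBuildLog)
{
    static const char log_text[] = "error: unresolved symbol";
    assert(pSize != NULL);
    if (pBuildLog != NULL) {
        if (*pSize < sizeof log_text)
            return -1; /* caller's buffer is too small */
        memcpy(pBuildLog, log_text, sizeof log_text);
    }
    *pSize = sizeof log_text;
    return 0;
}

/* Two-call pattern: query the size, allocate, then fetch the text.
   The caller owns the returned buffer and releases it with free(). */
static char *ex_fetch_log(void)
{
    size_t size = 0;
    if (ex_get_log(&size, NULL) != 0 || size == 0)
        return NULL;
    char *buf = malloc(size);
    if (buf != NULL && ex_get_log(&size, buf) != 0) {
        free(buf);
        buf = NULL;
    }
    return buf;
}
```

With the real API, the first call would pass nullptr for pBuildLog and the second would pass the freshly allocated buffer; the same shape applies to the native binary query.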

Return

zeModuleGetNativeBinary

__ze_api_export ze_result_t __zecall zeModuleGetNativeBinary(ze_module_handle_t hModule, size_t *pSize, uint8_t *pModuleNativeBinary)

Retrieve native binary from Module.

Parameters
  • hModule: handle of the module

  • pSize: size of native binary in bytes.

  • pModuleNativeBinary: [optional] byte pointer to native binary

  • The native binary output can be cached to disk and new modules can be later constructed from the cached copy.

  • The native binary will retain debugging information that is associated with a module.

  • The caller can pass nullptr for pModuleNativeBinary when querying only for size.

  • The implementation will copy the native binary into a buffer supplied by the caller.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeModuleGetGlobalPointer

__ze_api_export ze_result_t __zecall zeModuleGetGlobalPointer(ze_module_handle_t hModule, const char *pGlobalName, void **pptr)

Retrieve global variable pointer from Module.

Parameters
  • hModule: handle of the module

  • pGlobalName: name of global variable in module

  • pptr: device visible pointer

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeModuleGetKernelNames

__ze_api_export ze_result_t __zecall zeModuleGetKernelNames(ze_module_handle_t hModule, uint32_t *pCount, const char **pNames)

Retrieve all kernel names in the module.

Parameters
  • hModule: handle of the module

  • pCount: pointer to the number of names. If count is zero, then the driver will update the value with the total number of names available. If count is non-zero, then the driver will only retrieve that number of names. If count is larger than the number of names available, then the driver will update the value with the correct number of names available.

  • pNames: [optional][range(0, *pCount)] array of kernel names

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.
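
The three pCount cases are easier to follow as code. The function below is a hypothetical model of the rules as described for this function, written against a plain array of names; it is not the driver implementation and does not call the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in modelling the pCount rules described above for a
   module that has `available` names; it is NOT the real driver function. */
static void ex_get_names(const char *const *all, uint32_t available,
                         uint32_t *pCount, const char **pNames)
{
    assert(pCount != NULL);
    if (*pCount == 0) {          /* query: report the total */
        *pCount = available;
        return;
    }
    if (*pCount > available)     /* too large: correct the count */
        *pCount = available;
    if (pNames != NULL)
        for (uint32_t i = 0; i < *pCount; ++i)
            pNames[i] = all[i];  /* retrieve only *pCount names */
}
```

A caller would typically start with a count of zero to learn the total, size an array accordingly, and call again with that count.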

Return

zeKernelCreate

__ze_api_export ze_result_t __zecall zeKernelCreate(ze_module_handle_t hModule, const ze_kernel_desc_t *desc, ze_kernel_handle_t *phKernel)

Create a kernel object from a module by name.

Parameters
  • hModule: handle of the module

  • desc: pointer to kernel descriptor

  • phKernel: pointer to handle of the kernel object created

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeKernelDestroy

__ze_api_export ze_result_t __zecall zeKernelDestroy(ze_kernel_handle_t hKernel)

Destroys a kernel object.

Parameters
  • hKernel: [release] handle of the kernel object

  • All kernels must be destroyed before the module is destroyed.

  • The application is responsible for making sure the device is not currently referencing the kernel before it is deleted

  • The implementation of this function will immediately free all Host and Device allocations associated with this kernel

  • The application may not call this function from simultaneous threads with the same kernel handle.

  • The implementation of this function should be lock-free.

Return

zeModuleGetFunctionPointer

__ze_api_export ze_result_t __zecall zeModuleGetFunctionPointer(ze_module_handle_t hModule, const char *pFunctionName, void **pfnFunction)

Retrieve a function pointer from a module by name.

Parameters
  • hModule: handle of the module

  • pFunctionName: Name of function to retrieve function pointer for.

  • pfnFunction: pointer to function.

  • The function pointer is unique for the device on which the module was created.

  • The function pointer is no longer valid if module is destroyed.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeKernelSetGroupSize

__ze_api_export ze_result_t __zecall zeKernelSetGroupSize(ze_kernel_handle_t hKernel, uint32_t groupSizeX, uint32_t groupSizeY, uint32_t groupSizeZ)

Set group size for a kernel.

Parameters
  • hKernel: handle of the kernel object

  • groupSizeX: group size for X dimension to use for this kernel

  • groupSizeY: group size for Y dimension to use for this kernel

  • groupSizeZ: group size for Z dimension to use for this kernel

  • The application may not call this function from simultaneous threads with the same kernel handle.

  • The implementation of this function should be lock-free.

  • The implementation will copy the group size information into a command list when the kernel is appended.

Return

zeKernelSuggestGroupSize

__ze_api_export ze_result_t __zecall zeKernelSuggestGroupSize(ze_kernel_handle_t hKernel, uint32_t globalSizeX, uint32_t globalSizeY, uint32_t globalSizeZ, uint32_t *groupSizeX, uint32_t *groupSizeY, uint32_t *groupSizeZ)

Query a suggested group size for a kernel given a global size for each dimension.

Parameters
  • hKernel: handle of the kernel object

  • globalSizeX: global width for X dimension

  • globalSizeY: global width for Y dimension

  • globalSizeZ: global width for Z dimension

  • groupSizeX: recommended size of group for X dimension

  • groupSizeY: recommended size of group for Y dimension

  • groupSizeZ: recommended size of group for Z dimension

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

  • This function ignores the group size that is set using zeKernelSetGroupSize.
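
Once a group size has been chosen for a dimension, the number of groups needed to cover the global size in that dimension is a ceiling division. The helper below is a small sketch with a hypothetical name; it is not part of the API.

```c
#include <assert.h>
#include <stdint.h>

/* Number of groups needed to cover `global_size` work-items with groups of
   `group_size`, rounding up so a partial last group is still counted. */
static uint32_t ex_group_count(uint32_t global_size, uint32_t group_size)
{
    assert(group_size > 0);
    return global_size / group_size + (global_size % group_size != 0);
}
```

When the global size is not a multiple of the group size, the rounded-up count launches more work-items than requested, so the kernel itself needs a bounds check.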

Return

zeKernelSuggestMaxCooperativeGroupCount

__ze_api_export ze_result_t __zecall zeKernelSuggestMaxCooperativeGroupCount(ze_kernel_handle_t hKernel, uint32_t *totalGroupCount)

Query a suggested max group count for a cooperative kernel.

Parameters
  • hKernel: handle of the kernel object

  • totalGroupCount: recommended total group count.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.
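An application would normally compare its intended launch against the value returned in totalGroupCount. The sketch below shows one way to do that check on the host, assuming the total is the product of the three per-dimension group counts. The function name fits_cooperative_limit is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when countX * countY * countZ does not exceed suggestedMax.
 * The comparison is staged so the 64-bit product cannot overflow. */
static bool fits_cooperative_limit(uint32_t countX, uint32_t countY,
                                   uint32_t countZ, uint32_t suggestedMax)
{
    uint64_t xy = (uint64_t)countX * (uint64_t)countY;
    if (xy == 0 || countZ == 0) {
        return true; /* zero groups never exceeds the limit */
    }
    if (xy > suggestedMax) {
        return false; /* already too large before multiplying by Z */
    }
    /* xy and countZ are both below 2^32 here, so the product fits in 64 bits. */
    return xy * (uint64_t)countZ <= (uint64_t)suggestedMax;
}
```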

Return

zeKernelSetArgumentValue

__ze_api_export ze_result_t __zecall zeKernelSetArgumentValue(ze_kernel_handle_t hKernel, uint32_t argIndex, size_t argSize, const void *pArgValue)

Set kernel argument used on kernel launch.

Parameters
  • hKernel: handle of the kernel object

  • argIndex: argument index in range [0, num args - 1]

  • argSize: size of argument type

  • pArgValue: [optional] argument value represented as matching arg type. If null then argument value is considered null.

  • This function may not be called from simultaneous threads with the same kernel handle.

  • The implementation of this function should be lock-free.

  • The implementation will copy the arguments into a command list when the kernel is appended.
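Because arguments are copied when the kernel is appended, the same kernel object can be appended several times with different argument values. The following is a toy host-side model of that snapshot behavior, not the driver's implementation. Every name in it is hypothetical, it tracks a single argument slot, and it models a null pArgValue as zeroed bytes, which is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX_ARG_BYTES 16

/* Hypothetical model of one pending kernel argument. */
typedef struct {
    size_t size;
    unsigned char bytes[EXAMPLE_MAX_ARG_BYTES];
} example_kernel_arg_t;

/* Models setting an argument: stores argSize bytes from pArgValue.
 * ASSUMPTION: a null pArgValue is modeled as all-zero bytes.
 * Returns 0 on success, -1 if the value does not fit in this toy model. */
static int example_set_arg(example_kernel_arg_t *arg, size_t argSize,
                           const void *pArgValue)
{
    if (argSize > EXAMPLE_MAX_ARG_BYTES) {
        return -1;
    }
    arg->size = argSize;
    if (pArgValue != NULL) {
        memcpy(arg->bytes, pArgValue, argSize);
    } else {
        memset(arg->bytes, 0, argSize);
    }
    return 0;
}

/* Models appending a launch: the current argument state is copied, so
 * later calls to example_set_arg do not change the returned snapshot. */
static example_kernel_arg_t example_append_launch(const example_kernel_arg_t *arg)
{
    return *arg;
}
```

In this model, setting the argument to 1, appending, setting it to 2, and appending again yields two snapshots that hold 1 and 2 respectively.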

Return

zeKernelSetAttribute

__ze_api_export ze_result_t __zecall zeKernelSetAttribute(ze_kernel_handle_t hKernel, ze_kernel_attribute_t attr, uint32_t size, const void *pValue)

Sets a kernel attribute.

Parameters
  • hKernel: handle of the kernel object

  • attr: attribute to set

  • size: size in bytes of kernel attribute value.

  • pValue: [optional] pointer to attribute value.

  • This function may not be called from simultaneous threads with the same kernel handle.

  • The implementation of this function should be lock-free.

Remark

Analogues

  • clSetKernelExecInfo

Return

zeKernelGetAttribute

__ze_api_export ze_result_t __zecall zeKernelGetAttribute(ze_kernel_handle_t hKernel, ze_kernel_attribute_t attr, uint32_t *pSize, void *pValue)

Gets a kernel attribute.

Parameters
  • hKernel: handle of the kernel object

  • attr: attribute to get. See the documentation for ze_kernel_attribute_t for the type returned in pValue.

  • pSize: size in bytes needed for kernel attribute value. If pValue is nullptr then the size needed for pValue memory will be written to pSize. The size only needs to be queried for arbitrarily sized attributes.

  • pValue: [optional] pointer to attribute value result.

  • This function may not be called from simultaneous threads with the same kernel handle.

  • The implementation of this function should be lock-free.

  • The caller sets pValue to nullptr when querying only for size.

  • The caller must provide memory for pValue when querying the attribute value.
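The size-then-value contract described above is usually exercised as two calls. The sketch below demonstrates the pattern against a mock, since the real entry point needs a driver. Both mock_get_attribute and query_attribute_string are hypothetical, and the attribute string is made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mock following the contract above: when pValue is NULL the
 * required size is written to *pSize; otherwise the value is copied into
 * the caller's buffer. Returns 0 on success, -1 on error. */
static int mock_get_attribute(uint32_t *pSize, void *pValue)
{
    static const char attrs[] = "attr_a attr_b"; /* made-up attribute list */
    const uint32_t needed = (uint32_t)sizeof(attrs); /* includes terminator */

    if (pSize == NULL) {
        return -1;
    }
    if (pValue == NULL) {
        *pSize = needed; /* size query only */
        return 0;
    }
    if (*pSize < needed) {
        return -1; /* caller's buffer is too small */
    }
    memcpy(pValue, attrs, needed);
    *pSize = needed;
    return 0;
}

/* Two-call pattern: query the size, allocate, then query the value.
 * Returns a heap string the caller must free, or NULL on failure. */
static char *query_attribute_string(void)
{
    uint32_t size = 0;
    if (mock_get_attribute(&size, NULL) != 0 || size == 0) {
        return NULL;
    }
    char *value = malloc(size);
    if (value == NULL) {
        return NULL;
    }
    if (mock_get_attribute(&size, value) != 0) {
        free(value);
        return NULL;
    }
    return value;
}
```

For fixed-size attributes such as the bool_t values, the first call can be skipped and a correctly sized variable passed directly.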

Return

zeKernelSetIntermediateCacheConfig

__ze_api_export ze_result_t __zecall zeKernelSetIntermediateCacheConfig(ze_kernel_handle_t hKernel, ze_cache_config_t CacheConfig)

Sets the preferred Intermediate cache configuration for a kernel.

Parameters
  • hKernel: handle of the kernel object

  • CacheConfig: preferred intermediate cache configuration, given as a ze_cache_config_t value

  • The application may not call this function from simultaneous threads with the same kernel handle.

Return

zeKernelGetProperties

__ze_api_export ze_result_t __zecall zeKernelGetProperties(ze_kernel_handle_t hKernel, ze_kernel_properties_t *pKernelProperties)

Retrieve kernel properties.

Parameters
  • hKernel: handle of the kernel object

  • pKernelProperties: query result for kernel properties.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeCommandListAppendLaunchKernel

__ze_api_export ze_result_t __zecall zeCommandListAppendLaunchKernel(ze_command_list_handle_t hCommandList, ze_kernel_handle_t hKernel, const ze_group_count_t *pLaunchFuncArgs, ze_event_handle_t hSignalEvent, uint32_t numWaitEvents, ze_event_handle_t *phWaitEvents)

Launch kernel over one or more work groups.

Parameters
  • hCommandList: handle of the command list

  • hKernel: handle of the kernel object

  • pLaunchFuncArgs: thread group launch arguments

  • hSignalEvent: [optional] handle of the event to signal on completion

  • numWaitEvents: [optional] number of events to wait on before launching

  • phWaitEvents: [optional][range(0, numWaitEvents)] handles of the events to wait on before launching

  • This may only be called for a command list created with command queue group ordinal that supports compute.

  • This function may not be called from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Return

zeCommandListAppendLaunchCooperativeKernel

__ze_api_export ze_result_t __zecall zeCommandListAppendLaunchCooperativeKernel(ze_command_list_handle_t hCommandList, ze_kernel_handle_t hKernel, const ze_group_count_t *pLaunchFuncArgs, ze_event_handle_t hSignalEvent, uint32_t numWaitEvents, ze_event_handle_t *phWaitEvents)

Launch kernel cooperatively over one or more work groups.

Parameters
  • hCommandList: handle of the command list

  • hKernel: handle of the kernel object

  • pLaunchFuncArgs: thread group launch arguments

  • hSignalEvent: [optional] handle of the event to signal on completion

  • numWaitEvents: [optional] number of events to wait on before launching

  • phWaitEvents: [optional][range(0, numWaitEvents)] handles of the events to wait on before launching

  • This may only be called for a command list created with command queue group ordinal that supports compute.

  • This may only be used for a command list that is submitted to a command queue with the cooperative flag set.

  • This function may not be called from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

  • Use zeKernelSuggestMaxCooperativeGroupCount to query the recommended maximum group count that the device supports for cooperative kernels.

Return

zeCommandListAppendLaunchKernelIndirect

__ze_api_export ze_result_t __zecall zeCommandListAppendLaunchKernelIndirect(ze_command_list_handle_t hCommandList, ze_kernel_handle_t hKernel, const ze_group_count_t *pLaunchArgumentsBuffer, ze_event_handle_t hSignalEvent, uint32_t numWaitEvents, ze_event_handle_t *phWaitEvents)

Launch kernel over one or more work groups using indirect arguments.

Parameters
  • hCommandList: handle of the command list

  • hKernel: handle of the kernel object

  • pLaunchArgumentsBuffer: pointer to device buffer that will contain thread group launch arguments

  • hSignalEvent: [optional] handle of the event to signal on completion

  • numWaitEvents: [optional] number of events to wait on before launching

  • phWaitEvents: [optional][range(0, numWaitEvents)] handles of the events to wait on before launching

  • The launch arguments need to be device visible.

  • The launch arguments buffer may not be reused until the kernel has completed on the device.

  • This may only be called for a command list created with command queue group ordinal that supports compute.

  • This function may not be called from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.

Return

zeCommandListAppendLaunchMultipleKernelsIndirect

__ze_api_export ze_result_t __zecall zeCommandListAppendLaunchMultipleKernelsIndirect(ze_command_list_handle_t hCommandList, uint32_t numKernels, ze_kernel_handle_t *phKernels, const uint32_t *pCountBuffer, const ze_group_count_t *pLaunchArgumentsBuffer, ze_event_handle_t hSignalEvent, uint32_t numWaitEvents, ze_event_handle_t *phWaitEvents)

Launch multiple kernels over one or more work groups using an array of indirect arguments.

Parameters
  • hCommandList: handle of the command list

  • numKernels: maximum number of kernels to launch

  • phKernels: [range(0, numKernels)] handles of the kernel objects

  • pCountBuffer: pointer to device memory location that will contain the actual number of kernels to launch; value must be less than or equal to numKernels

  • pLaunchArgumentsBuffer: [range(0, numKernels)] pointer to device buffer that will contain a contiguous array of thread group launch arguments

  • hSignalEvent: [optional] handle of the event to signal on completion

  • numWaitEvents: [optional] number of events to wait on before launching

  • phWaitEvents: [optional][range(0, numWaitEvents)] handles of the events to wait on before launching

  • The array of launch arguments needs to be device visible.

  • The buffer holding the array of launch arguments may not be reused until the kernel has completed on the device.

  • This may only be called for a command list created with command queue group ordinal that supports compute.

  • This function may not be called from simultaneous threads with the same command list handle.

  • The implementation of this function should be lock-free.
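The relationship between numKernels, the value in the count buffer, and the contiguous argument array can be modeled on the host. The sketch below assumes that entry i of the argument array pairs with entry i of phKernels and that only the first count entries are consumed. The type and function names are hypothetical; on a real device these buffers live in device-visible memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the three fields of ze_group_count_t. */
typedef struct {
    uint32_t groupCountX;
    uint32_t groupCountY;
    uint32_t groupCountZ;
} example_group_count_t;

/* Copies the entries a launch would consume (the first `count` of
 * `numKernels`) into `out`. Returns false when an input is NULL or when
 * `count` exceeds `numKernels`, which the parameter description forbids. */
static bool select_launch_args(const example_group_count_t *args,
                               uint32_t numKernels, uint32_t count,
                               example_group_count_t *out, uint32_t *outCount)
{
    if (args == NULL || out == NULL || outCount == NULL || count > numKernels) {
        return false;
    }
    for (uint32_t i = 0; i < count; ++i) {
        out[i] = args[i];
    }
    *outCount = count;
    return true;
}
```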

Return

Module Enums

ze_module_desc_version_t

enum ze_module_desc_version_t

API version of ze_module_desc_t.

Values:

ZE_MODULE_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91
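The descriptor version enums in this section all pack a major and a minor number into one 32-bit value. The sketch below assumes the encoding places the major number in the upper 16 bits and the minor number in the lower 16 bits. The EXAMPLE_ macros and the helper are local illustrations named to avoid confusion with the real header macros.

```c
#include <stdint.h>

/* ASSUMPTION: major version in the upper 16 bits, minor in the lower 16. */
#define EXAMPLE_MAKE_VERSION(major, minor) \
    ((((uint32_t)(major)) << 16) | (((uint32_t)(minor)) & 0x0000ffffu))
#define EXAMPLE_MAJOR_VERSION(ver) (((uint32_t)(ver)) >> 16)
#define EXAMPLE_MINOR_VERSION(ver) (((uint32_t)(ver)) & 0x0000ffffu)

/* Returns 1 when a descriptor's version field equals the expected value. */
static int example_version_matches(uint32_t descVersion, uint32_t expected)
{
    return descVersion == expected;
}
```

Under this encoding, version 0.91 is the integer 91, and a hypothetical version 1.0 would be 0x10000.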

ze_module_format_t

enum ze_module_format_t

Supported module creation input formats.

Values:

ZE_MODULE_FORMAT_IL_SPIRV = 0

Format is SPIRV IL format.

ZE_MODULE_FORMAT_NATIVE

Format is device native format.

ze_kernel_desc_version_t

enum ze_kernel_desc_version_t

API version of ze_kernel_desc_t.

Values:

ZE_KERNEL_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_kernel_flag_t

enum ze_kernel_flag_t

Supported kernel creation flags.

Values:

ZE_KERNEL_FLAG_NONE = 0

default driver behavior

ZE_KERNEL_FLAG_FORCE_RESIDENCY

force all device allocations to be resident during execution

ze_kernel_attribute_t

enum ze_kernel_attribute_t

Kernel attributes.

Remark

Analogues

  • cl_kernel_exec_info

Values:

ZE_KERNEL_ATTR_INDIRECT_HOST_ACCESS = 0

Indicates that the function accesses host allocations indirectly (default: false, type: bool_t)

ZE_KERNEL_ATTR_INDIRECT_DEVICE_ACCESS

Indicates that the function accesses device allocations indirectly (default: false, type: bool_t)

ZE_KERNEL_ATTR_INDIRECT_SHARED_ACCESS

Indicates that the function accesses shared allocations indirectly (default: false, type: bool_t)

ZE_KERNEL_ATTR_SOURCE_ATTRIBUTE

Declared kernel attributes (i.e. can be specified with __attribute__ in runtime language). (type: char[]) Returned as a null-terminated string in which attributes are separated by spaces. zeKernelSetAttribute is not supported for this attribute.

ze_kernel_properties_version_t

enum ze_kernel_properties_version_t

API version of ze_kernel_properties_t.

Values:

ZE_KERNEL_PROPERTIES_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

Module Structures

ze_module_constants_t

struct ze_module_constants_t

Specialization constants - User defined constants.

Public Members

uint32_t numConstants

[in] Number of specialization constants.

const uint32_t *pConstantIds

[in] Pointer to array of IDs that is sized to numConstants.

const uint64_t *pConstantValues

[in] Pointer to array of values that is sized to numConstants.
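The two pointers describe parallel arrays: the value at index i belongs to the ID at index i. The sketch below shows that layout with a lookup helper. The type example_module_constants_t is a hypothetical stand-in with the same three members, and find_constant is not an API function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the same members as ze_module_constants_t. */
typedef struct {
    uint32_t numConstants;
    const uint32_t *pConstantIds;
    const uint64_t *pConstantValues;
} example_module_constants_t;

/* Looks up the value paired with `id` in the parallel arrays.
 * Returns true and writes *outValue when the ID is present. */
static bool find_constant(const example_module_constants_t *c, uint32_t id,
                          uint64_t *outValue)
{
    if (c == NULL || outValue == NULL) {
        return false;
    }
    for (uint32_t i = 0; i < c->numConstants; ++i) {
        if (c->pConstantIds[i] == id) {
            *outValue = c->pConstantValues[i];
            return true;
        }
    }
    return false;
}
```

Both arrays must hold exactly numConstants elements, so they are easiest to keep in sync when declared side by side.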

ze_module_desc_t

struct ze_module_desc_t

Module descriptor.

Public Members

ze_module_desc_version_t version

[in] ZE_MODULE_DESC_VERSION_CURRENT

ze_module_format_t format

[in] Module format passed in with pInputModule

size_t inputSize

[in] size of input IL or ISA from pInputModule.

const uint8_t *pInputModule

[in] pointer to IL or ISA

const char *pBuildFlags

[in] string containing compiler flags. See programming guide for build flags.

const ze_module_constants_t *pConstants

[in] pointer to specialization constants. Valid only for SPIR-V input. This must be set to nullptr if no specialization constants are provided.

ze_kernel_desc_t

struct ze_kernel_desc_t

Kernel descriptor.

Public Members

ze_kernel_desc_version_t version

[in] ZE_KERNEL_DESC_VERSION_CURRENT

ze_kernel_flag_t flags

[in] creation flags

const char *pKernelName

[in] null-terminated name of kernel in module

ze_kernel_properties_t

struct ze_kernel_properties_t

Kernel properties.

Public Members

ze_kernel_properties_version_t version

[in] ZE_KERNEL_PROPERTIES_VERSION_CURRENT

char name[ZE_MAX_KERNEL_NAME]

[out] Kernel name

uint32_t numKernelArgs

[out] number of kernel arguments.

uint32_t requiredGroupSizeX

[out] required group size in the X dimension

uint32_t requiredGroupSizeY

[out] required group size in the Y dimension

uint32_t requiredGroupSizeZ

[out] required group size in the Z dimension
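An application can use the required group size fields to decide which group size to set before launch. The sketch below assumes that a zero in any dimension means the kernel declares no required size; this version of the specification does not state that, so treat it as an assumption. All names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical triple of per-dimension group sizes. */
typedef struct {
    uint32_t x;
    uint32_t y;
    uint32_t z;
} example_group_size_t;

/* Prefers the kernel's required group size when one is declared.
 * ASSUMPTION: a zero in any dimension means there is no requirement. */
static example_group_size_t choose_group_size(example_group_size_t required,
                                              example_group_size_t suggested)
{
    if (required.x != 0 && required.y != 0 && required.z != 0) {
        return required;
    }
    return suggested;
}
```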

ze_group_count_t

struct ze_group_count_t

Kernel dispatch group count.

Public Members

uint32_t groupCountX

[in] number of thread groups in X dimension

uint32_t groupCountY

[in] number of thread groups in Y dimension

uint32_t groupCountZ

[in] number of thread groups in Z dimension

Residency

Residency Functions

zeDeviceMakeMemoryResident

__ze_api_export ze_result_t __zecall zeDeviceMakeMemoryResident(ze_device_handle_t hDevice, void *ptr, size_t size)

Makes memory resident for the device.

Parameters
  • hDevice: handle of the device

  • ptr: pointer to memory to make resident

  • size: size in bytes to make resident

  • If the application does not properly manage residency then the device may experience unrecoverable page-faults.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDeviceEvictMemory

__ze_api_export ze_result_t __zecall zeDeviceEvictMemory(ze_device_handle_t hDevice, void *ptr, size_t size)

Allows memory to be evicted from the device.

Parameters
  • hDevice: handle of the device

  • ptr: pointer to memory to evict

  • size: size in bytes to evict

  • The application is responsible for making sure the device is not currently referencing the memory before it is evicted.

  • Memory is always implicitly evicted if it is resident when freed.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDeviceMakeImageResident

__ze_api_export ze_result_t __zecall zeDeviceMakeImageResident(ze_device_handle_t hDevice, ze_image_handle_t hImage)

Makes image resident for the device.

Parameters
  • hDevice: handle of the device

  • hImage: handle of image to make resident

  • If the application does not properly manage residency then the device may experience unrecoverable page-faults.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeDeviceEvictImage

__ze_api_export ze_result_t __zecall zeDeviceEvictImage(ze_device_handle_t hDevice, ze_image_handle_t hImage)

Allows image to be evicted from the device.

Parameters
  • hDevice: handle of the device

  • hImage: handle of image to evict

  • The application is responsible for making sure the device is not currently referencing the image before it is evicted.

  • An image is always implicitly evicted if it is resident when destroyed.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

Sampler

Sampler Functions

zeSamplerCreate

__ze_api_export ze_result_t __zecall zeSamplerCreate(ze_device_handle_t hDevice, const ze_sampler_desc_t *desc, ze_sampler_handle_t *phSampler)

Creates sampler object.

Parameters
  • hDevice: handle of the device

  • desc: pointer to sampler descriptor

  • phSampler: handle of the sampler

  • The sampler can only be used on the device on which it was created.

  • The application may call this function from simultaneous threads.

  • The implementation of this function should be lock-free.

Return

zeSamplerDestroy

__ze_api_export ze_result_t __zecall zeSamplerDestroy(ze_sampler_handle_t hSampler)

Destroys sampler object.

Parameters
  • hSampler: [release] handle of the sampler

  • The application is responsible for making sure the device is not currently referencing the sampler before it is deleted.

  • The implementation of this function will immediately free all Host and Device allocations associated with this sampler.

  • The application may not call this function from simultaneous threads with the same sampler handle.

  • The implementation of this function should be lock-free.

Return

Sampler Enums

ze_sampler_desc_version_t

enum ze_sampler_desc_version_t

API version of ze_sampler_desc_t.

Values:

ZE_SAMPLER_DESC_VERSION_CURRENT = ZE_MAKE_VERSION(0, 91)

version 0.91

ze_sampler_address_mode_t

enum ze_sampler_address_mode_t

Sampler addressing modes.

Values:

ZE_SAMPLER_ADDRESS_MODE_NONE = 0

No coordinate modifications for out-of-bounds image access.

ZE_SAMPLER_ADDRESS_MODE_REPEAT

Out-of-bounds coordinates are wrapped back around.

ZE_SAMPLER_ADDRESS_MODE_CLAMP

Out-of-bounds coordinates are clamped to edge.

ZE_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER

Out-of-bounds coordinates are clamped to border color which is (0.0f, 0.0f, 0.0f, 0.0f) if image format swizzle contains alpha, otherwise (0.0f, 0.0f, 0.0f, 1.0f).

ZE_SAMPLER_ADDRESS_MODE_MIRROR

Out-of-bounds coordinates are mirrored starting from edge.
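The addressing modes can be compared by applying each one to an integer texel coordinate along a single axis. The following is a simplified sketch of the usual interpretation of these modes, not a description of any device's exact behavior. The enum, the EXAMPLE_BORDER sentinel, and resolve_texel are hypothetical names, and the arithmetic is intended for small illustrative image sizes.

```c
#include <stdint.h>

/* Hypothetical local enum following the order of the modes listed above. */
typedef enum {
    EXAMPLE_ADDRESS_NONE,
    EXAMPLE_ADDRESS_REPEAT,
    EXAMPLE_ADDRESS_CLAMP,
    EXAMPLE_ADDRESS_CLAMP_TO_BORDER,
    EXAMPLE_ADDRESS_MIRROR
} example_address_mode_t;

/* Sentinel meaning "use the border color instead of a texel". */
#define EXAMPLE_BORDER (-1)

/* Maps texel index `i` into [0, size) according to `mode`.
 * `size` must be positive and small enough that 2 * size fits in int32_t. */
static int32_t resolve_texel(int32_t i, int32_t size, example_address_mode_t mode)
{
    switch (mode) {
    case EXAMPLE_ADDRESS_REPEAT:
        /* Wrap around; the extra + size handles negative remainders. */
        return ((i % size) + size) % size;
    case EXAMPLE_ADDRESS_CLAMP:
        return i < 0 ? 0 : (i >= size ? size - 1 : i);
    case EXAMPLE_ADDRESS_CLAMP_TO_BORDER:
        return (i < 0 || i >= size) ? EXAMPLE_BORDER : i;
    case EXAMPLE_ADDRESS_MIRROR: {
        /* Reflect at each edge: the pattern repeats every 2 * size texels. */
        int32_t period = 2 * size;
        int32_t m = ((i % period) + period) % period;
        return m < size ? m : period - 1 - m;
    }
    case EXAMPLE_ADDRESS_NONE:
    default:
        return i; /* coordinate is left unmodified */
    }
}
```

For a 4-texel axis, index 5 resolves to 1 under repeat, 3 under clamp, 2 under mirror, and to the border sentinel under clamp-to-border.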

ze_sampler_filter_mode_t

enum ze_sampler_filter_mode_t

Sampler filtering modes.

Values:

ZE_SAMPLER_FILTER_MODE_NEAREST = 0

Nearest-neighbor filtering: the sample takes the value of the texel closest to the coordinate.

ZE_SAMPLER_FILTER_MODE_LINEAR

Linear filtering: the sample is interpolated between the texels adjacent to the coordinate.
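The difference between nearest and linear filtering is easiest to see on a one-dimensional row of texels. The sketch below assumes unnormalized coordinates, texel i centered at i + 0.5, and clamp-to-edge handling at the ends. It illustrates the general technique only; the helper names are hypothetical and actual device sampling may differ in rounding and precision.

```c
/* Floor for floats within int range, written without <math.h>. */
static int floor_to_int(float x)
{
    int i = (int)x; /* truncates toward zero */
    return ((float)i > x) ? i - 1 : i;
}

/* Clamps an index to [0, n - 1]. */
static int clamp_index(int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

/* Nearest filtering: returns the texel whose cell [i, i + 1) contains x. */
static float sample_nearest(const float *texels, int n, float x)
{
    return texels[clamp_index(floor_to_int(x), n)];
}

/* Linear filtering: blends the two texels whose centers surround x. */
static float sample_linear(const float *texels, int n, float x)
{
    float xs = x - 0.5f;        /* shift so texel centers sit on integers */
    int i0 = floor_to_int(xs);  /* left neighbor */
    float t = xs - (float)i0;   /* blend weight in [0, 1) */
    float a = texels[clamp_index(i0, n)];
    float b = texels[clamp_index(i0 + 1, n)];
    return a + t * (b - a);
}
```

With texels {0, 10, 20, 30}, sampling at x = 1.0 lies halfway between the centers of texels 0 and 1, so linear filtering returns 5 while nearest filtering returns 10. Under this model a normalized coordinate u would first be scaled to x = u * n.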

Sampler Structures

ze_sampler_desc_t

struct ze_sampler_desc_t

Sampler descriptor.

Public Members

ze_sampler_desc_version_t version

[in] ZE_SAMPLER_DESC_VERSION_CURRENT

ze_sampler_address_mode_t addressMode

[in] Sampler addressing mode to determine how out-of-bounds coordinates are handled.

ze_sampler_filter_mode_t filterMode

[in] Sampler filter mode to determine how samples are filtered.

ze_bool_t isNormalized

[in] Whether coordinates are normalized to the range [0, 1].