Querying hardware features

Some features and datatypes are only supported on some hardware or some backends. They can be queried with:

client.properties().feature_enabled(feature)

Also see Feature.
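For example, here is a minimal sketch of querying a feature at runtime. It assumes the wgpu backend and that Feature::Plane is the enum variant for plane support; exact module paths may differ.

```rust
use cubecl::prelude::*;
use cubecl::Feature;
use cubecl::wgpu::{WgpuDevice, WgpuRuntime};

fn main() {
    // Create a compute client for the default wgpu device
    // (assumption: the wgpu backend is enabled).
    let device = WgpuDevice::default();
    let client = WgpuRuntime::client(&device);

    // Ask the client whether plane-level operations are available
    // on this device/backend combination.
    if client.properties().feature_enabled(Feature::Plane) {
        println!("Plane operations are supported");
    } else {
        println!("Plane operations are not supported");
    }
}
```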

Overview

Features

Note that some features also require device support in addition to backend support.

| Feature | CUDA | ROCm | WGPU (WGSL) | WGPU (SPIR-V) |
| ------- | ---- | ---- | ----------- | ------------- |
| Plane   | ✔️   | ✔️   | ✔️          | ✔️            |
| CMMA    | ✔️   | ✔️   |             | ✔️            |

Datatypes

flex32 is represented as f32 everywhere except on SPIR-V, with no reduced precision. f64 is not supported for all operations.

| Type   | CUDA | ROCm | WGPU (WGSL) | WGPU (SPIR-V) |
| ------ | ---- | ---- | ----------- | ------------- |
| u8     | ✔️   | ✔️   |             | ✔️            |
| u16    | ✔️   | ✔️   |             | ✔️            |
| u32    | ✔️   | ✔️   | ✔️          | ✔️            |
| u64    | ✔️   | ✔️   |             | ✔️            |
| i8     | ✔️   | ✔️   |             | ✔️            |
| i16    | ✔️   | ✔️   |             | ✔️            |
| i32    | ✔️   | ✔️   | ✔️          | ✔️            |
| i64    | ✔️   | ✔️   |             | ✔️            |
| f16    | ✔️   | ✔️   |             | ✔️            |
| bf16   | ✔️   | ✔️   |             |               |
| flex32 |      |      |             | ✔️            |
| tf32   | ✔️   |      |             |               |
| f32    | ✔️   | ✔️   | ✔️          | ✔️            |
| f64    |      |      |             |               |
| bool   | ✔️   | ✔️   | ✔️          | ✔️            |

Datatype Details

Flex32

Relaxed-precision 32-bit float. The minimum range and precision are equivalent to f16, but may be higher. Falls back to f32 when relaxed precision isn't supported.

Tensor-Float32

A 19-bit, CUDA-only type that should only be used as a CMMA matrix type. Reinterpreting from f32 may work, but is officially undefined; use Cast::cast_from to convert safely.
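A minimal sketch of such a conversion inside a kernel function, assuming tf32 and the Cast trait are exposed by the prelude:

```rust
use cubecl::prelude::*;

// Hypothetical kernel helper: convert an f32 value to tf32 with
// Cast::cast_from instead of reinterpreting the bits.
// Assumption: `tf32` is available from the prelude.
#[cube]
fn to_tf32(value: f32) -> tf32 {
    tf32::cast_from(value)
}
```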

Feature Details

Plane

Plane-level operations, e.g. plane_sum and plane_elect.
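As a sketch, a kernel that uses plane_sum to reduce a value across a plane; the kernel name, launch attribute, and buffer types here are illustrative:

```rust
use cubecl::prelude::*;

// Illustrative kernel: each unit contributes one element, and plane_sum
// reduces those values across the plane (warp/subgroup).
#[cube(launch)]
fn plane_total(input: &Array<f32>, output: &mut Array<f32>) {
    let value = input[UNIT_POS];
    let total = plane_sum(value);

    // Only one unit per plane writes the reduced result.
    if UNIT_POS == 0 {
        output[0] = total;
    }
}
```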

Cooperative Matrix Multiply-Add (CMMA)

Plane-level cooperative matrix multiply-add operations. Maps to wmma in CUDA and CooperativeMatrixMultiply in SPIR-V. Features are registered for each size and datatype that is supported by the hardware. For supported functions, see cmma.
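As a sketch, one might query whether a particular CMMA shape and datatype combination is registered. The Feature::Cmma field names and element types shown here are assumptions, not a confirmed signature:

```rust
use cubecl::prelude::*;
use cubecl::Feature;
use cubecl::ir::{Elem, FloatKind};
use cubecl::wgpu::{WgpuDevice, WgpuRuntime};

fn main() {
    let client = WgpuRuntime::client(&WgpuDevice::default());

    // Hypothetical 16x16x16 f16 CMMA configuration; field names are assumed.
    let cmma_f16 = Feature::Cmma {
        a: Elem::Float(FloatKind::F16),
        b: Elem::Float(FloatKind::F16),
        c: Elem::Float(FloatKind::F16),
        m: 16,
        n: 16,
        k: 16,
    };

    if client.properties().feature_enabled(cmma_f16) {
        println!("16x16x16 f16 CMMA is supported");
    }
}
```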