# Querying hardware features

Some features and datatypes are only supported on some hardware or some backends. They can be
queried with:

```rust
client.properties().feature_enabled(feature)
```

Also see `Feature`.
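
As a minimal sketch of such a check, assuming the `wgpu` backend (the device setup is
illustrative, and the exact re-export paths and `Feature` variants depend on your CubeCL
version):

```rust
use cubecl::prelude::*;
use cubecl::wgpu::{WgpuDevice, WgpuRuntime};
use cubecl::Feature;

fn main() {
    // Illustrative setup: any runtime works; wgpu is shown here.
    let device = WgpuDevice::default();
    let client = WgpuRuntime::client(&device);

    // Check for plane (warp/subgroup) support before launching a kernel
    // that relies on operations like plane_sum.
    if client.properties().feature_enabled(Feature::Plane) {
        println!("plane operations available");
    } else {
        println!("plane operations unavailable; launch a fallback kernel");
    }
}
```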

## Overview

### Features

| Feature | CUDA | ROCm | WGPU (WGSL) | WGPU (SPIR-V) |
|---|---|---|---|---|
| Plane | ✔️ | ✔️ | ✔️ | ✔️ |
| CMMA¹ | ✔️ | ✔️ | ❌ | ✔️ |

¹ Also requires device support.

### Datatypes

`flex32` is represented as `f32` everywhere except SPIR-V, with no reduced precision. `f64` is
not supported for all operations.

| Type | CUDA | ROCm | WGPU (WGSL) | WGPU (SPIR-V) |
|---|---|---|---|---|
| u8 | ✔️ | ✔️ | ❌ | ✔️ |
| u16 | ✔️ | ✔️ | ❌ | ✔️ |
| u32 | ✔️ | ✔️ | ✔️ | ✔️ |
| u64 | ✔️ | ✔️ | ❌ | ✔️ |
| i8 | ✔️ | ✔️ | ❌ | ✔️ |
| i16 | ✔️ | ✔️ | ❌ | ✔️ |
| i32 | ✔️ | ✔️ | ✔️ | ✔️ |
| i64 | ✔️ | ✔️ | ❌ | ✔️ |
| f16 | ✔️ | ✔️ | ❌ | ✔️ |
| bf16 | ✔️ | ✔️ | ❌ | ❌ |
| flex32 | ❔ | ❔ | ❔ | ✔️ |
| tf32 | ✔️ | ❌ | ❌ | ❌ |
| f32 | ✔️ | ✔️ | ✔️ | ✔️ |
| f64 | ❔ | ❔ | ❌ | ❔ |
| bool | ✔️ | ✔️ | ✔️ | ✔️ |

## Datatype Details

### Flex32

A relaxed-precision 32-bit float. Its minimum range and precision are equivalent to `f16`, but
may be higher. Defaults to `f32` when relaxed precision isn't supported.
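
Because of this fallback, kernels can be written generically over `Float` and instantiated with
`flex32` opportunistically. A minimal sketch following the usual CubeCL kernel pattern (the
kernel name and body are illustrative):

```rust
use cubecl::prelude::*;

// Generic over any float element type. Instantiated with flex32, backends
// with relaxed-precision support may trade precision for speed; all other
// backends fall back to plain f32 semantics.
#[cube(launch)]
fn scale<F: Float>(input: &Array<F>, output: &mut Array<F>, factor: F) {
    if ABSOLUTE_POS < input.len() {
        output[ABSOLUTE_POS] = input[ABSOLUTE_POS] * factor;
    }
}
```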

### Tensor-Float32

A 19-bit, CUDA-only type that should only be used as a CMMA matrix type. Reinterpreting from
`f32` may work, but is officially undefined; use `Cast::cast_from` to convert safely.
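
As a sketch of the safe route (assuming `tf32` is exposed as an element type in your CubeCL
version):

```rust
use cubecl::prelude::*;

// Convert f32 to tf32 through Cast::cast_from rather than reinterpreting
// the bits, which is officially undefined.
#[cube]
fn to_tf32(value: f32) -> tf32 {
    tf32::cast_from(value)
}
```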

## Feature Details

### Plane

Plane-level operations, such as `plane_sum` and `plane_elect`.
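
For instance, a minimal plane-reduction sketch (the kernel name and the one-plane-per-cube
launch layout are assumptions of this example):

```rust
use cubecl::prelude::*;

// Each plane cooperatively sums its values; unit 0 writes the result.
// Assumes the kernel is launched with exactly one plane per cube.
#[cube(launch)]
fn plane_reduce(input: &Array<f32>, output: &mut Array<f32>) {
    let value = input[ABSOLUTE_POS];
    let sum = plane_sum(value);
    if UNIT_POS == 0 {
        output[CUBE_POS] = sum;
    }
}
```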

### Cooperative Matrix Multiply-Add (CMMA)

Plane-level cooperative matrix multiply-add operations. Maps to `wmma` in CUDA and
`CooperativeMatrixMultiply` in SPIR-V. Features are registered for each size and datatype
supported by the hardware. For supported functions, see `cmma`.
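
Since features are registered per configuration, a capability check for one shape and type
combination might look like this sketch, continuing from the `client` created earlier (the
`Feature::Cmma` field names and the 16×16×16 f16/f32 configuration are assumptions to verify
against your version's `Feature` enum):

```rust
use cubecl::ir::{Elem, FloatKind};
use cubecl::Feature;

// Hypothetical check: is a 16x16x16 CMMA with f16 inputs and an f32
// accumulator registered for this device?
let cmma_supported = client.properties().feature_enabled(Feature::Cmma {
    a: Elem::Float(FloatKind::F16),
    b: Elem::Float(FloatKind::F16),
    c: Elem::Float(FloatKind::F32),
    m: 16,
    k: 16,
    n: 16,
});
```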