# Querying hardware features
Some features and datatypes are only supported on some hardware or some backends. They can be
queried with:

```rust
client.properties().feature_enabled(feature)
```

Also see `Feature`.
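For example, a caller can branch on a feature before selecting a kernel variant. The following is a minimal sketch, not a prescribed pattern: it assumes the `wgpu` backend of the `cubecl` crate and that `Feature` is re-exported at the crate root; adapt the runtime and device to your setup.

```rust
use cubecl::prelude::*;
use cubecl::Feature;
use cubecl::wgpu::{WgpuDevice, WgpuRuntime};

fn main() {
    // Obtain a compute client for the default wgpu device.
    let client = WgpuRuntime::client(&WgpuDevice::default());

    // Query whether plane-level operations are available on this
    // hardware/backend combination before relying on them.
    if client.properties().feature_enabled(Feature::Plane) {
        println!("plane operations supported");
    } else {
        println!("plane operations unsupported; falling back to a scalar path");
    }
}
```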
## Overview
### Features

*Also requires device support*
| Feature | CUDA | ROCm | WGPU (WGSL) | WGPU (SPIR-V) |
| ------- | ---- | ---- | ----------- | ------------- |
| Plane   | ✔️   | ✔️   | ✔️          | ✔️            |
| CMMA    | ✔️   | ✔️   | ❌          | ✔️            |
### Datatypes

`flex32` is represented as `f32` everywhere except SPIR-V, with no reduced precision. `f64` is not
supported for all operations.
| Type     | CUDA | ROCm | WGPU (WGSL) | WGPU (SPIR-V) |
| -------- | ---- | ---- | ----------- | ------------- |
| `u8`     | ✔️   | ✔️   | ❌          | ✔️            |
| `u16`    | ✔️   | ✔️   | ❌          | ✔️            |
| `u32`    | ✔️   | ✔️   | ✔️          | ✔️            |
| `u64`    | ✔️   | ✔️   | ❌          | ✔️            |
| `i8`     | ✔️   | ✔️   | ❌          | ✔️            |
| `i16`    | ✔️   | ✔️   | ❌          | ✔️            |
| `i32`    | ✔️   | ✔️   | ✔️          | ✔️            |
| `i64`    | ✔️   | ✔️   | ❌          | ✔️            |
| `f16`    | ✔️   | ✔️   | ❌          | ✔️            |
| `bf16`   | ✔️   | ✔️   | ❌          | ❌            |
| `flex32` | ❔   | ❔   | ❔          | ✔️            |
| `tf32`   | ✔️   | ❌   | ❌          | ❌            |
| `f32`    | ✔️   | ✔️   | ✔️          | ✔️            |
| `f64`    | ❔   | ❔   | ❌          | ❔            |
| `bool`   | ✔️   | ✔️   | ✔️          | ✔️            |
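Datatype availability can be queried through the same `feature_enabled` API. The sketch below is hedged: it assumes a `Feature::Type` variant wrapping an element descriptor, with `Elem` and `FloatKind` reachable via `cubecl::ir`; verify these paths against the CubeCL version you use.

```rust
use cubecl::prelude::*;
use cubecl::Feature;
use cubecl::ir::{Elem, FloatKind};
use cubecl::wgpu::{WgpuDevice, WgpuRuntime};

fn main() {
    let client = WgpuRuntime::client(&WgpuDevice::default());

    // Check whether f16 is available before selecting a
    // half-precision kernel variant.
    let f16_supported = client
        .properties()
        .feature_enabled(Feature::Type(Elem::Float(FloatKind::F16)));
    println!("f16 supported: {f16_supported}");
}
```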
## Datatype Details
### Flex32

Relaxed precision 32-bit float. Minimum range and precision are equivalent to `f16`, but may be
higher. Defaults to `f32` when relaxed precision isn't supported.
### Tensor-Float32

19-bit CUDA-only type that should only be used as a CMMA matrix type. It may be possible to
reinterpret from `f32`, but this is officially undefined. Use `Cast::cast_from` to safely convert.
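For instance, an `f32` value can be converted inside a kernel before it is used as a CMMA fragment input. A minimal sketch, assuming `tf32` and the `Cast` trait are exposed through `cubecl::prelude`:

```rust
use cubecl::prelude::*;

#[cube]
fn f32_to_tf32(value: f32) -> tf32 {
    // Well-defined conversion via `Cast::cast_from`;
    // do not reinterpret/transmute the raw bits.
    tf32::cast_from(value)
}
```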
## Feature Details
### Plane

Plane-level operations, e.g. `plane_sum`, `plane_elect`.
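As an illustration, a plane-wide sum inside a kernel might look like the sketch below. The kernel shape and launch attribute are assumptions for the example, not a prescribed pattern:

```rust
use cubecl::prelude::*;

#[cube(launch)]
fn plane_reduce<F: Float>(input: &Array<F>, output: &mut Array<F>) {
    // Every unit in the plane contributes one element; `plane_sum`
    // returns the sum across all units of the plane.
    let total = plane_sum(input[UNIT_POS]);

    // Only the first unit writes the reduced value for its cube.
    if UNIT_POS == 0 {
        output[CUBE_POS] = total;
    }
}
```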
### Cooperative Matrix Multiply-Add (CMMA)

Plane-level cooperative matrix multiply-add operations. Maps to `wmma` in CUDA and
`CooperativeMatrixMultiply` in SPIR-V. Features are registered for each size and datatype that is
supported by the hardware. For supported functions, see `cmma`.
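Because CMMA support is registered per shape and datatype, a specific configuration can be checked before dispatch. The following is a hedged sketch: the `Feature::Cmma` field names (`a`, `b`, `c`, `m`, `k`, `n`) and the `cubecl::ir` paths are assumptions; consult the `Feature` docs for your version.

```rust
use cubecl::prelude::*;
use cubecl::Feature;
use cubecl::ir::{Elem, FloatKind};
use cubecl::wgpu::{WgpuDevice, WgpuRuntime};

fn main() {
    let client = WgpuRuntime::client(&WgpuDevice::default());

    // Check for 16x16x16 CMMA with f16 inputs accumulating into f32.
    // Field names are assumed; check the `Feature` enum in your version.
    let supported = client.properties().feature_enabled(Feature::Cmma {
        a: Elem::Float(FloatKind::F16),
        b: Elem::Float(FloatKind::F16),
        c: Elem::Float(FloatKind::F32),
        m: 16,
        k: 16,
        n: 16,
    });
    println!("16x16x16 f16 CMMA supported: {supported}");
}
```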