Initial value:
= (std::same_as<ABI, x86_512_> && !detail::avx512_default_64_bytes)
  ? expected_cardinal_v<T, x86_256_>
  : expected_cardinal_v<T, ABI>
You can find more explanations in the 'frequency scaling tutorial'. This refers to the extreme frequency scaling one encounters when working with 64-byte registers on Intel: the processor lowers its frequency dramatically for a substantial period of time, so even if the algorithm itself runs faster, overall performance goes down. Generally speaking, 64-byte registers on Intel make sense only for really big data sets.
nofs_cardinal will produce 32-byte registers on AVX512. If you would like to default to 64-byte registers, you can build with -DEVE_AVX512_DEFAULT_64_BYTES. This is probably a good idea on AMD Zen 4, but we do not detect that at the moment.
- Note
- Frequency scaling exists for AVX2 as well but is generally considered acceptable. For example, popular implementations of libc use AVX2, so you are very likely already subject to it. You can always set the width manually if needed.
- eve::algo will use nofs_cardinal by default. See the allow_frequency_scaling trait.
- Template Parameters
-
Type | Type of value to assess |
ABI | SIMD ABI to use as reference. Must model eve::regular_abi. Defaults to eve::current_abi_type. |
Helpers
template <scalar_value T, regular_abi ABI = eve::current_abi_type>
template <scalar_value T>
template <plain_scalar_value T>
using nofs_logical = logical<nofs_wide<T>>;