In EVE we recognise that we cannot possibly cover everything you might want to do, so users need to be able to drop down to intrinsics.
That is why eve::wide and eve::logical for native sizes are implicitly convertible to and from the intrinsic type you would expect.
x86: __m128/__m256/__m512 for floats, __m128d/__m256d/__m512d for doubles, __m128i/__m256i/__m512i for ints
neon: int8x8_t, int8x16_t for int8s etc.
sve, powerpc: ....
To combine this with the algorithms in eve::algo, we have to force a specific cardinal. All algorithms are designed not to require templated callbacks, so this remains possible. Unfortunately, views still require a templated callback.
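#include <eve/module/core.hpp>
#include <eve/module/algo.hpp>

#include <cstdint>
#include <span>
#include <string>

#ifdef __SSSE3__
void remove_spaces(std::string& s)
{
  // Fixed 16-lane type matching __m128i, so the callback can use SSSE3 intrinsics.
  using u8x16 = eve::wide<std::uint8_t, eve::fixed<16>>;

  std::span s_bytes{reinterpret_cast<std::uint8_t*>(s.data()), s.size()};

  // force_cardinal<16> makes the algorithm call the lambda with u8x16 only.
  auto end = eve::algo::remove_if[eve::algo::force_cardinal<16>](
    s_bytes, [](u8x16 v) -> eve::logical<u8x16> {
      // The low 4 bits of each byte index the table; only ' ', '\t', '\n'
      // and '\r' map back to themselves.
      __m128i lut = _mm_setr_epi8(' ', 0, 0, 0,
                                  0, 0, 0, 0,
                                  0, '\t', '\n', 0,
                                  0, '\r', 0, 0);
      return v == u8x16{_mm_shuffle_epi8(lut, v)};
    });

  auto offset = end - s_bytes.begin();
  s.erase(s.begin() + offset, s.end());
}
#endif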
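The callback above works because of how _mm_shuffle_epi8 selects bytes: for each lane, the low 4 bits of the index byte pick an entry from the table, and a set top bit yields 0. The following is a minimal scalar sketch of that lookup, not EVE code; the names shuffle_epi8_model, space_lut, is_space_via_lut and remove_spaces_scalar are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <string>

// Scalar model of _mm_shuffle_epi8 on one 16-byte register:
// a set top bit in the index byte gives 0, otherwise the low
// 4 bits of the index select a byte from the table.
inline std::array<std::uint8_t, 16>
shuffle_epi8_model(const std::array<std::uint8_t, 16>& table,
                   const std::array<std::uint8_t, 16>& idx)
{
  std::array<std::uint8_t, 16> out{};
  for (int i = 0; i < 16; ++i)
    out[i] = (idx[i] & 0x80) ? 0 : table[idx[i] & 0x0F];
  return out;
}

// Same table as in remove_spaces: each whitespace byte sits at the
// position given by its own low nibble (' ' = 0x20 -> 0, '\t' = 0x09 -> 9,
// '\n' = 0x0A -> 10, '\r' = 0x0D -> 13).
constexpr std::array<std::uint8_t, 16> space_lut = {{
  ' ', 0, 0, 0, 0, 0, 0, 0, 0, '\t', '\n', 0, 0, '\r', 0, 0}};

// Per-byte equivalent of `v == shuffle(lut, v)`.
inline bool is_space_via_lut(std::uint8_t c)
{
  std::array<std::uint8_t, 16> v{};
  v.fill(c);
  return shuffle_epi8_model(space_lut, v)[0] == c;
}

// Scalar reference for what the SIMD version is expected to compute.
inline std::string remove_spaces_scalar(std::string s)
{
  auto is_space = [](char c) {
    return is_space_via_lut(static_cast<std::uint8_t>(c));
  };
  s.erase(std::remove_if(s.begin(), s.end(), is_space), s.end());
  return s;
}
```

A byte compares equal to its own lookup result only if the table holds that exact byte at the index given by its low nibble, which is why one shuffle plus one comparison classifies all 16 lanes at once.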
What about non-native sizes?
If the cardinal is smaller than the native one, the wide is still represented internally as a full register. For example, wide<int, eve::fixed<2>> on x86 is still an __m128i. The relevant data occupies the first 2 elements; the others can be garbage.
If the cardinal is larger than the native one, you can use slice to split the wide into two halves, repeating until you reach a native size.
TTS_CASE("Slice Example") {
  i32x16 ints {0, 1, 2, 3,
               4, 5, 6, 7,
               8, 9, 10, 11,
               12, 13, 14, 15 };
  i32x8 lo_expected {
    0, 1, 2, 3, 4, 5, 6, 7 };
  i32x8 hi_expected {
    8, 9, 10, 11, 12, 13, 14, 15 };
  TTS_EQUAL(ints.slice(eve::lower_), lo_expected);
  TTS_EQUAL(ints.slice(eve::upper_), hi_expected);
  auto [lo, hi] = ints.slice();
  TTS_EQUAL(lo, lo_expected);
  TTS_EQUAL(hi, hi_expected);
};
If you write something generally useful, maybe consider contributing?