◆ set_intersection

eve::algo::set_intersection = function_with_traits<set_intersection_>

inlineconstexpr

#include <eve/module/algo/algo/set_intersection.hpp>

Defined in Header

#include <eve/module/algo.hpp>

The main idea for the algorithm comes from "Faster-Than-Native Alternatives for x86 VP2INTERSECT Instructions" by Guillermo Diez-Canas. Link: https://arxiv.org/abs/2112.06342

Differences:

duplicate handling, eve::algo::set_intersection does not guarantee how many copies of a duplicated element will be in the output. Example: [a, a] intersect with [a, a, a] might produce [a], [a, a] and [a, a, a].
we require both "Less" and "Equal" predicates, unlike std getting equal from less will be expensive.
eve requires an output range instead of an output iterator. eve will do checks that there is enough output space and if there isn't, it will stop. If the algorithm stopped because of insufficient output space, the return value will contain where it stopped - (in1, in2) will point past the last written duplicate. TODO(#1630): provide support for output overallocation.

Other:

Less and Equal have to be semantically compatible
Eve always writes from the first range in case of equivalent elements. If eve::is_less/is_greater and eve::is_equal are passed in, elements are considered to be equal and eve can write from either range.

Tuning:

Has expect_smaller_range<0>, expect_smaller_range<1> variations. These use combined simd/scalar approach. Since the problem is very data dependent, you want to measure for your usecase.
Version expect_smaller_range<0> can be slightly faster, eve can't always use that due to a stability requirement but you might want to swap your inputs.
The provided solution is minimal, eve won't search for beginning of intersection in any way. eve also won't do any dispatch based on size. If this is something that can be beneficial for your case - consider it.
sparse_output/dense_output - controls which eve::algo::compress_copy is used, default is dense.
Basic version does not support aligning/unrolling. expect_smaller_range do.

Callable Signatures

namespace eve::algo
{
  template<relaxed_range R1, relaxed_range R2, relaxed_range RO, typename Less, typename
  Equal> auto set_intersection(R1&& r1, R2&& r2, RO&& ro, Less less, Equal equal)  // (1)
    -> set_intersection_result<unaligned_iterator_t<R1>,
                               unaligned_iterator_t<R2>,
                               unaligned_iterator_t<RO>>
 
  template<relaxed_range R1, relaxed_range R2, relaxed_range RO>
  auto set_intersection(R1&& r1, R2&& r2, RO&& ro)                          // (2)
    -> set_intersection_result<unaligned_iterator_t<R1>,
                               unaligned_iterator_t<R2>,
                               unaligned_iterator_t<RO>>
}

(2) calls (1) with eve::is_less, eve::is_equal and converting to common type.

Parameters (1)

r1, r2 - relaxed ranges to intersect
ro - output relaxed range
less - simd strict weak ordering for elements from r1 and r2
equal - simd predicate for equivalence, compatible with less.

Return value

set_intersection{ .in1, .in2, .out }.

if ro had enough space for all common elements - .in1 == r1.end(), .in2 == r2.end(), .out is an iterator in ro past the last written elements (same as std::set_intersection)
if we ran out of space in ro before finishing the algorithm, .in1 and .in2 will point to positions past the last common element. .out == ro.end()

Example

#include <eve/module/algo.hpp>
#include <eve/module/core.hpp>
 
#include <numeric>
#include <iostream>
#include <utility>
#include <vector>
 
#include <tts/tts.hpp>
 
void
basic_example()
{
  std::cout << "basic =================\n";
 
  std::vector<int> v1 {1, 3, 5, 7};
  std::vector<int> v2 {2, 3, 4, 5, 6};
  std::vector<int> r;
 
  /*
   * The basic case, where output has guaranteed enough space
   * for the intersected elements.
   *
   * We will see all of the common elements in the output.
   * in1, in2 equal to the v1.end(), v2.end()
   */
  r.resize(v1.size());
  r.erase(eve::algo::set_intersection(v1, v2, r).out, r.end());
  TTS_EQUAL(r, (std::vector{3, 5}));
 
  std::cout << "-> v1:" << tts::as_string(v1) << '\n';
  std::cout << "-> v2:" << tts::as_string(v2) << '\n';
  std::cout << "-> r: " << tts::as_string(r) << '\n';
}
 
void not_enough_space_example()
{
  std::cout << "not enough space ====================\n";
 
  std::vector<int> v1 {1, 3, 5, 7};
  std::vector<int> v2 {2, 3, 4, 5, 6};
  std::vector<int> r;
 
  /*
   * The not enough space case - we will stop whe the output
   * runs out.
   *
   * in1 and in2 will be after the last common element.
   */
  r.resize(1);
  auto [in1, in2, out] = eve::algo::set_intersection(v1, v2, r);
 
  TTS_EQUAL(r, std::vector{3});
  TTS_EQUAL(*in1, 5);
  TTS_EQUAL(*in2, 4);
  TTS_EQUAL(out, r.end());
 
  std::cout << "r:             " << tts::as_string(r) << "\n";
  std::cout << "*in1:          " << *in1 << "\n";
  std::cout << "*in2:          " << *in2 << "\n";
  std::cout << "out == r.end() " << tts::as_string(out == r.end()) << "\n";
}
 
void
pairs_example()
{
  std::cout << "mixing types, custom predicate =================\n";
 
  /*
   * Example that does set_union for tuples and integers.
   */
 
  using pair_t = kumi::tuple<int, double>;
 
  eve::algo::soa_vector<pair_t> pairs {{0, 1.1f}, {2, 2.2f}, {3, 3.3f}, {4, 4.4f}};
  std::vector<int>              scalars {2, 4};
 
  eve::algo::soa_vector<pair_t> r;
  r.resize(pairs.size());
 
  auto project = []<typename T>(T a) {
    if constexpr (eve::like<T, pair_t>) return get<0>(a);
    else return a;
  };
 
  /*
   * Unfortunately, unlike std::set_intersect, we for now require that
   * the callback applies both ways.
   * FIX-1631
   */
  auto lt = [&](auto a, auto b) {
    return project(a) < project(b);
  };
 
  auto eq = [&](auto a, auto b) {
    return project(a) == project(b);
  };
 
  r.erase(eve::algo::set_intersection(pairs, scalars, r, lt, eq).out, r.end());
 
  eve::algo::soa_vector<pair_t> r_expected {{2, 2.2f}, {4, 4.4f}};
  TTS_EQUAL(r, r_expected);
 
  std::cout << "-> pairs:   " << tts::as_string(pairs) << '\n';
  std::cout << "-> sclaras: " << tts::as_string(scalars) << '\n';
  std::cout << "-> r:       " << tts::as_string(r) << '\n';
}
 
void
different_scalar_types_example()
{
  std::cout << "different_scalar_types =================\n";
 
  std::vector<double> f64 {10.1, 20, 30.3, 40.4};
  std::vector<int>    i32 {20, 40};
 
  // We will convert from the intersection to the output type
  std::vector<float> r0(5u, 0.0);
 
  // We will use eve::common_type for comparisons between different types
  // if predicates are defaulted.
  r0.erase(eve::algo::set_intersection(f64, i32, r0).out, r0.end());
  TTS_EQUAL(r0, std::vector<float>{20});
 
  auto project = [](auto a) {
    return eve::convert(a, eve::as<std::int64_t>{}) / 10;
  };
 
  auto lt = [&](auto a, auto b) {
    return project(a) < project(b);
  };
 
  auto eq = [&](auto a, auto b) {
    return project(a) == project(b);
  };
 
  // The output is always copied from the first range, so r1, r2 will give
  // different results.
  std::vector<float> r1(5u, 0.0);
  r1.erase(eve::algo::set_intersection(f64, i32, r1, lt, eq).out, r1.end());
  TTS_EQUAL(r1, (std::vector<float>{20, 40.4}));
 
  std::vector<float> r2(5u, 0.0);
  r2.erase(eve::algo::set_intersection(i32, f64, r2, lt, eq).out, r2.end());
  TTS_EQUAL(r2, (std::vector<float>{20, 40}));
 
  std::cout << "-> f64: " << tts::as_string(f64) << '\n';
  std::cout << "-> i32: " << tts::as_string(i32) << '\n';
  std::cout << "-> r0:  " << tts::as_string(r0) << '\n';
  std::cout << "-> r1:  " << tts::as_string(r1) << '\n';
  std::cout << "-> r2:  " << tts::as_string(r2) << '\n';
}
 
void
biase_example()
{
  std::cout << "smaller range hint =================\n";
  std::vector<int> v1(1000u, 0), v2{531, 917};
  std::iota(v1.begin(), v1.end(), 0);
 
  std::vector<int> r0{0, 1}, r1{0, 1};
 
  eve::algo::set_intersection[eve::algo::expect_smaller_range<0>](v1, v2, r0);
  eve::algo::set_intersection[eve::algo::expect_smaller_range<1>](v2, v1, r1);
 
  TTS_EQUAL(r0, v2);
  TTS_EQUAL(r1, v2);
 
  std::cout << "-> r0:  " << tts::as_string(r0) << '\n';
  std::cout << "-> r1:  " << tts::as_string(r1) << '\n';
}
 
void
sparse_output_example()
{
  std::cout << "sparse_output hint =================\n";
 
  std::vector<int> v1(20u, 0), v2(20u, 0);
  for (int i = 0; auto& x : v1) x = i++ * 3;
  for (int i = 0; auto& x : v2) x = i++ * 5;
 
  std::vector<int> r;
  r.resize(20);
 
  r.erase(
    eve::algo::set_intersection[eve::algo::sparse_output](v1, v2, r).out,
    r.end());
 
  std::cout << " -> r: " << tts::as_string(r) << std::endl;
  TTS_EQUAL(r, std::vector({0, 15, 30, 45}));
}
 
int
main()
{
  basic_example();
  not_enough_space_example();
  pairs_example();
  different_scalar_types_example();
  biase_example();
  sparse_output_example();
}

	E.V.E v2023.02.15