std::simd

Struct Simd

pub struct Simd<T, const N: usize>(/* private fields */)
where
    LaneCount<N>: SupportedLaneCount,
    T: SimdElement;

🔬This is a nightly-only experimental API. (portable_simd #86656)

A SIMD vector with the shape of [T; N] but the operations of T.

Simd<T, N> supports the operators (+, *, etc.) that T does in “elementwise” fashion. These take the element at each index from the left-hand side and right-hand side, perform the operation, then return the result in the same index in a vector of equal size. However, Simd differs from normal iteration and normal arrays:

Simd<T, N> executes N operations in a single step with no breaks
Simd<T, N> can have an alignment greater than T, for better mechanical sympathy

By always imposing these constraints on Simd, it is easier to compile elementwise operations into machine instructions that can themselves be executed in parallel.

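// Requires a nightly toolchain with `#![feature(portable_simd)]` at the crate root.
use std::array;
use std::simd::Simd;

let a: [i32; 4] = [-2, 0, 2, 4];
let b = [10, 9, 8, 7];
// Scalar reference results, computed one index at a time.
let sum = array::from_fn(|i| a[i] + b[i]);
let prod = array::from_fn(|i| a[i] * b[i]);

// `Simd<T, N>` implements `From<[T; N]>`
let (v, w) = (Simd::from(a), Simd::from(b));
// Which means arrays implement `Into<Simd<T, N>>`.
assert_eq!(v + w, sum.into());
assert_eq!(v * w, prod.into());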

Simd with integer elements treats operators as wrapping, as if T was Wrapping<T>. Thus, Simd does not implement wrapping_add, because that is the default behavior. This means there is no warning on overflows, even in “debug” builds. For most applications where Simd is appropriate, it is “not a bug” to wrap, and even “debug builds” are unlikely to tolerate the loss of performance. You may want to consider using explicitly checked arithmetic if such is required. Division by zero on integers still causes a panic, so you may want to consider using f32 or f64 if that is unacceptable.
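
Since Simd is nightly-only, the following is a scalar sketch on stable Rust that models the per-lane behavior described above rather than calling Simd itself. The helper names wrapping_lanes and checked_lanes are hypothetical and exist only for illustration; they show a lane wrapping silently and how explicitly checked arithmetic could surface the same overflow.

```rust
use std::array;

/// Lane-by-lane addition that wraps on overflow, modeling what the text
/// says integer `Simd` `+` does by default. (Hypothetical helper.)
fn wrapping_lanes(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    array::from_fn(|i| a[i].wrapping_add(b[i]))
}

/// Lane-by-lane addition that reports overflow instead of wrapping.
/// Returns `None` as soon as any lane overflows. (Hypothetical helper.)
fn checked_lanes(a: [i32; 4], b: [i32; 4]) -> Option<[i32; 4]> {
    let mut out = [0i32; 4];
    for i in 0..4 {
        out[i] = a[i].checked_add(b[i])?;
    }
    Some(out)
}

fn main() {
    let a = [i32::MAX, 1, 2, 3];
    let b = [1, 1, 1, 1];
    // The first lane silently wraps around to i32::MIN.
    assert_eq!(wrapping_lanes(a, b), [i32::MIN, 2, 3, 4]);
    // The checked variant surfaces the overflow instead.
    assert_eq!(checked_lanes(a, b), None);
    assert_eq!(checked_lanes([1, 2, 3, 4], b), Some([2, 3, 4, 5]));
}
```

The checked variant pays for a branch per lane, which is the kind of cost the text suggests most SIMD code would rather avoid.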

§Layout

Simd<T, N> has a layout similar to [T; N] (identical “shapes”), but with a greater alignment. [T; N] is aligned to T, but Simd<T, N> will have an alignment based on both T and N. Thus it is sound to transmute Simd<T, N> to [T; N], and doing so should optimize to “zero cost”, but the reverse transmutation may require a copy the compiler cannot simply elide.
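
To make the size-versus-alignment distinction concrete on stable Rust, the sketch below uses a hypothetical stand-in type, Vec4, in place of Simd<f32, 4>. Its 16-byte alignment is an assumption chosen for illustration; the real alignment of Simd<T, N> is picked by the compiler and is not promised to be any particular value.

```rust
use std::mem::{align_of, size_of, transmute};

/// Hypothetical stand-in for an over-aligned vector type: four `f32`
/// lanes with a 16-byte alignment. This is NOT `Simd<f32, 4>`.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(16))]
struct Vec4([f32; 4]);

/// Convert the stand-in to a plain array by value.
fn to_array(v: Vec4) -> [f32; 4] {
    // SAFETY: both types are exactly 16 bytes of `f32` data with no
    // padding, and every bit pattern is valid for both.
    unsafe { transmute::<Vec4, [f32; 4]>(v) }
}

fn main() {
    // Same "shape": identical size.
    assert_eq!(size_of::<Vec4>(), size_of::<[f32; 4]>());
    // Different alignment: the vector-like type is stricter.
    assert!(align_of::<Vec4>() > align_of::<[f32; 4]>());
    assert_eq!(to_array(Vec4([1.0, 2.0, 3.0, 4.0])), [1.0, 2.0, 3.0, 4.0]);
}
```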

§ABI “Features”

Due to Rust’s safety guarantees, Simd<T, N> is currently passed and returned via memory, not SIMD registers, except as an optimization. Using #[inline] on functions that accept Simd<T, N> or return it is recommended, at the cost of code generation time, as inlining SIMD-using functions can omit a large function prolog or epilog and thus improve both speed and code size. The need for this may be corrected in the future.

Using #[inline(always)] still requires additional care.

§Safe SIMD with Unsafe Rust

Operations with Simd are typically safe, but there are many reasons to want to combine SIMD with unsafe code. Care must be taken to respect differences between Simd and other types it may be transformed into or derived from. In particular, the layout of Simd<T, N> may be similar to [T; N], and may allow some transmutations, but references to [T; N] are not interchangeable with those to Simd<T, N>. Thus, when using unsafe Rust to read and write Simd<T, N> through raw pointers, it is a good idea to first try with read_unaligned and write_unaligned. This is because:

read and write require full alignment (in this case, Simd<T, N>’s alignment)
Simd<T, N> is often read from or written to [T] and other types aligned to T
combining these actions violates the unsafe contract and explodes the program into a puff of undefined behavior
the compiler can implicitly adjust layouts to make unaligned reads or writes fully aligned if it sees the optimization
most contemporary processors with “aligned” and “unaligned” read and write instructions exhibit no performance difference if the “unaligned” variant is aligned at runtime

Fewer obligations mean unaligned reads and writes are less likely to make the program unsound, and may be just as fast as stricter alternatives. When trying to guarantee alignment, [T]::as_simd is an option for converting [T] to [Simd<T, N>], and allows soundly operating on an aligned SIMD body, but it may cost more time when handling the scalar head and tail. If these are not enough, it is best to design data structures to be already aligned to align_of::<Simd<T, N>>() before using unsafe Rust to read or write. Other ways to compensate for these facts, like materializing Simd to or from an array first, are handled by safe methods like Simd::from_array and Simd::from_slice.
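
The two strategies above can be sketched on stable Rust with a hypothetical over-aligned stand-in type, Vec4, instead of the nightly-only Simd<f32, 4>. The helper load4 reads through a possibly misaligned pointer with ptr::read_unaligned, and sum_aligned uses the stable slice method align_to to split a slice into a scalar head, an aligned body, and a scalar tail, which is the same head/body/tail shape the text attributes to as_simd. Both helper names and the 16-byte alignment are assumptions made for illustration.

```rust
use std::ptr;

/// Hypothetical stand-in for an over-aligned vector type (not `Simd`).
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(16))]
struct Vec4([f32; 4]);

/// Load four lanes starting at `offset` without assuming alignment.
fn load4(data: &[f32], offset: usize) -> Vec4 {
    let lanes = &data[offset..offset + 4]; // panics if out of bounds
    // SAFETY: `lanes` covers exactly four initialized `f32`s (16 bytes),
    // and `read_unaligned` places no alignment requirement on the pointer.
    unsafe { ptr::read_unaligned(lanes.as_ptr().cast::<Vec4>()) }
}

/// Sum a slice by splitting it into a scalar head, an aligned body of
/// `Vec4` chunks, and a scalar tail.
fn sum_aligned(data: &[f32]) -> f32 {
    // SAFETY: every bit pattern of four `f32`s is a valid `Vec4`.
    let (head, body, tail) = unsafe { data.align_to::<Vec4>() };
    let mut total: f32 = head.iter().sum();
    for v in body {
        total += v.0.iter().sum::<f32>();
    }
    total + tail.iter().sum::<f32>()
}

fn main() {
    let data: Vec<f32> = (0..10).map(|i| i as f32).collect();
    // Offset 1 is generally not 16-byte aligned, which is fine here.
    assert_eq!(load4(&data, 1), Vec4([1.0, 2.0, 3.0, 4.0]));
    // Small integers sum exactly in `f32`, whatever the split turns out to be.
    assert_eq!(sum_aligned(&data), 45.0);
}
```

How many elements land in the head, body, and tail depends on where the slice happens to sit in memory, so code built this way should not assume any particular split.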

Implementations§

impl<T, const N: usize> Simd<T, N>
where T: SimdElement, LaneCount<N>: SupportedLaneCount,

pub fn reverse(self) -> Simd<T, N>

pub fn rotate_elements_left<const OFFSET: usize>(self) -> Simd<T, N>

pub fn resize<const M: usize>(self, value: T) -> Simd<T, M>
where LaneCount<M>: SupportedLaneCount,

pub fn extract<const START: usize, const LEN: usize>(self) -> Simd<T, LEN>
where LaneCount<LEN>: SupportedLaneCount,

impl<const N: usize> Simd<u8, N>
where LaneCount<N>: SupportedLaneCount,

pub fn swizzle_dyn(self, idxs: Simd<u8, N>) -> Simd<u8, N>

impl<T, const N: usize> Simd<T, N>
where LaneCount<N>: SupportedLaneCount, T: SimdElement,

pub const LEN: usize = N

pub const fn len(&self) -> usize

§Examples

pub const fn from_array(array: [T; N]) -> Simd<T, N>

pub const fn to_array(self) -> [T; N]

pub const fn from_slice(slice: &[T]) -> Simd<T, N>

§Panics

§Example

pub fn load_or_default(slice: &[T]) -> Simd<T, N>
where T: Default,

§Examples

pub fn load_select_or_default(slice: &[T], enable: Mask<<T as SimdElement>::Mask, N>) -> Simd<T, N>
where T: Default,

§Examples

pub fn load_select(slice: &[T], enable: Mask<<T as SimdElement>::Mask, N>, or: Simd<T, N>) -> Simd<T, N>

§Examples

pub unsafe fn load_select_unchecked(slice: &[T], enable: Mask<<T as SimdElement>::Mask, N>, or: Simd<T, N>) -> Simd<T, N>

§Safety

pub unsafe fn load_select_ptr(ptr: *const T, enable: Mask<<T as SimdElement>::Mask, N>, or: Simd<T, N>) -> Simd<T, N>

§Safety

pub fn gather_or(slice: &[T], idxs: Simd<usize, N>, or: Simd<T, N>) -> Simd<T, N>

§Examples

pub fn gather_or_default(slice: &[T], idxs: Simd<usize, N>) -> Simd<T, N>
where T: Default,

§Examples

pub fn gather_select(slice: &[T], enable: Mask<isize, N>, idxs: Simd<usize, N>, or: Simd<T, N>) -> Simd<T, N>

§Examples

pub unsafe fn gather_select_unchecked(slice: &[T], enable: Mask<isize, N>, idxs: Simd<usize, N>, or: Simd<T, N>) -> Simd<T, N>

§Safety

§Examples

pub unsafe fn gather_ptr(source: Simd<*const T, N>) -> Simd<T, N>
where T: Default,

§Safety

§Example

pub unsafe fn gather_select_ptr(source: Simd<*const T, N>, enable: Mask<isize, N>, or: Simd<T, N>) -> Simd<T, N>

§Safety

§Example

pub fn store_select(self, slice: &mut [T], enable: Mask<<T as SimdElement>::Mask, N>)

§Examples

pub unsafe fn store_select_unchecked(self, slice: &mut [T], enable: Mask<<T as SimdElement>::Mask, N>)

§Safety

§Examples

pub unsafe fn store_select_ptr(self, ptr: *mut T, enable: Mask<<T as SimdElement>::Mask, N>)

§Safety

pub fn scatter(self, slice: &mut [T], idxs: Simd<usize, N>)

§Examples

pub fn scatter_select(self, slice: &mut [T], enable: Mask<isize, N>, idxs: Simd<usize, N>)

§Examples

pub unsafe fn scatter_select_unchecked(self, slice: &mut [T], enable: Mask<isize, N>, idxs: Simd<usize, N>)

§Safety

§Examples

§Safety

§Example

pub unsafe fn scatter_select_ptr(self, dest: Simd<*mut T, N>, enable: Mask<isize, N>)

§Safety

§Example

Trait Implementations§

impl<'lhs, 'rhs, T, const N: usize> Add<&'rhs Simd<T, N>> for &'lhs Simd<T, N>
where T: SimdElement, Simd<T, N>: Add<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

fn add(self, rhs: &'rhs Simd<T, N>) -> <&'lhs Simd<T, N> as Add<&'rhs Simd<T, N>>>::Output

impl<T, const N: usize> Add<&Simd<T, N>> for Simd<T, N>
where T: SimdElement, Simd<T, N>: Add<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> Add<Simd<T, N>> for &Simd<T, N>
where T: SimdElement, Simd<T, N>: Add<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<f32, N>
where f32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<f64, N>
where f64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<i16, N>
where i16: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<i32, N>
where i32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<i64, N>
where i64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<i8, N>
where i8: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<isize, N>
where isize: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<u16, N>
where u16: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<u32, N>
where u32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<u64, N>
where u64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<u8, N>
where u8: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> Add for Simd<usize, N>
where usize: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<T, U, const N: usize> AddAssign<U> for Simd<T, N>
where Simd<T, N>: Add<U, Output = Simd<T, N>>, T: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> AsMut<[T]> for Simd<T, N>
where LaneCount<N>: SupportedLaneCount, T: SimdElement,

impl<T, const N: usize> AsMut<[T; N]> for Simd<T, N>
where LaneCount<N>: SupportedLaneCount, T: SimdElement,

impl<T, const N: usize> AsRef<[T]> for Simd<T, N>
where LaneCount<N>: SupportedLaneCount, T: SimdElement,

impl<T, const N: usize> AsRef<[T; N]> for Simd<T, N>
where LaneCount<N>: SupportedLaneCount, T: SimdElement,

impl<'lhs, 'rhs, T, const N: usize> BitAnd<&'rhs Simd<T, N>> for &'lhs Simd<T, N>
where T: SimdElement, Simd<T, N>: BitAnd<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> BitAnd<&Simd<T, N>> for Simd<T, N>
where T: SimdElement, Simd<T, N>: BitAnd<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> BitAnd<Simd<T, N>> for &Simd<T, N>
where T: SimdElement, Simd<T, N>: BitAnd<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<i16, N>
where i16: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<i32, N>
where i32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<i64, N>
where i64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<i8, N>
where i8: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<isize, N>
where isize: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<u16, N>
where u16: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<u32, N>
where u32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<u64, N>
where u64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<u8, N>
where u8: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitAnd for Simd<usize, N>
where usize: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<T, U, const N: usize> BitAndAssign<U> for Simd<T, N>
where Simd<T, N>: BitAnd<U, Output = Simd<T, N>>, T: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<'lhs, 'rhs, T, const N: usize> BitOr<&'rhs Simd<T, N>> for &'lhs Simd<T, N>
where T: SimdElement, Simd<T, N>: BitOr<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> BitOr<&Simd<T, N>> for Simd<T, N>
where T: SimdElement, Simd<T, N>: BitOr<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> BitOr<Simd<T, N>> for &Simd<T, N>
where T: SimdElement, Simd<T, N>: BitOr<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<i16, N>
where i16: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<i32, N>
where i32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<i64, N>
where i64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<i8, N>
where i8: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<isize, N>
where isize: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<u16, N>
where u16: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<u32, N>
where u32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<u64, N>
where u64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<u8, N>
where u8: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitOr for Simd<usize, N>
where usize: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<T, U, const N: usize> BitOrAssign<U> for Simd<T, N>
where Simd<T, N>: BitOr<U, Output = Simd<T, N>>, T: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<'lhs, 'rhs, T, const N: usize> BitXor<&'rhs Simd<T, N>> for &'lhs Simd<T, N>
where T: SimdElement, Simd<T, N>: BitXor<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> BitXor<&Simd<T, N>> for Simd<T, N>
where T: SimdElement, Simd<T, N>: BitXor<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> BitXor<Simd<T, N>> for &Simd<T, N>
where T: SimdElement, Simd<T, N>: BitXor<Output = Simd<T, N>>, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<i16, N>
where i16: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<i32, N>
where i32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<i64, N>
where i64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<i8, N>
where i8: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<isize, N>
where isize: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<u16, N>
where u16: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<u32, N>
where u32: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<u64, N>
where u64: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<u8, N>
where u8: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<const N: usize> BitXor for Simd<usize, N>
where usize: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<T, U, const N: usize> BitXorAssign<U> for Simd<T, N>
where Simd<T, N>: BitXor<U, Output = Simd<T, N>>, T: SimdElement, LaneCount<N>: SupportedLaneCount,

impl<T, const N: usize> Clone for Simd<T, N>
where LaneCount<N>: SupportedLaneCount, T: SimdElement,

impl<T, const N: usize> Debug for Simd<T, N>
where LaneCount<N>: SupportedLaneCount, T: SimdElement + Debug,