The OpenD Programming Language


Public Imports

public import inteli.types;
Undocumented in source.



alias _mm_cvt_pi2ps = _mm_cvtpi32_ps

Convert packed signed 32-bit integers in b to packed single-precision (32-bit) floating-point elements, store the results in the lower 2 elements, and copy the upper 2 packed elements from a to the upper elements of result.

alias _mm_load1_ps = _mm_load_ps1

Load a single-precision (32-bit) floating-point element from memory into all elements.



Get the exception mask bits from the MXCSR control and status register. The exception mask may contain any of the following flags: _MM_MASK_INVALID, _MM_MASK_DIV_ZERO, _MM_MASK_DENORM, _MM_MASK_OVERFLOW, _MM_MASK_UNDERFLOW, _MM_MASK_INEXACT. Note: this won't correspond to reality on non-x86 targets, where MXCSR is emulated.


Get the exception state bits from the MXCSR control and status register. The exception state may contain any of the following flags: _MM_EXCEPT_INVALID, _MM_EXCEPT_DIV_ZERO, _MM_EXCEPT_DENORM, _MM_EXCEPT_OVERFLOW, _MM_EXCEPT_UNDERFLOW, _MM_EXCEPT_INEXACT. Note: this won't correspond to reality on non-x86 targets, where MXCSR is emulated and no exception is reported.


Get the flush zero bits from the MXCSR control and status register. The flush zero mode is one of the following flags: _MM_FLUSH_ZERO_ON or _MM_FLUSH_ZERO_OFF.


Get the rounding mode bits from the MXCSR control and status register. The rounding mode may contain any of the following flags: _MM_ROUND_NEAREST, _MM_ROUND_DOWN, _MM_ROUND_UP, _MM_ROUND_TOWARD_ZERO.


Set the exception mask bits of the MXCSR control and status register to the value in unsigned 32-bit integer _MM_MASK_xxxx. The exception mask may contain any of the following flags: _MM_MASK_INVALID, _MM_MASK_DIV_ZERO, _MM_MASK_DENORM, _MM_MASK_OVERFLOW, _MM_MASK_UNDERFLOW, _MM_MASK_INEXACT.


Set the exception state bits of the MXCSR control and status register to the value in unsigned 32-bit integer _MM_EXCEPT_xxxx. The exception state may contain any of the following flags: _MM_EXCEPT_INVALID, _MM_EXCEPT_DIV_ZERO, _MM_EXCEPT_DENORM, _MM_EXCEPT_OVERFLOW, _MM_EXCEPT_UNDERFLOW, _MM_EXCEPT_INEXACT.


Set the flush zero bits of the MXCSR control and status register to the value in unsigned 32-bit integer _MM_FLUSH_xxxx. The flush zero mode is one of the following flags: _MM_FLUSH_ZERO_ON or _MM_FLUSH_ZERO_OFF.


Set the rounding mode bits of the MXCSR control and status register to the value in unsigned 32-bit integer _MM_ROUND_xxxx. The rounding mode may contain any of the following flags: _MM_ROUND_NEAREST, _MM_ROUND_DOWN, _MM_ROUND_UP, _MM_ROUND_TOWARD_ZERO.

void _MM_TRANSPOSE4_PS(__m128 row0, __m128 row1, __m128 row2, __m128 row3)

Transpose the 4x4 matrix formed by the 4 rows of single-precision (32-bit) floating-point elements in row0, row1, row2, and row3, and store the transposed matrix in these vectors (row0 now contains column 0, etc.).
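As an illustration of the in-place behaviour described above, here is a minimal sketch that transposes a row-major 4x4 matrix stored in a plain array. It assumes the module is importable as inteli.xmmintrin and that the four row parameters are taken by reference; the helper name transpose4x4 is hypothetical.

```d
import inteli.xmmintrin; // assumed module name for the functions on this page

/// Hypothetical helper: transposes a row-major 4x4 matrix in place.
void transpose4x4(ref float[16] m)
{
    // Unaligned loads, since a float[16] is not guaranteed to be 16-byte aligned.
    __m128 row0 = _mm_loadu_ps(m.ptr);
    __m128 row1 = _mm_loadu_ps(m.ptr + 4);
    __m128 row2 = _mm_loadu_ps(m.ptr + 8);
    __m128 row3 = _mm_loadu_ps(m.ptr + 12);

    // Assumption: the rows are passed by reference and overwritten,
    // so row0 holds column 0 afterwards, row1 holds column 1, and so on.
    _MM_TRANSPOSE4_PS(row0, row1, row2, row3);

    _mm_storeu_ps(m.ptr, row0);
    _mm_storeu_ps(m.ptr + 4, row1);
    _mm_storeu_ps(m.ptr + 8, row2);
    _mm_storeu_ps(m.ptr + 12, row3);
}
```

After the call, element m[4 * r + c] holds what was previously at m[4 * c + r].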

__m128 _mm_add_ps(__m128 a, __m128 b)

Add packed single-precision (32-bit) floating-point elements in a and b.

__m128 _mm_add_ss(__m128 a, __m128 b)

Add the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
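The difference between the packed (_ps) and scalar (_ss) forms can be sketched as follows. This is only an illustration, assuming the module is importable as inteli.xmmintrin; the helper names addPacked and addLowest are hypothetical.

```d
import inteli.xmmintrin; // assumed module name for the functions on this page

/// Hypothetical helper: adds all four lanes with _mm_add_ps.
float[4] addPacked(float[4] x, float[4] y)
{
    __m128 r = _mm_add_ps(_mm_loadu_ps(x.ptr), _mm_loadu_ps(y.ptr));
    float[4] result;
    _mm_storeu_ps(result.ptr, r);
    return result;
}

/// Hypothetical helper: adds only the lowest lane with _mm_add_ss.
/// The upper 3 lanes of the result are copied from x.
float[4] addLowest(float[4] x, float[4] y)
{
    __m128 r = _mm_add_ss(_mm_loadu_ps(x.ptr), _mm_loadu_ps(y.ptr));
    float[4] result;
    _mm_storeu_ps(result.ptr, r);
    return result;
}
```

With x = [1, 2, 3, 4] and y = [10, 20, 30, 40], addPacked yields [11, 22, 33, 44] while addLowest yields [11, 2, 3, 4]. The same packed/scalar split applies to the other _ps/_ss pairs on this page.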

__m128 _mm_and_ps(__m128 a, __m128 b)

Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b.

__m128 _mm_andnot_ps(__m128 a, __m128 b)

Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b.

__m64 _mm_avg_pu16(__m64 a, __m64 b)

Average packed unsigned 16-bit integers in a and b.

__m64 _mm_avg_pu8(__m64 a, __m64 b)

Average packed unsigned 8-bit integers in a and b.

__m128 _mm_cmpeq_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for equality.

__m128 _mm_cmpeq_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for equality, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpge_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for greater-than-or-equal.

__m128 _mm_cmpge_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for greater-than-or-equal, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpgt_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for greater-than.

__m128 _mm_cmpgt_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for greater-than, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmple_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for less-than-or-equal.

__m128 _mm_cmple_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for less-than-or-equal, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmplt_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for less-than.

__m128 _mm_cmplt_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for less-than, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpneq_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for not-equal.

__m128 _mm_cmpneq_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for not-equal, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpnge_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for not-greater-than-or-equal.

__m128 _mm_cmpnge_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for not-greater-than-or-equal, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpngt_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for not-greater-than.

__m128 _mm_cmpngt_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for not-greater-than, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpnle_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal.

__m128 _mm_cmpnle_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpnlt_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than.

__m128 _mm_cmpnlt_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b for not-less-than, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpord_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b to see if neither is NaN.

__m128 _mm_cmpord_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b to see if neither is NaN, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_cmpunord_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b to see if either is NaN.

__m128 _mm_cmpunord_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b to see if either is NaN, and copy the upper 3 packed elements from a to the upper elements of result.

int _mm_comieq_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point element in a and b for equality, and return the boolean result (0 or 1).

int _mm_comige_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point element in a and b for greater-than-or-equal, and return the boolean result (0 or 1).

int _mm_comigt_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point element in a and b for greater-than, and return the boolean result (0 or 1).

int _mm_comile_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point element in a and b for less-than-or-equal, and return the boolean result (0 or 1).

int _mm_comilt_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point element in a and b for less-than, and return the boolean result (0 or 1).

int _mm_comineq_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point element in a and b for not-equal, and return the boolean result (0 or 1).

__m64 _mm_cvt_ps2pi(__m128 a)

Convert 2 lower packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers.

__m128 _mm_cvt_si2ss(__m128 v, int x)

Convert the signed 32-bit integer x to a single-precision (32-bit) floating-point element, store the result in the lower element, and copy the upper 3 packed elements from v to the upper elements of the result.

__m128 _mm_cvtpi16_ps(__m64 a)

Convert packed 16-bit integers in a to packed single-precision (32-bit) floating-point elements.

__m128 _mm_cvtpi32_ps(__m128 a, __m64 b)

Convert packed signed 32-bit integers in b to packed single-precision (32-bit) floating-point elements, store the results in the lower 2 elements, and copy the upper 2 packed elements from a to the upper elements of result.

__m128 _mm_cvtpi32x2_ps(__m64 a, __m64 b)

Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, store the results in the lower 2 elements, then convert the packed signed 32-bit integers in b to single-precision (32-bit) floating-point elements, and store the results in the upper 2 elements.

__m128 _mm_cvtpi8_ps(__m64 a)

Convert the lower packed 8-bit integers in a to packed single-precision (32-bit) floating-point elements.

__m64 _mm_cvtps_pi16(__m128 a)

Convert packed single-precision (32-bit) floating-point elements in a to packed 16-bit integers. Note: this intrinsic will generate 0x7FFF, rather than 0x8000, for input values between 0x7FFF and 0x7FFFFFFF.

__m64 _mm_cvtps_pi32(__m128 a)

Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers.

__m64 _mm_cvtps_pi8(__m128 a)

Convert packed single-precision (32-bit) floating-point elements in a to packed 8-bit integers, and store the results in lower 4 elements. Note: this intrinsic will generate 0x7F, rather than 0x80, for input values between 0x7F and 0x7FFFFFFF.

__m128 _mm_cvtpu16_ps(__m64 a)

Convert packed unsigned 16-bit integers in a to packed single-precision (32-bit) floating-point elements.

__m128 _mm_cvtpu8_ps(__m64 a)

Convert the lower packed unsigned 8-bit integers in a to packed single-precision (32-bit) floating-point elements.

__m128 _mm_cvtsi32_ss(__m128 v, int x)

Convert the signed 32-bit integer x to a single-precision (32-bit) floating-point element, store the result in the lower element, and copy the upper 3 packed elements from v to the upper elements of result.

__m128 _mm_cvtsi64_ss(__m128 v, long x)

Convert the signed 64-bit integer x to a single-precision (32-bit) floating-point element, store the result in the lower element, and copy the upper 3 packed elements from v to the upper elements of result.

float _mm_cvtss_f32(__m128 a)

Take the lower single-precision (32-bit) floating-point element of a.

int _mm_cvtss_si32(__m128 a)

Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer.

long _mm_cvtss_si64(__m128 a)

Convert the lower single-precision (32-bit) floating-point element in a to a 64-bit integer.

__m64 _mm_cvtt_ps2pi(__m128 a)

Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation.

int _mm_cvtt_ss2si(__m128 a)

Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer with truncation.
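To make the difference between the rounding and truncating conversions concrete, here is a small sketch. It assumes the module is importable as inteli.xmmintrin and that MXCSR is in its default round-to-nearest mode; the helper names roundToInt and truncateToInt are hypothetical.

```d
import inteli.xmmintrin; // assumed module name for the functions on this page

/// Hypothetical helper: converts using the current MXCSR rounding mode
/// (round-to-nearest unless the program changed it).
int roundToInt(float x)
{
    return _mm_cvtss_si32(_mm_set_ss(x));
}

/// Hypothetical helper: converts by truncating toward zero.
int truncateToInt(float x)
{
    return _mm_cvtt_ss2si(_mm_set_ss(x));
}
```

Under the default rounding mode, roundToInt(2.7f) gives 3 and truncateToInt(2.7f) gives 2. For -2.7f the results are -3 and -2 respectively.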

long _mm_cvttss_si64(__m128 a)

Convert the lower single-precision (32-bit) floating-point element in a to a 64-bit integer with truncation.

__m128 _mm_div_ps(__m128 a, __m128 b)

Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b.

__m128 _mm_div_ss(__m128 a, __m128 b)

Divide the lower single-precision (32-bit) floating-point element in a by the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.

int _mm_extract_pi16(__m64 a, int imm8)

Extract a 16-bit unsigned integer from a, selected with imm8. Zero-extended.

void _mm_free(void* mem_addr)

Free aligned memory that was allocated with _mm_malloc or _mm_realloc.

uint _mm_getcsr()

Get the unsigned 32-bit value of the MXCSR control and status register. Note: this is emulated on ARM, because there is no MXCSR register there.

__m64 _mm_insert_pi16(__m64 v, int i, int imm8)

Insert a 16-bit integer i inside v at the location specified by imm8.

__m128 _mm_load_ps(const(float)* p)

Load 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from memory.

__m128 _mm_load_ps1(const(float)* p)

Load a single-precision (32-bit) floating-point element from memory into all elements.

__m128 _mm_load_ss(const(float)* mem_addr)

Load a single-precision (32-bit) floating-point element from memory into the lower element of result, and zero the upper 3 elements. mem_addr does not need to be aligned on any particular boundary.

__m128 _mm_loadh_pi(__m128 a, const(__m64)* mem_addr)

Load 2 single-precision (32-bit) floating-point elements from memory into the upper 2 elements of result, and copy the lower 2 elements from a to result. mem_addr does not need to be aligned on any particular boundary.

__m128 _mm_loadl_pi(__m128 a, const(__m64)* mem_addr)

Load 2 single-precision (32-bit) floating-point elements from memory into the lower 2 elements of result, and copy the upper 2 elements from a to result. mem_addr does not need to be aligned on any particular boundary.

__m128 _mm_loadr_ps(const(float)* mem_addr)

Load 4 single-precision (32-bit) floating-point elements from memory in reverse order. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.

__m128 _mm_loadu_ps(const(float)* mem_addr)

Load 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from memory. mem_addr does not need to be aligned on any particular boundary.

void* _mm_malloc(size_t size, size_t alignment)

Allocate size bytes of memory, aligned to the alignment specified in alignment, and return a pointer to the allocated memory. _mm_free should be used to free memory that is allocated with _mm_malloc.
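A typical reason to use _mm_malloc is to obtain a buffer that satisfies the 16-byte alignment required by the aligned loads and stores. The following is a minimal sketch under that assumption; the module name inteli.xmmintrin and the helper name doubleViaAlignedBuffer are assumptions made for illustration.

```d
import inteli.xmmintrin; // assumed module name for the functions on this page

/// Hypothetical helper: doubles four floats using a 16-byte aligned scratch buffer.
float[4] doubleViaAlignedBuffer(float[4] values)
{
    // Room for 4 floats, aligned on a 16-byte boundary.
    float* buf = cast(float*) _mm_malloc(4 * float.sizeof, 16);
    assert(buf !is null);
    scope(exit) _mm_free(buf); // memory from _mm_malloc is released with _mm_free

    assert((cast(size_t) buf & 15) == 0); // alignment check

    // values may be unaligned, so load it with _mm_loadu_ps, then use
    // the aligned store and load on the aligned buffer.
    _mm_store_ps(buf, _mm_loadu_ps(values.ptr));
    __m128 v = _mm_load_ps(buf);
    __m128 doubled = _mm_add_ps(v, v);

    float[4] result;
    _mm_storeu_ps(result.ptr, doubled);
    return result;
}
```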

void _mm_maskmove_si64(__m64 a, __m64 mask, char* mem_addr)

Conditionally store 8-bit integer elements from a into memory using mask (elements are not stored when the highest bit is not set in the corresponding element) and a non-temporal memory hint.

__m64 _mm_max_pi16(__m64 a, __m64 b)

Compare packed signed 16-bit integers in a and b, and return packed maximum values.

__m128 _mm_max_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b, and return packed maximum values.

__m64 _mm_max_pu8(__m64 a, __m64 b)

Compare packed unsigned 8-bit integers in a and b, and return packed maximum values.

__m128 _mm_max_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b, store the maximum value in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.

__m64 _mm_min_pi16(__m64 a, __m64 b)

Compare packed signed 16-bit integers in a and b, and return packed minimum values.

__m128 _mm_min_ps(__m128 a, __m128 b)

Compare packed single-precision (32-bit) floating-point elements in a and b, and return packed minimum values.

__m64 _mm_min_pu8(__m64 a, __m64 b)

Compare packed unsigned 8-bit integers in a and b, and return packed minimum values.

__m128 _mm_min_ss(__m128 a, __m128 b)

Compare the lower single-precision (32-bit) floating-point elements in a and b, store the minimum value in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_move_ss(__m128 a, __m128 b)

Move the lower single-precision (32-bit) floating-point element from b to the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_movehl_ps(__m128 a, __m128 b)

Move the upper 2 single-precision (32-bit) floating-point elements from b to the lower 2 elements of result, and copy the upper 2 elements from a to the upper 2 elements of result.
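One common use of _mm_movehl_ps is folding the upper half of a vector onto its lower half, for example to sum all four lanes. The sketch below is one possible way to do this using only functions listed on this page; the module name inteli.xmmintrin and the helper name horizontalSum are assumptions.

```d
import inteli.xmmintrin; // assumed module name for the functions on this page

/// Hypothetical helper: returns x[0] + x[1] + x[2] + x[3].
float horizontalSum(float[4] x)
{
    __m128 v = _mm_loadu_ps(x.ptr);       // [x0, x1, x2, x3]

    // Fold the upper half onto the lower half.
    __m128 hi = _mm_movehl_ps(v, v);      // [x2, x3, x2, x3]
    __m128 s  = _mm_add_ps(v, hi);        // [x0+x2, x1+x3, ...]

    // Bring lane 1 of s down to lane 0.
    __m128 t  = _mm_unpacklo_ps(s, s);    // [s0, s0, s1, s1]
    __m128 u  = _mm_movehl_ps(t, t);      // [s1, s1, s1, s1]

    // Lane 0 now holds (x0+x2) + (x1+x3).
    return _mm_cvtss_f32(_mm_add_ss(s, u));
}
```

Note that the additions are grouped as (x0+x2) + (x1+x3), so the result may differ in the last bits from a left-to-right scalar sum.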

__m128 _mm_movelh_ps(__m128 a, __m128 b)

Move the lower 2 single-precision (32-bit) floating-point elements from b to the upper 2 elements of result, and copy the lower 2 elements from a to the lower 2 elements of result.

int _mm_movemask_pi8(__m64 a)

Create mask from the most significant bit of each 8-bit element in a.

int _mm_movemask_ps(__m128 a)

Set each bit of result based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in a.
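The packed comparison functions produce a per-lane mask (all bits set where the comparison holds, all bits clear otherwise), which _mm_movemask_ps can condense into an integer. A minimal sketch, assuming the module is importable as inteli.xmmintrin; the helper name lessThanMask is hypothetical.

```d
import inteli.xmmintrin; // assumed module name for the functions on this page

/// Hypothetical helper: bit i of the result is set when x[i] < y[i].
int lessThanMask(float[4] x, float[4] y)
{
    // Each lane of mask is all ones where x < y, all zeros otherwise.
    __m128 mask = _mm_cmplt_ps(_mm_loadu_ps(x.ptr), _mm_loadu_ps(y.ptr));

    // Collect the most significant bit of every lane into bits 0..3.
    return _mm_movemask_ps(mask);
}
```

For x = [1, 5, 3, 8] and y = [2, 4, 6, 7] the result is 0b0101, since only lanes 0 and 2 satisfy x < y. A NaN in either operand makes the less-than comparison false for that lane.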

__m128 _mm_mul_ps(__m128 a, __m128 b)

Multiply packed single-precision (32-bit) floating-point elements in a and b.

__m128 _mm_mul_ss(__m128 a, __m128 b)

Multiply the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.

__m64 _mm_mulhi_pu16(__m64 a, __m64 b)

Multiply the packed unsigned 16-bit integers in a and b, producing intermediate 32-bit integers, and return the high 16 bits of the intermediate integers.

__m128 _mm_or_ps(__m128 a, __m128 b)

Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b, and return the result.

void _mm_prefetch(const(void)* p)

Fetch the line of data from memory that contains address p to a location in the cache hierarchy specified by the locality hint, one of the _MM_HINT_xxx constants.

__m128 _mm_rcp_ps(__m128 a)

Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and return the results. The maximum relative error for this approximation is less than 1.5*2^-12.

__m128 _mm_rcp_ss(__m128 a)

Compute the approximate reciprocal of the lower single-precision (32-bit) floating-point element in a, store it in the lower element of the result, and copy the upper 3 packed elements from a to the upper elements of result. The maximum relative error for this approximation is less than 1.5*2^-12.

void* _mm_realloc(void* aligned, size_t size, size_t alignment)

Reallocate size bytes of memory, aligned to the alignment specified in alignment, and return a pointer to the newly allocated memory. Previous data is preserved if any.

void* _mm_realloc_discard(void* aligned, size_t size, size_t alignment)

Reallocate size bytes of memory, aligned to the alignment specified in alignment, and return a pointer to the newly allocated memory. Previous data is discarded.

__m128 _mm_rsqrt_ps(__m128 a)

Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a. The maximum relative error for this approximation is less than 1.5*2^-12.

__m128 _mm_rsqrt_ss(__m128 a)

Compute the approximate reciprocal square root of the lower single-precision (32-bit) floating-point element in a, store the result in the lower element. Copy the upper 3 packed elements from a to the upper elements of result. The maximum relative error for this approximation is less than 1.5*2^-12.

__m64 _mm_sad_pu8(__m64 a, __m64 b)

Compute the absolute differences of packed unsigned 8-bit integers in a and b, then horizontally sum the 8 differences to produce one unsigned 16-bit integer, stored in the low 16 bits of result. The remaining bits of result are zero.

__m128 _mm_set1_ps(float a)

Broadcast single-precision (32-bit) floating-point value a to all elements.

__m128 _mm_set_ps(float e3, float e2, float e1, float e0)

Set packed single-precision (32-bit) floating-point elements with the supplied values.

__m128 _mm_set_ss(float a)

Copy single-precision (32-bit) floating-point element a to the lower element of result, and zero the upper 3 elements.

void _mm_setcsr(uint controlWord)

Set the MXCSR control and status register with the value in unsigned 32-bit integer controlWord.
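Since _mm_setcsr overwrites the whole register, changing a single field is usually done as a read-modify-write with the matching mask constant. The sketch below does this for the rounding field; the module name inteli.xmmintrin and the helper names setRoundingBits and currentRoundingBits are assumptions made for illustration.

```d
import inteli.xmmintrin; // assumed module name for the functions on this page

/// Hypothetical helper: sets the rounding field of MXCSR to mode
/// (one of the _MM_ROUND_xxx constants) and returns the previous
/// register value so the caller can restore it with _mm_setcsr.
uint setRoundingBits(uint mode)
{
    uint saved = _mm_getcsr();
    // Clear the rounding field, then insert the requested mode.
    _mm_setcsr((saved & ~_MM_ROUND_MASK) | mode);
    return saved;
}

/// Hypothetical helper: reads back only the rounding field.
uint currentRoundingBits()
{
    return _mm_getcsr() & _MM_ROUND_MASK;
}
```

A caller would typically keep the value returned by setRoundingBits and pass it back to _mm_setcsr once the rounding-sensitive work is done.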

__m128 _mm_setr_ps(float e3, float e2, float e1, float e0)

Set packed single-precision (32-bit) floating-point elements with the supplied values in reverse order.

__m128 _mm_setzero_ps()

Return vector of type __m128 with all elements set to zero.

void _mm_sfence()

Do a serializing operation on all store-to-memory instructions that were issued prior to this instruction. Guarantees that every store instruction that precedes, in program order, is globally visible before any store instruction which follows the fence in program order.

__m128 _mm_shuffle_ps(__m128 a, __m128 b)

Shuffle single-precision (32-bit) floating-point elements in a and b using the control in imm8. Warning: the immediate shuffle value imm8 is given at compile time instead of at runtime.
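Because the shuffle control is a compile-time value, it is not part of the runtime argument list shown above. The sketch below assumes it is passed as a template argument (written _mm_shuffle_ps!imm8(a, b)) and that the module is importable as inteli.xmmintrin; the helper name reversed is hypothetical.

```d
import inteli.xmmintrin; // assumed module name for the functions on this page

/// Hypothetical helper: returns the four lanes of x in reverse order.
float[4] reversed(float[4] x)
{
    __m128 v = _mm_loadu_ps(x.ptr);

    // 0x1B = 0b00_01_10_11. Reading two bits at a time from the low end:
    // lane 0 <- a[3], lane 1 <- a[2], lane 2 <- b[1], lane 3 <- b[0].
    // Assumption: the control is supplied as a template argument.
    __m128 r = _mm_shuffle_ps!0x1B(v, v);

    float[4] result;
    _mm_storeu_ps(result.ptr, r);
    return result;
}
```

The two low lanes of the result always come from the first operand and the two high lanes from the second, which is why the same vector is passed twice here.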

__m128 _mm_sqrt_ps(__m128 a)

Compute the square root of packed single-precision (32-bit) floating-point elements in a.

__m128 _mm_sqrt_ss(__m128 a)

Compute the square root of the lower single-precision (32-bit) floating-point element in a, store it in the lower element, and copy the upper 3 packed elements from a to the upper elements of result.

void _mm_store1_ps(float* mem_addr, __m128 a)

Store the lower single-precision (32-bit) floating-point element from a into 4 contiguous elements in memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.

void _mm_store_ps(float* mem_addr, __m128 a)

Store 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.

void _mm_store_ss(float* mem_addr, __m128 a)

Store the lower single-precision (32-bit) floating-point element from a into memory. mem_addr does not need to be aligned on any particular boundary.

void _mm_storeh_pi(__m64* p, __m128 a)

Store the upper 2 single-precision (32-bit) floating-point elements from a into memory.

void _mm_storel_pi(__m64* p, __m128 a)

Store the lower 2 single-precision (32-bit) floating-point elements from a into memory.

void _mm_storer_ps(float* mem_addr, __m128 a)

Store 4 single-precision (32-bit) floating-point elements from a into memory in reverse order. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.

void _mm_storeu_ps(float* mem_addr, __m128 a)

Store 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.

void _mm_stream_pi(__m64* mem_addr, __m64 a)

Store 64-bits of integer data from a into memory using a non-temporal memory hint. Note: non-temporal stores should be followed by _mm_sfence() for reader threads.

void _mm_stream_ps(float* mem_addr, __m128 a)

Store 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated. Note: non-temporal stores should be followed by _mm_sfence() for reader threads.

__m128 _mm_sub_ps(__m128 a, __m128 b)

Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a.

__m128 _mm_sub_ss(__m128 a, __m128 b)

Subtract the lower single-precision (32-bit) floating-point element in b from the lower single-precision (32-bit) floating-point element in a, store the subtraction result in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.

__m128 _mm_undefined_ps()

Return vector of type __m128 with undefined elements.

__m128 _mm_unpackhi_ps(__m128 a, __m128 b)

Unpack and interleave single-precision (32-bit) floating-point elements from the high half of a and b.

__m128 _mm_unpacklo_ps(__m128 a, __m128 b)

Unpack and interleave single-precision (32-bit) floating-point elements from the low half of a and b.

__m128 _mm_xor_ps(__m128 a, __m128 b)

Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b.

Manifest constants

enum _MM_HINT_NTA;
enum _MM_HINT_T0;
enum _MM_HINT_T1;
enum _MM_HINT_T2;



MXCSR Exception states.


MXCSR Exception states.

enum int _MM_EXCEPT_MASK;

MXCSR Exception states mask.


MXCSR Exception states.


MXCSR Denormal flush to zero mask.

enum int _MM_FLUSH_ZERO_OFF;

MXCSR Denormal flush to zero modes.

enum int _MM_FLUSH_ZERO_ON;

MXCSR Denormal flush to zero modes.

enum int _MM_MASK_DENORM;
enum int _MM_MASK_DIV_ZERO;
enum int _MM_MASK_INEXACT;

MXCSR Exception masks.

enum int _MM_MASK_INVALID;

MXCSR Exception masks.

enum int _MM_MASK_MASK;

MXCSR Exception masks mask.


MXCSR Exception masks.

enum int _MM_ROUND_DOWN;

MXCSR Rounding mode.

enum int _MM_ROUND_MASK;

MXCSR Rounding mode mask.

enum int _MM_ROUND_UP;

MXCSR Rounding mode.
