Shuffle the single-precision (32-bit) floating-point elements of a within each 128-bit lane, using the control in imm8.
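To make the role of imm8 concrete, here is a minimal scalar sketch of the in-lane shuffle. It assumes the single-source form of the operation, in which destination element i of a lane is chosen by bits [2i+1:2i] of imm8 and read from the same lane of a. The function name permute_ps_model and its array-based signature are hypothetical and only model the semantics; they are not part of any intrinsic API.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of an in-lane shuffle (hypothetical helper, not an intrinsic).
 * a    : source elements, n floats (n = 8 models a 256-bit vector)
 * dst  : destination, n floats, must not alias a
 * imm8 : four 2-bit selectors, reused for every 128-bit lane */
void permute_ps_model(const float *a, float *dst, size_t n, unsigned imm8)
{
    assert(n % 4 == 0); /* only whole 128-bit lanes (4 floats each) */

    for (size_t lane = 0; lane < n; lane += 4) {
        for (size_t i = 0; i < 4; ++i) {
            /* Selector for destination element i within the lane. */
            unsigned sel = (imm8 >> (2 * i)) & 3u;
            /* Elements never cross a lane boundary. */
            dst[lane + i] = a[lane + sel];
        }
    }
}
```

Under this model, imm8 = 0xE4 (binary 11 10 01 00) leaves every lane unchanged, and imm8 = 0x1B (binary 00 01 10 11) reverses the four elements inside each lane without moving anything between lanes.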