# Notes on how the SSE4.2 string search functions work
The most important thing is the immediate control value, which determines how the rest of the arguments are used. It is built by OR-ing together the flags described below.
## String Data Type (pick only 1)
This determines what data type a "char" is for this operation:
* `_SIDD_UBYTE_OPS`: unsigned 8-bit characters (`u8`)
* `_SIDD_UWORD_OPS`: unsigned 16-bit characters (`u16`)
* `_SIDD_SBYTE_OPS`: signed 8-bit characters (`i8`)
* `_SIDD_SWORD_OPS`: signed 16-bit characters (`i16`)
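A rough sketch of what the data type choice changes, using the `std::arch::x86_64` intrinsics directly. The helper names (`find_any_u16`, `first_in_range_unsigned`, `first_in_range_signed`) are made up for illustration. Two things are assumed to be worth showing: 16-bit chars give 8 lanes instead of 16 (so "not found" is 8), and signedness only affects ordered comparisons such as `_SIDD_CMP_RANGES`.

```rust
use std::arch::x86_64::*;

/// Index of the first 16-bit unit in `text` that equals any unit in `set`.
/// Returns 8 (the lane count for 16-bit chars) when nothing matches.
pub fn find_any_u16(set: &[u16; 8], text: &[u16; 8]) -> i32 {
    assert!(is_x86_feature_detected!("sse4.2"), "SSE4.2 is required");
    const MODE: i32 = _SIDD_UWORD_OPS | _SIDD_CMP_EQUAL_ANY;
    // SAFETY: SSE4.2 was checked above; each unaligned load reads exactly 16 bytes.
    unsafe {
        let a = _mm_loadu_si128(set.as_ptr() as *const __m128i);
        let b = _mm_loadu_si128(text.as_ptr() as *const __m128i);
        _mm_cmpistri::<MODE>(a, b)
    }
}

/// Index of the first byte of `text` inside the single range `low..=high`,
/// or 16 if there is none. `low` must be non-zero, because a zero byte would
/// terminate the (implicit length) range list.
fn first_in_range<const MODE: i32>(low: u8, high: u8, text: &[u8; 16]) -> i32 {
    assert!(is_x86_feature_detected!("sse4.2"), "SSE4.2 is required");
    let mut ranges = [0u8; 16];
    ranges[0] = low;
    ranges[1] = high;
    // SAFETY: SSE4.2 was checked above; each unaligned load reads exactly 16 bytes.
    unsafe {
        let a = _mm_loadu_si128(ranges.as_ptr() as *const __m128i);
        let b = _mm_loadu_si128(text.as_ptr() as *const __m128i);
        _mm_cmpistri::<MODE>(a, b)
    }
}

/// Bounds and text are compared as `u8`.
pub fn first_in_range_unsigned(low: u8, high: u8, text: &[u8; 16]) -> i32 {
    first_in_range::<{ _SIDD_UBYTE_OPS | _SIDD_CMP_RANGES }>(low, high, text)
}

/// Bounds and text are compared as `i8`, so 0xFF means -1.
pub fn first_in_range_signed(low: u8, high: u8, text: &[u8; 16]) -> i32 {
    first_in_range::<{ _SIDD_SBYTE_OPS | _SIDD_CMP_RANGES }>(low, high, text)
}
```

With the range `b'a'..=0xFF`, the unsigned version matches a `0xC3` byte, while the signed version sees the range as `97..=-1`, which is empty.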
## Comparison Operation (pick only 1)
Remember that a "char" here is the data type selected above.
* `_SIDD_CMP_EQUAL_ANY`: are any chars in `arg2` present anywhere in `arg1`?
  * eg: `[b'a', ...]` searches for an 'a' anywhere in the string.
* `_SIDD_CMP_RANGES`: treats `arg2` as pairs of lower and upper bounds (both inclusive) that designate ranges to check for.
  * eg: `[b'a', b'z', ...]` checks `arg1` for chars in the `b'a'` to `b'z'` range.
* `_SIDD_CMP_EQUAL_EACH`: is each char in `arg2` the same as the char at the same position in `arg1`?
  * eg: `[b'a', b'b', ...]` only matches "ab...". This is the "normal" style of string comparison.
* `_SIDD_CMP_EQUAL_ORDERED`: is `arg2` contained within `arg1`?
  * eg: "bc" is a substring of "abcd".
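The four comparison operations can be sketched with `_mm_cmpistri` from `std::arch::x86_64`. This is only an illustration: the helpers (`load`, `index_of`, `find_any`, `find_in_ranges`, `first_mismatch`, `find_substring`) are hypothetical names, and inputs are assumed to be at most 16 bytes with no interior zero bytes. In the raw intrinsic the pattern (set, ranges, or needle) is the first operand `a` and the text being searched is the second operand `b`; the returned index counts into `b`, and 16 means "nothing found". The `EQUAL_EACH` case is paired with `_SIDD_NEGATIVE_POLARITY` so that the index is the first mismatch.

```rust
use std::arch::x86_64::*;

/// Load up to 16 bytes into a register, zero-padding the rest so that short
/// inputs are null-terminated. Panics if `s` is longer than 16 bytes.
fn load(s: &[u8]) -> __m128i {
    let mut lanes = [0u8; 16];
    lanes[..s.len()].copy_from_slice(s);
    // SAFETY: `lanes` is exactly 16 bytes and the load has no alignment requirement.
    unsafe { _mm_loadu_si128(lanes.as_ptr() as *const __m128i) }
}

/// Run `_mm_cmpistri` with control value `MODE` and return the generated index.
fn index_of<const MODE: i32>(pattern: &[u8], text: &[u8]) -> i32 {
    assert!(is_x86_feature_detected!("sse4.2"), "SSE4.2 is required");
    // SAFETY: SSE4.2 support was checked above.
    unsafe { _mm_cmpistri::<MODE>(load(pattern), load(text)) }
}

/// First position in `text` holding any byte from `set`.
pub fn find_any(set: &[u8], text: &[u8]) -> i32 {
    index_of::<{ _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY }>(set, text)
}

/// First position in `text` inside any `[low, high]` pair listed in `ranges`.
pub fn find_in_ranges(ranges: &[u8], text: &[u8]) -> i32 {
    index_of::<{ _SIDD_UBYTE_OPS | _SIDD_CMP_RANGES }>(ranges, text)
}

/// First position where `x` and `y` differ (16 if they are equal).
pub fn first_mismatch(x: &[u8], y: &[u8]) -> i32 {
    index_of::<{ _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH | _SIDD_NEGATIVE_POLARITY }>(x, y)
}

/// First position in `text` where `needle` starts.
pub fn find_substring(needle: &[u8], text: &[u8]) -> i32 {
    index_of::<{ _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ORDERED }>(needle, text)
}
```

For instance, `find_substring(b"bc", b"abcd")` gives 1 and `find_in_ranges(b"09", b"abc 42")` gives 4. Note that `EQUAL_ORDERED` also reports a needle that only begins at the very end of the 16 lanes, so a real search has to re-check such partial matches.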
## Operation Modifiers
You can negate the output, which effectively tests for the opposite thing:
* `_SIDD_NEGATIVE_POLARITY` will negate all results.
* `_SIDD_MASKED_NEGATIVE_POLARITY` will negate results "in the string" only, leaving positions past the end of the string untouched.
With operations that end in `i` you can pick the index you get:
* `_SIDD_LEAST_SIGNIFICANT`: index of the least significant set bit, ie the first match (this is the default)
* `_SIDD_MOST_SIGNIFICANT`: index of the most significant set bit, ie the last match
With operations that end in `m` you can pick the mask you get:
* `_SIDD_BIT_MASK`: return bit mask (this is default)
* `_SIDD_UNIT_MASK`: return byte/word mask (depending on char size)
(Both of these last two selections are `imm8[6]` in the pseudocode on the Intel Intrinsics Guide, so the index flag and the mask flag share the same bit.)
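Here is a small sketch of how the modifiers change the output, again using the raw `std::arch::x86_64` intrinsics. All helper names are invented for this example, and inputs are assumed to be at most 16 bytes with no interior zero bytes. The set is the intrinsic's first operand and the text is the second.

```rust
use std::arch::x86_64::*;

/// Load up to 16 bytes, zero-padded. Panics if `s` is longer than 16 bytes.
fn load(s: &[u8]) -> __m128i {
    let mut lanes = [0u8; 16];
    lanes[..s.len()].copy_from_slice(s);
    // SAFETY: `lanes` is exactly 16 bytes and the load has no alignment requirement.
    unsafe { _mm_loadu_si128(lanes.as_ptr() as *const __m128i) }
}

/// `cmpistri`: the generated index.
fn index_of<const MODE: i32>(set: &[u8], text: &[u8]) -> i32 {
    assert!(is_x86_feature_detected!("sse4.2"), "SSE4.2 is required");
    // SAFETY: SSE4.2 support was checked above.
    unsafe { _mm_cmpistri::<MODE>(load(set), load(text)) }
}

/// `cmpistrm`: the generated mask, as all 16 bytes of the result register.
fn mask_of<const MODE: i32>(set: &[u8], text: &[u8]) -> [u8; 16] {
    assert!(is_x86_feature_detected!("sse4.2"), "SSE4.2 is required");
    let mut out = [0u8; 16];
    // SAFETY: SSE4.2 support was checked above; `out` is exactly 16 bytes.
    unsafe {
        let m = _mm_cmpistrm::<MODE>(load(set), load(text));
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, m);
    }
    out
}

/// Low 16 bits of a bit-mask result (one bit per byte lane).
fn low_bits(mask: [u8; 16]) -> u16 {
    u16::from_le_bytes([mask[0], mask[1]])
}

pub fn first_match(set: &[u8], text: &[u8]) -> i32 {
    index_of::<{ _SIDD_CMP_EQUAL_ANY | _SIDD_LEAST_SIGNIFICANT }>(set, text)
}

pub fn last_match(set: &[u8], text: &[u8]) -> i32 {
    index_of::<{ _SIDD_CMP_EQUAL_ANY | _SIDD_MOST_SIGNIFICANT }>(set, text)
}

/// One bit per char: bit `i` is set when `text[i]` is in `set`.
pub fn match_bits(set: &[u8], text: &[u8]) -> u16 {
    low_bits(mask_of::<{ _SIDD_CMP_EQUAL_ANY | _SIDD_BIT_MASK }>(set, text))
}

/// One byte per char: 0xFF where `text[i]` is in `set`, 0x00 elsewhere.
pub fn match_bytes(set: &[u8], text: &[u8]) -> [u8; 16] {
    mask_of::<{ _SIDD_CMP_EQUAL_ANY | _SIDD_UNIT_MASK }>(set, text)
}

/// Negates every lane, including the ones past the end of `text`.
pub fn non_match_bits(set: &[u8], text: &[u8]) -> u16 {
    low_bits(mask_of::<{ _SIDD_CMP_EQUAL_ANY | _SIDD_NEGATIVE_POLARITY }>(set, text))
}

/// Negates only the lanes that are inside `text`.
pub fn non_match_bits_in_string(set: &[u8], text: &[u8]) -> u16 {
    low_bits(mask_of::<{ _SIDD_CMP_EQUAL_ANY | _SIDD_MASKED_NEGATIVE_POLARITY }>(set, text))
}
```

Searching `b"hello"` for `b"l"` matches positions 2 and 3. The fully negated mask is `0xFFF3` (the 11 unused lanes flip to 1 as well), while the masked version is `0x13` (just `h`, `e`, and `o`).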
## The Actual Functions To Call
All the functions are named like `cmp?str?`:
* The first unknown is how the string length is known:
  * `i`: for "implicit string length" (null-terminated).
  * `e`: for "explicit string length" (an extra int arg passed after each register arg).
* The second unknown is one of [a,c,i,m,o,s,z]:
  * `a`: returns 1 if b did not contain a null character and the resulting mask was zero, and 0 otherwise.
    * aka, "no match here, and the string hasn't ended yet", so a chunk-by-chunk scan should move on to the next chunk.
  * `c`: returns 1 if the resulting mask was non-zero, and 0 otherwise.
    * aka, "Is there even a partial match?", even if it wasn't a full match.
  * `i`: store the generated index in dst.
    * The index of the match. This can be the first match or the last match, depending on if you pick least or most significant.
  * `m`: store the generated mask in dst.
    * Gives the mask of the matches themselves. This can be in bits (one bit per char) or in char units (`u8`/`u16` per char, depending on char size).
  * `o`: returns bit 0 of the resulting bit mask
    * (fairly useless)
  * `s`: returns 1 if any character in a was null, and 0 otherwise
    * For explicit length strings this tells you if the string length is less than the lane maximum (useless).
    * For implied length strings this lets you do strlen.
  * `z`: returns 1 if any character in b was null, and 0 otherwise
    * For explicit length strings this tells you if the string length is less than the lane maximum (useless).
    * For implied length strings this lets you do strlen.
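To make the naming scheme a bit more concrete, here is a hedged sketch using the Rust names for these functions (`_mm_cmpestri`, `_mm_cmpistrc`, `_mm_cmpistrz`, `_mm_cmpistra`, `_mm_cmpistrs` in `std::arch::x86_64`). The wrappers `find_byte_explicit` and `chunk_flags` and the `ChunkFlags` struct are hypothetical, and inputs are assumed to be at most 16 bytes.

```rust
use std::arch::x86_64::*;

/// Load up to 16 bytes, zero-padded. Panics if `s` is longer than 16 bytes.
fn load(s: &[u8]) -> __m128i {
    let mut lanes = [0u8; 16];
    lanes[..s.len()].copy_from_slice(s);
    // SAFETY: `lanes` is exactly 16 bytes and the load has no alignment requirement.
    unsafe { _mm_loadu_si128(lanes.as_ptr() as *const __m128i) }
}

const MODE: i32 = _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY;

/// `e` form: lengths are passed explicitly, so a zero byte is ordinary data
/// and can even be the byte being searched for.
pub fn find_byte_explicit(needle: u8, text: &[u8]) -> i32 {
    assert!(is_x86_feature_detected!("sse4.2"), "SSE4.2 is required");
    // SAFETY: SSE4.2 support was checked above.
    unsafe { _mm_cmpestri::<MODE>(load(&[needle]), 1, load(text), text.len() as i32) }
}

/// The flag-style results for one implicit-length chunk.
#[derive(Debug, PartialEq)]
pub struct ChunkFlags {
    pub any_match: bool,     // `c`: the mask was non-zero
    pub text_ended: bool,    // `z`: `b` contained a null
    pub keep_scanning: bool, // `a`: no match and `b` had no null
    pub set_ended: bool,     // `s`: `a` contained a null
}

pub fn chunk_flags(set: &[u8], text: &[u8]) -> ChunkFlags {
    assert!(is_x86_feature_detected!("sse4.2"), "SSE4.2 is required");
    // SAFETY: SSE4.2 support was checked above.
    unsafe {
        let (a, b) = (load(set), load(text));
        ChunkFlags {
            any_match: _mm_cmpistrc::<MODE>(a, b) != 0,
            text_ended: _mm_cmpistrz::<MODE>(a, b) != 0,
            keep_scanning: _mm_cmpistra::<MODE>(a, b) != 0,
            set_ended: _mm_cmpistrs::<MODE>(a, b) != 0,
        }
    }
}
```

A full 16-byte chunk with no match is the only case here where `keep_scanning` comes back true, which is what makes the `a` form handy as a loop condition.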
## Example Tasks And How To Do Them
???
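One possible task is the strlen mentioned above. The sketch below is an assumption about how it could be done, not a tuned implementation: `strlen_sse42` is a made-up name, and it copies each 16-byte chunk into a padded buffer so it never reads past the slice. The trick is `_SIDD_CMP_EQUAL_EACH` against an all-zero (empty) first operand: a lane only counts as "equal" when both sides are past their end, so the first set bit is the first null in the chunk.

```rust
use std::arch::x86_64::*;

/// Position of the first zero byte in `buf`, or `buf.len()` if there is none.
pub fn strlen_sse42(buf: &[u8]) -> usize {
    assert!(is_x86_feature_detected!("sse4.2"), "SSE4.2 is required");
    const MODE: i32 = _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH;
    let mut offset = 0;
    for chunk in buf.chunks(16) {
        // Zero-pad a short final chunk so the load always covers 16 bytes.
        let mut lanes = [0u8; 16];
        lanes[..chunk.len()].copy_from_slice(chunk);
        // SAFETY: SSE4.2 was checked above; `lanes` is exactly 16 bytes.
        let idx = unsafe {
            let empty = _mm_setzero_si128();
            let text = _mm_loadu_si128(lanes.as_ptr() as *const __m128i);
            _mm_cmpistri::<MODE>(empty, text)
        } as usize;
        if idx < 16 {
            // Found a null (or the padding of a short final chunk).
            return offset + idx;
        }
        offset += 16;
    }
    buf.len()
}
```

For example, `strlen_sse42(b"hello\0world")` is 5, and a slice with no zero byte at all just gives back its own length.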