# Strings A string is just an array of chars with a null char (`\0`) as the last element! ```c char str[] = "Nice"; printf("%s", str); ``` ``` Nice ``` How does this look in memory? ```c {'N', 'i', 'c', 'e', '\0'} ``` --- All are valid definitions: ```c char str[5] = "Nice"; char str[5] = {'N', 'i', 'c', 'e', '\0'}; char str[5] = {78, 105, 99, 101, 0}; ``` --- ```c char str[4] = "Nice"; printf("%s", str); // buffer overflow! ``` If you are declaring the length, make sure it's large enough -- there's no room for the null byte here! --- ```c= char str[] = "the quick brown fox jumps"; str[9] = '\0'; printf("%s", str); ``` ``` the quick ``` <!-- .element: class="fragment" data-fragment-index="1" --> ```c printf("%s", str+4); ``` ``` quick ``` <!-- .element: class="fragment" data-fragment-index="1" --> --- ### Passing a String to a function There are a few ways to do it, but the best way is to do this ```c void function(char* name) ``` You may also need to pass in the length of the string ```c void function(char* name, size_t name_length) ``` --- From `string.h`, `stdlib.h` `str-` functions manipulate null-terminated strings. `strn-` functions manipulate sequences of non-null chars. (avoid buffer overflow) - `strlen` - gets length of a string, not including `\0` - $O(n)$ - `strcpy` - copies strings - `strcmp` - lexographic compare (is a > b?) - `strcat` - concatenate (add) two strings - `strchr` - find char `c` in string `s` - `atoi` - string to int - `atof` - string to double --- ## Escape characters ``` \n :: newline (“line feed”) \r :: carriage return \\ :: unescaped backslash \” :: double quote \’ :: single quote \ :: escaped newline \t :: tab character ``` On Windows, a newline is represented with `\r\n` or CRLF. On Unix, a newline is represented with `\n` or LF. --- # String literals Different to string variables. Stored in read-only memory (cannot be modified). Both strings here are string literals. ```c char *str = "Hello world"; ``` This is also a sritn literal: ```c printf("Hello world\n"); ``` --- # KMP Knuth–Morris–Pratt algorithm. An algorithm that searches for a substring in a string. --- ## Sequential Searching (naive) String: `a b a e f a` Pattern to match: ` a e` --- **whiteboard demo** --- This type of searching is... O(nm) where n is the length of the target string and m is the length of the pattern to match. We can do better with KMP. --- ## KMP Algorithm String: `a b c x a b c d a b x a b c d a b c y` Pattern to match: `a b c d a b c y` Example stolen from: https://www.youtube.com/watch?v=GTJr8OvyEVQ. Watch this in your own time, it is a very good video. --- How do we optimise checking for substrings? We can create a failure function array that tells us where to reset the comparision to --- ``` F = [a, b, c, d, a, b, c, y] ``` `F[0] = -1` `F[1] = 0` **whiteboard demo** --- ## BMH Algorithm Same as KMP just reversed! Use a different failure function - whiteboard demo --- See all the code / sudo code in the lectures