
[Toc]

# UNIX II

## tcsh commands and variables

A shell is a ++_command interpreter_++ and also a ++_programming language_++.

Some popular shells:
* sh: Bourne Shell
* ksh: Korn Shell
* csh, tcsh: C shell
* bash: Bourne-Again Shell
* zsh: Z shell

![image](https://hackmd.io/_uploads/r1hHw-kyR.png)

> usually, the '#' symbol starts a comment
> put '#!' on the __first__ line of a script to choose the shell; it must be the first two characters of the script.

![screenshot 2024-03-25 10.38.08 PM](https://hackmd.io/_uploads/H1sU3WyJA.png)

### variable (tcsh syntax)

* creating variables: a variable's data type is implicitly inferred from the data assigned to it; to give a variable a new data type, just reassign it.
* a variable starts with a ```$``` sign when it is ++used++
* a variable is written ++without++ ```$``` when it is ++assigned++
* declare a variable with ```set``` (undeclare with ```unset```)

```bash
set X = 1
set X = "T"
set X = $T
```

> ~~set X = 1 + $#~~ # illegal

* declare a variable with a ```@``` and a space (but only numbers or expressions can be used)

```bash
@ X = 1
@ X = $T # only legal if $T is a number
@ X = 1 + $#
```

> ~~@ X = "T"~~ # illegal

* creating an array: use parentheses, (), with elements separated by ++__spaces__++ (not commas)

```bash
% set var2 = (apple banana cherry)
%
% # if we do this instead, the comma becomes part of the element
% set var2 = (apple, banana, cherry)
% echo $var2[1]
apple,
%
```

![image](https://hackmd.io/_uploads/HJkCDPWk0.png)

* use ```[]``` to access elements; indices start from 1

```bash
% set var1=(Apple Banana Cherry Date)
% echo $var1[2]
Banana
% echo $var1[2-3]
Banana Cherry
% echo $var1[-3]
Apple Banana Cherry
% echo $var1[2-]
Banana Cherry Date
% echo $var1[-]
Apple Banana Cherry Date
% echo $var1[*]
Apple Banana Cherry Date
% echo $var1
Apple Banana Cherry Date
```

* use ```$#name``` to get the size of the array ```name```

```bash
% set var1 = (Apple Banana Cherry Date)
% echo $#var1
4
% expr $#var1 - 1
3
```

* use ```shift``` to remove the _first_ item

```bash
% set var1 = (Apple Banana Cherry Date)
% shift var1
% echo $var1
Banana Cherry Date
% set var2=($var1[-2])
% echo $var2
Banana Cherry
%
% # what if we don't know the size of the array?
% set var3 = (Apple Banana Cherry Date)
% set n_1 = `expr $#var3 - 1` # compute the value n-1
% set var3 = ($var3[-$n_1])
% echo $var3
Apple Banana Cherry
%
% # however, no `` and no arithmetic inside a [...]
% set var4 = (Apple Banana Cherry Date)
% set var4 = ($var4[-`expr $#var4 - 1`])
Syntax Error
% set var4 = ($var4[-($#var4 - 1)])
Syntax Error.
```

_*note: ```expr``` calculates the value of its arguments and is picky about spaces_

* built-in array ```argv[]```

![screenshot 2024-03-27 6.29.47 PM](https://hackmd.io/_uploads/BktmruWkC.png)
![screenshot 2024-03-27 6.29.31 PM](https://hackmd.io/_uploads/HkFMSOWyC.png)

* ```$?``` holds the exit status of the last command: __0__ means the previous command ++succeeded++; any __nonzero__ value means the command failed, and the value is its error code.

> The reason is that the causes of failure are more important than the causes of success, so it makes sense to use nonzero numbers as error codes that indicate the failure.
> different shells have different error codes
> e.g., the error code of "no match" is 1 in csh, 2 in bash

* summary of parameters
  * built-in shell variables: $PATH, $HOME, $SHELL, $prompt, $argv, etc
  * user-defined variables: $file1, $myvar, etc
  * positional parameters: $0, $1, ..., $9
    * ```$0``` name of the calling program
    * ```$1```~```$9``` command-line arguments

    > ```$10``` works correctly in csh, but not in bash
    > ![image](https://hackmd.io/_uploads/S1O8W0M1A.png)
  * special variables
    * ```$*``` lists all arguments (learned in UNIX I)
    * ```$#``` number of arguments
    * ```$#X``` size of X
    * ```$?``` exit status of the last command
    * ```$?X``` checks whether variable X exists; returns __1__ if it does
    ![image](https://hackmd.io/_uploads/HkqivTfyA.png)
    * ```$<``` _a word_ typed from the keyboard or redirected from a file
      * ```set X = $<``` (MAY not properly handle keyboard input)
      * ```set X = $<:q```
      * ```set X = "$<"```

### command

* if -then, else, endif

```if ( expression ) statement```

```bash
if ( expression ) then # the spaces are NECESSARY
    statements
else if (expression) then
    statements
else
    statements
endif
```

| symbol    | meaning                       |
| --------- | ----------------------------- |
| !         | negate                        |
| !=        | not equal                     |
| ==        | equal                         |
| >,<,>=,<= | relational                    |
| =~        | match to **wildcard pattern** |
| !~        | not match to wildcard pattern |

* csh conditional file tests ```if ( -f filename )```

| flag     | meaning                              |
| -------- | ------------------------------------ |
| ```-d``` | true if filename is a ++d++irectory  |
| ```-e``` | true if filename ++e++xists          |
| ```-f``` | true if filename is a plain file     |
| ```-o``` | true if you ++o++wn filename         |
| ```-r``` | true if filename is ++r++eadable     |
| ```-w``` | true if filename is ++w++ritable     |
| ```-x``` | true if filename is e++x++ecutable   |
| ```-z``` | true if filename is empty            |

> a tricky expression to test: suppose we want to write a script that accepts a '-r' option as an input argument.

```bash
if ($argv[1] == -r)
# when the first argument is -r, this expands to:
if (-r == -r)
# csh thinks you mean to use the file operator -r,
# so it tests the file named '==' to see if it is readable.
# the trailing -r then makes no sense and generates a syntax error.
# fix: put a "dummy" character before both strings:
if ( X$argv[1] == X-r ) echo "the -r flag was given."
```

* switch -case, default, breaksw, endsw

```tcsh
switch ( string )
case pattern1: # each case must be written on its own line
    statements
    breaksw
case pattern2:
    statements
    breaksw
...
default:
    statements
    breaksw
endsw
```

* while -continue, break, end

```tcsh
while ( condition )
    commands
end
```

==EXAMPLE==

```tcsh
#!/usr/bin/csh
@ i = 0
while ($i < 3)
    echo -n $i
    @ i++
end
```

output

```
012
```

* foreach -continue, break, end

```csh
foreach var ( array variable or wordlist )
    commands
end
```

![screenshot 2024-03-29 1.37.00 AM](https://hackmd.io/_uploads/HkOaqm7kR.png)
![screenshot 2024-03-29 1.41.52 AM](https://hackmd.io/_uploads/rk6yhmQJC.png)

==EXAMPLE== a delete script

```tcsh
#!/usr/bin/csh
foreach name ($argv)
    if (-f $name) then
        echo -n "delete the file $name? (y/n/q)"
    else
        echo -n "delete the entire directory"\
                "$name? (y/n/q)"
    endif
    set ans = "$<"
    switch ($ans)
    case n:
        continue   # skip this name, go on to the next one
    case q:
        exit       # quit the whole script
    case y:
        rm -rf $name
    endsw
end
```

## shell quoting rules

| symbol  | meaning                           |
| ------- | --------------------------------- |
| ```"``` | weak quotes                       |
| ```'``` | strong quotes                     |
| ```\``` | strongly / single character quote |

_*note: the character quoted by ```\``` may be the end-of-line character_

* ```set verbose``` will echo every line of your script ++before++ the variables have been evaluated
* ```set echo``` will display each line after the variables and the meta-characters have ++been substituted++

> to turn these variables off again, use ```unset```

![screenshot 2024-03-29 2.11.41 AM](https://hackmd.io/_uploads/ByKJ7V71C.png)

* what is the output of the following in tcsh?

```tcsh
% echo \
?
% echo \\
\
% echo \\\
?
% echo \\\\                           # echo receives '\\'; thus, one plain '\'
\
% echo \\\\\\\\                       # echo receives '\\\\'; thus, two plain '\'
\\
% echo \\\\\\\\\\\\\\\\               # echo receives '\\\\\\\\'; thus, four plain '\'
\\\\
% echo \\\\\\\\\\\\\\\\ | xargs echo  # xargs receives and passes to echo, as-is, '\\\\'
\\
% echo '\\\\\\\\' | xargs echo        # xargs receives and passes to echo, as-is, '\\\\'
\\
```

![screenshot 2024-03-29 2.38.53 AM](https://hackmd.io/_uploads/B1iyqVmJR.png)

## grep

* ```grep``` searches for _regular-expression_ patterns ( _get regular expression and print_ )

> 1. to make them easier to write
> 2. to allow a choice of a pattern (OR expression)
> 3. to specify patterns that cannot be represented by a nondeterministic finite state automaton (NDFA)
> Regular expressions are a simple case of context free grammars. Although context free grammars are important in computer science, they **aren’t that useful for UNIX programming**
> 4. we will need **egrep**, a search program using **extended regular expressions**

* ```fgrep``` searches for strings (doesn't use regular expressions)

> 1. limitations of fgrep: cannot use it to get an approximate match
> 2. sometimes we are not sure about the string we want
> for example, you might know only that the word you are seeking begins with z and ends with -ic, and has the sequence gm in it somewhere.
> 3. we have to use **grep**, a search program for ++regular expressions++

* ```egrep``` uses an alternative pattern description system (extended regular expressions)

> 1. to make regular expressions easier to write
> 2. to allow a choice of patterns

* important flags

| flag          | meaning                                                              |
| ------------- | -------------------------------------------------------------------- |
| ```-i```      | Not case sensitive (i.e., **i**gnore case)                           |
| ```-n```      | Display line **n**umbers (with a colon after each)                   |
| ```-v```      | In**v**ert the matches (i.e., print if not match)                    |
| ```-w```      | **W**hole word matches                                               |
| ```-o```      | **O**nly display the match, not the entire line containing it        |
| ```-e```      | After this flag goes an **e**xpression to match (several conditions) |
| ```-A```      | Set the # of lines of context to print **a**fter each match          |
| ```-B```      | Set the # of lines of context to print **b**efore each match         |
| ```-C```      | Set the # of lines of **c**ontext to print before and after          |
| ```--color``` | Highlight the matching pattern within its line of text               |

==EXAMPLE== of ```fgrep```

![screenshot 2024-03-29 3.23.55 AM](https://hackmd.io/_uploads/r1UCQrmyA.png)
![screenshot 2024-03-29 3.24.46 AM](https://hackmd.io/_uploads/HJibVSQyA.png)
![screenshot 2024-03-29 3.27.08 AM](https://hackmd.io/_uploads/r1v9EH710.png)

## regular expression (grep)

* ```^``` requires the expression to match the ++front++ of a line (as the first symbol of a regular expression)
  e.g., line begins with 'A': ```^A```
* ```$``` requires the expression to match the ++end++ of a line (as the last symbol of a regular expression)
  e.g., line ends with 'Z': ```Z$```
* ```\``` turns off the special meaning of the next character
  e.g., to match a literal '$': ```\$```
* ```[]``` matches any one of the enclosed characters
  e.g., to match all digits: ```[0-9]```
  e.g., not a letter: ```[^a-zA-Z]``` (```^``` negates when it is the first symbol in [])
* ```.``` matches any 1 character
  e.g., a 1-character line: ```^.$```
* ```*``` matches zero or more of the preceding character or expression
  e.g., a line that begins with 'A' and ends with 'Z': ```^A.*Z$```

---

Recall: wildcard symbols in csh

> ```\``` and ```[]``` mean the same in csh and grep, and the csh ```?``` is the same as the grep ```.```
> ```*``` is different in csh and grep
> however, [] has some differences between csh and grep
> 1. whether to treat an unfinished [...] as an error
> 2. whether to treat '\' as a special character when inside of a [...]

* how do grep & csh treat a '[' without ']'?

![image](https://hackmd.io/_uploads/BkxJlLQ1A.png)

* what if we want a ']' in the set? then it must go first, or right after the ```^```

![image](https://hackmd.io/_uploads/SypiG8XkR.png)
![image](https://hackmd.io/_uploads/rJBmI87JC.png)

* what will this do? ```% grep 'ab*c' ab*c```
  it searches for the pattern _a_, any number of _b_'s (including zero), then _c_, such as ac, abc, abbc, abbbc, inside every file whose name matches the csh wildcard ab*c (any number of characters, including zero, in place of the ```*```), such as abc, abxc, abbdc. hence, the word __st++ac++k__ will match if it is found in a file named __++ab++cdefghijk++c++__

---

more regular expression syntax

* ```\{x\}``` Matches the preceding regular expression only if it is repeated exactly x times
* ```\{x,y\}``` Matches the preceding regular expression only if the number of repetitions is in the range of x to y
* ```\{,x\}``` Matches the preceding regular expression only if the number of repetitions is $\le$ x
* ```\{x,\}``` Matches the preceding regular expression only if the number of repetitions is $\ge$ x

![image](https://hackmd.io/_uploads/HyZzRKrkC.png)

_*note: the expression always takes the longest possible match, up to y repetitions.
In the example above, it did not take two matches of size 2; instead, it took one match of size 3._

* ```\>``` The preceding expression must end at the end of a ++word++

![image](https://hackmd.io/_uploads/HkGCvqSyA.png)

* ```\<``` The expression that follows must begin a word

![image](https://hackmd.io/_uploads/HkQMFqr1R.png)

* ```\(…\)``` Defines a group for a sub-portion of the regular expression. Groups extend the reach of the ```*``` and ```\{…\}``` operators.

![image](https://hackmd.io/_uploads/rkXN5qHkC.png)

Another reason for groups is to allow backreferences. Backreferencing is accomplished by subsequently using:

* ```\1, \2...``` to let you identify a rematch to the earlier pattern.

![image](https://hackmd.io/_uploads/HJSTJiH1A.png)

e.g., suppose that you wanted to find any double-repeated letters, such as in “banana” and “nonogram”. Then your regular expression is: ```\([a-z]\)\([a-z]\)\1\2```
(“banana” is a double match, because it contains both _anan_ and _nana_.)

---

**POSIX built-in patterns**

| character group | meaning                          |
| --------------- | -------------------------------- |
| [:alnum:]       | alphanumeric                     |
| [:cntrl:]       | control character                |
| [:lower:]       | lower case character             |
| [:space:]       | whitespace                       |
| [:alpha:]       | alphabetic                       |
| [:digit:]       | digit                            |
| [:print:]       | printable character              |
| [:upper:]       | upper case character             |
| [:blank:]       | space or tab                     |
| [:graph:]       | printable and visible characters |
| [:punct:]       | punctuation                      |
| [:xdigit:]      | hexadecimal digit                |

e.g., [:alnum:] == [a-zA-Z0-9], [:lower:] == [a-z]

## regular expression (egrep)

* ```^``` / ```$``` / ```\``` / ```[]``` / ```.``` / ```*``` are the same as in grep
* ```?``` makes the preceding expression optional. It is equivalent to the regular expression syntax: ```\{0,1\}```
* ```+``` requires the preceding expression to occur at least once. It is equivalent to the regular expression syntax: ```\{1,\}```
* ```|``` the OR operation. To search for one of 2 different words, you would say ```word1|word2```
* ```()``` can be used to change the associativity of the OR operator. hence, ```w(x|y)z``` matches exactly these 2 strings: ```wxz``` or ```wyz```
  Also, the () operator can extend the range of ```*```, ```+```, and ```?```
* comparison

| syntax                         | meaning                                                   |
| ------------------------------ | --------------------------------------------------------- |
| ```grep 'abc\|edf'```          | lines containing 'abc\|edf'                               |
| ```grep '(a$)\|(b(c\|d)e)'```  | lines containing '(a$)\|(b(c\|d)e)'                       |
| ```grep 'ab+c'```              | lines containing 'ab+c'                                   |
| ```egrep 'abc\|edf'```         | lines containing _abc_ or _edf_                           |
| ```egrep '(a$)\|(b(c\|d)e)'``` | lines ending in _a_ or containing either _bce_ or _bde_   |
| ```egrep 'ab+c'```             | lines containing _abc_, _abbc_, or _abbbbc_, etc          |
|                                |                                                           |
| ```grep '\([ab]\)\1'```        | lines containing _aa_ or _bb_                             |
| ```grep 'a\{2'```              | error, no closing '}'                                     |
| ```grep '\<a'```               | lines containing a word that begins with _a_              |
| ```egrep '\([ab]\)\1'```       | lines containing _(a)1_ or _(b)1_                         |
| ```egrep 'a\{2'```             | lines containing _a{2_                                    |
| ```egrep '\<a'```              | lines containing _'<a'_                                   |

![screenshot 2024-04-08 5.16.06 AM](https://hackmd.io/_uploads/Bk_LTYgl0.png)

in ```grep```, use ```\?```, ```\+``` and ```\|``` instead

---

:::info
more information
![image](https://hackmd.io/_uploads/H1vhN9xlC.png)
:::
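
The grep versus egrep difference for ```|```, ```+``` and backreferences can be checked with a short script. This is only a sketch: it is written for a Bourne-style shell (sh/bash) rather than tcsh, it uses ```grep -E``` as the equivalent of ```egrep```, and the sample lines and variable names (```sample```, ```bre_or```, ```ere_or```, and so on) are made up for illustration.

```bash
#!/bin/sh
# Hypothetical sample input: one candidate string per line.
sample=$(printf '%s\n' 'abc' 'edf' 'abc|edf' 'ab+c' 'abbc')

# Basic regular expressions (grep): '|' and '+' are ordinary characters,
# so only the lines containing them literally are selected.
bre_or=$(printf '%s\n' "$sample" | grep 'abc|edf')
bre_plus=$(printf '%s\n' "$sample" | grep 'ab+c')

# Extended regular expressions (grep -E): '|' is OR, '+' is "one or more".
ere_or=$(printf '%s\n' "$sample" | grep -E 'abc|edf')
ere_plus=$(printf '%s\n' "$sample" | grep -E 'ab+c')

# Basic-regex backreference: \1 must repeat exactly what the group matched,
# so a doubled letter is required.
bre_backref=$(printf '%s\n' 'aa' 'ab' 'bb' | grep '\([ab]\)\1')

# Print each result on one line (newlines shown as spaces).
for name in bre_or ere_or bre_plus ere_plus bre_backref; do
    eval "value=\$$name"
    printf '%-12s %s\n' "$name:" "$(printf '%s' "$value" | tr '\n' ' ')"
done
```

Running each pattern over the same input makes it easy to see which characters are treated as operators by which tool.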