Introduction to Linux

- Website: https://hpc2n.github.io/intro-linux/
- Github: https://github.com/hpc2n/intro-linux
- Information: https://umeauniversity.sharepoint.com/:w:/s/HPC2N630/Ed16r0fqgGpLm6T2u-bl5yIBZjmV_GvBy-EOsg9fcWK-UQ
- Question: https://umeauniversity.sharepoint.com/:w:/s/HPC2N630/EQUtySAkcGREswBAeZiGJgUBilEListST0Oftyf2FqSnSg
- https://www.youtube.com/user/HPC2N
- https://www.hpc2n.umu.se/events/courses/2024/fall/intro-linux

# 1. Linux terminology

## 1.1 Abstract stuff

**(1) Linux**: Linux is a family of open-source Unix-like **operating systems** based on the Linux kernel.

An **operating system** is the software that sits underneath all of the other software on a computer, **managing the computer’s hardware (CPU, GPU, memory, storage…) and taking care of the connections between your other software and the hardware**.

**Kernel**: A kernel manages your system’s hardware, as well as all the programs on your computer.

**(2) Linux distribution**

Linux is typically packaged as a **Linux distribution**, which includes **the kernel and supporting system software and libraries**, many of which are provided by the GNU Project. Any operating system that uses the Linux kernel is called a Linux distribution. Examples of popular Linux distros are: Ubuntu, Debian, Fedora, CentOS, Slackware, Gentoo, Arch, Mint, and many others.

**(3) Desktop environment**

To make it easier for users to work on their computer, many operating systems have a desktop that offers a **graphical interface** to manage the system. Windows and macOS are common examples.
Linux also has (optional) desktops.

**(4) Window managers**

- are programs controlling placement and movement of windows on your screen
- usually work together with desktop environments
- can also be used separately on your Linux machine
- are lightweight and can offer better performance than desktop environments
- do not ship with unnecessary apps and widgets (you even have to install a menu and a compositor if you go with just a window manager)
- can be complicated to set up for non-technical users, since you need to install everything extra yourself.

**(5) Desktop distribution**

Desktop distributions include a **desktop environment**, like GNOME, MATE, KDE Plasma, Xfce, Unity, or many others. A **window manager** together with applications written using a widget toolkit are generally responsible for most of what the user sees.

![image](https://hackmd.io/_uploads/ByDcp4YaR.png)

## 1.2. command line interface (CLI), terminal

The terminal, console, or “**command line interface**” (often abbreviated “CLI”) is a text-based interface which is often the primary way to interact with Linux. It is a program that is used to control your operating system’s “**shell**”. A shell is a text-based or graphical interface through which the user interacts with the operating system.

## 1.3. Root, user

All Linux operating systems have a built-in system of user roles, where each user has a specific role, with varying levels of permissions. Some of the common roles are:

- **user**: Nothing more, nothing less. The user can normally do what they want in their own home directory and perhaps a few other directories where they have been given permission to work. A user cannot install anything to the system outside of these directories where they have permission, and also cannot change most setup files (other than those affecting only themselves).
- **root**: The root user or root account has administrative privileges: complete access to all configurations, commands, and files in the system.
  Other words for root are superuser or administrator, though root is the most common term on Linux.

“root” refers to both the root directory and the root user. The root directory is the top-level directory containing all files and folders of the system. It is designated with a “/”, and this is how you denote it in your commands.

## 1.4. package manager

On Linux, the programs you install are often referred to as “**packages**”. Commonly, they are installed on the command line. A “**package manager**” is a tool, run from the command line or through a graphical front end, that helps you find new packages, install, update, and sometimes even configure them.

On Linux, most **apps** are distributed as packages and are available in the official repositories of your distribution. You can also add third-party repositories with a package manager if you want access to even more packages.

There are several different package managers available for the different Linux distros. These are some of the popular ones:

- APT: used by Debian and Ubuntu-based distributions.
- RPM: used by Fedora, CentOS, and RHEL.
- pacman: used by Arch Linux and its derivatives.
- yum/dnf: used by Red Hat-based distributions.

## 1.5. Source and binary packages

* Source packages: these contain the source code of a program. A user must manually compile and install it to run the software.
* Binary packages: these contain prebuilt, pre-compiled executables for the software.
* Repositories: collections of packages and their metadata, hosted on remote servers.

## 1.6. other terminologies

https://hpc2n.github.io/intro-linux/linux-terms/

* process: A process is a running instance of a program. Each program you start on your system runs as one or more processes.
* shells: A shell is a command-line interpreter which provides an interface for users to interact with the operating system.
* shell script: A shell script is a computer program designed to be run by a Unix shell, a command-line interpreter. The Windows equivalent of shell scripts is “batch scripts”.
* file system: A file system provides a data storage service that allows applications to share mass storage.
* GNU: GNU is an extensive collection of free software, which can be used as an operating system or can be used in parts with other operating systems.
* Bootloader, grub: A bootloader is a program responsible for booting your computer. GRUB is one of the most used bootloaders.
    * When a computer is turned off, its software (including the operating system, application code, and data) remains stored on non-volatile memory. The computer normally does not have an operating system or its loader in random-access memory (RAM) when it is powered on. First the computer executes a small program (the boot loader) stored in read-only memory (ROM), along with some needed data, to initialize RAM and to access the non-volatile device or devices (storage such as an HDD) from which the operating system programs and data can be loaded into RAM.
* Encryption: Encryption scrambles data into a secure and unreadable form so it can only be accessed by authorized parties.
* IP address: An Internet Protocol address (IP address) is a unique numerical label that is assigned to every device that is connected to a network.
* Kernel panic: A kernel panic is a critical error condition in the Linux kernel. When this happens the kernel cannot continue operating safely.
* mount: Mounting means to attach a file system to a specific directory location in the Linux file hierarchy. This causes the contents of the file system to become accessible to the operating system and the users.
* TCP/IP: TCP/IP is the fundamental communication protocol suite used for network communication in Linux and the internet.
* UEFI: UEFI is a modern firmware interface.

# 2. command line

## 2.1 Basics

`ls` List files and directories
`touch MYFILE.txt` Create a file
`mkdir MYDIR` Create a directory

## 2.2 ls

`ls [flags] [directory]`

`ls /` lists contents of the root directory
`ls ..` lists the contents of the parent directory of the current directory
`ls ~` lists the contents of your user home directory
`ls *` lists the contents of the current directory and of its immediate subdirectories
`-d */` lists only directories
`-a` lists content including hidden files and directories
`-l` lists content in long table format (permissions, owners, size in bytes, modification date/time, file/directory name)
`-lh` same as `-l`, but with sizes shown in human-readable units (K, M, G)
`-t` lists content sorted by last modified date in descending order (newest first)
`-tr` lists content sorted by last modified date in ascending order (oldest first)
`-R` lists contents recursively in all subdirectories
`-s` lists files with their allocated sizes (in blocks)
`-S` sorts files/directories by size in descending order
`-Sr` sorts files/directories by size in ascending order

Use `ls --help` or `man ls` in the terminal to see the manual.

## 2.3 chmod

### permission groups

* owner: these permissions apply only to the owner of the file and do not affect other users.
* group: you can assign a group of users specific permissions, which will only impact users within the group. The members of your storage directory belong here.
* all users: these permissions will apply to all users, so be careful with this.

### different kinds of file permissions

* Read (r): This allows a user or a group to view a file (and so also to copy it).
* Write (w): This permits the user to write or modify a file or directory.
* Execute (x): A user or a group with execute permissions can execute a file. On a directory, execute permission lets them enter it and access its contents.
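As a minimal sketch of how the three permission kinds show up in practice, the script below creates a scratch file, restricts it to owner read/write, and prints the permission string that `ls -l` reports. The file name `demo.txt` and the temporary directory are made up for illustration, and the sketch assumes a Linux system where `mktemp` is available.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
touch "$workdir/demo.txt"          # demo.txt is a hypothetical file name

# Owner may read and write; group and others get no permissions at all.
chmod u=rw,go= "$workdir/demo.txt"

# The first 10 characters of an `ls -l` line are the file type followed by
# the rwx triplets for owner, group, and others.
perms=$(ls -l "$workdir/demo.txt" | cut -c1-10)
echo "$perms"                      # prints -rw-------

# Adding execute permission for the owner turns the fourth character into x.
chmod u+x "$workdir/demo.txt"
perms_exec=$(ls -l "$workdir/demo.txt" | cut -c1-10)
echo "$perms_exec"                 # prints -rwx------

# Clean up the scratch directory.
rm -r "$workdir"
```

Reading the string left to right: one character for the file type, then three characters each for owner, group, and others.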
### file type

- `-` is a file
- `d` is a directory
- `l` is a link

### change permissions

**Owner**
`chmod +rwx FILE/DIR` to add all permissions to a file with name FILE or a directory with name DIR
`chmod -rwx FILE/DIR` to remove all permissions from a file with name FILE or a directory with name DIR
`chmod +x FILE` to add executable permissions
`chmod -wx FILE` to remove write and executable permissions
Note: without a leading `u`, `g`, `o`, or `a`, `chmod` applies the change to owner, group, and others, limited by your umask. To change only the owner’s permissions, write `u` first, e.g. `chmod u+x FILE`.

**group**
`chmod g-rwx FILE` to remove all group permissions from FILE
`chmod g+wx FILE` to give the group write and execute permissions to FILE
`chmod g-x FILE` to remove the group’s execute permission from FILE

**others**
`chmod o+rwx FILE` to add all permissions to FILE
`chmod o-rwx FILE` to remove all permissions from FILE
`chmod o+w FILE` to add write permissions to FILE
`chmod o-rwx DIR` to remove all permissions from DIR

**all**
`chmod ugo+rwx FILE/DIR` to add all permissions for all users (owner, group, others) to the file named FILE or the directory named DIR
`chmod a=rwx FILE/DIR` same as above
`chmod a=r DIR` give read permission, and only read permission, to all for DIR

## 2.4 chown - change ownership

`chown USERNAME FILE` the user with USERNAME becomes the new owner of FILE
`chown USERNAME DIRECTORY` the user with USERNAME becomes the new owner of DIRECTORY (but not of its contents or any subdirectories)
`chown USERNAME:folk DIRECTORY` the user ownership is changed to USERNAME and the group ownership to group “folk” for the directory DIRECTORY
`chown :folk DIRECTORY` the group ownership is changed to the group “folk” for the directory DIRECTORY
`chown -R USERNAME:folk DIRECTORY` the user ownership is changed to USERNAME and the group ownership is changed to group “folk” for the directory DIRECTORY and all subdirectories

## 2.5 create and remove directory

`mkdir DIR`: Create a directory DIR
`rm file.txt`: Remove the file file.txt
`rm -rf DIR`: Remove a directory DIR. The flag “`-r`” means recursively and “`-f`” means do so without asking for each file and subdirectory. Useful, but dangerous. Be careful!
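The following is a small sketch of creating and removing directories inside a scratch area, so that nothing important can be deleted by accident. The names under the scratch directory are made up, and `rmdir` (a standard command that is not otherwise covered in these notes) is used here only to show a more cautious alternative to `rm -rf`.

```shell
#!/bin/sh
# mktemp -d creates a new, empty directory and prints its name.
scratch=$(mktemp -d)

mkdir "$scratch/MYDIR"                 # create a directory
touch "$scratch/MYDIR/file.txt"        # put an empty file in it

# rmdir refuses to remove a directory that still has content,
# which makes it a safer first choice than rm -rf.
if rmdir "$scratch/MYDIR" 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi
echo "rmdir on a non-empty directory: $rmdir_result"

# rm -r removes the directory and everything below it.
rm -r "$scratch/MYDIR"
if [ -e "$scratch/MYDIR" ]; then
    after_rm="still there"
else
    after_rm="gone"
fi
echo "after rm -r: $after_rm"

# The scratch directory is empty again, so rmdir now succeeds.
rmdir "$scratch"
```

Quoting the paths (`"$scratch/MYDIR"`) is a deliberate habit: an unquoted, empty variable in front of `rm -rf` is a classic way to delete the wrong thing.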
`cd`: Go to your home directory ($HOME)
`cd DIR`: Change directory to DIR
`cd ..`: Change directory to the parent directory of the current directory
`touch FILE`: create an empty file with the name FILE

## 2.6 copy and rename directories

`cp myfile.txt DIR/`: copy the file “myfile.txt” to the directory DIR
`cp DIR1/ DIR2/`: without `-R`, `cp` skips directories, so this form does not copy DIR1; use the next form instead
`cp -R DIR1/ DIR2/`: copy the directory DIR1 and all subdirectories into the directory DIR2 (Note: overwrites existing files with same name)
`mv file1.txt file2.txt`: renames file1.txt to file2.txt
`mv DIR1/ DIR2/`: renames directory DIR1 to directory DIR2/ (if DIR2 already exists, DIR1 is moved into it instead)
`mv file1.txt DIR1/`: moves the file file1.txt into the directory DIR1/

## 2.7 Symbolic links

`ln -s real-file-or-lib link-name` create a symbolic link called link-name that points to real-file-or-lib
`ln -s /proj/intro-linux/users/MYUSERNAME $HOME/myproj` Example: make the project directory reachable as `myproj` in your home directory

## 2.8 redirection

`>` redirects the output of some command
Example, output of “ls” to a file: `ls > test.dat`
`>>` appends the output of some command to the content of a file
Example, adds the output of ls **to the end of** a file “test.dat”: `ls >> test.dat`
`cat file1 >> file2` Append the contents of file1 to file2
`echo 'text to append_add_here' >> file2` Append some text to a file called file2
`printf "text to append\n" >> fileName` Another way to append some text to a file
`<` changes the standard input
`2>` redirects the standard error
Example, redirect the error that is thrown from your program named “myprogram” to a file “error.log”: `./myprogram 2> error.log`
`2>&1` redirects standard error to wherever standard output is currently going
Example, redirect output and errors from your program to the same file: `./myprogram > logfile 2>&1`

Open the file for writing: `cat > foo.txt`
Add some text:
```
This is a test.
I like the Unix operating systems.
The weather is nice today.
I am feeling sleepy.
```
To save the changes press `CTRL-d`, i.e. press and hold CTRL and press d.
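The redirection operators above can be tried out with a short, self-contained sketch. The function `report` is a hypothetical stand-in for a program such as “myprogram” that writes to both standard output and standard error, and all file names are made up; everything happens in a temporary directory.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1

echo "first line" > out.txt        # > creates or overwrites out.txt
echo "second line" >> out.txt      # >> appends to the end of out.txt
line_count=$(wc -l < out.txt)      # < makes out.txt the standard input of wc

# Hypothetical stand-in for a program: one line on standard output,
# one line on standard error (>&2 sends echo's output to standard error).
report() {
    echo "normal output"
    echo "error message" >&2
}

report > stdout.log 2> error.log   # the two streams go to separate files
report > both.log 2>&1             # both streams end up in the same file

stdout_text=$(cat stdout.log)
error_text=$(cat error.log)
both_lines=$(wc -l < both.log)
echo "out.txt has $((line_count)) lines, both.log has $((both_lines)) lines"

# Leave and remove the temporary directory.
cd / && rm -r "$tmp"
```

The order matters in the last redirection: `> both.log` is set up first, and `2>&1` then points standard error at the same place.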
## 2.9 pipes

`grep -o -i string file.txt | wc -l` Find the instances of the word ‘string’ in file.txt and count them
`grep string file.txt > file.out` Find the lines with instances of ‘string’ in file.txt and output them to file.out
`grep string file.txt >> file.out` Find the lines with instances of ‘string’ in file.txt and append them to file.out

## 2.10 exporting variables

### Environment variables

They store data that is used by the operating system and other programs. Some are intrinsic to the operating system, some are for a specific program/library/programming language, and some are created by the user.

The variables can be used both in scripts and on the command line. Usually you reference them by putting a special symbol in front of or around the variable name. By convention, environment variable names are in UPPER CASE.

`$HOME` Your home directory
`$PWD` This variable points to your current directory
`$LD_LIBRARY_PATH` a colon-separated list of directories that the dynamic linker should search for shared objects before searching in any other directories
`$OMP_NUM_THREADS` Number of OpenMP threads
`$PYTHONPATH` A list of additional directories that Python searches for modules and packages

### some commands with environment variables

`echo $ENVIRONMENT_VARIABLE` To see the content of an environment variable named ENVIRONMENT_VARIABLE
`env` You will get a long list of all environment variables currently set
`export VARIABLE=value` set the environment variable VARIABLE to value
`export OMP_NUM_THREADS=8` Setting the number of OpenMP threads to 8 in bash
`export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/your/custom/path/` Adding a new path to $LD_LIBRARY_PATH

### set up a reusable environment variable

The environment variable only retains the value you have set for the duration of the session. When you open a new terminal window or log in again, you need to set it again.
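The scoping rules can be seen in a few lines. This is only a sketch: `MY_SETTING` and `OTHER_SETTING` are made-up variable names, `sh -c '...'` starts a child process, and `${NAME:-unset}` prints the word `unset` when the variable has no value.

```shell
#!/bin/sh
# A plain assignment creates a shell variable that child processes do not see.
MY_SETTING=local
seen_before=$(sh -c 'echo "${MY_SETTING:-unset}"')
echo "child before export: $seen_before"     # prints: child before export: unset

# export marks the variable so that child processes inherit it.
export MY_SETTING
seen_after=$(sh -c 'echo "${MY_SETTING:-unset}"')
echo "child after export: $seen_after"       # prints: child after export: local

# The parentheses run the export in a subshell. When the subshell ends,
# the variable is gone, just as it is when a terminal session ends.
(export OTHER_SETTING=temporary)
seen_outside=${OTHER_SETTING:-unset}
echo "after the subshell: $seen_outside"     # prints: after the subshell: unset
```

The subshell in the last step plays the role of a terminal session: whatever was exported inside it is lost once it exits.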
To avoid that, add it to your `.bashrc` file, but only do so if it should truly be persistent across many sessions (like adding a new directory to search to LD_LIBRARY_PATH for instance).

`echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/your/custom/path/' >> ~/.bashrc` Quickly add a new directory to `LD_LIBRARY_PATH` in your `.bashrc`. The single quotes keep `$LD_LIBRARY_PATH` from being expanded before the line is written to the file.

# 3. Editors

These are all good editors for using on the command line:

* nano
* vi, vim
* emacs

## Using nano

* Starting “nano”: Type `nano FILENAME` on the command line and press Enter. FILENAME is whatever you want to call your file.
* If FILENAME is a file that already exists, nano will open the file. If it does not exist, it will be created.

![image](https://hackmd.io/_uploads/rJQGkwtTA.png)

* The ^ before the letter-commands means you should press CTRL and then the letter (while keeping CTRL down).
* When you want to exit (and possibly save), you press CTRL and then x while holding CTRL down (this is written CTRL-x or ^x). nano will ask you if you want to save the content of the buffer to the file. After that it will exit.

# 4. data handling

## 4.1 compressing and decompressing

`gzip FILE` Compressing a file with gzip
`gunzip FILE.gz` Decompressing a file with gzip

## 4.2 Archiving

```
tar [-options] <name of the tar archive> [files or directories which to add into archive]

Basic options:
-c, --create: create a new archive;
-a, --auto-compress: additionally compress the archive with a compressor which will be automatically determined by the file name extension of the archive. If the archive's name ends with *.tar.gz then use gzip, if *.tar.xz then use xz, *.tar.zst for Zstandard etc.;
-r, --append: append files to the end of an archive;
-x, --extract, --get: extract files from an archive;
-f, --file: specify the archive's name;
-t, --list: show a list of files and folders in the archive;
-v, --verbose: show a list of processed files.
```

`tar -cvf DIRECTORY.tar DIRECTORY` Generate a tarball
`tar -xvf DIRECTORY.tar` Extracting the files from a tarball
`tar -zcvf DIRECTORY.tar.gz DIRECTORY` Generate a tarball and compress it with gzip
`tar -zxvf DIRECTORY.tar.gz` Uncompressing and extracting files from a tarball

## 4.3 file transfer and syncing

### 4.3.1 scp

`$ scp sourcefilename user@hostname:somedir/destfilename` From local system to a remote system
`$ scp user@hostname:somedir/sourcefilename destfilename` From a remote system to a local system

### 4.3.2 sftp

SFTP (SSH File Transfer Protocol or sometimes called Secure File Transfer Protocol) is a network protocol that provides file transfer over a reliable data stream.

From a local system to a remote system

This example was made with the remote system “Kebnekaise” belonging to HPC2N.

```
enterprise-d [~]$ sftp user@kebnekaise.hpc2n.umu.se
Connecting to kebnekaise.hpc2n.umu.se...
user@kebnekaise.hpc2n.umu.se's password:
sftp> put file.c C/file.c
Uploading file.c to /home/u/user/C/file.c
file.c                          100%    1     0.0KB/s   00:00
sftp> put -P irf.png pic/
Uploading irf.png to /home/u/user/pic/irf.png
irf.png                         100% 2100     2.1KB/s   00:00
sftp>
```

From a remote system to a local system

```
sftp> get file2.c C/file2.c
Fetching /home/u/user/file2.c to C/file2.c
/home/u/user/file.txt           100%    1     0.1KB/s   00:00
sftp> get -P file3.c C/
Fetching /home/u/user/file3.c to C/file3.c
/home/u/user/file.txt           100%    1     0.4KB/s   00:00
sftp> exit
enterprise-d [~]$
```

### 4.3.3 rsync

rsync is a utility for efficiently transferring and synchronizing files between a computer and a storage drive and across networked computers by comparing the modification times and sizes of files.

`rsync -rlpt username@remote_host:sourcedir/ /path/to/localdir` Recursively sync files from one remote directory to a local directory.
Also preserves symbolic links, permissions, and time stamps. If the transfer is interrupted, rerunning the same command skips the files that have already been transferred completely.
`rsync -a /path/to/localdir/ username@remote_host:destination_directory` Recursively sync a local directory to a remote destination directory, preserving owners, permissions, modification times, and symbolic links

## 4.4 connecting with ssh

The `ssh` command is used for connecting to a remote computer.

`ssh username@kebnekaise.hpc2n.umu.se` Connecting to a compute cluster called Kebnekaise
`ssh -Y username@kebnekaise.hpc2n.umu.se` Connecting to Kebnekaise and enabling graphical display

## 4.5 Finding patterns

### grep

`grep 'word' FILE` Find the pattern ‘word’ in FILE
`grep -rine 'word' path/to/dir` Find the pattern ‘word’ recursively under the directory path/to/dir, ignoring case and printing line numbers

### awk

This command finds patterns in a file and can perform arithmetic/string operations.

`awk '/snow/ {print $1}' FILE` Search for the pattern ‘snow’ in the file FILE and print out the first column of each matching line

### wild card

`?` represents a single character
`*` represents a string of characters (0 or more)
`[ ]` represents a range
`{ }` the terms are separated by commas and each term must be a wildcard or exact name
`[!]` matches any character that is NOT listed between the [ and ]. This is a logical NOT.
`\` specifies an “escape” character, when using a subsequent special character.

`myfile?.txt` This matches myfile0.txt, myfile1.txt, … with any single character (for instance a letter a-z or a number 0-9) in place of the `?`. Try with `ls myfile?.txt`.
`r*d` This matches red, rad, ronald, … anything starting with r and ending with d, including rd.
`r[a,i,o]ck` This matches rack, rick, rock.
`a[d-j]a` This matches ada, afa, aja, … any three-letter word that starts with an a, ends with an a, and has any character from d to j in between. Try with `ls a[d-j]a`.
`[0-9]` This matches a range of numbers from 0 to 9.
`cp {*.dat,*.c,*.pdf} ~` This specifies to copy any files ending in .dat, .c, and .pdf to the user’s home directory.
No spaces are allowed between the commas, etc. You could test it by creating a matching file in the `patterns` directory with `touch file.c` and running the above command to see that it only copies that one from the `patterns` directory.
`rm thisfile[!8]*` This will remove all files named thisfile*, except those that have an 8 at that position in their name. Try running it in the `patterns` directory! Do `ls` before and after to see the change. Remember, you can always recreate the directory `patterns` by untar’ing it again.

### regular expression

Regular expressions can be used with programs like grep, find and many others.

`.` matches any single character. Same as ? in standard wildcard expressions.
`.*` is used to match any string, equivalent to * in standard wildcards.
`\` is used as an “escape” character for a subsequent special character.
`*` the preceding item is matched zero or more times, i.e. n* will match n, nn, nnnn, nnnnnnn but not na or any other character.
`^` means “the beginning of the line”. So “^a” means find a line starting with an “a”.
`$` means “the end of the line”. So “a$” means find a line ending with an “a”.
`[ ]` specifies a range. Same as for normal wildcards. This is an ‘or’ relationship (you only need one to match).
`|` This wildcard makes a logical OR relationship between wildcards. You can thus search for something or something else. You may need to add a ‘\’ before this character to avoid the shell thinking you want a pipe.
`[^]` This is the equivalent of [!] in standard wildcards, i.e. it is a logical “not” and will match anything not listed within the square brackets.

`$ cat myfile | grep '^s.*n$'` This command searches the file myfile for lines starting with an “s” and ending with an “n”, and prints them to the standard output.

### scripting

Scripting is used to perform complex or repetitive tasks without user intervention. All Linux commands can be used in a script, including wild cards.

![image](https://hackmd.io/_uploads/rJGdAPKp0.png)

# 5. Hints and tricks

## short cut

`CTRL-a`: Go to the beginning of the line
`CTRL-e`: Go to the end of the line
`CTRL-l`: Clear the terminal
`TAB`: Auto-complete (i.e. start writing a command or file name and then press TAB to auto-complete, if possible)
`ARROW-UP`: Pressing the arrow-up key repeatedly will let you cycle through recent commands
`CTRL-r`: you will get a prompt to write text to search for in the list of recent commands. The list is saved in `.bash_history` in your $HOME.

## Alias

You will often have to write the same command again and again. If it is a longer command, having to repeat it reduces your productivity. Then you can use the alias command to create an ‘alias’ for your command.

`$ alias` List the aliases that are currently defined
`$ alias shortName="your custom command here"` Create a new alias

### Adding a new alias to the .bashrc file, using ‘nano’ editor

(1) Open the file: `nano ~/.bashrc`
(2) Inside the editor, scroll down to where your aliases are. If you do not have any, just add them at the end, like this
```
#My custom aliases
alias c="clear"
alias ll="ls -alF"

# Colourize ls output
alias ls='ls --color=auto'

# Colourize grep output
alias grep='grep --color=auto'

# Easily list my SLURM batch jobs
alias jobs='squeue -u $USER'

# Find all entries starting with d in the output from the ls -lahrt command
alias ldir='ls -lahrt | egrep "^d"'
```
(3) Save and exit the file: `CTRL-x` (Press CTRL and hold it down while pressing x). Answer ‘Y’ to save.
(4) Next time you start a shell or after a new login your new alias is available. To make it available immediately, run `$ source ~/.bashrc`

## Misc

Write ‘`clear`’ to clear the terminal
Write ‘`history`’ to see a list of the most recent commands written in the terminal

* You can change the number of saved commands by setting the environment variable HISTSIZE in your `.bashrc` file in your home directory.
* Example: Open `.bashrc` with `nano`.
  Somewhere (at the end for instance) add: `export HISTSIZE=NUMBER`, where `NUMBER` is the number of commands to save, for instance 10000.

`man PROGRAM` will give you the manual for a specific program or command, if it exists

* Example: `man gcc` will open the manual/help for the compiler gcc, containing flags to the compiler etc. Note that you need to first load a module that has gcc in it.

# 6. Install tree

Install and run tree as non-root on an Ubuntu Linux system where it is not installed.

(1) Create a directory to work in: `mkdir ~/tree`
(2) Change to that directory: `cd ~/tree`
(3) Download tree: `apt download tree`
(4) Extract the files: `dpkg-deb -xv ./*deb ./`
(5) Now you can run the tree command by giving the full path: `~/tree/usr/bin/tree`
(6) In order to run it without having to give the full path, create an alias in your `~/.bashrc` file:
1. Open `~/.bashrc` with your favourite editor.
2. Add this line at the end or with your other aliases: `alias tree="$HOME/tree/usr/bin/tree"`
3. Save the file.
4. Source the file: `source ~/.bashrc`

You can now run tree by just giving the command `tree`.

# 7. Linux cheatsheet

https://hpc2n.github.io/intro-linux/cheat/

![image](https://hackmd.io/_uploads/SywT1dY6R.png)
![image](https://hackmd.io/_uploads/r1GCydFTC.png)
![image](https://hackmd.io/_uploads/B1bBlOFTR.png)
![image](https://hackmd.io/_uploads/S1Odg_tpA.png)
![image](https://hackmd.io/_uploads/HyzFx_KaA.png)