---
tags: Matlab Workshop
---

# Lesson 6: Files and Folders

[TOC]

## 1. File Operators

Please create a file named `myfile1.csv` and a folder named `hello`. Inside this folder, create a file named `myfile2.csv`. The `myfile1.csv` file contains the following:

```csv!
% myfile1.csv
A,B,C,D
6,8,3,1
5,4,7,3
2,6,7,10
2,2,8,2
2,7,5,9
```

This is the structure of the current directory (.).

```
.
├── myfile1.csv
└── hello
    └── myfile2.csv
```

We are going to test several file operators.

**1.** `ls` - [matlab document](https://www.mathworks.com/help/matlab/ref/ls.html)

List folder contents

```matlab
>> ls
hello myfile1.csv
>> ls hello
myfile2.csv
>> list = ls
list = 'hello myfile1.csv'
```

**2.** `exist` - [matlab document](https://www.mathworks.com/help/matlab/ref/exist.html)

Check existence of variable, script, function, folder, or class

```matlab
>> exist hello % Return 7 (folder)
>> exist myfile1.csv % Return 2 (file)
```

**3.** `type` - [matlab document](https://www.mathworks.com/help/matlab/ref/type.html)

Display contents of file

```matlab
>> type myfile1.csv
A,B,C,D
6,8,3,1
5,4,7,3
2,6,7,10
2,2,8,2
2,7,5,9
```

:::info
Please refer to [MATLAB File Operators](https://www.mathworks.com/help/matlab/file-operations.html?s_tid=CRUX_lftnav) for other operators.

Please find the command that creates a folder, and use it to create a folder named "hello".
:::

## 2. Data Import and Export

### 2.1. Low-Level File I/O

Please put the following file in your working directory.

```bash
# Life_X_Y_Z.txt
>Life_X
CBGECHEFDEBHGCFHDBFCBEAEB
>Life_Y
CASDCASDCASDCASDCASDCD
>Life_Z
SADFBXCBTHRE
```

We will read this file and extract each sequence and its name. Then we do some calculations and write the results to another file. In this scenario, we need to know how to open a file, get information about open files, write to a file, close the file, and so on. MATLAB provides functions for all of these tasks.
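As a rough preview of the open, read, close cycle described above, here is a minimal sketch. It assumes `Life_X_Y_Z.txt` is in the current folder; the variable names `errmsg` and `firstLine` are chosen for this illustration only. It uses the two-output form of `fopen`, which returns `-1` as the file identifier when the file cannot be opened.

```matlab
% Minimal sketch of the open -> read -> close cycle.
% Assumes Life_X_Y_Z.txt (shown above) is in the current folder.
filename = 'Life_X_Y_Z.txt';

% fopen returns -1 as fileID if the file cannot be opened;
% the second output holds the system error message in that case.
[fileID, errmsg] = fopen(filename, 'r');

if fileID == -1
    fprintf('Could not open %s: %s\n', filename, errmsg);
else
    firstLine = fgetl(fileID);   % read one line, without the trailing newline
    fprintf('First line: %s\n', firstLine);
    fclose(fileID);              % release the file when done
end
```

Checking the identifier before reading avoids a confusing "invalid file identifier" error later in the script, which is a common stumbling block when the file is not in the working directory.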
Now let's try:

#### **1.** `fopen` to open the file - [matlab document](https://www.mathworks.com/help/matlab/ref/fopen.html)

:::info
Syntax: `fileID = fopen(filename, permission)`

`fileID` - An integer file identifier
`filename` - Name of file to open
`permission` - File access type: 'r' (default) | 'w' | 'a' | 'r+' | 'w+' | 'a+' | 'A' | 'W' | ...

| Mode | Description                                    |
|------|------------------------------------------------|
| `r`  | Read only.                                     |
| `w`  | Write only, creates/truncates file.            |
| `a`  | Append, creates file if it doesn't exist.      |
| `r+` | Read and write.                                |
| `w+` | Read/write, creates/truncates file.            |
| `a+` | Read/append, creates file if it doesn't exist. |
| `A`  | Append without buffer flush.                   |
| `W`  | Write without buffer flush.                    |
:::

```matlab
fileID = fopen('Life_X_Y_Z.txt', 'r'); % 3
```

#### **2.** `fgets` to read line from file - [matlab document](https://www.mathworks.com/help/matlab/ref/fgets.html)

:::info
Syntax: `tline = fgets(fileID)`

`tline` - The next line of the specified file, including the newline character `\n`.
- If the file is nonempty, then `fgets` returns `tline` as a **character vector**.
- If the file is empty and contains only the end-of-file marker, then `fgets` returns `tline` as a numeric value **-1**.

`fileID` - File identifier
:::

```matlab
tline = fgets(fileID);
% '>Life_X
%  '
```

#### **3.** `fclose` to close open files - [matlab document](https://www.mathworks.com/help/matlab/ref/fclose.html)

:::info
Syntax: `fclose(fileID)` or `fclose('all')`

`fileID` - File identifier
:::

```matlab
fclose(fileID)
```

##### Example 1

Extract the names and sequences from the 'Life_X_Y_Z.txt' file.

```matlab
filename = 'Life_X_Y_Z.txt';
fileID = fopen(filename, 'r');

% Initialize variables to store sequence names and sequences
seq_names = {};
sequences = {};

% Read the file line by line
while true
    tline = fgetl(fileID); % Read a line (fgetl drops the trailing newline)

    % fgetl returns the number -1 at end of file, so stop there
    if ~ischar(tline)
        break
    end

    % Skip empty lines so that tline(1) below is always valid
    if isempty(tline)
        continue
    end

    % if the line starts with '>', then extract the sequence name (without '>')
    if tline(1) == '>'
        seq_names(end + 1) = cellstr(tline(2:end));
    % else extract sequence
    else % tline(1) ~= '>'
        sequences(end + 1) = cellstr(tline);
    end
end

fclose(fileID);

for i = 1:length(seq_names)
    disp(['Sequence Name: ', seq_names{i}]);
    disp(['Sequence: ', sequences{i}]);
    disp(['Sequence Length: ', num2str(length(sequences{i}))]);
    disp('--------------------------');
end
```

Put your code [here](https://hackmd.io/k4ni6ad7TRKt85VCPVQmeg).

#### **4.** `fprintf` to write data to text file - [matlab document](https://www.mathworks.com/help/matlab/ref/fprintf.html)

Let's say we want to report the results of a calculation to a file, e.g. sequence similarity, sequence length, sequence name, etc.

```bash!
# sequences_output.txt
Sequence Name: Life_X
Sequence: CBGECHEFDEBHGCFHDBFCBEAEB
Sequence Length: 25
--------------------------
Sequence Name: Life_Y
Sequence: CASDCASDCASDCASDCASDCD
Sequence Length: 22
--------------------------
Sequence Name: Life_Z
Sequence: SADFBXCBTHRE
Sequence Length: 12
--------------------------
```

:::info
Syntax: `fprintf(fileID,formatSpec,A1,...,An)`

`fileID` - File identifier
`formatSpec` - Format of the output fields, specified using formatting operators

![Screenshot 2024-11-11 at 4.36.20 PM](https://hackmd.io/_uploads/ByMGfHkf1g.png)

`%`: starts the formatting operator
`u`: conversion character, which ends the operator (see the table below)

| Conversion Character | Description            | Example Output |
|----------------------|------------------------|----------------|
| `%d`                 | Signed decimal integer | `42`           |
| `%s`                 | String                 | `Hello`        |
| `%f`                 | Floating-point number  | `3.141590`     |
| `%c`                 | Character              | `A`            |
| `%e`                 | Scientific notation    | `1.234500e+04` |

`Precision`: Number of digits to the right of the decimal point, e.g. `'%.4f'`

`A1,...,An` - Numeric or character arrays
:::

##### Example 2

```matlab!
% Open a new file to write the results
outputFile = 'sequences_output.txt';
outputID = fopen(outputFile, 'w');

for i = 1:length(seq_names)
    % Write sequence names
    % fprintf(fileID,formatSpec,A1,...,An)
    fprintf(outputID, 'Sequence Name: %s\n', seq_names{i});

    % Write sequences (exercise: fill in the arguments)
    fprintf(...);

    % Write sequence length (exercise: fill in the arguments)
    fprintf(...);

    fprintf(outputID, '--------------------------\n');
end

fclose(outputID);
```

Put your code [here](https://hackmd.io/k4ni6ad7TRKt85VCPVQmeg).

### 2.2. Structured data

MATLAB provides three functions for reading **structured data** from files, each suited to a different use case.
| **Function** | **Best For** | **Output** | **MATLAB document** |
|--------------|--------------|------------|---------------------|
| `readtable`  | Data with headers and mixed types (one data type per column) | Table | [matlab/readtable](https://www.mathworks.com/help/matlab/ref/readtable.html) |
| `readmatrix` | Purely numeric data | Numeric Matrix | [matlab/readmatrix](https://www.mathworks.com/help/matlab/ref/readmatrix.html) |
| `readcell`   | Mixed data, handling text and numbers | Cell Array | [matlab/readcell](https://www.mathworks.com/help/matlab/ref/readcell.html) |

```matlab
>> m = readmatrix("myfile1.csv")

m =

     6     8     3     1
     5     4     7     3
     2     6     7    10
     2     2     8     2
     2     7     5     9

>> t = readtable('myfile1.csv')

t =

  5×4 table

    A    B    C    D 
    _    _    _    __

    6    8    3     1
    5    4    7     3
    2    6    7    10
    2    2    8     2
    2    7    5     9

>> c = readcell("myfile1.csv")

c =

  6×4 cell array

    {'A'}    {'B'}    {'C'}    {'D' }
    {[6]}    {[8]}    {[3]}    {[ 1]}
    {[5]}    {[4]}    {[7]}    {[ 3]}
    {[2]}    {[6]}    {[7]}    {[10]}
    {[2]}    {[2]}    {[8]}    {[ 2]}
    {[2]}    {[7]}    {[5]}    {[ 9]}
```

:::info
Please refer to [MATLAB Data Import and Export](https://www.mathworks.com/help/matlab/files-and-folders.html) to learn how to work with other file formats.
:::
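The examples above only cover import. As a minimal sketch of the export direction, the snippet below builds a table with the same values as `myfile1.csv`, adds a column, writes it with `writetable`, and reads it back with `readtable`. The output file name and the `RowSum` column are hypothetical names chosen for this illustration; the data is built in memory so the sketch runs on its own.

```matlab
% Hypothetical output file, placed in the system temp folder for this sketch.
outFile = fullfile(tempdir, 'myfile1_with_sum.csv');

% Same values as myfile1.csv, built in memory so the sketch is self-contained.
data = [6 8 3 1; 5 4 7 3; 2 6 7 10; 2 2 8 2; 2 7 5 9];
t = array2table(data, 'VariableNames', {'A', 'B', 'C', 'D'});

% Table columns are accessed by name with dot notation.
t.RowSum = t.A + t.B + t.C + t.D;

% Export the table (header row included), then read it back.
writetable(t, outFile);
t2 = readtable(outFile);

disp(t2.RowSum');   % row sums: 18 19 25 14 23

delete(outFile);    % clean up the temporary file
```

`writetable` infers the CSV format from the `.csv` extension. For purely numeric data without headers, `writematrix` plays the same role for matrices that `readmatrix` plays on the import side.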