---
tags: homework
---
# Homework 04
###### tags: `file processing`
## How to test your answer
* [Tutorial](https://hackmd.io/@ntnu-csie-cp/SyhfQKrvO)
> The TAs have prepared a shell script to help you check your answers. Its result does NOT represent your actual score. For example, if the script reports `55` pts, you may actually get only `10` pts.
## Deadline
* News
Problems 4 and 5 have been uploaded.
Three problems were released at 12:10 PM on Tue, 4/27, so you can start working on them right away. The other two problems will be announced before noon on Sun, 5/02. Note that the homework **period starts** <font color="blue">after</font> all problems have been released.
## Submission format
* Please submit your source code on Moodle3.7.
* Prepare a file `[Student_ID]_hw04.zip` that contains your source code (`.c` files).
* Name each source file after its problem number; for example, the source code for problem 1 is `[Student_ID]_hw04_p1.c`.
For example, for student ID `40847099s`:
```
40847099s_hw04.zip
|------40847099s_hw04_p1.c
|------40847099s_hw04_p2.c
|------40847099s_hw04_p3.c
|------40847099s_hw04_p4.c
|______40847099s_hw04_p5.c
```
---
## TA information
> Contact the TAs if you have any problems.
* 莊博傑 Po-Chieh Chuang / 40747019s@gapps.ntnu.edu.tw
* 林育辰 Yu-Chen Lin / 40771131h@gapps.ntnu.edu.tw
---
# p1 - File Encoding
> [name=Judge Girl]
## Description
Write a program to encode a file. You will be given a key $k$ between $0$ and $255$ from standard input.
Encrypt a file named `test` with the following steps:
1. Read a byte from `test`. Let's call it $m$.
2. Calculate $c = k \oplus m$ and write it to the encrypted file `test.enc`.
3. $k \leftarrow k \oplus m$ (this uses the old $k$, so the new key equals $c$).
4. Repeat until the end of `test`.
Note: Please read and write files in binary mode.
## Input
Read the key from stdin, which is a number between $0$ and $255$.
Read the file content from `test`.
## Output
Write the encoded content to `test.enc`.
## Example Input/Output
You can find them [here](https://drive.google.com/drive/folders/1K3vFUMGeH44kqmI9dqIXWlyuRY73av-L?usp=sharing) if you need.
* The `.in` file is `test`; rename it so that your program can read it.
* The `.in.stdin` file is the key.
* The `.out` file is the expected `test.enc`.
## Hint
* Adapted from [Judge Girl](https://judgegirl.csie.org/problem/0/89)
## Problem-Solving idea
> Released after one week.
Be aware that `test` is a binary file, so read the data with `fread` (not `fscanf`!).
After reading the data, repeat the steps in the description and write the encoded data to `test.enc`.
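
As a rough illustration of the steps in the description, here is a minimal sketch in C. The function name `encode_stream` and the 4096-byte buffer are assumptions made for this example and are not part of the assignment. It relies on the fact that step 3 uses the old key, so the new key is simply the byte $c$ that was just produced.

```c
#include <assert.h>
#include <stdio.h>

/* Encode everything readable from `in` into `out`, starting with key `k`.
 * Returns the number of bytes processed.
 * Both streams are assumed to be opened in binary mode by the caller. */
size_t encode_stream(FILE *in, FILE *out, unsigned char k)
{
    unsigned char buf[4096]; /* buffer size is an arbitrary choice */
    size_t total = 0, got;

    assert(in != NULL && out != NULL);
    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        for (size_t i = 0; i < got; i++) {
            /* step 2: c = k XOR m */
            unsigned char c = (unsigned char)(k ^ buf[i]);
            buf[i] = c;
            /* step 3: k <- k XOR m, which is exactly c */
            k = c;
        }
        fwrite(buf, 1, got, out);
        total += got;
    }
    return total;
}
```

In a full program you would read the key with `scanf`, open `test` with mode `"rb"` and `test.enc` with mode `"wb"`, and pass both streams to this function.
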
---
# p2 - Byte Frequency Count
> [name=Judge Girl]
## Description
Write a program to read data from a file. Your program opens a file in binary read mode; the file name is given on standard input.
The first four bytes of the file are a positive integer `n`, the number of values that follow. Then come `n` two-byte **signed** integers ranging from `-32768` to `32767` in binary format.
Now you need to determine which number from `-32768` to `32767` **appears** most often. If there is a tie, report the largest one.
For example, if both 1 and 10 appear 100 times, which is the maximum number of times, then you should report 10.
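
One possible way to do the counting is sketched below. It assumes the file stores its integers in the same byte order as the machine running the program, and that the four-byte count fits an `int32_t`; the function name `most_frequent` is made up for this example. A table with one counter per possible value is enough, because there are only $65536$ distinct two-byte values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read the count and the values from `fp`, then store the most frequent
 * value and its frequency. Returns 0 on success, -1 on a malformed file.
 * Assumption: the file uses the host byte order. */
int most_frequent(FILE *fp, int *value, int *freq)
{
    static int count[65536]; /* count[v + 32768] = occurrences of v */
    int32_t n;
    int16_t x;
    int best = 0;

    assert(fp != NULL && value != NULL && freq != NULL);
    memset(count, 0, sizeof count);

    if (fread(&n, sizeof n, 1, fp) != 1 || n <= 0)
        return -1;
    for (int32_t i = 0; i < n; i++) {
        if (fread(&x, sizeof x, 1, fp) != 1)
            return -1;
        count[x + 32768]++;
    }

    /* Scanning upward with >= keeps the larger value when counts tie. */
    for (int i = 1; i < 65536; i++)
        if (count[i] >= count[best])
            best = i;

    *value = best - 32768;
    *freq = count[best];
    return 0;
}
```

The caller would open the named file with mode `"rb"` and print `value` and `freq` on separate lines.
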
---
#### Subtasks
* 100 pts in total.
* The number of testcases: $n$
* Each testcase accounts for $(\frac{100}{n})$ pts
## Input
The input is a string of at most 200 characters that specifies the file name. It can be read with `scanf("%s",...)`. You do not need to handle file names that contain special characters.
---
## Output
You should output two lines:
* The first line is the **number** that appears most often.
* The second line is the **frequency** of this number.
---
## Example Input
```
0.dat
```
## Example Output
```
10
100
```
## Hint
* Refer to the file [0.dat](https://github.com/JudgeGirl/JG-testdata/raw/master/practice/264/0.dat)
* This problem is adapted from [Judge Girl](https://judgegirl.csie.org/problem/0/264)
* You are NOT ALLOWED to modify or refer to solutions found on the Internet; violators will be punished.
## Problem-Solving idea
> Released after one week.
---
# p3 - Convert a Binary Grade File to HTML
> [name=Judge Girl]
## Description
Write a program to read a binary file of student records and produce an HTML file for display.
The HTML file should use a table in which each row contains all the fields of one student.
The HTML file should follow the format in the Output section; a sample HTML file is attached to this article. The student record is defined as in the textbook (page 219):
```c=
typedef struct {
    char name[20];
    int id;
    char phone[10];
    float grade[4];
    int birth_year;
    int birth_month;
    int birth_day;
} Student;
```

The name of the input student file and the name of the output HTML file are given on stdin.
#### Subtasks
* 100 pts in total.
* The number of testcases: $n$
* Each testcase accounts for $(\frac{100}{n})$ pts
## Input
* The first line of the input is the name of the binary file.
* The second line of the input is the name of the output HTML file.
Each file name has no more than 80 characters. The binary file contains an array of `Student` structures.
## Output
Write your **HTML file** in the following format:
```htmlembedded=
<table border="1">
<tbody>
<tr>
<td>%s</td>
<td>%d</td>
<td>%s</td>
<td>%f, %f, %f, %f</td>
<td>%d, %d, %d</td>
</tr>
</tbody>
</table>
```
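
The template above could be filled in roughly as follows. This sketch assumes that the `<table>` and `<tbody>` wrappers are written once and that each record becomes one `<tr>` block; the function name `students_to_html` is hypothetical. The precisions in `%.20s` and `%.10s` are a defensive choice so that a field without a terminating NUL cannot be read past its array; for properly terminated strings they behave like `%s`.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    char name[20];
    int id;
    char phone[10];
    float grade[4];
    int birth_year;
    int birth_month;
    int birth_day;
} Student;

/* Stream records from `in` to `out` one at a time, so no array of
 * unknown size is needed. Returns the number of rows written. */
int students_to_html(FILE *in, FILE *out)
{
    Student s;
    int rows = 0;

    assert(in != NULL && out != NULL);
    fprintf(out, "<table border=\"1\">\n<tbody>\n");
    while (fread(&s, sizeof s, 1, in) == 1) {
        fprintf(out, "<tr>\n");
        fprintf(out, "<td>%.20s</td>\n", s.name);
        fprintf(out, "<td>%d</td>\n", s.id);
        fprintf(out, "<td>%.10s</td>\n", s.phone);
        fprintf(out, "<td>%f, %f, %f, %f</td>\n",
                s.grade[0], s.grade[1], s.grade[2], s.grade[3]);
        fprintf(out, "<td>%d, %d, %d</td>\n",
                s.birth_year, s.birth_month, s.birth_day);
        fprintf(out, "</tr>\n");
        rows++;
    }
    fprintf(out, "</tbody>\n</table>\n");
    return rows;
}
```

Compare the result against the attached sample file, since the exact whitespace expected by the judge is not spelled out here.
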
## Example Input
```
students.bin
students.html
```
## Example Output
```
Please refer to the attached file.
You must output an HTML file whose name is given on the second line of the input.
```
## Hint
* Since the binary file holds multiple records and the number of records is unknown, process the records and output the HTML source one at a time rather than storing them in a fixed-size array.
* Download `students.bin` [here](https://github.com/JudgeGirl/JG-testdata/raw/master/practice/136/students.bin)
* Download `students.html` [here](https://github.com/JudgeGirl/JG-testdata/raw/master/practice/136/1.out)
* This problem is adapted from [Judge Girl](https://judgegirl.csie.org/problem/0/136)
* You are NOT ALLOWED to modify or refer to solutions found on the Internet; violators will be punished.
## Problem-Solving idea
> Released after one week.
---
# p4 - Ransomware
> [name=Bogay]
## Description
Bogay, your computer programming TA, was recently attacked by ransomware. All of his vtuber memes are encrypted now!
Luckily, after studying the malware, he found that it uses an [Affine cipher](https://en.wikipedia.org/wiki/Affine_cipher), which can be easily decrypted.
Can you write a program to help him?
* Note 1: Some files are large and may cause TLE (see the hint!).
* Note 2: The affine cipher encrypts every byte $x$ in a file with a pair of random numbers $a, b$ using the function $E(x) = (ax + b) \bmod m$.
We know that $m=256$, so you can recover the original file by trying every possible $a, b$.
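
Once a candidate pair $a, b$ is known, decryption can be done with a lookup table, as in the sketch below. The helper names are invented for this example. Decryption is only possible when $a$ is coprime with $256$, that is, when $a$ is odd, so even values of $a$ are rejected. A table costs one array access per byte, which may help with the large files mentioned in Note 1.

```c
#include <assert.h>
#include <stddef.h>

/* E(x) = (a*x + b) mod 256 */
static unsigned char affine_encrypt(unsigned a, unsigned b, unsigned char x)
{
    return (unsigned char)((a * x + b) & 0xFFu);
}

/* Fill `table` so that table[E(x)] == x for every byte x.
 * Returns 1 on success, 0 if `a` is even (E would not be invertible). */
int build_decrypt_table(unsigned a, unsigned b, unsigned char table[256])
{
    if (a % 2 == 0)
        return 0;
    for (unsigned x = 0; x < 256; x++)
        table[affine_encrypt(a, b, (unsigned char)x)] = (unsigned char)x;
    return 1;
}

/* Decrypt `n` bytes in place using a table built above. */
void decrypt_buffer(const unsigned char table[256], unsigned char *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        buf[i] = table[buf[i]];
}
```
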
## Input
Read an encrypted JPEG file `meme.enc`.
## Output
Write the decrypted file to `meme.jpg`.
## Example Input/Output
You can find them [here](https://drive.google.com/drive/folders/1k24SQsAo9R-iVwDAU0Ae1Km6zGJJqNIs?usp=sharing) if you need.
* The `.enc` file is `meme.enc`.
* The `.jpg` file is the expected `meme.jpg`.
## Hint
* How do you verify a file is correctly decrypted?
* You don't need to decrypt the whole file to check whether it is valid.
* See [JPEG Syntax and structure](https://en.wikipedia.org/wiki/JPEG#cite_ref-51) for more details.
## Problem-Solving idea
> Released after one week.
The first 2 bytes of a JPEG file must be `0xFF 0xD8`, so you can use this signature to verify your decryption. Just try every possible $a, b$ on the first 2 bytes only.
---
# p5 - File Sorter
> [name=Yu-Chen Lin]
## Description
You are given the name of a **binary** file consisting of structures.
Your program should read the structures from this file, **remove** those that contain errors, sort the remaining **valid** structures, and generate a `.c` file that prints those **valid** structures.
The structure is defined as follows.
```c=
typedef struct {
    int balance;
    char name[128];
    int gender;
} Account;
```
A structure is **invalid** if any of the following is true:
* The balance is **negative**.
* The name (as a string) contains anything **other than** letters and spaces.
* The gender is **not** 0 or 1.
Sort the structures by their names. It is guaranteed that no two names will be the same.
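
The validity rules and the sort can be sketched as below. Two points are assumptions rather than something the problem states: names are compared with `strcmp` (so uppercase letters sort before lowercase ones), and a name with no terminating NUL inside the 128-byte array is treated as invalid. The function names are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int balance;
    char name[128];
    int gender;
} Account;

/* Returns 1 if the record passes all three rules, 0 otherwise. */
int is_valid(const Account *acc)
{
    assert(acc != NULL);
    if (acc->balance < 0)
        return 0;
    if (acc->gender != 0 && acc->gender != 1)
        return 0;
    /* Assumption: an unterminated name counts as invalid. */
    const char *end = memchr(acc->name, '\0', sizeof acc->name);
    if (end == NULL)
        return 0;
    for (const char *p = acc->name; p < end; p++)
        if (!isalpha((unsigned char)*p) && *p != ' ')
            return 0;
    return 1;
}

/* qsort comparator: ascending strcmp order on the name field. */
int by_name(const void *lhs, const void *rhs)
{
    return strcmp(((const Account *)lhs)->name,
                  ((const Account *)rhs)->name);
}

/* Compact the valid records to the front, sort them, return their count. */
size_t filter_and_sort(Account *accs, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (is_valid(&accs[i]))
            accs[kept++] = accs[i];
    qsort(accs, kept, sizeof accs[0], by_name);
    return kept;
}
```
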
#### Subtasks
| | Limits | Score |
| --------- |:------ | -----:|
| #0 | The file has only one structure, and it is valid. | $15$ |
| #1 | The file has only one structure, and it may be invalid. | $15$ |
| #2 | The file may have more than one structure ($\leq 1024$), and they are all valid. | $40$ |
| #3 | The file may have more than one structure ($\leq 1024$), and they could be invalid. | $30$ |
| **Total** | | $100$ |
## Input
* **Standard Input**
Neither file name exceeds 80 characters. Read the binary file named on the first line of standard input, and generate a `.c` file with the name given on the second line.
```
1.dat
test1.c
```
* `.dat`
$N$ structures, valid or invalid, in binary format ( $1 \leq N \leq 1024$ ). Read this file until `EOF`.
First, your program processes the binary file as the problem requires (that is, it removes the invalid structures and sorts the rest by name).
\
After this processing, printing the data in order would give output like this:
```
--> 1 <--
balance) 0
name) fYWTGPe
gender) 0
```
You need to generate a C program that prints the content above. The generated C program may look like this:
* test1.c
```c=
#include <stdio.h>
int main() {
    printf("--> 1 <--\n");
    printf("balance) 0\n");
    printf("name) ");
    printf("fYWTGPe\n");
    printf("gender) 0\n");
    return 0;
}
```
Finally, we will compile the file you generate and compare its output with the expected output.
## Output
* <font color = "blue">`.c` file</font>
Generate a `.c` file that, once compiled and executed, prints the information of the structures that satisfy the problem requirements, in a format **like**:
```
--> 1 <--
balance) 46
name) AECH o WN UJz WRX
gender) 0
--> 2 <--
balance) 49
name) CxTr cpWBBA AcKnOPVhmUrl xcvxQzU
gender) 0
--> 3 <--
balance) 41
name) MNhjMwdb rSiggnpGlKHFH mCUenLCb
gender) 0
--> 4 <--
balance) 68
name) OOE GSOZWGTmwmsD wgarJ hSQjOlAK
gender) 0
--> 5 <--
balance) 54
name) TIqj FssG Eks rIBhRZRwYbxfLCnTLqwJ
gender) 1
```
On each valid structure, the format is:
* The first line is `--> [Index number] <--`, ending with a newline character.
* The second line is `balance) [The balance value of this structure]`, ending with a newline character.
* The third line is `name) [The name of this structure]`, ending with a newline character.
* The fourth line is `gender) [The gender value of this structure]`, ending with a newline character.
**Consecutive** valid structures are separated by a blank line. The `.c` file you generate should print this content.
## Example Input
```
1.dat
test1.c
```
## Example Output
## Hint
* Download [1.dat](https://judgegirl.csie.org/downloads/testdata/10065/1.dat); `test1.c` looks like the one shown in the output description of the problem.
* This problem is a revised version of a `Judge Girl` problem.
## Problem-Solving idea
> Released after one week.