---
# System prepended metadata

title: File reading and writing in python
tags: [IB, Computer Science, python]

---

# File reading and writing in python

:::info

This note is part of the coding course for the IB Computer Science program. If you want to check out other parts of the same course, you can find them in [this link of teaching material](/68GDv_RgT-yh9oERMvdnFw).

:::

:::warning
:warning: Note still under construction :warning: 
:::


## Context 

In this new syllabus you're supposed to know how to handle files: reading and writing them.

### What is a file. File systems, files and file extensions 

:::info
This is the link to [A1.1](/6LgCmT4IToOkOu0CuV3UBQ) and [A1.3 Notes and Resources](/iD8iagfzSm6tbNqeApW50g)!
:::

Just to be on the same page, remember that the OS is the one that is going to convert this:

![image](https://hackmd.io/_uploads/Bk4N78EO-l.png)

Into this:

![image](https://hackmd.io/_uploads/SkDPQLVdWx.png)
_Linux_

or this

![image](https://hackmd.io/_uploads/HyQTQUEu-x.png)
_windows_

This is done through _abstraction_. These folders can represent parts of the secondary memory, and there can be loops or shortcuts to other places.

To know where your stuff is you will need to know the _route_ (in English documentation this is usually called the _path_).

For example (Windows):

A file that you download can end up here:

C:\Users\your.name\Downloads\ib_guide.pdf

C:\ is the **disk** (to be precise the partition that has been *mounted* so it can be accessed)
Users, your.name and Downloads are  **folders**

ib_guide.pdf is the name of the file and it consists of two parts:

ib_guide is the name

and `.pdf` (including the dot) is the **extension**, which tells the OS how that file should be read.

:::info
You can change the extension of a file and that will change how the OS tries to interpret it, but if you rename a .docx file to .jpg, don't expect it to be actually converted into a jpg (or a pdf).
:::

### Types of file formats

//TO-DO

![imagen](https://hackmd.io/_uploads/HJtLDhh_-g.png)


### Absolute routes and relative routes 

Programs can also access files. Usually they are linked through **absolute routes**, which start from Home or from the disk unit, or **relative routes**, which start from the folder where the program is being executed.

For example, `C:\Users\your.name\Downloads\ib_guide.pdf` is an absolute route, but `Downloads\ib_guide.pdf` is a relative route from the folder `C:\Users\your.name`.

:::warning
:warning: **Slashes**: In Windows the slashes are these ones: `\`; in macOS and Linux it's the other slash: `/`. In URLs it's also the `/`.
:::

Relative routes are usually more practical because they can be transferred from one device to another.

The simplest **relative route** is the same folder where the script is being executed. In that case the route is only the name of the file.
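To see the difference between both kinds of routes, here is a minimal sketch using the standard `os` module. The file name `testinput.txt` is only an example and the file doesn't need to exist for this to run.

```python
import os

relative_route = "testinput.txt"  # example name, the file doesn't need to exist
absolute_route = os.path.abspath(relative_route)  # current folder + file name

print(os.getcwd())                    # folder where the program is being executed
print(os.path.isabs(relative_route))  # False
print(os.path.isabs(absolute_route))  # True
print(absolute_route)                 # depends on your computer
```

The absolute route that is printed changes from one device to another, while the relative one stays the same. That's why relative routes are easier to share.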

## Opening files in python

:::info
This is complicated to use with OneCompiler for _reasons_.
:::


We should use the function `open()` and put the return of that function into a variable (usually `file`, but it can be another name).

```python!
file = open("testinput.txt")
print(file)
```

Now, this file is an Object so when we try to print it this is going to be the result:

output:
![image](https://hackmd.io/_uploads/rJeyydVZbe.png)

If we want to actually read the content we can use the method `read()`, which returns the whole content of the file as a single string. More often, though, we want to read the file line by line.

For that we use a `for` loop, commonly just:

```python!
file = open("testinput.txt")
for line in file:
    pass #substitute with whatever you're going to do with the line
```
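To compare both ways of reading, this is a small sketch. It first creates a demo file in a temporary folder (the content is invented), so it uses the writing mode and `close()`, which are explained later in this note.

```python
import os
import tempfile

# Create a small demo file in a temporary folder (invented content)
folder = tempfile.mkdtemp()
route = os.path.join(folder, "testinput.txt")
file = open(route, "w")
file.write("first line\nsecond line\n")
file.close()

# read() returns the whole content as ONE string
file = open(route)
content = file.read()
file.close()
print(repr(content))  # 'first line\nsecond line\n'

# Looping through the file gives one line at a time
lines = []
file = open(route)
for line in file:
    lines.append(line)
file.close()
print(lines)  # ['first line\n', 'second line\n']
```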

### Opening modes

We have different modes:

- `r` -> read
- `w` -> write
- `a` -> append
- `r+` -> read and write

If we don't give the mode, the file is opened for reading by default. If we want to change the mode, we pass a string as the second parameter.

:::danger
:warning: Writing is not the same as appending. If you write to a file you're overwriting what was there before.
:::
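A short sketch of the difference between `"w"` and `"a"`. The file `modes.txt` is a made-up name, and it's created in a temporary folder so no real file is touched.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
route = os.path.join(folder, "modes.txt")

file = open(route, "w")  # "w" creates the file (or empties it if it exists)
file.write("one\n")
file.close()

file = open(route, "a")  # "a" keeps the content and writes at the end
file.write("two\n")
file.close()

file = open(route)       # no mode given, so it's "r"
after_append = file.read()
file.close()

file = open(route, "w")  # "w" again: the previous content is lost
file.write("three\n")
file.close()

file = open(route)
after_write = file.read()
file.close()

print(repr(after_append))  # 'one\ntwo\n'
print(repr(after_write))   # 'three\n'
```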

What if I want to "edit" a file? Usually you read it, store the content in a variable and then write it again with the appropriate modifications.
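The read, store and write idea could look like this. It's only a sketch: the file name and the modification (putting the second line in capital letters) are invented for the example.

```python
import os
import tempfile

# Demo file in a temporary folder (invented content)
folder = tempfile.mkdtemp()
route = os.path.join(folder, "names.txt")
file = open(route, "w")
file.write("Pipo\nPepa\nPepe\n")
file.close()

# 1. Read the file and store the lines in a variable
lines = []
file = open(route)
for line in file:
    lines.append(line)
file.close()

# 2. Modify what we have stored
lines[1] = lines[1].upper()

# 3. Write everything again (this overwrites the old content)
file = open(route, "w")
for line in lines:
    file.write(line)
file.close()

# Check the result
file = open(route)
result = file.read()
file.close()
print(repr(result))  # 'Pipo\nPEPA\nPepe\n'
```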


```python!
file = open("testinput.txt", "a")
print(file)
```



```python!
file = open("testinput.txt","a")
file.write("\nWe're the crystal gems")
file.close()
```

For making a new line we need to use special characters. The new line is usually `\n`. 

:::info
**Special characters in Python**
These special characters are common in other programming languages

![image](https://hackmd.io/_uploads/rkHSlONu-x.png)
_[(source)](https://www.w3schools.com/python/gloss_python_escape_characters.asp)_
:::

### Closing files: close()

Remember from A1.3 that Operating Systems are in charge of managing the access to files and folders?

So it happens that if Python (or another program) opens a file, the Operating System may prevent other programs from accessing that file. This is something that you may have encountered when trying to delete a file that you have open.

So to make those files free for other programs (or the user) to read/write, we need to close the files once we have finished with them. As you might expect, this is done with the `close()` method.

```python!
file = open("testinput.txt")
file.close()
```

If you don't write it, the interpreter is going to close the file anyway once the program is over.

But writing it is what is called a **good practice**. Usually the output is going to be the same, but there are other, more subtle benefits. In this case, if you write a more complex program you're not going to block files.

### With open 

Here we have the same kind of problem that we have with `while` loops: it's easy to forget a line, in this case the one that closes the file. For that, Python has another common structure: `with open`.

The idea is that you write `with open(parameters) as nameOfTheFileVariable:`

When that block finishes, Python calls `close()` for you, so you don't need to write it!

So if I write this:
```python! 
file = open("myFile.txt")
for line in file:
    print(line)
file.close()
```

It's the same as writing this:
```python! 
with open("myFile.txt") as file:
    for line in file:
        print(line)
```

You will see examples with both versions all over the internet. 
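If you want to check that `with` really closes the file, file objects have an attribute called `closed`. This is a small sketch with a made-up file in a temporary folder:

```python
import os
import tempfile

folder = tempfile.mkdtemp()
route = os.path.join(folder, "myFile.txt")

with open(route, "w") as file:
    file.write("hello\n")
    inside = file.closed  # still inside the block: the file is open

outside = file.closed     # the block is over: the file has been closed

print(inside)   # False
print(outside)  # True
```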

## Example of reading 

Imagine that we have this textFile.txt 

```
Pipo
Pepa
Pepe
Thomas Jefferson
```

And we want to create a list from them.

One solution is to write the following code in our `main.py`:

```python=
file = open("textFile.txt")

myList = list()

for line in file:
  myList.append(line)
  
print(myList)
```

In this script we open the file, we create an empty list and then for each line that there is in the file we append it to the list. Finally we print the result. And this is the result:

![image](https://hackmd.io/_uploads/SkshvnnP-e.png)

HUM!

Then we need to use `strip()` to remove that `\n`.

:::info
`\n` is an escape character that represents a new line.
:::
:::success
You can read more about strip here: https://hackmd.io/y_afcRmLQcysv05xlSqRuQ#strip
:::

```python

file = open("textFile.txt")

myList = list()

for line in file:
  myList.append(line.strip())
  
print(myList)

```
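Be aware that `strip()` with no parameters removes more than the `\n`: it removes all the whitespace at both ends. A quick sketch with an invented line:

```python
line = "  Thomas Jefferson \n"  # invented line with extra spaces

stripped = line.strip()         # removes spaces and \n at both ends
only_newline = line.rstrip("\n")  # removes only the \n at the end

print(repr(stripped))      # 'Thomas Jefferson'
print(repr(only_newline))  # '  Thomas Jefferson '
```

For a file of names like this one, `strip()` is usually what we want.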

### Common error: encoding

Remember when in [A1.2 Binary representation](/hQH2iVecTpOaEuWNjLqR3Q) we talked about ASCII and Unicode with UTF-8 and UTF-32? It happens that if you open a file that was written in a different encoding, special characters (such as "ñ", Cyrillic characters or Chinese characters) are going to look _weird_. To solve it we need to include one optional parameter and just say `encoding='utf-8'`.

For example:
```python! 
with open("myFile.txt",encoding='utf-8') as file:
    for line in file:
        print(line)
```
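To see what _weird_ means, this sketch writes a word with "ñ" in UTF-8 and then reads it twice: once with the right encoding and once with another one (`latin-1`, chosen only for the example). The file name is made up.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
route = os.path.join(folder, "spanish.txt")

with open(route, "w", encoding="utf-8") as file:
    file.write("mañana\n")

with open(route, encoding="utf-8") as file:
    right = file.read()   # same encoding as when it was written

with open(route, encoding="latin-1") as file:
    wrong = file.read()   # different encoding: the ñ is misread

print(right)  # mañana
print(wrong)  # maÃ±ana
```

In UTF-8 the "ñ" takes two bytes, and `latin-1` reads each byte as a separate character. That's where the two strange characters come from.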

## How to organize several files in OneCompiler so we can open other files as data files

Open OneCompiler and work on the script.

![image](https://hackmd.io/_uploads/BJcwznnv-x.png)

Press the "+" and choose a new Python file.

![image](https://hackmd.io/_uploads/ryCGV33PZx.png)

**RENAME** the file name and extension

![image](https://hackmd.io/_uploads/r19eBnhP-g.png)
(in this case textFile.txt but we can change the name or the extension)

Change the content:

![image](https://hackmd.io/_uploads/ryM4H23PWx.png)

That's it! You can now work with these test files.

:::warning
This doesn't work for writing files
:::



## Example of writing a file 

Now let's say that we want to do it the other way around:

I want to produce the same file that was the example in the reading part.

We can start with the list that we had before, and we need to prepare the lines with the special character `\n`.

So let's start here:

```python!
myList = ["Pipo", "Pepa", "Pepe", "Thomas Jefferson"]
```

Now we need to open the file:


```python!
myList = ["Pipo", "Pepa", "Pepe", "Thomas Jefferson"]
with open("output.txt", "w" ,encoding='utf-8') as file:
    #Don't forget the w of writing!
    pass #remember that this is going to be changed later
```

And while the file is open we can create the lines and write them:


```python!
myList = ["Pipo", "Pepa", "Pepe", "Thomas Jefferson"]
with open("output.txt", "w" ,encoding='utf-8') as file:
    for element in myList:
        file.write(element)
        file.write("\n")
        
```
If done correctly, this will be the result:

![image](https://hackmd.io/_uploads/Hy3-7t4O-x.png)

Problem: we have 5 lines in the file but we only want to have 4. 

//TO-DO by the students. 

## Exercises

### Writing exercises with specific properties using range

`range()` is a function that we can use to create lists quickly, so we can easily create files with specific numbers.

```python!
myList = list(range(1,11)) #numbers from 1 to 10
with open ("testfile.txt", "w") as file:
    for index in range(len(myList)):
        file.write(str(myList[index])) #we need to convert the integers to strings
        if index < len(myList)-1:
            file.write("\n")
```

With that structure you can do:

Construct in Python a program that writes a file called "testOutput.txt" with all the numbers from 1 to 100 

//TO-DO by the students

Construct in Python a program that writes a file "testOutput.txt" with all the numbers from 100 to 1000

//TO-DO by the students

Construct in Python a program that writes a file "testOutput.txt" with all the numbers from -4 to -10

//TO-DO by the students

Construct in Python a program that writes a file "testOutput.txt" with all the **even** numbers from 100 to 1000

Implementation 1:

```python!
file = open("testOutput.txt", "w")

for x in range(100,1001,2):
    string = str(x)
    file.write(string)
    file.write("\n")

file.close()
```

Implementation 2 (if you don't remember steps on range)
```python!
# even numbers from 100 to 1000 

file = open("testOutput.txt", "w")

for x in range(100,1001):
    if(x%2==0):
        string = str(x)
        file.write(string)
        file.write("\n")

file.close()
```
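As a reminder of how the parameters of `range()` work (start, stop and step), here is a quick sketch. The numbers are arbitrary examples.

```python
counting = list(range(1, 6))      # start at 1, stop BEFORE 6
stepping = list(range(0, 11, 5))  # the third parameter is the step
backwards = list(range(3, 0, -1)) # a negative step counts down
empty = list(range(3, 0))         # counting down without a negative step gives nothing

print(counting)   # [1, 2, 3, 4, 5]
print(stepping)   # [0, 5, 10]
print(backwards)  # [3, 2, 1]
print(empty)      # []
```

Remember that the stop value is never included, which is why the implementations above use 1001 to reach 1000.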

Construct in Python a program that writes a file "testOutput.txt" with all the 3-digit numbers whose digits sum to 9 (such as 117 or 333), starting with the smallest one.

//TO-DO by the students

### Replacing quote of Shakespeare

First version:

```python!
# to be or not to be by Shakespeare

quote = "To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles And by opposing end them. To die—to sleep"

file = open("shakespeareQuote.txt", "w")

vowels = ["a", "e", "i", "o", "u"]
quoteA = quote
for vowel in vowels:
    quoteA = quoteA.replace(vowel, "a")
file.write(quoteA)
file.write("\n")
file.close() #close the file once we have finished writing
```

Second version 

```python!
# to be or not to be by Shakespeare

quote = "To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles And by opposing end them. To die—to sleep"

file = open("shakespeareQuote.txt", "w")

vowels = ["a", "e", "i", "o", "u"]

for vowelSubstituting in vowels: 
    quoteOutput = quote
    for vowel in vowels:
        quoteOutput = quoteOutput.replace(vowel, vowelSubstituting)
    file.write(quoteOutput)
    file.write("\n")
file.close() #close the file once all the versions are written
```

### Problem of using indexes

I usually say that indexes are good to use with lists because sometimes __you need to__. But with files we can't use them directly. If I write this script:

```python!
file = open("myText.txt")

print(file)
sum = 0
for lineIndex in range(len(file)):
  pass
```

The result would be this error, because a file object has no length (`len()` doesn't work on it):

![image](https://hackmd.io/_uploads/B1x3-v2OZg.png)

What we can do instead is keep our own counter while we loop through the lines:


```python!
file = open("myText.txt")

print(file)
lineIndex = 0
for line in file:
  print(lineIndex, line)
  if lineIndex == 2:
    print("This is the one that we need")
    break
  #doSomething
  lineIndex +=1 #update Index 


```
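Python also has the built-in function `enumerate()`, which gives the index and the line at the same time, so we don't need to update the counter ourselves. This is a sketch with an invented `myText.txt` created in a temporary folder:

```python
import os
import tempfile

# Demo file in a temporary folder (invented content)
folder = tempfile.mkdtemp()
route = os.path.join(folder, "myText.txt")
with open(route, "w") as file:
    file.write("Pipo\nPepa\nPepe\nThomas Jefferson\n")

found = None
with open(route) as file:
    for lineIndex, line in enumerate(file):  # lineIndex starts at 0
        print(lineIndex, line.strip())
        if lineIndex == 2:
            found = line.strip()
            print("This is the one that we need")
            break

print(found)  # Pepe
```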

### Exam-like exercise: Advent of Code

For this we're going to use this example from Advent of Code 2024: https://adventofcode.com/2024/day/1

The actual thing is a bit more complicated so let's start easy. Then we can continue with the different versions. 

This is the file that we need to read:

```
4   3
2   5
1   3
3   9
3   3
```

You can paste it into some input.txt or similar. 

If you want, you can test with the actual input file, which is long, so I'm putting it inside this spoiler:

:::spoiler
```
56208   95668
52621   74203
95252   33335
79799   26047
88005   37435
61887   93836
48454   95821
62543   40154
68631   73255
76702   79056
70838   34466
18680   46550
51787   55754
37646   48228
85603   29306
37768   75105
90135   21612
35469   74470
21245   49622
18699   49193
21078   61415
10873   95775
91781   60483
56840   42667
81094   12954
71762   61434
64670   22232
33536   77827
24607   75889
37829   78341
36203   82413
31747   95700
22170   79056
28403   89045
82265   37081
10713   74405
76191   43205
99737   87526
75120   72550
82607   40971
81396   47693
11533   39908
21138   40655
97567   42427
11179   89506
19548   50064
43080   93836
90155   37743
21664   58398
90621   89990
52392   94519
82246   62358
14857   73475
56878   69951
48921   56211
49545   83117
84329   12955
71390   48442
27461   28450
94969   86731
32344   14652
89304   37435
61343   61730
38097   85797
21480   31806
79330   58398
29803   73475
83903   58398
82961   37081
19048   16800
46338   45021
65120   97811
94574   18471
43129   37435
80135   20553
13860   30053
23228   37081
75373   99812
72438   43358
62029   81330
46550   11995
86921   92694
34790   20959
98862   85030
82648   53872
23156   43559
86507   43025
78535   78226
86410   58158
23699   18312
43083   25346
69925   11280
93761   82246
31314   38097
31775   47098
50720   24867
59647   66769
95275   17550
12756   82744
93464   99949
50801   55400
12063   60335
10930   12955
96820   63957
51672   40727
35874   40727
23608   79584
94056   80438
46387   50916
34304   37081
83311   93836
25295   43512
17263   63763
70673   30357
68716   11280
52238   27269
99670   49467
50516   93196
84006   78719
61795   33177
43228   87640
77557   99949
98130   64716
96939   61786
95849   79056
16209   89990
65420   44437
94486   11280
66223   99949
43524   86363
62859   61786
90225   84321
88954   98335
75285   17907
38457   73475
42857   24650
10995   32077
91376   57464
77255   90595
89361   86284
59270   46550
71756   57794
87966   17022
58398   55419
96091   37435
17150   87526
39389   12196
76990   67087
61924   34998
95291   73475
50147   84321
53391   84401
36506   93836
52291   27344
12731   50538
60534   89078
79539   10368
56787   99949
54143   40727
98095   75373
30744   52795
12926   38097
51613   50801
36198   65515
28401   27065
56046   78440
70329   71912
95114   36977
60740   47082
56549   11280
57725   26874
56639   89045
79567   29741
94022   76570
96358   90832
12302   95795
76023   82246
15612   97271
17185   90595
46709   45432
13154   46550
88519   43458
32560   90478
65035   71620
92462   61237
69416   79584
35688   21793
28438   39240
82325   27813
78531   84321
24899   36148
85012   89990
23508   67634
66264   60161
66503   38477
15986   91668
79146   63922
37435   85030
62752   85977
26512   81606
40841   82246
46251   73475
34026   19249
50440   41947
76774   11280
22007   60880
94909   87526
58404   84321
75320   18175
76222   89990
91724   15431
48733   37574
36629   12879
58515   21379
12471   45395
37709   79056
14932   47109
98634   80670
97279   38097
96785   84321
12591   89556
36934   61786
41593   32076
78262   38097
60695   50941
79770   93836
41948   58398
84058   63922
35466   34206
99842   86919
57631   53042
22167   33022
61786   68147
69070   89045
30553   48228
96972   46550
49251   79584
50994   75840
91178   59794
29903   18646
92270   63922
50470   19887
68328   87526
87788   29501
56579   59667
23154   37081
29012   69142
24751   89990
12191   38097
14613   62616
88157   75476
78319   74115
57573   37081
49634   88296
80779   23920
58363   45944
38626   23891
44119   43477
87012   92308
94285   19887
47957   58398
98671   73475
80957   90536
75724   97254
55207   34449
26929   46570
88177   67524
42538   87526
60062   34610
31875   89453
84401   51230
30370   37435
77609   26508
54400   89045
16998   76877
53993   49621
80345   11280
33089   83000
59754   54338
10070   58398
79056   47218
38998   94596
78946   93836
17022   58398
26488   96876
29505   48228
87585   84630
11881   50840
34065   89196
92780   57959
84213   19887
89873   64164
88432   76456
27932   48228
75295   51234
87532   48228
89045   79056
90024   40848
83717   70022
44095   35548
61389   31672
45153   80712
53642   46743
79895   27572
62353   63922
31718   43553
47638   16679
68079   37435
99980   82246
28989   53198
55315   72011
39285   57048
63939   79056
79274   44485
67372   46550
37081   89990
90006   46550
78226   14583
49740   39212
57801   58398
99700   89990
20514   78226
74207   90650
30895   37435
45690   48228
59599   95569
36570   43919
27098   46550
23232   58398
64885   62096
85030   17022
91242   68818
32735   79056
84321   35330
15556   73475
65535   20996
46287   47868
72870   88715
26102   28963
53627   60988
24886   17144
96189   76191
24759   45636
98950   93836
47061   90595
74001   11280
91701   50701
35581   79056
55474   50020
39955   95451
16183   78226
76584   61285
22474   97165
23396   37435
71483   50801
94355   26883
16187   82246
48892   19971
78206   44272
90897   53164
60606   54389
15976   30489
64132   87526
30030   56758
58524   78226
37832   38097
78244   99949
10124   28222
45015   95384
98771   10124
69753   82069
71588   56590
39250   55040
60302   75373
44720   75373
49715   89857
17921   63358
20595   41212
71304   86278
52931   42672
12734   79056
96838   51840
69950   91553
20134   46460
44631   57523
21834   82246
87705   67785
67003   90674
57615   46550
24713   27095
92252   37081
69531   40818
95550   98970
69979   52208
86183   56378
97271   66222
34116   56073
61862   26938
69230   60312
39386   73475
91297   27731
91115   71904
65373   32930
22264   84534
42510   24914
31022   11750
79868   79584
52225   11280
81993   22693
90622   29200
10788   36850
19596   74840
18745   89990
95360   48049
37314   40727
43291   88941
16658   42023
42096   91462
35810   67282
60361   64835
92458   79056
75435   68039
28125   44179
59855   53140
77935   17022
54641   63922
38395   79056
57487   60560
28437   42839
70988   11280
71731   44391
71274   97823
26290   50801
44854   28186
48594   90871
77287   73475
65614   84401
15083   12542
46738   84321
97269   92183
70388   79584
43702   50801
68959   40727
85936   67896
66276   61786
46788   79046
75138   51794
67218   16350
84121   37435
85923   46550
27696   64738
33242   97663
53268   90595
74163   13720
35519   37435
95673   76504
75926   50898
31828   46367
13255   34762
91251   19661
35165   60459
89359   57486
56104   68802
23120   10446
27928   94429
13036   87135
68217   89045
90595   77093
64564   19604
20580   38456
95473   99949
23762   19970
94918   49478
95915   73475
43165   21655
28604   10124
77991   89045
69168   12955
67281   36803
63168   34514
54261   92874
43629   11280
58472   56533
87513   85394
78278   38097
27186   53704
12183   15651
85071   34888
74115   71810
38589   42387
59858   58282
81623   50530
18220   40720
80203   37435
25112   95576
27487   51328
52735   36378
57441   83458
16583   74359
68008   88649
93205   93836
29428   76277
63669   73090
27809   91190
71814   87028
36345   69002
19141   71868
85043   10124
35017   38242
72320   53921
66740   17022
28399   78733
22592   81140
30299   45195
34919   40727
19792   89045
25687   93836
83082   90916
40314   82874
80873   63922
21542   11066
55959   94833
77334   37081
19353   21529
19186   73475
34259   95072
13064   74519
41193   49662
55457   72223
16381   65670
50435   19770
50032   50801
49806   46550
18336   37435
13333   90595
90334   19267
60180   82246
93184   62302
74440   26507
63011   87526
40458   91304
23721   57800
63922   17206
95151   38097
51880   98112
20539   54229
93904   62420
72851   46550
42399   61786
99078   63519
12254   84321
10577   44243
46056   37435
51235   39472
50954   12660
77199   56517
18283   63922
52859   87526
15246   98536
93836   97622
49522   88978
73549   40727
31118   75373
51627   61786
89566   93643
23542   37435
54992   11280
24990   37435
33159   19657
53499   17022
84995   72286
45818   50732
41169   14636
40120   26973
64286   84321
63432   30593
99949   24130
61077   17022
36433   82246
54816   59964
85544   84321
60397   52196
76341   37081
31479   79027
16260   17420
96337   10124
65067   62284
26483   44801
96326   17022
14058   54149
13393   65898
65901   53423
38500   57276
20118   79584
44353   40727
47498   49739
93041   47833
46970   46550
33688   75373
68223   90514
87884   94553
65159   37435
58329   74006
42199   37350
76135   20844
46639   93433
65739   49055
43290   40727
31619   86698
77506   86616
65377   56317
31884   84321
52311   81238
99796   93836
51116   95658
41167   73020
16379   18876
12306   97271
31299   48015
73188   70147
11052   42268
65938   89990
26961   98705
95494   58398
59265   23555
87502   32132
70660   84797
12091   89045
41613   26719
68419   20739
26943   58398
89349   89990
16651   17022
95467   28221
89990   98021
85616   86929
16479   51259
76664   83454
78928   74941
31423   45590
93437   82778
85851   52691
76954   15659
16646   74584
48296   46550
77084   70921
27964   99119
41032   78157
32094   82246
29499   49175
79737   13077
25423   68557
12219   58628
70282   29077
87976   11280
57871   43034
81209   58398
56626   45328
29128   28793
14975   38794
58884   36392
24278   46550
77033   27547
74285   73475
10130   32903
34728   11456
40336   17022
62399   45936
95138   53688
12955   16609
74887   93836
18948   93836
28427   26898
67749   80949
94522   17022
76210   82892
78717   36081
13088   80671
80352   18670
68720   41082
62146   69651
39371   84952
41619   27315
90093   58398
32716   45626
11280   27624
22129   94534
24812   10124
43238   86431
35791   97662
13162   53170
57534   15678
66069   63618
49841   11541
23345   38097
36649   79056
37193   76443
18519   64707
34081   61786
47018   75935
30452   48882
12559   58817
32825   91051
16051   19974
80171   17022
19374   54710
57295   84738
81314   38097
33479   49965
38611   79056
63785   49754
18774   70415
98510   72330
85801   73475
10744   17022
15097   11280
52514   37435
86860   96099
27804   22005
62601   79155
86020   70452
26991   15399
98690   69989
18491   79768
85009   79056
11329   20212
10782   70405
85026   40727
26888   34001
36315   42445
49098   37435
59474   38097
86676   41947
55178   84321
53985   87526
65575   47299
57292   82817
24657   37081
82204   10439
82639   73475
81107   74139
89946   97271
93306   41394
62863   46550
48547   86956
12374   55301
81681   95539
94195   84321
15303   24017
36395   93928
28572   10804
62931   45406
33902   88845
25697   83483
86956   51476
73957   55123
83525   29803
16581   19887
96012   90129
14067   40727
66384   38351
43782   75373
53517   29489
17126   71797
10854   68046
68594   89990
89786   17022
81737   11280
71138   84321
77457   62041
99540   54740
15369   87693
73475   39716
35348   99949
67944   97271
91345   20757
48703   99901
56450   84321
75169   70680
27015   45299
34175   85030
21184   64377
95716   97271
25738   70254
79049   75062
54904   68773
12813   47161
95857   74505
17456   87526
24008   30203
92925   17022
17008   98227
53399   13325
84080   49498
23522   62070
80392   37081
67497   89518
31797   46550
67380   29610
29237   47817
32106   13477
17711   61786
19802   35480
46456   37081
17560   67389
59598   64837
30808   10592
78213   86193
34411   89045
22714   61786
88368   79584
49172   37435
28106   30525
34595   65269
67363   90595
50389   33649
57932   74115
85665   15566
68846   84257
37420   75483
70969   85030
53989   74115
30509   81520
92232   63922
12314   64612
59372   76983
31019   75373
28072   58019
51728   67527
82597   36949
66512   46706
89805   89043
13220   72753
19121   31313
27852   89644
29411   97271
10756   88672
60773   82080
81746   11688
90653   23550
59400   14905
97489   44079
20132   40678
82210   43722
90436   80985
92889   18182
40727   19573
76922   17022
27554   78420
47592   87324
78580   75728
91390   63922
82409   80324
60005   53958
46353   90334
89672   26377
52436   11280
71495   41947
59787   93836
13621   79584
16093   31156
11603   51436
31593   12955
70829   89990
61100   96353
71072   61786
33450   17022
65493   48359
51250   44884
49691   61786
41947   61786
60973   93836
44926   87954
90989   35723
98883   30386
83974   67713
86576   17323
32534   77745
57475   32708
86499   43190
26325   71687
16737   58398
88460   17022
10798   61963
59899   54790
84722   56390
38117   75766
12595   97271
23267   78757
70504   97271
44104   63922
22950   85030
66137   84321
18407   99949
37598   61786
28201   24323
19404   79868
53725   33781
56941   41947
88231   83925
30916   73475
81019   46011
91386   23123
29455   79332
32603   42378
61578   46922
67431   12955
85884   15692
40799   89045
27326   97271
79681   11280
35614   38097
99407   54190
30288   10124
59805   66109
27031   12955
51952   46550
63420   17022
96282   17307
45123   17022
95858   32553
85339   98702
19887   81650
20177   66465
92548   84663
28915   38097
48995   58398
32153   97259
14697   55843
15115   31921
97258   21175
11833   75150
20476   37081
57780   87869
20868   16813
55154   72786
51855   79056
58551   94071
32367   60493
70926   16754
95196   94004
30949   32933
93955   10128
14553   61786
67116   11579
34377   85680
69983   79584
92220   38097
69116   11280
83291   10124
63499   30127
48167   97271
70014   13990
40226   78226
71699   12411
40455   75750
74381   11343
88846   74967
17219   79739
78747   30775
98201   18354
54571   68841
85738   49837
17375   10897
44862   46550
15932   11030
79584   17022
77641   58398
54882   38097
87246   75159
33635   41878
75249   60331
46712   55151
48228   91967
17264   39458
58907   86956
62693   87904
72122   43222
70553   73934
43027   90334
69420   97905
91527   10124
34146   96995
77998   88728
80279   91008
35125   77608
33786   63922
82027   38917
10855   49903
66407   38097
50926   49360
73591   97271
97759   84321
82742   49137
16002   57753
53455   38875
66703   40727
76244   92842
87526   49664
92147   87526
```
:::

First, we want to go line by line. Since the numbers are separated by spaces, we can use `split()` on each line:

```python!
file = open("input.txt")
for line in file:
    print(line.split())
```

If we do this with the first line of the extended file we're going to get this:

`['56208', '95668']`

See the quotes? Put a pin on those. 
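To see why `split()` with no parameters is handy here, compare it with splitting on a single space. The line is the first one of the extended file:

```python
line = "56208   95668\n"  # three spaces between the numbers

by_whitespace = line.split()      # any amount of spaces, and the \n, are removed
by_one_space = line.split(" ")    # every single space is a separator

print(by_whitespace)  # ['56208', '95668']
print(by_one_space)   # ['56208', '', '', '95668\n']
```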

Now let's suppose that we want to add these 2 numbers. If we do this:

```python!
file = open("input.txt")
for line in file:
    numbers = line.split()
    total = numbers[0] + numbers [1]
    print(total)
```

We're going to get this:

`5620895668` for the first line

This is an error. Do you know why? Give it a thought. 

:::spoiler
56208 + 95668 cannot be a number as big as 5620895668
:::

Why? Because `'56208'` (and `'95668'`) are strings, so if you put a + between them you're going to **concatenate them**. If we want to treat them as numbers we need to convert them to a numeric type (which is another type). For that we need to use **casting**.


#### Casting 


![image](https://hackmd.io/_uploads/BJnoYkSdZe.png)
_Mold and cast of a fossil_

Casting is treating one type of data as another type of data. Usually to cast we use the _constructor_ of the type that we want. For example, if we want the string "5432" to be taken as a number, we write `int("5432")`.

```python!
file = open("input.txt")
for line in file:
    numbers = line.split()
    total = int(numbers[0]) + int(numbers[1])
    print(total)
```
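A few more casting examples, as a quick sketch with arbitrary values. The last part shows what happens when the text is not a number:

```python
text = "5432"
number = int(text)            # from string to integer

as_number = number + 1        # arithmetic
as_text = text + "1"          # concatenation
decimal = float("3.5") * 2    # from string to float
back_to_text = str(42) + "!"  # from integer to string

print(as_number)     # 5433
print(as_text)       # 54321
print(decimal)       # 7.0
print(back_to_text)  # 42!

try:
    int("hello")              # this text is not a number
    cast_worked = True
except ValueError:
    cast_worked = False
    print("'hello' cannot be cast to int")
```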

#### Sum the values of the lines 

//TO-DO 

#### Sum the values of the columns 

//TO-DO 

## Exam exercises

### Maximum number in a file

There is a file called numbers.txt with 1000 lines of numbers. In the next figure you can see a fraction of these numbers:

![image](https://hackmd.io/_uploads/ByrMxfntWe.png)

 
Figure: extract from the file. 
Construct in Python an algorithm that outputs the highest number of this list. [4]

Solution:

:::spoiler
Option 1:

Using `max()`

```python!
file = open("numbers.txt")

numbers = list()
for line in file:
  numbers.append(int(line))

print(max(numbers))
```

Option 2:

Not using  `max()`

```python!
file = open("numbers.txt")

numbers = list()
for line in file:
  numbers.append(int(line))
maxNumber = numbers[0]
for number in numbers:
  if number>maxNumber:
    maxNumber = number
print(maxNumber)
```
:::

### Minimum number in a file 

There is a file called numbers.txt with 1000 lines of numbers. In the next figure you can see a fraction of these numbers:

![image](https://hackmd.io/_uploads/ByrMxfntWe.png)

 
Figure: extract from the file. 
Construct in Python an algorithm that outputs the lowest number of this list. [4]

### Write a file with a range of numbers

Construct in Python a program that writes a file called “output.txt” with the numbers from 3000 to 1000 in descending order, one number per line. [5]

Variation:

Construct in Python a program that writes a file called “output.txt” with the numbers from 1000 to 1 in descending order, one number per line. [5]



### Exam exercise the CSV of countries 

:::info
:link: This exercise uses 2D lists, which you can review here: https://hackmd.io/-ylq-DUoTKOS6qGCiq3JRA
:::

I have a CSV downloaded from the Canarian Statistics Institute (from the Canary Islands, Spain) with the names of the countries and their coordinates. The full project needs to create a map. The file has this structure (the grey text is the line number, which is not part of the file; the nonconsecutive number is part of the line):
![image](https://hackmd.io/_uploads/H10gmIvIZx.png)

 The information required is the name of the country and its coordinates (longitude and latitude). 

A- Construct in Python an algorithm that reads “data.csv” and creates a 2D list called `countries` with the following structure: [5]
![image](https://hackmd.io/_uploads/Sy_G7IDLbe.png)

:::info
This is way easier with splitting. To check a similar exercise focused on the split part you can go here: https://hackmd.io/y_afcRmLQcysv05xlSqRuQ#Process-data-from-a-line-in-a-csv-file
:::
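As a hint, this is how one line could be processed with `split(",")`. The line below is invented for the example and is NOT taken from the real data.csv. It only assumes that the name is the second field and the coordinates are the fourth and fifth.

```python
line = "724,Spain,ES,-3.7,40.4\n"  # invented line, not from the real file

fields = line.strip().split(",")   # strip() removes the \n at the end
name = fields[1]
firstCoordinate = float(fields[3])   # cast from string to float
secondCoordinate = float(fields[4])

row = [name, firstCoordinate, secondCoordinate]
print(fields)  # ['724', 'Spain', 'ES', '-3.7', '40.4']
print(row)     # ['Spain', -3.7, 40.4]
```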

Solution:
:::spoiler

Implementation 1:

```python!
file = open("data.csv") #one mark with the correct opening
countries =  []
for line in file: #one mark with the correct loop
    countries.append([])
    fields = line.split(",") #1 mark for the split
    countries[-1].append(fields[1]) #append the second element
    countries[-1].append(float(fields[3])) #if you miss the casting done with float it's ok
    countries[-1].append(float(fields[4])) #if you miss the casting done with float it's ok
    
```

Implementation 2:


```python!
file = open("data.csv") #one mark with the correct opening
countries =  []
row = 0
for line in file: #one mark with the correct loop
    countries.append([])
    fields = line.split(",") #1 mark for the split
    countries[row].append(fields[1]) #append the second element
    countries[row].append(float(fields[3])) #if you miss the casting done with float it's ok
    countries[row].append(float(fields[4])) #if you miss the casting done with float it's ok
    row = row +1
```

Implementation 3 (forgot how to split):

```python!
file = open("data.csv") #one mark with the correct opening
countries =  []
for line in file: #one mark with the correct loop
    countries.append([])
    fieldIndex = 0
    currentField = ""
    for char in line:
        if char != "," and char != "\n": #\n is the special character for "end of the line" I could give the mark if forgot
            currentField = currentField + char
        else:
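            #a field has just ended: keep it only if it is the name or a coordinate
            if fieldIndex == 1 or fieldIndex == 3 or fieldIndex == 4:
                countries[-1].append(currentField) #the coordinates stay as strings here
            fieldIndex = fieldIndex + 1
            currentField = ""
    #if the last line of the file has no "\n", its last field is still pending
    if currentField != "" and fieldIndex == 4:
        countries[-1].append(currentField)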
```
Another implementation without `split` and without checking for "\n", using an index to loop through the line:

```python!
file = open("data.csv") #one mark with the correct opening
countries =  []
for line in file: #one mark with the correct loop
    countries.append([])
    fieldIndex = 0
    currentField = ""
    for lineIndex in range(len(line)): #lineIndex can be named li or something shorter in an exam
        if line[lineIndex] != "," and lineIndex != len(line)-1: #the last character is assumed to be the "\n"
            currentField = currentField + line[lineIndex]
        else:
            if fieldIndex == 1 or fieldIndex == 3 or fieldIndex == 4:
                countries[-1].append(currentField)
                #this implementation doesn't cast the numbers to float
            fieldIndex = fieldIndex + 1
            currentField = ""
```

Another implementation with no split:

```python!
file = open("data.csv") #one mark with the correct opening
countries =  []
for line in file: #one mark with the correct loop
    countries.append([
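        line[line.find(",")+1:line.find(",PAISES")], #name: between the first comma and ",PAISES"
        line[line.find(",PAISES,")+8:line.rfind(",")], #first coordinate: ",PAISES," has 8 characters
        line[line.rfind(",")+1:].strip() #second coordinate: after the last comma, without the "\n"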
    ])
        
```

:::
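
To try the idea without the real file, here is a minimal sketch that first writes a tiny sample file and then builds `countries` from it. The file name `sample_data.csv` and its two rows are invented for illustration: they only imitate the column positions used in the solutions (name in position 1, coordinates in positions 3 and 4), so the real `data.csv` may differ.

```python
# Invented sample rows: the names and numbers are placeholders, not real data
sampleRows = [
    "10,CountryA,PAISES,1.5,2.5\n",
    "25,CountryB,PAISES,-3.0,4.25\n",
]
with open("sample_data.csv", "w") as sampleFile:
    sampleFile.writelines(sampleRows)

countries = []
with open("sample_data.csv") as file:  # "with" closes the file automatically
    for line in file:
        fields = line.strip().split(",")  # strip() removes the final "\n"
        countries.append([fields[1], float(fields[3]), float(fields[4])])

print(countries)
```

Using `with` here is a design choice, not a requirement of the exercise: it makes sure the file is closed even if one of the lines cannot be converted with `float`.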

This exercise continues with the validation of the 2D list here: https://hackmd.io/RCQ32fpjS-up6rYC74jPoQ#Validating-country-data


## Reference 

[W3Schools](https://www.w3schools.com/python/python_file_handling.asp)

## Extra: copying files for a printer


As far as I know, the IB doesn't ask you to use the `shutil` library (I guess the name means "shell utility", the shell being the terminal). Here is some fancy code for something that I needed to do, to automate the boring stuff.

I want to print some cards at a printer's, and they print them in decks of 54 cards (like poker decks). So I need to send one file per card. Usually side A (Anverso in Spanish) and side B (Reverso in Spanish) will vary from card to card. Here we're going to use an example where every card has the same side A and the same side B.


:::warning
Important: I'm using a **relative route** to write here, but the folder **needs to exist** or you will get a "No such file or directory" error.
:::
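
One way to avoid that error is to create the folders from the program before copying. This is a small sketch, assuming the folder names `Anversos` and `Reversos` used in the code of this section; `os.makedirs` with `exist_ok=True` does nothing if the folder is already there.

```python
import os

# Create the destination folders if they don't exist yet
os.makedirs("Anversos", exist_ok=True)
os.makedirs("Reversos", exist_ok=True)
```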

```python=
import shutil


numberOfCopies = 54 

sourcePathSideA = "Anverso.pdf"
sourcePathSideB = "Reverso.pdf"

for x in range(numberOfCopies):
    #This if is so I create Anverso 01 instead of Anverso 1 and have the same number of characters
    if x <9:
        stringNumber = "0" + str(x+1)
    else: 
        stringNumber = str(x+1)
    copySideARoute= "Anversos/Anverso "+ stringNumber + ".pdf"
    copySideBRoute= "Reversos/Reverso "+ stringNumber + ".pdf"    
    shutil.copy(sourcePathSideA, copySideARoute)
    shutil.copy(sourcePathSideB, copySideBRoute)

```

And with fewer than 20 lines of code we're done! 
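
The `if` that adds the leading zero can also be written with the string method `zfill`, which pads a string with zeros on the left until it reaches the given length. A sketch of just the naming part (no files are copied here, and the list `copyRoutes` only exists for this example):

```python
numberOfCopies = 54

copyRoutes = []
for x in range(numberOfCopies):
    stringNumber = str(x + 1).zfill(2)  # "1" becomes "01", "54" stays "54"
    copyRoutes.append("Anversos/Anverso " + stringNumber + ".pdf")

print(copyRoutes[0])   # Anversos/Anverso 01.pdf
print(copyRoutes[-1])  # Anversos/Anverso 54.pdf
```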

::: info
I don't use `open` and `read` here because PDF files have some particularities with encoding (they can have different encodings inside them), so using this tool is easier. Here is the code that should work with plain text files:

:::danger
Not working code (with PDF files)
```python
numberOfCopies = 54 

sideAOriginal = open("Anverso.pdf", encoding= "utf-8") #sideA
sideAContent = sideAOriginal.read() #read only once: a second read() would return an empty string
sideAOriginal.close()
sideBOriginal = open("Reverso.pdf", encoding= "utf-8") #sideB
sideBContent = sideBOriginal.read()
sideBOriginal.close()

for x in range(numberOfCopies):
    #This if is so I create Anverso 01 instead of Anverso 1 and have the same number of characters
    if x <9:
        stringNumber = "0" + str(x+1)
    else: 
        stringNumber = str(x+1)
    copySideARoute= "Anversos/Anverso_"+ stringNumber + ".pdf"
    copySideBRoute= "Reversos/Reverso_"+ stringNumber + ".pdf"
    with open(copySideARoute, "w", encoding= "utf-8") as copySideA:
        copySideA.write(sideAContent)
    with open(copySideBRoute, "w", encoding= "utf-8") as copySideB:
        copySideB.write(sideBContent)

```
:::
:::
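
If you want to copy a file such as a PDF using only `open`, one option is binary mode (`"rb"` to read and `"wb"` to write), which moves the bytes without decoding them as text. A minimal sketch, assuming a small made-up file called `original.bin` so the example can run on its own:

```python
# Create a small made-up binary file so the example is self-contained
with open("original.bin", "wb") as original:
    original.write(bytes([37, 80, 68, 70, 0, 255, 10]))

# Read all the bytes once...
with open("original.bin", "rb") as source:
    content = source.read()

# ...and write them unchanged to the copy
with open("copy.bin", "wb") as copy:
    copy.write(content)
```

No `encoding` is passed in binary mode, because nothing is converted to text.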

### Reference for the copying part

https://stackoverflow.com/questions/123198/how-do-i-copy-a-file
https://docs.python.org/3/library/shutil.html#directory-and-files-operations

## Snippets


```python!

fileData= str(list(range(1,11)))[1:-1]
with open ("testfile.txt", "w") as file:
    file.write(fileData) #write() is the method for a single string
```



```python!
fileData= str(list(range(1,11)))[1:-1].replace(", ", "\n")
with open ("testfile.txt", "w") as file:
    file.write(fileData) #no close() needed: "with" closes the file
```

```python!
myList = list(range(1,11))
string = str(myList)[1:-1]
stringWithLines = string.replace(", ", "\n")
with open ("testfile.txt", "w") as file:
    file.write(stringWithLines) #no close() needed: "with" closes the file
```


```python!
#prepare the string
myList = list(range(1,10001))
string = str(myList)[1:-1]
stringWithLines = string.replace(", ", "\n")
#write the file 
with open("testfile.txt", "w") as file:
    file.write(stringWithLines) #no close() needed: "with" closes the file
```

```python!
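file = open("testinput.txt") #each line is expected to have numbers separated by spaces
print(file) #prints the file object, not its content
for line in file:
    print(line)
    splittedLine = line.split(" ")
    mappedLine = map(float, splittedLine) #map applies float to every piece (lazily)
    print(mappedLine) #prints a map object, not the numbers
    numbers = list(mappedLine) #list() forces the conversion
    print(numbers)
    numbers = list(map(int, line.split(" "))) #same in one line, with integers (fails if there are decimals)
file.close()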
```


```python!
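file = open("patata.txt")

start = True #Flag variable to skip the first line (for example, a header)
for line in file:
  if start:
    start = False
    continue
  if line.strip() != "": #example condition: replace it with your own
    print(line) #do something with the line
file.close()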
```


```python!
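file = open("patata.txt")

foundIncludedPlugins = False
plugins = []
for line in file:
  if line.strip() == "Included plugins:": #strip() removes the "\n" at the end of the line
    foundIncludedPlugins = True
  if foundIncludedPlugins and line[0] == " ": #indented lines after the title are the plugins
    plugins.append(line)
file.close()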
```       

