In the first assignment you get familiar with the Python interpreter. In the second assignment you write your first Python program using an editor. The instructions below explain how to use both the Python interpreter and an editor. Good luck!
Load the module for Python 3 with the command
$ module load python/3.6.0
Open the Python interpreter with the command
$ python3
You should then see the prompt >>> at the beginning of the line. In this exercise we use only the Python interpreter. You can leave the Python interpreter by typing quit().
Enter print("Assignment7") in the Python interpreter.
>>> print("Assignment7")
Assignment7
Enter i = 10 in the Python interpreter and then (on a new line) print(i). After that (on a new line) enter j = i/2 and (on a new line) print(j).
>>> i = 10
>>> print(i)
10
>>> j = i/2
>>> print(j)
5.0
Assign the string black magic to a variable named 7Assignment. Don’t forget to put the string in quotation marks (" ").
>>> 7Assignment = "black magic"
File "<stdin>", line 1
7Assignment = "black magic"
^
SyntaxError: invalid syntax
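The error above arises because a Python variable name must not start with a digit. As a small sketch, the string method isidentifier() can be used to check candidate names before using them. Apart from 7Assignment, the names below are only illustrations and not part of the exercise.

```python
# Check which candidate names are valid Python identifiers.
# Only "7Assignment" comes from the exercise; the other names are illustrative.
candidates = ["7Assignment", "Assignment7", "assignment_7"]

validity = {name: name.isidentifier() for name in candidates}
for name, ok in validity.items():
    print(name, "->", "valid" if ok else "invalid")
```

Moving the digit to the end of the name, as in Assignment7, is enough to make the assignment work.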
Assign the sequence AGCTA to the variable A. Then use the built-in function len() to determine the length of the sequence A and assign the length of A to variable i. Print A and i.
>>> A = "AGCTA"
>>> i = len(A)
>>> print(A, i)
AGCTA 5
>>> print(A + i)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: must be str, not int
Now enter print(A + str(i)).
>>> print(A + str(i))
AGCTA5
Hint: What might the built-in function str() do? There are also other built-in functions, e.g., int() to convert a string or number to an integer, or float() to convert a string or number to a floating-point number.
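The following sketch illustrates the three conversion functions mentioned in the hint. The values "42" and "3.5" are made up for illustration and do not appear in the exercise.

```python
# Converting between strings, integers, and floating-point numbers.
A = "AGCTA"
i = len(A)            # 5, an integer

label = A + str(i)    # str() turns the integer into the string "5"
n = int("42")         # int() parses a string that contains a whole number
x = float("3.5")      # float() parses a string that contains a decimal number
j = int(i / 2)        # i / 2 is the float 2.5; int() drops the fractional part

print(label, n, x, j)
```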
>>> print(A[1:4])
GCT
>>> print(A[:2])
AG
>>> print(A[-2:])
TA
Hint: Don’t forget to indent the body of the for-loop.
>>> for i in range(len(A)):
...     print(i)
...
0
1
2
3
4
Execute the same for-loop a second time and print out the character at each position of string A using A[i] as well.
>>> for i in range(len(A)):
...     print(i, A[i])
...
0 A
1 G
2 C
3 T
4 A
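As a possible alternative to indexing with A[i], the built-in function enumerate() yields the position and the character together. This is only a sketch and is not required for the exercise; the list pairs is a hypothetical helper that collects the results.

```python
A = "AGCTA"

pairs = []  # hypothetical helper list, used only to collect the results
for i, base in enumerate(A):
    # i is the position, base is the character at that position
    print(i, base)
    pairs.append((i, base))
```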
Now print only the positions i that fulfill the condition i < len(A)/2.
>>> for i in range(len(A)):
...     if (i < len(A)/2):
...         print(i, A[i])
...
0 A
1 G
2 C
>>> i = 0
>>> while (i < len(A)/2):
...     print(i, A[i])
...     i = i + 1
...
0 A
1 G
2 C
>>> print(A)
AGCTA
Leave the interactive mode of Python with quit().
Now return to the interactive mode of Python and print the variable A.
What happens now and why?
>>> print(A)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'A' is not defined
Open your favorite editor (nano, idle3, gedit, etc.) and write your first Python program in a file named compare.py.
Hint: When you type
$ idle3 compare.py &
in the terminal, a new prompt line should appear in the terminal (if not, press <Ctrl-C>). Then you can run your program in the same terminal window:
$ python3 compare.py
The advantage is that you can edit your program and switch easily between the editor and the terminal window.
Does your program work?
2a)
i = 3
j = 4
if (i == j):
    print(1)
else:
    print(0)
0
2b)
i = 10
j = 10
if (i == j):
    print(1)
else:
    print(0)
1
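The comparison i == j itself already produces a truth value (True or False). As a hedged sketch of a shorter variant, that value can be converted with int() to obtain the same 1 or 0. The names result_a and result_b are hypothetical and used only for illustration.

```python
# Values from 2a
i = 3
j = 4
result_a = int(i == j)   # i == j is False, and int(False) is 0
print(result_a)

# Values from 2b
i = 10
j = 10
result_b = int(i == j)   # i == j is True, and int(True) is 1
print(result_b)
```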
Congratulations, you have now completed the basic Python exercises for this session. If you finished early or just want to try somewhat harder exercises, please continue with the bonus exercises below.
In this exercise we write a short Python program named <program_name>.py. Think of a reasonable program name, name your file accordingly, and replace <program_name> with that name.
Choose two variables, e.g. A and B, and assign the sequences GATTACA and TACCATAC to these variables. Make sure that the two sequences are assigned as strings to their variables A and B. Then print these sequences.
Save everything you wrote and close the editor. Then you can run your program: python3 <program_name>.py
A = "GATTACA"
B = "TACCATAC"
print("sequence A: ", A)
print("sequence B: ", B)
Then extend your program:
1.1 Concatenate both sequences in both ways (AB and BA) and print both options.
A = "GATTACA"
B = "TACCATAC"
print("sequence A + B: ", A + B)
print("sequence B + A: ", B + A)
sequence A + B: GATTACATACCATAC
sequence B + A: TACCATACGATTACA
1.2 Print prefixes and suffixes of length 3 of both sequences A and B. Use the built-in function len() to determine the suffixes.
print("prefix A: ", A[:3])
print("prefix B: ", B[:3])
suffix_A = len(A) - 3
suffix_B = len(B) - 3
print("suffix A: ", A[suffix_A:])
print("suffix B: ", B[suffix_B:])
# It is also possible to use a negative index
# to count from the end:
print("suffix A: ", A[-3:])
print("suffix B: ", B[-3:])
prefix A: GAT
prefix B: TAC
suffix A: ACA
suffix B: TAC
1.3 Print out the second sequence from the last to the first position (last position first, first position last).
for i in range(len(B)):
    print(B[len(B) - i - 1])
C
A
T
A
C
C
A
T
1.4 Assign this inverted sequence to a third variable (you could use the variable name C) and print the value of this variable.
C = ""
for i in range(len(B)):
    C = C + B[len(B) - i - 1]
print("inverted sequence B: ", C)
# Another way to reverse a string is to slice it with a negative step
B = "TACCATAC"
C = B[::-1]
print("inverted sequence B: ", C)
inverted sequence B: CATACCAT
1.5 Print out the middle base of each sequence. When a sequence has an even number of bases, print out the base just right of the middle. Use the built-in function len() for this task.
For example: for A = "GTCA" the program should print out C.
Hint: There exist built-in functions to convert a number to an integer.
print("middle base of A: ", A[int(len(A)/2)])
print("middle base of B: ", B[int(len(B)/2)])
print("middle base of C: ", C[int(len(C)/2)])
middle base of A: T
middle base of B: A
middle base of C: C
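Instead of int(len(A)/2), the integer division operator // can be used, which gives an integer directly. The sketch below assumes the same sequences A and B as above; the variable example holds the sequence from the task text, and the names middle_A, middle_B, and middle_example are hypothetical.

```python
A = "GATTACA"      # odd length: 7
B = "TACCATAC"     # even length: 8
example = "GTCA"   # example sequence from the task text

# len(...) // 2 is the middle index; for an even length it is
# the position just right of the middle.
middle_A = A[len(A) // 2]
middle_B = B[len(B) // 2]
middle_example = example[len(example) // 2]

print(middle_A, middle_B, middle_example)
```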
1.6 Count how often each base occurs in the first sequence (how often G occurs, then A, and so on) and print out this number for each base.
A = "GATTACA"
for i in ['A','C','G','T']:
    n = 0
    for j in A:
        if (i == j):
            n += 1
    print(i, "=", n)
A = 3
C = 1
G = 1
T = 2
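The nested loops above can also be replaced by a counting helper from the standard library. The following is a sketch of one such alternative using collections.Counter; it is not the solution expected in the exercise.

```python
from collections import Counter

A = "GATTACA"

# Counter counts how often each character occurs in the string.
counts = Counter(A)
for base in "ACGT":
    print(base, "=", counts[base])
```

A Counter returns 0 for a base that does not occur at all, so the loop also works for sequences that lack one of the four bases.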
1.7 Count how often TA occurs in the second sequence and print out this number.
B = "TACCATAC"
n = 0
# Stop one position before the end so that B[i+1] always exists
for i in range(len(B) - 1):
    if ((B[i] == "T") and (B[i+1] == "A")):
        n += 1
print("TA occurs ", n, " times.")
TA occurs 2 times.
or
B = "TACCATAC"
print('TA occurs' , B.count('TA') , 'times.')
TA occurs 2 times.
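Both solutions agree for TA, but note that the string method count() counts non-overlapping occurrences only. For a motif that can overlap with itself the results may differ. The sketch below uses a hypothetical helper count_overlapping and the made-up sequence AAAA to show the difference.

```python
def count_overlapping(sequence, motif):
    """Count occurrences of motif in sequence, allowing overlaps (hypothetical helper)."""
    n = 0
    # Slide a window of the motif's length over the sequence.
    for i in range(len(sequence) - len(motif) + 1):
        if sequence[i:i + len(motif)] == motif:
            n += 1
    return n

B = "TACCATAC"
# TA cannot overlap with itself, so both approaches agree.
print(count_overlapping(B, "TA"), B.count("TA"))
# AA can overlap with itself, so the results differ.
print(count_overlapping("AAAA", "AA"), "AAAA".count("AA"))
```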
Write the program product.py in an editor, as introduced in the lecture; it calculates the product of the two numbers 456 and 15. Save the program code as product.py and run the program as described in the previous assignment.
2.1 Now additionally calculate the product of 234 and 24 and print out both products (results) on one single line.
x = 456
print("x = ", x)
y = 15
print("y = ", y)
product1 = x*y
x = 234
print("x = ", x)
y = 24
print("y = ", y)
product2 = x*y
print("Products: ", product1, product2)
# or, with a separate variable name for each number
x1 = 456
y1 = 15
x2 = 234
y2 = 24
print("Products: ", x1*y1, x2*y2)
Products: 6840 5616
2.2 Change the program so that all numbers 456, 15, 234, and 24 are saved in one list l. Change the print statement so that each number gets printed and also the product of the first two and the last two numbers.
l = [456, 15, 234, 24]
i = 0
while (i < 3):
    print("Product of : ", l[i], " and ", l[i+1], " is: ", l[i]*l[i+1])
    i += 2
Product of : 456 and 15 is: 6840
Product of : 234 and 24 is: 5616
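The while-loop above steps through the list two positions at a time. A sketch of the same idea with a for-loop and a step size of 2 in range() is given below; the list products is a hypothetical addition that collects the results.

```python
l = [456, 15, 234, 24]

products = []  # hypothetical helper list, used only to collect the results
# range(0, len(l) - 1, 2) yields 0 and 2: the first index of each pair.
for i in range(0, len(l) - 1, 2):
    products.append(l[i] * l[i + 1])
    print("Product of", l[i], "and", l[i + 1], "is:", l[i] * l[i + 1])
```

Because the stop value depends on len(l), this version keeps working if more pairs of numbers are appended to the list.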
Write a program in an editor that has three lists l, m, and n. Each list contains several sequences. Save and run the program as described previously.
l: AGGTC, GATC, CTGCA, ATTCGT, ATGGT, GATC
m: CTGCA, GATC
n: CUAGCUA, GTATGG, GUAUC, GTAG
Note: Remember to store all sequences as strings in each of the lists. Extend your program so that it can perform the following tasks.
3.1 Print each sequence in list l.
l = ["AGGTC", "GATC", "CTGCA", "ATTCGT", "ATGGT", "GATC"]
m = ["CTGCA", "GATC"]
n = ["CUAGCUA", "GTATGG", "GUAUC", "GTAG"]
for seq in l:
    print(seq)
AGGTC
GATC
CTGCA
ATTCGT
ATGGT
GATC
3.2 Print the first and last sequence in list l.
print(l[0], l[len(l)-1])
# or shorter version using a negative index number
print(l[0], l[-1])
AGGTC GATC
3.3 Take the base at the second position of each sequence in list l, store these bases together in a new variable, and print this new sequence.
seq = ""
for i in l:
    seq = seq + i[1]
print(seq)
GATTTA
3.4 Add this new sequence to the list l.
l.append(seq)
3.5 How long is list l now? Print out the length of list l.
print(len(l))
7
3.6 Delete the second sequence of list l. Print list l and its length.
l = [l[0]] + l[2:]
print(l, len(l))
['AGGTC', 'CTGCA', 'ATTCGT', 'ATGGT', 'GATC', 'GATTTA'] 6
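Rebuilding the list from slices works, but lists also offer the method pop(), which removes the element at a given index in place and returns it. The sketch below assumes list l as it looks after task 3.4; the variable removed is a hypothetical name.

```python
# List l as it looks after task 3.4
l = ["AGGTC", "GATC", "CTGCA", "ATTCGT", "ATGGT", "GATC", "GATTTA"]

removed = l.pop(1)   # removes the element at index 1 and returns it
print(removed)
print(l, len(l))
```

The statement del l[1] has the same effect on the list when the removed element is not needed.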
3.7 Divide the new list l into two equal parts and store the first half in a new list l1 and the second half in a new list l2. Print both lists.
# Use integer division (//), because a slice index must be an integer
half = len(l)//2
l1 = l[:half]
l2 = l[half:]
print("l1: ", l1)
print("l2: ", l2)
l1: ['AGGTC', 'CTGCA', 'ATTCGT']
l2: ['ATGGT', 'GATC', 'GATTTA']
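A slice index has to be an integer, so integer division (//) is a natural fit for splitting a list. The sketch below uses a hypothetical list seqs with five entries to show what happens when the length is odd: the second part receives the extra element.

```python
# Hypothetical list with an odd number of entries
seqs = ["AGGTC", "CTGCA", "ATTCGT", "ATGGT", "GATC"]

half = len(seqs) // 2    # 5 // 2 is 2
first = seqs[:half]      # the first two entries
second = seqs[half:]     # the remaining three entries

print("first: ", first)
print("second: ", second)
```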
3.8 Concatenate list l2 and list l1 (in this order) and store the result in a new list l3.
l3 = l2 + l1
print(l3)
['ATGGT', 'GATC', 'GATTTA', 'AGGTC', 'CTGCA', 'ATTCGT']
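3.9 Remove all sequences in list l that are also present in list m.
for seq_m in m:
    # Loop over a copy (l[:]), because l is changed inside the loop
    for seq_l in l[:]:
        if (seq_m == seq_l):
            l.remove(seq_l)
# or
for seq_m in m:
    # while instead of if also removes repeated occurrences
    while seq_m in l:
        l.remove(seq_m)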
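Instead of removing elements in place, a new list can be built that keeps only the sequences not found in m. The following sketch uses a list comprehension for this and assumes list l as it looks after task 3.6.

```python
# List l as it looks after task 3.6
l = ["AGGTC", "CTGCA", "ATTCGT", "ATGGT", "GATC", "GATTTA"]
m = ["CTGCA", "GATC"]

# Keep every sequence of l that does not occur in m.
l = [seq for seq in l if seq not in m]
print(l)
```

This avoids changing a list while looping over it, which can cause elements to be skipped.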
3.10 Invert all sequences in the new list l and store them in a new list l4.
l4 = []
for seq in l:
    # seq[::-1] reverses one sequence; l[::-1] would reverse the order of the list instead
    l4.append(seq[::-1])
print(l4)
['CTGGA', 'TGCTTA', 'TGGTA', 'ATTTAG']
3.11 In list n a few RNA sequences (U instead of T) are present. Convert them to DNA sequences (T instead of U).
When you print the resulting list it should contain the following sequences in this order: CTAGCTA, GTATGG, GTATC, and GTAG.
nDNA = []
for seq in n:
    # find() returns -1 when the sequence contains no U
    if (seq.find("U") != -1):
        new_seq = seq.replace('U', 'T')
        nDNA.append(new_seq)
    else:
        nDNA.append(seq)
print(nDNA)
['CTAGCTA', 'GTATGG', 'GTATC', 'GTAG']
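The method replace() returns the string unchanged when it contains no U, so the test for U is optional. A shorter sketch of the same conversion with a list comprehension could look like this:

```python
n = ["CUAGCUA", "GTATGG", "GUAUC", "GTAG"]

# replace() leaves sequences without U untouched.
nDNA = [seq.replace("U", "T") for seq in n]
print(nDNA)
```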
UPPMAX
Intro course