---
title: 'Tâm - Week 2'
tags: CoderSchool, Mariana
---
Week 2
===
## Table of Contents
[TOC]
## Monday
### Web crawler presentation
> Our group Ly-Tam github repo
https://github.com/thtamho/tiki-scraping
- Tuc and Sean has a responsive webpage, which you can click on the card or buy now to go to product page. They also display TikiNOW icon
- Dicky and Tan button also redirects to product page
- Tien and Ngoc displays big categories then sub-categories then finally the smallest sub-categories displaying the products
### Sorting: Bubble & Merge
#### Bubble Sort

```PYTHON=
def bubble_sort(A):
n = len(A)
for j in range(n-1):
for i in range(n-1):
if A[i] > A[i+1]:
A[i], A[i+1] = A[i+1], A[i]
return A
```
> Pretty easy [name=Tam Ho]
#### Merge Sort
```PYTHON=
def merge(a, b):
a_idx, b_idx = 0, 0
a_len, b_len = len(a), len(b)
merged_list = []
while a_idx < a_len and b_idx < b_len:
if a[a_idx] < b[b_idx]:
merged_list.append(a[a_idx])
a_idx += 1
else:
merged_list.append(b[b_idx])
b_idx += 1
# Checking if any element was left
if a_idx == len(a):
merged_list+=b[b_idx:]
if b_idx == len(b):
merged_list+=a[a_idx:]
return merged_list
```
```PYTHON=
def merge_sort(l):
print('l: ',l,sep=' ')
if len(l) < 2:
return l
mid = len(l)//2
lower = l[:mid]
upper = l[mid:]
lower_sorted = merge_sort(lower)
upper_sorted = merge_sort(upper)
print('lower_sorted: ',lower_sorted,sep=' ')
print('upper_sorted: ',upper_sorted,sep=' ')
print('merged: ', merge(lower_sorted,upper_sorted),sep=' ')
return merge(lower_sorted, upper_sorted)
merge_sort([1,4,-2])
```
```gherkin=
l: [1, 4, -2]
l: [1]
l: [4, -2]
l: [4]
l: [-2]
lower_sorted: [4]
upper_sorted: [-2]
merged: [-2, 4]
lower_sorted: [1]
upper_sorted: [-2, 4]
merged: [-2, 1, 4]
[-2, 1, 4]
```
```PYTHON=
merge_sort([1,4,-2,9,8,3,5,-6])
```
```gherkin=
l: [1, 4, -2, 9, 8, 3, 5, -6]
l: [1, 4, -2, 9]
l: [1, 4]
l: [1]
l: [4]
lower_sorted: [1]
upper_sorted: [4]
merged: [1, 4]
l: [-2, 9]
l: [-2]
l: [9]
lower_sorted: [-2]
upper_sorted: [9]
merged: [-2, 9]
lower_sorted: [1, 4]
upper_sorted: [-2, 9]
merged: [-2, 1, 4, 9]
l: [8, 3, 5, -6]
l: [8, 3]
l: [8]
l: [3]
lower_sorted: [8]
upper_sorted: [3]
merged: [3, 8]
l: [5, -6]
l: [5]
l: [-6]
lower_sorted: [5]
upper_sorted: [-6]
merged: [-6, 5]
lower_sorted: [3, 8]
upper_sorted: [-6, 5]
merged: [-6, 3, 5, 8]
lower_sorted: [-2, 1, 4, 9]
upper_sorted: [-6, 3, 5, 8]
merged: [-6, -2, 1, 3, 4, 5, 8, 9]
[-6, -2, 1, 3, 4, 5, 8, 9]
```
### Time complexity
- we have
- $O(n)$, if we have a linear procedure
- $O(n^2)$, if we have for loop within a for loop both of length n
- $O(log_a n)$, if we have an algorithm that divides recursively by 2 (or by a number $a$)
:::info
**NOTE:** Be careful not to use keywords in Python that is not intended to be used that way. That will overide the keywords and render the keywords unusable.
:::
### Object Oriented Programming
- define a class `class Class`
- Constructor `def __init__(self):`
- Python class is not similar to Java class. All of Python class variables are public, which means an instance can access class variables. An instance variable is also public can also be accessed and changed publicly
- Magic methods. For example
- `def __gt__(self, other)`
- `def __truediv__(self,other)`
- allows us to operate on instances of class without specifying `.variable`
- Inheritance `class Class(Parent)`
- New gcd function which works better (works with negative numbers and zero):
```PYTHON=
def gcf(a,b):
while b != 0:
t = b
b = a % b
a = t
return a
```
## Tuesday
### Regex
- `^` negation
- `\` escape a special character
- `[a-z]` match character `[0-9]` `[\d]` match digits
- `[]{n}` match group n times
- `()[\b|\s]?` may or may not have a boundary or a space at the end
- `*` may exists; `?` exists or not; `.` could be anything
> RegexOne
regexone.com
### Binary Search and Binary Search Tree
**This doesn't work**
```PYTHON=
def binary_search(lyst, key):
m = len(lyst)//2
print(m)
while lyst[m] != key:
if lyst[m] > key:
m = len(lyst[:m])//2
elif lyst[m] < key:
m = m//2 + m
print(m)
else:
return m
print(binary_search([1,2,3,4,5,6,7,8],1))
```
```gherkin=
4
2
1
0
0
```
**This works**
```PYTHON=
def binary_search(lyst,key):
right = len(lyst)-1
left=0
while left <= right:
mid = int((left+right)/2)
if key == lyst[mid]:
return mid
elif key < lyst[mid]:
right = mid -1
else:
left = mid+1
return -1
```
### Binary Search Tree
**Recursion Lvl 100**
```PYTHON=
class Node:
def __init__(self, data):
self.data = data
self.left = None
self.right = None
class BST:
def __init__(self):
'''constructor function'''
self.root = None
def search(self, data):
'''implement function to return true if data exists, false otherwise'''
return self._search(self.root, data)
def _search(self, node, data):
if node:
if node.data < data:
return self._search(node.right, data)
elif node.data > data:
return self._search(node.left, data)
elif node.data == data:
return True
else:
return False
def insert(self, data):
if self.root == None:
self.root = Node(data)
else:
self._insert(self.root, data)
def _insert(self, node, data):
if data < node.data:
if node.left == None:
node.left = Node(data)
else:
self._insert(node.left, data)
if data > node.data:
if node.right == None:
node.right = Node(data)
else:
self._insert(node.right, data)
def inorder(self):
'''traverse the bst using inorder traversal'''
self._inorder(self.root)
def _inorder(self, node):
if node:
self._inorder(node.left)
print(node.data)
self._inorder(node.right)
return False
def postorder(self):
'''traverse the bst using postorder traversal'''
self._postorder(self.root)
def _postorder(self, node):
if node:
self._postorder(node.left)
self._postorder(node.right)
print(node.data)
def preorder(self):
'''traverse the bst using preorder traversal'''
self._preorder(self.root)
def _preorder(self, node):
if node:
print(node.data)
self._preorder(node.left)
self._preorder(node.right)
bst = BST()
bst.insert(7)
bst.insert(5)
bst.insert(9)
bst.insert(10)
bst.insert(8)
bst.search(5)
# """
# 7
# 5 . 9
# 8 . 10
# """
```
### Bonus Google problem
```gherkin=
This problem was recently asked by Google.
Given a list of numbers and a number k, return whether any two numbers from the list add up to k.
For example, given [10, 15, 3, 7] and k of 17, return true since 10 + 7 is 17.
Bonus: Can you do this with one for-loop?
```
- Using itertools module
```PYTHON=
from itertools import combinations
def google(lyst, k):
comb = combinations(lyst,2)
for i in list(comb):
if k == sum(i):
return True
return False
```
- Not using extra modules
```PYTHON=
def google(lyst, k):
for i in range(len(lyst)):
if k - lyst[i] in lyst:
return True
return False
```
## Wednesday
### Database
- SQL is the language that is used in database. Other things such as PostgreSQL and SQLite are dialects of SQL
**Employee Table**
| ID | Name | Email |Supervisor |Dept|
| -------- | -------- | -------- |-------- |-------- |
| 1 | Minh | Text |7 |1 |
| 2 | Mikal | -------- |1 |1 |
| 3 | Nguyen | |1 |1 |
| 4 | Mina | -------- |1 |1 |
| 5 | Thanh | |6 |2 |
| 6 | Tien | -------- |7 |2 |
| 7 | Charles | Text |- |3 |
**Department Table**
| ID | Name | No of employees |
| -------- | -------- | -------- |
| 1 | Academic | Text |
| 2 | Student Affairs | Text |
| 3 | CEO | Text |
- Employee ID is the primary key of Employee table; Department ID is the primary key of Department table
- Department ID is the foreign key btw Employee table and department table
### Relational Database

- One to many relationship (can be zero): media_types -> tracks
- One to one relationship (one and only one): 12tracks -> media_types
- One and can be zero: customers -> employees, employees -> employees
#### DB Schema for our Cannabis business

**Teacher Correction**

### Learn SQL
> SQL lessons
sqlbolt.com
- Import .db into Google Collab
```PYTHON=
from google.colab import drive
drive.mount('/content/gdrive/')
import sqlite3
conn = sqlite3.connect('gdrive/My Drive/CoderSchool-Mariana/datasets/chinook.db')
```
- Pass SQL queries and return as pandas
```PYTHON=
import pandas as pd
data = pd.read_sql_query('SELECT * FROM albums;', conn)
```
> SQL exercises here
https://colab.research.google.com/drive/1REreaxuWIz7sE0pJJvpJ-6KWZNBXD0rJ
## Thursday
### Installing PostGreSQL
- Install on Mac by `brew install postgres` or upgrade by `brew upgrade postgresql`
- On my mac I have to append `-U postgres` to log in as superuser and use PostGres
### PostGres with Python
- The query can be defined as a string, values can be passed as arguments using `%s`
```PYTHON=
query = f"""
INSERT INTO categories (name, url, parent_id)
VALUES (%s, %s, %s) RETURNING id;
"""
val = (self.name, self.url, self.parent_id)
cur.execute(query, val)
```
- deque from collections: first in first out, `deque.popleft()` will return the head of the line element, `deque.extend(e)` will add `e` to the end of the deque. This can be used to scrap through categories, adding new links to be scrapped into the end of the deque and pop the first links to be scrap.
## Friday
### How to build tiki postgresql database

- query to find the deepest categories
```PYTHON=
query = f"""
SELECT c.name AS Name, p.url AS ChildURL, p.id AS ChildID, p.parent_id AS ParentID, p.name AS ChildName
FROM categories AS p LEFT JOIN categories AS c ON c.parent_id = p.id
WHERE c.Name IS NULL;
"""
```
- We use OOP to create instances of each product/category and use class method to save and connect to tables
### Multithreading
- For each thread open and close a new db connection because multithreading cannot use the same connection, the cursor will return null in this case.
- declaring `global queue` in thread function will allow the function to access queue outside of the function
:::info
**Find this document incomplete?** Leave a comment!
:::
###### tags: `CoderSchool` `Mariana` `MachineLearning`