---
tags: coderschool, note, datastructure
---
# Week 2: Intermediate Python and Database Fundamentals
Quote of the Week
>Divide and Conquer
___
## Day 1
* Check Bao's web scraper: https://github.com/brookyct95/tiki_scraping
* Check Tien's website layout: https://github.com/txtien/tiki-scraping
### Data Structure
* Sorting Algorithm: https://en.wikipedia.org/wiki/Sorting
* Bubble Sort: https://www.geeksforgeeks.org/bubble-sort/
* Pseudo Code: Language Agnostic, Theoretical
https://www.geeksforgeeks.org/how-to-write-a-pseudo-code/
* Merge Sort:
Notes: Empty and one-element lists are already sorted. Divide and Conquer: split the list into lower and upper halves, sort each half, then merge them
* Most pseudocode is written with C in mind (there's no `len` function in C)
* Learn more about sorting algorithms: https://www.geeksforgeeks.org/sorting-algorithms/
* Use docstrings with a dictionary-like layout (parameters, return values) for documenting your code
* Merge sort algorithm visualization: https://www.youtube.com/watch?v=ZRPoEKHXTJg
* Sorting algorithm comparison: https://www.youtube.com/watch?v=ZZuD6iUe3Pc
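
The merge sort notes above can be sketched in Python roughly as follows. This is a minimal illustration, not the version from the linked articles; the function name `merge_sort` is my own choice.

```python
def merge_sort(items):
    """Return a new sorted list using merge sort (divide and conquer)."""
    # Base case: an empty or one-element list is already sorted.
    if len(items) <= 1:
        return list(items)

    # Divide: split into lower and upper halves and sort each one.
    mid = len(items) // 2
    lower = merge_sort(items[:mid])
    upper = merge_sort(items[mid:])

    # Conquer: merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(lower) and j < len(upper):
        if lower[i] <= upper[j]:
            merged.append(lower[i])
            i += 1
        else:
            merged.append(upper[j])
            j += 1
    # One half is exhausted; the rest of the other is already sorted.
    merged.extend(lower[i:])
    merged.extend(upper[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))
```

The input list is left untouched; each call builds a new list, which costs extra memory but keeps the sketch easy to follow.
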
### Time complexity
* For loops can be very resource-consuming, especially when nested: each extra level multiplies the work
* An algorithm's running time depends on the size of the input
* Big O Notation: https://www.geeksforgeeks.org/analysis-algorithms-big-o-analysis/
Must read: https://www.geeksforgeeks.org/analysis-of-algorithms-set-3asymptotic-notations/
* Logarithm in computer science: https://www.techwalla.com/articles/uses-of-logarithms-in-computers
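
To get a feel for how running time depends on input size, here is a small sketch that counts the steps of an O(n) scan against an O(log n) binary search on the same sorted list. The function names and the test data are assumptions made for illustration only.

```python
def linear_search_steps(sorted_items, target):
    """Count comparisons made by a left-to-right scan: O(n)."""
    steps = 0
    for value in sorted_items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Count loop iterations made by binary search: O(log n)."""
    steps = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return steps


data = list(range(1_000_000))
target = 999_999  # worst case for the linear scan: the last element
linear = linear_search_steps(data, target)
binary = binary_search_steps(data, target)
print(linear, binary)
```

With one million items the scan needs one million comparisons, while binary search needs at most 20 iterations, because each iteration halves the remaining range.
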
### Object-Oriented Programming (OOP)
* Classes in Python (needs more research): a Python class is unrelated to a CSS class
* A constructor (`__init__`) sets up the initial state of a new instance of the class
* `class` is a built-in keyword in Python
* An instance of a class?
* Instance methods take `self` as their first parameter (a naming convention, not a keyword)
* What is inheritance in Python?
* Python Magic method: https://www.tutorialsteacher.com/python/magic-methods-in-python
* Some methods belong to a class and can only be called on that class or its instances
* Other than OOP? https://www.codenewbie.org/blogs/object-oriented-programming-vs-functional-programming
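
A minimal sketch of the ideas above: a constructor, `self`, an instance, a magic method, and inheritance. The `Product` and `Book` classes and their attributes are hypothetical examples, not part of the course material.

```python
class Product:
    """A hypothetical product with a name and a price."""

    def __init__(self, name, price):
        # The constructor runs once per new instance; `self` is that instance.
        self.name = name
        self.price = price

    def __repr__(self):
        # Magic method: controls how the object is displayed by print() and repr().
        return f"Product(name={self.name!r}, price={self.price!r})"

    def discounted(self, percent):
        """Return the price after taking `percent` off."""
        return self.price * (100 - percent) / 100


class Book(Product):
    """Inheritance: a Book is a Product with one extra attribute."""

    def __init__(self, name, price, author):
        super().__init__(name, price)  # reuse the parent constructor
        self.author = author


book = Book("Python 101", 200, "A. Author")  # `book` is an instance of Book
print(book)
print(book.discounted(10))
print(isinstance(book, Product))
```

`Book` does not define `__repr__` or `discounted`, so it uses the versions inherited from `Product`.
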
### Regular Expressions and String Manipulation
Reference: https://www.petefreitag.com/cheatsheets/regex/
* There is no `substring` method in Python; use slicing instead (e.g. `s[2:5]`)
* What is an RFC: https://en.wikipedia.org/wiki/Request_for_Comments
* Internet Message Format: https://www.loc.gov/preservation/digital/formats/fdd/fdd000393.shtml
* Regex syntax differs in places between languages
* What is PCRE: https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions
* Inside brackets, `^` means "none of these characters": `pattern = re.compile('[^cf]ar')` matches `bar` but not `car` or `far`
* `\b` is a word boundary
* `\s` is a whitespace character
* Python regex exercises: https://regexone.com/
* What does an r represent in Python: https://stackoverflow.com/questions/33729045/what-does-an-r-represent-before-a-string-in-python
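* 99% of the regexes you need have already been written by someone else: https://emailregex.com/
* Learn about Python's built-in `compile` function (not the same as `re.compile`, which builds a regex pattern): https://www.programiz.com/python-programming/methods/built-in/compile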
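
A short sketch of the regex and string notes above, using only the standard `re` module. The sample strings are made up for illustration.

```python
import re

# Inside brackets, ^ negates the set: any character except c or f, then "ar".
pattern = re.compile(r"[^cf]ar")
negated = pattern.findall("car far bar tar")

# \b is a word boundary, so "cat" inside "concat" or "cats" does not match.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")

# \s matches any whitespace character (space, tab, newline).
tokens = re.split(r"\s+", "a  b\tc\nd")

# Slicing stands in for a substring method: index 0 up to (not including) 4.
prefix = "tiki_scraping"[0:4]

# A raw string keeps the backslash: r"\n" is two characters, "\n" is one.
raw_length, escaped_length = len(r"\n"), len("\n")

print(negated, whole_words, tokens, prefix, raw_length, escaped_length)
```

Writing patterns as raw strings (`r"..."`) avoids having to double every backslash.
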
### Binary Search Tree
* What is casting? https://www.peterbe.com/plog/interesting-casting-in-python
* Sometimes you have to be more specific for the readability of the code
* O(log n) is important: a balanced binary search tree finds a value in O(log n) steps
* Iterative is not recursive: https://medium.com/backticks-tildes/iteration-vs-recursion-c2017a483890
* A leading underscore marks a function as internal (a convention, not enforced): https://hackernoon.com/understanding-the-underscore-of-python-309d1a029edc
https://stackoverflow.com/questions/53687998/function-name-with-a-leading-underscore
* A node can have at most 2 children; values in the left subtree are smaller than the node, values in the right subtree are greater
* What is tree traversal: https://en.wikipedia.org/wiki/Tree_traversal
* What is a utility function: https://stackoverflow.com/questions/25060976/what-do-you-mean-by-utility-functions-in-javahow-it-is-related-to-static
* What is a helper function: https://web.cs.wpi.edu/~cs1101/a05/Docs/creating-helpers.html
* What are the advantages of a binary search tree?
https://practice.geeksforgeeks.org/problems/advantages-and-disadvantages-of-bst
* What is a tree structure: https://en.wikipedia.org/wiki/Tree_(data_structure)
* Falsiness: `False`, `None`, `0`, and empty containers such as `[]` are all treated as false in a condition
* What is the base case in recursion: https://en.wikipedia.org/wiki/Recursion_(computer_science)
* What is a node? https://en.wikipedia.org/wiki/Node_(computer_science)
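
The notes above (nodes, internal helper functions, base case, iterative vs recursive, traversal) can be put together in a small sketch. The class and method names are my own, and duplicates are sent to the right subtree as an arbitrary choice.

```python
class Node:
    """One node of the tree: a value and at most two children."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, value):
        if self.root is None:
            self.root = Node(value)
        else:
            self._insert(self.root, value)

    def _insert(self, node, value):
        # Leading underscore: internal recursive helper, not part of the public API.
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
            else:
                self._insert(node.left, value)
        else:
            if node.right is None:
                node.right = Node(value)
            else:
                self._insert(node.right, value)

    def contains(self, value):
        # Iterative search: walk down one branch, no recursion needed.
        node = self.root
        while node is not None:
            if value == node.value:
                return True
            node = node.left if value < node.value else node.right
        return False

    def in_order(self):
        """In-order traversal: left subtree, node, right subtree."""
        result = []
        self._in_order(self.root, result)
        return result

    def _in_order(self, node, result):
        if node is None:  # base case: an empty subtree contributes nothing
            return
        self._in_order(node.left, result)
        result.append(node.value)
        self._in_order(node.right, result)


tree = BinarySearchTree()
for number in [8, 3, 10, 1, 6, 14]:
    tree.insert(number)
print(tree.in_order())
print(tree.contains(6), tree.contains(7))
```

The checks use `is None` instead of relying on falsiness, since a stored value of `0` is falsy but is still a valid value.
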
___
## Day 3
### Introduction to SQL
Lesson: https://sqlbolt.com/
* What is a database? https://en.wikipedia.org/wiki/Database
* What is a schema? https://en.wikipedia.org/wiki/Database_schema
Some examples of schemas?

* What are the rules of a primary key? (Or a reference key, or foreign key?)
* In practice we rarely delete data; we set a flag (or status) to mark the row as deleted (a soft delete), so the deletion is only temporary
* Schema: the child should inherit the parent's name; one parent has many children
* What are `varchar` (variable length) and `char` (fixed length)? https://en.wikipedia.org/wiki/Varchar
* The data types of SQL: https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-types-transact-sql?view=sql-server-ver15
* `varchar` vs `char`: how do they compare in performance?
* Why is `char` the fastest? https://www.youth4work.com/Talent/MySql/Forum/118724-what-is-the-difference-between-char-and-varchar
* It's more professional to write the KEYWORDS of a query in uppercase
* What is Google File Stream? https://support.google.com/a/answer/7491144?hl=en
* What is linnaeus in SQL?
* JOIN is INNER JOIN by default
* What is FULL JOIN?
* HAVING filters the groups produced by GROUP BY, while WHERE filters rows before they are grouped
* You can GROUP BY a column that is not in the SELECT list
* You can join multiple tables using JOIN
* Remember to install PostgreSQL
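
A minimal sketch of the points above (primary and foreign keys, soft delete, JOIN, GROUP BY, HAVING). It uses Python's built-in `sqlite3` module as an assumption so it runs without a server; PostgreSQL syntax is close but not identical. The tables, columns, and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE categories (
        id   INTEGER PRIMARY KEY,
        name VARCHAR(100) NOT NULL
    );
    CREATE TABLE products (
        id          INTEGER PRIMARY KEY,
        name        VARCHAR(200) NOT NULL,
        price       INTEGER NOT NULL,
        category_id INTEGER REFERENCES categories(id),  -- one parent, many children
        is_deleted  INTEGER NOT NULL DEFAULT 0           -- soft-delete flag
    );
    INSERT INTO categories (id, name) VALUES (1, 'Books'), (2, 'Phones');
    INSERT INTO products (name, price, category_id) VALUES
        ('Novel', 100, 1),
        ('Textbook', 300, 1),
        ('Phone A', 5000, 2);
""")

# Soft delete: flag the row instead of running DELETE.
conn.execute("UPDATE products SET is_deleted = 1 WHERE name = 'Phone A'")

rows = conn.execute("""
    SELECT c.name, COUNT(*) AS product_count
    FROM categories AS c
    JOIN products AS p ON p.category_id = c.id   -- plain JOIN is an INNER JOIN
    WHERE p.is_deleted = 0                       -- filters rows before grouping
    GROUP BY c.id                                -- c.id is not in the SELECT list
    HAVING COUNT(*) >= 2                         -- filters groups after grouping
    ORDER BY c.name
""").fetchall()
print(rows)
conn.close()
```

Only `Books` survives: the single `Phones` product is flagged as deleted, so it is dropped by `WHERE` before grouping happens.
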
___
## Day 4
* Postgres cheat sheet: http://www.postgresqltutorial.com/postgresql-cheat-sheet/
* Postgres uses multiple users and databases as a way to improve security and division of data
* Try to crawl the category trees for: https://tiki.vn/
* Test on one simple dataset before trying bigger datasets
* What is FIFO (First in first out)? https://en.wikipedia.org/wiki/FIFO_(computing_and_electronics)
* What is a deque? https://www.geeksforgeeks.org/deque-in-python/
* What is the `popleft` method? https://pythontic.com/containers/deque/popleft
* What is the path from Main Cat to Sub Cat?
* What is Python Multiprocessing? https://docs.python.org/3.4/library/multiprocessing.html?highlight=process
* concurrent.futures module in Python: ProcessPoolExecutor & ThreadPoolExecutor
* Hints for weekly assignments:
> Crawl n products -> store in DB
> Read from DB -> HTML
> Category trees then all products
> Do this using OOP
> 150k products
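
A rough sketch of how the Day 4 pieces might fit together: a FIFO queue (`deque` with `popleft`) walks a category tree level by level and records the path from the main category to each sub-category, then a `ThreadPoolExecutor` processes the categories concurrently. The tree, the category names, and `fake_fetch` are hypothetical stand-ins; no real site is contacted.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Hypothetical category tree: each key maps to its list of sub-categories.
CATEGORY_TREE = {
    "Main": ["Books", "Electronics"],
    "Books": ["Fiction", "Textbooks"],
    "Electronics": ["Phones"],
    "Fiction": [],
    "Textbooks": [],
    "Phones": [],
}


def crawl_categories(root):
    """Breadth-first walk; returns (category, path from root) pairs."""
    queue = deque([(root, [root])])
    visited = []
    while queue:
        name, path = queue.popleft()  # FIFO: oldest entry leaves first
        visited.append((name, " > ".join(path)))
        for child in CATEGORY_TREE.get(name, []):
            queue.append((child, path + [child]))
    return visited


def fake_fetch(category):
    """Stand-in for a network request; returns a made-up number per category."""
    return category, len(category)


categories = crawl_categories("Main")
for name, path in categories:
    print(path)

# Threads suit I/O-bound work such as HTTP requests; map() keeps input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_fetch, [name for name, _ in categories]))
print(results)
```

For CPU-bound work, `ProcessPoolExecutor` offers the same `map` interface, with the usual requirement that the submitted function be importable at module level.
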