---
title: 'Tâm - Week 1'
tags: CoderSchool, Mariana
---

Week 1
===

## Table of Contents

[TOC]

## Monday

### What is Machine Learning?

In 1950, Alan Turing proposed the Turing test!

**1. What is Machine Learning**
- 1950-2002: few ML researchers
- 2002 onward: hardware and computational-power breakthroughs help advance ML; big data era, with ~40k exabytes of data by the end of 2020
- Definitions: a machine learns from experience and learns from data, ~~and follow instructions.~~

**2. How ML works**
- **Traditional programming:** rules and data => answers. Disadvantages: inflexibility when new data comes in, and inability to capture all data patterns and/or outliers; in fields such as computer vision, traditional programming cannot handle the amount of data.
- **Machine Learning:** learns from given answers and data, and outputs rules in the form of a model. The output of machine learning can then be used in traditional programming to give/predict answers.
- **Machine learning: training; traditional programming: prediction**
- **Model evaluation:** split the data into train and test sets, then count the misclassified data points in the test set. Goal: minimize the error/cost function.
- **Steps of Predictive Modelling:** Get Data -> Clean, Prepare & Manipulate Data -> Train Model -> Test Data -> Improve

**3. Types of ML**
- Supervised Learning and Unsupervised Learning
    - Supervised Learning: Regression for continuous output; Classification for categorical output
    - Unsupervised Learning: analyze data without labels, for example market segmentation
- Reinforcement learning: reward maximization; applications in gaming.

**4. How to learn ML:** ML is the intersection of CS/Business/Maths+Stats.
- Learn pseudocode/flow charts => learn to solve problems algorithmically.
- We will learn each ML algorithm -> its math -> its code. We will learn to code ML from scratch.
- Keep reading articles and research in ML.

**5. Setting up working environment and code editor:**
- Use `cs_ftmle` as the environment because it has Jupyter Notebook
- Cmd + Shift + F: find in all files of the working folder in VS Code
- Cmd + F: find in the current file in VS Code
- Cmd + D: multiple select (add the next occurrence to the selection) in VS Code

> Emmet docs: https://docs.emmet.io/cheat-sheet/

**6. Books to read:**

> Python Data Science Handbook: https://jakevdp.github.io/PythonDataScienceHandbook/

> Hands-On Machine Learning with Scikit-Learn and TensorFlow (GitHub): https://github.com/ageron/handson-ml2

### Google Colab and Teamwork
- Code cells in Colab run in the order you execute them with Shift + Enter, not necessarily from top to bottom
- Python `random.shuffle(array)` shuffles the array in place and returns `None`

### Appendix and FAQ
1. What are the differences/similarities between classification and regression?
2. In what order is code run in Google Colab?
3. What enabled the rapid development of AI during the last two decades?
4. What are the similarities/differences between traditional programming and machine learning?
5. How do you activate Jupyter Notebook?
6. Describe the `range` function in Python and its arguments.

## Tuesday

### Developer/Programmer mindset
- Let's learn how a computer works
- In the 1940s, people used analog signals to telephone each other, and callers went through a call center to have their calls connected.
- How many numbers can we represent with **n** bits: $2^n$
- How to convert from decimal to binary: divide by 2 repeatedly and read the remainders from last to first
- How to convert from binary to decimal: $\sum_i b_i \cdot 2^i$, where $b_i$ is the bit at position $i$, counted from the right starting at 0
- Images, audio, and text can all be represented digitally. For example, an image is made of pixels, where each pixel holds RGB values; text is made of characters, such as the 256 extended-ASCII characters
- The 256 extended-ASCII characters can be represented by 8 bits, collectively called a byte ($1\ byte = 8\ bits$). 1 KB = 1024 ($2^{10}$) B, 1 MB = 1024 KB, 1 GB = 1024 MB, and similarly for TB, PB, EB.
- Adding binaries
    - 12 + 24
    - convert 12 and 24 into 001100 + 011000
    - add column by column, from right to left
    - 0 + 0 = 0; 0 + 1 = 1; 1 + 1 = 0, carry 1 into the next column
    - result: 100100, which is 36
- Multiplication and division:
    - convert to repeated addition and subtraction
- The first programs were written in binary language -> hence the need for high-level programming languages
- **Mindset of a developer:** problem solving; write functions to solve problems. The output of one function can be fed into another function. A **computation graph** can be used to diagram computational thinking

### Coding problems

#### **Problem:** greatest common factor of two numbers
- **Solution**
    - GCF(a, b) is a when a == b
    - GCF(a, b) = GCF(a - b, b) when a > b
- Pseudocode:
```gherkin=
1) if a == b, return a
2) if a > b:
    a = a - b
    go back to step 1
3) else:
    b = b - a
    go back to step 1
```
- Python code:
```python
def gcf(a, b):
    # Subtraction-based Euclidean algorithm; assumes a and b are positive integers.
    while a != b:
        if a > b:
            a = a - b
        else:
            b = b - a
    return a
```

#### **Fibonacci sequence:** each number is the sum of the two previous numbers
- My code:
```python
def Fibonacci(n):
    # Builds the whole sequence in a list, with Fibonacci(0) == Fibonacci(1) == 1.
    result = [1, 1]
    if n < 2:
        return 1
    for i in range(2, n + 1):
        result.append(result[i - 1] + result[i - 2])
    return result[n]
```
- Answer:
```python
def fibo(n):
    # Keeps only the last two values instead of the whole list.
    if n < 2:
        return 1
    a, b = 1, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b
```
- Other code (recursive):
```python
def brook(n):
    # Direct recursion: simple, but recomputes the same values many times.
    if n < 2:
        return 1
    else:
        return brook(n - 1) + brook(n - 2)
```

### Python learning
- Check out **BasicPython-Kaggle.ipynb** in the GDrive folder.
- Get help on a function with `help(function)`
- Check the type of a variable with `type`
- The order of operations is **PEMDAS**
- A docstring is a triple-quoted string (`'''` or `"""`) placed right after the function header; it is displayed when calling `help`. It usually consists of a short description and an example call with its output, marked by `>>>`
- `print` arguments like `sep` and `end` can be used to customize the printout
- Python allows trailing commas in argument lists
- Default arguments are defined in the function signature: `func(arg=default_value)`. A non-default argument cannot follow a default argument in a function definition
- **Higher Order Function:** a function that takes another function as an argument or returns one
- Exclusive or: A xor B is `A ^ B` (for booleans and integers)
- Truthiness: the empty string and zero are falsy; non-empty strings and non-zero numbers are truthy.
- Ternary expressions: `value_if_true if condition else value_if_false`
- Lists can contain elements of many different types: strings, numbers, functions, lists, ...
- The third slice argument (after the second colon) is the step and direction of iteration through the list; for example, -1 reverses it.
- Lists, tuples, and strings are all iterable, but strings, like tuples, are immutable
- List comprehension with `else`:
```python
even = [i if i % 2 == 0 else -1 for i in range(100)]
```
- String building with a comprehension
```python
# join needs parentheses and string items, so each number is converted with str()
s = f"{','.join(str(i) for i in range(10))}"
```
- The `str.find` method returns `-1` when the substring is not found
- **Dictionaries:** `dict.items` returns the (key, value) pairs as tuples

### Q&A
1. What is the `O()` runtime of recursive algorithms?
2. How many bits are in a byte? How many bytes are in 1 MB?
3. `planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']` What are the outputs of the following?
    - a. `planets[-2:2]`
    - b. `planets[:3]`
    - c. `planets[3:]`
    - d. `planets[1:-1]`
    - e. `planets[-3:]`
    - f. `planets[::-1]`
    - g. `planets[-2:2:-2]`
4. How do you measure the time a function takes in Colab?
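
The slicing rules from the notes above (start included, stop excluded, optional step) can be checked with a short script. This is only a sketch for self-study: the variable names such as `first_three` and `backwards` are made up for illustration, and the list is the one from question 3.

```python
# The list from question 3 (8 items, indices 0..7).
planets = ['Mercury', 'Venus', 'Earth', 'Mars',
           'Jupiter', 'Saturn', 'Uranus', 'Neptune']

# General form: seq[start:stop:step]; start is included, stop is excluded.
first_three = planets[:3]       # indices 0, 1, 2
from_fourth = planets[3:]       # index 3 up to the end
without_ends = planets[1:-1]    # everything except the first and last item
last_three = planets[-3:]       # negative indices count from the end
reversed_all = planets[::-1]    # step -1 walks the list backwards
empty = planets[-2:2]           # start (index 6) is already past stop (index 2)
backwards = planets[-2:2:-2]    # from index 6 down to (not including) 2, step 2

# Hypothetical helper names, used here only to label the printout.
examples = {
    "planets[:3]": first_three,
    "planets[3:]": from_fourth,
    "planets[1:-1]": without_ends,
    "planets[-3:]": last_three,
    "planets[::-1]": reversed_all,
    "planets[-2:2]": empty,
    "planets[-2:2:-2]": backwards,
}
for expression, value in examples.items():
    print(f"{expression:<18} {value}")
```

Note that a slice never raises an `IndexError`: when the start is already past the stop for the given step direction, the result is simply an empty list.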

## Wednesday

### Colab exercises
- Swap variables:
```PYTHON=
a = [1,2,3]
b = [3,1,2]
a, b = b, a   # tuple unpacking swaps the two names without a temporary variable
```
- Python built-in function `round`
```PYTHON=
round(9012.231, 2)    # 9012.23
round(9012.231, -2)   # 9000.0 (a negative ndigits rounds to tens, hundreds, ...)
```
- Default function arguments
```PYTHON=
def to_smash(total_candies, nfriends=3):
    """Return the number of leftover candies that must be smashed after distributing
    the given number of candies evenly between nfriends friends (3 by default).

    >>> to_smash(91)
    1
    """
    return total_candies % nfriends
```
- `time` function
```PYTHON=
from time import time

def time_call(fn, arg):
    """Return the amount of time the given function takes (in seconds) when called with the given argument.
    """
    tic = time()
    fn(arg)
    toc = time()
    return toc - tic

def slowest_call(fn, arg1, arg2, arg3):
    """Return the amount of time taken by the slowest of the following function
    calls: fn(arg1), fn(arg2), fn(arg3)
    """
    return max(time_call(fn, arg1), time_call(fn, arg2), time_call(fn, arg3))
```
- One-line return with if-else
```PYTHON=
def sign(n):
    if n == 0:
        return 0
    return -1 if n < 0 else 1
```
- if-else inside a print statement
```PYTHON=
def to_smash(total_candies):
    """Return the number of leftover candies that must be smashed after distributing
    the given number of candies evenly between 3 friends.

    >>> to_smash(91)
    1
    """
    print("Splitting", total_candies, f"{'candy' if total_candies == 1 else 'candies'}")
    return total_candies % 3
```
- Combining booleans
```PYTHON=
def onionless(ketchup, mustard, onion):
    """Return whether the customer doesn't want onions.
    """
    return not onion

def wants_all_toppings(ketchup, mustard, onion):
    """Return whether the customer wants "the works" (all 3 toppings)
    """
    return ketchup and mustard and onion

def wants_plain_hotdog(ketchup, mustard, onion):
    """Return whether the customer wants a plain hot dog with no toppings.
    """
    return not (ketchup or mustard or onion)

def exactly_one_sauce(ketchup, mustard, onion):
    """Return whether the customer wants either ketchup or mustard, but not both.
    (You may be familiar with this operation under the name "exclusive or")
    """
    # True counts as 1 and False as 0, so the sum is the number of sauces chosen.
    return (ketchup + mustard) == 1

def exactly_one_topping(ketchup, mustard, onion):
    """Return whether the customer wants exactly one of the three available toppings
    on their hot dog.
    """
    return (ketchup + mustard + onion) == 1
```
- Boring menu
```PYTHON=
def menu_is_boring(meals):
    """Given a list of meals served over some period of time, return True if the
    same meal has ever been served two days in a row, and False otherwise.
    """
    for i in range(len(meals) - 1):
        if meals[i] == meals[i + 1]:
            return True
    return False
```
- Escaping with `\` and string lengths
```PYTHON=
b = "it's ok"    # len(b) == 7
c = 'it\'s ok'   # len(c) == 7 (the backslash escapes the quote)
e = '\n'         # len(e) == 1 (a single newline character)
a = ""           # len(a) == 0
d = """hey"""    # len(d) == 3
```

### repl.it
- Given a year (as a positive integer), find the corresponding century. Note that, for example, the 20th century began with the year 1901.
```PYTHON=
import math

y = int(input())
# Years 1901..2000 all map to 20, so round the quotient up.
print(math.ceil(y / 100))
```
- Given the integer coordinates of three vertices of a rectangle whose sides are parallel to the coordinate axes, find the coordinates of the fourth vertex.
```PYTHON=
x1 = int(input())
y1 = int(input())
x2 = int(input())
y2 = int(input())
x3 = int(input())
y3 = int(input())

# Each x (and y) value appears exactly twice among the four vertices,
# so the missing coordinate is the one that appears only once so far.
if x1 == x2:
    x = x3
elif x2 == x3:
    x = x1
elif x3 == x1:
    x = x2

if y1 == y2:
    y = y3
elif y2 == y3:
    y = y1
elif y3 == y1:
    y = y2

print(x, y, sep='\n')
```
- In mathematics, the factorial of an integer n, denoted by n!, is the following product: $n! = 1 × 2 × … × n$. For the given integer n, calculate the value $n!$. Don't use the math module in this exercise.
```PYTHON=
def fact(n):
    # Base case covers both 0! and 1!, so fact(0) does not recurse forever.
    if n <= 1:
        return 1
    else:
        return n * fact(n - 1)

n = int(input())
print(fact(n))
```
- Given a string, delete all its characters whose indices are divisible by 3.
```PYTHON=
s = input()
# Indices to delete: 0, 3, 6, ...
index = [3*i for i in range((len(s)//3)+1)]
# Print the characters between each pair of consecutive deleted indices.
for i in range(len(index)-1):
    print(s[index[i]+1:index[i+1]], end='')
# Print whatever remains after the last deleted index.
print(s[index[-1]+1:])
```
- Augustus and Beatrice play the following game. Augustus thinks of a secret integer from 1 to n. Beatrice tries to guess the number by providing a set of integers. Augustus answers YES if his secret number is in the provided set, or NO if it is not. After a few questions Beatrice, totally confused, asks you to help her determine Augustus's secret number. The input gives the positive integer n on the first line, followed by a sequence of Beatrice's guesses (each a series of numbers separated by spaces) and Augustus's responses, or Beatrice's plea for HELP. When Beatrice calls for help, print all the remaining possible secret numbers, in ascending order, separated by spaces.
```PYTHON=
n = int(input())
# Every number from 1 to n is a candidate until a response rules it out.
candidates = set(range(1, n + 1))
e = input()
while e != 'HELP':
    guess = set(int(i) for i in e.split())
    r = input()
    if r == 'YES':
        # The secret number is in the guess: keep only those candidates.
        candidates.intersection_update(guess)
    elif r == 'NO':
        # The secret number is not in the guess: remove those candidates.
        candidates.difference_update(guess)
    e = input()
print(*sorted(candidates))
```

## Thursday

### Terminal
- `ls`
    - `ls -p`
    - `ls -A`
    - `ls -h`
    - `ls -l`
- Files and directories
    - `mkdir`
    - `rmdir`
    - `rm -R`
    - `cp` `mv` `touch`
- File inspection
    - `cat` `less` `head` `tail` `shuf`
    - `wc` `sort`
- Grep and Sed
    - `grep -E` `grep -i` `grep -r`
    - `sed -E` `sed -i`

### GitHub
```bash=
echo "# mariana-coderschool" >> README.md
git init
git add README.md
git commit -m "first commit"
git remote add origin https://github.com/thtamho/mariana-coderschool.git
git push -u origin master
```

>Learn git visually https://learngitbranching.js.org

## Friday

### Build web crawler flask app

> Go to codepen.io

```JAVASCRIPT=
// Assumes the page has an element with id "counter" and a button
// whose click handler is handleClick.
let count = 0
function handleClick() {
  count = count + 1
  const counter = document.getElementById('counter')
  counter.innerHTML = count + ' clicked'
}
```
- Use `try`/`except` to catch errors and prevent the program from crashing. Inside the `except` block, use `print` to show the error message and `continue` to move on to the next loop iteration
- Use the `find_all` method on a `BeautifulSoup` object to search for and get the relevant tags. For example, `links[0].find_all('a', {"class": "next"})`
- `articles = soup.find_all('div', class_='product-item')` and then `articles[i]['data-category']` to read the `data-category` attribute of the i-th matching tag

:::info
**Find this document incomplete?** Leave a comment!
:::

###### tags: `CoderSchool` `Mariana` `MachineLearning`