# Week 1 : Note

# Day 1 : What is Machine Learning?

![](https://i.imgur.com/1uimk41.jpg)

### Lecture Introduction :

About Coderschool: https://www.beautiful.ai/player/-LtNuGRrVNnDekakDC1Q/FTMLE_11a_Welcome

Goals of the course :
- Get a job in a new field, in roles such as DA, DS, ML Engineer.

#### => Depends on how hard we work. "No Pressure, No Diamond".

Preparation and tips :
* Time management
* Code knowledge
* Tools & env
* Books (Hands-on, Python, ML by Andrew Ng)
* Online courses
* Never give up <<-

Methodology and Strategy :
* Learn by doing
* Learn by asking
* Learn by helping
* Learn from mistakes =>

### What is Machine Learning?

Read here :
- [MachinelearningMastery - What is Machine Learning?](https://machinelearningmastery.com/what-is-machine-learning/)
- [Wikipedia - Machine Learning](https://en.wikipedia.org/wiki/Machine_learning)
- [SearchEnterpriseAI - Definition of Machine Learning](https://searchenterpriseai.techtarget.com/definition/machine-learning-ML)

### Roles of Data: Data Scientist vs Data Engineer vs ML Engineer

- [G4G - TL;DR Difference between DS, DE, DA](https://www.geeksforgeeks.org/difference-between-data-scientist-data-engineer-data-analyst/)
- [Datacamp - Data Scientist vs Data Engineer](https://www.datacamp.com/community/blog/data-scientist-vs-data-engineer)
- [DataQuest - Understanding the Roles of a Data Engineer, Data Analyst, and Data Scientist](https://www.dataquest.io/blog/data-analyst-data-scientist-data-engineer/)
- [Edureka - DA vs DE vs DS: Skills, Responsibilities, Salary](https://www.edureka.co/blog/data-analyst-vs-data-engineer-vs-data-scientist/)
- [Medium - Collaboration between DE, DA and DS](https://medium.com/dailymotion/collaboration-between-data-engineers-data-analysts-and-data-scientists-97c00ab1211f)
- Jupyter Notebook on Google Colab
  - [RealPython - An introduction to Jupyter Notebook](https://realpython.com/jupyter-notebook-introduction/)
- Taking notes with Markdown
  - [Github - Markdown Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
  - [Another Markdown Cheatsheet](https://guides.github.com/pdfs/markdown-cheatsheet-online.pdf)
- Basic Programming with Python
  - [W3S - Python Introduction](https://www.w3schools.com/python/python_intro.asp)

#### Role of a Data Scientist / Data Engineer / Machine Learning Engineer :
- How they differ, by how they work and what tasks they do :

![](https://i.imgur.com/9SCN4od.png)

Task:

DA : analyzes data to make better decisions. Clean data, visualization, explore insights, make dashboards, present to business clients.

DE : develop, construct, test, and store data continuously. Develop data pipelines, build ETL, monitor & test the system continuously, build ML models.

DS : analyzes and interprets complex data. Build ML models, test them continuously, and keep making the ML models better.

* Quan & Qual Data :
  - Difference between numeric & descriptive data
* How to collect data :
  - Buying, surveys, Google Analytics, cookies, AIoT (smartwatch).
* Data science -> Insight.
  - 80% of the work is collecting data (raw, clean, …)
  - 20% is building & testing data models to find accurate insights, LR, ..

## Install env :
* https://www.notion.so/Preparation-Kit-Miniconda-b1f371a62ecd419a8724056286cda2b8
* Create a "coderschool" folder and download Env
* Bash, Git, Miniconda.
* => terminal => jupyter notebook

## Notes about this Course :
- MLE Class:
  - 10AM - 12PM : Learning
  - 12PM - 2PM : Break time
  - 2PM - 5:30PM : Learning -> Can leave as soon as these tasks are done
- Weekend project: homework and assignments are due before 23:59 Sunday

# Day 2 : Python basic :+1:

### Developers :
- A to Z
- Creative art
- Logic problems
- Not working normally

### List of programming languages :
* Around 700+ programming languages >> https://en.wikipedia.org/wiki/List_of_programming_languages
* Knowing how to solve a problem in one language carries over to other PLs, so you don't need to know many languages.
* A program transforms input into output.
* Input : data, problem

### The five elements :
* Variable
* Function
* if--else
* Loop (for, while, ..)
* Data structures and algorithms

### Five principles :
* Keep it simple
* DRY - don't repeat yourself
* Separation of Concerns
* Clear Code > Clever Code
* Refactor & Refactor (improve your code when you finish)

### Separation of Concerns (MVC)
![](https://i.imgur.com/DY6nc3x.png)

### Programming Mindset :
![](https://i.imgur.com/4dXWdTU.png)

Flowchart : Case study => Lamp repair

![](https://i.imgur.com/VgUWchT.png)

# Day 3 :
## Basic Python (Kaggle)
### Note :
* Use help('function') => explains how to use it.
* Use a 'temp' variable to swap the values of 2 things.
* Use ''' description ''' (a docstring) to explain a function; always put it at the beginning (right after def...).

### return vs print() :
return hands a value back to the caller, so it can be used by other statements and functions. print() only displays the value for us to read; a function that only prints returns None (the printed output is just text, similar to a string).

#### Case study :

    def nhan(x):
        return x * 2    # the caller receives the doubled value

    def inra(x):
        print(x * 2)    # only displays the doubled value; returns None

**Most of the time, a function should use return.**

### print() - defaults: sep=' ', end='\n'

For our own functions, set a default with def my_function(value=number); calling my_function() with no argument then uses that default value.

## Boolean
**True and False behave as the integers 1 and 0.**

## AND vs OR

**AND**
- T and T = T
- T and F = F
- F and T = F
- F and F = F

**OR**
- T or T = T
- T or F = T
- F or T = T
- F or F = F

Example

![](https://i.imgur.com/Nl2yDyM.png)

## Exercises:
return ([boolean, boolean, == int) searching word << 2 last tasks.
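As a recap of the return vs print(), default value, and Boolean notes above, here is a minimal sketch. `nhan` and `inra` are the case-study functions from these notes; `greet` and its default `name="Coderschool"` are made up for illustration only.

```python
def nhan(x):
    """Return x doubled, so the caller can keep using the value."""
    return x * 2


def inra(x):
    """Print x doubled; nothing is returned, so the result is None."""
    print(x * 2)


def greet(name="Coderschool"):
    """Hypothetical helper showing a default argument value."""
    return "Hello " + name


doubled = nhan(5)      # the value comes back and can be reused
printed = inra(5)      # the value is shown on screen; printed is None
total = doubled + 1    # works because nhan() returned a number

print(total, printed)
print(greet(), greet("Day 3"))

# print() defaults are sep=' ' and end='\n'; both can be overridden
print("a", "b", "c", sep="-", end="!\n")

# Truth table for `and` / `or`
for a in (True, False):
    for b in (True, False):
        print(a, b, "and:", a and b, "or:", a or b)

# True and False behave as the integers 1 and 0
print(True + True, False * 10)
```

Trying `inra(5) + 1` instead of `nhan(5) + 1` raises a TypeError, which is a quick way to see why return is usually the better choice.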
## CHESS GAME :-1:
## BOWLING :8ball:
https://repl.it/student/submissions/14550881

![](https://i.imgur.com/UDmMC5j.png)

Solution : find the next position => rules of chess

### Rectangle
![](https://i.imgur.com/z53uZ5B.png)

Given three vertices (x1,y1), (x2,y2), (x3,y3) of an axis-aligned rectangle: if x1 == x2 and y1 == y3, then the fourth vertex of the rectangle is (x3,y2).

Hint: https://stackoverflow.com/questions/53169488/using-python-to-find-the-missing-coordinate-of-a-rectangle

# Day 4 : Bash & Git:
https://colab.research.google.com/drive/12bRh2Ku8d9NMt-b3n3N-oVHJqg6QuEqw#scrollTo=6LSL-2KE9NVQ

### Intro to Bash Command Line
#### GUI vs CLI
![](https://i.imgur.com/gkvoa5V.png)

Most of the time we interact with computers exclusively through **graphical user interfaces (GUIs)**. Before computers had graphical displays, though, people typed instructions into a program called a **command-line interface (CLI)**.

A **CLI** is a text-only interface through which users interact with computers by typing text instructions in a console (or terminal), using specific syntax.

Here are a few reasons why it is important to learn the **CLI**:
* It is a very popular technology and a very common tool in Computer Science
* It is one of the best ways to use cloud services
* You type faster than you click
* It allows you to automate repetitive tasks

### Command :+1:
* A command is a text instruction given to the computer
* Command behavior can be modified with options

Powerful

![](https://i.imgur.com/fJEFApx.png)

## Relative paths vs Absolute paths
### Relative paths :
**`.\` (current folder) and `..\` (parent folder)**

### Absolute paths :
C:\folder\folder-son\folder-grampa....
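To make the relative vs absolute distinction concrete, here is a small sketch using Python's standard `ntpath` module, which applies Windows path rules on any operating system. The folder names, `notes.txt`, and the helper `to_absolute` are invented for illustration.

```python
import ntpath  # Windows-style path rules, usable on any OS


def to_absolute(current_folder, path):
    """Resolve `path` against `current_folder` unless it is already absolute."""
    if ntpath.isabs(path):
        return ntpath.normpath(path)
    # Join, then collapse the '.' (current folder) and '..' (parent folder) parts
    return ntpath.normpath(ntpath.join(current_folder, path))


here = r"C:\folder\folder-son"  # hypothetical current folder

same_folder = to_absolute(here, r".\notes.txt")
parent_folder = to_absolute(here, r"..\notes.txt")
already_absolute = to_absolute(here, r"C:\other\notes.txt")

print(same_folder)
print(parent_folder)
print(already_absolute)
```

A relative path only makes sense together with the folder you are currently in, while an absolute path points to the same place no matter where you run the command from.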
![](https://i.imgur.com/bD9z5EV.png)

Some commands :
## nano -> edit the file
## cd ../ -> path (move up to the parent folder)
### cat ./file.txt => show the contents of the file
### Remove :
* rm "file"
* rm -r "folder"
* rm -rf / => removes everything (never run this)

## Basic Commands
![](https://i.imgur.com/LakGaf8.png)

## File inspection
![](https://i.imgur.com/yAgtqpW.png)

![](https://i.imgur.com/eBFKcMy.png)

#### Some notes :-1:
- `grep --color=always Hamlet shekpeare.txt` => find the word and highlight it in color
- `grep Hamlet shekpeare.txt | wc -l` => find the word and count the lines that contain it
- `sort a/fine.txt | grep H | sed 's/Hello/Bonjour/'` => chain multiple commands with pipes

### How to create things from the VS Code terminal in one short line:
mkdir /*address*/ (folder) && touch /*address*/ (file). No need to use *cd*.

## Git & GitHub
**GitHub**, in a nutshell, is a way to share files with other people collaboratively without overwriting other people's work. **GitHub** is also a sort of community, with a lot of open-source projects, which means the projects' code is available to look at.

**GitHub** is a web-based platform built on a technology called Git, G-I-T.

**Git** is version control software, which is a fancy way of saying it keeps track of changes to code. Git has a lot of other features that are useful for software developers, so if you're programming, you will have to learn Git at some point.
Git visualization: https://learngitbranching.js.org/

- `touch`
- `git status`
- `git add`
- `git commit`
- `echo "print('123331')" > baby.py`

### git commit :
`git reset --soft 534f3` (moves back to commit 534f3 and keeps the changes staged)

# Day 5: Web Scraping (Beautiful Soup 4)

**Some notes** : Git remote:
- https://www.atlassian.com/git/tutorials/syncing
- https://www.atlassian.com/git/tutorials/syncing/git-fetch
- https://www.atlassian.com/git/tutorials/syncing/git-push
- https://www.atlassian.com/git/tutorials/syncing/git-pull
- https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes

https://www.beautiful.ai/player/-MDXA__EuG8dD8lhOEGb/FTMLE_web_1_html_css

### =>> Basics (3 main components for Web Dev)
![](https://i.imgur.com/ImkZyIa.png)

You can build everything with the 3 things above.

![](https://i.imgur.com/08aNdrr.png)

HTML : bones, structure -> CSS : color, beauty -> JavaScript : engine & fuel -> run

**Web Browser (pick one)**
- Google Chrome
- Firefox
- Safari
- Edge
- IE (maybe)

**Text Editor (pick one)**
- Visual Studio Code (best choice)
- Sublime Text
- bla bla bla...

## 1. HTML => markup tool :ab:
![](https://i.imgur.com/9RtHAok.jpg)

### Tag Syntax
![](https://i.imgur.com/2K2atRT.png)

*Sample*

![](https://i.imgur.com/LIm7YPk.png)

- class : attribute name -> value

### Common HTML Tags
![](https://i.imgur.com/JXKIgLr.png)

**Display**: div vs span (just for display)

## 2. CSS

What is CSS?
**Not a programming language.**

![](https://i.imgur.com/I4PjpGs.png)

**Apply CSS to HTML**: put the styles in "somefile.css" and link that file from the HTML with *href="somefile.css"*.

**CSS Syntax**

![](https://i.imgur.com/b5hl6zn.png)

**CSS Selectors**

![](https://i.imgur.com/KTMcE38.png)

**CSS Box Model**

![](https://i.imgur.com/VMIuklU.png)

**CSS Box Properties**

![](https://i.imgur.com/gR9hnKN.png)

![](https://i.imgur.com/lBHrXkd.png)

## Beautiful Soup 4 :

    from bs4 import BeautifulSoup

    # soup contains all the content of the HTML, parsed by BeautifulSoup
    # (contents is the HTML text of the page)
    soup = BeautifulSoup(contents, 'html.parser')

    # print out the first 500 characters of the HTML file
    print(soup.prettify()[:500])

### Workflow for Crawling Data
1. Go to the website and look at its structure with "Inspect"
2. Build a function to loop over the data (decide what information you want to take)
3. Save it as a data frame through pandas
4. Export to CSV, ...
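The workflow above can be sketched roughly as follows. This is only an illustration under assumptions: the HTML snippet stands in for a page you would normally download first, and the class names `product`, `name`, `price` and the function `parse_products` are made up, so a real site needs its own selectors found through "Inspect".

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical HTML standing in for a downloaded page (step 1: inspect it)
contents = """
<html><body>
  <div class="product"><span class="name">Lamp</span><span class="price">12</span></div>
  <div class="product"><span class="name">Desk</span><span class="price">80</span></div>
</body></html>
"""


def parse_products(html):
    """Step 2: loop over the repeated elements and keep the fields we want."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.find_all("div", class_="product"):
        rows.append({
            "name": item.find("span", class_="name").get_text(strip=True),
            "price": int(item.find("span", class_="price").get_text(strip=True)),
        })
    return rows


rows = parse_products(contents)

# Step 3: save the rows as a data frame through pandas
df = pd.DataFrame(rows)

# Step 4: export to CSV (without a file path, to_csv returns the CSV text)
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file name such as `df.to_csv("products.csv", index=False)` writes the same content to disk instead of returning it.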