# The Machine Learning Pipeline
Welcome to The Machine Learning Pipeline class. This page collects supporting material that you can take with you.
### Exam Readiness Workshop
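Title | Link
--- | ---
AWS Training eLearning Course|https://www.aws.training/Details/eLearning?id=42183
My Path to Passing the AWS Machine Learning Certification|https://medium.com/@adam.dejans/my-path-to-passing-the-aws-machine-learning-certification-e8fc45ad7762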
### Machine Learning
Title | Link
--- | ---
TensorFlow Without a PhD| https://www.youtube.com/watch?v=vq2nnJ4g6N0
TensorFlow Without a PhD Project|https://github.com/GoogleCloudPlatform/tensorflow-without-a-phd
Deep Learning Book|https://d2l.ai/chapter_computer-vision/anchor.html
Managing ML Projects|https://d1.awsstatic.com/whitepapers/aws-managing-ml-projects.pdf
Deep Learning on AWS|https://d1.awsstatic.com/whitepapers/Deep_Learning_on_AWS.pdf
tf-idf|https://www.youtube.com/watch?v=4vT4fzjkGCQ
Boosting|https://www.youtube.com/watch?v=UHBmv7qCey4
Feature Scaling|https://towardsdatascience.com/all-about-feature-scaling-bcc0ad75cb35
CNNs|https://www.youtube.com/watch?v=AjtX1N_VT9E
Visualization: Matplotlib vs. Seaborn|https://www.kaggle.com/fazilbtopal/visualization-matplotlib-vs-seaborn
Anonymisation with PCA|https://medium.com/lizuna/beacon-the-use-of-principal-components-analysis-to-mask-sensitive-data-in-machine-learning-7904b01445d0
Anonymisation with PCA (Paper)|https://arxiv.org/pdf/1903.11700.pdf
Encoding Cyclic Features|https://towardsdatascience.com/ml-intro-5-one-hot-encoding-cyclic-representations-normalization-6f6e2f4ec001
Bootstrapping|https://www.analyticsvidhya.com/blog/2020/02/what-is-bootstrap-sampling-in-statistics-and-machine-learning/
### Fraud Detection
Title | Link
--- | ---
Fraud Detection with Amazon SageMaker Intro|https://www.youtube.com/watch?v=wzwkLV9gDXk
Fraud Detection with Amazon SageMaker Intermediate|https://www.youtube.com/watch?v=elRQPCHDBPE
Deep Fake Detection|https://arxiv.org/pdf/1909.11573.pdf
### SageMaker
Title | Link
--- | ---
Hyperparameter Tuning|https://aws.amazon.com/de/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-random-search-and-hyperparameter-scaling/
Ground Truth| https://www.youtube.com/watch?v=6WJxzKsIFKA
Custom Labeling in Ground Truth|https://aws.amazon.com/de/blogs/machine-learning/build-a-custom-data-labeling-workflow-with-amazon-sagemaker-ground-truth/
Example Notebooks|https://github.com/aws/amazon-sagemaker-examples
Custom Algorithms|https://www.youtube.com/watch?v=Oy_sCAKChhI
Using EFS and FSx|https://aws.amazon.com/de/blogs/machine-learning/speed-up-training-on-amazon-sagemaker-using-amazon-efs-or-amazon-fsx-for-lustre-file-systems/
Multi-Model Endpoints|https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html
API Gateway Integration|https://aws.amazon.com/de/blogs/machine-learning/creating-a-machine-learning-powered-rest-api-with-amazon-api-gateway-mapping-templates-and-amazon-sagemaker/
Invoke Endpoint API|https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html
### Deep Dive SageMaker Built-In Algorithms
Title | Link
--- | ---
Overview|https://github.com/awsdocs/amazon-sagemaker-developer-guide/blob/master/doc_source/algos.md
BlazingText|https://www.youtube.com/watch?v=G2tX0YpNHfc
DeepAR Forecasting|https://www.youtube.com/watch?v=g8UYGh0tlK0
Factorization Machines|https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf
K-Means|https://www.youtube.com/watch?v=RPInYpI9MjY
LDA|https://www.youtube.com/watch?v=NMDL8Atim1k
Linear Learner|https://www.youtube.com/watch?v=ae08a6Bp5lM
NTM|https://www.youtube.com/watch?v=eAMjEv7EABM
Object2Vec|https://www.youtube.com/watch?v=ggVWnnRXtYc
PCA|https://www.youtube.com/watch?v=RPInYpI9MjY
RCF|https://www.youtube.com/watch?v=9BWHR4JsTNU
ResNet|https://www.youtube.com/watch?v=CBDwEZtjFDE
Seq2Seq|https://www.youtube.com/watch?v=pZIV5NWfGIU
XGBoost|https://www.youtube.com/watch?v=THcH0tMdZ6o
XGBoost Demo|https://www.youtube.com/watch?v=GrJP9FLV3FE
### New in SageMaker
Title | Link
--- | ---
AutoML|https://www.youtube.com/watch?v=lPQqm5aqXJE
Data Wrangler|https://www.youtube.com/watch?v=_bsat_2N8LI
Feature Store|https://www.youtube.com/watch?v=pEg5c6d4etI
### Stream Data Ingestion with Kinesis
Title | Link
--- | ---
Kinesis Best Practices | https://www.youtube.com/watch?v=jKPlGznbfZ0
Enhanced Fanout | https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/
Streaming ETL|https://aws.amazon.com/de/blogs/big-data/unified-serverless-streaming-etl-architecture-with-amazon-kinesis-data-analytics/
Storm Integration|https://github.com/amazon-archives/kinesis-storm-spout
Flink and Kinesis|https://aws.amazon.com/blogs/big-data/streaming-etl-with-apache-flink-and-amazon-kinesis-data-analytics/
Managed Kafka|https://www.youtube.com/watch?v=HtU9pb18g5Q
### Lambda
Title | Link
--- | ---
Lambda Destinations|https://aws.amazon.com/blogs/compute/introducing-aws-lambda-destinations/
Chaining Lambda Functions|https://www.refinery.io/post/how-to-chain-serverless-functions-call-invoke-a-lambda-from-another-lambda
3|https://www.youtube.com/watch?v=Jkx6kVbDpL4
Mitigating Serverless Lock-In Fears|https://www.thoughtworks.com/insights/blog/mitigating-serverless-lock-fears
### Orchestrating with Step Functions
Title | Link
--- | ---
EMR Orchestration|https://aws.amazon.com/de/blogs/aws/new-using-step-functions-to-orchestrate-amazon-emr-workloads/
Managed Airflow|https://aws.amazon.com/de/managed-workflows-for-apache-airflow/
X-Ray Support|https://aws.amazon.com/about-aws/whats-new/2020/09/aws-step-functions-adds-support-for-aws-x-ray/
### Processing with Big Data
Title | Link
--- | ---
Spark on EKS |https://www.youtube.com/watch?v=lHM96P5kP2k
Use case 1|https://www.youtube.com/watch?v=XpFNznmRoQ0
Use case 2|https://www.youtube.com/watch?v=wbh51O3QrE4
Data Lake|https://www.youtube.com/watch?v=7i1tj59pvYw
Spark Jobs on EKS|https://www.youtube.com/watch?v=Om8RRGbZ6zA
Spark on EKS Best Practices | https://www.youtube.com/watch?v=3EbTr79wLkU
Athena 1|https://www.youtube.com/watch?v=tzoXRRCVmIQ
Athena 2|https://www.youtube.com/watch?v=JIviltfpul0
Glue 1|https://www.youtube.com/watch?v=S_xeHvP7uMo
Glue Reference Architecture|https://aws.amazon.com/de/blogs/big-data/aws-serverless-data-analytics-pipeline-reference-architecture/
Glue Streaming|https://aws.amazon.com/de/blogs/big-data/crafting-serverless-streaming-etl-jobs-with-aws-glue/
DataBrew|https://aws.amazon.com/glue/features/databrew/
Data Lakes with Glue|https://www.youtube.com/watch?v=JsNR8uBVSiA
### ETL with EMR
Title | Link
--- | ---
Deep Dive and Best Practices|https://www.youtube.com/watch?v=dU40df0Suoo
What's New 2020|https://pages.awscloud.com/Deep-Dive-into-Whats-New-in-Amazon-EMR_2020_0230-ABD_OD.html
EMR 6.0.0|https://www.youtube.com/watch?v=M_EOXbJhD3g
Self-Service EMR with AWS Service Catalog|https://aws.amazon.com/de/blogs/big-data/build-a-self-service-environment-for-each-line-of-business-using-amazon-emr-and-aws-service-catalog/
Record-Level Changes with Apache Hudi and AWS DMS|https://aws.amazon.com/de/blogs/big-data/apply-record-level-changes-from-relational-databases-to-amazon-s3-data-lake-using-apache-hudi-on-amazon-emr-and-aws-database-migration-service/
Flink|https://aws.amazon.com/de/blogs/big-data/use-apache-flink-on-amazon-emr/
File Formats|https://www.youtube.com/watch?v=aIcxFIyL6xo
Spark Optimization|https://www.youtube.com/watch?v=daXEp4HmS-E
Spark on EMR|https://www.youtube.com/watch?v=aIwJlfEAlHQ
EMR vs Glue|https://aws.amazon.com/de/blogs/big-data/how-drop-used-the-amazon-emr-runtime-for-apache-spark-to-halve-costs-and-get-results-5-4-times-faster/
### Resiliency
Title | Link
--- | ---
EMR Resilience with Capacity-Optimized Spot Instances|https://aws.amazon.com/de/blogs/big-data/optimizing-amazon-emr-for-resilience-and-cost-with-capacity-optimized-spot-instances/
2|https://www.youtube.com/watch?v=Fup5vHEvU50
Chaos Lambda (BBC)|https://github.com/bbc/chaos-lambda
### DynamoDB
Title | Link
--- | ---
1|https://www.youtube.com/watch?v=HaEPXoXVf2k
Dynamo Paper (SOSP 2007)|https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
### Nitro System
Title | Link
--- | ---
AWS Hypervisor Security|https://www.youtube.com/watch?v=0qcUOKupt7Y
### Re:Invent 2020
Title | Link
--- | ---
Re:Invent | https://reinvent.awsevents.com