# Building Batch Data Analytics Solutions on AWS

Welcome to the Building Batch Data Analytics Solutions on AWS class. This page collects supporting material (videos, blog posts, and documentation) that you can take with you.

### EMR

Title | Link
--- | --- 
Deep Dive and Best Practices|https://www.youtube.com/watch?v=dU40df0Suoo
What's New 2021|https://www.youtube.com/watch?v=lGm8qe4tBrg
What's New 2020|https://pages.awscloud.com/Deep-Dive-into-Whats-New-in-Amazon-EMR_2020_0230-ABD_OD.html
EMR 6.0.0|https://www.youtube.com/watch?v=M_EOXbJhD3g
Self-Service EMR with AWS Service Catalog|https://aws.amazon.com/de/blogs/big-data/build-a-self-service-environment-for-each-line-of-business-using-amazon-emr-and-aws-service-catalog/
Record-Level Changes with Apache Hudi and AWS DMS|https://aws.amazon.com/de/blogs/big-data/apply-record-level-changes-from-relational-databases-to-amazon-s3-data-lake-using-apache-hudi-on-amazon-emr-and-aws-database-migration-service/
Flink|https://aws.amazon.com/de/blogs/big-data/use-apache-flink-on-amazon-emr/
File Formats|https://www.youtube.com/watch?v=aIcxFIyL6xo
Spark Optimization|https://www.youtube.com/watch?v=daXEp4HmS-E
Spark on EMR|https://www.youtube.com/watch?v=aIwJlfEAlHQ
EMR vs Glue|https://aws.amazon.com/de/blogs/big-data/how-drop-used-the-amazon-emr-runtime-for-apache-spark-to-halve-costs-and-get-results-5-4-times-faster/
EMR Data Access Controls|https://www.youtube.com/watch?v=qOoWnBhnbuU
EMR Serverless|https://www.youtube.com/watch?v=qk3TDZ4OkNE
Lake Formation Tag-Based Access Control|https://docs.aws.amazon.com/lake-formation/latest/dg/TBAC-overview.html

### Spark

Title | Link
--- | ---
Spark Jobs on EKS|https://www.youtube.com/watch?v=Om8RRGbZ6zA
Spark on EKS Best Practices | https://www.youtube.com/watch?v=3EbTr79wLkU
RDDs, Dataframes, Datasets|https://www.youtube.com/watch?v=Ofk7G3GD9jk

### Hive

Title | Link
--- | --- 
ACID Transactions|https://aws.amazon.com/de/blogs/big-data/amazon-emr-supports-apache-hive-acid-transactions/

### Step Functions

Title | Link
--- | --- 
EMR Orchestration|https://aws.amazon.com/de/blogs/aws/new-using-step-functions-to-orchestrate-amazon-emr-workloads/
Managed Airflow|https://aws.amazon.com/de/managed-workflows-for-apache-airflow/
X-Ray Support|https://aws.amazon.com/about-aws/whats-new/2020/09/aws-step-functions-adds-support-for-aws-x-ray/
MWAA|https://aws.amazon.com/de/blogs/aws/introducing-amazon-managed-workflows-for-apache-airflow-mwaa/

### Lambda

Title | Link
--- | --- 
Lambda Destinations|https://aws.amazon.com/blogs/compute/introducing-aws-lambda-destinations/
Chaining Lambda Functions|https://www.refinery.io/post/how-to-chain-serverless-functions-call-invoke-a-lambda-from-another-lambda
3|https://www.youtube.com/watch?v=Jkx6kVbDpL4
Mitigating Serverless Lock-in Fears|https://www.thoughtworks.com/insights/blog/mitigating-serverless-lock-fears

### Glue

Title | Link
--- | --- 
DataBrew|https://aws.amazon.com/glue/features/databrew/
Data Lakes with Glue|https://www.youtube.com/watch?v=JsNR8uBVSiA
PySpark For Glue|https://www.youtube.com/watch?v=DICsZiwuHJo
1|https://www.youtube.com/watch?v=S_xeHvP7uMo
Serverless Data Analytics Pipeline Reference Architecture|https://aws.amazon.com/de/blogs/big-data/aws-serverless-data-analytics-pipeline-reference-architecture/
Serverless Streaming ETL Jobs with Glue|https://aws.amazon.com/de/blogs/big-data/crafting-serverless-streaming-etl-jobs-with-aws-glue/

### Athena

Title | Link
--- | --- 
1|https://www.youtube.com/watch?v=tzoXRRCVmIQ
2|https://www.youtube.com/watch?v=JIviltfpul0

### Big Data Architecture

Title | Link
--- | ---
Spark on EKS |https://www.youtube.com/watch?v=lHM96P5kP2k
Use case 1|https://www.youtube.com/watch?v=XpFNznmRoQ0
Use case 2|https://www.youtube.com/watch?v=wbh51O3QrE4
Data Lake|https://www.youtube.com/watch?v=7i1tj59pvYw
Hearst Corp|https://www.youtube.com/watch?v=6cwbbqi36k8

### Resiliency

Title | Link
--- | --- 
EMR Resilience and Cost with Capacity-Optimized Spot Instances|https://aws.amazon.com/de/blogs/big-data/optimizing-amazon-emr-for-resilience-and-cost-with-capacity-optimized-spot-instances/
2|https://www.youtube.com/watch?v=Fup5vHEvU50
Chaos Lambda (BBC)|https://github.com/bbc/chaos-lambda

### MSK (Managed Streaming for Apache Kafka)

Title | Link
--- | --- 
1|https://www.youtube.com/watch?v=HtU9pb18g5Q


### DynamoDB

Title | Link
--- | --- 
1|https://www.youtube.com/watch?v=HaEPXoXVf2k
Dynamo Paper (SOSP 2007)|https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

### Kinesis

Title | Link
--- | ---
Kinesis Best Practices | https://www.youtube.com/watch?v=jKPlGznbfZ0
Enhanced Fan-Out | https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/
Streaming ETL|https://aws.amazon.com/de/blogs/big-data/unified-serverless-streaming-etl-architecture-with-amazon-kinesis-data-analytics/
Storm Integration|https://github.com/amazon-archives/kinesis-storm-spout
Flink and Kinesis|https://aws.amazon.com/blogs/big-data/streaming-etl-with-apache-flink-and-amazon-kinesis-data-analytics/

### Nitro System

Title | Link
--- | --- 
AWS Hypervisor Security|https://www.youtube.com/watch?v=0qcUOKupt7Y

### re:Invent 2021

Title | Link
--- | --- 
AWS re:Invent | https://reinvent.awsevents.com

