# Building Batch Data Analytics Solutions on AWS

Welcome to the Building Batch Data Analytics Solutions on AWS class. This will be used as supporting material that you can take with you.

### EMR

Title | Link
--- | ---
Deep Dive and Best Practices | https://www.youtube.com/watch?v=dU40df0Suoo
Whats new 2021 | https://www.youtube.com/watch?v=lGm8qe4tBrg
Whats new 2020 | https://pages.awscloud.com/Deep-Dive-into-Whats-New-in-Amazon-EMR_2020_0230-ABD_OD.html
EMR 6.0.0 | https://www.youtube.com/watch?v=M_EOXbJhD3g
2 | https://aws.amazon.com/de/blogs/big-data/build-a-self-service-environment-for-each-line-of-business-using-amazon-emr-and-aws-service-catalog/
3 | https://aws.amazon.com/de/blogs/big-data/apply-record-level-changes-from-relational-databases-to-amazon-s3-data-lake-using-apache-hudi-on-amazon-emr-and-aws-database-migration-service/
Flink | https://aws.amazon.com/de/blogs/big-data/use-apache-flink-on-amazon-emr/
File Formats | https://www.youtube.com/watch?v=aIcxFIyL6xo
Spark Optimization | https://www.youtube.com/watch?v=daXEp4HmS-E
Spark on EMR | https://www.youtube.com/watch?v=aIwJlfEAlHQ
EMR vs Glue | https://aws.amazon.com/de/blogs/big-data/how-drop-used-the-amazon-emr-runtime-for-apache-spark-to-halve-costs-and-get-results-5-4-times-faster/
EMR Data Access Controls | https://www.youtube.com/watch?v=qOoWnBhnbuU
EMR Serverless | https://www.youtube.com/watch?v=qk3TDZ4OkNE
Lake Formation Tag AC | https://docs.aws.amazon.com/lake-formation/latest/dg/TBAC-overview.html

### Spark

Title | Link
--- | ---
Spark Jobs on EKS | https://www.youtube.com/watch?v=Om8RRGbZ6zA
Spark on EKS Best Practices | https://www.youtube.com/watch?v=3EbTr79wLkU
RDDs, Dataframes, Datasets | https://www.youtube.com/watch?v=Ofk7G3GD9jk

### Hive

Title | Link
--- | ---
ACID Transactions | https://aws.amazon.com/de/blogs/big-data/amazon-emr-supports-apache-hive-acid-transactions/

### Step Functions

Title | Link
--- | ---
EMR Orchestration | https://aws.amazon.com/de/blogs/aws/new-using-step-functions-to-orchestrate-amazon-emr-workloads/
Managed Airflow | https://aws.amazon.com/de/managed-workflows-for-apache-airflow/
X-Ray Support | https://aws.amazon.com/about-aws/whats-new/2020/09/aws-step-functions-adds-support-for-aws-x-ray/
MWAA | https://aws.amazon.com/de/blogs/aws/introducing-amazon-managed-workflows-for-apache-airflow-mwaa/

### Lambda

Title | Link
--- | ---
Lambda Destinations | https://aws.amazon.com/blogs/compute/introducing-aws-lambda-destinations/
2 | https://www.refinery.io/post/how-to-chain-serverless-functions-call-invoke-a-lambda-from-another-lambda
3 | https://www.youtube.com/watch?v=Jkx6kVbDpL4
4 | https://www.thoughtworks.com/insights/blog/mitigating-serverless-lock-fears

### Glue

Title | Link
--- | ---
Data Brew | https://aws.amazon.com/glue/features/databrew/
Data Lakes with Glue | https://www.youtube.com/watch?v=JsNR8uBVSiA
PySpark For Glue | https://www.youtube.com/watch?v=DICsZiwuHJo
1 | https://www.youtube.com/watch?v=S_xeHvP7uMo
2 | https://aws.amazon.com/de/blogs/big-data/aws-serverless-data-analytics-pipeline-reference-architecture/
3 | https://aws.amazon.com/de/blogs/big-data/crafting-serverless-streaming-etl-jobs-with-aws-glue/

### Athena

Title | Link
--- | ---
1 | https://www.youtube.com/watch?v=tzoXRRCVmIQ
2 | https://www.youtube.com/watch?v=JIviltfpul0

### BigData Architecture

Title | Link
--- | ---
Spark on EKS | https://www.youtube.com/watch?v=lHM96P5kP2k
Use case 1 | https://www.youtube.com/watch?v=XpFNznmRoQ0
Use case 2 | https://www.youtube.com/watch?v=wbh51O3QrE4
Data Lake | https://www.youtube.com/watch?v=7i1tj59pvYw
Hearst Corp | https://www.youtube.com/watch?v=6cwbbqi36k8

### Resiliency

Title | Link
--- | ---
1 | https://aws.amazon.com/de/blogs/big-data/optimizing-amazon-emr-for-resilience-and-cost-with-capacity-optimized-spot-instances/
2 | https://www.youtube.com/watch?v=Fup5vHEvU50
3 | https://github.com/bbc/chaos-lambda

### MSK Managed Kafka

Title | Link
--- | ---
1 | https://www.youtube.com/watch?v=HtU9pb18g5Q

### DynamoDB

Title | Link
--- | ---
1 | https://www.youtube.com/watch?v=HaEPXoXVf2k
2 | https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

### Kinesis

Title | Link
--- | ---
Kinesis Best Practices | https://www.youtube.com/watch?v=jKPlGznbfZ0
Enhanced Fanout | https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/
Streaming ETL | https://aws.amazon.com/de/blogs/big-data/unified-serverless-streaming-etl-architecture-with-amazon-kinesis-data-analytics/
Storm Integration | https://github.com/amazon-archives/kinesis-storm-spout
Flink and Kinesis | https://aws.amazon.com/blogs/big-data/streaming-etl-with-apache-flink-and-amazon-kinesis-data-analytics/

### Nitro System

Title | Link
--- | ---
AWS Hypervisor Security | https://www.youtube.com/watch?v=0qcUOKupt7Y

### Re:Invent 2021

Title | Link
--- | ---
Re:Invent | https://reinvent.awsevents.com