Lab 09: Branching, GitHub Actions, and Reading from DynamoDB
============================================================
This lab is worth 4 points. Submit your answers via Blackboard with screenshots of your progress.
:::info
**Before starting:** You should have completed Lab 8. You'll need your running EC2 instance available. This lab uses a [new starter repository](https://github.com/merriekay/cs178-lab09) — you'll fork and clone it as part of the setup below.
:::
---
Setup: Fork and Clone the Starter Repo
======================================
:::info
**Windows users:** Use **Git Bash** as your terminal throughout this lab, not PowerShell or Command Prompt. Git Bash gives you the same Unix commands used in class and on your EC2 instance. You can open it by right-clicking in a folder and selecting "Git Bash Here," or by searching for Git Bash in the Start menu.
:::
1. Go to the course starter repository: [`https://github.com/merriekay/cs178-lab09`](https://github.com/merriekay/cs178-lab09)
2. Click **Fork** (top right) to create your own copy under your GitHub account.

3. On your **local machine**, open a terminal and clone your fork:
```bash
cd ~/Documents/CS178
git clone https://github.com/YOUR-USERNAME/cs178-lab09.git
cd cs178-lab09
ls
```

4. Open the folder in VS Code: **File → Open Folder** → select `cs178-lab09`.
You're ready to go.
---
Section 1: Keeping Secrets Out of Your Code — `.gitignore`
==========================================================
Before we write any new code, let's talk about a common and important mistake: committing files to GitHub that don't belong there, whether that's machine-generated clutter or, worse, sensitive information like keys and passwords.
When you run a Python script, the interpreter automatically generates `__pycache__` folders and `.pyc` files — compiled bytecode that speeds up future runs. These are machine-generated and differ on every computer, so there's no reason to track them in git. On Macs, Finder also creates invisible `.DS_Store` files in folders you open. Neither of these belongs in your repository.
A `.gitignore` file tells git which files to silently ignore — they won't show up in `git status` and can't be accidentally committed.
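To build intuition for how those ignore entries work: `.gitignore` lines are glob patterns, much like shell wildcards. As a rough approximation (this sketch uses Python's `fnmatch` to illustrate the idea, not git's exact matching algorithm):

```python
from fnmatch import fnmatch

# The patterns we're about to put in .gitignore
ignore_patterns = ["*.pyc", ".DS_Store"]

def is_ignored(filename):
    """Rough approximation: does any ignore pattern match this filename?"""
    return any(fnmatch(filename, pattern) for pattern in ignore_patterns)

print(is_ignored("utils.pyc"))       # True: matches *.pyc
print(is_ignored(".DS_Store"))       # True: exact match
print(is_ignored("read_movies.py"))  # False: tracked as normal
```

Any filename matching a pattern simply stops showing up in `git status`.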
**On your local machine**, open a terminal and navigate to your project folder:
```bash
cd ~/Documents/CS178/cs178-lab09
```
Create a `.gitignore` file:
```bash
touch .gitignore
```
Open `.gitignore` in VS Code and add the following:
```
# Python bytecode — generated automatically when you run a script, no need to track
__pycache__/
*.pyc
# macOS metadata files — created automatically by Finder, not part of your project
.DS_Store
```
Save the file, then check what git sees:
```bash
git status
```
Add, commit, and push `.gitignore`:
> I'd recommend running these one at a time so you get the practice of understanding what each command is doing.
```bash
git add .gitignore
git commit -m "Add .gitignore"
git push origin main
```
---
Section 2: Feature Branches
===========================
So far, all of your work has happened on the `main` branch. This works fine when you're working alone on a small script — but in real development, `main` is supposed to represent **working, stable code**. New features get built on separate **branches** so that incomplete or broken work doesn't affect the main codebase.
Here's the mental model:
```
main ──────────────────────────────────────── (stable, always works)
       \
        feature/read-dynamo ────────────── (work in progress)
```
When the feature is done and tested, you **merge** it back into `main`.
Even working solo, branches are a great habit — they give you a clean record of what changed and why, and they let you experiment without fear.
---
Step 1: Create a Feature Branch
-------------------------------
On your **local machine**, create a new branch for your DynamoDB feature:
> Again, run these one at a time, please.
```bash
git checkout main
git pull origin main
git checkout -b feature/read-dynamo
```

:::info
`git checkout -b <branch-name>` creates a new branch **and** switches to it in one step. The `-b` flag means "new branch." The `feature/` prefix is a naming convention — it tells anyone looking at the repo that this branch contains work-in-progress feature code.
:::
Verify you're on the new branch:
```bash
git branch
```
You should see `* feature/read-dynamo` with an asterisk indicating it's your current branch.

---
Step 2: Write the DynamoDB Read Feature
---------------------------------------
Now let's write something meaningful on this branch. You're going to create a Python script that reads all items from your DynamoDB `Movies` table using **boto3**, the AWS SDK for Python.
**boto3** is the official Python library for interacting with AWS services. Instead of clicking through the AWS console, your code can directly query DynamoDB, upload files to S3, trigger Lambda functions, and more.
In VS Code, create a new file called `read_movies.py` in your `cs178-lab09` folder and add the following:
```python
# read_movies.py
# Reads all items from the DynamoDB Movies table and prints them.
# Part of Lab 09 — feature/read-dynamo branch

import boto3

# -------------------------------------------------------
# Configuration — update REGION if your table is elsewhere
# -------------------------------------------------------
REGION = "us-east-1"
TABLE_NAME = "Movies"


def get_table():
    """Return a reference to the DynamoDB Movies table."""
    dynamodb = boto3.resource("dynamodb", region_name=REGION)
    return dynamodb.Table(TABLE_NAME)


def print_movie(movie):
    """Print a single movie's details in a readable format."""
    title = movie.get("Title", "Unknown Title")
    year = movie.get("Year", "Unknown Year")

    # Ratings is a nested map in the table — handle it gracefully
    ratings = movie.get("Ratings", {})
    rating_str = ", ".join(f"{k}: {v}" for k, v in ratings.items()) if ratings else "No ratings"

    print(f"  Title  : {title}")
    print(f"  Year   : {year}")
    print(f"  Ratings: {rating_str}")
    print()


def print_all_movies():
    """Scan the entire Movies table and print each item."""
    table = get_table()

    # scan() retrieves ALL items in the table.
    # For large tables you'd use query() instead — but for our small
    # dataset, scan() is fine.
    response = table.scan()
    items = response.get("Items", [])

    if not items:
        print("No movies found. Make sure your DynamoDB table has data.")
        return

    print(f"Found {len(items)} movie(s):\n")
    for movie in items:
        print_movie(movie)


def main():
    print("===== Reading from DynamoDB =====\n")
    print_all_movies()


if __name__ == "__main__":
    main()
```
Save the file.
:::info
**What's `scan()` doing?** DynamoDB's `scan()` reads every item in the table. It's the simplest way to retrieve all data, but it's also the slowest — it touches every partition. For a small table like `Movies`, it's perfectly fine. In production systems with millions of rows, you'd use `query()` with a specific partition key instead.
:::
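One detail worth knowing even now: a single `scan()` call returns at most 1 MB of data, and when there's more, the response includes a `LastEvaluatedKey` that you pass back as `ExclusiveStartKey` to get the next page. Here's a minimal sketch of that loop, written against a stand-in table object (my own `FakeTable`, so it runs anywhere without AWS); a real boto3 `Table` responds with the same shape:

```python
def scan_all(table):
    """Collect every item, following scan()'s pagination.

    When a scan response includes LastEvaluatedKey, there are more
    items; pass it back as ExclusiveStartKey to fetch the next page.
    """
    items = []
    kwargs = {}
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return items
        kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Stand-in for a boto3 Table: serves two 'pages' of results."""
    pages = [
        {"Items": [{"Title": "Heat"}], "LastEvaluatedKey": {"Title": "Heat"}},
        {"Items": [{"Title": "Up"}]},  # no LastEvaluatedKey: final page
    ]

    def scan(self, **kwargs):
        return self.pages[1] if "ExclusiveStartKey" in kwargs else self.pages[0]


print(len(scan_all(FakeTable())))  # 2
```

Our `Movies` table fits in one page, so the simple `table.scan()` in `read_movies.py` is fine as-is.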
Check that git sees the new file:
```bash
git status
```

Stage and commit it:
```bash
git add read_movies.py
git commit -m "Add read_movies.py to read from DynamoDB"
```

---
Step 3: Push the Feature Branch to GitHub
-----------------------------------------
Now push your branch to GitHub. This is slightly different from pushing to `main`:
```bash
git push origin feature/read-dynamo
```

Go to your GitHub repository in the browser. You should see a banner at the top saying something like **"feature/read-dynamo had recent pushes"** with a button to open a pull request.

:::info
**What's a pull request?** A pull request (PR) is a formal way of saying "I'd like to merge this branch into main." On team projects, PRs are where code review happens. For now, we'll just observe that the branch exists on GitHub — we'll merge it after setting up GitHub Actions.
:::
---
Section 3: Automating Deployment with GitHub Actions
====================================================
Right now, getting your code onto EC2 requires three manual steps after every change: push to GitHub, SSH into EC2, run `git pull`. That's fine for one or two changes — but it gets tedious fast, and it's easy to forget.
**GitHub Actions** is a built-in automation tool that watches your repository and runs a set of steps whenever you push code. We'll configure it to automatically SSH into your EC2 instance and run `git pull` every time you push to `main`.
:::info
**This is called CI/CD** — Continuous Integration / Continuous Deployment. It's the standard in real engineering teams. Every company from startups to Amazon uses some version of this workflow. You're learning the same pattern.
:::
---
Step 1: Generate a Deploy Key on EC2
------------------------------------
GitHub Actions needs a way to SSH into your EC2 instance. We'll create a dedicated SSH key pair for this — separate from your personal `.pem` file, so it can be revoked independently if needed.
Open a new terminal in VS Code and SSH into your EC2 instance using the shortcut command we set up in Lab 8 (or the full `ssh` command from that lab):
```bash!
ssh cs17
```
You should see the eagle; that's our clue the `ssh` command worked:

then run:
```bash
ssh-keygen -t ed25519 -C "github-actions-deploy" -f ~/.ssh/deploy_key
```
Press **Enter** at all three prompts — no passphrase needed.

Add the public key to EC2's list of authorized SSH clients:
```bash
cat ~/.ssh/deploy_key.pub >> ~/.ssh/authorized_keys
```
Display the **private** key — you'll copy this into GitHub in the next step:
```bash
cat ~/.ssh/deploy_key
```

:::warning
Copy the **entire** output, including the `-----BEGIN OPENSSH PRIVATE KEY-----` and `-----END OPENSSH PRIVATE KEY-----` lines. Missing either line will cause the action to fail.
:::
---
Step 2: Add Secrets to GitHub
-----------------------------
GitHub Secrets let you store sensitive values — like SSH keys and hostnames — so GitHub Actions can use them without ever exposing them in your code or logs.
Go to your forked repository on GitHub → **Settings → Secrets and variables → Actions → New repository secret**

Add these three secrets one at a time:
| Secret Name | Value |
| --- | --- |
| `EC2_PRIVATE_KEY` | The entire contents of `~/.ssh/deploy_key` from Step 1 |
| `EC2_HOST` | Your EC2 **Public IPv4 DNS** (looks like `ec2-XX-XX-XX-XX.compute-1.amazonaws.com`) |
| `EC2_USER` | `ec2-user` |
:::warning
Use the **Public IPv4 DNS** — not the plain IP address. Find it in the EC2 console under your instance details. The DNS name starts with `ec2-` and ends with `.amazonaws.com`. This is the most common setup mistake.
:::


It should look like this when you're done:

Step 3: Clone the repo to EC2
-----------------------------
While we're signed into our EC2 instance, go ahead and clone your fork onto EC2:
Here's how to find the right URL:

```bash
cd ~
git clone https://github.com/YOUR-USERNAME/cs178-lab09
ls cs178-lab09/
```

You now have the repo in three places: your laptop, GitHub, and EC2. When GitHub Actions runs `git pull` on EC2, it will update this copy automatically.
---
Step 4: Create the GitHub Actions Workflow File
-----------------------------------------------
:::info
**Switch back to your local machine for this step.** You'll create the workflow file locally and push it — this way it's tracked in git like any other file.
:::
Make sure you're on `main`:
```bash
git checkout main
```
Create the GitHub Actions directory and workflow file:
```bash
mkdir -p .github/workflows
touch .github/workflows/deploy.yml
```
Open `deploy.yml` in VS Code and paste the following:
```yaml
# .github/workflows/deploy.yml
# This workflow runs every time code is pushed to main.
# It SSHes into the EC2 instance and runs git pull to deploy the latest code.
name: Deploy to EC2

on:
  push:
    branches:
      - main  # Only trigger on pushes to main, not feature branches

jobs:
  deploy:
    runs-on: ubuntu-latest  # GitHub spins up a temporary Linux machine to run these steps
    steps:
      # Step 1: Write the private SSH key to a temp file so we can use it
      - name: Set up SSH key
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.EC2_PRIVATE_KEY }}" > ~/.ssh/deploy_key
          chmod 600 ~/.ssh/deploy_key
          ssh-keyscan -H ${{ secrets.EC2_HOST }} >> ~/.ssh/known_hosts

      # Step 2: SSH into EC2 and pull the latest code from main
      - name: Deploy via SSH
        run: |
          ssh -i ~/.ssh/deploy_key ${{ secrets.EC2_USER }}@${{ secrets.EC2_HOST }} \
            "cd ~/cs178-lab09 && git pull origin main"
```

:::info
**What's `secrets.EC2_PRIVATE_KEY` doing?** GitHub Actions replaces `${{ secrets.SECRET_NAME }}` with the actual secret value at runtime — but that value is never printed in logs or visible to other code. This is how teams use credentials safely in automation pipelines.
:::
Add, commit, and push the workflow to `main`:
```bash
git add .github/workflows/deploy.yml
git commit -m "Add GitHub Actions deploy workflow"
git push origin main
```

---
Step 5: Test the Automation
---------------------------
Go to your GitHub repository → **Actions** tab. You should see a workflow run triggered by your last push.

Wait for the green ✅. Then verify the deployment worked, **without running `git pull` yourself**, by checking from the EC2 terminal you still have open:
```bash
ls -a ~/cs178-lab09/
```
You should see `.github/` and `.gitignore` now present (the `-a` flag shows hidden "dotfiles"), confirming EC2 pulled the latest `main`.
:::success
From this point on: edit locally → `git push origin main` → code is on EC2. The manual SSH and `git pull` steps are gone.
:::
---
# Section 4: Attach an IAM Role to your EC2 Instance
Okay, so while we granted `AmazonDynamoDBFullAccess` permissions on the IAM side back in Lab 7, there's actually one more step before our EC2 instance itself can access our DynamoDB tables: the instance needs an IAM role attached to it.
### Step 1: Create the Role
1. Log out of your `ec2` IAM user account, and log in as your root user.
2. Go to IAM --> Roles --> Create role

3. Trusted entity type: **AWS service**
4. Use case: EC2 --> click Next:

5. Search for and select AmazonDynamoDBFullAccess and click Next:

6. Add `EC2DynamoDBRole` as the name

7. Click **Create role**: 
8. You should see the confirmation:

### Step 2: Attach the role to your instance
(stay logged into your root account)
1. Go back to EC2 --> Instances --> Select your instance

2. Actions --> Security --> Modify IAM role

3. Search for `EC2DynamoDBRole` and select the role we just created:

4. Click 'Update IAM Role' and look for this confirmation:

---
Section 5: Merging the Feature Branch
=====================================
Your GitHub Actions workflow is set up and working. Now let's bring the DynamoDB feature into `main` so it gets deployed automatically.
---
Step 1: Merge Locally
---------------------
Switch back to your `feature/read-dynamo` branch to confirm the feature file is there:
> again, take these commands one at a time and think through what each is doing
```bash
git checkout feature/read-dynamo
ls
```
Now merge it into `main`:
```bash
git checkout main
git merge feature/read-dynamo
```
For me, this pulled up a `vim` interface, which is just a different text editor. To add your merge message and get out: hit `i` to enter insert mode, type whatever message you want to add to the merge, then hit `Esc`, type `:wq` (this means "write my work and quit"), and hit `Enter`.
You should see something similar to this now:

:::info
**Fast-forward merge:** Since no one committed anything to `main` while you were working on the feature branch, git can simply move `main` forward to point at your new commits. There's nothing to reconcile — it's the simplest kind of merge.
:::
---
Step 2: Push and Watch the Deployment
-------------------------------------
Push `main` to GitHub:
```bash
git push origin main
```

Go to the **Actions** tab on GitHub and watch the workflow run automatically.

Once the workflow completes, SSH into EC2 and confirm `read_movies.py` arrived:
```bash
ls ~/cs178-lab09/
```

---
Step 3: Install boto3 and Run the Script
----------------------------------------
Now that the IAM role is in order, go back to VS Code and SSH into your EC2 instance.
On your EC2 instance, install the libraries listed in `requirements.txt` by running the commands below (we don't need all of them yet, but we will soon).
Double check that you're in the `cs178-lab09` folder first:
```bash!
pwd
cd ~/cs178-lab09/
```

```bash
sudo dnf install python3-pip -y
pip3 install -r requirements.txt
```
There will be a bunch of console output, but near the end you should see `Successfully installed ...`
Then run the script:
```bash
cd ~/cs178-lab09
python3 read_movies.py
```
:::warning
If you get an error, it likely comes from a mismatch between the attributes in your `Movies` table and how `read_movies.py` tries to print them.
For example, I needed to go back and update my `print_movie(movie)` function to look like this:
```python!
def print_movie(movie):
    title = movie.get("Title", "Unknown Title")
    year = movie.get("Year", "Unknown Year")
    ratings = movie.get("Ratings", "No ratings")

    print(f"  Title  : {title}")
    print(f"  Year   : {year}")
    print(f"  Ratings: {ratings}")
    print()
```
And, of course, that means you need to edit locally, save your files, `git add .`, `git commit -m "msg"`, and `git push origin main` so the file updates on EC2. Then try running `python3 read_movies.py` again. If it works, you should see something like this:

:::
:::warning
**Getting an error?** Check these things in order:
1. **Region mismatch** — is your DynamoDB `Movies` table in `us-east-1`? If not, update `REGION` in `read_movies.py`.
2. **IAM permissions** — go to EC2 → your instance → **Security** tab → check that the IAM role has DynamoDB read access.
3. **Table name** — `Movies` is case-sensitive. Confirm it matches exactly in the AWS console.
:::
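One more quirk that trips people up: boto3 returns every numeric DynamoDB attribute as a Python `decimal.Decimal`, so a `Year` stored as a number shows up as `Decimal('1994')` in raw output rather than `1994`. A tiny illustration of what you'll see and the easy fix:

```python
from decimal import Decimal

# boto3 hands numeric DynamoDB attributes back as Decimal, not int/float.
year = Decimal("1994")
print(repr(year))  # Decimal('1994')
print(int(year))   # 1994, a plain int again
```

If your ratings or years print with `Decimal(...)` wrapped around them, convert with `int()` or `float()` before printing.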
---
Section 6: Challenges
=====================
✅ Challenge #1
--------------
Add a new attribute to at least two movies in your DynamoDB `Movies` table using the AWS console (e.g. `Genre`, `Director`, or `Runtime`).
Then, **on your local machine on a new feature branch** called `feature/display-genre` (or similar), update `read_movies.py` so that `print_movie()` also prints the new attribute. Merge it into `main` and push — GitHub Actions will deploy it automatically.
Run the script on EC2 to verify.
**Submit:** A screenshot of your updated `print_movie()` function and the EC2 terminal output showing the new attribute.
---
✅ Challenge #2
--------------
Add a `get_movie_by_title()` function to `read_movies.py` that:
1. Prompts the user to enter a movie title
2. Searches the table for a movie with that title (hint: use `scan()` with a `FilterExpression`)
3. Prints the movie if found, or a "not found" message if it isn't
Push your changes from `main` and run on EC2.
**Submit:** A screenshot of your function code and the EC2 output showing a successful search result.
---
✅ Challenge #3
--------------
Create a **new DynamoDB table** on any topic you like (a playlist, a reading list, a recipe book — your choice). Add at least 3 items with at least 2 attributes each.
Write a new Python file (e.g. `read_my_table.py`) that connects to your table and prints all items. Use a feature branch, merge it to `main`, and let GitHub Actions deploy it.
**Submit:** A screenshot of your DynamoDB table in the AWS console and the EC2 terminal output showing your script running.
---
:::success
🎉 Nice work! Submit your screenshots via Blackboard for full credit.
:::