Lab 16: Serverless Image Processing with AWS Lambda
===================================================
###### CS 178 · Cloud Computing and Database Systems · Spring 2026
:::info
**The big idea:** In this lab, you'll wire together three AWS services so that a single file upload kicks off an automatic pipeline — no button presses, no EC2 intervention, no `nohup`. That automatic handoff _between_ services is what Lambda is for.
**Pipeline you're building:**
| Step | What happens | Who does it |
| --- | --- | --- |
| 1 | You upload an image via the Flask form | Pre-built |
| 2 | Flask uses `boto3` to put the file in S3 | **You configure bucket names (Exercise 1)** |
| 3 | S3 fires an event — Lambda wakes up automatically | AWS handles this |
| 4 | Lambda flips the image with Pillow, saves to processed bucket | **You configure this** |
| 5 | Flask fetches a pre-signed URL and displays the result | Pre-built |
**What you'll turn in:** 4 points total (details at the bottom)
:::
---
Before You Start
----------------
### 1\. Fork and clone the starter repo
Fork `https://github.com/merriekay/cs178-lab16` on GitHub, then clone your fork **locally**:
```bash
git clone https://github.com/YOUR-USERNAME/cs178-lab16.git
cd cs178-lab16
```
### 2\. Install dependencies locally
```bash
pip install -r requirements.txt # or pip3, or use your venv
```
### 3\. Clone the repo on EC2
SSH into your EC2 instance and clone your fork there too:
```bash
ssh cs178
git clone https://github.com/YOUR-USERNAME/cs178-lab16.git
cd cs178-lab16
pip install -r requirements.txt
```
:::warning
Make sure you clone into your **home directory** (`~/cs178-lab16`) — the deploy workflow expects to find it there. You can confirm with `pwd` after cloning.
:::
### 4\. Open port 8888 in your EC2 security group
Lab 16 runs on port 8888 (to avoid conflicting with Project 1 on port 5000).
1. Go to **EC2 → Instances → your instance → Security tab → Security groups → Edit inbound rules**
2. Add a rule: **Type:** Custom TCP, **Port:** 8888, **Source:** Anywhere (0.0.0.0/0)
3. Save
### 5\. Set up GitHub Actions secrets
The deploy workflow uses the same three secrets as Lab 9. If they're already set on your fork, you're good. If not:
1. Go to your **GitHub repo → Settings → Secrets and variables → Actions → New repository secret**
2. Add all three:
| Secret name | Value |
| --- | --- |
| `EC2_HOST` | Your EC2 public IP address |
| `EC2_USER` | `ec2-user` |
| `EC2_PRIVATE_KEY` | Contents of your `.pem` file (the whole thing, including the `-----BEGIN` and `-----END` lines) |
:::info
These are the same secrets you set up in Lab 9. If you forked from your Lab 9 repo they won't carry over — secrets are per-repo and need to be added again.
:::
---
Section 1: Set Up Your Two S3 Buckets 🪣🪣
------------------------------------------
You need two buckets: one where Flask drops uploads, one where Lambda puts the processed result.
1. Log into the AWS Console as your **IAM user**.
2. Go to **S3 → Create bucket**.
- Name: `[yourinitials]-image-source` (e.g. `mkm-image-source`)
- Leave all defaults. Click **Create bucket**.

3. Create a **second** bucket:
- Name: `[yourinitials]-image-source-processed` (e.g. `mkm-image-source-processed`)
You should see both buckets in your S3 list:

:::warning
Bucket names must be lowercase, no spaces, and **globally unique across all of AWS**. If your initials clash with someone else's, add a number: `mkm2-image-source`.
:::
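The console will reject an invalid name, but a quick local check saves the round trip. Here's a sketch of the common naming rules (3–63 characters; lowercase letters, digits, dots, and hyphens; must start and end with a letter or digit). This is a subset of the full S3 spec, and it can't check global uniqueness, only AWS can:

```python
import re

# Common S3 bucket-naming rules (a subset of the full spec):
# 3-63 chars, lowercase letters / digits / dots / hyphens,
# must start and end with a letter or digit.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def looks_like_valid_bucket_name(name: str) -> bool:
    """Local sanity check only; global uniqueness must be checked by AWS."""
    return bool(BUCKET_NAME_RE.match(name))

print(looks_like_valid_bucket_name("mkm-image-source"))   # True
print(looks_like_valid_bucket_name("MKM image source"))   # False: uppercase + space
```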
---
Section 2: IAM Permissions
--------------------------
Lambda needs permission to read from the source bucket and write to the processed bucket. We grant this via an IAM **Role** — a named set of permissions that AWS services can assume on your behalf.
:::info
**Mental model:** Your IAM _user_ is you logging into the console. An IAM _role_ is a set of permissions that a service (like Lambda) can pick up and run with — like handing a contractor a key card that only opens certain doors.
:::
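Concretely, when you pick **Lambda** as the use case below, the console attaches a _trust policy_ to the role that says who may pick up the key card. It looks roughly like this (shown for illustration only; the console writes it for you):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```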
1. Switch to your **root** account.
2. Go to **IAM → Roles → Create Role**.

3. **Use case:** choose **AWS Service** and below select **Lambda**. Click **Next**.

4. Search for and select these **three** policies:
- `AWSLambda_FullAccess`
- `AmazonS3FullAccess`
- `AWSLambdaBasicExecutionRole`
- _(you can also attach `AmazonRekognitionFullAccess` now if you plan to complete the stretch goal)_

:::info
`AWSLambdaBasicExecutionRole` is what allows Lambda to write output to CloudWatch Logs. Without it, your function runs completely blind — no error messages, no print statements, nothing. Always include it.
:::
5. Name the role `S3LambdaLab16`. Click **Create role**.
6. Also attach `AWSLambda_FullAccess` and `IAMFullAccess` to your **IAM user** (the same user account you've used in past labs).
- Go to **Users** and select your EC2 IAM user.
- Add the policies:

You should see both in your list of permission policies:

7. Sign back in as your **IAM user**.
---
Section 3: Create the Lambda Function
-------------------------------------
1. In the AWS Console, go to **Lambda → Create function**.

2. Select **Author from scratch**.
3. Configure:
- **Function name:** `Lab16ImageFlipper`
- **Runtime:** Python 3.13

4. Under **Change default execution role** → **Use an existing role** → choose `S3LambdaLab16`.

5. Click **Create function**.
### Upload the function code
You'll see a browser-based code editor labeled **Code source**. This is where you can view and edit your Lambda function directly in the console.
The Lambda code is provided for you in `lambda_function.py` in the starter repo. Before uploading, **read through it** — every step is commented. You should be able to answer: _what does `event["Records"][0]["s3"]["object"]["key"]` give you, and where does that value come from?_
Because this function uses Pillow (not available by default in Lambda's runtime), you need to upload a zip that bundles the library alongside the code.
1. Download the pre-packaged zip: [`https://analytics.drake.edu/~moore/CS178/lambda_function.zip`](https://analytics.drake.edu/~moore/CS178/lambda_function.zip)
2. In the Lambda console → **Code** tab → **Upload from → .zip file** → select the zip → **Save**.

- If `lambda_function.py` didn't come through in the zip, paste the code from `lambda_function.py` in the starter repo directly into the editor.
3. After any edits in the code editor, click the **Deploy** button to save and activate your changes. Lambda will not run your updated code until you do this.

### Configure the processed bucket name
The Lambda function needs to know which bucket to write the flipped image to. Set this as an Environment Variable so you don't have to re-package the zip if the bucket name changes.
1. Go to **Configuration tab → Environment variables → Edit**
2. Add:
- Key: `PROCESSED_BUCKET`
- Value: `[yourinitials]-image-source-processed`
3. Click **Save**.


:::info
Note that `PROCESSED_BUCKET` appears in two places — in `lambda_function.py` (for Lambda) and in `app.py` (for Flask). These are two separate programs running in two separate places. Both need your bucket name, but you set them independently.
:::
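On the Lambda side, the environment variable you just set is read with Python's standard `os.environ`. A minimal sketch of the pattern (the value shown is an example, use your own bucket name):

```python
import os

# Simulate what the Lambda console does before invoking the function:
# the env var you configured is injected into the process environment.
os.environ["PROCESSED_BUCKET"] = "mkm-image-source-processed"

# Inside the handler, the bucket name comes from the environment,
# not from a hard-coded string -- change the variable, not the zip.
processed_bucket = os.environ["PROCESSED_BUCKET"]
print(processed_bucket)  # mkm-image-source-processed
```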
### Set memory and timeout
The default Lambda settings aren't enough for image processing. While you're in the **Configuration tab → General configuration → Edit**:
- Set **Memory to 512 MB** — the default 128 MB causes an out-of-memory error when Pillow processes images
- Set **Timeout to 15 seconds** — the default 3 seconds isn't enough time to download, flip, and re-upload an image
Click **Save**.


:::warning
If you skip these steps, the function will fail silently — you'll see nothing in the processed bucket and no useful error message. Set both before moving on.
:::
### Test the function before adding the S3 trigger
Before wiring up S3, let's verify the function code is healthy using Lambda's built-in test tool.
1. In the Lambda console, click the **Test** tab.
2. Select **Create new test event**.
3. Leave the default JSON payload as-is — we just want to confirm the function can initialize without import errors. Name it `MyTest` and click **Save**.
4. Click **Test** to run it.

The function will error (because the test event doesn't have a real S3 record in it), but that's fine — what you're looking for is the _type_ of error:
:::success

**Good error** — `KeyError: 'Records'` means the function imported successfully and ran; it just got confused by the dummy event. The code is healthy. Move on.
:::
:::warning
**Bad error** — `Runtime.ImportModuleError: Unable to import module 'lambda_function'` means Pillow isn't in the zip correctly. The zip must have `lambda_function.py` and the `PIL/` folder at the **root level**, not inside a subfolder. Re-package and re-upload.

:::
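You can reproduce the "good error" in miniature. The console's default test payload is a plain dictionary with no `Records` key, so the first line of the handler raises `KeyError` before any S3 work happens (the dummy payload below approximates the console's default shape):

```python
# The console's default test event is a plain dict with no S3 record in it
dummy_event = {"key1": "value1", "key2": "value2", "key3": "value3"}

try:
    record = dummy_event["Records"][0]  # first line of the real handler
    error_name = None
except KeyError as err:
    error_name = f"KeyError: {err}"

print(error_name)  # KeyError: 'Records'
```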
### **📸 Screenshot 1:**
The Lambda test output panel showing your test result.
_Your screenshot should show the Lambda test execution result with `KeyError: 'Records'` — this means the code loaded correctly. Submit this screenshot to Blackboard._
### Read your first CloudWatch log
Every time Lambda runs — whether from a test or an S3 trigger — it writes to CloudWatch Logs. This is your primary debugging tool.
1. Go to the **Monitor tab**.

2. Click **View CloudWatch Logs → open the most recent log stream**.

3. You'll see entries like:
- `START` — function began executing
- Any `print()` output from your code
- `END` — function finished
- `REPORT` — duration, billed duration, memory used, and `Status`

What to look for:
- `Status: timeout` means your timeout is too short.
- `Runtime.OutOfMemory` means your memory is too low.
- A Python traceback is the actual error to fix.
- Your `print()` statements with no errors mean the function ran successfully.
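If you ever want to pull those `REPORT` numbers out programmatically, the fields follow a fixed `Name: value unit` layout. A small parser sketch (the sample line below is hand-built to match the standard report shape, not a real log entry):

```python
import re

# Example REPORT line (shape based on Lambda's standard log format)
report = ("REPORT RequestId: 1a2b3c\tDuration: 102.25 ms\t"
          "Billed Duration: 103 ms\tMemory Size: 512 MB\tMax Memory Used: 89 MB")

# Pull out "Field: number unit" pairs into a dict
fields = dict(re.findall(r"([A-Za-z ]+): ([\d.]+) (?:ms|MB)", report))
print(fields)  # e.g. {'Duration': '102.25', ..., 'Max Memory Used': '89'}
```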
Don't worry — we'll see this run successfully once we add the S3 trigger in the next section.
---
Section 4: Connect S3 as a Trigger
----------------------------------
Right now, Lambda just sits there. Let's make it fire automatically when a file lands in your source bucket.
1. On the `Lab16ImageFlipper` function page, click **\+ Add trigger**.

2. Choose **S3** as the source.

3. **Bucket:** `[yourinitials]-image-source`
4. **Event types:** leave as default (All object create events).
5. Check the **Recursive invocation** acknowledgment box. Click **Add**.

The designer window should now show an S3 box connected to your function. That arrow _is_ the Lambda concept — code that runs automatically in response to an event, with no server you had to configure.
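Stripped of the image work, the event-driven shape of the function is tiny. Here's a hedged sketch of the handler skeleton (the real `lambda_function.py` does the Pillow work where the comment is; the fake event below is hand-built for local experimentation):

```python
def lambda_handler(event, context):
    # S3 calls this function; the event says which bucket/object fired it
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    print(f"Triggered by upload: {key} in {bucket}")
    # ... download, flip with Pillow, upload to the processed bucket ...
    return {"statusCode": 200, "body": f"processed {key}"}

# Simulated invocation with a hand-built event (no AWS needed)
fake_event = {"Records": [{"s3": {"bucket": {"name": "mkm-image-source"},
                                  "object": {"key": "dog.jpg"}}}]}
print(lambda_handler(fake_event, None))
```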

:::warning
If you get an error saying **"Configurations overlap"**, a leftover S3 event notification from a previous Lambda function is still attached to this bucket. Go to **S3 → your source bucket → Properties → Event notifications**, delete the old rule, then try adding the trigger again.
:::
### **📸 Screenshot 2:**
Lambda designer showing the S3 trigger connected to `Lab16ImageFlipper`.
_Your screenshot should show the Lambda designer view with an S3 box on the left connected by an arrow to `Lab16ImageFlipper`. Submit this screenshot to Blackboard._
---
Section 5: Exercise 1 — Configure Your Bucket Names (1 pt)
----------------------------------------------------------
Open `app.py`. Near the top you'll find this block:
```python
SOURCE_BUCKET = "YOUR-INITIALS-image-source" # e.g. "mkm-image-source"
PROCESSED_BUCKET = "YOUR-INITIALS-image-source-processed" # e.g. "mkm-image-source-processed"
```
Replace both placeholder strings with your actual bucket names from Section 1. That's it — the rest of `app.py` is already written for you.

:::info
Note that `PROCESSED_BUCKET` appears in both `app.py` and `lambda_function.py` — these are two separate programs that each need to know the bucket name. You set the Flask app's copy here, and you set Lambda's copy via the Environment Variable in Section 3.
Before moving on, read through the `upload()` route in `app.py`. Two lines do the heavy lifting:
```python
s3 = boto3.client("s3", region_name=AWS_REGION)
s3.upload_fileobj(file, SOURCE_BUCKET, filename)
```
`boto3.client("s3")` opens a connection to S3. `upload_fileobj()` streams your file straight into the bucket — `filename` becomes the S3 key (the name the object gets inside the bucket). You'll see this pattern any time Python code needs to talk to AWS.
:::
Once updated, test locally:
```bash
python3 app.py
# open http://localhost:8888, upload an image
```
_On Windows, try `python app.py` or `python -m flask run --port 8888` if `python3` isn't recognized._
:::info
**Having trouble running locally?** GitHub Codespaces is always an option — open your repo in Codespaces, run `python3 app.py` in the terminal, and use the forwarded port to test in your browser. Everything works the same as it does locally.
:::
Check your `-image-source` bucket in the S3 console — the file should appear. Lambda triggers automatically, and after ~3 seconds the flipped image should land in `-image-source-processed`.
:::info
**Why the `time.sleep(3)`?** Lambda takes a moment to spin up and process the image. A real production app would use a webhook or polling loop — but for a lab demo, a short sleep is a fine trade-off. This is a deliberate simplification, not a bug.
:::
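If the fixed sleep bothers you, the polling alternative mentioned above can be sketched like this. `object_exists` is a hypothetical callable standing in for a real check against S3 (e.g. a `head_object` call); the fake check below just simulates an object appearing after a few polls:

```python
import time

def wait_for_object(object_exists, timeout=15.0, interval=0.5):
    """Poll until object_exists() is True, or give up after `timeout` seconds.

    `object_exists` is a stand-in for a real check, e.g. an S3 head_object call.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if object_exists():
            return True
        time.sleep(interval)
    return False

# Fake check that "succeeds" on the third poll
calls = {"n": 0}
def fake_check():
    calls["n"] += 1
    return calls["n"] >= 3

print(wait_for_object(fake_check, timeout=5.0, interval=0.01))  # True
```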
When it's working locally, commit and push to deploy to EC2:
```bash
git add app.py
git commit -m "configure bucket names for lab 16"
git push
```
Watch the **Actions** tab on your GitHub repo to confirm the deploy workflow runs green. Then visit `http://[your-ec2-ip]:8888` and upload an image to verify the full pipeline works end-to-end on EC2.
### **📸 Screenshot 3:**
Your Flask app in the browser showing an uploaded image and its flipped result displayed below the form.
_Your screenshot should show your Flask app at your EC2 public IP (port 8888), with the original image on the left and the flipped image on the right. Submit this screenshot to Blackboard._
---
Section 6: Understand the Event Object (Reading Exercise)
---------------------------------------------------------
Open `lambda_function.py` from the starter repo and read it top to bottom. The key lines are:
```python
record = event["Records"][0]
source_bucket = record["s3"]["bucket"]["name"]
filename = record["s3"]["object"]["key"]
```
The `event` dictionary is passed to Lambda by AWS when the trigger fires — it's a JSON document describing _what just happened_: which bucket, which file, when, etc. Lambda didn't go looking for work; S3 called it.
:::info
**Pattern you'll see everywhere:** Every Lambda trigger (S3, DynamoDB Streams, SQS, API Gateway, EventBridge...) passes a different `event` shape. Reading and understanding that shape is the first thing you do in any Lambda function. `print(event)` in CloudWatch is your best debugging tool.
:::
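One wrinkle worth knowing about: S3 URL-encodes the object key in the event, so a filename with spaces arrives as `my+photo.jpg`. The standard fix is `urllib.parse.unquote_plus`. A sketch (the minimal event below is hand-built; a real S3 payload has many more fields):

```python
from urllib.parse import unquote_plus

# Minimal hand-built event -- a real S3 event has many more fields
event = {"Records": [{"s3": {"bucket": {"name": "mkm-image-source"},
                             "object": {"key": "my+photo.jpg"}}}]}

raw_key = event["Records"][0]["s3"]["object"]["key"]
filename = unquote_plus(raw_key)  # decode "+" and %XX escapes
print(filename)  # my photo.jpg
```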
Go to **Lambda → Monitor tab → View CloudWatch Logs** and open the log stream from your most recent upload. You should see the `print()` statements from the function confirming what fired and why.
### **📸 Screenshot 4:**
CloudWatch log output showing the filename of your uploaded image.
_Your screenshot should show a CloudWatch log stream with the `Triggered by upload:` print statement and the filename of the image you uploaded. Submit this screenshot to Blackboard._
---
Section 7: Stretch Goal — Rekognition Labels (ungraded 🤖)
----------------------------------------------------------
`lambda_function_rekognition.py` in the repo is a modified version of the Lambda function that also calls **AWS Rekognition** to generate automatic image labels — the same kind of thing that powers auto alt-text for accessibility.
Open the file and fill in the `TODO`. The comments walk you through the exact API call shape. When it works, each upload will produce two files in your processed bucket:
- `dog.jpg` — the flipped image
- `dog.jpg_labels.json` — Rekognition's labels with confidence scores (e.g. `{"Name": "Dog", "Confidence": 98.4}`)
The Flask app will automatically detect the labels file and display a Rekognition card below your result — no changes to `app.py` needed.
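To get a feel for the labels file, here's a sketch that parses one and keeps only high-confidence labels. The list-of-dicts shape is inferred from the `{"Name": "Dog", "Confidence": 98.4}` example above; check the actual file your function writes:

```python
import json

# Example contents of dog.jpg_labels.json (shape inferred from the lab notes)
labels_json = ('[{"Name": "Dog", "Confidence": 98.4},'
               ' {"Name": "Pet", "Confidence": 97.1},'
               ' {"Name": "Couch", "Confidence": 61.0}]')

labels = json.loads(labels_json)
confident = [lbl["Name"] for lbl in labels if lbl["Confidence"] >= 90]
print(confident)  # ['Dog', 'Pet']
```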
:::info
**To deploy the stretch goal version:**
1. Rename `lambda_function_rekognition.py` to `lambda_function.py`
2. Repackage the zip: bundle the renamed file with the Pillow library at the root level
3. Re-upload the zip in the Lambda console → Code tab → Upload from .zip
4. Click **Deploy**
If you added `AmazonRekognitionFullAccess` to your role in Section 2, you're already set. If not, add it now via **IAM → Roles → S3LambdaLab16 → Add permissions**.
:::
Here's what the Rekognition card looks like when it's working:

---
Submission
----------
Submit the following to Blackboard:
- Your **GitHub repo link** (with your completed `app.py` committed and pushed)
- **Screenshot 1** — Lambda test output (healthy function)
- **Screenshot 2** — Lambda designer with S3 trigger
- **Screenshot 3** — Flask app showing original + flipped image
- **Screenshot 4** — CloudWatch log output with filename
---
Scoring
-------
| Item | Points |
| --- | --- |
| Exercise 1: correct bucket names in `app.py` (visible in GitHub) | 1 pt |
| Screenshot 1: Lambda test showing function is healthy | 0.5 pt |
| Screenshot 2: Lambda S3 trigger configured | 0.5 pt |
| Screenshot 3: Flask app showing the flipped image | 1 pt |
| Screenshot 4: CloudWatch log with filename | 1 pt |
| **Total** | **4 pts** |