
# 07 - Structured Output

**Notebook:** `07-structured-output.ipynb`

## Video

<iframe width="720" height="406" src="https://www.youtube.com/embed/_6gcpKUGKPQ?si=Mcclbj_AZlayw4cy" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

---

## Why Structured Output?

To understand why this is valuable, let's go back to something we might have noticed in notebook 06. When we asked OpenAI about the bird in a poem:

- One time it came back with "The Raven"
- One time it came back with "The poem features a nightingale"

Both are *right*, but suppose I wanted to use this for my encyclopedia, for example to have OpenAI decide on the title of my entry. If one time it gives me "The Raven" and another time "The poem features a nightingale", I don't actually know how to **extract** the name of the bird from those sentences.

If the output coming back to me has a different structure every time, I don't know how to parse that text to find all the values I need for my encyclopedia.

**That's why we use structured output.**

---

## What is Structured Output?

**Structured output** allows us to **demand** something like a dictionary coming back to us, and to **define the structure** of that dictionary.

Remember when we made our dictionary of poems? We had title, author, year published... we defined the structure of the data for that kind of object. Think of it as building a spreadsheet: every poem is a row, and every row has a value in each column.

If we want to accomplish that with AI (which tends to be all over the place if you don't pin it down), we use **structured outputs** from OpenAI.

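
To make the contrast concrete, here is a minimal sketch in plain Python with no API call. The two free-form strings are the answers quoted above. The dictionaries and their field names (`poem_title`, `bird`) are hypothetical stand-ins for what a structured response could look like, and the placeholder title is an assumption, since the second answer never named its poem.

```python
# Two free-form answers to the same question, quoted from the example above.
free_text_answers = [
    "The Raven",
    "The poem features a nightingale",
]

# Each answer has a different shape, so no single lookup finds the bird.
for answer in free_text_answers:
    print(repr(answer))

# Hypothetical structured versions of the same answers (illustrative values).
# Every entry has exactly the same keys, like columns in a spreadsheet.
structured_answers = [
    {"poem_title": "The Raven", "bird": "raven"},
    {"poem_title": "(title not given)", "bird": "nightingale"},
]

# Because the keys never change, one lookup works for every entry.
birds = [entry["bird"] for entry in structured_answers]
print(birds)

# Check that every entry really does share the same set of keys.
first_keys = structured_answers[0].keys()
same_shape = all(entry.keys() == first_keys for entry in structured_answers)
print(same_shape)
```
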

---

## Setting Up Pydantic

We need to import a tool called **Pydantic** that helps us define the schema:

```python
from openai import OpenAI
from pydantic import BaseModel

# Create the client we use to send requests to OpenAI
client = OpenAI()
```

---

## Defining Your Schema

Here's where we define the schema, meaning the structure we want our response to come back in. We do this using a [**class**](/glossary/coding-basics-py/class):

```python
class BirdPoem(BaseModel):
    poem_title: str
    poet: str
    bird: str
    symbolism: str
```

### What's a Class?

A class is like a **blueprint** or **template**. It doesn't hold data itself. It describes what shape the data should have.

Think of it like designing a form:

- "There should be a field called `poem_title` and it should be text"
- "There should be a field called `poet` and it should be text"
- and so on for each remaining field

The `: str` after each field name means "this should be a string (text)."

When you press play, the schema gets defined. Python now knows what a `BirdPoem` looks like.

---

## Making a Structured Request

```python
response = client.responses.parse(
    model="gpt-5-mini",
    input=[{"role": "user", "content": "Analyze the bird in Poe's The Raven"}],
    text_format=BirdPoem,  # the schema the response must follow
)

# output_parsed holds the response as a BirdPoem instance
result = response.output_parsed
print(result)
```

The key difference: we're saying `text_format=BirdPoem` to tell OpenAI that the response **must** come back in that structure.

Output:

```text
poem_title='The Raven' poet='Edgar Allan Poe' bird='Raven' symbolism="Poe's raven functions as a multi-layered symbol..."
```

Now we get the same fields back every single time, even if the wording of each value changes from run to run!

---

## Accessing the Fields

Once you have structured output, you can access specific fields using [**dot notation**](/glossary/coding-basics-py/dot-notation):

```python
print(result.poem_title)  # The Raven
print(result.poet)        # Edgar Allan Poe
print(result.bird)        # Raven
print(result.symbolism)   # Poe's raven functions as...
```

Notice we use `result.poem_title` (with a dot) instead of `result["poem_title"]` (with brackets).
That's because `result` is a class instance, not a dictionary.

This is incredibly powerful for building things like our encyclopedia: we can reliably extract exactly the fields we need.

---

## More Complex Schemas

You can add more fields and even use lists:

```python
class EncyclopediaEntry(BaseModel):
    title: str
    bird_species: str
    literary_significance: str
    themes: list[str]  # a list of strings, not a single string
    era: str
    art_prompt: str
```

Now OpenAI will return data with all these fields filled in, and `themes` will be a list of strings!

---

## Try It: Remix the Schema

You won't feel like you've owned this until you've remixed it and it's doing something you intended to do.

Try:

- Change the `BaseModel` class
- Add a couple more properties
- Remove some properties
- Ask Gemini if you're unsure of the syntax

**Always, always, always ask Gemini for help. Do your best to understand what it's doing.**

---

## Summary

In this notebook, you learned:

- **Structured output** gives you predictable, parseable data from AI
- Use **Pydantic** and `BaseModel` to define your schema
- The schema specifies what **fields** and **types** you expect
- Access fields with **dot notation** (`result.field_name`)
- This is essential for building things like encyclopedias and card decks

Next up: **files** (reading input and saving output)!