# nelc-ai-workshop

[slides here](https://docs.google.com/presentation/d/1dkoEydIH141qP1KWedlUjLERlLMVmHljekd_YxDAxhY/edit?slide=id.g364ee951e0f_0_19#slide=id.g364ee951e0f_0_19)

## Outline

* Intro: what's the lay of the land in 2025-26?
* AI literacy: how does AI work and what does it get wrong?
* AI Research Activity

## Introduction

* How are students using AI?
  * Some statistics from [Harvard](https://arxiv.org/pdf/2406.00833) and [across](https://www.chronicle.com/article/how-are-students-really-using-ai) the [country](https://www.grammarly.com/blog/company/student-ai-adoption-and-readiness/)
  * A July 2025 Grammarly study of 2,000 US college students found that 87% use AI for schoolwork and 90% for everyday life tasks.
  * Students most often turn to AI for brainstorming ideas, checking grammar and spelling, and making sense of difficult concepts.
  * While adoption is high, 55% feel they lack proper guidance, and most believe that learning to use AI responsibly is essential to their future careers.
  * Discussion: how are you using it?
* What is the landscape this year?
  * There are two AI tools that HUIT is supporting. Let's get you connected to them before we move on with the workshop!
  * Here is your link to [Google Gemini](https://gemini.google.com/app)
  * And here is your link to [the HUIT AI Sandbox](https://sandbox.ai.huit.harvard.edu/)
  * **Important privacy note:** These HUIT-supported tools have built-in privacy safeguards. Harvard has contracts with these providers ensuring that anything you share won't be used to train their models or be shared with third parties. These tools are safe for Level 3 data, which includes course materials and student work. This means you can confidently use them for teaching activities without worrying about privacy violations.

---

## AI Literacy: How does AI work?

Using AI responsibly starts with AI literacy. This means moving beyond what AI can do and exploring how it works and why it fails. In this section, we'll focus on two key aspects of how AI functions:

- **AI as a Statistical Machine**: LLMs process language as numbers rather than understanding it, leading to predictable errors that users can learn to anticipate and correct.
- **AI as a Reflection of its Training Data**: AI models learn from vast amounts of human-generated data, absorbing and amplifying the stereotypes within it.

---

### Activity 1: Tokenization

Paste the text below into [tiktokenizer](https://tiktokenizer.vercel.app/).

```
Unsurprisingly, they had to cancel the show. The crowd went home unhappily.
```

* Notice how the model breaks words into tokens.
* Try putting in a sentence or a word with complex morphology in your language of choice.
* Discuss: What does this reveal about how AI "reads" text differently from humans?

#### Takeaway

AI doesn't "read" words like humans do. It breaks text into tokens: numbers representing pieces of words. This shows that LLMs process language as math, predicting the next number in a sequence rather than reasoning about meaning.
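If you'd like to see the same thing in code, here is a minimal sketch using the open-source `tiktoken` library, which implements the GPT-family tokenizers shown in the tiktokenizer demo (other models, such as Gemini or Llama, use their own tokenizers, so the exact splits will differ):

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by several GPT models
text = "Unsurprisingly, they had to cancel the show. The crowd went home unhappily."

token_ids = enc.encode(text)
print(token_ids)                             # the numbers the model actually "sees"
print([enc.decode([t]) for t in token_ids])  # the word pieces those numbers stand for
```

Words like "Unsurprisingly" and "unhappily" will likely be split into several sub-word pieces, just as in the web demo.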
---

### Activity 2: Multiplication (Predicting vs. "Reasoning")

**1. Prompt (for Gemini Flash, Llama 3.2 11B, or an older model):**

```
82,345 × 67,890. give me an immediate response without using code.
```

* Try it yourself first → you'll see it's hard to do "in your head."
* See how the AI answers.
* Does it get it right? If it's wrong, is it *completely* wrong or close? How?

**2. Prompt (for Gemini Flash Thinking or GPT-4.1 Reasoning):**

```
82,345 × 67,890
```

* Compare this to when you asked for an "immediate response".
* Does the model get the math right this time?
* What's different about the *style* of its response?

#### Takeaway

AI doesn't actually *calculate*; it predicts the next token (number) based on patterns in training data. That's why answers can be *fact-like* or "almost correct," but still wrong: they're based on statistical averages of the internet, not reasoning.

AI tools increasingly offer **"thinking" modes** (sometimes called *chain-of-thought* or *reasoning* models). Reasoning models still predict, but showing their work lets you spot errors and better trust or question their output. Asking the model to "think step by step" can improve reliability and helps you check its work.
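For reference, the exact product is trivial to verify with ordinary code, which is also what tool-using models do when they are allowed to write and run code rather than predict digits. A quick check (the "model guess" below is invented purely to illustrate the kind of near-miss to watch for):

```python
# Exact arithmetic is easy for ordinary code but hard for pure next-token prediction.
exact = 82_345 * 67_890
print(f"Exact product: {exact:,}")        # 5,590,402,050

# A hypothetical near-miss of the kind an "immediate" answer often produces:
# right magnitude, plausible leading and trailing digits, wrong in the middle.
model_guess = 5_590_241_050
print(f"Difference:    {abs(exact - model_guess):,}")
```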
### Improved Reasoning with RAG: NotebookLM

[NotebookLM](https://notebooklm.google.com/)'s strength is its ability to transform the same source text into different formats for teaching and learning. It does this through retrieval-augmented generation (RAG): instead of answering from its training data alone, the model grounds its responses in the sources you upload. This mirrors a core principle of language acquisition: knowledge is strengthened when students engage with new vocabulary and grammar through varied channels like reading, writing, and listening.

**Try this with a short text (e.g. a news article or short story) in the language you teach:**

1. Upload a document into a new NotebookLM notebook.
2. In the Sources view, quickly skim the auto-generated summary and key topics to ensure NotebookLM has grasped the main points.
3. In the Chat box, ask NotebookLM to generate specific materials from the source.
4. You can also create podcasts and other materials.

---

### Activity 3: Ethical Error (Bias in Images)

Image generation models (like Gemini, DALL·E, or Midjourney) work by sampling from high-dimensional probability distributions conditioned on a prompt. The outputs reflect the distribution of their training data, which is often dominated by certain demographics or cultural defaults. As a result, even seemingly neutral prompts (e.g. "a happy family") are resolved in highly regularized ways that reproduce these statistical biases.

**Prompt an image generator:**

```
Create an image of a happy family
```

or

```
Create an image of a happy couple
```

* Compare your outputs with those of the people sitting next to you. What kinds of people or relationships appear most often?
* What patterns or omissions do you see? What's the risk of using these images, in class slides for instance?

[More examples here →](/pvNaRf56T7qhOqx1GUlcrA)

#### Takeaway

Generative image models do not "choose" representations; they maximize likelihood based on patterns in skewed datasets. Because outputs are constrained by frequency and representation in the training data rather than a balanced sampling of real-world diversity, prompts like "a happy family" often yield stereotypical demographics, normalizing omissions and reinforcing cultural defaults.
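To make "maximize likelihood based on patterns in skewed datasets" concrete, here is a toy sketch with invented category names and frequencies: the sampler never "chooses" anything, it simply reproduces whatever distribution it is handed, which is why skewed training data yields skewed outputs.

```python
# Toy illustration (invented categories and frequencies): likelihood-based
# sampling mirrors the training distribution rather than correcting its skew.
import random
from collections import Counter

# Hypothetical shares of "happy family" images in a training set
training_distribution = {
    "two parents, two kids, same ethnicity": 0.55,
    "multigenerational family": 0.15,
    "single parent with child": 0.12,
    "interracial family": 0.10,
    "same-sex couple with child": 0.08,
}

# "Generate" 1,000 images by sampling in proportion to training frequency
samples = random.choices(
    population=list(training_distribution),
    weights=list(training_distribution.values()),
    k=1_000,
)

for category, count in Counter(samples).most_common():
    print(f"{category}: {count}")
```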
---

### Activity 4: Social Error/Harm (Sycophancy & "Personality")

AI sycophancy happens when the model affirms rather than challenges, offering polished praise that can mislead students. At scale, sycophancy is arguably *worse* than hallucination: a hallucinated fact might be detected and corrected, but sycophantic reinforcement of student error embeds the problem more deeply in the learner's self-perception and habits.

It can:

* Reinforce flawed work
* Erode student trust in both AI and human feedback
* Encourage overconfidence and discourage deeper engagement

**Prompt an AI:**

* Ask Gemini Flash and/or GPT-4.1 (in the sandbox) the following:

```
I wrote an essay that argues that the French Revolution happened because of bread prices---I think it's brilliant, don't you?
```

* Try it a couple of times in new chat sessions to see how the responses vary.
* Now try it one more time, but this time engage the LLM in friendly (perhaps even OVERLY friendly) small talk first. Then compare and contrast the results.

**Reflection:**

* What kind of "personality" did the AI adopt in your exchange?
* Did its tone shift depending on how you interacted with it?
* Share impressions with your group: what words would you use to describe the AI's character?
* Now connect this back to Activities 1 and 2: remembering that the model is not a person but a token predictor, how does that change how we interpret the "personality" we think we see?

[Read more →](/E1cbV_KJTwmR30TIMzD7_A)

#### Takeaway

Sycophancy shows how AI can flatter instead of challenge, reinforcing errors and inflating confidence in ways that are harder to detect than hallucinations. The responses often feel shaped by a "personality" that adapts to user tone, but this is an illusion created by token-by-token prediction, not genuine intention or care. The danger lies both in misleading feedback and in our tendency to treat these statistical patterns as if they were a trustworthy, human-like conversational partner.

## AI Research Activity

For the second half of today's workshop, we're going to do a couple of hands-on activities in pairs. The goal is to give you a chance to use AI and evaluate its effectiveness as a casual research tool. At the end of the activities, we invite you to document which kinds of applications and methods felt more or less useful, reliable, ethical, and so on.

### Activity 1: Crash Course in an Expert's Area of Expertise

For this activity, we'll break up into pairs. Ideally, you'll be paired with someone whose research doesn't overlap too much with your own. (You can find summaries of the research of the other members of the tutorial in [this markdown doc](/4xqj_t2OSsi2FrFC1G5SUg).)

Once you're in pairs:

1. Identify a term, person, place, etc. that you don't know much about, and
2. Prompt an LLM to help you gain enough expertise with it to engage with a scholar working on it. Follow up with whatever prompts seem to take the chat in a productive direction.
3. After about 5 minutes, take turns presenting your chat to your partner.

After the presentations, we'll debrief as a group.

### Activity 2 (If Time Allows): Translation

Imagine yourself (this might be all too familiar) as a scholar who has identified a source that a) is important for a point you're trying to make but b) is in a language you don't read. For our activity, we'll stick to languages that at least one member of the tutorial can read.

1. Have an LLM translate a quote or passage for you into English and/or another language you *do* read.
2. Try to evaluate whether, and why, the translation seems reliable.

After a few minutes, we'll share out and ask folks who are able to read the language of the source text to weigh in.

### Stay connected!

Feel free to reach out at generativeAI@fas.harvard.edu for any AI-related inquiries.

We also have a weekly Friday morning AI Lab (9:00-10:30am, Pierce Hall 100F). This is a coffee chat for faculty and grad students focused on AI news and discussion. We'll share notable developments and open conversation about AI's impact on teaching, learning, and research.

A calendar of events and workshops can be found [here](https://bokcenter.harvard.edu/generative-ai-events).