# bok-ai-lab-20250418-glossary
### **Automatic Speech Recognition (ASR)**
Automatic Speech Recognition (ASR) is the process by which spoken language is converted into written text using machine learning models trained on large-scale audio and linguistic data. ASR powers tools like real-time captioning, searchable video transcripts, and AI-driven note-taking systems.
In the classroom, ASR enhances accessibility and learning by producing live captions for students who are deaf or hard of hearing, as well as for non-native speakers or learners who benefit from visual reinforcement. It also provides a record of what was said during class, which students can review later.
ASR is a foundational layer for many education tools that rely on voice input. Its performance depends on factors like microphone quality, background noise, and discipline-specific vocabulary. Educators should consider these when integrating ASR into live or recorded teaching workflows.
---
### **Whisper (OpenAI)**
Whisper is an open-source speech recognition model developed by OpenAI. It is designed to transcribe speech with high accuracy across a wide range of languages, accents, and audio conditions. Whisper also returns timestamps (segment-level by default, word-level as an option), which are useful for syncing transcripts with video or audio playback.
In education, Whisper enables live transcription of lectures, post-class audio analysis, and the creation of searchable archives of spoken material. It’s particularly well-suited for multilingual classrooms or environments with variable audio quality.
Because Whisper is open source, it can be customized or integrated into custom tools. It supports both real-time and batch transcription, and its robustness makes it a popular choice for projects that require a dependable, flexible transcription engine.
---
### **Speech-to-Text API**
A Speech-to-Text API is a cloud-based service that accepts audio input and returns transcribed text. Offered by providers like Google, Amazon, Microsoft, and OpenAI, these APIs enable real-time or recorded transcription at scale with features like punctuation, speaker detection, and language support.
In higher education, these APIs are embedded in captioning platforms, note-taking tools, and LMS-integrated lecture capture systems. They allow institutions to provide transcription services without maintaining their own ASR infrastructure.
When selecting an API, developers and educators consider accuracy, latency, pricing, and data privacy. These APIs serve as a bridge between raw audio and downstream tools like summarizers, bots, or analytics dashboards.
---
### **Voice Activity Detection (VAD)**
Voice Activity Detection (VAD) is a signal processing technique that determines when speech is present in an audio stream. It’s used to ignore silence or background noise and activate transcription systems only when someone is speaking.
In a classroom setting, VAD improves transcription quality by focusing only on relevant speech. It can reduce processing costs, increase responsiveness, and help prevent false positives—like transcribing background rustling or distant chatter.
VAD is often integrated into lecture capture systems and transcription pipelines. It’s an invisible but important step that makes AI-powered tools more efficient and useful in real-time educational environments.
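The core idea can be illustrated with a minimal energy-threshold sketch: each audio frame is labeled speech or silence based on its average energy. Production VADs (such as the one in the WebRTC project) use more robust spectral and statistical features; the frame size and threshold below are illustrative assumptions.

```python
def detect_speech(frames, threshold=0.02):
    """Label each audio frame as speech (True) or silence (False)
    based on its mean energy. A minimal sketch of the VAD idea;
    real systems use spectral features, not raw energy alone."""
    labels = []
    for frame in frames:
        energy = sum(s * s for s in frame) / len(frame)
        labels.append(energy > threshold)
    return labels

# Three 160-sample frames: near-silence, speech-like, silence
frames = [[0.001] * 160, [0.3, -0.4, 0.5] * 53 + [0.3], [0.0] * 160]
print(detect_speech(frames))  # [False, True, False]
```

Downstream, a transcription pipeline would only forward the frames labeled `True`, which is where the cost and accuracy savings come from.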
---
### **Speaker Diarization**
Speaker diarization is the process of identifying and segmenting different speakers in a single audio recording. It helps determine who said what by grouping sections of speech based on speaker identity, typically without knowing their names in advance.
In education, diarization makes transcripts more readable and useful. For example, it allows students to follow class discussions and distinguish between instructor explanations and student questions. It also supports review of seminars, debates, or group presentations.
Diarization is commonly used in systems that support collaborative or multi-speaker environments. When paired with ASR, it enables searchable, labeled transcripts that preserve the structure and flow of class interactions.
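A small sketch of the labeling step: given diarized segments as (speaker, text) pairs, the kind of output a diarization system emits alongside ASR, consecutive same-speaker segments are merged into readable turns. The `SPEAKER_00` style labels are assumptions; diarization identifies distinct voices, not names.

```python
def merge_turns(segments):
    """Merge consecutive same-speaker (speaker_id, text) segments into
    labeled turns, producing a readable multi-speaker transcript."""
    turns = []
    for speaker, text in segments:
        if turns and turns[-1][0] == speaker:
            # Same speaker kept talking: extend the current turn
            turns[-1] = (speaker, turns[-1][1] + " " + text)
        else:
            turns.append((speaker, text))
    return ["{}: {}".format(s, t) for s, t in turns]

segments = [
    ("SPEAKER_00", "Today we cover recursion."),
    ("SPEAKER_00", "Start with the base case."),
    ("SPEAKER_01", "What if there is no base case?"),
]
print("\n".join(merge_turns(segments)))
```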
---
### **Noise Suppression**
Noise suppression refers to the process of removing unwanted ambient sounds from audio recordings or live streams, such as keyboard clicks, HVAC noise, or background conversations. This improves the clarity of the speech signal for both listeners and transcription systems.
In classroom environments, noise suppression is especially valuable when using microphones in large spaces, hybrid settings, or student discussion groups. Cleaner audio results in more accurate captions and better student comprehension, especially for those relying on assistive technologies.
Noise suppression is built into many modern conferencing and transcription tools. When layered with ASR and real-time feedback systems, it ensures that only meaningful, intelligible speech is processed and recorded.
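As a toy illustration, a time-domain noise gate zeroes out samples below a noise floor. This is far simpler than real noise suppression, which works in the frequency domain (spectral subtraction, learned filters), but it shows the basic contract: quiet, non-speech energy is removed before the signal reaches listeners or an ASR system.

```python
def noise_gate(samples, threshold=0.05):
    """Zero out samples whose magnitude falls below a noise floor.
    A sketch only: real suppression operates on frequency bands,
    not individual time-domain samples."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]

audio = [0.01, -0.02, 0.4, -0.5, 0.03, 0.6]
print(noise_gate(audio))  # [0.0, 0.0, 0.4, -0.5, 0.0, 0.6]
```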
---
### **Audio Latency**
Audio latency refers to the delay between when a sound is produced and when it is heard or processed. In educational tools, high latency can make captions feel out of sync with speech or delay AI-generated responses.
Low-latency systems are essential for real-time teaching and learning. For example, students following live captions need them to appear almost immediately after the spoken word. Instructors using voice-triggered bots or transcription tools depend on responsiveness to maintain flow.
Latency can be introduced at many stages—audio input, network transmission, processing, or display. Optimizing each layer is key to creating seamless, real-time learning experiences.
---
### **Closed Captions vs Open Captions**
Closed captions are text displays of spoken dialogue that viewers can turn on or off, while open captions are permanently embedded into the video. Both serve the same core function: making spoken content accessible.
In classrooms, closed captions offer flexibility for students watching independently on devices, while open captions are helpful in shared environments like lecture halls or recorded content projected during class.
Many AI-powered transcription systems offer both options. The choice depends on audience needs, device access, and whether the video will be edited or distributed after class.
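Closed captions are typically delivered as a sidecar file the player toggles on or off; WebVTT is the common web format. A minimal sketch of turning timed transcript segments into a WebVTT caption file (the segment times and text are illustrative):

```python
def to_vtt(segments):
    """Render (start_sec, end_sec, text) segments as a WebVTT file,
    the sidecar caption format most web players accept."""
    def ts(sec):
        # WebVTT timestamps look like 00:00:02.500
        h, rem = divmod(sec, 3600)
        m, s = divmod(rem, 60)
        return "%02d:%02d:%06.3f" % (h, m, s)
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append("%s --> %s" % (ts(start), ts(end)))
        lines.append(text)
        lines.append("")
    return "\n".join(lines)

print(to_vtt([(0.0, 2.5, "Welcome to class."), (2.5, 5.0, "Today: recursion.")]))
```

Open captions, by contrast, would be burned into the video frames at render time, so no such file travels with the video.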
---
### **Language Models with Timestamps**
Some transcription systems include timestamps in their output—aligning specific words or phrases with time codes in a video or audio stream. These allow for advanced features like search-by-topic, video annotation, and time-linked highlights.
For students, this means being able to jump directly to a moment in the lecture where a concept was explained. Instructors can use timestamped data to analyze what content generated confusion or sparked engagement.
Timestamps are especially powerful when used alongside automated summaries, annotations, or analytics. They turn static video into an interactive learning object.
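The "jump to the moment a concept was explained" feature reduces to a simple lookup over word-level (time, word) pairs like those a timestamped transcription system emits. A sketch, with illustrative transcript data:

```python
def find_mentions(words, term):
    """Given word-level (time_sec, word) pairs from a transcript,
    return the timestamps where `term` was spoken, so a player
    can seek directly to those moments."""
    term = term.lower()
    return [t for t, w in words if w.lower().strip(".,?!:") == term]

transcript = [(12.4, "Recursion"), (12.9, "means"), (13.3, "self-reference."),
              (95.0, "Back"), (95.4, "to"), (95.7, "recursion:")]
print(find_mentions(transcript, "recursion"))  # [12.4, 95.7]
```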
---
### **Lecture Capture**
Lecture capture refers to the recording of live class sessions—including audio, video, slides, and sometimes whiteboard content—for later review by students. It supports flexibility, accessibility, and learning reinforcement.
When combined with ASR, lecture capture systems can provide full transcripts, keyword search, and summarized notes. This helps students review missed content, study for assessments, or engage with material at their own pace.
Today’s capture systems often integrate with LMS platforms and include analytics that track engagement and viewing patterns—helping faculty understand how students interact with their recordings.
---
### **Real-Time Processing**
Real-time processing means that a system responds to input as it happens, rather than after the fact. In education, this includes live transcription, polling, summarization, Q&A systems, and classroom dashboards.
Tools that respond in real time support more interactive, adaptive instruction. Instructors can adjust based on student questions, and students benefit from immediate clarification, reinforcement, or access to alternative formats.
Real-time AI systems require low-latency pipelines and effective interfaces. When designed well, they transform the pace and feel of classroom interaction.
---
### **WebSockets**
WebSocket is a communication protocol that allows continuous, two-way data flow between a server and a client (like a browser). Unlike traditional HTTP requests, a WebSocket connection stays open and enables instant data updates.
In teaching tools, WebSockets power live experiences—like pushing new transcript lines to students as a lecture unfolds, streaming poll results, or updating shared dashboards in real time.
They’re essential for low-latency systems where speed, synchronization, and interactivity matter.
---
### **WebRTC**
Web Real-Time Communication (WebRTC) is a set of browser-based technologies that enable peer-to-peer sharing of video, audio, and data. It’s widely used in tools like Zoom, Google Meet, and many classroom collaboration platforms.
In education, WebRTC supports synchronous learning: broadcasting a lecture, facilitating breakout discussions, or sending live audio to transcription services. It reduces server load and lowers latency by connecting users directly.
Because WebRTC runs in the browser, it’s accessible and doesn’t require special software—making it ideal for distributed or hybrid classrooms.
---
### **Latency**
Latency is the time delay between when a signal is sent and when it’s received or acted on. In education technology, latency affects how real-time a system feels—whether captions lag behind speech or polling feedback appears instantly.
For students relying on assistive tools or instructors using live interaction systems, low latency is critical. Even a few seconds of delay can disrupt pacing or make a tool feel untrustworthy.
Designers of classroom technology aim to reduce latency at every stage—capturing input, processing it quickly, and delivering the result with minimal lag.
---
### **Slack**
Slack is a messaging platform originally designed for workplace communication, now widely adopted in education as a backchannel for classroom discussion. It supports channels, threads, emoji reactions, file sharing, and app integrations.
In class, Slack allows students to post questions, share resources, or react to discussions in real time. It offers a lower-barrier form of participation—especially for those who may hesitate to speak aloud—and keeps a record of the class conversation.
With integrations like bots, polls, and summarizers, Slack can function as a dynamic layer of instruction, supporting engagement, formative feedback, and asynchronous dialogue.
---
### **Slack API**
The Slack API allows developers to build apps and bots that can read, write, and respond to messages in real time. It provides access to Slack events, messages, reactions, and user activity through both HTTP and WebSocket connections.
In educational contexts, this makes it possible to automate feedback, build Q&A threads, track engagement, or integrate AI services that summarize discussion or tag questions.
When connected to teaching tools, the Slack API enables a structured, intelligent layer on top of informal conversation—helping instructors stay informed and respond with agility.
---
### **Slack Events API**
The Slack Events API sends real-time updates about activity in a workspace. Bots can subscribe to events like new messages, emoji reactions, or thread replies and respond programmatically.
In a classroom, this might mean detecting when a message gets five :question: reactions and automatically surfacing it to the instructor. It enables bots to act as teaching assistants—filtering, flagging, and summarizing conversation data as it happens.
The Events API is essential for making Slackbots feel responsive and intelligent in an educational setting.
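The reaction-threshold example above can be sketched as an event handler. The payload shape follows Slack's documented `reaction_added` event (`reaction`, plus `item.channel` and `item.ts` identifying the message); the threshold of five and the `surfaced` list standing in for an instructor notification are assumptions for illustration.

```python
from collections import Counter

QUESTION_THRESHOLD = 5  # surface a message once it gets 5 :question: reactions
counts = Counter()

def handle_event(event, surfaced):
    """Process a Slack `reaction_added` event and record any message
    that crosses the :question: threshold. `surfaced` stands in for
    notifying the instructor."""
    if event.get("type") == "reaction_added" and event.get("reaction") == "question":
        key = (event["item"]["channel"], event["item"]["ts"])
        counts[key] += 1
        if counts[key] == QUESTION_THRESHOLD:
            surfaced.append(key)

surfaced = []
for _ in range(5):
    handle_event({"type": "reaction_added", "reaction": "question",
                  "item": {"type": "message", "channel": "C123",
                           "ts": "1700000000.000100"}}, surfaced)
print(surfaced)  # [('C123', '1700000000.000100')]
```

In a real deployment this handler would be wired to Slack's event delivery (HTTP endpoint or Socket Mode) rather than called in a loop.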
---
### **Slackbot**
A Slackbot is a custom, automated user that performs actions inside Slack—like responding to messages, posting reminders, summarizing threads, or logging student questions.
In a classroom, Slackbots help manage the flow of communication during lectures and discussions. They can track common questions, tag confusion, or support formative assessment by prompting students to check in.
Slackbots reduce the noise and friction in real-time discussions, helping instructors and students focus on substance without being overwhelmed by volume.
---
### **Slash Commands (Slack)**
Slash commands are typed shortcuts in Slack that begin with a `/` and execute specific actions, like `/ask`, `/highlight`, or `/summary`. They make complex interactions feel simple and fast.
In class, a student might type `/question` to log a comment for review, or an instructor might use `/poll` to gather feedback on the fly. Commands can be connected to bots, Google Docs, polls, or custom dashboards.
Slash commands make real-time teaching tools more accessible by hiding complexity behind quick, intuitive inputs.
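On the backend, a slash command arrives as a payload with fields like `command`, `text`, and `user_id` (per Slack's slash-command documentation), and the app routes it to an action. The handlers below are hypothetical stand-ins for real classroom actions:

```python
def parse_command(payload):
    """Route a Slack slash-command payload to a classroom action.
    The `/question` and `/poll` commands and their actions are
    illustrative, not built-in Slack features."""
    command = payload.get("command", "")
    text = payload.get("text", "")
    if command == "/question":
        return {"action": "log_question", "body": text}
    if command == "/poll":
        return {"action": "start_poll", "body": text}
    return {"action": "unknown", "body": text}

print(parse_command({"command": "/question",
                     "text": "Why does recursion need a base case?"}))
```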
---
### **Emoji Reactions as Signals**
Emoji reactions are not just expressive—they can be structured inputs. In educational Slack spaces, instructors may designate specific emojis to mean “question,” “confusion,” or “agreement.”
This turns a simple emoji tap into a data point that can be tracked, analyzed, or summarized. For example, a bot might collect all messages with the :question: emoji and generate a thread of unanswered student queries.
Using emoji as metadata makes backchannel participation lightweight and fast—especially when paired with real-time analytics or summaries.
---
### **Backchannel**
A backchannel is a secondary, often text-based communication stream that runs parallel to a live class session. Students use it to post questions, react to content, or collaborate with peers without interrupting the speaker.
Backchannels support equitable participation by giving all students a voice, especially those who are hesitant to speak up. Instructors can monitor the thread to identify confusion, address emerging themes, or surface insights in real time.
Platforms like Slack, Discord, or Zoom chat often serve as the backchannel in modern classrooms, and AI tools layered on top can organize, summarize, and elevate student contributions.
---
### **Annotation-Based Learning**
Annotation-based learning involves students marking up texts, transcripts, or media with highlights, comments, tags, or questions. It shifts learners from passive readers to active meaning-makers.
In classrooms, students might annotate a live transcript of a lecture, tag confusing moments, or comment on shared readings. These annotations can then guide class discussion, peer feedback, or AI-generated summaries.
Annotation encourages deep engagement, improves recall, and provides instructors with insight into what students are noticing—and where they’re getting stuck.
---
### **Natural Language Processing (NLP)**
Natural Language Processing (NLP) is a subfield of AI focused on analyzing and generating human language. It underlies tools that perform summarization, question answering, sentiment analysis, and keyword extraction.
In education, NLP powers systems that digest lectures, summarize Slack threads, tag confusing questions, or suggest follow-ups based on student input. It makes large-scale dialogue analyzable and actionable.
Modern NLP relies on models like GPT and BERT, trained on vast text corpora. When applied well, these models help instructors and students navigate language-rich environments more effectively.
---
### **Tokenization**
Tokenization is the process of splitting text into smaller units—usually words or subwords—so they can be processed by language models. Tokens are the basic units that models analyze and generate.
In educational applications, tokenization affects how accurately systems handle technical terms, abbreviations, or nonstandard student writing. Errors in tokenization can lead to misunderstandings in summarization or question answering.
Effective tokenization supports better feedback, cleaner summaries, and more accurate responses from AI systems embedded in classroom tools.
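The subword idea can be sketched with a greedy longest-match tokenizer, a simplified cousin of WordPiece/BPE: words outside the vocabulary are split into known pieces rather than discarded. The tiny vocabulary here is an assumption for illustration; real models learn vocabularies of tens of thousands of pieces.

```python
def subword_tokenize(word, vocab):
    """Greedy longest-match subword tokenization: split a word into
    the longest vocabulary pieces available, emitting <unk> for
    characters no piece covers."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append("<unk>")
            i += 1
    return tokens

vocab = {"trans", "form", "er", "s", "token", "ize"}
print(subword_tokenize("transformers", vocab))  # ['trans', 'form', 'er', 's']
```

This is why a rare technical term a model has never seen can still be processed: it is decomposed into familiar fragments, though sometimes at the cost of the errors described above.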
---
### **Transformer Architecture**
Transformer architecture is the foundation of most modern large language models. It processes words in parallel and uses attention mechanisms to capture context—making it efficient, scalable, and well-suited for tasks like summarization and translation.
Transformers enable systems like GPT to understand and generate coherent responses, whether summarizing a lecture transcript or answering a student's question based on context.
Understanding how transformers work helps explain why today’s AI tools are so powerful—and why they sometimes hallucinate, struggle with nuance, or require well-crafted prompts.
---
### **Large Language Model (LLM)**
Large Language Models (LLMs) are AI models trained on enormous datasets of human language. They can summarize, translate, generate text, and answer questions across a range of topics and styles.
In classrooms, LLMs are used to create quiz questions, generate summaries of student notes, or act as intelligent assistants during discussion. They are powerful but imperfect—requiring guidance, structure, and thoughtful implementation.
Educators using LLMs should be aware of their strengths (fluency, generalization) and limitations (bias, hallucination) when incorporating them into teaching tools.
---
### **Few-Shot Learning**
Few-shot learning is a technique for teaching an LLM to perform a task by showing it a few examples directly in the prompt. Unlike traditional machine learning, it requires no retraining—just well-structured inputs.
In education, this means an instructor can show an LLM how to summarize a lesson, respond to a quiz format, or provide feedback on student writing, all within a single prompt.
Few-shot learning enables rapid prototyping of AI-driven tools tailored to the classroom, without needing to fine-tune the model itself.
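Mechanically, few-shot prompting is just prompt assembly: a task description, a handful of input/output examples, and the new input left for the model to complete. A sketch, where the classification task and labels are illustrative assumptions:

```python
def few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: task description, worked examples,
    then the new input with its label left blank for the model."""
    parts = ["Classify each student message as QUESTION or COMMENT.", ""]
    for message, label in examples:
        parts.append("Message: %s\nLabel: %s" % (message, label))
    parts.append("Message: %s\nLabel:" % query)
    return "\n\n".join(parts)

examples = [("What does Big-O mean?", "QUESTION"),
            ("Great example, thanks!", "COMMENT")]
print(few_shot_prompt(examples, "Is the midterm cumulative?"))
```

The resulting string is sent to the LLM as-is; no model weights change, which is what distinguishes few-shot prompting from fine-tuning.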
---
### **Streaming Inference**
Streaming inference refers to producing output from an AI model incrementally, token by token, as it is generated. Rather than waiting for the complete response, systems can begin displaying results immediately; some pipelines also process streaming input, such as live audio, before it has fully arrived.
This makes streaming models ideal for real-time applications—like answering questions live during class or generating summaries while a lecture is still in progress.
Streaming inference improves the responsiveness of AI tools, creating smoother and more conversational experiences in teaching environments.
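The consumer-side pattern is a simple iterator: chunks arrive one at a time and the interface updates with each one. A stdlib-only sketch (a real streaming endpoint would yield model-generated tokens over a network connection, not words from a fixed string):

```python
def stream_tokens(text):
    """Yield output one token at a time, mimicking how a streaming
    LLM endpoint delivers partial responses."""
    for token in text.split():
        yield token + " "

rendered = ""
for chunk in stream_tokens("Recursion needs a base case."):
    rendered += chunk  # in a UI, this would update the display incrementally
print(rendered.strip())
```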
---
### **Intent Detection**
Intent detection is the process of classifying what a user is trying to do—ask a question, make a request, express confusion, or give feedback. It is the first step in many conversational AI systems.
In a class Slack channel, intent detection helps bots sort messages into categories, like questions that need answers, ideas that could be summarized, or comments that signal confusion.
By organizing dialogue this way, instructors can prioritize attention, respond faster, and engage more effectively with student contributions.
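A rule-based sketch shows the shape of the task, though production systems use trained classifiers or LLMs rather than keyword rules. The three categories and trigger words are assumptions for illustration:

```python
def detect_intent(message):
    """Classify a message as question, confusion, or comment using
    simple surface cues. A stand-in for a trained intent classifier."""
    text = message.lower()
    if "?" in text or text.startswith(("what", "why", "how", "when")):
        return "question"
    if any(word in text for word in ("confused", "lost", "don't understand", "unclear")):
        return "confusion"
    return "comment"

print(detect_intent("Why does the loop terminate?"))   # question
print(detect_intent("I'm totally lost on pointers"))   # confusion
print(detect_intent("Nice example!"))                  # comment
```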
---
### **Summarization (Text Summarization)**
Summarization is the process of reducing a longer text into a shorter one that preserves the key ideas. AI systems use extractive (selecting key sentences) or abstractive (rephrasing ideas) techniques.
In the classroom, summarization helps distill lecture transcripts, discussion threads, or student notes into study guides or review materials. It supports memory, reflection, and content mastery.
Good summarization tools balance fidelity to the original with clarity and structure—turning messy or dense content into something students can actually use.
---
### **Abstractive vs Extractive Summarization**
Extractive summarization selects exact sentences or phrases from a source text, while abstractive summarization rephrases content to express the same ideas more concisely or clearly.
Extractive methods are easier to implement and less prone to errors but can miss nuance or feel choppy. Abstractive methods feel more natural and coherent, but require more advanced AI models and can occasionally hallucinate.
Educational tools often blend the two—using extractive techniques for precision and abstractive ones for flow—to generate clear, accurate summaries of complex material.
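The extractive half can be sketched with the classic frequency heuristic: score each sentence by how many frequent content words it contains and keep the top scorers in their original order. The sample lecture text and stopword list are illustrative; no abstractive rephrasing happens here.

```python
import re
from collections import Counter

def extractive_summary(text, n=1):
    """Frequency-based extractive summarization: keep the `n` sentences
    whose words occur most often in the document, in original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = re.findall(r"[a-z']+", text.lower())
    stop = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it", "that"}
    freq = Counter(w for w in words if w not in stop)
    scored = sorted(sentences,
                    key=lambda s: -sum(freq[w] for w in re.findall(r"[a-z']+", s.lower())))
    top = scored[:n]
    return " ".join(s for s in sentences if s in top)

lecture = ("Recursion is a function calling itself. "
           "Every recursive function needs a base case. "
           "Pizza arrived late today.")
print(extractive_summary(lecture, n=2))
```

An abstractive system would instead generate new sentences, which is why it needs a language model and carries hallucination risk.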
---
### **Question Answering (QA) Systems**
Question Answering (QA) systems use AI to answer questions posed in natural language, often by retrieving or synthesizing information from a specific source (like a transcript or lesson plan).
In class, students might ask, “What did the professor say about recursion?” and a QA system could return the relevant segment from the transcript or notes. This supports real-time clarification and review.
QA tools help close the feedback loop, enabling students to ask more questions—and get faster answers—without overloading the instructor.
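The retrieval step can be sketched with naive word overlap: return the transcript segment sharing the most words with the question. Real QA systems use embedding similarity for retrieval and a language model to synthesize the final answer; the transcript segments here are illustrative.

```python
def answer(question, segments):
    """Toy retrieval for QA: return the transcript segment with the
    greatest word overlap with the question."""
    q_words = set(question.lower().split())
    def overlap(segment):
        return len(q_words & set(segment.lower().split()))
    return max(segments, key=overlap)

transcript = [
    "Recursion means a function calls itself until a base case stops it.",
    "Office hours are Tuesdays at 3pm.",
    "The midterm covers chapters one through four.",
]
print(answer("explain recursion and the base case", transcript))
```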
---
### **Learning Analytics**
Learning analytics involves collecting and analyzing data on student behavior, performance, and engagement in order to improve teaching and learning outcomes.
In classrooms, this includes tracking which topics generate the most questions, when students are confused, or how they interact with backchannel tools. Analytics can be used to personalize feedback or inform course adjustments.
When paired with real-time AI systems, learning analytics help instructors identify patterns that might otherwise go unnoticed—turning classroom data into insights for action.
---
### **Audience Response System (Clickers)**
Audience Response Systems (a.k.a. clickers) allow students to respond to polls, quizzes, or prompts during class using handheld devices or mobile apps. Responses are often aggregated and displayed live.
These systems promote active learning, check comprehension, and encourage participation. They also help instructors adjust pacing or revisit unclear material based on student responses.
Modern clicker tools often integrate with LMS platforms, Slack, or AI systems that generate feedback, log participation, or even auto-grade and suggest follow-up tasks.
---
### **Just-in-Time Teaching (JiTT)**
Just-in-Time Teaching (JiTT) is a strategy where instructors adapt lessons based on student input collected shortly before or during class—through polls, questions, or reflections.
It makes instruction more responsive and focused. If many students struggle with a concept in pre-class polls, the instructor can spend more time on it during the session. If confusion arises mid-lecture, teaching can pivot in the moment.
JiTT pairs well with real-time AI tools that summarize student input or highlight themes, making quick adaptation easier and more effective.
---
### **Formative Assessment**
Formative assessment refers to low-stakes evaluation used to guide learning—not to assign grades. It includes check-ins like quizzes, reflections, discussions, or polling to see where students are in the learning process.
These assessments help instructors adapt instruction and give students insight into their own understanding. When done well, formative assessment creates a feedback-rich, growth-oriented learning environment.
AI can support formative assessment by automatically summarizing responses, flagging patterns of confusion, or delivering tailored feedback at scale.
---
### **Generative Learning**
Generative learning is a strategy where students actively produce content—like summaries, diagrams, or questions—to deepen understanding. The act of generating helps cement knowledge and reveal gaps.
In educational tools, generative learning can be supported through AI-assisted prompts that ask students to rephrase ideas, compare concepts, or explain material in their own words.
These systems can then provide structured feedback, comparisons to model answers, or highlight key omissions—making student-generated content a starting point for deeper learning.
---
### **Cognitive Load**
Cognitive load refers to the amount of mental effort required to process information. Too much load—due to unclear instruction, dense content, or multiple competing inputs—can impair learning.
Effective educational tools manage cognitive load by chunking content, simplifying interfaces, and scaffolding complex tasks. This helps students stay focused and engaged.
AI systems can dynamically adapt pacing or complexity based on real-time feedback, helping instructors reduce unnecessary load and focus on what matters most.
---
### **Active Learning**
Active learning is an approach that encourages students to participate, reflect, and collaborate—rather than passively absorb information. It includes discussions, polling, problem solving, and annotation.
Active learning increases retention and engagement, and research consistently shows its positive impact on outcomes. Technology helps scale it by making participation more accessible and visible.
AI and real-time systems enable active learning by summarizing responses, identifying participation trends, or surfacing contributions that might otherwise be overlooked.
---