AI & Transcription Glossary
AI Flashcards: AI flashcards are digital study cards automatically generated from educational content, such as lectures or transcripts, using artificial intelligence.
AI Summary: An AI summary is a condensed version of a long transcript, generated by artificial intelligence to highlight the most important points, decisions, and action items.
AI Transcription: AI transcription is the use of artificial intelligence to automatically convert speech from audio or video files into written text.
Audio Fingerprinting: Audio fingerprinting is the creation of a condensed digital summary (a fingerprint) of an audio signal, used to quickly identify audio samples or locate similar items in a database.
Audio to Text: Audio to text is the general process of converting any recorded sound (a voice memo, interview, or song, for example) into a written digital format.
Automatic Speech Recognition (ASR): ASR is a technology that allows a computer to identify and process human speech into a readable text format.
Content Repurposing: Content repurposing is the practice of taking one piece of content and adapting it for different platforms or formats to reach a wider audience.
Knowledge Base: A knowledge base is a centralized repository of information, data, and insights that can be easily searched and retrieved.
Knowledge Decay: Knowledge decay is the decline in the accuracy or relevance of information over time, or the human tendency to forget information if it is not reinforced.
Meeting Transcription: Meeting transcription is the automated recording and text conversion of business meetings, ensuring every decision and action item is documented.
Multi-Speaker Detection: Multi-speaker detection is the ability of an AI system to identify and track the presence of multiple distinct voices in a single audio stream.
Multilingual Transcription: Multilingual transcription is the ability of an AI system to recognize and transcribe speech across many different languages, often with automatic language detection.
Named Entity Recognition (NER): NER is an NLP task that identifies and categorizes key information (entities) in text, such as names of people, organizations, locations, and dates.
Natural Language Processing (NLP): NLP is a field of AI that focuses on the interaction between computers and humans through natural language.
Noise Cancellation for Transcription: Noise cancellation is the process of using AI to remove background sounds from a recording to improve the clarity of speech and transcription accuracy.
Podcast Show Notes: Podcast show notes are a written summary of a podcast episode that typically includes a brief overview, links mentioned, and a timestamped list of topics covered.
Podcast Transcription: Podcast transcription is the process of converting the spoken dialogue of a podcast episode into a written text format.
RAG (Retrieval-Augmented Generation): RAG is a technique that retrieves relevant information from external data sources and supplies it to an AI model, so the model can give more accurate, up-to-date, and context-aware responses.
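As a rough illustration of the retrieval step in RAG, the sketch below ranks a few documents against a question and assembles a prompt from the best matches. It is a toy under stated assumptions: the word-overlap scoring stands in for the embedding-based retrieval a real system would more likely use, no model is actually called, and the names `score`, `retrieve`, and `build_prompt` are hypothetical.

```python
def score(query: str, document: str) -> int:
    """Count distinct lowercase words shared by the query and the document.

    Assumption: a crude stand-in for a real relevance score.
    """
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    return len(query_words & document_words)


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Combine the retrieved documents and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


notes = [
    "The budget review is scheduled for Friday.",
    "Alice owns the launch checklist.",
    "The office is closed on public holidays.",
]
print(build_prompt("Who owns the launch checklist?", notes, k=1))
```

The resulting prompt would then be passed to a language model, which answers from the retrieved context rather than from its training data alone.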
Second Brain: A 'Second Brain' is a digital system for capturing, organizing, and retrieving the ideas and information you encounter, freeing your mind for creative work.
Semantic Search: Semantic search is a search technique that goes beyond matching keywords and tries to determine the intent and contextual meaning of a query.
Speaker Diarization: Speaker diarization is the process of partitioning an audio stream into homogeneous segments according to speaker, answering the question of who spoke when.
Speaker Identification: Speaker identification is the process of automatically recognizing and labeling a speaker's voice based on a previously stored voice profile.
SRT Subtitle Format: SRT (SubRip Text) is a simple and widely supported file format for video subtitles and captions.
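To make the SRT layout concrete, here is a minimal sketch that renders cues as SRT text. Each cue is a sequence number, a time range written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and the caption text, with a blank line between cues. The helper names `srt_timestamp` and `to_srt` and the `(start, end, text)` tuple layout are assumptions made for this example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm (SRT uses a comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Render (start_seconds, end_seconds, text) cues as SRT, numbering from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


print(to_srt([(0.0, 2.5, "Welcome to the show."), (2.5, 4.0, "Let's get started.")]))
```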
Timestamped Transcript: A timestamped transcript is a written record of spoken words where specific times are noted next to the text, syncing it to the audio or video source.
Topic Modeling: Topic modeling is an unsupervised machine learning technique used to discover the abstract topics that occur in a collection of documents.
Transcript Search: Transcript search is the ability to search for specific words, phrases, or concepts within the text of a video or audio transcription.
Vector Embedding: A vector embedding is a way of representing data, such as words or images, as points in a multi-dimensional space where similar items are placed closer together.
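The idea that similar items sit closer together can be sketched with cosine similarity, one common way to compare embeddings. The three-dimensional vectors below are made up for illustration; real embeddings come from a trained model and usually have hundreds of dimensions. The names `cosine_similarity`, `nearest`, and `toy_embeddings` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Assumption: hand-written toy vectors, not output from any real model.
toy_embeddings = {
    "podcast": [0.9, 0.1, 0.0],
    "interview": [0.8, 0.3, 0.1],
    "invoice": [0.0, 0.2, 0.9],
}


def nearest(word: str) -> str:
    """Return the other word whose vector is most similar to the given word's."""
    target = toy_embeddings[word]
    others = (w for w in toy_embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(target, toy_embeddings[w]))


print(nearest("podcast"))
```

Semantic search systems typically apply the same comparison between a query's embedding and the embeddings of stored documents.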
VTT Subtitle Format: VTT (WebVTT) is a modern subtitle format designed for HTML5 video players, offering better styling and metadata support than SRT.
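For plain captions, the most visible differences from SRT are that a WebVTT file starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds. The sketch below converts simple SRT text on that basis; the function name `srt_to_vtt` is an assumption, and it ignores styling, positioning, and malformed input.

```python
import re

# Matches an SRT-style timestamp such as 00:01:02,500.
_SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT: add the header and fix timestamp separators."""
    converted = []
    for line in srt_text.strip().splitlines():
        # Only timing lines contain "-->", so commas in caption text are left alone.
        if "-->" in line:
            line = _SRT_TIMESTAMP.sub(r"\1.\2", line)
        converted.append(line)
    return "WEBVTT\n\n" + "\n".join(converted) + "\n"


sample_srt = "1\n00:00:00,000 --> 00:00:02,500\nHello, and welcome.\n"
print(srt_to_vtt(sample_srt))
```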
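Word Error Rate (WER): WER is a common metric for the accuracy of a speech recognition or machine translation system, computed as the number of word substitutions, deletions, and insertions divided by the number of words in the reference text. Lower is better.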
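WER is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference, which amounts to a word-level edit distance. A minimal sketch follows, assuming whitespace tokenization and no text normalization (evaluation tools usually also normalize case and punctuation); the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two texts, divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion: reference word missing from hypothesis
                    current[j - 1] + 1,  # insertion: extra word in hypothesis
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref_words)


# One word ("the") is dropped from a six-word reference.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.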
YouTube Chapters: YouTube Chapters break up a video into sections, each with an individual preview and title, helping viewers navigate long content more easily.