AI & Transcription Glossary

AI Flashcards

AI flashcards are digital study cards automatically generated from educational content, such as lectures or transcripts, using artificial intelligence.

AI Summary

An AI summary is a condensed version of a long transcript, generated by artificial intelligence to highlight the most important points, decisions, and action items.

AI Transcription

AI transcription is the use of artificial intelligence to automatically convert speech from audio or video files into written text.

Audio Fingerprinting

Audio fingerprinting is the process of deriving a compact digital summary (a fingerprint) from an audio signal, which can then be used to quickly identify audio samples or locate similar items in a database.

Audio to Text

Audio to text is the general process of converting any recorded sound—whether from a voice memo, interview, or song—into a written digital format.

Automatic Speech Recognition (ASR)

ASR is the technology that lets a computer recognize human speech and convert it into readable text.

Content Repurposing

Content repurposing is the practice of taking one piece of content and adapting it for different platforms or formats to reach a wider audience.

Knowledge Base

A knowledge base is a centralized repository of information, data, and insights that can be easily searched and retrieved.

Knowledge Decay

Knowledge decay is the decline in the accuracy or relevance of information over time, or the human tendency to forget information if it is not reinforced.

Meeting Transcription

Meeting transcription is the automated recording and text conversion of business meetings, ensuring every decision and action item is documented.

Multi-Speaker Detection

Multi-speaker detection is the ability of an AI system to identify and track the presence of multiple distinct voices in a single audio stream.

Multilingual Transcription

Multilingual transcription is the ability of an AI system to recognize and transcribe speech across many different languages, often with automatic language detection.

Named Entity Recognition (NER)

NER is an NLP task that identifies and categorizes key information (entities) in text, such as names of people, organizations, locations, and dates.

Natural Language Processing (NLP)

NLP is a field of AI focused on enabling computers to understand, interpret, and generate human language.

Noise Cancellation for Transcription

Noise cancellation is the process of using AI to remove background sounds from a recording to improve the clarity of speech and transcription accuracy.

Podcast Show Notes

Podcast show notes are a written summary of a podcast episode that typically includes a brief overview, links mentioned, and a timestamped list of topics covered.

Podcast Transcription

Podcast transcription is the process of converting the spoken dialogue of a podcast episode into a written text format.

RAG (Retrieval-Augmented Generation)

RAG is a technique in which relevant passages are first retrieved from external data sources and then supplied to an AI model as context, so that its responses are more accurate, up to date, and context-aware.
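
The retrieve-then-generate flow can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the function names `retrieve` and `build_prompt` are hypothetical, retrieval here uses simple word overlap where a real system would more likely use vector embeddings, and the final call to a language model is left out.

```python
def retrieve(query, documents, top_k=2):
    """Rank documents by how many query words they share (a stand-in for real retrieval)."""
    query_words = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(query_words & set(doc.lower().split()))
        scored.append((overlap, doc))
    # Highest overlap first; documents with no shared words are dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, documents, top_k=2):
    """Assemble the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample data for illustration only.
docs = [
    "The budget review meeting is on Friday.",
    "Our podcast covers gardening tips.",
    "Friday lunch is pizza.",
]
prompt = build_prompt("when is the budget review", docs)
```

The resulting `prompt` string is what would be sent to the model, so the model answers from the retrieved passages and not from its training data alone.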

Second Brain

A 'Second Brain' is a digital system for capturing, organizing, and retrieving the ideas and information you encounter, freeing your mind for creative work.

Semantic Search

Semantic search is a search technique that goes beyond matching keywords and tries to determine the intent and contextual meaning behind a query.

Speaker Diarization

Speaker diarization is the process of partitioning an audio stream into segments that each contain a single speaker, answering the question of who spoke when.

Speaker Identification

Speaker identification is the process of automatically recognizing and labeling a speaker's voice based on a previously stored voice profile.

SRT Subtitle Format

SRT (SubRip Text) is a simple and widely supported file format for video subtitles and captions.
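
An SRT file is a sequence of cues, each made of a numeric index, a timing line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, the caption text, and a blank line. The sketch below writes that layout from a list of cues; the helper names `srt_timestamp` and `to_srt` and the `(start, end, text)` tuple shape are assumptions made for this example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm (SRT uses a comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


srt_text = to_srt([(0.0, 2.5, "Hello"), (2.5, 4.0, "World")])
```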

Timestamped Transcript

A timestamped transcript is a written record of spoken words where specific times are noted next to the text, syncing it to the audio or video source.

Topic Modeling

Topic modeling is an unsupervised machine learning technique used to discover the abstract topics that occur in a collection of documents.

Transcript Search

Transcript search is the ability to search for specific words, phrases, or concepts within the text of a video or audio transcription.

Vector Embedding

A vector embedding is a way of representing data, such as words or images, as points in a multi-dimensional space where similar items are placed closer together.
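
Closeness between embeddings is often measured with cosine similarity. The sketch below uses hand-written three-dimensional vectors purely for illustration; real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors invented for this example, not output from any real model.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

cat_kitten = cosine_similarity(embeddings["cat"], embeddings["kitten"])
cat_invoice = cosine_similarity(embeddings["cat"], embeddings["invoice"])
```

With these toy values, `cat_kitten` comes out much higher than `cat_invoice`, which is the sense in which similar items sit closer together.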

VTT Subtitle Format

VTT (WebVTT) is a modern subtitle format designed for HTML5 video players, offering better styling and metadata support than SRT.
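
At the basic level a WebVTT file differs from SRT in two visible ways: it starts with a `WEBVTT` header line, and its timestamps use a period instead of a comma before the milliseconds. The sketch below converts simple SRT text on that basis; the function name `srt_to_vtt` is an assumption, and styling or positioning features are not handled.

```python
import re

# Matches an SRT timestamp such as 00:01:02,345 and captures both halves.
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT, keeping the numeric cue identifiers."""
    lines = []
    for line in srt_text.strip().splitlines():
        if "-->" in line:
            # Only timing lines are rewritten, so commas in caption text survive.
            line = _SRT_TIME.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


vtt_text = srt_to_vtt("1\n00:00:00,000 --> 00:00:02,500\nHello, world\n")
```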

Word Error Rate (WER)

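Word Error Rate (WER) is a common metric for the accuracy of a speech recognition or machine translation system: the proportion of word-level errors in the system's output relative to a reference text. Lower is better.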
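
WER is conventionally computed as (S + D + I) / N, where S, D, and I are the numbers of substituted, deleted, and inserted words and N is the number of words in the reference. The sketch below finds the minimum number of such edits with a word-level edit distance; the function name and the lowercase, whitespace-based tokenization are simplifying assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
```

Here one of six reference words is substituted, so `wer` is 1/6. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.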

YouTube Chapters

YouTube Chapters break up a video into sections, each with an individual preview and title, helping viewers navigate long content more easily.