What is Voice Cloning?

Creating a synthetic replica of a specific person's voice using AI. Users record a short sample (30 seconds to 5 minutes), and the AI learns to reproduce their speech patterns, tone, and accent. Used for personalized video content.

Related Terms

Neural Voice

A synthetic voice generated by deep neural networks (as opposed to older concatenative TTS). Neural voices sound significantly more natural, with proper intonation, breathing, and emotional range. Leading TTS providers now produce neural voices.

Text-to-Speech (TTS)

Technology that converts written text into spoken audio. Modern TTS systems produce natural-sounding voices with emotion, pacing, and accent control. Puppetry offers 500+ AI voices across 65+ languages.

AI Spokesperson

A digital presenter generated and animated by AI for marketing, training, or product videos. Replaces the cost and scheduling of hiring on-camera talent. Companies typically use a single AI spokesperson identity across many videos for brand consistency.

Synthetic Media

Any media (image, audio, video) generated or substantially modified by AI rather than captured from the real world. Encompasses AI avatars, AI voice, generated music, and AI-translated video. The legitimate, consent-based end of the spectrum is sometimes called "synthetic content."
