How do I extract audio from a YouTube video for free?

Paste the YouTube URL into YTCut, set your start and end timestamps (or leave them at the full duration), choose MP3 or M4A as the format, and click download. The audio file downloads directly with no upload or account required. The process takes under a minute for most videos.

What is the best audio format to download from YouTube?

M4A (AAC) is the most efficient format for listening: it sounds better than MP3 at the same bitrate. MP3 offers maximum compatibility with all devices and software. Use WAV only if you plan to edit the audio afterward, since it avoids adding another round of lossy compression during production; it cannot restore detail YouTube has already discarded.

What quality audio can I get from YouTube?

YouTube typically serves audio at roughly 128kbps AAC or 160kbps Opus, depending on the video. No extraction tool can give you better quality than what YouTube has stored. Requesting 320kbps MP3 from a 128kbps source does not improve quality; it just encodes the same audio at a higher bitrate with no added detail.

Does extracting audio from YouTube reduce quality?

If you extract as M4A, the audio can be passed through without re-encoding, preserving the original YouTube quality exactly. Extracting as MP3 requires re-encoding from AAC, which introduces minimal but technically measurable generation loss. At 192kbps MP3 output, the loss is imperceptible in practice.

What is the best command-line tool to download YouTube audio?

yt-dlp is the best command-line option. The command yt-dlp -x --audio-format mp3 --audio-quality 192K [URL] downloads and converts to 192kbps MP3. yt-dlp requires FFmpeg to be installed for format conversion and is actively maintained by a large open-source community.

Can I extract audio from a YouTube live stream?

After a live stream is archived as a regular video, any extraction method works on it exactly as on a regular video. During an active live stream, extraction is technically possible with yt-dlp but complicated and unreliable. The most practical approach is to wait for the stream to finish and be archived.

How to Extract Audio from YouTube Videos (Every Method, Every Format)

Why People Extract Audio from YouTube (The Actual Reasons)

Let's start here because the reason you're extracting audio matters a lot for which method and format you should pick. Not all use cases are the same, and treating them like they are is how you end up with a 320kbps MP3 of a 45-minute lecture you're only going to listen to once while doing dishes.

Language learning

This is probably the biggest category. YouTube has an enormous amount of native-speaker content in every language imaginable, from French cooking shows to Japanese variety programs to Spanish news broadcasts. Language learners extract short audio segments to practice listening, shadow speakers, and build vocabulary. They need clips. Not the full hour-long video. Just the 90-second segment where the host explains something interesting using vocabulary at their level.

For this use case: MP3 at 192kbps is plenty. You want small files you can load onto a phone. Audio quality matters because speech clarity is important, but you don't need studio-grade WAV files for this.

Podcast clips and reference material

Someone says something genuinely interesting in a three-hour podcast. You want that clip. The whole podcast is on YouTube (most are these days) and you need the two-minute segment where the guest drops a fact you want to share or listen to again. Downloading the entire three-hour video as an MP4 to get 120 seconds of audio is absurd. Extract just what you need.

Music snippets for personal use

Live performances. Rare B-sides that never got an official release. Session recordings that exist only on YouTube. Bootleg concert footage from 2008 that will never be on any streaming platform. People pull these because there's genuinely no other option to have them offline.

For music: quality matters more. Prefer M4A in YouTube's native format, or a high-bitrate MP3, if the source is good enough to justify it. Don't bother with anything above 192kbps if the original upload was a potato-quality 240p video from 2009.

Lecture notes and educational content

University lectures uploaded to YouTube. Coding tutorials. Cooking demonstrations. History documentaries. People extract these to listen while commuting, exercising, or doing tasks where they can't watch a screen. Audio-only works perfectly for content that's primarily voice-driven.

Ringtones and notification sounds

Guilty. We all know someone who has done this. A specific sound effect, a memorable line, a piece of music. People grab very short clips (5-30 seconds) for personal use. For this: MP3 works, and you want a precise start and end point so the clip doesn't have a bunch of silence before the sound hits.

Content creation and remixing

Podcasters sampling interview clips. Video editors using B-roll audio. YouTube creators using audio from their own older videos. DJ practice. This is where format choice gets more serious: WAV or FLAC if you're doing further audio processing, because you want the full quality before you start doing anything else to it.

With the "why" established, let's go through every method.

Method 1: YTCut (Precise Segments, Any Format)

This is the method you want when you need a specific segment of a video's audio, not the whole thing. Most tools extract the entire video's audio track. YTCut lets you set an exact start time and end time, then download only that portion as audio. This is genuinely useful.

Step-by-step

Go to ytcut.org and paste the YouTube URL into the input field.
The video preview loads. Use the timeline to set your start point. You can type exact timestamps in the fields or drag the handles on the waveform view. Millisecond precision is available.
Set the end point the same way. If you're grabbing a specific quote that ends at exactly 4:32.8, you can set it there rather than guessing with a rough cut.
Click the format dropdown. You'll see audio options: MP3, M4A, WAV, OGG. Pick the one you need.
For MP3: you can select bitrate (128kbps, 192kbps, 320kbps). For most uses, 192kbps is the right call.
Click Download. Your exact audio segment downloads directly.

Tip: If you're extracting audio for speech (lectures, podcasts, language learning), 128kbps MP3 is fine and keeps file sizes very manageable. If you're extracting music or plan to do any further editing, go 320kbps or WAV.

Why YTCut is the right tool for segments

Most other tools (including yt-dlp with its default options) will download the entire audio track of a video and expect you to trim it afterward in a separate audio editor. YTCut cuts the segment server-side before downloading. This means you download a 45-second MP3 instead of a 3-hour WAV file that you then have to open in Audacity to trim. For the average person who just wants a clip, this is significantly faster and easier.

The tradeoff: you need an internet connection and you're relying on the YTCut server to do the processing. For offline workflows, or if you're doing batch extraction of dozens of clips, yt-dlp is a better choice.

Method 2: yt-dlp Command Line (Best Quality Control)

yt-dlp is a fork of youtube-dl, which now sees far fewer updates. yt-dlp is actively maintained, often significantly faster, and has more features. If you're comfortable with a terminal, this is the most powerful option for audio extraction.

Install yt-dlp

On Mac with Homebrew: brew install yt-dlp

On Windows: Download the .exe from the GitHub releases page and add it to your PATH, or install via pip: pip install yt-dlp

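On Linux (Debian/Ubuntu): sudo apt install yt-dlp, or pip install yt-dlp on any distribution. Distribution packages can lag behind the current release, and an outdated yt-dlp is a common cause of failed downloads.

You also need FFmpeg installed, which yt-dlp uses for audio conversion. On Mac: brew install ffmpeg. On Windows: download from ffmpeg.org and add to PATH. On Debian/Ubuntu: sudo apt install ffmpeg.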
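Before running any of the commands below, it can save time to confirm that both tools are actually on your PATH. This is a minimal sketch; the function names have_tool and check_tools are made up for illustration.

```shell
#!/bin/sh
# Return success if the named command can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Check the two tools this method depends on and report what is missing.
# Returns non-zero if at least one of them was not found.
check_tools() {
  missing=0
  for tool in yt-dlp ffmpeg; do
    if have_tool "$tool"; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_tools prints one line per tool. If ffmpeg shows up as missing, format conversion with --audio-format will not work until it is installed.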

The basic extraction commands

Download as MP3 at 192kbps:

yt-dlp -x --audio-format mp3 --audio-quality 192K "https://www.youtube.com/watch?v=VIDEOID"

What each flag does:

-x: Extract audio. This tells yt-dlp to prefer the best audio-only stream (falling back to a combined stream and stripping the video if no audio-only stream exists) and keep only the audio. Without this flag it downloads video.
--audio-format mp3: Convert the extracted audio to MP3. Without this, yt-dlp gives you whatever format YouTube provides natively (usually M4A or WebM/Opus).
--audio-quality 192K: Sets the quality for the conversion. Valid values: 0 (best) through 10 (worst) for VBR mode, or a specific bitrate like 128K, 192K, 320K. The default is 5.

Download as WAV (uncompressed):

yt-dlp -x --audio-format wav "https://www.youtube.com/watch?v=VIDEOID"

Download as M4A (re-encodes only if the selected stream is not already AAC; add -f "bestaudio[ext=m4a]" to request the AAC stream directly):

yt-dlp -x --audio-format m4a "https://www.youtube.com/watch?v=VIDEOID"

Download best quality audio, no re-encoding:

yt-dlp -f bestaudio "https://www.youtube.com/watch?v=VIDEOID"

This last command is the one audiophiles care about. It downloads the native audio stream from YouTube without any re-encoding. YouTube's best audio is typically 128kbps AAC (M4A) or 160kbps Opus (WebM), depending on the video. No conversion means no generation loss.

Download audio with custom filename

yt-dlp -x --audio-format mp3 --audio-quality 192K -o "%(title)s.%(ext)s" "URL"

The -o flag sets the output filename template. %(title)s uses the video title, %(ext)s uses the correct extension.

Batch download audio from a playlist

yt-dlp -x --audio-format mp3 --audio-quality 192K "https://www.youtube.com/playlist?list=PLAYLISTID"

This downloads every video in the playlist as MP3. yt-dlp handles the whole thing automatically. If a download fails partway through, just run it again and it skips already-downloaded files.
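For larger playlists, a slightly more defensive variant may be worth having. The sketch below wraps the same command in a function and adds two yt-dlp options: --download-archive, which records finished video IDs in a text file so reruns skip them, and an output template that prefixes the playlist position. The function name playlist_to_mp3, the archive filename downloaded.txt, and the DRY_RUN switch are illustrative choices, not yt-dlp conventions.

```shell
#!/bin/sh
# Download a playlist as 192kbps MP3, remembering finished items between runs.
# Set DRY_RUN=1 to print the command instead of executing it.
playlist_to_mp3() {
  url=$1
  set -- yt-dlp -x --audio-format mp3 --audio-quality 192K \
    --download-archive downloaded.txt \
    -o "%(playlist_index)s - %(title)s.%(ext)s" \
    "$url"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # show what would run
  else
    "$@"                 # run yt-dlp with the arguments built above
  fi
}
```

Running DRY_RUN=1 playlist_to_mp3 "https://www.youtube.com/playlist?list=PLAYLISTID" shows the exact command first, which is a cheap way to check the template before committing to a long download.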

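Important: to grab only a time segment, yt-dlp needs its --download-sections option (for example --download-sections "*12:30-15:45"), which relies on FFmpeg and cuts at keyframes unless you also pass --force-keyframes-at-cuts. Without it, you download the whole track and trim afterward with FFmpeg's -ss and -to flags. If you need only 30 seconds of a video, YTCut handles this more cleanly for most users.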
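Recent yt-dlp releases also accept a --download-sections option that asks FFmpeg to fetch only a given time range. A minimal sketch of wrapping it, assuming a reasonably current yt-dlp and FFmpeg on PATH; the function name clip_audio and the DRY_RUN switch are hypothetical helpers, not part of yt-dlp.

```shell
#!/bin/sh
# Download only the START-END range of a video as 192kbps MP3.
# Usage: clip_audio URL START END   (timestamps like 12:30 or 00:12:30)
# Set DRY_RUN=1 to print the command instead of executing it.
clip_audio() {
  url=$1
  start=$2
  end=$3
  # The leading "*" tells yt-dlp the section is a time range, not a chapter name.
  set -- yt-dlp -x --audio-format mp3 --audio-quality 192K \
    --download-sections "*${start}-${end}" "$url"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, clip_audio "https://www.youtube.com/watch?v=VIDEOID" 12:30 15:45 would request roughly three minutes of audio. Cut points may land slightly off the requested timestamps, so leave a second of margin on either side if the exact boundary matters.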

Method 3: VLC Media Player

VLC can open YouTube URLs and convert the stream to audio. This is useful if you already have VLC installed and don't want to install anything new. The quality is decent, but the process is a bit clunky compared to dedicated tools.

Steps

Open VLC. Go to Media in the top menu.
Click "Convert/Save" (Ctrl+R on Windows).
Click the "Network" tab and paste the YouTube URL.
Click "Convert/Save" (not Open).
In the Convert dialog, click the dropdown next to Profile and choose "Audio - MP3".
To customize quality: click the wrench icon next to the profile. Go to the Audio codec tab. Change the bitrate to 192kbps or higher.
Set a destination file path under "Destination file".
Click Start. VLC processes the stream and saves the audio file.

The catch with VLC: it's slow for long videos because it processes in real-time by default. A 2-hour video takes roughly 2 hours to convert in VLC. yt-dlp is much faster because it downloads the full stream and then converts.

VLC is good for: occasional conversions, users who hate the command line, situations where you already have VLC open and just need one file quickly.

Method 4: Audacity

Audacity is a free audio editor that's been around since 2000. It doesn't download from YouTube directly, but once you have an audio file, it's an excellent tool for editing and re-exporting in different formats.

The workflow

First, get the audio file using any of the other methods (YTCut, yt-dlp, etc.).
Open Audacity. Drag the audio file into the Audacity window, or use File > Import > Audio.
The audio loads as a waveform. You can now trim, adjust levels, reduce noise, cut sections, or do anything else.
To export: File > Export > Export as MP3 (or WAV, FLAC, etc.). In Audacity 3.4 and later, the equivalent menu entry is File > Export Audio.
Set quality (bitrate for MP3, sample rate, channels) in the export dialog.

Why use Audacity after extracting audio? Because you might want to do things like: normalize the volume so the audio is a consistent level, reduce background noise from a recording, trim silence from the beginning and end, combine multiple clips into one file, or add basic effects. Audacity handles all of this well.

Tip: Audacity's "Noise Reduction" tool is surprisingly good. If you extracted audio from a video with background noise (street sounds, fan noise, etc.), select a short section that contains only the noise, open Effect > Noise Reduction and click Get Noise Profile, then select the whole track and apply the effect. It makes a noticeable difference.

Audacity export quality settings

When exporting as MP3 from Audacity, you'll see quality options. Use "Preset: Standard (170-210 kbps)" for most uses. Use "Preset: Extreme (220-260 kbps)" for music where quality matters. The "Insane (320 kbps)" preset doesn't sound much better than Extreme for most material and creates significantly larger files.

For WAV export: Audacity exports at the sample rate and bit depth you're working in. If you imported a 44.1kHz file, it exports at 44.1kHz. This is the format to use before sending audio to a professional for mastering or mixing.

Method 5: Online Converters

Y2Mate, OnlineVideoConverter, YTMP3, SaveFrom.net. These sites exist. They work. Sort of.

The honest assessment

Fast. No installation. Paste URL, get MP3. That's genuinely appealing.

The problems, which are real:

Quality ceiling: Most cap output at 128kbps MP3. That's noticeably worse than 192kbps for music, and noticeably worse than 320kbps for anything where audio fidelity matters. For a spoken-word podcast clip it's fine. For music it's not.
Privacy: You're submitting a YouTube URL (and sometimes your IP and browser info) to a third-party server that you know nothing about. Most of these sites monetize through aggressive advertising. Some have had malware issues historically.
Reliability: These sites go down frequently. YouTube regularly blocks their API access. The service that worked last week might not work this week. You get no warning.
Ad experience: Many of these sites are genuinely unpleasant to use. Multiple popups, fake download buttons, sketchy redirects. If you're going to use one, uBlock Origin in your browser is not optional.
No segment control: You get the full video's audio. You cannot specify "I want minutes 12:30 through 15:45 only."

For a one-time extraction of a speech or lecture where 128kbps is fine and you just need it done in 30 seconds: these tools are acceptable. For anything where quality or privacy matters, use a better method.

Method 6: Descript

Descript is a paid tool (with a limited free tier) primarily designed for podcast and video editing. It's in this list because it does something the other tools don't: it generates a text transcript of your audio as part of the workflow.

When Descript makes sense

You have a YouTube interview or podcast episode. You want to extract a specific quote as audio, but you're not sure exactly where the quote starts and ends. Descript's workflow is: import the video, it transcribes it automatically, you find the words in the transcript and click them to find the timestamp, then cut the audio around that word.

For podcast producers who regularly extract clips to promote episodes, this is a legitimate workflow. You find the quotable moment by reading the transcript rather than scrubbing through audio. Then you export just that clip.

The free tier of Descript allows a limited number of hours per month. For occasional use it works. For regular use, the subscription cost is significant (pricing varies, check their site for current rates).

For most people who just want to extract audio, Descript is overkill. But if you're building a content workflow where clips are a regular output, the transcript-based editing is genuinely much faster than timeline scrubbing.

Method 7: FFmpeg Direct

FFmpeg is the underlying engine that most of these tools use. You can use it directly for maximum control, including extracting audio from an already-downloaded video file.

Extract audio from a video file to MP3:

ffmpeg -i input_video.mp4 -vn -acodec libmp3lame -q:a 2 output_audio.mp3

Flags explained:

-i input_video.mp4: Input file
-vn: No video (skip the video stream, audio only)
-acodec libmp3lame: Use the LAME MP3 encoder
-q:a 2: VBR quality level 2, which is roughly 190kbps average (about 170-210kbps). Scale is 0 (best) to 9 (worst). Quality 0 averages roughly 245kbps; for a constant 320kbps, use -b:a 320k instead of -q:a.

Extract audio as WAV (uncompressed, lossless):

ffmpeg -i input_video.mp4 -vn -acodec pcm_s16le output_audio.wav

Extract only a segment (30 seconds starting at 5 minutes):

ffmpeg -i input_video.mp4 -ss 00:05:00 -to 00:05:30 -vn -acodec libmp3lame -q:a 2 output_clip.mp3
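Because -ss and -to are both absolute positions here, it is easy to mistype one and end up with a clip of the wrong length. The helper below is a small sketch for sanity-checking timestamps before running FFmpeg; to_seconds and clip_length are illustrative names, and the parsing assumes plain H:MM:SS, MM:SS, or SS values with optional decimals.

```shell
#!/bin/sh
# Convert a timestamp such as 00:05:30 or 4:32.8 into seconds.
to_seconds() {
  printf '%s\n' "$1" | awk -F: '{
    s = 0
    for (i = 1; i <= NF; i++) s = s * 60 + $i   # each colon shifts by a factor of 60
    print s
  }'
}

# Print the duration in seconds between two timestamps.
clip_length() {
  start=$(to_seconds "$1")
  end=$(to_seconds "$2")
  awk -v a="$start" -v b="$end" 'BEGIN { print b - a }'
}
```

For the command above, clip_length 00:05:00 00:05:30 prints 30, confirming a 30-second clip.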

Copy audio stream without re-encoding (fastest, lossless if the source is compatible):

ffmpeg -i input_video.mp4 -vn -acodec copy output_audio.m4a

This last command is the fastest option when the source video already has an AAC audio track (which most MP4 files do). It extracts the audio without any re-encoding, so there's no quality loss and it completes in seconds regardless of file length.

Tip: The -acodec copy trick only works if the output format is compatible with the source audio codec. MP4 video with AAC audio: extract to .m4a or .aac with -acodec copy. WebM with Opus audio: extract to .opus or .ogg with -acodec copy. If you need MP3 specifically, you must re-encode.
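One way to apply the compatibility rule without guessing is to ask ffprobe (installed alongside FFmpeg) which audio codec the file contains and pick the extension from that. This is a sketch under that assumption; ext_for_codec and copy_audio are hypothetical helper names, and the codec list covers only the common cases.

```shell
#!/bin/sh
# Map an audio codec name (as reported by ffprobe) to a container
# extension that can hold it without re-encoding.
ext_for_codec() {
  case "$1" in
    aac)    echo m4a ;;
    opus)   echo opus ;;
    vorbis) echo ogg ;;
    mp3)    echo mp3 ;;
    *)      return 1 ;;   # unknown codec: caller should re-encode instead
  esac
}

# Copy the first audio stream of a file into a matching container.
copy_audio() {
  in=$1
  codec=$(ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$in") || return 1
  ext=$(ext_for_codec "$codec") || {
    echo "no copy-compatible container known for '$codec'; re-encode instead" >&2
    return 1
  }
  ffmpeg -i "$in" -vn -acodec copy "${in%.*}.$ext"
}
```

With this, copy_audio input_video.mp4 would write input_video.m4a when the source holds AAC, and refuse rather than produce a broken file when the codec is not in the list.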

Audio Format Comparison: MP3 vs M4A vs WAV vs OGG vs FLAC

People get weirdly religious about audio formats. Let's be practical about it.

Format	Compression	Quality at small size	Compatibility	Best use case
MP3	Lossy	Good at 192kbps+	Universal (plays literally everywhere)	General sharing, ringtones, portable audio
M4A (AAC)	Lossy	Better than MP3 at same bitrate	Excellent (all Apple devices, most Android)	Streaming, Apple ecosystem, better efficiency
WAV	Lossless (uncompressed)	Perfect, but huge files	Excellent on desktop, poor on mobile	Audio editing, professional production
OGG (Vorbis)	Lossy	Good at 160kbps+	Poor (not supported by iTunes/iOS natively)	Web streaming, Linux, open-source workflows
FLAC	Lossless (compressed)	Perfect, smaller than WAV	Good on desktop, improving on mobile	Archiving, audiophile listening, editing masters
Opus	Lossy	Excellent, best at low bitrates	Web and modern apps, not iTunes	Voice calls, web audio, streaming at low bitrates

The practical summary

Use MP3 when: you're sharing the file with anyone and don't know what they're using. MP3 plays on virtually everything. 128-192kbps for speech, 192kbps or higher for music.

Use M4A when: you're on an Apple device or ecosystem, and you want slightly better quality than MP3 at the same file size. M4A (AAC) is genuinely more efficient than MP3 at equivalent bitrates. YouTube serves AAC natively (alongside Opus), so downloading the AAC stream as M4A avoids a generation of re-encoding loss.

Use WAV when: you're going to edit the audio further in any professional software. Always do audio editing from lossless source material. If you compress to MP3, edit the MP3, then export again as MP3, you've introduced two generations of lossy compression. That adds up audibly.

Use OGG when: you specifically need an open-source format and know the recipient can play it. Web developers and Linux users reach for this. Most other people don't need it.

Use FLAC when: you want lossless audio but also want a smaller file than WAV. FLAC is lossless but compressed, so a 200MB WAV might be 100MB as FLAC with no quality difference. Good for archiving large audio collections.

File Size Reference Table

This is what people actually want to know. How big is this going to be? Sizes below use decimal megabytes (1 MB = 1,000,000 bytes) and ignore small container overhead.

Format and bitrate	1 hour of audio	30 minutes	10 minutes	1 minute
MP3 at 128kbps	57.6 MB	28.8 MB	9.6 MB	0.96 MB
MP3 at 192kbps	86.4 MB	43.2 MB	14.4 MB	1.44 MB
MP3 at 320kbps	144 MB	72 MB	24 MB	2.4 MB
M4A at 128kbps (AAC)	57.6 MB	28.8 MB	9.6 MB	0.96 MB
WAV at 44.1kHz/16-bit stereo	635 MB	318 MB	106 MB	10.6 MB
FLAC (44.1kHz, varies)	approx 200-280 MB	approx 100-140 MB	approx 33-46 MB	approx 3-5 MB
OGG Vorbis at 160kbps	72 MB	36 MB	12 MB	1.2 MB

The WAV row is the one that surprises people. An hour of uncompressed audio is over 600MB. That's why MP3 exists. For most use cases, 192kbps MP3 or M4A gives you excellent quality at a very manageable file size. You don't need WAV unless you're doing professional audio work.
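If you want a number for a duration or bitrate that isn't in the table, the arithmetic is simple: bitrate times duration, divided by 8 to turn bits into bytes. The sketch below uses decimal megabytes (1 MB = 1,000,000 bytes) and ignores container overhead, so results can differ by a few percent from figures quoted in binary units. The function names are illustrative.

```shell
#!/bin/sh
# Size in MB of a constant-bitrate lossy file.
# Usage: lossy_size_mb KBPS SECONDS
lossy_size_mb() {
  awk -v kbps="$1" -v secs="$2" \
    'BEGIN { printf "%.1f\n", kbps * 1000 / 8 * secs / 1000000 }'
}

# Size in MB of uncompressed PCM audio (WAV payload).
# Usage: pcm_size_mb SAMPLE_RATE BITS_PER_SAMPLE CHANNELS SECONDS
pcm_size_mb() {
  awk -v r="$1" -v b="$2" -v c="$3" -v s="$4" \
    'BEGIN { printf "%.1f\n", r * b / 8 * c * s / 1000000 }'
}
```

For example, lossy_size_mb 128 3600 prints 57.6 and pcm_size_mb 44100 16 2 3600 prints 635.0, which is where the gap between an hour of MP3 and an hour of WAV comes from.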

How to Get the Best Possible Audio Quality

Here's the core principle that most guides skip: the quality ceiling is set by the source video, not your conversion settings.

YouTube does not store your audio in lossless format. They compress everything they receive during upload. The original uploader might have uploaded a lossless WAV file, but YouTube re-encoded it before storing it. What YouTube serves you is typically 128kbps AAC in M4A format or roughly 160kbps Opus in WebM format, and most videos offer both. That's the quality ceiling.

If you convert that to a 320kbps MP3, you have a 320kbps MP3 that contains... 128kbps worth of audio data. The extra bits are just encoding the already-compressed audio into a larger container. This doesn't add quality. It adds file size.

The native extraction approach

The highest quality extraction method is to get the audio stream as YouTube provides it, without re-encoding:

yt-dlp -f bestaudio "URL"

This gives you YouTube's native audio stream. Usually that's a .webm file with Opus audio at 160kbps, or an .m4a file with AAC audio at 128kbps. These are lossy, but they're the original lossy copy without an additional re-encoding step on top.
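If you care which of the two native streams you get, yt-dlp's format selector can filter by file extension. The helper below is a sketch that returns a selector string with a fallback to plain bestaudio; native_selector is a made-up name, while the bestaudio[ext=...] filter syntax is yt-dlp's own.

```shell
#!/bin/sh
# Print a yt-dlp format selector for a preferred native container.
# Falls back to whatever the best audio stream is if the preferred
# container is not offered for a given video.
native_selector() {
  case "$1" in
    m4a)  echo "bestaudio[ext=m4a]/bestaudio" ;;
    webm) echo "bestaudio[ext=webm]/bestaudio" ;;
    *)    echo "bestaudio" ;;
  esac
}
```

Usage would look like yt-dlp -f "$(native_selector m4a)" "URL". Running yt-dlp -F "URL" first lists every stream the video offers, which is the quickest way to see what is actually available.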

When you tell a tool "give me 320kbps MP3," what actually happens is: download YouTube's 128kbps AAC, re-encode that AAC to MP3 at 320kbps. You've added one generation of lossy compression on top of the existing lossy compression. The output is slightly worse than the source, just larger.

The practical advice

If the audio quality of the original video is good (professionally produced podcast, studio recording, etc.): download as M4A or native format to avoid re-encoding. 128kbps AAC is fine.
If you need MP3 specifically: 192kbps is the sweet spot. Going higher doesn't help, going lower is noticeable.
If you're going to edit the audio after extracting it: download as WAV to avoid further generation loss during your editing exports.
If the source video is low quality (phone recording, webcam audio, background noise): no bitrate setting will fix bad source audio. 128kbps MP3 is fine, the audio quality is limited by the source anyway.
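The four cases above map fairly directly onto the yt-dlp flags used earlier in this guide. As a rough sketch, with the goal names (native, mp3, edit, rough) and the function name invented purely for illustration:

```shell
#!/bin/sh
# Print the yt-dlp options that match one of the four situations above.
ytdlp_args_for() {
  case "$1" in
    native) echo "-f bestaudio" ;;                                 # good source, avoid re-encoding
    mp3)    echo "-x --audio-format mp3 --audio-quality 192K" ;;   # MP3 specifically
    edit)   echo "-x --audio-format wav" ;;                        # editing afterward
    rough)  echo "-x --audio-format mp3 --audio-quality 128K" ;;   # low-quality source
    *)      echo "unknown goal: $1" >&2; return 1 ;;
  esac
}
```

For example, yt-dlp $(ytdlp_args_for edit) "URL" expands to the WAV command. The substitution is left unquoted on purpose so the shell splits it into separate options.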

Common Mistakes When Extracting Audio

Mistake 1: Downloading the full video when you need 30 seconds

This is the most common one. Someone needs a 45-second clip from a 3-hour recorded event. They download the entire 3-hour MP4 (2-4GB), then try to figure out how to trim it. The better approach: use YTCut to set the start and end point precisely before downloading, so you get a 45-second MP3 file directly.

Mistake 2: Using 128kbps for music

128kbps is genuinely not great for music listening. The artifacts are audible on headphones, particularly in the high frequencies (cymbals, strings) and during quiet passages. Use 192kbps minimum for music. 320kbps if you're particularly sensitive to audio quality or listening through good headphones.

Mistake 3: Using 320kbps for speech

The reverse mistake. A podcast or lecture at 128kbps is fine. Human speech doesn't have the complex frequency content of music. You're wasting file space. 128kbps MP3 for speech sounds indistinguishable from 320kbps to virtually everyone.

Mistake 4: Converting MP3 to WAV and thinking you got lossless

A WAV file made from an MP3 source is not lossless. It's just the MP3's decoded audio stored as uncompressed PCM. The file is much larger but the audio quality is the same as the source MP3, artifacts included. If you need lossless, you needed to start with a lossless source.

Mistake 5: Trusting "HD audio" labels on sketchy converter sites

Some online converter sites claim to offer "HD 320kbps" or "high quality" extraction. In most cases, they're downloading the same 128kbps stream that everyone else downloads and encoding it to 320kbps MP3. The label means nothing. The actual quality is determined by what YouTube provides, not what the site claims to do.

Mistake 6: Ignoring the sample rate

YouTube's AAC streams are typically 44.1kHz stereo and its Opus streams are 48kHz stereo. Both are completely standard for music and speech, and you don't need to change them. However, some converters default to 22.05kHz or mono output, which sounds noticeably worse. Check that your output is 44.1kHz or 48kHz stereo before committing to a workflow.
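A quick way to run that check is ffprobe, which ships with FFmpeg. The sketch below treats 44.1kHz and 48kHz (the rate Opus streams use) as expected values and flags anything else, along with mono output. The function names are hypothetical.

```shell
#!/bin/sh
# Succeed if the sample rate and channel count look like normal stereo audio.
# Usage: is_standard_audio SAMPLE_RATE CHANNELS
is_standard_audio() {
  case "$1" in
    44100|48000) ;;
    *) return 1 ;;
  esac
  [ "$2" -eq 2 ]
}

# Inspect the first audio stream of a file and report whether it passes.
check_audio_file() {
  info=$(ffprobe -v error -select_streams a:0 \
    -show_entries stream=sample_rate,channels \
    -of default=noprint_wrappers=1 "$1") || return 1
  # ffprobe prints key=value lines; pull out the two values by key.
  rate=$(printf '%s\n' "$info" | sed -n 's/^sample_rate=//p')
  channels=$(printf '%s\n' "$info" | sed -n 's/^channels=//p')
  if is_standard_audio "$rate" "$channels"; then
    echo "ok: ${rate} Hz, ${channels} channels"
  else
    echo "check settings: ${rate} Hz, ${channels} channels"
    return 1
  fi
}
```

Running check_audio_file output_audio.mp3 on one test file is usually enough to catch a converter that silently downsamples or folds to mono.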

Frequently Asked Questions

Can I extract audio from a YouTube video that has copyright-restricted music?

Technically, the extraction tools don't care about copyright. They download what's available on YouTube. However, whether you're allowed to do so is a separate question governed by copyright law and YouTube's Terms of Service. For personal, non-commercial use, the practical risk is very low. For commercial use or redistribution, there are real legal considerations. (Our legal guide covers this in much more detail.)

Why does my extracted MP3 sound muffled or worse than the video?

Most likely reason: the extraction used too low a bitrate. Check what bitrate you extracted at. If you used 64kbps or 96kbps, that's why. Try again at 192kbps. Second likely reason: your tool selected a lower-quality audio stream than the best one YouTube had available. Make sure you're selecting "best available" quality in your tool.

Is there a file size limit for extraction on YTCut?

YTCut doesn't impose strict file size limits for audio extraction, but very long segments from very long videos may take longer to process. For extracting audio from full 4-hour videos, yt-dlp is more practical since it runs locally. For clips and segments, YTCut handles it quickly.

What's the difference between extracting audio and downloading audio?

Essentially nothing in this context. Both mean getting the audio track from a YouTube video saved to your device. The word "extract" technically implies separating the audio from the video (which is what happens), while "download" emphasizes getting it onto your device. In practice, people use these interchangeably and they mean the same process.

Can I extract audio from YouTube Live streams?

While a stream is live: yt-dlp can record a live stream, but it's complicated and might not work reliably. After a live stream is archived as a regular video: yes, any method in this guide works on archived streams exactly as it does on regular videos.

Why does yt-dlp give me a .webm file instead of MP3?

You either didn't include the --audio-format mp3 flag, or you don't have FFmpeg installed (yt-dlp needs FFmpeg to convert formats). Install FFmpeg and make sure it's in your system PATH. Then the --audio-format flag will work correctly.

Does the audio quality depend on the video's resolution?

No. YouTube stores audio and video as separate streams. A 1080p video and a 360p video of the same content will usually have identical audio streams. The video resolution does not affect audio quality. What affects audio quality is the original upload quality and YouTube's audio processing, both of which are independent of resolution.
