4 Easy Ways to Transcribe Audio with Google Docs (Free)
June 8, 2025
 
                    
Did you know that Google Docs, which you probably use every day, has a built-in "voice typing" (transcription) feature?
- You want to transcribe audio for free.
- You always use Google Docs for work.
- You're interested in more than just real-time voice transcription, like converting recorded audio to text or extracting text from images and PDFs.
If this sounds like you, then not using this feature would frankly be a waste!

Want to know everything Google Docs transcription can do?
This guide explains all the transcription tools available in Google Docs.
When you search for information on Google Docs transcription, you often only find instructions for either "real-time voice typing" or "audio-to-text conversion." This article, however, covers all related transcription features comprehensively.
By reading this article, you'll be able to easily create documents by voice, transcribe meeting minutes, and perform various other tasks using just Google Docs, without needing to install any new apps.
Additionally, we'll introduce a recommended method for transcribing audio files like MP3s, which Google Docs isn't great at handling directly, using an alternative service.
Please read to the end.
Free and Easy-to-Use AI Transcription Service: Better Than Google Docs
If you're looking for a way to transcribe audio files, we also recommend "Mojiokoshi-san"!
"Mojiokoshi-san" is a free and easy-to-use AI transcription service.
To transcribe MP3 and other audio files in Google Docs, you typically need complex settings to record audio playing on your PC.
In contrast, "Mojiokoshi-san" requires zero troublesome setup!
You can easily transcribe audio files by simply uploading them!
You can transcribe for free without registration or login, so why not try it out here?
4 Easy Ways to Transcribe with Google Docs for Free

Here are four methods for easily transcribing with Google Docs.
Real-time Voice Typing
The easiest way to transcribe in Google Docs is by using "real-time voice typing."
With real-time voice typing, your speech is automatically converted to text as you speak into your microphone.
Since the operation and interface (appearance) can vary slightly between operating systems even within Google Docs, we'll explain the method for each OS.
Steps for Desktop (Windows/Mac)
First, here's how to use real-time voice typing when accessing Google Docs through a browser on a Windows or Mac computer.
1. Tools → Voice typing (Ctrl+Shift+S / ⌘+Shift+S)
2. Click the microphone icon to start transcribing (the microphone turns red)


3. Click the microphone icon again to stop

This allows you to use voice typing on your computer through browsers like Google Chrome.
iPhone Voice Typing Steps
Next, here's how to use voice typing on an iPhone.
On iPhone, you'll use the built-in iPhone voice input feature, not Google Docs' voice typing.
1. Tap the microphone button at the bottom of the software keyboard

2. Waveform appears (transcription starts)

3. Tap the waveform to stop

You can transcribe in Google Docs on an iPhone using this method as well.
Android Voice Typing Steps
Finally, here's how to transcribe using Google Docs on an Android smartphone.
*This guide uses Android 12 as an example.
1. Tap the microphone button on the software keyboard

2. Microphone button changes color and transcription begins

3. Tap the microphone button again to stop

You have now successfully transcribed in Google Docs.
Bonus: How to Input Punctuation and Symbols
In addition to words, commonly used symbols can also be entered by voice.
- Period (.): "period" or "full stop"
- Comma (,): "comma"
- Quotation marks (" "): "open quote" / "close quote"
There are many other keywords you can use for input, so feel free to experiment.
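To make the keyword idea concrete, here is a minimal Python sketch of how spoken punctuation keywords could be swapped for symbols in a transcript. The `SPOKEN_PUNCTUATION` table and the `apply_spoken_punctuation` function are hypothetical names used only for illustration. This is not how Google Docs implements voice typing, and it covers only the keywords listed above.

```python
import re

# Hypothetical mapping, limited to the keywords listed in this article.
SPOKEN_PUNCTUATION = {
    "full stop": ".",
    "period": ".",
    "comma": ",",
    "open quote": '"',
    "close quote": '"',
}


def apply_spoken_punctuation(text: str) -> str:
    """Replace spoken punctuation keywords with their symbols."""
    # Try longer keywords first so multi-word phrases are matched whole.
    keywords = sorted(SPOKEN_PUNCTUATION, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    replaced = pattern.sub(
        lambda m: SPOKEN_PUNCTUATION[m.group(1).lower()], text
    )
    # Remove the space left in front of a period or comma.
    # Spacing around quotation marks is intentionally left untouched.
    return re.sub(r"\s+([.,])", r"\1", replaced)


print(apply_spoken_punctuation("hello comma world period"))
```

In practice, a word such as "period" can also be meant literally, so a real dictation system needs context that this sketch does not have.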
Type with your voice - Docs Editors Help
Type text with your voice on iPhone - Apple Support (Japan)
Type with your voice - Android - Gboard Help
How to Transcribe Audio and Video on PC with Google Docs
The method introduced above was for real-time voice input of spoken content.
For Windows and Mac computers, with a little extra effort, you can actually transcribe recorded audio data using Google Docs.
Steps to Transcribe with Google Docs
Transcribing using Google Docs' features follows these steps:
- Set up the stereo mix function in advance.
- Play the audio/video data.
- Click the microphone button in Google Docs (start transcription).
"Stereo Mix" is a feature that allows the computer to recognize not only microphone input but also audio played back within the device.
Since the necessary setup methods differ significantly between Windows and Mac, we will explain them separately here.
Windows Stereo Mix Settings

For Windows, you can make Google Docs recognize played audio by setting it up as follows:
1. Settings: System → Sound → Select input and click "Manage sound devices"

2. Input device: Select Stereo Mix

3. Click "Enable"

4. Go back to the previous screen and select Input: Stereo Mix

Note: If Stereo Mix does not appear, it often works to download and install the "Realtek audio driver" from your PC manufacturer's website.
Reference: For advanced settings, "VB-Audio VoiceMeeter Banana" is also recommended

If you want finer control, software called "VB-Audio VoiceMeeter Banana" supports advanced operations such as recognizing, recording, and processing audio playing within your PC, although it may take some getting used to.
VB-Audio VoiceMeeter Banana is donationware (software that solicits donations) that can be used for free.
It can do similar things to the "virtual audio device" software for Mac introduced below, so if you're interested, it's worth looking into.
Mac Stereo Mix Settings

Mac does not come with a standard sound mixer function, so you need to prepare two things yourself:
- Virtual audio device
- Audio mixing app
That said, don't worry: free apps are available for both.
Setting up Stereo Mix on Mac
Here's how to set up the stereo mix function on a Mac:
1. Install Soundflower or Blackhole *


First, install a virtual audio device app.
Both Soundflower and Blackhole are downloaded and installed from GitHub.
* If you have an M1 Mac, Soundflower may not work properly, so it's recommended to use Blackhole, a similar "virtual audio device" app.
2. Install LadioCast (audio mix app)

Next, install LadioCast, an audio mixing app.
This can be installed from the Mac App Store.
3. Configure in System Preferences
* This explanation uses Soundflower as an example.
First, configure the input and output settings.
- Output: Soundflower (64ch)
- Input: Soundflower (64ch)
Next, configure LadioCast settings.
- Input: Soundflower (64ch) → Enable "Main" and "Aux 1"
- Output Main: Soundflower (2ch)
- Output Aux 1: Built-in Output
Now you can transcribe audio in Google Docs on your Mac.
However, as you can see from the steps outlined, it requires complex setup.
If you want smooth and easy transcription, AI transcription services like "Mojiokoshi-san", introduced below, are also recommended.
For Smartphones: Use the Microphone to Recognize Played Audio Files

For smartphones, there isn't a suitable stereo mix app, so the only option is to have the built-in microphone recognize audio files played through the phone's speaker.
- Play the recorded audio
- Transcribe with Google Docs
However, this method will not work well unless the audio data and playback environment are exceptionally good.
In such cases, it's recommended to use a dedicated transcription service.
Recommended Dedicated Transcription Service: "Mojiokoshi-san"
For those who want to transcribe audio files easily and smoothly, we recommend "Mojiokoshi-san"!
"Mojiokoshi-san" is an AI transcription service that can convert audio to text with higher accuracy than Google Docs.
You can open it in your browser from here and start transcribing immediately.
What's more, 'Mojiokoshi-san' is free!
If you want to transcribe audio files, why not try using 'Mojiokoshi-san' for free?
Transcription Methods Using Google Docs for Different Scenarios
Next, we'll explain how to transcribe in different scenarios, such as meeting minutes and text extraction from images and PDFs.
Meeting Minutes (Zoom, Microsoft Teams, etc.)

If all participants are present in the same location, you can transcribe everyone's audio simultaneously using the methods described so far.
However, in the case of web conferences, if only the person responsible for the minutes turns on the voice input function, only their voice will be transcribed.
While using a stereo mixer function allows you to transcribe from both microphone and speaker audio simultaneously, there's an even simpler method, which we'll introduce.
Transcribing Meeting Minutes with Google Docs
You can transcribe meeting minutes in Google Docs following these steps:
- Share the same document using Google Docs' sharing feature.
- Participants launch both the web conferencing tool and Google Docs on their respective devices.
- Start voice input (transcription) simultaneously.
However, please be aware of potential issues as this is not the intended use.
In case transcription doesn't work well or for easier editing later, it's a good idea to keep audio data as a backup by using Zoom's recording function or preparing a separate recorder (like a smartphone).
Extract Text from Images and PDFs
- Upload image or PDF files to Google Drive.
- Right-click the file.
- Open with → Google Docs.
This allows you to transcribe text from images and PDFs using Google Docs.
The automatically transcribed text will appear alongside the original image/PDF data.
Supported file types are JPG, PNG, GIF, and PDF (up to 2MB in size).
Convert PDF and photo files to text - Computer - Google Drive Help
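Before uploading, you may want to check whether a file fits the limits described above. The following is a minimal Python sketch under stated assumptions: `can_extract_text` is a hypothetical helper name, `.jpeg` is treated as equivalent to JPG, and 2 MB is taken to mean 2 × 1024 × 1024 bytes. It does not call Google Drive; it only checks the extension and size.

```python
from pathlib import Path

# Limits as stated in this article: JPG, PNG, GIF, or PDF, up to 2 MB.
# Treating ".jpeg" as JPG is an assumption made for this sketch.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".pdf"}
MAX_SIZE_BYTES = 2 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def can_extract_text(filename: str, size_bytes: int) -> bool:
    """Return True if the file type and size fit the stated limits."""
    extension = Path(filename).suffix.lower()
    return extension in SUPPORTED_EXTENSIONS and size_bytes <= MAX_SIZE_BYTES


print(can_extract_text("scan.PDF", 1_000_000))
print(can_extract_text("photo.png", 3 * 1024 * 1024))
```

A file that fails this check would need to be converted or compressed before being opened with Google Docs.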
Troubleshooting Google Docs Transcription Issues

If Google Docs isn't transcribing when you speak into the microphone, there are two possible reasons:
- Microphone permission is not granted for the browser on your PC.
- Voice input is not enabled on your smartphone.
Each can be resolved with the following methods:
How to Grant Microphone Permission for Your Browser on PC
On Windows and Mac browsers (Google Chrome), you can grant microphone permission as follows:
1. In Google Chrome settings, select "Privacy and security."

2. From "Site Settings," select "Permissions" (Microphone).

3. Turn ON "Sites can ask to use your microphone."

How to Enable Voice Input on Your Smartphone
You can enable voice input on both iPhone and Android devices using the following methods:
iPhone
On an iPhone, you can enable dictation as follows:
1. Open "General" from the Settings app

2. Turn ON "Enable Dictation" from "Keyboard"

Android
*This explanation uses Android 12 as an example.
1. Open the Settings app

2. Open "Languages & input"

3. Select "Gboard" from "On-screen keyboard"

4. Select "Voice typing" on the Gboard settings screen

5. Turn ON voice typing

If Transcriptions Don't Work When Playing Audio Data
If your audio data isn't being transcribed when played, these could be the reasons:
- Stereo Mix is not enabled.
- The browser is not active.
The second point, "The browser is not active," is a common issue.
Google Docs automatically stops transcribing if another window is active, so be aware of this.
When transcribing, it's recommended to follow this order to minimize errors:
Play audio data → Click the start transcription button
If Transcription Accuracy is Low
If Google Docs recognizes the audio but the transcription accuracy is low, these could be the reasons:
- Volume is too low
- There is background noise
- Speaking speed is too fast
For real-time voice input, it's recommended to speak slowly and clearly, keeping your mouth close to the microphone, and ideally use a high-performance external microphone.
When transcribing from recorded data, using editing software like Audacity to remove noise, adjust volume, and control speed can improve transcription accuracy.
If you still struggle with transcription or it stops midway even after trying these measures, using the recommended services mentioned in this article can help you transcribe smoothly.
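As an illustration of what "adjusting volume" means for recorded data, here is a minimal Python sketch of peak normalization on 16-bit PCM samples. It is not Audacity's implementation. The function name `normalize_peak` and the default target level are assumptions chosen for this example, and the samples are assumed to be already decoded into a list of integers.

```python
def normalize_peak(samples: list[int], target_peak: int = 29490) -> list[int]:
    """Scale 16-bit PCM samples so the loudest one reaches target_peak.

    The default target (about 90% of full scale) is an assumption
    chosen for this sketch, not a value from Google Docs or Audacity.
    """
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        # Empty or silent audio: nothing to scale.
        return list(samples)
    gain = target_peak / peak
    # Clamp to the valid 16-bit range in case of rounding at the edges.
    return [max(-32768, min(32767, round(s * gain))) for s in samples]


quiet = [100, -200, 50]
print(normalize_peak(quiet, target_peak=20000))
```

Raising a quiet recording this way also raises its background noise, so noise removal is a separate step.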
Disadvantages of Transcribing with Google Docs
As explained so far, Google Docs can be used for audio transcription with some adjustments.
However, since it's not originally designed for transcription, there are also disadvantages.
Let's discuss the drawbacks of using Google Docs for transcription.
1. Complex Setup

As explained above, transcribing audio files like MP3s with Google Docs requires very complex settings.
If you're not tech-savvy, setting it up alone could take hours...
For a smoother transcription experience, a specialized service like "Mojiokoshi-san" is more convenient.
2. Google Docs Must Remain Open

When transcribing with Google Docs, you must keep Google Docs open throughout the entire process.
If any sound other than the audio file you're transcribing plays on your PC, that sound will also be transcribed, preventing you from using other software.
3. Stops After a Certain Period

Google Docs' voice recognition feature automatically stops after about 5-10 minutes.
To transcribe long audio files, you need to re-enable the feature every time it stops, which is very inconvenient.
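To get a feel for how inconvenient this is, the arithmetic can be sketched in Python. The function name `restarts_needed` is hypothetical, and the session length is only this article's approximate figure of 5 to 10 minutes, so treat the result as a rough estimate.

```python
import math


def restarts_needed(audio_minutes: float, session_minutes: float = 5.0) -> int:
    """Estimate how many times voice typing must be restarted.

    session_minutes is the approximate time before voice typing stops;
    the default of 5 is the lower bound quoted in this article.
    """
    if session_minutes <= 0:
        raise ValueError("session_minutes must be positive")
    if audio_minutes <= 0:
        return 0
    sessions = math.ceil(audio_minutes / session_minutes)
    # The first session is a start, not a restart.
    return sessions - 1


# A 60-minute recording, assuming voice typing stops every 5 or 10 minutes.
print(restarts_needed(60, 5))
print(restarts_needed(60, 10))
```

Under these assumptions, a one-hour recording needs somewhere between 5 and 11 manual restarts.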
4. Cannot Be Used on Smartphones

The smartphone version of Google Docs does not allow for the settings explained in this article, so transcribing audio files is not possible.
In contrast, specialized services like "Mojiokoshi-san" can be used on any device: PC, smartphone, or tablet!
4 Recommended Transcription Services Other Than Google Docs
Finally, we'll introduce services that can smoothly transcribe even in situations where Google Docs struggles!
AI transcription services are highly recommended.
They can easily and smoothly transcribe audio and video files that Google Docs struggles with.
1. Mojiokoshi-san
If you're looking for a transcription service other than Google Docs, this is highly recommended.
"Mojiokoshi-san" is a transcription service that uses the latest AI, allowing you to easily transcribe audio, video, image, and PDF files by simply uploading them.
Transcribing already recorded audio with Google Docs requires complex procedures as introduced in this article, but "Mojiokoshi-san" eliminates all that hassle.
With the high-performance, latest AI speech recognition engine "PerfectVoice," you can transcribe even long audio files in just 10 minutes.
What's more, "Mojiokoshi-san" is free!
You can transcribe files up to 3 minutes without registration or login, so why not experience "Mojiokoshi-san" for yourself?
2. Speechy Lite

Speechy Lite is an ideal app for easy transcription on your iPhone.
Transcribed text and recorded data can be easily shared with other apps.
3. Speechnotes

Since Speechy Lite, introduced above, is only available for iPhone, we recommend this app for Android users.
Punctuation marks can be handled not only by voice commands but also by keyboard shortcuts.
4. Otter

This is recommended if you want highly accurate English transcription.
It's a service specialized in English transcription, and it accurately distinguishes speakers even in multi-person conversations, making it ideal as a meeting minutes tool.
Summary of Google Docs Transcription Methods

This article explained the transcription features in Google Docs.
While Google Docs is often treated as an alternative to Microsoft Word, it actually has many features that are more convenient than Word.
Why not use this article as a reference to master its diverse functionalities?
Also, dedicated transcription services are easy to use, so it's recommended to use them in conjunction with Google Docs!
■ AI transcription service "Mojiokoshi-san"
"Mojiokoshi-san" is an online transcription tool that can be used with no initial cost and for 1,000 yen per month (* a free version is available).
- Supports more than 20 file formats, including audio, video, and images
- Can be used from both PCs and smartphones
- Supports technical terminology for fields such as medical care, IT, and long-term care
- Supports subtitle file creation and speaker separation
- Supports transcription in about 100 languages, including English, Chinese, Japanese, Korean, German, French, and Italian
To use it, simply upload an audio file from the website. The transcribed text is available within a few seconds to tens of minutes.
You can use it for free for transcriptions of up to 10 minutes, so give it a try.
Email: mojiokoshi3.com@gmail.com