How Much Does Audio Transcription Cost? (Tips for Cheaper Outsourcing)
June 8, 2025


I want to outsource transcription as cheaply as possible!
That's a natural thing to want. You feel that way, and of course so do I.
But if you find a service provider or individual whose price is too low, you may worry whether they can really do the job well.
On the other hand, if you prioritize quality too much and the price becomes too high, it won't fit your budget.
It's tough because this balance is difficult for an amateur to judge.
And then a provider tells you, "This is about the market rate."

How much money have I wasted, swayed by lines like that from cunning service providers...
So in this article, I will explain the market rates for outsourcing audio transcription, which you can use as a guideline.
I will also cover tips for successful transcription outsourcing and techniques for ordering high-quality transcription as cheaply as possible. By reading this article, you should significantly reduce the risk of paying more than the market rate, as I did.
Please read to the end.
Summary of Audio Transcription Outsourcing Costs: By 3 Types
There are roughly three main ways to outsource transcription:
- AI-powered automatic transcription (0.4 yen to 40 yen / minute)
- Hiring individuals via crowdsourcing platforms (6 yen to 200 yen / minute)
- Human transcription by specialized agencies (108 yen to 800 yen / minute)
Generally, the cost tends to increase as you go down the list.
※Some service providers calculate fees per character, but in most cases, they are quoted per minute.
Let's delve into each method in detail.
1. AI-powered Automatic Transcription (Market Rate: 0.4 yen to 40 yen / minute)
If you want to get transcription data quickly with minimal effort, "AI transcription" is recommended.
AI-powered automatic transcription services do not involve human intervention, offering benefits such as:
- Lower unit cost due to no labor expenses
- No time wasted on unnecessary communication
- Reasonably high accuracy
Conversely, the disadvantages include:
- Filler words like "uh" and "um" are transcribed as-is
- Manual correction may be required after transcription
These two points are the main drawbacks.
I've written about solutions for these issues in the latter half of the article, so please check them out if you'd like.
2. Hiring Individuals via Crowdsourcing Platforms (Market Rate: 6 yen to 200 yen / minute)
If you want to get human transcription done as cheaply as possible, this method is recommended.
The three most well-known crowdsourcing platforms are:
- CrowdWorks
- Lancers
- Coconala
If you find a good freelancer (crowdworker), they might do a meticulous and speedy job that far exceeds what you'd expect for the price. So, if it works out, it can be incredibly cost-effective.
However, if you're new to outsourcing, you might struggle to screen out low-quality freelancers, or run into communication problems.
This can mean extra costs for paid trial tasks, or wasted money if you have to hire someone else to redo the work. So instead of simply handing off the task, be prepared to stay actively involved and work with the freelancer.
3. Human Transcription by Professional Agencies (Market Rate: 108 yen to 800 yen / minute)
For those who are willing to pay a bit more for high-quality transcription, this traditional method is recommended.
Professional transcription by agency writers ensures reliable results, as errors or missed deadlines can impact their company's reputation. This is a significant advantage.
Communication, such as initial consultations, also proceeds very smoothly, as you'd expect from professionals, minimizing the effort required for communication.
Generally, professional transcription agencies often distinguish between delivery formats, such as:
- Verbatim transcription: Transcribing every word exactly as spoken (Average Cost: ¥240 - ¥700/minute)
- "Kebatori" (cleaned-up transcription): Removing unnecessary words (Average Cost: ¥108 - ¥700/minute)
- Polished transcription: Converting spoken language into written language (Average Cost: ¥240 - ¥1000/minute)
This allows you to choose the optimal format for your specific purpose.
On the other hand, there are also disadvantages, such as higher costs and many agencies only accepting projects of a certain volume.
The cost varies depending on the quality of the original audio data and the delivery deadline, and each agency has a different calculation method. If you want to keep costs down, it's advisable to get multiple quotes.
[Summary] Transcription Outsourcing Cost Comparison Chart (by Type)
Type | AI | Freelancer | Professional Agency |
Average Cost (per minute) | ¥0.4 - ¥40 | ¥6 - ¥200 | ¥108 - ¥800 |
Pros | ・Low unit cost ・Extremely fast turnaround ・High accuracy | ・Affordable human transcription ・Accepts small projects | ・Meticulous work ・On-time delivery ・Wide range of services |
Cons | ・Verbatim only | ・Inconsistent quality | ・Expensive (minimum fee applies) |
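To get a feel for what these per-minute rates mean for an actual recording, here is a minimal Python sketch that multiplies the ranges in the chart by the length of the audio. The function name and structure are my own illustration; real quotes depend on audio quality, deadline, and each provider's pricing rules.

```python
# Per-minute rate ranges in yen, copied from the comparison chart above.
RATES_YEN_PER_MIN = {
    "AI": (0.4, 40),
    "Freelancer": (6, 200),
    "Professional agency": (108, 800),
}


def estimate_cost(minutes: float) -> dict[str, tuple[float, float]]:
    """Return the (lowest, highest) estimated cost in yen for each outsourcing type."""
    return {
        kind: (low * minutes, high * minutes)
        for kind, (low, high) in RATES_YEN_PER_MIN.items()
    }


if __name__ == "__main__":
    # Example: a 60-minute recording.
    for kind, (low, high) in estimate_cost(60).items():
        print(f"{kind}: {low:,.0f} to {high:,.0f} yen")
```

For a 60-minute recording, this works out to roughly 24 to 2,400 yen for AI, 360 to 12,000 yen for a freelancer, and 6,480 to 48,000 yen for a professional agency.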
AI Transcription is Recommended for First-Timers
So, which of the three outsourcing methods should you choose?
For those looking to outsource transcription services for the first time, "AI Transcription" is the most recommended method.
The main advantage is that
Also, the price is low and the time required is short.
In recent years, with the dramatic development of speech recognition technology through deep learning, AI transcription has become capable of transcribing with very high accuracy.
It can handle not only technical terms and proper nouns, but also foreign languages that are difficult for Japanese crowdworkers and professional agencies to handle.
Furthermore, "Mojiokoshi-san" and other services offer free plans, so you can experience how well AI transcription performs for free before committing to full use.
If you're unsure about outsourcing transcription, why not try AI transcription services like "Mojiokoshi-san" first?
4 Key Points to Consider When Outsourcing Transcription

I have a general idea of the market rates. I'll go ahead and order a transcription service!
Hold on a moment, however eager you are to proceed.
Even if you choose a reputable service at a reasonable price, things sometimes don't go smoothly.
Often the cause is not the service provider but you.
Here are four major points you should keep in mind to avoid failure when outsourcing transcription:
- Choose a service that matches your purpose
- Prepare a glossary of terms
- Specify whether to include "kebatori" (removal of filler words) and "seibun" (text refinement)
- Don't haggle too much
Let's look at each in detail.
Choose a service that matches your purpose
Even though they are all called transcription services, each has its own strengths and weaknesses. Mismatches like the following are common:
- Outsourcing transcription of short audio data where quality is not a major concern to a highly specialized professional agency.
- Handing over confidential data directly to an individual.
- Attempting to transcribe audio data with poor recording quality as is.
When there's a mismatch between your purpose and the service, like in these examples, it can not only waste time and money but also lead to unnecessary trouble.
*Even among professional agencies, there are various types, such as those specializing in quick delivery or those strong in legal or specialized fields.
When outsourcing transcription, make sure to prepare thoroughly and choose the optimal service.
Prepare a glossary of terms
A common pitfall when outsourcing transcription is that the transcriber cannot properly hear (or understand) technical terms, proper nouns, or niche expressions used only within your company.
This is natural, as the other party is completely unfamiliar with your field.
Therefore, we recommend compiling a "glossary of terms" for words and expressions that are objectively difficult to understand and providing it along with the requested data.
It may feel like a hassle, but the transcriber's understanding directly affects the overall quality of the work.
If you plan to outsource transcription regularly, you will need a glossary sooner or later, so don't skip this step.
Specify whether to include "kebatori" (removal of filler words) and "seibun" (text refinement)

I thought I asked for kebatori (filler words removed), but they delivered a raw verbatim transcript instead...
This is a common scenario when dealing with individual freelancers.
Conversely,

I requested a raw verbatim transcript, but they delivered a cleaned-up and polished version...
Such cases can also occur.
While some might consider the latter lucky, the extra editing may be unwelcome depending on how you intend to use the data.
To ensure the other party's efforts aren't wasted, always confirm the desired format of the delivered data thoroughly when making a request.
Don't Haggle Too Much

I want to spend as little as possible.
While that sentiment is completely understandable, it's counterproductive if an excessive focus on low cost leads to a "you get what you pay for" situation.
Everything has a fair price.
The rates set by each service have clear, specific reasons behind them. If you ignore these and force the price down, corners will be cut somewhere to make up for it.
Such "invisible shortcuts" are often hard for an amateur to spot.

As long as you don't deviate significantly from the market rates discussed here, it's generally better not to engage in aggressive price negotiations to achieve a satisfactory outcome.
Essential: Prepare Thoroughly in Advance

I want to outsource to minimize effort.
However, complete delegation (sending raw audio data without proper explanation) is strictly forbidden.
In addition to the glossary and delivery format mentioned above, tell the provider the following:
- Original audio content and specifications
- Purpose of the transcription data
- Desired quality
- Inclusion of timestamps
- Deadline (preferably with milestones)
In short, provide as much information as possible, summarized and clearly organized.
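One way to keep this information organized is to fill in the same checklist every time you order. The sketch below is only an illustration of that idea in Python: the class name TranscriptionBrief and its fields are hypothetical names I made up to mirror the items above, not a format that any service requires.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptionBrief:
    """Hypothetical order brief; the field names are illustrative, not a standard."""

    audio_description: str   # content and specs of the original audio
    purpose: str             # what the transcript will be used for
    delivery_format: str     # e.g. "verbatim", "kebatori", or "polished"
    desired_quality: str
    timestamps: bool
    deadline: str
    glossary: dict[str, str] = field(default_factory=dict)  # term -> short note

    def to_text(self) -> str:
        """Render the brief as plain text to paste into an order form or message."""
        lines = [
            f"Audio: {self.audio_description}",
            f"Purpose: {self.purpose}",
            f"Delivery format: {self.delivery_format}",
            f"Desired quality: {self.desired_quality}",
            f"Timestamps: {'yes' if self.timestamps else 'no'}",
            f"Deadline: {self.deadline}",
        ]
        if self.glossary:
            lines.append("Glossary:")
            lines.extend(f"  - {term}: {note}" for term, note in self.glossary.items())
        return "\n".join(lines)


if __name__ == "__main__":
    brief = TranscriptionBrief(
        audio_description="60-minute interview, 2 speakers",
        purpose="Web article",
        delivery_format="kebatori",
        desired_quality="Minor typos acceptable",
        timestamps=False,
        deadline="Within 5 days",
        glossary={"Mojiokoshi-san": "name of an AI transcription service"},
    )
    print(brief.to_text())
```

Because every field except the glossary is required, forgetting to decide on something like timestamps or the delivery format shows up as an error before you send the request.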
While this kind of preparation might seem like a hassle, it's actually a necessary expense when you consider the potential for unnecessary burdens or problems later on.
As mentioned earlier, clarifying your objective often leads to insights like, "Hmm, maybe I don't need to hire a professional for this after all..."
The One Trick to Getting Audio Transcription Done Cheaply
Get high-quality transcription done as cheaply as possible, while still adhering to fair market rates.
Believe it or not, there's a way to achieve this dream.
The technique is:
Combining AI and human effort
As mentioned earlier, transcription broadly consists of three stages:
- Verbatim transcription (transcribing every word exactly as spoken)
- "Kebatori" (removing filler words and unnecessary sounds)
- Polishing (converting spoken language to written language)
By outsourcing these to different parties, you can achieve high-quality transcription.
The recommended combination is:
- Verbatim transcription: AI
- "Kebatori": Individual freelancer
- Polishing: High-priced individual freelancer or do it yourself
Let's look at each in detail.
[STEP1] Verbatim Transcription: AI
The progress of AI recently has been remarkable, and for the verbatim transcription stage, you can achieve higher quality than by hiring an individual through crowdsourcing.
*Especially if you cannot prepare the "glossary" introduced in the previous section, AI, which often has built-in dictionary data for technical terms, may be superior.
Compared to verbatim transcription by a reputable professional service, there are certainly minor areas where AI falls short. However, unless you need absolutely precise, word-for-word data, there's no need to choose an expensive service.
Some professional services reportedly use tools for the verbatim transcription stage before doing "kebatori" or polishing by hand. In such cases, it's significantly cheaper to use AI directly yourself.
Delegate all tasks that AI (machines) excel at, and let humans focus on tasks only humans can do.
For AI Transcription, 'Mojiokoshi-san' is Recommended
When choosing an AI transcription tool for the first time, 'Mojiokoshi-san' is highly recommended.
Mojiokoshi-san offers high-accuracy and fast transcription using the latest AI.
Furthermore, you can choose between two types of AI, each with distinct features:
- PerfectVoice: Supports over 100 languages, including Japanese and English.
- AmiVoice: Features speaker diarization (transcribes by speaker).
This allows it to handle various scenarios, such as foreign language transcription and meeting transcription.
You can use it for free for up to 3 minutes without registration or login, so it's recommended to try 'Mojiokoshi-san' first.
[STEP2] "Kebatori": Individual Freelancer
For tasks like removing filler words such as "uh" or "um," and correcting typos, it's easy to hire an individual freelancer at a low cost. Even on crowdsourcing sites where quality can vary, you can generally ensure a certain level of quality.
As a tip, if you can remove sections that are unlikely to need "kebatori" during the verbatim transcription stage, it will reduce the workload and save costs.
When making a request, it's a good idea to approach individuals who regularly accept transcription jobs on crowdsourcing sites and ask, "Could you do just the 'kebatori' for a lower price?"
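If you want to lighten the freelancer's workload further, you could run a rough first pass over the AI transcript yourself before handing it over. The following Python sketch is one assumed way to do that, not part of any service mentioned here: it strips a small, illustrative list of English fillers with a regular expression. It does not fix capitalization or judge context, so a human check is still needed afterwards.

```python
import re

# Illustrative English fillers only; a real list depends on the language and speakers.
FILLERS = ("uh", "um", "er", "ah")

# Match a standalone filler word, plus a comma directly before or after it.
_FILLER_RE = re.compile(
    r"(?:,\s*)?\b(?:" + "|".join(FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Remove standalone filler words and tidy up the leftover spacing."""
    cleaned = _FILLER_RE.sub(" ", text)
    cleaned = re.sub(r"\s+([,.?!])", r"\1", cleaned)  # no space before punctuation
    return re.sub(r"\s{2,}", " ", cleaned).strip()    # collapse repeated spaces


if __name__ == "__main__":
    print(strip_fillers("Um, so the, uh, budget is fixed."))
```

The word-boundary match means words that merely contain a filler, such as "umbrella", are left alone.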
[STEP3] Editing and Polishing: High-Priced Freelancer or DIY
While simply converting spoken language to written language is one thing, summarizing and rephrasing content for clarity requires considerable skill.
Therefore, it's best to clearly separate the editing and polishing phase and entrust it to a reliable freelancer, even if it means a slightly higher cost.
*Even then, it will still be cheaper than hiring a professional agency for the entire process from raw transcription to editing.
If you have enough time, another recommended approach is to handle the editing and polishing yourself.
Someone who is familiar with the original audio data and understands the context of the conversation can sometimes summarize it better than a writing expert.
Aim to Become an "Outsourcing Master" in 3 Steps!
By breaking down the work into smaller tasks like this, you can keep the total cost down while selecting the most suitable service for each task, which can significantly improve quality.
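To see how splitting the work affects the total, here is a small Python sketch that compares the three-step combination with ordering everything from one agency. Only the AI rate and the agency rate come from the ranges quoted in this article. The per-minute rates for "kebatori only" and "polishing only" by a freelancer are hypothetical placeholders, so replace them with the quotes you actually receive.

```python
def pipeline_cost(minutes: float, ai_rate: float, kebatori_rate: float,
                  polish_rate: float = 0.0) -> float:
    """Total yen for AI verbatim + freelancer kebatori + polishing.

    Leave polish_rate at 0 if you do the polishing yourself.
    """
    return minutes * (ai_rate + kebatori_rate + polish_rate)


def agency_cost(minutes: float, agency_rate: float) -> float:
    """Total yen when one agency handles everything at a single per-minute rate."""
    return minutes * agency_rate


if __name__ == "__main__":
    minutes = 60
    combined = pipeline_cost(
        minutes,
        ai_rate=40,        # top of the AI range quoted in this article
        kebatori_rate=50,  # hypothetical freelancer quote
        polish_rate=100,   # hypothetical freelancer quote
    )
    agency = agency_cost(minutes, agency_rate=240)  # low end for polished transcription
    print(f"Combined: {combined:,.0f} yen / Agency: {agency:,.0f} yen")
```

With these placeholder rates, a 60-minute recording costs 11,400 yen with the combination and 14,400 yen at the agency's lowest polished rate. The gap widens if you polish the text yourself.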
Someone who is good at outsourcing is someone who knows the optimal tools and personnel for each task and can appropriately delegate responsibilities.
Why not aim to become such an "outsourcing master" yourself?
Summary
By effectively utilizing transcription outsourcing services, you can achieve higher quality work at a much better cost-performance ratio than doing it yourself.

But outsourcing costs money...
Even if you've been hesitant, if you compare the time and effort cost of transcribing yourself with the final quality, you'll surely realize the hidden losses you're incurring.
Leave it to the experts. Use the time freed up by outsourcing to focus on more important tasks that only you can do.
The method of combining AI and human effort introduced at the end of the article is a very useful technique that can be applied beyond transcription, so make sure to master it.
For AI transcription services, we recommend trying "Mojiokoshi-san."
■ AI transcription service "Mr. Transcription" (Mojiokoshi-san)
"Mr. Transcription" is an online transcription tool with no initial cost, available from 1,000 yen per month (a free version is also available).
- Supports more than 20 file formats such as audio, video, and images
- Can be used from both PC and smartphone
- Supports technical terms in fields such as medicine, IT, and nursing care
- Supports creation of subtitle files and speaker separation
- Supports transcription in approximately 100 languages including English, Chinese, Japanese, Korean, German, French, Italian, etc.
To use it, just upload your audio file on the site. The transcribed text is ready within seconds to tens of minutes.
Transcription of up to 10 minutes is free, so please give it a try.
Email: mojiokoshi3.com@gmail.com