Transcribe Meetings & Interviews with Speaker Diarization

June 8, 2025

When transcribing meetings or interviews, it's tough to separate and write down what each person said...

In such cases, we recommend using an AI transcription service with a "speaker separation" feature.

Speaker separation (also called speaker diarization) is the process of transcribing audio in which multiple people speak and splitting the resulting text by individual speaker.

The latest speech recognition AI can distinguish the unique characteristics of each speaker's voice, so the transcription results can be output in a way that identifies who is speaking.
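
To make the idea concrete, here is a minimal Python sketch of what speaker-separated output can look like as data. The `Segment` type, the "Speaker 1" style labels, and the sample sentences are all illustrative assumptions; they are not the output format of "Mojiokoshi-san" or any other service.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One recognized utterance with a speaker label (hypothetical structure)."""
    speaker: str
    start: float  # seconds from the beginning of the recording
    text: str


def group_by_speaker_turn(segments):
    """Merge consecutive segments from the same speaker into a single turn."""
    turns = []
    for seg in sorted(segments, key=lambda s: s.start):
        if turns and turns[-1][0] == seg.speaker:
            # Same speaker keeps talking: append to the current turn.
            turns[-1] = (seg.speaker, turns[-1][1] + " " + seg.text)
        else:
            # Speaker changed: start a new turn.
            turns.append((seg.speaker, seg.text))
    return turns


def format_transcript(turns):
    """Render turns as 'Speaker: text' lines, one line per turn."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)


# Made-up sample data for illustration only.
segments = [
    Segment("Speaker 1", 0.0, "Let's begin the meeting."),
    Segment("Speaker 1", 2.5, "First, the schedule."),
    Segment("Speaker 2", 5.0, "The release is planned for July."),
    Segment("Speaker 1", 8.0, "Understood."),
]

print(format_transcript(group_by_speaker_turn(segments)))
```

Grouping by turn rather than by speaker keeps the conversation in its original order, which is usually what meeting minutes need.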

However, speaker separation isn't available in all AI transcription services.

Speaker separation is a particularly advanced feature within AI transcription.

Therefore, when you want to use speaker separation for transcribing meetings or interviews, it's important to check if the feature is included.

This article will explain how to transcribe using the speaker separation feature and introduce recommended AI transcription services that offer it!

By using the speaker separation feature, you can reduce the effort of creating meeting minutes by more than half!

Why not speed up your transcription work by referring to this article?

AI Transcription Services with Speaker Separation

Notice: Speaker Separation (Speaker Recognition) Feature Temporarily Paused

The speaker separation (speaker recognition) feature of "Mojiokoshi-san" is currently temporarily paused.

We plan to reactivate the feature by mid-2025.

We apologize for any inconvenience this may cause until then, and we appreciate your continued support for "Mojiokoshi-san."

Mojiokoshi-san

The recommended AI transcription service for transcribing using the speaker separation feature is "Mojiokoshi-san."

"Mojiokoshi-san" is an AI transcription service that uses the latest speech recognition AI to provide fast and highly accurate transcriptions.

You can use two types of AI transcription engines: AmiVoice and PerfectVoice. By choosing "AmiVoice," you can utilize the speaker separation feature.

When you use the speaker separation feature, the transcription is separated by each speaker, as shown below.

Speaker separation example

Of course, you can also download the transcription results!

As you can see, the downloaded file is also separated by speaker.

Speaker separation example

When transcribing meeting minutes, security is a common concern, but since it's a Japanese-developed transcription service, you can feel secure in that regard.

If you're looking to transcribe using the speaker separation feature, "Mojiokoshi-san" is highly recommended!

You can transcribe for up to 1 minute for free without registration or login, so why not try "Mojiokoshi-san" first?

Try Mojiokoshi-san now

How to Transcribe Using Speaker Separation

So, how exactly do you transcribe using the speaker separation feature?

Let's walk through the method and flow of transcription (tape transcription) using the AI transcription service "Mojiokoshi-san."

1. Record the content of your meeting or interview

First, record the content during your meeting or interview.

You can record with your smartphone's voice memo app or with a dedicated IC recorder.

As a point of caution, make sure to place your smartphone or IC recorder in a location where it can clearly record the voices of everyone speaking.

AI transcription engines are high-performance and can transcribe even with some noise, but it's always recommended to record with the best possible sound quality.

If available, using specialized equipment like a condenser microphone or lavalier microphone is also recommended.

2. Open the 'Mojiokoshi-san' top page

'Mojiokoshi-san' is a service used through web browsers like Google Chrome or Safari.

Files are uploaded from the top page.

Open the 'Mojiokoshi-san' top page from this link.

*'Mojiokoshi-san' can be used from any environment (PC, smartphone, tablet) as long as you have an internet connection!

3. Select "AmiVoice"

'Mojiokoshi-san' allows you to use two types of AI transcription engines: "AmiVoice" and "PerfectVoice". However, when using the speaker diarization feature, you must use "AmiVoice".

Select "AmiVoice" from the checkbox for choosing the AI transcription engine.

Select AmiVoice

4. Select language and number of speakers

When you select AmiVoice, a dropdown menu for choosing the number of speakers will appear.

Number of speakers selection displayed

Select the number of people who spoke during the meeting or interview.

Select number of people

There is also a language selection menu, but it is set to "Japanese" by default, so if your audio is in Japanese, you can leave it as is.

5. Select and upload file

Upload your file.

You can select the file by clicking or tapping "Select".

Upload file

When using from a PC, drag and drop is also possible.

After selecting the file, click the "Transcribe" button to start the upload.

Start upload

*Keep the browser screen open during upload.

6. Start transcription

Once the upload is complete, transcription will automatically begin.

Start transcription

Once "Processing. Please wait." is displayed, you can close the browser screen.

*If you are using it for free without registration/login, you must keep the screen open.

*If you close the screen, a transcription completion notification email will be sent to your registered email address.

7. Transcription complete

Check the transcription results.

If you closed the screen

If you closed the screen, open the link provided in the email.

By clicking "History" in the menu on the 'Mojiokoshi-san' website, you can view the transcription results.

If you kept the top page open

If you kept the top page open, the screen will switch and display the transcription results like this.

However, even in this case, to check the speaker-separated transcription results, you need to open the detailed transcription results from the "History" page.

Click the "Check History" button to navigate to the history page.

History Page

When you open the history page, you'll see a list of transcription results like this.

Click on the file name in the right column to open the detailed screen.

When you check the transcription results on the detailed screen, you'll see that they are speaker-separated for each person who spoke, like this.

8. Downloading the File

To download the transcribed content, click the "Download" button.

From the menu that opens after clicking the button, click "Speaker Separation".

This will allow you to download the speaker-separated transcription results in text file format.

When you open the downloaded file, you'll see the transcription separated by speaker, like this.

This completes the transcription using the speaker separation feature of the AI transcription service 'Mojiokoshi-san'.

With 'Mojiokoshi-san', anyone can easily and quickly transcribe meeting minutes or interviews.

Why not try the speaker separation feature of 'Mojiokoshi-san' yourself?

4 Transcription Services with Speaker Separation

Speaker separation is a particularly advanced feature among AI-powered transcription services.

Therefore, relatively few AI transcription services offer speaker separation.

Let's briefly introduce some of the services available.

1. Mojiokoshi-san

The AI transcription service introduced in this article, 'Mojiokoshi-san,' is the most recommended AI transcription service if you want to use speaker diarization.

'Mojiokoshi-san' utilizes two types of cutting-edge AI:

AmiVoice: Speaker diarization available, high-speed transcription in about the same time as the audio file length.
PerfectVoice: Supports 100 languages including Japanese and English, ultra-high-speed transcription in about 10 minutes for long files.

Since it's a service that transcribes by uploading recorded audio files, the transcription accuracy is outstanding!

*Some AI transcription services offer real-time transcription, but real-time transcription inevitably suffers from lower accuracy because of processing constraints. In contrast, file-upload services like 'Mojiokoshi-san' make very few speech recognition errors.

Even when using the speaker diarization feature, processing is very speedy, taking about the same amount of time as the audio file itself!

Furthermore, within the same plan, you can transcribe many languages, including Japanese and English, or use the ultra-high-speed transcription feature (about 10 minutes, even for long files) if speaker diarization is not needed.

You can transcribe up to 1 minute without registration or login, so why not experience the AI transcription accuracy of 'Mojiokoshi-san' first?

2. User Local Voice Meeting Minutes System

The User Local Voice Meeting Minutes System is a real-time AI transcription service accessible via web browser.

With just one microphone, it can listen to audio in real-time and transcribe it with speaker diarization, identifying each speaker.

As its name suggests, it's a very simple service specialized for meeting minutes, and its clear usability, stemming from its single function, might be its appeal.

User Local Voice Meeting Minutes System

3. Sloos

Sloos is also an AI transcription service specialized for meeting minutes.

This is also a very simple AI transcription service used via a web browser. Similar to the "User Local Voice Meeting Minutes System," it can transcribe audio recorded with a single microphone, identifying each speaker.

Its ability to transcribe accurately even with some background noise is another appealing feature, characteristic of AI transcription services.

Sloos

4. Group Transcribe

Group Transcribe is an iPhone app provided by Microsoft for web conferences.

When all participants in a meeting install this app and conduct a web conference, the audio is transcribed for each speaker.

This app actually achieves speaker diarization using a different method than the other services introduced.

However, the principle is straightforward.

By leveraging the fact that everyone needs to install the app on their respective smartphones, it transcribes the content on each individual smartphone, thereby separating the transcription results by speaker.

It's an AI transcription service that uses an ingenious "Columbus's Egg" idea, unique to smartphone apps.
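
The principle can be sketched in a few lines of Python. This is only an illustration of the idea described above, not how Group Transcribe is actually implemented: the function name, the data layout, and the assumption that every phone reports timestamps on a shared clock are all hypothetical.

```python
import heapq


def merge_device_transcripts(device_transcripts):
    """Combine per-device transcripts into one speaker-labeled timeline.

    device_transcripts maps each phone's owner to a list of
    (start_seconds, text) pairs. Because each phone transcribes only its
    owner, the owner's name doubles as the speaker label.
    Assumes all devices share a synchronized clock (hypothetical).
    """
    streams = [
        sorted((start, owner, text) for start, text in utterances)
        for owner, utterances in device_transcripts.items()
    ]
    # heapq.merge lazily merges already-sorted streams in time order.
    return list(heapq.merge(*streams))


# Made-up sample data for illustration only.
device_transcripts = {
    "Alice": [(0.0, "Good morning."), (6.2, "Sounds good.")],
    "Bob": [(2.1, "Morning. Shall we start?")],
}

for start, speaker, text in merge_device_transcripts(device_transcripts):
    print(f"[{start:5.1f}s] {speaker}: {text}")
```

No voice analysis is needed in this approach, because the speaker is known from the device that captured the speech.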

Group Transcribe

If you're using a speaker diarization feature, "Mojiokoshi-san" is recommended.

If you had to choose one AI transcription service from those introduced so far, we recommend "Mojiokoshi-san."

"Mojiokoshi-san" is a service that uses uploaded files rather than real-time processing, resulting in exceptional transcription accuracy!

Even advanced transcriptions that use the speaker diarization feature can be completed in a short amount of time.

Why not try the speaker diarization feature of "Mojiokoshi-san" yourself?

Speed up meeting and interview transcriptions with speaker diarization

Until a few years ago, summarizing meeting minutes or interview content by speaker was even more cumbersome than the transcription itself.

However, with AI transcription services available now, you can eliminate that hassle entirely!

Why not try using the speaker diarization feature of an AI transcription service like "Mojiokoshi-san" for convenient, highly accurate, and fast transcriptions?

■ AI transcription service "Mr. Transcription"

"Mr. Transcription" is an online transcription tool with no initial cost, available from 1,000 yen per month (*free version available).

Supports more than 20 file formats, including audio, video, and images
Can be used from both PC and smartphone
Supports technical terms in fields such as medical care, IT, and long-term care
Supports creation of subtitle files and speaker separation
Supports transcription in approximately 100 languages including English, Chinese, Japanese, Korean, German, French, Italian, etc.

To use it, just upload the audio file from the site. The transcribed text is ready in anywhere from a few seconds to tens of minutes.
Transcription of up to 10 minutes is free, so please give it a try.

Start transcribing for free now

"Mr. Transcription" makes it easy to transcribe audio, video, and images. You can transcribe up to 10 minutes for free, and you can copy, download, search, and delete the transcribed text. You can also create subtitle files, which makes it ideal for transcribing interview videos.

HP: mojiokoshi3.com
Email: mojiokoshi3.com@gmail.com
