Transcribe is a new ‘speech to text’ option in Microsoft Word. It lets you upload an audio file and convert it into text.
Now you can record a speech, lecture or meeting then later get a written transcript. Virtual meetings can be recorded too, and that recording can now become a written record as well.
At the moment, Transcribe is available in Word Online (aka Word for web). It’s promised to Office Mobile apps before the end of 2020. Presumably it’ll be expanded to Word for Windows/Mac.
That’s not a big limitation. Make your transcription document with Word in a web browser then open the final document in any other Word software.
It’s a logical extension of the existing Dictate feature which converts live speech into text.
What you need
All you need to start is an audio file in .mp3, .m4a, .mp4 or .wav format, up to 200MB. The speech should be clear, with little background noise or music to confuse the ‘speech to text’ system.
Transcribe can also record live, saving the audio file to OneDrive and offering a Word document of the text.
Start from the Home tab. Over on the right side, under the Dictate (microphone) icon, there’s now a Transcribe option.
If this is your first time with Dictate or Transcribe, you’ll be asked if the microphone can be connected to the browser. Here’s the prompt in Chrome. Choose ‘Allow’ to continue.
The Transcribe pane opens
Upload audio – use an existing audio file in .mp3 .wav .mp4 or .m4a format.
Start recording – records audio using the default microphone. That’s saved to your OneDrive account and converted to text.
At the bottom is a note about the number of transcription minutes used that month.
Uploading and converting to text
Click ‘Upload Audio’ and select the file to transcribe.
It’s strange that you can’t directly select an audio file on OneDrive. It has to be saved or synced to the local computer then uploaded again for Transcribe. Audio files are automatically saved to OneDrive in the /Transcribed Files folder.
Then wait while the audio is uploaded and converted to text.
When it’s finished, the recording and transcription text appear.
In this pane, there’s a lot more going on than first appears. See ‘Editing a Transcription’ below.
Clicking Start Recording then the mic icon turns on the default microphone. Start speaking and it’ll be recorded.
There’s a pause/record button available.
When you’re finished choose ‘Save and transcribe now’ to upload and convert the recording into text. See ‘Editing a Transcription’ below.
Difference between Dictate and Transcribe recording
At first, Dictate and Transcribe recording might seem the same. They are similar but there are important differences.
Dictate is live, real-time conversion of what you say into text. Words and phrases appear in the document moments after you speak.
Different speakers aren’t marked in any way.
There’s no pause option (though we wish there was).
Voice commands like ‘New Paragraph’ work in Dictate.
What you say isn’t recorded for later playback.
Transcribe recording records your speech or meeting first. Then it’s uploaded and converted to text once the recording is over.
Different speakers are noted under separate headings.
There’s a Pause button.
Voice commands don’t work.
The transcription can be edited in the Transcribe pane.
The entire recording is saved in your OneDrive /Transcribed Files folder.
Editing a transcription
The transcribed text appears in the Transcribe pane and might seem simple, but there are a lot of useful features hiding here.
We tested Transcribe with a recording of James Earl Jones and his ‘People will Come’ speech from the movie ‘Field of Dreams’. Transcribe did a reasonable conversion job despite the music underlying the voices.
Fixing or editing transcript
Look for the pencil icon and click to edit a section of the transcript.
Now you can change the name from Speaker n to a proper name.
And fix up the transcribed text …
Just like any text in a browser, you can select it, right-click and see what browser options are available like Copy or Search.
Change all Speaker names.
If you choose the ‘Change all Speaker …’ option in Edit a section, the name will be changed all through the transcript.
A speaker’s name can be changed once or throughout the transcription.
At the top of the pane are the usual Play, Pause, Forward, Back and Volume buttons.
Forward and Back jump to the next section of the transcript.
The nice addition is the speed controller on the left. Change the speed of playback from slow (half normal) up to twice normal to skip through the recording.
As you listen, the relevant part of the transcription is highlighted.
Listen and edit
Clicking on any time indicator will jump to that section of the recording.
If necessary, click the time stamp again to listen repeatedly and catch what was said.
Same speaker, over and over again
Transcribe makes separate sections for one speech.
It’s unclear if this is deliberate or a bug.
Perhaps the developers are ‘erring on the side of caution’ in case it’s not the same person talking? Separating the transcription lets the customer change the speaker name.
But it means that a single speech is broken up into (too many) separate sections.
There’s no way to join sections together to make a large spoken block.
Copying to Word document
There are various ways to copy some or all of a transcript to the Word document.
Add all to document
Most obvious is the ‘Add all to document’ button at the bottom of the Transcribe pane.
The transcription is copied to the document looking very plain.
The ‘Audio file’ and ‘Transcript’ lines use Heading 1 style. The rest is all in Normal style.
There’s nothing to separate speakers’ names from the spoken words. It’s a shortsighted decision that makes it unnecessarily difficult to reformat the transcript.
Surely it would be better to use Heading 2 for speakers’ names? Then users could easily reformat the naming either by changing the look of Heading 2 or Replacing that style with another (e.g. ‘Speaker Name’).
At the least, transcribed words should have a separate style (e.g. ‘Transcript’ or ‘Spoken’), even if the initial style settings are the same as Normal. Ideally, each speaker’s words should be in a separate style (e.g. ‘Speaker 1 text’ etc.).
As Microsoft has done it, customers must manually go through the document reformatting it. Grrrr.
What’s the point in having powerful formatting and features in Word, if Microsoft won’t make use of them?
Add section to document
Or click the + icon to copy that section of transcript to the document.
Reopening a document with a transcription
Closing a document with a transcription saves both the document and the transcription.
If you reopen the document, go back to Home | Dictate | Transcribe. The Transcribe pane will open with the transcription there.
The audio file needs to remain in the OneDrive /Transcribed Files folder with the same name. It’s not saved in the Word document.
Of course, that only works with a version of Word that supports Transcribe. Opening a Transcribe document in another Word (like Word for Windows/Mac) won’t show the Transcribe pane (yet), but the transcription details are still in the .docx file.
Transcribe requirements and limitations
There are some requirements for Transcribe in Word for web.
- Microsoft 365 customers only – any plan, personal, education or corporate.
- Edge or Chrome browser
- US English only … at least for now.
- Each uploaded audio file must be under 200MB.
- Audio formats: .wav .mp4 .m4a or .mp3.
- Not .aac files, even though .m4a files usually contain AAC audio.
- Five hour limit – a total of 300 minutes of transcribing per month. Look at the bottom of the Transcribe pane to see the minutes used so far.
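If you have a pile of recordings to get through, it can save time to check them against the limits above before uploading. Here’s a minimal Python sketch of such a check. It’s our own illustration, not anything Microsoft supplies: the function names are made up, the limits are simply the ones quoted in this article, and we’ve assumed ‘200MB’ means 200 × 1024 × 1024 bytes, which Microsoft doesn’t spell out.

```python
import os

# Limits as quoted in this article; Microsoft may change them at any time.
ALLOWED_EXTENSIONS = {".wav", ".mp4", ".m4a", ".mp3"}
MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # assumption: "200MB" = 200 MiB
MONTHLY_LIMIT_MINUTES = 300           # the "five hour" monthly allowance


def check_audio_file(path, size_bytes=None):
    """Return a list of problems; an empty list means the file looks OK.

    size_bytes is optional. If omitted, the size is read from disk.
    """
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes is None:
        size_bytes = os.path.getsize(path)
    if size_bytes >= MAX_UPLOAD_BYTES:
        problems.append("file is not under 200MB")
    return problems


def minutes_left(minutes_used):
    """Minutes remaining this month, given the figure shown in the pane."""
    return max(0, MONTHLY_LIMIT_MINUTES - minutes_used)
```

For example, check_audio_file("lecture.aac", size_bytes=1000) reports the unsupported format, while a small .mp3 returns an empty list. The minutes_used figure has to be copied by hand from the bottom of the Transcribe pane.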
One Transcription per document
Only one audio file or recording can be saved in a Word document.
If you choose ‘New Transcription’ at the bottom of a current transcript, you’ll get this warning.
You can only store one transcript per document.
When you create a new transcript, the current transcript will be removed/deleted.
The workaround is to start a new Word document and do a separate Transcribe there.
This limitation appears to be linked to the way a transcription is saved within a Word document.
Transcripts can be copied between Word documents (just like any other text) to make a combined transcript document. Or even link/embed the transcription documents into another Word doc.
Unlimited or not?
Over twenty years of reporting on Microsoft, we’ve become used to seeing hyped over-promising with the limitations either not mentioned or in very fine print.
For Transcribe, Microsoft says apparently conflicting things, two sentences in a row. It’s a classic example of Microsoft very carefully phrasing their promotion.
Check out this snippet from the blog post.
First Microsoft says in bold type (our underlining):
With Transcribe you are completely unlimited in how much you can record and transcribe within Word for the Web.
Then in the next sentence, not bold …
Currently, there is a five hour limit per month for uploaded recordings and each uploaded recording is limited to 200mb.
What’s going on? Is Transcribe unlimited or not?
Microsoft has worded those sentences with legal precision.
Completely unlimited – refers to recording live via Transcribe’s Start Recording button.
Five hour limit – applies to uploaded audio files.
Does the word ‘currently’ mean the five-hour limit will be raised in future? Perhaps, or it could be Microsoft giving customers some false hope.
According to Microsoft:
Your audio files will be sent to Microsoft and used only to provide you with this service. When the transcription is done your audio and transcription results are not stored by our service.
As usual, those assurances don’t tell the whole story.
Audio files are automatically saved to OneDrive in the /Transcribed Files folder. The transcribed text is normally saved to OneDrive, though that’s optional. Anything saved to OneDrive is subject to intrusion by Microsoft.
Microsoft can be compelled to hand over any customer data to government agencies in accordance with local law. That can happen without a warrant or notice to the affected customer.