What is New and Noteworthy in Video Transcription?

Posted by Rebekah Toth Burns on May 30, 2013

Transcribing video cuts post-production time, makes archived footage easier to search, and lets spoken words appear in Google searches. Transcripts with timecode notes are a great tool for producers making paper edits, and editors love them for quickly locating sound bites. Organizations often have all footage transcribed and added to their archival library for future reference. One of the most important reasons to have online video transcribed is to make sure all that valuable spoken content can be indexed by search engine crawlers and come up in relevant searches.

In the past, transcription required many steps, and each one was easy to get wrong. A separate audio recording, independent of the camera, was required onsite. Usually, timecode was taken from the camera and recorded on a separate channel of the audio recorder. After the shoot, the crew or producer uploaded or sent the audio to the transcriptionist. Many of us have dealt with frustrating circumstances when the audio wasn’t recorded properly, the timecode bled over into the audio channel, or the file format wasn’t compatible with the transcriptionist’s software. Generally, there was a delay of 48 hours to 5 days before the transcription was available, or you had to pay a premium for a quicker turnaround. Companies like Google and Adobe have launched speech recognition software, but the accuracy still isn’t anywhere near 100%.

We are highlighting three online services that can cut down on time and headaches while providing accurate transcriptions your organization can use.

Koemei (pronounced co-may) is a web service platform, built on a speech recognition API, that transcribes large amounts of video in real time. The platform is easy to use, with a navigation panel on the left, the time-coded text in the center, and the video on the right. When you play the video, the text in the center highlights each word as it is spoken. The issue with many speech recognition programs is that they don’t always give you an accurate transcription. A very cool part of Koemei’s application is that when the software isn’t sure it has transcribed a passage correctly, it underlines the text, and the user can highlight and change it within the viewer window. Once the transcription has been reviewed, it can be exported as a PDF, XML, or TXT file, or published directly to third-party online video hosts such as YouTube, Vimeo, Kaltura, and Brightcove. Pricing is a monthly fee with three tiers of service, starting at $99/month.


Verbalink is a more traditional human transcription service. They offer a wide range of services: transcription, translation, editing/proofreading, copywriting, and interpretation. They have transcribed a wide variety of projects, including conferences, interviews, medical recordings, focus groups, marketing research, and more. Verbalink’s video transcription lets the user upload .mov, .wmv, .avi, and .flv files through their website. It is important to note that website uploads are limited to 2GB, so larger videos will have to be compressed. Larger files can also be uploaded via their FTP site or sent in by mail. Their pricing depends on the length of the recording, the number of speakers, and the turnaround time, and starts at $1.50/minute.
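
If you are preparing a batch of footage, a quick pre-flight check can tell you which files fit the website upload and which need compressing or another delivery route. The sketch below is only an illustration: the function name `check_upload` is made up for this post, and it assumes "2GB" means 2 × 1024³ bytes, which may not match how the service actually counts.

```python
import os

# Assumption: "2GB" is read as 2 * 1024**3 bytes; the service may count differently.
WEB_UPLOAD_LIMIT_BYTES = 2 * 1024**3

# File types the post lists as accepted for website upload.
ACCEPTED_EXTENSIONS = {".mov", ".wmv", ".avi", ".flv"}


def check_upload(path, size_bytes=None):
    """Return a short recommendation for one video file.

    size_bytes can be passed in directly; otherwise it is read from disk.
    """
    extension = os.path.splitext(path)[1].lower()
    if extension not in ACCEPTED_EXTENSIONS:
        return "convert to a listed format"
    if size_bytes is None:
        size_bytes = os.path.getsize(path)
    if size_bytes > WEB_UPLOAD_LIMIT_BYTES:
        return "compress, or send via FTP or mail"
    return "ok for website upload"


if __name__ == "__main__":
    # Sizes are supplied by hand here so the example runs without real files.
    print(check_upload("interview.mov", 500 * 1024**2))
    print(check_upload("conference_day1.avi", 3 * 1024**3))
```

Passing the size explicitly keeps the example runnable without footage on disk; leave it out and the function reads the size of the real file.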

SpeakerText combines human transcriptionists with a software API. This is a good solution for your online videos and podcasts. Import your video library from Ooyala, Brightcove, YouTube, Vimeo, Wistia, or Blip.tv. Select the tracks that you want transcribed, and an email will be sent to you once they are finished. You can download the transcriptions as text or XML, or use a plugin called CaptionBox. CaptionBox is a pretty cool feature that allows viewers to follow along with the text as the video plays, navigate to certain places in the video by typing in words that were spoken, and share quotes from the video via email, blog, or social media. The shared link will start the video at the exact point of the selected quote. Unfortunately, this plugin is only available to enterprise clients. The price starts at $2.00/minute, with discounts for bulk purchases.
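
To get a feel for how the three pricing models compare, here is a rough back-of-the-envelope sketch using only the starting prices quoted in this post. The function names are made up for illustration, and it makes one big assumption: that Koemei's lowest tier covers your monthly volume, which the post does not confirm. Real quotes will also vary with speakers, turnaround time, and bulk discounts.

```python
# Starting prices quoted in this post; actual quotes will differ.
KOEMEI_MONTHLY_USD = 99.00       # lowest monthly tier (volume limits unknown)
VERBALINK_PER_MIN_USD = 1.50     # starting per-minute rate
SPEAKERTEXT_PER_MIN_USD = 2.00   # starting per-minute rate, before bulk discounts


def estimate_monthly_costs(minutes_per_month):
    """Return a rough starting-price estimate in USD for each service."""
    if minutes_per_month < 0:
        raise ValueError("minutes_per_month must be non-negative")
    return {
        # Assumption: the flat fee does not change with volume.
        "Koemei": KOEMEI_MONTHLY_USD,
        "Verbalink": VERBALINK_PER_MIN_USD * minutes_per_month,
        "SpeakerText": SPEAKERTEXT_PER_MIN_USD * minutes_per_month,
    }


def break_even_minutes(monthly_fee, per_minute_rate):
    """Minutes per month at which a flat fee equals per-minute billing."""
    return monthly_fee / per_minute_rate


if __name__ == "__main__":
    for service, cost in estimate_monthly_costs(120).items():
        print(f"{service}: ${cost:,.2f}")
    print(break_even_minutes(KOEMEI_MONTHLY_USD, VERBALINK_PER_MIN_USD))
```

At these starting rates, the flat monthly fee matches Verbalink's per-minute billing at 66 minutes of footage a month, so the comparison mostly comes down to volume and how much human review you need.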

Let us know if you have used any of these solutions in your camera work around the globe; we would love to hear your feedback. Also, if you have used another transcription service that works well, please share.
