If you’re like most social media managers, you’ve probably run into issues with adding captions to social media video. Instagram and Facebook both allow videos and photos to be used interchangeably, and it’s not uncommon to have videos on your feed just as often as photos. There’s even video within photo albums that autoplays just as any other video would.

Video is great, but it’s not always appropriate. Fortunately, Facebook and Instagram timelines begin playing videos sans sound. This lets the user choose whether or not they want to go to the effort of pressing that little speaker button in the corner to get the entire video experience. For marketers, it became obvious very early on that video was more engaging than photos, but you had to catch attention early, and without relying on sound. I can’t say for sure, but I’m assuming this is why movie trailers open with a “highlight” moment of about 5 seconds to announce what you’re going to see.

Watch to the 5-second mark. It’s all a ruse to keep you interested.


Basically, it’s a preview of the preview. When your audience is rapidly scrolling through a feed, competition for attention becomes cutthroat. Even major motion pictures have to give you an enticing incentive to stay longer than a couple of moments.

So where am I going with this? One essential element of all social media video is using captions (aka subtitles) to convey the spoken word without volume. If this doesn’t seem familiar, perhaps your parents were the type to shy away from the “uses subtitles” section at the movie rental store. Growing up, I recall my friends not liking a movie, or choosing to forgo it altogether, if it was a foreign film and required subtitles. Now, reading has become a staple of motion pictures on our phones. One could say that subtitles are seeing a resurgence with the dawn of the mobile generation. But who makes the subtitles? Is there really someone transcribing every video that appears on Facebook?

It shouldn’t come as any surprise that there have been developments in automated voice recognition and interpretation. For God’s sake, “Siri,” “OK Google,” and “Alexa” are practically household phrases. But not all voice recognition technologies are created equal.

Not too long ago, Facebook’s video player became a major competitor among video streaming services, with native video uploads overtaking what used to be mostly YouTube links attached to a Facebook post (Facebook actually gets more daily views). Facebook is still kind enough to allow YouTube links to be posted, but they DO NOT AUTOPLAY. If you want to guarantee video views on a given platform, make sure you are playing by its rules. Not to mention, Facebook is making it very easy for users to upload their videos natively by suggesting recent photo and video content from your phone’s storage.

But which video player has better auto-captioning?

I’m not even going to build the anticipation. YouTube is better, hands down, no question, bar none. I’ve seen YouTube’s captions recognize names that have “Y’s” and “X’s.” I’ve had a video that uses the phrase carbamide peroxide, which isn’t even recognized by many word processing programs. It even gets very close to perfection with capitalization and punctuation. Facebook, on the other hand, has really been nothing but a burden in my captioning experience. It does strange things like randomly capitalizing a word, turning a stutter into a completely different word, and generally neglecting logic. *Edit: One of my coworkers told me the Facebook auto-captioning worked well when the subject was speaking loudly, clearly, and articulately. This is good news, but it doesn’t change my opinion. YouTube is really impressive.*

YouTube also allows you to download your captions as a separate file that keeps all the timing locked in, so no matter which video editing program you use, you’re able to harness the awesome auto-caption abilities. My workflow has now evolved into a multi-platform chain. Here’s how it works:

  1. Take the finished video and upload it to your YouTube library.
  2. Generate captions using YouTube’s auto-generate option. Make sure you double-check them; YouTube isn’t infallible.
  3. Download the caption files (Premiere likes .srt files).
  4. Add them to your project and sync them to the video. They should already be timed properly, so just cut them as you would your video.
  5. Export everything as one file, and voilà, you have a captioned video with almost no transcription required.
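
If you want a quick sanity check on the downloaded .srt before it goes into your editor, a few lines of Python can flag the obvious problems. This is only a rough sketch, not part of my actual workflow: it assumes the usual SubRip layout (a cue number, a timing line like `00:00:01,000 --> 00:00:03,500`, then the caption text, with blank lines between cues), and the names `parse_srt` and `find_problems` are made up for illustration. It won’t catch a misheard word, so you still have to read the captions yourself.

```python
import re
from dataclasses import dataclass
from typing import List

# Timing line in the common SubRip form: HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    index: int
    start_ms: int
    end_ms: int
    text: str


def _to_ms(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to milliseconds."""
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues, skipping blocks that don't look like cues."""
    content = content.lstrip("\ufeff").replace("\r\n", "\n")
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.split("\n")
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue
        match = TIMING.search(lines[1])
        if match is None:
            continue
        fields = match.groups()
        text = " ".join(line.strip() for line in lines[2:])
        cues.append(
            Cue(int(lines[0]), _to_ms(*fields[:4]), _to_ms(*fields[4:]), text)
        )
    return cues


def find_problems(cues: List[Cue]) -> List[str]:
    """Flag empty captions, backwards timings, and overlapping cues."""
    problems = []
    previous_end = 0
    for cue in cues:
        if not cue.text:
            problems.append(f"cue {cue.index}: empty text")
        if cue.end_ms <= cue.start_ms:
            problems.append(f"cue {cue.index}: ends before it starts")
        if cue.start_ms < previous_end:
            problems.append(f"cue {cue.index}: overlaps previous cue")
        previous_end = max(previous_end, cue.end_ms)
    return problems
```

To use it, read the file with something like `open("captions.srt", encoding="utf-8").read()` (the filename is a placeholder), pass the text to `parse_srt`, and print whatever `find_problems` returns.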

Since YouTube is a Google product, I assume it has the advantage of being able to access Google’s indexing of frequently used terms and phrases (not unlike Google’s autocomplete feature). This isn’t to say that Facebook won’t soon rival Google in this technology, but for the time being, YouTube is retaining its relevance, in my eyes, from its caption utility alone. The irony, of course, is that if you want to maximize video views, you might need to look to Facebook.