How to Make Video and Audio Accessible - Introduction

Elements of Accessible Video and Audio

To make your video and audio accessible to people with disabilities, provide captions, transcripts, audio description, and optionally sign language, depending on the content.

Captions

Captions provide content to people who are Deaf and others who cannot hear the audio. They are also used by people who process written information better than audio.

Captions are a text version of the speech and non-speech audio information needed to understand the content. They are displayed within the media player and are synchronized with the audio.

Most captions are "closed captions" that people watching the video can show or hide. Captions can also be "open captions" that are always displayed and cannot be turned off.

Subtitles are the spoken audio translated into another language. They are implemented like captions. Subtitles can cover only the spoken audio (for people who can hear the audio) or can translate the full caption content, including non-speech audio information.

[image: example static image from Perspectives Video on Captions showing captions.]
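
On the web, closed captions and subtitles are commonly provided as timed text files (such as WebVTT) referenced from the video, and the media player lets viewers turn each track on or off. A minimal sketch, with placeholder file names that are not from this resource:

    <video controls>
      <source src="video.mp4" type="video/mp4">
      <!-- Closed captions: speech plus non-speech audio information -->
      <track kind="captions" src="captions-en.vtt" srclang="en" label="English" default>
      <!-- Subtitles: translation of the spoken audio -->
      <track kind="subtitles" src="subtitles-de.vtt" srclang="de" label="Deutsch">
    </video>

The caption file itself pairs time codes with the text to display, for example:

    WEBVTT

    00:00:03.000 --> 00:00:06.500
    [door slams]
    We need to leave now.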

Transcripts

Basic transcripts are a text version of the speech and non-speech audio information needed to understand the content.

Descriptive transcripts also include visual information needed to understand the content.

Descriptive transcripts are required to provide content to people who are both Deaf and blind. They are also used by people who process text information better than audio and video.
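
For illustration only, the invented excerpt below shows how a descriptive transcript can combine speech, non-speech audio, and visual information; the exact conventions (speaker labels, brackets) vary between organizations:

    Presenter: To turn captions on, select the "CC" button in the player.
    [The caption menu opens, with "English" highlighted.]
    Presenter: You can also change the caption language here.
    [Notification chime]
    Presenter: Captions now appear at the bottom of the video.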

[optional image: person interacting with dynamic Braille display, not looking at video in background]

Interactive transcripts highlight text phrases as they are spoken. Users can select text in the transcript and go to that point in the video. (This is a feature of the media player. It uses the captions file.)
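
As a rough sketch of how this player feature can work in a browser, the script below reads the caption track's cues, highlights the matching transcript line on each cuechange event, and seeks the video when a line is selected. The element IDs and the data-start attribute are illustrative assumptions, not part of any particular player:

    // Assumes <video id="talk"> with a captions <track>, and a transcript
    // rendered as <li data-start="..."> items (illustrative markup only).
    const video = document.querySelector('#talk') as HTMLVideoElement;
    const transcript = document.querySelector('#transcript') as HTMLElement;
    const track = video.textTracks[0];
    if (track.mode === 'disabled') {
      track.mode = 'hidden'; // cues must be loaded ('hidden' or 'showing') for cuechange to fire
    }

    // Highlight the transcript line for the cue currently being spoken.
    track.addEventListener('cuechange', () => {
      const cue = track.activeCues?.[0];
      transcript.querySelectorAll('li').forEach((li) => {
        li.classList.toggle('current', !!cue && Number(li.dataset.start) === cue.startTime);
      });
    });

    // Selecting a transcript line jumps the video to that point.
    transcript.addEventListener('click', (event) => {
      const li = (event.target as HTMLElement).closest('li');
      if (li && li.dataset.start) {
        video.currentTime = Number(li.dataset.start);
        video.play();
      }
    });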

[image like https://w3c.github.io/wai-media-intro/img/xcr_perspectives-d998a967.png]

Audio Description

Audio description provides content to people who are blind and others who cannot see the video adequately.

Audio description describes visual information needed to understand the content. It is usually narration added to the soundtrack.

For some types of video (such as some training videos), the speakers can seamlessly integrate description of the visual information as the video is planned and created, so separate audio description is not needed.
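
One common way to publish audio description on the web is to offer the audio-described version alongside the main video, as in this sketch (the file names are placeholders, not from this resource):

    <video controls>
      <source src="welcome.mp4" type="video/mp4">
      <track kind="captions" src="welcome-captions.vtt" srclang="en" label="English">
    </video>
    <p>
      <a href="welcome-described.mp4">Audio-described version</a> |
      <a href="welcome-transcript.html">Descriptive transcript</a>
    </p>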

[optional image: blind person listening to video]

Sign Language

Sign languages use hand and arm movements, facial expressions, and body positions to convey meaning. For most people who are Deaf, sign language is their native language, and some do not read written language well. Note that there are different sign languages in different regions and countries; for example, American Sign Language (ASL), British Sign Language (BSL), and Auslan (Australian Sign Language) are all different.

Sign language is not required to meet most minimum accessibility standards.

[optional image: person signing]

Video and Audio Content

There are also accessibility requirements for the video or audio content itself. For example, in videos, avoid flashing that can cause seizures.

Accessibility Requirements

Providing a descriptive transcript for videos (or a basic transcript for audio-only content) meets a wide range of accessibility needs.

To meet Web Content Accessibility Guidelines (WCAG) Level AA, most pre-recorded videos need to include captions and audio description.

Requirements differ based on the content and on whether it is live or pre-recorded. To figure out what your video or audio needs, see WCAG Media Standards and What Does My Video/Audio Need?

Develop In-House or Outsource

One approach to developing media alternatives is:

  1. An audio-described version is developed by the same people, at the same time as the main video.
  2. Captions are outsourced, including captions of the main video, of the audio-described version, and of the audio description itself. Often whoever produces the video also provides captions.
  3. Descriptive transcripts are developed in-house using the text from the caption files.

Some organizations do it all in-house, and some outsource it all. For help figuring out how to get your media alternatives developed, see Managing Media Accessibility.

Automatic Captions are Not Sufficient

Automatically generated captions do not meet user needs or accessibility requirements unless they are confirmed to be fully accurate.

There are tools that use speech recognition technology to turn a soundtrack into a timed caption file. For example, many videos uploaded to YouTube have automatic captions. [YouTube info] However, often the automatic caption text does not match the spoken audio — and in ways that change the meaning (or are embarrassing). For example, missing just one word such as "not" can make the captions contradict the actual audio content.

[optionally as an illustration for visual interest (with text as true text):
"Spoken text: Broil on high for 4 to 5 minutes. You should not preheat the oven."
"Automatic caption: Broil on high for 45 minutes. You should know to preheat the oven."
optional illustration/picture: fire coming from oven, or totally burned food on a broiler pan ;-)]

Automatic captions can be used as a starting point for developing accurate captions and transcripts, as described in Creating Captions and Creating Transcripts.

Additional Benefits

Accessible video and audio is essential for people with disabilities and useful for everyone in a variety of situations. For example, captions help people watching in noisy places where the audio cannot be heard, or in quiet places where the sound must stay off, and transcripts can be skimmed, searched as text, or read offline.

Some benefits of captions are illustrated in this 1-minute Video on Captions.

[optional image: screen capture for visual interest]

Making Your Media Accessible

The pages in this resource provide specific guidance on planning and creating each of these elements of accessible media.
