Accessible video and audio content - Multimedia for all

Do you want to make your audio and video clips accessible? This article explains how.

The legal requirements for multimedia content can be found in WCAG 2.x under Guideline 1.2, "Time-based Media". This article deals with best practices.


Best possible recording quality

In general, aim for the best possible recording quality. Content produced "quick and dirty" can be very problematic. Audio and video recorded with a smartphone can be hard to understand and to make out, and unedited photos are difficult because the subject of the image is often hard to identify. Dedicated studios that guarantee good image and sound quality are best. If you want to conduct an interview outdoors, find as quiet a corner as possible. Use the best equipment available, such as a dedicated video camera or an external microphone. Avoid low contrasts, such as a person in light-coloured clothing in front of a light-coloured house wall. Avoid loud background noise such as traffic, but also the hum of air conditioners and similar sources.
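
The advice on low contrast can be checked roughly before a shoot. The following is a minimal sketch, assuming you can sample one representative colour for the clothing and one for the background; it borrows the contrast ratio formula that WCAG 2.x defines for text and uses it only as a proxy for video scenes. The function names are illustrative, not part of any library.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (formula from WCAG 2.x)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB colour, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(a: tuple[int, int, int], b: tuple[int, int, int]) -> float:
    """Contrast ratio between two colours, from 1.0 (identical) upwards."""
    la, lb = relative_luminance(a), relative_luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Hypothetical sample colours: a cream jacket in front of a white wall.
jacket = (245, 240, 225)
wall = (255, 255, 255)
print(f"jacket vs wall: {contrast_ratio(jacket, wall):.2f}:1")
```

Pure black against pure white gives the maximum of 21:1; values close to 1:1 indicate that the person will hardly stand out from the background.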

Good quality should already be ensured at the conception and recording stages. Much can be corrected in post-production, but this is avoidable extra work, and not everything can be fixed.

Smartphones also offer some editing options. In any case, refrain from applying effects and filters during recording; they rarely make the content easier to recognise and are better applied in post-production.

Accessible media player

The most important element is an accessible media player. It should be fully operable by keyboard, and the controls should be large and easy to see. Subtitles that can be switched on and off, as well as audio description, should be supported as far as possible. It is also always useful to make the content available for download in a common format such as MP3, OGG, MPEG or AVI.

An example of this is the video player of Aktion Mensch: there, audio description, subtitles and sign language can be switched on and off. An alternative is the Able Player.

If you offer podcasts or other audio content, this should of course also be as accessible as possible. Use the best possible audio quality. This is especially helpful for the hard of hearing. For them, loud background noises and hissing are disturbing. It is also difficult if different speakers have different volumes. Hearing impaired persons have to adjust the volume every time the speaker changes. For audio and video content, I would recommend creating a text transcript. A transcript is a complete transcription of the spoken content.
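
The problem of speakers with different volumes can be illustrated with a small calculation. This is a minimal sketch, assuming the recording has already been split into one list of samples per speaker (values between -1.0 and 1.0); the speaker names, the target level and the function names are hypothetical. It uses the simple RMS level as a stand-in for loudness, whereas audio editors typically offer perceptual loudness normalisation, which is preferable in practice.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples in the range -1.0..1.0."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def levelling_gains(segments, target_rms=0.1):
    """Return one gain factor per speaker that brings each to target_rms.

    segments maps a speaker name to that speaker's samples.
    Silent speakers (level 0) are left unchanged with a gain of 1.0.
    """
    gains = {}
    for speaker, samples in segments.items():
        level = rms(samples)
        gains[speaker] = target_rms / level if level > 0 else 1.0
    return gains


# Hypothetical example: a loud host and a quiet guest.
segments = {
    "host": [0.2, -0.2, 0.2, -0.2],
    "guest": [0.05, -0.05, 0.05, -0.05],
}
for speaker, gain in levelling_gains(segments).items():
    print(f"{speaker}: multiply samples by {gain:.2f}")
```

Applying the gains before publishing means that listeners do not have to adjust the volume each time the speaker changes.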


With videos you should make sure that they are as recognisable and understandable as possible. It is true that rather amateurish recordings are also accepted today, especially in social media. But hearing and visually impaired persons have a hard time perceiving such content. Also consider that such content is often consumed on smartphones with rather mediocre displays and speakers, and often on the go in noisy and visually less than optimal environments.

Sudden changes in sound volume or lighting are also not optimal for these groups. They should be kept to a minimum.

Speakers should face the camera as much as possible. This way, there is another channel besides the sound through which they can be understood. Up to 30 per cent of information can be derived from lip movements and body language.

Text transcript

A text transcript is a full transcription of spoken content. It is a lot of hard work. But it offers many advantages:

  • It makes the content accessible to the deaf and severely hard of hearing who understand everyday language.
  • It increases comfort. "Re-reading" is easier than "re-listening".
  • You have additional content for your website that can be indexed better by Google and therefore bring you more visitors. Google doesn't listen to or look at content. But Google can index text.

Text transcripts are mainly suitable for speech-heavy content. For radio plays or action-heavy videos, you would also have to describe the scenes, which almost turns the transcript into a script.


Subtitles

Subtitles should be included in videos for persons who are deaf or hard of hearing. Subtitles transcribe what is being said and also convey sounds that are important for understanding the clip.

Subtitles have other advantages. They can be indexed by YouTube and thus by Google, which means that your video content can be found more easily. Furthermore, subtitles can be useful if a speaker is difficult to understand, for example because of a dialect or speech impediment. They also allow someone to watch the video without speakers or headphones. Studies show that subtitles are mainly used by hearing persons who do not want to or cannot listen to the sound in their current situation.

There are open and closed captions. Open captions are always visible, whereas closed captions can be switched on and off and are therefore preferable.

Various tools facilitate the creation of subtitles. Subtitles can be created in YouTube and exported for other applications such as Facebook. YouTube's automatic subtitling is currently not reliable enough for practical use.

A free programme for creating subtitles is Subtitle Edit. However, most professional video editors already have a built-in solution for this.
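
Closed captions are usually delivered as a separate text file that the player loads alongside the video. As an illustration, here is a minimal sketch that writes cues in WebVTT, one common caption format; the cue data and function names are made up for the example, and a dedicated subtitle tool will normally handle this step for you.

```python
def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues) -> str:
    """Build a WebVTT document from (start_seconds, end_seconds, text) cues."""
    blocks = ["WEBVTT"]  # mandatory file header
    for start, end, text in cues:
        blocks.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}\n{text}")
    # Cues are separated from the header and from each other by blank lines.
    return "\n\n".join(blocks) + "\n"


# Hypothetical cues: spoken text plus a sound that matters for understanding.
cues = [
    (0.0, 2.5, "[door slams]"),
    (2.5, 6.0, "Welcome to our workshop."),
]
print(build_webvtt(cues))
```

Note that the sound description in square brackets is written as an ordinary cue, which is how important non-speech sounds are conveyed.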

If possible, ask a deaf person whether the subtitles are sufficient. If you do not have a deaf person available, watch the video with your subtitles and without sound. Can you understand what it is about? It is better if you ask a colleague who is not involved in the video to do the check. Since you already know the video, you are not really objective.

Audio description

Audio description (AD) is a description of visual video content for blind and visually impaired people. It is particularly important for videos with little spoken language, because otherwise blind people will not understand what the film is about. It is dispensable for videos with a high share of speech.

AD is usually produced by specialised agencies, which can be expensive. In addition to the actual video production, a narrator is needed to record the AD, so it can cost several thousand euros even for relatively short clips. However, AD is relatively easy to create yourself. As a rule, a second soundtrack is created alongside the film's original soundtrack: the texts are recorded and then synchronised with the film.

First listen to the film without the picture: what visual information is important for someone who cannot see it? Bear in mind that only the parts of the film without speech can be used for the description. If necessary, plan appropriate pauses when designing the film.
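
Finding the speech-free parts can be done by ear, but if timings for the spoken passages already exist (for example from a subtitle file), the pauses can also be computed. The following is a minimal sketch under that assumption; the function name, the two-second minimum and the sample timings are hypothetical.

```python
def speech_free_gaps(speech, duration, min_gap=2.0):
    """Return (start, end) pauses of at least min_gap seconds.

    speech: list of (start, end) times in seconds where someone is talking.
    duration: total length of the film in seconds.
    """
    gaps = []
    cursor = 0.0  # end of the latest speech seen so far
    for start, end in sorted(speech):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # also handles overlapping speech
    if duration - cursor >= min_gap:
        gaps.append((cursor, duration))
    return gaps


# Hypothetical timings for a 25-second clip.
speech = [(0.0, 4.0), (5.0, 9.0), (14.0, 20.0)]
for start, end in speech_free_gaps(speech, duration=25.0):
    print(f"room for description from {start:.1f}s to {end:.1f}s")
```

If the result is empty or the pauses are very short, that is a sign that pauses for the description need to be planned into the film itself.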

The sound quality of the AD should match the sound quality of the film; that is, avoid noisy recordings made with poor microphones unless the film itself is of similarly low quality.

If possible, ask a blind person to watch the film with AD. If you do not have a blind person available, play the video with AD without looking at the screen and check whether the AD is sufficient. As with subtitles, ask an uninvolved colleague to review the film, since he or she can judge the quality of the AD more objectively.

An alternative to a time-consuming AD is to include the essential visual information in the spoken narration. One advantage is that this information also benefits visually impaired persons who do not usually use AD. Think about which visual information is particularly important and incorporate it into the narration text.

Currently, neither Facebook nor YouTube supports turning audio description on and off, so you will usually have to upload two versions, one with and one without AD.

The video title should indicate whether an audio description is included: "My best holiday video" for the clip without AD and "My best holiday video with audio description" for the clip with AD.

Sign language

There are two options for sign language. First, you can create a sign language version of a conventional video; the signer or avatar is then displayed next to the video or as an overlay. Second, if you are recording an event at which a sign language interpreter is on stage, the interpreter should be recorded as well, with his or her consent, of course. In theory, the interpreter and the event could be recorded separately and synchronised in post-production. This would make the interpreter easier to recognise, and autistic and easily distracted persons would not be irritated by the interpreter. However, I cannot assess whether this is realistic. Special requirements apply to content in sign language so that the signers can be recognised as well as possible: for example, a bright background and a good viewing angle should be used.

For conventional videos with an added sign language version, the same applies as for audio description: movements that are not part of the film can distract autistic persons or persons with concentration problems. It should therefore either be possible to switch the sign language on and off (see the Aktion Mensch player), or there should be one video with and one without sign language, clearly indicated in the title.

A guideline and a sample tender for sign language are linked by the IT representative of the Federal Government.

Plain language in videos

The topic of understandable language in multimedia is very complex. Of course, you can encourage participants to speak as intelligibly as possible. However, if they are not experienced or trained in this, they will quickly fall back into old speaking habits. It is true that additional levels of communication such as body language and context are available in film. Nevertheless, a conventional film is likely to be a challenge for persons with learning disabilities. Again, you can pay attention to this when designing the film. If it is a scripted format, you can make the voiceover as understandable as possible, with little jargon and no convoluted sentences. So far, I have only seen content in plain language at the Working Group on Disability and Media.

Time and cost planning

If you plan an accessible film, additional costs will be incurred.

My recommendation is that if such films are produced regularly, two or three persons in-house should specialise in making them accessible. If all options are implemented consistently and the work is commissioned externally, you can assume costs of approx. 1000 € per film minute; a three-minute clip would therefore come to roughly 3000 €. This applies to short films; for longer clips, the costs per minute should be somewhat lower.

In my opinion, subtitles and audio description can be done well in-house. However, if you do not have anyone in-house who is proficient in sign language, you will have to outsource the sign language translation. Plan a lot of additional time, especially for audio description and sign language. Sign language interpreters are particularly difficult to find. When you ask for quotes, it is best to ask how much time the work will take. Experience shows that you should plan on 10 working days for short clips; no general figure can be given for longer videos, as this depends on the workload of the respective service provider.

Automatic transcription

Speech-to-text is one of the fastest developing technologies. It lends itself to the creation of text transcripts as well as subtitles for videos. The quality is already quite good. However, manual post-editing is indispensable for the time being: dialects, in-house terminology and unusual names overtax speech-to-text systems.
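
Because the same recognised text can serve both purposes, the manual correction only needs to be done once. As a minimal sketch, assuming the corrected subtitles are available as SRT-style text (numbered cues, a timing line, then the caption text), the following derives a plain transcript from them; the function name and sample text are hypothetical.

```python
import re

# Matches timing lines such as "00:00:01,000 --> 00:00:03,000".
TIMING = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def subtitles_to_transcript(subtitle_text: str) -> str:
    """Strip cue numbers and timing lines, keeping one line per cue."""
    paragraphs = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", subtitle_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if lines and lines[0].isdigit():  # optional cue number
            lines = lines[1:]
        if lines and TIMING.match(lines[0]):  # timing line
            lines = lines[1:]
        if lines:
            paragraphs.append(" ".join(lines))
    return "\n".join(paragraphs)


sample = """1
00:00:01,000 --> 00:00:03,000
Welcome to the show.

2
00:00:03,500 --> 00:00:06,000
[applause]
Today we talk about captions.
"""
print(subtitles_to_transcript(sample))
```

The result still needs a read-through, for example to add speaker names, but it avoids transcribing the same content twice.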

It is only worth purchasing the appropriate software if you have to transcribe content on a regular basis. A service provider transcribes content far more frequently, so for them the purchase pays off; if you hire one anyway, you do not need the software yourself.
