Barriers to Employment: The Deaf Multimedia Authoring Tax


This paper describes the challenges that deaf and hard of hearing people face when creating accessible multimedia content such as portfolios, instructional videos, and video presentations. Unlike content consumption, the process of content creation remains highly inaccessible, creating barriers to employment at every stage of recruiting, hiring, and carrying out assigned job duties. Overcoming these barriers incurs a “deaf content creation tax”: significant additional time and resources spent to produce content equivalent to what a non-disabled person would produce. We illustrate this process and its associated challenges through real-world examples experienced by the authors, and offer guidance and recommendations for addressing them.


💡 Research Summary

The paper “Barriers to Employment: The Deaf Multimedia Authoring Tax” investigates the hidden costs that deaf and hard‑of‑hearing (DHH) individuals incur when creating multimedia content required in modern workplaces. While prior accessibility research has largely focused on consumption—providing sign language interpreters, captions, and transcripts—the authors argue that the production process itself remains largely inaccessible, creating a “deaf multimedia authoring tax” that can jeopardize employment at every stage: job hunting, hiring, and on‑the‑job performance.

The authors, a mixed team of nine DHH members and one hearing interpreter from Gallaudet University, document their real-world experiences producing portfolios, instructional videos, and presentations. They outline a typical DHH video-production workflow consisting of eleven steps:

1. Drafting an English script
2. Translating it into American Sign Language (ASL)
3. Creating a draft ASL video
4. Filming the final ASL performance
5. Back-translating the signed content into English
6. Manually aligning captions (SRT files)
7. Generating an English voice-over, either via a hearing speaker or text-to-speech (TTS)
8. Mixing the voice-over with the video
9. Writing an audio-description script for blind users
10. Recording the audio descriptions
11. Integrating those descriptions without overlapping existing audio

Each step introduces unique challenges: ASL and English differ structurally, so translation and back-translation are time-intensive; timing mismatches between signed and spoken language cause awkward pauses or rushed speech; DHH creators cannot reliably evaluate the intelligibility or prosody of TTS or human voice-overs due to limited hearing; and planning audio-description gaps is impossible until the voice-over is finalized.
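To make the caption-alignment step concrete, the sketch below generates a minimal SRT file of the kind DHH creators must hand-time against the signing on screen. The helper names and the sample cue text are illustrative, not from the paper; SRT itself is just numbered cues with `HH:MM:SS,mmm` start/end timestamps.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def make_srt(cues):
    """Render a list of (start_sec, end_sec, text) cues as an SRT document."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)

# Hypothetical cues: in practice, each timestamp must be matched by hand
# to the signed content visible on screen, since there is no audio track
# to align against automatically.
cues = [
    (0.0, 2.8, "Hello, and welcome to our presentation."),
    (3.1, 6.4, "Each caption is timed to the signing, not to audio."),
]
print(make_srt(cues))
```

Tools that auto-align captions to speech cannot help here, which is why this step is fully manual in the workflow above.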

To quantify the “tax,” the authors measured the effort required for a 15‑minute caption‑quality video produced for the ACM CHI 2024 conference. Their data show that four DHH contributors spent a total of 13 person‑hours on the extra steps (6 h for translation and draft video, 2.5 h for final filming, 1 h for back‑translation, 1 h for caption alignment, 1 h for human voice‑over, and 1.5 h for mixing). By contrast, a hearing creator would likely need only 1–2 hours for filming and caption generation, with caption outsourcing costing $1.25–$3 per minute. Thus, the DHH team incurred a cost equivalent to more than ten additional hours—a disproportionate burden that can make DHH candidates appear less efficient during recruitment and increase labor costs for employers when such content creation is part of job duties.
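The cost comparison above can be checked with simple arithmetic; the figures are taken from the paper, while the step labels are just descriptive shorthand:

```python
# Person-hours reported for the extra DHH-specific steps of producing
# a 15-minute caption-quality video for ACM CHI 2024.
extra_steps_hours = {
    "ASL translation and draft video": 6.0,
    "final filming": 2.5,
    "back-translation to English": 1.0,
    "caption (SRT) alignment": 1.0,
    "human voice-over": 1.0,
    "audio mixing": 1.5,
}
total_dhh_hours = sum(extra_steps_hours.values())  # 13.0 person-hours

# Hearing-creator baseline from the paper: 1-2 hours of work plus
# outsourced captions at $1.25-$3 per minute.
video_minutes = 15
caption_cost_low = 1.25 * video_minutes   # $18.75
caption_cost_high = 3.00 * video_minutes  # $45.00

print(f"DHH extra effort: {total_dhh_hours} person-hours")
print(f"Hearing baseline: 1-2 hours + ${caption_cost_low:.2f}-${caption_cost_high:.2f} for captions")
```

The roughly 11-to-12-hour gap between the two totals is the "authoring tax" the paper quantifies.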

Quality‑control challenges further exacerbate the problem. When TTS is used, DHH creators cannot assess whether a 1.35× speed‑up maintains intelligibility. Even hearing collaborators may introduce errors, such as filming ASL content backward, which would be obvious to a fluent signer but not to a non‑signer. Therefore, a hearing team member fluent in ASL is essential for verification, adding another layer of coordination and expense.

In the “Future Directions” section, the authors call for the development of sign‑language‑aware tools that automate or streamline the workflow. Potential solutions include robust ASL recognition and synthesis, automatic caption‑timing alignment for signed videos, AI‑generated audio descriptions, and integrated editing environments that allow DHH creators to control both visual and auditory tracks without relying on a hearing intermediary. Institutional measures—such as providing DHH‑specific multimedia creation suites or partnering with specialized service providers—are also recommended.

Overall, the paper highlights a systemic, under‑recognized barrier: the extra time, expertise, and financial resources DHH individuals must expend to meet standard multimedia accessibility requirements. By exposing the magnitude of this “authoring tax” and proposing concrete technical and policy interventions, the authors aim to reduce the employment disparity and enable DHH professionals to compete on equal footing in the digital workplace.

