The production of media content across several languages and platforms is both time-consuming and complex. Microphones, sound booths and an array of editing software are typically required to generate translated audio tracks. This paper presents a one-stop solution that simplifies this workflow. Focusing on the translation of audio tracks contained in video files, it describes an innovative workflow that leverages commercially available text-to-speech (TTS) voice synthesis, and a prototype system running in production. The workflow removes the need for microphones and for video or audio editing software, and allows a single editor to generate multiple mixed-gender voice-overs. A lightweight markup language is presented that helps editors fine-tune the synthetic voices. The balance between automation and editorial and linguistic quality is also examined. The predominantly positive feedback received from journalists and audiences indicates that the prototype and its underlying language technology have the potential to become part of the multilingual video production process.