React Video Localisation & HLS

Benefits of Separating Audio and Video Streams

There are a few potential benefits to separating the audio and video streams of multimedia content such as a video file.

One benefit is that it can provide more flexibility and control over the audio and video content. By separating the streams, you can manipulate each one independently, which can be useful if you need to adjust the audio or video quality, or if you want to add or remove certain elements from the streams.

Another benefit is that it can make the content more accessible and inclusive. For example, if you provide the audio and video streams separately, users who are deaf or hard of hearing can access the video stream (for example with captions), while users who are blind or have visual impairments can access an audio stream with audio descriptions.

Separating the audio and video streams can also make it easier to deliver the content to different devices and platforms. For example, you can use different audio and video codecs and formats to optimize the streams for different devices and networks, which can improve the playback quality and performance.

Overall, separating the audio and video streams of multimedia content can provide a number of benefits, depending on the specific requirements and goals of the project.

Benefits of Changing Only the Audio Stream of a Video

Once the streams are separated, it becomes possible to change only the audio of a video while it plays, which has a few benefits of its own.

One potential benefit of changing only the audio stream of a video being played is that the visual content remains the same while the audio switches. This helps when the visuals carry important information that needs to be retained, and when a user understands a different language or has other preferences for audio.

Another benefit of changing only the audio stream is that it can be more efficient and cost-effective. If the video itself does not need to be changed, you can save on storage and bandwidth costs, as you only need to store and deliver the audio stream for each language.
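To make the saving concrete, here is a rough sketch that compares the two storage layouts. The sizes are hypothetical and chosen only for illustration; real numbers depend on the codecs, bitrates and duration of the content.

```javascript
// Hypothetical sizes, chosen only to illustrate the comparison.
const VIDEO_MB = 900; // one video-only rendition
const AUDIO_MB = 60;  // one audio track

// Layout A: one fully muxed file (video + audio) per language.
function muxedStorageMb(languageCount) {
  return languageCount * (VIDEO_MB + AUDIO_MB);
}

// Layout B: one shared video-only stream plus one audio stream per language.
function separatedStorageMb(languageCount) {
  return VIDEO_MB + languageCount * AUDIO_MB;
}

for (const n of [1, 2, 5]) {
  console.log(
    `${n} language(s): muxed ${muxedStorageMb(n)} MB, separated ${separatedStorageMb(n)} MB`
  );
}
```

With a single language the two layouts cost the same; the gap grows with every language added, because the video is stored only once.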

Additionally, changing only the audio stream can make it easier to provide multiple language options for the video. If the video itself does not need to be changed, you can simply provide the audio streams for each language and allow users to switch between them as needed. This can make it more convenient for users to access the content in their preferred language. More on this below.

Continuing Video Playback When Changing Language on a Web Page

It would be great if videos continued playing while changing the language selection on a web page. This would provide a more seamless and uninterrupted experience for users, and would make it easier for them to switch between languages without losing their place in the video.

There are a few different ways this could be implemented. One approach would be to use an HTML5 <video> element and attach a MediaSource object to it (Media Source Extensions). This would allow you to feed the player new media segments as the language selection changes, without interrupting the playback.

Another approach would be to use a video player library like Video.js or Plyr, which provide APIs and event handlers for controlling the video playback. This would allow you to listen for language change events and update the audio or video source accordingly, keeping any interruption to a minimum.
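As a baseline for comparison, the simplest form of "not losing your place" can be sketched with the plain <video> element API. The function name is hypothetical, and the sketch assumes each language has its own complete media URL. Swapping the whole source like this still causes a short pause while the new media loads, which is one reason to prefer switching only the audio stream.

```javascript
// Swap the source of a <video> element and resume where the viewer was.
// `video` is expected to behave like an HTMLVideoElement.
function switchSourceKeepingPosition(video, newSrc) {
  const position = video.currentTime; // seconds already played
  const wasPlaying = !video.paused;

  // Seeking is only possible once the new source's metadata is known.
  video.addEventListener(
    'loadedmetadata',
    () => {
      video.currentTime = position;
      if (wasPlaying) {
        // play() returns a promise in browsers; autoplay rules may reject it.
        Promise.resolve(video.play()).catch(() => {});
      }
    },
    { once: true }
  );

  video.src = newSrc;
  video.load();
}
```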

Implementing this takes some programming effort, but it provides a more seamless and enjoyable experience for users.

Let’s look at some code for implementing this.

There are two basic ideas here:

A single button for localisation of both the web page and the video.
Separation of the audio streams from the video, so that only the audio of the chosen language is streamed from the backend to the user’s browser.

To do this, we first need one centralised language button, integrated with the video player controls. It isn’t very difficult to achieve this. Secondly, we need to separate the video and audio streams on the backend, which is also easy.

Here’s a demo video showing this.

youtube: BtAu60lxnjc

Separate audio and video streams using HLS (HTTP Live Streaming)

HLS, or HTTP Live Streaming, is a protocol that can be played on practically all browsers and devices, either natively (as on Apple devices) or through a JavaScript library such as hls.js. This gives it an edge over DASH, which isn’t supported by Apple devices (sad but true) as of Sept 2020.

The HLS protocol splits a media file into smaller segments, which are streamed incrementally to the user. This is beneficial because the end user does not need to download the entire media file, which could be large.
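To illustrate what those segments look like, here is a small sketch that reads a media playlist and lists its segments. The playlist contents and file names are made up for the example, and the function is not a full HLS parser: it only understands an #EXTINF tag followed by a segment URI.

```javascript
// A small, hypothetical media playlist: three segments of one stream.
const mediaPlaylist = `#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:6.000,
video_000.ts
#EXTINF:6.000,
video_001.ts
#EXTINF:4.500,
video_002.ts
#EXT-X-ENDLIST`;

// Collect each segment's URI and duration (in seconds).
function listSegments(playlistText) {
  const result = [];
  let pendingDuration = null;
  for (const rawLine of playlistText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith('#EXTINF:')) {
      // "#EXTINF:6.000," -> 6
      pendingDuration = parseFloat(line.slice('#EXTINF:'.length));
    } else if (line !== '' && !line.startsWith('#') && pendingDuration !== null) {
      result.push({ uri: line, duration: pendingDuration });
      pendingDuration = null;
    }
  }
  return result;
}

const segments = listSegments(mediaPlaylist);
const total = segments.reduce((sum, s) => sum + s.duration, 0);
console.log(`${segments.length} segments, ${total} seconds in total`);
```

The player fetches these segments one after another, so playback can start after the first few seconds of media have arrived.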

To achieve our objective of switching between audio streams while downloading only a single audio stream, there are three main steps to perform on the media files.

First, remove the original audio from the media file, creating a video-only file and an audio-only file.

This is needed so that we can create a separate, independent audio stream from the original audio. Having a video-only file helps us create a video-only stream, so that if the user selects an alternate language the original audio doesn’t interfere.

A single HLS segment could contain audio and video OR it could contain only video OR only audio. As we need to switch between multiple audio streams, we will have the video segments without any audio and a separate audio stream for each language.

Second, assuming we have multiple audio translations available as separate audio files, create an HLS audio stream for each of them.

The last step is to create an HLS playlist with the audio and video streams. This playlist uses an EXT-X-MEDIA tag for each of the audio streams. Something like below:

#EXT-X-MEDIA:TYPE=AUDIO,URI="eng.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="stream_4",DEFAULT=YES,AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="hin.m3u8",GROUP-ID="default-audio-group",LANGUAGE="hi",NAME="stream_5",AUTOSELECT=YES,CHANNELS="2"

and the video stream refers to the audio streams through the ‘GROUP-ID’, like below (other attributes, such as the required BANDWIDTH, are left out here):

#EXT-X-STREAM-INF:AUDIO="default-audio-group"

With this kind of playlist, we have separated the video stream from the multiple audio streams.
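The three steps above can be sketched in JavaScript as follows. This is only an outline under stated assumptions: it assumes ffmpeg is the packaging tool (the steps above do not name one), and the file names, the six-second segment length, the track names and the bandwidth value are all hypothetical. The functions only build the ffmpeg argument lists and the playlist text; running ffmpeg (for example through Node’s child_process) is left out.

```javascript
// Shared HLS output options: VOD playlist with roughly 6 second segments.
const HLS_OUTPUT = ['-f', 'hls', '-hls_time', '6', '-hls_playlist_type', 'vod'];

// Step 1: drop the audio (-an), copy the video as-is and segment it.
function videoHlsArgs(input, playlist) {
  return ['-i', input, '-an', '-c:v', 'copy', ...HLS_OUTPUT, playlist];
}

// Step 2: drop any video (-vn), encode one language's audio to AAC and segment it.
function audioHlsArgs(input, playlist) {
  return ['-i', input, '-vn', '-c:a', 'aac', ...HLS_OUTPUT, playlist];
}

// Step 3: build the playlist that ties the video stream to the audio group.
// The order of `audioTracks` is the order the player will see them in.
function masterPlaylist(audioTracks, video) {
  const lines = ['#EXTM3U'];
  audioTracks.forEach((track, index) => {
    lines.push(
      `#EXT-X-MEDIA:TYPE=AUDIO,URI="${track.uri}",GROUP-ID="default-audio-group",` +
        `LANGUAGE="${track.language}",NAME="${track.name}",` +
        `DEFAULT=${index === 0 ? 'YES' : 'NO'},AUTOSELECT=YES,CHANNELS="2"`
    );
  });
  lines.push(`#EXT-X-STREAM-INF:BANDWIDTH=${video.bandwidth},AUDIO="default-audio-group"`);
  lines.push(video.uri);
  return lines.join('\n');
}

// Hypothetical inputs.
console.log('ffmpeg ' + videoHlsArgs('input.mp4', 'video.m3u8').join(' '));
console.log('ffmpeg ' + audioHlsArgs('hindi.m4a', 'hin.m3u8').join(' '));
console.log(
  masterPlaylist(
    [
      { uri: 'eng.m3u8', language: 'en', name: 'English' },
      { uri: 'hin.m3u8', language: 'hi', name: 'Hindi' },
    ],
    { uri: 'video.m3u8', bandwidth: 2000000 }
  )
);
```

Keeping the audio tracks in a fixed order matters later, because the player addresses them by their position in this playlist.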

Single button click for page and audio selection

To be able to select the localisation language and the audio stream with a single button, we need to integrate our media player with the localisation selector. Let’s see how this can be accomplished using React on the frontend.

For our example, we will use react-i18next for localisation of the page. For the player we will use hls.js.

Let’s say we have two languages, English and Hindi, on our web page. The selector for the languages could look like below.

        <select onChange={handleChange}>
          <option value={0}> {t('language.en')} </option>
          <option value={1}> {t('language.hi')} </option>
        </select>

When the user selects the first option, they select English (en); the second option is Hindi (hi). The {t('language.en')} expression uses the t function from react-i18next, which displays localised text in the browser depending on the selected language.

On selection of a language, the handleChange function is triggered. This function changes the language on the web page and also changes the selected audio stream. The code for this function could be as below:
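For the language.en and language.hi keys to resolve, the translations have to be registered with i18next. A minimal sketch of such resources is shown here; the translated labels are our own assumption, and the lookup function is only a simplified stand-in for what i18next does internally, included so the example runs on its own. In a real application the resources object would be passed to i18n.use(initReactI18next).init({ resources, lng: 'en', fallbackLng: 'en' }).

```javascript
// Hypothetical translation resources for the two languages.
// With i18next's default "." key separator, t('language.en') resolves to
// resources[<current language>].translation.language.en.
const resources = {
  en: { translation: { language: { en: 'English', hi: 'Hindi' } } },
  hi: { translation: { language: { en: 'अंग्रेज़ी', hi: 'हिन्दी' } } },
};

// Illustration only: walk the nested object along the dotted key.
function lookup(lng, key) {
  return key
    .split('.')
    .reduce((node, part) => node?.[part], resources[lng].translation);
}

console.log(lookup('en', 'language.hi'));
console.log(lookup('hi', 'language.hi'));
```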

  const handleChange = (e) => {
    // <select> values arrive as strings: '0' for English, '1' for Hindi.
    if (e.target.value === '0') {
      i18n.changeLanguage('en');
    } else if (e.target.value === '1') {
      i18n.changeLanguage('hi');
    }
    // hls.js expects a numeric track index, and hlsRef.current
    // is unset until the manifest has been parsed.
    if (hlsRef.current) {
      hlsRef.current.audioTrack = Number(e.target.value);
    }
  };

The if-else check changes the language on the web page using react-i18next’s changeLanguage function (see the react-i18next documentation). The assignment hlsRef.current.audioTrack = Number(e.target.value); changes the audio stream to either English (0) or Hindi (1). The 0 or 1 here maps to the order of the audio streams in our playlist.
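Relying on the order of the streams works, but it breaks silently if the playlist is ever reordered. A possible alternative, sketched here with hypothetical function names, is to use the language code itself as the option value and look the track up by language. It assumes the entries of hls.js’s audioTracks list expose the playlist’s LANGUAGE attribute as lang.

```javascript
// Return the index of the first audio track for `lang`, or -1 if none.
// `audioTracks` is expected to look like hls.js's `hls.audioTracks`.
function findAudioTrackIndex(audioTracks, lang) {
  return audioTracks.findIndex((track) => track.lang === lang);
}

// Change the page language and the audio stream together.
// `i18n` is the i18next instance, `hls` the hls.js instance (may be unset).
function switchLanguage(i18n, hls, lang) {
  i18n.changeLanguage(lang);
  if (!hls) return; // player not ready yet
  const index = findAudioTrackIndex(hls.audioTracks, lang);
  if (index !== -1) {
    hls.audioTrack = index;
  }
}
```

With this in place, the options would carry value="en" and value="hi", and the change handler would call switchLanguage(i18n, hlsRef.current, e.target.value).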

If you are wondering what hlsRef is, it is a React ref that holds the Hls instance from the hls.js library. Here is some sample React code to explain that:

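  const hlsRef = useRef(); // holds the Hls instance once the manifest is parsed

  useEffect(() => {
    if (Hls.isSupported()) {
      const hls = new Hls();
      ...
      // Load the playlist only after hls.js is attached to the <video> element.
      hls.on(Hls.Events.MEDIA_ATTACHED, function () {
        console.log('video and hls.js are now bound together !');
        hls.loadSource(videoSrc);
        hls.on(Hls.Events.MANIFEST_PARSED, function () {
          // Expose the instance so handleChange can switch audio tracks.
          hlsRef.current = hls;
          hlsRef.current.media.controls = true;
        });
        ...
      });
    }
  }, []);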
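The sample above leaves some parts out. One way the whole setup might look, including cleanup when the component unmounts, is sketched below. This is our own outline rather than the code behind the demo: the function name and the onReady callback are hypothetical, and the Hls class is passed in as a parameter so that the function has no imports of its own.

```javascript
// Wire an HLS source to a <video> element and return a cleanup function.
// `Hls` is the class exported by hls.js.
function setupHls(Hls, video, src, onReady) {
  if (Hls.isSupported()) {
    const hls = new Hls();
    // Load the playlist once hls.js is attached to the element.
    hls.on(Hls.Events.MEDIA_ATTACHED, () => hls.loadSource(src));
    // Hand the instance over once the audio tracks are known.
    hls.on(Hls.Events.MANIFEST_PARSED, () => onReady(hls));
    hls.attachMedia(video);
    return () => hls.destroy();
  }
  // Fallback for browsers that play HLS natively (for example Safari).
  if (video.canPlayType('application/vnd.apple.mpegurl')) {
    video.src = src;
  }
  return () => {};
}

// Inside a component this could be used as (videoRef is a hypothetical ref):
//   useEffect(
//     () => setupHls(Hls, videoRef.current, videoSrc, (hls) => {
//       hlsRef.current = hls;
//     }),
//     []
//   );
```

Returning the cleanup function from useEffect destroys the player when the component unmounts. Note that in the native fallback branch hlsRef stays unset, so audio switching through hls.js is not available there.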

This was a high-level description of how to achieve a great localisation experience for a web page that has videos and needs to support multiple languages.

Published Sep 15, 2020

I am a software developer and more recently a generative AI consultant. I am passionate about connecting applications to generative AI. Please reach out if you need to integrate generative AI into your application workflows.
Jaikant Kumaran on Twitter