UNIVERSITY OF SCIENCE AND TECHNOLOGY OF HANOI

Introduction to Cryptography

AES 128 with HLS Protocol Video Encryption

Student names:
Trần Xuân Phong        BA10-004
Nguyễn Văn Tài         BA10-055
Nguyễn Đức Anh Tuấn    BA10-066
Nguyễn Hoàng Long      BI11-157
Tạ Đình Thái Nhân      BI11-205
Nguyễn Văn Nhân        BI11-206
Class: ICT
Lecturer: Dr. Nguyễn Minh Hương
Hanoi 02/2023

1. Introduction

Providing video content is one of the most popular services on the Internet today. According to Cisco, video was forecast to account for 82% of Internet traffic by 2022. [1] Typical challenges in providing video content include reducing latency, reducing content size, reducing operating costs, and protecting content. Content protection here means that videos can only be accessed by authorized viewers, and encryption with AES-128 is one way to implement it. AES-128 encryption can be applied before transmission in many different protocols; this report presents it specifically with the HLS protocol.

2. Background and related work

2.1. What is video streaming?

Video streaming is the process of delivering video content from a server to a client continuously. Instead of sending the entire video file, the server sends it in small parts, so the client can watch each segment without downloading the whole video. Video streaming can run over either TCP or UDP, each with its own advantages and disadvantages:
• TCP provides reliable data transfer and ensures that packets are received in the correct order. However, its retransmission and ordering guarantees make it slower than UDP.
• UDP has a faster transmission rate, but packets may be lost.
Therefore, the choice of protocol depends on the situation:
• If low latency is required and quality is not a high priority, UDP should be used (e.g., for video calls).
• In all other cases, where video quality is a priority, TCP should be used.

2.2. What is HLS?

HTTP Live Streaming (HLS) is an HTTP-based adaptive bitrate streaming protocol developed by Apple and released in 2009. It is a reliable video streaming protocol that adapts the quality of the transmitted video to the quality of the network connection in order to keep playback uninterrupted. Additionally, it supports various content encoding tools, separate audio and video transmission, and is compatible with HTTP caching platforms for large-scale video transmission. [2]

Why HLS?
• HLS is based on HTTP, so it has many advantages in transmission:
  o Reliable transmission
  o Content can be served through a CDN, which simplifies scaling
  o Traffic passes easily through existing proxy / firewall systems, since most of them allow HTTP
• Adaptive bitrate streaming: the server offers the video in several qualities, and the quality actually played is chosen according to the quality of the connection between the server and the client.
  o If the connection quality is good, the player requests a higher-quality version (higher bitrate, higher resolution)
  o If the connection quality is poor, the player requests a lower-quality version (lower bitrate, lower resolution) to reduce load time and keep playback continuous

How does HLS work?
• The video delivery process can be carried out as follows: Original video → Transcoding the original video into multiple segments with different qualities → Encrypting the segments (optional) → Sending to CDN → Sending to users.
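The adaptive bitrate behaviour described under "Why HLS?" can be illustrated with a minimal sketch. It assumes that the player makes the choice from a list of qualities advertised by the server; the Variant class, the pick_variant function, the bitrates, the playlist paths and the 0.8 safety factor are all hypothetical values chosen for illustration, not part of HLS itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One quality level offered by the server (illustrative values only)."""
    bandwidth_bps: int  # bitrate needed to play this quality smoothly
    resolution: str
    uri: str            # hypothetical path of this quality's playlist


def pick_variant(variants, measured_bps, safety=0.8):
    """Return the highest-bitrate variant that fits the measured throughput.

    Only a fraction (safety) of the measured throughput is used as budget,
    so small drops in network speed do not stall playback. If no variant
    fits, the lowest-bitrate one is returned to keep playback going.
    """
    ordered = sorted(variants, key=lambda v: v.bandwidth_bps)
    budget = measured_bps * safety
    best = ordered[0]
    for variant in ordered:
        if variant.bandwidth_bps <= budget:
            best = variant
    return best


# Hypothetical set of qualities for one video.
VARIANTS = [
    Variant(800_000, "640x360", "360p/index.m3u8"),
    Variant(2_500_000, "1280x720", "720p/index.m3u8"),
    Variant(5_000_000, "1920x1080", "1080p/index.m3u8"),
]
```

With these values, a measured throughput of 4 Mbit/s gives a budget of 3.2 Mbit/s, so pick_variant(VARIANTS, 4_000_000) selects the 1280x720 version, while a throughput of 500 kbit/s falls back to the 640x360 version.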
(Image source: https://www.vnetwork.vn/news/cdn-hls-streaming-chuyen-dung-cho-ott-video)
• Normally, a single HLS segment has a duration of about 10 seconds; this duration can be reduced for live streaming.

2.3. What is AES?

Advanced Encryption Standard (AES) is a standard for encryption and decryption established by the U.S. National Institute of Standards and Technology (NIST). AES is a symmetric block cipher that uses 128-bit data blocks and keys of 128, 192, or 256 bits. [3] In the case of AES-128, the key length is 128 bits. According to estimates, recovering AES-128 encrypted data by trying every possible key would take billions of years, so AES-128 is safe to use as long as the key is not leaked.

2.4. How is AES-128 encryption used in HLS?

After a video is divided into segments, each segment is encrypted before being sent to the user. The user needs the key to decrypt the content received from the server and play it as video. The EXT-X-KEY tag in the playlist tells the client the encryption method and the URI from which to obtain the key. [4]

3. Details of HLS AES-128 Encryption

The following are the specific steps in the process of delivering video to users.

3.1. Server side

On the server side, the video needs to be processed, divided into segments, and encrypted. FFmpeg can be used to perform this step. First, we need the parameters for AES-128 encryption: the Key and the IV. The Key is a 128-bit (16-byte) value, which we generate randomly with openssl. The IV is also a 128-bit value, likewise generated with openssl. Next, create the enc.keyinfo file in the format described in the FFmpeg documentation [5]: the key URI on the first line, the path to the key file on the second line, and optionally the IV on the third line. Then, run FFmpeg to generate output segments with a length of 3 seconds while maintaining the video quality. The output consists of the playlist file index.m3u8 and the segments.
(Figure: the original video.)

3.2. Client side

The client reads the index.m3u8 playlist file provided by the server to obtain information about the video, and then uses the Key and the segment paths to decrypt and play the video.
(Figure: the decoding process.)
In practice, videos are transcoded into several different qualities rather than the single original quality shown in this example.

3.3. Server-side key security

Although AES-128 is secure while the key is unknown, the content is easy to decrypt once the key is leaked. Therefore, when this method is used to distribute non-public content, it is crucial that the server authenticates the user and verifies their access rights before providing the key. Server-side verification can be achieved by using a JWT access token.

4. Conclusion

Video encryption using AES-128 is practical because it is easy to implement and can be applied on a wide range of devices. However, it is not suitable when the video requires a high level of anti-piracy protection. Because this method sends the key to the client, an authorized user can use that key to decrypt the video segments downloaded from the server and easily obtain the content. In this case, the use of a DRM mechanism such as Google Widevine, Apple FairPlay, or Microsoft PlayReady is necessary.

References
1. Cisco. VNI Complete Forecast Highlights Global.
2. Pantos & May. RFC 8216. "Introduction to HTTP Live Streaming". p. 3.
3. NIST. FIPS 197. "Introduction". p. 5.
4. Pantos & May. RFC 8216. "4.3.2.4. EXT-X-KEY". p. 14.
5. https://www.ffmpeg.org/ffmpeg-formats.html#Options-10
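As a supplementary sketch of the server-side preparation described in Section 3.1, the following Python code generates a random Key and IV, writes the enc.keyinfo file, and builds an FFmpeg command line. It is not the exact set of commands used in this report: it uses Python's secrets module instead of openssl, the key file name enc.key and the key URI are hypothetical, and it assumes that "maintaining the video quality" means copying the streams without re-encoding (-c copy). The command is only built as a list of arguments and is not executed here.

```python
import secrets
from pathlib import Path


def prepare_key_files(workdir, key_uri):
    """Create a random AES-128 key file and the matching key info file.

    The key info file follows the layout documented for FFmpeg's HLS muxer:
    line 1 is the key URI written into the playlist, line 2 is the local
    path of the key file, line 3 is the IV as a hexadecimal string.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)

    key = secrets.token_bytes(16)    # 128-bit key
    iv_hex = secrets.token_hex(16)   # 128-bit IV as 32 hex characters

    key_path = workdir / "enc.key"   # hypothetical file name
    key_path.write_bytes(key)

    keyinfo_path = workdir / "enc.keyinfo"
    keyinfo_path.write_text(f"{key_uri}\n{key_path}\n{iv_hex}\n")
    return key_path, keyinfo_path, iv_hex


def build_ffmpeg_command(input_video, keyinfo_path,
                         out_playlist="index.m3u8", segment_seconds=3):
    """Return the FFmpeg argument list for encrypted HLS output."""
    return [
        "ffmpeg", "-i", str(input_video),
        "-c", "copy",                              # assumption: no re-encoding
        "-f", "hls",
        "-hls_time", str(segment_seconds),         # target segment length
        "-hls_playlist_type", "vod",
        "-hls_key_info_file", str(keyinfo_path),   # enables AES-128 encryption
        str(out_playlist),
    ]
```

For example, prepare_key_files("hls_out", "https://example.com/keys/enc.key") followed by build_ffmpeg_command("input.mp4", "hls_out/enc.keyinfo") yields a command that could be passed to subprocess.run. Because the streams are copied rather than re-encoded, FFmpeg can only cut at existing keyframes, so the segments may be somewhat longer than 3 seconds.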