DEPARTMENT OF COMPUTER
SCIENCE
Computer Science Honours
Final Paper
2015
Title: Technical aspects of a Video Editing and Upload Tool targeted at DCCT Deaf Users
Author: Montlamedi Maikano
Project Abbreviation: VIDUP
Supervisor: Professor Edwin Blake
Category | Min | Max | Chosen
Requirement Analysis and Design | 0 | 20 | 20
Theoretical Analysis | 0 | 25 | -
Experiment Design and Execution | 0 | 20 | -
System Development and Implementation | 0 | 15 | 15
Results, Findings and Conclusion | 10 | 20 | 15
Aim Formulation and Background Work | 10 | 15 | 10
Quality of Paper Writing and Presentation | 10 | 10 | 10
Adherence to Project Proposal and Quality of Deliverables | 10 | 10 | 10
Overall General Project Evaluation (this section allowed only with motivation letter from supervisor) | 0 | 10 | -
Total marks | - | 80 | 80
Technical aspects of a Video Editing and Upload tool
targeted at DCCT Deaf Users
Montlamedi Maikano
Department of Computer Science
University of Cape Town
Mknmon007@myuct.ac.za
ABSTRACT
The purpose of this software engineering project was to allow
DCCT users to archive, edit and upload their videos onto their
website. The archiving software manages the video files on a hard
disk and keeps the videos in a structured storage location within
the computer. Additionally, users are able to create metadata
associated with the videos. The video editor developed allows
videos to be converted to a compressed and commonly used video
format (AVI), with the option of adding subtitles and removing
sound. Videos are encoded (converted) in a manner that maintains
the intelligibility of Sign Language, while their size is reduced
so that they can be made available on the website. The upload
process is discussed later in the paper.
In this paper, we describe the technical aspects of making video
editing and upload tools suitable for Deaf users. Tools and
technologies such as the FFmpeg framework, and video formats such
as the Material Exchange Format (.MXF), are discussed in order to
address the video editing problems at the DCCT, such as handling
large MXF files and uploading them to the website. Performance
tests were carried out. The results showed the reliability of the
FFmpeg algorithms and the impact of conversion, resolution change
and removal of sound on the size of sample videos.
Keywords
MXF, DCCT, Deaf Community of Cape Town, Co-design, FFmpeg,
Transcoding
In this paper we describe the design and software development
aspects of building a system. Therefore there is a need to find a
suitable media framework, converter tools, archiving tools and
API in order to achieve a successful system.
1. INTRODUCTION
The Deaf Community of Cape Town (DCCT) is a non-governmental organization that serves the needs of Deaf people
around the Western Cape. People who are associated with the term
Deaf are both hearing and non-hearing individuals; this is a cultural
identity, and the primary language for Deaf people is South African
Sign Language [15].
Targeted users
The application will primarily be used by the System
Administrator at the DCCT. Other members may also use the
application to edit and upload videos. The aim is to cater for
both experienced and novice users.
Problem Statement
Serving the community is at the heart of the DCCT. They are
involved in hosting events and campaigns to build and create
awareness about the Deaf community. During these events, the
DCCT captures videos and images, and the content is uploaded onto
their website for public viewing. The original video footage has
a very high resolution, and uploading such high-quality footage
onto the website is a challenge. The large size of the video
files and the time they take to upload are an inconvenience,
and as a result only images are currently uploaded. The DCCT has
a large number of video files stored on a hard disk. These videos
are arbitrarily named according to the video camera settings,
which makes it challenging to find a specific video.
2. BACKGROUND
The design of applications that are sign language inclusive is not a
new field of study. Despite the field’s growth, few if any
investigations have been done on the editing and uploading of
videos to a website by the Deaf community. This project therefore
attempts to address this by carrying out co-design with the DCCT
community in order to design a video upload tool. There are a few
considerations, however, that need to be taken into account to
achieve this task successfully. The videos need to be
compressed into a smaller file size. This will help reduce upload
time and require fewer system resources and less bandwidth. The
frame size and frame rate can also be reduced. However, the
downside of this is quality loss. It is important that
intelligibility is maintained in the videos.
Therefore there is a need to co-design with the Deaf community in
order to deliver the requirements successfully with usability taken
into consideration. The proposed tool combines editing,
transcoding and uploading of videos into a single interface. It also
provides access to an archiving tool, which would allow them to
effectively manage current and future media.
Sign Language is a gesture-based language, and requires
recognition of hand signals and facial cues to understand. Adding
Sign Language to the interface would be helpful. This would
allow users to understand how the application works in their
preferred language. We looked at a few studies that
consider intelligibility, Sign Language inclusion and the
interface for such an application.
However, if files are renamed, deleted or moved within the
MXF files, this can corrupt the loose database structure
between the files.
MXF has numerous capabilities which are beneficial to media
management. MXF files are stored in a format that is
streamable and can be viewed while the file is being transferred
[23]. This is similar to video files loading on the internet, e.g.
on YouTube: the video does not need to be fully loaded before
playback can start. The streamable format minimizes the latency
of packets received in real time or broadcast, and allows data to
be transferred from one location to another without traffic
congestion, resulting in consistent performance [23]. MXF is able
to wrap any compression format, which makes it different from
other container formats [22].
Related Work
Requirements for Sign Language Videos
A major criterion for Sign Language videos is maintaining
intelligibility [2]. Transcoding a video without consideration for
frame size, frame rate and video quality can make Sign Language
communication ineffective [21]. Video compression requires
delving into the specifics of video components. The frame size or
resolution refers to the size of a video. High Definition (HD)
cameras have a frame size of 1920x1080 pixels, which makes for
a clearer and sharper image. This contributes to the size of the
video, making it larger. A frame size of 640x480 pixels is
considered to be Standard Definition (SD), which is better suited
for online streaming and low-end devices. Frame rate is measured
in frames per second (FPS). It is particularly important for the
visual message of Sign Language videos. Blurring would render a
video unwatchable, and frames that are skipped or dropped
lose important signing information and make the
message incoherent [1]. The bit rate is the “number of bits used
per unit of playback time after data compression” [1]. This is
linked to video compression, and in deciding which kind of codec
will be best for producing a good quality video with a file size that
is reasonably low. Currently, a commonly used video codec for
compression is the MPEG-4 Part 10 codec, also known as H.264.
This is used for high quality digital content such as Blu-ray discs
and for broadcasting HD content. Its advantage is that it provides
good quality at low bit rates, and therefore at a smaller file
size. It is also commonly used for streaming media online.
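The relationship between bit rate, duration and file size described above can be illustrated with a short calculation. This is only a sketch: the function name and the example bit rates are assumptions chosen for illustration, not values measured in this project, and container overhead is ignored.

```python
def estimated_size_mb(video_kbps: float, audio_kbps: float, duration_s: float) -> float:
    """Approximate file size in megabytes (1 MB = 10**6 bytes).

    Size is roughly (video bit rate + audio bit rate) * duration.
    Container overhead is ignored, so real files are slightly larger.
    """
    total_bits = (video_kbps + audio_kbps) * 1000 * duration_s
    return total_bits / 8 / 1_000_000


# Hypothetical 10-minute clip at two illustrative video bit rates.
high = estimated_size_mb(video_kbps=8000, audio_kbps=128, duration_s=600)
low = estimated_size_mb(video_kbps=1000, audio_kbps=128, duration_s=600)
# Setting audio_kbps to 0 models a video whose sound has been removed.
silent = estimated_size_mb(video_kbps=1000, audio_kbps=0, duration_s=600)
print(f"{high:.1f} MB, {low:.1f} MB, {silent:.1f} MB")
```

Under these assumed figures, lowering the video bit rate has a far larger effect on size than removing the audio track.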
3. REQUIREMENT ANALYSIS AND DESIGN
Architecture
An architectural pattern in Software Engineering is a way of
structuring a software system at a high level [16]. The
architecture used for the application is a layered architecture
with three layers: a presentation layer, a logic layer and a data
access layer. Within this, the MVC design pattern was applied. A
design pattern solves recurring problems in software development
and allows for further modular development [17].
Metadata for additional information about Digital Video
Metadata is simply data about the data. Metadata is needed by the
DCCT in order to identify videos more quickly and relate them to
the events held. According to [3], digital information can
be assigned to one of these categories:
• Descriptive - helps with resource identification and exploration
• Administrative - supports resource management
• Structural - binds complex objects of information together
Description of MXF format and its capabilities
MXF is the format currently used by the DCCT to record
event videos. It is therefore important to know the format as well
as its capabilities. MXF is described as a container for audio and
video data as well as rich metadata [22]. MXF is an open file
format and file transfer format that can be used by anyone, and it
is used mostly for television production. The metadata is used to
store descriptive information about the video and audio. For DCCT
event footage it can hold the title, date of the event,
description and author. Such metadata may not be supported by
other formats, but the MXF format makes it easier
to add rich metadata that can be adjusted dynamically [22].
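As an illustration of the descriptive metadata listed above (title, date of event, description and author), the sketch below models one record and serialises it to JSON. The class name, field names, sample values and the choice of JSON are assumptions made for this example; the paper does not specify how the tool stores metadata internally.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class VideoMetadata:
    """Descriptive metadata for one event video (hypothetical structure)."""
    title: str
    event_date: str   # ISO 8601 date string, e.g. "2015-09-12"
    description: str
    author: str


def to_json(meta: VideoMetadata) -> str:
    """Serialise the record so it can be stored next to the video file."""
    return json.dumps(asdict(meta), indent=2, sort_keys=True)


# Sample values are invented for illustration only.
example = VideoMetadata(
    title="Awareness campaign",
    event_date="2015-09-12",
    description="Opening address",
    author="System Administrator",
)
print(to_json(example))
```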
Figure 1 Three layered architecture diagram showing the
VideoUp tool components along with external systems.
require more time to produce high-quality software than the time
frame allowed. A decision was therefore taken to use separate
systems (third-party software) that can be integrated with the
video editing tool to provide a solution. The types of tests done
include white box testing, black box testing and performance
testing. User acceptance tests of the features’ performance
were also carried out, as an indicator of
whether or not the application performed as intended.
The presentation layer is the graphical user interface (view), where
the user enters metadata and supplies input video files and
subtitles. This is sent to the logic layer, where the controller
processes the information by passing the request to a
media framework, which then carries out the video editing request.
At the same time, the subtitle information and general project
information are saved to the data access layer (model) in the form
of a subtitle file or project file. External systems such
as the online video repository and the archiving tool work jointly
with the tool to provide the required functionality.
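The flow through the three layers described above can be sketched as follows. The actual tool was written in C#; this Python sketch only illustrates the separation of responsibilities, and every class, method and file name in it is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectModel:
    """Data access layer (model): holds what would be saved to a project file."""
    metadata: dict = field(default_factory=dict)
    subtitles: list = field(default_factory=list)


class EditController:
    """Logic layer (controller): stores project data and delegates media work."""

    def __init__(self, model, run_media_job):
        self.model = model
        # Stand-in for the media framework that performs the actual edit.
        self.run_media_job = run_media_job

    def handle_request(self, video_path, metadata, subtitles):
        self.model.metadata = dict(metadata)
        self.model.subtitles = list(subtitles)
        return self.run_media_job(video_path)


# Presentation layer stand-in: a real view would collect these from the user.
jobs = []
controller = EditController(
    ProjectModel(),
    run_media_job=lambda path: jobs.append(path) or "queued",
)
status = controller.handle_request("event.mxf", {"title": "Open day"}, ["Welcome"])
print(status, jobs)
```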
FUNCTIONAL REQUIREMENTS
The requirements include allowing users to trim a video clip,
add, remove and edit subtitles, remove audio from a video, convert
videos, upload videos to YouTube so that they can be linked to the
DCCT website, and archive videos. An additional cropping feature
was added, which allows a portion of the video to
be kept. Requirements were gathered from the
minutes of meetings between supervisors and clients. These
requirements were provided by our supervisor and further refined
by our clients over successive meetings.
Software Engineering - Design guidelines followed
Figure 3: System Development Life cycle
Figure 2 Use Case diagram of Video Up tool
An iterative approach was used for the development of the
system. The steps were: understanding the design and development
requirements; enquiring about the clients and their context
through prototypes and the minutes of each meeting; analysing the
requirements, the clients’ needs and the technology; developing a
prototype; and finally repeating the cycle until the result was
acceptable to the clients.
The use case diagram above summarizes the interactions between
users, functionality and systems. The system administrator is the
primary actor and DCCT members are classified as secondary users.
The inputs from the user to the tool can include the input video
for editing, information such as video details and video
adjustments, and a subtitle file for viewing with the built-in
Windows Media Player. The external systems are YouTube for
uploading and Drop it for archiving.
There were 6 iterations in total: the first 2 iterations
were paper and interactive prototypes and the other iterations
were development based. The prototypes produced were low
fidelity, in which sketches and images were presented to
the system administrator and Deaf volunteers. Two types of
prototype were designed: a paper-based prototype and an
interactive prototype generated with a program called
‘Balsamiq’. The interactive prototype was tested with the
Human Computer Interaction (HCI) lecturer and students. The
advantages of prototypes are that they are cheap to create and
easy to adjust, and that they provide developers with early user
testing of the proposed software.
3.1 NON-FUNCTIONAL REQUIREMENTS
Non-functional requirements are identified to ensure the system
will function correctly and efficiently. The following are some of
the requirements identified:
Legal and regulatory requirements
Ethics clearance was required from the Science Faculty, so that
user testing could be carried out legally and
ethically. Privacy of the videos from the general public is a
requirement given by the DCCT client. One of the issues raised
by the community was privacy: the availability of the videos to
the general public without accessing the DCCT website is undesirable.
The design strategy was either to develop code or find existing
open source software to build on. Development of code would
Therefore, making the video private means that the viewer would
have to get authorisation from the person who uploaded it. A
better suggestion, which was approved, was to use the
“unlisted” privacy control. Unlisted allows a viewer to watch the
video if they have the link to it. Additionally, if a person
searches for the video on YouTube it will not be found. These
videos are essentially “semi-locked”: viewable via a link but
undetectable otherwise.
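The “unlisted” choice described above corresponds to the `status.privacyStatus` field of a video resource in the YouTube Data API v3, whose accepted values are `public`, `private` and `unlisted`. The sketch below only builds such a request body; it performs no upload, and the function name and sample values are assumptions. The paper does not state which upload mechanism the tool used.

```python
def build_video_resource(title: str, description: str) -> dict:
    """Build the metadata part of a YouTube video upload request.

    Only the request body is constructed here; authentication and the
    actual upload call are outside the scope of this sketch.
    """
    return {
        "snippet": {"title": title, "description": description},
        # "unlisted": viewable by anyone with the link, not shown in search.
        "status": {"privacyStatus": "unlisted"},
    }


# Sample values are invented for illustration only.
body = build_video_resource("Open day 2015", "Event recording for the DCCT website")
print(body["status"]["privacyStatus"])
```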
content can amount to sizes greater than 1 GB, depending on the
type of video file (AVI, MXF, MP4 or WMV), sound quality
and length.
Minimum CPU requirements
Windows 7 Operating system.
Intel Core(TM) i3-2100 CPU @ 3.10 GHz / Intel Core(TM) i3-2120
CPU @ 3.30 GHz
RAM: 4 GB
64 bit
Performance requirements
• Trimming should succeed for any video.
• A trimmed video should start and end at the times specified by the user.
• Conversion of an input video to the fixed output format should be successful.
• Conversion should not result in damaged or missing video frames.
• Conversion should not result in corrupted video files.
• Bitrate should remain the same (speed on playback).
• Resizing of the area of interest should be easy for the user to specify and should not be slow (taking more than 2 seconds).
• Changing the resolution of videos should not result in corrupted video frames.
• The resolution of a video can be adjusted easily by the user and results in the desired output.
4. SYSTEM DEVELOPMENT AND
IMPLEMENTATION
The development of the application followed an iterative
approach. After completion of the first paper prototype, work
started with simple functions such as removing sound, transcoding
and trimming on the FFmpeg command line, before a
programming language such as Java, C# or Python was chosen for
development. The FFmpeg commands and the documentation provided
online made these tasks simple to execute. Resources
on these functions were not difficult to find online, as video
coding has been tackled by many developers around the globe.
At the beginning of the numerous visits to the DCCT,
communication with Deaf members was a challenge because of the
language barrier. Computer jargon was avoided when
explaining technical terms. Visual aids proved to be more effective
when explaining to Deaf users.
Backup policy
A backup was made every day and comments were made on each
iteration. Backups were made in order to prevent loss of information.
GitHub provided cloud storage for the application development,
tracking changes and keeping previous versions of the tool.
After consideration of the prototypes and feedback from the
clients, the development iterations followed.
Availability requirements
The DCCT website is made available via Afri-host. The website
has limited storage space available, which is why the videos are
kept on YouTube. The videos are made to be unlisted on
YouTube, and then they are shared onto the DCCT website.
On the first iteration, locating videos and sending them to a
desired destination within the computer or on external storage was
a vital basis for the software. An OpenFileDialog component
from the C# toolbox was used to make this task possible. Less
complicated tasks such as removing sound, adding metadata
and trimming a video at arbitrary intervals were integrated with
the C# user interface. Regarding trimming, the clients complained
that event recordings contain an hour or more of video content,
and suggested splitting the videos into smaller clips. The
duration of the videos results in a long upload time
from the local disk to the DCCT website. Separating a video
into more manageable clips is therefore beneficial
for both viewing and uploading. Overall this took
approximately 1-2 weeks of development.
Bandwidth and YouTube upload requirements
Videos uploaded to a YouTube channel are limited to 11 hours in
length [20]. Additionally, the videos can be at most 128GB in
size. To upload content that is longer than 15 minutes, a
verification process has to be followed. This process can also be
followed to increase the cap on file size, as the standard is 20GB.
These two factors (length and size) are important when it comes
to uploading a video. Bandwidth consumption for server-rendered
videos can range from about 800Kbps to 6Mbps for a high
quality, full-scale video. It is important that videos are trimmed,
and that the frame size is kept around 1280x720. This will help in
reducing the file size, the upload time and ultimately the
bandwidth required.
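The effect of file size on upload time can be estimated with simple arithmetic. The sketch below is illustrative only: the function name and the 2 Mbps uplink speed are assumptions rather than measurements from the DCCT connection, and protocol overhead is ignored.

```python
def upload_time_minutes(file_size_mb: float, uplink_mbps: float) -> float:
    """Estimate upload time in minutes for a file of the given size.

    Uses decimal units (1 MB = 8,000,000 bits, 1 Mbps = 1,000,000 bits/s)
    and ignores protocol overhead, so real uploads take somewhat longer.
    """
    if uplink_mbps <= 0:
        raise ValueError("uplink speed must be positive")
    bits = file_size_mb * 8_000_000
    return bits / (uplink_mbps * 1_000_000) / 60


# Hypothetical 1000 MB raw file versus a 200 MB converted file at 2 Mbps.
raw_minutes = upload_time_minutes(1000, 2)
converted_minutes = upload_time_minutes(200, 2)
print(f"{raw_minutes:.1f} min vs {converted_minutes:.1f} min")
```

Because upload time scales linearly with size, any reduction achieved by trimming or lowering the resolution carries over directly to the upload.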
On the second iteration, subtitling was made possible by adding
subtitles as a track to a video file. Users were limited to 4
resolution options (360p, 480p, 720p and 1080p), chosen from a
drop-down box; this prevents input errors, and these are
recognized standards used by YouTube. Video conversion initially
allowed the user to specify any output format, but this was
changed because conversions from MXF
to other formats using FFmpeg produced corrupted files. Conversion
from any input file to AVI was therefore decided on as the
solution. The uploading feature was enabled, allowing the
user to upload a video and share a link from YouTube.
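Restricting the resolution to the four options above maps naturally onto FFmpeg's `scale` video filter. The sketch below builds such a command as an argument list. The function and constant names are hypothetical, and `scale=-2:<height>` (which keeps the aspect ratio and an even width) is one common way to express this, not necessarily the exact command used in the tool.

```python
# The four options offered in the drop-down box, mapped to frame heights.
RESOLUTION_HEIGHTS = {"360p": 360, "480p": 480, "720p": 720, "1080p": 1080}


def build_scale_command(input_path: str, output_path: str, resolution: str) -> list:
    """Return an FFmpeg argument list that rescales a video to a preset height."""
    if resolution not in RESOLUTION_HEIGHTS:
        raise ValueError(f"unsupported resolution: {resolution}")
    height = RESOLUTION_HEIGHTS[resolution]
    # -2 asks FFmpeg to pick a width that preserves the aspect ratio
    # and is divisible by 2.
    return ["ffmpeg", "-i", input_path, "-vf", f"scale=-2:{height}", output_path]


print(" ".join(build_scale_command("event.mxf", "event.avi", "720p")))
```

Validating against a fixed set of options mirrors the purpose of the drop-down box: an invalid resolution never reaches the command line.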
Storage requirements
At least 500MB is needed for the installation files; this includes
the Video Up tool, the K-Lite codec pack, the Drop it setup files
and videos for the applications. At least 1GB is needed after the
K-Lite installation. It is recommended to have external storage
for raw input files, which can demand a lot of storage. After
conversion the file size can be reduced to about a fifth of the
original file. Raw video
On the third iteration some elements of the user interface were
repositioned and adjusted, and validation of user input was added.
The uploading functionality had a privacy issue
in the previous iteration; in this iteration it was adjusted to allow
viewing via the DCCT website using YouTube video links. An
additional feature for cropping videos was added in order to
remove unwanted portions of a video clip.
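Assuming that cropping here means keeping a rectangular area of interest within the frame, it can be expressed with FFmpeg's `crop` video filter, which takes the output width, output height and the top-left corner as `crop=w:h:x:y`. The function name and the sample values below are hypothetical.

```python
def build_crop_command(input_path: str, output_path: str,
                       width: int, height: int, x: int, y: int) -> list:
    """Return an FFmpeg argument list that keeps a width x height region
    whose top-left corner is at pixel (x, y)."""
    if width <= 0 or height <= 0 or x < 0 or y < 0:
        raise ValueError("crop region must have positive size and non-negative offsets")
    return ["ffmpeg", "-i", input_path, "-vf",
            f"crop={width}:{height}:{x}:{y}", output_path]


# Keep a 640x480 region starting 100 pixels from the left and 50 from the top.
print(" ".join(build_crop_command("event.avi", "signer.avi", 640, 480, 100, 50)))
```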
On the fourth iteration a better subtitle function was added,
which can add, edit and remove subtitles. FFmpeg was used
to integrate subtitles and video files. The archiving tool (Drop it)
was integrated with the Video Upload tool, allowing users to drag
and drop files and folders onto the tool to organise them.
Lastly, a navigation bar was added to contain
additional options such as saving projects, accessing the Drop it
archive and the subtitling functionality.
And many more audio and video formats
Provides lots of useful functionality, such as:
• Subtitle display
• Hardware acceleration (DXVA2 / NVIDIA CUVID / Intel QuickSync)
• Audio bitstreaming
• Video thumbnails in Explorer
• File association options
• Broken codec detection [8]
-WebMConverter - (converter)
A simple, freely available editor that converts videos into the
WebM format. Conversion to other common video formats also
produces the desired result. The application is modifiable.
However, conversion takes some time [9].
4.1 Tools and Technologies
The following are some of the technologies and tools that were
considered for the development of the application to meet the
requirements:
-Handbrake (transcoder)
Handbrake is an open source tool for converting video from nearly
any format to a selection of modern, widely supported codecs.
This transcoder is multiplatform software [10].
-WinFF (converter)
WinFF is open source, cross-platform software which makes
use of the FFmpeg framework. This software comes with a basic
user interface [11].
-FFmpeg - (encoder and decoder)
This is a cross-platform conversion tool that allows audio and
video formats to be streamed and converted from one format into
another. It consists of a command line tool that converts videos, a
media server for streaming, a media player and a multimedia
stream analyser capable of previewing subtitles with videos and
embedding them as one entity. This tool provides an easy-to-use
back-end, which requires that the front-end be developed by the
programmers who intend on using it [5].
-Mkvmerge (subtitling)
Mkvmerge is open source software used to combine different
media files and is capable of connecting their media streams to
one another. This software can easily add subtitles to videos
[12].
-Drop it (archiving tool)
Drop it is open source software used to automate the
organising of files and folders [13].
-Kaltura (Free trial -30 days) - (Online Plugin)
This is a video module available for Drupal. It integrates with
Drupal sites to provide video content, along with supporting audio
and images. It provides a number of features, and is easy to scale
depending on the frame requirements of a website. It has a trial
account that provides 10GB of free hosting and streaming.
Thereafter this module has to be purchased [6].
FFmpeg and Mkvmerge are both capable of adding subtitles to
video files. FFmpeg differs in its capability to embed subtitles,
while Mkvmerge differs in being able to preview subtitles whilst
the video is playing. Before the development of the application,
other converters were considered. These converters were
investigated based on requirements, programming language and
output format (AVI).
-YouTube editor - (Online editor)
This is an online editor to create, manage and upload videos all in
a single interface. It allows a user to perform basic editing and
publish on the YouTube platform. Simple transitions can be
added, and captioning can be done as well. The great thing about
this editor is the simplicity and the direct publishing. However,
this editor is only accessible online, so if a user has intermittent
connection or none at all, it becomes quite challenging [7].
A comparison was made between Handbrake, WinFF and
WebMConverter, as specified in the table below.
Table 1 Converter comparison of Handbrake, WinFF and
WebMConverter
-KLite codec - (codec)
The K-Lite codec pack is available in a variety of codec groups,
ranging from options that provide basic (essential) decoders to
a large set of advanced decoders. It supports playback of:
• AVI, MKV, MP4, FLV, MPEG, MOV, TS, M2TS, WMV, RM, RMVB, OGM, WebM
• MP3, FLAC, M4A, AAC, OGG, 3GP, AMR, APE, MKA, Opus, Wavpack, Musepack
• DVD and Blu-ray (after decryption)
Feature                     | Handbrake | WinFF (FFmpeg)      | WebMConverter (FFmpeg)
AVI support                 | no        | yes                 | yes
programming language        | C#        | Free Pascal/Lazarus | C#
upload to YouTube           | no        | no                  | no
convert without transcoding | no        | yes                 | yes
trim support                | yes       | no                  | yes
subtitle support            | yes       | yes                 | no
change resolution           | yes       | yes                 | yes
uploading of original files; only the final video produced needs
to be uploaded.
Disadvantages of offline applications
The disadvantages of offline applications are as follows. Packages
for a particular piece of software may not be available for the
operating system that a user is using. The packages may only have
a trial period in which the software can be used for a limited
time, thereafter requiring purchase. During the
development phase, the operating system
and software compatibilities need to be analysed carefully because
the software may have additional requirements. An
offline application also reduces availability, because it can only
be used on the computers where it is installed instead of on any
device with an internet connection.
After careful examination of the pros and cons of making an offline
application, an offline desktop application was
chosen.
We decided to use the WebMConverter as it is developed in a
commonly used language with sufficient documentation,
it passed the checklist of required features, and it has
an FFmpeg back-end that is compatible with Windows
operating systems. It was also advantageous for the developer to
use C#, as there was prior experience with the
language as opposed to no experience in Free Pascal/Lazarus.
4.2 Constraints and Considerations
There are constraints and considerations that need to be
highlighted when developing the software. An example is the
integration of applications.
Online plugins
The use of online plugins such as Kaltura makes it easier for
developers to update software remotely as opposed to an offline
application. The use of existing plugins might make integration
with the current Drupal website less complicated. The offline
application achieves integration with the Drupal platform by
passing the link of the video uploaded to YouTube from the
software to the Drupal website which the DCCT uses. Online
software does not need to be installed on hard drives and does not
require processing power from the local CPU. Video rendering tends
to require a large amount of memory, depending on the size of the
video used in the editing phase. Online plugins are, however,
limited in the extent to which they can be accessed and adjusted
by any developer.
Integration of different applications can be risky and may require
effort to make sure they function together. Involving third-party
software has limitations, as the code may be complex and
contain unnecessary features that can confuse clients, especially
Deaf users. The open source applications did not have sufficient
documentation, such as code comments, to help in understanding
pieces of code. Each piece of software has its own specific
requirements, and this can create compatibility problems,
especially if the client decides to use the program on several
computers. An example of such complications can arise between
32-bit and 64-bit Windows 7 computers, as the FFmpeg framework has
different resources for each.
Open source software
Online plugins may offer insufficient functionality or include
unnecessary features. Open source software may be used if the
choice of an offline application is taken. Open source software
is free and is accessible for any developer to modify and distribute.
Open source software is likely to have faults depending on how
reputable the service is to the online community [14]. Regarding
the management of open source software, it is difficult to identify
which contributor worked on a particular piece of code, making it
challenging to fix errors.
FFmpeg was unable to convert MXF files to FLV and MKV
without corrupting the video files. An alternative format
therefore had to be chosen that is web friendly, commonly used,
capable of being encoded and decoded by FFmpeg, and playable in
common media players; this was AVI.
Windows Media player vs VLC media player
Windows Media Player and VLC media player were the two
video players considered as the built-in video
player of the application. The advantage of using a VLC plugin is
that it is capable of playing various video formats such as AVI
and MP4, as well as audio formats that are hard to find support
for in other video players.
Although VLC is capable of playing a wide range of formats, the
wrappers that users worldwide have made for a C# VLC plugin
require a lot of development to reach a standard that would
satisfy both beginners and advanced users. There is also
a risk of bugs occurring in the available wrappers, and this is
one of the reasons for the alternative of using Windows Media
Player. The C# Windows Media Player control does not support some
video and audio formats that are regularly used nowadays. MXF
is a video format that is mainly used by HD cameras
nowadays, such as those used by the DCCT for recording; this causes a
Other options included the use of proprietary software, which could
have been costly and would have limited access to the code,
and developing the desktop application from scratch, for which the
time constraint was a major concern. An approach using
open source software was therefore the most viable.
Advantages of Offline application
The advantage of using an offline application is that very large
videos, which can require gigabytes or terabytes of storage,
do not need to be uploaded first
in order to be edited. Large videos take a considerably long
time to upload, and this is a problem as the available bandwidth
may not be suitable. Therefore having an offline application will
exclude the
problem because windows media player cannot play these
unsupported files. However a third party package called K-Lite
codec pack can enable recognition and playback within the
windows media player which is standard across all windows 7
operating systems. Therefore a final decision to implement
Windows media player was taken.
their interactions with each other through the system's layers.
These features are as follows, with their code and/or rationale:
Trimming
ffmpeg -i [input video name] -vf trim=[start time]:[end time]
[output video name]
Subtitles
Subtitles can be provided in several basic formats across various
video channels, including SubRip (.srt), SubViewer (.sbv or
.sub), MPsub (MPlayer subtitle), LRC and Videotron Lambda [1].
YouTube prefers the Scenarist Closed Caption format when a video
uses captions based on CEA-608 features [1]. However, considering
ease of use and compatibility, the SubRip format was chosen as
the primary format.
Furthermore, subtitles can be added to the videos as an external
file. Embedding subtitles into a video file would reduce issues of
file management, and this is possible with FFmpeg. Mkvmerge adds
the subtitle file as a track to the video. This greatly reduces the
time needed to add the subtitles to the video file, and also allows
subtitles to be edited after conversion. However, the subtitle file
has to be uploaded separately onto YouTube.
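As an illustration of the SubRip layout, each cue in an .srt file
consists of a running index, a time range written as
HH:MM:SS,mmm --> HH:MM:SS,mmm, the caption text and a blank line.
The sketch below is written in Python rather than the C# used by
the tool, and the function names srt_timestamp and srt_cue are
hypothetical; it is not the project's own code.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SubRip timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SubRip cue: index, time range, caption text, blank line."""
    if end <= start:
        raise ValueError("cue end time must be after its start time")
    time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
    return f"{index}\n{time_range}\n{text}\n\n"
```

Writing the cues one after another into a text file with the .srt
extension gives a subtitle file that can be kept next to the video
or uploaded separately.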
For trimming, FFmpeg allows the user to specify start and end
times so that a particular part of the input video (original
video) can be extracted and saved under another filename
(resultant video). The start and end times can be specified in
seconds, e.g. a start time of ‘6’, followed by a colon to denote
‘until’ and then the corresponding end time, e.g. ‘8’.
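A minimal sketch of how a trim command might be assembled from the
user's start and end times is given below. It is written in Python
rather than the C# used by the tool, the function name
build_trim_command is hypothetical, and the named options start
and end of FFmpeg's trim filter are used so that the meaning of
each value is explicit.

```python
from typing import List


def build_trim_command(input_name: str, start: float, end: float,
                       output_name: str) -> List[str]:
    """Return the FFmpeg argument list that keeps only start..end (seconds)."""
    if start < 0 or end <= start:
        raise ValueError("times must satisfy 0 <= start < end")
    # The command is a list of arguments, so file names that contain
    # spaces need no extra quoting when it is passed to a process API.
    return ["ffmpeg", "-i", input_name,
            "-vf", f"trim=start={start}:end={end}",
            output_name]
```

For the example in the text, build_trim_command("in.mp4", 6, 8,
"out.mp4") produces a command whose filter is trim=start=6:end=8.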
Removing of sound
ffmpeg -i [input video name] -c copy -an [output video name]
FFmpeg makes the removal of sound simple: the ‘-an’ flag removes
the audio. This is an optional feature in the tool and is not
executed automatically. The audio tracks can be separated from
the video without affecting video quality in any way. The
rationale for keeping it optional is that the Deaf community
could potentially post videos on the DCCT website that cater for
non-SASL speakers, where verbal communication is used alongside
SASL to explain what is being signed (working conjointly).
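Because sound removal is optional, the command can be built with
or without the -an flag. The sketch below is in Python rather than
the tool's C#, and the function name build_copy_command and its
remove_audio parameter are hypothetical.

```python
from typing import List


def build_copy_command(input_name: str, output_name: str,
                       remove_audio: bool = False) -> List[str]:
    """Build an FFmpeg command that copies the streams without re-encoding."""
    command = ["ffmpeg", "-i", input_name, "-c", "copy"]
    if remove_audio:
        # -an tells FFmpeg to leave every audio stream out of the output.
        command.append("-an")
    command.append(output_name)
    return command
```

Since -c copy does not re-encode the video stream, the picture
quality of the output is the same as that of the input.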
4.2 Recommended Course of Action
FFmpeg is driven from C#. A C# wrapper class sits behind the user
interface of the application and executes FFmpeg commands on the
command line. The UI captures all the user information, and this
input is used to create the FFmpeg command.
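The wrapper idea can be sketched as follows. The sketch is in
Python rather than the project's C#, and the class name
FFmpegRunner and its run parameter are hypothetical; it only
assumes that an ffmpeg executable can be found on the system path.

```python
import subprocess
from typing import Callable, Sequence


class FFmpegRunner:
    """Small wrapper that executes FFmpeg with arguments built from user input."""

    def __init__(self, executable: str = "ffmpeg",
                 run: Callable = subprocess.run) -> None:
        self.executable = executable
        # The process launcher can be replaced, e.g. by a fake in tests.
        self._run = run

    def execute(self, arguments: Sequence[str]) -> int:
        """Run FFmpeg with the given arguments and return its exit code."""
        command = [self.executable, *arguments]
        # A list of arguments is passed (no shell), so user-supplied file
        # names are never interpreted as shell syntax.
        completed = self._run(command, capture_output=True, text=True)
        return completed.returncode
```

An exit code of 0 indicates that FFmpeg finished without error;
any other value can be reported back to the user interface.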
Conversion of video
ffmpeg -i [input video name].[original extension] [output video
name].[destination extension]
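The conversion command only needs the input file and the
destination extension. A minimal sketch of deriving the output
name is shown below, in Python rather than the tool's C#; the
function name build_convert_command is hypothetical.

```python
from pathlib import Path
from typing import List


def build_convert_command(input_path: str,
                          destination_extension: str) -> List[str]:
    """Build an FFmpeg command that converts input_path to another format."""
    extension = destination_extension.lower().lstrip(".")
    if not extension:
        raise ValueError("a destination extension such as 'avi' is required")
    # Keep the file name and swap only the extension; FFmpeg picks the
    # output container from the extension of the output file name.
    output_path = Path(input_path).with_suffix("." + extension)
    return ["ffmpeg", "-i", input_path, str(output_path)]
```

For example, build_convert_command("clip.mxf", "avi") yields the
arguments ffmpeg -i clip.mxf clip.avi.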
FFmpeg offers basic to more complicated ways of converting video
from one format to another. The amount of time taken to do this
depends on multiple factors to be discussed later in the paper.
From the user's point of view, converting a video is as simple as
changing a file's extension: the new extension is specified in
the ‘destination extension’, e.g. mkv or avi. Conversion to
another format is key to obtaining web friendly formats such as
AVI, and it reduces the size of the video if the destination
format is more compressed.
Evaluation
The video editing and upload system was evaluated against the
different functional and non-functional requirements to ensure
that there are no undetected bugs in the written code. Error
handling is a very important part of the application because it
must cope with invalid data such as unwanted strings.
System testing was performed throughout the development of the
video upload tool. Two core testing methodologies were used.
Black-box testing was conducted to test the system functionality
without knowledge of its internal organisation and rationale. The
black-box methodology proved useful in testing the K-Lite Codec
Pack and FFmpeg to ensure that the packages returned the expected
results. The K-Lite Codec Pack showed that it allows the media
player component to play the variety of formats.
Changing Resolution
ffmpeg -i [input video name] -vf scale=[width]:[height]
[output video name]
To resize a video, FFmpeg uses a scale specifier that takes the
target dimensions, width first and then height. For example, 720p
is 1,280 pixels across the screen horizontally and 720 pixels
down the screen vertically, so in the command it is written as
scale=1280:720. Changing the resolution of a video affects the
dimensions of the video on a screen.
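A short sketch of building the resize command is given below, in
Python rather than the tool's C#; the function name
build_scale_command is hypothetical. FFmpeg's scale filter also
accepts -2 for one dimension, which asks it to derive that
dimension from the aspect ratio and keep it divisible by 2;
whether the tool used this option is not stated, so it appears
here only as an assumption.

```python
from typing import List


def build_scale_command(input_name: str, width: int, height: int,
                        output_name: str) -> List[str]:
    """Build an FFmpeg command that resizes a video to width x height."""
    if height == 0 or width == 0:
        raise ValueError("width and height must not be zero")
    # The scale filter expects the width first and the height second.
    return ["ffmpeg", "-i", input_name,
            "-vf", f"scale={width}:{height}",
            output_name]
```

The 720p example from the text is build_scale_command("in.mp4",
1280, 720, "out.mp4"), and a 480p output that keeps the aspect
ratio could be requested with a width of -2 and a height of 480.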
User acceptance testing was conducted with users to get feedback
on whether the functionality of the system works suitably or is
unacceptable. The results can be found on the user acceptance
form on the website.
Video uploading
Uploading made use of the YouTube API instead of FFmpeg, as it
uses a browser to connect to a user’s YouTube account. Uploading
speed depends on the bandwidth and on other tasks that use the
internet on the computer, for example automatic updates such as
antivirus and Windows updates. Currently DCCT uses a 1 Mbps
connection, which is slower than the previous 4 Mbps connection
from Telkom. Uploading the converted clips will require less time
than uploading the large original files. The users are able to
upload the different types of video that YouTube supports (e.g.
AVI) so that they can be linked to the community's website.
4.2.1 Video Editing Features
User interface interactions trigger the following commands to be
executed in the FFmpeg media framework. The following is the list
of commands according to feature and how each functions. The use
case diagram in figure 2 visualizes the following functions and
the architecture diagram shows the components and
The same video files were used across the 3 different formats,
except for MXF, because the other video formats could not be
converted to MXF using FFmpeg or other converter tools such as
HandBrake.
Archiving
Software called DropIt is used to archive the files that the
video upload application has received as input into a hierarchy of
organised and categorised folders according to the user’s
preference.
Results
The following results were deduced from performance testing:
5. RESULTS AND FINDINGS
Methodology
The experiments were run using PowerShell scripts, written in a
scripting language built on the .NET framework, which executed
the FFmpeg commands on the Windows command line. The results were
recorded in text files and, using another program
(Formatter.java), were formatted into a structure recognisable as
.csv files in order to produce graphs (See
http://people.cs.uct.ac.za/~mknmon007/development.html).
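The formatting step can be illustrated with a small sketch. It is
written in Python and is not the project's Formatter.java; the
layout of the recorded text files is not given in this paper, so
the sketch assumes, purely for illustration, that each line holds
a format name and a time in minutes separated by whitespace. The
function name results_to_csv is hypothetical.

```python
import csv
import io
from typing import Iterable


def results_to_csv(lines: Iterable[str]) -> str:
    """Turn 'format minutes' text lines into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["format", "minutes"])
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines and lines with an unexpected layout
        video_format, minutes = parts
        writer.writerow([video_format, float(minutes)])
    return buffer.getvalue()
```

The returned text can be saved with a .csv extension and opened
in a spreadsheet program to draw the graphs.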
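[Figure 4: bar chart. Vertical axis: conversion time (minutes);
horizontal axis: video formats (extensions) flv, mkv, mp4 and mxf.
Bar labels recovered from the chart: 4,358 (flv), 7,339 (mxf),
and 6,406 and 6,653 for mkv and mp4.]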
Figure 4 Bar graph showing average time of converting
formats to AVI file.
Performance testing of the FFmpeg algorithms was done in order
to test non-functional requirements. The following criteria were
considered: original video format, number of repetitions, size
range of video and time taken for the task. The size ranges are
categorised into small, medium and large, where small is
0-500MB, medium is 500MB-1.5GB and large is 1.5GB-3GB. However,
for the MXF video format, the medium and large size ranges are
500MB-4GB and 4GB-8GB respectively. The MXF size categories were
specified separately because MXF files generally require more
space for short duration clips. The specification of the CPU is
also important to note for this testing.
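The size categories can be written down as a small classification
rule. The sketch below is in Python and the function name
size_category is hypothetical. Two points are assumptions that
the text does not state: 1 GB is taken as 1000 MB, and a file
that lies exactly on a boundary is placed in the lower category.

```python
def size_category(video_format: str, size_mb: float) -> str:
    """Classify a test video as small, medium or large by its size in MB."""
    if size_mb < 0:
        raise ValueError("size must not be negative")
    is_mxf = video_format.lower().lstrip(".") == "mxf"
    # MXF uses wider ranges: medium up to 4 GB and large up to 8 GB.
    medium_limit = 4000 if is_mxf else 1500
    large_limit = 8000 if is_mxf else 3000
    if size_mb <= 500:
        return "small"
    if size_mb <= medium_limit:
        return "medium"
    if size_mb <= large_limit:
        return "large"
    raise ValueError("size is outside the ranges used in the tests")
```

Under these assumptions a 2000 MB file counts as large for FLV,
MKV and MP4 but as medium for MXF.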
Figure 4 shows that the FLV format takes, on average, the least
amount of time to convert to AVI, followed by MP4 and MKV, and
that MXF takes the longest. The FLV videos also had the smallest
total size (3826.17MB), so they contributed the least to the
duration of processing the files as a whole.
The common video formats considered for testing are MKV, MP4,
MXF and FLV. Each video is converted 10 times (repetitions). The
testing was done on 2 Windows computers. The independent
variable is the original video, supplied in the different
formats, and every format is converted to AVI in order to keep
the tests standardised.
Table 2 Table showing the size criteria, number of files
alongside the total space of size category of video formats for
performance tests
Size     .mp4               .flv               .mkv               .mxf
Small    4 files (18,3MB)   4 files (3,17MB)   4 files (18,3MB)   1 file (12,9MB)
Medium   4 files (1420MB)   4 files (433MB)    4 files (1300MB)   1 file (859MB)
High     4 files (5820MB)   4 files (3390MB)   4 files (5700MB)   1 file (4950MB)
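[Figure 5: bar chart. Vertical axis: size (MB); horizontal axis:
video formats (extensions). Saved space: flv 16672,43; mkv
38507,58; mp4 41982,00; mxf 45365,74. Gained space: flv 1,40; mkv
5,67; mp4 and mxf show no gain (0,00).]
Figure 5 Bar graph showing conversion space/Gain from
formats to AVI file from 10 iterations of each file.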
Converting MXF to AVI was projected to save the most space
(45 GB), followed by MP4 (42 GB), MKV (39 GB) and FLV (16 GB).
The tests also revealed that formats such as FLV and MKV can
slightly increase in file size after conversion has taken place.
Figure 5 also reveals that MKV gained the most space when
converted to AVI. However, the other results show that the MXF
and MP4 formats did not gain space with the sample used in the
conversion process.
Table 3 Space Saved from changing resolution of formats to
480p
Format   Space Saved (MB)
FLV      269,340
MKV      520,720
MP4      534,400
MXF      2017,700
Table 3 lists each format's saved space after the resolution
change. MXF saved the most space, approximately 2 GB, followed
by MP4 (534,40 MB), MKV (520,72 MB) and FLV (269,34 MB). The
results show that no space was gained for the MXF videos, which
are 1080p, or for the other formats, which had a resolution of
720p. This is evidence that a change to a lower resolution saves
space and an increase in resolution takes more storage. The 480p
resolution was chosen for the tests because the client considered
it viewable for sign language purposes. It is important to
highlight that reducing the resolution reduces the number of
pixels (picture elements on a screen) proportionally, and this
affects the clarity of the images within each video frame.
Clarity is important; otherwise, reducing the resolution further,
to smaller ones such as 240p, would yield even smaller file
sizes.
[Figure 6: bar chart. Vertical axis: time (minutes); horizontal
axis: video formats (extensions). Bar labels: flv 10,825; mkv
0,138; mp4 0,137; mxf 0,926.]
Figure 6 Bar graph showing average time of removing sound
from the formats.
According to Figure 6, the average time to remove audio was
smallest for MP4, followed by MKV, MXF and FLV in that order. The
results suggest that FLV files probably contain a large amount of
audio data, and may also indicate that the quality of the sound
is very high. The recordings show that MKV, MP4 and MXF generally
take less than a minute for sound to be removed successfully. FLV
tends to take more than ten times longer, making it the slowest
format for removing sound.
[Figure 7: bar chart. Vertical axis: total new size (%);
horizontal axis: video formats (extensions) flv, mkv, mp4 and
mxf; one bar per size category (High, Medium, Small).]
Figure 7 Percentage reduction of new files relative to original
format sizes in each size category (High, Medium and Small)
after sound removal.
[Figure 8: bar chart. Vertical axis: saved space (MB); horizontal
axis: video formats (extensions) flv, mkv, mp4 and mxf; one bar
each for the 1min trim and the 3sec trim.]
Figure 8 Trimmings of videos over time intervals showing
effects on space saved on the different formats.
Figure 8 shows that, for FLV, the result of a one-minute trim was
very similar in size to that of a 3-second trim. This means that
the length of the video is not the only contributor to the size
of an FLV file. MXF files showed a major difference because there
were fewer sample files than for the other formats.
Black-box tests were also run on trimming to investigate whether
the specified times extracted the desired video clip. The results
showed that the duration of each new clip was the end time minus
the start time specified by the user. The effect of trimming on
storage space depends on the duration of the cut video and on how
much the video and sound quality within the extracted clip
contribute relative to the original video file.
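The duration check used in the trimming tests amounts to a
subtraction of two times. A minimal sketch in Python is shown
below; the function names to_seconds and expected_clip_duration
are hypothetical, and times are assumed to be given either in
seconds or as MM:SS or HH:MM:SS.

```python
def to_seconds(timestamp: str) -> float:
    """Convert 'SS', 'MM:SS' or 'HH:MM:SS' into a number of seconds."""
    seconds = 0.0
    for part in timestamp.split(":"):
        # Each further field is 60 times smaller than the one before it.
        seconds = seconds * 60 + float(part)
    return seconds


def expected_clip_duration(start: str, end: str) -> float:
    """Return the duration a trimmed clip should have: end minus start."""
    duration = to_seconds(end) - to_seconds(start)
    if duration <= 0:
        raise ValueError("end time must be after start time")
    return duration
```

The value returned can be compared with the duration of the clip
that FFmpeg actually produced.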
Apart from MXF, the saved space was greatest for FLV across the
small, medium and large size categories. In the high, medium and
small categories, FLV saved approximately 3%, 2% and 5% more
space respectively than the other formats, excluding MXF. This
was expected, as FLV took the longest to have its audio removed,
suggesting that it possibly contained more audio data. The MXF
format was very interesting to note, as the new files performed
better as the categories went from High to Small. This means that
an MXF file in the Small category would save more space relative
to its original size than videos categorised as Medium or High.
User acceptance tests on the software were done (See
http://people.cs.uct.ac.za/~mknmon007/development.html).
The uploading and subtitling features were accepted by the DCCT
clients, and archiving was also accepted, as the DropIt software
managed to keep files organised within the storage locations
specified by the user on the internal/external hard drive.
Folders contained the correct files according to the management
instructions [13].
Limitations
The sample of videos used could have been larger per format. A
larger sample would have made the timing results more accurate,
but it would have required more testing time per function and
would have performed badly on the i3 computers. Embedding
subtitles made uploading to the YouTube server easy, but the
limitation is that subtitles burned into the video become a
single entity with it and cannot be changed once this is done.
Keeping the video and subtitles separate allows editing, but
requires two uploads to YouTube (video and subtitles) and mapping
the subtitles to the video again.
The use of Windows Media Player is limited in terms of
retrieving information from the time bar, which displays video
times in minutes and hours. The capturing of a specific
millisecond time therefore cannot be achieved in order to edit
specific portions of the video. The use of VLC media player would
have eliminated this problem. Subtitle adjustments to a video
whilst it was playing were not achieved; subtitles could only be
previewed once the video was loaded anew.
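[Figure 9: bar chart. Vertical axis: conversion time (min);
horizontal axis: video formats (extensions) flv, mkv, mp4 and
mxf; one bar each for the i3 and i5 computers.]
Figure 9 Conversion time between i3 4GB RAM computer
against i5 8GB RAM computer from formats to AVI.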
Since conversion had the greatest impact on saved space,
Figure 9 displays the speed of the two computers for this
feature. It shows that the i3 performed poorly against the i5
computer. This may be a concern for the clients’ computer,
especially when converting FLV files, which is almost 300 times
slower.
6. CONCLUSIONS
The project aimed to allow DCCT users to archive, edit and upload
their videos onto their website. To do this, software was built
following the software development life cycle (SDLC) in order to
meet these requirements and any other user requirements, such as
cropping, which arose during development. The results showed that
the conversion, sound removal, resolution change and trimming
features are capable of reducing file sizes. In particular, the
conversion feature saved the most space for the MXF format
(45 GB), followed by MP4 files (42 GB), over the total
iterations. This makes these two formats the most desirable to
record in or to provide as input to the application. However,
there were some limitations. The sample of videos was
considerably small, and the 10 test iterations per video could
have been more; this would have produced more data for more
rigorous tests of speed performance. The media player used for
previewing videos was Windows Media Player, whose limitations
affected the trimming and subtitling features: time is captured
from its time bar component, which did not allow times to be
captured to the millisecond. Trimming was useful because there
was an initial problem of videos of DCCT events taking a long
time to upload and manage on the DCCT website; trimming long
videos into shorter clips made storage more manageable. It was
interesting to note that conversion can be up to 300 times
quicker on an i5 8GB RAM computer than on i3 computers such as
the one used by the clients [19]. The performance testing between
the computers also confirmed the importance of considering CPU
speed, RAM and free memory when building offline video editing
tools.
Findings
Findings relating to size are as follows. The conversion
functionality saved the most space, especially on MXF; removal of
the audio track was second best, with the resolution change in
third place. Conversion was the most significant of the features
that do not depend on time. Trimming is a time-dependent feature,
as it depends on the user’s time preference. Therefore the best
way to maximise saved space is to use several of these features
together to achieve the optimum video size for uploading to
YouTube.
Moreover, with respect to output file size, the newly converted
files were seen to remain the same size over multiple iterations,
a size different from that of the original files; this indicates
that the algorithm used by FFmpeg is reliable. It was also
interesting to find that conversions to AVI can lead to gained
space for FLV and MKV files.
Findings relating to speed are that videos with a longer duration
took more time to convert than shorter video files. This is
explained by FFmpeg having to cycle over every frame of the
video, which means that the size of a video is not the only
variable affecting the speed of a function. The time taken to
convert a video does not remain constant across iterations. The
number of programs running on a computer and the amount of
resources they required affected the speed of conversion, as free
memory diminishes.
Future work could make the video editing tool a touchscreen
based application for the Deaf, since the Deaf users at DCCT
responded very well to the interactive prototypes produced. To
improve speed, Solid State Drives (SSDs) can be used on the
computers running the application [18].
As seen in figure 9, the time elapsed varied considerably between
the 2 computers depending on the CPU and RAM of the machine
used. On the i5 8GB RAM computer, overall feature performance
took far less time than on the i3 4GB RAM computer. For example,
large files took a maximum of 25 minutes to convert to AVI on the
i5, as opposed to over 24 hours on the i3 4GB RAM computer. This
confirms that more RAM and processing power result in speedups in
performance.
7. ACKNOWLEDGEMENTS
I would like to thank my supervisor, Professor Edwin Blake. I
would also like to thank Meryl Glaser for interpreting SASL so
that we could understand our targeted users, and to acknowledge
the system administrator and the Deaf users who volunteered to
design the software.
8. REFERENCES
[1] Erasmus, D. 2010. Video quality requirements for South
African Sign Language communications over mobile phones.
Department of Computer Science, Faculty of Science, University
of Cape Town. (2010).
http://people.cs.uct.ac.za/~edwin/MyBib/2012-erasmusthesis.pdf
[2] Olivrin, G. and van Zijl, L. 2008. South African Sign
Language Assistive Translation. Assist. Technol. Telehealth
(2008), 7–12.
[3] Wactlar, H. and Christel, M. 2002. Digital Video Archives:
Managing Through Metadata. National Digital Information
Infrastructure and Preservation Program. (2002).
[4] Upload subtitles and closed captions - YouTube Help: 2015.
https://support.google.com/youtube/answer/2734698?hl=en.
Accessed: 2015-10-19.
[5] Tomar, S. 2006. Converting video formats with FFmpeg.
Linux Journal. 2006, 146 (2006), 10.
[6] James, T. 2010. Drupal web services. Packt Pub.
[7] YouTube Video Editor - YouTube Help: 2015.
https://support.google.com/youtube/answer/183851?hl=en.
Accessed: 2015-11-09.
[8] About the K-Lite Codec Pack: 2015.
http://www.codecguide.com/about_kl.htm. Accessed: 2015-11-09.
[9] WebMBro/WebMConverter: 2015.
https://github.com/WebMBro/WebMConverter. Accessed: 2015-11-09.
[10] HandBrake: Features: 2015.
https://handbrake.fr/features.php. Accessed: 2015-11-09.
[11] WinFF - Truly Free Video Converter: 2015.
http://winff.org/html_new/documentation.html. Accessed:
2015-11-09.
[12] mkvmerge -- Merge multimedia streams into a Matroska file:
2015.
https://www.bunkus.org/videotools/mkvtoolnix/doc/mkvmerge.html.
Accessed: 2015-11-09.
[13] DropIt: Personal Assistant to Automatically Manage Your
Files: 2015. http://www.dropitproject.com/. Accessed:
2015-11-09.
[14] Setia, P. et al. 2012. How Peripheral Developers Contribute
to Open-Source Software Development. Information Systems
Research. 23, 1 (2012), 144-163.
[15] Glaser, M. and van Pletzen, E. 2012. Inclusive education for
Deaf students: Literacy practices and South African Sign
Language. Southern African Linguistics and Applied Language
Studies. 30, 1 (2012), 25-37.
[16] Vaishnavi, V. and Kuechler, W. Design science research
methods and patterns.
[17] Hasan, S. and Isaac, R. 2011. An integrated approach of
MAS-CommonKADS, Model–View–Controller and web application
optimization strategies for web-based expert system development.
Expert Systems with Applications. 38, 1 (2011), 417-428.
[18] Po, L. and Guo, K. 2007. Transform-Domain Fast Sum of the
Squared Difference Computation for H.264/AVC Rate-Distortion
Optimization. IEEE Trans. Circuits Syst. Video Technol. 17, 6
(2007), 765-773.
[19] iTechtics. 2013. Difference Between Intel Processor
Generations. http://www.itechtics.com/processor-generations/.
Accessed: 2015-11-09.
[20] Upload videos longer than 15 minutes - YouTube Help: 2015.
https://support.google.com/youtube/answer/71673?hl=en.
[21] Muir, L., Richardson, I. and Leaper, S. 2003. Gaze tracking
and its application to video coding for sign language. Pict.
Coding Symp. (2003), 23–25.
[22] Wells, N., Morgan, O., Wilkinson, J. and Devlin, B. 2006.
The MXF Book. Elsevier, Burlington.
[23] Wilkinson, J. 2006. MPEG-2-Long GOP Mapping for MXF File
Storage Applications. SMPTE Motion Imaging Journal. 115, 7-8
(2006), 241-247.