Cohen-Music-Note-Detection-App-hw6

HW#6
Cp-322
4/5/2012
Isaac Cohen
Fabian Michalczewski
Steven Peters
Ron Roscoe Sabale
I Pledge my honor that I have abided by the Stevens Honor System
Transparent Box Functional Diagram
The figure below, named the Transparent Box functional diagram, displays all the components
needed to create the Pitch Detection App. The inputs are either an MP3 file in the Android
library or music recorded through the microphone of the Android device. As can be seen, when
using the microphone we also pick up background noise along with the desired music. To improve the
pitch detection, the music signal will first be put through a noise filter to remove the undesired
noise. The signal will then be converted to the frequency domain, where the analysis to detect the
pitch is performed. After the pitches have been analyzed by the algorithm, the app can display the
information in several ways: it can simply display the notes, play the notes, save the notes to a
file for future use, or even display the notes being played on a piano. This gives the user a clear
interface and a flexible way to view the results.
Figure 1- Transparent Box Functional Diagram
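To make the flow of Figure 1 concrete, the sketch below outlines the stages as a Java interface. The interface and method names are illustrative placeholders only and are not part of the actual app design.

// Hypothetical outline of the processing stages shown in Figure 1.
// All names here are illustrative, not part of the finished app.
public interface PitchDetectionPipeline {

    // Input: raw samples, either decoded from an MP3 in the library
    // or captured from the device microphone.
    double[] acquireSamples();

    // Noise filter applied before any analysis.
    double[] filterNoise(double[] samples);

    // Conversion to the frequency domain and pitch analysis.
    String[] detectNotes(double[] filteredSamples);

    // Output options: display the notes, play them back, save them
    // to a file, or show them on a piano view.
    void presentNotes(String[] notes);
}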
Function-Means Tree Diagram
Upon detecting a sound or receiving a music file, our application will need to perform one of
a multitude of possible techniques in order to detect the pitch changes present in the music
and then present those pitch changes as variations in the notes being played. This general approach
should convert the music file into sheet music representative of that music. There are many
techniques available for the first step of detecting and representing pitches as data, which can
then be further converted into notes and chords on the musical scale. Depending on which algorithm
is used, we will obtain different results with various levels of accuracy, but by analyzing the
similarities and differences among those results we can arrive at a signal translation of maximum
correctness.
Figure 2- Function-means tree diagram
There are two primary ways we can generate the data we need from the music that is being
played. The first is time-domain analysis using the autocorrelation algorithm. The major goal of
this type of analysis is the identification of the fundamental frequencies of a signal by detecting
changes in polarity of the input signal, either from positive to negative or negative to positive;
it is at these transition points that candidate locations for the fundamental frequencies are
found. This is done by taking the signal and shifting it backward and forward in time, making use
of the assumption that a periodic signal shows strong similarity to itself in adjacent periods.
Passing a signal through the autocorrelation function and identifying the minimum values of that
function provides an initial list of frequency candidates, a number further reduced by finding the
polarity transition points. This technique works well over small frequency ranges because we need
to capture a window at least double the period of the incoming signal. In terms of our project,
this technique provides high-accuracy readings, especially for music segments with violent swings
in pitch, but it becomes unfeasible as a primary detector because it requires a multitude of
calculations; it can, however, work for small ranges.
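As an illustration of the time-domain approach, the sketch below shows a common autocorrelation variant that picks the strongest nonzero-lag correlation as the period estimate. It assumes mono samples in a double array and a known sample rate; the class and method names are our own placeholders, not the final implementation.

// A minimal autocorrelation pitch-estimation sketch.
public final class AutocorrelationPitch {

    // Returns an estimated fundamental frequency in Hz, or -1 if none found.
    public static double estimatePitch(double[] x, int sampleRate,
                                       double minHz, double maxHz) {
        int minLag = (int) (sampleRate / maxHz);   // smallest period to test
        int maxLag = (int) (sampleRate / minHz);   // largest period to test
        if (minLag < 1 || maxLag >= x.length) return -1;

        double bestValue = 0;
        int bestLag = -1;
        for (int lag = minLag; lag <= maxLag; lag++) {
            // Correlate the signal with a copy of itself shifted by 'lag' samples;
            // a periodic signal resembles itself one period later.
            double sum = 0;
            for (int n = 0; n + lag < x.length; n++) {
                sum += x[n] * x[n + lag];
            }
            if (sum > bestValue) {
                bestValue = sum;
                bestLag = lag;
            }
        }
        return bestLag > 0 ? (double) sampleRate / bestLag : -1;
    }
}

The nested loop over lags and samples is what makes this approach calculation-heavy, which is why the text above limits it to small frequency ranges.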
The other way we can analyze the music is through frequency-domain analysis, for which
we can employ one or more of three techniques: maximum likelihood analysis, the Harmonic
Product Spectrum, and hybrid cepstrum analysis. Each of these techniques converts the music
signal from the time domain to the frequency domain, and it is there that the processing is
carried out to produce an end result. In maximum likelihood analysis, the input signal is matched
against a set of idealized spectra in order to find the closest match between the input signal
and the idealized case. This analysis is limited in cases where a signal falls midway between two
pitches, which can cause problems in identification; octaves outside the expected pitch range also
produce more errors. We will attempt to compensate for these errors with the other methods.
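As a rough illustration of matching the input against idealized spectra, the sketch below scores each candidate fundamental by summing spectrum energy at its first few harmonics and returns the best match. This is a simplified stand-in for the full maximum likelihood computation; the magnitude spectrum, bin spacing (sample rate divided by DFT size), and candidate pitch list are assumed inputs, and the names are illustrative.

// Simplified "closest idealized spectrum" scoring sketch.
public final class SpectrumMatcher {

    // Score each candidate fundamental by summing spectrum energy at its
    // first few harmonics (the idealized spectrum is a set of harmonic spikes),
    // then return the best-scoring pitch in Hz.
    public static double bestMatch(double[] magnitude, double binHz,
                                   double[] candidateHz, int harmonics) {
        double bestScore = -1;
        double bestPitch = -1;
        for (double f0 : candidateHz) {
            double score = 0;
            for (int h = 1; h <= harmonics; h++) {
                int bin = (int) Math.round(h * f0 / binHz);  // nearest DFT bin for this harmonic
                if (bin < magnitude.length) {
                    score += magnitude[bin];
                }
            }
            if (score > bestScore) {
                bestScore = score;
                bestPitch = f0;
            }
        }
        return bestPitch;
    }
}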
Another technique we can use is the Harmonic Product Spectrum (HPS), which takes the
spectrum and starts by compressing it through downsampling. This isolates the fundamental
frequency of the signal because the same frequency content at higher orders is folded back onto
the fundamental by the downsampling. The downsampled copies are then multiplied together, which
makes the fundamental frequency stand out with relative ease. However, this technique works poorly
at low frequencies. Lastly, we can use cepstrum analysis on our data. The first part of the
analysis is to calculate the cepstrum by taking the DFT of the signal and examining it over a
limited range of values corresponding to the period of the sample. The result is then normalized,
and a probability algorithm with dynamic programming (based on a variety of factors) is used to
determine the pitch with the highest probability of being the one being heard. This works very
well at low frequencies and has been shown to be effective for the frequencies that encompass
speech.
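The sketch below illustrates the Harmonic Product Spectrum step described above. It assumes a magnitude spectrum has already been computed from the noise-filtered signal; the class name and the choice of order are illustrative.

// A minimal Harmonic Product Spectrum sketch.
public final class HarmonicProductSpectrum {

    // Downsample the spectrum by factors 2..order, multiply the copies together,
    // and return the frequency (in Hz) of the strongest product bin.
    public static double estimatePitch(double[] magnitude, double binHz, int order) {
        int length = magnitude.length / order;
        double[] product = new double[length];
        for (int k = 0; k < length; k++) {
            product[k] = magnitude[k];
            for (int d = 2; d <= order; d++) {
                // Downsampling by d maps the d-th harmonic back onto bin k,
                // so the harmonics reinforce the fundamental in the product.
                product[k] *= magnitude[k * d];
            }
        }
        int bestBin = 0;
        for (int k = 1; k < length; k++) {
            if (product[k] > product[bestBin]) bestBin = k;
        }
        return bestBin * binHz;
    }
}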
While all of these techniques have drawbacks, each has a particular strength that may
prove to be a valuable asset in our endeavor. Each analysis's weaknesses can be covered by the
strengths of another, and we feel that fusing all of the techniques into one program can produce
results of the highest clarity with minimal errors and maximum frequency flexibility. While each
will produce a different pattern, the patterns can be cleaned up and joined together to get as
close as we can to good results.
Android Architecture Diagram
The diagram below shows the major components of the Android operating system. Each
section is described in more detail below.
Figure 3- Android Architecture Diagram
Applications:
The top level is Applications and refers to the core applications on an Android device.
Applications include the calendar, email client, maps, browser, contacts, and phone functions,
among others. All applications are written in the Java programming language.
Application Framework:
Android is an open development platform, and thanks to this, developers are able to build
highly customized apps for whatever their needs are. Developers have access to the full range of
framework APIs used by the core applications. Underlying all applications is a set of services
and systems, including:
• A rich and extensible set of Views that can be used to build an application, including lists, grids, text boxes, buttons, and even an embeddable web browser
• Content Providers that enable applications to access data from other applications (such as Contacts), or to share their own data
• A Resource Manager, providing access to non-code resources such as localized strings, graphics, and layout files
• A Notification Manager that enables all applications to display custom alerts in the status bar
• An Activity Manager that manages the lifecycle of applications and provides a common navigation backstack
These services and systems are what enable Android applications to function and will be
utilized during the development of our app.
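As a small illustration of these framework services, the hypothetical Activity below uses the lifecycle managed by the Activity Manager and a View from the framework to show a detected note; the class name and displayed text are placeholders, not part of the finished app.

// Illustrative Activity sketch using the framework services listed above.
import android.app.Activity;
import android.os.Bundle;
import android.widget.TextView;

public class NoteDisplayActivity extends Activity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);      // lifecycle call driven by the Activity Manager
        TextView noteView = new TextView(this);  // a View from the framework's View set
        noteView.setText("Detected note: A4");   // placeholder; the real app would show analysis results
        setContentView(noteView);
    }
}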
Libraries:
Android includes a set of C/C++ libraries used by various components of the Android system.
These capabilities are exposed to developers through the Android application framework. Some
of the core libraries are listed below:
• System C library - a BSD-derived implementation of the standard C system library (libc), tuned for embedded Linux-based devices
• Media Libraries - based on PacketVideo's OpenCORE; the libraries support playback and recording of many popular audio and video formats, as well as static image files, including MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG
• Surface Manager - manages access to the display subsystem and seamlessly composites 2D and 3D graphic layers from multiple applications
• LibWebCore - a modern web browser engine which powers both the Android browser and an embeddable web view
• SGL - the underlying 2D graphics engine
• 3D libraries - an implementation based on OpenGL ES 1.0 APIs; the libraries use either hardware 3D acceleration (where available) or the included, highly optimized 3D software rasterizer
• FreeType - bitmap and vector font rendering
• SQLite - a powerful and lightweight relational database engine available to all applications
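For our purposes, these media and audio capabilities are reached through framework classes such as AudioRecord. The sketch below captures one block of raw microphone samples that could be fed to the noise filter and pitch detector; the 44.1 kHz sample rate and single-block read are illustrative choices, and the app would also need the RECORD_AUDIO permission.

// Sketch of capturing raw microphone samples through the framework's AudioRecord API.
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public final class MicCapture {

    public static short[] recordBuffer() {
        int sampleRate = 44100;  // assumed rate for pitch analysis
        int minSize = AudioRecord.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
                sampleRate, AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, minSize);

        short[] buffer = new short[minSize];
        recorder.startRecording();
        recorder.read(buffer, 0, buffer.length);  // one block of PCM samples
        recorder.stop();
        recorder.release();
        return buffer;                            // hand off to the noise filter / pitch detector
    }
}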
Android Runtime:
Android includes a set of core libraries that provides most of the functionality available in the
core libraries of the Java programming language. Every Android application runs in its own
process, with its own instance of the Dalvik virtual machine. Dalvik has been written so that a
device can run multiple VMs efficiently. The Dalvik VM executes files in the Dalvik Executable
(.dex) format which is optimized for minimal memory footprint. The VM is register-based, and
runs classes compiled by a Java language compiler that have been transformed into the .dex
format by the included "dx" tool.
The Dalvik VM relies on the Linux kernel for underlying functionality such as threading and
low-level memory management.
Linux Kernel:
Android relies on Linux version 2.6 for core system services such as security, memory
management, process management, network stack, and driver model. The kernel also acts as
an abstraction layer between the hardware and the rest of the software stack.
Source:
http://developer.android.com/guide/basics/what-is-android.html
http://androidsl.wordpress.com/2011/11/30/android-architecture/