RecordAudio Walkthrough: C#
Recording an Audio Stream and Monitoring Direction

About This Walkthrough

In the Kinect™ for Windows® Software Development Kit (SDK) Beta, RecordAudio is a C# console application that demonstrates how to record an audio stream from the microphone array of the Kinect for Xbox 360® sensor and monitor the direction of the audio source. This document is a walkthrough of the RecordAudio application that is provided with the beta SDK.

Resources

For a complete list of documentation for the Kinect for Windows SDK Beta, plus related reference and links to the online forums, see the beta SDK website at:
http://kinectforwindows.org

Contents

Introduction
Program Basics
Create and Configure an Audio Source Object
Record the Audio Stream
Monitor the Beam Direction

License: The Kinect for Windows SDK Beta is licensed for non-commercial use only. By installing, copying, or otherwise using the beta SDK, you agree to be bound by the terms of its license. Read the license.

Disclaimer: This document is provided “as-is”.
Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

© 2011 Microsoft Corporation. All rights reserved.

Microsoft, DirectX, Kinect, MSDN, and Windows are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.

Introduction

The audio component of the Kinect™ for Xbox 360® sensor is a four-element microphone array. An array provides some significant advantages over a single microphone, including more sophisticated acoustic echo cancellation and noise suppression. By using beamforming algorithms, applications can use a microphone array as a directional microphone and focus on a particular audio source.

RecordAudio is a C# console application that demonstrates how to record an audio stream from the Kinect sensor’s microphone array and monitor the source direction. This document is a walkthrough of the RecordAudio application, which is provided with the Kinect for Windows® Software Development Kit (SDK) Beta.

For an example of how to implement a managed application to capture an audio stream from the Kinect sensor’s microphone array, see RecordAudio. For examples of how to implement a C++ application to capture an audio stream from the Kinect sensor’s microphone array, see “MicArrayEchoCancellation Walkthrough,” “AudioCaptureRaw Walkthrough,” and “MFAudioFilter Walkthrough” on the beta SDK website.

Program Basics

RecordAudio is installed with the Kinect for Windows Software Development Kit (SDK) Beta samples in %KINECTSDK_DIR%\Samples\KinectSDKSamples.zip. RecordAudio is a C# console application that is implemented in a single file, Program.cs.

Important: RecordAudio targets the x86 platform.

The basic program flow is as follows:

1. Create an object to represent the Kinect sensor’s microphone array.
2. Capture the audio stream and write it to a file.
3. Monitor the source direction.

To use RecordAudio

1. Build the application.
2. Press Ctrl+F5 to run the application.
3. Speak while you are moving side to side.

The following shows some sample output:

Recording for 20 seconds
Beam direction changed (radians): 0
Beam direction changed (radians): -0.175
Sound source position (radians): -0.217024366288511		Beam: -0.175
Beam direction changed (radians): -0.349
Sound source position (radians): -0.340237945622282		Beam: -0.349
Beam direction changed (radians): -0.175
Sound source position (radians): -0.14727806217808		Beam: -0.175

The remainder of this document walks you through the application.

Note: This document includes code excerpts, most of which have been edited for brevity and readability. In particular, most routine error-handling code has been removed. For the complete code, see the RecordAudio sample. Hyperlinks in this walkthrough refer to content on the Microsoft® Developer Network (MSDN®) website.

Create and Configure an Audio Source Object

The KinectAudioSource object represents the Kinect sensor’s microphone array. Behind the scenes, it uses the MSRKinectAudio Microsoft DirectX® Media object (DMO), as described in detail in “MicArrayEchoCancellation Walkthrough” on the beta SDK website.

Most of the sample is implemented in Main. The first step is to create and configure KinectAudioSource, as follows:

static void Main(string[] args)
{
    var buffer = new byte[4096];
    const int recordTime = 20; // seconds
    const int recordingLength = recordTime * 2 * 16000; // bytes
    const string outputFileName = "out.wav";

    // Run at the highest priority to avoid dropped samples.
    Thread.CurrentThread.Priority = ThreadPriority.Highest;

    using (var source = new KinectAudioSource())
    {
        // Adaptive beam without acoustic echo cancellation.
        source.SystemMode = SystemMode.OptibeamArrayOnly;
        source.BeamChanged += source_BeamChanged;
        ...
    }
    ...
}

RecordAudio first defines two constants that control the recording process:

• The recording time, which is set to 20 seconds.
• The recording length, in bytes, which is set to the product of the recording time, the sample size (2 bytes), and the number of samples per second (16,000).

To avoid dropped samples, RecordAudio sets the thread priority to ThreadPriority.Highest.

RecordAudio next creates and configures a KinectAudioSource object, which represents the microphone array. You configure KinectAudioSource by setting various properties, which map directly to the MSRKinectAudio DMO’s property keys. For details, see the API reference.

The RecordAudio application configures the KinectAudioSource object’s system mode as an adaptive beam without acoustic echo cancellation (AEC). Otherwise, RecordAudio uses default settings.

KinectAudioSource handles beamforming internally and provides the results to the application. To use beamforming, you must set KinectAudioSource.MicArrayMode to one of the following MicArrayMode values, which differ in how they direct KinectAudioSource to choose among multiple audio sources:

• MicArrayFixedBeam uses the center beam.
• MicArrayExternalBeam uses the beam that the application specifies.
• MicArrayAdaptiveBeam uses the beam that is closest to the direction that is specified by an internal source localization algorithm. This mode is enabled by default if you specify either Optibeam system mode.

RecordAudio uses the default MicArrayAdaptiveBeam mode.

Finally, RecordAudio subscribes to the KinectAudioSource.BeamChanged event, which is raised when the beam direction changes.

Record the Audio Stream

RecordAudio starts the audio stream, records it for 20 seconds, and writes the recorded stream to a .wav file, as follows:

static void Main(string[] args)
{
    ...
    using (var source = new KinectAudioSource())
    {
        ...
        using (var fileStream = new FileStream(outputFileName, FileMode.Create))
        {
            WriteWavHeader(fileStream, recordingLength);
            using (var audioStream = source.Start())
            {
                int count, totalCount = 0;
                // Copy the stream to the file until recordingLength bytes are written.
                while ((count = audioStream.Read(buffer, 0, buffer.Length)) > 0 && totalCount < recordingLength)
                {
                    fileStream.Write(buffer, 0, count);
                    totalCount += count;
                    if (source.SoundSourcePositionConfidence > 0.9)
                        Console.Write("Sound source position (radians): {0}\t\tBeam: {1}\r", source.SoundSourcePosition, source.MicArrayBeamAngle);
                }
            }
        }
    }
}

Before starting the recording process, RecordAudio creates a FileStream object to represent the output file and calls the private WriteWavHeader method to write the file’s .wav header. For details, see the sample.

RecordAudio then calls KinectAudioSource.Start, which starts the audio stream and returns the associated Stream object. The recording process is handled by the while loop, which calls Stream.Read to read the stream buffer by buffer and FileStream.Write to write the buffer to the output file. The loop then prints the source and beam directions if the source location’s confidence value is greater than 0.9. The loop terminates when the number of recorded bytes reaches the specified recording length or when Stream.Read returns no more data.

Monitor the Beam Direction

KinectAudioSource raises a BeamChanged event when the adaptive beamforming algorithm switches beams. RecordAudio handles the event and prints the new beam direction, as follows:

static void source_BeamChanged(object sender, BeamChangedEventArgs e)
{
    Console.WriteLine("Beam direction changed (radians): {0}", e.Angle);
}

The BeamChangedEventArgs object contains the current beam angle, in radians. From the perspective of a user facing the Kinect sensor, you interpret the angle as follows:

• 0: The beam is directly in front of the sensor.
• Positive angle: The beam is right of center.
• Negative angle: The beam is left of center.
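
The sign convention above can be illustrated with a small helper that converts a beam angle to degrees and names its side. This is a minimal sketch and is not part of the RecordAudio sample: the BeamAngle class, its ToDegrees and Describe methods, and the tolerance parameter are hypothetical names introduced here for illustration. It uses only the System namespace, so it runs without a sensor.

```csharp
using System;

static class BeamAngle
{
    // Converts an angle in radians to degrees.
    public static double ToDegrees(double radians)
    {
        return radians * 180.0 / Math.PI;
    }

    // Names the side of the sensor for a beam angle, using the convention
    // described above: 0 is center, positive is right, negative is left.
    // The tolerance value is an assumption made for this sketch.
    public static string Describe(double radians, double tolerance = 1e-6)
    {
        if (Math.Abs(radians) <= tolerance)
        {
            return "center";
        }
        return radians > 0 ? "right of center" : "left of center";
    }
}

static class Program
{
    static void Main()
    {
        // Beam angles taken from the sample output earlier in this document.
        foreach (var angle in new[] { 0.0, -0.175, -0.349 })
        {
            Console.WriteLine("{0} rad = {1:F1} degrees ({2})",
                angle, BeamAngle.ToDegrees(angle), BeamAngle.Describe(angle));
        }
    }
}
```

For the angles in the sample output, -0.175 and -0.349 radians correspond to roughly -10 and -20 degrees, both left of center. A handler such as source_BeamChanged could pass e.Angle to a helper like this one instead of printing the raw value.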

For More Information

For more information about implementing audio and related samples, see the Programming Guide page on the Kinect for Windows SDK Beta website at:
http://kinectforwindows.org