DATA ACQUISITION FOR SPEECH RECOGNITION SYSTEM

Mitul Shah
B.E., Dharmsinh Desai University, India, 2008

PROJECT

Submitted in partial satisfaction of the requirements for the degree of MASTER OF SCIENCE in ELECTRICAL AND ELECTRONIC ENGINEERING at CALIFORNIA STATE UNIVERSITY, SACRAMENTO

FALL 2010

A Project by Mitul Shah

Approved by: Jing Pang, Ph.D., Committee Chair

Student: Mitul Shah

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the project.

Preetham Kumar, Ph.D., Graduate Coordinator
Department of Electrical and Electronic Engineering

Abstract of DATA ACQUISITION FOR SPEECH RECOGNITION SYSTEM by Mitul Shah

The Data Acquisition System is an integral part of a speech recognition system. Speech recognition systems are used for applications such as voice dialing and speech-to-text conversion. Acquiring the human voice and storing the data is a very important aspect of such a system, and the Data Acquisition System provides exactly that functionality. This project report gives background information on analog signal sampling techniques and the analog-to-digital converter (ADC). It discusses how the internal ADC of the AVR ATmega32A microcontroller is interfaced with the microphone through a pre-amplifier circuit. After the speech is sampled through the internal ADC of the ATmega32A, the report describes the Serial Peripheral Interface (SPI) and the interfacing of a serial flash with the SPI port of the AVR. Finally, Pulse-width Modulation (PWM) is described in the last section. The PWM waveform from the AVR ATmega32A is passed through a low-pass filter and then through a power amplifier to regenerate the original signal.
The AVR STK500 is the development board used to implement the hardware. The software for the AVR microcontroller is written in the C programming language. The project was successfully demonstrated using this board.

ACKNOWLEDGMENTS

First, I would like to express my appreciation to Dr. Jing Pang. She gave me an excellent opportunity to work on this project, which provided great exposure to the fields of audio processing and microcontrollers. I thank her for providing all the resources, help and guidance needed to complete the project successfully. Her knowledge and expertise in the field were very helpful to me in understanding the project and finishing it successfully. She gave me unmatched guidance on audio processing, the AVR STK500 development platform and testing strategies. Without her support, I would not have completed the work as successfully as I have.

I would like to thank the Atmel Corporation for providing excellent application notes on features such as Pulse Width Modulation, the sound recording system, and the Serial Peripheral Interface using AVR devices and the STK500. I also want to show my appreciation to the engineering forum www.avrfreaks.com for providing an excellent discussion platform on topics related to my project.

I would like to thank Dr. Preetham Kumar for reviewing my report and providing valuable suggestions that helped me improve it.

I would like to thank my parents and friends for giving me strength and motivation during the critical phases of this project over the last year.

Finally, I would like to thank the Department of Electrical and Electronic Engineering at California State University, Sacramento for providing the facilities and laboratory services that proved critical in completing my project successfully.
TABLE OF CONTENTS

Acknowledgments
List of Tables
List of Figures

Chapter
1. INTRODUCTION
   1.1 Introduction to Data Acquisition for Speech Recognition System
   1.2 Background of the Project
   1.3 Organization of Project Report
2. SYSTEM OVERVIEW
   2.1 Introduction
   2.2 Hardware Implementation
   2.3 AVR STK500 Development Platform
   2.4 Software Organization
3. AUDIO SAMPLING
   3.1 Analog to Digital Conversion
   3.2 Hardware
   3.3 Software
4. DATA STORAGE USING SERIAL FLASH MEMORY
   4.1 Serial Peripheral Interface in AVR ATMEGA32A
   4.2 Serial Peripheral Interface in the DataFlash W25Q80BV
   4.3 Software for the DataFlash Access
   4.4 Write to DataFlash
   4.5 Erase Operation
   4.6 Read Data from the DataFlash
5. AUDIO GENERATION USING PWM
   5.1 Pulse-width Modulation Waveform
   5.2 AVR ATmega32A Setup for PWM Mode of Operation
   5.3 Software Setup for Play Back
   5.4 Hardware Circuit
6. CONCLUSION
   6.1 Conclusion
   6.2 Future Work
Appendix - Implementation Software
References

LIST OF TABLES

1. Table 4.1 Data direction
2. Table 4.2 SPI Control Register in ATmega32A
3. Table 4.3 CPOL and CPHA states, functionality and SPI mode of operation

LIST OF FIGURES

1. Figure 2.1 Block diagram
2. Figure 2.2 Detailed hardware interface with AVR ATmega32A
3. Figure 2.3 AVR STK500 components
4. Figure 2.4 AVR STK500 block diagram
5. Figure 2.5 Software flow chart
6. Figure 3.1 Analog signal converted into a time discrete signal
7. Figure 3.2 Illustration of proper and improper sampling of a sinusoidal signal
8. Figure 3.3 Microphone pre-amplifier
9. Figure 3.4 Record function flow chart
10. Figure 3.5 ADC running in single conversion mode
11. Figure 4.1 Master - Slave interconnection for SPI data transfer
12. Figure 4.2 The SPI data transfer format for CPHA = 0 (Mode 0 and 2)
13. Figure 4.3 The SPI data transfer format for CPHA = 1 (Mode 1 and 3)
14. Figure 4.4 Pin diagram of the DataFlash W25Q80BV
15. Figure 4.5 Write to DataFlash flow chart
16. Figure 4.6 Chip erase function flow chart
17. Figure 4.7 DataFlash read flow chart
18. Figure 5.1 A typical PWM waveform
19. Figure 5.2 Frequency spectrum of a typical PWM waveform
20. Figure 5.3 Low-pass filter applied to the PWM waveform
21. Figure 5.4 Timing diagram for Phase Correct PWM Mode
22. Figure 5.5 Bitwise description of TCCR0
23. Figure 5.6 Play back function flow chart
24. Figure 5.7 Fifth-order, Stagger-tuned Chebychev low-pass filter circuit
25. Figure 5.8 AC analysis of the filter circuit
26. Figure 5.9 The power amplifier circuit

Chapter 1
INTRODUCTION

1.1 Introduction to Data Acquisition for Speech Recognition System

The data acquisition system is an embedded solution that can record a speech signal and play the same speech back. It is implemented as a part of a speech recognition system. Speech recognition applications span a wide range of fields including, but not limited to, voice dialing, appliance control and speech-to-text processing. Obtaining voice samples from the user and storing them for later reference plays a very important part in successfully completing the speech recognition system. Moreover, many other sound processing systems, such as voice coders, sound effect producers and telephone answering machines, can also make use of the data acquisition system discussed in this report.
1.2 Background of the Project

This project report describes the process of sound recording, data storage in external memory, and playback of the original speech signal using the AVR STK500 development board from Atmel, the AVR ATmega32A microcontroller, the W25Q80BV serial flash memory from Winbond Corporation, and a few additional components such as the LM324 (general-purpose operational amplifier IC), the LM386 (power amplifier IC), resistors and capacitors.

The AVR STK500 is an AVR Studio-compatible development platform. It has two RS-232 interfaces to the PC: one for programming and control and the other for general use. It supports parallel and serial high-voltage programming as well as serial In-System Programming (ISP) of AVR devices. Other features include 8 push buttons and 8 LEDs for general use, pin header connectors for all AVR I/O ports, several clock options, and supply voltage options for the target AVR microcontroller [2].

The ATmega32A is a high-performance, low-power 8-bit microcontroller with an advanced RISC architecture, used here in its 40-pin PDIP package. It includes 32 Kbytes of in-system self-programmable flash program memory and 1024 bytes of EEPROM. The peripherals in this AVR microcontroller include two 8-bit timer/counters, one 16-bit timer/counter, four PWM channels, a 10-bit ADC with single-ended and differential channels, a master/slave SPI serial interface, an analog comparator, a serial USART and a two-wire serial interface. The supply voltage requirement for the ATmega32A is 2.7 V to 5.5 V, and the speed grade goes up to a 16 MHz clock frequency [1].

The W25Q80BV DataFlash is a low-power, serial-interface flash memory. It provides 8 Mbit of memory in an 8-pin PDIP package. The W25Q80BV uses a Serial Peripheral Interface (SPI) to sequentially access its data. This interface simplifies hardware layout, increases system reliability, minimizes switching noise, and reduces package size and active pin count.
The DataFlash supports the standard Serial Peripheral Interface (SPI) and a high-performance Dual/Quad I/O SPI with SPI clock frequencies up to 104 MHz, which is equivalent to 208 MHz (104 MHz x 2) for Dual I/O and 416 MHz (104 MHz x 4) for Quad I/O. The device operates from a single power supply (2.7 V to 3.6 V) for both write and read operations, with current consumption as low as 4 mA in active mode and 1 µA in power-down mode. Moreover, JEDEC standard manufacturer and device identification is supported, along with a 64-bit unique serial number for the device. Its serial interface is compatible with SPI Modes 0 and 3, so it can easily be interfaced to the AVR microcontroller [4].

1.3 Organization of Project Report

The report covers the project in detail. The project is divided into three sections: recording the speech, storing the data, and playing it back. Each section is covered in its own chapter.

Chapter two describes the overall hardware arrangement, from the block diagram to the hardware implementation, including the pin connections with the AVR device. It also gives information about the development platform and presents the main loop of the program used in the project.

Chapter three discusses the speech sampling procedure. The sampling criteria and procedure are presented briefly. The AVR ATmega32A has an internal analog-to-digital converter, and its setup is described in this chapter. The interfacing of the microphone with the AVR microcontroller through a pre-amplifier is also covered in the same chapter.

Chapter four covers the Serial Peripheral Interface (SPI) between the AVR ATmega32A and the serial DataFlash W25Q80BV. The algorithms to program, read, and erase the DataFlash are also discussed.

Chapter five describes the Pulse Width Modulation (PWM) waveform. The AVR ATmega32A's internal PWM channel is used to reproduce the original analog signal.
The low-pass filter circuit and the speaker amplifier circuit are also described in the later part of this chapter.

Chapter six concludes the report and gives a perspective on future work.

Chapter 2
SYSTEM OVERVIEW

2.1 Introduction

Figure 2.1 shows the basic block diagram of the sound recording system. It consists of four major building blocks:

Recording Section: a microphone and a pre-amplifier circuit.
AVR ATmega32A: the computational part of the system.
DataFlash W25Q80BV: 8 Mbit of data storage.
Play Back Section: a low-pass filter, a power amplifier circuit and a speaker.

Each of these blocks is described in the following chapters.

Figure 2.1 Block diagram

2.2 Hardware Implementation

Figure 2.2 shows the detailed hardware interface with the AVR ATmega32A microcontroller. As shown in the figure, the AVR device is operated with an 8 MHz crystal oscillator. The user operates the system with three pushbutton switches (Erase, Record, and Playback) connected to pins 0 to 2 of Port D of the microcontroller.

Figure 2.2 Detailed hardware interface with AVR ATmega32A (connections shown: microphone pre-amplifier to PA0 (ADC0); PB3 (OC0) to the speaker filter circuit; PB4 (SS#), PB5 (MOSI), PB6 (MISO) and PB7 (SCK) to CS#, SDI, SDO and SCK of the W25Q80BV; Erase, Record and Playback pushbuttons on PD0, PD1 and PD2; 8 MHz crystal with 22 pF capacitors on XTAL1 and XTAL2)

The W25Q80BV flash memory is connected to the AVR microcontroller's SPI bus on Port B. To convert the speech signal into digital samples, the internal ADC of the microcontroller is used. Channel 0 of the ADC is set up to accept the analog input from the microphone pre-amplifier. For playback, PWM channel 0 (OC0) is used. It is connected to the speaker filter circuit, which smooths out the high-frequency PWM output from the AVR device.

2.3 AVR STK500 Development Platform

Figure 2.3 AVR STK500 components [2]

Figure 2.3 shows the AVR STK500 board components.
Some important components to note here are the user LEDs, the user pushbutton switches, the I/O port headers, the ISP programming header, the headers for high-voltage programming and the option-setting jumpers. Other significant parts are the two RS-232 interface sockets, the status LED and the power LED [2].

Figure 2.4 AVR STK500 block diagram [2]

Figure 2.4 shows the block diagram of the same development board. The board is divided into two sections:

Control Section: This section contains a control microcontroller. It provides control and programming support for the target AVR device. The board settings allow the user to select the operating voltage of the target section, the analog reference voltage of the ADC in the target device, and the frequency of the on-board programmable oscillator [2].

Target Section: This is the section of the AVR STK500 board where the target AVR device is inserted for the user application [2].

2.4 Software Organization

Figure 2.5 Software flow chart

Figure 2.5 shows the main loop used in this project. The SPI bus and I/O ports are set up as required. Once the ports are set up, the user controls the operation of the system with the three pushbutton switches connected to Port D (see Section 2.2). The software polls in a loop for a pressed switch. If the "Record" button is pressed, it jumps to the record section of the software and stays there until the button is released. The same is true for the other two procedures, Chip Erase and Play Back.

Chapter 3
AUDIO SAMPLING

3.1 Analog to Digital Conversion

Before the analog speech signal from the microphone can be processed in the AVR microcontroller or stored in the DataFlash, it has to be converted into a digital signal. This is done in multiple steps. First, the analog signal is converted into a time-discrete signal by taking periodic samples (Figure 3.1).
Figure 3.1 Analog signal converted into a time discrete signal [6]

The Nyquist-Shannon sampling theorem states that, for a continuous signal to be sampled properly, it must not contain any frequency component higher than one-half of the sampling rate. The time interval between two samples is called the "sampling period" and its reciprocal the "sampling frequency". Thus, according to the sampling theorem, the sampling frequency has to be at least double the maximum signal frequency.

If the minimum sampling frequency criterion is not satisfied, any frequency component above half of the sampling frequency becomes indistinguishable from a lower-frequency component. This is called an "alias". Such an aliased signal cannot be uniquely recovered from its samples. This is illustrated in Figure 3.2, where (a), (b) and (c) show proper sampling, while (d) shows that when the analog signal frequency is higher than one-half of the sample rate, the original signal cannot be reconstructed from the samples [6].

Figure 3.2 Illustration of proper and improper sampling of a sinusoidal signal [6]

The speech signal is band-limited to 3000 Hz, i.e., most of its information is contained between DC and 3000 Hz. The sampling frequency therefore has to be at least 6000 Hz. Moreover, a low-pass filter can be placed ahead of the converter to suppress any component above 3000 Hz and thereby reduce aliasing when the signal is reproduced from its samples. In this project, a simple first-order RC filter is used. It does not have a very steep roll-off above the cut-off frequency, so the sampling frequency is chosen to be much higher than the minimum of 6000 Hz [3].

The second step in the process is to determine the digital value that best represents each analog sample. This is called "quantization". An analog sample can take any value from a continuous, and therefore infinite, range of values.
Quantization is the process of approximating the actual analog sample value by the nearest value from a finite set of values. This approximation introduces an error, called the "quantization error". The quantization error depends on the "resolution" of the system and the "dynamic range" of the analog signal. Resolution is defined as the number of available digital values, and it is limited. For example, an 8-bit system has a resolution of 256 levels. There are two distinct methods to reduce the quantization error: either adjust the reference voltages (AGND and AREF) of the A/D converter to match the dynamic range of the analog signal, or adjust the dynamic range of the analog signal to cover the full input range of the A/D converter. The latter method improves the signal-to-noise ratio (SNR) and is therefore preferred [3].

3.2 Hardware

In this project, a condenser-type electret microphone is used. An electret condenser microphone is built around a capacitor: sound pressure moves one of the capacitor's conductors, and the resulting charging and discharging produces the electrical signal. The microphone produces a very small output signal, so if it were connected directly to the A/D converter, the signal would not swing through the full input range of the A/D converter of the AVR microcontroller and would result in a large quantization error. The microphone output therefore has to be amplified to a level that covers the full dynamic range of the analog-to-digital converter.

Figure 3.3 Microphone pre-amplifier

Figure 3.3 shows the microphone pre-amplifier design. It is implemented using an LM324 operational amplifier in inverting mode and an RC filter. R8 (10 kOhm) is used as a current-limiting resistor to power the microphone, and C2 (10 uF) blocks any DC voltage from reaching the op-amp input. The gain of the amplifier is (R5/R1), and it is set with the variable resistor R5 (200 kOhm).
As the system operates on a single power supply, a virtual ground is created by adjusting R3 (20 kOhm) to the point where the ADC in the AVR device gives a mid-scale output of about 512 in decimal, i.e., the ADC Data Register reads approximately 10_0000_0000 in binary. C5, the capacitor in the feedback loop, introduces a roll-off at high frequencies, giving better stability since the gain is very high. The RC filter is a simple first-order filter with a cut-off frequency of approximately 3000 Hz.

3.3 Software

Figure 3.4 Record function flow chart

The ATmega32A microcontroller has an on-chip 10-bit A/D converter (ADC), which is used to obtain the digital voice signal from the microphone. The ADC inputs of the ATmega32A are on PORTA, so PORTA is configured as a high-impedance input. The conversion rate is set at approximately 15,686 samples per second. To achieve this sampling rate on a system operating at a clock frequency of 8 MHz, the ADC Data Register (ADCH and ADCL) has to be read every 510 cycles (15,686 Hz x 510 cycles is approximately 8 MHz). The ADC Prescaler Select bits (ADPS[2:0]) in the ADC Control and Status Register A (ADCSRA) are set to 5, which sets the ADC clock division factor to 32.

The ADC is set to operate in single conversion mode: the ADC Auto Trigger Enable (ADATE) bit in ADCSRA is cleared, and each conversion is started by setting the ADC Start Conversion (ADSC) bit. Each conversion, except the first, takes 14 ADC cycles. The first conversion takes about 832 oscillator cycles; each subsequent conversion takes only 448 cycles (14 x 32 = 448 cycles).

Figure 3.5 ADC running in single conversion mode [1]

Setting the ADC Interrupt Enable (ADIE) bit in ADCSRA enables the ADC Conversion Complete interrupt. The interrupt service routine performs a delay loop to fill the missing 62 cycles (510 - 448 = 62 cycles).
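The timing figures quoted above can be cross-checked with a few lines of C. This is only a sketch: the constant and function names are hypothetical, and the numbers (8 MHz clock, 510-cycle sample period, prescaler 32, 14 ADC cycles per conversion) are taken from the text as stated rather than re-derived from the datasheet.

```c
#include <stdint.h>

/* Figures quoted in the text; the names are illustrative only. */
#define F_CPU_HZ            8000000UL  /* system clock frequency          */
#define SAMPLE_PERIOD_CYC   510UL      /* CPU cycles between two samples  */
#define ADC_PRESCALER       32UL       /* selected by ADPS[2:0] = 5       */
#define ADC_CYCLES_PER_CONV 14UL       /* per conversion, as stated above */

/* Sampling rate in Hz, rounded down to an integer. */
uint32_t sample_rate_hz(void)
{
    return F_CPU_HZ / SAMPLE_PERIOD_CYC;
}

/* CPU cycles consumed by one (non-first) conversion. */
uint32_t conversion_cycles(void)
{
    return ADC_CYCLES_PER_CONV * ADC_PRESCALER;
}

/* CPU cycles the interrupt routine has to pad per sample period. */
uint32_t padding_cycles(void)
{
    return SAMPLE_PERIOD_CYC - conversion_cycles();
}

/* Division factor selected by ADPS[2:0]; codes 0 and 1 both select 2. */
uint32_t adc_prescaler_from_adps(uint8_t adps)
{
    adps &= 0x07u;
    return (adps == 0u) ? 2UL : (1UL << adps);
}
```

With these inputs the helpers reproduce the 15,686 Hz rate, the 448-cycle conversion time and the 62-cycle gap used in the text.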
Although the internal ADC produces a 10-bit output for every sample, it is sufficient to read only the 8 most significant bits and discard the 2 least significant bits. The ADC is therefore set to produce a left-adjusted result by setting the ADC Left Adjust Result (ADLAR) bit in the ADC Multiplexer Selection (ADMUX) register. Since the result is left adjusted and only 8 bits of information are required, it is sufficient to read only ADCH from the ADC Data Register. The following code snippet shows the ADC settings:

    // ADC setup subroutine
    void setup_adc(void)
    {
        DDRA = 0x00;             // set PORTA as a high-impedance input port
        ADMUX |= (1<<ADLAR);     // ADC result is left adjusted
        // ADC enabled, conversion complete interrupt enabled, clock division factor 32
        ADCSRA |= (1<<ADEN) | (1<<ADIE) | (1<<ADPS2) | (1<<ADPS0);
    }

    // ADC conversion complete interrupt service routine
    interrupt[ADC_INT] void data_ready(void)
    {
        unsigned char sample;
        static unsigned long int byte_counter = 0;

        sample = ADCH;           // only the 8 MSBs are required, so reading ADCH is sufficient
        PORTD = sample;          // copy the sample to PORTD
        delay_us(7);             // fill the 62-cycle gap
        ADCSRA |= (1<<ADSC);     // start a new ADC conversion
        DF_page_write(byte_counter, sample);   // write the sample into the DataFlash
        byte_counter++;
    }

Chapter 4
DATA STORAGE USING SERIAL FLASH MEMORY

4.1 Serial Peripheral Interface in AVR ATMEGA32A

The AVR ATmega32A supports a master/slave SPI serial interface. Some SPI features of this AVR microcontroller are the following [1]:

Three-wire synchronous data transfer
Master/slave operation
Programmable serial clock rates
End of transmission interrupt flag

The Serial Peripheral Interface (SPI) supports very high-speed synchronous data transfer between the AVR ATmega32A and peripheral devices. Figure 4.1 shows the connection between master and slave SPI devices [1].
Figure 4.1 Master - Slave interconnection for SPI data transfer [1]

SPI specifies the following four signals:

MOSI = Master Output Slave Input
MISO = Master Input Slave Output
SCK = Clock Signal
SS# = Slave Select (Active Low)

When the SPI is enabled, the data direction of the MOSI, MISO, SCK and SS# pins is overridden according to the following table.

Pin     Direction, Master SPI     Direction, Slave SPI
MOSI    User Defined              Input
MISO    Input                     User Defined
SCK     User Defined              Input
SS#     User Defined              Input

Table 4.1 Data Direction [1]

The following table describes the bit configuration of the SPI Control Register (SPCR).

Bit Position   Name                                          Description
7              SPIE (SPI Interrupt Enable)                   If the SPIF bit in the SPSR register is set and global interrupts are enabled, this bit causes the SPI interrupt to be executed.
6              SPE (SPI Enable)                              This bit enables SPI operation when it is set.
5              DORD (Data Order)                             The MSB is transmitted first when this bit is set to zero. Otherwise, the LSB is transmitted first.
4              MSTR (Master/Slave Select)                    When it is set, the device operates in master SPI mode; when it is written to zero, the device operates in slave SPI mode.
3              CPOL (Clock Polarity)                         SCK is high in idle mode when this bit is set. When this bit is written to zero, SCK is low in idle mode.
2              CPHA (Clock Phase)                            This bit determines whether data is sampled on the leading or the trailing edge of SCK, and set up on the other edge.
1, 0           SPR1, SPR0 (SPI Clock Rate Select 1 and 0)    These bits control the SCK rate of the device configured as a master.

Table 4.2 SPI Control Register in ATmega32A [1]

The AVR ATmega32A has the SPI Data Register (SPDR), a read/write register used to transfer data between the master and slave devices. Writing to this register starts the SPI data transmission. Reading SPDR causes the shift register receive buffer to be read [1].

The combination of the CPOL and CPHA parameters determines the SPI mode of operation. The master device must adhere to the mode supported by the slave device.
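As a rough illustration of how CPOL and CPHA combine with the other SPCR bits listed in Table 4.2, the sketch below composes an SPCR value for master operation in a given SPI mode. It assumes the conventional numbering mode = 2 x CPOL + CPHA; the `_BIT` names and both functions are hypothetical and are kept separate from the device header so the code can be compiled on a PC.

```c
#include <stdint.h>

/* Bit positions of SPCR as listed in Table 4.2 (illustrative names). */
enum {
    SPR0_BIT = 0, SPR1_BIT = 1, CPHA_BIT = 2, CPOL_BIT = 3,
    MSTR_BIT = 4, DORD_BIT = 5, SPE_BIT  = 6, SPIE_BIT = 7
};

/* SPCR value for: SPI enabled, master, MSB first, interrupt disabled,
 * SPR1:SPR0 = 00, and the requested SPI mode (0..3). */
uint8_t spcr_master_value(uint8_t spi_mode)
{
    uint8_t cpol = (uint8_t)((spi_mode >> 1) & 1u);
    uint8_t cpha = (uint8_t)(spi_mode & 1u);

    return (uint8_t)((1u << SPE_BIT) | (1u << MSTR_BIT) |
                     ((unsigned)cpol << CPOL_BIT) |
                     ((unsigned)cpha << CPHA_BIT));
}

/* Recover the SPI mode number from an SPCR value. */
uint8_t spi_mode_from_spcr(uint8_t spcr)
{
    uint8_t cpol = (uint8_t)((spcr >> CPOL_BIT) & 1u);
    uint8_t cpha = (uint8_t)((spcr >> CPHA_BIT) & 1u);

    return (uint8_t)((cpol << 1) | cpha);
}
```

For example, `spcr_master_value(3)` yields 0x5C, which has SPE, MSTR, CPOL and CPHA set and all other bits cleared.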
The four combinations of CPHA and CPOL decide the clock edges on which the data is sampled and set up [1].

SPI Mode   CPOL   CPHA   Leading Edge        Trailing Edge
0          0      0      Sample (Rising)     Setup (Falling)
1          0      1      Setup (Rising)      Sample (Falling)
2          1      0      Sample (Falling)    Setup (Rising)
3          1      1      Setup (Falling)     Sample (Rising)

Table 4.3 CPOL and CPHA states, functionality and SPI mode of operation [1]

The SPI data transfer formats for CPHA set to zero (mode 0 and mode 2) are shown in Figure 4.2.

Figure 4.2 The SPI data transfer format for CPHA = 0 (Mode 0 and 2) [1]

The SPI data transfer formats for CPHA set to one (mode 1 and mode 3) are shown in Figure 4.3.

Figure 4.3 The SPI data transfer format for CPHA = 1 (Mode 1 and 3) [1]

4.2 Serial Peripheral Interface in the DataFlash W25Q80BV

The DataFlash W25Q80BV is a serial-interface, sequentially accessed flash memory. It provides 8 Mbit of flash memory divided into 4096 pages of 256 bytes each. Pages are grouped into sectors: a group of 16 pages (4 KB) makes a sector. Eight sectors form a 32 KB block, and sixteen sectors form a 64 KB block; the chip contains 16 blocks of 64 KB in total. It is a byte-programmable flash memory, but it can also program up to 256 bytes of data at once through the Page Program instruction [4].

Figure 4.4 Pin diagram of the DataFlash W25Q80BV [4]

The DataFlash supports SPI operation modes 0 and 3. In standard SPI operation, the master device uses the Serial Data Input (DI) pin to serially transfer instructions, addresses and data to the device on the rising edge of the serial clock. Data read from the DataFlash is put on the Serial Data Output (DO) pin on the falling edge of the serial clock [4].

The W25Q80BV contains Status Register 1 and Status Register 2. The status registers provide information such as whether the device is write enabled or disabled, the write protection status, the busy status and the security register lock bits [4].
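A minimal sketch of how a Status Register 1 value could be interpreted is shown below. The BUSY mask (bit 0) matches the `0x01` mask used in this project's write routine; the WEL mask (bit 1) follows the usual W25Q-series register layout and should be checked against the datasheet [4]. The macro and function names are hypothetical, and the functions operate on a byte that has already been read from the device.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR1_BUSY 0x01u  /* bit 0: program/erase cycle in progress       */
#define SR1_WEL  0x02u  /* bit 1: Write Enable Latch (assumed position) */

/* True while the device is still executing a program or erase cycle. */
bool df_is_busy(uint8_t sr1)
{
    return (sr1 & SR1_BUSY) != 0u;
}

/* True when a Write Enable instruction has set the latch. */
bool df_write_enabled(uint8_t sr1)
{
    return (sr1 & SR1_WEL) != 0u;
}

/* A program or erase instruction can be issued only when the device
 * is idle and the write enable latch is set. */
bool df_ready_to_program(uint8_t sr1)
{
    return !df_is_busy(sr1) && df_write_enabled(sr1);
}
```

Keeping the decoding separate from the SPI access makes it possible to exercise the logic on a PC without the hardware attached.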
A DataFlash instruction begins on the falling edge of chip select (CS#) and is completed on the rising edge of chip select (CS#). Data is always transferred most significant bit (MSB) first [4].

4.3 Software for the DataFlash Access

As discussed earlier, the DataFlash is accessed through the SPI interface of the AVR ATmega32A. SPI Mode 3 is selected by setting the CPOL and CPHA bits of the SPCR register to logic one. To shift out the MSB of SPDR first, the DORD bit is kept at logic zero. The AVR ATmega32A serves as the master device in this communication, so the MSTR bit is set to logic one. With SPR1 and SPR0 at logic zero, the Serial Clock (SCK) runs at one quarter of the oscillator frequency, i.e., 2 MHz with the 8 MHz crystal; running SCK at 4 MHz would additionally require setting the SPI2X bit in the SPSR register. The SPI interrupt is disabled.

Port B of the AVR device has to be set up separately to support the SPI data transfer. The AVR ATmega32A drives the Serial Clock (SCK), Master Output Slave Input (MOSI) and Slave Select (SS#) signals, so the corresponding pins of Port B, Pin 7 (SCK), Pin 5 (MOSI) and Pin 4 (SS#), are set up as outputs. The following code snippet shows the SPI initialization of the AVR device.

    void DF_SPI_init(void)
    {
        PORTB = (1<<PORTB4);                                   // set SS high
        DDRB = (1<<PORTB4) | (1<<PORTB5) | (1<<PORTB7);        // set SS, MOSI and SCK as outputs
        SPCR = (1<<SPE) | (1<<MSTR) | (1<<CPHA) | (1<<CPOL);   // enable SPI in master mode, mode 3, MSB first
    }

4.4 Write to DataFlash

Figure 4.5 shows the flowchart for the write operation on the DataFlash W25Q80BV. Every Page Write operation has to be preceded by a Write Enable instruction. The Write Enable instruction sets the Write Enable Latch (WEL) bit in Status Register 1. After a single Write Enable instruction has completed, the Page Program (02h) instruction is initiated by driving CS# low and then shifting the instruction code "02h", followed by a 24-bit address and a data byte, into the DI pin.
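The byte sequence just described can be sketched as a small helper that fills a five-byte frame: the opcode, the 24-bit address with the most significant byte first, and one data byte. The function names are hypothetical and no SPI transfer is performed here; the page and offset helpers rely on the 256-byte page size and the 4096-page capacity mentioned in Section 4.2.

```c
#include <stddef.h>
#include <stdint.h>

#define DF_CMD_PAGE_PROGRAM 0x02u  /* Page Program instruction code */

/* Fill frame[] with opcode, 24-bit address (MSB first) and one data
 * byte; returns the number of bytes to shift out on the DI pin. */
size_t df_build_page_program(uint32_t address, uint8_t data, uint8_t frame[5])
{
    frame[0] = DF_CMD_PAGE_PROGRAM;
    frame[1] = (uint8_t)(address >> 16);  /* address bits 23..16 */
    frame[2] = (uint8_t)(address >> 8);   /* address bits 15..8  */
    frame[3] = (uint8_t)address;          /* address bits 7..0   */
    frame[4] = data;
    return 5;
}

/* Page index (0..4095) that a byte address falls into. */
uint16_t df_page_of(uint32_t address)
{
    return (uint16_t)((address >> 8) & 0x0FFFu);
}

/* Offset of the byte inside its 256-byte page. */
uint8_t df_offset_in_page(uint32_t address)
{
    return (uint8_t)(address & 0xFFu);
}
```

For address 0x012345 and data 0xAB the frame is 02 01 23 45 AB, which addresses offset 0x45 within page 0x123.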
Page Program is a self-timed operation, i.e., after the last data byte has been shifted into the DI pin and CS# has been driven high, the device begins its internal programming cycle. During that time, reading the status register returns the BUSY bit set to 1. Once the Page Program cycle is finished, the WEL bit is automatically cleared to 0 [4].

Figure 4.5 Write to DataFlash flow chart

The following code segment shows the implementation of the above flowchart in C.

    void DF_page_write(unsigned long int byte_counter, unsigned char data)
    {
        unsigned char result;

        DF_Write_Enable();
        PORTB |= (1<<PORTB4);                          // make sure to toggle CS signal in order
        PORTB &= ~(1<<PORTB4);                         // to reset dataflash command decoder
        result = DF_SPI_RW(0x02);                      // send Page Program opcode
        result = DF_SPI_RW((char)(byte_counter>>16));  // send bits 23-16 of the address
        result = DF_SPI_RW((char)(byte_counter>>8));   // send bits 15-8 of the address
        result = DF_SPI_RW((char)(byte_counter));      // send bits 7-0 of the address
        result = DF_SPI_RW(data);                      // send data byte
        while ((Read_DF_status() & 0x01));             // poll BUSY bit until the data byte is written in the memory
    }

4.5 Erase Operation

Erasing the DataFlash works in the same way as the Page Program cycle discussed earlier. The Write Enable instruction has to be executed before the Erase operation begins. The Erase instruction is initiated by driving CS# low and then shifting the instruction code for the Chip Erase operation (C7h) into the DI pin. The self-timed erase operation begins once CS# is driven high again. The BUSY bit in the Status Register remains set to 1 for the duration of the operation and can be polled continuously to detect the completion of the instruction. The WEL bit in the Status Register is automatically cleared to 0 after the Chip Erase is completed. Figure 4.6 shows the flowchart for the Chip Erase operation [4].

Figure 4.6 Chip erase function flow chart

4.6 Read Data from the DataFlash

Figure 4.7 shows the flowchart for the Read Data instruction.
The instruction begins by driving the CS# pin low and then shifting the instruction code (03h), followed by the 24-bit source address, into the DI pin. Because the master device is responsible for driving the serial clock (SCK) signal, one byte of dummy data has to be shifted out of the AVR device for each data byte that is read from the DataFlash. The device stays in read mode as long as the CS# pin is held low, and the address is automatically incremented internally after each byte [4].

Figure 4.7 DataFlash read flow chart

Chapter 5
AUDIO GENERATION USING PWM

5.1 Pulse-width Modulation Waveform

Pulse-width modulation uses a base square wave with a fixed frequency whose duty cycle changes (from 0% to 100%) with the amplitude of the analog signal. For example, a high value of the analog signal may be represented by an increased duty cycle of the PWM waveform and vice versa. A typical PWM waveform is shown in figure 5.1 [7].

Figure 5.1 A typical PWM waveform [7]

Figure 5.2 shows the frequency spectrum of the PWM waveform. According to the Fourier analysis of the PWM waveform, it has a strong peak at FPWM = 1/T, which is the fundamental PWM frequency, and other strong harmonics at F = K/T, where K is an integer and T is the PWM period. To recover the original signal, all these unwanted harmonics have to be removed [7].

Figure 5.2 Frequency spectrum of a typical PWM waveform [7]

Figure 5.3 shows the effect of passing the PWM waveform through a low-pass filter. Theoretically, if the bandwidth of the low-pass filter is equal to the fundamental frequency of the PWM waveform, the original unmodulated information signal can be restored at the output of that filter. For this to happen, however, the filter has to have a brick-wall response. Therefore, for practical purposes, the low-pass filter should be designed with a bandwidth much lower than FPWM [7].
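The relationship between a sample value, the duty cycle and the filtered output described above can be sketched in C. This is an illustrative sketch only: the function names pwm_duty_percent, pwm_average_mv and pwm_harmonic_hz are hypothetical, an 8-bit sample is assumed to map linearly onto the 0% to 100% duty-cycle range, and an ideal low-pass filter is assumed to output the average value of the waveform.

```c
#include <assert.h>
#include <stdint.h>

/* Duty cycle in percent for an 8-bit sample, assuming 0 -> 0% and 255 -> 100%. */
double pwm_duty_percent(uint8_t sample)
{
    return 100.0 * (double)sample / 255.0;
}

/* Average (DC) value in millivolts of a PWM waveform that switches
 * between 0 V and vcc_mv; this is what an ideal low-pass filter outputs. */
uint32_t pwm_average_mv(uint8_t sample, uint32_t vcc_mv)
{
    return (vcc_mv * (uint32_t)sample) / 255u;
}

/* Frequency in Hz of the K-th harmonic, F = K/T, given the fundamental 1/T. */
uint32_t pwm_harmonic_hz(uint32_t f_pwm_hz, uint32_t k)
{
    assert(k >= 1u);
    return k * f_pwm_hz;
}
```

For example, with a 5 V supply a sample value of 51 corresponds to a 20% duty cycle and an average output of 1 V.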
Figure 5.3 Low-pass filter applied to the PWM waveform [7]

When the PWM waveform is passed through an analog RC filter, its average value is extracted because the high-frequency PWM components are suppressed. This makes an inexpensive but efficient digital-to-analog converter (DAC).

5.2 AVR ATmega32A Setup for PWM Mode of Operation

The AVR ATmega32A has four PWM channels: one for Timer/Counter 0, two for Timer/Counter 1, and one for Timer/Counter 2. Faithful generation of the analog signal depends on the base PWM frequency. To keep the PWM frequency from reaching the analog output, the PWM frequency has to be at least double the highest signal frequency. A PWM frequency higher than that is preferred, depending on the analog filter design. The system clock of the AVR ATmega32A and the PWM resolution determine the maximum PWM frequency. In this project, the AVR device runs at an 8 MHz system clock, and for 8-bit PWM resolution in Phase Correct mode the maximum PWM frequency is 8 MHz / 510 = 15,686 Hz [3].

Timer/Counter 0 is used to generate the PWM waveform. Figure 5.4 shows the timing diagram for Phase Correct PWM Mode. Figure 5.5 shows the bit assignment for the Timer/Counter Control Register 0 (TCCR0). It is set up to operate in Phase Correct PWM Mode with a timer clock prescaler of 1. The Output Compare pin (OC0) is configured in non-inverting mode, i.e., OC0 is cleared on compare match when Timer/Counter Register 0 (TCNT0) is up-counting and set on compare match when TCNT0 is down-counting [1].

Figure 5.4 Timing diagram for Phase Correct PWM Mode [1]

Bit    7      6       5       4       3       2      1      0
Name   FOC0   WGM00   COM01   COM00   WGM01   CS02   CS01   CS00

Figure 5.5 Bitwise description of TCCR0 [1]

The following code segment shows the Timer/Counter 0 setup for Phase Correct PWM mode used in this project.

    void setup_pwm(void)
    {
        DDRB |= (1<<PORTB3);                          // Set OC0 as output
        TCCR0 = (1<<COM01) | (1<<WGM00) | (1<<CS00);  // Timer clock prescaler = 1, non-inverting mode, Phase Correct PWM
        TCNT0 = 0;                                    // Initialize Timer/Counter register
        OCR0 = 0;                                     // Initialize Output Compare register (0% duty cycle)
        TIMSK = 0x01;                                 // Timer Overflow Interrupt Enabled
        TIFR = 0x01;                                  // Clear any pending overflow flag by writing 1 to TOV0
    }

5.3 Software Setup for Play Back

Figure 5.6 Play back function flow chart

5.4 Hardware Circuit

The PWM waveform is passed through a low-pass filter to extract the analog information. A simple RC filter could be used here, but because a steep roll-off is required, a fifth-order, stagger-tuned Chebychev low-pass filter is used instead. The Chebychev filter is chosen over the Butterworth filter because it has a steeper roll-off. It consists of a first-order unity-gain active low-pass filter followed by two second-order unity-gain active low-pass filters. The overall cut-off frequency is set to approximately 4000 Hz, which is sufficient to keep the high-frequency PWM components at bay and stop them from reaching the input of the output power amplifier. Figure 5.7 shows the filter used in this system. R1 and C2 make the first-order low-pass filter; R2, R3, C3 and C4 make the first of the two second-order low-pass filters; and R4, R5, C5 and C6 make the other second-order low-pass filter. Figure 5.8 shows the AC sweep analysis of the filter circuit. The cut-off frequency of the overall design is marked in the same figure [5].

Figure 5.7 Fifth-order, Stagger-tuned Chebychev low-pass filter circuit

Figure 5.8 AC analysis of the filter circuit

The filter output cannot directly drive a speaker with enough power. Therefore, a power amplifier is used to drive the speaker. Figure 5.9 shows the power amplifier circuit.
The LM386, an audio power amplifier IC, is used. The gain of the circuit is set to 200 by bypassing the internal resistance between pin 1 and pin 8 with the capacitor C4. C1, in conjunction with the potentiometer R1, makes a high-pass filter that blocks any DC component from the filter circuit. The potentiometer R1 also serves as a volume control. At the output, the capacitor C2 blocks the DC voltage from reaching the speaker input.

Figure 5.9 The power amplifier circuit

Chapter 6
CONCLUSION

6.1 Conclusion

In conclusion, this project was about acquiring human speech faithfully for processing purposes. The reproduction of the speech provided a check for the whole system. The choice of microphone pre-amplifier design proved essential to the overall success of the project, as the gain and the virtual ground were set precisely to cover the dynamic range of the analog-to-digital converter. The choice of the sampling rate was also important; it was chosen to be greater than the minimum requirement to compensate for variations in the component values. The serial DataFlash provided efficient storage in terms of system resources. The SPI transfer rate was set to maximum to ensure maximum throughput. The DataFlash size is 1 Mbyte, therefore approximately 1 minute of data can be stored at a 15,686 Hz sampling rate. The PWM waveform proved to be very efficient in reproducing the analog signals from the digital samples. The most difficult part of the project was the design of the speaker filter. The Chebychev filter design proved effective in the end, as it has a steeper roll-off slope than the Butterworth filter. The resistors were chosen to match the calculated values precisely so that the cut-off frequency could be achieved accurately.
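The storage estimate quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names pwm_rate_hz and recording_seconds are hypothetical, and it assumes one 8-bit sample per stored byte, a 1 Mbyte (1,048,576 byte) memory, and a sample rate equal to the 8-bit Phase Correct PWM rate of the system clock divided by 510.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit Phase Correct PWM rate: the counter runs 0..255..0,
 * which takes 510 timer clocks per PWM period. */
uint32_t pwm_rate_hz(uint32_t f_clk_hz, uint32_t prescaler)
{
    assert(prescaler != 0u);
    return f_clk_hz / (prescaler * 510u);
}

/* Whole seconds of audio that fit in flash_bytes at one byte per sample. */
uint32_t recording_seconds(uint32_t flash_bytes, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz != 0u);
    return flash_bytes / sample_rate_hz;
}
```

With an 8 MHz clock and a prescaler of 1 this gives 15,686 Hz and about 66 seconds of audio, which is consistent with the figure of roughly one minute.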
6.2 Future Work

In the future, this speech data acquisition system can be integrated into a speech recognition system. Moreover, an SD/MMC card interface can be added in place of the serial flash memory. It would provide far more flexibility in how the data can be accessed and how much data can be stored for processing. Finally, the inclusion of a computer interface would be very helpful to the overall design.

APPENDIX
Implementation Software

    #include <mega32.h>
    #include <delay.h>

    volatile unsigned char wait = 0;    // shared between main loop and timer ISR, hence volatile

    void DF_SPI_init (void);
    unsigned char DF_SPI_RW (unsigned char output);
    unsigned char Read_DF_status (void);
    void DF_Write_Enable(void);
    void DF_page_write(unsigned long int byte_counter, unsigned char data);
    void DF_page_read(unsigned long int byte_counter);
    void DF_chip_erase(void);
    void setup_adc(void);
    void disable_adc(void);
    void setup_pwm(void);
    void disable_pwm(void);
    void record_speech(void);
    interrupt[ADC_INT] void data_ready(void);
    void generate_pwm(void);
    interrupt[TIM0_OVF] void nothing();

    void main(void)
    {
        DDRD = 0xFF;
        PORTD = 0xFF;
        #asm
            sei
        #endasm
        DF_SPI_init();
        while (1) {
            if (!(PINA & 8)) {          // PA3 low: erase the DataFlash
                DF_chip_erase();
                while (!(PINA & 8));    // wait until PA3 goes high again
            }
            if (!(PINA & 2)) {          // PA1 low: record
                record_speech();
            }
            if (!(PINA & 4)) {          // PA2 low: play back
                generate_pwm();
            }
        }
    }

    void DF_SPI_init(void)
    {
        PORTB = (1<<PORTB4);                                  // Set SS high
        DDRB = (1<<PORTB4) | (1<<PORTB5) | (1<<PORTB7);       // Set SS, MOSI and SCK as outputs
        SPCR = (1<<SPE) | (1<<MSTR) | (1<<CPHA) | (1<<CPOL);  // Enable SPI in Master mode, mode 3, MSB first
    }

    unsigned char DF_SPI_RW(unsigned char output)
    {
        unsigned char input;

        SPDR = output;                  // put byte 'output' in SPI data register
        while (!(SPSR & 0x80));         // wait for transfer complete, poll SPIF-flag
        input = SPDR;                   // read value in SPI data reg.
        return input;                   // return the byte clocked in from SPI slave
    }

    unsigned char Read_DF_status(void)
    {
        unsigned char result;

        PORTB |= (1<<PORTB4);           // make sure to toggle CS signal in order
        PORTB &= ~(1<<PORTB4);          // to reset dataflash command decoder
        result = DF_SPI_RW(0x05);       // send status register read op-code
        result = DF_SPI_RW(0x00);       // dummy write to get result
        return result;                  // return the read status register value
    }

    void DF_Write_Enable(void)
    {
        unsigned char result;

        PORTB |= (1<<PORTB4);           // make sure to toggle CS signal in order
        PORTB &= ~(1<<PORTB4);          // to reset dataflash command decoder
        result = DF_SPI_RW(0x06);       // send Write Enable op-code
    }

    void DF_page_write(unsigned long int byte_counter, unsigned char data)
    {
        unsigned char result;

        DF_Write_Enable();
        PORTB |= (1<<PORTB4);                          // make sure to toggle CS signal in order
        PORTB &= ~(1<<PORTB4);                         // to reset dataflash command decoder
        result = DF_SPI_RW(0x02);                      // send Page Program opcode
        result = DF_SPI_RW((char)(byte_counter>>16));  // send bits 23-16 of the address
        result = DF_SPI_RW((char)(byte_counter>>8));   // send bits 15-8 of the address
        result = DF_SPI_RW((char)(byte_counter));      // send bits 7-0 of the address
        result = DF_SPI_RW(data);                      // send data byte
        while ((Read_DF_status() & 0x01));             // poll BUSY bit until the data byte is written in the memory
    }

    void DF_page_read(unsigned long int byte_counter)
    {
        unsigned char result;

        PORTB |= (1<<PORTB4);                          // make sure to toggle CS signal in order
        PORTB &= ~(1<<PORTB4);                         // to reset dataflash command decoder
        result = DF_SPI_RW(0x03);                      // send Read Data opcode
        result = DF_SPI_RW((char)(byte_counter>>16));  // send bits 23-16 of the address
        result = DF_SPI_RW((char)(byte_counter>>8));   // send bits 15-8 of the address
        result = DF_SPI_RW((char)(byte_counter));      // send bits 7-0 of the address
    }

    void DF_chip_erase(void)
    {
        unsigned char result;

        DF_Write_Enable();
        PORTB |= (1<<PORTB4);               // make sure to toggle CS signal in order
        PORTB &= ~(1<<PORTB4);              // to reset dataflash command decoder
        result = DF_SPI_RW(0xC7);           // send Chip Erase opcode
        while ((Read_DF_status() & 0x01));  // poll BUSY bit until the erase is finished
    }

    void setup_adc(void)
    {
        DDRA = 0x00;                        // Set PortA as high impedance input port
        ADMUX |= (1<<ADLAR);                // Reference voltage is AREF, ADC result left adjusted, channel 0 is used
        ADCSRA |= (1<<ADEN) | (1<<ADIE) | (1<<ADPS2) | (1<<ADPS0);  // ADC enabled, interrupt enabled, clock division factor = 32
    }

    void disable_adc(void)
    {
        ADCSRA = 0x00;
    }

    void setup_pwm(void)
    {
        DDRB |= (1<<PORTB3);                          // Set OC0 as output
        TCCR0 = (1<<COM01) | (1<<WGM00) | (1<<CS00);  // Timer clock prescaler = 1, non-inverting mode, Phase Correct PWM
        TCNT0 = 0;                                    // Initialize Timer/Counter register
        OCR0 = 0;
        TIFR = 0x01;
        TIMSK = 0x01;                                 // Timer Overflow Interrupt Enabled
    }

    void disable_pwm(void)
    {
        TCCR0 = 0x00;
        TCNT0 = 0x00;
        TIMSK = 0x00;
        TIFR = 0x00;
    }

    void record_speech(void)
    {
        setup_adc();
        ADCSRA |= (1<<ADSC);                // start the first conversion
        while (!(PINA & 2));                // keep recording while the record button is held down
        disable_adc();                      // stop ADC
    }

    interrupt[ADC_INT] void data_ready(void)
    {
        unsigned char sample;
        static unsigned long int byte_counter = 0;

        sample = ADCH;                          // read digital sample
        PORTD = sample;
        delay_us(7);                            // loop for approx. 62 cycles
        ADCSRA |= (1<<ADSC);                    // start the next conversion
        DF_page_write(byte_counter, sample);    // write into the dataflash
        byte_counter++;
    }

    void generate_pwm(void)
    {
        unsigned char stored_sample;
        unsigned long int byte_counter = 0;

        setup_pwm();
        DF_page_read(byte_counter);             // start a continuous read at address 0
        while (!(PINA & 4)) {                   // keep playing while the play button is held down
            stored_sample = DF_SPI_RW(0x00);    // dummy write clocks out the next stored sample
            while (wait);                       // wait for the timer overflow interrupt
            OCR0 = stored_sample;               // update the PWM duty cycle
            delay_us(100);
            wait = 1;
        }
        disable_pwm();
    }

    interrupt[TIM0_OVF] void nothing()
    {
        wait = 0;
    }

REFERENCES

[1] Atmel Corp., "8-bit AVR Microcontroller with 32K Bytes In-System Programmable Flash - ATmega32A", July 2009.
[2] Atmel Corp., "AVR STK500", User Guide, March 2003.
[3] Atmel Corp., "AVR335: Digital Sound Recorder with AVR and DataFlash", Application Note, April 2005.
[4] Winbond Electronics Corp., "W25Q80BV - 8M-bit Serial Flash Memory with Dual and Quad SPI", July 2010.
[5] Thomas Kugelstadt and Ron Mancini, "Op amps For Everyone: Design Reference", Texas Instruments Inc., August 2002.
[6] Steven W. Smith, "Digital Signal Processing - A Practical Guide for Engineers and Scientists", Newnes, 2003.
[7] Microchip Technology Inc., "Using PWM to Generate Analog Output", 2002.