Chapter 1 OBJECTIVE

The ultimate objective of this project is to develop Embedded Shruti, the next generation Text to Speech software for regional Indian languages such as Hindi and Bengali. The keyword here is “next generation”. The desktop version of this software, developed by MediaLab, IIT Kharagpur, is called “Shruti”. This software is ported to Windows CE so that it can be used on embedded devices such as handhelds, PDAs, Pocket-PCs and other devices that support the Windows CE operating system. While porting software to Windows CE, the key concern is to achieve good performance within the limited resources of Windows CE. Windows CE is an operating system with limited resources and a small Application Programming Interface compared to the desktop Win32 operating systems, which have an extensive Application Programming Interface. Embedded devices have several limitations, such as limited main memory, a slow processor and little disk space. These constraints must be taken into account while porting any software to the Windows CE platform. Several techniques are used to increase the performance of the software; they are explained in this thesis, and the performance evaluation shows how these techniques have increased the efficiency of the software. This project is an important part of the MediaLab projects, and it has extended the Text to Speech software to the evolving area of handheld computers.

Chapter 2 INTRODUCTION

In the recent past the computing industry has seen tremendous growth in the area of Handheld Systems. Handheld Systems, also called Personal Digital Assistants (PDAs), are gaining popularity. An important reason behind their increase in popularity is that they are mobile devices. People can carry the devices with them anywhere and use them for their work. Another important reason is that the devices are relatively cheap compared to desktop computers or laptops.
This is because they have limited functionality and limited resources: the disk space is small, the processor speed is not very high, and the main memory is also very limited compared to desktop computers or laptops. The most popular Personal Digital Assistants are the following:

1. Pocket-PC
2. Palmtop

Two popular operating systems are used in these Personal Digital Assistants:

1. Windows CE (Compact Edition)
2. Palm OS (Palm operating system used for Palmtops)

Embedded Shruti is developed for Pocket-PC running the Windows CE operating system. As the name suggests, Windows CE or Windows Compact Edition is an operating system developed by Microsoft and customized for handheld devices. It has a subset of the Win32 Application Programming Interface and also has Microsoft Foundation Class (MFC) support. Due to the MFC support, it is very easy for programmers who are conversant with traditional Windows programming to migrate to Windows CE programming. There are some constraints which should be kept in mind while programming with the Windows CE API, but one who is familiar with the Windows API can develop applications for Windows CE very quickly.

With this brief introduction to the underlying hardware (Pocket-PC) and the operating system (Windows CE), the rest of the chapter introduces the technologies that were used in the design and implementation of Embedded Shruti.

2.1 Windows CE Overview

Microsoft® Windows® CE is an open, scalable, 32-bit operating system (OS) that is designed to meet the needs of a broad range of intelligent devices, from enterprise tools such as industrial controllers, communications hubs, and point-of-sale terminals to consumer products such as cameras, telephones, and home entertainment devices. A typical Windows CE-based embedded system is targeted for a specific use, often runs disconnected from other computers, and requires an operating system that has a small footprint and a built-in deterministic response to interrupts.
Windows CE offers the application developer the familiar environment of the Microsoft Win32® application programming interface (API), ActiveX controls, message queuing (MSMQ) and Component Object Model (COM) interfaces, Active Template Library (ATL), and the Microsoft Foundation Classes (MFC) Library. ActiveSync provides easy connectivity between the desktop and the embedded device, whether by serial connection, infrared port, or network cable. There is built-in support for multimedia (including DirectX), communications (TCP/IP, SNMP, TAPI, and more), and security. A variety of integrated applications, including Pocket Internet Explorer, Pocket Outlook, and Pocket Word expose objects that allow the developer to extend and customize the existing system, as well as extend the functionality of the application.

2.1.1 Important features of Windows CE 3.0

Windows CE 3.0 offers improved Windows compatibility, combined with hard real-time processing support. New kernel services, such as support for nested interrupts, better thread response, additional task priorities, and semaphores, let the operating system respond immediately to events and interrupts. These real-time features make Windows CE 3.0 ideally suited for industrial applications such as robotics, test and measurement devices, and programmable logic controllers.

With greater storage and file-handling capabilities, interprocess communications, and networking support, Windows CE 3.0 interoperates easily with desktop environments that are based on Microsoft Windows NT® and Microsoft Windows 2000, which makes it the optimal choice for an enterprise system that combines small mobile systems with high-performance desktops, servers and workstations.

New hardware features for Windows CE 3.0 include:

- Support for on-chip debugging.
- The device I/O control (IOCTL) function that allows a unique serial number on each device.
- Multiple execute-in-place (XIP) regions.

2.1.1.1 Kernel Services

- Thread response times have been improved in Windows CE 3.0 by the tightening of the upper bounds on scheduling latencies for high-priority Interrupt Service Threads (IST). This improvement in thread response allows developers to know specifically when the thread transitions occur, and aids them in creating new embedded applications by increasing the capabilities of monitoring and controlling hardware in Windows CE.
- Shorter Interrupt Service Routine (ISR) latencies. Because the kernel uses the interrupt ID that is provided by the ISR to set the event that the IST is waiting on, a short ISR latency is essential for a real-time system.
- Support for nested interrupts. Support has been added for nested interrupts, which allows interrupts at higher priority levels to be serviced immediately, instead of potentially waiting for a lower-priority ISR to complete.
- Increased priority levels. Additional priority levels (a total of 256) allow users more flexibility in controlling the scheduling of embedded systems.
- Support for semaphores. In addition to the currently supported mutexes and events, Windows CE has been expanded to support semaphores.
- The ability to change the quantum of any thread in the system. This includes support for two APIs: CeSetThreadQuantum and CeGetThreadQuantum.
- Kernel-level security. A new security model restricts access to system APIs that a rogue application could call to damage the platform. An OEM can specify whether modules and processes can run or not run and specify those that are fully trusted on a particular platform. Two new APIs allow software developers to retrieve the assigned trust level of a module or a process.

2.1.1.2 Files, Databases, and Persistent Storage

Windows CE 3.0 supports larger data storage systems, and larger files within those systems. The size of the object store has been increased to 256 MB (from 16 MB in Windows CE version 2.1).
Individual files now can be as large as 32 MB and a database volume can be as large as 256 MB. The number of objects that can be kept in the object store has been increased from 2^16 (65,536) to 2^22 (4,194,304). Because the allowable number of objects exceeds the number of available object identifiers, freed object identifiers will be reused for new objects, effective with version 3.0. However, an object identifier will not be reused for at least 16 object allocations. Support has been added for querying VERSIONINFO resources to obtain version and language-support information from files.

2.1.1.3 Interprocess Communications

Windows CE 3.0 provides interprocess communication support with COM and MSMQ. Two separate COM modules offer two different levels of COM support: a limited-feature, small-footprint module that provides interprocess calls and a free-threading mode, and a larger module that supports out-of-process calls, full threading model support, and Distributed Component Object Model (DCOM). The DCOM module, with the exception of the security interfaces, is fully compatible with Windows NT version 4.0 Service Pack 3 (SP3). Enhanced MSMQ support in Windows CE 3.0 provides independent client support for messaging applications. MSMQ for Windows CE is compatible with Windows NT and Windows 2000 Message Queuing Services.

2.1.1.4 Communication Services

Communications enhancements to Windows CE include the following:

- Lightweight remote access server (RAS) support. RAS uses Telephony Application Program Interface (TAPI) to make the call, and then manages the data through Point-to-Point Protocol (PPP) or Serial Line Internet Protocol (SLIP).
- Windows 2000 Transmission Control Protocol/Internet Protocol (TCP/IP) support.
- Network Driver Interface Specification (NDIS) WAN support.

2.1.1.5 Communications Security

Security enhancements for Windows CE 3.0 include:

- Microsoft Cryptography version 2.0 API (CAPI) subset.
This is a set of encryption APIs that allow development of applications that will work securely over nonsecure networks, such as the Internet. The CAPI 2.0 subset will provide certificate management support.
- Microsoft Enhanced Cryptography Service (RSAENH), including 128-bit encryption algorithms.
- Cryptography Service Provider Development Kit.
- X.509 certificate authentication.

2.1.1.6 Connectivity Services

- Smart card PC/SC support. The Windows CE 3.0 smart-card subsystem conforms to the Interoperability Specification for ICCs and Personal Computer Systems, which makes it easy to port existing smart-card applications to Windows CE devices.
- Windows CE 3.0 ActiveSync version 3.1.

2.1.1.7 User-Interface Services

Enhancements to shell services in Windows CE include the following:

- Microsoft Windows CE Handheld PC (H/PC), Professional Edition shell, which includes the following applications:
  o Pocket Internet Explorer
  o Pocket Inbox
  o Pocket Word
  o Help system
- Finer granularity componentization of common controls.
- Ability to print mixed text and graphics.
- Controls and dialog boxes that are resolution independent.
- Ability for user to change the appearance of the user interface for notifications.
- DirectDraw driver for Graphics Device Interface (GDI).

2.1.1.8 Internet Services

The new embedded Web Server provides many of the features of Microsoft Internet Information Services (IIS), which have been optimized for the limited resources of an embedded device. Features include:

- Support for the HTTP/1.0 protocol, persistent connections, multiple connections, file downloading, directory browsing, and multiple virtual paths.
- A remote administration tool for configuration.
- Basic and NT LAN Manager (NTLM) authentication support.
- Internet Services Application Program Interface (ISAPI) extensions and filters.
- Dynamic pages, through a subset of Active Server Pages (ASP pages).

For client-side Internet development, Windows CE 3.0 includes a subset of the WinInet API, to support browser-based applications and FTP services.

2.1.2 Using Platform Builder

Platform Builder can be used to create customized embedded platforms. By default a number of embedded hardware platforms are available, including platforms for the following microprocessors:

a) ARM processors.
b) MIPS processors.
c) SHX processors.
d) Intel x86 processors (used for debugging the application in a desktop environment).

Platform Builder is essential for creating customized embedded platforms which are hardware specific and not compatible with the processors mentioned above. An introduction to Platform Builder will help in porting the application to customized embedded platforms in the future. Platform Builder helps the programmer to create customized embedded platforms by:

1. Using Platform Builder with the desktop-based Windows CE emulator.
2. Using Platform Builder with a Windows CE PC-based hardware development environment.
3. Creating and adding features to the platform thus made by Platform Builder.

2.1.2.1 Designing Operating System elements

Using Platform Builder one can do the following:

a) Create a boot loader.
b) Create a Board Support Package.
c) Create a custom shell.
d) Select a configuration for the platform.

Getting Started

To build a platform based on the Windows CE operating system, the following steps are carried out:

1. Create a platform using a Windows CE configuration with a standard development board (SDB).
2. Customize the platform with additional project and catalog features.
3. Build the Operating System image for a hardware platform (CEPC). Platform Builder includes boot loaders and Board Support Packages for the CEPC and many other hardware development platforms.
4. After refining and debugging the platform on the hardware development platform, adapt it for the custom target device.
5. Create a boot loader, OEM adaptation layer and board support package for the specific target device. The OEM adaptation layer (OAL) is the layer between the kernel and the target platform firmware.
6. Rebuild the Operating System using the new Board Support Package, download it onto the target device and debug the platform.
7. When the platform is complete, export a Software Development Kit for the platform. Application developers can import that software development kit into development tools like eMbedded Visual C++.

Platform Creation

Platform creation is done using the New Platform Wizard, in which the following steps are carried out:

a) Select a board support package for the device.
b) Choose an Operating System configuration and variant.
c) Select the features for the platform.

After the initial settings have been chosen, the New Platform Wizard sets up the environment with files that support the operating system configuration that was selected. The features included in the platform depend on the operating system configuration that was chosen. There are 13 basic configurations included with Platform Builder, and all of them are available from the New Platform Wizard. After the pictorial view of the whole process, an example of building an Operating System image for a Thin Client is provided.

[Figure: Sequence of tasks in the process of creating a Windows CE based platform with Platform Builder]

2.1.2.2 Thin Client Example

Platform Builder was used to generate a Windows CE image for an HCL Win Bee 4000JS Thin Client, which has the following hardware details:

1. 266 MHz National Geode Processor.
2. On board VGA controller, up to 4.0 MB shared VRAM.
3. 64 MB RAM, upgradeable to 256 MB.
4. 16 MB flash memory.

Peripheral support:

1. 104 key PS/2 keyboard.
2. PS/2 and serial mouse support.
3. Audio: 16 bit stereo sound output.
4. Microphone output.

Communications:

1. TCP/IP with DHCP support.
2. 10/100 Mbps Ethernet Twisted pair (UTP RJ45) Interface.
3. Full PPP (Point to Point Protocol) support.

Server Operating System compatibility/support:

1. Citrix WinFrame and other Citrix compatible Operating Systems.
2. Windows NT Server 4.0, Terminal Server Edition and the Windows 2000 Server family with Citrix.

Optional Support:

1. Smart card reader
2. ISDN support
3. LCD Display
4. USB ports
5. ISA and PCI Expansion slots

For the Thin Client hardware described above, a Windows CE image was made, and the first step was to choose an appropriate configuration. Platform Builder provides two different configurations for Thin Clients:

1. Windows Thin Client: a minimal version of Microsoft Windows CE that includes the Core Operating System and the features necessary to support Microsoft Terminal Services.
2. Windows Thin Client with browser: this configuration supports the features of Windows Thin Client along with a browser.

The Windows Thin Client configuration provides the starting point for remote desktop terminals, including the features necessary to support the Microsoft Terminal Services client. It also has SNMP and local browser capabilities, along with the possibility of adding further Windows CE Operating System features.

After the appropriate configuration is selected for the Thin Client hardware, the next step is to select one or more of the available board support packages. Platform Builder supports a number of Board Support Packages.

2.1.2.3 Board Support Package

A board support package (BSP) is a software package that contains:

1. Boot Loader
2. OEM Adaptation Layer (OAL)
3. Device drivers for a standard development board (SDB) or Hardware Reference Platform

The BSP is the main part of Microsoft Windows CE based platforms and contains source files and binary files. It includes the OEM adaptation layer (OAL), which links to the kernel image, and supports:

a) Initializing and managing the hardware.
b) Device drivers.
c) Boot loader.
d) Set of configuration files.

Use of Boot Loader:

1. Used during development to download the Operating System image.
2. Once created, the BSP can be reconfigured through environment variables, and the .bib and .reg files can be modified to attain the reconfiguration.

[Figure: Interaction diagram]

Microsoft Platform Builder provides sample BSPs for many SDBs that are readily available in industry. By using the integrated BSP support, one can quickly evaluate the new Operating System features in Microsoft Windows CE.

BSP development:

1. Extensive infrastructure is provided for developing a BSP for the developer's own SDB or hardware.
2. Offers support for developing drivers for the platform.
3. Focus on customizing, refining and developing additional Operating System features.

Platform Builder provides BSPs for ten SDBs that are readily available for purchase. At present Platform Builder supports four types of BSPs:

1. ARM BSPs
2. MIPS BSPs
3. SHX BSPs
4. x86 BSPs

Several third party BSPs are also available. Windows CE supported Thin Clients are also available commercially [ref 1].

2.1.2.4 Creating an OEM Adaptation Layer (OAL)

An OEM adaptation layer is a layer of code that resides between the Microsoft Windows CE kernel and the hardware of the target device. Physically, the OAL is linked with the kernel libraries to create the kernel executable file. It also facilitates communication between the Operating System and the target device. It includes code to handle the following:

1. Interrupts
2. Timers
3. Power Management
4. Bus abstraction
5. Generic I/O control codes (IOCTL)

Creating the OAL is one of the more complex tasks in the process of getting a Windows CE Operating System to run on a new hardware platform.

Easiest way to create an OAL:

1. Copy the OAL implementation from a working platform and then modify it to suit the specific requirements of the platform under consideration.
2. If a new OAL must be created from the beginning (no similar implementation exists), it is more useful to approach the development process in stages. Each stage adds a little more functionality than the previous one and provides a convenient separation point where new features can be fixed and validated.

Steps of creating a new OAL:

1. Preparing BSP files for building the kernel.
   a) Put the directories and files necessary to build the OAL and kernel image in place.
   b) config.bib is created or modified in this step.
2. Creating a base OAL.
   a) Initialize the platform at startup.
   b) Enable the serial port for debugging.
   c) Initialize the communication settings.
   d) The goal is to provide basic system initialization code that will support further debugging, and to ensure that the basic initialization code is complete and consistent with the target device.
3. Enhancing OAL functionality.
   a) Enhance the Interrupt Service Routines.
   b) Manage clock and timers.
   c) Provide alternate debugging options, for example over Ethernet.
   d) Enable power management.
   e) Provide platform information for applications.
   Goal:
   a) Implement the remainder of the platform support functions.
   b) Ensure that a full Operating System boot is possible.
4. Completing an OAL: implement any additional features that the developer wants to add.

2.1.2.5 Creating a Boot loader

The boot loader is an integral part of the Windows CE Operating System development process and in some cases part of the final product solution. The purpose of the boot loader is the following:

1. Place the Operating System image into memory.
2. Jump to the Operating System startup routine.

The boot loader can obtain the OS image in a number of different ways:

a) Over a cabled connection (such as Ethernet).
b) Through a USB or serial port.
c) From a local storage device such as compact flash, a hard disk or a disk-on-chip.

It may store the image in Random Access Memory or in nonvolatile storage such as:

1. Flash
2. EEPROM
3. Storage device

The boot loader is typically used during the development process to save time. Rather than transferring the development image to the target device through a manual process such as flash programming, the boot loader allows developers to quickly download the new development image to the target device. In many final product solutions the boot loader is removed from the product, and the Operating System image is stored on the device and bootstrapped by the system-reset process. But there are platforms that do not efficiently support this ability, such as x86 platforms or platforms that perform pre-boot tasks. The most common form of boot loader is one that downloads the Operating System image over Ethernet into the target device RAM, and much documentation is available on that.

2.1.2.6 Device Driver Development

The device drivers that are included in every Windows CE operating system image are responsible for direct communication with devices. A device is a physical or logical entity that requires:

a) Control
b) Resource Management
c) Both (a) and (b) from the Operating System

A device driver is a software module that manages the operation of a:

1. Device
2. Protocol
3. Service

A device driver also manages virtual or logical devices. A virtual device exposes a physical device interface, even though there is no physical device to manage. A device driver for a virtual device is indistinguishable from a device driver for a physical device. Virtual means that there is no physical device to manage but a device-like interface is being exposed. File system drivers are an example of virtual device drivers. Typically one can characterize drivers by the device interface they expose. In the simplest case there is one interface exposed downward to hardware and one interface upwards to the applications.

Interface to Hardware: Hardware interface or bus interface
Interface to Application: Device interface or client interface

Applications, other drivers or the device manager can manipulate the device interfaces. The provider of the interface determines how these modules manipulate the client interface. Buses are responsible for loading the drivers for the devices on their buses. Bus enumeration is the process of examining the bus and then loading the appropriate drivers. The root bus driver is a registry enumerator. Different processes load different drivers. The following table shows the processes that load drivers and what drivers they load.

Process: Graphics, Windowing and Event Subsystem (GWES)
Drivers: Battery drivers, display drivers, notification LED drivers, printer drivers

Process: Device Manager (Device.exe)
Drivers: Audio drivers, keyboard drivers, mouse driver, serial drivers, PC card drivers, USB devices and any other driver that exposes the stream interface

Process: File System (Filesys.exe)
Drivers: File system drivers

Device driver source code:

The device driver source code is separated into:

1. Platform dependent code.
2. CPU support package (CSP) drivers.
3. Common drivers.

2.1.2.7 Creating a Board Support Package

Windows CE provides the basic infrastructure that allows one to rapidly and easily create BSPs. Different device driver libraries, such as microprocessor-native libraries, are shipped in Platform Builder.

Microprocessor Native Libraries: The microprocessor native libraries consist of device drivers for high integration microprocessors and their native peripherals. For example, the StrongARM microprocessor and its companion chip integrate many peripherals on the microprocessor, such as LCD, serial port, USB etc. An SDB that uses a specific microprocessor will use the same set of microprocessor native drivers.

2.1.2.8 Thin Client example revisited

In the Thin Client example discussed above, a configuration was chosen after carefully studying the hardware characteristics of the Thin Client.
Do the following steps after opening Platform Builder:

1. From the File menu, choose New Platform; the New Platform Wizard will appear.
2. Choose Next and select one or more available BSPs.
3. On the BSP page choose Next, and in the platform configuration dialog box enter a name for the platform.
4. Select a platform configuration from the available configurations area, which for this particular example is Windows Thin Client with browser.

After these steps are completed, platform creation is complete. The next step is to build the platform. The OS image will be built based on the platform that has been configured. To build an Operating System image, select the retail or debug build configuration, modify the platform settings as needed, such as enabling the kernel debugger, and then build the image.

After the platform is configured, it can be built using the Platform Builder integrated development environment (IDE). Platform Builder creates an Operating System image in four stages:

1. Sysgen phase.
2. Feature build phase.
3. Release copy phase.
4. Make image phase.

The build system generates header files, links modules, copies the resulting modules to a release directory, and generates a binary image file. Static libraries, as well as code supplied by the developer or third party vendors, are combined into a binary file that is downloaded onto the device.

Sysgen Phase: Each feature selected from the catalog has a corresponding sysgen variable. If a feature is included in the platform, the IDE sets the corresponding sysgen variable during the sysgen phase of the build process. The build system uses these variables to link the corresponding static libraries into modules. The system also filters the system header files, creating headers that only contain prototypes for the functions exported by the developer’s platform. Import libraries for the system modules are also created during this phase.

Feature Build Phase: After the sysgen phase, the feature build phase is run.
During this phase, all user features, including Platform Builder project (.pbp) files, source files and makefile (.mak) files, are compiled and built.

Release Copy Phase: The build system copies all the files needed to make an OS image to the release directory. The modules and files created during the sysgen phase are copied to the directory first, followed by the files created by the feature build phase.

Make Image Phase: The project specific files, which include

Project.bib
Project.dat
Project.db
Project.reg

are copied into the release directory. The information in the BIBInfo tab of the platform settings dialog box is then added for each module or file. During this phase, the files in the release directory are combined into the binary image file Nk.bin. If any modification is made, the image must be made again.

Platform Downloading

After the platform is built, the OS image associated with the platform is downloaded to a target device. The IDE provides a mechanism for downloading an OS image to the target device over a number of types of communication hardware. To download an OS image to a target device, there must be a connection from the development workstation to the target device. The Target menu in the IDE provides the functionality that allows the developer to download an OS image to a target device. Before downloading, configure a connection to the target device. To download using Ethernet, the development workstation and the target device must be on the same subnet. If the subnets differ, the target device cannot be connected and debugging the OS image will not be possible. Transfer the Operating System image thus made to the Thin Client at startup. To make a connection to the device, go to the Target menu and configure the remote connection. Choose the Services tab and from the active named box choose a connection. This completes the deployment of the operating system image to the Thin Client.
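
The same-subnet requirement for an Ethernet download can be checked before attempting a connection. The following is a minimal sketch in portable C++ and is not part of Platform Builder: the function names parseIPv4 and sameSubnet and the addresses used with them are hypothetical, and it assumes IPv4 addresses with a dotted-quad subnet mask. Two hosts are on the same subnet when their addresses, each combined with the mask by a bitwise AND, give the same network number.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: parse a dotted-quad IPv4 address such as "192.168.1.10"
// into a 32-bit value. Returns false if the text is not four octets in 0..255.
bool parseIPv4(const std::string& text, std::uint32_t& out) {
    std::istringstream in(text);
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        int octet = -1;
        if (!(in >> octet) || octet < 0 || octet > 255) return false;
        if (i < 3) {
            char dot = 0;
            if (!(in >> dot) || dot != '.') return false;
        }
        value = (value << 8) | static_cast<std::uint32_t>(octet);
    }
    char extra = 0;
    if (in >> extra) return false;  // reject trailing characters
    out = value;
    return true;
}

// Hypothetical helper: true when both hosts have the same network number
// under the given subnet mask. Any unparsable input yields false.
bool sameSubnet(const std::string& workstation,
                const std::string& target,
                const std::string& mask) {
    std::uint32_t a = 0, b = 0, m = 0;
    if (!parseIPv4(workstation, a) || !parseIPv4(target, b) || !parseIPv4(mask, m)) {
        return false;
    }
    return (a & m) == (b & m);
}
```

For example, with a mask of 255.255.255.0 a workstation at 192.168.1.10 and a target at 192.168.1.200 share the network number 192.168.1.0, so sameSubnet returns true, whereas a target at 192.168.2.10 does not.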

This process is the general way to build a Windows CE Operating System image for a given hardware platform. Often the whole process is not necessary, since there are hardware platforms like Pocket-PC for which a Windows CE image has already been built and tested. Such hardware platforms come preloaded with a Windows CE image and, as mentioned above, they export a software development kit which developers can use to build software for Pocket-PC. Embedded Shruti is also developed using the Pocket-PC software development kit, not by building the Windows CE image from scratch. But every embedded software developer should have an idea of the whole design process, so that the software can be ported to custom platforms if required in the future. The process was explained using the Thin Client example. The Thin Client example was the first work carried out by me in this project.

2.1.3 Using Standard SDK

When included in the platform under development, the standard software development kit (SDK) for Microsoft® Windows® CE provides a common subset of features that allow an application written to conform to the standard SDK to run on a display-based Windows CE platform. To maintain compatibility with the standard SDK, an application must function with only the features provided in the standard SDK. Using additional features will make the application incompatible with the standard SDK. To implement the standard SDK on a Windows CE platform, the standard SDK for a Windows CE feature must be added to a display-based platform. The standard SDK is not compatible with headless devices and is therefore limited to display-based platforms. When added to an operating system (OS) image, the standard SDK will automatically include any features associated with the standard SDK as well as their dependencies. It will also add a registry flag to the image that indicates that the standard SDK has been implemented on that image.
This allows any application written for the standard SDK to verify that a particular Windows CE platform supports the standard SDK. When included in a display-based platform, the standard SDK automatically incorporates all associated features into that platform. The following list shows the features that compose the standard SDK.

In the next chapter the desktop version of Shruti is introduced, and initially one of the modules of desktop Shruti is ported to an emulator running Windows CE with Standard SDK support. Some screenshots show the primitive version of the Hindianalyser module ported to an emulator running Windows CE and supporting the standard software development kit.

2.2 Pocket-PC SDK (Software Development Kit)

After the primitive version on the standard software development kit, the next version was built on the Pocket-PC emulator running Windows CE with the Pocket-PC software development kit. Chapter 4 provides the details of the port to Pocket-PC, but an introduction to the Pocket-PC software development kit is necessary at this point, since this SDK makes it possible to develop software for the Pocket-PC platform very quickly.

2.2.1 Introduction to Pocket PC 2002

Microsoft® Windows® Powered Pocket PC 2002 is a personal companion for mobile device users. It offers users the following:

- An easy, seamless setup experience
- An intuitive user interface
- Powerful information management
- A robust communications platform
- Customizable user interface
- Best companionship to Microsoft Outlook®

2.2.2 Working with the Pocket PC 2002 Emulator

The Microsoft® Windows® Powered Pocket PC 2002 SDK includes a new emulation environment. This environment provides a virtual computer running Pocket PC 2002 software compiled for the Intel x86 processor. The virtual computer duplicates hardware that runs Microsoft Windows CE on an x86-based PC.
Previous Windows CE emulators relied on special emulator compilers that passed instructions to the underlying Microsoft Windows NT® operating system. This led to occasional dramatic differences in appearance and function between the emulator and a Pocket PC device. Because the new emulator is powered by the Windows CE operating system and by Pocket PC components, a much higher level of fidelity exists between an actual Pocket PC device and the device emulation environment.

New APIs: The Pocket PC 2002 platform supports the following newly exposed APIs.

• ActiveSync: This API provides ActiveSync 3.5 functionality for the Pocket PC 2002.
• Windows CE Messaging: This API provides a set of interfaces to facilitate the development of messaging applications for the Pocket PC 2002.
• Connection Manager: This API provides the functionality necessary to centralize and automate the establishment and management of the network connections on a Pocket PC 2002 device.
• HTML Control: This API provides the functionality to customize the HTML viewer control; this API also includes an XML parser.
• MIDI: This API provides the capability to play MIDI files on a Pocket PC 2002 device. This API also provides the functionality to create custom sounds such as a DTMF tone or a busy signal.
• Object Exchange (OBEX): This API provides one method for transmitting information between two Pocket PC 2002 devices. The OBEX protocol requires fewer resources than an HTTP server and transfers information by using the infrared port on the Pocket PC 2002 device.
• Telephony: This API is a superset that includes the following sections:
  o Assisted TAPI. Allows applications to make telephone calls without requiring the details of the services of the full Telephony API.
  o Extended TAPI. Extends wireless functionality to include such things as asking for signal strength, choosing the cellular network, and more.
  o Phone API. Provides the functionality to access a call log and creates a custom report from the information in that log.
  o Subscriber Identity Module (SIM) Manager. Allows access to information stored on the SIM card.
  o Short Message Service (SMS). Enables wireless devices to send and receive short messages through an SMS Center.
  o Telephony Service Provider. Supports communications device control through a set of exported service functions.

Emulator: The Pocket PC 2002 SDK includes a new emulation environment. This environment provides a virtual machine running Pocket PC 2002 software compiled for the x86 processor. The virtual machine duplicates hardware known as a CEPC, which is a hardware configuration that runs Windows CE on an Intel x86-based PC.

2.2.3 Programming Pocket PC 2002

Microsoft® Windows® CE operating system version 3.0 for Windows Powered Pocket PC 2002 provides a powerful and easily portable platform for mobile professional users. It combines the power of a personal information manager (PIM), a compact software package fully compatible with Windows-based desktop computers, and a Windows development environment. Pocket PC 2002 allows users to keep their personal and business information up to date and close at hand by using a sophisticated hardware design to fill the need for a more portable and less expensive device than traditional laptop or palmtop computers.

Pocket PC 2002 is designed to quickly access, record, and transmit information at any time. The software bundled with Pocket PC 2002 manages contacts, appointments, and other personal and business information. By using the Voice Recorder application, users can capture ideas and thoughts as they occur. Pocket PC 2002 software can also store telephone numbers and short messages, and it can send and receive e-mail messages by using Internet technologies. All these features are fully compatible with the user's Windows-based desktop applications.
Pocket PC 2002 gives the developer access to a rich development environment. The Windows CE operating system is based on the Microsoft Win32® application programming interface. Applications can be created by using Microsoft eMbedded Visual Tools (eVT) 3.0, which are special versions of the familiar Microsoft Visual Studio® tools that developers may have used to write applications for desktop Windows. The developer can choose to develop applications by using Microsoft eMbedded Visual C++® or Microsoft eMbedded Visual Basic®. Embedded Shruti is developed using Microsoft eMbedded Visual C++®.

Pocket PC 2002 supports a variety of input technologies, including freestyle drawing, handwriting character recognition, and a graphical representation of a keyboard for use on a touch screen.

2.2.4 Pocket PC 2002 Hardware

Original equipment manufacturers (OEMs) have a variety of hardware options when building Pocket PC 2002 devices. The following illustration shows the different hardware components available for a typical Pocket PC 2002.

Touch screen

The touch screen is an LCD covered by a resistive touch panel. The LCD has a portrait orientation with a 240 x 320 pixel resolution, which allows users to see interface elements clearly. The dot pitch for Pocket PC 2002 is .22 to .24, depending on the OEM. Tapping the touch screen with a stylus or finger sends the same kind of messages that clicking with the left mouse button does on a desktop computer, although cursor support is limited to a spinning hourglass for wait signals. The user can also select and drag items. In order to sense quick changes in user input, the touch screen has a refresh rate of at least 100 samples per second. Pocket PC 2002 also supports up to a 16 bit per pixel color depth.

Stylus and keyboard

Pocket PC 2002 does not have a standard, physical keyboard. Text input is accomplished by using the input panel and the stylus.
Generally, the input panel is a standard window on the touch screen that displays an input method, allowing users to enter data in a variety of ways. Pocket PC 2002 software includes a simplified QWERTY keyboard input method and a handwriting recognition input method. The stylus is a pointer for accessing a touch screen and input methods. The stylus has a smaller point than a user's finger, yet will not scratch the touch screen. The OEM or a user can add additional input methods. For example, an independent software vendor (ISV) could create an input method for tapping in Morse code. The user could purchase the Morse code input method and install it at home.

Navigation controls

Pocket PC 2002 comes with several navigation controls, which can be pressed, held down, double-clicked, or pressed in combination with other controls. The following table shows the default Pocket PC 2002 navigation controls.

  Navigation control    Description
  On/off                Turns Pocket PC 2002 on and off
  Action button         Acts as the ENTER key
  Record button         Activates the Voice Recorder application
  Program button(s)     Launches an application
  Up                    Acts as an UP ARROW key
  Down                  Acts as a DOWN ARROW key

Some OEMs may add a silkscreen region, which is an extension of the resistive touch panel, to cover a non-LCD region of a Pocket PC 2002 case. This region is usually directly below the LCD. It is called a silkscreen region because it often has buttons applied by using a silkscreen process. While a silkscreen button is technically a software control, Pocket PC 2002 software does not distinguish between a silkscreen button and other navigation controls; both types of buttons send the same virtual key messages. The OEM is responsible for the driver that handles the silkscreen region.

Audio input

Depending on the device category, some Pocket PC 2002 devices will not support audio input or playback. For devices that support audio, a built-in microphone is included.
It is usually located on the front of the device, so that a user can view the screen while recording. The hardware supports 16-bit sampling at 8 kHz, and the codec (the compression and decompression software) compresses the recording to 2.4 Kbps. The codec software is identical to a desktop computer's audio compression manager (ACM). OEMs may add a microphone jack for an external microphone. The jack is transparent to the software.

Audio output

The developer can use the built-in speakers to play sounds associated with notification events. Speakers can also be used to play voice recordings or other .wav files, or for dual tone multifrequency (DTMF) dialing output. Some OEMs may add a headphone jack for headphones, external speakers, or other audio-out hardware. This jack is transparent to the software.

Printing

Printing is not currently supported on the Pocket PC 2002.

Notification options

An OEM may provide several notification options for a Pocket PC 2002: audio, a flashing LED, or vibration controls such as those on cellular phones and pagers. Although all three of these methods are supported by Pocket PC 2002, all except audio notification are OEM options.

Power

Because a Pocket PC 2002 is portable, battery life is very important. Pocket PC 2002 can run many hours on its standard battery source, and also has a backup battery to avoid data loss if the primary battery loses power.

CPU

Pocket PC 2002 uses the ARM family of CPUs. The ARM processors offer an excellent combination of high performance and low power consumption.

Memory

All Pocket PC 2002 devices come with at least 24 megabytes (MB) of ROM and 16 MB of RAM. The upgrade edition offered by some OEMs for their Windows Powered Pocket PC devices is tailored to fit in the 16 MB of Flash RAM available on those upgradeable devices. Because it is important to conserve memory on a Pocket PC 2002, many Pocket PC 2002 operating system (OS) components are compressed in ROM.
When a user needs a component, the operating system decompresses that component and transfers it to RAM. Because of the time required for decompression and transfer, compressed files slow performance.

Built-in serial port

Pocket PC 2002 comes with a built-in 16550 (or equivalent) serial port, and some OEMs may include a second serial port. Applications use the serial port for communication between a Pocket PC 2002 and other hardware devices at baud rates from 19.2 kilobits per second (Kbps) to 115.2 Kbps. A Pocket PC 2002 can connect to a desktop computer by using a serial cable or an optional docking cradle, available from many Pocket PC 2002 manufacturers, that is connected to the desktop computer. Some Pocket PC 2002 devices support data communications through a modem connected to the cradle.

Infrared communications serial port

Pocket PC 2002 includes a serial port that conforms to Infrared Data Association (IrDA) specifications. Pocket PC 2002 devices can communicate with other Pocket PC 2002 devices, other Windows CE devices, Palm™ OS-based handheld computing devices, or desktop computers.

2.3 Microsoft eMbedded Visual C++

Microsoft® eMbedded Visual C++ is a member of the Microsoft eMbedded Visual Tools 3.0 family of products, which also includes eMbedded Visual Basic. These products provide complete integrated development environments for creating applications to run on the Windows CE operating system.

2.3.1 Introduction

Microsoft® eMbedded Visual C++ enables programmers to develop Windows CE-based applications using an integrated development environment (IDE) similar to that used in developing desktop Visual C++ applications. This IDE, however, contains Windows CE-specific versions of many of the standard development tools that are used to create, test, and refine applications. It also includes a variety of tools that can be used to develop new software uniquely appropriate for Windows CE platforms and devices.
Custom-built for developing Windows CE applications, the eMbedded Visual C++ IDE is easy to learn and will be familiar to programmers who have experience with other members of the Visual C++ family. Applications can be created with eMbedded Visual C++ to run on the Handheld PC Pro (H/PC Pro), Palm-size PC 1.2, and Pocket PC platforms. Developers can also use eMbedded Visual C++ to create applications that run on custom Windows CE-based platforms, or within a desktop emulator that simulates a Windows CE-based platform. Embedded Shruti is coded using eMbedded Visual C++ for the Pocket PC platform and tested on the Pocket PC emulator.

2.3.2 Managing Projects and Workspaces

In eMbedded Visual C++, applications are developed in a workspace. Applications are assembled either by creating a project and a workspace simultaneously or by creating a workspace and then adding projects to it. After a workspace is created, the developer can add new projects, new configurations to an existing project, and subprojects. Microsoft eMbedded Visual C++ development is organized hierarchically into the workspace, projects, and subprojects.

A workspace is a container for the development projects. When a new platform is created, a workspace is also created. Use Project view to look at and gain access to the various elements of projects. A workspace can contain multiple projects, including subprojects, but only one platform.

A project is a configuration and a group of files that produce an application or final binary file(s).

A subproject is a project that is dependent on another project. This dependency may consist of shared files, which need to be built in the subproject first, or it may include shared resources that need updating in the subproject first.

2.3.3 Developing an Application

After initially creating a project, the developer can create the user interface.
This involves first designing and creating dialog boxes, menus, toolbars, accelerators, and other visual and interactive elements, and then hooking them up to code. The user interface elements have to be tailored to the design requirements of the target device. For example, the Pocket PC is long and narrow (about 240x320 pixels), while the Handheld PC is larger (about 640x320 pixels). If the application is designed for the Pocket PC, dialogs will likely be too small for the Handheld PC. If the dialogs are designed for the Handheld PC, they will likely be cramped or not fit at all on the Pocket PC.

2.3.4 Building an Application

Microsoft eMbedded Visual C++ provides two ways of building an application. The easiest and most common way is to build within the eMbedded Visual C++ development environment. The other way is to build from the MS-DOS prompt using command-line tools.

Building an application involves the preprocessor, the compiler, and the linker. The preprocessor prepares source files for the compiler by translating macros, operators, and directives. The compiler creates an object file containing machine code, linker directives, sections, external references, and function/data names. The linker combines code from the object files created by the compiler and from statically-linked libraries, resolves the name references, and creates an executable file.

2.3.5 The Build Process

The following diagram shows the components of the build process in eMbedded Visual C++, starting with the editor, which can be used to write source code.

If the program is built outside the IDE, the developer may use a makefile to invoke the command-line tools. Microsoft eMbedded Visual C++ provides the NMAKE utility for processing makefiles. If the program is built within the IDE, the eMbedded Visual C++ project system uses the project (.vcp) files to store make information. The .vcp files are not compatible with NMAKE.
However, if the program uses a makefile rather than a .vcp file, it can still be built in the development environment as an external project.

2.3.6 Testing and debugging an application

Microsoft eMbedded Visual C++ provides tools to help test and debug applications. In the eMbedded Visual C++ options, the developer can choose to download programs to a connected device automatically or manually after building them. When the developer has completed building a project configuration, the program can be run in eMbedded Visual C++ with or without the debugging capabilities provided by the IDE. Running programs without the debugging capabilities is faster because eMbedded Visual C++ does not have to load the debugger first. With the debugger, however, the developer can set breakpoints, step through execution, inspect memory and register values, check variables, observe message traffic, and generally examine closely how the code works.

2.3.7 MFC for Windows CE

The Microsoft® Foundation Class (MFC) library for the Windows® CE operating system is both a mature, comprehensive class library and a complete object-oriented application framework designed to help the developer build applications, components, and controls for Windows CE-based platforms.

The developer can use the Microsoft Foundation Classes for Windows CE to create anything from a simple dialog box-based application to a sophisticated application that employs the full MFC document/view architecture. MFC for Windows CE can also be used to create full-featured Microsoft® ActiveX® controls and ActiveX containers. Developers who have experience using MFC for desktop applications will find MFC for Windows CE very similar, and the migration to MFC for Windows CE will be smooth. MFC provides a number of utility classes that help developers write applications. For example, if the developer wants to use a hash data structure, there is no need to design the data structure from scratch.
Instead, the developer can use the CMap class already defined in MFC, and the application can be developed quickly. The Microsoft Foundation Classes provide a framework of utility classes and give the developer a number of programming options. The Microsoft® eMbedded Visual C++® 3.0 toolkit contains all the development tools and wizards needed for building MFC for Windows CE applications. Several MFC classes are used in the design of Embedded Shruti; the details of these classes are provided in Chapter 4, which explains the design of Embedded Shruti.

2.4 Microsoft SQL server CE

During the development phase of Embedded Shruti, Microsoft SQL Server for CE was considered as an option for providing database support to the application. This section explains some of the key features of Microsoft SQL Server CE and the reason why it was not used in Embedded Shruti.

2.4.1 Rapid Application Development

SQL Server CE makes application development easy while providing a consistent development model and API set. Microsoft® Visual Basic® developers can rapidly develop Windows CE applications by using eMbedded Visual Basic and ADOCE (ActiveX Data Objects for Windows CE). Microsoft® Visual C++® developers can leverage their existing skills to build sophisticated Windows CE-based database applications that target mobile and embedded solutions.

2.4.2 High-Performance Database Engine

SQL Server CE offers rich relational database functionality in the small memory footprint available on today's devices. Microsoft SQL Server developers will appreciate the robust feature set, which includes:

• A SQL grammar compatible with SQL Server 2000. Statements that run on SQL Server CE will, in general, run on SQL Server.
• A wide range of data types, including:
  o TINYINT, SMALLINT, INTEGER, BIGINT
  o REAL, NUMERIC, FLOAT
  o BIT, BINARY, VARBINARY, IMAGE
  o UNICODE character data types NATIONAL CHARACTER, NATIONAL CHARACTER VARYING, NTEXT
  o MONEY, DATETIME, UNIQUEIDENTIFIER
• 32 indexes per table, multicolumn indexes
• NULL support
• Nested transactions
• 128-bit file level encryption
• DDL: Create databases, alter tables, referential integrity, default values
• DML: INSERT, UPDATE, DELETE
• SELECT: SET Functions (aggregates), INNER/OUTER JOIN, subselect, GROUP BY/HAVING
• Scrollable and forward-only cursors

Hardware and Software Requirements

Hardware Requirements (Platform: Requirements)

  SQL Server system: See the operating system requirements in SQL Server Books Online.
  IIS system: 120 MB of available disk space.
  Development system: 30 MB of available disk space. The computer will need an additional 30 MB of temporary storage space for the setup files.
  Windows CE device: Between 1 and 3 MB of available storage space, depending on processor type and components installed.

The file sizes for the SQL Server CE components vary by processor type and Windows CE operating system version. Hard disk space requirements also depend on which SQL Server CE components are installed.

Operating System Requirements (Platform: Supported operating systems)

  SQL Server system: See the operating system requirements in the SQL Server Books Online.
  Development system: Microsoft Windows 98 Second Edition, Microsoft Windows Millennium (Me), Microsoft Windows NT® 4.0 with Service Pack 5 or later, or Microsoft Windows 2000. Windows CE desktop emulation requires Windows NT 4.0 or Windows 2000. Emulation is not supported on Windows 98. Microsoft ActiveSync 3.1 or later.
  IIS system: Windows NT 4.0 with Service Pack 5 or later, or Windows 2000.
  Windows CE device: Windows CE version 2.11 or later.

SQL Server Requirements (SQL Server: Supported SQL Server CE features)

  SQL Server 2000: All features are supported, including merge replication and RDA.
  SQL Server version 6.5 with Service Pack 5 or later and SQL Server 7.0: RDA is supported; replication is not supported.

Internet Information Services and Internet Explorer Requirements (Component: Requirements)

  Microsoft Internet Explorer 5.0: Internet Explorer 5.0 or later is required on the development system to access SQL Server CE HTML Help. Internet Explorer 5.0 or later is required on the IIS system.
  IIS: Replication and RDA require IIS 4.0 on Windows NT 4.0 or IIS 5.0 on Windows 2000.

ActiveSync Requirements (Component: Requirements)

  SSCERelay.exe: Windows 98 Second Edition, Windows Millennium (Me), Windows NT 4.0 with Service Pack 5 or later, or Windows 2000.

Windows CE Requirements (Platform: Windows CE operating system version)

  Handheld PC Pro (H/PC Pro): 2.11 or later
  Palm-size PC (P/PC): 2.11 or later
  Pocket PC: 3.0 or later
  HPC 2000: 3.0 or later*

2.4.3 Drawbacks of SQL server (Windows CE version)

The tables shown above give a complete reference to the hardware and software requirements of SQL Server for CE. Consider the Windows CE device requirement: depending on the processor type and the components installed, 1 MB to 3 MB of storage space is required on the Windows CE device. On top of that, Embedded Shruti does not need a database that supports an extensive set of SQL statements. Queries are made only by sending a key, not by writing SQL statements. With this design in mind, using SQL Server for Windows CE would consume resources without producing any significant gain. For an embedded application, even 1 MB of space (the minimal installation on a Windows CE device) is quite large. The application needed a database based on extendible hashing, which saves values according to a key field and retrieves them efficiently. Extendible hashing is efficient because retrieval has a complexity of O(1 + alpha), where alpha is the load factor.
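To make the role of the load factor concrete, the following is a minimal sketch of key-based retrieval from a hash table. It is not the GDBM implementation and it does not grow its directory the way extendible hashing does; it uses a fixed number of buckets with simple chaining, and all names in it (toy_table, toy_put, toy_get, toy_load_factor, TOY_BUCKETS) are hypothetical and chosen only for illustration. A lookup costs one hash computation plus a walk along one chain, whose average length is alpha.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUCKETS 8   /* hypothetical fixed table size (no directory growth) */

/* One key/value pair in a bucket's chain. */
typedef struct toy_entry {
    char *key;
    char *value;
    struct toy_entry *next;
} toy_entry;

typedef struct {
    toy_entry *bucket[TOY_BUCKETS];
    size_t count;        /* number of stored pairs */
} toy_table;

/* Copy a string into freshly allocated memory. */
char *toy_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    assert(p != NULL);
    memcpy(p, s, n);
    return p;
}

/* djb2 string hash, reduced to a bucket index. */
size_t toy_hash(const char *key)
{
    unsigned long h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return (size_t)(h % TOY_BUCKETS);
}

/* Retrieval: one hash computation plus a walk along a single chain. */
const char *toy_get(const toy_table *t, const char *key)
{
    const toy_entry *e;
    for (e = t->bucket[toy_hash(key)]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->value;
    return NULL;
}

/* Insert a pair, or replace the value if the key is already present. */
void toy_put(toy_table *t, const char *key, const char *value)
{
    size_t i = toy_hash(key);
    toy_entry *e;
    for (e = t->bucket[i]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            free(e->value);
            e->value = toy_strdup(value);
            return;
        }
    }
    e = malloc(sizeof *e);
    assert(e != NULL);
    e->key = toy_strdup(key);
    e->value = toy_strdup(value);
    e->next = t->bucket[i];
    t->bucket[i] = e;
    t->count++;
}

/* Load factor alpha = stored pairs / buckets = average chain length. */
double toy_load_factor(const toy_table *t)
{
    return (double)t->count / TOY_BUCKETS;
}
```

No SQL text is parsed anywhere on the retrieval path: the caller hands over a key and gets the value back, which is the access pattern the application needs.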
For balanced hash tables, retrieval performs better than SQL Server for Windows CE, which has to process SQL queries and therefore incurs extra overhead. A well-known hash-based database called GDBM, which is quite popular on UNIX platforms, was ported to the Windows CE platform and used in Embedded Shruti. Embedded Shruti thus uses a variant of GNU software and combines Microsoft technologies with the open source GDBM project. The next section introduces GDBM (the GNU Database Manager) in general.

2.5 GNU Database Manager (popularly called GDBM)

GDBM, the GNU database manager, is a set of database routines that use extensible hashing.

2.5.1 Synopsis

    #include <gdbm.h>
    /* This file contains all the function and data type definitions for GDBM */

    extern gdbm_error gdbm_errno
    extern char *gdbm_version

    GDBM_FILE gdbm_open (name, block_size, read_write, mode, fatal_func)
    char * name;
    int block_size, read_write, mode;
    void (*fatal_func) ();

    void gdbm_close (dbf)
    GDBM_FILE dbf;

    int gdbm_store (dbf, key, content, flag)
    GDBM_FILE dbf;
    datum key, content;
    int flag;

    datum gdbm_fetch (dbf, key)
    GDBM_FILE dbf;
    datum key;

    int gdbm_delete (dbf, key)
    GDBM_FILE dbf;
    datum key;

    datum gdbm_firstkey (dbf)
    GDBM_FILE dbf;

    datum gdbm_nextkey (dbf, key)
    GDBM_FILE dbf;
    datum key;

    int gdbm_reorganize (dbf)
    GDBM_FILE dbf;

    void gdbm_sync (dbf)
    GDBM_FILE dbf;

    int gdbm_exists (dbf, key)
    GDBM_FILE dbf;
    datum key;

    char * gdbm_strerror (errno)
    gdbm_error errno;

    int gdbm_setopt (dbf, option, value, size)
    GDBM_FILE dbf;
    int option;
    int *value;
    int size;

2.5.2 Description

GNU dbm is a library of routines that manages data files that contain key/data pairs. The access provided is that of storing, retrieval, and deletion by key and a non-sorted traversal of all keys. A process is allowed to use multiple data files at the same time.

A process that opens a gdbm file is designated as a "reader" or a "writer". Only one writer may open a gdbm file and many readers may open the file.
Readers and writers cannot open the gdbm file at the same time. The procedure for opening a gdbm file is:

    GDBM_FILE dbf;
    dbf = gdbm_open ( name, block_size, read_write, mode, fatal_func )

Name is the name of the file (the complete name; gdbm does not append any characters to this name). Block_size is the size of a single transfer from disk to memory. This parameter is ignored unless the file is a new file. The minimum size is 512. If it is less than 512, dbm will use the stat block size for the file system. Read_write can have one of the following values:

    GDBM_READER     reader
    GDBM_WRITER     writer
    GDBM_WRCREAT    writer - if database does not exist create new one
    GDBM_NEWDB      writer - create new database regardless if one exists

For the last three (writers of the database) there is an extra value that can be added to read_write by bitwise or, GDBM_FAST. This requests that gdbm write the database with no disk file synchronization. This allows faster writes, but may produce an inconsistent database in the event of abnormal termination of the writer.

Mode is the file mode (Read, Write or both) if the file is created. (*Fatal_func) () is a function for dbm to call if it detects a fatal error. The only parameter of this function is a string. If the value of 0 is provided, gdbm will use a default function.

The return value dbf is the pointer needed by all other routines to access that gdbm file. If the return is the NULL pointer, gdbm_open was not successful. The errors can be found in gdbm_errno for gdbm errors and in errno for system errors. (For error codes, refer to gdbmerrno.h)

In all of the following calls, the parameter dbf refers to the pointer returned from gdbm_open.

It is important that every file opened is also closed. This is needed to update the reader/writer count on the file. This is done by:

    gdbm_close (dbf);

The database is used by 3 primary routines. The first stores data in the database.
    ret = gdbm_store ( dbf, key, content, flag )

Dbf is the pointer returned by gdbm_open. Key is the key data. Content is the data to be associated with the key. Flag can have one of the following values:

    GDBM_INSERT     insert only, generate an error if key exists
    GDBM_REPLACE    replace contents if key exists

If a reader calls gdbm_store, the return value will be -1. If called with GDBM_INSERT and key is in the database, the return value will be 1. Otherwise, the return value is 0.

If data is stored for a key that is already in the database, gdbm replaces the old data with the new data if called with GDBM_REPLACE. The database never holds two data items for the same key, and gdbm_store does not report an error in this case.

To search for some data:

    content = gdbm_fetch ( dbf, key )

Dbf is the pointer returned by gdbm_open. Key is the key data.

If the dptr element of the return value is NULL, no data was found. Otherwise the return value is a pointer to the found data. The storage space for the dptr element is allocated using malloc. Gdbm does not automatically free this data. It is the programmer's responsibility to free this storage when it is no longer needed.

To search for some data, without retrieving it:

    ret = gdbm_exists ( dbf, key )

Dbf is the pointer returned by gdbm_open. Key is the key data to search for.

If the key is found within the database, the return value ret will be true. If nothing appropriate is found, ret will be false. This routine is useful for checking for the existence of a record, without performing the memory allocation done by gdbm_fetch.

To remove some data from the database:

    ret = gdbm_delete ( dbf, key )

Dbf is the pointer returned by gdbm_open. Key is the key data.

The return value is -1 if the item is not present or the requester is a reader. The return value is 0 if there was a successful delete.

The next two routines allow for accessing all items in the database.
This access is not key sequential, but it is guaranteed to visit every key in the database once. (The order has to do with the hash values.)

    key = gdbm_firstkey ( dbf )
    nextkey = gdbm_nextkey ( dbf, key )

Dbf is the pointer returned by gdbm_open. Key is the key data.

The return values are both of type datum. If the dptr element of the return value is NULL, there is no first key or next key. Again notice that dptr points to data allocated by malloc and gdbm will not free it for the developer.

These functions were intended to visit the database in read-only algorithms, for instance, to validate the database or similar operations.

File `visiting' is based on a `hash table'. gdbm_delete re-arranges the hash table to make sure that any collisions in the table do not leave some item `un-findable'. The original key order is NOT guaranteed to remain unchanged in ALL instances. It is possible that some key will not be visited if a loop like the following is executed:

    key = gdbm_firstkey ( dbf );
    while ( key.dptr ) {
        nextkey = gdbm_nextkey ( dbf, key );
        if ( some condition ) {
            gdbm_delete ( dbf, key );
            free ( key.dptr );
        }
        key = nextkey;
    }

The following routine should be used very infrequently.

    ret = gdbm_reorganize ( dbf )

If there are a lot of deletions and the developer would like to shrink the space used by the gdbm file, this routine will reorganize the database. Gdbm will not shorten the length of a gdbm file except by using this reorganization. (Deleted file space will be reused.)

If the GDBM_FAST value is used in the gdbm_open call, the following routine can be used to guarantee that the database is physically written to the disk file.

    gdbm_sync ( dbf )

It will not return until the disk file state is synchronized with the in-memory state of the database.

To convert a gdbm error code into English text, use this routine:

    ret = gdbm_strerror ( errno )

Where errno is of type gdbm_error, usually the global variable gdbm_errno. The appropriate phrase is returned.
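The return-value conventions described above for gdbm_store and gdbm_delete are easy to mix up, so the following sketch models them with a small in-memory toy store. It does not call the GDBM library; every name in it (toy_db, toy_store, toy_delete, toy_find, TOY_INSERT, TOY_REPLACE) is a hypothetical stand-in, and the fixed capacity and string-only keys are simplifying assumptions. Only the return codes are meant to mirror the text: -1 for a reader or a missing item, 1 when an insert-only store meets an existing key, and 0 on success.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX 16   /* hypothetical capacity of the toy store */
#define TOY_LEN 32   /* hypothetical maximum string length, including '\0' */

/* Stand-ins for the GDBM_INSERT / GDBM_REPLACE flags described in the text. */
enum { TOY_INSERT, TOY_REPLACE };

typedef struct {
    int writer;                    /* nonzero if opened as a writer */
    int used;                      /* number of pairs currently stored */
    char key[TOY_MAX][TOY_LEN];
    char val[TOY_MAX][TOY_LEN];
} toy_db;

/* Bounded copy that always terminates the destination. */
void toy_set(char *dst, const char *src)
{
    strncpy(dst, src, TOY_LEN - 1);
    dst[TOY_LEN - 1] = '\0';
}

/* Index of key, or -1 if absent (the role an existence check plays). */
int toy_find(const toy_db *db, const char *key)
{
    int i;
    assert(db != NULL && key != NULL);
    for (i = 0; i < db->used; i++)
        if (strcmp(db->key[i], key) == 0)
            return i;
    return -1;
}

/* -1 for a reader (or a full toy store), 1 if TOY_INSERT meets an
 * existing key, 0 otherwise. */
int toy_store(toy_db *db, const char *key, const char *val, int flag)
{
    int i;
    if (!db->writer)
        return -1;
    i = toy_find(db, key);
    if (i >= 0) {
        if (flag == TOY_INSERT)
            return 1;              /* insert only: keep the old value */
    } else {
        if (db->used == TOY_MAX)
            return -1;
        i = db->used++;
        toy_set(db->key[i], key);
    }
    toy_set(db->val[i], val);      /* new pair, or replacement of old value */
    return 0;
}

/* -1 if the item is not present or the caller is a reader, 0 on success. */
int toy_delete(toy_db *db, const char *key)
{
    int i;
    if (!db->writer)
        return -1;
    i = toy_find(db, key);
    if (i < 0)
        return -1;
    db->used--;
    if (i != db->used) {           /* move the last pair into the gap */
        toy_set(db->key[i], db->key[db->used]);
        toy_set(db->val[i], db->val[db->used]);
    }
    return 0;
}
```

In code written against the real library, the same three outcomes of a store call would normally be handled separately, since a return value of 1 is not a failure of the database but a signal that the key was already present.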
gdbm now supports the ability to set certain options on an already open database.

    ret = gdbm_setopt ( dbf, option, value, size )

Where dbf is the return value from a previous call to gdbm_open, and option specifies which option to set. The valid options are currently:

GDBM_CACHESIZE: Set the size of the internal bucket cache. This option may only be set once on each GDBM_FILE descriptor, and is set automatically to 100 upon the first access to the database.

GDBM_FASTMODE: Set fast mode to either on or off. This allows fast mode to be toggled on an already open and active database. value (see below) should be set to either TRUE or FALSE.

value is the value to set option to, specified as an integer pointer. size is the size of the data pointed to by value. The return value will be -1 upon failure, or 0 upon success. The global variable gdbm_errno will be set upon failure.

For instance, to set a database to use a cache of 10, after opening it with gdbm_open, but prior to accessing it in any way, the following code could be used:

    int value = 10;
    ret = gdbm_setopt( dbf, GDBM_CACHESIZE, &value, sizeof(int));

The following two external variables may be useful:

gdbm_errno is the variable that contains more information about gdbm errors. (gdbm.h has the definitions of the error values and defines gdbm_errno as an external variable.)

gdbm_version is the string containing the version information.

This concludes the introduction to the GNU Database Manager. Embedded Shruti uses a version of GDBM ported to Windows CE so that it can be used with the Pocket-PC Software Development Kit. Details of the GDBM functions used in Embedded Shruti are provided in Chapter 4, which explains the complete design of Embedded Shruti.

Chapter 3 Shruti: Desktop Version

In recent years, it has become critical to bridge the gulf between man and machine. The Internet has become an integral part of today’s life and the greatest knowledge repository on Earth.
Technology for accessing the Internet and harnessing the myriad powers of the personal computer is a must if one is not to fall behind. The need of the hour is intelligent human-computer interfacing that enables a wider community, such as the rural neo-literates and pre-literates and the physically challenged (like the visually impaired and the speech impaired), to interact with computer systems in a natural way. Speech interfaces like Shruti may have manifold uses. They could serve as:

• Computer interfaces for the visually challenged, for whom graphical interfaces are not viable.
• The voice of the speech impaired.
• Computer interfaces for neo-literates and pre-literates.
• Modules in software to help pre-literates learn languages using a computer.
• Interfacing modules in multilingual environments, where, depending on the need, the computer can talk in different languages.

Text to speech has been one of the greatest challenges of modern computational science. While the utterance of flat speech by a computer has been achievable, the greatest challenges in the field are to impose natural intonation and prosody based on the characteristics of the language, dialect, person and context. The diagram below shows the modules of a Text to Speech (TTS) converter in detail:

[Figure: modules of a Text to Speech converter]

Various techniques exist to convert a given text to speech. Initially, a grapheme to phoneme mapper is required to convert the given graphemes (the smallest units of written language) to a list of phonemes (the smallest units of spoken language). The next stage is to render the string of phonemes, that is, to synthesize the speech. Speech synthesizers can be broadly classified into two different classes.
Some synthesizers are articulatory: speech synthesis is controlled by parameters that represent the speech production system rather than the signal itself. The others are concatenative synthesizers, in which different signal units from a dictionary are concatenated to produce synthetic speech. In all cases, however, the prime challenge is the quality of the sound produced and its naturalness.

The desktop version of Shruti implements the Text-to-Speech converter for regional languages like Hindi and Bengali using the concatenative approach. The concatenative approach finds the voice units corresponding to each phoneme and concatenates them to produce the sound file. Smoothening algorithms are also applied to the concatenated speech, and noticeable improvements are achieved in this process. With the basic essence of Text-to-Speech software in place, the implementation of the desktop version of Shruti is now described briefly.

3.1 Features of Shruti

1. The front-end of the software is written in Java. Refer to the block diagram shown above. The front-end is used to take the input text and produce the output sound file. The processing is done by two backend dynamic link libraries.
2. The two backend dynamic link libraries are written in C++ and implement the two important parts of the Text-to-Speech synthesizer.
3. The first dynamic link library implements the Natural Language Processing (NLP) unit, which is referred to as the Hindianalyser module in the remainder of the thesis.
4. The second dynamic link library implements the Indian Language Phonetic Synthesizer (ILPS) unit, which is referred to as the Hindiengine module in the remainder of the thesis.
5. These dynamic link libraries are loaded at run time when required, and the appropriate functions from the library are called.
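The core step of the concatenative approach described in this chapter, looking up the voice unit for each token and appending it to the output, can be sketched as follows. This is a minimal illustration and not the Shruti implementation: the in-memory UnitLibrary, the function name concatenate_units and the decision to skip unknown tokens are assumptions, and the smoothening stage is omitted entirely.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Samples = std::vector<std::int16_t>;           // PCM samples of one voice unit
using UnitLibrary = std::map<std::string, Samples>;  // token name -> voice unit

// Appends the unit for each token, in order. Tokens that have no unit in
// the library are skipped (an assumption made for this sketch).
Samples concatenate_units(const UnitLibrary& lib,
                          const std::vector<std::string>& tokens) {
    Samples out;
    for (const auto& t : tokens) {
        auto it = lib.find(t);
        if (it == lib.end()) continue;
        out.insert(out.end(), it->second.begin(), it->second.end());
    }
    return out;
}
```

In practice the joins between units are audible unless they are smoothed, which is why a smoothening pass follows the concatenation.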
3.2 Overview of Shruti

The processing part of the desktop based (Win32 API) text-to-speech software can be divided into 2 sub modules:

• HindiAnalyser: It takes the input supplied by the frontend and produces tokens, each of which corresponds to a unique sound clip of the sound library.
• HindiEngine: It takes the tokens and the sound units from the library and generates the whole sound clip. After generation, smoothening algorithms are applied to obtain smooth speech.

The frontend is responsible for taking the input and for playing the wav file generated by the backend. The next figure shows a dataflow diagram for the software.

[Figure: dataflow diagram of Shruti]

3.3 Technologies used

The front-end is written in Java, and an important feature of this implementation is that the Java program calls the dynamic link libraries built with Visual C++. This is done using the Java Native Interface (JNI). In the code for the dynamic link library built with Visual C++, the following declaration is added:

    JNIEXPORT void JNICALL Java_hindidisplay_Analyse (JNIEnv *env, jobject obj)

This function of the dll can be accessed from the Java code. Two files called “jni.h” and “jni_md.h” are included during the build process. See the references to find the source code of this implementation.

3.4 Comparisons

1. The Java Development Kit (JDK) must be installed on the desktop computer running the software. The JDK is bulky software, so it is not practical to use it for Embedded Shruti, where memory is a main concern; installing the JDK would be more of a burden than of any substantial use. Therefore Embedded Shruti uses the Windows CE API and the Microsoft Foundation Classes customized for Windows CE. Such an implementation does not need any JDK on the hardware (Pocket-PC in this case) on which the software is executed.

2. An implementation that uses only the Windows CE API and MFC customized for Windows CE requires a dynamic link library (mfcce400d.dll) of size 819 KB, which is considerably smaller than the JDK. The JDK for Windows CE with the fewest features has a size of 8.5 MB.

3. The backend dlls are built using Embedded Visual C++ and transferred to the system folder of the device running Windows CE.

    Win CE Hindianalyser dll: 43 KB    Win32 version: 256 KB
    Win CE Hindiengine dll:   29 KB    Win32 version: 260 KB

4. The next design issue was to choose a database. A SQL server for Windows CE would have required at least 1 MB of memory. The port of GDBM to Windows CE that is used in Embedded Shruti requires only a dynamic link library called gdbmce.dll, which is of size 31 KB. It is appropriate for the application, since a hash based structure was needed rather than a database that implements SQL queries.

The names and the sizes of the dynamic link libraries needed to run Embedded Shruti are the following:

1. gdbmce.dll                     31 KB             (for database application)
2. hindianalyser.dll              43 KB             (NLP module)
3. hindiengine.dll                29 KB             (ILPS module)
4. mfcce400d.dll                  819 KB            (for standard SDK emulation)
5. mfcce300.dll + mfcce300d.dll   289 KB + 846 KB   (for Pocket PC emulation)

For the standard SDK emulation, the first four libraries total 31 + 43 + 29 + 819 = 922 KB, roughly a tenth of the 8.5 MB quoted above for the JDK. This data shows that this implementation needs much less disk space than an implementation that uses the JDK and builds the software on top of it.

The next chapter explains the different implementations of Embedded Shruti one by one and gives the key features of each implementation. Each implementation is referred to as a model. The performance comparison is provided subsequently.

Chapter 4 EMBEDDED SHRUTI

The last chapter introduced the desktop version of Shruti and explained the structure of the source code along with the dataflow diagram for the software.
In this chapter the different models of Embedded Shruti are explained one by one, and the drawbacks of each model are cited, since they led to a new, modified and more efficient model.

4.1 Model 1: Windows CE crude port

This is the first model of Embedded Shruti. It started from the source code of the Win32 version. First the structure of the native source code was identified, and then the points at which the native code uses API functions that are not supported by the Windows CE API. At all these points the code was modified so that the native code remains consistent. The input-output characteristics of the native code should not be changed. Embedded Shruti is designed in a modular way. There are three modules in Model 1:

1. Frontend
2. Hindianalyser
3. Hindiengine

The Frontend was designed using Java in the native Win32 code, but in Embedded Shruti it is designed using MFC customized for Windows CE in eMbedded Visual C++. The Frontend has a dialog box with the following contents:

1. Input Text Box: It takes the text input from the user that is to be converted to speech. At present the input should be in Hindi/Bengali. If a multilingual keyboard is not available, spell the Bengali/Hindi words using English letters and feed them into the text box.

2. Analyse Button: On clicking, the Analyse button reads the input text from the text box and writes it into a temporary file on the device called “TextIscii.txt”. This file is read later by the dynamic link library. After saving the input text to the file, it loads the dynamic link library for Natural Language Processing, hindianalyser.dll. A code snippet for loading the dll is provided below. The dll should export the functions that other executables can call. The method of exporting functions from a dll is given shortly; before that, the procedure to load a dll and call an exported function from executable code is given.
    //Define a function pointer to call the DLL function
    typedef int(*MBFuncPtr)(DWORD cBytes);

This is a pointer to a function that returns an integer and takes a DWORD as input. DWORD is a data type defined by the Windows API; it is a 32-bit unsigned integer.

    //Instance variable required to load a library
    HINSTANCE hInst1;

    //Loading a dynamic link library
    hInst1 = ::LoadLibrary(L"hindianalyser.dll");
    if(hInst1 == NULL)
        MessageBox(L"Unable to load the analyser library");
    else
        MessageBox(L"Analyser Library successfully loaded");

If the library is not loaded successfully, hInst1 will be NULL. Once the library is loaded into main memory, the exported function from the library is accessed using the function pointer. When a library is loaded at run time in this way, its exported functions can be accessed only through function pointers.

    //Getting the address of the analyser function into the function pointer
    MBFuncPtr pFunction=(MBFuncPtr)GetProcAddress(hInst1,L"Analyse");

Analyse is the name of the function exported from the dynamic link library. The functions exported by a dynamic link library are listed in the .def (definition) file of the dynamic link library source code. A typical .def file (here analyser.def) looks like:

    LIBRARY Analyser
    EXPORTS
        Analyse

The name of the library, Analyser in this case, is specified on the first line of the .def file. After that comes the list of functions exported from the dll, under the EXPORTS header. A dll may export any number of functions. For each name listed under EXPORTS there must be a function with that name in the dll. The pointer to that function is copied into the function pointer of the calling program (the executable in this case), and the function is called with the appropriate inputs. pFunction as obtained above will be NULL if no such function is exported by the dll.
    if(pFunction == NULL)
        MessageBox(L"Unable to load the Analyse function");
    else
    {
        MessageBox(L"Analyse function exported from the dll called");
        tokenLength=(*pFunction)(cBytes);
    }

Once the function pointer is obtained in pFunction, the function can be called with a DWORD as parameter. As its return type is integer, it returns an integer value after processing the input text file “TextIscii.txt”.

After the use of the library is over, it is always advisable to free the library. Dynamic link libraries are loaded into RAM, and since devices running Windows CE have very limited RAM, it is advisable to unload the dll as soon as the work is done.

    //Unloading a dynamic link library
    ::FreeLibrary(hInst1);

The methods discussed above are necessary for operations related to dynamic link libraries.

The next important difference between the native Win32 source code and the Windows CE version lies in the file operations. As already mentioned, on clicking the Analyse button the text input is saved in a file on disk. Windows CE doesn’t support file operations like fopen, fread, fclose, fseek and so on. Therefore, while porting, it is very important to find the equivalent of each of these file operations in the Windows CE API.

In the Windows CE API all devices are accessed through handles. The developer can access a file on disk, a USB port or a sound device using handles. No additional layer such as fopen and fseek is defined. The following code snippets show how to create a file, read a file and write a file using the Windows CE API.
    //To create or open a file
    HANDLE exampleHandle = 0;

    //Open an existing file in read mode
    exampleHandle = CreateFile(L"TextIscii.txt",GENERIC_READ,FILE_SHARE_READ,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);

    //Open an existing file in write mode
    exampleHandle = CreateFile(L"TextIscii.txt",GENERIC_WRITE,FILE_SHARE_READ,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);

A complete reference to the CreateFile function is provided below.

This function creates, opens, or truncates a file, communications resource, disk device, or console. It returns a handle that can be used to access the object. It can also open and return a handle to a directory.

    HANDLE CreateFile(
        LPCTSTR lpFileName,
        DWORD dwDesiredAccess,
        DWORD dwShareMode,
        LPSECURITY_ATTRIBUTES lpSecurityAttributes,
        DWORD dwCreationDisposition,
        DWORD dwFlagsAndAttributes,
        HANDLE hTemplateFile
    );

Parameters

lpFileName
Pointer to a null-terminated string that specifies the name of the object (file, communications resource, disk device, console, or directory) to create or open. If *lpFileName is a path, there is a default string size limit of MAX_PATH characters. This limit is related to how the CreateFile function parses paths. When lpFileName points to a communications resource to open, the developer must include a colon after the name. For example, specify "COM1:" to open that port.

dwDesiredAccess
Specifies the type of access to the object. An application can obtain read access, write access, read-write access, or device query access. This parameter can be any combination of the following values.

    Value            Description
    0                Specifies device query access to the object. An application can query device attributes without accessing the device.
    GENERIC_READ     Specifies read access to the object. Data can be read from the file and the file pointer can be moved. Combine with GENERIC_WRITE for read-write access.
    GENERIC_WRITE    Specifies write access to the object. Data can be written to the file and the file pointer can be moved. Combine with GENERIC_READ for read-write access.

dwShareMode
Specifies how the object can be shared. If dwShareMode is 0, the object cannot be shared. Subsequent open operations on the object will fail until the handle is closed. To share the object, use a combination of one or more of the following values:

    Value               Description
    FILE_SHARE_READ     Subsequent open operations on the object will succeed only if read access is requested.
    FILE_SHARE_WRITE    Subsequent open operations on the object will succeed only if write access is requested.

lpSecurityAttributes
Ignored; set to NULL.

dwCreationDisposition
Specifies which action to take on files that exist, and which action to take when files do not exist. For more information about this parameter, see the Remarks section. This parameter must be one of the following values:

    Value                Description
    CREATE_NEW           Creates a new file. The function fails if the specified file already exists.
    CREATE_ALWAYS        Creates a new file. If the file exists, the function overwrites the file and clears the existing attributes.
    OPEN_EXISTING        Opens the file. The function fails if the file does not exist.
    OPEN_ALWAYS          Opens the file, if it exists. If the file does not exist, the function creates the file as if dwCreationDisposition were CREATE_NEW.
    TRUNCATE_EXISTING    Opens the file. Once opened, the file is truncated so that its size is zero bytes. The calling process must open the file with at least GENERIC_WRITE access. The function fails if the file does not exist.

dwFlagsAndAttributes
Specifies the file attributes and flags for the file. Any combination of the following attributes is acceptable for the dwFlagsAndAttributes parameter, except that all other file attributes override FILE_ATTRIBUTE_NORMAL.

    Value                     Description
    FILE_ATTRIBUTE_ARCHIVE    The file should be archived. Applications use this attribute to mark files for backup or removal.
    FILE_ATTRIBUTE_HIDDEN     The file is hidden. It is not to be included in an ordinary directory listing.
    FILE_ATTRIBUTE_NORMAL       The file has no other attributes set. This attribute is valid only if used alone.
    FILE_ATTRIBUTE_READONLY     The file is read only. Applications can read the file but cannot write to it or delete it.
    FILE_ATTRIBUTE_SYSTEM       The file is part of or is used exclusively by the operating system.
    FILE_ATTRIBUTE_TEMPORARY    Not supported.

hTemplateFile
Ignored; as a result, CreateFile does not copy the extended attributes to the new file.

Return Values
An open handle to the specified file indicates success. If the specified file exists before the function call and dwCreationDisposition is CREATE_ALWAYS or OPEN_ALWAYS, a call to GetLastError returns ERROR_ALREADY_EXISTS, even though the function has succeeded. If the file does not exist before the call, GetLastError returns zero. INVALID_HANDLE_VALUE indicates failure. To get extended error information, call GetLastError.

//Read a file
The file can be read through the handle only if it was opened in GENERIC_READ mode using CreateFile. Thus the call to read the file must come after the file has been opened appropriately.

    ReadFile(exampleHandle,rbuff,cBytes,&readBytes,NULL);

Here the file is read into the character array rbuff, where cBytes specifies the number of bytes to be read and readBytes will contain the number of bytes actually read from the file. readBytes is passed by address so that ReadFile can modify it and the modification is visible in the calling function.

API reference to ReadFile:

This function reads data from a file, starting at the position indicated by the file pointer. After the read operation has been completed, the file pointer is adjusted by the number of bytes actually read.

    BOOL ReadFile(
        HANDLE hFile,
        LPVOID lpBuffer,
        DWORD nNumberOfBytesToRead,
        LPDWORD lpNumberOfBytesRead,
        LPOVERLAPPED lpOverlapped
    );

Parameters

hFile
Handle to the file to be read. The file handle must have been created with GENERIC_READ access to the file. This parameter cannot be a socket handle.
lpBuffer
Pointer to the buffer that receives the data read from the file.

nNumberOfBytesToRead
Number of bytes to be read from the file.

lpNumberOfBytesRead
Pointer to the number of bytes read. ReadFile sets this value to zero before doing any work or error checking.

lpOverlapped
Unsupported; set to NULL.

Return Values
The ReadFile function returns when one of the following is true: the number of bytes requested has been read, or an error occurs. Nonzero indicates success. If the return value is nonzero and the number of bytes read is zero, the file pointer was beyond the current end of the file at the time of the read operation. Zero indicates failure. To get extended error information, call GetLastError.

//Write to a file
The file can be written through the handle only if it was opened in GENERIC_WRITE mode using CreateFile. Thus the call to write the file must come after the file has been opened appropriately.

    WriteFile(exampleHandle,wbuff,cBytes,&writeBytes,NULL);

Here the character array wbuff is written into the file specified by exampleHandle, where cBytes specifies the number of bytes to be written and writeBytes will contain the number of bytes actually written into the file. writeBytes is passed by address so that WriteFile can modify it and the modification is visible in the calling function.

API reference to WriteFile:

This function writes data to a file. WriteFile starts writing data to the file at the position indicated by the file pointer. After the write operation has been completed, the file pointer is adjusted by the number of bytes actually written.

    BOOL WriteFile(
        HANDLE hFile,
        LPCVOID lpBuffer,
        DWORD nNumberOfBytesToWrite,
        LPDWORD lpNumberOfBytesWritten,
        LPOVERLAPPED lpOverlapped
    );

Parameters

hFile
Handle to the file to be written to. The file handle must have been created with GENERIC_WRITE access to the file.

lpBuffer
Pointer to the buffer containing the data to be written to the file.
nNumberOfBytesToWrite
Number of bytes to write to the file. A value of zero specifies a null write operation. A null write operation does not write any bytes but does cause the time stamp to change. WriteFile does not truncate or extend the file. To truncate or extend a file, use the SetEndOfFile function. Named pipe write operations across a network are limited to 65,535 bytes.

lpNumberOfBytesWritten
Pointer to the number of bytes written by this function call. WriteFile sets this value to zero before doing any work or error checking.

lpOverlapped
Unsupported; set to NULL.

Return Values
Nonzero indicates success. Zero indicates failure. To get extended error information, call GetLastError.

Remarks
If part of the file is locked by another process and the write operation overlaps the locked portion, this function fails. Accessing the output buffer while a write operation is using the buffer may lead to corruption of the data written from that buffer. Applications must not read from, write to, reallocate, or free the output buffer that a write operation is using until the write operation completes.

With the basic operations to create, read and write a file in Windows CE in place, the crude way of porting is to replace all the file operations like fopen, fread, fwrite, fclose and fseek by CreateFile, ReadFile and WriteFile. The following diagram shows the file operations done in the hindianalyser native code which are replaced by CreateFile, ReadFile and WriteFile operations to make it compatible with the Windows CE API.

[Figure: file operations in the hindianalyser native code]

As discussed above, clicking the Analyse button takes the input, invokes the hindianalyser dll and returns the number of tokens written to the tokens file. Each file operation is now implemented using the Windows CE API. After this stage the tokens are saved in a file Tokens.txt on disk. Each token in the file has this form:

    1. The token name: For example, the token name can be 0704, which is the name of the sound file corresponding to this token and which is retrieved in the hindiengine phase.
    2. The token type: The token type specifies whether this is a vowel or a consonant, and the sound generation algorithm works on that basis.

Please see the references for the algorithms used in Shruti, which are also used in the Embedded version.

Another important difference between the Win32 API and the Windows CE API lies in the memory allocation techniques. C-style memory allocation (malloc) doesn’t work with the Windows CE API. The following function should be used in its place:

    char *s;
    s = (char *)LocalAlloc(LMEM_FIXED,cBytes);

LocalAlloc allocates cBytes bytes of memory on the local heap and returns a pointer, which is stored in s. The local heap always exists by default in Windows CE, but developers can also create heaps of their own to write memory-efficient code.

3. Generate button: On clicking the Generate button, the Hindiengine library is first loaded into main memory. After that, the exported function from the Hindiengine library is called with tokenLength as the input, where tokenLength is the number of bytes in the Tokens.txt file. The Generate function from the dll reads the Tokens.txt file, retrieves the tokens and token types from the file, and then concatenates the sound files corresponding to the tokens into one file. For example, if the token is 0704, it retrieves the sound file 0704.wav from a sound library (in this model the sound library is a directory which contains all the sound files). Thus, for all tokens, the sound files are read from the directory and appropriately concatenated, and the output sound file is produced. In this dll, Windows CE counterparts for fread, fwrite and fseek were written; they are given shortly. Before that, the last function of the Generate button click is presented: playing the sound file generated by the dll function.
    //Code to play a wav format file on Windows CE
    MMRESULT PlayWave(LPCTSTR szWavFile)
    {
        HWAVEOUT hwo;
        WAVEHDR whdr;
        MMRESULT mmres = MMSYSERR_ERROR;   // stays an error if no device is found
        CWaveFile waveFile;
        HANDLE hDoneEvent = CreateEvent(NULL, FALSE, FALSE, TEXT("DONE_EVENT"));
        UINT devId;
        DWORD dwOldVolume;

        // Open wave file
        if (!waveFile.Open(szWavFile)) {
            TCHAR szErrMsg[MAX_ERRMSG];
            _stprintf(szErrMsg, TEXT("Unable to open file: %s\n\n"), szWavFile);
            MessageBox(NULL, szErrMsg, TEXT("File I/O Error"), MB_OK);
            return MMSYSERR_ERROR;   // report the failure to the caller
        }

        // Open audio device: try each device until one accepts the format
        for (devId = 0; devId < waveOutGetNumDevs(); devId++) {
            mmres = waveOutOpen(&hwo, devId, waveFile.GetWaveFormat(),
                                (DWORD) hDoneEvent, 0, CALLBACK_EVENT);
            if (mmres == MMSYSERR_NOERROR) {
                break;
            }
        }
        if (mmres != MMSYSERR_NOERROR) {
            return mmres;
        }

        // Set volume (remember the old volume so it can be restored)
        mmres = waveOutGetVolume(hwo, &dwOldVolume);
        if (mmres != MMSYSERR_NOERROR) {
            return mmres;
        }
        mmres = waveOutSetVolume(hwo, 0xFFFFFFFF);
        if (mmres != MMSYSERR_NOERROR) {
            return mmres;
        }

        // Initialize wave header
        ZeroMemory(&whdr, sizeof(WAVEHDR));
        whdr.lpData = new char[waveFile.GetLength()];
        whdr.dwBufferLength = waveFile.GetLength();
        whdr.dwUser = 0;
        whdr.dwFlags = 0;
        whdr.dwLoops = 0;
        whdr.dwBytesRecorded = 0;
        whdr.lpNext = 0;
        whdr.reserved = 0;

        // Play buffer
        waveFile.Read(whdr.lpData, whdr.dwBufferLength);
        mmres = waveOutPrepareHeader(hwo, &whdr, sizeof(WAVEHDR));
        if (mmres != MMSYSERR_NOERROR) {
            return mmres;
        }
        mmres = waveOutWrite(hwo, &whdr, sizeof(WAVEHDR));
        if (mmres != MMSYSERR_NOERROR) {
            return mmres;
        }

        // Wait for audio to finish playing
        while (!(whdr.dwFlags & WHDR_DONE)) {
            WaitForSingleObject(hDoneEvent, INFINITE);
        }

        // Clean up
        mmres = waveOutUnprepareHeader(hwo, &whdr, sizeof(WAVEHDR));
        if (mmres != MMSYSERR_NOERROR) {
            return mmres;
        }
        mmres = waveOutSetVolume(hwo, dwOldVolume);
        if (mmres != MMSYSERR_NOERROR) {
            return mmres;
        }
        mmres = waveOutClose(hwo);
        if (mmres != MMSYSERR_NOERROR) {
            return mmres;
        }
        delete [] whdr.lpData;
        waveFile.Close();
        CloseHandle(hDoneEvent);
        return MMSYSERR_NOERROR;
    }

Take a look at the source code to understand the sound producing code clearly. The file operations done in hindiengine.dll are shown in the following diagram, followed by the respective replacements for fread, fwrite and fseek:

[Figure: file operations in the hindiengine native code]

From the figure it is clear that the hindiengine native code contained a number of file operations, which were modified using Windows CE API functions to port it to a Pocket-PC running Windows CE.

Mapping of fread, fwrite, fseek and fscanf to Windows CE API functions:

fread: It can be implemented using ReadFile.
fwrite: It can be implemented using WriteFile.
fseek: Read the file up to the desired position. (The Windows CE API also provides SetFilePointer, which moves the file pointer without reading.)
fscanf: It can be implemented by reading the file byte by byte, putting the characters into a temporary array until the delimiter (say a blank) is reached, and then converting the array to the appropriate format, such as an integer or a string.

The following code snippet retrieves the token and the token type from the Tokens.txt file and saves the token in a string and the token type in an integer variable.

    //Code to implement fscanf
    do   //Read tokens till the phrasal boundary
    {
        while (no_of_blanks != 2) {
            ReadFile(fin, &c, 1, &bytesRead, NULL);
            length += 1;
            if (c == ' ')
                no_of_blanks += 1;
            if (no_of_blanks == 0) {      //still inside the token name
                token[tokencount] = c;
                tokencount += 1;
            }
            if (no_of_blanks == 1) {      //inside the token type; the leading
                ttype1[ttypecount] = c;   //blank is stored too, atoi skips it
                ttypecount += 1;
            }
            if (length == tokenLength)
                break;
        }
        token[tokencount] = '\0';
        ttype1[ttypecount] = '\0';
        tokencount = 0;
        ttypecount = 0;
        no_of_blanks = 0;
        ttype = atoi(ttype1);
    } while (length != tokenLength);

The variable token contains the token as a string and the variable ttype contains the type of the token. The above substitute for fscanf is used to read files like Tokens.txt and the intonation and epoch files. To read the wav files, fread and fseek are sufficient, since the size of the sound data in a wav file can be obtained by reading the 4 bytes that follow the 40th byte of the file.
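The size lookup mentioned above can be sketched as follows. The sketch assumes the canonical 44-byte PCM header, in which the tag "data" occupies bytes 36 to 39 and the 4-byte size that follows is stored least significant byte first; wav files that carry additional chunks before the sound data would need a chunk scan, which is not shown. The function name wav_data_size is hypothetical, and the header bytes are assumed to have been read into memory already (for example with ReadFile).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns true and stores the length of the sound data if buf starts with
// a canonical 44-byte PCM WAV header ("data" tag at offset 36).
bool wav_data_size(const unsigned char* buf, std::size_t len, std::uint32_t* size) {
    if (len < 44) return false;
    if (std::memcmp(buf, "RIFF", 4) != 0) return false;
    if (std::memcmp(buf + 8, "WAVE", 4) != 0) return false;
    if (std::memcmp(buf + 36, "data", 4) != 0) return false;
    // Bytes 40..43 hold the size, least significant byte first.
    *size = static_cast<std::uint32_t>(buf[40])
          | (static_cast<std::uint32_t>(buf[41]) << 8)
          | (static_cast<std::uint32_t>(buf[42]) << 16)
          | (static_cast<std::uint32_t>(buf[43]) << 24);
    return true;
}
```

Assembling the value byte by byte, rather than casting the buffer to an integer pointer, keeps the code independent of the alignment and byte order of the target processor.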
Thus, using the Windows CE implementations of fseek and fread, the wav files can be read from the sound file library (a directory in this implementation) and concatenated according to the ILPS algorithm to obtain the speech. Please see the reference section for the algorithms used in the NLP module and the ILPS module.

4.1.1 Drawbacks

Take a close look at the fscanf implementation given above. The main drawback of this model is that the fscanf substitute issues one ReadFile call per character, so its cost grows with the number of characters in the file. A native fscanf implementation does not read the file byte by byte but in blocks, and is therefore considerably more efficient. In this implementation the whole file has to be read character by character, and this linear scanning takes time. The intonation file and the epoch file also take a large amount of time when read character by character. The epoch file holds the epoch value corresponding to each token, and this model searches linearly for the epoch value of a given token. Linear search is expensive, and this is another main disadvantage of this model.

Another important inference that can be drawn from this model is that, if the token value is taken as a key, the epoch value and the sound file can be retrieved using that key value. This observation leads to the use of an extendible hash based database in subsequent models. In the next model an extendible hash based database is explained first, and then the implementation of Embedded Shruti with this database is presented.

4.2 Model 2: Windows CE port using GDBM (Without voice.db)

In Chapter 2 the GNU Database Manager was introduced, and the reason for choosing it in place of Microsoft SQL Server for CE was explained.
The crude port discussed in the last model has several disadvantages. In this model the linear scan required in the epoch file is avoided by using an extendible hash based database called GDBM. The GNU Database Manager was ported to the Windows CE platform; take a look at the source code of GDBMCE for details. At this point it is important to understand the meaning of extendible hashing, since this application needs a hash based database and not a database that supports SQL.

Each line of the epoch file contains the token as its first entry, followed by 4 different epoch values to be used in the ILPS algorithm (Hindiengine module). Extendible hash based databases are very efficient when retrieval is done by a specified key value (in this case the token name, like 0704). The complexity of the retrieval operation is O(1+alpha), where alpha is the load factor, which is nearly 0 for a balanced database. In this model the epoch file is read and saved in a GDBM database using the token name as the key and the epoch as the value. Four epoch databases are made, which contain the following:

epoch1.db: Key is the token name and the value is the first epoch value specified on the line corresponding to that token name in the epoch.txt file.
epoch2.db: Key is the token name and the value is the second epoch value specified on the line corresponding to that token name in the epoch.txt file.
epoch3.db: Key is the token name and the value is the third epoch value specified on the line corresponding to that token name in the epoch.txt file.
epoch4.db: Key is the token name and the value is the fourth epoch value specified on the line corresponding to that token name in the epoch.txt file.
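The split of one epoch line into the four databases can be sketched as follows. This is an illustration under stated assumptions, not the code used in Embedded Shruti: std::unordered_map stands in for each GDBM file, epoch values are assumed to be integers separated by blanks, and the names EpochDb and store_epoch_line are hypothetical.

```cpp
#include <array>
#include <sstream>
#include <string>
#include <unordered_map>

using EpochDb = std::unordered_map<std::string, int>;  // token name -> epoch value

// Parses one line of the form "<token> <e1> <e2> <e3> <e4>" and stores
// e1..e4 in dbs[0]..dbs[3] under the token name. Returns false, leaving
// dbs untouched, if the line is not a token followed by four integers.
bool store_epoch_line(const std::string& line, std::array<EpochDb, 4>& dbs) {
    std::istringstream in(line);
    std::string token;
    int e[4];
    if (!(in >> token >> e[0] >> e[1] >> e[2] >> e[3])) return false;
    for (int i = 0; i < 4; ++i) dbs[i][token] = e[i];
    return true;
}
```

Once the file has been loaded in this way, the epoch for a token is obtained by a keyed lookup such as dbs[0].at(token), in place of the linear scan of Model 1.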
For example, take one line from the epoch.txt file:

    0165179 104 206 307 409

    epoch1.db: Key is 0165179, value is 104
    epoch2.db: Key is 0165179, value is 206
    epoch3.db: Key is 0165179, value is 307
    epoch4.db: Key is 0165179, value is 409

The present version of Embedded Shruti uses epoch1.db. To produce better quality speech, later versions of the software might use the other epoch database files. The intonation file is also saved in a GDBM database and used accordingly in the program. Thus the new model of the hindiengine can be represented by the following picture:

[Figure: hindiengine model using the epoch and intonation databases]

Both the epoch database and the intonation database are saved on disk (secondary storage rather than main memory or RAM). The advantage is that the linear scan is now avoided, and the epoch can be obtained in almost O(1) time given its key value, which is the token name. With the basic structure of this model in place, the following sections take a detailed look at extendible hashing and at why it is such an efficient data structure when a value is to be retrieved by its key.

4.2.1 Extendible hashing

Traditional hash methods are burdened with 2 disadvantages:

• Sequential processing of a file according to the natural order on the keys is not supported.
• They are not extendible.
    • Hash table size is pre-determined.
    • Hash table size heavily relies on the hash function.
    • Overestimation of the number of records results in wasted space.
    • Underestimation of the number of records results in rehashing.

The extendible hashing method allows hashing to adapt to dynamic files. Hash tables are naturally balanced. By extending the hash address space from the directory address space, hash tables can be made extendible.

4.2.2 Extending hash tables

Assumptions:

• A hash function, h, exists.
• If K is a key, then K’ = h(K) is a pseudokey.
The file is structured into two levels:
Leaves: contain the pairs (K, I(K)), where I(K) is the information associated with K (the record associated with K or a pointer to the record). Each leaf contains a header that stores the local depth.
Directory: contains a header that stores the depth, and contains pointers to leaf pages.

Example
The following figures explain the working of extendible hash structures.
Figure 1
Figure 2
Figure 3

4.2.3 Using GDBMCE library
The gdbmce.dll library exports all the functions needed for the database operations. All the functions of GDBMCE were explained in Chapter 2, and all of them are ported to the Windows CE platform. The .def (definition) file for the gdbmce dynamic link library exports the following functions, which can be accessed through function pointers as discussed in the last model.

//GDBMCE .def file
LIBRARY GDBMCE
EXPORTS
    gdbm_open
    gdbm_close
    gdbm_store
    gdbm_fetch
    gdbm_delete
    gdbm_firstkey
    gdbm_nextkey
    gdbm_reorganize
    gdbm_sync
    gdbm_exists
    gdbm_setopt
    gdbm_errno
    gdbm_version

The following code snippets perform the database operations using gdbmce.dll.

//Database variable
GDBM_FILE dbf;
//Variables to work with the database
datum key, content;
/*
datum is a data structure defined in gdbmce.h which has two important
members. The first member (dptr) is a pointer to a character array; the
second member (dsize) is an integer which stores the number of bytes in
that array. Using this data structure the values are stored in and
retrieved from the database.
*/
//Define the function pointers to call the gdbm functions
typedef GDBM_FILE(*GDBMOpen_ptr)(WCHAR*,int,int,int,void*);
typedef void(*GDBMClose_ptr)(GDBM_FILE);
typedef int(*GDBMStore_ptr)(GDBM_FILE,datum,datum,int);
typedef datum(*GDBMFetch_ptr)(GDBM_FILE,datum);
//Instance to call the dll
HINSTANCE hInst1;
//Loading the gdbmce dll
hInst1 = ::LoadLibrary(L"gdbmce.dll");
//Getting the function pointers of 4 GDBM functions
GDBMOpen_ptr pgdbm_open=(GDBMOpen_ptr)GetProcAddress(hInst1,L"gdbm_open");
GDBMClose_ptr pgdbm_close=(GDBMClose_ptr)GetProcAddress(hInst1,L"gdbm_close");
GDBMStore_ptr pgdbm_store=(GDBMStore_ptr)GetProcAddress(hInst1,L"gdbm_store");
GDBMFetch_ptr pgdbm_fetch=(GDBMFetch_ptr)GetProcAddress(hInst1,L"gdbm_fetch");

If any of the pointers (pgdbm_open, pgdbm_close, pgdbm_store, pgdbm_fetch) is NULL, then the corresponding function is not exported by the dynamic link library. It is always advisable to check whether the function pointers are NULL before calling them.

//Opening a database
//Database reader
dbf = (*pgdbm_open)(newstring,512,GDBM_READER,0777,0);

newstring contains the name of the database to open. 512 specifies the block size in which the data will be accessed from the disc. GDBM_READER specifies that the database is to be opened in read mode. 0777 is the mode of the database file, which means read, write and execute permission on a database file that is created; it is written with a leading zero because C reads the literal 777 as a decimal number. The final 0 is a null pointer for the fatal-error function, which selects the default error handling.

//Database writer
dbf = (*pgdbm_open)(newstring,512,GDBM_WRITER,0777,0);

GDBM_WRITER specifies that the database is opened for writing. It also requires that the database already be present on the disc.

//Create a database
dbf = (*pgdbm_open)(newstring,512,GDBM_WRCREAT,0777,0);

This creates the database if it does not exist and provides both read and write access to it.
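The mode argument of gdbm_open is conventionally given in octal. The following sketch only illustrates how C interprets the two literals; whether GDBMCE honours the mode on Windows CE is not stated in the text, and the helper mode_to_string is a hypothetical name introduced for the example.

```c
/* Formats the low nine permission bits of a mode value as "rwxrwxrwx".
 * out must have room for 10 bytes. Hypothetical helper for illustration. */
void mode_to_string(unsigned mode, char out[10])
{
    const char flags[] = "rwxrwxrwx";
    for (int i = 0; i < 9; i++) {
        /* Bit 8 is owner-read, bit 0 is other-execute. */
        out[i] = (mode & (1u << (8 - i))) ? flags[i] : '-';
    }
    out[9] = '\0';
}
```

Passing 0777 (octal) yields rwxrwxrwx, whereas passing 777 (decimal, which equals 01411 in octal) yields r----x--x in the low nine bits, which is probably not what a caller intends.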
*pgdbm_fetch reads the database and retrieves the value corresponding to a given key, while *pgdbm_store writes a value into the database under the key provided.

//Storing into a database
Suppose there are two strings. The first string, tokenname, contains the key (“0704” for example), and the second string, epoch, contains the epoch value corresponding to the key (“107” for example). To store this information into the database, the following code is used:

//Storing the key and its size into the datum structure
key.dptr = tokenname;
key.dsize = strlen(tokenname);
//Storing the value and its size to datum structure
content.dptr = epoch;
content.dsize = strlen(epoch);
//Storing into the database after successful opening
(*pgdbm_store)(dbf, key, content, GDBM_INSERT);

Note the GDBM_INSERT option in the function call. GDBM_INSERT inserts the value under the key; if the key already exists, the store operation fails. GDBM_REPLACE is used in those cases where the key already exists and the value needs to be changed.

//Fetch <key, value> pairs from the database
Suppose the epoch value corresponding to the token name (“0165179” for example) is to be fetched from a database that was opened successfully in GDBM_READER mode. The following code is used to fetch the value:

//Storing the key and its size into the datum structure
key.dptr = tokenname;
key.dsize = strlen(tokenname);
//fetch the value corresponding to the key
content = (*pgdbm_fetch)(dbf,key);
int epochval = 0;
//dptr is NULL when the key is absent. The stored bytes carry no
//terminating NUL (dsize was strlen), so copy dsize bytes and terminate.
if (content.dptr != NULL) {
    memcpy(epoch, content.dptr, content.dsize);
    epoch[content.dsize] = '\0';
    //GDBM allocates dptr with malloc; the caller releases it
    free(content.dptr);
    epochval = atoi(epoch);
}

When the operation completes, epochval contains the integer value of the epoch string fetched from the database.

4.2.4 Advantages
Model 2 avoids the linear scan of the epoch file and the intonation file. Therefore this implementation is much faster and more efficient than Model 1.
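The difference between the two lookup styles can be made concrete by counting string comparisons. The sketch below is a simplified, in-memory stand-in and makes several assumptions: it uses a fixed-size open-addressing table rather than the on-disc extendible hashing of GDBM, and all names (struct entry, linear_lookup, hash_insert, hash_lookup, TABLE_SIZE) are hypothetical.

```c
#include <string.h>

#define TABLE_SIZE 64   /* power of two, assumed larger than the key count */

struct entry { const char *key; int value; };

/* Linear scan: compares against records in order until the key is found.
 * Returns the number of string comparisons made, or -1 if absent. */
int linear_lookup(const struct entry *recs, int n, const char *key, int *value)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(recs[i].key, key) == 0) {
            *value = recs[i].value;
            return i + 1;
        }
    }
    return -1;
}

/* Simple string hash (djb2) reduced to a table slot. */
static unsigned slot_of(const char *key)
{
    unsigned h = 5381;
    while (*key) h = h * 33 + (unsigned char)*key++;
    return h & (TABLE_SIZE - 1);
}

/* Inserts into an open-addressing table. The table must be
 * zero-initialised and must never become completely full. */
void hash_insert(struct entry table[TABLE_SIZE], const char *key, int value)
{
    unsigned s = slot_of(key);
    while (table[s].key != NULL) s = (s + 1) & (TABLE_SIZE - 1);
    table[s].key = key;
    table[s].value = value;
}

/* Returns the number of string comparisons made, or -1 if absent. */
int hash_lookup(const struct entry table[TABLE_SIZE], const char *key, int *value)
{
    unsigned s = slot_of(key);
    int probes = 0;
    while (table[s].key != NULL) {
        probes++;
        if (strcmp(table[s].key, key) == 0) {
            *value = table[s].value;
            return probes;
        }
        s = (s + 1) & (TABLE_SIZE - 1);
    }
    return -1;
}
```

With a lightly loaded table, hash_lookup typically needs one or two comparisons regardless of the number of records, while linear_lookup needs a number proportional to the position of the key in the file.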
A performance comparison is given in the next chapter, where both implementations are checked on a given input. For this application, where <key, value> pairs are to be retrieved efficiently, extendible hashing is the best data structure, and GDBMCE uses extendible hashing.

4.2.5 Drawbacks
There are two drawbacks in this model:
1. The Tokens.txt file generated after the hindianalyser phase is scanned linearly by the hindiengine dynamic link library to retrieve the token name and the token type. Refer to the implementation of fscanf using ReadFile in the Model 1 section of this chapter. To avoid this linear scan, in the next model the token name and token type are also saved in a GDBMCE database, using as key an index value which starts from 0 and increases as long as there are more tokens.
2. The sound library is still a directory which contains the wav files. In the next model the sound files are also kept in a GDBM database, with the token name as the key and the sound file as the value. A directory with a number of wav files is replaced by a single file called voice.db. This simplifies the transfer to the Windows CE device, and the single file can also be provided on flash RAM accompanying the application. Making the voice database also avoids linear scanning of voice files during the generation of the speech. Since the voice file is now saved in the database as a character array, efficient array operations like memcpy can be used in place of the linear scan of the sound files that was done in the present model.

These drawbacks resulted in the development of a third model, in which the voice database is added alongside the epoch database and the intonation database that were already there. The tokens are also saved in a database, and no normal file operations are used in this version. That makes the third model the most robust of the three models.
Let's now take a look at the changes made in the third model.

4.3 Model 3: Windows CE port using GDBM (With voice.db)
First take a look at the new dataflow structure of the Hindiengine module. The following figure displays the dataflow structure after the GDBM databases are added:

The modifications made to Model 2 to obtain Model 3 were already discussed in the last section. Model 3 is the most efficient implementation, so let's take a complete pictorial view of Embedded Shruti in this implementation. In this implementation the tokens database and the voice database were added. The basic database operations discussed in the last section hold in this model as well.

4.3.1 Making voice database
In earlier models the sound files are saved in a directory called voice; according to the token name, the appropriate file is taken from the directory and, after modifications, appended to the existing output voice file. In this model the sound files are saved in a GDBM database, with the name of the sound file (which is the token name) as the key. For example, if there is a file in the voice library called 0164179.wav, then the token is extracted from this file name as the file name without its extension (0164179 in this case), and the key is set to 0164179. The wav file 0164179.wav is then saved in the database under this key. Later, hindiengine retrieves the sound file using this key.

The database file voice.db was created on the Linux platform, since directory scanning using system calls is quite easy on Linux and GDBM is available on most Linux platforms. A GDBM database file created on the Linux platform can be used in Windows CE through the GDBMCE library function calls.
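The key derivation described above (file name minus the .wav extension) can be sketched as a small helper. This is an illustrative alternative and not the thesis code: the name wav_name_to_key is hypothetical, and unlike strtok it leaves the input string unmodified and rejects directory entries such as "." and "..".

```c
#include <string.h>

/* Hypothetical helper: copies file_name without its ".wav" extension into
 * key. Returns 1 on success, 0 if the name does not end in ".wav" or the
 * key buffer is too small. The input string is left unmodified. */
int wav_name_to_key(const char *file_name, char *key, size_t key_size)
{
    const char *dot = strrchr(file_name, '.');
    if (dot == NULL || dot == file_name || strcmp(dot, ".wav") != 0)
        return 0;
    size_t len = (size_t)(dot - file_name);
    if (len + 1 > key_size)
        return 0;
    memcpy(key, file_name, len);
    key[len] = '\0';
    return 1;
}
```

For the file 0164179.wav the helper produces the key 0164179, which is the token name used to look up the sound later.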
The following code shows how the voice database is made from the voice directory:

/* Linux code */
/* Header files needed */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <gdbm.h>
#include <dirent.h>
#include <unistd.h>
#include <string.h>
#include <sys/dir.h>
#include <sys/types.h>
#include <sys/param.h>
//Directory pointer to read the contents of voice directory
struct dirent *dpointer;
int main(void)
{
    //define the database handler for the music database
    GDBM_FILE dbf;
    //To make the music database
    datum key,content;
    int i=0,size;
    DIR *dirp;
    char *name,*buffer;
    char *path;
    char *voice = "voice/";
    FILE* fpt;
    /*
    Steps by which music database will be created
    1.The directory of sound files is transferred to the linux machine.
    2.Scan the directory, get the name of each file.
    3.key = Name - trailing .wav
    4.Content = The wav file. The size of the wav data can be obtained
      from the file itself: the 4 bytes at offsets 40-43 of the wav file
      give the size.
    5.Accordingly the wav file will be stored by the key described above.
    */
    //0777 is octal; a plain 777 would be read as a decimal number
    dbf=gdbm_open("voice.db",512,GDBM_WRCREAT,0777,0);
    path = (char*)malloc(30*sizeof(char));
    if((dirp=opendir("voice"))==NULL)
    {
        fprintf(stderr,"Error opening voice\n");
        perror("dirlist");
        exit(1);
    }
    //Code to scan the directory and put each wav file into database.
    while((dpointer=readdir(dirp))!=NULL)
    {
        //Skip the first two entries, assumed to be "." and ".."
        if(i>1)
        {
            name = strtok(dpointer->d_name,".");
            //Inserting values in key
            key.dptr = name;
            key.dsize = strlen(name);
            strcpy(path,voice);
            strcat(path,name);
            strcat(path,".wav");
            //Open in binary mode for reading
            fpt = fopen(path,"rb");
            //Read the 4-byte data size stored at offset 40
            fseek(fpt,40,SEEK_SET);
            fread(&size,4,1,fpt);
            fseek(fpt,0,SEEK_SET);
            //Add the 44 bytes of the header
            size=size+44;
            buffer=(char*)malloc(size*sizeof(char));
            fread(buffer,size,1,fpt);
            //Inserting values in content
            content.dptr = buffer;
            content.dsize = size;
            //Inserting in database
            printf("Inserting in database File no : %d\n",i);
            gdbm_store(dbf,key,content,GDBM_INSERT);
            free(buffer);
            fclose(fpt);
        }
        i++;
    }
    closedir(dirp);
    gdbm_close(dbf);
    return 0;
}
/* compile line */
bash$ cc voicedb.c -o voicedb -lgdbm
/* execution */
bash$ ./voicedb

This creates the voice database in the present working directory. The database is then transferred to the Windows CE emulator or device for use.

4.3.2 Making tokens database
To make the tokens database, first the GDBMCE dynamic link library is loaded into memory. After that, a database file with the name tokens.db is opened. This file contains the token and type for the characters of the input text provided by the frontend through the file TextIscii.txt. The token and type are saved in the database file as they are produced by the Natural Language Processor. An index value is maintained that starts at 0 and works as the key to this database. Each time a new token and type pair is added to the database under the current index, the index value is increased by 1. The Hindianalyser phase returns the value of the index, and the frontend passes this value to Hindiengine to retrieve all the tokens and their respective types. When the token and type are inserted in the database, a delimiter is added between the two. The following code shows the insertion technique:

char buffer[20];
char keybuffer[20];
//0 concatenated with word[i] gives the token name.
//1 is the token type in this case.
//delimiter | is added between the token name and type.
sprintf(buffer,"%d%d|%d",0,word[i],1);
//content is the datum variable needed for insertion
content.dptr = buffer;
content.dsize = strlen(buffer);
//key is the datum variable to hold the key which is index in this case
sprintf(keybuffer,"%d",index);
key.dptr = keybuffer;
key.dsize = strlen(keybuffer);
//dbftokens is the handler to the tokens database
//Function is called with GDBM_REPLACE as the argument so that the
//record will be overwritten if the key already exists.
(*pgdbm_store)(dbftokens,key,content,GDBM_REPLACE);
//increase the index value
index = index + 1;

4.3.3 Producing the output sound file
Hindiengine first gets the token name and token type from the tokens database. Once the token name is obtained, it is used as a key to retrieve the sound file from the voice database. The following code shows the whole process:

//Fill the key to retrieve content from the database
sprintf(tokenbuffer,"%d",index);
tokenskey.dptr = tokenbuffer;
tokenskey.dsize = strlen(tokenbuffer);
tokenscontent = (*pgdbm_fetch)(dbftokens,tokenskey);
//Get the token and the type from tokenscontent.
//Note: strtok expects a NUL-terminated string, while the record was
//stored with dsize = strlen(buffer), i.e. without the terminator.
token = strtok(tokenscontent.dptr,"|");
ttype1 = strtok(NULL,"|");
ttype = atoi(ttype1);
//This code finds the length of the voice file and copies the header of
//the wav file into a variable Header which will later be added to the
//output sound file. In wav files the 4 bytes after the first 40 bytes
//give the size of the data.
voicekey.dptr = token;
voicekey.dsize = strlen(token);
voicecontent = (*pgdbm_fetch)(dbfvoice,voicekey);
memcpy(Header,&voicecontent.dptr[0],40);
len = (voicecontent.dsize-44);
//This code copies the remaining wav file (except the first 44 bytes of
//the header) into a temporary character array called data and it will
//be concatenated to the output sound file later.
memcpy(data,&voicecontent.dptr[44],len);

This completes the description of Model 3.

4.3.4 Advantages
This model is the most efficient of the three models discussed so far.
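Part of the speed of this model comes from splitting a fetched voice record with plain memory operations instead of file reads. A minimal sketch of that split follows, assuming the canonical 44-byte PCM wav header that Section 4.3.3 relies on; the helper names wav_data_size and wav_copy_samples are hypothetical, and the size is decoded byte by byte so that the result does not depend on the byte order of the host.

```c
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44  /* canonical PCM header, as assumed in the text */

/* Reads the 32-bit little-endian data size stored at bytes 40..43. */
uint32_t wav_data_size(const unsigned char *wav)
{
    return (uint32_t)wav[40] | ((uint32_t)wav[41] << 8) |
           ((uint32_t)wav[42] << 16) | ((uint32_t)wav[43] << 24);
}

/* Hypothetical helper: copies the sample data that follows the header of
 * a fetched record into out. Returns the number of bytes copied, or 0 if
 * the record holds no samples or out is too small. */
size_t wav_copy_samples(const unsigned char *record, size_t record_size,
                        unsigned char *out, size_t out_size)
{
    if (record_size <= WAV_HEADER_SIZE) return 0;
    size_t len = record_size - WAV_HEADER_SIZE;
    if (len > out_size) return 0;
    memcpy(out, record + WAV_HEADER_SIZE, len);
    return len;
}
```

Checking the record size before subtracting 44 avoids a negative or wrapped length when a key is missing or a record is truncated.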
Linear scans of the files are completely avoided, and memory-efficient functions like memcpy are used to increase the performance of the software. As all databases are kept on the disc, this model gives an upper bound on the performance of disc-based designs: an implementation that always accesses disc rather than main memory cannot be expected to take less time than this one. No ordinary file operations are used in this implementation, and therefore it is a robust implementation.

4.3.5 Drawbacks
This model also has the following drawbacks:
1. The tokens database is saved on disc rather than in main memory. So in both the Hindianalyser and Hindiengine modules, disc access is needed to write the tokens and then again to read them, which takes more time than RAM access. One solution to this problem is to maintain the list of tokens in main memory rather than writing it to a database on disc. Another solution is to keep the tokens database in Flash RAM rather than on the secondary storage, which is the disc. Flash RAM access time is much less than disc access time.
2. Whenever a sound file is retrieved from the database, the disc is accessed, since the voice database is saved on disc. This takes more time and affects the performance of the software.

These drawbacks led to the development of Model 4, in which both problems are addressed and possible solutions are suggested. The next section explains the solutions that increase the efficiency by removing the drawbacks of Model 3.

4.4 Model 4: Final Windows CE port
Model 4 is a hybrid model which judiciously uses main memory and secondary storage (disc) to obtain the best performance. Model 3, if implemented on Flash RAM, will give better performance, but the performance can be enhanced further by using this hybrid approach.

4.4.1 Approach
This model makes two important additions to the last model in an attempt to increase the performance. The additions are the following:
1.
The token list, which was stored on the disc in the last model, is stored in main memory so that access takes less time. Since Embedded Shruti is built on Microsoft Foundation Classes for Windows CE, a number of MFC utility classes can be used to maintain a hash structure of the tokens. The hash facilitates retrieval. In this implementation the MFC class CMap is used to store the token name and the token type, keyed by an index variable which is increased as tokens are added.
2. The other problem in Model 3 was that each time a new sound file is accessed from the database, disc access time is incurred. A cache structure is implemented to speed up sound file access. The cache stores N sound files in main memory. N is selected according to the application; typically N can be 20 sound files. When a new file is required, the cache is first checked to see whether the file is there. If the file is available, no disc access is required. If the file is not available, it is brought from the database on disc and saved in the cache. If the cache is full, an appropriate cache replacement strategy such as the Least Recently Used (LRU) algorithm is used: the victim sound file is chosen and removed from the cache, and the new sound file is kept in its place.

The modifications suggested make balanced use of the Random Access Memory and the secondary disc so that Embedded Shruti gives the best performance. Till now the models have been tested on the Pocket-PC emulator and, as mentioned in the API reference of the Pocket-PC, the performance of the software on a real Pocket-PC will be better than the performance on the emulator.

This chapter concludes with a graphical view of Model 4. The next chapter gives the performance comparison of the different models on a common input string.
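The cache behaviour described in point 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the Model 4 implementation: the names (struct lru_cache, lru_get, lru_put), the 16-byte key bound, and the timestamp-based victim selection are all choices made for the example, and the caller is assumed to call lru_get before lru_put so that a key is never inserted twice.

```c
#include <string.h>

#define CACHE_SLOTS 20      /* N from the text; tune per application */
#define KEY_MAX     16      /* assumed upper bound on token-name length */

struct slot {
    char key[KEY_MAX];
    const void *data;         /* sound bytes, owned by the caller */
    unsigned long last_used;  /* 0 means the slot is empty */
};

struct lru_cache {
    struct slot slots[CACHE_SLOTS];
    unsigned long clock;      /* incremented on every access */
};

void lru_init(struct lru_cache *c) { memset(c, 0, sizeof *c); }

/* Returns cached data for key, or NULL on a miss. A hit refreshes the
 * slot's timestamp so it becomes the most recently used entry. */
const void *lru_get(struct lru_cache *c, const char *key)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct slot *s = &c->slots[i];
        if (s->last_used != 0 && strcmp(s->key, key) == 0) {
            s->last_used = ++c->clock;
            return s->data;
        }
    }
    return NULL;
}

/* Stores data under key, using an empty slot if one exists and otherwise
 * evicting the least recently used slot. Returns the evicted data pointer
 * (so the caller can free it) or NULL if nothing was evicted. */
const void *lru_put(struct lru_cache *c, const char *key, const void *data)
{
    struct slot *victim = &c->slots[0];
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct slot *s = &c->slots[i];
        if (s->last_used == 0) { victim = s; break; }
        if (s->last_used < victim->last_used) victim = s;
    }
    const void *evicted = victim->last_used ? victim->data : NULL;
    strncpy(victim->key, key, KEY_MAX - 1);
    victim->key[KEY_MAX - 1] = '\0';
    victim->data = data;
    victim->last_used = ++c->clock;
    return evicted;
}
```

In use, Hindiengine would first call lru_get with the token name; only on a NULL result would it fetch the record from voice.db and pass it to lru_put, freeing whatever pointer lru_put returns. A linear search over 20 slots is cheap compared with a disc access, which is why no further indexing is used in this sketch.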
Since the same input string is used for every model in that comparison, the time taken by each model clearly differentiates the performance of the implementations.

4.4.2 Dataflow Model

Chapter 5 PERFORMANCE COMPARISON
In the last chapter the different models for Embedded Shruti were discussed at length. This chapter presents the performance of each model on a given input text. It also specifies the steps to run the different models of Embedded Shruti on the Pocket-PC emulator. Performance comparison is an important part of the development process, since this phase determines which model scores over the others and should therefore be chosen as the final implementation to be ported.

5.1 Performance Parameters
First of all, the performance parameters are to be specified. For real-time applications the performance should be measured on the basis of the actual time taken. Windows CE is a real-time operating system and is therefore used to run real-time applications, so performance is measured in real (wall-clock) time. For each model, the time to get the output once the input text is supplied and the Analyse and Generate buttons are clicked is considered. The model which scores better than the others on this metric is considered the better implementation. There may be several other parameters, such as RAM space used and disc space used, but they are not of much interest here. For embedded applications real-time constraints should be satisfied, so the time taken for the application to execute is the most important performance parameter.

5.2 Simulations on test inputs
This section explains how to run each model on the Pocket-PC emulator and then check the program on given test inputs. For every implementation the steps to run the application are provided.

5.2.1 Model 1
To run the application for Model 1, the following steps should be done:
1. First of all, compile the source code.
The source code includes the source for the Frontend and for the two dynamic link libraries, Hindianalyser and Hindiengine. All three modules are compiled using eMbedded Visual C++.
2. In the eMbedded Visual C++ IDE, specify the SDK as Pocket-PC, and the next fields as Win32 (WCE x86 debug) and Pocket PC 2002 emulation.
3. If the compilation is successful, the dynamic link libraries are made and transferred to the Pocket-PC emulator.
4. The executable Frontend.exe is also made and transferred to the Pocket-PC's default executable path.
5. The dynamic link libraries are copied into the \Windows directory on the target emulator or the target device. An MFC dll is also copied to the \Windows directory, as it is needed to run the Frontend executable.
6. Before running the application, upload the files epoch.txt and inton_bengali onto the emulator. To upload the files, go to Tools > Remote File Viewer. Once the Remote File Viewer appears, the files can be transferred onto the emulator or the device.
7. After the files are transferred, the next step is to transfer the sound library to the emulator or target device. A new directory called voice is made on the device, and the wav files are uploaded into the voice directory using the Remote File Viewer.
8. The application can be run by clicking on Frontend.exe in the start menu. The Frontend GUI starts, and the Bengali text for Text-to-Speech conversion is supplied to it.
9. An example of a Bengali or Hindi text is “mera naam piyush hai”. This text is entered in the input area, and the performance is obtained by pressing the Analyse button and then the Generate button.
10. A sound file is generated as the output and is played on the emulator or the device.
11. The next section compares the real-time performance of this model with the other models.

5.2.2 Model 2
1. In this model the GDBMCE library is used.
Therefore the source code for the GDBMCE library is compiled first.
2. Successful compilation uploads gdbmce.dll to the \Windows folder on the target device or the emulator.
3. The epoch.txt and inton_bengali files are not transferred to the device. Instead the hash databases epoch1.db and inton_bengali.db are transferred to the device.
4. All remaining steps are the same as in Model 1.

5.2.3 Model 3
1. This model doesn't require the sound library (the folder consisting of wav files) to be transferred to the device. Instead, the voice database (voice.db) is transferred to the device.
2. All remaining steps are the same as in Model 2.

5.3 Comparison of performance
The following table gives the performance of each model according to the time taken by the Hindianalyser module (clicking on the Analyse button) and the time taken by the Hindiengine module (clicking on the Generate button).

The input text used for the comparison: “mera naam piyush hai”

Model Name   Hindianalyser Performance          Hindiengine Performance
Model 1      4 seconds (Tokens in file)         70 seconds (Very inefficient)
Model 2      4 seconds (Tokens in file)         10 seconds (Increase in efficiency)
Model 3      4 seconds (Token database added)   5 seconds (Best performance)

On the basis of this performance chart it can be concluded that Model 3 is the most efficient one, and it is to be used for the final implementation. Model 4 is also suggested; it is an extension of Model 3 and uses a hybrid approach, as discussed in the last chapter.

Chapter 6 CONCLUSIONS
This thesis provides the complete design and implementation of Embedded Shruti. It started with an introduction to the technologies that were used in this software product. After that, details of the different models were provided. The last chapter provides a comparison of the performance of the different models and the reason for choosing Model 3 as the final implementation. Embedded Shruti has several advantages over the desktop version.
To run the desktop version one needs a desktop computer system, which is costly compared to a Personal Digital Assistant like the Pocket-PC. The Pocket-PC is a mobile device, and therefore the software can be used on the move. A person with speech disorders only has to carry a PDA with Embedded Shruti installed on it. The person can communicate with others using the software, and since it is installed on a PDA rather than a desktop computer, the software can be taken anywhere.

I personally feel that Embedded Shruti realizes the dream of providing Shruti software to every person who needs it. For a person with speech disorders this software will be an integral part of life. Carry a Personal Digital Assistant having Embedded Shruti and you have the power to communicate with people despite your serious speech disorders.

Chapter 7 FUTURE WORK
Embedded Shruti is yet to be tested completely on a number of variable-length test inputs. The Hindianalyser module is tested completely, but the Hindiengine module is not rigorously tested. The software gives correct results for the input strings on which it has been tested so far, but more testing is still required.

The final version which is to be shipped to customers will save the voice database, intonation database and epoch database on a Flash card rather than on the secondary storage of the Pocket-PC device. Flash access takes comparatively less time than disc access, so it is expected to increase the performance. A Flash version of the code is to be written. In the Flash version of the code, the path of the database has to be changed. Presently, since the database is in the root directory of the Pocket-PC, the path is simply the name of the database; for example, when the voice database is opened, its path is “voice.db”. When a Flash card is connected to the device, the path will change to “\Storage Card\voice.db”. The Flash version of the code will incorporate this change.
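One way to keep the path change in a single place is to build every database path from a configurable storage root. The sketch below is a hedged illustration in portable C99: the helper build_db_path is a hypothetical name, and it uses narrow char strings, whereas the GDBMCE open call shown in Chapter 4 takes a WCHAR* name, so a wide-character equivalent would be needed on Windows CE.

```c
#include <stdio.h>

/* Hypothetical helper: joins a storage root and a database file name.
 * An empty root yields the bare name, matching the present layout where
 * the databases sit in the root directory. Returns 1 on success, 0 if
 * the result does not fit in out. */
int build_db_path(const char *root, const char *db_name,
                  char *out, size_t out_size)
{
    int n;
    if (root == NULL || root[0] == '\0')
        n = snprintf(out, out_size, "%s", db_name);
    else
        n = snprintf(out, out_size, "%s\\%s", root, db_name);
    return n >= 0 && (size_t)n < out_size;
}
```

With an empty root the result is voice.db, and with the root \Storage Card the result is \Storage Card\voice.db, so switching to the Flash version would only require changing the root string.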
The software has been tested only on the Pocket-PC emulator so far. Once the testing and debugging are done, it will be ported to Pocket-PC hardware. The databases, the dynamic link libraries (hindianalyser, hindiengine and gdbmce) and the application Frontend.exe will be transferred to the device using eMbedded Visual C++. The environment is changed from Pocket-PC (Emulation) to Pocket-PC (default device), and the device is connected to the development workstation through a COM port. After that the files can be transferred to the Pocket-PC.

Chapter 8 SOFTWARE SCREENSHOTS
8.1 Hindianalyser module on Standard SDK emulator (First port)
The output of the Hindianalyser module is shown in the second text box. 0204 is the token and 0 is the token type. The output is a repeating sequence of token and token type.

8.2 Model 1 on Pocket-PC Emulator
8.2.1 Hindianalyser module execution:
The text box contains the tokens and token types generated in this phase. The <token, token type> pairs generated are:
0204 0 0204172 3 0172 1 0172207 2 0207 0 -2 5 0198 0 0198164 3 0164 1 0164164 4 0164 1 0164204 2 0204 0 -2 5 0200 0 0200166 3 0166 1 -1 5 0205 0 0205168 3 0168 1 0168213 2 0213 0 -2 5 0216 0 0216173 3 0173 1 -2 5

8.2.2 Hindiengine Module execution:
After the Analyse button is clicked, the tokens are generated and displayed in the second text box. This completes the execution of the Hindianalyser module. The Hindiengine module is called when the Generate button is clicked. After Generate is clicked, the speech file is generated and played. The second text box is updated with the number of bytes in the tokens.txt file.

8.3 Model 3 on Pocket-PC emulator:
Model 1 screenshots are shown above. Please refer to Chapter 4 for details about the models. Model 3 is implemented using only hash databases, and no ordinary file operations are used. This model works efficiently compared to the other two models. Refer to the next page for Model 3 screenshots.
8.3.1 Hindianalyser module execution:
In Model 3, before the Frontend is executed, the database files are sent to the device. The database files are: inton_bengali.db (intonation database), epoch.db (epoch values) and the voice database (voice.db). This model gives the best performance, as shown in Chapter 5.

8.3.2 Hindiengine Module execution:
The next screenshot shows the application after the Hindiengine module is executed by clicking on the Generate button. It plays the sound and gives as output the total number of <token, token type> pairs in the tokens database. The number of <token, token type> pairs generated from the input text is shown in the text box. The Hindiengine module execution is efficient compared to Model 1 and Model 2. Therefore this model emerges as the winner, and it will be used in the final version of Embedded Shruti.

Chapter 9 REFERENCES
Embedded Shruti is an implementation project. The documents which helped me in this project are listed chapter-wise:

Chapter 1
None

Chapter 2
1. Douglas Boling, Programming Microsoft Windows CE (Second Edition).
2. Windows CE .NET documentation.
3. Pocket-PC SDK documentation.
4. Microsoft eMbedded Visual C++ documentation.
5. Microsoft SQL Server CE documentation.
6. GDBM man pages.

Chapter 3
1. Choudhury M., Rule-based Grapheme to Phoneme Mapping for Hindi Speech Synthesis. Presented at the 90th Indian Science Congress of ISCA, Bangalore, 2003.
2. Source code of the desktop version of Shruti.

Chapter 4
1. GDBM man pages.
2. Source code of the GDBM port to Windows CE.
3. Ronald Fagin, Jürg Nievergelt, Nicholas Pippenger, H. Raymond Strong, Extendible hashing - a fast access method for dynamic files, ACM Transactions on Database Systems, New York, NY, Volume 4, Number 3, 1979, pages 315-344.

The complete source code of the implementation is available in MediaLab, Indian Institute of Technology, Kharagpur. Take a look at the source code to understand how the software works. All the models are implemented separately, so you can carry out a performance evaluation of the respective models yourself.