Thesis - Department of Electrical Engineering and Computer Science

Chapter 1
OBJECTIVE
The ultimate objective of this project is to develop Embedded Shruti, the next generation
Text to Speech software for regional Indian languages such as Hindi and Bengali. The
keyword here is “next generation”. The desktop version of this software, developed by
MediaLab, IIT Kharagpur, is called “Shruti”. In this project the software is ported to
Windows CE so that it can be used on embedded devices such as handhelds, PDAs,
Pocket-PCs and other devices that run the Windows CE operating system. The key concern
while porting software to Windows CE is to obtain good performance from limited
resources. Windows CE offers a small Application Programming Interface compared to the
extensive Win32 API of the desktop, and embedded devices have limited main memory,
slow processors and little disk space. These constraints must be taken into account
while porting any software to the Windows CE platform. Several techniques are used to
increase the performance of the software; they are explained in this thesis, and the
performance evaluation shows how they improve the efficiency of the software. This
project is an important part of the MediaLab projects and extends the Text to Speech
software to the evolving area of handheld computers.
Chapter 2
INTRODUCTION
In the recent past the computing industry has seen tremendous growth in the area of
handheld systems. Handheld systems, also called Personal Digital Assistants (PDAs),
are gaining popularity. An important reason is that they are mobile devices: people can
carry them anywhere and use them for their work. Another important reason is that the
devices are relatively cheap compared to desktop computers or laptops. This is because
they have limited functionality and limited resources: less disk space, a slower
processor and much less main memory than desktop computers or laptops.
The most popular Personal Digital Assistants are the following:
1. Pocket-PC
2. Palmtop
There are two popular operating systems used in these Personal Digital
Assistants. They are the following:
1. Windows CE (compact edition)
2. Palm OS (Palm operating system used for Palmtops)
Embedded Shruti is developed for Pocket-PC running the Windows CE operating system. As
the name suggests, Windows CE (Windows Compact Edition) is an operating system
developed by Microsoft and customized for handheld devices. It has a subset of the
Win32 Application Programming Interface and also has Microsoft Foundation Classes
(MFC) support. Due to the MFC support, programmers who are conversant with traditional
Windows programming can migrate to Windows CE programming easily. There are some
constraints to keep in mind while programming with the Windows CE API, but anyone
familiar with the Windows API can develop applications for Windows CE quickly.
With this brief introduction of underlying hardware (Pocket-PC) and the operating
system (Windows CE) the rest of the chapter will introduce the technologies that were
used in the design and implementation of Embedded Shruti.
2.1 Windows CE Overview
Microsoft® Windows® CE is an open, scalable, 32-bit operating system (OS) that is
designed to meet the needs of a broad range of intelligent devices, from enterprise tools
such as industrial controllers, communications hubs, and point-of-sale terminals to
consumer products such as cameras, telephones, and home entertainment devices. A
typical Windows CE-based embedded system is targeted for a specific use, often runs
disconnected from other computers, and requires an operating system that has a small
footprint and a built-in deterministic response to interrupts.
Windows CE offers the application developer the familiar environment of the Microsoft
Win32® application programming interface (API), ActiveX controls, message queuing
(MSMQ) and Component Object Model (COM) interfaces, Active Template Library
(ATL), and the Microsoft Foundation Classes (MFC) Library. ActiveSync provides easy
connectivity between the desktop and the embedded device, whether by serial
connection, infrared port, or network cable. There is built-in support for multimedia
(including DirectX), communications (TCP/IP, SNMP, TAPI, and more), and security. A
variety of integrated applications, including Pocket Internet Explorer, Pocket Outlook,
and Pocket Word expose objects that allow the developer to extend and customize the
existing system, as well as extend the functionality of the application.
2.1.1 Important features of Windows CE 3.0
Windows CE 3.0 offers improved Windows compatibility, combined with hard real-time
processing support. New kernel services, such as support for nested interrupts, better
thread response, additional task priorities, and semaphores, let the operating system
respond immediately to events and interrupts. These real-time features make Windows
CE 3.0 ideally suited for industrial applications such as robotics, test and measurement
devices, and programmable logic controllers.
With greater storage and file-handling capabilities, interprocess communications, and
networking support, Windows CE 3.0 interoperates easily with desktop environments that
are based on Microsoft Windows NT® and Microsoft Windows 2000, which makes it the
optimal choice for an enterprise system that combines small mobile systems with high-performance desktops, servers, and workstations.
New hardware features for Windows CE 3.0 include:

Support for on-chip debugging.

The device I/O controls (IOCTL) function that allows a unique serial number on
each device.

Multiple execute-in-place (XIP) regions.
2.1.1.1 Kernel Services
Thread response times have been improved in Windows CE 3.0 by the tightening of the
upper bounds on scheduling latencies for high-priority Interrupt Service Threads (IST).
This improvement in thread response allows developers to know specifically when the
thread transitions occur, and aids them in creating new embedded applications by
increasing the capabilities of monitoring and controlling hardware in Windows CE.

Shorter Interrupt Service Routine (ISR) latencies. Because the kernel uses the
interrupt ID that is provided by the ISR to set the event that the IST is waiting on,
a short ISR latency is essential for a real-time system.

Support for nested interrupts. Support has been added for nested interrupts, which
allows interrupts at higher priority levels to be serviced immediately, instead of
potentially waiting for a lower-priority ISR to complete.

Increased priority levels. Additional priority levels (a total of 256) allow users
more flexibility in controlling the scheduling of embedded systems.

Support for semaphores. In addition to the currently supported mutexes and
events, Windows CE has been expanded to support semaphores.

The ability to change the quantum of any thread in the system. This includes
support for two APIs: CeSetThreadQuantum and CeGetThreadQuantum.

Kernel-level security. A new security model restricts access to system APIs that a
rogue application could call to damage the platform. An OEM can specify
whether modules and processes can run or not run and specify those that are fully
trusted on a particular platform. Two new APIs allow software developers to
retrieve the assigned trust level of a module or a process.
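To make the effect of the 256 priority levels concrete, the following is a minimal sketch of a ready queue with one FIFO list per level. It is a simplified model written in portable C++, not the Windows CE scheduler itself; the class ReadyQueue and its member names are hypothetical, and the sketch assumes that level 0 is the most urgent level and level 255 the least urgent.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical model of a ready queue with 256 priority levels.
// Assumption: level 0 is the most urgent, level 255 the least urgent.
class ReadyQueue {
public:
    static const int kLevels = 256;

    // Queue a thread (identified by name) at the given priority level.
    void makeReady(const std::string& thread, int priority) {
        assert(priority >= 0 && priority < kLevels);
        levels_[static_cast<std::size_t>(priority)].push_back(thread);
    }

    // Pick the oldest thread of the most urgent non-empty level.
    // Returns false when no thread is ready.
    bool pickNext(std::string& next) {
        for (std::size_t p = 0; p < levels_.size(); ++p) {
            if (!levels_[p].empty()) {
                next = levels_[p].front();
                levels_[p].pop_front();
                return true;
            }
        }
        return false;
    }

private:
    std::array<std::deque<std::string>, kLevels> levels_;
};
```

Threads at the same level are served in arrival order, while a thread at a more urgent level is always picked first. Time slicing (the thread quantum) is deliberately left out of this sketch.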
2.1.1.2 Files, Databases, and Persistent Storage
Windows CE 3.0 supports larger data storage systems, and larger files within those
systems.

The size of the object store has been increased to 256 MB (from 16 MB in
Windows CE version 2.1). Individual files now can be as large as 32 MB and a
database volume can be as large as 256 MB.

The number of objects that can be kept in the object store has been increased from
2^16 (65,536) to 2^22 (4,194,304). Because the allowable number of objects exceeds
the number of available object identifiers, freed object identifiers will be reused
for new objects, effective with version 3.0. However, an object identifier will not
be reused for at least 16 object allocations.
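The reuse rule for object identifiers can be illustrated with a small allocator. This is only a sketch of the stated rule (a freed identifier is not handed out again until at least 16 further allocations have taken place); the class ObjectIdAllocator is a hypothetical name, the real object store is not implemented this way as far as this thesis knows, and the limit on the total number of objects is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <utility>

// Hypothetical allocator that delays the reuse of freed identifiers.
class ObjectIdAllocator {
public:
    static const std::uint64_t kReuseDelay = 16;

    std::uint32_t allocate() {
        std::uint32_t id;
        // Freed identifiers are queued oldest first, so only the front
        // entry needs to be checked against the reuse delay.
        if (!freed_.empty() &&
            allocations_ - freed_.front().second >= kReuseDelay) {
            id = freed_.front().first;
            freed_.pop_front();
        } else {
            id = nextFresh_++;  // no identifier is old enough: issue a new one
        }
        ++allocations_;
        return id;
    }

    // Record the identifier together with the allocation count at release time.
    void release(std::uint32_t id) {
        assert(id >= 1 && id < nextFresh_);  // must have been issued earlier
        freed_.push_back(std::make_pair(id, allocations_));
    }

private:
    std::uint32_t nextFresh_ = 1;
    std::uint64_t allocations_ = 0;
    std::deque<std::pair<std::uint32_t, std::uint64_t> > freed_;
};
```

In this model an identifier released by the application comes back only on the seventeenth allocation after its release, which gives stale references time to be detected.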

Support has been added for querying VERSIONINFO resources to obtain version
and language-support information from files.
2.1.1.3 Interprocess Communications
Windows CE 3.0 provides interprocess communication support with COM and MSMQ.

Two separate COM modules offer two different levels of COM support: a limited-feature, small-footprint module that provides interprocess calls and a free-threading mode, and a larger module that supports out-of-process calls, full-threading model support, and Distributed Component Object Model (DCOM). The
DCOM module, with the exception of the security interfaces, is fully compatible
with Windows NT version 4.0 Service Pack 3 (SP3).

Enhanced MSMQ support in Windows CE 3.0 provides independent client
support for messaging applications. MSMQ for Windows CE is compatible with
Windows NT and Windows 2000 Message Queuing Services.
2.1.1.4 Communication Services
Communications enhancements to Windows CE include the following:

Lightweight remote access server (RAS) support. RAS uses Telephony
Application Program Interface (TAPI) to make the call, and then manages the
data through Point-to-Point Protocol (PPP) or Serial Line Internet Protocol
(SLIP).

Windows 2000 Transmission Control Protocol/Internet Protocol (TCP/IP) support.

Network Driver Interface Specification (NDIS) WAN support.
2.1.1.5 Communications Security
Security enhancements for Windows CE 3.0 include:

Microsoft Cryptography version 2.0 API (CAPI) subset. This is a set of
encryption APIs that allow development of applications that will work securely
over nonsecure networks, such as the Internet. The CAPI 2.0 subset will provide
certificate management support.

Microsoft Enhanced Cryptography Service (RSAENH), including 128-bit
encryption algorithms.

Cryptography Service Provider Development Kit.

X.509 certificate authentication.
2.1.1.6 Connectivity Services

Smart card PC/SC support. The Windows CE 3.0 smart-card subsystem conforms
to the Interoperability Specification for ICCs and Personal Computer Systems,
which makes it easy to port existing smart-card applications to Windows CE
devices.

Windows CE 3.0 ActiveSync version 3.1.
2.1.1.7 User-Interface Services
Enhancements to shell services in Windows CE include the following:

Microsoft Windows CE Handheld PC (H/PC), Professional Edition shell, which
includes the following applications:

Pocket Internet Explorer

Pocket Inbox

Pocket Word

Help system

Finer granularity componentization of common controls.

Ability to print mixed text and graphics.

Controls and dialog boxes that are resolution independent.

Ability for user to change the appearance of the user interface for notifications.

DirectDraw driver for Graphics Device Interface (GDI).
2.1.1.8 Internet Services
The new embedded Web Server provides many of the features of Microsoft Internet
Information Services (IIS), which have been optimized for the limited resources of an
embedded device. Features include:

Support for the HTTP/1.0 protocol, persistent connections, multiple connections,
file downloading, directory browsing, and multiple virtual paths.

A remote administration tool for configuration.

Basic and NT LAN Manager (NTLM) authentication support.

Internet Services Application Program Interface (ISAPI) extensions and filters.

Dynamic pages, through a subset of Active Server Pages (ASP pages).

For client-side Internet development, Windows CE 3.0 includes a subset of the
WinInet API, to support browser-based applications and FTP services.
2.1.2 Using Platform Builder
Platform Builder is used to create customized embedded platforms. By default, a
number of embedded hardware platforms are available, including platforms for the
following microprocessor families:
a) ARM processors.
b) MIPS processors.
c) SHX processors.
d) Intel x86 processors (used for debugging the application in a desktop environment).
Platform Builder is essential for creating customized embedded platforms that are
hardware specific and not compatible with the processors mentioned above.
An introduction to Platform Builder will help in porting the application to customized
embedded platforms in the future. Platform Builder helps the programmer to create
customized embedded platforms in the following ways:
1. Using Platform Builder with the desktop-based Windows CE emulator.
2. Using Platform Builder with a Windows CE PC-based hardware development
environment.
3. Creating and adding features to the platform made with Platform Builder.
2.1.2.1 Designing Operating System elements
Using the platform builder one can do the following:
a) Create a boot loader.
b) Create a Board Support Package.
c) Create a custom shell.
d) Select a configuration for the platform.
Getting Started
To build a platform based on the Windows CE operating system, the following steps are
carried out:
1. Create a platform using a Windows CE configuration with a standard development board
(SDB).
2. Customize the platform with additional project and catalog features.
3. Build the Operating System image for a hardware platform (CEPC). Platform Builder
includes boot loaders and Board Support Packages for the CEPC and many other
hardware development platforms.
4. After refining and debugging the platform on the hardware development platform, adapt
it for the custom target device.
5. Create a boot loader, OEM adaptation layer and board support package for the specific
target device. OEM adaptation layer (OAL) is the layer between KERNEL and
TARGET-PLATFORM FIRMWARE.
6. Rebuild the Operating System using new Board Support Package, download it into
target device and debug platform.
7. When the platform is complete export a Software Development Kit for the platform.
Application developers can import that software development kit into development tools
like eMbedded Visual C++.
Platform Creation
Platform creation is done using the New Platform Wizard, in which the following steps are
carried out:
a) Select a board support package for the device.
b) Choose an Operating System configuration and variant.
c) Select the features for the platform.
After initial settings have been chosen, the new platform wizard sets up the environment
with files that support operating system configuration that was selected. The features
included in the platform depend on the operating system configuration which was chosen.
There are 13 basic configurations included with Platform Builder and all of them are
available from new platform wizard.
After the pictorial view of the whole process, an example of building an Operating System
image for a Thin Client will be provided.
[Figure: Sequence of tasks in the process of creating a Windows CE based platform with
Platform Builder]
2.1.2.2 Thin Client Example
Platform Builder was used to generate a Windows CE image for an HCL Win Bee 4000JS
Thin Client, which has the following hardware details:
1. 266 MHz National Geode Processor.
2. On board VGA controller up to 4.0 MB shared VRAM.
3. 64 MB RAM upgradeable to 256 MB.
4. 16 MB flash memory.
Peripheral support:
1. 104 key PS/2 keyboard.
2. PS/2 and serial mouse support.
3. Audio : 16 bit stereo sound output
4. Microphone output
Communications:
1. TCP/IP with DHCP support.
2. 10/100 Mbps Ethernet Twisted pair (UTP RJ45) Interface.
3. Full PPP(Point to Point Protocol) support
Server Operating System compatibility/support:
1. Citrix Win-frame and other Citrix compatible Operating system.
2. Windows NT server 4.0,Terminal Server Edition and Windows 2000 server
family with CITRIX
Optional Support:
1. Smart card reader
2. ISDN support
3. LCD Display
4. USB ports
5. ISA and PCI Expansion slots
For the Thin Client hardware described above a Windows CE image was built, and the first
step was to choose an appropriate configuration. Platform Builder provides two different
configurations for Thin Clients:
1. Windows Thin Client – Minimal version of Microsoft Windows CE that includes
Core Operating System and features necessary to support Microsoft terminal
services.
2. Windows Thin Client with browser- This configuration supports the features of
Windows Thin Client along with a browser.
The Windows Thin Client configuration provides the starting point for remote desktop
terminals through support for the Microsoft Terminal Services client. It also has SNMP
and local browser capabilities, along with the possibility of adding additional Windows
CE Operating System features.
After the appropriate configuration is selected for the Thin Client hardware, the
next step is to select one or more available board support packages. Platform Builder
supports a number of Board Support Packages.
2.1.2.3 Board Support Package
A board support package is a software package that contains:
1. Boot Loader
2. OEM Adaptation Layer (OAL)
3. Device drivers for standard development board (SDB) or Hardware Reference
Platform
The BSP is the main part of Microsoft Windows CE based platforms and contains source files
and binary files.
The OEM adaptation layer (OAL) links to the kernel image and supports:
a) Initializing and managing the hardware.
b) Device drivers.
c) Boot loader.
d) Set of configuration files.
Use of Boot Loader:
1. Used during development to download Operating System image.
2. Once created, the BSP can be reconfigured through environment variables, and the .bib
and .reg files can be modified to achieve the reconfiguration.
[Figure: Interaction Diagram]
Microsoft Platform Builder provides sample BSPs for many SDBs that are readily
available in industry. By using the integrated BSP support, one can quickly evaluate the
new Operating System features in Microsoft Windows CE.
BSP developments:
1. Extensive infrastructure is provided for developing BSP for SDB made by the
developer or hardware.
2. Offers support for developing drivers for the platform.
3. Focus on customizing, refining and developing additional Operating System features.
Platform Builder provides BSPs for ten SDBs that are readily available for purchase. At
present Platform Builder supports 4 types of BSPs which are the following:
1. ARM BSPs
2. MIPS BSPs
3. SHX BSPs
4. x86 BSPs
There are several third party BSPs that are available. Windows CE supported Thin
Clients are also available commercially [ref 1].
2.1.2.4 Creating an OEM Adaptation Layer (OAL)
An OEM adaptation layer is a layer of code that resides between the Microsoft Windows
CE Kernel and the hardware of target device.
Physically, the OAL is linked with the kernel libraries to create the
kernel executable file. It also facilitates communication between the Operating System
and the target device. It includes code to handle the following:
1. Interrupts
2. Timers
3. Power Management
4. Bus abstraction
5. Generic I/O control codes (IOCTL)
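Of the responsibilities listed above, interrupt handling can be illustrated with a small model. As described in the kernel services section, the ISR provides an interrupt ID and the kernel uses that ID to set the event on which the IST is waiting. The sketch below models only this hand-off in portable C++: the class InterruptModel and all its members are hypothetical names, the IST is run as a direct call instead of a separate thread, and no real Windows CE API is used.

```cpp
#include <cassert>
#include <functional>
#include <map>

// Hypothetical model: hardware IRQ -> ISR -> interrupt ID -> event -> IST.
class InterruptModel {
public:
    typedef std::function<int()> Isr;   // returns an interrupt ID
    typedef std::function<void()> Ist;  // runs when its event is set

    void installIsr(int irq, Isr isr) {
        assert(static_cast<bool>(isr));  // an empty ISR cannot be installed
        isrs_[irq] = isr;
    }

    void attachIst(int interruptId, Ist ist) {
        assert(static_cast<bool>(ist));
        ists_[interruptId] = ist;
    }

    // Simulate the hardware raising an IRQ. Returns true if an IST ran.
    bool raise(int irq) {
        std::map<int, Isr>::iterator isr = isrs_.find(irq);
        if (isr == isrs_.end()) return false;  // no ISR for this IRQ
        int id = isr->second();                // ISR does minimal work, reports the ID
        std::map<int, Ist>::iterator ist = ists_.find(id);
        if (ist == ists_.end()) return false;  // nothing waits on this ID
        ist->second();                         // stands in for "set the event"
        return true;
    }

private:
    std::map<int, Isr> isrs_;
    std::map<int, Ist> ists_;
};
```

Keeping the ISR short and moving the real work into the IST is what keeps ISR latency low in this design.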
Creating the OAL is one of the more complex tasks in the process of getting a Windows
CE Operating System to run on a new hardware platform.
Easiest way to create an OAL:
1. Copy the OAL implementation from a working platform and then modify it to suit the
specific requirements of the platform under consideration.
2. If a new OAL must be created from the beginning (no similar implementation exists),
it is more useful to approach the development process in stages.
Each stage adds a little more functionality than the previous one and provides a
convenient separation point where new features can be fixed and validated.
Steps of creating a new OAL:
1. Preparing BSP files for building the kernel.
a) Put the necessary directories and files to build the OAL and kernel image in place.
b) config.bib will be created or modified in this step.
2. Creating a Base OAL
a) Initialize the platform at startup.
b) Enable the serial port for debugging.
c) Initialize the communication settings.
d) Goal is to provide basic system initialization code that will support further debugging
and to ensure that basic initialization code is complete and consistent with target device.
3. Enhancing OAL functionalities:
a) Enhance the Interrupt Service Routines.
b) Manage clock and timers.
c) Provide alternate debugging options such as Ethernet.
d) Enable power management.
e) Provide platform information for applications.
Goal:
a) Implement the remainder of platform support functions.
b) Ensure the fact that full Operating System boot is possible.
4. Completing an OAL: Implement any additional features that the developer wants to add
to it.
2.1.2.5 Creating a Boot loader
The boot loader is an integral part of the Windows CE Operating System development process
and in some cases part of the final product solution. The purpose of the boot loader is the
following:
1. Place the Operating System image into memory.
2. Jump to Operating system startup routine.
The boot loader can obtain the OS image in a number of different ways:
a) Through a cabled connection (such as Ethernet).
b) Through a USB or serial port.
c) From a local storage device such as compact flash, a hard disk or a
disk-on-chip.
It may store the image in Random Access Memory or in nonvolatile storage such as:
1. Flash
2. EEPROM
3. Storage device
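The two purposes of the boot loader, placing the image into memory and jumping to its startup routine, can be sketched as follows. This is a simplified model in portable C++ under stated assumptions: the types OsImage, ImageSource, LocalStorageSource and BootLoader are hypothetical, the image is a flat byte array with a single entry offset (not the real Windows CE image format), and the jump is represented only by computing the entry address.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat image: raw bytes plus the offset of the startup routine.
struct OsImage {
    std::vector<std::uint8_t> bytes;
    std::size_t entryOffset;
};

// Hypothetical transport (Ethernet, serial, local storage, ...).
struct ImageSource {
    virtual ~ImageSource() {}
    virtual OsImage fetch() = 0;
};

// Example source: an image that is already held in local storage.
struct LocalStorageSource : ImageSource {
    OsImage stored;
    explicit LocalStorageSource(const OsImage& image) : stored(image) {}
    OsImage fetch() override { return stored; }
};

class BootLoader {
public:
    explicit BootLoader(std::size_t ramSize) : ram_(ramSize, 0) {}

    // Step 1: place the image into memory. Returns false if it is invalid
    // or does not fit at the requested load address.
    bool load(ImageSource& source, std::size_t loadAddress) {
        OsImage image = source.fetch();
        if (image.bytes.empty() || image.entryOffset >= image.bytes.size()) return false;
        if (loadAddress > ram_.size() || image.bytes.size() > ram_.size() - loadAddress) return false;
        std::copy(image.bytes.begin(), image.bytes.end(),
                  ram_.begin() + static_cast<std::ptrdiff_t>(loadAddress));
        entry_ = loadAddress + image.entryOffset;
        loaded_ = true;
        return true;
    }

    // Step 2: the address a real loader would jump to.
    std::size_t entryAddress() const {
        assert(loaded_);
        return entry_;
    }

    std::uint8_t peek(std::size_t address) const { return ram_.at(address); }

private:
    std::vector<std::uint8_t> ram_;
    std::size_t entry_ = 0;
    bool loaded_ = false;
};
```

Because the loader sees only the ImageSource interface, the same loading logic serves every way of obtaining the image listed above.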
Boot loader is typically used during the development process to save time. Rather than
transferring the developmental image to the target device through a manual process such
as flash programming, the boot loader allows the developers to quickly download the new
development image to the target device.
In many final product solutions the boot loader is removed from the product and the
Operating System image is stored on the device and bootstrapped by system-reset
process.
However, some platforms do not efficiently support this ability, such as x86 platforms
or platforms that perform pre-boot tasks.
Most common form of boot loader is one that downloads Operating System image over
the Ethernet into a target device RAM and much documentation is available on that.
2.1.2.6 Device Driver Development
The device drivers that are included in every Windows CE operating system image are
responsible for direct communication with devices.
A device is a physical or logical entity that requires:
a) Control
b) Resource Management
c) Both (a) and (b) from Operating System
A device driver is a software module that manages the operation of a:
1. Device
2. Protocol
3. Service
A device driver also manages virtual or logical devices. A virtual device exposes a
physical device interface, even though there is no physical device to manage. A device
driver for a virtual device is indistinguishable from a device driver for a physical device.
File system drivers are an example of virtual device drivers.
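The idea that a virtual device is indistinguishable from a physical one at the interface level can be shown with a short sketch. The interface StreamDevice below is a hypothetical open/read/write/close abstraction in portable C++, not the actual Windows CE driver entry points, and LoopbackDevice is an invented virtual device that has no hardware behind it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical device interface seen by client code.
class StreamDevice {
public:
    virtual ~StreamDevice() {}
    virtual bool open() = 0;
    virtual std::size_t write(const std::string& data) = 0;  // bytes accepted
    virtual std::string read(std::size_t maxBytes) = 0;
    virtual void close() = 0;
};

// A virtual device: no hardware, only an in-memory loopback buffer.
class LoopbackDevice : public StreamDevice {
public:
    bool open() override {
        if (open_) return false;  // already opened
        open_ = true;
        return true;
    }
    std::size_t write(const std::string& data) override {
        if (!open_) return 0;
        buffer_ += data;
        return data.size();
    }
    std::string read(std::size_t maxBytes) override {
        if (!open_) return std::string();
        std::string out = buffer_.substr(0, maxBytes);
        buffer_.erase(0, out.size());
        return out;
    }
    void close() override { open_ = false; }

private:
    bool open_ = false;
    std::string buffer_;
};

// Client code depends only on the interface, so it cannot tell whether
// the device behind it is physical or virtual.
std::string echoThrough(StreamDevice& device, const std::string& message) {
    bool opened = device.open();
    assert(opened);
    (void)opened;
    device.write(message);
    std::string reply = device.read(message.size());
    device.close();
    return reply;
}
```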
Typically one can characterize drivers by the device interface they expose. In the simplest
case there is one interface exposed downward to hardware and one interface upwards to
the applications.
Interface to Hardware

Hardware interface or bus interface
Interface to Application

Device interface or client interface
Applications, other drivers or the device manager can manipulate the device interfaces.
The provider of the interface determines how these modules manipulate the client
interface.
Buses are responsible for loading the drivers for the devices on their buses. Bus
enumeration is the process of examining the bus and then loading appropriate drivers.
Root bus driver is a registry enumerator.
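Bus enumeration by a registry enumerator can be pictured as a walk over registry entries, each naming a driver to load. The sketch below is a model in portable C++ under stated assumptions: the struct DriverEntry, its fields and the function enumerateAndLoad are hypothetical, loading is represented by recording the module name, and the rule that entries with a lower order value load first is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry entry describing one driver on the root bus.
struct DriverEntry {
    std::string key;  // registry key name
    std::string dll;  // driver module to load
    int order;        // assumed rule: lower values load first
};

static bool loadsBefore(const DriverEntry& a, const DriverEntry& b) {
    return a.order < b.order;
}

// Examine every entry and "load" the driver it names.
// Returns the module names in the order in which they were loaded.
std::vector<std::string> enumerateAndLoad(std::vector<DriverEntry> entries) {
    std::stable_sort(entries.begin(), entries.end(), loadsBefore);
    std::vector<std::string> loaded;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        assert(!entries[i].key.empty());      // every entry must have a key name
        if (entries[i].dll.empty()) continue; // entry names no driver: skip it
        loaded.push_back(entries[i].dll);
    }
    return loaded;
}
```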
Different processes load different drivers. The following tables show the processes that
load drivers and what drivers they load.
Process                                  Drivers
Graphics, Windowing and Event            Battery drivers, Display drivers,
Subsystem (GWES)                         Notification LED drivers, Printer drivers
Device Manager (Device.exe)              Audio drivers, Keyboard drivers, Mouse
                                         driver, Serial drivers, PC card drivers,
                                         USB devices and any other driver that
                                         exposes the stream interface
File System (Filesys.exe)                File System Drivers
The device driver source code is separated into:
1. Platform-dependent code.
2. CPU support package (CSP) drivers.
3. Common drivers.
2.1.2.7 Creating a Board Support Package
Windows CE provides the basic infrastructure that allows one to rapidly and easily create
BSPs. Device driver libraries, such as the microprocessor-native libraries, are shipped with
Platform Builder.
Microprocessor Native Libraries:
The microprocessor native libraries consist of device drivers for highly integrated
microprocessors and their native peripherals. For example, the StrongARM microprocessor
and its companion chip integrate many peripherals, such as LCD, serial port and USB.
All SDBs that use a specific microprocessor use the same set of microprocessor-native
drivers.
2.1.2.8 Thin Client example revisited
In the Thin Client example discussed above, a configuration was chosen after carefully
studying the hardware characteristics of the Thin Client. After opening Platform Builder,
carry out the following steps:
1. From the File menu, choose New Platform; the New Platform Wizard will appear.
2. Choose Next and select one or more available BSPs.
3. On the BSP page choose Next, and in the platform configuration dialog box enter a
name for the platform.
4. Select a platform configuration from the available configurations area, which in this
example is Windows Thin Client with browser.
After these steps platform creation is complete. The next step is
to build the platform:
The OS image will be built based on the platform that has been configured. To build an
Operating System image select retail or debug build configuration, modify the platform
settings as needed, such as enabling kernel debugger, and then build the image.
After the platform is configured, the platform can be built using the platform builder
integrated development environment (IDE).
Platform Builder creates an Operating System image in four stages:
1. Sysgen phase.
2. Feature build phase.
3. Release copy phase.
4. Make image phase.
The build system generates header files, links modules, copies the resulting modules to a
release directory, and generates a binary image file.
Static libraries, as well as code supplied by the developer or third party vendors, are
combined into a binary file that is downloaded onto the device.
Sysgen Phase: Each feature selected from catalog has a corresponding sysgen variable. If
a feature is included in platform, IDE sets the corresponding sysgen variable during the
sysgen phase of the build process.
Build system uses these variables to link the corresponding static libraries into modules.
The system also filters the system header files, creating headers that only contain
prototypes for the functions exported by the developer’s platform. Import libraries for the
system modules are also created during this phase.
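The sysgen mechanism described above (a selected feature sets a variable, and the variable decides which static libraries are linked) can be sketched as two small functions. This is an illustrative model in portable C++, not Platform Builder code: the struct CatalogFeature, both functions, and every feature, variable and library name used with them are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical catalog entry: all three names are invented for illustration.
struct CatalogFeature {
    std::string name;            // feature as listed in the catalog
    std::string sysgenVariable;  // variable set when the feature is selected
    std::string staticLibrary;   // library linked when the variable is set
};

// Step 1: set the sysgen variable of every selected catalog feature.
std::set<std::string> setSysgenVariables(const std::vector<CatalogFeature>& catalog,
                                         const std::set<std::string>& selected) {
    std::set<std::string> variables;
    for (std::size_t i = 0; i < catalog.size(); ++i) {
        assert(!catalog[i].sysgenVariable.empty());
        if (selected.count(catalog[i].name) != 0) {
            variables.insert(catalog[i].sysgenVariable);
        }
    }
    return variables;
}

// Step 2: link only the libraries whose sysgen variable is set.
std::vector<std::string> librariesToLink(const std::vector<CatalogFeature>& catalog,
                                         const std::set<std::string>& variables) {
    std::vector<std::string> libraries;
    for (std::size_t i = 0; i < catalog.size(); ++i) {
        if (variables.count(catalog[i].sysgenVariable) != 0) {
            libraries.push_back(catalog[i].staticLibrary);
        }
    }
    return libraries;
}
```

Features that are not selected contribute neither a variable nor a library, which is how the image is kept small.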
Feature Build Phase: After sysgen phase, feature build phase is run. During this phase,
all user features including platform builder project (.pbp) files, source files and makefile
(.mak) files are compiled and built.
Release Copy Phase: Build system copies all the files needed to make an OS image to the
release directory. The modules and files created during the sysgen phase are copied to the
directory first, followed by the files created by the feature build phase.
Make Image Phase: The project-specific files, which include
Project.bib
Project.dat
Project.db
Project.reg
are copied into the release directory.
The information in the BIBInfo tab of the platform settings dialog box is then added for each
module or file. During this phase, the files in the release directory are combined into the
binary image file Nk.bin. If any modification is made, the image must be made again.
Platform Downloading
After the platform is built, the OS image associated with the platform is downloaded
to a target device. The IDE provides a mechanism for downloading an OS image to the
target device over several types of communication hardware.
To download an OS image to a target device, there must be a connection from the
development workstation to the target device.
The Target menu in the IDE provides the functionality to download an
OS image to a target device. Before downloading, configure a connection to the target
device. To download using Ethernet, the development workstation and the target device
must be on the same subnet. If the subnets differ, the target device cannot be
connected and debugging the OS image will not be possible.
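The same-subnet requirement can be checked before attempting a download. The sketch below, in portable C++, applies the subnet mask to both IPv4 addresses and compares the results; the function names parseIpv4 and sameSubnet are hypothetical helpers and are not part of Platform Builder.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Parse a dotted-decimal IPv4 address into a 32-bit value.
// Returns false on malformed input.
bool parseIpv4(const std::string& text, std::uint32_t& out) {
    std::istringstream in(text);
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        int octet = -1;
        char dot = 0;
        if (!(in >> octet) || octet < 0 || octet > 255) return false;
        if (i < 3 && (!(in >> dot) || dot != '.')) return false;
        assert(octet >= 0 && octet <= 255);  // invariant: fits in 8 bits
        value = (value << 8) | static_cast<std::uint32_t>(octet);
    }
    char extra = 0;
    if (in >> extra) return false;  // trailing characters are not allowed
    out = value;
    return true;
}

// Two hosts are on the same subnet when their network parts are equal.
bool sameSubnet(const std::string& a, const std::string& b, const std::string& mask) {
    std::uint32_t ia = 0, ib = 0, im = 0;
    if (!parseIpv4(a, ia) || !parseIpv4(b, ib) || !parseIpv4(mask, im)) return false;
    return (ia & im) == (ib & im);
}
```

For example, with mask 255.255.255.0 a workstation at 192.168.1.10 can reach a target at 192.168.1.200, but not one at 192.168.2.10.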
Transfer the Operating System image thus made to the Thin Client at startup. To
make a connection to the device, go to the Target menu and configure the remote connection.
Choose the Services tab and choose a connection from the active named box. This completes
the deployment of the operating system image to the Thin Client. This process is the
general way to build a Windows CE Operating System image for a given hardware
platform. Often the whole process is not necessary, since for some hardware
platforms such as Pocket-PC a Windows CE image has already been built and tested.
Such hardware platforms come preloaded with a Windows CE image and, as mentioned
above, they export a software development kit which developers can use to build
software for Pocket-PC. Embedded Shruti is also developed using the Pocket-PC
software development kit, not by building the Windows CE image from scratch. But
every embedded software developer should have an idea of the whole design process so
that the software can be ported to custom platforms if required in the future. The process
was explained using the Thin Client example, which was the first
work I carried out in this project.
2.1.3 Using Standard SDK
When included in the platform under development, the standard software development kit
(SDK) for Microsoft® Windows® CE provides a common subset of features that allow
an application written to conform to the standard SDK to run on a display-based
Windows CE platform. To maintain compatibility with the standard SDK, an application
must function with only the features provided in the standard SDK. Using additional
features will make the application incompatible with the standard SDK. To implement
the standard SDK on a Windows CE platform, the standard SDK for a Windows CE
feature must be added to a display-based platform. The standard SDK is not compatible
with headless devices and is therefore limited to display-based platforms. When added to
an operating system (OS) image, the standard SDK will automatically include any
features associated with the standard SDK as well as their dependencies. It will also add a
registry flag to the image that indicates that the standard SDK has been implemented on
that image. This allows any application written for the standard SDK to verify that a
particular Windows CE platform supports the standard SDK.
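The compatibility rule described above can be summarised in a short model: an application conforms if it needs nothing outside the standard SDK, and it can run on a platform whose image carries the standard SDK flag. The sketch is portable C++ under stated assumptions: the types FeatureSet and PlatformImage, the three functions and all feature names used with them are hypothetical, the boolean member stands in for the registry flag, and feature dependencies are not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

typedef std::set<std::string> FeatureSet;

// Hypothetical OS image: its features plus a stand-in for the registry flag.
struct PlatformImage {
    FeatureSet features;
    bool standardSdkFlag;
    PlatformImage() : standardSdkFlag(false) {}
};

// Adding the standard SDK pulls in all of its features and sets the flag.
void addStandardSdk(PlatformImage& image, const FeatureSet& standardSdk) {
    image.features.insert(standardSdk.begin(), standardSdk.end());
    image.standardSdkFlag = true;
}

// An application conforms when it needs nothing outside the standard SDK.
bool conformsToStandardSdk(const FeatureSet& appNeeds, const FeatureSet& standardSdk) {
    return std::includes(standardSdk.begin(), standardSdk.end(),
                         appNeeds.begin(), appNeeds.end());
}

// The application checks the flag, then relies only on standard SDK features.
bool canRun(const FeatureSet& appNeeds, const FeatureSet& standardSdk,
            const PlatformImage& image) {
    assert(!standardSdk.empty());
    return image.standardSdkFlag && conformsToStandardSdk(appNeeds, standardSdk);
}
```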
When included in a display-based platform, the standard SDK automatically incorporates
all associated features into that platform. The following list shows the features that
compose the standard SDK. In the next chapter the desktop version of Shruti is
introduced; initially one of its modules was ported to an emulator
running Windows CE with standard SDK support. Some screenshots will show the
primitive version of the Hindianalyser module ported to an emulator running Windows
CE and supporting the standard software development kit.
2.2 Pocket-PC SDK (Software Development Kit)
After the primitive version built on the standard software development kit, the next
version was built for the Pocket-PC emulator, which runs Windows CE and provides the
Pocket-PC software development kit. Chapter 4 gives the details of the port to Pocket-PC,
but the Pocket-PC software development kit is introduced here because it is the SDK
used to develop software for the Pocket-PC platform quickly.
2.2.1 Introduction to Pocket PC 2002
Microsoft® Windows® Powered Pocket PC 2002 is a personal companion for mobile
device users. It offers users the following:

An easy, seamless setup experience

An intuitive user interface

Powerful information management

A robust communications platform

Customizable user interface

Best companionship to Microsoft Outlook®
2.2.2 Working with the Pocket PC 2002 Emulator
The Microsoft® Windows® Powered Pocket PC 2002 SDK includes a new emulation
environment. This environment provides a virtual computer running Pocket PC 2002
software compiled for the Intel x86 processor. The virtual computer duplicates hardware
that runs Microsoft Windows CE on an x86-based PC.
Previous Windows CE emulators relied on special emulator compilers that passed
instructions to the underlying Microsoft Windows NT® operating system. This led to
occasional dramatic differences in appearance and function between the emulator and a
Pocket PC device. Because the new emulator is powered by the Windows CE operating
system and by Pocket PC components, a much higher level of fidelity exists between an
actual Pocket PC device and the device emulation environment.
New APIs: The Pocket PC 2002 platform supports the following newly exposed APIs.

ActiveSync
This API provides ActiveSync 3.5 functionality for the Pocket PC 2002.

Windows CE Messaging
This API provides a set of interfaces to facilitate the development of messaging
applications for the Pocket PC 2002.

Connection Manager
This API provides the functionality necessary to centralize and automate the
establishment and management of the network connections on a Pocket PC 2002 device.

HTML Control
This API provides the functionality to customize the HTML viewer control; this API also
includes an XML parser.

MIDI
This API provides the capability to play MIDI files on a Pocket PC 2002 device. This
API also provides the functionality to create custom sounds such as a DTMF tone or a
busy signal.

Object Exchange (OBEX)
This API provides one method for transmitting information between two Pocket PC 2002
devices. The OBEX protocol requires fewer resources than an HTTP server and transfers
information by using the infrared port on the Pocket PC 2002 device.

Telephony
This API is a superset that includes the following sections:
o Assisted TAPI. Allows applications to make telephone calls without requiring the
details of the services of the full Telephony API.
o Extended TAPI. Extends wireless functionality to include such things as asking for
signal strength, choosing the cellular network, and more.
o Phone API. Provides the functionality to access a call log and creates a custom report
from the information in that log.
o Subscriber Identity Module (SIM) Manager. Allows access to information stored on
the SIM card.
o Short Message Service (SMS). Enables wireless devices to send and receive short
messages through an SMS Center.
o Telephony Service Provider. Supports communications device control through a set
of exported service functions.
Emulator: The Pocket PC 2002 SDK includes a new emulation environment. This
environment provides a virtual machine running Pocket PC 2002 software compiled for
the x86 processor. The virtual machine duplicates hardware known as a CEPC, which is a
hardware configuration that runs Windows CE on an Intel x86-based PC.
2.2.3 Programming Pocket PC 2002
Microsoft® Windows® CE operating system version 3.0 for Windows Powered
Pocket PC 2002 provides a powerful and easily portable platform for mobile professional
users. It combines the power of a personal information manager (PIM), a compact
software package fully compatible with Windows-based desktop computers, and a
Windows development environment. Pocket PC 2002 allows users to keep their personal
and business information up to date and close at hand by using a sophisticated hardware
design to fill the need for a more portable and less expensive device than traditional
laptop or palmtop computers.
Pocket PC 2002 is designed to quickly access, record, and transmit information at any
time. The software bundled with Pocket PC 2002 manages contacts, appointments, and
other personal and business information. By using the Voice Recorder application, users
can capture ideas and thoughts as they occur. Pocket PC 2002 software can also store
telephone numbers and short messages, and it can send and receive e-mail messages by
using Internet technologies. All these features are fully compatible with the user's
Windows-based desktop applications.
Pocket PC 2002 gives the developer access to a rich development environment. The
Windows CE operating system is based on the Microsoft Win32® application
programming interface. The applications can be created by using Microsoft eMbedded
Visual Tools (eVT) 3.0, which are special versions of the familiar Microsoft
Visual Studio® tools that the developers may have used to write applications for desktop
Windows. The developer can choose to develop applications by using Microsoft
eMbedded Visual C++® or Microsoft eMbedded Visual Basic®. Embedded Shruti is
developed using Microsoft eMbedded Visual C++®.
Pocket PC 2002 supports a variety of input technologies, including freestyle drawing,
handwriting character recognition, or a graphical representation of a keyboard for use on
a touch screen.
2.2.4 Pocket PC 2002 Hardware
Original equipment manufacturers (OEMs) have a variety of hardware options when
building Pocket PC 2002 devices. The following illustration shows the different hardware
components available for a typical Pocket PC 2002.

Touch screen
The touch screen is an LCD covered by a resistive touch panel. The LCD has a portrait
orientation with a 240 x 320 pixel resolution, which allows users to see interface
elements clearly. The dot pitch for Pocket PC 2002 is .22 to .24, depending on the OEM.
Tapping the touch screen with a stylus or finger sends the same kind of messages that
clicking with the left mouse button does on a desktop computer, although cursor support
is limited to a spinning hourglass for wait signals. The user can also select and drag
items. In order to sense quick changes in user input, the touch screen has a refresh rate of
at least 100 samples per second. Pocket PC 2002 also supports up to a 16 bit per pixel
color depth.
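To make the display figures concrete, the following sketch works out how much memory one full-screen frame occupies at the stated resolution and colour depth. The function name frame_buffer_bytes is hypothetical and is not part of any Windows CE or Pocket PC API; it only performs the arithmetic.

```c
#include <stddef.h>

/* Hypothetical helper (not a Windows CE API): bytes needed to hold one
   full frame of width x height pixels at the given bits per pixel. */
size_t frame_buffer_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    /* Round each row up to a whole number of bytes (matters below 8 bpp). */
    size_t bytes_per_row = (width * bits_per_pixel + 7) / 8;
    return bytes_per_row * height;
}
```

For the 240 x 320 screen at 16 bits per pixel this gives 240 * 320 * 2 = 153,600 bytes (150 KB), which is worth keeping in mind if an application holds an off-screen copy of the display.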

Stylus and keyboard
Pocket PC 2002 does not have a standard, physical keyboard. Text input is accomplished
by using the input panel and the stylus. Generally, the input panel is a standard window
on the touch screen that displays an input method, allowing users to enter data in a
variety of ways. Pocket PC 2002 software includes a simplified QWERTY keyboard
input method and a handwriting recognition input method.
The stylus is a pointer for accessing a touch screen and input methods. The stylus has a
smaller point than a user's finger, yet will not scratch the touch screen.
The OEM or a user can add additional input methods. For example, an independent
software vendor (ISV) could create an input method for tapping in Morse code. The user
could purchase the Morse code input method and install it at home.

Navigation controls
Pocket PC 2002 comes with several navigation controls, which can be pressed, held
down, double-clicked, or pressed in combination with other controls. The following table
shows the default Pocket PC 2002 navigation controls.
Navigation control | Description
On/off | Turns Pocket PC 2002 on and off
Action button | Acts as the ENTER key
Record button | Activates the Voice Recorder application
Program button(s) | Launches an application
Up | Acts as an UP ARROW key
Down | Acts as a DOWN ARROW key
Some OEMs may add a silkscreen region, which is an extension of the resistive touch
panel, to cover a non-LCD region of a Pocket PC 2002 case. This region is usually
directly below the LCD. It is called a silkscreen region because it often has buttons
applied by using a silkscreen process. While a silkscreen button is technically a software
control, Pocket PC 2002 software does not distinguish between a silkscreen button and
other navigation controls; both types of buttons send the same virtual key messages. The
OEM is responsible for the driver that handles the silkscreen region.

Audio input
Depending on the device category, some Pocket PC 2002 devices will not support audio
input or playback.
For devices that support audio, a built-in microphone is included. It is usually located on
the front of the device, so that a user can view the screen while recording. The hardware
supports 16-bit sampling at 8 kHz, and the codec (the compression and decompression
software) compresses the recording to 2.4 Kbps. The codec software is identical to a
desktop computer's audio compression manager (ACM). OEMs may add a microphone
jack for an external microphone. The jack is transparent to the software.
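The audio figures above can be put side by side with a little arithmetic. The sketch below computes the uncompressed bit rate of the recording format and the storage needed for a given duration. It assumes a single (mono) channel, which the text does not state explicitly, and the function names are hypothetical.

```c
/* Raw (uncompressed) PCM bit rate in bits per second. */
unsigned long pcm_bits_per_second(unsigned long sample_rate_hz,
                                  unsigned bits_per_sample,
                                  unsigned channels)
{
    return sample_rate_hz * bits_per_sample * channels;
}

/* Bytes of storage needed for `seconds` of audio at a given bit rate. */
unsigned long audio_bytes(unsigned long bits_per_second, unsigned long seconds)
{
    return bits_per_second / 8 * seconds;
}
```

With 16-bit samples at 8 kHz on one channel the raw rate is 128,000 bits per second, so one minute of uncompressed audio needs 960,000 bytes. Taking 2.4 Kbps as 2,400 bits per second, the same minute needs 18,000 bytes after compression, a reduction of roughly 53 to 1.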

Audio output
The developer can use the built-in speakers to play sounds associated with notification
events. Speakers can also be used to play voice recordings or other .wav files, or for dual
tone multifrequency (DTMF) dialing output. Some OEMs may add a headphone jack for
headphones, external speakers, or other audio-out hardware. This jack is transparent to
the software.

Printing
Printing is not currently supported on the Pocket PC 2002.

Notification options
An OEM may provide several notification options for a Pocket PC 2002: audio, a
flashing LED, or vibration controls such as those on cellular phones and pagers.
Although all three of these methods are supported by Pocket PC 2002, all except audio
notification are OEM options.

Power
Because a Pocket PC 2002 is portable, battery life is very important. Pocket PC 2002 can
run many hours on its standard battery source, and also has a backup battery to avoid data
loss if the primary battery loses power.

CPU
Pocket PC 2002 uses the ARM family of CPUs. The ARM processors offer an excellent
combination of high performance and low power consumption.

Memory
All Pocket PC 2002 devices come with at least 24 megabytes (MB) of ROM and 16 MB
of RAM. The upgrade edition offered by some OEMs for their Windows Powered Pocket
PC devices is tailored to fit in the 16 MB of Flash RAM available on those upgradeable
devices.
Because it is important to conserve memory on a Pocket PC 2002, many Pocket PC 2002
operating system (OS) components are compressed in ROM. When a user needs a
component, the operating system decompresses that component and transfers it to RAM.
Because of the time required for decompression and transfer, compressed files slow
performance.

Built-in serial port
Pocket PC 2002 comes with a built-in 16550 (or equivalent) serial port and some OEMs
may include a second serial port. Applications use the serial port for communication
between a Pocket PC 2002 and other hardware devices at baud rates from 19.2 kilobits
per second (Kbps) to 115.2 Kbps. A Pocket PC 2002 can connect to a desktop computer
by using a serial cable or an optional docking cradle, available from many Pocket PC
2002 manufacturers, that is connected to the desktop computer. Some Pocket PC 2002
devices support data communications through a modem connected to the cradle.
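The baud rates quoted above translate directly into the time needed to move a file, such as a speech database, from the desktop computer to the device. The sketch below assumes the common 8N1 framing (one start bit, eight data bits, one stop bit, so ten line bits per payload byte) and ignores protocol overhead. Both are assumptions made for illustration, and the function names are hypothetical.

```c
/* Payload bytes per second on an asynchronous serial line.
   bits_per_frame is 10 for 8N1 framing (an assumption of this sketch).
   baud must be at least bits_per_frame. */
unsigned long serial_bytes_per_second(unsigned long baud, unsigned bits_per_frame)
{
    return baud / bits_per_frame;
}

/* Seconds, rounded up, needed to move `bytes` of payload at the given rate. */
unsigned long serial_transfer_seconds(unsigned long bytes, unsigned long baud,
                                      unsigned bits_per_frame)
{
    unsigned long rate = serial_bytes_per_second(baud, bits_per_frame);
    return (bytes + rate - 1) / rate;
}
```

Under these assumptions a 1 MB file (1,048,576 bytes) takes about 92 seconds at 115.2 Kbps and about 547 seconds at 19.2 Kbps.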

Infrared communications serial port
Pocket PC 2002 includes a serial port that conforms to Infrared Data Association (IrDA)
specifications. Pocket PC 2002 devices can communicate with other Pocket PC 2002
devices, other Windows CE devices, Palm™ OS-based handheld computing devices, or
desktop computers.
2.3 Microsoft eMbedded Visual C++
Microsoft® eMbedded Visual C++ is a member of the Microsoft eMbedded Visual Tools
3.0 family of products which also has eMbedded Visual Basic. These products provide
complete integrated development environments for creating applications to run on the
Windows CE operating system.
2.3.1 Introduction
Microsoft® eMbedded Visual C++ enables programmers to develop Windows CE-based
applications using an integrated development environment (IDE) similar to that used in
developing desktop Visual C++ applications. This IDE, however, contains Windows
CE-specific versions of many of the standard development tools that are used to create, test,
and refine applications. It also includes a variety of tools that can be used to develop new
software uniquely appropriate for Windows CE platforms and devices.
Custom-built for developing Windows CE applications, the eMbedded Visual C++ IDE
is easy to learn and will be familiar to programmers who have experience with other
members of the Visual C++ family. Applications can be created with eMbedded Visual
C++ to run on the Handheld PC Pro (H/PC Pro), Palm-size PC 1.2, and Pocket PC
platforms. Developers can also use eMbedded Visual C++ to create applications that
run on custom Windows CE-based platforms, or within a desktop emulator that simulates
a Windows CE-based platform. Embedded Shruti is coded using eMbedded Visual C++
for the Pocket PC platform and tested on the Pocket PC emulator.
2.3.2 Managing Projects and Workspaces
In eMbedded Visual C++, applications are developed in a workspace. Applications are
assembled either by creating a project and a workspace simultaneously or by creating a
workspace and then adding projects to it. After a workspace is created, the developer can
add new projects, new configurations to an existing project, and subprojects. eMbedded
Visual C++ development is organized hierarchically into workspaces, projects,
and subprojects.

A workspace is a container for the development projects. When a new platform is
created, a workspace is also created. Use Project view to look at and gain access
to the various elements of projects. A workspace can contain multiple projects,
including subprojects, but only one platform.

A project is a configuration and a group of files that produce an application or
final binary file(s).

A subproject is a project that is dependent on another project. This dependency
may consist of shared files, which need to be built in the subproject first, or it may
include shared resources that need updating in the subproject first.
2.3.3 Developing an Application
After initially creating a project, the developer can create the user interface. This involves
first designing and creating dialog boxes, menus, toolbars, accelerators, and other visual
and interactive elements, and then hooking them up to code. The user interface elements
have to be tailored to the design requirements of the target device. For example, the
Pocket PC is long and narrow (about 240x320 pixels), while the Handheld PC is larger
(about 640x320 pixels). If the application is designed for the Pocket PC, dialogs will
likely be too small for the Handheld PC. If the dialogs are designed for the Handheld PC,
they will likely be cramped or not fit at all on the Pocket PC.
2.3.4 Building an Application
Microsoft eMbedded Visual C++ provides two ways of building an application. The
easiest and most common way is to build within the eMbedded Visual C++ development
environment. The other way is to build from the MS-DOS prompt using command-line
tools. Building an application involves the preprocessor, the compiler, and the linker.

The preprocessor prepares source files for the compiler by translating macros,
operators, and directives.

The compiler creates an object file containing machine code, linker directives,
sections, external references, and function/data names.

The linker combines code from the object files created by the compiler and from
statically-linked libraries, resolves the name references, and creates an executable
file.
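As a small illustration of the preprocessor stage, the sketch below uses conditional compilation to select a platform-specific value before the compiler proper sees the code. _WIN32_WCE is the macro predefined by Microsoft's Windows CE compilers. The names MAX_CACHE_ENTRIES, CLAMP_TO_CACHE and cache_entries_for, and the values 10 and 100, are invented for this example and are not taken from Embedded Shruti.

```c
/* The preprocessor resolves this choice before compilation proper. */
#ifdef _WIN32_WCE
#define MAX_CACHE_ENTRIES 10    /* hypothetical small limit for a device build */
#else
#define MAX_CACHE_ENTRIES 100   /* hypothetical larger limit for a desktop build */
#endif

/* Function-like macro: it is expanded textually, so the argument is
   parenthesised to keep operator precedence intact. */
#define CLAMP_TO_CACHE(n) ((n) > MAX_CACHE_ENTRIES ? MAX_CACHE_ENTRIES : (n))

/* After preprocessing, the compiler sees only plain constants here. */
int cache_entries_for(int requested)
{
    return CLAMP_TO_CACHE(requested);
}
```

The same source file can then be built unchanged for the desktop and for the device, which is a common way to keep a port in a single code base.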
2.3.5 The Build Process
The following diagram shows the components of the build process in eMbedded Visual
C++ starting with the editor which can be used to write source code.
If the program is built outside the IDE, the developer may use a makefile to invoke the
command-line tools. Microsoft eMbedded Visual C++ provides the NMAKE utility for
processing makefiles. If the program is built within the IDE, the eMbedded Visual C++
project system uses the project (.vcp) files to store make information. The .vcp files are
not compatible with NMAKE. However, if the program uses a makefile rather than a .vcp
file, it can still be built in the development environment as an external project.
2.3.6 Testing and debugging an application
Microsoft eMbedded Visual C++ provides tools to help test and debug applications. In
the eMbedded Visual C++ options, the developer can choose to automatically or
manually download the programs after building them to a connected device. When the
developer has completed building a project configuration, the program can be run in
eMbedded Visual C++ with or without debugging capabilities provided by the IDE.
Running programs without using the debugging capabilities is faster because eMbedded
Visual C++ does not have to load the debugger first. With the debugger, however, the
developer can set breakpoints, step through execution, inspect memory and registry values,
check variables, observe message traffic, and generally examine closely how the code
works.
2.3.7 MFC for Windows CE
The Microsoft® Foundation Class (MFC) library for the Windows® CE operating system
is both a mature, comprehensive class library and a complete object-oriented application
framework designed to help the developer build applications, components, and controls
for Windows CE-based platforms. The developer can use the Microsoft Foundation
Classes for Windows CE to create anything from a simple dialog box-based application
to a sophisticated application that employs the full MFC document/view architecture.
MFC can also be used for Windows CE to create full-featured Microsoft® ActiveX®
controls and ActiveX containers.
Developers who have experience using MFC for desktop applications will find MFC
for Windows CE very similar, so the migration to MFC for Windows CE will be smooth.
MFC provides a number of utility classes that save the developer from writing common
code. For example, a developer who needs a hash-based data structure does not have to
design it from scratch; the CMap class already defined in MFC can be used instead, which
shortens development time. In this way the Microsoft Foundation Classes give the
developer a framework of utility classes and a range of programming options.
Microsoft® eMbedded Visual C++® 3.0 toolkits contain all the development tools and
wizards needed for building MFC for Windows CE applications. Several MFC classes
are used in the design of Embedded Shruti and the details of the classes will be provided
in Chapter 4 which explains the design of Embedded Shruti.
2.4 Microsoft SQL server CE
During the development phase of Embedded Shruti, Microsoft SQL Server CE was
considered as an option for providing database support to the application. This section
explains some of the key features of Microsoft SQL Server CE and the reason why
it was not used in Embedded Shruti.
2.4.1 Rapid Application Development
SQL Server CE makes application development easy while providing a consistent
development model and API set. Microsoft® Visual Basic® developers can rapidly
develop Windows CE applications by using eMbedded Visual Basic and ADOCE
(ActiveX Data Objects for Windows CE). Microsoft® Visual C++® developers can
leverage their existing skills to build sophisticated Windows CE-based database
applications that target mobile and embedded solutions.
2.4.2 High-Performance Database Engine
SQL Server CE offers rich relational database functionality in the small memory footprint
on today's devices. Microsoft SQL Server developers will appreciate the robust feature
set which includes:

A compatible SQL grammar with SQL Server 2000. Statements that run on SQL
Server CE will, in general, run on SQL Server.

A wide range of data types, including:

TINYINT, SMALLINT, INTEGER, BIGINT

REAL, NUMERIC, FLOAT

BIT, BINARY, VARBINARY, IMAGE

Unicode character data types: NATIONAL CHARACTER, NATIONAL CHARACTER
VARYING, NTEXT

MONEY, DATETIME, UNIQUEIDENTIFIER

32 indexes per table, multicolumn indexes

NULL support

Nested transactions

128-bit file level encryption

DDL: Create databases, alter tables, referential integrity, default values

DML: INSERT, UPDATE, DELETE

SELECT: SET Functions (aggregates), INNER/OUTER JOIN, subselect, GROUP
BY/HAVING

Scrollable and forward-only cursors
Hardware and Software Requirements
Hardware Requirements
Platform | Requirements
SQL Server system | See the operating system requirements in SQL Server Books Online.
IIS system | 120 MB of available disk space.
Development system | 30 MB of available disk space. The computer will need an additional 30 MB of temporary storage space for the setup files.
Windows CE device | Between 1 and 3 MB of available storage space, depending on processor type and components installed.
The file sizes for the SQL Server CE components vary by processor type and Windows CE
operating system version. Hard disk space requirements also depend on which SQL Server
CE components are installed.
Operating System Requirements
Platform | Supported operating systems
SQL Server system | See the operating system requirements in the SQL Server Books Online.
Development system | Microsoft Windows 98 Second Edition, Microsoft Windows Millennium (Me), Microsoft Windows NT® 4.0 with Service Pack 5 or later, or Microsoft Windows 2000. Windows CE desktop emulation requires Windows NT 4.0 or Windows 2000; emulation is not supported on Windows 98. Microsoft ActiveSync 3.1 or later.
IIS system | Windows NT 4.0 with Service Pack 5 or later, or Windows 2000.
Windows CE device | Windows CE version 2.11 or later.
SQL Server Requirements
SQL Server | Supported SQL Server CE features
SQL Server 2000 | All features are supported including merge replication and RDA.
SQL Server version 6.5 with Service Pack 5 or later and SQL Server 7.0 | RDA is supported; replication is not supported.
Internet Information Services and Internet Explorer Requirements
Component | Requirements
Microsoft Internet Explorer 5.0 | Internet Explorer 5.0 or later is required on the development system to access SQL Server CE HTML Help. Internet Explorer 5.0 or later is required on IIS system.
IIS | Replication and RDA require IIS 4.0 on Windows NT 4.0 or IIS 5.0 on Windows 2000.
ActiveSync Requirements
Component | Requirements
SSCERelay.exe | Windows 98 Second Edition, Windows Millennium (Me), Windows NT 4.0 with Service Pack 5 or later, or Windows 2000.
Windows CE Requirements
Platform | Windows CE operating system version
Handheld PC Pro (H/PC Pro) | 2.11 or later
Palm-size PC (P/PC) | 2.11 or later
Pocket PC | 3.0 or later
HPC 2000 | 3.0 or later*
2.4.3 Drawbacks of SQL server (Windows CE version)
The tables above give a complete reference to the hardware and software requirements
of SQL Server CE. Consider the Windows CE device requirement: depending on the
processor type and the components installed, 1 MB to 3 MB of storage space is required
on the Windows CE device.
On top of that, Embedded Shruti does not need a database that supports an extensive set
of SQL statements. Queries are made only by supplying a key, not by writing SQL
statements. With this design in mind, using SQL Server CE would only consume
resources without producing any significant gain. For an embedded application, even
1 MB of storage (the minimum installation on a Windows CE device) is quite large. The
application needed a database based on extendible hashing, which stores values
according to a key field and retrieves them efficiently. Extendible hashing is efficient
because retrieval has a complexity of O(1 + alpha), where alpha is the load factor. For
balanced hash tables, retrieval performs better than SQL Server CE, which has to
process SQL queries and therefore incurs extra overhead. GDBM, a well-known
hash-based database that is popular on UNIX platforms, was ported to the Windows CE
platform and used in Embedded Shruti. Embedded Shruti thus uses a variant of GNU
software and is a merger of Microsoft technologies and the open-source GDBM project.
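The O(1 + alpha) figure quoted above is the textbook average cost of a lookup in a hash table that resolves collisions by chaining: one step to reach the bucket, plus on average about alpha key comparisons along the chain. The following minimal in-memory sketch shows where the two terms come from. It is an illustration only: it is not GDBM's on-disk structure and is not code from Embedded Shruti, and every name in it (table_store, table_fetch, NBUCKETS and so on) is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 8                /* illustrative, fixed table size */

struct entry {
    char *key;
    char *value;
    struct entry *next;           /* next entry in the same bucket */
};

struct table {
    struct entry *bucket[NBUCKETS];
    size_t count;                 /* number of stored keys */
};

void table_init(struct table *t)
{
    size_t i;
    for (i = 0; i < NBUCKETS; i++)
        t->bucket[i] = NULL;
    t->count = 0;
}

/* djb2 string hash; any well-distributed hash would serve. */
static unsigned long hash_key(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

static char *copy_string(const char *s)
{
    char *p = malloc(strlen(s) + 1);
    if (p != NULL)
        strcpy(p, s);
    return p;
}

/* Walk one chain: one bucket access plus about alpha comparisons on average. */
static struct entry *find_entry(const struct table *t, const char *key)
{
    struct entry *e = t->bucket[hash_key(key) % NBUCKETS];
    while (e != NULL && strcmp(e->key, key) != 0)
        e = e->next;
    return e;
}

/* Store or replace a value. Returns 0 on success, -1 if out of memory. */
int table_store(struct table *t, const char *key, const char *value)
{
    struct entry *e = find_entry(t, key);
    char *v = copy_string(value);
    size_t i;

    if (v == NULL)
        return -1;
    if (e != NULL) {              /* key exists: replace its value */
        free(e->value);
        e->value = v;
        return 0;
    }
    e = malloc(sizeof *e);
    if (e == NULL || (e->key = copy_string(key)) == NULL) {
        free(e);
        free(v);
        return -1;
    }
    i = hash_key(key) % NBUCKETS;
    e->value = v;
    e->next = t->bucket[i];       /* push onto the front of the chain */
    t->bucket[i] = e;
    t->count++;
    return 0;
}

/* Returns the stored value, or NULL if the key is absent. */
const char *table_fetch(const struct table *t, const char *key)
{
    struct entry *e = find_entry(t, key);
    return e != NULL ? e->value : NULL;
}

/* Load factor alpha = stored keys / buckets. */
double table_load_factor(const struct table *t)
{
    return (double)t->count / NBUCKETS;
}
```

A lookup by key needs no query parsing or planning at all, which is the overhead the text attributes to an SQL engine.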
The next section will introduce the GDBM (GNU Database Manager) in general.
2.5 GNU Database Manager (popularly called GDBM)
GDBM - The GNU database manager is a set of database routines that use extensible
hashing.
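The central idea of extensible (extendible) hashing is a directory of 2^d bucket references that is indexed by d bits of the key's hash value and doubles when a full bucket can no longer be split within the current directory. The sketch below shows only these two directory operations, using the top d bits of a 32-bit hash. It is a generic illustration under these assumptions and is not taken from the GDBM sources. The names dir_index and dir_double are hypothetical, and buckets are represented simply by integer bucket numbers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Directory slot for a hash value when the directory has 2^depth entries:
   the slot is given by the top `depth` bits of the 32-bit hash. */
size_t dir_index(uint32_t hash, unsigned depth)
{
    return depth == 0 ? 0 : (size_t)(hash >> (32 - depth));
}

/* Double a directory of bucket numbers. Old slot i becomes the two new
   slots 2i and 2i+1, both still referring to the same bucket, so no
   bucket data has to move. Returns a new malloc'd array, or NULL if
   memory runs out; the caller frees it. */
int *dir_double(const int *old_dir, unsigned old_depth)
{
    size_t new_size = (size_t)1 << (old_depth + 1);
    size_t i;
    int *new_dir = malloc(new_size * sizeof *new_dir);

    if (new_dir == NULL)
        return NULL;
    for (i = 0; i < new_size; i++)
        new_dir[i] = old_dir[i >> 1];
    return new_dir;
}
```

Because only the directory grows, a lookup still costs one directory access and one bucket read however large the file becomes, which is what makes the scheme attractive on a device with slow storage.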
2.5.1 Synopsis
// This file contains all the function and data type definitions for GDBM
#include <gdbm.h>

extern gdbm_error gdbm_errno;
extern char *gdbm_version;

GDBM_FILE gdbm_open (char *name, int block_size, int read_write, int mode,
                     void (*fatal_func) ());
void gdbm_close (GDBM_FILE dbf);
int gdbm_store (GDBM_FILE dbf, datum key, datum content, int flag);
datum gdbm_fetch (GDBM_FILE dbf, datum key);
int gdbm_delete (GDBM_FILE dbf, datum key);
datum gdbm_firstkey (GDBM_FILE dbf);
datum gdbm_nextkey (GDBM_FILE dbf, datum key);
int gdbm_reorganize (GDBM_FILE dbf);
void gdbm_sync (GDBM_FILE dbf);
int gdbm_exists (GDBM_FILE dbf, datum key);
char *gdbm_strerror (gdbm_error errno);
int gdbm_setopt (GDBM_FILE dbf, int option, int *value, int size);
2.5.2 Description
GNU dbm is a library of routines that manages data files that contain key/data pairs. The
access provided is that of storing, retrieval, and deletion by key and a non-sorted traversal
of all keys. A process is allowed to use multiple data files at the same time.
A process that opens a gdbm file is designated as a "reader" or a "writer". Only one writer
may open a gdbm file and many readers may open the file. Readers and writers can not
open the gdbm file at the same time. The procedure for opening a gdbm file is:
GDBM_FILE dbf;
dbf = gdbm_open ( name, block_size, read_write, mode, fatal_func )
Name is the name of the file (the complete name, gdbm does not append any characters to
this name). Block_size is the size of a single transfer from disk to memory. This
parameter is ignored unless the file is a new file. The minimum size is 512. If it is less
than 512, dbm will use the stat block size for the file system. Read_write can have one of
the following values:
GDBM_READER reader
GDBM_WRITER writer
GDBM_WRCREAT writer - if database does not exist create new one
GDBM_NEWDB writer - create new database regardless if one exists
For the last three (writers of the database) there is an extra value that can be added to
read_write by bitwise or, GDBM_FAST. This requests that gdbm write the database with
no disk file synchronization. This allows faster writes, but may produce an inconsistent
database in the event of abnormal termination of the writer.
Mode is the file mode (Read, Write or both) if the file is created. (*Fatal_func) () is a
function for dbm to call if it detects a fatal error. The only parameter of this function is a
string. If the value of 0 is provided, gdbm will use a default function.
The return value dbf is the pointer needed by all other routines to access that gdbm file. If
the return is the NULL pointer, gdbm_open was not successful. The errors can be found
in gdbm_errno for gdbm errors and in errno for system errors. (For error codes, refer to
gdbmerrno.h)
In all of the following calls, the parameter dbf refers to the pointer returned from
gdbm_open.
It is important that every file opened is also closed. This is needed to update the
reader/writer count on the file. This is done by:
gdbm_close (dbf);
The database is used by 3 primary routines. The first stores data in the database.
ret = gdbm_store ( dbf, key, content, flag )
Dbf is the pointer returned by gdbm_open. Key is the key data. Content is the data to be
associated with the key. Flag can have one of the following values:
GDBM_INSERT insert only, generate an error if key exists
GDBM_REPLACE replace contents if key exists.
If a reader calls gdbm_store, the return value will be -1. If called with GDBM_INSERT
and key is in the database, the return value will be 1. Otherwise, the return value is 0.
If the data is stored for a key that is already in the data base, gdbm replaces the old data
with the new data if called with GDBM_REPLACE. Two data items for the same key are
not obtained and there is no error from gdbm_store.
To search for some data:
content = gdbm_fetch ( dbf, key )
Dbf is the pointer returned by gdbm_open. Key is the key data.
If the dptr element of the return value is NULL, no data was found. Otherwise the return
value is a pointer to the found data. The storage space for the dptr element is allocated
using malloc. Gdbm does not automatically free this data. It is the programmer's
responsibility to free this storage when it is no longer needed.
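The ownership rule described above is easy to get wrong, so a small sketch may help. It wraps a C string as a key and turns a fetched datum into an ordinary string while releasing the storage that the fetch allocated. To keep the sketch compilable without the library, it declares a local mirror of gdbm's datum type (a dptr pointer and a dsize length); real code would include gdbm.h instead. The helper names datum_from_string and datum_take_string are hypothetical and are not part of GDBM.

```c
#include <stdlib.h>
#include <string.h>

/* Local mirror of gdbm's datum type so that the sketch compiles without
   gdbm.h; in real code, include <gdbm.h> instead of declaring this. */
typedef struct {
    char *dptr;   /* pointer to the bytes */
    int   dsize;  /* number of bytes */
} datum;

/* Wrap a C string as a key. The terminating NUL is not counted; this is a
   convention chosen for the sketch, and writer and reader must agree on it. */
datum datum_from_string(char *s)
{
    datum d;
    d.dptr = s;
    d.dsize = (int)strlen(s);
    return d;
}

/* Turn a fetched datum into a NUL-terminated string and release the
   storage that the fetch allocated. Returns NULL if nothing was found or
   memory runs out; the caller frees the returned string. */
char *datum_take_string(datum content)
{
    char *s;

    if (content.dptr == NULL)
        return NULL;              /* key was not in the database */
    s = malloc((size_t)content.dsize + 1);
    if (s != NULL) {
        memcpy(s, content.dptr, (size_t)content.dsize);
        s[content.dsize] = '\0';
    }
    free(content.dptr);           /* the fetch handed ownership to the caller */
    return s;
}
```

With the real library, the result of gdbm_fetch (dbf, datum_from_string (word)) would be passed straight to datum_take_string, so that each fetch is matched by exactly one free.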
To search for some data, without retrieving it:
ret = gdbm_exists ( dbf, key )
Dbf is the pointer returned by gdbm_open. Key is the key data to search for.
If the key is found within the database, the return value ret will be true. If nothing
appropriate is found, ret will be false. This routine is useful for checking for the existence
of a record, without performing the memory allocation done by gdbm_fetch.
To remove some data from the database:
ret = gdbm_delete ( dbf, key )
Dbf is the pointer returned by gdbm_open. Key is the key data.
The return value is -1 if the item is not present or the requester is a reader. The return
value is 0 if there was a successful delete.
The next two routines allow for accessing all items in the database. This access is not key
sequential, but it is guaranteed to visit every key in the database once. (The order has to
do with the hash values.)
key = gdbm_firstkey ( dbf )
nextkey = gdbm_nextkey ( dbf, key )
Dbf is the pointer returned by gdbm_open. Key is the key data.
The return values are both of type datum. If the dptr element of the return value is NULL,
there is no first key or next key. Again notice that dptr points to data allocated by malloc
and gdbm will not free it for the developer.
These functions were intended to visit the database in read-only algorithms, for instance,
to validate the database or similar operations.
File `visiting' is based on a `hash table'. gdbm_delete re-arranges the hash table to make
sure that any collisions in the table do not leave some item `un-findable'. The original key
order is NOT guaranteed to remain unchanged in ALL instances. It is possible that some
key will not be visited if a loop like the following is executed:
key = gdbm_firstkey ( dbf );
while ( key.dptr ) {
    nextkey = gdbm_nextkey ( dbf, key );
    if ( some condition ) {
        gdbm_delete ( dbf, key );
        free ( key.dptr );
    }
    key = nextkey;
}
The following routine should be used very infrequently.
ret = gdbm_reorganize ( dbf )
If there are a lot of deletions and the developer would like to shrink the space used by the
gdbm file, this routine will reorganize the database. Gdbm will not shorten the length of a
gdbm file except by using this reorganization. (Deleted file space will be reused.)
If GDBM_FAST value is used in gdbm_open call, the following routine can be used to
guarantee that the database is physically written to the disk file.
gdbm_sync ( dbf )
It will not return until the disk file state is synchronized with the in-memory state of the
database.
To convert a gdbm error code into English text, use this routine:
ret = gdbm_strerror ( errno )
Where errno is of type gdbm_error, usually the global variable gdbm_errno. The
appropriate phrase is returned.
gdbm now supports the ability to set certain options on an already open database.
ret = gdbm_setopt ( dbf, option, value, size )
Where dbf is the return value from a previous call to gdbm_open, and option specifies
which option to set. The valid options are currently:
GDBM_CACHESIZE:
Set the size of the internal bucket cache. This option may only be set once on each
GDBM_FILE descriptor, and is set automatically to 100 upon the first access to the
database.
GDBM_FASTMODE:
Set fast mode to either on or off. This allows fast mode to be toggled on an already
open and active database. value (see below) should be set to either TRUE or FALSE.
value is the value to set option to, specified as an integer pointer. size is the size of the
data pointed to by value. The return value will be -1 upon failure, or 0 upon success. The
global variable gdbm_errno will be set upon failure.
For instance, to set a database to use a cache of 10, after opening it with gdbm_open, but
prior to accessing it in any way, the following code could be used:
int value = 10;
ret = gdbm_setopt( dbf, GDBM_CACHESIZE, &value, sizeof(int));
The following two external variables may be useful:
gdbm_errno is the variable that contains more information about gdbm errors. (gdbm.h
has the definitions of the error values and defines gdbm_errno as an external variable.)
gdbm_version is the string containing the version information.
This concludes the introduction to the GNU Database Manager. Embedded Shruti uses a
version of GDBM ported to Windows CE so that it can be built with the Pocket-PC Software
Development Kit. Details of the GDBM functions used in Embedded Shruti are given in
Chapter 4, which explains the complete design of Embedded Shruti.
Chapter 3
Shruti: Desktop Version
In recent years, it has become critical to bridge the gulf between man and
machine. The Internet has become an integral part of today’s life and the greatest
knowledge repository on Earth. Technology for accessing the Internet and harnessing the
myriad powers of the personal computer is a must if one is not to fall behind. The need of
the hour is intelligent human-computer interfacing, enabling a wider community such as
the rural neo-literates and pre-literates, the physically challenged (like the visually
impaired and the speech impaired) to interact with computer systems in a natural way.
Speech interfaces like Shruti have many uses. They could serve as:
• Computer interfaces for the visually challenged, for whom graphical interfaces
are not viable.
• The voice of the speech impaired.
• Computer interfaces for neo-literates and pre-literates.
• Modules in software to help pre-literates learn languages using a computer.
• Interfacing modules in multilingual environments, where, depending on the
need, the computer can talk in different languages.
Text to speech has been one of the greatest challenges of modern computational science.
While the utterance of flat speech by a computer is achievable, the greatest challenges in
the field are to impose natural intonation and prosody based on the characteristics of the
language, dialect, person and context.
The diagram shown below gives a detailed view of the modules of a Text to Speech (TTS)
converter:
Various techniques exist to convert a given text to speech. First, a grapheme-to-phoneme
mapper converts the input graphemes (the smallest units of written language) into a list
of phonemes (the smallest units of spoken language). The next stage renders the string of
phonemes, that is, it synthesizes the speech. Speech synthesizers fall broadly into two
classes. Articulatory synthesizers control synthesis through parameters that represent the
speech production system rather than the signal itself. Concatenative synthesizers join
signal units taken from a dictionary to produce synthetic speech. In both cases the prime
challenge is the quality and naturalness of the sound produced.
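The concatenative idea can be illustrated with a small sketch in portable C. It is not the Shruti implementation: the dictionary layout (unit_t), the function names and the linear lookup are assumptions made for illustration only, and real units would be recorded speech rather than short sample arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of a hypothetical unit dictionary: a phoneme label and its samples. */
typedef struct {
    const char  *label;
    const short *samples;
    size_t       count;
} unit_t;

/* Look up a unit by label; returns NULL when the dictionary has no such unit. */
const unit_t *find_unit(const unit_t *dict, size_t ndict, const char *label)
{
    for (size_t i = 0; i < ndict; i++)
        if (strcmp(dict[i].label, label) == 0)
            return &dict[i];
    return NULL;
}

/* Append the samples of each requested unit to out (capacity cap, in samples).
   Returns the number of samples written, or 0 on a missing unit or overflow. */
size_t concatenate_units(const unit_t *dict, size_t ndict,
                         const char *const *labels, size_t nlabels,
                         short *out, size_t cap)
{
    size_t used = 0;
    assert(out != NULL);
    for (size_t i = 0; i < nlabels; i++) {
        const unit_t *u = find_unit(dict, ndict, labels[i]);
        if (u == NULL || u->count > cap - used)
            return 0;
        memcpy(out + used, u->samples, u->count * sizeof(short));
        used += u->count;
    }
    return used;
}
```

A caller passes the phoneme labels in utterance order and receives one contiguous sample buffer. Smoothing at the joins is deliberately left out of this sketch.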
The desktop version of Shruti implements the Text-to-Speech converter for regional
languages like Hindi and Bengali using the concatenative approach. This approach finds the
voice unit corresponding to each phoneme and concatenates the units to produce the sound
file. Smoothing algorithms are then applied to the concatenated speech to improve its
quality.
With the basic idea of Text-to-Speech software in place, the following sections describe
how the desktop version of Shruti is implemented.
3.1 Features of Shruti
1. The front-end of the software is written in Java (refer to the block diagram shown
above). The front-end takes the input text and produces the output sound file; the
processing itself is done by two backend dynamic link libraries.
2. The two backend dynamic link libraries are written in C++ and implement two
important parts of the Text-to-Speech synthesizer.
3. The first dynamic link library implements the Natural Language Processing (NLP)
unit, which is referred to as the Hindianalyser module in the remainder of the thesis.
4. The second dynamic link library implements the Indian Language Phonetic
Synthesizer (ILPS) unit, which is referred to as the Hindiengine module in the
remainder of the thesis.
5. These dynamic link libraries are loaded at run time when required, and the
appropriate functions from each library are called.
3.2 Overview of Shruti
The processing part of the desktop based (Win32 API) text-to-speech software can be
divided into two sub-modules:
• HindiAnalyser: It takes the input supplied by the frontend and produces tokens, each
of which corresponds to a unique sound clip of the sound library.
• HindiEngine: It takes the tokens and the sound units from the library and generates
the whole sound clip. After generation, smoothing algorithms are applied to produce
smooth speech.
The frontend is responsible for taking the input and for playing the wav file generated by
the backend.
The next figure shows a dataflow diagram for the software. Dataflow diagrams always
facilitate the understanding of a software product.
3.3 Technologies used
The front-end is written in Java, and an important feature of this implementation is that
the Java program calls the dynamic link libraries built with Visual C++. This is done
using the Java Native Interface (JNI). In the Visual C++ source of the dynamic link
library, the exported function is declared as follows:
JNIEXPORT void JNICALL
Java_hindidisplay_Analyse (JNIEnv *env, jobject obj)
This function of the dll can be called from the Java code. Two header files, “jni.h” and
“jni_md.h”, are included during the build process. See the references to find the source
code of this implementation.
3.4 Comparisons
1. The Java Development Kit (JDK) must be installed on the desktop computer running
the software. The JDK is bulky, so it is not practical for Embedded Shruti, where
memory is a main concern and installing the JDK would be more of a burden than a
benefit. Therefore Embedded Shruti uses the Windows CE API and Microsoft
Foundation Classes customized for Windows CE. Such an implementation does not
need any JDK on the hardware (a Pocket-PC in this case) on which the software is
executed.
2. The implementation based entirely on the Windows CE API and MFC customized for
Windows CE requires a dynamic link library (mfcce400d.dll) of size 819 KB, which
is considerably smaller than the JDK. The JDK for Windows CE with the fewest
features has a size of 8.5 MB.
3. The backend dlls are built using Embedded Visual C++ and transferred to the
system folder of the device running Windows CE. Their sizes compare as follows:
Library              Windows CE version    Win32 version
Hindianalyser dll    43 KB                 256 KB
Hindiengine dll      29 KB                 260 KB
4. The next design issue was the choice of a database. A SQL server for Windows CE
would have required at least 1 MB of memory. The port of GDBM to Windows CE
that is used in Embedded Shruti requires only a dynamic link library called
gdbmce.dll, which is of size 31 KB. It suits the application because a hash based
structure was needed rather than a database that implements SQL queries.
The names and sizes of the dynamic link libraries needed to run Embedded Shruti are the
following:
1. gdbmce.dll: 31 KB (for database application)
2. hindianalyser.dll: 43 KB (NLP module)
3. hindiengine.dll: 29 KB (ILPS module)
4. mfcce400d.dll: 819 KB (for standard SDK emulation)
5. mfcce300.dll + mfcce300d.dll: 289 KB + 846 KB (for Pocket PC emulation)
This data shows that this implementation needs much less disk space than an
implementation that uses the JDK and builds the software on top of it.
The next chapter explains the different implementations of Embedded Shruti one by one,
together with the key features of each. Each implementation is referred to as a model. The
performance comparison is provided subsequently.
Chapter 4
EMBEDDED SHRUTI
The last chapter introduced the desktop version of Shruti and explained the structure of
its source code along with the dataflow diagram for the software. This chapter explains
the different models of Embedded Shruti one by one and cites the drawbacks of each
model that led to a new, modified and more efficient model.
4.1 Model 1: Windows CE crude port
This is the first model of Embedded Shruti. It starts from the source code of the Win32
version. First the structure of the native source code is identified, together with the
points where the native code uses API functions that are not supported by the Windows CE
API. At each of these points the code is modified so that the native code remains
consistent. The input-output behaviour of the native code should not change.
Embedded Shruti is designed in a modular way. There are three modules in Model 1.
These are the following:
1. Frontend
2. Hindianalyser
3. Hindiengine
In the native Win32 code the Frontend was written in Java, but in Embedded Shruti it is
written using MFC customized for Windows CE in eMbedded Visual C++.
The Frontend has a dialog box with the following contents:
1. Input Text Box: It takes from the user the text that is to be converted to speech. At
present the input should be in Hindi or Bengali. If a multilingual keyboard is not
available, spell the Hindi/Bengali words using the English alphabet and feed that text
into the text box.
2. Analyse Button: Clicking the Analyse button reads the input text from the text box and
writes it to a temporary file on the device called “TextIscii.txt”. This file is read later
by the dynamic link library. After saving the input text to the file, the button handler
loads the dynamic link library for Natural Language Processing, hindianalyser.dll. A dll
must export the functions that other executables can call; the method of exporting
functions from a dll is given shortly. Before that, the procedure to load a dll and call an
exported function from executable code is given.
//Define a function pointer to call the DLL function
typedef int(*MBFuncPtr)(DWORD cBytes);
This is a pointer to a function that returns an integer and takes a DWORD as input.
DWORD is a data type defined in the Windows headers; it is a 32-bit unsigned integer.
//Instance variable required to load a library
HINSTANCE hInst1;
//Loading a dynamic link library
hInst1 = ::LoadLibrary(L"hindianalyser.dll");
if(hInst1 == NULL)
MessageBox(L"Unable to load the analyser library");
else
MessageBox(L"Analyser Library successfully loaded");
If the library is not loaded successfully, hInst1 will be NULL. Once the library is loaded
into main memory, the exported function is accessed through the function pointer. A
function exported from a dynamic link library that is loaded at run time in this way can be
reached only through such a function pointer.
//Getting the address of the analyser function into the function pointer
MBFuncPtr pFunction=(MBFuncPtr)GetProcAddress(hInst1,L"Analyse");
Analyse is the name of the function exported from the dynamic link library. The
functions exported by a dynamic link library are listed in the .def (definition) file of its
source code. A typical .def file looks like this:
;analyser.def
LIBRARY Analyser
EXPORTS
Analyse
The name of the library, Analyser in this case, is specified on the first line of the .def
file. The functions exported from the dll are then listed under the EXPORTS header; a dll
may export any number of functions. For each name listed under EXPORTS, the dll must
define a function with that name. The calling program (the executable in this case) copies
the address of that function into its function pointer and calls the function with the
appropriate inputs.
pFunction obtained above will be NULL if no such function is exported by the dll.
if(pFunction == NULL)
MessageBox(L"Unable to load the Analyse function");
else
{
MessageBox(L"Analyse function exported from the dll called");
tokenLength=(*pFunction)(cBytes);
}
Once the function pointer is obtained in pFunction, the function can be called with a
DWORD as its parameter. As its return type is integer, it returns an integer value after
processing the input text file “TextIscii.txt”.
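The call-through-pointer pattern described above can be tried without a DLL. The following is a minimal sketch in portable C. The type dword_t (a stand-in for DWORD), the function fake_analyse (a stand-in for the exported Analyse function) and the helper call_if_resolved are hypothetical names introduced for illustration; on the device the pointer would come from GetProcAddress.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long dword_t;               /* stand-in for DWORD (assumption) */
typedef int (*MBFuncPtr)(dword_t cBytes);    /* same shape as the typedef in the text */

/* Hypothetical stand-in for the exported Analyse function. */
int fake_analyse(dword_t cBytes)
{
    return (int)(cBytes / 2);                /* placeholder result, not the real NLP step */
}

/* Call through the pointer only if it was resolved. This mirrors the NULL
   check applied to the result of GetProcAddress.
   Returns 0 on success, -1 if the pointer is NULL. */
int call_if_resolved(MBFuncPtr fn, dword_t cBytes, int *result)
{
    assert(result != NULL);
    if (fn == NULL)
        return -1;
    *result = (*fn)(cBytes);
    return 0;
}
```

Keeping the NULL check in one helper means that a missing export produces an error code instead of a crash at the call site.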
Once the library is no longer needed, it is advisable to free it. Dynamic link libraries
are loaded into RAM, and devices running Windows CE have very limited RAM, so the dll
should be unloaded as soon as its work is done.
//Unloading a dynamic link library
::FreeLibrary(hInst1);
The methods discussed above are necessary for operations on dynamic link libraries. The
next important difference between the native Win32 source code and the Windows CE
version is in the file operations. As mentioned above, clicking the Analyse button saves
the text input to a file on disk. Windows CE doesn’t support file operations like fopen,
fread, fclose, fseek and so on. Therefore, while porting, it is very important to find the
equivalent of each of these file operations in the Windows CE API.
In the Windows CE API all devices are accessed through handles. The developer can
access a file on disk, a USB port or a sound device using handles. No higher layer such as
fopen and fseek is defined. The following code snippets show how to open a file, read a
file and write a file using the Windows CE API.
//To open a file
HANDLE exampleHandle = INVALID_HANDLE_VALUE;
//Open an existing file in read mode
exampleHandle = CreateFile(L"TextIscii.txt",GENERIC_READ,FILE_SHARE_READ,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
//Open an existing file in write mode (CREATE_ALWAYS would create the file if it is missing)
exampleHandle = CreateFile(L"TextIscii.txt",GENERIC_WRITE,FILE_SHARE_READ,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
A complete reference to the CreateFile function is provided below:
This function creates, opens, or truncates a file, communications resource, disk device, or
console. It returns a handle that can be used to access the object. It can also open and
return a handle to a directory.
HANDLE CreateFile(
LPCTSTR lpFileName,
DWORD dwDesiredAccess,
DWORD dwShareMode,
LPSECURITY_ATTRIBUTES lpSecurityAttributes,
DWORD dwCreationDisposition,
DWORD dwFlagsAndAttributes,
HANDLE hTemplateFile );
Parameters
lpFileName
Pointer to a null-terminated string that specifies the name of the object (file,
communications resource, disk device, console, or directory) to create or open.
If *lpFileName is a path, there is a default string size limit of MAX_PATH characters.
This limit is related to how the CreateFile function parses paths.
When lpFileName points to a communications resource to open, the developer must
include a colon after the name. For example, specify "COM1:" to open that port.
dwDesiredAccess
Specifies the type of access to the object. An application can obtain read access, write
access, read-write access, or device query access. This parameter can be any combination
of the following values.
Value            Description
0                Specifies device query access to the object. An application can query device attributes without accessing the device.
GENERIC_READ     Specifies read access to the object. Data can be read from the file and the file pointer can be moved. Combine with GENERIC_WRITE for read-write access.
GENERIC_WRITE    Specifies write access to the object. Data can be written to the file and the file pointer can be moved. Combine with GENERIC_READ for read-write access.
dwShareMode
Specifies how the object can be shared. If dwShareMode is 0, the object cannot be shared.
Subsequent open operations on the object will fail, until the handle is closed.
To share the object, use a combination of one or more of the following values:
Value               Description
FILE_SHARE_READ     Subsequent open operations on the object will succeed only if read access is requested.
FILE_SHARE_WRITE    Subsequent open operations on the object will succeed only if write access is requested.
lpSecurityAttributes
Ignored; set to NULL.
dwCreationDisposition
Specifies which action to take on files that exist, and which action to take when files do
not exist. For more information about this parameter, see the Remarks section. This
parameter must be one of the following values:
Value                Description
CREATE_NEW           Creates a new file. The function fails if the specified file already exists.
CREATE_ALWAYS        Creates a new file. If the file exists, the function overwrites the file and clears the existing attributes.
OPEN_EXISTING        Opens the file. The function fails if the file does not exist.
OPEN_ALWAYS          Opens the file, if it exists. If the file does not exist, the function creates the file as if dwCreationDisposition were CREATE_NEW.
TRUNCATE_EXISTING    Opens the file. Once opened, the file is truncated so that its size is zero bytes. The calling process must open the file with at least GENERIC_WRITE access. The function fails if the file does not exist.
dwFlagsAndAttributes
Specifies the file attributes and flags for the file.
Any combination of the following attributes is acceptable for the dwFlagsAndAttributes
parameter, except all other file attributes override FILE_ATTRIBUTE_NORMAL.
Value                       Description
FILE_ATTRIBUTE_ARCHIVE      The file should be archived. Applications use this attribute to mark files for backup or removal.
FILE_ATTRIBUTE_HIDDEN       The file is hidden. It is not to be included in an ordinary directory listing.
FILE_ATTRIBUTE_NORMAL       The file has no other attributes set. This attribute is valid only if used alone.
FILE_ATTRIBUTE_READONLY     The file is read only. Applications can read the file but cannot write to it or delete it.
FILE_ATTRIBUTE_SYSTEM       The file is part of or is used exclusively by the operating system.
FILE_ATTRIBUTE_TEMPORARY    Not supported.
hTemplateFile
Ignored; as a result, CreateFile does not copy the extended attributes to the new file.
Return Values
An open handle to the specified file indicates success. If the specified file exists before
the function call and dwCreationDisposition is CREATE_ALWAYS or OPEN_ALWAYS, a
call to GetLastError returns ERROR_ALREADY_EXISTS, even though the function has
succeeded. If the file does not exist before the call, GetLastError returns zero.
INVALID_HANDLE_VALUE indicates failure. To get extended error information, call
GetLastError.
//Read a file
The file can be read through the handle only if it was opened in GENERIC_READ mode
using CreateFile. Thus the call to ReadFile must come after the file has been opened
appropriately.
ReadFile(exampleHandle,rbuff,cBytes,&readBytes,NULL);
Here the file is read into the character array rbuff, where cBytes specifies the number of
bytes to be read and readBytes receives the number of bytes actually read from the file.
readBytes is passed by address so that ReadFile can modify it and the new value is
visible in the calling function.
API reference to ReadFile:
This function reads data from a file, starting at the position indicated by the file pointer.
After the read operation has been completed, the file pointer is adjusted by the number of
bytes actually read.
BOOL ReadFile(
HANDLE hFile,
LPVOID lpBuffer,
DWORD nNumberOfBytesToRead,
LPDWORD lpNumberOfBytesRead,
LPOVERLAPPED lpOverlapped );
Parameters
hFile
Handle to the file to be read. The file handle must have been created with
GENERIC_READ access to the file. This parameter cannot be a socket handle.
lpBuffer
Pointer to the buffer that receives the data read from the file.
nNumberOfBytesToRead
Number of bytes to be read from the file.
lpNumberOfBytesRead
Pointer to the number of bytes read. ReadFile sets this value to zero before doing any
work or error checking.
lpOverlapped
Unsupported; set to NULL.
Return Values
The ReadFile function returns when one of the following is true: the number of bytes
requested has been read or an error occurs.
Nonzero indicates success. If the return value is nonzero and the number of bytes read is
zero, the file pointer was beyond the current end of the file at the time of the read
operation. Zero indicates failure. To get extended error information, call GetLastError.
//Write to a file
The file can be written through the handle only if it was opened in GENERIC_WRITE
mode using CreateFile. Thus the call to WriteFile must come after the file has been
opened appropriately.
WriteFile(exampleHandle,wbuff,cBytes,&writeBytes,NULL);
Here the character array wbuff is written into the file specified by exampleHandle, where
cBytes specifies the number of bytes to be written and writeBytes receives the number of
bytes actually written into the file. writeBytes is passed by address so that WriteFile can
modify it and the new value is visible in the calling function.
API Reference to WriteFile:
This function writes data to a file. WriteFile starts writing data to the file at the position
indicated by the file pointer. After the write operation has been completed, the file pointer
is adjusted by the number of bytes actually written.
BOOL WriteFile(
HANDLE hFile,
LPCVOID lpBuffer,
DWORD nNumberOfBytesToWrite,
LPDWORD lpNumberOfBytesWritten,
LPOVERLAPPED lpOverlapped );
Parameters
hFile
Handle to the file to be written to. The file handle must have been created with
GENERIC_WRITE access to the file.
lpBuffer
Pointer to the buffer containing the data to be written to the file.
nNumberOfBytesToWrite
Number of bytes to write to the file.
A value of zero specifies a null write operation. A null write operation does not write any
bytes but does cause the time stamp to change. WriteFile does not truncate or extend the
file. To truncate or extend a file, use the SetEndOfFile function.
Named pipe write operations across a network are limited to 65,535 bytes.
lpNumberOfBytesWritten
Pointer to the number of bytes written by this function call. WriteFile sets this value to
zero before doing any work or error checking.
lpOverlapped
Unsupported; set to NULL.
Return Values
Nonzero indicates success. Zero indicates failure. To get extended error information, call
GetLastError.
Remarks
If part of the file is locked by another process and the write operation overlaps the locked
portion, this function fails.
Accessing the output buffer while a write operation is using the buffer may lead to
corruption of the data written from that buffer. Applications must not read from, write to,
reallocate, or free the output buffer that a write operation is using until the write operation
completes.
With the basic operations to create, read and write a file in Windows CE understood, the
crude way of porting is to replace all the file operations like fopen, fread, fwrite, fclose
and fseek by CreateFile, ReadFile and WriteFile.
The following diagram shows the file operations in the hindianalyser native code that are
replaced by CreateFile, ReadFile and WriteFile operations to make it compatible with the
Windows CE API.
As discussed above, clicking the Analyse button takes the input, invokes the hindianalyser
dll and returns the number of tokens written to the tokens file.
Each file operation is now implemented using the Windows CE API. After this stage the
tokens are saved in a file Tokens.txt on disk. Each token is stored in this form:
1. The token name: For example, the token name can be 0704, which is the name of the
sound file corresponding to this token; that file is retrieved in the hindiengine
phase.
2. The token type: The token type specifies whether this is a vowel or a consonant, and
the sound generation algorithm works on that basis.
Please see the references for the algorithms used in Shruti, which are also used in the
Embedded version.
Another important difference between the Win32 API and the Windows CE API is the
memory allocation technique. C-style memory allocation (malloc) doesn’t work on the
Windows CE API. The following function should be used in its place:
char *s;
s = (char *)LocalAlloc(LMEM_FIXED,cBytes);
LocalAlloc allocates cBytes bytes of memory on the local heap and returns a pointer, which
is stored in s. The local heap always exists in Windows CE by default, but developers can
also create heaps of their own and write more efficient memory code.
3. Generate button: Clicking the Generate button first loads the Hindiengine library into
main memory. The function exported from the Hindiengine library is then called with
tokenLength as input, where tokenLength is the number of bytes in the Tokens.txt file.
The Generate function from the dll reads the Tokens.txt file, retrieves each token and
token type from it, and concatenates the sound files corresponding to the tokens into one
file.
For example, if the token is 0704, it retrieves the sound file 0704.wav from the sound
library (in this model the sound library is a directory that contains all the sound files).
Thus the sound files for all tokens are read from the directory and appropriately
concatenated to produce the output sound file.
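Mapping a token to its sound file is a plain string operation. The following is a minimal sketch in portable C. The helper name token_to_wav_path and the backslash separator are assumptions made for illustration, and the directory name is whatever the caller supplies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build "<dir>\<token>.wav" in out (capacity cap bytes).
   Returns 0 on success, -1 if the result does not fit. */
int token_to_wav_path(const char *dir, const char *token, char *out, size_t cap)
{
    int n;
    assert(dir != NULL && token != NULL && out != NULL);
    n = snprintf(out, cap, "%s\\%s.wav", dir, token);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

For the token 0704 and a directory named sounds, the helper yields sounds\0704.wav. A buffer that is too small is reported as an error rather than being silently truncated.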
In this dll, Windows CE counterparts for fread, fwrite and fseek were written; they are
given shortly. Before that, the last action of the Generate button click is to play the
sound file generated by the dll function.
//Code to play a wav format file on Windows CE
MMRESULT PlayWave(LPCTSTR szWavFile)
{
HWAVEOUT hwo;
WAVEHDR whdr;
MMRESULT mmres;
CWaveFile waveFile;
HANDLE hDoneEvent = CreateEvent(NULL, FALSE, FALSE, TEXT("DONE_EVENT"));
UINT devId;
DWORD dwOldVolume;
// Open wave file
if (!waveFile.Open(szWavFile)) {
TCHAR szErrMsg[MAX_ERRMSG];
_stprintf (szErrMsg, TEXT("Unable to open file: %s\n\n"),szWavFile);
MessageBox(NULL, szErrMsg, TEXT("File I/O Error"), MB_OK);
return MMSYSERR_ERROR; // the file could not be opened, so report failure
}
// Open audio device
for (devId = 0; devId < waveOutGetNumDevs(); devId++) {
mmres = waveOutOpen(&hwo, devId, waveFile.GetWaveFormat(), (DWORD) hDoneEvent,
0, CALLBACK_EVENT);
if (mmres == MMSYSERR_NOERROR) {
break;
}
}
if (mmres != MMSYSERR_NOERROR) {
return mmres;
}
// Set volume
mmres = waveOutGetVolume(hwo, &dwOldVolume);
if (mmres != MMSYSERR_NOERROR) {
return mmres;
}
mmres = waveOutSetVolume(hwo, 0xFFFFFFFF);
if (mmres != MMSYSERR_NOERROR) {
return mmres;
}
// Initialize wave header
ZeroMemory(&whdr, sizeof(WAVEHDR));
whdr.lpData = new char[waveFile.GetLength()];
whdr.dwBufferLength = waveFile.GetLength();
whdr.dwUser = 0;
whdr.dwFlags = 0;
whdr.dwLoops = 0;
whdr.dwBytesRecorded = 0;
whdr.lpNext = 0;
whdr.reserved = 0;
// Play buffer
waveFile.Read(whdr.lpData, whdr.dwBufferLength);
mmres = waveOutPrepareHeader(hwo, &whdr, sizeof(WAVEHDR));
if (mmres != MMSYSERR_NOERROR) {
return mmres;
}
mmres = waveOutWrite(hwo, &whdr, sizeof(WAVEHDR));
if (mmres != MMSYSERR_NOERROR) {
return mmres;
}
// Wait for audio to finish playing
while (!(whdr.dwFlags & WHDR_DONE)) {
WaitForSingleObject(hDoneEvent, INFINITE);
}
// Clean up
mmres = waveOutUnprepareHeader(hwo, &whdr, sizeof(WAVEHDR));
if (mmres != MMSYSERR_NOERROR) {
return mmres;
}
mmres = waveOutSetVolume(hwo, dwOldVolume);
if (mmres != MMSYSERR_NOERROR) {
return mmres;
}
mmres = waveOutClose(hwo);
if (mmres != MMSYSERR_NOERROR) {
return mmres;
}
delete [] whdr.lpData;
waveFile.Close();
return MMSYSERR_NOERROR;
}
Take a look at the source code to understand the sound producing code clearly.
Now let’s take a look at the file operations done in hindiengine.dll, shown in the
following diagram, and the corresponding operations that replace fread, fwrite and fseek:
From the figure it is clear that the hindiengine native code contains a number of file
operations, which are rewritten using Windows CE API functions to port it to a Pocket-PC
running Windows CE.
Mapping of fread, fwrite, fseek and fscanf to Windows CE API functions:
fread: It can be implemented using ReadFile.
fwrite: It can be implemented using WriteFile.
fseek: Read the file up to the desired position.
fscanf: It can be implemented by reading the file byte by byte, collecting the characters
in a temporary array until the delimiter (say a blank) is reached, and then converting the
array to the appropriate format, such as an integer or a string. The following code snippet
retrieves the token and token type from the tokens.txt file and saves the token in a string
and the token type in an integer variable.
//Code to implement fscanf
do //Read tokens till the phrasal boundary
{
while(no_of_blanks!=2){
ReadFile(fin,&c,1,&bytesRead,NULL);
length+=1;
if(c==' ')
no_of_blanks+=1;
if(no_of_blanks==0){
token[tokencount]=c;
tokencount+=1;
}
if(no_of_blanks==1){
ttype1[ttypecount]=c;
ttypecount+=1;
}
if(length==tokenLength)
break;
}
token[tokencount]='\0';
ttype1[ttypecount]='\0';
tokencount=0;
ttypecount=0;
no_of_blanks=0;
ttype=atoi(ttype1);
} while(length !=tokenLength);
The variable token contains the token as a string and the variable ttype contains the type
of the token. This substitute for fscanf is used to read files such as tokens.txt and the
intonation and epoch files. To read the wav files, fread and fseek are sufficient, since the
size of the wav data can be obtained by reading the 4 bytes that follow the 40th byte of
the wav file. Thus, using the Windows CE implementations of fseek and fread, the wav
files can be read from the sound file library (a directory in this implementation) and
concatenated according to the ILPS algorithm to produce the speech.
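The two ideas above, seeking by reading and taking the size from the 4 bytes after the 40th byte, can be sketched in portable C. The byte_source type and the function names are hypothetical; byte_source stands in for a file handle read with ReadFile. The sketch assumes a canonical 44-byte PCM wav header with the size stored in little-endian order; wav files with extra header chunks would need a proper chunk walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory byte source standing in for a file handle. */
typedef struct {
    const unsigned char *data;
    size_t               size;
    size_t               pos;
} byte_source;

/* Read up to n bytes into buf; returns the number of bytes actually read. */
size_t source_read(byte_source *src, unsigned char *buf, size_t n)
{
    size_t left = src->size - src->pos;
    if (n > left)
        n = left;
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return n;
}

/* Forward-only seek emulated by reading and discarding bytes.
   Returns 0 on success, -1 if the source ends first. */
int source_skip(byte_source *src, size_t n)
{
    unsigned char scratch[64];
    while (n > 0) {
        size_t want = n < sizeof scratch ? n : sizeof scratch;
        size_t got = source_read(src, scratch, want);
        if (got == 0)
            return -1;
        n -= got;
    }
    return 0;
}

/* Size of the sample data: 4 bytes, little-endian, at offset 40
   (assumes the canonical 44-byte header). Returns 0 on success. */
int wav_data_size(byte_source *src, uint32_t *size_out)
{
    unsigned char b[4];
    assert(size_out != NULL);
    if (source_skip(src, 40) != 0 || source_read(src, b, 4) != 4)
        return -1;
    *size_out = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
                ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return 0;
}
```

After wav_data_size returns, the source is positioned at byte 44, which is where the sample data begins in a file with this header layout.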
Please take a look at the reference section to get the algorithm used in NLP module and
ILPS module.
4.1.1 Drawbacks
Take a close look at the fscanf implementation written above. The main drawback of this
model is that it issues one read call per character, so an operation like fscanf costs one
call for every character in the file. This is not the case when fscanf is provided by the
system, where the file is read in blocks rather than one byte at a time. In this
implementation the whole file is read character by character, and this linear scanning
takes time. The intonation file and the epoch file are also read character by character,
which takes a large amount of time. The epoch file holds the epoch values corresponding
to each token, and this model searches linearly for the epoch values of a given token.
Linear search is expensive, and this is another main disadvantage of this model. Another
important inference from this model is that if the token value is treated as a key, then
the epoch values and the sound file can be retrieved using that key. This observation
leads to the use of an extendible hash based database in the subsequent models. The next
model first explains an extendible hash based database and then presents the
implementation of Embedded Shruti with this database.
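The cost of per-character reads can be reduced by reading in blocks and parsing from memory. The following portable C sketch illustrates the idea; it is not the approach taken in the later models. The text_source type, the block size of 512 and the function names are assumptions, with block_read standing in for a ReadFile call that requests many bytes at once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 512

/* Hypothetical in-memory source that counts how often it is read. */
typedef struct {
    const char *data;
    size_t      size;
    size_t      pos;
    unsigned    reads;   /* number of calls to block_read */
} text_source;

/* Read up to n bytes into buf; returns the number of bytes actually read. */
size_t block_read(text_source *src, char *buf, size_t n)
{
    size_t left = src->size - src->pos;
    if (n > left)
        n = left;
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    src->reads++;
    return n;
}

/* Load the source into out (capacity cap) in BLOCK_SIZE pieces.
   Returns the number of bytes loaded. */
size_t load_in_blocks(text_source *src, char *out, size_t cap)
{
    size_t used = 0;
    assert(out != NULL);
    while (used < cap) {
        size_t want = cap - used < BLOCK_SIZE ? cap - used : BLOCK_SIZE;
        size_t got = block_read(src, out + used, want);
        if (got == 0)
            break;           /* end of input */
        used += got;
    }
    return used;
}

/* Count blank- or newline-separated fields in an in-memory buffer. */
size_t count_fields(const char *buf, size_t len)
{
    size_t fields = 0;
    int in_field = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == ' ' || buf[i] == '\n') {
            in_field = 0;
        } else if (!in_field) {
            in_field = 1;
            fields++;
        }
    }
    return fields;
}
```

With this arrangement a 1200-byte file needs a handful of read calls instead of 1200, and the scanning for blanks happens entirely in memory.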
4.2 Model 2: Windows CE port using GDBM (Without voice.db)
GNU Database Manager (GDBM) was introduced in Chapter 2, where the reason for choosing it in
place of Microsoft SQL Server for CE was explained. The crude port discussed in the last
model has several disadvantages. In this model the linear scan of the epoch file is avoided
by using an extendible hash based database, GDBM, which was ported to the Windows CE
platform. Refer to the source code of GDBMCE for details.
At this point it is important to understand the meaning of extendible hashing, since
this application needs a hash based database and not a database with SQL support. Each line
of the epoch file begins with the token, followed by 4 different epoch
values to be used in the ILPS algorithm (Hindiengine module). Extendible hash based
databases are very efficient when retrieval is done by a specified key value (in this
case the token name, like 0704). The complexity of the retrieval operation is O(1+alpha),
where alpha is the load factor, which stays small for a balanced database. In this model the
epoch file is read and saved in GDBM databases using the token name as the key and the
epoch as the value. Four epoch databases are made, which contain the following:
epoch1.db: the key is the token name and the value is the first epoch value specified
on the line corresponding to that token name in the epoch.txt file.
epoch2.db: the key is the token name and the value is the second epoch value specified
on the line corresponding to that token name in the epoch.txt file.
epoch3.db: the key is the token name and the value is the third epoch value specified
on the line corresponding to that token name in the epoch.txt file.
epoch4.db: the key is the token name and the value is the fourth epoch value specified
on the line corresponding to that token name in the epoch.txt file.
For example, take one line from the epoch.txt file:
0165179 104 206 307 409
epoch1.db: key is 0165179, value is 104
epoch2.db: key is 0165179, value is 206
epoch3.db: key is 0165179, value is 307
epoch4.db: key is 0165179, value is 409
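Splitting one line of epoch.txt into its key and four values, as in the example above, could look like the following sketch. The struct EpochLine and the function parse_epoch_line are hypothetical names introduced for illustration. The token is kept as a string because its leading zero is part of the key.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed line of epoch.txt: the token (key) and its four epoch values. */
typedef struct {
    char token[16];
    int  epoch[4];
} EpochLine;

/* Parses a line such as "0165179 104 206 307 409".
   Returns 1 on success and 0 if the line does not contain five fields. */
int parse_epoch_line(const char *line, EpochLine *out)
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line, "%15s %d %d %d %d",
                   out->token,
                   &out->epoch[0], &out->epoch[1],
                   &out->epoch[2], &out->epoch[3]);
    return n == 5;
}
```

Each of the four values would then be stored under out->token in the database with the matching number, epoch[0] in epoch1.db and so on.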
The present version of Embedded Shruti uses epoch1.db. To produce better quality
speech, later versions of the software might use the other epoch database files.
The intonation file is also saved into a GDBM database and used accordingly in the program.
Thus the new model of the hindiengine can be represented by the following picture:
Both the epoch database and the intonation database are saved on disc (secondary storage)
rather than in main memory (RAM). The advantage is that the linear scan is now avoided and
the epoch can be obtained in almost O(1) time, given the key of the epoch, which
is the token name.
With the basic structure of this model in place, the next sections take a detailed look at
extendible hashing and at why it is an efficient data structure when a value is to be
retrieved by its key.
4.2.1 Extendible hashing
Traditional hash methods are burdened with 2 disadvantages:
• Sequential processing of a file according to the natural order on the keys is not
supported.
• They are not extendible:
  - Hash table size is pre-determined.
  - Hash table size heavily relies on the hash function.
  - Overestimation of the number of records results in wasted space.
  - Underestimation of the number of records results in rehashing.
The extendible hashing method allows hashing to adapt to dynamic files. Hash tables are
naturally balanced. By extending the hash address space from the directory address space,
hash tables can be made extendible.
4.2.2 Extending hash tables
Assumptions:
• A hash function, h, exists.
• If K is a key, then K’ = h(K) is a pseudokey.
The file is structured into two levels:
• Leaves: contain pairs (K, I(K)), where I(K) is the information associated with K (the
record associated with K or a pointer to the record). Each leaf contains a header that
stores the local depth.
• Directory: contains a header that stores the depth, and contains pointers to leaf pages.
Example
The following figures explain the working of extendible hash structures.
[Figure 1, Figure 2 and Figure 3]
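The two-level structure of leaves and directory can also be sketched in code. The following is a minimal in-memory illustration and not the GDBM implementation: it assumes distinct unsigned integer pseudokeys, uses the low-order bits of the pseudokey to select the directory slot, and uses a deliberately tiny leaf capacity so that splits happen early. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define BUCKET_CAP 2   /* records per leaf page (tiny, to force splits) */

typedef struct {
    int local_depth;             /* header of the leaf page */
    int count;
    unsigned keys[BUCKET_CAP];   /* pseudokeys K' = h(K) */
    int values[BUCKET_CAP];      /* I(K), the associated information */
} Leaf;

typedef struct {
    int depth;                   /* header of the directory */
    Leaf **slots;                /* 2^depth pointers to leaf pages */
} Directory;

static Leaf *leaf_new(int local_depth)
{
    Leaf *l = calloc(1, sizeof *l);
    assert(l != NULL);
    l->local_depth = local_depth;
    return l;
}

void dir_init(Directory *d)
{
    d->depth = 0;
    d->slots = malloc(sizeof *d->slots);
    assert(d->slots != NULL);
    d->slots[0] = leaf_new(0);
}

/* The low `depth` bits of the pseudokey select the directory slot. */
static unsigned slot_of(const Directory *d, unsigned key)
{
    return key & ((1u << d->depth) - 1u);
}

/* Returns 1 and stores the value if the key is present, else returns 0. */
int dir_lookup(const Directory *d, unsigned key, int *value)
{
    const Leaf *l = d->slots[slot_of(d, key)];
    for (int i = 0; i < l->count; i++)
        if (l->keys[i] == key) { *value = l->values[i]; return 1; }
    return 0;
}

/* Inserts a new key; assumes the key is not already present. */
void dir_insert(Directory *d, unsigned key, int value)
{
    for (;;) {
        Leaf *l = d->slots[slot_of(d, key)];
        if (l->count < BUCKET_CAP) {          /* room in the leaf: done */
            l->keys[l->count] = key;
            l->values[l->count++] = value;
            return;
        }
        if (l->local_depth == d->depth) {     /* double the directory */
            unsigned n = 1u << d->depth;
            assert(d->depth < 16);            /* limit of this sketch */
            d->slots = realloc(d->slots, 2 * n * sizeof *d->slots);
            assert(d->slots != NULL);
            for (unsigned i = 0; i < n; i++) d->slots[n + i] = d->slots[i];
            d->depth++;
        }
        /* Split the full leaf on the bit at position local_depth. */
        unsigned bit = 1u << l->local_depth;
        Leaf *sib = leaf_new(++l->local_depth);
        for (unsigned i = 0; i < (1u << d->depth); i++)
            if (d->slots[i] == l && (i & bit)) d->slots[i] = sib;
        int kept = 0;
        for (int i = 0; i < l->count; i++) {
            Leaf *dst = (l->keys[i] & bit) ? sib : l;
            int j = (dst == l) ? kept++ : sib->count++;
            dst->keys[j] = l->keys[i];
            dst->values[j] = l->values[i];
        }
        l->count = kept;                      /* then retry the insert */
    }
}
```

A lookup touches the directory once and one leaf once, which is why retrieval by key stays close to constant time however many records are stored. Only the full leaf is split on overflow, and the directory doubles only when that leaf already uses all of the directory's bits.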
4.2.3 Using GDBMCE library
The gdbmce.dll library exports all the functions needed for the database operations. All the
functions of GDBMCE were explained in Chapter 2, and all of them are ported to the Windows
CE platform. The .def (definition) file for the gdbmce dynamic link library exports the
following functions, which can be accessed through function pointers as discussed in the last
model.
//GDBMCE .def file
LIBRARY GDBMCE
EXPORTS
gdbm_open
gdbm_close
gdbm_store
gdbm_fetch
gdbm_delete
gdbm_firstkey
gdbm_nextkey
gdbm_reorganize
gdbm_sync
gdbm_exists
gdbm_setopt
gdbm_errno
gdbm_version
The following code snippets perform the database operations using gdbmce.dll.
//Database variable
GDBM_FILE dbf;
//Variables to work with the database
datum key, content;
/*
datum is a data structure defined in gdbmce.h which has two important members. The
first member of the structure is a pointer to a character array, while the second member of
the structure is an integer which stores the number of bytes in that array. Using
this data structure the values are stored in and retrieved from the database.
*/
//Define the function pointers to call the gdbm functions
typedef GDBM_FILE(*GDBMOpen_ptr)(WCHAR*,int,int,int,void*);
typedef void(*GDBMClose_ptr)(GDBM_FILE);
typedef int(*GDBMStore_ptr)(GDBM_FILE,datum,datum,int);
typedef datum(*GDBMFetch_ptr)(GDBM_FILE,datum);
//Instance to call the dll
HINSTANCE hInst1;
//Loading the engine dll
hInst1 = ::LoadLibrary(L"gdbmce.dll");
//Getting the function pointers of 4 GDBM functions
GDBMOpen_ptr pgdbm_open=(GDBMOpen_ptr)GetProcAddress(hInst1,L"gdbm_open");
GDBMClose_ptr pgdbm_close=(GDBMClose_ptr)GetProcAddress(hInst1,L"gdbm_close");
GDBMStore_ptr pgdbm_store=(GDBMStore_ptr)GetProcAddress(hInst1,L"gdbm_store");
GDBMFetch_ptr pgdbm_fetch=(GDBMFetch_ptr)GetProcAddress(hInst1,L"gdbm_fetch");
If any of the pointers (pgdbm_open, pgdbm_close, pgdbm_store, pgdbm_fetch) is NULL then
the corresponding function is not exported by the dynamic link library. It is always
advisable to check that hInst1 and the function pointers are not NULL before they are used.
//Opening a database
//Database reader
dbf = (*pgdbm_open)(newstring,512,GDBM_READER,777,0);
newstring contains the name of the database to open. 512 specifies the block size in
which the data will be accessed from the disc. GDBM_READER specifies that the
database is to be opened in read mode. 777 is the mode of the database file, intended to
give read, write and execute permission on a database file created by the call (in C this
permission set is normally written as the octal literal 0777). 0 is passed in place of a
fatal error function, so that the default one is used.
//Database writer
dbf = (*pgdbm_open)(newstring,512,GDBM_WRITER,777,0);
GDBM_WRITER specifies that the database is opened for writing. It also requires that
the database should be present on the disc.
//Create a database
dbf = (*pgdbm_open)(newstring,512,GDBM_WRCREAT,777,0);
This will create a database if the database doesn’t exist and provide both a reader and
writer for the database. *pgdbm_fetch reads the database and retrieves value
corresponding to a given key value while *pgdbm_store writes into the database
according to the key value thus provided.
//Storing into a database
Suppose there are two strings. The first string contains the key which is the tokenname
(“0704” for example) and the second string which is epoch contains the epoch value
corresponding to the key (“107” for example). Now to store the information into the
database the following function will be used:
//Storing the key and its size into the datum structure
key.dptr = tokenname;
key.dsize = strlen(tokenname);
//Storing the value and its size to datum structure
content.dptr = epoch ;
content.dsize=strlen(epoch);
//Storing into the database after successful opening
(*pgdbm_store)(dbf, key, content, GDBM_INSERT);
Take a look at the GDBM_INSERT option on the function call. GDBM_INSERT
parameter inserts the value corresponding to the key. If the key exists then the store
operation will fail. GDBM_REPLACE is used in those cases where the key already exists
and the value needs to be changed.
//Fetch <key, value> pairs from the database
Suppose the epoch value corresponding to the token name (“0165179” for example) is to
be fetched from a database that has been opened successfully in GDBM_READER mode. The
following code will be used to fetch the value:
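//Storing the key and its size into the datum structure
key.dptr = tokenname;
key.dsize = strlen(tokenname);
//fetch the value corresponding to the key
content = (*pgdbm_fetch)(dbf,key);
int epochval = 0;
if (content.dptr != NULL)
{
//The value was stored without a terminating NUL, so copy exactly dsize
//bytes into the string epoch (which must hold dsize+1 characters) and terminate it
memcpy(epoch,content.dptr,content.dsize);
epoch[content.dsize] = '\0';
epochval = atoi(epoch);
//In standard GDBM the fetched dptr is allocated with malloc and owned by the caller
free(content.dptr);
}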
Thus as the operation completes, epochval will contain the integer value of the epoch
string that is fetched from the database.
4.2.4 Advantages
Model 2 avoids the linear scan of the epoch file and the intonation file. Therefore this
implementation is much faster and more efficient than Model 1. A performance
comparison is given in the next chapter, where both implementations are run on
the same input. For this application, where <key, value> pairs are to be retrieved
efficiently, extendible hashing is a well-suited data structure, and GDBMCE uses extendible
hashing.
4.2.5 Drawbacks
There are two drawbacks in this model:
1. The Tokens.txt file generated after the hindianalyser phase is scanned linearly by the
hindiengine dynamic link library to retrieve the token name and the token type.
Refer to the implementation of fscanf using ReadFile in the Model 1 section of this
chapter. To avoid this linear scan, in the next model the token name and token type
are also saved in a GDBMCE database, using as key an index value which starts
from 0 and increases by one for every further token.
2. The sound library is still a directory which contains the wav files. In the next model
the sound files are also kept in a GDBM database, with the token name as the key and
the sound file as the value. The directory with a number of wav files is replaced by a
single file called voice.db. This makes it easier to transfer the sound library to the
Windows CE device, and the single file can be provided on flash ram accompanying the
application. The voice database also avoids the linear scanning of voice files during
the generation of the speech. Since the voice file is now saved in the database as a
character array, efficient array operations like memcpy can be used in place of the
linear scan of the sound files done in the present model.
These drawbacks led to the development of a third model in which the voice
database is added to the epoch database and the intonation database that
were already there. The tokens are also saved in a database, and no normal file
operations are used in this version. That makes the third model more robust
than the other two models. The next section describes the changes made in the third
model.
4.3 Model 3: Windows CE port using GDBM (With voice.db)
First take a look at the new dataflow structure of the Hindiengine module. The following
figure displays the dataflow structure after the GDBM databases are added:
The modifications made to Model 2 to obtain Model 3 were already discussed in
the last section. Model 3 is the most efficient implementation, so the following is a
complete pictorial view of Embedded Shruti in this implementation.
In this implementation the tokens database and the voice database were added. The basic
database operations discussed in the last section hold in this model as well.
4.3.1 Making voice database
In the earlier models the sound files are saved in a directory called voice; according to the
token name the appropriate file is taken from the directory and, after modifications,
appended to the output voice file. In this model the sound files are saved in a GDBM
database, with the name of the sound file, which is in effect the token name, as the key. For
example, if there is a file in the voice library called 0164179.wav, then the token is
extracted from the file name by dropping the extension (0164179 in this case) and the
key value is set to 0164179. The wav file 0164179.wav is then saved in
the database under this key. Later hindiengine retrieves the sound file using this
key.
The database file voice.db was created on the linux platform, since directory scanning using
system calls is quite easy in linux and GDBM is readily available on most linux platforms.
A GDBM database file created on the linux platform can be used in Windows CE through the
GDBMCE library function calls. The following code shows how the voice database is
made from the voice directory:
/* Linux code */
/* Header files needed */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
#include <sys/types.h>
#include <gdbm.h>
/*
Steps by which the voice database is created
1.The directory of sound files is transferred to the linux machine.
2.Scan the directory, get the name of each file.
3.key = Name - trailing .wav
4.Content = The wav file. The size of the sound data can be obtained from the
file itself. Bytes 40-43 of the wav file give this size.
5.Accordingly the wav file will be stored by the key described above.
*/
int main(void)
{
//define the database handler for the voice database
GDBM_FILE dbf;
//To make the voice database
datum key,content;
//Directory pointers to read the contents of voice directory
DIR *dirp;
struct dirent *dpointer;
int i=0,size;
char *name,*ext,*buffer;
char path[512];
FILE* fpt;
//0777 is octal: read, write and execute permission on the new file
dbf=gdbm_open("voice.db",512,GDBM_WRCREAT,0777,0);
if(dbf==NULL)
{
fprintf(stderr,"Error opening voice.db\n");
exit(1);
}
if((dirp=opendir("voice"))==NULL)
{
perror("Error opening voice");
exit(1);
}
//Code to scan the directory and put each wav file into database.
while((dpointer=readdir(dirp))!=NULL)
{
//Skip ".", ".." and every file that is not a wav file
ext = strrchr(dpointer->d_name,'.');
if(ext==NULL || strcmp(ext,".wav")!=0)
continue;
snprintf(path,sizeof(path),"voice/%s",dpointer->d_name);
//Cut off the extension; name now holds the token
*ext = '\0';
name = dpointer->d_name;
//Inserting values in key
key.dptr = name;
key.dsize = strlen(name);
//Open in binary mode and read the data size stored after byte 40
//(read directly into an int, which assumes a little-endian host)
fpt = fopen(path,"rb");
if(fpt==NULL)
{
perror(path);
continue;
}
fseek(fpt,40,SEEK_SET);
if(fread(&size,4,1,fpt)!=1)
{
fclose(fpt);
continue;
}
fseek(fpt,0,SEEK_SET);
//Add the 44 bytes of the header
size=size+44;
buffer=(char*)malloc(size);
fread(buffer,size,1,fpt);
//Inserting values in content
content.dptr = buffer;
content.dsize = size;
//Inserting in database
printf("Inserting in database File no : %d\n",i);
gdbm_store(dbf,key,content,GDBM_INSERT);
free(buffer);
fclose(fpt);
i++;
}
closedir(dirp);
gdbm_close(dbf);
return 0;
}
/* compile line */
bash$ cc voicedb.c -o voicedb -lgdbm
/* execution */
bash$ ./voicedb
This will create the voice database in the present working directory. The database will be
transferred to Windows CE emulator or device for use.
4.3.2 Making tokens database
To make the tokens database first of all the GDBMCE dynamic link library should be
loaded into memory. After that open a database file with the name tokens.db. This file
will contain the token and type for characters of the input text provided by the frontend
through the file TextIscii.txt. In the database file save the token and type as they are
produced by Natural Language Processor.
An index value is maintained that starts at 0 and works as the key to this database. Each
time a new pair of token and type is added to the database under the current index, the
index value is increased by 1. The Hindianalyser phase returns the final value of the index
and the frontend passes this value to Hindiengine, which uses it to retrieve all the tokens
and their respective types.
When the token and type are inserted in the database a delimiter is added between the two.
The following code shows the insertion technique:
char buffer[20];
char keybuffer[20];
//0 concatenated with word[i] gives the token name.
//1 is the token type in this case.
//delimiter | is added between the token name and type.
sprintf(buffer,"%d%d|%d",0,word[i],1);
//content is the datum variable needed for insertion
content.dptr = buffer;
//The terminating NUL is stored as well, so that the fetched record can be
//handled as a C string (for example by strtok)
content.dsize = strlen(buffer) + 1;
//key is the datum variable to hold the key which is index in this case
sprintf(keybuffer,"%d",index);
key.dptr = keybuffer;
key.dsize = strlen(keybuffer);
//dbftokens is the handler to the tokens database
//Function is called with GDBM_REPLACE as the argument so that the
//database will be rewritten if the key already exists.
(*pgdbm_store)(dbftokens,key,content,GDBM_REPLACE);
//increase the index value
index = index + 1;
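The record layout used in the tokens database, token name and token type separated by the delimiter |, can be captured in two small helper functions. This is a sketch under assumptions, not the thesis code: format_token_record and parse_token_record are hypothetical names, and the parser expects a NUL-terminated record.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Writes "<token>|<type>" into out. Returns the number of bytes to store,
   including the terminating NUL, or 0 if out is too small. */
size_t format_token_record(char *out, size_t out_size,
                           const char *token, int type)
{
    assert(out != NULL && token != NULL);
    int n = snprintf(out, out_size, "%s|%d", token, type);
    if (n < 0 || (size_t)n >= out_size)
        return 0;
    return (size_t)n + 1;
}

/* Splits a NUL-terminated "<token>|<type>" record without modifying it.
   Returns 1 on success, 0 if the delimiter is missing or the token does
   not fit into token_out. */
int parse_token_record(const char *record, char *token_out,
                       size_t token_size, int *type_out)
{
    assert(record != NULL && token_out != NULL && type_out != NULL);
    const char *bar = strchr(record, '|');
    if (bar == NULL)
        return 0;
    size_t len = (size_t)(bar - record);
    if (len >= token_size)
        return 0;
    memcpy(token_out, record, len);
    token_out[len] = '\0';
    *type_out = atoi(bar + 1);
    return 1;
}
```

Unlike strtok, the parser leaves the fetched record untouched, so the same buffer can be inspected again or released afterwards without surprises.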
4.3.3 Producing the output sound file
Hindiengine will first get the token name and token type from tokens database. Once the
token name is obtained, it is used as a key to retrieve the sound file from the voice
database. The following code shows the whole process:
//Fill the key to retrieve content from the database
sprintf(tokenbuffer,"%d",index);
tokenskey.dptr = tokenbuffer;
tokenskey.dsize = strlen(tokenbuffer);
tokenscontent = (*pgdbm_fetch)(dbftokens,tokenskey);
//Get the token and the type from tokenscontent.
token = strtok(tokenscontent.dptr,"|");
ttype1 = strtok(NULL,"|");
ttype = atoi(ttype1);
//This code finds the length of the voice file and copies the header of
//the wav file into a variable Header which will later be added to the
//output sound file. In wav files the 4 bytes after the first 40 bytes
//give the size of the sound data.
voicekey.dptr = token;
voicekey.dsize = strlen(token);
voicecontent = (*pgdbm_fetch)(dbfvoice,voicekey);
memcpy(Header,&voicecontent.dptr[0],40);
len = (voicecontent.dsize-44);
//This code copies the remaining wav file (except the first 44 bytes of
//the header) into a temporary character array called data and it will
//be concatenated to the output sound file later.
memcpy(data,&voicecontent.dptr[44],len);
This completes the description of model 3.
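How the copied header and the data parts might be assembled into one output file is sketched below. This is an assumption-laden illustration, not the ILPS implementation: it omits the modifications that ILPS applies to each unit, assumes canonical 44-byte headers, and uses the hypothetical names wav_begin, wav_append and wav_finish.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44

/* Stores a 32-bit value in little-endian order, as wav headers require. */
static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Starts the output with the header of the first unit and returns the
   number of bytes written so far. */
size_t wav_begin(unsigned char *out, const unsigned char *unit)
{
    memcpy(out, unit, WAV_HEADER_SIZE);
    return WAV_HEADER_SIZE;
}

/* Appends the sound data of one unit (everything after its header) and
   returns the new output length, or 0 if the data would not fit. */
size_t wav_append(unsigned char *out, size_t out_len, size_t out_cap,
                  const unsigned char *unit, size_t unit_size)
{
    assert(unit_size >= WAV_HEADER_SIZE);
    size_t len = unit_size - WAV_HEADER_SIZE;
    if (out_len + len > out_cap)
        return 0;
    memcpy(out + out_len, unit + WAV_HEADER_SIZE, len);
    return out_len + len;
}

/* Patches the two size fields once all units have been appended. */
void wav_finish(unsigned char *out, size_t out_len)
{
    put_le32(out + 4, (uint32_t)(out_len - 8));                /* RIFF chunk size */
    put_le32(out + 40, (uint32_t)(out_len - WAV_HEADER_SIZE)); /* data size */
}
```

The size fields are written only once at the end, because the total length of the sound data is not known until the last unit has been appended.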
4.3.4 Advantages
This model is the most efficient of the three models discussed so far.
Linear scans of the files are completely avoided and efficient memory functions like
memcpy are used to increase the performance of the software. As all databases are kept
on the disc, this model marks the limit of what a purely disc-based design can achieve:
an implementation that always accesses the disc rather than main memory is not expected
to take less time than this implementation. No ordinary file operations are used in this
implementation and therefore this is a robust implementation.
4.3.5 Drawbacks
This model also has the following drawbacks:
1. The tokens database is saved on disc rather than in main memory. So in both the
Hindianalyser and Hindiengine modules, writing the tokens and then reading them
again needs disc access, which takes more time than RAM access. One solution to
this problem is to maintain the list of tokens in main memory rather than writing
it to a database on disc. Another solution is to maintain the tokens database in
Flash RAM rather than on the secondary storage, which is the disc. Flash RAM
access time is much less than disc access time.
2. Whenever a sound file is retrieved from the database the disc is accessed, since the
voice database is saved on disc. This takes more time and affects the
performance of the software.
These drawbacks lead to the development of Model 4, in which both these problems are
addressed and possible solutions are suggested. The next section explains the
solutions, which increase the efficiency by removing the drawbacks of Model 3.
4.4 Model 4: Final Windows CE port
Model 4 is a hybrid model which judiciously uses main memory and secondary storage
(disc) to obtain the optimal performance. Model 3 if implemented on a Flash RAM will
give better performance but still the performance can be enhanced by using this hybrid
approach.
4.4.1 Approach
This model makes two important additions to the last model in an attempt to increase the
performance. The additions are the following:
1. The token list, which was stored on the disc in the last model, is stored in main
memory so that access takes less time. Since Embedded Shruti is built on
Microsoft Foundation Classes for Windows CE, a number of MFC utility classes
can be used to maintain a hash structure of the tokens. The hash facilitates
retrieval. In this implementation the MFC class CMap is used to store the token
name and the token type, keyed by an index variable which is increased by one
for each token.
2. The other problem in model 3 was that each time a new sound file is fetched
from the database, disc access time is needed. A cache structure is implemented to
speed up sound file access. The cache stores N sound files in main memory,
where N is selected according to the application; typically N can be
20 sound files. When a file is required, the cache is checked first to see
whether the file is already there. If the file is available then no disc
access is required. If the file is not available, it is brought from the
database on disc and saved in the cache. If the cache is full then an
appropriate cache replacement strategy such as the Least Recently Used (LRU)
algorithm is used: a victim sound file is chosen and removed from the cache,
and the new sound file is kept in its place.
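The cache with LRU replacement described in the second point could be organised as in the following sketch. It is a simplified illustration and not the Embedded Shruti code: it tracks only the token names, uses 3 slots instead of the suggested 20 to keep the behaviour easy to follow, and the names SoundCache, cache_init and cache_access are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define CACHE_SLOTS 3      /* N; the text suggests about 20 in practice */
#define KEY_LEN 16

typedef struct {
    char key[KEY_LEN];       /* token name; empty string marks a free slot */
    unsigned long last_use;  /* clock value of the most recent access */
    /* a real cache would also hold the sound data pointer and its size */
} Slot;

typedef struct {
    Slot slots[CACHE_SLOTS];
    unsigned long clock;
    unsigned long hits, misses;
} SoundCache;

void cache_init(SoundCache *c)
{
    memset(c, 0, sizeof *c);
}

/* Looks the token up. On a hit the slot index is returned. On a miss a
   free slot or the least recently used slot is chosen as victim, the
   token is recorded there (this is where the caller would fetch the
   sound file from voice.db) and the victim index is returned. */
int cache_access(SoundCache *c, const char *token)
{
    assert(c != NULL && token != NULL);
    assert(token[0] != '\0' && strlen(token) < KEY_LEN);
    int victim = 0;
    c->clock++;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (strcmp(c->slots[i].key, token) == 0) {
            c->slots[i].last_use = c->clock;
            c->hits++;
            return i;
        }
        /* free slots have last_use 0, so they are always chosen first */
        if (c->slots[i].last_use < c->slots[victim].last_use)
            victim = i;
    }
    c->misses++;
    strcpy(c->slots[victim].key, token);
    c->slots[victim].last_use = c->clock;
    return victim;
}
```

With N as small as 20 a linear search over the slots is cheap compared with a disc access, so no further index structure is needed for the cache itself.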
The modifications suggested make balanced use of the Random Access Memory and the
secondary disc so that Embedded Shruti performs as well as possible. So far the models
have been tested on the Pocket-PC emulator and, as mentioned in the API reference of the
Pocket-PC, the performance of the software on a real Pocket-PC will be better than the
performance on the emulator.
This chapter concludes with a graphical view of model 4. The next chapter
gives the performance comparison of the different models on the same input string. Since
the input string remains the same, the time taken by each model clearly differentiates
the performance of the implementations.
4.4.2 Dataflow Model
Chapter 5
PERFORMANCE COMPARISON
In the last chapter different models for Embedded Shruti were discussed at length.
This chapter will present the performance of each model on a given input text. This
chapter will also specify the steps to run different models of Embedded Shruti on
Pocket-PC emulator. Performance comparison is an important part of the
development process since this phase determines which model scores over the others
and therefore should be chosen as the final implementation that will be ported.
5.1 Performance Parameters
First of all the performance parameters are to be specified. For real time applications
the performance should be measured on the basis of actual time taken. Windows CE
is a real time operating system and therefore software running on it is typically part of
real time applications, so real (wall clock) time is the measure considered here. For
the different models, the time taken to get the output once the input text is supplied and
the Analyse and Generate buttons are clicked is considered. The model which scores
better than the others on this metric is considered the better implementation.
There are several other possible parameters, like RAM space used and disc space used, but
they are of less interest here. For embedded applications real time constraints
should be satisfied, so the time taken for the application to execute is the most
important performance parameter.
5.2 Simulations on test inputs
This section will explain how to run each model on Pocket-PC emulator and then to
check the program on given test inputs. For every implementation the steps will be
provided to run the application.
5.2.1 Model 1
To run the application for Model 1 the following steps should be carried out:
1. First of all compile the source code. The source code includes the source code for
Frontend, and the source for the two dynamic link libraries Hindianalyser and
Hindiengine. All the three modules will be compiled using the eMbedded Visual
C++.
2. On the eMbedded Visual C++ IDE, specify the SDK as the Pocket-PC, and the
next fields as Win32 (WCE x86 debug) and Pocket PC 2002 emulation.
3. If the compilation is successful, then the dynamic link libraries will be made and
transferred into the Pocket-PC emulator.
4. The executable Frontend.exe will also be made and transferred into the Pocket
PC’s default executable path.
5. The dynamic link libraries are copied into \Windows directory on the target
emulator or the target device. An MFC dll is also copied to \Windows directory as
MFC dll is needed to run the Frontend executable
6. Now before running the application upload the files epoch.txt and inton_bengali
onto the emulator. To upload the files go to Tools > Remote File Viewer. Once
the Remote File Viewer appears the files can be transferred onto the emulator or
the device.
7. After the files are transferred the next step is to transfer the sound library on the
emulator or target device. A new directory will be made called voice on the
device and then using Remote File Viewer the wav files will be uploaded into the
voice directory.
8. The application can be run by clicking on Frontend.exe on the start menu. The
Frontend GUI will start and the Bengali text for Text-to-Speech conversion will
be applied to it.
9. An example of a Bengali or Hindi Text can be “mera naam piyush hai”. This text
will be entered in input area and then performance will be obtained by pressing
the Analyse button and the Generate button.
10. A sound file will be generated as the output and will be played on the emulator or
the device.
11. The next section will compare the real time performance of this model with other
models.
5.2.2 Model 2
1. In this model GDBMCE library is used. Therefore first of all source code for
GDBMCE library will be compiled.
2. Successful compilation will upload the gdbmce.dll to \Windows folder on the
target device or the emulator.
3. epoch.txt and inton_bengali files will not be transferred to the device. Instead the
hash database epoch1.db and inton_bengali.db will be transferred to the device.
4. All the remaining steps are the same as in Model 1.
5.2.3 Model 3
1. This model does not require the sound library (the folder consisting of wav files) to
be transferred to the device. Instead, in this model the voice database (voice.db) is
transferred to the device.
2. All the remaining steps are the same as in Model 2.
5.3 Comparison of performance
The following table gives the performance of each model according to the time taken
by the Hindianalyser module (clicking on the Analyse Button) and the time taken by
the Hindiengine module (clicking on the Generate Button). The input text used for the
comparison: “mera naam piyush hai”
Model Name    Hindianalyser Performance           Hindiengine Performance
Model 1       4 seconds (Tokens in file)          70 seconds (Very inefficient)
Model 2       4 seconds (Tokens in file)          10 seconds (Increase in efficiency)
Model 3       4 seconds (Token database added)    5 seconds (Best performance)
On the basis of this performance chart it can be concluded that Model 3 is the most
efficient of the three and it is to be used for the final implementation. Model 4 is also
suggested, which is an extension of Model 3 and uses a hybrid approach as
discussed in the last chapter.
Chapter 6
CONCLUSIONS
This thesis provides the complete design and implementation of Embedded Shruti. It
started with an introduction to technologies that were used in this software product.
After that details of different models were provided. The last chapter provides a
comparison of the performance of different models and the reason for choosing
Model 3 as the final implementation.
Embedded Shruti has several advantages over the desktop version. To run the desktop
version one needs a desktop computer system, which is costly compared to a Personal
Digital Assistant like the Pocket-PC. The Pocket-PC is a mobile device and therefore the
software can be used on the move. A person with a speech disorder only
has to carry a PDA with Embedded Shruti installed on it. The person can
communicate with others using the software, and since it is installed on a PDA rather
than a desktop computer, the software can be taken to any place.
I personally feel that Embedded Shruti realizes the dream of providing Shruti
software to every person who needs it. For a person with speech disorders this
software will be an integral part of life. Carry a Personal Digital Assistant having
Embedded Shruti and you have the power to communicate with people despite your
serious speech disorders.
Chapter 7
FUTURE WORK
Embedded Shruti is yet to be tested completely on a number of test inputs of varying length.
The Hindianalyser module is tested completely but the Hindiengine module is not yet
rigorously tested. The software gives correct results for the input strings on which it has
been tested so far, but more testing is still required.
The final version, which is to be shipped to customers, will save the voice database,
intonation database and epoch database on a Flash rather than on the secondary storage of
the Pocket-PC device. Flash takes comparatively less time than disc, so this is expected to
increase the performance. A Flash version of the code is to be written. In the Flash
version of the code the path of the database has to be changed. Presently, since the
database is in the root directory of the Pocket-PC, the path is simply the name of the
database: when the voice database is opened, for example, the path is
“voice.db”. When a Flash is connected to the device the path
changes to “\Storage Card\voice.db”. In the Flash version of the code this change will be
incorporated.
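The path change for the Flash version can be isolated in one helper so that the rest of the code stays the same. The sketch below is only an illustration: build_db_path and its on_flash flag are hypothetical, and it builds a narrow character string, whereas the GDBMCE open function shown in Chapter 4 takes a WCHAR string, so a real port would need the wide character equivalent.

```c
#include <assert.h>
#include <stdio.h>

/* Builds the path of a database file. With on_flash == 0 the database is
   in the root directory and the path is just its name; otherwise the
   storage card prefix given in the text is put in front of the name.
   Returns 1 on success and 0 if out is too small. */
int build_db_path(char *out, size_t out_size, const char *db_name, int on_flash)
{
    assert(out != NULL && db_name != NULL);
    int n = snprintf(out, out_size, "%s%s",
                     on_flash ? "\\Storage Card\\" : "", db_name);
    return n >= 0 && (size_t)n < out_size;
}
```

Every place that opens a database would call this helper instead of passing the bare name, so switching between the two versions becomes a single flag.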
The software has been tested only on the Pocket-PC emulator so far. Once the testing and
debugging is done it will be ported to Pocket-PC hardware. The databases, the dynamic link
libraries (hindianalyser, hindiengine and gdbmce) and the application Frontend.exe will be
transferred to the device using eMbedded Visual C++. The environment is changed from
Pocket-PC (Emulation) to Pocket-PC (default device) and the device is connected
to the development workstation through a COM port. After that the files can be
transferred to the Pocket-PC.
Chapter 8
SOFTWARE SCREENSHOTS
8.1 Hindianalyser module on Standard SDK emulator (First port)
The output of Hindianalyser module is shown in the second text box. 0204 is the
token and 0 is the token type. The output has repetitive sequence of token and token
type.
8.2 Model 1 on Pocket-PC Emulator
8.2.1 Hindianalyser module execution: The text box contains the token and token
type generated in this phase. The <token, token type>pairs generated are:
0204 0 0204172 3 0172 1 0172207 2 0207 0 -2 5 0198 0 0198164 3 0164 1 0164164
4 0164 1 0164204 2 0204 0 -2 5 0200 0 0200166 3 0166 1 -1 5 0205 0 0205168 3
0168 1 0168213 2 0213 0 -2 5 0216 0 0216173 3 0173 1 -2 5
8.2.2 Hindiengine Module execution: After the Analyse button is clicked the tokens
are generated and displayed on the second text box. This completes the execution of
Hindianalyser module. Hindiengine module is called when Generate button is clicked.
After Generate is clicked the speech file will be generated and played. The second
text box will be updated by the number of bytes in the tokens.txt file.
8.3 Model 3 on Pocket-PC emulator: Model 1 screenshots are shown above. Please
refer to Chapter 4 for details about the models. Model 3 is implemented
using only hash databases and no file operations are used. This model works
efficiently compared to the other two models. Model 3 screenshots follow.
8.3.1 Hindianalyser module execution: In Model 3, before executing the Frontend, the
database files are sent to the device. The database files are: inton_bengali.db
(intonation database), epoch.db (epoch values) and the voice database (voice.db).
This model gives the best performance as shown in Chapter 5.
8.3.2 Hindiengine Module execution: The next screenshot shows the application
after the Hindiengine module is executed by clicking on the Generate button. It plays the
sound and gives as output the total number of <token, token type> pairs in the
tokens database.
The number of <token, token type> pairs generated from the input text is shown in
the text box. The Hindiengine module execution is efficient compared to Model 1 and
Model 2. Therefore this model emerges as the winner and it will be used in the final
version of Embedded Shruti.
Chapter 9
REFERENCES
Embedded Shruti is an implementation project. The documentations which helped me
in this project are listed chapter wise:
Chapter 1
None
Chapter 2
1. Programming Microsoft Windows CE (Second Edition) by Douglas Boling
2. Windows CE .NET documentation.
3. Pocket-PC SDK documentation.
4. Microsoft eMbedded Visual C++ documentation.
5. Microsoft SQL Server CE documentation.
6. GDBM man pages
Chapter 3
1. Choudhury M. Rule-based Grapheme to Phoneme Mapping for Hindi Speech
Synthesis. Presented at the 90th Indian Science Congress of ISCA, Bangalore, 2003
2. Source code of Desktop version of Shruti
Chapter 4
1. GDBM man pages.
2. Source code of GDBM port to Windows CE.
3. Ronald Fagin, Jürg Nievergelt, Nicholas Pippenger, H. Raymond Strong, Extendible
hashing - a fast access method for dynamic files, ACM Transactions on Database
Systems, New York, NY, Volume 4 Number 3, 1979, pages 315-344.
The complete source code of the implementation is available in MediaLab, Indian
Institute of Technology, Kharagpur. Take a look at the source code to understand
how the software is working. All the models are implemented separately and you
can yourself do a performance evaluation of the respective models.