licenta

Table of Contents
Abstract
1 Basic Concepts
1.1 Embedded systems
1.2 Microkernel
1.3 Virtualized Linux
1.4 Scheduling
1.5 Wombat - Iguana model
2 PerfMonitor Description
2.1 Architecture
2.2 Use cases
2.3 Concepts
2.3.1 Function
2.3.2 Channel
2.3.3 Data Hierarchies
2.3.4 Event Loop
2.3.5 Resource Manager
3 Implementation details
3.1 Client - Server Protocol
3.2 Client Library API
3.3 Configuration Files
Abstract
Combining the real-time requirements of embedded systems with the flexibility and high
performance of general-purpose computers is a hard target to reach for a single system.
Measuring performance on ARM processors and setting profiles for applications is a starting
point for building a scheduler that can guarantee real-time deadlines while still maintaining
high performance.
1 Basic Concepts
1.1 Embedded systems
An embedded system can be regarded as a computer system designed to perform
one or more dedicated functions, usually with real-time constraints. It can contain one or
more processors, each dedicated to handling a particular task. A general-purpose computer,
in contrast, is designed to be flexible and to meet end-user needs.
"Embedded system" is not a strictly definable term. Most systems have
elements of programmability and extensibility, which are characteristic of general-purpose
computers. A handheld computer can have an embedded operating system and multiple
specialized processors, like a digital signal processor (DSP), while also allowing different
applications to be loaded and peripherals to be connected.
The embedded market accounts for well over 99% of processors and is growing strongly,
while the PC market is rather flat.
Most of these systems have very small operating systems, whose kernel consists of
some device drivers and libraries, and are based on 8-bit and 16-bit microcontrollers. The
demand for more sophisticated devices, with 32-bit or even 64-bit general-purpose
microprocessors and memory management units (MMU), has risen as more and more
functionality is incorporated into a single device.
An embedded system is usually a real-time computing system: the correctness of an
operation on such a system depends not only on the logical correctness of the code, but
also on meeting its time restrictions. Depending on the time constraints, these systems can
be classified into:
• hard real time: the execution of a critical part of the code is guaranteed to complete
within a given time interval. A deadline violation is considered a critical failure and can have
disastrous consequences. An example is deploying the airbags in an automobile.
• soft real time: if the time constraints are violated, only the quality of service of
the product is reduced. A deadline violation is not critical and may occur from time
to time. For example, when displaying video, a frame may occasionally be dropped with no
disastrous consequence.
A standard Linux is not suitable for real-time operation, because its response times
are unpredictable. RTLinux (Real Time Linux) addresses this problem. RTLinux is a
microkernel that runs the entire Linux operating system as a fully preemptive process,
meaning that Linux runs on top of this core as a thread. This system should provide
real-time performance. However, the system is too complex to be fully analyzed, and tests
under heavy load have shown deadline violations [MHH02].
Combining the real-time requirements of embedded systems with the flexibility and high
performance of general-purpose computers is a hard target to reach for a single system.
Measuring performance on ARM processors and setting profiles for applications is a starting
point for building a scheduler that can guarantee real-time deadlines while still maintaining
high performance.
1.2 Microkernel
A microkernel is a minimal kernel, which provides only the mechanisms needed to
implement an operating system. It does not provide any services. The microkernel is the
only software running in privileged mode, and if the hardware provides multiple levels of
privilege, then the microkernel runs at the most privileged one.
The actual operating system is implemented in user-mode (unprivileged mode).
Services like device drivers, protocol stacks, file systems and user interfaces all run in
user-mode. The microkernel provides mechanisms like address space management, thread
management and inter-process communication.
This approach has several benefits. The microkernel is very small and fast. The
code is easier to maintain, and therefore more reliable. Also, the failure of a service,
which now runs in user-mode, does not corrupt the kernel. For example, if a networking
service crashes, only this service is terminated, leaving the rest of the system
functional. The Linux kernel, on the other hand, is very large, so its code is hard to
maintain and a bug in the kernel is more likely to appear. This makes the Linux kernel
poorly suited for embedded systems.
The microkernel can have servers running on top of it. Servers are basically daemon
programs, to which the kernel can grant some special privileges, like interacting with
physical memory. Device drivers can interact directly with the hardware. A basic set of
servers for a general-purpose microkernel could consist of file system, device driver and
networking servers. A crash of such a server can be corrected by simply restarting it. This
can lead to some loss of system state, but in most cases this is not a problem.
Fig 1: Monolithic Kernel - Microkernel
1.3 Virtualized Linux
Virtualization is a framework or methodology for dividing the resources of a computer
into multiple execution environments, by applying one or more concepts or technologies
such as hardware and software partitioning, time-sharing, partial or complete machine
simulation, emulation, quality of service, and many others.
A virtualized Linux does not have direct access to a computer's resources. An
abstraction layer is added on top of the resources. This layer is usually called the Virtual
Machine Monitor (VMM). There are several ways to implement virtualization. The VMM
can run directly on the hardware, without requiring any host operating system, or it can
run as a top-level application on an existing host OS.
1.4 Scheduling
Scheduling is mainly concerned with allocating CPU resources to processes. The
software entity responsible for this is called the scheduler. As there are usually more
processes requesting resources than there are resources, the resources need to be shared.
Scheduling algorithms are designed to take the following factors into consideration:
• CPU utilization - keeping the CPU as busy as possible.
• Throughput - number of processes that complete their execution per time unit.
• Turnaround - total time between submission of a process and its completion.
• Waiting time - amount of time a process has been waiting in the ready queue.
• Response time - amount of time from when a request is submitted until the
first response is produced.
• Fairness - equal CPU time for each thread.
Scheduling algorithms can be classified into:
◦ first in first out (FIFO): the simplest algorithm, in which the first process to
arrive in the ready queue is served first
◦ shortest remaining time: the scheduler orders the processes in the ready queue
according to how much execution time they have left
◦ fixed priority preemptive scheduling: processes are given priorities and the
scheduler orders them in the ready queue accordingly. If a process with a higher
priority than the current one becomes ready, the scheduler interrupts the
currently running process.
◦ round robin: every process is assigned a fixed time slice, and the scheduler cycles
through them.
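The round-robin policy from the list above can be illustrated with a small simulation. This is only a sketch: the function name `round_robin` and its array-based interface are hypothetical, and it assumes all processes are ready at time zero and never block.

```c
#include <stddef.h>

/* Simulate round-robin scheduling (hypothetical helper, not a kernel API).
 * remaining[i] is the CPU time process i still needs (consumed by the call);
 * finish[i] receives the time at which process i completes.
 * A zero slice is rejected because no process could ever make progress. */
int round_robin(unsigned remaining[], unsigned finish[], size_t n, unsigned slice)
{
    unsigned clock = 0;
    size_t left = 0;

    if (slice == 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        finish[i] = 0;
        if (remaining[i] > 0)
            left++;
    }

    while (left > 0) {
        /* One cycle through the ready queue, in arrival order. */
        for (size_t i = 0; i < n; i++) {
            if (remaining[i] == 0)
                continue;
            unsigned run = remaining[i] < slice ? remaining[i] : slice;
            clock += run;
            remaining[i] -= run;
            if (remaining[i] == 0) {
                finish[i] = clock;
                left--;
            }
        }
    }
    return 0;
}
```

With three processes needing 5, 3 and 1 time units and a slice of 2, the shortest process finishes first even though it arrived last, which is the behaviour that distinguishes round robin from FIFO.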
Scheduling in Linux since kernel version 2.5 is done with a multilevel feedback
queue algorithm, which is a combination of multiple algorithms. There are 140 priority
levels, numbered 0 to 139: 0 – 99 are real-time priorities and 100 – 139 are non-real-time.
For the real-time processes the fixed priority algorithm is applied, and processes with the
same real-time priority are scheduled round-robin. For the non-real-time ones, there are
multiple FIFO ready queues. A newly started process is placed in the top-level FIFO queue.
After a process uses up its time slice, it drops from its current FIFO queue to the one on the
next level. This continues until the process reaches its base ready queue, where a round-robin
algorithm applies. A process can also be promoted in the ready queues if it blocks for an
I/O operation. This kind of scheduling favours short jobs and I/O-intensive ones.
From version 2.6 a new scheduler was introduced, the O(1) scheduler. It
reduces the overhead of the previous one and makes scheduling decisions in constant time
(O(1)), no matter how many processes are running on the system.
From version 2.6.23 the Completely Fair Scheduler was introduced. It
replaces the run queues with a time-ordered red-black tree that builds a
timeline of future task execution. It also uses nanosecond-granularity accounting, removing
the notion of a timeslice and other heuristics.
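The idea of a timeline ordered by accumulated run time can be sketched as follows. This is a deliberately simplified illustration and not kernel code: a linear scan stands in for the red-black tree, priority weighting is ignored, and the names `struct task`, `pick_next` and `account` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task record: the scheduler only needs the accumulated
 * run time, kept with nanosecond granularity. */
struct task {
    int id;
    uint64_t vruntime_ns;
};

/* Return the index of the task that has run the least so far, i.e. the
 * leftmost entry of the timeline, or -1 if there are no tasks.
 * A real implementation keeps tasks in a red-black tree instead of scanning. */
int pick_next(const struct task *tasks, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (best < 0 || tasks[i].vruntime_ns < tasks[(size_t)best].vruntime_ns)
            best = (int)i;
    }
    return best;
}

/* Charge a task for the time it actually ran; there is no fixed timeslice. */
void account(struct task *t, uint64_t ran_ns)
{
    t->vruntime_ns += ran_ns;
}
```

After a task is charged for its run time it moves to the right on the timeline, so the task that has received the least CPU time is always chosen next.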
1.5 Wombat - Iguana model
The Wombat – Iguana model is an implementation of a system on top of a microkernel.
As the microkernel offers no services, the Iguana layer is needed for basic services:
allocating and sharing memory, memory protection and general resource management.
The Iguana layer manages address spaces in a way that reduces context-switching
overhead on processors with virtually addressed caches.
A virtual address is tied to a particular process. As different processes tend to use the
same virtual addresses for specific code/data segments, a context switch also requires a
cache flush. This can be avoided if the processes have non-overlapping address spaces,
so Iguana tries to avoid overlapping. Rather than every process having its own address
space, processes share a single one, each getting its own protection domain. A
process can access data in its virtual address space only if the data is inside its protection
domain. On 32-bit processors the 4GB address space may not be enough for all processes,
so newly created processes can have their own virtual address space.
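The protection-domain check described above can be sketched in a few lines. The types `struct region` and `struct protection_domain` and the function `pd_can_access` are hypothetical illustrations, not Iguana's actual data structures; the sketch assumes a domain is simply a list of address ranges within the shared address space.

```c
#include <stddef.h>
#include <stdint.h>

/* A contiguous range of the single shared virtual address space. */
struct region {
    uintptr_t base;
    size_t size;
};

/* Hypothetical protection domain: the set of regions a process may touch. */
struct protection_domain {
    const struct region *regions;
    size_t count;
};

/* Return 1 if addr lies inside one of the domain's regions, 0 otherwise.
 * Two processes see the same address as the same memory; only the
 * permission to access it differs. */
int pd_can_access(const struct protection_domain *pd, uintptr_t addr)
{
    for (size_t i = 0; i < pd->count; i++) {
        const struct region *r = &pd->regions[i];
        if (addr >= r->base && addr - r->base < r->size)
            return 1;
    }
    return 0;
}
```

Because the regions of different processes do not overlap, a context switch only has to change which domain is active, without invalidating cached virtual addresses.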
Fig 2: Wombat – Iguana
Wombat represents the Linux server. It runs in its own protection domain as a process.
The Compatibility Mode Linux Process runs in a different virtual address space.
2 PerfMonitor Description
2.1 Architecture
The VMX Performance Monitoring library allows applications built on top of it to obtain
performance information for selected Linux processes, process this information and send it
to a remote analysis tool.
The library has two components:
• core - used by both client and server to build channel/function hierarchies and to
talk to each other.
• client - used to build applications with support for Performance Monitoring. The client
library exposes to the user a C API that is described in detail in a later chapter. The API
provides an easy way to define what information is to be extracted, how it is
processed and when to read/send this information to the remote analysis tool.
Communication with the Performance Monitoring Server (vPmon) is abstracted. The client
has the option of using a configuration file instead of hard-wired function
calls in the application. In this case the user inserts event hooks in the places
with special meaning to the application and, at a later time, without recompiling,
configures what is to be read.
The other component is the server. It is responsible for collecting messages from
clients, interpreting them and sending them to a remote tool. It can also monitor
applications remotely.
An overview of how the library is used to analyse performance data in a system is
described in the picture below:
2.2 Use cases
There are two main use cases:
• self monitored. The application source code is modified to use the performance
monitoring library. Event signaling triggers the collection, processing and forwarding of
performance data to the Performance Monitor Server. This is the preferred method of using
the library because it allows accurate profiling. In the picture above, the self-monitored
application uses the C API to declare which performance counters are of interest and
what kind of data is to be reported for each event. Counters are read within the
application context using the PM Linux Syscall. This syscall is only a shim, as the Linux
Kernel does not store performance management information. It forwards the request to the
Micro Kernel, which reads the virtualized performance management registers and sends
the data back to the Linux Kernel and then to User Space. Because counters are read
using a Micro Kernel syscall, read atomicity is guaranteed. When the syscall returns to the
performance monitoring library, the data is processed according to the channel
hierarchy and then sent to the vPmon server via a Unix socket. The server unpacks the event
data, formats it according to the selected output method and sends it to the remote
analysis tool, vPerf.
• remote monitored. Allows performance management data to be collected without the
need to recompile the application. The downside is that events are not necessarily
correlated with the inner workings of the application, as there is no way of knowing
when it starts or ends running a specific task. This case is selected when another
application, built with the performance monitoring library, requests that the server
start monitoring a specific Linux process ID. When an event is signalled, usually by a
periodic timer, the server uses the PM Linux Syscall to retrieve counter data for the
selected process, passes it through the associated channels, formats it according to the
selected output method and sends it through Serial/UDP.
2.3 Concepts
2.3.1 Function
This object stands for a specific performance parameter: number of cycles, number
of instructions, thread execution time, etc. It has a pair of PMC/PMD registers associated
with it. The association is not fixed, since the same function can be accomplished with
different PMC/PMD register pairs. The Platform Manager object handles the mapping of
PMC/PMD registers to Function instances (object factory).
The Function object supports the following:
• read/write counter value - the Write() call fails for read-only functions like thread
execution time. The counter value is not necessarily 8 bytes wide. Size() reports how
many bytes are needed to store the value returned by Read().
• introspection - via the Type() function. Each type (cycles, instructions, time, etc.) has its
own ID. Type() together with Size() define a data type which can be processed by
Channels.
• behaviour configuration - various flags can be set which change the way the Function
behaves. For instance, cycle and instruction counters can be configured to either
continue counting or reset to zero after Read().
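The behaviour listed above can be sketched with a small C structure. Everything here is an assumption made for illustration: the names `struct function`, `function_read`, `function_write` and the two flag constants are hypothetical, and a plain 64-bit field stands in for the PMD register that the real library reads through a syscall.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical behaviour flags. */
enum {
    FN_RESET_ON_READ = 1u << 0, /* counter restarts from zero after Read() */
    FN_READ_ONLY     = 1u << 1  /* Write() is rejected, e.g. execution time */
};

/* Hypothetical Function object. */
struct function {
    int type;         /* what Type() would report: cycles, time, ... */
    size_t size;      /* what Size() would report, in bytes */
    unsigned flags;
    uint64_t counter; /* stands in for the hardware counter register */
};

/* Read(): return the current value, resetting it if configured to. */
uint64_t function_read(struct function *f)
{
    uint64_t value = f->counter;
    if (f->flags & FN_RESET_ON_READ)
        f->counter = 0;
    return value;
}

/* Write(): returns 0 on success, -1 for read-only functions. */
int function_write(struct function *f, uint64_t value)
{
    if (f->flags & FN_READ_ONLY)
        return -1;
    f->counter = value;
    return 0;
}
```

A cycle counter configured with the reset flag reports the cycles elapsed since the previous read, while a read-only function such as execution time refuses writes but can be read repeatedly.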
2.3.2 Channel
This object extends the Function interface, providing the same functionality while not
being tied to a specific performance parameter. Channels are linked to Functions or to other
Channels and process the data coming from the entities they are linked to in different ways:
accumulate, add, group, etc.
The Channel object supports the following:
• read/write - the value obtained depends on the type of channel: it can be the sum of
two Functions, it can be an accumulated value for one Function over multiple
Read()s, etc.
• introspection - via the Type() and Visit() functions. The library user can enumerate the
entities linked to the selected channel and dynamically change the hierarchy of
performance management data being read upon an event.
• dynamic structure changes - via Link() and Unlink(). Functions and Channels can be
added/removed at runtime.
2.3.3 Data Hierarchies
Here is an example of a configuration of channels and functions linked together in
two tree-like structures assigned to two events. When an event is triggered, from an
internal source or based on a timeout, channels are read. Values that are obtained by
Read(), described by Type() and Size() are sent to the Server to be processed further and
sent to the remote monitoring application.
Functions/Channels can be linked to multiple channels, and a Channel, depending on
its type, can have multiple Functions/Channels linked to it. The user must take
care not to create loops, as the Link() method does not check for them. A mechanism is in
place so that a Function is Read() only once when an Event is triggered, even if it is
reachable from the top Channel through multiple paths.
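One way to obtain the read-once behaviour is to stamp every node with the identifier of the last event that read it and to reuse the cached value on later visits. The sketch below assumes this technique; the thesis does not say how the library implements it. The names `struct node` and `node_read` are hypothetical, channels simply sum their inputs, and event identifiers are assumed to start at 1.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LINKS 4

/* Hypothetical node: a Function when source != NULL, otherwise a Channel. */
struct node {
    struct node *links[MAX_LINKS]; /* entities this Channel is linked to */
    size_t nlinks;
    const uint64_t *source;        /* stands in for the counter register */
    uint64_t cached;               /* value obtained for seen_event */
    unsigned seen_event;           /* 0 means "never read" */
    unsigned source_reads;         /* how often the source was really read */
};

/* Read a node for the given event. A node reached through several paths
 * is evaluated once; later visits return the cached value. */
uint64_t node_read(struct node *n, unsigned event_id)
{
    if (n->seen_event == event_id)
        return n->cached;
    n->seen_event = event_id;

    if (n->source != NULL) {
        n->source_reads++;
        n->cached = *n->source;
    } else {
        uint64_t sum = 0;
        for (size_t i = 0; i < n->nlinks; i++)
            sum += node_read(n->links[i], event_id);
        n->cached = sum;
    }
    return n->cached;
}
```

In a diamond-shaped hierarchy, where two Channels link to the same Function and a top Channel links to both, the Function's source is read once per event although its value contributes twice to the result.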
2.3.4 Event Loop
The performance management library is event driven. The server handles messages
received from clients and also configuration commands received from the remote
monitoring application. The server also handles periodic reading of channels for remote
monitored applications. For all this to work, the library offers a generic event loop object.
It supports the following:
• monitoring of file descriptors ready to be read()/recv()
• configurable timeouts - which are also used to offer timer functionality for periodic
channel reads
• callbacks triggered each time something happens - at every loop - useful mostly for
debugging
• configurable loop latency - a bound on how long to wait for a file descriptor to be
ready. If an every-loop callback is configured, it is guaranteed to be called
with at most loop-latency delay
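A single iteration of such a loop can be sketched with the POSIX select() call. This is a minimal illustration under stated assumptions, not the library's event loop object: the names `event_loop_step`, `loop_cb` and `count_call` are hypothetical, only one file descriptor is monitored, and the loop latency is passed directly as the select() timeout.

```c
#include <stddef.h>
#include <sys/select.h>

typedef void (*loop_cb)(void *ctx);

/* Example callback used for illustration: counts how often it is called. */
void count_call(void *ctx)
{
    ++*(int *)ctx;
}

/* Run one loop iteration: wait at most latency_ms for fd to become
 * readable, call on_readable if it is, then always call every_loop.
 * Returns 1 if fd was readable, 0 on timeout, -1 on error. */
int event_loop_step(int fd, int latency_ms,
                    loop_cb on_readable, loop_cb every_loop, void *ctx)
{
    fd_set readable;
    struct timeval tv;
    int ready;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    tv.tv_sec = latency_ms / 1000;
    tv.tv_usec = (latency_ms % 1000) * 1000;

    ready = select(fd + 1, &readable, NULL, NULL, &tv);
    if (ready > 0 && FD_ISSET(fd, &readable) && on_readable != NULL)
        on_readable(ctx);

    /* The every-loop callback runs whether or not data arrived, so it is
     * delayed by at most the loop latency. */
    if (every_loop != NULL)
        every_loop(ctx);

    return ready < 0 ? -1 : (ready > 0);
}
```

Periodic channel reads fit the same structure: a timer is checked in the every-loop callback, and the latency bounds how late it can fire.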
Event Loop functionality is presented in the figure below:
2.3.5 Resource Manager
Resources are managed at two levels: allocation of low-level performance registers is
managed by Platform objects, while availability of high-level Functions (performance
register pairs) is managed by Resource Manager objects. Functions and low-level register
allocation are not grouped together in the same class. In the case of a self-monitored thread
that is also remote-monitored, there is only one Platform instance but there are two Resource
Managers: one in the application itself and one in the Performance Monitor server (in charge
of remote monitoring).
The Resource Manager supports the following:
• initialization - based on PID and Platform instance. All Functions are instantiated and
made available.
• function reservation - the user requests that a specific function be made available.
Low-level register contention may occur because another function is using the same
PMC/PMD registers.
• function release - marks the function and the associated Platform registers as available.
• one-shot read - the specified set of Functions is read and the result is cached. This
removes the errors caused by the time delay between reading Channels.
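Reservation with register contention can be sketched as an all-or-nothing acquire on a table of register owners. The names `struct platform`, `platform_acquire` and `platform_release`, the register count and the use of small integers as function identifiers are assumptions made for this illustration only.

```c
#include <stddef.h>

#define NUM_REGS 4

/* Hypothetical Platform object: owner[i] is the id of the function that
 * holds register i, or 0 if the register is free. */
struct platform {
    int owner[NUM_REGS];
};

/* Reserve all registers a function needs, or none of them.
 * func_id must be non-zero. Returns 0 on success, -1 on contention
 * or on an invalid register index. */
int platform_acquire(struct platform *p, int func_id,
                     const int *regs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (regs[i] < 0 || regs[i] >= NUM_REGS)
            return -1;
        if (p->owner[regs[i]] != 0 && p->owner[regs[i]] != func_id)
            return -1; /* another function uses this register */
    }
    for (size_t i = 0; i < n; i++)
        p->owner[regs[i]] = func_id;
    return 0;
}

/* Release every register held by the given function. */
void platform_release(struct platform *p, int func_id)
{
    for (int i = 0; i < NUM_REGS; i++) {
        if (p->owner[i] == func_id)
            p->owner[i] = 0;
    }
}
```

Checking all registers before claiming any of them means a failed reservation leaves the table untouched, so the caller can retry after the conflicting function is released.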
Resource allocation for self-monitored threads is done using a Platform proxy. Requests
to acquire/release PMC/PMD registers are sent to the Performance Monitoring server, which
approves or rejects them.
3 Implementation details
3.1 Client - Server Protocol
Messages from client to the PerfMonitor server are of TLV type:
0______3_4______7_8_______________
| Type   | Length | Value         |
|________|________|_______________|
TLVs can be composed, the Value field can contain other TLVs. The Length field
contains the length of the Value field.
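The layout above, with a 4-byte Type, a 4-byte Length and then the Value, can be written into a buffer as sketched below. The function name `tlv_put` is hypothetical, and the sketch assumes both header fields are stored in host byte order, which the text does not specify.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write one TLV at the start of buf: bytes 0-3 hold the type, bytes 4-7
 * the length of the value, and the value follows from byte 8.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t tlv_put(uint8_t *buf, size_t cap, uint32_t type,
               const void *value, uint32_t length)
{
    size_t total = 8u + (size_t)length;

    if (cap < total)
        return 0;
    memcpy(buf, &type, 4);       /* assumption: host byte order */
    memcpy(buf + 4, &length, 4);
    if (length > 0)
        memcpy(buf + 8, value, length);
    return total;
}
```

A composed TLV is built the same way: the inner TLVs are written into a scratch buffer first, and that buffer is then passed as the value of the outer TLV.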
Messages are sent as commands and are triggered by different events:
Commands:
 PM_CMD_CLIENT_REGISTER
0________________________3_4______7_8______11
| PM_CMD_CLIENT_REGISTER   | 4      | pid     |
|__________________________|________|_________|
- registers the client with process id 'pid' with the server; after this, the client can
send other messages
 PM_CMD_CLIENT_UNREGISTER
0________________________3_4______7_8______11
| PM_CMD_CLIENT_UNREGISTER | 4      | pid     |
|__________________________|________|_________|
- unregisters the client with process id 'pid', so that the server can clear the context
 PM_CMD_CLIENT_ECHO
0________________________3_4______7
| PM_CMD_CLIENT_ECHO       | 0      |
|__________________________|________|
- the client can send an echo and get a reply, to check whether vPmon is still active
 PM_CMD_CLIENT_PLATFORM
0________________________3_4______7_8_____11_12____15_16_________
| PM_CMD_CLIENT_PLATFORM   | length | pid    | type   | registers |
|__________________________|________|________|________|___________|
- 'pid' is the pid of the client making the request
- 'type' can be: ACQUIRE or RELEASE
- 'registers' are pairs of uint_32 representing the registers to acquire or release
- the client sends these messages when trying to reserve/release hardware
functions. The client will receive a reply:
0________________________3_4______7_8_____11
| PM_CMD_CLIENT_PLATFORM   | 4      | result |
|__________________________|________|________|
- 'result' is the result of the client's request. It contains success or the error type.
 PM_CMD_CLIENT_MONITOR_PERIODICALLY
0__________________________________3_4______7_8_____11_12____15_16____19_20____23
| PM_CMD_CLIENT_MONITOR_PERIODICALLY | 16     | pid    | type   | 4      | period |
|____________________________________|________|________|________|________|________|
- a client sends this request in order for the server to start or stop monitoring a
remote process
- 'pid' is the pid of the process to start or stop monitoring.
- 'type' is the event type. It can be ET_START_THREAD to start monitoring or
ET_END_THREAD to stop monitoring.
- 'period' is the period at which data is collected from the monitored process.
 PM_CMD_CLIENT_EVENT
0________________________3_4______7_8_____11_12_____________
| PM_CMD_CLIENT_EVENT      | length | pid    | Events
|__________________________|________|________|______________
- a command can contain multiple client events
 PM_EVENT
0______________3_4______7_8_____11_12_____________
| PM_EVENT       | length | tag    | Channels
|________________|________|________|______________
- 'tag' contains the event type. It can be one of the following:
   ET_START_SYSTEM_RUN
   ET_END_SYSTEM_RUN
   ET_START_APP
   ET_END_APP
   ET_START_THREAD
   ET_END_THREAD
   ET_START_THREAD_PERIOD
   ET_END_THREAD_PERIOD
   ET_END_SESSION
   ET_POWER
   ET_SYSINT_SY
   ET_SYSINT_IL
   ET_SYSINT_SK
   ET_SYSINT_SI
Channels apply different transformations to the Functions and Channels they are linked
to. The TLVs returned by the channels can be simple or composed.
The composed TLVs have this structure:
0________________3_4______7_8______
| PM_FUNC_COMPOSED | length | TLVs
|__________________|________|______
- the 'TLVs' field can be empty or contain other TLVs.
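A receiver can walk the 'TLVs' field of a composed TLV by repeatedly reading an 8-byte header and skipping the value it announces. The sketch below counts the TLVs in such a field. The function name `tlv_count` is hypothetical, and host byte order for the Length field is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the TLVs packed back to back in buf[0..len), i.e. the content of
 * the 'TLVs' field of a composed TLV. Returns -1 if the data is malformed
 * (truncated header, or a length that runs past the end of the field). */
int tlv_count(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        uint32_t length;

        if (len - off < 8)
            return -1;                     /* no room for Type + Length */
        memcpy(&length, buf + off + 4, 4); /* assumption: host byte order */
        if (length > len - off - 8)
            return -1;                     /* value would overrun the field */
        off += 8 + (size_t)length;
        count++;
    }
    return count;
}
```

An empty field yields a count of zero, which matches a composed TLV whose 'TLVs' field is empty.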
Channels that give composed TLVs are:
   PM_CHAN_PASS_THROUGH_MULTIPLE - it concatenates in the 'TLVs' field the TLVs
   returned by the Functions and Channels it is linked to.
   If no Function or Channel is linked, the 'TLVs' field remains empty.
   PM_CHAN_NULL - it returns an empty 'TLVs' field. It reads the linked
   Functions/Channels but does not store the results.
   PM_CHAN_REVERSE - it reads the linked Functions/Channels in the reverse of the
   order in which their TLVs are placed in the 'TLVs' field.
The remaining Channels give simple TLVs:
   PM_CHAN_PASS_THROUGH - it can be linked to only 1 Function/Channel and returns
   exactly the TLV it reads.
   PM_CHAN_MICRO2MILI - it can be linked to only 1 Function/Channel and, if the TLV
   it reads is one of the uint_64 types, it divides the value by 100.
   PM_CHAN_AGGREGATE - it can be linked to multiple Functions/Channels. If they are
   all of the same uint_64 type, it adds up the data of all the TLVs it reads and returns
   one TLV with the aggregated data.
The TLVs returned by hw Functions:
0______3_4______7_8________
| type   | length | data   |
|________|________|________|
- the 'type' field can be:
   PM_FUNC_VOLTAGE = 1,
   PM_FUNC_CURRENT,
   PM_FUNC_POWER,
   PM_FUNC_INSTR,
   PM_FUNC_CYCLES,
   PM_FUNC_FREQ,
   PM_FUNC_TIME_STAMP,
   PM_FUNC_DUMMY,
The TLVs returned by user Functions:
0____________3_4______7_8_____12_13_______
| PM_USER_FUNC | length | tag    | data   |
|______________|________|________|________|
- the 'tag' field can be:
   PM_FUNC_ALGORITHM
   PM_FUNC_TEXT
   PM_FUNC_PPID
   PM_FUNC_INSTANCE_NO
   PM_FUNC_TIME_SCHEDULED
   PM_FUNC_TIME_DEADLINE
   PM_FUNC_TIME_EXECUTION
   PM_FUNC_REAL_TIME
   PM_FUNC_THREAD_TYPE
3.2 Client Library API
Functions in this API return values from this enum:
enum{
PM_RESULT_OK = 0,
PM_ERR_FAIL,
PM_ERR_INVALID_PARAM,
PM_ERR_OUT_OF_BOUNDS,
PM_ERR_CONNECTED,
PM_ERR_NOT_CONNECTED,
PM_ERR_MISMATCH,
PM_ERR_UNAVAILABLE,
PM_ERR_UNKNOWN,
PM_ERR_NO_MEMORY,
PM_ERR_BIND,
PM_ERR_TIMEOUT
}
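When instrumenting an application it helps to print these codes by name. The helper below is a hypothetical addition, not part of the documented API: the function `pm_result_name` is invented for illustration, and the sketch assumes the enum above is what the `PM_RESULT` return type refers to.

```c
#include <stddef.h>

/* Assumption: the result enum is the type named PM_RESULT in the API. */
typedef enum {
    PM_RESULT_OK = 0,
    PM_ERR_FAIL,
    PM_ERR_INVALID_PARAM,
    PM_ERR_OUT_OF_BOUNDS,
    PM_ERR_CONNECTED,
    PM_ERR_NOT_CONNECTED,
    PM_ERR_MISMATCH,
    PM_ERR_UNAVAILABLE,
    PM_ERR_UNKNOWN,
    PM_ERR_NO_MEMORY,
    PM_ERR_BIND,
    PM_ERR_TIMEOUT
} PM_RESULT;

/* Hypothetical helper: map a result code to its name for log messages.
 * The table order must match the enum order above. */
const char *pm_result_name(PM_RESULT r)
{
    static const char *const names[] = {
        "PM_RESULT_OK", "PM_ERR_FAIL", "PM_ERR_INVALID_PARAM",
        "PM_ERR_OUT_OF_BOUNDS", "PM_ERR_CONNECTED", "PM_ERR_NOT_CONNECTED",
        "PM_ERR_MISMATCH", "PM_ERR_UNAVAILABLE", "PM_ERR_UNKNOWN",
        "PM_ERR_NO_MEMORY", "PM_ERR_BIND", "PM_ERR_TIMEOUT"
    };
    size_t count = sizeof names / sizeof names[0];

    if ((int)r < 0 || (size_t)r >= count)
        return "unknown PM_RESULT";
    return names[r];
}
```

Since PM_RESULT_OK is zero, callers can test for success with a plain comparison and use the helper only on the error path.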
PM_RESULT pmon_open( PM_HANDLE *out_HLibrary, uint32_t in_Flags,
                     const char* in_AppName, int in_AppID,
                     const char* in_CfgFile,
                     ...);
• Main PerfMonitor library function.
• Offers the performance monitoring capabilities to any application that uses it.
• Opens a handler to the PerfMonitor library that will be used for any further
performance monitoring activities.
• out_HLibrary = reference to the handler that this function will open.
• in_Flags = configuration flags related to the internal capabilities of the vPmon
handler, e.g. buffered or non-buffered events.
• in_CfgFile = if present (!= NULL), the management of resources in terms of channels,
events and functions can be done 'automatically' by using a configuration file in
which all the relationships between functions, events and channels are described. If
NULL, all the associations between functions, channels and events have to be
made manually.
• in_AppName = the name of the desired data set inside the configuration file, e.g.
decoder, display.
• in_AppID = the identifier of the data set inside the configuration file, e.g. 1, 2 and so
on.
PM_RESULT pmon_close( PM_HANDLE *in_HLibrary );
• Closes the handler previously opened using pmon_open().
• After this call, the vPmon facilities are no longer available to the application.
• in_HLibrary = a valid handler (a vPmon main resource previously opened using
pmon_open()) has to be passed to this function.
PM_RESULT pmon_channel_open( PM_HANDLE in_HLibrary,
                             int in_ChannelType,
                             PM_HANDLE *out_HChannel );
• Opens a channel.
• A channel must be open in order for the application to be able to perform resource
monitoring. This entity must then be linked with some other vPmon entities - the
functions. Finally, a channel is instructed to do the measurements using the third
kind of vPmon entity - the events.
• in_HLibrary = a valid handler, previously opened using pmon_open().
• in_ChannelType = designates the channel's type.
The available channel types are:
   ◦ PM_CHAN_PASS_THROUGH
   ◦ PM_CHAN_PASS_THROUGH_MULTIPLE
   ◦ PM_CHAN_MICRO2MILI
   ◦ PM_CHAN_AGGREGATE
   ◦ PM_CHAN_NULL
   ◦ PM_CHAN_REVERSE
• out_HChannel = reference to the open channel.
PM_RESULT pmon_channel_close( PM_HANDLE in_HLibrary,
                              PM_HANDLE *in_HChannel );
• Closes a channel.
• in_HLibrary = a valid handler, previously opened using pmon_open().
• in_HChannel = reference to the channel to be closed.
PM_RESULT pmon_channel_link( PM_HANDLE in_HLibrary,
                             PM_HANDLE in_HChannel,
                             PM_HANDLE in_HFunction );
• Links a function to a channel.
• This gives the channel the possibility to get information from the
platform's reserved performance management registers and pass it to the application
level.
• in_HLibrary = a valid handler, previously opened using pmon_open().
• in_HChannel = reference to an open channel.
• in_HFunction = reference to a function. The functions inside vPmon offer the
functionality and flexibility the application needs, e.g. measuring the real time at
which some event happens, identifying the number of cycles and/or instructions that
some operation takes, reporting the CPU's voltage and/or current at a given
moment and so on.
PM_RESULT pmon_channel_unlink( PM_HANDLE in_HLibrary,
                               PM_HANDLE in_HChannel,
                               PM_HANDLE in_HFunction );
• Unlinks (detaches) a function from a channel. The channel can no longer offer
the information related to the unlinked function.
• in_HLibrary = a valid handler, previously opened using pmon_open().
• in_HChannel = reference to an open channel.
• in_HFunction = reference to a function.
PM_RESULT pmon_function_open( PM_HANDLE in_HLibrary,
                              PM_HANDLE *out_HFunction,
                              const char* in_FunctionType, ... );
• Opens a function.
• The function itself is the entity that offers actual information to the application that
uses the PerfMonitor library. Depending on its type, it performs the desired
measurement and offers the information to the user level.
• in_HLibrary = a valid handler, previously opened using pmon_open().
• in_FunctionType = the function type selects the functionality offered to the
application that uses vPmon.
The available function types are:
~> Application related measurement functions:
   ◦ PM_FUNC_VOLTAGE
   ◦ PM_FUNC_CURRENT
   ◦ PM_FUNC_POWER
   ◦ PM_FUNC_INSTR
   ◦ PM_FUNC_CYCLES
   ◦ PM_FUNC_FREQ
   ◦ PM_FUNC_TIME_STAMP
   ◦ PM_FUNC_DUMMY
PM_RESULT pmon_function_close( PM_HANDLE in_HLibrary,
                               PM_HANDLE *in_HFunction );
• Closes a previously opened function.
• in_HLibrary = a valid handler, previously opened using pmon_open().
• in_HFunction = reference to a valid function, previously opened with
pmon_function_open().
PM_RESULT pmon_function_enable( PM_HANDLE in_HLibrary,
                                PM_HANDLE in_HFunction );
• Enables a function.
• Once a function is open, it has to be enabled before it can be used with a channel.
• in_HLibrary = a valid handler, previously opened using pmon_open().
• in_HFunction = reference to a valid function, previously opened with
pmon_function_open().
PM_RESULT pmon_function_disable( PM_HANDLE in_HLibrary,
PM_HANDLE in_HFunction );




Disables a function.
A function must be disabled before it is closed. Closing is the final operation on a
function.
in_HLibrary = a valid handle, previously opened using pmon_open().
in_HFunction = reference to a valid function, previously opened with
pmon_function_open().
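The ordering rules for functions (open, enable, use, disable, close) can be illustrated with a small state model. This is only a sketch: the type and function names below (fn_state, model_function, model_open and so on) are hypothetical and do not belong to PerfMonitor, and treating an opened but never enabled function as closable is an assumption.

```c
/* Hypothetical stand-in that models only the call ordering described above.
   It is not PerfMonitor's implementation. */
typedef enum { FN_CLOSED, FN_OPEN, FN_ENABLED } fn_state;

typedef struct { fn_state state; } model_function;

/* Each step returns 0 on success and -1 when called out of order. */
int model_open(model_function *f) {
    if (f->state != FN_CLOSED) return -1;
    f->state = FN_OPEN;
    return 0;
}

int model_enable(model_function *f) {
    if (f->state != FN_OPEN) return -1;
    f->state = FN_ENABLED;
    return 0;
}

int model_disable(model_function *f) {
    if (f->state != FN_ENABLED) return -1;
    f->state = FN_OPEN;
    return 0;
}

int model_close(model_function *f) {
    if (f->state != FN_OPEN) return -1;   /* an enabled function must be disabled first */
    f->state = FN_CLOSED;
    return 0;
}

/* Runs the full sequence; returns 0 if every step was accepted. */
int model_full_cycle(void) {
    model_function f = { FN_CLOSED };
    if (model_open(&f) != 0) return -1;
    if (model_enable(&f) != 0) return -1;
    if (model_disable(&f) != 0) return -1;
    return model_close(&f);
}
```

In this model, calling model_close() on an enabled function is rejected, which mirrors the rule that a function has to be disabled before it is closed.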
PM_RESULT pmon_event_create( PM_HANDLE in_HLibrary,
int in_EventTag,
PM_HANDLE *out_HEvent );


Creates an event.
The event is the PerfMonitor entity that triggers an actual measurement. It is linked to
a channel which, in turn, is linked to the entities that perform the measurement, the
functions. When an event occurs, its channel is activated and so are the functions
linked to that channel; the desired measurements then become available.
An event can be inserted at any place in the code for which the programmer wants
data. Usually, events are named after the actions they correspond to, e.g. start thread,
start period, end application and so on.
in_HLibrary = a valid handle, previously opened using pmon_open().
in_EventTag = the event's unique name. Suggestive names should be chosen, so that the
instrumented code is easy to read.
out_HEvent = reference to the newly created event.
Currently, the supported event tags are:
ET_START_SYSTEM_RUN
ET_END_SYSTEM_RUN
ET_START_APP
ET_END_APP
ET_START_THREAD
ET_END_THREAD
ET_START_THREAD_PERIOD
ET_END_THREAD_PERIOD
ET_END_SESSION
ET_POWER
ET_SYSINT_SY
ET_SYSINT_IL
ET_SYSINT_SK
ET_SYSINT_SI
PM_RESULT pmon_event_delete( PM_HANDLE in_HLibrary,
PM_HANDLE *in_HEvent );




Deletes an event.
This is the last operation to be performed on an event.
in_HLibrary = a valid handle, previously opened using pmon_open().
in_HEvent = valid reference to a previously created event that is about to be
deleted.
PM_RESULT pmon_event_target( PM_HANDLE in_HLibrary,
PM_HANDLE in_HEvent,
PM_RESULT (*in_Callback)(TLVEvent*, void*),
void *in_Context );




When an event is signaled, this function lets the programmer handle the data
measured by the hardware counters in the application's own code.
in_Callback (a function pointer) and in_Context (a context) are used together to call
the function that is passed as argument.
in_HLibrary = a valid handle, previously opened using pmon_open().
in_HEvent = a valid event handle, previously created with pmon_event_create().
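The callback-plus-context pattern used by pmon_event_target() can be sketched in isolation. The layout of TLVEvent and the values of PM_RESULT are not given in this section, so the sketch uses hypothetical stand-ins (demo_event, demo_result, demo_target); it shows only how a saved context pointer is handed back to the registered callback, and passing the context as the second argument is an assumption based on the callback's signature.

```c
#include <stddef.h>

/* Hypothetical stand-ins: the real TLVEvent layout and PM_RESULT values
   are not described here. */
typedef int demo_result;
typedef struct { int tag; unsigned long value; } demo_event;

typedef demo_result (*demo_callback)(demo_event *, void *);

/* Per-event registration: the callback plus the opaque context pointer. */
typedef struct { demo_callback cb; void *ctx; } demo_target;

/* Application-side state that the callback updates through the context. */
typedef struct { unsigned long total; unsigned count; } demo_stats;

/* Application-supplied callback: accumulates the measured values. */
demo_result demo_collect(demo_event *ev, void *ctx) {
    demo_stats *s = (demo_stats *)ctx;
    s->total += ev->value;
    s->count += 1;
    return 0;
}

/* What a library could do on a signal: pass the data and the saved
   context back to the registered callback. */
demo_result demo_dispatch(const demo_target *t, demo_event *ev) {
    if (t->cb == NULL) return -1;
    return t->cb(ev, t->ctx);
}
```

The context pointer lets the same callback serve several events while keeping separate state for each, without resorting to global variables.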
PM_RESULT pmon_event_link( PM_HANDLE in_HLibrary,
PM_HANDLE in_HEvent,
PM_HANDLE in_HChannel );





Links a channel to an event.
Once a channel is attached to an event, the channel is triggered whenever that event is
signaled (using pmon_event_signal()). The functions linked to that channel then
perform their measurements and the data becomes available for processing.
in_HLibrary = a valid handle, previously opened using pmon_open().
in_HEvent = reference to a valid event handle, previously created with
pmon_event_create().
in_HChannel = reference to a valid channel, previously created with
pmon_channel_create().
PM_RESULT pmon_event_unlink( PM_HANDLE in_HLibrary,
PM_HANDLE in_HEvent,
PM_HANDLE in_HChannel );





Unlinks a channel from an event.
After this call, the channel still has its functions attached, but it can no longer
be triggered by the event.
in_HLibrary = a valid handle, previously opened using pmon_open().
in_HEvent = reference to a valid event handle, previously created with
pmon_event_create().
in_HChannel = reference to a valid channel, previously created with
pmon_channel_create().
PM_RESULT pmon_event_signal( PM_HANDLE in_HLibrary,
PM_HANDLE in_HEvent );




Signals an event.
The programmer calls this function at the point in the code that has to be
instrumented (measured using vPmon). The call triggers the channel linked to the
event, so the data from the functions linked to that channel becomes available at
that point in the code.
in_HLibrary = a valid handle, previously opened using pmon_open().
in_HEvent = reference to a valid event handle, previously created with
pmon_event_create().
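The relationship between events, channels and functions described in the preceding entries can be modelled in a few lines. The sketch below is an illustration under assumptions, not PerfMonitor code: every name in it (model_channel, model_event, read_fn, fake_cycles, fake_instr) is hypothetical, and the fixed MAX_LINKS capacity is arbitrary.

```c
#include <stddef.h>

#define MAX_LINKS 4   /* arbitrary capacity chosen for the sketch */

/* Hypothetical model types; names and layout are illustrative only. */
typedef unsigned long (*read_fn)(void);

typedef struct {
    read_fn functions[MAX_LINKS];     /* functions linked to the channel */
    size_t n_functions;
    unsigned long last[MAX_LINKS];    /* values captured at the last trigger */
} model_channel;

typedef struct {
    model_channel *channels[MAX_LINKS];   /* channels linked to the event */
    size_t n_channels;
} model_event;

int model_channel_link(model_channel *c, read_fn f) {
    if (c->n_functions == MAX_LINKS) return -1;
    c->functions[c->n_functions++] = f;
    return 0;
}

int model_event_link(model_event *e, model_channel *c) {
    if (e->n_channels == MAX_LINKS) return -1;
    e->channels[e->n_channels++] = c;
    return 0;
}

/* Detaches a channel; the channel's own function links are left untouched. */
int model_event_unlink(model_event *e, model_channel *c) {
    for (size_t i = 0; i < e->n_channels; i++) {
        if (e->channels[i] == c) {
            e->channels[i] = e->channels[--e->n_channels];
            return 0;
        }
    }
    return -1;
}

/* Signal: triggers every linked channel, which reads each linked function.
   Returns the number of values captured. */
size_t model_event_signal(model_event *e) {
    size_t captured = 0;
    for (size_t i = 0; i < e->n_channels; i++) {
        model_channel *c = e->channels[i];
        for (size_t j = 0; j < c->n_functions; j++) {
            c->last[j] = c->functions[j]();
            captured++;
        }
    }
    return captured;
}

/* Two fake measurement sources standing in for real counters. */
unsigned long fake_cycles(void) { return 1200; }
unsigned long fake_instr(void)  { return 800; }
```

After model_event_unlink(), signaling the event captures nothing, while the channel keeps its two function links, as stated for pmon_event_unlink().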
PM_RESULT pmon_remote_monitor_start( PM_HANDLE in_HLibrary,
int in_MonitorPid,
int in_Period);




Besides the basic facility of a thread's self-monitoring, PerfMonitor can also monitor
a remote process.
This function identifies the remote process by its process id and starts querying
that process periodically, once every in_Period.
in_HLibrary = a valid handle, previously opened using pmon_open().
in_MonitorPid = the remote process's id.
in_Period = the interval between two consecutive queries.
PM_RESULT pmon_remote_monitor_stop( PM_HANDLE in_HLibrary,
int in_MonitorPid);



Stops the remote monitoring activity previously started by
pmon_remote_monitor_start().
in_HLibrary = a valid handle, previously opened using pmon_open().
in_MonitorPid = the id of the remotely monitored process.
PM_RESULT pmon_echo( PM_HANDLE in_HLibrary );



Replies to a message from a monitored thread/process.
It uses a Linux socket connection to send back a message at regular intervals. It can be
used to check Pmon's functionality/availability.
in_HLibrary = a valid handle, previously opened using pmon_open().
PM_RESULT pmon_flush( PM_HANDLE in_HLibrary );



Flushes all the messages buffered in the communication channel of the monitored
thread up to the moment of the call.
It uses Linux socket connections.
in_HLibrary = a valid handle, previously opened using pmon_open().
3.3 Configuration Files
PerfMonitor's client features are also available from a configuration file. Events,
channels and functions can be declared and linked in the configuration file. In the
application's code the events still need to be triggered at the right time. The benefits of
using the configuration file are that its syntax is simpler than writing code, so there is less
to write, and that a change in the configuration file does not require recompiling the
application's sources.
The parsing is done with the Flex and Bison tools, and the grammar is presented
below:
<app_name> <id> {
<instructions>
}
instructions:
Function <func_name> hw <func_type> <flag>;
Function <func_name> user <func_type> <user_function>;
Function <func_name> const uint32 <func_type> <number>;
Function <func_name> const string <func_type> <string>;
Channel <chan_name> <chan_type>;
Event <event_name> <event_type>;
<ent> = {<ent1>, <ent2>, ...};
app_name, func_name, user_function, string, chan_name, event_name, ent, ent1, ent2: [a-zA-Z_][a-zA-Z_0-9]*
number, id: [0-9]+
func_type:
//hw functions
FUNC_INSTR // (I) instructions
FUNC_CYCLES // (Y) cycles
FUNC_TIME_STAMP // (TE) time spent on processor by the thread
//user functions
FUNC_ALGORITHM // (A) algorithm
FUNC_TEXT // (X) text
FUNC_PPID // (X) parent pid
FUNC_INSTANCE_NO // (N) frame number
FUNC_TIME_SCHEDULED // (TS)
FUNC_TIME_DEADLINE // (TD)
FUNC_TIME_EXECUTION // (TE) not used
FUNC_REAL_TIME // (R)
FUNC_THREAD_TYPE //(TT)
flag:
FLAG_NONE
FLAG_ENABLED // flag for enabling the function
FLAG_RESET_AFTER_READ // flag for resetting the counter value after each read
chan_type:
CHAN_PASS_THROUGH // simple channel, can be linked to 1 function, does not modify the function's output
CHAN_PASS_THROUGH_MULTIPLE // can be linked to multiple functions, does not modify any function's output
CHAN_MICRO2MILI // can be linked to 1 function, divides its result by 1000
CHAN_AGGREGATE // can be linked to multiple functions, adds the results of the linked functions
CHAN_NULL // can be linked to multiple functions, gives no output, used for resetting the counters
CHAN_REVERSE // can be linked to multiple functions, reads the linked functions in reverse order
event_type:
START_SYSTEM_RUN
END_SYSTEM_RUN
START_APP
END_APP
START_THREAD
END_THREAD
START_THREAD_PERIOD
END_THREAD_PERIOD
END_SESSION
POWER
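The effect of each chan_type, as described by the comments in the grammar above, can be expressed as a small function over the values read from the linked functions. This is a sketch under assumptions: the enum chan_kind and the function channel_apply() are hypothetical names, the use of integer division for CHAN_MICRO2MILI is assumed, and rejecting a wrong function count for single-function channels is a choice made for the sketch.

```c
#include <stddef.h>

/* Hypothetical enum mirroring the chan_type names from the grammar. */
typedef enum {
    CH_PASS_THROUGH, CH_PASS_THROUGH_MULTIPLE, CH_MICRO2MILI,
    CH_AGGREGATE, CH_NULL, CH_REVERSE
} chan_kind;

/* Applies the channel rule to the n values read from the linked functions.
   Writes the results to out (capacity of at least n values) and returns how
   many were written, or -1 if a single-function channel gets n != 1. */
int channel_apply(chan_kind kind, const unsigned long *in, size_t n,
                  unsigned long *out)
{
    size_t i;
    switch (kind) {
    case CH_PASS_THROUGH:
        if (n != 1) return -1;
        out[0] = in[0];
        return 1;
    case CH_MICRO2MILI:
        if (n != 1) return -1;
        out[0] = in[0] / 1000;   /* integer division is an assumption */
        return 1;
    case CH_PASS_THROUGH_MULTIPLE:
        for (i = 0; i < n; i++) out[i] = in[i];
        return (int)n;
    case CH_AGGREGATE:
        if (n == 0) return 0;
        out[0] = 0;
        for (i = 0; i < n; i++) out[0] += in[i];
        return 1;
    case CH_REVERSE:
        for (i = 0; i < n; i++) out[i] = in[n - 1 - i];
        return (int)n;
    case CH_NULL:
        return 0;                /* reads happen elsewhere; nothing is output */
    }
    return -1;
}
```

For example, an aggregate channel linked to three functions yields a single value, the sum of the three readings, while a reverse channel yields the three readings in the opposite order.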
The application registers itself with app_name and id. When parsing the file it will run
the instructions under the block identified by the app_name and id taken together (as a pair
of identification values).
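Selecting a block by its (app_name, id) pair can be illustrated with a hand-written check of a block header line. The real parser is generated with Flex and Bison; the function header_matches() below is a hypothetical helper written only for this illustration, and it assumes the default "C" locale so that isalpha() and isalnum() match exactly the ASCII ranges used in the grammar.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if line is a block header "<app_name> <id> {" whose pair
   matches (app_name, id), and 0 otherwise. Hypothetical helper, not part
   of the generated parser. */
int header_matches(const char *line, const char *app_name, unsigned long id)
{
    const char *p = line;
    const char *start;
    size_t len;
    char *end;
    unsigned long parsed;

    while (isspace((unsigned char)*p)) p++;

    /* app_name: [a-zA-Z_][a-zA-Z_0-9]* */
    start = p;
    if (!(isalpha((unsigned char)*p) || *p == '_')) return 0;
    while (isalnum((unsigned char)*p) || *p == '_') p++;
    len = (size_t)(p - start);
    if (len != strlen(app_name) || strncmp(start, app_name, len) != 0) return 0;

    /* at least one blank, then id: [0-9]+ */
    if (!isspace((unsigned char)*p)) return 0;
    while (isspace((unsigned char)*p)) p++;
    if (!isdigit((unsigned char)*p)) return 0;
    parsed = strtoul(p, &end, 10);
    if (parsed != id) return 0;
    p = end;

    /* optional blanks, then the opening brace */
    while (isspace((unsigned char)*p)) p++;
    return *p == '{';
}
```

Both parts of the pair have to match: a header with the same app_name but a different id is rejected, and so is a header with the same id but a different app_name.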
The configuration file also accepts C-style comments (/* comment ... */).
Linking can be done between:
- functions and channels;
- channels and channels;
- channels and events.
So ent can be an event_name or a chan_name; ent1, ent2 can each be a chan_name
or a func_name.
The user_function is defined inside the application and is identified by the
user_function string.
Short example:
Here the block is named test 1: the app_name (test) and the id (1) together form
the identifier pair.
test 1{
Function F_I hw FUNC_INSTR FLAG_RESET_AFTER_READ;
Function F_TT const uint32 FUNC_THREAD_TYPE 1;
Channel C_MP CHAN_PASS_THROUGH_MULTIPLE;
Event ev START_APP;
C_MP = {F_I, F_TT};
ev = {C_MP};
}