
MASARYKOVA UNIVERZITA
Proceedings of the
10th Summer School of Applied Informatics
Bedřichov, September 6–8, 2013
Editors: Jiří Hřebíček, Jan Ministr, Tomáš Pitner
Technical editors: Jiří Kalina, Radka Novotná
Expert guarantors of the summer school:
Prof. RNDr. Jiří Hřebíček, CSc., MU, IBA
Doc. RNDr. Tomáš Pitner, Ph.D., MU, FI
Editorial and programme board:
Doc. RNDr. Ladislav Dušek, Dr. – Masarykova univerzita, Brno, CZ
Doc. Ing. Josef Fiala, CSc. – Vysoká škola podnikání, Karviná, CZ
Doc. Ing. Alena Kocmanová, CSc. – Vysoké učení technické v Brně, CZ
RNDr. Jaroslav Ráček, Ph.D. – IBA CZ, s.r.o., CZ
Prof. Andrea Rizzoli – University of Applied Sciences and Arts of Southern Switzerland, Lugano, CH
Dr. Gerald Schimak – Austrian Institute of Technology, Vienna, AT
Doc. Ing. Jan Žižka, CSc. – Mendelova univerzita v Brně, CZ
© 2013 Nakladatelství Littera
ISBN 978-80-210-6058-6
Foreword
The 10th Summer School of Applied Informatics continued the series of summer schools of applied (environmental) informatics held in Bedřichov since 2002, with the exception of 2004, when the summer school took place in Šubířov, and 2005, when the main event of Masaryk University (MU) in the field of environmental informatics was the 19th international conference Informatics for Environmental Protection (EnviroInfo 2005), with the central theme Networking Environmental Information, hosted by Masaryk University on September 7–9, 2005 at the Voroněž hotel in Brno. From 2006 to 2012 the summer schools were again held in Bedřichov, at the U Alexů guesthouse.
This year, the 10th Summer School of Applied Informatics took place on September 6–8, 2013, again in Bedřichov, at the Artis guesthouse (formerly U Alexů). A theme running through all the lectures of the 10th summer school was that applications of modern information and communication technologies for the environment (that is, environmental informatics), both in the Czech Republic (CR) and internationally in the European Union (EU) and worldwide, focus on supporting eEnvironment, the Single Information Space in Environment for Europe (SISE) and the Shared Environmental Information System (SEIS). These support the implementation of the new EU and CR information society policy introduced by the "Digital Agenda for Europe" within the European Commission's "eEUROPE 2020" vision. This concerns above all cross-border information services within eEnvironment: the sharing of monitored and processed data and information on the atmosphere, surface and ground water, waste, soil, biodiversity, etc., by means of Global Monitoring for Environment and Security (GMES). These information services enable more effective and accurate monitoring of the current state of the environment and sustainable development in Europe, as well as the modelling and simulation of its further evolution.
We consider the main contribution of the 10th summer school to be that this year's edition brought together, and the proceedings represent, doctoral students and teachers not only from Masaryk University (Institute of Biostatistics and Analyses, Faculty of Science and Faculty of Informatics), but also from Mendel University in Brno (Faculty of Business and Economics), Brno University of Technology (Faculty of Business and Management) and VŠB – Technical University of Ostrava (Faculty of Economics). Their contributions to the proceedings have helped the summer school become a broad interdisciplinary expert forum within the Czech Republic. It is also important that several papers co-authored by colleagues from abroad are published in English, which underlines the international significance of the proceedings.
The discussions at the summer school centred above all on a detailed discussion of projects developing the GENASIS information system, but also covered further topics in the areas of eEnvironment, eGovernment and eParticipation, as well as the presentation of the newly launched project supported by the Czech Science Foundation entitled "Construction of Methods for Multifactor Assessment of Company Complex Performance in Selected Sectors", registration number P403/11/2085.
Brno, November 2013
Jiří Hřebíček
Jan Ministr
Tomáš Pitner
Editors
Contents
Cross Platform Mobile Development (Jakub Dubrovský) ..... 1
Business Intelligence in Environmental Reporting Powered by XBRL (M. Hodinka, M. Štencl, J. Hřebíček, O. Trenz) ..... 15
Sustainability indicators for measuring SME performance and ICT tools (Edward Kasem, Jiří Hřebíček) ..... 26
Curriculum planning and construction: How to combine outcome-based approach with process-oriented paradigm (Martin Komenda, Lucie Pekárková) ..... 39
Information Technology Tools for Current Corporate Performance Assessment and Reporting (Ondřej Popelka, Michal Hodinka, Jiří Hřebíček, Oldřich Trenz) ..... 47
Generování japonsko-českého slovníku [Generating a Japanese-Czech Dictionary] (Jonáš Ševčík) ..... 69
Availability and scalability of web services hosted in Windows Azure (David Gešvindr) ..... 73
Support for Partner Relationship Management Internet Data Sources in Small Firms (Jan Ministr) ..... 84
Cross Platform Mobile Development
Jakub Dubrovský
Masaryk University, Faculty of Informatics
Botanická 68a, 602 00 Brno
jkdubr@me.com
Abstract
Mobile devices are fast becoming ubiquitous in today's society. New devices are constantly being released with unique combinations of hardware and software, or platforms. In order to support the ever-increasing number of platforms, developers must embrace some method of cross-platform development in which the design and implementation of applications for different platforms may be streamlined. To address this, cross-platform tools have been investigated and were found to address device functionality, platform coverage and device coverage problems. It was also found that cross-platform development is cheaper and takes less time.
Keywords
Mobile development, Android, iOS, Windows Phone, Phonegap, Cordova
1 Introduction
Today's mobile applications are in many respects, when cleverly designed, capable of punching above their weight and competing with traditional stationary computers. Applications can be developed using a wide variety of tools and methodologies and the traditional application development life cycle. The number of tools that promise a trouble-free, squeaky-clean UI, a lightning-fast development period and a maximally interoperable software product can only be described as staggering. The mobile application development scene has split into at least two categories: native and cross-platform development.
The most straightforward method of developing an application is to build it natively, with native thinking, a native target and deliberately native technology. Besides running faster, most of a native platform's ailments have usually been fully explored, its idiosyncrasies are well documented, and professional support, though expensive, is often relatively reliable.
One of the major roadblocks to native development is the number and variety of platforms and mobile devices on the market, which leaves mobile application development with fragmented development arrangements and costly software development strategies [1].
The potential benefits of write-once-run-everywhere come from cost reduction, in having only one code base to write and maintain, and time reduction, in being able to write one code base and target multiple devices and platforms, which makes researching cross-platform mobile application development worth the effort.
The main objectives and research questions are:
● Is native development faster than cross-platform development?
● What is the benefit of using cross-platform SDKs to develop cross-platform applications?
● Evaluate cross-platform tools in terms of the time, technology, maturity and cost aspects of mobile app development.
● What tool best supports cross-platform development?
2 Mobile platforms
2.1 Android mobile operating system
Android is a Linux-based operating system designed primarily for touchscreen mobile devices
such as smartphones and tablet computers. Initially developed by Android, Inc., which Google
backed financially and later bought in 2005, Android was unveiled in 2007 along with the
founding of the Open Handset Alliance: a consortium of hardware, software, and
telecommunication companies devoted to advancing open standards for mobile devices. The
first Android-powered phone was sold in October 2008 [7].
2.2 Apple iOS mobile operating system
The Apple iPhone was introduced in January 2007 and brought with it the iOS operating system, an operating system filled with goodies that seemed to tickle everyone's fancy. iOS is a heavily guarded proprietary operating system whose fate is tied to a romanticized device in the name of the iPhone and other iFamily devices such as the iPod touch and the iPad. Before the advent of Android, the iPhone proved to be a formidable weapon in the war for smartphone dominance and successfully positioned iOS in the top rank within a short time of its debut, trailing only Symbian; in 2007 it was considered the invention of the year in five categories, among them being beautiful and defining the future [2].
iOS provides the services that any operating system provides for iFamily devices. In addition to its pioneering technology and innovative design, iOS was considered a status symbol, making it a market leader in a short period of time. Consequently, other operating systems on the market had to make way for it. Obstacles to developing applications for iOS are that it requires a developer license, that the system is closed and apps must be Apple-approved products distributed through the iTunes App Store, not to mention that the devices come at a premium.
2.3 Windows Phone 8
Windows Phone 8 is the second generation of the Windows Phone mobile operating system
from Microsoft. It was released on October 29, 2012, and like its predecessor, it features the
interface known as Metro [9].
Windows Phone 8 replaces the CE-based architecture used on Windows Phone 7 devices with the Windows NT kernel shared with many Windows 8 components. Existing Windows Phone 7.x devices cannot run or update to Windows Phone 8, and new applications compiled specifically for Windows Phone 8 are not made available for Windows Phone 7.x devices [9].
2.4 Mobile platforms with low market penetration
iOS and Android are the most popular mobile operating systems, with Android holding the biggest market penetration. What about other platforms? Other operating systems include, for example, Bada by Samsung and BlackBerry OS. These platforms have low market penetration, so developers have little incentive to develop for them.
2.5 Common native features of a mobile platform
A mobile phone offers several native features. Some of the most common ones are described below; a short sketch after this list shows how two of them are typically exposed to JavaScript in a hybrid container.
● Accelerometer: detects the orientation of a mobile device, i.e. how a user holds the phone, and can be used to detect motion in three dimensions.
● Geolocation: an API that provides the geographical location of the device (and its holder) to an application. Location information is usually based on the Global Positioning System (GPS), IP address, Wi-Fi, Bluetooth, MAC address or GSM/CDMA mobile networks.
● Barcode: an API used to read barcodes through the device camera and interpret the information encoded in them.
● Calendar: the Calendar API gives access to the native calendar functionality and data of a mobile phone platform.
● Storage: a fast, lightweight embedded SQLite database module used to create, access and store data on the phone and accomplish database management tasks.
● Media: enables access to the multimedia functionality of a platform, such as playing and recording audio and video, access to platform players, and multimedia file operations.
● Notification: allows raising audible, visual or vibration notifications.
● Push notifications: services often based on information preferences expressed in advance, a so-called publish/subscribe model. A mobile device "subscribes" to various information "channels" provided by a server; whenever new content is available on one of those channels, the server pushes that information out to the client.
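As an illustration, the sketch below shows how two of these features are typically exposed to JavaScript in a hybrid container such as Apache Cordova (the engine behind PhoneGap, discussed in section 4.2). This is a minimal sketch of ours, not taken from the paper: the geolocation call follows the W3C Geolocation API, while navigator.accelerometer assumes the Cordova device-motion plugin is installed.

```javascript
// Minimal sketch: accessing native features from JavaScript in a
// Cordova/PhoneGap container. Waits for the native bridge to be ready.
document.addEventListener('deviceready', function () {

    // Geolocation (W3C API, bridged to GPS/Wi-Fi/cell positioning)
    navigator.geolocation.getCurrentPosition(
        function (pos) {
            console.log('Lat ' + pos.coords.latitude +
                        ', Lon ' + pos.coords.longitude);
        },
        function (err) { console.log('Geolocation error: ' + err.message); }
    );

    // Accelerometer (Cordova device-motion plugin): sample every 500 ms
    navigator.accelerometer.watchAcceleration(
        function (acc) {
            console.log('x=' + acc.x + ' y=' + acc.y + ' z=' + acc.z);
        },
        function () { console.log('Accelerometer error'); },
        { frequency: 500 }
    );
}, false);
```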
3 Mobile development
3.1 Native mobile applications development
Native application development involves developing software specifically for one platform, its hardware, processor and instruction set. The advantages of native application development are predictable performance, streamlined support, a native UI, native APIs and coherent library updates.
3.2 Cross platform development
Cross-platform development promises lower cost and time of development, since it produces one code base to write and maintain while targeting multiple devices and platforms. The figure below shows the trend in cost and time factors for native versus cross-platform development.
Figure 1: Native vs. cross-platform development [1]
Critics claim that cross-platform development might inhibit future development in the field, since it may focus on making developers' lives easier rather than on innovation. Some benefits of cross-platform development are:
● Learning web technologies might be less challenging than learning an all-new platform.
● Devices come with all sorts of technologies and features; through cross-platform development it might be possible to develop an application that works well on each of them with less effort.
● Cross-platform mobile apps are written using one SDK, operating system and development environment, requiring less financial investment.
● Cross-platform apps try their best to emulate natively written applications.
● Cross-platform development might help reduce costs in human resources, time and development period.
Table 1: App development comparison
4 Cross platform tools
4.1 Appcelerator Titanium
Appcelerator Titanium is a cross-platform mobile application development tool. It claims to use open web standards and architecture and follows open-source initiatives.
The Titanium Mobile SDK comes with its own IDE: Titanium Studio is an Eclipse-like IDE used to write and test applications together with an Android emulator. Appcelerator Titanium supports the database, media, geolocation, contacts, notification and many other native features of a smartphone.
Titanium enables web developers to create native mobile, desktop and tablet applications using open web technologies such as JavaScript, HTML and CSS. One uses JavaScript APIs for the (native) UI and JavaScript for scripting.
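To give a flavour of the Titanium idiom, the following minimal sketch (ours, not the paper's) builds a native window and label entirely from JavaScript using the Ti.UI factory functions:

```javascript
// Minimal Titanium sketch: JavaScript factory calls create native UI objects.
var win = Ti.UI.createWindow({
    title: 'Hello',
    backgroundColor: '#ffffff'
});

var label = Ti.UI.createLabel({
    text: 'Hello from Titanium',   // rendered as a native label, not HTML
    color: '#333333',
    textAlign: Ti.UI.TEXT_ALIGNMENT_CENTER
});

win.add(label);
win.open(); // opens a genuinely native window on iOS or Android
```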
4.2 PhoneGap
PhoneGap is a cross-platform mobile application development tool. It uses the HTML5, JavaScript and CSS3 web technologies to develop cross-platform mobile applications that exploit the native features of a mobile device [8].
PhoneGap supports the largest number of platforms of all the tools: iOS, Android, BlackBerry, webOS, WP7, Symbian and Bada. Apps developed with PhoneGap are hybrid apps, neither pure web apps nor native apps. PhoneGap is an open-source mobile app development tool: with PhoneGap, one develops an abstraction-based cross-platform application using web technologies and wraps the code in a native shell that accesses the system architecture of the device [8].
PhoneGap is commonly combined with the jQuery JavaScript library, which makes it easy to turn jQuery-based mobile apps into applications that access native features. It supports the accelerometer, camera, compass, contacts, file, geolocation, media, network, notification (alert, sound, vibration) and storage [8].
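A minimal sketch of this plugin model follows (ours, assuming the camera and vibration plugins are installed; the 'photo' element id is hypothetical):

```javascript
// Minimal PhoneGap/Cordova sketch: camera capture plus a vibration alert.
document.addEventListener('deviceready', function () {

    navigator.camera.getPicture(
        function (imageData) {
            // imageData is a base64-encoded JPEG string
            document.getElementById('photo').src =
                'data:image/jpeg;base64,' + imageData;
        },
        function (message) { console.log('Camera failed: ' + message); },
        { quality: 50, destinationType: Camera.DestinationType.DATA_URL }
    );

    // Notification feature: vibrate for half a second
    navigator.notification.vibrate(500);
}, false);
```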
4.3 Xamarin
Xamarin is a commercial cross-platform mobile application development tool. Xamarin enables developers to build applications for iOS and Android using the .NET framework and the C# programming language. Application development with Xamarin comes in two forms: MonoTouch for iOS and Mono for Android [3].
Mobile application development with Xamarin is claimed to bring the benefits of sharing code between platforms, using existing .NET expertise, easy access to native APIs, the niceties of a rich IDE (Visual Studio in the case of Android) and developing applications in a strongly typed language with the garbage collection of the .NET framework. The Xamarin IDE comes as MonoDevelop for Android, as a Visual Studio plug-in component on Windows, and as MonoTouch for Mac OS X only [3].
4.4 Rhomobile
Rhomobile is a Motorola Solutions-owned company whose cross-platform mobile application development tool relies on the Model-View-Controller (MVC) architecture, with applications programmed in HTML, CSS, JavaScript and Ruby. Rhomobile supports iOS, RIM, Windows Mobile and Android [4].
4.5 MoSync
MoSync is an open-source cross-platform mobile application development tool. It enables developers to build native-like cross-platform mobile applications using C/C++, HTML5 and JavaScript [5].
4.6 Corona
Corona is a cross-platform mobile application development tool used to author games and apps. Mobile applications in Corona are written in the Lua programming language. The Lua code is compiled into intermediate bytecode that runs on a native runtime abstraction layer.
5 Conclusion
Cross-platform mobile applications come in many forms, and each tool may approach cross-platform development in its own way. Some come as runtime implementations or interpreter abstraction layers, and when it comes to licensing, some are completely free and open source while others are proprietary and commercial. The level of native device feature support varies widely: as seen in the previous sections, some tools provide broader native support, while others offer more limited support but may provide a special feature that a particular developer is keen to utilize.
Cross-platform development is an actively developing young technology. It could be some time before the dust settles and a single tool emerges as the winner. Considering each tool on its merits for a given mobile application's needs and target audience seems the only sensible option available right now.
6 References
[1] AllianceTek. Comparison chart for mobile applications development, 2011. http://www.alliancetek.com/downloads/article/comparison-chart-for-mobileapp.pdf, April 2012.
[2] Feida Lin and Weiguo Ye. Operating system battle in the ecosystem of smartphone industry. In Information Engineering and Electronic Commerce, 2009. IEEC '09. International Symposium on, pages 617-621, May 2009.
[3] Xamarin. Xamarin documentation. http://www.xamarin.com, August 2013.
[4] Rhomobile. The universal mobile application integration server. http://www.rhomobile.com/products/rhoconnect/, August 2013.
[5] MoSync. MoSync home. http://www.mosync.com/, August 2013.
[6] Bada Developers. Architecture of Bada. http://developer.bada.com/library/help, August 2013.
[7] Android. Android. https://developer.android.com/index.html, August 2013.
[8] PhoneGap. PhoneGap for developers. http://phonegap.com/developer/, August 2013.
[9] Windows Phone. Dev Center for developers. http://developer.windowsphone.com/en-us, August 2013.
Business Intelligence in Environmental Reporting
Powered by XBRL
M. Hodinka, M. Štencl, J. Hřebíček, O. Trenz
Mendel University in Brno, Department of Informatics, Faculty of Business and Economics,
michal.hodinka@mendelu.cz, michael.stencl@mendelu.cz, jiri.hrebicek@mendelu.cz
Abstract
Today, companies are handling increasing amounts of transactional data. This phenomenon, commonly named "Big Data", has transformed from a vague description of massive corporate data into a household term that refers not just to volume but also to the variety, veracity and velocity of change. The commonly used approach leads to Business Intelligence (BI) technologies, used not only for environmental reporting purposes but also for the data discovery discipline. The critical issue in general data processing tasks is to get the right information quickly, near to real time, targeted and effectively.
This article aims at several critical points of the whole concept of BI environmental reporting powered by XBRL. First, and most important, is the usage of structured data delivered via XBRL. The main benefit of using XBRL is the optimization of the ETL process and its combination with commonly used best practices in data warehouse models. The whole BI workflow can be taken further by additional data quality health checks, extended mathematical and logical data tests, and basic data discovery and drill-down techniques.
The first part of the article reviews the state of the art at the XBRL level and also reviews current trends in environmental reporting. We also analyse the basics of Business Intelligence with regard to the application domain of environmental reporting. The methodology reflects today's technical standards of XBRL as applied via the ETL process. In the results we describe a concept for a standardized data warehouse model for environmental reporting, based on a specific XBRL taxonomy and known dimensions. In the discussion we explain our next steps and the pros and cons of the selected approach.
Keywords
Corporate reporting, Environmental reporting, XBRL, Business Intelligence, Performance
evaluation, Key performance indicators
1 Introduction
Big data is one of the most mentioned topics of recent years. A major factor behind the massiveness of big data processing in the SMB segment is easy access to techniques and tools that help any user generate information from any data. The current product philosophy of technological giants such as Google or Apple has already changed the IT and software world. The "user friendliness" or "user experience" of current products allows complex data analytics tools to be used in a very easy manner, without deep analytical and reporting knowledge and skills. So the world of analytics has already changed. Developers are now facing new goals, or old, well-known "hard-to-handle" tasks such as data quality, real-time analytics of streaming data, different forms and uncertainty of data, reducing the complexity of data transformations, and others. Commonly we speak about Business Intelligence (BI). BI technologies are widely used not only for environmental reporting purposes; they are also part of the data discovery discipline. The traditional understanding defines business intelligence, or BI, as an umbrella term that refers to a variety of software applications used to analyse an organization's raw data. BI as a discipline is made up of several related activities, including data mining, online analytical processing, querying and reporting, and has been adopted mostly by enterprise companies, not by the SMB segment.
Today, SMB companies too are handling increasing amounts of transactional data, and they have easy access to an even bigger scale of data from, e.g., social networks and other "new" media resources. This phenomenon can also be classified as "Big Data". Big data has changed its definition from a vague description of massive corporate data to a household term that refers not just to volume but to the velocity, veracity and variety of data. The critical issue in data processing and data analysis tasks is to get the right information quickly, near to real time, targeted and effectively.
We speak about the traditional approach (data marts) and about new in-memory analytics, whose main motivation is to exploit structured and unstructured data for real-time analytics and data efficiency, aka data monetization. The purpose of this paper is to enhance understanding of the current approach to BI, along with the extension based on the eXtensible Business Reporting Language (XBRL) and tools ready to transform existing data into an informational source of knowledge. It begins by linking cutting-edge scientific research at the XBRL level with a review of current trends in environmental reporting. The analysis includes the basics of Business Intelligence with regard to environmental reporting. It serves the purpose of showing how complexity can change by applying an XBRL data model.
2 Methods and Resources
The methodology reflects today's technical standards of XBRL as applied via the ETL process. In the results we describe a concept for a standardized data warehouse model for environmental reporting, based on a specific XBRL taxonomy and known dimensions. In the discussion we explain our next steps and the pros and cons of the selected approach. Any reporting service must be based on a set of predefined industry-specific KPIs.
Key Performance Indicators (KPIs) serve for the measurement of economic performance in relation to sustainability and ESG indicators. The economic performance indicators provide quantitative forms of feedback, which reflect results within the framework of corporate strategy. The approach is no different when we control environmental, social and governance issues. The non-financial KPIs that an organization develops, manages and ultimately reports, whether internally or externally, will depend on its strategic priorities and will reflect the unique nature of the organization. What is most important is to recognize what is measured and what is controlled, and to ensure that the measures create value for the company and its stakeholders.
The proposed KPIs can help organizations plan and manage their economic priorities, in particular when the economic indicators are focused on the core business strategy by means of operational plans that include performance targets.
We want to demonstrate how XBRL can potentially be applied in areas beyond the original design objectives of the standard. Many organizations have focused on employing XBRL GL with a primarily transaction-oriented focus. Klement (2007) shows how to link an integrated system with external reporting using XBRL. The standard reporting process takes an XBRL GL instance and imports it into a central data warehouse: the instance is loaded into a relational database, which serves as the basis for incremental updates to data warehouses and then to the OLAP cubes. Klement (2007) thinks there are currently no standard XBRL-based tools for this purpose. For large quantities of transactions, highly optimized bulk-load toolkits exist for importing data into relational databases, and an XBRL-based implementation will not replace these performance-optimized bulk-load toolkits in the short term.
The severest problem with so-called drill-down or drill-around is likely to be data access across system boundaries. XBRL reports also have the capacity to incorporate benchmarking and drill-down capabilities to access highly detailed information on a "need to know" basis. A significant promised advancement is a central XBRL repository for storing reporting facts and data mappings. An important prerequisite for improved drill-down functions over aggregations is an inverse drill-down function back to the original facts and the availability of mapping descriptions that support efficient analysis queries. For that purpose Klement (2007) did not find any suitable XBRL standards.
In this example there exist three transitions between non-XBRL and XBRL formats (the second transition is sketched in code below the list):
1. ERP data export to XBRL GL
2. XBRL GL import into relational database (a typical ETL application)
3. OLAP cube data export to an XBRL FR instance
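To make transition 2 concrete, the following sketch (ours, with hypothetical field, dimension and table names, and assuming the XBRL GL instance has already been parsed into plain fact objects) shows the essence of the ETL mapping from ledger facts onto a star-schema fact table:

```javascript
// Minimal sketch of transition 2: XBRL GL facts into star-schema rows.
// Field, dimension and table names are hypothetical.
const facts = [ // as they might come out of an XBRL GL parser
  { concept: 'gl-cor:amount', value: 1250.0, unit: 'EUR',
    period: '2013-06-30', account: '501', costCenter: 'PLANT-01' },
  { concept: 'gl-cor:amount', value: -320.5, unit: 'EUR',
    period: '2013-06-30', account: '602', costCenter: 'PLANT-02' }
];

// Surrogate-key lookups that a real ETL job would keep in dimension tables
const dims = {
  time:       { '2013-06-30': 42 },
  account:    { '501': 7, '602': 9 },
  costCenter: { 'PLANT-01': 1, 'PLANT-02': 2 }
};

// Each XBRL context dimension becomes a foreign key; the numeric
// fact value becomes the measure column of the fact table.
function toFactRow(fact) {
  return {
    time_key:    dims.time[fact.period],
    account_key: dims.account[fact.account],
    cc_key:      dims.costCenter[fact.costCenter],
    amount:      fact.value,
    currency:    fact.unit
  };
}

const rows = facts.map(toFactRow);
console.log(rows); // rows ready for a bulk INSERT into FACT_GL
```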
BI provides a broad set of algorithms to explore the structure and meaning of data. All the data scrubbing and pre-processing (extract, transform and load: ETL) has to do with the mapping of metadata and can be neglected when leveraging clean and meaningful XBRL data.
Chamoni (2007) asks the question: why not use the semantic layer and taxonomy of XBRL to go beyond reporting and do more in-depth analysis? Real-time control of business processes is currently hyped within the data warehousing industry. As every business process should possibly be traced in the accounts of a company, a constant flow of financial data in XBRL format into the BI system will be necessary for continuous control of operations, for early fraud detection and for BI as a source for compliance systems. The future enterprise will be based on intelligent real-time technologies. Chamoni (2007) points out what future research must be done to develop analytical applications with a high degree of intelligence and very low reaction time based on XBRL and BI. The contribution of XBRL to BI is semantically enriched data transport in favor of generic base systems, which may be used to build presentation and access systems.
Fig. 1: Shell of BI-tools (Gluchowski, Kemper 2006)
Modern BI solutions no longer update data in a periodic ETL process, but use event-based triggers to feed the analytical applications with streaming data. Short response times and synchronous signal processing open the field for new control applications. The impact of these reactive BI solutions is significant, and Chamoni (2007) has shown this impact in research on a framework for BI and XBRL. It leads to the issue of having different multidimensional modeling approaches: one for data warehouses and another for business reports, which are usually transferred between data warehouses. There is a new multidimensional approach based on XBRL Dimensional Taxonomies (XDT). This add-on, described by Felden (2007), has the potential to perform highly sophisticated multidimensional tasks such as directly facilitating OLAP solutions. Felden's (2007) work can be continued along different research lines. It can be related to other areas such as data quality, database security, temporal issues, query optimization and translation to logical/physical-level methodologies, or just to studying modeling problems at the conceptual level. Especially the conceptual level offers research potential.
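The contrast between periodic ETL and event-based feeding can be sketched in a few lines (a minimal sketch of ours using Node's built-in EventEmitter; the event name, field names and threshold are hypothetical):

```javascript
// Minimal sketch: event-based feeding of an analytical store instead of
// a periodic batch ETL job. Event and field names are hypothetical.
const { EventEmitter } = require('events');

const bus = new EventEmitter();
const factStore = []; // stands in for the BI fact table

// Every posted business event is transformed and loaded on arrival,
// rather than waiting for a nightly ETL run.
bus.on('xbrl-fact', function (fact) {
  factStore.push({ loadedAt: Date.now(), concept: fact.concept,
                   amount: fact.amount });
  // a reactive BI rule: flag suspiciously large amounts for fraud review
  if (Math.abs(fact.amount) > 100000) {
    console.log('ALERT: unusually large transaction', fact);
  }
});

bus.emit('xbrl-fact', { concept: 'gl-cor:amount', amount: 250000 });
```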
In XBRL, the data model is based on taxonomies expressing metadata and on instance documents that refer to the taxonomies and represent business reports. The primary taxonomies represent business fact data, which are later reported accordingly in instance documents. The domain member taxonomies model the content of the explicit dimensions and the holder properties for the typed dimensions. The template taxonomies complete the multidimensional model, connecting the primary items with dimensions using hypercubes (HOFFMAN, 2006). In order to model the relationships between the various elements (primary items, hypercubes, dimensions and domain members), the following connections (arcroles) are used (PIECHOCKI et al., 2007):
● all or notAll (primary item – hypercube),
● hypercube-dimension,
● dimension-domain,
● domain-member.
Fig. 2: Dependencies in XDT (PIECHOCKI et al., 2007)
The figure above represents the usage of the taxonomies, elements and connections described by PIECHOCKI et al. (2007). In the case of explicit dimensions, all domain members are known and grouped into exactly one dimension. Typed dimensions are used if the set of members is too large to be expressed with a finite number of members. The domain members within explicit dimensions are connected using the domain-member arcrole, creating dimensional hierarchies. Further, the explicit dimensions are connected to the domain members via the dimension-domain arcrole. Both explicit and typed dimensions are gathered in hypercubes using the hypercube-dimension arcroles (HOFFMAN, 2006). Finally, via the all and notAll arcroles, the dimensions of a hypercube are attached to or excluded from the primary item (HERNANDEZ-ROS et al., 2006). Because not every primary item has to be linked directly to the hypercube, the domain-member arcrole can be used not only within domain member taxonomies but also within the primary taxonomies. This offers the possibility of linking a full tree hierarchy of primary items to the respective hypercube.
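The arcrole relationships above can be pictured as a plain object graph; the sketch below (ours, with hypothetical taxonomy, dimension and member names) mirrors the four arcroles one-to-one:

```javascript
// Minimal sketch: the XDT arcroles as an object graph.
// Taxonomy, dimension and member names are hypothetical.
const xdtModel = {
  primaryItems: {
    // 'all' arcrole: the primary item is reported against this hypercube
    CO2Emissions: { all: ['EmissionsCube'] }
  },
  hypercubes: {
    // hypercube-dimension arcrole
    EmissionsCube: { dimensions: ['RegionDim', 'PollutantDim'] }
  },
  dimensions: {
    // dimension-domain arcrole: explicit dimension with an enumerable domain
    RegionDim:    { type: 'explicit', domain: 'RegionDomain' },
    // typed dimension: the member set is too large to enumerate
    PollutantDim: { type: 'typed', valueFormat: 'xs:string' }
  },
  domains: {
    // domain-member arcrole: hierarchy of domain members
    RegionDomain: { members: ['EU', 'CZ', 'AT'] }
  }
};

// A reported fact is dimensionally valid only if its context matches a
// hypercube attached to its primary item via the all/notAll arcroles.
console.log(xdtModel.primaryItems.CO2Emissions.all); // ['EmissionsCube']
```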
PIECHOCKI et al. (2007) divide the evaluation of the XDT modeling technique into three stages. The highest stage (+) indicates complete fulfillment of the analyzed criterion. If the fulfillment level is insufficient, the middle stage (o) is assigned. The third stage (-) represents the situation where the criterion is not fulfilled or not relevant. The following figure illustrates the evaluation of the modeling techniques used.
Fig. 3: Evaluation result (PIECHOCKI et al., 2007)
To summarize the results, we can follow the research of PIECHOCKI et al. (2007): multidimensional data can be modeled using XDT, as shown by the fulfilled evaluation criteria. Thanks to the graphical representation of the model elements, data warehouse engineers gain an improved understanding of the multidimensional data in this approach, because the model elements have more comprehensive semantics. Thus the main advantage is the possibility of mapping between the modeled XBRL taxonomies and the data warehouse schemas. However, XBRL technology should be seen as a complement rather than a replacement for traditional data warehouse/mart-driven BI reporting.
Vendor support for XBRL is also growing; many vendors have committed themselves to supporting XBRL as an interchange format for importing and exporting data from their systems. Microsoft's FRx financial software now supports XBRL as a widely accepted format for reporting on and publishing financial information. SAP was one of the early joiners of the international project committee set up to launch XBRL back in 2000. Oracle expanded XBRL support (via an XBRL Manager component) in its Enterprise Performance Management System. Enterprise Engineering has released a suite of XBRL-based financial management, analysis and reporting products built on its EnterpriseFTX platform (SHEINA, 2008).
3 Results
Chamoni (2007) takes us down a different path. He analyses the interrelationship between XBRL and business intelligence concepts. Both concepts have in common the support and automation of the management process of reporting and analysing business information. The difference is that XBRL tries to describe the meaning of business data and to standardize data exchange, while BI seeks to analyse and report these decision-relevant data. The two concepts come from different perspectives: XBRL from the semantic description of data within an XML environment, and BI from the search for knowledge in data.
Fig. 4: Proposed processing architecture (HODINKA, 2013)
So it is not surprising that an integration of both architectures, and especially a convergence of taxonomies, will bring benefits to business applications. The architectural design presented in figure 4 is described in more depth in the dissertation thesis of the first author of this article (HODINKA, 2013). The three components are the XBRL Processor, the XBRL Gate and the Business Intelligence ETL/Replication component. To cover any amount of data (including Big Data), each component must be multi-tenant and highly scalable. This design supports the Cloud environment, and primarily the Private Cloud.
Fig. 5: Example of an XBRL report transformation (DEBRECENY, 2013)
Even though these concepts are founded on a similar basis, one of the main differences is that XBRL instances are normally snapshots of single data points, whereas fact tables in BI systems represent time series. At the time of Chamoni (2007), there was only limited evidence of integration between BI and XBRL: taxonomies were built in the data layer and used in reporting systems in the presentation layer. Moreover, our concept includes a separate ETL or replication resource for full adoption in the Big Data domain. We see the connection between the domain of Big Data and XBRL through global reporting services. One can now see SMB companies that already act on global markets and are data-driven. For them, complete automation of the extraction process, minimizing the effort spent on report data transformation, is more than important. And here we see the big additional value of XBRL. The language concept and its adoption by big technological companies show its potential to be widely adopted, which the evolution of XBRL confirms. Its formulas and easy translation onto multidimensional models (see figures 5 and 6) have solid potential to be adopted early in the ETL, or ELT, process without any additional investment. Together with the business rules that can easily describe the business logic, which is now generally covered at the application level, this makes the whole approach even more powerful, not only in the business world but also in the government area, for example in environmental reporting.
Fig. 6: Report-Dimensions mapping (DEBRECENY, 2009)
Our aim is to explain how to adopt environmental reporting under the Community Eco-Management and Audit Scheme (EMAS) and XBRL-tagged reporting formats. We used the Global Reporting Initiative (GRI) industry-specific categorization scheme, which defines and "tags" data in relation to its purpose, framework or outline. The key is to identify individual detailed reporting elements that can be easily shared. Of course, XBRL was not designed explicitly as a BI technology. It was designed as a metadata representation language, but organizations can benefit from a well-defined structured format for collecting and disseminating sustainability information. In Chamoni's approach we can see an interesting maturity model for BI, in which he portrays XBRL as playing a native role in areas such as text mining and web reporting. We agree with Debreceny (2007) that this was only an important first step: the study by Chamoni provides only a tantalizing preview of future XBRL-based BI implementations, and there is a clear need for case studies and research pilots that would test the propositions made by Chamoni.
4 Discussion
Even though XBRL might seem like a finance-only play, the data exchange standard is flexible enough to support reporting requirements outside the office of finance. We discussed the relationship between environmental and sustainability indicators and corporate sustainability reporting in the chapter "Sustainability indicators: development and application for the agriculture sector" published by Springer (HŘEBÍČEK et al., 2013). Our research examines the possibility of utilizing information and communication technology and an XBRL taxonomy. We suggest formalizing the reporting systems of agriculture organizations on the basis of the universal markup language XML, by means of XBRL, to minimize the main barriers that keep agriculture organizations from sustainability reporting (Hřebíček, 2009), for example:
● Collecting and managing data is expensive; technical issues with data collection are also a problem.
● Determining the set of appropriate sustainability indicators to monitor and measure is difficult.
● There are difficulties in capturing reliable data (for some aspects of the agrosystem it is very difficult to collect meaningful and repeatable data).
● Information disclosure can create business risks, which competitors and regulators may seize upon.
● It is difficult to determine the sphere of influence of an organization.
● Many organizations have good intentions, but simply have not allocated enough resources due to the current economic situation in the Czech Republic.
● Reporting is seen as a superfluous and burdensome activity.
The core of these barriers is the time-demanding nature of agriculture farm data processing and the absence of positive feedback (HODINKA et al., 2012).
By implementing the XML scheme, an agriculture organization gains a whole set of advantages. The administration and editing of information become much easier and much more effective. Employing the above-mentioned framework enables improved communication and collaboration with target groups and concerned parties. By implementing the scheme, the company acquires the ability to create and publish compact, focused messages that are generated automatically on the basis of the template rules of one single scheme (HŘEBÍČEK et al., 2013).
5 Summary
As we have stated in this paper, the critical issue in data processing and data analysis tasks is to get the right information quickly, near to real time, targeted and effectively. If XBRL is adopted as the data exchange format across the whole information supply chain, it will eliminate most of the costly, often manual, processes of data preparation. This is because XBRL data lends itself to automatic transformation by software or other mapping tools, which in turn increases consistency, accuracy and trust in data, all key tenets of successful BI reporting. We see XBRL taxonomies as a starting point for building a global data warehouse kernel of qualitative information throughout international corporations.
6 Acknowledgements
This paper was supported by the Czech Science Foundation, project Construction of Methods for Multi Factor Assessment of Company Complex Performance in Selected Sectors, registration No. P403/11/2085.
7 References
[1] DEBRECENY, R. et al. XBRL for Interactive Data: Introduction to XBRL. Springer Berlin Heidelberg, 2009, pp. 35-50. http://dx.doi.org/10.1007/978-3-642-01437-6_3. ISBN 978-3642014369.
[2] DEBRECENY, R., FELDEN, C., PIECHOCKI, M. New Dimensions of Business Reporting and XBRL. Springer, 2007, 271 p. ISBN 978-3835008359.
[3] DEBRECENY, R. New Dimensions of Business Reporting and XBRL: Research into XBRL – Old and New Challenges. Springer, 2007, pp. 5-13. ISBN 978-3835008359.
[4] GLUCHOWSKI, P., KEMPER, H.-G. Quo Vadis Business Intelligence. In: BI-Spektrum, TDWI, 2006, pp. 12-19.
[5] HERNANDEZ-ROS, I., WALLIS, H. XBRL Dimensions 1.0 Candidate Recommendation. XBRL International, 2006. [online] Filetype RTF. [cit. 2013-01-10]. http://www.xbrl.org/Specification/XDT-CR3-2006-04-26.rtf.
[6] HŘEBÍČEK, J. et al. Sustainability Indicators: Development and Application for the Agriculture Sector. In: Sustainability Appraisal: Quantitative Methods and Mathematical Techniques for Environmental Performance Evaluation. Springer Verlag, 2013, pp. 63-102. ISBN 978-3642320804.
[7] HŘEBÍČEK, J., HÁJEK, M., CHVÁTALOVÁ, Z., RITSCHELOVÁ, I. Current Trends in Sustainability Reporting in the Czech Republic. In: EnviroInfo 2009. Environmental Informatics and Industrial Environmental Protection: Concepts, Methods and Tools. 23rd International Conference on Informatics for Environmental Protection. Aachen: Shaker Verlag, 2009, pp. 233-240. ISBN 978-3832283971.
[8] HODINKA, M. Formalization of Business Rules and Methodology of their Implementation to the Information System. Ph.D. thesis. Brno, 2013.
[9] HODINKA, M., ŠTENCL, M., HŘEBÍČEK, J., TRENZ, O. Current trends of corporate performance reporting tools and methodology design of multifactor measurement of company overall performance. Acta univ. agric. et silvic. Mendel. Brun., Brno: MZLU, 2012, LX, No. 2, pp. 85-90. ISSN 1211-8516.
[10] HOFFMAN, C. Financial Reporting Using XBRL: IFRS and US GAAP Edition. Lulu.com, 2006, 515 p. ISBN 978-1411679795.
[11] CHAMONI, P. New Dimensions of Business Reporting and XBRL: XBRL and Business Intelligence – From Business Reporting to Advanced Analysis. Springer, 2007, pp. 179-188. ISBN 978-3835008359.
[12] KLEMENT, T. New Dimensions of Business Reporting and XBRL: Standardized Company Reporting with XBRL. Springer, 2007, pp. 249-271. ISBN 978-3835008359.
[13] PIECHOCKI, M., FELDEN, C., GRÄNING, A. Multidimensional XBRL reporting. In: Proceedings of the Fifteenth European Conference on Information Systems, ECIS, Switzerland, 2007.
[14] SHEINA, M. MarketWatch: XBRL bubbling up in BI. 2008. [online] Filetype HTML. [cit. 2013-01-10]. http://netdotwork.co.za/article.aspx?pklarticleid=4965.
Sustainability indicators for measuring SME
performance and ICT tools
Edward Kasem, Jiří Hřebíček
Mendel University, Department of Informatics, Faculty of Business and Economics
Zemědělská 1,
613 00 Brno, Czech Republic
xkasem@node.mendelu.cz
hrebicek@iba.muni.cz
Abstract
Small and medium enterprises (SMEs) have significant effects on the European economy, and especially on the economy of the Czech Republic. In this paper we concentrate on agricultural SMEs more than on others. At the beginning we describe the impacts of SMEs on the economy. Then two frameworks for measuring and reporting on the four pillars of sustainability (economic, environmental, social and governance) are introduced: the Global Reporting Initiative (GRI) and the Sustainability Assessment of Food and Agriculture systems (SAFA). Finally, the current state of sustainability indicators and of information and communication technologies for measuring complex enterprise performance is presented.
Keywords
GRI, SAFA, SME, sustainability
1 Introduction
Sustainability reporting is a global trend that engages companies in the disclosure of their overall economic, environmental, social and governance impacts and efforts [1]. At the current stage, it is primarily large companies that have become involved in sustainability reporting practices. Small and medium-sized enterprises (SMEs), despite their significant impact on the European economy, employment and environment, show a low level of engagement. Because of their important effects on the European economy, small and medium enterprises need a method or procedure to measure, control and improve their performance. Such a procedure for effective and efficient management should be simple and efficient and should integrate the various viewpoints of performance in environmental, economic and social terms [2]. The main goal of this paper can be summarized as an assessment of the current state of the art of ICT for measuring enterprise performance and sustainability in selected sectors of the national economy.
In this paper we summarize the results achieved by project No. P403/11/2085 [5] in chosen organizations of the agriculture and food-processing sector [3], [6]. These results relate to the ESG aspects of corporate performance evaluation, indicators and reporting. Most of this research builds on the Global Reporting Initiative (GRI G3.1) and the Sustainability Assessment of Food and Agriculture (SAFA) systems. Our goal is to assess the current state of the art of ICT for measuring enterprise performance and sustainability, and then to develop it further using the more advanced version of the Global Reporting Initiative guidelines, GRI G4.
2 Small and Medium Enterprise Sustainability
The small and medium enterprise (SME) sector is a driver of business, growth, innovation and competitiveness. SMEs play a major role in the economic and social dimensions, generating the most significant economic value and employment in Europe.
Figure 1: Density of SMEs (number of SMEs per 1 000 inhabitants) in European countries.
At the same time, small and medium enterprises are highly sensitive to the quality of the business environment. They have a weaker financial background, because the main goal of their existence is increasing their profit. The importance of small and medium enterprises in the economy is not challenged by anyone. SMEs are estimated to generate 58.6% of the value added within the non-financial business economy. Furthermore, they contribute significantly to employment in the EU, hiring 66.7% of the non-financial business economy workforce [7]. Figure 1 depicts the density of SMEs in different European countries [7].
In the Czech Republic, according to statistics as of 31 December 2010, there were a total of 1 029 871 small and medium enterprises, representing 99.83% of all economically active entities. They had a 62.33% share in total employment and a 36.22% share in GDP. A detailed view of small and medium enterprise development in the Czech Republic in 2000-2010 is provided in Figure 2 [8].
Figure 2: Development of the number of active SMEs in the Czech Republic
According to the survey, 57% of companies include a reference to sustainable development and make it part of their strategic goals, while 43% of companies do not include such a reference and it is not part of their strategic goals either, even though sustainable development is an important issue that should be taken into consideration regardless of an organization's field of interest. Table 1 describes the fields of sustainability reporting most frequently practised and most affected, according to the conducted SME research [8].
                         Definitely Not   Rather Not   Rather Yes   Definitely Yes
In environmental field        5.6%           6.4%        16.9%          13.7%
In economic field             3.2%           6.5%        15.3%          21.0%
In social field               4.8%           7.3%        16.9%          12.9%
Table 1: Link to the different areas of sustainable development.
3 Global Reporting Initiative (GRI)
Nowadays, the Global Reporting Initiative (GRI) [9], [10], [11] is the best-known non-profit organization developing a comprehensive sustainability reporting framework, one that is widely used across the world. Its mission is to provide a credible, transparent framework for sustainability reporting that can be used by all organizations regardless of their size, sector or location. This framework enables all organizations to measure and report on four key areas of sustainability (economic, environmental, social and governance impacts). GRI is a multi-stakeholder, network-based organization. Its vision is a sustainable global economy where organizations manage their economic, environmental, social and governance performance and impacts responsibly, and report transparently. To achieve these goals, GRI makes sustainability reporting standard practice by providing guidance and support to organizations.
The G3.1 Guidelines [11] were the starting point of the G4 Guidelines [9], [10]; a number of changes were made to develop G3.1 into more refined guidelines. The G4 Guidelines are presented in two parts: Reporting Principles and Standard Disclosures [9], and the Implementation Manual [10]. Reporting Principles and Standard Disclosures explains the requirements of reporting against the framework ('what' must be reported), whereas the Implementation Manual provides further guidance on 'how' organizations can report against the G4 criteria. The objectives of developing this next generation of the reporting Guidelines (G4) are to offer user-friendly guidance that helps reporters understand and use the guidelines, to improve the technical quality of the guidelines' content so as to eliminate ambiguities and differing interpretations, to harmonize them with other internationally accepted standards, and to offer guidance on linking the sustainability reporting process to the preparation of an Integrated Report [2].
These changes can be summarized in five most significant areas:
● Materiality, which takes centre stage.
● Reporting boundaries.
● Application levels, replaced by the concept "In Accordance".
● Governance disclosure.
● Supply chain.
3.1 Materiality
Materiality is one of the most important concepts in the G4 Guidelines. G4 concentrates on the materiality of each Aspect. Materiality is not a new concept; it exists in the G3 Guidelines with the same definition and principle. G4, however, contains new requirements for organizations to explain the process used to identify their Material Aspects.
G4 encourages reporters to focus on content that affects their business rather than reporting on everything. To begin the process of defining the content of a report, the organization is required to select material Aspects; that is, materiality determines which sustainability indicators will be reported. This is the major difference between G4 and G3.1: in G3.1, reports had to cover a certain number of indicators, the so-called 'core indicators', regardless of whether they affected the organization or not.
Accordingly, GRI G4 describes the steps an organization may go through in order to define materiality and the specific content of the report. These steps are depicted in Figure 3 and can be summarized as identification, prioritization, validation and review. More details can be found in the G4 Implementation Manual, pages 32-40.
Figure 3: Defining material Aspect and Boundaries process overview
3.2 Reporting boundaries
After determining the material Aspects, the reporter must consider whether the impact of each issue lies inside, outside, or on both sides of the organization, and describe where the impact ends. To determine whether an impact is within, outside or both, we have to consider what makes the Aspect relevant. For example, a material Aspect could be relevant primarily to entities inside the organization, such as anti-corruption. It could be relevant primarily to entities outside the organization, such as suppliers, distributors or consumers. On the other hand, emissions could be an Aspect with impacts both within and outside the organization. This is what has been extended in G4: companies are encouraged to consider and report on a broader range of impacts compared with G3.1, which reported only on impacts that the company controls or significantly influences.
3.3 The application levels replaced by the concept “In Accordance”
In the G3.1 Guidelines there is an 'application level' system, which declares the level at which a report applies the GRI Reporting Framework. To meet the needs of beginners, advanced reporters and those somewhere in between, there are three levels in the system, titled C, B and A, reflecting the extent of application or coverage of the GRI Reporting Framework. A 'plus' (+) is available at each level (e.g. C+, B+, A+) if the report utilized external assurance [11]. The G4 Guidelines, by contrast, offer two options for preparing a sustainability report 'in accordance' with the Guidelines: the Core option and the Comprehensive option. The Core option contains the essential elements of a sustainability report. It provides the background against which an organization communicates the impacts of its economic, environmental, social and governance performance. The Comprehensive option builds on the Core option by requiring additional Standard Disclosures of the organization's strategy and analysis, governance, and ethics and integrity. In addition, the organization is required to communicate its performance more extensively by reporting all indicators related to the identified material Aspects.
3.4 New governance disclosure requirements
In the G4 Guidelines, a number of changes have been made to governance disclosures, along with a new category of disclosure on 'Ethics and integrity'. These changes concern information on the composition, involvement and authority of the reporting organization's highest governance body. They were made to strengthen the link between governance and sustainability performance, taking into account consistency with existing governance frameworks and developments in that field. The G4 Guidelines contain 10 new standard disclosures on governance (G4.35, G4.36, G4.42, G4.43, G4.46, G4.48, G4.50, G4.52, G4.54, G4.55) [10]. These changes are important for reporters who want to achieve the 'Comprehensive' level of reporting.
3.5 Supply chain
More prominence is given to the supply chain, including new and amended disclosures. Supply chain and supplier are redefined in the G4 Guidelines. Companies must disclose how they manage environmental and social issues related to the material impacts of their supply chains. More comprehensive reporting is required on the supply chain, including details of supply chain assessments, risks identified, the organization's performance in managing these risks, and the management processes put in place. In addition, the reporter should concentrate on supplier boundaries and on reporting by region, total spend, type of supply, etc.
4 SAFA Sustainability Assessment of Food and Agriculture systems
SAFA is a holistic global framework for the assessment of sustainability along food and agriculture value chains [12]. It is one of the most important concepts to be taken into consideration when informing society about progress in the economic, social, environmental and governance areas. The framework can be applied to small, medium and large-scale companies, organizations and other stakeholders that participate in crop, livestock, forestry and fishery value chains. SAFA seeks to harmonize the long-term objective of a sustainable transformation of food systems by providing a transparent and aggregated framework for assessing sustainability.
The guiding vision of SAFA is that food and agriculture systems worldwide are characterized by four dimensions of sustainability: good governance, environmental integrity, economic resilience and social well-being. The SAFA Guidelines include 21 themes and 56 sub-themes defined through expert consultations [13]. The Guidelines are the result of an iterative process, built on cross-comparisons of codes of practice, corporate reporting, standards, indicators and other technical protocols currently used by the private sector, governments, not-for-profits and multi-stakeholder organizations that reference or implement sustainability tools.
4.1 SAFA System
The SAFA system consists of four nested levels designed to be coherent with each other (see Figure 4) [12]. The levels begin with the SAFA Framework, the highest level, which encompasses many aspects by aggregating the four dimensions of sustainability: good governance, environmental integrity, economic resilience and social well-being [12], [13]. The second level contains a set of 21 core sustainability issues, or universal "themes". At this level, policy-makers and national governments can work towards alignment and harmonization of a holistic scope of sustainability without defining specific pathways. At the third level, each of the 21 sustainability themes is divided into sub-themes, i.e. individual issues within SAFA themes. This level, composed of 56 sub-themes, is relevant for supply chain actors doing a contextualization analysis, which identifies hot-spot areas as well as gaps in existing sustainability efforts.
Figure 4: The SAFA system – the SAFA Framework; 21 universal Themes; 56 Sub-Themes specific to supply chains; 113 Core Indicators for crops, livestock, forestry and fisheries.
The last level defines Core Indicators within each sub-theme. These indicators identify the measurable criteria for sustainable performance for the sub-theme. Core Indicators provide ratings ranging from the highest performance (green) to unacceptable practices (red); this is detailed in Section 2 of the SAFA Guidelines [12].
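To make the rating idea concrete, the following minimal Python sketch maps a normalized indicator score to a rating band. The function name, the 0–1 normalization and the two cut-off values are illustrative assumptions only; SAFA defines the rating criteria individually for each core indicator in the Guidelines [12].

    def safa_rating(score: float, green_threshold: float = 0.8,
                    red_threshold: float = 0.2) -> str:
        """Map a normalized indicator score (0.0-1.0) to a SAFA-style rating band.

        The cut-off values are illustrative assumptions, not SAFA's own thresholds.
        """
        if score >= green_threshold:
            return "green"  # highest performance
        if score <= red_threshold:
            return "red"  # unacceptable practice
        return "intermediate"  # between the two extremes

    # Example: a score of 0.85 on some sub-theme indicator rates as "green".
    print(safa_rating(0.85))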
4.2 SAFA Principles
SAFA distinguishes two types of principles. Methodological principles comprise concepts such as a holistic approach, relevance, rigor, efficiency, performance orientation, transparency, adaptability and continuous improvement. Implementation principles state that SAFA should build on existing tools, take place in an open and learning system, and remain accessible.
4.3 Scope of SAFA assessment
SAFA can be adapted to different contexts and scopes. These scopes can be described as follows:
 Supply chain scope, which sets the boundaries of the assessment. It starts with the production of a primary commodity, ends with the consumption of the final product, and includes all of the economic activities undertaken between these phases, such as processing, delivery, wholesaling and retailing.
 Temporal scope, which defines the time frame. For establishing the thresholds of the ratings, especially in the environmental dimension, a one-year interval of the entity's activities is assumed. For some indicators, however, multi-year trends should be assessed or sustainability impacts allocated to a longer period; in these cases a period of five years is usually suggested.
 Thematic scope, which defines the sustainability context. SAFA outlines the essential elements of sustainability through 21 high-level themes. These are applicable at any level of development, for instance the national level or a specific commodity. The themes are further divided into 56 sub-themes. SAFA sub-themes are tailored to food and agriculture supply chains and are thus not well suited for policy development. Each sub-theme then consists of core indicators proposed in order to facilitate measuring progress towards sustainability.
5 Current state of sustainability reporting according to GRI G3.1 and SAFA guidelines
The performance indicators have been developed over the last few years [1], [2], [14]. In this section, we present the latest version of the key generic performance indicators developed by the GACR 403 project [5]. This set of indicators is listed in Table 2 [13] and can be used by a wide range of SMEs in the food and agriculture sector [3], [6].
Area | Performance indicator | Measurable scale and (unit)

Environmental indicators
Investment | EN01 – Environmental protection investments | Total investments to environmental protection / value added [%]
Investment | EN02 – Environmental protection expenditures | Cost of environmental protection expenditures / value added [%]
Emission | EN03 – Total air emissions | Total direct and indirect emissions to air / value added [t / CZK]
Emission | EN04 – Total greenhouse gas emissions | Direct and indirect GHG emissions / value added [t / CZK]
Resource consumption | EN05 – Total annual energy consumption | Direct and indirect energy consumed / value added [MWh / CZK]
Resource consumption | EN06 – Total renewable energy consumption | Energy from renewable resources * 100 / total energy consumption [%]
Resource consumption | EN07 – Consumed materials | Consumption of materials by weight / value added [t / CZK]
Resource consumption | EN08 – Recycled input materials | Share of raw materials from recycled materials as a percentage of total input materials [%]
Resource consumption | EN09 – Total annual water consumption | Total volume of water removed / value added [m³ / CZK]
Waste | EN10 – Total annual waste production | Total amount of each type of waste generated by operation of the company / value added [t / CZK]
Waste | EN11 – Total annual production of hazardous waste | Total amount of hazardous waste generated by operation of the company / value added [t / CZK]

Social indicators
Company | SO01 – Community | Social investment to local communities * 100 / value added [%]
Company | SO02 – Contributions to municipalities | Monetary value of projects of municipalities * 100 / value added [%]
Human rights | HR01 – Discrimination | Number of discriminatory cases * 100 / number of employees [%]
Human rights | HR02 – Equal opportunities | Number of women * 100 / average number of employees [%]
Labor relations | LA01 – Rate of staff turnover | Terminated employment relationships * 100 / average number of employees [%]
Labor relations | LA02 – Expenditure on education and training | Total expenditure on education * 100 / value added [%]
Labor relations | LA03 – Occupational diseases | Occupational illnesses * 100 / average number of employees [%]
Labor relations | LA04 – Deaths | Number of fatal accidents * 100 / average number of employees [%]
Responsibility for products | PR01 – Customer loyalty | (Number of customers at the end of the year – newcomers during the year) * 100 / number of customers at the beginning of the year [%]
Responsibility for products | PR02 – Marketing communications | Communications – website, etc. [y/n]
Responsibility for products | PR03 – Health and safety of customers | Total monetary value for noncompliance * 100 / value added [%]

Corporate Governance (CG) indicators
Monitoring and reporting | CG01 – Company information | Information about the objectives and strategy of the company [y/n]
Effectiveness of CG | CG02 – CG responsibility | Collective agreement [y/n]
Effectiveness of CG | CG03 – CG standardization | Publishing of standardized (GRI, CSR) reports [y/n]
Effectiveness of CG | CG04 – Ethical behavior | Code of ethics [y/n]
Effectiveness of CG | CG05 – CG codex | Code of corporate governance [y/n]
CG structure | CG06 – CG remuneration | Remuneration of the board of directors and the supervisory board * 100 / annual labor costs [%]
CG structure | CG07 – CG membership | Number of independent members of the CG * 100 / number of top management members [%]
CG structure | CG08 – Equal opportunities | Number of women in the total number of members of the CG [%]
CG structure | CG09 – Compliance with regulatory standards | Monetary value of fines for noncompliance * 100 / value added [%]

Economic indicators
Value added | EE01 – Value added per employee | Value added * 100 / average number of employees in the year [%]
Value added | EE02 – Value added to personnel costs | Value added * 100 / payroll costs [%]
Market position | EE03 – Market share | Turnover in manufacturing * 100 / turnover [%]
Efficiency | EE04 – Profit EAT | Earnings after taxes – EAT [CZK]
Efficiency | EE05 – Profit EBT | Earnings before taxes – EBT [CZK]
Efficiency | EE06 – Profit EBIT | Earnings before interest and taxes – EBIT [CZK]
Performance | EE07 – ROE | Return on equity ROE = EAT / equity [CZK]
Performance | EE08 – ROA | Return on assets ROA = EBIT / total assets [CZK]
Performance | EE09 – ROS | Return on sales ROS = EAT / total sales [CZK]
Performance | EE10 – ROI | Return on invested capital ROI = EBIT / total equity [CZK]
Performance | EE11 – ROCE | ROCE = EBIT / (equity + long-term liabilities) [CZK]
Performance | EE12 – Turnover size | Sales of own products and services * 100 / value added [%]
Cash flow | EE13 – Operating cash flow | Total cash resources of the company * 100 / total operating costs [%]
Cash flow | EE14 – Expenses on R&D | Total R&D costs * 100 / value added [%]
Cash flow | EE15 – Number of employees | Average number of employees in the year (persons) [number]

Table 2: ESG indicators proposed for manufacturing enterprises
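Most indicators in Table 2 are simple ratios: an absolute quantity divided by value added, or a percentage. As a minimal sketch of how such formulas can be evaluated, the following Python functions implement EN06 and LA01 from the table; the figures in the usage example are invented for illustration only.

    def en06_renewable_share(renewable_mwh: float, total_mwh: float) -> float:
        """EN06 - Total renewable energy consumption:
        energy from renewable resources * 100 / total energy consumption [%]."""
        return renewable_mwh * 100.0 / total_mwh

    def la01_staff_turnover(terminated: int, avg_employees: float) -> float:
        """LA01 - Rate of staff turnover:
        terminated employment relationships * 100 / average number of employees [%]."""
        return terminated * 100.0 / avg_employees

    # Invented example figures:
    print(en06_renewable_share(120.0, 800.0))  # -> 15.0 [%]
    print(la01_staff_turnover(6, 150.0))       # -> 4.0 [%]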
A prototype of the reporting information system was also implemented. The prototype was built to ensure that the indicators can be implemented successfully and to verify the usability of the information system for end users. It concentrates on assessing the sustainability of crop production systems under the conditions of the Czech Republic. Figure 5 depicts the core data model of the reporting system. According to Figure 5, the basic indicator abstraction entities are: indicator group, indicator, indicator item and indicator item aspect. The prototype demonstrates a developed infrastructure of ICT tools [15] to support rapid knowledge-based decision making for sustainable development, integrating information from various environmental, social and corporate governance impacts related to the economy [1], [2]. More information about this prototype can be found in [13], [16].
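To illustrate how the four abstraction entities named above could be represented, here is a minimal sketch using Python data classes. The attribute names loosely follow the entities of Figure 5; anything not visible there is our assumption, and the actual prototype implementation may differ.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class IndicatorItemAspect:
        """One aspect of an indicator item, e.g. a region or a waste type."""
        aspect_id: int
        formula: str
        description: str = ""

    @dataclass
    class IndicatorItem:
        """A single measurable item contributing to an indicator."""
        item_id: int
        name: str
        formula: str
        aspects: List[IndicatorItemAspect] = field(default_factory=list)

    @dataclass
    class Indicator:
        """A reportable indicator, e.g. EN06, composed of indicator items."""
        indicator_id: int
        name: str
        formula: str
        items: List[IndicatorItem] = field(default_factory=list)

    @dataclass
    class IndicatorGroup:
        """A group of related indicators, e.g. 'Environmental indicators'."""
        group_id: int
        name: str
        indicators: List[Indicator] = field(default_factory=list)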
6 XBRL
XBRL, the eXtensible Business Reporting Language [17], is an XML-based markup language used to communicate financial and business data electronically. It is generally user-friendly because it does not require any previous XML or IT background. XBRL is used to encode financial information such as financial statements, earnings releases, and so forth. Using XBRL, financial information can be sorted and compared automatically and more easily. Because computers have no knowledge of financial reporting and do not understand information that is not fully defined, XBRL makes the information machine-readable.
Figure 5: Core data model of the reporting system. The model contains the entities Company, UserCompany, Report Type, Report Instance, IndicatorReportType, IndicatorReportGroup, Indicator group, Indicator, Indicator Item, Indicator Item Aspect, Indicator Value, Data Type and Period.
In other words, XBRL is a tool that benefits all users of financial statements by providing
increased functionality over traditional paper, Hyper Text Markup Language (HTML), and
other image-based financial reporting formats. The markup codes used in XBRL describe
financial data in a format that computers can classify, sort, and analyze. The analysis of these financial data relies on pre-existing names that standardize business-reporting terminology. For example, different companies usually use different terms to describe the same concept (accounts receivable, receivables, and trade [accounts] receivable). XBRL normalizes
company-specific terminology to get to the root business reporting concepts, allowing
comparison of essentially similar information across multiple financial statements and among
multiple companies.
XBRL is not meant to change how, what, or when financial information is reported, nor does it change financial reporting disclosure requirements. It simply provides a standard mechanism for computers to understand and compare financial data by transmitting financial information in a way that leverages computer technology.
XBRL assigns structural and organizational characteristics to electronic documents, which can then be used on the Internet and in other web content. According to the current standard, XBRL 2.1 [18], published by XBRL International, to perform tagging preparers must have a collection of previously defined elements that provide the definitions computers use to understand financial statement data. These collections of defined elements are called taxonomies. To create XBRL-encoded financial statements, preparers match each piece of information from the financial statements to an element in the taxonomy. During this matching process, known as mapping, preparers note where the taxonomy differs from the financial statements. In such cases, preparers extend the XBRL US GAAP Taxonomy by relabeling elements, rearranging elements, or creating new elements to match the financial statements. Once preparers have a fully mapped financial statement and have extended the XBRL US GAAP Taxonomy to match it, they can tag each piece of financial statement information to create an XBRL-encoded financial statement known as an instance document.
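To make the structure of an instance document concrete, the following sketch builds a single tagged fact with Python's standard library. The xbrli namespace and the context/unit/fact layout follow the XBRL 2.1 instance structure; the "acme" taxonomy, the ValueAdded element and all figures are hypothetical.

    import xml.etree.ElementTree as ET

    # Root of the instance document; xbrli is the XBRL 2.1 instance namespace,
    # "acme" stands for a hypothetical company extension taxonomy.
    root = ET.Element("xbrli:xbrl", {
        "xmlns:xbrli": "http://www.xbrl.org/2003/instance",
        "xmlns:iso4217": "http://www.xbrl.org/2003/iso4217",
        "xmlns:acme": "http://example.com/taxonomy",
    })

    # Context: who reports and for which period.
    context = ET.SubElement(root, "xbrli:context", {"id": "FY2013"})
    entity = ET.SubElement(context, "xbrli:entity")
    ET.SubElement(entity, "xbrli:identifier",
                  {"scheme": "http://example.com/companies"}).text = "ACME"
    period = ET.SubElement(context, "xbrli:period")
    ET.SubElement(period, "xbrli:startDate").text = "2013-01-01"
    ET.SubElement(period, "xbrli:endDate").text = "2013-12-31"

    # Unit: the fact below is a monetary amount in CZK.
    unit = ET.SubElement(root, "xbrli:unit", {"id": "CZK"})
    ET.SubElement(unit, "xbrli:measure").text = "iso4217:CZK"

    # The tagged fact: one piece of financial statement information mapped
    # to a (hypothetical) taxonomy element.
    fact = ET.SubElement(root, "acme:ValueAdded",
                         {"contextRef": "FY2013", "unitRef": "CZK", "decimals": "0"})
    fact.text = "12500000"

    print(ET.tostring(root, encoding="unicode"))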
7 Conclusion
In this paper, we assessed the current state of the art of sustainability indicators and of the information and communication technologies (ICTs) used for measuring enterprise performance and ensuring its sustainability in the food and agriculture sectors of the EU economy, from both a theoretical and a practical point of view. All the indicators presented are based on the SAFA and GRI G3.1 guidelines. The new GRI G4 Guidelines have recently been launched, so we presented a brief overview of GRI G4 and explained its new requirements. The SAFA system, with its levels, principles and scopes, was explained. The eXtensible Business Reporting Language (XBRL), which is used to compare financial data and make it machine-readable, was introduced.
Future work will concentrate on designing economic performance indicators in relation to environmental, social and corporate governance indicators according to the new GRI G4 guidelines. We will develop advanced methods that identify key performance indicators for the ESG performance of a chosen company, with the possibility of utilizing information and communication technology and an XBRL taxonomy for corporate sustainability reporting. We will also develop an ICT infrastructure with powerful ICT tools to support rapid knowledge-based decision making for sustainable development, integrating information from various environmental, social and corporate governance impacts related to the economy.
8 References
[1] Chvátalová, Z., Kocmanová, A., Dočekalová, M., 2011: Corporate Sustainability
Reporting and Measuring Corporate Performance, In Environmental Software
Systems. Frameworks of eEnvironment. 9th IFIP WG 5.11 International Symposium.
ISESS 2011. Heidelberg: Springer, 398–406, ISBN 978-3-642-22284-9.
[2] Hřebíček, J., Soukopová, J., Štencl, M., Trenz, O., 2011: Integration of Economic,
Environmental, Social and Corporate Governance Performance and Reporting in
Enterprises. Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis
59, 7: 157–177, ISSN 1211-8516.
[3] Sustainability Reporting Guidelines: Construction and Real Estate Sector Supplement based on GRI G3.1. [online] Available: https://www.globalreporting.org/reporting/sector-guidance/sector-guidance.
[4] ISO 9000, ISO 14000, OHSAS 18000 documentation. [online] Available: http://www.isohelpline.com/ims_iso_9000_iso_14000_ohsas_18000_document_manual.htm
[5] Project GACR P403/11/2085, Construction of Methods for Multifactor Assessment of Company Complex Performance in Selected Sectors. [online] Available: http://www.gacr403.cz/en/
[6] Hřebíček, J., Popelka, O., Štencl, M., Trenz, O., 2012: Corporate performance indicators for agriculture and food processing sector. Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis.
[7] European Business: Facts and Figures. Eurostat statistical books (2009). [online] Available: http://epp.eurostat.ec.europa.eu/portal/page/portal/european_business/publications/facts_figures
[8] Kocmanová, A., Dočekalová, M., Němeček, P., Šimberová, I.: Sustainability: Environmental, Social and Corporate Governance Performance in Czech SMEs. In: The 15th World Multi-Conference on Systemics, Cybernetics and Informatics. 2011. IFSR, Orlando, USA: WMSCI 2011, 94–99, ISBN 978-1-936338-42-9.
[9] G4 Guidelines, 2013: G4 Guidelines: Reporting principles and standard disclosures,
[online] Available: https://www.globalreporting.org/reporting/g4/.
[10] G4 Guidelines, 2013: G4 sustainable reporting Guidelines: Implementation manual,
[online] Available: https://www.globalreporting.org/reporting/g4/.
[11] G3.1, 2011: G3.1 Guidelines. [online] Available: http://www.globalreporting.org/ReportingFramework/G31Guidelines/
[12] SAFA Sustainability Assessment of Food and Agriculture systems. SAFA Guidelines version 2.0. [online] Available: http://www.fao.org/nr/sustainability/sustainability-assessments-safa/en/
[13] Popelka, O., Hodinka, M., Hřebíček, J., Trenz, O. Information System for Global
Sustainability Reporting. In Environmental Software Systems. Fostering Information
Sharing, 10th IFIP WG 5.11 International Symposium, ISESS 2013, Neusiedl am See,
Austria, October 9-11, 2013. Proceedings. Berlin Heidelberg : Springer, 2013. ISBN
978-3-642-41151-9, pp. 630-640. 9.10.2013, Neusiedl am See, Austria.
[14] Hřebíček, J., Soukopová, J., Štencl, M., Trenz, O., 2011: Corporate Key Performance
Indicators for Environmental Management and Reporting. Acta Universitatis
Agriculturae et Silviculturae Mendelianae Brunensis 59, 2: 99–108, ISSN 1211-8516.
[15] Hodinka, M., Štencl, M., Hřebíček, J., Trenz, O.: Current trends of corporate
performance reporting tools and methodology design of multifactor measurement of
company overall performance. Acta univ. agric. et silvic. Mendel. Brun., 2012, LX,
No. 2, pp. 85–90.
[16] Popelka, O., Hodinka, M., Hřebíček, J., Trenz O. Information system for corporate
performance assessment and reporting. Proceedings of the 2013 International
Conference on Systems, Control, Signal Processing and Informatics (SCSI 2013).
EUROPMENT, 2013. ISBN 978-1-61804-204-0, pp. 247-253. 16.7.2013, Rhodes
Island, Greece
[17] XBRL, 2012: An Introduction to XBRL. [online] [cit. 2012-24-3] Available: http://www.xbrl.org/WhatIsXBRL/
[18] Homer, B., Hamscher, W., Westerlund, L., Illuzzi, K.L., Cohen, L.P., and Cohen, D.:
XBRL US GAAP Taxonomy Preparers Guide. XBRL US, 2008.
Curriculum planning and construction:
How to combine outcome-based approach with process-oriented paradigm
Martin Komenda1,2, Lucie Pekárková2
1 Institute of Biostatistics and Analyses, Kamenice 126/3, 625 00, Brno
2 Faculty of Informatics, Botanická 68a, 602 00, Brno
komenda@iba.muni.cz, xpekark@fi.muni.cz
Abstract
The paper discusses a proposal for an innovative combination of two different approaches used during the curriculum planning lifecycle. Outcome-based and process-oriented paradigms are introduced with regard to the need to replace current models in tertiary education. Outcome-based education helps teachers tell students more precisely what is expected of them and is being implemented in many institutions. On the other side, the process-oriented paradigm can strongly enhance the design of curriculum composition. The basic goal of this contribution is to highlight the challenge of finding a general concept for planning and constructing the curriculum and course structure, grounded in the theoretical background of learning.
Abstrakt
Příspěvek pojednává o kombinaci dvou různých přístupů používaných během plánování životního cyklu výuky. Představuje procesně orientovaný přístup a přístup založený na výstupech z učení s ohledem na opakovaně akcentovanou potřebu změny současně používaných modelů v terciárním vzdělávání. Přístup založený na výstupech z učení pomáhá pedagogům jasně formulovat, co od studentů vlastně očekávají, a i proto je využíván na mnoha univerzitách. Na druhé straně stojí procesní přístup, který posiluje návrh skladby celého kurikula. Základním cílem příspěvku je v souladu s teoretickými principy procesu učení zdůraznit nové výzvy v podobě hledání obecného konceptu pro plánování a konstrukci kurikula, a to včetně struktury jednotlivých kurzů.
Key words
Outcome-based approach, process-based approach, curriculum planning, patterns, process
modeling
Klíčová slova
Přístup založený na výstupech z učení, procesně orientovaný přístup, plánování učiva, vzory,
procesní modelování
1 Introduction
In recent years, innovation of the curriculum has gained increasing attention especially in the
tertiary sphere. The need for a more fundamental paradigm shift in education has been accented.
There appears to be some space for replacing the current model of the educational process [1].
An appropriate curriculum structure is required, so universities and other stakeholders have
invested a great deal of time and funding into a detailed curriculum review process and
subsequent optimization. Curriculum planning and development is very much on today’s
agenda for undergraduate, postgraduate and continuing education. The whole curriculum is
more than just a syllabus or a statement of content. It is about what should happen in a teaching
program – about the intentions of the teachers and about the way they make these happen. This
extended vision of a curriculum is illustrated in Fig. 1 [2]. Many innovative ways to improve curriculum planning have appeared; the following chapter describes in detail a method based on learning outcomes.
Fig. 1: A wider view of a curriculum [2].
1.1 Outcome-based approach
The new paradigm has introduced outcome-based education, a performance-based approach at
the cutting edge of curriculum development, which offers a powerful and appealing way of
reforming and managing education. The emphasis is on the product – what sort of university
graduates will be produced – rather than on the educational process. In outcome-based
education the learning outcomes are clearly and unambiguously specified. These determine the
curriculum content and its organization, the teaching methods and strategies, the courses
offered, the assessment process, the educational environment and the curriculum timetable.
Among the major advantages of adopting an outcome-based model for tertiary education are relevance, controversy, acceptability, provision of a framework, clarity, accountability, self-directed learning, guidance in assessment, participation in curriculum planning, and also its use as a tool for curriculum evaluation [3].
The concept of learning outcomes has been widely used in education research [4]. The paper
focuses on an educational area covering tertiary fields of study, so close attention has been paid
to experiences in university programs, where there are also demands for defining new aims of
innovation, such as: (i) the application of innovative thinking and approaches to the new
learning technologies including e-learning and virtual reality; (ii) the application of new
approaches to curriculum planning and advanced instructional design incorporating approaches
such as curriculum mapping, outcome-based education, the use of electronic study guides, peer-to-peer learning, dynamic learning and a bank of "reusable learning objects"; (iii) the provision
of a unique international perspective which takes into account the trend towards globalization
and offers benefits to society, to universities and to students; (iv) the development of a flexible
curriculum which meets the needs of different students [5]. Jenkins and Unwin [6] suggest that
learning outcomes help teachers to tell students more precisely what is expected of them and
also provide benefit to teachers themselves [7].
Fig. 2: A model for the curriculum emphasizing the importance
of educational outcomes in curriculum planning [7].
1.2 From theory to practice
This approach promotes user-friendly methods for planning, implementation and evaluation; it has been applied especially in medical and health care oriented fields of study, but not exclusively there. The
integration of learning outcomes into the curriculum can be presented in a number of ways
depending on the objectives and goals of individual study programs. For example, Brown
University described their learning outcomes as a list of nine abilities, such as effective
communication, basic clinical sciences, diagnosis, management, prevention, etc. [8]. The
English National Board of Nursing, Midwifery and Health [9] has identified ten key
characteristics as the basis for the learning outcomes required for the Higher Award. The
Association of American Medical Colleges in the USA has developed a set of goals for medical
education. These attributes were memorably stated in four groups: the physician must be
altruistic, knowledgeable, skillful, and dutiful [10].
The University of New South Wales has applied the outcome-based approach and has
introduced a new curriculum in which the capabilities of graduates are central to its design. In
developing its curriculum, the idea of “capabilities” was adopted from the Capability in Higher
Education movement in the United Kingdom. Twelve capabilities include not only knowledge
and skills but also the capacity to continue to learn from experience, to act in unfamiliar and
changing contexts, to be clear on professional purpose and to successfully function with others
in the workforce [11].
Schwarz [12] introduced the concept of global minimum essential requirements (GMER),
which present a set of global minimum learning outcomes for graduates of medical schools.
This approach was developed by the Institute for International Medical Education (IIME) and does
not imply a global uniformity of medical curricula and educational processes. Medical schools
should adopt their own particular curriculum design, but in so doing they should first assure
that their graduates will possess the core competences stated in the GMER document and,
second, the competences necessary to meet the unique healthcare needs of the area they serve.
Simpson et al. [13] described a set of learning outcomes that clearly define the abilities of medical
graduates from any of the five Scottish medical schools. The outcomes are divided into twelve
domains that fit into one of three essential elements for the competent and reflective medical
practitioner. They are similar to the seven domains proposed by the IIME. In the Scottish
approach more emphasis is given to the relationship between the outcomes and to the
integration of the knowledge, skills and attitudes in the practice of medicine.
Fig. 3: Desirable capabilities of graduating medical students
at the University of New South Wales [11].
Bloch and Bürgi [14] introduced the implementation of the Swiss Catalogue of Learning
Objectives for Undergraduate Medical Training, which was inspired by the Dutch model [15]
for training doctors based largely on the traditional disciplines. The Dutch blueprint more
clearly reflected the discipline-based nature of medical academe in Switzerland. The Swiss
learning objectives represent a more discipline-based view of medical education than that found
in the IIME or the Scottish outcomes. The first section of the catalogue sets out the profile of
the doctor by the end of undergraduate education under four headings. The second section is
subdivided into medical aspects, scientific aspects, personal aspects and aspects related to
society and the healthcare system, each with several further subheadings. A third section of the
report lists 283 problems in medicine. These are defined as a complex of complaints, signs and
symptoms, which may lead a patient to seek medical counsel [16].
2 Challenges and objectives
Outcome-based education is being implemented in many institutions despite controversy and
debate about whether learning outcomes can be clearly defined. The curriculum represents
a multifaceted concept at a variety of levels and the chosen form always depends on individual
institution requirements. Obviously a lot of academic institutions across the world have been
using the concept of outcome-based curricula in different ways and in various forms. The goal
is still the same: to improve the current educational process. Precise curriculum reform requires not only sophisticated planning and an appropriate theoretical approach; we also have to take into consideration the physical format and structure of every individual course, which represents the basic building unit of the common curriculum. A challenge thus appears: to find a general paradigm for designing the curriculum structure, grounded in the theoretical background of learning and teaching. The second need has already been fulfilled by the outcome-based approach. This paper introduces a way of creating a new curriculum arrangement with the use of the process-oriented paradigm.
2.1 Process-based approach
Curriculum structure is like the skeleton of a body: strong and in balance, giving direction and
support to activities and determining the outline of what it represents. It needs careful
consideration and planning. The characteristics of the ‘skeleton’ may be determined by the
method of teaching, e.g. outcome-based learning [17]. We describe the structured methodology
named process-based approach, which can strongly enhance the process of curriculum
composition design. This approach provides guidelines which were subsequently used to
inform the development of a process-based approach to performance measurement system
design [18]. From the tertiary curriculum perspective, it considers every instance of a course as
a process. A process is a repetitive sequence of interdependent and linked procedures with the
beginning and the end, which, based on one or more inputs, produces demanded outputs. So,
the process is a sequence of tasks, which relate to each other and transform inputs (students'
and teachers' effort) into desired outputs (students with gained knowledge, their grades or
certificates). Of course, there must be a reason to compare courses to processes. When the behavior is defined by basic checkpoints, e.g. the beginning, the end, and the sequence of tasks leading from beginning to end, it becomes very predictable, and therefore it is simpler to monitor and evaluate the processes in the course. That also means that the whole learning process/course itself
can be automated and optimized. The introduction of roles, definition of rules, procedures,
repeatable processes and some principles of process management can contribute to
a comprehensive learning environment. In the business world, the business process
management (BPM) implementation leads to one major goal, which is cost reduction and profit
improvement. In the e-learning domain the goal could be higher efficiency and quality of
education. The implementation and automation of processes in e-learning manifests as a combination of direct and indirect effects [19] that lead to further improvement of the learning process.
The direct effects are:
 students have clearly defined tasks and duties, including deadlines, which supports continuous work;
 students can share information easily thanks to a knowledge base, a space for sharing and storing educational materials;
 students have instant access to comparisons with other students in desired situations, which provides positive motivation.
The indirect effects are:
 the number of administrative tasks is reduced and automated, which supports team work in the tutor team and leads to an even distribution of duties and competences; the tutor then does not need to worry about administration and has more time to prepare the content;
 the tutor has an instant overview of the situation status and of completed and uncompleted tasks, so the tutor team can react flexibly to the actual situation;
 efficiency can be evaluated continuously and the learning process optimized.
Finding learning processes is a pedagogical problem: every course or lecture needs a different approach. A good learning process often involves many activities that need to be done on both the tutor's and the student's side, and the complexity of organizing the learning process grows and needs to be managed. The most obvious case of a process in the university environment is a course. A course is taught repeatedly and its structure remains largely the same. Modeling courses that mix e-learning class work, teamwork and other activities is very useful. In the case of a really rich and interactive course, the organization becomes too complex and starts to take too much time to prepare. At this point the process-oriented approach can help to automate certain tasks and at the same time keep the organizational complexity under control. On the other hand, modeling each process from scratch is time-consuming; in general, however, courses consist of many identical sequences of activities (such as teamwork, assignment evaluation, etc.). As the process approach emphasizes re-usability with the help of global sub-processes, it is possible to identify certain sub-processes in the learning process that can be reused later – the learning patterns [20], [21].
Patterns are used to capture best practices from the educational area. The implementation of such patterns could ease the design and optimization of learning processes and also increase the quality of education. On the other hand, the formalization of patterns through processes can lead to further discussion of the whole learning process. The tutor chooses suitable patterns and, by compiling single patterns together, creates the whole course. Compiled courses have to be functional and make sense also from the technical point of view. An educational pattern is a small reusable sequence of steps included in one learning activity. Patterns should be used as building blocks for more complex courses [22]. Each pattern should define a general standard structure of the activity that can be extended according to the specific needs of other (re-)users, i.e. it should be customizable.
One of the modeled processes is illustrated in Fig. 4. A colloquium is a method of examination in which a small group of students discusses the gained knowledge with the teacher. Each student also defends an elaborated essay. The process starts when the teacher publishes topics for essays; each student is then supposed to elaborate his or her own final essay, submit it, afterwards read the essays of two colleagues, and write a report which will be discussed at the colloquium. The process ends when the teacher submits the evaluation.
Fig. 4: Example of modeled "Colloquial exam" process
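Because the colloquium is a fixed sequence of tasks with defined roles, it can be expressed directly as data. The following minimal Python sketch lists the tasks in order; the task names paraphrase the description above, and the structure is our illustration, not the MEDUSY model format.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Task:
        name: str
        role: str  # who performs the task

    # The "Colloquial exam" pattern as an ordered sequence of tasks.
    COLLOQUIUM: List[Task] = [
        Task("Publish essay topics", "teacher"),
        Task("Elaborate and submit the final essay", "student"),
        Task("Read the essays of two colleagues", "student"),
        Task("Write a report on the reviewed essays", "student"),
        Task("Discuss essays and reports at the colloquium", "all"),
        Task("Submit the evaluation", "teacher"),
    ]

    # Because the sequence and its checkpoints (beginning, end, order) are
    # explicit, progress through the process can be monitored automatically.
    for i, task in enumerate(COLLOQUIUM, start=1):
        print(f"{i}. [{task.role}] {task.name}")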
This model was created within the project MEDUSY (Multi-Purpose EDUcation SYstem1), which focuses on the development of a rich e-learning environment that will enable wide integration of various e-learning tools. The project tries to leverage some of the e-learning best practices used in modern learning management systems (LMSs) and introduces a process-oriented approach to course development and education flow in general. A functional prototype of a process-based e-learning application, a tool for learning process modeling, and the first building blocks of a pattern repository have been created. A prototype of the engine was also designed, which made it possible to analyze the possibilities of process management and pattern utilization in e-learning/blended learning. Integrating monitoring and evaluation tools into the project will follow.
1 http://medusy.sf.net/
3 Conclusions
The combination of the outcome-based approach with a process-based architecture can significantly help to find a way to plan, or restructure, the curriculum. The key fact is that curriculum designers' attention is focused not only on the formulation of the concrete required outputs that make up the minimum level of a graduate's knowledge, but also on the design of individual courses. These can be designed in many different ways, but courses always consist of clearly defined building blocks, the learning patterns, that are mutually connected and form a comprehensive learning unit (the course). Due to the wide utility of this methodology, its practical applicability in LMSs is an interesting stimulus for the future. An indisputable advantage of this approach is the possibility of monitoring and subsequent analytical processing of information with a clear purpose: to improve the quality of tertiary education. The interconnection of two seemingly different approaches provides standardization not only of the learning outcomes, but also of the learning patterns.
4 References
[1] R. M. Harden, “Looking back to the future: a message for a new generation of medical
educators,” Med. Educ., vol. 45, no. 8, pp. 777–784, 2011.
[2] R. M. Harden, “Approaches to curriculum planning,” Med. Educ., vol. 20, no. 5, pp. 458–
466, 1986.
[3] R. M. Harden, J. R. Crosby, and M. H. Davis, "AMEE Guide No. 14: Outcome-based education: Part 1 – An introduction to outcome-based education," Med. Teach., vol. 21, no. 1, pp. 7–14, 1999.
[4] Y. Mong, M. Chan, and F. K. H. Chan, “Web-based outcome-based teaching and learning
- An experience report,” in Advances in Web Based Learning - Icwl 2007, vol. 4823, H.
Leung, F. Li, R. Lau, and Q. Li, Eds. Berlin: Springer-Verlag Berlin, 2008, pp. 475–483.
[5] R. M. Harden and I. R. Hart, “An international virtual medical school (IVIMEDS): the
future for medical education?” Med. Teach., vol. 24, no. 3, pp. 261–267, May 2002.
[6] Jenkins and D. Unwin, “Writing learning outcomes for the Core Curriculum,” NCGIA
GISCC Learn. Outcomes, 1996.
[7] R. M. Harden, “AMEE Guide No. 14: Outcome-based education: Part 1-An introduction
to outcome-based education,” Med. Teach., vol. 21, no. 1, pp. 7–14, 1999.
[8] Smith S. R. and Dollase R., "AMEE Guide No. 14: Outcome-based education: Part 2 – Planning, implementing and evaluating a competency-based curriculum," Med. Teach., vol. 21, no. 1, pp. 15–22, 1999.
[9] "Ten Key Characteristics of the ENB Higher Award," ENB Publications, 1991.
[10] AAMC, "Guidelines for Medical Schools: Report 1 – Learning Objectives for Medical Student Education," 1998.
[11] S. B. Dowton, “Imperatives in medical education and training in response to demands for
a sustainable workforce,” Med. J. Aust., vol. 183, no. 11/12, pp. 595–8, Dec. 2005.
[12] M. R. Schwarz and A. Wojtczak, “Global minimum essential requirements: a road towards
competence-oriented medical education,” Med. Teach., vol. 24, no. 2, pp. 125–129, 2002.
[13] “The Scottish doctor—learning outcomes for the medical undergraduate in Scotland”
[14] R. Bloch and H. Bürgi, “The Swiss catalogue of learning objectives,” Med. Teach., vol.
24, no. 2, pp. 144–150, 2002.
[15] J. C. Metz, A. M. M. Verbeek-Weel, H. J. Huisjes, and A. van de Wiel, “Blueprint 2001:
training of doctors in The Netherlands,” Utrecht NFU, 2001.
[16] “Developments in outcome-based education.”
[17] R. Oliver, H. Kersten, H. Vinkka-Puhakka, G. Alpasan, D. Bearn, I. Cema, E. Delap, P.
Dummer, J. P. Goulet, T. Gugushe, E. Jeniati, V. Jerolimov, N. Kotsanos, S. Krifka, G.
Levy, M. Neway, T. Ogawa, M. Saag, A. Sidlauskas, U. Skaleric, M. Vervoorn, and D.
White, “Curriculum structure: principles and strategy,” Eur. J. Dent. Educ., vol. 12, pp.
74–84, 2008.
[18] A. Neely, J. Mills, K. Platts, H. Richards, M. Gregory, M. Bourne, and M. Kennerley,
“Performance measurement system design: developing and testing a process-based
approach,” Int. J. Oper. Prod. Manag., vol. 20, no. 10, pp. 1119–1145, 2000.
[19] D. Tovarňák, “Procesní řízení v e-learningu,” 2011.
[20] O. Marjanovic, “Towards A Web-Based Handbook of Generic, Process-Oriented Learning
Designs.,” Educ. Technol. Soc., vol. 8, no. 2, pp. 66–82, 2005.
[21] T. Kolář and T. Pitner, “Process-based approach to e-learning educational processes and
patterns,” in Sborník sedmého ročníku konference o e-learningu – SCO 2011, 2011.
[22] P. Avgeriou, A. Papasalouros, S. Retalis, and M. Skordalakis, “Towards a pattern language
for learning management systems,” Educ. Technol. Soc., vol. 6, no. 2, pp. 11–24, 2003.
Information Technology Tools for Current Corporate
Performance Assessment and Reporting
Ondřej Popelka, Michal Hodinka, Jiří Hřebíček, Oldřich Trenz
Department of Informatics, Faculty of Business and Economics
Mendel University
Zemědělská 1, 61300 Brno, Czech Republic
ondrej.popelka@mendelu.cz, michal.hodinka@mendelu.cz, hrebicek@mendelu.cz,
oldrich.trenz@mendelu.cz
Abstract
Current trends of corporate performance assessment (measurement of economic/financial,
environmental, social and governance key performance indicators) and information and
communication technologies (ICT) tools for corporate reporting are discussed in the paper. We
focus on corporate performance assessment particularly in the agricultural and food processing
sector. The aim of the paper is the introduction of an information system prototype proposal for
corporate sustainable reporting designed for small and medium-sized enterprises (SME). We
describe a relational data model of this information system with several abstract entities to
represent miscellaneous differences in various corporate reporting frameworks. Our goal is to
design a generic information system of corporate sustainable reporting which may be used by
SMEs to start with their corporate performance assessment and reporting. The paper describes
the overall structure of the information system, along with the core data model and an example
of possible extensions into XBRL-based reporting and business intelligence.
Abstrakt
V příspěvku jsou diskutovány současné trendy hodnocení podnikové výkonnosti (měření ekonomických a finančních, environmentálních, sociálních, správních a řídících klíčových ukazatelů výkonnosti) a nástrojů informačních a komunikačních technologií (ICT) pro podnikový reporting. Zaměřujeme se na hodnocení podnikové výkonnosti zejména v odvětví zemědělství a zpracování potravin. Cílem článku je představení návrhu prototypu informačního systému pro podnikový reporting pro malé a střední podniky (MSP). Popisujeme relační datový model tohoto informačního systému s několika abstraktními entitami představujícími rozdíly v různých rámcích podnikových reportů. Naším cílem je navrhnout obecný informační systém pro podnikový reporting, který mohou malé a střední podniky používat k hodnocení své výkonnosti a vytváření reportů. Článek popisuje strukturu celého informačního systému, jádro datového modelu a příklad možného rozšíření na bázi XBRL reportů a využití business intelligence.
Keywords
Corporate performance, Information system, GRI, XBRL, SAFA, Corporate reporting,
Performance assessment, Key performance indicators, Business Intelligence
Klíčová slova
Podniková výkonnost, informační systém, GRI, XBRL, SAFA, podnikový reporting, hodnocení
výkonnosti, klíčové ukazatele výkonnosti, Business Intelligence
1 Introduction
The research teams of the Faculty of Business and Management (FBM) of Brno University of
Technology (BUT) and the Faculty of Business and Economics (FBE) of Mendel University in
Brno (MENDELU) have been taking part in the research project No P403/11/2085 “Construction
of Methods for Multi-factorial Assessment of Company Complex Performance in Selected
Sectors" since January 2011. The project runs through the years 2011–2013 and is funded by the Czech Science Foundation. The main research goals of this project are specified by its six
partial research targets [6], [10], [11], [15]:
 The analysis of the state-of-the-art of economic, environmental, social and governance
aspects of corporate performance through targeted research of the global information
and database sources available at the FBM BUT and the FBE MENDELU using the
available Information and Communication Technologies (ICTs) tools.
 A detailed analysis of the implementation of economic, environmental, social and
governance reporting in chosen economic activities and its justification.
 Assessment, analysis, and the categorization of contemporary characteristics of the
individual pillars of corporate performance in relation to the measure of progress or
dynamics of the development of overall corporate performance.
 The identification of the importance and relative roles of Environmental, Social and
Governance (ESG) factors using ESG data and Key Performance Indicators (KPIs) in
the overall company performance.
 The construction of quantitative and qualitative methods of the multifactor
measurement of corporate performance in the chosen economic activities with the use
of ICT tools.
 The application of the developed methods for multifactor measurement of corporate
performance of chosen economic activities in practice, with feedback for possible
change correction aimed at further improvement.
We will discuss the current state of corporate sustainability reporting following G4
Sustainability Reporting Guidelines [21] and SAFA Guidelines version 2.0 [18] in the paper.
Further, we introduce a developed prototype of a specialized corporate reporting information system [22] for small and medium-sized enterprises (SMEs). We also present the two primary goals of the prototype implementation: verifying that the proposed corporate performance indicators can be implemented successfully, and verifying the usability of this corporate reporting information system for
end-users. Though our main goal is the implementation of the SAFA Guidelines methodology [18],
we have currently implemented the methodology SAGROS (Sustainable AGROSystems) [16] for
assessing the sustainability of crop production systems for the conditions of the Czech Republic.
2 Corporate Reporting
Reporting on sustainability and environmental, social and governance (ESG) performance is
a crucial step towards a market that rewards the creation of long-term wealth in a just and
sustainable society [30]. Sustainability key performance indicators can play a crucial role in
supporting markets that create such long term wealth [9]. They can form the backbone of
sustainability disclosure that tracks, and allows for improvement on, those issues most tied to
a corporation’s environmental and social impact, and are most material to a company’s financial
performance [11], [14], [23].
Nowadays, we can see a significant change in how reporting systems are evolving. They are moving from use by a few analysts to ubiquitous use across many organizational functions. We focus on the development of an information system accessible to small and medium-sized enterprises. Large
corporations usually already have some kind of reporting system implemented. SMEs usually
perform only the mandatory reporting required by laws, basically in order to fulfill regulatory
demands. Our goal is to facilitate performance reporting even for SMEs and make it as easy as
possible. This should help enhance the strategic decision making and planning of SMEs, which in
turn will help to improve the risk management and sustainability of the business and its
environment.
The Food and Agriculture Organization of the United Nations (FAO) is developing the
Sustainability Assessment of Food and Agriculture systems (SAFA) guidelines [18] to assist the
achievement of fair practices in food and agriculture production and trade on a local and regional
level. The SAFA framework is the result of an iterative process, built on the cross-comparisons of
codes of practice, corporate reporting, various standards, indicators and other technical protocols
currently used by food and agricultural enterprises that implement sustainability assessment.
The SAFA Guidelines are based on certain core methodological principles including Bellagio
Stamp [24]. Additionally, SAFA draws upon the ISO standards for Life Cycle Assessment (ISO
14040:2009), the ISEAL Code of Good Practice [12], [25], the Reference Tools of the Global
Social Compliance Program (GSCP) [26] and the GRI Sustainability Reporting Guidelines G3 [4],
G3.1 [3] and G4 [21] and its Food Processing Sector Supplement [32].
The guiding vision of SAFA is that food and agriculture systems worldwide be characterized by
environmental integrity, economic resilience, social well-being and good governance. In recent
years, there has been some progress in the realization and acknowledgement of sustainable
development, which is summarized in [5], [7], [8] and [9]. Many stakeholders in the agriculture
sector have contributed to this progress by improving agricultural productivity, protecting natural
resources and human resources by implementing frameworks and standards for assessing and
improving sustainability across the agricultural sector and along the value chain [18].
The SAFA guidelines consist of three core sections – the SAFA framework, the SAFA
implementation and the Sustainability dimensions [18]. In the following chapters, we describe
a special information system, whose role may be divided into several main steps:
 collecting data;
 reporting in standardized format for the purpose of stakeholders and required regulatory
demands;
 reporting in XBRL standardized interchange format;
 customized reporting;
 evaluating company performance by computing standardized key performance
indicators.
These steps correspond to the last two steps of Section 2 of the SAFA framework implementation (indicators, reporting). Covering the first two steps (mapping, contextualization) seems impractical at present due to the highly specific nature of the problem for each reporting SME or corporation. Therefore, these steps are not considered a part of the proposed information system.
One of our goals is the development of a special information system or information system module
for SME performance reporting. These organizations would be able to use such a system to
generate standardized reports and assess their corporate performance.
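As an illustration of how the reporting steps listed above could fit together, the following minimal Python sketch computes one standardized KPI from collected raw data and renders a simple report. All names, figures and the output format are assumptions made for illustration; they are not the API of the actual prototype.

    from typing import Callable, Dict

    # Step "collecting data": raw figures gathered from company records
    # (all values invented for illustration).
    raw_data: Dict[str, float] = {
        "renewable_mwh": 120.0,
        "total_mwh": 800.0,
    }

    # Step "evaluating company performance": standardized KPI formulas.
    KPI_FORMULAS: Dict[str, Callable[[Dict[str, float]], float]] = {
        "EN06 renewable energy share [%]":
            lambda d: d["renewable_mwh"] * 100.0 / d["total_mwh"],
    }

    def standardized_report(data: Dict[str, float]) -> str:
        """Step "standardized/customized reporting": render the computed KPIs."""
        lines = ["indicator;value"]
        for name, formula in KPI_FORMULAS.items():
            lines.append(f"{name};{formula(data):.2f}")
        return "\n".join(lines)

    print(standardized_report(raw_data))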
3 Overview of existing performance indicators
3.1 SAFA methodology
Because we concentrate mainly on agricultural organizations, the SAFA methodology [18] is
highly important to us. To allow a greater general scope of the information system for SMEs we
have also considered and evaluated other performance methodologies and their indicators. There
are several levels of SAFA, which are nested to enhance coherence at all levels (see Figure 1). The
SAFA Framework begins with the high-level overarching dimensions of sustainability: good
governance, environmental integrity, economic resilience and social well-being. It is recognized
that these dimensions are broad and encompass many aspects. These are translated into
a universally agreed definition of sustainability, through themes and sub-themes for each of the
sustainability pillars. In total there are 21 themes and 56 sub-themes. These are measurable and
verifiable through a set of 112 core indicators applicable to food and agriculture supply chains.
SAFA Guidelines [18] provide the guidance for the application (calculation) of these indicators.
Fig. 1: SAFA Guidelines structure [18]
SAFA defines 112 core indicators across the 56 sub-themes; they identify the measurable criteria for sustainable performance for each sub-theme (Table 1). Core indicators are applicable at the macro level, meaning to all enterprise sizes and types and in all contexts. Core indicators serve the purpose of providing standardized metrics to guide future assessments of sustainability. The set of core indicators is needed for a general level of reporting, as SAFA users do not necessarily have the knowledge to develop indicators themselves without the risk of lowering the bar of the assessment. An overview of core indicators per sub-theme and theme is given for good governance and environmental integrity in Tables 1 and 2. Core indicators for the economic resilience and social well-being dimensions [18] are not presented because of the extent of the paper.
Dimension G: GOOD GOVERNANCE
Theme | Sub-theme | Core indicators
G1 Corporate Ethics | G 1.1 Mission Statement | G 1.1.1 Mission explicitness; G 1.1.2 Mission driven
G1 Corporate Ethics | G 1.2 Due Diligence | G 1.2.1 Due diligence
G2 Accountability | G 2.1 Holistic Audits | G 2.1.1 Holistic audits
G2 Accountability | G 2.2 Responsibility | G 2.2.1 Responsibility
G2 Accountability | G 2.3 Transparency | G 2.3.1 Transparency
G3 Participation | G 3.1 Stakeholder Dialogue | G 3.1.1 Stakeholder identification; G 3.1.2 Stakeholder engagement; G 3.1.3 Engagement barriers; G 3.1.4 Effective participation
G3 Participation | G 3.2 Grievance Procedures | G 3.2.1 Grievance procedures
G3 Participation | G 3.3 Conflict Resolution | G 3.3.1 Conflict resolution
G4 Rule of Law | G 4.1 Legitimacy | G 4.1.1 Legitimacy
G4 Rule of Law | G 4.2 Remedy, Restoration and Prevention | G 4.2.1 Remedy, restoration and prevention
G4 Rule of Law | G 4.3 Civic Responsibility | G 4.3.1 Civic responsibility
G4 Rule of Law | G 4.4 Resource Appropriation | G 4.4.1 Free, prior and informed consent; G 4.4.2 Tenure rights
G5 Holistic Management | G 5.1 Sustainability Management Plan | G 5.1.1 Sustainability management plan
G5 Holistic Management | G 5.2 Full-cost Accounting | G 5.2.1 Full-cost accounting
Table 1. Overview of core indicators per sub-theme and theme of good governance [18]
3.2 GRI methodology
In June 2013, the GRI released the G4 Sustainability Reporting Guidelines [21], updating the G3.1 Guidelines from 2011 [3], which had in turn updated and completed the G3 Guidelines from 2006 [4]. The new
G4 Sustainability Reporting Guidelines (the Guidelines) [21] offer Reporting Principles, Standard
Disclosures and an Implementation Manual for the preparation of sustainability reports by
organizations, regardless of their size, sector or location.
Dimension E: ENVIRONMENTAL INTEGRITY

- E1 Atmosphere
  - E 1.1 Greenhouse Gases (GHG): E 1.1.1 GHG reduction target; E 1.1.2 GHG mitigation practices; E 1.1.3 GHG balance
  - E 1.2 Air Quality: E 1.2.1 Air pollution reduction target; E 1.2.2 Air pollution prevention practices; E 1.2.3 Ambient concentration of air pollutants
- E2 Water
  - E 2.1 Water Withdrawal: E 2.1.1 Water conservation target; E 2.1.2 Water conservation practices; E 2.1.3 Ground and surface water withdrawals
  - E 2.2 Water Quality: E 2.2.1 Clean water target; E 2.2.2 Water pollution prevention practices; E 2.2.3 Concentration of water pollutants; E 2.2.4 Wastewater quality
- E3 Land
  - E 3.1 Soil Quality: E 3.1.1 Soil-improvement practices; E 3.1.2 Soil physical structure; E 3.1.3 Soil chemical quality; E 3.1.4 Soil biological quality; E 3.1.5 Soil organic matter content
  - E 3.2 Land Degradation: E 3.2.1 Land conservation and rehabilitation plan; E 3.2.2 Land conservation and rehabilitation practices; E 3.2.3 Net loss/gain of productive land
- E4 Biodiversity
  - E 4.1 Ecosystem Diversity: E 4.1.1 Landscape/marine habitat conservation plan; E 4.1.2 Ecosystem-enhancing practices; E 4.1.3 Structural diversity of ecosystems; E 4.1.4 Ecosystem connectivity; E 4.1.5 Land-use and land-cover change
  - E 4.2 Species Diversity: E 4.2.1 Species conservation target; E 4.2.2 Species conservation practices; E 4.2.3 Diversity and abundance of key species; E 4.2.4 Diversity of production
  - E 4.3 Genetic Diversity: E 4.3.1 Wild genetic diversity enhancing practices; E 4.3.2 Agro-biodiversity in-situ conservation; E 4.3.3 Locally adapted varieties/breeds; E 4.3.4 Genetic diversity in wild species; E 4.3.5 Saving of seeds and breeds
- E5 Materials and Energy
  - E 5.1 Material Use: E 5.1.1 Material consumption practices; E 5.1.2 Nutrient balances; E 5.1.3 Renewable and recycled materials; E 5.1.4 Intensity of material use
  - E 5.2 Energy Use: E 5.2.1 Renewable energy use target; E 5.2.2 Energy-saving practices; E 5.2.3 Energy consumption; E 5.2.4 Renewable energies
  - E 5.3 Waste Reduction and Disposal: E 5.3.1 Waste reduction target; E 5.3.2 Waste reduction practices; E 5.3.3 Waste disposal; E 5.3.4 Food loss and waste reduction
- E6 Animal Welfare
  - E 6.1 Health and Freedom from Stress: E 6.1.1 Integrated animal health practices; E 6.1.2 Humane animal handling practices; E 6.1.3 Species-appropriate animal husbandry; E 6.1.4 Freedom from stress; E 6.1.5 Animal health

Table 2. Overview of core indicators per sub-theme and theme of environmental integrity [18]
The Guidelines also offer an international reference for all those interested in the disclosure of
the governance approach and of the environmental, social and economic performance and impacts of
organizations. The Food Processing Sector Supplement (FPSS) [32] provided organizations in this
sector with a tailored version of the G3.1 Guidelines. It includes the original Guidelines, which
set out the Reporting Principles, Disclosures on Management Approach and Performance Indicators
for economic, environmental and social issues. Sector-specific categories on Sourcing and Animal
Welfare are included for the food-processing sector. The Supplement's additional commentaries and
Performance Indicators, developed especially for this sector, capture the issues that matter most
to food processors.
The G4 Guidelines consist of two parts:
1) Reporting Principles and Standard Disclosures and
2) Implementation Manual.
The first part of the Guidelines contains Reporting Principles, Standard Disclosures, and the criteria
to be applied by an organization to prepare its sustainability report “in accordance” with the
Guidelines. Definitions of key terms are also included. The second part contains explanations of
how to apply the Reporting Principles, how to prepare the information to be disclosed, and how to
interpret the various concepts in the Guidelines. References to other sources, a glossary and general
reporting notes are also included.
Standard Disclosures are divided as follows: General Standard Disclosures (Strategy and Analysis;
Organizational Profile; Identified Material Aspects and Boundaries; Stakeholder Engagement;
Report Profile; Governance; Ethics and Integrity) and Specific Standard Disclosures (Guidance
for Disclosures on Management Approach; Guidance for Indicators and Aspect-specific Disclosures
on Management Approach). The GRI Guidelines are probably the most comprehensive guide covering
both what information should be reported and how it should be provided.
3.3 Integrated reporting methodology
Integrated reporting as proposed by the International Integrated Reporting Council (IIRC) [27] is
a form of corporate reporting that provides a clear and concise representation of how an
organization creates value, now and in the future. The current definition of "integrated
reporting" according to the IIRC is "a process that results in communication, most visibly a
periodic 'integrated report', about value creation over time. An integrated report is a concise
communication about how an organization's strategy, governance, performance and prospects lead to
the creation of value over the short, medium and long term." Essentially, the goal is to promote
"integrated thinking" in a manner that tells the "corporate story" for investors: how financial
and non-financial reporting line up and how true performance is determined by considering all
aspects of day-to-day operations.
The IIRC has proposed the elements to be presented in an integrated report through the Integrated
Reporting Framework it is now developing. From 16 April 2013, for the 90 days leading up to
15 July 2013, the IIRC called on all stakeholders across the world to read the Consultation Draft
of the International <IR> Framework [28]. The IIRC asked stakeholders to understand it, challenge
it, critique it, and provide feedback on which parts work and which perhaps do not. The final
version of the Integrated Reporting Framework is to be published in the second half of 2013.
3.4 Other methodologies
There are a number of alternative reporting methodologies that focus more on what should be
reported rather than on how it should be reported. An example is the EU AEIs (Agri-Environmental
Indicators of the European Union). The defined EU AEIs [1] are not yet generally usable in
practical generic reporting for SMEs in the agricultural sector [9]. Some of the indicators
cannot be computed because of missing data; sometimes the required data are not homogeneous
enough [7]. Because these indicators are designed mainly for the purposes of the EU CAP (Common
Agricultural Policy of the EU), they should be considered usable only for special cases of farms.
As such, they are divided into completely different domains and subdomains, because the Driving
force – State – Response framework is used (responses, driving forces, pressures and benefits,
and state/impact). In addition, they focus solely on environmental monitoring and completely lack
both the economic and the corporate governance domains [8]. The same problems arise when applying
the indicators tracked by the European Environment Agency [13]. Again, these concentrate solely
on the environmental domain and are unsuitable for corporate performance reporting. The main
reason for evaluating these AEIs is that they are highly relevant for the development of ICT
tools capable of producing the required indicator values when necessary.
Another set of standardized performance indicators comes from the collaboration of the OECD,
Eurostat and the Food and Agriculture Organization (FAO) [17]. Although these indicators are not
based on the Driving force – State – Response framework, they are still primarily
agro-environmental indicators and are therefore insufficient for corporate sustainability
reporting.
Within the solution of the GACR project No. P403/11/2085, we have proposed a minimal set of
generic key performance indicators [10], [14], [19] and [23] applicable to a wide range of
enterprises. This forms one set of indicators considered for implementing and validating the
information system prototype. The second set of test indicators consists of the indicators
described in the Methodology for assessing the sustainability of crop production systems for the
conditions of the Czech Republic [9], [16], [20]. The rationale behind these choices is that we
want to test both simpler (and generic) reporting systems such as the one proposed by [14] and
corporate reporting systems of indicators such as SAFA [7], [18] or SAGROS (Sustainable
AGROSystems) [16], [20].
Our goal is to provide an architectural proposal for ICT tools generic enough to accommodate all
of the above-described methodologies. To implement such an information system successfully,
several abstractions are required. They need to be carefully constructed so that they are hidden
from the end user and do not add more complexity to the corporate reporting itself.
4 System requirements and objectives
In our vision, we foresee two main objectives of the corporate reporting information system. The
first is report creation: a company may use the system to create various reports and share data
both with its stakeholders and, in the future, with the state administration concerned with
regulatory demands and mandatory corporate reporting [29].
The second objective is corporate performance assessment, performed by evaluating key performance
indicators by means of one of the proposed methodologies. This way, the company can share its
performance with the public or check the progress of its performance development. Corporate
sustainability performance can be evaluated through the various reports stated above or through
custom dashboards.
We define the following three main generic actors:
- Reporter – a person responsible for ensuring the company's mandatory reporting (especially financial, social and environmental) or reporting for stakeholders; this would typically be a member of the accounting team or a contractor. This is the typical actor who would enter data into the information system.
- Evaluator – a person responsible for evaluating company performance; this role can be represented by a wide range of people, from managers and auditors to various internal interested parties. The evaluator may also enter data into the information system, but this is not his/her primary task; typically it would be additional data required for report generation but not required as part of mandatory reporting (e.g., company strategy, goals and targets, vision, etc.).
- Administrator – the administrator of the information system, responsible for defining the report templates and the business rules used to generate reports.
The reporting system should include a flexible layout design, rich visualization, business
requirements and data logic definition, etc., following the Reporting Principles of the G4
Guidelines. In addition, there is a need for translation and publishing requirements, central
deployment and the customization of interactive reports. These primary functions represent
a user-friendly reporting system with strong delivery capabilities for relevant reports. From the
definition of the generic roles we can proceed to the basic functional requirements. The basic
high-level requirements for the reporting information system are the following (see also the
sketch after this list):
- store source company data;
- compute performance indicators according to the selected methodology;
- generate reports in the selected format for offline storage, printing and archiving;
- import source company data from the main company information system;
- provide selected information in the form of reports accessible online to the general public or only to selected persons;
- provide selected information in the standardized XBRL format for interchange with other systems;
- provide the possibility to evaluate company performance anonymously;
- provide the possibility to define report data and business logic;
- dynamically generate the XBRL taxonomies for a given report;
- provide the possibility to customize standardized reports and to generate fully customized reports;
- provide the possibility to compare reports.
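To make these requirements more concrete, the following minimal Java sketch expresses them as a single service interface. It is purely illustrative – the prototype itself is written in PHP – and every type and method name here is our assumption, not part of the system.

```java
import java.io.InputStream;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the functional requirements as a service interface.
// All names are illustrative; the prototype implements this logic in PHP.
public interface ReportingService {

    // store source company data
    void storeSourceData(long companyId, Map<String, Object> sourceData);

    // import source data from the main company information system
    // (XBRL import is mandatory, other formats depend on the methodology)
    void importXbrl(long companyId, InputStream xbrlDocument);

    // compute performance indicators according to the selected methodology
    Map<String, Object> computeIndicators(long companyId, String methodology, int year);

    // generate a report in the selected format for offline storage, printing and archiving
    byte[] generateReport(long companyId, String reportType, String outputFormat);

    // provide selected information as an XBRL document for interchange with other systems
    byte[] exportXbrl(long reportInstanceId);

    // dynamically generate the XBRL taxonomy for a given report type
    byte[] generateXbrlTaxonomy(String reportType);

    // compare two filled-in reports and return the list of differences
    List<String> compareReports(long reportInstanceA, long reportInstanceB);
}
```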
The non-functional requirements for the reporting information system are:
- the system must be easy to use and accessible to the general public (not just to a few data analysts);
- as many indicator values as possible should be computed or imported from external sources;
- the system must be able to be provided as software as a service (SaaS) in a cloud computing environment;
- report layouts must be flexible;
- report visualizations should be rich and interactive.
For all the use cases of the generic actors – reporter and evaluator – we consider two scenarios.
The system may be used on a regular basis (the registered company scenario) to generate scheduled
company reports (annual, quarterly, etc.) and to check the trailing corporate sustainability
performance and its development. The second scenario (the anonymous company scenario) represents
the case when a company wants to generate a one-time, irregular report.
The second scenario can occur, for example, when the company is submitting a report while
applying for a subsidy or grant, or when the company is evaluating changes in its operation
(e.g., restructuring), where a report would be generated for the period before the change and for
the period after it. The second scenario can also be seen as an entry point into the reporting
system for companies that do not use it regularly.
5 Information system structure
The reporting information system consists of several modules that can be separated into two main
groups – data collection and data retrieval. Data collection may occur in either of the two
scenarios described above; hence there are two main data collection modules: the
RegisteredCompanyModule and the SurveyModule. The former allows registered users to enter data on
a regular basis for companies registered in the information system. The latter allows entering
data anonymously, for several years at once. The third data collection module is the
ImportModule, which represents an interface between the information system and its external
sources. It can be used in conjunction with both the RegisteredCompanyModule and the
SurveyModule. The import module must be able to import data in the XBRL format; it should also be
able to import data from other sources, but this is highly dependent on the performance
assessment methodology used. These modules also include the operations of document validation and
the application of defined business rules [5], [6].
The data retrieval modules include the ReviewModule, a basic document viewer used for reviewing
entered data regardless of the reporting framework used. The second important data retrieval
module is the ReportModule, which generates the reports according to the selected methodology for
the end user. Both modules may use several different internal modules, which provide the system
with indicator mapping and additional processing of optional business rules regarding the report
output (e.g., how often the report should be generated, a uniform visual style, etc.). The
internal modules include the XBRLReportModule, which provides the data in the form of an XBRL
document; this module may also be used directly for exporting reports in an interchangeable
format.
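This module decomposition can be summarized in code as follows. Only the module names come from the text above; all member signatures are our illustrative assumptions.

```java
// Illustrative decomposition of the system into the modules named above.
// Only the module names follow the paper; member signatures are assumptions.

interface DataCollectionModule {
    // validates the submitted document and applies the defined business rules
    void submit(long companyId, String documentXml);
}

class RegisteredCompanyModule implements DataCollectionModule {
    public void submit(long companyId, String documentXml) { /* regular, per-period entry */ }
}

class SurveyModule implements DataCollectionModule {
    public void submit(long companyId, String documentXml) { /* anonymous, multi-year entry */ }
}

class ImportModule {
    // interface to external sources; XBRL import is mandatory, other formats are optional
    void importXbrl(java.io.InputStream xbrl) { /* parse and hand over to a collection module */ }
}

class ReviewModule {
    // framework-independent viewer over the entered data
    String render(long documentId) { return "..."; }
}

class ReportModule {
    // generates a report according to the selected methodology
    byte[] generate(long companyId, String methodology) { return new byte[0]; }
}

class XBRLReportModule {
    // internal module; also usable directly for interchange export
    String toXbrl(long reportInstanceId) { return "<xbrl/>"; }
}
```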
The architecture described above covers the portion of the information system built as
a transactional system utilizing a relational database, which forms the main data storage. We
have proposed a further extension of the reporting information system with an XML and XBRL data
store to provide greater flexibility [5], [6].
6 Indicator abstraction
To encompass the above-defined rules and the different methodologies described in the previous
chapter, we propose the following abstraction of performance indicators (applicable universally
to performance indicators [23]). For the prototype software application, we have used a
relational database as the storage system; hence, the following description uses relational
database terminology. See Chapter 8 for a discussion of other storage options. The basic
indicator abstraction entities are:
- indicator group;
- indicator;
- indicator item;
- indicator item aspect.
Fig. 2. Structuring of indicators with sample concrete indicators from methodology [16]
These entities are organized into the hierarchical structure shown in Fig. 2. Each indicator
sub-entity is shown along with examples of real values according to the SAGROS methodology [16].
The end user always fills in the leaves of this tree structure – i.e., if an indicator value is
not computed and consists of only one value, the indicator may contain only a single indicator
item, which contains no aspects. A concrete example is the indicator average number of employees,
whose value is directly known to the company and used directly in a report. It is important to
note that an indicator group must always have at least one indicator and every indicator must
have at least one indicator item, while an indicator item may have zero or more indicator item
aspects. This is important for maintaining the flexibility of the reporting information system.
If an indicator item contains several aspects, it is expected that all of them will be required
for computing the indicator item value, unless specified otherwise in the indicator item formula.
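A minimal sketch of this abstraction as plain classes might look as follows. The entity names and cardinality rules come from the text; the fields and types are our assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative encoding of the indicator abstraction. Cardinality rules from the
// text: a group has >= 1 indicator, an indicator has >= 1 item, an item has >= 0
// aspects; end users always fill in the leaves of the tree.
class IndicatorGroup {
    String name;
    List<Indicator> indicators = new ArrayList<>();        // at least one
}

class Indicator {
    String code;                                           // e.g. "average number of employees"
    String formula;                                        // executable formula code, see Chapter 7
    List<IndicatorItem> items = new ArrayList<>();         // at least one
}

class IndicatorItem {
    String name;
    String formula;
    List<IndicatorItemAspect> aspects = new ArrayList<>(); // zero or more
}

class IndicatorItemAspect {
    String name;
    String formula;
}
```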
The entities described above define the core of the data model of the information system. The
schema in Fig. 3 shows further details together with the associated entities. Here we have
defined a report type, which represents an instance of a given methodology, e.g., SAFA
version 2.0 [18].
We have also introduced a report instance, which represents a concrete filled-in report for a
specific company and a specified time period. The indicator value, which is the actual value
entered by the end user, may be bound to either an indicator item or an indicator item aspect,
but not both. Note that the indicator value is not directly related to a report instance. This is
intentional, so that a single indicator value may be reused in several report instances.
For example, the value of the indicator average number of employees may be used both in the
SAFA 2.0 report for the year 2012 and in the mandatory financial report for the same period. As
long as the period is the same, the value should be reusable. This is crucial for maintaining the
consistency of data across different reports, and it is a highly important feature for making
reporting as easy and quick as possible.
There is also the indicator report group entity, which defines the grouping of indicators in the
defined report types. The linkage is provided by the indicator report-type association table,
which assigns indicators to specific groups in specified report types. Note that an indicator may
be used in more than one report definition, which is desired behaviour; it may likewise be used
in more than one group. For example, the indicator average salary may be part of the economic
indicators group in a report, part of the social responsibility group, or both.
The actual computation of the reported indicator value is provided by the formula code defined in
the indicator, indicator item and indicator item aspect entities. The formula API (Application
Programming Interface) is discussed further in the text. The computed value is not stored in the
schema shown in Fig. 3, because it is recoverable from the base values. However, for performance
and/or version-tracking reasons, the computed values may be stored in a caching schema.
The entities indicator value, indicator item and indicator item aspect all link to a data type.
The value data type model is shown in Fig. 4. Here we define another entity named core data type,
which represents a native XBRL type (e.g., monetary, numeric, fractional, etc.); this is needed
because the XBRL data types alone are too coarse for all kinds of reports. The definition of a
constraint also uses the formula API described in the following chapter.
Fig. 3: Core data model of the reporting system
Closely linked to indicator value computation is the associated indicator table. It is used to
track dependencies between indicators when one indicator is used in the computation of another.
A concrete example can be seen in the methodology [15], where the economic performance indicator
EE14 – Expenses on R&D is computed as the total costs of R&D of a company divided by the value of
the economic indicator EE01 – Value added of the company, expressed as a percentage. EE01 – Value
added is the economic performance indicator of [14], defined as the value added of the company
per employee. A further economic performance indicator, EE02 – Value added per personnel costs,
was also introduced in [14]. The EE01 and EE02 indicators are both computed from the value added
of the company: divided by the average number of employees in the year for the former, and by the
payroll costs for the latter.
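Written out explicitly (our reading of the description above, with symbols introduced here only for illustration): let VA be the value added of the company, N the average number of employees, C_p the payroll costs and C_RD the total costs of R&D; then

```latex
\mathrm{EE01} = \frac{VA}{N}, \qquad
\mathrm{EE02} = \frac{VA}{C_{p}}, \qquad
\mathrm{EE14} = \frac{C_{RD}}{VA} \times 100\,\%
```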
To fully represent the above-mentioned indicators, we need to define the following indicators in
the application:
- source indicator: total costs of R&D;
- source indicator: average number of employees;
- source indicator: payroll costs;
- source indicator: value added of the company;
- output indicator: EE14 – Expenses on R&D;
- output indicator: EE01 – added value per employee;
- output indicator: EE02 – added value to personnel costs.
Note that the separation of indicators into source and output indicators is purely explanatory;
no such division is made in the data model, as any indicator may be used in any report. Because
of this, it is important to maintain associations between indicators (or potential indicators) so
that when the user decides to fill in a given type of report, all of the required values are
requested – that is, all the core indicators (see Fig. 3).
Fig. 4: Data type data model
7 Indicator computation
The actual indicator values for a report are computed from the indicator items and indicator item
aspects. The computation itself is provided by a formula defined in each of these entities. The
formula is a piece of executable code used to compute the value from the source data. In our
prototype of the reporting information system, the formula is currently defined as PHP code, the
native language of the information system. In the future, it might be better to switch to another
interpreted language such as JavaScript, which would provide more flexibility and portability.
Regardless of the formula language used, the following data API is proposed.
Each indicator may have associated indicator values; these are values used in several indicators
at once. A concrete example is the indicator added value, which is used both as a report
indicator and as a common divisor for several other indicators in the methodology [15].
The API for the formula code has one available base entity: the indicator. The properties defined
in each entity correspond to those defined in the data model. The computation must return
a single value of an arbitrary data type. We expect that a single report indicator always has
a single value. This may pose a limitation for the report generator, because the end user might
want to obtain a list of values in the report. Currently, it is only possible to convert the data
type among the indicator, indicator item and indicator item aspect; it is also possible to
generate character strings from raw input values. Future development will show whether such a
solution is sufficient or whether multi-valued data types will be required.
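As an illustration of what the formula API could look like, here is a sketch in Java (chosen for readability; the prototype defines its formulas in PHP, and all names below are our assumptions, not the actual API).

```java
import java.util.Map;

// Hypothetical shape of the formula API: a formula receives its indicator
// (the single base entity exposed by the API) and must return exactly one value.
interface Formula {
    Object compute(IndicatorContext indicator);
}

// Read-only view of the indicator and its items/aspects, mirroring the data model.
interface IndicatorContext {
    Map<String, Object> itemValues();    // values of the indicator items
    Map<String, Object> aspectValues();  // values of the indicator item aspects, if any
}

// Example: an "added value per employee" formula (EE01-style), purely illustrative.
class ValueAddedPerEmployee implements Formula {
    public Object compute(IndicatorContext indicator) {
        double valueAdded = ((Number) indicator.itemValues().get("valueAdded")).doubleValue();
        double employees  = ((Number) indicator.itemValues().get("avgEmployees")).doubleValue();
        return valueAdded / employees;   // a single value of an arbitrary data type
    }
}
```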
When working with the formulas, the question of versioning arises. All methodologies are
constantly updated and the computations may change, so it is necessary to keep track of
historical data – the definitions of indicators. We have proposed a standard recording of
history, where after each change in the formula a new physical indicator is created while a link
to the original indicator is maintained. That is, when an indicator is modified, a new indicator
is created that supersedes the original one. This principle applies to all core entities of the
data model (i.e., indicators, indicator items and indicator item aspects).
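The versioning principle can be sketched as follows; this is our illustration only, and the supersedes link field is an assumption about how the link to the original indicator might be stored.

```java
// Indicators are never modified in place: a change creates a new record that
// keeps a link to the indicator it supersedes, preserving historical data.
class VersionedIndicator {
    final long id;
    final String formula;
    final Long supersedesId;   // null for the first version

    VersionedIndicator(long id, String formula, Long supersedesId) {
        this.id = id;
        this.formula = formula;
        this.supersedesId = supersedesId;
    }

    // "Modification" returns a new physical indicator superseding this one.
    VersionedIndicator modify(long newId, String newFormula) {
        return new VersionedIndicator(newId, newFormula, this.id);
    }
}
```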
8 Other storage option concepts
Current trends in reporting are moving reporting systems from specialized use by a few data
analysts towards general and ubiquitous use across many organizational units and users. In
addition, data stores holding purely historical information are moving towards real-time analysis
and predictive analysis of the future. When we focus on the organizational unit, we can see how
formerly fragmented views with silos of information – in many cases stored in transactional
systems – are being brought together into enterprise-wide, unified views. There are two basic
approaches a company can take when trying to achieve such unified views: either a centralized
data warehouse is built, or a series of connectors is built between the various systems.
Companies are also searching for time-relevant and critical data using pre-built Business
Intelligence (BI) applications [5], [6], [31]. This is increasingly required, rather than just
individual reports built on an as-needed basis within dedicated analytics tools. When comparing
classic reporting tools (also called traditional reporting tools) with an approach based on XBRL,
we expect the classic reporting tools to have slower development cycles and higher costs of
development and maintenance. The main reason is that pre-built tools (e.g., Oracle Hyperion) may
be used to transform XBRL data directly into reports without the need to program a new
application. In our previous research [5], we already mentioned many XBRL and inline XBRL (iXBRL)
technical challenges.
Fig. 5: Overview of system architecture built on XBRL database
For example, we discussed how to improve the core processing engine and data integrity with
a business-rules approach in [5], and how to connect BI with XBRL in [6]. There are still more
challenges to be resolved, such as efficient queryability and analytics, and scalable lookups
over large volumes of XBRL documents. As we mentioned in [6], XBRL was not explicitly designed
for the role of a core storage system; it was designed as a metadata representation language.
However, it is possible for XBRL to become a native data source in the areas of text mining and
web reporting. With this approach, we can eliminate most of the costly process of data
preparation. The reason the proposed reporting information system should work is that existing
software or other mapping tools can transform XBRL data automatically. This automation can
increase the consistency, accuracy and trust in the data – all key tenets of successful
reporting.
Fig. 5 shows an overview of a system architecture based on an XBRL database. The top application
layer accepts report submissions and provides the results of report queries. It can be composed
of various interfaces, such as a web application user interface or defined APIs. The application
layer communicates with the storage layer via SQL queries or via the XQuery language. The data
layer is composed of both the RDB system and the XBRL database. The data layer exposes both a
relational and an XML interface, so the data can be queried in both SQL and XQuery. The
application layer may therefore provide a wide range of APIs to connect the database to various
sources. This offers high interoperability of the system, which is a necessary feature. Alongside
the application layer is the business intelligence engine, which can access the data layer
directly. This is possible if the data layer is self-contained – i.e., it can provide an XBRL
document with all the necessary taxonomies and templates.
Our proposed architecture allows easy integration of XBRL processing with third-party business
intelligence software products to satisfy diverse use cases [31]. This solution allows XML-based
querying and protocol access. In Fig. 5 we show how a range of APIs and Internet protocols may be
used to access the source data: for example, SOAP, REST or JDBC may be used, and file-based
access to the content can be provided through a WebDAV interface. An important feature of this
approach is the storage of XBRL in its original XML representation while still providing
relational access. In the background, a relational representational view is built over the XBRL
content by exposing a set of base entities physically implemented as relational views over XBRL
documents.
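For illustration, a BI tool or the application layer could then read such a relational view through plain JDBC. The sketch below is ours: the JDBC URL, the credentials and the view fact_item are hypothetical placeholders for whatever XBRL-capable store is actually used.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Sketch: querying a hypothetical relational view exposed over stored XBRL documents.
public class XbrlViewQuery {
    public static void main(String[] args) throws Exception {
        // Connection details are placeholders, not a real driver or database.
        try (Connection con = DriverManager.getConnection(
                "jdbc:exampledb://localhost/xbrlstore", "reporting", "secret")) {
            PreparedStatement ps = con.prepareStatement(
                "SELECT concept, context_ref, value " +
                "FROM fact_item WHERE document_id = ? AND concept LIKE ?");
            ps.setLong(1, 42L);          // a stored XBRL report instance
            ps.setString(2, "ee%");      // e.g. the economic indicators
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%s [%s] = %s%n",
                        rs.getString("concept"), rs.getString("context_ref"),
                        rs.getString("value"));
                }
            }
        }
    }
}
```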
The RDBMS logical data model and the XBRL content together provide easy integration with BI
tools. Direct access to the source data can be used to produce a wide variety of reporting and
charting outputs. Native XBRL storage, with integrity based on XBRL rules, allows both XML and
SQL querying based on XBRL semantics. The main feature is the connection with the current
transactional database via relational application integration. This is an important feature for
those enterprises that do not have their data in the form of XBRL documents. There is a clear
need for case studies and research pilots that would test online analytics based on XBRL
dimensions (explicit and typed).
9 Conclusion
We have implemented a prototype of the described special reporting information system. The
prototype implementation had two primary objectives: to verify that the proposed indicator
abstraction can be implemented successfully and to verify the usability of the reporting
information system for end users. Though our main objective is the implementation of the SAFA
Guidelines methodology [18], we have currently implemented the Methodology for assessing the
sustainability of crop production systems for the conditions of the Czech Republic [16], because
version 2.0 of the SAFA Guidelines was published only recently. As far as usability is concerned,
we are currently testing two approaches to the user interface – a survey interface and a guide
interface. In the former, the user can enter all the required and optional values on a
single-page form with minimal structuring. In the latter, the user is led through various steps,
filling in different parts of the report gradually.
Following the SAFA vision [18], we use the four dimensions of sustainability – good governance,
environmental integrity, economic resilience and social well-being – with 21 universal themes and
56 sub-themes comprising 112 core indicators. The proposed reporting information system is also
able to deal with other corporate sustainability performance assessment frameworks and
guidelines. Our future work will concentrate on improving the XBRL reporting by adding dynamic
XBRL taxonomies and an XBRL database layer to the GRI XBRL taxonomy proposal. We will also
concentrate on improving the end-user experience of the report outputs so that performance
evaluation becomes available to a wide range of users. This will be accomplished by utilizing
business intelligence tools to add a set of dashboards and overviews to the reports [5].
10 Acknowledgment
This paper is supported by the Czech Science Foundation within the project Construction of
Methods for Multifactor Assessment of Company Complex Performance in Selected Sectors,
Reg. No. P403/11/2085.
11 References
[1] Analytical framework. European Commission, Eurostat [online]. 2010. Available from:
http://epp.eurostat.ec.europa.eu/portal/page/portal/agri_environmental_indicators/introduction/analytical_framework
[2] Z. Chvátalová, A. Kocmanová and M. Dočekalová, “Corporate Sustainability Reporting
and Measuring Corporate Performance,” in Proceedings of 9th IFIP WG 5.11
International Symposium. ISESS 2011. Environmental Software Systems. Frameworks
of eEnvironment. J. Hřebíček, G. Schimak, R. Denzer, Eds. Heidelberg: Springer, 2011,
pp. 398–406.
[3] G3.1 Guidelines on Globalreporting. 2011. [Online]. Available from:
https://www.globalreporting.org/reporting/G3andG3-1/g3-1guidelines/Pages/default.aspx
[4] G3 Guidelines website on Globalreporting. 2006. [Online]. Available from:
https://www.globalreporting.org/reporting/G3andG3-1/g3-guidelines/Pages/default.aspx
[5] M. Hodinka and M. Štencl, “Reporting Tools for Business Rules,” In IT for Practice 2011.
Ostrava: VŠB Technická univerzita Ostrava, 2011, pp. 29-36.
[6] M. Hodinka, M. Štencl, J. Hřebíček and O. Trenz, “Business intelligence in environmental
reporting powered by XBRL,” Acta univ. agric. et silvic. Mendel. Brun., 2013, in press.
[7] J. Hřebíček, O. Trenz and E. Vernerová “Optimal set of agri-environmental indicators for
the agricultural sector of Czech Republic,” Acta univ. agric. et silvic. Mendel. Brun., 2013.
in press.
[8] J. Hřebíček, M. Popelka, M. Štencl and O. Trenz, “Corporate performance indicators for
agriculture and food processing sector,” Acta univ. agric. et silvic. Mendel. Brun., Vol. 60,
pp. 99-108, February 2012.
[9] J. Hřebíček, S. Valtyniová, J. Křen, M. Hodinka, O. Trenz and P. Marada, “Sustainability
indicators: development and application for the agriculture sector,” in Sustainability
Appraisal: Quantitative Methods and Mathematical Techniques for Environmental
Performance Evaluation, M. G. Erechtchoukova, P. Khaiter and P. Golinska, Eds.
Heidelberg: Springer, 2013, pp. 63-102.
[10] J. Hřebíček, M. Štencl, O. Trenz and J. Soukopová, “Current Trends in Corporate
Performance Evaluation and Reporting in the Czech Republic,” International Journal of
Energy and Environment, WSEAS Press, Vol. 6, pp. 39-48, January 2012.
[11] J. Hřebíček, M. Hodinka, M. Štencl, O. Popelka and O. Trenz. “Sustainability Indicators
Evaluation and Reporting: Case Study for Building and Construction Sector“. In N.
Altawell, K. Volkov, C. Matos, P. F. de Arroyabe. Recent Researches in Environmental
and Geological Sciences. Proceedings of the 7th WSEAS International Conference on
Energy & Environment (EE '12). USA: WSEAS Press, 2012. pp. 305-312.
[12] ISEAL Alliance. Assessing the impacts of environmental and social standards systems. ISEAL
Code of Good Practice, version 1.0. London: ISEAL Alliance. Available from:
http://www.isealalliance.org/resources/p041-impactscode-of-good-practice.
[13] Indicators: European Environment Agency. European Environment Agency [online].
Copenhagen, 2013. Available from: http://www.eea.europa.eu/data-and-maps/indicators
[14] A. Kocmanová and M. Dočekalová, “Construction of the economic indicators of
performance in relation to environmental, social and corporate governance (ESG) factors,”
Acta univ. agric. et silvic. Mendel. Brun., Vol. 60, pp. 195–205. April 2012.
[15] A. Kocmanová, M. Dočekalová, P. Němeček and I. Šimberová, “Sustainability: Environmental,
Social and Corporate Governance Performance in Czech SMEs,” in The 15th World Multi-Conference
on Systemics, Cybernetics and Informatics. IFSR, Orlando, USA: WMSCI, 2011, pp. 94–99.
[16] J. Křen, Methodology for assessing the sustainability of crop production systems for the
conditions of the Czech Republic. Brno: Mendel University in Brno, 2011.
[17] OECD. Environmental Indicators for Agriculture, Volume 1: Concepts and Framework [online].
Paris: OECD, 1999. Available from:
http://www.oecd.org/agriculture/sustainableagriculture/40680795.pdf
[18] SAFA Sustainability Assessment of Food and Agriculture systems. Draft Guidelines
(Version 2.0), 2013. Available from:
http://www.fao.org/fileadmin/templates/nr/sustainability_pathways/docs/SAFA_Guidelines_final_draft.pdf
[19] J. Soukopová and E. Bakoš, “Assessing the efficiency of municipal expenditures regarding
environmental protection,” in Environmental Economics and Investment Assessment. III.
ed. Cyprus. WIT Press. 2010, pp. 107 –119.
[20] S. Valtýniová and J. Křen, “Indicators used for assessment of the ecological dimension of
sustainable arable farming – review,” Acta univ. agric. et silvic. Mendel. Brun., Vol. 59,
pp. 247–256, March 2011.
[21] G4 Guidelines website on Global reporting. 2013. [Online]. Available from:
https://www.globalreporting.org/reporting/g4/Pages/default.aspx
[22] O. Popelka, M. Hodinka, J. Hřebíček and O. Trenz, “Information System for Corporate
Performance Assessment and Reporting,” in C. A. Long, N. E. Mastorakis and V. Mladenov (Eds.),
Recent Advances in Systems, Control, Signal Processing and Informatics. Proceedings of the 2013
International Conference on Systems, Control, Signal Processing and Informatics (SCSI 2013).
EUROPMENT, 2013, pp. 247-253.
[23] Z. Chvátalová and I. Šimberová, “Analysis and identification of joint performance
measurement indicators: Social and corporate governance issues”. Acta univ. agric. et
silvic. Mendel. Brun., Vol. 60, pp. 127-138, July 2012.
[24] L. Pinter, P. Hardi, A. Martinuzzi and J. Hall, “Bellagio STAMP: Principles for
sustainability assessment and measurement,” Ecological Indicators, Vol. 17, pp. 20-28, 2011.
[25] ISEAL Alliance. Assessing the impacts of environmental and social standards systems. ISEAL
Credibility Principles (v0.3, 11 June 2013). London: ISEAL Alliance. Available from:
http://www.isealalliance.org/resources/p041impactscode-of-good-practice.
[26] GSCP. 2010. Reference Tools of the Global Social Compliance Programme. Issy-les-Moulineaux,
France. Available from: http://www.gscpnet.com/working-plan.html.
[27] IIRC. 2013. [Online]. Available from: http://www.theiirc.org/the-iirc/. 2013.
[28] Consultation Draft of the International <IR> Framework [Online]. Available from:
http://www.theiirc.org/wp-content/uploads/Consultation-Draft/Consultation-Draft-of-theInternationalIRFramework.pdf
[29] J. Hřebíček, F. Piliar and J. Soukopová. “Case study of voluntary reporting in SMEs“. In:
Sustainability Accounting and Reporting at macroeconomic and microeconomic level. pp.
265-273, Linde, Praha, 2009.
[30] J. Hřebíček, J. Soukopová and E. Kutová. “Standardisation of Key Performance Indicators
for Environmental Management and Reporting in the Czech Republic“. In USCUDAR
2010. Recent Advantages in Urban Planning, Cultural Sustainability and Green
Development. USA: WSEAS Press, 2010. pp. 76-82.
[31] D. Airinei and D. Homocianu. “Data Visualization in Business Intelligence“. In RECENT
ADVANCES in MATHEMATICS and COMPUTERS in BUSINESS, ECONOMICS,
BIOLOGY & CHEMISTRY. Proceedings of the 11th WSEAS International Conference
on MATHEMATICS AND COMPUTERS IN BUSINESS AND ECONOMICS (MCBE
'10). USA: WSEAS Press, 2010. pp. 164-167.
[32] Food Processing Sector Supplement, 2012. [Online]. Available from:
https://www.globalreporting.org/reporting/sectorguidance/sector-guidance/foodprocessing/Pages/default.aspx
Generating a Japanese-Czech Dictionary
Jonáš Ševčík
Masarykova univerzita - Fakulta informatiky
Laboratoř softwarových architektur a informačních systémů
Botanická 68a, 602 00 Brno
255493@mail.muni.cz
Abstract
This article aims to present the reader with methods suitable for automatic bilingual dictionary
generation. The proposed methodology uses automatic generation based on the pivot-language
method. The input data are Japanese-English and English-Czech dictionaries; using WordNet and the
pivot-language method, we generate a Japanese-Czech corpus, which will then be used in an online
collaborative service for finishing the remaining sense translations.
Abstrakt
Tento článek si klade za cíl seznámit čtenáře s principy automatizovaného generování
dvojjazyčného slovníku. Navržená metodologie uplatňuje automatické generování pivotovou
metodou. Jako vstupní data jsou navrženy japonsko-anglický a anglicko-český slovník, kde za
pomocí WordNetu a pivotové metody je získán japonsko-český korpus, který následně bude
použit v online kolaborační službě určené pro crowdsourcingové dotvoření zbylého překladu.
Keywords
Bilingual dictionary, automatic generation, Japanese, Czech, crowdsourcing
Klíčová slova
Dvojjazyčný slovník, automatizované generování, japonština, čeština, crowdsourcing
1 Introduction
The Czech environment lacks an electronic Japanese-Czech dictionary that would at least partially
match the quality of the JMDict dictionary [1] or its localized versions. Paper dictionaries do
exist, such as the Japonsko-český slovník [3] or the Japonsko-český studijní znakový slovník [2].
However, no comprehensive Japanese-Czech dictionary currently exists in electronic form. The
availability of such a dictionary is essential, and not only for the study of Japanese: using
translations into a language other than the user's native one requires non-trivial knowledge of
the target language, and without context the individual entries may be misinterpreted.
Creating a dictionary is not trivial and requires qualified human resources. It is therefore
a costly activity, and also a time-consuming one when carried out by an individual. A solution is
therefore needed that would make it possible to build the dictionary more cheaply and with the
smallest possible demands on human resources.
This research aims to design a universally applicable tool for the collaborative creation of an
electronic Japanese dictionary. The data should thus be created by the users of the dictionary
themselves. To create the base corpus, automatic content generation is intended.
For automated content creation, an obvious solution is to use a freely available dictionary,
e.g., the mentioned Edict [1], and translate the individual entries literally by pairing them
with another dictionary, e.g., an English-Czech one. However, this approach of translating
through a common language is naive: individual dictionary entries may contain semantic shifts,
and without example sentences they may be misinterpreted.
1.1 State of the art
During the 1980s, a number of methods for the automatic generation of bilingual dictionaries were
developed; the Dice coefficient, correspondence tables and mutual information [4] are worth
mentioning. The disadvantage of these approaches is that they require large bilingual corpora.
In 1994, Tanaka [5] introduced a new approach based on a so-called pivot language. Dictionaries
with translations to and from the pivot language are used as the source, and words are looked up
through this third, intermediary language; such an approach can, of course, be automated. The
method uses two bidirectional dictionaries for automated translation (for Czech and English,
these would be an English-Czech and a Czech-English dictionary). The correctness of a translation
is verified by translating an entry in one direction and then back. A similar approach was
proposed by Sjöbergh [6], who used English as the pivot language to build a Japanese-Swedish
dictionary.
Many extensions of the pivot method exist, but they are not generally applicable techniques: they
rely above all on the specifics of the particular language combination or on language similarity.
For example, Širai [7] generates a Japanese-Korean dictionary through English, but also exploits
the earlier hanja (Chinese character) spelling of Korean and translates Sino-Korean words by
mapping them directly to Sino-Japanese words. In this case, an unambiguous equivalence between
individual words can be found, and a similar mapping via the hanja spelling can be used for most
other nouns. It is clear, however, that these supplementary methods cannot be used for languages
other than those using Chinese characters or borrowed Chinese words.
Another possible way of automatically generating dictionary data is to use the WordNet lexical
database [8], which defines a fixed structure for indexing the words of individual world
languages. Using this index, languages can be interconnected, and a bilingual dictionary can thus
be created easily. The prerequisite, however, is that a sufficiently extensive WordNet exists for
the given language. The Czech WordNet contains a mere 40,000 synsets (sets of synonyms) and the
Japanese one approximately 57,000, whereas the English WordNet contains 117,000 synsets
[9, 10, 11]. The generated dictionary data are therefore clearly limited by the intersection of
the compared structures. The method published by Varga [12], which uses exactly this combination
of WordNet and the pivot method, appears most suitable for our needs. It works as follows: in the
first phase, the individual dictionaries are transitively linked – in our case, all Japanese
entries that have an equivalent in English and subsequently an equivalent in Czech. Such a link
is called a translation candidate. In the second phase, the candidates are sorted into six
groups, and the candidates that do not meet the required threshold are discarded [13]. The
results of the research [12, 13] show that, from the currently available data, approximately 70%
of the synsets in the total intersection of the linked WordNets can be generated in this way.
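The first, transitive-linking phase of this method can be sketched as follows. This is our illustration only: the dictionaries are reduced to toy in-memory multimaps, and the scoring and filtering of the six candidate groups [13] is omitted.

```java
import java.util.*;

// Sketch of the pivot-language linking phase: Japanese -> English -> Czech.
// Real input would be JMDict and an English-Czech dictionary; here both are
// toy multimaps, and candidate filtering is omitted.
public class PivotLinker {
    public static Map<String, Set<String>> link(
            Map<String, Set<String>> jaToEn, Map<String, Set<String>> enToCs) {
        Map<String, Set<String>> candidates = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : jaToEn.entrySet()) {
            for (String english : e.getValue()) {
                for (String czech : enToCs.getOrDefault(english, Set.of())) {
                    // every Japanese headword reachable in Czech via English
                    // becomes a translation candidate
                    candidates.computeIfAbsent(e.getKey(), k -> new HashSet<>()).add(czech);
                }
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> jaEn = Map.of("inu", Set.of("dog"));
        Map<String, Set<String>> enCs = Map.of("dog", Set.of("pes"));
        System.out.println(link(jaEn, enCs));  // prints {inu=[pes]}
    }
}
```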
2 Proposed solution
For the creation of the Japanese-Czech dictionary, we chose the pivot method as described by
Varga [12]. The obvious advantage of using pre-generated data is that a substantial part of the
dictionary can be generated automatically in this way. This corpus can serve as a base to be
extended later. As already mentioned in the introduction, translation by an individual
professional is time-inefficient as well as financially demanding. For this reason, we decided to
create an online service that will use as its base the Japanese-Czech corpus generated by the
method described above. The freely available Japanese and Czech WordNets will be used for
generating the data. The JMDict dictionary (the source of the Japanese WordNet) will also serve
as a basis for the Czech translations, since it contains translations of entries not only into
English but also into German, Russian, Dutch and other languages. For the crowdsourced
translation of the remaining untranslated entries, this has the advantage that a translator can
see the translations of a given entry in other languages and then supply the Czech translation.
Using JMDict as the base additionally limits the list of Japanese headwords, so it will not be
possible to add entirely new ones. The advantage is that a new Japanese headword cannot introduce
an error or a non-existent word, which would otherwise have to be verified. All data contained in
JMDict have already been checked, so it is safe to use and build upon them. Moreover, this
dictionary is constantly updated, so it will not become obsolete, and new additions can be
incorporated iteratively into the Japanese-Czech dictionary as well.
2.1 Security measures
The proposed service will have a plain user interface. It will present itself to users as
a Japanese-Czech dictionary, offering a field for the input text together with constraints for
the string search – exact match, prefix search, suffix search and infix search. The results will
be displayed in Czech; if a Czech translation is unavailable, the user will be offered
a foreign-language translation according to personal preferences stored in cookies. In the case
of a foreign-language translation, a machine translation into Czech (e.g., via the Google
Translate service) will also be offered. Any user will then be able to propose their own
translation of a given entry or confirm the correctness of the offered machine translation. So
that users are not discouraged from proposing translations, no registration will be required;
anyone will thus be able to propose a translation. To guarantee the quality of the data, freshly
translated entries will not be propagated directly into the dictionary. For an entry to be
transferred into the dictionary, the translation must be confirmed. Confirmation is performed by
a special administrator role, typically a person with knowledge of the language who only decides
on the correctness of a translation; the administrator therefore does not need to translate the
entries personally.
2.2 Deployment
For the deployment of the service, we chose the cloud solution of Red Hat – OpenShift (a test
instance of the service is available at https://jmd-hotovy.rhcloud.com/Dict/). This platform
offers three configurable server instances free of charge, so it is possible to set, for example,
an optimal amount of RAM. The entire service is implemented in the Java programming language, and
the data are stored in a MySQL database. Currently it is only a prototype, so not all of the
functionality is implemented; development, however, is ongoing.
3 Conclusion
In this article, we presented a possible solution for the automatic generation of
a Japanese-Czech dictionary. The pivot method, using English as the pivot language, was chosen
for this purpose. Individual entries are interlinked using the WordNet database. The resulting
Japanese-Czech corpus will be deployed in an online service offering the functionality of an
online dictionary; entries without a generated translation will be machine-translated literally,
and translations in other foreign languages will be offered. Users will also be able to provide
their own translations of individual entries, which will be stored and offered for approval to
a group of administrators in order to prevent errors.
As a possible improvement of the service, we see the option of eliminating the need for
administrators. Instead, users with a certain number of translated and accepted entries would be
allowed to decide on the correctness of translations, and the approval of expressions would be
decided by voting: if a proposed translation received the required number of votes, it would
automatically be marked as correct.
4 References
[1] JMDict [online]. [cited 2013-10-31]. Available from:
http://www.csse.monash.edu.au/~jwb/wwwjdicinf.html
[2] LABUS, David and Jan SÝKORA. Japonsko-český studijní znakový slovník. Praha: Paseka, 2005,
625 pp. ISBN 80-718-5736-X.
[3] KROUSKÝ, Ivan and František ŠILAR. Japonsko-český slovník. Voznice: LEDA, 2005, 519 pp.
ISBN 80-733-5045-9.
[4] BROWN, R. D. 1997. Automated Dictionary Extraction for Knowledge-Free Example-Based
Translation. Proceedings of the 7th International Conference on Theoretical and Methodological
Issues in Machine Translation, pp. 111-118.
[5] TANAKA, K., UMEMURA, K. 1994. Construction of a bilingual dictionary
intermediated by a third language, Proceedings of COLING-94, pp. 297-303.
[6] SJÖBERGH, J. 2005. Creating a free Japanese-English lexicon, Proceedings of PACLING,
pp. 296-300.
[7] ŠIRAI, S., JAMAMOTO, K. 2001. Linking English words in two bilingual dictionaries to
generate another pair dictionary, ICCPOL-2001, pp. 174-179.
[8] MILLER, G. A., BECKWITH, R., FELLBAUM, C., GROSS, D., MILLER, K. J. (1990). Introduction to
WordNet: An Online Lexical Database. Int J Lexicography 3(4), pp. 235-244.
[9] BLAHUŠ, Marek. Extending Czech WordNet Using a Bilingual Dictionary. Brno, 2011. Master's
thesis, Masarykova univerzita.
[10] Japanese WordNet [online]. [cited 2013-10-31]. Available from:
http://nlpwww.nict.go.jp/wn-ja/index.en.html
[11] WordNet [online]. [cited 2013-10-31]. Available from: http://wordnet.princeton.edu/
[12] VARGA, I., JOKOJAMA, Š. Bilingual dictionary generation for low-resourced language pairs.
In EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language
Processing, pp. 862-870.
[13] VARGA, I., JOKOJAMA, Š. iChi: a bilingual dictionary generating tool. Annual Meeting of the
Association for Computational Linguistics, 2009.
Availability and scalability of web services hosted in
Windows Azure
David Gešvindr
Masarykova univerzita - Fakulta informatiky
Laboratoř softwarových architektur a informačních systémů
Botanická 68a, 602 00 Brno
255653@mail.muni.cz
Abstract
This article aims to present the reader with principles of software architecture design that lead
toward better scalability and availability of a web service hosted in the Windows Azure cloud
environment. It introduces the reader to the major Windows Azure services and points out the most
important differences of the cloud environment that need to be taken into account in an early
software architecture design phase, so that the service can effectively exploit the advantages of
the cloud, especially its almost unlimited horizontal scalability.
Abstrakt
Tento článek si klade za cíl seznámit čtenáře s principy návrhu softwarové architektury webové
služby, jenž vedou k lepším charakteristikám škálovatelnosti a dostupnosti v cloudovém
prostředí Windows Azure. Článek seznamuje čtenáře s hlavními službami Windows Azure
a upozorňuje na důležité odlišnosti cloudového prostředí, které musí být brány v potaz již při
návrhu architektury služby, pokud má služba efektivně využívat výhod cloudu, zejména téměř
neomezené horizontální škálovatelnosti.
Keywords
Cloud computing, Windows Azure, software architecture, scalability, availability
Klíčová slova
Cloud computing, Windows Azure, softwarová architektura, škálovatelnost, dostupnost
1 Introduction
With the increasing dependency of modern society on daily used web services, even in emergency
situations, there are ever stricter requirements on the availability and scalability of hosted
services. One possible way to improve these service characteristics is to use cloud-based hosting
with guaranteed service availability and scalability.
Together with the broader use of the cloud, questions are emerging as to whether the current
software architectures used for 2- and 3-tier web applications can use the full potential of the
cloud and whether they are also highly available and scalable, or whether the architecture of the
designed service needs to be changed to ensure these characteristics.
In the first section of this paper, the Windows Azure cloud platform is introduced. Section 2
describes software architecture recommendations for increasing the scalability of a service.
Section 3 describes the architecture of a service designed for efficient use of the cloud
environment. Section 4 evaluates this approach using load tests.
2 Overview of the Windows Azure Platform
Windows Azure is Microsoft's cloud platform providing a rich set of hosting services in
Microsoft's datacenters all over the world [1]. It offers hosting services that fall mostly into
two categories: Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). Developers
can buy the platform's compute and storage resources to host their own web applications and
services. Considering the size of the Windows Azure datacenters, there is an almost unlimited
amount of resources that can be provisioned. Payment is based on the Pay-As-You-Go model, where
the customer is billed only for the currently allocated amount of compute power, storage
capacity, outgoing traffic from the datacenter, database size, or another billable service metric
based on the Windows Azure price list.
In the following sections, we briefly introduce the important services offered by Windows Azure
and discuss their key qualitative attributes in the areas of availability and scalability.
2.1 Compute Services
Windows Azure provides a range of computation services allowing developers to run their
applications in the cloud environment. The biggest difference between them is in the level of
control the developer has over the environment – the Virtual Machines service provides the
freedom to build a custom server, in contrast to the Web Sites service, which offers an almost
classical web hosting with minimal configuration options.
Developers can create their own virtual server using the Virtual Machines service and scale this
server vertically by choosing the server configuration – the number of CPU cores and the RAM
size. During the creation process, they can choose from the different available operating system
images, or they can upload their own hard drive file to Azure Storage and use it in the virtual
machine. This service offers a high degree of freedom but also puts the responsibility for server
maintenance on the customer. A typical use of this service is the hosting of a specific
application server or a strongly customized web server. To provide high availability according to
the SLA (99.95%), there have to be at least two instances of the virtual machine grouped together
into an availability set.
The Cloud Service offers a lower degree of configuration freedom but also significantly lowers
the required maintenance effort. In the background, it works almost the same way as a virtual
machine. Developers can choose between two roles – a web role and a worker role – depending on
whether they need to host the application in Microsoft's Internet Information Services web server
or run it as a stand-alone process. During publication, the application they build is packaged
into an installation package together with its configuration and uploaded to Windows Azure. The
Windows Azure infrastructure then allocates the required computation resources (the Fabric
Controller creates a virtual server from a unified server image depending on the selected role,
deploys the published application to the freshly created server, and configures the network
infrastructure to distribute the load between the instances of the service if there is more than
one). A newly created virtual server instance can be customized by a startup script, which is
part of the installation package and can reconfigure the OS and install other applications. The
main disadvantage of the Cloud Service is the absence of persistent storage, because the cloud
infrastructure never migrates an existing virtual server; it always creates a new instance based
on the application package. Therefore, all application data should be stored in Azure Storage or
in a relational database. The Cloud Service can scale vertically – increasing the computing power
of a single node – or horizontally, by increasing the number of nodes and distributing the load
between them.
For web services that do not require any specific configuration changes in a web server, it may be cost efficient to use the Azure Web Sites service. This service supports ASP.NET, PHP, Node.js and Python. Web sites can be deployed directly from source control systems, with support for Git, Bitbucket, CodePlex and Team Foundation Service, or via FTP. Their content is stored in Windows Azure Storage and is loaded directly by the web servers, which can run in shared or reserved mode. In shared mode, websites run on a virtual machine that concurrently hosts many other websites of different customers in a slightly modified version of Internet Information Services; in this mode the user pays for every hosted site. To guarantee better performance, it is possible to create an instance reserved for a single customer, which can host up to 500 web sites. Reserved mode is billed in the same way as a virtual machine, independently of the number of hosted web sites. Web sites in reserved mode can be scaled vertically by selecting a small, medium or large virtual machine instance, and also horizontally by increasing the number of web servers serving the same content.
2.2 Scalability and availability of compute services
A benefit of the cloud is that compute power can be easily increased when it is needed and decreased when the application is idle or under lower load. There are two approaches to scaling a service – vertical and horizontal [2].
Vertical scaling increases the available resources of a single computation node – it adds more CPU cores and RAM. This is done by changing the configuration of the virtual server. The advantage of this approach is that almost every application that can utilize more CPU cores can benefit from this type of scaling without any need to change the application. The disadvantage is that we have to stop the virtual server to make this configuration change, and this type of scaling does not increase the availability of the hosted application. In Windows Azure we are also limited, because the largest instance has only 8 CPU cores. To scale the application beyond this limit we have to use horizontal scaling.
When we scale an application horizontally, we add more compute nodes and distribute the load between them. This approach works correctly with stateless applications – which most web applications are – so a load balancer can distribute user requests independently of the inner state of the application on a specific node. Not every application can benefit from horizontal scaling: applications that work with an inner state have to be updated to store this state in a storage shared between all nodes, using Azure Storage, SQL Database or the Caching Service. The advantage of horizontal scaling is that we can add more nodes or remove some of them without interrupting the service running on the existing nodes, simply by reconfiguring the load balancer. We can effectively use horizontal scaling to adjust the reserved compute power to the current load requirements.
The availability of compute services is at least 99.95 % according to the SLA, with a minimum of 2 deployed instances. When one of the instances fails, the load balancer detects the missing instance and stops sending requests to the missing virtual server. When the Virtual Machines service is used, it depends on the hosted application whether it supports failover scenarios.
2.3 Storage Services
Windows Azure offers specialized services for storing and retrieving data. Windows Azure Storage works as a storage for binary data, tabular data (a No-SQL approach) and queue messages. Current web services very often use SQL databases, which can be hosted in a second storage service named SQL Database.
Windows Azure Storage is the key service for storing data in the Windows Azure cloud, providing almost unlimited storage capacity. Depending on the type of stored data, the user can create different storage containers; Blob Storage serves for storing large amounts of unstructured text or binary data such as images or video.
Table Storage offers No-SQL functionality for applications that need to work with large volumes of unstructured data. Data in the Table Storage are stored as rows in a large table without a defined schema, as would be required in SQL databases. Each row has two important fields that together uniquely identify it – the Partition Key and the Row Key. Each row can then contain other fields of basic data types, and a single table can contain several different types of rows. The most important attribute of this storage to consider is that data can be retrieved effectively only by their Partition and Row Keys; any other type of search is very inefficient and slow.
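To illustrate the keying principle, the following minimal Python sketch models a table as partitions of rows; no real Azure SDK is used and all names are illustrative. Retrieval through the two keys is a direct lookup, while any other predicate degenerates into a full scan:

# Minimal model of the Table Storage addressing scheme (illustration only).
table = {}  # {partition_key: {row_key: entity}}

def insert(partition_key, row_key, entity):
    table.setdefault(partition_key, {})[row_key] = entity

def get(partition_key, row_key):
    # Efficient path: direct lookup through the two keys.
    return table[partition_key][row_key]

def find_by_field(field, value):
    # Inefficient path: any other predicate requires a full scan.
    return [e for partition in table.values() for e in partition.values()
            if e.get(field) == value]

insert("orders-2013-09", "42", {"customer": "Novak", "total": 180.0})
print(get("orders-2013-09", "42"))         # fast
print(find_by_field("customer", "Novak"))  # slow on large tables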
Queue Storage works as a highly scalable queue that applications can use to exchange small messages between different components and thus create indirect dependencies. Very often it is used as a queue of pending tasks to be processed by worker roles. This approach allows off-loading complex calculations from a web role to a worker role, so the web role can serve a higher volume of less demanding requests while the slow computations are done asynchronously by worker roles, which can then be easily scaled horizontally.
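The following self-contained Python sketch illustrates this off-loading pattern, with the standard-library queue module standing in for Azure Queue Storage; the handler and the message format are illustrative assumptions:

import json
import queue
import threading
import time

task_queue = queue.Queue()  # stands in for an Azure Queue Storage queue

def web_role_handler(request):
    # The web role does not do the heavy work itself; it only enqueues
    # a small task message and returns immediately.
    task_queue.put(json.dumps({"op": "make_thumbnail", "blob": request["blob"]}))
    return "202 Accepted"

def worker_role():
    while True:
        message = json.loads(task_queue.get())
        time.sleep(0.1)  # placeholder for the slow computation
        print("processed", message["blob"])
        task_queue.task_done()

threading.Thread(target=worker_role, daemon=True).start()
print(web_role_handler({"blob": "video42.mp4"}))
task_queue.join()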
Windows Azure SQL Database is a hosted version of the Microsoft SQL Server database with minor modifications, providing high availability by replicating data among three synchronous replicas. The advantage of this database service is the full compatibility of database clients with Microsoft SQL Server. This means that an existing database can be migrated from SQL Server to the cloud with only minor changes, and existing applications only have to update their connection strings to point at the hosted version of the database.
2.4 Scalability and availability of storage services
Windows Azure Storage guarantees an availability of at least 99.9 % according to the SLA. The service has a very complex multi-tier architecture [3] that helps it achieve high availability and high scalability. Blobs and Tables are an ISO 27001 certified managed service that can auto-scale to a massive volume – up to 200 terabytes – and the corresponding throughput, for all accounts created after June 7th, 2012.
An additional Content Delivery Network service can be used to increase the throughput of Windows Azure Blob Storage. The CDN works as a distributed file cache with servers spread all over the world. When a user requests a file, it is served by the closest CDN node, which on the first request loads the file from Azure Storage and then holds it in its cache for 72 hours.
The SQL Database service offers an availability of hosted databases of at least 99.9 %. Because this service works in a multi-tenant environment, where database servers host databases of different customers, there are specific policies that limit the maximum number of database connections and the overall load on the database, to provide all customers the same quality of service. To scale this service up, a new premium version is available, which creates dedicated database servers without the mentioned limits, exclusively for one customer. These premium servers are fully managed by Microsoft, so there is no additional maintenance effort for the customer.
3
Recommendations to increase service scalability
This section describes general principles that increase the scalability of a designed service [4].
3.1 Software components
1. Every component of a software system should have a clearly defined purpose. The web service should be built as a collection of interconnected software components. It is preferable to design smaller components, which are easier to design, code and maintain. Together with the increasing size of a component, the number of its dependencies also increases, and it may be very difficult to efficiently identify a performance bottleneck in a very complex component.
2. Every component should have a clearly defined communication interface. It should be possible to simply swap an inefficient implementation of a component for a new, more efficient implementation. Having a clearly defined communication interface between software components makes this process simpler and more cost effective.
3. Do not burden a component with interaction mechanism responsibility. Every software component has its responsibilities and dependencies. A mechanism for communication between components should not be implemented as part of a component, because it increases the component’s complexity and decreases its versatility.
4. Distribute data sources. It is preferable to distribute data sources across different layers of the service. All data of the service can be stored centrally in a single relational database, but with an increasing number of processing nodes the load on this central database also increases. Relational databases are in general poorly scalable, so the central database soon becomes the performance bottleneck we will need to replace. We can use the different storage services available in the cloud, with their different performance characteristics and pricing models, to create an efficient and scalable multi-tier storage.
5. Use data replication. We can replicate data so that more than one copy is stored closer to a computation node, which can then access the data faster (e.g. caching a data object in the in-memory cache of a web server). This approach works easily for read-only data, which can be stored in many identical copies without any conflicts. Using data replication for data that is continuously modified by worker roles requires implementing a synchronization mechanism that propagates changes across all nodes in the system. Depending on the target characteristics of this synchronization mechanism, it may be very difficult to implement. In heavily used systems with many changes in replicated data, many change conflicts will occur that may be difficult to resolve automatically.
3.2 Connection between software components
1. Use explicit connectors. Connectors that connect different components together should not be an integral part of the component, so that they can be easily replaced by a different implementation. This can be achieved by creating a technology-dependent communication interface that wraps an inner, technology-independent communication interface of the component (see the sketch after this list).
2. Each connector should have a clearly defined purpose. So far we have discussed the distinct purpose of a component, but this applies also to communication interfaces, which should support only the few required protocols; we should rather create several simple connectors, one per protocol, than one complex connector.
3. Use the simplest available connector. Many different connectors for different communication protocols are available, supporting different sets of functionality. The developer should choose the simplest possible connector that meets all current requirements instead of a complex connector that may be useful in the future.
4. Choose between direct and indirect dependencies. Direct dependencies between components, like a synchronous procedure call, have the benefit of an obvious control flow in the system. Indirect dependencies, where requests to call procedures are sent over a shared queue and executed asynchronously, have better scalability characteristics, because the requests can be distributed among a higher number of workers.
5. Do not put application logic into a connector. We should clearly distinguish between the purpose of a component and that of its connector. The purpose of the connector is to mediate communication between components, not to run additional application logic that would increase the connector’s complexity and slow it down.
6. Use scalable connectors. To increase the overall scalability of the application, it is important to use a scalable communication mechanism between its key components.
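As an illustration of recommendation 1, the following Python sketch separates a technology-independent component interface from an explicit, replaceable connector; all class names are illustrative:

from abc import ABC, abstractmethod

class Catalog(ABC):
    # Technology-independent interface of the component.
    @abstractmethod
    def lookup(self, item_id: str) -> dict: ...

class InMemoryCatalog(Catalog):
    def __init__(self, items):
        self.items = items
    def lookup(self, item_id):
        return self.items[item_id]

class HttpCatalogConnector:
    # Explicit connector: wraps the component's inner interface and can be
    # swapped (e.g. for a queue-based connector) without touching the component.
    def __init__(self, component: Catalog):
        self.component = component
    def handle(self, path: str) -> str:
        item_id = path.rsplit("/", 1)[-1]
        return str(self.component.lookup(item_id))

connector = HttpCatalogConnector(InMemoryCatalog({"42": {"name": "sensor"}}))
print(connector.handle("/catalog/42"))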
4
Architecture of a service designed for the cloud
Based on the characteristics of the Windows Azure services and the general recommendations for the architecture of a software service, I designed the following high-level service architecture, depicted in Figure 1.
Figure 1: Proposed high-level service architecture
4.1 Storage Tier
Following the recommendation to distribute the data sources of the designed service, we should use multiple services for storing data, so that the user load can be effectively distributed among them. The storage services should be tiered based on their scalability and availability characteristics: the service with the best scalability should process the majority of user requests, while the poorly scaling service should process only a few specific requests. This means that we use the relational database service SQL Database to store most of the service’s structured data, because it provides mechanisms for transaction processing and can easily ensure the consistency of stored data. But we cannot stress the database with the entire user load, because of its limited scalability caused by the multi-tenant hosting environment. Therefore we use Windows Azure Storage to store data cached from the database, in a form that requires minimum processing by the compute service, to lower its load and allow better scalability of the front-end servers. The disadvantage of this approach is data duplication, but the overall cost of storage capacity in Windows Azure Storage is minimal. We should effectively combine the Windows Azure Storage services: in Table Storage we can store pre-processed data addressable by the unique key that is required for effective retrieval; for binary objects, like images and other resources that we need to distribute to clients, we will use Blob Storage. The client can download files directly from Azure Storage without generating any load on our front-end servers, as long as the files are stored in a public container. To increase the distribution capacity of this service, we use the Content Delivery Network for files that will not change often; the inability to invalidate the CDN cache when a file changes can be bypassed by a mechanism that generates a new file with a new name after each change.
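One possible implementation of the renaming mechanism is to derive the blob name from a hash of its content, a common cache-busting technique; a minimal Python sketch, with the function name being an illustrative assumption:

import hashlib

def versioned_blob_name(path, content):
    # Embed a short content hash in the name, so every change produces a new
    # file name and the CDN can never serve a stale copy.
    digest = hashlib.sha1(content).hexdigest()[:10]
    stem, dot, ext = path.rpartition(".")
    return f"{stem}-{digest}.{ext}" if dot else f"{path}-{digest}"

print(versioned_blob_name("img/logo.png", b"...binary data..."))
# The generated name (e.g. img/logo-<hash>.png) is then written into the page.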
To increase the performance of the storage system, we can put the most often used data into the in-memory cache of the web server, but then we have to deal with the replication problem: when we cache data that may change, each server can cache a different version, which may lead to inconsistent results when the load balancer lets the client communicate with a different server each time. Therefore we can use the Cache Service in Windows Azure, which hosts our in-memory cache in a farm of dedicated servers. Data retrieval from this cache is much faster than from Azure Storage, so we get performance benefits, and at the same time there is no problem with different versions of cached data. This service is the most expensive type of storage, so we should put read-only data into the local cache and use this shared cache only for data that will change.
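A minimal Python sketch of the tiered lookup described above; the three dictionaries stand in for the local in-memory cache, the shared Cache Service and Azure Storage, so all names are illustrative:

local_cache, shared_cache, storage = {}, {}, {"price:42": "9.90"}

def get_value(key, read_only):
    if read_only and key in local_cache:       # cheapest tier, per server
        return local_cache[key]
    if not read_only and key in shared_cache:  # consistent shared tier
        return shared_cache[key]
    value = storage[key]                       # slowest tier: Azure Storage
    (local_cache if read_only else shared_cache)[key] = value
    return value

print(get_value("price:42", read_only=False))  # first call hits the storage
print(get_value("price:42", read_only=False))  # second call hits the shared cache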
When we use all available storage technologies in the correct combination, we can significantly increase the scalability of the designed service and also optimize its response time and lower its running costs.
4.2 Compute Tier
Clients communicate with the designed service via an HTTP REST communication interface hosted by a farm of web servers (it does not matter whether the Azure Cloud Service or the Azure Web Sites service is used, because the API can be hosted in both, and both offer vertical and horizontal scalability).
For resource-intensive, long-running compute operations it is more efficient to use asynchronous processing. This means that if an operation runs considerably longer than other requests at the web server, it is better not to process the request directly but to put it into a processing queue. A different server, running as a worker, can load this operation from the shared queue and process it asynchronously.
This approach brings the following benefits:
1. Response time optimization. The goal is to lower the overall load on the front-end servers and thus decrease the response time for client requests. Therefore we use more storage services, as described earlier, and we try not to process long-lasting requests that could exhaust compute resources and force other requests to wait in an internal queue of the web server.
2. Better load distribution. Because the network load balancer monitors only the availability of a web server and not its current load (due to the large amount of monitoring data and the requirement for extremely fast request redirection), it can happen that the load balancer redirects resource-expensive requests to a small group of servers, which then become overloaded while other servers in the farm may be idle at the same time. The network load balancer cannot effectively distribute more complex compute operations without knowledge of the servers’ load, therefore we have to implement this mechanism on our own. At design time we know which operations will be resource intensive, so we can arrange for them to run asynchronously on a different server.
3. Recovery after an error during processing. It can happen that the processing of a long-running operation fails due to an infrastructure failure or an inner exception during processing. Windows Azure Storage Queue and Azure Service Bus Queue both implement two-phase message retrieval: a worker process loads a message, and the message becomes invisible to others for a certain time. If the operation is processed successfully, the worker tells the queue to remove the message; but if the worker fails, the message with the task definition automatically reappears in the queue and can be processed by a different worker. Azure Service Bus Queue also supports moving messages to a specific queue after a certain number of failures, so they can be, for example, reviewed by a developer or an administrator (a toy sketch of this two-phase behaviour follows).
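The following toy Python class imitates the described two-phase retrieval with a visibility timeout; it illustrates the behaviour only and is not the real queue API:

import time

class TwoPhaseQueue:
    def __init__(self, visibility_timeout=30.0):
        self.messages = {}  # message id -> (payload, visible_again_at)
        self.timeout = visibility_timeout
        self.next_id = 0

    def put(self, payload):
        self.messages[self.next_id] = (payload, 0.0)
        self.next_id += 1

    def get(self):
        now = time.time()
        for mid, (payload, visible_at) in self.messages.items():
            if visible_at <= now:
                # Phase 1: hide the message instead of removing it.
                self.messages[mid] = (payload, now + self.timeout)
                return mid, payload
        return None

    def delete(self, mid):
        # Phase 2: the worker confirms success; only now is the message gone.
        self.messages.pop(mid, None)

q = TwoPhaseQueue(visibility_timeout=1.0)
q.put("resize image 42")
mid, task = q.get()
# If the worker crashed here, the message would reappear after 1 second.
q.delete(mid)  # successful processing removes it for good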
This approach also brings some disadvantages:
1. Asynchronous processing of the task. The basic paradigm of the web and the stateless HTTP protocol is that a client requests data from a server and the server responds with the data the client wants; the server cannot contact the client in case of an event on the server side. When we use asynchronous processing in the web environment, the server cannot send the client the results of the asynchronously processed operation – it can only send a confirmation that the operation was queued. When the operation finishes, there is no way to notify the client using the basic HTTP protocol. To resolve this issue we can use the modern WebSocket protocol, which allows us to keep an open connection with every client, so the server can deliver the results of the asynchronously processed operation after its completion (see the sketch at the end of this section). It is also possible to off-load traffic to Windows Azure Storage by delivering to the client only a URL address where the results are stored; the choice depends mostly on the size of the results. This behavior has to be implemented on our own, and it may significantly increase the overall complexity of the service.
2. Delay in task completion. This approach may lead to a longer overall time needed to process less complex operations than synchronous processing would need, because of the queue service delay, the wait time of the request in the queue and the notification delay.
Asynchronous processing of complex compute operations on different servers significantly
improves overall service scalability.
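A minimal sketch of the notification flow mentioned above, assuming the third-party Python package websockets (whose exact server API differs between versions); the connection registry and the notify function are illustrative assumptions:

import asyncio
import websockets  # third-party package; an assumption of this sketch

open_connections = {}  # client id -> websocket

async def handler(websocket):
    # Each client registers its open connection under its own id.
    client_id = await websocket.recv()
    open_connections[client_id] = websocket
    try:
        await websocket.wait_closed()
    finally:
        open_connections.pop(client_id, None)

async def notify_result(client_id, result_url):
    # Called when the queued operation completes; instead of the full result
    # we push only the Azure Storage URL where the results are stored.
    ws = open_connections.get(client_id)
    if ws is not None:
        await ws.send(result_url)

async def main():
    async with websockets.serve(handler, "0.0.0.0", 8080):
        await asyncio.Future()  # run forever

# asyncio.run(main())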
4.3 Client Tier
The Client Tier is discussed only briefly in this paper, because it strongly depends on the technology used, and each technology has its own rules for a well-designed application architecture.
The chosen communication protocol of the hosted web services is HTTP REST, which is supported by a wide range of technologies used for developing client applications.
While designing the architecture of the client application, it should be kept in mind that it should follow the same recommendations as the service’s architecture – clearly defined components with defined communication interfaces. When communicating with the server, the client should use client-side caching where possible, to decrease the service load.
5
Evaluation of the implemented service using load tests
I have implemented a reference service according to the described recommendations. The service was deployed to the Windows Azure cloud and some of its characteristics were benchmarked using Performance Load Testing in Visual Studio. The load tests were run on my server in a datacenter, to achieve a higher throughput of the Internet connection.
5.1 Test Case 1: Impact of a local cache on service scalability
In this test, two different implementations of a REST web service are evaluated. The first implementation loads data from an Azure SQL Database on every request; the second implementation uses a local cache for the data loaded from the database.
The service was tested in many configurations of the cloud environment to capture the impact of vertical and horizontal scaling. The parameters of the different virtual server configurations are described in Table 1.
Compute Instance     CPU (GHz)    Memory    Storage    Bandwidth    Cost / hour
Extra Small (A0)     1 (shared)   768 MB    20 GB      5 Mbps       $0.02
Small (A1)           1.6          1.7 GB    225 GB     100 Mbps     $0.08
Medium (A2)          2 × 1.6      3.5 GB    490 GB     200 Mbps     $0.16
Large (A3)           4 × 1.6      7 GB      1000 GB    400 Mbps     $0.32
Extra Large (A4)     8 × 1.6      14 GB     2040 GB    800 Mbps     $0.64
A6                   4 × 1.6      28 GB     1000 GB    1000 Mbps    $0.90
A7                   8 × 1.6      56 GB     2040 GB    2000 Mbps    $1.80
Table 1: Available virtual machine configurations in Windows Azure
Load test results for the service without the local cache:
[Bar chart: requests per second versus the number of web role instances (1, 2, 4, 8, 16), with one series each for Extra Small, Small, Medium and Large instances.]
Load test results for the service with the local cache:
[Bar chart: requests per second versus the number of web role instances (1, 2, 4, 8, 16), with one series each for Extra Small, Small, Medium and Large instances.]
The included test results clearly show the importance of a more complex tiered storage: the version of the service with an active in-memory cache can process 2–4 times more requests than the version without the cache.
Note: I expect that the service is capable of processing even more requests, particularly in the configurations with 8 and 16 instances and in the scenarios with medium and large instances; the performance bottleneck was probably the computer on which the tests were running.
5.2 Test Case 2: Scalability of binary file delivery
A significant part of the overall service traffic is caused by the transfer of images and other resources. Therefore I wanted to test which way of storing images is more effective and scalable.
The first implementation of the tested service loads images stored as binary streams in an SQL Database; storing binary files in a relational database is an approach that many current web services use. The second implementation stores the data on the local drive of the web role; this approach is suitable only for resources that are part of the application’s installation package. The third approach tests the performance of Windows Azure Blob Storage. The fourth tests the performance of Windows Azure Blob Storage combined with the Content Delivery Network.
Results for the service implementation that loads image data from the database:
[Bar chart: requests per second versus the number of web role instances (1, 2, 4, 8, 16), with one series each for Extra Small, Small, Medium and Large instances.]
Results for the service implementation that loads image data from the local disk of the web
role:
[Bar chart: requests per second versus the number of web role instances (1, 2, 4, 8, 16), with one series each for Extra Small, Small, Medium and Large instances.]
Windows Azure Blob Storage: 1845 requests/second
Windows Azure Blob Storage + CDN: 1639 requests/second
The test results demonstrate that the commonly used approach of storing binary objects in a relational database is not as effective as using Windows Azure Blob Storage. When the files are loaded from the local drive of the web role, the results are much better, but this scenario works only for files distributed together with the installation package. The Content Delivery Network showed almost the same performance as Windows Azure Storage.
6
Conclusion
This paper briefly introduced the most important services of the Windows Azure cloud platform and discussed their scalability and availability characteristics, so that the reader understands that the current generation of web services, built to run on a single web server loading data only from a relational database, cannot fully benefit from the cloud. To design the architecture of a highly available and scalable cloud service, it is necessary to take into account the wide range of offered cloud services and the recommendations discussed in Sections 3 and 4, which describe how to use these services effectively.
References
[1] MICROSOFT. Windows Azure: Microsoft's Cloud Platform [online]. 2013 [cit. 2013-11-10]. Available at: http://www.windowsazure.com/en-us/
[2] BONDI, André B. Characteristics of scalability and their impact on performance. In: Second International Workshop on Software and Performance, WOSP 2000: proceedings, Ottawa, Canada, September 17–20, 2000 [online]. New York: Association for Computing Machinery, 2000 [cit. 2013-11-10]. ISBN 1-58113-195-X. Available at: http://dl.acm.org/citation.cfm?id=350432
[3] CALDER, Brad et al. Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency. In: SOSP '11 – Proceedings, October 23–26, 2011, Cascais, Portugal [online]. 2011 [cit. 2013-05-24]. Available at: http://sigops.org/sosp/sosp11/current/2011-Cascais/printable/11-calder.pdf
[4] TAYLOR, Richard N., MEDVIDOVIĆ, Nenad. Software architecture: foundations, theory, and practice. Hoboken, NJ: Wiley, 2010, pp. 468–475. ISBN 978-0-470-16774-8.
Support for Partner Relationship Management Internet Data
Sources in Small Firms
Jan Ministr
VŠB – Technical University of Ostrava
Faculty of Economics
Sokolská třída 33, 701 21 Ostrava
jan.ministr@vsb.cz
Abstract
The paper analyses the processing of data sources that can increase the competitiveness of small businesses using Partner Relationship Management systems. Timely and accurate information about a business partner decisively affects the level of cooperation with the partner and reduces the risk associated with concluded deals. The paper describes the possibilities of processing the data sources of small businesses in terms of searching them and further processing them in combination with the information that is normally stored in enterprise information systems. In conclusion, the article discusses the availability of such systems in terms of cost and the technology required.
Abstrakt
Příspěvek analyzuje možnosti zvýšení konkurenceschopnosti malých firem v oblasti Partner Relationship Managementu. Včasná a pravdivá informace o obchodním partnerovi ovlivňuje rozhodujícím způsobem rozsah spolupráce s daným partnerem a snižuje riziko uzavřených obchodů. Příspěvek popisuje možnosti zpracování internetových datových zdrojů z hlediska jejich vyhledávání a dalšího zpracování v kombinaci s informacemi, které jsou standardně uchovávány v podnikových informačních systémech. V závěru článku je diskutována dostupnost takových systémů z hlediska jejich nákladů a požadované technologie.
Keywords
Partner Relationship Management, Data Source, Competitiveness, Resource Description Framework, Web Ontology Language
Klíčová slova
Partner Relationship Management, datový zdroj, konkurenceschopnost, Resource Description Framework, Web Ontology Language
1
Introduction
Partner Relationship Management (PRM) systems support the management of relationships with external, mainly business partners. Alternatively, these systems are also referred to as Customer Relationship Management (CRM) or Customer Intelligence (CI) systems. The activities of these systems can be understood as a continuous, ethical process of acquiring and evaluating data, information and knowledge from around the company in order to achieve a better position in the market [5]. The data sources of PRM systems can be divided into the following two categories:
• internal corporate data sources, which have a stable data structure in the form of database records;
• external data sources, which include in particular data available on the Internet [1]; these have an unstructured form and are primarily stored in:
  – special public portals;
  – corporate portals;
  – social networks, blogs and Internet discussions;
  – social electronic media (e-magazines, e-zines, etc.);
  – e-mail messages, etc.
The processing of web data sources can be divided, based on the stability of their data structures, into the processing of:
• Internet data with a relatively stable structure, stored on special public portals;
• non-structured texts, which represent an additional source of information, but can significantly impact in particular the business of the corporation [2].
The processing of external information sources in the context of PRM provides special opportunities: companies that use them gain a significant competitive advantage [3]. The required information mainly concerns business partners and customers, competitors (in order to increase the company’s own sales success) and potential employees.
2
Processing of Web Data Sources
Data from Internet data sources can be obtained in two ways:
• Indirectly (offline, one-off or repeatedly), as a service provided by specialized software companies. The data are usually structured; however, this method has a certain time lag and is very costly.
• Directly (online, continuously), as a service of a component of the corporate information system (the PRM add-on). The key problem here is the identification of relevant data sources. In the Czech Republic, there are several public portals providing paid or free information on companies and on contacts of their key employees. These data sources include primarily the following:
  – www.ipoint.cz,
  – www.justice.cz,
  – www.statnisprava.cz,
  – www.zivefirmy.cz,
  – www.abc.cz.
2.1
Processing Web Data Sources with Relatively Stable Data Structure
These sources provide current information on the company, such as: Name, ID Number (IČO), Tax ID (DIČ), Registered Office (Street, Postal Code, City), Contact (P.O. BOX, Telephone, Fax, E-mail, Website), the state of the company (for instance, pending execution procedure, bankruptcy, etc.), and the names and contacts of key employees and officers. The information can be downloaded using open-source or other tools, such as Visual Web Ripper. The tool downloads the required data, converts them to the necessary format and stores them in the corporate database. A drawback of this approach to data processing is the relative instability of Internet pages and the non-standard description of the displayed details. This fact requires monitoring the structure of the selected data resources and subsequently updating the data stored in RDF (Resource Description Framework), which contains a description of both the metadata and the procedures for processing them. For more advanced search, the Web Ontology Language (OWL) must be used, which extends the features of RDF in the area of creating ontological structures, as is shown, for instance, in Figure 1.
[Diagram: relationships of a person to a company – “he (she) works for the company”, with the subtypes the owner, the member of the management and the employee.]
Figure 1. Ontological Model of Expressing the Relationship to the Company (Source: own)
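As an illustration of how such a resource description could be recorded, the following Python sketch uses the rdflib library; the namespace, class and property names are illustrative and not taken from the project:

from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/prm#")  # illustrative namespace
g = Graph()
g.bind("ex", EX)

# Describe one web data source and its processing metadata.
src = EX["justice_cz"]
g.add((src, RDF.type, EX.Source))
g.add((src, RDFS.label, Literal("www.justice.cz")))
g.add((src, EX.searchBy, EX.IdNumber))    # cf. "ResourceSearchBy" below
g.add((src, EX.resultType, EX.Subject))   # cf. "ResourceResultType" below
g.add((src, EX.tool, Literal("Visual Web Ripper")))

print(g.serialize(format="turtle"))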
The structure of the described software solution is based on the definition of the Web information sources using object classes (Figure 2), which are specified in more detail for the individual tool used for downloading the data (for instance, Visual Web Ripper). The class “Source” is specified by three elements defining its purpose [7]:
• the input parameter for searching in the defined resource (“ResourceSearchBy”);
• the type of result for the information obtained from the resource (“ResourceResultType”);
• the identification of the tool used for data downloading (“Tool”).
Another class is “Query”, representing every query entered into the system by the user and used to initiate a new retrieval of information from the Internet. The class “Result” is the super-class of all result types; it contains the data field Note, used for recording information that cannot be included in the defined data items of the concrete subclass used for the result type. A subclass of the class “Result” is the class “ResultEntity”, defining the data items specific to the entity. Every result of a query execution contains information on the resource that was used and the query by which it was created.
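The class model can be sketched in Python as plain data classes; the field names follow the text above, and anything beyond that is an assumption:

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Resource:
    name: str
    priority: int
    search_by: str    # "ResourceSearchBy": Id. Number or Name
    result_type: str  # "ResourceResultType": e.g. subject of search
    tool: str         # e.g. "Visual Web Ripper"

@dataclass
class Query:
    search_text: str
    query_time: datetime = field(default_factory=datetime.now)

@dataclass
class Result:
    resource: Resource  # every result records the resource that was used ...
    query: Query        # ... and the query by which it was created
    note: str = ""      # catch-all field for data with no dedicated item

@dataclass
class ResultEntity(Result):
    address: str = ""
    establishment_date: str = ""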
A benefit of this processing type is obtaining up-to-date information from verified resources in the required and clear form of presentation. An issue is the need to acquire “new” knowledge of working with the semantic Web, which increases the demands put on the personnel in charge of administering the PRM system.
[Class diagram with the classes Resource, ResourceSearchBy, ResourceResultType, Tool, Query, Result and ResultSubject; the attributes shown include Name, Priority, Id. Number, Subject of search, Visual Web Ripper, Note, Query time, Search text, Address and Establishment date.]
Figure 2. Chart of Classes with Relatively Stable Data Structures (Source: own)
2.2
Processing Unstructured Web Data Sources
The overwhelming majority of data stored on the Internet has the form of unstructured texts, which must be converted into a clear and uniform user environment. The whole process can be divided into the following four stages [6]:
• monitoring and collection of source data;
• administration and storage of source data;
• data processing using analytic tools;
• presentation of analytic results.
When processing unstructured data for the aforementioned purposes, one must first define the target of the Internet monitoring; in our case, this is an individual or a company we wish to monitor. Further, access to the defined social networks and discussion forums must be obtained. Here the current laws of the relevant country must be observed (for instance, not to pose as someone else, etc.). The selected targets are then monitored on the Internet, and if the search engine finds any “interesting” information on the target, it is stored on the local server in a smart database, which allows, first of all, fast searching and data indexing. This data is subsequently analysed and the results of the analysis are reported to the end user.
On the level of the selected text database, analytic functions sorting data by time, source, type, etc., are normally used. At this level of unstructured text processing, analytic functions such as filtering by a selected word or a person’s name can also be used. These operations are usually carried out during later processing stages using special software tools, which must be calibrated according to the domain areas, which is not usual for database-level tools [8]. The processing of unstructured data can be divided into the following areas:
• content analysis;
• entity identification (authorship analysis);
• relations analysis;
• sentiment analysis.
Content analysis is one of the main areas of unstructured text processing. The goal is to identify key words that appear in the text. To achieve this, words carrying semantic information, i.e., combinations of selected nouns and adjectives, verbs and numerals, are selected from the text. At the same time, synonyms are replaced with one selected proxy term, so that terms with equal meaning are unified. The text modified in this way is then used for deriving the topics of the different texts, which can have, for instance, the form of a list of keywords. The whole process of content analysis consists of several main steps, described in the following paragraphs [4], [10]:
• Tagging of word types using external libraries (such as a POS Tagger). From the tagged words, keywords are selected, mostly nouns, verbs and adjectives.
• Processing of synonyms permits grouping of words according to their meaning, not only depending on their form, but also with an option of creating relationships between them.
• Lemmatization and stemming try to convert words to their basic forms. Lemmatization is more time-consuming, as the processing is based on the context of the word. Words found in different forms must be recognized as identical; then equivalent words can be grouped and it is possible to avoid multiplicity of keywords [9].
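A condensed Python sketch of these steps using the NLTK library; the listed NLTK corpora must be downloaded first, and the keyword filter is a simplification:

import nltk
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer

# Requires: nltk.download("punkt"), nltk.download("averaged_perceptron_tagger"),
# nltk.download("wordnet")

text = "The company bought new motorbikes and bicycles for its employees."
tagged = nltk.pos_tag(nltk.word_tokenize(text))  # step 1: tag word types

keywords = [w for w, tag in tagged
            if tag.startswith(("NN", "VB", "JJ"))]  # nouns, verbs, adjectives

lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(w.lower()) for w in keywords]  # step 3: base forms

# Step 2: group synonyms via WordNet so that equal-meaning terms unify.
synonyms = {l.name() for s in wordnet.synsets("motorbike") for l in s.lemmas()}
print(lemmas)
print(synonyms)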
[Diagram: ontology for a person with the branches Employer, Interests and Education; the Interests branch expands into Squash, Bicycle and Motorcycle, and the Motorcycle node carries a table of synonyms (motorcycles, motorbike, motorbikes, bike, other synonyms).]
Figure 3. Example of Ontology (Source: own)
Ontology of decision-making trees is a different, simplified approach to the analysis of unstructured texts. It is based on the pre-definition of keywords, i.e., entity attributes, in advance by the user. From these attributes, a query in the form of a decision-making tree is generated, with the nodes of the tree being the queried entity attributes (areas of the user’s information needs with respect to the entity), each expressed by a table of synonyms for the queried area. The detail level of the query is then expressed by the level of expansion of the given node, see Figure 3. The structure of the query can be optimized using decision-tree induction theory or Occam’s razor [2]. A shortcoming of this approach is that the user-generated list of entity attributes may omit key attributes that the approach described above could identify directly from the unstructured texts using analytic tools; for instance, we may monitor attributes such as sports for a person, but we cannot find out from the given data that he has started gambling. In general, this approach is appropriate in the small business segment, as the application development is less costly.
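A minimal Python sketch of the synonym-table matching behind this approach; the attribute names and synonym sets are illustrative:

# Pre-defined entity attributes, each expanded by a table of synonyms
# (cf. the Motorcycle node in Figure 3).
attribute_synonyms = {
    "motorcycle": {"motorcycle", "motorcycles", "motorbike", "motorbikes", "bike"},
    "squash": {"squash"},
}

def match_attributes(text):
    words = set(text.lower().split())
    return [attr for attr, synonyms in attribute_synonyms.items()
            if words & synonyms]

print(match_attributes("He posted photos of his new motorbike yesterday"))
# -> ['motorcycle']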
2.3
Presentation of Analysis Results
The results of the analyses are presented in three different forms:
• the detail of the found record;
• a graph, used primarily for visualization of the development of a phenomenon in time (line and bar graphs), visualization of percentages for the different entity types (pie charts), or visualization of relationships (network charts, polygons);
• a combination of graphs, i.e., a special type of visualization such as the spectrum graph, which is often used in combination with Liferay, a corporate portal based on Java EE allowing easy integration of information, applications and processes. The portal presents information to its visitors via its Web interface, which can be customized to suit the current needs of different users.
The form of presentation of the results is a very important part of the application, and it basically sells it, because the user must quickly comprehend the information in a wider context.
Figure 4. Example of Data Analysis Presentation (Source: own)
3
Specifics of the PRM Add-on in Small Business
In implementing the PRM add-on in small companies, one must take into account the purpose of the future utilization of the module. Most current users in this segment will in practice not use all the options mentioned above, and at the same time, the experience of the author of this article suggests that they do not know precisely which type of information they require and in what form they would like to obtain it. Therefore, one must very carefully moderate the sometimes unrealistic ideas of users that everything they would like to know is freely available on the Internet, and at the same time advise them on the costs of obtaining such information. Most of the current solutions offered on the market solve this problem by an offline approach or by relatively expensive one-off analyses. For these reasons, the author recommends very precise identification of the domain areas and of the types of information resources on the Internet. After a serious presentation of the options and costs of the PRM module add-on, most customers will be satisfied with an information system featuring online processing of Internet resources with a relatively stable data structure: within the small business segment, most users require basic information on the business partner or customer (registered office, economic condition). Processing of unstructured data from the Internet allows obtaining more “specific” information on the entity, so one can become better acquainted with the characteristics of an “unknown” business partner or with new information on existing entities; however, every such solution is fundamentally unique and therefore, at present, more costly. These customers very rarely want processing in the sentiment analysis area, as they are mostly in direct contact with their customers and know their opinions. Further, one must take into account the technological limits of the companies, and therefore the development of such applications will probably be directed towards cloud computing platforms.
4
Conclusion
Currently, it is possible to support small businesses in the area of obtaining customer data using an approach that combines data from the primary ERP system with a PRM add-on. For processing data using the PRM module, the practical options are structured data (or data with a relatively stable structure) and unstructured data. For deployment in small companies, it is important to define the purpose of processing such data. However, what is decisive for the deployment of PRM is the cost of the prepared application, as at present no typical solutions are on the market. For requirements limited to standard information, offline analysis can be used, and it is possible to make a selection from the companies processing data on economic entities on the domestic market. In the case of more complex requirements with respect to the volume and structure of the processed information, a turn-key solution would entail a large financial burden.
5
Acknowledgements
The paper is based on the experience obtained in solving the OPPI project of the Czech Ministry of Industry “Use of ontology and semantic Web technologies in a CRM solution”, in which the approaches to Internet information resource processing described in the article were applied.
References
[1] D. M. Boyd, N. B. Ellison. Social Network Sites: Definition, History, and Scholarship. Journal of Computer-Mediated Communication, vol. 13, no. 1, pp. 210–230, 2007, doi: 10.1111/j.1083-6101.2007.00393.x.
[2] F. Nguyen, T. Pitner. Information System Monitoring and Notifications Using Complex Event Processing. In Proceedings of the 5th Balkan Conference in Informatics (BCI), Novi Sad, Serbia, pp. 211–216, 2012, doi: 10.1145/2371316.2371358.
[3] J. Ministr, J. Ráček. Analysis of Sentiment in Unstructured Text. In IDIMT-2011: Interdisciplinarity in Complex Systems, pp. 299–304. Jindřichův Hradec: Trauner Verlag, Linz, 2011.
[4] V. A. Yatsko, T. N. Vishnyakov. A method for evaluating modern systems of automatic text summarization. Automatic Documentation and Mathematical Linguistics, vol. 41, no. 3, pp. 93–103, 2007, doi: 10.3103/S0005105507030041.
[5] J. Hančlová, P. Doucek. The Impact of ICT Capital on Labor Productivity Development in the Sectors of the Czech Economy. In IDIMT-2012: ICT Support for Complex Systems, pp. 123–134. Jindřichův Hradec: Trauner Verlag, Linz, 2012.
[6] D. Allemang, J. Hendler. Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL. 2nd ed. Waltham: Morgan Kaufmann Publishers, 2011. ISBN 978-0-12-385965-5.
[7] J. Hebeler, C. Evans, J. Taylor. Semantic Web Programming: Modeling in RDF, RDFS, and OWL. 1st ed. Indianapolis: Wiley Publishing, 2009, 616 pp. ISBN 978-0-470-41801-7.
[8] L. Yu. Introduction to the Semantic Web and Semantic Web Services. 1st ed. Chapman and Hall/CRC, 2007, 368 pp. ISBN 978-1-58488-933-5.
[9] K. Toutanova, D. Klein, C. D. Manning, Y. Singer. Feature-rich part-of-speech tagging with a cyclic dependency network. In Proceedings of HLT-NAACL, pp. 252–259, 2003.
[10] S. Evert, E. Giesbrecht. Is part-of-speech tagging a solved task? An evaluation of POS taggers for the German Web as corpus. In Proceedings of the Web as Corpus Workshop (WAC5), pp. 27–35, 2009.
10th Summer School of Applied Informatics
Proceedings
10. letní škola aplikované informatiky
Sborník příspěvků
Bedřichov, 6.–8. září 2013
Editoři/Editors:
prof. RNDr. Jiří Hřebíček, CSc.
ing. Jan Ministr, Ph.D.
doc. RNDr. Tomáš Pitner, Ph.D.
Vydal: Nakladatelství Littera 2013
1. vydání
Tisk: Tiskárna Knopp, Černčice 24, 549 01 Nové Město nad Metují
ISBN 978-80-210-6058-6