Abstracting code specific concepts to a graphical representation by pattern matching and refactoring
David Flenstrup
Kongens Lyngby 2009
IMM-M.Sc.-2009-43
Technical University of Denmark
Department of Informatics and Mathematical Modeling
Richard Petersens Plads, DTU – Building 321
DK-2800 Kongens Lyngby, Denmark
Phone +45 45253351, Fax +45 45882673
reception@imm.dtu.dk
www.imm.dtu.dk
Abstract
This thesis deals with automatic analysis of Microsoft Dynamics NAV (NAV) application code. NAV (formerly
Navision) is an advanced Enterprise Resource Planning (ERP) system like SAP and Axapta, which can handle
anything from accounting to stock control for a company. NAV is customized to the individual enterprise
using the application language C/AL. The NAV configuration application has grown incrementally from
version to version for more than 20 years, and it is today a highly complex piece of software.
With the general improvements that have emerged in languages and technologies, the NAV organization
now faces a number of choices. It is clear that the application code has to be reorganized, but it is
unclear whether the code should be kept in C/AL or moved to C#, as both options have obvious benefits.
The aim of this thesis is to contribute knowledge to this process by uncovering what exists today and by
modeling selected suggestions for refactoring the present C/AL code.
Our approach is to identify, analyze, and implement recognition of software patterns in the NAV application
code. The software patterns we focus on define relations between objects.
The application code’s data model consists of table objects which, among other things, save all data for the
enterprises. We use Unified Modeling Language (UML) diagrams to describe the relations we identify
between tables. The concepts we identify are relationships which in UML terms are described as
containment, aggregation, and generalization.
This gives the NAV organization insight into the implications that a change in one table object will have for
other objects. An example of one of the relations we identify is the relation between the Customer table,
which keeps information on all the company’s customers, and the Customer Bank Account table, which
contains information regarding the customers’ bank accounts. From our analysis we can see that the
existence of objects of the type Customer Bank Account is conditioned by the existence of a Customer
object. This is described with the concept containment in UML.
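The containment test behind the Customer example can be illustrated with a small classification function. This is a minimal sketch under simplified, assumed metadata; the thesis's actual implementation works on parsed C/AL code, and the function and parameter names here are hypothetical.

```python
# Hypothetical sketch, not the thesis's implementation. A "relation" stands in
# for a C/AL TableRelation property: a field in the child table that refers to
# the primary key of a parent table.

def classify(child_primary_key, relation_field):
    """Containment: the referring field is part of the child's primary key,
    so a child record cannot exist without its parent record. Otherwise the
    relation is only a plain association."""
    if relation_field in child_primary_key:
        return "Containment"
    return "Association"

# "Customer Bank Account" has the primary key (Customer No., Code) and a
# TableRelation from "Customer No." to the Customer table:
print(classify(["Customer No.", "Code"], "Customer No."))      # Containment
# A reference from a non-key field is only an association:
print(classify(["Customer No.", "Code"], "Salesperson Code"))  # Association
```

The rule captures exactly the conditioned-existence property described above: deleting the parent key value invalidates the child's identity.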
We show that it is possible to analyze the application code and extract concepts that provide an
overview of selected relations. Furthermore, we show that a dynamic graphical display of the relations is
preferable. We have developed an intuitive way to present our results: with the filtering methods
provided by the Concept Viewer tool, it is possible to make sense of even very large diagrams.
We extended the project scope along the way and examined the possibility of identifying specific
concepts from the accounting ontology Resources, Events, and Agents (REA) in the C/AL code. We found
that Events can easily be identified, but that classifying Resources and Agents involves complications
which require a refinement of the chosen approach.
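To give a flavor of what identifying REA Event candidates in the table model can look like, here is a hedged sketch of a naming heuristic. The marker suffixes below are illustrative assumptions, not the classification rules developed in the thesis; in NAV, business events are recorded by posting journal lines into ledger entry tables, so table names are one plausible signal.

```python
# Illustrative assumption: these suffixes are a guess at Event-like table
# names, not the thesis's actual REA classification rules.
EVENT_NAME_MARKERS = ("Ledger Entry", "Journal Line", "Register")

def rea_event_candidate(table_name):
    """Flag a NAV table as a candidate REA Event based on its name alone."""
    return any(table_name.endswith(marker) for marker in EVENT_NAME_MARKERS)

print(rea_event_candidate("Cust. Ledger Entry"))  # True
print(rea_event_candidate("Customer"))            # False
```

A heuristic like this is cheap but coarse, which mirrors the finding above: Events surface easily from the schema, while Resources and Agents need more than naming to classify.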
Résumé
This thesis deals with automated analysis of the Microsoft Dynamics NAV (NAV) application code. NAV
(formerly Navision) is an advanced Enterprise Resource Planning (ERP) system, similar to SAP and Axapta, that can
handle everything from accounting to inventory management in a company. NAV is tailored to the individual enterprise in
the application language C/AL. The NAV configuration application has grown incrementally from version to version for
more than 20 years and is today an extremely complex piece of software.
As general improvements in languages and technologies have emerged, the NAV organization now faces
a number of choices. It is clear that the application code must be reorganized. It is, however, unclear whether the code
should be kept in C/AL or moved to C#, as there are obvious advantages to both options.
In this thesis we seek, through extensive code analysis, to contribute knowledge to this process by
uncovering what exists today and by modeling selected suggestions for code rewrites based on
the present C/AL code.
Our approach is to identify, analyze, and implement recognition of software patterns in the NAV
application code. The software patterns we focus on express relations between objects.
The application code's data model consists of table objects which, among other things, are used to store all data for
the enterprise. We use Unified Modeling Language (UML) diagrams to describe the relations we
identify between tables. The concepts we identify are the relationship notions described in UML terms
as containment, aggregation, and generalization.
This gives the NAV organization insight into the implications a change in one table object will have for
other objects. An example of one of the relations we identify is the relation between the Customer table,
which holds information on all the company's customers, and the Customer Bank Account table, which
holds information on the customers' bank accounts.
From our analysis we can see that the existence of objects of type Customer Bank Account is conditioned on the existence
of a Customer object. This is described with the UML concept containment.
We show that it is possible to analyze the application code and extract concepts that provide an
overview of selected relations. Furthermore, we show that a dynamic graphical display of the relations is
preferable. We have developed an intuitive way to present our results, in which meaning can be found
in even very large diagrams using the filtering methods offered by the Concept Viewer tool.
We extended the project's goals along the way and examined the possibility of identifying specific
concepts from the accounting ontology Resources, Events, and Agents (REA) in the C/AL code. We find that
Events can easily be identified, but that the classification of Resources and Agents involves complications that
require further development of the chosen approach.
Preface
This thesis was prepared in collaboration with the Department of Informatics and Mathematical Modeling
(IMM) at the Technical University of Denmark (DTU) and the Microsoft Dynamics NAV Application Team
(APP Team) at Microsoft Development Center Copenhagen (MDCC). It was prepared in
fulfillment of the final requirement for the degree of Master of Engineering in Computer Science.
The thesis is the result of work carried out from December 2008 to July 2009 with a workload of 35
ECTS credits.
Kongens Lyngby, July 2009
David Flenstrup
Acknowledgements
I would like to thank Microsoft and especially Jesper Kiehn for his commitment to my project. I am really
happy with the subject we chose to investigate, and had it not been for Jesper's passion, his
humongous insight into NAV and REA, and his power of persuasion, this topic could not have been covered.
From DTU I would like to thank Peter Falster and Jeppe Revall Frisvad. I am very grateful for the extensive
interest and support my project has received. It has truly been a great experience and an invaluable
resource for me.
Table of Contents
Abstract ....................................................................................................................................................... iii
Résumé ........................................................................................................................................................ v
Preface ....................................................................................................................................................... vii
Acknowledgements ..................................................................................................................................... ix
List of Figures.............................................................................................................................................. xv
List of Tables ............................................................................................................................................. xvii
List of Formulas ......................................................................................................................................... xix
List of Code Samples .................................................................................................................................. xxi
1 Introduction .......... 1
    1.1 Background for the project .......... 1
    1.2 Project Aim .......... 1
2 Enterprise Resource Planning Domain .......... 3
    2.1 Introduction to Enterprise Resource Planning (ERP) (3) .......... 3
    2.2 ERP Business Opportunities (10) .......... 4
    2.3 Microsoft Dynamics NAV (NAV) (11) .......... 5
        2.3.1 NAV architecture .......... 7
        2.3.2 Application Code (14) vs. Product Code .......... 7
        2.3.3 Application Language Transition .......... 8
    2.4 Fundamentals of the C/AL language .......... 9
        2.4.1 C/AL Design Criteria .......... 9
        2.4.2 C/AL Syntax .......... 9
3 Related Work .......... 13
    3.1 What has been accomplished inside Microsoft .......... 13
        3.1.1 Partial C/AL parser .......... 13
        3.1.2 Codedub .......... 14
        3.1.3 Navision Developer Toolkit (25) .......... 14
        3.1.4 Object Map .......... 14
    3.2 What has not been accomplished inside Microsoft .......... 14
    3.3 What has been accomplished outside Microsoft .......... 14
        3.3.1 Refactoring from Code to UML .......... 14
        3.3.2 UML designers .......... 16
        3.3.3 Refactoring to Resource, Events and Agents (REA) (30) .......... 16
    3.4 What has not been accomplished outside Microsoft .......... 16
    3.5 Related work summary .......... 16
4 Foundational Relations (33) .......... 17
    4.1 Definition of the Is_a relation .......... 17
    4.2 Definition of the Part_of relation .......... 17
5 Unified Modeling Language (UML) (34) .......... 19
    5.1 UML Terminology Overview .......... 19
        5.1.1 Dependency .......... 19
        5.1.2 Generalization .......... 20
        5.1.3 Association .......... 20
        5.1.4 Aggregation .......... 20
        5.1.5 Containment .......... 20
6 Analysis .......... 21
    6.1 Identifying Containment Pattern .......... 21
        6.1.1 The role of the Sales Header and Sales Line tables in NAV .......... 21
        6.1.2 Manual analysis of Containment pattern .......... 24
        6.1.3 Variations in containment pattern .......... 25
        6.1.4 Manual identification of additional containments .......... 26
    6.2 Identifying Generalization Pattern .......... 27
        6.2.1 Code smell Large Class and solution Extracting Class .......... 28
        6.2.2 Manual analysis of the Generalization pattern .......... 28
        6.2.3 Manual analysis of the generalization relationship from Sales Line .......... 30
    6.3 Identifying REA Concepts .......... 31
        6.3.1 The Resource, Event and Agent (REA) model (40) .......... 31
        6.3.2 Introduction to Accounting Theory .......... 34
        6.3.3 Naming of tables in NAV .......... 36
        6.3.4 REA concepts in NAV .......... 36
        6.3.5 Manual analysis of REA Events in NAV .......... 37
7 Tools .......... 41
    7.1 .NET Framework .......... 41
    7.2 Interoperability .......... 41
    7.3 F# .......... 41
        7.3.1 Language syntax .......... 42
    7.4 LEX, YACC and Abstract Syntax Trees (54) .......... 45
    7.5 C# .......... 46
    7.6 Regular expressions (58), (59) .......... 47
    7.7 Lambda expressions (60), (61) .......... 47
    7.8 LINQ (63), (64) .......... 48
        7.8.1 Example 1: Without LINQ .......... 49
        7.8.2 Example 2: With LINQ .......... 50
8 Implementation .......... 51
    8.1 Problem with Parser .......... 51
        8.1.1 CALParser AST to XML AST .......... 51
    8.2 New data representation .......... 51
    8.3 CAL parser extension for table relations .......... 52
        8.3.1 Generic parser vs. specific parsing rules .......... 53
        8.3.2 Parser Implementation .......... 53
    8.4 Parsing rules (68) .......... 54
    8.5 LINQ for Querying .......... 55
    8.6 Lambda Expressions in action .......... 57
    8.7 Graph generation .......... 58
        8.7.1 Microsoft Automatic Graph Layout (69) .......... 59
    8.8 Performance boosts .......... 59
    8.9 Algorithm design .......... 60
        8.9.1 Containment .......... 60
        8.9.2 Inheritance .......... 61
        8.9.3 Inheritance – Reusing generalization objects .......... 62
        8.9.4 Implementation of REA identification .......... 62
        8.9.5 Limitations by approach and solution suggestion .......... 64
        8.9.6 Analysis of initial REA results .......... 65
9 Results .......... 67
    9.1 TableRelation Analysis and TableRelation Parser Quality Assurance .......... 68
        9.1.1 Matches on Key fields .......... 68
        9.1.2 Matches on all fields .......... 68
    9.2 Results for Containment Pattern .......... 69
    9.3 Results for Generalization Pattern .......... 71
    9.4 Results from refactoring the generalization objects .......... 72
    9.5 The Concept Viewer and its output .......... 72
        9.5.1 Sales Header and Sales Line – Aggregations and Containments .......... 73
        9.5.2 Sales Header and Sales Line – Associations .......... 74
        9.5.3 Sales Header and Sales Line – Associations refactored via Generalization objects .......... 74
        9.5.4 Sales Header and Sales Line – Associations refactored via refactored Generalization objects .......... 75
        9.5.5 Sales Line, Standard Sales Line, and Sales Line Archive – Reused Generalization objects explored .......... 75
        9.5.6 Sales Line, Standard Sales Line, and Sales Line Archive – Generalization objects explored .......... 76
        9.5.7 Sales Line, Standard Sales Line and Sales Line Archive – Associations explored .......... 76
        9.5.8 Item – Containments and Aggregations mapped with reused Generalization objects .......... 77
        9.5.9 Item – Reused Generalization objects containing Item .......... 78
    9.6 Feedback from the Application team .......... 78
10 Conclusion .......... 81
    10.1 Parsing the C/AL application .......... 81
    10.2 UML Relationship pattern matching .......... 81
        10.2.1 Generalization .......... 81
        10.2.2 Containment .......... 81
    10.3 The Concept Viewer .......... 82
    10.4 Resource, Events and Agents (REA) relationship pattern matching .......... 82
11 Future work and Perspective .......... 83
12 Abbreviations .......... 85
13 Works Cited .......... 87
14 Appendix .......... 91
    14.1 Content on the enclosed DVD .......... 91
List of Figures
Figure 2-1 Integration Data Flow (5) ............................................................................................................. 3
Figure 2-2 ERP process flow (5) .................................................................................................................... 4
Figure 2-3 Home page of an Order Processor in Microsoft Dynamics NAV 2009 (Role Tailored Client) .......... 6
Figure 2-4 Current C/AL Compilation for new and old product stack ............................................................ 8
Figure 3-1 UML diagram manually created with Visio ..................................................................................15
Figure 3-2 UML diagram generated from code with MagicDraw ..................................................................15
Figure 5-1 UML mapping of a dependency ..................................................................................................19
Figure 5-2 UML mapping of a generalization ...............................................................................................20
Figure 5-3 UML mapping of an association ..................................................................................................20
Figure 5-4 UML mapping of an aggregation.................................................................................................20
Figure 5-5 UML mapping of a containment .................................................................................................20
Figure 6-1 Overview of all Sales Orders in the Role Tailored Client ..............................................................22
Figure 6-2 New Sales Order in the Role Tailored Client ................................................................................23
Figure 6-3 Viewed as an OO UML diagram ..................................................................................................24
Figure 6-4 Viewed as a database diagram with primary keys (PK) and foreign keys (FK) ..............................24
Figure 6-5 Table field association ................................................................................................................27
Figure 6-6 Multiple associations refactored to a single association..............................................................27
Figure 6-7 Selection box with elements .......................................................................................................29
Figure 6-8 Property window displaying the OptionString of the field Type...................................................29
Figure 6-9 Candidates for Generalization Refactoring..................................................................................30
Figure 6-10 Refactoring multiple associations to single associations............................................................31
Figure 6-11 The cookie company (42)..........................................................................................................33
Figure 7-1 The structure of an abstract syntax tree .......................................................................46
Figure 8-1 Arrow heads for Containment, Aggregation and Generalization .................................................59
Figure 8-2 Illustrating procedure references between CU12 and CU13 ........................................................64
Figure 9-1 Template, Batch, Line pattern found in the initial Containment work .........................................67
Figure 9-2 Template, Batch, Line pattern in the Concept Viewer .................................................................67
Figure 9-3 The Concept Viewer ...................................................................................................................73
Figure 9-4 Sales Header and Sales Line – Aggregations and Containments ..................................................74
Figure 9-5 Sales Header and Sales Line Associations....................................................................................74
Figure 9-6 Sales Header and Sales Line Generalizations ...............................................................................75
Figure 9-7 Sales Header and Sales Line Generalizations refactored..............................................................75
Figure 9-8 Sales Line, Standard Sales Line, and Sales Line Archive reused Generalization objects explored ..76
Figure 9-9 Sales Line, Standard Sales Line, and Sales Line Archive Generalization objects explored .............76
Figure 9-10 Sales Line, Standard Sales Line, and Sales Line Archive – Associations explored ........................77
Figure 9-11 Sales Line, Standard Sales Line and Sales Line Archive – Associations explored .........................77
Figure 9-12 Item – Reused Generalization objects containing Item ..............................................78
List of Tables
Table 2-1 NAV Customer segment................................................................................................................ 5
Table 6-1 Sales Header and Sales Line Containment ....................................................................................24
Table 6-2 Sales Header Containment in detail .............................................................................................25
Table 6-3 Purchase Header and Purchase Line Containment .......................................................................26
Table 6-4 Profile Questionnaire Header and Profile Questionnaire Line Aggregation ...................................26
Table 6-5 Generalizations in Sales Line ........................................................................................................30
Table 6-6 Color definition for Figure 6-9 ......................................................................................................31
Table 6-7 The double-entry accounting system ...........................................................................................34
Table 6-8 Table naming in NAV ...................................................................................................................36
Table 6-9 Code for posting Journals to Entries.............................................................................................38
Table 7-1 Sync vs. Async execution .............................................................................................................45
Table 8-1 REA results from step 1 ...............................................................................................................63
Table 8-2 REA candidate sets ......................................................................................................................65
Table 9-1 Key fields .....................................................................................................................................68
Table 9-2 Unique key fields .........................................................................................................................68
Table 9-3 All fields ......................................................................................................................................69
Table 9-4 All unique fields ...........................................................................................................................69
List of Formulas
Formula 4-1 is_a .........................................................................................................................................17
Formula 4-2 part_for ..................................................................................................................................17
Formula 4-3 has_part..................................................................................................................................18
Formula 4-4 part_of ....................................................................................................................................18
List of Code Samples
Code 2-1 Table 3 Payment Terms
Code 6-1 Requirements for Generalizations
Code 6-2 Requirements for Generalizations
Code 7-1 Example with complex numbers – Definition of active patterns
Code 7-2 Example with complex numbers – Add function
Code 7-3 Example with complex numbers – Multiply functions
Code 7-4 Example with Sync and Async execution – Synchronous function
Code 7-5 Example with Sync and Async execution – Asynchronous function
Code 7-6 C/AL variable assignment
Code 7-7 API description of the First method
Code 7-8 Example with lambda expression
Code 7-9 Example without lambda expression
Code 7-10 Example without LINQ
Code 7-11 Example with LINQ
Code 8-1 C/AL IF statement
Code 8-2 Segment from our abstract syntax tree XML representation
Code 8-3 TableRelation matching Expression2
Code 8-4 Appendix DVD, file \Code\RegularExpressionParser\RegularExpressionParser\CALRegularExpressions.fs
Code 8-5 TableRelation matching Expression3
Code 8-6 Appendix DVD, file \Code\RegularExpressionParser\RegularExpressionParser\CALRegularExpressions.fs
Code 8-7 LINQ Example1 – Select record variables with number equals var
Code 8-8 Querying the abstract syntax tree
Code 8-9 Querying the abstract syntax tree
Code 8-10 LINQ Example2 – Select all statements
Code 8-11 LINQ Example3 – Select all Exp1 variables with ID equals var
Code 8-12 Appendix DVD, file \Code\MatchingInLinq\IdentifyInheritance.cs, method RefactorInheritance
Code 8-13 Appendix DVD, file \Code\MatchingInLinq\IdentifyREAConcepts.cs, method FindProcedureReferences
Code 8-14 Appendix DVD, file \Code\MatchingInLinq\IdentifyREAConcepts.cs, method Call
Code 8-15 Appendix DVD, file \Code\MatchingInLinq\ParseRemaningElements.cs, method ParseElementsToXML
Code 8-16 Pseudo code for the Containment algorithm
Code 8-17 Pseudo code for Generalization algorithm
Code 8-18 Pseudo code for refactoring Generalization objects
Code 9-1 Special case of TableRelation ignored in Containment analysis
Code 9-2 Special case of TableRelation ignored in Generalization analysis
Code 9-3 Special case of TableRelation dependent on YES/NO values ignored in Generalization analysis
Chapter 1
1 Introduction
1.1 Background for the project
The overall scope of projects sponsored by the NAV Application Team is to find the primitives for the best
path towards the next generation NAV application. This is a very interesting task for a number of reasons
we will cover later. The application team is highly motivated to find the best steps towards a new
application design, and student projects are used as one type of contribution to uncovering and analyzing
this challenge.
The application has grown incrementally from version to version, and the long-term goal is to refactor the
application code, but the first steps are to analyze the available code and provide knowledge on how the
application is tied together. Due to the way the application has been developed (see section 2.3.2), no clear
overview of the application exists in terms of Unified Modeling Language (UML) models or similar (see
section 12 for a list of all abbreviations). These are necessary because the code is too complex for
developers to comprehend, and the learning curve for new developers is very steep.
One of the key drivers for a redesign is to reduce the code base. J. Kiehn stated that the code base has
grown by a factor of 10 since Navision v. 1. This is of course an exaggeration, but the application code base
now counts more than 2.4 million lines of code, which indicates the complexity of the application.
Earlier work (1) has shown that the number of dependencies in the code base is high, which is confirmed by
J. Kiehn. Dependencies increase the complexity of code, and complex code is more prone to contain bugs
and costs more man-hours to maintain. These factors are important and need to be taken care of, and the
NAV team is aware of the potential problems.
1.2 Project Aim
This thesis aims at contributing to a solution for the above problem, by uncovering what exists today and by
modeling some chosen suggestions for refactorings of the present C/AL code.
Our approach is to identify, analyze, and implement recognition of software patterns in the NAV application
code. The software patterns we will focus on define foundational relations (see section 4) between
objects. We will use the Unified Modeling Language (UML) (2) terminology to describe the identified
relations. The relations we focus on identifying are the concepts of containment and generalization,
introduced in section 5.
The project aim was extended along the way. An additional goal was to examine the possibility of
identifying specific concepts from the accounting ontology Resources, Events, and Agents (REA), see section
3.3.3, in the C/AL code. The benefit of such an approach is that REA offers domain-specific knowledge, as
opposed to the general domain knowledge provided by UML.
Chapter 2
2 Enterprise Resource Planning Domain
The following section is provided to give an introduction to Enterprise Resource Planning (ERP) solutions in
general. Following this is a more specific introduction to the fundamentals of the Microsoft Dynamics NAV
system and the C/AL language.
2.1 Introduction to Enterprise Resource Planning (ERP) (3)
Today, ERP systems are an invaluable tool for most companies with more than a few employees. An
ERP system is a tool for running an enterprise, and it is the backbone of the organization, providing data
that can be used in decision making (4). In the past, every department of a company made decisions
independently of the others. ERP provides a platform for collaboration and a common ground for
decision making.
[Figure 2-1 Integration Data Flow (5): the Purchasing, Marketing and Sales, Accounting and Finance, Manufacturing, Human Resources, and Inventory departments all connected to a central Information store.]
The model illustrates how ERP systems are based on storing all information in one central database, enabling all departments in the ERP solution to work with the same data. (6)
Common modules in an ERP system are:
	Financial Management (FM)
	Customer Relationship Management (CRM)
	Supply Chain Management (SCM)
	Business Intelligence (BI)
The most common application of an ERP system is the automation of business processes. One of the most
important chains of business processes for a manufacturing company is the support for selling the
manufactured goods (7). Figure 2-2 illustrates how an arbitrary ERP system would support this scenario:
the Sales, Warehouse, Accounting, and Receiving departments work together to handle the entire flow,
from sale and shipping to receiving payment for the goods, by sharing a common ground in the central
information database and forwarding an order fulfillment to the department responsible for the next step
in the process.
[Figure 2-2 ERP process flow (5): Sales issues a sales quote and a sales order; Warehouse packs and ships; Receiving handles returns; Accounting handles billing and payment; all departments share the central Information store.]
One of the key drivers for implementing an ERP system is that it can often replace the patchwork of
different applications that emerges in a company over time (8), thereby streamlining and simplifying the
use of software within the company and, as a result, allowing the organization to do better with the same,
or even a smaller, amount of money.
Another aspect is the possibilities emerging from having all the organization's information stored centrally.
This makes it possible to derive important byproducts which, for instance, could enable the organization to
forecast production schedules during a holiday period, based on expected order income. This would allow
management to adjust the available workforce to make sure that production can cope with the demand
from incoming orders.
Extensibility is also an important attribute of an ERP system. To fully qualify as an ERP system, it is
expected to offer more than "just" a comprehensive integration of various organizational processes (9). ERP
systems should thus strive to be:
	Flexible – systems should be able to grow with the organization.
	Modular and open – systems should offer open interfaces allowing easy interoperability and extensibility to third-party add-on components.
	Beyond the company – systems should support integration with customers, partners, and vendors, because many business processes require interaction with actors outside the organization.
2.2 ERP Business Opportunities (10)
There is a large market for ERP software, and many ERP suppliers. SAP has published a report from AMR
Research listing the estimated revenues in 2006 for the 17 most dominant ERP vendors. The report
estimates the total revenue on ERP software to be $28.8 billion, with a revenue growth of 14 % from 2005.
In comparison, the Danish gross domestic product (GDP) in 2006 was $202.9 billion, which means that the
revenue of ERP software amounts to roughly 1/7 of the entire GDP of Denmark, underlining that the ERP
market is of great importance.
The AMR report lists SAP as the number one ERP supplier, with revenues nearly double those of number
two on the list.
During the last 10 years there has been a consolidation in the market for ERP software. Oracle acquired
PeopleSoft (2004), Navision acquired Axapta (2000), and Microsoft acquired Navision (2002) and Great
Plains (2000), among others. The five most dominant ERP vendors in 2006 were SAP, Oracle, Infor, Sage
Group, and Microsoft.
2.3 Microsoft Dynamics NAV (NAV) (11)
The ERP product we focus on in this thesis is, as previously stated, Microsoft Dynamics NAV. The
following section introduces the NAV user segment, product, and architecture. Furthermore, we describe
the difference between product code and application code in NAV, which is essential for understanding
this project.
Microsoft has focused on establishing a position in the global ERP market, as described in the previous
section. The Microsoft Dynamics product group was kick-started by acquisitions of successful ERP vendors
in the beginning of the 2000s, and this product group has been developed continuously to gain market
share. Microsoft Dynamics NAV is one of these products. NAV focuses on the Mid-Market+ segment of the
ERP market (defined as companies with 1-5000 employees), leaving the enterprise market to other ERP
products.
The company size definitions used are the following:
Enterprise: 5000+ employees
Corporate Account Segment (CAS): 1000-5000 employees
Midmarket: 50-1000 employees
Small Business: 1-49 employees
Table 2-1 NAV Customer segment
The following numbers are provided to give an idea about the forces driving NAV. These numbers are from
the beginning of 2008 (12):
	> 65,000 customers (companies that have bought NAV)
	> 3,300 certified partners (IT professionals working with sale and customization of NAV)
	> 1,800 add-on solutions (products offered for specific needs by partners)
	> 40 localized versions (supporting local languages and date formats)
	> 1,000,000 licensed users (employees in customer companies using NAV)
Microsoft Dynamics NAV was originally created by three college friends, J. Balser, T. Wind, and P. Bang,
from the Technical University of Denmark (DTU), under the product name Navigator and later Navision.
NAV is currently in version 6 (project Corsica). It has been developed in Vedbæk since 1984 and has grown
incrementally from version to version, becoming a very elaborate ERP system. The latest addition to
the product is the Role Tailored Client (RTC), offering customized user profiles. This enables every user role
in the system to have a personalized User Interface (UI) focused on the tasks they perform, making it
easier for employees to understand and interact with the system. Figure 2-3 shows the home page for an
Order Processor. The Order Processor is the role in a company responsible for shipping incoming
orders to customers and putting customer returns back in stock. The Order Processor is one of 21 role
centers that ship with NAV out of the box, and partners can add more to suit individual customer needs.
Figure 2-3 Home page of an Order Processor in Microsoft Dynamics NAV 2009 (Role Tailored Client)
2.3.1 NAV architecture
NAV supports three architectures: one- and two-tier for legacy purposes, and the new three-tier
architecture that offers new features and a better scalability. In time, the one- and two- tier architectures
are going to be discontinued.
	One-tier setups consist of the C/SIDE client simply using a file as database (not shown in figure).
	Two-tier setups consist of the C/SIDE client and a database (either Microsoft SQL or the NAV Database Server (legacy product)).
	Three-tier setups follow the Model View Control pattern (13):
o	Role Tailored Client (RTC) for users (View)
o	Application Service Tier (AST) (Control)
o	Microsoft SQL or a NAV Database Server (Model)
Furthermore, the C/SIDE client is also part of the three-tier setups and is mostly used for development.
[Figure 2-1 Two-tier architecture and Figure 2-2 Three-tier architecture: the C/SIDE client connects directly to Microsoft SQL Server or the NAV Database Server; the Role Tailored Client connects through the Application Service Tier.]
The Role Tailored Client only runs on the three-tier setup and does, for now, not support
development. The C/SIDE client is the old client that is still used for development, but with time, the plan is
to provide a new development environment and discontinue the old C/SIDE client.
2.3.2 Application Code (14) vs. Product Code
When we describe NAV, it is important to note the difference between product code and application code.
New product code is written in C# and legacy code is written primarily in C++ (15). New application code is
written in C/AL (often written AL) and has always been developed in C/AL. The NAV product is the platform
that hosts the NAV ERP application. The product is also running on a platform, namely Windows.
Application code is, as stated above, written in the NAV specific language C/AL, and the application
implementation is the actual code forming the ERP solution. The application is hosted inside the product
and can be custom tailored to support specific customer needs.
According to Jesper Kiehn, the application codebase has grown considerably from version to version. One
reason for this is the sales channel for NAV. In general, Microsoft does not sell NAV directly to customers.
The sale is done via a partner that gets the customer up and running with NAV. As mentioned above, more
than 1,800 add-on products exist, and a number of add-on products have propagated back into the product
over time to fulfill general customer requests. This has been done by buying the add-on from the partner
and merging its code into the shipping application. As described before, the NAV system has been
developed continuously over the last 25 years. The result is an application of more than 2.4 million lines of
C/AL code.
The NAV ERP application is divided into 192 granules. A granule is a pack of Pages, Forms, Codeunits, and
Tables (introduced in section 2.4.2), together adding a feature to the ERP application. One example of a
granule is the Commerce Gateway granule, which allows the NAV ERP system to interact via Commerce
Gateway, adding support for business-to-business transactions. One example of such a transaction could
be the electronic exchange of sales documents. The NAV system is designed to grow with the company,
allowing the company to extend its use of the system incrementally. This is in compliance with theory for
implementing ERP software in companies. In general, it is advised to implement ERP systems incrementally
in a company, increasing the scope along the way. Furthermore, it is advised to design the system so it is
able to handle future company growth (16).
As described above, this can easily be achieved with the introduction of new granules in the implemented
NAV ERP. In fact, the granules are already installed in the customer's NAV. If the customer system runs on
the standard application, the customer just has to pay for an extension of their license to get the
appropriate access rights.
2.3.3 Application Language Transition
As described in section 2.3.2, the configuration of NAV is done with C/AL. Until NAV version 6.0, C/AL has
been compiled solely to a NAV-specific binary format, but the new three-tier architecture described in
section 2.3.1, supported by the Role Tailored Client and the Application Service Tier, runs directly on
C# code. The transformation from C/AL to C# is done using a token parser. The generated C# code is fully
functional C# code, but not particularly readable. The C# code is then compiled to Microsoft Intermediate
Language (17) (MSIL), allowing the binary (CLR) application code to run on any platform supported by the
.NET framework.
[Figure 2-4 Current C/AL compilation for the new and old product stack: C/AL is compiled to the NAV-specific binary format for the old product platform (C/SIDE, C++ code) and transformed to C# for the new product platform (RTC, AST, C# code); the two platforms interact.]
Large parts of the product were designed to run with the NAV-specific binary format, and still do.
Therefore, the product is in a transition phase where it needs both the C/AL and the C# representation.
A NAV-specific object model has been created in C#, which enables application code to be written in C#, but
the product will need both the C/AL representation and the C# representation. At this point, only parsing
from C/AL to C# is possible, not the other way around. This causes some inconvenience when debugging
application code on the new product stack (three-tier). When debugging the application code on either the
Role Tailored Client or the Application Service Tier, debugging can only be done in the generated C# code.
This is far from ideal because there is no direct link between the generated C# code and the C/AL code. An
issue identified in the generated C# application code has to be fixed in C/AL from the C/SIDE client.
2.4 Fundamentals of the C/AL language
The code we analyze throughout this project is written in the language C/AL. This section is provided to give
an introduction to the design criteria and syntax for C/AL.
C/AL is not an object-oriented language, because only eight predefined, non-extendable object types can be
used. This implicitly means that C/AL does not support concepts such as inheritance. The language is
strongly typed, meaning that variable types cannot be inferred and have to be defined explicitly. C/AL is a
procedural language in the family of imperative programming languages. Imperative languages specify the
individual steps of a computation (how we want it), as opposed to declarative languages that specify
what the program should do (what we want) (18).
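The distinction can be illustrated with a small example. Since a runnable C/AL environment is not assumed here, the sketch is in Python: the first function is imperative (it spells out each step of the computation), the second is declarative in style (it states the desired result).

```python
# Illustrative sketch (not C/AL): the same computation expressed in an
# imperative style and a declarative style. C/AL belongs to the first family.

def total_imperative(amounts):
    # Imperative: describe *how* to compute the sum, step by step.
    total = 0
    for amount in amounts:
        if amount > 0:
            total += amount
    return total

def total_declarative(amounts):
    # Declarative style: describe *what* we want -- the sum of the positives.
    return sum(amount for amount in amounts if amount > 0)

print(total_imperative([10, -3, 5]))   # 15
print(total_declarative([10, -3, 5]))  # 15
```

Both functions compute the same result; the difference is only in whether the individual steps are spelled out.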
2.4.1 C/AL Design Criteria
Michael Nielsen (Director of Development in NAV) has a long history with NAV and was one of the original
designers of the language. M. Nielsen stated that the design criteria (19) for C/AL were to provide an
environment that could be used without:
	Dealing with memory and other resource handling
	Thinking about exception handling and state
	Thinking about database transactions and rollbacks
	Knowing about set operations (SQL)
	Knowing about OLAP (20) and SIFT (21)
2.4.2 C/AL Syntax
The C/AL syntax is heavily inspired by Pascal, but the language is simpler. The full language reference can be
found on MSDN (22). The C/AL language provides a limited set of predefined object types, which also helps
to reduce complexity.
Overall, the general design goal has been to provide a flexible language that enables developers to quickly
become productive developing for NAV.
C/AL has eight kinds of objects: Tables, Forms, Reports, Dataports, XMLports, Codeunits, MenuSuites, and
Pages.
	Pages are Forms (Graphical User Interface) for the Role Tailored Client. Forms and MenuSuites objects will probably be discontinued as the old C/SIDE client is phased out.
	Dataports and XMLports are objects for setting up import and export of data via text files or web services.
	Reports are an object type for defining Business Intelligence (BI) reports.
We will not spend more time on the above object types; they contain no information we need in our
further analysis. All relevant information in regards to our analysis is placed in Tables and Codeunits. The
following section will give an introduction to these two object types.
2.4.2.1 Tables
Tables are data containers and represent the foundation of the application. Besides data, tables express
logic in the form of triggers. Tables contain two kinds of triggers:
	Trigger Events are activated on specific actions. One Trigger Event is the OnDelete trigger, which is activated every time a row is deleted in the table.
	Trigger Functions are defined by developers and can be activated on any instance of the table.
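The difference between the two trigger kinds can be sketched as follows. C/AL tooling is not at hand here, so the sketch is in Python; the class and method names (PaymentTerms, on_delete, translate_description) are illustrative stand-ins, not the NAV API.

```python
# Hypothetical sketch of the two trigger kinds on a table-like object.
# All names are illustrative; this is not the NAV trigger mechanism itself.

class PaymentTerms:
    def __init__(self):
        self.rows = []

    def delete(self, row):
        # A trigger *event*: the framework fires it on a specific action,
        # here every time a row is deleted from the table.
        self.on_delete(row)
        self.rows.remove(row)

    def on_delete(self, row):
        # In NAV, the OnDelete trigger of table 3 cleans up related
        # translation records; here we only print a note.
        print(f"cleaning up related records for {row}")

    def translate_description(self, row, language):
        # A trigger *function*: defined by the developer and called
        # explicitly on an instance of the table.
        return f"{row} ({language})"

terms = PaymentTerms()
terms.rows.append("COD")
terms.delete("COD")   # fires the on_delete event before removing the row
print(terms.rows)     # []
```

The event is invoked implicitly by an action (delete), while the function is invoked explicitly by the developer.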
An example of the structure of a table can be seen in Code 2-1. This table is the first table in the
application, and is also one of the shortest. We will briefly describe each component of the table:
	OBJECT-PROPERTIES contains metadata for the table, such as the creation date.
	PROPERTIES contains general properties stored on the table and the Trigger Events. This table only contains a definition of the OnDelete trigger.
	FIELDS contains the definition of the fields in the table.
	KEYS contains a list of the fields forming the table's primary key. In this case, the field Code is the primary key.
	FIELDGROUPS contains a list of prioritized fields used by the Role Tailored Client.
	CODE contains the defined Trigger Functions. Every Trigger Function is defined as a procedure.
OBJECT Table 3 Payment Terms
{
  OBJECT-PROPERTIES
  {
    Date=05-11-08;
    Time=12:00:00;
    Version List=NAVW16.00;
  }
  PROPERTIES
  {
    DataCaptionFields=Code,Description;
    OnDelete=VAR
      PaymentTermsTranslation@1000 : Record 462;
    BEGIN
      WITH PaymentTermsTranslation DO BEGIN
        SETRANGE("Payment Term",Code);
        DELETEALL
      END;
    END;

    CaptionML=ENU=Payment Terms;
    LookupFormID=Form4;
  }
  FIELDS
  {
    { 1;;Code;Code10;CaptionML=ENU=Code; NotBlank=Yes }
    { 2;;Due Date Calculation;DateFormula;CaptionML=ENU=Due Date Calculation }
    { 3;;Discount Date Calculation;DateFormula;
        CaptionML=ENU=Discount Date Calculation }
    { 4;;Discount %;Decimal;CaptionML=ENU=Discount %;
        DecimalPlaces=0:5; MinValue=0; MaxValue=100 }
  }
  KEYS
  {
    { ;Code;Clustered=Yes }
  }
  FIELDGROUPS
  {
    { 1;DropDown;Code,Description,Due Date Calculation }
  }
  CODE
  {
    PROCEDURE TranslateDescription@1(VAR PaymentTerms@1000 : Record 3; Language@1001 : Code[10]);
    VAR
      PaymentTermsTranslation@1002 : Record 462;
    BEGIN
      IF PaymentTermsTranslation.GET(PaymentTerms.Code,Language) THEN
        PaymentTerms.Description := PaymentTermsTranslation.Description;
    END;

    BEGIN
    END.
  }
}
Code 2-1 Table 3 Payment Terms
2.4.2.2 Codeunits
Codeunits act only as a container for functions. The standard function libraries are stored in Codeunits and
consist of utility routines that serve a general purpose in NAV.
User-defined Codeunits can contain user-defined functions. Codeunits can advantageously be used to:
	Reduce the code in a table – code can be extracted and stored in a Codeunit procedure.
	Hold code of a more general nature – if code influences more than one object, it is more correct to store it in a Codeunit instead of storing it in a table.
The structure of Codeunits is identical to the structure of Tables presented above, with the only exception
being that Codeunits only contain the components OBJECT-PROPERTIES, PROPERTIES, and CODE.
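The refactoring described above — moving general-purpose logic out of a table and into a Codeunit — can be sketched as follows. The sketch is in Python rather than C/AL, with a module-level function standing in for a Codeunit procedure; all names (apply_discount, SalesLine) are illustrative assumptions.

```python
# Illustrative sketch: a Codeunit acts purely as a container of functions.
# A plain module-level function stands in for a Codeunit procedure, so
# logic shared by several "table" objects lives in one place instead of
# being duplicated in each table's trigger code.

def apply_discount(amount, discount_pct):
    """General-purpose routine, conceptually stored in a Codeunit."""
    return round(amount * (1 - discount_pct / 100), 2)

class SalesLine:
    """A table-like object that delegates to the shared routine."""
    def __init__(self, amount):
        self.amount = amount

    def discounted(self, pct):
        # Delegate instead of duplicating the calculation here.
        return apply_discount(self.amount, pct)

print(SalesLine(200.0).discounted(10))  # 180.0
```

Any other table-like object can call the same routine, which is exactly the duplication-reducing role the section ascribes to Codeunits.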
Chapter 3
3 Related Work
The following section describes prior work and accomplishments with relevance for our project. The
section is divided into two main parts. The first part concerns related work done within Microsoft, and the
Microsoft Dynamics NAV organization in particular. In the second part, we widen our scope and see what
has been accomplished elsewhere.
3.1 What has been accomplished inside Microsoft
The following section describes the achievements of former student projects and of projects carried out
internally by Microsoft, that we know of. Elements of the previous work described in this section have
served as a great starting point for this project.
3.1.1 Partial C/AL parser
The core theme of this project relies on the ability to work with the application code in an abstract
queryable representation, see section 1.
A parser (CALParser) was developed in a former student project carried out by T. Hvitved in 2008 (1). The
parser was developed to analyze the application to examine whether the code could be modularized, in
order to reduce the number of dependencies in the code. The complete code for this project can be seen
on the appendix DVD in the folder \Code\Work From T. Hvitved\oo parser\.
CALParser is implemented in F# and uses the F# variants of the tools LEX (23) and YACC (24) to define the
parsing rules. The mentioned technologies are described in section 7.3.
The parser is of high quality, but unfortunately it is a complex piece of software that requires more work to
be complete.
From working with the parser, we have found the following inexpedient issues we need to find solutions for:
	The parser is not able to parse the actual application code. Preprocessing that removes multiline comments is necessary for the parser to work with the actual application code.
	The parser does not parse to the level of detail we need. Table keys, table fields, and the field properties OptionString, TableRelationString, and other options are handled as strings and require subparsing.
	The parser is implemented with F#'s immutable data types. This is a problem because we cannot modify the data that needs subparsing.
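The comment-removing preprocessing mentioned above can be sketched as a small text pass run before parsing. The sketch below is in Python and assumes C-style /* ... */ comment delimiters purely for illustration; the actual C/AL comment syntax and the real preprocessor may differ.

```python
import re

def strip_multiline_comments(source: str) -> str:
    """Remove block comments before handing the source to the parser.

    Assumes /* ... */ delimiters for illustration (non-greedy match,
    DOTALL so comments may span several lines). A real C/AL preprocessor
    would target C/AL's own comment syntax instead.
    """
    return re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)

code = "a := 1; /* obsolete\n   logic */ b := 2;"
print(strip_multiline_comments(code))
```

The point is only that the parser receives comment-free input; the delimiter choice and function name are assumptions.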
We had a meeting with Tom Hvitved (PhD student at DIKU), who wrote the parser, and from his
introduction we learned that the intention with the parser was to implement a level of detail sufficient to
support his thesis, not to develop a complete C/AL parser. Taking this information into consideration, and
from our initial work with the parser, we know that we will be able to use the CALParser, but not as-is.
3.1.2 Codedub
Codedub is a project carried out in collaboration between J. Kiehn (MDCC) and a Master’s thesis student,
Till Blume (IT University of Copenhagen). The focus for this thesis has been to find candidates for
refactoring. The short story is, that due to the lack of support for inheritance in C/AL, a lot of code blocks
are copied here and there. Code duplicates can be identified by comparing hashing values from code
blocks. The project deadline is June 2009.
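The hashing idea can be sketched as follows. This is our own illustrative Python sketch, not Codedub's implementation; the normalization step and the block names are assumptions.

```python
import hashlib
from collections import defaultdict

# Illustrative sketch of duplicate detection by hashing: normalize each
# code block, hash it, and report blocks sharing a hash as candidates.

def normalize(block: str) -> str:
    # Collapse whitespace so formatting differences do not hide duplicates.
    return " ".join(block.split())

def duplicate_candidates(blocks: dict) -> list:
    by_hash = defaultdict(list)
    for name, body in blocks.items():
        digest = hashlib.sha1(normalize(body).encode()).hexdigest()
        by_hash[digest].append(name)
    # Only hash groups with more than one member indicate duplication.
    return [names for names in by_hash.values() if len(names) > 1]

blocks = {
    "Codeunit80.Post":  "IF Amount > 0 THEN PostLine(Amount);",
    "Codeunit90.Post":  "IF  Amount > 0 THEN\n  PostLine(Amount);",
    "Codeunit12.Apply": "ApplyEntries(EntryNo);",
}
print(duplicate_candidates(blocks))  # [['Codeunit80.Post', 'Codeunit90.Post']]
```

Exact-hash matching only finds blocks that are identical after normalization; near-duplicates would need a fuzzier comparison.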
3.1.3 Navision Developer Toolkit (25)
The Navision Developer Toolkit (NDT) is a legacy tool for partners and internal users. The tool is able to
assist in upgrade operations and can display a number of details of the code. The NDT is not meant as a
development environment in the same sense as Visual Studio or Eclipse, but more as an analysis tool.
According to J. Kiehn, the tool has had limited success and is probably going to be discontinued. Partners
and developers rather use text comparison and search tools, such as Beyond Compare (26), in their analysis
of AL code.
3.1.4 Object Map
Object Map is an internal tool for displaying identified relations. The tool is a generic viewer of relations,
which are read from an XML file. The XML file is produced with the Navision Developer Toolkit (NDT)
and contains relations between Tables, Forms, Codeunits, and Pages. The Navision Developer Toolkit
does, however, not support methods for determining the nature of a relation.
3.2 What has not been accomplished inside Microsoft
	In relation to our project, the parser is still not complete and needs further work to fulfill our requirements.
	Tools exist to display relations, but the view is static and based on a dump from the NDT. The available tools do not support any kind of relation classification, i.e. it is possible to map a relation, but there is no way to know the deeper meaning of the relation (association, dependency, etc.). The overall goal for Object Map is to display an interactive graph which allows us to obtain knowledge about the structure of the application. The lack of relation classification reduces the overall information we are able to read from a diagram displaying relations. Furthermore, if we were able to apply general terms to the relations in the code, people with general computer science knowledge would be able to get a deep insight into the organization of the application code more quickly.
	Suggestions for refactoring have in many ways been unsuccessful.
	A good way to move from C/AL to C# has not been found.
3.3 What has been accomplished outside Microsoft
3.3.1 Refactoring from Code to UML
There are many projects offering code generation from an abstract representation. A search on Google
reveals that there seems to be an overweight in products offering generation of C++ and Java code.
A few products offer both refactoring from code to UML and vice versa. We have looked at two of the
larger products: IBM's Rational Rose (27) and No Magic's MagicDraw (28). We have previously worked with
Rational Rose without being completely satisfied with the UML generation. We therefore chose to look at
MagicDraw to see what the product offers when it comes to generating UML from code. The Enterprise
edition of MagicDraw comes with refactoring capabilities for Java, C++, and C#, among other languages. We
installed a demo of MagicDraw 16.5 Enterprise and generated a diagram based on two files with a
generalization relation between them.
[Diagram contents: class Table with private fields number, id : string; public properties Number, ID : string; constructors Table(number, id) and Table(info, ctxt); method GetObjectData(info, ctxt) : Table. Class REATable with constructors REATable(table : REATable, tablewith : Table) and REATable(info, ctxt); method GetObjectData(info, ctxt) : REATable.]
Figure 3-1 UML diagram manually created with Visio
Figure 3-2 UML diagram generated from code with MagicDraw
Figure 3-1 shows the relation we would like to map between Table and REATable. The analyzed classes can
be seen in the folder “Code\MatchingInLinq\MatchingInLinq\” on the attached DVD. The generalization
relation is described in depth in section 4.1. Figure 3-2 shows the generated model from MagicDraw. The
diagram does show a directed relationship from REATable to Table, but it is not clear that REATable is a child/subclass of Table. We believe there are two legitimate reasons for this.
• We have not spent hours on fully understanding how all the options of the tool work. MagicDraw is a complex tool, and we just generated a standard UML class diagram; expert users are probably able to produce more useful diagrams.
• The tool is directed towards creating code from a model, and thus the model has to be very detailed.
We do, however, believe that the model created manually in Visio provides a cleaner view of the relationship between the two classes.
We believe MagicDraw is a fair representative of products offering UML diagrams generated from code. Furthermore, we find that the generated diagram has low readability, and as such does not offer the overview we wish to provide. Many of the tools offer some kind of refactoring in a simple drag and drop
fashion, but we have chosen not to look further into how they perform. None of the refactoring tools we found offers support for C/AL.
3.3.2 UML designers
Many good tools exist for drawing UML. Most tools offer good design options, where the mouse can be used to arrange the diagrams as preferred. We have primarily used Visio 2007 for this report, but many other good tools exist. A list of UML tools can be found at the homepage of Associate Professor M. W. Godfrey from the University of Waterloo (29). We have not found a UML designer that can generate large diagrams as a batch process.
3.3.3 Refactoring to Resource, Events and Agents (REA) (30)
Since the beginning of accounting software, all systems have relied on the double-entry accounting system, introduced in section 6.3.2. Other systems have, however, emerged, trying to solve the drawbacks of the double-entry system. One of the most successful is the Resource, Events and Agents (REA) ontology. The system was developed by William E. McCarthy, who published his first paper presenting REA in 1982 (31). Since then, he and others have contributed to developing REA into an elaborate bookkeeping ontology.
The scope of this project was, as described in the introduction (see section 1), extended to cover initial REA
identification (see section 6.3.1). We therefore looked into prior work in this field to learn from former
achievements. We found that not much work has been done on refactoring accounting systems into REA. As previously stated, most accounting systems are based on the double-entry system. We have looked into a paper analyzing the relationship between REA and the Enterprise Resource Planning system SAP (32). The paper focuses on the similarities in the data models of SAP and REA. Evidence is found that proves the existence of duality within SAP, but it is also found that SAP has implementation compromises that will not fit into a REA model, and that a REA model cannot fully describe the SAP system. The paper concludes that REA terminology is able to present SAP data models and that the REA presentation provides valuable insight into the data model.
We have not found evidence of any work identifying REA concepts from automated code analysis.
3.4 What has not been accomplished outside Microsoft
• Many products for refactoring between code and UML exist. They are complicated, and it requires a larger study to benefit from their capabilities. No refactoring tools for C/AL exist.
• Prior work mapping REA to an ERP product has not focused on analyzing the actual code.
3.5 Related work summary
The most important lessons from the related work study are that we will need to extend the parser to support the relation identification this project aims at solving, and that we will need to build a custom viewer tool for presenting the identified relations in UML terms.
Chapter 4
4
Foundational Relations (33)
The following section explains the conceptual meaning of the UML relations containment and generalization, described in section 5. As previously described, the primary task of this project is to identify these two concepts in the NAV application code. Therefore, we introduce the formal theory for them. The formal theory uses the name part_of for containment and is_a for generalization. In this section, we primarily use the names part_of and is_a, but in later sections we will solely use the names containment and generalization, following the UML standards described in section 5.
We present the definitions of part_of (containment) and is_a (generalization) in terms of standard first-order logic.
The following two primitive relations are used in our formulas.
• Inst(x, A) (short for instance) maps the relation between an instance and its class. A statement using inst is: Jane is an instance of a human being.
• Part(x, y) defines parthood between two instances. A statement using part is: Jane’s heart is part of Jane’s body.
4.1 Definition of the Is_a relation
The is_a (generalization) relation is the simpler of the two relations we describe. The formula below
expresses that A is_a B. The right side definition of the is_a relation expresses that if x is an instance of class
A then x is an instance of class B and that this is true for all x. A statement fulfilling this formula is: A human
male is a human.
A is_a B =def ∀x (inst(x, A) → inst(x, B))
Formula 4-1 is_a
4.2 Definition of the Part_of relation
The part_of (containment) relation is a bit more complex. To provide a better understanding, the formula is divided into two parts, which are joined in a third formula.
The part_for formula below expresses that if x is an instance of A, then there exists an instance y of B such that x is part of y, and that this holds for all x. With the part_for formula we can state: human testis part_for human being. But we cannot state that human being has_part human testis, because only males have testes.
A part_for B =def ∀x (inst(x, A) → ∃y (inst(y, B) & part(x, y)))
Formula 4-2 part_for
The has_part formula below expresses that if y is an instance of B, then there exists an instance x of A such that x is part of y. With has_part we can state that human being has_part heart, but not that heart part_for human being, because many animals have a heart.
B has_part A =def ∀y (inst(y, B) → ∃x (inst(x, A) & part(x, y)))
Formula 4-3 has_part
The part_of formula is a combination of the part_for and has_part formulas. It expresses that A is part_of B if A is part_for B and B has_part A. With the part_of formula we are able to state: human heart part_for human being and human being has_part human heart, which is the definition of the part_of relation.
A part_of B =def A part_for B & B has_part A
Formula 4-4 part_of
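These four definitions can be sanity-checked over a small finite model. The sketch below is our own illustration and not part of the thesis implementation; the instances and classes (john, jane, Heart, Testis) are hypothetical, and the sets insts and parts play the roles of the primitives inst and part.

```python
# Finite model for the primitives inst(x, A) and part(x, y).
insts = {("john", "Male"), ("john", "Human"), ("jane", "Human"),
         ("johns_testis", "Testis"),
         ("johns_heart", "Heart"), ("janes_heart", "Heart")}
parts = {("johns_testis", "john"), ("johns_heart", "john"),
         ("janes_heart", "jane")}

def instances(A):
    return {x for (x, c) in insts if c == A}

def is_a(A, B):                      # Formula 4-1
    return instances(A) <= instances(B)

def part_for(A, B):                  # Formula 4-2
    return all(any((x, y) in parts for y in instances(B))
               for x in instances(A))

def has_part(B, A):                  # Formula 4-3
    return all(any((x, y) in parts for x in instances(A))
               for y in instances(B))

def part_of(A, B):                   # Formula 4-4
    return part_for(A, B) and has_part(B, A)

print(part_of("Heart", "Human"))     # every Heart is in a Human, every Human has a Heart
print(part_of("Testis", "Human"))    # fails: jane has no Testis part
```

The model reproduces the testis/heart examples from the text: Testis is part_for Human but Human does not have has_part Testis, so part_of does not hold, whereas Heart part_of Human does.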
The first-order logic definitions of is_a and part_of, i.e. generalization and containment respectively, underpin the patterns this thesis aims at identifying. The formal definitions have not been in the foreground when developing the identification algorithms. The formulas are presented to provide a link between the UML concepts generalization and containment, introduced in section 5, and their formal mathematical definitions.
Chapter 5
5
Unified Modeling Language (UML) (34)
As mentioned earlier, see section 1, we have selected UML as the output standard for our findings. The reason is that we aim at identifying common computer science concepts, described in section 4, and presenting them in general, abstract terms to produce a generic result. In the following section, we give an introduction to the part of UML we use to facilitate a view of our findings.
5.1 UML Terminology Overview
The Unified Modeling Language (hereafter UML) has become the de facto standard for describing object oriented software by means of models and diagrams. As the object oriented languages started gaining acceptance in the late 1970s, the available modeling techniques proved inadequate for describing objects and their relationships. This was the driving force behind the development of a new modeling technique, and Grady Booch, James Rumbaugh, and Ivar Jacobson started developing and documenting UML in 1994. They founded a UML consortium with several organizations willing to allocate resources to developing UML. IBM, Microsoft, Oracle, Rational, and Texas Instruments were some of the contributing organizations. The outcome of this work was the UML 1.0 standard, presented in 1997 (35). The popularity of the object oriented programming languages has helped UML reach broad acceptance in both business and university environments (36).
UML covers many different methods and models: use cases, sequence diagrams, and deployment diagrams, to mention a few. The model and syntax we have chosen to work with is the Object Model.
We have further limited our use of the Object Model, because the analysis of C/AL is limited to identifying relationships between the code objects. It would be easy to map all attributes and procedures into our model, but the model contains more than 900 table objects, and most of these have more than 100 attributes. Therefore, attributes and procedures are left out to keep the model simple and facilitate a clear overview of the findings we wish to promote.
In the following sections, we explain the UML terminology for relationships, e.g. how the objects connect.
We give an example of each relationship and an explanation of its usage.
5.1.1 Dependency
The dependency relationship is also known as the
“using” relationship. The figure on the left shows a
dependency. Window is dependent on Event and a
change in Event might cause a change in Window.
Figure 5-1 UML mapping of a dependency
5.1.2 Generalization
The generalization relationship is the relationship between a general class (also called a super class or parent class) and a more specific class (also called a sub class or child class).
On the right, we see the super class Car (for all cars) and a more specific class Ford (only for Fords).
5.1.3 Association
The association relationship is used to map a relationship between objects. In the example to the right, a Company can have * (zero to many) Employees, and an Employee can work at 1 (one) Company.
5.1.4 Aggregation
An aggregation is a model of two objects at the
same level, i.e. there exists no generalization
between them. The relationship can be described as
the class Company “Has a” Department. The
Company represents the whole and the
Department is part of the Company.
5.1.5 Containment
Containments are closely related to the aggregation
relation described above. The containment is a
special aggregation also known as a “strong
aggregation”. The containment is mapped with a
filled diamond on the aggregation line.
The example on the right expresses that
Department has a containment to Company and
therefore cannot exist without the Company.
Figure 5-2 UML mapping of a generalization
Figure 5-3 UML mapping of an association
Figure 5-4 UML mapping of an aggregation
Figure 5-5 UML mapping of a containment
The notation for aggregation, association, containment, and generalization will be used throughout this
thesis to represent the conceptual meaning of these concepts.
Chapter 6
6 Analysis
The following section contains the analysis of the patterns we aim at identifying in the C/AL application code. We elaborate on the patterns for identifying containment, generalization, and REA candidates, in that order. The section provides additional information on NAV and a broader introduction to accounting
with the double-entry system and with the REA ontology. The aim of this section is to uncover the code
expressing the containment, generalization and REA patterns.
6.1 Identifying Containment Pattern
The first concept we identify is containment. A containment is, as the definition in section 4.2 states, a relationship where the existence of one object is a requirement for another object to exist. For example, a car cannot have tires if it does not have wheels.
The following bullet points are indicators for containments in NAV (37):
• The primary key of a table ends with an integer (indicates the multiplicity of the relation is 1..n (one to many))
• The dependent table has a foreign key, to a referenced table, as part of its primary key. The foreign key field has the name Document Type
• The table being referenced has code for cascading deletion of all dependent rows in the referencing table
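The three indicators above can be sketched as a single predicate. This is our own illustration, not the thesis implementation: the metadata dictionaries are hypothetical stand-ins for parsed C/AL table objects, and checking that the last key field name ends in "No." is a crude approximation of the first indicator.

```python
# Hedged sketch: the three containment indicators as a predicate over
# hypothetical table metadata (the real analysis runs on parsed C/AL objects).
def is_containment_candidate(dependent, referenced):
    """dependent/referenced carry hypothetical keys:
    'primary_key': ordered list of field names,
    'foreign_keys': {field name: referenced table name},
    'cascade_deletes': names of tables cleaned up in OnDelete()."""
    pk = dependent["primary_key"]
    ends_with_counter = pk[-1].endswith("No.")            # indicator 1 (approx.)
    fk_in_pk = any(f in pk and t == referenced["name"]
                   for f, t in dependent["foreign_keys"].items())  # indicator 2
    cascades = dependent["name"] in referenced["cascade_deletes"]  # indicator 3
    return ends_with_counter and fk_in_pk and cascades

# Metadata mirroring the Sales Header / Sales Line example from section 6.1.
sales_header = {"name": "Sales Header",
                "primary_key": ["Document Type", "No."],
                "foreign_keys": {}, "cascade_deletes": {"Sales Line"}}
sales_line = {"name": "Sales Line",
              "primary_key": ["Document Type", "Document No.", "Line No."],
              "foreign_keys": {"Document No.": "Sales Header"},
              "cascade_deletes": set()}
print(is_containment_candidate(sales_line, sales_header))
```

With the Sales Header / Sales Line metadata filled in as in section 6.1.2, the predicate holds in one direction only, matching the 1..n containment.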
6.1.1 The role of the Sales Header and Sales Line tables in NAV
The Sales Header and Sales Line tables are identified by J. Kiehn as an example of a containment in NAV. They are used for storing information on the shipment and on what to ship, respectively.
We have confirmed this containment in NAV by deleting a Sales Header from the NAV Demo database through the C/SIDE client:
• When a row in Sales Header is deleted, the corresponding Sales Line rows are deleted
• When creating a row in the Sales Line table, only existing Sales Header IDs are accepted
To get a better idea of how the Sales Header and Sales Line tables are used in NAV, we provide an
introduction to Sales Orders.
Sales Orders in NAV are created when a customer requests a delivery of some product. Figure 6-1 shows the overview where all active (not yet shipped) Sales Orders are displayed. When an order is shipped, an employee will post the Sales Order. The posting of a Sales Order will copy fields to a journal table. When the customer has paid for the delivered products, the journal will be posted to an entry, thereby finalizing the monetary transaction of goods. This part is covered in section 6.3.2.2.
Figure 6-1 Overview of all Sales Orders in the Role Tailored Client
When an order is received, a new Sales Order is created. Figure 6-2 shows the user interface (UI) for creating such a document. If one of the Sales Orders from Figure 6-1 is selected, it will also show in this window.
In Figure 6-2 we see that the UI has two main parts, General and Lines. The General part is stored as a row
in the Sales Header table, and each line in the Lines part is stored as a row in the Sales Line table. The Sales
Header table contains information that is common for the entire order, and the Sales Line table contains
information that is individual for each product in the order.
Figure 6-2 New Sales Order in the Role Tailored Client
6.1.2 Manual analysis of Containment pattern
In the following section we analyze the Sales Header (hereafter SH) and Sales Line (hereafter SL) tables to
identify the containment pattern.
Many NAV objects are very elaborate, and SH and SL are no exception. SH has an arity (number of fields) of
180 and counts 3334 lines of code, and SL has an arity of 194 and counts 4117 lines of code.
There is 1 (one) SH row to 1..n (one to many) SL rows. If a SH row is deleted, all SL rows for the
corresponding SH row are deleted. The figure below depicts the relationship between SH and SL in UML
and database terms.
6.1.2.1 Containment relation between Sales Header and Sales Line
Figure 6-3 Viewed as an OO UML diagram
Figure 6-4 Viewed as a database diagram with primary keys (PK) and foreign keys (FK)
Both SH and SL have a primary key ending with a field name containing “No.”, which indicates they are of type Integer or Code; a lookup in the code shows they are of type Code. Table primary keys are commonly a combination of a number of fields, as seen in tables SH and SL.
The Document No. variable that is part of both tables is defined in SL as a foreign key to SH. None of the variables in SH’s composite primary key has a foreign key to SL, which proves a 1..* relationship between SH and SL.
Reading the OnDelete() procedure trigger, it is clear that SH truly has a cascade delete of SL rows. A table’s OnDelete() trigger is activated when a row in the table is deleted.
Looking at the C/AL code associated with the tables, we find the following.
Table 36 Sales Header
  Primary Key: Document Type, No.
  OnDelete() (SalesLine is defined as a variable referencing the table that has a foreign key to Sales Header):
    SalesLine.SETRANGE("Document Type","Document Type");
    SalesLine.SETRANGE("Document No.","No.");
    SalesLine.DELETEALL(TRUE); (* Not how it is implemented *)

Table 37 Sales Line
  Primary Key: Document Type, Document No., Line No.
  Foreign Key: Document No. has the property:
    TableRelation = "Sales Header".No. WHERE (Document Type=FIELD(Document Type))

Table 6-1 Sales Header and Sales Line Containment
From the results in Table 6-1 we see that the relationship between tables SH and SL is in fact a
containment.
6.1.3 Variations in containment pattern
The above code is an example of the simplest occurrence. There are many variations of containments, and the code for Sales Header is actually a little more complex than listed, because Sales Lines of the type “Charge (Item)” have to be deleted first. This is a consequence of storing multiple types in a single table. The actual code can be seen in Table 6-2.
Globals (global variables for table SH)
  Name: SalesLine, DataType: Record, Subtype: Sales Line

OnDelete()
  SalesLine.SETRANGE("Document Type","Document Type");
  SalesLine.SETRANGE("Document No.","No.");
  SalesLine.SETRANGE(Type,SalesLine.Type::"Charge (Item)");
  DeleteSalesLines;
  SalesLine.SETRANGE(Type);
  DeleteSalesLines;

DeleteSalesLines() (function in SH)
  IF SalesLine.FINDSET THEN BEGIN
    HandleItemTrackingDeletion;
    REPEAT
      SalesLine.SuspendStatusCheck(TRUE);
      SalesLine.DELETE(TRUE);
    UNTIL SalesLine.NEXT = 0;
  END;

HandleItemTrackingDeletion() (function in SH)
  WITH ReservEntry DO BEGIN
    RESET;
    SETCURRENTKEY(
      "Source ID","Source Ref. No.","Source Type","Source Subtype",
      "Source Batch Name","Source Prod. Order Line","Reservation Status");
    SETRANGE("Source Type",DATABASE::"Sales Line");
    SETRANGE("Source Subtype","Document Type");
    SETRANGE("Source ID","No.");
    SETRANGE("Source Batch Name",'');
    SETRANGE("Source Prod. Order Line",0);
    SETFILTER("Item Tracking",'> %1',"Item Tracking"::None);
    IF ISEMPTY THEN
      EXIT;
    IF HideValidationDialog OR NOT GUIALLOWED THEN
      Confirmed := TRUE
    ELSE
      Confirmed := CONFIRM(Text052,FALSE,LOWERCASE(FORMAT("Document Type")),"No.");
    IF NOT Confirmed THEN
      ERROR('');
    IF FINDSET THEN
      REPEAT
        ReservEntry2 := ReservEntry;
        ReservEntry2.ClearItemTrackingFields;
        ReservEntry2.MODIFY;
      UNTIL NEXT = 0;
  END;

Table 6-2 Sales Header Containment in detail
The code presented in Table 6-2 reveals how the OnDelete() trigger actually calls the procedure DeleteSalesLines(), which in turn calls the procedure HandleItemTrackingDeletion().
In this case, the procedures were located in the same table. This is not always the case.
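The deletion call chain above can be sketched as a small call graph. The graph below is hand-copied from Table 6-2 (with Sales Line.DELETE standing in for the record-level delete), and the traversal is our own illustration, not the thesis tooling.

```python
# Hand-written call graph for the cascade delete of Table 6-2;
# "Sales Line.DELETE" is a hypothetical stand-in for the record delete.
calls = {
    "Sales Header.OnDelete": ["Sales Header.DeleteSalesLines"],
    "Sales Header.DeleteSalesLines": ["Sales Header.HandleItemTrackingDeletion",
                                      "Sales Line.DELETE"],
    "Sales Header.HandleItemTrackingDeletion": [],
    "Sales Line.DELETE": [],
}

def reachable(proc, graph):
    """All procedures transitively reachable from proc (depth-first)."""
    seen, stack = set(), [proc]
    while stack:
        p = stack.pop()
        for callee in graph.get(p, []):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen

print(sorted(reachable("Sales Header.OnDelete", calls)))
```

Following the graph from OnDelete reaches Sales Line.DELETE, which makes the point of section 6.1.3 concrete: the cascade crosses from SH into SL, even when the helper procedures happen to live in the same table.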
6.1.4 Manual identification of additional containments
We know that all tables with “line” as part of their name are candidates for being part of a containment pair. We are able to find all these tables and display their first primary key. This can be done by creating a new Form in the C/SIDE client displaying the system table Keys and applying a filter (TableName = *Line* and No. = 1). This filter only displays table names containing the string “Line” and their standard primary key. In NAV it is possible to define alternate sets of primary keys on a table. The above filter identifies 149 obvious candidates that might satisfy the requirement for a containment.
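The filter can be mimicked in a few lines. The rows below are hypothetical stand-ins for the system table Keys, not the 149 real candidates; this sketch only illustrates what the C/SIDE filter does.

```python
# Sketch of the C/SIDE filter (TableName = *Line* and No. = 1);
# the Keys rows are hypothetical examples, not the NAV system table.
keys = [
    {"TableName": "Sales Line", "No.": 1, "Key": "Document Type,Document No.,Line No."},
    {"TableName": "Sales Line", "No.": 2, "Key": "Document No.,Line No."},
    {"TableName": "Sales Header", "No.": 1, "Key": "Document Type,No."},
    {"TableName": "Purchase Line", "No.": 1, "Key": "Document Type,Document No.,Line No."},
]

# Keep only "Line" tables with their standard (first) primary key.
candidates = [k for k in keys if "Line" in k["TableName"] and k["No."] == 1]
for k in candidates:
    print(k["TableName"], "->", k["Key"])
```

Alternate keys (No. > 1) and non-Line tables are filtered out, leaving only the standard primary key of each candidate table.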
Manual analysis of two random candidates indicated that more containments are waiting to be identified.
Table 38 Purchase Header
  Primary Key: Document Type, No.
  OnDelete():
    PurchLine.SETRANGE("Document Type","Document Type");
    PurchLine.SETRANGE("Document No.","No.");
    PurchLine.SETRANGE(Type,PurchLine.Type::"Charge (Item)");
    DeletePurchaseLines;
    PurchLine.SETRANGE(Type);
    DeletePurchaseLines;

Table 39 Purchase Line
  Primary Key: Document Type, Document No., Line No.
  Foreign Key: Document No. has the property:
    TableRelation = "Purchase Header".No. WHERE (Document Type=FIELD(Document Type))
  Containment: Yes, follows the same pattern as SH and SL.

Table 6-3 Purchase Header and Purchase Line Containment
Table 5087 Profile Questionnaire Header
  Primary Key: Code
  OnDelete(): Not defined

Table 5088 Profile Questionnaire Line
  Primary Key: Profile Questionnaire Code, Line No.
  Foreign Key: Profile Questionnaire Code has the property:
    TableRelation = "Profile Questionnaire Header"
  Containment: No, there is no code for cascading delete – might be an aggregation?

Table 6-4 Profile Questionnaire Header and Profile Questionnaire Line Aggregation
6.2 Identifying Generalization Pattern
The second relationship we are identifying is generalization. C/AL does not support inheritance, and therefore there is no direct way to program generalizations in the application code. The generalizations we identify are therefore special associations whose behavior can be understood as following the generalization pattern we have defined. It is our belief that if the system should be refactored to a language supporting generalization, the overall design would benefit from this refactoring.
The software patterns we are analyzing the code for are those matching the form of Figure 6-5, which shows how a table has fields that can be of n table types.
Figure 6-5 Table field association
We suggest the following refactoring to introduce a cleaner, simpler design. The refactoring models the relationship above as a single association to a generalization object, encapsulating the dependencies in that object.
We believe this is a desirable representation of the pattern defined above. We believe this refactoring could simplify the design by reducing the total number of lines of code and the total number of dependencies.
Lines of code could be reduced because we expect the generalization patterns to occur in multiple tables, allowing us to reuse the generated generalization objects. This is documented in section 9.4.
Figure 6-6 Multiple associations refactored to a single association
The total number of references would also be reduced by managing the dependencies in the reused
generalization object, and thereby reducing the multiple associations from the table to a single association
to the generalization object.
The application contains 911 tables, and they account for 179835 lines of code (LOC). The average LOC is fairly low (197), but some tables, for instance Sales Header (3354 LOC) and Sales Line (4221 LOC), are candidates for the code smell Large Class (38), described in section 6.2.1, which can be dealt with by Extracting Class, part of our refactoring. The proposed refactoring will not reduce the code significantly, but it will be one of many steps.
6.2.1 Code smell Large Class and solution Extracting Class
A code smell is a term introduced by M. Fowler, used to describe the “smell” of something that is not right in the code, i.e. code gone bad. The code smell Large Class (38) describes a class that is trying to do too much. No exact definition of when a class is trying to do too much exists. Some of the indications are that the class contains too many instance variables, duplicated code, and in general too many lines of code. There are many different opinions on when a class is too large, and there will be variance according to the programming language the code is implemented in. The code in the example tables Sales Header and Sales Line is a lot larger than the average, which is another indication of the code smell.
Extracting Class defines the refactoring of extracting a part of a class, hopefully a part occurring more than once, and using an instance of the new class instead of the original code (39).
An important byproduct of this refactoring is increased independence, which is a necessary step in preparing the system for a higher degree of modularization. A more modularized system can more easily be understood, maintained, and extended. When large codebases are not modularized, it can become hard, if not impossible, to change an item, simply because it is difficult to foresee the implications a change will have if the item is used excessively throughout the code.
6.2.2 Manual analysis of the Generalization pattern
The generalization pattern is expressed via a relationship between properties on two fields in the same
table.
The following example lists the requirements for the generalization pattern described with the two fields
Field 1 and Field 2.
Field 1 has to be of the Navision data type Option and has to have the property OptionString set to a
pattern similar to this:
item_1, item_2, ..., item_n
Code 6-1 Requirements for Generalizations
Field 2 has to be of the Navision data type Integer or Code and has to have the field property TableRelation
set to a pattern similar to this:
IF (Field 1 = CONST("item_1")) table reference 1 ELSE IF (Field 1 = CONST("item_2")) table reference 2 ... ELSE IF (Field 1 = CONST("item_n")) table reference n
Code 6-2 Requirements for Generalizations
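As an illustrative sketch (not the thesis parser), the IF/ELSE IF chain of Code 6-2 can be decomposed with a regular expression. The sample TableRelation value below is copied from the Sales Line “No.” field (see Table 6-5), while the regex itself is our own assumption about the property syntax.

```python
import re

# TableRelation of the Sales Line field "No." (copied from Table 6-5).
table_relation = ('IF (Type=CONST(" ")) "Standard Text" '
                  'ELSE IF (Type=CONST(G/L Account)) "G/L Account" '
                  'ELSE IF (Type=CONST(Item)) Item '
                  'ELSE IF (Type=CONST(Resource)) Resource '
                  'ELSE IF (Type=CONST(Fixed Asset)) "Fixed Asset" '
                  'ELSE IF (Type=CONST("Charge (Item)")) "Item Charge"')

# One branch has the shape: IF (<field>=CONST(<value>)) <referenced table>.
# The CONST value may be quoted (possibly containing parentheses) or unquoted.
branch = re.compile(r'IF\s*\((?P<field>\w+)=CONST\((?P<value>"[^"]*"|[^)]*)\)\)\s*'
                    r'(?P<table>"[^"]+"|\S+)')

# Referenced tables, quotes stripped.
refs = [m.group('table').strip('"') for m in branch.finditer(table_relation)]
print(refs)
```

For this field the six branches yield the six referenced tables listed in Table 6-5: Standard Text, G/L Account, Item, Resource, Fixed Asset, and Item Charge.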
The field type Option is displayed as a selection box in NAV, and the field’s OptionString is parsed as a comma-separated list of elements by the selection box. Figure 6-7 and Figure 6-8 below display what this looks like in the NAV object designer.
Figure 6-7 Selection box with elements
Figure 6-8 Property window displaying the OptionString of the field Type
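As a tiny illustration (our own, not from the thesis tooling), the OptionString of the Type field can be split exactly as the selection box parses it:

```python
# OptionString of the Sales Line "Type" field (see Table 6-5); the leading
# comma yields a blank first option, shown as the empty entry in the box.
option_string = ",G/L Account,Item,Resource,Fixed Asset,Charge (Item)"
options = option_string.split(",")
print(options)
```

Note that the blank first element is itself a valid option (the “ ” branch of the TableRelation in Code 6-2 refers to it).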
6.2.3 Manual analysis of the generalization relationship from Sales Line
The following section contains a manual analysis of table 37 Sales Line (hereafter SL). The goal of the
analysis is to identify the code concepts expressing the generalization pattern defined in section 6.2.2.
Table 37 Sales Line

Field: Type
  DataType: Option
  Property: OptionString
  Value: ,G/L Account,Item,Resource,Fixed Asset,Charge (Item)

Field: No.
  DataType: Code
  Property: TableRelation
  Value: IF (Type=CONST(" ")) "Standard Text" ELSE IF (Type=CONST(G/L Account)) "G/L Account" ELSE IF (Type=CONST(Item)) Item ELSE IF (Type=CONST(Resource)) Resource ELSE IF (Type=CONST(Fixed Asset)) "Fixed Asset" ELSE IF (Type=CONST("Charge (Item)")) "Item Charge"

Field: Posting Group
  DataType: Code
  Property: TableRelation
  Value: IF (Type=CONST(Item)) "Inventory Posting Group" ELSE IF (Type=CONST(Fixed Asset)) "FA Posting Group"

Field: Unit of Measure Code
  DataType: Code
  Property: TableRelation
  Value: IF (Type=CONST(Item)) "Item Unit of Measure".Code WHERE (Item No.=FIELD(No.)) ELSE IF (Type=CONST(Resource)) "Resource Unit of Measure".Code WHERE (Resource No.=FIELD(No.)) ELSE "Unit of Measure"

Field: Originally Ordered No.
  DataType: Code
  Property: TableRelation
  Value: IF (Type=CONST(Item)) Item

Field: Variant Code
  DataType: Code
  Property: TableRelation
  Value: IF (Type=CONST(Item)) "Item Variant".Code WHERE (Item No.=FIELD(No.))

Table 6-5 Generalizations in Sales Line
Table 6-5 lists the information extracted from table SL, i.e. the fields fulfilling the defined initial requirements for the generalization pattern.
We have identified 6 fields, 1 with the property OptionString and 5 with the property TableRelation. The TableRelation properties depend on some or all of the elements in the OptionString. The token after each if-statement in the TableRelations is a global table id referring to a table. The diagram in Figure 6-9 illustrates our findings.
[Diagram: the field pairs No./Type, Posting Group/Type, Unit of Measure Code/Type, Originally Ordered No./Type, and Variant Code/Type connect Sales Line to the referenced tables Standard Text, G/L Account, Item, Resource, Fixed Asset, Item Charge, Inventory Posting Group, FA Posting Group, Item Unit of Measure, and Item Variant.]
Figure 6-9 Candidates for Generalization Refactoring
Figure 6-9 illustrates the findings in our manual analysis of SL. The color codes used are listed in Table 6-6.
• The analyzed table (in this case Sales Line)
• Combination of fields in the Sales Line table, named Field1/Field2, expressing the analyzed relationship. Field1 has the property TableRelation, which uses variables from Field2’s OptionString property.
• The tables referenced by the TableRelation
Table 6-6 Color definition for Figure 6-9
We have listed the relations in a simplified manner; the correct display of the relations would show each relation as an arrow going from SL to the referenced table. We have chosen to draw the relations this way to underline the relation to the extracted code in Table 6-5.
To display Figure 6-9 in terms of UML, we would remove the origin fields and simply map the Sales Line
table and its relations to referenced tables as associations. This would produce a star diagram with Sales
Line in the middle and connections to each of its referenced tables. In our opinion, it is advantageous to
make this refactoring. The result is shown in the following Figure 6-10.
Figure 6-10 Refactoring multiple associations to single associations
The proposed refactoring will, for SL, reduce the number of relations from 11 to 1, which is desirable.
6.3 Identifying REA Concepts
As described in the introduction (section 1) of this thesis, the project aim was expanded midway through the process. J. Kiehn introduced the REA accounting system and suggested that we should try to see how far we could get in identifying elements from the C/AL code that could fit into the REA model. The following sections give an introduction to the REA model and the double-entry accounting system (used in NAV), and describe our plan for identifying REA concepts in the C/AL code.
6.3.1 The Resource, Event and Agent (REA) model (40)
McCarthy found inspiration for REA around 1975 in the emerging relational databases and started working on defining a framework for building accounting systems in a shared data environment. The result was the REA model.
The model’s core feature is an object pattern consisting of two mirror image constellations that semantically represent the input and output components of a business process.
The model displays the exchange of goods for money, with focus on the duality of the transaction. McCarthy describes this as: “Stock-flow relationships associate the flows in and out of a resource category while the duality links keep the economic rationale for the exchange boldly in the forefront” (41).
We present a simple example (Figure 6-11) to illustrate how the REA model is constructed. The model describes an enterprise selling cookies.
The core understanding of the model is that we “Give Cookies and Take Cash”, and the duality of this action is the cornerstone of the model. The mirroring is marked by the dashed line.
Events are a central part of the model because they map the actions and relations. The principle of the
model is the two mirror-image constellations, where every component in the model has a corresponding
component in the mirror image.
The following section explains the components in a REA model and identifies the elements in Figure 6-11.
We introduce the elements in the following order: Relationships, Resources, Events, and Agents.
6.3.1.1 Relationships
Relationships map the association between Resources – Events and Events – Agents. The relationship
specifies the nature of the relation in terms of the action taking place. The relationship provides data about
entities (Resources and Agents). The relationship between Resources and Events expresses the exchange of
the Resource and the relationship between Events and Agents describes the role of the Agent.
The model below contains the three types of relationships:
• Inside Participation – Mapping the relationship between Events and internal employees (Salesperson or Cashier)
• Outside Participation – Mapping the relationship between Events and external actors (Customer)
• Stock-Flow – Mapping the relationship between Events and Resources
6.3.1.2 Resources
Resources map the stock of some entity. Typically, resources are the corn and bread of a company.
Resources are produced from other Resources, and the ultimate goal is normally to exchange them for
money. The model in Figure 6-11 contains two resources:
• Cookies – The enterprise’s stock of cookies
• Cash – The enterprise’s holding of cash
6.3.1.3 Events
Events map the individual actions inside the company, which together form the company’s processes. The
complete REA model describes the business processes, and an Event maps a single step in a process. The
model in Figure 6-11 contains two events:
• Sale – The Customer buys cookies from the Salesperson and receives the cookies
• Cash Receipt – The Customer pays for the cookies and the Cashier receives the payment
6.3.1.4 Agents
Agents are the individual actors in the model. Agents are the initiators of Events and they therefore play an
important role in the system. The model in Figure 6-11 contains three actors:
• Salesperson – Employee in the cookie enterprise handling the delivery of goods to the customer.
• Cashier – Employee in the cookie enterprise handling monetary transactions.
• Customer – External actor buying cookies from the Cookie Enterprise.
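The components above can also be sketched as plain data. The following Python sketch is our own illustration; the class and field names are invented for this example and are not NAV tables or official REA terminology.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str

@dataclass
class Agent:
    name: str
    inside: bool  # True: internal employee, False: external actor

@dataclass
class Event:
    name: str
    stock_flow: Resource   # Stock-Flow relationship to a Resource
    inside_agent: Agent    # Inside Participation
    outside_agent: Agent   # Outside Participation

cookies, cash = Resource("Cookies"), Resource("Cash")
salesperson = Agent("Salesperson", inside=True)
cashier = Agent("Cashier", inside=True)
customer = Agent("Customer", inside=False)

# The two mirror-image events, linked by the Duality:
# "Give Cookies and Take Cash".
sale = Event("Sale", cookies, salesperson, customer)
cash_receipt = Event("Cash Receipt", cash, cashier, customer)
duality = (sale, cash_receipt)  # (give, take)
```

Note how every component on the give side has a corresponding component on the take side, which is exactly the mirroring described above.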
[Figure omitted from extraction. Node labels: Economic Resource Cookies linked by Stock-Flow to Economic Event Sale, which is linked by Inside Participation to Economic Agent Salesperson and by Outside Participation to Economic Agent Customer; Sale (Give) is linked by Duality to Cash Receipt (Take); Economic Event Cash Receipt is linked by Stock-Flow to Economic Resource Cash, by Outside Participation to Economic Agent Customer, and by Inside Participation to Economic Agent Cashier.]
Figure 6-11 The cookie company (42)
6.3.1.5 Extending the Model
The model in Figure 6-11 can easily be extended to cover more business processes and ultimately cover the
entire cookie enterprise. An easy place to start would be to focus on the requirements of producing
cookies. We need to have flour, sugar, and butter to produce cookies, so extending the model to cover
inventory would be easy. The duality would be covered by mapping the monetary exchange for goods with
Vendors. From here on the model could be extended to cover employee wages, equipment, and so forth.
The REA model offers good scalability and a down-to-earth approach to mapping the relevant parts of the
business in terms of its business processes.
6.3.2 Introduction to Accounting Theory
As described previously, the core part of an ERP system is the accounting application. This is also the case in
NAV, and most of the tables and functionality that we will use for examples throughout this thesis are
components from the accounting application. In the following sections we introduce some new
concepts. First we give an introduction to the accounting system that NAV is designed after, namely the
dominant double-entry system. Thereafter, we draw parallels between the theory and the NAV
application to illustrate how the double-entry system shines through in the naming of tables and the behavior
of the system. Finally, we apply REA theory to the NAV application to examine whether it fits, and to
provide additional information on the architecture and behavior of the application.
6.3.2.1 The double-entry accounting system
The most used accounting system is the double-entry system. The double-entry system can be traced far
back; the first, and most influential, textbook on the technique was published in Venice in 1494 by the
Franciscan monk and mathematician Frater Luca Pacioli (43). The double-entry system is still the dominant
system today, and the largest ERP products, including systems such as SAP and NAV, are based on it.
The basic principle of the system is that every entry in the book should be presented as both a debit and a
credit. Postings therefore always influence at least two accounts: if an amount is added on one account, it
should be subtracted on another account. The principle is that debit should always equal credit,
which guarantees that the books balance.
This can be expressed in the simplified model below.
Debit side: Expenses (paying salaries, buying raw material); Assets (money in the bank, inventory).
Credit side: Revenue (money from sale of goods); Liabilities (debt).
Table 6-7 The double-entry accounting system
The accounting technique is to register entries on the debit side, unless the amount is negative, in which
case it is recorded on the credit side.
The debit-credit system remains fundamental to today’s bookkeeping methods.
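The balancing principle can be illustrated with a small sketch. The following Python code is our own illustration, not NAV code: every posting registers the same amount as a debit on one account and a credit on another, so the balances always sum to zero.

```python
from collections import defaultdict

ledger = defaultdict(float)  # account name -> balance (debit positive)

def post(debit_account, credit_account, amount):
    """Register one entry as both a debit and a credit (double entry)."""
    ledger[debit_account] += amount   # debit side
    ledger[credit_account] -= amount  # credit side

post("Bank", "Revenue", 500.0)     # money from sale of goods
post("Expenses", "Bank", 120.0)    # buying raw material

# Debit always equals credit, so the books balance.
assert sum(ledger.values()) == 0.0
```

Because every posting touches at least two accounts with equal and opposite amounts, an unbalanced ledger immediately reveals a bookkeeping error.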
6.3.2.2 The double-entry accounting terms
The double-entry accounting system has a special naming convention. The following section gives an
introduction to the terms ledger, entry, and journal used in double-entry accounting, and explains with an
example how NAV would support a simple beverage store.
The following definition of the term ‘ledger’ gives a good explanation of how ledgers, entries and journals
are tied together. The definition states: “A book for recording the monetary transactions of a business
entity in the form of debits and credits. Entries recorded in the subsidiary journals are posted (recorded) to
the general ledger as final entries.”(44)
To illustrate how this maps to the real world we will use the beverage fridge of a dormitory as a simple
example.
At the dormitory I live in, I am responsible for managing our nonprofit fridge with beverages. The system is
simple: when a person takes a beverage from the fridge, they mark it on a list. The list is a simple
spreadsheet with the names of the residents on the y-axis and the types of beverages on the x-axis.
1. Each line in this list corresponds to a sales line in NAV, listing which beverages we have sold to each
customer.
2. Once in a while, when we replace the lists on the fridge, we add together the beverages sold to each
customer and write the price in our book. Each line in this book corresponds to a journal line in NAV.
3. Every month we add up the total amount for each customer and give them an invoice. When the
invoice is paid, we note in our book that the invoice has been paid. This action is equivalent to the
NAV action of posting a journal line to an entry in the general ledger account and deleting the
journal afterwards.
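The three steps above can be sketched as data transformations. The following Python sketch is our own illustration of the flow; the names sales_lines, journal_lines, and ledger_entries are ours, not NAV table names.

```python
from collections import defaultdict

# 1. Each mark on the fridge list corresponds to a sales line.
sales_lines = [("Anna", "cola", 5.0), ("Anna", "beer", 8.0), ("Bob", "cola", 5.0)]

# 2. Adding together the beverages sold per customer gives the journal lines.
journal_lines = defaultdict(float)
for customer, _beverage, price in sales_lines:
    journal_lines[customer] += price

# 3. Posting: each journal line is copied to a ledger entry,
#    and the journal is deleted afterwards.
ledger_entries = [{"customer": c, "amount": a} for c, a in journal_lines.items()]
journal_lines.clear()  # the journal is deleted after posting
```

The copy-then-delete step in part 3 is the same journal-to-entry posting pattern that we identify in the NAV application code later in this chapter.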
6.3.3 Naming of tables in NAV
The following section is provided to elaborate on the double-entry accounting terms and show how these
naming conventions are reflected in NAV.
The naming of NAV tables follows the double-entry accounting terms and therefore tells us a lot about the
role of a table object in the system. This section lists the most relevant NAV tables, in terms of accounting,
and describes their role in shaping the basic NAV bookkeeping system.
Salesperson/Purchaser: Table for storing sales and purchase employees. The salesperson would be an employee having contact with customers, so that the customer would take one or more of the company’s products. The purchaser would be an employee having contact with vendors with the purpose of purchasing materials used to manufacture products or similar.
Customer: Table for storing customer information, such as ship-to address and similar information.
Customer Invoice Discount: Table for storing the customer’s negotiated discount from the regular list price.
Customer Account: Table for storing all financial transactions with a customer.
Customer Ledger Entry: Table to create entries regarding customers in the G/L Account.
Sales Header: Table for storing information on shipping.
Sales Line: Table for storing information on what to ship.
Vendor: Table for storing vendor information such as the company’s account number in the vendor’s system.
Vendor Invoice Discount: Table for storing the negotiated discount from the vendor’s regular list price.
Vendor Ledger Entry: Table to create vendor entries in the G/L Account.
Purchase Header: Table for storing information about the vendor the company bought products from.
Purchase Line: Table for storing information on which products the company bought.
Item: Table for storing the products the company sells.
Item Ledger Entry: Table for creating entries regarding Items in the G/L Account.
General Journal Line: Records showing the exchanged goods not yet paid for.
G/L Account: The general ledger account is the general accounting book for the company.
G/L Entry: The general ledger entry table is for creating entries in the general book.
Table 6-8 Table naming in NAV
The important point is that tables with names containing journal, entry, and ledger are clear indicators that
NAV is based on the double-entry accounting system.
6.3.4 REA concepts in NAV
As presented in section 6.3.3, NAV contains characteristics of the double-entry system. If the system were to
be refactored to the REA accounting model, all tables handling receipt of goods, use of materials, and other
actions related to exchanging resources would have to be refactored to REA Events (45).
For NAV this would mean that the entry, journal, and ledger tables would be replaced with REA event
tables. In the following section we examine how this could be done.
In Table 6-8 it can be seen that the system contains journal and entry tables which follow the concept from
the double entry system. We have identified the pattern of a journal line being posted to an entry. The
action of posting an entry with data from a journal line is equivalent to an Event in REA. If we can find the
journal line we can also find the tables that contribute to the values in the journal line. The idea proposed
by J. Kiehn was to identify REA Resources and Agents from the posting event. We do, however, know that
many values written to journal tables are not themselves part of a REA relation. The Gen. Journal
Line table contains 142 fields, with data such as customer discount, date of maturity, etc. The
following section contains an analysis of the steps necessary to identify the REA candidates.
6.3.4.1 Identifying REA concepts
This section contains an overall step-by-step approach for identifying REA candidates. The goal here is to
lay out the overall requirements for REA candidates; we analyze them in depth in the manual analysis in
section 6.3.5.
1. Find all entry/journal table variable sets where:
a. Field values from a journal line variable are copied to an entry table variable.
Ex: CustLedgEntry."Customer No." := GenJnlLine."Account No.";
b. The entry table instance is inserted in the entry table.
Ex: CustLedgEntry.INSERT;
c. The journal line instance is deleted from the journal line table.
Ex: GenJnlLine.DELETE;
2. Find all tables from which values are copied to the journal line:
a. Identify all the table objects copying values to the journal line tables identified in step 1.
Ex: GenJnlLine."Account No." := SourceCodeSetup."Account No.";
3. Find REA candidates from the journal line:
a. If there exists a containment (described in section 6.1) between the entry table identified in
step 1 and the table objects identified in step 2, we have identified a REA candidate.
b. If there does not exist a containment, we check whether there is a resemblance in the naming of
the tables. For instance, Vendor Ledger Entry, which will be identified in step 1 as part of the variable
set (Vendor Ledger Entry / Gen. Journal Line), has a resemblance to the name of the Vendor
table, which will be identified in step 2. Therefore it would be reasonable to assume they
are related.
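To make the steps concrete, here is a hedged Python sketch of the three steps run over a toy list of simplified statements. It is our own toy model, not the real implementation over the C/AL syntax tree; the statement tuples and the four-character name-stem heuristic are our own simplifications.

```python
# Toy statement forms:
#   ("assign", (dst_table, dst_field), (src_table, src_field))
#   ("insert", table) / ("delete", table)
statements = [
    ("assign", ("GenJnlLine", "Account No."), ("Vendor", "No.")),                # step 2
    ("assign", ("VendLedgEntry", "Vendor No."), ("GenJnlLine", "Account No.")),  # step 1a
    ("insert", "VendLedgEntry"),                                                 # step 1b
    ("delete", "GenJnlLine"),                                                    # step 1c
]

def rea_candidates(stmts, containments=frozenset()):
    copies = {(s[1][0], s[2][0]) for s in stmts if s[0] == "assign"}
    inserted = {s[1] for s in stmts if s[0] == "insert"}
    deleted = {s[1] for s in stmts if s[0] == "delete"}
    # Step 1: entry/journal variable sets (field copy + INSERT + DELETE).
    pairs = [(e, j) for (e, j) in copies if e in inserted and j in deleted]
    candidates = []
    for entry, journal in pairs:
        # Step 2: tables copying values into the journal line.
        sources = {src for (dst, src) in copies if dst == journal}
        for src in sources:
            # Step 3: containment (3a), or a naming resemblance (3b),
            # here crudely approximated as a shared four-character stem.
            if (entry, src) in containments or entry.startswith(src[:4]):
                candidates.append((entry, journal, src))
    return candidates
```

On the toy statements the sketch finds the (Vendor Ledger Entry / Gen. Journal Line / Vendor) candidate via the naming resemblance rule in step 3b.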
6.3.5 Manual analysis of REA Events in NAV
In section 6.3.4 we presented how we see REA as a candidate to replace the double-entry system. We
found that the posting of data from journal lines to entries is a key component in the double-entry system.
Further we found that we can apply REA terminology to the meaning of this movement of data. We know
that the posting of data to an entry could be expressed as an event in REA terminology. Our initial work is
therefore to find all relations between tables where values from a Journal Line table are copied to an Entry
table. J. Kiehn has identified Codeunit 12 Gen.Jnl.-Post Line as a Codeunit that posts journal lines (of type
81 Gen. Journal Line) to an entry. The name Jnl.-Post Line is an abbreviation of Posting of Journal Lines.
When we search through the Codeunits in NAV, we find the following Codeunits with the same naming:
• 12 Gen. Jnl.-Post Line
• 22 Item Jnl.-Post Line
• 32 BOM Jnl.-Post Line
• 212 Res. Jnl.-Post Line
• 1012 Job Jnl.-Post Line
• 5633 FA Jnl.-Post Line
• 5652 Insurance Jnl.-Post Line
Common to all the journal line posting Codeunits above is that they are called from a Codeunit with an ID
number one higher and the same name with Line replaced by Batch, e.g. 12 Gen. Jnl.-Post Line -> 13
Gen. Jnl.-Post Batch. In NAV, batch normally refers to a process run as a batch process, i.e. a process with no
screen output.
We will manually examine Codeunit 12 Gen. Jnl.-Post Line (CU12) and Codeunit 13 Gen. Jnl.-Post Batch
(CU13) to find a suitable method to identify the posting mechanism. When the posting mechanism has
been identified and we have automated the process of identification, the next step will be to identify tables
that post values to a journal line and transitively post values to an entry table. The identified tables will be
REA Resource and Agent candidates. We expect the seven posting Codeunits to share a common structure
due to the lack of inheritance described in section 2.4.
CU12 and CU13 contain 7,245 and 996 lines of code, respectively. In the following section we start our
analysis from CU13, which initiates the event. We only present the most significant code lines.
ID  Codeunit and procedure            Code
1   CU13.OnRun(Rec)                   Code;
    (Rec: Gen. Journal Line record)
2   CU13.Code()                       GenJnlPostLine.RunWithoutCheck(GenJnlLine5,TempJnlLineDim);
3   CU12.RunWithoutCheck()            Code(false);
4   CU12.Code()                       IF "Account No." <> '' THEN
                                        CASE "Account Type" OF
                                          "Account Type"::"G/L Account":
                                            PostGLAcc;
                                          "Account Type"::Customer:
                                            PostCust;
                                          …
                                          "Account Type"::"IC Partner":
                                            PostICPartner;
                                        END;
5   CU12.PostCust()                   WITH GenJnlLine DO BEGIN
                                        CustLedgEntry.LOCKTABLE;
                                        CustLedgEntry.INIT;
                                        CustLedgEntry."Customer No." := "Account No.";
                                        …
                                        CustLedgEntry."Posting Date" := "Posting Date";
                                        …
                                        CustLedgEntry.INSERT;
6   CU13.Code() (continuing)          GenJnlLine3.DELETE; //(referring to same instance)
Table 6-9 Code for posting Journals to Entries
The section hereafter describes each of the above six code blocks and explains why they are relevant.
1. This part is taken from the OnRun trigger, which is executed when the Codeunit is run.
The OnRun trigger takes a record of type Gen. Journal Line as parameter and calls the Code
procedure in the same Codeunit.
2. The Code procedure calls the procedure RunWithoutCheck in CU12.
3. The RunWithoutCheck calls the local procedure Code.
4. The Code procedure calls local procedures based on which account type the postings should be
posted to.
5. One of the posting procedures in step 4 is PostCust(). This procedure copies values from the Gen.
Journal Line table (GenJnlLine) to the Customer Ledger Entry table (CustLedgEntry) and finally the
entry is inserted.
6. When PostCust() returns, the execution of CU13.Code() from step 2 continues. The Code procedure
deletes the inserted Gen. Journal Line (GenJnlLine3).
From the manual analysis we found that there is variation in how the posting procedures are
implemented. Further, the analysis revealed that the code is complex and that there are many procedure
references that need to be analyzed. The analyzed pattern is spread among five procedures located in two
Codeunits. The automated identification algorithm will need to analyze all referenced code.
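Because the pattern is spread across procedures in two Codeunits, the algorithm must follow procedure references transitively. A minimal Python sketch of this idea (our own illustration; the call graph below is a hand-made abstraction of CU12/CU13, not extracted from the real NAV code):

```python
# Toy call graph: procedure name -> procedures it calls.
calls = {
    "CU13.OnRun": ["CU13.Code"],
    "CU13.Code": ["CU12.RunWithoutCheck"],
    "CU12.RunWithoutCheck": ["CU12.Code"],
    "CU12.Code": ["CU12.PostGLAcc", "CU12.PostCust"],
    "CU12.PostGLAcc": [],
    "CU12.PostCust": [],
}

def reachable(start, graph):
    """All procedures whose code must be analyzed, starting from `start`."""
    seen, todo = set(), [start]
    while todo:
        proc = todo.pop()
        if proc not in seen:
            seen.add(proc)
            todo.extend(graph.get(proc, []))
    return seen
```

Starting from the OnRun trigger, the walk reaches every posting procedure, which is exactly the set of code an automated identification algorithm would have to analyze.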
Chapter 7
7 Tools
The tools section introduces the tools, technologies, and methods used to implement:
• The parser extension identified as a requirement in section 3.1.1.
• The containment, generalization, and REA identification algorithms described in section 8.9.
• A tool (Concept Viewer), see section 9.5, for dynamically displaying the findings of our identification algorithms.
7.1 .NET Framework
.NET is the platform underlying the latest version of the NAV ERP application, and all related work (see
section 3) builds on it. Thus .NET is the only sensible platform for the job at hand.
The .NET framework is the Microsoft flagship when it comes to development. Version 1.0 was released in
2001 and the latest release is version 3.5. The framework allows developers to program for any platform
for which there exists a .NET framework implementation, much like the Java Virtual Machine does for
Java. Microsoft has made the standard open, allowing implementations for non-Microsoft platforms to
appear. Mono (46) is one such solution, offering full C# 3.0 support in its latest release (version 2.4),
targeting Windows, Mac OS X, and Linux.
The different flavors of programming languages (F#, C#, VB.NET, and others) are compiled by language-specific
compilers to Microsoft Intermediate Language (MSIL). When the code is run, the Common Language Runtime (CLR)
compiles the MSIL to native machine code with a just-in-time (JIT) compiler. This can be studied in depth at
MSDN (47).
7.2 Interoperability
A great advantage of the .NET languages C#, VB.NET, and F# is the interoperability they offer. As described
above, they all share the common base of the .NET framework, which enables developers to use classes,
modules, methods, and functions from any other .NET-based language. The integration can be done easily in
Visual Studio simply by referencing a project or .dll. The integration is so complete that the
developer can seamlessly debug through the executed code lines, jumping between C# and F# projects
without noticing the technology and project shift.
7.3 F#
Hvitved’s parser was written in F#, but it is incomplete, see section 3.1.1, with respect to the problems that
we are going to solve in this project. Thus we need a basic understanding of F# in order to extend the C/AL
parser.
F# is the latest addition to the family of .NET languages. The latest release is the Community Technology
Preview (CTP 1.9.6.2) which was released in September 2008. F# is the result of a research project by Don
Syme at Microsoft Research in Cambridge, England. The language will be fully integrated in Visual
Studio 10, which is expected to ship together with the .NET framework version 4.0 in the latter half of 2009.
The language is a declarative language extended with some support for imperative programming. It
is heavily inspired by OCaml (48) (Objective Caml), which is inspired by CAML (49) (Categorical
Abstract Machine Language), which in turn is inspired by ML (50) (Metalanguage), developed in the 1970s
at Edinburgh University. As described, the language has a strong heritage, with many potential users among
scientists and researchers in various fields.
The biggest achievement of F# is that it combines the brevity and robustness of the Caml programming
language family with .NET interoperability, facilitating seamless integration of F# programs with any other
program written in a .NET language, as described in section 7.2.
7.3.1 Language syntax
As mentioned before, the language is heavily inspired by OCaml. The syntax is actually so similar to OCaml
and Haskell that F# forums even recommend reading books on Haskell and OCaml to learn F# (51). Speaking
of syntax, it is worth noticing that whitespace is significant, which will be a surprise to the average OO
programmer. The language is strongly typed and uses type inference, as we will demonstrate in the
following examples. Other features worth mentioning are generics and modules. Functions in F# are
implicitly generic and can be reused in other functions. They can even be nested inside other functions,
returning their result directly to a parent function. Modules are a great help when writing F# programs,
allowing the developer to hide the functions not meant for use outside a module.
F# is one of the first mainstream programming languages implementing active patterns and asynchronous
programming constructs. The two sections below provide an introduction to these concepts. The active
pattern concept is used in our parser extension; the section on asynchronous programming is provided
even though we have not used asynchronous programming in our project. The asynchronous programming
features of F# are unique among the .NET languages and a good motivation for learning F#.
7.3.1.1 Active Patterns (52)
Pattern matching is normally done over the core representations of data structures such as lists, options,
records and discriminated unions. In F# pattern matching is extensible allowing the developer to define
new ways of matching over existing types. Active patterns are the technique for this.
The following example is an introduction to how active patterns can be useful in converting an object.
In computer science, complex numbers are normally represented in rectangular coordinates (x + yi), but in
mathematics we often use the polar representation, consisting of a phase and a magnitude. For some
calculations the polar representation is preferable. The next example illustrates how we can define two
active patterns to match complex numbers in either the rectangular or the polar representation.
let (|Rect|) (x:complex) = (x.RealPart, x.ImaginaryPart)
let (|Polar|) (x:complex) = (x.Magnitude, x.Phase)
Code 7-1 Example with complex numbers – Definition of active patterns
‘let’ is the definition keyword used for declaring functions and values in F#. The active patterns we have
defined are named Rect and Polar; they both take one parameter x of type complex (requires import of
Microsoft.FSharp.Math), and the returned value will be the value set for either the rectangular or the polar
representation of the complex number.
The next three code examples show how the two active patterns defined above can facilitate cleaner and
shorter algorithm designs. The code defines three functions: one add function and two multiplication
functions.
The first function, addViaRect, adds two complex numbers (a and b) using their
rectangular representation. F# is a strongly typed language with type inference; the parameter
types will therefore automatically be inferred to be complex based on the match against the Rect pattern. The
addViaRect function matches the input with Rect and returns the sum of the two complex numbers as a new
complex number.
let addViaRect a b =
    match a, b with
    | Rect(ar,ai), Rect(br,bi) -> Complex.mkRect(ar+br, ai+bi)
Code 7-2 Example with complex numbers – Add function
The last two functions, mulViaRect and mulViaPolar, are provided to underline the advantage of
having both representations. We see that multiplying complex numbers in the rectangular representation
is much more complicated than performing the same operation on the polar representation. Therefore it is
beneficial to be able to use the optimal representation in each case.
let mulViaRect a b =
    match a, b with
    | Rect(ar,ai), Rect(br,bi) -> Complex.mkRect(ar*br - ai*bi, ai*br + bi*ar)

let mulViaPolar a b =
    match a, b with
    | Polar(m,p), Polar(n,q) -> Complex.mkPolar(m*n, (p+q))
Code 7-3 Example with complex numbers – Multiply functions
7.3.1.2 Asynchronous programming (53)
F# supports asynchronous workflows, allowing developers to easily convert single-threaded code into
multi-threaded code.
In contrast to the ‘let’ keyword, ‘let!’ is used to define computations that run
asynchronously. Every binding defined with the ‘let!’ keyword executes on a thread that is
taken from, and released back to, a thread pool when the computation is done. Asynchronous computation
therefore uses more threads than single-threaded execution. One of the reasons multithreaded computation
is easy to do in F# is its immutable data types. Data types in F#, and in functional languages in general,
are implicitly immutable, meaning that a declared value cannot be changed during execution. This is a great
advantage when designing thread-safe code, because we can always rely on the state of a value. When
mutable data types are needed, they have to be explicitly declared with the keyword mutable. This limits
the available options for performance improvements from running in multiple threads.
The following modified example from the book Expert F# is provided to explain how asynchronous
programming can be done in F#. The provided code does not actually run in the CTP version of the F#
language due to changes in FSharp.Core.dll, but it still serves as a good illustration of
how asynchronous computations work in F#.
open System.Net
open System.IO
open Microsoft.FSharp.Control.CommonExtensions
let museums = ["MOMA", "http://moma.org/";
               "British Museum", "http://www.thebritishmuseum.ac.uk/";
               "Prado", "http://museoprado.mcu.es"]

let fetchSync(nm,url:string) =
    do printfn "Creating request for %s..." nm
    let req = WebRequest.Create(url)
    let resp = req.GetResponse()
    do printfn "Getting response stream for %s..." nm
    let stream = resp.GetResponseStream()
    do printfn "Reading response for %s..." nm
    let reader = new StreamReader(stream)
    let html = reader.ReadToEnd()
    do printfn "Read %d characters for %s..." html.Length nm

for nm,url in museums do
    fetchSync(nm,url)
Code 7-4 Example with Sync and Async execution - Synchronous function
Code 7-4 above has three main parts. The first part declares the list museums, with three elements, each
containing the name and URL of a museum. The second part is the fetchSync function, which creates a
request for a webpage, waits for the response, and prints the length of the response to the screen.
The third is a for loop which calls the fetchSync function for each element in the museums list. The
above code runs in a single thread, retrieving one URL at a time. The thread therefore sits idle while
the program waits for the response and while the response is downloaded.
Code 7-5 is the asynchronous version of parts two and three from above. The first thing to notice is how
similar the two solutions are! The fetchAsync function’s inner code is declared inside an async {} block,
marking that the code within {} is part of an asynchronous workflow. The next part is the let! keyword
described earlier, declaring that the result should be computed asynchronously. Finally, the
for loop calls fetchAsync by spawning a new thread with Async.Spawn for each function call.
let fetchAsync(nm,url:string) =
    async { do printfn "Creating request for %s..." nm
            let req = WebRequest.Create(url)
            let! resp = req.GetResponseAsync()
            do printfn "Getting response stream for %s..." nm
            let stream = resp.GetResponseStream()
            do printfn "Reading response for %s..." nm
            let reader = new StreamReader(stream)
            let! html = reader.ReadToEndAsync()
            do printfn "Read %d characters for %s..." html.Length nm }

for nm,url in museums do
    Async.Spawn (fetchAsync(nm,url))
Code 7-5 Example with Sync and Async execution - Asynchronous function
The performance of the two approaches is very different. The performance of fetchSync is
heavily impaired by the waiting time from sequentially requesting and receiving one HTML page at a time.
The performance of fetchAsync is better because all three requests are fired off right away instead of
waiting for earlier requests to complete.
The box below shows how the two approaches produce the same output in a different order and,
implicitly, at different speeds, for the reasons described above. The execution time of the
asynchronous version will therefore be close to the execution time required to handle the slowest or largest
response, in this case the MOMA museum’s homepage. The highlighted text gives an indication of which
step the synchronous approach will be executing when the asynchronous execution completes.
Synchronous execution:
Creating request for MOMA...
Getting response for MOMA...
Reading response for MOMA...
Read 41635 characters for MOMA...
Creating request for British Museum...
Getting response for British Museum...
Reading response for British Museum...
Read 24341 characters for British Museum...
Creating request for Prado...
Getting response for Prado...
Reading response for Prado...
Read 188 characters for Prado...

Asynchronous execution:
Creating request for MOMA...
Creating request for British Museum...
Creating request for Prado...
Getting response for MOMA...
Reading response for MOMA...
Getting response for Prado...
Reading response for Prado...
Read 188 characters for Prado...
Read 41635 characters for MOMA...
Getting response for British Museum...
Reading response for British Museum...
Read 24341 characters for British Museum...

Table 7-1 Sync vs. Async execution
7.4 LEX, YACC and Abstract Syntax Trees (54)
This section is an introduction to the necessary steps in parsing a textual language into an abstract
representation that keeps the syntax of the text. These steps are relevant to this project, because much of
our work is based on these techniques. We describe the two parsing tools LEX (55) and YACC (56), their
role in a compiler, and how the steps of a compiler are closely related to the work in this thesis.
There are many tools for building parsers. The CALParser, described in section 3.1.1, is built with FSLEX and
FSYACC, the F# variants of LEX and YACC.
LEX is a tool that produces lexical analyzers. A lexical analyzer is a program that recognizes tokens in a text
and outputs the identified tokens. The lexical analyzer is produced by compiling a LEX file written in a special
grammar. The grammar relies heavily on regular expressions; this technique is described in section 7.6.
YACC (Yet Another Compiler Compiler) is a tool used to define grammar rules. The grammar
rules recognize sequences of tokens defined in a lexer file. The parser definition file is compiled with
YACC, and the result is a parser. The output from the parser is token blocks that can be stored in a suitable
data structure, such as an Abstract Syntax Tree (AST).
The data structure used by the CALParser is an Abstract Syntax Tree (AST). The AST is constructed by
matching YACC output to data object types with the use of Active Patterns, described in section 7.3.1.1.
[Figure omitted from extraction: a tree with := at the root, a . (member access) node with children Object and Property, and Value as the other child of :=.]
Figure 7-1 The structure of an abstract syntax tree
Figure 7-1 shows how an assignment statement would be represented in an abstract
syntax tree. The assignment code can be seen
below.
Object.Property := Value;
Code 7-6 C/AL variable assignment
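For illustration only, the shape of the tree in Figure 7-1 can be sketched with hypothetical C# types; the real data model is the F# file CALast.fs referenced below, not this code:

```csharp
using System;

abstract class Exp { }

// Identifier leaf (Object, Property, Value in Figure 7-1)
class Ident : Exp
{
    public string Name;
    public Ident(string name) { Name = name; }
    public override string ToString() { return Name; }
}

// The "." member-access node
class MemberAccess : Exp
{
    public Exp Target, Member;
    public MemberAccess(Exp target, Exp member) { Target = target; Member = member; }
    public override string ToString() { return Target + "." + Member; }
}

// The ":=" assignment node at the root of the tree
class Assign
{
    public Exp Left, Right;
    public Assign(Exp left, Exp right) { Left = left; Right = right; }
    public override string ToString() { return Left + " := " + Right + ";"; }
}

class Program
{
    static void Main()
    {
        Assign tree = new Assign(
            new MemberAccess(new Ident("Object"), new Ident("Property")),
            new Ident("Value"));
        Console.WriteLine(tree); // prints: Object.Property := Value;
    }
}
```

Printing the tree reproduces the C/AL line of Code 7-6, which mirrors how the CALParser's printer function turns an AST back into source text.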
The full data model can be seen in the F# source file \Code\Work From T. Hvitved\oo parser\CALast.fs on
the appendix DVD.
The use of abstract syntax trees in this project is elaborated in section 8.
Parsing files is the initial task in a compiler, and the work in this project is in fact very similar to the work
of a compiler.
A typical compiler goes through the following stages when compiling a program:
 Lexing
Dividing the code into understandable tokens
 Parsing
Parsing the code into understandable blocks (if, for, while, etc. statements)
 Semantic Analysis
Checking that the code makes sense (variables are assigned before they are referenced, etc.)
 Optimizations/Transformations
Optimizing the code by removing unused variables and performing other more advanced transformations
 Code generation
Generating output, which could be running code
The parsed code has already been compiled by the internal C/AL compiler. Therefore, there is no need to
perform semantic analysis. The code is tokenized and parsed with the CALParser and the developed
subparser. The primary task in this project corresponds to the optimization and transformation stage of the compiler. The
presented refactorings are the result of this stage. The output format for this “compiler” is UML diagrams,
provided by the Concept Viewer.
7.5 C#
C# is the target platform; we may even want to port the application code to C#, and the final goal of the
refactorization is to produce more readable C# code than the present C# token parser, see section 2.3.3.
According to the TIOBE programming community index (57), C# is the 7th most popular programming
language, with Java still taking the 1st prize. This is a very good result for a programming language that has
only been around for 8 years, taking into account that the indexing algorithm used by TIOBE, among other
things, counts legacy code written 20 years ago when calculating its ranking.
As most Danes in computer science know, C# and the .NET framework are developed under lead architect
Anders Hejlsberg. Version 1.0 of C# was released in 2001, and the latest version is 3.0, matching version 3.5
of the .NET framework. C# is a highly advanced imperative programming language offering many advanced
language extensions. We have decided to present a few interesting concepts from the .NET languages from
a C# perspective, and will go over the following language extensions in the sections below: Regular
Expressions, Lambda Expressions and LINQ.
7.6 Regular expressions (58), (59)
Regular Expressions originate from automata and formal language theory developed by Warren McCulloch
and Walter Pitts in their work on the McCulloch-Pitts neuron model.
The Regular Expression technique has been used since the 1960s and is today widely supported in
programming languages. Regular Expressions have been supported by the .NET framework from the first
versions, and all the common .NET languages support the use of Regular Expressions.
Regular Expressions are useful for declaring patterns that can be used to identify matching text in
strings. The functionality is described in section 8.4.
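As a minimal illustration of this support in .NET (the pattern below is invented for the example; the project's real patterns appear in section 8.4):

```csharp
using System;
using System.Text.RegularExpressions;

class RegexDemo
{
    static void Main()
    {
        // Invented pattern: a whole string of letters and spaces, optionally quoted
        string pattern = "^\"?[A-Za-z ]+\"?$";

        Console.WriteLine(Regex.IsMatch("Sales Header", pattern));    // True
        Console.WriteLine(Regex.IsMatch("\"G/L Account\"", pattern)); // False: '/' is not allowed here
    }
}
```

The `Regex` class from `System.Text.RegularExpressions` is available to all .NET languages, which is what lets both our F# parser rules and our C# analysis code share the same pattern technique.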
7.7 Lambda expressions (60), (61)
We use lambda expressions to write cleaner C# code. The use of lambda expression has enabled us to
reduce the total number of C# code lines.
Lambda expressions originate from the lambda calculus, which formed some of the pillars of computer
science as we know it today. The lambda calculus was developed by Alonzo Church during the 1930s and has
directly or indirectly inspired Von Neumann in his design of a computer able to process jobs using software
(62).
The principles behind the lambda calculus have in many ways shaped functional languages such as Erlang,
Haskell, Lisp, ML and F#. Other types of languages also offer support for lambda expressions, for example the
array programming language APL and the object-oriented C#. C# introduced support for lambda expressions
in version 3.0, which has made lambda expressions a valuable and useful language extension in everyday C#
programming.
Lambda expressions in C# are often used on collections. Every collection implementing the
IEnumerable<T> interface has implicit support for lambda expression operations. The definition in the
box Code 7-7 shows the method First, which can be called on any enumerable type (List, Array,
Dictionary, etc.). The documentation of the First method defines First as a generic method able to select an
element based on the provided Func<TSource, Boolean>.
Enumerable.First<TSource>(IEnumerable<TSource>, Func<TSource, Boolean>)
Code 7-7 API description of the First method
The code in Code 7-8 is a simple example of how to use the First method. The code defines an integer
array with 15 values. The First method returns a single integer value. The element type is
implicitly inferred to be integer, and we choose the first number with a value above 80. The
returned value will in this case be the value 92 at index 3.
int[] numbers = { 9, 34, 65, 92, 87, 435, 3, 54, 83, 23, 87, 435, 67, 12, 19 };
int first = numbers.First(number => number > 80);
Code 7-8 Example with lambda expression
The code in Code 7-8 is easily understood and can be expressed in a single line. If we had to write the code
in Code 7-8 without lambda expressions, it would require 9 lines of code, as seen in the example Code 7-9.
The presence of generic methods that can be configured with lambda expressions is therefore a great
way to improve readability by removing the need for extra lines of code.
public static int First(int[] numbers)
{
    foreach (int number in numbers)
    {
        if (number > 80)
            return number;
    }
    throw new InvalidOperationException("No number satisfied condition");
}
Code 7-9 Example without lambda expression
7.8 LINQ (63) (64)
Language Integrated Query, in daily terms LINQ, enables us to query data of various types. We use LINQ to
query an abstract syntax tree (see section 7.4) representing the NAV application code. LINQ was presented
to the public in 2005 during the annual Microsoft Professional Developers Conference (PDC), and was later
released with the .NET framework version 3.5. LINQ was developed to fill the gap between object-oriented
languages and data that does not exist as objects; one example is data in a relational database.
LINQ is the result of a long-term research investment, and many other projects have formed the basis that
LINQ builds on. Among the more significant projects are Cω (C-Omega), ObjectSpaces and XQuery. LINQ
is designed by Anders Hejlsberg, who, among many other frameworks and languages, also designed the
.NET framework, and it has one big advantage compared to the former projects.
LINQ is designed to generically support all types of data sources, which was one of the main reasons for
focusing on LINQ instead of funding separate projects aimed at individual data sources.
The vast majority of applications being developed access data of some kind. The consequence is that a
developer needs to learn more than one language. For instance creating a database query often requires
that the developer writes a SQL statement as a string and sends it to the database.
Product manager for Visual Studio, Jason McConnell expressed this as: “It was like you had to order your
dinner in one language and drinks in another,” (65)
LINQ aims to remove this gap between the data world and the world of general-purpose
programming languages by providing a uniform way to access data from within the programming language.
To underline how LINQ fills this gap, one can conceive of LINQ as consisting of two complementary parts: a set of tools that work with data, and a set of
programming language extensions.
The uniformity of the design enables the developer to query objects in memory, relational databases, XML
documents and other data from within the same language, notably with the same simple SQL-inspired
syntax on all data sources. LINQ is implemented on the .NET framework and can be used from within the
.NET languages.
To give an idea of the differences between code written with and without LINQ, we present
an example (66) in Code 7-10 and Code 7-11 of some simple C# code working on XML data. Both examples
produce a book collection with two books published in 2006.
7.8.1 Example 1: Without LINQ
using System;
using System.Xml;

class Book //used in both examples
{
    public string Title;
    public string Publisher;
    public int Year;

    public Book(string title, string publisher, int year)
    {
        Title = title;
        Publisher = publisher;
        Year = year;
    }
}

static class HelloLinqToXml
{
    static void Main()
    {
        Book[] books = new Book[] {
            new Book("Ajax in Action", "Manning", 2005),
            new Book("Windows Forms in Action", "Manning", 2006),
            new Book("RSS and Atom in Action", "Manning", 2006)
        };
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("books");
        foreach (Book book in books)
        {
            if (book.Year == 2006)
            {
                XmlElement element = doc.CreateElement("book");
                element.SetAttribute("title", book.Title);
                XmlElement publisher = doc.CreateElement("publisher");
                publisher.InnerText = book.Publisher;
                element.AppendChild(publisher);
                root.AppendChild(element);
            }
        }
        doc.AppendChild(root);
        doc.Save(Console.Out);
    }
}
Code 7-10 Example without LINQ
7.8.2 Example 2: With LINQ
Example 2 reuses the code from example 1: the XML-building part of example 1, from the XmlDocument
declaration through doc.Save, is replaced with the code in example 2.
        XElement xml = new XElement("books",
            from book in books
            where book.Year == 2006
            select new XElement("book",
                new XAttribute("title", book.Title),
                new XElement("publisher", book.Publisher)
            )
        );
        Console.WriteLine(xml);
    }
}
Code 7-11 Example with LINQ
When we compare the two examples we first notice that the example with LINQ (Code 7-11) is shorter than
the example without LINQ (Code 7-10). Example 1 counts 17 LOC and example 2 counts 10 LOC, which is a
41 % decrease.
When we examine example 1 to see where the extra code went, we see that most of the code is related to
creating and adding elements to the XML document. Further, we see that we have to handle concepts
such as root node, create element, set attribute, and append child. These concepts require
developers to watch their steps and direct more of their attention towards the XML technology, because it
can be hard to predict the actual XML outcome of longer and more complex documents.
When we look at the LINQ counterpart in example 2, we see how easily this can be done with a much more
direct approach, by simply returning the XML structure. The code is shorter, simpler and
easier to read, because we can create the generated XML structure in a single statement. Further, the LINQ
example is able to replace both the C# foreach and if statements with simple where and select
clauses.
Comparing examples using LINQ with examples not using LINQ, we generally find that code
based on traditional data-access APIs such as ADO.NET, System.Xml or similar data binders is less
optimal than its LINQ counterpart. LINQ code is more compact and automatically has a high
degree of readability.
LINQ offers out-of-the-box support for objects, SQL, XML, and any data structure implementing the
IEnumerable<T> interface, enabling arrays, collections, databases and XML to be queried with the same
uniform syntax.
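As a minimal sketch of this uniformity, the same query syntax used on XML above also works on a plain in-memory array (the data here is invented):

```csharp
using System;
using System.Linq;

class LinqDemo
{
    static void Main()
    {
        // Invented data: a plain array is queried exactly like the XML above
        int[] numbers = { 9, 34, 65, 92, 87, 435, 3 };

        var large = from n in numbers
                    where n > 80
                    orderby n
                    select n;

        foreach (int n in large)
            Console.Write(n + " "); // 87 92 435
    }
}
```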
Chapter 8
8 Implementation
This section introduces how we actually solved the problem defined in section 1. We describe some of the
obstacles we had to overcome. Furthermore, we introduce parts of our solution and some of the key
components enabling us to solve the problem.
8.1 Problem with Parser
We have found that we can extend neither the CALParser implementation nor the abstract syntax
tree (AST) it produces. The reason we cannot extend the produced AST is that F# is a functional
language that uses immutable data types to enable asynchronous computation. When a TableRelation (TR)
has been stored in the AST, we cannot subparse it. Therefore we decided to export the AST to XML. By
exporting the entire AST to XML we are able to query the code with LINQ, as described in section 8.5.
8.1.1 CALParser AST to XML AST
The CALParser produces an Abstract Syntax Tree from the parsed C/AL code. Further, the package contains
a printer function that prints the AST as C/AL code. We decided to export the C/AL parser output AST to
XML. We did this by rewriting the print function to emit well-formed XML and write the result to a file. By
exporting the AST to XML we have attained the following important goals:
 We have a data structure we can easily manipulate and extend
 We can access the data from any programming language able to handle XML and files
 We can use the .NET language extension LINQ to query the XML files in an easy manner
 We have prepared precomputation by storing the AST data in an XML file. This will have a positive
impact on computation time later in the process, because we do not have to run the CALParser every
time we do analysis.
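The idea of a print function rewritten to emit XML can be sketched as follows. This C# fragment is illustrative only; the element names follow the shape of our XML AST, but the real export is done in the F# printer function:

```csharp
using System;
using System.Xml;

class AstXmlExport
{
    static void Main()
    {
        // Illustrative sketch: instead of printing C/AL text, the printer
        // emits well-formed XML elements for each AST node.
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.OmitXmlDeclaration = true;

        using (XmlWriter w = XmlWriter.Create(Console.Out, settings))
        {
            w.WriteStartElement("StmtAssign");
            w.WriteElementString("OpName", ":=");
            w.WriteElementString("Exp1", "PaymentTerms.Description");
            w.WriteElementString("Exp2", "PaymentTermsTranslation.Description");
            w.WriteEndElement();
        }
    }
}
```

In the real export the Exp1/Exp2 children are themselves nested nodes (see Code 8-2); here they are flattened to strings for brevity.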
8.2 New data representation
As decided above, we store the parsed code in an XML file. The file follows the rules for well-formed
XML (67). We give a small example of our XML structure and leave the larger study to the reader. The full
XML representation of Tables and Codeunits can be found on the appendix DVD in the folder Code\Abstract
Syntax Tree XML Files\.
To give an example of how C/AL code is represented in our XML document we have chosen to extract the
XML node for the following simple IF statement:
IF PaymentTermsTranslation.GET(PaymentTerms.Code,Language) THEN
PaymentTerms.Description := PaymentTermsTranslation.Description;
Code 8-1 C/AL IF statement
The IF statement in Code 8-1 is the first statement in the first table in the system. It is extracted from the
TranslateDescription method in the table Payment Terms.
The IF statement uses GET to check if a row exists in table Payment Terms Translation with
fields matching the two given parameters (PaymentTerms.Code and Language). If GET returns true,
the description field in Payment Terms is assigned the value of the description field in table Payment
Terms Translation.
The corresponding XML data for the two lines in Code 8-1 is provided in Code 8-2. The AST stores both the
structure of the code and the meaning of each token and is logically more space consuming than the
original code. The two lines of C/AL code in Code 8-1 require 36 lines in our XML document.
<BeginEnd>
  <Stmt>
    <StmtIfThen>
      <If>
        <OpName><![CDATA[.]]></OpName>
        <Exp1><![CDATA[PaymentTermsTranslation]]></Exp1>
        <Exp2>
          <ExpCall>
            <Name>GET</Name>
            <Param>
              <OpName><![CDATA[.]]></OpName>
              <Exp1><![CDATA[PaymentTerms]]></Exp1>
              <Exp2><![CDATA[Code]]></Exp2>
            </Param>
            <Param><![CDATA[Language]]></Param>
          </ExpCall>
        </Exp2>
      </If>
      <Then>
        <StmtAssign>
          <OpName>:=</OpName>
          <Exp1>
            <OpName><![CDATA[.]]></OpName>
            <Exp1><![CDATA[PaymentTerms]]></Exp1>
            <Exp2><![CDATA[Description]]></Exp2>
          </Exp1>
          <Exp2>
            <OpName><![CDATA[.]]></OpName>
            <Exp1><![CDATA[PaymentTermsTranslation]]></Exp1>
            <Exp2><![CDATA[Description]]></Exp2>
          </Exp2>
        </StmtAssign>
      </Then>
    </StmtIfThen>
  </Stmt>
</BeginEnd>
Code 8-2 Segment from our abstract syntax tree XML representation
A text file containing the code for all tables in the application counts 179,835 lines of code and has a size of
10.5 MB. In comparison an XML file storing the AST representation of the code counts 1,093,223 lines of
code and has a size of 47.5 MB.
8.3 CAL parser extension for table relations
As stated in section 3.1.1, parsing the TableRelation (hereafter TR) property is necessary for analyzing the
codebase. TRs express the foreign key of a field, but the syntax is far from trivial. TRs can contain IF,
ELSE IF, WHILE, FILTER and GROUP statements. Furthermore, they are not restricted to referencing a
single table/field, and the syntax causes identifier names to vary.
8.3.1 Generic parser vs. specific parsing rules
As described in section 3.1.1, the generic parser for C/AL has a few shortcomings that require further work.
Extending the CALParser to fully parse TRs has proven not to be a good solution, because the lexer and
parser definitions have rules that do not apply to the semantics of TRs: TRs can contain variables with
names containing spaces, dots and other special signs without being quoted with "". Another aspect is
that the CALParser is already a fairly complicated piece of software (3000+ LOC).
Further, we have found that the generated abstract syntax tree (AST) cannot be extended after it has
been generated, due to the immutable data types in F# (see section 7.3.1.2). Therefore we decided to export
the generated AST to a data format enabling us to manipulate, extend and query the data easily. This is
described in section 8.1.1.
This leaves us to decide whether a generic TR parser is needed or whether a few specific parsing rules can do the
job. We therefore looked at the pros and cons of both solutions to find the best approach.
Generic Parser
Pros: Building a generic parser from scratch would bring the CALParser one step closer to being complete
and it would form a solid basis for future projects.
Cons: Very time consuming. Techniques such as YACC and LEX are far from trivial, and a substantial number
of hours would have had to be allocated in order to develop a generic parser.
Specific parsing rules
Pros: The goal is to find recurring patterns, not the unique ones. Developing a few specific rules is fast.
Cons: If the need for parsing all TRs should arise, a lot of rules, each occurring only once, would be needed, and it
would probably result in messy code. This parser will not be generic.
8.3.2 Parser Implementation
We put a lot of thought into whether we should develop a generic parser with YACC and LEX or
create specific parser rules for the TRs we are interested in. We spent time studying the techniques behind
YACC, LEX and Regular Expressions and found that building a parser definition with YACC and LEX would
require much more time than we had set aside for the parsing task.
Based on the pros and cons in the previous section, combined with the fact that extending the parser was not part
of the initial scope for this project, choosing to develop a generic parser would have required a reevaluation of
the end project goal. Therefore we decided to develop specific parsing rules, based on regular expressions, that
cover the required TRs.
8.4 Parsing rules (68)
As described in the previous section, we decided to base our TableRelation parser on Regular Expressions
(RE). Therefore we will go over some of the core operations of REs to explain how our parser works.
The following section analyzes two expression examples from the code base. The expressions are the rules
we use to parse. The complete code for the examples can be found on the attached DVD in the folder:
\Code\RegularExpressionParser\RegularExpressionParser\CALRegularExpressions.fs
Expression2
Expression2 is the simplest of our rules. The rule matches patterns of the type:
"Sales Header"
Code 8-3 TableRelation matching Expression2
Expression2 in Code 8-4 is used together with rules Expression0, Expression1, and Expression4
to match the TableRelation patterns that are part of the containment pattern defined in section 6.1. Below
is a walkthrough of the rule.
let SimpleID = "[a-zA-Z0-9-/\.]+"
let AdvancedID = "\"[a-zA-Z0-9\s-/\.\(\)]+\""
let AdvancedOrSimpleID = "(" + AdvancedID + "|" + SimpleID + ")"
let Expression2 = "^(?<table>" + AdvancedOrSimpleID + ")" + "$"
Code 8-4 Appendix DVD, file \Code\RegularExpressionParser\RegularExpressionParser\CALRegularExpressions.fs
The SimpleID string "[a-zA-Z0-9-/\.]+" will match any string containing the letters a-z, A-Z, the digits
0-9, and the characters ‘-’, ‘/’ and ‘.’. The dot is a keyword used as a wildcard: the wildcard can substitute
any character. If the match needed is an actual dot, it has to be written with a backslash in front of it, as in
many programming languages. The same is the case for the quotation mark (\"). The quantifier ‘+’ marks
that the pattern will match any string containing the allowed characters with a length of 1..n (one to many).
If an asterisk had been used instead of a plus, the multiplicity would have been 0..n (zero to many), and if there
had been no plus or asterisk, the expression would only match a single character.
The AdvancedID string matches any string that starts and ends with quotation marks. Further, spaces
(\s) and parentheses (\( and \)) are allowed in the string.
The AdvancedOrSimpleID string uses the syntax ( .. | .. ), which means either-or: the RE will
match any string of either the SimpleID form or the AdvancedID form.
The Expression2 string is our final expression for the simplest of our regular expressions. The start and
end characters are ^ and $. The ^ character defines that the pattern must start from the first character in the
string we match against; $ defines that the pattern must match the last character in the string we are
matching against. The (?<table>pattern) syntax allows us to access parts or all of the matched pattern. In
this case we have named the entire pattern table (<table>) and can use this name from F# to get the
matched AdvancedOrSimpleID.
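The (?<table>...) named group can be illustrated with a self-contained C# analogue, using slightly simplified copies of the SimpleID and AdvancedID strings (C# rather than F#, for illustration only):

```csharp
using System;
using System.Text.RegularExpressions;

class NamedGroupDemo
{
    static void Main()
    {
        // Simplified copies of the SimpleID / AdvancedID strings from Code 8-4
        string simpleId = "[a-zA-Z0-9-/\\.]+";
        string advancedId = "\"[a-zA-Z0-9\\s-/\\.\\(\\)]+\"";
        string expression2 = "^(?<table>(" + advancedId + "|" + simpleId + "))$";

        Match m = Regex.Match("\"Sales Header\"", expression2);
        if (m.Success)
            Console.WriteLine(m.Groups["table"].Value); // the quoted table name
    }
}
```

The group is retrieved by name via `Groups["table"]`; in the project the same lookup happens from F# after a successful match.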
Expression3
Expression3 is a bit more advanced. An example of the pattern that the Expression3 rule matches is:
IF (Table ID=CONST(13)) Salesperson/Purchaser
ELSE IF (Table ID=CONST(15)) "G/L Account"
ELSE IF (Table ID=CONST(18)) Customer
ELSE IF (Table ID=CONST(23)) Vendor
Code 8-5 TableRelation matching Expression3
Expression3, defined in Code 8-6, is used to match the TableRelation patterns that are part of the
requirements for the generalization pattern defined in section 6.2. Below the box is a
walkthrough of the rule.
let AlternateID = "[a-zA-Z0-9\s-/\.]+"
let AdvSimOrAltID = "(?<table>(" + AdvancedID + "|" + SimpleID + "|" + AlternateID + "))"

//Simple tokens
let Const = "=CONST\("
let RP2 = "\)\)"
let RPC = "\),"
let EmptySpace = "[\s|\t|\n]*"
let Dot = "\."
let Else = "ELSE\s"

let Expression3 =
    "^" + "IF\s\(" + AdvSimOrAltID + Const + AdvSimOrAltID + RP2 + "\s" + AdvSimOrAltID +
    "(" + EmptySpace + Else + "IF\s\(" + AdvSimOrAltID + Const + AdvSimOrAltID + RP2 + "\s" + AdvSimOrAltID + ")*"
Code 8-6 Appendix DVD, file \Code\RegularExpressionParser\RegularExpressionParser\CALRegularExpressions.fs
The AlternateID string matches strings containing spaces and dots without being surrounded by
quotation marks. Const, RP2, RPC, Dot and Else are simple tokens that match what their names
imply. The EmptySpace string matches any number of spaces, tabs and new lines.
Expression3 has to start with an IF statement ("^IF"), and the last set of parentheses is followed by a *,
which means that the pattern within the parentheses can occur any number of times.
8.5 LINQ for Querying
We have found LINQ very useful for our project; in fact, LINQ was the key driver for exporting our AST to
XML. This is reflected in our pattern identification algorithms, where LINQ is used extensively.
The use of LINQ for querying our XML AST keeps our algorithms at an acceptable size and improves the
readability of the code.
The following three LINQ statements are taken from the C# project MatchingInLinq, found on the appendix
DVD in the folder \Code\MatchingInLinq\. The project contains the developed pattern identification
algorithms for containments and generalizations. The statements are taken from the class
IdentifyContainments, which contains the algorithm for identifying table pairs
matching the defined containment pattern described in section 6.1. The statements are
extracted from the ReplaceVariableNameWithRecordName method. This method is used, as the
name implies, to replace local variable names with the actual ID of the table the variable is referencing. We
do this to remove the abstraction of local and global variable names when determining the table ID. This step
is described as step 6 in our preparation for pattern identification, section 8.9.1.
The first LINQ statement is a select statement. The structure of the query is: FROM variable IN data WHERE
condition SELECT variable.
IEnumerable<XElement> variables =
    from exampVar in referencedTable.Element("TABLEBODY").Element("CODE")
                                    .Element("VarDecls").Elements("Var")
    where exampVar.Element("VarDecl").Element("Type") != null
       && exampVar.Element("VarDecl").Element("Type").Value == "Record"
       && exampVar.Element("VarDecl").Element("Number").Value
          == table.Element("NUMBER").Value
    select exampVar;
Code 8-7 LINQ Example1 - Select record variables with number equals var
Return Value: We see that the return type is of type IEnumerable<XElement> which is a collection of
XML elements.
From: We see that exampVar is referring to the XML elements Var in a document of the following
structure:
<TABLEBODY>
  <CODE>
    <VarDecls>
      <Var>
      </Var>
      <Var>
      </Var>
      ...
      <Var>
      </Var>
    </VarDecls>
  </CODE>
</TABLEBODY>
Code 8-8 Querying the abstract syntax tree
Where: The where condition states that Var must have a Type element with the value Record. Further, its
Number value must be equal to the number of the other table in the containment candidate pair we are
examining.
<TABLEBODY>
  <CODE>
    <VarDecls>
      <Var>
        <VarDecl>
          <Type>Record</Type>
          <Number></Number>
        </VarDecl>
      </Var>
      <Var>
      </Var>
      ...
      <Var>
      </Var>
    </VarDecls>
  </CODE>
</TABLEBODY>
Code 8-9 Querying the abstract syntax tree
Select: Finally if the WHERE condition is satisfied we select exampVar which, as we saw in the From
section, refers to the entire Var element.
The second example is the simplest of the LINQ examples we present. All this example does is extract all
elements of type Stmt.
IEnumerable<XElement> stmts =
    onDelete.Element("TAPTrig").Element("Body")
            .Element("BeginEnd").Elements("Stmt");
Code 8-10 LINQ Example2 – Select all statements
The third and last example is a join of the two collections returned by example 1 and example 2.
IEnumerable<XElement> k =
    from aVar in stmt.Descendants("Exp1")
    join bVar in variables
        on aVar.Value equals bVar.Element("VarDecl").Element("ID").Value
    select aVar;
Code 8-11 LINQ Example3 – Select all Exp1 variables with ID equals var
Return: We still return a value of type IEnumerable<XElement>.
From: The nature of the original algorithm requires the presence of a for-each loop that iterates over each
element in stmts. We have left out this loop because this section only focuses on LINQ. The current
element of the for-each loop is named stmt, and we extract the XElement aVar from it. The
Descendants call on stmt means that any child element of stmt with the given name is returned; it might
be defined as a child of a child of a child of stmt and will still be returned by the Descendants
method.
Join: The join operation defines that we will join variable aVar with variable bVar. The variable bVar
refers to the variables collection, which was the result in example 1, see Code 8-7.
On: The on keyword is the where clause of a join: it declares which values should be used to join
on. The statement forms the condition that the value of aVar should match the ID of variable bVar.
Select: If the above on clause is true, aVar is added to the result set.
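The join semantics can be illustrated with a self-contained analogue on invented in-memory data, where plain strings play the role of the aVar values and anonymous objects play the role of the variables collection:

```csharp
using System;
using System.Linq;

class JoinDemo
{
    static void Main()
    {
        string[] expressions = { "Rec1", "Total", "Rec2" };  // stands in for the Exp1 values
        var declarations = new[] {                           // stands in for the variables collection
            new { ID = "Rec1", Type = "Record" },
            new { ID = "Rec2", Type = "Record" }
        };

        // Keep only the expressions whose value matches a declared ID
        var matched = from a in expressions
                      join b in declarations on a equals b.ID
                      select a;

        Console.WriteLine(string.Join(", ", matched.ToArray())); // Rec1, Rec2
    }
}
```

"Total" is dropped because no declaration carries that ID, exactly as variable references without a matching record declaration are dropped by example 3.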
The above examples have shown how LINQ can be used to easily query the abstract syntax tree. As
described earlier the provided examples are used to return all relevant variable references in the code of a
table. The algorithm for the containment pattern identification, see section 8.9.1, will rename each
element in the result from example 3 as part of step 6 in the algorithm.
8.6 Lambda Expressions in action
We use lambda expressions (LEs) throughout our code. A search through the project code finds 30 lambda
expressions. We have chosen to present four of those to illustrate what LEs can facilitate.
The first example is taken from the RefactorInheritance method in the MatchingInLinq
project. The variable tableList is a list of serializable custom objects created to store inheritance. The
C# code replaces a foreach statement: the lambda expression sets the property
ToGeneralization to the value of the string genName for every Inheritance object in the
tableList collection. The ForEach function actually maps the lambda expression on every element in
the collection.
tableList.ForEach(p => p.ToGeneralization = genName);
Code 8-12 Appendix DVD, file \Code\MatchingInLinq\IdentifyInheritance.cs, method RefactorInheritance
The code in Code 8-13 selects a single XML element from a collection of XML elements. The XML element is
selected based on the element <Number> being equal to the Number in our Codeunit object. Here it is
worth mentioning that when we are dealing with Navision objects such as Tables, Codeunits, Forms and
Reports, both their Number and ID (name) are unique. Therefore we are certain in the example below that
we only have one match.
XElement codeunit = codeunits.Single(p => p.Element("Number").Value == cu.Number);
Code 8-13 Appendix DVD, file \Code\MatchingInLinq\IdentifyREAConcepts.cs, method FindProcedureReferences
The next example, Code 8-14, illustrates how lambda expressions can be nested in conditional
expressions. The code checks if we have a global variable defined on Codeunit level with an ID matching the
value of s. If this is the case, the Any statement returns true, thereby fulfilling the if statement.
if (globalVariables.Any(p => p.ID == s))
Code 8-14 Appendix DVD, file \Code\MatchingInLinq\IdentifyREAConcepts.cs, method Call
The last statement we present, Code 8-15, is based on a ForEach method just like the first example. This
expression is different, though. First of all, it is called on the static Array class instead of on an
instance of a data structure; the actual data structure is provided as a parameter, in this case temp. The
lambda parameter p refers to the current element in temp, and for each element a new PartOfKey
element is added after the XML element x.
//Type of data to parse "{ ;Document Type,No. "
temp = x.Value.Replace("{ ;", "").Split(new char[] { ',' });
Array.ForEach(temp, p => x.AddAfterSelf(new XElement("PartOfKey", p.Trim())));
Code 8-15 Appendix DVD, file \Code\MatchingInLinq\ParseRemaningElements.cs, method ParseElementsToXML
From our use of lambda expressions, we have found them to be a valuable tool that enables us to write
cleaner and shorter code. We estimate that, on average, a single line with a lambda expression would require
6 lines of code to be expressed without lambda expressions. Therefore
we save 5 lines of code per lambda expression on average. We have used 30 lambda expressions in our
project, which means that we have probably been able to shorten our code by around 150 lines of code.
This is not a lot, but in larger projects, or if this project were elaborated, the importance would grow
accordingly.
8.7 Graph generation
We aimed at delivering a tool for displaying graphs. Initially we experimented with automating Microsoft
Visio. The generated diagrams were good, but the computation time was excessive: generating a graph with
100+ elements took hours, which is completely unacceptable if we aim at delivering an interactive tool.
Therefore we looked into alternatives and found Microsoft Automatic Graph Layout.
8.7.1 Microsoft Automatic Graph Layout (69)
Microsoft Automatic Graph Layout (MSAGL) is an internal research project by Microsoft Research offering
easy graph generation and dynamic layouts with support for the Multi Dimensional Scaling (MDS) algorithm
and the Sugiyama and Ranking layout schemes.
We did, however, quickly find two caveats with MSAGL:
1. The Windows Presentation Foundation (WPF) component was in a very rough state
2. MSAGL only supported two arrowhead types
The first was a minor problem because its counterpart for Windows Forms proved to be very stable and
fast. We found that MSAGL is able to generate large graphs of 100+ elements in a few seconds which is
acceptable for our tool. The second caveat was of a more serious nature because we needed support for
UML symbols to present our findings.
Therefore I contacted Lev Nachmanson (70) from Microsoft Research in Redmond to get access to the
source code. Lev replied that he would like to add the extension to the package and granted us access to
the source depot.
We added support for the arrow head styles for the UML symbols for containment, aggregation and
generalization and sent our contribution to Lev. The changes were accepted without comments and have
been checked into MSAGL allowing others to model UML components with MSAGL.
Figure 8-1 Arrow heads for Containment, Aggregation and Generalization
The arrowhead styles added are shown in Figure 8-1. The figure shows:
 Containment, from B to A
 Aggregation, from C to A
 Generalization, from D to A
8.8 Performance boosts
The aim is to deliver a tool able to generate graphs in seconds. With the MSAGL toolkit we are able to do
this. The parsing and pattern identification algorithms do however still require minutes to execute. We
therefore added pre-computation to speed up the user interaction for our Concept Viewer tool.
Pre-computation has been added in the following places:
 CALParser and Subparser: As described in section 8.1.1 we have changed our parser to write XML
files. This allows the pattern identification algorithms to run directly on the AST stored in the XML
files instead of having to parse the C/AL code.
 Containment Identification: The containment algorithm produces two files storing respectively
containment and aggregation pairs.
 Generalization Identification: The generalization algorithm produces two files storing respectively
generalization sets and refactored generalization sets.
The four files produced by the containment and generalization identification algorithms are stored in a
serialized data structure and saved in a binary file. The Concept Viewer loads the files at startup and has
fast access to all results. The pre-computation has resulted in satisfying performance for the graph
generation.
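The pre-computation scheme can be sketched as follows (Python with pickle standing in for the binary serialization; all names and table numbers below are hypothetical stand-ins):

```python
# Sketch of the pre-computation scheme: the expensive identification step
# runs once and its result is serialized to a binary file; the viewer then
# loads the file at startup instead of re-parsing the C/AL code.
import os
import pickle
import tempfile

def identify_relations():
    # Stand-in for the minutes-long pattern identification algorithms.
    return {"containments": [(36, 37)], "aggregations": [(36, 5902)]}

path = os.path.join(tempfile.mkdtemp(), "relations.bin")

# Pre-computation pass: run the algorithms and persist the results.
with open(path, "wb") as f:
    pickle.dump(identify_relations(), f)

# Viewer startup: deserialize for fast access to all results.
with open(path, "rb") as f:
    relations = pickle.load(f)

assert relations["containments"] == [(36, 37)]
```

The design choice is the usual time/space trade: a slow analysis pass is paid once, and every later graph request only pays deserialization cost.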
8.9 Algorithm design
The language of choice for implementing the algorithms identifying containments, aggregations and
generalization candidates is C#. C# was the logical choice because it offers seamless integration with F# and
our ability to produce good code fast in C# is better than in F#. The algorithms are described in pseudo code
and we are only listing the core steps of the algorithms.
8.9.1 Containment
The pseudo code for the containment algorithm is best read with the manual analysis of the containment
pattern from section 6.1 in mind. The pseudo code is provided in the box below. The following steps explain
the behavior of the algorithm in more detail:
1. The XML AST is queried to return all tables.
2. For each table in the results of step 1 we query to find all TableRelations.
3. For each TableRelation we check if the TableRelation type is of type Expression0, Expression1,
Expression2 or Expression4 (these can be seen in section 8.4).
4. If step 3 is satisfied we check if the table contains an OnDelete trigger.
5. If the existing OnDelete trigger contains references to other procedures we need to search through
their code. We do this by inserting the code from each referenced procedure until there are no more
procedure references in our OnDelete trigger. This means that inserted code is itself inspected for
further procedure references.
6. When all code is collected, variables are replaced with their type. This means that if the code contains
a variable Cust of type table 18 Customer, we replace Cust with Customer. This is done to
remove the abstraction of variable names and prepare the AST for the final step.
7. Now we have an AST representing the OnDelete trigger in which all the code it contains and
references has been collected. Furthermore, the AST is simplified by the removed abstraction of
variable names (step 6). The AST can now easily be queried for delete/deleteall function calls,
and we check if the table they are called on is the table we started querying in step 2.
8. If step 7 is satisfied, we know we have found a containment. If, on the other hand, step 4 or 7 was
not satisfied, we know that we have an aggregation.
Find all tables
Do for each table
{
    Find all TableRelations for the current table
    Foreach TableRelation
    {
        If TableRelation is of type 0, 1, 2 or 4
        {
            Get ID of referenced table
            Find the referenced table's OnDelete trigger
            If (OnDelete trigger exists)
            {
                While (Trigger contains procedure calls)
                {
                    Replace procedure calls with the code of the referenced procedure
                }
                While (Trigger contains unconverted variable names to tables)
                {
                    Replace variable names with ID of the table they reference
                }
                If (OnDelete trigger contains a DELETEALL
                    for the table that holds the TableRelation)
                {
                    We found a containment
                }
                Else
                {
                    We found an aggregation
                }
            }
            Else
            {
                We found an aggregation
            }
        }
    }
}
Code 8-16 Pseudo code for the Containment algorithm
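For illustration, the decision at the heart of the algorithm can be condensed into a small function. The sketch below is in Python with simplified boolean stand-ins for the AST queries; it is not the actual implementation:

```python
# Minimal sketch of the containment/aggregation decision from Code 8-16.
# A table relation is a containment only if the referenced table has an
# OnDelete trigger whose (fully inlined) code deletes the dependent rows.

def classify(relation_type, has_ondelete, ondelete_deletes_dependents):
    """Return 'containment' or 'aggregation' for one TableRelation."""
    if relation_type not in (0, 1, 2, 4):
        return None  # not a candidate for this pattern
    if has_ondelete and ondelete_deletes_dependents:
        return "containment"
    return "aggregation"

# A cascading delete in the OnDelete trigger -> containment.
assert classify(2, True, True) == "containment"
# An OnDelete trigger without a cascading delete -> aggregation.
assert classify(2, True, False) == "aggregation"
# No OnDelete trigger at all -> aggregation.
assert classify(0, False, False) == "aggregation"
# Expression3 relations are handled by the generalization algorithm instead.
assert classify(3, True, True) is None
```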
8.9.2 Inheritance
The pseudo code for the inheritance algorithm is best read with the manual analysis of the generalization
pattern in section 6.2.2 in mind. Following the approach from above, we provide pseudo code for the
algorithm in the box below, accompanied by a numbered list providing more detail on some of the
algorithm steps.
Steps 1-3 are identical to the containment algorithm, with the exception that the TableRelation type we
are looking for in this case is Expression3.
4. Find the field with ID or Number equal to the value of <ItemField></ItemField> in the
parsed TableRelation.
5. If the field identified in step 4 contains the property FIPOptionString, we extract all
<ItemVariable></ItemVariable> fields.
6. The extracted FIPOptionString is matched against the extracted ItemVariables.
7. If we found a match in step 6, we extract all <ItemTable></ItemTable> values and add a
new inheritance object from the table we are currently working on to all tables extracted from
<ItemTable>.
Find all tables
Do for each table
{
    Find all TableRelations of type 3
    Foreach TableRelation
    {
        Var_1 = field with ID or Number == ItemField of TR
        If (Var_1 has property FIPOptionString)
        {
            Get ItemVariables references from TableRelation
            If (Var_1.FIPOptionString match ItemVariables)
            {
                Get ItemTable reference from TableRelation
                Add new inheritance
            }
        }
    }
}
Code 8-17 Pseudo code for Generalization algorithm
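The matching in steps 5-6 can be sketched as follows (Python, with a hypothetical OptionString loosely modeled on the Type field of table 37 Sales Line; the real implementation queries the XML AST):

```python
# Sketch of the matching step in Code 8-17: the constants used in an
# Expression3 TableRelation (the ItemVariables) must all appear in the
# OptionString of the field the relation switches on.

def matches_option_string(option_string, item_variables):
    options = {o.strip() for o in option_string.split(",")}
    return all(v in options for v in item_variables)

# Hypothetical example: a field whose OptionString lists four types.
opts = "G/L Account,Item,Resource,Fixed Asset"
assert matches_option_string(opts, ["Item", "Resource"])
assert not matches_option_string(opts, ["Employee"])  # no such option
```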
8.9.3 Inheritance – Reusing generalization objects
The findings of the algorithm above allow us to map the identified relations as pure associations (the way
they are implemented today) or to refactor them into generalization objects. One of the great advantages
of using generalization objects is that we can reuse them and thereby reduce their number.
This algorithm identifies matching generalization objects and allows us to present the refactored objects.
The outcome of this algorithm is available in section 9.4. The examples are described in the same manner
as the prior examples.
1. The input of the refactoring algorithm is the result from the inheritance algorithm. The refactoring
algorithm is iterated until all inheritance objects in the inheritance collection have been refactored.
2. Select the first inheritance object
3. Find all inheritance objects where the inheritance property ToRecords match the ToRecords
property of the inheritance object selected in step 2.
4. If the inheritance object from step 2 is unique, i.e. if we do not find any inheritance objects with a
matching ToRecords property, we remove the object from the collection of non-refactored
inheritance objects and add it to our result collection.
5. If we find inheritance objects with a matching ToRecords property, we remove them from the
collection of non-refactored inheritance objects, rename the ToRecords property to a common
name, and add the objects to the result collection.
Find all inheritance objects
While (exist inheritance object not refactored)
{
    Get first inheritance object
    Find all inheritance objects with matching 'ToRecords'
    If (inheritance object is unique)
    {
        Remove inheritance object from collection of not refactored objects
        Add inheritance object to result
    }
    else
    {
        Remove inheritance collection from collection of not refactored objects
        Rename 'ToRecords' to common naming
        Add inheritance collection to result
    }
}
Code 8-18 Pseudo code for refactoring Generalization objects
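A minimal sketch of this refactoring (Python, with invented table names) groups inheritance objects by their ToRecords sets and emits one shared object per distinct set:

```python
# Sketch of Code 8-18: inheritance objects whose ToRecords sets match are
# merged into one shared generalization object. Table names are invented.
from collections import defaultdict

def refactor(inheritances):
    """inheritances: list of (from_table, to_records tuple) pairs."""
    groups = defaultdict(list)
    for frm, to_records in inheritances:
        # Order-insensitive key: the same ToRecords set is shared.
        groups[tuple(sorted(to_records))].append(frm)
    return dict(groups)

found = [
    ("Sales Line", ("G/L Account", "Item")),
    ("Purchase Line", ("Item", "G/L Account")),  # same set, object reused
    ("Requisition Line", ("Item",)),             # unique, kept as-is
]
result = refactor(found)
assert len(result) == 2  # 3 inheritance objects reduced to 2 shared ones
assert sorted(result[("G/L Account", "Item")]) == ["Purchase Line", "Sales Line"]
```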
8.9.4 Implementation of REA identification
This section covers the implementation of the steps identified in section 6.3.4.1 and elaborated in section
6.3.5. During the implementation of step 1 in the original identification algorithm we ran into a problem
related to locating the relevant code lines. The root of the problem was that the pattern does not require
the code lines to be located in the same procedure or even the same Codeunit; in principle the code could
be stored in a table trigger as well. When we started out, we assumed we could identify a delete on a
journal (step 1.c) and then find a referenced procedure copying fields to an entry and inserting the entry
(steps 1.a and 1.b). Unfortunately this is not the case. When we use this approach we find:
ID  To Entry Table                     From Journal Table
1   21 Cust. Ledger Entry              81 Gen. Journal Line
2   25 Vendor Ledger Entry             81 Gen. Journal Line
3   5802 Value Entry                   83 Item Journal Line
4   355 Ledger Entry Dimension         83 Item Journal Line
5   281 Phys. Inventory Ledger Entry   83 Item Journal Line
6   355 Ledger Entry Dimension         83 Item Journal Line
7   355 Ledger Entry Dimension         356 Journal Line Dimension
Table 8-1 REA results from step 1
The reason we find such a small result set is that the lines we look for are spread widely among different
Codeunits and procedures. To illustrate this, we will explain how we find result 1 from Table 8-1.
8.9.4.1 Finding the value set Cust. Ledger Entry / Gen. Journal Line
The relationship of result 1 (and 2, for that matter) is initiated from Codeunit 13 Gen. Jnl.-Post
Batch (hereafter CU13). As the naming of CU13 implies, this Codeunit is used to post General Journal
Lines as a batch process, meaning that the process runs without any visible output to the screen. From the
earlier definition of journals we know that posting a journal refers to inserting data from the journal into a
ledger entry. Therefore it makes sense to assume that we will be able to observe the copying of journal
data to entry data in CU13.
When we inspect the code of CU13 to find the code expressing the pattern, we find that CU13 is calling the
Codeunit 12 Gen. Jnl.-Post Line (hereafter CU12) for posting the journal to an entry. Hereafter
CU13 performs a delete of the journal line (step 1.c). Therefore our algorithm has to analyze the code of all
referenced Codeunits and procedures.
We find that the posting of the journal is coded in CU12's PostCust procedure. When we look at the
reference chain for PostCust, i.e. the procedures that directly or indirectly reference PostCust, we
find that this produces a graph of 152 procedures.
Building this graph is a time consuming task because we are querying large amounts of XML data many
times.
For building the graph we need to do the following, starting at the identified starting point:
1. Look up all variable names in the procedure to check if they refer to the table we work on
2. Analyze all variable assignments in the procedure to check if they follow the defined pattern
3. Find all references to the current Codeunit procedure
4. Do steps 1-4 for each result of step 3
Our first design was recursive, but that approach required our graph to be acyclic, so we had to redesign it
as an iterative design when we ran into cycles. Figure 8-2 shows a segment of the produced graph for
PostCust containing 152 elements. The model displays the five procedures (marked with a blue
background) that are identified as part of the pattern from Table 6-9 in section 6.3.5. Further, the model
displays how large a graph is produced when we search through all procedure references. Finally, the
figure shows the identified reference cycle (marked with a grey background).
Figure 8-2 Illustrating procedure references between CU12 and CU13
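The switch from a recursive to an iterative design can be sketched as a worklist traversal with a visited set, which is what makes the reference cycle harmless. The sketch below is in Python with invented procedure names, not the actual implementation:

```python
# Iterative reference-graph traversal that tolerates cycles: a naive
# recursive walk would loop forever on a cycle; the visited set stops it.
def collect_referencing(start, references):
    """references maps a procedure to the procedures that reference it."""
    visited = set()
    worklist = [start]
    while worklist:
        proc = worklist.pop()
        if proc in visited:
            continue  # already expanded: this is where a cycle ends
        visited.add(proc)
        worklist.extend(references.get(proc, []))
    return visited

# Hypothetical fragment of the CU12/CU13 reference chain with a cycle.
refs = {"PostCust": ["CU12.Post"],
        "CU12.Post": ["CU13.Code"],
        "CU13.Code": ["CU12.Post"]}  # the reference cycle
assert collect_referencing("PostCust", refs) == {
    "PostCust", "CU12.Post", "CU13.Code"}
```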
8.9.5 Limitations by approach and solution suggestion
As we found in the previous section, we cannot know for certain which procedures will perform the delete
of the journal lines. Since any of the 152 referencing procedures can have a reference that performs the
journal line delete, we would have to build a reference graph for each element in each reference graph.
This would be a very time consuming task both to program and to run and cannot be advised. The
algorithm we have at this point already takes minutes to execute, and this approach would heavily increase
the execution time. The bottleneck of the computation is the querying of XML data. This querying could be
cached if we created a complete lookup system analyzing the code in a top-down fashion instead of
traversing the code in reference chains.
The lookup system would also need to store table object actions that could fit into a step in the
requirements for REA candidates described in section 6.3.5.
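The suggested lookup system can be sketched as follows (Python, with hypothetical table and procedure names): the code is scanned once, top down, and an index is built so that later queries become dictionary lookups instead of repeated XML traversals:

```python
# Sketch of the suggested lookup system: instead of re-querying the XML
# for every reference-chain step, the code is scanned once top-down and
# an index is built, so each later lookup is a dictionary access.
def build_index(procedures):
    """procedures: iterable of (name, tables_deleted) pairs."""
    index = {}
    for name, tables in procedures:
        for table in tables:
            index.setdefault(table, []).append(name)
    return index

# Hypothetical scan result: which procedures delete rows in which table.
idx = build_index([("CU13.Code", ["Gen. Journal Line"]),
                   ("UndoPostingMgt.X", ["Item Journal Line"])])
assert idx["Gen. Journal Line"] == ["CU13.Code"]
```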
8.9.6 Analysis of initial REA results
Before we chose to start implementing the time consuming lookup system described in the section above,
we decided to use the results found by our initial REA identification algorithm. As described earlier, our
REA algorithm only identifies the variable sets of step 1 in section 6.3.4.1 if they are located in the same
reference chain, i.e. a procedure referencing PostCust deletes the journal line after the execution of
PostCust, as in the example of Figure 8-2.
The idea is to get an indication of the quality of the pattern to see if the improvements suggested above
could be used. Therefore, we have accepted the initial results from Table 8-1 as the result of step 1 in the
identification algorithm from section 6.3.4.1, and will manually perform steps 2 and 3 of the algorithm.
When we apply step 2 of our identification algorithm to the results in Table 8-1, we find that results 4, 6
and 7 are false positives. The Journal Line Dimension and Ledger Entry Dimension tables contain
information regarding journal lines or ledger entries, but they are not journals or entries themselves.
Therefore we remove these results from our findings. This leaves us with 4 candidates to examine, namely
1, 2, 3 and 5, listed in Table 8-2. The REA Candidate is a table we have identified as writing values to a
journal line.
ID  REA Candidate   To Entry Table                     From Journal Table
1   18 Customer     21 Cust. Ledger Entry              81 Gen. Journal Line
2   23 Vendor       25 Vendor Ledger Entry             81 Gen. Journal Line
3   27 Item         5802 Value Entry                   83 Item Journal Line
5   27 Item         281 Phys. Inventory Ledger Entry   83 Item Journal Line
Table 8-2 REA candidate sets
Common for the four sets above is that no TableRelations exist on the fields in the entry tables' primary
keys. This does not fulfill the requirement of step 3.a (section 6.3.4.1), because a TableRelation is a
required part of our definition of a containment (section 6.1). This leaves us with method 3.b (section
6.3.4.1), which is to identify the candidate based on the naming of the tables. We see that REA candidate 2
maps well to its entry table, but candidates 1, 3 and 5 map poorly.
This indicates that even with a larger result set from step 1 we would still not be able to match results in a
satisfying manner based on the chosen identification algorithm.
Given the complexity of implementing step 1 correctly (section 6.3.4.1) and the indication of bad results
from steps 2 and 3, we have decided not to invest more time in implementing this identification algorithm.
Another general problem with the examined approach is the information we are able to extract from the
events. It is a good approach to identify the events because they are the behavior in the system, i.e. selling
goods, buying raw materials, paying employees, etc. When analyzing the events we are able to find the
tables supplying information to the event. If the event was a sale, the event would contain information
such as Customer, Item, etc. With our knowledge of accounting and REA, and our trust in the NAV naming
conventions, we can infer that Customer will be an Agent and Item will be a Resource. From a code
perspective, however, we do not have the same luxury: we are not able to differentiate Resource tables
from Agent tables. Further, there exist tables that write values to journal lines without being part of a REA
relationship. Therefore the results from a REA algorithm will at best be a table listing all reliable REA
candidates.
We believe the code in general is too long and too complex for the chosen approach, and we would
suggest looking for an alternative identification pattern. Earlier work trying to modularize the NAV
architecture into components concluded that it was not possible due to high levels of interdependency:
"… we tried to modularize the architecture via components, but it turned out that the interdependency
levels were in general far too high to make a reasonable modularization." (71)
This could, to some extent, be described as part of the same problem we have been experiencing with the
many references within the code.
Chapter 9
9 Results
This chapter presents the results of each of our identification algorithms. The coverage is calculated to
provide information on how good our algorithms are. Hereafter we present the Concept Viewer, developed
to present our findings; it is the tool that the Navision Application team can use if they wish to explore the
identified relations. Finally, we present diagrams for interesting cases of relations.
From the initial analysis we found Visio to be a slow diagram producer, but we did get some interesting
information from the diagrams. Visio organizes similar structures in the same manner, allowing one to
discover repeating patterns simply by looking at the components in the diagram. This way we found
evidence that our approach was working and that the number of repeated patterns was high. One of the
repeating patterns can be seen in Figure 9-1. The figure shows a pattern constellation related to journal
tables: a Journal Line requires both a Template and a Batch to exist, and the Batch requires the Template.
The pattern is repeated around 10 times throughout the NAV application. Figure 9-2 shows the same
pattern as presented in the Concept Viewer.
Figure 9-1 Template, Batch, Line pattern (FA Journal Template, FA Journal Batch, FA Journal Line) found in
the initial Containment work
Figure 9-2 Template, Batch, Line pattern in the Concept Viewer
9.1 TableRelation Analysis and TableRelation Parser Quality Assurance
The goal of the TableRelation (TR) parser is, as described in section 3, to parse TRs for table fields. The
following section explains the technicalities behind the parser and evaluates its coverage.
We have decided to base our parser on 5 regular expressions. This section presents the results from
analyzing the parser coverage.
9.1.1 Matches on Key fields
As defined in section 6.1, one of the requirements for the containment pattern is that there exists a
TableRelation (TR) from the primary key on the dependent table. Therefore it is relevant to look at our
parsing coverage for the subset of the TRs defined on table keys.
From our program we know that the number of TRs defined on keys is 753. Table 9-1 is the result from
analyzing the coverage of each of our regular expressions. It is interesting to note that pattern 1
(Expression2) accounts for 56.3 % of all TRs and that we are able to get a TR coverage of 95.1 % with
the 5 patterns.
ID  Pattern      Frequency  Coverage  Total Frequency  Total Coverage
1   Expression2  424        56.3 %    424              56.3 %
2   Expression0  154        20.5 %    578              77.8 %
3   Expression3  48         6.4 %     626              83.1 %
4   Expression1  66         8.8 %     692              91.9 %
5   Expression4  24         3.2 %     716              95.1 %
Table 9-1 Key fields
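The Coverage and Total Coverage columns can be reproduced from the frequencies and the total of 753 key-field TRs; the Python sketch below shows the calculation (illustrative only):

```python
# Sketch of how the coverage columns in Table 9-1 are derived: each
# pattern's frequency divided by the 753 key-field TRs, then accumulated.
def coverage_table(frequencies, total):
    rows, running = [], 0
    for freq in frequencies:
        running += freq
        rows.append((round(100 * freq / total, 1),
                     round(100 * running / total, 1)))
    return rows

rows = coverage_table([424, 154, 48, 66, 24], 753)
assert rows[0] == (56.3, 56.3)  # Expression2 alone
assert rows[-1][1] == 95.1      # total coverage of all five patterns
```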
To ensure that the high coverage percentage found in Table 9-1 is accurate, and to rule out the possibility
that a small number of TRs repeat throughout the code while variations over the pattern go undetected,
we decided to extract all unique TRs. We redid our calculations with the 292 unique TRs we found in the
set above. Table 9-2 lists the result; the distribution over our five patterns resembles the previous one, and
the coverage is still very acceptable. The total coverage is 93.5 %, only leaving out 19 TRs from our further
analysis.
ID  Pattern      Frequency  Coverage  Total Frequency  Total Coverage
1   Expression2  131        44.5 %    131              44.5 %
2   Expression0  54         18.5 %    185              63.4 %
3   Expression3  38         13 %      223              76.4 %
4   Expression1  35         12 %      258              88.4 %
5   Expression4  15         5.1 %     273              93.5 %
Table 9-2 Unique key fields
9.1.2 Matches on all fields
The generalization pattern analyzes all TRs and does not distinguish fields that are part of a key from those
that are not. Therefore we have done our analysis on all TRs as well, whereas the analysis above was only
concerned with TRs defined on keys.
The entire application has 5011 TRs. The table below displays the frequency distribution for each of our
five patterns. We see that we have a coverage of 67.2 % with a single pattern. Further, we see that we
have a total coverage of 95 %, which is the same as the outcome of our analysis in Table 9-1.
ID  Pattern      Frequency  Coverage  Total Frequency  Total Coverage
1   Expression2  3369       67.2 %    3369             67.2 %
2   Expression0  858        17.1 %    4227             84.4 %
3   Expression3  332        6.6 %     4559             91 %
4   Expression1  136        2.7 %     4695             93.7 %
5   Expression4  66         1.3 %     4761             95 %
Table 9-3 All fields
To complete the analysis, we analyzed all unique TRs. We found that the application code contains a total
of 651 unique TRs. The analysis below shows the lowest pattern 1 coverage, only 35.8 percent, which
indicates a high repetition in this type of TR. Further, we see that we get a total coverage of 90.1 %, leaving
out 59 TRs from our further analysis.
ID  Pattern      Frequency  Coverage  Total Frequency  Total Coverage
1   Expression2  233        35.8 %    233              35.8 %
2   Expression0  124        19.1 %    357              54.8 %
3   Expression3  136        20.9 %    493              75.7 %
4   Expression1  69         10.6 %    562              86.3 %
5   Expression4  30         4.6 %     592              90.1 %
Table 9-4 All unique fields
From the above analysis we see that the system contains 5011 TRs and that the subparser parses 4765 of
them. This excludes 246 TRs from our further analysis. Of the excluded TRs we know from Table 9-4 that
there exist (651 in total – 592 parsed) 59 unique patterns left to parse.
Reaching 100 % coverage by creating parsing expressions for the 59 remaining patterns would be trivial. It
has been left out of this work for the following reasons. Complete coverage is not a priority goal: we parse
the TRs we are interested in for our analysis of containments and generalizations, and if the need for
parsing more TRs arose, each pattern could be added in around 30 minutes. We believe that we achieved a
high TR coverage while keeping our code clean. We left out rules that would parse only one or a few TRs
because there was no need for them in the further analysis.
9.2 Results for Containment Pattern
We found in our TR analysis that there are at most 644 containment candidates in the application,
obtained by adding the frequencies of Expression0, Expression1, and Expression2 in Table 9-1.
The algorithm we have developed identifies:
 148 complete matches
 368 partial matches (OnDelete trigger does not contain a cascading delete)
 95 partial matches (no OnDelete trigger)
 33 partial matches not possible to classify (referenced object not a table)
In total 644 matches (100 % coverage).
The 148 complete matches are pure containments according to the pattern we have defined in section 6.1.
We only search through table code for cascading deletes, and the code has to be placed in, or called from,
the OnDelete trigger of the referenced table.
The 368 partial matches are matches where the referenced table does contain an OnDelete trigger but
there is no cascading delete.
The 95 partial matches are matches where the referenced table does not contain an OnDelete trigger
definition.
The 33 partial matches are matches where the structure of the TR matches the patterns for the
containment TRs, but which are nevertheless not part of the containment pattern. Common for all 33 TRs
is that they use system keywords such as Object, Field and AllObj.
An example of such a TR is Table 78 Printer Selection. This table has a TR on the Report ID field:
Object.ID WHERE(Type=CONST(Report))
Code 9-1 Special case of TableRelation ignored in Containment analysis
The use of Object implies that the reference can be any object. The WHERE clause limits this to objects
of type Report.
As expected, we find a large result set of partial matches. Fortunately we can actually use this information:
the containment pattern we are searching for is a special case of an aggregation, namely a strong
aggregation. The requirements of a normal aggregation match the 368 and 95 identified partial matches,
and they can therefore be modeled as aggregations. This came as a byproduct of our algorithm without
being in the original project aim.
The outcome of the containment algorithm analysis is therefore that 148 out of 644 candidates were
containments, 463 out of 644 were regular aggregations, and 33 out of 644 did not contain references to
tables.
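The split of the 644 candidates can be checked with a small tally (illustrative arithmetic only, using the counts reported above):

```python
# Tally of the containment algorithm's results from section 9.2.
complete = 148            # pure containments
partial_no_cascade = 368  # OnDelete trigger, but no cascading delete
partial_no_trigger = 95   # no OnDelete trigger at all
unclassified = 33         # referenced object not a table

aggregations = partial_no_cascade + partial_no_trigger
total = complete + aggregations + unclassified

assert aggregations == 463
assert total == 644  # matches the 644 candidates derived from Table 9-1
```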
We know that we do not have complete coverage on containments because we do not scan through code
placed in Codeunits. One example of such a cascading delete, which we do not detect, is the call of the
PostSalesLines-Delete Codeunit from table 110 Sales Shipment Header’s OnDelete
trigger. We do detect the aggregation, but we do not analyze the Codeunit code that contains the actual
delete. The PostSalesLines-Delete Codeunit contains methods to delete rows in the following
tables:
 Sales Shipment Line
 Sales Invoice Line
 Sales Cr. Memo Line
 Return Receipt Line
 Posted document Dimension
The application code contains in total 5 Codeunits whose names imply that they are used for deleting rows
in tables. It is very likely that they all contain cascading deletes that are not detected at this point.
It would have been desirable to have the parsing and analysis of Codeunits as part of the algorithm. This
would certainly make our results more accurate. We decided to prioritize differently when Jesper Kiehn
introduced the concept of REA in the beginning of March 2009, and instead extended the project scope
with an examination of the possibility of identifying candidates for REA. This meant that we would not
have time to iterate the containment and generalization patterns to reach complete coverage. During our
REA work we exported parsed Codeunit data to XML; this is the basis for our analysis of Codeunits. We
estimate that the improved algorithm could be implemented in 2-3 days.
9.3 Results for Generalization Pattern
We found in our TR analysis that there are at most 332 generalizations in the application (frequency of
expression3, ID 3 in Table 9-3).
The algorithm we have developed identifies:
 303 complete matches
 4 ignored matches
 25 matches with no corresponding OptionString
In total 332 matches (100 % coverage).
The 303 matches are complete matches on our generalization pattern defined in section 6.2. The 4 ignored
matches are TRs that do not contain a one-to-one match against an OptionString. The TR from table 39
Purchase Line, field No., in Code 9-2 is one of the four. The TR has an ELSE IF statement on the
constant value 3. This is a code defect where the link is made to the index of the OptionString instead of
the value. We have not added a specific rule to detect these 4 special cases.
IF (Type=CONST(" ")) "Standard Text" ELSE IF (Type=CONST(G/L Account)) "G/L Account"
ELSE IF (Type=CONST(Item)) Item ELSE IF (Type=CONST(3)) Resource ELSE IF
(Type=CONST(Fixed Asset)) "Fixed Asset" ELSE IF (Type=CONST("Charge (Item)")) "Item
Charge"
Code 9-2 Special case of TableRelation ignored in Generalization analysis
The application team should probably consider changing the TRs defined on:
 Purchase Line.No.
 Standard Purchase Line.No.
 Requisition Line.No.
 Purchase Line Archive.No.
Finally, we have 25 matches where the TR is of the correct type but no OptionString exists on the field. We
did some further investigation to analyze these 25 matches.
We found that the field numbers 71, 72, 5714 and 5715, defined on tables 37 Sales Line, 39
Purchase Line, 5108 Sales Line Archive, and 5110 Purchase Line Archive, account
for 16 out of the 25 matches. They are all of the form below, which means that they are dependent on a
Yes or No value from another field in the table.
IF (Drop Shipment=CONST(Yes)) "Purchase Header".No. WHERE (Document Type=CONST(Order))
Code 9-3 Special case of TableRelation dependent on YES/NO values ignored in Generalization analysis
Of the remaining 9 matches, the following 7 are coded to use integer values and should probably use the
field name instead:
 87 Date Compr. Register (on field) 12 Register No.
 352 Default Dimension (on field) 2 No.
 368 Dimension Selection Buffer (on field) 5 Dimension Value Filter
 368 Dimension Selection Buffer (on field) 5 Dimension Value Filter
 458 Overdue Notification Entry (on field) 3 Document No.
 7340 Posted Invt. Put-away Header (on field) 7306 Source No.
 7342 Posted Invt. Pick Header (on field) 7306 Source No.
The last 2 TRs were encoded correctly but not related to any OptionString. These TRs express special cases
that could be analyzed as generalizations, but they do not satisfy the pattern requirements we have
defined. Further, given that the 23 other matches would, according to J. Kiehn, probably be candidates for
a rewrite, we decided not to look further into these relations.
9.4 Results from refactoring the generalization objects
In section 9.3 we listed the 303 identified matches on the generalization pattern. To simplify the code and
draw benefit from object oriented principles, we have implemented an algorithm that reduces the number
of generated generalization objects by sharing identical objects. The algorithm is described in section 8.9.3.
We found that of the 303 generalization objects only 5 were unique throughout the code. The remaining
298 matches occurred two times or more, which allowed us to reduce these 298 objects to 86 shared
objects.
This can be explored in the Concept Viewer by checking the menu item "Relations To Map" -> "Map
Generalization" -> "Map Refactored Generalization Objects".
9.5 The Concept Viewer and its output
The main outcomes of this project are the identified relations and the Concept Viewer application able to
display them. As described in section 8.8, we analyze the C/AL code and save the results to four binary data
files storing respectively:
 Aggregations
 Containments
 Generalizations
 Refactored Generalizations
As mentioned earlier, the Concept Viewer is a tool that dynamically displays the content of the data files in
an easy way, enabling the user to choose which concepts and which objects to map.
The Concept Viewer facilitates a dynamic approach to explore our findings. With the Concept Viewer, seen
in Figure 9-3, we can map the identified aggregations, containments, and generalizations (without
generalization objects, with generalization objects, with refactored generalization objects, and with parent
generalizations).
The individual mapping options and their outcome are illustrated by examples in the following sections.
Tables can be mapped at table or granule level; selecting an object in the tree view on the left side of the
form updates the diagram accordingly. The Concept Viewer allows the user to zoom, undo, drag and drop
single or multiple elements, save diagrams as images or vector graphics, and use other features commonly
expected of a graph viewer tool.
The Concept Viewer can be installed from the attached DVD by running the \Concept Viewer
Installer\Concept Viewer.msi.
Figure 9-3 The Concept Viewer
9.5.1 Sales Header and Sales Line – Aggregations and Containments
We start out by exploring the tables Sales Header and Sales Line. These two tables are interesting because
we based much of our manual analysis on these two tables. The diagram in Figure 9-4 shows the
containment from Sales Line to Sales Header identified in the manual analysis in section 6.1.2. From the
manual analysis, our knowledge of UML, and the foundational relation definition of part_of in section 4.2,
we know that the containment means that a Sales Line cannot exist without a corresponding Sales Header.
Further, we see that Sales Header has two more aggregations, namely from Item Charge Assignment
(Sales) and Sales Planning Line. From our knowledge of UML, we know that an aggregation implies a
‘has a’ relationship: Sales Header has one or more instances of (rows in) each of these tables.
Figure 9-4 Sales Header and Sales Line – Aggregations and Containments
9.5.2 Sales Header and Sales Line – Associations
The diagram in Figure 9-5 displays the associations that we map from Sales Header and Sales Line. As
described in section 6.2, the high number of associations in the application code has proven inconvenient
when the behavior of the code is changed. The high number of associations gave us the idea of refactoring
some of these associations into a generalization object, thereby reducing the total number of associations.
This diagram is the first we present in a series of related diagrams presenting the architecture in different
ways.
This diagram displays the associations as they actually are defined among the tables.
Figure 9-5 Sales Header and Sales Line Associations
9.5.3 Sales Header and Sales Line – Associations refactored via Generalization objects
As described, the goal is to reduce the total number of relations on the mapped tables. In Figure 9-6 we
have refactored some of the associations into generalization objects. The associations are declared on
different fields, and we require them to be defined on the same field to be part of the same
generalization. Further, we have chosen not to create generalization objects for single associations.
Sales Line, for example, has a field with a single relation to Item Variant; creating an extra object
would add no value, since the number of relations from Sales Line would be unchanged, so we have left
these out.
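The grouping rule just described, same source field and at least two targets, can be sketched as follows (Python for illustration; the triple representation of the parsed associations is our own simplification):

```python
from collections import defaultdict

def refactor_associations(associations):
    """Group associations declared on the same source field.
    Fields relating to two or more target tables become a
    generalization object; single associations are left as-is."""
    by_field = defaultdict(set)
    for source, field, target in associations:
        by_field[(source, field)].add(target)

    generalizations, singles = {}, []
    for (source, field), targets in by_field.items():
        if len(targets) >= 2:
            generalizations[(source, field)] = sorted(targets)
        else:
            singles.append((source, field, targets.pop()))
    return generalizations, singles

# Illustrative input (field and table names as in the diagrams):
assocs = [
    ("Sales Line", "No.", "Item"),
    ("Sales Line", "No.", "Resource"),
    ("Sales Line", "No.", "G/L Account"),
    ("Sales Line", "Variant Code", "Item Variant"),  # stays a plain association
]
gens, singles = refactor_associations(assocs)
print(gens)     # one generalization object on the "No." field
print(singles)  # the single Item Variant relation, untouched
```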
The diagram contains 3 generalization objects reducing the total number of associations for Sales Header
from 2 to 1 and Sales Line from 10 to 5. Furthermore, this abstraction has allowed us to see that we in fact
have more than one reference to the Item table. It is important to note that this is an abstract
refactoring suggestion; it could not be implemented in C/AL at this point, because the C/AL language does
not support inheritance. The diagram is, however, still very valuable for two reasons. First, it provides
information on how the code is actually organized: how the associations are defined and how they are
grouped. Looking at the diagram is significantly faster than reading the code, and with an understanding of
UML we can easily learn a lot about the code simply by reading a diagram. Secondly, the NAV organization
is giving serious thought to the future of the NAV application and the use of C/AL. The product itself is
moving to C#, and it would be an option to move the C/AL application to C# as well; this is described
further in section 2.3.3. The syntax of the generalization objects is “table number” “table ID (name)” :
“field name”.
Figure 9-6 Sales Header and Sales Line Generalizations
9.5.4 Sales Header and Sales Line – Associations refactored via refactored Generalization objects
In the example above, we introduced a refactoring of associations to reduce the total number of
associations on tables. The refactoring presented here further reduces the total number of generalization
objects. The turquoise boxes in Figure 9-7 are still our generalization objects; the refactoring consists in
reusing the same generalization object type on every table listed in the box. This shows that the
generalization object used by Sales Header is used by 17 tables in total. Using a single object would
simplify the code and reduce the total number of lines of code. The diagram also reveals resemblances
between tables. The generalization object associated with the Sales Line table's Posting Group field shows
that every table using this generalization object has “Line” in its name, implying, with some justification,
that these tables have similarities. The same is the case for the generalization object on the Sales Header
field Bal. Account No.: all tables except Payment Method contain a form of the word “Header” in their
name.
Figure 9-7 Sales Header and Sales Line Generalizations refactored
9.5.5 Sales Line, Standard Sales Line, and Sales Line Archive – Reused Generalization objects explored
In the diagram in Figure 9-7 we could see a resemblance in the naming between the objects using the
generalizations. Diagram Figure 9-8 is exploring the generalization object used by Sales Line, Standard Sales
76
Results
Line, and Sales Line Archive. It is important to note that the tables do not share a common base. If they
share code, it is because the code has been replicated in all tables. When we analyze the diagram below,
we see that there is a remarkable resemblance between the relations. The only difference we can see from
the properties we map is that the Standard Sales Line table is missing the associations to the generalization
objects for FA Posting Group and Inventory Posting Group. Besides these associations, the tables seem to be
completely alike. Here it is important to emphasize that we only map a small fragment of the code and that
they can have many other properties that are not alike.
Figure 9-8 Sales Line, Standard Sales Line, and Sales Line Archive reused Generalization objects explored
9.5.6 Sales Line, Standard Sales Line, and Sales Line Archive – Generalization objects explored
To show how the mapping of Sales Line, Standard Sales Line, and Sales Line Archive would look if
mapped with non-refactored generalization objects, we have provided the diagram in Figure 9-9, which
maps each generalization object individually. The diagram still appears readable, but the number of
crossing lines makes it more complicated to read than the previous diagram. Furthermore, the larger the
diagram gets, the more chaotic it will seem.
Figure 9-9 Sales Line, Standard Sales Line, and Sales Line Archive Generalization objects explored
9.5.7 Sales Line, Standard Sales Line and Sales Line Archive – Associations explored
The diagram in Figure 9-10 displays the associations on Sales Line, Standard Sales Line, and Sales Line
Archive without any generalizations, i.e. as they are actually implemented today. In our opinion, this
diagram has a low degree of readability, and we do not get much information from reading it. We could
have added information on where each association is defined, but in general the diagram is not very
useful.
Figure 9-10 Sales Line, Standard Sales Line, and Sales Line Archive – Associations explored
9.5.8 Item – Containments and Aggregations mapped with reused Generalization objects
When we explore the Concept Viewer, we find that the number of concepts to map varies a lot from table
to table. Some tables have no concepts to map at all, while others have an immense number, making the
diagrams chaotic at best. This is interesting because with this knowledge we can get an indication of the
effect a change to an object will have on the rest of the system. By analyzing different tables, we found
that the Item table is one of the cornerstones of the application, and based on this knowledge we can
foresee, with a high degree of certainty, that changes to the behavior of the Item table will cause side
effects in other parts of the application.
Mapping the containments and aggregations of Item discloses that 14 tables have a containment to Item,
meaning that their very existence depends on the existence of Item. Furthermore, we disclose that 16
tables have an aggregation to Item, meaning that they also depend on the Item table.
Figure 9-11 Item – Containments and Aggregations mapped with reused Generalization objects
9.5.9 Item – Reused Generalization objects containing Item
The Concept Viewer also offers the option of mapping every generalization that the mapped object is part
of, meaning that we can map all generalization objects that Item is part of. As expected, given the
knowledge from the diagram above, this diagram is large. It does not provide a clean overview because it
suffers from information overload. Nevertheless, we are able to see that Item is heavily used by the
generalization objects: in total there are 27 generalization objects referenced from 92 tables. When the
diagram is displayed in our tool, reorganizing, highlighting, and zooming can help extract information from
even larger and more complex diagrams.
Figure 9-12 Item – Reused Generalization objects containing Item
9.6 Feedback from the Application team
We presented the Concept Viewer tool to the Application team's Lead Software Development Engineer,
Bardur Knudsen. We introduced the tool and watched him play around with it.
At first glance, what he liked most was the reused generalization objects. He finds that they are good at
revealing similarities in the code. The reused generalization objects can be seen in section 9.5.4, and in
particular in Figure 9-7.
“What often happens is that we have to change something in a Sales Header but forget to update Sales
Invoice Header, Sales Header Archive or some other related table.” He finds that the generalization objects
show this. Forgetting to change related tables is a problem when data is copied from table to table,
as described in section 6.3.4. In the best case data is lost, and in the worst case the system could crash.
In general he was very pleased with the diagrams, but he would like to see more relations covered,
especially the ones defined in Codeunits, which were left out in favor of the work on REA concept
identification.
No documentation for the NAV application exists, and the Application team will not create any because, as
B. Knudsen stated, “It will be outdated before it is finished”. B. Knudsen would really like to have a tool like
the Concept Viewer to create documentation from code on the fly.
He liked the diagrams showing containments and aggregations and stated that NAV could benefit from
these diagrams when studying how a given change would propagate through the system. He was, however,
also of the opinion that most experienced developers already have this knowledge from years of experience
with NAV development. The tool can still be used by less experienced developers to gain this knowledge.
Based on the responses from the NAV application developers, we believe that this tool could help
developers gain a better understanding of the application.
Chapter 10
Conclusion
The conclusion sums up the work we have done and lists our solution to the problem defined in section 1.
The presented solution consists of the following four main parts:
- Code parsing
- General domain pattern matching
- Diagram generation
- Domain specific pattern matching
The achievements for each of the above parts are described separately in the following sections.
10.1 Parsing the C/AL application
We extended the available C/AL parser to parse the fields, which were not parsed by the original parser
implementation.
We showed that the produced Abstract Syntax Tree (AST) could advantageously be saved in XML. The XML
representation made the AST easy to query, extend, and store.
We found that a simple parser with five parsing rules was able to parse 100% of the TableRelations
matching the defined patterns; in total, 95% of all TableRelations were parsed.
The chosen approach was fast, accurate, and easy to work with.
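For illustration, here is a much-simplified sketch of such rule-based TableRelation parsing in Python (the actual subparser is written in F#; the regular expression below is our own simplification and covers only unconditional relations and IF ... ELSE IF chains with CONST/FIELD conditions, not WHERE filters or related fields):

```python
import re

# Matches one "IF (<condition with one inner parenthesis>) <table>" arm.
ARM = re.compile(
    r'IF\s*\((?P<cond>[^)]*\([^)]*\)[^)]*)\)\s*'
    r'(?P<table>"[^"]+"|\w[\w ./-]*?)(?=\s+ELSE|\s*$)'
)

def parse_table_relation(text):
    """Split a TableRelation into (condition, table) pairs;
    an unconditional relation yields a single pair with no condition."""
    text = text.strip()
    if not text.startswith("IF"):
        return [(None, text)]
    return [(m.group("cond"), m.group("table").strip())
            for m in ARM.finditer(text)]

rel = "IF (Type=CONST(Item)) Item ELSE IF (Type=CONST(Resource)) Resource"
print(parse_table_relation(rel))
# → [('Type=CONST(Item)', 'Item'), ('Type=CONST(Resource)', 'Resource')]
print(parse_table_relation("Customer"))  # → [(None, 'Customer')]
```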
10.2 UML Relationship pattern matching
We developed identification patterns for the concepts generalization and containment. We found a high
number of matches, indicating that the concepts were present throughout the NAV application code.
10.2.1 Generalization
We were able to detect 303 candidates for refactoring to generalizations. From the results, we presented
two refactoring suggestions:
1. The introduction of a generalization object to reduce dependencies between objects from 2..* (two
to many) to 1 (one).
2. The introduction of a reused generalization object, able to reduce the number of generalization
objects needed by more than 70%.
10.2.2 Containment
The containment algorithm identifies 148 containments. From the partial matches we were able to derive
463 aggregations. We are certain that more containments could be identified from the 463 partial matches.
The primary step towards finding more containments is to analyze the code placed in Codeunits. As
described in section 1, we chose to focus on applying domain specific knowledge instead of perfecting the
general domain knowledge.
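The distinction between full matches (containments) and partial matches (aggregations) can be illustrated with one plausible criterion: the child's existence depends on the parent when the relating fields are part of the child's primary key. The Python sketch below is a simplification of the actual rules, with illustrative field names:

```python
def classify_relation(child_primary_key, relating_fields):
    """If every field relating child to parent belongs to the child's
    primary key, the child cannot exist without the parent: containment.
    Otherwise we can only conclude aggregation."""
    if set(relating_fields) <= set(child_primary_key):
        return "containment"
    return "aggregation"

# Sales Line's primary key embeds the reference to its Sales Header:
print(classify_relation(["Document Type", "Document No.", "Line No."],
                        ["Document Type", "Document No."]))  # → containment
# A plain lookup field only gives an aggregation:
print(classify_relation(["Entry No."], ["Item No."]))        # → aggregation
```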
10.3 The Concept Viewer
We present our UML findings in a dynamic, fast, and detailed relation viewing tool. The tool was built with
the Microsoft Automatic Graph Layout (MSAGL) research project (69). We were able to contribute to the
project by adding support for UML symbols. The feedback we got from the NAV Application Team was that
they liked the tool and found that they were able to gain information from the produced diagrams.
According to B. Knudsen, the NAV organization could really use a tool like the Concept Viewer to assist
developers and architects by dynamically providing accurate, updated documentation.
10.4 Resource, Events and Agents (REA) relationship pattern matching
We are able to easily identify REA Events. Further we are able to identify a few Resource and Agent
candidates, but the work is far from complete. The work suffers from two problems:
1. There is high diversity in the formulation of the code expressing the REA relations. The variations
make it hard to detect the actual code lines expressing the REA pattern.
2. The chosen approach lacks a way to distinguish between REA Resources and REA Agents. Their role
in the system is identical according to the pattern we use at this point. Therefore we need a pattern
to distinguish between these two REA types.
We believe that REA theory, and the REA model in particular, is very relevant for NAV. REA provides a way
to view the duality in an accounting system and its transactions. We see REA analysis as a supplement to
UML analysis, because the knowledge we gain from applying REA is domain specific, in contrast to the
UML concepts, which are of a more general nature.
Chapter 11
Future work and Perspective
Within the NAV organization, our work could form the basis for a supplement to a new development
environment. Being able to see the design of your work instantly, by pressing a button that makes the tool
parse the code, identify the relations, and present the corresponding UML, would be of great value. The
tool could also assist the NAV organization in refactoring the NAV application, and it could easily be
extended to cover more relations and refactorings.
There are a number of issues we would like to continue to work with:
1. It would be interesting to compare our findings with the findings from the work of Till Blume, described
in section 3.1.2. We expect that we would find similarities in the results from the two tools even
though the approach is different.
2. It would be nice to analyze the code placed in Codeunits in our containment and generalization
algorithms. This is fairly simple and could be achieved in a couple of days.
3. It would be interesting to include multiplicity in the containment algorithm. We are not detecting
whether the multiplicity is a 1-1 (one to one) or 1-* (one to many) relationship.
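For item 3, one possible criterion, sketched here in Python purely as a suggestion and not as an implemented rule, is to compare the relating fields against the child's primary key: covering the whole key bounds the relationship to 1-1, while a proper subset allows 1-*:

```python
def multiplicity(child_primary_key, relating_fields):
    """Hypothetical multiplicity test: if the relating fields cover the
    child's entire primary key, each parent row has at most one child
    row (1-1); if they cover only part of it, several child rows can
    share one parent (1-*)."""
    if set(relating_fields) == set(child_primary_key):
        return "1-1"
    return "1-*"

print(multiplicity(["Document Type", "Document No.", "Line No."],
                   ["Document Type", "Document No."]))  # → 1-*
print(multiplicity(["User ID"], ["User ID"]))           # → 1-1
```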
In terms of REA we found that REA did provide useful information on the behavior of the NAV application
objects. We believe that REA is very interesting because it allows us to do domain specific modeling as a
supplement to the general purpose modeling information provided by UML.
We believe that there could be a basis for a project focusing on uncovering the REA patterns and modeling
the findings, preferably in a new version of the Concept Viewer.
A promising project focusing on building a framework for searching through code was started after this
work. It would be relevant to see if the REA pattern identification algorithms could benefit from being
implemented with this framework.
If the project were to be started all over again, it would be interesting to look into the C#-converted C/AL
code. The C# code will probably be incorporated even more in NAV v. 7, and if we worked on the C#
representation we would be able to use standard tools, or create a tool that could be used for both the
product and the application; see section 2.3.2. Further, if we made the tool more general, C# programmers
would be able to use it to analyze general purpose code.
Tools exist for parsing C# code into an abstract syntax tree, so we could save a lot of time by relying on a
fully working, supported parser. One such tool is ANTLR (ANother Tool for Language Recognition) (72), but
more interestingly, Anders Hejlsberg revealed the future of C# after version 4.0 (73) at PDC (the Microsoft
Professional Developer Conference) 2008. He introduced the concept “Compiler as a Service”, allowing
developers to use the C# compiler in many advanced ways, including producing abstract syntax trees. It
would be very interesting to contact the .NET team and see if it would be possible to use their work.
Chapter 12
Abbreviations
Abbreviation    For
App Team        Microsoft Dynamics NAV Application Team
AST             Abstract Syntax Tree (in general)
AST             Application Service Tier (related to NAV architecture)
BI              Business Intelligence
C/AL            Client/Application Language
C/SIDE          Client/Server Integrated Development Environment
CLR             Common Language Runtime
CRM             Customer Relationship Management
DTU             Technical University of Denmark
ERP             Enterprise Resource Planning
FM              Financial Management
GDP             Gross Domestic Product
GUI             Graphical User Interface
IMM             Department of Informatics and Mathematical Modeling
LE              Lambda Expression
LINQ            Language INtegrated Query
NAV             Microsoft Dynamics NAV
NDT             Navision Developer Toolkit
MDCC            Microsoft Development Center Copenhagen
MSDN            Microsoft Developer Network (www.msdn.com)
MSIL            Microsoft Intermediate Language
OP              OptionString
RE              Regular Expression
REA             Resource, Events, and Agents
RTC             Role Tailored Client
SCM             Supply Chain Management
SH              Sales Header
SL              Sales Line
TR              TableRelation
UI              User Interface
UML             Unified Modeling Language
Chapter 13
Works Cited
1. Hvitved, T. Architectural analysis of Microsoft Dynamics NAV. s.l. : University of Copenhagen, 2008.
2. [Online] http://www.uml.org/#UML2.0.
3. S. Haag, P. Baltzan, A. Phillips. Business Driven Technology. s.l. : McGraw-Hill Higher Education, 2008, Ch.
10-12.
4. —. Business Driven Technology. s.l. : McGraw-Hill Higher Education, 2008. Page 134, section 1, line 1 .
5. —. Business Driven Technology. s.l. : McGraw-Hill Higher Education, 2008, page 135, fig. 12.1, fig. 12.2.
6. —. Business Driven Technology. s.l. : McGraw-Hill Higher Education, 2008, page 134, section 3, line 1.
7. —. Business Driven Technology. s.l. : McGraw-Hill Higher Education, 2008, page 134, section 4, line 1.
8. —. Business Driven Technology. s.l. : McGraw-Hill Higher Education, 2008, page 136, section 3 line 1.
9. —. Business Driven Technology. s.l. : McGraw-Hill Higher Education, 2008, page 136, section 1, line 1.
10. S. Jacobson, J. Shepherd, M. D'Aquila, K. Carter. The ERP Market Sizing Report, 2006-2011. s.l. : AMR
Research, 2007. AMR-R-20495.
11. D. Roys, V. Barbic. Implementing Microsoft Dynamics NAV 2009. s.l. : Packt Publishing, 2008.
12. Slide deck from MDCC Dynamics NAV all hands meeting. March 28 2008.
13. A cookbook for using the model-view controller user interface paradigm in Smalltalk-80. G. E. Krasner, S.
T. Pope. 3, s.l. : SIGS Publications, 1988, Vol. 1.
14. Studebaker, D. Programming Microsoft dynamics NAV. s.l. : Packt Publishing, 2007.
15. Stroustrup, B. The C++ Programming Language. s.l. : Addison-Wesley, 1997.
16. S. Haag, P. Baltzan, A. Phillips. Business Driven Technology. s.l. : McGraw-Hill Higher Education, 2008,
page 124, section 1 and 2, page 131, figure 11.4, no. 2 and 4.
17. Troelsen, A. Pro C# 2008 and the .NET 3.5 framework. s.l. : Apress, 2007. Ch. 1.
18. [Online] http://msdn.microsoft.com/en-us/library/bb669144.aspx.
19. Studebaker, D. Programming Microsoft Dynamics NAV. s.l. : Packt Publishing, 2007, page 4-5.
20. [Online] http://en.wikipedia.org/wiki/OLAP.
21. [Online] http://www.parentenet.com/parentetech/parentetech_unique_technology.htm#sift.
22. [Online] http://msdn.microsoft.com/en-us/library/dd301468.aspx.
23. [Online] http://dinosaur.compilertools.net/#lex.
24. [Online] http://dinosaur.compilertools.net/#yacc.
25. [Online] https://mbs.microsoft.com/partnersource/downloads/releases/NDTAll.
26. [Online] http://www.scootersoftware.com/.
27. [Online] http://www-01.ibm.com/software/awdtools/developer/rose/.
28. [Online] www.magicdraw.com/.
29. [Online] http://plg.uwaterloo.ca/~migod/uml.html.
30. On the relationship between REA and SAP. O'Leary, D. E. s.l. : International Journal of Accounting
Information Systems, 2004, Vol. 5.
31. The REA Accounting Model: A Generalized Framework for Accounting Systems in a Shared Data
Environment. McCarthy, W. E. No. 3, s.l. : The Accounting Review, 1982, Vol. Vol. 57.
32. [Online] http://www.sap.com.
33. B. Smith, C. Rosse. The Role of Foundational Relations in the Alignment of Biomedical Ontologies. s.l. :
MEDINFO, 2004, page 444-445.
34. G. Booch, J. Rumbaugh, I. Jacobson. The Unified Modeling Language User Guide. s.l. : Addison Wesley,
1999, preface, ch. 4-5.
35. —. The Unified Modeling Language User Guide. s.l. : Addison Wesley, 1999, preface, page xx, line 7.
36. Contemporary approaches and techniques for the systems analyst. Batra, D. & Satzinger. No. 17, s.l. :
Journal of Information Systems Education, 2006, page 257-266.
37. Identified by J. Kiehn.
38. Fowler, M. Refactoring: Improving the Design of Existing Code. s.l. : Addison-Wesley, 2002, page 78.
39. Fowler, M. Refactoring: Improving the Design of Existing Code. s.l. : Addison-Wesley, 2002, page 147.
40. The REA Modeling Approach to Teaching Accounting Information Systems. McCarthy, W. E. s.l. :
Accounting Education, 2003, Vol. 18.
41. —. McCarthy, W. E. No. 4, s.l. : Accounting Education, 2003, page 430, section 2, line 7, Vol. Vol. 18.
42. —. McCarthy, W. E. No. 4, s.l. : Accounting Education, 2003, page 431, figure 4, Vol. Vol. 18.
43. Accounting for Rationality: Double-Entry Bookkeeping and the Rhetoric of Economic Rationality. B. G.
Carruthers, W. N. Espeland. No. 1, s.l. : The American Journal of Sociology, 1991, page 37, line 2, Vol. Vol.
97.
44. [Online] http://www.reallifeaccounting.com/dictionary.asp#L.
45. On the relationship between REA and SAP. O'Leary, D. E. 5, s.l. : International Journal of Accounting
Information Systems, 2004, section 5.2.
46. [Online] http://mono-project.com/.
47. [Online] http://channel8.msdn.com/Posts/MSIL-the-language-of-the-CLR-Part-1.
48. Hickey, J. Introduction to Objective Caml. s.l. : Cambridge University Press, 2008.
49. G. Cousineau, M. Mauny. The Functional Approach to Programming. s.l. : Cambridge University Press,
1998.
50. Harper, R. Programming in Standard ML. s.l. : Carnegie Mellon University, 2009.
51. [Online] http://stackoverflow.com/questions/179492/f-and-ocaml.
52. D. Syme, A. Granicz, A. Cisternino. Expert F#. s.l. : Apress, 2007, page 224-230 section on active
patterns.
53. —. Expert F#. s.l. : Apress, 2007, ch. 13 on Asynchronous computation.
54. D. Syme, A. Granicz and A. Cisternino. Expert F#. s.l. : Apress, 2007, ch. 16.
55. [Online] http://dinosaur.compilertools.net/#lex.
56. [Online] http://dinosaur.compilertools.net/#yacc.
57. [Online] http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html.
58. Friedl, J. E. F. Mastering Regular Expressions, 3rd edition. s.l. : O'Reilly, 2006, section 3.2.
59. D. Jurafsky, J. H. Martin. Speech and Language Processing - An Introduction to Natural Language
Processing, Computational Linguistics, and Speech Recognition. s.l. : Pearson Higher Education, 2008, page
42.
60. The Impact of the Lambda Calculus in Logic and Computer Science. Barendregt, H. s.l. : The Bulletin of
Symbolic Logic, 1997, page 14.
61. [Online] http://msdn.microsoft.com/en-us/library/bb397687.aspx and
http://en.wikipedia.org/wiki/Lambda_calculus.
62. The Impact of the Lambda Calculus in logic and computer science. Barendregt, H. s.l. : The bulletin of
symbolic Logic, 1997, page 194, section 3.2, line 16.
63. F. Marguerie, S. Eichert, J. Wooley. LINQ in Action. s.l. : Manning, 2008.
64. Ferracchiati, F. C. LINQ for Visual C# 2008. s.l. : Apress, 2008.
65. F. Marguerie, S. Eichert, J. Wooley. LINQ in Action . s.l. : Manning, 2008, page 5, quote 1.
66. —. LINQ in Action. s.l. : Manning, 2008, page 35-37.
67. [Online] http://w3schools.com/xml/xml_syntax.asp.
68. [Online] http://www.regular-expressions.info/,
http://www.codeproject.com/KB/dotnet/regextutorial.aspx and http://msdn.microsoft.com/en-us/library/hs600312.aspx.
69. [Online] http://research.microsoft.com/en-us/projects/msagl/ .
70. [Online] http://research.microsoft.com/en-us/um/people/levnach/.
71. Hvitved, T. Architectural analysis of Microsoft Dynamics NAV. s.l. : University of Copenhagen, 2008,
page 45, section 3.
72. [Online] http://antlr.org.
73. [Online] http://channel9.msdn.com/pdc2008/TL16/.
74. [Online] http://channel9.msdn.com/pdc2008/TL16/.
Chapter 14
Appendix
14.1 Content on the enclosed DVD
The DVD attached to this report contains the Concept Viewer Installer and the code for our work. The
individual projects are described below:
The root folder on the DVD contains two folders:
- Concept Viewer Installer:
  This folder contains the installer for the Concept Viewer tool. To install the program, run
  Concept Viewer.msi. The default installation directory is
  $:\Program Files\David Flenstrup\Microsoft Dynamics NAV Concept Viewer\.
  Once installed, the program can be run from the “Microsoft Dynamics NAV Concept Viewer”
  shortcut on the Windows desktop and in the all programs menu.
- Code:
  This folder contains the following subfolders with the code for this project:
o Abstract Syntax Tree XML Files
This folder contains the abstract syntax tree representation of the C/AL application code. The
folder contains three files:
- Codeunits as Abstract Syntax Tree.xml: XML representation of all Codeunits
- Tables as Abstract Syntax Tree - Not Subparsed.xml: XML representation of all tables with
  the level of detail offered by the original CALParser
- Tables as Abstract Syntax Tree - Subparsed.xml: XML representation of all tables with the
  added details from our subparser
o ASTTable:
Initial analysis project used to extract properties directly from the CALParser. The project is
only used for analyzing the C/AL application and does not contribute directly to the rest of
our work.
o ConceptViewer:
The project contains the implementation of the relation viewer tool (the Concept Viewer)
presenting our findings.
o MatchingInLinq:
This project contains the implementations of the algorithms for containment, aggregation, and
generalization.
o MSAGL:
This folder contains the dll files for the MSAGL project. We do not have ownership of the
MSAGL project and can therefore not pass on the actual code.
o RegularExpressionParser:
This is the subparser we have developed for C/AL. It parses TableRelations and creates an
XML element with each parsed TableRelation.
o Simple OO Parser:
This folder contains the extended CALParser developed by T. Hvitved. The primary
extension is the added CALToXML.fs file, which prints the abstract syntax tree to XML.
o Work From T. Hvitved:
This folder contains code from a former project (1): the CALParser we have extended into
Simple OO Parser, and a text export of the NAV application compatible with the CALParser.