BIRT 2.2 Dynamic Cross Tables
By Jason Weathersby and Virgil Dodson
Dynamic Cross Tables and BIRT Data Cubes
BIRT 2.2 was just released as part of the Eclipse Europa train. With this release, BIRT offers many
new features, and topping this list is the new dynamic cross tabulation report item. This new element
allows data distributions to be displayed using a matrix style table, complete with aggregation at the
column and row levels. As part of building a cross tab style report, the BIRT team also offers a new data
cube element which allows OLAP style cubes to be constructed using existing BIRT datasets. In this
article, we will discuss these two new features and provide an example report which illustrates building a
dynamic cross tab.
Introduction
Cross tabulation style reports are often useful when trying to present data that is aggregated across
periods such as quarters, organizational hierarchies, regions, and other dimensions. While previous
versions of BIRT offered this type of capability, the specific levels of the dimension had to be known
beforehand or code had to be written to modify the currently running report to add the specific dimensional
level. BIRT 2.2 addresses this issue with two new elements, the data cube element and the cross tab report
item.
The data cube element allows the creation of an OLAP based cube using existing BIRT datasets and
consists of dimensions and measures. These dimensions and measures are constructed using the new BIRT
cube builder, which stores the metadata for the cube in the XML report design.
Once the cube is created, it can be bound to the new cross tab report item, similar to binding a standard
dataset to a table. The dimensions are placed in either the row area or the column area and the measures
are placed at the intersection point. When the report is executed, the columns and rows are
generated automatically based on the dimension levels the report designer chose. For example,
a cross tab may contain a sales forecast by quarter across several regions. The designer could choose to
display year, quarter, and month in the column level, and country, state, and city in the row level. The
number of columns and rows would then be dynamically generated based on the dataset used within the
cube.
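To make the idea of dynamically generated rows and columns concrete, the following is a minimal sketch in plain JavaScript, run outside BIRT. It is not BIRT code: the crossTab function, the field names, and the sample rows are all hypothetical. It only shows that the row and column headers are discovered from the data rather than fixed in advance.

```javascript
// Hypothetical sample rows; in BIRT these would come from the dataset behind the cube.
const forecastRows = [
  { region: "East", quarter: "Q1", amount: 100 },
  { region: "East", quarter: "Q2", amount: 150 },
  { region: "West", quarter: "Q1", amount: 80 },
  { region: "East", quarter: "Q1", amount: 20 },
];

// Build a cross tab: the row and column headers are the distinct values found
// in the data, and each cell holds the summed measure for one row/column pair.
function crossTab(rows, rowField, colField, measureField) {
  const rowKeys = [...new Set(rows.map((r) => r[rowField]))].sort();
  const colKeys = [...new Set(rows.map((r) => r[colField]))].sort();
  const cells = {};
  for (const r of rows) {
    const key = r[rowField] + "|" + r[colField];
    cells[key] = (cells[key] || 0) + r[measureField];
  }
  return { rowKeys, colKeys, cells };
}

const forecastTab = crossTab(forecastRows, "region", "quarter", "amount");
console.log(forecastTab.rowKeys);          // row headers found in the data
console.log(forecastTab.colKeys);          // column headers found in the data
console.log(forecastTab.cells["East|Q1"]); // 120
```

Adding a row for a new region or quarter to the sample data adds a row or column to the result without any change to the function, which is the behavior the cross tab report item provides at report run time.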
In this article, we will first explain the process for building a cube and what options are available. We
will then take this cube and produce a cross tabulation report using the new cross tab report item. We will
also discuss how to do totaling, filtering, sorting, and other features within the cross tab report item.
BIRT Data Cubes
BIRT data cubes are multi-dimensional cubes, based on one or more BIRT datasets. Currently the
primary purpose of the cube is to support the new cross tab report item. Cubes are built using the new Cube
Builder, which supports adding dimensions and measures. The Cube Builder supports star schemas that
consist of one fact dataset and one or more dimension datasets. Note that a single fact dataset can
be constructed in BIRT from multiple fact tables using standard joins and the Joint
dataset. This allows the user to create cubes that span datasets and data sources. For example, a user can
create a cube that consists of HR data from one database and financial information from an entirely
separate database.
Figure 1 – Cube Builder
Dimensions are defined as a set of data fields arranged in hierarchies and levels. Country, State, City
and Year, Quarter, Month are examples of how a dimension may be structured. BIRT currently supports
level based dimensions, where each level can be defined as static or dynamic. Static dimension levels are
defined by the cube developer, using the expression builder. These are usually created using a dataset field
within the expression. For example, a dimension level may be created that uses the QUANTITYORDERED
field to determine large and small orders, which creates two possible members at the defined level in the
dimension. Dynamic dimension levels are defined by the dataset used to construct the level. Defining a
dynamic level within a dimension as the COUNTRY data field would then generate a level member for
each country in the dataset.
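The difference between the two kinds of level can be sketched in plain JavaScript. This is an illustration only, not the expression BIRT generates: the threshold of 40, the member names, the function names, and the sample rows are assumptions made for the example.

```javascript
// Hypothetical rows; the field names follow the sample database.
const orderRows = [
  { QUANTITYORDERED: 45, COUNTRY: "USA" },
  { QUANTITYORDERED: 12, COUNTRY: "France" },
  { QUANTITYORDERED: 60, COUNTRY: "USA" },
];

// Static level: the developer fixes the set of members with an expression.
// The threshold (40) and the member names are assumptions for this sketch.
function orderSizeMember(row) {
  return row.QUANTITYORDERED >= 40 ? "Large Orders" : "Small Orders";
}

// Dynamic level: the members are whatever distinct values the data contains.
function dynamicMembers(rows, field) {
  return [...new Set(rows.map((r) => r[field]))];
}

console.log(orderRows.map(orderSizeMember));       // one static member per row
console.log(dynamicMembers(orderRows, "COUNTRY")); // distinct countries, in data order
```

The static level always has at most two members regardless of the data, while the dynamic level grows as new countries appear in the dataset.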
Measures are a special type of dimension defined in BIRT using either a dataset field name or a
JavaScript expression that is aggregated using one of the supported functions. BIRT currently supports
SUM, AVG, MAX, MIN, FIRST, LAST, COUNT, and COUNTDISTINCT aggregate functions. BIRT
also supports multiple measures per cube.
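For readers unfamiliar with these function names, the sketch below shows what each one computes over a list of values. It is plain JavaScript written for illustration, not BIRT's implementation, and the aggregates object and sample values are hypothetical.

```javascript
// Illustrative definitions of the eight aggregate functions named above.
// Each takes a non-empty array of numbers; empty input is not handled here.
const aggregates = {
  SUM: (v) => v.reduce((a, b) => a + b, 0),
  AVG: (v) => v.reduce((a, b) => a + b, 0) / v.length,
  MAX: (v) => Math.max(...v),
  MIN: (v) => Math.min(...v),
  FIRST: (v) => v[0],
  LAST: (v) => v[v.length - 1],
  COUNT: (v) => v.length,
  COUNTDISTINCT: (v) => new Set(v).size,
};

const sampleValues = [10, 20, 20, 50];
for (const name of Object.keys(aggregates)) {
  console.log(name, aggregates[name](sampleValues));
}
```

Note that COUNT and COUNTDISTINCT differ for this input because the value 20 appears twice.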
When building a cube in BIRT, the user must first select a primary dataset, which will contain the fact
table and is used to generate the measures for the cube. Additional datasets can be included in the cube as
dimension tables. To illustrate the cube features, we will build a cube that sums the sales of products
across product lines by year and quarter. This example will be built using the BIRT sample database.
To begin, we create two datasets, one for the fact table and one for the product line dimension.
The fact dataset query should be as follows:
select *
from CLASSICMODELS.ORDERDETAILS, CLASSICMODELS.ORDERS
where CLASSICMODELS.ORDERS.ORDERNUMBER = CLASSICMODELS.ORDERDETAILS.ORDERNUMBER
and CLASSICMODELS.ORDERS.STATUS = 'Shipped'
This query joins two tables and returns the order details for every shipped order in the Classic
Models sample database.
Figure 2 – Order Details Dataset
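The effect of the fact query can be sketched in plain JavaScript as an inner join on ORDERNUMBER followed by a filter on STATUS. The rows below are made-up stand-ins, not actual Classic Models data, and shippedOrderDetails is a hypothetical helper written for this illustration.

```javascript
// Made-up stand-ins for the ORDERS and ORDERDETAILS tables.
const orders = [
  { ORDERNUMBER: 1, STATUS: "Shipped" },
  { ORDERNUMBER: 2, STATUS: "Cancelled" },
];
const orderDetails = [
  { ORDERNUMBER: 1, PRODUCTCODE: "P1", QUANTITYORDERED: 3, PRICEEACH: 20 },
  { ORDERNUMBER: 1, PRODUCTCODE: "P2", QUANTITYORDERED: 2, PRICEEACH: 10.5 },
  { ORDERNUMBER: 2, PRODUCTCODE: "P1", QUANTITYORDERED: 5, PRICEEACH: 20 },
];

// Inner join on ORDERNUMBER, keeping only detail rows whose order was shipped.
function shippedOrderDetails(orderList, detailList) {
  const shipped = new Map(
    orderList
      .filter((o) => o.STATUS === "Shipped")
      .map((o) => [o.ORDERNUMBER, o])
  );
  return detailList
    .filter((d) => shipped.has(d.ORDERNUMBER))
    .map((d) => ({ ...shipped.get(d.ORDERNUMBER), ...d }));
}

const factRowsForCube = shippedOrderDetails(orders, orderDetails);
console.log(factRowsForCube.length); // 2
```

Each resulting row carries both the order fields (such as STATUS) and the detail fields (such as PRODUCTCODE), which is what the cube uses as its fact data.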
The second dataset which will be used for the product line dimension is defined as:
select *
from CLASSICMODELS.PRODUCTS
This dataset returns the product information for all the Classic Models sample products and will be
linked to the orders dataset by the PRODUCTCODE field.
Figure 3 – Product Details Dataset
Now that the two datasets are built, we can construct the cube. Select the Data Explorer view,
right-click the Data Cubes element, and select New Data Cube.
Figure 4 – New Data Cube
This will launch the Cube Builder. The Cube Builder has three pages used for constructing the cube.
The first page is the Dataset page, which is used to select the primary dataset that contains the fact table.
Select the orders dataset as the primary dataset. Note that you can filter the dataset at this level as well.
Figure 5 – Cube Builder Dataset
Select the Groups and Summaries page. This page is used to build the dimensions and measures. We
will first create the measure of ProductSales. Select the Summary Fields element in the right pane and then
click the Add button. This will add a new measure. This can also be done by dragging a data field from
the primary dataset in the left pane to the Summary Fields element in the right pane.
Figure 6 – Create the Product Sales measure
Product sales are determined by multiplying PRICEEACH by QUANTITYORDERED. Because this
is a calculation, we will need to use a JavaScript expression to calculate this measure. To create the
expression, drag the PRICEEACH field to the newly created measure. This will launch the properties
editor for the ProductSales measure.
Figure 7 – ProductSales Properties
Change the name to Amount, select the SUM aggregate function, and enter the following expression.
dataSetRow["PRICEEACH"]*dataSetRow["QUANTITYORDERED"]
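To see what this measure evaluates to, the following plain JavaScript sketch applies the same expression to each row and sums the results, which is what the SUM aggregate does conceptually. The sample rows and the amountMeasure function are hypothetical; inside BIRT, dataSetRow is supplied by the report engine.

```javascript
// Hypothetical fact rows standing in for the primary dataset.
const sampleFactRows = [
  { PRICEEACH: 10.5, QUANTITYORDERED: 2 },
  { PRICEEACH: 20, QUANTITYORDERED: 3 },
];

// Evaluate the measure expression per row, then aggregate with SUM.
function amountMeasure(rows) {
  let total = 0;
  for (const dataSetRow of rows) {
    total += dataSetRow["PRICEEACH"] * dataSetRow["QUANTITYORDERED"];
  }
  return total;
}

console.log(amountMeasure(sampleFactRows)); // 81
```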
Next, select the Dimensions group and click the Add button. Enter Products as the name. This will
create the new dimension used for products. This can also be done by dragging any of the product dataset
fields to the dimensions group in the right pane. Drag the PRODUCTLINE field from the left pane and
drop it on the newly created dimension. This will launch the Group Level editor.
Figure 8 – Group Level Editor Dynamic Dimension Level
At this point you can define the dimension level as either static or dynamic. In the sample database,
every product has a unique code, PRODUCTCODE, and each code is a member of one of seven
product lines (PRODUCTLINE). Therefore, we will use a dynamic dimension level, which depends on the dataset rows
and columns when determining level members. Dynamic dimensions require that there be one unique
entry in the Dimension dataset that links to the fact dataset. In this example, the fact dataset will contain
many duplicate product codes, but the products dataset will only contain unique product codes. Static
dimension levels are created using the expression builder and available data values. For example, we could
create a static dimension level for big orders and little orders, as illustrated
in Figure 9.
Figure 9 – Static Dimension Level
Next, drag the PRODUCTCODE from the products dataset in the left pane to the product lines
dimension in the right pane.
Figure 10 – Product Line Dimension
The Key Field property is used later to link the Products dimension to the fact or primary dataset. You
can also enter a field from the dataset to display instead of the key field when displaying the results in a
cross tab report item. The attribute fields allow the cube developer to enter additional fields that are
calculated at each level and can be used when displaying a level in a report. As an example, a dimension
may be structured as Country, State, and City. The city value may be the key field, but you may want the
population for the city to be included in the cross tab. This can be accomplished by adding the population
dataset field as an attribute to the City group level.
BIRT cubes supply a special dimension when using fields that contain dates. We can use this feature
by constructing a dimension based on shipped date. Shipped date implies that the product transaction has
been completed, so we will use this as a dimension. First create a new dimension named Shipped as
described earlier. Next drag the SHIPPEDDATE field from the primary dataset to the new dimension.
This will launch the Group Level Editor for the Shipped dimension, and the date-time levels will
automatically be created. Check the year, quarter, and month levels.
Figure 11 – Date Dimensions
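BIRT derives the date-time levels itself, so no code is needed in the report. The plain JavaScript sketch below only illustrates the arithmetic behind the year, quarter, and month levels. It assumes calendar quarters numbered 1 to 4, and the dateLevels function is hypothetical.

```javascript
// Derive year, quarter, and month levels from a date.
// UTC accessors are used so the result does not depend on the local time zone.
function dateLevels(date) {
  const month = date.getUTCMonth() + 1;            // 1..12
  const quarter = Math.floor((month - 1) / 3) + 1; // 1..4, calendar quarters (assumed)
  return { year: date.getUTCFullYear(), quarter, month };
}

// A shipped date of 6 November 2003 falls in the fourth quarter.
console.log(dateLevels(new Date(Date.UTC(2003, 10, 6))));
```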
We are now finished building the dimensions and measures needed for the cube. The final step is to
link the Products dimension with the primary dataset. To do this, select the Link Groups page in the Cube
Builder and drag the PRODUCTCODE field from the Products group to the PRODUCTCODE field in the
Primary group.
Figure 12 – Link Groups
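Conceptually, the link lets each fact row find its product line through the shared PRODUCTCODE key. The plain JavaScript sketch below illustrates that lookup with made-up rows; amountByProductLine is a hypothetical helper, and skipping unmatched fact rows is an assumption of this sketch rather than documented BIRT behavior.

```javascript
// Made-up dimension rows: one unique entry per product code.
const productRows = [
  { PRODUCTCODE: "P1", PRODUCTLINE: "Planes" },
  { PRODUCTCODE: "P2", PRODUCTLINE: "Ships" },
  { PRODUCTCODE: "P3", PRODUCTLINE: "Planes" },
];
// Made-up fact rows: product codes may repeat.
const linkedFacts = [
  { PRODUCTCODE: "P1", amount: 100 },
  { PRODUCTCODE: "P2", amount: 50 },
  { PRODUCTCODE: "P1", amount: 25 },
  { PRODUCTCODE: "P3", amount: 10 },
];

// Resolve each fact row's product line through the key and sum per line.
function amountByProductLine(products, facts) {
  const lineByCode = new Map(products.map((p) => [p.PRODUCTCODE, p.PRODUCTLINE]));
  const totals = {};
  for (const f of facts) {
    const line = lineByCode.get(f.PRODUCTCODE);
    if (line === undefined) continue; // assumption: unmatched fact rows are skipped
    totals[line] = (totals[line] || 0) + f.amount;
  }
  return totals;
}

console.log(amountByProductLine(productRows, linkedFacts));
```

This also shows why the dimension dataset needs unique product codes: a duplicate code with a different product line would make the lookup ambiguous.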
Using the Cross Tab Report Item
Now that the Cube is built, the cross tab report item can be used to construct a dynamic cross table.
To do this, first drag the cross tab report item from the palette to the report layout view. This will add an
empty cross tab to the report. The cross tab report item allows dragging of cube dimensions into the
column and row headings and cube measures into the intersection cell.
Figure 13 – Cross Tab Report Item
In the cube created in the previous section, dimensions were created for date and product lines and a
measure was created to sum product sales. Dragging the Date group dimension to the row area, the Product
Lines group dimension to the column area, and the Amount measure to the intersection cell produces the
cross tab shown in Figure 14.
Figure 14 - Simple Cross Tab
Running the report produces the output shown in Figure 15.
Figure 15 – Example Report
The product lines available in the dataset are expanded across the columns and the years are inserted
into the rows. The measure is also automatically calculated for the intersecting cell. It is also important to
realize that multiple dimensions can be placed in the columns and rows and multiple measures can be used
in the intersecting cells.
If a dimension has multiple levels, such as the date dimension in this example, the top level
is displayed by default. Clicking the down arrow on the dimension launches a pop-up menu that allows the
developer to choose which levels are presented in the cross tab. Using this same approach, totals per
level can be added to the cross tab as shown in Figure 16.
Figure 16 – Dimension menu
Select the Show/Hide Group Levels menu item for the Date dimension and click on quarter as shown
in Figure 17.
Figure 17 – Show Hide Group Levels
Running the report again shows the results broken down by quarter, as shown in Figure 18.
Figure 18 – Added Quarter
Selecting the Totals menu item from the dimension pop-up menu allows the developer to add totals per level as
well as grand totals for the entire cross tab. In the previous example, selecting the Totals item on the date
dimension and then choosing a subtotal for the year amount and a grand total for the amount adds aggregation
elements per year and for the grand total, as shown in Figure 19.
Figure 19 - Totals
The modified cross tab now aggregates the data per year with a grand total for the entire dataset as
shown in Figure 20.
Figure 20 – Total Report
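The relationship between the per-year subtotals and the grand total can be sketched in plain JavaScript. The quarterly figures below are made up, and yearTotals is a hypothetical function; BIRT computes these values itself once the totals are added to the cross tab.

```javascript
// Made-up quarterly amounts for a single product line.
const quarterlyAmounts = [
  { year: 2003, quarter: 1, amount: 100 },
  { year: 2003, quarter: 2, amount: 200 },
  { year: 2004, quarter: 1, amount: 50 },
];

// Compute a subtotal per year and a grand total across all rows.
function yearTotals(rows) {
  const perYear = {};
  let grandTotal = 0;
  for (const r of rows) {
    perYear[r.year] = (perYear[r.year] || 0) + r.amount;
    grandTotal += r.amount;
  }
  return { perYear, grandTotal };
}

const totalsResult = yearTotals(quarterlyAmounts);
console.log(totalsResult.perYear);    // subtotal for each year
console.log(totalsResult.grandTotal); // 350
```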
In this example, we chose to sum the amount measure across years and quarters per product line.
Double-clicking on the newly added aggregation element launches the Aggregation builder.
Figure 21 – Aggregation Builder
The Aggregation Builder allows the developer to change the totaling function, apply filters to the
data, and choose which levels are aggregated. Currently, SUM, AVERAGE, MAX, MIN, FIRST, LAST,
COUNT, and COUNTDISTINCT totaling functions are supported when the Aggregation Builder is used in
the cross tab report item.
In this example, aggregation has been illustrated using a row dimension. These same operations can
be applied to the column dimensions. This can be useful when grand totals are needed in the last column.
In addition, general formatting as applied to all BIRT report elements can also be applied to the cross tab
report item, including the use of styles. Filtering can be applied to the cross tab to filter data based on
a dimension, such as removing a specific product line in the previous example. Sorting can be applied to
dimensions as well. In this example, reversing the date order can be accomplished using the sort feature.
Highlighting can also be used to call out notable cross table entries, such as high sales
figures. Figure 22 illustrates a few examples of applying filtering, sorting, and highlighting. Note
that filtering and sorting can also be applied to aggregations and calculated fields. For
example, a cross tab could display the top 5 customers who ordered from the Planes product line in
2003.
Figure 22 - Formatting
Figure 23 shows the formatted example.
Figure 23 – Completed Report
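The combination of filtering and sorting on an aggregated value, as in the top customers example mentioned earlier, can be sketched in plain JavaScript. The rows, customer names, and the topCustomers function are hypothetical; in BIRT the same result is configured through the cross tab's filter and sort settings rather than written as code.

```javascript
// Made-up sales rows.
const customerSales = [
  { customer: "A", productLine: "Planes", year: 2003, amount: 500 },
  { customer: "B", productLine: "Planes", year: 2003, amount: 300 },
  { customer: "A", productLine: "Planes", year: 2003, amount: 100 },
  { customer: "C", productLine: "Planes", year: 2003, amount: 700 },
  { customer: "D", productLine: "Ships", year: 2003, amount: 900 },
  { customer: "B", productLine: "Planes", year: 2004, amount: 1000 },
];

// Filter by product line and year, aggregate per customer,
// sort by the aggregated amount (descending), and keep the top n.
function topCustomers(rows, productLine, year, n) {
  const totals = new Map();
  for (const r of rows) {
    if (r.productLine !== productLine || r.year !== year) continue;
    totals.set(r.customer, (totals.get(r.customer) || 0) + r.amount);
  }
  return [...totals.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([customer, amount]) => ({ customer, amount }));
}

console.log(topCustomers(customerSales, "Planes", 2003, 2));
```

The filter is applied before aggregation and the sort after it, which mirrors filtering on a dimension and then sorting on an aggregated field.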
Summary
This article illustrates only a small portion of what is available with the new BIRT cube and dynamic
cross tab features. We encourage you to download the BIRT 2.2 release and experiment with the new
features. To read more about the 2.2 release feature set, see the New and Notable page at
http://www.eclipse.org/birt/phoenix/project/notable2.2.php.