Hierarchies in SAP BW

ASAP FOR THE BW ACCELERATOR
BUSINESS INFORMATION WAREHOUSE

Information about and Examples of Using Hierarchies in the SAP BW

SAP (SAP America, Inc. and SAP AG) assumes no responsibility for errors or omissions in these materials. These materials are provided “as is” without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP shall not be liable for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials. SAP does not warrant the accuracy or completeness of the information, text, graphics, links or other items contained within these materials. SAP has no control over the information that you may access through the use of hot links contained in these materials and does not endorse your use of third party web pages nor provide any warranty whatsoever relating to third party web pages.

BUSINESS INFORMATION WAREHOUSE - HIERARCHIES
2000 SAP AMERICA, INC. AND SAP AG

1 Introduction

This document provides background information on hierarchies used in the SAP Business Information Warehouse.

1.1 Software Version Supported

This document was written specifically for BW version 2.0B, but should apply to all versions of BW.

1.2 References

- Loading InfoSource Data via Flat Files
- Multi-Dimensional Modeling with BW
- Authorizations and Roles

2 Introduction, Definition and Terms

2.1 Objectives

The aim of this documentation is to explain the various options for modeling data using hierarchies in the SAP Business Information Warehouse, based on specific examples. This document is written for software version 2.0B.

2.2 Definition and Properties

A hierarchy models the grouping and structuring of a characteristic by individual evaluation criteria.
A hierarchy in the SAP BW has the following properties:

- Hierarchies are created for base characteristics (characteristics involving master data) in the SAP BW. One example of a base characteristic is the cost center (InfoObject 0COSTCENTER). Hierarchies for the cost center can then be applied to both sender and receiver cost centers.
- Hierarchies are saved in special master data tables of a characteristic. They behave the same as master data, and are therefore valid for all InfoCubes and can be modified.
- Several hierarchies can be defined for a single characteristic.
- A hierarchy can contain a maximum of 98 levels.
- Hierarchies can be loaded from an R/3 System or from a file, or can be manually created and changed in the BW.

The following diagram shows an example of a hierarchy in the BEx Analyzer:

[Figure: example of a hierarchy in the BEx Analyzer]

Hierarchical structures can also be modeled within an InfoCube, in the dimensions, or as attributes of a characteristic. Hierarchical structures that are modeled in dimensions cannot be changed without a comprehensive realignment, which is not contained in the SAP BW standard (see Chapter 4). As the diagram shows, a hierarchy also differs from a hierarchical structure modeled in the dimensions and attributes in the way it is displayed in the BEx. Hierarchical structures are not the subject of this documentation and will not be mentioned further.

2.3 Terminology

2.3.1.1 Nodes

Nodes are the objects that form a hierarchy. A node can have children (additional sub-nodes). There are two types of nodes:

Nodes that can be posted to

Nodes that can be posted to are themselves valid characteristic values of the base characteristic. Accordingly, values can be saved for these nodes in the fact table. When the data is presented in the BEx Analyzer workbook, nodes that can be posted to are displayed as leaves containing the facts saved directly for this characteristic value. They are also displayed as nodes, that is, as an aggregation of the values of all the children of the node that can be posted to.

Example: in a customer hierarchy, a corporate group with subsidiaries, where sales are recorded both for the subsidiaries and for the corporate group itself.

Nodes that cannot be posted to

A node that cannot be posted to is an object that does not refer to the characteristic on which the hierarchy is based. There are two kinds: characteristic nodes and text nodes. A characteristic node is defined by a characteristic and a value of that characteristic. When a characteristic node is presented, its name is taken from the master data (text table) of that characteristic, for example, a region in the customer hierarchy. For text nodes, a name must be defined directly as a text.

In general, the characteristic values of the characteristic to which the hierarchy applies are located at the last level in the hierarchy (that is, characteristic values = leaves). A leaf can appear several times in the same hierarchy, but cannot be assigned several times to the same node. The values of leaves that appear several times are only taken into account once in the higher-level nodes.

Please note that the value of a node that cannot be posted to is calculated exclusively as the aggregation of the values of the children of that node.

2.3.1.2 Root

All nodes that do not have parents are called roots. Usually, a hierarchy has only one root.

2.3.1.3 Hierarchy Level

All the nodes that are at the same distance from the root form a hierarchy level. The roots of the hierarchy have level 1. The level of a node measures the distance between that node and its root.

2.3.1.4 Leaf

A leaf is an object that is a characteristic value of the base characteristic, and can therefore have values in the fact table. In contrast to nodes that can be posted to, however, a leaf cannot have any children.

2.3.1.5 Interval

An interval describes a set of leaves by its lower and upper limits. New characteristic values that fall within these limits are arranged in the interval automatically.
Intervals in hierarchies must be allowed during InfoObject maintenance. Intervals can be modeled in hierarchies when many leaves (which together form an interval) are assigned to a node. Examples are cost element hierarchies or quarterly hierarchies that are described by intervals. When intervals are involved, please note that these intervals are resolved in the presentation hierarchy and do not have any texts of their own.

2.3.1.6 Balanced and Unbalanced Hierarchies

[Figure: balanced hierarchy and unbalanced hierarchy. SAP AG 2000 (Joachim Mette)]

A typical balanced hierarchy is a geographic hierarchy with levels such as continent, country, state, region, and city.

A special type of unbalanced hierarchy might occur if different sources can provide information only at different levels (for example, one source can deliver data at material level while the other delivers data only at material group level).

Remarks

- The fact table contains values exclusively for leaves and the nodes that can be posted to. The values of all other nodes are determined by aggregating the leaves and nodes that can be posted to.
- A hierarchy is uniquely identified by its name, version, and key date.
- Every node (node name, InfoObject) must be unique (on a given key date).

3 Using Hierarchies in a Query

3.1 Presentation Hierarchy

When a hierarchy is used as a presentation hierarchy, the breakdown in the rows corresponds to the hierarchical arrangement of the characteristic values. The value of a key figure is displayed first for the uppermost node of the hierarchy, followed by all the children of that node, which you can expand down to the individual leaf level. The values of the nodes in a level are calculated by aggregating all the nodes that can be posted to and the leaves of the underlying levels. Only nodes that have values are displayed.
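The aggregation rule described above can be illustrated with a small Python sketch. It is not SAP code: the hierarchy, the fact values, and the names CHILDREN, FACTS, members, node_value, and displayed_nodes are all hypothetical and chosen only for illustration. The sketch assumes that any node with an entry in FACTS is a leaf or a node that can be posted to, and it collects the characteristic values below a node in a set, so that a leaf appearing under several nodes is counted only once in a higher-level node.

```python
from typing import Dict, List, Set

# Hypothetical hierarchy (node -> children). USA, CANADA and MEXICO
# appear twice: once geographically, once under the economic node NAFTA.
CHILDREN: Dict[str, List[str]] = {
    "WORLD": ["NORTH_AMERICA", "EUROPE", "NAFTA"],
    "NORTH_AMERICA": ["USA", "CANADA", "MEXICO"],
    "EUROPE": ["GERMANY"],
    "NAFTA": ["USA", "CANADA", "MEXICO"],
}

# Hypothetical fact values per characteristic value (no facts for MEXICO).
FACTS: Dict[str, float] = {"USA": 100.0, "CANADA": 50.0, "GERMANY": 80.0}


def members(node: str) -> Set[str]:
    """Return the characteristic values with facts at or below `node`.

    A set is used so that a value reachable along several paths is
    collected only once.
    """
    found: Set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in FACTS:  # leaf or node that can be posted to
            found.add(current)
        stack.extend(CHILDREN.get(current, []))
    return found


def node_value(node: str) -> float:
    """Aggregate the fact values of all members below (and at) `node`."""
    return sum(FACTS[value] for value in members(node))


def displayed_nodes() -> List[str]:
    """Only nodes that have values are displayed."""
    names = set(CHILDREN) | {c for kids in CHILDREN.values() for c in kids}
    return sorted(name for name in names if members(name))
```

In this sketch, node_value("WORLD") adds USA and CANADA once even though both sit under NORTH_AMERICA and NAFTA, and MEXICO does not appear in displayed_nodes() because it has no values.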
An additional breakdown by a further characteristic in the rows is not possible (with the exception of batch printing).

You select a presentation hierarchy for a characteristic when you define a query. Use the context menu (right mouse button) to maintain the “Properties” of the characteristic. There are a number of options for selecting a “presentation hierarchy” at the bottom of the window. You first select the “Hierarchy name”, then the “Version”, and then the “Key date” for evaluating the hierarchy. Lastly, you can specify the “Start expand level”, that is, the number of levels to display when running the query. If you do not want the presentation hierarchy to start with the root node, you can restrict the characteristic to a node of the hierarchy in the rows of the BEx Analyzer workbook. In this case, the start level is counted from this node. You have to set the “Active” flag for a hierarchy before you can evaluate it in the BEx Analyzer workbook.

You can also specify variables when you select a hierarchy. The following technical InfoObjects are available for this purpose:

- 0HIER_VERS: Hierarchy version
- 0DATE: Calendar day

Starting in BW Release 2.0, you can also change the position of a hierarchy root node in the workbook. This makes a bottom-up evaluation possible. To select this function, use the context-sensitive menu in the workbook. You can define these settings as default values in the hierarchy definition.

Another new feature in BW Release 2.0 enables you to hide nodes that only contain a single leaf.

3.2 Hierarchy Filter Function

A node in a hierarchy can be used for selection like a characteristic value, as a filter in a query. Both the node and the hierarchy can be determined through variables.
Example: Sales and order quantity for customers in Asia. The key figures “Sales” and “Order quantity” are selected with a filter on the node “Asia”, and the “Material hierarchy” is displayed as the presentation hierarchy for the analysis.

The following options are available for restricting to a hierarchy node:

- Setting a filter in the query
- Limiting the characteristic to one hierarchy node in the row/column definition
- Limiting a structure element to one hierarchy node
- Creating a restricted key figure that is limited to one hierarchy node

You can use a different hierarchy to present the data in all these cases.

4 Special Properties of Hierarchies

4.1 Node Name

A node is identified by its name (or by the InfoObject that acts as the node). This means the node name must be unique in the hierarchy. If a node appears several times in a hierarchy, you must set the “link indicator”. This causes the node to inherit the entire substructure of the original node. If both nodes appear in the same place in the hierarchy, the aggregation is corrected accordingly.

[Figure: hierarchy with a geographical and an economic structure]

The hierarchy shown above contains an economic structure in addition to the geographical structure. The countries USA, Canada, and Mexico are grouped together under NAFTA. These countries are also assigned to the respective continents, with the corresponding sales figures. Please note that values from nodes/leaves that are assigned several times are only counted once, and are not added several times.

4.2 Version

You sometimes have to analyze a characteristic under different hierarchical structures, for example, the cost center hierarchy in both actual and plan versions for simulation purposes. From the technical perspective, you can use two different hierarchy structures with two different names to achieve this. Alternatively, you can assign one name and use two versions to model the “actual” and “planned” variants.
This enables you, for example, to define a cost center report in which the name of the cost center hierarchy is predefined and a variable is defined for the “version” (parameter variable for InfoObject “0HIER_VERS”). Then, when the end user runs the query, he or she only has to choose a version in order to select the correct cost center hierarchy.

4.3 Time Dependent Hierarchies

Time dependent hierarchies were not taken into account in the information above. Until now, the structure of a hierarchy always represented the present status. If you change the hierarchy, for example, by moving a leaf to another node, then all the key figures are displayed for the modified structure. To model the historical view, you either have to model the hierarchy as time-dependent or create a second version.

Please note that there is no link between the temporal component of the data (document date of the key figure) and the validity interval of a node/leaf within the hierarchy structure. The data is evaluated in the hierarchy structure that is valid on the key date chosen for the hierarchy. The key date of the query (global query properties) and the key date of the hierarchy are completely independent of one another.

Example: The fact table contains two entries for Sales Rep. #1002 (Greg Hunter):

| Sales Rep. | 0CALDAY    | Currency | Revenue |
| 1002       | 01.01.1999 | USD      | 400.00  |
| 1002       | 01.01.2000 | USD      | 300.00  |

In the hierarchy defined with a “time-specific structure”, Sales Rep. Greg Hunter is assigned to sales region “USA” from January 1, 1980 to December 31, 1999, and to sales region “Canada” from January 1, 2000 to December 31, 9999.

The query is evaluated in three ways:

1. Hierarchy key date December 31, 1999
2. Hierarchy key date December 31, 2000
3. Hierarchy key date December 31, 2000 and filter value for 0CALDAY = January 1, 2000

This means that if the time characteristic is not restricted, all the values of a node/leaf in the fact table are aggregated under the hierarchy structure that is valid as of the selected key date.

4.3.1 Key Date Evaluation

When time-specific hierarchies are involved, the query definition can include a key date, in addition to the hierarchy name and version, on which the hierarchy will be evaluated. If you do not define a key date for a hierarchy, it will be evaluated on the key date that is defined in the query properties (menu item in the Query Builder). If no key date has been defined in the query properties, then the execution date of the query applies for the hierarchy.

You can also use a variable to determine the key date. This enables the user to choose a key date for query analysis (or have the system determine one automatically).

4.3.2 Time Dependency: Technical Background

For performance reasons, time dependency has been implemented in two different ways. There is no difference between the two options in the results of the query.

1. Time-specific hierarchy name
   The key (identifier) of the hierarchy structure, which consists of the name and version, is extended with the validity interval. The valid hierarchy structure is determined from the name and the key date. This type of time dependency is called “time-specific hierarchy name” in InfoObject maintenance.

2. Time-specific hierarchy structure
   Each parent-child relationship has a defined time interval for which this relationship is valid. In this variant, the hierarchy name is used to determine a structure, which is then evaluated with the key date. This type of time dependency is called “time-specific hierarchy structure” in InfoObject maintenance.
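The Sales Rep. example from section 4.3 can be replayed with a short Python sketch of a time-specific hierarchy structure. This is an illustration under stated assumptions, not the BW implementation: the names Edge, EDGES, FACTS, resolve_key_date, and revenue_by_region are hypothetical, and the data is limited to the two fact rows and the two validity intervals given in the example. The function resolve_key_date mirrors the fallback order described in section 4.3.1 (hierarchy key date, then query key date, then execution date).

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class Edge:
    """One parent-child relationship with its validity interval."""
    child: str
    parent: str
    valid_from: date
    valid_to: date


# Hierarchy assignments of Sales Rep. 1002 from the example.
EDGES = [
    Edge("1002", "USA", date(1980, 1, 1), date(1999, 12, 31)),
    Edge("1002", "Canada", date(2000, 1, 1), date(9999, 12, 31)),
]

# Fact table rows from the example: (sales rep, 0CALDAY, currency, revenue).
FACTS = [
    ("1002", date(1999, 1, 1), "USD", 400.00),
    ("1002", date(2000, 1, 1), "USD", 300.00),
]


def resolve_key_date(hierarchy_key_date: Optional[date],
                     query_key_date: Optional[date],
                     today: Optional[date] = None) -> date:
    """Hierarchy key date first, then query key date, then execution date."""
    if hierarchy_key_date is not None:
        return hierarchy_key_date
    if query_key_date is not None:
        return query_key_date
    return today if today is not None else date.today()


def revenue_by_region(key_date: date,
                      calday: Optional[date] = None) -> Dict[str, float]:
    """Aggregate revenue under the structure that is valid on `key_date`.

    The date of each fact row plays no role in choosing the structure;
    it only matters if `calday` is passed as an explicit filter.
    """
    parent_of = {e.child: e.parent for e in EDGES
                 if e.valid_from <= key_date <= e.valid_to}
    totals: Dict[str, float] = {}
    for rep, day, _currency, revenue in FACTS:
        if calday is not None and day != calday:
            continue  # row excluded by the 0CALDAY filter
        region = parent_of.get(rep)
        if region is not None:
            totals[region] = totals.get(region, 0.0) + revenue
    return totals
```

With key date December 31, 1999 the sketch places both fact rows under “USA”; with key date December 31, 2000 it places both under “Canada”; adding the filter 0CALDAY = January 1, 2000 leaves only the second row under “Canada”.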
4.3.3 Recommendation for Using Time-dependent Hierarchies

For hierarchy structures that are constant for longer periods of time (such as a month or more), we recommend modeling the time dependency in the hierarchy name. In this case, aggregates can be used for the hierarchy. For example, if you extract the hierarchy structure from the OLTP system monthly, and you want to record a history, you can model the time dependency in the hierarchy name.

When a time-specific hierarchy structure is involved, the structure is evaluated at the query runtime. In this case, only the aggregates that are grouped by the characteristic values can be used. You cannot use any aggregates that contain pre-summarized data in the levels.

5 Aspects of Data Modeling

As an alternative to using external hierarchies, you can also model a hierarchical structure in the dimensions or via the attributes of a characteristic. The following aspects can help you decide which data model to use:

| Hierarchy | Hierarchy in dimension | Hierarchy via attributes |
| The “as posted” view is not possible | Only the “as posted” view is possible | The “as posted” view is not possible |
| Different views of the data (such as version, key date) are possible | | Different views of the data are not possible; only the view that is valid in the characteristic master record on the evaluation date |
| Hierarchy applies to the InfoObject, and can therefore be evaluated for the InfoObject for all InfoCubes | The hierarchy can only be evaluated in the InfoCube | The hierarchical structure is modeled in the master data, and can therefore be evaluated for the InfoObject for all InfoCubes |
| Aggregates must be used to achieve high-performance analyses | High-performance analyses are possible without using aggregates | Aggregates must be used to achieve high-performance analyses |
| Drill-down path is predefined by the hierarchy structure | The drill-down path is not predefined, which means you can skip levels | The drill-down path is not predefined, which means you can skip levels |
| Non-leveled hierarchies are possible | The different attributes of the dimension correspond to the levels of the hierarchy, which means only leveled hierarchies are possible | The different attributes of the dimension correspond to the levels of the hierarchy, which means only leveled hierarchies are possible |
| Duplicate leaves are taken into account wherever they appear | Duplicate leaves are only taken into account “as posted” | Duplicate leaves are not possible |
| The hierarchy can be changed quickly | Reorganization is not possible without reloading the InfoCube | Reorganization is possible by selecting additional attributes |

6 Performance

6.1 Read Mode for a Query

You can set the read mode per query or for all queries in the transaction RSRT (Query Monitor). BW supports the following three read modes:

1. Reading all of the data: When the query is executed in the Business Explorer, all of the fact table data that is needed for all possible navigational steps in the query is read into the main memory area of the OLAP processor. All new navigational states are therefore aggregated and calculated from the data in main memory.

2. Reading the data on demand: The OLAP processor only requests the fact table data that is needed for each navigational state of the query in the Business Explorer. New data is therefore read for each navigational step. The most suitable aggregate table is used and, if possible, the data is already aggregated in the database. The data for identical navigational states is buffered in the OLAP processor.

3. Reading on demand when expanding the hierarchy: When reading data on demand (2), the data for the entire (that is, completely expanded) hierarchy is requested for a hierarchy drilldown. For the read on demand when expanding the hierarchy (3), the data is aggregated by the database along the hierarchy and is sent to the OLAP processor at the start level of the hierarchy (highest node).
When a hierarchy node is expanded, the children of the node are then read on demand.

In general, reading the data on demand (2) provides much better performance than reading all the data (1). This read mode should especially be considered for queries with many free characteristics. A query that contains two or more free characteristics from different dimensions (such as “Customer” and “Product”) will probably only be efficiently executable in this mode, as the aggregates can only be optimally used when reading the data on demand.

For large hierarchies, aggregates should be created at a middle level of the hierarchy, and the start level of the query should be less than or equal to this aggregate level. For queries on such large hierarchies, the read mode “reading on demand when expanding the hierarchy” (3) should be set.

6.2 Size of a Hierarchy

For performance reasons, a hierarchy should not contain more than 100,000 leaves. If the hierarchy grows larger, you should add a level that can be used as a navigation attribute or as a separate characteristic in the dimension.

Still, we recommend creating one large hierarchy in the system instead of several small hierarchies. In this case, you can use variables to restrict the large hierarchy user-specifically.

6.3 Performance Involving Multiple Use of a Hierarchy within a Query

Hierarchy nodes can be used to select any kind of object (restricted key figure, row, ... see chapter 3.2). If you restrict a structure element to a hierarchy node of a characteristic, and then break down the rows by the same characteristic, this will result in poor performance. In this case, all the nodes have to be resolved down to leaf level and selected individually from the database.

7 FAQ

Q: How can you set a focus on a partial hierarchy?

A: If you restrict a characteristic to one hierarchy node in the rows (you can also use a variable to do this), and the hierarchy is also selected as the presentation hierarchy, only the sub-tree below the selected node is displayed in the Excel list.

8 Technical Details

8.1 Loading Hierarchies from a File

The loading of hierarchies from files is described in the Accelerator “Loading InfoSource Data via Flat Files”.

8.2 Hierarchies and Authorizations

Authorizations involved with hierarchies are described in the Accelerator “Authorizations and Roles”.

8.3 Saving Hierarchy Information in the BW

When you activate a hierarchy, the hierarchy structure is translated to the inclusion table. If a hierarchy with a time-specific structure is involved, and the hierarchy is not available for the requested key date, the inclusion table is generated at the query runtime (this information is contained in table RSRHIEDIR_OLAP).

When you run a query with a hierarchy, in the first step, the inclusion table is used to resolve the nodes to the extent necessary, and the results are buffered in a temporary table. During the actual reading, this table is joined with the corresponding dimension (or SIDTab).

Technically, the hierarchy definition is saved in the tables listed below (“merkmal”, German for “characteristic”, stands for the name of the characteristic):

- /BIC/Hmerkmal: hierarchy table; administration information is contained in table RSHIEDIR, which maps hierarchy name, version, and key date to HIEID (Char25)
- /BIC/Imerkmal: inclusion table, child - parent (Succ, Pred) in SID; administration information is contained in table RSRHIEDIR_OLAP, which maps HIEID, key date, and Objvers to HIESID, SVER
- /BIC/Smerkmal: SID table for leaves (SID >= 0)
- /BIC/Kmerkmal: SID table for nodes (SID < 0)
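The node resolution described in section 8.3 can be pictured with a minimal Python sketch. It is only a conceptual illustration: the real table layouts are not given in this document, so the two-column form of the inclusion table (successor SID, predecessor SID), all SID values, and the names INCLUSION, FACT_ROWS, resolve_to_leaves, and total_for_node are assumptions. The sketch relies only on the convention stated above that leaves have SIDs greater than or equal to zero and nodes have negative SIDs.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical inclusion table rows: (successor SID, predecessor SID).
# Negative SIDs are nodes, SIDs >= 0 are leaves. Leaf 10 is assigned
# to two different nodes (-2 and -3).
INCLUSION: List[Tuple[int, int]] = [
    (-2, -1), (-3, -1),
    (10, -2), (11, -2),
    (12, -3), (10, -3),
]

# Hypothetical fact rows keyed by the leaf SID: (SID, key figure value).
FACT_ROWS: List[Tuple[int, float]] = [
    (10, 100.0), (11, 50.0), (12, 25.0), (13, 5.0),
]


def resolve_to_leaves(node_sid: int,
                      inclusion: List[Tuple[int, int]]) -> Set[int]:
    """Resolve a node to the set of leaf SIDs below it.

    The result plays the role of the buffered temporary table: a set of
    SIDs that can then be joined with the fact data.
    """
    children: Dict[int, List[int]] = defaultdict(list)
    for succ, pred in inclusion:
        children[pred].append(succ)

    leaves: Set[int] = set()
    stack = [node_sid]
    while stack:
        sid = stack.pop()
        if sid >= 0:  # characteristic value (leaf or postable node)
            leaves.add(sid)
        stack.extend(children.get(sid, []))
    return leaves


def total_for_node(node_sid: int) -> float:
    """Join the resolved leaf SIDs with the fact rows and aggregate."""
    leaves = resolve_to_leaves(node_sid, INCLUSION)
    return sum(value for sid, value in FACT_ROWS if sid in leaves)
```

Because the resolved SIDs are held in a set, leaf 10 contributes once to the total of node -1 although it is assigned to two nodes, and the fact row for SID 13, which is not part of the hierarchy, is never selected.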