Understanding DSO

Applies to: SAP NetWeaver BW.

Summary

This document describes DataStore objects (DSOs) and their implementation in detail for beginners in SAP BI. Advanced users will also find many small but often overlooked facts.

Table of Contents

DataStore Object
    Definition
    Use
Standard DataStore Object
Write-Optimized DataStore Objects
DataStore Objects for Direct Update

DataStore Object

Definition

A DataStore object serves as a storage location for consolidated and cleansed transaction data or master data on a document (atomic) level. This data can be evaluated using a BEx query.

A DataStore object contains key fields (such as document number, document item) and data fields that, in addition to key figures, can also contain character fields (such as order status, customer). The data from a DataStore object can be updated with a delta update into InfoCubes (standard) and/or other DataStore objects or master data tables (attributes or texts) in the same system or across different systems.

Unlike multidimensional data storage using InfoCubes, the data in DataStore objects is stored in transparent, flat database tables. The system does not create fact tables or dimension tables.
Use

Overview of DataStore Object Types

Type | Structure | Data Supply | SID Generation
Standard DataStore Object | Consists of three tables: activation queue, table of active data, change log | From data transfer process | Yes
Write-Optimized DataStore Objects | Consists of the table of active data only | From data transfer process | No
DataStore Objects for Direct Update | Consists of the table of active data only | From APIs | No

Standard DataStore Object

Use

The standard DataStore object is filled with data during the extraction and loading process in the BI system.

Structure

A standard DataStore object is represented on the database by three transparent tables:

• Activation queue: Used to save DataStore object data records that need to be updated, but that have not yet been activated. After activation, this data is deleted if all requests in the activation queue have been activated.
• Active data: A table containing the active data (A table).
• Change log: Contains the change history for the delta update from the DataStore object into other data targets, such as DataStore objects or InfoCubes.

The tables of active data are built according to the DataStore object definition. This means that key fields and data fields are specified when the DataStore object is defined. The activation queue and the change log are almost identical in structure: the activation queue has an SID as its key, the package ID and the record number; the change log has the request ID as its key, the package ID, and the record number.

This graphic shows how the various tables of the DataStore object work together during the data load.

Data can be loaded from several source systems at the same time because a queuing mechanism enables a parallel INSERT. The key allows records to be labeled consistently in the activation queue. The data arrives in the change log from the activation queue and is written to the table for active data upon activation.
During activation, the requests are sorted according to their logical keys. This ensures that the data is updated to the table of active data in the correct request sequence.

Example for Activating and Updating Data

The graphic below shows how data is updated in a DataStore object and the effect of the activation step.

1. Request 1 with amount 10 and request 2 with amount 30 are loaded in parallel into the DataStore object. Both requests land in the activation queue, where each is given a unique request ID.
2. When you carry out the activation step, the requests are sorted by key, transferred into the table containing the active data, and immediately deleted from the activation queue. In the table containing the active data, the amount 10 is replaced by 30 (since Overwrite is set as the update type).
3. When you activate the data, the change is also recorded in the change log: the old record from the active table is saved with a negative sign (-10) and the new record is stored with a positive sign (+30).
4. Once all the records are activated, you can update the changes to the data records for the DataStore object in the related InfoProvider in a separate step. The amount in this example is increased in the related InfoProviders by 20.

0RECORDMODE

Upon activation of a Standard DSO, SAP NetWeaver BW adds the 0RECORDMODE InfoObject to the definition of the Standard DSO and to all three tables of the Standard DSO. This InfoObject is used internally by SAP NetWeaver BW. You can overwrite the existing record for the same semantic key field combination, in addition to adding key figure values for the record with the same semantic key field combination.

SAP Business Content offers DataSources for a number of standard business processes. The DataSource field ROCANCEL, for example, is mapped to the 0RECORDMODE InfoObject in SAP NetWeaver BW.
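The activation example above (request 1 with amount 10, request 2 with amount 30, update type Overwrite) can be sketched in a few lines of Python. This is only an illustrative in-memory model, not SAP code: the class `StandardDsoSketch`, its method names, and the key value `DOC1` are hypothetical, and the use of the record modes `N`, `X` and blank in the change log is an assumption based on the ROCANCEL values described in this document.

```python
from dataclasses import dataclass, field


@dataclass
class StandardDsoSketch:
    """Toy in-memory model of a standard DSO with update type Overwrite."""
    activation_queue: list = field(default_factory=list)  # (request_id, key, amount)
    active_data: dict = field(default_factory=dict)       # key -> amount
    change_log: list = field(default_factory=list)        # (request_id, key, record_mode, amount)

    def load(self, request_id, key, amount):
        """Loading only appends to the activation queue."""
        self.activation_queue.append((request_id, key, amount))

    def activate(self):
        """Move queued records to the active data and write the change log."""
        # Sort by logical key, then by request, so later requests overwrite earlier ones.
        for request_id, key, amount in sorted(
            self.activation_queue, key=lambda rec: (rec[1], rec[0])
        ):
            old = self.active_data.get(key)
            if old is None:
                # Assumption: the first record for a key is logged as a new image ("N").
                self.change_log.append((request_id, key, "N", amount))
            else:
                # Before image ("X") with reversed sign, then after image (blank).
                self.change_log.append((request_id, key, "X", -old))
                self.change_log.append((request_id, key, "", amount))
            self.active_data[key] = amount
        # Activated requests are removed from the activation queue.
        self.activation_queue.clear()

    def delta_for_request(self, request_id):
        """Net change per key that one request contributes to a data target."""
        totals = {}
        for req, key, _mode, amount in self.change_log:
            if req == request_id:
                totals[key] = totals.get(key, 0) + amount
        return totals


dso = StandardDsoSketch()
dso.load(1, "DOC1", 10)
dso.load(2, "DOC1", 30)
dso.activate()
print(dso.active_data)           # active table after activation
print(dso.change_log)            # before and after images
print(dso.delta_for_request(2))  # net change contributed by request 2
```

In this sketch the active table ends up holding 30 for the key, and the change log entries written for request 2 (-10 and +30) add up to the net increase of 20 that is passed on to the related InfoProviders.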
The combination of the update mode set in the transformation, along with the value of the 0RECORDMODE InfoObject, helps SAP NetWeaver BW properly treat the incoming record in the active data and change log tables.

The following are the values for the field ROCANCEL and the meaning that they communicate about the record.

Value | Meaning
BLANK | The record provides an after image.
X | The record provides a before image.
A | The record provides an additive image.
D | The record must be deleted.
R | The record provides a reverse image.
N | The record provides a new image.

Designing a Standard DSO

To create a Standard DSO, first go to the RSA1 transaction screen. You will reach the Data Warehousing Workbench (DWW) screen, where you have to select InfoProvider under the Modeling pane on the left side of the screen as shown below.

Now right-click on your InfoArea and select Create DataStore Object from the context menu as shown below.

You will get the following window, in which you fill in the technical name and description of the DSO to be created. SAP also provides an option of copying the entire DSO structure from another DSO using the Copy From text field. This structure can be modified later. In this case, we will be building the DSO from scratch.

The filled-in details are shown below. Press the Create button to continue.

You will reach the following screen for editing the DSO.

Settings in Standard DSO

As you can see, the following settings are available in a Standard DSO. Each of them is explained in detail below.

Type of DataStore Object

By default, the DSO is created as a standard type. This can be changed by clicking on the Change icon. You will get the following pop-up. Since we are creating a Standard DSO, we will leave these settings unchanged and return to the previous screen.
SID Generation upon Activation

When checked (the default), the SID Generation upon Activation box causes the system to generate an integer number known as a Surrogate ID (SID) for each master data value. These SIDs are stored in separate tables called SID tables. For each characteristic InfoObject, SAP NetWeaver BW checks whether an SID already exists in the SID table for each value of the InfoObject. The system generates a new SID if an existing value is not found. The SID is used internally by SAP NetWeaver BW when a query is based on a DSO.

In cases where the Standard DSO is not used for reporting and is used only for staging purposes, it is recommended to uncheck this checkbox.

Unique Data Records

This setting is used when there is no chance that the data being loaded to a standard DSO will create a duplicate record. It improves performance by eliminating some internal processes. If this box is checked and it turns out that there are duplicate records, you will receive an error message. Because of this, you should only select this box when you are sure that you won't have duplicate data.

Set Quality Status to "OK" Automatically

The Set Quality Status to "OK" Automatically flag results in the quality status of the data being set to "OK" after it has been loaded without any technical errors. The status must be set to "OK" before newly loaded data in the standard DSO can be activated. Only activated data can be passed to further data targets.

Activate Data Automatically

Data loaded into standard DSOs is first stored in the activation queue table and is then activated by the activation process. To make this process automatic, you should check this flag.

Update Data Automatically

Activated data available in a standard DSO can be passed to other data targets, such as another DSO or an InfoCube. This process can be automated by setting this flag.
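The SID lookup described under SID Generation upon Activation (check whether a value already has an SID, and generate a new one only if it does not) can be pictured with the following minimal sketch. It is a simplified in-memory illustration rather than the way SAP NetWeaver BW stores SID tables; the names `SidTableSketch` and `generate_sids`, the field names, and the sample values are all hypothetical.

```python
class SidTableSketch:
    """Toy SID table for one characteristic: maps each value to an integer SID."""

    def __init__(self):
        self._sid_by_value = {}
        self._next_sid = 1

    def get_or_create(self, value):
        """Return the existing SID for a value, or generate a new one."""
        sid = self._sid_by_value.get(value)
        if sid is None:
            sid = self._next_sid
            self._sid_by_value[value] = sid
            self._next_sid += 1
        return sid


def generate_sids(records, sid_tables):
    """Return, for each record, the SIDs of its characteristic values."""
    return [
        {name: sid_tables[name].get_or_create(value) for name, value in record.items()}
        for record in records
    ]


# One SID table per characteristic (hypothetical characteristics).
sid_tables = {"customer": SidTableSketch(), "order_status": SidTableSketch()}
records = [
    {"customer": "C100", "order_status": "OPEN"},
    {"customer": "C200", "order_status": "OPEN"},
    {"customer": "C100", "order_status": "CLOSED"},
]
print(generate_sids(records, sid_tables))
```

A value that appears again (such as customer `C100` here) receives the SID it was given the first time, which is why the lookup has to run for every characteristic value during activation and why skipping it speeds up staging-only DSOs.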
Including Key Fields and Data Fields in the DSO

The DSO contains two kinds of fields: key fields and data fields. The combination of key fields uniquely identifies a data record. All other objects can be included as data fields. There are two ways to add InfoObjects to the DSO:

• Using Templates
• Using Direct Input

Both methods are explained with the example below.

Using Templates

Click on the InfoObject Catalog button circled in red below.

The following pop-up opens. Select the InfoArea associated with the InfoObjects you require using the corresponding button.

First, to include the key fields, double-click on the Characteristics catalog.

You will see that the left template pane contains all the characteristic InfoObjects contained in the catalog. Now simply drag and drop the InfoObjects to be added as key fields from the left pane to the Key Fields folder in the right pane.

As you can see below, we have successfully added the key fields.

Now open the InfoObject catalog again and select the Key Figures catalog to add the data fields.

You will see that the left template pane contains all the key figure InfoObjects contained in the catalog. As before, simply drag and drop the InfoObjects to be added as data fields from the left pane to the Data Fields folder in the right pane.

Using Direct Input

Now we will add some new key fields using the InfoObject Direct Input method. You can use this method to add data fields too. We will illustrate the addition of the 0CALWEEK and 0DOC_NUMBER InfoObjects to the key fields. To do this, right-click on Key Fields (highlighted below) and select InfoObject Direct Input from the context menu.

The following pop-up opens. Here you can enter the technical names of the InfoObjects you want to include and press Enter to see their descriptions as shown below. Confirm your entries to continue.
As you can see below, the new key fields have been added successfully.

Navigational Attribute Inclusion

Navigational attributes defined in the included InfoObjects are available for viewing under the Navigational Attributes column. They are included automatically, but you still have to confirm them by selecting the On/Off checkboxes circled in red below. Here we have included the Sales Promotion and Opportunity navigational attributes as shown below.

Final Steps

Our DSO structure design is now complete. We follow the usual routine of Save, Check and Activate. Save the DSO using the Save button, then press the Check button to check for errors.

The following message confirms that there are no errors in the design.

Press the Activate button to activate the DSO. The Object Information menu now shows the DSO as active.

The Standard DSO design is now complete.

Write-Optimized DataStore Objects

Definition

A DataStore object that consists of just one table of active data. Data is loaded using the data transfer process.

Use

Data that is loaded into write-optimized DataStore objects is available immediately for further processing. They can be used in the following scenarios:

• You use a write-optimized DataStore object as a temporary storage area for large sets of data if you are executing complex transformations for this data before it is written to the DataStore object. The data can then be updated to further (smaller) InfoProviders. You only have to create the complex transformations once for all data.
• You use write-optimized DataStore objects as the EDW layer for saving data. Business rules are only applied when the data is updated to additional InfoProviders.

The system does not generate SIDs for write-optimized DataStore objects and you do not need to activate them. This means that you can save and further process data quickly. Reporting is possible on the basis of these DataStore objects.
However, we recommend that you use them as a consolidation layer, and update the data to additional InfoProviders, standard DataStore objects, or InfoCubes.

Structure

Since the write-optimized DataStore object only consists of the table of active data, you do not have to activate the data, as is necessary with the standard DataStore object. This means that you can process data more quickly.

The loaded data is not aggregated; the history of the data is retained. If two data records with the same logical key are extracted from the source, both records are saved in the DataStore object. The record mode responsible for aggregation remains, however, so that the aggregation of data can take place later in standard DataStore objects.

The system generates a unique technical key for the write-optimized DataStore object. The standard key fields are not necessary with this type of DataStore object. If there are standard key fields anyway, they are called semantic keys so that they can be distinguished from the technical keys. The technical key consists of the Request GUID field (0REQUEST), the Data Package field (0DATAPAKID) and the Data Record Number field (0RECORD). Only new data records are loaded to this key.

You can specify that you do not want to run a check to ensure that the data is unique. If you do not check the uniqueness of the data, the DataStore object table may contain several records with the same key. If you do not set this indicator, and you do check the uniqueness of the data, the system generates a unique index in the semantic key of the InfoObject. This index has the technical name "KEY".

Since write-optimized DataStore objects do not have a change log, the system does not create delta (in the sense of a before image and an after image). When you update data into the connected InfoProviders, the system only updates the requests that have not yet been posted.
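The structure described above (a technical key built from request, data package and record number, plus an optional uniqueness check on the semantic key) can be illustrated with a small sketch. It is a hypothetical in-memory model under stated assumptions, not SAP code: the class `WriteOptimizedDsoSketch`, its parameters, and the field name `doc_number` are invented for illustration, and raising an error on a duplicate semantic key is a simplified stand-in for the unique index.

```python
class WriteOptimizedDsoSketch:
    """Toy write-optimized DSO: one table of active data, no activation step."""

    def __init__(self, semantic_key_fields, check_uniqueness=True):
        self.semantic_key_fields = tuple(semantic_key_fields)
        self.check_uniqueness = check_uniqueness
        self.active_data = {}         # (request, data package, record no.) -> record
        self._semantic_index = set()  # stands in for the unique index on the semantic key

    def load_package(self, request_id, data_package_id, records):
        """Write one data package; every record gets its own technical key."""
        new_semantic_keys = []
        if self.check_uniqueness:
            # Validate the whole package first so a failed load writes nothing.
            seen = set(self._semantic_index)
            for record in records:
                semantic_key = tuple(record[f] for f in self.semantic_key_fields)
                if semantic_key in seen:
                    raise ValueError(f"duplicate semantic key: {semantic_key}")
                seen.add(semantic_key)
                new_semantic_keys.append(semantic_key)
        for record_number, record in enumerate(records, start=1):
            technical_key = (request_id, data_package_id, record_number)
            self.active_data[technical_key] = dict(record)
        self._semantic_index.update(new_semantic_keys)


# Without the uniqueness check, both records with the same semantic key are kept.
dso = WriteOptimizedDsoSketch(["doc_number"], check_uniqueness=False)
dso.load_package("REQ1", 1, [
    {"doc_number": "4711", "amount": 10},
    {"doc_number": "4711", "amount": 30},
])
print(dso.active_data)
```

Because every record receives a new technical key, nothing is overwritten or aggregated on load; with `check_uniqueness=True` the second record in this example would be rejected instead of stored.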
Use in BEx Queries

For performance reasons, SID values are not created for the characteristics that are loaded. The data is still available for BEx queries. However, in comparison to standard DataStore objects, you can expect slightly worse performance because the SID values have to be created during reporting.

If you want to use write-optimized DataStore objects in BEx queries, we recommend that they have a semantic key and that you run a check to ensure that the data is unique. In this case, the write-optimized DataStore object behaves like a standard DataStore object. If the DataStore object does not have these properties, you may experience unexpected results when the data is aggregated in the query.

Designing a Write-Optimized DSO

To create a Write-Optimized DSO, first go to the RSA1 transaction screen.

You will reach the Data Warehousing Workbench (DWW) screen, where you have to select InfoProvider under the Modeling pane on the left side of the screen as shown below.

Now right-click on your InfoArea and select Create DataStore Object from the context menu as shown below.

You will get the following window, in which you fill in the technical name and description of the DSO to be created. SAP also provides an option of copying the entire DSO structure from another DSO using the Copy From text field. This structure can be modified later. For simplicity and consistency, we will use the structure of the Standard DSO that we created in the first part of the document.

The filled-in details are shown below. Press the Create button to continue.

You will reach the following screen for editing the DSO. By default, the DSO is created as a standard type. This can be changed by clicking on the Change icon.

You will get the following pop-up. Now select the radio button for Write-Optimized DSO and return to the previous screen.
The DSO has now become a write-optimized one, as seen below.

Settings in Write-Optimized DSO

As you can see, the following settings are available in a Write-Optimized DSO. Each of them is explained in detail below.

Type of DataStore Object

This setting has been explained above while changing the type of DSO from Standard to Write-Optimized.

Do Not Check Uniqueness of Data

By default, this option is not checked. In this case, a unique index called KEY is created with the fields included in the Semantic Key section. While loading the data, the uniqueness of the data is checked with respect to the fields in the semantic key.

If this indicator is checked, the KEY index isn't generated, and the DSO can have several records with the same semantic key value. This significantly improves the loading performance.

Including Key Fields and Data Fields in the DSO

The DSO contains two kinds of fields: the semantic key and the data fields. The combination of semantic keys is responsible for:

• Identifying errors in incoming records or duplicate records.
• Protecting data quality, such that all subsequent records with the same key are written into the error stack along with the incorrect data records.

To process the error records or duplicate records, a semantic group is defined in the DTP.

Note: If we are sure there are no incoming duplicate or error records, semantic groups need not be defined.

All other objects can be included as data fields. As with a Standard DSO, there are two ways to add InfoObjects to the DSO:

1. Using Templates
2. Using Direct Input

We have explained these methods in detail in the first part of the document.

Final Steps

Our DSO structure design is now complete. We follow the usual routine of Save, Check and Activate. Save the DSO using the Save button, then press the Check button to check for errors.

The following message confirms that there are no errors in the design.

Press the Activate button to activate the DSO. The Object Information menu now shows the DSO as active.
The Write-Optimized DSO design is now complete.

DataStore Objects for Direct Update

Definition

The DataStore object for direct update differs from the standard DataStore object in how the data is processed. In a standard DataStore object, data is stored in different versions (active, delta, modified), whereas a DataStore object for direct update contains the data in a single version. Therefore, data is stored in precisely the same form in which it was written to the DataStore object for direct update by the application. In the BI system, you can use a DataStore object for direct update as a data target for an analysis process.

The DataStore object for direct update is also required by diverse applications, such as SAP Strategic Enterprise Management (SEM), as well as other external applications.

Use

DataStore objects for direct update ensure that the data is available quickly. The data from this kind of DataStore object is accessed transactionally, that is, data is written to the DataStore object (possibly by several users at the same time) and reread as soon as possible.

It is not a replacement for the standard DataStore object. Rather, it is an additional function that can be used for special applications.

Structure

The DataStore object for direct update consists only of a table for active data. It retrieves its data from external systems via fill or delete APIs. The load process is not supported by the BI system. The advantage of this structure is that the data is easy to access: it is made available for reporting immediately after being loaded.

Application programming interface (API)

It is important to understand the concept of APIs before understanding a Direct Update DSO.

An application programming interface (API) is a particular set of rules and specifications that software programs can follow to communicate with each other.
It serves as an interface between different software programs and facilitates their interaction, similar to the way the user interface facilitates interaction between humans and computers.

An API can be created for applications, libraries, operating systems, etc., as a way of defining their "vocabularies" and resource request conventions (e.g. function-calling conventions). It may include specifications for routines, data structures, object classes, and protocols used to communicate between the consumer program and the implementer program of the API.

The following APIs exist for filling or deleting data in a DataStore object for direct update:

• RSDRI_ODSO_INSERT: inserts new data (with keys not yet in the system).
• RSDRI_ODSO_INSERT_RFC: same as RSDRI_ODSO_INSERT, but can be called remotely.
• RSDRI_ODSO_MODIFY: inserts data having new keys; for data with keys already in the system, the data is changed.
• RSDRI_ODSO_MODIFY_RFC: same as RSDRI_ODSO_MODIFY, but can be called remotely.
• RSDRI_ODSO_UPDATE: changes data with keys already in the system.
• RSDRI_ODSO_UPDATE_RFC: same as RSDRI_ODSO_UPDATE, but can be called remotely.
• RSDRI_ODSO_DELETE_RFC: deletes data.

Analysis Process Designer

Use

In SAP BW, data from the various databases and systems available in the company is collected, consolidated, managed and prepared for evaluation purposes. There is often further, valuable potential in this data. This potential takes the form of entirely new information: meaningful relationships between data that are too well hidden or too complex to be discovered by simple observation or intuition.

The Analysis Process Designer (APD) makes it possible to find and identify these hidden or complex relationships between data in a simple way. Various data transformations are provided for this purpose, such as statistical and mathematical calculations, and data cleansing or structuring processes.
The analysis results are saved in BW data targets or in a CRM system. They are available for all decision and application processes and thus can be decisive (strategically, tactically, and operatively).

Examples of analysis processes include the calculation of ABC classes, determination of frequency distribution or of scoring information.

Integration

The Analysis Process Designer is the application environment for the SAP data mining solution. The following data mining functions are integrated into the APD:

• Creating and changing data mining models
• Training data mining models with various BW data (data mining model as data target in the analysis process)
• Execution of data mining methods such as prediction with decision tree, with cluster model and integration of data mining models from third parties (data mining model as a transformation in the analysis process)
• Visualization of data mining models

The APD is integrated into the Administrator Workbench.

Restrictions

Integration into the Administrator Workbench has the following restrictions:

• The node texts are not language dependent.
• You can only integrate an analysis process into a process chain using the process type ABAP program. To do this, choose the ABAP report RSAN_PROCESS_EXECUTE.
• Analysis processes are not displayed in the data flow display.
• The where-used list only functions from the analysis process to other objects and from data mining models to the analysis process, but not from other objects such as InfoProviders.

Functions

The Analysis Process Designer is a workbench with an intuitive, graphical user interface for the creation, execution and monitoring of analysis processes. Analysis processes can be created using drag and drop.
Data from different data sources in the BW system can be combined, transformed and prepared for analysis in several individual steps so that it can then be resaved into targets in the BW system (transactional ODS object or InfoObjects with attributes) or in a CRM system. Various data sources, transformations and data targets are available.

Various additional functions support you during modeling and executing an analysis process, as well as during interpretation of the analysis results.

The following graphic shows the various steps in the Analysis Process Designer.

First select a data source that contains the desired data. This data is then prepared and transformed. The transformed data is then saved in a BW object or in another system. For analysis, you can display the data in a query in the Business Explorer.

Versioning

Analysis processes are integrated into the versioning concept (active, inactive version, content version and delivery).

For more details on how the APD works with Direct-Update DSOs under different transformations, refer to my following whitepapers:

http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/90e15fcc-b253-2e10-c4a6-e4593150f890
http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/d00dbb01-a753-2e10-a09b-81d9ad4b862b

Creating DataStore Objects for Direct Update

When creating a DataStore object, you can change the DataStore object type under Settings via the context menu. The default setting is Standard. Switching the DataStore object type between standard and direct update is only possible if no data exists yet in the DataStore object.

Integration

Since DataStore objects for direct update cannot be filled with BI data using staging (data is not supplied from the DataSources), they are not displayed in the scheduler or in the monitor. However, you can update the data of DataStore objects of this type to additional InfoProviders.
If you switch a standard DataStore object that already has update rules to direct update, the update rules are set as inactive and can no longer be processed.

Since no change log is generated, a delta update to the InfoProviders at the end of this process is not possible.

The DataStore object for direct update is available as an InfoProvider in the BEx Query Designer and can be used in reporting.

Designing a Direct-Update DSO

To create a Direct Update DSO, first go to the RSA1 transaction screen. You will reach the Data Warehousing Workbench (DWW) screen, where you have to select InfoProvider under the Modeling pane on the left side of the screen as shown below.

Now right-click on your InfoArea and select Create DataStore Object from the context menu as shown below.

You will get the following window, in which you fill in the technical name and description of the DSO to be created. SAP also provides an option of copying the entire DSO structure from another DSO using the Copy From text field. This structure can be modified later. For simplicity and consistency, we will use the structure of the Standard DSO that we created in the first part of the document.

The filled-in details are shown below. Press the Create button to continue.

You will reach the following screen for editing the DSO. By default, the DSO is created as a standard type. This can be changed by clicking on the Change icon.

You will get the following pop-up. Now select the radio button for Direct Update DSO and return to the previous screen.

The DSO has now become a direct update one, as seen below.

Settings in Direct Update DSO

As you can see, the following settings are available in a Direct Update DSO. Each of them is explained in detail below.

Type of DataStore Object

This setting has been explained above while changing the type of DSO from Standard to Direct Update.
Including Key Fields and Data Fields in the DSO

The DSO contains two kinds of fields: key fields and data fields. The combination of key fields uniquely identifies a data record. All other objects can be included as data fields. As with a Standard DSO, there are two ways to add InfoObjects to the DSO:

• Using Templates
• Using Direct Input

We have explained these methods in detail in the first part of the document.

Final Steps

Our DSO structure design is now complete. We follow the usual routine of Save, Check and Activate. Save the DSO using the Save button, then press the Check button to check for errors.

The following message confirms that there are no errors in the design.

Press the Activate button to activate the DSO. The Object Information menu now shows the DSO as active.

The Direct Update DSO design is now complete.

Related Content

http://help.sap.com/saphelp_sem60/helpdata/en/c0/99663b3e916a78e10000000a11402f/content.htm
http://help.sap.com/saphelp_erp2004/helpdata/en/49/7e960481916448b20134d471d36a6b/content.htm
http://help.sap.com/saphelp_nw70/helpdata/en/c0/99663b3e916a78e10000000a11402f/content.htm
http://en.wikipedia.org/wiki/Application_programming_interface
http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/90e15fcc-b253-2e10-c4a6-e4593150f890
http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/d00dbb01-a753-2e10-a09b-81d9ad4b862b