Version 6.0 – April 4, 2006

SAP NetWeaver Business Intelligence Unicode-compliance
Product Management NetWeaver BI, SAP AG

Contents
- Unicode in General
- Excursus: MDMP
- Unicode support of SAP NetWeaver BI
- Interfaces to SAP Systems
- Interfaces to non-SAP Systems

SAP AG 2004, Unicode@BW, Product Management NetWeaver BI

Unicode Essentials

What is Unicode?
- Character encoding scheme for (nearly) all characters used worldwide
- Each character has a unique number ("Unicode code point")
- Notation: U+nnnn (where nnnn are hexadecimal digits)
- See http://www.unicode.org for complete code charts

Representation of Unicode Characters

UTF-16: Unicode transformation format, 16-bit encoding
- Fixed length: 1 character = 16 bits; surrogates use 2 × 16 bits
- Platform-dependent byte order
- 16-bit alignment restriction (needs even addresses)

UTF-8: Unicode transformation format, 8-bit encoding
- Variable length: 1 character = 1 to 4 bytes
- Platform independent
- No alignment restriction

Character  Unicode code point  UTF-16 big endian  UTF-16 little endian  UTF-8
a          U+0061              00 61              61 00                 61
ä          U+00E4              00 E4              E4 00                 C3 A4
α          U+03B1              03 B1              B1 03                 CE B1
㑹         U+3479              34 79              79 34                 E3 91 B9

Transparent Unicode Enabling of SAP R/3

- Character expansion model
- Separate Unicode and non-Unicode versions of SAP R/3
  - Non-Unicode R/3: ABAP source; 1 character = 1 byte (types C, N, D, T, STRING); non-Unicode kernel; non-Unicode database
  - Unicode R/3: ABAP source; 1 character = 2 bytes (UTF-16) (types C, N, D, T, STRING); Unicode kernel; Unicode database
- No explicit Unicode data type in ABAP
- Single ABAP source for Unicode and non-Unicode systems
- Automatic conversion of character data for communication between Unicode and non-Unicode systems

Implications

You are not interested in Unicode?
- No customer is forced to convert to Unicode
- Programs that are not Unicode-enabled may need minor changes (more restrictive syntax check as of WebAS 6.10)

You are interested in Unicode?
- The major part of ABAP coding is ready for Unicode without any changes
- A minor part of ABAP coding, i.e. some customer-specific ABAP coding, has to be adapted to comply with Unicode restrictions
- SAP will deliver powerful tools to convert your existing system

Resource Requirements for Unicode-compliant BI

What are the additional resource requirements after a Unicode conversion?
- Text fields are usually bigger in a Unicode environment than in a Non-Unicode environment.
- Enhanced functionality requires additional resources, which strongly depend on the given customer scenario.
- In the following, we give a rough estimate of what a Unicode conversion could mean for the resources.

CPUs
- We expect additional SAP NetWeaver BI requirements of roughly 30% more CPU power, like the SAP R/3 requirements.

Main Memory
- We expect additional SAP NetWeaver BI requirements of roughly 50% more memory, like the SAP R/3 requirements.

Disk Storage
- Disk storage depends strongly on the underlying DBMS and the given data model/volume.
- For a significant share of InfoCube data (only numeric keys!), there might not be a significant increase of the DB size.
- For SAP R/3 on ORACLE, tests have resulted in roughly 35% additional disk space.
- For SAP NetWeaver BI, we expect less additional disk space, depending on the scenario.
- Note that after a conversion, the disk size may even decrease because of DB reorganisation.
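The percentages above can be turned into a quick back-of-the-envelope calculation. The following is only a minimal sketch: the uplift values are the ones quoted on the slide, while the `Sizing` class, the function names and the sample figures are hypothetical and chosen purely for illustration. The 35% disk value is the SAP R/3 on ORACLE test result, so for BI it should be read as a cautious upper-end figure rather than a prediction.

```python
from dataclasses import dataclass

# Uplift percentages quoted in the text (rough estimates, not guarantees).
CPU_UPLIFT_PCT = 30     # "roughly 30% more CPU power"
MEMORY_UPLIFT_PCT = 50  # "roughly 50% more memory"
DISK_UPLIFT_PCT = 35    # SAP R/3 on ORACLE test result; BI is expected to need less


@dataclass
class Sizing:
    """Hypothetical container for a system's resource figures."""
    cpu_units: float   # any consistent CPU capacity unit
    memory_gb: float
    disk_gb: float


def scale(value: float, uplift_pct: int) -> float:
    """Increase value by uplift_pct percent."""
    return value * (100 + uplift_pct) / 100


def estimate_unicode_sizing(current: Sizing) -> Sizing:
    """Return a rough post-conversion estimate based on the quoted percentages."""
    return Sizing(
        cpu_units=scale(current.cpu_units, CPU_UPLIFT_PCT),
        memory_gb=scale(current.memory_gb, MEMORY_UPLIFT_PCT),
        disk_gb=scale(current.disk_gb, DISK_UPLIFT_PCT),
    )


if __name__ == "__main__":
    # Sample figures are invented for the example.
    before = Sizing(cpu_units=1000, memory_gb=64, disk_gb=2000)
    print(estimate_unicode_sizing(before))
```

Because the disk requirement depends strongly on the DBMS and the data model, the disk figure in particular should be replaced by a measured value wherever one is available.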
The Unicode Upgrade Project

Preparation
- Set up the Unicode upgrade project
- Analyze data to minimize downtime

Conversion
- To be done during system downtime
- Unload/reload process for small databases
- Minimum downtime tool for large databases (Incremental Migration IMIG)

Post-Conversion
- Set up the system for Unicode
- Start up the Unicode system

MDMP – Definition

What does MDMP mean?
- In the past, SAP supported multiple languages by MDMP
  - MDMP = Multiple Display, Multiple Processing
  - Mixed code page technology depending on logon language
  - Several hundred satisfied customers
- MDMP was an SAP workaround for accessing multiple code pages
  - Defined at a time when users accessed the solution only through the SAPgui front end
  - Little integration into the World Wide Web
- MDMP should not be used anymore for new installations
- MDMP is not supported for SAP NetWeaver BI

MDMP – Connection

How can I connect SAP NetWeaver BI to an MDMP SAP source system?

Language-dependent source tables
- Most text tables include the language key. In an SAP system, the code page can be derived from the language key. This enables the correct conversion of the texts from the source system to SAP NetWeaver BI.

Language-independent source table
- If a text table does not include the language key (e.g. customer texts), these texts have to be converted "manually".
- The code page (or language) has to be defined for EACH record in these language-independent tables, either by appending a new column to the table or by defining the extraction job in the corresponding code page.
- If there is no language information given, the source tables will be converted based on the logon language of the user.

MDMP – Project Based Solution

Why is there no generic solution for the connection of MDMP source systems?
- All language-independent source tables must be checked, and for each record the code page in which it was created must be determined
- In order to derive this code page information, attribute values, organizational assignments or heuristics may help, but there is no generic approach
- Examples:
  - To determine the code page of a customer name, the country in which the customer lives may be used. This is usually not correct if, e.g., a Japanese customer lives in Russia
  - Alternatively, the assignment to a sales organization may help
- For more information see the How-To Guide in SAP Service Marketplace

Design Principles for Source Systems

How can I ensure a smooth Unicode conversion?

The BI system receives the following information from source systems
- Date and timestamp information
- Key figures
- Characteristics by key
- Texts

How should I design the data in the source system?
- Date and timestamp information as well as key figures are just figures; the conversion is not critical
- Characteristic keys (e.g. material number, customer number, brand key, country key) must be designed as 7-bit-US-ASCII characters (common characters). Don’t use other characters as keys or attributes.
  - As a benefit, the permitted character setting in the BW customizing need not be maintained. See OSS note 173241 for more information.
  - Note that maintaining the permitted characters makes you code page-dependent. There might be problems with the conversion to Unicode (e.g. characters like ‘µ’ are lower case letters in Unicode and thus not allowed)
- Texts are ONLY loaded into the text tables of the InfoObjects; transaction data and master data attributes must NOT contain texts.
- Texts are usually language-dependent and must be designed in the source system with an associated language key (even if e.g. the R/3 data model does not provide this).

Unicode-Compliant SAP NetWeaver BI

Unicode-compliance means:
- SAP NetWeaver BI can interpret and display Unicode characters
  - User interface will be shown in local language
  - Business data can use all languages in parallel
- SAP NetWeaver BI can extract data from source systems with specific code pages (Non-Unicode or Unicode)
- SAP NetWeaver BI can extract data from SAP source systems running mixed code pages (MDMP)
- Interfaces to 3rd party systems support correct code page conversion

Unicode Release Planning

Which versions of SAP BW are released for Unicode and when are they available?
- SAP NetWeaver BI Unicode releases
  - SAP BW 3.5 as part of SAP NetWeaver ’04 and all subsequent releases
  - [SAP BW 3.1 Content has been piloted, but it is not shipped anymore]

Are there any platforms that do not support Unicode?
- Yes, the following DB/OS platforms do not support Unicode
  - Informix
  - Reliant Unix

Supported Scenarios – SAP NetWeaver 2004s

Which scenarios does SAP NetWeaver 2004s support?
- BI in SAP NetWeaver 2004s is fully Unicode-compliant
  - Load of data in different code pages
  - Display of all data if front end is configured in Unicode code page
  - All MS Windows-based client tools are Unicode-enabled

Supported Scenarios in SAP NetWeaver ’04

Which scenarios does SAP NetWeaver ’04/SAP BW 3.5 support?
BI Server (backend)
- Load of data in different code pages
- Display of all data in SAP Gui environment

Unicode Web front end
- Display of all data if front end is configured in Unicode code page

Restrictions for MS Windows-based client tools (BEx Excel front end, Web Application Designer, Query Designer) in SAP NetWeaver ’04 (see also OSS note 588480)
- Object keys must be in US-7-bit-ASCII to ensure access to keys in all logon languages / code pages
- The restrictions only apply to texts; transactional data is not affected.
- The client is configured in one Non-Unicode code page. Note that the Windows default code page must also be switched to this Non-Unicode code page.
- Only the data in the logon code page are displayed correctly:
  - I.e. a Russian employee can see all Russian texts, but not the German ones (‘#’ characters will be displayed for special German characters), whereas the German employee can see the German texts (but not the Russian ones)
  - English texts (US-7-bit-ASCII) are displayed correctly
- A user must not change and save queries, work books, query elements or web templates with Non-US-7-bit-ASCII (non-English) texts that were originally created in a different code page. For example, a query containing Russian text elements should only be changed in the Russian code page.

Front end Restrictions in SAP BW 3.5 in Detail – 1 –

Restrictions for MS Windows-based client tools: example BEx Excel Frontend
- Logon language Russian: German characters corrupted
- Logon language German: Russian characters corrupted

No restrictions in Web Frontend
- Correct characters from both code pages, independent of logon language

Front end Restrictions in SAP BW 3.5 in Detail – 2 –

BEx Analyzer Excel front end
- Note that the corrupted text is a pure display matter; the data in the database is always correct!
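The display effect described for front ends bound to a single non-Unicode code page can be imitated in a few lines. This is only an illustrative sketch, not the front end's actual rendering logic: the function name is hypothetical, and the Windows code pages `cp1251` (Cyrillic) and `cp1252` (Western European) are assumed here as stand-ins for a "Russian" and a "German" client, since the slides do not name specific code pages.

```python
def display_in_code_page(text: str, code_page: str) -> str:
    """Imitate a front end that can only show characters of one code page.

    Every character the code page cannot represent is shown as '#'.
    The underlying text is not modified; only the displayed form differs.
    """
    shown = []
    for ch in text:
        try:
            ch.encode(code_page)   # representable in this code page?
            shown.append(ch)
        except UnicodeEncodeError:
            shown.append("#")      # placeholder for an unknown character
    return "".join(shown)


if __name__ == "__main__":
    texts = ["Straße", "Москва", "Material"]
    for code_page in ("cp1251", "cp1252"):  # assumed Russian / German clients
        for text in texts:
            print(code_page, text, "->", display_in_code_page(text, code_page))
```

The pure US-7-bit-ASCII text comes through unchanged on both clients, which is the reason the slides recommend ASCII-only keys and English texts for shared query objects.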
Query Designer / Web Application Designer
- In order to avoid corrupted text being written back to the database, organizational measures must be put in place. There are basically two options
  - Queries, query elements, work books and web templates must only contain English texts (US-7-bit-ASCII)
  - Queries, query elements, work books and web templates containing texts (also hidden ones, e.g. in restricted key figures or variables) in a specific language / code page must not be changed by users in a different code page

RTL (Right-to-Left) Languages
- The Windows frontend components (BEx Analyzer, Query Designer, Web Application Designer) do not support RTL languages (like Hebrew, Arabic). However, BEx Web does support them.

Keys in US-7-bit-ASCII
- Keys (of any InfoObject) must be in US-7-bit-ASCII to enable users in every code page to enter the key in a search criterion; if Russian characters were allowed in keys, a German user could never enter these and, hence, never select these

Supplementation

Business Scenario
- You run your SAP NetWeaver BI in different languages, but the availability of translations for all languages cannot be guaranteed. Hence, you want to define a standard language. This standard language is used to generate all missing translations.
- Example: you want to use English material texts in case the Japanese ones are not available.

Supplementation
- The text tables can be supplemented with a standard language.

Usage
- Start in transaction SMLT and check, by double-clicking on the (target) language (e.g. Japanese), whether the supplementation language has been maintained (e.g. English).
- Then use the menu path Language → Special Actions → Supplementation (Expert) in order to select all affected tables (usually all language-dependent /BIC/T* and /BI0/T* tables; those tables are cross-client class A tables)
- If you want to run supplementation periodically, you can use report RSTLAN_SUPPLEMENT_PERIODIC (parameter: preceding supplementation run) and include it in your process chain.
- Check OSS note 111750 for supplementation of German language

BW Unicode Installation and Conversion

How can I make my BI Unicode-compliant?

Delivery
- You can choose between Non-Unicode and Unicode installation
- Note: Unicode installation requires more hardware resources (depending on DB platform)

Installation modes
- New Install
- Conversion of an existing SAP NetWeaver BI
  - Before the conversion, upgrade your SAP BW to at least SAP BW 3.5 / SAP NetWeaver ’04
  - R3LOAD converts an existing SAP NetWeaver BI (’04 or 2004s) automatically by exporting the DB, realigning the DB and importing the DB again
  - Note that the Unicode Conversion is a pilot project as part of a BI System Copy (see note 543715)
  - Customer-developed programs (variable exits, virtual characteristics/key figures, transformation rules, table interface etc.) must be in line with the Unicode rules
  - The duration of a conversion depends on the size of the existing database

Interfaces to SAP Source Systems

Which SAP Source System Types exist?
- Single code page systems
  - One single code page can include several languages (e.g., Latin-1 contains German, French, Spanish, Italian etc.)
  - A Unicode source system is also a single code page system
- Mixed code page systems
  - SAP source systems with several code pages (usually: MDMP)

How does the conversion work?
- RFC calls are enhanced by source and target code page parameters; they transform characters "on the fly" to the correct code page
- The RFC call derives the source code page information from the language key
- Note: English texts (US-7-bit-ASCII) are converted correctly in ALL code pages
- For mixed code page systems, language information on RECORD level defines the conversion: PROJECT based solution

Connection Scenarios – 1 –

Which systems can I connect to?

Overview of the supported connections

                         SAP BW 3.5 Unicode       SAP BW 3.x                 SAP BW 2.x
                         and following releases   Non-Unicode                Non-Unicode
SAP source Unicode       ok                       ok, but some information   Project! Some information
                                                  might be lost;             might be lost;
                                                  not recommended            not recommended
SAP source non-Unicode   ok                       ok                         ok
SAP source MDMP          Project!                 ok, but some information   ok, but some information
                                                  might be lost;             might be lost;
                                                  not recommended            not recommended

Connection Scenarios – 2 –

What are the prerequisites for a proper connection?

Unicode SAP BW 3.5 systems and Unicode / Non-Unicode SAP source systems
- SAP NetWeaver BI Service API 3.0B SP3 or PI_BASIS 2002_1_620 SP3 is required to connect the Unicode source system with SAP NetWeaver BI

[Diagram: source system (e.g. R/3, Unicode / Non-Unicode) with Service API 3.0B SP3 or PI_BASIS 2002_1_620 SP3, connected to BI Unicode]

Connection Scenarios – 3 –

Non-Unicode SAP BW 3.x systems and Unicode SAP source systems
- During conversion some information might be lost.
- The following restrictions apply
  - Characters might be corrupted as the target system does not know all Unicode characters
  - In multi-byte code pages (Asian languages), suffix characters could get cut off during the extraction, as the bigger Unicode containers must be mapped to the target non-Unicode containers

[Diagram: source system (e.g. R/3 Unicode) connected to BI Non-Unicode SAP BW 3.x; restrictions apply]

Non-Unicode SAP BW 2.x systems and Unicode SAP source systems
- SAP BW 2.x as a non-Unicode BW system is not generally released to extract data from Unicode source systems; in special scenarios, this connection might be possible on a project basis

[Diagram: source system (e.g. R/3 Unicode) connected to BI Non-Unicode SAP BW 2.x; project solution]

Connection Scenarios – 4 –

What about other interfaces to SAP source systems?
- The same connection scenarios apply for transient interfaces to SAP source systems
  - Report-Report-Interface
  - RemoteCubes
  - Virtual InfoProviders to SAP source systems

Technical Settings for Unicode

Be sure to set up your system according to the following recommendations

Languages
- You can only extract languages to SAP NetWeaver BI that are defined in the basis parameter ZCSA/INSTALLED_LANGUAGES (language vector)

Unicode-compliance of Interfaces

Which source system code pages are supported for Non-SAP sources?
- For Non-SAP sources, it is difficult to determine the source code page.
- In most cases, only English texts (7-bit-US-ASCII) or Unicode texts are converted correctly
- Examples
  - Flat File
  - DB Connect (the same restrictions apply as for all MultiConnect connections, independent of Unicode)
  - UD Connect
  - Virtual InfoProviders to Non-SAP sources

Unicode-compliance of BAPIs and 3rd Party Tools

What about Unicode with respect to BI BAPIs and 3rd Party Tools?

Loading data into BW via Staging BAPI (and 3rd party ETL tools)
- This interface is released for the following sources (see OSS note 765543):
  - Single-byte code pages
  - Unicode code pages UTF-16
  - All other code pages, but only for transfer structures of 250 characters or fewer
- This interface is not released for the following sources:
  - Multi-byte code pages with variable byte length (i.e. Asian code pages and UTF-8) AND a transfer structure larger than 250 characters

Displaying data in a 3rd party front end via OLAP BAPI, OLE DB for OLAP or XMLA
- These interfaces are currently released as pilots for Unicode

ETL or frontend partners
- Ask our partners for Unicode-compliance of their products
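The release conditions listed for the Staging BAPI can be written down as a small decision function. This is a sketch of one reading of the slide, not an SAP API: the enum, the function name and the constant are hypothetical, and OSS note 765543 remains the authoritative source for the actual release status.

```python
from enum import Enum

# Size limit quoted on the slide for "all other code pages".
MAX_TRANSFER_STRUCTURE_CHARS = 250


class SourceEncoding(Enum):
    """Hypothetical classification of the source system's encoding."""
    SINGLE_BYTE = "single-byte code page"
    UTF_16 = "Unicode UTF-16"
    VARIABLE_MULTI_BYTE = "multi-byte, variable length (Asian code pages, UTF-8)"


def staging_bapi_released(encoding: SourceEncoding,
                          transfer_structure_chars: int) -> bool:
    """Return True if the source falls under the released cases on the slide."""
    if encoding in (SourceEncoding.SINGLE_BYTE, SourceEncoding.UTF_16):
        # Released regardless of transfer structure size.
        return True
    # Variable-length multi-byte sources are released only for small structures.
    return transfer_structure_chars <= MAX_TRANSFER_STRUCTURE_CHARS


if __name__ == "__main__":
    for encoding in SourceEncoding:
        for size in (250, 251):
            print(encoding.value, size, staging_bapi_released(encoding, size))
```

A check of this kind could run before an ETL job is configured, so that an unreleased combination is flagged early instead of surfacing as corrupted texts after the load.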