(Online Analytical Processing). - CIn

Online analytical processing

From Wikipedia, the free encyclopedia.

Online analytical processing, or OLAP (IPA: /ˈoʊlæp/), is an approach to quickly answering multi-dimensional analytical queries.[1] OLAP is part of the broader category of business intelligence, which also encompasses relational reporting and data mining.[2] The typical applications of OLAP are in business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas. The term OLAP was created as a slight modification of the traditional database term OLTP (Online Transaction Processing).[3]

Databases configured for OLAP use a multidimensional data model, allowing for complex analytical and ad-hoc queries with a rapid execution time. They borrow aspects of navigational databases and hierarchical databases that are faster than relational databases.[4]

Nigel Pendse has suggested that an alternative and perhaps more descriptive term for the concept of OLAP is Fast Analysis of Shared Multidimensional Information (FASMI).[5]

The output of an OLAP query is typically displayed in a matrix (or pivot) format. The dimensions form the rows and columns of the matrix; the measures form the values.

Concept

At the core of any OLAP system is the concept of an OLAP cube (also called a multidimensional cube or a hypercube). It consists of numeric facts called measures, which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database.
Measures are derived from the records in the fact table and dimensions are derived from the dimension tables. Each measure can be thought of as having a set of labels, or metadata, associated with it. A dimension is what describes these labels; it provides information about the measure. A simple example would be a cube that contains a store's sales as a measure, and Date/Time as a dimension. Each sale has a Date/Time label that describes more about that sale. Any number of dimensions can be added to the structure, such as Store, Cashier, or Customer, by adding a column to the fact table. This allows an analyst to view the measures along any combination of the dimensions. For example:

Sales Fact Table
+-------------+---------+
| sale_amount | time_id |
+-------------+---------+          Time Dimension
|     2008.08 |    1234 |---+      +---------+-------------------+
+-------------+---------+   |      | time_id | timestamp         |
                            |      +---------+-------------------+
                            +----->|    1234 | 20080902 12:35:43 |
                                   +---------+-------------------+

Multidimensional databases

Multidimensional structure is defined as “a variation of the relational model that uses multidimensional structures to organize data and express the relationships between data” (O'Brien & Marakas, 2009, p. 177). The structure is broken into cubes, and the cubes are able to store and access data within the confines of each cube. “Each cell within a multidimensional structure contains aggregated data related to elements along each of its dimensions” (p. 178). Even when data is manipulated, it remains easy to access and the database stays compact. The data still remains interrelated. Multidimensional structure is quite popular for analytical databases that use online analytical processing (OLAP) applications (O’Brien & Marakas, 2009). Analytical databases use this structure because of its ability to deliver answers quickly to complex business queries.
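The relationship between a fact table, a dimension table and the aggregated cells of a cube, as described above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the first fact row and the first time row come from the example above, while the store column, the month attribute, the remaining rows and the function names build_cube and slice_cube are hypothetical additions made for the sketch.

```python
from collections import defaultdict

# Fact rows: (sale_amount, time_id, store_id).
# Only the first row's amount and time_id come from the example in the text;
# the store column and the other rows are hypothetical.
fact_sales = [
    (2008.08, 1234, "S1"),
    (150.00, 1234, "S2"),
    (99.50, 1235, "S1"),
]

# Time dimension: time_id -> descriptive labels.
# The "month" attribute and the second entry are hypothetical.
dim_time = {
    1234: {"timestamp": "20080902 12:35:43", "month": "2008-09"},
    1235: {"timestamp": "20081015 09:10:00", "month": "2008-10"},
}


def build_cube(facts, time_dim):
    """Aggregate sale_amount into cells keyed by (month, store)."""
    cube = defaultdict(float)
    for amount, time_id, store in facts:
        month = time_dim[time_id]["month"]  # join the fact row to its dimension row
        cube[(month, store)] += amount
    return dict(cube)


def slice_cube(cube, month):
    """Fix the month dimension and return the total sales per store."""
    return {store: total for (m, store), total in cube.items() if m == month}


cube = build_cube(fact_sales, dim_time)
september = slice_cube(cube, "2008-09")
```

Each key of cube is one cell, and its value is the measure aggregated over every fact row that carries those dimension labels. Adding another dimension would mean adding one more column to the fact rows and one more component to the cell key.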
Data can be viewed in different ways, which gives a broader picture of a problem than other models do (Williams, Garza, Tucker & Marcus, 1994).

Aggregations

It has been claimed that for complex queries OLAP cubes can produce an answer in around 0.1% of the time needed for the same query on OLTP relational data.[6][7] The most important mechanism that allows OLAP to achieve such performance is the use of aggregations. Aggregations are built from the fact table by changing the granularity on specific dimensions and aggregating data up along these dimensions. The number of possible aggregations is determined by every possible combination of dimension granularities.

The combination of all possible aggregations and the base data contains the answers to every query which can be answered from the data.[8] Because there are usually many aggregations that can be calculated, often only a predetermined number are fully calculated; the remainder are solved on demand. The problem of deciding which aggregations (views) to calculate is known as the view selection problem. View selection can be constrained by the total size of the selected set of aggregations, the time to update them from changes in the base data, or both. The objective of view selection is typically to minimize the average time to answer OLAP queries, although some studies also minimize the update time. View selection is NP-complete. Many approaches to the problem have been explored, including greedy algorithms, randomized search, genetic algorithms and A* search.

A very effective way to support aggregation and other common OLAP operations is the use of bitmap indexes.

Types

OLAP systems have traditionally been categorized using the following taxonomy.[9]

Multidimensional

MOLAP is the 'classic' form of OLAP and is sometimes referred to as just OLAP. MOLAP stores this data in an optimized multidimensional array storage, rather than in a relational database.
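As a rough illustration of the aggregations discussed above, the kind of precomputed results a multidimensional store might hold, the following Python sketch enumerates every combination of dimension granularities, materializes a chosen subset, and computes the rest on demand. The dimensions, level names, sample rows and function names are all hypothetical, and the subset is picked by hand; this is not a view selection algorithm.

```python
from collections import defaultdict
from itertools import product

# Hypothetical granularity levels per dimension, finest first.
# "ALL" means the dimension is collapsed entirely.
LEVELS = {
    "time": ["day", "month", "ALL"],
    "store": ["store", "region", "ALL"],
}

# Hypothetical base data: each row carries its label at every level.
FACTS = [
    {"day": "2008-09-02", "month": "2008-09", "store": "S1", "region": "North", "amount": 100.0},
    {"day": "2008-09-03", "month": "2008-09", "store": "S2", "region": "North", "amount": 50.0},
    {"day": "2008-10-01", "month": "2008-10", "store": "S1", "region": "North", "amount": 25.0},
]


def possible_aggregations(levels):
    """Every combination of one granularity per dimension."""
    return list(product(*levels.values()))


def aggregate(facts, time_level, store_level):
    """Sum amounts grouped by the labels at the requested granularities."""
    totals = defaultdict(float)
    for row in facts:
        # Rows have no "ALL" key, so a collapsed dimension maps to the label "ALL".
        key = (row.get(time_level, "ALL"), row.get(store_level, "ALL"))
        totals[key] += row["amount"]
    return dict(totals)


def answer(combo, precomputed, facts):
    """Use a materialized aggregation if present, else compute it on demand."""
    if combo in precomputed:
        return precomputed[combo]
    return aggregate(facts, *combo)


all_combos = possible_aggregations(LEVELS)

# Materialize only a hand-picked subset; the rest are solved on demand.
chosen = [("month", "region"), ("ALL", "ALL")]
precomputed = {combo: aggregate(FACTS, *combo) for combo in chosen}

by_month_and_store = answer(("month", "store"), precomputed, FACTS)
```

With two dimensions of three levels each there are 3 x 3 = 9 possible aggregations, one of which (day, store) is the base granularity. The count grows multiplicatively with each added dimension, which is why only a subset is usually materialized.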
Therefore it requires the pre-computation and storage of information in the cube, an operation known as processing.

Relational

ROLAP works directly with relational databases. The base data and the dimension tables are stored as relational tables, and new tables are created to hold the aggregated information. It depends on a specialized schema design.

Hybrid

There is no clear agreement across the industry as to what constitutes "Hybrid OLAP", except that a database will divide data between relational and specialized storage. For example, for some vendors, a HOLAP database will use relational tables to hold the larger quantities of detailed data, and use specialized storage for at least some aspects of the smaller quantities of more aggregate or less detailed data.

Comparison

Each type has certain benefits, although there is disagreement about the specifics of the benefits between providers.

- Some MOLAP implementations are prone to database explosion. Database explosion is a phenomenon causing vast amounts of storage space to be used by MOLAP databases when certain common conditions are met: a high number of dimensions, pre-calculated results and sparse multidimensional data. The typical mitigation technique for database explosion is not to materialize all the possible aggregations, but only the optimal subset of aggregations based on the desired trade-off between performance and storage.
- MOLAP generally delivers better performance due to specialized indexing and storage optimizations. MOLAP also needs less storage space compared to ROLAP because the specialized storage typically includes compression techniques.[10]
- ROLAP is generally more scalable.[10] However, large-volume pre-processing is difficult to implement efficiently, so it is frequently skipped.
ROLAP query performance can therefore suffer tremendously.
- Since ROLAP relies more on the database to perform calculations, it has more limitations in the specialized functions it can use.
- HOLAP encompasses a range of solutions that attempt to mix the best of ROLAP and MOLAP. It can generally preprocess quickly, scale well, and offer good function support.

Other types

The following acronyms are also sometimes used, although they are not as widespread as the ones above:

- WOLAP: Web-based OLAP
- DOLAP: Desktop OLAP
- RTOLAP: Real-Time OLAP

APIs and query languages

Unlike relational databases, which had SQL as the standard query language and widespread APIs such as ODBC, JDBC and OLEDB, there was no such unification in the OLAP world for a long time. The first real standard API was the OLE DB for OLAP specification from Microsoft, which appeared in 1997 and introduced the MDX query language. Several OLAP vendors, both server and client, adopted it. In 2001 Microsoft and Hyperion announced the XML for Analysis specification, which was endorsed by most of the OLAP vendors. Since this also used MDX as a query language, MDX became the de facto standard.[11]

Products

History

The first product that performed OLAP queries was Express, which was released in 1970 (and acquired by Oracle in 1995 from Information Resources).[12] However, the term did not appear until 1993, when it was coined by Ted Codd, who has been described as "the father of the relational database". Codd's paper[1] resulted from a short consulting assignment which Codd undertook for the former Arbor Software (later Hyperion Solutions, and in 2007 acquired by Oracle), as a sort of marketing coup. The company had released its own OLAP product, Essbase, a year earlier. As a result, Codd's "twelve laws of online analytical processing" were explicit in their reference to Essbase.
There was some ensuing controversy, and when Computerworld learned that Codd was paid by Arbor, it retracted the article. The OLAP market experienced strong growth in the late 1990s, with dozens of commercial products coming to market. In 1998, Microsoft released its first OLAP server, Microsoft Analysis Services, which drove wide adoption of OLAP technology and moved it into the mainstream.

Market structure

Below is a list of top OLAP vendors in 2006, with figures in millions of United States dollars.[13]

Vendor                            Global revenue (USD millions)
Microsoft Corporation             1,801
Hyperion Solutions Corporation    1,077
Cognos                              735
Business Objects                    416
MicroStrategy                       416
SAP AG                              330
Cartesis SA                         210
Applix                              205
Infor                               199
Oracle Corporation                  159
Others                              152
Total                             5,700

Microsoft was the only vendor that continuously exceeded the industry average growth during 2000-2006. Since the above data was collected, Hyperion has been acquired by Oracle, Cartesis by Business Objects, Business Objects by SAP, Applix by Cognos, and Cognos by IBM.[14]

See also

- Business intelligence
- Data warehousing
- Data mining
- Predictive analytics
- Business analytics
- OLTP

Bibliography

- Daniel Lemire (2007-12). "Data Warehousing and OLAP-A Research-Oriented Bibliography" (in English). http://www.daniel-lemire.com/OLAP/.
- Erik Thomsen (1997). OLAP Solutions: Building Multidimensional Information Systems, 2nd Edition. John Wiley & Sons. ISBN 978-0471149316.
- O’Brien, J. A., & Marakas, G. M. (2009). Management information systems (9th ed.). Boston, MA: McGraw-Hill/Irwin.
- Williams, C., Garza, V. R., Tucker, S., & Marcus, A. M. (1994, January 24). Multidimensional models boost viewing options. InfoWorld, 16(4).

References

1. Codd E.F., Codd S.B., and Salley C.T. (1993). "Providing OLAP (On-line Analytical Processing) to User-Analysts: An IT Mandate". Codd & Date, Inc. http://www.fpm.com/refer/codd.html. Retrieved on 2008-03-05.
2. Deepak Pareek (2007). Business Intelligence for Telecommunications. CRC Press. 294 pp. ISBN 0849387922. http://books.google.com/books?id=M-UOE1Cp9OEC. Retrieved on 2008-03-18.
3. "OLAP Council White Paper" (PDF). OLAP Council. 1997. http://www.symcorp.com/downloads/OLAP_CouncilWhitePaper.pdf. Retrieved on 2008-03-18.
4. Hari Mailvaganam (2007). "Introduction to OLAP - Slice, Dice and Drill!". Data Warehousing Review. http://www.dwreview.com/OLAP/Introduction_OLAP.html. Retrieved on 2008-03-18.
5. Nigel Pendse (2008-03-03). "What is OLAP? An analysis of what the often misused OLAP term is supposed to mean". OLAP Report. http://www.olapreport.com/fasmi.htm. Retrieved on 2008-03-18.
6. MicroStrategy, Incorporated (1995). "The Case for Relational OLAP" (PDF). http://www.cs.bgu.ac.il/~dbm031/dw042/Papers/microstrategy_211.pdf. Retrieved on 2008-03-20.
7. Surajit Chaudhuri and Umeshwar Dayal (1997). "An overview of data warehousing and OLAP technology". SIGMOD Rec. (ACM) 26: 65. doi:10.1145/248603.248616. http://doi.acm.org/10.1145/248603.248616. Retrieved on 2008-03-20.
8. Gray, Jim; Chaudhuri, Surajit; Layman, Andrew; Reichart, Hamid; Venkatrao; Pellow; Pirahesh (1997). "Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals". J. Data Mining and Knowledge Discovery 1 (1): pp. 29–53. http://citeseer.ist.psu.edu/gray97data.html. Retrieved on 2008-03-20.
9. Nigel Pendse (2006-06-27). "OLAP architectures". OLAP Report. http://www.olapreport.com/Architectures.htm. Retrieved on 2008-03-17.
10. Bach Pedersen, Torben; S. Jensen (December 2001). "Multidimensional Database Technology" (PDF). Distributed Systems Online (IEEE): 40–46. ISSN 0018-9162. http://ieeexplore.ieee.org/iel5/2/20936/00970558.pdf.
11. Nigel Pendse (2007-08-23). "Commentary: OLAP API wars". OLAP Report. http://www.olapreport.com/Comment_APIs.htm. Retrieved on 2008-03-18.
12. Nigel Pendse (2007-08-23). "The origins of today’s OLAP products". OLAP Report.
http://olapreport.com/origins.htm. Retrieved on November 27.
13. Nigel Pendse (2006). "OLAP Market". OLAP Report. http://www.olapreport.com/market.htm. Retrieved on 2008-03-17.
14. Nigel Pendse (2008-03-07). "Consolidations in the BI industry". http://www.olapreport.com/consolidations.htm. Retrieved on 2008-03-18.

OLAP

Source: Wikipédia, the free encyclopedia (Portuguese edition).

OLAP, or On-line Analytical Processing, is the ability to manipulate and analyze a large volume of data from multiple perspectives. OLAP applications are used by managers at any level of an organization to make comparative analyses that support their day-to-day decision making. It is classified into DOLAP, ROLAP, MOLAP and HOLAP.

External links

OLAP - On Line Analytical Processing

What is OLAP

OLAP (On Line Analytical Processing) is the technology that gives users (generally directors, presidents and managers) fast access to view and analyze data with high flexibility and performance. This high performance is due to the multidimensional model, which simplifies the search process. It is classified into DOLAP, ROLAP, MOLAP and HOLAP.

DOLAP - Desktop On Line Analytical Processing

These are tools that send a query from the workstation to the server, which in turn sends a micro-cube back to be analyzed on the client workstation.
Advantage: Little network traffic, because processing happens on the client workstation. Data analysis is more agile.
Disadvantage: The micro-cube cannot be large; otherwise analysis becomes slow and the client machine may not cope, depending on its configuration.

ROLAP - Relational On Line Analytical Processing

These are tools that send SQL queries to the relational database server, where they are processed. Processing therefore takes place only on the server.
Advantage: It allows the analysis of large volumes of data, because processing takes place on the server side and not on the client workstation.
Disadvantage: If many requests are made to the server simultaneously, it may become slow or even unavailable, depending on its configuration, precisely because it has to process all the requests from all the clients.

MOLAP - Multidimensional On Line Analytical Processing

These are tools that send their requests directly to the multidimensional database server. The user manipulates the data directly on the server.
Advantage: Better performance, and it allows large volumes of data to be queried because processing is done directly on the server.
Disadvantage: The tool is expensive, and there is also a scalability problem.

HOLAP - Hybrid On Line Analytical Processing

These are hybrid tools, that is, a combination of ROLAP and MOLAP.
Advantage: The two technologies are mixed to obtain the best of each: ROLAP (scalability) + MOLAP (high performance).
Disadvantage: The tool is expensive.

- Business Objects
- Cognos
- Hyperion
- Microstrategy
- MV Business Analytics Suite
- Oracle BI Enterprise Edition
- Pentaho

What is OLAP? An analysis of what the often misused OLAP term is supposed to mean

You can contact Nigel Pendse, the author of this section, by e-mail on NigelP@olapreport.com if you have any comments or observations. Last updated on March 3, 2008.

The term, of course, stands for ‘On-Line Analytical Processing’.
Unfortunately, this is neither a meaningful definition nor a description of what OLAP means. It certainly gives no indication of why you would want to use an OLAP tool, or even what an OLAP tool actually does. And it gives you no help in deciding if a product is an OLAP tool or not. It was simply chosen as a term to contrast with OLTP, on-line transaction processing, which is much more meaningful.

We hit this problem as soon as we started researching The OLAP Report in late 1994, as we needed to decide which products fell into the category. Deciding what is an OLAP product has not got any easier since then, as more and more vendors claim to have ‘OLAP compliant’ products, whatever that may mean (often they don’t even know). It is not possible to rely on the vendors’ own descriptions, and membership of the long-defunct OLAP Council was not a reliable indicator of whether or not a company produces OLAP products. For example, several significant OLAP vendors were never members or resigned, and several members were not OLAP vendors. Membership of the instantly moribund replacement Analytical Solutions Forum was even less of a guide, as it was intended to include non-OLAP vendors.

The Codd rules also turned out to be an unsuitable way of detecting ‘OLAP compliance’, so we were forced to create our own definition. It had to be simple, memorable and product-independent, and the resulting definition is the ‘FASMI’ test. The key thing that all OLAP products have in common is multidimensionality, but that is not the only requirement for an OLAP product.

This is copyright material. You can make brief references to it freely, with attribution, but not reproduce large sections or the entire article without permission from the publisher. You are free to link to this page without permission.

The FASMI test

We wanted to define the characteristics of an OLAP application in a specific way, without dictating how it should be implemented. As our research has shown, there are many ways of implementing OLAP compliant applications, and no single piece of technology should be officially required, or even recommended. Of course, we have studied the technologies used in commercial OLAP products and this report provides many such details. We have suggested in which circumstances one approach or another might be preferred, and have also identified areas where we feel that all the products currently fall short of what we regard as a technology ideal.

Our definition is designed to be short and easy to remember; 12 rules or 18 features are far too many for most people to carry in their heads. We are pleased that we were able to summarize the OLAP definition in just five key words: Fast Analysis of Shared Multidimensional Information, or FASMI for short. This definition was first used by us in early 1995, and we are very pleased that it has not needed revision in the years since. This definition has now been widely adopted and is cited in over 120 Web sites in about 30 countries.

FAST means that the system is targeted to deliver most responses to users in less than five seconds, with the simplest analyses taking no more than one second and very few taking more than 20 seconds. Even if users have been warned that it will take more than a few seconds, they are soon likely to get distracted and lose their chain of thought, so the quality of analysis suffers. This speed is not easy to achieve with large amounts of data, particularly if on-the-fly and ad hoc calculations are required.
Vendors resort to a wide variety of techniques to achieve this goal, including specialized forms of data storage, extensive pre-calculations and specific hardware requirements, but we do not think any products are yet fully optimized, so we expect this to be an area of developing technology. In particular, the full pre-calculation approach fails with very large, sparse applications as the databases simply get too large (the database explosion problem), whereas doing everything on-the-fly is much too slow with large databases, even if exotic hardware is used. Even though it may seem miraculous at first if reports that previously took days now take only minutes, users soon get bored of waiting, and the project will be much less successful than if it had delivered a near instantaneous response, even at the cost of less detailed analysis. The BI and OLAP Surveys have found that slow query response is consistently the most often-cited technical problem with OLAP products, so too many deployments are clearly still failing to pass this test. Indeed, there are strong indications that users are becoming ever more demanding, so query responses that would have been considered adequate just a few years ago are now regarded as painfully slow. After all, if Google can search a large proportion of all the on-line information in the world in a quarter of a second, why should relatively tiny amounts of management information take orders of magnitude longer to query?

ANALYSIS means that the system can cope with any business logic and statistical analysis that is relevant for the application and the user, and keep it easy enough for the target user. Although some pre-programming may be needed, we do not think it acceptable if all application definitions have to be done using a professional 4GL.
It is certainly necessary to allow the user to define new ad hoc calculations as part of the analysis and to report on the data in any desired way, without having to program, so we exclude products (like Oracle Discoverer) that do not allow adequate end-user oriented calculation flexibility. We do not mind whether this analysis is done in the vendor's own tools or in a linked external product such as a spreadsheet, simply that all the required analysis functionality be provided in an intuitive manner for the target users. This could include specific features like time series analysis, cost allocations, currency translation, goal seeking, ad hoc multidimensional structural changes, non-procedural modeling, exception alerting, data mining and other application dependent features. These capabilities differ widely between products, depending on their target markets.

SHARED means that the system implements all the security requirements for confidentiality (possibly down to cell level) and, if multiple write access is needed, concurrent update locking at an appropriate level. Not all applications need users to write data back, but for the growing number that do, the system should be able to handle multiple updates in a timely, secure manner. This is a major area of weakness in many OLAP products, which tend to assume that all OLAP applications will be read-only, with simplistic security controls. Even products with multi-user read-write often have crude security models; an example is Microsoft OLAP Services.

MULTIDIMENSIONAL is our key requirement. If we had to pick a one-word definition of OLAP, this is it. The system must provide a multidimensional conceptual view of the data, including full support for hierarchies and multiple hierarchies, as this is certainly the most logical way to analyze businesses and organizations.
We are not setting up a specific minimum number of dimensions that must be handled as it is too application dependent and most products seem to have enough for their target markets. Again, we do not specify what underlying database technology should be used providing that the user gets a truly multidimensional conceptual view.

INFORMATION is all of the data and derived information needed, wherever it is and however much is relevant for the application. We are measuring the capacity of various products in terms of how much input data they can handle, not how many Gigabytes they take to store it. The capacities of the products differ greatly; the largest OLAP products can hold at least a thousand times as much data as the smallest. There are many considerations here, including data duplication, RAM required, disk space utilization, performance, integration with data warehouses and the like.

We think that the FASMI test is a reasonable and understandable definition of the goals OLAP is meant to achieve. We encourage users and vendors to adopt this definition, which we hope will avoid the controversies of previous attempts. The techniques used to achieve it include many flavors of client/server architecture, time series analysis, object-orientation, optimized proprietary data storage, multithreading and various patented ideas that vendors are so proud of. We have views on these as well, but we would not want any such technologies to become part of the definition of OLAP. Vendors who are covered in this report had every chance to tell us about their technologies, but it is their ability to achieve OLAP goals for their chosen application areas that impressed us most.

Dr Edgar “Ted” Codd (1923-2003)

It is with sadness that I learned of the death last week of Dr Ted Codd, the inventor of the relational database model.
I was fortunate enough to meet Dr Codd in October 1994, shortly after he, in a white paper commissioned by Arbor Software (now part of Hyperion Solutions), first coined the term OLAP. I was chairing a conference in London (the same conference at which I first met Nigel Pendse) and Dr Codd gave the keynote address. He explained how analytical databases were a necessary companion to databases built on the relational model which he invented in 1969. It is easy to forget today, when the relational database is ubiquitous, that there was a time when it was far from the dominant standard and, in fact, competed with network, hierarchical and other types of databases. Dr Codd defended his invention strongly. Even when Honeywell MRDS, the first commercial relational data base, was released in 1976, there were still many detractors. By the time Oracle released its relational database in 1979 and started to gain traction with the market, Dr Codd had spent ten long years defending his invention. It was not until the early 80’s that the relational database emerged as a clear standard. Subsequently I was fortunate enough to share the podium with Dr Codd and his knowledgeable wife, Sharon, as we gave many presentations on the subject of OLAP at conferences around North America. This gave me a chance to get to know both Ted and Sharon on a more personal level. To hear Ted explain how he landed flying boats on lakes in Africa during the second World War made me realize that there was much more to Ted than the public face of this man who revolutionized computing in his lifetime. The invention of the relational model is well understood to be a major factor in making modern computing what it is today. ERP systems could not have evolved to where they are without a strong database standard such as the relational model. Modern e-commerce Web sites are dependent on relational technology. But relational technology is equally crucial to those of us in the OLAP world. 
The source data for our OLAP system comes almost exclusively from relational sources, and it is reassuring to know that the man who invented the relational model also recognized that it could not provide, without help, the rich analytics that business needs. In the 1994 white paper Dr Codd wrote, “Attempting to force one technology or tool to satisfy a particular need for which another tool is more effective and efficient is like attempting to drive a screw into a wall with a hammer when a screwdriver is at hand: the screw may eventually enter the wall but at what cost?”

Thank you, Ted Codd.

Richard Creeth
April 22, 2003

The Codd rules and features

In 1993, E.F. Codd & Associates published a white paper, commissioned by Arbor Software (now Hyperion Solutions), entitled ‘Providing OLAP (On-line Analytical Processing) to User-Analysts: An IT Mandate’. The late Dr Codd was very well known as a respected database researcher from the 1960s through to the late 1980s and is credited with being the inventor of the relational database model in 1969. Unfortunately, his OLAP rules proved to be controversial due to being vendor-sponsored, rather than mathematically based.

It is also unclear how much involvement Dr Codd himself had with the OLAP work, but it seems likely that his role was very limited, with more of the work being done by his wife and a temporary researcher than by Dr Codd himself. Several of the rules seem to have been invented by the sponsoring vendor, not Dr Codd. The white paper should therefore be regarded as a vendor-published brochure (which it was) rather than as a serious research paper (which it was not). Note that this paper was not published by Codd & Date, and Chris Date has never endorsed Codd’s OLAP work.

The OLAP white paper included 12 rules, which are now well known (and available for download from vendors’ Web sites).
They were followed by another six (much less well known) rules in 1995 and Dr Codd also restructured the rules into four groups, calling them ‘features’. The features are briefly described and evaluated here, but they are now rarely quoted and little used.

Basic Features (B)

F1: Multidimensional Conceptual View (Original Rule 1). Few would argue with this feature; like Dr Codd, we believe this to be the central core of OLAP. Dr Codd included ‘slice and dice’ as part of this requirement.

F2: Intuitive Data Manipulation (Original Rule 10). Dr Codd preferred data manipulation to be done through direct actions on cells in the view, without recourse to menus or multiple actions. One assumes that this is by using a mouse (or equivalent), but Dr Codd did not actually say so. Many products fail on this, because they do not necessarily support double clicking or drag and drop. The vendors, of course, all claim otherwise. In our view, this feature adds little value to the evaluation process. We think that products should offer a choice of modes (at all times), because not all users like the same approach.

F3: Accessibility: OLAP as a Mediator (Original Rule 3). In this rule, Dr Codd essentially described OLAP engines as middleware, sitting between heterogeneous data sources and an OLAP front-end. Most products can achieve this, but often with more data staging and batching than vendors like to admit.

F4: Batch Extraction vs Interpretive (New). This rule effectively required that products offer both their own staging database for OLAP data as well as offering live access to external data. We agree with Dr Codd on this feature and are disappointed that only a minority of OLAP products properly comply with it, and even those products do not often make it easy or automatic. In effect, Dr Codd was endorsing multidimensional data staging plus partial pre-calculation of large multidimensional databases, with transparent reach-through to underlying detail.
Today, this would be regarded as the definition of a hybrid OLAP, which is indeed becoming a popular architecture, so Dr Codd has proved to be very perceptive in this area.

F5: OLAP Analysis Models (New). Dr Codd required that OLAP products should support all four analysis models that he described in his white paper (Categorical, Exegetical, Contemplative and Formulaic). We hesitate to simplify Dr Codd’s erudite phraseology, but we would describe these as parameterized static reporting, slicing and dicing with drill down, ‘what if?’ analysis and goal seeking models, respectively. All OLAP tools in this Report support the first two (but some other claimants do not fully support the second), most support the third to some degree (but probably less than Dr Codd would have liked) and few support the fourth to any usable extent. Perhaps Dr Codd was anticipating data mining in this rule?

F6: Client Server Architecture (Original Rule 5). Dr Codd required not only that the product should be client/server but that the server component of an OLAP product should be sufficiently intelligent that various clients could be attached with minimum effort and programming for integration. This is a much tougher test than simple client/server, and relatively few products qualify. We would argue that this test is probably tougher than it needs to be, and we prefer not to dictate architectures. However, if you do agree with the feature, then you should be aware that most vendors who claim compliance do so wrongly. In effect, this is also an indirect requirement for openness on the desktop. Perhaps Dr Codd, without ever using the term, was thinking of what the Web would one day deliver? Or perhaps he was anticipating a widely accepted API standard, which still does not really exist. Perhaps, one day, XML for Analysis will fill this gap.

F7: Transparency (Original Rule 2). This test was also a tough but valid one.
Full compliance means that a user of, say, a spreadsheet should be able to get full value from an OLAP engine and not even be aware of where the data ultimately comes from. To do this, products must allow live access to heterogeneous data sources from a full function spreadsheet add-in, with the OLAP server engine in between. Although all vendors claimed compliance, many did so by outrageously rewriting Dr Codd’s words. Even Dr Codd’s own vendor-sponsored analyses of Essbase and (then) TM/1 ignore part of the test. In fact, there are a few products that do fully comply with the test, including Analysis Services, Express, and Holos, but neither Essbase nor iTM1 (because they do not support live, transparent access to external data), in spite of Dr Codd’s apparent endorsement. Most products fail to give either full spreadsheet access or live access to heterogeneous data sources. Like the previous feature, this is a tough test for openness.

F8: Multi-User Support (Original Rule 8). Dr Codd recognized that OLAP applications were not all read-only and said that, to be regarded as strategic, OLAP tools must provide concurrent access (retrieval and update), integrity and security. We agree with Dr Codd, but also note that many OLAP applications are still read-only. Again, all the vendors claim compliance but, on a strict interpretation of Dr Codd’s words, few are justified in so doing.

Special Features

F9: Treatment of Non-Normalized Data (New). This refers to the integration between an OLAP engine and denormalized source data. Dr Codd pointed out that any data updates performed in the OLAP environment should not be allowed to alter stored denormalized data in feeder systems. He could also be interpreted as saying that data changes should not be allowed in what are normally regarded as calculated cells within the OLAP database. For example, Essbase allows this, and Dr Codd would perhaps have disapproved.
F10: Storing OLAP Results: Keeping Them Separate from Source Data (New). This is really an implementation rather than a product issue, but few would disagree with it. In effect, Dr Codd was endorsing the widely-held view that read-write OLAP applications should not be implemented directly on live transaction data, and OLAP data changes should be kept distinct from transaction data. The method of data write-back used in Microsoft Analysis Services is the best implementation of this, as it allows the effects of data changes even within the OLAP environment to be kept segregated from the base data.

F11: Extraction of Missing Values (New). All missing values are cast in the uniform representation defined by the Relational Model Version 2. We interpret this to mean that missing values are to be distinguished from zero values. In fact, in the interests of storing sparse data more compactly, a few OLAP tools such as TM1 do break this rule, without great loss of function.

F12: Treatment of Missing Values (New). All missing values to be ignored by the OLAP analyzer regardless of their source. This relates to Feature 11, and is probably an almost inevitable consequence of how multidimensional engines treat all data.

Reporting Features

F13: Flexible Reporting (Original Rule 11). Dr Codd required that the dimensions can be laid out in any way that the user requires in reports. We would agree, and most products are capable of this in their formal report writers. Dr Codd did not explicitly state whether he expected the same flexibility in the interactive viewers, perhaps because he was not aware of the distinction between the two. We prefer that it is available, but note that relatively fewer viewers are capable of it. This is one of the reasons that we prefer that analysis and reporting facilities be combined in one module.

F14: Uniform Reporting Performance (Original Rule 4).
Dr Codd required that reporting performance be not significantly degraded by increasing the number of dimensions or database size. Curiously, nowhere did he mention that the performance must be fast, merely that it be consistent. In fact, our experience suggests that merely increasing the number of dimensions or database size does not affect performance significantly in fully pre-calculated databases, so Dr Codd could be interpreted as endorsing this approach, which may not be a surprise given that Arbor Software sponsored the paper. However, reports with more content or more on-the-fly calculations usually take longer (in the good products, performance is almost linearly dependent on the number of cells used to produce the report, which may be more than appear in the finished report) and some dimensional layouts will be slower than others, because more disk blocks will have to be read. There are differences between products, but the principal factor that affects performance is the degree to which the calculations are performed in advance and where live calculations are done (client, multidimensional server engine or RDBMS). This is far more important than database size, number of dimensions or report complexity.

F15: Automatic Adjustment of Physical Level (Supersedes Original Rule 7). Dr Codd required that the OLAP system adjust its physical schema automatically to adapt to the type of model, data volumes and sparsity. We agree with him, but are disappointed that most vendors fall far short of this noble ideal. We would like to see more progress in this area and also in the related area of determining the degree to which models should be pre-calculated (a major issue that Dr Codd ignores). The Panorama technology, acquired by Microsoft in October 1996, broke new ground here, and users can now benefit from it in Microsoft Analysis Services.

Dimension Control

F16: Generic Dimensionality (Original Rule 6).
Dr Codd took the purist view that each dimension must be equivalent in both its structure and operational capabilities. This may not be unconnected with the fact that this is an Essbase characteristic. However, he did allow additional operational capabilities to be granted to selected dimensions (presumably including time), but he insisted that such additional functions should be grantable to any dimension. He did not want the basic data structures, formulae or reporting formats to be biased towards any one dimension. This has proven to be one of the most controversial of all the original 12 rules. Technology focused products tend to largely comply with it, so the vendors of such products support it. Application focused products usually make no effort to comply, and their vendors bitterly attack the rule. With a strictly purist interpretation, few products fully comply. We would suggest that if you are purchasing a tool for general purpose, multiple application use, then you want to consider this rule, but even then with a lower priority. If you are buying a product for a specific application, you may safely ignore the rule.

F17: Unlimited Dimensions & Aggregation Levels (Original Rule 12). Technically, no product can possibly comply with this feature, because there is no such thing as an unlimited entity on a limited computer. In any case, few applications need more than about eight or ten dimensions, and few hierarchies have more than about six consolidation levels. Dr Codd suggested that if a maximum must be accepted, it should be at least 15 and preferably 20; we believe that this is too arbitrary and takes no account of usage. You should ensure that any product you buy has limits that are greater than you need, but there are many other limiting factors in OLAP products that are liable to trouble you more than this one. In practice, therefore, you can probably ignore this requirement.

F18: Unrestricted Cross-dimensional Operations (Original Rule 9).
Dr Codd asserted, and we agree, that all forms of calculation must be allowed across all dimensions, not just the ‘measures’ dimension. In fact, many products which use only relational storage are weak in this area. Most products, such as Essbase, with a multidimensional database are strong. These types of calculations are important if you are doing complex calculations, not just cross tabulations, and are particularly relevant in applications that analyze profitability.

The History of OLAP

OLAP is not a new concept; its origins can be traced back to 1962. The term OLAP itself, however, was not coined until 1993, in the white paper by the database researcher Ted Codd, who also established the 12 rules for an OLAP product. Like many other technologies, OLAP has gone through several stages of evolution, and its path of development is not easy to follow.
Birth of Multidimensional Analysis through APL

Kenneth Iverson laid the foundation of OLAP in his book “A Programming Language”, which defined a mathematical language with processing operators and multidimensional variables. APL is regarded as the first multidimensional language, and IBM implemented it as a computer programming language during the late 1960s. Iverson created a concise notation by employing Greek symbols as operators. At that time high-resolution GUIs had not yet appeared, and because APL uses Greek symbols it required special hardware such as special keyboards, screens and printers. In addition, because early APL programs were interpreted rather than compiled, they used machine resources inefficiently and were known for consuming too much RAM, to name only a few of the drawbacks. Maintenance of APL-based mainframe products was very costly, and most programmers found it difficult to program multidimensional applications using arrays in other languages. Eventually the market significance of APL declined, but it still survives to a limited degree. Although APL is not regarded as a modern OLAP tool, several of its ideas live on in present-day multidimensional applications.

Express, an Enduring Example

In the 1970s a new multidimensional product, Express, emerged and became a popular OLAP offering. It was the first multidimensional tool aimed at marketing applications. After its acquisition by Oracle it evolved into a hybrid OLAP, and it has thrived for more than three decades. It remains one of the well-marketed multidimensional products. One of the better-known successors of Express is Oracle9i OLAP.
Although several enhanced versions have been released over the years, the concepts and data models remain unchanged. The 1980s played a significant role in the advancement of the OLAP industry, as the decade saw the rise of many multidimensional products.

System W for Financial Applications

In 1981, Comshare developed a new decision support system, System W, in an attempt to expand its market and the range of services it offered. System W was the first OLAP tool to cater to financial applications and the first to apply the hypercube approach to multidimensional modeling. Although it proved profitable for Comshare for quite some time, it did not achieve much success in the market, and it was even less favored by technical people because it was more difficult to program than other software of its kind. It also consumed substantial machine resources and often suffered from database explosion. UNIX also released APL but never promoted it as an OLAP tool. System W is no longer marketed, but it still runs, to a limited extent, on a few IBM mainframes. Other products based on similar concepts appeared, such as the DOS-based One-Up from Comshare and the Windows-based Commander Prism, but they did not make a significant mark on the industry. In 1992, Hyperion Solutions launched Essbase, which by 1997 had become a major OLAP server product. Like System W, however, this descendant also suffered from database explosion, a problem that Hyperion finally resolved with the release of Essbase 7X.

Metaphor, the Beginning of the Client/Server

A couple of years after the release of System W, Metaphor, generally considered the first ROLAP product, entered the OLAP market.
This multidimensional product established new concepts such as client/server computing, multidimensional processing on relational data, workgroup processing and object-oriented development, and was designed mainly for consumer goods companies. The vendor of Metaphor was compelled to build proprietary PCs and networks, since the hardware of the day could barely support Metaphor’s requirements. In 1991, IBM acquired Metaphor and relaunched the product under the new name IDS. The product remains in operation to support its remaining loyal users.

The New MIS Using GUI

A new type of Management Information System product emerged during the mid 1980s: the Executive Information System, more commonly known as EIS, which emphasizes the use of graphical user interfaces (GUIs). In 1985, Pilot Command Center, branded as the first client/server EIS, was released. Other client/server products that followed include Strategy, Holos and Information Advantage. Pilot has decided to phase out Command Center but has implemented some of its concepts in its Lightship Server product. Some Command Center concepts, such as automatic time series handling, multidimensional client/server processing and simplified human factors, can still be seen in some modern OLAP products.

PowerOLAP, Real-time Data and Excel Integration

PARIS Technologies, founded in 1997, published PowerOLAP™, which represents a milestone in the evolution of OLAP (on-line analytical processing) technology. Like any important evolutionary event, PowerOLAP combines the most advanced features of what came before it with new capabilities. Most significantly, PowerOLAP enables users to reach through seamlessly to access transactional data in a relational database for dynamic OLAP manipulations in a true multidimensional environment.
In addition, PowerOLAP employs Excel and the Web as a front end, connecting users throughout an organization with underlying data sources via the tools they know best, direct to their desktops.

The Spread of Spreadsheets

A new end-user analysis tool was becoming a favorite during the late 1980s. The spreadsheet market was growing quickly, which prompted some vendors to create multidimensional applications that could run in a spreadsheet environment. Compete was the first product to open the market for multidimensional spreadsheets. Computer Associates later acquired it from its original vendor, adding it to its other spreadsheet products such as SuperCalc and 20/20, then advertised it heavily and offered it at a lower price, but even then it did not gain much market significance. CA later released version 5 of SuperCalc, which was clearly influenced by the almost defunct Compete product.

Improv from Lotus followed Compete. Lotus began to develop Improv for the NeXT machine under the code name ‘BackBay’. Improv was subsequently launched on NeXT machines, where it was a phenomenal success and considerably boosted Lotus’ sales until after the efforts to port it to Windows and the Macintosh. The rise of Microsoft’s competing Excel product marked the beginning of the decline of Lotus. Lotus tried to move Improv down-market in the hope of increasing its marketability, but this did not work out. Excel steadily gained on 1-2-3 and ultimately proved to be the superior product, which came to dominate the market. Microsoft’s integration of the PivotTable feature into Excel was probably one of the most important enhancements of the product, as PivotTable became the most popular and widely used tool for multidimensional analysis.
Over the years, Microsoft continued to release new and enhanced versions of Excel, such as Excel 2000 and Excel 2003, with a more sophisticated PivotTable feature that functions both as a desktop OLAP (small cubes, generated from large databases but downloaded to PCs for processing, even though in Web implementations the cubes usually reside on the server) and as a client to Microsoft Analysis Services.

Sinper Corporation entered the OLAP market during the late 1980s with its multidimensional analysis software for DOS and Windows, then known as TM/1. Sinper turned TM/1 into a multidimensional back-end server for Excel and 1-2-3. Essbase by Arbor followed suit. The market for multidimensional spreadsheets boomed, and more and more vendors were attracted to this growing business. Traditional vendors of host-oriented products such as Acumate, Express, Gentia, Holos, Hyperion, Mineshare, MetaCube, PowerPlay and WhiteLight all offered products providing highly integrated spreadsheet access to their OLAP servers. Soon afterwards came the release of the OLAP@Work Excel Add-In, with features that enable users to make full use of OLAP Services. In 2004, Excel add-ins went mainstream: vendors such as Business Objects, Cognos, Microsoft, MicroStrategy and Oracle launched their own versions of the product. At the same time IntelligentApps, a main vendor of Excel add-ins for Analysis Services, was acquired by Sage. In 2007, Microsoft released PerformancePoint, which delivers more functionality for performance management; the product had been announced the previous year.
OLAP AND OLAP SERVER DEFINITIONS

OLAP: ON-LINE ANALYTICAL PROCESSING

On-Line Analytical Processing (OLAP) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user.

OLAP functionality is characterized by dynamic multi-dimensional analysis of consolidated enterprise data supporting end user analytical and navigational activities including:

- calculations and modeling applied across dimensions, through hierarchies and/or across members
- trend analysis over sequential time periods
- slicing subsets for on-screen viewing
- drill-down to deeper levels of consolidation
- reach-through to underlying detail data
- rotation to new dimensional comparisons in the viewing area

OLAP is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity. OLAP helps the user synthesize enterprise information through comparative, personalized viewing, as well as through analysis of historical and projected data in various "what-if" data model scenarios. This is achieved through use of an OLAP Server.

OLAP SERVER

An OLAP server is a high-capacity, multi-user data manipulation engine specifically designed to support and operate on multidimensional data structures. A multi-dimensional structure is arranged so that every data item is located and accessed based on the intersection of the dimension members which define that item.
The design of the server and the structure of the data are optimized for rapid ad-hoc information retrieval in any orientation, as well as for fast, flexible calculation and transformation of raw data based on formulaic relationships. The OLAP Server may either physically stage the processed multi-dimensional information to deliver consistent and rapid response times to end users, or it may populate its data structures in real-time from relational or other databases, or offer a choice of both. Given the current state of technology and the end user requirement for consistent and rapid response times, staging the multi-dimensional data in the OLAP Server is often the preferred method.

OLAP GLOSSARY

Defined terms:

- AGGREGATE
- ANALYSIS, MULTI-DIMENSIONAL
- ARRAY, MULTI-DIMENSIONAL
- CALCULATED MEMBER
- CELL
- CHILDREN
- COLUMN DIMENSION
- CONSOLIDATE
- CUBE
- DENSE
- DERIVED DATA
- DERIVED MEMBERS
- DETAIL MEMBER
- DIMENSION
- DRILL DOWN/UP
- FORMULA
- FORMULA, CROSS-DIMENSIONAL
- GENERATION, HIERARCHICAL
- HIERARCHICAL RELATIONSHIPS
- HORIZONTAL DIMENSION
- HYPERCUBE
- INPUT MEMBERS
- LEVEL, HIERARCHICAL
- MEMBER, DIMENSION
- MEMBER COMBINATION
- MISSING DATA, MISSING VALUE
- MULTI-DIMENSIONAL DATA STRUCTURE
- MULTI-DIMENSIONAL QUERY LANGUAGE
- NAVIGATION
- NESTING (OF MULTI-DIMENSIONAL COLUMNS AND ROWS)
- NON-MISSING DATA
- OLAP CLIENT
- PAGE DIMENSION
- PAGE DISPLAY
- PARENT
- PIVOT
- PRE-CALCULATED/PRE-CONSOLIDATED DATA
- REACH THROUGH
- ROLL-UP
- ROTATE
- ROW DIMENSION
- SCOPING
- SELECTION
- SLICE
- SLICE AND DICE
- SPARSE
- VERTICAL DIMENSION

Definitions:

AGGREGATE
See: Consolidate

ANALYSIS, MULTI-DIMENSIONAL
The objective of multi-dimensional analysis is for end users to gain insight into the meaning contained in databases. The multidimensional approach to analysis aligns the data content with the analyst's mental model, hence reducing confusion and lowering the incidence of erroneous interpretations.
It also eases navigating the database, screening for a particular subset of data, asking for the data in a particular orientation and defining analytical calculations. Furthermore, because the data is physically stored in a multi-dimensional structure, the speed of these operations is many times faster and more consistent than is possible in other database structures. This combination of simplicity and speed is one of the key benefits of multi-dimensional analysis.

ARRAY, MULTI-DIMENSIONAL
A group of data cells arranged by the dimensions of the data. For example, a spreadsheet exemplifies a two-dimensional array with the data cells arranged in rows and columns, each being a dimension. A three-dimensional array can be visualized as a cube with each dimension forming a side of the cube, including any slice parallel with that side. Higher dimensional arrays have no physical metaphor, but they organize the data in the way users think of their enterprise. Typical enterprise dimensions are time, measures, products, geographical regions, sales channels, etc. Synonyms: Multi-dimensional Structure, Cube, Hypercube

CALCULATED MEMBER
A calculated member is a member of a dimension whose value is determined from other members' values (e.g., by application of a mathematical or logical operation). Calculated members may be part of the OLAP server database or may have been specified by the user during an interactive session. A calculated member is any member that is not an input member.

CELL
A single datapoint that occurs at the intersection defined by selecting one member from each dimension in a multi-dimensional array. For example, if the dimensions are measures, time, product and geography, then the dimension members: Sales, Janu

OLAP Cube

An OLAP cube is a data structure that allows fast analysis of data. The arrangement of data into cubes overcomes a limitation of relational databases. Relational databases are not well suited for near instantaneous analysis and display of large amounts of data. Instead, they are better suited for creating records from a series of transactions known as OLTP or On-Line Transaction Processing. Although many report-writing tools exist for relational databases, these are slow when the whole database must be summarized.
Background

OLAP cubes can be thought of as extensions to the two-dimensional array of a spreadsheet. For example a company might wish to analyze some financial data by product, by time-period, by city, by type of revenue and cost, and by comparing actual data with a budget. These additional methods of analyzing the data are known as dimensions. Because there can be more than three dimensions in an OLAP system the term hypercube is sometimes used.

Functionality

The OLAP cube consists of numeric facts called measures which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.

Pivot

A financial analyst might want to view or "pivot" the data in various ways, such as displaying all the cities down the page and all the products across a page. This could be for a specified period, version and type of expenditure. Having seen the data in this particular way the analyst might then immediately wish to view it in another way. The cube could effectively be re-oriented so that the data displayed now had periods across the page and type of cost down the page. Because this re-orientation involved re-summarizing very large amounts of data, this new view of the data had to be generated efficiently to avoid wasting the analyst's time, i.e. within seconds, rather than the hours a relational database and conventional report-writer might have taken.

Hierarchy

Each of the elements of a dimension could be summarized using a hierarchy. The hierarchy is a series of parent-child relationships, typically where a parent member represents the consolidation of the members which are its children.
Parent members can be further aggregated as the children of another parent. For example May 2005 could be summarized into Second Quarter 2005 which in turn would be summarized in the Year 2005. Similarly the cities could be summarized into regions, countries and then global regions; products could be summarized into larger categories; and cost headings could be grouped into types of expenditure. Conversely the analyst could start at a highly summarized level such as the total difference between the actual results and the budget and drill down into the cube to discover which locations, products and periods had produced this difference. OLAP operations The analyst can understand the meaning contained in the databases using multi-dimensional analysis. By aligning the data content with the analyst's mental model, the chances of confusion and erroneous interpretations are reduced. The analyst can navigate through the database and screen for a particular subset of the data, changing the data's orientations and defining analytical calculations. The user-initiated process of navigating by calling for page displays interactively, through the specification of slices via rotations and drill down/up is sometimes called "slice and dice". Common operations include slice and dice, drill down, roll up, and pivot. Slice: A slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset.<ref name=OLAPGlossary1995/> Dice: The dice operation is a slice on more than two dimensions of a data cube (or more than two consecutive slices).<ref>Template:Cite web</ref> Drill Down/Up: Drilling down or up is a specific analytical technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down).<ref name=OLAPGlossary1995/> Roll-up: A roll-up involves computing all of the data relationships for one or more dimensions. 
To do this, a computational relationship or formula might be defined.
Pivot: To change the dimensional orientation of a report or page display.

Linking cubes and sparsity

The commercial OLAP products have different methods of creating the cubes and hypercubes and of linking cubes and hypercubes (see Types of OLAP in the article on OLAP). Linking cubes is a method of overcoming sparsity. Sparsity arises when not every cell in the cube is filled with data, so valuable processing time is taken up by effectively adding up zeros. For example, revenues may be available for each customer and product, but cost data may not be available at this level of detail. Instead of creating a sparse cube, it is sometimes better to create another separate, but linked, cube in which a subset of the data can be analyzed in great detail. The linking ensures that the data in the cubes remain consistent.

Variance in products

The data in cubes may be updated at times, perhaps by different people. Techniques are therefore often needed to lock parts of the cube while one of the users is writing to it and to recalculate the cube's totals. Other facilities may raise an alert that previously calculated totals are no longer valid after new data has been added, while some products only calculate the totals when they are needed.

Technical definition

In database theory, an OLAP cube is an abstract representation of a projection of an RDBMS relation. Given a relation of order N, consider a projection that subtends X, Y, and Z as the key and W as the residual attribute. Characterizing this as a function, W : (X, Y, Z) → W, the attributes X, Y, and Z correspond to the axes of the cube, while the W value into which each (X, Y, Z) triple maps corresponds to the data element that populates each cell of the cube.
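The function view W : (X, Y, Z) → W and the operations listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the axis order (product, city, period), the helper names, and the choice of summation as the roll-up formula are all hypothetical, and the dice helper follows one common reading (restricting several axes to sets of members). It does not describe how any OLAP product stores its cubes.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, city, period, revenue).
ROWS = [
    ("Widget", "Boston", "2005-05", 120.0),
    ("Widget", "Boston", "2005-06", 80.0),
    ("Widget", "Denver", "2005-05", 50.0),
    ("Gadget", "Boston", "2005-05", 200.0),
    ("Gadget", "Denver", "2005-06", 75.0),
]


def build_cube(rows):
    """Map each (X, Y, Z) key to its W value, summing rows that share a key."""
    cube = defaultdict(float)
    for x, y, z, w in rows:
        cube[(x, y, z)] += w
    return dict(cube)


def slice_cube(cube, axis, value):
    """Slice: fix one axis at a single member and drop that axis from the key."""
    return {
        key[:axis] + key[axis + 1:]: w
        for key, w in cube.items()
        if key[axis] == value
    }


def dice(cube, allowed):
    """Dice (assumed reading): keep cells whose coordinate on every listed
    axis belongs to that axis's set of allowed members."""
    return {
        key: w
        for key, w in cube.items()
        if all(key[axis] in members for axis, members in allowed.items())
    }


def roll_up(cube, axis):
    """Roll-up (assumed formula: sum): aggregate one axis away entirely."""
    totals = defaultdict(float)
    for key, w in cube.items():
        totals[key[:axis] + key[axis + 1:]] += w
    return dict(totals)


def pivot(table2d):
    """Pivot: swap the two axes of a two-dimensional view."""
    return {(y, x): w for (x, y), w in table2d.items()}


cube = build_cube(ROWS)
may = slice_cube(cube, 2, "2005-05")   # (product, city) -> revenue for one period
by_product_city = roll_up(cube, 2)     # revenue summed over all periods
denver_only = dice(cube, {1: {"Denver"}})
```

A dictionary keyed by coordinate tuples stores only the cells that actually hold data, which is one simple way to avoid "adding up zeros" in a sparse cube.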
Insofar as two-dimensional output devices cannot readily characterize four dimensions, it is more practical to project "slices" of the data cube (we say project in the classic vector analytic sense of dimensional reduction, not in the SQL sense, although the two are clearly conceptually homologous), perhaps W : (X, Y) → W, which may suppress a primary key but still have some semantic significance, perhaps a slice of the triadic functional representation for a given Z value of interest.

The motivation behind OLAP displays harks back to the cross-tabbed report paradigm of 1980s DBMS. One may wish for a spreadsheet-style display, where (to appropriate the Microsoft Excel paradigm) values of X populate row $1; values of Y populate column $A; and values of W : (X, Y) → W populate the individual cells "southeast of" $B2, so to speak, $B2 itself included. While one can certainly use the DML (Data Manipulation Language) of traditional SQL to display (X, Y, W) triples, this output format is not nearly as convenient as the cross-tabbed alternative: the former requires one to hunt linearly for a given (X, Y) pair in order to determine the corresponding W value, while the latter enables one to scan more conveniently for the intersection of the proper X column with the proper Y row.

See also: Cube

OLAP
From OLAP

OLAP, or On-Line Analytical Processing, is a software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by a user. OLAP functionality is characterized by dynamic multi-dimensional analysis of consolidated enterprise data supporting end user analytical and navigational activities.
OLAP tools do not store individual transaction records in two-dimensional, row-by-column formats, like a worksheet, but instead use multi-dimensional database structures (known as Cubes in OLAP terminology) to store arrays of consolidated information. The data and formulas are stored in an optimized multidimensional database, while views of the data are created on demand. Analysts can take any view, or Slice, of a Cube to produce a worksheet-like view of points of interest.

OLAP and Excel

The Power of Excel-Friendly OLAP

Should Excel be a key component of your company’s BPM system? There’s no doubt how most IT managers would answer this question. Name IT’s top ten requirements for a successful BPM system, and they’ll quickly explain how Excel violates dozens of them. Even the user community is concerned. Companies are larger and more complex now than in the past; they seem too complex for Excel. Managers need information more quickly now; they can’t wait for another Excel report. Excel spreadsheets don’t scale well. They can’t be used by many different users. Excel reports have many errors. Excel security is a joke. Excel output is ugly. Excel consolidation occupies a large corner of Spreadsheet Hell. And Sarbanes-Oxley has changed everything. Or so we’re told. For these reasons, and many more, a growing number of companies of all sizes have concluded that it’s time to replace Excel.
But before your company takes that leap of hope or faith, perhaps you should take another look at Excel…particularly when Excel can be enhanced by an Excel-friendly OLAP database. Excel-friendly OLAP could force your company to take another look at Excel. That technology helps to eliminate many of the classic objections to using Excel for business performance management. Introducing OLAP Excel-friendly OLAP products cure many of the problems that both users and IT managers have with Excel. But before I explain why this is so, I should explain what OLAP is, and how it can be Excel-friendly. Although OLAP technology has been available for years, it’s still quite obscure. One reason is that “OLAP” is an acronym for four words that are remarkably devoid of meaning: On-Line Analytical Processing. OLAP databases are more easily understood when they’re compared with relational databases. Both “OLAP” and “relational” are names for a type of database technology. Oversimplified, relational databases contain lists of stuff; OLAP databases contain cubes of stuff. For example, you could keep your accounting general ledger data in a simple cube with three dimensions: Account, Division, and Month. At the intersection of any particular account, division, and month you would find one number. By convention, a positive number would be a debit and a negative number would be a credit. Most cubes have more than three dimensions. And they typically contain a wide variety of business data, not merely General Ledger data. OLAP cubes also could contain monthly headcounts, currency exchange rates, daily sales detail, budgets, forecasts, hourly production data, the quarterly financials of your publicly traded competitors, and so on. You can define any consolidation hierarchy for any of a cube’s dimensions. For example, in the Month dimension every month could roll up into quarters, which could roll up into years. Months also could roll up into year-to-date categories. 
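The consolidation hierarchy just described, with months rolling up into quarters, years, and year-to-date members, can be sketched in Python. This is a minimal illustration under stated assumptions: the leaf values, the "YYYY-MM" month labels, and the helper names are hypothetical and do not correspond to any vendor's cube format.

```python
from collections import defaultdict

# Hypothetical leaf cells of a General Ledger cube:
# (account, division, month) -> amount. Months use "YYYY-MM" labels.
LEAVES = {
    ("1234", "Southwest", "2006-06"): 100.0,
    ("1234", "Southwest", "2006-07"): 40.0,
    ("1234", "Southwest", "2006-08"): 60.0,
    ("1234", "Northeast", "2006-08"): 25.0,
}


def quarter_of(month):
    """Parent of a month in the month -> quarter hierarchy."""
    year, mm = month.split("-")
    return f"{year}-Q{(int(mm) - 1) // 3 + 1}"


def year_of(month):
    """Parent of a month in the month -> year hierarchy."""
    return month.split("-")[0]


def consolidate(leaves, parent_of):
    """Roll the Month dimension up one level using parent_of(month)."""
    totals = defaultdict(float)
    for (account, division, month), amount in leaves.items():
        totals[(account, division, parent_of(month))] += amount
    return dict(totals)


def year_to_date(leaves, account, division, month):
    """Sum every month of the same year up to and including `month`.

    Zero-padded "YYYY-MM" labels compare correctly as plain strings.
    """
    year = year_of(month)
    return sum(
        amount
        for (a, d, m), amount in leaves.items()
        if a == account and d == division and year_of(m) == year and m <= month
    )


quarters = consolidate(LEAVES, quarter_of)
years = consolidate(LEAVES, year_of)
ytd_july = year_to_date(LEAVES, "1234", "Southwest", "2006-07")
```

Because each consolidation is just another mapping from coordinates to a number, a caller can read a quarter, a year, or a year-to-date member in the same way it reads a single month.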
Users treat both the “leaf” members and the consolidated members as equivalent sources of data. To illustrate, users could choose data from a leaf member like Aug-2006 just as easily as they could choose from a consolidated member like Aug-2006-YTD. Other dimensions typically have their own roll-up structures. An Account dimension could roll up accounts into traditional financial statement hierarchies. A Division dimension could roll up divisions into the corporate reporting hierarchy. And a Product dimension could roll up products into one or more product structures. Excel-Friendly OLAP You probably could find at least 50 OLAP products on the market. But most of them lack a key characteristic: spreadsheet functions. Excel-friendly OLAP products offer a wide variety of spreadsheet functions that read data from cubes into Excel. Most such products also offer spreadsheet functions that can write to the OLAP database from Excel…with full security, of course. Read-write security typically can be defined down to the cell level by user. Therefore, only certain analysts can write to a forecast cube. A department manager can read only the salaries of people who report to him. And the OLAP administrator must use a special password to update the General Ledger cube. Other OLAP products push data into Excel; Excel-friendly OLAPs pull data into Excel. To an Excel user, the difference between push and pull is significant. Using the push technology, users typically must interact with their OLAP product’s user interface to choose data and then write it as a block of numbers to Excel. If a report relies on five different views of data, users must do this five times. Worse, the data typically isn’t written where it’s needed within the body of the report. Instead, the data merely is parked in the spreadsheet for use somewhere else. Using the pull technology, spreadsheet users can write formulas that pull the data from any number of cells in any number of cubes in the database. 
Even a single spreadsheet cell can contain a formula that pulls data from several cubes. To illustrate, suppose that an Excel dashboard presents information for a particular division and month. Excel users typically would designate a Month and a Division cell, which all the formulas would reference. With this design, you could change the Month cell from “Jun-2006” to “Jul-2006”, and the Division cell from “Northeast” to “Southwest”. Then, by simply recalculating your workbook, you would update the report to reflect the new settings. Under automation, you could print a report for every division for a given month. At first reading, it’s easy to overlook the significant difference between this method of serving data to Excel and most others. Spreadsheets linked to Excel-friendly OLAP databases don’t contain data; they contain only formulas linked to data on the server. In contrast, most other technologies write blocks of data to Excel. It really doesn’t matter whether the data is imported as a text file, copied and pasted, generated by a PivotTable, or pushed to a spreadsheet by some other OLAP. The other technologies turn Excel into a data store. But Excel-friendly OLAP avoids that problem. How Much Truth? It’s common these days for database vendors to talk about having “one version of the truth.” (Recently, for example, Google listed 48,000 hits for that expression.) What’s less common is for anyone to ask these vendors how much relevant truth their systems can provide. This is a critical question for managers looking for BPM information, and for their staff—usually Excel users—who must provide the information. As most Excel users are sadly aware, the IT Department’s data warehouse never will provide all the data needed for business performance management. It’s true that corporate data warehouses typically contain massive numbers of transactions. But this exhaustive detail largely is irrelevant to BPM, which typically relies on detailed summaries of data. 
At the extreme, data warehouses are a yard wide and a mile deep. But BPM requires data that is a mile wide and a yard deep. Data Warehouse vs. OLAP Database Here are some examples of data that OLAP databases can contain, but which data warehouses typically don’t: Data Silos Many information systems—both old and new—rely on databases that never will be added to the data warehouse. But these systems contain data that managers often need for managing business performance. Most of those systems provide some way to export their data. Often, they support ODBC. Most can export their data as text files. Some companies even print reports from their legacy systems to files, and then use Monarch software to convert that text into rows and columns of data that can be imported into their OLAP database. Mergers and Acquisitions When two companies merge, the one company now has two data warehouses, not one. Each organization has one version of its own truth, but neither has one version of the whole truth. This is not an easy problem for IT to solve. I know of one company, for example, that has five ERPs on four continents. For nearly ten years, IT’s goal has been to create a single data warehouse within two years. Unfortunately, users and their managers need summary data to be fully available immediately, certainly by the end of the month in which a merger or acquisition closes. One company closed the purchase of a billion-dollar subsidiary on the 26th of the month. By the Board meeting two weeks later, the Finance staff had printed more than 200 spreadsheets that reported both consolidated and consolidating reports for the new company, down to low-level summaries. All financial data was expressed in terms of the parent company’s Chart of Accounts. The staff could integrate the disparate systems so quickly because the parent already was using an Excel-friendly OLAP. 
They mapped the subsidiary’s meta data (general ledger codes, department codes, and so on) to the parent’s meta data. They imported the subsidiary’s financials to a new “slice” in the parent’s General Ledger cube, translating the meta data on the fly. Then they printed their standard spreadsheet analyses, all 200 pages of them, while adding a few new Excel analyses specific to the new subsidiary. System Conversions When a company purchases a new Enterprise Resource Planning (ERP) system, it creates at least two problems for BPM reporting. First, the company typically converts the fewest months of historical data it can. For financial systems, companies often convert only one year of history prior to the current fiscal year. But for many BPM purposes, data about past performance is very useful, even critical: •Monthly time-series forecasting requires at least 30 months of historical data, preferably more. •New products and sales offices often follow a consistent pattern for both revenue growth and startup expenses; but those patterns only can be discovered by analyzing data for startups during the past several years. •The analysis of trends in cost-volume-profit relationships during past downturns can serve as a guide to cost-reduction efforts during current downturns. With an Excel-friendly OLAP in place before the conversion, all historical data continues to be available. Better yet, managers continue to receive their standard Excel reports, which can display data from both ERPs. Second, transactions can be classified differently between the old and new systems, and this problem can be very difficult to solve. To illustrate, I know of two large companies whose system conversions were significantly over budget. The accountants for both companies had specified that all account-department combinations that were not explicitly allowed were to be rejected by the new systems. 
But to reduce expenses, both systems were set to allow all such combinations that weren’t specifically prohibited. As a consequence, many transactions each month were automatically booked to incorrect account-department combinations. Under normal circumstances the accountants in each company would have had to manually inspect more than one-hundred million account balances to find GL accounts whose transaction patterns had changed when the accounting systems changed. This would have been an impossible task, of course. However, both companies had been using Excel-friendly OLAP systems before their conversions began. Therefore, each created a simple spreadsheet that returned the monthly transactions for any specific account, department, and division, for the twelve months prior to the conversion and for all months after. Then using standard Excel statistics functions, and simple spreadsheet automation, the spreadsheets looped through every combination of account, department, and division, and listed all questionable combinations. The staff quickly corrected the obvious mistakes and researched the others. External Data Managers often need to see their performance reported within the context of their business environment. That environment can be described by the financial data of publicly held customers and competitors, by local and regional economic data, by population trends, and by other measures. IT doesn’t control such data. Nor do IT managers typically understand it. That’s not their job; it’s the user’s job. In most companies, if users don’t create and maintain cubes of external data, no one ever will. It’s not unusual for a knowledgeable user to create an OLAP cube on her local computer, populate it with public data, and then test its use with various spreadsheet reports. Once the cube is tested, she can work with the database administrator to move the cube to the OLAP server. Forecasts Most data warehouses provide empty buckets for budget data. 
But they typically don’t capture the wide variety of forecasts that companies generate. Nor do they help to generate those forecasts. But Excel-friendly OLAP offers both solutions. To illustrate, Excel users easily can generate both top-down and bottom-up sales forecasts, compare the forecasts to find large conflicts, and then revise the forecasts after researching the differences. To prepare the top-down forecasts, users can send a forecasting spreadsheet to the sales people. Unlike most spreadsheets, this one would include formulas that write the new forecast data to the appropriate area of an OLAP cube on the server. Full security would be maintained, of course. To prepare the bottom-up forecasts, users first create a spreadsheet that uses statistical methods to extend past sales performance for any product and region into the future. This spreadsheet also writes the forecast to an area of the OLAP cube, again with full security. Then, using automation, they apply this spreadsheet forecast to every product and every region. To compare the forecasts, they set up a spreadsheet to compare the top-down and the bottom-up forecasts for any product and region. Again, using automation, they calculate the workbook for every combination and automatically note where the two versions vary by an unusual degree.

Statistical Corrections

Forecasts, analyses, and management reporting all can be seriously flawed if analysts rely on historical data that reflects errors and oversights. But for a variety of reasons, managers, investors, and auditors all take a dim view of prior-period adjustments to the General Ledger. One way to handle this problem, particularly for forecasting and analysis, is for users to maintain an error-correction entity that can be consolidated or ignored, depending on the circumstance. Of course, these corrections must be managed carefully. It would be very easy, after all, for indications of real problems to be “corrected” out of existence.
But when statistical corrections are tightly controlled, they provide the only practical way that past performance can be analyzed as it actually happened, not as it was mistakenly booked at the time. Excel Dashboard Reporting Excel has not been an obvious choice for BPM reporting. One reason for this is obvious: Typical Excel reports are ugly and difficult to read. But they don’t need to be. Figure 1 illustrates an Excel dashboard report of public data for Starbucks Corporation. I created this report completely in Excel, with no assistance from third-party tools. This particular report uses data from two public web sites, downloaded into Excel. The report could display equivalent data for any public company whose financial information is covered by the two web sites. Figure 1 In a business environment, a report like this could report performance for a department, division, product line, or for an entire company. The data would come from an Excel-friendly OLAP, not from the Web. One significant advantage to using Excel for this type of reporting is that Excel users can change the report quickly and easily, without involving the IT Department. In fact, assuming that the necessary data already resides in the OLAP database, an Excel user typically could replace one measure with another in less than ten minutes. Another significant advantage is that the report – even a single figure in the report -- can display data from many original sources. To illustrate, a figure could show the trend in labor costs (from the General Ledger cube) per full-time-equivalent employee (from the Headcount cube). Another figure could show the ratio of total company sales (from the General Ledger cube) to the sales of its largest publicly traded competitor (from a Competitor cube). There is virtually no limit to the appearance that an Excel dashboard can take. Figure 2 illustrates a mockup dashboard based on a standard display that Business Week used about ten years ago. 
In fact, I often “steal” ideas for dashboard designs from the pages of business magazines. Excel dashboards also can compare the same measures for many different products, divisions, departments, and other entities.

Figure 2

Limitations of Excel BPM Reporting

As a general rule, Excel output tends to be on paper rather than on screen. Although many managers prefer it that way, paper reporting can seem primitive to certain managers. However, Excel-friendly OLAP vendors are making progress towards web-based reporting. PARIS Technologies allows Excel reports to connect to an OLAP database over the Internet. This gives Excel users read-write access to their cubes from virtually anywhere in the world. Several vendors offer interactive Web implementations of Excel reports linked to Applix TM1 cubes. And Excel 2007 spreadsheets will offer ways for users to interact with Analysis Services over the Web. On the other hand, the high-tech solution might not always be the best solution. One large company decided to take a low-tech approach to online management reporting. Each month the company automatically captures nearly 10,000 bitmaps of Excel reports of their OLAP data, and then displays those images on their Intranet. But whatever the display limitations, Excel-friendly OLAP databases should cause you to take another look at Excel for business performance management.

Examples of OLAP integration in Excel

See also Example 6 - OLAP and Excel. Note that the previous examples should be reviewed in order for this one to make sense.

Information for OLAP and Excel provided by Charley Kyd of [www.exceluser.com]

Excel-friendly OLAP products

The best-known Excel-friendly OLAP product is Analysis Services, which is included with Microsoft SQL Server.
Excel 2007 includes a variety of spreadsheet functions that read data from Analysis Services, but they don’t write back to Analysis Services. Even users of Excel XP, and earlier versions, need to use one of several third-party Excel add-ins to use spreadsheet functions that offer read-write access to Analysis Services. These products include xlCubed (http://www.xlCubed.com), IntelligentApps (http://www.IntelligentApps.com), and BIXL (http://www.bixl.com).

Two other Excel-friendly OLAP products offer a sharp contrast to Analysis Services. My tests show that both products return data to Excel about 100 times faster than Analysis Services does. Unlike Analysis Services, both products can be administered by knowledgeable users, rather than the IT Department.

• TM1, from Applix Corporation (http://www.applix.com), probably was the first OLAP product. TM1 offers approximately 30 read-write spreadsheet functions.
• PowerOLAP, from PARIS Technologies (http://www.olap.com), works much like TM1. The company has partnered with a subsidiary of Hitachi, Ltd. to produce a Japanese version of their product. PowerOLAP offers more than 60 read-write spreadsheet functions.

Each vendor’s spreadsheet functions work slightly differently. But the most-used function for each product looks something like this:

=GETDATA(database, cube, member1, member2, …)

To illustrate, this spreadsheet formula could return a number from a cube named “GL” in the Finance database, for account 1234, from the Southwest division, for July, 2006:

=GETDATA("Finance", "GL", "1234", "Southwest", "Jul-2006")

Like all Excel formulas, this formula typically would contain cell addresses or range names, not the literal values for each argument.

Citation: Content provided by Charley Kyd of (http://www.exceluser.com)

ROLAP
From Wikipedia, the free encyclopedia

ROLAP stands for Relational Online Analytical Processing. ROLAP is an alternative to the MOLAP (Multidimensional OLAP) technology.
While both ROLAP and MOLAP analytic tools are designed to allow analysis of data through the use of a multidimensional data model, ROLAP differs significantly in that it does not require the pre-computation and storage of information. Instead, ROLAP tools access the data in a relational database and generate SQL queries to calculate information at the appropriate level when an end user requests it. With ROLAP, it is possible to create additional database tables (summary tables or aggregations) which summarize the data at any desired combination of dimensions. While ROLAP uses a relational database source, generally the database must be carefully designed for ROLAP use. A database which was designed for OLTP will not function well as a ROLAP database. Therefore, ROLAP still involves creating an additional copy of the data. However, since it is a database, a variety of technologies can be used to populate the database.

ROLAP vs. MOLAP[1]

The discussion of the advantages and disadvantages of ROLAP below focuses on those things that are true of the most widely used ROLAP and MOLAP tools available today. In some cases there will be tools which are exceptions to any generalization made.

Advantages of ROLAP

• ROLAP is considered to be more scalable in handling large data volumes, especially models with dimensions with very high cardinality (i.e. millions of members).
• With a variety of data loading tools available, and the ability to fine tune the ETL code to the particular data model, load times are generally much shorter than with the automated MOLAP loads.
• The data is stored in a standard relational database and can be accessed by any SQL reporting tool (the tool does not have to be an OLAP tool).
• ROLAP tools are better at handling non-aggregatable facts (e.g. textual descriptions). MOLAP tools tend to suffer from slow performance when querying these elements.
• By decoupling the data storage from the multi-dimensional model, it is possible to successfully model data that would not otherwise fit into a strict dimensional model.
• The ROLAP approach can leverage database authorization controls such as row-level security, whereby the query results are filtered depending on preset criteria applied, for example, to a given user or group of users (SQL WHERE clause).

Disadvantages of ROLAP

• There is a general consensus in the industry that ROLAP tools have slower performance than MOLAP tools. However, see the discussion below about ROLAP performance.
• The loading of aggregate tables must be managed by custom ETL code. The ROLAP tools do not help with this task. This means additional development time and more code to support.
• When the step of creating aggregate tables is skipped, the query performance then suffers because the larger detailed tables must be queried. This can be partially remedied by adding additional aggregate tables; however, it is still not practical to create aggregate tables for all combinations of dimensions/attributes.
• ROLAP relies on the general purpose database for querying and caching, and therefore several special techniques employed by MOLAP tools are not available (such as special hierarchical indexing). However, modern ROLAP tools take advantage of the latest improvements in the SQL language such as CUBE and ROLLUP operators, DB2 Cube Views, as well as other SQL OLAP extensions. These SQL improvements can mitigate the benefits of the MOLAP tools.
• Since ROLAP tools rely on SQL for all of the computations, they are not suitable when the model is heavy on calculations which don't translate well into SQL. Examples of such models include budgeting, allocations, financial reporting and other scenarios.
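To make the trade-offs above concrete, here is a small sketch of what "generating SQL queries when an end user requests it" can look like, using Python's standard sqlite3 module. It is only an illustration under stated assumptions: the sales table, its columns, the build_query helper, and the summary table name are all hypothetical, and real ROLAP tools generate far more elaborate SQL. SQLite is used for convenience and does not provide the CUBE and ROLLUP operators mentioned above.

```python
import sqlite3

# Hypothetical dimension columns of a hypothetical fact table named "sales".
DIMENSIONS = {"region", "product", "month"}


def build_query(group_by, filters):
    """Build the SQL for one analyst request.

    group_by: dimension columns to display (at least one).
    filters:  dict of column -> member, i.e. the slice, bound as parameters.
    """
    if not group_by or not set(group_by) <= DIMENSIONS:
        raise ValueError("group_by must name known dimension columns")
    if not set(filters) <= DIMENSIONS:
        raise ValueError("filters must name known dimension columns")
    cols = ", ".join(group_by)
    sql = f"SELECT {cols}, SUM(amount) AS total FROM sales"
    if filters:
        # Each slice adds one condition to the WHERE clause.
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    sql += f" GROUP BY {cols} ORDER BY {cols}"
    return sql, list(filters.values())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Widget", "2006-06", 10.0),
        ("East", "Widget", "2006-07", 20.0),
        ("East", "Gadget", "2006-07", 5.0),
        ("West", "Widget", "2006-07", 7.0),
    ],
)

# Request: totals by region for one month, computed from the detail rows.
sql, params = build_query(["region"], {"month": "2006-07"})
by_region = conn.execute(sql, params).fetchall()

# A summary (aggregate) table: the aggregation cost is paid once at load
# time, but the table must be rebuilt by ETL code whenever the detail changes.
conn.execute(
    "CREATE TABLE sales_by_region_month AS "
    "SELECT region, month, SUM(amount) AS amount FROM sales GROUP BY region, month"
)
summary = conn.execute(
    "SELECT amount FROM sales_by_region_month WHERE region = ? AND month = ?",
    ("East", "2006-07"),
).fetchone()[0]
```

The two reads return the same total for East in July 2006; the difference is whether the aggregation happens at query time or at load time, which is the trade-off the lists above describe.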
Performance of ROLAP

OLAP Survey

In the OLAP industry, ROLAP is usually perceived as being able to scale for large data volumes, but suffering from slower query performance than MOLAP. The OLAP Survey, the largest independent survey across all major OLAP products, conducted over six years (2001 to 2006), has consistently found that companies using ROLAP report slower performance than those using MOLAP, even when data volumes were taken into consideration. However, as with any survey, there are a number of subtle issues that must be taken into account when interpreting the results.

• The survey shows that ROLAP tools have 7 times more users than MOLAP tools within each company. Systems with more users will tend to suffer more performance problems at peak usage times.
• There is also a question about the complexity of the model, measured both in number of dimensions and richness of calculations. The survey does not offer a good way to control for these variations in the data being analyzed.

Downside of flexibility

Some companies select ROLAP because they intend to re-use existing relational database tables; these tables will frequently not be optimally designed for OLAP use. The superior flexibility of ROLAP tools allows this less than optimal design to work, but performance suffers. MOLAP tools, in contrast, would force the data to be re-loaded into an optimal OLAP design.

Trends

The undesirable trade-off between additional ETL cost and slow query performance has ensured that most commercial OLAP tools now use a "Hybrid OLAP" (HOLAP) approach, which allows the model designer to decide which portion of the data will be stored in MOLAP and which portion in ROLAP.

Products

Examples of commercial products using ROLAP include Microsoft Analysis Services, MicroStrategy and Oracle BI (the former Siebel Analytics). There is also an open source ROLAP server, Mondrian.

References

1. Bach Pedersen, Torben; S.
Jensen (December 2001). "Multidimensional Database Technology". Distributed Systems Online (IEEE): 40-46. ISSN 0018-9162. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=00970558.

In the OLAP world, there are mainly two different types: Multidimensional OLAP (MOLAP) and Relational OLAP (ROLAP). Hybrid OLAP (HOLAP) refers to technologies that combine MOLAP and ROLAP.

MOLAP

This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a multidimensional cube. The storage is not in the relational database, but in proprietary formats.

Advantages:
• Excellent performance: MOLAP cubes are built for fast data retrieval, and are optimal for slicing and dicing operations.
• Can perform complex calculations: All calculations have been pre-generated when the cube is created. Hence, complex calculations are not only doable, but they return quickly.

Disadvantages:
• Limited in the amount of data it can handle: Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This is not to say that the data in the cube cannot be derived from a large amount of data. Indeed, this is possible. But in this case, only summary-level information will be included in the cube itself.
• Requires additional investment: Cube technology is often proprietary and does not already exist in the organization. Therefore, to adopt MOLAP technology, chances are additional investments in human and capital resources are needed.

ROLAP

This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP's slicing and dicing functionality. In essence, each action of slicing and dicing is equivalent to adding a "WHERE" clause in the SQL statement.

Advantages:
• Can handle large amounts of data: The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database.
In other words, ROLAP itself places no limitation on data amount.
- Can leverage functionalities inherent in the relational database: Relational databases often already come with a host of functionalities. ROLAP technologies, since they sit on top of the relational database, can therefore leverage these functionalities.

Disadvantages:

- Performance can be slow: Because each ROLAP report is essentially a SQL query (or multiple SQL queries) in the relational database, the query time can be long if the underlying data size is large.
- Limited by SQL functionalities: Because ROLAP technology mainly relies on generating SQL statements to query the relational database, and SQL statements do not fit all needs (for example, it is difficult to perform complex calculations using SQL), ROLAP technologies are traditionally limited by what SQL can do. ROLAP vendors have mitigated this risk by building complex out-of-the-box functions into the tool, as well as the ability for users to define their own functions.

HOLAP

HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance. When detail information is needed, HOLAP can "drill through" from the cube into the underlying relational data.

MOLAP or ROLAP

OLAP tools take you a step beyond query and reporting tools. Via OLAP tools, data is represented using a multidimensional model rather than the more traditional tabular data model. The traditional model defines a database schema that focuses on modeling a process or function, and the information is viewed as a set of transactions, each of which occurred at some single point in time. The multidimensional model usually defines a star schema, viewing data not as a single event but rather as the cumulative effect of events over some period of time, such as weeks, then months, then years.
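As a minimal sketch of the star-schema view just described, the following Python script uses the standard-library sqlite3 module. The table names (sales_fact, time_dim), columns, and figures are illustrative assumptions, not taken from any OLAP product. It rolls the same facts up by year and by month, and expresses a slice as an extra WHERE condition, in the spirit of the ROLAP description.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to one time dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE sales_fact (time_id INTEGER REFERENCES time_dim(time_id),
                             region TEXT, sale_amount REAL);
""")
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)",
                 [(1, 2007, 12), (2, 2008, 1), (3, 2008, 2)])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)",
                 [(1, "East", 100.0), (2, "East", 150.0),
                  (2, "West", 50.0), (3, "West", 200.0)])

def sales_by(level_columns, where=""):
    """Sum sale_amount grouped by the given time_dim columns.

    `where` is an optional SQL condition representing a slice. It is pasted
    into the statement for illustration only; real code should use parameters.
    """
    cols = ", ".join(level_columns)
    sql = (f"SELECT {cols}, SUM(f.sale_amount) "
           f"FROM sales_fact f JOIN time_dim t ON f.time_id = t.time_id "
           f"{'WHERE ' + where if where else ''} "
           f"GROUP BY {cols} ORDER BY {cols}")
    return conn.execute(sql).fetchall()

by_year = sales_by(["t.year"])                       # coarse view
by_month = sales_by(["t.year", "t.month"])           # finer view of the same facts
east_by_year = sales_by(["t.year"], where="f.region = 'East'")  # a slice

print(by_year)
print(by_month)
print(east_by_year)
```

The same fact rows answer all three questions; only the grouping columns and the filter change.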
With OLAP tools, the user generally views the data in grids or crosstabs that can be pivoted to offer different perspectives on the data. OLAP also enables interactive querying of the data. For example, a user can look at information at one level of aggregation (such as a sales region) and then drill down to more detailed information, such as sales by state, then city, then store.

OLAP tools do not dictate how the data is actually stored. Given that, it's not surprising that there are multiple ways to store the data, including storing it in a dedicated multidimensional database (also referred to as MOLAP or MDD). Examples include Arbor Software's Essbase and Oracle Express Server. The other choice involves storing the data in relational databases and having an OLAP tool work directly against the data, referred to as relational OLAP (also referred to as ROLAP or RDBMS). Examples include MicroStrategy's DSS Server and related products, Informix's Informix-MetaCube, Information Advantage's Decision Suite, and Platinum Technology's Platinum InfoBeacon. (Some also include Red Brick's Warehouse in this category, but it isn't really an OLAP tool. Rather, it is a relational database optimized for performing the types of operations that ROLAP tools need.)

ROLAP versus MOLAP

Relational OLAP (ROLAP)              | Multidimensional OLAP (MOLAP)
Scale to terabytes                   | Under 50 GB capacity
Managing of summary tables/indexes   | Instant response
Platform portability                 | Easier to implement
SMP and MPP                          | SMP only
Secure                               | Integrated meta data
Proven technology                    |
Data modeling required               |

Data warehouses can be implemented on standard or extended relational DBMSs, called relational OLAP (ROLAP) servers. These servers assume that data is stored in relational databases, and they support extensions to SQL and special access and implementation methods to efficiently implement the multidimensional data model and operations.
In contrast, multidimensional OLAP (MOLAP) servers directly store multidimensional data in special data structures (like arrays or cubes) and implement OLAP operations over these data in free-form fashion (free-form within the framework of the DBMS that holds the multidimensional data). MOLAP servers have sparsely populated matrices, numeric data, and a rigid structure of data once the data enters the MOLAP DBMS framework.

Relational Databases

ROLAP servers contain both numeric and textual data, serving a much wider purpose than their MOLAP counterparts. Unlike MOLAP DBMSs (supported by specialized database management systems), ROLAP DBMSs (or RDBMSs) are supported by relational technology. RDBMSs support numeric, textual, spatial, audio, graphic, and video data, general-purpose DSS analysis, freely structured data, numerous indexes, and star schemas. ROLAP servers can have both disciplined and ad hoc usage and can contain both detailed and summarized data. ROLAP supports large databases while enabling good performance, platform portability, exploitation of hardware advances such as parallel processing, robust security, multi-user concurrent access (including read-write with locking), recognized standards, and openness to multiple vendors' tools. ROLAP is based on familiar, proven, and already selected technologies.

ROLAP tools take advantage of parallel RDBMSs for those parts of the application processed using SQL (SQL not being a multidimensional access or processing language). So, although it is always possible to store multidimensional data in a number of relational tables (the star schema), SQL does not, by itself, support multidimensional manipulation of calculations. Therefore, ROLAP products must do these calculations either in the client software or in an intermediate server engine. Note, however, that Informix has integrated the ROLAP calculation engine into the RDBMS, effectively mitigating the above disadvantage.
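The division of labor described above, with SQL doing the set-oriented work and a client or intermediate engine doing the multidimensional calculation, can be sketched as follows. This is an illustrative assumption about how such a layer might look, not the design of any particular ROLAP product; the sales table and its values are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("East", "Q1", 100.0), ("East", "Q2", 300.0),
    ("West", "Q1", 50.0), ("West", "Q2", 150.0), ("West", "Q2", 50.0),
])

# Step 1: the relational database groups and sums the detail rows.
rows = conn.execute(
    "SELECT region, quarter, SUM(amount) FROM sales GROUP BY region, quarter"
).fetchall()

# Step 2: the intermediate layer pivots the flat rows into a region-by-quarter grid.
grid = defaultdict(dict)
for region, quarter, total in rows:
    grid[region][quarter] = total

# Step 3: it adds a cross-cell calculation: each cell as a share of its row total.
share = {
    region: {q: v / sum(cells.values()) for q, v in cells.items()}
    for region, cells in grid.items()
}

for region in sorted(grid):
    print(region, grid[region], share[region])
```

Keeping the pivot and the derived measure outside the database mirrors the "client software or intermediate server engine" arrangement the text describes.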
Multidimensional Databases

MDDs deliver impressive query performance by pre-calculating or pre-consolidating transactional data rather than calculating on the fly. (MDDs pre-calculate every measure at every hierarchy summary level at load time and store the results in efficiently indexed cells for immediate retrieval.) However, to fully pre-consolidate incoming data, MDDs require an enormous amount of overhead, both in processing time and in storage. An input file of 200MB can easily expand to 5GB; obviously, a file this size takes many minutes to load and consolidate. As a result, MDDs do not scale, making them a lackluster choice for the enterprise atomic-level data in the data warehouse. However, MDDs are great candidates for departmental data marts under 50GB.

To manage large amounts of data, MDD servers aggregate data along hierarchies. Not only do hierarchies provide a mechanism for aggregating data, they also provide a technique for navigation. The ability to navigate data by zooming in and out of detail is key. With MDDs, application design is essentially the definition of dimensions and calculation rules, while the RDBMS requires that the database schema be a star or snowflake. With MDDs, for example, it is common to see the structure of time separated from the repetition of time. One dimension may be the structure of a year: month, quarter, half-year, and year. A separate dimension might be different years: 1996, 1997, and so on. Adding a new year to the MDD simply means adding a new member to the calendar dimension. Adding a new year to an RDBMS usually requires that each month, quarter, half-year and year also be added.

In General

Usually, a scalable, parallel database is used for the large, atomic, organizationally structured data warehouse, and subsets or summarized data from the warehouse are extracted and replicated to proprietary MDDs.
Because MDD vendors have enabled drill-through features, when a user reaches the limit of what is actually stored in the MDD and seeks more detailed data, he or she can drill through to the detail stored in the enterprise database. However, the drill-through functionality usually requires creating views for every possible query. As relational database vendors incorporate sophisticated analytical multidimensional features into their core database technology, the resulting capacity for higher performance, scalability, and parallelism will enable more sophisticated analysis. Proprietary database and nonintegrated relational OLAP query tool vendors will find it difficult to compete with this integrated ROLAP solution.

Both storage methods have strengths and weaknesses; the weaknesses, however, are being rapidly addressed by the respective vendors. Currently, data warehouses are predominantly built using RDBMSs. If you have a warehouse built on a relational database and you want to perform OLAP analysis against it, ROLAP is a natural fit. This isn't to say that MDDs can't be a part of your data warehouse solution. It's just that MDDs aren't currently well suited to large volumes of data (10-50GB is fine, but anything over 50GB is stretching their capabilities). If you really want the functionality benefits that come with MDD, consider subsetting the data into smaller MDD-based data marts.

When deciding which technology to go for, consider:

1) Performance: How fast will the system appear to the end user? MDD server vendors believe this is a key point in their favor. MDD server databases typically contain indexes that provide direct access to the data, making MDD servers quicker when trying to solve a multidimensional business problem. However, MDDs have significant performance differences due to the differing ability of data models to be held in memory, sparsity handling, and use of data compression.
And the relational database vendors argue that they have developed performance improvement techniques, such as IBM's DB2 Starburst optimizer and Red Brick's Warehouse VPT STARindex capabilities. (Before you use performance as an objective measure for selecting an OLAP server, remember that OLAP systems are about effectiveness (how to make better decisions), not efficiency (how to make faster decisions).)

2) Data volume and scalability: While MDD servers can handle up to 50GB of storage, RDBMS servers can handle hundreds of gigabytes and terabytes. And, although MDD servers can require up to 50% less disk space than relational databases to store the same amount of data (because of relational indexes and overhead), relational databases have more capacity. MDD advocates believe that you should perform multidimensional modeling on summary, not detail, information, thus mitigating the need for large databases.

In addition to performance, data volume, and scalability, you should consider which architecture better supports systems management and data distribution, which vendors have a better user interface and functionality, which architecture is easier to understand, which architecture better handles aggregation and complex calculations, and your perception of open versus proprietary architectures. Besides these issues, you must also consider which architecture will be a more strategic technology. In fact, MDD servers and RDBMS products can be used together: one for fast responses, the other for access to large databases.

What if?

IF
A. You require write access for what-if analysis
B. Your data is under 50 GB
C. Your timetable to implement is 60-90 days
D. You don't have DBA or data modeler personnel
E. You're developing a general-purpose application for inventory movement or assets management
THEN consider an MDD solution for your data mart (like Oracle Express, Arbor's Essbase, and Pilot's Lightship).

IF
A. Your data is over 100 GB
B. You have a "read-only" requirement
THEN consider an RDBMS for your data mart.

IF
A. Your data is over 1TB
B. You need data mining at a detail level
THEN consider an MPP hardware platform like IBM's SP and DB2 RDBMS.

If you've decided to build a data mart using an MDD, you don't need a data modeler. Rather, you need an MDD data mart application builder who will design the business model (identifying dimensions and defining business measures based on the source systems identified). Prior to building separate stovepipe data marts, understand that at some point you will need to: 1) integrate and consolidate these data marts at the detail enterprise level; 2) load the MDD data marts; and 3) drill through from the data marts to the detail. Note that your data mart may outgrow the storage limitations of an MDD, creating the need for an RDBMS (in turn requiring data modeling similar to constructing the detailed, atomic enterprise-level RDBMS).

What are the major differences between ROLAP and MOLAP?

Douglas Hackney, Information Management Online, August 1, 1999

Q: What are the major differences between ROLAP and MOLAP? Could you explain with examples?

A: Doug Hackney's Answer: This is a condensation of a full day's worth of seminar information into a few paragraphs. Please forgive the brevity.

MOLAP (multidimensional OLAP) tools utilize a pre-calculated data set, commonly referred to as a data cube, that contains all the possible answers to a given range of questions. MOLAP tools feature very fast response, and the ability to quickly write back data into the data set (budgeting and forecasting are common applications).
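To make the idea of a pre-calculated cube concrete, here is a small Python sketch. It assumes three hypothetical dimensions (region, product, quarter) and a made-up "*" marker meaning "all members"; real MOLAP engines use their own proprietary storage, so this only illustrates the principle of computing every answer in advance.

```python
from itertools import product
from collections import defaultdict

# Hypothetical detail rows: (region, product, quarter, amount).
facts = [
    ("East", "Widget", "Q1", 100.0),
    ("East", "Gadget", "Q1", 40.0),
    ("West", "Widget", "Q2", 60.0),
]

ALL = "*"  # marker for "aggregated over every member of this dimension"

def build_cube(rows):
    """Pre-calculate the total for every combination of member-or-ALL."""
    cube = defaultdict(float)
    for *dims, amount in rows:
        # Each detail row contributes to 2 ** len(dims) cells.
        for key in product(*[(d, ALL) for d in dims]):
            cube[key] += amount
    return dict(cube)

cube = build_cube(facts)

# Answering a question is now a dictionary lookup, not a scan of the detail rows.
print(cube[("East", ALL, ALL)])     # total for East
print(cube[(ALL, "Widget", ALL)])   # total for Widget
print(len(facts), "detail rows ->", len(cube), "stored cells")
```

Lookups are fast because nothing is computed at query time. The cost is paid at load time and in storage, since the number of stored cells grows with every added dimension.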
Primary downsides of MOLAP tools are limited scalability (the cubes get very big, very fast when you start to add dimensions and more detailed data), inability to contain detailed data (you are forced to use summary data unless your data set is very small), and load time of the cubes. The most common examples of MOLAP tools are Hyperion (Arbor) Essbase and Oracle (IRI) Express. MOLAP tools are best used for users who have "bounded" problem sets (they want to ask the same range of questions every day/week/month on an updated cube, e.g., finance).

ROLAP (relational OLAP) tools do not use pre-calculated data cubes. Instead, they intercept the query and pose the question to the standard relational database and its tables in order to bring back the data required to answer the question. ROLAP tools feature the ability to ask any question (you are not limited to the contents of a cube) and the ability to drill down to the lowest level of detail in the database. Primary downsides of ROLAP tools are slow response and some limitations on scalability (depending on the technology architecture that is utilized). The most common examples of ROLAP tools are MicroStrategy and Sterling (Information Advantage). ROLAP tools are best used for users who have "unbounded" problem sets (they don't have any idea what they want to ask from day to day, e.g., marketing). It is very important to pay close attention to the underlying architecture of ROLAP tools, as some tools are very "scalability challenged."

HOLAP (hybrid OLAP) addresses the shortcomings of both of these technologies by combining the capabilities of both approaches. HOLAP tools can utilize both pre-calculated cubes and relational data sources. The most common example of HOLAP architecture is OLAP Services in Microsoft SQL Server 7.0. OLAP vendors of all stripes are working to make their products marketable as "hybrid" as quickly as possible.
It is critically important to closely examine the architectures of these "repackaged/repositioned" offerings, as their "HOLAP" claims may be more marketing hype than architectural reality.

Douglas Hackney is the president of Enterprise Group Ltd., a consulting and knowledge-transfer company specializing in designing and implementing data warehouses and associated information delivery systems. He can be reached at www.egltd.com.

MOLAP vs. ROLAP

2/14/2003, by ITtoolbox Popular Q&A Team for ITtoolbox, as adapted from BI-Select discussion group

Summary: What is the difference between MOLAP and ROLAP?

Full Article:

Disclaimer: Contents are not reviewed for correctness and are not endorsed or recommended by ITtoolbox or any vendor. Popular Q&A contents include summarized information from ITtoolbox BI-Select discussion unless otherwise noted.

1. Adapted from response by Michael on Monday, February 10, 2003

MOLAP pre-summarizes the data to improve performance in querying and displaying the data. Products such as MS SQL Server Analysis Services even let you predetermine how much of the data to pre-build for performance purposes. Is MOLAP proprietary? Certainly, each vendor has its own way of summarizing the data, but DB vendors like Oracle, MS and IBM still make that data accessible. In some MOLAP configurations the relational data is just duplicated in some format, but that is not an absolute. ROLAP may be applicable for very large, infrequently used data access. Slow processing, slow query response and huge storage requirements are ROLAP's chief characteristics.

2. Adapted from response by David on Monday, February 10, 2003

MOLAP does require data duplication, as the users are going against a pre-summarized cube that is a separate structure from the main data warehouse.
How much duplication depends on the amount of detail you want to present to users and on how many different cubes you end up building. Obviously, a cube only reproduces a piece of the warehouse. The size of a warehouse doesn't have much to do with the front-end technology used to access it. ROLAP reduces your overall storage requirements since you are going directly against your warehouse. Also, large warehouses can be designed to present data to users in a timely fashion. As new tools and new design approaches reduce query times, I think we're more likely to see a redirection towards direct access of the data warehouse than a proliferation of reporting structures (like cubes) built from the main warehouse. This is not to say that MOLAP or HOLAP is going away. It just seems to me that the different approaches are suited to different kinds of users. There's still no single way to fulfill all of your organization's needs.

3. Adapted from response by Michael on Monday, February 10, 2003

Just because it's pre-summarized doesn't necessarily define MOLAP as "duplicated." The data is in a different format because it's pre-summarized. The whole evolution has been to improve access to information for end users, including query response time. ROLAP may indeed reduce storage, but how can you compare the cost of disk to response time? To get the same response time from ROLAP you may need to dramatically increase processing power, so what have you really saved? Granted, a good data warehouse design may include some level of summarization. At this stage I'm not convinced ROLAP is the future direction at the expense of MOLAP or HOLAP.

Relational online analytical processing (ROLAP) is a form of online analytical processing (OLAP) that performs dynamic multidimensional analysis of data stored in a relational database rather than in a multidimensional database (which is usually considered the OLAP standard).
Data processing may take place within the database system, a mid-tier server, or the client. In a two-tiered architecture, the user submits a Structured Query Language (SQL) query to the database and receives back the requested data. In a three-tiered architecture, the user submits a request for multidimensional analysis and the ROLAP engine converts the request to SQL for submission to the database. Then the operation is performed in reverse: the engine converts the resulting data from SQL to a multidimensional format before it is returned to the client for viewing. As is typical of relational databases, some queries are created and stored in advance. If the desired information is available, then that query will be used, which saves time. Otherwise, the query is created on the fly from the user request. Microsoft Access's PivotTable is an example of a three-tiered architecture.

Since ROLAP uses a relational database, it requires more processing time and/or disk space to perform some of the tasks that multidimensional databases are designed for. However, ROLAP supports larger user groups and greater amounts of data and is often used when these capacities are crucial, such as in a large and complex department of an enterprise.

Advantages/Disadvantages of MOLAP, ROLAP and HOLAP

9/13/2004, by ITtoolbox Popular Q&A Team for ITtoolbox, as adapted from BI-CAREER discussion group

Summary: What are the advantages and disadvantages of MOLAP, ROLAP and HOLAP?

Full Article:

Disclaimer: Contents are not reviewed for correctness and are not endorsed or recommended by ITtoolbox or any vendor. Popular Q&A contents include summarized information from ITtoolbox BI-CAREER discussion unless otherwise noted.

Adapted from response by Ezekiel on Thursday, August 19, 2004

In the OLAP world, there are mainly two different types: Multidimensional OLAP (MOLAP) and Relational OLAP (ROLAP). Hybrid OLAP (HOLAP) refers to technologies that combine MOLAP and ROLAP.
MOLAP

This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a multidimensional cube. The storage is not in the relational database, but in proprietary formats.

Advantages: Excellent performance. MOLAP cubes are built for fast data retrieval and are optimal for slicing and dicing operations. They can also perform complex calculations. All calculations have been pre-generated when the cube is created. Hence, complex calculations are not only doable, but they return quickly.

Disadvantages: It is limited in the amount of data it can handle. Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This is not to say that the data in the cube cannot be derived from a large amount of data. Indeed, this is possible. But in this case, only summary-level information will be included in the cube itself. It requires an additional investment. Cube technology is often proprietary and does not already exist in the organization. Therefore, adopting MOLAP technology is likely to require additional investment in human and capital resources.

ROLAP

This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP's slicing and dicing functionality. In essence, each action of slicing and dicing is equivalent to adding a "WHERE" clause in the SQL statement.

Advantages: It can handle large amounts of data. The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database. In other words, ROLAP itself places no limitation on data amount. It can leverage functionalities inherent in the relational database. Relational databases often already come with a host of functionalities. ROLAP technologies, since they sit on top of the relational database, can therefore leverage these functionalities.

Disadvantages: Performance can be slow.
Because each ROLAP report is essentially a SQL query (or multiple SQL queries) in the relational database, the query time can be long if the underlying data size is large. It is limited by SQL functionality. Because ROLAP technology mainly relies on generating SQL statements to query the relational database, and SQL statements do not fit all needs (for example, it is difficult to perform complex calculations using SQL), ROLAP technologies are traditionally limited by what SQL can do. ROLAP vendors have mitigated this risk by building complex out-of-the-box functions into the tool, as well as the ability for users to define their own functions.

HOLAP

HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance. When detail information is needed, HOLAP can "drill through" from the cube into the underlying relational data.

What's a ROLAP Tool?

Step 1: The Concept

ROLAP stands for Relational On-line Analytical Processing. It is a type of reporting tool that provides users with the capability to do efficient "drill" or "trend" analysis on large volumes of atomic or summarized data stored in a relational database.

Step 2: Drill Analysis

Data may have a hierarchical relationship such as year, month, and day. ROLAP tools let users analyze FACT variations at different levels of the hierarchy. There are different types of drill analysis, such as drill-down (using a lower-level attribute), drill-up (using a higher-level attribute), drill-within (using an attribute from the same dimension that is not in the main hierarchy), drill-anywhere (using attributes from any dimension), and drill-across (using attributes from other dimensions).
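Drill-down and drill-up along the year, month, day hierarchy can be sketched in a few lines of plain Python. The fact rows, the HIERARCHY list, and the helper names below are hypothetical and chosen only to illustrate the idea; a ROLAP tool would generate equivalent GROUP BY queries instead.

```python
from collections import defaultdict

# Hypothetical hierarchy for a time dimension, coarsest level first.
HIERARCHY = ["year", "month", "day"]

# Hypothetical fact rows: hierarchy attributes plus one measure.
facts = [
    {"year": 2000, "month": 1, "day": 5, "sales": 10.0},
    {"year": 2000, "month": 1, "day": 9, "sales": 20.0},
    {"year": 2000, "month": 2, "day": 1, "sales": 5.0},
    {"year": 2001, "month": 1, "day": 3, "sales": 7.0},
]

def aggregate(rows, level):
    """Sum sales grouped by all hierarchy attributes down to `level`."""
    depth = HIERARCHY.index(level) + 1
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[attr] for attr in HIERARCHY[:depth])
        totals[key] += row["sales"]
    return dict(totals)

def drill_down(level):
    """Return the next finer level (more detail), stopping at the finest."""
    return HIERARCHY[min(HIERARCHY.index(level) + 1, len(HIERARCHY) - 1)]

def drill_up(level):
    """Return the next coarser level (less detail), stopping at the coarsest."""
    return HIERARCHY[max(HIERARCHY.index(level) - 1, 0)]

level = "year"
print(level, aggregate(facts, level))
level = drill_down(level)            # year -> month
print(level, aggregate(facts, level))
level = drill_up(level)              # month -> year
print(level, aggregate(facts, level))
```

The measure is the same throughout; drilling only changes how many hierarchy attributes take part in the grouping key.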
Step 3: Trend Analysis

Data warehouses and data marts contain historical data. ROLAP tools let users analyze FACT trends for different business perspectives (dimensions), at different levels of hierarchy, over a given time period.

Step 4: Rotation Analysis

ROLAP tools may present data in simple tabular reports (along an X-axis and a Y-axis), such as year 2000 sales in dollars (the intersection of the X-axis and Y-axis) by salesperson (listed along the Y-axis) for 12 months (listed along the X-axis). ROLAP tools let users perform rotation analysis (referred to as slicing and dicing) of tabular data by interchanging data between the X-axis and Y-axis, such as year 2000 sales in dollars by salesperson (listed along the X-axis) for 12 months (listed along the Y-axis).

Step 5: Data Storage

Irrespective of the data model, whether multidimensional or relational, ROLAP tools may be used to analyze data stored as columns and tables in a relational database. Data may be stored in denormalized or normalized dimension tables and normalized FACT tables. Storage requirements for atomic and summarized data are lower in a relational database than in a multidimensional database. ROLAP tools provide the capability to summarize or aggregate atomic data without having to store it.

Step 6: Data Access

SQL is the industry-standard access language for relational databases. Data may be stored in atomic or summarized format. A SQL query that reads atomic data, summarizes (or aggregates) it, and presents it to users in a particular format will take longer than a SQL query that reads already summarized data and presents it. Hence, query performance with a ROLAP tool is unpredictable. Also, it is not necessary to store aggregated data in a relational database.

The Evolution of OLAP

OLAP emerged in the 1990s as information systems evolved.
Its arrival made life easier for professionals by allowing fast access to data combined with multidimensional analysis capabilities. Learn more about its analysis and query features.

OLAP technology emerged with the evolution of information systems. Early systems stored large quantities of data, but retrieving it was complicated for end users and systems analysts alike. Clipper, for example, was a tool that made generating a report difficult. The difficulty was perhaps not so much the volume of data as the great complexity of a non-relational system, in which the data had to be hunted down across several files. To build a report of all the data on a company's customers, there were two big jobs: first finding the data, and then writing code to build the report in the desired format.

DBMSs (database management systems) evolved significantly alongside programming languages, which made life somewhat easier for systems analysts. The mountains of data could now be accessed in a somewhat simpler way, but this was still far from ideal, since users still depended on someone highly skilled in computing to obtain any report that had not been anticipated when the system was specified.

Following the evolution of systems, a new class of tools was introduced to the market in the 1990s, which made life easier for professionals by allowing fast access to data combined with multidimensional analysis capabilities. This class of tool was named OLAP (Online Analytical Processing). Its speed is satisfactory, with responses in around 5 seconds. Analysis is dynamic, in that users can run any query they want without depending on a specialist, and it is multidimensional and shared.
It lets users analyze data along multiple dimensions, such as region, product, time, and salesperson. Each dimension can also contain hierarchies; for example, the time dimension can contain the levels year, quarter, month, week, or day, and the region dimension can have the levels continent, country, state, city, or neighborhood.

Multidimensional analysis is the main characteristic of OLAP technology. It consists of viewing given cubes of information from different angles and at various levels of aggregation. The "cubes" are masses of data returned by queries against the database, which can be manipulated and viewed from countless angles (using the technique of slice-and-dice) and at different levels of aggregation (using the technique called "drill").

The OLAP queries will be run with the Oracle Discoverer tool, with the aim of extracting information from the database. It lets the administrator create conditions associated with folders (the equivalent of relational tables). Users can run existing queries or create their own, which gives them considerable freedom. Discoverer offers the basic drill and slice-and-dice facilities. Users are not obliged to descend the hierarchy through every level; they can skip a level if they wish. In addition, drill-down does not have to follow a predefined hierarchy; it can be performed on any field.

Analysis in OLAP is carried out through a set of features: Drill Across, Drill Down, Drill Up, Drill Through, Slice and Dice, Alerts, Ranking, Filters, Sorts, and Breaks, all of which ease access to the data.

Drill Across occurs when the user skips an intermediate level within the same dimension. Suppose a time dimension is composed of year, semester, quarter, month, and day. The user performs a Drill Across when going directly from year to quarter or month.
Drill Down occurs when the user increases the level of detail of the information, lowering the level of aggregation. Drill Up, the opposite of Drill Down, raises the level of aggregation, reducing the level of detail of the information. Drill Through occurs when the user moves from information contained in one dimension to information in another.

Slice and Dice is one of the main features of an OLAP tool. It is responsible for retrieving the microcube within OLAP, and also serves to change the position of a piece of information, swap rows and columns to aid user understanding, and rotate the cube whenever needed.

Alerts are used to highlight notable situations in report elements, based on conditions involving objects and variables. They serve to flag values that meet conditions, but not to isolate data by those conditions.

The Ranking option orders results from highest to lowest or lowest to highest, based on numeric objects in a report table, without affecting the underlying query.

Filters restrict the data retrieved by the user, making analysis easier directly in the document.

Sorts order information, in ascending or descending order.

Breaks separate the report into groups of information (blocks). For example, if the user needs to view information by city, they request a Break. Once this is done, the report is automatically grouped by city, summing the measurable values for each city.

Talk

Pargres: A Middleware for Parallel Processing of OLAP Queries on Database Clusters

Abstract: It is no longer hard to find information systems that handle databases in the terabyte range, and growing. This growth makes so-called high-cost queries, such as OLAP queries, more frequent every day.
The main characteristic of these queries is that they demand a great deal of processing time, scanning large volumes of data. Pargres aims to achieve high performance when processing high-cost queries over large databases, inexpensively, by using intra-query parallelism on database clusters. The approach is non-intrusive: application programs, queries to the DBMS, and the relational schema stay intact, which allows systems now running in sequential environments to be migrated immediately. As a result, query speedups are often above linear in the number of nodes. In addition, because it uses a cluster architecture, the solution also offers high availability. Being entirely based on free software, it is a low-cost solution for running heavy OLAP queries in a timely manner.

Understanding OLAP
The SQL Server 2000 database combined with OLAP technology is an agile and efficient way to manage all the data a company needs to operate. It can be applied in areas such as finance, sales, and marketing. Its main purpose is to keep the company running well while reducing its operating costs. OLAP (online analytical processing) is a way of organizing large databases full of information that is critically important to a company.
OLAP data is organized by the database administrator to match the way you analyze and manage data, so that creating the reports you need takes less time and effort. In other words, the OLAP tool prepares the cross-referencing of the data, while SQL Server 2000 manages all the information the company needs to operate. Microsoft SQL Server 2000 together with the OLAP tool is a flexible, easy-to-maintain technology that lets the administrator build solutions suited to the reality of the company, and it is presented as the most complete database and analysis product. Its main function is to organize data by level of detail, using the same variables you use to analyze the data. Variables are all the kinds of data you need to know for a given query. For example, a database with information about the staff and clients of a marketing company may have separate fields for sex, marital status, age, city of residence, and so on. All of this information is then calculated and stored in the system, ready for a given query. The information is stored in multidimensional cubes, which hold quantitative values, or measures, and allow the data to be viewed from several angles. These measures are organized into descriptive categories called dimensions. A dimension is a set of levels covering one aspect of the data. Information about when sales were made, for instance, could be organized into a time dimension with the levels year, quarter, month, and day. OLAP databases are called cubes because they combine several dimensions, such as time, geography, and products, with summarized data, such as sales figures or inventory counts.
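To make the idea of a cube concrete, the following Python sketch pre-aggregates a few invented sales records into one cell per combination of year, region, and product, and then sums the cells along a chosen slice. The records, the `DIMENSIONS` tuple, and the `build_cube` and `total` helpers are assumptions made for illustration; they do not reflect how SQL Server 2000 stores cubes internally.

```python
from collections import defaultdict

# Hypothetical flat records: (year, region, product, sales_amount).
RECORDS = [
    (2007, "South", "Pens", 10.0),
    (2007, "South", "Erasers", 4.0),
    (2007, "North", "Pens", 6.0),
    (2008, "South", "Pens", 12.0),
    (2008, "North", "Erasers", 3.0),
    (2008, "South", "Pens", 8.0),
]

DIMENSIONS = ("year", "region", "product")


def build_cube(records):
    """Pre-aggregate the measure into one cell per dimension combination."""
    cube = defaultdict(float)
    for year, region, product, amount in records:
        cube[(year, region, product)] += amount
    return dict(cube)


def total(cube, **fixed):
    """Sum the cells that match the fixed dimension values (a slice)."""
    result = 0.0
    for cell, amount in cube.items():
        coords = dict(zip(DIMENSIONS, cell))
        if all(coords[name] == value for name, value in fixed.items()):
            result += amount
    return result


cube = build_cube(RECORDS)
pens_2008 = total(cube, year=2008, product="Pens")  # fix two dimensions
south_all = total(cube, region="South")             # fix one dimension
```

Because the summarized values are computed once when the cube is built, answering a question such as "sales of pens in 2008" only requires reading the matching cells rather than rescanning every record.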
When you create an OLAP cube from a query, you turn a flat set of records into a structured hierarchy, or cube, which lets reports focus on the desired level of detail. One advantage of OLAP in SQL Server 2000 is that the user can add a dimension to the cube or remove one as needed. The response time of a multidimensional query depends on how many cells it requires. Microsoft SQL Server 2000 Meta Data Services is a set of services that lets the user manage metadata. Metadata is abstract, can be used in a development environment, and describes the structure and meaning of applications and processes. Its main function is to provide a way to store and manage metadata about information systems and applications. The technology consists of a repository engine, tools, application programming interfaces (APIs), standardized information models, a browser, and a Software Development Kit (SDK).

Unlike an OLAP system, which supports decision making, an OLTP system is responsible for making lookups faster and safer, reducing the time spent querying the databases. Its purpose is to ensure that a large number of small pieces of information are not lost, processing thousands of transactions a day, each containing a small portion of data. As a relational tool that processes one record at a time, OLTP feeds the database that makes up OLAP, which is multidimensional.

Because professionals need to acquire the knowledge to apply this technology, SISNEMA offers a course on exactly these tools. The course, OLAP Solutions Using Microsoft® SQL Server 2000™ Analysis Services, gives students the knowledge needed to understand the advantages OLAP brings to its users.
Participants will learn the basics of what an OLAP system is, how to use and manage the tools within SQL Server, how data is cross-referenced, and how to work with each side of the cube by separating measures from dimensions, and they will get to know advanced features for handling hierarchies. Before taking the course, students need basic knowledge of database design, administration, and implementation. SISNEMA offers the best IT training in the southern region of the country, and this course is no exception. Participants will be able to configure these tools and will discover all the features they include.

OLAP Technology: Learn More About This Important Tool
Applying an OLAP tool is indispensable precisely because it provides easy management, good maintainability, and low cost for the company. We interviewed OLAP instructor Marcelo Duarte. Read on for his views on this powerful technology. OLAP (online analytical processing) is a tool that gives organizations a way to access, view, and analyze large amounts of corporate data with high flexibility and performance, enabling management reports for decision support systems (DSS). OLAP technology uses your company's already consolidated data, which is stored for the later creation and analysis of cubes. The information is conceptually organized in cubes that store quantitative values, or measures. The measures are identified by two or more descriptive categories called dimensions, which form the structure of a cube. A dimension can be any view of the business that makes sense for your analysis, such as product, department, or time.
This multidimensional data model makes it simpler for users to formulate complex queries, create reports, perform comparative analyses, and view the subsets (slices) of greatest interest. For example, a cube with sales information could be made up of the dimensions time, region, product, customer, and scenario (budgeted or actual), plus measures. Typical measures would be sales value, units sold, costs, margin, and so on. For OLAP instructor Marcelo Duarte, the main value of a cube is that it guarantees multidimensional analysis of the information: “It organizes data into cubes, guaranteeing a multidimensional analysis of the information stored in your database. This means the information is analyzed, totaled, and made available for evaluation together with other information, so that the same fact can be seen from different angles.” An OLAP cube is made up of data dimensions. Each dimension corresponds to the storage of certain information. An MS-OLAP cube can contain up to 128 dimensions, and one of them is normally the time dimension: “The cube exists to analyze data multidimensionally; without it, that would not be possible,” says Marcelo. Applying an OLAP tool is indispensable for any company precisely because it provides good management: “It is extremely important, simply because it brings productivity to your company, helping with decision making and with the analyses management needs,” the instructor says. The main differences between this OLAP technology integrated with Microsoft SQL Server and other tools are its low total cost of ownership (TCO), its easy management and maintenance, and the integrated use of MS Excel as the standard PivotTable client through which users access and analyze the cubes.
“OLAP Services can be regarded as a feature of SQL Server that, like DTS, is part of the product, so no additional software module has to be purchased to benefit from it. It is a multidimensional database designed exclusively for OLAP applications, allowing cubes to be created from the information stored in the data warehouse and speeding up query execution,” says instructor Marcelo Duarte. The front-end tools available on the market are custom applications, Microsoft Excel, and third-party applications. Making good use of the OLAP tool requires solid knowledge of Microsoft SQL Server 2000, in both administration and programming, as well as knowledge of SQL Server DTS and of MDX, the query language for OLAP databases.

The Technology Behind OLAP
The SQL language offers many features for those who work with databases, but it is poorly suited to complex calculations and time series. OLAP (On-Line Analytical Processing) was created in the 1990s so that large volumes of data could be queried with greater flexibility and functionality. It is considered an application architecture, an interactive multidimensional system that lets analysts work according to their needs, discovering new patterns with their tools. OLAP users get fast access and a simple navigation style, which lets them view and analyze all corporate data with high performance. This access to information does not depend on the size, complexity, or source of the data. OLAP technology gives the user optimized access regardless of where the information comes from.
In OLAP, information is stored in multidimensional cubes, which hold quantitative values, or measures, and allow the data to be viewed from several angles. These measures are organized into descriptive categories called dimensions, which form the structure of the cube. This multidimensional model speeds up and simplifies searching and querying, as well as creating reports, performing comparative analyses, and viewing subsets. OLAP can analyze information in different ways because it navigates by drilling up and drilling down, that is, by moving between levels that show the information in more or less detail. Searching by dimension can follow a hierarchy: within the time dimension, for example, you can have levels such as years, months, and days. This form of navigation makes the user's search easier. The structures should operate in a client-server model with multi-user support and thus provide flexible queries, made possible by the architectural variants of OLAP. MOLAP (Multidimensional On-Line Analytical Processing) traditionally organizes, analyzes, and queries the data directly on the server. ROLAP (Relational On-Line Analytical Processing) lets multidimensional queries and two-dimensional tables be related on the server itself, keeping the cube there, which makes it possible to analyze large volumes of data. DOLAP (Desktop On-Line Analytical Processing): as mentioned earlier, OLAP works in a client-server model. The advantage of DOLAP is that it reduces network traffic, since the requested information is sent as a microcube and analyzed on the workstation, which keeps the server from being overloaded. More recently a new architecture was created, HOLAP (Hybrid On-Line Analytical Processing), developed as a fusion of the MOLAP and ROLAP technologies.
The advantage of this system is its structure, which combines the best features of each of the two approaches. Today OLAP is applied in many business areas and sectors, such as human resources, marketing, sales, finance, and manufacturing. Fundamental to running and managing a company well, OLAP is an essential tool for anyone who needs to query a database constantly and make decisions based on the information in it.

Database Tools Turn Information into Decisions
Gone are the days when database queries could be run only by systems or IT professionals. Today, medium-sized and large companies can count on much more agile and efficient methods: OLAP, BI, Data Mining, Meta Data Services, and the Data Warehouse, which turn the information held in the database into decisions, quickly and simply. OLAP (On-Line Analytical Processing) is an essential tool for anyone who constantly needs to query large volumes of data and make decisions based on them. It also lets the user work with complex calculations, providing queries with greater flexibility and functionality. OLAP technology gives the user access regardless of the size, complexity, or source of the data. The information is stored in multidimensional cubes, which hold quantitative values, or measures, and allow the data to be viewed from several angles. These measures are organized into descriptive categories called dimensions, which form the structure of the cube. Its applications are quite varied, and it is used in many areas of a company, including finance, sales, marketing, human resources, and manufacturing.
Its use helps a company run well and be managed well. Télcio Elui Cardoso works at Mercur. For him, OLAP has brought several benefits to the company: “OLAP technology, together with front-end tools, lets us carry out strategic analyses of the company's business simply, quickly, and intuitively. Besides making information available faster, it also considerably reduces the company's operating costs.” He notes that before this technology was deployed, monthly reports had to be generated to analyze and monitor customers: “Today, with OLAP in place, I don't need to monitor things and worry about the decisions I have to make about the results.”

Business Intelligence (BI) is the category of software that lets people access, analyze, and share information, quickly extract data from different sources, and organize analyses, simulations, and other important information, all of which improves a company's managerial and operational performance. Its main benefit is that it gives companies better knowledge of their customers, which helps in creating products and services and in meeting customer needs. The system then indicates the quantity of product to supply to each customer and the right salesperson to close the deal. However sophisticated and complex they may seem, BI systems have a simple purpose: turning stored information into decisions. A BI platform has two basic tools: Query & Reporting and OLAP. Query and reporting tools turn data into information, making opportunities and trends easier to understand and allowing users to make better, better-informed decisions.
The OLAP tool is extremely important to a BI platform because it enables sophisticated analyses (statistical comparisons, simulations, and scenario analyses) and dimensional calculations (allocations and apportionments). The Data Warehouse is a database created to concentrate management information drawn from the content of the company's other databases. This information, although stored and scattered across the system, is filtered and prepared to speed up analysis and avoid using the other databases that are in daily use. Although it can be applied to any database, it is used mostly in large companies with very large volumes of information. Data Mining is the search for, exploration of, and processing of useful information in large amounts of data. Its main function is to help users analyze data in relational databases and multidimensional OLAP cubes to discover new patterns and trends that can be used to make predictions. The data mining capabilities in SQL Server 2000 are tightly integrated with both OLAP and relational data sources. Data mining technology is applied in companies and government agencies by professionals in administration, finance, and marketing. Microsoft's Meta Data Services is a set of services that lets the user manage metadata. Metadata is abstract, can be used in a development environment, and describes the structure and meaning of data as well as the structure and meaning of applications and processes. The technology consists of a repository engine, tools, application programming interfaces (APIs), standardized information models, a browser, and a Software Development Kit (SDK), providing a way to store and manage metadata about information systems and applications.
Because professionals need to go deeper into these tools, SISNEMA offers a course that covers them, OLAP in particular. The course aims to give students the knowledge needed to design, plan, and deploy OLAP solutions using Microsoft® SQL Server 2000™ Analysis Services. Its main topics are: introduction to data warehousing, understanding the Analysis Services architecture, building dimensions with the dimension editor, working with cubes and measures, creating cube storage, implementing calculations with MDX, working with virtual cubes, and deploying an OLAP solution, among others.

The Importance of Using OLAP
To cover all the characteristics of this technology, SISNEMA Informática offers the course “Designing and Implementing OLAP Solutions”. Paulo Roberto Lein was one of the professionals looking for a tool to meet a day-to-day need, and he took this training to fill that gap. Paulo Roberto Lein, a systems analyst, needed a tool that would let certain departments of his organization both monitor and analyze the behavior of the information held in the system it uses, and even support decision making. More than ever, users and companies need to manipulate the data in their databases quickly and dynamically. OLAP meets this need for multidimensional data and also lets the user analyze the reasons behind the data obtained, besides providing a basis for identifying trends and discovering patterns.
His interest in the training first arose after he saw a demonstration of the tool by Nei Maldaner, Technology Director at SISNEMA Informática. The idea was reinforced by a visit to Hospital Moinhos de Vento, where he saw something similar working in practice, which left him very enthusiastic. “I wanted to understand the concepts of OLAP, the features of the tool that comes with SQL Server 2000, Analysis Manager, how to build cubes and understand their peculiarities, and how to make the final result available in Excel clients or on a Web page,” says Paulo, who works at IENH (Fundação Evangélica de Novo Hamburgo). Among what he learned, Paulo highlights the concepts of Data Warehouse and OLAP (On-Line Analytical Processing). The class saw how Analysis Manager and its services work and learned to create and understand dimensions, measures, and cubes, as well as storage and optimization options, the use of partitions, the power of MDX, virtual cubes, Excel and Web page clients, security, ways of deploying and updating OLAP, and even an example of data mining. “The course was extremely enlightening, motivating, and, above all, practical. The exercises had us carry out tasks typical of an OLAP solution tool, understanding its characteristics and using practical examples,” the systems analyst says. The course gives students the knowledge needed to understand the advantages OLAP brings to its users: how OLAP takes dynamic and static data, cross-references them, and stores them for later searches by the user, and how to analyze dimensions and which features to use in each case, among other concepts. In his view, instructor Bruno Ferrarese was always very helpful and available. Paulo says the instructor's experience is extensive and that the participants shared their experiences with one another.
“The whole class raised possible problems and solutions with the instructor, along with tips and techniques we could use day to day. He also drew attention to key points for solving problems that might come up,” he says. “We had the chance to get to know a little about each other, especially what led each of us to take the course, what motivated us and what we expected, as well as our concern to keep seeking knowledge in an area that is expanding and consolidating more and more,” says Lein. Paulo will apply what he learned at IENH, the foundation where he works, providing client applications (Excel and, above all, Web pages) to the board and departments so they can analyze and monitor the information held in the school's systems, according to their needs. Asked what he gained from the course, he was emphatic: “A gain in knowledge and in insight into what the tool can offer. It is certainly worth taking, because it presents a solution that many companies are looking for, and they are often unaware that tools of this caliber exist.” “The training also gave me a clearer view of market directions and trends regarding terms that are now more prominent: BI, OLAP, Data Mining, among others. I will certainly always recommend SISNEMA to other professionals,” Paulo concludes, satisfied. OLAP technology is the market's response to the need to manipulate large amounts of data quickly and satisfactorily, from several perspectives.

Characteristics of OLAP Technology
In today's globalized world, companies face greater competition and are expanding into new markets.
The speed with which executives obtain information and make decisions therefore determines a company's competitiveness and its long-term success. The market's response to this need is OLAP, which consists of organizing databases in an accessible and agile way. Through a simple style of navigation and querying, users can quickly analyze many scenarios, create ad hoc reports, and discover relevant facts and trends regardless of the size, complexity, and source of the corporate data. According to Alexandre Von Muhlen, a SISNEMA instructor, OLAP is easy to use and makes searching easier. “With OLAP the user has access to a more specific analysis of the information stored in the database, simply and without complications,” he stresses. The main characteristic of OLAP systems is that they allow a multidimensional conceptual view of a company's data. The data is modeled in a multidimensional structure known as a “cube”. The dimensions of the cube represent the components of the company's business, such as "customer", "product", "supplier", and "time". “A store, for example, that wants to put its information into an OLAP database will be able to run queries to find out how the month's sales went, which product sold the most, and other factors, since the system accumulates information,” the instructor explains. “The user will also have access to information on why a given product had more sales and why another sold more in the previous month. Some goods are in greater demand at certain times of the year, and through analysis the investor can project trends for the next year, knowing that demand for that product will be higher and that stock will need to be increased,” he concludes. Within each dimension of an OLAP model, data can be organized in a hierarchy that defines different levels of detail.
A user viewing data in an OLAP model can navigate up (drill up) or down (drill down) between levels to see the information in less or more detail, without difficulty. OLAP is divided into several architectures, which provide flexible queries. In a MOLAP (Multidimensional On-Line Analytical Processing) database, which acts as the manager of a multidimensional database, data is kept in arrays and indexed so as to give very good performance when accessing any element. A ROLAP (Relational On-Line Analytical Processing) system provides analysis by doing all the information processing on the database server and by executing SQL commands to retrieve the data for the OLAP server. DOLAP (Desktop On-Line Analytical Processing) reduces network traffic and keeps the server from being overloaded. There is also HOLAP (Hybrid On-Line Analytical Processing), which is the fusion of MOLAP and ROLAP. Its structure combines the best features of each. OLAP's applications are quite varied, and it is used in many areas of a company, including finance, sales, marketing, human resources, and manufacturing. Given the current state of technology and users' demand for fast, consistent response times, OLAP technology has emerged to meet this demand. It is well suited to anyone who constantly needs to query a database for any purpose.

Training in OLAP
For those who want to learn to configure and use all the features available in an OLAP system, SISNEMA offers the course Designing and Implementing OLAP Solutions. Since dynamism and competitiveness matter so much today, effective means of searching and responding are indispensable. Analysts and managers need information systems that are agile and can answer complex queries about business information.
With the multidimensional system of OLAP (On-Line Analytical Processing), users can analyze the database satisfactorily, but this requires solid knowledge. The course aims to show the advantages OLAP brings to its users. With OLAP, it is possible to manipulate and interactively examine large amounts of consolidated and detailed data from several perspectives. OLAP involves analyzing complex relationships among thousands of data items stored in multidimensional databases, in order to discover patterns and trends. According to SISNEMA's technology director, Nei Maldaner, the goal of OLAP is to take dynamic data and static data, cross-reference them in countless ways, and keep the results for later searches by the user. Configuring an OLAP program and gaining access to the data requires a clear understanding of the company's data models and of the analytical functions needed by executives and other data analysts. “In the training, we pass on methodologies to guide users in defining items and in analyzing which dimensions and features to use in cross-referencing strategies, and which to use in each case,” he says. Participants will learn the basics of what an OLAP system is, how to use and manage the tools within SQL Server, how data is cross-referenced, and how to work with each side of the cube by separating measures from dimensions, and they will get to know advanced features for handling hierarchies. “Since the data is cross-referenced, the cross-referencing can often take some time. So participants will learn strategies and ways to cross new information with what already exists,” Nei explains. The course also covers language strategies for manipulating the information in the database. Students learn to work with the cube as if it were an object, and how to join and share cubes.
Users need more than a static view of data that can no longer be manipulated. OLAP tools give these users greater power of manipulation, allowing them to analyze the reasons behind the data obtained. These tools are often based on multidimensional databases, which means the data has to be extracted and loaded into the system's proprietary structures, since there are no open standards for multidimensional data access. These characteristics make OLAP an essential technology in many kinds of applications. With the SISNEMA Informática course, participants will have access to all the information and concepts essential for working with such a system.

Learn About OLTP versus OLAP
Of essential importance to integrated IT management, OLTP (On-Line Transaction Processing) in SQL databases is fundamental to business transactions, handling day-to-day operations and tasks and optimizing the databases. Together with OLAP, OLTP is an effective and intelligent solution that creates favorable conditions for business management, making lookups much easier, faster, and safer and considerably reducing the time spent querying the databases. While OLAP works with historical data, for the purpose of analyzing information, OLTP works with the data that drives the business in real time, supporting everyday business operations. OLTP feeds the database that makes up OLAP, which is multidimensional; OLTP itself is a relational, process-oriented tool that works with current data and processes one record at a time, and it is not multidimensional like OLAP.
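The contrast between the two workloads can be sketched with Python's standard `sqlite3` module. This is only an illustration under stated assumptions: the `sales` table, its columns, and the sample rows are invented, and SQLite stands in for whatever relational engine a real OLTP system would use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, sold_on TEXT, product TEXT, amount REAL)"
)

# OLTP-style work: many small transactions, each touching a single record.
incoming = [
    ("2008-01-15", "Pens", 10.0),
    ("2008-01-20", "Erasers", 4.0),
    ("2008-02-03", "Pens", 6.0),
]
for sold_on, product, amount in incoming:
    with conn:  # commits one transaction per record
        conn.execute(
            "INSERT INTO sales (sold_on, product, amount) VALUES (?, ?, ?)",
            (sold_on, product, amount),
        )

# OLAP-style work: one query that scans the history and summarizes it.
monthly = conn.execute(
    "SELECT substr(sold_on, 1, 7) AS month, SUM(amount) "
    "FROM sales GROUP BY month ORDER BY month"
).fetchall()
```

The inserts each write one small record, while the final query reads every row to produce a per-month summary, which is the kind of result an OLAP cube would keep pre-computed.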
The purpose of OLTP is to ensure that a large number of small pieces of information are not lost: it processes thousands or millions of items per day, each containing a small portion of data. OLTP users frequently deal with one record at a time, which means the same task is executed countless times, since most of their reports are run against an entire table. Searches and queries are therefore close to instantaneous; the more extensive ones, known as join queries, involve multiple tables and should run in seconds or minutes. Although it is often cited as a term for databases, OLTP can be used generically to describe a transaction processing environment, one that speeds up querying and supports OLAP. SISNEMA Informática invests considerably in databases because it recognizes the importance of this technology today, promoting seminars and talks open to all professionals seeking to broaden their knowledge of database management and administration.
Data Warehousing
Data Warehousing: Integrating Organizational Data for Decision Making
1. Introduction
Database Management Systems (DBMSs) are widely used by organizations to maintain the data that document everyday operations. In applications whose data are updated frequently, such as operational data, transactions typically make small changes, and a large number of transactions must be processed efficiently. Recently, however, organizations have emphasized applications in which current and historical data are analyzed and explored, identifying useful trends and creating summaries of the data, to support high-level decision making. Such applications are referred to as decision support, and they have grown rapidly within the sector.
DBMSs offer a range of features, such as techniques for query optimization and indexing, support for complex queries, facilities for defining and using views, and so on. The use of views has gained popularity because of its usefulness in applications involving complex data analysis. Organizations can consolidate information from several databases in a data warehouse by bringing tables from many data sources into one place, or by materializing a view defined over tables from several sources. Data warehousing has become widespread, and many products are now available for creating and managing warehouses of data from multiple databases. Data warehousing is a collection of decision support technologies aimed at enabling executives, managers and analysts to make better and faster decisions. Data warehousing technologies have been deployed in many industries: customer support, sales (according to user profile), financial services (for claims analysis, risk analysis, credit card analysis and fraud detection), transportation (for fleet management), telecommunications (for call analysis and fraud detection), energy usage analysis, healthcare (for analysis of the effects of a disease), etc. A data warehouse is a "subject-oriented", integrated, time-variant and non-volatile collection of data, used primarily for corporate decision making. Typically, the data warehouse is maintained separately from the organization's operational databases. There are many reasons for doing this. The data warehouse supports Online Analytical Processing (OLAP), whose functional and performance requirements are quite different from those of the Online Transaction Processing (OLTP) applications traditionally supported by operational databases.
OLTP applications typically process data in the defined order of the transactions that make up an organization's daily operations. These tasks are structured, repetitive, atomic transactions. Consistency and recoverability of the database are critical, and maximizing transaction throughput is the key performance metric. Consequently, the database is designed to reflect the operational semantics of known applications and, in particular, to minimize concurrency conflicts. Data warehouses, in contrast, are designed to support decisions. Historical, summarized and consolidated data are more important than detailed, individual records. Because data warehouses contain consolidated data, possibly from several operational databases and potentially covering long periods of time, they tend to be orders of magnitude larger than operational databases; data warehouse systems are designed to support sizes from hundreds of gigabytes to terabytes. To facilitate complex analysis and visualization, the data in a data warehouse are typically modeled multidimensionally. For example, in a sales data warehouse, the date of sale, the place of sale, the salesperson and the product may be some of the dimensions of interest. These dimensions are often hierarchical; the date of sale may be organized as a day-month-year hierarchy, the product as a product-category-industry hierarchy, etc. Decision support requires data that may be missing from operational databases; for example, understanding trends or making predictions requires historical data, whereas operational databases store only current data. Decision support usually requires consolidated data from many heterogeneous sources: these may include external sources in addition to several operational databases.
The different sources may contain data of varying quality, or use inconsistent representations, codes and formats, which need to be reconciled. Finally, supporting multidimensional data models and typical OLAP operations requires special data organization, access methods and implementation methods that are not generally provided by commercial DBMSs, which target OLTP. For all these reasons, data warehouses are implemented separately from operational databases.
2. Decision Support
Decision Support Systems (DSS) are fundamental tools for the evolution of the decision-making process within this new business reality, since business activities and customer needs are constantly changing, which makes decisions a factor of the utmost importance. DSS must keep up with this trend by being flexible and adaptable to their environment. These systems assist the executive in all phases of decision making, mainly in the stages of development, comparison and classification of risks, and they also provide support for choosing a good alternative. Combining concepts from business administration and computing, DSS have become an important tool for managers in their constant pursuit of total quality. DSS make extensive use of "what-if" rules to generate data and information for simulations, scenarios, etc., for example: (i) determining the most suitable location for a commercial unit or a point of sale (POS), (ii) preparing budgets with several alternatives, (iii) expanding, reducing or segmenting businesses, together with a possible customer profile. DSS allow data and parts to be coordinated and integrated toward common goals, providing information that enables better business decisions.
The DSS architecture consists of a database and a model base, and of three systems: (i) a Database Management System (DBMS), (ii) a Model Base Management System (MBMS) and (iii) an Interface Manager (IM). Organizational decision making requires a comprehensive view of all aspects of a company, and many organizations have created consolidated data warehouses that contain data from several databases maintained by different business units, together with historical and summarized information. The trend toward data warehousing is complemented by an increased emphasis on powerful analysis tools, of which three classes are available. One class comprises systems that support queries that typically involve group-by and aggregation operators, with excellent support for complex Boolean conditions and statistical functions. Applications dominated by such queries are called Online Analytic Processing, or OLAP. These systems support a style of querying in which the data are best thought of as multidimensional arrays, and they are influenced by end-user tools such as spreadsheets. There are also DBMSs that support traditional SQL-style queries but are designed to support OLAP queries efficiently as well. Many relational DBMSs are currently enhancing their products in this direction, investing extra time to distinguish between specialized OLAP systems and relational DBMSs with added support for OLAP queries. The third class is motivated by the desire to find interesting or unexpected trends and patterns in large data sets. In exploratory data analysis, although an analyst may recognize an "interesting pattern", it is very difficult to formulate a query that captures the essence of an "interesting pattern".
The amount of data in many applications is too large to permit manual analysis or traditional statistical analysis. The goal of data mining is to support exploratory analysis over large data sets. Evaluating OLAP or data mining queries over globally distributed data is an arduous task. The solution is to create a centralized repository of all the data, that is, a data warehouse. The availability of a warehouse facilitates the application of OLAP and data mining tools, and the desire to apply such analysis tools is a strong motivation for building a data warehouse.
3. Data Warehousing
In general terms, a Data Warehouse (DW) is a large database that stores integrated information drawn from an organization's operational databases. One of the keywords in this imprecise definition, the word at the very heart of the DW, is integration. The architects of a DW must transform and integrate operational data from inside and outside the company and then place them in a DW, which will be used as a strategic tool by the clients and/or users who make key business decisions based on the available information. This process of transformation and integration can be the most challenging part of building a DW, owing to the nature of most operational systems and to the specifics of how the data processed in them were designed. The information in a DW is, by necessity, designed very differently. A company's stored data represent a resource, but in general they rarely serve as a resource in their original state. It is by extracting data and integrating them into the DW that an organization turns operational data into a strategic tool. Integrated, analytical information, rather than raw data, allows a company to make decisions about important missions and strategic business matters.
Whether mandated by the upper levels of the organization as part of a broader infrastructure project, or initiated by a single department to meet specific needs, the DW is shaped by the collaborative effort of many parties, including DSS (Decision Support System) analysts and database programmers. Data warehouses contain consolidated data from many sources, and they provide summarized information covering a long period. These systems are much larger than other kinds of databases; sizes ranging from gigabytes to terabytes are common. They allow complex queries and faster response times. These characteristics differentiate warehouse applications from OLTP applications, and they call for different DBMS design and implementation techniques to achieve satisfactory results. A distributed DBMS with good scalability and high availability is a requirement for a wide range of warehouses. Operational databases are those organized around the day-to-day access and modification operations. Data from these databases and from other external sources are extracted using gateways, or standard external interfaces supported by the underlying DBMSs. A gateway is an application programming interface that allows client programs to generate SQL statements to be executed at a server. Standards such as Open Database Connectivity (ODBC) and Object Linking and Embedding for Databases (OLE-DB) from Microsoft, and Java Database Connectivity (JDBC), are emerging for gateways.
4. Views and Decision Support
Views are widely used in decision support applications. Different groups of analysts within an organization are typically concerned with different aspects of the business, and it is convenient to define views that give each group insight into the business details that concern it.
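A per-group view of this kind can be sketched with Python's built-in sqlite3 module. The sales table, the south_sales view and the sample rows are hypothetical names chosen for illustration; they are not taken from the article.

```python
import sqlite3


def south_totals():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales (region TEXT, product TEXT, amount REAL);
        INSERT INTO sales VALUES
            ('South', 'Widget', 100.0),
            ('South', 'Gadget', 50.0),
            ('North', 'Widget', 70.0);

        -- The view exposes only the rows that concern the analysts
        -- responsible for the South region.
        CREATE VIEW south_sales AS
            SELECT product, amount FROM sales WHERE region = 'South';
        """
    )
    # The view is queried exactly as if it were a table.
    rows = conn.execute(
        "SELECT product, SUM(amount) FROM south_sales "
        "GROUP BY product ORDER BY product"
    ).fetchall()
    conn.close()
    return rows
```

Analysts for another region would get their own view over the same table, so each group works with its own slice without copying data.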
Once a view has been defined, one can write queries or new view definitions that use it; in this respect, a view is just like a base table. Evaluating queries efficiently is very important for decision support applications.
4.1 Views, OLAP and Data Warehousing
Views are closely related to OLAP and data warehousing. OLAP queries are typically aggregate queries. Analysts want answers to these queries over large data sets, and it is natural to consider precomputed views. In particular, the cube operator (a proposed extension to SQL equivalent to a collection of group-by statements, with one group-by statement for each subset of the k dimensions) gives rise to several aggregate queries that are closely related. The relationships among the many aggregate queries that arise from a single cube operation can be exploited to develop very effective precomputation strategies. The idea is to choose a subset of the aggregate queries for materialization in such a way that typical cube queries can be answered quickly by using the materialized views and doing some additional computation. The choice of views to materialize is influenced by how many queries they can potentially speed up and by the amount of space required to store the materialized view (since we have to work with a given amount of storage space).
Views and Warehousing: a data warehouse is simply a collection of asynchronously replicated tables and periodically maintained views. A warehouse is characterized by its size, the number of tables involved, and the fact that most of the underlying tables come from independently maintained (external) databases. The fundamental problem in warehouse maintenance is the asynchronous maintenance of replicated tables and materialized views.
5. Conclusions
Traditional Database Management Systems (DBMSs) offer a range of features, such as support for complex queries, query optimization, concurrency control, etc. However, these systems do not emphasize applications in which current and historical data need to be analyzed. This is where Data Warehouses come in: they bring together information from several databases in order to generate summarized information that assists the decision-making process. There are several steps before a Data Warehouse can generate summarized information. First, the company's operational data must be extracted. The data are then cleaned to minimize the occurrence of errors. Next, the data are transformed, to reconcile semantic differences and to generate the relational view through tables. Finally, the data are loaded, creating materializations through views. Decision support systems work with multidimensional data models (OLAP, Online Analytical Processing) and with operations that require special data organization not provided by conventional Database Management Systems, which target OLTP (Online Transaction Processing). OLAP applications generate complex queries for which traditional SQL systems are inadequate. Data warehousing has countless applications in management, urban and environmental planning, etc., and its potential users are mainly people directly involved in organizational decision making, such as managers, administrators and senior executives.
OLAP
OLAP (On Line Analytical Processing) is an approach to providing fast answers to analytical queries of a multidimensional nature.
OLAP is part of a broader category, Business Intelligence, which also includes ETL (Extract, Transform, Load), relational reporting and data mining. The most common applications of OLAP are business reporting for sales, marketing, management reporting, BPM (Business Process Management), budgeting and forecasting, financial reporting and similar areas. The term OLAP was coined as a slight modification of a traditional database term, OLTP (On Line Transaction Processing). Databases configured for OLAP use a multidimensional data model, allowing ad hoc queries with fast execution times. They borrow aspects of navigational databases and hierarchical databases, which are faster than relational databases. Nigel Pendse has suggested that an alternative and possibly more descriptive term for the concept of OLAP is FASMI (Fast Analysis of Shared Multidimensional Information). The results of an OLAP query are typically displayed in a matrix (or pivot) format. The dimensions form the rows and columns of the matrix; the measures form the values. At the core of any OLAP system lies the concept of the OLAP cube (also called a multidimensional cube or hypercube). It consists of numeric facts called measures, which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table, and dimensions are derived from the dimension tables.
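To make the cube idea concrete, here is a minimal sketch in plain Python. It joins a tiny fact table to a time dimension table, then sums the measure once for every subset of the dimensions, which is the collection of group-bys that section 4.1 attributes to the cube operator. The table contents, the names TIME_DIM, FACTS, cube and pivot, and the choice of year and city as dimensions are all assumptions made for illustration, not part of any OLAP product.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical star schema: each fact row points at a time dimension row.
TIME_DIM = {1: {"year": 2007}, 2: {"year": 2008}}
FACTS = [
    {"time_id": 1, "city": "Porto Alegre", "sale_amount": 100.0},
    {"time_id": 2, "city": "Porto Alegre", "sale_amount": 200.0},
    {"time_id": 2, "city": "Curitiba", "sale_amount": 50.0},
]
DIMENSIONS = ("year", "city")


def denormalize(facts, time_dim):
    """Join every fact row to its time dimension row."""
    return [{**fact, **time_dim[fact["time_id"]]} for fact in facts]


def cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (one group-by each)."""
    result = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            result[dims] = dict(totals)
    return result


def pivot(cells, row_values, col_values):
    """Lay out two-dimensional cells as a matrix, 0.0 where no fact exists."""
    return [[cells.get((r, c), 0.0) for c in col_values] for r in row_values]


ROWS = denormalize(FACTS, TIME_DIM)
CUBE = cube(ROWS, DIMENSIONS, "sale_amount")
MATRIX = pivot(CUBE[("year", "city")], [2007, 2008], ["Curitiba", "Porto Alegre"])
```

With two dimensions the cube holds four groupings: the grand total, totals by year, totals by city, and totals by year and city. MATRIX is the pivot layout of the last grouping, with years as rows and cities as columns. Reading CUBE[("year",)] instead of CUBE[("year", "city")] gives the same measure at a coarser level of detail.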
In MOLAP (Multidimensional OLAP) products, cubes are populated by copying a snapshot of the data residing in the data source, whereas ROLAP (Relational OLAP) products work directly against the data source without copying it, and HOLAP (Hybrid OLAP) products combine the two approaches.
Operations for moving the view of the data along the hierarchical levels of a dimension:

OPERATION: Drill Down
DESCRIPTION: Increases the level of detail of the information and consequently decreases the level of granularity.
EXAMPLE: An analysis of sales by state is changed to an analysis of sales in the cities of a given state.

OPERATION: Drill Up
DESCRIPTION: Decreases the level of detail and consequently increases the level of granularity.
EXAMPLE: A sales analysis is changed from a city to its corresponding state.

OPERATION: Slice
DESCRIPTION: Cuts the cube while keeping the same perspective on the data. It works as a filter that restricts a dimension to just one or a few of its values.
EXAMPLE: In the time dimension of a model, only the year 2000 is selected.

OPERATION: Dice
DESCRIPTION: Changes the perspective of the multidimensional view, as if the cube were rotated. It makes it possible to discover behaviors and trends among the values of the measures analyzed from different perspectives.
EXAMPLE: The analysis is changed from region (row) by year (column) to year (row) by region (column).

OPERATION: Drill Across
DESCRIPTION: The level of analysis within a single dimension is changed; that is, the user skips an intermediate level within the same dimension.
EXAMPLE: The level of analysis is changed directly from year to month within the time dimension, when that dimension is composed of year, semester and month.

OPERATION: Drill Around
DESCRIPTION: Occurs when the fact tables that share common dimensions are not organized in a linear order, so it is necessary to drill around the value.
EXAMPLE: Ten entities care for a patient and share information among themselves. Powerful reports can be generated by running separate queries against each fact table and doing an outer join with the patient's result sets.

OPERATION: Drill Through
DESCRIPTION: Occurs when the user moves from information contained in one dimension to another dimension.
EXAMPLE: The user is performing an analysis on the time dimension and in the next step analyzes the information by region.

OPERATION: Drill Out
DESCRIPTION: Drills down to external information such as photos, sound, text files and tables.

OPERATION: Drill Within
DESCRIPTION: Drills down through the attributes of a dimension.

OPERATION: Sort
DESCRIPTION: Orders the information; it can be applied to any kind of information, not only numeric values.
EXAMPLE: Sort the institutions in alphabetical order.

OPERATION: Ranking
DESCRIPTION: Groups results in order of size, based on numeric values; it affects only the presentation of the result, not the result itself.
EXAMPLE: Order the list of branches by largest sales volume.

OPERATION: Pivoting
DESCRIPTION: Swaps rows and columns; all totaled values are recalculated.
EXAMPLE: Drag the gender dimension into the time column that makes up the rows of the table.

OPERATION: Paging
DESCRIPTION: Presents the results of a query over several pages, allowing the user to navigate among them.
EXAMPLE: Allow a maximum of 10 results per page.

OPERATION: Filtering
DESCRIPTION: Presents queries with restrictions on attributes or facts.

OPERATION: Tiling
DESCRIPTION: Multiple views on a single screen.
EXAMPLE: Pages resulting from one query, different visual metaphors for one query, or results of different queries.

OPERATION: Alerts
DESCRIPTION: Used to flag noteworthy situations in report elements, based on conditions involving objects and variables.
EXAMPLE: Specify that monthly sales values below R$ 50.000,00 must be highlighted in red.

OPERATION: Break
DESCRIPTION: Separates the result of an analysis into groups of information, also allowing values to be subtotaled for each group.
EXAMPLE: The user needs to see the information by city and therefore requests a break. The report is automatically grouped by city, summing the measurable values for each city.

About The OLAP Report
You can contact Nigel Pendse, the author of this section, by e-mail on NigelP@olapreport.com if you have any comments or observations. This page was last updated on November 7, 2008.
The OLAP Report is an independent research resource for organizations buying and implementing OLAP applications. This section provides more information about it in an FAQ (Frequently Asked Questions) form.
How did The OLAP Report start?
When did The OLAP Report first become available on the Web?
How is The OLAP Report researched?
How is The OLAP Report funded?
What does a subscription to The OLAP Report cost?
Why is some of The OLAP Report Web site freely accessible?
Can I get a free preview or trial of the subscriber material?
Does The OLAP Report include information on OLAP applications as well as technologies?
Which products are included in The OLAP Report?
Can I buy individual product reviews?
How are products chosen for inclusion in The OLAP Report?
Does The OLAP Report provide a good introduction to OLAP for a novice?
Is The OLAP Report available in hardcopy form?
Does The OLAP Report Web site require a particular Web browser?
Is The OLAP Report available in languages other than English?
When was the last edition published and when is the next edition?
Does The OLAP Report contain material reproduced from elsewhere?
How is The OLAP Report rated compared with other analyst research?
How is The OLAP Report related to The OLAP Survey?
Do I get The OLAP Survey as a subscriber to The OLAP Report?

How did The OLAP Report start?
The OLAP Report project began in October 1994, when I approached Business Intelligence (our first publisher) about a new project idea. I had known Business Intelligence for several years, and liked its strict vendor-independence and research-based approach to technology and business reports.
Having worked with decision support software since the mid 1970s, both as a user and as a vendor, I had been very disappointed by the shallowness of the coverage of OLAP by industry analysts. Most had never even used an OLAP tool and they seemed to be more concerned with checklists of often-irrelevant, vendor-promoted features than how products really worked and what they were capable of doing for users. It seemed to me that this was the equivalent of a motoring magazine written by non-drivers, and no more useful. I was also concerned that vendor-sponsored ‘research’ documents seemed to be too prevalent then, as now. I was convinced that the OLAP market was about to enter a period of sustained boom and buyers needed clear, unbiased, better-informed help. In particular, they needed genuine research that wasn’t vendor-sponsored, performed by people who really understood the area. We agreed to go ahead with the project and enlisted Richard Creeth to the team. Richard had extensive experience designing and implementing financial OLAP applications using a variety of products, and ran his own consulting firm that specialized in this area. Conversely, I had used, specified, sold and marketed such products for many years. The three of us met in December 1994 and the research commenced immediately in both the US and the UK. Our original plan had been to produce a 350-page printed report, covering at least 15 OLAP products, by mid 1995. However, once the project got underway, we uncovered far more that needed saying than we expected, and the report that eventually emerged in August 1995 covered 23 products and many other topics in 520 pages, split into two volumes. Only 12 of those 23 products were still included in the final printed edition in 2001, though some of the others remain on the market in a limited way. When did The OLAP Report first become available on the Web? 
We had provisionally planned to do an updated printed second edition in 1997, but we hadn’t allowed for the speed of change in this area, or the Web, which altered all our plans. From not even mentioning the Web in the 1995 edition, we started publishing updates to individual chapters for subscribers on the new www.olapreport.com site in late 1996. By July 1997, the whole of The OLAP Report was available in an updated form on the Web, which had, in effect, superseded the printed edition. But parallel printed editions continued to be produced until 2001.
How is The OLAP Report researched?
In most cases, we interview vendors face-to-face, involving management, marketing and product development and management people. This involves seeing and testing the products in action. In some cases, we also install and test the software ourselves. We never review products without at least seeing them in use. Typically, these meetings last about ten hours, but with larger vendors, they can extend to several long, all-day sessions involving 10–20 people. In particular, we insist that vendors prove their less believable boasts. We also speak to users, competitors and others to get a more complete view. We have interviewed vendors and users face-to-face in the US, UK, Belgium, Canada, France, Germany, Israel, the Netherlands and Spain. Vendors can see the reviews of their products before publication, and have an opportunity to correct factual errors then or later, but they have no power of veto over our opinions, conclusions and recommendations. Vendors also cannot pull out at this stage if they do not like the opinions in the review.
“A word of thanks for the information contained within The OLAP Report. This is my third year of subscribing and I find the report an invaluable aid.”
Matthew Croft, Finance Manager – MI Development, Prudential UK
We also get a lot of valuable feedback from readers, so that errors can be corrected quickly on the Web site, one of the other advantages of online publication. Feedback comes from product users, consultants, vendors and even students.
How is The OLAP Report funded?
High quality research like that found in The OLAP Report is expensive to produce, and it has to be funded. There are essentially three sources of funding for Web-based publications:
Advertising: this can work (though not as well as it used to) for mass-market, consumer-oriented Web sites such as Yahoo. But it is not appropriate for a specialist site like The OLAP Report, which has therefore never taken advertising.
Sponsorship: most ‘free’ information on BI is actually vendor-sponsored, but The OLAP Report has always steered clear of this source of revenue. Specifically, no vendors pay for their products to be reviewed or featured in case studies, and although vendors can buy reprints of reviews, this is a tiny part of the total revenues. Reviews and case studies are never written with reprints in mind, so reviews always include criticisms. This freedom from sponsorship allows The OLAP Report to be critical of product weaknesses without fear of loss of income.
Subscription: this has always been the revenue model for The OLAP Report. We believe strongly that the best way to preserve The OLAP Report’s independence, and its commitment to providing the best information for OLAP buyers rather than sellers, is for those same buyers to provide the funding. This may make it seem relatively more expensive than sponsored sites, but the extra cost is well worth it for buyers, who are guaranteed impartial material that is designed to help them get the most from their far larger OLAP software, hardware and consultancy investment. This commitment to independence means that The OLAP Report is entirely funded by subscription, and the rates have to reflect it.
Conversely, if vendors provide free reprints of other analyst research, you should enquire about whether the research was sponsored by the vendor, rather than simply reprinted after publication. In other words, was the ‘analysis’ originally commissioned and paid for by the vendor’s marketing department, or did the vendor pay to reprint independently produced research that merely happened to be favorable? Both types exist, though most vendor-distributed or downloadable reprints from free sites are actually of vendor-funded ‘research’ (which is really no more than disguised marketing material), and it’s worth finding out which is which before reading it. What does a subscription to The OLAP Report cost? Subscribing to The OLAP Report The OLAP Report is available for workgroups or, exceptionally, for individual use. Subscriptions are listed in US dollars, but can be paid in any major currency. Discounts are available for larger groups and on renewal. Subscribers get on-line access to the equivalent of over 1500 printed pages, many of which will be updated during their period of subscription. The OLAP Report is available for multi-user workgroup or individual use by annual subscription. Subscriptions for individual use or workgroups of under five users are available only for OLAP buyers. Because of their wider interest, vendors and consultants are instead offered larger workgroup licenses at discounted prices. Vendors may also purchase marketing rights to material from The OLAP Report — prices on application. Each user is supplied with a user ID and password which gives that person unlimited access to the Web site. If several people in your organization require access, then you must have a multi-user subscription (available at discounted prices). Renewals, and multiyear subscriptions, are also significantly discounted. Why is some of The OLAP Report Web site freely accessible? About a tenth of The OLAP Report is available without subscribing or even registering. 
This material does not include any of the detailed product reviews or positioning analyses. The reason for providing it is to allow non-subscribers to get some value from the site and to get a feel for the style, quality and depth of the overall site. It also ensures that The OLAP Report is noticed by search engines like Google. There are no restrictions on linking to these pages from other sites.

Can I get a free preview or trial of the subscriber material?

Yes, you can register for access to a small subset of the subscriber-only material. This lets you assess the style and level of detail provided, though in some cases, the preview material is older than that available to subscribers.

Annual subscription rates

Work group size, for in-house use by OLAP buying organizations only:

| Work group size | Annual rate (US$) |
|-----------------|-------------------|
| Single user     | $2,995            |
| Three users     | $3,995            |
| Five users      | $4,995            |

Plus sales taxes or VAT if applicable. Subscriptions are also available in other currencies at equivalent prices. Prices for larger multi-user workgroups or vendor/consultant licenses are available on request.

Subscriptions and individual articles can be ordered on-line or by phone: 1-866-274-9720 or +44 (0) 20 8879 6261.

Does The OLAP Report include information on OLAP applications as well as technologies?

Yes, right from the first edition in 1995, The OLAP Report has included reviews on horizontal analytical applications, particularly in the financial consolidation and budgeting areas. We regard these as typical applications of OLAP technology and therefore a good fit in The OLAP Report. The OLAP Report also has general information on a range of other analytical applications based on OLAP technology. However, it does not include reviews of vertical applications, such as those used in banking, transport, etc, as they are too specialized.
Which products are included in The OLAP Report?

You can always find which products are reviewed in The OLAP Report by looking at the products index. This lists all the reviewed products, including the version number, date when the review was last revised and the length of the review. Some of the minor or discontinued products have ‘frozen’ reviews (indicated with an ice cube icon), meaning that the current review will not be updated again, but is kept in the online archive.

Can I buy individual product reviews?

Yes, some reviews, analyses and case studies from The OLAP Report are now available for individual purchase, though only subscribers have access to the entire content. Individually purchased articles are provided in PDF form and are not subsequently updated, whereas subscribers view content on the Web and automatically get access to all content that is updated or added during the period of their subscription. Subscribers also pay a much lower fee than the collective price of the individual articles.

“As well as product and general market analysis, The OLAP Report also provides excellent ‘white-papers’ on a wide range of BI/OLAP related topics, definitions, explanations and recommended approaches… which are all well worth a read for anyone wishing to fast-track their understanding of this business-empowering technology.”
John Morter, Regional IT Delivery Manager, Hallmark International

How are products chosen for inclusion in The OLAP Report?

The most obvious requirement is that they must include OLAP capabilities; we are often approached by vendors of BI tools with little or no OLAP functionality, but have to refuse them. For a guide to OLAP requirements, see the section in The OLAP Report that defines OLAP. It is also necessary that the vendor be prepared to cooperate with us in producing a tough, impartial review; we would not be able to guarantee accuracy without such cooperation.
Vendors must be prepared to be open about how their products work, how they are priced, etc, and if they are not prepared to disclose such information, we cannot include them. Beyond that, we take account of market demand; products with fewer than ten sites in production, or which are not actively marketing in the US or UK, would not normally be reviewed, even if the vendors are keen to be included.

Does The OLAP Report provide a good introduction to OLAP for a novice?

No, The OLAP Report is not aimed at the many people who have just heard of OLAP and want a basic introduction to the topic. There are many other sites on the Web (typically from consultants and vendors) that provide this function, and there are also books available. The OLAP Report is aimed at organizations with a professional interest in OLAP (buying or selling OLAP products, implementing OLAP applications or investing in OLAP vendors), and they are assumed to already have a basic knowledge of the concept.

Is The OLAP Report available in hardcopy form?

The OLAP Report began in 1995 as a conventional 520-page printed report, but it grew too bulky and hard to update in that form, so from 2002 it became an online-only resource. The final printed edition in 2001 had 678 densely-packed pages, without even including all of the material available on the Web site. If it was still available in printed form, it would be over 1500 pages long. Now that The OLAP Report is Web-only, it includes many more screen shots, and can take advantage of color and links. It also has an efficient automatic site index, available to both subscribers and non-subscribers (so even non-subscribers can check on the contents).

[Photo: seven years of printed editions of The OLAP Report, from 1995 to 2001, before it went purely on-line. If still available in printed form, it would have well over double the pages of the 2001 edition.]

Does The OLAP Report Web site require a particular Web browser?

This site is intended to provide useful information quickly and conveniently, not to be a test site or demonstration of the latest browser technology. It deliberately uses no cookies, frames, Java, ActiveX or plug-ins and few scripts, but it still will not display properly if used with old browsers or on machines with less than 16-bit color. Unlike with many other over-designed Web sites, you are always free to vary font size to suit your screen or printer (for example, by using <Ctrl>+<Mouse wheel> with modern mice and IE). We find that it displays most accurately with Internet Explorer, Firefox and Opera (though sometimes with a slightly different look). We recommend at least 1024x768 screen resolution with 16-bit color, although higher resolutions and more colors are obviously preferable (the pages are usually tested on IE6, using up to 1600x1200 resolution and 24-bit color). For best printed output, you may want to reduce the font size before printing.

Is The OLAP Report available in languages other than English?

The original printed edition of The OLAP Report was translated into a French edition. Now, however, The OLAP Report is too large, is updated too often and the content is too technical, for translated editions to be feasible. It is therefore now only available in this English-language form.

When was the last edition published and when is the next edition?

The OLAP Report is not a single document, but is a large Web site containing well over 100 sections. Some of these could be regarded as large reports in their own right, with the equivalent of up to 70 printed pages. These sections are updated separately, and each shows the date when it was last revised. Typically, at least one of these sections will be updated every week, so there is no concept of an ‘edition’ for the site as a whole.

Does The OLAP Report contain material reproduced from elsewhere?

No, all material in The OLAP Report was written specially for it. For example, although all vendors produce case studies and white papers, The OLAP Report does not include any such material. We even try to avoid using vendors’ standard architectural diagrams, screen shots, etc; information of this nature in The OLAP Report is usually prepared specially.

How is The OLAP Report rated compared with other analyst research?

The best quantitative evidence is from user surveys asking about influences on OLAP purchase. Among many other things, The OLAP Survey 4 asked respondents about their use of industry analysts when selecting OLAP products. Overall, 43.5 percent of OLAP buyers cited industry analyst research as an influence, which made it the most important single factor; this rose to 51.8 percent of those who had conducted a formal multi-product evaluation. Of those who cited industry analyst research as being influential, 30 percent cited The OLAP Report as the most important, just behind Gartner at 30.7 percent. No other analyst was cited by more than 13 percent of respondents.

How is The OLAP Report related to The OLAP Survey? Do I get The OLAP Survey as a subscriber to The OLAP Report?

The contents of the two resources are quite different and they are designed to complement each other. Both are about OLAP, and are vendor-independent, but the content and research methods are entirely different. The OLAP Report is mainly qualitative, expert assessment of products and technologies, whereas The OLAP Survey is a quantitative analysis of the detailed experiences of hundreds of OLAP users. Unlike The OLAP Report, it has no vendor involvement as research is conducted only with user organizations. It does not include product reviews, case studies, etc. The OLAP Report is delivered via the Web and is constantly updated, whereas The OLAP Survey is published periodically in the form of a discrete report, available in soft copy and data mart formats. The OLAP Report is published by BARC and The OLAP Survey by Survey.com, though BARC also acts as a distributor outside North America. The OLAP Report subscribers should therefore not expect free access to The OLAP Survey results, apart from the occasional quotes that are included. Equally, The OLAP Survey buyers do not get access to The OLAP Report contents.

How has The OLAP Report helped you to select the software you use?
As they say, “Truth has many faces and any one of them alone is a lie”. When vendors of OLAP products are just interested in pushing the positive points of their products, where does one turn to for the full picture? It’s The OLAP Report, which shows each product in the correct light, warts and all.
Has it proved a useful educational resource for your OLAP team?
Ever since I discovered it in 1998, The OLAP Report has been one of my two preferred sites for all things OLAP (the other was DM Review). Of course, when it came to specific products, it was The OLAP Report that I depended upon. Mr Nigel Pendse has also patiently answered my occasional queries on specific products.
J K Rawal, Senior Technical Architect, Infosys Technologies

Reporting and Query Tools

There are five categories of decision support tools:

- Reporting
- Managed queries
- Executive Information System (EIS)
- OLAP
- Data mining

| Tool type | Basic question | Example answer | Typical user and needs |
|-----------|----------------|----------------|------------------------|
| Query and reporting | "What happened?" | Monthly sales reports, inventory history | Historical data; limited technical skill |
| OLAP | "What happened and why?" | Monthly sales versus competitors' price changes | Static views of information moving to a multidimensional view; technically astute |
| EIS | "What do I need to know now?" | Memos, command centers | High-level or summarized information; may not be technically astute |
| Data mining | "What is interesting?" "What could happen?" | Forecasting models | Obscure trends and relationships among the data; technically astute |
Data warehouse tools (Source: Byte Brasil magazine, January 1997)

Reporting tools

These can be divided into two types:

- Production reporting tools (support high-volume work such as calculations or check printing)
- Desktop reporting tools (for end users; one example is Seagate Crystal Reports, which has a graphical interface and charting functions)

Managed queries

These tools act as a shield between the user and the complexity of SQL and the database structures. They are often integrated with Web servers.

Executive Information System (EIS)

EIS tools let developers build customized decision support applications in a graphical environment. The most popular EIS tools are Pilot Software and Platinum.

OLAP

OLAP is a way of viewing corporate data. Users can navigate through hierarchies and dimensions with a single mouse click.

Data mining

These tools use a variety of statistical and artificial intelligence algorithms to analyze the correlation between variables, investigating patterns and relationships.

Products:

Cognos Impromptu

It is widely accepted in the market because it uses a graphical interface similar to Windows, and because its query and reporting tools are unified in a single interface. It allows full administrative control at low cost. In terms of scalability, it can support one user or hundreds of users working against the data warehouse database.

Reports in Cognos Impromptu: this software was designed to make it easy for users to create and run their own reports. Impromptu offers:

- unified query and reporting tools;
- object-oriented architecture;
- full integration with PowerPlay;
- scalability;
- security and control;
- data presented in a business context;
- more than 70 predefined report templates;
- business-relevant reports.
Applications:

PowerBuilder

PowerBuilder supports polymorphism and the ability to inherit forms and objects, and it rests on the premise that once an object has been created and tested, it can be reused by other applications. PowerBuilder's strength lies not only in its object orientation, but also in its ability to develop Windows applications and its affinity with the client/server architecture.

Forté

Forté is based on a three-tiered client/server architecture, partitioned into three distinct parts: the presentation logic is placed on the client, the application logic resides on the application servers, and the database sits on a data warehouse server. Forté integrates with Java and Web technology.

Information Builders

Cactus: a client/server development environment, capable of creating applications of any size and scope.
Focus Fusion: a multidimensional database for OLAP and data warehouses.

OLAP

OLAP exists because of the need to retrieve large amounts of data from a very large database (hundreds of gigabytes or more). It is not an application but an application architecture. Whenever a multidimensional system is needed, OLAP is needed.

One problem with SQL is its inability to handle complex calculations and time series. For example, calculating the average of some value over the last three months requires ANSI SQL extensions that are rarely found in commercial products.

Another advantage of OLAP is that it is interactive. The analyst can plug in a value to simulate a scenario, and in doing so may even discover hidden patterns. A dimension can be added to or removed from the cube as needed.

The response time of a multidimensional query depends on how many cells are required. To deal with the size of the cube, which grows exponentially, the solution is to consolidate all the logical subtotals and totals along all dimensions.
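The two calculations mentioned above, a three-month average and the consolidation of subtotals along every dimension, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sales figures, the dimension names and the helper functions `consolidate` and `trailing_average` are hypothetical and do not come from any of the products discussed here. The consolidation loop visits every subset of the dimensions, which also shows why the number of precomputed cells grows so quickly as dimensions are added.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical monthly sales facts: (year, month, region) -> amount.
SALES = {
    (1996, 10, "North"): 100.0,
    (1996, 11, "North"): 130.0,
    (1996, 12, "North"): 160.0,
    (1997, 1, "North"): 190.0,
    (1996, 12, "South"): 80.0,
    (1997, 1, "South"): 120.0,
}
DIMENSIONS = ("year", "month", "region")
ALL = "*"  # marker meaning "consolidated over this dimension"


def consolidate(facts):
    """Precompute one subtotal for every subset of dimensions that is kept."""
    cube = defaultdict(float)
    n = len(DIMENSIONS)
    for keep_count in range(n + 1):
        for kept in combinations(range(n), keep_count):
            for coords, amount in facts.items():
                # Replace every dimension that is not kept with the ALL marker.
                key = tuple(coords[i] if i in kept else ALL for i in range(n))
                cube[key] += amount
    return dict(cube)


def trailing_average(facts, region, year, month, window=3):
    """Average over the `window` months ending at (year, month)."""
    values = []
    for _ in range(window):
        values.append(facts.get((year, month, region), 0.0))
        month -= 1
        if month == 0:  # step back across the year boundary
            year, month = year - 1, 12
    return sum(values) / window


cube = consolidate(SALES)
print(cube[(ALL, ALL, ALL)])                       # grand total: 780.0
print(cube[(1996, ALL, "North")])                  # North subtotal for 1996: 390.0
print(trailing_average(SALES, "North", 1997, 1))   # 160.0
```

Once the subtotals are stored, a query for any total becomes a single lookup instead of a scan over the detailed cells, at the cost of extra storage.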
This consolidation makes sense when the dimensions are part of the same hierarchy (years, half-years, months, days).

OLAP guidelines

- Multidimensional conceptual view: emphasizes the way the user "sees" the data, without requiring that the data be stored in a multidimensional format;
- Transparency: the location of the OLAP functionality should be transparent to the user, as should the location and form of the data;
- Accessibility: access to homogeneous and heterogeneous data sources should be transparent;
- Consistent query performance: performance should not depend on the number of dimensions;
- Client/server architecture: products should be able to operate in client/server architectures;
- Generic dimensionality: all dimensions are equal;
- Dynamic sparse matrix handling: products should handle sparse matrices efficiently;
- Multi-user support;
- Unrestricted cross-dimensional operations;
- Intuitive data manipulation;
- Flexible reporting and querying;
- Unlimited dimensions and aggregation levels: tools should be able to accommodate 15 to 20 dimensions.

Categories of OLAP tools

MOLAP: traditionally used to organize, navigate and analyze data.

ROLAP: allows multiple multidimensional queries to be built over two-dimensional relational tables, without the data structures normally required for this type of query.

MQE (managed query environment): offers "datacube" and "slice and dice" analysis. This is done by first developing a query that selects data from a DBMS, which delivers the requested data to the desktop, where the datacube resides. Once the data is in the datacube, users can request multidimensional analysis.

Products on the market:

Cognos PowerPlay: a mature and popular piece of software that is characterized as an MQE.
It can leverage the investment made in relational database technology to offer multidimensional access across the corporation, with the same robustness, scalability and administrative control.

IBI Focus Fusion: a database with multidimensional technology for OLAP and data warehousing. It is designed to address business applications that need dimensional analysis of product data. Its most specific use is in building business intelligence applications in a data warehouse environment.

Pilot Software: a suite of tools that includes a high-speed multidimensional database (MOLAP), data warehouse integration (ROLAP), data mining, and several customizable business applications aimed at after-sales and marketing professionals.

OLAP tools and the Internet

The Web is a perfect medium for decision support:

- The Internet is a virtually free resource that provides connectivity within and between companies;
- The Web eases the complex administrative tasks of managing a distributed environment;
- The Web lets companies store and manage data and applications that can be centrally managed, maintained and updated, eliminating problems with keeping software and data current.

Conclusions: it is clear that OLAP products will become increasingly Web-compatible. Technologies that support the Internet and the Web continue to advance rapidly. Bear in mind, then, that the Internet support of the OLAP product you evaluate should not be a deciding factor. To stay competitive, vendors will keep improving their products, sometimes radically.

Integration of Geographic Information Systems and OLAP Tools

Ednilson Carlos Souza da Silva and Maria Luiza Machado Campos

Summary

Geographic Information Systems (GIS) can be seen as a very particular type of decision support system, offering sophisticated mechanisms for the manipulation and analysis of georeferenced data.
Another line of tools aimed at decision support, the OLAP (On-line Analytical Processing) tools, is used to access and manipulate large data repositories in the environment that has come to be called the Data Warehouse. By integrating information from diverse sources, these tools allow sophisticated statistical analysis and efficient simulation of new associations among the data. Unlike GIS, OLAP tools do not tie the association of data solely to the geographic dimension, allowing other dimensions to be specified and used with equal weight in the analyses. This paper analyzes the similarities and differences between geographic information systems and OLAP tools, examining the advantages and the viability of integrating these two technologies.

Abstract

Geographical Information Systems constitute a particular type of decision support system, supporting the manipulation and analysis of spatially referenced data. More recently, another set of tools has emerged to support decision making and data analysis in general, the On-line Analytical Processing tools or OLAP tools. These are used for the access and manipulation of very large data repositories, called Data Warehouses. They integrate data collected from different sources, and they allow sophisticated statistical analysis and the efficient simulation of new associations on the data. But, unlike GIS systems, OLAP tools do not emphasize the geographical dimension, using other dimensions for the analysis as well. This paper discusses the similarities and differences between the two types of tools, analysing the viability and advantages of integrating these technologies.

Introduction

In response to the demand to handle an ever-growing volume of information, heterogeneous data sources and a diversity of applications, information systems technologies have branched out to serve the market with the most varied solutions.
Geographic Information Systems are known for allowing the joint manipulation of conventional data and spatially referenced data. The latter are usually handled with a thematic, application-oriented approach that incorporates the geographic dimension into conventional data modeling. As a very particular type of decision support system, GIS have evolved considerably in recent years, incorporating functionality for the manipulation and analysis of georeferenced data using levels of abstraction and visualization patterns that are intuitively closer to the real world. In particular, database technology played a decisive role in the evolution of these systems, whether through the support of relational database systems or of the more recent object-oriented database systems.

Another type of decision support tool has received special attention and has mobilized great efforts from software producers and application developers. These are the OLAP (On-line Analytical Processing) tools, used to access and manipulate large data repositories in the environments that have come to be called Data Warehouses. By integrating information from diverse sources, these tools allow sophisticated statistical analysis and efficient simulation of new associations among the data. Unlike GIS, OLAP tools do not tie the association of data solely to the geographic dimension, allowing other dimensions to be specified and used with equal weight in the analyses.

In fact, many users of current GIS make use of some kind of tool to generate cross-tabulations and aggregations that are later georeferenced. However, these tools are generally simple spreadsheets or statistical analysis packages that are not integrated into the GIS environment.
They fall short when it comes to performing more complex processing in a friendly and intuitive way, as the OLAP tools found on the market today already do.

This paper analyzes the similarities and differences between the approaches followed by geographic information systems and the OLAP tools typical of the Data Warehouse environment, examining the advantages and the viability of integrating these two technologies.

Evolution of Geographic Information Systems

Since their initial, simpler conception, focused on the design and construction of maps, GIS have incorporated a growing variety of functions. In particular, they offer sophisticated mechanisms for the spatial manipulation and analysis of data, allowing results to be visualized far more intuitively than through conventional reports and charts [Mart91].

The closer integration between GIS tools and information repositories, that is, database technology, was certainly of fundamental importance. Today, most commercial GIS tools rely on relational database management systems, although there is much research and many prototypes that exploit the high expressiveness of object-oriented database management systems and their ability to handle non-conventional data.

The use of GIS as a decision support instrument, not always emphasized in the technical literature, becomes evident when one considers its capacity for data integration and its multiple alternatives for presenting information to users, which enhances the capacity for abstraction and for simulating results. Initially explored in areas that support the definition of government strategies and policies, such as the environment and the economy, this type of application now finds its most active domain in business [Mart91] [Cast93] [Grim94] [Beau94] [FoRo93].
In fact, much of the information used in commercial activity has spatial components. What executives, corporate strategists and government policy planners need are packages that are friendly and efficient and that deliver immediate results, responding quickly to changes in application requirements. However, this is not the reality of current GIS tools, which still demand considerable effort to manipulate data and extract results. The GIS literature is full of references to the hard-to-assimilate volume of maps and related tables that end up being handled in these applications when conventional practices are followed [CrWP95]. Indeed, reducing this inefficiency is one of the selling points most often mentioned by GIS tool developers.

Data Warehouse

In decision support, information technology has been going through feverish activity in recent years, driven, as already mentioned, by the need for efficient information management in the business environment. The elegance of the relational model is well suited to transaction processing, since it handles the typical operations of this environment (known as OLTP, On-line Transaction Processing) efficiently [KiSt94]. However, conventional relational modeling fails when it tries to extract from the data the diagnostic or prognostic situations that are typical of the decision support environment. Thus, to implement an efficient Decision Support System it is necessary to divide the company's data architecture into two database environments [CaRo97]:

- An environment for the operational databases, which support the company's business. These databases as a rule already exist and have well-defined applications and operations; and
- An environment for the decision support databases, oriented towards applications that work on the company's business data.
These databases normally have to be built from existing data, whether legacy data or data spread across the company's databases, with a different modeling and operating profile. This implies developing new data access tools.

Much attention has been given to Data Warehouse technology, a term coined in mid-1990 by Bill Inmon [Inmo97], the mentor of the technology, to refer to this data environment for decision support. A Data Warehouse (abbreviated DW) includes a database whose main characteristics are:

- Subject-oriented, in contrast with the conventional approach of supporting transaction processing. It represents the move towards decision support, where the goal is to make decisions about the "subject" that is the theme of the stored data. For example, sales data in a transaction-oriented system contains information about the sale of specific products to specific customers. Sales data in the decision support environment contains a history of sales by period, location, product, etc. When well specified, this data set expresses the "nature" of the information it contains;
- Fully integrated, since it must consolidate data from different origins (legacy systems, conventional transaction processing systems, etc.), which often involves reconciling encodings, units, measures, etc.;
- Time-variant, since transaction-oriented systems capture data that is valid at one instant in time, the moment of access. Soon afterwards the data may be changed and lose its "historical" information. When it has a time dimension, a data item is associated with a point in time.
Different data items can be compared along the time axis, either to express information that coincides in time or to relate the history of the information over a period;
- Non-volatile, since information already present in the database is rarely modified. New data is absorbed by the database and integrated with the previously stored information [Hack95].

| | Transaction processing | Analytical processing |
|---|---|---|
| User type | No specialization or decision-making power | Usually a manager involved in the decision process |
| Data type | Current, precise and detailed | Historical, consolidated and often already totaled |
| Access type | Simple SQL | Complex SQL |
| Access pattern | Concurrent, with a low volume of records accessed | Non-concurrent, with a high volume of records accessed |
| Performance | Measured by the volume of transactions executed | Measured by query response time and accuracy |

Fig. 1: Differences between transaction processing and analytical processing

The distinctive characteristics of the DW, compared with traditional databases in decision support work, make it one of the most promising technologies. However, migrating the information available in a company's many databases and data stores to a DW architecture is not trivial. Likewise, extracting coherent and meaningful information from the DW requires deep knowledge of the activity and of the subject the information is intended for [Kimb96]. Here there is clearly a very strong similarity with conventional GIS.

The process of querying and compiling the information held in a DW differs from the access characteristics of operational databases. The most evident differences are listed in Figure 1.

OLAP Tools

Growing investments have been made in producing tools for manipulating the data in a DW.
The trend has been a sharp increase in the use of data mining tools and OLAP tools. Data mining refers to the study of the behavior of the data, and is linked to disciplines such as neural networks, artificial intelligence and fuzzy logic in order to build forecasting models and reveal hidden trends and relationships among the data. OLAP tools seek to establish and analyze relationships between the data and the facts that occurred in running the business, supporting, for example, decisions about changes in the direction of the strategies employed.

Because of their ability to adapt to applications and users, OLAP tools have been well accepted in the market. The term "ability to adapt" must be qualified by a characteristic that pervades this type of tool: it is best used when its database is organized according to a multidimensional representation. This type of representation, appropriate for OLAP, can be visualized as a multidimensional space in which each axis can be seen as a dimension or perspective (time, geographic area, sex) and each point in the space as a measured value corresponding to the intersection of the corresponding elements in each dimension [The95]. OLAP processing always involves interactive queries that follow an analysis path of several steps, for example drilling down successively through lower levels of detail of a specific item of information.

Geographic Information Systems and OLAP Tools

OLAP tools were born tied to a results-oriented technique for handling information, and all later development has been based on flexibility and ease of use in order to speed up the decision-making process.
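As a small illustration of the multidimensional representation and the multi-step analysis path described above, the following Python sketch stores measured values as points whose coordinates are members of three dimensions (time, geographic area, sex) and then follows a short drill-down path. The data, the dimension members and the `aggregate` function are hypothetical, chosen only for this example; real OLAP tools use far more elaborate storage and query engines.

```python
# Hypothetical measures stored as points in a 3-dimensional space.
# Axes (dimensions): time as (year, quarter), geographic area, sex.
POINTS = {
    ((1996, "Q1"), "Rio de Janeiro", "F"): 40,
    ((1996, "Q1"), "Rio de Janeiro", "M"): 35,
    ((1996, "Q2"), "Rio de Janeiro", "F"): 50,
    ((1996, "Q2"), "Sao Paulo", "M"): 60,
    ((1997, "Q1"), "Sao Paulo", "F"): 70,
}


def aggregate(points, time_level="year", area=None, sex=None):
    """Sum the measure per time member, optionally fixing area and/or sex."""
    totals = {}
    for (time, point_area, point_sex), value in points.items():
        if area is not None and point_area != area:
            continue
        if sex is not None and point_sex != sex:
            continue
        # Pick the level of the time hierarchy to report on.
        member = time[0] if time_level == "year" else time
        totals[member] = totals.get(member, 0) + value
    return totals


# Step 1: overview by year.
step1 = aggregate(POINTS)
# {1996: 185, 1997: 70}

# Step 2: drill down from years to quarters.
step2 = aggregate(POINTS, time_level="quarter")
# {(1996, 'Q1'): 75, (1996, 'Q2'): 110, (1997, 'Q1'): 70}

# Step 3: keep the quarterly detail and slice on one geographic area.
step3 = aggregate(POINTS, time_level="quarter", area="Rio de Janeiro")
# {(1996, 'Q1'): 75, (1996, 'Q2'): 50}

print(step1, step2, step3, sep="\n")
```

Each step reuses the same data and only changes the level of detail or the fixed dimension members, which is the kind of interactive path an analyst follows without asking for a new report to be developed.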
Comparing the two technologies, the following points stand out:

They are data-analysis tools with a large and still unexplored area of intersection: a GIS needs the facilities implemented by OLAP tools, while OLAP would gain considerable impetus from specific treatment of the geographic dimension;

They are aimed at specific users, but with different levels of specialization. Implementing and using a GIS requires cartographic knowledge to define the mapping, a degree of abstraction to build the database that will hold the information the GIS manipulates, and considerable command of the package when querying and performing spatial analysis. This calls for a user able to explore the tool's possibilities and extract as much as possible from it. In a decision-support setting, this is usually a member of the technical staff who advises management. With OLAP tools, the user is the manager, who learns to handle the tool in the same way as a spreadsheet or word processor. This is generally a non-specialist user. In both cases there tends to be a core group in the organization responsible for modelling, defining the database and assembling the main queries. The GIS, however, loses agility when we want to expand behavioural knowledge of the database, that is, when trying to refine the decision process through feedback or by navigating the hierarchies explicit in the stored data (aggregations or partitions of data that consider, for example, time periods such as years, months, weeks, days of the month and days of the week). An OLAP user quickly gets a feel for the tool and devises new cross-tabulations of information faster and more flexibly, eventually without outside help.
A less specialized GIS user would need an application developed ad hoc, gaining ease of access but losing flexibility;

They require considerable investment in equipment, technical support and training, which is reflected in the organization's strong demand for results.

Given the spread of OLAP tools and the extensive background of GIS products, integration between these two information-management tools is inevitable. The goal is to make all the analytical capability of both types of tool available in a single environment. How this integration can be achieved deserves further study, which is already under way, in an incipient form, through partnerships between vendors of the two types of tool.

Spatial Decision Support Systems

Fundamentally, a GIS is a powerful tool for integrating databases with precise cartographic mapping systems. Objects, whether or not they belong to the database, are represented in order to provide spatial analysis for decision support. This concept determines the current focus of GISs, which seek to offer ever more flexibility in evaluating the information in databases without degrading the capture, manipulation and management functions.

Decision support is the main goal of managerial applications of information technology. The various implementations of information systems have always sought to help the user extract information at the best cost-benefit ratio. In decision support, however, one factor affects overall system performance: the user may adopt different stances depending on the information available and on how it is presented. Both allow the user's decisions to be refined, appreciably improving the performance of the process.
The first approach to using GIS resources in decision support was the definition of the concept of Spatial Decision Support Systems (SDSS), which are not the same thing as GIS, although they rest on the same technology. A traditional Decision Support System is defined as a computer system for aiding decisions that contains databases, a set of models optimized for analysing the data in support of decisions, and a friendly, flexible interface that lets decision makers query and manipulate the databases and models, following their own criteria, in real time [Spra80]. A GIS can include all these attributes and thus become an SDSS, making intensive use of its spatial-analysis resources, whether in visualizing "layers" of information common to a given region, in building thematic maps, or in statistical and forecasting analysis.

Studies of the use of SDSS in decision making, such as [CrWP95], applied a market-established GIS to a specific problem (of progressively greater complexity) and measured the time and accuracy of the decision process, then compared these with the results obtained under normal conditions using separate reports and maps. Unsurprisingly, using the GIS as an SDSS proved more advantageous, but we can state that classic SDSSs, from an information-technology standpoint, do not yet exist. What exists is the use of GISs from a new perspective, drawing on results from tools such as spreadsheets, simulators and statistical packages under data models and descriptors suited to decision making. Typically, a non-spatial Decision Support System has a core responsible for managing and controlling these models and descriptors. A genuine SDSS must not only manipulate this kind of data but also integrate it spatially.
SDSS implementations with added calculation, projection and statistical-analysis modules already exist, but they are restricted to applications specific to a given environment. None of them offers effective management of descriptors that would encourage end users to define their own summarizations and cross-tabulations in the tool and so use it intensively. This does not, however, detract from current SDSSs, which are an important improvement over traditional Database Management Systems (DBMSs) supported by tools for graphical presentation of data. The functions available in SDSSs on the market give the decision maker an efficient way to organize, retrieve and display data based on their spatial characteristics [Menn97], in a more integrated and less complex way than a conventional GIS would use to perform the same operations.

Geographic Data Warehouse

Data Warehouse in itself is not a new technology. It is one of the instruments information technology uses to provide new forms of interaction with, manipulation of and control over data, so that they become useful information for decision processes. Most corporate databases contain an attribute, a field, a geographic or spatial reference, ranging from suppliers' addresses to the distribution of the company's branches. A decision on business strategy will at some point bring together the spatial and non-spatial characteristics of the available data. The relevance of georeferenced, spatially analysed information is growing steadily in DW architecture [Mapi96]. The best way to express geographic relationships among data is through a map, which leads us to the surprising reality of the gap between the architecture and the best tool for manipulating spatial information, the GIS. The spatial component of a DW can be incorporated in three ways.
First, a tool is needed to perform the spatial aggregations and geographic conversions (assigning coordinates or codes that establish the spatial reference). Second, the database supporting the DW must be modelled as a blend of a spatial database and a DW-oriented database, including support for multidimensionality and scalability. Finally, there is the statistical-spatial analysis of the data and the alternative presentation of results as maps or tables. The most important goal will always be to allow queries on the DW based on geographic attributes, spatial analysis of the results, successive refinements, and aggregation of the results into geographic areas that are displayed instantly [Berk97].

As can be seen, the last two possibilities are feasible only with adequate DBMS support and an OLAP analysis tool equipped with many GIS facilities. Conversely, if we consider a GIS that can access a database modelled as a DW, we will miss the flexibility that an OLAP tool has for handling analytical data. Thus both tools, GIS and OLAP, suffer from the lack of integration, whether in the Data Warehouse environment or for general use in decision support. Any product called a Geographic Data Warehouse or a Data Warehouse-Oriented Geographic Information System must necessarily combine these tools, and its success depends on how transparently it exposes to the user the resources that each one supports individually.

Feasibility of GIS-OLAP Integration

Once the need for integration is established, different approaches to the problem begin to emerge, whether in DBMSs, OLAP tools or GIS packages.
Current DBMS solutions for DW backed by major vendors such as Oracle and Informix, although solid and already offering extensions for spatial data, still cannot support a single data-manipulation environment that meets the needs of both GIS and OLAP. The need therefore remains to reorganize the data to suit the specific needs of one tool or the other. Some OLAP tools already allow aggregations over the geographic dimension to be displayed as a map, but they come nowhere near the spatial-analysis facilities of GISs. GIS packages tout the possibility of creating a layer over the database that would serve as an interface between the DW model and the spatial-control facilities of the GIS. This path is obstructed, however, by the need to adapt it in software to each installed environment if the organization wants even a minimum of the flexibility and power of an OLAP tool in analysing results. It is an attempt to extend the tool, not to integrate.

A careful study of the integration of GIS and OLAP technologies should address four main issues: integration of data from heterogeneous sources, an activity common to both tools; a consolidated design methodology, covering system and database modelling; metadata support, adding management of a metadata repository to the systems; and flexibility and ease of use, the point of greatest contrast between the two tools. Each of these issues is discussed in detail below.

Integration of Heterogeneous Data Sources

Capturing data to load into the Data Warehouse or GIS environment is one of the most problematic areas. In a DW it is common to absorb the company's legacy data, inherited from old mainframes with databases of primitive architecture.
These data, together with all other available sources (paper reports, charts, calculation results, spreadsheets, historical information), must be filtered and flattened so that they share the same codes, domains, units and references. Only after this reconciliation can they be loaded into the database, where as a rule they are not changed. Data integration is one of the most time-consuming and costly phases of building a DW.

For GIS, data-conversion costs often reach 20% of the total implementation cost [Menn97], because the accuracy of the geographic information entering the database must be monitored constantly. For example, errors arise from imprecise positioning (the object must be where the map says it is), from misclassified attributes of the graphic object (the object must be precisely defined and classified) and from map coverage (whether all objects pertinent to the map are included in it, with nothing missing or in excess). A further conflict is that much GIS data comes from external sources, governmental or commercially commissioned, and is not part of the operational data already available in the company. Mapping from these sources can yield results at a scale and precision unsuited to the type of analysis required. In addition, the use of cartographic scales and projections requires knowledge beyond the reach of the ordinary user, which steepens the tool's learning curve considerably.

The levels of aggregation for data in an integrated GIS-OLAP environment have different specifics for each tool. Typically, an OLAP tool supports roll-up and drill-down operations.
The first reduces the level of detail in a dimension, that is, it aggregates the data from one level into the level above; the second does exactly the opposite, exposing more detailed information. For example, when we query the time dimension at month level, roll-up would total by year and drill-down would show the daily results. These operations are not tied exclusively to hierarchies within a dimension; different dimensions can also be combined, in the process known as slice and dice. In a GIS, aggregation is often done with containers such as a minimum bounding rectangle (MBR) and is restricted to the graphic objects they enclose. This is a static approach that lacks adequate navigational flexibility.

During the modelling of the integrated environment, a compromise must be reached that does not harm performance when computing hierarchies and other kinds of data aggregation or spreading, including the ability to obtain different perspectives on the multidimensional data (pivoting) in the form of maps or reports. If necessary, one should go down to the finest level of detail in describing the graphic objects that correspond to the dimensions of the process, and define the rules for geocoding the non-spatial data that matter in the decision process, to improve the chances of a flexible final product.

Design Methodology

The development methodology of a GIS application is still tied to the tool used. Historically, GIS projects have not followed the traditional stages of data modelling and process specification, mainly because these environments are not integrated with CASE tools and because certain situations are difficult to capture in a conventional conceptual model.
The details of the associations between real-world objects and their spatial representation, as well as the modelling of the properties of those representations, are often hidden in the GIS specification mechanisms. As new tools come to rely on object-oriented technology, this situation may improve appreciably, since definitions can be handled at a higher level of abstraction.

A similar situation holds for Data Warehouse design. Although there are classic models in the area for representing the data structures, such as the star model and the snowflake model [Kimb96], no methodology for designing and implementing a DW is yet consolidated. There are countless case-based suggestions from which users can choose by affinity with their application area, but nothing yet that seeks a clear separation between conceptual, logical and physical design. On the other hand, viewing the data through the OLAP tool makes the initial modelling somewhat obsolete from the standpoint of traditional E-R (Entity-Relationship) modelling. Depending on the OLAP tool chosen, the cross-tabulation and analysis of information often goes beyond this initial modelling, which in some published methodologies serves only to identify the subjects to be handled in the star model.

Defining a model that incorporates both functionalities is an open question, especially in integrated environments such as GIS-OLAP. Modelling the spatial dimension in a DW is simple, but modelling the physical representation of this dimension that GIS packages implement, through descriptors and pointers to the graphic objects, means knowing the internal architecture of the product. As a rule, this representation is left out of the model and appears only through the graphic-control attributes inserted into the tables built through the GIS.
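The contrast drawn above, a spatial dimension that is simple to model logically versus a physical representation hidden inside the GIS package, can be sketched with a tiny star schema. This is only an illustration under stated assumptions: the table names, column names, geocode values and sales amounts are all invented, and SQLite stands in for whatever DBMS backs the warehouse. The geography dimension carries plain descriptive attributes plus an opaque `geocode` that would point at the graphic object kept by the GIS.

```python
import sqlite3

# Hypothetical star schema: one fact table plus time and geography dimensions.
# Every table name, column name and value below is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_geo  (geo_id INTEGER PRIMARY KEY, municipality TEXT,
                           state TEXT, geocode TEXT);  -- geocode: key to the GIS object
    CREATE TABLE fact_sales (time_id INTEGER REFERENCES dim_time,
                             geo_id  INTEGER REFERENCES dim_geo,
                             amount  REAL);
""")
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                 [(1, 1997, 1), (2, 1997, 2), (3, 1998, 1)])
conn.executemany("INSERT INTO dim_geo VALUES (?, ?, ?, ?)",
                 [(10, "Resende", "RJ", "G-001"), (11, "Valença", "RJ", "G-002")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 10, 100.0), (2, 10, 50.0), (3, 10, 70.0), (1, 11, 30.0)])


def total_by(level: str) -> dict:
    """Sum the measure at one level of the geography dimension."""
    if level not in ("municipality", "state"):  # whitelist: level is spliced into SQL
        raise ValueError(level)
    rows = conn.execute(
        f"SELECT g.{level}, SUM(f.amount) FROM fact_sales f "
        f"JOIN dim_geo g ON g.geo_id = f.geo_id GROUP BY g.{level}")
    return dict(rows)


by_municipality = total_by("municipality")  # finer level of the spatial hierarchy
by_state = total_by("state")                # coarser level of the same hierarchy
```

Aggregating by municipality or by state needs nothing beyond the descriptive columns; what the sketch cannot show is the geometry behind `geocode`, which is exactly the part the text says stays inside the GIS.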
The GIS package normally requires database tables to be defined through it, so as to keep the graphic objects and the corresponding data in the database under its strict control. For tables that represent dimensions, pure aggregation in the GIS environment is considerably impaired, since a dimension normally contains a descriptor. The tables that aggregate the dimensions, known as the fact tables of the DW, would be viewed statically, with minimal scope for summarization and pivoting. The modelling and construction of the database that will support the integrated GIS-OLAP environment must be shared from the start of the project and backed by an efficient structure for data and for descriptor management.

Metadata Support

Integrating two tools as different in implementation as GIS and OLAP appears, at first sight, to be a matter of using descriptors to reconcile the different data types and the routines for processing, control and analysis. In fact, neither technology currently has an adequate level of metadata-management support. In general, GISs do not offer the user metadata management, especially regarding the storage structure of graphic objects and the mechanisms of spatial relationship (proximity, coverage and so on). Some companies profit from offering descriptor-support products for the GIS packages most used in the market. These products are still restricted to the operating environment of each tool, but they point to a first step toward integration between them.

On the OLAP side, metadata support plays an even more important role, since it is normally by manipulating metadata directly that the user specifies queries and analyses. It can also be more effective to define aggregations and facts as ready-made views and queries in the metadata environment, and later expand and combine them in ad hoc queries.
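The idea of keeping ready-made aggregations in the metadata and combining them later in ad hoc queries can be sketched as follows. This is a minimal illustration, not the mechanism of any particular OLAP product: the catalogue structure, the function names (`run`, `run_named`, `cross`) and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict

# Toy fact rows; every name and value here is made up for illustration.
FACTS = [
    {"year": 1996, "region": "South", "product": "A", "sales": 10},
    {"year": 1996, "region": "North", "product": "A", "sales": 5},
    {"year": 1997, "region": "South", "product": "B", "sales": 7},
    {"year": 1997, "region": "South", "product": "A", "sales": 3},
]

# Hypothetical metadata catalogue: named aggregations described declaratively
# (which dimensions to group by and which measure to sum).
CATALOG = {
    "sales_by_year":   {"dims": ("year",),   "measure": "sales"},
    "sales_by_region": {"dims": ("region",), "measure": "sales"},
}


def run(dims, measure, rows=FACTS):
    """Sum `measure` over `rows`, grouped by the given dimension names."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)


def run_named(name):
    """Execute a predefined aggregation looked up in the catalogue."""
    entry = CATALOG[name]
    return run(entry["dims"], entry["measure"])


def cross(name_a, name_b):
    """Ad hoc query: combine the dimensions of two catalogued aggregations."""
    a, b = CATALOG[name_a], CATALOG[name_b]
    if a["measure"] != b["measure"]:
        raise ValueError("aggregations use different measures")
    return run(a["dims"] + b["dims"], a["measure"])


by_year = run_named("sales_by_year")
year_region = cross("sales_by_year", "sales_by_region")
```

Because the catalogue entries are data rather than code, a user interface could list them, and a new cross-tabulation is just a new combination of existing entries.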
Vendors, developers and users of products tied to the various information technologies have become aware of the importance of data descriptors and have therefore come together to define a standard for interchanging, managing and sharing metadata. Under the name Metadata Coalition (MDC), they are responsible for publishing the Metadata Interchange Specification (MDIS) [Meta97], which in version 1.1 establishes conventions for modelling, naming and the interchange format of metadata between tools. The evolution of this project and its uptake by the information-systems community may act as a catalyst for the integration process discussed in this article.

Flexibility and Ease of Use

In the proposed integrated environment, one of the key goals is to make the functionality built into both tools available through a consistent, friendly interface. Consistency governs the flexibility and the degree of fit with which the integrated tool can combine properties of the original tools while preserving the most important characteristics of each. The tool must be flexible enough to allow additional features, arising from observation of how the integrated environment behaves, to be incorporated without loss of performance. At the current stage of GIS technology, ease of handling still seems to be a concern advertised by developers more than one felt on the operating screens. In any case, the effort to build a powerful, easy-to-use integrated tool involves addressing two relevant issues: the lack of a consolidated standard among GIS packages (for file handling and user interface) and the burden of the change of environment that the user will face with respect to data visualization.
Standardization

Competition for the GIS market, which initially ran on heavy workstations and peripherals, produced a series of proprietary solutions that are normally incompatible with one another. Today, fully migrating a product generated in Intergraph/MGE to Esri/ArcInfo involves a series of conversions that usually degrade the converted product somewhat, in addition to a considerable cost in person-hours. Demand for specific solutions in the GIS environment has led to more and more spatial-analysis modules and functions being added to the products with no defined criterion, in many cases not even ease of operation. Users moving from one package to another need to learn much of the working philosophy of the new package. This contrasts sharply with the current paradigm in computing, which aims for a single interface and working methods so as to reach the largest possible number of users, whether in the corporate market or in personal computing. There are initiatives coordinated by bodies such as the Open GIS consortium which, given the impossibility of reconciling vendors' interests in favour of a uniform interface and operating methods for current GIS packages, aims to guide the development of future applications related to emerging technologies.

OLAP tools grew out of individual applications and needs, from the first spreadsheets to the client/server architecture, in which users can fetch data from the file server and process them on their personal computers. The Windows standard for personal-computer interfaces gave OLAP tools all their functionality and lightness of operation, with features such as menus, scrolling, drag and drop, copy and paste and so on that are similar across many of the tools. Extensions such as ODBC and OLE allow sharing between different applications and procedures.
Although most GISs already offer clients for the same environment, operation is slowed by the heavy graphics processing normally built into the application, a disadvantage that can be overcome by increasing the capacity of the hardware involved. Since the interface is crucial to a product's success, it is natural for the integrated environment to aim for something close to this standard.

Data Visualization

The difference in how the two tools present results is explained by the audience and purposes they serve. In the integrated environment, convergence toward a mode of visualization that combines the functionality of both is the obvious course, but there are several aspects to consider beyond windows, drag-and-drop and a graphical interface.

Business users become accustomed very early to working with estimates based on statistical analyses prepared in spreadsheets, a typical visualization pattern (Figure 2) that varies between tabular and chart form, with colours and fonts adjusted to flag critical conditions. The data are "flattened" and supplemented with the results of the analysis (the % column). If there is a link to the organization's database, the aim is to expose only the information needed for the cross-tabulations the user wants, which is captured by the spreadsheet and manipulated there. There is no incentive to explore new cross-tabulations or to assess the "behaviour" of the database. Any new analysis need depends on the connection to the database and on the user's access level.
+----------------+--------------------+--------------------+--------------------+
|                |        1970        |        1980        |        1990        |
| Municipality   |    Men  Women    % |    Men  Women    % |    Men  Women    % |
+----------------+--------------------+--------------------+--------------------+
| Piraí          |  28845  23336 80.9 |  32446  26701 82.3 |  32921  28345 86.1 |
| Resende        |  71574  59559 83.2 |  73006  62103 85.1 |  73967  66151 89.4 |
| Rio Claro      |  11217   9621 85.8 |  10997   9401 85.5 |  11086   8529 76.9 |
| Barra Mansa    | 112014  96418 86.1 | 126638 100871 79.7 | 125299  98174 78.4 |
| Itatiaia       |  14395  12414 86.2 |  16902  14626 86.5 |  19003  15604 82.1 |
| Valença        |  48920  42423 86.7 |  47566  40972 86.1 |  46450  39368 84.8 |
| Volta Redonda  | 172885 150162 86.9 | 174412 149938 86.0 | 173702 151652 87.3 |
| Barra do Piraí |  61068  53694 87.9 |  61939  52856 85.3 |  62449  53278 85.3 |
| Quatis         |   6834   6021 88.1 |   6915   6109 88.3 |   7006   6198 88.5 |
| Rio das Flores |   4901   4637 94.6 |   5105   4518 88.5 |   5086   4302 84.6 |
+----------------+--------------------+--------------------+--------------------+

Fig. 2: Data viewed in a statistical table

OLAP applications have an interface that favours support for multiple forms of statistical analysis, both in the type of calculation and in the ease of viewing the attributes and dimensions involved in it. Where users feel most at ease with an OLAP tool is in the ability to cross variables that are apparently unrelated and that begin to make sense once crossed. Being able to see the multidimensional database from different angles (Figures 3a and 3b), the user is encouraged to investigate its contents in greater depth, looking for variants of the initial analysis that may meet immediate and future needs.

A Geographic Information System emphasizes the spatial visualization of data, facilitating queries and analyses that involve concepts such as adjacency, proximity and coverage, among others, over the data in the database. This approach is ideal for users who want to compartmentalize information into a minimum geographic unit that can be aggregated without a rigid hierarchy.
It is very workable in applications where geographic location is the primary need, but inefficient for a more general interpretation of the database contents and inflexible when the aim is to navigate the data and discover new associations between attributes, including the geographic dimension.

The most common GISs and DBMSs on the market communicate with each other in a similar way: the tables are built by the GIS together with the spatial objects (lines, polygons and so on), and the data structures and attributes that control the geographic dimension are stored in the GIS's own database. This database is related to the non-spatial data through specific attributes inserted into each DBMS table linked to a spatial object. In the usual way of representing information in these tables, the different perspectives on the data (dimensions) are generally embedded in the column names (Figure 4). Aggregating and crossing variables is thus hampered by the static two-dimensional view of this kind of representation.

The example in Figure 4 clearly does not do justice to the spatial-visualization capabilities of GISs. What it is meant to show, however, is the potential of a multidimensional view of the stored data. In the environment shared by GIS and OLAP, the user should not be limited to retrieving data in one standard form, but should have instruments for new analyses and consolidations, reinforced by presentation features that overcome the inertia that sets in after the first results are obtained.
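To make the point about dimensions buried in column names concrete, the sketch below takes two rows in the Figure 4 layout and unpivots them into one cell per (geocode, sex, year), after which any combination of dimensions can be aggregated. It is an illustration only: reading H as men and M as women and the two-digit years as 19xx is inferred from the headings of Figure 2, and the function names `unpivot` and `aggregate` are hypothetical.

```python
import re
from collections import defaultdict

# Two rows in the Figure 4 layout; sex and year are encoded in the column names.
# Assumption: H = men, M = women, and "70" means 1970, as in Figure 2.
HEADER = ["Geocode", "H(70)", "M(70)", "H(80)", "M(80)", "H(90)", "M(90)"]
ROWS = [
    ["88aryih87a", 28845, 23336, 32446, 26701, 32921, 28345],
    ["K97kh9k",    71574, 59559, 73006, 62103, 73967, 66151],
]


def unpivot(header, rows):
    """Turn wide columns such as 'H(70)' into (geocode, sex, year, population) cells."""
    cells = []
    for row in rows:
        geocode = row[0]
        for name, value in zip(header[1:], row[1:]):
            sex, yy = re.fullmatch(r"([HM])\s*\((\d\d)\)", name).groups()
            cells.append({"geocode": geocode, "sex": sex,
                          "year": 1900 + int(yy), "population": value})
    return cells


def aggregate(cells, *dims):
    """Sum population grouped by any combination of dimension names."""
    totals = defaultdict(int)
    for cell in cells:
        totals[tuple(cell[d] for d in dims)] += cell["population"]
    return dict(totals)


cells = unpivot(HEADER, ROWS)
by_year = aggregate(cells, "year")                   # sex and place summed away
by_place_year = aggregate(cells, "geocode", "year")  # sex summed away only
```

In the wide layout each of these totals would need its own hand-written column arithmetic; in the long layout the choice of dimensions is just an argument.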
+------------+-------------------------------------------+
| Geocode    |  H(70)  M(70)  H(80)  M(80)  H(90)  M(90) |
+------------+-------------------------------------------+
| 88aryih87a |  28845  23336  32446  26701  32921  28345 |
| K97kh9k    |  71574  59559  73006  62103  73967  66151 |
| 069h96k    |  11217   9621  10997   9401  11086   8529 |
| 757thf75   | 112014  96418 126638 100871 125299  98174 |
| Ut858ut9   |  14395  12414  16902  14626  19003  15604 |
| 070oh06    |  48920  42423  47566  40972  46450  39368 |
| 010kfk92   | 172885 150162 174412 149938 173702 151652 |
| Ll0y0bkk   |  61068  53694  61939  52856  62449  53278 |
| 050kgo50   |   6834   6021   6915   6109   7006   6198 |
| 838jh438   |   4901   4637   5105   4518   5086   4302 |
+------------+-------------------------------------------+

Figure 4: View of the data in a GIS table

Conclusion

GIS and OLAP tools are important technologies for supporting decision making in the areas of spatial analysis and statistical analysis. From the standpoint of the non-specialist user typical of this kind of environment, using these tools independently causes several problems. The user must learn different techniques and products, needs technical support from the information-systems area because data must be migrated between the products and formats and sizes adapted, and may suffer breaks in reasoning caused by jumping from one tool to the other. An integrated environment makes for more efficient learning and analysis, producing better decisions.

Most tools currently on the market are backed by relational database systems. Even so, using these technologies in an integrated way is still quite arduous, since they take different approaches to data modelling (Data Warehouse environments usually employ the well-known star model, which differs radically from the one used in GIS applications) and have different paradigms for manipulation and development. This article has sought to identify characteristics of the two environments, studying the relationship between them and pointing out possible obstacles to, and alternatives for, integrating the two technologies.
The current market movement toward linking specific GIS and OLAP tools is a strong indication that more effort will go into developing integrated solutions to the problems that still exist in these links. This is unlikely to come from developing a completely new environment; it will certainly draw on the resources and lessons learned in operating and implementing existing GIS and OLAP tools.

Authors

Ednilson Carlos Souza da Silva, ednilson@nce.efrj.br
Maria Luiza Machado Campos, mluiza@nce.ufrj.br
IM/NCE, Universidade Federal do Rio de Janeiro
CP 2324, 20001-970, Rio de Janeiro, RJ
Tel. (021)598-3168, Fax (021) 598-3156

References

[Beau91] Beaumont, J. R. GIS and market analysis. In Geographical Information Systems - Principles and Applications, vol. 2, pp. 139-151. Edited by David J. Maguire, M. F. Goodchild and David W. Rhind, 1991, Longman Scientific & Technical (UK).
[Berk97] Berkel, Jan van. Data Warehouse: where to locate GIS. http://www.esri.com/base/common/userconf/proc97/PROC97/PAP650/P650.HTM
[CaRo97] Campos, Maria Luiza M. & Rocha F°, Arnaldo V. Data Warehouse. Data Warehouse course notes. http://www.nce.ufrj.br/~mluiza/dataware/home.htm
[Cast93] Castle, Gilbert H. et al. Profiting from a Geographic Information System. 1993, GIS World Books (US).
[CrWP95] Crossland, M. D. & Wynne, B. E. & Perkins, W. C. Spatial decision support systems: An overview of technology and a test of efficacy. Decision Support Systems, 1995, no. 14, pp. 219-235.
[FoRo93] Fotheringham, A. S. & Rogerson, P. A. GIS and spatial analytical problems. International Journal of Geographical Information Systems, 1993, vol. 7, no. 1, pp. 3-19.
[Grim94] Grimshaw, David J. Bringing Geographic Information Systems into business. 1994, Longman Scientific & Technical (UK).
[Inmo97] Inmon, William H. Como construir o Data Warehouse. 1997, Editora Campus (BR).
[Kimb96] Kimball, Ralph. The Data Warehouse Toolkit. 1996, John Wiley & Sons, Inc. (US).
[KiSt94] Kimball, Ralph & Strehlo, Kevin. Why decision support fails and how to fix it. Datamation, June 1994, no. 1, pp. 40-45.
[Mapi97] MapInfo Corporation White Paper. MapInfo and the Data Warehouse. http://www.mapinfo.com/events/mapolap/olapadms.
[Mart91] Martin, David. Geographical Information Systems and socioeconomic applications. 1991, Routledge Editors (UK).
[Menn97] Mennecke, Brian E. Understanding the role of Geographic Information Technologies in business: applications and research directions. Journal of Geographic Information and Decision Analysis, 1997, vol. 1, no. 1, pp. 44-68.
[MePi94] Medeiros, Claudia B. & Pires, Fatima. Databases for GIS. SIGMOD Record, March 1994, vol. 23, no. 1, pp. 107-115.
[Meta97] Metadata Coalition, The. Metadata Interchange Specification (MDIS). Version 1.1, August 1997. http://www.hc.net/~metadata.
[Spra80] Sprague, R. H. Framework to DSS. Management Information Systems Quarterly, 1980, vol. 4, no. 2, pp. 1-26.
[The95] Thé, Lee. OLAP answers to tough business questions. Datamation, May 1995, no. 1, pp. 65-72.
[Trep97] Trepte, Kai. Business Intelligence Tools. Data Management Review, November 1997, vol. 7, no. 11, pp. 36-40.
[UCGI96] University Consortium for Geographical Information Science. Research priorities for Geographic Information Science. Cartography and Geographic Information Systems, 1996, vol. 23, no. 3, pp. 115-127.

Differences between OLAP and non-OLAP PivotTables in Excel
On this page
- Summary
- More Information
  - Data retrieval and refresh differences: Background query; Parameter queries; Optimize memory; Page field settings
  - Calculation differences: Summary functions; Calculated fields and calculated items; Subtotals; Mark totals with *
  - Layout and structure differences: Dimensions vs. measures; Renaming fields; Grouping and ungrouping items; Detail data; Initial sort order; Show Pages command; Show items with no data
- References

Summary
Microsoft Excel lets you create PivotTable reports based on OLAP (Online Analytical Processing) source data. If you work both with PivotTable reports based on OLAP source data and with reports based on non-OLAP source data, you will notice differences in which features are available and in how they behave. This article covers some of the main differences between PivotTable reports based on OLAP source data and PivotTable reports based on non-OLAP source data.

More Information

Data retrieval and refresh differences
OLAP databases are organized to make it easier to retrieve and analyze large amounts of data. Before Excel displays summarized data in a PivotTable report, an OLAP server performs the calculations that summarize the data. Only the summarized data is returned to Excel, as needed. With non-OLAP external databases, all the individual source records are returned, and Excel then does the summarizing. Consequently, OLAP databases can give Excel the ability to analyze much larger amounts of external data.
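The two retrieval paths described above can be sketched in Python. This is only a minimal illustration under stated assumptions, not Excel's or any OLAP server's actual mechanism: the record layout and the function names `summarize`, `olap_server_summary`, and `client_side_summary` are hypothetical.

```python
from collections import defaultdict

# Hypothetical source records: (region, product, sale_amount).
RECORDS = [
    ("North", "Widget", 120.0),
    ("North", "Gadget", 80.0),
    ("South", "Widget", 200.0),
    ("South", "Widget", 50.0),
]


def summarize(records):
    """Sum sale_amount for each (region, product) cell."""
    totals = defaultdict(float)
    for region, product, amount in records:
        totals[(region, product)] += amount
    return dict(totals)


def olap_server_summary(records):
    """OLAP-style path: the server aggregates and returns only the summary."""
    summary = summarize(records)       # work done on the (hypothetical) server
    rows_transferred = len(summary)    # only summarized cells are returned
    return summary, rows_transferred


def client_side_summary(records):
    """Non-OLAP path: every record is returned, then the client summarizes."""
    transferred = list(records)        # all source records are returned
    summary = summarize(transferred)   # work done on the client
    return summary, len(transferred)


olap_result, olap_rows = olap_server_summary(RECORDS)
client_result, client_rows = client_side_summary(RECORDS)
```

Both paths produce the same summary; what differs is how many rows reach the client, which is why the server-side path scales to larger sources.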
An OLAP server returns new data to Excel each time you change the view or layout of the PivotTable or PivotChart report. When you use non-OLAP source data, the data is refreshed differently, and several refresh options are available in the PivotTable Options dialog box. Non-OLAP data can be returned to Excel as an external data range or as a PivotTable or PivotChart report. OLAP data can be returned to Excel only as a PivotTable or PivotChart report.

Background query
You cannot turn on the Background query option in the PivotTable Options dialog box when the PivotTable report is based on an OLAP data source.

Parameter queries
PivotTable reports based on an OLAP data source do not support parameter queries.

Optimize memory
The Optimize memory check box in the PivotTable Options dialog box is unavailable when the PivotTable report is based on an OLAP data source.

Page field settings
In PivotTable reports based on non-OLAP source data, you can use the page field settings to retrieve data for each page field item individually or for all items at once. These page field settings are unavailable in reports based on OLAP source data. OLAP source data is always retrieved for each item as needed, which allows reports to display information from large OLAP databases.

Calculation differences

Summary functions
You cannot change the function used to summarize a data field in a PivotTable report based on OLAP source data. This limitation exists because the totals are calculated on the OLAP server.
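One way to see why the summary function is fixed: once only per-cell totals have been returned, the individual records are no longer available to the client, so a different function such as a maximum cannot be recomputed from them. A minimal Python sketch, with hypothetical data and function names:

```python
# Hypothetical source records held on the server: (region, sale_amount).
SERVER_RECORDS = [("North", 120.0), ("North", 80.0), ("South", 250.0)]


def server_totals(records):
    """Server-side SUM per region; this is all the client receives."""
    totals = {}
    for region, amount in records:
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def true_max(records):
    """MAX per region, computable only where the records are available."""
    result = {}
    for region, amount in records:
        result[region] = max(amount, result.get(region, amount))
    return result


received = server_totals(SERVER_RECORDS)     # one total per region
server_side_max = true_max(SERVER_RECORDS)   # needs the underlying records

# The client holds a single number per region, so the best it could
# "recompute" is that same number, which is not the real maximum for North.
client_guess_max = dict(received)
```

For the North region the received total is 200.0 while the real maximum is 120.0, so a client that holds only the totals has no way to switch from SUM to MAX.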
Calculated fields and calculated items
You cannot create a calculated field or a calculated item in a PivotTable based on OLAP source data.

Subtotals
The following limitations apply when you work with subtotals in a PivotTable report based on OLAP source data:
- You cannot change the summary function for subtotals in the PivotTable report.
- You cannot display subtotals for inner row fields or inner column fields in the PivotTable report.
- Because the totals are calculated on the OLAP server, you cannot change the Subtotal hidden page items setting in the PivotTable Options dialog box.

Mark totals with *
The Mark totals with * option in the PivotTable Options dialog box is available only in PivotTable reports based on OLAP source data. This option marks each subtotal and grand total with an asterisk (*) to indicate that these values include hidden items as well as the displayed items.

Layout and structure differences

Dimensions vs. measures
When you work with a PivotTable report based on OLAP source data, dimensions can be used only as row, column, or page fields. Measures can be used only as data fields. If you drag a dimension to the data field drop area, or a measure to the row, column, or page field drop area, you receive the following error message: The field you are moving cannot be placed in that PivotTable area. When a PivotTable report based on OLAP source data is active, the PivotTable toolbar displays an icon next to each field. The icon indicates where Excel will let you place the field in the PivotTable report. If the icon is darker in the upper-left corner, the field is a dimension that you can drag to the row, column, or page field drop areas.
If the icon is darker in the lower-right corner, the field is a measure that you can drag to the data field drop area.

Renaming fields
Excel lets you rename fields that you add to the PivotTable. When the PivotTable report is based on OLAP source data, you lose the custom name if you remove the field from the PivotTable.

Grouping and ungrouping items
In Excel 2000, you cannot group items in a PivotTable report based on OLAP source data; however, this is possible in Excel 2002.

Detail data
PivotTable reports based on OLAP source data let you display the lowest level of data available on the OLAP server. However, you cannot display the underlying detail records that make up the summary values.

Initial sort order
With non-OLAP source data, the items in a new PivotTable report initially appear in ascending order by item name. With OLAP source data, items initially appear in the order in which the OLAP server returns them. You can then sort or manually rearrange the items if you want a different order.

Show Pages command
The Show Pages command is unavailable in PivotTable reports based on OLAP source data.

Show items with no data
The Show items with no data option in the PivotTable Field dialog box is unavailable in PivotTable reports based on OLAP source data.

References
For more information about using OLAP data sources in Excel, click Microsoft Excel Help on the Help menu, type OLAP in the Office Assistant or the Answer Wizard, and then click Search to view the topics returned.
The information in this article applies to:
- Microsoft Office Excel 2003
- Microsoft Excel 2002 Standard Edition
- Microsoft Excel 2000 Standard Edition

Keywords: kbmt kbhowto KB234700 KbMtpt

The Portuguese text of this article was produced by a machine translation system and was not reviewed by human translators, so it may contain errors of vocabulary, syntax, or grammar. The English version of this article is 234700.
