© ABB Utilities - 1 2002-01; NM-MW-0169/UTUA/NDF Mats Werner Utility Data Warehouse UTUA/NDF UDW 200 SPIDER Utility Data Warehouse Data Warehouse for storage of all kinds of data Presentation in One-line diagrams Presentation as Trends & Historical information 10.25.32 B9 B8 C1 B3 StationA T1 T2 B1 B2 B4 B5 B6 C2 B7 Data mining and other analysis tools C7 Presentation in Reports & spreadsheets Vedlikeholdsplaner: Region Syd Region/Område Region Syd Stasjon/LinjeAlle Status Alle Type Alle Spenning300kv Skala Uker 1996 27 28 29 30 31 32 33 8 15 22 29 5 August 12 Nr Stasjon/Linje Anl.DelÅrsak Status 1 Juli 96S432-1 Hasle 10 G2 Ombygging I Gang 96S530-2 Flesaker G1 Årlig Revisjon Utsatt 96S532-1 Lysebotn F1 MekaniskOK Reg. 96S541-1 Rendalen G3 UtskiftingOK Land Lukk Detaljplan Ny plan Utskrifter Hjelp © ABB Utilities - 2 - UDW 200 SPIDER SCADA/EMS/DMS Data Warehouse Based on... other Utility Information Systems UDW – Utility Data Warehouse © ABB Utilities - 3 - Oracle-based Data Warehouse for Utility data Able to store the continuous inflow of data from the power process Able to store information from any other Utility Information System Makes all this data available for Data Mining. High User Performance High Availability High Redundancy UDW – Utility Data Warehouse Both Utility-wide Data Warehouse and on-line Historian! Decision Support Historical Analysis in Office Environment in Control Room Environment Real-Time Network Management Systems Non-real-time Information Systems AMS ERP BMS © ABB Utilities - 4 - CIS … Oracle RDB EMS DMS SCADA Utility Data Warehouse Utility Data Warehouse – Original Requirements RDBMS supporting the latest SQL standard ”Ad Hoc” queries, ”Data Mining”, etc. Continuous operation Availability > 99.96 % Redundancy Continuous inflow of data (every 10 sec, 24h/day) Response times (incl. picture presentation): 3-5 seconds Extremely large database tables: > 500 million rows Continuous calculations sum, max, average etc. Complex calculations using MatlabTM © ABB Utilities - 5 - Automatic archiving of data that should not be retained on-line On-line backup Automatic maintenance of the database eliminates the need for a DBA at the customer Relational Analysis The ability to analyze relations between data is the basis for Decision Support and Data Mining Relational Database Design makes this easy Object ID Time Stamp Value Object ID Object Name Station 123456 12:34:10 123.456 789123 12:34:10 456789 1 123456 TRF1 2.234123 1 789123 12:34:10 213.4324 1 123456 12:34:20 8.456 789123 12:34:20 2.2455 456789 12:34:20 212.143 Station Station ID Name 446 446 City23 Downtown TRF2 226 226 FOO WestRing 456789 BAY1 791 … 2 987654 BAY2 446 … 1 … Quality ID … 1 © ABB Utilities - 6 - … Example: ”Give me the name and the location of all lines that has been in alarmed state during the weekend ” Location … UDW - Open but integrated in SPIDER concept UDW is a completely open system UDW is also a well integrated product in the SPIDER concept © ABB Utilities - 7 - … as shown in the following areas: Open Commercial Tools and Interfaces Application Programming Interface, API Well Integrated in SPIDER Powerful Database Structure UDW as an integration platform UDW - Open Commercial Tools and Interfaces Based on Oracle Oracle is an open commercial relational database © ABB Utilities - 8 - delivering ”the industry's highest performance, reliability and security for thousands of popular applications” All tools applicable for Oracle is also applicable for UDW Possibility to access UDW data directly in Oracle, using SQL or Oracle tools New development in Oracle means increased value of UDW Open interfaces ODBC/JDBC Microsoft COM SQL*Net UDW API (see below) UDW Application Programming Interface, API Simplifies access to UDW data From C, C++, Java or Microsoft COM-based applications Simplifies the creation of custom user applications Functions for inserting and reading many data points at once General browser routines, to access data without knowing the data model An Oracle-independent general database server service, Implements Additional functionality User-friendly access routines for setting and getting a range of time-series data When data is stored by event, the data is retrieved as if it was stored cyclically Implements support for Daylight Savings Time, Seamless Data Retrieval and Data Sets (as described later in the presentation) The UDW API is used when: © ABB Utilities - 9 - for remote SQL access without SQL*Net accessing UDW data in the SPIDER environment accessing UDW data through the Microsoft COM interface accessing UDW data in dedicated programs (including UDW-internal) UDW - Well Integrated in SPIDER UDW data can be presented in SPIDER pictures, on-line diagrams the report and trend tools used in SPIDER The object oriented Data Engineering for the UDW is integrated with SPIDER Data Engineering. The UDW programs are monitored by SPIDER Alarms and warnings from UDW are sent to SPIDER alarm list © ABB Utilities - 10 - e.g. if the data sampling is not functioning properly due to network or hardware problems UDW - Powerful Database Structure Data is stored in a user-friendly way Uncompressed in open Oracle tables Database views to simplify common queries Data model is temporal history of each object exist in the database calculation algorithms related to certain objects are time-stamped Network relations and calculations are therefore correct even when studying old data, e.g. data restored from off-line archives Very good performance regardless of system and database size, © ABB Utilities - 11 - By the use of the Oracle partitioning technique Sampling schemes are easily defined using Object Types and Patterns UDW – Object Types and Patterns Sampling schemes defined in Object Types Each Object Type corresponds to one Oracle Table Objects to be sampled Object Types UDW patterns sampled Object Types calculated Object Types 10 sec measurands 1 min average measurands Measurement pattern 2 indications by event hourly minimum measurands Indication pattern state estimator result daily maximum measurands Measurement pattern 1 For each Object Type is defined: Sampling method cyclically – including sampling frequency by event – with our without integrated dead band © ABB Utilities - 12 - Retention An object to be sampled refers to a Pattern A Pattern refers to a number of Object Types If period it shall be archived when retention period is over Enables easy set-up of sampling schemes for sampled objects UDW as and Integration Platform (example) Platform for application development and cross-platform integration Developing a simple application in MS Excel, using the UDW COM interface Combine the data in the Utility Data Warehouse with the functionality of Excel: © ABB Utilities - 13 - • Visual Basic programs • Excel spreadsheet functions • Excel Charts UDW - Data Model Sampled data is stored in Object Types A simple Power System model is also provided including e.g. stations, sub systems, point classes To enable queries and selections based on relational criteria e.g. ”show all measured values included in station named AMHERST” Automatically populated from SPIDER Data Engineering Temporality is provided Objects Stations 10.25.32 B9 B8 Station A C1 T1 C2 T2 B1 Subsystems Information and relations between real-world objects B2 B7 B3 B4 B5 B6 C7 © ABB Utilities - 14 - Real-world objects 10 sec measurands 1 hour average daily maximum Object Types: values related to sampled real-world objects Additional data relations possible to define easily Since an open relational database technology is used UDW - Data Sampling Virtually any type of information can be stored in the UDW SPIDER SCADA/EMS/DMS Sampling All types of data in the SPIDER real-time database Avanti can be sampled and stored in the UDW database. Cyclically, according to a user-defined sampling rate. “By Event" (upon change), with or without an integrated dead band Sampling from other Information Systems © ABB Utilities - 15 - The actual sampling is highly dependent on the source system from which the data is extracted. There are several ways to store data into UDW: Direct SQL insert, e.g. through database links (assuming an RDB source) Direct access through open interfaces (e.g. ODBC, JDBC, Microsoft COM) Dedicated program that uses the UDW API Dedicated program that uses the ”bulk data inserter” (as used in SCADA-sampling) Access through Avanti (e.g. by first inserting data into Avanti) UDW – By Event sampling, integrated dead band Sampling data By Event with an integrated dead band a threshold value in percent of the measurement range is defined measured value changes are accumulated between samples © ABB Utilities - 16 - if the actual value differs only slightly from the latest stored, it will thus eventually be sampled anyway UDW - Quality Marking UDW handles quality information together with all sampled values. propagates automatically through all calculations validity of calculation result depends on quality of included values according to what is defined in Data Engineering Mapping of quality states is defined in Data Engineering For SPIDER sampling these mappings are set by default, but are configurable. Possible to distinguish quality information generated in UDW and quality information originating from the SPIDER Avanti database. © ABB Utilities - 17 - Original quality value Quality value in UDW Quality value in UDW after manual correction in UDW after calculation (a+b) Object Value Quality Object Time Stamp Value Quality Object Time Stamp Value Quality 123456 123.46 Invalid 123456 12:34:10 123.46 Invalid 22222 12:34:10 125.46 Invalid 789123 2.000 Corrected 789123 12:34:10 2.000 Corrected in SPIDER 33333 12:34:10 7.000 OK 456789 213.434 OK 456789 12:34:10 5.00 Corrected in UDW UDW - Post Disturbance Review The Post Disturbance Review, PDR, function allows the user to monitor the process behavior before, during, and after a disturbance T rigging event (discarded) T rigging event In the DE tool it is specified: T rigging event (prolonging) time © ABB Utilities - 18 - period before event Deadzone which data is to be monitored, which events that will trigger a PDR time periods the number of areas available for storage of disturbance information period after event A disturbance recording can be manually triggered by an operator It is possible to playback the recorded disturbance data in a Single Line diagram. Upon operator request, data for completed disturbances can be saved on the off-line medium used for archiving UDW - Data Archiving For each Object Type it is defined (in Data Engineering): how long it shall be retained on-line in the database if it shall be archived When retention period has passed, data is automatically exported to the chosen off-line medium UDW maintains a library system for the archived data. Seamless presentation is supported If restored data and on-line data connects in time Standard software can be used for the Archiving function. © ABB Utilities - 19 - When data shall be restored the user specifies the type of data and for which time interval the data shall be restored such as Legato NetworkerTM UDW - Advanced Data Processing Powerful and versatile data processing capabilities Times series calculations: Snapshot calculations All power of the Matlab tool is available for utilization Calculation wizard to ease simple calculation definitions Complete ROLAP1 aggregation support (configurable) Data can be automatically aggregated in all dimensions © ABB Utilities - 20 - Raw data and processed data are stored and handled the same way, i.e. both can be used for e.g. data mining Both during sampling and at manual entry Quality coding of all values E.g. summary for e.g. a station or the whole power network Automatic recalculation of all2 formulas Any formula/calculation on objects for many points of time Matlab used as calculation engine Any formula/calculation on objects for a certain point of time Custom calculations Sum, Average, Max, Min Propagates through calculations Calculation patterns (“templates”) For easy connection of sampled objects to a set of pre-defined calculations and sampling profiles 1ROLAP = Relational On Line Analytic Processing 2Restrictions apply on custom calculations to prevent recursive deadlocks UDW - Report and user interaction functions UDW data is easily accessed from various client tools There are four main access routes: Through SPIDER (historical data originating from SPIDER can be presented in SPIDER trends and spreadsheet reports, and in one-line diagrams). Direct SQL read (through SQL*Net, Oracle tools, Microsoft Query etc.) Direct access through open interfaces - ODBC, JDBC, COM, (e.g. Microsoft Office programs). Through the API or Java API (dedicated programs accessing UDW data) Presentation in Reports & spreadsheets Presentation in One-line diagrams Presentation as Trends & Historical information Vedlikeholdsplaner: Region Syd 10.25.32 Station A T1 B4 C2 T2 B1 B3 Region/Område Region Syd TypeAlle StatusAlle Stasjon/Linje Spenning Skala Uker Alle 300kv 1996 27 28 29 30 31 32 33 15 22 29 5August 12 Nr Stasjon/Linje Anl.Del Årsak Status1 8 Juli 96S432-1 Hasle 10 G2Impinging I Gang 96S530-2 Flesaker G1 Årlig Revisjon Utsatt 96S532-1 LysebotnF1 Mekanisk OK Reg. 96S541-1 RendalenG3 Utskifting OK Land B9 B8 C1 B2 B5 B6 Data mining and other analysis tools B7 C7 © ABB Utilities - 21 - Lukk UDW 200 Database (Oracle) Detaljplan Ny plan Utskrifter Hjelp UDW – Client example: SPIDER Time Tagged Trends The Time Tagged Trend (TTT) function is used for presentation of historical data in curve diagrams. Up to 16 curves in each diagram Automatic scaling of value axis depending on curve values. The values can be fetched from different object types © ABB Utilities - 22 - each with its own value axis. Can have time offsets relative to each other. Quality coding UDW – Client example: SPIDER Single Line Diagram UDW data can be shown in the Single Line Diagram © ABB Utilities - 23 - by selecting a historic time and an Object Type or a Data Set. Possible to start a replay of the data using tape-recorder like controls: play, advance, rewind, stop the selected time is automatically changed with a given speed. UDW – Client example: Microsoft Excel © ABB Utilities - 24 - There are several ways to get UDW data into Excel: via the SPIDER Excel add-in directly through the Oracle ODBC driver copy/paste or export/ import data retrieved with another tool, e.g. Oracle™ Browser use the UDW COMinterface in an Excel Visual Basic program, that reads the appropriate information from UDW Once the data is in Excel, its built-in functions can be used i.e. data can be calculated on, charts can be created etc. UDW – Client example: Microsoft Query © ABB Utilities - 25 - Using the Microsoft Query tool it is possible to create custom queries to the UDW in a user-friendly way, from the Windows environment © ABB Utilities - 26 - UDW – Client example: MATLABTM Graphics Using the integrated MatlabTM server it is possible to define advanced formulas and calculations Using a MatlabTM client tool it is also possible to benefit from the advanced graphical user interface of MatlabTM, and that way visualize data and correlations found in UDW data. UDW - Data Sets The Data Set function facilitates presentation of UDW data Groups a set of Object Types A Data Set may include object groups from different SPIDER concepts: Measurands, Process values, Indications etc. © ABB Utilities - 27 - Can thereafter be referenced as one identity Practical e.g. when presenting historical data in Single Line diagram UDW - Seamless Retrieval – Data Set example Measured values are often sampled with several frequencies, where higher frequency normally corresponds to a shorter retention period Each sampling frequency corresponds to one Object Type For some points of time there may exist data in more than one Object Type By creating a Data Set of these Object Types, it is possible to address the sampled values as one entity get best possible resolution for all points of time Data that are retrieved when asking for data from t1 to tn: 10 second t1-t2 1 hour values t2-t3 1 minute values t3-tn 10 second values 1 minute © ABB Utilities - 28 - 1 hour t1 t2 t3 tn UDW - Daylight Saving Time Calendar time is normal time with Daylight Savings offset The UDW has full support for Daylight Savings Time. It is possible to store both normal time and calendar time in UDW. It is recommended for most usages to only store normal time and let the UDW API take care of the DST conversions. If the times are retrieved through the UDW API, the UDW takes care of the conversion both when it comes to individual points of time and larger time spans. The UDW API is used when accessing UDW data in the SPIDER environment, by the MS COM interface and when accessing UDW data in dedicated programs Example: If the user wants to get data for a time span that includes a switch between normal time and Daylight Savings Time, there will be the correct number of times retrieved (e.g. an extra hour if going from daylight savings to normal time). © ABB Utilities - 29 - The calendar time is used in display pictures and reports UDW - Authority UDW uses standard Oracle™ authority routines Users are assigned to a Profile and one or more Roles Roles provides specific access rights Profiles are used to set session parameters Such as the maximum CPU time for a session Examples of Roles defined in UDW: system administrator, privileged user, common user, etc. Number of sessions Max CPU per session Connect time during session etc. © ABB Utilities - 30 - Profiles are used to limit the risk for low priority users or user groups to impair the performance of the UDW database by using badly formed SQL-statements, for instance, a full table scan. Profile User Database creator System administrator Role UDW - Audit Trail The Audit Trail function in UDW tracks all changes made in the UDW database For all manual updates the following data is stored: old and new value the affected time and object identity the point of time of the modification an operator identification The standard Oracle™ audit trail function can also be used Tracking of insert is not recommended © ABB Utilities - 31 - providing tracking of user actions on a table level. the insert rate is very high in the system, and auditing insert would degrade performance. UDW – Planned Values The UDW supports storage of planned values A ”window” of future data points are a opened for storage Can be used to store e.g. forecasts or schedules Defined as an Object Type in Data Engineering © ABB Utilities - 32 - Handling of retention period, archiving etc. just like other Object Types A number of parameters are possible to define: if the values shall be overwritten by actual sampled values or not if the values shall initially be set to blank or ”invalid” the size of the ”window” – the time span to be opened in the future initial time period of the plan, i.e. including historical times UDW - Data Engineering The Data Engineering of the UDW is object oriented and dialog driven. Is totally integrated with the Oracle-based SPIDER data engineering tool. © ABB Utilities - 33 - All data engineering logic for UDW is contained within the UDW product. UDW – Specification of Object Types For sampled Object Types: sampling method: by event (including dead band) or cyclically (including frequency) retention period, i.e. how long the object shall be retained on-line in the database mapping between the source system (SCADA) and UDW quality codes properties to be sampled from source system (SCADA) to UDW © ABB Utilities - 34 - for instance the actual value and updated flag for a measured value For calculated Object Types: source Object Type (a sampled Object Type or another calculated Object Type) type of calculation (e.g. Average, Sum, Max, Min) required percentage factor of valid source data to get a valid result if time stamping shall be in the beginning or end of the calculated period UDW - Backup and recovery For safety copy reasons the on-line data in UDW can be backed up, using standard Oracle™ functionality. The backup is done in a way so that all data, including calculated values, are consistent. The interval between backup activities is configurable UDW Safety Backup and UDW Archiving are complementary functions! © ABB Utilities - 36 - Oracle RDB Utility Data Warehouse UDW on-line system UDW Safety Backup UDW Archiving Archiving of a safety backup of on-line data Archiving of data older than retention period UDW - Configuration The UDW can be configured in different ways to provide the required degree of redundancy and data security. Data Redundancy The data in the UDW database is stored redundantly so that if one disk fails, no data is lost To get full data redundancy, RAID disks of type 0+1 are used Server Redundancy © ABB Utilities - 37 - striping of data and disk mirroring is applied The UDW servers are redundant so that if one UDW server becomes inoperable (e.g. due to hardware errors) another server is taking over. Different configurations are applicable depending on the requirements on allowable downtime and loss of data during switch-over. Redundant Server solutions, with automatic switchover Emergency Center solutions, with manual switchover Single Server solutions UDW - Redundant Server solution Redundant Server solution No single point of failure Automatic switchover when the on-line server malfunctions Separate server for Safety Backup and Archiving Backup Server DLT Automated Tape Library with duplicate drives for archiving and restoring SCSI SCSI Backup LAN SCADA/EMS/DMS The two UDW servers are connected in a cluster configuration Both servers receive data to be stored from the sampled system. One of the servers is on-line while the other is hot standby and ready to take over the processing. The on-line server updates the UDW database with the sampled data and calculation results Memory Channel © ABB Utilities - 38 - UDW Server 1 Server-RAID Server-RAID Connection Connection SCSI SCSI UDW Server 2 RAID Disk LAN UDW – Emergency Center solution Emergency Center solution No single point of failure Manual switchover when the on-line server malfunctions Original UDW rebuilt by restoring a safety backup SCADA/EMS/DMS HIS Server, online RAID Disk Server-RAID Connection SCSI The standby server is kept up-to-date with the on-line server, using Oracle Standby It is possible to restart the standby database as the primary UDW database with minimal loss of time and data. LAN SCADA/EMS/DMS HIS Server, standby Emergency Control Center © ABB Utilities - 39 - RAID Disk W A N Server-RAID Connection SCSI LAN UDW – Single Server solution Single Server solution With either RAID or disk shelf SCADA/EMS/DMS Disk or RAID Shelf UDW Server SCSI © ABB Utilities - 40 - LAN If a disk shelf is used instead of RAID, the single disk controller is located in the UDW Server