TS5352 – Enterprise Database Systems
Mark Lindquist
mrkcl@msn.com
September 2009

Final Project
Information System Planning at MVCH

Table of Contents:
Introduction
Purpose of this document
Future data management at MVCH
A conceptual model
The database and its data definitions
The client/server environment
Distributed data and databases
Quality Governance
Security Governance
Business continuity planning
Data integration strategy
Business Intelligence strategy
Future data management challenges
Conclusion
Appendix A: The Data Definition Language (DDL)
Appendix B: Annotated bibliography
Appendix C: SQL and Data Security

Introduction:

In times of old, meaning two decades or so ago, business computer systems were far less mature. The business aspect of computering, as I like to call it, revolved around transactions that used flat file data structures. The database environment, whose benefits over traditional file systems will become clear, was a new concept in data management. As the internet and local intranets grew and became the driving force behind business to business (B2B) transactions, the need for a stable and efficient data processing environment was met by the database and its Database Management System (DBMS). At MVCH, the database environment and its benefits over the current flat file system, which runs from a centrally located mainframe, have yet to be realized.

A database environment has many advantages over the traditional file processing system. The disadvantages of file processing systems include “program data dependence, duplication of data, limited data sharing, lengthy development times, and excessive program maintenance” (Hoffer P. 12). The database environment, by contrast, offers important advantages. First, the database approach strives for integrated data that is shared throughout the hospital.
Second, the advantages of the database approach include program data independence, planned data redundancy, improved data consistency, improved data sharing, increased productivity of application development, enforcement of standards, improved data quality, improved data accessibility and responsiveness, reduced program maintenance, and improved decision support. Given these benefits, MVCH is encouraged to replace its old file processing system with a database environment.

Purpose of This Document:

The purpose of this document is to show that the future data needs of MVCH will require a major update of the current system. Establishing a database environment for MVCH requires many components. From gathering the correct data to creating the security and quality that the data requires, this document details the processes, policies, and structures needed to build a solid information system for MVCH. After reading this document you should understand how vital it is for MVCH to update its current system and, more importantly, the processes, steps, ingredients, and meanings of a successful database environment.

There are diagrams and models that show data relationships and data entities, the building blocks of the relational data structure. There is the client/server architecture that will comprise the information system. There are security, privacy, and quality issues to explore. The database itself will be created and SQL queries run against it. The risks associated with running and maintaining the database fall under a Business Continuity Plan (BCP), covered in the later pages of the document. Data integration, the key to a successful business, will be described. Business Intelligence and future data management challenges round out the document.
Future Data Management at MVCH:

Future data management at MVCH, assuming adoption of the new information system, entails a twofold approach. The near future encompasses the envisioned data structure of the new database environment: its components, processes, and technologies. The longer term envisions a data structure that meets the changing needs of a growing MVCH and surrounding area.

To meet the near-term needs of MVCH, data must be managed in the new database environment. The function of a database at MVCH is to store business related data for business related processing, but the data itself must be fit for use. Data should be accurate, unique, complete, consistent, and current, and should meet other quality characteristics, to ensure a robust, efficient, and capable database.

The longer term data management needs are not as clear. MVCH and the surrounding community will undergo growth and change, and this growth and change will dictate the data needs within the new information system. The data used today for everyday business transactions will differ from tomorrow’s data, which will reflect that change and growth. A sound data management program, overseen by the data administrator, should therefore manage both sets of needs. The program must see to the current needs of MVCH and the surrounding community while staying aware of change and growth, and it must meet the resulting needs with accurate data.

A Conceptual Model:

The database development process begins with creating an Enterprise Data Model (EDM), part of the external schema of database development. The EDM “is a high level model that identifies, defines, and relates the major entities of interest in an organization” (Hoffer P.57). Along with the EDM are the various user views of the database.
The top down, high level depiction of an organization's data (the EDM) coexists with these user views to form the external schema of the database development process. The conceptual model is then created from this external schema.

The three schema architecture of developing a database consists of this first step of defining the external schema, followed by the logical schema and the physical schema. Each step revises and adds to the previous step in defining the constructs of the database environment. The logical schema defines the database in more detail than the external schema, whereas the physical schema derives the physical characteristics of the database from the logical schema.

The EDM for the MVCH database environment, Diagram 1, is depicted below. This is just a high level depiction of the hospital's major data entities and their relationships, without attributes, primary keys, or foreign keys. Diagram 2 is more detailed than the EDM, adding attributes and primary and foreign keys, defining the organizational scope of the data, and listing entities and data types. The conceptual model is independent of any database management technology.

[Diagram 1 - Conceptual Data Model]
Hospital (1) Assigned (M) Employee
Hospital (1) Has (M) Ward
Hospital (1) worksAt (M) Physician
Ward (1) Houses (M) Patient
Patient (1) incurs (M) Charge

[Diagram 2 - Logical Data Model]
Hospital: PK HospitalID; HospitalName, HospitalLocation
Employee: PK EmployeeID; FirstName, LastName, JobTitle, YearsEmployed; FK1 HospitalID
Physician: PK PhysicianID; FirstName, LastName; FK1 HospitalID
Ward: PK WardNumber; WardSpecialty; FK1 HospitalID
Patient: PK PatientID; FirstName, LastName; FK1 WardNumber
Charge: PK ChargeID; Procedure, DatePerformed; FK1 PatientID

Below is a table of the conceptual, logical, and physical design models and their characteristics. Note how each model gets more detailed than the previous model.
The result is a completed and fully functional database to meet the needs of MVCH and the surrounding community.

Feature                 Conceptual   Logical   Physical
Entity Names            ✓            ✓
Entity Relationships    ✓            ✓
Attributes                           ✓
Primary Keys                         ✓         ✓
Foreign Keys                         ✓         ✓
Table Names                                    ✓
Column Names                                   ✓
Column Data Types                              ✓

Table by 1keydata.com

The Database and its Data Definitions

Creating the database, creating, altering, or dropping a table, and establishing constraints on the database are all achieved through the physical schema of the three part model described above. The Data Definition Language (DDL) is the set of SQL commands, submitted to the DBMS, that constructs the database and its tables. The DDL is first generated from the DDS-Lite ER diagram and then run within Oracle's SQL Developer environment, which executes the commands. The output confirmed by SQL Developer is a fully functioning database with tables that represent vital hospital entities: Hospital, Employee, Patient, Ward, Physician, and Charge. The DDL for our new information system at MVCH is located in Appendix A.

The Client/Server Environment

Now that our new information system database has been created and key data entities established, it becomes necessary to describe the architecture that will comprise the database, the processes that act on the data (applications), and the users of the system. The old paradigm of data manipulation was an approach whereby the database and applications were located centrally in one location. With the client/server approach the two tiered system emerged: storage of data was handled by a database server, and the processing logic was handled by the clients, or users, of the system. The problem was that too much work fell on the two tiers, which decreased system performance.
The solution was a three tiered, or n-tiered, approach whereby a web server, an application server, or both share or take over the processing logic usually handled by the client. The result is a thin client.

In the three tiered client/server approach being proposed for MVCH, the presentation logic is handled by the client. This covers input of data and output to the user in some way, such as on a screen. The processing logic is handled by the web server, the application server, or both, and data storage is handled by the database server. A good example of the client/server architecture, and one appropriate for MVCH, is: “In hospital data processing, for example, a client computer can be running an application program for entering patient information while the server computer is running another program that manages the database in which the information is permanently stored” (TheFreeDictionary.com). With an application server placed between the two, this becomes the three tiered system, in which the client, application server, and database server provide presentation, processing, and storage logic respectively.

The Distributed Database

The data in the database(s) envisioned for MVCH will appear to the users of the system as coming from a single, centrally located database. In reality the data is dispersed, and the database that holds it is referred to as distributed. “A distributed database is a single logical database that is spread physically across computers in multiple locations that are connected by a data communications network” (Hoffer P.616). The advantages of the distributed database are ease of use for the user, due to location transparency, and the ability to administer each local database locally, due to local autonomy. To make all this happen, Hoffer (P.617) refers to the distributed DBMS. The distributed DBMS in place at MVCH will keep track of data, making it appear as one source. It will provide security, concurrency, and recovery features. It will allow growth and be scalable.
The distributed DBMS determines where requested data is located and where it is processed, all transparently to the user. With a distributed database in place, MVCH will offer ease of use to end users and, in the event of the main database becoming unavailable, will allow local data to be administered locally, giving the individual clients ownership, so to speak, of the data.

Quality Governance

Under MVCH’s new quality governance committee, data will be managed so that it is reliable and effective for the hospital. This requires periodic cleansing, or scrubbing, of the data. Under the guidance of a data steward appointed by the committee the data will be cleansed, “removing incorrect, incomplete, improperly formatted, or duplicated” (TechTarget.com) data. This will ensure a hospital that runs on data that is fit for use. Because there are currently no data quality measures in place to serve as a baseline, the data steward will conduct a data audit. A data audit profiles each file in the database, looking for inconsistencies such as extreme values in certain columns. Data will also be checked against the business rules of MVCH to make sure it aligns with them.

Security Governance

Quality data is just one aspect of the new information system that will make it perform efficiently, or perform at all. Securing that data is as important to MVCH as introducing good quality data. Data security is “the protection of the data against accidental or intentional loss, destruction, or misuse” (Hoffer P. 569). The new system at MVCH will be complex, with a distributed client/server database environment. Packets of data will traverse the local intranet and the internet. With this type of environment security is of the utmost concern: data can be intercepted, misused, or corrupted.
Within MVCH’s new information system data must be secure, but from what or whom? Threats to MVCH’s data environment include intentional and unintentional actions against the data. Human error, software and hardware caused irregularities, accidental loss, theft, fraud, loss of data quality, and unavailability must all be brought under a security governance program initiated within MVCH. Techniques and tools to aid in securing data include views, assertions, checks, domains, and other integrity controls, all part of the DBMS software.

Encryption is another powerful tool in helping to secure data at MVCH. With no protection of data currently in place at MVCH, encryption techniques will be put into place throughout the new information system. Encryption scrambles data into an unreadable form using either a single shared key or a pair of keys, one public and one private. With a single key, the author of the data and the user who wants access to it hold the same key: it encrypts the data for the author and decrypts it for the user. With a public/private key pair, the data is encrypted with the user's public key, and only the user, who holds the matching private key, can decrypt the message.

Access to data can also be controlled through the use of VIEWS and GRANT privileges. Appendix C contains screenshots of these technologies in action.

Business Continuity Planning (BCP)

BCP is essential for MVCH. BCP is a strategy for the organization that attempts to cover all aspects of disaster. Because many organizations view BCP as an extra expense with mediocre results, it has not been an activity that most embrace. Since the attacks of September 11, 2001, however, organizations increasingly see BCP as essential to their existence. For instance, only 44% of companies that go through fires continue to exist after the disaster, and of these only 33% survive longer than three years. The need for a BCP is clear. MVCH must first assemble the team who will design the actual BCP.
Management of the BCP should be set up, as should a timeframe and the initial costs of the proposed system. Next the team must identify the risks and conduct risk assessments to determine which risks the BCP will address. Risks such as “technical, economic, internal, external, human or natural” (Naef W.) must be considered under the BCP. Only then can the BCP be created at MVCH. Based on the above, the BCP should address each identified area of disaster and allow for a fast and complete recovery. Most importantly, the BCP must be tested in each area of concern, so that the tests reveal its extent and efficiency. "We see far too many Business Continuity Plans and or Disaster Recovery Plans that whilst they have been tested were done so in unrealistic ideal conditions and thus we do not truly recognize what really happens in a crisis" (Spinks D.). The tests and their results should align with the business needs of MVCH. To keep MVCH a competitive hospital, the BCP should cover the areas that MVCH deems most important to its success.

MVCH’s adoption of a good BCP will help ensure that the new information system continues to perform even in the event of disaster. Along with the information system, the hospital itself can continue its duties, confident that an unforeseen disaster will not bring it down.

Data Integration

The current data structure at MVCH is inadequate because business units within the hospital operate on data and use databases that serve only the individual departments. Currently each department runs a database, separate from the other departments, to satisfy its own purposes. The consolidation of data into a single database should be the ultimate goal of MVCH. This single, enterprise wide database provides each department with the data it needs to perform its tasks and makes the use of that data transparent to all functions within all departments, for the hospital as a whole.
A more detailed and technical look at data integration is revealed through Enterprise Application Integration (EAI) and Enterprise Information (EI). EAI is the conversion of one data type to another data type, through middleware, so one application can share data with another. Metadata reformatting is a key aspect of EAI. A key benefit of EAI is that it eliminates the need to redesign processes and systems to achieve data integration, leaving existing systems alone. For many organizations this is an important factor, because EAI can integrate data without the hassle of completely redesigning the data structure. However, EAI does not take into account the high level business intelligence that results from the enterprise database. And although EAI requires no redesign of the current data architecture, the old separated data and architecture currently in use at MVCH will, in fact, be redesigned anyway to achieve the client/server architecture.

EI would be a better choice to achieve integration of data at MVCH because it requires a redesign of existing data structures, something that, as mentioned above, MVCH will need to do anyway. EI also achieves a true centralized data approach. It does this by establishing a common set of metadata which not only centralizes data but also ensures that the functions of the applications share data definitions. As a result, data processes are standardized, producing data that is accurate. If an organization like MVCH can take on the complex task of redesign, then EI, which establishes a centralized database in which all data is shared, is certainly worth the effort.

Business Intelligence

“The key to thriving in a competitive marketplace is staying ahead of the competition. Making sound business decisions based on accurate and current information takes more than intuition” (Rosetti L.).
Business Intelligence (BI) refers to the technology and applications that use an organization’s raw data to achieve this accurate and current information. With BI at MVCH, the key data entities that support the new information system will enable hospital wide decisions that further a competitive advantage for MVCH.

Future Data Management Challenges

The data needs of an organization today and in the future are different from the data needs of a decade or two ago. Today concepts like unstructured data, business objects, and business intelligence (BI) pervade all aspects of the database environment and its future. In recent years it has become recognized that successful organizations increasingly align business goals with key “business objects” such as customer, product, and order. BI, a recent concept, refers to “a broad category of applications and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions” (Rosetti L.). Unstructured data is data held in binary large object (BLOB) format, examples of which “include e-mail files, word-processing text documents, PowerPoint presentations, JPEG and GIF image files, and MPEG video files” (Blumberg).

I think that with the benefits of an object oriented data structure the future needs of the organization can be met head on. Object orientation uses classes, the blueprints of objects, and objects, which have properties and operations. Thus an object is said to have behavior and state, and when instantiated from its class it can represent real life entities and events.

Conclusion

This planning document for the new information system at MVCH has covered many areas of concern.
From gathering the data that will form the basis for our new system, to creating the database with the DDL, to securing and providing quality data, this document aims to describe, explain, and make clear the ingredients that will make up an efficient information system. I hope that all concerned who want to understand the new information system have read this planning document and can now relate the need for the new data technologies to the continued growth of MVCH.

Appendix A: Data Definition Language (DDL) for MVCH

--
-- Target: Oracle
-- Syntax: sqlplus user@tnsnames_entry/password @filename.sql
--
-- Date : Sep 11 2009 17:13
-- Script Generated by Database Design Studio 2.21.3
--
-- Key columns use NUMBER(38), the largest precision Oracle accepts.
-- Each PRIMARY KEY constraint already enforces uniqueness on its column.
--

--
-- Create Table : 'Hospital'
-- HospitalID :
-- HospitalName :
-- HospitalLocation :
--
CREATE TABLE Hospital (
    HospitalID NUMBER(38) NOT NULL,
    HospitalName VARCHAR2(30) NOT NULL,
    HospitalLocation VARCHAR2(30) NOT NULL,
    CONSTRAINT pk_Hospital PRIMARY KEY (HospitalID))
/

--
-- Create Table : 'Physician'
-- PhysicianID :
-- FirstName :
-- LastName :
-- HospitalID : (references Hospital.HospitalID)
--
CREATE TABLE Physician (
    PhysicianID NUMBER(38) NOT NULL,
    FirstName VARCHAR2(30) NOT NULL,
    LastName VARCHAR2(30) NOT NULL,
    HospitalID NUMBER(38),
    CONSTRAINT pk_Physician PRIMARY KEY (PhysicianID),
    CONSTRAINT fk_Physician FOREIGN KEY (HospitalID)
        REFERENCES Hospital (HospitalID))
/

--
-- Create Table : 'Ward'
-- WardNumber :
-- WardSpecialty :
-- HospitalID : (references Hospital.HospitalID)
--
CREATE TABLE Ward (
    WardNumber NUMBER(38) NOT NULL,
    WardSpecialty VARCHAR2(30) NOT NULL,
    HospitalID NUMBER(38),
    CONSTRAINT pk_Ward PRIMARY KEY (WardNumber),
    CONSTRAINT fk_Ward FOREIGN KEY (HospitalID)
        REFERENCES Hospital (HospitalID))
/

--
-- Create Table : 'Employee'
-- EmployeeID :
-- FirstName :
-- LastName :
-- JobTitle :
-- YearsEmployed :
-- HospitalID : (references Hospital.HospitalID)
--
CREATE TABLE Employee (
    EmployeeID NUMBER(38) NOT NULL,
    FirstName VARCHAR2(30) NOT NULL,
    LastName VARCHAR2(30) NOT NULL,
    JobTitle VARCHAR2(30) NOT NULL,
    YearsEmployed VARCHAR2(30) NOT NULL,
    HospitalID NUMBER(38),
    CONSTRAINT pk_Employee PRIMARY KEY (EmployeeID),
    CONSTRAINT fk_Employee FOREIGN KEY (HospitalID)
        REFERENCES Hospital (HospitalID))
/

--
-- Create Table : 'Patient'
-- PatientID :
-- FirstName :
-- LastName :
-- WardNumber : (references Ward.WardNumber)
--
CREATE TABLE Patient (
    PatientID NUMBER(38) NOT NULL,
    FirstName VARCHAR2(30) NOT NULL,
    LastName VARCHAR2(30) NOT NULL,
    WardNumber NUMBER(38),
    CONSTRAINT pk_Patient PRIMARY KEY (PatientID),
    CONSTRAINT fk_Patient FOREIGN KEY (WardNumber)
        REFERENCES Ward (WardNumber))
/

--
-- Create Table : 'Charge'
-- ChargeID :
-- Procedure :
-- DatePerformed :
-- PatientID : (references Patient.PatientID)
--
CREATE TABLE Charge (
    ChargeID NUMBER(38) NOT NULL,
    Procedure VARCHAR2(30) NOT NULL,
    DatePerformed VARCHAR2(30) NOT NULL,
    PatientID NUMBER(38),
    CONSTRAINT pk_Charge PRIMARY KEY (ChargeID),
    CONSTRAINT fk_Charge FOREIGN KEY (PatientID)
        REFERENCES Patient (PatientID))
/

--
-- Permissions for: 'public'
--
GRANT ALL ON Hospital TO public
/
GRANT ALL ON Physician TO public
/
GRANT ALL ON Ward TO public
/
GRANT ALL ON Employee TO public
/
GRANT ALL ON Patient TO public
/
GRANT ALL ON Charge TO public
/
exit;

Appendix B: Annotated Bibliography

1keydata.com (n.d.). “Conceptual, logical, and physical data models.” http://www.1keydata.com/datawarehousing/data-modeling-levels.html Retrieved Sept. 2009.
A good visual representation of the conceptual, logical, and physical database architecture.

About.com (n.d.). “Entity-Relation diagram.” http://databases.about.com/cs/specificproducts/g/er.htm Retrieved August 2009.
This is a short but concise description of the ERD.

Hoffer J., Prescott M., Topi H. (2009). “Modern Database Management.”
The course textbook is the most useful reference for initial definitions of topics.
The textbook offers a wide range of definitions and descriptions covering data in all aspects of its use. When researching a topic, first go to the textbook for initial definitions.

Mitchell B. (n.d.). “Client/Server Model.” http://compnetworking.about.com/od/basicnetworkingfaqs/a/client-server.htm Retrieved August 2009.
A good description of the client/server architecture. It explains the devices, applications, and networks that utilize client/server architecture, including a comparison of client/server to P2P.

Naef W. (2003). “Business Continuity Planning- a safety net for businesses.” http://www.iwar.org.uk/infocon/business-continuity-planning.htm Retrieved August 2009.
Naef provides a clear description of a BCP in general terms.

Rosetti L. (2009). “What is Business Intelligence.” http://searchdatamanagement.techtarget.com/sDefinition/0,,sid91_gci213571,00.html Retrieved Sept. 2009.
An excellent initial description of Business Intelligence. The web page also has many links to more information.

Spinks D. (2003). “Business Continuity Planning Interview with David Spinks, EDS.” http://www.iwar.org.uk/infocon/bcp-spinks.htm Retrieved September 2009.
David Spinks relates his version of an effective BCP in this interview.

TechTarget.com (2005). “What is data scrubbing?” http://searchdatamanagement.techtarget.com/sDefinition/0,,sid91_gci880972,00.html Retrieved Sept. 2009.
A look at data scrubbing, the cleansing of data to make it fit for use.

TheFreeDictionary (n.d.). “Client/Server Architecture.” http://encyclopedia2.thefreedictionary.com/client-server+architecture Retrieved September 2009.
A detailed, intricate, but concise description of the client/server architecture, together with a clear example used in the assignment.

Wikipedia (n.d.). “Conceptual Model.” http://en.wikipedia.org/wiki/Conceptual_model_%28computer_science%29 Retrieved Sept. 2009.

Appendix C: Queries, VIEWS, and GRANT privileges

[Screenshot: SELECT statement on the new database]
[Screenshot: SELECT statement on the new VIEW Patientinfo]
[Screenshot: SELECT statement on another user's table]
[Screenshot: Denial of SELECT statement on another user's VIEW]
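
The four situations captioned above can be sketched in Oracle SQL roughly as follows. This is a minimal illustration under stated assumptions, not a transcript of the session in the screenshots: the view name Patientinfo and the table names come from this document, but the view's column list and the schema names mvch_owner and other_user are hypothetical and chosen only for the example.

```sql
-- Assumed setup: the Appendix A tables exist in a schema owned by the
-- hypothetical user mvch_owner; other_user is a second, ordinary account.

-- 1. SELECT statement on the new database (run as mvch_owner).
SELECT PatientID, FirstName, LastName, WardNumber
FROM   Patient;

-- 2. Create the Patientinfo view and SELECT from it (run as mvch_owner).
--    The column list is an assumption: patient names with their ward specialty.
CREATE OR REPLACE VIEW Patientinfo AS
SELECT p.PatientID, p.FirstName, p.LastName, w.WardSpecialty
FROM   Patient p
JOIN   Ward w ON w.WardNumber = p.WardNumber;

SELECT * FROM Patientinfo;

-- 3. SELECT on another user's table (run as other_user).
--    This is allowed because Appendix A grants ALL on each table to public.
SELECT * FROM mvch_owner.Hospital;

-- 4. SELECT on another user's view (run as other_user).
--    No privilege on Patientinfo has been granted, so Oracle rejects the
--    query, typically with ORA-00942 (table or view does not exist).
SELECT * FROM mvch_owner.Patientinfo;

-- To open the view to one user, mvch_owner would run:
GRANT SELECT ON Patientinfo TO other_user;
```

One design point worth noting: because Appendix A grants ALL on every table to public, a view restricts what other users can see only after those public grants are withdrawn, for example with REVOKE ALL ON Patient FROM public.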