Chapter 5 Database Processing - Case & Exercise
Jason C. H. Chen, Ph.D.
Professor of MIS
School of Business
Gonzaga University
Spokane, WA 99258 USA
chen@jepson.gonzaga.edu
Dr. Chen, Management Information Systems

In-class exercise UYK (p.171)

1. Draw an entity-relationship diagram that shows the relationships among a database, database applications, and users.
[E-R diagram. Entities: User (e.g., GU students, faculty, staff, etc.); Database Application (e.g., ZagWeb, Blackboard, Bookstore, Library, etc.); Database (e.g., GU database). Relationship labels: mandatory many; mandatory one.]

2. Consider the relationship between Adviser and Student in Figure 5-20. Explain what it means if the maximum cardinality of this relationship is (A:S – Adviser:Student)
• a. N:1 – An adviser is assigned one student; a student is assigned many advisers.
• b. 1:1 – An adviser is assigned one student; a student is assigned one adviser.
• c. 5:1 – An adviser is assigned one student; a student is assigned no more than five advisers.
• d. 1:5 – An adviser is assigned no more than five students; a student is assigned one adviser.

3. Identify two entities in the data entry form in Figure 5-27. What attributes are shown for each? What do you think are the identifiers?
[Fig 5-27 Sample Data Entry Form]
• Entities (or Tables/Files):
– Employee; Class
• Employee attributes:
– Employee Number, First Name, Last Name, Email
• Class attributes:
– Course Name, Course Date, Instructor, Remarks
• Employee identifier (key):
– Employee Number
• Class identifier (key):
– Course Name & Course Date
– Why two fields? And what is it called?

4.
Using your answer to question 3, draw an E-R diagram for the data entry form in Figure 5-27. Specify cardinalities. State your assumptions.
Employees take zero or more classes; a class is taken by one or more employees.
Assumptions:
1. Courses may be offered many times but always on different dates.
2. Employees may not have taken any classes.
3. Classes have at least one employee.

5. The partial E-R diagram in Figure 5-28 (next page) is for a sales order. Assume there is only one Salesperson per SalesOrder.

a. Specify the maximum cardinalities for each relationship. State your assumptions, if necessary.
• A Salesperson writes many Sales Orders; a Sales Order is written by one Salesperson. (Assumes Salespeople work alone and not in teams)
• A Customer places many Sales Orders; a Sales Order is placed by one Customer.
• A Sales Order contains many Line Items; a Line Item is contained in one Sales Order.
• A Line Item contains one Item; an Item is contained in one Line Item.
[Diagram annotations: [M] [M] [M]]

b. Specify the minimum cardinalities for each relationship. State your assumptions, if necessary.
• A Salesperson may have zero Sales Orders; a Sales Order is written by one Salesperson. (Assumes Salespeople work alone and not in teams; assumes a Sales Order is not required for a Salesperson to exist in the system)
• A Customer places at least one Sales Order; a Sales Order is placed by one and only one Customer. (Assumes at least one Sales Order is required for a Customer to exist in the system)
• A Sales Order contains at least one Line Item; a Line Item is contained in one and only one Sales Order.
• A Line Item contains one and only one Item; an Item is contained in one and only one Line Item.
[Diagram annotations: [0] [>=1] [>=1] [1]]

Case Study 5: Fail Away with Dynamo, Bigtable, and Cassandra (1 – 5, p.174)
• Current relational DBMS products not designed for large, multi-server systems
• NoSQL databases – Dynamo, Bigtable, Cassandra
• Amazon: Dynamo
• Google: Bigtable processes petabytes of data on hundreds of thousands of servers
• Elastic
• Cassandra used by Facebook, Twitter, Digg, Reddit

1. Clearly, Dynamo, Bigtable, and Cassandra are critical technology to the companies that create them. Why did they allow their employees to publish academic papers about them? Why did they not keep them as proprietary secrets?
• The companies that originally developed Dynamo, Bigtable, and Cassandra did so to solve a real business problem they encountered as they went about doing their primary business. They were not in the business of developing, marketing, selling and supporting a new method of storing data.
• Since the data store they developed was not a focus of their business strategy, but was a means to accomplish their business strategy, they did not feel that it was worth it to try to keep the data store a proprietary secret. They were perhaps also aware that others were working on similar solutions and so did not feel that any competitive advantage they gained from their solutions would be sustainable over time.

2. What do you think this movement means to the existing DBMS vendors? How serious is the NoSQL threat? Justify your answer. What responses by existing DBMS vendors are sensible?
• The companies that developed the NoSQL solutions were dealing with a specific, unique processing problem – processing massive amounts of data on thousands of servers. Existing DBMS products were not designed to deal with this particular issue effectively. Existing DBMS vendors should not ignore this issue, but also should not feel that they are doomed.
• The particular processing requirements that Google, Amazon, Facebook, etc. faced are not necessarily going to be faced by every organization. There will still be a need for a traditional DBMS for many organizations. The DBMS vendors will want to evaluate how best to offer this type of data store for their customers who need it – perhaps by offering to support the transition to the open source product for those customers who require its capabilities.

3. Is it a waste of your time to learn about the relational model and Microsoft Access? Why or why not?
• Learning about the relational model and how to use Access gives a student a good foundation in data management concepts.
• As stated earlier, not every organization will require the capabilities provided by the NoSQL data store. People using databases for personal productivity purposes will certainly not be moving to the NoSQL data store.
• So, learning about relational databases and Access is still a good investment of time and effort for students today.

4. Given what you know about GearUp, should it use a relational DBMS, such as Oracle Database or MySQL, or should it use Cassandra?
• GearUp most likely can utilize a relational DBMS effectively and will not require a data management approach like Cassandra. GearUp probably does not have the volume of transactions or servers to justify a NoSQL approach.

5. Suppose that GearUp decides to use a NoSQL solution, but a battle emerges among the employees in the IT Department. One faction wants to use Cassandra, but another faction wants to use a different NoSQL data store, named MongoDB (www.mongodb.org). Assume that you’re Kelly, and Lucas asks for your opinion about how he should proceed. How do you respond?
• Kelly should tell Lucas that the decision to use a specific NoSQL solution should be based on a careful analytical evaluation of GearUp’s requirements. There is no reason for this to become a factionalized debate.
• Determine exactly what GearUp’s needs are and then determine, analytically, how well Cassandra and MongoDB satisfy those requirements.
• Once each NoSQL product’s capabilities have been objectively researched and matched to the company’s real requirements, the best fit should become apparent. Requirements should include technical feasibility, economic feasibility, and organizational feasibility issues in order to be complete.
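
One way to make the evaluation in the answer to question 5 concrete is a weighted scoring matrix over the three feasibility areas it names. The sketch below is only an illustration: the weights, the 1-5 scores, and the candidate names "Store A" and "Store B" are hypothetical placeholders, not findings about Cassandra or MongoDB. A real evaluation would replace them with requirements and ratings the team has actually researched.

```python
# Hypothetical weights for the three feasibility areas (must sum to 1.0).
# These are placeholders, not GearUp's actual priorities.
WEIGHTS = {"technical": 0.5, "economic": 0.3, "organizational": 0.2}

# Placeholder 1-5 ratings for two unnamed candidates. Illustrative only;
# they are not measurements of any real product.
SCORES = {
    "Store A": {"technical": 4, "economic": 3, "organizational": 2},
    "Store B": {"technical": 3, "economic": 4, "organizational": 4},
}


def weighted_score(scores, weights):
    """Return the weighted sum of one candidate's ratings."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_candidates(candidates, weights):
    """Return (name, total) pairs sorted from best to worst total."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    totals = {
        name: weighted_score(scores, weights)
        for name, scores in candidates.items()
    }
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, total in rank_candidates(SCORES, WEIGHTS):
        print(f"{name}: {total:.2f}")
```

Agreeing on the weights before anyone rates the candidates keeps the comparison tied to GearUp's requirements and away from factional preference.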