Government of the Russian Federation

Federal State Autonomous Educational Institution of Higher Professional Education
"National Research University 'Higher School of Economics'"

Faculty of Business Informatics

Discipline program "Data Bases"
for direction 38.04.05 "Business Informatics", Master training

Program's author: Nikolay V. Markov, nikolay.markoff@gmail.com

Approved at the meeting of the Department of information and business in the sphere of information technologies
Head of Department, Svetlana V. Maltseva
«____»____________ 2014 _____________________

Recommended by the EMS section of «Business Informatics»
«____»____________ 2014
Chairman, Y. V. Taratukhina ____________________

Moscow, 2014

This program cannot be used by other parts of the university or by other institutions of higher education without the permission of the department that developed the program.

1. Scope and normative references

This program of an academic discipline establishes minimum requirements for the knowledge and skills of the student and determines the content and types of studies and reports.

The program is designed for teachers leading this discipline, teaching assistants, and students of direction 38.04.05 "Business Informatics" (Master training) enrolled in the master's program "Big Data Systems".

The program is developed in accordance with the working curriculum of the University for direction 38.04.05 "Business Informatics", Master training, master's program «Big Data Systems», approved in 2014.

2.
Goals for studying

- Studying the models of data structures;
- Understanding the classification of databases according to the data models they implement and the ways they are used;
- Exploring how data is stored at the physical level, and the types and ways of file system organization;
- A detailed study of the relational data model, the databases implementing this model, and the query language SQL;
- Understanding the problems of collective access to data and the main ways of solving them;
- Exploring the capabilities of DBMSs that support various models of data organization, the advantages and disadvantages of DBMS implementations of various data structures, and the definitions of a DBMS;
- Understanding the lifecycle of database support and maintenance;
- Gaining an overview of the specialized hardware and software for building large-volume databases used in the economy.

3. Student competences, generated as a result of studying

As a result of studying the discipline, the student should:

- know the basic models of data structures (lists, hierarchies, relationships, network structures);
- have an understanding of database classification (by supported data model, by type of information stored, by way of organizing access, by system architecture);
- have an understanding of the physical layer of data storage and know the ways of file system organization;
- have an understanding of the basic concepts of the relational data model;
- know the main statements of the query language SQL;
- have an understanding of the problems of shared access to data;
- know the basic concepts and principles of transaction processing (OLTP);
- have an understanding of non-relational databases and the problems that can be solved with their help;
- understand the main stages of the database life cycle, support and maintenance, and know the methodology of data backup.

As a result of mastering the discipline, the student acquires the following competences:

Competence: Ability to
evaluate and to process the mastered scientific methods and ways of working.
  GEF/NRU code: СК-1
  Descriptors, the main features of the development (indicators of achievement results): Owns and uses
  Forms and methods of teaching, contributing to the formation and development of competence: Lectures, workshops, homework

Competence: Ability to apply the methods of system analysis and modeling to evaluate and design.
  GEF/NRU code: ПК-13
  Descriptors: Owns and uses
  Forms and methods of teaching: Lectures, workshops, homework

Competence: Ability to develop and apply mathematical models to justify the design decisions in the field of ICT.
  GEF/NRU code: ПК-14
  Descriptors: Owns and uses
  Forms and methods of teaching: Lectures, workshops, homework

Competence: Ability to organize independent and collective research work at the enterprise and manage it.
  GEF/NRU code: ПК-16
  Descriptors: Demonstrates
  Forms and methods of teaching: Lectures, workshops, homework

4. Place of the discipline in the structure of the educational program

As part of the master's program «Big Data Systems», this discipline is a compulsory subject. In addition, the course is adaptive, so the student does not need a wide range of prior knowledge in IT and mathematics.

To master the discipline, students should:

- know the content of the discipline "Computer Science", which covers the basics of algorithms and develops confident computer skills;
- be able to use mathematical and IT tools for management tasks.

The main provisions of the discipline should be used in the further study of other disciplines, including "Advanced methods of data analysis and big data in business intelligence".

5.
Topical plan of an academic discipline

№ | Topic name | Total hours | Classroom hours: lectures | Classroom hours: seminars, workshops | Homework
1 | Introduction | 16 | 2 | 2 | 12
2 | The basic concepts of databases, data structures and database management systems, classification of databases | 22 | 4 | 4 | 14
3 | Basic concepts and terms of the relational model | 16 | 2 | 2 | 12
4 | SQL - the standard query language for relational databases | 22 | 4 | 4 | 14
5 | Operations of relational algebra and the corresponding SQL statements | 16 | 2 | 2 | 12
6 | Normal forms | 16 | 2 | 2 | 12
  | Total | 108 | 16 | 16 | 76

6. Forms of students knowledge control

Type of control | Form of control | 1st year | Parameters
Current (week) | Control test | 1 | Written control test (40 min); results are evaluated within 2 weeks
Final (week) | Exam | 1 | Oral exam, 20 min per student

6.1 Criteria for assessing knowledge and skills

The student should demonstrate knowledge of the sections of the discipline and the ability to present the results of homework and tests in accordance with the required competencies. All forms of control are graded on a 10-point scale.

The final grade for the discipline is composed of the grades for:

- work in practical classes - O1
- control work - O2
- exam - O3

according to the formula:

O = 0.2 * O1 + 0.4 * O2 + 0.4 * O3

7. Program content

Topic 1. Introduction

Brief description of the discipline: its goals, objectives, scope, content, the order of studying the material, and its connections with other disciplines of the curriculum and with training in the specialty. Theoretical and practical components. Forms of independent work. Characteristics of the educational literature. Control measures.

Basic literature
1. Codd, E.F. A relational model of data for large shared data banks, CACM 13, No. 6, 1970
2. Date, C.J. An introduction to database systems, Addison-Wesley Publishing Company, 1986
3. Mitea, A.C. Relational and object-oriented databases, “Lucian Blaga” University Publishing Company, 2002
4. Codd, E.F.
Relational completeness of data base sublanguages, Data Base Systems, Courant Computer Science Symposia Series, Vol. 6, Englewood Cliffs, N.J., Prentice-Hall, 1972
5. Kuhns, J.L. Answering questions by computer: A logical study, Report RM-5428-PR, Rand Corporation, Santa Monica, California, 1967
6. Codd, E.F. A data base sublanguage founded on the relational calculus, Proceedings ACM SIGFIDET Workshop on Data Description, Access and Control, 1971

Additional literature
1. Lacroix, M., Pirotte, A. Domain oriented relational languages, Proceedings 3rd International Conference on Very Large Data Bases, 1977
2. Lacroix, M., Pirotte, A. Architecture and models in data base management systems, G.M. Nijssen Publishing company, North-Holland, 1977

Topic 2. The basic concepts of databases, data structures and database management systems, classification of databases

The concept of data. The concept of the database. The concept of a database management system. The concept of a data warehouse. The main types of data structures. Linear structures. The concept of the list. Types of lists ("bus", "ring"). Ways of organizing records in lists. Problems that arise when working with lists and ways to overcome them. Hierarchy or tree. Basic concepts and definitions. Binary and n-ary trees, tree dimension. Balanced and unbalanced trees. The concept of network data organization. The "star" and "snowflake" structures, the union of stars, the fully connected network, the arbitrary graph.

Basic literature
1. Codd, E.F. A relational model of data for large shared data banks, CACM 13, No. 6, 1970
2. Date, C.J. An introduction to database systems, Addison-Wesley Publishing Company, 1986
3. Mitea, A.C. Relational and object-oriented databases, “Lucian Blaga” University Publishing Company, 2002
4. Codd, E.F. Relational completeness of data base sublanguages, Data Base Systems, Courant Computer Science Symposia Series, Vol. 6, Englewood Cliffs, N.J., Prentice-Hall, 1972
5. Kuhns, J.L.
Answering questions by computer: A logical study, Report RM-5428-PR, Rand Corporation, Santa Monica, California, 1967
6. Codd, E.F. A data base sublanguage founded on the relational calculus, Proceedings ACM SIGFIDET Workshop on Data Description, Access and Control, 1971

Additional literature
1. Lacroix, M., Pirotte, A. Domain oriented relational languages, Proceedings 3rd International Conference on Very Large Data Bases, 1977
2. Lacroix, M., Pirotte, A. Architecture and models in data base management systems, G.M. Nijssen Publishing company, North-Holland, 1977

Topic 3. Basic concepts and terms of the relational model

Basic concepts and terms of the relational model (n-ary relation, relation schema, tuple, domain, key, primary key, foreign key). Fundamental properties of relations. Relational algebra. Operations of relational algebra (union, intersection, difference, Cartesian product, projection, restriction, join, equi-join, division). Relational calculus. The history of the emergence of the relational model and relational database systems.

Basic literature
1. Codd, E.F. A relational model of data for large shared data banks, CACM 13, No. 6, 1970
2. Date, C.J. An introduction to database systems, Addison-Wesley Publishing Company, 1986
3. Mitea, A.C. Relational and object-oriented databases, “Lucian Blaga” University Publishing Company, 2002
4. Codd, E.F. Relational completeness of data base sublanguages, Data Base Systems, Courant Computer Science Symposia Series, Vol. 6, Englewood Cliffs, N.J., Prentice-Hall, 1972
5. Kuhns, J.L. Answering questions by computer: A logical study, Report RM-5428-PR, Rand Corporation, Santa Monica, California, 1967
6. Codd, E.F. A data base sublanguage founded on the relational calculus, Proceedings ACM SIGFIDET Workshop on Data Description, Access and Control, 1971

Additional literature
1. Lacroix, M., Pirotte, A.
Domain oriented relational languages, Proceedings 3rd International Conference on Very Large Data Bases, 1977
2. Lacroix, M., Pirotte, A. Architecture and models in data base management systems, G.M. Nijssen Publishing company, North-Holland, 1977

Topic 4. SQL - the standard query language for relational databases

The main statements of the SQL language: CREATE, DROP, INSERT, DELETE, SELECT, UPDATE. Creating and deleting tables. Adding data to a table. Selecting data. Deleting and modifying data. Joining tables. Complex SELECT statements. Sorting (ORDER BY). Grouping data (GROUP BY, GROUP BY ... HAVING). Built-in functions. Combining results with UNION. The existential quantifier: EXISTS and NOT EXISTS. Retrieval using IN, nested SELECT. Subqueries with multiple levels of nesting. Correlated subqueries. Views. Cursors: DECLARE CURSOR, DROP CURSOR. Indices: the SQL statements CREATE INDEX and DROP INDEX, the UNIQUE parameter. Synonyms: the statements CREATE SYNONYM and DROP SYNONYM. Aliases. Expressing relational algebra operations by means of SQL statements.

Basic literature
1. Codd, E.F. A relational model of data for large shared data banks, CACM 13, No. 6, 1970
2. Date, C.J. An introduction to database systems, Addison-Wesley Publishing Company, 1986
3. Mitea, A.C. Relational and object-oriented databases, “Lucian Blaga” University Publishing Company, 2002
4. Codd, E.F. Relational completeness of data base sublanguages, Data Base Systems, Courant Computer Science Symposia Series, Vol. 6, Englewood Cliffs, N.J., Prentice-Hall, 1972
5. Kuhns, J.L. Answering questions by computer: A logical study, Report RM-5428-PR, Rand Corporation, Santa Monica, California, 1967
6. Codd, E.F. A data base sublanguage founded on the relational calculus, Proceedings ACM SIGFIDET Workshop on Data Description, Access and Control, 1971

Additional literature
1. Lacroix, M., Pirotte, A. Domain oriented relational languages, Proceedings 3rd International Conference on Very Large Data Bases, 1977
2.
Lacroix, M., Pirotte, A. Architecture and models in data base management systems, G.M. Nijssen Publishing company, North-Holland, 1977

Topic 5. Operations of relational algebra and the corresponding SQL statements

Fundamental properties of relations. Relational algebra. Operations of relational algebra (union, intersection, difference, Cartesian product, projection, restriction, join, equi-join, division). Relational calculus.

Basic literature
1. Codd, E.F. A relational model of data for large shared data banks, CACM 13, No. 6, 1970
2. Date, C.J. An introduction to database systems, Addison-Wesley Publishing Company, 1986
3. Mitea, A.C. Relational and object-oriented databases, “Lucian Blaga” University Publishing Company, 2002
4. Codd, E.F. Relational completeness of data base sublanguages, Data Base Systems, Courant Computer Science Symposia Series, Vol. 6, Englewood Cliffs, N.J., Prentice-Hall, 1972
5. Kuhns, J.L. Answering questions by computer: A logical study, Report RM-5428-PR, Rand Corporation, Santa Monica, California, 1967
6. Codd, E.F. A data base sublanguage founded on the relational calculus, Proceedings ACM SIGFIDET Workshop on Data Description, Access and Control, 1971

Additional literature
1. Lacroix, M., Pirotte, A. Domain oriented relational languages, Proceedings 3rd International Conference on Very Large Data Bases, 1977
2. Lacroix, M., Pirotte, A. Architecture and models in data base management systems, G.M. Nijssen Publishing company, North-Holland, 1977

Topic 6. Normal forms

The notion of a normal form. The first normal form. Functional dependency and the second normal form. Full functional dependency, transitive dependency, the third normal form. The Boyce-Codd normal form. The fourth normal form. Fagin's theorem. The fifth normal form. The special properties of binary relations. The need for normalization.

Basic literature
1. Codd, E.F. A relational model of data for large shared data banks, CACM 13, No. 6, 1970
2. Date, C.J.
An introduction to database systems, Addison-Wesley Publishing Company, 1986
3. Mitea, A.C. Relational and object-oriented databases, “Lucian Blaga” University Publishing Company, 2002
4. Codd, E.F. Relational completeness of data base sublanguages, Data Base Systems, Courant Computer Science Symposia Series, Vol. 6, Englewood Cliffs, N.J., Prentice-Hall, 1972
5. Kuhns, J.L. Answering questions by computer: A logical study, Report RM-5428-PR, Rand Corporation, Santa Monica, California, 1967
6. Codd, E.F. A data base sublanguage founded on the relational calculus, Proceedings ACM SIGFIDET Workshop on Data Description, Access and Control, 1971

Additional literature
1. Lacroix, M., Pirotte, A. Domain oriented relational languages, Proceedings 3rd International Conference on Very Large Data Bases, 1977
2. Lacroix, M., Pirotte, A. Architecture and models in data base management systems, G.M. Nijssen Publishing company, North-Holland, 1977

8. Literature

Basic literature
1. Codd, E.F. A relational model of data for large shared data banks, CACM 13, No. 6, 1970
2. Date, C.J. An introduction to database systems, Addison-Wesley Publishing Company, 1986
3. Mitea, A.C. Relational and object-oriented databases, “Lucian Blaga” University Publishing Company, 2002
4. Codd, E.F. Relational completeness of data base sublanguages, Data Base Systems, Courant Computer Science Symposia Series, Vol. 6, Englewood Cliffs, N.J., Prentice-Hall, 1972
5. Kuhns, J.L. Answering questions by computer: A logical study, Report RM-5428-PR, Rand Corporation, Santa Monica, California, 1967
6. Codd, E.F. A data base sublanguage founded on the relational calculus, Proceedings ACM SIGFIDET Workshop on Data Description, Access and Control, 1971

Additional literature
1. Lacroix, M., Pirotte, A. Domain oriented relational languages, Proceedings 3rd International Conference on Very Large Data Bases, 1977
2. Lacroix, M., Pirotte, A. Architecture and models in data base management systems, G.M.
Nijssen Publishing company, North-Holland, 1977

9. Knowledge control questions

1. Basic requirements for the organization of databases.
2. Purpose and main components of a database system.
3. Stages of database design.
4. Data model. Classification of data models.
5. Data warehouse. The main components.
6. The "entity-relationship" model. The basic concepts. Scope.
7. Hierarchical data model. The basic concepts. Scope. Advantages and disadvantages.
8. The relational data model. The basic concepts. Scope. Advantages and disadvantages.
9. Operations of relational algebra.
10. The relational calculus with tuple variables.
11. The relational calculus with domain variables.
12. Functional dependencies. Axioms. The rules of inference for functional dependencies.
13. Redundant functional dependencies. Minimal cover. Decomposition of relations.
14. Normal forms of relation schemas. First normal form. Second normal form.
15. Normal forms of relation schemas. Third normal form.
16. Normal forms of relation schemas. Boyce-Codd normal form.
17. Normal forms of relation schemas. Fourth normal form.
18. Normal forms of relation schemas. Fifth normal form.
19. Structured Query Language SQL. Categories of SQL statements.
20. Structured Query Language SQL. Data definition. Tables. Data types. Data integrity.
21. Structured Query Language SQL. Data manipulation statements. The cursor.
22. Structured Query Language SQL. The types of binding.
23. Structured Query Language SQL. Multitable queries.
24. Structured Query Language SQL. Operations for modifying and updating the database.
25. Structured Query Language SQL. Indices.
26. Structured Query Language SQL. Defining user views.
27. Structured Query Language SQL. Using UNION to combine the results of SELECT statements.
28. Structured Query Language SQL. Querying.
29. Structured Query Language SQL. The use of aliases.
30. Three levels of data in automated information systems.
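As an illustration of the kind of material behind questions 20, 21, 23, 27 and 29 above, the following is a minimal sketch, not part of the approved program. It assumes Python's standard `sqlite3` module as the DBMS; the tables `student` and `enrollment`, their columns, and the sample rows are hypothetical and chosen only for the example.

```python
import sqlite3


def run_demo():
    """Build a tiny in-memory database and run two illustrative queries."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Data definition (question 20): hypothetical tables with primary and foreign keys.
    cur.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    cur.execute(
        """CREATE TABLE enrollment (
               student_id INTEGER NOT NULL REFERENCES student(id),
               course     TEXT    NOT NULL,
               PRIMARY KEY (student_id, course))"""
    )

    # Data manipulation (question 21): insert illustrative rows.
    cur.executemany(
        "INSERT INTO student (id, name) VALUES (?, ?)",
        [(1, "Anna"), (2, "Boris"), (3, "Clara")],
    )
    cur.executemany(
        "INSERT INTO enrollment (student_id, course) VALUES (?, ?)",
        [(1, "Data Bases"), (2, "Data Bases"), (2, "Statistics")],
    )

    # Multitable query (question 23) using the table aliases s and e (question 29).
    joined = cur.execute(
        """SELECT s.name AS student_name, e.course AS course_name
           FROM student AS s
           JOIN enrollment AS e ON e.student_id = s.id
           ORDER BY s.name, e.course"""
    ).fetchall()

    # UNION (question 27): combines two SELECT results and removes duplicate rows.
    combined = cur.execute(
        """SELECT name AS label FROM student
           UNION
           SELECT course FROM enrollment
           ORDER BY label"""
    ).fetchall()

    conn.close()
    return joined, combined


if __name__ == "__main__":
    for rows in run_demo():
        print(rows)
```

The join returns one row per enrollment, so a student without enrollments (here, the hypothetical "Clara") does not appear in it, while the UNION result lists every distinct name and course once.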
Developers:
NRU-HSE ________  professor ________  Nikolay V. Markov
(workplace)       (position)          (initials, surname)