Database Design and Development

“Activity 1
Using the information you have been provided with and some additional external research, plan and produce a design for a complex relational database based on a response to the client brief. Normalize the data to 3rd normal form. Produce design documentation in response to the client brief ensuring this shows the design of the tables, indexes and constraints, SQL queries, forms, macro and security measures”

After researching the required elements and how they interact, a complex normalized relational diagram was produced that brings everything needed to develop the database into a single diagram. It shows all entities and attributes, their relationships, the normalization and the primary keys. These elements are defined in the data dictionary, which was also prepared in detail and included in the documentation. The diagram on the following page, together with the data dictionary and the data flow designs, was completely redesigned after the testing process revealed inconsistencies:

ERD

The data dictionary for the Mahrud Bee Trades case follows. It lists the attributes, data types and keys needed to build the complex relational database:

Table      Attribute        Datatype(length)   Others
Customer   Customer_ID      Int(10)            Primary key
           Customer__Name   Varchar(30)
           Contact          Varchar(12)
           Address          Text
           Email            Varchar(30)
Product    Product_ID       Int(10)            Primary key
           product_Name     Varchar(30)
           Description      Text
           image            Blob
Order      Order_ID         Int(10)            Primary key
           Product_ID       Int(10)            Foreign key
           Quantity         Int(10)
           Unit_Price       Double(8,2)
           Order_date       (not given)
           Customer_ID      Int(10)            Foreign key
Bees       Bee_ID           Int(10)            Primary key
           Bee_Name         Varchar(30)
           honey_produced   Varchar(30)
           pollen_produced  Varchar(30)
           jelly_produced   Varchar(30)
Queen      Queen_ID         Int(10)            Primary key
           Queen_Name       Varchar(30)
Staff      Staff_ID         (not given)        Primary key
           Staff_Name       Varchar(30)
           address          Text
           Email            Varchar(30)
           Contact          Varchar(12)
Payment    Payment_ID       Int(10)            Primary key
           Payment_date     Date
           Order_ID         Int(10)            Foreign key
Supplier   Supplier_ID      Int(10)            Primary key
           Supplier_Name    Varchar(30)
           Address          Text
           Email            Varchar(30)
           contact          Varchar(30)

Next the Customer Order Data Flow Diagram

And also the Harvest Process Data Flow Diagram

Database Design

Normalization to 3NF

First Normal Form
For a table to be in 1NF it should satisfy the following:
i. Each column holds single atomic values
ii. Values in the same column are of the same domain
iii. All table columns have unique names
iv. The order in which data is stored does not matter

The fields in the previous design were:
“Order_ID, Date, Customer_ID, Customer_Name, Product_ID, product_Name, description, image, Quantity, Unit_Price, Staff_ID, Staff_Name, Bee_ID, Bee_Name, Queen_ID, Payment_ID, payment_Date

The first normal form of the above attributes is:
Order_ID, Product_ID, product_Name, description, image, Quantity, Unit_Price

The remaining attributes are removed from the table. They are:
Order_ID, Date, Customer_ID, Customer_Name, Staff_ID, Staff_Name, Bee_ID, Bee_Name, Queen_ID, Payment_ID

Second Normal Form
The second normal form requires the table to be in 1NF and to have no partial dependencies. Subsets of data that apply to multiple rows are moved into their own table. The new table is:
Product_ID, product_Name, description, image

This leaves the first table as:
Order_ID, Product_ID, Quantity, Unit_Price

This is now in second normal form. The rest of the attributes are left as:
Order_ID, Date, Customer_ID, Customer_Name, Staff_ID, Staff_Name, Bee_ID, Bee_Name, Queen_ID, Payment_ID, honey_produced, pollen_produced, jelly_produced

Third Normal Form
The third normal form requires the table to be in 2NF and removes every column that does not depend directly on the primary key, so that no transitive dependencies remain.
The resulting attribute groups are:
Product_ID, product_Name, description, image
Order_ID, Product_ID, Quantity, Unit_Price, Order_date
Bee_ID, Bee_Name, honey_produced, pollen_produced, jelly_produced
Queen_ID, Queen_Name, farm_sold_to
Customer_ID, Customer__Name, Contact, Address, email
Staff_ID, Staff_Name, address, email, contact
Payment_ID, payment_Date
Supplier_ID, Supplier_Name, address, email, contact

The above attributes can be arranged into tables as:
Product: Product_ID, product_Name, description, image
Order: Order_ID, Product_ID, Quantity, Unit_Price, Order_date
Bees: Bee_ID, Bee_Name, honey_produced, pollen_produced, jelly_produced
Queen: Queen_ID, Queen_Name
Customer: Customer_ID, Customer__Name, Contact, Address, email
Staff: Staff_ID, Staff_Name, address, email, contact
Payment: Payment_ID, payment_Date
Supplier: Supplier_ID, Supplier_Name, address, email, contact”

Plan for forms, queries, reports and security features

Forms for Mahrud Bee Trading will be designed with accessibility and functionality as the most important criteria. Each form will include only the essential fields, because too many fields make it hard for the customer to see where the most valuable information should be entered and retrieved. Buttons must make their purpose clear and be easy for the customer to find on the form. The queries and tables behind each form will be verified, for example by checking the form's record source and the row source of every list and combo box. The form design will also be checked so that everything on screen is accurate and correctly spelled. Duplicate hotkeys will be avoided, the tab order will be logical, and the forms should be intuitive as well as minimalist and simple.
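To make the planned forms and queries more concrete, part of the normalized design listed earlier in this section can be sketched in code. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MS Access, renames the Order table to Orders because ORDER is a reserved word in SQL, takes the Customer_ID foreign key in Orders from the data dictionary, and invents the CHECK rules and every row of sample data.

```python
import sqlite3

# Illustrative sketch only: SQLite (in memory) stands in for MS Access,
# and every row of sample data below is invented.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Customer (
    Customer_ID   INTEGER PRIMARY KEY,
    Customer_Name VARCHAR(30) NOT NULL,
    Contact       VARCHAR(12),
    Address       TEXT,
    Email         VARCHAR(30)
);
CREATE TABLE Product (
    Product_ID   INTEGER PRIMARY KEY,
    Product_Name VARCHAR(30) NOT NULL,
    Description  TEXT,
    Image        BLOB
);
-- "Order" is a reserved word in SQL, so the table is called Orders here.
-- The CHECK rules are assumptions added for illustration.
CREATE TABLE Orders (
    Order_ID    INTEGER PRIMARY KEY,
    Product_ID  INTEGER NOT NULL REFERENCES Product(Product_ID),
    Customer_ID INTEGER NOT NULL REFERENCES Customer(Customer_ID),
    Quantity    INTEGER NOT NULL CHECK (Quantity > 0),
    Unit_Price  REAL    NOT NULL CHECK (Unit_Price >= 0),
    Order_date  TEXT    NOT NULL
);
CREATE TABLE Payment (
    Payment_ID   INTEGER PRIMARY KEY,
    Payment_date TEXT NOT NULL,
    Order_ID     INTEGER NOT NULL REFERENCES Orders(Order_ID)
);
""")

# Invented sample rows: one customer, one product, one order, one payment.
conn.execute("INSERT INTO Customer VALUES (1, 'A. Example', '0123456789', '1 Hive Lane', 'a@example.com')")
conn.execute("INSERT INTO Product VALUES (1, 'Honey jar', 'Sample product', NULL)")
conn.execute("INSERT INTO Orders VALUES (1, 1, 1, 3, 4.50, '2019-05-01')")
conn.execute("INSERT INTO Payment VALUES (1, '2019-05-02', 1)")


def order_summary(connection):
    """Return one row per order: customer name, product name, line total."""
    return connection.execute("""
        SELECT c.Customer_Name, p.Product_Name, o.Quantity * o.Unit_Price AS Total
        FROM Orders AS o
        JOIN Customer AS c ON c.Customer_ID = o.Customer_ID
        JOIN Product  AS p ON p.Product_ID  = o.Product_ID
        ORDER BY o.Order_ID
    """).fetchall()


print(order_summary(conn))  # [('A. Example', 'Honey jar', 13.5)]
```

Because each fact is stored once, the customer and product names are joined in when the summary is needed instead of being repeated on every order row, which is the point of the third normal form design.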
Complex SQL syntax on a hosted server running Oracle or other SQL software will not be used for the queries in this assignment. MS Access is more convenient, not least because the database is easier to send for assessment: the tables are represented visually and the links between them are graphical, which makes the project more effective and avoids SQL coding for the entire project. The Access interface nevertheless supports powerful and appropriate analysis. Its query engine is modelled on SQL, and a query can be switched between the graphical query design view and its SQL syntax.

Several tasks will be taken into account for the security of the Mahrud Bee database when it is delivered. The deployment must be carried out accurately, because a faulty deployment is one of the biggest causes of security issues. So-called loopholes in the database will be avoided, because they give viruses the chance to destroy the database. The data is planned to be encrypted to avoid security glitches. The next step in the security design is to assign different permissions to different Mahrud Bee staff, rather than creating one common user with all permissions and functions for the entire organization. This gives a clear and concise plan of security measures.

“Activity 2 – Create the database. This could be supported with screen shots to explain the features used during the implementation process”

A screen-capture video of around five minutes was created. The first part shows the forms, queries and tables; the second part shows how information is gathered through a macro and gives a representation of the related tables. Here is the link: https://www.youtube.com/watch?v=uEux4DepTI4&feature=youtu.be

Some screenshots are added as well:

Examples of Forms:

Tables:

SQL script for creating the tables, as a reference:
“
create table cust_bill_detail (
    sales_id    number primary key,
    sales_date  date   not null,
    customer_id number not null,  -- number, to match customer_details.customer_id
    total_amt   number not null,
    tax         number,
    taxvalue    number,
    dis         number,
    disvalue    number
);

create table customer_details (
    customer_id   number      primary key,
    customer_name varchar(20) not null,
    email         varchar(20) not null,
    address       varchar(50) not null,
    city          varchar(20) not null,
    state         varchar(20) not null,
    pincode       number      not null
);

create table customer_payment_details (
    pay_id   number      primary key,
    sales_id number      not null,
    pay_type varchar(20) not null,  -- length assumed; the original gave none
    pay_info varchar(50) not null,  -- length assumed; the original gave none
    pay_date date        not null,
    amount   number      not null
);

create table product_details (
    product_id   number      primary key,
    product_name varchar(20) not null,
    product_desc varchar(20),
    product_img  varchar(50)
);

create table purchase_order_details (
    purchase_id number primary key,
    product_id  number not null,
    qty         number not null,
    amount      number not null,
    total_amt   number not null
);

create table purchase_order_return_details (
    return_id     number      primary key,
    purchase_id   number      not null,
    pur_order_ret number      not null,
    qty           number      not null,
    reason        varchar(50) not null,  -- length assumed; the original gave none
    product_id    number      not null,
    amount        number
);

create table sales_order_details (
    sales_id     number primary key,
    product_id   number not null,
    qty          number not null,
    amount       number not null,
    total_amount number not null
);

create table sales_order_return_details (
    return_id   number primary key,
    sales_id    number not null,
    return_date date   not null,
    qty         number not null,
    reason      number not null,
    product_id  number not null,
    amount      number not null
);

create table stock_details (
    product_id  number primary key,
    product_qty number not null
);

-- Lookup tables for product types; varchar lengths assumed.
create table honey (
    honid number,
    typ   varchar(20)
);

create table pollen (
    polId number,
    typ   varchar(20)
);

create table royalJelly (
    royalJId number,
    typ      varchar(20)
);
”

“Activity 3
Now that the database has been implemented it needs to be tested.
Explain the different testing methods that could be used to test the database and produce a test plan which identifies the tests to be carried out. This should include testing of extreme data and error handling. Use the test plan and carry out the tests. Produce screen shot evidence to show the tests that have been carried out and explain any errors that need to be resolved. Amend the database based on the testing”

Database testing has many approaches and methodologies, which fall into three main groups:

-Structural database tests: these test the elements of the database, such as tables, columns, stored procedures, schema and views, by verifying their state and validating that all elements, their relationships and their rules are correct.
-Functional testing: tests of the functionality offered to users. The most common are black box and white box tests. Black box tests verify the integration of the database to assess its functionality. White box tests examine the internal structure of the database and the specifications that are hidden from users.
-Non-functional testing: load, risk and stress tests, all oriented towards the performance of the database.

For our database, however, specific and varied tests will be used to identify data errors and error-handling problems, because the Access database has forms, queries and tables that are used for adding, modifying and viewing data. The Performance Analyzer tool will be used to detect any flaws, mistakes or aspects of the database that could be optimized. The tool tests the structure of the database as well as the data, and reports its results as recommendations where the database needs optimizing. Its results can also help to avoid confusion caused by errors in the data.
The entire database will be tested at once.

-As seen above, there is a tab for each type of object to be tested, and the All Object Types tab can be used to test every object in the database at once. Pressing OK starts the tests. The results come in three types: recommendations, which are straightforward improvements; suggestions, which may involve trade-offs and should be reviewed before they are applied; and ideas, which are sets of instructions to be carried out manually to optimize the database.

As the example in the image above shows, clicking an item in the analysis results displays the details of the proposed optimization in the Analysis Notes area in the lower part of the box. To have Access correct the issues that affect database performance, press Select All and then Optimize; ideas still have to be applied by hand.

Testing was carried out against the following criteria.

Metadata testing checked that the table definitions conform to the data model and the database design specifications. Errors were found under “Verify that the table and column data type definitions are as per the data model design specifications”: for instance, a column defined as NUMBER in the data model had the data type VARCHAR in the database, so it had to be amended. Data length, constraints, naming standards and the check across environments were all fine.

Data quality testing checked the accuracy and quality of the data. A duplicate data check looked for duplicate rows on a single key column, and data integrity checks identified orphan records, that is, records in a child entity whose foreign key has no matching record in the parent entity.
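The metadata, duplicate and orphan checks described above can each be expressed as a short query. The following is a minimal sketch rather than the procedure actually used in Access: it assumes Python's built-in sqlite3 module as a stand-in database, borrows two table names from the reference script, and uses invented rows (with no primary keys declared) so that each check has something to find. The helper function names are hypothetical.

```python
import sqlite3

# Illustrative sketch: SQLite stands in for Access, and the rows are invented.
# No primary keys are declared, so that duplicate keys can exist for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_details (customer_id INTEGER, customer_name TEXT);
CREATE TABLE cust_bill_detail (sales_id INTEGER, customer_id INTEGER);
INSERT INTO customer_details VALUES (1, 'Ann'), (2, 'Bob'), (2, 'Bob again');
INSERT INTO cust_bill_detail VALUES (10, 1), (11, 2), (12, 7);
""")

# Table and column names are inserted into the SQL text directly, so these
# helpers must only be called with trusted identifiers, never with user input.


def duplicate_keys(connection, table, key):
    """Return (key value, count) for every key value that occurs more than once."""
    sql = f"SELECT {key}, COUNT(*) FROM {table} GROUP BY {key} HAVING COUNT(*) > 1"
    return connection.execute(sql).fetchall()


def orphan_rows(connection, child, parent, key):
    """Return child rows whose foreign key has no matching parent row."""
    sql = (f"SELECT c.* FROM {child} AS c "
           f"LEFT JOIN {parent} AS p ON p.{key} = c.{key} "
           f"WHERE p.{key} IS NULL")
    return connection.execute(sql).fetchall()


def column_types(connection, table):
    """Return {column name: declared type}, for comparison with the data model."""
    return {row[1]: row[2] for row in connection.execute(f"PRAGMA table_info({table})")}


print(duplicate_keys(conn, "customer_details", "customer_id"))                 # [(2, 2)]
print(orphan_rows(conn, "cust_bill_detail", "customer_details", "customer_id"))  # [(12, 7)]
print(column_types(conn, "cust_bill_detail"))
```

In this invented data, customer 2 appears twice and bill 12 points at a customer that does not exist. Declaring primary and foreign keys on the tables would make the database reject both kinds of row.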
The main errors identified after performing all the tests and table checks were the following.

Errors detected after tests

-The tables needed to be normalized. After this critical finding the database was redesigned from the start:
1- The ERD was completely redesigned, as can be seen by comparing the version in Activity 1 with the previous design, to correct the database structure, which was normalized from first to third normal form.
2- A clearer Data Flow Diagram was also produced as a guide for the corrections.
-Data dictionary: it had to be amended throughout to comply with the new normalized design, including errors in relationships, constraints, and primary and foreign keys.
-Inconsistencies: delete, update and insert anomalies were found, caused by the un-normalized design.

Amendments after these errors

For the sake of practice, an entire redesign was carried out, correcting the origin of the bugs and the issues reported by MS Access as seen in the previous activities. This transformed the database, especially with regard to the normalization process.

“Activity 4
Now that the database has been designed, implemented and tested reflect on the process. Produce a report which:
• critically evaluates the design process and documentation that was produced for Mahrud Bees database identifying improvements with the design documentation where appropriate
• reviews the database implementation process and explain what was successful and what you would do differently if you were to do this again
• evaluates the role of databases within Mahrud Bees. Compare this with how larger organizations use databases
• explains how Mahrud Bees can connect the database to other applications that it may use within the business
• recommends to Mahrud Bees a database management system they can use to maintain the database you have created”

“4.A Critically evaluate the design process and documentation that was produced for Mahrud Bees database identifying improvements with the design documentation where appropriate”

In general the design documents were produced with attention to detail, and they were satisfactory after the ERD and the normalization were revised and part of the data dictionary was edited. A data flow diagram was also built; when drawn in full detail it is a useful tool for representing the main structure and the elements that take part in the development of the database. The design was normalized to avoid inconsistencies. The data dictionary was likewise prepared with full attention to detail and included all definitions, constraints, keys and relationships, so the development work in the implementation stage was clearly defined. The documentation struck a balance between too little information and an excess that creates confusion and reduces efficiency. Documentation should also be an ongoing process, and it was updated several times as changes were needed, which is an advantage. The documentation clearly helped the later stages by giving a clear picture, which made them much easier to manage.

A future improvement would be a more collaborative approach to producing the documentation. The owner of Mahrud Bee and other members should have been more involved. The lesson for future projects is that the right documentation needs a collaborative effort from all team members and stakeholders, who comment, ask questions and encourage one another to exchange information and ideas.
Each team member should make the best contribution they can to the documents produced. It is also relevant that this was one of the first times the entire documentation process for a database had been carried out, so the steps were followed more slowly than they will be in future, since experience brings not only knowledge but, through repetition, faster results.

“4.B. Review the database implementation process and explain what was successful and what you would do differently if you were to do this again”

The software implementation was an important phase. After producing very detailed design documentation with diagrams, a data dictionary and rules, the task was to find the best realistic way to develop the database so that it matched the criteria agreed with Mahrud Bees and could be delivered and used in an environment the company could understand and manage.

In the real scenario, the developer's experience is good but not at expert level, and the database had to be sent for assessment. MySQL and other advanced database applications were explored, but they would only have added complexity to the project in terms of time, resources, budget and finding a host. Some scripts were also written in SQL while learning Oracle. It was therefore decided to use more familiar and time-efficient software that also works with SQL scripts: MS Access was used to turn the documentation into a real relational database with good capabilities, compatible with different systems thanks to the popularity of the software.
The testing process required research into the testing techniques specific to databases, while taking advantage of the efficient testing tools already included in Access. These test the efficiency of the whole database across all its objects, such as forms, queries and tables, and gave good results, as the database showed no major issues. A table tool was also used to test that the structure of the database was optimal.

Meeting the client brief

The database met the client brief by identifying the entities and attributes needed to keep efficient records of the business in areas such as types of products, suppliers, customers and transactions, and by establishing logical dependencies between the elements of the database. It includes an easy-to-use interface and functionality, so the client can identify the resources and opportunities to grow the business by keeping track of records in a system quickly and easily. The timing was also appropriate, although a normalization issue detected during testing made it necessary to redesign all the diagrams and the structure of the database in order to deliver an optimized solution to Mahrud Bee Trading.

Future improvements if the database had to be done again

If the process had to be done again, it would be worth studying the flow of information in depth at the outset, to avoid repeated early changes to the relational diagram and database design; the flow of data must always be clear. The data flow needs to be known down to the last detail so that nothing impedes its management. Normalization to 3NF should also always be carried out from the initial diagrams. This belongs to a first, preliminary phase, but even at that point it is essential to be extremely clear about the system that will be developed.
Then, instead of working on different stages at once, a stricter order would be followed: once the data flow is understood in depth, all attention should go to ordering the data to avoid further mistakes, concentrating fully on the information and what the process will need. Even after the earlier amendments many things could still go wrong if the database were built again, so avoiding duplication of information and content would be essential. It would also be important to review the relationships, to increase security through small measures such as reviewing null fields, and to make improvements that take advantage of the new design. Data should be filtered quickly by following a procedure with three aspects to highlight for next time: “where”, the tables in which the search is to be performed; “what”, exactly which data should be returned; and “how”, the appearance the results should have. This may seem easy to set out, but the experience of building a database again and again makes the work systematic and helps to focus on the main process. A better understanding of queries and scripts is also recommended, gained by training in the SQL language and the tools available for managing SQL databases. Finally, the project should include scalability, so that more objects and functions can be adapted and integrated into the database in the future.

“4.C Evaluate the role of databases within Mahrud Bees. Compare this with how larger organizations use databases”

Mahrud Bees will clearly benefit from the knowledge gained, and from the steps it can then take in the business, when tracking sales of the different products, customers' data, orders and other recorded details.
At this level of business it is imperative to know where the business is heading and which improvements can be made. A systematic record of the organization's operations stored in a database makes this possible. It is necessary for growth prospects and for assigning the right resources to the interconnected parts of the business, so that the information can be used to optimize the business model.

Nowadays big companies all over the world use what is called big data, which is intelligent information extracted from very large databases.

Datasets
Large companies store important data sets, or buy them from data set providers offering vast collections in fields such as finance, the economy, the health industry and marketing, among many others. Supported by the advanced computing systems now available, these data sets are a primary resource for integrated research systems, including big data. The main uses and reasons are:
• More certainty in decision making: With information available in real time, the company can analyze any precedent it has in any process and make a decision based on its experience, which gives a clearer picture when deciding.
• Find new business channels: By collecting information on customer interactions, the company can measure not only how well current products or services are performing but also how they can be converted into, or complement, the solution to a particular problem.
• Cost reduction: Storing large amounts of data can be both confusing and expensive. The service can be outsourced to companies dedicated to the sector, or software that performs this function can be contracted.

Analytics
Large companies use analytics to deliver reports from tailored research across big and complex databases, creating a significant group of data models.
Companies put emphasis on ensuring the accuracy and reliability of the available data models, which adds credibility to their reports. Reports are useful for many actions, such as segmenting by interests or analyzing behaviors. They make the work easier and faster, and keep client details clearly and neatly collected.

Advantages of databases for companies:
-Grouping and storage of all company data
-Data is easier to share among the various members of the company
-Redundancy is avoided and the organization of the agenda is improved
-Better dialogue with the clients

When a database is managed properly, the organization gains a series of advantages: efficiency increases and work is done faster and with more agility, because it is simplified. These functionalities add value to the company, for example through an advanced CRM. Databases are important for establishing appropriate high-tech Customer Relationship Management strategies in the business world, collecting information about clients in order to manage these relationships and their data. In this way the information that interests clients most is segmented and various aspects can be optimized, such as commercial communication, more personalized advertising campaigns, or a thorough record of the documents sent or received by the company.

Databases are a competitive advantage for large companies, which can contact customers in a more personalized way and provide quality. Customers are taken into account in a personalized and constant way while new business strategies are generated. Segmentation makes communication with the client more accurate. It gives the customer security and trust in the company, which can offer what the customer needs at the right time.
Databases are already essential for companies in their marketing campaigns

The amount of data and information that companies hold is a source of knowledge and power. The better a company knows its consumers and can convert this data into useful information, the more its data collection will grow and the more conversions it will achieve. Many companies have also turned to social networks, when building their community, to enlarge their database with consumer information that lets them know consumers in detail.

Data Structure
Large organizations use optimal data structures because they deal with a great volume of data that needs to be analyzed; according to Forbes (2018), “data analysts spend up to 80 percent of their time preparing data for analysis and only 20 percent of the time actually doing the analysis”. Data structures are closely tied to algorithms. For large companies, structuring their data is essential to maximize the speed and performance of searches, so that models, analyses and reports are obtained as soon as possible.

“4.D Explain how Mahrud Bees can connect the database to other applications that it may use within the business”

Mahrud Bees has a relational database, a model which is characterized by clarity, has a mathematical basis and has proven its effectiveness. There are different techniques for connecting the database to other applications through APIs (“Application Programming Interface”), which were studied in previous assignments as communication resources and which in this case allow client-side applications to access the database on the server side. The most important are JDBC and ODBC, explained below. Their main difference is that JDBC is specific to Java, while ODBC is language-independent.

ODBC, which stands for Open Database Connectivity, is a database access model created by SQL Access Group.
The main aim of this model is to allow data to be accessed from any application, regardless of which database management system holds it. ODBC achieves this by adding an intermediate layer, named the “Client Interface level”, between the DBMS and the application, which translates the application's data queries into commands that the management system understands. The software works in two ways: with driver software on the client, or following a client-server philosophy. In the first mode, the driver interprets the connections and SQL calls and translates them from the ODBC API to the DBMS. In the second mode, a DSN (data source name) is created within ODBC to connect to the database; it defines the parameters and route of the connection according to the data required by the vendor of the database.

According to JDBC and ODBC Tech Differences - Available at: https://techdifferences.com/difference-betweenjdbc-and-odbc.html “The code for ODBC is complex and is hard to learn. However, the code for JDBC is simpler and easy to run”.

ADO.NET
ActiveX Data Objects (ADO) is one of the mechanisms that computer programs use to communicate with databases, send them commands and obtain results from them. With ADO, a program can read, insert, edit or delete the information held in the storage areas of the database, called tables. The database itself can also be manipulated, among other things to create new tables for storing information and to alter or remove existing ones. ADO was developed by Microsoft and is used in Windows environments from programming languages such as Visual Basic, C++ and Delphi, as well as on the Web through Active Server Pages (ASP) and the VBScript language.
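The operations described above (creating and altering a table, then inserting, editing, reading and deleting rows) follow much the same pattern through most database APIs: open a connection, send SQL, read the results. The sketch below illustrates that pattern only. It assumes Python's built-in sqlite3 module as a stand-in, since ADO itself is a Windows component, and the supplier table and its data are invented for the example.

```python
import sqlite3

# Stand-in for an ADO or ODBC connection: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")

# Create a storage area (table), then alter it by adding a column.
conn.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT)")
conn.execute("ALTER TABLE supplier ADD COLUMN email TEXT")

# Insert: the "?" placeholders keep the values separate from the SQL text.
conn.execute(
    "INSERT INTO supplier (supplier_id, supplier_name, email) VALUES (?, ?, ?)",
    (1, "Example Hives", "old@example.com"),
)

# Edit an existing row.
conn.execute("UPDATE supplier SET email = ? WHERE supplier_id = ?", ("new@example.com", 1))

# Read the rows back.
rows = conn.execute("SELECT supplier_id, supplier_name, email FROM supplier").fetchall()
print(rows)  # [(1, 'Example Hives', 'new@example.com')]

# Delete the row and confirm that the table is empty again.
conn.execute("DELETE FROM supplier WHERE supplier_id = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM supplier").fetchone()[0]
print(remaining)  # 0

conn.close()
```

With a different interface the connection step and some details of the SQL dialect would change, but the sequence of connecting, executing statements and fetching rows would look broadly similar.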
JDBC
Java Database Connectivity (JDBC) is a derivative inspired by ODBC. It is an application programming interface that allows operations to be executed on databases from the Java programming language, regardless of the operating system it runs on or the database that is accessed, using the SQL dialect of the database model in use.

ADO.NET is a group of resources that programmers use to access data and data services. It belongs to the base class library included in the .NET Framework from Microsoft. Developers often use it to access and alter the data in a relational database management system, but it can also access non-relational sources.

Source from “Difference between ODBC and JDBC in Java. Available at: https://www.tutorialspoint.com/Difference-between-ODBC-and-JDBC-in-Java “

“4.E Recommend to Mahrud Bees a database management system they can use to maintain the database you have created”

For this first delivery the client will use Microsoft Access, to understand how the system works and to manage the business with the new concept of having a database. In the near future the Mahrud Bee data will be migrated to and integrated into a SQL server, and the database management system recommended for Mahrud Bee in the short term is MySQL, as it is free to use and has low cost requirements while being a top professional solution.

MySQL is a database manager, and currently one of the most used and recognized in the market. Especially when it comes to web development, it is classified as the most popular open source database in the world. MySQL is also used by very popular and large websites. Prominent examples include YouTube, Wikipedia, Facebook, Google, Flickr and Twitter. It is mostly used together with web servers, where it supports web applications or CMSs for online sites.
It is closely linked to PHP in this type of development. MySQL offers fast reads, especially when storage engines such as MyISAM or InnoDB are used. Whatever the environment and purpose for which MySQL is used, performance needs to be monitored in order to correct errors, both in programming and in SQL. Although MySQL is specially optimized for GNU/Linux operating systems, it is available for almost 100% of the systems currently in use, with hardly any difference in performance between distributions.

Some features of MySQL are:
-Allows a choice of multiple storage engines for each table.
-Grouping of transactions, collecting them from several connections in order to increase the number of transactions per second.
-Secure connectivity.
-Execution of transactions and use of foreign keys.
-It offers a broad subset of the SQL language.
-Replication.
-Available on almost all platforms or systems.
-Search and indexing of text fields.
-Uses several tools for portability.
-Hash tables in temporary memory.
-It offers a system of passwords and secure privileges, with host-based verification and encrypted password traffic when connecting to a server.
-Multithreading using kernel threads.
-It supports a large amount of data, even more than 50 million records.
-In the latest versions, up to 64 indexes per table are allowed. Each index can consist of 1 to 16 columns or parts of columns. The maximum index width is 1000 bytes.

References
D. R. Howe. 2001. Data Analysis for Database Design. Paperback.
O’Reilly | Safari. 2019. 1. Creating Your First Database - Access 2013: The Missing Manual [Book]. Available at: https://www.oreilly.com/library/view/access-2013the/9781449359447/ch01.html
Database Tutorial. 2019. Database Tutorial. [ONLINE] Available at: https://www.quackit.com/database/tutorial
DocumentDB Tutorial. 2019. DocumentDB Tutorial. [ONLINE] Available at: https://www.tutorialspoint.com/documentdb/index.htm
Microsoft Access Tutorial. 2019. Microsoft Access Tutorial. [ONLINE] Available at: https://www.quackit.com/microsoft_access/tutorial/
MySQL Tutorial. 2019. MySQL Tutorial. [ONLINE] Available at: https://www.tutorialspoint.com/mysql/index.htm
Intellipaat Blog. 2019. Big Data Tutorial for Beginners. [ONLINE] Available at: https://intellipaat.com/blog/big-data-tutorial-for-beginners/
Difference between ODBC and JDBC in Java. [ONLINE] Available at: https://www.tutorialspoint.com/Difference-between-ODBC-and-JDBC-in-Java