Uploaded by Mengchen Su

ITC6000 Assignment 2.docx

Meet Sadalgekar
ITC 600 70917 Database Management Systems
Professor: Gerald Lancia
ITC 6000 Assignment-2
1) List at least four advantages of indexing. Mention some indexing tips that would be of
particular help to a database designer. Also, discuss indexing practices that can adversely
affect the performance of a database. In your discussion, include information on how a
designer should decide on how many indexes to provide per table.
An index is an orderly arrangement used to logically access rows in a table. From a conceptual
point of view, an index is composed of an index key and a set of pointers. The index key is the
index’s reference point. More formally, an index is an ordered arrangement of keys and pointers.
Each key points to the location of the data identified by the key.
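The key-and-pointer idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `rows` list, the `emp_id` column, and the `lookup` function are hypothetical names made up for the example, and a real DBMS would normally use a B-tree on disk rather than a sorted list in memory.

```python
import bisect

# Hypothetical table: rows stored in arrival order, not sorted by key.
rows = [
    {"emp_id": 30, "name": "Chen"},
    {"emp_id": 10, "name": "Ade"},
    {"emp_id": 20, "name": "Bea"},
]

# The index: keys kept in sorted order, each paired with a pointer
# (here, the position of the row in the list above).
index = sorted((row["emp_id"], pos) for pos, row in enumerate(rows))
keys = [key for key, _ in index]


def lookup(emp_id):
    """Binary-search the ordered keys, then follow the pointer to the row."""
    i = bisect.bisect_left(keys, emp_id)
    if i < len(keys) and keys[i] == emp_id:
        _, pos = index[i]
        return rows[pos]
    return None
```

Calling `lookup(20)` finds the key by binary search and follows its pointer to the third row, without scanning the whole table.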
Advantages of indexing:
• Increased performance: When a query can be answered from the index structure alone, there is
no need to access the row in the table, which reduces the total number of I/O operations
needed to retrieve data.
• Reduced table space: When the rows are stored in the index structure itself (an
index-organized table), the index does not need to link to a separate row, so there is no need
to store the ROW_ID in the index. The overall space required for the table is reduced.
• Presorted data: The data in the leaf nodes is already sorted by the value of the index key,
for example the primary key. The index therefore gives the database a sorted list of the
indexed column values. The database can simply scan the index from the first entry to the
last and retrieve the rows in sorted order.
• Uniqueness: An index can enforce uniqueness of the data in a column. Each time an
application adds or modifies a row in the table, the database must ensure that none of the
values in the new data duplicate existing values. With an index, this check is a quick index
lookup instead of a search through all existing records.
• Primary key implementation: Indexing plays a significant role in DBMSs for the
implementation of primary keys. When we define a table's primary key, the DBMS
automatically creates a unique index on the primary key column(s) that we declared. In a
unique index, each index key can have only one pointer value, or row, associated with it.
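The last three advantages can be observed with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration, not a statement about every DBMS: the `product` table and its values are invented, and the way SQLite names and reports its automatic index is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the primary key is declared on a text column.
conn.execute("CREATE TABLE product (code TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("P30", "bolt"), ("P10", "nut"), ("P20", "washer")],
)

# SQLite created an index for the primary key without being asked.
# Each row of index_list starts with (seq, name, unique, ...).
index_info = conn.execute("PRAGMA index_list(product)").fetchall()
auto_indexes = [(row[1], row[2]) for row in index_info]

# The unique index rejects a duplicate key value.
try:
    conn.execute("INSERT INTO product VALUES ('P10', 'duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Rows come back in key order even though they were inserted out of order.
sorted_codes = [r[0] for r in conn.execute("SELECT code FROM product ORDER BY code")]
```

Here `auto_indexes` lists one index flagged as unique, `duplicate_rejected` is true, and `sorted_codes` is in key order.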
Indexing practices that adversely affect the performance of a database:
Indexes are stored on the disk, and the amount of space required will depend on the size of the
table, the number and types of columns used in the index. Disk space is generally cheap enough to
trade for application performance, particularly when a database serves a large number of users.
If the data is modified frequently, the database engine must update every affected index on each
change, so too many indexes will degrade performance.
A disadvantage of a non-clustered index is that it is typically slightly slower than a clustered
index, because the database needs an extra step to get from the index entry to the row, and
non-clustered indexes can take up quite a bit of space on the disk.
Indexes decrease performance on inserts, updates, and deletes. They take up space, and the space
increases with the number of fields used and the length of the fields. Some databases will
monocase (convert to a single letter case) values in fields that are indexed. Even when the
database is not modified frequently, too many indexes can slow it down.
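The write overhead described above can be made concrete with a small simulation. This is a toy model under stated assumptions, not how any particular DBMS stores indexes: the `IndexedTable` class, its column names, and the `index_writes` counter are all hypothetical and exist only to count the extra work per insert.

```python
import bisect


class IndexedTable:
    """Toy table that keeps one sorted index per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.index_writes = 0  # index entries written so far

    def insert(self, row):
        pos = len(self.rows)
        self.rows.append(row)
        # Every index on the table must be adjusted for every new row.
        for col, entries in self.indexes.items():
            bisect.insort(entries, (row[col], pos))
            self.index_writes += 1


light = IndexedTable(["id"])
heavy = IndexedTable(["id", "name", "city"])
for i in range(100):
    row = {"id": i, "name": f"name{i}", "city": f"city{i % 5}"}
    light.insert(row)
    heavy.insert(row)
```

After the same 100 inserts, the table with three indexes has performed three times as many index writes as the table with one.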
Indexing tips and information on how a designer should decide on how many indexes to
provide per table:
Indexes should be built to optimize the access paths of the SQL queries. Creating an optimal set
of indexes requires a list of the SQL statements to be used, an estimate of how frequently each
statement will be executed, and the relative importance of each query.
Large numbers of indexes on a table affect the performance of INSERT, UPDATE, DELETE, and
MERGE statements because all indexes must be adjusted appropriately as data in the table
changes. The designer should avoid over-indexing heavily updated tables and keep indexes
narrow, that is, with as few columns as possible.
Use many indexes to improve query performance on tables with low update requirements but
large volumes of data. In my understanding, large numbers of indexes can help the performance
of queries that do not modify data, such as SELECT statements, because the query optimizer has
more indexes to choose from to determine the fastest access method.
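One way to see the optimizer's choice is to ask the database for its query plan before and after an index exists. The following sketch uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module; the `orders` table, the index name `idx_orders_customer`, and the helper `plan` are assumptions made for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical read-mostly table with 1000 rows.
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, i * 1.5) for i in range(1000)],
)


def plan(sql):
    """Return the human-readable plan text (the fourth column of each plan row)."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT total FROM orders WHERE customer_id = 7"
plan_before = plan(query)  # no usable index yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # the optimizer can now search the new index
```

Comparing `plan_before` and `plan_after` shows the access method changing from a scan of the table to a search that names the index.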
Furthermore, indexing small tables may not be optimal because it can take the query optimizer
longer to traverse the index searching for data than to perform a simple table scan. Therefore,
indexes on small tables might never be used, but they must still be maintained when the data in
the table changes.
When someone updates the value of a column that has been defined in an index, the DBMS must
also update the index. So, indexing speeds up the process of retrieval but slows down the
modification.
In my opinion, the number of indexes per table must be selected based on the measured improvement
in the performance of the database. If adding more indexes to a table increases the overall
efficiency of the database, including its modification workload, then the additional indexes are
not a problem.
2) List at least four basic rules for identifying primary keys in a relational database. Mention
how you would go about identifying the foreign keys in your design. Briefly discuss foreign
key ownership and how to indicate a foreign key in your relational database model.
The following four basic rules help in identifying primary keys:
Numbers
Primary keys are most of the time numeric values because they are easily incremented. Some
database software offers an auto-incremented number option, which allows for an easy primary
key field. This can also be a larger more complex number, such as an employee ID number.
Required
If a primary key is defined within a table, it will never have a null value, because a piece of
information cannot be referenced without a primary key; the value is therefore automatically
required by the software. If a column contains any null values, it is not the primary key.
Combination Keys
A primary key can be made either from a single field or from a combination of several fields. For
example, in a database containing computer use logs, a primary key could be made from a
combination of an employee ID number and the time the computer use started.
Not Updated
A static piece of information is information that will not be altered or updated over time. Look for
information that is not subject to changes when identifying primary keys, such as ID numbers and
not names.
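The "Required" and "Combination Keys" rules can be tried out on the computer use log example. The sketch below is a minimal illustration, assuming SQLite through Python's `sqlite3` module; the table name, column names, sample values, and the `try_insert` helper are hypothetical. The key columns are declared NOT NULL explicitly because SQLite does not always enforce that on its own for primary key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite primary key: employee ID plus the time the computer use started.
conn.execute("""
    CREATE TABLE computer_use_log (
        employee_id INTEGER NOT NULL,
        started_at  TEXT    NOT NULL,
        workstation TEXT,
        PRIMARY KEY (employee_id, started_at)
    )
""")
conn.execute("INSERT INTO computer_use_log VALUES (101, '2016-03-01 09:00', 'WS-1')")
# Same employee, different start time: a different key, so it is allowed.
conn.execute("INSERT INTO computer_use_log VALUES (101, '2016-03-01 13:30', 'WS-2')")


def try_insert(values):
    """Report whether the DBMS accepts or rejects the row."""
    try:
        conn.execute("INSERT INTO computer_use_log VALUES (?, ?, ?)", values)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_key = try_insert((101, "2016-03-01 09:00", "WS-3"))  # key already used
null_key = try_insert((None, "2016-03-01 10:00", "WS-1"))      # null in the key
```

Both `duplicate_key` and `null_key` come back as rejected, while the two rows with distinct key combinations were stored.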
Identifying Foreign Keys:
Every dependent and category entity in the model must have a foreign key for each relationship in
which it participates. Foreign keys are formed in dependent and subtype entities by migrating the
entire primary key from the parent entity. Moreover, if the primary key is composite, it may not
be split.
Foreign Key Ownership:
Foreign key attributes are not considered to be owned by the entities to which they migrate,
because they are reflections of attributes in the parent entities. Thus, each attribute in an
entity is either owned by that entity or belongs to a foreign key in that entity. Furthermore,
foreign key attributes are indicated by the notation (FK) beside them.
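A physical schema can mirror the migration rule described above: the child table receives the parent's entire composite primary key as its foreign key. The sketch below assumes SQLite via Python's `sqlite3` module, and the `course_section` and `enrollment` tables are hypothetical examples. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE course_section (
        course_code TEXT    NOT NULL,
        section_no  INTEGER NOT NULL,
        PRIMARY KEY (course_code, section_no)
    );
    CREATE TABLE enrollment (
        student_id  INTEGER NOT NULL,
        course_code TEXT    NOT NULL,   -- (FK) migrated from course_section
        section_no  INTEGER NOT NULL,   -- (FK) migrated from course_section
        PRIMARY KEY (student_id, course_code, section_no),
        -- The whole composite key migrates together; it is not split.
        FOREIGN KEY (course_code, section_no)
            REFERENCES course_section (course_code, section_no)
    );
    INSERT INTO course_section VALUES ('ITC6000', 1);
    INSERT INTO enrollment VALUES (1, 'ITC6000', 1);
""")

# List the foreign key columns as (child column, parent column) pairs.
fk_columns = sorted(
    (row[3], row[4]) for row in conn.execute("PRAGMA foreign_key_list(enrollment)")
)

# A child row that points at a non-existent parent is refused.
try:
    conn.execute("INSERT INTO enrollment VALUES (2, 'ITC6000', 9)")
    orphan = "accepted"
except sqlite3.IntegrityError:
    orphan = "rejected"
```

The `fk_columns` list plays the role of the (FK) notation in the model: it names exactly the attributes that belong to the foreign key rather than to the child entity itself.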
3) Describe at least two examples of common errors in entity relationship modeling. If
possible, provide a graphical illustration of the problems and solutions. List some questions
the designer should consider before designing the models so that these errors can be avoided.
Figures (a), (b), (c), and (d) show some common errors that designers make in entity relationship
models.
A beginner might want to model verbs like "keep track", "assigns", or "established" as
relationships. These verbs refer to implementing the database and not to its content. "Keep track"
refers to storing data in the database, "established" refers to adding an instance of an entity to
the database, and "assigns" refers to giving a value to an attribute of an entity. In deciding
which elements to model, it is valuable to keep the real-world situation in mind.
Moreover, designers also frequently confuse entities with their attributes or properties.
Occasionally, if properties are complex and play a significant role in the problem domain, then
they may be modeled as entities. More often, properties of an entity should be modeled as
attributes.
Other errors are modeling indirect or redundant relationships and inappropriately modeling object
types as relationships rather than as entities.
The representation used in Figure (d) shows the attributes related to payments (date of payment,
amount of payment, and method of payment) as attributes of the relationship Pay. This
representation can add unnecessary complexity to the model. Ordinarily, a relationship is
uniquely identified by the identifiers of one or more of the entities that participate in it. If
the relationship includes a time-dependent attribute like date of payment, then that attribute
must also be included in the primary key for that relationship.
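The point about time-dependent attributes can be checked with two candidate tables for the Pay relationship. This is a sketch under assumptions: the table names, the `customer_id` and `invoice_id` identifiers, the sample values, and the `accepts` helper are hypothetical, since the entities in Figure (d) are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Key built only from the participating entities' identifiers.
    CREATE TABLE pay_without_date (
        customer_id INTEGER NOT NULL,
        invoice_id  INTEGER NOT NULL,
        amount      REAL,
        PRIMARY KEY (customer_id, invoice_id)
    );
    -- Key that also includes the time-dependent attribute.
    CREATE TABLE pay (
        customer_id  INTEGER NOT NULL,
        invoice_id   INTEGER NOT NULL,
        payment_date TEXT    NOT NULL,
        amount       REAL,
        method       TEXT,
        PRIMARY KEY (customer_id, invoice_id, payment_date)
    );
""")


def accepts(sql, params):
    """Return True if the insert succeeds, False if a key constraint blocks it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Two payments by the same customer toward the same invoice.
first = accepts("INSERT INTO pay_without_date VALUES (?, ?, ?)", (1, 500, 40.0))
second = accepts("INSERT INTO pay_without_date VALUES (?, ?, ?)", (1, 500, 60.0))
dated_first = accepts(
    "INSERT INTO pay VALUES (?, ?, ?, ?, ?)", (1, 500, "2016-03-01", 40.0, "card")
)
dated_second = accepts(
    "INSERT INTO pay VALUES (?, ?, ?, ?, ?)", (1, 500, "2016-04-01", 60.0, "cash")
)
```

Without the date in the key, the second payment is refused as a duplicate; with the date included, both payments can be recorded.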
To avoid these errors, designers should keep the following points in mind:
• Understand the problem domain in detail.
• Decide on the entities and attributes carefully.
• Make a distinction between elements that comprise the content of the database and
elements that are outside the scope of the database.
• Avoid modeling indirect or redundant relationships, and avoid modeling object types as
relationships rather than as entities.
• Ensure that there is no ambiguity in the understanding of the data.
4) Using library resources or the Internet, locate an entity relationship diagram (ERD) utility
that can be used to create graphic database designs. Provide a general description about the
company that produces the tool. Mention the ER modeling techniques offered by the tool
and the database products that it supports. Discuss other features that might make this an
attractive product to a database designer and explain why.
Visual Paradigm:
Entity Relationship Diagram (ERD) Tool
for Data Modeling using ORM Hibernate Framework
Company Description:
Visual Paradigm is headquartered in Hong Kong. It is a leading and globally recognized provider
of Business and IT Transformation software solutions. It enables organizations to improve
business and IT agility and foster innovation through popular open standards. Its award-winning
products are trusted by over 320,000 users in companies ranging from small businesses and
consultants to blue chip organizations, universities, and government units across the globe.
Visual Paradigm's software has been adopted by many organizations worldwide, including Fortune
500 and Fortune 1000 companies.
ER modeling techniques and features:
• It can design from a conceptual model of the problem, to a logical model, and subsequently
to a physical schema for automatic database generation.
• It maintains the traceability between models automatically.
• The tool supports most of the leading databases on the market. When the database type of
the connection is changed, the diagram automatically conforms to the column type style of
the supported database.
• It can generate a database schema from the entire ER diagram, a part of it, or even a single
entity of the diagram.
• The tool can reverse engineer an existing design from either a database or DDL.
• It can flexibly generate databases from different models.
• It can convert a class diagram to an ERD and an ERD to a class diagram.
• It can integrate an RDBMS with object-oriented technology using Object-Relational Mapping
with the Hibernate Framework, and it automates the transformation with the ORM Wizard.
• It can switch the connection between the production database and your personal database
as you choose.
• It uses Visual Diff to identify the modifications between different versions of diagrams.
More features that will be useful to designers:
• Entity Relationship Diagram.
• Table Record Editor.
• Database View Editor.
• Ad-Hoc SQL Statements Generator.
• Construct Conceptual, Logical and Physical ER Model.
• Visual Modeling.
• Enterprise Architecture.
• Business Modeling.
• Analysis & Charting.
• Advanced Modeling Tools.
In my understanding, the above features of the Visual Paradigm tool give designers wide scope for
designing ER diagrams efficiently. Moreover, the Table Record Editor, the Database View Editor,
and, most importantly, the Ad-Hoc SQL Statements Generator give designers a better platform for
developing models and database diagrams. With the help of this tool, understanding the problem
and the database becomes easier, and as a result unambiguous ER diagrams and models are designed.
References:
Retrieved from: http://www.dbta.com/Columns/DBA-Corner/Top-10-Steps-to-Building-UsefulDatabase-Indexes-100498.aspx
Retrieved from: https://technet.microsoft.com/en-us/library/ms191195(v=sql.105).aspx
Retrieved from: http://ask.brothersoft.com/basic-rules-for-identifying-primary-keys-in-a
relational-database-104493.html
Retrieved from: http://condor.depaul.edu/gandrus/240IT/accesspages/primary-foreign
keys.htm
Retrieved from: http://www.cis.drexel.edu/faculty/song/courses/info%20605/appendix/AppendixA.PDF
Retrieved from: https://www.visual-paradigm.com/features/database-design-with-erd-tools/