Designing MS-Access Tables

Relational Database Concepts
Paul A. Harris, Ph.D.
General Clinical Research Center
Introduction

Database design (data modeling) is crucial
for long-term management of information

For many users, the first experience using
MS-Access (or any RDBMS) is confusing

A major cause of confusion is the design
and use of tables
Agenda
Discuss relational database concepts
- Keys and relationships
- Normalization
- Strategy
Fields
- Types
- Demonstration
Referential Integrity

Overview

MS-Access is a relational database engine
and a set of integrated development tools
- Tables = data
- Queries = combine tables + ask questions
- Forms/reports = UI
- Macros/code = add functionality

[Figure: MS-Access object types (Tables, Query, Code, Report, Forms, Macro)]
Relational Database Concepts
- Keys

Keys are pieces of data that help to identify a row of
information in a table

Primary key uniquely identifies an entire row of data – 1)
must have a value (cannot be null); 2) can never
change(?); and 3) must have a unique value for each
record in table.
- Look for a logical field meeting criteria
- If no logical field exists, invent one (auto-number)

Foreign keys are fields in one table that relate back to
another table’s primary keys
- Make sure foreign key “type” is same as related PK.
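
The key rules above can be tried out in a few lines. This is a minimal sketch, not MS-Access itself: it assumes Python's built-in sqlite3 module as a stand-in engine, and the table and field names (gender, demographics, subject_id) are hypothetical.

```python
import sqlite3

# SQLite (in memory) stands in for MS-Access; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.execute("""
    CREATE TABLE gender (
        gender_id INTEGER PRIMARY KEY,   -- logical field used as the primary key
        label     TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE demographics (
        subject_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- invented key, like an AutoNumber
        last_name  TEXT NOT NULL,
        gender_id  INTEGER NOT NULL
            REFERENCES gender(gender_id)  -- foreign key, same type as the related PK
    )""")

conn.execute("INSERT INTO gender (gender_id, label) VALUES (1, 'F'), (2, 'M')")
conn.execute("INSERT INTO demographics (last_name, gender_id) VALUES ('Smith', 1)")

# A second row with an existing primary key value is rejected.
try:
    conn.execute("INSERT INTO gender (gender_id, label) VALUES (1, 'Other')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A foreign key that matches no primary key in the related table is rejected too.
try:
    conn.execute("INSERT INTO demographics (last_name, gender_id) VALUES ('Jones', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(duplicate_rejected, orphan_rejected)
```

In MS-Access the same structure would be built in table design view and the Relationships window rather than with this script.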
Relational Database Concepts
- Relationships

In an RDBMS, tables are related through relationships.
Relationships may be one-to-one, one-to-many, or many-to-many. One-to-many should be the most common.

One-to-One: One item in Table A applies to one item in
Table B (demographics table – dna table)
One-to-Many: One item in Table A applies to many items
in Table B (gender table – demographics table)
Many-to-Many: Many records in Table A relate to many
records in Table B (avoid these)


Strive for one-to-many relationships – PK/FK
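
One common way to avoid a many-to-many relationship is to put a linking table between the two tables, so that each side becomes a one-to-many PK/FK relationship. The sketch below assumes Python's sqlite3 module in place of MS-Access; the subject, drug, and subject_drug tables are hypothetical examples.

```python
import sqlite3

# SQLite stands in for MS-Access; table and field names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL);
    CREATE TABLE drug    (drug_id    INTEGER PRIMARY KEY, drug_name TEXT NOT NULL);

    -- Linking table: one subject has many rows here, and so does one drug.
    CREATE TABLE subject_drug (
        link_id    INTEGER PRIMARY KEY,   -- single-field key
        subject_id INTEGER NOT NULL REFERENCES subject(subject_id),
        drug_id    INTEGER NOT NULL REFERENCES drug(drug_id),
        UNIQUE (subject_id, drug_id)      -- each pairing is stored once
    );

    INSERT INTO subject VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO drug    VALUES (10, 'Drug A'), (20, 'Drug B');
    INSERT INTO subject_drug (subject_id, drug_id) VALUES (1, 10), (1, 20), (2, 10);
""")

# How many subjects took each drug?
rows = conn.execute("""
    SELECT d.drug_name, COUNT(*) AS n_subjects
    FROM subject_drug AS sd
    JOIN drug AS d ON d.drug_id = sd.drug_id
    GROUP BY d.drug_name
    ORDER BY d.drug_name
""").fetchall()
print(rows)
```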
Relational Database Concepts
- Normalization

Series of rules developed by E.F. Codd (IBM) in 1970s –
integral to relational database model

First Normal Form: each column must contain only one
value (atomic, discrete data storage)

Second Normal Form: 1NF + any column in a table that is
not a key has to relate only to the primary key

Third Normal Form: 2NF + every non-key column is
independent of every other non-key column
Relational Database Concepts
- Normalization – First Normal Form
Each column (field) must contain only one value:

Identify any field that contains multiple pieces of
information (ex address)

Break up problem fields into separate fields (address1,
city, state, zip)
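
As an illustration of the split described above, here is a small Python sketch. It assumes, purely for the example, that addresses were typed as "street, city, ST zip"; real data is rarely that regular, and split_address is a hypothetical helper, not part of MS-Access.

```python
def split_address(address: str) -> dict:
    """Split 'street, city, ST zip' into atomic fields (format is an assumption)."""
    street, city, state_zip = [part.strip() for part in address.split(",")]
    state, zip_code = state_zip.split()
    # zip stays text: it is a number that never needs calculations
    return {"address1": street, "city": city, "state": state, "zip": zip_code}


record = split_address("12 Main St, Springfield, IL 62701")
print(record)
```

Once the pieces live in separate fields, questions such as "how many subjects per state?" no longer require parsing text inside a query.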
Relational Database Concepts
- Normalization – Second Normal Form
1NF + every non-key column relates only to the primary key

- Identify any fields that do not relate directly to the primary
  key.
- Create new tables accordingly
- Assign or create new primary keys
- Create requisite foreign keys indicating relationships
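
The steps above can be sketched on a small in-memory example. It assumes a hypothetical flat table keyed by (subject_id, visit_no) in which last_name depends only on subject_id and therefore does not relate to the whole key; the field names and values are made up for illustration.

```python
# Flat table keyed by (subject_id, visit_no); last_name depends on subject_id only.
flat_rows = [
    {"subject_id": 1, "visit_no": 1, "last_name": "Smith", "weight_kg": 70.5},
    {"subject_id": 1, "visit_no": 2, "last_name": "Smith", "weight_kg": 71.0},
    {"subject_id": 2, "visit_no": 1, "last_name": "Jones", "weight_kg": 82.0},
]

subjects = {}  # new table, primary key = subject_id
visits = []    # remaining table, subject_id is now a foreign key

for row in flat_rows:
    # Move the field that does not relate to the whole key into its own table.
    subjects[row["subject_id"]] = {
        "subject_id": row["subject_id"],
        "last_name": row["last_name"],
    }
    # Keep only fields that describe this particular visit.
    visits.append({
        "subject_id": row["subject_id"],
        "visit_no": row["visit_no"],
        "weight_kg": row["weight_kg"],
    })

print(subjects)
print(visits)
```

Each last name is now stored once, so correcting a misspelled name means changing one row instead of one row per visit.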
Relational Database Concepts
- Normalization – Third Normal Form
2NF + every non-key column is independent of every other non-key column

Within a table, test to see whether any non-key field
determines the value of another non-key field
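
One rough way to run this test on existing data is to check, for each pair of non-key fields, whether every value of the first always appears with the same value of the second. The helper below is a hypothetical sketch with made-up sample rows. Note that data can only suggest a dependency: a pair that passes on a small sample may still be independent in reality, so treat the result as a list of suspects to review.

```python
from itertools import permutations


def determines(rows, a, b):
    """True if, in these rows, each value of field a maps to a single value of field b."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


# Hypothetical sample rows; subject_id is the primary key.
rows = [
    {"subject_id": 1, "zip": "37232", "city": "Nashville", "weight_kg": 70.5},
    {"subject_id": 2, "zip": "37232", "city": "Nashville", "weight_kg": 82.0},
    {"subject_id": 3, "zip": "62701", "city": "Springfield", "weight_kg": 70.5},
]

non_key = ["zip", "city", "weight_kg"]
suspects = [(a, b) for a, b in permutations(non_key, 2) if determines(rows, a, b)]
print(suspects)
```

Here zip and city flag each other, which hints that city could move to a separate table keyed by zip, if that level of normalization is worth it for the application.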
Relational Database Concepts
- Table Design and Normalization Strategy








- Eliminate redundancy
- Think about units: this will help with 1NF atomicity
- Strive for a one-field primary key; use autonumbers if needed
- Think first about the most important data table (most important
  measurements), then work out from there to normalize
- Think about questions you'll be asking from your data, then think about
  how your table structure may be combined to answer them
- Avoid many-to-many relationships; one-to-many relationships are
  cleaner and avoid problems in the long run
- Don't be afraid to break a normalization rule if it is silly for your
  application
- Work out on paper first, then mock up with MS-Access and test
  answering business questions with query-builds linking tables
Fields – Common Types







- Text: Text or combinations of text and numbers, as well as numbers
  that don't require calculations, such as phone numbers. Up to 255
  characters.
- Memo: Lengthy text or combinations of text and numbers. Up to
  65,535 characters.
- Number: Numeric data used in mathematical calculations.
- Date/Time: Date and time values for the years 100 through 9999.
- AutoNumber: A unique sequential (incremented by 1) number or
  random number assigned by Microsoft Access whenever a new
  record is added to a table. AutoNumber fields can't be updated.
- Yes/No: Yes and No values and fields that contain only one of two
  values (Yes/No, True/False, or On/Off).
- OLE Object: An object (such as a Microsoft Excel spreadsheet, a
  Microsoft Word document, graphics, sounds, or other binary data)
  linked to or embedded in a Microsoft Access table.
Demo?
Referential Integrity

Referential integrity is a system of rules that
Microsoft Access uses to ensure that
relationships between records in related tables
are valid, and that you don't accidentally delete
or change related data. (from MS-Help)

Ensures data validity between tables is upheld
Cascade Update
Cascade Delete
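
Cascade Update and Cascade Delete can be seen in action with a short script. This is a sketch only: it uses Python's sqlite3 module rather than MS-Access, where the equivalent options are check boxes on the relationship. The subject and visit tables are hypothetical.

```python
import sqlite3

# SQLite stands in for MS-Access; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # turn on referential integrity checks
conn.executescript("""
    CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL);
    CREATE TABLE visit (
        visit_id   INTEGER PRIMARY KEY,
        subject_id INTEGER NOT NULL
            REFERENCES subject(subject_id)
            ON UPDATE CASCADE    -- key changes follow through to related rows
            ON DELETE CASCADE,   -- deleting a subject deletes its visits
        visit_date TEXT NOT NULL
    );
    INSERT INTO subject VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO visit VALUES
        (100, 1, '2004-01-05'), (101, 1, '2004-01-12'), (102, 2, '2004-02-02');
""")

# Cascade Update: renumber subject 1 and its visits follow automatically.
conn.execute("UPDATE subject SET subject_id = 7 WHERE subject_id = 1")
moved_visits = [r[0] for r in conn.execute(
    "SELECT visit_id FROM visit WHERE subject_id = 7 ORDER BY visit_id")]

# Cascade Delete: removing subject 2 also removes that subject's visits.
conn.execute("DELETE FROM subject WHERE subject_id = 2")
remaining_visits = conn.execute("SELECT COUNT(*) FROM visit").fetchone()[0]

print(moved_visits, remaining_visits)
```

Cascade Delete is convenient but removes related records without asking, so it is worth enabling only where losing the child rows is really intended.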


Summary – Paul’s Laws






- Think about the entire project and design tables (first cut)
  before touching the keyboard.
- Formulate data questions to determine the best table scheme
  (How many people took drug A and gender = F and …).
  Leave wiggle room.
- Spend time normalizing, but don't turn a 2-day project
  into a 2-month project. You're not eBay: you can get
  by with less than perfect performance as long as you can
  answer your questions and the application is flexible for
  growth.
- Think about the central table and questions first, then work
  outwards to define adjunct tables.
- Design enough tables to make things work, but don't go
  overboard. I usually try to get by with as few as possible
  while remaining true to the spirit of normalization.
- Strive to store data once; don't store calculations.
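
As a sketch of the last point, a derived value can be computed in a query whenever it is needed instead of being stored in the table. The example assumes Python's sqlite3 module in place of MS-Access and a hypothetical measurement table; body mass index is used only as a convenient calculated value.

```python
import sqlite3

# SQLite stands in for MS-Access; the table and its fields are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurement (
        subject_id INTEGER PRIMARY KEY,
        weight_kg  REAL NOT NULL,
        height_m   REAL NOT NULL   -- no stored BMI column: it would be redundant
    );
    INSERT INTO measurement VALUES (1, 70.0, 1.75), (2, 81.0, 1.80);
""")

# The calculation lives in the query, so it can never disagree with the raw data.
bmi_rows = conn.execute("""
    SELECT subject_id, ROUND(weight_kg / (height_m * height_m), 1) AS bmi
    FROM measurement
    ORDER BY subject_id
""").fetchall()
print(bmi_rows)
```

If a weight is corrected later, the next query run reflects it automatically; a stored BMI field would have to be found and updated by hand.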
Where to Get More Information

Most database books have one chapter on
table design and normalization -- I like the
Visual QuickPro Guide series of technical
help books

Google search for ‘database normalization
tutorial’