Database basics

CTFS Workshop

Shameema Esufali

Asian data coordinator and technical resource for the network shameemaesufali@gmail.com

Relational database basics

Why relational databases?

Why MySQL?

What about R?

Relational Theory

In order to work with MySQL it is necessary to understand the basics of relational theory, i.e. how and why data is stored and managed in a relational database.

The guiding principle behind a relational database is to store data once and only once.

What is a Relation?

A table. Columns are fields (attributes) of data related to other fields on the same row (tuple).

Primary Key

Identifies the row of a table without duplicates.

Tells you what the row contains

E.g. if treeid is the primary key, then the row has information about that tree
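The behaviour of a primary key can be tried out in a few lines. This is a minimal sketch that assumes SQLite (through Python's standard sqlite3 module) as a stand-in for MySQL; the table name `tree` and its columns are illustrative, not taken from the workshop database.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (treeid INTEGER PRIMARY KEY, tag TEXT, species TEXT)")
conn.execute("INSERT INTO tree VALUES (1, '1234', 'SHORME')")

# A second row with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO tree VALUES (1, '5678', 'SHORTR')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The key identifies exactly one row, and that row describes that tree.
row = conn.execute("SELECT tag, species FROM tree WHERE treeid = 1").fetchone()
print(duplicate_rejected, row)
```

The database itself refuses the duplicate, so uniqueness does not depend on the person entering the data.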

Candidate Primary Key

Any attribute, or set of attributes, that could serve as the primary key.

Must uniquely identify a row of data.

Each part of the key must be essential to unique identification. No redundancy.
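The two conditions (uniqueness and no redundancy) can be checked against sample data. The sketch below is illustrative only: the function names and the sample rows are hypothetical, and passing the check on a sample shows only that the columns are consistent with being a candidate key, not that they are one by design.

```python
from itertools import combinations

def is_unique(rows, cols):
    """True if no two rows share the same values in the given columns."""
    seen = {tuple(r[c] for c in cols) for r in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, cols):
    """Unique, and no proper subset of the columns is already unique."""
    if not is_unique(rows, cols):
        return False
    return not any(
        is_unique(rows, subset)
        for n in range(1, len(cols))
        for subset in combinations(cols, n)
    )

# Hypothetical measurement rows: one tree measured in two censuses, another in one.
rows = [
    {"tag": "1234", "census": 1, "dbh": 10},
    {"tag": "1234", "census": 2, "dbh": 11},
    {"tag": "5678", "census": 1, "dbh": 10},
]

print(is_candidate_key(rows, ("tag",)))                  # tag alone repeats
print(is_candidate_key(rows, ("tag", "census")))         # unique and minimal
print(is_candidate_key(rows, ("tag", "census", "dbh")))  # unique but redundant
```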

Foreign Key

A foreign key is a column in a table that matches the primary key column of another table. Its function is to link the basic data of two entities on demand, i.e. when two tables are joined using the common key.
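A foreign key can also be declared so that the database checks it. The following is a sketch under assumptions: it uses SQLite via Python's sqlite3 module rather than MySQL, the table and column names are illustrative, and SQLite only enforces foreign keys after the `PRAGMA foreign_keys = ON` statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE genus (genusid INTEGER PRIMARY KEY, genus TEXT)")
conn.execute("""
    CREATE TABLE species (
        speciesid INTEGER PRIMARY KEY,
        species   TEXT,
        genusid   INTEGER REFERENCES genus(genusid)  -- foreign key
    )
""")

conn.execute("INSERT INTO genus VALUES (1, 'Acacia')")
conn.execute("INSERT INTO species VALUES (1, 'melanoceras', 1)")  # genus 1 exists

# A species pointing at a genus that does not exist is rejected.
try:
    conn.execute("INSERT INTO species VALUES (2, 'diversifolia', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```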

First Normal Form

One piece of information per column. No repeated rows. Eliminate fused data, e.g. Code1,Code2

Wrong!

Tag   Species  Code
1234  SHORME   A, BA

Right

Tag   Species  Code
1234  SHORME   A
1234  SHORME   BA
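Splitting a fused value into one row per value is mechanical. A minimal sketch in plain Python, using the row from the slide; the function name is hypothetical.

```python
# The "wrong" row from the slide: two codes fused into one column.
fused_rows = [("1234", "SHORME", "A, BA")]

def to_first_normal_form(rows):
    """Emit one row per code instead of a comma-separated list of codes."""
    out = []
    for tag, species, codes in rows:
        for code in codes.split(","):
            out.append((tag, species, code.strip()))
    return out

normalized = to_first_normal_form(fused_rows)
for r in normalized:
    print(*r)
```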

Second Normal Form

Each column depends on the entire primary key.

Wrong

Tag   Census  Species  Seedsize  X     Y     DBH
1234  1       SHORTR   Medium    11.3  15.4  12

Right

Tag   Species  Seedsize  X     Y
1234  SHORTR   Medium    11.3  15.4
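One way to read this example: if the key of the wide table is (Tag, Census), then Species, Seedsize, X and Y depend on Tag alone, while DBH depends on the whole key. The sketch below splits the data accordingly. The separate measurement table and the second census row are assumptions added for illustration; the slide shows only the tree table.

```python
# Wide rows keyed by (tag, census). The second census row is hypothetical.
wide_rows = [
    {"tag": "1234", "census": 1, "species": "SHORTR", "seedsize": "Medium",
     "x": 11.3, "y": 15.4, "dbh": 12},
    {"tag": "1234", "census": 2, "species": "SHORTR", "seedsize": "Medium",
     "x": 11.3, "y": 15.4, "dbh": 13},
]

# Attributes that depend on tag alone go to a tree table, stored once per tag.
tree = {}
for r in wide_rows:
    tree[r["tag"]] = (r["species"], r["seedsize"], r["x"], r["y"])

# Attributes that depend on the whole key (tag, census) stay with the measurement.
measurement = {(r["tag"], r["census"]): r["dbh"] for r in wide_rows}

print(tree)
print(measurement)
```

After the split, the species and coordinates of tree 1234 are stored once, however many censuses it appears in.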

Third Normal Form

Each column depends ONLY on the primary key, i.e. there are no transitive dependencies

Wrong

Tag   Species  Seedsize  X     Y
1234  SHORTR   Medium    11.3  15.4

Right

Tag   Species  X     Y
1234  SHORTR   11.3  15.4
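Here Seedsize depends on Species, not directly on Tag, so it belongs in a table keyed by species. The sketch below assumes such a species table (it is implied by the example but not shown on the slide) and uses SQLite via Python's sqlite3 module in place of MySQL; a join recovers the original wide row when it is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Seed size is stored once per species (assumed table, not shown on the slide).
conn.execute("CREATE TABLE species (species TEXT PRIMARY KEY, seedsize TEXT)")
# The tree table keeps only attributes that depend directly on the tag.
conn.execute("""
    CREATE TABLE tree (
        tag     TEXT PRIMARY KEY,
        species TEXT REFERENCES species(species),
        x       REAL,
        y       REAL
    )
""")

conn.execute("INSERT INTO species VALUES ('SHORTR', 'Medium')")
conn.execute("INSERT INTO tree VALUES ('1234', 'SHORTR', 11.3, 15.4)")

# Joining on species reproduces the wide row without storing seedsize per tree.
row = conn.execute("""
    SELECT t.tag, t.species, s.seedsize, t.x, t.y
    FROM tree AS t
    JOIN species AS s ON s.species = t.species
""").fetchone()
print(row)
```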

Fourth Normal Form

The table must contain no more than one multi-valued dependency

Tag   DBH  Code
1234  10   A
1234  11   A
1234  11   BA

Entity Relationship diagram (ERD)

Shows in a diagram how entities (tables) are related to one another.

One to one

One to many

Many to many

One to one

Extension of number of attributes in a single table

Rarely required

[Diagram: Tree linked one to one with More tree attributes]

One to many

Most common

Requires two tables, linked by a Foreign Key

Example: Family (parent) and Genus (child); Genus (parent) and Species (child)

Many to many

Need to break down to one to many

Requires three tables

Associative table provides common key

[Diagram: Tree, Measurement, and Code tables, with an associative table between Measurement and Code]
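The three-table arrangement can be sketched as follows, assuming the many-to-many relationship is between measurements and codes (one measurement may carry several codes, and one code applies to many measurements). The table names `measurement`, `code` and `measurement_code` and the sample rows are illustrative, and SQLite via Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurement (measureid INTEGER PRIMARY KEY, tag TEXT, dbh INTEGER);
    CREATE TABLE code (code TEXT PRIMARY KEY);

    -- Associative table: one row per (measurement, code) pair.
    CREATE TABLE measurement_code (
        measureid INTEGER REFERENCES measurement(measureid),
        code      TEXT REFERENCES code(code),
        PRIMARY KEY (measureid, code)
    );

    INSERT INTO measurement VALUES (1, '1234', 10), (2, '1234', 11);
    INSERT INTO code VALUES ('A'), ('BA');
    INSERT INTO measurement_code VALUES (1, 'A'), (2, 'A'), (2, 'BA');
""")

# Two one-to-many joins through the associative table rebuild the pairs.
pairs = conn.execute("""
    SELECT m.tag, m.dbh, c.code
    FROM measurement AS m
    JOIN measurement_code AS mc ON mc.measureid = m.measureid
    JOIN code AS c ON c.code = mc.code
    ORDER BY m.measureid, c.code
""").fetchall()
for p in pairs:
    print(*p)
```

Each side of the many-to-many relationship becomes a one-to-many relationship with the associative table.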

Reassembling data

Data was broken down into tables to preserve integrity

How can we put it together to derive information?

Use Structured Query Language (SQL) to JOIN tables using a common attribute

Two tables may be joined when they share at least one common attribute

Joins

The Primary Key of the Parent table is stored in the Child table as a cross reference. This is called a Foreign Key.

Genus (Parent table; GenusID is the Primary Key)

GenusID  Genus      FamilyID
1        Acacia     4
2        Acalypha   3
3        Adelia     3
4        Aegiphila  3
5        Alchornea  3

Species (Child table; GenusID is the Foreign Key)

SpeciesID  Species        GenusID
1          melanoceras    1
2          diversifolia   2
3          macrostachya   2
4          triloba        3
5          panamensis     4
6          costaricensis  5
7          latifolia      5

Table joined on Foreign Key

SpeciesID  Species        GenusID  ⇿  GenusID  Genus      FamilyID
1          melanoceras    1           1        Acacia     4
2          diversifolia   2           2        Acalypha   3
3          macrostachya   2           2        Acalypha   3
4          triloba        3           3        Adelia     3
5          panamensis     4           4        Aegiphila  3
6          costaricensis  5           5        Alchornea  3
7          latifolia      5           5        Alchornea  3

The GenusID in the Species table is used to pick up information for the corresponding Genus: the join looks for the row in the Genus table with the matching Primary Key.
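The lookup described here is what a SQL JOIN does. The sketch below uses the genus and species names from the slide; the GenusID assigned to each species is an assumption reconstructed from the slide, and SQLite via Python's sqlite3 module stands in for MySQL (the SELECT statement itself is standard SQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genus (genusid INTEGER PRIMARY KEY, genus TEXT)")
conn.execute("""
    CREATE TABLE species (
        speciesid INTEGER PRIMARY KEY,
        species   TEXT,
        genusid   INTEGER REFERENCES genus(genusid)
    )
""")

conn.executemany("INSERT INTO genus VALUES (?, ?)", [
    (1, "Acacia"), (2, "Acalypha"), (3, "Adelia"),
    (4, "Aegiphila"), (5, "Alchornea"),
])
# GenusID per species is reconstructed from the slide (assumption).
conn.executemany("INSERT INTO species VALUES (?, ?, ?)", [
    (1, "melanoceras", 1), (2, "diversifolia", 2), (3, "macrostachya", 2),
    (4, "triloba", 3), (5, "panamensis", 4),
    (6, "costaricensis", 5), (7, "latifolia", 5),
])

# For each species row, find the genus row whose primary key matches
# the foreign key stored in the species row.
joined = conn.execute("""
    SELECT s.speciesid, s.species, g.genus
    FROM species AS s
    JOIN genus AS g ON g.genusid = s.genusid
    ORDER BY s.speciesid
""").fetchall()
for r in joined:
    print(*r)
```

Genus names such as Acalypha are stored once in the genus table and repeated only in the query result.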

Extend to join many tables

With SQL you can join as many tables as you need in order to get the set of information you want. Thus the previous example can be extended to include Family, which is a parent table of Genus, and/or extended in the other direction to include Tree, which is a child of Species, as long as there is a linking attribute.

This attribute is called a Foreign Key.
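A chain of joins from Tree up to Family might look like the sketch below. Several things are assumptions here: the tree rows and tags are hypothetical, the family names come from general botanical knowledge rather than from the slides, the ID values are illustrative, and SQLite via Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE family  (familyid INTEGER PRIMARY KEY, family TEXT);
    CREATE TABLE genus   (genusid INTEGER PRIMARY KEY, genus TEXT,
                          familyid INTEGER REFERENCES family(familyid));
    CREATE TABLE species (speciesid INTEGER PRIMARY KEY, species TEXT,
                          genusid INTEGER REFERENCES genus(genusid));
    CREATE TABLE tree    (treeid INTEGER PRIMARY KEY, tag TEXT,
                          speciesid INTEGER REFERENCES species(speciesid));

    -- Family names and all tree rows are illustrative assumptions.
    INSERT INTO family  VALUES (3, 'Euphorbiaceae'), (4, 'Fabaceae');
    INSERT INTO genus   VALUES (1, 'Acacia', 4), (2, 'Acalypha', 3);
    INSERT INTO species VALUES (1, 'melanoceras', 1), (2, 'diversifolia', 2);
    INSERT INTO tree    VALUES (1, '0001', 2), (2, '0002', 1);
""")

# Each JOIN follows one foreign key: tree -> species -> genus -> family.
result = conn.execute("""
    SELECT t.tag, g.genus, s.species, f.family
    FROM tree AS t
    JOIN species AS s ON s.speciesid = t.speciesid
    JOIN genus   AS g ON g.genusid   = s.genusid
    JOIN family  AS f ON f.familyid  = g.familyid
    ORDER BY t.tag
""").fetchall()
for r in result:
    print(*r)
```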
