SQL Basics

advertisement
SQL Basics
SQL: What is it?
• SQL stands for Structured Query Language.
• It was originally developed in the early 1970s
by IBM as a way to manipulate and retrieve
data stored in IBM’s relational DBMS, System
R.
• It can be pronounced “Sequel” or “S-Q-L”.
• With some variations, SQL is more or less the
standard query language for relational
databases.
2
When do we use SQL?
• SQL is most commonly used to retrieve data
from a database.
• It is also commonly used to modify data.
• SQL also contains commands for creating
tables and other database objects, and for
maintaining data security through granting
and denying privileges to users.
3
How is SQL used?
• You can use SQL in Access using the Access Query
Builder, which will be demonstrated in lab.
• All major DBMS’s, like Oracle and MySql, come
with client tools which allow database managers
and users to write SQL queries and send them to
the database.
• The next slide shows a screen shot of one of
these client tools, Microsoft’s SQL Server
Management Studio.
4
Choose a
database here.
Type the SQL
here, then
Execute.
See the results
here.
5
SQL and VB Programming
• It is fairly easy to create VB programs which
interact with databases in all major DBMSs.
• Assignment 2will be a VB program which
interacts with Access databases.
6
What can we do with SQL?
• SQL has for major components:
1. DQL (Data Query Language): Used to retrieve
data from a database.
2. DML (Data Manipulation Language): Used to
modify the data in a database—inserting,
updating, and deleting.
3. DDL (Data Definition Language): Used to design
tables and other database objects.
4. DCL (Data Control Language): Used to control
access to the data using security methods.
• We will only look at DML and DQL in this course.
7
SELECT Queries
• SELECT queries are the most commonly-used
queries in SQL.
• They are used to retrieve data from one or
more tables.
• Basic format:
• SELECT [Fields] FROM [Table(s)] WHERE
[Conditions] ORDER BY [Fields] ASC/DESC
• SELECT PlayerName, PlayerPosition FROM
Players WHERE TeamID=2 ORDER BY Age DESC
8
WHERE CLAUSE
• A where clause is a set of conditions
following the word “WHERE” in a query.
• Where clauses are used in DELETE FROM,
UPDATE, and SELECT queries.
• They are used to restrict the rows to be
deleted, updated, or selected.
• If you write a SQL query without a WHERE
clause, all records in the table will be
affected: deleted, updated, or selected.
9
Sample WHERE Clauses
• Most WHERE clauses are easy to understand.
• They can use AND, OR, or NOT, and
parentheses to change the order of operation.
• Examples:
–
–
–
–
WHERE PlayerID = 7
WHERE TeamID=2 AND Age>24
WHERE NOT (TeamID=1 AND Age<=30)
WHERE Age>20 OR Age<30 [That one will affect all
records]
– WHERE Age <20 AND Age>30 [Won’t affect any
records]
10
BETWEEN and IN in WHERE clauses
• BETWEEN and IN can be used to simplify WHERE
clauses.
• BETWEEN combines two inequalities:
–
–
–
–
WHERE Age BETWEEN 24 AND 27
Is the same as
WHERE Age>=24 AND Age<=27
BETWEEN is always inclusive; that is, it includes the
endpoints.
• IN combines multiple Ors:
– WHERE TeamID IN (2,3,5,8)
– Is the same as
– WHERE TeamID=2 OR TeamID=3 OR TeamID=5 OR
TeamID=8
11
The Fields list
• The list of fields following “SELECT” can include many
things:
• *
• Field1, Field2
• Field1 AS SomethingElse (aliased field)
• A math expression which can include field names (a
calculated field)
• An aggregate expression like MAX(Age) or COUNT(*)
• A constant value, for example 2, ‘Smith’, or NULL. This
can be useful when normalizing a database by
combining several tables with identical fields into one.
12
*
• Using an asterisk (*) after SELECT tells the database
to send us all of the fields in the table:
 Since the query used an asterisk, it returns all seven
columns.
 It also returns all of the rows in the table. Why?
13
SELECT * FROM TableName
• “SELECT * FROM TableName” is the easiest way to view an
entire table using SQL.
• If you, as a SQL expert, are trying to find data in an unfamiliar
database, this can be a good way to start.
• However, if the tables are large (lots of rows), it might take a
long time for that query to run.
• If you just want to see the names of the columns, you can run a
SELECT query with an impossible WHERE clause, such as:
• SELECT * FROM Players WHERE 1=0
• Here are the results of that query:
 To summarize: “SELECT *” means “get all fields”;
 To get all rows, just leave off the WHERE clause.
 To get no rows, add an impossible WHERE clause.
14
Back to the Field List
• While putting “*” after SELECT is easiest, it isn’t always the best
way to query a table.
• This is especially true when writing VB code: Since you will need
to refer to individual field names in your code, you can simplify
writing the code by putting the actual field names in the query.
• So, instead of writing “SELECT * FROM Players”, we would
write:
• SELECT PlayerID, TeamID, LastName, FirstName, Age,
PlayerPosition, PhoneNumber FROM Players
• Now, when we write our code which accesses these fields, we
don’t have to go back and look at the database to remember
what their names were.
• And, if we don’t need certain fields, we can leave them out of
the list. If the table is huge, this can make the program run
faster.
15
Aliases
• Field names can be “aliased”—that is, given
different names when displayed in the results.
• This is done using the SQL keyword “AS”:
• SELECT PhoneNumber AS Phone FROM Players
• You can also use this to break the rules about
field names when you display them—just put the
“incorrect” alias in square brackets:
• SELECT PhoneNumber AS [Phone #] FROM
Players
16
Aliases and Calculated Fields
• SQL can do arithmetic. You can include calculated fields in SELECT
queries. The calculations can include the other fields in the table.
• Aliases should always be used for calculated fields in SELECT
queries. Examples (sorry, I don’t actually have a CarData table to
demonstrate these):
• SELECT CarID, CarName, Miles, Gallons, Miles/Gallons AS [Miles per
Gallon] FROM CarData
• SELECT CarID, Miles*1.61 AS Kilometers FROM CarData
• SELECT FirstName & ‘ ‘ & LastName AS [Name] FROM Players
• Note: SQL generally allows both single (‘) and double (“) quotes to
be used for string constants.
• I recommend that you use single quotes, especially when creating
SQL statements in VB.
17
ORDER BY
• Query results can be sorted in a desired order by using ORDER
BY at the end of the query.
• You should never rely on query results being sorted UNLESS
the query includes an ORDER BY clause!
• SELECT * FROM Players ORDER BY Age ASC
• That query will put the youngest players on top of the list, with
the oldest at the bottom.
• ASC stands for ascending order. With numbers, this means
smallest to largest; with strings, it means alphabetical order.
• DESC is the opposite of ASC.
• ASC is the default for ORDER BY; if you simply write “ORDER BY
Age”, SQL will interpret it as “ORDER BY Age ASC”.
18
ORDER BY Examples
• SELECT LastName, FirstName FROM Players ORDER
BY LastName, FirstName
• For that query, “FirstName” is the secondary sort
order—the tie breaker. For players with the same
last names, those with first names beginning with A
will be listed before those beginning with B, and so
on. “Jones, Adam” before “Jones, Bob”.
• SELECT * FROM Players ORDER BY TeamID ASC, Age
DESC
• That query will display all members of team 1
before all members of team 2, but within each
team, the oldest players will be listed first.
19
TOP
•
•
•
•
•
•
•
•
To find only the best (or worst) records in any result set, you can use the
keyword TOP and the number of records you want to see right after SELECT.
If you use TOP, you should always have an ORDER BY clause—remember that
you cannot count on query results being sorted any particular way UNLESS
you have an ORDER BY clause.
SELECT TOP 5 LastName, FirstName, GPA FROM Students ORDER BY GPA
That query will find the five WORST students in the table. Don’t let the word
“TOP” confuse you. In SQL queries, it doesn’t mean “best”—it means the
records that will appear at the TOP of the result set. In this case, ORDER BY
GPA will list GPAs from lowest to highest, since the default order for ORDER
BY is ASC.
If you want the five BEST students, use
SELECT TOP 5 LastName, FirstName, GPA FROM Students ORDER BY GPA DESC
Note that there is no “BOTTOM” keyword in SQL; to find the bottom, reverse
the sort order and use TOP.
TOP will display ties, at least in Access. For example, in the last query, if there
are seven students with 4.0 GPAs, all seven will displayed as being in the top
5.
20
Aggregates
• Aggregates are fields in queries which summarize the values for all
records in the table, or for subsets of the records. The aggregates you
should know for this class are COUNT, MAX, MIN, SUM, AVG, and
STDEV.
– MIN: Find the minimum value (or the earliest date for dates) in the
specified field in the rows meeting the conditions.
– MAX: Find the maximum value (or the latest date for dates) in the
specified field in the rows meeting the conditions.
– SUM: Add the values in the specified field in the rows meeting the
conditions.
– AVG: Find the mean value of the specified field in the rows meeting the
conditions.
– STDEV: Find the standard deviation of the values of the specified field in
the rows meeting the conditions.
– COUNT: Find out how many rows there are which meet the conditions.
For this course, always use COUNT(*), not COUNT(SomeField). They
generally return the same numbers, but doing COUNT on a specific field
may return a smaller number if that field contains NULLs.
21
Aggregate Examples
• (See the TripSummary table in the SampleData.accdb database
to see what these queries refer to, although you can probably
understand them without having to look at the database.)
• SELECT MAX(DistanceKilometers) AS MaxDistance FROM
TripSummary
• Returns a result set with one column (MaxDistance) and one
row. The one cell in the result set contains the maximum
distance from all trips in the table.
• SELECT AVG(DurationMinutes) AS AvgTime FROM
TripSummary WHERE Driver=4
• Again, this returns a one-column, one-row result set containing
the average duration of driver 4’s trips.
• SELECT COUNT(*) AS NumTrips FROM TripSummary WHERE
Driver=37
• Returns the number of rows having driver = 37.
22
AAA: Always Alias Aggregates
• If your query says
• SELECT MAX(Age) FROM Players
• Access will return the following:
 Access doesn’t have a name for MAX(Age), so it makes up
one: “Expr1000”
 You should always alias aggregates with a meaningful name.
 SELECT MAX(Age) AS MaxAge FROM Players
23
GROUP BY
• Consider this table called Workers (also available in
SampleData.accdb):
•If you want to know the maximum age of all workers, you use this query:
•SELECT MAX(Age) AS MaxAge FROM Workers
•which will return
MaxAge
64
24
GROUP BY
• But if you want to know the maximum age
by JobType, use GROUP BY:
• SELECT JobType, MAX(Age) AS MaxAge
FROM Workers GROUP BY JobType
• Which returns
JobType
Clerical
Management
Other
Professional
Sales
Service
MaxAge
44
51
55
64
22
24
25
Another GROUP BY Example
• If you want to know how many men and
women there are in the list:
• SELECT Sex, COUNT(*) AS Num FROM
Workers GROUP BY Sex
Sex
Num
F
5
M
14
26
And Another!
• SELECT JobType, Sex, COUNT(*) AS Num
FROM Workers GROUP BY JobType, Sex
JobType
Sex Num
Clerical
Clerical
Management
Other
Other
Professional
Professional
Sales
Service
Service
F
M
M
F
M
F
M
M
F
M
2
1
2
1
4
1
3
2
1
2
27
MAX/MIN vs. TOP
• Above, we saw the results of this query:
• SELECT MAX(Age) AS MaxAge FROM
Workers
MaxAge
64
However, we can get the same result using TOP:
SELECT TOP 1 Age FROM Workers ORDER BY Age DESC
In addition, we can find out everything about the oldest worker using TOP, something we can’t
do with MAX:
SELECT TOP 1 * FROM Workers ORDER BY Age DESC
WorkerID
1485
YearsOfSchool
18
Sex
F
YearsWorking
40
UnionMember Wage
No
22.2
Age
64
JobType
Professional
28
Combining TOP and Aggregates
• Therefore, if you just want to find one maximum or minimum
and want to know something about the record that possesses
that value, use TOP. In most cases, it will be preferable to MAX
or MIN. However, if you want to calculate maximum or
minimums by categories, use MAX and MIN along with GROUP
BY. You can even combine TOP with GROUP BY:
• SELECT TOP 3 JobType, AVG(Wage) AS AvgWage FROM
Workers GROUP BY JobType ORDER BY AVG(Wage) DESC
JobType
Management
Professional
Other
AvgWage
18.695
16.2225
9.388
 This tells us the three highest paying job categories and
the corresponding average wages.
29
Download