
CS 330 S19

CS 330
Management Information Systems (MIS)
Kevin Lanctot
Based on slides and lecture notes prepared by
Michael Liu and Anne Banks Pidduck
and the course text
Management Information Systems: Managing the Digital Firm,
7th Canadian Edition by Laudon, Laudon and Brabston (2013)
Table of Contents
Topic  Title
0      Course Overview
1      IT Infrastructure
2      Databases
3      Networking
4      Management Information Systems
5      Business Processes and Types of Info Systems
6      Organizations and IS
7      Social, Ethical, and Legal Issues
8      Security
9      Managing Knowledge
CS 330
Spring 2019
Welcome to CS 330!
Topic 0 – Course Overview
Four Questions
• Who am I? Who are we?
→ Staff
• Who are you?
→ Intended Audience
• What are we doing?
→ Course Overview
• How will we do it?
→ Course Delivery
Reference
• CS 330-S19-Outline.pdf available on Learn.
Who are we? → Staff
Instructor
Kevin Lanctot, kevin.lanctot@uwaterloo.ca
• Office: DC 2131 (near the skywalk to the M3/MC buildings)
• Office hours: Tues and Thurs 12:30 – 1:30 pm
• Things to know about me:
- I’m a talker not a typer, i.e. best to ask me questions about
course content in person rather than through email or Piazza.
- I only check my email once or twice a day.
- I typically reply to emails within 48 hours.
- Last name is pronounced long-k toe, i.e. “long toe” with the
“k” sound after long.
Who are we? → Staff
TAs
• Rishav Agarwal
• Kashif Khan
• Mustafa Korkmaz
• Mohamed Mhedhbi
• Ke Nian
Role
• Some will have office hours before the assignment is due.
• Others will have office hours after assignments or the midterm
have been handed back in case you have any questions.
• These office hours will be posted on Piazza once they have
been finalized.
Who are you?
Course Objective
• Interested in learning more about computer science from the
perspective of a manager who has to make informed decisions
about information technology.
Intended Audience
• This course is most suitable for students interested in the
application of computers to business (i.e. no programming).
• Prereq: One of CS 106, 116, 136, 138, 146; Level at least 2B;
Not open to Computer Science students.
• Antireq: AFM 241, BUS 415W, 486W, CS 480/490, MSCI 441
• BBA/BMath double degree students interested in CPA should
NOT take CS 330 because there is an antireq problem.
What are we doing?
Major Topics
Foundations of Management Information Systems
• What are they? What is their role? How are they used?
• Applications, types and their impact on business and society
• Ethical and security concerns
Technical Foundations of Information Systems
• Hardware, software, files, databases, telecommunications,
connectivity, standards.
Building Information Systems
• Tools and techniques to analyze and design information
systems. The systems development life cycle.
What are we doing?
Course Objectives
At the end of the course students should be able to:
1. Assess trade-offs in technological solutions, such as build,
rent or buy decisions
2. Make informed decisions about using technology in a
business environment
3. Define and explain the characteristics of a variety of business
information systems
4. Describe strategic roles and management usage for
information systems in business
What are we doing?
5. Develop a Disaster Recovery Plan
6. Develop and understand website security and privacy
policies
7. Understand the relationships among various information
systems
8. Understand the implications of wireless technology
What are we doing?
Types of Questions We’ll Deal With
• What do we need to secure our system?
• Should I use a router, switch or hub to build an office network?
• Who owns the pictures you posted on Facebook?
• What is the big deal about 5G and IoT?
• How much does it cost to have an IT infrastructure? How long
does it last?
• What do you need for your IT infrastructure?
• Why do we need a data warehouse?
• How much is the lifetime cost of a PC?
What are we doing?
Types of Questions We’ll Deal With
• Can the police search my work computer without a warrant?
• Can border officials search my phone without a warrant?
• Is it illegal to hack an iPhone?
• What is cloud computing?
• What are the pros and cons of cloud computing?
• What are LTE and RFID?
• Is the internet the same as the World Wide Web?
• Is a 3.1GHz processor necessarily faster than a 2.2GHz one?
How will we do it?
Course Delivery
• Lectures will include
- slides
- some class discussions
- some notes made on blackboard / whiteboard
• Slides only contain key points
- must supplement slides with
▪ course text and
▪ notes taken in class
- key points on a slide will be written in blue italics
- key technical terms, that you should learn, are written in
red, such as DRAM
How will we do it?
Attendance
• In Spring 2013, attendance was taken at 9 randomly chosen
classes during the term for about 220 students.
Final Grade   Attended 7 or more classes   Attended fewer than 7 classes
90s           7.1%                         0%
80s           38.5%                        12.4%
Failed        3.1%                         10.3%
Conclusion: Regular attendance helps.
How will we do it?
Grade Calculation
• 20% 4 assignments, each worth 5%
• 25% Midterm Exam
• 55% Final Exam
• In order to pass CS 330, students are required to
1. pass the entire course (get at least a 50% overall grade)
and
2. pass the weighted average of the midterm + final
• Otherwise, the maximum course grade is 46%
• Grades (will be) viewable on Learn.
• Midterm and final are closed-book.
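The grade rules above can be sketched as a small calculation. This is only an illustration, not the official marking scheme: the function name `course_grade` is hypothetical, and it assumes that "the weighted average of the midterm + final" weights the two exams by their course weights (25 and 55) and that a pass means at least 50%.

```python
def course_grade(assignments, midterm, final):
    """Return the course grade (0-100) under the weights on this slide.

    assignments: four assignment marks, each out of 100
    midterm, final: exam marks out of 100
    """
    if len(assignments) != 4:
        raise ValueError("expected exactly 4 assignment marks")

    # 4 assignments at 5% each, midterm 25%, final 55%
    overall = 0.05 * sum(assignments) + 0.25 * midterm + 0.55 * final

    # Assumption: the exam average weights the midterm and final by their
    # course weights, and "pass" means at least 50%.
    exam_average = (25 * midterm + 55 * final) / 80

    passed = overall >= 50 and exam_average >= 50
    # If either requirement is missed, the grade is capped at 46%.
    return overall if passed else min(overall, 46.0)


print(course_grade([80, 80, 80, 80], 70, 60))
print(course_grade([100, 100, 100, 100], 40, 40))
```

The second call shows why the exams matter: perfect assignments push the overall mark above 50%, but the exam average is below 50%, so the cap applies.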
How will we do it?
Course Work
• Assignment 1: Thursday June 6th at 5:00 pm
• Assignment 2: Thursday June 20th at 5:00 pm
• Midterm: Thursday June 27th 7:00-8:20 pm
• Assignment 3: Thursday July 11th at 5:00 pm
• Assignment 4: Thursday July 25th at 5:00 pm
• Final Exam: to be scheduled by the Registrar’s Office
How will we do it?
Assignments
• 4 assignments
• can be up to 24 hours late, -10% (even if just a part of it is late)
• They can be submitted in class or in the assignment boxes on
the 4th floor of the MC near the tutorial centre.
• We will set up a way of submitting electronically on Learn (for
proof of submission) and possibly as a preferred method of
submission.
• Assignments will be available for pick-up after class or during
office hours.
• Assignments must be well organized and easy to read.
How will we do it?
Lost Assignments
• Softcopy submission to Learn is the only proof we will accept
that the hardcopy of the assignment has been lost.
Retention of Assignments
• Unclaimed assignments will be retained for one month after
the term grades become official in Quest.
• After that time, they will be destroyed in compliance with
UW’s confidential shredding procedures.
How will we do it?
Regrading Request
• Requests for regrading will be accepted up to 14 days after
students have the opportunity to pick up their assignments or
midterm.
• Details of how to request a regrade will be posted in Piazza
after the first assignment is due.
Regrading Policy
• Grades are posted in Learn
• It is your responsibility to verify that the posted grade
corresponds to the grade actually received and to notify the
instructor of any error.
How will we do it?
In Case of Illness
• Accommodations for a missed assignment, midterm or final
exam require a valid Verification of Illness form (VIF).
- Submit the VIF to the MUO
- MUO verifies it then notifies all your instructors
- https://uwaterloo.ca/math/vif
• If you miss the final exam, you will be given an INC if
- there is a strong reason for missing the exam (generally a
serious medical issue verified by a doctor's note)
- AND a satisfactory performance during the term.
How will we do it?
Course Textbook
• Information Systems: Managing the Digital Firm (2014), 7th
Canadian Edition, Toronto, Pearson Prentice Hall.
• A copy will be placed on 3-hour reserve in the Davis Library.
Learn
• lecture slides
• assignments
• marks
• drop box for your assignments
• we will have an official pinned post for each assignment, for
the midterm and for the final exam.
How will we do it?
Piazza
• For general questions about assignments
• One [official] thread per assignment
• Use private post if hinting about your approach, how you might
solve it, implementation details
• When asking about an assignment put the following in your
title, AxxxQyyy.
- E.g. if you are asking about Question 2 on Assignment 1 include
A1Q2 in the heading
How will we do it?
Piazza
• You will receive an invitation to join the class discussion for CS
330 via your uwaterloo.ca email this week.
• For questions about lecture material it is best to see me or a TA
but you are welcome to use Piazza for questions that require a
brief answer.
• We will have extra office hours just before the midterm and
final exams.
How will we do it?
How to succeed on assignments and exams
What are some good studying habits?
How will we do it?
UW Values
• honesty, trust, fairness, respect and responsibility
Avoid Cheating
• Do your own work.
• Do not try to look up answers to assignment questions on the
web (unless the question states that it is ok to do so).
• Midterm and final are worth 80% of your final mark.
How will we do it?
UW Values
• honesty, trust, fairness, respect and responsibility
Avoid Plagiarism
• Use a proper reference and citation
- Without which, you might be accused of plagiarism
• Write in your own words. Don’t copy verbatim!
- Even with proper references and citations, if you copy most
of the material verbatim you will not be charged with
plagiarism, but you might get 0 for the assignment.
How will we do it?
Academic Integrity
www.uwaterloo.ca/academicintegrity
Grievance
http://www.adm.uwaterloo.ca/infosec/Policies/policy70.htm
Discipline
http://www.adm.uwaterloo.ca/infosec/Policies/policy71.htm
Appeals
http://www.adm.uwaterloo.ca/infosec/Policies/policy72.htm
Topic 1 – IT Infrastructure and
Emerging Technologies
Key Question
• Why should you care about information technology (IT)?
• What are the basic components of IT infrastructure?
References
• Course Text Chapter 5, IT Infrastructure and Emerging Technologies
Acknowledgements
• partially based on the lecture “Geek Speak” developed by John
Finnson and John Doucette
Why Understanding IT is Important
Class Exercise
Distinguish gibberish from genuine technical vocabulary.
• Don’t be “cut out of the loop”!
• Maintaining employees’ respect
Exercise
1. I will show you a sentence
2. A student will answer whether it is genuine or gibberish
3. Discuss
4. Class vote
Why Understanding IT is Important
Class Exercise
• Hackers are using snorters to cause cyber-lightening on our
internal network.
- Genuine or gibberish?
• Hackers have used Trojan horses to introduce bots into our
network.
- Genuine or gibberish?
Why Understanding IT is Important
Class Exercise
• Marketing wants a new server to support an in-house data
mart.
- Genuine or gibberish?
• The head of IT suggests pinning business intelligence tools to
our data cabinets to improve network protocols.
- Genuine or gibberish?
Why Understanding IT is Important
Class Exercise
• The security department suggests using stronger encryption
for our wireless network to protect from war driving and cyber
vandalism.
- Genuine or gibberish?
• Marketing wants to register our RFID tags with the domain
name system for faster HTTP access.
- Genuine or gibberish?
Why Understanding IT is Important
Class Exercise
• We need to replace our existing 1 TB SSD with a 1,000 GB SSD
because 1 TB is too small.
- Genuine or gibberish?
• If you install a duo core 32-bit CPU, you can have 64-bit
computation power.
- Genuine or gibberish?
Why Understanding IT is Important
Dr. Evil has discovered that Donald Trump has set the nuclear
launch code to match his Twitter account password.
So Dr. Evil plans to hack into Twitter to obtain Trump’s password
to gain control of US nuclear arsenal.
Is this plan technically feasible?
a) Yes
b) No
Why Understanding IT is Important
A friend runs an e-commerce company.
Should they buy a DSL, ADSL, or T1 line?
a) DSL
b) ADSL
c) T1 line
d) This question is gibberish
Why Understanding IT is Important
Your IT expert informs you of employees falling victim to
phishing and identity theft.
She advises that a social engineering expert should be brought to
the company to instruct the employees on how to avoid these
attacks.
a) Phoney / Gibberish
b) Good idea
c) Bad idea
Why Understanding IT is Important
Know Your Options
• You will make better decisions if you understand the options
and their trade-offs.
• Important to understand security
- Social engineering breaches can damage your company’s
reputation and brand.
- e.g. leaving a disk in the washroom that contains the label
“Executive Salary Summary 2018” but really contains
malicious software (malware)
- Understand the structure of company’s security.
Hardware Components
Goal: Understand the basic components of a computer
1. Processor (a.k.a. central processing unit or CPU) is where
symbols, characters, and numbers are manipulated
2. Primary Memory is where data and program instructions are
stored temporarily during processing
- e.g. registers, cache, RAM
3. Secondary Storage stores data and programs even when the
computer is turned off
- e.g. magnetic disks (HD) and optical disks (DVDs, Blu-ray),
flash drives, solid state drives (SSD), magnetic tape
Hardware Components
4. Input Devices: convert data and instructions from the outside
world into electronic form
- e.g. keyboard, mouse, touchpad, touchscreen, microphone,
camera
5. Output Devices: convert electronic data produced by the
computer into a form understood by humans or the outside
world
- e.g. printer, speaker, monitor
6. Communication Devices: provide connections between the
computer and communications networks
- e.g. network interface card (Ethernet or Wi-Fi), Bluetooth
Measuring the Amount of Data
Capacity: KB, MB, GB, TB, etc.
• Small b is a bit or single binary digit
- there are only 2 possible bit values: 0 or 1
• Big B is a byte (8 bits), enough info to specify one English letter
- there are 2^8 = 256 possible values: 00000000 to 11111111
• In general with n bits, 2^n different values are possible
• When dealing with data storage and data transfer rates the
units refer to multiples of 1024 (or 2^10) and you should use
large letters like K, M, G, T.
• When dealing with the metric system or frequencies they refer
to multiples of a thousand, like distance (1 km = 1000 metres)
or weight (1 kg = 1000 grams).
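The rule that n bits give 2^n different values can be checked directly. The sketch below is illustrative only; the function names `distinct_values` and `all_patterns` are made up for this example.

```python
def distinct_values(n_bits):
    """Number of different values that n_bits binary digits can represent."""
    return 2 ** n_bits


def all_patterns(n_bits):
    """List every bit pattern of the given width as a string of 0s and 1s."""
    return [format(value, f"0{n_bits}b") for value in range(2 ** n_bits)]


print(distinct_values(1))    # a single bit
print(distinct_values(8))    # one byte
print(all_patterns(2))       # every 2-bit pattern
print(all_patterns(8)[0], "to", all_patterns(8)[-1])
```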
Measuring the Amount of Data
Capacity: KB, MB, GB, TB, etc.
• Common measures of size (typically use bytes)
- KB = 1024 bytes
- MB = 1024^2 (or roughly a million) bytes
- GB = 1024^3 (or roughly a billion) bytes
- TB = 1024^4 (or roughly a trillion) bytes
• Common measures of speed (typically use bits)
- Kb = 1024 bits
- Mb = 1024^2 (or roughly a million) bits
- Gb = 1024^3 (or roughly a billion) bits
• Occasionally manufacturers will use multiples of 1000 rather
than 1024 but they will mention this in a footnote somewhere.
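A short sketch of what the 1000 versus 1024 difference means in practice, using the slide's convention that K, M, G and T are multiples of 1024. The function names are hypothetical, and the 1000 GB drive is just an example figure.

```python
KB = 1024          # bytes, following the convention on this slide
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4


def advertised_to_binary_gb(advertised_gb):
    """Convert a capacity quoted in 1000-based GB into 1024-based GB."""
    total_bytes = advertised_gb * 1000 ** 3
    return total_bytes / GB


def transfer_seconds(size_bytes, rate_bits_per_second):
    """Time to move size_bytes over a link rated in bits per second."""
    return size_bytes * 8 / rate_bits_per_second   # 8 bits per byte


# A drive sold as "1000 GB" using multiples of 1000 holds fewer 1024-based GB.
print(round(advertised_to_binary_gb(1000), 2))

# Sizes are in bytes but speeds are in bits, so 1 MB takes 1 second at 8 Mb/s.
print(transfer_seconds(1 * MB, 8 * 1024 ** 2))
```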
The Processor
Word Size
• When talking about processors, the word size is a measure of
how many bits a processor can transfer or manipulate in
parallel (i.e. at the same time).
• Recently processors for servers, laptops, tablets and cell
phones come in two varieties.
- 32-bit architecture has a word size of 32 bits.
- 64-bit architecture has a word size of 64 bits.
• Premium smart phones, recent laptops and servers would all
use 64-bit architectures.
• Older and inexpensive smart phones would still be using 32-bit
architectures.
The Processor
64-bit Architecture
• When processor companies like Intel and AMD moved from 32-bit
to 64-bit architecture for their processors, they made sure
that computer programs that worked for their old (32-bit)
processors would also work on their newer 64-bit ones.
• This feature is called backwards compatibility, i.e. when the
new version will still work with the old system.
• Programs optimized for 64-bit architectures will run faster.
• Programs created for 32-bit architectures would still run on the
newer processors.
The Processor
Processors Come in Two Varieties
• Built for efficiency: typically ...
- used in smart phones and tablets which are designed to run
a long time with just a small battery.
- these processors do not need a fan to keep cool
- they try to minimize the number of transistors they use
• Built for speed: typically …
- used in laptops, desktops and servers which either have a
large battery or are plugged in
- these processors occasionally need a fan to keep cool
- they are complex (i.e. use a lot of transistors)
The Processor
• The processor is the part of a computer that does the
computation. It only executes simple instructions like …
- Arithmetic and Logic: add, sub, mult, div, and, or
- Comparisons: less than, greater than, equals, not equals
- Accessing data: lw (load word from RAM), sw (store word in
RAM)
- Flow control: used to implement function calls, for loops,
while loops etc.
• The goal is to have simple instructions that can be executed
very quickly by the processor.
• Programs written in high level languages like Racket, C or C++
get converted to these simple instructions.
The Processor
The Components of a Processor
• Program Counter (PC): holds the address of the current (or
next) instruction
• Instruction Register (IR): holds the instruction that is being (or
is about to be) executed
• Arithmetic Logic Unit (ALU): performs arithmetic and logic
operations (add, sub, mult, div, and, or)
• General Purpose Registers: a small amount of temporary (and
very fast) storage within the data path
• Control Unit reads the instruction in the instruction register
and turns on and off the other components of the processor to
execute the instruction.
The Processor
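[Figure: block diagram of a processor showing the Control Unit, the
Program Counter (PC), the Instruction Register (IR), the Registers
($0, $1, ..., $31), the ALU, and Random Access Memory (RAM)]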
The Processor
The steps of executing an instruction are
1. Fetch: get the next instruction from memory and load it into the
instruction register.
2. Decode: get the source values from the registers
3. Execute: perform an ALU operation (if required)
4. Memory: access (i.e. read from or write to) RAM (if required)
5. Write Back: write the results back to a register (if required)
Not all steps are used for each instruction.
E.g. The instruction add $1, $2, $3 means
• get the data from registers 2 and 3 (i.e. step 2)
• add them together using the ALU (i.e. step 3)
• store the result in register 1 (i.e. skip step 4 but do step 5)
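The five steps can be traced with a toy simulator. This is a simplified sketch, not how a real processor is built: the function `run` and the tuple encoding of instructions are invented for illustration, and `lw` / `sw` here take a plain RAM address rather than the register-plus-offset form a real instruction set would use.

```python
def run(program, registers, ram):
    """Execute a list of instructions on a toy machine.

    program:   list of tuples such as ("add", 1, 2, 3), meaning $1 = $2 + $3
    registers: list of 32 integers standing in for $0..$31
    ram:       dict mapping an address to a stored word
    """
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        # 1. Fetch: load the next instruction into the instruction register
        ir = program[pc]
        pc += 1
        op = ir[0]

        if op in ("add", "sub"):
            _, dest, src1, src2 = ir
            # 2. Decode: read the source values from the registers
            a, b = registers[src1], registers[src2]
            # 3. Execute: the ALU performs the operation
            result = a + b if op == "add" else a - b
            # 4. Memory: skipped, these instructions do not touch RAM
            # 5. Write back: store the result in the destination register
            registers[dest] = result
        elif op == "lw":
            _, dest, address = ir
            # 4. Memory: read a word from RAM, then 5. write it to a register
            registers[dest] = ram[address]
        elif op == "sw":
            _, src, address = ir
            # 2. Decode: read the source register, then 4. write it to RAM
            ram[address] = registers[src]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


regs = [0] * 32
memory = {100: 7, 104: 5}
run([("lw", 2, 100), ("lw", 3, 104), ("add", 1, 2, 3), ("sw", 1, 108)],
    regs, memory)
print(regs[1], memory[108])
```

Note how `add` uses steps 1, 2, 3 and 5 but never touches RAM, matching the example on the slide.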
The Processor
Processor Caches
• The speed at which you can access memory depends on the size
of the memory, so the processor has a small amount of memory
on the processor chip (called a cache) in (typically) three sizes
1. Level 1 Cache: 32 KB
2. Level 2 Cache: 256 KB
3. Level 3 Cache: 2 MB
• The cache sizes vary from processor to processor.
• The most frequently used data and instructions would be in the
smallest cache and could be accessed very quickly.
The Processor
Multicore Processors
• Each core acts like a separate processor on the same chip.
• They may share some resources (e.g. L2 or L3 caches) and
share access to the rest of the computer.
• Can also have multicore processors which do not share any
caches.
• The two most common multicore processors you would see in
a laptop are
- a dual-core processor (i.e. 2 cores) which can execute 2
instructions at the same time,
- a quad-core processor (i.e. 4 cores) which can execute 4
instructions at the same time.
The Processor
Processing Power
• Processor performance is typically reported as clock speed
(frequency).
• Its processing power is actually based on:
- the number of bits that can be processed simultaneously
(word size)
- the speed at which data can be moved between the
processor, primary storage, and other devices (data bus
width and speed)
- how complex the instruction is, e.g. you could have an
instruction that reads a value from RAM and adds it to
another value, which you can think of as two instructions:
1) read value 2) add
Where to Store Data?
Varieties of Storage Devices
• A computer has many storage devices, including
- static random access memory (SRAM) used for registers
- dynamic random access memory (DRAM or just RAM),
- hard disk drive (HDD or HD) or solid state drive (SSD)
- USB flash drive, secure digital (SD) card, mini SD, micro SD
- digital versatile disk (DVD), Blu-ray disk (BD)
• Why make it so complicated?
• Why not just have one type of storage device?
Gap between CPU and Memory Performance
[Figure: plot of Performance versus Year for Processor and Memory.
Source: Computer architecture: a quantitative approach by
Hennessy, Patterson and Arpaci-Dusseau]
• Processor performance has been increasing much faster than
memory performance.
• Accessing (reading from and writing to) memory is the bottleneck.
Clock Speed
Measuring Clock Speed (Frequency)
• The speed at which a clock “ticks” (really a square wave) is typically
measured in
- MHz: 1 million clock ticks per second or
- GHz: 1 billion clock ticks per second or a clock tick every
billionth of a second.
Measuring Time
Unit          Symbol   Fraction of a second
milliseconds  ms       1/1000 s
microseconds  μs, us   1/1,000,000 s
nanoseconds   ns       1/1,000,000,000 s
picoseconds   ps       1/1,000,000,000,000 s
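Clock speed and the time units above are linked by a simple reciprocal: the time between ticks is 1 divided by the frequency. A minimal sketch follows; the helper name `cycle_time_ns` and the 2.5 GHz figure are illustrative choices.

```python
MHZ = 1_000_000        # 1 million clock ticks per second
GHZ = 1_000_000_000    # 1 billion clock ticks per second


def cycle_time_ns(frequency_hz):
    """Length of one clock tick in nanoseconds."""
    return 1e9 / frequency_hz


print(cycle_time_ns(1 * MHZ))    # ns per tick at 1 MHz (i.e. 1 microsecond)
print(cycle_time_ns(1 * GHZ))    # ns per tick at 1 GHz
print(cycle_time_ns(2.5 * GHZ))  # ns per tick at 2.5 GHz
```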
Memory Technology
Typical performance and cost figures as of 2012
Technology      Typical Access Time          $/GB
SRAM            0.2 - 1 ns                   $500-$1000
DRAM            50 - 70 ns                   $10-$20
Flash Memory    5,000 - 50,000 ns            $0.75-$1.00
Magnetic Disk   5,000,000 - 7,000,000 ns     $0.05-$0.10
credit: Computer Organization and Design 5th ed. by Patterson and Hennessy pg. 378
• faster memory is more expensive
• 1 ns access time (i.e. 10^-9 seconds) means you can access
memory 1 billion (i.e. 10^9) times per second.
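To see why slow memory is a bottleneck, an access time can be turned into accesses per second and into the number of clock ticks the processor waits. This is a rough sketch: the function names are made up, and the 3 GHz clock is an assumed example rather than a figure from the table.

```python
def accesses_per_second(access_time_ns):
    """How many back-to-back accesses fit into one second."""
    return 1e9 / access_time_ns


def cycles_per_access(access_time_ns, clock_ghz):
    """Clock ticks that pass during one access (ns x ticks per ns)."""
    return access_time_ns * clock_ghz


print(accesses_per_second(1))        # 1 ns access time
print(accesses_per_second(70))       # 70 ns access time
print(cycles_per_access(70, 3.0))    # ticks lost per 70 ns access at 3 GHz
```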
Types of Memory
Two Types of Random Access Memory
• Static RAM (SRAM)
- expensive, but faster
- used for registers (typically 128-256 B for a processor)
• Dynamic RAM (DRAM)
- less expensive, but slower
- used for RAM (typically 4-8 GB for a laptop)
Goal
• make it seem like you have large amounts of fast memory
• approach: store commonly used data and instructions in fast
memory and store rarely used data and instructions in slow
memory
Types of Memory
Registers
• a small number that are directly manipulated by the processor
Caches (L1, L2, L3)
• stores the most commonly used data and instructions
Primary Storage / Main Memory (a.k.a. RAM)
• when you click on a program or file, you load it from the hard
disk into main memory in order to access it
Secondary Storage / Hard Disk (or even a network drive)
• where programs or files are stored when they are not being
used
Memory Hierarchy
Type          Size in Bytes         Access Time
Registers     100s                  less than 1
L1 Cache      10,000s               1s
L2 Cache      100,000s              10s
L3 Cache      1,000,000s            10s
Main Memory   1,000,000,000s        100s
Hard Drive    1,000,000,000,000s    100,000s
Network       virtually unlimited   100,000,000s
Access time is measured in clock cycles, i.e. it takes less
than 1 clock cycle to access data from a register.
Memory Hierarchy
fastest, most expensive, smallest capacity, closest
• registers
• cache L1, L2, L3
• main memory
• disk
• network
• off-site archive (tape, optical, etc.)
slowest, least expensive, largest capacity, farthest away
If memory was like sheets of paper and clock ticks were like
inches...
Primary Memory
• Includes registers, caches, and RAM
• Often called RAM (Random Access Memory) because it can
directly access any randomly chosen address in roughly the
same amount of time
• Characteristics: faster, expensive and volatile (disappears when
there is no power)
• Stores:
- all or part of the software program being executed
- the operating system programs that manage the operation
of the computer
- the data that the program is using
Secondary Storage
• Includes
- Hard drive (HD, or HDD)
- Optical drive (CD/DVD drive, Blu-ray drive)
- Flash drive (SSD, SD, USB flash drive)
• Often called external memory or external storage because it is
not directly accessible by the processor.
• Characteristics: slower, cheaper and non-volatile (permanent)
• Data and programs must be copied into primary storage before
the processor is able to access them directly.
Secondary Storage: Parts of a Hard Drive
source: http://www.quora.com/Why-is-the-physical-sizeof-a-hard-disk-drive-larger-compared-to-memory-cards
This video (0:00-2:15) shows the parts of a hard drive:
https://www.youtube.com/watch?v=kdmLvl1n82U
Secondary Storage: Hard Drives Basics
How it works
• Platter
- A set of disks stacked on top of each other, each with a
smooth magnetic coating on both sides of the disk.
- RPM: rotations per minute, i.e. how fast the disk is spinning
(5400 rpm and 7200 rpm are common)
- A higher RPM means the data can be accessed faster.
• An actuator arm moves across the disk to position the
read/write heads.
• The read/write head changes the orientation of the magnetic
field at a particular location to represent 0 or 1.
Secondary Storage: Hard Drives Basics
How it Works
• This video (0:00-0:55) shows a hard drive in action, e.g. booting
up, deleting a folder, etc:
https://www.youtube.com/watch?v=9eMWG3fwiEU
Some Parameters
• Mean Time Between Failures (MTBF)
- approximately one hundred thousand hours
• Follows a bathtub curve
- more likely to fail initially due to manufacturer error
- more likely to fail later due to wearing out
Secondary Storage: Hard Drive Reliability
Annualized Failure Rate (AFR)
• 0.7% – 0.8% for enterprise drives (what UW buys)
• 1.25% for consumer drives (what is in your laptop) if it is
replaced every 4 years (as of 2018).
• It has been reduced from 1.95% in the past few years.
• An annualized failure rate of 1.25% means on average roughly
(1.0 - 0.0125)^4 x 100% ≈ 95% of the drives would still be
working after 4 years.
• For some recent data see
https://www.backblaze.com/blog/hard-drive-stats-for-2018/
• The company has over 100,000 hard drives.
• It tracks failures and makes the data public.
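The 95% figure above comes from compounding the yearly survival rate. A small sketch of that arithmetic, assuming (as the slide does) that the failure rate is the same every year; the function names and the 100,000-drive fleet size are illustrative.

```python
def surviving_fraction(afr, years):
    """Fraction of drives still working after the given number of years."""
    return (1.0 - afr) ** years


def expected_failures_per_year(drive_count, afr):
    """Average number of drives expected to fail in one year."""
    return drive_count * afr


print(round(surviving_fraction(0.0125, 4) * 100, 1))   # percent after 4 years
print(expected_failures_per_year(100_000, 0.0125))     # for 100,000 drives
```

The constant-rate assumption is a simplification: the bathtub curve mentioned earlier means real failure rates are higher at the start and end of a drive's life.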
Secondary Storage: Solid State Drives
Some Parameters
• An alternative to a hard disk drive.
• Pros
- It is typically around 10x faster to access data
- It typically lasts longer.
- It has no moving parts that can wear out.
• Cons
- It is more expensive.
- It can wear out sooner than a hard disk drive when writing a
lot of data.
- The data can fade over time.
E.g. compare the Western Digital 1 TB hard disk drive vs. solid
state drive at bestbuy.ca
Secondary Storage: Optical Drives Basics
How it works
• E.g. CDs, DVDs, Blu-ray Disks.
• Very similar to a magnetic hard drive, except only one surface
(the bottom of the disc).
• It uses a laser and a mirror rather than an actuator arm and a
read/write head to read and write the data.
• The smooth aluminum surface reflects light very well to
represent a 0.
• The laser creates pits on the surface (which scatters light) to
represent a 1.
• Slower and less capacity than a hard drive but they are
inexpensive and durable.
Secondary Storage: Hybrid Drives
How it works
• combine a
- smaller SSD (which offers speed)
with a
- larger HDD (which offers large capacity at a small price)
• Software on the hybrid drive tracks which files are used often
and puts them on the SSD to achieve faster access for
commonly used files
- common strategy: optimize for the common case
• Price and performance between that of an HDD and an SSD.
E.g. Seagate FireCuda
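The idea of tracking which files are used often can be sketched with a simple counter. This toy model is an assumption about the general strategy, not a description of how any real hybrid drive's firmware works; the class `HybridDrive` and its methods are hypothetical.

```python
from collections import Counter


class HybridDrive:
    """Toy model: the most frequently read files live on the small SSD."""

    def __init__(self, ssd_slots):
        self.ssd_slots = ssd_slots        # how many files fit on the SSD
        self.access_counts = Counter()    # reads seen so far, per file

    def on_ssd(self):
        """The files currently promoted to the SSD."""
        most_common = self.access_counts.most_common(self.ssd_slots)
        return {name for name, _ in most_common}

    def read(self, filename):
        """Record a read and report which device served it."""
        served_from = "SSD" if filename in self.on_ssd() else "HDD"
        self.access_counts[filename] += 1
        return served_from


drive = HybridDrive(ssd_slots=1)
print(drive.read("report.docx"))   # first read comes from the HDD
print(drive.read("report.docx"))   # now popular, so served from the SSD
print(drive.read("old_photo.jpg")) # rarely used file stays on the HDD
```

This is "optimize for the common case" in miniature: only the popular file gets the fast, expensive storage.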
Secondary Storage: Assessing Performance
Some Key Measures
• Price per gigabyte ⇒ hard disk drives
- getting cheaper
• Capacity ⇒ hard disk drives
- getting larger
• Speed: typically measured in MB/s (megabytes per second) or
GB/s (gigabytes per second) ⇒ solid state drives
• Durability (look at how long the warranty period is) ⇒ answers
vary: some would say DVDs, others solid state drives.
Improving Performance
Will Adding RAM Improve Computer Performance?
• Answer: It depends.
• If there is not enough RAM (primary memory) to hold the whole
program (and some of the data) then the OS will use secondary
storage (the HDD or SSD).
• Secondary storage is much slower to access so this strategy (if
needed) will degrade system performance.
• If the OS never has to use this strategy (because there is
sufficient RAM) then adding more RAM will not help.
• Solution: Monitor how much RAM you are using. If you are
using near the limit (especially when you have a lot of
programs running / windows open / tabs open / documents
open) then adding more RAM will help.
Specialty Computers
Mainframes
The main characteristics of a mainframe computer are
• reliability (often with redundant parts)
• ability to hot swap, e.g. replace a failing hard drive while the
computer is still running and processing other transactions
• ability to support many users (e.g. 100,000 users) and process
their requests very quickly
- e.g. processing bank transactions, processing credit card
transactions, airline reservations
• ref: https://en.wikipedia.org/wiki/Mainframe_computer
E.g. IBM zSystems, Unisys ClearPath Libra, Hewlett-Packard
NonStop, Groupe Bull's GCOS, Fujitsu BS2000.
Specialty Computers
Supercomputers
• Main characteristic: fast floating point computations
• Main Use: For complex calculations like simulations, weather
forecasting and scientific computations
• Speed is measured in floating point operations per second
(FLOPS).
• The (currently) fastest supercomputer can do 200 PFLOPS, i.e.
200,000,000,000,000,000 FLOPS (more than a million times
faster than our podium computer).
• Use on the order of 100,000s of cores (processors).
• The challenge is managing the data on all these cores.
• ref: https://en.wikipedia.org/wiki/TOP500
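The prefixes in the 200 PFLOPS figure can be checked with a few lines of arithmetic. The 100 GFLOPS speed used for an ordinary computer below is a made-up round number for comparison, not a measurement of the podium computer.

```python
GFLOPS = 10 ** 9     # giga: a billion floating point operations per second
PFLOPS = 10 ** 15    # peta: a million billion operations per second

fastest = 200 * PFLOPS


def speedup(fast_flops, slow_flops):
    """How many times faster the first machine is than the second."""
    return fast_flops / slow_flops


def seconds_to_finish(operations, flops):
    """Time needed to perform the given number of operations."""
    return operations / flops


ordinary_pc = 100 * GFLOPS   # hypothetical figure, for illustration only

print(fastest)
print(speedup(fastest, ordinary_pc))
# What the supercomputer does in 1 second would take the PC this many days:
print(seconds_to_finish(fastest, ordinary_pc) / 86_400)
```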
Specialty Computers
Microcontrollers
• Main Characteristic: Simple processors with RAM and I/O
capabilities that cost as little as $0.25.
• Used in embedded systems, i.e. as part of home appliances,
office equipment, digital watches, traffic lights, robots, cars.
• Today’s (i.e. 2014) car has the computing power of 20 personal
computers, features about 100 million lines of programming
code, and processes up to 25 gigabytes of data an hour.
Source:
http://www.mckinsey.com/insights/manufacturing/whats_driving_the_connected_car
E.g. for some current examples of microcontrollers
https://www.digikey.com/products/en/integrated-circuits-ics/embedded-microcontrollers/685
New Subtopic: Evolution of IT Infrastructure
What is IT Infrastructure?
• Definition: The shared technology resources that provide the
platform for the firm’s information system applications.
- It includes investment in hardware, software, and services,
such as consulting, education, and training.
• It has evolved in five stages since the 1950s.
1. Mainframe / Minicomputer
2. Personal computer
3. Client/server
4. Enterprise computing
5. Cloud and Mobile Computing
• Each configuration is still around today in some form.
Ref: Section 5.1 in the course text.
Evolution of IT Infrastructure
Stage 1: Mainframe / Minicomputer
• Very expensive.
• One centralized system.
• Controlled by operators.
• Owned by large corporations
- e.g. banks, insurance companies
• Later, users interacted with the
mainframe directly via terminals.
• Minicomputers were cheaper and
came along later.
• A large university could have several
minicomputers.
Course text Figure 5-2
Evolution of IT Infrastructure
Stage 2: Personal Computers
• The computer is used by one person.
• Initially cost roughly $4,000 (after
adjusting for inflation).
• Could do simple word processing,
accounting and game playing.
• Users were technically sophisticated.
• Started off text-based but eventually
evolved to a graphical user interface
and a mouse.
• The software market was eventually
dominated by Microsoft.
Course text Figure 5-2
Evolution of IT Infrastructure
Stage 3: Client/Server
• Two types of machines: clients (typically inexpensive) and
servers (typically more expensive).
• Clients: request and use services provided by the servers
- e.g. students in this course
• Servers: run an application and provide it to others over a
network, e.g.
- Google searches,
- streaming music on Spotify,
- streaming video on YouTube,
- lecture slides on Learn,
- course selection on Quest.
Course text Figure 5-2
Evolution of IT Infrastructure
Stage 4: Enterprise Computing
• Link together different networks and applications throughout
the firm. Sometimes called integration.
• Link different types of hardware.
• Link different types of data formats.
• Use internet protocols for
the network.
• Create standards for the
data format.
• Use software to translate
between the various formats.
Course text Figure 5-2
Evolution of IT Infrastructure
Stage 5: Cloud and Mobile Computing
• An extension of client/server: rather than a single server,
clients use a shared pool of resources.
• The resources include:
- a cluster of computers
- software (e.g. Gmail, Google Docs)
- storage
• Can sell software applications as a
service delivered over the internet
- E.g. Microsoft’s Office 365
Course text Figure 5-2
Evolution of IT Infrastructure
Client/Server Architecture
• This is the most common form of distributed computing
architecture but it is not the only form.
Peer to Peer (P2P)
• Every machine in the network consumes and provides
service(s) at the same time.
• E.g. torrent sites, you can download files from other people’s
computers and they can download files from yours.
• Hard to control: there is no central computer “in charge.”
• Started out in software/game/music/video piracy but can also
be used to download updates.
New Subtopic: Drivers of Technology
Drivers of Technology
• The evolution in IT infrastructure has been driven by the
following five drivers of technology.
1. Moore’s Law
2. The Law of Mass Digital Storage
3. Metcalfe’s Law
4. Declining Communications Costs
5. The Creation of Technology Standards
• Why is this important: When designing a product that will be
available in 18 months, consider what the hardware
performance will be like in 18 months.
Ref: Section 5.1 in the course text.
Drivers of Technology
1. Moore’s Law
• The number of transistors that can fit on a chip doubles every 18
months.
• This law has been interpreted as:
- the power of microprocessors doubles every 18 months,
- computing power doubles every 18 months,
- the price of computing falls by half every 18 months.
• The trend has been true since 1959 but
as of 2010-2013 it looks to be slowing down.
• The graph on the next slide shows the number of transistors and
the millions of instructions per second (MIPS) a processor can execute.
• The trend also causes the cost of a single transistor to decrease.
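To get a feel for what "doubles every 18 months" implies, the Python sketch below computes the growth factor after a given number of months. It is an illustration only: it assumes perfectly regular doubling, which the slide itself notes has been slowing down, and the function name is made up for this example.

```python
def moores_law_factor(months, doubling_period_months=18):
    """Growth factor after `months`, assuming one doubling per period."""
    return 2 ** (months / doubling_period_months)

# How much more capacity to expect after 1.5, 3, 6 and 9 years.
for years in (1.5, 3, 6, 9):
    factor = moores_law_factor(years * 12)
    print(f"after {years} years: about {factor:.0f}x")
```

This is the reasoning behind planning for future hardware: a product shipping in 18 months can, under this assumption, count on roughly twice today's transistor budget.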
Drivers of Technology
1. Moore’s Law
Source: course text Figure 5-4
Drivers of Technology
1. Moore’s Law has Contributed to Decreasing Costs
Source: course text Figure 5-5
Drivers of Technology
2. Law of Mass Digital Storage
• Observation: The amount of digital information is roughly
doubling every year.
• The growth is exponential.
• Since 1990, the storage capacity of hard drives has increased at
a rate of 65% per year.
• The cost of storing a gigabyte is falling at an exponential rate,
being cut in half every 15 months.
• The textbook literally says “falling at an exponential rate of
100% per year”
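The different rates quoted on this slide can be put on a common footing. The Python sketch below is an illustration, not something from the course text: it converts "halved every 15 months" into a yearly cost factor and "65% growth per year" into a doubling time. It does not try to explain the textbook's "100% per year" wording.

```python
import math

def annual_cost_factor(halving_months=15):
    """Fraction of the cost left after 12 months if cost halves every `halving_months`."""
    return 0.5 ** (12 / halving_months)

def doubling_time_years(annual_growth=0.65):
    """Years needed to double at a fixed annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)

print(f"cost after one year: {annual_cost_factor():.1%} of the starting cost")
print(f"capacity doubling time: {doubling_time_years():.2f} years")
```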
Drivers of Technology
2. Hard Disk Drive Capacity
Observation: storage capacity grows exponentially
source: 5th edition of course text
Drivers of Technology
2. Data Storage per Dollar
The amount of data that can be stored per dollar doubles every
15 months.
Source: course text Figure 5-6
Drivers of Technology
3. Metcalfe’s Law
Observation: The value of a network grows much faster than the
number of network members; in its usual statement, the value is
proportional to the square of the number of members.
Image Source: http://www.collabworks.com/Main_WhatIsOpenIT/Metcalfe.htm
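One common way to make Metcalfe’s Law concrete is to count the possible pairwise connections among n members, n(n-1)/2, which grows roughly with n². The Python sketch below is illustrative; treating "value" as the number of possible links is an assumption made for this example, not a claim from the course text.

```python
def possible_links(members):
    """Number of distinct pairs that can be formed among `members` nodes."""
    return members * (members - 1) // 2

# Each tenfold increase in members gives roughly a hundredfold increase in links.
for n in (2, 10, 100, 1000):
    print(f"{n:>5} members -> {possible_links(n):>7} possible links")
```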
Drivers of Technology
4. Declining Communication Costs
Communication costs have been declining.
The lower the cost of communication ⇒ the more reliance on it
to conduct business.
Source: course text Figure 5-7
Drivers of Technology
5. The Creation of Standards
The creation of technology standards allows competition,
increases interoperability and reduces costs.
Some examples
• ASCII and Unicode standards for representing alphabets
• The Portable Operating System Interface (POSIX) for Unix and Linux
• TCP/IP to interconnect different networks (i.e. the internet)
• Ethernet and Wi-Fi to connect devices to the internet.
• HTML and the World Wide Web for the formatting and
displaying of text, pictures and video.
New Subtopic: Infrastructure Components
Drivers of Technology ⇒ Infrastructure Components
• There are seven (major) components of IT infrastructure.
• The choices must be coordinated
- i.e. a choice in one component affects the options available
in the other components.
1. Computer Hardware Platforms
2. Operating System (OS) Platforms
3. Enterprise Applications
4. Data Management and Storage
5. Network and Telecom Platforms
6. Internet Platforms
7. Service Platform
Ref: Section 5.2 in the course text.
1. Computer Hardware Platforms
Two Varieties of Machines
• Client machines: desktops, laptops, tablets and smart phones
• Server machines (i.e. specialized high-end computers)
- could be a single mainframe or
- could be a large number of rack servers or blade servers
(thin, modular computers without a dedicated keyboard or
monitor)
image source: https://www.dell.com/en-ca/work/shop/povw/poweredge-r230
1. Computer Hardware Platforms
• Companies like Google and Facebook have server farms,
collections of 100,000s of blade servers stored in racks in
large, windowless, air-conditioned rooms.
• This design takes up the least amount of space.
image source: https://www.computerhope.com/jargon/s/servfarm.htm
2. Operating System (OS) Platforms
• Definition: The OS manages a computer’s hardware and software
resources: processor, memory, peripherals, files, apps
• For laptops and desktops (in Q1 2013)
- 91% of PCs ran Microsoft Windows
- 6.5% ran macOS
• For smart phones (in Q1 2014)
- 71% ran Android (bought by Google, based on Linux)
- 19% ran iOS
- 8.1% ran Windows Phone
• For servers (in Q1 2013)
- 65% of servers in the US ran Unix or Linux
- 35% ran Windows
3. Enterprise Applications (EA)
• Role: Computer programs used by organizations that
integrate business applications and services across the many
different departments.
• E.g. a central database and programs used by Sales and
Marketing, Finance and Accounting, Human Resources,
Manufacturing and Production.
• E.g. Quest at UW
• Previously departments had their own databases and it was
hard to combine the data from all of them.
Currently, the largest suppliers of enterprise software are
SAP, Oracle, IBM and Microsoft.
4. Data Management and Storage
Database Management System (DBMS)
• Role: organize and store the company’s data
• Open source MySQL is available free of charge, and now
supported by HP and most consulting firms.
• Database server: you might need a server to run your DBMS,
particularly if it is to be accessible by several machines or even
through the Internet.
Currently, the leading database software providers are
Oracle, IBM (DB2), Microsoft (SQL Server), and Sybase.
• More on databases in Chapter 6 of course text.
4. Data Management and Storage
Data storage
• Major types: hard disk drives, tape drives, cloud-based storage
• Can use Redundant Array of Independent Disks (RAID) to
improve hard disk performance ...
Currently, the market is dominated by Western Digital,
Seagate and Toshiba.
• Tape drives are good for remote offsite backup (archiving) due
to their portability.
Currently, the market is dominated by IBM, HP, and Sony.
• Cloud-based storage will be discussed later ...
4. Data Management and Storage
RAID Storage Architecture
• Using many hard drives to achieve improvements in 1)
reliability, 2) availability, 3) performance and 4) capacity
• Currently 7 different types: RAID 0 - RAID 6.
• Each achieves a different balance of reliability, availability,
performance, and capacity.
• There are tradeoffs: e.g. having multiple copies of a file
increases reliability (if one copy gets damaged) but decreases
overall capacity.
4. Data Management and Storage
RAID Technique: Disk Mirroring
• Store a copy of the data on another disk
• Improved Reliability: if one disk fails, use the other
• Improved Read Performance: if one disk is busy, read the data
from the other disk
• Decreased Capacity: using twice as much space to store a file
4. Data Management and Storage
RAID Technique: Disk Striping
• store sequential data on alternating disks, e.g.
block 1 on disk 1, block 2 on disk 2,
block 3 on disk 1, block 4 on disk 2,
block 5 on disk 1, block 6 on disk 2, ...
• Improved Performance: bandwidth up to twice that of a single disk
• Decreased Reliability: a file is lost if even one of the two
disks fails.
For RAID 0: only striping is used.
For RAID 1: use mirroring (and possibly some striping)
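The block placement described for striping and mirroring can be sketched in a few lines of Python. This is a toy model in which lists stand in for disks; the function names are made up for illustration and real RAID controllers work at a much lower level.

```python
def stripe(blocks, num_disks=2):
    """Striping: block i is stored on disk i mod num_disks."""
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks

def mirror(blocks, num_disks=2):
    """Mirroring: every disk stores a full copy of the data."""
    return [list(blocks) for _ in range(num_disks)]

blocks = [f"block {i}" for i in range(1, 7)]
print(stripe(blocks))  # alternating blocks; losing either disk loses half of every file
print(mirror(blocks))  # two full copies; either disk alone can serve all reads
```

The trade-off is visible in the output: striping stores each block once (full capacity, no redundancy), while mirroring stores everything twice (half the capacity, one disk may fail).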
4. Data Management and Storage
RAID Techniques: Parity
• many different types of parity
• even parity: add either an extra 0 or an extra 1 at the end of a
sequence of bits in order to ensure that the number of 1’s in
the sequence is even.
• 1001000 has an even number of 1’s so add a 0
EvenParity(1001000) = 10010000
• 1001001 has an odd number of 1’s so add a 1
EvenParity(1001001) = 10010011
• 1001011 has an even number of 1’s so add a 0
EvenParity(1001011) = 10010110
• Parity can detect if a single error (or an odd number of errors)
has occurred in the storage of the data
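The even parity rule above is short enough to write out directly. The Python sketch below works on strings of "0" and "1" characters for readability; the function names are made up for this example.

```python
def even_parity(bits):
    """Append a parity bit so that the total number of 1s is even."""
    parity_bit = "1" if bits.count("1") % 2 else "0"
    return bits + parity_bit

def passes_even_parity(word):
    """Return True if the stored word still has an even number of 1s."""
    return word.count("1") % 2 == 0

stored = even_parity("1001001")
print(stored)
print(passes_even_parity(stored))      # unchanged word
print(passes_even_parity("10010010"))  # one bit flipped
print(passes_even_parity("10010000"))  # two bits flipped
```

A single flipped bit makes the count of 1s odd, so the check fails. Two flipped bits cancel out and the check passes, which is why parity only detects an odd number of errors.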
4. Data Management and Storage
Data Backup
• Online backup (hot backup)
- Instant real-time backup
- Protects against one HD failure
- Examples: RAID 1 and RAID 5
• Offline backup (archive)
- Done at the end of the day, copy and ship to a different
location
- Example: backup to tape drive
- Protect against complete failure, but can only recover data
from one day ago (or more).
- Full vs. incremental backup
5. Network and Telecom Platforms
Network Hardware
• network: a group of computers linked together to share
resources
• hub: any data sent to a hub is sent to all connected devices
• bridge: only one input and one output, looks at data and
decides whether to forward it across the bridge
• switch: has many ports, looks at data and decides which port
to send it out on
• router: like a switch but works on many more network protocols
• firewall: hardware or software (or both) put between the
internal network and the internet to prevent outsiders from
obtaining unauthorized access
5. Network and Telecom Platforms
Network Hardware
• Computers have a Network Interface Card (NIC)
- e.g. typically Ethernet, Wi-Fi or Bluetooth
Leading network hardware providers are Cisco,
Alcatel-Lucent, and Juniper Networks
• Network Operating Systems (NOS)
- manages features such as users, groups, file sharing, printer
access, security
NOS include Microsoft Windows Server, Linux, Cisco IOS and
Novell NetWare
5. Network and Telecom Platforms
Network Hardware
• Also includes telephone and cell phone services, telephones,
cell phones, telephone systems (PBXs, i.e. the telephone
equipment that sets up the extensions and voicemail for the
campus), automated attendants, call centre software, fax
machines (might be combined with the photocopier and
scanner)
Telecom service vendors include Rogers, Bell, Telus and Shaw,
plus regional carriers.
6. Internet Platforms
Internet Service Provider (ISP)
• provides the link from your home or company network to the
rest of the internet
• they own the telephone line and cable that runs to your home
or office (i.e. the last mile)
• many smaller regional ISPs lease the network from the ISPs and
provide their own customer service, tech support etc.
Major Canadian ISPs are Rogers, Bell and Shaw
6. Internet Platforms
Website Development
• Can hire others or create and maintain it yourself.
• Simple websites use languages like HTML (hypertext markup
language) and JavaScript
• Simple websites are typically static (i.e. the site does not
change unless a person edits the web page files)
• More sophisticated websites are dynamic (i.e. when a client
makes a query, a web page is created using a combination of
scripts and database queries in order to get the most recent
and relevant information).
• Many of the big players are using artificial intelligence to learn
what to present to you.
6. Internet Platforms
Website Development
• E.g. when you click on a YouTube video, besides providing the
video, the webpage also lists how many views it has had, how
many likes, how many dislikes, the latest comments, etc.
• Check back later and these values will change (for a popular
video) i.e. they were created dynamically
Programming languages for dynamic web pages include: PHP
by Rasmus Lerdorf, ASP.NET (Active Server Pages) by
Microsoft, JSP (JavaServer Pages) and Java by Oracle.
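The idea of a dynamic page can be sketched without a real web server. In the Python illustration below, a dictionary stands in for the database and a function builds the HTML at request time; the video id, field names and function name are all hypothetical, and a real site would use a DBMS and one of the server-side languages listed above.

```python
from html import escape

# Stand-in for a database table; a real site would query a DBMS here.
VIDEO_STATS = {"abc123": {"title": "Lecture 1", "views": 1500, "likes": 40}}

def render_video_page(video_id):
    """Build the page from the current data each time it is requested."""
    stats = VIDEO_STATS[video_id]
    return (
        "<html><body>"
        f"<h1>{escape(stats['title'])}</h1>"
        f"<p>{stats['views']} views, {stats['likes']} likes</p>"
        "</body></html>"
    )

first_visit = render_video_page("abc123")
VIDEO_STATS["abc123"]["views"] += 1   # someone else watches the video
second_visit = render_video_page("abc123")
print(first_visit)
print(second_visit)
```

A static page would be a fixed file on disk; here the two visits return different HTML even though nobody edited a web page file.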
6. Internet Platforms
Web Hosting
You can create your own or use a web hosting service. In order
to create your own you need...
• a server, i.e. powerful computer(s)
• a domain name and an IP address for your website (e.g. see
online tools nslookup and whois)
• a web server, i.e. software that the server runs to accept the
requests that web browsers make.
The two most common web servers are
1. Apache HTTP Server by the Apache Software Foundation, originally
based on the National Center for Supercomputing Applications (NCSA)
HTTPd server (roughly 60% market share)
2. Internet Information Services (IIS) by Microsoft (roughly 20%
market share)
6. Internet Platforms
More Information about Setting up your own Website
• Get a domain name and map it to your website:
http://www.thesitewizard.com/archive/registerdomain.shtml
• Set up your own website:
http://www.thesitewizard.com/gettingstarted/startwebsite.shtml
7. Service Platform
• A Service Platform is a collection of services that enable the
information system to function, i.e. consulting and system
integration services
• Most firms cannot develop their systems without significant
outside help including
- identifying which parts of the business can be improved by
using IT
- ensuring new systems integrate with legacy systems
- maintenance
- training
- security
A Few Comments About IT Infrastructure
• The term server could refer to hardware (the server machine)
or software (the server application) or both
• Some IT components may be bundled
- Machines might come with a preinstalled OS.
- Server OS = NOS.
- The Enterprise Application platform (coordinates activities
across many departments), Data Management platform
(database and storage) and Internet platform may dictate
the server machines needed
- Some EAs are bundled with their own DBMS
- Some EAs need to run on a server machine
- Some EAs are bundled with a service package (integration,
maintenance and training)
New Subtopic: Contemporary H/W Trends
Ref: Course text section 5.3
Key Topics
Eight contemporary hardware trends and two future ones
1. The mobile digital platform
2. Consumerization of IT and BYOD
3. Grid computing
4. Virtualization
5. Cloud computing
6. Green computing
7. High-performance and Power-saving processors
8. Autonomic Computing
9. Near Future: Nanotechnology & Quantum Computing
Hardware Technology Trends
Trend 1: Mobile Digital Platform
• Increasingly, internet access happens via highly portable
devices: smartphones and tablets
• Smart phones are taking over the functions of many other
electronic devices, e.g. GPS.
• Compare this old Radio Shack ad to a smart phone
http://www.trendingbuffalo.com/life/uncle-stevesbuffalo/everything-from-1991-radio-shack-ad-now/
• The integration of voice (the telephone network) and data
(computers) brings together two historically distinct global
networks.
Hardware Technology Trends
Trend 2: Consumerization of IT and BYOD
• BYOD = Bring Your Own Device (to work)
• Allow employees to bring their own device.
• Allow employees to use software services, such as Gmail,
Google, Facebook and Twitter.
• Key trend: consumerization of IT: technology that was meant
for the consumer moves into the business world.
• Companies must consider
- what can be used and what cannot be used
- security,
- software availability,
- ownership
- privacy
Hardware Technology Trends
Trend 3: Grid Computing
• Key Observation: processors are idle most of the time
- e.g. System Idle Process in Windows
• Idea: simulate a supercomputer by organizing the
computational power of a network of computers
• The computers may be geographically remote, have different OSs, etc.
• Benefit: capable of working on problems that require short-term
access to large computational capacity
• This is called grid computing.
• It requires software to control and allocate resources on the grid
• E.g. SETI: http://setiathome.berkeley.edu/
Hardware Technology Trends
Trend 3: Grid Computing Limitations
• Key Observation: some tasks can be broken up into smaller
independent tasks (parallelized)
- e.g. find all the occurrences of a keyword in a collection of
documents
• Key Observation: other tasks cannot be parallelized.
- e.g. calculate the Fibonacci number F(x) for some large x,
where F(n) = F(n-1) + F(n-2)
• Limitation: only tasks that can be parallelized can take
advantage of grid computing.
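The contrast between the two kinds of task can be sketched in Python. In this illustration, worker threads on one machine stand in for the computers on a grid, which is a simplifying assumption; the documents and function names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def count_keyword(document, keyword):
    """Count whole-word occurrences of keyword in one document."""
    return document.lower().split().count(keyword.lower())

def parallel_keyword_count(documents, keyword, workers=4):
    """Each document is searched independently, so the work splits cleanly."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda doc: count_keyword(doc, keyword), documents))
    return sum(counts)

def fibonacci(x):
    """Each step needs the previous two results, so the steps run one after another."""
    a, b = 0, 1
    for _ in range(x):
        a, b = b, a + b
    return a

documents = ["the grid is a grid", "no match here", "Grid computing"]
print(parallel_keyword_count(documents, "grid"))
print(fibonacci(10))
```

The keyword search has no dependencies between documents, so adding workers helps. The Fibonacci loop, written this way, cannot start step n until steps n-1 and n-2 are finished, so extra machines would sit idle.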
Hardware Technology Trends
Trend 4: Virtualization
• Virtualization: the creation of a virtual (rather than actual)
version of something, such as a hardware platform, operating
system, a storage device or network resources.
• many looks like one: e.g. many smaller hard drives can be
configured to look like one large one.
• one looks like many: e.g. a single powerful server can be
configured to look like many smaller computers
- hook up dozens of displays and keyboards to it
- the different virtual machines can even be running different
OSs (e.g. Windows 7, Windows 10, macOS and Linux)
Hardware Technology Trends
Trend 4: Virtualization
In a computer
• the hardware is managed by the
operating system (which is
software).
• the application software
interacts with the hardware through
the operating system (rather than
interacting directly with the
hardware).
Layers (top to bottom), with an example of each:
Application Software, e.g. Chrome
Operating System, e.g. Windows 10
Hardware, e.g. Dell Laptop
Hardware Technology Trends
Trend 4: Virtualization
What if you wanted to run a Linux app on
a Windows computer?
• Idea: Create software that simulates
hardware which the Linux OS could
run on.
• Run the Linux app in this environment.
• This setup would be running the Linux
App in a virtual Linux environment
which would be running in an actual
Windows environment.
• Example: VMware, VirtualBox
Layers (top to bottom): Linux App, Linux OS, Virtual Hardware,
Windows 10, Hardware
Hardware Technology Trends
Trend 4: Virtualization
Benefits
• Better resource management (when using one resource to look
like many) by using more of the processor’s capacity, less space,
less expense, less energy
• Support legacy applications by running older versions of the OS
• Testing: can test software on a variety of virtual configurations
Hardware Technology Trends
Scenario: Meeting Peak Demand
• Example: imagine an accounting system that handles 10,000
transactions per day with a peak demand of 20,000 during tax
season.
• Three technical options are available for this business.
5.1 Load balancing
5.2 Cloud computing
5.3 On-demand computing
Hardware Technology Trends
5.1 Load Balancing
• The workload is evenly distributed across many servers
• Creates a high availability computing system
• e.g. 4 servers that can each handle 6000 transactions per day, so
each operates at between 40% and 85% capacity
• Can deal gracefully with
- Crashes
- Upgrades
- Seasonal peak demands
• Average down time is drastically reduced.
• Downside: must purchase and maintain hardware (the two
extra servers) that is rarely used
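The capacity figures in the example follow from the scenario's numbers (10,000 transactions on a normal day, 20,000 at peak, 4 servers of 6000 each). The Python sketch below reproduces that arithmetic and shows one simple way to spread requests evenly. Round-robin is an assumption chosen for illustration; the slides do not say which balancing policy is used.

```python
from itertools import cycle

def utilization(total_transactions, servers, capacity_per_server):
    """Fraction of each server's capacity in use when load is spread evenly."""
    return (total_transactions / servers) / capacity_per_server

def round_robin(requests, servers):
    """Hand requests to servers in turn: A, B, C, D, A, B, ..."""
    assignment = {server: [] for server in servers}
    for request, server in zip(requests, cycle(servers)):
        assignment[server].append(request)
    return assignment

normal_load = utilization(10_000, 4, 6_000)
peak_load = utilization(20_000, 4, 6_000)
print(f"normal: {normal_load:.0%}, peak: {peak_load:.0%}")
print(round_robin(range(8), ["A", "B", "C", "D"]))
```

Two servers would cover the normal load on their own, which is why the other two count as rarely used extra hardware.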
Hardware Technology Trends
Trend 5.2: Cloud Computing
• The purchase, as a service from another company, of hardware
/ programming tools / software that is accessed over the
internet.
• Examples
- hardware: Amazon Web Services (AWS)
- software: Microsoft 365 (Cloud Version of Microsoft Office)
• for Microsoft 365 you pay a monthly subscription fee
• a particular form of cloud computing is ...
Hardware Technology Trends
Trend 5.3: On-Demand (Utility) Computing
• A form of cloud computing
• Firms off-load peak demand for computing power to remote,
large-scale data processing centers
• Firms pay only for the computing power they use, as with an
electrical utility
• Excellent for firms with spiked demand curves caused by
seasonal variations in demand, e.g. on-line shopping website
on Black Friday
• Saves firms from purchasing excessive levels of infrastructure
Hardware Technology Trends
Trend 5: Pros and Cons of the Cloud
• Pros:
- Cost: a less expensive way to cover peak demand
- Convenient: use as needed
- Flexible: not fixed to one brand of computers, usage may
easily increase or decrease
• Cons:
- Privacy: less control over it
- Liability: Google Cloud went down 6 times in 1 year
- Legal: must comply with Canadian privacy laws
- Loss of control
• Not for mission critical systems
Hardware Technology Trends
Trend 6: Green Computing
• Green computing is the design and use of computer systems in
a way that minimizes their impact on the environment.
- reduce power consumption
- reduce e-waste (old cell phones, old laptops)
- In Canada: https://www.recyclemyelectronics.ca/
• But must sanitize (i.e. erase data):
- https://dban.org/ (for HDDs)
- https://www.bleachbit.org/
Trend 7: High-performance and Power-saving processors
• Multicore processors where cores can disconnect from power
when not in use
• Energy efficient designs (fewer transistors)
Hardware Technology Trends
Trend 8: Autonomic Computing
• Computer systems have become so complex that the cost of
managing them has risen
- a significant portion of a company’s IT budget is spent
preventing or recovering from system crashes
- the most common cause is operator error
• Autonomic computing is an industry-wide effort to develop
systems that are capable of self-management, i.e. systems that
self-configure, self-protect, self-optimize and self-heal
• e.g. P2P (peer to peer) systems like Skype or the internet.
- If nodes go down, network still functions.
Hardware Technology Trends
Scenario: Business Rising
• Chris needs to set up 3 different servers:
- An Apache web server on Linux
- A MySQL DBMS server on Windows 10
- A very old accounting application on MS-DOS
• It is estimated that a small tower server ($5000) can handle the
workload of one category of service, while a medium size
tower server ($10,000) can handle all the workload
• How should this situation be handled?
Future Hardware Technology: Nanotechnology
What is it?
• Nanotechnology: Science of using nanostructures to build
devices.
• A nanometer is a billionth of a meter and is the size of a few
atoms or a small molecule.
• Nanotechnology uses individual atoms and molecules to create
computer chips and other devices
• Presently a transistor is about 14 nanometers wide and made
mostly of silicon (roughly 70 silicon atoms wide).
• The limit with this approach seems to be 5 nanometers.
• Looking for new materials and ideas to make smaller
transistors.
Future Hardware Technology: Quantum Computing
What is it?
• Classical Bit vs. Qbit
• Qbit is a superposition of 0 and 1
- 2 Qbits need four numbers to specify the state.
- 3 Qbits need eight numbers to specify the state.
⁞
- n Qbits need 2^n numbers to specify the state.
• Not a universal replacement for classical computers. It
reduces the number of steps needed to arrive at a result for
some problems, e.g. factoring.
• How Does a Quantum Computer Work? by Veritasium.
https://www.youtube.com/watch?v=g_IaVepNDT4
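The pattern in the list above (2 Qbits need four numbers, 3 need eight) is a power of two. The short Python sketch below is only a way to see how fast that count grows; the function name is made up for this example.

```python
def state_size(n_qbits):
    """How many numbers are needed to specify the state of n Qbits: 2 to the power n."""
    return 2 ** n_qbits

for n in (1, 2, 3, 10, 50):
    print(f"{n:>2} Qbits -> {state_size(n):,} numbers")
```

By 50 Qbits the count is already over a quadrillion, which gives a sense of why simulating a quantum computer on a classical one becomes impractical.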
New Subtopic: Contemporary S/W Trends
Ref: Course text section 5.4
Key Topics
We will look at four contemporary software platform trends
1. Linux and open-source software
2. HTML and HTML5
▪ Java not covered in this course
3. Web services and service-oriented architecture
4. Software outsourcing and cloud services
▪ mashups and apps not covered in this course
Software Technology Trends
Trend 1: Open-Source Software
• Open-source software is software whose source code is publicly
available and can be modified and redistributed by anyone for any
purpose.
• Different standards exist for open-source software e.g.
- Free Software Foundation (FSF) started in 1985
- Open Source Initiative (OSI) started in 1998
• Originally “free” meant free to inspect and modify, now it more
likely means available at no cost.
• Often developed and maintained by a worldwide network of
programmers and designers under the management of user
communities.
Software Technology Trends
Trend 1: Open-Source Software
• A company (e.g. Google) may fund an open source challenger
(e.g. Firefox) to another company’s product (e.g. Microsoft’s
Internet Explorer).
• A company (Sun Microsystems) may make a product they no
longer support (StarOffice) open source (now called
OpenOffice, which is a competitor to Microsoft Office).
Examples
Linux is the most widely used open-source operating system.
Other examples include Apache HTTP Web server, MySQL
database, and the programming language Python.
Software Technology Trends
Trend 1: Open-Source Software: Costs and Benefits
What are the benefits of open-source software?
• lower cost
• more security, fewer bugs - many people inspect code
• flexibility - may modify the code
• transparency - know exactly what the code does
• not reliant on a single vendor
What are the drawbacks of open-source software?
They are less likely to
• be easy to use
• meet customer needs
• be compatible with your particular hardware
• have support
Software Technology Trends
Trend 2: HTML and HTML5
The format for displaying information on the web.
• HTML stands for hypertext markup language
- hypertext refers to text that contains links to other text that
you can access quickly.
- markup language refers to a way of annotating and
presenting text, i.e. bold, italics, titles, subtitles etc.
• HTML originally did not support audio and video and so you
needed third party plugins
• The latest version (HTML5) supports audio and video.
The book mentions Java which is a common programming
language that we will not discuss in this course.
Software Technology Trends
Trend 3: Web Services and SOA
• Web services are software components that exchange
information with each other using web communication
standards and languages.
• The web provides well-known and well-supported standards
for presenting information.
• Web browsers use Hypertext Markup Language (HTML) which
specifies how text, graphics etc., is displayed in a browser.
• A generalization of HTML is eXtensible Markup Language (XML)
which can also specify what the data means.
Software Technology Trends
Trend 3: Web Services and SOA
• XML provides a format for (possibly different) programs to
exchange information
- E.g. it could specify that $16,800 represents the price in
Canadian dollars.
• Two different systems (possibly at different companies, with
different operating systems and different programs) can speak
a common language.
Ref: Section 5.4 in the course text.
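To make the price example concrete, the Python sketch below parses a small XML message with the standard library. The message itself is hypothetical: the element and attribute names (car, make, price, currency) are invented for illustration and do not come from the course text.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; the tag and attribute names are made up for illustration.
message = """
<car>
  <make>ExampleMake</make>
  <price currency="CAD">16800</price>
</car>
"""

root = ET.fromstring(message.strip())
price = root.find("price")
amount = int(price.text)
currency = price.get("currency")
print(f"{amount} {currency}")
```

Unlike HTML, which would only say how to display "16,800", the tags here say what the number means, so a program at another company can read the price and its currency without guessing.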
Software Technology Trends
Trend 3: Web Services and SOA
• The use of web services to achieve integration among different
applications and platforms is referred to as service-oriented
architecture (SOA)
• SOA is a cost effective way to adapt to new technology and to
integrate different applications
• E.g. a car rental company (say Dollar Rent A Car) can interact
with other companies’ websites (such as an airline, a tour
company etc.) by converting its information to the language of
the web.
• Now customers can book a flight, rent a car and book a tour all
at the same website.
Software Technology Trends
Source: course text Figure 5-10
Software Technology Trends
Trend 4: Software Outsourcing
Changing sources of outsourced software:
• Purchase customizable generic software package
e.g. SAP and Oracle-PeopleSoft
• Contract custom software development or maintenance to a
third party which could even be located in another country.
- started off as maintenance and data entry
- now also includes developing new software
• Use software available from the cloud, called software as a
service (SaaS)
e.g. Salesforce.com for customer relations management
New Subtopic: Management Issues
Ref: Course text section 5.5
Subtopics
We will look at managing IT infrastructure
1. Dealing with change
2. Management and governance
3. Infrastructure investments
a) Total cost of ownership
b) Competitive forces model
Management Issues
1. Dealing with Change
• firms need to be able to grow (or shrink)
• scalability: ability to expand to serve a larger (or smaller)
number of users without breaking down
2. Management and Governance
Who is responsible for the IT infrastructure?
• each department (decentralized)
• one overall IT department (centralized)
• mixture of both
Management Issues - TCO
3a) Infrastructure investments: Total cost of ownership
• There are different ways to estimate the total cost of
ownership (TCO).
• We will use the following: the acquisition costs for hardware
and software represent 20% of the TCO.
- It could range from 20-35% depending on what is bought.
- TCO is like an iceberg (only see part of it).
• Can break down TCO into
- Capital expenditure: fixed, one-time cost to acquire system.
- Operational expenditure: ongoing expenses for running it.
• The table on the next slide lists the various components that
contribute to the TCO.
Management Issues - TCO
Component          Cost
Hardware           Computers, cables, terminals, storage, printers
Software           Operating systems, applications
Installation       Staff to install computers and software
Training           Time and people for both developers and end users
Support            Ongoing technical support and help desks
Maintenance        Upgrades for hardware and software
Infrastructure     Networks and backup units
Downtime           Lost productivity during system failures
Space and Energy   Real estate, computer furniture and utility costs for
                   housing and powering the technology
Source: course text Table 5-3
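The 20% rule of thumb used in this course can be turned into a quick estimate. The sketch below is only an illustration: the function name `estimate_tco` and the $50,000 purchase price are hypothetical and do not come from the course text.

```python
def estimate_tco(acquisition_cost, acquisition_share=0.20):
    """Estimate total cost of ownership from the acquisition cost.

    acquisition_share is the fraction of the TCO spent on acquiring
    hardware and software (20% in these notes, with a range of 20-35%).
    """
    if not 0 < acquisition_share <= 1:
        raise ValueError("acquisition_share must be in (0, 1]")
    return acquisition_cost / acquisition_share


# Hypothetical purchase: $50,000 of hardware and software.
tco_at_20_percent = estimate_tco(50_000, 0.20)
tco_at_35_percent = estimate_tco(50_000, 0.35)
print(round(tco_at_20_percent), round(tco_at_35_percent))
```

The smaller the share attributed to acquisition, the larger the hidden part of the iceberg.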
Management Issues - TCO
3a) Infrastructure investments: Total cost of ownership
To get a sense of actual costs we will look at this report
http://www.nashnetworks.ca/pdf/TCOofIT.pdf
from Nash Networks
http://www.nashnetworks.ca/index.php
The report is 10 years old but gives a good sense of the issues.
There are two types of costs
1. direct costs, which include hardware, software, printer paper,
ink, internet costs
2. indirect (or hidden) costs, which include downtime, poorly
trained users, user mistakes, using computers for non-business
purposes, users installing accessories
Management Issues - TCO
3a) Infrastructure investments: Total cost of ownership
Direct Costs for a PC over a 3 - 4 Year Lifetime
Phase of Lifecycle                                   Cost
Purchase (computer; printer/scanner/fax;
cables, printer ink; paper)                          $3,090
Deployment (setup, staff downtime)                   $500
Operations (admin, downtime)                         $1,040
Support                                              $1,680
Retirement                                           $630
Total Cost                                           $6,940
Approximate Annual Costs                             $2,000
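The totals in this table can be checked with a few lines of arithmetic. In the sketch below the only assumption is the choice of 3.5 years (the midpoint of the 3 - 4 year lifetime) for annualizing the cost.

```python
# Direct costs per lifecycle phase, as listed in the table.
direct_costs = {
    "Purchase": 3090,
    "Deployment": 500,
    "Operations": 1040,
    "Support": 1680,
    "Retirement": 630,
}

total = sum(direct_costs.values())

# Assumption: annualize over the midpoint of the 3 - 4 year lifetime.
lifetime_years = 3.5
annual = total / lifetime_years

print(total)          # 6940
print(round(annual))  # 1983, i.e. roughly $2,000 per year
```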
Management Issues - TCO
3a) Infrastructure investments for Large Companies in 2007
IT budget
Average IT operating budget as % of revenue          5.5%
Average IT capital budget as % of revenue            2.5%
Average IT operating budget per employee             $9,100

IT spending by category
Hardware                                             26%
Software                                             20%
Support (staff, external providers, contractors)     41%
Telecommunications                                   13%
These costs are for companies with more than 2,500 employees.
Management Issues - TCO
3a) Infrastructure investments for Large Companies in 2007
• One way to reduce costs is to change how the computers are managed.
• There are two extremes
1. Unmanaged: users can install any application and change
any setting.
2. Locked and well-managed: users cannot install software or
change critical settings. There are policies in place to
restrict what an employee can do.
• The more a computer is managed, the smaller the indirect costs.
• See the graph on the next slide (again for large companies in
2007) for details.
Management Issues - TCO
Management Issues
3b) Infrastructure investments: Competitive forces
Consider six factors when deciding how much to spend on IT.
1. Demand for services: What services do you provide (to
customers, suppliers and employees)? Are their needs being
met?
2. Business strategy: What new capabilities will be needed to
achieve the firm’s strategic goals?
3. IT strategy: How will IT help achieve these goals?
4. IT assessment: Is your IT infrastructure too old or too new?
5. Competitor’s Services: What do your competitor firms offer
customers, suppliers and employees?
6. Competitor’s IT Investments: How much have they spent?
Topic 2 – Databases
Key Concepts
• flat files vs. relational databases
• attributes, records and tables
• primary keys, candidate keys, foreign keys
• schema and data independence
• database design and normalization
• data warehouses, data marts, online analytical processing, data
mining
References
• Course text, Chapter 6 Databases and Information
Management
Flat files vs. Databases
How to Store Data Digitally
• Two ways to store data digitally
- in a flat file (i.e. as one large table)
- in a relational database (i.e. many smaller tables)
• Key Question: What are the problems with storing data in a
traditional file environment? i.e. why doesn’t UW store all its
student information in one gigantic Excel spreadsheet?
First Name   Last Name   Student ID   Course   Grade
Chris        Lee         20158888     CS115    83
Chris        Lee         20158888     CS116    78
Chris        Lee         20158888     CS230    81
Chris        Lee         20158888     CS234    80
Storing Data in a Flat File
Benefits of a Flat File(s)
• simple to create and administer, can use a spreadsheet
• easy to understand
• all the data is stored in one place
• easy to sort or filter information
• good for one person processing a small amount of data
Key Question: how are the needs of storing data for one person
different from the needs of storing data for a large company?
Goal: accurate, timely, relevant information
Storing Data in a Flat File
Limitations of a Flat File
• lack of security: each person has access to the whole file or
none of it
- cannot give different people different views of the data
- cannot control who has access to what data
▪ want payroll info to be visible only to Payroll Dept
- do not know what they have changed
• lack of concurrent access: only one person can modify the
same file at a time
- concurrent access means multiple people (or programs) can
access the file at the same time
Storing Data in a Flat File
Limitations of a Flat File
• lack of data integrity
- redundancy: the same data is stored in many places
- e.g. if there is an update or an error is discovered
⇒ have to search the whole file(s) and change it in many places
- it is easy to miss a place, therefore …
- redundancy leads to inconsistency: different values for the
same attribute
• lack of scalability: the file can become very large and then
searching the whole file becomes much slower
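How redundancy turns into inconsistency can be shown with a small sketch. The flat file is modelled as a Python list of rows, and the helper `rename_first_match` is a hypothetical stand-in for a careless update that changes only one of the repeated copies.

```python
# A flat file modelled as a list of rows; the student's name is repeated
# in every row (redundancy).
flat_file = [
    {"first": "Chris", "last": "Lee", "sid": 20158888, "course": "CS115", "grade": 83},
    {"first": "Chris", "last": "Lee", "sid": 20158888, "course": "CS116", "grade": 78},
    {"first": "Chris", "last": "Lee", "sid": 20158888, "course": "CS230", "grade": 81},
]


def rename_first_match(rows, sid, new_last):
    """Change the last name in the first matching row only (the mistake)."""
    for row in rows:
        if row["sid"] == sid:
            row["last"] = new_last
            return


rename_first_match(flat_file, 20158888, "Li")

# The same student now has two different last names in the file.
last_names = {row["last"] for row in flat_file if row["sid"] == 20158888}
print(sorted(last_names))
```

If the name were stored once, in a separate table, there would be only one place to change.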
Storing Data in a Flat File
Limitations of a Flat File
• program-data dependence: the file format and the program
that processes the data are tied together (strongly coupled)
- change one (i.e. program or file format) and you must
change the other
- difficult to handle different user preferences
▪ Chris uses MS Word
▪ Kelly uses Adobe Acrobat
• lack of custom formats: cannot display info in different formats
for different people, e.g. for sales data
- Regional Directors want it organized by region
- Product Managers want it organized by product
Storing Data in a Flat File
Limitations of a Flat File(s)
If each department has its own copy of the data and its own
programs, there are more challenges.
• Each department will be tempted to develop its own processes
which use its own subset of data files (see next slide).
• The company will not have
- a single solution for security, backup/crash recovery
- centralized data administration
- a high amount of data sharing and availability
• E.g. on the next slide, different departments…
- use different programs and different subsets of data.
- duplicate data (e.g. A, B) from a master file (i.e. they are
using derivative files)
Storing Data in Many Derivative Files
Limitations of a Flat File(s)
Source: Course text Figure 6-2
Databases and DBMS
What are a Database and a DBMS?
• Database: a collection of related information stored in a
structured form
- The structure (think column headings of the tables) is
described by a schema
• Database Management System (DBMS): a collection of
programs that manipulate a database
- set up the storage structures
- perform updates on the data
- process queries (requests for data retrieval) from
applications and users
• The DBMS provides a central point of access to the data
Databases and DBMS
Why Use Databases and DBMSs?
• They provide data integrity
- reduce data redundancy and inconsistency
• They provide data independence from the program
- i.e. the data is stored in a standard format
• They provide security, concurrent access and crash recovery
- enable data sharing and high availability
• They provide centralized data administration
- for backing up and for access
• They reduce application development time because standard
software packages exist in the market
Data Models
Common Types of Databases
• There are different types of database models based on how
you view (or structure) the data
- Network model: model data as a network
- Hierarchy model: model data as a tree
- Relational model: model data as a table
- Object-oriented model: model data as objects
• The most popular model is the relational database.
Relational Databases
Overview
• First developed in the 1970s
• The most widely used type of database
- especially in business oriented transaction processing
- most businesses use them in some form or another
• Key Observation: information is related
- e.g. for customers, purchases, products, suppliers
customers make purchases
purchases list products
products have suppliers
Relational Databases
Structure
• Attribute (property an entity might have) or field: a column
- E.g. student number, given name, family name
- Attribute values must be atomic
▪ atomic: a single value such as a number, a character, a
string, a date, etc., e.g. CS330
▪ non-atomic: a list of values
e.g. a list of completed courses (could be many)
- Domain: set of allowed values for an attribute
▪ e.g. {CS115, CS116, CS230, CS234, CS330, CS338 ...}
▪ e.g. positive integer (for Student ID)
• Record or tuple: a row, i.e. a collection of attribute values
- all rows (in a table) have the same number of values
- each row is distinguishable from the other rows
Relational Databases
Student ID   First Name   Last Name
20158888     Chris        Lee
20158889     Terry        Lee
20158890     Terry        Dodd

Student ID   Course   Grade
20158888     CS115    83
20158888     CS116    78
20158888     CS230    81

row: record or tuple
column: attribute or field

Intro to Relational Databases Video
http://www.youtube.com/watch?v=eXiCza050ug
It talks about MS Access but what it says is true of all relational
databases.
Relational Databases
Structure
• On the previous slide the fields are: Student ID, First Name, Last
Name, Course, and Grade.
• A table (typically stored in a file) is a group of records
- it relates (connects) rows to columns
• A relation is a set of rows (tuples or records)
- i.e. it is a set of related entities.
- In mathematics a relation is a connection between two entities,
- e.g. the Student ID, First Name and Last Name in a row are all
related (i.e. associated with a particular student),
- e.g. the Student ID, Course and Grade are all related in the
second table.
• A relational database is a collection of tables (relations)
Relational Databases
Structure
Course text, Figure 6-4
In the table above,
• the attributes are: Supplier_Number, Supplier_Name,
Supplier_Street, Supplier_City, Supplier_Province, Supplier_PC
(i.e. the column headings).
• The records are the rows, i.e. the attributes of each supplier.
• Each row represents an entity, i.e. a person, place, thing, event
about which information is maintained.
Relational Databases
Structure
Course text, Figure 6-4
In the table above,
• the attributes are: Part_Number, Part_Name, Unit_Price,
Supplier_Number (i.e. the column headings).
• The records are the rows, i.e. the attributes of each entity.
• In this table, the entities are parts.
Relational Databases: Keys
Primary Keys
• Primary key: a minimum set of attribute(s) whose values are
unique in each row of a table
• Used to uniquely identify (and retrieve) individual entities
(rows), i.e. no two rows in a table have the same primary key.
• In the table below Student ID uniquely identifies a student
whereas First Name does not.
• Sometimes a primary key is created and assigned in order to
ensure that each entity has a unique key (e.g. Student ID).

Student ID   First Name   Last Name
20158888     Chris        Lee
20158889     Terry        Lee
20158890     Terry        Dodd
⁞            ⁞            ⁞
Relational Databases: Keys
Primary Keys
Both tables are from the course text, Figure 6-4
In the table above, the primary key is the Part_Number.
In the table below, the primary key is the Supplier_Number.
Relational Databases: Keys
Primary Keys
• Sometimes it takes two or more attributes to uniquely identify
a row, i.e. to create a primary key.
• This combination of attributes is called a composite key.
• E.g. in the table below, the pair (Student ID, Course) uniquely
identifies each row whereas Student ID by itself does not.
Student ID   Course   Grade
20158888     CS115    83
20158888     CS116    78
20158888     CS230    81
⁞            ⁞        ⁞
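A DBMS enforces a composite key once it is declared. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table name `Completed` and the column names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Completed (
        StudentID INTEGER,
        Course    TEXT,
        Grade     INTEGER,
        PRIMARY KEY (StudentID, Course)  -- composite key
    )
""")
conn.executemany(
    "INSERT INTO Completed VALUES (?, ?, ?)",
    [(20158888, "CS115", 83), (20158888, "CS116", 78), (20158888, "CS230", 81)],
)

# Same student and same course again: this violates the composite key.
try:
    conn.execute("INSERT INTO Completed VALUES (20158888, 'CS115', 90)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Completed").fetchone()[0]
print(duplicate_rejected, row_count)
```

Repeating the Student ID alone is accepted; only a repeated (Student ID, Course) pair is refused.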
Relational Databases: Keys
Primary Keys
• Sometimes there may be more than one key which could be used
as the primary key.
• These keys are called candidate keys and one of them is
designated as the primary key.
- E.g. at UWaterloo both your student number and your UW
userid are unique.
Student ID   First Name   Last Name   userid
20158888     Chris        Lee         c17lee
20158889     Terry        Lee         t47lee
20158890     Terry        Dodd        tdodd
⁞            ⁞            ⁞           ⁞
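One way to express two candidate keys is to pick one as the primary key and declare the other unique. This is a sketch using Python's sqlite3 module; the fourth student ("Casey") is hypothetical and exists only to trigger the duplicate check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        StudentID INTEGER PRIMARY KEY,  -- candidate key chosen as primary key
        FirstName TEXT,
        LastName  TEXT,
        userid    TEXT NOT NULL UNIQUE  -- the other candidate key
    )
""")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    [
        (20158888, "Chris", "Lee", "c17lee"),
        (20158889, "Terry", "Lee", "t47lee"),
        (20158890, "Terry", "Dodd", "tdodd"),
    ],
)

# Either candidate key identifies exactly one student.
by_id = conn.execute(
    "SELECT FirstName, LastName FROM Students WHERE StudentID = ?", (20158889,)
).fetchone()
by_userid = conn.execute(
    "SELECT FirstName, LastName FROM Students WHERE userid = ?", ("t47lee",)
).fetchone()

# A second student with an existing userid is refused.
try:
    conn.execute("INSERT INTO Students VALUES (20158891, 'Casey', 'Lee', 'c17lee')")
    duplicate_userid_rejected = False
except sqlite3.IntegrityError:
    duplicate_userid_rejected = True

print(by_id, by_userid, duplicate_userid_rejected)
```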
Relational Databases: Keys
Foreign Keys
• One of the goals of good database design is to minimize
redundancy.
• E.g. in the table below the “Chris”, “Lee” and “20158888” are
repeated many times.
First Name   Last Name   Student ID   Course   Grade
Chris        Lee         20158888     CS115    85
Chris        Lee         20158888     CS116    76
Chris        Lee         20158888     CS230    80
Chris        Lee         20158888     CS234    84
Chris        Lee         20158888     CS330    80
⁞            ⁞           ⁞            ⁞        ⁞
Relational Databases: Keys
Foreign Keys
• Foreign key: a field in a table that is a primary key in another
table
• The foreign key is used to link different tables together and
avoid redundancy.
Student ID   First Name   Last Name
20158888     Chris        Lee
20158889     Terry        Lee
20158890     Terry        Dodd

Student ID   Course   Grade
20158888     CS115    83
20158888     CS116    78
20158888     CS230    81
Relational Databases: Keys
Primary and Foreign Keys Example
• SID (i.e. Student ID) is the primary key for the Students table
Students (SID, First Name, Last Name)
• CID (i.e. Course ID) is the primary key for the Courses table
Courses (CID, Instructor, Term, Building, Room, Time)
• How do we express the courses students take and the grade
they receive for each course?
• We create a Completed table with two foreign keys: CID and SID.
• The Completed table represents the relationship “students
complete courses”, and each row in a table (e.g. Students or
Courses) represents one complete item in that relation.
Relational Databases: Keys
Completed
CID     SID        Grade
CS115   20158890   83
CS116   20158888   78
⁞       ⁞          ⁞

Students
SID        First Name   Last Name
20158888   Chris        Lee
20158889   Terry        Lee
⁞          ⁞            ⁞

SID links the completed course to the student’s information.
CID links the completed course to the course details.

Courses
CID     Instructor   Term   Bldg   Room   Time
CS116   C. Smith     S19    MC     4040   1:00-2:20 TTh
CS135   J. Doe       S19    MC     4041   2:30-3:50 TTh
⁞       ⁞            ⁞      ⁞      ⁞      ⁞
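The Students, Courses and Completed tables can be sketched with Python's sqlite3 module to show what the two foreign keys buy us. The Courses table is trimmed to three columns for brevity, and the student number 99999999 is a made-up value that matches no student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
    CREATE TABLE Students (SID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Courses  (CID TEXT PRIMARY KEY, Instructor TEXT, Term TEXT);
    CREATE TABLE Completed (
        CID   TEXT    REFERENCES Courses(CID),   -- foreign key
        SID   INTEGER REFERENCES Students(SID),  -- foreign key
        Grade INTEGER,
        PRIMARY KEY (CID, SID)
    );
    INSERT INTO Students VALUES (20158888, 'Chris', 'Lee'), (20158889, 'Terry', 'Lee');
    INSERT INTO Courses  VALUES ('CS116', 'C. Smith', 'S19'), ('CS135', 'J. Doe', 'S19');
    INSERT INTO Completed VALUES ('CS116', 20158888, 78);
""")

# A grade for a student who does not exist is rejected.
try:
    conn.execute("INSERT INTO Completed VALUES ('CS116', 99999999, 60)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Follow both foreign keys to rebuild the full picture on demand.
rows = conn.execute("""
    SELECT s.FirstName, s.LastName, c.CID, c.Instructor, m.Grade
    FROM Completed AS m
    JOIN Students AS s ON s.SID = m.SID
    JOIN Courses  AS c ON c.CID = m.CID
""").fetchall()
print(orphan_rejected, rows)
```

The student's name and the instructor's name are each stored once, yet the query can still list them next to every grade.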
Relational Databases
Another Example
Both tables are from the course text, Figure 6-4
In the table above, Supplier_Number is a foreign key because it is
the primary key in the table below. ⇒ It links information about a
supplier to information about a part.
Relational Databases - Keys
Exercise: What are possible Primary Keys?
Student (SID, email, address, phone, birthday, social insurance #)
a) SID?
b) email?
c) Phone?
d) (SID, Address)?
e) social insurance number?
S19 Courses (CID, Instructor, Term, Bldg, Room, Time, Cap)
• What are the possible primary key(s)?
Database Management Systems
Problem 1
• What if your database is growing so large that you need to split
it over multiple hard drives?
• Ideally: you would want to avoid modifying all your programs
when this happens.
Solution
• Separate how the data is stored (the Physical Schema or
Physical View) from how the data is used (the External Schema
or the Logical View).
Database Management Systems
Problem 2
• What if different users (say the Payroll Clerk and Benefits Clerk)
are interested in different parts of the database?
Solution
• Create a single global view of the data (the Global View a.k.a.
the Conceptual Schema) that feeds into many individual views
of the data (the External Schema a.k.a. the Logical View) for
the different user groups.
Database Management Systems
[Figure: the Conceptual Schema (Global View) feeding into several
External Schemas (Logical Views)]
Database Management Systems
Three Schema Architecture
• External Schema (or Logical View)
- how the data is displayed to a particular user
- different views for different user groups
- the rest of the database is hidden from that user
- e.g. Payroll sees net pay, the IT department does not.
• Conceptual Schema (or Global View)
- a global description of the whole database (all the data)
- unbiased towards any particular group of users
- we focus on this level
• Physical Schema (or Physical View)
- how the data is physically stored and organized
- what data is in which file on which disk
Database Management Systems
Example of the Three Views
Consider the attribute “birthday” with the value “June 20 1994”
• Physical Schema (or Physical View)
- a pattern of 0’s and 1’s located on a magnetic disk
• Conceptual Schema (or Global View)
- the date would be located in a particular row, under the
column heading Birthday, in a particular table in a
particular file
• External Schema (or Logical View)
- to display someone’s age, the DBMS could subtract their
birthday from the current date
Database Management Systems
Three Schema Architecture
• Why have these three layers?
• Answer: Data independence: i.e. the separation of logic, storage
and presentation.
• Can change software without changing the data and vice versa.
• Just like file systems: regardless of where or how the file is
stored, you can open it.
• Easier to manage and control (e.g. want some users to only see
age but not exact date of birth).
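In a relational DBMS an external schema is often implemented as a view. The sketch below uses Python's sqlite3 module; the table `Employee`, the view `StaffDirectory` and the sample row are assumptions made for illustration, with the birthday taken from the earlier example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the full table, including the exact birthday.
    CREATE TABLE Employee (SIN INTEGER PRIMARY KEY, Name TEXT,
                           Birthday TEXT, NetPay REAL);
    INSERT INTO Employee VALUES (1, 'Chris Lee', '1994-06-20', 3200.0);

    -- External level: one user group sees a name and an age only.
    CREATE VIEW StaffDirectory AS
        SELECT Name,
               CAST(strftime('%Y', 'now') AS INTEGER)
               - CAST(strftime('%Y', Birthday) AS INTEGER)
               - (strftime('%m-%d', 'now') < strftime('%m-%d', Birthday)) AS Age
        FROM Employee;
""")

cur = conn.execute("SELECT * FROM StaffDirectory")
columns = [d[0] for d in cur.description]
name, age = cur.fetchone()
print(columns, name, age)
```

Users of the view never see SIN, Birthday or NetPay, and the view keeps working if the DBMS moves the underlying table to a different disk.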
Database Management Systems
Data Independence
• Key Idea: remove details related to data storage and access
from application programs
• Concentrate those functions in single subsystem: the
Database Management System (DBMS).
• Have all applications access data through the DBMS.
• Make applications independent of data storage and make the
display of data independent of data logic.
Database Management Systems
SQL
• All modern relational databases support SQL.
• It is the most commonly used language to create, manage and
query a database.
• SQL statements can be embedded in other programming
languages (C/C++, Java, Python etc.)
• The SQL command to access a database is often generated on
the fly, behind the scenes
- i.e. users specify what they want and click the search button.
- e.g. http://www.lib.uwaterloo.ca
- e.g. https://cs.uwaterloo.ca/cscf/teaching/schedule/
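How a program embeds SQL and builds the query from what the user typed can be sketched as follows. This uses Python's sqlite3 module; the `Schedule` table and the function `search_schedule` are hypothetical and are not how the UW sites above are actually implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Schedule (Course TEXT, Term TEXT, Room TEXT);
    INSERT INTO Schedule VALUES ('CS116', 'S19', 'MC 4040'),
                                ('CS135', 'S19', 'MC 4041');
""")


def search_schedule(connection, course):
    """Run the query for whatever the user typed into the search box."""
    # The ? placeholder lets the database driver insert the user's text safely.
    sql = "SELECT Course, Term, Room FROM Schedule WHERE Course = ?"
    return connection.execute(sql, (course,)).fetchall()


results = search_schedule(conn, "CS116")
print(results)
```

The user only supplies the course name; the SQL statement stays behind the scenes.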
Database Management Systems
Operation: Select
Many operations on a table would involve obtaining information
from particular rows or columns
• Select finds the rows that match certain criteria
- e.g. select parts with part_number 137 or 150

Part_Number   Part_Name       Unit_Price   Supplier_Number
137           Door latch      22.00        8259
150           Door moulding   6.00         8263
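In SQL the select operation corresponds to a WHERE clause. A minimal sketch with Python's sqlite3 module follows; the table name `Part` is assumed, and part 145 is a hypothetical extra row added so that the selection has something to filter out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Part (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                       Unit_Price REAL, Supplier_Number INTEGER);
    INSERT INTO Part VALUES
        (137, 'Door latch',    22.00, 8259),
        (145, 'Side mirror',   12.00, 8444),  -- hypothetical extra row
        (150, 'Door moulding',  6.00, 8263);
""")

# Select: keep only the rows that match the criteria.
selected = conn.execute(
    "SELECT * FROM Part WHERE Part_Number IN (137, 150) ORDER BY Part_Number"
).fetchall()
print(selected)
```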
Database Management Systems
Operation: Join
• Join adds relevant columns from another table, say Suppliers
Part_Number   Part_Name       Unit_Price   Supplier_Number
137           Door latch      22.00        8259
150           Door moulding   6.00         8263

Part_Number   Part_Name       Unit_Price   Supplier_Number   Supplier_Name        ∙∙∙
137           Door latch      22.00        8259              CBM Inc.             ∙∙∙
150           Door moulding   6.00         8263              Jackson Composites   ∙∙∙
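In SQL the join operation matches rows from two tables on a shared column, here Supplier_Number. The sketch uses Python's sqlite3 module; the table names `Part` and `Supplier` are assumed, and only the supplier name column is included for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
    CREATE TABLE Part (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                       Unit_Price REAL, Supplier_Number INTEGER);
    INSERT INTO Supplier VALUES (8259, 'CBM Inc.'), (8263, 'Jackson Composites');
    INSERT INTO Part VALUES (137, 'Door latch', 22.00, 8259),
                            (150, 'Door moulding', 6.00, 8263);
""")

# Join: extend each part row with the matching supplier's columns.
joined = conn.execute("""
    SELECT p.Part_Number, p.Part_Name, p.Unit_Price, p.Supplier_Number, s.Supplier_Name
    FROM Part AS p
    JOIN Supplier AS s ON s.Supplier_Number = p.Supplier_Number
    ORDER BY p.Part_Number
""").fetchall()
print(joined)
```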
Database Management Systems
Operation: Project
• Project would only include certain columns …

Part_Number   Part_Name       Unit_Price   Supplier_Number   Supplier_Name        ∙∙∙
137           Door latch      22.00        8259              CBM Inc.             ∙∙∙
150           Door moulding   6.00         8263              Jackson Composites   ∙∙∙

• e.g. project the columns Part_Number, Part_Name,
Supplier_Number and Supplier_Name

Part_Number   Part_Name       Supplier_Number   Supplier_Name
137           Door latch      8259              CBM Inc.
150           Door moulding   8263              Jackson Composites
Database Management Systems
Data Manipulation
• The contents of a database can be accessed using a data
manipulation language which specifies the contents to extract.
• e.g. The following MySQL query generated the table below.

Part_Number   Part_Name       Supplier_Number   Supplier_Name
137           Door latch      8259              CBM Inc.
150           Door moulding   8263              Jackson Composites
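The query text itself is not reproduced in these notes. A query along the following lines would produce this table by combining select, join and project. It is only a sketch: it runs on SQLite through Python's sqlite3 module rather than MySQL, the table names `Part` and `Supplier` are assumed, and part 145 with supplier 8444 are hypothetical extra rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
    CREATE TABLE Part (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                       Unit_Price REAL, Supplier_Number INTEGER);
    INSERT INTO Supplier VALUES (8259, 'CBM Inc.'),
                                (8263, 'Jackson Composites'),
                                (8444, 'Example Supplier Ltd.');  -- hypothetical
    INSERT INTO Part VALUES (137, 'Door latch',    22.00, 8259),
                            (145, 'Side mirror',   12.00, 8444),  -- hypothetical
                            (150, 'Door moulding',  6.00, 8263);
""")

report = conn.execute("""
    SELECT p.Part_Number, p.Part_Name,                 -- project: chosen columns
           s.Supplier_Number, s.Supplier_Name
    FROM Part AS p
    JOIN Supplier AS s                                 -- join: add supplier data
      ON s.Supplier_Number = p.Supplier_Number
    WHERE p.Part_Number IN (137, 150)                  -- select: chosen rows
    ORDER BY p.Part_Number
""").fetchall()
print(report)
```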
Database Management Systems
Data Definition
• The contents of a database must be clearly defined using a
data definition language which specifies the type of each
attribute / field / column heading. E.g.
CREATE TABLE Parts
(Part_Number     number,
 Part_Name       text,
 Unit_Price      currency,
 Supplier_Number number,
 PRIMARY KEY (Part_Number));
• It could also specify valid ranges (for numbers) and whether
duplicate values are allowed.
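Valid ranges and duplicate rules can be written directly into the table definition. The sketch below uses SQLite types through Python's sqlite3 module, so the type names differ from those above; the rule that prices must be non-negative and the rule that part names must be unique are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Parts (
        Part_Number     INTEGER PRIMARY KEY,
        Part_Name       TEXT NOT NULL UNIQUE,          -- no duplicate names
        Unit_Price      REAL CHECK (Unit_Price >= 0),  -- valid range
        Supplier_Number INTEGER
    )
""")
conn.execute("INSERT INTO Parts VALUES (137, 'Door latch', 22.00, 8259)")


def is_rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


negative_price = is_rejected("INSERT INTO Parts VALUES (150, 'Door moulding', -6.00, 8263)")
duplicate_name = is_rejected("INSERT INTO Parts VALUES (151, 'Door latch', 5.00, 8263)")
duplicate_key = is_rejected("INSERT INTO Parts VALUES (137, 'Door lock', 31.00, 8259)")
print(negative_price, duplicate_name, duplicate_key)
```

Because the rules live in the data definition, every application that uses the table gets the same checks.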
Database Management Systems
Limitations of Relational Databases
• Multimedia data: graphics, audio, video
- Tables (i.e. rows and columns of data) don’t handle
multimedia data well
• Arrays of data (all the same type of data indexed with a natural
number)
• Unstructured text: e-mail, text messages, tweets, user
comments
• Hierarchical data
- Example: Taxonomy of Organisms
- Hierarchy of categories: kingdom, phylum, class, order,
family, genus, species
Database Management Systems
Hierarchical Database
animals
  chordates
    vertebrates
      birds
      reptiles
      mammals
  arthropods
    insects
    spiders
    crustaceans
• A tree captures the relationship among the data: a parent (e.g.
vertebrates) can have many children (e.g. birds, reptiles, …).
• How would you design a relational schema for this?
• Not as common as relational databases.
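One common relational answer is an adjacency list: each row stores the name of its parent, and a recursive query walks down the tree. The sketch below uses Python's sqlite3 module; the table name `Taxon` is hypothetical, and the exact parent of each category is assumed from the usual taxonomy rather than stated in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Taxon (
        Name   TEXT PRIMARY KEY,
        Parent TEXT REFERENCES Taxon(Name)  -- NULL for the root
    );
    INSERT INTO Taxon VALUES
        ('animals', NULL),
        ('chordates', 'animals'), ('arthropods', 'animals'),
        ('vertebrates', 'chordates'),
        ('birds', 'vertebrates'), ('reptiles', 'vertebrates'),
        ('mammals', 'vertebrates'),
        ('insects', 'arthropods'), ('spiders', 'arthropods'),
        ('crustaceans', 'arthropods');
""")

# Everything below 'chordates', found by repeatedly following Parent links.
descendants = [row[0] for row in conn.execute("""
    WITH RECURSIVE below(Name) AS (
        SELECT Name FROM Taxon WHERE Parent = 'chordates'
        UNION ALL
        SELECT t.Name FROM Taxon AS t JOIN below AS b ON t.Parent = b.Name
    )
    SELECT Name FROM below ORDER BY Name
""")]
print(descendants)
```

It works, but the recursive query is noticeably more awkward than walking a tree, which is the limitation this slide points at.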
Database Management Systems
Network Database
[Figure: network of record types: University, Department, Student,
Course, Section, Completed]
• Unlike a hierarchical database, a child (i.e. Completed) can have
multiple parents (i.e. Section and Student).
• These databases can be faster than relational databases.
• But they are not as common as relational databases.
Database Management Systems
Object-oriented (OO) Database
• Many applications need to store and retrieve text, graphics,
audio and video (i.e. multimedia)
• Organizing the database as tables with rows and columns does
not handle multimedia very well.
• OO Databases store both
- the types of data and
- the procedures that manipulate the data
• Relatively slow because of complexity.
• Support many OO concepts like inheritance and polymorphism
- a grad student is a type of student with additional attributes
(fields)
Database Management Systems
Object-oriented (OO) Database
• Inheritance: a Grad_Student includes all the attributes of a
Student plus possibly some additional ones (i.e.
thesis_supervisor, office, office_phone)
• Polymorphism: A Student and a Grad_Student can respond to many
of the same operations, e.g. get_student_number
• Object-oriented databases are becoming more popular.

Student
• student_number
• name
• userID
• major

Grad Student
• thesis_supervisor
• office
• office_phone
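Inheritance and polymorphism can be sketched in a few lines of Python. The class layout follows the Student and Grad Student boxes; the `describe` method and the sample values are hypothetical additions for illustration.

```python
class Student:
    def __init__(self, student_number, name, user_id, major):
        self.student_number = student_number
        self.name = name
        self.user_id = user_id
        self.major = major

    def get_student_number(self):
        return self.student_number

    def describe(self):
        return f"{self.name} ({self.major})"


class GradStudent(Student):
    """Inheritance: a grad student has every Student attribute plus three more."""

    def __init__(self, student_number, name, user_id, major,
                 thesis_supervisor, office, office_phone):
        super().__init__(student_number, name, user_id, major)
        self.thesis_supervisor = thesis_supervisor
        self.office = office
        self.office_phone = office_phone

    def describe(self):
        return f"{super().describe()}, supervised by {self.thesis_supervisor}"


people = [
    Student(20158888, "Chris Lee", "c17lee", "CS"),
    GradStudent(20158890, "Terry Dodd", "tdodd", "CS", "Dr. Example", "DC 0000", "x00000"),
]

# Polymorphism: the same calls work on both kinds of object.
numbers = [p.get_student_number() for p in people]
descriptions = [p.describe() for p in people]
print(numbers, descriptions)
```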
Database Design
Criteria for a Good Design
What are the criteria for a good database design?
- Correctness
- Completeness: it characterizes all the data
- Minimum redundancy: it cannot be completely eliminated in
all cases.
Database Design
Steps in Database Design
1) Identify what data to store and the relationships between
the entities.
- Use an entity-relationship (ER) diagram to capture this data.
2) Convert the ER diagram into tables
- Use a set of mapping rules which we will cover briefly.
3) Fine-tune your design
- Apply the normalization process to remove redundancy.
Example: Company Database
Step 1a: Identify the Data
• An Employee has a name, sex, address, salary, SIN, birthday,
works for a department, works on projects, might have a
supervisor (who is also an employee).
• A department has a name, a department number, a manager
(who is also an employee), located in one or more cities.
• A manager has a starting date.
• A project has a name, a project number and a location. It is
controlled by a department.
• The number of hours an employee works on a project should
be recorded for performance evaluation.
Example: Company Database
Step 1b: Create an ER Diagram
[Figure: ER diagram of the company database]
Source: Fundamentals of Database Systems by Ramez Elmasri
Example: Company Database
Step 1b: Create an ER Diagram: Entities and Relationships
• The rectangles are entities (things, nouns) that we store data
about, e.g. EMPLOYEE, DEPARTMENT, PROJECT
• The diamonds are relationships between the entities.
- E.g. WORKS_FOR, MANAGES, SUPERVISION, CONTROLS,
WORKS_ON
• Sometimes we store data about the relationship.
- E.g. StartDate, Hours
• There are different formats for ER diagrams.
- We will use Chen notation but not cover all its details.
Example: Company Database
Step 1b: Create an ER Diagram: Attributes
• The ovals are the pieces of data (attributes) that we store
about entities or their relationships,
- e.g. Salary, Address, Sex, Name, SIN, …
• The primary keys for each entity are underlined in the ovals.
• Double ovals are pieces of data that we store that can have
more than one value (multivalued attributes),
- e.g. location of the department (i.e. Waterloo and Toronto).
• Dashed ovals are pieces of data that can be derived from other
attributes,
- e.g. NumberOfEmployees.
Example: Company Database
Step 1b: Create an ER Diagram
Relationships can be …
• 1:1 (one to one): A department has only one employee that
manages it and a manager manages only one department.
• 1:N (one to many): One department has many employees that
works_for it.
• N:M (many to many): An employee works_on many projects
and a project has many employees.
Example: Company Database
Step 1b: Create an ER Diagram
Relationship R can be connected to an entity E by a
• single line meaning not every entity E participates in the
relationship R, called partial participation, e.g.
- not every employee is a supervisor
- not every employee is a supervisee (e.g. the CEO)
- not every department controls a project
• double line meaning every entity participates in the
relationship, called total participation, e.g.
- every employee works on 1 or more projects
- every project has employees working on it
- every department has employees and a manager
Example: Company Database
Step 2: Map the ER Diagram to DB Tables
• There are about a dozen different rules for the mapping.
• We only introduce a few simple ones.
• An entity from an ER diagram is represented as a table.
• Relationships are represented either as
1. foreign keys in one of the entities’ tables
2. or they get their own table.
• For each table you create, you must find a primary key (P-key)
to uniquely identify a single record (i.e. row) in the table.
Example: Company Database
Step 2: Map the ER Diagram to DB Tables
a) Entities get mapped to tables (just the headings of each table
are shown)
EMPLOYEE (Bdate, SIN, Fname, Minit, Lname, Sex, Address, Salary)
DEPARTMENT(Dname, Dnumber, Location)
PROJECT(Pname, Pnumber, Location)
b) For 1:1 relationships: place the P-key from one entity into the
other entity’s table
For EMPLOYEE manages DEPARTMENT add MgrSIN to the
DEPARTMENT table (i.e. it is a foreign key in DEPARTMENT).
DEPARTMENT(Dname, Dnumber, Location, MgrSIN)
Example: Company Database
Step 2: Map the ER Diagram to DB Tables
c) For 1:N relationships: place the P-key from the “1” entity into
the “N” entity’s table.
For EMPLOYEE works_for DEPARTMENT add the
DEPARTMENT’s Dnumber to the EMPLOYEE table.
EMPLOYEE (Bdate, SIN, Fname, Minit, Lname, Sex, Address, Salary,
Dnumber)
For DEPARTMENT controls PROJECT add Dnumber to the
PROJECT table.
PROJECT(Pname, Pnumber, Location, Dnumber)
Example: Company Database
Step 2: Map the ER Diagram to DB Tables
c) For 1:N relationships: place the P-key from the “1” entity into
the “N” entity’s table
For EMPLOYEE supervisor EMPLOYEE add SupSIN to the
EMPLOYEE table
EMPLOYEE (Bdate, SIN, Fname, Minit, Lname, Sex, Address, Salary,
Dnumber, SupSIN)
Example: Company Database
Step 2: Map the ER Diagram to DB Tables
d) For N:M relationships: create a new table with a composite
P-key (composed of P-keys from both entities) and include any
associated data, e.g. hours.
- For EMPLOYEE works_on PROJECT create a WORKS_ON
table
WORKS_ON (SIN, Pnumber, Hours)
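The mapping rules from Step 2 can be sketched as table definitions. This uses Python's sqlite3 module; the column types and all sample rows (names, numbers, hours) are hypothetical and chosen only to exercise the foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE DEPARTMENT (
        Dnumber  INTEGER PRIMARY KEY,
        Dname    TEXT,
        Location TEXT,
        MgrSIN   INTEGER REFERENCES EMPLOYEE(SIN)        -- 1:1 manages
    );
    CREATE TABLE EMPLOYEE (
        SIN     INTEGER PRIMARY KEY,
        Fname   TEXT, Minit TEXT, Lname TEXT,
        Bdate   TEXT, Sex TEXT, Address TEXT, Salary REAL,
        Dnumber INTEGER REFERENCES DEPARTMENT(Dnumber),  -- 1:N works_for
        SupSIN  INTEGER REFERENCES EMPLOYEE(SIN)         -- 1:N supervision
    );
    CREATE TABLE PROJECT (
        Pnumber  INTEGER PRIMARY KEY,
        Pname    TEXT,
        Location TEXT,
        Dnumber  INTEGER REFERENCES DEPARTMENT(Dnumber)  -- 1:N controls
    );
    CREATE TABLE WORKS_ON (
        SIN     INTEGER REFERENCES EMPLOYEE(SIN),
        Pnumber INTEGER REFERENCES PROJECT(Pnumber),
        Hours   REAL,
        PRIMARY KEY (SIN, Pnumber)                       -- N:M works_on
    );
""")

# Hypothetical data. The department is created first without a manager,
# because its manager must already exist as an employee.
conn.execute("INSERT INTO DEPARTMENT VALUES (5, 'Research', 'Waterloo', NULL)")
conn.execute("INSERT INTO EMPLOYEE (SIN, Fname, Lname, Dnumber, SupSIN) "
             "VALUES (111, 'Pat', 'Singh', 5, NULL)")
conn.execute("INSERT INTO EMPLOYEE (SIN, Fname, Lname, Dnumber, SupSIN) "
             "VALUES (222, 'Sam', 'Roy', 5, 111)")
conn.execute("UPDATE DEPARTMENT SET MgrSIN = 111 WHERE Dnumber = 5")
conn.execute("INSERT INTO PROJECT VALUES (10, 'Website', 'Toronto', 5)")
conn.executemany("INSERT INTO WORKS_ON VALUES (?, ?, ?)",
                 [(111, 10, 12.5), (222, 10, 30.0)])

hours = conn.execute("""
    SELECT e.Fname, w.Hours
    FROM WORKS_ON AS w JOIN EMPLOYEE AS e ON e.SIN = w.SIN
    WHERE w.Pnumber = 10
    ORDER BY e.Fname
""").fetchall()
print(hours)
```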
Example: Company Database
Step 2: Map the ER Diagram to DB Tables
Note we have greatly simplified things here.
I.e. we have not talked about how to deal with
• derived attributes like: NumberOfEmployees
• multivalued attributes like: Locations
• entities connected to a relationship by a single line vs. a
double line
Example: Company Database
Step 3: Normalize the design
• The last step in database design is to normalize it.
• Normalization is a process to minimize the redundancy in a
design.
• There are different levels of strictness for reducing redundancy.
• We will cover this subject only briefly by learning one way to
check for and remove some redundancy.
Normalization
Functional Dependency
• Boyce-Codd Normal Form (BCNF): every attribute for an entity
depends only on the candidate key(s) (and not some other
attributes as well).
• What do we mean by depends?
• Functional Dependency (FD): A → (B, C) means...
- B and C depend on A
- i.e. the value of A determines the values of B and C
Normalization
Functional Dependency
• Observations
- There may be many different students with the same last
name.
- But each student has a unique student number.
• Conclusion
- The last name depends on student number.
- Student number does not depend on last name.
• Given your student number, I can look up your last name.
• But given (only) your last name, I cannot find out (for sure) what
your student number is.
• Write this dependency as: student number → last name
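A dependency like this can be tested against sample data. The helper `holds` below is a hypothetical sketch. Sample data can only disprove a dependency; whether it truly holds depends on the rules of the organization (e.g. that student numbers are never reused).

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand side must always come with the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


students = [
    {"student_number": 20158888, "last_name": "Lee"},
    {"student_number": 20158889, "last_name": "Lee"},
    {"student_number": 20158890, "last_name": "Dodd"},
]

number_to_name = holds(students, ["student_number"], ["last_name"])
name_to_number = holds(students, ["last_name"], ["student_number"])
print(number_to_name, name_to_number)
```

The second check fails because two different student numbers share the last name Lee.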
Normalization
More Examples
Examples of functional dependencies:
• employee-number → employee-name
• course-number, section-number, term → lecture-room
• course-number, section-number, term → instructor
Examples that are not functional dependencies:
• employee-name ↛ employee-number
• lecture-room ↛ course-number
• instructor ↛ course-number
• last-name ↛ colour-of-socks-you-wore-today
Normalization
Looking for Functional Dependencies
What are the functional dependencies in the following Emp
(employee) table?
Emp (EName, SIN, BDate, Address, DNum, DName, MgrSIN)
where
EName – employee name
SIN – social insurance number
BDate – birthday
DNum – department number
DName – department name
MgrSIN – department manager’s social insurance number
Normalization
Looking for Functional Dependencies
What are the functional dependencies?
• social insurance number (SIN) determines: employee name
(EName), birthday (BDate), Address, DNum (Department
number)
i.e. SIN → EName, BDate, Address, DNum
• department number (DNum) determines: department name
(DName) and department manager social insurance number
(MgrSIN)
i.e. DNum → DName, MgrSIN
The table is not in BCNF: DName depends on SIN (a primary key)
and DNum (not a primary key, repeated many times in the table)
Normalization
Looking for Functional Dependencies
Recall: Boyce-Codd Normal Form (BCNF): every attribute for an
entity only depends on the candidate key(s)
Solution
• Break the table into two tables Emp(Employee) and Dept
(Department), each with their own primary key:
1. Emp (EName, SIN, BDate, Address, DNum)
2. Dept(DNum, DName, MgrSIN)
Now all the attributes of
• Emp are determined by SIN and
• Dept are determined by DNum
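The split into two tables can be sketched as two projections of the original table. This is a minimal illustration, not a DBMS feature: the helper project and the three employee rows are invented for this example.

```python
def project(rows, attributes):
    """Keep only the given attributes and drop duplicate rows (relational projection)."""
    unique = {tuple(row[a] for a in attributes) for row in rows}
    return [dict(zip(attributes, values)) for values in sorted(unique)]

# Hypothetical rows of the original, unnormalized Emp table.
emp_wide = [
    {"EName": "Ann", "SIN": "111", "BDate": "1990-01-05", "Address": "1 King St",
     "DNum": 10, "DName": "Sales", "MgrSIN": "900"},
    {"EName": "Bob", "SIN": "222", "BDate": "1985-07-19", "Address": "2 Queen St",
     "DNum": 10, "DName": "Sales", "MgrSIN": "900"},
    {"EName": "Cy", "SIN": "333", "BDate": "1992-03-30", "Address": "3 Duke St",
     "DNum": 20, "DName": "Research", "MgrSIN": "901"},
]

emp = project(emp_wide, ["EName", "SIN", "BDate", "Address", "DNum"])
dept = project(emp_wide, ["DNum", "DName", "MgrSIN"])
```

In emp_wide the Sales department details are stored twice; in dept each department appears exactly once, so a change of manager is made in one place.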
Another Example: Ordering Parts
Sample Order
Determining the functional dependencies first is another way to
build an ER Diagram. E.g.
• An order consists of an Order Number, a Date and a list of parts
and their suppliers (called line items).
Order: 19330
Date: June 10, 2019

Part #  Part Name      Quantity  Unit Price  Supplier
137     Door latch     200       $22.00      8259: CBM Inc., 74 5th Ave, Saint John, NB, E2M 5T3
150     Door moulding  300       $6.00       8263: Jackson Components, 82 Micklin St, Hamilton, ON, L9H 7M4
152     Door lock      300       $31.00      8259: CBM Inc., 74 5th Ave, Saint John, NB, E2M 5T3
Another Example: Ordering Parts
Look for Functional Dependencies
First consider the dependencies.
Supplier
Supplier_Number → Supplier_Name, Supplier_Street, Supplier_City,
Supplier_Province, Supplier_PC
Part
Part_Number → Part_Name, Unit_Price, Supplier_Number
Line_Item
Order_Number, Part_Number → Part_Quantity
Order
Order_Number → Order_Date
Another Example: Ordering Parts
Look for Functional Dependencies
Use the dependencies to build the tables:
Supplier (Supplier_Number, Supplier_Name, Supplier_Street,
Supplier_City, Supplier_Province, Supplier_PC)
Part (Part_Number, Part_Name, Unit_Price, Supplier_Number)
Line_Item (Order_Number, Part_Number, Part_Quantity)
Order (Order_Number, Order_Date)
From these tables you can build the ER Diagram.
Example taken from section 6.2 of the course text.
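As a rough sketch, the four tables can be created with Python's built-in sqlite3 module and loaded with the sample order 19330. The column types, the in-memory database and the ISO date format are assumptions made for this example; the table and column names follow the slides. Order is a reserved word in SQL, so it is quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Supplier (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT,
    Supplier_Street TEXT, Supplier_City TEXT, Supplier_Province TEXT, Supplier_PC TEXT);
CREATE TABLE Part (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT, Unit_Price REAL,
    Supplier_Number INTEGER REFERENCES Supplier(Supplier_Number));
CREATE TABLE "Order" (Order_Number INTEGER PRIMARY KEY, Order_Date TEXT);
CREATE TABLE Line_Item (Order_Number INTEGER REFERENCES "Order"(Order_Number),
    Part_Number INTEGER REFERENCES Part(Part_Number), Part_Quantity INTEGER,
    PRIMARY KEY (Order_Number, Part_Number));
""")
conn.executemany("INSERT INTO Supplier VALUES (?, ?, ?, ?, ?, ?)", [
    (8259, "CBM Inc.", "74 5th Ave", "Saint John", "NB", "E2M 5T3"),
    (8263, "Jackson Components", "82 Micklin St", "Hamilton", "ON", "L9H 7M4"),
])
conn.executemany("INSERT INTO Part VALUES (?, ?, ?, ?)", [
    (137, "Door latch", 22.00, 8259),
    (150, "Door moulding", 6.00, 8263),
    (152, "Door lock", 31.00, 8259),
])
conn.execute('INSERT INTO "Order" VALUES (?, ?)', (19330, "2019-06-10"))
conn.executemany("INSERT INTO Line_Item VALUES (?, ?, ?)", [
    (19330, 137, 200), (19330, 150, 300), (19330, 152, 300),
])

# Rebuild the line items of the printed order by joining three of the tables.
order_lines = conn.execute("""
    SELECT p.Part_Number, p.Part_Name, li.Part_Quantity, p.Unit_Price, s.Supplier_Name
    FROM Line_Item li
    JOIN Part p ON p.Part_Number = li.Part_Number
    JOIN Supplier s ON s.Supplier_Number = p.Supplier_Number
    WHERE li.Order_Number = 19330
    ORDER BY p.Part_Number
""").fetchall()
```

The address of CBM Inc. is stored once in Supplier even though the supplier appears on two line items.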
Abilities of a Database
Can a DBMS find the following?
• Filter: List the names of students who scored in the 90s in both CS
330 and STAT 371
• Predict: What would the monthly sales figure be if we raised
prices by 10%? Lowered them by 5%?
• Filter: Find all the professors that taught/are teaching the
student Terry Lee.
• Predict: Find out how likely students are to pass CS 330 if they get
80+ in CS 115.
• Summarize: Plot the CS 330 grade distribution according to the
programs its students are enrolled in.
Abilities of a Database
Can a DBMS find the following?
• A Database can...
- record data,
- search for an item,
- filter and project (i.e. select certain records and attributes)
- group information,
- summarize (max, min, average, count)
• A Database cannot perform
- statistical analysis: what-if, forecasting, correlation
• This limitation is why we need Data Warehouses, Data Marts,
Online Analytical Processing and Data Mining
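The filter and summarize abilities can be illustrated with a small sqlite3 session. This is a sketch only: the Grade table, the student names other than Terry Lee, and all of the marks are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Grade (student TEXT, course TEXT, mark INTEGER)")
conn.executemany("INSERT INTO Grade VALUES (?, ?, ?)", [
    ("Terry Lee", "CS 330", 93), ("Terry Lee", "STAT 371", 91),
    ("Sam Roy", "CS 330", 95), ("Sam Roy", "STAT 371", 78),
    ("Kim Wu", "CS 330", 72), ("Kim Wu", "STAT 371", 96),
])

# Filter: students in the 90s for both CS 330 and STAT 371.
both_90s = conn.execute("""
    SELECT student FROM Grade
    WHERE course IN ('CS 330', 'STAT 371') AND mark BETWEEN 90 AND 99
    GROUP BY student
    HAVING COUNT(DISTINCT course) = 2
""").fetchall()

# Group and summarize: count, minimum, maximum and average mark per course.
summary = conn.execute("""
    SELECT course, COUNT(*), MIN(mark), MAX(mark), AVG(mark)
    FROM Grade GROUP BY course ORDER BY course
""").fetchall()
```

Both queries only report on the stored data. Neither one predicts what a future mark will be.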
Business Intelligence and Analytic Tools
Data Warehouse
• Definition: A decision support database that is maintained
separately from the organization’s operational database.
• I.e. it provides information to help make decisions.
• How: It stores data (both current and historic) that could be of
interest to a decision maker.
• A data warehouse is
- integrated (i.e. connected to corporate databases)
- time-variant (takes into account data that changes over
time)
- non-volatile (does not delete old entries)
Business Intelligence and Analytic Tools
Data Warehouse
• Use a database to keep track of day-to-day transactions (i.e.
ordering from suppliers, making products, selling to
customers).
• Use a data warehouse to find patterns in the data and to
provide insights. E.g.
- What products cost the most to maintain?
- What products cost the most to develop?
- What products have the lowest defect rate?
- Did changing suppliers impact our defect rate?
• Then make decisions based on the patterns in the data.
Business Intelligence and Analytic Tools
Data Warehouse Components
Course text, Figure 6-12
Business Intelligence and Analytic Tools
Why have a Separate Data Warehouse?
• Performance
- Operational databases (which are not data warehouses) are
tuned for day-to-day transactions and workloads, i.e.
processing daily transactions.
- Complex queries (which take a long time to process) would
degrade performance for processing daily transactions.
- Special data organization, access and implementation
methods needed for complex queries.
Business Intelligence and Analytic Tools
Why have a Separate Data Warehouse?
• Function
- Decision support requires historical data (up to 5 to 10 years
of data).
- Consolidates data from many operational systems as well as
external sources (GDP, foreign exchange rates, inflation)
- Data quality considerations (how trustworthy is the data)
Business Intelligence and Analytic Tools
Benefits of Data Warehouses
http://www.youtube.com/watch?v=KGHbY_Sales
Examples
• https://www.ibm.com/analytics/data-warehouse
• https://azure.microsoft.com/en-us/services/sql-data-warehouse/
• https://cloud.google.com/bigquery/
• https://aws.amazon.com/redshift/
Business Intelligence and Analytic Tools
Data Warehouse vs. Data Marts
Data Warehouse
• Collects information about multiple subjects that span the
entire organization
• requires extensive business modeling
• may take years to design and build
Data Marts
• departmental subsets that focus on selected subjects
- E.g. marketing data mart that focusses on customers,
products and sales
• faster roll out (compared to a data warehouse)
• more complex to integrate all the data marts in the long run.
Business Intelligence and Analytic Tools
Online analytical processing (OLAP)
• Traditional database queries look for answers in (two-dimensional) tables.
• Online analytical processing (OLAP) supports multidimensional
data analysis.
• This feature enables users to view the same data broken down
in different ways along different dimensions e.g.
- by product, by regions, by time period, by cost, by price.
- E.g. How have actual sales compared with predicted sales in
each region and for each product since June?
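Viewing the same data along different dimensions can be sketched in a few lines of Python. The sales records, the dimension names and the function roll_up are all hypothetical. Real OLAP tools work on much larger data and precompute many of these totals.

```python
from collections import defaultdict

# Hypothetical fact records: (product, region, month, actual_sales, predicted_sales)
facts = [
    ("latch", "East", "June", 120, 100),
    ("latch", "West", "June", 80, 90),
    ("lock", "East", "June", 50, 60),
    ("lock", "West", "July", 70, 65),
    ("latch", "East", "July", 110, 105),
]
DIMENSIONS = {"product": 0, "region": 1, "month": 2}

def roll_up(records, *dims):
    """Total actual and predicted sales grouped by the chosen dimensions."""
    totals = defaultdict(lambda: [0, 0])
    for record in records:
        key = tuple(record[DIMENSIONS[d]] for d in dims)
        totals[key][0] += record[3]
        totals[key][1] += record[4]
    return {key: tuple(value) for key, value in totals.items()}

by_region = roll_up(facts, "region")
by_product_and_region = roll_up(facts, "product", "region")
```

The same five records answer different questions depending on which dimensions are passed to roll_up.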
Business Intelligence and Analytic Tools
Online analytical processing
• Here data is being considered along three dimensions
1. product
2. region
3. actual vs. predicted
Dimensions, Measures, Hierarchy and Grain
Course text, Figure 6-13
https://www.youtube.com/watch?v=qkJOace9FZg
OLAP: https://www.youtube.com/watch?v=2ryG3Jy6eIY
Business Intelligence and Analytic Tools
Data Mining
• Instead of making a query, tools automatically analyze large
pools of data to find hidden patterns, infer rules and predict
trends, e.g.
- Associations: customers who buy X will likely buy Y if it is on
sale.
- Sequences: customers who buy X will typically buy Y within
two months.
- Classifications: types of customers who are likely to stop
using your product.
- Clusters: group together similar customers.
- Forecasts: predict what some future values will be based on
current trends.
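As a small illustration of an association, the sketch below measures how often customers who buy X also buy Y. The measure is commonly called the confidence of the rule X → Y; that name, the function and the shopping baskets are not from the slides and are assumptions for this example.

```python
def confidence(baskets, x, y):
    """Fraction of baskets containing x that also contain y."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0
    return sum(1 for b in with_x if y in b) / len(with_x)

# Hypothetical purchase data, one set of items per customer visit.
baskets = [
    {"chips", "salsa", "pop"},
    {"chips", "salsa"},
    {"chips", "pop"},
    {"bread", "milk"},
]

chips_to_salsa = confidence(baskets, "chips", "salsa")  # 2 of the 3 chips baskets
```

A data mining tool would compute measures like this for every pair of items and report only the strongest ones, rather than waiting for someone to ask about a specific pair.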
Managing Data Resources
Information Policy
• A database stores information but organizations also need
policies for how it is used.
• An information policy specifies organizational rules for sharing,
disseminating, acquiring, standardizing, classifying and
inventorying information.
- What data and information to store
- How to store, manage and use it
- Who can access what
▪ E.g. who can access and change an employee’s salary.
• Database administration manages the structure and content of
corporate databases as well as access rules and security.
Managing Data Resources
Ensuring Data Quality
Try to find and correct errors in the data, e.g. different versions of
someone’s name.
• A data quality audit is a structured survey of the accuracy and
completeness of data in an information system, i.e. check the
data.
- Look at all (or a sample of) the data.
- Ask end users for their opinion of the data.
• Data cleansing consists of activities for detecting and
correcting errors in the data in an information system
- E.g. is the postal code accurate for the address?
- Enforces consistency
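One simple cleansing step is to put every postal code into a single consistent format. The sketch below assumes Canadian postal codes of the form A1A 1A1; the function name is made up. It checks the format only and cannot tell whether the code is the right one for the address.

```python
import re

POSTAL_CODE = re.compile(r"[A-Z]\d[A-Z] \d[A-Z]\d")

def clean_postal_code(raw):
    """Return the code in 'A1A 1A1' form, or None if it cannot be repaired."""
    compact = re.sub(r"[\s-]+", "", raw).upper()  # drop spaces and hyphens
    if len(compact) != 6:
        return None
    candidate = compact[:3] + " " + compact[3:]
    return candidate if POSTAL_CODE.fullmatch(candidate) else None

print(clean_postal_code("e2m5t3"))     # E2M 5T3
print(clean_postal_code(" L9H-7M4 "))  # L9H 7M4
print(clean_postal_code("12345"))      # None
```

Records that come back as None would be flagged for a person to review.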
Topic 3 – Networking
Key Concepts
• principal components of a network
• common types of networks, transmission media and internet
connections
• principal technologies and standards for networking
References
• Course text, Chapter 7 Telecommunications, the Internet, and
Wireless Technology
Overview of Computer Networks
Computer Networks (a Review)
• A computer network is two or more computers connected
together so that they can share resources
• Network components (on each machine):
- a Network Interface Card (NIC): allows a computer to be
connected to the network
▪ e.g. Google image search “ethernet card” “Wi-Fi card” or
“Bluetooth card”
- a Network Operating System (NOS): routes and manages
communications on the network and coordinates network
resources
Overview of Computer Networks
Other Network Components (mostly a review)
• Connection medium: could be wire, fiber optic cable, radio
waves (more on this topic soon).
• Dedicated servers: e.g. file server, e-mail server, database
server, web server.
• Hubs, bridges and switches connect machines on the same
network and forward data from one to another
- typically only see these in larger networks (i.e. 16+
computers)
• Routers connect two or more different networks
- e.g. your home network (typically Wi-Fi) to the Internet
(typically DSL on telephone lines or HFC on cable TV lines)
Overview of Computer Networks
Other Network Components (mostly a review)
• Firewall: hardware or software (or both) put between the
internal network (or individual computer) and the internet to
prevent outsiders from obtaining unauthorized access
• Key Question: How does it block unauthorized network access
but allow authorized access?
- The firewall keeps track of which websites you have contacted
recently and are still waiting on for a reply.
- e.g. if you initiate a search using Google, your firewall will
accept a reply from Google but not from any other website.
• My home firewall gets dozens of unauthorized attempts to
access my home network every hour.
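The idea of remembering who you contacted can be sketched as a toy class. Everything here is a simplification for illustration: the class name, the 30 second timeout and the use of host names are assumptions. A real firewall tracks addresses, port numbers and connection state.

```python
class StatefulFirewall:
    """Toy model: allow inbound traffic only from hosts we contacted recently."""

    def __init__(self, timeout_seconds=30):
        self.timeout = timeout_seconds
        self.pending = {}  # remote host -> time of our most recent outbound request

    def outbound(self, remote_host, now):
        """Record that we sent a request to remote_host at time now."""
        self.pending[remote_host] = now

    def allow_inbound(self, remote_host, now):
        """Accept a packet only if it answers a recent request of ours."""
        sent_at = self.pending.get(remote_host)
        return sent_at is not None and now - sent_at <= self.timeout

fw = StatefulFirewall()
fw.outbound("google.com", now=0)
print(fw.allow_inbound("google.com", now=5))   # True: a reply we are waiting for
print(fw.allow_inbound("example.org", now=5))  # False: we never contacted this host
```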
Overview of Computer Networks
A Network for a Large Company
would include
• internet
• public telephone network
• internal wired network
• internal wireless network
• cell phone network
• video conferencing system
• extranet (a private network that
partners, suppliers and vendors
can access).
Course text, Figure 7-2
Key Trend: Packet Switching
Circuit Switching
This was the first method of switching, dating back to the first
telephone systems.
• Originally a connection between two devices was achieved by
creating a circuit, a temporary dedicated path between the
source and the destination.
• Think of a telephone call: you set up the circuit (dial the number)
and the circuit exists until either party hangs up, even if there is
no one talking.
• This approach wastes network resources when no (or little)
talking is taking place.
Key Trend: Packet Switching
Packet Switching
Later on an alternative to circuit switching was developed,
packet switching.
1. Data (such as a webpage) is broken down into small parts
(called packets) roughly 1 KB in size.
2. Packets are sent from the source to the destination (possibly
along different communication paths)
3. Packets are reassembled in their original order once they
reach their destination.
Key Advantage (compared to circuit switching): Only use the
network when you have information to send ⇒ more people can
share the network.
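The three steps can be sketched as follows. The function names and the sequence-number scheme are assumptions for illustration; real packets also carry addresses and error-checking information.

```python
import random

PACKET_SIZE = 1024  # roughly 1 KB, as in the steps above

def to_packets(data, size=PACKET_SIZE):
    """Step 1: split data into (sequence_number, payload) packets."""
    return [(i // size, data[i:i + size]) for i in range(0, len(data), size)]

def reassemble(packets):
    """Step 3: put packets back in their original order and join the payloads."""
    return b"".join(payload for _, payload in sorted(packets))

message = b"x" * 2500 + b"end"
packets = to_packets(message)   # 2503 bytes become 3 packets
random.shuffle(packets)         # Step 2: packets may arrive out of order
assert reassemble(packets) == message
```

The sequence number is what lets the destination restore the original order no matter which route each packet took.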
Key Trend: Packet Switching
Course text,
Figure 7-3
The network consists of many nodes and there are multiple
routes to the destination.
Key Trend: Packet Switching
An Analogy
• Say, you are having a party in Toronto.
• Twenty of your friends are attending.
• How is packet switching different from circuit switching?
- Hint: train vs. cars
• Train: a single route
• Cars:
- group splits up with some people in each car
- cars may take different routes
- cars may mix in with other cars (not going to party)
- group reassembles when at the destination
Different Types of Networks
• Topology: How are the nodes connected to each other?
• Geographic scale: How big is the network?
• Protocol: What are the rules for communication?
- how to initiate and terminate communication,
- message format, handle errors, control messages,
- route messages, voltage levels
• Transmission media: Wired, wireless or fiber
• Services: E-mail, printing, file transfer, remote terminal,
teleconference, database access, file sharing etc.
Topologies
Popular Topologies
• The most common
topologies for wired
computer networks are
- the star,
- the ring, and
- the bus.
• Ethernet uses either a star
or a bus.
Source: Course text,
5th edition, Figure 7-6
Geographical Scale
Popular Geographical Scales
• NFC (near field communication) up to 4 cm, e.g. mobile payment
systems
• PAN (personal area network) up to 10 metres,
- e.g. Bluetooth connecting a laptop to a wireless mouse.
• LAN (local area network) within a small building or a single floor
of a large building,
- e.g. Ethernet in campus offices
- e.g. Wi-Fi in your house or apartment, which can also be called
WLAN (wireless local area network).
• WAN (wide area network) typically means the internet but it
could be any network spanning regions or countries.
Protocols
Internet Protocol Suite
• A network protocol is a set of rules governing how data is
exchanged in a network
• Internet Protocol Suite is the standard for most networks
including the internet.
• At its core are two protocols: the Transmission Control Protocol
(TCP) and the Internet Protocol (IP).
• Each computer is assigned and identified by an IP address
- like a telephone number for a cell phone
- It contains four 8-bit numbers, each separated by a dot (for
IP version 4).
- Example: 192.3.15.1
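Since an IPv4 address is four 8-bit numbers, it is really one 32-bit number written in a readable way. A minimal sketch (the function name is made up for this example):

```python
def ipv4_to_int(address):
    """Pack the four 8-bit numbers of a dotted IPv4 address into one 32-bit integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # shift left 8 bits, then add the next number
    return value

print(ipv4_to_int("192.3.15.1"))  # 3221425921
```

Four 8-bit numbers give 32 bits, so IPv4 has at most 2^32 (about 4.3 billion) different addresses.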
Protocols
TCP/IP
• You can check your IP address by typing ipconfig in a DOS shell
or by asking “what is my ip address” in Google
• You can find out who is responsible for that address with the
nslookup command (or a whois server)
• UWaterloo’s range is 129.97.0.0 to 129.97.255.255
• You can see what sites your computer is connected to with the
netstat command
• TCP and IP are part of a four-layer protocol suite …
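The UWaterloo address range listed above can be checked with Python's standard ipaddress module. Writing the range as 129.97.0.0/16 is an equivalent notation: the first 16 bits are fixed and the remaining 16 bits vary. The function name is made up for this sketch.

```python
import ipaddress

# 129.97.0.0 to 129.97.255.255 is the block 129.97.0.0/16.
UWATERLOO = ipaddress.ip_network("129.97.0.0/16")

def in_uwaterloo_range(address):
    """Return True if the address falls inside the UWaterloo block."""
    return ipaddress.ip_address(address) in UWATERLOO

print(in_uwaterloo_range("129.97.208.24"))  # True
print(in_uwaterloo_range("192.3.15.1"))     # False
print(UWATERLOO.num_addresses)              # 65536
```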
Internet Protocol Suite
Course text
Figure 7-4
Internet Protocol Suite
Application Layer
• defines protocols for applications to exchange data
• e.g. the HTTP protocol for web browsers and web servers
• sends the data (e.g. a webpage) to the transport layer to be
transported
Transport Layer
• sets up and manages the connection with the destination
• breaks up data into packets at the source and reassembles
them at the destination
• also handles flow control and congestion and optionally
reliability (i.e. request retransmission if a packet is lost or
corrupt)
Internet Protocol Suite
Internet Layer
• addressing and routing a packet through the network
• gets packet from source to destination based only on its
address
Network Interface (e.g. Ethernet, Wi-Fi, DSL)
• transporting a bit (or a packet) in the network medium
• i.e. placing the packet (a sequence of bits) on the network
medium (at the source) and receiving it from the medium (at a
neighbouring node)
• e.g. deals with how to represent and recognize a 0 or a 1 on
the medium (wire, fibre optic cable, radio waves)
Internet Protocol Suite
[Figure: a network of nodes A to I, annotated with the four layers (Application Layer, Transport Layer, Internet Layer, Network Interface). Course text, Figure 7-3]
Internet Protocol Suite
Source Application Layer
• Send an email with a large attachment from home (A) to
UWaterloo (I).
• Calls a transport layer function to send the message to the
destination.
Source Transport Layer
• Sets up and manages the connection between A and I.
• Breaks the email up into packets (tracking their order) and calls
an internet layer function to send each packet.
• It makes sure that they are not sent too quickly (flow control)
and that they arrive without error.
Internet Protocol Suite
Source Internet Layer
• Finds a path from A to I.
• Has 3 choices: go through B, C, or D.
• Figures out the best choice and then calls the appropriate
network interface function to send the packet on that link.
Source Network Interface (e.g. Ethernet, Wi-Fi, DSL)
• Just concerned with getting a single packet to the next node.
• A to C could be Wi-Fi.
• C to G could be DSL.
• G to I could be Ethernet.
Internet Protocol Suite
Destination Network Interface (e.g. Ethernet, Wi-Fi, DSL)
• Receives the packet and sends it up to the internet layer.
Destination Internet Layer
• If it is for this address then send it up to the transport layer.
• If not, it figures out which link to send it out on and calls the
appropriate network interface function.
Destination Transport Layer
• Receives the packets and reassembles them.
• When they have all arrived, it lets the application layer know
that a message has arrived.
Destination Application Layer
• Lets the user know they’ve got a new email.
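The trip down the layers at the source and back up at the destination can be sketched as wrapping and unwrapping. This is a heavily simplified model: the dictionary fields, the fixed next hop C and the function names are assumptions for illustration, not real header formats.

```python
def send(message, source, destination):
    """Wrap a message in one (simplified) header per layer, top to bottom."""
    segment = {"layer": "transport", "seq": 0, "data": message}
    packet = {"layer": "internet", "src": source, "dst": destination, "data": segment}
    frame = {"layer": "network interface", "next_hop": "C", "data": packet}
    return frame

def receive(frame, my_address):
    """Unwrap the headers bottom to top; return the message if it is for us."""
    packet = frame["data"]            # network interface hands up the packet
    if packet["dst"] != my_address:
        return None                   # a real node would forward it on another link
    segment = packet["data"]          # internet layer hands up the segment
    return segment["data"]            # transport layer hands up the message

frame = send("You've got mail", source="A", destination="I")
print(receive(frame, my_address="G"))  # None: G is only an intermediate node
print(receive(frame, my_address="I"))  # You've got mail
```

Each layer only looks at its own header, which is why the network interface can change from Wi-Fi to DSL to Ethernet along the way without the upper layers noticing.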
An Analogy: Mailing a Letter
Physical Transmission Media
Four Common Media
1. Twisted pair (of wires) – e.g. telephone lines, category 5
networking cable (CAT5), Ethernet cable
2. Coaxial cable – e.g. cable TV
3. Fiber optic cable – fast, massive bandwidth
4. Wireless transmission media and devices – more and more
popular in LANs
Physical Transmission Media
Speed and Responsiveness
• One way of characterizing network performance is by
bandwidth, i.e. the number of bits that can be transmitted per
second.
- typical units are Kbps, Mbps, Gbps
- recall small b = bit not byte
- for network speeds, K, M, G are powers of 1000 (powers of 1024
are typical for memory sizes)
- the larger the bandwidth, the faster the network, i.e. the
more data that can be transferred in one second
• Network responsiveness is measured by latency, i.e. how long
it takes to receive the first byte of data or the time between a
request and a response.
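Bandwidth and latency together give a rough estimate of how long a download takes. The formula below is a simplification that ignores protocol overhead and congestion; the function name and the example figures are assumptions, with 1 Mbps taken as 1,000,000 bits per second.

```python
def transfer_time_seconds(size_bytes, bandwidth_bps, latency_seconds=0.0):
    """Estimate the time to deliver a file: latency plus size divided by bandwidth."""
    size_bits = size_bytes * 8  # bandwidth is quoted in bits, file sizes in bytes
    return latency_seconds + size_bits / bandwidth_bps

# A 5,000,000 byte file over a 10 Mbps link with 50 ms of latency.
estimate = transfer_time_seconds(5_000_000, 10_000_000, latency_seconds=0.05)
print(round(estimate, 2))  # 4.05
```

For large files the bandwidth term dominates. For small requests, such as a single click on a web page, latency matters more.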
Wireless Communication
Wireless
Bluetooth
• for PAN (Personal Area Network), i.e. 10 metres or less.
• e.g. cell phone to headset; computer to wireless mouse,
keyboard, printer; cell phone tethering (computer uses smart
phone’s internet connection)
Wi-Fi
• some say it stands for Wireless Fidelity, but it was a
meaningless word meant to be similar to hi-fi.
• WLAN (wireless LAN), i.e. within a home or office
• comes in various speeds: 802.11 a, b, g, n, ac
Wireless Communication
Wireless
WiMAX
• secure remote wireless access for longer distances (up to 50
kilometres)
• stands for Worldwide Interoperability for Microwave Access
• based on microwaves
• typically used in rural settings that do not have cable access
• needs a base station to connect with a remote tower
Wireless Communication
3rd, 4th and 5th Generation Cellular Networks
• Newer generations of networks have faster data speeds
- 3G (1-2 Mbps typical) since the mid 2000s
- 4G (4-200 Mbps typical) current in most areas of Canada
- 5G (1 Gbps or greater) about to happen
• 4G comes in many flavours with different speeds: HSPA+, LTE
and LTE Advanced
• Cell phone companies provide maps of their coverage
https://www.bell.ca/Mobility/Our_network_coverage
https://www.rogers.com/consumer/wireless/network-coverage
https://www.telus.com/en/bc/mobility/network/coverage-map
• You can find out about other companies at
http://en.wikipedia.org/wiki/List_of_Canadian_mobile_phone_companies
Wireless Communication
Cell Towers
• The range of a single cell phone tower (in ideal circumstances,
i.e. flat terrain) can be 35 km.
• In cities, the towers are much closer, e.g. 1 km apart.
• The location of cell phone towers is public information.
• You can get a map of the ones in your area if you want to
choose a provider that has a tower close by.
https://www.ertyu.org/steven_nikkel/cancellsites.html
• Most towers support multiple technologies, e.g. 3G and 4G.
• Most cell phones support multiple technologies, so if a 4G
phone cannot find 4G, it will connect to 3G service.
Wireless Communication
5th Generation Cellular Networks
• 5G will allow for smaller antennas with a shorter range and so
(besides extra bandwidth) it can support more devices.
- 4G can support 100,000 devices per square km.
- 5G can support 1,000,000 devices per square km.
• This density means more support for the internet of things
- i.e. the extension of the internet to devices like smart
thermostats, lighting and home security systems.
• 5G systems also have faster response times (lower latency) so they
can support devices that require more stringent timing.
• 5G is compatible with 4G so the 5G network can grow
incrementally in an area with 4G service.
Radio Frequency Identification (RFID)
What is it?
Course text, Figure 7-19
• Similar to a bar code, i.e. the tag stores a unique number that
identifies an item (or type of item).
• When the tag is placed close to an RFID reader, the identifier is
read off of the tag and sent to a computer.
• Does not need line-of-sight (unlike a bar code at the grocery
store checkout); the reader just has to be close to the tag.
Radio Frequency Identification (RFID)
Types and Uses of RFIDs
• RFID tags come in many sizes and shapes but are generally
fairly small and flat
• There are two types
- passive: cost a few pennies, don’t need a battery, can only
be read from within a few feet
- active: cost a few dollars, need a battery, can be read from
over 100 feet away
• Great for inventory management: How many xyz’s do we have
in inventory and where are they?
• Similar to tracking a package with Canada Post/UPS/Fedex.
The Internet ≠ The Web
What is the Internet?
• An internet: a network of networks
• The Internet: a collection of local, regional, national and
international computer networks linked together
• It evolved from the late 60’s to mid 80’s as a way to link up
different networks together.
• Most homes and businesses connect to the internet by
subscribing to an Internet Service Provider (ISP), e.g. Bell
Internet or Rogers High Speed Internet.
• The first major application was email (then file transfer,
electronic bulletin board services, etc.)
The Internet ≠ The Web
What is the World Wide Web?
• Recall: The Internet is a large number of networks connected
together.
• The World Wide Web, created in 1989, is just one of the
services available over the Internet.
• The Web is a collection of interconnected documents and other
resources, linked by hyperlinks
• When you use a web browser (e.g. Chrome, Firefox, Safari) to
request a webpage you are using the web.
The Internet ≠ The Web
What is the World Wide Web?
• hypertext transfer protocol (HTTP): the protocol that structures
the communication between the web browser (the client) and
the web server
- usually placed at the beginning of the web address
- it has commands to get a webpage, see if a webpage has
changed recently, etc.
• hypertext markup language (HTML): is the file type (i.e.
format) that a browser understands
- placed at the end of a web address
• e.g. http://www.uwaterloo.ca/index.html
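The parts of the example address can be pulled apart with Python's standard urllib.parse module, and the browser's request to the server can be sketched as plain text. The helper build_get_request is made up for this example and shows only the bare minimum of an HTTP/1.1 request. Nothing is actually sent over the network.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.uwaterloo.ca/index.html")
print(parts.scheme)    # http              (the protocol)
print(parts.hostname)  # www.uwaterloo.ca  (the web server)
print(parts.path)      # /index.html       (the HTML file on that server)

def build_get_request(url):
    """Return the text a browser could send to ask for the page at url."""
    target = urlsplit(url)
    path = target.path or "/"
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {target.hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )

request_text = build_get_request("http://www.uwaterloo.ca/index.html")
```

GET is the HTTP command that asks the server for a webpage.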
The Internet ≠ The Web
What is the World Wide Web?
• Every webpage has a unique address, called uniform resource
locator (URL)
- e.g. http://www.uwaterloo.ca is a URL
• Typically webpages reference other webpages via their URL,
which is printed in blue and underlined.
• If you can click on the URL and the program jumps to another
page it is called a hyperlink
• Hypertext is just a file (or software system) that
contains/implements hyperlinks.
The Internet ≠ The Web
What is not part of the World Wide Web?
• Any program that needs to be connected to the internet to run,
but does not use a browser.
• Typically you download it as a separate program
• Examples
- Spotify: listen to streaming music
- iTunes: purchase (and listen to) music, videos, etc.
- some multiplayer on-line games
- some utility programs, typically Unix/Linux based, like
secure shell, secure copy
IP Addresses and the DNS
IP Addresses
• Every device connected to the Internet has a unique identifier
called its IP address
- e.g. 129.97.208.24 (IPv4)
- e.g. fe29::1725:c216:85fc:100d (IPv6)
- currently use IPv4 but we are running out of addresses.
- numbers are difficult to remember
• The domain name is the English-like name that corresponds to
the IP address
- e.g. uwaterloo.ca
• You need to register (pay) to get a domain name
• The Domain Name System (DNS) translates domain names into
IP addresses
IP Addresses and the DNS
1. The computer sends the domain name (e.g. uwaterloo.ca) to a DNS server.
2. The DNS server responds with the IP address (e.g. 129.97.208.24).
3. The pair (uwaterloo.ca = 129.97.208.24) is stored in a DNS cache.
4. To see what is in your cache, type ipconfig /displaydns
Image Source:
http://www.windowsnetworking.com/img/gifs/tcpipdns.gif
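The benefit of the DNS cache can be sketched with a toy resolver. No real DNS query is made here: the class name is made up and a dictionary stands in for the DNS server.

```python
class CachingResolver:
    """Toy resolver: ask the 'DNS server' only when the name is not cached."""

    def __init__(self, server_records):
        self.server_records = server_records  # stands in for a real DNS server
        self.cache = {}
        self.server_queries = 0

    def resolve(self, name):
        if name not in self.cache:
            self.server_queries += 1                      # steps 1 and 2: ask the server
            self.cache[name] = self.server_records[name]  # step 3: store the pair
        return self.cache[name]

resolver = CachingResolver({"uwaterloo.ca": "129.97.208.24"})
resolver.resolve("uwaterloo.ca")
resolver.resolve("uwaterloo.ca")
print(resolver.server_queries)  # 1: the second lookup was answered from the cache
```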
IP Addresses and the DNS
For a domain name like sales.google.com the top level domain
name is “com”, the second level is “google”
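Domain names are read from right to left, which a short sketch makes concrete (the function name is made up for this example):

```python
def domain_levels(name):
    """List the labels of a domain name from the top level down."""
    labels = name.lower().rstrip(".").split(".")
    return list(reversed(labels))

print(domain_levels("sales.google.com"))  # ['com', 'google', 'sales']
```

The first entry is the top level domain, the second entry is the second level domain, and so on.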
Internet Services
Voice Over IP (VoIP)
• Voice over IP is a way of making telephone calls using the
internet, as opposed to using the telephone system.
• e.g. Skype, Google Voice plug-in for Gmail
• Can use your computer (if it has speakers and a microphone) or
buy a VoIP telephone.
• Used as a way of cutting down communication costs.
Internet Services
Voice Over IP (VoIP)
• Sound is digitized (sampled 8,000 times per second), broken up
into packets, transported through the internet via IP, then
reassembled at the other end.
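A quick calculation shows how much data a call produces. The sampling rate is the one given above; the 8 bits per sample and the 20 millisecond packets are assumptions for this sketch and vary between VoIP systems.

```python
SAMPLES_PER_SECOND = 8000   # sampling rate of the digitized sound
BITS_PER_SAMPLE = 8         # assumption
PACKET_MILLISECONDS = 20    # assumption

bits_per_second = SAMPLES_PER_SECOND * BITS_PER_SAMPLE
samples_per_packet = SAMPLES_PER_SECOND * PACKET_MILLISECONDS // 1000
packets_per_second = 1000 // PACKET_MILLISECONDS

print(bits_per_second)     # 64000, i.e. 64 Kbps of voice data
print(samples_per_packet)  # 160 samples in each packet
print(packets_per_second)  # 50 packets sent every second
```

Under these assumptions a call needs only a small fraction of a typical home internet connection.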
Internet Services
VPN Motivation
• Goal: provide the ability to work remotely and securely access
files, e-mails and business data from your company’s internal
network
• Challenge: the Internet is not safe!
• Malicious people can intercept IP traffic (called packet sniffing)
• Need a way of securing data.
• Idea: create a virtual private network (VPN)
Internet Services
VPN
• A virtual private network (VPN) is a computer network that
provides secure access using a public infrastructure such as the
Internet
• Avoid the need for many leased lines that individually connect
remote offices (or remote users) to a private intranet.
• VPN creates a secure virtual tunnel to transport the data
• The original packet is encrypted before being transmitted
through the public network and then decrypted after reaching
its destination.
Topic 4 - Management Information Systems
Key Concepts
• data vs. information,
• information system (IS), management information system
(MIS), business intelligence (BI)
• objectives of an information system
• contemporary approach to MIS
References
• course text, Chapter 1 Information Systems in Business Today
What is MIS?
Some Key Definitions
• Data: raw facts (course text, pg. 13) e.g. a list of items scanned
at a supermarket checkout scanner
• Information: data shaped into a form that is meaningful ... to
human beings (course text, pg. 13) e.g. which items are selling
well, which aren’t, which need reordering
• Information Technology: all the hardware and software that a
firm needs to use in order to achieve its business objectives
(course text, pg. 12) e.g. desktops, laptops, servers, smart
phones, MS Office, custom software.
What is MIS?
Class Task:
What data does YouTube track?
• who uploaded the video, number of views, likes, dislikes,
number of subscribers, comments, the region you are in, what
videos you have seen before
What information could we obtain from this data?
• most popular videos, videos people view again and again, most
popular YouTube channels, most popular YouTube channels in a
certain region, YouTube channels that are popular in many
regions
What is MIS?
Class Task:
What are YouTube’s goals?
• to sell advertising, which means
- to be a popular website
- to keep you on their website for a long time (so you will see
more ads)
How could we use this information to further YouTube’s goals?
• to easily identify the most popular videos in different genres
(e.g. comedy, gaming, pet videos) or in a particular region or
across many regions
What is MIS?
More Key Definitions
• Information System: A set of interrelated components that
collect, process, store and distribute information to support
decision making and control in an organization (course text,
pg. 12)
• Information System Literacy: understanding the
- technical
- organization
- management
dimensions of an information system (course text, pg. 14)
What is MIS?
Dimensions of an Information System
• technical: you’ve seen this already e.g. processors, secondary
storage, servers, databases, networks, etc.
• organization: (next topic)
- different groups in a firm have different information needs,
e.g. senior management does long range planning vs.
operational workers deal with day-to-day transactions.
- rules (such as course prerequisites) are embedded in the
information system (such as Quest)
What is MIS?
Dimensions of an Information System
• management: (future topic): make decisions, formulate
action plans, design and deliver new products
• The goal of studying Management Information Systems is to
develop broader information systems literacy (course text,
pg. 15)
All definitions from course text, 7th Canadian edition.
Why have MIS?
The Mission of MIS
• To improve the performance of people in organizations through
the use of information technology
• To (fully or partially) automate data gathering, processing,
storage and information distribution with the help of
information technology (IT)
• To convert business data into information and business
intelligence (IT technology to help make better decisions)
Why have MIS?
MIS vs. IS
• Wikipedia and some other online resources consider an MIS to be
a part of an IS that is designed to support or automate decision
making
• The textbook considers an MIS to be a broader IS in which both
technical and behavioral issues are considered.
Strategic Objectives of an IS
Issues to Consider
• Why consider a MIS?
- cost benefit analysis
- adapt to (internal and external) change
- create/maintain a competitive advantage
• How to develop and manage an IS?
- design, implement, and integrate
- training, new business practices
- privacy and security
Strategic Objectives of an IS
Issues to Consider
• Operational excellence, improved efficiency
- Overhead (costs other than labour and materials) as a
percentage of sales revenue
▪ Walmart spends 16.6%
▪ Sears spends 24.9% (went bankrupt in 2017 in Canada,
2018 in the US)
▪ Industry average in retail is 20.7%
- Monthly Sales per square foot
▪ Walmart $28 US
▪ Target $23 US
▪ industry average in retail is $12 US
- Walmart links suppliers to every Walmart store
see https://www.youtube.com/watch?v=SUe-tSabKag&t=131s
Strategic Objectives of an IS
Issues to Consider
• Help develop new products, services, and business models
- e.g. iTunes, Spotify, Netflix.
• Understand customers and suppliers better
- to enhance customer loyalty, know what the customers want
- e.g. a high-end hotel that tracks guests’ preferences such as
room temperature
- keep suppliers informed
• Improved decision making by basing decisions on the most
recent and relevant information
- avoid over production and under production
- know the effectiveness of a tool or person
290
Strategic Objectives of an IS
Why
• Survival: companies have to respond to
- customers' desire to use new technology, e.g. banking machines
- new legislation on information gathering and reporting
• These factors lead to a competitive advantage:
- doing things better or cheaper than the competition.
291
Strategic Objectives of an IS
Change for Survival
“As C.E.O., it’s also superimportant to keep focused on the
future,” Mr. Page said. “Companies can tend to get comfortable
doing what they’ve always done, with a few minor tweaks. It’s
only natural to want to work on things you know. But
incremental improvement is guaranteed to make you obsolete
over time, especially in tech.”
- Larry Page, CEO, Google
Source: http://www.nytimes.com/2013/04/19/technology/googles-earnings-beat-expections-but-revenue-does-not.html
292
Strategic Objectives of an IS
Creative Destruction
• new technology threatens existing business
• a term coined by Joseph Schumpeter in his work Capitalism,
Socialism and Democracy (1942) to denote a "process of
industrial mutation that incessantly revolutionizes the
economic structure from within, incessantly destroying the old
one, incessantly creating a new one."
Source: http://www.investopedia.com/terms/c/creativedestruction.asp
293
Strategic Objectives of an IS
Why invest in an IS?
• provides real value to the company
• provides a better return on investment than other options
such as buildings, machines, etc.
What helps achieve a better return?
• need complementary assets: assets required to derive value
from a primary investment (pg 21), e.g. new business models,
new business methods, training, management behaviour etc.
294
Strategic Objectives of an IS
Course text,
Figure 1-8
• Some companies get a bigger productivity boost from their
investment in IT than others do.
295
Contemporary Approaches to MIS
Course text,
Figure 1-9
296
Contemporary Approaches to MIS
Technical Approaches
• Computer Science: methods of computation, storage, and
access.
• Operations Research: optimizing selected parameters such as
transportation costs, inventory levels.
• Management Science: models for decision making and
management practices.
297
Contemporary Approaches to MIS
Behavioural Approaches
• Sociology: how information systems affect individuals, groups
and organizations.
• Economics: production of digital goods, dynamics of digital
markets.
• Psychology: how human decision makers use formal
information.
298
Contemporary Approaches to MIS
A Sociotechnical Approach
• Optimal organizational performance is achieved by jointly
optimizing both the social and technical systems used in
production (pg 24)
• Both behavioural and technical aspects need to be considered
(pg 24)
• To illustrate the importance of including behavioural approaches
along with technical issues, consider something as simple as the
colours used in PowerPoint slides...
299
Contemporary Approaches to MIS
Example: Colours for text and background
• Yellow print has very low contrast on a white background. Try
to read the following.
Offer expires 07/31/13. Offer available to new High Speed
Internet subscribers only. May not be used in conjunction with
any other offer. Service is not available in all areas. Certain
taxes and fees may apply. DSL: Offer requires a 12 month
subscription.
300
Contemporary Approaches to MIS
Example: Colours for text and background
• Blue light has a much shorter wavelength than red light, so blue
is refracted more strongly than red by the lenses of our eyes.
• Result: our eyes can’t focus on red and blue at the same time,
resulting in eye strain.
301
Contemporary Approaches to MIS
Case Study
• A university decides to adopt Learn as its platform for course
delivery (as opposed to creating course web sites using HTML).
• Instructors are required to learn the new system.
• Many senior profs refuse to learn it.
• If you were the VP in charge of the project, what would you do?
- Before purchasing a system like Learn, survey the users of the
existing system to get a sense of what they like and dislike
about the current system.
- As a group, do an assessment of the current system and the
available options to see which best meets their needs.
- People are more likely to accept a solution if they have had
their say and as many as possible of their needs are being met.
302
Topic 5 – Business Processes and
Types of Information Systems
Key Concepts
• Business Processes (BP)
• Customer Relationship Management (CRM)
• Supply Chain Management (SCM)
• Accounting Information System (AIS)
• Human Resources Information System (HRIS)
• Transaction Processing Systems (TPS)
• Decision-support Systems (DSS)
• Management Information Systems (MIS)
• Executive Support Systems (ESS)
References
• course text, Chapter 2 How Businesses Use Information
303
Business Processes
An Experiment in Behavioral Economics
• Ask people for help getting a car out of a pothole.
• Three Versions
1. randomly asked people passing by
⇒ many were happy to help
2. told people if they helped, they would get $10
⇒ only a few helped
3. after helping, he gave the volunteer a gift worth $1
⇒ all happily accepted the gift and thanked him
• What is going on here?
304
Business Processes
An Experiment in Behavioral Economics
• He concluded:
- We live in a capital market and a social market. Each has its
own rules and value system.
- Different markets, different rules, different returns, different
focus and different value systems.
• Our goal:
- Understand the role of an IS both in the capital market and
in the social market (i.e. as opposed to personal use of an
information system).
305
Business Processes
The Essence of a Business
• The basic operation of a business is to convert resources into
products and services.
• A business can also be seen as a collection of business processes.
• Business processes are the collection of activities required to
produce a product or service (pg 32).
- i.e. how is work organized, coordinated, focused
- e.g. customer places an order, what happens next?
• Goal: examine business processes with a view to understanding
how they might be improved by using information technology to
achieve greater efficiency, innovation and customer service (pg
34).
306
Business Processes
Major Business Functions for an Organization
• Each business is a collection of business functions, e.g.
(course text, Table 1-2)
Function                       Purpose
Manufacturing and Production   Producing and delivering products and services
Sales and Marketing            Selling the organization’s products and services
Finance and Accounting         Managing the organization’s financial assets and records
Human Resources                Attracting, developing and maintaining the organization’s labor force
• Each business function is a collection of business processes ...
307
Business Processes
Examples of Business Processes (BP)
Course text, Table 2-1
308
Business Processes
Business Processes and Functions
• A business process might run across several business functions,
e.g. order fulfillment
Course text, Figure 2-1
309
Business Processes
How does an Information System (IS) fit in?
Since we are taking a sociotechnical approach, consider
1. Technical aspect (i.e. can we?)
- First understand how the existing business process works.
- Which steps of the business process can be automated?
- Can we modify the existing process to enable more
automation?
- What changes need to be made?
2. Behavioral aspect (i.e. should we?)
- What is its impact on people and the organization?
- What is the impact on the organization’s structure and culture?
310
Business Processes
How does an Information System (IS) fit in?
• Once a process is automated, ask: what business intelligence (BI) can we obtain?
• Use the information to enhance business processes through
(partial or full) automation. Ask…
- Can we increase the efficiency of existing processes?
- Can we enable a new product or service that can transform
the business?
- Can we enforce policy and regulation better?
• e.g.
- Which parts of a hiring process can be automated?
- Which parts of an order fulfillment process can be
automated?
311
Class Exercise: Managing Travel Expenses
A Business Process Example
• Travelling is expensive.
• Processing expense claims
adds to the cost.
• On average, it takes $48
US to process one claim.
• What work needs to be
done to process a claim?
1. manager approval
2. confirm budget
3. collect receipts
4. itemize
5. manager approval
6. expense clerk approval
7. notify payroll
8. get money via direct deposit
312
Class Exercise: Managing Travel Expenses
BP Automation
• What part of the process can be automated?
- pretty well every step, to a certain extent
• What changes need to be made?
- Possibly scanning in receipts
- Use credit card companies that itemize receipts for us
- Linking up systems
- Create software
313
Class Exercise: Managing Travel Expenses
Business Intelligence (BI)
Once a process is automated, what BI can we obtain?
1. Collect data
   What kind of business data can you obtain?
2. Extract information to support decision making
   What kind of information can you extract, and what would you
   like to extract?
3. Derive business intelligence (information technology to help
   make better decisions)
   How can you use this information to improve performance?
314
Class Exercise: Managing Travel Expenses
Deriving Business Intelligence (BI)
• After automating the processing of travel claims, you
observe the following facts:
- Lots of business travel to California during the winter
- Lots of taxi fares to and from Pearson Airport
- Lots of international calls during business trips
• How would this information help you improve your business?
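One way to surface observations like these is to total the automated claims data by category and by destination. The following is a minimal sketch, assuming a hypothetical `Claim` record and made-up amounts; none of the names or figures come from the course text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Claim:
    """One itemized line from an expense claim (hypothetical structure)."""
    employee: str
    category: str      # e.g. "taxi", "airfare", "phone"
    destination: str
    month: int         # 1-12
    amount: float      # dollars


def total_by(claims, key):
    """Sum claim amounts grouped by the given attribute name."""
    totals = defaultdict(float)
    for claim in claims:
        totals[getattr(claim, key)] += claim.amount
    return dict(totals)


# Made-up sample data, for illustration only.
claims = [
    Claim("ann", "taxi", "Pearson Airport", 1, 65.0),
    Claim("bob", "taxi", "Pearson Airport", 1, 70.0),
    Claim("ann", "airfare", "California", 1, 540.0),
    Claim("cy", "phone", "California", 2, 120.0),
]

by_category = total_by(claims, "category")
by_destination = total_by(claims, "destination")
print(by_category)
print(by_destination)
```

A large total for one category or destination is the kind of pattern a manager could act on, for example by negotiating a rate with a single provider.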
315
Class Exercise: Managing Travel Expenses
Deriving Information
• Now focus on efficiency or enabling new business processes
- Can the BP be improved by this technology?
▪ Negotiate better rates, e.g. taxis
- Will this new technology bring new product/service?
▪ Someone to negotiate prices
316
Types of Information Systems
Approach for the Next Few Slides
• For each of these business functions (from slides 307-308)
A. Sales and Marketing
B. Manufacturing and Production
C. Finance and Accounting
D. Human Resources
• Ask the following questions
1) Which business processes can be automated?
2) What data can be gathered?
3) What information can help improve business?
317
Types of Information Systems
A. Sales and Marketing Systems
1) Which business processes can be automated?
- Ordering process, order fulfillment, order inquiry,
advertising and promotion etc.
2) What data can be gathered?
- Individual orders: who ordered what, when, where, when
was it filled, were there any issues
- Customer data: name, contact info, purchases, returns
318
Types of Information Systems
A. Sales and Marketing Systems
3) What information can help improve business?
- Purchase habits: who likes what, when and where
- Sales trends: what is popular, when and where
- Efficiency of fulfillment, return rates
- Promotion strategy, production schedule, inventory level
etc.
• Terminology: called a Customer Relationship Management
(CRM) system
- Provides: Customer support, sales, marketing
319
Types of Information Systems
A. Sales and Marketing Systems
• A point-of-sale (POS) system (partially) automates the in-store
checkout process
• It often produces standard sales-related (sales performance
and sales trends) and customer-related information (like
customer base)
• Can you think of other useful information it might produce?
- Say, employee-related information …
320
Types of Information Systems
B. Manufacturing and Production Systems
1) Which business processes can be automated?
- Making of individual parts, assembling, testing,
stocking/shelving etc.
2) What data can be gathered?
- What is produced, when, where, how many, whether there
are any issues
321
Types of Information Systems
B. Manufacturing and Production Systems
3) What information can help improve business?
- Efficiency of the production process, defect rate,
production/schedule status
- Suppliers of defective parts, sources of error in the process
• Terminology: called a Supply Chain Management (SCM) system
- Links with suppliers to ensure materials and parts are
available when needed
322
Types of Information Systems
B. Manufacturing and Production Systems
• Based on the data shown in the above systems, can you
answer the following questions:
- If there is something wrong with an item sold to a customer,
can you trace where it was sold?
- Who else bought the same product?
- Can you link it to a particular shipment?
- Can you find out which sales representative handled the
transaction?
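Whether these questions can be answered depends on whether the sales data is linked to the shipment data. The following is a minimal sketch using Python's built-in `sqlite3` module; the `shipments` and `sales` tables, their columns, and all rows are hypothetical and exist only to show how linked records make the trace possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, supplier TEXT);
    CREATE TABLE sales (
        sale_id     INTEGER PRIMARY KEY,
        customer    TEXT,
        product     TEXT,
        store       TEXT,
        sales_rep   TEXT,
        shipment_id INTEGER REFERENCES shipments(shipment_id)
    );
    INSERT INTO shipments VALUES (1, 'Acme Parts'), (2, 'Best Supply');
    INSERT INTO sales VALUES
        (10, 'Lee',   'toaster', 'Waterloo', 'Pat', 1),
        (11, 'Singh', 'toaster', 'Guelph',   'Sam', 1),
        (12, 'Chen',  'kettle',  'Waterloo', 'Pat', 2);
""")


def trace_sale(sale_id):
    """Return (store, sales_rep, shipment_id, supplier) for one sale."""
    return conn.execute(
        """SELECT s.store, s.sales_rep, s.shipment_id, sh.supplier
           FROM sales AS s
           JOIN shipments AS sh ON s.shipment_id = sh.shipment_id
           WHERE s.sale_id = ?""",
        (sale_id,),
    ).fetchone()


def other_buyers(sale_id):
    """Return customers who bought the same product from the same shipment."""
    rows = conn.execute(
        """SELECT other.customer
           FROM sales AS faulty
           JOIN sales AS other
             ON other.product = faulty.product
            AND other.shipment_id = faulty.shipment_id
            AND other.sale_id <> faulty.sale_id
           WHERE faulty.sale_id = ?
           ORDER BY other.customer""",
        (sale_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(trace_sale(10))
print(other_buyers(10))
```

If the sales records did not store a shipment identifier, the last two questions could not be answered from the data at all.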
323
Types of Information Systems
C. Finance and Accounting Systems
1) Which business processes can be automated?
- Everything to do with money
2) What data can be gathered?
Just a sample...
- Accounts receivable (A/R): payments owed to the company
by its customers
- Accounts payable (A/P): bills owed by the company
- Billing: produces invoices for clients/customers
- Purchase Order: records company’s orders of inventory
324
Types of Information Systems
C. Finance and Accounting Systems
- Sales Order: records customer’s orders
- Cash Book: records collections and payments
3) What information can help improve the business?
- match purchase order, goods receipt, pay invoice
- match customer order, invoice, receive payment
- Cash flow, financial status of the firm …
• Terminology: called an Accounting Information System (AIS)
- Collects, stores and processes accounting information
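Matching a purchase order, a goods receipt, and an invoice before paying is commonly called a three-way match. The following is a minimal sketch of such a check, assuming a hypothetical `Document` record and simplified rules; a real accounting system would apply tolerances and many more checks.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """A purchase order, goods receipt, or supplier invoice (hypothetical)."""
    po_number: str
    item: str
    quantity: int
    unit_price: float = 0.0   # goods receipts carry no price here


def three_way_match(purchase_order, goods_receipt, invoice):
    """Return a list of discrepancies; an empty list means the invoice can be paid."""
    problems = []
    docs = (purchase_order, goods_receipt, invoice)
    if len({d.po_number for d in docs}) != 1:
        problems.append("documents refer to different purchase orders")
    if len({d.item for d in docs}) != 1:
        problems.append("items differ")
    if invoice.quantity > goods_receipt.quantity:
        problems.append("invoiced for more than was received")
    if invoice.quantity > purchase_order.quantity:
        problems.append("invoiced for more than was ordered")
    if invoice.unit_price != purchase_order.unit_price:
        problems.append("invoice price differs from ordered price")
    return problems


# Made-up example: 100 widgets ordered and invoiced, but only 90 received.
po = Document("PO-1", "widget", 100, 2.50)
receipt = Document("PO-1", "widget", 90)
invoice = Document("PO-1", "widget", 100, 2.50)
print(three_way_match(po, receipt, invoice))
```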
325
Types of Information Systems
D. Human Resource System
1) Which business processes can be automated?
- automatic deposits (pay cheque), tax forms, pay slips
2) What data can be gathered?
- payroll
- time and attendance
- performance appraisal
- benefits administration
- recruiting
- Learning Management (CPR, hazardous materials)
326
Types of Information Systems
D. Human Resource System
3) What information can help improve the business?
- high employee turnover in a certain area
- high absenteeism
- cost of overtime vs. hiring more employees
- trouble filling certain positions
• Terminology: called a Human Resources Management System
(HRMS) or a Human Resources Information System (HRIS).
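The overtime-versus-hiring comparison above can be sketched as a simple calculation. This is an illustration only: the 1.5 overtime multiplier, the single fixed cost per new hire (recruiting, training, benefits), and all figures are assumptions, not values from the course text.

```python
def overtime_cost(extra_hours, hourly_wage, overtime_multiplier=1.5):
    """Cost of covering the extra hours with overtime pay."""
    return extra_hours * hourly_wage * overtime_multiplier


def new_hire_cost(extra_hours, hourly_wage, fixed_cost):
    """Cost of covering the extra hours at regular pay plus a fixed hiring cost."""
    return extra_hours * hourly_wage + fixed_cost


def cheaper_option(extra_hours, hourly_wage, fixed_cost, overtime_multiplier=1.5):
    """Return "overtime" or "hire", whichever costs less (ties go to overtime)."""
    ot = overtime_cost(extra_hours, hourly_wage, overtime_multiplier)
    hire = new_hire_cost(extra_hours, hourly_wage, fixed_cost)
    return "overtime" if ot <= hire else "hire"


# Made-up figures: $20/hour wage, $3000 fixed cost per new hire.
print(cheaper_option(200, 20.0, 3000.0))    # few extra hours
print(cheaper_option(1000, 20.0, 3000.0))   # many extra hours
```

Under these assumptions overtime is cheaper for a small number of extra hours, while hiring becomes cheaper once the overtime premium exceeds the fixed hiring cost.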
327
Types of Information Systems
The IS Challenge
• Can you name a BP that cannot be automated or has
absolutely nothing to do with IT?
- Can you foresee it might be automated in the future?
• Can you think of a job that has absolutely nothing to do with IT?
- Can you foresee it might be automated in the future?
• This avenue of thinking can lead to new business
opportunities!
328
Types of Management
Three Levels of Management
• For the previous dozen slides we looked at different business
functions and their IS needs.
• Now we will look at different levels of management and their IS
needs
• Senior Management: concerned with
- long-range strategic decisions (i.e. decisions of great importance)
- financial success of company as a whole
• Middle Management: concerned with implementing the plans of
the senior management
• Operational Management: concerned with monitoring day-to-day activities of the company
329
Types of Management
Different IS Needs
330
Types of Management
Different IS Needs
Tables from 4th edition of course text.
331
Types of Management
Different IS Needs
• Which level of management would be most interested in the
following questions?
- Is an order filled properly?
- Percentage of orders filled properly?
- What is the status of an order?
- How many tons of candy should we stock for Halloween?
- How to promote our Halloween party package?
- Should we open a branch in Guelph?
- Should we be concerned about 5G?
332
IS for Different Management Levels
Different IS Needs
• Different levels of management use different information systems
• Senior Management
- Executive Support System (ESS)
• Middle Management
- Management Information System (MIS)
- Decision Support System (DSS)
• Operational Management
- Transaction Processing System (TPS)
• Let’s go from the bottom up.
333
Types of Management
Course text, 4th ed., Figure 1-6
[Figure: information systems by management level: ESS, DSS, MIS, TPS]
Different Levels of Management Have Different Concerns
334
IS for Different Management Levels
Transaction Processing System (TPS)
• Automates business processes
• Records routine transactions necessary to conduct day-to-day
business
• E.g. process sales order, fulfillment, billing
• Allow frontline workers and managers to monitor status of
operations and relations with external environment
335
IS for Different Management Levels
Management Information System (MIS)
• Provides routine reports on department’s current performance
to middle management
• Confusingly, the same term is also used to refer to the whole course
• Based on data from TPS
• Summarizes TPS data
• Typically has little analytic capability
• E.g. sales and marketing summaries, actual vs. predicted sales
of items by region
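The "actual vs. predicted sales by region" report can be sketched as a straightforward summary of TPS records. The following is a minimal illustration, assuming made-up transaction tuples and a made-up forecast; the function name `summarize` is hypothetical.

```python
from collections import defaultdict

# Made-up TPS records: (region, item, amount of one sale in dollars)
transactions = [
    ("East", "candy", 120.0),
    ("East", "candy", 80.0),
    ("West", "candy", 150.0),
    ("West", "costumes", 300.0),
]

# Made-up sales forecast per (region, item)
predicted = {
    ("East", "candy"): 250.0,
    ("West", "candy"): 100.0,
    ("West", "costumes"): 300.0,
}


def summarize(transactions, predicted):
    """Return {(region, item): (actual, predicted, actual - predicted)}."""
    actual = defaultdict(float)
    for region, item, amount in transactions:
        actual[(region, item)] += amount
    report = {}
    for key in sorted(set(actual) | set(predicted)):
        a = actual.get(key, 0.0)
        p = predicted.get(key, 0.0)
        report[key] = (a, p, a - p)
    return report


report = summarize(transactions, predicted)
for (region, item), (a, p, diff) in report.items():
    print(f"{region:5} {item:9} actual={a:8.2f} predicted={p:8.2f} diff={diff:+9.2f}")
```

Note that the report only totals and compares; it does no modelling, which matches the point that an MIS typically has little analytic capability.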
336
IS for Different Management Levels
Decision Support System (DSS)
• Supports non-routine decision making by middle management
- Example: What is the impact on the production schedule if
December sales double?
• Often uses external information as well as information from
the TPS and MIS
• E.g. create a statistical model of how sales relate to other
factors, such as vacations abroad and the value of the Canadian
dollar, or car sales and the price of gasoline
337
IS for Different Management Levels
Decision Support System (DSS)
• Example in book, DSS uses information about
- ship speed and capacity
- port distance
- fuel consumption, fuel cost
- cost to hire crew for that ship
- expense to dock at port
• to create competitive bids on transporting goods by ship
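The inputs listed above could be combined into a bid along the following lines. This is a sketch under stated assumptions, not the model from the course text: the cost formula, the 10% margin, the units (nautical miles, knots, tonnes), and all figures are made up for illustration.

```python
def voyage_cost(distance_nm, speed_knots, fuel_tonnes_per_day,
                fuel_price_per_tonne, crew_cost_per_day, port_fees):
    """Estimate the cost of one voyage from the listed inputs."""
    days_at_sea = distance_nm / (speed_knots * 24)   # knots = nautical miles per hour
    fuel = days_at_sea * fuel_tonnes_per_day * fuel_price_per_tonne
    crew = days_at_sea * crew_cost_per_day
    return fuel + crew + port_fees


def bid_per_tonne(cost, cargo_tonnes, margin=0.10):
    """Price per tonne of cargo that covers the cost plus an assumed margin."""
    return cost * (1 + margin) / cargo_tonnes


# Made-up voyage: 4800 nm at 20 knots is 10 days at sea.
cost = voyage_cost(
    distance_nm=4800, speed_knots=20,
    fuel_tonnes_per_day=30, fuel_price_per_tonne=500,
    crew_cost_per_day=5000, port_fees=20000,
)
print(f"voyage cost: ${cost:,.0f}")
print(f"bid per tonne: ${bid_per_tonne(cost, cargo_tonnes=20000):.2f}")
```

A decision maker could rerun the calculation with a different ship or fuel price to see how the bid changes, which is the kind of what-if analysis a DSS supports.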
IS for Different Management Levels
Executive Support System (ESS)
• Support non-routine decisions requiring judgment, evaluation,
and insight by senior management
• Specialized version of a DSS
• Graphical displays, friendly user interface
• Ability to drill down to info from MIS and DSS
• E.g. an ESS that provides a minute-to-minute view of the firm’s
financial performance as measured by working capital,
accounts receivable, accounts payable, cash flow, and
inventory
339
IS for Different Management Levels
System Relationships
Figure from 4th edition
of course text
340
Topic 6 – Organizations and IS
Key Concepts
• The behavioural view of organizations
• The impact of IS on organizations
• Two Ways of Creating a Competitive Advantage
1. Porter’s Competitive Forces Model
2. The Value Chain Model
References
• course text, Chapter 3.1 – 3.3 Information Systems,
Organizations, and Strategy
341
Overview of Organizations
Motivation
1. In a high tech company, other than the senior executives,
which position/job pays the most?
- technical sales: i.e. people who have both technical
knowledge and the ability to influence other people
2. What is office politics?
- the strategies people use to gain advantage in the
workplace
3. Has anyone ever observed office politics taking place?
342
Overview of Organizations
Office Politics Examples
1. In a group meeting, your boss tells a joke. You’ve heard it
before and don’t think it is funny. Do you laugh at the joke
anyway? Why or why not?
2. Your department head is proposing a project which you think
is doomed to fail. What do you do?
343
Overview of Organizations
Office Politics Skills
What are some strategies for succeeding in an organization other
than (or in addition to) excellent technical skills?
• Give and receive feedback in an effective manner
• Be unconditionally cooperative
• Develop good communications skills
• Develop good interpersonal skills
• Don’t pass on gossip
• Seek advice from knowledgeable people
• Consult with the people who will be affected by a decision you
are making
344
Overview of Organizations
Quick Review
• Recall (from slide 299, Topic 4 Management Information Systems)
that this course is taking a Sociotechnical Approach, i.e.
- optimal organizational performance is achieved by jointly
optimizing both the social and technical systems used in
production (pg 24)
- both behavioural and technical aspects need to be considered
(pg 24)
• Why?
Because organizations have both of these components.
345
Overview of Organizations
Technical Microeconomic View
• The technical view: An organization is a stable, formal social
structure that uses capital and labour from the environment as
input and processes them to produce products and services
(course text pg 66).
course text, Figure 3-2
346
Overview of Organizations
Behavioural View
The behavioural view of an organization looks at the structures
and processes within the organization.
[Figure: course text, Figure 3-2, showing environmental resources
as inputs and environmental outputs]
347
Features of Organizations
Behavioural View
In order to introduce an IS into an organization you would have
to take the following into account...
• Routines and Business Processes: organizations become very
efficient over time because they develop routines (or standard
operating procedures) to deal with (almost all) situations
• Organizational Politics: people with different positions and
backgrounds will have different points of view and will struggle
for limited company resources.
- Many will resist change they do not agree with.
348
Features of Organizations
Behavioural View
• Organizational Culture: the unquestioned assumptions that
organizations make about their goals and products.
- Anything that challenges these assumptions will be met with
resistance.
• Organizational Environment: government (i.e. regulations),
competitors, customers, financial institutions, culture,
technology, knowledge.
- IS can help identify changes that the company should
respond to.
349
Features of Organizations
Behavioural View
• Organizational Structure: different organizational structures
would have different ISs, e.g. an entrepreneurial structure
(simple flat structure) might have a single IS whereas a
professional bureaucracy (many independent departments,
such as UW) may have several independent systems.
• Other Organizational Features
- democratic vs. authoritarian leadership
- benefit stockholders (for profit) vs. benefit society (nonprofit)
350
Impact of IS on Organizations
IS Reduces the Cost of Information
• IS helps reduce transaction costs
- i.e. the costs associated with an organization buying a
product or service
- e.g. the cost of communicating with suppliers, obtaining
information about products, monitoring contract
compliance
• IS helps reduce agency costs
- i.e. the costs associated with managing agents (employees)
so that they will act in the interests of the company rather
than in their own self-interest
351
Impact of IS on Organizations
IS Reduces the Cost of Information
• IT flattens organizations
- management becomes more efficient ⇒ fewer managers are needed
- lower levels have easier access to relevant information
• IT innovations cause resistance because they affect
- the organizational structure
- the job tasks
- the people
- the information technology
• The most common reason for IT innovation failure is the
organization’s resistance to change
352
Competitive Advantage
Recall Case Study: IT in Walmart
• Walmart is the leader in retail sales, largely because
it is also among the leaders in using information technology
• They have a competitive advantage, i.e. they use commonly
available resources more efficiently.
• How can a company create a competitive advantage?
Answer: we will consider two models ...
1. Porter’s Competitive Forces Model
2. The Business Value Chain Model
353
Competitive Advantage
#1: Porter’s Competitive Forces Model
The strategies a firm uses are determined by five factors.
course text, Figure 3-8
354
Competitive Advantage
#1: Porter’s Competitive Forces Model
• Traditional competitors try to attract your customers.
• New market entrants are more likely when the cost of entry is
low.
• If your prices get too high customers may seek substitute
products.
• The power of customers increases if they can easily switch to a
competitor’s products, if prices are transparent, and if products
are undifferentiated.
• The more suppliers the company has for an item, the more
control it has over prices.
355
Competitive Advantage
#1: Four Basic Competitive Strategies
Use information systems to...
• decrease costs (e.g. Walmart) or increase quality (e.g. smart
phones in general)
• differentiate products and enable new products and services
(e.g. Apple)
• focus on a market niche, i.e. specialize (e.g. high end hotels)
• develop strong ties with suppliers (e.g. Chrysler) or customers
(e.g. Amazon, Chapters).
356
Competitive Advantage
Case Study: UWaterloo
• Which of the competitive forces in Porter’s Model is the
biggest threat to UWaterloo? Why?
• Recently, Maclean’s ranked UW on top in Best Overall, Most
Innovative, and Leaders of Tomorrow.
• However, UW received only an average rating for students’
experience with their education. Based on that, which of the
competitive forces in Porter’s Model is the biggest threat to
UWaterloo?
• How does MIS help? Which IS strategy would you recommend?
357
Competitive Advantage
#2: The Business Value Chain Model
Course text,
Figure 3-9
CS 330
Spring 2019
358
Competitive Advantage
#2: The Business Value Chain Model
• Identifies where information systems are particularly helpful in
creating a competitive advantage
• Two broad areas to consider
- primary activities: directly related to creating the product or
service
- support activities: make the primary activities possible
359
Competitive Advantage
#2: The Business Value Chain Model
• Primary Activities (directly related to creating the product or
service) include
- automated warehouse systems
- computer controlled manufacturing
- computerized ordering systems
- equipment maintenance systems
- automated shipping
360
Competitive Advantage
#2: The Business Value Chain Model
• Support Activities (make the primary activities possible)
include
- electronic scheduling and messaging systems
- workforce planning systems
- computer-aided design (CAD) systems
- computerized ordering systems
361
Topic 7 – Social, Ethical, and Legal Issues
Key Concepts
• The social, ethical and legal issues raised by information
technology.
• Ethical principles that may help us make decisions to deal with
these issues.
References
• course text, Chapter 4 Social, Ethical, and Legal Issues in the
Digital Firm
362
Moral Dimensions of Information Age
Technology Trends
• Computing Power Increases ⇒ more dependence on
computers
• Storage Costs Decreasing ⇒ cheaper to store information
about individuals
• Big Data Techniques ⇒ can develop (mostly accurate) profiles
of individuals
• Growth of Internet ⇒ easy to access and copy personal data
• Growth of Mobile Phone Usage ⇒ location may be tracked
without user knowledge or consent
- e.g. turn off location and you can still be tracked
363
Moral Dimensions of Information Age
Implications
The rise of computers and the Internet has raised five areas of
ethical, social and political concern
1. Personal information rights and obligations
e.g. what rights do we have to protect ourselves from others
tracking our personal information
2. Digital property rights and obligations
e.g. music/video/software piracy
3. Data and system quality
e.g. is the data about me correct, secure
364
Moral Dimensions of Information Age
Implications
4. Accountability, liability and control
e.g. who is held accountable for any harm done when
customer’s data is stolen
5. Quality of life
e.g. maintaining boundaries between work and home life
Example: Privacy and Social Networks produced by the Office of
the Privacy Commissioner of Canada
https://www.youtube.com/watch?v=X7gWEgHeXcA
365
Cautionary Tales
Loss of Control
• For information posted on the web or sent through email, you
generally have no control of ...
- how it is used:
prank, ridicule, spam, identity theft
- how it is interpreted:
humorous vs. insulting, intentional vs. accidental
• Often websites are able to use the text, pictures, or videos you
post for whatever purpose they want.
366
Moral Dimensions of Information Age
Ethical, Social, Legal Aspects
• Ethical: principles of right and wrong that individuals use to
make choices to guide their behaviors
• Social: affecting people and communication, i.e. etiquette,
expectations, social responsibility (acting for the benefit of
society), changing social institutions (family, education,
organizations)
• Legal/Political: knowing the law and working within its limits,
i.e. changing old laws, creating new laws, and understanding
existing laws
367
Ethics
Key Concepts
• Responsibility: accepting the potential costs, duties, and
obligations for decisions
• Accountability: provide mechanisms to identify who is
responsible
• Liability: laws exist that permit individuals to recover damages
done to them
• Due process: laws are well known and understood, can appeal
to a higher authority to ensure that the laws are applied
correctly
368
Cautionary Tales
Use of Company Computers
• Generally companies own the information on their computers,
tablets and cell phones
• In the past, UW has allowed police access to the email of UW
students caught running a meth lab
• Police have obtained IP addresses of company computers used
to “anonymously” create malicious posts about someone else.
They then approached the company and asked who used that
computer.
• Exception: when the company allows you to use a company laptop
for personal use.
369
Ethics
Ethical Principles
• Golden Rule: Do unto others as you would have them do unto
you.
• Kant’s Categorical Imperative: If an action is not right for
everyone to take, then it is not right for anyone.
• Descartes’ Rule of Change: If an action cannot be taken
repeatedly, then it is not right to be taken at any time (e.g.
using $1 worth of office supplies for personal use).
• Utilitarian Principle: Take the action that achieves the higher or
greatest value for all concerned.
• Risk Aversion Principle: Take the action that produces the least
harm or incurs the least cost to all concerned.
370
Ethics
An Ethical Decision
To what extent should companies monitor their employees at
work? Monitor everything? Monitor nothing? Is there middle
ground?
371
Concern 1: Personal Information
Privacy
• Privacy is the claim of individuals to be left alone, free from
surveillance or interference from other individuals,
organizations, or the state.
• In Canada we have the Personal Information Protection and
Electronic Documents Act (PIPEDA)
• It establishes principles for the collection, use, and disclosure
of personal information.
• Organizations need informed consent to collect and use
customer data.
• Canadian law is stricter than US law but less strict than European law.
372
Concern 1: Personal Information
What is it?
According to PIPEDA personal information (PI) includes
• demographics: age, income, ethnic origin, religion, marital
status
• internet: e-mail, e-mail address, IP address
• physical: age, height, weight, medical records, blood type,
fingerprints
• financial: purchases, spending habits, banking information,
credit/debit card data, loan or credit reports, tax returns,
Social Insurance Number
source: https://www.priv.gc.ca/information/pub/guide_ind_e.asp
373
Concern 1: Personal Information
How is your PI Protected?
PIPEDA’s Principles for the Treatment of PI
• Accountability: appoint someone to be responsible
• Consent: inform you of the purpose of collecting that info
• Limiting use: only use it for purposes you consent to
• Safeguards: your PI must be protected
• Individual access: you have the right to access your PI
source: https://www.priv.gc.ca/information/pub/guide_ind_e.asp
374
Concern 1: Personal Information
How is your PI Protected?
PIPEDA’s Principles for the Treatment of PI
• Identifying purposes: the reason for collecting your PI must be
identified
• Limiting collection: only gather information that is necessary
• Accuracy: should keep your info accurate
• Openness: privacy policy should be easy to find and
understand
• Recourse: you should be provided with a complaint procedure
source: https://www.priv.gc.ca/information/pub/guide_ind_e.asp
375
Concern 1: Personal Information
Concerns
• Terms of service are often all-or-nothing: if you use the website
or app, you must agree to give up your privacy.
• Often companies will provide your PI to “affiliates” or “trusted
partners” ⇒ Who are they?
• Often companies say they keep information needed for business
purposes ⇒ What PI and what purposes?
• Often companies can keep your PI for as long as they want, i.e.
your PI has dual ownership,
e.g. Facebook.
https://www.youtube.com/watch?v=Gb29Rcycjv0
376
Concern 1: Personal Information
Internet Challenges to Privacy
• cookies
- a website stores a unique bit of data (like an account
number) on your device
- think of the cookie as a primary key identifying you in their
database
- use this data to track your activity on the site
• third party cookies
- companies like Facebook, Google, Amazon, track your
activity across many websites, not just their own
- even if you do not have a Facebook account, Facebook
tracks you
- use this technology to get a more complete picture of you
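The "cookie as primary key" idea can be sketched in a few lines. This is only an illustration: the cookie name visitor_id, the visits dictionary, and the handle_request function are hypothetical, and a real site would sit behind a web server and a database.

```python
import uuid
from http.cookies import SimpleCookie

# Hypothetical in-memory "database": cookie id -> list of pages visited.
visits = {}

def handle_request(cookie_header, page):
    """Log the visit; return a Set-Cookie value for new browsers, else None."""
    cookie = SimpleCookie(cookie_header or "")
    if "visitor_id" in cookie:
        visitor_id = cookie["visitor_id"].value   # browser already holds an id
        set_cookie = None
    else:
        visitor_id = uuid.uuid4().hex             # new unique id for a new browser
        set_cookie = f"visitor_id={visitor_id}"
    # The cookie value acts like a primary key into the site's records.
    visits.setdefault(visitor_id, []).append(page)
    return set_cookie

first = handle_request(None, "/shoes")    # first visit: the site issues a cookie
handle_request(first, "/checkout")        # later visit: the browser sends it back
print(visits)
```

Both page views end up under the same key, which is how activity on the site is tied together.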
377
Concern 1: Personal Information
Internet Challenges to Privacy – Cookies
source: course text, Figure 4-3
378
Concern 1: Personal Information
Internet Challenges to Privacy
• web beacons – websites can tell that you’ve viewed a certain
item, say an ad in your email
- typically a small picture the same colour as the background
(so you don’t see it)
- could be on a website or in an email
• spyware – software that tracks where you have surfed,
typically spotted by virus protection programs
• each smartphone has a unique International Mobile (Station)
Equipment Identity (IMEI) associated with it (try dialing
*#06#) that identifies the device and can be used to blacklist a
phone in case of theft
379
Concern 1: Personal Information
Internet Challenges to Privacy
• browser fingerprinting
- Each computer/cell phone has many
▪ settings (e.g. has “do not track” activated)
▪ hardware specs (e.g. screen dimensions)
- The combination of these properties that browsers can
report makes each cell phone/laptop rare (or unique), e.g.
https://panopticlick.eff.org/
https://amiunique.org/
- This rareness provides a way for companies to track you
even if many other tracking methods have been blocked.
- Currently Firefox has the best support to limit this approach.
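A rough sketch of why a combination of ordinary properties can act as an identifier. The property names and values below are invented for illustration; real fingerprinting scripts read many more attributes from the browser.

```python
import hashlib

def fingerprint(properties):
    """Hash a dict of browser-reported properties into one short identifier."""
    # Sort the keys so the same properties always give the same string.
    canonical = "|".join(f"{key}={properties[key]}" for key in sorted(properties))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Hypothetical devices: no single value is unusual, but the combination is rare.
laptop = {"do_not_track": "1", "screen": "1920x1080", "timezone": "UTC-5", "fonts": "214"}
phone = {"do_not_track": "0", "screen": "1080x2340", "timezone": "UTC-5", "fonts": "37"}

print(fingerprint(laptop))
print(fingerprint(phone))
```

No cookie is stored on the device here, so deleting cookies does not change the identifier.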
380
Concern 1: Personal Information
What Information is Collected
• Your posts, posts you started but then deleted
• Sites visited, posts read, videos viewed
• Searches, location, relationships
• Items you have purchased, items you have looked at
• Images from computer camera
• Sounds overheard by personal assistant (e.g. Alexa)
• Health, medical data and financial data
E.g. Facebook and Google
https://gizmodo.com/all-the-ways-facebook-tracks-you-that-you-might-not-kno-1795604150
https://www.nytimes.com/interactive/2019/07/10/opinion/google-privacy-policy.html
381
Concern 1: Personal Information
How is that Information Used?
• Advertising products and services you may be interested in
• Tailoring the content that you see
- suggesting articles / videos you may be interested in
- limitation: the echo-chamber effect
• Hiring decisions
• Insurance coverage/premiums
• Preferential offers/pricing/etc.
• Identifying security risks
• Solving crimes
• Combined with other information …
382
Concern 1: Personal Information
How is that Information Used?
NORA: nonobvious relationship awareness
NORA combines info from various sources
(telephone listings, lists of customers) to
create a more detailed profile of each
person.
E.g.
https://www.youtube.com/watch?v=V7M_FOhXXKM
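A minimal sketch of the combining step, assuming the sources happen to share a phone number. The records, field names, and build_profiles function are made up for illustration; real NORA systems match on far fuzzier evidence.

```python
# Two hypothetical sources that each know a little about the same person.
phone_listings = [{"phone": "555-0101", "name": "Pat Lee", "address": "12 King St"}]
customers = [{"phone": "555-0101", "purchases": ["tent", "stove"]}]

def build_profiles(*sources, key="phone"):
    """Merge records from every source that share the same key value."""
    profiles = {}
    for source in sources:
        for record in source:
            profiles.setdefault(record[key], {}).update(record)
    return profiles

profiles = build_profiles(phone_listings, customers)
print(profiles["555-0101"])   # name, address and purchases in one profile
```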
383
Concern 1: Personal Information
Strategies
Chris is concerned about privacy so deletes the browser’s
cookies once a week. What are the limitations of this strategy?
• If at some point Chris had logged into website XYZ with the
previous cookies and then logs in again to the same account
with the new cookies...
then website XYZ can match up (i.e. associate) both sets of
cookies to the same person and continue to accumulate
information about Chris.
• The site can also use fingerprinting to tentatively associate the
old and the new version of the cookie.
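The limitation can be sketched as follows. The tracker state and the visit and profile functions are hypothetical; the point is only that a login ties each fresh cookie back to the same account.

```python
# Hypothetical tracker state kept by website XYZ.
cookie_to_account = {}   # cookie id -> account name, once the user logs in
history = {}             # cookie id -> pages seen with that cookie

def visit(cookie_id, page, login=None):
    history.setdefault(cookie_id, []).append(page)
    if login is not None:
        cookie_to_account[cookie_id] = login

def profile(account):
    """Merge the history of every cookie ever tied to this account."""
    pages = []
    for cookie_id, owner in cookie_to_account.items():
        if owner == account:
            pages.extend(history[cookie_id])
    return pages

visit("cookie-old", "/news")
visit("cookie-old", "/account", login="chris")
# Chris deletes all cookies; the site issues a new one on the next visit.
visit("cookie-new", "/shop")
visit("cookie-new", "/account", login="chris")

print(profile("chris"))   # pages from both the old and the new cookie
```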
384
Concern 1: Personal Information
Strategies
Is there a better strategy that is not too much of a hassle?
• Use two browsers, one for day-to-day access (of accounts you
have to sign in to) and one for more private access.
• Delete any cookies from the one you use for private access
quite frequently and don’t log into any accounts with it.
• Consider using a computer in a lab or library for certain types
of searches.
- Do not log into any of your regular websites here.
- Generally, activity can be traced to the organization but
not to you personally. However, the police could still discover
your identity by contacting the university / library.
385
Concern 1: Personal Information
Viewing your Cookies in Chrome
• While browsing, click on the lock (on the LHS of the address
bar) to view the cookies that site is using.
• To see all your cookies go to chrome://settings/siteData
- Click on the triangle to see the cookies for each site.
- Click on the chevron (i.e. ‘v’) to view a particular cookie.
Viewing your Cookies in Firefox
• While browsing, click on the ‘i’ (on the LHS of the address bar)
to view the cookies that site is using.
• To see all your cookies, go to about:preferences#privacy
• In the Cookies and Site Data section select Manage Data…
386
Concern 2: Digital Property Rights
What is IP
• Intellectual property (IP) is intangible property (a recipe, a
song, an invention, software) created by individuals or
corporations
• Depending on what it is, it can be protected by one of the
following legal traditions:
a) Trade secret
b) Copyright
c) Patent
387
Concern 2: Digital Property Rights
Trade Secret
• A trade secret is intellectual work or product belonging to a
business, provided it is not in the public domain, that confers
economic advantage, and reasonable attempts have been
made to keep it secret.
- e.g. recipe for Coke or the layout of a chemical plant
• the risk is that there is a breach of confidentiality
- e.g. publishing the recipe for Coke
• most End User License Agreements (EULAs) prohibit the
reverse engineering of a computer program
388
Concern 2: Digital Property Rights
Copyright
• A copyright protects original literary, musical, artistic, dramatic
works and computer software.
• Prohibits copying of entire work or parts for at least 50 years
• Copyrighting the look and feel of a device is still a murky issue
- Apple v. Microsoft (1994): look and feel of Mac OS vs. MS
Windows 2.0
- Apple v. Samsung (2011): look and feel of smart phones and
tablets
389
Concern 2: Digital Property Rights
Patent
• A patent grants the owner an exclusive monopoly on the ideas
behind an invention for between 17 and 20 years
• intended to promote innovation by protecting investments
made to commercialize inventions
• originality, novelty, and invention are key concepts
• can offer protection in all 160 countries that are members of
the World Trade Organization (WTO)
• cannot patent software in Canada, can in the US
390
Concern 2: Digital Property Rights
Challenges to IP Rights
• the internet has made it easy to copy and distribute intellectual
property
• perfect digital copies cost almost nothing
• sharing of digital content over the Internet costs almost
nothing
• a web page may present data from many sources
• sites and software for file sharing are hard to regulate
391
Concern 2: Digital Property Rights
Canada’s Response
• the Copyright Modernization Act (2011)
• cannot circumvent digital locks
• time shifting, format shifting, and backup copies are OK as long
as there are no digital locks
• fair dealing (Canada’s counterpart to fair use) provisions for
education, satire, and parody
• damages for non-commercial infringement (e.g. illegally
downloading music and videos) limited to between $100 and $5000
392
Concern 2: Digital Property Rights
Canada’s Response
• includes a notice-and-notice provision (in effect as of Jan 1,
2015)
• copyright holders notify ISP about infringement
• ISP notifies customer (without revealing customer’s identity to
copyright holder)
• copyright holder still has to get a court order for an ISP to
reveal a customer’s identity
http://www.theglobeandmail.com/technology/digital-culture/canadiandownloaders-should-expect-a-copyright-notice-in-the-mail/article22336673/
393
Concern 3: Data Quality and System Errors
The Issue
• No large program is error-free: errors exist with a low
probability
• it is impossible to test every combination of inputs
• software producers knowingly ship products with bugs
• the number of bugs can reach a steady state: in the process of
fixing existing bugs, new bugs are created
• the largest source of error is poor data quality rather than
faulty hardware or software
394
Concern 3: Data Quality and System Errors
Example: Design Flaw to Cost Intel $1 Billion
• In 2011, Intel temporarily halted shipments of a new chip
platform due to a design flaw that may cause 5% of chips to fail
over the next three to five years.
• It's estimated the move will cost Intel $1 billion.
• Costs include having to fix nearly half a million desktops and
laptops already out there.
source: New York Times
http://www.nytimes.com/2011/02/01/technology/01chip.html
395
Concern 4: Accountability and Liability
Software Company’s Liability
• software is typically licensed not sold
• most End User License Agreements (EULAs) limit liability
• in law, publishers of books and magazines are not legally liable
for their content, to allow for freedom of expression
• when software acts more like a book (an information provider)
the producer is not liable
• when software acts more like a machine controller (a service
provider) the producer can be held liable
396
Concern 5: Quality of Life
IS Have Negative Social Costs
• Blurring work-home boundaries: employees are expected to do
more work at home with company laptops and cell phones
• Centralized control structure: companies such as Google,
Facebook, Amazon and Microsoft dominate the collection of
personal information
• Rapidity of change: because of globalization, companies must
respond very quickly to any changes in the environment
397
Concern 5: Quality of Life
IS Have Negative Social Costs
• Dependency on IS: many companies are vulnerable to any failure
in their IS, yet these systems are not regulated
• Cybercrime: whole new areas of crime have opened up and
institutions have been slow to respond, e.g. malware infection,
phishing fraud, hardware theft, attacks by botnets
• Job Loss
• Repetitive Stress Injury / Carpal Tunnel Syndrome
398
Topic 8 – Security
Key Concepts
• Secure Communication
• The Problem
• Common Malware
• Computer Security
• Tools for Protecting IS
• Wireless Security
• Securing Your System
• Security and Control Framework
References
• course text, Chapter 8, Securing Information Systems
399
Secure Communication
Basic Idea
• encryption: render a message unreadable so anyone seeing it
will not be able to determine the original message
• decryption: retrieve the original message
• The strength of an encryption method depends on the number of
possible keys ⇒ the more keys there are, the longer it takes to
try them all
• e.g. pick a key of length one, add 3 to each letter
  plain text:  meet me after the toga party
  key:         3333 33 33333 333 3333 33333
  cypher text: phhw ph diwhu wkh wrjd sduwb
• ‘e’ and ‘t’ are common in the plain text,
  ‘h’ and ‘w’ are common in the cypher text.
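The add-3 example can be reproduced with a short sketch. The function names shift_encrypt and top_letters are chosen here for illustration, and only lowercase letters a to z are shifted.

```python
from collections import Counter

def shift_encrypt(plain, shift):
    """Shift each lowercase letter forward by `shift`, wrapping z around to a."""
    out = []
    for ch in plain:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)        # spaces are left alone
    return "".join(out)

def top_letters(text, n=2):
    """Return the n most frequent letters, ignoring spaces."""
    return [letter for letter, _ in Counter(text.replace(" ", "")).most_common(n)]

plain = "meet me after the toga party"
cipher = shift_encrypt(plain, 3)

print(cipher)
print(top_letters(plain), top_letters(cipher))
print(shift_encrypt(cipher, -3))   # shifting back by 3 decrypts
```

Because every letter moves by the same amount, the most common letters in the cypher text line up with the most common letters in the plain text, which gives the key away.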
400
Secure Communication
Basic Idea
• The number of possible keys is a function of the length of the
key, e.g. a longer key means more possible key values.
• E.g. pick a key of length five, say 3, 6, 5, 2, 4.
• Add 3, 6, 5, 2, 4 respectively to each sequence of five letters
  plain text:  meet me after the toga party
  key:         3652 43 65243 652 4365 24365
  cypher text: pkjv qh gkviu zmg xrmf reuzd
• The longer key makes it harder to use statistics to find out
which letters correspond to ‘e’ or ‘t’.
• Called symmetric key encryption: the same key is used to
encrypt and decrypt the message.
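A sketch of the length-five key, assuming (as in the example) that spaces are skipped and do not use up a key digit. The function name repeating_key and its sign parameter are illustrative choices.

```python
def repeating_key(text, key, sign=1):
    """Add (sign=1) or subtract (sign=-1) the repeating key from each letter."""
    out = []
    position = 0                      # counts letters only, not spaces
    for ch in text:
        if "a" <= ch <= "z":
            shift = sign * key[position % len(key)]
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
            position += 1
        else:
            out.append(ch)
    return "".join(out)

key = [3, 6, 5, 2, 4]
cipher = repeating_key("meet me after the toga party", key)
plain = repeating_key(cipher, key, sign=-1)   # same key, opposite direction

print(cipher)
print(plain)
```

The same key both encrypts and decrypts, which is what makes this a symmetric scheme.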
401
Secure Communication
Brute Force Search
• To use brute force search means to try every possible key to
find the actual key
• the difficulty grows exponentially with key size
Key Size   Number of              Time required at
(bits)     Possible Keys          10^12 attempts/sec
32         2^32  = 4.3 x 10^9     2.15 milliseconds
56         2^56  = 7.2 x 10^16    10 hours
128        2^128 = 3.4 x 10^38    5.4 x 10^18 years
168        2^168 = 3.7 x 10^50    5.9 x 10^30 years
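The times in the table appear to assume that, on average, the key turns up after half of the possible keys have been tried. Under that assumption the figures can be recomputed as follows (the constant and function names are illustrative).

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
ATTEMPTS_PER_SECOND = 10**12

def average_crack_seconds(key_bits):
    """Average search time: half of the 2**key_bits keys at the given rate."""
    return (2**key_bits / 2) / ATTEMPTS_PER_SECOND

for bits in (32, 56, 128, 168):
    seconds = average_crack_seconds(bits)
    years = seconds / SECONDS_PER_YEAR
    print(f"{bits:>3} bits: {2**bits:.1e} keys, {seconds:.3g} seconds, {years:.2g} years")
```

Each extra bit doubles the search time, which is why the jump from 56 to 128 bits moves the answer from hours to far longer than the age of the universe.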
402
Secure Communication
Computationally Secure
• computationally secure: an encryption method is
computationally secure if it will take the attacker a very long
time to crack the message using the best existing technology
• what is secure today may not be secure years from now
- implication of Moore’s Law
- novel methods, e.g. quantum computation
• for an example of a method that is computationally secure
consider secure hashing…
403
Secure Communication
Secure Hashing Example: SHA256
• A hash function is a computer function that maps input of any
size onto an output of a fixed size.
• Secure Hash Algorithms (SHA) are a family of hashing functions.
• SHA256 maps any message to a 32 byte (256 bit) number
- i.e. there are 2^256 ≈ 1.16 x 10^77 different output values
• Change the input even slightly and the hash value (i.e. the
output) changes considerably.
• Given a value x, it is computationally hard to come up with a
message m such that SHA256(m) = x (typically you’d use brute force)
• For an online SHA256 calculator see
https://www.tools4noobs.com/online_php_functions/sha256/
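The same properties can be checked locally with Python's standard hashlib module. The helper name sha256_hex and the sample messages are illustrative.

```python
import hashlib

def sha256_hex(message):
    """Return the SHA256 hash of a text message as 64 hexadecimal digits."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

a = sha256_hex("buy apple stock")
b = sha256_hex("buy apple stocks")    # one extra letter

print(len(a) * 4, "bits")             # the output is 256 bits for any input
print(a)
print(b)
print(sum(x != y for x, y in zip(a, b)), "of 64 hex digits differ")
```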
404
Secure Communication
Key Distribution Problem
• Recall that in symmetric key encryption both parties must
know the key.
• How do both parties get the symmetric key when you want to
buy something from a web site for the first time?
• How do you and the web site agree on a key?
• This challenge is called the key distribution problem
• The solution that is currently used is called public key
encryption...
405
Secure Communication
Public Key Encryption
• Idea: use a pair of keys: a public key and a private key
• The two keys are mathematically related so that when you
encrypt with either one, the only way to decrypt (other than
brute force) is using the other one.
• It is generally used to exchange a shared key or a digital
signature, rather than a whole message.
• For an example of how encryption and decryption are done see
https://www.cemc.uwaterloo.ca/resources/real-world/RSA.pdf
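As a toy illustration of the key-pair idea, here is RSA with deliberately tiny primes. The numbers (61, 53, 17) are a common textbook choice and are an assumption of this sketch, not values from the slides; real keys use primes hundreds of digits long.

```python
# Toy key generation (far too small to be secure).
p, q = 61, 53
n = p * q                    # 3233, shared by both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent: the public key is (n, e)
d = pow(e, -1, phi)          # private exponent (Python 3.8+): the private key is (n, d)

def apply_key(value, exponent, modulus=n):
    """Encrypting and decrypting are the same operation with different exponents."""
    return pow(value, exponent, modulus)

shared_key = 1234                        # e.g. a small symmetric key to exchange
sent = apply_key(shared_key, e)          # encrypt with the public key
received = apply_key(sent, d)            # only the private key undoes it

print(sent, received)
print(apply_key(apply_key(shared_key, d), e))   # works in the other order too
```

Encrypting with the private key and decrypting with the public key is the direction a digital signature uses.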
406
Secure Communication
Digital Signature
• Goal: to show that the message came from the sender rather
than an imposter (it is authentic) and has not been tampered
with (it has data integrity).
• A digital signature uses a hash function to convert the message
m into a number. Call the hash function, h( ).
• E.g. for the message “buy apple stock”, associate a number with
each letter (a = 1, b = 2, ..., z = 26) and add them up (mod 1000).
m    = b  u  y     a  p  p  l  e     s  t  o  c  k
h(m) = 2+21+25  +  1+16+16+12+5  +  19+20+15+3+11
     = 166
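The letter-sum hash is easy to sketch, and doing so shows why it is only a teaching example: any rearrangement of the letters gives the same value. The function name letter_sum_hash is illustrative, and the numbering a = 1 through z = 26 is assumed.

```python
def letter_sum_hash(message, modulus=1000):
    """Add up letter positions (a=1 ... z=26), ignoring spaces, mod `modulus`."""
    total = 0
    for ch in message.lower():
        if "a" <= ch <= "z":
            total += ord(ch) - ord("a") + 1
    return total % modulus

original = letter_sum_hash("buy apple stock")
rearranged = letter_sum_hash("buy stock apple")   # same letters, different message

print(original, rearranged, original == rearranged)
```

A secure hash such as SHA256 is designed so that finding two messages with the same hash is computationally hard.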
407
Secure Communication
Digital Signature
• The sender and receiver agree on a hash function, e.g. SHA256
• If the sender wants to send message m_s
- calculate the hash of the message, h(m_s)
- encrypt h(m_s) with the sender’s private key: encrypt(h(m_s))
- send m_s and encrypt(h(m_s))
• Receiver
- receives m_r and calculates the hash of m_r, i.e. h(m_r)
- decrypts encrypt(h(m_s)) using the sender’s public key and
checks to see if it equals h(m_r)
- if the decrypted value equals h(m_r), then m_r came from the
sender and has not been altered
408
Secure Communication
Digital Signature
• the message is not secret
• these steps only guarantee that the message
- came from the sender and
- has not been tampered with
• only the sender could encrypt h(m_s) with the sender’s private
key
• anyone can decrypt it with the sender’s public key and verify
that it did come from sender.
• But how do you find out the sender’s public key in a reliable
way? ⇒ need a certificate authority
409
Secure Communication
Digital Signature
image source: http://en.wikipedia.org/wiki/Digital_signature
410
Secure Communication
Certificates
• The certificate has the digital signature of a known Certificate
Authority (CA).
• These are a small number of trusted organizations.
• A list of them and their public keys is included with a browser.
• The browser can verify
- the legitimacy of the digital signature,
- hence the legitimacy of the certificate,
- hence the public key of the certificate holder.
• https is based on using CAs and certificates ...
411
Secure Communication
Secure Browsing – Part 1
• Say Pat wants to buy a book from Amazon for the first time.
• When Amazon first started, it created a pair of keys, one public
and one private.
• It submitted the public key to a Certificate Authority (CA), say
DigiCert, to get a certificate.
• The CA verifies offline (e.g. through the mail) that this is
Amazon’s public key.
• Once verified, the CA then creates a certificate for Amazon
(digitally signed by the CA).
• The certificate contains information about Amazon and its
public key.
412
Secure Communication
Secure Browsing – Part 2
• When Pat signs up for an account, Amazon presents its
certificate to Pat’s browser.
• The process to verify the certificate is done by the browser.
• The browser verifies that the certificate has been signed by a
recognized CA (checked using that CA’s public key).
• If the certificate is valid then its contents (which include the
public key of Amazon) are also valid.
• The browser then extracts Amazon’s public key from the
certificate and can now send Amazon an encrypted message
that only Amazon can decrypt.
413
Secure Communication
Secure Browsing – Part 3
• The browser then randomly generates a symmetric key and
encrypts it using Amazon’s public key and sends it back to
Amazon.
• Since it is encrypted with Amazon’s public key, it can only be
decrypted by Amazon’s private key.
• Amazon decrypts the key.
• Now Pat and Amazon share a symmetric key and all
subsequent conversation can be encrypted using this key.
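The key exchange in Part 3 can be sketched as follows. Everything here is a stand-in: the key pair is a tiny textbook RSA example, and xor_stream is a made-up toy cipher used only to show that both sides end up with the same symmetric key. Real browsers use TLS with full-size keys and standard ciphers such as AES.

```python
import hashlib
import secrets

# Amazon's toy key pair; the public part (n, e) would come from its certificate.
n, e, d = 3233, 17, 2753

def xor_stream(data, key):
    """Toy symmetric cipher: XOR with a keystream derived from the key.
    Calling it again with the same key decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Browser: pick a random symmetric key and encrypt it with the public key.
session_key = secrets.randbelow(n - 2) + 2
wrapped = pow(session_key, e, n)

# Amazon: recover the symmetric key with the private key.
recovered = pow(wrapped, d, n)

# Both sides now share the key and can encrypt the rest of the conversation.
order = xor_stream(b"1 copy of the course text", session_key)
print(xor_stream(order, recovered))
```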
414
Secure Communication
Secure Browsing – Summary
[Figure: secure browsing among DigiCert Inc, Amazon, and Pat]
Image source: course text, Figure 8-7
415
The Problem
One Source of Problems - People
• People are careless and make mistakes
• People can be tricked (recall social engineering, slides 36-37)
into divulging confidential information
• E.g. IT professionals are discouraged from having LinkedIn
accounts. Why?
- If Chris’s LinkedIn profile says he works in the IT Dept of
XYZ Inc., then hackers will send e-mails to employees of
XYZ pretending to be Chris asking them to click a link,
download a file, or reveal some confidential information
416
The Problem
Source of Data Breaches (in 2011) – Part 1
Source                                   Share
Stolen laptop                            7%
Fraud or scam                            10%
Document found in trash or unattended    7%
Stolen computer                          6%
Snail mail exposed or intercepted        5%
Stolen document                          3%
Lost media found                         3%
Lost document found                      3%
Lost computer drive found                2%
Stolen computer drive                    2%
source: http://www.scientificamerican.com/article/data-breach-howthieves-steal-your-identity-and-information/
417
The Problem
Another Source of Problems: Bugs
• Any complex piece of hardware or software contains bugs
• A computer processor (billions of transistors) or an operating
system (100 million lines of code) are very complex
• For even a moderately complex enterprise system there are
many points of vulnerability ...
418
The Problem
Another Source of Problems: Bugs
Some possible points include ...
source: course text, Figure 8-1
419
The Problem
Source of Data Breaches (in 2011) – Part 2
Source                          Share
Email exposed or intercepted    4%
Virus                           2%
Hacked computer or server       16%
Scraped from the Web            12%
source: http://www.scientificamerican.com/article/data-breach-howthieves-steal-your-identity-and-information/
• note: web scraping is when a computer program rather than a
person surfs the web
• Sometimes companies are pressured to create backdoors
(secret ways) for governments to access private data
420
The Problem
Another Source of Problems
• In 2013 Edward Snowden revealed that the NSA could breach
many security protocols
• These included
- Encrypted chat
- Encrypted VoIP (Voice over IP)
- VPN (Virtual Private Network)
- SSH (Secure Shell)
- HTTPS (Hypertext Transfer Protocol using SSL, where SSL
means Secure Sockets Layer): developed by the predecessor
of Firefox to implement secure browsing
421
The Problem
How big is the problem?
• In June 2014 McAfee estimated that the global cost of
cybercrime was between $375 billion and $500 billion per year.
Activity       Cost as a % of GDP
Car Crashes    1.0%
Narcotics      0.9%
Cybercrime     0.8%
source: http://www.mcafee.com/ca/resources/reports/rp-economicimpact-cybercrime2.pdf
422
Classes of Threats
Common Malware
Malware: malicious software, i.e. software designed to cause
damage to or loss of control of a computer or a computer
network.
We will look at nine common types of malware.
• Computer virus: software that attaches to other programs or
data in order to be executed
- copies itself from file to file
- can harm data, programs, machines, or the network, or open a
backdoor for hackers
423
Classes of Threats
Common Malware
• Worm: similar to a virus but runs on its own (i.e. does not need
to attach to other programs)
- can cause the same damage as a virus
- uses a computer network to spread
- e.g. many computers come with a default password; a worm
might try to remotely log on to other computers using the
default names and passwords for a variety of operating
systems
• Trojan horse: a software program that appears to be benign,
but then does something unexpected behind the scenes
- the user has to launch it
- it cannot replicate on its own
424
Classes of Threats
Common Malware
• Trojan horse: (continued)
- can cause the same damage as a virus
- e.g. Android app that supplies weather reports could also
allow a hacker to download any files on that phone
• Phishing: an email or text message that
1. pretends to come from a trusted authority
2. asks for confidential information
e.g. please log into your account to verify some information
425
Classes of Threats
Common Malware
source: http://en.wikipedia.org/wiki/Phishing
426
Classes of Threats
Common Malware
• Denial of Service Attack
- many computers overwhelm a website requesting service in
an attempt to block others from using the website
- no data is lost, only potential business is lost
• Sniffing: eavesdropping on network communication in order
to obtain proprietary information, e.g. email, confidential reports,
company files, etc.
• Spam: junk email (usually sent in bulk), less of it now
- there are laws against spam
- Gmail, Hotmail, Outlook have excellent spam filters
427
Classes of Threats
Common Malware
• Botnet: a collection of computers (usually ones that have been
compromised) that are used together for a common purpose
(i.e. a robot network) such as a denial of service attack.
- The largest botnet that has been found and removed so far
controlled over 12 million computers
- it has been estimated that as much as 10% of computers around
the world may be part of one or another botnet
• Ransomware: software that threatens to publish the victim’s
files or prevents the victim from accessing their files unless a
ransom is paid (usually in Bitcoin so the victim cannot trace the
person they paid).
428
Computer Security
Definition
• Computer security is the policies, procedures and technical
measures used to prevent unauthorized access, alteration,
theft, interruption or physical damage to information systems
• There is more to computer security than just password
protection and encryption ...
429
Computer Security
What Services are Needed?
A customer wants to order an item online. What might be some
concerns? The customer ...
• is who he says he is (i.e. authentication)
• can only access certain parts of the system (i.e. access control)
• cannot view another customer’s order (i.e. data
confidentiality)
• cannot modify another customer’s data (i.e. data integrity)
430
Computer Security
What Services are Needed?
A customer wants to order an item online. What might be some
concerns? The customer ...
• can place an order if so desired (i.e. availability)
• keeps his word after placing the order (i.e. non-repudiation).
There are two types of repudiation
1) the sender denies sending the data
2) the receiver denies receiving the data
431
Computer Security
Six Security Service Definitions
• Authentication: assurance that the communicating entity is the
one claimed
• Access Control: prevention of the unauthorized use of a
resource
• Data Confidentiality: protection of data from unauthorized
disclosure
• Data Integrity: assurance that data received is as sent by an
authorized entity
• Non-Repudiation: protection against denial by one of the
parties in a communication
• Availability: assurance that services are available when needed
432
Computer Security
Which Service?
Captain Jack Sparrow redecorates the Black Pearl and wants to
open it up to the public with these prices
- for $10, a tourist can visit the 1st deck,
- for $20, a tourist can visit the whole ship.
Question: Which of the following security services can be
implemented to enforce these rules?
Authentication, Access Control, Data Confidentiality, Data
Integrity, Availability, Non-repudiation
Answer: access control
433
Computer Security
Which Service?
Captain Jack Sparrow decides to auction off the Black Pearl on
eBay but is not sure if the website that he logs into is in fact
eBay.
Question: Which of the following security services could be
implemented to ease his anxiety?
Authentication, Access Control, Data Confidentiality, Data
Integrity, Availability, Non-repudiation
Answer: Authentication
434
Computer Security
Which Service?
Captain Jack Sparrow receives an email from the Smurfs offering
$10 million to buy the Black Pearl. Jack thinks this is a sweet deal
but he is afraid that the Smurfs might back down later on.
Question: What security service can be used to prevent the
Smurfs from denying they sent the e-mail?
Authentication, Access Control, Data Confidentiality, Data
Integrity, Availability, Non-repudiation
Answer: Non-repudiation
435
Computer Security
Which Service?
Captain Jack Sparrow wants to announce that he has sold his
ship and is officially retiring from piracy.
Question: What security service (or services) can be used to
assure the public that the message is genuine?
Authentication, Access Control, Data Confidentiality, Data
Integrity, Availability, Non-repudiation
Answer: Authentication and Data Integrity
436
Tools for Protecting IS
Access Control
• Passwords
- security professionals prefer long, mixed case, alphanumeric
combinations that are not words in any language
- people prefer short, lowercase, meaningful words
• Two factor authentication: token / smart card / phone app
- a second physical device which is often used in conjunction
with a password
• Biometrics
- fingerprint, retinal image, face
437
Tools for Protecting IS
25 Most Commonly Hacked Passwords
password
123456
qwerty
abc123
1234567
letmein
dragon
baseball
iloveyou
master
ashley
bailey
shadow
123123
superman
qazwsx
football
12345678
monkey
trustno1
111111
sunshine
passw0rd
654321
michael
source: http://www.theglobeandmail.com/news/technology/tech-news/top25-most-hacked-passwords-revealed/article2244739/
438
Tools for Protecting IS
How do Password Crackers Work
• Try common passwords, e.g. those on the previous slide.
• Try common passwords with a suffix of 2 or 3 characters.
• Try dictionary words, with variations in capitalization or
spelling (like ‘$’ for ‘s’, ‘1’ for ‘l’, ‘@’ for ‘a’).
• Try combinations of 2 or 3 dictionary words.
• To target a specific person, gather info about them (from the
web) e.g. names (of partner, children, pets) favourites (sports,
food, musicians, actors) and use these instead of common
passwords in the strategy above.
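To see why a pet's name with a few character swaps is a weak password, here is a sketch of how quickly such variations can be listed. The substitution table, the suffixes, and the function names are assumptions made for illustration; real cracking tools use much larger rule sets.

```python
from itertools import product

# Each letter maps to the spellings to try for it (illustrative subset).
SUBSTITUTIONS = {"s": "s$", "l": "l1", "a": "a@"}

def variations(word):
    """Yield every spelling of `word` under the substitutions, in lower case
    and with the first character capitalized."""
    choices = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    for combo in product(*choices):
        candidate = "".join(combo)
        yield candidate
        yield candidate.capitalize()

def candidates(words, suffixes=("", "1", "12", "123")):
    """Combine each variation with a few short suffixes."""
    for word in words:
        for variant in variations(word):
            for suffix in suffixes:
                yield variant + suffix

guesses = set(candidates(["bailey", "password"]))
print(len(guesses), "guesses from just two starting words")
print("b@i1ey" in guesses)
```

A list this small is checked almost instantly, so the swaps add very little protection.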
439
Tools for Protecting IS
How to Foil Them
• Best Method: use a password manager. These programs pick a
different random sequence of characters for each web site.
• Alternative Method: convert a phrase meaningful only to you
into a password
- don’t pick a pet’s name, e.g. Bailey or b@i1ey
- do pick a phrase that describes the pet
▪ I was 14 when Bailey arrived.
▪ Convert it to a password, typically by taking the first
letter of each word and keeping capitalization, punctuation
and numbers.
▪ Iw14wBa.
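The conversion rule can be written down as a small sketch. The function name phrase_to_password and the exact set of punctuation handled are illustrative assumptions.

```python
def phrase_to_password(phrase):
    """First letter of each word, numbers kept whole, capitalization and
    trailing punctuation preserved."""
    parts = []
    for token in phrase.split():
        core = token.rstrip(".,!?;:")
        trailing = token[len(core):]
        if core.isdigit():
            parts.append(core)         # keep numbers in full
        elif core:
            parts.append(core[0])      # first letter, original capitalization
        parts.append(trailing)         # keep the punctuation
    return "".join(parts)

print(phrase_to_password("I was 14 when Bailey arrived."))
```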
440
Tools for Protecting IS
Firewalls
• Mentioned back on slide 239.
• Both Mac OS X and Windows have had software firewalls for the
last 10+ years.
• Many cable and DSL modems have hardware firewalls built in.
Intrusion Detection Systems
• Looks for unusual patterns, e.g.
- Chris normally works weekdays 8:30 am–4:30 pm and typically
only logs into his desktop computer, email and Learn.
- Why is he trying to remotely log into every other computer on
the network at 3 am on a Saturday?
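A very simplified sketch of this kind of check, using a fixed profile of Chris's normal behaviour. The system names and the is_unusual function are hypothetical; real intrusion detection systems learn such patterns from logs rather than hard-coding them.

```python
from datetime import datetime

# Hypothetical profile of the systems Chris normally logs into.
USUAL_SYSTEMS = {"desktop", "email", "learn"}

def is_unusual(timestamp, system):
    """Flag a login outside weekdays 8:30 am to 4:30 pm or to an unfamiliar system."""
    weekday = timestamp.weekday() < 5                  # Monday=0 ... Friday=4
    minutes = timestamp.hour * 60 + timestamp.minute
    work_hours = 8 * 60 + 30 <= minutes <= 16 * 60 + 30
    return not (weekday and work_hours and system in USUAL_SYSTEMS)

print(is_unusual(datetime(2019, 6, 4, 10, 0), "email"))      # Tuesday morning
print(is_unusual(datetime(2019, 6, 1, 3, 0), "server-17"))   # Saturday at 3 am
```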
441
Tools for Protecting IS
Antivirus software
• Avast and AVG (among others) are free.
• Windows Defender (part of Windows 10) and File Quarantine
(part of Mac OS) are supplied for free with their respective OS.
• These programs look for bit patterns in programs, called
signatures, to recognize known viruses, worms and Trojan horses.
• They can also look for “unusual behaviour” to detect new ones
- e.g. a program accessing the internet a lot.
• Downsides of antivirus software
- can slow the launch or running of programs a bit
- can slow the opening of a file or the mounting of a USB
thumb drive
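Signature matching, at its simplest, is a search for known byte patterns. In the sketch below the signature names and byte patterns are made up for illustration; real products ship large, frequently updated signature databases.

```python
# Hypothetical signatures: the names and byte patterns are invented for this example.
SIGNATURES = {
    "Example.Worm.A": bytes.fromhex("deadbeef"),
    "Example.Trojan.B": b"EVIL_PAYLOAD",
}


def scan(data, signatures=SIGNATURES):
    """Return the names of all known signatures found in the given bytes."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


clean = b"just an ordinary program"
infected = b"header" + bytes.fromhex("deadbeef") + b"rest of the program"

print(scan(clean))     # []
print(scan(infected))  # ['Example.Worm.A']
```

Every file that is opened has to be searched this way, which is one reason for the slowdowns listed above.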
442
Wireless Security
Setting Up Wi-Fi
• Setting up the most secure Wi-Fi connection possible can
involve knowing a lot of acronyms.
• All you have to do is understand these abbreviations: 802.11a,
802.11b, 802.11g, 802.11n, 802.11ac, WEP, WEP-40, WEP-104,
WPA, WPA-personal, WPA/PSK, WPA2, WPA2-personal,
WPA2/PSK, WPA2-Enterprise, WPA3, TKIP, AES, WPS, EAP, LEAP,
PEAP + approximately 100 other authentication protocols!
• As time goes on ... there will be more!
443
Wireless Security
Three Parameters: #1 Speed
• Wi-Fi comes in a variety of bandwidths.
• The newer versions have the fastest bandwidths.
slowest → fastest
802.11b            11 Mb/s
802.11a/802.11g    54 Mb/s
802.11n            300 Mb/s
802.11ac           1200 Mb/s
• the newest is 802.11ac
• generally there is some backward compatibility
- n generally works with g and b
- ac generally works with all the rest
444
Wireless Security
Three Parameters: #2 Security
• Wi-Fi comes with a variety of security protocols.
• The newer versions are the most secure.
least secure → most secure: WEP, WPA, WPA2/AES, WPA3
• WEP (Wired Equivalent Privacy) can be easily cracked
• WPA (Wi-Fi Protected Access)
- was a temporary replacement for WEP
• WPA2/AES is newer
• WPA3 is the newest (2018)
445
Wireless Security
Three Parameters: #3 Authentication
Authentication – Personal
• There are methods specifically for home or small companies.
least secure → most secure: WPS, Personal or PSK
• WPS is Wi-Fi Protected Setup
• PSK is Pre-Shared Key
Authentication – Enterprise
• you typically have no choice, just do what the company or
university tells you to do
• e.g. Eduroam at UW
446
Wireless Security
The Fundamental Problem
• ISPs want the default set-up to be something that practically
every device supports (i.e. the oldest protocols, WEP or WPA),
• but you want the most secure option that is available.
The Solution
• Check each device to see if it supports the most recent
protocol (currently WPA3).
- If some don’t, then make it as secure as you can.
- Typically Wi-Fi routers can be set to try WPA3 first (then
WPA2/AES and then fall back to WPA).
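The device check can be sketched as follows. The ranking follows the security slide; the device names and their capabilities are made up for the example.

```python
# Ordered from least to most secure, following the ranking on the security slide.
RANKING = ["WEP", "WPA", "WPA2/AES", "WPA3"]


def best_protocol(supported):
    """Return the most secure protocol in the list, or None if none is known."""
    for protocol in reversed(RANKING):
        if protocol in supported:
            return protocol
    return None


def audit(devices):
    """Map each device to the most secure protocol it supports."""
    return {name: best_protocol(supported) for name, supported in devices.items()}


# Hypothetical household: the device names and capabilities are made up.
home = {
    "laptop": ["WPA", "WPA2/AES", "WPA3"],
    "phone": ["WPA", "WPA2/AES", "WPA3"],
    "old printer": ["WEP", "WPA", "WPA2/AES"],
}

print(audit(home))
```

Any device that does not reach WPA3 in such an audit is the one that forces the router to keep a fallback enabled.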
447
System Security
Software Vulnerability
• Recall from slides 418-419 that software usually contains bugs.
• Bugs can create security vulnerabilities, opening up the system
to intruders.
• Eliminating all bugs is not technically or economically possible
with large programs.
• Vendors release small pieces of software (called patches or
updates) to repair significant flaws.
• Many programs now, by default, automatically download and
install updates. If not, set it up so these programs get updated.
• Caution: the discovery of bugs outpaces the ability of even big
companies to fix them all.
448
Securing Your System
Barest Minimum
• Use strong passwords (slides 437 – 440).
• Use antivirus / malware protection (slide 442).
• Activate automatic updates for OS, browser, and anything else
that uses the internet (slide 448).
Best Practices
• Isolate and encrypt sensitive data.
• Minimize your attack surface: the different places in your
system where a hacker can try to add or extract data.
449
Securing Your System
Isolate and Encrypt Sensitive Data
• The NSA cannot (currently) crack AES-256 encrypted documents.
• macOS, Linux and Windows 10 Professional all have the ability
to encrypt hard drives.
- but Windows Home edition does not have this feature.
• Use AES-256 based encryption software, e.g. VeraCrypt for
Windows / OS X / Linux to encrypt your drives.
• Use AES-256 based flash drives, e.g. Kingston DataTraveler Vault
Privacy.
• Have a separate user account on your computer for your
banking and financial activities and files.
450
Securing Your System
Minimize Your Attack Surface
• Use WPA3 (or WPA2 + AES) for Wi-Fi (slide 447)
• Configure the firewall in your OS and your modem/router (slide
441)
- Google the terms configure or setup + firewall + your OS or
your modem/router manufacturer and model
▪ e.g. 1: configure firewall macOS
▪ e.g. 2: setup firewall Windows 10
▪ e.g. 3: configure firewall 2wire 2701
451
Securing Your System
Minimize Your Attack Surface
• When not in use, disconnect from the internet, i.e. turn off
Wi-Fi (on computer) or turn off modem
- e.g. when you are sleeping, at school, at work
• More advanced: remove unnecessary browser plug-ins, remove
unnecessary software, don’t run unnecessary services, modify
unnecessary default features.
source:
https://www.us-cert.gov/sites/default/files/publications/TenWaystoImproveNewComputerSecurity.pdf
452
Security and Control Framework
Business Value of Security and Control
• Inadequate security and control can result in loss of business
and may create serious legal liabilities.
• Businesses must protect the information assets of
- their own company, their own employees
- their customers, and their business partners.
• Failure to do so can lead to costly litigation for data exposure
or theft.
• A sound security and control framework that protects
business information assets can thus produce a high return
on investment.
453
Security and Control Framework
Legal and Regulatory Requirements
• Canada: recall slides 372-375, Personal Information Protection
and Electronic Documents Act (PIPEDA)
- It establishes principles for the collection, use, disclosure and
safeguarding of personal information.
• Canada: companies must be able to respond to legal requests
for electronic documents relevant to a civil case (a discovery
request).
• Ontario: Canadian version of the Sarbanes-Oxley Act (C-SOX)
- Internal controls must be put in place to govern the accuracy
of information in financial statements (similar to what the
Sarbanes-Oxley Act requires in the US).
- Other provinces have done the same.
454
Security and Control Framework
Tool #1: Risk Assessment
• To do a risk assessment is to determine the level of risk to the
firm for various classes of risks e.g.
- Type of risk: power failure
- Probability of occurrence in a year: 30%
- Loss Range (low, average, high) = ($5k, $100k, $200k)
- Expected Annual Loss = 0.3 × $100,000 = $30,000.
- Conclusion: spending $20,000 on a backup system is a
reasonable expense.
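The arithmetic can be sketched in Python. The function names are assumptions for this example, and comparing a yearly cost directly against the expected annual loss is a simplification of a real risk assessment.

```python
def expected_annual_loss(probability, average_loss):
    """Probability of the event in a year times the average loss if it occurs."""
    return probability * average_loss


def worth_spending(probability, average_loss, annual_cost):
    """A safeguard is reasonable if it costs less than the loss it guards against."""
    return annual_cost < expected_annual_loss(probability, average_loss)


# The power-failure example: 30% chance per year, $100,000 average loss.
loss = expected_annual_loss(0.30, 100_000)
print(f"Expected annual loss: ${loss:,.0f}")
print(worth_spending(0.30, 100_000, 20_000))  # True
```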
455
Security and Control Framework
Tool #2: Security Policy
• A security policy identifies
- main risks (say power failures),
- goals (a maximum downtime of 3 minutes per year),
- mechanisms to achieve these goals (uninterruptible power
supplies + diesel generator backup).
Tool #3: Acceptable Use Policy
• Acceptable Use Policy (AUP) states the acceptable uses and
users of information and computers,
- e.g. privacy, user responsibility, personal use of devices,
access rules for different employees
- technical measures used to enforce the policies
456
Security and Control Framework
Tool #3 continued: Sample Access Rule for an HR Clerk
• This document identifies the information employees have
access to and the type of access (read-only vs. update) based
on their role in the organization.
source: course text, Figure 8-3
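Access rules of this kind can be represented as a simple lookup from role and field to a type of access. The sketch below is hypothetical: the roles, field names and permissions are illustrative only and are not taken from the figure in the course text.

```python
# Hypothetical access rules; the roles, fields and permissions are illustrative only.
ACCESS_RULES = {
    "hr_clerk": {"name": "update", "address": "update", "salary": "none"},
    "hr_manager": {"name": "update", "address": "update", "salary": "read"},
}


def can(role, action, field, rules=ACCESS_RULES):
    """Check whether a role may 'read' or 'update' a field; unknown means no."""
    granted = rules.get(role, {}).get(field, "none")
    if action == "read":
        return granted in ("read", "update")   # update access implies read access
    if action == "update":
        return granted == "update"
    return False


print(can("hr_clerk", "update", "address"))  # True
print(can("hr_clerk", "read", "salary"))     # False
```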
457
Security and Control Framework
Tool #4: Disaster Recovery Planning
• Getting IT systems up and running after a disruption
- e.g. back-up files and maintain back-up systems.
Tool #5: Business Continuity Planning
• Getting the business up and running after a disaster
- safeguarding people as well as machines.
• Identify and document critical business processes
- not relying on people who may be unavailable.
• Create action plans for these processes.
• Line up offsite resources, e.g. the cloud.
458
Security and Control Framework
Tool #6: Security Auditing
• A security audit investigates whether the current security and
control framework is adequate.
• Create a comprehensive assessment of a company’s computer
security policies, procedures, technical measures, personnel,
training, documentation
- may even simulate an attack.
• The risk assessment is done before security implementation
while auditing is done after its implementation and repeated
from time to time.
459
Security and Control Framework
Tool #6 continued: A Sample Audit
source: course text, Figure 8-4
460
Security and Control Framework
Bottom Line
• Many companies assume that a disaster is too improbable and so
security and control are not worth the investment in time and
money.
• Lack of knowledge or lack of motivation are the greatest
causes of computer security breaches.
461
Topic 9 – Managing Knowledge
Key Concepts
• why is knowledge management needed
• knowledge and wisdom
• explicit and tacit knowledge
• implementing a KM system
References
• course text, Chapter 11.1 Managing Knowledge
462
Why Knowledge Management
The Knowledge Economy
For several decades the world's best-known forecasters of
societal change have predicted the emergence of a new
economy in which brainpower, not machine power, is the
critical resource. But the future has already turned into the
present, and the era of knowledge has arrived.
The Learning Organization by Economist Intelligence Unit and IBM (1996)
• Note the year of the quote: 1996
• Most of you have lived your entire lives in the era of the
knowledge economy.
463
Why Knowledge Management
The Increasing Demand for Knowledge Workers
[Figure: percentage (0% to 100%) by decade, 1900 to 2000, for each occupation category: farmworkers; labourers & operators; crafts; service; clerical; sales; managerial & admin.; prof. & tech.]
Source: T.A. Stewart, Intellectual capital. (1997)
464
Why Knowledge Management
Cost of Mismanagement
Each year, poor documentation and communications cost the
Canadian economy more than $50 billion.
Peter Richardson, Coping with the Crisis in the office: Canada’s $50 Billion
Challenge
Loss of System Knowledge
• What is the impact on an organization when people leave?
• Do they leave with years of knowledge?
465
Why Knowledge Management
The Exponential Growth of Digital Media
• Recall (from slide 86) that one of our drivers of technology was
the fact that the amount of data stored is roughly doubling
every year.
• Most of it is stored digitally.
• Printed documents only account for 0.003% of information
growth.
466
Why Knowledge Management
The Challenge
We need a better way to ...
• manage the data and information randomly floating inside an
organization
• extract, store, and share the knowledge stored inside the minds
of the employees
• harness the external data and information freely floating
around an organization
467
Knowledge Management (KM)
Key Concepts
Recall (from slide 280) we distinguished
• Data: raw facts (course text, pg. 13) e.g. a list of items scanned
at a supermarket checkout
• Information: data shaped into a form that is meaningful ... to
human beings (course text, pg. 13) e.g. which items are selling
well and which aren’t
• now we will add ...
468
Knowledge Management (KM)
Key Concepts
• Knowledge: to discover patterns, rules and contexts where the
information is useful (course text, pg. 342)
- e.g. customers are more likely to buy an item that is at eye
level on a grocery store shelf
• Wisdom: when, where and how to apply knowledge to get a
solution to a problem (course text, pg. 343)
- e.g. how to maximize the amount of money you make per
square foot in a grocery store
469
Knowledge Management (KM)
Two Types of Knowledge
20% is Explicit Knowledge
• Knowledge that has been documented somewhere
• reports, policies, manuals, emails
• formal or codified
• databases
• books, magazines, journals
80% is Tacit Knowledge
• What employees know that has not been documented
• knowledge held in the minds of the employees
• informal and uncodified
• values, perspectives and culture
• memories of staff, suppliers and vendors
470
Knowledge Management (KM)
What is KM?
• Knowledge management is the task of acquiring, storing,
disseminating, and applying an organization's explicit and tacit
knowledge to meet mission objectives.
• The objective of KM is to
- connect those who know to those who need to know
- leverage knowledge transfer from one to many
- know‐how, know‐why and know‐who
471
Knowledge Management (KM)
What is KM’s Role?
• KM is one of the fastest growing areas of software investment
in companies.
• Knowledge is a source of wealth for an organization, just like
labor, land, or financial capital.
• The key challenge of the knowledge‐based economy is to
foster innovation.
• A substantial part of a company’s stock value is related to its
intangible assets.
• These intangible, intellectual assets must be properly
managed.
472
Knowledge Management (KM)
Which Areas of KM? (US Data)
[Figure: percentages (axis from 0% to 90%) for four areas of KM: Capture & share best practices; Corporate learning strategies; Customer Relationship Mgmt; Competitive intelligence]
Source: J. Milan, KM: A revolution waiting for IR (2001)
Paper presented at the 41st Annual AIR Forum.
473
Implementing a KM System
Stage #1: Create a Knowledge Network
• Develop a sharing environment
- Mentor program, virtual team, expert panel, seminars and
conferences, communities of practice
• Use collaboration tools to encourage information sharing
- Shared drives, Wikis and Blogs, Groupware like SharePoint,
Creation of FAQs
• Ideally, everything you do, say and know is properly
documented and stored in digital form
• Challenges ...
474
Implementing a KM System
Typical Concern: Resistance to Sharing
• The more I share, the less valuable I am to the company and
others
• If you are a ...
- student, would you share your studying strategies?
- professor, would you share your lectures?
- machine operator, would you share your knowledge of
operations?
- stock broker, would you share trade information?
• But if you have a reputation in the company for being
knowledgeable and helpful, your job is safer.
475
Implementing a KM System
Stage #2: Implement a Search Engine
• Provide relevant information for decision making using a
text-based search engine.
- Internal sources: everything stored in digital form: e-mails,
internal online forum, meeting minutes, reports, memos,
database systems
- External sources: everything publicly available on the
Internet
- Search engine: a program that decides what information is
relevant, e.g. https://cloud.google.com/products/search/
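A very small text-based search engine can be sketched as follows. The documents are made up, and scoring by counting query-word occurrences is a deliberately naive assumption; production search engines use much richer relevance models.

```python
import re
from collections import Counter

# Hypothetical internal documents (id -> text).
DOCUMENTS = {
    "memo-1": "Backup generator test scheduled for Friday",
    "minutes-7": "Meeting minutes: firewall upgrade and backup policy review",
    "email-42": "Lunch order for the team meeting",
}


def tokenize(text):
    """Lower-case the text and split it into words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, documents=DOCUMENTS):
    """Rank documents by how many times the query words occur in them."""
    terms = tokenize(query)
    scores = {}
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties are broken alphabetically by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


print(search("backup policy"))  # ['minutes-7', 'memo-1']
```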
476
Implementing a KM System
Typical Concern: Relevancy
• Relevancy, from a human standpoint, is:
- user-dependent
▪ depends upon a specific user’s judgment;
▪ situational, relates to user’s current needs
- time dependent
▪ changes over time
- geographically dependent
▪ an approach that works in one part of the country may not
work in another part
▪ municipal and provincial laws may be different
477
Implementing a KM System
Stage #3: Build an Intelligent System
• The ultimate goal of knowledge management
• Build on the search engine with the addition of an inference
engine or machine learning
- system is capable of making suggestions or computing
solutions, e.g. automated medical diagnosis
- might use a neural net to detect suspicious (possibly
fraudulent) credit card transactions or suspicious tax returns
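An inference engine can be illustrated with a minimal forward-chaining loop: rules are applied to known facts until nothing new can be concluded. The rules below are made up for illustration and are not real fraud-detection logic.

```python
# Hypothetical rules for illustration only: each rule is
# (set of facts required, fact concluded).
RULES = [
    ({"card used abroad", "no travel notice"}, "unusual location"),
    ({"amount far above average"}, "unusual amount"),
    ({"unusual location", "unusual amount"}, "flag transaction for review"),
]


def infer(facts, rules=RULES):
    """Forward chaining: keep applying rules until no new fact can be added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


observed = {"card used abroad", "no travel notice", "amount far above average"}
print("flag transaction for review" in infer(observed))  # True
```

A machine learning approach such as a neural net would instead learn what counts as suspicious from past transactions rather than from hand-written rules.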
That is the end of the official course material!
478
Final Exam and Final Thoughts
Preparing for the Final
• Stay tuned to Piazza for an official post with details about
- The final exam details: format, excluded material, weighting
of material from 1st and 2nd half etc.
- The post will be up by Aug 4th
- I will create a single file that contains all the slides.
- I will have extra office hours for the 3 business days before
the exam (9th, 12th, 13th).
• I hope this course has helped you to become more informed
users of computer technology and better able to use it in a
business environment.
• Good luck on the final!
479