The problem solving life cycle - Computer and Information Science

advertisement
Course and lab website: acc6.its.brooklyn.cuny.edu/~cc312
Lecture website:
www.sci.brooklyn.cuny.edu/~yarmish/cc312
Week 1
What is a computer? A calculator? A word-processing machine?
Answer: A Computer has both hardware and software. It can be programmed.
Analogies: Dominoes, Piano, movie projector
Hardware: Physical machinery that runs the software. This includes the motherboard,
hard disk, RAM, CPU…
Software: programs written for the computer to follow.
Operating System: A program that starts when the computer begins. It controls
which software will run at any given time. Windows, UNIX and Apple are
examples.
Application software: word, excel, Internet explorer, registration system…
Algorithm- A set of steps to solve a problem (similar to a recipe). Example: The
registration system in Brooklyn College.
Use two tables: A student table that has students’ addresses, courses taken or registered
for and grades. A course table that has the course number, hours, number enrolled and
enrollment limit.
Steps:
a. Enter all courses that will be offered this semester
b. If a new student comes to register enter name into student database
c. For each course a student wants to take
i.If the course is not full, in the course table increment the number
enrolled and in the student table enter the course number for that
student.
ii.If the course is full inform the student that they would need an
overtally.
Computer languages are used to put the ideas written in an algorithm into the computer.
Examples of computer languages include C, Java and SQL. The registration algorithm
can be done in any one of them.Week 2: Networks: LANs, WANs,
TCP/IP
Network – As computers became more common and affordable people began to look at
ways of having them communicate with each other to share resources. (Notice the
Ethernet wires connecting the computers in the lab – including the printer.) A first step
was to standardize rules of communication between computers. These rules are known as
Communication protocols. Example of protocols: Netbeui is a ‘non-routed’ protocol for
Windows networks and is similar to a single phone line. TCP/IP is a routed protocol
written for UNIX networks and is now required for computers connected to the internet.
It is similar to an operator based phone system.
A Local Area Network (LAN) is a group of computers connected and communicating
with one protocol.
As networking developed, independent and separate protocols were developed - so that
you had many different LANs with different protocols. In order to send email from one
LAN to the next, Gateways were developed to help ‘translate’ between protocols and
allow a limited amount of communication between them. These groups of LANs
connected by a gateway are called a Wide Area Network (WAN).
Diagram of a Local Area Network (LAN)
Diagram of a Wide Area Network (WAN):
ARPA-NET – Started and funded by the department of defense. The goal was to develop
a robust distributed (decentralized) network with many servers as opposed to having a
centralized place. This eventually developed into the internet.
Client/Server model: It is the software (program) that is the client or the server-not the
hardware. Examples are a web server, browser, Kazaa, outlook, white pages online.
TCP/IP protocol:
IP address- 4 groups of numbers separated by periods. Each group’s number can be at
most 255. (It is structured similar to a mailing address where it can be read from the more
general location such as what state to specific-which person.) Brooklyn College’s IP
address begins with 146.245.X.X. The sci network begins with 146.245.2.X.
DNS name is an easy to remember name used instead of the IP address.
The Domain Name Server (DNS) provides the correct IP address for each domain name.
This name also can be read from general to specific.
Uniform Resource Locator (URL) – Specifies the exact location of a resource. It has
three parts: the protocol to retrieve information, the machine name (DNS name or IP
address) and the resource (file, printer or other).
Email address and mailing list.
Week 3: bits, bytes, storage and packets
Text file or ASCII file vs. binary files (picture music…). Each format must be opened
by a program that understands that format. Binary files are usually storage intensive.
ASCII chart representation of characters.
RAM memory is electric whereas a disk is made of magnetic material.
Total disk storage=number of tracks *sectors per track*bytes per sector
Random-access disk storage vs. sequential-access tape storage.
A bit (binary digit): an electrical charge-light is either on or off. When writing this on
paper 1 mean 'on' and 0 means 'off';
A byte is a group 8 bits. A byte can represent one character. A KB-kilobyte=210 (~1,000
bytes); MB-megabyte=220 (~ a million); GB-gigabyte=230 (~ a billion).
File compression is used to save space. (Example MP3)
Internet connections are measured in bits per second (bps). There are many different
speeds. Dial-in speeds (28.8K, 56K) and DSL/cable are two examples.
Data sent through the Internet is broken into chunks called packets.
TCP organizes bytes into packets called 'IP datagrams'.
IP datagrams have headers similar to the address on an envelope. It contains the source IP
address, the destination IP address and the packet number. Routers examine the
destination address and route the packet appropriately. Packet switching allows fair
access to all packets.
Different packets may take different routes and may arrive in the wrong order and might
even be lost along the way. TCP on receiving end unpacks packets and reconstructs the
data – puts packets in order, discards multiples.
TCP 'handshake': a. TCP sender sends a packet b. TCP receiver sends back
acknowledgment that packet arrived. c. TCP on sending end waits for acknowledgement.
If not received the sender resends the packet but this time with a longer waiting period.
This scheme may result in multiple packets arriving which would be discarded.
TCP adjusts dynamically to network traffic.
Week 4: ASCII text-based processing vs binary
Representations: Data and applications: WORD-.doc; notepad-.txt; Dreamweaver (or
notepad) - .html; paint-.bmp...
1. Text files use ASCII and can be read by text word processors such as Notepad
and Word. Pictures use pixels (bitmap) and can be read by a graphics program. The
ASCII table is used by all text based programs,
2. The term binary is used to refer to all representations aside for ASCII. Graphic
files are said to be stored as binary files. Programs stored in executable files are also
said to be stored in a binary file.
3. In order to 'read' a file an application must be programmed (to be discussed
further in lecture 6) to read it. Different items must therefore be stored in different
ways. For example: Characters values are stored as ASCII, which is a form of
Unicode. In order to allow arithmetic operations, numbers cannot be stored as ASCII.
They are stored using excess notation or two’s complement for integers and floatingpoint notation for decimal-numbers. Floating-point numbers are susceptible to roundoff errors. An example that highlights the difference would be to show the difference
between storing the number 1057 as ASCII and storing it in two’s complement
notation.
Week 5: Algorithmic Thinking
Computers are not people. When dealing with a person we rely on that person's
experience. With a computer care must be taken not to take anything for granted.
We must be on guard about assumptions, inconsistencies and omissions when telling the
computer what to do. For example, if someone has bought bread at this store before, the
following instruction would be enough:
1. Buy a loaf of bread from the store to serve with a meal.
If they have not bought from this store in the past, the person would become frustrated by
the following omissions: What are the directions? How much does it cost? Etc...
Revised instructions:
1. Tell the person how much bread costs
2. Check that they have enough money
3. If not get enough money for them to pay for it
4. Give them store address
5. Send them to purchase the bread
If the instruction giver wrongly assumed that the buyer is familiar with the area the
person might become lost getting to the address.
In that case Step 4 above would have to be expanded as follows:
a. Make a right turn at the first corner
b. Make a left after 2 lights….
If the person has not dealt with money before (a child) these instructions would not be
enough. Imagine the person is from Canada and only dealt with Canadian currency we
would have a similar issue. When something goes wrong because the instructions were
not accurate enough it is referred to as a bug.
In the last case the bug was due to the following Error: Used Canadian currency which
was not accepted and the meal did not have bread on time.
A computer is like the person who knows nothing until being told.
Programmers, in general, go through a Software Life-Cycle:
1. Come up with an algorithm- We talked about this above. An algorithm predates
computers. An algorithm is similar to a recipe.
2. Write a program- Put the algorithm into a high-level programming language
such as C, Pascal, COBOL, Basic Java...
3. Compile the program- Translate it into machine code.
4. Run or execute the program.
5. Debug – fix errors
6. Maintenance – cycle through the step again.
A simple example would be developing an algorithm to calculate the factorial of a
number and then writing the high-level C code.
Another larger example would be developing the student registration system.
(Show C code)
Two examples of bugs:
1. A similar assumption to the currency assumption (above) caused the Mars
Climate Orbiter to be lost in space in 1999. In the calculation of its route some
programmers used the metric system while others used English units.
2. Another example is a patriot missile failing to intercept a scud fired at US troops
in 1991. The following is from an article: “Specifically, the time in tenths of second
as measured by the system's internal clock was multiplied by 1/10 to produce the time
in seconds.”
Due to the above it is important to carefully specify the problem and keep track of
changes to minimize errors. CASE tools are software packages that help keep track of
specifications and software for a specific problem.
(Please note that I expect to demonstrate some HTML coding in the lab for Lab 5 and
6.)Week
6: Review
http://acc6.its.brooklyn.cuny.edu/~core51/review/rev1.html
Week 7: First ExamWeek 8: Overview of CPU
The CPU consists of the following components:
o
o
o
o
o
Arithmetic/Logic Unit circuitry for arithmetic and logic operations,
Control Unit
 Program Counter – Is a register of the CU that contains the address
in main memory of the next instruction
 Instruction Register – Is a register of the CU that holds the
instruction that is currently executing
Registers - holds data for future use by arithmetic/logic
Bus - circuits connecting CPU and main memory
Main Memory (RAM)
Both the program instructions and the data are stored in main memory (RAM). One at a
time instructions are brought into the CPU, they are worked on and the results are put
back into memory. This is called the stored program concept (Von Neumann).
We are assuming that the program has already gone through a compiler and is now in
‘machine code’ (See Lecture 5).
The Control Unit executes a users stored program by cycling through the machine cycle,
which is:
A. Fetch (request) the instruction from main memory whose address is in the
program counter. The instruction is stored in the instruction register and the program
counter is incremented to the next instruction address.
B. The control unit decodes the instruction to understand which circuits and registers
are needed
C. The control unit executes the instruction by activating the circuits needs (refer to
addition example above.)
The above three steps are repeated until the end of the stored program is reached.
Sometimes, the program counter contains a jump instruction to another location in the
main memory. This allows the CPU to carry out a loop construct where a set of
instructions is repeated over and over.
A computer can execute millions of instructions per second.
(Make a small diagram)
(Please note that I expect to demonstrate some JavaScript coding in the lab for Labs 8
and 9. Also note that you only have to hand in the handworks (not labs) for weeks 8-11.)
Week 9: Overview of JavaScript
JavaScript was developed at Netscape as a web programming language. Worked jointly
with Sun Microsystems (who developed Java). The language is embedded in all Netscape
browsers since 2.0 and all Internet Explorer browsers since 3.0.
JavaScript is interactive and Client-Based – it works on the users machine, not the
remote server machine
It does not write files on either the user's machine or the server machine and it also does
not carry out graphics.
JavaScript code must go on between the script tag pair as in the following example:
<script language = "JavaScript">
JavaScript goes here
</SCRIPT>
Old browsers that don't support JavaScript will print the JavaScript code on the screen
since it doesn't realize that it is JavaScript. To avoid that you can surround the JavaScript
code by the comment tags <!-- and --> as in the following example:
<script language = "JavaScript">
<!--
Conceal from non-Javascript Browsers
JavaScript goes here
-->
</script>
(This is not so necessary anymore since most people have current enough browsers.)
An object is an item such as a pop-up dialog box and a button. A method carries out
some action or changes some property of an object. A function is a group of computer
code that does a task. A variable remembers what a user typed in. The following are a
few JavaScript functions/methods:
alert(); var major = prompt("What is your major?", "CIS");
document.bgColor = color; var reply confirm("Do you like JavaScript?"); window.
document.write("Your answer was" + reply );
An Event is a user action. An Event Handler is JavaScript code, which responds to the
user action.
Give an example that uses a name and a value. Show how to change the source of an
image and how to change the value of an input button.
Example:
<a href="#" onmouseOver = "document.bgColor='red'; return true"
Watch me!
</a>
<a href="#"
>
onmouseOver="document.bgColor='red'; return true"
onMouseOut ="document.bgColor='white'; return true"
>
Watch me!
</a>Week
10: Computability (Solvability) - Can all
functions be computed?
This question applies to an algorithm. Can we construct an algorithm A that would take
any program P (and P's input) as input. A should determine if P halts:
If P stops:
print “P halts.”
If P does not stop (infinite loop)
print “P does not halt.”
For example look at the following code. Does this loop end?
Repeat as long as X is not equal to 0
Add one to X
If the initial value of X is 0. Since X is already 0, the whole loop is
skipped (no statements are repeated).
● If the initial value of X is a negative integer. The loop will end when X
becomes 0.
● If the initial value of X is greater than 0 the loop will never end since there
are an infinite number of integers greater than 0.
●
This question concerned mathematicians even before digital computers were in
developed. In particular, during the 1930's much work was devoted to this.
The non computable example above is known as the Halting Problem and it cannot be
solved. The Church-Turing thesis stated conditions which a function must satisfy in order
to be computable. It is named after Alonzo Church and Alan Turing.
To understand this intuitively, think about a clock. Can you let me know as soon as it
does not stop?
A corollary to the non computability of the halting problem is that it is
not possible to construct an algorithm that would on its own construct
any other algorithm to solve a random problem.Week
11:
Feasibility
Example: Doubling payments every day for 30 days.
The idea is to look at the rate of growth of time or steps to solve the problem in relation
to the input. In other words how quickly does the workload increase for a one unit
increase of the input?
Exponential and factorial growths are infeasible.
Problem: Sort a set of n numbers.
Algorithm 1 - Exhaustive listing and search



List all permutations (orderings) of the data
Pick the one that is sorted.
How much work does it take? The number of permutations is n*(n-1)*(n2)*...*3*2*1 = n!
Algorithm 2 - The selection sort







Find the largest of the n data points
Find the largest of the remaining n-1 data points
Find the largest of the remaining n-2 data points
.
.
.
The smallest data point is remaining
How much work does this take? n + (n-1) + (n-2) + ... + 3 + 2 + 1 = n*(n+1)/2
If n = 5


The selection sort takes 15 units of work
The exhaustive listing and search takes 5! = 120 units of work
Algorithm 1 is infeasible whereas algorithm 2 is feasible.
Two examples of infeasible problems:

Traveling salesman problem

Chess
Week 12: Cryptography
Cryptography is the study of methods to encrypt or scramble data. Cryptanalysis is the
study of how to decode or break the crypt.
Conventional or single key encryption, also knows as Synchronous, is where the same
key is used to encrypt and decrypt the data.
Examples of single key encryption is the substitution cipher where each letter of the
alphabet is substituted with a different letter or symbol. Caesar's method is a substitution
cipher: replace every letter in the alphabet with the letter N spots away, For example
where N=3:
A
B
C
.
X
Y
Z
.
-
>
>
>
.
>
>
>
D
E
F
A
B
C
1. The sender uses key K (in this case K is the table above) to encrypt the message.
2. The sender sends the encrypted data.
3. The receiver decodes the message using K.
Public-key encryption, also known as Asynchronous, uses two keys. The receiver
publishes a public key which is used to encrypt the message. The receiver has a second,
private key (known only to the receiver) which is used to decrypt the transmitted text.
An advantage of public-key cryptography is that each person only needs one key-pair and
it is easy to distribute the keys for others to use.
Well-known public-key systems:




Elgamal - invented by Taher Elgamal
RSA - invented by Ron Rivest, Adi Shamir and Leonard Adleman
DSA - Digital Signature Algorithm by David Kravitz
Pretty Good Privacy - PGP - uses both conventional and public-key cryptography
PGP combines the RSA asynchronous encryption scheme in conjunction with a singlekey synchronous scheme:
At the sending end:
1. Compresses the message
2. Creates a session key that is used only once during this session. Created from
randomly selected mouse movements and keystrokes.
3. Session key is used to conventionally (single key Synchronous encryption)
encrypt the message.
4. The receiver's public key is used to encrypt the session key.
5. The encrypted message and encrypted session key are sent to the receiver.
At the receiving end:
1.
2.
3.
4.
Receiver uses private key to decrypt session key
The session key is used to decode the message text.
The text is decompressed.
Session key is discarded.
Advantages:



Only a very small content (the session key) is publicly encrypted.
The session key is used just once - hard to decode by repeated attacks
Conventional encryption is ~10,000 times faster than public-key encryption.
The RSA public-key encryption scheme works as follows:
M = the message
C = the encrypted message
e = the public exponent
d = the private exponent
n = a very large integer
The message is encrypted by:
C = Me mod n
e
(mod means divide M by n and keep the remainder)
The message is decrypted by:
M = Cd mod n (where n = p * q;
d = e-1 mod((p-1)*(q-1))
p and q are prime numbers)
If n should be a large number (128 bits or 256 bits). It is computationally infeasible to
find p and q because factoring large numbers is an infeasible problem since (at least
publicly) no one has found an efficient algorithm to solve it. The best algorithm publicly
known to factor a large number n, that is a factor of two unknown prime factors, is to:
1. Determine all (or most) prime numbers up to the square root of n. (There might
exist lists of primes to work with.)
2. Multiply each of these primes into n. The first one that divides evenly gives you p
and q (p is the divisor you tried and q is the quotient of the division.) This is
infeasible because simply adding one bit to the n double the number n. This is
exponential growth. This size of n.5 also grows exponentially.
Download