Course and lab website: acc6.its.brooklyn.cuny.edu/~cc312 Lecture website: www.sci.brooklyn.cuny.edu/~yarmish/cc312 Week 1 What is a computer? A calculator? A word-processing machine? Answer: A Computer has both hardware and software. It can be programmed. Analogies: Dominoes, Piano, movie projector Hardware: Physical machinery that runs the software. This includes the motherboard, hard disk, RAM, CPU… Software: programs written for the computer to follow. Operating System: A program that starts when the computer begins. It controls which software will run at any given time. Windows, UNIX and Apple are examples. Application software: word, excel, Internet explorer, registration system… Algorithm- A set of steps to solve a problem (similar to a recipe). Example: The registration system in Brooklyn College. Use two tables: A student table that has students’ addresses, courses taken or registered for and grades. A course table that has the course number, hours, number enrolled and enrollment limit. Steps: a. Enter all courses that will be offered this semester b. If a new student comes to register enter name into student database c. For each course a student wants to take i.If the course is not full, in the course table increment the number enrolled and in the student table enter the course number for that student. ii.If the course is full inform the student that they would need an overtally. Computer languages are used to put the ideas written in an algorithm into the computer. Examples of computer languages include C, Java and SQL. The registration algorithm can be done in any one of them.Week 2: Networks: LANs, WANs, TCP/IP Network – As computers became more common and affordable people began to look at ways of having them communicate with each other to share resources. (Notice the Ethernet wires connecting the computers in the lab – including the printer.) A first step was to standardize rules of communication between computers. These rules are known as Communication protocols. Example of protocols: Netbeui is a ‘non-routed’ protocol for Windows networks and is similar to a single phone line. TCP/IP is a routed protocol written for UNIX networks and is now required for computers connected to the internet. It is similar to an operator based phone system. A Local Area Network (LAN) is a group of computers connected and communicating with one protocol. As networking developed, independent and separate protocols were developed - so that you had many different LANs with different protocols. In order to send email from one LAN to the next, Gateways were developed to help ‘translate’ between protocols and allow a limited amount of communication between them. These groups of LANs connected by a gateway are called a Wide Area Network (WAN). Diagram of a Local Area Network (LAN) Diagram of a Wide Area Network (WAN): ARPA-NET – Started and funded by the department of defense. The goal was to develop a robust distributed (decentralized) network with many servers as opposed to having a centralized place. This eventually developed into the internet. Client/Server model: It is the software (program) that is the client or the server-not the hardware. Examples are a web server, browser, Kazaa, outlook, white pages online. TCP/IP protocol: IP address- 4 groups of numbers separated by periods. Each group’s number can be at most 255. (It is structured similar to a mailing address where it can be read from the more general location such as what state to specific-which person.) Brooklyn College’s IP address begins with 146.245.X.X. The sci network begins with 146.245.2.X. DNS name is an easy to remember name used instead of the IP address. The Domain Name Server (DNS) provides the correct IP address for each domain name. This name also can be read from general to specific. Uniform Resource Locator (URL) – Specifies the exact location of a resource. It has three parts: the protocol to retrieve information, the machine name (DNS name or IP address) and the resource (file, printer or other). Email address and mailing list. Week 3: bits, bytes, storage and packets Text file or ASCII file vs. binary files (picture music…). Each format must be opened by a program that understands that format. Binary files are usually storage intensive. ASCII chart representation of characters. RAM memory is electric whereas a disk is made of magnetic material. Total disk storage=number of tracks *sectors per track*bytes per sector Random-access disk storage vs. sequential-access tape storage. A bit (binary digit): an electrical charge-light is either on or off. When writing this on paper 1 mean 'on' and 0 means 'off'; A byte is a group 8 bits. A byte can represent one character. A KB-kilobyte=210 (~1,000 bytes); MB-megabyte=220 (~ a million); GB-gigabyte=230 (~ a billion). File compression is used to save space. (Example MP3) Internet connections are measured in bits per second (bps). There are many different speeds. Dial-in speeds (28.8K, 56K) and DSL/cable are two examples. Data sent through the Internet is broken into chunks called packets. TCP organizes bytes into packets called 'IP datagrams'. IP datagrams have headers similar to the address on an envelope. It contains the source IP address, the destination IP address and the packet number. Routers examine the destination address and route the packet appropriately. Packet switching allows fair access to all packets. Different packets may take different routes and may arrive in the wrong order and might even be lost along the way. TCP on receiving end unpacks packets and reconstructs the data – puts packets in order, discards multiples. TCP 'handshake': a. TCP sender sends a packet b. TCP receiver sends back acknowledgment that packet arrived. c. TCP on sending end waits for acknowledgement. If not received the sender resends the packet but this time with a longer waiting period. This scheme may result in multiple packets arriving which would be discarded. TCP adjusts dynamically to network traffic. Week 4: ASCII text-based processing vs binary Representations: Data and applications: WORD-.doc; notepad-.txt; Dreamweaver (or notepad) - .html; paint-.bmp... 1. Text files use ASCII and can be read by text word processors such as Notepad and Word. Pictures use pixels (bitmap) and can be read by a graphics program. The ASCII table is used by all text based programs, 2. The term binary is used to refer to all representations aside for ASCII. Graphic files are said to be stored as binary files. Programs stored in executable files are also said to be stored in a binary file. 3. In order to 'read' a file an application must be programmed (to be discussed further in lecture 6) to read it. Different items must therefore be stored in different ways. For example: Characters values are stored as ASCII, which is a form of Unicode. In order to allow arithmetic operations, numbers cannot be stored as ASCII. They are stored using excess notation or two’s complement for integers and floatingpoint notation for decimal-numbers. Floating-point numbers are susceptible to roundoff errors. An example that highlights the difference would be to show the difference between storing the number 1057 as ASCII and storing it in two’s complement notation. Week 5: Algorithmic Thinking Computers are not people. When dealing with a person we rely on that person's experience. With a computer care must be taken not to take anything for granted. We must be on guard about assumptions, inconsistencies and omissions when telling the computer what to do. For example, if someone has bought bread at this store before, the following instruction would be enough: 1. Buy a loaf of bread from the store to serve with a meal. If they have not bought from this store in the past, the person would become frustrated by the following omissions: What are the directions? How much does it cost? Etc... Revised instructions: 1. Tell the person how much bread costs 2. Check that they have enough money 3. If not get enough money for them to pay for it 4. Give them store address 5. Send them to purchase the bread If the instruction giver wrongly assumed that the buyer is familiar with the area the person might become lost getting to the address. In that case Step 4 above would have to be expanded as follows: a. Make a right turn at the first corner b. Make a left after 2 lights…. If the person has not dealt with money before (a child) these instructions would not be enough. Imagine the person is from Canada and only dealt with Canadian currency we would have a similar issue. When something goes wrong because the instructions were not accurate enough it is referred to as a bug. In the last case the bug was due to the following Error: Used Canadian currency which was not accepted and the meal did not have bread on time. A computer is like the person who knows nothing until being told. Programmers, in general, go through a Software Life-Cycle: 1. Come up with an algorithm- We talked about this above. An algorithm predates computers. An algorithm is similar to a recipe. 2. Write a program- Put the algorithm into a high-level programming language such as C, Pascal, COBOL, Basic Java... 3. Compile the program- Translate it into machine code. 4. Run or execute the program. 5. Debug – fix errors 6. Maintenance – cycle through the step again. A simple example would be developing an algorithm to calculate the factorial of a number and then writing the high-level C code. Another larger example would be developing the student registration system. (Show C code) Two examples of bugs: 1. A similar assumption to the currency assumption (above) caused the Mars Climate Orbiter to be lost in space in 1999. In the calculation of its route some programmers used the metric system while others used English units. 2. Another example is a patriot missile failing to intercept a scud fired at US troops in 1991. The following is from an article: “Specifically, the time in tenths of second as measured by the system's internal clock was multiplied by 1/10 to produce the time in seconds.” Due to the above it is important to carefully specify the problem and keep track of changes to minimize errors. CASE tools are software packages that help keep track of specifications and software for a specific problem. (Please note that I expect to demonstrate some HTML coding in the lab for Lab 5 and 6.)Week 6: Review http://acc6.its.brooklyn.cuny.edu/~core51/review/rev1.html Week 7: First ExamWeek 8: Overview of CPU The CPU consists of the following components: o o o o o Arithmetic/Logic Unit circuitry for arithmetic and logic operations, Control Unit Program Counter – Is a register of the CU that contains the address in main memory of the next instruction Instruction Register – Is a register of the CU that holds the instruction that is currently executing Registers - holds data for future use by arithmetic/logic Bus - circuits connecting CPU and main memory Main Memory (RAM) Both the program instructions and the data are stored in main memory (RAM). One at a time instructions are brought into the CPU, they are worked on and the results are put back into memory. This is called the stored program concept (Von Neumann). We are assuming that the program has already gone through a compiler and is now in ‘machine code’ (See Lecture 5). The Control Unit executes a users stored program by cycling through the machine cycle, which is: A. Fetch (request) the instruction from main memory whose address is in the program counter. The instruction is stored in the instruction register and the program counter is incremented to the next instruction address. B. The control unit decodes the instruction to understand which circuits and registers are needed C. The control unit executes the instruction by activating the circuits needs (refer to addition example above.) The above three steps are repeated until the end of the stored program is reached. Sometimes, the program counter contains a jump instruction to another location in the main memory. This allows the CPU to carry out a loop construct where a set of instructions is repeated over and over. A computer can execute millions of instructions per second. (Make a small diagram) (Please note that I expect to demonstrate some JavaScript coding in the lab for Labs 8 and 9. Also note that you only have to hand in the handworks (not labs) for weeks 8-11.) Week 9: Overview of JavaScript JavaScript was developed at Netscape as a web programming language. Worked jointly with Sun Microsystems (who developed Java). The language is embedded in all Netscape browsers since 2.0 and all Internet Explorer browsers since 3.0. JavaScript is interactive and Client-Based – it works on the users machine, not the remote server machine It does not write files on either the user's machine or the server machine and it also does not carry out graphics. JavaScript code must go on between the script tag pair as in the following example: <script language = "JavaScript"> JavaScript goes here </SCRIPT> Old browsers that don't support JavaScript will print the JavaScript code on the screen since it doesn't realize that it is JavaScript. To avoid that you can surround the JavaScript code by the comment tags <!-- and --> as in the following example: <script language = "JavaScript"> <!-- Conceal from non-Javascript Browsers JavaScript goes here --> </script> (This is not so necessary anymore since most people have current enough browsers.) An object is an item such as a pop-up dialog box and a button. A method carries out some action or changes some property of an object. A function is a group of computer code that does a task. A variable remembers what a user typed in. The following are a few JavaScript functions/methods: alert(); var major = prompt("What is your major?", "CIS"); document.bgColor = color; var reply confirm("Do you like JavaScript?"); window. document.write("Your answer was" + reply ); An Event is a user action. An Event Handler is JavaScript code, which responds to the user action. Give an example that uses a name and a value. Show how to change the source of an image and how to change the value of an input button. Example: <a href="#" onmouseOver = "document.bgColor='red'; return true" Watch me! </a> <a href="#" > onmouseOver="document.bgColor='red'; return true" onMouseOut ="document.bgColor='white'; return true" > Watch me! </a>Week 10: Computability (Solvability) - Can all functions be computed? This question applies to an algorithm. Can we construct an algorithm A that would take any program P (and P's input) as input. A should determine if P halts: If P stops: print “P halts.” If P does not stop (infinite loop) print “P does not halt.” For example look at the following code. Does this loop end? Repeat as long as X is not equal to 0 Add one to X If the initial value of X is 0. Since X is already 0, the whole loop is skipped (no statements are repeated). ● If the initial value of X is a negative integer. The loop will end when X becomes 0. ● If the initial value of X is greater than 0 the loop will never end since there are an infinite number of integers greater than 0. ● This question concerned mathematicians even before digital computers were in developed. In particular, during the 1930's much work was devoted to this. The non computable example above is known as the Halting Problem and it cannot be solved. The Church-Turing thesis stated conditions which a function must satisfy in order to be computable. It is named after Alonzo Church and Alan Turing. To understand this intuitively, think about a clock. Can you let me know as soon as it does not stop? A corollary to the non computability of the halting problem is that it is not possible to construct an algorithm that would on its own construct any other algorithm to solve a random problem.Week 11: Feasibility Example: Doubling payments every day for 30 days. The idea is to look at the rate of growth of time or steps to solve the problem in relation to the input. In other words how quickly does the workload increase for a one unit increase of the input? Exponential and factorial growths are infeasible. Problem: Sort a set of n numbers. Algorithm 1 - Exhaustive listing and search List all permutations (orderings) of the data Pick the one that is sorted. How much work does it take? The number of permutations is n*(n-1)*(n2)*...*3*2*1 = n! Algorithm 2 - The selection sort Find the largest of the n data points Find the largest of the remaining n-1 data points Find the largest of the remaining n-2 data points . . . The smallest data point is remaining How much work does this take? n + (n-1) + (n-2) + ... + 3 + 2 + 1 = n*(n+1)/2 If n = 5 The selection sort takes 15 units of work The exhaustive listing and search takes 5! = 120 units of work Algorithm 1 is infeasible whereas algorithm 2 is feasible. Two examples of infeasible problems: Traveling salesman problem Chess Week 12: Cryptography Cryptography is the study of methods to encrypt or scramble data. Cryptanalysis is the study of how to decode or break the crypt. Conventional or single key encryption, also knows as Synchronous, is where the same key is used to encrypt and decrypt the data. Examples of single key encryption is the substitution cipher where each letter of the alphabet is substituted with a different letter or symbol. Caesar's method is a substitution cipher: replace every letter in the alphabet with the letter N spots away, For example where N=3: A B C . X Y Z . - > > > . > > > D E F A B C 1. The sender uses key K (in this case K is the table above) to encrypt the message. 2. The sender sends the encrypted data. 3. The receiver decodes the message using K. Public-key encryption, also known as Asynchronous, uses two keys. The receiver publishes a public key which is used to encrypt the message. The receiver has a second, private key (known only to the receiver) which is used to decrypt the transmitted text. An advantage of public-key cryptography is that each person only needs one key-pair and it is easy to distribute the keys for others to use. Well-known public-key systems: Elgamal - invented by Taher Elgamal RSA - invented by Ron Rivest, Adi Shamir and Leonard Adleman DSA - Digital Signature Algorithm by David Kravitz Pretty Good Privacy - PGP - uses both conventional and public-key cryptography PGP combines the RSA asynchronous encryption scheme in conjunction with a singlekey synchronous scheme: At the sending end: 1. Compresses the message 2. Creates a session key that is used only once during this session. Created from randomly selected mouse movements and keystrokes. 3. Session key is used to conventionally (single key Synchronous encryption) encrypt the message. 4. The receiver's public key is used to encrypt the session key. 5. The encrypted message and encrypted session key are sent to the receiver. At the receiving end: 1. 2. 3. 4. Receiver uses private key to decrypt session key The session key is used to decode the message text. The text is decompressed. Session key is discarded. Advantages: Only a very small content (the session key) is publicly encrypted. The session key is used just once - hard to decode by repeated attacks Conventional encryption is ~10,000 times faster than public-key encryption. The RSA public-key encryption scheme works as follows: M = the message C = the encrypted message e = the public exponent d = the private exponent n = a very large integer The message is encrypted by: C = Me mod n e (mod means divide M by n and keep the remainder) The message is decrypted by: M = Cd mod n (where n = p * q; d = e-1 mod((p-1)*(q-1)) p and q are prime numbers) If n should be a large number (128 bits or 256 bits). It is computationally infeasible to find p and q because factoring large numbers is an infeasible problem since (at least publicly) no one has found an efficient algorithm to solve it. The best algorithm publicly known to factor a large number n, that is a factor of two unknown prime factors, is to: 1. Determine all (or most) prime numbers up to the square root of n. (There might exist lists of primes to work with.) 2. Multiply each of these primes into n. The first one that divides evenly gives you p and q (p is the divisor you tried and q is the quotient of the division.) This is infeasible because simply adding one bit to the n double the number n. This is exponential growth. This size of n.5 also grows exponentially.