LING 408/508: Programming for Linguists
Lecture 1
August 24th

Administrivia
1. Syllabus
2. Introduction
3. Quickie Homework 1 – due Wednesday night by midnight in my e-mailbox
• I will assume everyone has a laptop…

Syllabus
• Details
  – Instructor: Sandiway Fong, Depts. of Linguistics and Computer Science
  – Email: sandiway@email.arizona.edu (homeworks go here)
  – Office: Douglass 311
  – Hours: by appt. or walk-in, after class
  – Meet: Shantz 338, Mondays/Wednesdays 3:15-4:30pm
  – No class on:
    • Monday September 7th (Labor Day)
    • Wednesday November 11th (Veterans Day)
    • Week after September 11th (out of town), plus Monday 21st
    • Monday October 12th
  – My academic webpage:
    • dingo.sbs.arizona.edu/~sandiway/ or just Google me

Syllabus
• dingo.sbs.arizona.edu/~sandiway/
  – Lecture slides will be available online (.pptx and pdf formats) just before class
    • look for updates/corrections afterwards
  – Lectures will be recorded using the Panopto system (video, laptop screen, synchronized slides, keyword search)

Syllabus
• Pre-requisites:
  – none!
• Course Objectives:
  – Introduction to programming
    • data types, different programming styles, thinking algorithmically …
  – and fundamental computer concepts
    • computer organization: underlying hardware, and operating systems (processes, shell, filesystem etc.)
  – Operating System:
    • Ubuntu (Linux)
  – Programming Languages:
    • selected examples: Bash shell, Python, JavaScript, Perl, Tcl/Tk, HTML/CSS, MySQL, cgi-bin etc.

Syllabus
• Expected learning outcomes:
  – familiarity with the underlying technology, terminology and programming concepts
  – acquire the ability to think algorithmically
  – acquire the ability to write short programs
  – build a graphical user interface
  – build a web application (with a relational database)
  – be equipped to take classes in the Human Language Technology (HLT) program and related classes

Syllabus
• Grading
  – 408: homeworks (80%), term programming project (20%)
  – 508: homeworks (65%), term programming project (35%)
  – Note: requirement – you must submit all homeworks
• Homework submissions:
  – email only to me
  – See homework 1 for the required format …
  – homeworks will be introduced in class
  – due date:
    • almost one week (typically)
    • example: homework presented in class on Monday (resp. Wednesday), due Sunday (resp. Tuesday) night by 11:59pm in my mailbox
  – all homeworks will be reviewed in class

Syllabus
• Homeworks
  – you may discuss questions with other students
  – however, you must program/write it up yourself (in your own words/code)
  – cite (web) references and your classmates (in the case of discussion)
  – Student Code of Academic Integrity: plagiarism etc.
    • http://deanofstudents.arizona.edu/codeofacademicintegrity
• Revisions to the syllabus
  – “the information contained in the course syllabus, other than the grade and absence policies, may be subject to change with reasonable advance notice, as deemed appropriate by the instructor.”

Syllabus
• Absences
  – tell me ahead of time so we can make special arrangements, e.g. homeworks
  – I expect you to attend lectures (though attendance will not be taken)

Introduction
• Computers
  – Memory
    • Programs and data
  – CPU
    • Interprets machine instructions
  – I/O
    • keyboard, mouse, touchpad, screen, touch sensitive screen, printer, USB port, etc.
    • Bluetooth, Ethernet, Wi-Fi, cellular …

Introduction
• Memory
  – CPU registers
  – L1/L2 cache
  – RAM
  – SSD/hard drive
  – Blu-ray/DVD/CD drive
  [Figure annotations: invisible to programmers; open file, read/write; fast to slow]

Introduction
• Memory Representation
  – binary: zeros and ones (1 bit)
  – organized into bytes (8 bits)
    • memory is byte-addressable
  – word (32 bits)
    • e.g. integer
    • (64 bits: floating point number)
  – big-endian/little-endian
    • most significant byte first or least significant byte first
    • communication …
  [Figure: addressable memory (RAM), addresses 0 to FFFFFFFF, containing an array a[23]; annotation: your Intel and ARM CPUs]

Introduction
• A typical notebook computer
  – Example: a 2013 MacBook Air
  – CPU: Core i5-4250U
    • 1.3 billion transistors
    • built-in GPU
    • TDP: 15W (1.3 GHz)
    • Dual core (Turbo: 2.6 GHz)
    • Hyper-Threaded (4 logical CPUs, 2 physical)
    • 64 bit (increased address space and 64-bit registers)
    • 64 KB (32 KB Instruction + 32 KB Data) L1 cache
    • 256 KB L2 cache per core
    • 12MB L3 cache shared
    • 16GB max RAM

Introduction
[Figure: a 4 core machine: 8 virtual (source: anandtech.com)]

Introduction
• Machine Language
  – A CPU understands only one language: machine language
    • all other languages must be translated into machine language
  – Primitive instructions include:
    • MOV
    • PUSH
    • POP
    • ADD / SUB
    • INC / DEC
    • IMUL / IDIV
    • AND / OR / XOR / NOT
    • NEG
    • SHL / SHR
    • JMP
    • CMP
    • JE / JNE / JZ / JG / JGE / JL / JLE
    • CALL / RET
  – Assembly Language: (this notation)
  – by definition, nothing built on it is more powerful
  http://www.cs.virginia.edu/~evans/cs216/guides/x86.html

Introduction
• Not all the machine instructions are conceptually necessary
  – many provided for speed/efficiency
• Theoretical Computer Science
  – All mechanical computation can be carried out using a TURING MACHINE
  – Finite state table + (infinite) tape
  – Tape instructions:
    • at the tape head: Erase, Write, Move (Left/Right/NoMove)
  – Finite state table:
    • Current state x Tape symbol --> New state x New tape symbol x Move

Introduction
• Storage:
  – based on digital logic
  – binary (base 2)
  – everything is a power of 2
  – Byte: 8 bits
    • 01011011
    • = 2^6 + 2^4 + 2^3 + 2^1 + 2^0
    • = 64 + 16 + 8 + 2 + 1
    • = 91 (in decimal)
  – Hexadecimal (base 16)
    • 0-9,A,B,C,D,E,F (need 4 bits)
    • 5B (= 1 byte)
    • = 5*16^1 + 11*16^0
    • = 80 + 11
    • = 91

    2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
     0   1   0   1   1   0   1   1

    2^3 2^2 2^1 2^0 | 2^3 2^2 2^1 2^0
     0   1   0   1  |  1   0   1   1
           5        |        B

    16^1 16^0
      5    B

Introduction: data types
• Integers
  – In one byte (= 8 bits), what are the largest and smallest numbers we can represent?
  – 00000000 = 0
  – 01111111 = 127
  – 10000000 = -128
  – 11111111 = -1
  [Figure: 2's complement representation; bit patterns 00000000 … 11111111 map to 0 … 127, then -128, -127 … -1]

Introduction: data types
• Integers
  – In one byte, what are the largest and smallest numbers we can represent?
  – Answer: -128 .. 0 .. 127 using the 2’s complement representation
  – Why? super-convenient for arithmetic operations
  – “to convert a positive integer X to its negative counterpart, flip all the bits, and add 1”
  – Example:
    – 00001010 = 2^3 + 2^1 = 10 (decimal)
    – 11110101 + 1 = 11110110 = -10 (decimal)
    – 11110110 flip + 1 = 00001001 + 1 = 00001010
  – Addition: -10 + 10 = 11110110 + 00001010 = 0 (ignore overflow)

Introduction: data types
• Typically 32 bits (4 bytes) are used to store an integer
  – range: -2,147,483,648 (-2^(32-1)) to 2,147,483,647 (2^(32-1) - 1)

    2^31 2^30 2^29 2^28 2^27 2^26 2^25 2^24 |   …    |   …    | 2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0
                    byte 3                  | byte 2 | byte 1 |             byte 0
    C: int

• what if you want to store even larger numbers?
  – Binary Coded Decimal (BCD)
  – code each decimal digit separately, use a string (sequence) of decimal digits …

Introduction: data types
• what if you want to store even larger numbers?
  – Binary Coded Decimal (BCD)
  – 1 byte can code two digits (0-9 requires 4 bits)
  – 1 nibble (4 bits) codes the sign (+/-), e.g. hex C/D

    2^3 2^2 2^1 2^0
     0   0   0   0    = 0
     0   0   0   1    = 1
     1   0   0   1    = 9

    2 0 1 4        2 bytes (= 4 nibbles)
    + 2 0 1 4      2.5 bytes (= 5 nibbles)

    credit (+):  1 1 0 0  = C
    debit (-):   1 1 0 1  = D
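
The data-type slides above (binary and hexadecimal, 2's complement, BCD) can be checked with a few lines of Python. This is only a minimal sketch: the function names (`to_twos_complement`, `from_twos_complement`, `negate`, `to_bcd_nibbles`) are made up for illustration, and placing the sign nibble first simply follows the "+ 2 0 1 4" layout on the slide (an assumption; real packed-decimal formats commonly put the sign nibble last).

```python
def to_twos_complement(value, bits=8):
    """Return the 2's complement bit pattern of value as a string of `bits` bits."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking with 2**bits - 1 maps a negative number onto its bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern):
    """Interpret a bit string as a signed 2's complement integer."""
    raw = int(pattern, 2)
    # A leading 1 means the pattern stands for raw - 2**bits.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


def negate(pattern):
    """Flip all the bits and add 1, ignoring any overflow out of the top bit."""
    bits = len(pattern)
    flipped = "".join("1" if b == "0" else "0" for b in pattern)
    return format((int(flipped, 2) + 1) & ((1 << bits) - 1), f"0{bits}b")


# Hypothetical sign codes taken from the slide: hex C = credit (+), hex D = debit (-).
SIGN_NIBBLES = {"+": 0xC, "-": 0xD}


def to_bcd_nibbles(value):
    """Code each decimal digit in its own nibble, with the sign nibble first."""
    sign = "+" if value >= 0 else "-"
    nibbles = [SIGN_NIBBLES[sign]] + [int(digit) for digit in str(abs(value))]
    return [format(n, "04b") for n in nibbles]


if __name__ == "__main__":
    print(int("01011011", 2), int("5B", 16))      # binary and hex for the same byte
    print(to_twos_complement(-10))                # bit pattern for -10
    print(negate("11110110"))                     # flip + 1 gets back to +10
    print(to_bcd_nibbles(2014))                   # 5 nibbles = 2.5 bytes
```

Python integers are not limited to 32 bits, which is why the sketch passes an explicit bit width and masks the result; a C `int` would wrap around at that width on its own.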