An Introduction to Programming Concepts and OI-programming
…from abstract theory to dirty tricks…

Objectives Today
  Introduction to the concept of “Algorithms”
  Introduction to common algorithms in OI competitions
  Introduction to graph theory
  Introduction to complexity
  “Philosophy” of OI competitions
  OI-style programming

What is an Algorithm?
  From Wikipedia: An algorithm is a finite set of well-defined instructions for accomplishing some task which, given an initial state, will terminate in a corresponding recognizable end-state.
  (what does that mean?)
  Usually, an algorithm solves a “problem”.
  Examples
    Insertion sort
    Binary search
  An algorithm does not have to be a computer program!
  Think about other possible algorithms in real life

“Problem”s
  Usually a set of well-defined inputs and corresponding outputs
  Example: the sorting problem
    Input: a list of numbers
    Output: a sorted list of numbers
  We can use a number of different sorting algorithms to solve the sorting problem

Data Structures
  Supplementary objects that help store data in an algorithm
  Different data structures have different properties, can store different types of data, and access them in different ways
  Selecting the right data structure can be very important, as you will learn later
  Examples: arrays, queues, stacks… more will be introduced later

Examples of algorithms
  Sorting algorithms
  Graph algorithms – Dijkstra, Floyd-Warshall, Bellman-Ford, Prim, Kruskal
  Tree-search algorithms – BFS, DFS
  Linear searching algorithms

Examples of Data Structures
  Array – random access
  Queue – First In, First Out
  Stack – First In, Last Out
  Heap – extract min/max number
  Binary Search Tree – efficient insert, search, delete, etc.
  Hash Table – fast lookup
  Other graph data structures discussed below

Examples of Techniques in Designing Algorithms
  Recursion
  Dynamic programming
  Greedy
  Divide and conquer
  Branch and bound
  (the above may have overlaps)

Using and Creating Algorithms
  “It is science. You can derive them.”
  “It is art.
We have no way to teach you!” – Alan Tam

Why study algorithms?
  To solve problems that can be directly solved by existing algorithms
  To solve problems that can be solved by combining algorithms
  To get feelings and inspirations on how to design new algorithms

Related Issues
  Proving correctness of algorithms (why? why not?)
  Other methods: finding counterexamples, “Unu’s Conjecture of Competition”

Questions?

Graphs
  What is a graph?
  Informally, a set of relationships between things
  A graph is defined as G = (V, E), where
    V is the set of vertices (singular: vertex)
    E is the set of edges that connect some of the vertices
  A path is a sequence of vertices in which each consecutive pair is connected by an edge

Example
  Map of Australia
  [Figure: graph with vertices NT, Q, WA, SA, NSW, V, T]

Common Types of Graphs
  Directed/Undirected Graph
  Weighted/Unweighted Graph
  Connectivity
  [Figure: graph with vertices NT, Q, WA, SA, NSW, V, T]

Trees
  A few common definitions (equivalent):
    Connected graph with no cycles
    There is a unique path between any two vertices
    Connected graph with v - 1 edges (v = number of vertices)
  Rooted/Unrooted Trees
  Heap, Binary Search Trees

Representing a graph
  Adjacency Matrix
  Adjacency List

Complexity
  What is complexity?
  We are not (yet!) concerned with the exact runtime or memory used
  We want to know how well an algorithm “scales up” (i.e. when there is a large input). Why?

Complexity (cont’d)
  Here’s why:
  [Chart: growth of f(n) = 10n, f(n) = 30n, f(n) = n^2 and f(n) = n^3; vertical axis from 0 to 2500]

Quasi-Formal Definition of Big-O
  (you need not remember these)
  We say f(x) is in O(g(x)) if and only if there exist numbers x0 and M such that
    |f(x)| ≤ M |g(x)| for all x > x0

Example 1 – Bubble sort
  for i := 1 to n do
    for j := i downto 2 do
      if a[j] > a[j-1] then
        swap(a[j], a[j-1]);
  Time Complexity? O(n^2)
  “Swap Complexity”?
  How about memory?

Example 2 – Insertion Sort
  Quick introduction to insertion sort (you will learn more in the searching and sorting training):
    [] 4 3 1 5 2
    [4] 3 1 5 2
    [3 4] 1 5 2
    [1 3 4] 5 2
    [1 3 4 5] 2
    [1 2 3 4 5]
  Time Complexity = ?
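
The insertion sort trace above can be sketched in code. This is only an illustration in Python rather than the Pascal used elsewhere in these slides, and the function name `insertion_sort` and the shift counter are assumptions added for the example. Counting element shifts hints at the answer to the complexity question: in the worst case (a reversed list) the number of shifts is n(n-1)/2, which is O(n^2).

```python
def insertion_sort(values):
    """Return (sorted copy of values, number of element shifts performed)."""
    a = list(values)          # work on a copy so the caller's list is untouched
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]            # next element to insert into the sorted prefix a[0..i-1]
        j = i - 1
        # Shift larger elements of the sorted prefix one step to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            shifts += 1
            j -= 1
        a[j + 1] = key        # drop the element into the gap
    return a, shifts


if __name__ == "__main__":
    # The list from the slide, then a worst-case (reversed) list.
    print(insertion_sort([4, 3, 1, 5, 2]))
    print(insertion_sort([5, 4, 3, 2, 1]))
```

On an already sorted list the inner loop never runs, so the same code does only about n comparisons; the O(n^2) figure describes the worst case.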
Applications
  Usually, the time complexity of the algorithm gives us a rough estimation of the actual run time.
    O(n) for very large n
    O(n^2) for n ~ 1000-3000
    O(n^3) for n ~ 100-200
    O(n^4) for n ~ 50
    O(k^n) or O(n!) for very small n, usually < 20
  Keep in mind
    The constant of the algorithms (including the implementation)
    Computers vary in speed, so the time needed will be different
  Therefore remember to test the program/computer before making assumptions!

Problem
  I have implemented bubble sort for an array A[N] and applied binary search on it.
  Time complexity of bubble sort? O(N^2). No doubt.
  Time complexity of binary search? O(lg N)
  Well, what is the time complexity of my algorithm?

Properties
  O(f) + O(g) = max(O(f), O(g))
  O(f) * O(g) = O(fg)
  So, what is the answer to the previous question?

Some other notations (optional)
  f(N) is Θ(g(N)) iff f(N) is O(g(N)) and g(N) is O(f(N))
  f(N) is o(g(N)) iff for all C > 0, there exists N0 such that |f(N)| < C|g(N)| for all N > N0
  f(N) is Ω(g(N)) iff g(N) is O(f(N))
  Again no need to remember them

Computational Theory Topics
  P (Polynomial)
    Can be solved in polynomial time
  NP (Non-deterministic Polynomial)
    Can be checked in polynomial time
    NP does NOT stand for “not-polynomial”!!
  NP-Complete
    The “hardest” NP problems

“Philosophy” of OI Competitions

Objective of Competition…
  The winner is determined by:
    Fastest Program?
    Amount of time used in coding?
    Number of Tasks Solved?
    Use of the most difficult algorithm?
    Highest Score
  Therefore, during a competition, aim to get the highest score, at all costs – “All is fair in love and war.”

Scoring
  A “black box” judging system
    Test data is fed into the program
    Output is checked for correctness
    No source code is manually inspected
  How to take advantage (without cheating of course!) of the system?
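
To make the “black box” idea concrete, here is a rough sketch of how such a judge might score a submission. Everything in it is an assumption for illustration: real judging systems differ in how they run programs and compare output, and the names `judge`, `full_solution`, `partial_solution` and the toy task (print the sum of the integers on one line) are hypothetical.

```python
def judge(solution, tests, points_per_test=10):
    """Score a solution on (input, expected_output) pairs, black-box style.

    Assumed scheme: each test is worth the same number of points, and the
    output must match the expected text exactly. The source code of the
    solution is never looked at; only its output matters.
    """
    score = 0
    for test_input, expected in tests:
        try:
            output = solution(test_input)
        except Exception:
            continue              # a crash earns nothing for this test
        if output == expected:
            score += points_per_test
    return score


# Hypothetical task: output the sum of the integers on the input line.
TESTS = [("5", "5"), ("1 2", "3"), ("7", "7"), ("10 20 30", "60")]


def full_solution(text):
    return str(sum(int(x) for x in text.split()))


def partial_solution(text):
    # Only correct for the simplest case: a single number on the line.
    return text.split()[0]


if __name__ == "__main__":
    print(judge(full_solution, TESTS), judge(partial_solution, TESTS))
```

Under this assumed scheme the partial solution still collects the points for the two single-number tests, which is why a program that handles only some cases can be worth submitting.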
The OI Programming Process
  Reading the problems
  Choosing a problem
  Reading the problem
  Thinking
  Coding
  Testing
  Finalizing the program

Reading the Problem
  Usually, a task consists of
    Title
    Problem Description
    Constraints
    Input/Output Specification
    Sample Input/Output
    Scoring

Reading the Problem
  Constraints
    Range of variables
    Execution Time
  NEVER make assumptions yourself
    Ask whenever you are not sure
    (Do not be afraid to ask questions!)
  Read every word carefully
  Make sure you understand before going on

Thinking
  Classify the problem
    Graph? Mathematics? Data Processing? Dynamic Programming? etc.
    Some complicated problems may be a combination of the above
  Draw diagrams, use rough work, scribble…
  Consider special cases (smallest, largest, etc.)
  Is the problem too simple?
  Usually the problem setters have something they want to test the contestants on: maybe an algorithm, some specific observations, carefulness, etc.
  Still no idea? Give up. Time is precious.

Designing the Solution
  Remember, before coding, you MUST have an idea of what you are doing. If you don’t know what you are doing, do not begin coding.
  Some points to consider:
    Execution time (Time complexity)
    Memory usage (Space complexity)
    Difficulty in coding
  Remember, during competition, use the algorithm that gains you the most score, not the fastest/hardest algorithm!

Coding
  Optimize for ease of coding, not for reading
  Ignore all the “coding practices” outside, unless you find them particularly useful in OI competitions
    No comments needed
    Short variable names
    Use fewer functions
  NEVER use 16-bit integers (unless memory is limited)
    16-bit integers may be slower! (PCs are usually 32-bit; even 64-bit architectures should be somewhat optimized for 32-bit)

Coding (2)
  Use goto, break, etc. in the appropriate situations
    Never mind what Dijkstra has to say
  Avoid using floating point variables if possible (e.g.
real, double, etc.)
  Do not do small (aka useless) “optimizations” to your code
  Save and compile frequently
  See example program code…

Testing
  Sample Input/Output
    “A problem has sample output for two reasons:
      1. To make you understand what the correct output format is
      2. To make you believe that your incorrect solution has solved the problem correctly”
  Manual Test Data
  Generated Test Data (if time allows)
    Boundary Cases (0, 1, other smallest cases)
    Large Cases (to check for TLE, overflows, etc.)
    Tricky Cases

Debugging
  Debugging – find the bug, and remove it
  Easiest method: writeln/printf/cout
    These are the so-called “debug messages”
  Use of debuggers:
    FreePascal IDE debugger
    gdb debugger

Finalizing
  Check output format
    Any trailing spaces? Missing end-of-lines? (for printf users, this is quite common)
    Better to test once more with the sample output
    Remember to remove those debug messages
  Check I/O – filename? stdio?
  Check exe/source file name
  Is the executable updated?
  Method of submission?
  Try to allocate ~5 mins at the end of the competition for finalizing

Interactive Tasks
  Traditional Tasks
    Give input in one go
    Give output in one go
  Interactive Tasks
    Your program is given some input
    Your program gives some output
    Your program is given some more input
    Your program gives more output
    …etc

Example
  “Guess the number”
  Sample Run:
    Judge: I have a number between 1 and 5, can you guess?
    Program: is it 1?
    J: Too small
    P: 2?
    J: Too small
    P: 3?
    J: Too small
    P: 4?
    J: Correct
    P: 5?
    J: Too big
    P: Your number is 4!

Open Test Data
  Test data is known
  Usually quite difficult to solve
  Some need time-consuming algorithms, therefore you are given a few hours (i.e. the competition time) to run the program
  Tricks:
    ALWAYS look at all the test data first
    Solve by hand, manually
    Solve partially by program, partially by hand
    Solve some with different programs
    Solve all with one program (sometimes impossible!)
  Make good use of existing tools – you do not have to write all the programs if some are already available! (e.g.
sort, other languages, etc.)

Tricks
  “No solution”
  Solve for simple cases
  Hard Code 50%
  Special cases (smallest, largest, etc.)
  Incorrect greedy algorithms
  Stupid hardcode: begin writeln(random(100)); end.
  Naïve hardcode: “if input is x, output hc(x)”
  More “intelligent” hardcode (sometimes not possible): pre-compute the values, and only save some of them
  Brute force
  Other Weird Tricks (not always useful…)
    Do nothing (e.g. Toggle, IODM)

Pitfalls / Common Mistakes
  Misunderstanding the problem
  Not familiar with the competition environment
  Output format
  Using complex algorithms unnecessarily
  Choosing the hardest problem first

Advertisement (targeted ad)
  NOI/IOI use Linux as the competition environment exclusively
  We are thinking of providing Linux-only environments for upcoming team formation test(s)
  Linux, when used properly, can be more powerful than Microsoft Windows™ for contests, because it has more powerful tools
    E.g. command line tools, powerful editors (vim, emacs), etc.

The End
  Note: most of the contents are introductions only. You may want to find more in-depth materials:
    Books – Introduction to Algorithms
    Online – Google, Wikipedia
    HKOI – Newsgroup, training websites of previous years, discuss with trainers/trainees
    Training – Many topics are further covered in later trainings
    Experience!

Any Questions?