CITS1401 Problem Solving and Programming Problem-solving Techniques Semester 1, 2014 A/Prof Lyndon While School of Computer Science & Software Engineering The University of Western Australia CITS1401 • CITS1401 covers – many important problem-solving techniques used widely in Computer Science and in programming – writing basic programs in Python, a modern high-level programming language – an introduction to software engineering • Problem-solving techniques have been covered mostly via lab work – in this lecture we will review the techniques covered CITS1401 problem-solving techniques 2 Techniques discussed • • • • • • Morphological analysis and testing Reduction and analogy Enumeration and search Abstraction Divide-and-conquer Backtracking CITS1401 problem-solving techniques 3 Morphological analysis and testing • Lab 2 • “morphology” is the study of patterns or forms – occurs widely in many branches of science • In CS/SE/IT, it occurs mostly in two contexts – classification of inputs for processing – classification of inputs for testing CITS1401 problem-solving techniques 4 Classification of inputs for processing • A term in a polynomial is defined by its coefficient and its exponent – what if we want to turn the term into a string? • In the “easy case”: (–6, 4) → “–6x^4” CITS1401 problem-solving techniques 5 Classification of inputs for processing • But what if the coefficient == 0? > 0? == 1? == –1? (0, 4) (6, 4) (1, 4) (–1, 4) → → → → “0” “6x^4” “x^4” “–x^4” → → → → “–6” “–6x” “–6/x^4” “–6/x” • And what if the exponent == 0? == 1? < 0? == –1? (–6, 0) (–6, 1) (–6, –4) (–6, –1) CITS1401 problem-solving techniques 6 Classification of inputs for testing • In optional preferential voting, the voter can rank any subset of the candidates, from one of them up to all of them – so any sequence of non-repeating integers increasing from 1 is a valid vote • If we write a function to parse OPV votes, what would be a good set of test data? • Assume an election with three candidates – a vote is represented by a string containing three characters CITS1401 problem-solving techniques 7 Testing example: classifying votes • Intentionally ranking all three candidates: – i.e. permutations of 123 – 6 of these – 123, 132, 213, 231, 312, 321 • Intentionally ranking only two candidates: – i.e. permutations of 12<sp> – 6 of these – 12<sp>, 1<sp>2, <sp>12, 21<sp>, 2<sp>1, <sp>21 CITS1401 problem-solving techniques 8 Testing example: classifying votes contd. • Intentionally ranking only one candidate: – i.e. a 1 with two spaces – 3 of these – 1<sp><sp>, <sp>1<sp>, <sp><sp>1 • Accidentally ranking only two candidates: – i.e. some non-space instead of the 3 • permutations of 124? – 6 of these – 124, 142, 214, 241, 412, 421 CITS1401 problem-solving techniques 9 Testing example: classifying votes contd. • Accidentally ranking only one candidate: – omitting the 2 • 6 permutations of 134? – duplicating the 2 • 3 permutations of 122 • Accidentally ranking no candidates: – omitting the 1 • 6 permutations of 234? – replicating the 1 • 4 possibilities: 11<sp>, 1<sp>1, <sp>11, 111 CITS1401 problem-solving techniques 10 Testing example: classifying votes contd. • All spaces: – <sp><sp><sp> • Over-length: – anything with more than three characters • Under-length: – anything with fewer than three characters • Already we have 43 test cases! • Every time we change the function, we should re-run all tests – clearly we need a testing program! CITS1401 problem-solving techniques 11 Reduction and analogy • Lab 3 • Reduction is solving a new problem by converting it into another problem for which we already have a solution – e.g. the problem of finding your way around an unknown city can be reduced to the problem of finding a map of the city • assuming you can read maps! • That problem can be reduced to the problem of finding a shop that sells maps – which can be reduced to the problem of reading the information at the airport… CITS1401 problem-solving techniques 12 Reduction example: building tables • Assume that we have written a function buildEvenTable that works for even n • Now we need to write the function buildTable that works for all n • It would be huge mistake to duplicate the code • Instead define buildTable by reducing the problem to buildEvenTable – plus stripDummy and some other logic CITS1401 problem-solving techniques 13 Reduction example: running programs • Imagine we have a Python interpreter that can run while-loops • Then someone says “Let’s add for-loops to Python” • We could change the interpreter – but that might be a lot of work, especially if it was originally written by someone else • Or we could use reduction – replace each for-loop with an equivalent while-loop for k in range(n): <statements> CITS1401 problem-solving techniques becomes k=0 while k < n: <statements> k += 1 14 Reduction example: programs contd. • Then someone says “Let’s add list comprehensions to Python” • Use reduction again: replace each list comprehension with an equivalent for-loop zs = [f(x) for k in range(n) if p(k)] becomes zs = [] for k in range(n): if p(k): zs.append(f(x)) • Note the hierarchical approach CITS1401 problem-solving techniques 15 Enumeration and search • Lab 4 • Very simple idea: generate all possible solutions to the problem, and then check each one to see if it’s correct/good • Used widely in – cryptography – artificial intelligence – game playing CITS1401 problem-solving techniques 16 Enumeration example: verbal arithmetic • SEND + MORE = MONEY – consistently replace each letter with a digit from 0, 1, …, 9 so that the arithmetic is correct • By enumeration, we could create all 10 x 9 x … x 3 = 1,814,400 possible assignments of digits to letters, and check each one – notionally very easy – computationally very expensive CITS1401 problem-solving techniques 17 Enumeration issues • Often “all possible solutions” is way too many! – especially if there’s an infinite number of them… • Often advantageous to – rank potential solutions • likely to find a correct/good one sooner – use known correct/good solutions to develop new (improved) possibilities • e.g. hill-climbing algorithms or genetic algorithms – randomness helps sometimes! CITS1401 problem-solving techniques 18 Enumeration example: cryptography • You are given a coded English message that you know was derived using a substitution cipher – i.e. each letter in the original was consistently replaced by a different letter • Using naïve enumeration gives 26 x 25 x … x 1 = 403,291,461,126,605,635,584,000,000 possibilities • So use tricks like: – ‘e’ probably occurs very often (& ‘a’, ‘r’, ‘t’, ‘n’, etc.) – the sequence ‘jx’ probably never occurs (& ‘zq’, etc.) – most occurrences of ‘q’ will be followed by a ‘u’ CITS1401 problem-solving techniques 19 Enumeration example: missionaries & cannibals • On one side of a river are three missionaries, three cannibals, and a canoe that can carry one or two people – any time on either side of the river, if the number of cannibals exceeds the number of missionaries, something unpleasant happens • Can you come up with a sequence of canoe trips that gets everyone safely across the river? • e.g. the first trip could be – – – – M crosses: no! MM cross: no! C crosses: ok, but the next trip must be just him coming back CC cross, or MC cross: maybe… CITS1401 problem-solving techniques 20 Enumeration example: missionaries & cannibals • By enumeration, we could create all possible sequences – but we need to check for loops – and it’ll be a lot • Instead just apply the rule “always maximise the number of people on the far bank” – leads almost directly to a solution! – these sorts of guidelines are called heuristics CITS1401 problem-solving techniques 21 Abstraction • Lab 5 • Abstraction means simplifying a problem as much as possible before solving it • Examples of this principle include – – – – operate on models instead of the “real world” ignore some details to focus on others discretise space and/or time (and other dimensions) prefer simple data reps. (e.g. integers vs. dates) • Abstraction can lead to more-general solutions CITS1401 problem-solving techniques 22 Abstraction examples • Graph problems – entities as nodes, connections as arcs – focus is on the topology • Dates as integers – faster, simpler, and more flexible • Database construction – store only relevant data – but of course “relevant” is context-dependent… CITS1401 problem-solving techniques 23 Abstraction • Einstein’s Constraint: “Everything should be made as simple as possible, but not simpler.” – omit all details that don’t contribute to a solution – allows you to focus on the “important bits” – but don’t omit any important bits! CITS1401 problem-solving techniques 24 Divide-and-conquer • Lab 6 • Divide-and-conquer means: – divide a problem instance into several smaller pieces – solve each of the pieces separately – combine their solutions to solve the original problem • Very widely-used technique, especially for processing data structures • Often leads to very efficient algorithms CITS1401 problem-solving techniques 25 Divide-and-conquer example: mergesort • Given a list of n numbers – [8, 0, 3, 6, 1, 7, 4, 2, 9, 5] • Split the list down the middle – [8, 0, 3, 6, 1] and [7, 4, 2, 9, 5] • Separately sort these two lists – [0, 1, 3, 6, 8] and [2, 4, 5, 7, 9] • Merge the two sorted lists – [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] – need only (repeatedly) compare the heads CITS1401 problem-solving techniques 26 Divide-and-conquer example: quicksort • Given a list of n numbers – [8, 0, 3, 6, 1, 7, 4, 2, 9, 5] • Choose a pivot (say 5) and partition the list – [0, 3, 1, 4, 2] and [8, 6, 7, 9] – elements smaller than the pivot in the first list • Separately sort these two lists – [0, 1, 2, 3, 4] and [6, 7, 8, 9] • Append the two sorted lists and re-insert the pivot – [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] CITS1401 problem-solving techniques 27 Divide-and-conquer issues • Dividing up the data equally gives the best performance – e.g. quicksort • the best pivot leaves two lists with n/2 elements • the worst pivot leaves one list with n–1, plus [ ] • Auxiliary operations should be cheap – e.g. merge, partition • Larger base cases may improve performance • Sometimes identical sub-problems arise CITS1401 problem-solving techniques 28 Backtracking • Lab 7 • Backtracking is a major enhancement to enumeration and search • Enumeration & search: – generate all possible complete solutions, then check each one for correctness • Backtracking: – build up partial solutions bit by bit, checking for correctness at each stage CITS1401 problem-solving techniques 29 Backtracking example: verbal arithmetic • SEND MORE + ----------MONEY – consistently replace each letter with a digit from 0, 1, …, 9 so that the arithmetic is correct • e.g. D + E = Y – or D + E = Y + 10 • e.g. N + R = E – more generally, N + R [+ 1] = E [+ 10] CITS1401 problem-solving techniques 30 Verbal arithmetic with Enumeration & Search • • • • • • • • • • • • • • • • D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 7, Y = 8: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 7, Y = 9: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 7, Y = 0: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 8, Y = 7: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 8, Y = 9: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 8, Y = 0: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 9, Y = 7: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 9, Y = 8: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 9, Y = 0: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 0, Y = 7: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 0, Y = 8: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 6, M = 0, Y = 9: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 7, M = 6, Y = 8: D = 1, E = 2, Y = 3, N = 4, R = 5, S = 7, M = 6, Y = 9: etc. 1,814,400 possible solutions to check CITS1401 problem-solving techniques NO NO NO NO NO NO NO NO NO NO NO NO NO NO 31 Backtracking example: verbal arithmetic • D=1 – E=2 • Y = 0: NO – 2,520 possibilities discarded • Y = 9, 8, 7, 6, 5, 4: NO – 15,120 more discarded • Y=3 –N=4 » R = 5: NO – 60 more discarded » etc. – E=3 – etc. • D=2 • etc. CITS1401 problem-solving techniques 32 Backtracking issues • Design a representation that allows building, checking, and discarding of partial solutions • Discard partial solutions as early as possible • Incorporate all available clues! CITS1401 problem-solving techniques 33