2012 = 2 2 * 503 “Might be useful tomorrow.” – MG PotW Solution int dp(int pos, int k) { if (k > lim) return -(1 << 30); if (pos == n) return 0; int res = mem[pos][k]; if (res != -1) return res; res = 0; int cur = k % 2; int stay = (s[pos] == res = max(res, dp(pos int flip = (s[pos] == res = max(res, dp(pos return res; } '0' + (1 - cur)); + 1, k) + stay); '0' + (cur)); + 1, k + 1) + flip); Gunn Programming Contest • Feb. 25, 9AM – 3PM • ProCo-like contest at Gunn High School o Teams of up to 3 o 2 hour round, 12 problems o “A good number” of the problems at APCS A level, with more advanced problems also included • Will be worth PotW credit (exact credit TBA) • Register at http://bit.ly/gunnproco12 • Other dates: o March 17 – Harker Programming Contest o May 19 – Stanford ProCo February USACO! • Today’s the last day to take the February USACO! o Take it if you haven’t already! (Even though AMC is tomorrow…) • 5 points PotW credit for participating (as usual) Compression • Representing data with smaller amounts of data o Note that all data is essentially binary • There are various file formats for compressed files o o o o o o General files: "zip", "tgz", "tar" In fact, "jar"s are really just augmented "zip" files Images: "jpg", "png", "gif" Sound: "mp3", "wav", "wma" Movies: "so many we're not even going to try to list some of them here" Executables, 3D Models, etc. • Two distinct variations: lossless and lossy Lossless • Compress the data so that the process is reversible • No scheme can guarantee decrease in file size o One-to-one mapping between possible file input/output makes this impossible • Instead, schemes often take advantage of repetition in data o Text files often repeat certain letters more frequently o Images might feature certain colors o Neighboring pixels are often close in color • Examples: ".png", most general formats Simple Encoding Algorithms • Run-Length Encoding (RLE) o Scan left to right on the data o Group identical elements that appear consecutively o Store the first element + a count • Huffman Trees o Depending on how often a certain character or word appears, assign it a binary id o More common words should be assigned more bits o However, no id can be a prefix of another Lossy • Take advantage of "unnoticeable" imperfections in data o Audio, image, and movie files depend on this technique • Often use differencing techniques o E.g.: In a movie, each successive frame can be differenced from the previous one to reduce data storage • Often have ability to specify quality Images • Specialized optimizations are designed to trick the human eye o ".jpg" files are notorious for their blockiness and failure to preserve sharp edges o ".gif' files omit colors and then use a dithering technique to emulate shades of color PotW – Silly Compression As part of a new compression algorithm, Bessie is trying to modify a sequence of data of length N so that it contains at most K distinct values. However, it costs |a - b| additional bits to change an integer a into an integer b. Tell her the optimal sequence she should modify the original sequence into. If there are multiple solutions, output any. For 30 points, solve for N < 20. For 50 points, solve for N < 100. For bragging rights, solve for N < 1000. Time Limit: 2 seconds. Note: The second and third tasks are very difficult Silly Compression (cont.) Sample Input #1: 3 2 (N K) 7 6 3 (elements of the original sequence) Output: 1 (minimal cost is 1) 6 6 3 (7 7 3 is another option) Sample Input #2: 72 3 4 5 6 9 10 11 Output: 6 4 4 4 4 10 10 10 (5 5 5 5 10 10 10 is another option) Hints • Note that an optimal sequence always exists such that: o Each element of the original sequence should be shifted to its closest relative in the new sequence o Each element of the new sequence should appear in the original sequence o Thus, the first task can be solved with a exponential-time brute force • Try all n choose k possibilities • 20 choose 10 fits well within time limit • Second task: o Note that the element of a sequence that minimizes the total distance to the other elements is the median o Use dynamic programming after sorting