Let MapReduce Programs Fly
Tang Zhenkun
Email: tangzk2011@163.com

Overview
  - MapReduce Basics
  - Hadoop Counters
  - Hadoop Log Info (SLF4J)
  - Unit Test (JUnit, MRUnit)
  - Guava (Google Core Libraries for Java 1.6+)
  - Others
  - References

MapReduce Basics
  - Hadoop job submission flow
  - Hadoop Web GUI

MapReduce Basics: what if errors occur during the job submission flow?
  1. The details are not visible.
  2. There is no step-through debugging.
  Don't just pray!

Errors
  - Command errors and syntax errors: check, check, and check again.
  - Logic errors: these are the ones we need to deal with.

Hadoop Counters
  - Hadoop standard counters, e.g. Map output records, Reduce output records
  - Custom counters

How to define a custom MapReduce counter?
  - Example: counting over the input file
  - context.getCounter(counterName);
  - context.getCounter(groupName, counterName);

Hadoop Log Info
  - Stdout does not work: avoid System.out.println().
  - Use a logger, e.g. log4j or SLF4J.

Hadoop Log Info: SLF4J
  - SLF4J: Simple Logging Facade for Java.
  - Simple and easy to use.

Unit Test
  - TDD: Test-Driven Development

Unit Test: JUnit
  - JUnit (unit testing for Java)
  - NUnit (unit testing for C#)
  - The xUnit family

How to write unit tests using JUnit?
  - Example problem (the children's oil-splitting puzzle): two children go to buy oil. One brings an empty one-jin bottle; the other brings an empty seven-liang bottle and an empty three-liang bottle. They planned to buy one jin of oil each, but they did not bring enough money, so together they bought a single jin (10 liang). On the way home they want to split this oil evenly, but they have no other tools. Using only the three bottles (one jin, seven liang, three liang), measure out exactly two half-jin portions. (The slides below treat one liang as one ounce.)
  - Define a state: three components, one each for the oil in the 10-ounce, 7-ounce, and 3-ounce bottles.
  - Define the operation: multiAndPlus(X, b). E.g. pour from the first (10-ounce) bottle into the third one.
  - Source files: MatTest.java, Mat.java
  - Annotations: @Test, @Before, @After
  - Assertions: Assert*
  - Finally, run it as a normal Java application.

Unit Test: MRUnit
  - MRUnit: unit testing for Hadoop MapReduce

How to write unit tests using MRUnit?
  - Drivers: MapDriver, ReduceDriver, MapReduceDriver
  - withInput(key, value)
  - withOutput(key, value)
  - runTest()
  - Finally, run it as a normal Java application.

Assertions
  - "The Art of Assertion", in Chapter 5 of Programming Pearls, Second Edition.
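The bottle puzzle from the JUnit slides also gives a compact way to see assertions at work. The sketch below is only an illustration under stated assumptions: the class OilPuzzle and the helpers pour and run are hypothetical names, not the author's Mat.java or multiAndPlus(X, b), and the nine-pour sequence is just one known solution. Each pour checks two invariants with assert: no oil is lost, and no bottle overflows. The assert checks only fire when the JVM is started with java -ea.

```java
import java.util.Arrays;

public class OilPuzzle {
    // Bottle capacities in ounces: 10, 7 and 3, as on the slides.
    static final int[] CAPACITY = {10, 7, 3};

    // Hypothetical helper: pour from bottle `from` into bottle `to` until the
    // source is empty or the target is full. Returns a new state array.
    static int[] pour(int[] state, int from, int to) {
        assert from != to : "cannot pour a bottle into itself";
        int amount = Math.min(state[from], CAPACITY[to] - state[to]);
        int[] next = state.clone();
        next[from] -= amount;
        next[to] += amount;
        // Invariants: the total stays at 10 ounces and the target does not overflow.
        assert Arrays.stream(next).sum() == 10 : "oil was lost or created";
        assert next[to] <= CAPACITY[to] : "bottle " + to + " overflowed";
        return next;
    }

    // Applies a sequence of {from, to} pours to the start state (10, 0, 0).
    static int[] run(int[][] pours) {
        int[] state = {10, 0, 0};
        for (int[] p : pours) {
            state = pour(state, p[0], p[1]);
        }
        return state;
    }

    public static void main(String[] args) {
        // One known solution: nine pours, bottles indexed 0 (10 oz), 1 (7 oz), 2 (3 oz).
        int[][] solution = {
            {0, 1}, {1, 2}, {2, 0}, {1, 2}, {2, 0}, {1, 2}, {0, 1}, {1, 2}, {2, 0}
        };
        System.out.println(Arrays.toString(run(solution))); // prints [5, 5, 0]
    }
}
```

The same run(...) call could serve as the body of a JUnit @Test method, with the final state compared against {5, 5, 0} by an Assert* call instead of being printed.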
Assert in Java
  - assert <boolean expression>
  - assert <boolean expression> : <error message>
  - Assertions are disabled by default, so you must enable them explicitly when running the application (java -ea <className>).

Preconditions in Guava
  - Guava: Google Core Libraries for Java 1.6+
  - Preconditions:
      checkArgument(i >= 0, "Argument was %s but expected nonnegative", i);
      checkArgument(i < j, "Expected i < j, but %s > %s", i, j);

Guava
  - Other useful libraries.
  - http://code.google.com/p/guava-libraries/

How to write a custom partitioner in Hadoop?
  - Custom Partitioner
  - Custom data type: CustomType
  - Partitioner: return key % 3
  - When changed to return key / 3, with the number of reduce tasks changed to 4: total ordering.

Others
  - Maven: automatic dependency management
  - Hadoop remote debugging: JDWP (Java Debug Wire Protocol)
  - HPROF: analysis tools in the JDK

References
  - Hadoop: The Definitive Guide, Second Edition.
  - http://www.junit.org/
  - http://incubator.apache.org/mrunit/
  - http://code.google.com/p/guava-libraries/
  - http://insightfullogic.com/blog/2011/oct/21/5reasons-use-guava/
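Appendix sketch for the partitioner slides: why switching from key % 3 to key / 3 with 4 reduce tasks yields total ordering. This is a plain-Java simulation, not Hadoop code. The class PartitionSketch and the method concatenatedOutput are hypothetical, and the key range 0..11 is an assumption chosen so that key / 3 maps onto exactly 4 reducers; the slides do not state the actual keys. Each simulated reducer sorts its own keys, and the reducers' outputs are concatenated in reducer order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.IntUnaryOperator;

public class PartitionSketch {
    // Simulates the shuffle: the partitioner maps each key to a reducer index,
    // every reducer sees its keys in sorted order, and the per-reducer outputs
    // are concatenated in reducer order. Assumes distinct keys (TreeSet drops
    // duplicates) and a partitioner that returns an index in [0, numReducers).
    static List<Integer> concatenatedOutput(int[] keys, int numReducers,
                                            IntUnaryOperator partitioner) {
        List<TreeSet<Integer>> reducers = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            reducers.add(new TreeSet<>());
        }
        for (int key : keys) {
            reducers.get(partitioner.applyAsInt(key)).add(key);
        }
        List<Integer> out = new ArrayList<>();
        for (TreeSet<Integer> reducer : reducers) {
            out.addAll(reducer);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed input: the keys 0..11 in arbitrary order.
        int[] keys = {7, 2, 11, 0, 5, 9, 3, 1, 10, 4, 8, 6};

        // key % 3 with 3 reducers: each part is sorted, the whole is not.
        System.out.println(concatenatedOutput(keys, 3, k -> k % 3));
        // prints [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]

        // key / 3 with 4 reducers: reducer i holds only keys 3i..3i+2, so the
        // concatenation is sorted end to end.
        System.out.println(concatenatedOutput(keys, 4, k -> k / 3));
        // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
    }
}
```

The design point is that a range-style partitioner keeps every key sent to reducer i smaller than every key sent to reducer i + 1, which a modulo partitioner does not.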