Let MapReduce Programs Fly
Tang Zhenkun
Email: [email protected]
Overview
MapReduce Basics
Hadoop Counters
Hadoop Log Info (slf4j)
Unit Test (JUnit, MRUnit)
Guava (Google Core Libraries for Java 1.6+)
Others
References
MapReduce Basics
Hadoop job submit flow
Hadoop Web GUI
What if errors occur?
1. The execution details are invisible.
2. There is no step-through debugging.
Don't just pray!
Errors
Command errors, syntax errors
Check, and check, and check again…
Logic errors
These are the errors we really need to deal with.
Hadoop Counters
Hadoop Standard Counters
Map output records
Reduce output records
Custom Counters
How to define a custom MapReduce counter?
Input file
context.getCounter(counterName);            // counterName: an enum constant
context.getCounter(groupName, counterName); // both arguments: Strings
Hadoop Log Info
Stdout does not help: System.out.println() output from a task does not appear on the client console.
Use a logger instead.
E.g. log4j, slf4j
Hadoop Log Info – Slf4j
SLF4J: Simple Logging Facade for Java.
Simple, easy to use.
Unit Test
TDD: Test-Driven Development
Unit Test – JUnit
JUnit (unit testing for Java)
#Unit (for C#)
xUnit
How to write unit tests using JUnit?
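The children's oil-sharing problem: two children go to buy oil. One brings an
empty 10-ounce bottle; the other brings two empty bottles, a 7-ounce one and a
3-ounce one. They planned to buy 10 ounces each, but they did not bring enough
money, so they bought just 10 ounces between them. On the way home they want
to split the oil equally, but they have no other tools. Using only the three
bottles (10 ounces, 7 ounces, 3 ounces), measure out exactly two portions of
5 ounces each.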
How to write unit tests using JUnit?
Define a state:
A triple whose components are the contents of the 10-ounce, 7-ounce, and
3-ounce bottles.
Define the operation: multiAndPlus(X, b)
E.g. pour oil from the first (10-ounce) bottle into the third one.
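The state and pour operation described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class `OilPuzzle` and the methods `pour` and `solve` are hypothetical names, not the `Mat.java` or `multiAndPlus(X, b)` from the slides, and the breadth-first search is one possible way to find a pouring sequence.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical helper for illustration; not the Mat.java from the slides.
class OilPuzzle {
    // Capacities of the 10-ounce, 7-ounce, and 3-ounce bottles.
    static final int[] CAPACITY = {10, 7, 3};

    // Pours from bottle `from` into bottle `to` until the source is empty
    // or the target is full, and returns the resulting state as a new array.
    static int[] pour(int[] state, int from, int to) {
        int amount = Math.min(state[from], CAPACITY[to] - state[to]);
        int[] next = state.clone();
        next[from] -= amount;
        next[to] += amount;
        return next;
    }

    // Breadth-first search from (10, 0, 0) to (5, 5, 0).
    // Returns the visited states in order, or an empty list if unreachable.
    static List<int[]> solve() {
        int[] start = {10, 0, 0};
        int[] goal = {5, 5, 0};
        Map<String, int[]> parent = new HashMap<>(); // state key -> previous state
        Queue<int[]> queue = new ArrayDeque<>();
        parent.put(Arrays.toString(start), null);
        queue.add(start);
        while (!queue.isEmpty()) {
            int[] cur = queue.remove();
            if (Arrays.equals(cur, goal)) {
                List<int[]> path = new ArrayList<>();
                for (int[] s = cur; s != null; s = parent.get(Arrays.toString(s))) {
                    path.add(s);
                }
                Collections.reverse(path);
                return path;
            }
            for (int from = 0; from < 3; from++) {
                for (int to = 0; to < 3; to++) {
                    if (from == to) continue;
                    int[] next = pour(cur, from, to);
                    String key = Arrays.toString(next);
                    if (!parent.containsKey(key)) { // skip states already seen
                        parent.put(key, cur);
                        queue.add(next);
                    }
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        for (int[] s : solve()) {
            System.out.println(Arrays.toString(s));
        }
    }
}
```

Small pure methods like `pour` are convenient targets for unit tests, since each call can be checked against a hand-computed state.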
How to write unit tests using JUnit?
MatTest.java
Mat.java
How to write unit tests using JUnit?
@Test
@Before, @After
Assert*
Finally, run it as a normal Java application.
Unit Test - MRUnit
MRUnit: unit testing for Hadoop MapReduce
How to write unit tests using MRUnit?
MapDriver
ReduceDriver
MapReduceDriver
withInput(key, value)
withOutput(key, value)
runTest()
Finally, run it as a normal Java application.
Assertions
See "The Art of Assertion" in Chapter 5 of Programming Pearls, Second Edition.
Assert in Java
assert <boolean expression>
assert <boolean expression> : <error message>
But you must run the application with assertions explicitly enabled
(java -ea <className>); they are disabled by default.
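The effect of the `-ea` flag can be seen with a small sketch. The class `AssertDemo` and its methods are hypothetical names chosen for illustration; the sketch only uses the two `assert` forms listed above.

```java
// Hypothetical demo class; all names are illustrative only.
class AssertDemo {
    // Returns true only when the JVM runs with assertions enabled (-ea).
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // this assignment executes only under -ea
        return enabled;
    }

    // Halves an even number; the assert states and checks the precondition.
    static int half(int n) {
        assert n % 2 == 0 : "n must be even, but was " + n;
        return n / 2;
    }

    // Reports whether half(3) is rejected with an AssertionError.
    static boolean rejectsOddInput() {
        try {
            half(3);
            return false;
        } catch (AssertionError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
        System.out.println("half(3) rejected:   " + rejectsOddInput());
    }
}
```

Running it once with `java -ea AssertDemo` and once without the flag shows that the same check is silently skipped when assertions are disabled, which is why `assert` should not guard behavior the program depends on.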
Preconditions in Guava
Guava: Google Core Libraries for Java 1.6+
Preconditions
checkArgument(i >= 0, "Argument was %s but expected nonnegative", i);
checkArgument(i < j, "Expected i < j, but %s > %s", i, j);
Guava
Other useful libraries.
http://code.google.com/p/guava-libraries/
How to write a custom partitioner in Hadoop?
Custom Partitioner
Custom data type: CustomType
Partitioner: return key % 3
When changed to return key / 3, with the number of reduce tasks set to 4:
the output is totally ordered across the reducers.
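Why `key / 3` gives a total order while `key % 3` does not can be checked with a plain-Java simulation. This sketch does not use Hadoop at all: the class `PartitionDemo`, its methods, and the assumption that the keys are the integers 0 to 11 are all hypothetical. In a real job the same arithmetic would live in `getPartition(key, value, numPartitions)` of a `Partitioner` subclass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.IntUnaryOperator;

// Plain-Java simulation for illustration; this is not Hadoop code.
class PartitionDemo {
    // Assigns each key to a partition, keeps keys sorted within each
    // partition (as each reducer sees its keys in sorted order), then
    // concatenates the partitions in partition-number order.
    // Assumes distinct keys, since TreeSet drops duplicates.
    static List<Integer> concatenatedOutput(int[] keys, int numPartitions,
                                            IntUnaryOperator partitioner) {
        List<TreeSet<Integer>> partitions = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            partitions.add(new TreeSet<>());
        }
        for (int key : keys) {
            partitions.get(partitioner.applyAsInt(key)).add(key);
        }
        List<Integer> out = new ArrayList<>();
        for (TreeSet<Integer> part : partitions) {
            out.addAll(part);
        }
        return out;
    }

    // True if the list is in non-decreasing order.
    static boolean isSorted(List<Integer> xs) {
        for (int i = 1; i < xs.size(); i++) {
            if (xs.get(i - 1) > xs.get(i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] keys = {7, 2, 11, 0, 5, 9, 3, 8, 1, 10, 4, 6}; // assumed keys 0..11
        // Hash-style partitioning: each partition is sorted, the whole is not.
        System.out.println(concatenatedOutput(keys, 3, k -> k % 3));
        // Range-style partitioning: partition i holds keys 3i..3i+2.
        System.out.println(concatenatedOutput(keys, 4, k -> k / 3));
    }
}
```

With `key / 3`, every key in partition i is smaller than every key in partition i + 1, so reading the reducer outputs in order yields globally sorted data.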
Others
Maven: automatic dependency management
Hadoop remote debugging: JDWP, Java Debug Wire Protocol
HPROF: analysis tools in the JDK
References
Hadoop: The Definitive Guide, Second Edition.
http://www.junit.org/
http://incubator.apache.org/mrunit/
http://code.google.com/p/guava-libraries/
http://insightfullogic.com/blog/2011/oct/21/5reasons-use-guava/