Tuesday, May 22, 2012

-libjars not working in custom MapReduce code: how to debug

Application developers bump into this issue often. They ship their custom jars with a MapReduce job using -libjars, but when the job's code refers to classes in those jars, it throws a ClassNotFoundException.

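For -libjars to work, your main class should satisfy the following two conditions.

1) The main class should implement the Tool interface (and be launched through ToolRunner.run(), which parses generic options such as -libjars into the configuration).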


// Wrong usage: Tool interface not implemented
public class WordCount extends Configured {

// Right usage
public class WordCount extends Configured implements Tool {
 
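2) The main class should get the existing configuration using the getConf() method rather than creating a new Configuration instance. The -libjars setting lives in the configuration that was handed to your tool, so a fresh instance does not carry it.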


// Wrong usage: creating a new Configuration instance
public int run(String[] args) throws Exception {
    Configuration conf = new Configuration();

// Right usage: reuse the configuration that was passed in
public int run(String[] args) throws Exception {
    Configuration conf = getConf();
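
To see why the second condition matters, here is a minimal sketch in plain Java that runs without Hadoop. Everything in it is a simplified stand-in written for illustration: Conf, DemoTool, runTool, GoodTool and BadTool are hypothetical names, not Hadoop's real classes, and the "tmpjars" key is assumed here to mirror the property Hadoop uses for the jar list. The idea it shows is that the runner parses -libjars into one configuration object and hands that object to the tool, so a tool that builds its own configuration never sees the jars.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; these are NOT Hadoop's classes.
final class LibJarsDemo {

    /** Stand-in for a configuration object: just a key/value map. */
    static final class Conf {
        private final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Stand-in for a class that extends Configured and implements Tool. */
    abstract static class DemoTool {
        private Conf conf;
        void setConf(Conf conf) { this.conf = conf; }
        Conf getConf() { return conf; }
        /** Returns the jar list the job would see (null if none). */
        abstract String run(String[] remainingArgs);
    }

    /**
     * Stand-in for the runner: pulls -libjars out of the arguments, stores it
     * in a configuration, gives that configuration to the tool, and passes on
     * only the remaining arguments. The "tmpjars" key is an assumption.
     */
    static String runTool(DemoTool tool, String[] args) {
        Conf conf = new Conf();
        List<String> rest = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-libjars".equals(args[i]) && i + 1 < args.length) {
                conf.set("tmpjars", args[++i]);
            } else {
                rest.add(args[i]);
            }
        }
        tool.setConf(conf);
        return tool.run(rest.toArray(new String[0]));
    }

    /** Right usage: reuses the configuration prepared by the runner. */
    static final class GoodTool extends DemoTool {
        @Override
        String run(String[] remainingArgs) {
            Conf conf = getConf();
            return conf.get("tmpjars");
        }
    }

    /** Wrong usage: a fresh configuration knows nothing about -libjars. */
    static final class BadTool extends DemoTool {
        @Override
        String run(String[] remainingArgs) {
            Conf conf = new Conf();
            return conf.get("tmpjars");
        }
    }

    public static void main(String[] args) {
        String[] cli = {"-libjars", "a.jar,b.jar", "in", "out"};
        System.out.println("getConf():  " + runTool(new GoodTool(), cli)); // a.jar,b.jar
        System.out.println("new Conf(): " + runTool(new BadTool(), cli));  // null
    }
}
```

In a real job the equivalent check is to print the jar-list property from the configuration you pass to the Job: if it is empty, the options were either not parsed or were parsed into a configuration you are not using.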

5 comments:

  1. My project is not working. I followed the steps given above, but I still get:

    Exception in thread "main" java.lang.ClassNotFoundException:
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:249)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:201).

    It seems that even the main class has not been found.
    I use the following command:
    hadoop jar ../lib/train_javaexample.jar train.examples.app.TypeSum -libjars=${LIBJAR} /user/whoiam/in /user/whoiam/out

    Project code:
    public class TypeSum extends Configured implements Tool {
        ...
        public final int run(final String[] args) throws Exception {
            Job job = new Job(super.getConf());
            job.setJarByClass(TypeSum.class);
            job.setMapperClass(TypeSumMapper.class);
            job.setCombinerClass(TypeSumReducer.class);
            job.setReducerClass(TypeSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            //FileInputFormat.addInputPath(job, new Path(args[0]));
            //FileOutputFormat.setOutputPath(job, new Path(args[1]));

            return (job.waitForCompletion(true) ? 0 : 1);
        }

    }
    Could you help please?
