Can Hadoop take input from multiple directories and files?


I set FileInputFormat as the Hadoop input format. With arg[0]+"/*/*/*" as the input path, Hadoop reported that the pattern matched no files.

What I want is to read from multiple files laid out like this:

 directory1
 ---directory11
    ---directory111
         --f1.txt
         --f2.txt
 ---directory12
 directory2
 ---directory21

Is this possible in Hadoop? Thanks!

Yes, you can take input from multiple directories and files by using the * operator. The most likely cause of the error is that the "arg[0]" argument isn't correct, so the pattern isn't finding any files.
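One way to sanity-check a fixed-depth pattern such as arg[0]+"/*/*/*" is to test it against the expected paths outside Hadoop. The sketch below uses the JDK's own java.nio glob matcher, not Hadoop's glob implementation, so it is only an approximation; the two agree that a single * never crosses a "/" boundary. The base path "/data" and the class name GlobDepthDemo are hypothetical, and the code assumes a Unix-like default file system.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class GlobDepthDemo {
    // Returns true if the java.nio glob `pattern` matches `path`.
    // Note: this is the JDK matcher, used here as a stand-in for Hadoop's globbing.
    static boolean matches(String pattern, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // Hypothetical value of arg[0]: the parent of directory1 and directory2.
        String pattern = "/data" + "/*/*/*";

        // Exactly three levels below the base: matches.
        System.out.println(matches(pattern, "/data/directory1/directory11/directory111"));        // true
        // Four levels below the base: does not match.
        System.out.println(matches(pattern, "/data/directory1/directory11/directory111/f1.txt")); // false
        // Two levels below the base: does not match.
        System.out.println(matches(pattern, "/data/directory2/directory21"));                     // false
    }
}
```

Each * stands for exactly one path level, so the pattern only matches entries that sit exactly three levels below the base. If arg[0] points one level higher or lower than intended, the same pattern can match nothing at all.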

As an alternative, you can use FileInputFormat.addInputPath, or, if you need separate input formats or mappers per path, the MultipleInputs class.

Example of adding a basic input path:

 // myInputPath is an org.apache.hadoop.fs.Path; call once per input directory
 FileInputFormat.addInputPath(job, myInputPath);

Here is an example using MultipleInputs:

 MultipleInputs.addInputPath(job, inputPath1, TextInputFormat.class, MyMapper.class);
 MultipleInputs.addInputPath(job, inputPath2, TextInputFormat.class, MyOtherMapper.class);

This other question is similar and also has answers: "Hadoop reduce multiple input formats".

