Can Hadoop take input from multiple directories and files?
I set the FileInputFormat input path for my Hadoop job to arg[0]+"/*/*/*",
but it said the pattern matched no files.
What I want is to read multiple files laid out like this:
directory1
---directory11
---directory111
--f1.txt
--f2.txt
---directory12
directory2
---directory21
Is this possible in Hadoop? Thanks!
You can take input from multiple directories and files by using the * operator. Most likely the "arg[0]" argument isn't correct, and therefore the pattern isn't finding the files.
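To see how the number of * components in a pattern relates to directory depth, here is a minimal sketch. It uses java.nio's glob matcher as a stand-in, not Hadoop's own glob code, so treat it only as an illustration of how each * covers one path level. The class name GlobDepthDemo, the base directory "input" (standing in for arg[0]) and the candidate paths are hypothetical, taken from the layout in the question.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobDepthDemo {

    // Returns true when the glob pattern matches the whole candidate path.
    // In java.nio globs, a single * does not cross a directory boundary.
    public static boolean matches(String glob, String candidate) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        Path path = Paths.get(candidate);
        return matcher.matches(path);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for arg[0] followed by three wildcard levels.
        String pattern = "input/*/*/*";

        String[] candidates = {
            "input/directory1/directory11/directory111",        // three levels below input
            "input/directory1/directory11/directory111/f1.txt", // four levels below input
            "input/directory2/directory21",                     // two levels below input
        };

        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + matches(pattern, candidate));
        }
    }
}
```

Only the first candidate matches, because the pattern has exactly three wildcard levels after the base directory. If the base directory itself is wrong, nothing matches at any depth, which is consistent with a "matched no files" error.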
As an alternative, you can use FileInputFormat.addInputPath, or, if you need separate formats or mappers, the MultipleInputs class can be used.
Example of adding a basic path:
FileInputFormat.addInputPath(job, myInputPath);
Here is an example of MultipleInputs:
MultipleInputs.addInputPath(job, inputPath1, TextInputFormat.class, MyMapper.class);
MultipleInputs.addInputPath(job, inputPath2, TextInputFormat.class, MyOtherMapper.class);
This other question is similar and has good answers: Hadoop reduce multiple input formats.