Is there a way to run MapReduce with MongoDB as input and output to HDFS?

+3 votes
414 views
posted Oct 1, 2015 by anonymous
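
For context: the usual approach is the mongo-hadoop connector, which supplies an InputFormat that reads documents straight from MongoDB while the job writes its output to HDFS like any other MapReduce job. A minimal sketch, assuming the connector (com.mongodb.hadoop) is on the classpath; the URI, output path, and ExportMapper are illustrative placeholders:

    import java.io.IOException;

    import com.mongodb.hadoop.MongoInputFormat;
    import com.mongodb.hadoop.util.MongoConfigUtil;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.bson.BSONObject;

    public class MongoToHdfsJob {

        // MongoInputFormat hands each mapper a document _id and the BSON document.
        public static class ExportMapper extends Mapper<Object, BSONObject, Text, Text> {
            @Override
            protected void map(Object id, BSONObject doc, Context ctx)
                    throws IOException, InterruptedException {
                ctx.write(new Text(id.toString()), new Text(doc.toString()));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Input comes from MongoDB (URI is a placeholder).
            MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/mydb.mycollection");

            Job job = Job.getInstance(conf, "mongo-to-hdfs");
            job.setJarByClass(MongoToHdfsJob.class);
            job.setInputFormatClass(MongoInputFormat.class);
            job.setMapperClass(ExportMapper.class);
            job.setNumReduceTasks(0); // map-only export
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            // Output goes to HDFS, not back to MongoDB.
            job.setOutputFormatClass(TextOutputFormat.class);
            FileOutputFormat.setOutputPath(job, new Path("hdfs:///user/me/mongo-export"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }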


1 Answer

+1 vote
answer Nov 16, 2015 by anonymous
Similar Questions
0 votes

Can you please clarify whether the Mongo-Hadoop connector supports taking aggregation-query output as the input to MapReduce jobs? I know there is support for find queries through the mongo.input.query configuration (see the sketch below).
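
For reference, the documented find-query path looks like this; whether aggregation output can feed a job the same way is exactly the open question above. A sketch (the "status" filter is illustrative):

    import org.apache.hadoop.conf.Configuration;

    public class MongoInputQueryExample {
        public static Configuration withFindFilter() {
            Configuration conf = new Configuration();
            // mongo.input.query takes a JSON find() filter; only matching
            // documents are fed to the mappers.
            conf.set("mongo.input.query", "{\"status\": \"active\"}");
            return conf;
        }
    }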

+1 vote

I would like to know if you have any examples or tutorials for learning Hadoop MapReduce on MongoDB in Java.

0 votes

I was trying to implement a Hadoop/Spark audit tool, but I ran into a problem: I can't get the input file location and file name. I can get the username, IP address, time, and user command from hdfs-audit.log, but when I submit a MapReduce job I can't see the input file location in either the Hadoop logs or the ResourceManager.

Does Hadoop have an API or log that exposes this information through some configuration? If so, what should I configure?
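
One possible workaround, sketched under the assumption that the jobs use FileInputFormat: log the configured input paths yourself at submission time. The same values also live in the job configuration under mapreduce.input.fileinputformat.inputdir.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class InputPathLogger {
        // Print the input paths configured on a job before it is submitted.
        public static void logInputPaths(Job job) {
            for (Path p : FileInputFormat.getInputPaths(job)) {
                System.out.println("MR input path: " + p);
            }
        }
    }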

+1 vote

In the XML configuration files of Hadoop 2.x, "mapreduce.input.fileinputformat.split.minsize" is listed and can be set, but how do I set "mapreduce.input.fileinputformat.split.maxsize" in the XML file? I also need to set it in my MapReduce code, as sketched below.
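
In XML it is just another property in mapred-site.xml, set the same way as split.minsize. Programmatically, either route below works; the 256 MB value is arbitrary:

    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class SplitSizeConfig {
        public static void configure(Job job) {
            // Programmatic equivalent of mapreduce.input.fileinputformat.split.maxsize:
            FileInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);
            // Or set the property directly, exactly as you would in mapred-site.xml:
            job.getConfiguration().setLong(
                "mapreduce.input.fileinputformat.split.maxsize", 256L * 1024 * 1024);
        }
    }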

+2 votes

Is there a way to configure or increase the 100 MB memory limit on the aggregation pipeline? What if I have a lot of RAM that MongoDB could use?
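
The 100 MB cap applies per pipeline stage and isn't exposed as an ordinary server setting; the documented relief is allowDiskUse, which lets memory-hungry stages spill to temporary files. A sketch with the MongoDB Java driver (the database, collection, and $sort stage are illustrative):

    import java.util.Arrays;

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;

    public class AggregateWithDiskUse {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> coll =
                        client.getDatabase("mydb").getCollection("mycollection");
                // allowDiskUse(true) lets stages that exceed the 100 MB
                // per-stage memory limit spill to temporary files on disk.
                for (Document doc : coll.aggregate(Arrays.asList(
                            new Document("$sort", new Document("field", 1))))
                        .allowDiskUse(true)) {
                    System.out.println(doc.toJson());
                }
            }
        }
    }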

...