Tuesday, July 23, 2013

Hadoop FAQs

1. You are given a directory SampleDir containing the following files: first.txt, _second.txt, .third.txt, and #fourth.txt. If you provide SampleDir as the input path to the MR job, how many files are processed?

2. You have an external jar file of size 1.3 MB that contains the dependencies required to run your MR job. What steps do you take to copy the jar file to the task trackers?

3. When a job is run, your properties file is copied to the distributed cache so that your map tasks can access it. How do you access the properties file?

4. If you have m mappers and n reducers in a given job, how many copy and write operations will the shuffle and sort phase result in?

5. You have 100 map tasks running, of which 99 have completed and one is running slowly. The system launches a duplicate of the slow task on a different machine and collects the output from whichever attempt completes first. The remaining attempts are killed. What is this phenomenon called?
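
As a hint for question 1: Hadoop's FileInputFormat skips, by default, input files whose names start with an underscore or a dot. The sketch below reproduces only that naming rule in plain Java so it can be run without a cluster. The class and method names (HiddenFileFilterSketch, isVisible, visibleFiles) are hypothetical and not part of Hadoop, and the sketch assumes the job uses the default input filter with no custom PathFilter configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; it does not call any Hadoop API.
public class HiddenFileFilterSketch {

    // Same naming rule as FileInputFormat's default hidden-file filter:
    // a file is skipped when its name starts with '_' or '.'.
    public static boolean isVisible(String fileName) {
        return !fileName.startsWith("_") && !fileName.startsWith(".");
    }

    // Returns the names that would survive the default filter, in input order.
    public static List<String> visibleFiles(List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (isVisible(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> sampleDir = Arrays.asList(
                "first.txt", "_second.txt", ".third.txt", "#fourth.txt");
        // Prints the files that would be handed to the job.
        System.out.println(visibleFiles(sampleDir));
    }
}
```

Under these assumptions, only first.txt and #fourth.txt pass the filter, since a leading '#' is not treated as hidden.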
