Hadoop Error: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.

If we face the following error while running Hadoop jobs, it can be seen on the JobTracker web UI at:

http://<masternode ip>:50030/jobtracker.jsp

Error:

“Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.”

Description:

The respective box name (<machine name>) is not updated consistently in all the cluster configuration files (the Hadoop configuration files and /etc/hosts). Because the name does not resolve the same way on every node, reduce tasks repeatedly fail to fetch map outputs from that node, and after MAX_FAILED_UNIQUE_FETCHES failed fetches the job bails out with the error above.

Steps to resolve:

Step 1:

Stop all the Hadoop services on the master node:

hduser: /usr/local/hadoop/bin/stop-all.sh

Step 2:

Edit the Hadoop configuration files on all cluster nodes and update the box name wherever it appears (a sample entry is shown after the file list).

Location: /usr/local/hadoop/conf/

List of files:

core-site.xml

mapred-site.xml

hdfs-site.xml

slaves

masters
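
For reference, the box name typically shows up in the fs.default.name property of core-site.xml and the mapred.job.tracker property of mapred-site.xml. A minimal sketch, assuming the master's box name is "master" and the commonly used ports 54310 and 54311 (your hostname and ports may differ):

core-site.xml:

<property>
  <name>fs.default.name</name>
  <!-- the hostname must match the master's entry in /etc/hosts -->
  <value>hdfs://master:54310</value>
</property>

mapred-site.xml:

<property>
  <name>mapred.job.tracker</name>
  <!-- the hostname must match the master's entry in /etc/hosts -->
  <value>master:54311</value>
</property>

The masters and slaves files simply list one box name per line.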

Step 3:

Edit the following file on all cluster nodes and update the entries in the format below (IP address and box name separated by a space):

/etc/hosts

<ip address> <box name>
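
For example, on a cluster with one master and one slave, /etc/hosts on every node might contain the entries below; the IP addresses and box names are placeholders for illustration only:

192.168.0.1 master
192.168.0.2 slave1

After editing, verify that each box name resolves correctly from every node, e.g. ping -c 1 slave1.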

Step 4:

Perform the following commands on all the cluster nodes. Note that /app/hadoop/tmp is the Hadoop temporary directory (typically configured as hadoop.tmp.dir in core-site.xml), so removing it discards any existing HDFS data. The final namenode -format command needs to be run only on the master node.

root: rm -rf /app/

root: mkdir -p /app/hadoop/tmp

root: chmod -R 0755 /app/

root: chown -R hduser:hadoop /app

hduser: /usr/local/hadoop/bin/hadoop namenode -format
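
If the namenode -format step fails with a permission error, confirm the ownership change took effect (a quick check, assuming the layout above):

root: ls -ld /app/hadoop/tmp

The output should show hduser as the owner and hadoop as the group.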

Step 5:

Start all the Hadoop services on the master node:

hduser: /usr/local/hadoop/bin/start-all.sh

Step 6:

Check whether all the expected services are running on each node:

hduser: jps

Master node: 6 services

Jps

DataNode

TaskTracker

SecondaryNameNode

NameNode

JobTracker

Slave nodes: 3 services

Jps

DataNode

TaskTracker
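
For reference, jps output on the master node looks roughly like the following; the process IDs shown are placeholders and will differ on every run:

2481 NameNode
2693 DataNode
2905 SecondaryNameNode
3117 JobTracker
3329 TaskTracker
3541 Jps

If any service is missing, check its log file under /usr/local/hadoop/logs before rerunning the job.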
