We want to set the heartbeat timeout for a tasktracker.
If the tasktracker does not send a heartbeat for 60 seconds, it should be marked as lost. I found the parameter mapreduce.jobtracker.expire.trackers.interval, which sounds right to me.
I defined
mapreduce.jobtracker.expire.trackers.interval 60000
in mapred-site.xml on all servers and restarted the jobtracker and all tasktrackers.
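For reference, the entry has the standard Hadoop property form sketched below. Only the name and value come from the setting above; the surrounding configuration element and the comment are assumptions about how the file is laid out.

```xml
<configuration>
  <!-- Value is in milliseconds: 60000 ms = 60 seconds -->
  <property>
    <name>mapreduce.jobtracker.expire.trackers.interval</name>
    <value>60000</value>
  </property>
</configuration>
```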
I started a benchmark with "hadoop jar hadoop-examples.jar randomwriter rand", and every tasktracker got 2 jobs. It is a small test environment.
On one tasktracker I stopped the network. On the jobtracker I could see the "Seconds since heartbeat" value increasing, but after 60 seconds the tasktracker was still in the overview. I found nothing in the jobtracker log either.
Only after more than 600 seconds did I find the message
org.apache.hadoop.mapred.JobTracker: Lost tracker .....
After that, the tasktracker was no longer shown on the jobtracker. Is this not the right setting?