Details

Type: Task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: None
Hadoop Flags: Incompatible change
Description
Or, as soon as hadoop 0.19.1 comes out, we need a new hbase release:
From hadoop list:
Yes guys. We observed such problems. They will be common for 0.18.2 and 0.19.0 exactly as you described it when data-nodes become unstable. There were several issues, please take a look
HADOOP-4997 workaround for tmp file handling on DataNodes
HADOOP-4663 - links to other related
HADOOP-4810 Data lost at cluster startup
HADOOP-4702 Failed block replication leaves an incomplete block
....
We run 0.18.3 now and it does not have these problems. 0.19.1 should be the same.

Thanks,
--Konstantin

Zak, Richard [USA] wrote:
> It happens right after the MR job (though once or twice its happened
> during). I am not using EBS, just HDFS between the machines. As for tasks,
> there are 4 mappers and 0 reducers.
>
> Richard J. Zak
>
> -----Original Message-----
> From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of
> Jean-Daniel Cryans
> Sent: Friday, January 23, 2009 13:24
> To: core-user@hadoop.apache.org
> Subject: Re: HDFS loosing blocks or connection error
>
> xlarge is good. Is it normally happening during a MR job? If so, how many
> tasks do you have running at the same moment overall? Also, is your data
> stored on EBS?
>
> Thx,
>
> J-D
>
> On Fri, Jan 23, 2009 at 12:55 PM, Zak, Richard [USA]
> <zak_richard@bah.com> wrote:
>
>> 4 slaves, 1 master, all are the m1.xlarge instance type.
>>
>> Richard J. Zak
>>
>> -----Original Message-----
>> From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
>> Sent: Friday, January 23, 2009 12:34
>> To: core-user@hadoop.apache.org
>> Subject: Re: HDFS loosing blocks or connection error
>>
>> Richard,
>>
>> This happens when the datanodes are too slow and eventually all replicas for a single block are tagged as "bad". What kind of instances are you using?
>> How many of them?
>>
>> J-D
>>
>> On Fri, Jan 23, 2009 at 12:13 PM, Zak, Richard [USA]
>> <zak_richard@bah.com> wrote:
>>
>>> Might there be a reason for why this seems to routinely happen to me when using Hadoop 0.19.0 on Amazon EC2?
>>>
>>> 09/01/23 11:45:52 INFO hdfs.DFSClient: Could not obtain block
>>> blk_-1757733438820764312_6736 from any node: java.io.IOException: No live nodes contain current block
>>> 09/01/23 11:45:55 INFO hdfs.DFSClient: Could not obtain block
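
For context, a minimal sketch (not part of the issue) of the kind of client read that triggers the error quoted in the logs above: reading a file end to end forces the DFSClient to fetch every block, and a block whose replicas have all been marked bad surfaces as the "No live nodes contain current block" IOException. The class name and the path /user/richard/output/part-00000 are hypothetical; it assumes a client configured to point at the HDFS cluster.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockReadCheck {
  public static void main(String[] args) throws IOException {
    // Hypothetical path; pass a real one as the first argument.
    Path path = new Path(args.length > 0 ? args[0] : "/user/richard/output/part-00000");
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataInputStream in = fs.open(path);
    byte[] buf = new byte[64 * 1024];
    long total = 0;
    try {
      int n;
      // Reading the whole file makes the client locate and read every block.
      while ((n = in.read(buf)) > 0) {
        total += n;
      }
      System.out.println("Read " + total + " bytes from " + path);
    } catch (IOException e) {
      // On 0.18.2/0.19.0 clusters with unstable datanodes, this is where the
      // "Could not obtain block ... No live nodes contain current block"
      // failure from the quoted thread shows up.
      System.err.println("Failed while reading " + path + ": " + e.getMessage());
      throw e;
    } finally {
      in.close();
    }
  }
}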