Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-1296

Bulk upload can overwhelm a datanode filling its disk because HDFS deletes lag

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Cannot Reproduce
    • None
    • None
    • None
    • None

    Description

      From a private mail with Jérôme Thièvre:

      "Two times I got the same kind of problems described by schubert zhang in the thread *HDFS unbalance issue. (HBase over HDFS). *The first time, I found one of my regionserver with a full disk, whereas the others were at 10% of their capacity. I saw a lot of move file command at hadoop level from the full regionserver to others, but after check on the hdfs, I can see that hadoop copies the file but fails to delete it. As the regionserver disk was full, it didn't work anymore, even after hadoop and hbase restart. I had to delete all the data.

      "The second time this problem occurs I stopped the row insertion process before the disk was full. After hadoop and hbase restart, hadoop has deleted the files and the system was operational.
      It seems that in some cases where the system is heavily loaded with continuous writes and compactions, hadoop can't remove files."

      Attachments

        Activity

          People

            Unassigned Unassigned
            stack Michael Stack
            Votes:
            1 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: