  1. HBase
  2. HBASE-21751

WAL creation failure during region open may cause region assignment to fail forever


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.2, 2.0.4
    • Fix Version/s: 2.3.0, 2.2.1, 2.1.6
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When the first region opens on a RegionServer (RS), WALFactory creates a WAL file. If WAL creation fails, HDFS can in some cases leave an empty file in the WAL directory (e.g. the disk is full: the file is created successfully but block allocation fails). AbstractFSWAL checks whether a WAL belonging to the same factory already exists and throws an error if it does. As a result, the region can never be opened on this RS afterwards.

      2019-01-17 02:15:53,320 ERROR [RS_OPEN_META-regionserver/server003:16020-0] handler.OpenRegionHandler(301): Failed open of region=hbase:meta,,1.1588230740
      java.io.IOException: Target WAL already exists within directory hdfs://cluster/hbase/WALs/server003.hbase.hostname.com,16020,1545269815888
              at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.<init>(AbstractFSWAL.java:382)
              at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.<init>(AsyncFSWAL.java:210)
              at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createWAL(AsyncFSWALProvider.java:72)
              at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createWAL(AsyncFSWALProvider.java:47)
              at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:138)
              at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:57)
              at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:264)
              at org.apache.hadoop.hbase.regionserver.HRegionServer.getWAL(HRegionServer.java:2085)
              at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
              at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
              at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
              at java.lang.Thread.run(Thread.java:834)
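
      The failure mode described above can be reproduced in miniature without HBase or HDFS. The following is a minimal sketch on the local filesystem, not the HBase code: the class `WalCreationSketch`, the method `createWal`, and the `cleanUpOnFailure` flag are all hypothetical names introduced for illustration, and the cleanup shown is only one possible mitigation, not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class WalCreationSketch {

    // Hypothetical stand-in for the "WAL of the same factory already exists" check.
    static boolean walWithPrefixExists(Path walDir, String prefix) throws IOException {
        try (Stream<Path> files = Files.list(walDir)) {
            return files.anyMatch(p -> p.getFileName().toString().startsWith(prefix));
        }
    }

    // Hypothetical WAL creation: check for an existing WAL, create the file,
    // then optionally simulate a failure that happens after the file exists.
    static Path createWal(Path walDir, String prefix, long suffix,
                          boolean failAfterCreate, boolean cleanUpOnFailure)
            throws IOException {
        if (walWithPrefixExists(walDir, prefix)) {
            throw new IOException("Target WAL already exists within directory " + walDir);
        }
        Path wal = walDir.resolve(prefix + "." + suffix);
        Files.createFile(wal); // the (empty) file now exists in the directory
        try {
            if (failAfterCreate) {
                // Simulates e.g. block allocation failing on a full disk.
                throw new IOException("simulated block allocation failure");
            }
            return wal;
        } catch (IOException e) {
            if (cleanUpOnFailure) {
                Files.deleteIfExists(wal); // assumed mitigation: remove the leftover file
            }
            throw e;
        }
    }

    // Runs a failing attempt followed by a retry and reports what happened.
    static void demo(boolean cleanUpOnFailure) throws IOException {
        Path walDir = Files.createTempDirectory("wal-sketch");
        try {
            createWal(walDir, "server003", 1L, true, cleanUpOnFailure);
        } catch (IOException e) {
            System.out.println("first attempt failed: " + e.getMessage());
        }
        try {
            createWal(walDir, "server003", 2L, false, cleanUpOnFailure);
            System.out.println("retry succeeded");
        } catch (IOException e) {
            System.out.println("retry failed: " + e.getMessage());
        }
    }

    public static void main(String[] args) throws IOException {
        demo(false); // leftover empty file blocks every later attempt
        demo(true);  // leftover file removed, so the retry can proceed
    }
}
```

      Without cleanup, the retry hits the existence check even though nothing else is wrong, which mirrors the "Target WAL already exists" error in the log above. With cleanup, the retry is free to create a new file.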
      

        Attachments

        1. HBASE-21751.patch
          6 kB
          Allan Yang
        2. HBASE-21751v2.patch
          6 kB
          Allan Yang
        3. HBASE-21751-branch-2.1-v1.patch
          29 kB
          Bing Xiao
        4. HBASE-21751-branch-2.1-v2.patch
          29 kB
          Bing Xiao
        5. HBASE-21751.v2.patch
          13 kB
          Bing Xiao
        6. HBASE-21751.v3.patch
          13 kB
          Bing Xiao
        7. HBASE-21751-branch-2.1-v3.patch
          29 kB
          Bing Xiao


            People

            • Assignee:
              luffy123 Bing Xiao
            • Reporter:
              allan163 Allan Yang
            • Votes:
              0
            • Watchers:
              9
