HBASE-5675

Create table fails if we keep refreshing master's UI for task monitor status

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.90.4, 0.92.0
    • Fix Version/s: None
    • Component/s: master
    • Labels:

      Description

      I tried to create a table with 2K pre-split regions. While region assignment was still in progress, I kept refreshing the master's web UI to check the status of the task in the task monitor. Table creation failed: META showed 2K regions with a null server location, and the regions were not deployed onto the region servers.

      table_A
      Creating table table_A
      java.io.IOException: java.io.IOException: java.util.ConcurrentModificationException
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
      	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
      	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
      	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
      	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
      	at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:384)
      	at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:294)
      	at com.test.tools.hbase.schema.createIfNotExists(schema.java:520)
      	at com.test.tools.hbase.schema.main(schema.java:627)
      Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.util.ConcurrentModificationException
      	at java.util.SubList.checkForComodification(AbstractList.java:752)
      	at java.util.SubList.add(AbstractList.java:632)
      	at java.util.SubList.add(AbstractList.java:633)
      	at java.util.SubList.add(AbstractList.java:633)
      	..
      	..
      	at java.util.SubList.add(AbstractList.java:633)
      	at java.util.AbstractList.add(AbstractList.java:91)
      	at org.apache.hadoop.hbase.monitoring.TaskMonitor.createStatus(TaskMonitor.java:76)
      	at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:510)
      	at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:490)
      	at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:853)
      	at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:813)
      	at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:780)
      	at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
      	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
      	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
      	at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
      	at $Proxy5.createTable(Unknown Source)
      	at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:382)	
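
      The trace above shows the exception being raised from SubList.checkForComodification, reached through SubList.add. As a minimal sketch of that failure mode in general (this is not the TaskMonitor code; the class name, method name, and list contents are hypothetical), adding through a subList view after its backing list has been structurally changed by another caller throws the same exception, even on a single thread:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; it only illustrates the JDK behavior seen in the trace.
public class SubListCmeDemo {

    /** Returns true if adding through a stale subList view throws a CME. */
    static boolean staleViewThrows() {
        List<String> tasks = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            tasks.add("task-" + i);
        }

        // A view over the last three entries; it remembers the backing list's
        // modification count at the time it was created.
        List<String> recent = tasks.subList(2, 5);

        // Structural change made directly on the backing list, bypassing the view.
        tasks.add("task-from-another-caller");

        try {
            // The view detects that the backing list changed underneath it.
            recent.add("task-via-view");
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("stale view throws CME: " + staleViewThrows());
    }
}
```

      With two threads (for example an RPC handler creating a status while another caller touches the same list), the same check can fail nondeterministically unless both paths share a lock.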
      

        Activity

        Mubarak Seyed added a comment -

        I was able to create the table using the following workarounds:

        1. Increase the RPC timeout value to 10 minutes
        2. Create one table at a time
        3. Do not refresh the master's web UI while the table is being created
        Lars Hofhansl added a comment -

        TaskMonitor does not seem to exist in 0.90.x.
        In 0.92.x it is properly synchronized, although the synchronization was changed in 0.94+ (unnecessarily IMHO).
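
        To illustrate what "properly synchronized" can mean for a shared task list, here is a minimal sketch. It is not the HBase implementation or any of the patches named in this thread; the class name, the method names, and the MAX_TASKS limit are assumptions made for the example. Writers and readers take the same lock, old entries are trimmed in place rather than by keeping a subList view, and readers receive a copy:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical task registry; names and limits are illustrative only.
public class SynchronizedTaskList {
    private static final int MAX_TASKS = 1000; // assumed cap, not from HBase

    private final List<String> tasks = new ArrayList<>();

    /** Called by worker threads to register a new task description. */
    public synchronized void add(String description) {
        tasks.add(description);
        if (tasks.size() > MAX_TASKS) {
            // Drop the oldest entries in place; the temporary view never
            // escapes the lock, so no stale view can be used later.
            tasks.subList(0, tasks.size() - MAX_TASKS).clear();
        }
    }

    /** Called by readers (for example a status page); returns an independent copy. */
    public synchronized List<String> snapshot() {
        return new ArrayList<>(tasks);
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedTaskList list = new SynchronizedTaskList();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 5000; i++) {
                list.add("task-" + i);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 5000; i++) {
                list.snapshot().size(); // iterate-safe copy, taken under the lock
            }
        });
        writer.start();
        reader.start();
        writer.join();
        reader.join();
        System.out.println("tasks retained: " + list.snapshot().size());
    }
}
```

        Returning a copy keeps slow readers, such as page rendering, from holding the lock while they iterate, at the cost of one allocation per read.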

        Mubarak Seyed added a comment -

        @Lars
        CDH3u2 backports distributed log splitting to the 0.90.x branch, along with the task monitor code. I have verified this in the code:

        public List<StoreFile> close(final boolean abort) throws IOException {
          // Only allow one thread to close at a time. Serialize them so dual
          // threads attempting to close will run up against each other.
          MonitoredTask status = TaskMonitor.get().createStatus(
              "Closing region " + this +
              (abort ? " due to abort" : ""));

          status.setStatus("Waiting for close lock");
          try {
            synchronized (closeLock) {
              return doClose(abort, status);
            }
          } finally {
            status.cleanup();
          }
        }
        

        I believe 0.92 and above are affected as well. Thanks.

        stack added a comment -

        This is still an issue then?

        Andrew Purtell added a comment -

        We saw this CME in our internal distro but solved it by bringing forward TaskMonitor to the latest in 0.92.

        Jonathan Hsieh added a comment -

        The fix was in HBASE-5535 and in the 0.94 branch and above. I'll mark this issue as duplicate, and get it added into the issues to fix in CDH.

        Mubarak Seyed added a comment -

        Thanks Jon.

        Jonathan Hsieh added a comment -

        I took a closer look, and HBASE-5535 doesn't seem to add any new synchronization around the tasks list. Another related patch is part of HBASE-4057; investigating further.

        Jonathan Hsieh added a comment -

        Mubarak, found it – the fix is actually HBASE-4386, fixed in 0.92/0.94/trunk (and CDH3u3).


          People

          • Assignee: Mubarak Seyed
          • Reporter: Mubarak Seyed
          • Votes: 0
          • Watchers: 1
