ACCUMULO-3777

Minor compaction fails forever after table deleted

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: None
    • Labels:
    • Environment:

      Hadoop 2.7.0, ZK 3.4.6, Accumulo 83d1b8388ad807d678c9a3a922e5025faa9a5933, 20 node m3.large EC2 cluster

      Description

      While running the RW test, I saw a minor compaction thread go haywire after a table was deleted.

      The tablet server logged this exception continually:

      2015-05-06 16:16:35,374 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
      java.lang.IllegalArgumentException: Table with id 1l does not exist
              at org.apache.accumulo.core.client.impl.Tables.getNamespaceId(Tables.java:239)
              at org.apache.accumulo.server.conf.TableParentConfiguration.getNamespaceId(TableParentConfiguration.java:38)
              at org.apache.accumulo.server.conf.NamespaceConfiguration.getPath(NamespaceConfiguration.java:88)
              at org.apache.accumulo.server.conf.NamespaceConfiguration.get(NamespaceConfiguration.java:101)
              at org.apache.accumulo.server.conf.ZooCachePropertyAccessor.get(ZooCachePropertyAccessor.java:110)
              at org.apache.accumulo.server.conf.TableConfiguration.get(TableConfiguration.java:99)
              at org.apache.accumulo.core.conf.AccumuloConfiguration.getTimeInMillis(AccumuloConfiguration.java:252)
              at org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager.getMinCIdleThreshold(LargestFirstMemoryManager.java:142)
              at org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager.getMemoryManagementActions(LargestFirstMemoryManager.java:175)
              at org.apache.accumulo.tserver.TabletServerResourceManager$MemoryManagementFramework.manageMemory(TabletServerResourceManager.java:408)
              at org.apache.accumulo.tserver.TabletServerResourceManager$MemoryManagementFramework.access$400(TabletServerResourceManager.java:318)
              at org.apache.accumulo.tserver.TabletServerResourceManager$MemoryManagementFramework$2.run(TabletServerResourceManager.java:346)
              at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
              at java.lang.Thread.run(Thread.java:745)
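
The trace shows the per-table configuration lookup in `LargestFirstMemoryManager.getMinCIdleThreshold` throwing, and the exception escaping all the way to `manageMemory`, so the whole memory-management pass fails. One way to contain this kind of failure is to catch the lookup error per tablet and skip only the tablets of the missing table. The following is a minimal sketch of that idea, not the fix that was applied for this issue; `DeletedTableSketch`, `TabletReport`, `idleThreshold`, and `selectIdleTablets` are hypothetical names and are not Accumulo APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch: these types are illustrative and are not Accumulo classes.
final class DeletedTableSketch {

    /** Stand-in for one tablet's state as seen by a memory manager. */
    static final class TabletReport {
        final String tableId;
        final long idleMillis;

        TabletReport(String tableId, long idleMillis) {
            this.tableId = tableId;
            this.idleMillis = idleMillis;
        }
    }

    /** Simulates a per-table config lookup that throws once the table is gone. */
    static long idleThreshold(Map<String, Long> thresholds, String tableId) {
        Long value = thresholds.get(tableId);
        if (value == null) {
            throw new IllegalArgumentException("Table with id " + tableId + " does not exist");
        }
        return value;
    }

    /**
     * Picks tablets that have been idle longer than their table's threshold.
     * A failed lookup skips only that tablet instead of aborting the whole pass.
     */
    static List<TabletReport> selectIdleTablets(List<TabletReport> reports,
                                                ToLongFunction<String> thresholdLookup) {
        List<TabletReport> selected = new ArrayList<>();
        for (TabletReport report : reports) {
            long threshold;
            try {
                threshold = thresholdLookup.applyAsLong(report.tableId);
            } catch (IllegalArgumentException e) {
                // The table no longer exists; ignore this tablet and keep going.
                continue;
            }
            if (report.idleMillis > threshold) {
                selected.add(report);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Long> thresholds = new HashMap<>();
        thresholds.put("2", 5_000L); // table "1l" is deliberately absent (deleted)

        List<TabletReport> reports = Arrays.asList(
            new TabletReport("1l", 60_000L),
            new TabletReport("2", 60_000L),
            new TabletReport("2", 1_000L));

        for (TabletReport r : selectIdleTablets(reports, id -> idleThreshold(thresholds, id))) {
            System.out.println("would compact tablet of table " + r.tableId);
        }
    }
}
```

With the sample data in `main`, the tablet of the deleted table `1l` is skipped and only the long-idle tablet of table `2` is selected. Whether skipping is the right policy, as opposed to unloading tablets of a deleted table, is a design choice this sketch does not settle.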
      

      From the master logs:

      2015-05-06 16:16:35,014 [tableOps.CleanUp] DEBUG: Deleted table 1l
      

      This seems to have gone on for almost an hour, with the error logged about every 250 ms, until something whacked the tserver:

      [ec2-user@worker5 logs]$ grep 'Memory manager failed Table with id 1l does not exist' tserver_worker5.log | head -3
      2015-05-06 16:16:35,123 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
      2015-05-06 16:16:35,374 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
      2015-05-06 16:16:35,625 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
      [ec2-user@worker5 logs]$ grep 'Memory manager failed Table with id 1l does not exist' tserver_worker5.log | tail -3
      2015-05-06 17:15:06,141 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
      2015-05-06 17:15:06,392 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
      2015-05-06 17:15:06,642 [tserver.TabletServerResourceManager] ERROR: Memory manager failed Table with id 1l does not exist
      [ec2-user@worker5 logs]$ grep "Lost tablet server lock" tserver_worker5.log 
      2015-05-06 17:15:06,685 [tserver.TabletServer] ERROR: Lost tablet server lock (reason = LOCK_DELETED), exiting.
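
The retry interval and the total duration can be read off the timestamps in the grep output above. As a small sketch of that arithmetic (the class and method names `LogGapSketch`, `timestampOf`, and `millisBetween` are hypothetical, and the 23-character timestamp prefix is an assumption based on the log lines shown here):

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for measuring gaps between tserver log lines.
final class LogGapSketch {

    // Matches the prefix used in the logs above, e.g. "2015-05-06 16:16:35,123".
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");
    private static final int STAMP_LENGTH = 23;

    /** Parses the leading timestamp of a log line. */
    static LocalDateTime timestampOf(String logLine) {
        return LocalDateTime.parse(logLine.substring(0, STAMP_LENGTH), STAMP);
    }

    /** Milliseconds elapsed between the timestamps of two log lines. */
    static long millisBetween(String earlier, String later) {
        return Duration.between(timestampOf(earlier), timestampOf(later)).toMillis();
    }

    public static void main(String[] args) {
        String suffix = " [tserver.TabletServerResourceManager] ERROR: Memory manager failed";
        String first = "2015-05-06 16:16:35,123" + suffix;
        String second = "2015-05-06 16:16:35,374" + suffix;
        String last = "2015-05-06 17:15:06,642" + suffix;

        System.out.println("retry interval (ms): " + millisBetween(first, second));
        System.out.println("total duration (ms): " + millisBetween(first, last));
    }
}
```

For the lines quoted above this gives a 251 ms gap between the first two errors and 3,511,519 ms (about 58.5 minutes) between the first and last, which ends roughly 43 ms before the tserver lost its lock.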
      

              People

              • Assignee: Josh Elser (elserj)
              • Reporter: Keith Turner (kturner)
              • Votes: 0
              • Watchers: 2

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m