Accumulo / ACCUMULO-3351

Tracer can't write traces after offline and online of trace table

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.1
    • Fix Version/s: 1.6.2, 1.7.0
    • Component/s: trace
    • Labels: None

    Description

      While running tests for ACCUMULO-3167, I updated one of the tests to take the trace table offline, to reduce the chance that any active logs for the trace table would exist in the metadata table.

      A later test, which validates that traces were found for some conditional update sessions, hung indefinitely.

      The tracer log shows that the batch writer hit two exceptions because the trace table was offline (as expected), but it never recovered when the trace table came back online.

      2014-11-20 13:08:28,717 [impl.TabletServerBatchWriter] DEBUG: Table trace (in) is offline
      org.apache.accumulo.core.client.TableOfflineException: Table trace (in) is offline
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.binMutations(TabletServerBatchWriter.java:662)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.addMutations(TabletServerBatchWriter.java:694)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.startProcessing(TabletServerBatchWriter.java:233)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.addFailedMutations(TabletServerBatchWriter.java:551)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.access$700(TabletServerBatchWriter.java:101)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$FailedMutations.run(TabletServerBatchWriter.java:603)
      	at java.util.TimerThread.mainLoop(Timer.java:555)
      	at java.util.TimerThread.run(Timer.java:505)
      2014-11-20 13:08:28,720 [tracer.TraceServer] WARN : Problem flushing traces, resetting writer. Set log level to DEBUG to see stacktrace. cause: org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0  security codes: {}  # server errors 0 # exceptions 1
      2014-11-20 13:08:28,720 [tracer.TraceServer] DEBUG: flushing traces failed due to exception
      org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0  security codes: {}  # server errors 0 # exceptions 1
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.checkForFailures(TabletServerBatchWriter.java:537)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.flush(TabletServerBatchWriter.java:331)
      	at org.apache.accumulo.core.client.impl.BatchWriterImpl.flush(BatchWriterImpl.java:59)
      	at org.apache.accumulo.tracer.TraceServer.flush(TraceServer.java:245)
      	at org.apache.accumulo.tracer.TraceServer.access$300(TraceServer.java:78)
      	at org.apache.accumulo.tracer.TraceServer$1.run(TraceServer.java:235)
      	at org.apache.accumulo.server.util.time.SimpleTimer$LoggingTimerTask.run(SimpleTimer.java:42)
      	at java.util.TimerThread.mainLoop(Timer.java:555)
      	at java.util.TimerThread.run(Timer.java:505)
      Caused by: org.apache.accumulo.core.client.TableOfflineException: Table trace (in) is offline
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.binMutations(TabletServerBatchWriter.java:662)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.addMutations(TabletServerBatchWriter.java:694)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.startProcessing(TabletServerBatchWriter.java:233)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.addFailedMutations(TabletServerBatchWriter.java:551)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.access$700(TabletServerBatchWriter.java:101)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$FailedMutations.run(TabletServerBatchWriter.java:603)
      	... 2 more
      2014-11-20 13:08:28,722 [tracer.TraceServer] WARN : Unable to create a batch writer, will retry. Set log level to DEBUG to see stacktrace. cause: org.apache.accumulo.core.client.TableOfflineException: Table trace (in) is offline
      2014-11-20 13:08:28,722 [tracer.TraceServer] DEBUG: batch writer creation failed with exception.
      org.apache.accumulo.core.client.TableOfflineException: Table trace (in) is offline
      	at org.apache.accumulo.core.client.impl.ConnectorImpl.getTableId(ConnectorImpl.java:86)
      	at org.apache.accumulo.core.client.impl.ConnectorImpl.createBatchWriter(ConnectorImpl.java:128)
      	at org.apache.accumulo.tracer.TraceServer.resetWriter(TraceServer.java:262)
      	at org.apache.accumulo.tracer.TraceServer.flush(TraceServer.java:250)
      	at org.apache.accumulo.tracer.TraceServer.access$300(TraceServer.java:78)
      	at org.apache.accumulo.tracer.TraceServer$1.run(TraceServer.java:235)
      	at org.apache.accumulo.server.util.time.SimpleTimer$LoggingTimerTask.run(SimpleTimer.java:42)
      	at java.util.TimerThread.mainLoop(Timer.java:555)
      	at java.util.TimerThread.run(Timer.java:505)
      2014-11-20 13:08:28,723 [tracer.TraceServer] WARN : Problem closing batch writer. Set log level to DEBUG to see stacktrace. cause: org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0  security codes: {}  # server errors 0 # exceptions 1
      2014-11-20 13:08:28,723 [tracer.TraceServer] DEBUG: batch writer close failed with exception
      org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0  security codes: {}  # server errors 0 # exceptions 1
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.checkForFailures(TabletServerBatchWriter.java:537)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.close(TabletServerBatchWriter.java:354)
      	at org.apache.accumulo.core.client.impl.BatchWriterImpl.close(BatchWriterImpl.java:54)
      	at org.apache.accumulo.tracer.TraceServer.resetWriter(TraceServer.java:271)
      	at org.apache.accumulo.tracer.TraceServer.flush(TraceServer.java:250)
      	at org.apache.accumulo.tracer.TraceServer.access$300(TraceServer.java:78)
      	at org.apache.accumulo.tracer.TraceServer$1.run(TraceServer.java:235)
      	at org.apache.accumulo.server.util.time.SimpleTimer$LoggingTimerTask.run(SimpleTimer.java:42)
      	at java.util.TimerThread.mainLoop(Timer.java:555)
      	at java.util.TimerThread.run(Timer.java:505)
      Caused by: org.apache.accumulo.core.client.TableOfflineException: Table trace (in) is offline
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.binMutations(TabletServerBatchWriter.java:662)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.addMutations(TabletServerBatchWriter.java:694)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.startProcessing(TabletServerBatchWriter.java:233)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.addFailedMutations(TabletServerBatchWriter.java:551)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.access$700(TabletServerBatchWriter.java:101)
      	at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$FailedMutations.run(TabletServerBatchWriter.java:603)
      	... 2 more
      2014-11-20 13:10:12,929 [tracer.TraceServer] WARN : writer is not ready; discarding span.
      2014-11-20 13:10:12,930 [tracer.TraceServer] WARN : writer is not ready; discarding span.
      

      "writer is not ready; discarding span." repeats indefinitely.
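
One plausible reading of the log above is that the tracer ended up without a usable writer and nothing recreated it once the table was back online. The following is a minimal, self-contained sketch of a sink that keeps retrying writer creation from its periodic flush. It is an illustration only, not the TraceServer code and not the change committed for this issue; `FakeTable`, `FakeWriter`, and `Sink` are hypothetical stand-ins, and `IllegalStateException` stands in for `TableOfflineException`.

```java
import java.util.ArrayList;
import java.util.List;

public class TraceSinkSketch {

    /** Hypothetical stand-in for the trace table: writes fail while it is offline. */
    static class FakeTable {
        boolean online = true;
        final List<String> rows = new ArrayList<>();
    }

    /** Hypothetical stand-in for a batch writer bound to one table. */
    static class FakeWriter {
        private final FakeTable table;

        FakeWriter(FakeTable table) {
            // Assumption modeled on the log: creating a writer for an offline table fails.
            if (!table.online) {
                throw new IllegalStateException("table is offline");
            }
            this.table = table;
        }

        void write(String span) {
            if (!table.online) {
                throw new IllegalStateException("table is offline");
            }
            table.rows.add(span);
        }
    }

    /** Sink that drops spans while it has no writer, but keeps retrying creation. */
    static class Sink {
        private final FakeTable table;
        private FakeWriter writer; // null means "writer is not ready"
        int discarded = 0;

        Sink(FakeTable table) {
            this.table = table;
            resetWriter();
        }

        /** Try to create a fresh writer; on failure leave it null for a later retry. */
        private void resetWriter() {
            try {
                writer = new FakeWriter(table);
            } catch (IllegalStateException e) {
                writer = null;
            }
        }

        void span(String span) {
            if (writer == null) {
                discarded++; // corresponds to "writer is not ready; discarding span."
                return;
            }
            try {
                writer.write(span);
            } catch (IllegalStateException e) {
                discarded++;
                writer = null; // drop the broken writer; flush() will try again
            }
        }

        /** Periodic task: retrying when the writer is null is what allows recovery. */
        void flush() {
            if (writer == null) {
                resetWriter();
            }
        }

        boolean ready() {
            return writer != null;
        }
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        Sink sink = new Sink(table);
        sink.span("a");        // stored
        table.online = false;
        sink.span("b");        // write fails, writer dropped
        sink.flush();          // retry fails, table still offline
        sink.span("c");        // discarded, no writer
        table.online = true;
        sink.flush();          // retry succeeds
        sink.span("d");        // stored again
        System.out.println(table.rows + " discarded=" + sink.discarded);
    }
}
```

Running `main` prints `[a, d] discarded=2`: spans sent while the table is offline are lost, but the sink accepts spans again after the first flush that follows the table coming back online. Without the null check in `flush()`, the sketch would stay in the discarding state indefinitely, which matches the symptom reported here.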

          People

            Assignee: Josh Elser (elserj)
            Reporter: Josh Elser (elserj)

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 0.5h