ACCUMULO-2964

Unexpected ThriftSecurityException from BatchScanner

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: client, tserver
    • Labels: None

      Description

      I've only seen this a handful of times when writing or running tests that stop and restart tservers. After the tserver is restarted, a thread (typically running in the master) is trying to read a table, so it continues to poll until the tserver comes up.
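
      The polling behavior described above can be sketched roughly as follows. This is a minimal illustration, not Accumulo's actual retry code: the class name `PollUntilAvailable`, the `poll` method, and its parameters are all hypothetical. The point it shows is that a ConnectException is treated as "server not up yet" and retried, while any other exception (such as a security error) propagates to the caller immediately.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of Accumulo): retries an operation while the
// server refuses connections, and lets every other exception propagate.
final class PollUntilAvailable {

    static <T> T poll(Callable<T> op, int maxAttempts, long sleepMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ConnectException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Success ends the polling loop.
                return op.call();
            } catch (ConnectException e) {
                // Expected while the restarted server is not listening yet.
                last = e;
                Thread.sleep(sleepMillis);
            }
            // Any non-ConnectException is not caught here and surfaces at once.
        }
        // Every attempt was refused; report the last refusal.
        throw last;
    }
}
```

      Under this kind of loop, an unexpected exception type arriving in the middle of a run of connection refusals ends the polling rather than being retried, which matches what the logs below show.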

      Very infrequently, the client gets a ThriftSecurityException with a code of DEFAULT_SECURITY_ERROR and a message of "Unknown security exception". There is no additional information in the client log (from the thrift call inside the BatchScanner), and the tserver log contains no error messages at all.

      The error that the client saw:

      2014-07-01 04:18:18,971 [impl.TabletServerBatchReaderIterator] DEBUG: Server : host:58090 msg : null
      ThriftSecurityException(user:!SYSTEM, code:null)
              at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result$startMultiScan_resultStandardScheme.read(TabletClientService.java:10045)
              at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result$startMultiScan_resultStandardScheme.read(TabletClientService.java:10022)
              at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result.read(TabletClientService.java:9961)
              at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
              at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:313)
              at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:293)
              at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:632)
              at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:592)
              at org.apache.accumulo.core.metadata.MetadataLocationObtainer.lookupTablets(MetadataLocationObtainer.java:181)
              at org.apache.accumulo.core.client.impl.TabletLocatorImpl.processInvalidated(TabletLocatorImpl.java:667)
              at org.apache.accumulo.core.client.impl.TabletLocatorImpl.binRanges(TabletLocatorImpl.java:337)
              at org.apache.accumulo.core.client.impl.TabletLocatorImpl.processInvalidated(TabletLocatorImpl.java:660)
              at org.apache.accumulo.core.client.impl.TabletLocatorImpl._locateTablet(TabletLocatorImpl.java:610)
              at org.apache.accumulo.core.client.impl.TabletLocatorImpl.locateTablet(TabletLocatorImpl.java:440)
              at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:226)
              at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:84)
              at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:177)
              at org.apache.accumulo.master.replication.DistributedWorkQueueWorkAssigner.createWork(DistributedWorkQueueWorkAssigner.java:161)
              at org.apache.accumulo.master.replication.DistributedWorkQueueWorkAssigner.assignWork(DistributedWorkQueueWorkAssigner.java:140)
              at org.apache.accumulo.master.replication.WorkDriver.run(WorkDriver.java:97)
      

      The interesting part is that when the client saw this message, the new TabletServer had already started, and the old tabletserver appears to have been dead for 20s. So the client in the master had been polling for 20s and getting a ConnectException (connection refused), which is expected. I don't know why we got this exception only after that length of time.

      The infrequency with which I see this makes me wonder whether the random ports in the new tabletserver are somehow re-grabbing the old tserver's thrift client service port, so that something is unexpectedly being interpreted as this ThriftSecurityException. That's the only explanation that seems remotely possible to me.

          Activity

          Josh Elser added a comment -

          I couldn't figure out what was happening when I was familiar with the problem. I haven't seen it recently, and know nothing more that would help.

          Josh Elser added a comment -

          Digging into this more makes me think that there are two separate issues here. There is the inexplicable network/thrift error which I don't yet have a grasp on, and there is the issue with the batchwriter repeatedly failing. I've opened ACCUMULO-2990 for the latter and will reduce the priority on this issue again until I can get a better understanding of what's happening.

          Josh Elser added a comment -

          Upping the priority as it's a possibility that something in thrift changed the lifecycle semantics of BatchScanner/BatchWriter and also adding the affected version back to 1.6.1 until we can determine otherwise.

          Josh Elser added a comment -

          Just saw this again last night, but on a (Batch)Scanner instead of a BatchWriter this time. Same premise – tserver was killed and restarted. ~30s of connection refused to the old server and then suddenly a bunch of DEFAULT_SECURITY_ERROR thrift exceptions.

          Another interesting difference is that the exceptions I'm seeing this time are actually for !SYSTEM too, not just root.

          2014-07-11 04:13:43,713 [impl.TabletServerBatchReaderIterator] DEBUG: Server : juno:59672 msg : null
          ThriftSecurityException(user:!SYSTEM, code:null)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result$startMultiScan_resultStandardScheme.read(TabletClientService.java:10045)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result$startMultiScan_resultStandardScheme.read(TabletClientService.java:10022)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result.read(TabletClientService.java:9961)
                  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:313)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:293)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:632)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:592)
                  at org.apache.accumulo.core.metadata.MetadataLocationObtainer.lookupTablets(MetadataLocationObtainer.java:181)
                  at org.apache.accumulo.core.client.impl.TabletLocatorImpl.processInvalidated(TabletLocatorImpl.java:667)
                  at org.apache.accumulo.core.client.impl.TabletLocatorImpl.binRanges(TabletLocatorImpl.java:337)
                  at org.apache.accumulo.core.client.impl.TimeoutTabletLocator.binRanges(TimeoutTabletLocator.java:104)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.binRanges(TabletServerBatchReaderIterator.java:230)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.lookup(TabletServerBatchReaderIterator.java:217)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.<init>(TabletServerBatchReaderIterator.java:155)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReader.iterator(TabletServerBatchReader.java:115)
                  at org.apache.accumulo.server.master.state.MetaDataTableScanner.<init>(MetaDataTableScanner.java:66)
                  at org.apache.accumulo.server.master.state.MetaDataTableScanner.<init>(MetaDataTableScanner.java:56)
                  at org.apache.accumulo.server.master.state.MetaDataStateStore.iterator(MetaDataStateStore.java:67)
                  at org.apache.accumulo.master.TabletGroupWatcher.run(TabletGroupWatcher.java:158)
          
          2014-07-11 04:13:43,714 [master.Master] ERROR: Error processing table state for store Normal Tablets
          java.lang.RuntimeException: java.lang.RuntimeException: Failed to create iterator
                  at org.apache.accumulo.server.master.state.MetaDataTableScanner.<init>(MetaDataTableScanner.java:72)
                  at org.apache.accumulo.server.master.state.MetaDataTableScanner.<init>(MetaDataTableScanner.java:56)
                  at org.apache.accumulo.server.master.state.MetaDataStateStore.iterator(MetaDataStateStore.java:67)
                  at org.apache.accumulo.master.TabletGroupWatcher.run(TabletGroupWatcher.java:158)
          Caused by: java.lang.RuntimeException: Failed to create iterator
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.<init>(TabletServerBatchReaderIterator.java:159)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReader.iterator(TabletServerBatchReader.java:115)
                  at org.apache.accumulo.server.master.state.MetaDataTableScanner.<init>(MetaDataTableScanner.java:66)
                  ... 3 more
          Caused by: org.apache.accumulo.core.client.AccumuloSecurityException: Error DEFAULT_SECURITY_ERROR for user !SYSTEM - Unknown security exception
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:690)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:592)
                  at org.apache.accumulo.core.metadata.MetadataLocationObtainer.lookupTablets(MetadataLocationObtainer.java:181)
                  at org.apache.accumulo.core.client.impl.TabletLocatorImpl.processInvalidated(TabletLocatorImpl.java:667)
                  at org.apache.accumulo.core.client.impl.TabletLocatorImpl.binRanges(TabletLocatorImpl.java:337)
                  at org.apache.accumulo.core.client.impl.TimeoutTabletLocator.binRanges(TimeoutTabletLocator.java:104)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.binRanges(TabletServerBatchReaderIterator.java:230)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.lookup(TabletServerBatchReaderIterator.java:217)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.<init>(TabletServerBatchReaderIterator.java:155)
                  ... 5 more
          Caused by: ThriftSecurityException(user:!SYSTEM, code:null)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result$startMultiScan_resultStandardScheme.read(TabletClientService.java:10045)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result$startMultiScan_resultStandardScheme.read(TabletClientService.java:10022)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startMultiScan_result.read(TabletClientService.java:9961)
                  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:313)
                  at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:293)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:632)
                  ... 13 more
          

          In addition to the normal processing, the master was also trying to write out some new mutations for replication, and those writes started failing. The odd part is that the failure says it was for accumulo.metadata, but the mutations were for the replication table, not accumulo.metadata.

          org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0  security codes: {accumulo.metadata(ID:!0)=[DEFAULT_SECURITY_ERROR]}  # server errors 0 # exceptions 0
                  at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.checkForFailures(TabletServerBatchWriter.java:537)
                  at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.flush(TabletServerBatchWriter.java:331)
                  at org.apache.accumulo.core.client.impl.BatchWriterImpl.flush(BatchWriterImpl.java:61)
                  at org.apache.accumulo.master.replication.WorkMaker.addWorkRecord(WorkMaker.java:192)
                  at org.apache.accumulo.master.replication.WorkMaker.run(WorkMaker.java:124)
                  at org.apache.accumulo.master.replication.ReplicationDriver.run(ReplicationDriver.java:91)
          

          While the first exceptions eventually stopped, the latter kept repeatedly failing for the duration of the test (which ultimately failed). Both cases are similar (repeatedly executed code inside of the master), but the former recreates the BatchScanner whereas the latter attempts to reuse the same BatchWriter.

          I'm wondering if there's an issue in the BatchWriter that's causing it to become useless after the tserver underneath died/went-away. In the above stacktrace, it appears as if this is the case.
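
          To illustrate the difference between the two call patterns, here is a minimal sketch of the "recreate on failure" approach applied to a writer. It is only an illustration under stated assumptions: `MutationSink` and `RecreatingWriter` are hypothetical names and do not correspond to Accumulo's BatchWriter API. The wrapper discards the underlying writer whenever a write fails, so the next call starts with a fresh instance instead of reusing one that may be stuck in a failed state.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a batch writer; NOT Accumulo's BatchWriter interface.
interface MutationSink<M> {
    void write(M mutation) throws Exception;
    void close();
}

// Hypothetical wrapper: drops the underlying writer after any failure so that
// the next write obtains a new one from the factory.
final class RecreatingWriter<M> {
    private final Supplier<MutationSink<M>> factory;
    private MutationSink<M> current;

    RecreatingWriter(Supplier<MutationSink<M>> factory) {
        this.factory = factory;
    }

    void write(M mutation) throws Exception {
        if (current == null) {
            // Lazily create (or recreate) the underlying writer.
            current = factory.get();
        }
        try {
            current.write(mutation);
        } catch (Exception e) {
            // Assume the writer is unusable after a failure: close and forget it.
            try {
                current.close();
            } catch (RuntimeException ignored) {
                // Best-effort cleanup only.
            }
            current = null;
            throw e;
        }
    }
}
```

          Whether recreating the writer would actually avoid the repeated DEFAULT_SECURITY_ERROR failures is not established here; the sketch only mirrors the scanner-side behavior that eventually recovered.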

          Josh Elser added a comment -

          Son of a gun, I just saw this again last night. This time, it was from a Scanner in the JUnit code. The test saw the error after the first tserver had died completely, but before the tserver logged that it was starting.

          Client error:

          java.lang.RuntimeException: org.apache.accumulo.core.client.AccumuloSecurityException: Error DEFAULT_SECURITY_ERROR for user root - Unknown security exception
          	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startScan_result$startScan_resultStandardScheme.read(TabletClientService.java:6548)
          	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startScan_result$startScan_resultStandardScheme.read(TabletClientService.java:6525)
          	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$startScan_result.read(TabletClientService.java:6448)
          	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
          	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startScan(TabletClientService.java:228)
          	at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startScan(TabletClientService.java:204)
          	at org.apache.accumulo.core.client.impl.ThriftScanner.getBatchFromServer(ThriftScanner.java:99)
          	at org.apache.accumulo.core.metadata.MetadataLocationObtainer.lookupTablet(MetadataLocationObtainer.java:100)
          	at org.apache.accumulo.core.client.impl.TabletLocatorImpl.lookupTabletLocation(TabletLocatorImpl.java:465)
          	at org.apache.accumulo.core.client.impl.TabletLocatorImpl._locateTablet(TabletLocatorImpl.java:622)
          	at org.apache.accumulo.core.client.impl.TabletLocatorImpl.locateTablet(TabletLocatorImpl.java:440)
          	at org.apache.accumulo.core.client.impl.TabletLocatorImpl.lookupTabletLocation(TabletLocatorImpl.java:462)
          	at org.apache.accumulo.core.client.impl.TabletLocatorImpl._locateTablet(TabletLocatorImpl.java:622)
          	at org.apache.accumulo.core.client.impl.TabletLocatorImpl.locateTablet(TabletLocatorImpl.java:440)
          	at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:226)
          	at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:84)
          	at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:177)
          	at org.apache.accumulo.test.replication.MultiInstanceReplicationIT.dataReplicatedToCorrectTable(MultiInstanceReplicationIT.java:375)
          

          Error out of the Scanner:

          2014-07-01 08:40:59,251 [impl.ThriftScanner] WARN : Security Violation in scan request to host:58142: ThriftSecurityException(user:root, code:null)
          

          Snippet from newly started tserver log (note that the above warning from the scanner came before the tserver said it started the thrift server):

          2014-07-01 08:40:59,181 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKAuthorizor
          2014-07-01 08:40:59,184 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKAuthenticator
          2014-07-01 08:40:59,187 [conf.AccumuloConfiguration] INFO : Loaded class : org.apache.accumulo.server.security.handler.ZKPermHandler
          2014-07-01 08:40:59,321 [conf.Property] DEBUG: Loaded class : org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager
          2014-07-01 08:40:59,324 [tserver.TabletServer] INFO : Tablet server starting on 0.0.0.0
          2014-07-01 08:40:59,339 [util.FileSystemMonitor] INFO : Filesystem monitor started
          2014-07-01 08:40:59,343 [server.GarbageCollectionLogger] DEBUG: gc ParNew=0.03(+0.03) secs ConcurrentMarkSweep=0.04(+0.04) secs freemem=42,054,712(+42,054,712) totalmem=47,579,136
          2014-07-01 08:40:59,347 [trace.ZooTraceClient] DEBUG: Scanning trace hosts in zookeeper: /accumulo/bc72352d-a904-4102-8ab1-e536aea49c01/tracers
          2014-07-01 08:40:59,347 [trace.ZooTraceClient] DEBUG: Trace hosts: []
          2014-07-01 08:40:59,368 [tserver.TabletServer] DEBUG: org.apache.accumulo.tserver.TabletServer$ThriftClientHandler created
          2014-07-01 08:40:59,519 [tserver.TabletServer] INFO : address = host:53325
          2014-07-01 08:40:59,544 [tserver.TabletServer] DEBUG: Obtained tablet server lock /accumulo/bc72352d-a904-4102-8ab1-e536aea49c01/tservers/host:53325/zlock-0000000000
          2014-07-01 08:40:59,572 [tserver.TabletServer] INFO : Started replication service on host:58591
          

          I'm actually wondering now if this is just something due to the thrift-0.9.1 change?


            People

            • Assignee: Unassigned
            • Reporter: Josh Elser
            • Votes: 0
            • Watchers: 0

            Time Tracking

            • Original Estimate: Not Specified
            • Remaining Estimate: 0h
            • Time Spent: 10m