Description
Just came across a really odd scenario. I believe that it's a race condition in the client that stems from our beloved ZooCache.
This was observed via a test failure in LogicalTimeIT:
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 29.249 sec <<< FAILURE! - in org.apache.accumulo.test.functional.LogicalTimeIT
run(org.apache.accumulo.test.functional.LogicalTimeIT)  Time elapsed: 29.037 sec  <<< ERROR!
org.apache.accumulo.core.client.TableNotFoundException: Table LogicalTimeIT_run06 does not exist
    at org.apache.accumulo.core.client.impl.Tables._getTableId(Tables.java:117)
    at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:102)
    at org.apache.accumulo.core.client.impl.TableOperationsImpl.addSplits(TableOperationsImpl.java:374)
    at org.apache.accumulo.test.functional.LogicalTimeIT.runMergeTest(LogicalTimeIT.java:81)
    at org.apache.accumulo.test.functional.LogicalTimeIT.run(LogicalTimeIT.java:56)
Ultimately, the test code that hits this is:
conn.tableOperations().create(table, new NewTableConfiguration().setTimeType(TimeType.LOGICAL));
TreeSet<Text> splitSet = new TreeSet<Text>();
for (String split : splits) {
  splitSet.add(new Text(split));
}
conn.tableOperations().addSplits(table, splitSet);
The important piece to remember is that a ZooKeeper client, when a watcher is set, will eventually get all updates from that watcher in the order in which they occurred. Also, LogicalTimeIT repeatedly runs the same test over tables of varying characteristics, so tables are created back to back. I think these are the two important points.
Consider the following:
- Client creates a table T1
- ZooCache is cleared after FATE op finishes
- Watcher is set on ZTABLES in ZK
- Client interacts with T1
- Client creates T2
- ZooCache is cleared after FATE op finishes
- Watcher fires on ZTABLES node in ZK (CHILDREN_CHANGED) and repopulates the child list on the ZTABLES node
- Client makes call to split T2
- Code checks whether the table exists, but the childrenCache in ZooCache has already been repopulated with a child list that predates T2, which causes the client to think the table doesn't exist
- Eventually, the watcher would fire again, the cached ZTABLES children would be updated, and everything would be fine.
I believe this scenario is plausible, though perhaps unlikely.
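
To make the sequence above concrete, here is a rough toy model of it in plain Java. It is only a sketch under assumptions and not the real ZooCache: the class StaleChildrenCacheSketch and all of its methods are hypothetical, and it assumes each delayed watcher event repopulates the cache with the child list as it looked when that event was generated (real ZooKeeper events carry no data, so this stands in for a client view that lags behind).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of a client-side children cache; NOT Accumulo's ZooCache.
class StaleChildrenCacheSketch {
  // Stand-in for the real children of the ZTABLES node in ZooKeeper.
  private final Set<String> zkChildren = new TreeSet<>();
  // Client-side cached child list; null means "nothing cached".
  private Set<String> childrenCache = null;
  // Watcher events not yet delivered to the client, in the order they occurred.
  // Assumption: each event carries the child list as of the change that caused it.
  private final Deque<Set<String>> pendingWatchEvents = new ArrayDeque<>();

  // Models a table-creation FATE op: ZK is updated, an event is queued,
  // and the client cache is cleared once the op finishes.
  void createTable(String table) {
    zkChildren.add(table);
    pendingWatchEvents.add(new TreeSet<>(zkChildren));
    childrenCache = null;
  }

  // Models a (possibly late) CHILDREN_CHANGED event repopulating the cache.
  void deliverNextWatchEvent() {
    Set<String> snapshot = pendingWatchEvents.poll();
    if (snapshot != null) {
      childrenCache = snapshot;
    }
  }

  // Models the existence check: served from the cache when it is populated,
  // otherwise read from ZK and cached.
  boolean tableExists(String table) {
    if (childrenCache == null) {
      childrenCache = new TreeSet<>(zkChildren);
    }
    return childrenCache.contains(table);
  }

  // Replays the steps from the list above and records what the client sees.
  static boolean[] simulate() {
    StaleChildrenCacheSketch client = new StaleChildrenCacheSketch();
    client.createTable("T1");
    boolean t1Visible = client.tableExists("T1");        // cache miss, reads ZK
    client.createTable("T2");                            // cache cleared again
    client.deliverNextWatchEvent();                      // late event from T1's creation
    boolean t2VisibleEarly = client.tableExists("T2");   // served from the stale cache
    client.deliverNextWatchEvent();                      // event from T2's creation
    boolean t2VisibleLater = client.tableExists("T2");
    return new boolean[] {t1Visible, t2VisibleEarly, t2VisibleLater};
  }

  public static void main(String[] args) {
    boolean[] seen = simulate();
    System.out.println("T1 visible after create:        " + seen[0]);
    System.out.println("T2 visible before second event: " + seen[1]);
    System.out.println("T2 visible after second event:  " + seen[2]);
  }
}
```

In this model the second check reports T2 as missing even though it was already created, which is the same shape as the TableNotFoundException from addSplits, and the third check succeeds once the later event arrives.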