Description
CachingHTableFactory may close an HTable during LRU eviction even while another thread is still using it for writes, which causes the writing thread to fail and the index to be disabled.
When the cache is full and a new HTable is requested, LRU eviction closes the evicted HTable (or its underlying connection):
{noformat}
2016-08-04 13:45:21,109 DEBUG [nat-s11-4-ioss-phoenix-1-5.openstacklocal,16020,1470297472814-index-writer--pool11-t35] client.ConnectionManager$HConnectionImplementation: Closing HConnection (debugging purposes only)
java.lang.Exception
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.internalClose(ConnectionManager.java:2423)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.close(ConnectionManager.java:2447)
    at org.apache.hadoop.hbase.client.CoprocessorHConnection.close(CoprocessorHConnection.java:41)
    at org.apache.hadoop.hbase.client.HTableWrapper.internalClose(HTableWrapper.java:91)
    at org.apache.hadoop.hbase.client.HTableWrapper.close(HTableWrapper.java:107)
    at org.apache.phoenix.hbase.index.table.CachingHTableFactory$HTableInterfaceLRUMap.removeLRU(CachingHTableFactory.java:61)
    at org.apache.commons.collections.map.LRUMap.addMapping(LRUMap.java:256)
    at org.apache.commons.collections.map.AbstractHashedMap.put(AbstractHashedMap.java:284)
    at org.apache.phoenix.hbase.index.table.CachingHTableFactory.getTable(CachingHTableFactory.java:100)
    at org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter$1.call(ParallelWriterIndexCommitter.java:160)
    at org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter$1.call(ParallelWriterIndexCommitter.java:136)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
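For orientation, the eviction path in the trace boils down to the following pattern: the factory's LRU map subclass closes the evicted HTable from its removeLRU callback, with no check for concurrent users. This is a condensed sketch following the class and method names in the trace, not the actual Phoenix source:

{code:java}
import java.io.IOException;

import org.apache.commons.collections.map.LRUMap;
import org.apache.hadoop.hbase.client.HTableInterface;

// Condensed sketch of the eviction path in the trace above; names follow
// CachingHTableFactory, but the body is illustrative, not the real source.
public class HTableInterfaceLRUMap extends LRUMap {

    public HTableInterfaceLRUMap(int cacheSize) {
        super(cacheSize, true);
    }

    @Override
    protected boolean removeLRU(LinkEntry entry) {
        HTableInterface table = (HTableInterface) entry.getValue();
        try {
            // Closes the HTable (and, through HTableWrapper, the underlying
            // CoprocessorHConnection) even if a writer thread still holds it.
            table.close();
        } catch (IOException e) {
            // A sketch can swallow this; real code should log the failure.
        }
        return true; // evict the entry
    }
}
{code}

Because removeLRU fires inside put() on one writer's getTable path, it can close a table that a different writer obtained moments earlier.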
But the IndexWriter was still using this old connection, which had been closed during LRU eviction, to write to the table:
{noformat}
2016-08-04 13:44:59,553 ERROR [htable-pool659-t1] client.AsyncProcess: Cannot get replica 0 location for {"totalColumns":1,"row":"\\xC7\\x03\\x04\\x06X\\x1C)\\x00\\x80\\x07\\xB0X","families":{"0":[{"qualifier":"_0","vlen":2,"tag":[],"timestamp":1470318296425}]}}
java.io.IOException: hconnection-0x21f468be closed
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1153)
    at org.apache.hadoop.hbase.client.CoprocessorHConnection.locateRegion(CoprocessorHConnection.java:41)
    at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.findAllLocationsOrFail(AsyncProcess.java:949)
    at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:866)
    at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1195)
    at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveGlobalFailure(AsyncProcess.java:1162)
    at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1100(AsyncProcess.java:584)
    at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:727)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
Although the workaround is to increase the cache size (index.tablefactory.cache.size), we should still handle the closing of in-use HTables to avoid index write failures (which in turn disable the index).
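For the workaround, the idea is to raise the cache limit high enough that eviction (and the close() it triggers) rarely happens. A minimal sketch of reading and overriding the setting; the default of 10 is an assumption here, so verify it against CachingHTableFactory in your Phoenix version:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class IndexCacheSizeWorkaround {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // 10 is an assumed default; check your Phoenix version.
        int current = conf.getInt("index.tablefactory.cache.size", 10);
        // Raise the limit well above the number of index tables written
        // concurrently, so in-use HTables are not evicted and closed.
        conf.setInt("index.tablefactory.cache.size", Math.max(current, 100));
    }
}
{code}

In practice this would be set in hbase-site.xml on the region servers; the snippet only illustrates the property name and an assumed default.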
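For the real fix, one possible direction (sketched below; RefCountedTable, acquire() and release() are hypothetical names, not Phoenix API) is to reference-count each cached HTable so that eviction only drops the cache's reference, and the actual close() happens when the last writer releases it:

{code:java}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.hbase.client.HTableInterface;

// Hypothetical sketch: defer close() until the last user releases the table.
public class RefCountedTable {

    private final HTableInterface delegate;
    // Starts at 1 to account for the cache's own reference.
    private final AtomicInteger refCount = new AtomicInteger(1);

    public RefCountedTable(HTableInterface delegate) {
        this.delegate = delegate;
    }

    // Writers call this before using the table.
    public HTableInterface acquire() {
        refCount.incrementAndGet();
        return delegate;
    }

    // Writers call this when done; removeLRU() calls it on eviction.
    // Only the final release actually closes the table.
    public void release() throws IOException {
        if (refCount.decrementAndGet() == 0) {
            delegate.close();
        }
    }
}
{code}

With something like this in place, removeLRU() would call release() instead of close(), so an evicted table that is still being written to stays open until its last writer finishes.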