Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
5.0
None
None
Description
Today, if an HdfsTransactionLog cannot recover its lease, you get the following warning in the log:
log.warn("Cannot recoverLease after trying for " + conf.getInt("solr.hdfs.lease.recovery.timeout", 900000) + "ms (solr.hdfs.lease.recovery.timeout); continuing, but may be DATALOSS!!!; " + getLogMessageDetail(nbAttempt, p, startWaiting));
But some deployments may not want to continue when data loss is possible; they may want to investigate the underlying HDFS issue first. And there is currently no way, other than reading the logs, to find out that this happened.
There's a range of possibilities here, but here are a couple of ideas:
1) a config parameter that controls whether to continue when data loss is possible
2) load the log, but require a special flag to read potentially incorrect data (similar to shards.tolerant; data.tolerant or something?)
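To make idea 1 a bit more concrete, here is a rough, standalone sketch of what the decision could look like. It is not the actual HdfsTransactionLog code: it uses java.util.Properties in place of the Hadoop Configuration so that it runs on its own, and the property name solr.hdfs.lease.recovery.failOnTimeout, the class LeaseRecoveryPolicy and the exception type are all made up for illustration. Only solr.hdfs.lease.recovery.timeout and its 900000 default come from the warning above.

```java
import java.util.Properties;

public class LeaseRecoveryPolicy {
    // Existing property, taken from the warning quoted in the description.
    public static final String TIMEOUT_KEY = "solr.hdfs.lease.recovery.timeout";
    // HYPOTHETICAL property: not an existing Solr setting, name invented for this sketch.
    public static final String FAIL_KEY = "solr.hdfs.lease.recovery.failOnTimeout";

    /** HYPOTHETICAL exception, thrown instead of continuing with possible data loss. */
    public static class LeaseRecoveryFailedException extends RuntimeException {
        public LeaseRecoveryFailedException(String message) {
            super(message);
        }
    }

    /**
     * Called once lease recovery has timed out. Returns the warning text to log
     * if the deployment chose to continue (the default, matching today's behavior),
     * or throws if it chose to stop.
     */
    public static String onRecoveryTimeout(Properties conf, String detail) {
        int timeoutMs = Integer.parseInt(conf.getProperty(TIMEOUT_KEY, "900000"));
        boolean failOnTimeout = Boolean.parseBoolean(conf.getProperty(FAIL_KEY, "false"));
        String prefix = "Cannot recoverLease after trying for " + timeoutMs
                + "ms (" + TIMEOUT_KEY + "); ";
        if (failOnTimeout) {
            // Stop here so the HDFS problem can be investigated first.
            throw new LeaseRecoveryFailedException(prefix + "refusing to continue; " + detail);
        }
        return prefix + "continuing, but may be DATALOSS!!!; " + detail;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Default: same behavior as today, only a warning.
        System.out.println(onRecoveryTimeout(conf, "attempt=5"));

        conf.setProperty(FAIL_KEY, "true");
        try {
            onRecoveryTimeout(conf, "attempt=5");
        } catch (LeaseRecoveryFailedException e) {
            System.out.println("Refused to load: " + e.getMessage());
        }
    }
}
```

Defaulting the flag to false would keep existing deployments unchanged; only those that set it would see the log fail to load instead of a warning.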