Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
Description
In production environments, we usually bulk load a huge number of HFiles. It is reasonable to fail fast as soon as any IOException occurs.
hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
public Map<byte[], List<Path>> bulkLoadHFiles(Collection<Pair<byte[], String>> familyPaths,
    boolean assignSeqId, BulkLoadListener bulkLoadListener, boolean copyFile,
    List<String> clusterIds, boolean replicate) throws IOException {
  ......
  try {
    this.writeRequestsCount.increment();

    // There possibly was a split that happened between when the split keys
    // were gathered and before the HRegion's write lock was taken. We need
    // to validate the HFile region before attempting to bulk load all of them
    List<IOException> ioes = new ArrayList<>();
    List<Pair<byte[], String>> failures = new ArrayList<>();
    for (Pair<byte[], String> p : familyPaths) {
      byte[] familyName = p.getFirst();
      String path = p.getSecond();

      HStore store = getStore(familyName);
      if (store == null) {
        IOException ioe = new org.apache.hadoop.hbase.DoNotRetryIOException(
            "No such column family " + Bytes.toStringBinary(familyName));
        ioes.add(ioe);
      } else {
        try {
          store.assertBulkLoadHFileOk(new Path(path));
        } catch (WrongRegionException wre) {
          // recoverable (file doesn't fit in region)
          failures.add(p);
        } catch (IOException ioe) {
          // unrecoverable (hdfs problem)
          ioes.add(ioe);
        }
      }
    }

    // validation failed because of some sort of IO problem.
    if (ioes.size() != 0) {
      IOException e = MultipleIOException.createIOException(ioes);
      LOG.error("There were one or more IO errors when checking if the bulk load is ok.", e);
      throw e;
    }
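To illustrate the proposal, here is a minimal, self-contained sketch of a fail-fast variant of the validation loop. It is not the actual HBase patch: `FailFastBulkLoadCheck`, `FileCheck`, `WrongRegionStandIn`, and `validateFailFast` are hypothetical stand-ins for `HStore.assertBulkLoadHFileOk` and `WrongRegionException`, so that the example runs without HBase on the classpath. The recoverable case is still collected, while any other IOException aborts the loop on the first occurrence.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class FailFastBulkLoadCheck {

    /** Hypothetical stand-in for WrongRegionException (the recoverable case). */
    static class WrongRegionStandIn extends IOException {
        WrongRegionStandIn(String msg) {
            super(msg);
        }
    }

    /** Hypothetical stand-in for store.assertBulkLoadHFileOk(new Path(path)). */
    interface FileCheck {
        void check(String path) throws IOException;
    }

    /**
     * Validates each path in order. Recoverable failures are collected and
     * returned; any other IOException propagates immediately, so the
     * remaining paths are never checked.
     */
    static List<String> validateFailFast(List<String> paths, FileCheck check)
            throws IOException {
        List<String> failures = new ArrayList<>();
        for (String path : paths) {
            try {
                check.check(path);
            } catch (WrongRegionStandIn wre) {
                // recoverable (file doesn't fit in region): remember it and continue
                failures.add(path);
            }
            // no catch for plain IOException: it is unrecoverable, so fail fast
        }
        return failures;
    }

    public static void main(String[] args) {
        AtomicInteger checked = new AtomicInteger();
        // Simulated check: "bad*" paths hit an IO problem, "split*" paths are in the wrong region.
        FileCheck check = path -> {
            checked.incrementAndGet();
            if (path.startsWith("bad")) {
                throw new IOException("simulated HDFS problem: " + path);
            }
            if (path.startsWith("split")) {
                throw new WrongRegionStandIn(path);
            }
        };
        try {
            validateFailFast(Arrays.asList("ok1", "split1", "bad1", "ok2"), check);
        } catch (IOException e) {
            System.out.println("aborted after " + checked.get() + " checks: " + e.getMessage());
        }
    }
}
```

With a large batch of HFiles, the difference is that the loop stops at the first unrecoverable error instead of validating every remaining file before throwing. The trade-off is that only the first IO error is reported, rather than the aggregate that `MultipleIOException.createIOException` produces in the current code.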