  1. HBase
  2. HBASE-15271

Spark Bulk Load: Need to write HFiles to tmp location then rename to protect from Spark Executor Failures


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      When using the bulk load helper provided by the hbase-spark module, output files will now be written into temporary files and only made available when the executor has successfully completed.

      Previously, a failed executor left its partially written files in place, where a bulk load command would pick them up. When the failed work was retried, the load could then include spurious copies of some cells.

      Description

      With the current code, an executor failure before the HFile is closed causes problems. With this change, each HFile is first written to a file whose name starts with an underscore. Once the HFile is complete, the file is renamed to remove the underscore.

      The underscore is important because the bulk load functionality skips files whose names start with an underscore.
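
The write-then-rename idea described above can be sketched on a local filesystem. This is only an illustration under stated assumptions, not the code from the patch: the function names `write_then_rename`, `loadable_files`, and `failing_cells` are hypothetical, plain text files stand in for HFiles, and `os.rename` stands in for the rename on the cluster filesystem.

```python
import os
import tempfile


def write_then_rename(directory, name, cells):
    """Write cells to an underscore-prefixed file, then rename it when complete.

    Hypothetical helper: a local-filesystem stand-in for writing an HFile.
    """
    tmp_path = os.path.join(directory, "_" + name)
    final_path = os.path.join(directory, name)
    with open(tmp_path, "w") as f:
        for cell in cells:
            # If this loop raises, only the "_" file is left behind.
            f.write(cell + "\n")
    # The file is closed at this point; publish it by dropping the underscore.
    os.rename(tmp_path, final_path)
    return final_path


def loadable_files(directory):
    """Return the file names a loader would pick up, skipping "_" files."""
    return sorted(n for n in os.listdir(directory) if not n.startswith("_"))


def failing_cells():
    """Yield one cell, then simulate an executor failure mid-write."""
    yield "row1"
    raise RuntimeError("simulated executor failure")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        try:
            write_then_rename(d, "part-0", failing_cells())
        except RuntimeError:
            pass
        # The partial file still has its underscore, so the loader skips it.
        print(loadable_files(d))
        # The retry rewrites the temporary file and publishes it on success.
        write_then_rename(d, "part-0", ["row1", "row2"])
        print(loadable_files(d))
```

In this sketch a failed attempt never produces a file without the underscore, so a loader that filters on the prefix sees either a complete file or nothing.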

        Attachments

        1. HBASE-15271.1.patch
          6 kB
          Theodore michael Malaska
        2. HBASE-15271.2.patch
          6 kB
          Theodore michael Malaska
        3. HBASE-15271.3.patch
          6 kB
          Theodore michael Malaska
        4. HBASE-15271.4.patch
          7 kB
          Theodore michael Malaska


              People

              • Assignee:
                ted.m Theodore michael Malaska
                Reporter:
                ted.m Theodore michael Malaska
              • Votes:
                0
                Watchers:
                4
