  1. HBase
  2. HBASE-15271

Spark Bulk Load: Need to write HFiles to tmp location then rename to protect from Spark Executor Failures


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      When using the bulk load helper provided by the hbase-spark module, output files will now be written into temporary files and only made available when the executor has successfully completed.

      Previously, failed executors would leave their files in place in a way that would be picked up by a bulk load command. As a result, a bulk load that followed a retried failure could include spurious copies of some cells.

      Description

      With the current code, an executor failure that occurs before the HFile is closed causes problems. With this JIRA, each file is first written to a file whose name starts with an underscore. When the HFile is complete, it is renamed and the underscore is removed.

      The underscore is important because the bulk load functionality skips files whose names start with an underscore.
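
      The write-then-rename idea described above can be illustrated with a minimal sketch. It is not the hbase-spark code: it uses the local filesystem rather than HDFS, plain text lines rather than HFile cells, and the function names (`write_then_rename`, `loadable_files`, `simulate_failed_writer`) are hypothetical names chosen for illustration only.

```python
import os
import tempfile


def write_then_rename(directory, name, cells):
    """Write cells to '_<name>' first, then rename to '<name>' once complete."""
    tmp_path = os.path.join(directory, "_" + name)
    final_path = os.path.join(directory, name)
    with open(tmp_path, "w") as f:
        for cell in cells:
            f.write(cell + "\n")
    # Only a fully written and closed file reaches this point.
    os.rename(tmp_path, final_path)
    return final_path


def loadable_files(directory):
    """Return the names a loader would pick up, skipping names that start with '_'."""
    return sorted(n for n in os.listdir(directory) if not n.startswith("_"))


def simulate_failed_writer(directory, name, cells):
    """Simulate a writer that dies mid-write: the '_' file is left behind, never renamed."""
    with open(os.path.join(directory, "_" + name), "w") as f:
        f.write(cells[0] + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        simulate_failed_writer(d, "hfile-0", ["row1", "row2"])  # hypothetical failed attempt
        write_then_rename(d, "hfile-1", ["row1", "row2"])       # hypothetical retry
        print(loadable_files(d))  # prints ['hfile-1']
```

      In this sketch the partial file from the failed attempt stays on disk as `_hfile-0`, but the loader-side filter never sees it, so only the output of the completed attempt is visible.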

        Attachments

        1. HBASE-15271.1.patch
          6 kB
          Theodore michael Malaska
        2. HBASE-15271.2.patch
          6 kB
          Theodore michael Malaska
        3. HBASE-15271.3.patch
          6 kB
          Theodore michael Malaska
        4. HBASE-15271.4.patch
          7 kB
          Theodore michael Malaska

            People

            • Assignee:
              ted.m Theodore michael Malaska
            • Reporter:
              ted.m Theodore michael Malaska
