  1. Sqoop (Retired)
  2. SQOOP-2001

Sqoop2 Kite connector might produce duplicate values when retrying failed tasks.

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • 2.0.0
    • None
    • None

    Description

      This happens (as I understand things) because Kite may make files visible in the combined temporary dataset directory before a task is completed or committed, so output from a failed attempt can end up alongside the output of its retry. We should be able to avoid this by setting the temporary dataset's writer cache limit to something huge, so that files are not closed until the overall writer is closed, no matter how many files are open.
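
      To illustrate the suspected mechanism, here is a toy model in Python. It is not Kite's implementation: the class and function names (ToyPartitionedWriter, run_attempt, run_with_retry) are hypothetical, and it assumes that closing a file is what makes it visible and that a crashed attempt simply abandons its open files. Under those assumptions, a small writer cache publishes files before the task fails, while a very large one publishes nothing until the writer is closed.

```python
from collections import OrderedDict


class ToyPartitionedWriter:
    """Toy stand-in for a partitioned writer with an LRU cache of open files."""

    def __init__(self, visible, cache_limit):
        self.visible = visible            # list standing in for the temporary dataset directory
        self.cache_limit = cache_limit    # max number of files kept open (assumed >= 1)
        self.open_files = OrderedDict()   # partition -> records buffered in the open file

    def write(self, partition, record):
        if partition in self.open_files:
            self.open_files.move_to_end(partition)
        else:
            self.open_files[partition] = []
            if len(self.open_files) > self.cache_limit:
                # Evicting the least recently used file closes it,
                # which (in this model) makes it visible immediately.
                _, records = self.open_files.popitem(last=False)
                self.visible.append(records)
        self.open_files[partition].append(record)

    def close(self):
        # Closing the overall writer closes, and so publishes, every open file.
        while self.open_files:
            _, records = self.open_files.popitem(last=False)
            self.visible.append(records)


def run_attempt(visible, records, cache_limit, fail_after=None):
    """Run one task attempt; return False if it crashed before closing the writer."""
    writer = ToyPartitionedWriter(visible, cache_limit)
    for i, (partition, record) in enumerate(records):
        if fail_after is not None and i == fail_after:
            return False  # simulated crash: still-open files are abandoned
        writer.write(partition, record)
    writer.close()
    return True


def run_with_retry(records, cache_limit, fail_after):
    """Fail the first attempt at index fail_after, retry once, return visible records."""
    visible = []
    if not run_attempt(visible, records, cache_limit, fail_after):
        run_attempt(visible, records, cache_limit)
    return sorted(r for published_file in visible for r in published_file)


RECORDS = [("a", 1), ("b", 2), ("c", 3), ("a", 4)]

small_cache = run_with_retry(RECORDS, cache_limit=1, fail_after=3)
huge_cache = run_with_retry(RECORDS, cache_limit=10**9, fail_after=3)
```

      In this model, small_cache contains records 1 and 2 twice, because the first attempt had already published them when it failed, while huge_cache contains each record exactly once. The model says nothing about the memory or file-handle cost of keeping every file open, which a real fix would have to weigh.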

          People

            Assignee: Unassigned
            Reporter: Ryan Blue (rdblue)
            Votes: 0
            Watchers: 2