  Sqoop / SQOOP-2055

Run only one map task attempt during export

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.5
    • Fix Version/s: 1.4.6
    • Component/s: None
    • Labels:
      None

      Description

      While investigating several user issues, I noticed that our documentation states that a failed export mapper fails the entire job:

      If an export map task fails due to these or other reasons, it will cause the export job to fail. The results of a failed export are undefined. Each export map task operates in a separate transaction. Furthermore, individual map tasks commit their current transaction periodically. If a task fails, the current transaction will be rolled back. Any previously-committed transactions will remain durable in the database, leading to a partially-complete export.

      This is not the observed behavior, however: MapReduce re-runs a failed mapper (up to 3 times) before failing the job. This is confusing when investigating failures, because most often one has to go to the first failed attempt and ignore the rest, as the later attempts usually fail on unrelated issues (e.g. key constraint violations on rows that an earlier attempt already committed).
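The effect described above can be reproduced with a toy model. The sketch below is not Sqoop code: `ExportRetryDemo`, `Table`, and `runAttempt` are hypothetical names, and an in-memory set stands in for a database table with a primary key. It assumes a task that commits every few rows and a retry that replays the same input split from the start.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of an export map task and its retry; illustrative only. */
final class ExportRetryDemo {

    /** Stand-in for a database table with a primary-key constraint. */
    static final class Table {
        private final Set<Integer> committed = new HashSet<>();
        private final List<Integer> pending = new ArrayList<>();

        void insert(int key) {
            if (committed.contains(key) || pending.contains(key)) {
                throw new IllegalStateException("duplicate key " + key);
            }
            pending.add(key);
        }

        void commit() { committed.addAll(pending); pending.clear(); }
        void rollback() { pending.clear(); }
        int size() { return committed.size(); }
    }

    /**
     * Runs one attempt over the whole split, committing every batchSize rows.
     * badRow simulates the row that causes the original failure.
     * Returns "ok" or the message of the error that ended the attempt.
     */
    static String runAttempt(Table table, int[] rows, int batchSize, int badRow) {
        try {
            int inBatch = 0;
            for (int row : rows) {
                if (row == badRow) {
                    throw new IllegalArgumentException("cannot parse row " + row);
                }
                table.insert(row);
                if (++inBatch == batchSize) {
                    table.commit();   // periodic commit: these rows stay durable
                    inBatch = 0;
                }
            }
            table.commit();
            return "ok";
        } catch (RuntimeException e) {
            table.rollback();         // only the current transaction is undone
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        Table table = new Table();
        int[] split = {1, 2, 3, 4, 5};
        // Attempt 1 fails on row 5 after rows 1..4 were committed in two batches.
        System.out.println("attempt 1: " + runAttempt(table, split, 2, 5));
        // Attempt 2 replays the split and trips over the already-committed row 1.
        System.out.println("attempt 2: " + runAttempt(table, split, 2, 5));
    }
}
```

In this model the second attempt never reaches the row that caused the real failure, so its error message says nothing about the root cause. Only the first attempt's log does.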

      It seems that some connectors are smart enough to either suggest that the user configure MapReduce accordingly or to do it automatically (PGDump, OraOop). I propose applying this behavior to every export job, as running only one map task attempt seems a more reasonable default for exports.

      This might have a side effect on more advanced connectors whose map attempts are idempotent (e.g. they use temporary tables per map attempt or a similar facility): we would stop re-running their failed attempts automatically, and those connectors would have to re-enable retries on their own.
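A minimal sketch of the proposed default, with an escape hatch for idempotent connectors. This is not the attached patch. A plain `Map` stands in for the job configuration so the example runs without Hadoop on the classpath. `mapreduce.map.maxattempts` is the Hadoop 2 property that limits attempts per map task. The override key `example.export.map.maxattempts` and the class `ExportAttemptsPolicy` are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a "single attempt unless asked otherwise" policy; illustrative only. */
final class ExportAttemptsPolicy {
    // Hadoop 2 property limiting the number of attempts per map task.
    static final String MAP_MAX_ATTEMPTS = "mapreduce.map.maxattempts";
    // Hypothetical opt-in key for connectors whose attempts are idempotent.
    static final String EXPORT_OVERRIDE = "example.export.map.maxattempts";

    /** Forces one map attempt unless the override key asks for more. */
    static void apply(Map<String, String> conf) {
        int attempts = 1;
        String override = conf.get(EXPORT_OVERRIDE);
        if (override != null) {
            // Never go below one attempt, whatever the override says.
            attempts = Math.max(1, Integer.parseInt(override.trim()));
        }
        conf.put(MAP_MAX_ATTEMPTS, Integer.toString(attempts));
    }

    public static void main(String[] args) {
        Map<String, String> plainExport = new HashMap<>();
        apply(plainExport);
        System.out.println(plainExport.get(MAP_MAX_ATTEMPTS));

        Map<String, String> idempotentConnector = new HashMap<>();
        idempotentConnector.put(EXPORT_OVERRIDE, "4");
        apply(idempotentConnector);
        System.out.println(idempotentConnector.get(MAP_MAX_ATTEMPTS));
    }
}
```

Against a real job, the same idea would presumably be expressed with `Configuration.setInt` on the job's configuration before submission. Where exactly that call belongs in Sqoop is not stated in this issue.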

        Attachments

        1. SQOOP-2055.patch
          2 kB
          Jarek Jarcec Cecho
        2. SQOOP-2055.patch
          2 kB
          Jarek Jarcec Cecho

              People

              • Assignee:
                Jarek Jarcec Cecho (jarcec)
              • Reporter:
                Jarek Jarcec Cecho (jarcec)
              • Votes:
                0
              • Watchers:
                5
