  Sqoop / SQOOP-1532

Sqoop2: Support Sqoop on Spark Execution Engine

    Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:
      None

      Description

      MapReduce (MR) is currently the only execution engine that Sqoop supports.

      The goal of this ticket is to let Sqoop jobs (map-only and map+reduce) run on Spark.

      At a minimum, it should support running on a standalone Spark cluster, and subsequently work with YARN and Mesos.

      High-level goals
      1. Hook up with the connector APIs to provide basic load/extract to the Spark RDD.
      2. Implement the Sqoop RDD to support extraction from different data sources. The design proposal will discuss the alternatives for achieving this.
      3. Optimize loading/writing (reuse or refactor the consumer thread code so that it is agnostic of the Hadoop output format).
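
      The partition, extract, and load flow behind these goals can be sketched roughly as follows. This is only an illustration in plain Java with no Sqoop or Spark dependency: the class SqoopOnSparkSketch, its Partition type, and the methods partition and runMapOnly are hypothetical names invented for this sketch, not connector or Spark APIs, and the actual design is left to the design proposal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch: none of these types are Sqoop or Spark APIs.
final class SqoopOnSparkSketch {

    // A contiguous slice of the source, playing the role of one RDD partition.
    static final class Partition {
        final int start;
        final int end; // exclusive

        Partition(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Split the row range [0, totalRows) into at most numPartitions slices.
    static List<Partition> partition(int totalRows, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        List<Partition> parts = new ArrayList<>();
        int size = (totalRows + numPartitions - 1) / numPartitions; // ceiling division
        for (int start = 0; start < totalRows; start += size) {
            parts.add(new Partition(start, Math.min(start + size, totalRows)));
        }
        return parts;
    }

    // Map-only job: extract each partition and hand its records to the loader.
    // Returns the total number of records moved.
    static int runMapOnly(List<Partition> parts,
                          Function<Partition, List<String>> extractor,
                          Consumer<List<String>> loader) {
        int total = 0;
        for (Partition p : parts) {
            List<String> records = extractor.apply(p);
            loader.accept(records);
            total += records.size();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> sink = new ArrayList<>(); // stands in for the target system
        int moved = runMapOnly(
            partition(10, 3),
            p -> {
                // Stand-in extractor: fabricate one record per row index.
                List<String> rows = new ArrayList<>();
                for (int i = p.start; i < p.end; i++) {
                    rows.add("row-" + i);
                }
                return rows;
            },
            sink::addAll);
        System.out.println(moved + " records loaded");
    }
}
```

      On Spark, the sequential loop in runMapOnly would presumably be replaced by an operation over an RDD of partitions, so that each partition is extracted and loaded on an executor. Passing the loader in as a plain callback is one way to keep the write path independent of the Hadoop output format, as goal 3 asks.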

        Attachments

        1. SQOOP-1532.patch
          825 kB
          Enrique Ruiz Garcia

          Activity

            People

            • Assignee:
              vybs Veena Basavaraj
              Reporter:
              vybs Veena Basavaraj
            • Votes:
              9
              Watchers:
              19

              Dates

              • Created:
                Updated: