  1. Sqoop (Retired)
  2. SQOOP-1532

Sqoop2: Support Sqoop on Spark Execution Engine

Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • None
    • 2.0.0
    • None
    • None

    Description

      The execution engine currently supported in Sqoop is MapReduce (MR).

      The goal of this ticket is to support running Sqoop jobs (map-only and map+reduce) in a Spark environment.

      At a minimum, it should support running on a standalone Spark cluster, and subsequently work with YARN and Mesos.
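
For illustration only, the deployment targets named above differ mainly in the master URL handed to Spark. The sketch below assumes a hypothetical helper, `master_url`, that is not part of Sqoop or Spark; the URL shapes themselves (`spark://HOST:PORT`, `mesos://HOST:PORT`, `yarn`, `local[*]`) follow Spark's documented master URL formats. Older Spark releases used `yarn-client` / `yarn-cluster` instead of plain `yarn`.

```python
from typing import Optional

# Conventional default ports for the standalone master and the Mesos master.
DEFAULT_PORTS = {"standalone": 7077, "mesos": 5050}


def master_url(mode: str, host: Optional[str] = None, port: Optional[int] = None) -> str:
    """Hypothetical helper: build a Spark master URL for a deployment mode."""
    if mode == "local":
        return "local[*]"  # run in-process, one worker thread per core
    if mode == "yarn":
        return "yarn"  # cluster location is read from the Hadoop configuration
    if mode == "standalone":
        scheme = "spark"
    elif mode == "mesos":
        scheme = "mesos"
    else:
        raise ValueError(f"unknown deployment mode: {mode}")
    if host is None:
        raise ValueError(f"{mode} mode needs a master host")
    return f"{scheme}://{host}:{port or DEFAULT_PORTS[mode]}"
```

A job launcher could start with `master_url("standalone", "master1")` and later switch to `master_url("yarn")` without touching the rest of the job code.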

      High-level goals
      1. Hook up with the connector APIs to provide basic load/extract for the Spark RDD.
      2. Implement the Sqoop RDD to support extraction from different data sources. The design proposal will discuss the alternatives for achieving this.
      3. Optimize loading/writing (reuse or refactor the consumer thread code so that it is agnostic of the Hadoop output format).
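
The three goals above can be pictured with a small, engine-neutral sketch. It is not Sqoop's connector API and it does not use Spark: `partition_source`, `extract`, `load`, and `run_job` are hypothetical names, a partition is modelled as a plain range of row ids, and the partitions are processed one after another. On Spark, the per-partition step would presumably run inside something like `RDD.mapPartitions` or `RDD.foreachPartition`. The point of `load` is that the consumer thread only sees a generic `sink` callable, so nothing in it depends on a Hadoop output format.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, List, Tuple

_DONE = object()  # sentinel that tells the consumer thread to stop


def partition_source(total_rows: int, num_partitions: int) -> List[range]:
    """Split [0, total_rows) into contiguous ranges, at most num_partitions of them."""
    if total_rows <= 0:
        return []
    step = -(-total_rows // max(1, num_partitions))  # ceiling division
    return [range(lo, min(lo + step, total_rows)) for lo in range(0, total_rows, step)]


def extract(partition: range) -> Iterator[str]:
    """Stand-in extractor: yields one fake record per row id in the partition."""
    for row_id in partition:
        yield f"row-{row_id}"


def load(records: Iterable[str], sink: Callable[[str], None], capacity: int = 4) -> int:
    """Pass records through a bounded queue to a consumer thread that calls sink.

    Error handling is omitted: a sink that raises would stall the producer.
    """
    buffer: "queue.Queue[object]" = queue.Queue(maxsize=capacity)
    written = 0

    def consume() -> None:
        nonlocal written
        while True:
            item = buffer.get()
            if item is _DONE:
                return
            sink(item)
            written += 1

    worker = threading.Thread(target=consume)
    worker.start()
    for record in records:
        buffer.put(record)  # blocks while the queue is full (back-pressure)
    buffer.put(_DONE)
    worker.join()
    return written


def run_job(total_rows: int, num_partitions: int) -> Tuple[List[str], List[int]]:
    """Extract and load every partition; return the written records and per-partition counts."""
    out: List[str] = []
    counts = [load(extract(p), out.append) for p in partition_source(total_rows, num_partitions)]
    return out, counts
```

For example, `run_job(10, 3)` splits ten rows into ranges of 4, 4, and 2 rows and writes all ten records to the in-memory list that stands in for the target system.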

      Attachments

        1. SQOOP-1532.patch
          825 kB
          Enrique Ruiz Garcia

        Activity

          People

            Assignee: vybs Veena Basavaraj
            Reporter: vybs Veena Basavaraj
            Votes: 6
            Watchers: 13

            Dates

              Created:
              Updated: