Sqoop (Retired) / SQOOP-1529 Kite Connector Support / SQOOP-1588

Sqoop2: Kite connector write data to HDFS


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.99.5
    • Component/s: sqoop2-kite-connector
    • Labels: None

    Description

      Create a basic Kite connector that can write data (e.g. from a JDBC connection) to HDFS.

      The scope is defined as follows:

      • Destination: HDFS
      • File Format: Avro, Parquet and CSV
      • Compression Codec: Use default
      • Partitioner Strategy: Not supported
      • Column Mapping: Not supported

      Exposed Configuration:

      • [Link] File Format (Enum)
      • [To] Dataset URI (String, with a validation check; see the configuration sketch below)
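
      A minimal sketch of these two inputs, assuming plain Java classes (the class and field
      names below are illustrative, not the actual config classes from the patch):

          public class KiteConnectorConfigSketch {

            /** Supported output file formats, per the scope above. */
            public enum FileFormat { AVRO, PARQUET, CSV }

            /** [Link] configuration: the output file format. */
            public static class LinkConfig {
              public FileFormat fileFormat = FileFormat.AVRO;
            }

            /** [To] job configuration: the target Kite dataset URI. */
            public static class ToJobConfig {
              public String uri;

              /**
               * The validation check mentioned above. Kite dataset URIs take the form
               * "dataset:<scheme>:...", e.g. "dataset:hdfs://namenode:8020/path/ns/table",
               * and the destination here is HDFS.
               */
              public boolean isUriValid() {
                return uri != null && uri.startsWith("dataset:hdfs:");
              }
            }
          }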

      Workflow:

      • Create a link to Kite Connector
      • Create a job with valid configuration (see above)
      • Start the job. KiteToInitializer will check whether the target dataset already exists.
      • Sqoop will create N KiteLoader instances.
      • Kite requires an Avro schema for data manipulation, so KiteLoader will create an Avro schema from the Sqoop schema provided by LoaderContext. As Sqoop schema types are not identical to Avro types, some types will be mapped. The original Sqoop type information will be kept as SqoopType in the schema field, which can be used for a reversed type mapping (see the schema-mapping sketch after this list).
      • KiteLoader will create a temporary dataset and write data records into it. If any error occurs, the temporary dataset will be deleted.
      • KiteToDestroyer will merge all temporary datasets into one dataset (see the loader/destroyer sketch after this list).
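
      The schema-mapping step can be pictured with the plain Avro SchemaBuilder API. This is
      only a sketch: the column names, the chosen type mappings and the "sqoopType" property
      name are assumptions, not taken from the patch.

          import org.apache.avro.Schema;
          import org.apache.avro.SchemaBuilder;

          public class SqoopToAvroSketch {
            /**
             * Builds an Avro record schema for two hypothetical columns. A type without an
             * exact Avro counterpart (e.g. a Sqoop DATE) is mapped to a close Avro type,
             * while the original Sqoop type is stored as a field property so that a reversed
             * mapping remains possible.
             */
            public static Schema buildExampleSchema() {
              return SchemaBuilder.record("sqoop_import").fields()
                  .name("id").prop("sqoopType", "FIXED_POINT")
                      .type().longType().noDefault()
                  .name("created_at").prop("sqoopType", "DATE")  // carried as a string
                      .type().stringType().noDefault()
                  .endRecord();
            }
          }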
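
      For the KiteLoader / KiteToDestroyer steps, the Kite SDK generic dataset API can be used
      roughly as below. This is a simplified sketch assuming GenericRecord payloads and Parquet
      output; the temporary-URI handling and the merge-by-copy strategy are illustrative, not
      necessarily what the patch implements.

          import org.apache.avro.Schema;
          import org.apache.avro.generic.GenericRecord;
          import org.kitesdk.data.Dataset;
          import org.kitesdk.data.DatasetDescriptor;
          import org.kitesdk.data.DatasetReader;
          import org.kitesdk.data.DatasetWriter;
          import org.kitesdk.data.Datasets;
          import org.kitesdk.data.Formats;

          public class KiteWriteSketch {

            /** Loader side: write one partition of records into a temporary dataset. */
            public static void writeTemporary(String tempUri, Schema schema,
                                              Iterable<GenericRecord> records) {
              DatasetDescriptor descriptor = new DatasetDescriptor.Builder()
                  .schema(schema)
                  .format(Formats.PARQUET)      // or Formats.AVRO / Formats.CSV
                  .build();
              Dataset<GenericRecord> temp =
                  Datasets.create(tempUri, descriptor, GenericRecord.class);
              DatasetWriter<GenericRecord> writer = temp.newWriter();
              try {
                for (GenericRecord record : records) {
                  writer.write(record);
                }
              } catch (RuntimeException e) {
                writer.close();
                Datasets.delete(tempUri);       // drop the temporary dataset on error
                throw e;
              }
              writer.close();
            }

            /** Destroyer side: copy every temporary dataset into the final one, then clean up. */
            public static void mergeInto(String finalUri, Iterable<String> tempUris) {
              // Assumes the final dataset has already been created with the same descriptor.
              Dataset<GenericRecord> target = Datasets.load(finalUri, GenericRecord.class);
              DatasetWriter<GenericRecord> writer = target.newWriter();
              for (String tempUri : tempUris) {
                Dataset<GenericRecord> temp = Datasets.load(tempUri, GenericRecord.class);
                DatasetReader<GenericRecord> reader = temp.newReader();
                while (reader.hasNext()) {
                  writer.write(reader.next());
                }
                reader.close();
                Datasets.delete(tempUri);
              }
              writer.close();
            }
          }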

      Further features will be implemented in follow-up JIRAs.

      Attachments

        1. SQOOP-1588.5.patch (68 kB, Qian Xu)


      People

        Assignee: Qian Xu (stanleyxu2005)
        Reporter: Qian Xu (stanleyxu2005)
        Votes: 0
        Watchers: 6
