  1. Sqoop
  2. SQOOP-1350

Sqoop2: Support all supported data types in the CSV Intermediate Data Format implementation

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.99.3
    • Fix Version/s: 1.99.5
    • Component/s: sqoop2-framework
    • Labels:
      None

      Description

      Original IDF proposal on the column types to support in the IDF Schema:
      http://issues.apache.org/jira/secure/attachment/12589331/Sqoop2Datatypes.pdf

      The updated design doc is at:
      https://cwiki.apache.org/confluence/display/SQOOP/Intermediate+Data+Format+API

      1. SQOOP-1709.patch
        29 kB
        Veena Basavaraj

        Issue Links

        1. Sqoop2: rename Type to Column Type + size to length + minor doc fixes (Sub-task, Resolved, Veena Basavaraj)
        2. Sqoop2: Allow null as a dummy Schema (Sub-task, Resolved, Veena Basavaraj)
        3. Sqoop2: Remove Data class (Sub-task, Resolved, Veena Basavaraj)
        4. Sqoop2: Use configurable writable to get Intermediate Data Format (Sub-task, Resolved, Veena Basavaraj)
        5. Add IDF API doc/wiki for the IDF interface and Schema -> ColumnTypes (Sub-task, Resolved, Veena Basavaraj)
        6. SQOOP2: Address the validate method in Column class (Sub-task, Resolved, Veena Basavaraj)
        7. Rename Unsupported Column type to Unknown and add java doc (Sub-task, Resolved, Veena Basavaraj)
        8. Column Type enhancements for complex types (Sub-task, Resolved, Veena Basavaraj)
        9. Make name for column required ( fix the corr tests) (Sub-task, Resolved, Veena Basavaraj)
        10. Add Options as a field in the Enum object ( so it can be used for validation) (Sub-task, Resolved, Veena Basavaraj)
        11. Support List Type in CSV IDF (Sub-task, Resolved, Veena Basavaraj)
        12. Support Map Type in CSV IDF (Sub-task, Resolved, Veena Basavaraj)
        13. Sqoop2: Remove Data class from docs (Sub-task, Resolved, Veena Basavaraj)
        14. Sqoop2: Unit tests for different Column sub classes Array/Set and Map types (Sub-task, Resolved, Veena Basavaraj)
        15. Sqoop2: Time/Timestamp format support for CSV IDF (Sub-task, Resolved, Veena Basavaraj)
        16. Fix Enum to no inherit from list (Sub-task, Resolved, Veena Basavaraj)
        17. Investigation CSV IDF FORMAT of the Array/NestedArray/ Set/ Map in Postgres and HIVE. (Sub-task, Resolved, Veena Basavaraj)
        18. Sqoop2: IDF API changes (Sub-task, Resolved, Veena Basavaraj)
        19. Sqoop2: Add SqoopIDFUtils class and unit tests (Sub-task, Resolved, Veena Basavaraj)
        20. Sqoop2: Date and DateTime is not encoded in Single Quotes (Sub-task, Resolved, Veena Basavaraj)
        21. Support Enum in CSVIDF ( + add unit tests) (Sub-task, Resolved, Veena Basavaraj)
        22. Sqoop2: Handle NULLs for all types in CSV Intermediate Data Format (Sub-task, Resolved, Veena Basavaraj)
        23. Sqoop2: Update CSVIntermediate BIT data type (Sub-task, Resolved, Veena Basavaraj)
        24. Sqoop2: Define IDF object model (Sub-task, Resolved, Veena Basavaraj)
        25. Sqoop2: DateTime support in CSV IDF and iso8601 (Sub-task, Resolved, Veena Basavaraj)
        26. Sqoop2: Make DateTime Column type support datetime with and without timezone (Sub-task, Resolved, Veena Basavaraj)
        27. Using JODA for datetime means we only have 3 digit millisecond representation for fraction (Sub-task, Resolved, Veena Basavaraj)

          Activity

          vybs Veena Basavaraj added a comment -

          Is there a list or a wiki I can use as a reference?

          vybs Veena Basavaraj added a comment -

          Now moved to 1.99.5

          vybs Veena Basavaraj added a comment (edited) -

          Types not yet supported in the CSV IDF:

          1. MAP
          2. SET
          3. ENUM
          4. ARRAY
          5. TIME
          6. Something called Unsupported!

          Amendment, with further additions:

          • NULL is not handled for all types.
          • The object model for Date and DateTime is JODA, so this protocol should be made standard.
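
To make the NULL and Date points concrete, here is a minimal sketch of a CSV text encoding for one row. It is not Sqoop's implementation: the class `CsvRowSketch`, the methods `encodeField` and `encodeRow`, the unquoted `NULL` marker, and the backslash escaping are all assumptions made for illustration, and `java.time.LocalDate` stands in for the JODA object model so that the example has no external dependencies.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CsvRowSketch {
    // Hypothetical helper (not a Sqoop API): encode one field as CSV text.
    static String encodeField(Object value) {
        if (value == null) {
            return "NULL"; // assumed unquoted marker for a missing value
        }
        if (value instanceof String) {
            // Escape backslashes first, then the single-quote delimiter.
            String escaped = ((String) value).replace("\\", "\\\\").replace("'", "\\'");
            return "'" + escaped + "'";
        }
        if (value instanceof LocalDate) {
            // LocalDate.toString() yields an ISO-8601 date; wrap it in single quotes.
            return "'" + value + "'";
        }
        return String.valueOf(value); // numbers and booleans as plain text
    }

    // Join the encoded fields of one row with commas.
    static String encodeRow(List<Object> row) {
        return row.stream().map(CsvRowSketch::encodeField).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Arrays.asList is used because it permits null elements.
        List<Object> row = Arrays.asList(10, "it's", null, LocalDate.of(2014, 10, 1));
        System.out.println(encodeRow(row)); // prints 10,'it\'s',NULL,'2014-10-01'
    }
}
```

Under these assumptions, a reader can tell a missing value (`NULL`) apart from the string `'NULL'`, because only the string is quoted.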

          jarcec Jarek Jarcec Cecho added a comment -

          Just for reference: we explored the datatypes in SQOOP-515, and the IDF test representation is described on the wiki. However, we have gaps (especially in the nested types) that we should figure out, and Veena Basavaraj has subtasks for that.


            People

            • Assignee: Veena Basavaraj (vybs)
            • Reporter: Abraham Elmahrek (abec)
            • Votes: 0
            • Watchers: 3
