  1. Phoenix
  2. PHOENIX-5258

Add support to parse header from the input CSV file as input columns for CsvBulkLoadTool

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.15.1, 5.1.1, 4.16.1
    • Component/s: None
    • Labels:
      None

      Description

      Currently, CsvBulkLoadTool does not support reading a header from the input CSV file; it expects the columns in the file to match the table schema. Header support can be added so that the input columns are mapped to the schema dynamically, based on the header.

      The proposed solution is to add a new option to the tool, `--parse-header`. If this option is passed, the list of input columns is built by reading the first line of the input CSV file.

      • If there is only one file, read the header from the first line and generate the `ColumnInfo` list.
      • If there are multiple files, read the header from all the files, and throw an error if the headers across files do not match.
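
The two cases above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class and method names (`CsvHeaderSketch`, `readHeader`, `resolveHeader`) are hypothetical, it reads local files through `java.nio` rather than the Hadoop `FileSystem` API the tool would use, and it splits on a plain delimiter without handling quoted fields. Turning the resulting column names into a `ColumnInfo` list would additionally need the table metadata and is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not taken from the patch.
class CsvHeaderSketch {

    /** Reads the first line of a file and splits it into trimmed column names. */
    static List<String> readHeader(Path file, String delimiter) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String firstLine = reader.readLine();
            if (firstLine == null) {
                throw new IllegalArgumentException("Empty input file, no header: " + file);
            }
            List<String> columns = new ArrayList<>();
            // Naive split: assumes the header contains no quoted delimiters.
            for (String name : firstLine.split(Pattern.quote(delimiter), -1)) {
                columns.add(name.trim());
            }
            return columns;
        }
    }

    /**
     * Returns the header shared by all input files.
     * With a single file this is simply its first line; with several files,
     * every header must equal the first one or an exception is thrown.
     */
    static List<String> resolveHeader(List<Path> files, String delimiter) throws IOException {
        if (files.isEmpty()) {
            throw new IllegalArgumentException("No input files given");
        }
        List<String> expected = null;
        for (Path file : files) {
            List<String> header = readHeader(file, delimiter);
            if (expected == null) {
                expected = header;
            } else if (!expected.equals(header)) {
                throw new IllegalArgumentException(
                    "Header mismatch in " + file + ": expected " + expected + " but found " + header);
            }
        }
        return expected;
    }
}
```

In this sketch the comparison is order-sensitive and case-sensitive. Whether the tool should treat reordered or differently cased headers as matching is a design decision the description leaves open.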

        Attachments

        1. PHOENIX-5258-master.patch
          19 kB
          Prashant Vithani
        2. PHOENIX-5258-master.001.patch
          26 kB
          Prashant Vithani
        3. PHOENIX-5258-4.x-HBase-1.4.patch
          19 kB
          Prashant Vithani
        4. PHOENIX-5258-4.x-HBase-1.4.001.patch
          26 kB
          Prashant Vithani

              People

              • Assignee:
                prvithani Prashant Vithani
                Reporter:
                prvithani Prashant Vithani
              • Votes:
                0
                Watchers:
                5

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  40m