  1. Apache Drill
  2. DRILL-7001

Documentation - renaming of column names in CSV header

    Details

    • Type: Wish
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.15.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      I don't know whether this is the best place for this request, but:

      When querying a csvh file (a CSV file with a header), Drill applies several operations that can change the column names.
      These operations are not documented.
      Although it is possible to read HeaderBuilder.java, it would be useful to add a section to the documentation that explains at least the principle behind these cases, to avoid needless problems and difficulties.

      List of operations (possibly not exhaustive):

      • Leading and trailing whitespace is trimmed from each CSV column name
         Name , Age,PoB  , Info
        =>
        `Name`, `Age`, `PoB` and `Info`
      • Characters other than [a-zA-Z0-9_] are replaced by '_' (underscore)
        Name,Sum$,em@il
        =>
        `Name`, `Sum_`, `em_il`
      • Field names starting with '_' (underscore) are prefixed with 'col'
        _name,_age_,pob_,_col_
        =>
        `col_name`, `col_age_`, `pob_`, `col_col_`
      • Field names starting with [^a-zA-Z] are prefixed with 'col_'
        0_name, 1_age,@pob,#other1,'other2'
        =>
        `col_0_name`, `col_1_age`, `col_pob`, `col_other1`, `col_other2_`
      • Quotation marks are removed
      • If the name is a single character:
        • if [a-zA-Z], do nothing
        • else if [0-9], prefix with col_
        • otherwise, rename to column_[0-9]+, where [0-9]+ is the position of the column
      • Duplicate column names (case-insensitive) are suffixed with _[0-9]+ (starting from "_2")
        0_name,col_0_name,colx,COLX,colx,colx_2
        =>
        `col_0_name`, `col_0_name_2`, `colx`, `COLX_2`, `colx_3`, `colx_2_2`
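
The rules above can be sketched in a few lines of Python. This is only an illustration of the behaviour as listed in this issue, not a port of HeaderBuilder.java. The function name `rename_headers` is hypothetical. The 1-based position used for `column_N`, the treatment of an empty name, and the choice to check the single-character rule before the prefix rules are assumptions. Quote removal is left to the CSV parser and is not modelled here.

```python
import re


def rename_headers(raw_names):
    """Apply the renaming rules listed in this issue to raw header fields.

    Hypothetical helper for illustration; not Drill's implementation.
    """
    result = []
    seen = set()  # lower-cased names already emitted (case-insensitive check)
    for position, raw in enumerate(raw_names, start=1):
        name = raw.strip()  # rule: trim surrounding whitespace
        if len(name) <= 1 and not re.fullmatch(r"[A-Za-z0-9]", name):
            # Single non-alphanumeric character (or, by assumption, an empty
            # name): fall back to a positional name. 1-based is an assumption.
            name = f"column_{position}"
        else:
            # rule: anything outside [a-zA-Z0-9_] becomes an underscore
            name = re.sub(r"[^A-Za-z0-9_]", "_", name)
            if name[0] == "_":
                name = "col" + name   # "_name" -> "col_name"
            elif name[0].isdigit():
                name = "col_" + name  # "0_name" -> "col_0_name"
        # rule: case-insensitive duplicates get _2, _3, ... until unique
        candidate = name
        suffix = 2
        while candidate.lower() in seen:
            candidate = f"{name}_{suffix}"
            suffix += 1
        seen.add(candidate.lower())
        result.append(candidate)
    return result


if __name__ == "__main__":
    # Naive split on commas, good enough for the examples in this issue.
    print(rename_headers("0_name,col_0_name,colx,COLX,colx,colx_2".split(",")))
```

Run against the duplicate example from the list, this sketch yields `col_0_name`, `col_0_name_2`, `colx`, `COLX_2`, `colx_3`, `colx_2_2`, which matches the result reported above. Renamed columns are themselves added to the set of seen names, which is why the final `colx_2` becomes `colx_2_2`.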

       

            People

            • Assignee:
              Unassigned
            • Reporter:
              benj641 benj
            • Votes:
              0
            • Watchers:
              2
