  1. Apache Drill
  2. DRILL-7308

Incorrect Metadata from text file queries

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.17.0
    • Fix Version/s: None
    • Component/s: Metadata
    • Labels:
      None
    • Flags:
      Important

      Description

      I'm seeing some strange behavior with the newest version of Drill. If you query a CSV file with an extracted header row (.csvh), you get the following metadata:

      SELECT * FROM dfs.test.`domains.csvh` LIMIT 1
      
      {
        "queryId": "22eee85f-c02c-5878-9735-091d18788061",
        "columns": [
          "domain"
        ],
        "rows": [
          {
            "domain": "thedataist.com"
          }
        ],
        "metadata": [
          "VARCHAR(0, 0)",
          "VARCHAR(0, 0)"
        ],
        "queryState": "COMPLETED",
        "attemptedAutoLimit": 0
      }
      

      There are two issues here:

      1.  VARCHAR is now reported with a precision, as VARCHAR(0, 0).
      2.  The metadata array has two entries, but the result has only one column.
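
A minimal sketch of how the second issue could be checked programmatically, assuming the response shown above is available as JSON text. The helper name `metadata_mismatch` is hypothetical and not part of Drill; the sample data is copied from the response in this report.

```python
import json

# Response body copied from the report above (with the "rows" array well-formed).
response_text = """
{
  "queryId": "22eee85f-c02c-5878-9735-091d18788061",
  "columns": ["domain"],
  "rows": [{"domain": "thedataist.com"}],
  "metadata": ["VARCHAR(0, 0)", "VARCHAR(0, 0)"],
  "queryState": "COMPLETED",
  "attemptedAutoLimit": 0
}
"""


def metadata_mismatch(response):
    """Hypothetical helper: return (column_count, metadata_count) if they differ, else None."""
    column_count = len(response["columns"])
    metadata_count = len(response["metadata"])
    if column_count != metadata_count:
        return (column_count, metadata_count)
    return None


response = json.loads(response_text)
mismatch = metadata_mismatch(response)
print(mismatch)  # (1, 2): one column, two metadata entries
```

With a correct response, the helper would be expected to return None, since each column should have exactly one metadata entry.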

      Additionally, if you query a regular CSV file, where the header row is not extracted into column names, you get the following:

        "rows": [
          {
            "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]"
          }
        ],
        "metadata": [
          "VARCHAR(0, 0)",
          "VARCHAR(0, 0)"
        ],
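
As a rough illustration of the fragment above: the `columns` value is a string that itself holds a JSON-encoded array, so it needs a second decoding step. The sketch below assumes the fragment is wrapped in braces to make it a complete JSON object; that wrapping is an assumption made for illustration and is not taken from the actual response.

```python
import json

# Fragment from the report, wrapped in braces (an assumption) so it parses
# as a standalone JSON object. A raw string keeps the \" escapes intact.
fragment = r"""
{
  "rows": [
    {
      "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]"
    }
  ],
  "metadata": [
    "VARCHAR(0, 0)",
    "VARCHAR(0, 0)"
  ]
}
"""

response = json.loads(fragment)
row = response["rows"][0]

# The row exposes a single column named "columns" ...
row_column_count = len(row)

# ... whose value is a JSON array encoded as a string.
fields = json.loads(row["columns"])

print(row_column_count)           # 1
print(len(response["metadata"]))  # 2
print(fields)                     # ['ACCT_NUM', 'PRODUCT', 'MONTH', 'REVENUE']
```

Here too the row has one column while the metadata lists two entries, matching the mismatch seen with the .csvh query.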
      

        Attachments

        1. Screen Shot 2019-06-24 at 3.16.40 PM.png
          160 kB
          Charles Givre
        2. domains.csvh
          0.0 kB
          Charles Givre

            People

            • Assignee:
              Unassigned
              Reporter:
              cgivre Charles Givre
            • Votes:
              0
              Watchers:
              3
