Apache Drill / DRILL-7308

Incorrect Metadata from text file queries


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.17.0
    • Fix Version/s: None
    • Component/s: Metadata
    • Labels: None
    • Flags: Important

    Description

      I'm noticing some strange behavior with the newest version of Drill. If you query a CSV file with headers (a .csvh file), the response includes the following metadata:

      SELECT * FROM dfs.test.`domains.csvh` LIMIT 1
      
      {
        "queryId": "22eee85f-c02c-5878-9735-091d18788061",
        "columns": [
          "domain"
        ],
        "rows": [
          {
            "domain": "thedataist.com"
          }
        ],
        "metadata": [
          "VARCHAR(0, 0)",
          "VARCHAR(0, 0)"
        ],
        "queryState": "COMPLETED",
        "attemptedAutoLimit": 0
      }
      

      There are two issues here:

      1.  VARCHAR is now reported with precision, as VARCHAR(0, 0).
      2.  The metadata array has two entries, twice as many as the single column returned.
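Both issues can be checked mechanically against a response body. The following is a minimal sketch, not part of Drill: `check_metadata` and `VARCHAR_WITH_ARGS` are hypothetical names, the sample is the response above with the "rows" array repaired, and it assumes the expected result is one plain VARCHAR entry per column.

```python
import json
import re

# Response body copied from the report above (with the "rows" array repaired).
SAMPLE = json.loads("""
{
  "queryId": "22eee85f-c02c-5878-9735-091d18788061",
  "columns": ["domain"],
  "rows": [{"domain": "thedataist.com"}],
  "metadata": ["VARCHAR(0, 0)", "VARCHAR(0, 0)"],
  "queryState": "COMPLETED",
  "attemptedAutoLimit": 0
}
""")

# Matches a VARCHAR type string with explicit numeric arguments, e.g. "VARCHAR(0, 0)".
VARCHAR_WITH_ARGS = re.compile(r"^VARCHAR\(\s*\d+\s*(,\s*\d+\s*)?\)$")


def check_metadata(response):
    """Hypothetical helper: list the metadata problems found in a query response."""
    problems = []
    columns = response.get("columns", [])
    metadata = response.get("metadata", [])

    # Assumption: a correct response has exactly one metadata entry per column.
    if len(metadata) != len(columns):
        problems.append(
            f"{len(metadata)} metadata entries for {len(columns)} column(s)"
        )

    # Assumption: VARCHAR is expected without precision or scale arguments.
    for type_name in metadata:
        if VARCHAR_WITH_ARGS.match(type_name):
            problems.append(f"unexpected type arguments: {type_name}")

    return problems


if __name__ == "__main__":
    for problem in check_metadata(SAMPLE):
        print(problem)
```

Run against the sample, the helper reports the count mismatch once and flags each of the two VARCHAR(0, 0) entries.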

      Additionally, if you query a regular CSV, without the columns extracted, you get the following:

       "rows": [
         {
           "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]"
         }
       ],
       "metadata": [
         "VARCHAR(0, 0)",
         "VARCHAR(0, 0)"
       ],
      

      Attachments

        1. domains.csvh
          0.0 kB
          Charles Givre
        2. Screen Shot 2019-06-24 at 3.16.40 PM.png
          160 kB
          Charles Givre


          People

            Assignee: Unassigned
            Reporter: Charles Givre (cgivre)
            Votes: 0
            Watchers: 3
