Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-1408

SELECT column from CSV with JOIN returns null if not part of JOIN condition

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: Storage - Text & CSV
    • Labels:
      None
    • Environment:

      CentOS 6.5, MapR M7

      Description

      A SELECT for a column from a CSV file with a JOIN condition always returns null for columns not included in the JOIN condition. When querying each table separately, the values are returned as expected.

      Note that this works fine for any combination of JSON and Parquet, but fails when at least one of the files is CSV.

      Simple example with two small CSV files:

      beatles.csv:
      ---------------------------
      1,John,Lennon
      2,Paul,McCartney
      3,George,Harrison
      4,Ringo,Starr
      ----------------------------

      songs.csv:
      ----------------------------
      1,Help
      2,Yesterday
      3,Blue Jay Way
      4,Yellow Submarine
      ----------------------------

      This queries returns values as expected:

      SELECT columns[0] AS id, CONCAT(columns[1], ' ', columns[2]) AS singer FROM dfs.`beatles.csv`;

      This query returns 4 results, all with null values:

      SELECT S.columns[1] AS song, CONCAT(B.columns[1], ' ', B.columns[2]) AS singer
      FROM dfs.`beatles.csv` AS B
      INNER JOIN dfs.`songs.csv` AS S ON B.columns[0] = S.columns[0];

      The only columns that return non-null values are the ones from the JOIN condition (B.columns[0] and S.columns[0] in the following query):

      SELECT S.columns[1] AS song, CONCAT(B.columns[1], ' ', B.columns[2]) AS singer, S.columns[0] AS beatles_id, B.columns[0] AS id
      FROM dfs.`beatles.csv` AS B
      INNER JOIN dfs.`songs.csv` AS S ON B.columns[0] = S.columns[0];

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                DrillCommitter DrillCommitter
                Reporter:
                mdavis@liaison.com Mike Davis
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: