Apache Drill / DRILL-3551

CTAS from a complex JSON source with a schema change is not written (and hence not read back) correctly


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: Execution - Data Types
    • Labels:
      None

      Description

      The source data contains:

      20K rows of the form
      {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}

      200 rows of the form
      {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
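A minimal sketch for regenerating a file of this shape, in case the attachment is not at hand. It assumes "20K" means exactly 20,000 rows, that the 200 extended rows come last (as the field's value suggests), and that the file is newline-delimited JSON. The helper names `build_rows` and `write_file` are hypothetical and not part of Drill.

```python
import json

# Base record shape, copied from the description.
BASE = {"some": "yes", "others": {"other": "true", "all": "false", "sometimes": "yes"}}


def build_rows(n_base=20_000, n_extended=200):
    """Return JSON lines: n_base base rows, then n_extended rows whose
    nested map carries the extra "additional" field (ordering is assumed)."""
    extended = {
        "some": "yes",
        "others": dict(BASE["others"], additional="last entries only"),
    }
    compact = (",", ":")  # no spaces, matching the rows shown in the report
    rows = [json.dumps(BASE, separators=compact)] * n_base
    rows += [json.dumps(extended, separators=compact)] * n_extended
    return rows


def write_file(path="test.json"):
    """Write the rows to path, one JSON object per line."""
    with open(path, "w") as f:
        f.write("\n".join(build_rows()) + "\n")
```

Calling `write_file()` produces a `test.json` in the working directory that can be placed in a Drill workspace for the CTAS statement.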

      Creating a table from this file with CTAS and reading it back returns incorrect data:

      CREATE TABLE testparquet AS SELECT * FROM `test.json`;
      SELECT * FROM testparquet;

      This yields:

      yes {"other":"true","all":"false","sometimes":"yes"}
      yes {"other":"true","all":"false","sometimes":"yes"}
      yes {"other":"true","all":"false","sometimes":"yes"}
      yes {"other":"true","all":"false","sometimes":"yes"}

      The "additional" field is missing from every record returned.

      The Parquet metadata for the created file does not contain the "additional" field either.
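For comparison against the Parquet metadata, the set of leaf fields present in the JSON source can be computed independently of Drill. The following is only a sketch of such a check: `field_paths` and `merged_schema` are hypothetical helpers, the dotted-path notation is an illustration choice, and the two sample lines stand in for the full file.

```python
import json


def field_paths(record, prefix=""):
    """Yield a dotted path for every leaf field in a nested dict."""
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from field_paths(value, path + ".")
        else:
            yield path


def merged_schema(lines):
    """Return the union of leaf field paths across all JSON lines."""
    paths = set()
    for line in lines:
        paths.update(field_paths(json.loads(line)))
    return paths


# One row of each shape from the description.
sample = [
    '{"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}',
    '{"some":"yes","others":{"other":"true","all":"false","sometimes":"yes",'
    '"additional":"last entries only"}}',
]
print(sorted(merged_schema(sample)))
# ['others.additional', 'others.all', 'others.other', 'others.sometimes', 'some']
```

A table that preserved the source faithfully would be expected to expose `others.additional`; its absence from the written file is the symptom reported here.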

        Attachments

        1. DRILL-3551.json
          1.45 MB
          Parth Chandra

              People

              • Assignee: Chun Chang (cchang@maprtech.com)
              • Reporter: Parth Chandra (parthc)
              • Votes: 0
              • Watchers: 4
