Apache Drill / DRILL-3551

CTAS from complex JSON source with schema change is not written (and hence not read back) correctly


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: Execution - Data Types
    • Labels: None

    Description

      The source data contains:

      20K rows of the following form:
      {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}

      200 rows of the following form:
      {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
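A minimal sketch for regenerating comparable input, in case the attachment is not at hand. It assumes one JSON object per line and that the 200 rows carrying "additional" come after the 20K base rows (suggested by the value "last entries only", not stated outright). The function names `build_rows` and `write_file` are hypothetical helpers, not part of Drill.

```python
import json

# Base record shape, copied from the issue description.
BASE = {"some": "yes", "others": {"other": "true", "all": "false", "sometimes": "yes"}}


def build_rows(n_base=20_000, n_extended=200):
    """Return JSON lines: n_base base rows, then n_extended rows with 'additional'.

    Assumption: the extended rows are placed last, so the schema change
    appears late in the file.
    """
    extended = {
        "some": "yes",
        "others": dict(BASE["others"], additional="last entries only"),
    }
    compact = (",", ":")  # no spaces, matching the formatting in the issue
    rows = [json.dumps(BASE, separators=compact)] * n_base
    rows += [json.dumps(extended, separators=compact)] * n_extended
    return rows


def write_file(path="test.json"):
    """Write the generated rows to `path`, one JSON object per line."""
    with open(path, "w") as f:
        f.write("\n".join(build_rows()) + "\n")


if __name__ == "__main__":
    write_file()
```

Running the script writes `test.json` to the current directory, which can then be placed wherever the Drill workspace expects it.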

      Creating a table from this file and reading it back returns incorrect data:

      CREATE TABLE testparquet as select * from `test.json`;
      SELECT * from testparquet;

      Yields

      yes {"other":"true","all":"false","sometimes":"yes"}
      yes {"other":"true","all":"false","sometimes":"yes"}
      yes {"other":"true","all":"false","sometimes":"yes"}
      yes {"other":"true","all":"false","sometimes":"yes"}

      The "additional" field is missing from all records, including the 200 rows that had it in the source.

      The Parquet metadata for the created file does not contain the "additional" field either.
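One way to confirm what the source actually contains, independently of Drill, is to count how many rows use each distinct set of keys under "others". The sketch below assumes one JSON object per line; `others_key_sets` is a hypothetical helper name.

```python
import json
from collections import Counter


def others_key_sets(lines):
    """Count rows per distinct ordered tuple of keys found under 'others'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts[tuple(record.get("others", {}).keys())] += 1
    return counts


if __name__ == "__main__":
    # Assumes the source file from the issue is in the current directory.
    with open("test.json") as f:
        for keys, n in others_key_sets(f).items():
            print(n, keys)
```

For the data described in this issue, the source should show two key sets, one of them including "additional". The same check on a JSON export of the created table would show whether the field survived the round trip.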

      Attachments

        1. DRILL-3551.json
          1.45 MB
          Parth Chandra

            People

              cchang@maprtech.com Chun Chang
              parthc Parth Chandra
