Apache Drill / DRILL-1750

Querying directories with JSON files returns incomplete results

Details

    • Type: Bug
    • Status: Reviewable
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: Storage - JSON
    • Labels: None

    Description

      A select * query against a directory of JSON files returns only the fields that are common to all of the files. Querying each JSON file individually returns all of that file's fields. In some cases, querying the directory also crashes sqlline.

      The following example illustrates the issue:

      > select * from dfs.`/data/json/tmp/1.json`;
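      +---------------+--------------------+-----------------------------------------------+
      | artist        | track_id           | title                                         |
      +---------------+--------------------+-----------------------------------------------+
      | Jonathan King | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) |
      +---------------+--------------------+-----------------------------------------------+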
      1 row selected (1.305 seconds)

      > select * from dfs.`/data/json/tmp/2.json`;
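      +--------------+----------------------------+--------------------+-------------+
      | artist       | timestamp                  | track_id           | title       |
      +--------------+----------------------------+--------------------+-------------+
      | Supersuckers | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
      +--------------+----------------------------+--------------------+-------------+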
      1 row selected (0.105 seconds)

      > select * from dfs.`/data/json/tmp/3.json`;
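      +----------------------------+--------------------+-------------+
      | timestamp                  | track_id           | title       |
      +----------------------------+--------------------+-------------+
      | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
      +----------------------------+--------------------+-------------+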
      1 row selected (0.083 seconds)

      > select * from dfs.`/data/json/tmp/4.json`;
      +--------------------+-------------+
      | track_id           | title       |
      +--------------------+-------------+
      | TRAAAQN128F9353BA0 | Double Wide |
      +--------------------+-------------+
      1 row selected (0.076 seconds)

      > select * from dfs.`/data/json/tmp`;
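      +--------------------+-----------------------------------------------+
      | track_id           | title                                         |
      +--------------------+-----------------------------------------------+
      | TRAAAQN128F9353BA0 | Double Wide                                   |
      | TRAAAQN128F9353BA0 | Double Wide                                   |
      | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) |
      | TRAAAQN128F9353BA0 | Double Wide                                   |
      +--------------------+-----------------------------------------------+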
      4 rows selected (0.121 seconds)
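
The gap between the per-file results and the directory result can be reproduced outside Drill with a short sketch. It assumes each attached file holds a single JSON object whose fields match the per-file query output above (the actual attachments may differ), and the helper names are hypothetical. It does not call Drill; it only contrasts the union of fields, which a directory query would be expected to return, with the intersection of fields, which matches the columns observed.

```python
import json
import tempfile
from pathlib import Path

# Assumed file contents, reconstructed from the per-file query output.
# The real attachments (1.json .. 4.json) may differ.
RECORDS = {
    "1.json": {"artist": "Jonathan King", "track_id": "TRAAAEA128F935A30D",
               "title": "I'll Slap Your Face (Entertainment USA Theme)"},
    "2.json": {"artist": "Supersuckers", "timestamp": "2011-08-01 20:30:17.991134",
               "track_id": "TRAAAQN128F9353BA0", "title": "Double Wide"},
    "3.json": {"timestamp": "2011-08-01 20:30:17.991134",
               "track_id": "TRAAAQN128F9353BA0", "title": "Double Wide"},
    "4.json": {"track_id": "TRAAAQN128F9353BA0", "title": "Double Wide"},
}


def write_files(directory):
    """Write one JSON object per file into the given directory."""
    for name, record in RECORDS.items():
        (Path(directory) / name).write_text(json.dumps(record) + "\n")


def read_records(directory):
    """Read every *.json file in the directory, in file-name order."""
    paths = sorted(Path(directory).glob("*.json"))
    return [json.loads(p.read_text()) for p in paths]


def union_columns(records):
    """All fields seen in any record, in first-seen order."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    return columns


def common_columns(records):
    """Only the fields present in every record."""
    common = set(records[0])
    for record in records[1:]:
        common &= set(record)
    return [c for c in union_columns(records) if c in common]


def expected_rows(records):
    """Rows over the union schema, with None where a field is absent."""
    columns = union_columns(records)
    return [{c: record.get(c) for c in columns} for record in records]


with tempfile.TemporaryDirectory() as tmp:
    write_files(tmp)
    records = read_records(tmp)

print("union of fields: ", union_columns(records))
print("common fields:   ", common_columns(records))
for row in expected_rows(records):
    print(row)
```

Under these assumptions, the common fields are track_id and title, which are exactly the columns the directory query returned; the union additionally contains artist and timestamp.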

      The JVM occasionally crashes while running the directory query:

      > select * from dfs.`/data/json/tmp`;
      +----------------------------+--------------------+-------------+
      | timestamp                  | track_id           | title       |
      +----------------------------+--------------------+-------------+
      | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide |
      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      # SIGSEGV (0xb) at pc=0x00007f3cb99be053, pid=13943, tid=139898808436480
      #
      # JRE version: OpenJDK Runtime Environment (7.0_65-b17) (build 1.7.0_65-mockbuild_2014_07_16_06_06-b00)
      # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
      # Problematic frame:
      # V [libjvm.so+0x932053]
      #
      # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
      #
      # An error report file with more information is saved as:
      # /tmp/jvm-13943/hs_error.log
      #
      # If you would like to submit a bug report, please include
      # instructions on how to reproduce the bug and visit:
      # http://icedtea.classpath.org/bugzilla
      #
      Aborted

      Attachments

        1. 1.json
          0.1 kB
          Abhishek Girish
        2. 2.json
          0.1 kB
          Abhishek Girish
        3. 3.json
          0.1 kB
          Abhishek Girish
        4. 4.json
          0.1 kB
          Abhishek Girish
        5. DRILL-1750_2015-07-06_16:39:04.patch
          3 kB
          Steven Phillips

          People

            Assignee: Steven Phillips (sphillips)
            Reporter: Abhishek Girish (agirish)
            Votes: 0
            Watchers: 5