CarbonData / CARBONDATA-1920

Spark SQL query result differs from Presto for the same SQL

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.3.0
    • Component/s: presto-integration
    • Labels:
      None
    • Environment:
      Spark 2.1, Presto 0.187

      Description

      I am using CarbonData 1.2.0 and Spark 1.6.0.
      My test case:
      1. Create a table:
      cc.sql("create table IF NOT EXISTS test.table5(id string, name string, city string, age int) stored by 'carbondata' tblproperties('DICTIONARY_INCLUDE' = 'age')")

      2. Load CSV data into the table. The data looks like this:
      id,name,city,age
      1,david,shenzhen,31
      88,eason,shenzhen,27
      3,jarry,wuhan,35

      3. Select from Spark SQL. The result is:
      ------------------------
      id  name   city      age
      ------------------------
      1   david  shenzhen  31
      3   jarry  wuhan     35
      88  eason  shenzhen  27
      ------------------------
      This result is correct.

      4. Select from Presto. The result is:
       id | name  | city     | age
      ----+-------+----------+-----
       1  | david | shenzhen | 3
       3  | jarry | wuhan    | 4
       88 | eason | shenzhen | 2
      (3 rows)
      The age field is wrong: Presto returns 3, 4, and 2 where Spark SQL returns 31, 35, and 27.

      I know why this happens: I used dictionary encoding on the age field (DICTIONARY_INCLUDE).
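
The following is a minimal sketch of how a dictionary-encoded column could produce the values seen in the Presto output. It is not CarbonData's dictionary code. It assumes that distinct values receive integer surrogate keys in sorted order starting at 2; that assignment is a guess chosen because it matches the reported rows. All names (`FIRST_KEY`, `dictionary`, `reverse_dictionary`) are hypothetical. If a reader returned the stored keys without looking them up in the dictionary, it would show 3, 2, and 4 in place of 31, 27, and 35.

```python
# Illustration only: the key assignment below is an assumption chosen to match
# the reported output, not CarbonData's actual dictionary implementation.

ages = [31, 27, 35]  # age values from the CSV rows (david, eason, jarry)

# Assumed: distinct values get surrogate keys in sorted order, starting at 2.
FIRST_KEY = 2
dictionary = {
    value: key
    for key, value in enumerate(sorted(set(ages)), start=FIRST_KEY)
}
# Reverse mapping, needed to turn stored keys back into the original values.
reverse_dictionary = {key: value for value, key in dictionary.items()}

# What is stored for the column when dictionary encoding is enabled.
encoded = [dictionary[age] for age in ages]

# What a reader should return after the dictionary lookup.
decoded = [reverse_dictionary[key] for key in encoded]

print(encoded)  # surrogate keys: [3, 2, 4]
print(decoded)  # original values: [31, 27, 35]
```

Under these assumptions, the wrong ages in step 4 are exactly the `encoded` list, which would be consistent with the decode step being skipped for the age column on the Presto side.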

              People

              • Assignee: anubhav tarar (anubhavtarar)
              • Reporter: anubhav tarar (anubhavtarar)

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 1h 10m