Hive / HIVE-11266

count(*) wrong result based on table statistics for external tables


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 3.0.0
    • Component/s: None
    • Labels: None

    Description

      Hive returns a wrong count(*) result for an external table that has table statistics if I change the table's data files after the statistics were computed.

      This is the scenario in detail:
      1) create external table my_table (...) location 'my_location';
      2) analyze table my_table compute statistics;
      3) change/add/delete one or more files in 'my_location' directory;
      4) select count(*) from my_table;

      In this case the count query does not launch a MapReduce job and instead returns a result taken from the table statistics. The result is wrong because it is based on statistics stored in the Hive metastore and does not take into account the modifications made to the data files.
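
      One way to check whether a count is being answered from the metastore rather than from the files is sketched below. It assumes the my_table from the steps above. The exact plan text and the parameter names shown by DESCRIBE FORMATTED (for example numRows and COLUMN_STATS_ACCURATE) can differ between Hive versions, so treat them as things to look for rather than guaranteed output.

      ```sql
      -- Show the table parameters stored in the metastore, including row-count statistics
      DESCRIBE FORMATTED my_table;

      -- Show the query plan: a stats-only answer typically has a fetch stage
      -- and no MapReduce stage
      EXPLAIN SELECT COUNT(*) FROM my_table;
      ```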

      Setting "hive.compute.query.using.stats" to FALSE avoids the problem, but the default value of this property is TRUE.
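
      As a minimal sketch of the workaround, the property can be turned off for the current session before running the count. The second option, recomputing the statistics after the files change, is not part of the original report and is shown here only as an assumption about how to bring the stored statistics back in line with the data.

      ```sql
      -- Option 1: disable stats-based answers for this session only,
      -- so the count is computed by scanning the data files
      SET hive.compute.query.using.stats=false;
      SELECT COUNT(*) FROM my_table;

      -- Option 2 (assumption, not from the report): refresh the statistics
      -- after the files in 'my_location' have changed
      ANALYZE TABLE my_table COMPUTE STATISTICS;
      SELECT COUNT(*) FROM my_table;
      ```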

      I think that this Stack Overflow post, which shows another kind of bug involving multiple inserts, is also related to the one I reported:
      http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table

      Attachments

        1. HIVE-11266.patch
          1 kB
          Jesus Camacho Rodriguez
        2. HIVE-11266.01.patch
          6 kB
          Jesus Camacho Rodriguez


          People

            Assignee: Jesus Camacho Rodriguez (jcamachorodriguez)
            Reporter: Simone Battaglia (simobatt)
            Votes: 1
            Watchers: 14

