  1. Hive
  2. HIVE-11160 Auto-gather column stats
  3. HIVE-18894

Statistics: rawDataSize seems to be underestimated for text tables


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      A rawDataSize of 5312 seems to be an underestimate...

      AFAIK, for ORC the rawDataSize is estimated as the "online" data size; for text tables it currently seems to be calculated as TOTAL_SIZE - ROW_NUM in some cases.
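
      To make the suspected text-table calculation concrete, here is a minimal sketch of TOTAL_SIZE - ROW_NUM. This is not Hive's actual code: the function name and the input values are hypothetical, chosen only so that the result matches the 5312 reported above. Reading the subtraction as "one delimiter byte per row" is also an assumption.

```python
def text_raw_data_size(total_size: int, num_rows: int) -> int:
    """Suspected text-table estimate: on-disk bytes minus one byte per row.

    Assumption: the subtracted byte per row corresponds to the row
    delimiter; the issue only states the TOTAL_SIZE - ROW_NUM formula.
    """
    if total_size < 0 or num_rows < 0:
        raise ValueError("total_size and num_rows must be non-negative")
    return total_size - num_rows


# Hypothetical table statistics (not taken from the issue itself).
total_size = 5812  # bytes on disk
num_rows = 500

estimate = text_raw_data_size(total_size, num_rows)
print(estimate)
```

      If this is what happens, the estimate reflects only the serialized text bytes, so it would stay well below the size of the same rows once they are deserialized into typed columns.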


          People

            Assignee: Unassigned
            Reporter: Zoltan Haindrich (kgyrtkirk)

            Dates

              Created:
              Updated:
