  1. Hive
  2. HIVE-10940

HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 2.0.0
    • Component/s: File Formats
    • Labels: None

    Description

          String filterText = filterExpr.getExprString();
          String filterExprSerialized = Utilities.serializeExpression(filterExpr);
      

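      Utilities.serializeExpression initializes Kryo and produces a new serialized object for every split, because it is reached each time a record reader is created:

      HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters

      Kryo is very slow at this for a large filter clause, so the cost is paid again on every getRecordReader call.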
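
      One possible way to avoid the repeated work is to serialize each distinct filter once and reuse the result across splits. The sketch below only illustrates that idea; it is not the attached patch. SerializedFilterCache and its serializer argument are hypothetical names, with the serializer standing in for a costly call such as Utilities.serializeExpression. It also assumes the filter key has stable equals/hashCode semantics.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Hive): memoizes an expensive
 * serialization step so it runs once per distinct filter instead of
 * once per split.
 */
final class SerializedFilterCache<F> {
    // Stand-in for the costly serializer (assumption: deterministic, never returns null).
    private final Function<F, String> serializer;

    // Keyed by the filter itself, so F must implement equals/hashCode consistently.
    private final Map<F, String> cache = new ConcurrentHashMap<>();

    SerializedFilterCache(Function<F, String> serializer) {
        this.serializer = serializer;
    }

    /** Returns the cached serialized form, invoking the serializer only on a miss. */
    String get(F filter) {
        return cache.computeIfAbsent(filter, serializer);
    }
}
```

      With such a cache, a per-split call like cache.get(filter) pays the serialization cost only on the first split that sees a given filter. The map here is unbounded, so a real implementation would need an eviction or scoping policy (for example, one cache per query).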

      Attachments

        1. HIVE-10940.01.patch
          4 kB
          Sergey Shelukhin
        2. HIVE-10940.02.patch
          9 kB
          Gunther Hagleitner
        3. HIVE-10940.03.patch
          9 kB
          Gunther Hagleitner
        4. HIVE-10940.patch
          4 kB
          Sergey Shelukhin

            People

              Assignee: Gunther Hagleitner (hagleitn)
              Reporter: Gopal Vijayaraghavan (gopalv)
              Votes: 0
              Watchers: 5
