Spark / SPARK-12417

ORC bloom filter options are not propagated during file write in Spark

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: SQL
    • Labels:
      None

      Description

      The version of Hive used by Spark 1.5.2 supports ORC bloom filters. However, when an ORC file is written with the bloom filter option set, Spark does not apply the option.

      For example, the following ORC write does not create a bloom filter even though the option is specified:

          import java.util.HashMap;
          import java.util.Map;

          // Request bloom filters on every column of the ORC output.
          Map<String, String> orcOption = new HashMap<String, String>();
          orcOption.put("orc.bloom.filter.columns", "*");
          hiveContext.sql("select * from accounts where effective_date='2015-12-30'")
              .write().format("orc").options(orcOption).save("/tmp/accounts");
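
      The behavior requested here can be sketched in isolation: options passed to the writer are copied into the configuration that the ORC writer reads. This is a minimal illustration only, not Spark's code or the attached patch. The names OrcOptionPropagation and withWriteOptions are hypothetical, and java.util.Properties stands in for the Hadoop job configuration, on the assumption that the ORC writer takes its settings from such a configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical illustration; not Spark's actual writer code. */
public class OrcOptionPropagation {

    /**
     * Returns a copy of baseConf with every user-supplied write option applied.
     * Properties is a stand-in for the job configuration handed to the ORC writer.
     */
    static Properties withWriteOptions(Properties baseConf, Map<String, String> writeOptions) {
        Properties merged = new Properties();
        merged.putAll(baseConf);
        // User-supplied options override defaults that share the same key.
        merged.putAll(writeOptions);
        return merged;
    }

    public static void main(String[] args) {
        // A default that should survive the merge untouched.
        Properties defaults = new Properties();
        defaults.setProperty("orc.compress", "ZLIB");

        // The option from the report above.
        Map<String, String> orcOption = new HashMap<>();
        orcOption.put("orc.bloom.filter.columns", "*");

        Properties conf = withWriteOptions(defaults, orcOption);
        System.out.println(conf.getProperty("orc.bloom.filter.columns"));
        System.out.println(conf.getProperty("orc.compress"));
    }
}
```

      If the write path skips a merge step like this, the writer only ever sees the defaults, which would match the symptom described above.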
      

        Attachments

        1. SPARK-12417.1.patch
          2 kB
          Rajesh Balamohan

              People

              • Assignee:
                apachespark Apache Spark
                Reporter:
                rajesh.balamohan Rajesh Balamohan
              • Votes:
                1
                Watchers:
                7
