[SPARK-8907] Speed up path construction in DynamicPartitionWriterContainer.outputWriterForRow

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.5.0
Component/s: SQL
Labels: None
Target Version/s: 1.5.0
Sprint: Spark 1.5 release

Description

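Don't use zip and other Scala collection methods when constructing the partition path. They allocate intermediate collections and tuples for every row, which adds garbage collection pressure. The current code: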

    val partitionPath = partitionColumns.zip(row.toSeq).map { case (col, rawValue) =>
      val string = if (rawValue == null) null else String.valueOf(rawValue)
      val valueString = if (string == null || string.isEmpty) {
        defaultPartitionName
      } else {
        PartitioningUtils.escapePathName(string)
      }
      s"/$col=$valueString"
    }.mkString.stripPrefix(Path.SEPARATOR)

We can probably construct the path with Catalyst expressions themselves, which would let us leverage code generation for this.
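
For illustration only, the first suggestion (no zip, no intermediate collections) might look roughly like the sketch below. It is not the change that was merged for this issue. `PartitionPathSketch` and `buildPartitionPath` are hypothetical names, `escapePathName` here is a simplified stand-in for `PartitioningUtils.escapePathName` that escapes only a handful of characters, and the value of `defaultPartitionName` is an assumption.

```scala
object PartitionPathSketch {
  // Assumed default; the snippet above only refers to it as `defaultPartitionName`.
  val defaultPartitionName = "__HIVE_DEFAULT_PARTITION__"

  // Hypothetical stand-in for PartitioningUtils.escapePathName.
  // Percent-encodes a few path-unsafe characters; the real rules are not reproduced here.
  def escapePathName(s: String): String = {
    val sb = new StringBuilder(s.length)
    var i = 0
    while (i < s.length) {
      val c = s.charAt(i)
      if (c == '/' || c == '=' || c == '%' || c == ':') {
        sb.append('%').append(f"${c.toInt}%02X")
      } else {
        sb.append(c)
      }
      i += 1
    }
    sb.toString
  }

  // Builds "col1=v1/col2=v2" with an index loop and a single StringBuilder,
  // so no zipped sequence, tuples, or per-column strings are allocated.
  def buildPartitionPath(columns: Array[String], values: Array[Any]): String = {
    val sb = new StringBuilder
    var i = 0
    while (i < columns.length) {
      if (i > 0) sb.append('/')
      val raw = values(i)
      val string = if (raw == null) null else raw.toString
      sb.append(columns(i)).append('=')
      if (string == null || string.isEmpty) {
        sb.append(defaultPartitionName)
      } else {
        sb.append(escapePathName(string))
      }
      i += 1
    }
    sb.toString
  }
}
```

Writing the separator only between components also removes the need for the final `stripPrefix` call. The code generation idea would go further by compiling the per-column work, which this sketch does not attempt.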

People

Assignee: Cheng Lian

Reporter: Reynold Xin

Votes: 0

Watchers: 2

Dates

Created: 08/Jul/15 20:03

Updated: 14/Jul/15 01:28

Resolved: 14/Jul/15 01:28