Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: MLlib, SQL
    • Labels: None
    • Target Version/s:

      Description

      We want to add a metadata field to StructField so that other applications, such as ML, can embed more information about a column.

      case class StructField(name: String, dataType: DataType, nullable: Boolean, metadata: Map[String, Any] = Map.empty)
      

      For ML, we can store feature information such as whether a feature is categorical or continuous, the number of categories, the category-to-index map, etc.
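As a rough illustration of the ML use case, the sketch below attaches feature information to a field. It uses simplified stand-ins for the Spark SQL types rather than the real classes, and the metadata keys ("type", "numCategories", "categoryToIndex") and the helper isCategorical are hypothetical names chosen for this example, not part of any proposed API.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
sealed trait DataType
case object DoubleType extends DataType
case object StringType extends DataType

// Mirrors the signature proposed in the description above.
case class StructField(
    name: String,
    dataType: DataType,
    nullable: Boolean,
    metadata: Map[String, Any] = Map.empty)

object MetadataSketch {
  // A categorical feature; all metadata keys here are assumptions.
  val color: StructField = StructField(
    "color",
    DoubleType,
    nullable = false,
    metadata = Map(
      "type" -> "categorical",
      "numCategories" -> 3,
      "categoryToIndex" -> Map("red" -> 0, "green" -> 1, "blue" -> 2)))

  // A continuous feature needs only the type tag.
  val age: StructField = StructField(
    "age",
    DoubleType,
    nullable = true,
    metadata = Map("type" -> "continuous"))

  // Hypothetical helper: reads the (assumed) "type" key from the metadata.
  def isCategorical(field: StructField): Boolean =
    field.metadata.get("type").contains("categorical")

  def main(args: Array[String]): Unit =
    Seq(color, age).foreach { f =>
      println(s"${f.name}: categorical=${isCategorical(f)}")
    }
}
```

Because metadata defaults to Map.empty, existing code that constructs a StructField with three arguments would keep compiling under this signature.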

      One question is how to carry over the metadata in query execution. For example:

      val features = schemaRDD.select('features)
      val featuresDesc = features.schema('features).metadata
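One possible answer for the simplest case, a projection that selects an existing column by name, is to copy the field unchanged into the output schema so that its metadata travels with it. The sketch below models only that case with hypothetical stand-in types (Field, Schema, select); it says nothing about how Spark SQL resolves the question, and computed expressions have no obvious source field to copy from.

```scala
// Hypothetical stand-ins for illustration; not Spark's schema classes.
case class Field(name: String, metadata: Map[String, Any] = Map.empty)

case class Schema(fields: Seq[Field]) {
  // Look up a field by name, failing loudly if it is missing.
  def apply(name: String): Field =
    fields.find(_.name == name).getOrElse(
      throw new NoSuchElementException(s"no field named $name"))

  // Selecting plain columns reuses the original Field objects,
  // so whatever metadata they carry appears in the result schema.
  def select(names: String*): Schema = Schema(names.map(n => apply(n)))
}

object CarryOverSketch {
  // "numFeatures" is an assumed metadata key used only for this example.
  val schema: Schema = Schema(Seq(
    Field("label"),
    Field("features", Map("numFeatures" -> 10))))

  // Analogous to the two lines above: select, then read the metadata.
  val features: Schema = schema.select("features")
  val featuresDesc: Map[String, Any] = features("features").metadata

  def main(args: Array[String]): Unit =
    println(featuresDesc)
}
```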
      

          Activity

          marmbrus Michael Armbrust added a comment -

          Issue resolved by pull request 2701
          https://github.com/apache/spark/pull/2701

          mengxr Xiangrui Meng added a comment -

          I put the design doc here:
          https://docs.google.com/document/d/1RGJgVJhCebnilpL15ODcq0EWBeVjl9ltoHUvosWodPg/edit?usp=sharing

          apachespark Apache Spark added a comment -

          User 'mengxr' has created a pull request for this issue:
          https://github.com/apache/spark/pull/2701


            People

            • Assignee: mengxr Xiangrui Meng
            • Reporter: mengxr Xiangrui Meng
            • Shepherd: Michael Armbrust
            • Votes: 0
            • Watchers: 5
