Description
We want to add a metadata field to StructField so that other applications, such as ML, can embed more information about a column.
case class StructField(name: String, dataType: DataType, nullable: Boolean, metadata: Map[String, Any] = Map.empty)
For ML, we can store feature information such as categorical/continuous, the number of categories, a category-to-index map, etc.
One question is how to carry over the metadata in query execution. For example:
val features = schemaRDD.select('features)
val featuresDesc = features.schema('features).metadata
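The proposal and the carry-over question can be sketched without Spark. The block below is only an illustration under stated assumptions: `DataType`, `StructType`, and `select` are simplified stand-ins rather than the real Spark SQL classes, and the metadata keys (`type`, `numCategories`, `categoryToIndex`) are hypothetical names chosen for the example. It shows one possible behavior, in which a plain column projection reuses the original StructField and so keeps its metadata.

```scala
object MetadataSketch {
  // Simplified stand-ins for Spark SQL data types (not the real classes).
  sealed trait DataType
  case object DoubleType extends DataType
  case object StringType extends DataType

  // StructField as proposed in this issue, with a default empty metadata map.
  case class StructField(
      name: String,
      dataType: DataType,
      nullable: Boolean,
      metadata: Map[String, Any] = Map.empty)

  case class StructType(fields: Seq[StructField]) {
    // Look up a field by name.
    def apply(name: String): StructField =
      fields.find(_.name == name).getOrElse(
        throw new IllegalArgumentException(s"Field $name does not exist"))

    // Hypothetical projection: selected fields are reused unchanged,
    // so their metadata is carried over to the result schema.
    def select(names: String*): StructType =
      StructType(names.map(name => apply(name)))
  }

  // Metadata keys below are assumptions made for this example.
  val schema: StructType = StructType(Seq(
    StructField("label", DoubleType, nullable = false),
    StructField("color", StringType, nullable = true,
      metadata = Map(
        "type" -> "categorical",
        "numCategories" -> 3,
        "categoryToIndex" -> Map("red" -> 0, "green" -> 1, "blue" -> 2)))))

  val projected: StructType = schema.select("color")
  val colorDesc: Map[String, Any] = projected("color").metadata
}
```

A plain projection is the easy case. Expressions that derive a new column (arithmetic, casts, aggregates) have no single source field to copy from, so the propagation rule for them would still need to be decided.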
Issue Links
- is depended upon by: SPARK-3573 Dataset (Resolved)
- links to