Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: 1.6.0
- Fix Version/s: None
Description
Some use cases require the value at a specific index of a VectorUDT column in a DataFrame. For example, we may want to extract a specific class probability from the probabilityCol of a LogisticRegression model in order to compute losses. However, if probability is a column of df with type VectorUDT, the following code fails:
    df.select("probability.0")
    AnalysisException: u"Can't extract value from probability"
The exception is thrown from sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala.
VectorUDT essentially wraps a StructType, hence one would expect it to support value extraction Expressions in an analogous way.
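Until such extraction is supported, one possible workaround is a Python UDF that indexes into the vector. The following is only a sketch, not something proposed in this issue: the helper names element_at and with_vector_element are hypothetical, and it assumes the column holds pyspark vectors (dense or sparse), which support integer indexing.

```python
def element_at(vector, index):
    """Return vector[index] as a plain Python float, or None for a null vector."""
    if vector is None:
        return None
    # Vector indexing yields a NumPy scalar; convert it so that Spark
    # receives an ordinary float for the DoubleType result column.
    return float(vector[index])


def with_vector_element(df, vector_col, index, output_col):
    """Append output_col holding element `index` of the vector column (hypothetical helper)."""
    # Imported lazily so element_at stays usable without Spark installed.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import DoubleType

    extract = udf(lambda v: element_at(v, index), DoubleType())
    return df.withColumn(output_col, extract(col(vector_col)))
```

With these helpers, with_vector_element(df, "probability", 0, "p0") would add a double column p0 containing the first class probability. A Python UDF pays a serialization cost per row, which is part of why native extraction would be preferable. On Spark 3.0 and later, pyspark.ml.functions.vector_to_array converts the vector column to an array column, after which ordinary array indexing applies.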
Issue Links
- relates to SPARK-19653 `Vector` Type Should Be A First-Class Citizen In Spark SQL (Resolved)