Spark / SPARK-5092

Selecting from a nested structure with SparkSQL should return a nested structure

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 1.2.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels:

      Description

      When running a Spark SQL query like this (at least on a JSON dataset):

      select
      rid,
      meta_data.name
      from
      a_table

      The rows returned lose the nested structure. I receive a row like

      Row(rid='123', name='delete')

      instead of

      Row(rid='123', meta_data=Row(name='delete'))

      I find this confusing, especially when building and executing queries programmatically and then parsing the results, only to find the data in a different structure. I can understand that keeping the nesting is less desirable in some situations, but that could be handled by supporting 'as'. If you wanted to skip the nested structure, you would simply write:

      select
      rid,
      meta_data.name as name
      from
      a_table
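
      A minimal client-side sketch of the re-nesting step, assuming the code that built the query still has the dotted paths it selected. The helper nest_row is hypothetical and not part of Spark, and it uses plain dicts rather than Row objects:

```python
from typing import Any, Dict, Sequence


def nest_row(paths: Sequence[str], values: Sequence[Any]) -> Dict[str, Any]:
    """Rebuild a nested dict from dotted column paths and a flat row of values.

    Hypothetical helper for illustration only; not a Spark API.
    """
    if len(paths) != len(values):
        raise ValueError("paths and values must have the same length")
    nested: Dict[str, Any] = {}
    for path, value in zip(paths, values):
        # "meta_data.name" -> parents ["meta_data"], leaf "name"
        *parents, leaf = path.split(".")
        node = nested
        for key in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


# The select list and the flat row from the first query in the description.
selected = ["rid", "meta_data.name"]
flat_row = ("123", "delete")
print(nest_row(selected, flat_row))
```

      This only works when the caller keeps the select list next to the results, which is the bookkeeping the request above is trying to avoid.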

            People

            • Assignee: Unassigned
            • Reporter: Brad Willard (brdwrd)
            • Votes: 0
            • Watchers: 3
