Spark / SPARK-32731

Add tests for arrays/maps of nested structs to ReadSchemaSuite to test structs reuse

Details

    • Type: Test
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL, Tests
    • Labels: None

    Description

      This issue splits out the tests originally posted in the [PR|https://github.com/apache/spark/pull/29352] for SPARK-32531. The added tests cover maps and arrays of nested structs across the different file formats. For example, https://github.com/apache/spark/pull/29353 and https://github.com/apache/spark/pull/29354 add object reuse when reading ORC and Avro files. For dynamically sized data structures such as arrays and maps, however, the schema alone does not tell us how large the structure will be, so it has to be allocated while the data is being read. The added tests provide coverage to ensure that objects are not accidentally reused when maps and arrays are encountered.

      AFAIK this is not covered by existing tests.
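
      To illustrate the hazard these tests are meant to catch, here is a toy model in plain Python. It is only a sketch: `MutableRow`, `read_array_reusing`, and `read_array_allocating` are hypothetical names invented for this example and are not Spark's reader code. It shows how reusing one struct buffer for every element of an array makes all elements alias the last value read.

```python
# Toy model only: every name here is hypothetical, not part of Spark.


class MutableRow:
    """Stand-in for a mutable struct buffer that a reader might reuse."""

    def __init__(self, n_fields):
        self.values = [None] * n_fields

    def update(self, fields):
        # Overwrite the buffer in place and return the same object.
        self.values[:] = fields
        return self


def read_array_reusing(structs):
    """Unsafe for arrays: one shared buffer backs every element."""
    shared = MutableRow(2)
    return [shared.update(s) for s in structs]


def read_array_allocating(structs):
    """Safe: a fresh buffer is allocated per element while reading."""
    return [MutableRow(len(s)).update(s) for s in structs]


data = [(1, "a"), (2, "b"), (3, "c")]
reused = [row.values for row in read_array_reusing(data)]
fresh = [row.values for row in read_array_allocating(data)]

print(reused)  # every element shows the last struct read
print(fresh)   # each element keeps its own struct
```

      A round-trip test over an array or map of nested structs would detect the first behaviour, because the values read back would no longer match the values written.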

            People

              Assignee: Unassigned
              Reporter: Muhammad Samir Khan (samkhan)
              Votes: 0
              Watchers: 3
