Spark / SPARK-44657

Incorrect limit handling and config parsing in Arrow collect


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.2, 3.4.0, 3.4.1, 3.5.0
    • Fix Version/s: 3.4.2, 3.5.0, 4.0.0
    • Component/s: Connect
    • Labels: None

    Description

      In the Arrow writer code, the batching conditions do not match what the documentation says ("maxBatchSize and maxRecordsPerBatch, respect whatever smaller"). Because the two limits are combined with the || operator, the writer keeps adding rows while either limit still has headroom, so it effectively respects whichever conf is larger (i.e., less restrictive).
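
The effect of combining the two limits with "or" instead of "and" can be illustrated with a small, self-contained sketch. This is not the Spark Arrow writer; `batch_sizes`, `max_bytes`, `max_rows`, and `combine` are hypothetical names used only to model a batching loop that always writes the first row and then continues while the combined condition holds.

```python
def batch_sizes(row_sizes, max_bytes, max_rows, combine):
    """Split rows into batches and return the row count of each batch.

    `combine` decides how the two "still under the limit" checks are joined.
    All names here are illustrative, not Spark APIs.
    """
    batches = []
    i = 0
    while i < len(row_sizes):
        rows_in_batch = 0
        bytes_in_batch = 0
        while i < len(row_sizes) and (
            rows_in_batch == 0  # always write the first row of a batch
            or combine(bytes_in_batch < max_bytes, rows_in_batch < max_rows)
        ):
            bytes_in_batch += row_sizes[i]
            rows_in_batch += 1
            i += 1
        batches.append(rows_in_batch)
    return batches


rows = [10] * 12  # twelve rows of 10 bytes each

# "or": continue while EITHER limit has headroom, so the larger limit wins.
lenient = batch_sizes(rows, max_bytes=1000, max_rows=3,
                      combine=lambda under_bytes, under_rows: under_bytes or under_rows)

# "and": continue only while BOTH limits have headroom, so the smaller limit wins.
strict = batch_sizes(rows, max_bytes=1000, max_rows=3,
                     combine=lambda under_bytes, under_rows: under_bytes and under_rows)

print(lenient)  # [12]
print(strict)   # [3, 3, 3, 3]
```

With the "or" form the 3-row limit is ignored because the byte limit is never reached. The "and" form matches the documented "respect whatever smaller" behavior.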

       

      Further, when the `CONNECT_GRPC_ARROW_MAX_BATCH_SIZE` conf is read, the value is not converted from MiB to bytes (example).
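
A minimal sketch of why the missing unit conversion matters, assuming (as the description implies) that the conf value is expressed in MiB while batch sizes are estimated in bytes. The helper names and the sample values are hypothetical and are not taken from Spark.

```python
BYTES_PER_MIB = 1024 * 1024


def mib_to_bytes(value_mib):
    """Convert a size given in MiB to bytes."""
    return value_mib * BYTES_PER_MIB


def exceeds_limit(estimated_bytes, limit_bytes):
    """Return True once the estimated batch size has reached the limit."""
    return estimated_bytes >= limit_bytes


conf_value_mib = 4            # hypothetical configured value, in MiB
estimated_bytes = 64 * 1024   # a 64 KiB batch, well below 4 MiB

# Without conversion, the raw value 4 is compared as if it were 4 bytes.
without_conversion = exceeds_limit(estimated_bytes, conf_value_mib)

# With conversion, the limit is 4 * 1024 * 1024 bytes.
with_conversion = exceeds_limit(estimated_bytes, mib_to_bytes(conf_value_mib))

print(without_conversion)  # True
print(with_conversion)     # False
```

Without the conversion, any non-trivial batch appears to exceed the limit, so the byte limit behaves far more restrictively than configured.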

          People

            Assignee: Venkata Sai Akhil Gudesa (vicennial)
            Reporter: Venkata Sai Akhil Gudesa (vicennial)