Spark / SPARK-17762

invokeJava fails when serialized argument list is larger than INT_MAX (2,147,483,647) bytes


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: SparkR

    Description

      invokeJava passes the serialized argument list to writeRaw, which in turn calls writeBin. Unfortunately, writeBin has a hard-coded size limit of R_LEN_T_MAX (which is itself set to INT_MAX in base R), so the call fails when the serialized arguments exceed 2,147,483,647 bytes.

      To work around this, we can check for this case and write the serialized batch in multiple parts, each below the limit.
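The workaround described above could look roughly like the following sketch. It is not the SparkR implementation: `write_raw_chunked` and its `chunk_size` parameter are hypothetical names introduced here for illustration, and the sketch only covers splitting the payload across several `writeBin` calls. Any length prefix or other framing that the receiving side expects is out of scope.

```r
# Hypothetical helper (not part of SparkR): write a raw vector to a
# connection in slices, so that no single writeBin call receives more
# than chunk_size bytes.
write_raw_chunked <- function(con, bytes, chunk_size = 2^30) {
  stopifnot(is.raw(bytes), chunk_size >= 1)
  total <- length(bytes)
  offset <- 0      # kept as a double so it can exceed the integer range
  n_chunks <- 0L
  while (offset < total) {
    len <- min(chunk_size, total - offset)
    # Slice the next block and write it; bytes arrive in the original order.
    writeBin(bytes[(offset + 1):(offset + len)], con)
    offset <- offset + len
    n_chunks <- n_chunks + 1L
  }
  invisible(n_chunks)  # number of writeBin calls that were made
}

# Small demonstration with an in-memory connection and a tiny chunk size.
con <- rawConnection(raw(0), "wb")
payload <- serialize(list(a = 1:10, b = "x"), connection = NULL)
write_raw_chunked(con, payload, chunk_size = 16)
written <- rawConnectionValue(con)
close(con)
identical(written, payload)
```

Because the slices are written back to back on the same connection, the byte stream is the same as a single write would have produced, so under this assumption the reader would not need to know how the payload was split.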

          People

            Assignee: Unassigned
            Reporter: Hossein Falaki (falaki)