Apache Arrow / ARROW-4997

[C#] ArrowStreamReader doesn't consume whole stream and doesn't implement sync read


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: C#

      Description

      There are two major issues with ArrowStreamReader that are blocking me from using it.

      1. When it reads a batch from a .NET Stream that doesn't return the whole chunk of memory in one "Read" call (like a socket/network stream), it calls Read only once and then continues on. This leaves "garbage" at the end of its buffer (bytes the stream never wrote), and when it attempts to read the next batch, the .NET Stream is still positioned in the middle of the previous batch. The reader then assumes the next 4 bytes are the message length, which they are not, and everything after that is misparsed. See the reading code for where it only calls Read once; it should be in a loop.
      2. ArrowStreamReader has a synchronous ReadNextRecordBatch() method, but it throws NotImplementedException. A synchronous implementation is necessary because a caller that isn't in an async method can't/shouldn't call the async API.
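
      The loop requested in item 1 can be sketched roughly as follows. This is only an illustration, not the code that went into the fix: `StreamReadHelper`, `ReadFully`, `ReadFullyAsync`, and `TrickleStream` are hypothetical names that do not exist in the Arrow C# library. The helper keeps calling Read until the requested byte count has arrived or the stream ends, and it comes in a synchronous and an asynchronous form so that a synchronous ReadNextRecordBatch() would not have to block on the async path.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the Arrow C# library.
public static class StreamReadHelper
{
    // Calls Read repeatedly until `count` bytes have been read or the stream
    // ends. Returns the number of bytes actually read, so the caller can
    // detect a truncated message (return value < count).
    public static int ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, offset + total, count - total);
            if (read == 0)
            {
                break; // end of stream reached before `count` bytes arrived
            }
            total += read;
        }
        return total;
    }

    // Asynchronous counterpart with the same contract.
    public static async Task<int> ReadFullyAsync(
        Stream stream, byte[] buffer, int offset, int count,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        int total = 0;
        while (total < count)
        {
            int read = await stream
                .ReadAsync(buffer, offset + total, count - total, cancellationToken)
                .ConfigureAwait(false);
            if (read == 0)
            {
                break; // end of stream
            }
            total += read;
        }
        return total;
    }
}

// Hypothetical test stream that hands back at most `maxChunk` bytes per Read
// call, mimicking a socket that delivers a message in several pieces.
public sealed class TrickleStream : MemoryStream
{
    private readonly int _maxChunk;

    public TrickleStream(byte[] data, int maxChunk) : base(data)
    {
        _maxChunk = maxChunk;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return base.Read(buffer, offset, Math.Min(count, _maxChunk));
    }
}
```

      With a `TrickleStream` limited to 3 bytes per call, a single Read for 8 bytes returns at most 3, leaving the rest of the buffer unwritten, whereas `ReadFully` for the same 8 bytes keeps reading until all 8 have arrived and leaves the stream positioned at the start of the next message.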

              People

              • Assignee: Eric Erhardt (eerhardt)
              • Reporter: Eric Erhardt (eerhardt)
              • Votes: 0
              • Watchers: 2

                  Time Tracking

                  • Original Estimate: 4h
                  • Remaining Estimate: 2h 40m
                  • Time Spent: 1h 20m