[DAFFODIL-1799] Enable data streaming in the CLI - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: 2.2.0
Component/s: CLI, Performance
Labels: None

Description

The CLI currently reads the input data into a byte array. This is simple and ensures all data is in memory, which reduces disk overhead during the performance command. However, it limits the CLI to the maximum size of an array, which is INT_MAX bytes (about 2 GB). To support parsing and unparsing larger files, the CLI should work on InputStreams rather than array buffers. For the performance subcommand, this will require something like a SplittableInputStream that allows multiple consumers of a single InputStream.

Some SplittableInputStream implementations do exist, for example in JMRTD and on Stack Overflow, but licensing issues rule them out. We either need to find a solution compatible with our license or implement our own.
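
To make the idea concrete, here is a minimal sketch of how several consumers could share one InputStream. It is not taken from JMRTD, Stack Overflow, or Daffodil; the class name `SplittableSource` and the method `newReader` are hypothetical, and the sketch assumes single-threaded use. It buffers only the bytes that at least one reader has not yet consumed, so memory use depends on how far the slowest reader lags rather than on the total input size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: lets several readers consume one InputStream independently. */
final class SplittableSource {
    private final InputStream source;
    private final List<Reader> readers = new ArrayList<>();
    private byte[] window = new byte[8192]; // bytes not yet consumed by every reader
    private int windowLen = 0;              // number of valid bytes in window
    private long windowStart = 0;           // absolute stream offset of window[0]

    SplittableSource(InputStream source) {
        this.source = source;
    }

    /** Create all readers before reading begins so each one sees the full stream. */
    InputStream newReader() {
        Reader r = new Reader();
        readers.add(r);
        return r;
    }

    /** Drops bytes every reader has passed, then pulls more from the source. */
    private boolean fill() throws IOException {
        long min = Long.MAX_VALUE;
        for (Reader r : readers) {
            min = Math.min(min, r.pos);
        }
        int drop = (int) (min - windowStart);
        if (drop > 0) {
            System.arraycopy(window, drop, window, 0, windowLen - drop);
            windowLen -= drop;
            windowStart = min;
        }
        if (windowLen == window.length) {
            // The slowest reader is pinning a full window, so grow it.
            window = Arrays.copyOf(window, window.length * 2);
        }
        int n = source.read(window, windowLen, window.length - windowLen);
        if (n <= 0) {
            return false; // end of the underlying stream
        }
        windowLen += n;
        return true;
    }

    private final class Reader extends InputStream {
        private long pos = windowStart; // absolute offset of this reader's next byte

        @Override
        public int read() throws IOException {
            if (pos == windowStart + windowLen && !fill()) {
                return -1;
            }
            return window[(int) (pos++ - windowStart)] & 0xFF;
        }
    }
}
```

A caller would wrap the input once, call `newReader()` once per consumer, and then read from each returned stream as usual. The sketch is not thread-safe, and a reader that never advances keeps the whole stream buffered, so a real implementation would need locking and a way to release abandoned readers.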

This work should be done concurrently with changes to improve the efficiency of the I/O layer.

Issue Links

duplicates

DAFFODIL-1968 Support --stream option for CLI performance command (Closed)

DAFFODIL-1565 Unparse API Cursor-behavior/streaming enhancements (Open)

DAFFODIL-934 Streaming parser: Need to stream input data in, and infoset out to handle arbitrarily large data. (Closed)

DAFFODIL-1065 parser: API needs to enable repeated calls to parser - not treat unconsumed data as 'left over' (Closed)

People

Assignee: Unassigned

Reporter: Steve Lawrence

Votes: 0

Watchers: 2

Dates

Created: 20/Jul/17 11:40

Updated: 08/Oct/18 18:34

Resolved: 16/Aug/18 14:48