Details
New Feature
Status: Closed
Major
Resolution: Incomplete
1.8.5
None
OSX 10.6.8, groovy 1.8.5 from macports
Description
I'd like to enhance the Groovy additions to java.util.Iterator.
For the current state of the Groovy Iterator API, see:
http://groovy.codehaus.org/groovy-jdk/java/util/Iterator.html
I'd like to add to Iterator many of the methods from the Groovy Collection API; see:
http://groovy.codehaus.org/groovy-jdk/java/util/Collection.html
Rationale:
The Groovy Collection API offers a rich set of methods for applying closures
to tasks such as filtering, transforming, and aggregating.
But the Groovy Iterator API currently implements only a very small subset of them.
When working with large data sets,
for example extracting data from a large log file line by line,
it is not practical to put all of that data into a collection first.
Instead, the data should stream through the processing steps,
like a Unix pipeline built from many chained grep and sed commands.
Don't get me wrong, though: we're not talking about filtering character data.
We're talking about a stream of objects that we want to pass from step to step.
I believe the Iterator API is suitable for this.
It may already be possible to apply some of the Collection API methods to Iterators,
but they collect the results and return a Collection.
I want to return another Iterator instead,
so that the data is never collected:
each step processes a single item
and passes it on to the next step as an Iterator.
I have already come up with transform() and filter();
I'll post them here once I'm done debugging.
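To make the idea concrete, here is a minimal sketch of what a lazy transform() and filter() could look like, written in plain Java against java.util.Iterator. It is only an illustration under assumptions: the class name LazyIterators and both method signatures are hypothetical, and this is not the implementation mentioned above nor part of the Groovy JDK. Each wrapper returns another Iterator and handles one item at a time, so nothing is collected along the way.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical helper class; names are illustrative, not part of the Groovy JDK.
public final class LazyIterators {

    private LazyIterators() {}

    // Wraps 'source' so that 'fn' is applied to an item only when next() is called.
    public static <T, R> Iterator<R> transform(Iterator<T> source,
                                               Function<? super T, ? extends R> fn) {
        return new Iterator<R>() {
            @Override public boolean hasNext() { return source.hasNext(); }
            @Override public R next() { return fn.apply(source.next()); }
        };
    }

    // Wraps 'source' so that only items matching 'keep' are passed on.
    // At most one item is looked ahead; nothing else is buffered.
    public static <T> Iterator<T> filter(Iterator<T> source, Predicate<? super T> keep) {
        return new Iterator<T>() {
            private T pending;
            private boolean hasPending;

            @Override public boolean hasNext() {
                // Pull from the source until a matching item is found or it runs out.
                while (!hasPending && source.hasNext()) {
                    T candidate = source.next();
                    if (keep.test(candidate)) {
                        pending = candidate;
                        hasPending = true;
                    }
                }
                return hasPending;
            }

            @Override public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                T result = pending;
                pending = null;
                hasPending = false;
                return result;
            }
        };
    }

    public static void main(String[] args) {
        // Stand-in for lines read one by one from a large log file.
        Iterator<String> lines = Arrays.asList(
                "INFO start", "ERROR disk full", "INFO ok", "ERROR timeout").iterator();

        // Chain the steps like a pipe: keep ERROR lines, then map each to its length.
        Iterator<String> errors = filter(lines, line -> line.startsWith("ERROR"));
        Iterator<Integer> lengths = transform(errors, String::length);

        while (lengths.hasNext()) {
            System.out.println(lengths.next());
        }
    }
}
```

Because each step only wraps the previous Iterator, chaining many steps keeps at most one item in flight per step, which is the streaming behaviour described above.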
Some other thoughts:
Once we have this dataflow working in the single-threaded case,
we might want to extend it to multithreading.