Commons Collections / COLLECTIONS-442

A set of enhanced iterator classes donated by the Apache Jena project

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.x
    • Component/s: Iterator
    • Labels:
      None

      Description

      A set of templated (generic) iterators that add filtering, mapping, and conversion to set or list collections. Tests are included.

      1. COLLECTIONS-442.tar.gz
        12 kB
        Claude Warren
      2. FluentIterator.java
        5 kB
        Thomas Neidhart
      3. iter-src.zip
        32 kB
        Andy Seaborne

        Activity

        Claude Warren created issue -
        Claude Warren added a comment -

        Source code for the collections and test cases.

        Claude Warren made changes -
        Field Original Value New Value
        Attachment COLLECTIONS-442.tar.gz [ 12568700 ]
        Thomas Neidhart made changes -
        Fix Version/s 4.x [ 12313073 ]
        Andy Seaborne added a comment - - edited

        Jena also has

        https://svn.apache.org/repos/asf/jena/trunk/jena-arq/src/main/java/org/apache/jena/atlas/iterator/Iter.java

        (tests in src/test/java) with many operations on plain Java iterators.

        Thomas Neidhart added a comment -

        These are some really nice and useful iterator implementations. Thanks for the pointer.
        We will see whether we can add them in time for 4.0, but we are currently focused on getting a release out.

        Thomas Neidhart made changes -
        Fix Version/s 4.0 [ 12314511 ]
        Fix Version/s 4.x [ 12313073 ]
        Thomas Neidhart added a comment -

        I have reworked the patch a bit, see the attached file. My rationale was as follows:

        • reuse existing code as much as possible
        • use real classes instead of interfaces to avoid problems with breaking compatibility when extending later on

        An example of how to use the API:

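        import java.util.ArrayList;
        import java.util.List;

        // Predicate and Transformer are the Commons Collections functor interfaces;
        // FluentIterator is the class in the attached FluentIterator.java.
        public class MyTest {
        
            public static void main(String[] args) {
        
                List<Integer> list = new ArrayList<Integer>();
                list.add(1);
                list.add(2);
                list.add(3);
                list.add(2);
                list.add(3);
                
                FluentIterator<String> it =
                    FluentIterator.<Integer>empty()
                                  .andThen(list.iterator())
                                  // drop elements below 2, leaving 2, 3, 2, 3
                                  .dropIf(new Predicate<Integer>() {
                                      public boolean evaluate(Integer object) {
                                          return object.intValue() < 2;
                                      }
                                  })
                                  // remove duplicates, leaving 2, 3
                                  .unique()
                                  // append the full list again
                                  .andThen(list.iterator())
                                  // wrap each element in brackets
                                  .mapWith(new Transformer<Integer, String>() {
                                      public String transform(Integer input) {
                                          return "[" + String.valueOf(input.intValue()) + "]";
                                      }
                                  });
                
                System.out.println(it.toList());
            }
        }
        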
        

        This prints

        [[2], [3], [1], [2], [3], [2], [3]]
        

        In the original patch, "andThen" behaved differently from the other composition methods: only "andThen" returned the same object, while all the other methods returned a new object.

        This could be error-prone when people do things like this:

          it.andThen(...)
          it.filterKeep(...)
        

        Here, "andThen" will update "it", while "filterKeep" will return a new object and leave "it" untouched. This is fine as long as you use method chaining, but it can lead to unexpected errors or behavior in other cases.

        So I decided to be consistent and always return the same object. The only exception is the "mapWith" method, which will always return a new object as the generic return type may change.
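
        The pitfall can be reproduced in a small, self-contained sketch. The Chain class below is a hypothetical stand-in written only for illustration: it is not the attached FluentIterator, the names filterKeepNew and filterKeepSame are made up, and it uses java.util.function.Predicate rather than the Commons Collections functor types so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class ChainingPitfall {

    /** Hypothetical fluent wrapper around an iterator (illustration only). */
    static final class Chain<T> {
        private Iterator<T> source;

        Chain(Iterator<T> source) { this.source = source; }

        /** Same-object style: appends to this chain and returns it. */
        Chain<T> andThen(final Iterator<T> other) {
            final Iterator<T> first = source;
            source = new Iterator<T>() {
                public boolean hasNext() { return first.hasNext() || other.hasNext(); }
                public T next() { return first.hasNext() ? first.next() : other.next(); }
            };
            return this;
        }

        /** New-object style: this chain is left without the filter. */
        Chain<T> filterKeepNew(Predicate<? super T> keep) {
            return new Chain<T>(filtered(source, keep));
        }

        /** Same-object style: the filter is installed on this chain. */
        Chain<T> filterKeepSame(Predicate<? super T> keep) {
            source = filtered(source, keep);
            return this;
        }

        List<T> toList() {
            List<T> out = new ArrayList<T>();
            while (source.hasNext()) { out.add(source.next()); }
            return out;
        }
    }

    /** Lazy filter: looks ahead for the next element accepted by the predicate. */
    static <T> Iterator<T> filtered(final Iterator<T> in, final Predicate<? super T> keep) {
        return new Iterator<T>() {
            private T next;
            private boolean ready;
            public boolean hasNext() {
                while (!ready && in.hasNext()) {
                    T candidate = in.next();
                    if (keep.test(candidate)) { next = candidate; ready = true; }
                }
                return ready;
            }
            public T next() {
                if (!hasNext()) { throw new NoSuchElementException(); }
                ready = false;
                return next;
            }
        };
    }

    /** Mixed styles, return values ignored: the filter is silently lost. */
    public static List<Integer> mixedStyles() {
        List<Integer> list = Arrays.asList(1, 2, 3);
        Chain<Integer> it = new Chain<Integer>(list.iterator());
        it.andThen(list.iterator());
        it.filterKeepNew(x -> x >= 2);   // result discarded, "it" is unchanged
        return it.toList();
    }

    /** Consistent same-object style: the filter takes effect. */
    public static List<Integer> sameObjectStyle() {
        List<Integer> list = Arrays.asList(1, 2, 3);
        Chain<Integer> it = new Chain<Integer>(list.iterator());
        it.andThen(list.iterator());
        it.filterKeepSame(x -> x >= 2);
        return it.toList();
    }

    public static void main(String[] args) {
        System.out.println(mixedStyles());      // [1, 2, 3, 1, 2, 3]
        System.out.println(sameObjectStyle());  // [2, 3, 2, 3]
    }
}
```

        With method chaining both variants behave identically; the difference only shows up when the return value is ignored, as in mixedStyles().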

        I am not completely happy with the class name, so suggestions for a better one are welcome.

        Any comments are welcome. If the API is accepted, we can still include it in the upcoming 4.0-alpha1, which I plan to release in 1-2 weeks.

        Thomas Neidhart made changes -
        Attachment FluentIterator.java [ 12581831 ]
        Andy Seaborne made changes -
        Attachment iter-src.zip [ 12581869 ]
        Andy Seaborne added a comment - - edited

        I like the style. Attached is another take on this.

        The main class is Iter that provides two styles:

        • A style like the FluentIterator style of method chaining.
        • Static methods for short sequences, so that one-step operations can be applied to regular iterators and iterables.

        Also includes a "PeekIterator" for looking one step ahead.
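
        For readers unfamiliar with the idea, looking one step ahead can be sketched as a wrapper that buffers a single element. The Peeking class and the collapseRuns helper below are hypothetical names chosen for this illustration; they are not the code in the attached iter-src.zip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class PeekSketch {

    /** Hypothetical one-element look-ahead wrapper (illustration only). */
    static final class Peeking<T> implements Iterator<T> {
        private final Iterator<T> source;
        private T buffered;
        private boolean hasBuffered;

        Peeking(Iterator<T> source) { this.source = source; }

        /** Returns the next element without consuming it. */
        public T peek() {
            if (!hasBuffered) {
                if (!source.hasNext()) { throw new NoSuchElementException(); }
                buffered = source.next();
                hasBuffered = true;
            }
            return buffered;
        }

        @Override public boolean hasNext() {
            return hasBuffered || source.hasNext();
        }

        @Override public T next() {
            if (hasBuffered) {
                T result = buffered;
                buffered = null;
                hasBuffered = false;
                return result;
            }
            return source.next();
        }
    }

    /** Example use: keep one element from each run of equal adjacent elements. */
    public static List<Integer> collapseRuns(List<Integer> in) {
        Peeking<Integer> it = new Peeking<Integer>(in.iterator());
        List<Integer> out = new ArrayList<Integer>();
        while (it.hasNext()) {
            Integer current = it.next();
            out.add(current);
            // Skip followers equal to the element just emitted.
            while (it.hasNext() && it.peek().equals(current)) {
                it.next();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collapseRuns(Arrays.asList(1, 1, 2, 2, 2, 3, 1)));  // [1, 2, 3, 1]
    }
}
```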

        The function-application style is useful for short sequences; the chaining style is better for longer sequences.

            iter = Iter.removeNulls(iter) ;
        

        An example of each style (example.IterExample.java):

                List<Integer> x = Arrays.asList(1,2,3,2,3) ;
                // Chaining style
                Iter<String> iter = Iter.iter(x)
                    .filter(new Filter<Integer>() {
                        @Override public boolean accept(Integer item)
                        { return item.intValue() >= 2 ; }})
                    .distinct()
                    .append(x.iterator())
                    .map(new Transform<Integer,String>() {
                        @Override public String convert(Integer item)
                        { return "["+String.valueOf(item)+"]" ; }}) ;
                System.out.println(iter.toList());
        

        and

                
                List<Integer> x = Arrays.asList(1,2,3,2,3) ;
                // Function application style.
                Iterator<Integer> it = Iter.filter(x, new Filter<Integer>() {
                    @Override public boolean accept(Integer item)
                    { return item.intValue() >= 2 ; }}) ;
                it = Iter.distinct(it) ;
                it = Iter.concat(it, x.iterator()) ;
                Iterator<String> its = Iter.map(it, new Transform<Integer,String>() {
                        @Override public String convert(Integer item)
                        { return "["+String.valueOf(item)+"]" ; }}) ;
                List<String> y = Iter.toList(its) ;
                System.out.println(y);
        
        Andy Seaborne made changes -
        Attachment iter-src.zip [ 12581869 ]
        Andy Seaborne made changes -
        Attachment iter-src.zip [ 12581871 ]
        Thomas Neidhart added a comment -

        Thanks for the feedback!

        I like both styles, and as you say, they can be useful under different circumstances.

        In collections, we already have a class, IteratorUtils, where all the static functions should go, imho (some of them are already there).
        Tbh, I am not such a big fan of these *Utils classes and their quite expressive method names, but that's the common style of collections, so we had better stick to it to be consistent.

        For the method chaining style: I will rework it further to mimic the API of Iter, but we should use the existing classes in collections to avoid duplication of code and effort.

        Regarding more additions from Jena, I will create a sub-task for each addition to keep better track of what has been added. The PeekingIterator is definitely useful.

        Matt Benson added a comment -

        My only concern is that the long-term plan for [collections], AFAIK, was to remove its functor types in favor of the Commons [functor] API. Since functor's API has now been split to a separate artifact it wouldn't be that hard to simply depend on it in collections; however functor has yet to be released. :|

        Thomas Neidhart made changes -
        Fix Version/s 4.x [ 12313073 ]
        Fix Version/s 4.0 [ 12314511 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Claude Warren
          • Votes:
            0
            Watchers:
            3
