Avro
  1. Avro
  2. AVRO-730

Implement set and remove methods from List interface on GenericData.Array

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 1.5.0
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      GenericData.Array throws an UnsupportedOperationException when set() or remove() are used on an instance of it. I think it's reasonable for those methods to work though, since their implementation is simple and inline with most other List types.

      1. AVRO-730.patch
        3 kB
        Chase Bradford

        Activity

        Hide
        Chase Bradford added a comment -

        This patch adds implementations for remove(int index) and set(int index, T object).

        Show
        Chase Bradford added a comment - This patch adds implementations for remove(int index) and set(int index, T object).
        Hide
        Scott Carey added a comment -

        Overall looks good.

        Tests could be improved by adding checks that trigger IndexOutOfBoundsException when we call set() or remove() on a shrunken array. testArrayRemove() checks that for get() but not set() or remove().
        Additionally, we should test that adding another element after removal works as expected.

        Show
        Scott Carey added a comment - Overall looks good. Tests could be improved by adding checks that trigger IndexOutOfBoundsException when we call set() or remove() on a shrunken array. testArrayRemove() checks that for get() but not set() or remove(). Additionally, we should test that adding another element after removal works as expected.
        Hide
        Doug Cutting added a comment -

        We should be able to get this into 1.5.0.

        Show
        Doug Cutting added a comment - We should be able to get this into 1.5.0.
        Hide
        Chase Bradford added a comment -

        Added additional checks for ArrayIndexOutOfBounds after calls to remove.

        Show
        Chase Bradford added a comment - Added additional checks for ArrayIndexOutOfBounds after calls to remove.
        Hide
        Doug Cutting added a comment -

        I think remove() should not insert a null at the end of the array. The primary difference between GenericData.Array and ArrayList is that GenericData.Array supports reuse: if you're reading a series of List<Foo> in a loop, passing the same List<Foo> instance to each call to read() for reuse, you'd like the Foo elements to be reused too. ArrayList won't let you easily achieve this, since it always nulls out items past the end and that behavior cannot be overidden.

        Show
        Doug Cutting added a comment - I think remove() should not insert a null at the end of the array. The primary difference between GenericData.Array and ArrayList is that GenericData.Array supports reuse: if you're reading a series of List<Foo> in a loop, passing the same List<Foo> instance to each call to read() for reuse, you'd like the Foo elements to be reused too. ArrayList won't let you easily achieve this, since it always nulls out items past the end and that behavior cannot be overidden.
        Hide
        Chase Bradford added a comment -

        I had thought about that, but couldn't think of the best way to implement it without a large opportunity for odd behavior.

        Here are some of the thoughts that went through my mind:

        1) Don't do anything. This would cause a duplicate object reference at data[size] and data[size-1], which would be very bad.
        2) Set data[size] = removed_object. The inner array would have only distinct elements with no loss of instances.
        3) Shift entire array and set data[size-1] = null. We only lose one object instance at the end, which hopefully wouldn't get hit all the time.

        2) is the most efficient, but I didn't know what sort of assurances should be made about the removed element. I think most people would assume that once an object is removed from a List, regardless of the implementation, then that object is safe from modification by the list.

        Show
        Chase Bradford added a comment - I had thought about that, but couldn't think of the best way to implement it without a large opportunity for odd behavior. Here are some of the thoughts that went through my mind: 1) Don't do anything. This would cause a duplicate object reference at data [size] and data [size-1] , which would be very bad. 2) Set data [size] = removed_object. The inner array would have only distinct elements with no loss of instances. 3) Shift entire array and set data [size-1] = null. We only lose one object instance at the end, which hopefully wouldn't get hit all the time. 2) is the most efficient, but I didn't know what sort of assurances should be made about the removed element. I think most people would assume that once an object is removed from a List, regardless of the implementation, then that object is safe from modification by the list.
        Hide
        Chase Bradford added a comment -

        bullet 3 should read data[data.length - 1]

        Show
        Chase Bradford added a comment - bullet 3 should read data [data.length - 1]
        Hide
        Doug Cutting added a comment -

        Chase: good point, I didn't think that it would leave a duplicate element, which would be bad. It sounds like you've thought this through. I'm now fine with the current patch. +1 Or, I'd be fine with (2) if we added a javadoc comment noting that the reference is retained. If folks really want to remove a reference they still could by setting a value to null before removing it.

        Show
        Doug Cutting added a comment - Chase: good point, I didn't think that it would leave a duplicate element, which would be bad. It sounds like you've thought this through. I'm now fine with the current patch. +1 Or, I'd be fine with (2) if we added a javadoc comment noting that the reference is retained. If folks really want to remove a reference they still could by setting a value to null before removing it.
        Hide
        Doug Cutting added a comment -

        I will commit this soon unless someone objects.

        Show
        Doug Cutting added a comment - I will commit this soon unless someone objects.
        Hide
        Doug Cutting added a comment -

        I committed this. Thanks, Chase!

        Show
        Doug Cutting added a comment - I committed this. Thanks, Chase!

          People

          • Assignee:
            Chase Bradford
            Reporter:
            Chase Bradford
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development