Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.7.7, 1.8.2
    • Fix Version/s: 1.9.0
    • Component/s: java
    • Labels:
      None
    • Flags:
      Patch

      Description

      Java's ArrayList implementation clears all Objects from the internal buffer when the clear() method is called. This allows the Objects to be free for GC. We should do the same in Avro org.apache.avro.generic.GenericData

      ArrayList Source

      1. AVRO-2050.3.patch
        2 kB
        BELUGA BEHR
      2. AVRO-2050.2.patch
        2 kB
        BELUGA BEHR
      3. AVRO-2050.1.patch
        0.9 kB
        BELUGA BEHR

        Activity

        Hide
        nkollar Nandor Kollar added a comment -

        I'm wondering why clear() method is in overridden. It looks like the base class is AbstractList, which has clear method implemented correctly, so we might instead implement the Array iterator's remove() method no?

        Show
        nkollar Nandor Kollar added a comment - I'm wondering why clear() method is in overridden. It looks like the base class is AbstractList, which has clear method implemented correctly, so we might instead implement the Array iterator's remove() method no?
        Hide
        belugabehr BELUGA BEHR added a comment -

        Nandor Kollar

        This implementation is essentially an ArrayList. The ArrayList overwrites the clear method because using the default AbstractList implementation requires instantiating an Iterator and then deleting each item in the Iterator one at a time. This is bad performance in terms of constant stack manipulation, but also this amounts to draining the array from the head of the list. Draining from the head requires an array copy for each item removed to shift down the existing records. It is much better to override the method as ArrayList has done. However, I did see some overlap with the toString and add methods which can be leveraged. Changed the patch to remove the two overrides.

        Show
        belugabehr BELUGA BEHR added a comment - Nandor Kollar This implementation is essentially an ArrayList . The ArrayList overwrites the clear method because using the default AbstractList implementation requires instantiating an Iterator and then deleting each item in the Iterator one at a time. This is bad performance in terms of constant stack manipulation, but also this amounts to draining the array from the head of the list. Draining from the head requires an array copy for each item removed to shift down the existing records. It is much better to override the method as ArrayList has done. However, I did see some overlap with the toString and add methods which can be leveraged. Changed the patch to remove the two overrides.
        Hide
        cutting Doug Cutting added a comment -

        Arrays.fill() would be simpler than coding the loop yourself and perform similarly.

        A more optimized implementation could use System.arraycopy(), which is native code, but that's perhaps overkill.

        Show
        cutting Doug Cutting added a comment - Arrays.fill() would be simpler than coding the loop yourself and perform similarly. A more optimized implementation could use System.arraycopy(), which is native code, but that's perhaps overkill.
        Hide
        belugabehr BELUGA BEHR added a comment -

        Doug Cutting Thanks for the input. I was simply mimicking the OpenJDK implementation. I attached a new patch. Either will work.

        Show
        belugabehr BELUGA BEHR added a comment - Doug Cutting Thanks for the input. I was simply mimicking the OpenJDK implementation. I attached a new patch. Either will work.
        Hide
        hadoopqa Hadoop QA added a comment -
        +1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 14s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 buildtest 0m 0s master passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 buildtest 6m 52s the patch passed
        7m 8s



        Subsystem Report/Notes
        Docker Client=1.13.1 Server=1.13.1 Image:yetus/avro:793178a
        JIRA Issue AVRO-2050
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12877874/AVRO-2050.3.patch
        Optional Tests buildtest javac
        uname Linux ec17bb8bd028 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 GNU/Linux
        Build tool build
        git revision master / 793178a
        Default Java 1.7.0_111
        modules C: lang/java U: lang/java
        Console output https://builds.apache.org/job/PreCommit-AVRO-Build-TEST/27/console
        Powered by Apache Yetus 0.4.0 http://yetus.apache.org

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 14s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 buildtest 0m 0s master passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 buildtest 6m 52s the patch passed 7m 8s Subsystem Report/Notes Docker Client=1.13.1 Server=1.13.1 Image:yetus/avro:793178a JIRA Issue AVRO-2050 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12877874/AVRO-2050.3.patch Optional Tests buildtest javac uname Linux ec17bb8bd028 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 GNU/Linux Build tool build git revision master / 793178a Default Java 1.7.0_111 modules C: lang/java U: lang/java Console output https://builds.apache.org/job/PreCommit-AVRO-Build-TEST/27/console Powered by Apache Yetus 0.4.0 http://yetus.apache.org This message was automatically generated.
        Hide
        gszadovszky Gabor Szadovszky added a comment -

        +1

        Show
        gszadovszky Gabor Szadovszky added a comment - +1
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit d0cce25208ad83a9fb6fcce825ae64881bdda420 in avro's branch refs/heads/master from BELUGA BEHR
        [ https://git-wip-us.apache.org/repos/asf?p=avro.git;h=d0cce25 ]

        AVRO-2050: Clear Array To Allow GC

        Show
        jira-bot ASF subversion and git services added a comment - Commit d0cce25208ad83a9fb6fcce825ae64881bdda420 in avro's branch refs/heads/master from BELUGA BEHR [ https://git-wip-us.apache.org/repos/asf?p=avro.git;h=d0cce25 ] AVRO-2050 : Clear Array To Allow GC

          People

          • Assignee:
            belugabehr BELUGA BEHR
            Reporter:
            belugabehr BELUGA BEHR
          • Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development