Improve ByteStream by removing all synchronized method calls

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      HIVE-720. Improve ByteStream by removing all synchronized method calls. (Yongqiang He via zshao)

      Description

      org.apache.hadoop.hive.serde2.ByteStream has two inner classes, Input and Output, which inherit from ByteArrayInputStream and ByteArrayOutputStream respectively.
      Both base classes declare many of their methods synchronized, which makes Input and Output slow.

      We should let ByteStream.Input and ByteStream.Output directly inherit InputStream and OutputStream so we don't need to call synchronized methods at all. This will help LazySimpleSerDe, ColumnarSerDe as well as LazyBinarySerDe.
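
      As a rough illustration of the proposal, the sketch below extends InputStream and OutputStream directly and keeps every method unsynchronized. It is not the committed patch: the class name NonSyncByteStreamSketch, the accessors getData() and getCount(), and the reset(...) signatures are assumptions chosen for the example, and the buffers are only safe when a single thread uses them.

      ```java
      import java.io.InputStream;
      import java.io.OutputStream;
      import java.util.Arrays;

      /** Hypothetical sketch; names and accessors are illustrative, not the Hive API. */
      public class NonSyncByteStreamSketch {

          /** Growable byte buffer that extends OutputStream directly; no method is synchronized. */
          public static final class Output extends OutputStream {
              private byte[] buf = new byte[32];
              private int count = 0;

              @Override
              public void write(int b) {
                  ensureCapacity(count + 1);
                  buf[count++] = (byte) b;
              }

              @Override
              public void write(byte[] src, int off, int len) {
                  if (off < 0 || len < 0 || len > src.length - off) {
                      throw new IndexOutOfBoundsException();
                  }
                  ensureCapacity(count + len);
                  System.arraycopy(src, off, buf, count, len);
                  count += len;
              }

              /** Discards the written bytes but keeps the backing array for reuse. */
              public void reset() {
                  count = 0;
              }

              /** Returns the backing array without copying; only the first getCount() bytes are valid. */
              public byte[] getData() {
                  return buf;
              }

              public int getCount() {
                  return count;
              }

              private void ensureCapacity(int min) {
                  if (min > buf.length) {
                      buf = Arrays.copyOf(buf, Math.max(min, buf.length * 2));
                  }
              }
          }

          /** Reader over a caller-supplied byte array that extends InputStream directly. */
          public static final class Input extends InputStream {
              private byte[] buf = new byte[0];
              private int pos = 0;
              private int end = 0;

              /** Points this stream at data[offset .. offset + length) without copying. */
              public void reset(byte[] data, int offset, int length) {
                  buf = data;
                  pos = offset;
                  end = offset + length;
              }

              @Override
              public int read() {
                  return pos < end ? (buf[pos++] & 0xff) : -1;
              }

              @Override
              public int read(byte[] dst, int off, int len) {
                  if (off < 0 || len < 0 || len > dst.length - off) {
                      throw new IndexOutOfBoundsException();
                  }
                  if (len == 0) {
                      return 0;
                  }
                  if (pos >= end) {
                      return -1;
                  }
                  int n = Math.min(len, end - pos);
                  System.arraycopy(buf, pos, dst, off, n);
                  pos += n;
                  return n;
              }

              @Override
              public int available() {
                  return end - pos;
              }
          }
      }
      ```

      A serializer could reuse one Output per row by calling reset() before each write, then hand getData() and getCount() to an Input via reset(...) for reading. Because nothing is synchronized, each instance must stay confined to one thread.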

      1. HIVE-720.2.patch
        45 kB
        He Yongqiang
      2. HIVE-720.1.patch
        12 kB
        He Yongqiang

        Activity

        He Yongqiang added a comment -

        Hive has its own HiveDataInputBuffer and HiveDataOutputBuffer. We can reuse a lot of code here.

        He Yongqiang added a comment -

        HIVE-720.1.patch tries to reuse some of the existing io code. It passed the tests in my local environment.

        Zheng Shao added a comment -

        Overall it looks good. Some nitpicks about class names:
        1. ByteArrayInputBuffer -> NonSyncByteArrayInputStream, ByteArrayOutputBuffer -> NonSyncByteArrayOutputStream.
        2. HiveDataInputBuffer -> NonSyncDataInputBuffer, HiveDataOutputBuffer -> NonSyncDataOutputBuffer

        Also can we put the new classes into common/io instead of common?

        Also, what is the reason that HiveDataInputBuffer inherits FilterInputStream, while HiveDataOutputBuffer inherits DataOutputStream?
        We might have discussed it before but I cannot remember. Can you put some comments into the code?

        He Yongqiang added a comment -

        Will do 1 and 2, and move the two classes into common.io.
        >>what is the reason that HiveDataInputBuffer inherits FilterInputStream, while HiveDataOutputBuffer inherits DataOutputStream?
        Let me figure out why and add some comments to the code.

        Zheng Shao added a comment -

        +1. Will commit if tests pass.

        Zheng Shao added a comment -

        Committed. Thanks Yongqiang!


          People

          • Assignee:
            He Yongqiang
            Reporter:
            Zheng Shao
          • Votes:
            0
            Watchers:
            0
