Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-5088

heap-buffer-overflow in impala_udf::StringVal::CopyFrom

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.9.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels:

      Description

      Failed in BE test aggregate-functions-test and several end-to-end tests failed in the same area.

      ==8343==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x611000004878 at pc 0x00000101bff5 bp 0x7fff67170cd0 sp 0x7fff67170480
      READ of size 320248 at 0x611000004878 thread T0
          #0 0x101bff4 in __asan_memcpy /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-centos-6/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_interceptors.cc:393
          #1 0x108c55d in impala_udf::StringVal::CopyFrom(impala_udf::FunctionContext*, unsigned char const*, unsigned long) be/src/udf/udf.cc:525:5
          #2 0x10c0bba in impala::ReservoirSampleState<impala_udf::IntVal>::Serialize(impala_udf::FunctionContext*) be/src/exprs/aggregate-functions-ir.cc:1069:21
          #3 0x10c08ab in impala_udf::StringVal impala::AggregateFunctions::ReservoirSampleSerialize<impala_udf::IntVal>(impala_udf::FunctionContext*, impala_udf::StringVal const&) be/src/exprs/aggregate-functions-ir.cc:1235:22
          #4 0x107e42f in impala_udf::UdaTestHarnessBase<impala_udf::StringVal, impala_udf::StringVal>::ExecuteOneLevel(int, impala_udf::UdaTestHarnessBase<impala_udf::StringVal, impala_udf::StringVal>::ScopedFunctionContext*) be/src/udf/uda-test-harness-impl.h:224:20
          #5 0x107c3c2 in impala_udf::UdaTestHarnessBase<impala_udf::StringVal, impala_udf::StringVal>::Execute(impala_udf::StringVal const&, impala_udf::UdaExecutionMode) be/src/udf/uda-test-harness-impl.h:129:16
          #6 0x106654f in HistogramTest_TestInt_Test::TestBody() be/src/exprs/aggregate-functions-test.cc:92:109
          #7 0x2c12602 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (be/build/debug/exprs/aggregate-functions-test+0x2c12602)
          #8 0x2c09179 in testing::Test::Run() (be/build/debug/exprs/aggregate-functions-test+0x2c09179)
          #9 0x2c092c7 in testing::TestInfo::Run() (be/build/debug/exprs/aggregate-functions-test+0x2c092c7)
          #10 0x2c093a4 in testing::TestCase::Run() (be/build/debug/exprs/aggregate-functions-test+0x2c093a4)
          #11 0x2c0a627 in testing::internal::UnitTestImpl::RunAllTests() (be/build/debug/exprs/aggregate-functions-test+0x2c0a627)
          #12 0x2c0a902 in testing::UnitTest::Run() (be/build/debug/exprs/aggregate-functions-test+0x2c0a902)
          #13 0x106771e in main be/src/exprs/aggregate-functions-test.cc:166:156
          #14 0x3ba281ecdc in __libc_start_main (/lib64/libc.so.6+0x3ba281ecdc)
          #15 0xf8c364 in _start (be/build/debug/exprs/aggregate-functions-test+0xf8c364)
      
      0x611000004878 is located 0 bytes to the right of 248-byte region [0x611000004780,0x611000004878)
      allocated by thread T0 here:
          #0 0x1031f58 in __interceptor_malloc /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-centos-6/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:52
          #1 0x108b0a2 in impala_udf::FunctionContext::Allocate(int) be/src/udf/udf.cc:309:21
          #2 0x10a93c8 in impala::AllocBuffer(impala_udf::FunctionContext*, impala_udf::StringVal*, unsigned long) be/src/exprs/aggregate-functions-ir.cc:133:18
          #3 0x10bb561 in void impala::AggregateFunctions::ReservoirSampleInit<impala_udf::IntVal>(impala_udf::FunctionContext*, impala_udf::StringVal*) be/src/exprs/aggregate-functions-ir.cc:1209:3
          #4 0x107e0d7 in impala_udf::UdaTestHarnessBase<impala_udf::StringVal, impala_udf::StringVal>::ExecuteOneLevel(int, impala_udf::UdaTestHarnessBase<impala_udf::StringVal, impala_udf::StringVal>::ScopedFunctionContext*) be/src/udf/uda-test-harness-impl.h:203:5
          #5 0x107c3c2 in impala_udf::UdaTestHarnessBase<impala_udf::StringVal, impala_udf::StringVal>::Execute(impala_udf::StringVal const&, impala_udf::UdaExecutionMode) be/src/udf/uda-test-harness-impl.h:129:16
          #6 0x106654f in HistogramTest_TestInt_Test::TestBody() be/src/exprs/aggregate-functions-test.cc:92:109
          #7 0x2c12602 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (be/build/debug/exprs/aggregate-functions-test+0x2c12602)
      
      SUMMARY: AddressSanitizer: heap-buffer-overflow /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-centos-6/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_interceptors.cc:393 in __asan_memcpy
      Shadow bytes around the buggy address:
        0x0c227fff88b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
        0x0c227fff88c0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
        0x0c227fff88d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c227fff88e0: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
        0x0c227fff88f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      =>0x0c227fff8900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00[fa]
        0x0c227fff8910: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
        0x0c227fff8920: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c227fff8930: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
        0x0c227fff8940: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c227fff8950: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
      Shadow byte legend (one shadow byte represents 8 application bytes):
        Addressable:           00
        Partially addressable: 01 02 03 04 05 06 07
        Heap left redzone:       fa
        Heap right redzone:      fb
        Freed heap region:       fd
        Stack left redzone:      f1
        Stack mid redzone:       f2
        Stack right redzone:     f3
        Stack partial redzone:   f4
        Stack after return:      f5
        Stack use after scope:   f8
        Global redzone:          f9
        Global init order:       f6
        Poisoned by user:        f7
        Container overflow:      fc
        Array cookie:            ac
        Intra object redzone:    bb
        ASan internal:           fe
        Left alloca redzone:     ca
        Right alloca redzone:    cb
      ==8343==ABORTING
      <end of output>
      Test time =   2.47 sec
      ----------------------------------------------------------
      Test Failed.
      "aggregate-functions-test" end time: Mar 16 21:53 PDT
      "aggregate-functions-test" time elapsed: 00:00:02
      ----------------------------------------------------------
      

        Issue Links

          Activity

          Hide
          tarmstrong Tim Armstrong added a comment -

          I think this is a consequence of Taras's patch

          Show
          tarmstrong Tim Armstrong added a comment - I think this is a consequence of Taras's patch
          Hide
          mikesbrown Michael Brown added a comment -

          Thanks for the triage Tim Armstrong.

          Show
          mikesbrown Michael Brown added a comment - Thanks for the triage Tim Armstrong .
          Hide
          tarasbob Taras Bobrovytsky added a comment - - edited

          The problem here is the creation of StringVal by copying:

          size_t buffer_len = sizeof(ReservoirSampleState<T>) + sizeof(ReservoirSample<T>) * num_samples_;
          StringVal dst = StringVal::CopyFrom(ctx, reinterpret_cast<uint8_t*>(this), buffer_len);
          

          buffer_len is larger than the size of ReservoirSampleState, so some garbage that follows ReservoirSampleState also gets copied. However this is OK because we never access it and it is overwritten later on. This call to CopyFrom is convenient because it allocates 'buffer_len' bytes and copies at the same time.

          Show
          tarasbob Taras Bobrovytsky added a comment - - edited The problem here is the creation of StringVal by copying: size_t buffer_len = sizeof(ReservoirSampleState<T>) + sizeof(ReservoirSample<T>) * num_samples_; StringVal dst = StringVal::CopyFrom(ctx, reinterpret_cast<uint8_t*>( this ), buffer_len); buffer_len is larger than the size of ReservoirSampleState, so some garbage that follows ReservoirSampleState also gets copied. However this is OK because we never access it and it is overwritten later on. This call to CopyFrom is convenient because it allocates 'buffer_len' bytes and copies at the same time.
          Hide
          tarmstrong Tim Armstrong added a comment -

          Taras Bobrovytsky "0x611000004878 is located 0 bytes to the right of 248-byte region" means that it's reading at least a byte past the end of the buffer, which could easily crash the program if the allocation is sitting on the edge of an unmapped area of memory.

          It looks like StringVal::CopyFrom is the wrong API to use, since you really want to:

          • Allocate a new string with buffer_len bytes
          • Copy across the sizeof(ReservoirSampleState<T>) header
          • Copy across the 'samples_' array
          Show
          tarmstrong Tim Armstrong added a comment - Taras Bobrovytsky "0x611000004878 is located 0 bytes to the right of 248-byte region" means that it's reading at least a byte past the end of the buffer, which could easily crash the program if the allocation is sitting on the edge of an unmapped area of memory. It looks like StringVal::CopyFrom is the wrong API to use, since you really want to: Allocate a new string with buffer_len bytes Copy across the sizeof(ReservoirSampleState<T>) header Copy across the 'samples_' array
          Hide
          tarasbob Taras Bobrovytsky added a comment -
          commit c27ad3526675ff0ab80c7bb5d7c8364437448b53
          Author: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
          Date:   Mon Mar 20 15:05:17 2017 -0700
          
              IMPALA-5088: Fix heap buffer overflow
              
              In a recent patch (IMPALA-4787), we introduced a change where we are
              reading past the end of a buffer during a copy. Even though we would
              never read the data that was copied from past the end of the buffer,
              this could cause a crash if the allocation is sitting on the edge of an
              unmapped area of memory. This also caused the ASAN build to fail.
              
              The issue is fixed by never accessing memory that is past the end of the
              buffer.
              
              Change-Id: Ia68d69b35d53934e6f1aae2e1696503d17660ca2
              Reviewed-on: http://gerrit.cloudera.org:8080/6441
              Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
              Tested-by: Impala Public Jenkins
          
          Show
          tarasbob Taras Bobrovytsky added a comment - commit c27ad3526675ff0ab80c7bb5d7c8364437448b53 Author: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Date: Mon Mar 20 15:05:17 2017 -0700 IMPALA-5088: Fix heap buffer overflow In a recent patch (IMPALA-4787), we introduced a change where we are reading past the end of a buffer during a copy. Even though we would never read the data that was copied from past the end of the buffer, this could cause a crash if the allocation is sitting on the edge of an unmapped area of memory. This also caused the ASAN build to fail. The issue is fixed by never accessing memory that is past the end of the buffer. Change-Id: Ia68d69b35d53934e6f1aae2e1696503d17660ca2 Reviewed-on: http: //gerrit.cloudera.org:8080/6441 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: Impala Public Jenkins

            People

            • Assignee:
              tarasbob Taras Bobrovytsky
              Reporter:
              mikesbrown Michael Brown
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development