  1. Apache Arrow
  2. ARROW-11548

[C++] RandomArrayGenerator::List size mismatch


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 4.0.0
    • Component/s: C++

    Description

      RandomArrayGenerator::List consistently produces ListArrays whose length is one less than the size its documentation promises. The null bitmaps are also wrong: they do not honor null_probability, and they do not match the length of the array.
       
      Here is a simple test:
       
      TEST(TestAdapterWriteNested, ListTest) {
        int64_t num_rows = 2;
        static constexpr random::SeedType kRandomSeed2 = 0x0ff1ce;
        arrow::random::RandomArrayGenerator rand(kRandomSeed2);
        std::shared_ptr<Array> value_array = rand.ArrayOf(int32(), 2 * num_rows, 0.2);
        std::shared_ptr<Array> array = rand.List(*value_array, num_rows, 1);
        RecordProperty("bitmap", *(array->null_bitmap_data()));
        RecordProperty("length", array->length());
        RecordProperty("array", array->ToString());
      }
       
      Here are the results:
       
      <testcase name="ListTest" status="run" result="completed" time="0" timestamp="2021-02-07T15:23:16" classname="TestAdapterWriteNested">
        <properties>
          <property name="bitmap" value="3"/>
          <property name="length" value="1"/>
          <property name="array" value="[ [ null, 1074834796, 551076274, 1184187771 ] ]"/>
        </properties>
      </testcase>
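
      To read the "bitmap" property above, it helps to expand the byte into individual validity bits. The following is a minimal sketch that uses only the standard library, not Arrow itself; DecodeValidityByte and CountValid are hypothetical helper names introduced for illustration. It relies on Arrow's convention that validity bitmaps are read least-significant bit first, with a set bit meaning "valid".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Decode the first `num_bits` bits of a validity byte, least-significant
// bit first. true means "valid", false means "null".
// (Hypothetical helper for illustration; not part of the Arrow API.)
std::vector<bool> DecodeValidityByte(std::uint8_t byte, int num_bits) {
  assert(num_bits >= 0 && num_bits <= 8);
  std::vector<bool> bits;
  for (int i = 0; i < num_bits; ++i) {
    bits.push_back(((byte >> i) & 1) != 0);
  }
  return bits;
}

// Count the set bits in a byte, i.e. the number of slots marked valid.
// (Hypothetical helper for illustration; not part of the Arrow API.)
int CountValid(std::uint8_t byte) {
  int count = 0;
  for (int i = 0; i < 8; ++i) {
    count += (byte >> i) & 1;
  }
  return count;
}
```

      Calling CountValid(3) shows that two bits are set in the reported byte, although the array has length 1, and DecodeValidityByte(3, 1) shows that the single list slot is marked valid even though null_probability was 1.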
       
      Here is how RandomArrayGenerator::List is documented to behave:
       
      /// \brief Generate a random ListArray
      ///
      /// \param[in] values The underlying values array
      /// \param[in] size The size of the generated list array
      /// \param[in] null_probability the probability of a list value being null
      /// \param[in] force_empty_nulls if true, null list entries must have 0 length
      ///
      /// \return a generated Array
      std::shared_ptr<Array> List(const Array& values, int64_t size, double null_probability,
                                  bool force_empty_nulls = false);
       
      Note that the generator fails in at least three respects:
      1. The length of the generated array is too low: 1 instead of the requested 2.
      2. Even when null_probability is set to 1, there are still 1s in the bitmap.
      3. The bitmap is larger than the Array: two bits are set, but the Array has length 1.
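
      The size relationships behind these three points can be sketched without Arrow. In Arrow's list layout, a list array of length n is described by n + 1 offsets and needs one validity bit per list. The code below is a standard-library-only illustration; ListLengthFromOffsets, BitmapBytesForLength and EvenOffsets are hypothetical names. The report does not confirm it, but one plausible reading of the output above is that only `size` offsets were generated where `size + 1` were needed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of lists described by an offsets buffer: one fewer than the
// number of offsets, since list i spans [offsets[i], offsets[i + 1]).
std::int64_t ListLengthFromOffsets(const std::vector<std::int32_t>& offsets) {
  assert(!offsets.empty());
  return static_cast<std::int64_t>(offsets.size()) - 1;
}

// Bytes needed to hold one validity bit per list slot.
std::int64_t BitmapBytesForLength(std::int64_t length) {
  return (length + 7) / 8;
}

// Hypothetical stand-in for a random offsets generator: produces
// `num_offsets` evenly spaced offsets from 0 to `num_values`.
std::vector<std::int32_t> EvenOffsets(std::int64_t num_offsets,
                                      std::int32_t num_values) {
  assert(num_offsets >= 2);
  std::vector<std::int32_t> offsets;
  for (std::int64_t i = 0; i < num_offsets; ++i) {
    offsets.push_back(
        static_cast<std::int32_t>(i * num_values / (num_offsets - 1)));
  }
  return offsets;
}
```

      With the numbers from the test (size 2, four values), EvenOffsets(2, 4) yields the offsets 0 and 4, which describe a single list holding all four values, as in the reported output. EvenOffsets(3, 4) yields 0, 2 and 4, which describe the two lists the caller asked for.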

            People

              Assignee:
              apitrou Antoine Pitrou
              Reporter:
              yingzhou474 Ian Alexander Joiner
              Votes: 0
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m