Apache Arrow / ARROW-17995

[C++] arrow::json::DecimalConverter should rescale values based on the explicit_schema

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 6.0.0, 6.0.1, 6.0.2, 7.0.0, 7.0.1, 8.0.0, 8.0.1, 9.0.0
    • Fix Version/s: 9.0.1, 10.0.0
    • Component/s: C++

    Description

      The C++ library does not read JSON decimal values correctly when their scale differs from the scale of the decimal type in the explicit_schema. This can be reproduced with this hello-world program: https://github.com/stiga-huang/arrow-helloworld/tree/d267862

      The input JSON file has the following rows:

      {"id":1,"str":"Some","price":"30.04"}
      {"id":2,"str":"data","price":"1.234"} 

      Reading the price column as decimal128(9, 2) yields

            30.04,
            12.34

      so "1.234" comes back as 12.34.
      

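      Reading it as decimal128(9, 3) instead yields

            3.004,
            1.234

      so "30.04" comes back as 3.004. In both cases the digits of the input string are kept unchanged and the scale from the schema is applied to them, regardless of where the decimal point was in the input.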
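
The arithmetic behind these values can be reproduced without Arrow. The following is a minimal sketch, not Arrow code: `ParsedDecimal`, `ParseDecimal` and `FormatDecimal` are hypothetical helpers introduced only for illustration. It splits a decimal literal into its unscaled integer and the scale implied by the text, then renders that integer under a different scale, which is what appears to happen when the schema's scale is applied without rescaling.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical illustration type, not part of Arrow.
struct ParsedDecimal {
  int64_t unscaled;  // digits with the decimal point removed
  int32_t scale;     // number of digits after the decimal point
};

// Hypothetical helper: handles only plain literals such as "30.04" or "-1.5"
// (optional sign, digits, optional fractional part). No validation is done.
ParsedDecimal ParseDecimal(const std::string& repr) {
  ParsedDecimal out{0, 0};
  bool negative = false;
  bool seen_point = false;
  for (char c : repr) {
    if (c == '-') {
      negative = true;
    } else if (c == '.') {
      seen_point = true;
    } else if (c >= '0' && c <= '9') {
      out.unscaled = out.unscaled * 10 + (c - '0');
      if (seen_point) ++out.scale;
    }
  }
  if (negative) out.unscaled = -out.unscaled;
  return out;
}

// Render an unscaled integer as text under the given scale (scale >= 0).
std::string FormatDecimal(int64_t unscaled, int32_t scale) {
  const bool negative = unscaled < 0;
  std::string digits = std::to_string(negative ? -unscaled : unscaled);
  if (scale > 0) {
    const std::size_t frac = static_cast<std::size_t>(scale);
    // Left-pad so at least one digit remains before the decimal point.
    if (digits.size() <= frac) {
      digits.insert(0, frac + 1 - digits.size(), '0');
    }
    digits.insert(digits.size() - frac, ".");
  }
  return negative ? "-" + digits : digits;
}
```

With these helpers, `ParseDecimal("1.234")` gives the unscaled integer 1234 with scale 3, and `FormatDecimal(1234, 2)` gives "12.34". Likewise `ParseDecimal("30.04")` gives 3004 with scale 2, and `FormatDecimal(3004, 3)` gives "3.004". Both match the values reported above.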
      

      The decimal type in the explicit_schema is set here: https://github.com/stiga-huang/arrow-helloworld/blob/d26786270e87d9ab847658ead96a96190461b98f/json_decimal_example.cc#L38

      The cause is that arrow::json::DecimalConverter does not rescale the parsed value to the scale of out_type_:

        Status Convert(const std::shared_ptr<Array>& in, std::shared_ptr<Array>* out) override {
          if (in->type_id() == Type::NA) {
            return MakeArrayOfNull(out_type_, in->length(), pool_).Value(out);
          }
          const auto& dict_array = GetDictionaryArray(in);
      
          using Builder = typename TypeTraits<T>::BuilderType;
          Builder builder(out_type_, pool_);
          RETURN_NOT_OK(builder.Resize(dict_array.indices()->length()));
      
          auto visit_valid = [&builder](string_view repr) {
            ARROW_ASSIGN_OR_RAISE(value_type value,
                                  TypeTraits<T>::BuilderType::ValueType::FromString(repr));
            // BUG: the parsed value should be rescaled to the scale of out_type_ here
            builder.UnsafeAppend(value);
            return Status::OK();
          };
      
          auto visit_null = [&builder]() {
            builder.UnsafeAppendNull();
            return Status::OK();
          };
      
          RETURN_NOT_OK(VisitDictionaryEntries(dict_array, visit_valid, visit_null));
          return builder.Finish(out);
        }
      

      https://github.com/apache/arrow/blob/cdd0fdf39033b9cf132a5cfc9caa5ed60713845a/cpp/src/arrow/json/converter.cc#L171-L173
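
The missing rescale step can be sketched in plain C++. This is an illustration under stated assumptions, not the patch that resolved this issue: `Rescale` below is a hypothetical stand-in that works on a 64-bit unscaled integer and does not check for overflow. Arrow's `Decimal128` does provide a `Rescale(original_scale, new_scale)` method and `FromString` overloads that report the parsed scale, but whether the actual fix uses them is not stated in this report.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a decimal rescale: converts an unscaled integer
// from one scale to another. Returns std::nullopt when lowering the scale
// would drop a non-zero digit. Overflow is not checked in this sketch.
std::optional<int64_t> Rescale(int64_t unscaled, int32_t from_scale,
                               int32_t to_scale) {
  int64_t value = unscaled;
  // Raising the scale appends one zero digit per step.
  for (int32_t s = from_scale; s < to_scale; ++s) {
    value *= 10;
  }
  // Lowering the scale removes one trailing digit per step, which is only
  // lossless when that digit is zero.
  for (int32_t s = from_scale; s > to_scale; --s) {
    if (value % 10 != 0) return std::nullopt;
    value /= 10;
  }
  return value;
}
```

Under this sketch, "30.04" (unscaled 3004, scale 2) read as decimal128(9, 3) is stored as 30040, which represents 30.040. "1.234" (unscaled 1234, scale 3) cannot be stored at scale 2 without losing a digit, so the sketch returns no value; a converter would have to report an error or round in that case.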

            People

              Assignee: Quanlong Huang (stigahuang)
              Reporter: Quanlong Huang (stigahuang)
                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h