  1. Apache Arrow
  2. ARROW-6129

Row_groups duplicate Rows

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.14.1
    • Fix Version/s: None
    • Component/s: C++, Python

    Description

      Writing Parquet with multiple row groups duplicates rows:

          Input: CSV with 10 rows

          Row_Groups=1 --> Output: 10 rows

          Row_Groups=2 --> Output: 20 rows

      Is this expected?
      Code snippet and CSV attached.

      Attachments

        1. tes_output.png
          25 kB
          albertoramon
        2. test01.py
          0.6 kB
          albertoramon
        3. top10.csv
          1 kB
          albertoramon

          People

            Assignee: Unassigned
            Reporter: albertoramon
