Apache Arrow / ARROW-17308

ValueError: Keyword 'validate_schema' is not yet supported with the new Dataset API

Details

    Description

      The documentation for PyArrow 6.x and 7.x both lists `validate_schema` as a supported argument of the `ParquetDataset` class. Yet passing that argument to the constructor together with `use_legacy_dataset=False` results in:

      ValueError: Keyword 'validate_schema' is not yet supported with the new Dataset API

      Code:

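      import pyarrow.parquet

      # paths, validate_schema, filesystem, and partitioning are defined by the caller.
      parquet_dataset = pyarrow.parquet.ParquetDataset(
          path_or_paths=paths,
          validate_schema=validate_schema,
          filesystem=filesystem,
          partitioning=partitioning,
          use_legacy_dataset=False,
      )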

      Docs links:

      https://arrow.apache.org/docs/6.0/python/generated/pyarrow.parquet.ParquetDataset.html

      https://arrow.apache.org/docs/7.0/python/generated/pyarrow.parquet.ParquetDataset.html 
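
      Until the behavior and the documentation agree, one possible caller-side workaround is to drop the keyword before calling the constructor whenever the new Dataset API is selected. The following is a minimal sketch, not part of PyArrow: the helper `dataset_kwargs` and the constant `UNSUPPORTED_WITH_NEW_API` are hypothetical names, and the set contains only `validate_schema` because that is the only keyword named in this report. Dropping the keyword also means the requested validation setting is silently ignored.

```python
from typing import Any, Dict

# Assumption: only the keyword named in this report is listed here.
# Other keywords may also be rejected by the new Dataset API.
UNSUPPORTED_WITH_NEW_API = frozenset({"validate_schema"})


def dataset_kwargs(use_legacy_dataset: bool, **kwargs: Any) -> Dict[str, Any]:
    """Build ParquetDataset keyword arguments (hypothetical helper).

    When the new Dataset API is requested (use_legacy_dataset=False),
    keywords known to be rejected are removed. Otherwise the keywords
    are passed through unchanged.
    """
    result = dict(kwargs)
    result["use_legacy_dataset"] = use_legacy_dataset
    if not use_legacy_dataset:
        for name in UNSUPPORTED_WITH_NEW_API:
            # pop with a default so a missing keyword is not an error
            result.pop(name, None)
    return result


# Intended use, with the variables from the report:
# parquet_dataset = pyarrow.parquet.ParquetDataset(
#     **dataset_kwargs(
#         False,
#         path_or_paths=paths,
#         validate_schema=validate_schema,
#         filesystem=filesystem,
#         partitioning=partitioning,
#     )
# )
```

      The filtering is kept outside the constructor call so that the same call site works with either value of `use_legacy_dataset`.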

            People

              Assignee: Unassigned
              Reporter: Abderrahmane Jaidi (jaidisido)