PARQUET-197 (Parquet): parquet-cascading and the mapred API do not create metadata files


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.6.0

    Description

      Repro: Run a Scalding job that writes Parquet files to a folder. Neither the _metadata nor the _common_metadata file is created.

      Impact: Potential performance problem when Parquet metadata is read on the client side, which is the case for Spark SQL. Without the summary files, the client has to read the footer of each Parquet file individually.

      Cause: The metadata-writing logic is in the mapreduce API of Parquet but not in its mapred API.
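
      To confirm the repro on a job's output folder, a small check along these lines may help. It is only a sketch: the class name `SummaryFileCheck` and the method `missingSummaryFiles` are made up for illustration and are not part of Parquet, and it assumes the job wrote to a local filesystem path rather than HDFS.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Parquet): reports which summary
// files are absent from a job's output folder.
public class SummaryFileCheck {
    // The two summary file names mentioned in this issue.
    static final String[] SUMMARY_FILES = {"_metadata", "_common_metadata"};

    // Returns the names of the summary files that do not exist in outputDir.
    static List<String> missingSummaryFiles(Path outputDir) {
        List<String> missing = new ArrayList<>();
        for (String name : SUMMARY_FILES) {
            if (!Files.isRegularFile(outputDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the output folder is passed as the first argument;
        // the current directory is used otherwise.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("Missing summary files: " + missingSummaryFiles(dir));
    }
}
```

      On an output folder affected by this bug, both names would be expected in the returned list.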

            People

              Assignee and Reporter: tianshuo Tim