Crunch (Retired) / CRUNCH-331

Change default settings for CombineFileInputFormat

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 0.8.2
    • Fix Version/s: 0.10.0, 0.8.3
    • Component/s: IO
    • Labels: None

    Description

      Currently, we enable the CombineFileInputFormat settings by default for every extension of FileSourceImpl, because combining tends to improve performance for common file formats like text, sequence files, and Avro files. However, this default has caused problems for formats like Parquet and for custom file formats that have complex split logic.

      This issue tracks changing the default combine-file settings in at least some contexts, such as From.formattedFile for custom input formats.
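
      As a rough illustration of the idea, the sketch below models a per-source default: formats the description lists as benefiting from combining keep it on, while formats with complex split logic opt out. It is not taken from the attached patches. The class and method names are hypothetical, a plain Map stands in for the Hadoop Configuration, and the property key "crunch.disable.combine.file" is an assumption that should be checked against Crunch's RuntimeParameters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of per-source combine-file defaults; not Crunch's actual classes.
final class CombineFileDefaults {
    // Assumed property key; verify against Crunch's RuntimeParameters before relying on it.
    static final String DISABLE_COMBINE_FILE = "crunch.disable.combine.file";

    // Builds the default settings for a source. Sources with simple split logic
    // keep combining enabled; all others opt out by setting the disable flag.
    static Map<String, String> defaultsFor(boolean simpleSplitLogic) {
        Map<String, String> conf = new HashMap<>();
        if (!simpleSplitLogic) {
            conf.put(DISABLE_COMBINE_FILE, "true");
        }
        return conf;
    }

    // Combining stays enabled unless the disable flag is explicitly set to "true".
    static boolean combineEnabled(Map<String, String> conf) {
        return !Boolean.parseBoolean(conf.getOrDefault(DISABLE_COMBINE_FILE, "false"));
    }

    public static void main(String[] args) {
        // A text-like source versus a custom format with complex split logic.
        System.out.println("simple splits:  " + combineEnabled(defaultsFor(true)));
        System.out.println("complex splits: " + combineEnabled(defaultsFor(false)));
    }
}
```

      Keeping the opt-out as a per-source setting, rather than a pipeline-wide switch, would let text, sequence file, and Avro sources keep the performance benefit while custom input formats are left alone.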

      Attachments

        1. CRUNCH-331.patch
          6 kB
          Micah Whitacre
        2. CRUNCH-331b.patch
          8 kB
          Micah Whitacre

            People

              Assignee: Micah Whitacre (mkwhitacre)
              Reporter: Josh Wills (jwills)
              Votes: 0
              Watchers: 2