Crunch (Retired) / CRUNCH-331

Change default settings for CombineFileInputFormat

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 0.8.2
    • Fix Version/s: 0.10.0, 0.8.3
    • Component/s: IO
    • Labels: None

      Description

      Currently, every extension of FileSourceImpl enables the CombineFileInputFormat settings by default, because combining tends to improve performance for common file formats such as text, sequence files, and Avro files. However, this default has caused problems for formats like Parquet and for custom file formats that have complex split logic.

      This issue tracks changing the default combine-file settings in at least some contexts, such as From.formattedFile with custom input formats.
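
One way to picture the proposed behavior is a per-format default that an explicit setting can override. The following is a minimal, self-contained sketch and not the code from the attached patches: the class name, the SourceKind enum, and the property key are hypothetical, and a plain Map stands in for the job configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: models "combine by default for simple formats,
// not for formats with complex split logic, unless explicitly overridden".
class CombineDefaults {
    // Hypothetical property key, invented for this example.
    static final String DISABLE_COMBINE_KEY = "example.disable.combine.file";

    // Hypothetical classification of sources by file format.
    enum SourceKind { TEXT, SEQUENCE_FILE, AVRO, PARQUET, CUSTOM }

    // Formats named in the description as benefiting from combining keep it
    // on; Parquet and custom formats default to off.
    static boolean combineByDefault(SourceKind kind) {
        switch (kind) {
            case TEXT:
            case SEQUENCE_FILE:
            case AVRO:
                return true;
            default:
                return false;
        }
    }

    // An explicit setting in the configuration wins over the per-format default.
    static boolean shouldCombine(SourceKind kind, Map<String, String> conf) {
        String disable = conf.get(DISABLE_COMBINE_KEY);
        if (disable != null) {
            return !Boolean.parseBoolean(disable);
        }
        return combineByDefault(kind);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("text:    " + shouldCombine(SourceKind.TEXT, conf));
        System.out.println("parquet: " + shouldCombine(SourceKind.PARQUET, conf));

        // A user who knows a custom format splits safely can opt back in.
        conf.put(DISABLE_COMBINE_KEY, "false");
        System.out.println("custom (opted in): " + shouldCombine(SourceKind.CUSTOM, conf));
    }
}
```

Keeping the override separate from the default lets text, sequence file, and Avro sources retain the performance benefit, while a source created through something like From.formattedFile would start with combining off.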

        Attachments

        1. CRUNCH-331.patch
          6 kB
          Micah Whitacre
        2. CRUNCH-331b.patch
          8 kB
          Micah Whitacre

              People

              • Assignee: Micah Whitacre (mkwhitacre)
              • Reporter: Josh Wills (jwills)
