Camel / CAMEL-4555

Support nested directories with multiple segment files in the HDFS endpoint consumer


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.1
    • Fix Version/s: 2.8.3, 2.9.0
    • Component/s: camel-hdfs
    • Patch Info: Patch Available
    • Estimated Complexity: Moderate

    Description

      A common pattern in HDFS is to store a dataset as multiple segment files underneath a single directory, each file holding a fragment of the data. Many tools merge these segment files automatically (e.g. hadoop fs -getmerge, Pig script loaders). This patch does the same for the HDFS consumer, using a temporary local directory for the merge.
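
      The merge step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patch: it uses java.nio on the local filesystem as a stand-in for the Hadoop FileSystem API, and the class name SegmentMerger and method merge are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: local-filesystem stand-in for merging HDFS segment files.
public class SegmentMerger {

    /** Concatenates the segment files in segmentDir, in name order, into one temporary file. */
    public static Path merge(Path segmentDir) throws IOException {
        List<Path> segments;
        try (Stream<Path> entries = Files.list(segmentDir)) {
            segments = entries
                    .filter(Files::isRegularFile)
                    // Assumption: names starting with "_" or "." are markers or
                    // checksums (e.g. _SUCCESS), not data segments.
                    .filter(p -> {
                        String name = p.getFileName().toString();
                        return !name.startsWith("_") && !name.startsWith(".");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }

        // The merged result lives in the local temporary directory.
        Path merged = Files.createTempFile("merged-segments", ".tmp");
        try (OutputStream out = Files.newOutputStream(merged)) {
            for (Path segment : segments) {
                Files.copy(segment, out); // append this segment's bytes
            }
        }
        return merged;
    }
}
```

      Sorting by file name keeps segments such as part-00000 and part-00001 in their written order, much as hadoop fs -getmerge produces one concatenated file.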

      Additionally, tools like Pig and Oozie look for a _SUCCESS file in a directory that contains segments. This file indicates that the segments have been completely written. Accordingly, this patch skips a directory if no _SUCCESS file is present.
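
      A minimal sketch of the _SUCCESS check, again using java.nio on the local filesystem as a stand-in for the Hadoop FileSystem API. The class name SuccessMarkerCheck and method isComplete are hypothetical and are not taken from the patch.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: decides whether a segment directory is ready to consume.
public class SuccessMarkerCheck {

    // Marker file name mentioned in the description above.
    static final String SUCCESS_MARKER = "_SUCCESS";

    /** Returns true only if segmentDir is a directory that contains a _SUCCESS file. */
    public static boolean isComplete(Path segmentDir) {
        return Files.isDirectory(segmentDir)
                && Files.isRegularFile(segmentDir.resolve(SUCCESS_MARKER));
    }
}
```

      A consumer following this approach would call isComplete before merging and skip the directory when it returns false, leaving it for a later poll.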

          People

            Assignee: Claus Ibsen (davsclaus)
            Reporter: Ben Hoyt (bhoyt)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 72h
                Remaining: 72h
                Logged: Not Specified
