  1. Flume
  2. FLUME-720

CollectorSink doesn't pass the new format parameter

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.5
    • Fix Version/s: 0.9.5
    • Component/s: Sinks+Sources
    • Labels: None

    Description

      CollectorSink doesn't properly pass the format parameter down to the underlying EscapedCustomDfs sink.
      For example, this works fine:
      collectorSource(54001) | escapedCustomDfs("hdfs://hadoop1-m1:8020/", "test", seqfile("SnappyCodec") );

      However, this uses the codec defined in flume-conf.xml instead of the one given in the format parameter:
      collectorSource(54001) | collectorSink("hdfs://hadoop1-m1:8020/", "test-", 600000, seqfile("SnappyCodec") );

      By itself this bug would not be very serious. The problem is that escapedCustomDfs/customDfs use the same compressor and apply it to the whole file, in addition to the compression done natively by the sequence file. This leaves the sequence file double compressed and invalid.
      As far as I can tell, the only way to get a valid compressed sequence file is to set flume.collector.dfs.compress.codec to "None" in flume-site.xml and use the format parameter to specify the compression for the sequence file, except that this doesn't work with collectorSink because the format parameter is not passed down.
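
      For reference, the workaround setting described above might look roughly like this in flume-site.xml. This is a sketch that assumes the file uses the usual Hadoop-style property layout. The property name and the "None" value are taken from this report, and everything else is illustrative. Per this report it only helps where the format parameter actually reaches the sink (escapedCustomDfs), not with collectorSink.

      ```xml
      <?xml version="1.0"?>
      <configuration>
        <!-- Assumed layout: disable whole-file compression so that only the
             sequence file's own codec (set via the format parameter) applies. -->
        <property>
          <name>flume.collector.dfs.compress.codec</name>
          <value>None</value>
        </property>
      </configuration>
      ```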


          People

            Assignee: Jonathan Hsieh (jmhsieh)
            Reporter: Eran Kutner (erank)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:
