Flume / FLUME-2277

Improve FileChannel documentation to address common support issues

    Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Users often configure too small a batch size with File Channel, use sources such as Exec source that generate small batches, or do not configure multiple disks.
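
      A minimal configuration sketch of the settings mentioned above, assuming a hypothetical agent named `a1` with illustrative paths and batch values (tune them for the actual workload). It relies on File Channel syncing to disk on each transaction commit, so larger batches mean fewer syncs per event:

      ```properties
      # Hypothetical agent "a1"; all names, paths, and sizes are illustrative.
      a1.sources = r1
      a1.channels = c1
      a1.sinks = k1

      # Exec source: raise batchSize so each channel transaction carries more events.
      a1.sources.r1.type = exec
      a1.sources.r1.command = tail -F /var/log/app.log
      a1.sources.r1.batchSize = 1000
      a1.sources.r1.channels = c1

      # File Channel: checkpoint and data directories on separate physical disks.
      # transactionCapacity must be at least as large as the biggest batch.
      a1.channels.c1.type = file
      a1.channels.c1.checkpointDir = /disk1/flume/checkpoint
      a1.channels.c1.dataDirs = /disk2/flume/data,/disk3/flume/data
      a1.channels.c1.transactionCapacity = 10000

      # HDFS sink: drain the channel in comparably sized batches.
      a1.sinks.k1.type = hdfs
      a1.sinks.k1.hdfs.path = /flume/events
      a1.sinks.k1.hdfs.batchSize = 1000
      a1.sinks.k1.channel = c1
      ```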

      1. FLUME-2277.patch
        17 kB
        Brock Noland

        Activity

        Brock Noland added a comment -

        Roshan Naik

        Why do we feel 10 seconds is high? I actually wish we had set it to some quite large number, like five minutes or something. What is the harm in having a high write timeout?

        In other news, the bug FLUME-2307 was caused in part by write timeouts.

        Roshan Naik added a comment -

        Did a quick scan of the patch and it seems to be quite a useful thing to have. One question that came to mind: why is the default write-timeout increased to 30 sec? It seems quite high. The old default of 10 sec itself seemed to be on the higher side.

        Hari Shreedharan added a comment -

        One of the things I have noticed is that having multiple data directories (a handful, not thousands) helps even if they are on the same disk, since Flume serializes the operations to a single disk even when there are no fsyncs. Unfortunately, there is no real way to work around this (since we decide whether to roll and cache the offset at which we wrote), but most disks can handle multiple files being written to at the same time and can fsync them with reasonable latency, so having multiple data disks helps.
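
        A minimal sketch of such a setup, assuming a hypothetical agent `a1` with a File Channel `c1` and illustrative paths, where several data directories sit on one disk:

        ```properties
        # All directories are on the same disk (/disk1); names are illustrative.
        a1.channels.c1.type = file
        a1.channels.c1.checkpointDir = /disk1/flume/checkpoint
        a1.channels.c1.dataDirs = /disk1/flume/data1,/disk1/flume/data2,/disk1/flume/data3
        ```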

        Brock Noland added a comment -

        Thanks Roshan! Should we commit this and then add the performance items in a follow-on?

        Roshan Naik added a comment -

        FYI...
        Based on some trial File Channel (FC) perf measurements I did recently and reported on the dev list, there is a lot of spare disk and CPU capacity even when FC speed is maxed out (on a single disk) with all the configuration tweaks. Consequently, adding multiple FC instances on the same disk easily improved throughput on the host. It scaled fairly linearly until there was no more CPU capacity left.
        So, before adding additional disks, it would be worthwhile to add more FC instances (after verifying disk and CPU utilization), either in the form of separate agents or within a single agent on the same host.

        I was able to do a bit of profiling recently (need to do some more) on where the cycles are going in FC. Will report some preliminary observations soon on the dev list.
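
        A minimal sketch of the single-agent variant, assuming a hypothetical agent `a1` and illustrative paths. Each File Channel instance gets its own checkpoint and data directories, here all on the same disk; the sources and sinks that feed and drain each channel are omitted:

        ```properties
        # Two File Channel instances in one agent, both on /disk1 (illustrative).
        a1.channels = c1 c2

        a1.channels.c1.type = file
        a1.channels.c1.checkpointDir = /disk1/flume/c1/checkpoint
        a1.channels.c1.dataDirs = /disk1/flume/c1/data

        a1.channels.c2.type = file
        a1.channels.c2.checkpointDir = /disk1/flume/c2/checkpoint
        a1.channels.c2.dataDirs = /disk1/flume/c2/data
        ```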

        Brock Noland added a comment -

        Hari Shreedharan, attached is a doc update with some small code changes.


          People

          • Assignee:
            Brock Noland
            Reporter:
            Brock Noland
          • Votes:
            0
            Watchers:
            4

            Dates

            • Created:
              Updated:
