Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 0.6
    • Component/s: None
    • Labels:
      None

      Description

      We need to use the split size instead.

        Activity

        johanoskarsson Johan Oskarsson added a comment -

        It's worth noting that when I tried using the default split size of 16k, pretty much all my tasks timed out and died, failing the job. This was before the timeout was raised to 10s, though, so it might work better now. But until we have improved the performance of the slice operation, we should probably lower the 16k limit.

        jbellis Jonathan Ellis added a comment -

        In that case we should probably reduce the default to 8k, but our tests here read 10k-20k rows per second via get_range_slice. How big are your rows, and are you running on a VM or real hardware?

        johanoskarsson Johan Oskarsson added a comment -

        The rows only contain the text from a tweet, so they are not very big. This was running on EC2 instances. Granted, that's not the best real-world test, but it shows the margin is not very big. Hopefully it will improve after CASSANDRA-821 is resolved.
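
As a rough illustration of the numbers in this exchange, the sketch below estimates how long one split takes to read at a given rate and what rate is needed to stay inside a timeout. It assumes "16k" means 16384 rows and that a whole split has to be fetched within a single timeout window; the class and method names are hypothetical and are not part of Cassandra.

```java
// Back-of-the-envelope arithmetic only; nothing here talks to a cluster.
// Assumptions: "16k" = 16384 rows, and one split must finish within one timeout.
final class SplitTimeoutEstimate {
    private SplitTimeoutEstimate() {}

    /** Seconds needed to read rowsPerSplit rows at a sustained rowsPerSecond. */
    static double secondsPerSplit(int rowsPerSplit, double rowsPerSecond) {
        if (rowsPerSplit < 0 || rowsPerSecond <= 0) {
            throw new IllegalArgumentException("need rowsPerSplit >= 0 and rowsPerSecond > 0");
        }
        return rowsPerSplit / rowsPerSecond;
    }

    /** Slowest read rate (rows/s) that still finishes the split inside the timeout. */
    static double minRowsPerSecond(int rowsPerSplit, double timeoutSeconds) {
        if (rowsPerSplit < 0 || timeoutSeconds <= 0) {
            throw new IllegalArgumentException("need rowsPerSplit >= 0 and timeoutSeconds > 0");
        }
        return rowsPerSplit / timeoutSeconds;
    }

    /** Largest split (in rows) that finishes inside the timeout at the given rate. */
    static int maxRowsPerSplit(double rowsPerSecond, double timeoutSeconds) {
        if (rowsPerSecond <= 0 || timeoutSeconds <= 0) {
            throw new IllegalArgumentException("need rowsPerSecond > 0 and timeoutSeconds > 0");
        }
        return (int) Math.floor(rowsPerSecond * timeoutSeconds);
    }
}
```

Under these assumptions, a 16384-row split at the 10k-20k rows per second quoted above takes roughly 0.8 to 1.6 seconds, while a node that sustains fewer than about 1638 rows per second would exceed a 10s timeout. That is consistent with a slower environment leaving little margin.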

        jbellis Jonathan Ellis added a comment -

        02: remove fields from CFSplit that are redundant with information in the configuration; use split size for row count

        01: move configuration static methods into ConfigHelper
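
To make the idea behind these two patches concrete, here is a minimal sketch of static configuration accessors kept in a helper class, with the split size read back from the configuration instead of being stored on the split. It uses `java.util.Properties` as a stand-in for the job configuration, and the class name, property key, and default value are all hypothetical; the actual patch may differ.

```java
import java.util.Properties;

// Hypothetical stand-in for a ConfigHelper-style class. The key name and the
// default below are illustrative assumptions, not values from the patch.
final class SketchConfigHelper {
    static final String SPLIT_SIZE_KEY = "sketch.input.split.size"; // hypothetical key
    static final int DEFAULT_SPLIT_SIZE = 16 * 1024;                // assumes "16k" = 16384 rows

    private SketchConfigHelper() {}

    /** Stores the number of rows each input split should cover. */
    static void setInputSplitSize(Properties conf, int splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        conf.setProperty(SPLIT_SIZE_KEY, Integer.toString(splitSize));
    }

    /** Reads the split size back, falling back to the default when unset. */
    static int getInputSplitSize(Properties conf) {
        String value = conf.getProperty(SPLIT_SIZE_KEY);
        return value == null ? DEFAULT_SPLIT_SIZE : Integer.parseInt(value);
    }
}
```

With this shape, a record reader can ask the helper for the split size and use it as the row count for its range query, so the split itself does not need to carry a copy of the value.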

        johanoskarsson Johan Oskarsson added a comment -

        +1.
        My only concern is that having the static configuration methods in the ConfigHelper might make them harder for users to find; most input formats I have worked with have them in the input format class itself. A class-level javadoc in the input format with a short user guide might be a good complement.

        jbellis Jonathan Ellis added a comment -

        Would it be better to just move the ConfigHelper methods (CH.*) to the InputFormat?

        I split it out since it seemed weird to have the record reader call a bunch of static methods on the InputFormat, but if that's the customary place then that's fine.

        jbellis Jonathan Ellis added a comment -

        Committed with extra javadoc to 0.6 and trunk.


          People

          • Assignee:
            jbellis Jonathan Ellis
            Reporter:
            jbellis Jonathan Ellis
          • Votes:
            0
            Watchers:
            1
