Accumulo / ACCUMULO-1566

Add ability for client to start Scanner readahead immediately

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: client
    • Labels: None

      Description

      When the client cares about getting results in sorted order, the BatchScanner, as nice as it is, is mostly irrelevant.

      One interesting property of the Scanner is that it will begin to pre-fetch more results after the 3rd batch of results has been fetched from the server.

      Clients may have an idea of the number of records that a scan will return, and thus of how they want to control readahead. It would be nice to let the client control how many batches must be fetched before the readahead thread starts.
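
The request above can be pictured with a small, self-contained model. This is a sketch under assumptions, not Accumulo's client code: the class `ReadaheadThresholdSketch` and the method `prefetchedBatches` are hypothetical names, and the rule "prefetch once at least `threshold` batches have been returned" is an assumed reading of the description. A threshold of 3 mirrors the behaviour described above, and a threshold of 0 models starting readahead immediately.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a configurable readahead threshold (not Accumulo code).
final class ReadaheadThresholdSketch {

    /**
     * For each batch (0-based), reports whether it would be fetched ahead of
     * time, assuming prefetching begins once `threshold` batches were returned.
     */
    static List<Boolean> prefetchedBatches(int totalBatches, long threshold) {
        List<Boolean> prefetched = new ArrayList<>();
        long batchesReturned = 0;
        for (int i = 0; i < totalBatches; i++) {
            // Assumed rule: readahead is active once enough batches were returned.
            prefetched.add(batchesReturned >= threshold);
            batchesReturned++;
        }
        return prefetched;
    }

    public static void main(String[] args) {
        // Threshold 3: the first three batches are fetched on demand.
        System.out.println(prefetchedBatches(5, 3)); // [false, false, false, true, true]
        // Threshold 0: readahead starts immediately.
        System.out.println(prefetchedBatches(5, 0)); // [true, true, true, true, true]
    }
}
```

A client expecting only one or two batches might prefer a high threshold to avoid wasted fetches, while a client expecting a long scan might prefer 0.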

        Activity

        jira-bot ASF subversion and git services added a comment -

        Commit 05b3359b9c6643cbdb2284afead9a0dcac2a9300 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=05b3359 ]

        ACCUMULO-1566 Pass down the readaheadThreshold parameter from the client to the
        server so that the same limit is adhered to by the server in regards to
        pipelining.

        Thanks to Keith for his help here.

        jira-bot ASF subversion and git services added a comment -

        Commit b4bd204320632b598bee047c620f2e45cc7caef9 in branch refs/heads/ACCUMULO-1566 from [~keith_turner]
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=b4bd204 ]

        ACCUMULO-1566 the missing link

        jira-bot ASF subversion and git services added a comment -

        Commit ff95c7147d210171fef3824eae399b7384cdaff9 in branch refs/heads/ACCUMULO-1566 from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=ff95c71 ]

        ACCUMULO-1566 Pass down the readaheadThreshold parameter from the client to the
        server so that the same limit is adhered to by the server in regards to
        pipelining.

        kturner Keith Turner added a comment -

        "any problem with me opening up a new ticket for that?"

        Up to you. I think one ticket would make it easier for someone looking at the 1.6.0 release notes, with less for them to deduce. We could change the subject and description of this ticket.

        elserj Josh Elser added a comment -

        Keith Turner, any problem with me opening up a new ticket for that? I think I see where the work needs to happen in TabletServer, but I'll definitely need some review before pushing the server-side changes.

        kturner Keith Turner added a comment -

        Josh Elser, readahead also happens on the server side. When the pipeline is fully spun up, two threads will be running on the tserver (one reading data from rfiles and one transferring data to the client) and two threads on the client (one transferring data from the server and one processing data). The server side currently has a hard-coded trigger of three batches for spinning up the extra thread. We should probably pass the threshold from the client side and use it for the server-side scan session.
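
The client half of the pipeline described in this comment (one thread transferring batches, one thread processing them) can be sketched with a bounded queue. This is only an illustration under assumptions: `ScanPipelineSketch` and `run` are hypothetical names, the batches are plain strings, and a queue capacity of 1 is an assumed stand-in for "one batch of readahead". It does not reproduce the TabletServer or client implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical two-thread pipeline: one thread transfers, one processes.
final class ScanPipelineSketch {

    static List<String> run(List<String> batches) {
        // Capacity 1 (assumption): at most one batch is held ahead of the consumer.
        BlockingQueue<Optional<String>> queue = new ArrayBlockingQueue<>(1);

        // "Transfer" thread: hands batches over while the caller processes earlier ones.
        Thread transfer = new Thread(() -> {
            try {
                for (String batch : batches) {
                    queue.put(Optional.of(batch));
                }
                queue.put(Optional.empty()); // empty Optional marks the end of the scan
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        transfer.start();

        // "Processing" thread (the caller): consumes batches in arrival order.
        List<String> processed = new ArrayList<>();
        try {
            Optional<String> next = queue.take();
            while (next.isPresent()) {
                processed.add("processed " + next.get());
                next = queue.take();
            }
            transfer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while scanning", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("batch-1", "batch-2", "batch-3")));
    }
}
```

With a single producer and a single consumer the batches arrive in order, which matches the sorted-order guarantee that motivates using a Scanner here.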

        jira-bot ASF subversion and git services added a comment -

        Commit dab1be962b6ab1ab095c4ccf7f3995ab1208c3d7 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=dab1be9 ]

        ACCUMULO-1566 Add in an integration-test which checks that the readahead
        configuration works as intended.

        By applying the SlowIterator to sleep on next(), and then sleep for the same
        amount of time in the main loop over the iterator from the Scanner, we should
        only have to wait once when we're in "readahead mode", but wait twice when we're
        not.
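
The timing argument in this commit message can be checked with simple arithmetic. The sketch below is a simplified per-batch model, not the integration test itself: `ReadaheadTimingSketch` and its methods are hypothetical names, and it assumes a fixed delay per batch on each side, whereas the message describes SlowIterator sleeping on next(). No real sleeping takes place; the methods only add up the simulated delays.

```java
// Hypothetical timing model for scanning with and without readahead.
final class ReadaheadTimingSketch {

    /** No readahead: every batch costs one fetch delay plus one processing delay. */
    static long sequentialMillis(int batches, long fetchMillis, long processMillis) {
        return batches * (fetchMillis + processMillis);
    }

    /**
     * Readahead: fetching batch i+1 overlaps with processing batch i, so only the
     * first fetch and the last processing step are not overlapped.
     */
    static long pipelinedMillis(int batches, long fetchMillis, long processMillis) {
        if (batches == 0) {
            return 0;
        }
        return fetchMillis
                + (batches - 1) * Math.max(fetchMillis, processMillis)
                + processMillis;
    }

    public static void main(String[] args) {
        // Ten batches with equal 100 ms delays on both sides.
        System.out.println(sequentialMillis(10, 100, 100)); // 2000
        System.out.println(pipelinedMillis(10, 100, 100));  // 1100
    }
}
```

With equal delays, the sequential case pays about two delays per batch and the pipelined case about one, which is the "wait twice" versus "wait once" distinction drawn above.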

        jira-bot ASF subversion and git services added a comment -

        Commit ff58f6b15c36f5f33d0a296c9806b24ad8a94ab3 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=ff58f6b ]

        ACCUMULO-1566 Simple unit test to ensure that values work as expected.

        jira-bot ASF subversion and git services added a comment -

        Commit 0d85d60c08f88bc6d3e366b192fba5a371654363 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=0d85d60 ]

        ACCUMULO-1566 Lift out the implicit "3 batch" convention from ScannerIterator
        into Scanner so it can be configured by the user.

        The ScannerIterator previously began pre-fetching the next batch, each time
        the previous one was returned, only once three batches had been returned.
        Clients are able to judge how to control this better, so we should let them.


          People

          • Assignee: Josh Elser (elserj)
          • Reporter: Josh Elser (elserj)
          • Votes: 0
          • Watchers: 2
