Accumulo / ACCUMULO-931

Oscillations in Accumulo Ingest Performance

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: 1.4.2
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Linux 2.6.32, single node, 32 cores, 96 GB RAM, 3x3TB SATA drives, RAID5

      Ingest performance into Accumulo varies by 2.5x, depending upon the number of ingestors.

      Performance tests were carried out using Graph500 benchmark (see d4m_api/examples/3Scaling/2ParallelDatabase/pDB10_EdgeInsertTEST.m from http://www.mit.edu/~kepner/D4M/).

      1. ingest_performance_explained.pdf
        145 kB
        Eric Newton
      2. 4ingestor.pdf
        192 kB
        Jeremy Kepner
      3. 3ingestor.pdf
        227 kB
        Jeremy Kepner
      4. 2ingestor.pdf
        198 kB
        Jeremy Kepner
      5. 2,3,4ingestor_1table_4tablet.pdf
        187 kB
        Jeremy Kepner
      6. 1ingestor.pdf
        200 kB
        Jeremy Kepner
      7. 1ingestor_1table_4tablet.pdf
        159 kB
        Jeremy Kepner
      8. 1ingestor_1table_2tablet.pdf
        183 kB
        Jeremy Kepner
      9. 1ingestor_1table_1tablet.pdf
        171 kB
        Jeremy Kepner
      10. 12,r8,r10,r12ingestor_1table_12tablet.pdf
        179 kB
        Jeremy Kepner

        Activity

        Josh Elser added a comment -

        Pre-splitting the table being ingested into mitigated the issue.

        Josh Elser added a comment -

        I believe this makes sense to me given what I understand about the BatchWriter. A flush on the BatchWriter will return control to the caller when the Mutations have been written to the in-memory maps and to the write-ahead log. A minor compaction is not a requirement for a flush to complete.

        Jeremy Kepner added a comment -

        So the phenomenon appears to be that mutations can return control to the ingestor before the minor compactions they cause have completed. Thus a single ingestor can cause multiple simultaneous compactions on the same tablet. If the table only has one tablet, this causes the ingest process to back up (see file 1ingestor_1table_1tablet.pdf). The solution is to presplit the tablet. The files 1ingestor_1table_2tablet.pdf and 1ingestor_1table_4tablet.pdf show that these splits solve the problem. The file 2,3,4ingestor_1table_4tablet.pdf shows the performance of three separate runs using 2, 3, and 4 ingestors into a table with 4 tablets. Likewise, the file 10,r8,r10,r12ingestor_1table_12tablet.pdf shows four separate runs using 10 local, 8 remote, 10 remote, and 12 remote ingestors. In all cases, the splitting resolves the performance issue.

        If there are no objections, this issue can be closed.
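The pre-splitting described in this comment needs a set of split points before ingest starts. The following is a minimal sketch of one way to compute evenly spaced split points. It assumes, purely for illustration, that row keys are zero-padded decimal strings of a fixed width; the class name `SplitPoints` and its parameters are hypothetical and do not come from the benchmark. In the Accumulo client API the resulting keys would be converted to `Text` values and passed to `TableOperations.addSplits(tableName, splits)`.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Sketch: compute evenly spaced split points for pre-splitting a table.
// Assumption (not from the issue): row keys are zero-padded decimal strings,
// so lexicographic order matches numeric order.
final class SplitPoints {

    // Returns numTablets - 1 split points dividing the key range [0, keySpace).
    static SortedSet<String> evenSplits(long keySpace, int numTablets, int width) {
        if (numTablets < 1 || keySpace < numTablets) {
            throw new IllegalArgumentException("need 1 <= numTablets <= keySpace");
        }
        SortedSet<String> splits = new TreeSet<>();
        long step = keySpace / numTablets;
        for (int i = 1; i < numTablets; i++) {
            // Zero-pad so that the string ordering agrees with the numeric ordering.
            splits.add(String.format("%0" + width + "d", step * i));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Four tablets over keys 0000..0999 need three split points.
        System.out.println(evenSplits(1000, 4, 4));
    }
}
```

A table with N tablets needs N - 1 split points, so the 4-tablet runs mentioned above would correspond to three split points under this scheme.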

        Jeremy Kepner added a comment -

        Table is just small enough that automatic table splitting doesn't kick in.

        Eric Newton added a comment -

        Just for my sanity... how did you prevent automatic splitting?

        Jeremy Kepner added a comment -

        The mutations are set to be 500KB. So for this experiment ~6 mutations are being kicked off each second. When ingest slows down, the rate at which the mutations are kicked off also slows down.
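As a rough check on the figures in this comment, the offered load is the mutation size multiplied by the mutation rate. The sketch below works in KB to avoid guessing whether "500KB" means 1000-byte or 1024-byte units; the class and method names are illustrative only.

```java
// Sketch: offered ingest load from mutation size and mutation rate.
final class IngestRate {

    // Kilobytes per second sent by one ingestor: size of each mutation
    // batch (in KB) times the number of batches started per second.
    static double kilobytesPerSecond(double batchKb, double batchesPerSecond) {
        return batchKb * batchesPerSecond;
    }

    public static void main(String[] args) {
        // Figures quoted in the comment: 500 KB mutations, about 6 per second.
        double rate = kilobytesPerSecond(500, 6);
        System.out.println(rate + " KB/s");
    }
}
```

With the quoted figures this comes to roughly 3000 KB/s, or about 3 MB/s per ingestor while ingest is running at full speed.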

        Jeremy Kepner added a comment -

        Based on feedback I have simplified the experiment to 1 ingestor, 1 table, and 1 tablet. In this simpler experiment, it is clear that the drop in ingest performance is correlated with the increase in the number of minor compactions. The table reports that it is on only 1 tablet. If this is to be believed, then multiple tablets aren't the source of multiple compactions. If it is the case that the only other way that multiple compactions can be occurring is if there are multiple ingestors, then that is what must be happening. The benchmark starts a new mutation as soon as control is returned to the main program. If it is possible for control to be returned prior to the mutation completing, then this would be an explanation for how multiple minor compactions could be taking place.
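The correlation described in this comment was read off the attached plots. If the two series were available as per-interval samples, one way to quantify it would be a Pearson correlation coefficient, sketched below. The sample arrays in `main` are placeholders chosen to exercise the function; they are not measurements from this issue.

```java
// Sketch: Pearson correlation between two equally sampled time series,
// e.g. ingest rate and number of running minor compactions per interval.
final class Correlation {

    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        if (varX == 0 || varY == 0) {
            throw new IllegalArgumentException("correlation is undefined for a constant series");
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // Placeholder values only: one series rises while the other falls.
        double[] compactions = {1, 2, 3};
        double[] ingestRate = {3, 2, 1};
        System.out.println(pearson(compactions, ingestRate));
    }
}
```

A value near -1 would support the reading that ingest rate falls as minor compactions increase; a value near 0 would not.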

        Jeremy Kepner added a comment -

        Simplified experiment with 1 ingestor, 1 table, and 1 tablet.

        John Vines added a comment -

        You definitely have tablet splitting occurring. A tablet cannot minor compact more than once at one time, so you shouldn't be seeing multiple minor compactions unless the table split. Or you're doing other ingest while benchmarking, which definitely could have an impact on your performance.

        Jeremy Kepner added a comment -

        The link at the bottom of the comment, http://www.mit.edu/~kepner/D4M/, is the Matlab/GNU-Octave interface to Accumulo (called D4M), which is what was used to write the Graph500 benchmark. FYI, 100% sure this isn't part of the issue as the D4M overhead is minimal.

        Jeremy Kepner added a comment -

        Yes. That was a typo. No tablet splitting is going on. There is no correlation between the ingest performance drops and tablet splitting.

        There isn't a plot of number of Xceivers, but I believe I see an inverse correlation between number of Xceivers and the ingest performance.

        Josh Elser added a comment -

        Also, unrelated to the issue, but the Accumulo anchor at the bottom of the page points to Octave-java

        Josh Elser added a comment -

        Eric attached a document whose reasoning seems sound to me using the graphs you provided.

        So what is the explanation?

        To be sure, "#tables" was a typo? Number of tablets.

        I looked at performance vs. #tables and there is no correlation between the ingest performance drops and table splitting.

        Jeremy Kepner added a comment -

        I looked at performance vs. #tables and there is no correlation between the ingest performance drops and table splitting.

        Jeremy Kepner added a comment -

        So what is the explanation?

        Eric Newton added a comment -

        I've used the graphs you've provided to explain a bit about what is going on with your ingest performance over time.

        Eric Newton added a comment -

        I would graph ingest performance vs # tablets. I'll bet that you find that ingest performance dips when the table splits. If you prevent automatic splitting, or wait long enough for the automatic splitting to complete, you should see ingest performance become steady.

        Jeremy Kepner added a comment -

        Ingestor performance for 1, 2, 3, 4 ingestors.


          People

          • Assignee: Unassigned
          • Reporter: Jeremy Kepner
          • Votes: 0
          • Watchers: 4
