Uploaded image for project: 'Apache Ozone'
  1. Apache Ozone
  2. HDDS-3466

Improve filterViableNodes performance in pipeline creation



    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.5.0
    • None
    • SCM


      Per sodonnell's investigation, pipeline creation may have performance issue once the load-sorting algorithm in HDDS-3139


      This task is to track potential performance bottleneck caused by this sorting operation for pipeline creation in large scale cluster.


      I am a little concerned about the expense of forming the list of healthy nodes on large clusters. We have to do quite a lot of work to form a list and then only use 3 nodes from the list. Even the method currentPipelineCount() needs to do a few map lookups per node to get the current pipeline count. This is the case even before this change. Creating a pipeline on a large cluster would be expensive already, but this change probably makes it worse, due to the sort needed. I know it was me who suggested the sort.

      I think the code as it is will work OK upto about 1000 nodes, and then the performance will drop off as the number of nodes goes toward 10k.

      Eg here are some benchmarks I created using this test code, which is similar to what we are doing in filterViableNodes():


      {{ public List<Object> sortingWithMap(BenchmarkState state) {
      return state.otherList.stream()
      .map(o -> new Mock(o, state.rand.nextInt(20)))
      .filter(o -> o.getSize() <= 20)
      .map(o -> o.getObject())

      The OPs per second for various list sizes are:


      {{Benchmark (listSize) Mode Cnt Score Error Units
      Sorting.sortingWithMap 100 thrpt 3 113948.345 ± 446.426 ops/s
      Sorting.sortingWithMap 1000 thrpt 3 9468.507 ± 894.138 ops/s
      Sorting.sortingWithMap 5000 thrpt 3 1931.612 ± 263.919 ops/s
      Sorting.sortingWithMap 10000 thrpt 3 970.745 ± 25.823 ops/s
      Sorting.sortingWithMap 100000 thrpt 3 87.684 ± 35.438 ops/s}}

      For a 1000 node cluster, with 10 pipelines per node, we would be looking at about 1 second to form all the piplines.

      For a 5k node cluster, it would be about 25 seconds

      For a 10k node cluster it would be 103 seconds, but even here, that would be at close to 1000 pipelines per second.




            Unassigned Unassigned
            timmylicheng Li Cheng
            0 Vote for this issue
            4 Start watching this issue