Apache Ozone / HDDS-3816 Erasure Coding / HDDS-6422

EC: Fix too many idle threads during reconstruct read.


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: EC-Branch
    • Component/s: None

    Description

      We saw a lot of idle threads during a test for EC reconstruct read:

      On 14 DNs, write 10 files of 10 GB each (EC: 10+4) with 10 threads using ockg:

      ./bin/ozone freon ockg -p test -n 10 -t 10 -s $((10*1024*1024*1024))

      Then kill 4 DNs and use ockv to validate the reads:

      ./bin/ozone freon ockv -p test -n 10 -t 10

      We found that the number of threads for EC reconstruct read grows beyond 1000 as the reads proceed:

       

      1024   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.014      false        false
      1025   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.015      false        false
      1026   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.017      false        false
      1027   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.015      false        false
      1028   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.014      false        false
      1057   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.041      false        false
      1059   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.014      false        false
      1060   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.015      false        false
      1061   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.014      false        false
      1062   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.014      false        false
      1063   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.015      false        false
      1064   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.011      false        false
      1065   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.020      false        false
      1066   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.013      false        false
      1067   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.015      false        false
      1068   ec-reader-for-conID: 4 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.013      false        false
      1069   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.024      false        false
      1070   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.032      false        false
      1071   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.020      false        false
      1072   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.022      false        false
      1073   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.021      false        false
      1074   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.018      false        false
      1075   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.020      false        false
      1076   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.025      false        false
      1077   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.019      false        false
      1078   ec-reader-for-conID: 5 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.027      false        false
      1089   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.018      false        false
      1090   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.019      false        false
      1092   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.020      false        false
      1095   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.016      false        false
      1096   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.016      false        false
      1098   ec-reader-for-conID: 3 locID: 10961100 main                5            WAITING      0.0          0.000         0:0.016      false        false 


      For now, one thread pool of size 10 is created for each block group with EC 10+4, and the pool is only shut down when the input stream for that block group is closed. But as I read the code, the BlockExtendedInputStreams for the blocks are only closed, all together, when the enclosing KeyInputStream is closed (this design may be intended to support seeking backward during a read).
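      As a rough illustration of this lifecycle (a minimal sketch only; EcBlockGroupReader and the hard-coded pool size are illustrative, not the actual Ozone classes):

      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;

      // Sketch of the current per-block-group pattern: every block group
      // reader owns its own pool, and only close() ever shuts it down.
      class EcBlockGroupReader implements AutoCloseable {
        // One pool per block group, e.g. 10 threads for EC 10+4.
        private final ExecutorService readers =
            Executors.newFixedThreadPool(10,
                r -> new Thread(r, "ec-reader-for-conID: ..."));

        void readStripe() {
          // Reads for the surviving replicas are submitted to 'readers'
          // and the results are combined to reconstruct the stripe.
        }

        @Override
        public void close() {
          // Not reached until the enclosing KeyInputStream is closed, so
          // idle pools accumulate across all block groups of a big key.
          readers.shutdownNow();
        }
      }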

      So for a big 10 GB file we end up with a lot of idle threads in WAITING state. This does not scale well for concurrent reconstruct reads, and I think even a key-level thread pool would not scale.

      Instead, we could have a single client-global thread pool for EC reconstruct reads. The number of threads would then stay under control, and the pool size could be made configurable to fit all kinds of loads.
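      One possible shape for such a pool (a sketch only, under the assumption of a lazily created, fixed-size shared executor; the class name is hypothetical and the sizing config key would still need to be defined):

      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;

      // Sketch: one shared pool per client, sized by configuration and
      // reused by every key / block group instead of a pool per group.
      final class EcReconstructReadExecutor {
        private static volatile ExecutorService pool;

        static ExecutorService get(int configuredSize) {
          if (pool == null) {
            synchronized (EcReconstructReadExecutor.class) {
              if (pool == null) {
                pool = Executors.newFixedThreadPool(configuredSize,
                    r -> {
                      Thread t = new Thread(r, "ec-reconstruct-reader");
                      t.setDaemon(true);
                      return t;
                    });
              }
            }
          }
          return pool;
        }
      }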


      People

        Assignee: Mark Gui (markgui)
        Reporter: Mark Gui (markgui)
