  1. Apache Open Climate Workbench
  2. CLIMATE-248

PERFORMANCE - Rebinning Daily to Monthly datasets takes a really long time

    Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.1-incubating, 0.2-incubating
    • Fix Version/s: 1.3.0
    • Labels:
    • Environment:

      *nix

      Description

      When I was testing the dataset_processor module, I noticed that most tests completed in less than 1 second. Then I came across the "test_daily_to_monthly_rebin" test, which took over 2 minutes to complete.

      The test initially used a 1x1 degree grid covering the globe and a daily time step for 2 years (730 days).
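
For a sense of scale, the size of that test array can be estimated with a few lines of arithmetic. This is only a rough sketch: the (730, 180, 360) shape and the float64 dtype are assumptions, since the issue does not state the exact grid dimensions or data type.

```python
import numpy as np

# Assumed shape: 730 daily steps on a 1x1 degree global grid (180 x 360 cells).
n_times, n_lats, n_lons = 730, 180, 360

# Total number of values in one (time, lat, lon) array.
n_values = n_times * n_lats * n_lons

# Assumed dtype: float64 (8 bytes per value).
bytes_per_array = n_values * np.dtype(np.float64).itemsize

print(n_values)          # 47304000
print(bytes_per_array)   # 378432000, i.e. roughly 378 MB per full-size array
```

Under these assumptions, any operation that allocates or scans a full-size copy of the data touches about 47 million values each time.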

      I ran some initial checks, and the slowdown appears to be in the code where the data is rebinned, in '_rcmes_calc_average_on_new_time_unit_K':

                      mask = np.zeros_like(data)
                      mask[timeunits!=myunit,:,:] = 1.0
                      # Calculate missing data mask within each time unit...
                      datamask_at_this_timeunit = np.zeros_like(data)
                      datamask_at_this_timeunit[:]= process.create_mask_using_threshold(data[timeunits==myunit,:,:],threshold=0.75)
                      # Store results for masking later
                      datamask_store.append(datamask_at_this_timeunit[0])
                      # Calculate means for each pixel in this time unit, ignoring missing data (using masked array).
                      datam = ma.masked_array(data,np.logical_or(mask,datamask_at_this_timeunit))
                      meanstore[i,:,:] = ma.average(datam,axis=0)
      

      That block is suspect since the rest of the code only does simple string parsing and appends to lists. I don't have time for a deep dive into this right now. The code isn't technically broken, just really slow.
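
One possible direction, offered as a sketch rather than a verified fix: the excerpt above builds several full-size arrays (via np.zeros_like(data)) for every time unit and then averages over the whole time axis. Slicing out only the time steps that belong to the current unit would avoid that. The helper name rebin_by_time_unit is hypothetical, and the missing-data rule (mask a pixel when more than `threshold` of its values in the unit are missing) is an assumed reading of process.create_mask_using_threshold, not confirmed against the project's code.

```python
import numpy as np
import numpy.ma as ma


def rebin_by_time_unit(data, timeunits, threshold=0.75):
    """Average data (time, lat, lon) over each distinct value in timeunits.

    Hypothetical helper, not the project's implementation. A pixel is masked
    in the output when more than `threshold` of its values within a time unit
    are missing (an assumed interpretation of the original threshold mask).
    """
    data = ma.asarray(data)
    timeunits = np.asarray(timeunits)
    units = np.unique(timeunits)
    means = ma.zeros((len(units),) + data.shape[1:])

    for i, unit in enumerate(units):
        # Select only this unit's time steps instead of masking the full array.
        chunk = data[timeunits == unit]
        # Fraction of missing values per pixel within this time unit.
        missing_fraction = ma.getmaskarray(chunk).mean(axis=0)
        # Mean over time, ignoring masked (missing) values.
        unit_mean = chunk.mean(axis=0)
        means[i] = ma.masked_where(missing_fraction > threshold, unit_mean)

    return units, means
```

With this approach each iteration works on roughly one month of data rather than the full two years, so the per-unit cost scales with the size of the unit instead of the size of the whole dataset. Whether it reproduces the original results exactly would need to be checked against the existing test.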

        Attachments

        1. test.py
          0.6 kB
          Cameron Goodale
        2. test_monthly_rebin.py
          2 kB
          Alex Goodman
        3. inital_profile.txt
          55 kB
          Cameron Goodale

            People

            • Assignee:
              cgoodale Cameron Goodale
              Reporter:
              cgoodale Cameron Goodale