Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.23.2
    • Fix Version/s: None
    • Component/s: mrv2
    • Labels: None

      Description

      The Job History web services can put a very large load on the job history server. We should impose a limit on the number of entries that a single web service call can return, and also add the ability to specify a starting location in the list, so that all entries can still be downloaded, just not all at once.
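
      For example, paging through the full list might look something like this (the limit parameter already exists on the jobs endpoint; the start parameter is the proposed addition, not an existing one):

          GET /mapreduce/jobs?limit=50            (entries 0-49)
          GET /mapreduce/jobs?start=50&limit=50   (entries 50-99)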

        Activity

        Bhallamudi Venkata Siva Kamesh added a comment -

        Hi Robert,
        As per my understanding, the /mapreduce/jobs API causes a very large load on the job history server. However, a user can also specify a limit; of course, it can be optional.

        Is the intention of this ticket to return only a limited number of items, say the most recent 50 or 100 out of 20,000, in the absence of a limit factor, or irrespective of the limit factor?

        I think that even when a user specifies a limit of, say, 50, all the jobs from intermediateListCache and jobListCache are still fetched from the JHS, and the list is then iterated over again in HsWebServices to return only the required number of jobs. If so, I think we can move the limit factor into the JHS so that it parses only that many jobs; an API something like the following would be useful:

         public Map<JobId, Job> getNJobs(String queue, String user, long sTime, long sEnd, long fStart, long fEnd, int n) {}
        

        Please provide your comments.

        Robert Joseph Evans added a comment -

        Yes, you are correct about the implementation of the /mapreduce/jobs API up until MAPREDUCE-3944. MAPREDUCE-3944 changed it so that /mapreduce/jobs returns only the equivalent of a PartialJob, not a CompletedJob. This reduces the load a lot, and it is hopefully a temporary measure until more issues can be addressed as part of MAPREDUCE-3973.

        I am not sure of the best way to implement this, because it is something that we want to be able to do consistently across APIs if need be. It will probably amount to keeping the limit that is already in place and adding a start parameter as well. We will probably also need to add fields to the returned list indicating the start and end indexes of the items returned, and possibly another field indicating the total number of objects that could be returned.
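
        For illustration, the returned list might carry an envelope along these lines (a sketch only; the field names startIndex, endIndex, and totalCount are assumptions, not a committed format):

            // Sketch of a paged response wrapper; field names are hypothetical.
            public class JobsPage<T> {
                public int startIndex;          // index of the first item returned
                public int endIndex;            // index of the last item returned
                public int totalCount;          // total number of items available
                public java.util.List<T> jobs;  // the page of items actually returned
            }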

        The maximum limit would be enforced regardless of what limit is set in the URL.
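
        A minimal sketch of such a server-side cap, assuming a hypothetical MAX_LIMIT constant:

            // Enforce a hard server-side maximum no matter what the URL asks for.
            // MAX_LIMIT is an assumed constant for illustration, not an existing key.
            static final int MAX_LIMIT = 1000;

            static int effectiveLimit(Integer requestedLimit) {
                if (requestedLimit == null || requestedLimit <= 0) {
                    return MAX_LIMIT;                        // no usable limit supplied
                }
                return Math.min(requestedLimit, MAX_LIMIT);  // never exceed the cap
            }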

        The real difficulty with this is that we now need to maintain a guaranteed, consistent ordering of the returned values, or the start parameter will be useless.

        Bhallamudi Venkata Siva Kamesh added a comment -

        Hi Robert,
        Thanks for clarifying. Yes, MAPREDUCE-3944 eliminated the JobHistory.getJob(JobId) invocation for each job and replaced CompletedJob with PartialJob.

        As I observed, JobHistory#getAllJobsInternal uses a TreeMap to store each jobId and its corresponding PartialJob. But intermediateListCache and jobListCache are implementations of SortedMap, so their iterators already return jobs from most recent to least recent. We then store these jobs in a TreeMap again, so each put operation takes O(log n) time, whereas the put operation of LinkedHashMap takes O(1) time and guarantees retrieval of the elements in their insertion order. So I think we can replace the TreeMap with a LinkedHashMap. Or is there anything I am missing? A minimal sketch of the substitution is shown below.
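
        A minimal sketch of the substitution being suggested (String stands in for JobId and PartialJob to keep the example self-contained; this is not the actual JobHistory code):

            import java.util.LinkedHashMap;
            import java.util.Map;
            import java.util.SortedMap;

            public class OrderingSketch {
                // Copies jobs out of a cache whose iterator already yields them
                // most recent first. A TreeMap would re-sort on every put at
                // O(log n) apiece; LinkedHashMap keeps the source's iteration
                // order with O(1) puts.
                static Map<String, String> copyJobs(SortedMap<String, String> cache) {
                    Map<String, String> result = new LinkedHashMap<String, String>();
                    for (Map.Entry<String, String> e : cache.entrySet()) {
                        result.put(e.getKey(), e.getValue());
                    }
                    return result;
                }
            }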

        Robert Joseph Evans added a comment -

        Bhallamudi,

        I agree that there are a number of optimizations that can be made, and that we can guarantee ordering if we do them correctly. I am just not sure yet what those optimizations should be. I made a proposal on MAPREDUCE-3973 to possibly change the back end altogether. If you have feedback on it, I would be very grateful to hear it. Depending on how that proposal is received, the way this JIRA is implemented could be very different.


          People

          • Assignee:
            Unassigned
          • Reporter:
            Robert Joseph Evans
          • Votes:
            0
          • Watchers:
            5
