Hadoop YARN / YARN-10298

Timeline entity information is stored in only one region when Apache HBase is used as backend storage

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.1
    • Fix Version/s: None
    • Component/s: ATSv2, timelineservice
    • Labels: None
    • Target Version/s:

      Description

      Issue

      Timeline entity information is stored in only one region when Apache HBase is used as the backend storage.

      Probable cause

      We found in the source code that, when the HBase timeline writer stores timeline entity info, the rowKey is composed of clusterId, userId, flowName, flowRunId and appId. Because HBase sorts rows by rowKey in dictionary (lexicographic) order, rowKeys that share the same leading components end up next to each other. As a result, timeline entities are probably stored in only one region or in a few adjacent regions.
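      The effect can be illustrated with a minimal sketch. It is not the ATSv2 code: the "!" separator, the plain-string encoding of flowRunId, the split points and the helper names rowKey and regionFor are all assumptions made for illustration. It only shows that keys sharing a leading clusterId fall into the same key range when regions are split by sorted key.

```java
import java.util.Arrays;
import java.util.List;

class RowKeyLocalityDemo {

  // Hypothetical helper: joins the key components with "!" for illustration.
  // The real row key classes encode their components as bytes.
  static String rowKey(String clusterId, String userId, String flowName,
      long flowRunId, String appId) {
    return String.join("!", clusterId, userId, flowName,
        Long.toString(flowRunId), appId);
  }

  // Returns the index of the region whose key range contains the key.
  // Region i holds keys in [splitPoints[i-1], splitPoints[i]).
  // String.compareTo matches byte order for the ASCII keys used here.
  static int regionFor(String key, List<String> splitPoints) {
    int region = 0;
    for (String split : splitPoints) {
      if (key.compareTo(split) >= 0) {
        region++;
      }
    }
    return region;
  }

  public static void main(String[] args) {
    // Assumed split points that divide the key space into four regions.
    List<String> splits = Arrays.asList("g", "n", "t");
    String[] keys = {
        rowKey("yarn-cluster", "alice", "etl", 1L, "application_1_0001"),
        rowKey("yarn-cluster", "bob", "report", 2L, "application_1_0002"),
        rowKey("yarn-cluster", "carol", "ml", 3L, "application_1_0003"),
    };
    for (String key : keys) {
      System.out.println(key + " -> region " + regionFor(key, splits));
    }
  }
}
```

      Every key in this sketch starts with the same clusterId, so all of them are assigned to the same region regardless of user, flow or application.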

      Related code snippet

      HBaseTimelineWriterImpl.java

       

      public TimelineWriteResponse write(TimelineCollectorContext context,
          TimelineEntities data, UserGroupInformation callerUgi)
          throws IOException {
        ...
        boolean isApplication = ApplicationEntity.isApplicationEntity(te);
        byte[] rowKey;
        if (isApplication) {
          ApplicationRowKey applicationRowKey = new ApplicationRowKey(
              clusterId, userId, flowName, flowRunId, appId);
          rowKey = applicationRowKey.getRowKey();
          store(rowKey, te, flowVersion, Tables.APPLICATION_TABLE);
        } else {
          EntityRowKey entityRowKey = new EntityRowKey(
              clusterId, userId, flowName, flowRunId, appId,
              te.getType(), te.getIdPrefix(), te.getId());
          rowKey = entityRowKey.getRowKey();
          store(rowKey, te, flowVersion, Tables.ENTITY_TABLE);
        }
        if (!isApplication && SubApplicationEntity.isSubApplicationEntity(te)) {
          SubApplicationRowKey subApplicationRowKey = new SubApplicationRowKey(
              subApplicationUser, clusterId, te.getType(), te.getIdPrefix(),
              te.getId(), userId);
          rowKey = subApplicationRowKey.getRowKey();
          store(rowKey, te, flowVersion, Tables.SUBAPPLICATION_TABLE);
        }
        ...
      }
      

       

      Suggestion

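      We could use a hash of the original rowKey as the stored rowKey, for both writing and reading timeline entity data, so that rows are spread across regions instead of clustering by their common prefix.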
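      A minimal sketch of what this could look like follows. It is not a patch against HBaseTimelineWriterImpl, and several choices are assumptions: MD5 as the hash function, the one-byte salt prefix as an alternative, and the names hashedRowKey and saltedRowKey are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class HashedRowKeyDemo {

  // Variant 1 (assumed): replace the row key with its 16-byte MD5 digest.
  static byte[] hashedRowKey(byte[] originalRowKey) {
    try {
      return MessageDigest.getInstance("MD5").digest(originalRowKey);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 digest is not available", e);
    }
  }

  // Variant 2 (assumed): keep the original key, prefixed by a one-byte
  // bucket derived from its hash, so the original bytes stay recoverable.
  static byte[] saltedRowKey(byte[] originalRowKey, int buckets) {
    if (buckets < 1 || buckets > 256) {
      throw new IllegalArgumentException("buckets must be in [1, 256]");
    }
    int bucket = Math.floorMod(Arrays.hashCode(originalRowKey), buckets);
    byte[] salted = new byte[originalRowKey.length + 1];
    salted[0] = (byte) bucket;
    System.arraycopy(originalRowKey, 0, salted, 1, originalRowKey.length);
    return salted;
  }

  public static void main(String[] args) {
    // Illustrative keys only; real keys come from the row key classes.
    String[] keys = {
        "yarn-cluster!alice!etl!1!application_1_0001",
        "yarn-cluster!alice!etl!1!application_1_0002",
        "yarn-cluster!alice!etl!1!application_1_0003",
    };
    for (String key : keys) {
      byte[] original = key.getBytes(StandardCharsets.UTF_8);
      int firstHashByte = hashedRowKey(original)[0] & 0xff;
      int bucket = saltedRowKey(original, 16)[0] & 0xff;
      System.out.println(key + " -> first hash byte " + firstHashByte
          + ", salt bucket " + bucket);
    }
  }
}
```

      Both variants require the reader to apply the same transformation when it looks up a row. A likely trade-off is that scans by key prefix (for example all applications of one flow run) no longer map to one contiguous key range: with a full hash the prefix is lost, and with a salt prefix the reader would have to scan every bucket.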
