  1. Hadoop YARN
  2. YARN-11093

When fs-support-append is false, the timeline server reads event files in arbitrary order


Details

    Description

      In our setup, we use Hive and Tez with tez-ui in offline mode: we copy the ATS event files to another location and start tez-ui there. To keep the event files small, we set fs-support-append = false, so events are written to a new file instead of being appended to an existing one. In this mode, each file name ends with a timestamp suffix.

      At read time, however, we rely on the file system to return the files in the correct order. If it does not, events are added out of order, which leads to events being discarded or to incorrect information.

      A possible fix is to sort the file names by their suffix when append mode is not used.

      Sample file names:

      • summarylog-appattempt_1647348120288_0001_000001_460237
      • entitylog-timelineEntityGroupId_1647348120288_1_dag_1647348120288_0001_1_673147
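
A minimal sketch of the suffix-based ordering proposed above, written as plain Java with no Hadoop dependencies. The class name EventLogFileOrder and its methods are hypothetical and not part of the timeline server code. The sketch assumes that the suffix is the numeric token after the last underscore, as in the sample names, and that the list passed in holds the files of a single log (one summarylog or one entitylog prefix). Names without a numeric suffix are assumed to sort first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, for illustration only.
final class EventLogFileOrder {

    // Returns the numeric suffix after the last '_',
    // or -1 when the name has no numeric suffix (assumption: such files sort first).
    static long suffixOf(String fileName) {
        int idx = fileName.lastIndexOf('_');
        if (idx < 0 || idx == fileName.length() - 1) {
            return -1L;
        }
        try {
            return Long.parseLong(fileName.substring(idx + 1));
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    // Returns a new list ordered by numeric suffix; ties fall back to name order.
    // The input list is left unchanged.
    static List<String> sortBySuffix(List<String> fileNames) {
        Comparator<String> bySuffix = Comparator.comparingLong(EventLogFileOrder::suffixOf);
        Comparator<String> order = bySuffix.thenComparing(Comparator.<String>naturalOrder());
        List<String> sorted = new ArrayList<>(fileNames);
        sorted.sort(order);
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical suffixes on a shortened prefix, listed in the arbitrary
        // order a file system might return them.
        List<String> listed = Arrays.asList(
            "summarylog-appattempt_0001_000001_460237",
            "summarylog-appattempt_0001_000001_9",
            "summarylog-appattempt_0001_000001_1000000");
        for (String name : sortBySuffix(listed)) {
            System.out.println(name);
        }
    }
}
```

The suffix is compared as a number rather than as text because a plain string sort would place a suffix such as 1000000 before 9 when the suffixes differ in length.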

       

            People

              Unassigned
              guptashailesh92@gmail.com shailesh gupta
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h