A user who lands on the history server's application listing page before the history server's background processing has finished replaying the existing event logs may wonder why a particular application is not in the list. The UI gives no indication of the likely reason, or that refreshing the page after some time may show the application the user expects to see. The problem is most noticeable when there are large event logs that take a long time to replay.
The usability problems that large event logs (in number or size) cause for the history server are well known. In particular, SPARK-5522, SPARK-13988 and the issues referenced within them describe the problems and the improvements made so far. To improve history server startup time and reduce the impact of large event log files, the event logs are now processed (replayed) by a pool of threads. This allows a user to browse to the application listing page before the event logs have finished replaying. After history server startup, a user expects old completed applications to appear on the application listing page. But until the corresponding event log has finished replaying, the application won't be in the list, and the user is left wondering why. The UI gives the user no feedback about this, hence this JIRA to address the problem.
The idea is to show the user the number of event logs that are pending replay. Note that, as the replay is currently designed, one cycle of "check for logs that need to be replayed > replay the logs > update application info" must complete before a new one begins. Therefore, it should be possible for the FsApplicationHistoryProvider to report the number of logs that are currently pending processing. This in turn would address the user's anxiety at not seeing the application they expect to see.
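To make the idea concrete, here is a rough sketch of the bookkeeping in plain Java. It is not Spark code: the class PendingReplayTracker, its methods and the message text are all hypothetical names invented for illustration. It assumes that the provider knows how many logs a cycle will replay once the "check" step finishes, and that each replay thread can signal when it finishes a log.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Spark: tracks how many event logs
// from the current cycle are still waiting to be replayed.
final class PendingReplayTracker {
    private final AtomicInteger pending = new AtomicInteger(0);

    // Assumed to be called once per cycle, after the "check for logs" step
    // has determined how many logs need to be replayed.
    void startCycle(int logsToReplay) {
        if (logsToReplay < 0) {
            throw new IllegalArgumentException("logsToReplay must be >= 0");
        }
        pending.set(logsToReplay);
    }

    // Assumed to be called by a replay thread when it finishes one log.
    // The count never drops below zero, even on a spurious extra call.
    void logReplayed() {
        pending.updateAndGet(n -> n > 0 ? n - 1 : 0);
    }

    // Number of logs from the current cycle that are not yet replayed.
    int pendingCount() {
        return pending.get();
    }

    // Text the listing page could show; empty when nothing is pending.
    Optional<String> statusMessage() {
        int n = pending.get();
        if (n == 0) {
            return Optional.empty();
        }
        return Optional.of(n + " event log(s) are still being processed; "
            + "refresh the page later to see more applications.");
    }
}
```

Because one cycle must complete before the next begins, a single counter reset at the start of each cycle should be enough in this sketch. An atomic integer is used because the logs are replayed by a pool of threads.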
I will be attaching a pull request with my initial take on implementing this.