Flink / FLINK-25318

Improvement of scheduler and execution for Flink OLAP

Details

    Description

      We use Flink to run OLAP queries: we launch a Flink session cluster, submit batch jobs to the cluster as OLAP queries, and fetch the jobs' results. OLAP jobs are generally small queries that finish within seconds or milliseconds, and users typically submit multiple jobs to the session cluster concurrently. We found that the QPS and latency of jobs degrade significantly when tens of jobs are running at once, even when each query processes very little data. We will post benchmark results for the latest version later.
      After discussing this with Xintong Song, and with thanks for his advice, we created this issue to track and manage Flink OLAP related improvements. Other users and developers are welcome to create Flink OLAP related sub-tasks here. Thanks.
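      For anyone who wants to reproduce the kind of measurement described above, here is a minimal sketch of a harness that runs a query callable at several concurrency levels and reports QPS and latency percentiles. It is not the benchmark we will publish. The names `measure`, `percentile`, and `fake_query` are hypothetical, and `fake_query` is only a placeholder: a real run would replace it with code that submits a batch job to the session cluster and fetches the result.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def measure(run_query, num_queries, concurrency):
    """Run `run_query` `num_queries` times using `concurrency` worker threads.

    Returns (qps, latencies), where latencies is a sorted list of
    per-query execution times in seconds (queue wait time is not included).
    """
    def timed_call(_):
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(num_queries)))
    wall = time.perf_counter() - wall_start
    return num_queries / wall, latencies


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, for p in (0, 100]."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def fake_query():
    # Placeholder (assumption): stands in for "submit a job to the
    # session cluster and fetch its result". Replace with a real client call.
    time.sleep(0.01)


if __name__ == "__main__":
    for concurrency in (1, 10, 50):
        qps, lat = measure(fake_query, 200, concurrency)
        print(
            f"concurrency={concurrency} qps={qps:.1f} "
            f"p50={percentile(lat, 50) * 1000:.1f}ms "
            f"p99={percentile(lat, 99) * 1000:.1f}ms"
        )
```

      With a real query in place of the placeholder, comparing the numbers across concurrency levels shows how much throughput and latency change as more jobs run at the same time.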

      Attachments

        Issue Links

        1. Enable TCP connection reuse across multiple jobs. (Sub-task, Closed, Yangze Guo)
        2. Support listen and notify mechanism for PartitionRequest (Sub-task, Open, Shammon)
        3. System classloader memory leak after loading too many codegen classes. (Sub-task, Open, Unassigned)
        4. Improvement of reuse segments for join/agg/sort operators in TaskManager for flink olap queries (Sub-task, Open, Unassigned)
        5. Improvement of execution graph store in flink session cluster for jobs (Sub-task, Closed, Shammon)
        6. Manage and share gateways of taskmanagers between jobs in session cluster (Sub-task, Open, Unassigned)
        7. Improvement of task deployment by enable source split asynchronous enumerate (Sub-task, Open, Unassigned)
        8. Add benchmarks for performance in OLAP scenarios (Sub-task, Open, Unassigned)
        9. Improvement of connection from TM to JM in session cluster (Sub-task, Open, Unassigned)
        10. Add thread dump feature for jobmanager (Sub-task, Closed, Zhanghao Chen)
        11. Too many JM logs in flink session cluster for olap queries (Sub-task, Open, Unassigned)
        12. TaskExecutor always creates local file for task even when local state store is not used (Sub-task, Resolved, Junfan Zhang)
        13. ExecutionGraphInfoStore in session cluster should split failed and successful jobs (Sub-task, Open, Unassigned)
        14. Ignore buffer pools which have no floating buffer in buffer redistributing (Sub-task, Closed, Yangze Guo)
        15. Remove the redundant serialization of RPC invocation at Flink side. (Sub-task, Closed, Yangze Guo)
        16. Memory pages in LazyMemorySegmentPool should be clear after they are released to MemoryManager (Sub-task, Open, Shammon)
        17. Optimize the time of fetching job status in the job submission of session cluster (Sub-task, Closed, Yangze Guo)
        18. Parallelized heavy serialization operations in StreamingJobGraphGenerator (Sub-task, In Progress, Yangze Guo)

        Activity


          People

            Assignee: Unassigned
            Reporter: Shammon (zjureel)

            Dates

              Created:
              Updated:
