FLINK-6940: Clarify the effect of configuring per-job state backend

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.3.0, 1.4.0
    • Fix Version/s: 1.4.0, 1.3.2
    • Component/s: Documentation
    • Labels:
      None

      Description

      The documentation on the different ways to configure Flink's state backend is confusing. We should add an explicit note explaining that a per-job state backend configured in code overrides any default state backend configured in flink-conf.yaml.
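
      The precedence at issue can be sketched in plain Java. This is a hypothetical model for illustration only, not Flink's implementation: the class `StateBackendPrecedence`, the method `resolve`, and the backend name strings are assumptions, and the `jobmanager` fallback is taken from the review discussion on the pull request, which itself hedges that detail. In a real job the per-job backend is set in code through `StreamExecutionEnvironment.setStateBackend(...)`.

```java
// Hypothetical model of the precedence described in this issue.
// It is NOT Flink's implementation; all names here are illustrative.
final class StateBackendPrecedence {

    // Fallback used when nothing is configured ("jobmanager", per the PR review
    // discussion, which itself hedges this detail).
    static final String BUILT_IN_DEFAULT = "jobmanager";

    /**
     * @param perJobBackend  backend set in the job's code, or null if the job sets none
     * @param clusterDefault default from flink-conf.yaml, or null if none is configured
     * @return the backend the job would effectively use under this model
     */
    static String resolve(String perJobBackend, String clusterDefault) {
        if (perJobBackend != null) {
            return perJobBackend;   // per-job setting overrides any default
        }
        if (clusterDefault != null) {
            return clusterDefault;  // cluster-wide default from flink-conf.yaml
        }
        return BUILT_IN_DEFAULT;    // nothing configured anywhere
    }

    public static void main(String[] args) {
        System.out.println(resolve("rocksdb", "filesystem")); // prints "rocksdb"
        System.out.println(resolve(null, "filesystem"));      // prints "filesystem"
        System.out.println(resolve(null, null));              // prints "jobmanager"
    }
}
```

      The point of the sketch is the ordering of the checks: the cluster default is only consulted when the job itself sets nothing, which is the behavior the documentation should state explicitly.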

          Activity

          Zentol Chesnay Schepler added a comment -

          1.3: c704adf0ddfb9d8196dd0efbb912ce544680082f
          1.4: 8215f2e18b7cc44cf0bef0b560cbc757c2783b72

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/4136

          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on the issue:

          https://github.com/apache/flink/pull/4136

          will merge this.

          githubbot ASF GitHub Bot added a comment -

          Github user bowenli86 commented on the issue:

          https://github.com/apache/flink/pull/4136

          @zentol @alpinegizmo Guys, please let me know your thoughts

          githubbot ASF GitHub Bot added a comment -

          Github user bowenli86 commented on the issue:

          https://github.com/apache/flink/pull/4136

          @zentol @alpinegizmo Let me know your thoughts on it

          githubbot ASF GitHub Bot added a comment -

          Github user bowenli86 commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r126503269

          --- Diff: docs/ops/state_backends.md ---
          @@ -123,8 +123,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

          -State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +State backends can be configured per job in code. In addition, you can define a default state backend in *flink-conf.yaml* that is used when the job does not explicitly define a state backend.
          --- End diff --

          This is probably more readable. I'll update the doc.

          githubbot ASF GitHub Bot added a comment -

          Github user alpinegizmo commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r126431179

          --- Diff: docs/ops/state_backends.md ---
          @@ -123,8 +123,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

          -State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +State backends can be configured per job in code. In addition, you can define a default state backend in *flink-conf.yaml* that is used when the job does not explicitly define a state backend.
          --- End diff --

          @zentol I find that "in code" reads rather awkwardly, and I don't see how it adds any value, since the details of how to do per-job configuration are shown below. Nevertheless, this topic can be a bit confusing, so I would suggest something more like this (assuming I got the details right):

          The default state backend, if you specify nothing, is the jobmanager. If you wish to establish a different default for all jobs on your cluster, you can do so by defining a new default state backend in *flink-conf.yaml*. The default state backend can be overridden on a per-job basis, as shown below.

          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r126401914

          --- Diff: docs/ops/state_backends.md ---
          @@ -123,8 +123,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

          -State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +State backends can be configured per job in code. In addition, you can define a default state backend in *flink-conf.yaml* that is used when the job does not explicitly define a state backend.
          --- End diff --

          Huh... I just noticed the two subsections below that describe how to configure the default and per-job state backends. Aren't we just duplicating details here?

          @alpinegizmo Do you have any input?

          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r126275617

          --- Diff: docs/ops/state_backends.md ---
          @@ -124,7 +124,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

           State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +job does not explicitly define a state backend. Besides, state backend configured per-job will overwrite the default state backend configured in `flink-conf.yaml`
          --- End diff --

          Sorry for the wait; I'll take a look at this PR on Monday.

          githubbot ASF GitHub Bot added a comment -

          Github user bowenli86 commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r126240877

          --- Diff: docs/ops/state_backends.md ---
          @@ -124,7 +124,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

           State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +job does not explicitly define a state backend. Besides, state backend configured per-job will overwrite the default state backend configured in `flink-conf.yaml`
          --- End diff --

          @zentol any suggestions?

          githubbot ASF GitHub Bot added a comment -

          Github user bowenli86 commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r123306543

          --- Diff: docs/ops/state_backends.md ---
          @@ -124,7 +124,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

           State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +job does not explicitly define a state backend. Besides, state backend configured per-job will overwrite the default state backend configured in `flink-conf.yaml`
          --- End diff --

          Yeah, this one works too.

          How about "State backends can be configured per job in code. In addition, you can define a default state backend in flink-conf.yaml that is used when the job does not explicitly define a state backend."?

          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r123228173

          --- Diff: docs/ops/state_backends.md ---
          @@ -124,7 +124,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

           State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +job does not explicitly define a state backend. Besides, state backend configured per-job will overwrite the default state backend configured in `flink-conf.yaml`
          --- End diff --

          Would it be enough to reword the existing docs to:

          "
          State backends can be configured per job. In addition, you can define a default state backend *in `flink-conf.yaml`* to be used when the job does not explicitly define a state backend.
          "

          githubbot ASF GitHub Bot added a comment -

          Github user bowenli86 commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r123089833

          --- Diff: docs/ops/state_backends.md ---
          @@ -124,7 +124,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

           State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +job does not explicitly define a state backend. Besides, state backend configured per-job will overwrite the default state backend configured in `flink-conf.yaml`
          --- End diff --

          I don't believe implying something is enough.

          My teammates and I went through this piece of documentation several times when trying to enable checkpoints, but we couldn't figure out the exact configuration or the relationship among the various config and code keys until we asked on the user mailing list: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/confusing-RocksDBStateBackend-parameters-td13810.html

          Thus I believe the explicit explanation is absolutely necessary.

          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/4136#discussion_r122970989

          --- Diff: docs/ops/state_backends.md ---
          @@ -124,7 +124,7 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp

           ## Configuring a State Backend

           State backends can be configured per job. In addition, you can define a default state backend to be used when the
          -job does not explicitly define a state backend.
          +job does not explicitly define a state backend. Besides, state backend configured per-job will overwrite the default state backend configured in `flink-conf.yaml`
          --- End diff --

          This addition is redundant, as "you can define a default state backend to be used when the job does not explicitly define a state backend" already implies this.

          githubbot ASF GitHub Bot added a comment -

          GitHub user bowenli86 opened a pull request:

          https://github.com/apache/flink/pull/4136

          FLINK-6940[docs] Clarify the effect of configuring per-job state backend

          The documentation on the different ways to configure Flink's state backend is confusing. We should add an explicit note explaining that a per-job state backend configured in code overrides any default state backend configured in flink-conf.yaml.

          Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
          If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
          In addition to going through the list, please provide a meaningful description of your changes.

          • [x] General
            • The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
            • The pull request addresses only one issue
            • Each commit in the PR has a meaningful commit message (including the JIRA id)
          • [x] Documentation
            • Documentation has been added for new functionality
            • Old documentation affected by the pull request has been updated
            • JavaDoc for public methods has been added
          • [x] Tests & Build
            • Functionality added by the pull request is covered by tests
            • `mvn clean verify` has been executed successfully locally or a Travis build has passed

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/bowenli86/flink FLINK-6940

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/4136.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #4136


          commit 6ef758bfc8ef44a1ed6061ce85af849cc3f65c96
          Author: Bowen Li <bowenli86@gmail.com>
          Date: 2017-06-18T07:17:20Z

          FLINK-6940[docs] Clarify the effect of configuring per-job state backend



            People

            • Assignee: Bowen Li (phoenixjiangnan)
            • Reporter: Bowen Li (phoenixjiangnan)
            • Votes: 0
            • Watchers: 3