Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-5976

Refactoring duplicate Tokenizer in flink-test

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.3.0
    • Component/s: Examples
    • Labels:

      Description

      There are some duplicate code like this in flink-test, I think refactor this will be better.
      ```
      public final class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {

      @Override
      public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
      // normalize and split the line
      String[] tokens = value.toLowerCase().split("
      W+");

      // emit the pairs
      for (String token : tokens) {
      if (token.length() > 0)

      { out.collect(new Tuple2<String, Integer>(token, 1)); }

      }
      }
      }

      ```

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user uce commented on the issue:

          https://github.com/apache/flink/pull/3482

          Hey @liuyuzhong7! Thanks for the contribution. A few notes:

          • Could you rebase your branch on top of the master branch. We don't do merge commits. After you've rebased the branch, I will review it (you can force push to your PR branch)
          • Could you rename the commits and this PR to:
            ```
            FLINK-5976 [tests] Deduplicate Tokenizer in tests
            ```
          • Could you unset the `fix version` in the JIRA and set `affects version` to `1.2.0`?
          Show
          githubbot ASF GitHub Bot added a comment - Github user uce commented on the issue: https://github.com/apache/flink/pull/3482 Hey @liuyuzhong7! Thanks for the contribution. A few notes: Could you rebase your branch on top of the master branch. We don't do merge commits. After you've rebased the branch, I will review it (you can force push to your PR branch) Could you rename the commits and this PR to: ``` FLINK-5976 [tests] Deduplicate Tokenizer in tests ``` Could you unset the `fix version` in the JIRA and set `affects version` to `1.2.0`?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user liuyuzhong7 commented on the issue:

          https://github.com/apache/flink/pull/3482

          @uce Thanks for review. OK, I'll do it later.

          Show
          githubbot ASF GitHub Bot added a comment - Github user liuyuzhong7 commented on the issue: https://github.com/apache/flink/pull/3482 @uce Thanks for review. OK, I'll do it later.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user liuyuzhong7 closed the pull request at:

          https://github.com/apache/flink/pull/3482

          Show
          githubbot ASF GitHub Bot added a comment - Github user liuyuzhong7 closed the pull request at: https://github.com/apache/flink/pull/3482
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user liuyuzhong7 opened a pull request:

          https://github.com/apache/flink/pull/3485

          FLINK-5976 [tests] Deduplicate Tokenizer in tests

          There are some duplicate code like this in flink-test, I think refactor this will be better.

          ```
          public final class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {

          @Override
          public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
          // normalize and split the line
          String[] tokens = value.toLowerCase().split("
          W+");

          // emit the pairs
          for (String token : tokens) {
          if (token.length() > 0)

          { out.collect(new Tuple2<String, Integer>(token, 1)); }

          }
          }
          }
          ```

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/liuyuzhong7/flink FLINK-5976

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3485.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3485


          commit db4192110b1b610846f50527a5fc67450d3d1cca
          Author: liuyuzhong7 <liuyuzhong7@gmail.com>
          Date: 2017-03-07T11:15:40Z

          FLINK-5976 [tests] Deduplicate Tokenizer in tests


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user liuyuzhong7 opened a pull request: https://github.com/apache/flink/pull/3485 FLINK-5976 [tests] Deduplicate Tokenizer in tests There are some duplicate code like this in flink-test, I think refactor this will be better. ``` public final class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> { @Override public void flatMap(String value, Collector<Tuple2<String, Integer>> out) { // normalize and split the line String[] tokens = value.toLowerCase().split(" W+"); // emit the pairs for (String token : tokens) { if (token.length() > 0) { out.collect(new Tuple2<String, Integer>(token, 1)); } } } } ``` You can merge this pull request into a Git repository by running: $ git pull https://github.com/liuyuzhong7/flink FLINK-5976 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3485.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3485 commit db4192110b1b610846f50527a5fc67450d3d1cca Author: liuyuzhong7 <liuyuzhong7@gmail.com> Date: 2017-03-07T11:15:40Z FLINK-5976 [tests] Deduplicate Tokenizer in tests
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user liuyuzhong7 commented on the issue:

          https://github.com/apache/flink/pull/3485

          @uce
          1. This pull request has squash in one commit.
          2. JIRA has set affects version to 1.2.0

          Help me to review this code, Thanks.

          Show
          githubbot ASF GitHub Bot added a comment - Github user liuyuzhong7 commented on the issue: https://github.com/apache/flink/pull/3485 @uce 1. This pull request has squash in one commit. 2. JIRA has set affects version to 1.2.0 Help me to review this code, Thanks.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3485#discussion_r104650335

          — Diff: flink-tests/src/test/java/org/apache/flink/test/util/Tokenizer.java —
          @@ -0,0 +1,17 @@
          +package org.apache.flink.test.util;
          — End diff –

          this file is missing the apache license.

          Show
          githubbot ASF GitHub Bot added a comment - Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/3485#discussion_r104650335 — Diff: flink-tests/src/test/java/org/apache/flink/test/util/Tokenizer.java — @@ -0,0 +1,17 @@ +package org.apache.flink.test.util; — End diff – this file is missing the apache license.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user liuyuzhong7 commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3485#discussion_r104826690

          — Diff: flink-tests/src/test/java/org/apache/flink/test/util/Tokenizer.java —
          @@ -0,0 +1,17 @@
          +package org.apache.flink.test.util;
          — End diff –

          @zentol fixed.

          Show
          githubbot ASF GitHub Bot added a comment - Github user liuyuzhong7 commented on a diff in the pull request: https://github.com/apache/flink/pull/3485#discussion_r104826690 — Diff: flink-tests/src/test/java/org/apache/flink/test/util/Tokenizer.java — @@ -0,0 +1,17 @@ +package org.apache.flink.test.util; — End diff – @zentol fixed.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3485

          How about we create a package `org.apache.flink.test.testfunctions` where we collect all shared reusable function implementations?

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3485 How about we create a package `org.apache.flink.test.testfunctions` where we collect all shared reusable function implementations?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3485

          Would be nice to follow up here and collect also functions like "identity mapper" or so in that package.

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3485 Would be nice to follow up here and collect also functions like "identity mapper" or so in that package.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user liuyuzhong7 commented on the issue:

          https://github.com/apache/flink/pull/3485

          @StephanEwen It's a good idea. We can remove many duplicate code in flink-tests in this way.
          But I found it also duplicate in other modules. How to solve it in other modules.
          ![image](https://cloud.githubusercontent.com/assets/24708126/23732447/1f3cdd08-04ae-11e7-866e-a5a0e6d5e242.png)

          Show
          githubbot ASF GitHub Bot added a comment - Github user liuyuzhong7 commented on the issue: https://github.com/apache/flink/pull/3485 @StephanEwen It's a good idea. We can remove many duplicate code in flink-tests in this way. But I found it also duplicate in other modules. How to solve it in other modules. ! [image] ( https://cloud.githubusercontent.com/assets/24708126/23732447/1f3cdd08-04ae-11e7-866e-a5a0e6d5e242.png )
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3485

          I don't think it is necessary to unify all example functions. These are just minimal non critical pieces defined by some tests. One we drop Java 7, most of them will anyways disappear and become Lambdas. So I would not overdo this right now...

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3485 I don't think it is necessary to unify all example functions. These are just minimal non critical pieces defined by some tests. One we drop Java 7, most of them will anyways disappear and become Lambdas. So I would not overdo this right now...
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user liuyuzhong7 commented on the issue:

          https://github.com/apache/flink/pull/3485

          OK, get it. So how about just do it now in flink-tests like this pull request?

          Show
          githubbot ASF GitHub Bot added a comment - Github user liuyuzhong7 commented on the issue: https://github.com/apache/flink/pull/3485 OK, get it. So how about just do it now in flink-tests like this pull request?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user liuyuzhong7 commented on the issue:

          https://github.com/apache/flink/pull/3485

          @StephanEwen
          FlatMapFunction "Tokenizer" has move to org.apache.flink.test.testfunctions.
          Help me to review this. Thanks.
          And collect other reusable functions to org.apache.flink.test.testfunctions later.

          Show
          githubbot ASF GitHub Bot added a comment - Github user liuyuzhong7 commented on the issue: https://github.com/apache/flink/pull/3485 @StephanEwen FlatMapFunction "Tokenizer" has move to org.apache.flink.test.testfunctions. Help me to review this. Thanks. And collect other reusable functions to org.apache.flink.test.testfunctions later.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3485

          Looks good, thanks! Merging this...

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3485 Looks good, thanks! Merging this...
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/3485

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/3485
          Hide
          StephanEwen Stephan Ewen added a comment -

          Fixed in e5e6da0b38495d4030fd0b790a81ca70f9ce84b2

          Show
          StephanEwen Stephan Ewen added a comment - Fixed in e5e6da0b38495d4030fd0b790a81ca70f9ce84b2

            People

            • Assignee:
              Unassigned
              Reporter:
              liuyuzhong7 liuyuzhong7
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development