Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-6393

Add Evenly Graph Generator to Flink Gelly

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: 1.3.0, 1.4.0
    • Component/s: Gelly
    • Labels:
      None

      Description

      Evenly graph means every vertex in the graph has the same degree, so the graph can be treated as evenly due to all the edges in the graph are distributed evenly. when vertex degree is 0, an empty graph will be generated. when vertex degree is vertex count - 1, complete graph will be generated.

        Issue Links

          Activity

          Hide
          fanzhidongyzby FlorianFan added a comment -

          Fabian Hueske
          Hi, can you give me the permission of assigning issue to myself, thanks !

          Show
          fanzhidongyzby FlorianFan added a comment - Fabian Hueske Hi, can you give me the permission of assigning issue to myself, thanks !
          Hide
          fhueske Fabian Hueske added a comment -

          FlorianFan, done!

          Show
          fhueske Fabian Hueske added a comment - FlorianFan , done!
          Hide
          greghogan Greg Hogan added a comment -

          FlorianFan, this looks interesting. How will the edges be assigned in the intermediate cases? If vertices is 10 and degree is 3, does vertex 0 have edges to 1, 2, and 3 or something else? How can this be used?

          Show
          greghogan Greg Hogan added a comment - FlorianFan , this looks interesting. How will the edges be assigned in the intermediate cases? If vertices is 10 and degree is 3, does vertex 0 have edges to 1, 2, and 3 or something else? How can this be used?
          Hide
          fanzhidongyzby FlorianFan added a comment - - edited

          Greg Hogan]
          Yeah, this is really interesting ! The core idea is based on the concept of central symmetry. From the view of any vertex in the graph, the other vertices and edges are the same. So in the intermediate cases, edges are created from one vertex to the opposite vertex(if exists) and vertices on both sides of it.
          The algorithm has been implemented, I'm now writing unit test case of it and will send pull request soon.

          Show
          fanzhidongyzby FlorianFan added a comment - - edited Greg Hogan ] Yeah, this is really interesting ! The core idea is based on the concept of central symmetry. From the view of any vertex in the graph, the other vertices and edges are the same. So in the intermediate cases, edges are created from one vertex to the opposite vertex(if exists) and vertices on both sides of it. The algorithm has been implemented, I'm now writing unit test case of it and will send pull request soon.
          Hide
          fanzhidongyzby FlorianFan added a comment -

          Implement evenly graph algorithm.

          Show
          fanzhidongyzby FlorianFan added a comment - Implement evenly graph algorithm.
          Hide
          fanzhidongyzby FlorianFan added a comment - - edited

          Greg Hogan
          The evenly graph can be used to create testing dataset of graph, which could be needed by specific graph algorithm or theoretical testing. In addition, evenly graph is completely symmetry and beautiful, haha!

          Show
          fanzhidongyzby FlorianFan added a comment - - edited Greg Hogan The evenly graph can be used to create testing dataset of graph, which could be needed by specific graph algorithm or theoretical testing. In addition, evenly graph is completely symmetry and beautiful, haha!
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user fanzhidongyzby opened a pull request:

          https://github.com/apache/flink/pull/3788

          FLINK-6393 [gelly] Add Evenly Graph Generator to Flink Gelly

          Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
          If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
          In addition to going through the list, please provide a meaningful description of your changes.

          • [x] General
          • The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
          • The pull request addresses only one issue
          • Each commit in the PR has a meaningful commit message (including the JIRA id)
          • [ ] Documentation
          • Documentation has been added for new functionality
          • Old documentation affected by the pull request has been updated
          • JavaDoc for public methods has been added
          • [x] Tests & Build
          • Functionality added by the pull request is covered by tests
          • `mvn clean verify` has been executed successfully locally or a Travis build has passed

          Evenly graph means every vertex in the graph has the same degree, so the graph can be treated as evenly due to all the edges in the graph are distributed evenly. when vertex degree is 0, an empty graph will be generated. when vertex degree is vertex count - 1, complete graph will be generated.

          The core idea is based on the concept of central symmetry. From the view of any vertex in the graph, the other vertices and edges are the same. So in the intermediate cases, edges are created from one vertex to the opposite vertex(if exists) and vertices on both sides of it.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/fanzhidongyzby/flink master

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3788.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3788


          commit 8d58ac97da66e1be143ad732173891add3fffa4e
          Author: FlorianFan <fanzhidongyzby@163.com>
          Date: 2017-04-27T12:41:53Z

          FLINK-6393 [gelly] Add Evenly Graph Generator to Flink Gelly


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user fanzhidongyzby opened a pull request: https://github.com/apache/flink/pull/3788 FLINK-6393 [gelly] Add Evenly Graph Generator to Flink Gelly Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide] ( http://flink.apache.org/how-to-contribute.html ). In addition to going through the list, please provide a meaningful description of your changes. [x] General The pull request references the related JIRA issue (" [FLINK-XXX] Jira title text") The pull request addresses only one issue Each commit in the PR has a meaningful commit message (including the JIRA id) [ ] Documentation Documentation has been added for new functionality Old documentation affected by the pull request has been updated JavaDoc for public methods has been added [x] Tests & Build Functionality added by the pull request is covered by tests `mvn clean verify` has been executed successfully locally or a Travis build has passed Evenly graph means every vertex in the graph has the same degree, so the graph can be treated as evenly due to all the edges in the graph are distributed evenly. when vertex degree is 0, an empty graph will be generated. when vertex degree is vertex count - 1, complete graph will be generated. The core idea is based on the concept of central symmetry. From the view of any vertex in the graph, the other vertices and edges are the same. So in the intermediate cases, edges are created from one vertex to the opposite vertex(if exists) and vertices on both sides of it. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fanzhidongyzby/flink master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3788.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3788 commit 8d58ac97da66e1be143ad732173891add3fffa4e Author: FlorianFan <fanzhidongyzby@163.com> Date: 2017-04-27T12:41:53Z FLINK-6393 [gelly] Add Evenly Graph Generator to Flink Gelly
          Hide
          fanzhidongyzby FlorianFan added a comment -

          pull request sent

          Show
          fanzhidongyzby FlorianFan added a comment - pull request sent
          Hide
          fanzhidongyzby FlorianFan added a comment -

          Greg Hogan
          Could this pr be merged into master branch?

          Show
          fanzhidongyzby FlorianFan added a comment - Greg Hogan Could this pr be merged into master branch?
          Hide
          greghogan Greg Hogan added a comment -

          Hi FlorianFan, we'll leave this Jira open until the pull request is merged into the repository.

          I looked for a description of this graph early today and the closest I could find was CirculantGraph, which is a generalization. What algorithms would use the EvenlyGraph?

          Show
          greghogan Greg Hogan added a comment - Hi FlorianFan , we'll leave this Jira open until the pull request is merged into the repository. I looked for a description of this graph early today and the closest I could find was CirculantGraph , which is a generalization. What algorithms would use the EvenlyGraph?
          Hide
          fanzhidongyzby FlorianFan added a comment -

          Hi Greg Hogan

          EvenlyGraph is a general graph generator, which can be used by any graph algorithm which hopes the testing graph's vertices and edges is evenly distributed.

          After reading the detail of CirculantGraph, I found EvenlyGraph is the subset of it, just like EmptyGraph or CompleteGraph is the subset of EvenlyGraph.

          I didn't know about CirculantGraph before, the concept of EvenlyGraph came from the testing dataset which is used by performance testing of graph algorithm in production environment. not specific algorithms use EvenlyGraph, but it's an important type of graph dataset which can help us to find the performance of any algorithm we care about.

          Therefore, I think EvenlyGraph generator can enrich the gelly graph library, and provide a wider scope of graph testing. In addition, EvenlyGraph may be a special case of CirculantGraph, just like EmptyGraph or CompleteGraph, which is simplified and has brief and fast implementation of generating algorithm.

          Show
          fanzhidongyzby FlorianFan added a comment - Hi Greg Hogan , EvenlyGraph is a general graph generator, which can be used by any graph algorithm which hopes the testing graph's vertices and edges is evenly distributed. After reading the detail of CirculantGraph, I found EvenlyGraph is the subset of it, just like EmptyGraph or CompleteGraph is the subset of EvenlyGraph. I didn't know about CirculantGraph before, the concept of EvenlyGraph came from the testing dataset which is used by performance testing of graph algorithm in production environment. not specific algorithms use EvenlyGraph, but it's an important type of graph dataset which can help us to find the performance of any algorithm we care about. Therefore, I think EvenlyGraph generator can enrich the gelly graph library, and provide a wider scope of graph testing. In addition, EvenlyGraph may be a special case of CirculantGraph, just like EmptyGraph or CompleteGraph, which is simplified and has brief and fast implementation of generating algorithm.
          Hide
          fanzhidongyzby FlorianFan added a comment -

          reset issue state

          Show
          fanzhidongyzby FlorianFan added a comment - reset issue state
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user gallenvara commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3788#discussion_r113842976

          — Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/generator/EvenlyGraph.java —
          @@ -0,0 +1,145 @@
          +/*
          + * Licensed to the Apache Software Foundation (ASF) under one
          + * or more contributor license agreements. See the NOTICE file
          + * distributed with this work for additional information
          + * regarding copyright ownership. The ASF licenses this file
          + * to you under the Apache License, Version 2.0 (the
          + * "License"); you may not use this file except in compliance
          + * with the License. You may obtain a copy of the License at
          + *
          + * http://www.apache.org/licenses/LICENSE-2.0
          + *
          + * Unless required by applicable law or agreed to in writing, software
          + * distributed under the License is distributed on an "AS IS" BASIS,
          + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
          + * See the License for the specific language governing permissions and
          + * limitations under the License.
          + */
          +
          +package org.apache.flink.graph.generator;
          +
          +import org.apache.flink.api.common.functions.FlatMapFunction;
          +import org.apache.flink.api.java.DataSet;
          +import org.apache.flink.api.java.ExecutionEnvironment;
          +import org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields;
          +import org.apache.flink.graph.Edge;
          +import org.apache.flink.graph.Graph;
          +import org.apache.flink.graph.Vertex;
          +import org.apache.flink.types.LongValue;
          +import org.apache.flink.types.NullValue;
          +import org.apache.flink.util.Collector;
          +import org.apache.flink.util.LongValueSequenceIterator;
          +import org.apache.flink.util.Preconditions;
          +
          +/**
          + * Evenly graph means every

          {@link Vertex}

          in the

          {@link Graph}

          has the same degree.
          + * when vertex degree is 0,

          {@link EmptyGraph}

          will be generated.
          + * when vertex degree is vertex count - 1,

          {@link CompleteGraph}

          will be generated.
          — End diff –

          `vertex count - 1` is confused. Can you change the description more clearly?

          Show
          githubbot ASF GitHub Bot added a comment - Github user gallenvara commented on a diff in the pull request: https://github.com/apache/flink/pull/3788#discussion_r113842976 — Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/generator/EvenlyGraph.java — @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.graph.generator; + +import org.apache.flink.api.common.functions.FlatMapFunction; +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; +import org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields; +import org.apache.flink.graph.Edge; +import org.apache.flink.graph.Graph; +import org.apache.flink.graph.Vertex; +import org.apache.flink.types.LongValue; +import org.apache.flink.types.NullValue; +import org.apache.flink.util.Collector; +import org.apache.flink.util.LongValueSequenceIterator; +import org.apache.flink.util.Preconditions; + +/** + * Evenly graph means every {@link Vertex} in the {@link Graph} has the same degree. + * when vertex degree is 0, {@link EmptyGraph} will be generated. + * when vertex degree is vertex count - 1, {@link CompleteGraph} will be generated. — End diff – `vertex count - 1` is confused. Can you change the description more clearly?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user gallenvara commented on the issue:

          https://github.com/apache/flink/pull/3788

          @fanzhidongyzby , thanks for the pr. Just a minor comment, mostly looks good to me.

          Show
          githubbot ASF GitHub Bot added a comment - Github user gallenvara commented on the issue: https://github.com/apache/flink/pull/3788 @fanzhidongyzby , thanks for the pr. Just a minor comment, mostly looks good to me.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user fanzhidongyzby commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3788#discussion_r113848435

          — Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/generator/EvenlyGraph.java —
          @@ -0,0 +1,145 @@
          +/*
          + * Licensed to the Apache Software Foundation (ASF) under one
          + * or more contributor license agreements. See the NOTICE file
          + * distributed with this work for additional information
          + * regarding copyright ownership. The ASF licenses this file
          + * to you under the Apache License, Version 2.0 (the
          + * "License"); you may not use this file except in compliance
          + * with the License. You may obtain a copy of the License at
          + *
          + * http://www.apache.org/licenses/LICENSE-2.0
          + *
          + * Unless required by applicable law or agreed to in writing, software
          + * distributed under the License is distributed on an "AS IS" BASIS,
          + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
          + * See the License for the specific language governing permissions and
          + * limitations under the License.
          + */
          +
          +package org.apache.flink.graph.generator;
          +
          +import org.apache.flink.api.common.functions.FlatMapFunction;
          +import org.apache.flink.api.java.DataSet;
          +import org.apache.flink.api.java.ExecutionEnvironment;
          +import org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields;
          +import org.apache.flink.graph.Edge;
          +import org.apache.flink.graph.Graph;
          +import org.apache.flink.graph.Vertex;
          +import org.apache.flink.types.LongValue;
          +import org.apache.flink.types.NullValue;
          +import org.apache.flink.util.Collector;
          +import org.apache.flink.util.LongValueSequenceIterator;
          +import org.apache.flink.util.Preconditions;
          +
          +/**
          + * Evenly graph means every

          {@link Vertex}

          in the

          {@link Graph}

          has the same degree.
          + * when vertex degree is 0,

          {@link EmptyGraph}

          will be generated.
          + * when vertex degree is vertex count - 1,

          {@link CompleteGraph}

          will be generated.
          — End diff –

          "vertex count - 1" means the degree of every vertex when evenly graph is complete graph, is this clear?

          Show
          githubbot ASF GitHub Bot added a comment - Github user fanzhidongyzby commented on a diff in the pull request: https://github.com/apache/flink/pull/3788#discussion_r113848435 — Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/generator/EvenlyGraph.java — @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.graph.generator; + +import org.apache.flink.api.common.functions.FlatMapFunction; +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; +import org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields; +import org.apache.flink.graph.Edge; +import org.apache.flink.graph.Graph; +import org.apache.flink.graph.Vertex; +import org.apache.flink.types.LongValue; +import org.apache.flink.types.NullValue; +import org.apache.flink.util.Collector; +import org.apache.flink.util.LongValueSequenceIterator; +import org.apache.flink.util.Preconditions; + +/** + * Evenly graph means every {@link Vertex} in the {@link Graph} has the same degree. + * when vertex degree is 0, {@link EmptyGraph} will be generated. + * when vertex degree is vertex count - 1, {@link CompleteGraph} will be generated. — End diff – "vertex count - 1" means the degree of every vertex when evenly graph is complete graph, is this clear?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user greghogan commented on the issue:

          https://github.com/apache/flink/pull/3788

          @fanzhidongyzby, what do you think about generalizing this to the `CirculantGraph`? The graph could be initialized with a vertex count and the configured with one or more ranges. Each range would be an offset and a length.

          Show
          githubbot ASF GitHub Bot added a comment - Github user greghogan commented on the issue: https://github.com/apache/flink/pull/3788 @fanzhidongyzby, what do you think about generalizing this to the `CirculantGraph`? The graph could be initialized with a vertex count and the configured with one or more ranges. Each range would be an offset and a length.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user fanzhidongyzby commented on the issue:

          https://github.com/apache/flink/pull/3788

          @greghogan , I'm preparing to implement `CirculantGraph `, and rewrite `EmptyGraph`, `CompleteGraph` and `EvenlyGraph` with `CirculantGraph`, is this ok ?

          Show
          githubbot ASF GitHub Bot added a comment - Github user fanzhidongyzby commented on the issue: https://github.com/apache/flink/pull/3788 @greghogan , I'm preparing to implement `CirculantGraph `, and rewrite `EmptyGraph`, `CompleteGraph` and `EvenlyGraph` with `CirculantGraph`, is this ok ?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user greghogan commented on the issue:

          https://github.com/apache/flink/pull/3788

          @fanzhidongyzby that sounds good except I would leave the `EmptyGraph` as it is. In addition we'll want to add the new generators as driver inputs in `flink-gelly-examples`, add to the list of inputs in `Runner`, and add a test each to `EdgeListITCase` (run the test to compute the checksum to create the test string).

          Also, on the graph name, without a reference to the literature, perhaps this could be something invoking the idea that vertices are connected by length-1 paths to "far" vertices, by length-2 paths to "near" vertices, by length-3 paths to "far" vertices, etc. Something like `EchoGraph`.

          Show
          githubbot ASF GitHub Bot added a comment - Github user greghogan commented on the issue: https://github.com/apache/flink/pull/3788 @fanzhidongyzby that sounds good except I would leave the `EmptyGraph` as it is. In addition we'll want to add the new generators as driver inputs in `flink-gelly-examples`, add to the list of inputs in `Runner`, and add a test each to `EdgeListITCase` (run the test to compute the checksum to create the test string). Also, on the graph name, without a reference to the literature, perhaps this could be something invoking the idea that vertices are connected by length-1 paths to "far" vertices, by length-2 paths to "near" vertices, by length-3 paths to "far" vertices, etc. Something like `EchoGraph`.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user fanzhidongyzby closed the pull request at:

          https://github.com/apache/flink/pull/3788

          Show
          githubbot ASF GitHub Bot added a comment - Github user fanzhidongyzby closed the pull request at: https://github.com/apache/flink/pull/3788
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user fanzhidongyzby commented on the issue:

          https://github.com/apache/flink/pull/3788

          Hi, @greghogan , this pull request was closed unexpectedly by myself, please go to new [pull request](https://github.com/apache/flink/pull/3802) recreated.

          Show
          githubbot ASF GitHub Bot added a comment - Github user fanzhidongyzby commented on the issue: https://github.com/apache/flink/pull/3788 Hi, @greghogan , this pull request was closed unexpectedly by myself, please go to new [pull request] ( https://github.com/apache/flink/pull/3802 ) recreated.
          Hide
          greghogan Greg Hogan added a comment -

          master: 3ee8c69aa4390a8d51b33f262f719fb1a5474d51
          release-1.3: cfaecda3ac9a736213f8bfe643b8f57ce492e243

          Show
          greghogan Greg Hogan added a comment - master: 3ee8c69aa4390a8d51b33f262f719fb1a5474d51 release-1.3: cfaecda3ac9a736213f8bfe643b8f57ce492e243

            People

            • Assignee:
              fanzhidongyzby FlorianFan
              Reporter:
              fanzhidongyzby FlorianFan
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 24h
                24h
                Remaining:
                Remaining Estimate - 24h
                24h
                Logged:
                Time Spent - Not Specified
                Not Specified

                  Development