[FLINK-1343] Branching Join Program Deadlocks - ASF JIRA

Attach files

Attach Screenshot

Voters

Watch issue

Watchers

Create sub-task

Link

Clone

Update Comment Author

Replace String in Comment

Update Comment Visibility

Delete Comments

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.8.0, 0.9
Fix Version/s: 0.9
Component/s: Runtime / Coordination
Labels:
None

Description

The following program which gets its data from a single non-parallel data source, branches two times, and joins the branches with two joins, deadlocks.

public class DeadlockProgram {

    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Long> longs = env.generateSequence(0,1000000l).setParallelism(1);
        DataSet<Long> longs2 = env.generateSequence(0, 1000000l).setParallelism(1);

        DataSet<Tuple1<Long>> longT1 = longs.map(new TupleWrapper());
        DataSet<Tuple1<Long>> longT2 = longT1.project(0);
        DataSet<Tuple1<Long>> longT3 = longs.map(new TupleWrapper()); // deadlocks
//        DataSet<Tuple1<Long>> longT3 = longs2.map(new TupleWrapper()); // works


        longT2.join(longT3).where(0).equalTo(0).projectFirst(0)
            .join(longT1).where(0).equalTo(0).projectFirst(0)
            .print();

        env.execute();
    }

    public static class TupleWrapper implements MapFunction<Long, Tuple1<Long>> {
        @Override
        public Tuple1<Long> map(Long l) throws Exception {
            return new Tuple1<Long>(l);
        }
    };
}

If one of the branches reads its data from a second data source (see inline comment) or if the single data source uses the default parallelism, the program executes correctly.