BEAM-7609

SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 2.13.0
    • Fix Version/s: None
    • Component/s: dsl-sql
    • Labels: None

    Description

      The query works in the sqlline shell:

      Welcome to Beam SQL 2.14.0-SNAPSHOT (based on sqlline version 1.4.0)
      0: BeamSQL> CREATE EXTERNAL TABLE s1 (id BIGINT) TYPE 'test';
      No rows affected (0.507 seconds)
      0: BeamSQL> CREATE EXTERNAL TABLE s2 (id BIGINT) TYPE 'test';
      No rows affected (0.004 seconds)
      0: BeamSQL> SELECT DISTINCT s1.id as lhs, s2.id as rhs FROM s1 JOIN s2 USING (id);
      +---------------------+---------------------+
      |         lhs         |         rhs         |
      +---------------------+---------------------+
      +---------------------+---------------------+
      No rows selected (2.568 seconds)
      

      But the equivalent query does not work in a test, where the output schema's field names do not match the aliases:

          Schema inputSchema = Schema.of(
              Schema.Field.of("id", Schema.FieldType.INT32));
      
          PCollection<Row> i1 = p.apply(Create.of(ImmutableList.<Row>of())
              .withCoder(SchemaCoder.of(inputSchema)));
      
          PCollection<Row> i2 = p.apply(Create.of(ImmutableList.<Row>of())
              .withCoder(SchemaCoder.of(inputSchema)));
      
          Schema outputSchema = PCollectionTuple
              .of("i1", i1)
              .and("i2", i2)
              .apply(SqlTransform.query("SELECT DISTINCT i1.id as lhs, i2.id as rhs FROM i1 JOIN i2 USING (id)"))
              .getSchema();
      
          assertEquals(ImmutableList.of("lhs", "rhs"), outputSchema.getFieldNames());
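
      When the assertion above fails, it can help to see which positions differ. The following is a minimal sketch of such a diagnostic, assuming only that the field names are available as a list of strings (as returned by outputSchema.getFieldNames()). The class FieldNameDiff and its diff method are hypothetical helpers, not part of Beam, and the "actual" names in main are made-up placeholders, not the names Beam produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Beam): reports where two field-name lists differ. */
public class FieldNameDiff {

  /** Returns one message per position whose names differ; an empty list means they match. */
  public static List<String> diff(List<String> expected, List<String> actual) {
    List<String> problems = new ArrayList<>();
    int n = Math.max(expected.size(), actual.size());
    for (int i = 0; i < n; i++) {
      // Positions past the end of the shorter list are reported as missing.
      String e = i < expected.size() ? expected.get(i) : "<missing>";
      String a = i < actual.size() ? actual.get(i) : "<missing>";
      if (!e.equals(a)) {
        problems.add("field " + i + ": expected '" + e + "' but was '" + a + "'");
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    List<String> expected = Arrays.asList("lhs", "rhs");
    // Placeholder values for illustration only; in the test this would be
    // outputSchema.getFieldNames().
    List<String> actual = Arrays.asList("lhs", "someOtherName");
    for (String problem : diff(expected, actual)) {
      System.out.println(problem);
    }
  }
}
```

      In the test, FieldNameDiff.diff(ImmutableList.of("lhs", "rhs"), outputSchema.getFieldNames()) could be logged before the assertEquals call to show exactly which names differ.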
      

          People

            Assignee: Unassigned
            Reporter: Gleb Kanterov (kanterov)