
Handle vector creation in HBaseRecordReader to avoid NullableInt vectors later

Details

    Description

      When an HBase query projects both a column family and a column in that column family, the vector for the column is not created in the HBaseRecordReader.

      As a result, when the scan batch is empty, a NullableInt vector is created for this column later. Column vector creation needs to be handled in the reader.
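To make the scenario concrete, the following is a minimal sketch of tracking whole-family projections and single-column projections separately, so that a column such as `f.c1` is still known to the reader even when family `f` is also projected. It is an illustration only: `ProjectionSketch`, `Projection`, `classify`, and `columnsByFamily` are hypothetical names, dotted strings stand in for Drill's schema paths, and this is not the code from the Drill code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only: none of these names come from the Drill code base. */
public class ProjectionSketch {

  /** Whole-family projections and single-column projections, kept apart. */
  static final class Projection {
    final Set<String> completeFamilies = new LinkedHashSet<>();
    final Map<String, List<String>> columnsByFamily = new LinkedHashMap<>();
  }

  /** Splits paths such as "f" (whole family) and "f.c1" (one column). */
  static Projection classify(List<String> projectedPaths) {
    Projection p = new Projection();
    for (String path : projectedPaths) {
      int dot = path.indexOf('.');
      if (dot < 0) {
        // No qualifier: the whole column family is projected.
        p.completeFamilies.add(path);
      } else {
        // Record the column even if its family is also projected as a whole,
        // so a vector can be created for it up front.
        String family = path.substring(0, dot);
        String qualifier = path.substring(dot + 1);
        p.columnsByFamily.computeIfAbsent(family, k -> new ArrayList<>()).add(qualifier);
      }
    }
    return p;
  }

  public static void main(String[] args) {
    Projection p = classify(Arrays.asList("f", "f.c1", "g.c2"));
    System.out.println(p.completeFamilies);  // [f]
    System.out.println(p.columnsByFamily);   // {f=[c1], g=[c2]}
  }
}
```

With both structures available, a reader could create the family containers and the explicitly named column vectors during setup instead of relying on data arriving in the first batch.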

          Activity

            githubbot ASF GitHub Bot added a comment -

            GitHub user prasadns14 opened a pull request:

            https://github.com/apache/drill/pull/1005

            DRILL-5896: Handle HBase columns vector creation in the HBaseRecordReader

            Handle vector creation for projected HBase columns by keeping track of column families and columns separately.
            @paul-rogers please review

            You can merge this pull request into a Git repository by running:

            $ git pull https://github.com/prasadns14/drill DRILL-5896

            Alternatively you can review and apply these changes as the patch at:

            https://github.com/apache/drill/pull/1005.patch

            To close this pull request, make a commit to your master/trunk branch
            with (at least) the following in the commit message:

            This closes #1005


            commit 0f61a2dcd955d30d52103be0454fa45089eee064
            Author: Prasad Nagaraj Subramanya <prasadns14@gmail.com>
            Date: 2017-10-20T21:29:01Z

            DRILL-5896: Handle HBase columns vector creation in the HBaseRecordReader


            githubbot ASF GitHub Bot added a comment -

            Github user paul-rogers commented on a diff in the pull request:

            https://github.com/apache/drill/pull/1005#discussion_r146658771

            --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java ---
            @@ -87,6 +89,7 @@ public HBaseRecordReader(Connection connection, HBaseSubScan.HBaseSubScanSpec su
                 hbaseTableName = TableName.valueOf(
                     Preconditions.checkNotNull(subScanSpec, "HBase reader needs a sub-scan spec").getTableName());
                 hbaseScan = new Scan(subScanSpec.getStartRow(), subScanSpec.getStopRow());
            +    hbaseScan1 = new Scan();
            --- End diff --

            Better name or comment to explain.

            githubbot ASF GitHub Bot added a comment -

            Github user paul-rogers commented on a diff in the pull request:

            https://github.com/apache/drill/pull/1005#discussion_r146659105

            --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java ---
            @@ -121,16 +125,18 @@ public HBaseRecordReader(Connection connection, HBaseSubScan.HBaseSubScanSpec su
                   byte[] family = root.getPath().getBytes();
                   transformed.add(SchemaPath.getSimplePath(root.getPath()));
                   PathSegment child = root.getChild();
            -      if (!completeFamilies.contains(new String(family, StandardCharsets.UTF_8).toLowerCase())) {
            -        if (child != null && child.isNamed()) {
            -          byte[] qualifier = child.getNameSegment().getPath().getBytes();
            +      if (child != null && child.isNamed()) {
            +        byte[] qualifier = child.getNameSegment().getPath().getBytes();
            +        hbaseScan1.addColumn(family, qualifier);
            +        if (!completeFamilies.contains(new String(family, StandardCharsets.UTF_8))) {
            --- End diff --

            Redundant conversion of `family` to `String`, here and below.
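One reading of this remark is that the family name starts out as a `String`, is turned into bytes, and is then decoded back into a `String` for each set lookup. A minimal sketch of keeping both forms side by side, computed once, might look like the following. `FamilyNameSketch`, `FamilyKey`, and `isComplete` are hypothetical names introduced for illustration; this is an assumption about what the reviewer meant, not the change that was merged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Set;

/** Illustration only: none of these names come from the Drill code base. */
public class FamilyNameSketch {

  /** Holds the family name in both forms so neither is recomputed. */
  static final class FamilyKey {
    final String name;   // used for set and map lookups
    final byte[] bytes;  // used where a byte[] family is required

    FamilyKey(String name) {
      this.name = name;
      // An explicit charset avoids depending on the platform default.
      this.bytes = name.getBytes(StandardCharsets.UTF_8);
    }
  }

  /** Looks up the family by its String form; no byte[] round trip is needed. */
  static boolean isComplete(FamilyKey family, Set<String> completeFamilies) {
    return completeFamilies.contains(family.name);
  }

  public static void main(String[] args) {
    FamilyKey family = new FamilyKey("info");
    Set<String> completeFamilies = Collections.singleton("info");
    System.out.println(isComplete(family, completeFamilies));  // true
    System.out.println(family.bytes.length);                   // 4
  }
}
```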

            githubbot ASF GitHub Bot added a comment -

            Github user paul-rogers commented on a diff in the pull request:

            https://github.com/apache/drill/pull/1005#discussion_r146658667

            --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java ---
            @@ -75,6 +75,8 @@

               private TableName hbaseTableName;
               private Scan hbaseScan;
            +  private Scan hbaseScan1;
            +  Set<String> completeFamilies;
            --- End diff --

            `private`?

            githubbot ASF GitHub Bot added a comment -

            Github user paul-rogers commented on a diff in the pull request:

            https://github.com/apache/drill/pull/1005#discussion_r146659717

            --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java ---
            @@ -186,6 +192,10 @@ public void setup(OperatorContext context, OutputMutator output) throws Executio
                     }
                   }
                 }
            +
            +    for (String familyName : completeFamilies) {
            +      getOrCreateFamilyVector(familyName, false);
            +    }
            --- End diff --

            Does this create just the map, or also the vectors within the map? Maybe a comment to explain the goals?

            githubbot ASF GitHub Bot added a comment -

            Github user prasadns14 commented on a diff in the pull request:

            https://github.com/apache/drill/pull/1005#discussion_r147042347

            --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java ---
            @@ -75,6 +75,8 @@

               private TableName hbaseTableName;
               private Scan hbaseScan;
            +  private Scan hbaseScan1;
            +  Set<String> completeFamilies;
            --- End diff --

            Fixed

            githubbot ASF GitHub Bot added a comment -

            Github user prasadns14 commented on a diff in the pull request:

            https://github.com/apache/drill/pull/1005#discussion_r147042356

            --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java ---
            @@ -87,6 +89,7 @@ public HBaseRecordReader(Connection connection, HBaseSubScan.HBaseSubScanSpec su
                 hbaseTableName = TableName.valueOf(
                     Preconditions.checkNotNull(subScanSpec, "HBase reader needs a sub-scan spec").getTableName());
                 hbaseScan = new Scan(subScanSpec.getStartRow(), subScanSpec.getStopRow());
            +    hbaseScan1 = new Scan();
            --- End diff --

            Fixed

            githubbot ASF GitHub Bot added a comment -

            Github user prasadns14 commented on a diff in the pull request:

            https://github.com/apache/drill/pull/1005#discussion_r147042366

            --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java ---
            @@ -121,16 +125,18 @@ public HBaseRecordReader(Connection connection, HBaseSubScan.HBaseSubScanSpec su
                   byte[] family = root.getPath().getBytes();
                   transformed.add(SchemaPath.getSimplePath(root.getPath()));
                   PathSegment child = root.getChild();
            -      if (!completeFamilies.contains(new String(family, StandardCharsets.UTF_8).toLowerCase())) {
            -        if (child != null && child.isNamed()) {
            -          byte[] qualifier = child.getNameSegment().getPath().getBytes();
            +      if (child != null && child.isNamed()) {
            +        byte[] qualifier = child.getNameSegment().getPath().getBytes();
            +        hbaseScan1.addColumn(family, qualifier);
            +        if (!completeFamilies.contains(new String(family, StandardCharsets.UTF_8))) {
            --- End diff --

            Fixed

            githubbot ASF GitHub Bot added a comment -

            Github user prasadns14 commented on a diff in the pull request:

            https://github.com/apache/drill/pull/1005#discussion_r147042924

            --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java ---
            @@ -186,6 +192,10 @@ public void setup(OperatorContext context, OutputMutator output) throws Executio
                     }
                   }
                 }
            +
            +    for (String familyName : completeFamilies) {
            +      getOrCreateFamilyVector(familyName, false);
            +    }
            --- End diff --

            It creates only the map vector
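The distinction raised in this exchange, creating the family-level container without creating any column vectors inside it, can be sketched with plain collections. This is a simplified model under stated assumptions: `FamilyVectorSketch`, `ColumnStub`, `getOrCreateFamily`, and `getOrCreateColumn` are hypothetical names, ordinary maps stand in for Drill's vector classes, and the boolean argument passed to `getOrCreateFamilyVector` in the diff is not modeled because the thread does not explain it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: plain maps stand in for Drill's vector classes. */
public class FamilyVectorSketch {

  /** Hypothetical stand-in for a single column's value vector. */
  static final class ColumnStub {
    final String qualifier;

    ColumnStub(String qualifier) {
      this.qualifier = qualifier;
    }
  }

  // family name -> (qualifier -> column stand-in)
  private final Map<String, Map<String, ColumnStub>> families = new LinkedHashMap<>();

  /** Creates only the family-level container; no child columns are added. */
  Map<String, ColumnStub> getOrCreateFamily(String familyName) {
    return families.computeIfAbsent(familyName, k -> new LinkedHashMap<>());
  }

  /** Creates the family container if needed, then the column inside it. */
  ColumnStub getOrCreateColumn(String familyName, String qualifier) {
    return getOrCreateFamily(familyName).computeIfAbsent(qualifier, ColumnStub::new);
  }

  public static void main(String[] args) {
    FamilyVectorSketch reader = new FamilyVectorSketch();
    reader.getOrCreateFamily("f");
    System.out.println(reader.getOrCreateFamily("f").isEmpty());  // true
    reader.getOrCreateColumn("f", "c1");
    System.out.println(reader.getOrCreateFamily("f").keySet());   // [c1]
  }
}
```

In this model, calling `getOrCreateFamily` for every fully projected family during setup guarantees the container exists even when no rows are returned, while explicitly projected columns still need their own `getOrCreateColumn` call.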

            githubbot ASF GitHub Bot added a comment -

            Github user prasadns14 commented on the issue:

            https://github.com/apache/drill/pull/1005

            @paul-rogers please review the changes

            githubbot ASF GitHub Bot added a comment -

            Github user asfgit closed the pull request at:

            https://github.com/apache/drill/pull/1005


            arina Arina Ielchiieva added a comment -

            Merged into master with commit id dfd43d020498c09dcb2c3fed4e8c6df23d755d55

            People

              prasadns14 Prasad Nagaraj Subramanya
              prasadns14 Prasad Nagaraj Subramanya
              Paul Rogers Paul Rogers