Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version: 4.8.0

    Description

      Local index design considerations:
      1. Co-location: We need to co-locate the regions of the local index and the data regions. The co-location can be a hard guarantee or a soft (best-effort) guarantee. Co-location is a performance requirement, and may also be needed for consistency (see 2). Hard co-location means that either both the data region and the index region are opened atomically, or neither of them opens for serving.
      2. Index consistency: Ideally we want the index region and data region to have atomic updates. This means they should either (a) use transactions, or (b) share the same WALEdit and also MVCC for visibility. (b) is only applicable if there is a hard co-location guarantee.
      3. Local index clients: How the local index will be accessed from clients. If the local index is managed as a table, the HBase client can be used for doing scans, etc. If the local index is hidden inside the data regions, there has to be a different mechanism to access the index data through the data region.

      With the above considerations, we imagine three possible implementations for the local index solution, each detailed below.

      APPROACH 1: Current approach
      (1) The current approach uses the balancer as a soft guarantee. Because of this, in some rare cases co-location might not happen.
      (2) The index and data regions do not share the same WALEdits, meaning consistency cannot be achieved. There are also two WAL writes per write from the client.
      (3) The regular HBase client can be used to access index data since the index is just another table.

      APPROACH 2: Shadow regions + shared WAL & MVCC
      (1) Introduce a shadow regions concept in HBase. Shadow regions are not assigned by the AssignmentManager (AM). Phoenix implements atomic opening (and split/merge) of data regions and index regions so that hard co-location is guaranteed.
      (2) For the consistency requirements, the index regions and data regions will share the same WALEdit (and thus recovery), and they will also share the same MVCC mechanics so that index updates and data updates are visible atomically.
      (3) The regular HBase client can be used to access index data since the index is just another table.

      APPROACH 3: Storing index data in separate column families in the table.
      (1) Regions will have store files for column families sorted in the primary sort order. Regions may also maintain stores sorted in secondary sort orders. This approach is similar in spirit to how an RDBMS keeps data (a B-TREE in primary sort order and multiple B-TREEs in secondary sort orders with pointers to the primary key). That means storing the index data in separate column families in the data region. This way a region becomes more similar to an RDBMS (but LSM instead of B-Tree). These are sometimes called shadow cf's as well. This approach guarantees hard co-location.
      (2) Since everything is in a single region, the stores automatically share the same WALEdit and MVCC numbers. Atomicity is easily achieved.
      (3) The current Phoenix implementation needs to change so that column family selection in the read/write path is based on the data table vs. index table (logical tables in Phoenix).
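      As a rough illustration of the Approach 3 key scheme (a hypothetical sketch, not Phoenix's actual code; the layout and all names are assumptions), an index row key can be built by prepending the region start key so that index rows sort inside the region's range, and stripped again on the read path:

```java
import java.util.Arrays;

// Illustrative sketch only: index rows live in a "shadow" column family of the
// data region, and each index row key is prefixed with the region start key so
// it sorts within the region's [startKey, endKey) range.
public class LocalIndexKeySketch {

    // Assumed layout for illustration: regionStartKey | indexedValue | dataRowKey
    public static byte[] buildIndexRowKey(byte[] regionStartKey, byte[] indexedValue, byte[] dataRowKey) {
        byte[] key = Arrays.copyOf(regionStartKey,
                regionStartKey.length + indexedValue.length + dataRowKey.length);
        System.arraycopy(indexedValue, 0, key, regionStartKey.length, indexedValue.length);
        System.arraycopy(dataRowKey, 0, key,
                regionStartKey.length + indexedValue.length, dataRowKey.length);
        return key;
    }

    // Scans over the index must strip the region start key before returning rows.
    public static byte[] stripRegionStartKey(byte[] indexRowKey, int startKeyLength) {
        return Arrays.copyOfRange(indexRowKey, startKeyLength, indexRowKey.length);
    }
}
```

      A real implementation would use proper variable-length encodings for the indexed value; the plain concatenation here only shows the ordering idea.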

      I think that APPROACH 3 is the best one for the long term, since it does not require changing anything in HBase; mainly, we don't need to muck around with the split/merge machinery in HBase. It will be a win-win.

      However, APPROACH 2 still needs a “shadow regions” concept to be implemented in HBase itself, and also a way to share WALEdits and MVCC across multiple regions.
      APPROACH 1 is a good start for local indexes, but I think we are not getting the full benefits of the feature. We can support it for the short term, and decide on the next steps for a longer-term implementation. We won't be able to get to implementing it immediately, and want to start a brainstorm.

      Attachments

        1. PHOENI-1734-WIP.patch
          205 kB
          Rajeshbabu Chintaguntla
        2. PHOENIX-1734_v1.patch
          313 kB
          Rajeshbabu Chintaguntla
        3. TestAtomicLocalIndex.java
          6 kB
          Enis Soztutar
        4. PHOENIX-1734_v4.patch
          314 kB
          Rajeshbabu Chintaguntla
        5. PHOENIX-1734_v5.patch
          348 kB
          Rajeshbabu Chintaguntla
        6. PHOENIX-1734_addendum2.patch
          2 kB
          Rajeshbabu Chintaguntla


          Activity

            rajeshbabu Rajeshbabu Chintaguntla added a comment - edited Ping jamestaylor, stack, apurtell, enis, devaraj, lhofhansl, ndimiduk, anoop.hbase, ram_krish.
            enis Enis Soztutar added a comment -

            After discussing this internally a bit, Rajesh and I think that we can get to Approach 3, which is the cleanest approach in our opinion. It will not need a lot of changes in HBase, and we can remove some of the stuff that the current local indexing does (balancer + SplitTransaction changes). However, it may still take some time, so we were thinking that we can support the current approach until we get the longer-term solution in this jira. What do you guys think?


            jamestaylor James R. Taylor added a comment -

            +1. I like Approach #3.

            sergey.b Serhiy Bilousov added a comment -

            rajeshbabu, approach #3 does look like a good candidate, but I have a question:
            1. Does it mean each index I build for the table will end up in a separate column family?
            2. If yes to #1, how will it affect overall HBase performance, since it is recommended to keep the number of CFs to a minimum (my understanding: 3, maybe 5 tops)? Is it something we should still worry about for the latest version of HBase?

            Thank you
            larsh Lars Hofhansl added a comment -

            You mean "However, APPROACH 3 still needs a 'shadow regions'"?

            I like APPROACH 3, if you can make it work. You'd still need to worry about co-locating the index data with the indexed row. Just another CF won't help you here, as it will itself be sorted by the row key, so you would need to invent the same shadow regions (as you say). So the "hard co-location" comes from shadow CFs, not from using a CF to store the index data.
            What you get is the shared WAL.

            For shadow regions... Can we just have unassignable regions? We'd open those when we open the main region, and if the shadow region does not exist, we'd create it (that would take care of distributed index creation). Upon splits we'd just drop the shadow region and then rebuild it automatically.


            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            Does it mean each index I build for the table will end up in separate column family?

            We can store all the index data in a single column family, unless we want to include extra columns from different column families in the index.

            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            lhofhansl

            You mean "However, APPROACH 3 still needs a 'shadow regions'"?

            We don't need it. As in the current approach, we can prefix the region start key to the indexed row key, so indexed rows will fall within the data region's row key range. It should work fine without shadow regions.

            ram_krish ramkrishna.s.vasudevan added a comment -

            Approach 3 looks good to me. But I think we had some discussions related to multiple flushes happening. Here, though, I think you are targeting only one CF that will hold all the indices, with the index name acting as the identifier.
            enis Enis Soztutar added a comment -

            The "shadow column families" in Approach 3 will look like regular CFs to HBase. The difference is that the row keys will be in a different sort order (and not connected to the primary row keys). So, let's say we have CF1 as the primary data column family; we will also have CF1_INDEX_C1 as the shadow. Other than the start key / stop key, the sorting within the region depends only on the keys, and should work equally well with the primary sort order or the secondary sort order. Since even for the secondary sort order the prefix will be the region start key, region internals should work without modification (at least in theory, to be verified). Reads for data CFs and index CFs should be totally isolated, otherwise seek()'ing will not work.

            ndimiduk Nick Dimiduk added a comment -

            Approach #3 above makes sense to me as well. I could see extending it from just a "shadow CF" to a "buddy store" concept managed by a new coprocessor API. This would allow us to do things like embed a Lucene index in the region (as has been discussed many times, and I believe has been implemented in recent versions of Riak).

            stack Michael Stack added a comment -

            Approaches #1 and #2 are not viable; too many changes in HBase core if you are talking 'atomic' open/close and 'consistent' updates across regions.

            Approach #3 is interesting. Every index key will have startkey prefix? So a region whose startkey is:

            PHOENIX-1734/browse/jira/org.apache.issues//:https

            and which has an index entry by reporter:

            Rajeshbabu Chintaguntla

            will be keyed as follows:

            PHOENIX-1734/browse/jira/org.apache.issues//:https.Rajeshbabu Chintaguntla

            All scans of this index will need to first strip start key?

            These keys will be fat.

            Splitting such regions we will have to compensate for the region keyspace being top-heavy.


            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            stack

            Every index key will have startkey prefix?

            Currently, yes. I am thinking we can find the minimal byte array between the start and end key of the region and prefix that; in the worst case we may need to prefix the full region start key.

            All scans of this index will need to first strip start key?

            Yes, we are stripping out the start key.

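            The "minimal byte array between start and end key" idea above can be sketched as follows (a hedged illustration only; this is not Phoenix's or the HFile index code's actual implementation, and all names are assumed). The trick is to bump the first byte where the start and end keys differ, which yields a short prefix that keeps every index row strictly inside the region's key range:

```java
import java.util.Arrays;

// Illustrative sketch of shortening the index-row prefix: find a short byte
// array p such that, for any suffix s, start < p+s < end. Falls back to the
// full region start key when no shorter safe prefix exists.
public class ShortestPrefixSketch {
    public static byte[] shortestSafePrefix(byte[] start, byte[] end) {
        int min = Math.min(start.length, end.length);
        int i = 0;
        while (i < min && start[i] == end[i]) i++; // length of common prefix
        if (i < start.length && i < end.length) {
            int bumped = (start[i] & 0xff) + 1;
            // Bumping the first differing byte of start is safe only if the
            // result still stays strictly below end's byte at that position.
            if (bumped < (end[i] & 0xff)) {
                byte[] p = Arrays.copyOf(start, i + 1);
                p[i] = (byte) bumped;
                return p;
            }
        }
        return start; // worst case: prefix the full region start key
    }
}
```

            This is only the shortening idea; handling empty start keys and exclusive end-key edge cases would need more care in a real implementation.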
            stack Michael Stack added a comment -

            rajeshbabu Sounds good (especially the bit about finding the minimal byte array between start and end key – you've seen how this is done in the hfile index code?). Approach #3 seems viable to me; nice. Thinking more on how the region keyspace will 'bulge out' on the key we center secondary indices on, I don't think it will be a problem. We'll split on a key that falls inside the bulge, but that is fine I'd say (or at least let's run with it for now).


            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            Here is the work-in-progress patch; it maintains the local index data in separate column families in the same table rather than in a different table.
            - The patch is from a slightly older code base and needs a rebase; some test cases are yet to be fixed, mainly explain plans in some of the tests.
            - It removes the HBase internals for split, merge, and the balancer part, which we don't need now. One part I am not able to remove is IndexHalfStoreFileReader (thinking about how to do that).
            - Also thinking about how we can handle upgrades when there are tables with local indexes. Removing them before the upgrade is better than rewriting all the index data. Need to come up with some tools around this.

            anoop.hbase Anoop Sam John added a comment -

            But one part not able to remove is IndexHalfStoreFileReader(thinking how to do that).

            I believe we will need this special half reader whether the CF is in the same table or a different table.

            jamestaylor James R. Taylor added a comment - edited

            Love all that code you've removed (as will stack, I think). Is this the top-level idea?

            • Each data table column family will have a corresponding local index column family formed by prefixing the data table column family with a known prefix. The local index column families will essentially be hidden from Phoenix.
            • When a row is written to the data table, you write a corresponding row into the hidden local index column family, prefixed with the region start key (i.e. the rows have different row keys).
            • Use a custom split policy to ensure that the local index column family does not get split. Instead, you drive the split from the split of any data column family.
            • The "magic" in the write of the local index rows is here:

              private long applyFamilyMapToMemstore(Map<byte[], List<Cell>> familyMap, HRegion region)
                      throws IOException {
                  long size = 0;
                  MultiVersionConsistencyControl.WriteEntry wc = null;
                  try {
                      long mvccNum = MultiVersionConsistencyControl
                              .getPreAssignedWriteNumber(region.getSequenceId());
                      wc = region.getMVCC().beginMemstoreInsertWithSeqNum(mvccNum);

                      for (Map.Entry<byte[], List<Cell>> e : familyMap.entrySet()) {
                          byte[] family = e.getKey();
                          List<Cell> cells = e.getValue();
                          Store store = region.getStore(family);
                          int listSize = cells.size();
                          for (int i = 0; i < listSize; i++) {
                              Cell cell = cells.get(i);
                              CellUtil.setSequenceId(cell, mvccNum);
                              Pair<Long, Cell> ret = store.add(cell);
                              size += ret.getFirst();
                          }
                      }
                  } finally {
                      region.getMVCC().completeMemstoreInsert(wc);
                  }
                  return size;
              }

              Does this make local index writes transactionally consistent with data table writes? Ideally, we'd want Region APIs to cover this too. Ideas, apurtell?

            This looks great, rajeshbabu. Thanks so much for following through on this.

            FYI, tdsilva - some incentive to get your txn merge into master soon.


            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            jamestaylor

            Each data table column family will have a corresponding local index column family formed by prefixing the data table column family with a known prefix. The local index column families will essentially be hidden from Phoenix.

            Yes.

            When a row is written to the data table, you write a corresponding row into the hidden local index column family, prefixed with the region start key (i.e. the rows have different row keys).

            Correct.

            Use a custom split policy to ensure that the local index column family does not get split. Instead, you drive the split from the split of any data column family.

            No. In HBase we have an optimization: if an hfile's max key is less than the split row, or its min key is greater than the split row, we need not create reference files, and so need not run compaction for those hfiles; the file is just moved to the top or bottom daughter region. But in the case of local indexing, nearly all the row keys will be less than the split row irrespective of the actual row keys. So the custom split policy helps create reference files for both the top and bottom daughter regions, so that the data is split based on the actual row key by IndexHalfStoreFileReader. It's the same as the current implementation of splitting index hfiles.
            The hfile splitting happens as below:
            – IndexHalfStoreFileReader goes through all the keyvalues in the hfile.
            – For each keyvalue we get the actual row key; if it's less than the split row it belongs to the top daughter region, otherwise to the bottom daughter region.
            – After the split, compaction of the top daughter region gets only the keyvalues whose actual row key is less than the split row, and the bottom daughter region gets only the keyvalues whose actual row key is greater.
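            The routing rule in the steps above can be sketched as follows (an illustrative assumption, not the actual IndexHalfStoreFileReader code; in particular, a fixed-width indexed value is assumed here for simplicity, whereas the real reader deals with variable-length encodings):

```java
import java.util.Arrays;

// Sketch of daughter-region routing during a split: an index cell is assigned
// by comparing its *actual* data row key, recovered by stripping the region
// start key and indexed-value prefix, against the split row.
public class SplitRoutingSketch {
    public static boolean belongsToTopDaughter(byte[] indexRowKey, int startKeyLen,
                                               int indexedValueLen, byte[] splitRow) {
        byte[] actualRowKey = Arrays.copyOfRange(indexRowKey,
                startKeyLen + indexedValueLen, indexRowKey.length);
        // The top daughter holds rows strictly below the split row.
        return Arrays.compareUnsigned(actualRowKey, splitRow) < 0;
    }
}
```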

            The "magic" is in the write of the local index rows is here:

            That code is dead code; it's too internal, and I have commented out the call to it. HBase doesn't allow writing mutations to the same region in (pre/post)batchMutate coprocessors, so I postponed writing the index updates to post(Put/Delete).


            jamestaylor James R. Taylor added a comment -

            Thanks for the info, rajeshbabu. What do you think of this approach, stack, apurtell, enis, lhofhansl?
            enis Enis Soztutar added a comment -

            What do you think of this approach

            I am of course in favor of approach 3 as above. We have discussed this internally with Rajesh before.

            That code is dead code; it's too internal, and I have commented out the call to it. HBase doesn't allow writing mutations to the same region in (pre/post)batchMutate coprocessors, so I postponed writing the index updates to post(Put/Delete).

            You should be able to add your changes to the WALEdit in prePut() so that you do not have to touch any region internals at all. You can change the Put object and the WALEdit object in the coprocessor, which then gets written to WAL and memstore, etc atomically with the actual data.

            enis Enis Soztutar added a comment -

            Rajesh, is my understanding of this approach correct?
            Let's say that we have a region with start key foo, a column family cf, and data like:

            foo, cf, c1, v4
            foo, cf, c2, v2
            goo, cf, c1, v4
            goo, cf, c2, v3
            hoo, cf, c1, v1
            hoo, cf, c2, v3
            

            If we have a local index on c1, the data for the local index will look like this (using _ as delimiter):

            foo_v1_hoo, L_cf, c1, <null>
            foo_v4_foo, L_cf, c1, <null>
            foo_v4_goo, L_cf, c1, <null>
            

            So, for every insert to c1, we have to do another insert atomically to another cell with a different row (which does not necessarily exist in the table).
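            The worked example above can be reproduced with a small sketch (plain Java strings with _ as the delimiter, purely illustrative; real keys are byte arrays with proper encodings):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// For each data row's c1 value, emit an index row keyed
// regionStartKey_value_dataRowKey; the index CF stores these in the
// secondary (value-first) sort order.
public class LocalIndexDeriveSketch {
    public static List<String> localIndexRows(String regionStartKey, Map<String, String> c1ByRow) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : c1ByRow.entrySet()) {
            rows.add(regionStartKey + "_" + e.getValue() + "_" + e.getKey());
        }
        Collections.sort(rows); // lexicographic order = secondary sort order
        return rows;
    }
}
```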

            Since we want to insert a different row atomically, I think the prePut() will not work. Also the current Indexer / IndexCommitter will not work since it cannot do the atomic writes in line. What we want is to use the already existing atomic mutations in the HRegion itself.

            HRegion.mutateRowsWithLocks() comes to mind as the first candidate, but that would require that all writes from Phoenix are sent to that endpoint. Instead, I think we can still just add the new cells from the local indexes to the Mutation.familyMap(), even though technically the row keys for the local index will be different. If the coprocessor adds the index entries in prePut(), it may work with the current HRegion out of the box with all the atomicity guarantees (to be tested, though).


            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            If the coprocessor does add the index entries in prePut(), it may work with current HRegion out of the box with all the atomicity guarantees (to be tested, though).

            We cannot pass the index entries into the Put/Delete coming in pre(Put|Delete), because the index entries' row keys are different.

            Here is the working patch fixing all the test failures. jamestaylor enis samarthjain ramkrishna.s.vasudevan@gmail.com anoop.hbase Please review if you get a chance.

            rajeshbabu Rajeshbabu Chintaguntla added a comment -
            enis Enis Soztutar added a comment -

            chrajeshbabu32@gmail.com Mind putting the patch to RB?


            Excellent work, rajeshbabu. This is a huge step forward for making local indexes copacetic with HBase internals. For easier code review, would it be possible to spin up a pull request so we can see the changes in context (since it’s rather large)? I'm not a huge fan of RB as it's counterintuitive to use IMHO (but I suppose you could do both if the HBase folks like RB).

            Would be great if you could review too, jesse_yates.

            Here’s some initial feedback:

            Would you mind adding some code comments here? What’s the reason to not write to the WAL here? Also, what’s the reason for the check on the table name? And what’s the reason we need the allowLocalUpdates? Is that because local index writes will always be local?

            -                            if (tableReference.getTableName().startsWith(MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX)) {
            -                                Region indexRegion = IndexUtil.getIndexRegion(env);
            -                                if (indexRegion != null) {
            -                                    throwFailureIfDone();
            -                                    indexRegion.batchMutate(mutations.toArray(new Mutation[mutations.size()]),
            +                            if (env != null
            +                                    && tableReference.getTableName().equals(
            +                                        env.getRegion().getTableDesc().getNameAsString())) {
            +                                throwFailureIfDone();
            +                                
            +                                if(allowLocalUpdates) {
            +                                    for (Mutation m : mutations) {
            +                                        m.setDurability(Durability.SKIP_WAL);
            +                                    }
            +                                    env.getRegion().batchMutate(
            +                                        mutations.toArray(new Mutation[mutations.size()]),
                                                     HConstants.NO_NONCE, HConstants.NO_NONCE);
            -                                    return null;
                                             }
            +                                return null;
            

            Given the above usage of batchMutate(), I’m assuming that means that local index updates are not transactional with the data updates. I suppose we can do this in a follow up JIRA, but wouldn’t it be possible to use the MultiRowMutationService.mutateRows() call that includes both index and data updates to get transactionality? We could have our Indexer.preBatchMutate() call e.bypass() which is a technique I’ve seen used by Tephra to skip the normal processing. Having the index updates be transactional is one of the most important advantages of local indexing IMHO, so it'd be a shame not to have this.

            Instead of having a bunch of if statements to prefix local index column families, let’s just create them like this in the beginning. The other reason to do this is that we don’t want to be creating new byte arrays per row if we don’t have to. We already have a simple algorithm to generate column names when indexes are used. This would be similar for column family names of local indexes. See IndexStatementRewriter.visit(ColumnParseNode node), which is used to translate a query against the data table to a query against an index table (and that method translates column references and adds a CAST to ensure the type stays the same), MetaDataClient.createIndex() (which you hopefully won’t need to change), and in particular IndexUtil.getIndexColumnName(String cf, String cn), which takes the data column family and column name and returns the index column name.

            The code in IndexMaintainer is not getting the correct array from the cell. It’s already been fixed in the 4.x and master branches, so please be careful when re-basing.

            -            ImmutableBytesPtr family = new ImmutableBytesPtr(kv.getRowArray(),kv.getFamilyOffset(),kv.getFamilyLength());
            -            ImmutableBytesPtr qual = new ImmutableBytesPtr(kv.getRowArray(), kv.getQualifierOffset(), kv.getQualifierLength());
                         ImmutableBytesPtr value = new ImmutableBytesPtr(kv.getValueArray(), kv.getValueOffset(), kv.getValueLength());
            -            valueMap.put(new ColumnReference(kv.getRowArray(), kv.getFamilyOffset(), kv.getFamilyLength(), kv.getRowArray(), kv.getQualifierOffset(), kv.getQualifierLength()), value);
            +            valueMap.put(
            +                new ColumnReference(kv.getRowArray(), kv.getFamilyOffset(), kv.getFamilyLength(),
            +                        kv.getRowArray(), kv.getQualifierOffset(), kv.getQualifierLength()), value);
            

            Some minor nits:

            Remove comment in ParallelWriterIndexCommitter as it’s no longer applicable:

                                    // TODO: Once HBASE-11766 is fixed, reexamine whether this is necessary.
                                    // Also, checking the prefix of the table name to determine if this is a local
                                    // index is pretty hacky. If we're going to keep this, we should revisit that
                                    // as well.
            

            Remove commented out code

            -                if (localIndexTable) {
            +                /**if (localIndexTable) {
                                 ensureLocalIndexTableCreated(tableName, tableProps, families, splits, MetaDataUtil.getClientTimeStamp(m));
            -                } else if (!MetaDataUtil.isMultiTenant(m, kvBuilder, ptr)) {
            +                } else*/ if (!localIndexTable && !MetaDataUtil.isMultiTenant(m, kvBuilder, ptr)) {
                                 ensureViewIndexTableCreated(tenantIdBytes.length == 0 ? null : PNameFactory.newName(tenantIdBytes), physicalTableName, MetaDataUtil.getClientTimeStamp(m));
            
            jamestaylor James R. Taylor added a comment -
            enis Enis Soztutar added a comment -

            Given the above usage of batchMutate(), I’m assuming that means that local index updates are not transactional with the data updates. I suppose we can do this in a follow up JIRA, but wouldn’t it be possible to use the MultiRowMutationService.mutateRows() call that includes both index and data updates to get transactionality? We could have our Indexer.preBatchMutate() call e.bypass() which is a technique I’ve seen used by Tephra to skip the normal processing. Having the index updates be transactional is one of the most important advantages of local indexing IMHO, so it'd be a shame not to have this.

            Agreed, that is what I was proposing above: https://issues.apache.org/jira/browse/PHOENIX-1734?focusedCommentId=14986604&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14986604

            I was curious to see how this would work with mutateRowsWithLocks(). The problem is that the writes to the data table arrive at the regular HRegion.doMiniBatchMutation() rather than mutateRowsWithLocks(). However, I was able to write a simple coprocessor test to demonstrate that adding the extra local index mutations coming from the IndexCodec should be possible in preBatchMutate().

            We can do this:

            // Add an index Cell to the Put without the usual same-row check:
                private Put addUnchecked(Cell kv, Put put) {
                  byte[] family = CellUtil.cloneFamily(kv);
                  NavigableMap<byte[], List<Cell>> familyMap = put.getFamilyCellMap();
                  List<Cell> list = familyMap.get(family);
                  if (list == null) {
                    list = new ArrayList<Cell>();
                  }
                  list.add(kv);
                  familyMap.put(family, list);
                  return put;
                }
            

            at preBatchMutate() for every mutation, and it seems to be working:

            keyvalues={a/info:c1/1447717036828/Put/vlen=1/seqid=0}
            keyvalues={a1d/L_info:c1/1447717036828/Put/vlen=0/seqid=0}
            keyvalues={a2c/L_info:c1/1447717036828/Put/vlen=0/seqid=0}
            keyvalues={a3b/L_info:c1/1447717036828/Put/vlen=0/seqid=0}
            keyvalues={a4a/L_info:c1/1447717036828/Put/vlen=0/seqid=0}
            keyvalues={b/info:c1/1447717036828/Put/vlen=1/seqid=0}
            keyvalues={c/info:c1/1447717036828/Put/vlen=1/seqid=0}
            keyvalues={d/info:c1/1447717036828/Put/vlen=1/seqid=0}
            

            Since this is pre-mutate, the changes should have atomic visibility.
            The only fishy part is that a Mutation should normally hold only Cells for the same row, and here we are adding Cells for the local index that belong to different rows. The row locks should be fine, since locking only the original row should be enough for the index updates. I did not do a deep inspection of doMiniBatchMutation() to see whether there are assumptions that might break.

            enis Enis Soztutar added a comment -

            Attaching the simple test code.


            This is great, enis - thanks for putting this together. This would make the logic pretty simple for local indexes: for non-transactional tables, in Indexer.preBatchMutateWithExceptions(), instead of calling doPre(), we could add the KeyValues from indexUpdates to the Mutations in miniBatchOp. Sounds like from your experiment it wouldn't matter which Mutation in miniBatchOp you added the index Mutations to, right? Though if need be, it wouldn't be difficult to add them to the data Mutation they're associated with (the code there is handling the case where the same data row is mutated multiple times at different timestamps, which actually can't happen in Phoenix).

            For transactional tables, a similar thing would be done in PhoenixTransactionalIndexer.preBatchMutate(). Instead of calling this.writer.write(), a local index would add the indexUpdates to the Mutations in miniBatchOp.

            jamestaylor James R. Taylor added a comment -
            enis Enis Soztutar added a comment -

            More findings:

            • In doMiniBatchMutation(), we call getFamilyCellMap() before preBatchMutate(). However, prePut() and preDelete() happen before doMiniBatchMutation() is even called.
                      Map<byte[], List<Cell>> familyMap = mutation.getFamilyCellMap();
              

              Thus, the Indexer should inject at the prePut() level, not the preBatchMutate() level, for local indexes if we are following this path.

            • It seems that the other code paths in doMiniBatchMutation() do not need the row key from the Mutation. Only rowLock and checkRow() need it. Local index updates in theory always land in the current region by definition, so checkRow() is not needed (unless there is a bug). Can we have a case where the row key for an index update is the same as a data row key? If that is possible, the lack of row locks might cause problems.
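            For reference, the containment check that checkRow() performs can be sketched as follows (a simplified, hypothetical stand-in; HBase's real check lives in HRegion and uses Bytes.compareTo):

```java
public class CheckRowSketch {
    // A row belongs to a region if startKey <= row < endKey
    // (an empty endKey means "last region", i.e. no upper bound).
    static boolean rowInRegion(byte[] row, byte[] startKey, byte[] endKey) {
        boolean afterStart = compare(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compare(row, endKey) < 0;
        return afterStart && beforeEnd;
    }

    // Unsigned lexicographic comparison, as HBase row keys are compared.
    static int compare(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] start = "foo".getBytes();
        byte[] end = "zoo".getBytes();
        // A local index key is built from the region start key, so it stays in range.
        System.out.println(rowInRegion("foo_v4_goo".getBytes(), start, end)); // true
        System.out.println(rowInRegion("aaa".getBytes(), start, end));        // false
    }
}
```

            Since local index keys are derived from the region start key, they pass this check by construction, which is why skipping checkRow() for them should be safe.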

            Sounds like from your experiment it wouldn't matter which Mutation in miniBatchOp you added the index Mutations to, right? Though if need be, it wouldn't be difficult to add them to the data Mutation they're associated with

            I think it will be safest to add the index updates to the same row. The doMiniBatchMutation() is not atomic across operations in the batch, and every operation has a different success / exception tracker.

            enis Enis Soztutar added a comment -

            The rest of the patch looks good to my untrained eyes.

            Should this be 4.6?

            +    public static final int NEW_LOCAL_INDEX_IMPLEMENTATION_MIN_VERSION = VersionUtil
            +            .encodeVersion("4.5.0");
            

            In a bunch of places we compute the family name, resulting in a lot of boilerplate code. Maybe extract that to a util. It could also be done in PColumn, so that KVColumnExpression does not have to know about the local index logic.

            This is for later, but I think we can simplify IndexHalfStoreFileReader as well and make it more robust against hbck. The problem is that, in some HBase failure cases, hbck may end up doing merges, plugging region holes, etc., which might leave IndexHalfStoreFileReader confused. I think we can instead inject a regular StoreFileReader (as well as a ref reader) which simply replaces all start prefixes with the region start key if they do not match.

            Great work!


            Thanks jamestaylor,enis for reviews and feedback.
            I will try the approach of adding the index mutations/KeyValues to the corresponding mutations in pre(Put/Delete), and will try to avoid creating new byte arrays for the local index column family prefix by doing it once in IndexStatementRewriter or a util. Once these are handled, I will create a pull request and put it on RB.

            rajeshbabu Rajeshbabu Chintaguntla added a comment -
            enis Enis Soztutar added a comment -

            We were discussing this with aliciashu, and she raised a good concern.

            I think with this patch we are creating 1 shadow CF per table CF, but that CF contains data from multiple local indexes. We are using the column name for filtering out the other indexes' data. However, this means that the data from multiple local indexes is intermixed in the physical layout, making scans for a single local index scan over all the data. Thus a full table scan for a local index still scans over the data for all local indexes. Point index lookups will not be affected as much. I believe the query planner will never create a scan for scanning more than one index's data.

            Should we change the patch to create one CF per table CF per local index? It will be more similar to the one-table per local index setup, and scans for a particular covered index will only scan over the index data.


            The row key for local indexes leads with the index id, so all the data is together - no extra data will be scanned.

            jamestaylor James R. Taylor added a comment -
            enis Enis Soztutar added a comment -

            The row key for local indexes leads with the index id, so all the data is together - no extra data will be scanned.

            I see. I remember the column as keeping the index id; I did not look at the actual row key format. If we also use Scan prefix filters, it should be fine then.
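            To illustrate the point, here is a small sketch using a sorted map as a stand-in for the region's key space (the row keys and prefix format here are hypothetical; in Phoenix the index id follows the region start key in the row key):

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class IndexPrefixScanSketch {
    // A prefix "scan" over sorted keys, analogous to an HBase Scan with a row prefix.
    static SortedMap<String, String> prefixScan(TreeMap<String, String> store, String prefix) {
        // All keys starting with prefix fall in [prefix, prefix + '\uffff')
        return store.subMap(prefix, prefix + '\uffff');
    }

    public static void main(String[] args) {
        TreeMap<String, String> store = new TreeMap<>();
        store.put("idx1_v1_hoo", "");
        store.put("idx1_v4_foo", "");
        store.put("idx2_v2_foo", "");
        store.put("idx2_v3_goo", "");

        // Because the index id leads the row key, each index's rows sort together,
        // so a prefix scan never touches another index's data.
        System.out.println(prefixScan(store, "idx1_").keySet()); // [idx1_v1_hoo, idx1_v4_foo]
    }
}
```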


            rajeshbabu - I see you've posted a new rev of your patch. Is it ready for review? I don't see the changes that Enis and I talked about yet, though.

            jamestaylor James R. Taylor added a comment -
            ddas Devaraj Das added a comment -

            Speaking on behalf of rajeshbabu (since I talked to him this morning on this, and he is probably sleeping), I believe he has accommodated your asks, jamestaylor. He hasn't accommodated enis's asks, since not all mutations can be handled in the way enis proposed. We thought we could take a look at that atomicity aspect later.


            Thanks for the revised patch, rajeshbabu. Would you mind putting together a pull request, as it's difficult to see these changes in context. One minor thing, though: can you use a different, more unusual local index column family prefix, like "L#"?

            jamestaylor James R. Taylor added a comment -
            githubbot ASF GitHub Bot added a comment -

            GitHub user chrajeshbabu opened a pull request:

            https://github.com/apache/phoenix/pull/135

            PHOENIX-1734 Local index improvements

            The patch supports storing local index data in the same data table.
            1) Removed code that used HBase internals in the balancer, split, and merge.
            2) CREATE INDEX creates column families prefixed with L# for the data column families.
            3) Changes in the read and write path to use column families prefixed with L# for local indexes.
            4) Updated the tests.

            You can merge this pull request into a Git repository by running:

            $ git pull https://github.com/chrajeshbabu/phoenix master

            Alternatively you can review and apply these changes as the patch at:

            https://github.com/apache/phoenix/pull/135.patch

            To close this pull request, make a commit to your master/trunk branch
            with (at least) the following in the commit message:

            This closes #135


            commit 4e663a2479adbf3e41826f40c1b2ed6bb69d7634
            Author: Rajeshbabu Chintaguntla <rajeshbabu@apache.org>
            Date: 2015-11-25T16:33:33Z

            PHOENIX-1734 Local index improvements(Rajeshbabu)



            jamestaylor enis - created a pull request (https://github.com/apache/phoenix/pull/135) addressing the feedback. Please review.

            Would you mind adding some code comments here? What’s the reason to not write to the WAL here? Also, what’s the reason for the check on the table name? And what’s the reason we need the allowLocalUpdates? Is that because local index writes will always be local?

            Added the comments.

            Instead of having a bunch of if statements to prefix local index column families, let’s just create them like this in the beginning. The other reason to do this is that we don’t want to be creating new byte arrays per row if don’t have to. We already have a simple algorithm to generate column names when indexes are used. This would be similar for column family names of local indexes. See IndexStatementRewriter.visit(ColumnParseNode node) which is used to translate a query against the data table to a query against an index table (and that method translates column references and adds a CAST to ensure the type stays the same), MetaDataClient.createIndex() (which you hopefully won’t need to change), and in particular IndexUtil.getIndexColumnName(String cf, String cn) which takes the data column family and column name and returns the index column name.

            Removed the bunch of code that prefixed local index column families and instead changed MetaDataClient#createIndex to create local indexes with prefixed column families, so the write path and scans pick up the proper column families without many changes. Added some static methods to get the local index column family from the data column family and vice versa. Also added a map from actual covered columns to local index covered columns in IndexMaintainer, used wherever required.
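The mapping between data and local index column families described above can be sketched as follows. This is a hypothetical illustration assuming the "L#" prefix discussed in this thread; the class and method names here are made up, and the real helpers in Phoenix's IndexUtil may differ:

```java
// Hypothetical sketch of the static helpers described above -- not Phoenix's
// actual IndexUtil code. Assumes "L#" as the local index column family prefix.
public class LocalIndexFamilySketch {
    static final String LOCAL_INDEX_CF_PREFIX = "L#";

    // Data column family -> local index column family, e.g. "CF" -> "L#CF".
    public static String getLocalIndexColumnFamily(String dataColumnFamily) {
        return LOCAL_INDEX_CF_PREFIX + dataColumnFamily;
    }

    // Local index column family -> data column family, e.g. "L#CF" -> "CF".
    public static String getDataColumnFamily(String localIndexColumnFamily) {
        if (!localIndexColumnFamily.startsWith(LOCAL_INDEX_CF_PREFIX)) {
            throw new IllegalArgumentException(
                localIndexColumnFamily + " is not a local index column family");
        }
        return localIndexColumnFamily.substring(LOCAL_INDEX_CF_PREFIX.length());
    }
}
```

Because the mapping is a pure prefix transform, no new byte arrays need to be created per row on the write path, which is the concern raised in the review comment above.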

            The code in IndexMaintainer is not getting the correct array from the cell. It’s already been fixed in the 4.x and master branches, so please be careful when re-basing.

            This part is left unchanged.

            Remove commented out code

            Done.

            can you use a different, more unusual local index column family prefix, like "L#" ?

            Changed it to L#.

            To make both data and local index updates transactional, I was not able to use enis's suggestions because we cannot add deletes of a different row in preDelete, though puts are fine. Will look at this later and follow up.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46085683

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/DeleteIT.java —
            @@ -186,7 +186,9 @@ private void testDeleteRange(boolean autoCommit, boolean createIndex, boolean lo
            PreparedStatement stmt;
            conn.setAutoCommit(autoCommit);
            deleteStmt = "DELETE FROM IntIntKeyTest WHERE i >= ? and i < ?";

            - assertIndexUsed(conn, deleteStmt, Arrays.<Object>asList(5,10), indexName, false);
            + if(!local) {

            — End diff –

            Why isn't the local index used here and below anymore?

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46085722

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinIT.java —
            @@ -873,7 +873,7 @@ public void initTable() throws Exception {
            " SERVER AGGREGATE INTO DISTINCT ROWS BY [\"I.0:NAME\"]\n" +
            "CLIENT MERGE SORT\n" +
            " PARALLEL LEFT-JOIN TABLE 0\n" +

            - " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + "" + JOIN_ITEM_TABLE_DISPLAY_NAME + " [-32768]\n" +
            + " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + JOIN_ITEM_TABLE_DISPLAY_NAME + " [-32768]\n" +

            — End diff –

            Is local index still being used? How will the user easily know this? One way is through the indexId in the scan, but is this not obvious enough?

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46085765

            — Diff: phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/IndexHalfStoreFileReaderGenerator.java —
            @@ -80,6 +81,9 @@ public Reader preStoreFileReaderOpen(ObserverContext<RegionCoprocessorEnvironmen
            HRegionInfo childRegion = region.getRegionInfo();
            byte[] splitKey = null;
            if (reader == null && r != null) {
            + if(!p.toString().contains(QueryConstants.LOCAL_INDEX_COLUMN_FAMILY_PREFIX)) {
            +     return super.preStoreFileReaderOpen(ctx, fs, p, in, size, cacheConf, r, reader);
            + }

            — End diff –

            Does the usage of this IndexHalfStoreFileReader go through limited private HBase APIs or are we still using private APIs here?

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46085787

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java —
            @@ -134,7 +143,36 @@ private static void setValues(byte[][] values, int[] pkSlotIndex, int[] columnIn
            }
            }
            ImmutableBytesPtr ptr = new ImmutableBytesPtr();

            - table.newKey(ptr, pkValues);
            + if(table.getIndexType()==IndexType.LOCAL) {

            — End diff –

            Why this new code? Are we still generating local index updates on the server side?

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46085866

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java —
            @@ -158,6 +162,10 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
            throws IOException {
            s = super.preScannerOpen(e, scan, s);
            if (ScanUtil.isAnalyzeTable(scan)) {
            + if (!ScanUtil.isLocalIndex(scan)) {
            +     scan.getFamilyMap().clear();
            + }
            +

            — End diff –

            Why this change?

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46096275

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/DeleteIT.java —
            @@ -186,7 +186,9 @@ private void testDeleteRange(boolean autoCommit, boolean createIndex, boolean lo
            PreparedStatement stmt;
            conn.setAutoCommit(autoCommit);
            deleteStmt = "DELETE FROM IntIntKeyTest WHERE i >= ? and i < ?";

            - assertIndexUsed(conn, deleteStmt, Arrays.<Object>asList(5,10), indexName, false);
            + if(!local) {

            — End diff –

            It will be used. For local indexes the explain plan now always says "SCAN OVER DATATABLENAME", so the assertion fails; that's why the check is skipped for the local index case. I will modify it to make a proper assertion for local indexes, e.g. that the plan includes the index id as well.

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46096395

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinIT.java —
            @@ -873,7 +873,7 @@ public void initTable() throws Exception {
            " SERVER AGGREGATE INTO DISTINCT ROWS BY [\"I.0:NAME\"]\n" +
            "CLIENT MERGE SORT\n" +
            " PARALLEL LEFT-JOIN TABLE 0\n" +

            - " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + "" + JOIN_ITEM_TABLE_DISPLAY_NAME + " [-32768]\n" +
            + " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + JOIN_ITEM_TABLE_DISPLAY_NAME + " [-32768]\n" +

            — End diff –

            Yes, it's using local indexes, but it's confusing because we now have the same table name; the only difference is the index id.
            What do you think of changing it to the logical name in ExplainTable?
            buf.append("OVER " + tableRef.getTable().getPhysicalName().getString());

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46096413

            — Diff: phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/IndexHalfStoreFileReaderGenerator.java —
            @@ -80,6 +81,9 @@ public Reader preStoreFileReaderOpen(ObserverContext<RegionCoprocessorEnvironmen
            HRegionInfo childRegion = region.getRegionInfo();
            byte[] splitKey = null;
            if (reader == null && r != null) {
            + if(!p.toString().contains(QueryConstants.LOCAL_INDEX_COLUMN_FAMILY_PREFIX)) {
            +     return super.preStoreFileReaderOpen(ctx, fs, p, in, size, cacheConf, r, reader);
            + }

            — End diff –

            We are still using private APIs here. I will file an improvement task for this and try to switch it to use the Scanner APIs.

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46096442

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java —
            @@ -134,7 +143,36 @@ private static void setValues(byte[][] values, int[] pkSlotIndex, int[] columnIn
            }
            }
            ImmutableBytesPtr ptr = new ImmutableBytesPtr();

            - table.newKey(ptr, pkValues);
            + if(table.getIndexType()==IndexType.LOCAL) {

            — End diff –

            We are preparing index updates on the server side only, but I added this to prepare proper local index mutations in case we ever run an upsert into the index table directly.

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46096498

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java —
            @@ -158,6 +162,10 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
            throws IOException {
            s = super.preScannerOpen(e, scan, s);
            if (ScanUtil.isAnalyzeTable(scan)) {
            + if (!ScanUtil.isLocalIndex(scan)) {
            +     scan.getFamilyMap().clear();
            + }
            +

            — End diff –

            When we run stats on the data table, we select only the data column families. This change makes us include all column families in stats preparation, so stats for the data table and local index tables are prepared at the same time.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46096509

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinIT.java —
            @@ -873,7 +873,7 @@ public void initTable() throws Exception {
            " SERVER AGGREGATE INTO DISTINCT ROWS BY [\"I.0:NAME\"]\n" +
            "CLIENT MERGE SORT\n" +
            " PARALLEL LEFT-JOIN TABLE 0\n" +

            - " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + "" + JOIN_ITEM_TABLE_DISPLAY_NAME + " [-32768]\n" +
            + " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + JOIN_ITEM_TABLE_DISPLAY_NAME + " [-32768]\n" +

            — End diff –

            I think using the physical name makes sense, as we're trying to show more-or-less what's going on at the HBase level. How about adding "LOCAL INDEX" like this:
            OVER my_table LOCAL INDEX
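The naming proposal above could render roughly like this (a minimal sketch with a hypothetical helper name; ExplainTable's real code builds the plan string differently):

```java
// Hypothetical illustration of the explain-plan naming proposal: show the
// data table's physical name plus a "LOCAL INDEX" marker so local index
// scans are distinguishable even though they share the data table's name.
public class ExplainTableNameSketch {
    public static String scanTarget(String physicalTableName, boolean isLocalIndex) {
        return "OVER " + physicalTableName + (isLocalIndex ? " LOCAL INDEX" : "");
    }
}
```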

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on the pull request:

            https://github.com/apache/phoenix/pull/135#issuecomment-160454376

            Thanks for review @JamesRTaylor. Please find my answers to the comments.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46096617

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java —
            @@ -158,6 +162,10 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
            throws IOException {
            s = super.preScannerOpen(e, scan, s);
            if (ScanUtil.isAnalyzeTable(scan)) {
            + if (!ScanUtil.isLocalIndex(scan)) {
            +     scan.getFamilyMap().clear();
            + }
            +

            — End diff –

            That's a good idea, but how about we drive this from the client side instead? There's code in MetaDataClient.updateStatistics() that was adding the local index physical table (if stats on indexes are being updated). How about changing that to add the local index shadow column families?
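The client-driven alternative suggested here could be sketched as follows. This is a hypothetical illustration only (the class and method names are made up; the real change would live in MetaDataClient.updateStatistics and operate on an HBase Scan): for each data column family on the table, the ANALYZE request would also name its "L#"-prefixed shadow family, so one pass prepares stats for both the data and the local index.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration of driving stats collection from the client:
// for every data column family, also request its local index shadow family,
// so one ANALYZE pass covers both data and local index stats.
public class StatsScanFamiliesSketch {
    static final String LOCAL_INDEX_CF_PREFIX = "L#";

    public static Set<String> familiesForStatsScan(Set<String> dataFamilies) {
        Set<String> families = new LinkedHashSet<>();
        for (String cf : dataFamilies) {
            families.add(cf);                          // data table stats
            families.add(LOCAL_INDEX_CF_PREFIX + cf);  // shadow family stats
        }
        return families;
    }
}
```

Driving this from the client keeps the server-side preScannerOpen hook free of special cases, which is the design point of the comment above.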

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46096704

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java —
            @@ -134,7 +143,36 @@ private static void setValues(byte[][] values, int[] pkSlotIndex, int[] columnIn
            }
            }
            ImmutableBytesPtr ptr = new ImmutableBytesPtr();

            - table.newKey(ptr, pkValues);
            + if(table.getIndexType()==IndexType.LOCAL) {

            — End diff –

            I don't think we should support updating index tables directly, so I don't think we should include this code. We've talked about having a kind of "scrutiny" process that scans the data table and ensures that all the corresponding indexes have the correct rows (I filed PHOENIX-2460 for this). I suspect this would be done through our coprocessors, though, not through this client API. WDYT?

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46111214

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java —
            @@ -158,6 +162,10 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
            throws IOException {
            s = super.preScannerOpen(e, scan, s);
            if (ScanUtil.isAnalyzeTable(scan)) {
            + if (!ScanUtil.isLocalIndex(scan)) {
            +     scan.getFamilyMap().clear();
            + }
            +

            — End diff –

            How about changing that to add the local index shadow column families?

            That's better, James. Will do in the next patch.

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/135#discussion_r46111265

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java —
            @@ -134,7 +143,36 @@ private static void setValues(byte[][] values, int[] pkSlotIndex, int[] columnIn
            }
            }
            ImmutableBytesPtr ptr = new ImmutableBytesPtr();

            - table.newKey(ptr, pkValues);
            + if(table.getIndexType()==IndexType.LOCAL) {

            — End diff –

            so I don't think we should include this code.

            Will remove in the next patch.

            We've talked about having a kind of "scrutiny" process that scans the data table and ensures that all the corresponding indexes have the correct rows (I filed PHOENIX-2460 for this).

            +1 on this.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on the pull request:

            https://github.com/apache/phoenix/pull/135#issuecomment-160852616

            Couple of other comments:

            • You'll need to add code during installation of 4.7.0 (in ConnectionQueryServicesImpl.init()) that takes care of removing old coprocessors, adding new coprocessors, disabling local indexes and potentially automatically kicking off a new index build
            • We'll likely want to add a check that disallows column family names prefixed with "C#" to prevent inadvertently treating these as local index shadow column families. If we can get rid of the checks that use this prefix, then we might not need to do this.
            • Also, there's an edge case of an existing column family using the "C#" prefix. Unlikely, but not sure how we'd want to handle this. Maybe error on an attempt to create a local index on such a table?
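The column family name check suggested above might look like the following. This is a hypothetical sketch only (the class name, the prefix constant, and the error handling are illustrative; Phoenix would throw its own SQLException type), shown here with the "L#" prefix used elsewhere in this thread:

```java
// Hypothetical sketch of validating user-supplied column family names so
// they cannot collide with the reserved local index shadow family prefix.
public class ColumnFamilyNameCheckSketch {
    static final String RESERVED_LOCAL_INDEX_PREFIX = "L#";

    public static void validateColumnFamilyName(String family) {
        if (family.startsWith(RESERVED_LOCAL_INDEX_PREFIX)) {
            throw new IllegalArgumentException("Column family '" + family
                + "' uses the reserved local index prefix "
                + RESERVED_LOCAL_INDEX_PREFIX);
        }
    }
}
```

The same check could be run when creating a local index on a pre-existing table, covering the edge case mentioned in the last bullet above.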
            rajeshbabu Rajeshbabu Chintaguntla added a comment - - edited

            The patch handles the review comments and fixes a couple of issues in splits and merges. Will update the pull request with this patch.


            Thanks for the patch, rajeshbabu. It's coming along nicely. Here's some feedback:

            • The way we're handling NotServingRegionException needs to be improved. Maybe you can walk us through that, but I think there are several issues:
              • We can't assume the structure (i.e. parent/child iterator) when we detect this situation as we're doing here and in ScanUtil.getNewIterator. That's basically hard coding the query plan. We also wouldn't want to have to insert this logic in every ResultIterator implementation - it needs to be centralized IMO.
                 @Override
                 public Tuple next() throws SQLException {
                     Tuple next = null;
                     try {
                         next = nextInternal();
                     } catch (Exception e) {
                         try { // Rethrow as SQLException
                             throw ServerUtil.parseServerException(e);
                         } catch (StaleRegionBoundaryCacheException e2) {
                             if (scan != null && ScanUtil.isLocalIndex(scan)) {
                                 TableResultIterator tableResultIterator = ((TableResultIterator) getIterator());
                                 tableResultIterator.close();
                                 setIterator(ScanUtil.getNewIterator(tableResultIterator, this.scan,
                                     this.next == UNINITIALIZED ? null : this.next));
                                 this.next = UNINITIALIZED;
                                 next = nextInternal();
                             } else {
                                 throw e;
                             }
                         }
                     }
                     return next;
                 }
                
              • What would happen if this exception occurred during an aggregation? We have partial state already. Is that factored in correctly?
              • Perhaps a more feasible approach would be to detect this up front when the scan starts by sending the expected end region boundary? That way the existing mechanism would handle it (isolated to BaseParallelIterators).
            • I don't see any upgrade code in ConnectionQueryServicesImpl to handle existing tables with local indexes. We'll need to handle that smoothly.
            • The query plan is not at all clear that a local index is being used:
              WAY RANGE SCAN OVER my_table [-32768, 'foo']
              

              We should change it to make that clear, as well as offsetting the index id by adding {{-Short.MAX_VALUE+1}}, which I realize isn't related to your change, but it would improve readability, and since you're changing all the query plans anyway, we might as well do that now too. Something like this:

              WAY RANGE SCAN OVER LOCAL INDEX ON my_table [1, 'foo']
              

            Given that we're ready for a 4.7.0 release now, I think it makes sense to push this to a 4.8.0 release. We could follow up with that as soon as this is ready. Probably a good idea not to stuff too much into a single release and that'll give you a little more time to do the above.

            WDYT, rajeshbabu?

            jamestaylor James R. Taylor added a comment -

            Also, I think we should make local indexes transactional with the data table in the same release as the rework (assuming it's feasible). Otherwise, users aren't really gaining anything (indirectly, they are by ensuring that the feature doesn't break with future HBase versions, but that's more helping us with maintainability), while we're asking them to rebuild all their indexes. Just my 2 cents.

            jamestaylor James R. Taylor added a comment -
            githubbot ASF GitHub Bot added a comment -

            GitHub user chrajeshbabu opened a pull request:

            https://github.com/apache/phoenix/pull/168

            PHOENIX-1734 Local index improvements(Rajeshbabu)

            This is the patch for the new implementation of local indexes, where we store the local index data in separate column families in the same table rather than in a separate table.

            You can merge this pull request into a Git repository by running:

            $ git pull https://github.com/chrajeshbabu/phoenix master

            Alternatively you can review and apply these changes as the patch at:

            https://github.com/apache/phoenix/pull/168.patch

            To close this pull request, make a commit to your master/trunk branch
            with (at least) the following in the commit message:

            This closes #168


            commit 2dbb361551b48b96d8335c12b397a2258c266fd3
            Author: Rajeshbabu Chintaguntla <rajeshbabu@apache.org>
            Date: 2016-05-10T14:00:41Z

            PHOENIX-1734 Local index improvements(Rajeshbabu)


            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on the pull request:

            https://github.com/apache/phoenix/pull/168#issuecomment-218194916

            How about a version that doesn't depend on HBASE-15600 so that we can continue to support older versions of HBase (and can do a 4.8 release)? What we've typically done, is once the change makes it to an HBase release, we update the pom on that version but provide a fallback implementation (conditional on the HBase version at runtime). Then we document that writes aren't transactional unless you're using HBase version XXX or above.
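The fallback pattern described here (pick the code path based on the running HBase version) might be sketched as below. The version-encoding scheme and helper names are placeholders, not Phoenix's actual version utilities, and the threshold version is whatever release ends up containing HBASE-15600:

```java
// Sketch of a runtime HBase version gate: enable the new WAL-consistent
// local index write path only when the cluster is at or above the
// release that contains HBASE-15600. Names and encoding are assumptions.
public final class HBaseVersionGate {
    private HBaseVersionGate() {}

    /** Encodes "major.minor.patch[-qualifier]" into a single comparable int. */
    public static int encode(String version) {
        String[] parts = version.split("[.-]");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        int patch = parts.length > 2 ? Integer.parseInt(parts[2]) : 0;
        return (major << 20) | (minor << 10) | patch;
    }

    /** True if the running HBase supports the new consistent local index writes. */
    public static boolean supportsConsistentLocalIndexWrites(String hbaseVersion,
                                                             String minVersion) {
        return encode(hbaseVersion) >= encode(minVersion);
    }
}
```

The documentation note then follows directly from this check: writes aren't transactional unless `supportsConsistentLocalIndexWrites(...)` holds for the cluster's version.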

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62696240

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/SortMergeJoinIT.java —
            @@ -187,13 +186,13 @@ public void initTable() throws Exception

            { "CREATE LOCAL INDEX \"idx_supplier\" ON " + JOIN_SUPPLIER_TABLE_FULL_NAME + " (name)" }

            , {
            "SORT-MERGE-JOIN (LEFT) TABLES\n" +

            - "    CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + JOIN_SUPPLIER_TABLE_DISPLAY_NAME + " [-32768]\n" +
            + "    CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + JOIN_SUPPLIER_TABLE_DISPLAY_NAME + " [1]\n" +

            — End diff –

            We should update the explain plan code to output something else when a local index is used, otherwise it's too difficult to tell whether the local index is being used.

            Something like ... OVER LOCAL INDEX <index name> ...

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62697744

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java —
            @@ -165,6 +165,9 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
            throws IOException {
            s = super.preScannerOpen(e, scan, s);
            if (ScanUtil.isAnalyzeTable(scan)) {
            + if (!ScanUtil.isLocalIndex(scan)) {
            +     scan.getFamilyMap().clear();
            + }

            — End diff –

            Can't we project the right column families from the client into the scan? Our UPDATE STATISTICS command specifies whether or not stats for indexes should be updated or not.
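The suggestion here, projecting only the relevant families from the client instead of clearing the family map server-side, could look roughly like this. The family names and the "C#" shadow prefix are assumptions from this discussion, and the helper is a sketch rather than the actual UPDATE STATISTICS code path:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: decide client-side which column families a stats-collection
// scan should project, rather than clearing the family map in the
// coprocessor. "C#" is the assumed local index shadow prefix.
public final class StatsScanFamilies {
    private StatsScanFamilies() {}

    /**
     * From all families on the table, keep shadow (local index) families
     * only when index stats were requested by UPDATE STATISTICS.
     */
    public static List<String> familiesToScan(List<String> allFamilies,
                                              boolean includeIndexStats) {
        List<String> result = new ArrayList<>();
        for (String family : allFamilies) {
            boolean isShadow = family.startsWith("C#");
            if (!isShadow || includeIndexStats) {
                result.add(family);
            }
        }
        return result;
    }
}
```

The returned list would then be added to the Scan's family map on the client, so the server never has to guess which families to drop.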

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62698009

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java —
            @@ -202,7 +207,10 @@ protected RegionScanner doPostScannerOpen(final ObserverContext<RegionCoprocesso
            ScanUtil.setRowKeyOffset(scan, offsetToBe);
            }
            final int offset = offsetToBe;
            -
            + if (MetaDataUtil.hasLocalIndexColumnFamily(region.getTableDesc())) {
            +     nonLocalIndexFamilies = MetaDataUtil.getNonLocalIndexColumnFamilies(region.getTableDesc());
            + }
            +
            — End diff –

            Why is this required?

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62698174

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/execute/DelegateHTable.java —
            @@ -297,4 +297,28 @@ public boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier,
            return delegate.checkAndDelete(row, family, qualifier, compareOp, value, delete);
            }

            + @Override
            — End diff –

            These should call the method on the delegate.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62699243

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/index/IndexMaintainer.java —
            @@ -861,7 +871,12 @@ public Put buildUpdateMutation(KeyValueBuilder kvBuilder, ValueGetter valueGette
            put.setDurability(!indexWALDisabled ? Durability.USE_DEFAULT : Durability.SKIP_WAL);
            }
            //this is a little bit of extra work for installations that are running <0.94.14, but that should be rare and is a short-term set of wrappers - it shouldn't kill GC

            - put.add(kvBuilder.buildPut(rowKey, ref.getFamilyWritable(), cq, ts, value));
            + if (this.isLocalIndex) {
            +     ColumnReference columnReference = this.coveredColumnsMap.get(ref);
            +     put.add(kvBuilder.buildPut(rowKey, columnReference.getFamilyWritable(), cq, ts, value));

            — End diff –

            Tell me more about this. We keep a mapping on the client-side for column families? I thought we could generate the cf of the local index based on the cf of the data table? Is this to save re-generating the column family name all the time?
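The client-side mapping being asked about here can be sketched as a small cache keyed by the data table's column family. The "C#" prefix and all names below are assumptions drawn from this discussion, not the actual IndexMaintainer/coveredColumnsMap code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: derive the local index shadow column family from the data
// table's column family once, then cache it so the name isn't
// regenerated for every index Put. Prefix and naming are assumptions.
public final class LocalIndexFamilyCache {
    private static final String PREFIX = "C#"; // assumed shadow prefix
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    /** Returns the shadow family for a data family, computing it at most once. */
    public String shadowFamilyFor(String dataFamily) {
        return cache.computeIfAbsent(dataFamily, f -> PREFIX + f);
    }
}
```

This matches the answer given below: the family name could be regenerated each time, but the map avoids rebuilding the byte arrays on every mutation.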

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62699657

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/iterate/ExplainTable.java —
            @@ -198,8 +198,14 @@ private void appendPKColumnValue(StringBuilder buf, byte[] range, Boolean isNull
            type.coerceBytes(ptr, type, sortOrder, SortOrder.getDefault());
            range = ptr.get();
            }

            - Format formatter = context.getConnection().getFormatter(type);
            - buf.append(type.toStringLiteral(range, formatter));
            + if (changeViewIndexId) {

            — End diff –

            I like this trick. We should always do this.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62700523

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java —
            @@ -1016,7 +1016,7 @@ private MutationState buildIndexAtTimeStamp(PTable index, NamedTableNode dataTab
            // connection so that our new index table is visible.
            Properties props = new Properties(connection.getClientInfo());
            props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(connection.getSCN()+1));

            - PhoenixConnection conn = DriverManager.getConnection(connection.getURL(), props).unwrap(PhoenixConnection.class);
            + PhoenixConnection conn = new PhoenixConnection(connection, connection.getQueryServices(), props);

            — End diff –

            Just curious on this change.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62710798

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/ParallelWriterIndexCommitter.java —
            @@ -116,7 +117,10 @@ public void write(Multimap<HTableInterfaceReference, Mutation> toWrite) throws S
            // doing a complete copy over of all the index update for each table.
            final List<Mutation> mutations = kvBuilder.cloneIfNecessary((List<Mutation>)entry.getValue());
            final HTableInterfaceReference tableReference = entry.getKey();

            - final RegionCoprocessorEnvironment env = this.env;
            + if (env != null && tableReference.getTableName().equals(
            +         env.getRegion().getTableDesc().getNameAsString())) {
            +     continue;
            + }

            — End diff –

            Can we conditionally not filter our local index rows depending on the HBase version?

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r62710902

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/index/PhoenixTransactionalIndexer.java —
            @@ -160,6 +163,9 @@ public void preBatchMutate(ObserverContext<RegionCoprocessorEnvironment> c,

            // get the index updates for all elements in this batch
            indexUpdates = getIndexUpdates(c.getEnvironment(), indexMetaData, getMutationIterator(miniBatchOp), txRollbackAttribute);
            +
            + IndexUtil.addLocalUpdatesToCpOperations(c, miniBatchOp, indexUpdates,
            + m.getDurability() != Durability.SKIP_WAL);
            — End diff –

            Can we conditionally not run this code depending on the HBase version?

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63463302

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/SortMergeJoinIT.java —
            @@ -187,13 +186,13 @@ public void initTable() throws Exception

            { "CREATE LOCAL INDEX \"idx_supplier\" ON " + JOIN_SUPPLIER_TABLE_FULL_NAME + " (name)" }

            , {
            "SORT-MERGE-JOIN (LEFT) TABLES\n" +

            - "    CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + JOIN_SUPPLIER_TABLE_DISPLAY_NAME + " [-32768]\n" +
            + "    CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + JOIN_SUPPLIER_TABLE_DISPLAY_NAME + " [1]\n" +

            — End diff –

            @JamesRTaylor Yes, agreed. Is "OVER LOCAL INDEX OF DATA_TABLE" fine, or should we include the index table name as well?

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63463350

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java —
            @@ -202,7 +207,10 @@ protected RegionScanner doPostScannerOpen(final ObserverContext<RegionCoprocesso
            ScanUtil.setRowKeyOffset(scan, offsetToBe);
            }
            final int offset = offsetToBe;
            -
            + if (MetaDataUtil.hasLocalIndexColumnFamily(region.getTableDesc())) {
            +     nonLocalIndexFamilies = MetaDataUtil.getNonLocalIndexColumnFamilies(region.getTableDesc());
            + }
            +
            — End diff –

            This is not required; I've removed it. It was only added while debugging one of the issues.

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63463462

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/execute/DelegateHTable.java —
            @@ -297,4 +297,28 @@ public boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier,
            return delegate.checkAndDelete(row, family, qualifier, compareOp, value, delete);
            }

            + @Override
            — End diff –

            These methods are required for the 1.2.2-SNAPSHOT version, so they're not currently needed with older versions. Will remove them.

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63463482

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/index/IndexMaintainer.java —
            @@ -861,7 +871,12 @@ public Put buildUpdateMutation(KeyValueBuilder kvBuilder, ValueGetter valueGette
            put.setDurability(!indexWALDisabled ? Durability.USE_DEFAULT : Durability.SKIP_WAL);
            }
            //this is a little bit of extra work for installations that are running <0.94.14, but that should be rare and is a short-term set of wrappers - it shouldn't kill GC

            - put.add(kvBuilder.buildPut(rowKey, ref.getFamilyWritable(), cq, ts, value));
            + if (this.isLocalIndex) {
            +     ColumnReference columnReference = this.coveredColumnsMap.get(ref);
            +     put.add(kvBuilder.buildPut(rowKey, columnReference.getFamilyWritable(), cq, ts, value));

            — End diff –

            Yes James. It's to save regenerating the column family name all the time.

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63463584

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java —
            @@ -1016,7 +1016,7 @@ private MutationState buildIndexAtTimeStamp(PTable index, NamedTableNode dataTab
            // connection so that our new index table is visible.
            Properties props = new Properties(connection.getClientInfo());
            props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(connection.getSCN()+1));

- PhoenixConnection conn = DriverManager.getConnection(connection.getURL(), props).unwrap(PhoenixConnection.class);
+ PhoenixConnection conn = new PhoenixConnection(connection, connection.getQueryServices(), props);
— End diff –

DriverManager.getConnection has more overhead than initializing the PhoenixConnection directly, which is why I changed it.
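The rationale can be sketched with a toy example (not Phoenix or JDBC code; every name below is illustrative): a DriverManager-style factory has to probe every registered driver and let each inspect the URL before it can hand back a connection, while a direct constructor call skips that lookup because the caller already holds what it needs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of factory lookup vs. direct construction.
public class DirectVsFactory {
    interface Driver { Object connect(String url); }

    static final List<Driver> REGISTERED = new ArrayList<>();
    static int driverProbes = 0; // counts the extra work the factory path does

    static Object viaFactory(String url) {
        for (Driver d : REGISTERED) {
            driverProbes++;
            Object conn = d.connect(url); // each driver re-checks the URL
            if (conn != null) return conn;
        }
        throw new IllegalStateException("No suitable driver for " + url);
    }

    static Object direct() {
        return new Object(); // stands in for calling the concrete constructor directly
    }
}
```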

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63463618

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/ParallelWriterIndexCommitter.java —
            @@ -116,7 +117,10 @@ public void write(Multimap<HTableInterfaceReference, Mutation> toWrite) throws S
            // doing a complete copy over of all the index update for each table.
            final List<Mutation> mutations = kvBuilder.cloneIfNecessary((List<Mutation>)entry.getValue());
            final HTableInterfaceReference tableReference = entry.getKey();

  final RegionCoprocessorEnvironment env = this.env;
+ if (env != null && tableReference.getTableName().equals(
+         env.getRegion().getTableDesc().getNameAsString())) {
+     continue;
+ }
— End diff –

Yes, we should do that. For now I've made the patch without HBASE-15600, so once that's ready I will add the condition check.
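The skip in the diff above follows a simple pattern that can be sketched generically (the helper below is illustrative, not the Phoenix API): when the committer runs inside a region's coprocessor environment, updates addressed to that same table are filtered out here, because local-index rows are applied with the host data region's own batch rather than written remotely.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: drop the per-table update lists that target the table
// hosting this coprocessor, keeping only updates that must go to other tables.
public class LocalTableFilter {
    static Map<String, List<String>> remoteOnly(Map<String, List<String>> toWrite,
                                                String hostTableName) {
        Map<String, List<String>> out = new HashMap<>();
        for (Map.Entry<String, List<String>> e : toWrite.entrySet()) {
            if (hostTableName != null && hostTableName.equals(e.getKey())) {
                continue; // local-index rows: written by the host region itself
            }
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }
}
```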

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63463623

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/index/PhoenixTransactionalIndexer.java —
            @@ -160,6 +163,9 @@ public void preBatchMutate(ObserverContext<RegionCoprocessorEnvironment> c,

            // get the index updates for all elements in this batch
            indexUpdates = getIndexUpdates(c.getEnvironment(), indexMetaData, getMutationIterator(miniBatchOp), txRollbackAttribute);
            +
            + IndexUtil.addLocalUpdatesToCpOperations(c, miniBatchOp, indexUpdates,
            + m.getDurability() != Durability.SKIP_WAL);
            — End diff –

            Yes

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on the pull request:

            https://github.com/apache/phoenix/pull/168#issuecomment-219617896

James, as we discussed, I have made a patch that works with older versions of HBase first and have addressed the review comments here. I will create a new pull request with that patch.

            Thanks for reviews.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63468900

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/SortMergeJoinIT.java —
            @@ -187,13 +186,13 @@ public void initTable() throws Exception

            { "CREATE LOCAL INDEX \"idx_supplier\" ON " + JOIN_SUPPLIER_TABLE_FULL_NAME + " (name)" }

            , {
            "SORT-MERGE-JOIN (LEFT) TABLES\n" +

            • " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + JOIN_SUPPLIER_TABLE_DISPLAY_NAME + " [-32768]\n" +
              + " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " +JOIN_SUPPLIER_TABLE_DISPLAY_NAME + " [1]\n" +
                • End diff –

            Yes, I think it'd be good to mention the local index name as there could be more than one. If you include the full name of the local index (<schema name>.<table name> which is what you get from PTable.getName()), then that's unique so you wouldn't need to include the data table name too.
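The naming suggestion reduces to a small rule: the full index name, `<schema name>.<table name>`, is already unique, so the plan line doesn't also need the data table name. A minimal sketch of that composition (the helper itself is illustrative, though it mirrors what `SchemaUtil.getTableName`-style helpers do):

```java
// Hypothetical helper: build the unique full name <schema>.<table>,
// falling back to the bare table name when there is no schema.
public class FullName {
    static String of(String schema, String table) {
        return (schema == null || schema.isEmpty()) ? table : schema + "." + table;
    }
}
```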

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63469161

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/index/MutableIndexIT.java —
            @@ -707,16 +702,17 @@ public void testIndexHalfStoreFileReader() throws Exception {
            } catch (Exception ex)

            { Log.info(ex); }

            -

-         long waitStartTime = System.currentTimeMillis();
-         // wait until merge happened
-         while (System.currentTimeMillis() - waitStartTime < 10000) {
-             List<HRegionInfo> regions = admin.getTableRegions(indexTable);
-             System.out.println("Waiting:" + regions.size());
-             if (regions.size() < numRegions) {
-                 break;
-             }
-             Threads.sleep(1000);
+         if (!localIndex) {
+             long waitStartTime = System.currentTimeMillis();
+             // wait until merge happened
+             while (System.currentTimeMillis() - waitStartTime < 10000) {
+                 List<HRegionInfo> regions = admin.getTableRegions(indexTable);
+                 System.out.println("Waiting:" + regions.size());
— End diff –

            Remove System.out.println() or replace with logging.
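The two points in this exchange, a bounded wait-until-condition loop and logging instead of `System.out.println()`, can be combined in one small sketch. The helper and its parameters are illustrative, not the test code itself:

```java
import java.util.function.BooleanSupplier;
import java.util.logging.Logger;

// Hypothetical sketch of the polling loop above, with the progress message
// going through java.util.logging rather than System.out.println().
public class WaitUtil {
    private static final Logger LOG = Logger.getLogger(WaitUtil.class.getName());

    /** Polls until the condition holds or the timeout elapses; returns whether it held. */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs)
            throws InterruptedException {
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < timeoutMs) {
            if (condition.getAsBoolean()) {
                return true;
            }
            LOG.fine("Waiting for condition..."); // was: System.out.println(...)
            Thread.sleep(pollMs);
        }
        return false;
    }
}
```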

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63469452

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java —
            @@ -667,4 +651,28 @@ public static String getIndexColumnExpressionStr(PColumn col)

            { return col.getExpressionStr() == null ? IndexUtil.getCaseSensitiveDataColumnFullName(col.getName().getString()) : col.getExpressionStr(); }

            +
            + public static void addLocalUpdatesToCpOperations(ObserverContext<RegionCoprocessorEnvironment> c,
            — End diff –

            Minor nit, but if this is called in only one place, can we move this there and make it private?

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r63469708

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java —
            @@ -2455,6 +2409,19 @@ public Void call() throws Exception {
            }

            if (currentServerSideTableTimeStamp < MetaDataProtocol.MIN_SYSTEM_TABLE_TIMESTAMP_4_8_0) {
            + Properties props = PropertiesUtil.deepCopy(metaConnection.getClientInfo());
            + props.remove(PhoenixRuntime.CURRENT_SCN_ATTRIB);
            + props.remove(PhoenixRuntime.TENANT_ID_ATTRIB);
            + PhoenixConnection conn =
            + new PhoenixConnection(ConnectionQueryServicesImpl.this,
            + metaConnection.getURL(), props, metaConnection
            + .getMetaDataCache());
            + try {
            + UpgradeUtil.upgradeLocalIndexes(conn);
            — End diff –

            Should we only disable all local indexes here as a full rebuild would take a lot of time? Then we can let users use your utility to rebuild them?

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on the pull request:

            https://github.com/apache/phoenix/pull/168#issuecomment-219630101

            Looks very good, @chrajeshbabu! Thanks so much for this. Are you ok making the changes I mentioned?

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r64123637

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java —
            @@ -2455,6 +2409,19 @@ public Void call() throws Exception {
            }

            if (currentServerSideTableTimeStamp < MetaDataProtocol.MIN_SYSTEM_TABLE_TIMESTAMP_4_8_0) {
            + Properties props = PropertiesUtil.deepCopy(metaConnection.getClientInfo());
            + props.remove(PhoenixRuntime.CURRENT_SCN_ATTRIB);
            + props.remove(PhoenixRuntime.TENANT_ID_ATTRIB);
            + PhoenixConnection conn =
            + new PhoenixConnection(ConnectionQueryServicesImpl.this,
            + metaConnection.getURL(), props, metaConnection
            + .getMetaDataCache());
            + try {
            + UpgradeUtil.upgradeLocalIndexes(conn);
            — End diff –

Agreed, James. I will do it in another patch.

            githubbot ASF GitHub Bot added a comment -

            Github user chrajeshbabu commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r64123714

            — Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/SortMergeJoinIT.java —
            @@ -187,13 +186,13 @@ public void initTable() throws Exception

            { "CREATE LOCAL INDEX \"idx_supplier\" ON " + JOIN_SUPPLIER_TABLE_FULL_NAME + " (name)" }

            , {
            "SORT-MERGE-JOIN (LEFT) TABLES\n" +

            • " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " + MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + JOIN_SUPPLIER_TABLE_DISPLAY_NAME + " [-32768]\n" +
              + " CLIENT PARALLEL 1-WAY RANGE SCAN OVER " +JOIN_SUPPLIER_TABLE_DISPLAY_NAME + " [1]\n" +
                • End diff –

I will work on this in another patch, James, since I would again need to change all the test cases.

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r64124199

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/util/UpgradeUtil.java —
            @@ -292,6 +294,119 @@ private static void preSplitSequenceTable(PhoenixConnection conn, int nSaltBucke
            }
            }
            }
            +
+ public static void upgradeLocalIndexes(PhoenixConnection metaConnection) throws SQLException,
+         IOException, org.apache.hadoop.hbase.TableNotFoundException {
+     HBaseAdmin admin = null;
+     try {
+         admin = metaConnection.getQueryServices().getAdmin();
+         ResultSet rs = metaConnection.createStatement().executeQuery("SELECT TABLE_SCHEM, TABLE_NAME, DATA_TABLE_NAME FROM SYSTEM.CATALOG "
+                 + " WHERE COLUMN_NAME IS NULL"
+                 + " AND COLUMN_FAMILY IS NULL"
+                 + " AND INDEX_TYPE=2");
+         boolean droppedLocalIndexes = false;
+         while (rs.next()) {
+             if (!droppedLocalIndexes) {
+                 HTableDescriptor[] localIndexTables = admin.listTables(MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + ".*");
+                 String localIndexSplitter = LocalIndexSplitter.class.getName();
+                 for (HTableDescriptor table : localIndexTables) {
+                     HTableDescriptor dataTableDesc = admin.getTableDescriptor(TableName.valueOf(MetaDataUtil.getUserTableName(table.getNameAsString())));
+                     HColumnDescriptor[] columnFamilies = dataTableDesc.getColumnFamilies();
+                     boolean modifyTable = false;
+                     for (HColumnDescriptor cf : columnFamilies) {
+                         String localIndexCf = QueryConstants.LOCAL_INDEX_COLUMN_FAMILY_PREFIX + cf.getNameAsString();
+                         if (dataTableDesc.getFamily(Bytes.toBytes(localIndexCf)) == null) {
+                             HColumnDescriptor colDef = new HColumnDescriptor(localIndexCf);
+                             for (Entry<ImmutableBytesWritable, ImmutableBytesWritable> keyValue : cf.getValues().entrySet()) {
+                                 colDef.setValue(keyValue.getKey().copyBytes(), keyValue.getValue().copyBytes());
+                             }
+                             dataTableDesc.addFamily(colDef);
+                             modifyTable = true;
+                         }
+                     }
+                     List<String> coprocessors = dataTableDesc.getCoprocessors();
+                     for (String coprocessor : coprocessors) {
+                         if (coprocessor.equals(localIndexSplitter)) {
+                             dataTableDesc.removeCoprocessor(localIndexSplitter);
+                             modifyTable = true;
+                         }
+                     }
+                     if (modifyTable) {
+                         admin.modifyTable(dataTableDesc.getName(), dataTableDesc);
+                     }
+                 }
+                 admin.disableTables(MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + ".*");
+                 admin.deleteTables(MetaDataUtil.LOCAL_INDEX_TABLE_PREFIX + ".*");
+                 droppedLocalIndexes = true;
+             }
+             String getColumns =
+                     "SELECT COLUMN_NAME, COLUMN_FAMILY FROM SYSTEM.CATALOG WHERE TABLE_SCHEM "
+                             + (rs.getString(1) == null ? "IS NULL " : "='" + rs.getString(1) + "'")
+                             + " and TABLE_NAME='" + rs.getString(2)
+                             + "' AND COLUMN_NAME IS NOT NULL";
+             ResultSet getColumnsRs = metaConnection.createStatement().executeQuery(getColumns);
+             List<String> indexedColumns = new ArrayList<String>(1);
+             List<String> coveredColumns = new ArrayList<String>(1);
+
+             while (getColumnsRs.next()) {
+                 String column = getColumnsRs.getString(1);
+                 String columnName = IndexUtil.getDataColumnName(column);
+                 if (columnName.equals(MetaDataUtil.getViewIndexIdColumnName())) {
+                     continue;
+                 }
+                 String columnFamily = IndexUtil.getDataColumnFamilyName(column);
+                 if (getColumnsRs.getString(2) == null) {
+                     if (columnFamily != null && !columnFamily.isEmpty()) {
+                         if (columnFamily.equals(QueryConstants.DEFAULT_COLUMN_FAMILY)) {
+                             indexedColumns.add(columnName);
+                         } else {
+                             indexedColumns.add(SchemaUtil.getColumnName(columnFamily, columnName));
+                         }
+                     }
+                 } else {
+                     coveredColumns.add(SchemaUtil.getColumnName(columnFamily, columnName));
+                 }
+             }
+             StringBuilder createIndex = new StringBuilder("CREATE LOCAL INDEX ");
+             createIndex.append(rs.getString(2));
+             createIndex.append(" ON ");
+             createIndex.append(SchemaUtil.getTableName(rs.getString(1), rs.getString(3)));
+             createIndex.append("(");
+             for (int i = 0; i < indexedColumns.size(); i++) {
+                 createIndex.append(indexedColumns.get(i));
+                 if (i < indexedColumns.size() - 1) {
+                     createIndex.append(",");
+                 }
+             }
+             createIndex.append(")");
+
+             if (!coveredColumns.isEmpty()) {
+                 createIndex.append(" INCLUDE(");
+                 for (int i = 0; i < coveredColumns.size(); i++) {
+                     createIndex.append(coveredColumns.get(i));
+                     if (i < coveredColumns.size() - 1) {
+                         createIndex.append(",");
+                     }
+                 }
+                 createIndex.append(")");
            — End diff –

            Can we just add the ASYNC keyword here so that the rebuilds are asynchronous, can be started when the user is ready, and will run through MR?
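The suggestion amounts to one more append on the generated DDL so the rebuild is deferred and can later be run through the MR-based IndexTool. A minimal, self-contained sketch of that change (the helper class is illustrative; only the ASYNC keyword comes from the suggestion above):

```java
// Hypothetical sketch: append ASYNC to a generated CREATE LOCAL INDEX
// statement so the index build is deferred rather than run synchronously.
public class AsyncIndexDdl {
    static String withAsync(String createIndexDdl) {
        return createIndexDdl + " ASYNC";
    }
}
```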

            githubbot ASF GitHub Bot added a comment -

            Github user JamesRTaylor commented on a diff in the pull request:

            https://github.com/apache/phoenix/pull/168#discussion_r64124279

            — Diff: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java —
            @@ -165,6 +165,9 @@ public RegionScanner preScannerOpen(ObserverContext<RegionCoprocessorEnvironment
            throws IOException {
            s = super.preScannerOpen(e, scan, s);
            if (ScanUtil.isAnalyzeTable(scan)) {
+ if (!ScanUtil.isLocalIndex(scan)) {
+     scan.getFamilyMap().clear();
+ }

            — End diff –

            Can we file a JIRA for this? MetaDataClient.updateStatisticsInternal() would just need to project the correct column family here.


jamestaylor James R. Taylor added a comment -

+1 with adding ASYNC to the CREATE INDEX statement during the ConnectionQueryServices.init() call for conversion (assuming tests pass). Please file JIRAs for the follow-up work. Great job, rajeshbabu.
            hudson Hudson added a comment -

            FAILURE: Integrated in Phoenix-master #1234 (See https://builds.apache.org/job/Phoenix-master/1234/)
            PHOENIX-1734 Local index improvements(Rajeshbabu) (rajeshbabu: rev 10909ae502095bac775d98e6d92288c5cad9b9a6)

            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/txn/TxWriteFailureIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/AggregateQueryIT.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/LocalIndexIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/coprocessor/GroupedAggregateRegionObserver.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/IndexExpressionIT.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/SortMergeJoinIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/query/QueryConstants.java
            • phoenix-core/src/main/java/org/apache/phoenix/mapreduce/AbstractBulkLoadTool.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/DeleteIT.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/DropMetadataIT.java
            • phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexSplitter.java
            • phoenix-core/src/main/java/org/apache/phoenix/exception/SQLExceptionCode.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/SubqueryIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/util/UpgradeUtil.java
            • phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java
            • phoenix-core/src/test/java/org/apache/phoenix/hbase/index/write/TestParalleWriterIndexCommitter.java
            • phoenix-core/src/it/java/org/apache/phoenix/hbase/index/balancer/IndexLoadBalancerIT.java
            • phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/IndexSplitTransaction.java
            • phoenix-core/src/main/java/org/apache/phoenix/hbase/index/Indexer.java
            • phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/IndexHalfStoreFileReaderGenerator.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/UserDefinedFunctionsIT.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/MutableIndexFailureIT.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/txn/RollbackIT.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/TenantSpecificViewIndexIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/schema/TableRef.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
            • phoenix-core/src/main/java/org/apache/phoenix/hbase/index/balancer/IndexLoadBalancer.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinLocalIndexIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/hbase/index/master/IndexMasterObserver.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/BaseViewIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexTool.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/BaseTenantSpecificViewIndexIT.java
            • phoenix-core/src/test/java/org/apache/phoenix/query/BaseTest.java
            • phoenix-core/src/test/java/org/apache/phoenix/hbase/index/write/TestParalleIndexWriter.java
            • phoenix-core/src/main/java/org/apache/phoenix/coprocessor/BaseScannerRegionObserver.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/UpgradeIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java
            • phoenix-core/src/main/java/org/apache/phoenix/iterate/ExplainTable.java
            • phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/IndexWriter.java
            • phoenix-core/src/main/java/org/apache/phoenix/index/IndexMaintainer.java
            • phoenix-core/src/main/java/org/apache/phoenix/compile/ProjectionCompiler.java
            • phoenix-core/src/main/java/org/apache/phoenix/index/PhoenixTransactionalIndexer.java
            • phoenix-core/src/it/java/org/apache/hadoop/hbase/regionserver/wal/WALReplayWithIndexWritesAndCompressedWALIT.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/HashJoinIT.java
            • phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java
            • phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/txn/MutableRollbackIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/compile/CreateTableCompiler.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/IndexIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/coprocessor/ScanRegionObserver.java
            • phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/recovery/TrackingParallelWriterIndexCommitter.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/MutableIndexIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/util/IndexUtil.java
            • phoenix-core/src/main/java/org/apache/phoenix/util/MetaDataUtil.java
            • phoenix-core/src/main/java/org/apache/phoenix/index/PhoenixIndexFailurePolicy.java
            • phoenix-core/src/main/java/org/apache/phoenix/compile/UpsertCompiler.java
            • phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/IndexCommitter.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/ViewIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/hbase/index/IndexRegionSplitPolicy.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/ReadOnlyIndexFailureIT.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/SubqueryUsingSortMergeJoinIT.java
            • phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/ParallelWriterIndexCommitter.java
            • phoenix-core/src/main/java/org/apache/phoenix/index/PhoenixIndexCodec.java
            • phoenix-core/src/it/java/org/apache/phoenix/end2end/index/ViewIndexIT.java

            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            jamestaylor Thanks for the reviews. Committed the patch to master. The failed test passed locally. Looking at the test failures now; I will upload an addendum to fix them.

            +1 with the adding of ASYNC to CREATE INDEX statement during ConnectionQueryServices.init()

            I have made the CREATE INDEX async in both cases: for the offline upgrade from psql and for ConnectionQueryServices.init(). Do you think ASYNC is also fine for the offline upgrade, or is normal creation fine there?
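The conversion being discussed can be illustrated with a small sketch. This is a hypothetical helper, not the actual upgrade code: it only shows the shape of rewriting a CREATE INDEX DDL statement to append ASYNC, so that the (potentially large) index rebuild is deferred to the IndexTool MapReduce job instead of blocking ConnectionQueryServices.init().

```java
// Hypothetical helper illustrating the upgrade conversion: recreate
// existing local indexes with CREATE INDEX ... ASYNC so the rebuild is
// deferred rather than performed synchronously during initialization.
public class AsyncIndexDdl {
    public static String makeAsync(String createIndexDdl) {
        String ddl = createIndexDdl.trim();
        if (ddl.toUpperCase().endsWith(" ASYNC")) {
            return ddl; // already asynchronous
        }
        return ddl + " ASYNC";
    }

    public static void main(String[] args) {
        System.out.println(makeAsync("CREATE LOCAL INDEX IDX ON T(V)"));
        // CREATE LOCAL INDEX IDX ON T(V) ASYNC
    }
}
```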

            sergey.soldatov Sergey Soldatov added a comment -

            rajeshbabu, jamestaylor
            Just FYI, I checked that the recent changes in CSVBulkLoad are compatible with the new local indexes. It works, even better than before. I loaded 5 million records into a table with 1 global index and 2 local indexes; on a single-node cluster that took less than 10 minutes (a table with over 20 columns, a 1.5 GB CSV file). Some performance observations for a simple query, select * from table where indexed_col = something:
            0.2 sec with the local index
            1 min without an index (almost 2 min after a split)
            ~1.5 sec with the old implementation

            Now, about a problem I found: I tried to split/compact this table from the HBase shell, and the compaction fails:

            2016-05-24 12:26:30,362 ERROR [regionserver//10.22.8.101:16201-longCompactions-1464116687568] regionserver.CompactSplitThread: Compaction failed Request = regionName=GIGANTIC_TABLE,\x80\x03\xD0\xA3,1464117986481.3a4eef7f676dd670ce4fc1ef5130c293., storeName=L#0, fileCount=1, fileSize=32.0 M (32.0 M), priority=9, time=154281628674638
            java.lang.NullPointerException
                    at org.apache.hadoop.hbase.regionserver.LocalIndexStoreFileScanner.isSatisfiedMidKeyCondition(LocalIndexStoreFileScanner.java:158)
                    at org.apache.hadoop.hbase.regionserver.LocalIndexStoreFileScanner.next(LocalIndexStoreFileScanner.java:55)
                    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
                    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:581)
                    at org.apache.phoenix.schema.stats.StatisticsScanner.next(StatisticsScanner.java:73)
                    at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:318)
                    at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:111)
                    at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:119)
                    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1223)
                    at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1845)
                    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:529)
                    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:566)
                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                    at java.lang.Thread.run(Thread.java:745)
            hudson Hudson added a comment -

            SUCCESS: Integrated in Phoenix-master #1235 (See https://builds.apache.org/job/Phoenix-master/1235/)
            PHOENIX-1734 Local index improvements(Rajeshbabu)-addendum (rajeshbabu: rev aedf48fab4610d979c023515f03606e5d81dea5b)

            • phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
            • phoenix-core/src/main/java/org/apache/phoenix/compile/PostLocalIndexDDLCompiler.java

            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            sergey.soldatov I am able to reproduce the NPE when there are multiple local indexes on the table; looking into it.

            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            The reason for the NPE during compaction after a split is that the split key selected by HBase came from a local index column family. Such a key should not be accepted; the splits of the local index column families must be driven by a split key from the data table column families. Here is a patch that always selects the split key from the non-local-index column families.
            sergey.soldatov jamestaylor I will commit it if you are fine with it.
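The fix described above can be sketched as a simplified, hypothetical model. The real change lives in the region split policy and works on byte[] keys from store files; this toy version just shows the selection rule: skip split-point candidates from local index column families (Phoenix names these with an "L#" prefix, e.g. the "L#0" store seen in the compaction log above) and accept only a key from a data table column family.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of split-key selection: ignore candidates coming
// from local index column families and only accept a split key from a
// data table column family. Class and method names are illustrative.
public class SplitKeySelector {
    static final String LOCAL_INDEX_CF_PREFIX = "L#";

    // candidates: column family name -> the split key its largest store
    // file suggests. Returns the first data-family key, or null (no split).
    public static String selectSplitKey(Map<String, String> candidates) {
        for (Map.Entry<String, String> e : candidates.entrySet()) {
            if (!e.getKey().startsWith(LOCAL_INDEX_CF_PREFIX)) {
                return e.getValue();
            }
        }
        return null; // only local index families present: do not split
    }

    public static void main(String[] args) {
        Map<String, String> c = new LinkedHashMap<>();
        c.put("L#0", "idx-mid-key"); // local index family: skipped
        c.put("0", "data-mid-key");  // data family: accepted
        System.out.println(selectSplitKey(c)); // data-mid-key
    }
}
```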

            jamestaylor James R. Taylor added a comment -

            +1

            rajeshbabu Rajeshbabu Chintaguntla added a comment -

            Resolving this. Thanks for the review, jamestaylor.
            ankit@apache.org Ankit Singhal added a comment -

            Bulk close of all issues that have been resolved in a released version.

            jamestaylor James R. Taylor added a comment -

            rajeshbabu - a question on the change to Indexer.java in this patch: it seems that every Put and Delete in the batch for a local index would cause all index updates to be written again and again. Or am I missing something that prevents this?

            Also, just curious - why was this change made at all? Can't all the index updates be written the same way they are with global indexes? It's not entirely clear why we need a special case.

            lhofhansl
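The write-amplification concern raised above can be made concrete with a toy count. This is an illustrative sketch only, not Phoenix code: if the batch's index updates are regenerated and written inside the per-mutation loop, a batch of N mutations performs on the order of N^2 index writes instead of N.

```java
// Toy illustration of the suspected inefficiency: writing the whole
// batch's index updates from inside a per-mutation loop is quadratic in
// the batch size, while writing them once per batch is linear.
public class BatchIndexWrites {
    // Suspected shape: every mutation in the batch re-writes all updates.
    public static int perMutationWriteCount(int batchSize) {
        int writes = 0;
        for (int i = 0; i < batchSize; i++) {
            writes += batchSize; // all batch updates written per mutation
        }
        return writes;
    }

    // Desired shape: each index update written exactly once per batch.
    public static int perBatchWriteCount(int batchSize) {
        return batchSize;
    }

    public static void main(String[] args) {
        System.out.println(perMutationWriteCount(10)); // 100
        System.out.println(perBatchWriteCount(10));    // 10
    }
}
```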
            samarthjain Samarth Jain added a comment -

            I am wondering if this is possibly related to the slowness that mujtabachohan saw when testing local indexes - https://issues.apache.org/jira/browse/PHOENIX-3778

            githubbot ASF GitHub Bot added a comment -

            stoty commented on pull request #135:
            URL: https://github.com/apache/phoenix/pull/135#issuecomment-700177637

            :broken_heart: *-1 overall*

            | Vote | Subsystem | Runtime | Comment |
            |:----:|----------:|--------:|:--------|
            | +0 :ok: | reexec | 0m 0s | Docker mode activated. |
            | -1 :x: | patch | 0m 3s | https://github.com/apache/phoenix/pull/135 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/in-progress/precommit-patchnames for help. |

            | Subsystem | Report/Notes |
            |----------:|:-------------|
            | GITHUB PR | https://github.com/apache/phoenix/pull/135 |
            | Console output | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-135/1/console |
            | versions | git=2.17.1 |
            | Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

            This message was automatically generated.

            ----------------------------------------------------------------
            This is an automated message from the Apache Git Service.
            To respond to the message, please log on to GitHub and use the
            URL above to go to the specific comment.

            For queries about this service, please contact Infrastructure at:
            users@infra.apache.org

            githubbot ASF GitHub Bot added a comment -

            chrajeshbabu closed pull request #168:
            URL: https://github.com/apache/phoenix/pull/168

            ----------------------------------------------------------------
            This is an automated message from the Apache Git Service.
            To respond to the message, please log on to GitHub and use the
            URL above to go to the specific comment.

            For queries about this service, please contact Infrastructure at:
            users@infra.apache.org

            githubbot ASF GitHub Bot added a comment -

            chrajeshbabu closed pull request #135:
            URL: https://github.com/apache/phoenix/pull/135

            ----------------------------------------------------------------
            This is an automated message from the Apache Git Service.
            To respond to the message, please log on to GitHub and use the
            URL above to go to the specific comment.

            For queries about this service, please contact Infrastructure at:
            users@infra.apache.org


            People

              Assignee: rajeshbabu Rajeshbabu Chintaguntla
              Reporter: rajeshbabu Rajeshbabu Chintaguntla
              Votes: 0
              Watchers: 23