Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
-
None
Description
The readFields() method of all Tuple implementations do a mFields.clear() and then load data into the same tuple. If that was changed to do
if (mFields.size() > 0) { mFields = new ArrayList<Object>(mFields.size()); }
many places in code where we do TupleFactory.newTuple(List c); can be replaced with TupleFactory.newTupleNoCopy(List c);. This will avoid a expensive System.arrayCopy() call which is a native method.
POPackage.java getValueTuple()
if( keyLookupSize > 0) { // we have some fields of the "value" in the // "key". int finalValueSize = keyLookupSize + val.size(); copy = mTupleFactory.newTuple(finalValueSize); int valIndex = 0; // an index for accessing elements from // the value (val) that we have currently for(int i = 0; i < finalValueSize; i++) { Integer keyIndex = keyLookup.get(i); if(keyIndex == null) { // the field for this index is not in the // key - so just take it from the "value" // we were handed copy.set(i, val.get(valIndex)); valIndex++; } ..... } else if (isProjectStar) { // the whole "value" is present in the "key" copy = mTupleFactory.newTuple(keyAsTuple.getAll()); } else { // there is no field of the "value" in the // "key" - so just make a copy of what we got // as the "value" copy = mTupleFactory.newTuple(val.getAll()); }
Some cases might take a slight hit in GC due to new ArrayList initialization. For eg: if( keyLookupSize > 0) condition in above code as it does not do a newTuple. But other 2 cases would greatly benefit as we can avoid arraylist copy. Same in POFRJoin.
Will run some tests to validate the theory and post a patch.