Index: src/docbkx/performance.xml
===================================================================
--- src/docbkx/performance.xml (revision 1098694)
+++ src/docbkx/performance.xml (working copy)
@@ -189,6 +189,16 @@
 have the cache value be large because it costs more in memory for both client and RegionServer, so bigger isn't always better.
+
+ Scan Attribute Selection
+
+ Whenever a Scan is used to process large numbers of rows (and especially when used
+ as a MapReduce source), be aware of which attributes are selected. If scan.addFamily is called,
+ then all of the attributes in the specified ColumnFamily are returned to the client.
+ If only a small number of the available attributes will be processed, specify only those attributes
+ in the input scan (for example, with scan.addColumn), because over-selecting attributes incurs a non-trivial performance penalty on large datasets.
+
+
Close ResultScanners