Have a look at LUCENE-1470; even 2 was considered then.
That was not really usable even at that time! The improvement over 4 was zero. It was even worse, because the term dictionary got larger, which had an impact in 2.x and 3.x. At that time, I was always using 8 as precisionStep for longs and ints. The same applied to Solr. Lucene was the only one using 4 as default. ElasticSearch was cloning Lucene's defaults.
I would really prefer to use 8 for both ints and longs. Going from 8 to 16 increases the number of terms visited by a range query immensely, and the difference in index size between 8 and 16 is not really a problem. Experience has also shown me that, because of the way floats/doubles are encoded, a precision step of 8 is really good for longs. In most cases some parts never change (like the exponent), so there is exactly one term indexed for that.
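To put rough numbers on "immensely", here is a small sketch of the theoretical worst case for a single range query. The formula is the bound as I remember it from the NumericRangeQuery javadocs (which quote 7*255*2 + 255 = 3825 for 8-bit steps on longs), so treat it as an estimate and not as Lucene's exact behavior. The class and method names are made up for illustration. For steps that do not divide the value size evenly, the topmost level is narrower, so the real bound is lower than what this prints.

```java
// Hypothetical helper, not part of Lucene: estimates the worst-case number
// of terms a single numeric range query can expand to.
final class RangeTermBound {

    // levels   = number of precisions indexed per value (ceil(bits / step))
    // perLevel = 2^step - 1 terms at most on each side of a level
    // Every level except the coarsest one used contributes terms at both
    // range ends; the coarsest one contributes a single run in the middle.
    static long maxRangeTerms(int bitsPerValue, int precisionStep) {
        long levels = (bitsPerValue + precisionStep - 1) / precisionStep;
        long perLevel = (1L << precisionStep) - 1;
        return (levels - 1) * perLevel * 2 + perLevel;
    }

    public static void main(String[] args) {
        for (int step : new int[] {4, 8, 16}) {
            System.out.println("precisionStep=" + step
                + " -> at most " + maxRangeTerms(64, step) + " terms");
        }
    }
}
```

Under these assumptions, 64-bit values give a bound of 465 terms for step 4, 3825 for step 8, and 458745 for step 16.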
With a precision step of 16 I would imagine the difference between 16 and 64 would be negligible, too. The main reason for having lower precision steps is indexes where the values are equally distributed. For values clustered around some numbers, the precision step is irrelevant! In most cases, because of the way it works, the indexed value is constant for larger shifts, so you have one or two terms that hit all documents and are never used by the range query.
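The clustering argument can be checked with a toy model. The sketch below counts the distinct prefix terms per shift for 1000 values clustered just above one million. It is a simplification: it uses a plain unsigned shift and ignores Lucene's sign-bit handling and the sortable float/double encoding. All names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model, not Lucene code: a "term" at a given shift is simply the
// value with its lowest `shift` bits dropped.
final class PrefixTermCount {

    // Builds `count` consecutive values starting at `base`.
    static long[] clustered(long base, int count) {
        long[] values = new long[count];
        for (int i = 0; i < count; i++) {
            values[i] = base + i;
        }
        return values;
    }

    // Number of distinct prefix terms the values produce at this shift.
    static int distinctTerms(long[] values, int shift) {
        Set<Long> terms = new HashSet<>();
        for (long v : values) {
            terms.add(v >>> shift);
        }
        return terms.size();
    }

    public static void main(String[] args) {
        long[] values = clustered(1_000_000L, 1000);
        int precisionStep = 8;
        for (int shift = 0; shift < 64; shift += precisionStep) {
            System.out.println("shift=" + shift + " -> "
                + distinctTerms(values, shift) + " distinct terms");
        }
    }
}
```

With these inputs only shifts 0 and 8 produce more than one term (1000 and 5). Every larger shift collapses to a single term shared by all documents, which is the case described above.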
So before changing the default, I would suggest running a test with an index that has numbers equally distributed over the full 64-bit range.
I think 11 is better than 12
...because the last term is better used. The number of terms indexed is the same for 11 and 12: six in both cases (6*11=66 and 6*12=72 both cover 64 bits, but 5*12=60 is too small). But unfortunately 11 is not a multiple of 4, so it would not be backwards compatible.
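The arithmetic above can be reproduced with a few lines. This is a sketch that assumes one term per precision level and ceil(bits / step) levels per value. The names are invented for illustration.

```java
// Hypothetical helper, not part of Lucene.
final class TermsPerValue {

    // Number of terms indexed per value: one per precision level.
    static int termsPerValue(int bitsPerValue, int precisionStep) {
        return (bitsPerValue + precisionStep - 1) / precisionStep;
    }

    // Bits left over for the coarsest (last) term.
    static int bitsInLastTerm(int bitsPerValue, int precisionStep) {
        int levels = termsPerValue(bitsPerValue, precisionStep);
        return bitsPerValue - (levels - 1) * precisionStep;
    }

    public static void main(String[] args) {
        for (int step : new int[] {8, 11, 12, 16}) {
            System.out.println("step=" + step
                + " terms=" + termsPerValue(64, step)
                + " bitsInLastTerm=" + bitsInLastTerm(64, step));
        }
    }
}
```

For 64-bit values both 11 and 12 give six terms, but the last term carries 9 bits with step 11 and only 4 bits with step 12, which is what "better used" refers to.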
I think the main problem of this issue is that we only have one default. Somebody who never does any range queries does not need the additional terms at all. That's the main problem. Solr is better here, as it provides 2 predefined field types, but Lucene only has one - and that is the bug.
So my proposal: Provide a 2nd field type as a 2nd default with correct documentation, suggesting it to users who only want to index numeric identifiers or non-docvalues fields they want to sort on.
And second, we should do LUCENE-5605 - I started with it last week, but was interrupted by NativeFSIndexCorrumpter. The problem is the precisionStep altogether! We should make it an implementation detail. When constructing a NRQ, you should not need to pass it. Because of this I opened LUCENE-5605, so anybody creating a NRQ/NRF should pass the FieldType to the NRQ ctor, not an arbitrary number. Then it's ensured that people use the same settings for indexing and querying.
Together with this, we should provide 2 predefined field types per data type and remove the constant from NumericUtils completely. The 2 field types per data type might be named like DEFAULT_INT_FOR_RANGEQUERY_FIELDTYPE and DEFAULT_INT_OTHERWISE_FIELDTYPE (please choose better names and javadocs). And we should make 8 the new default, which is fully backwards compatible. And hide the precision step completely! 16 is really too large for lots of queries. And the difference in index size is negligible, unless you have a purely numeric index (in which case you should use an RDBMS instead of a Lucene index to query your data!). Indexing time is also, as Mike discovered, not a problem at all. If people don't reuse the IntField instance, it's always as slow, because the TokenStream has to be recreated for every number. The number of terms is not the issue at all, sorry!
About ElasticSearch: Unfortunately the schemaless mode of ElasticSearch always uses 4 as precStep if it detects a numeric or date type. ES should change this, but maybe with a bit more intelligent "guessing". E.g., if you index the "_id" field as an integer, it should automatically use infinite precStep (DEFAULT_INT_OTHERWISE_FIELDTYPE) - nobody would do range queries on the "_id" field. For all standard numeric fields it should use precStep=8.