The current jena-text integration of Lucene writes fields that are both duplicated and unused, which increases the size of the index and reduces the performance of the Lucene integration.
and that text:multilingualSupport false ;, then
The two Lucene documents that will be indexed appear as follows:
The graph field, together with its associated lang and uid fields, appears twice in each document. The first occurrence results from the text:graphField configuration; the second is an artifact of TextQueryFuncs.entityFromQuad adding the graph to the Entity via entity.put(...).
This second occurrence of the graph field serves no purpose: there is no search over tokenized graph URIs, and there is currently no way to return the graph field, so there is no need to store it.
It might well be a useful improvement to allow the graph field to be retrieved via the text:query PF, but that would most reasonably be done by adding Field.Store.YES to the FieldType for the first occurrence of the graph field.
The second uid field is a by-product of that unnecessary graph occurrence: it is generated for it during the Entity to Document conversion in TextLuceneIndex. It is never used, since the purpose of the uid field is to handle deleting documents from the Lucene index when a triple is deleted, and that does not involve the graph URI.
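The duplication described above can be illustrated with a minimal, self-contained sketch. It does not use the jena-text or Lucene classes: the class DuplicateGraphFieldSketch, the method indexedFieldNames, the field names "uri", "graph" and "label", and the order of the fields are all hypothetical stand-ins chosen for illustration. It assumes only what is stated above, namely that the configured graph field is written once and that every entry put into the entity map is then converted into a field with companion lang and uid fields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the behaviour described in this issue; not jena-text code.
class DuplicateGraphFieldSketch {

    // Returns the field names of the simulated index document, in insertion order.
    // putGraphIntoEntity models whether the graph is also added to the entity map
    // (the entity.put(...) call discussed above).
    static List<String> indexedFieldNames(boolean putGraphIntoEntity) {
        // Stand-in for the entity's map of field name to value.
        Map<String, String> entityMap = new LinkedHashMap<>();
        entityMap.put("label", "some literal text");
        if (putGraphIntoEntity) {
            entityMap.put("graph", "http://example/g1");
        }

        List<String> fields = new ArrayList<>();
        fields.add("uri");
        // First occurrence: written because a graph field is configured.
        fields.add("graph");
        // Assumed conversion: each map entry yields the field plus lang and uid.
        for (String name : entityMap.keySet()) {
            fields.add(name);
            fields.add("lang");
            fields.add("uid");
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String> before = indexedFieldNames(true);
        List<String> after = indexedFieldNames(false);
        System.out.println("with put:    " + before);
        System.out.println("without put: " + after);
        System.out.println("graph fields with put: "
                + Collections.frequency(before, "graph"));
    }
}
```

In this model, leaving out the extra put removes the second graph field along with the lang and uid fields generated for it, while the fields for the indexed literal are unchanged.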
The solution is to delete lines 89-90 of TextQueryFuncs.