About moving concatFields() to the Tika language identifier: I think the way to go is just to move the whole method there, then change the detectLanguage() method to take the SolrInputDocument instead of a String. You don't need to carry over the field parameter from concatFields(), since data member inputFields will be accessible everywhere it's needed.
[VZ] The call looks cleaner now. I also made inputFields private to reduce its visibility.
I should have mentioned previously: I don't like the maxAppendSize and maxTotalAppendSize names - "size" is ambiguous (could refer to bytes, chars, whatever), and "append" refers to an internal operation... I'd like to see "append"=>"field value" and "size"=>"chars": maxFieldValueChars, and maxTotalChars (since appending doesn't need to be mentioned for the global limit). The same thing goes for the default constants and the test method names.
[VZ] Renamed parameters and test methods
Some minor issues I found with your patch:
As I said previously: "We should also set default maxima for both per-value and total chars, rather than MAX_INT, as in the current patch."
The total chars default should be its own setting; I was thinking we could make it double the per-value default?
[VZ] Added a default value for maxTotalChars and changed both defaults to 10K, as in com.cybozu.labs.langdetect.Detector.maxLength
It's better not to reorder import statements unless you're already making significant changes to them; it distracts from the meat of the change. (You reordered them in LangDetectLanguageIdentifierUpdateProcessor and LanguageIdentifierUpdateProcessorFactoryTestCase)
[VZ] That was the IDE's import optimization, which sorts imports alphabetically. Restored the original order.
In LanguageIdentifierUpdateProcessor.concatFields(), when you trim the concatenated text to maxTotalAppendSize, I think StringBuilder.setLength(maxTotalAppendSize); would be more efficient than StringBuilder.delete(maxTotalAppendSize, sb.length() - 1);
[VZ] Yep, cleaned that up
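For reference, a minimal standalone sketch of the two truncation calls discussed above. The class and method names are hypothetical and not taken from the patch, and maxTotalChars is used as the renamed form of maxTotalAppendSize. Besides the efficiency point, note that the end index of StringBuilder.delete() is exclusive, so passing sb.length() - 1 would leave the last character in place:

```java
// Hypothetical demo class for illustration only; not part of the patch.
final class TruncateDemo {

    // Truncates by shrinking the builder: keeps exactly the first maxTotalChars chars.
    static String truncateWithSetLength(String text, int maxTotalChars) {
        StringBuilder sb = new StringBuilder(text);
        if (sb.length() > maxTotalChars) {
            sb.setLength(maxTotalChars);
        }
        return sb.toString();
    }

    // The delete() form from the earlier patch. The end index is exclusive,
    // so the final character of the original text survives the deletion.
    static String truncateWithDelete(String text, int maxTotalChars) {
        StringBuilder sb = new StringBuilder(text);
        if (sb.length() > maxTotalChars) {
            sb.delete(maxTotalChars, sb.length() - 1);
        }
        return sb.toString();
    }
}
```

With the input "abcdefgh" and a limit of 3, the setLength() variant yields "abc", while the delete() variant yields "abch".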
In addition to the test you added for the global limit, we should also test using both the per-value and global limits at the same time.
[VZ] Added tests that use both limits together
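To make the combined case concrete, here is a rough sketch of how the two limits could interact. It assumes values are joined with a single space and that the per-value limit is applied before the global one; the class name, method name, and signature are hypothetical, and the actual concatFields() in the patch may differ:

```java
import java.util.List;

// Hypothetical demo class for illustration only; not the patch's implementation.
final class ConcatLimitsDemo {

    // Appends at most maxFieldValueChars chars of each value, separated by a
    // single space (an assumption), then caps the result at maxTotalChars.
    static String concatValues(List<String> values, int maxFieldValueChars, int maxTotalChars) {
        StringBuilder sb = new StringBuilder();
        for (String value : values) {
            if (sb.length() >= maxTotalChars) {
                break; // global limit already reached, skip remaining values
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            int end = Math.min(value.length(), maxFieldValueChars);
            sb.append(value, 0, end); // per-value limit
        }
        if (sb.length() > maxTotalChars) {
            sb.setLength(maxTotalChars); // global limit
        }
        return sb.toString();
    }
}
```

A test exercising both limits at once could pass three six-character values with maxFieldValueChars = 4 and maxTotalChars = 10, then check that each value contributes at most four characters and that the result is exactly ten characters long.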