Lucene - Core
LUCENE-6732

Improve validate-source-patterns in build.xml (e.g., detect invalid license headers!!)

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.4, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Today I enabled warnings analysis on Policeman Jenkins. This scans the build log for warnings by javac and reports them in statistics, together with source file dumps.

      While doing that, I found that a lot of "invalid" license headers had been added again, using /** instead of a plain /* comment. This causes javadoc warnings under some circumstances, because /** starts a javadoc comment, not a license comment.

      I then tried to fix validate-source-patterns to detect this, but due to a bug in Ant, the <containsregexp/> filter is applied per line, even though it has multiline matching capabilities.
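
To illustrate why per-line matching cannot catch this kind of header, here is a minimal Java sketch. It is not the Ant filter or the actual checker; the class name, method names, and the pattern are assumptions made for this example. A license header spans several lines, so a pattern that needs both the /** opener and the "Licensed to" text only matches when the whole file is searched at once.

```java
import java.util.regex.Pattern;

final class PerLineVsWholeFile {
  // Hypothetical pattern: a "/**" opener, then "Licensed to", then "*/".
  // DOTALL lets '.' match line breaks so the pattern can span lines.
  static final Pattern HEADER =
      Pattern.compile("/\\*\\*.*?Licensed to.*?\\*/", Pattern.DOTALL);

  // Mimics a filter that only ever sees one line at a time.
  static boolean matchesPerLine(String text) {
    for (String line : text.split("\r?\n")) {
      if (HEADER.matcher(line).find()) {
        return true;
      }
    }
    return false;
  }

  // Searches the complete file content in one pass.
  static boolean matchesWholeFile(String text) {
    return HEADER.matcher(text).find();
  }
}
```

For a file that starts with a three-line /** ... Licensed to ... */ header, matchesPerLine returns false and matchesWholeFile returns true. Note that this single pattern is deliberately naive: it can also span from an unrelated javadoc comment to a later plain comment, so it is only meant to show the per-line limitation.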

      So I rewrote our checker in Groovy. This has some additional benefits:

      • It tells you what was broken (tab, nocommit, ...); previously you only knew there was an error, but not what was wrong.
      • It is much faster: multiple <containsregexp/> filters read each file over and over, whereas the new checker reads each file once into a string and then applies all regular expressions.
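
The two points above can be sketched in Java roughly as follows. This is an illustration under assumptions, not the Groovy code from the patch: the class name, the rule labels, and the individual patterns are invented for the example. The idea is to read each file once, run every pattern against the same string, and report the label of each rule that matched.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class SourcePatternSketch {
  // Rule label -> pattern. Labels and patterns are illustrative assumptions.
  static final Map<String, Pattern> RULES = new LinkedHashMap<>();
  static {
    RULES.put("tabs instead of spaces", Pattern.compile("\t"));
    RULES.put("nocommit", Pattern.compile("nocommit", Pattern.CASE_INSENSITIVE));
    RULES.put("@author javadoc tag", Pattern.compile("@author\\b"));
    RULES.put("svn keyword", Pattern.compile("\\$(?:Id|Header|Revision|Date)\\b"));
  }

  // Applies every rule to content that is already in memory and
  // returns one "label: path" message per violated rule.
  static List<String> check(String path, String content) {
    List<String> problems = new ArrayList<>();
    for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
      if (rule.getValue().matcher(content).find()) {
        problems.add(rule.getKey() + ": " + path);
      }
    }
    return problems;
  }

  // Reads the file exactly once, then delegates to check().
  static List<String> checkFile(Path file) throws IOException {
    String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    return check(file.toString(), content);
  }
}
```

A caller would collect the messages from checkFile for every source file and fail the build if the combined list is not empty. Because each message carries the rule label, the output says which check failed for which file.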
      Attachments

      1. LUCENE-6732.patch (135 kB, Uwe Schindler)
      2. LUCENE-6732.patch (10 kB, Uwe Schindler)
      3. LUCENE-6732-v2.patch (5 kB, Uwe Schindler)
      4. LUCENE-6732-verbose.patch (0.8 kB, Uwe Schindler)

        Activity

        Uwe Schindler added a comment -

        Patch attached. Some files with invalid license headers have already been fixed, but I still have about 100 more files to fix:

        -validate-source-patterns:
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/java/org/apache/lucene/analysis/ar/ArabicAnalyzer.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/java/org/apache/lucene/analysis/bg/BulgarianAnalyzer.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/java/org/apache/lucene/analysis/el/GreekAnalyzer.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/java/org/apache/lucene/analysis/el/GreekLowerCaseFilter.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/java/org/apache/lucene/analysis/fa/PersianAnalyzer.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/java/org/apache/lucene/analysis/hi/HindiAnalyzer.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/java/org/apache/lucene/analysis/th/ThaiAnalyzer.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/core/TestStopFilter.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/el/GreekAnalyzerTest.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestScandinavianFoldingFilterFactory.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/miscellaneous/TestScandinavianNormalizationFilterFactory.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/payloads/NumericPayloadTokenFilterTest.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/payloads/TokenOffsetPayloadTokenFilterTest.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/payloads/TypeAsPayloadTokenFilterTest.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/sinks/DateRecognizerSinkTokenizerTest.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/sinks/TestTeeSinkTokenFilter.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/sinks/TokenTypeSinkTokenizerTest.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/test/org/apache/lucene/analysis/snowball/TestSnowballPorterFilterFactory.java
        [source-patterns] javadoc-style license header: lucene/analysis/common/src/tools/java/org/apache/lucene/analysis/standard/GenerateJflexTLDMacros.java
        [source-patterns] javadoc-style license header: lucene/analysis/icu/src/java/org/apache/lucene/collation/ICUCollationDocValuesField.java
        [source-patterns] javadoc-style license header: lucene/analysis/icu/src/test/org/apache/lucene/collation/TestICUCollationDocValuesField.java
        [source-patterns] javadoc-style license header: lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseIterationMarkCharFilter.java
        [source-patterns] javadoc-style license header: lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseIterationMarkCharFilterFactory.java
        [source-patterns] javadoc-style license header: lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseNumberFilter.java
        [source-patterns] javadoc-style license header: lucene/analysis/kuromoji/src/test/org/apache/lucene/analysis/ja/TestJapaneseIterationMarkCharFilter.java
        [source-patterns] javadoc-style license header: lucene/analysis/kuromoji/src/test/org/apache/lucene/analysis/ja/TestJapaneseNumberFilter.java
        [source-patterns] javadoc-style license header: lucene/analysis/stempel/src/java/org/apache/lucene/analysis/stempel/StempelFilter.java
        [source-patterns] javadoc-style license header: lucene/analysis/stempel/src/java/org/apache/lucene/analysis/stempel/StempelStemmer.java
        [source-patterns] javadoc-style license header: lucene/benchmark/src/java/org/apache/lucene/benchmark/Constants.java
        [source-patterns] javadoc-style license header: lucene/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/AbstractQueryMaker.java
        [source-patterns] javadoc-style license header: lucene/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/FileBasedQueryMaker.java
        [source-patterns] javadoc-style license header: lucene/benchmark/src/java/org/apache/lucene/benchmark/byTask/programmatic/Sample.java
        [source-patterns] javadoc-style license header: lucene/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/NewAnalyzerTask.java
        [source-patterns] javadoc-style license header: lucene/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/SearchTravRetLoadFieldSelectorTask.java
        [source-patterns] javadoc-style license header: lucene/benchmark/src/java/org/apache/lucene/benchmark/utils/ExtractReuters.java
        [source-patterns] javadoc-style license header: lucene/codecs/src/java/org/apache/lucene/codecs/bloom/BloomFilterFactory.java
        [source-patterns] javadoc-style license header: lucene/codecs/src/java/org/apache/lucene/codecs/bloom/BloomFilteringPostingsFormat.java
        [source-patterns] javadoc-style license header: lucene/codecs/src/java/org/apache/lucene/codecs/bloom/DefaultBloomFilterFactory.java
        [source-patterns] javadoc-style license header: lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java
        [source-patterns] javadoc-style license header: lucene/codecs/src/java/org/apache/lucene/codecs/bloom/HashFunction.java
        [source-patterns] javadoc-style license header: lucene/codecs/src/java/org/apache/lucene/codecs/bloom/MurmurHash2.java
        [source-patterns] javadoc-style license header: lucene/codecs/src/java/org/apache/lucene/codecs/memory/DirectPostingsFormat.java
        [source-patterns] javadoc-style license header: lucene/core/src/java/org/apache/lucene/codecs/StoredFieldsReader.java
        [source-patterns] javadoc-style license header: lucene/core/src/java/org/apache/lucene/codecs/StoredFieldsWriter.java
        [source-patterns] javadoc-style license header: lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxQuery.java
        [source-patterns] javadoc-style license header: lucene/core/src/java/org/apache/lucene/search/spans/TermSpans.java
        [source-patterns] javadoc-style license header: lucene/core/src/java/org/apache/lucene/util/FilterIterator.java
        [source-patterns] javadoc-style license header: lucene/core/src/java/org/apache/lucene/util/SmallFloat.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestAtomicUpdate.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestByteSlices.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestIndexWriterMerging.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestIndexWriterOnJRECrash.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestIndexWriterReader.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestStressDeletes.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestStressIndexing.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestStressIndexing2.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/index/TestTermdocPerf.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/search/TestCustomSearcherSort.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/search/payloads/TestPayloadSpans.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/util/Test2BPagedBytes.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/util/TestNumericUtils.java
        [source-patterns] javadoc-style license header: lucene/core/src/test/org/apache/lucene/util/TestSmallFloat.java
        [source-patterns] javadoc-style license header: lucene/highlighter/src/java/org/apache/lucene/search/highlight/DefaultEncoder.java
        [source-patterns] javadoc-style license header: lucene/highlighter/src/java/org/apache/lucene/search/highlight/Encoder.java
        [source-patterns] javadoc-style license header: lucene/highlighter/src/java/org/apache/lucene/search/highlight/SimpleHTMLEncoder.java
        [source-patterns] javadoc-style license header: lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/WeightedFieldFragList.java
        [source-patterns] javadoc-style license header: lucene/highlighter/src/java/org/apache/lucene/search/vectorhighlight/WeightedFragListBuilder.java
        [source-patterns] javadoc-style license header: lucene/highlighter/src/test/org/apache/lucene/search/vectorhighlight/WeightedFragListBuilderTest.java
        [source-patterns] javadoc-style license header: lucene/misc/src/java/org/apache/lucene/document/LazyDocument.java
        [source-patterns] javadoc-style license header: lucene/misc/src/java/org/apache/lucene/misc/IndexMergeTool.java
        [source-patterns] javadoc-style license header: lucene/misc/src/java/org/apache/lucene/uninverting/FieldCacheSanityChecker.java
        [source-patterns] javadoc-style license header: lucene/misc/src/test/org/apache/lucene/uninverting/TestFieldCache.java
        [source-patterns] javadoc-style license header: lucene/misc/src/test/org/apache/lucene/uninverting/TestFieldCacheSanityChecker.java
        [source-patterns] javadoc-style license header: lucene/queries/src/java/org/apache/lucene/queries/mlt/MoreLikeThis.java
        [source-patterns] javadoc-style license header: lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/standard/parser/StandardSyntaxParser.java
        [source-patterns] javadoc-style license header: lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/standard/parser/StandardSyntaxParserTokenManager.java
        [source-patterns] javadoc-style license header: lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/standard/processors/BooleanQuery2ModifierNodeProcessor.java
        [source-patterns] javadoc-style license header: lucene/suggest/src/java/org/apache/lucene/search/spell/NGramDistance.java
        [source-patterns] javadoc-style license header: lucene/test-framework/src/java/org/apache/lucene/codecs/bloom/TestBloomFilteredLucenePostings.java
        [source-patterns] javadoc-style license header: solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/HeartBeater.java
        [source-patterns] javadoc-style license header: solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/SolrMapper.java
        [source-patterns] javadoc-style license header: solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/SolrOutputFormat.java
        [source-patterns] javadoc-style license header: solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/SolrRecordWriter.java
        [source-patterns] javadoc-style license header: solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/SolrReducer.java
        [source-patterns] javadoc-style license header: solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/Utils.java
        [source-patterns] javadoc-style license header: solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/ZooKeeperInspector.java
        [source-patterns] javadoc-style license header: solr/contrib/morphlines-cell/src/java/org/apache/solr/morphlines/cell/SolrCellBuilder.java
        [source-patterns] javadoc-style license header: solr/contrib/morphlines-core/src/java/org/apache/solr/morphlines/solr/TokenizeTextBuilder.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollector.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollectorException.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/stats/CollectionStats.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/stats/LocalStatsCache.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/stats/LocalStatsSource.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/stats/StatsCache.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/stats/StatsSource.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/stats/StatsUtil.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/search/stats/TermStats.java
        [source-patterns] javadoc-style license header: solr/core/src/java/org/apache/solr/store/blockcache/CachedIndexOutput.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/TestDocumentBuilder.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/cloud/CdcrReplicationHandlerTest.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/search/TestElisionMultitermQuery.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/search/stats/TestBaseStatsCache.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/search/stats/TestDefaultStatsCache.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/search/stats/TestExactSharedStatsCache.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/search/stats/TestExactStatsCache.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/search/stats/TestLRUStatsCache.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/update/AddBlockUpdateTest.java
        [source-patterns] javadoc-style license header: solr/core/src/test/org/apache/solr/util/PrimUtilsTest.java
        
        BUILD FAILED
        C:\Users\Uwe Schindler\Projects\lucene\trunk-lusolr1\build.xml:130: 108 source files contain @author javadoc tags, tabs, svn keywords, javadoc-style licenses, or nocommits.
        
        Total time: 21 seconds
        

        This task is about twice as fast as the old one.

        Robert Muir added a comment -

        +1, this is great

        Uwe Schindler added a comment -

        New patch that fixes bugs and improves the scanner (it had false positives before). The license check is a bit more complex and is implemented in a separate code path: first a javadoc comment is detected, and if it contains "Licensed to", it is reported as a javadoc-style license header.
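
The two-step check described here could look roughly like the following Java sketch. It is an assumption-laden illustration, not the code in the patch: the class name, method name, and the exact pattern are made up for the example. Each javadoc-style comment is matched on its own first, and only then is its body tested for the phrase "Licensed to", so an ordinary javadoc comment or a plain /* license header is not flagged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LicenseHeaderSketch {
  // Step 1: any javadoc-style comment; DOTALL lets it span several lines.
  static final Pattern JAVADOC =
      Pattern.compile("/\\*\\*(.*?)\\*/", Pattern.DOTALL);

  static boolean hasJavadocStyleLicense(String content) {
    Matcher m = JAVADOC.matcher(content);
    while (m.find()) {
      // Step 2: flag the comment only if its body holds the license phrase.
      if (m.group(1).contains("Licensed to")) {
        return true;
      }
    }
    return false;
  }
}
```

With this split, a file that has a plain /* license header followed by a real /** javadoc comment passes, while a file whose license text sits inside /** ... */ is reported.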

        For now I left out js and xml files, because a lot of them contain tabs. We should fix this, too. Any comments?

        Uwe Schindler added a comment -

        I will commit and backport this now, because the patch is quite large.

        ASF subversion and git services added a comment -

        Commit 1695380 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1695380 ]

        LUCENE-6732: Improve checker for invalid source patterns to also detect javadoc-style license headers. Use Groovy to implement the checks instead of plain Ant

        ASF subversion and git services added a comment -

        Commit 1695386 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1695386 ]

        Merged revision(s) 1695380 from lucene/dev/trunk:
        LUCENE-6732: Improve checker for invalid source patterns to also detect javadoc-style license headers. Use Groovy to implement the checks instead of plain Ant

        Uwe Schindler added a comment -

        I will leave this open to fix the JS and XML files in Solr.

        ASF subversion and git services added a comment -

        Commit 1695395 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1695395 ]

        LUCENE-6732: Remove tabs in JS and XML files

        ASF subversion and git services added a comment -

        Commit 1695401 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1695401 ]

        Merged revision(s) 1695395 from lucene/dev/trunk:
        LUCENE-6732: Remove tabs in JS and XML files

        ASF subversion and git services added a comment -

        Commit 1695405 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1695405 ]

        LUCENE-6732: Remove tabs XSL files

        ASF subversion and git services added a comment -

        Commit 1695407 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1695407 ]

        Merged revision(s) 1695405 from lucene/dev/trunk:
        LUCENE-6732: Remove tabs XSL files

        Uwe Schindler added a comment -

        I improved the checker. It now detects all licenses inside javadoc comments; it uses Apache RAT (which is already loaded) to do that.

        And I found more offenders!

        Uwe Schindler added a comment -

        Patch using Apache RAT to detect whether a javadoc comment is a license:

        • first it finds all javadoc comments via regex (as before)
        • instead of just checking for "Licensed to" inside, it now passes the inner match from the previous step to the Apache RAT license checker. If that detects a license, the file is reported as an error.
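
The two steps listed above amount to separating "find the comment" from "decide whether it is a license". A rough Python sketch of that shape follows. The Apache RAT API is deliberately not used here: `naive_detector` is a hypothetical stand-in for the real matcher, and `find_license_javadocs` is an illustrative name, not something from the patch.

```python
import re
from typing import Callable, List

# Same comment-finding step as before: one regex over the whole file content.
JAVADOC = re.compile(r"/\*\*(?!/)(.*?)\*/", re.DOTALL)


def find_license_javadocs(source: str, is_license: Callable[[str], bool]) -> List[int]:
    """Return 1-based line numbers of javadoc comments whose body is_license() accepts."""
    hits = []
    for m in JAVADOC.finditer(source):
        # Only the inner match (the comment body) is handed to the detector.
        if is_license(m.group(1)):
            hits.append(source.count("\n", 0, m.start()) + 1)
    return hits


def naive_detector(body: str) -> bool:
    """Hypothetical stand-in for a real license matcher such as Apache RAT."""
    # Drop the leading '*' decoration and collapse whitespace so that phrases
    # wrapped across comment lines still match.
    text = " ".join(body.replace("*", " ").split()).lower()
    return "licensed under the apache license" in text or "licensed to" in text


source = (
    "package a;\n"
    "/**\n * Licensed under the\n * Apache License, Version 2.0\n */\n"
    "class A {}\n"
    "/** Real docs. */\n"
)
print(find_license_javadocs(source, naive_detector))
```

Passing the detector in as a parameter keeps the scanning loop unchanged when the simple substring test is swapped for a stronger matcher.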
        ASF subversion and git services added a comment -

        Commit 1695496 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1695496 ]

        LUCENE-6732: Improve javadoc-style license checker to use Apache RAT

        ASF subversion and git services added a comment -

        Commit 1695499 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1695499 ]

        Merged revision(s) 1695496 from lucene/dev/trunk:
        LUCENE-6732: Improve javadoc-style license checker to use Apache RAT

        ASF subversion and git services added a comment -

        Commit 1695586 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1695586 ]

        LUCENE-6732: More filetypes to check

        ASF subversion and git services added a comment -

        Commit 1695587 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1695587 ]

        Merged revision(s) 1695586 from lucene/dev/trunk:
        LUCENE-6732: More filetypes to check

        ASF subversion and git services added a comment -

        Commit 1695669 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1695669 ]

        LUCENE-6732: Scan txt files in root folder, too. TODO: scan txt files everywhere

        ASF subversion and git services added a comment -

        Commit 1695670 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1695670 ]

        Merged revision(s) 1695669 from lucene/dev/trunk:
        LUCENE-6732: Scan txt files in root folder, too. TODO: scan txt files everywhere

        Uwe Schindler added a comment -

        I added some more txt files that caused the smoketester to fail (because of the way Solr's src.tgz contains the changes.html in the docs/ folder).

        The only file type that is not yet checked globally is txt, and it causes headaches:

        • some resource or test files contain tabs. Maybe some of them (import files that are tab-separated values) can just be renamed to *.tsv (the standard file ending for that)
        • there are also some stopwords files (or similar) with tabs; we have to check those
        • some licenses in the licenses/ folder have tabs. This is easy to fix.

        I will keep this open until these are fixed.
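
To illustrate why renaming tab-separated data files to *.tsv would help, here is a small Python sketch of a tab check that is keyed on the file extension. The extension set and the name `files_with_tabs` are assumptions for illustration; the real checker's file list lives in build.xml and may differ.

```python
from pathlib import Path
from typing import Iterable, List

# Hypothetical list of checked extensions. ".tsv" is deliberately absent,
# because tabs are the delimiter in that format.
CHECKED_EXTENSIONS = {".java", ".js", ".xml", ".xsl", ".txt"}


def files_with_tabs(paths: Iterable[Path]) -> List[Path]:
    """Return the checked files that contain at least one tab character."""
    offenders = []
    for path in paths:
        if path.suffix.lower() not in CHECKED_EXTENSIONS:
            continue  # e.g. *.tsv is skipped entirely
        if "\t" in path.read_text(encoding="utf-8"):
            offenders.append(path)
    return offenders
```

With such a filter, a tab-separated import file named data.txt would be reported, while the same content in data.tsv would pass.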

        Mikhail Khludnev added a comment -

        An observation: something weird happened with one of the files in my working copy, and the validation script failed with a rather laconic message:

        java.io.IOException: Input/output error
        

        The exception does not include the path of the problematic file. This is not a bug in the script, just a lack of usability. Do you think it is worth improving the exception reporting in the Groovy script? I'm not familiar with it, but I can try.
        For reference, here is the stack trace:

        Caused by: java.io.IOException: Input/output error
        	at java.io.FileInputStream.readBytes(Native Method)
        	at java.io.FileInputStream.read(FileInputStream.java:255)
        	at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
        	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
        	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
        	at java.io.InputStreamReader.read(InputStreamReader.java:184)
        	at java.io.BufferedReader.read1(BufferedReader.java:210)
        	at java.io.BufferedReader.read(BufferedReader.java:286)
        	at java.io.Reader.read(Reader.java:140)
        	at org.codehaus.groovy.runtime.IOGroovyMethods.getText(IOGroovyMethods.java:884)
        	at org.codehaus.groovy.runtime.ResourceGroovyMethods.getText(ResourceGroovyMethods.java:588)
        	at org.codehaus.groovy.runtime.dgm$964.invoke(Unknown Source)
        	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
        	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
        	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        	at embedded_script_in__Users_mkhl_Documents_lucene_solr_https_5x_build_dot_xml$_run_closure3.doCall(embedded_script_in__Users_mkhl_Documents_lucene_solr_https_5x_build_dot_xml:60)
        
        Uwe Schindler added a comment -

        This is a standard Java exception. There is no problem with it. Ant or Java would report something similar. You cannot really improve this; the operating system does not give more information. This is basic Ant semantics of using FileScanner and opening files.

        The only thing you can do is log the filename with debug priority, so you can try with ant -verbose. Should I do this?

        Mike Drob added a comment -

        log the filename

        +1

        Uwe Schindler added a comment -

        If you want to fix the underlying issue, open a bug in the Java bug tracker and request that file streams include the file name in the exception message.

        The Groovy script is fine, so please leave it as is. Maybe add debug logging for investigation, as said before.

        Uwe Schindler added a comment -

        Simple patch.

        The output won't change, but if you get an error like this try: ant -verbose validate

        I will commit this later. The behaviour is now identical to the <copy/> task, which also prints all files when verbose.
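
The idea behind the verbose logging can be sketched in Python as follows. This is an illustration only, not the committed Groovy change; the logger name and `read_sources` are hypothetical. Each file name is logged at debug level before the file is read, so when a read fails with a bare I/O error, the last debug line names the file.

```python
import logging
from pathlib import Path
from typing import Dict, Iterable

log = logging.getLogger("source-patterns")


def read_sources(paths: Iterable[Path]) -> Dict[Path, str]:
    """Read every file into memory, logging each name at DEBUG level first."""
    texts = {}
    for path in paths:
        # Logged before the read: if read_text() raises, this is the last debug line.
        log.debug("Scanning source file: %s", path)
        texts[path] = path.read_text(encoding="utf-8")
    return texts
```

In this sketch, calling `logging.basicConfig(level=logging.DEBUG)` plays the role of running `ant -verbose validate`; at the default level the output is unchanged.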

        ASF subversion and git services added a comment -

        Commit 1718479 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1718479 ]

        LUCENE-6732: Improve logging, add verbose logging of filenames

        ASF subversion and git services added a comment -

        Commit 1718480 from Uwe Schindler in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1718480 ]

        Merged revision(s) 1718479 from lucene/dev/trunk:
        LUCENE-6732: Improve logging, add verbose logging of filenames

        Uwe Schindler added a comment -

        Fixed. Thanks Mikhail!


          People

          • Assignee:
            Uwe Schindler
            Reporter:
            Uwe Schindler