Apache NiFi / NIFI-7611

NiFi fails to index provenance events


    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 1.11.4
    • Fix Version/s: None
    • Component/s: None
    • Environment: Microsoft Windows Server 2016 Standard; Intel Xeon Gold 6140 CPU @ 2.30 GHz, 8 processors; 32 GB RAM; 877 GB total disk space
    • Flags: Important

      Description

      NiFi reports the error "Failed to index Provenance Events". The nifi-app.log shows the following:

      2020-07-08 09:00:00,406 ERROR [Index Provenance Events-4] o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
      org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
                      at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:681)
                      at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:695)
                      at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1281)
                      at org.apache.lucene.index.IndexWriter.addDocuments(IndexWriter.java:1257)
                      at org.apache.nifi.provenance.lucene.LuceneEventIndexWriter.index(LuceneEventIndexWriter.java:70)
                      at org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:202)
                      at org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:113)
                      at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
                      at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
                      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
                      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
                      at java.base/java.lang.Thread.run(Thread.java:834)
      Caused by: java.nio.file.FileSystemException: E:\nifi-storage\provenance_repository\lucene-8-index-1593163985970_11r.cfe: The process cannot access the file because it is being used by another process.
                      at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
                      at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
                      at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:108)
                      at java.base/sun.nio.fs.WindowsFileSystemProvider.newFileChannel(WindowsFileSystemProvider.java:120)
                      at java.base/java.nio.channels.FileChannel.open(FileChannel.java:292)
                      at java.base/java.nio.channels.FileChannel.open(FileChannel.java:345)
                      at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238)
                      at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:157)
                      at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.readEntries(Lucene50CompoundReader.java:105)
                      at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.<init>(Lucene50CompoundReader.java:69)
                      at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.getCompoundReader(Lucene50CompoundFormat.java:70)
                      at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:100)
                      at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:83)
                      at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:172)
                      at org.apache.lucene.index.ReadersAndUpdates.getReaderForMerge(ReadersAndUpdates.java:709)
                      at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4396)
                      at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4054)
                      at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
                      at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)

       

      The log files keep growing and eventually fill up the partition.
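
      The `Caused by` line in the trace names the file that could not be opened. Since the log is reported to grow large, a small script can pull those lines out instead of reading the log by hand. This is only a minimal sketch: `locked_files` is a hypothetical helper, and the pattern assumes the `FileSystemException` message has the `<path>: <reason>` shape seen in this report.

```python
import re

# Matches lines of the form seen in this report:
#   Caused by: java.nio.file.FileSystemException: <path>: <reason>
# The path itself contains "E:\" (colon followed by a backslash), so the
# separator we look for is a colon followed by a space.
CAUSE_RE = re.compile(
    r"Caused by: java\.nio\.file\.FileSystemException: (?P<path>.+?): (?P<reason>.+)$"
)


def locked_files(log_lines):
    """Return (path, reason) pairs for every matching 'Caused by' line."""
    found = []
    for line in log_lines:
        match = CAUSE_RE.search(line.strip())
        if match:
            found.append((match.group("path"), match.group("reason")))
    return found


# Sample input copied from the trace in this report.
sample = [
    "org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed",
    r"Caused by: java.nio.file.FileSystemException: "
    r"E:\nifi-storage\provenance_repository\lucene-8-index-1593163985970_11r.cfe: "
    "The process cannot access the file because it is being used by another process.",
]

for path, reason in locked_files(sample):
    print(path, "->", reason)
```

      To scan a real log, pass an open file object instead of `sample`; the function only iterates over lines, so it does not load the whole file into memory.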

       

      Provenance repository configuration (from nifi.properties):

       

      # Provenance Repository Properties

      nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository

      nifi.provenance.repository.debug.frequency=1_000_000

      nifi.provenance.repository.encryption.key.provider.implementation=

      nifi.provenance.repository.encryption.key.provider.location=

      nifi.provenance.repository.encryption.key.id=

      nifi.provenance.repository.encryption.key=

       

      # Persistent Provenance Repository Properties

      nifi.provenance.repository.directory.default=E:\\nifi-storage\\provenance_repository

      nifi.provenance.repository.directory.content1=F:\\nifi-storage\\provenance_repository

      nifi.provenance.repository.max.storage.time=24 hours

      # nifi.provenance.repository.max.storage.size=1 GB

      nifi.provenance.repository.max.storage.size=8 GB

      nifi.provenance.repository.rollover.time=30 secs

      # nifi.provenance.repository.rollover.size=100 MB

      nifi.provenance.repository.rollover.size=1 GB

      nifi.provenance.repository.query.threads=2

      nifi.provenance.repository.index.threads=4

      #default: nifi.provenance.repository.compress.on.rollover=true

      nifi.provenance.repository.compress.on.rollover=false

      nifi.provenance.repository.always.sync=false

      # Comma-separated list of fields. Fields that are not indexed will not be searchable. Valid fields are:
      # EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship, Details

      nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship

      # FlowFile Attributes that should be indexed and made searchable.  Some examples to consider are filename, uuid, mime.type

      nifi.provenance.repository.indexed.attributes=

      # Large values for the shard size will result in more Java heap usage when searching the Provenance Repository
      # but should provide better performance
      # nifi.provenance.repository.index.shard.size=500 MB

      nifi.provenance.repository.index.shard.size=4 GB

       

      # Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from
      # the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved.

      nifi.provenance.repository.max.attribute.length=65536

      nifi.provenance.repository.concurrent.merge.threads=2

       

      # Volatile Provenance Repository Properties

      nifi.provenance.repository.buffer.size=100000
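
      The size settings above (`max.storage.size`, `rollover.size`, `index.shard.size`) are set well above the values listed next to them (1 GB, 100 MB, 500 MB). As a rough aid for comparing them, the sketch below reads a properties fragment and converts NiFi-style size strings to bytes. The helper names `parse_properties` and `size_in_bytes` are made up for illustration, and the 1024-based units are an assumption about how NiFi interprets `MB` and `GB`.

```python
import re

# Assumption: binary units (1 KB = 1024 bytes).
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(B|KB|MB|GB|TB)", re.IGNORECASE)


def parse_properties(text):
    """Return active key/value pairs, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def size_in_bytes(value):
    """Convert a string such as '500 MB' or '4 GB' to a byte count."""
    match = SIZE_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a data size: {value!r}")
    number, unit = match.groups()
    return int(float(number) * UNITS[unit.upper()])


# Fragment copied from the configuration in this report.
SAMPLE = """
# nifi.provenance.repository.max.storage.size=1 GB
nifi.provenance.repository.max.storage.size=8 GB
nifi.provenance.repository.rollover.size=1 GB
nifi.provenance.repository.index.shard.size=4 GB
"""

props = parse_properties(SAMPLE)
prefix = "nifi.provenance.repository."
for name in ("max.storage.size", "rollover.size", "index.shard.size"):
    print(name, size_in_bytes(props[prefix + name]), "bytes")
```

      Whether any of these values contributes to the indexing failure is not established by this report; the sketch only makes the numbers easier to compare.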

            People

            • Assignee: Unassigned
            • Reporter: vojdal Michal W
