The EntityProcessor org.apache.stanbol.entityhub.indexing.core.processor.FieldValueFilter filters Entities based on the values of a field. The typical use case is to filter Entities by their rdf:type statements.
The processor is typically configured in indexing/config/entityTypes.properties.
Usage examples include:
- only index Persons: values=schema:Person
- index Persons and Organizations: values=schema:Person;schema:Organization
- index Persons and Entities without type: values=schema:Person;null
- index everything other than Persons: values=*;!schema:Person
- exclude Entities without a type: values=*;!null
While using the tool I noticed that having to include the * when writing exclusions is not intuitive. This issue changes the behavior so that the * is no longer needed when only exclusions are configured, e.g.:
- index everything other than Persons: values=!schema:Person
- exclude Entities without a type: values=!null
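
The proposed semantics can be illustrated with a minimal Java sketch. This is not the actual FieldValueFilter code: the class name ValueFilterSketch and its accept method are hypothetical. It also assumes that an Entity without a type is matched by the literal "null", that * matches untyped Entities as well, and that an exclusion takes precedence when an Entity has several types.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the proposed "values" semantics; not the Stanbol implementation. */
public class ValueFilterSketch {
    private final Set<String> included = new HashSet<>();
    private final Set<String> excluded = new HashSet<>();
    private boolean includeAll = false;

    public ValueFilterSketch(String values) {
        boolean hasInclude = false;
        for (String raw : values.split(";")) {
            String value = raw.trim();
            if (value.isEmpty()) {
                continue; // ignore empty entries such as a trailing ';'
            }
            if (value.startsWith("!")) {
                excluded.add(value.substring(1)); // exclusion, e.g. !schema:Person
            } else if (value.equals("*")) {
                includeAll = true;                // explicit wildcard
                hasInclude = true;
            } else {
                included.add(value);              // plain inclusion
                hasInclude = true;
            }
        }
        // Proposed change: a configuration made up only of exclusions
        // behaves as if '*' had been written explicitly.
        if (!hasInclude && !excluded.isEmpty()) {
            includeAll = true;
        }
    }

    /** Returns true if an Entity with the given field values should be indexed. */
    public boolean accept(Set<String> types) {
        // Assumption: an Entity without any value is represented by "null".
        Set<String> effective = types.isEmpty() ? Collections.singleton("null") : types;
        for (String type : effective) {
            if (excluded.contains(type)) {
                return false; // assumption: exclusions win over inclusions
            }
        }
        if (includeAll) {
            return true;
        }
        for (String type : effective) {
            if (included.contains(type)) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, new ValueFilterSketch("!schema:Person") and new ValueFilterSketch("*;!schema:Person") accept exactly the same Entities, so existing configurations that already include the * would keep working.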