Profiling the entity creation flow showed that several calls are made to AtlasGraphUtilsV1.getVertexByUniqueAttributes.
Each of these calls queries the database through a graph query. Using an index query instead could improve this.
In experiments, replacing this method with an equivalent that uses indexQuery improved entity creation performance by 50%.
Also, when a large number of entities are created (typically using import_hive.sh), CPU usage on Atlas was reduced, because Solr was doing some of the work.
- Add a new method, AtlasGraphUtilsV1.getAtlasVertexFromIndexQuery, that uses AtlasGraphProvider.indexQuery to fetch vertices.
- Ensure that the generated query is escaped appropriately.
- Include logic to fall back to a graph query if the property being queried is not indexed.
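
The escaping and fallback steps above can be illustrated with a minimal, standalone sketch. It is not the Atlas implementation: the class name, the `field:"value"` query format, the set of indexed keys, and the two lookup callbacks are all hypothetical stand-ins for the real index and graph APIs. The escaped character set assumes the index backend accepts Lucene-style query syntax, as Solr does.

```java
import java.util.Set;
import java.util.function.BiFunction;
import java.util.function.Function;

class IndexQueryLookupSketch {
    // Characters with special meaning in Lucene-style query syntax (assumed backend).
    static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefix every special character with a backslash so it is matched literally.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Hypothetical query format: field:"escaped value".
    static String buildIndexQuery(String key, String value) {
        return key + ":\"" + escape(value) + "\"";
    }

    // Use the index lookup only when the property is indexed;
    // otherwise fall back to the (slower) graph lookup.
    static <V> V findVertex(String key, String value, Set<String> indexedKeys,
                            Function<String, V> indexLookup,
                            BiFunction<String, String, V> graphLookup) {
        if (indexedKeys.contains(key)) {
            return indexLookup.apply(buildIndexQuery(key, value));
        }
        return graphLookup.apply(key, value);
    }

    public static void main(String[] args) {
        Set<String> indexed = Set.of("qualifiedName"); // hypothetical indexed property
        // The callbacks only report which path was taken; real code would query the store.
        Function<String, String> viaIndex = q -> "index: " + q;
        BiFunction<String, String, String> viaGraph = (k, v) -> "graph: " + k + "=" + v;

        System.out.println(findVertex("qualifiedName", "db.tbl@cl(1)", indexed, viaIndex, viaGraph));
        System.out.println(findVertex("description", "db.tbl@cl(1)", indexed, viaIndex, viaGraph));
    }
}
```

The first call takes the index path with the parentheses escaped; the second takes the graph path because `description` is not in the indexed set. Passing the lookups as callbacks keeps the decision logic testable without a running graph or index.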
Since this is a high-impact change, it is worthwhile to verify the other dependent modules.