A machine learning pipeline consists of two stages: model training and model inference. Model training is the process of fitting a model to existing data with known target values. Model inference is the process of making predictions on new data using the trained model.
It is important that a model can be trained in one environment or system and then used for inference in another. A trained model is an immutable object without side effects (in mathematical terms, a pure function). As a result, inference scales linearly: independent inferences can run in parallel in different threads or on different nodes.
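The following is a minimal sketch of why immutability makes parallel inference safe. It uses only the Java standard library, not the Apache Ignite or TensorFlow APIs. The `LinearModel` class, its weights, and the `predictAll` helper are hypothetical names invented for illustration; a real TensorFlow model would be far more complex, but the same reasoning applies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only: a "trained model" represented as an immutable object.
public final class ImmutableModelDemo {

    /** A toy linear model; weights and bias are fixed once "training" is done. */
    static final class LinearModel {
        private final double[] weights;
        private final double bias;

        LinearModel(double[] weights, double bias) {
            this.weights = weights.clone(); // defensive copy keeps the model immutable
            this.bias = bias;
        }

        /** Pure function: the result depends only on the input and the fixed parameters. */
        double predict(double[] features) {
            double sum = bias;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * features[i];
            }
            return sum;
        }
    }

    /** Runs inference for all inputs in parallel; no locking is needed because the model never changes. */
    static List<Double> predictAll(LinearModel model, List<double[]> inputs) {
        return inputs.parallelStream()
                .map(model::predict)
                .collect(Collectors.toList()); // results keep the order of the input list
    }

    public static void main(String[] args) {
        LinearModel model = new LinearModel(new double[] {2.0, -1.0}, 0.5);
        List<double[]> inputs = Arrays.asList(
                new double[] {1.0, 1.0},
                new double[] {0.0, 3.0},
                new double[] {4.0, 2.0});
        System.out.println(predictAll(model, inputs)); // prints [1.5, -2.5, 6.5]
    }
}
```

Because `predict` reads only final state, the same model instance can be shared by any number of threads, or copied to any number of nodes, without coordination.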
The goal of "TensorFlow model inference on Apache Ignite" is to let users easily import a pre-trained TensorFlow model into Apache Ignite, distribute it across the nodes of a cluster, provide a common interface for calling the model to make inferences, and perform load balancing so that the resources of all nodes are properly utilized.
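The idea of a common interface combined with load balancing can be sketched in plain Java. This is not the Apache Ignite API: `InferenceModel`, `Node`, and `RoundRobinGateway` are hypothetical names, the "nodes" are in-process objects standing in for cluster nodes, and round-robin is just one possible balancing strategy, assumed here for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration only: does not use Apache Ignite classes.
public final class RoundRobinInferenceDemo {

    /** Common interface for calling a model, regardless of where it is deployed. */
    interface InferenceModel<I, O> {
        O predict(I input);
    }

    /** Stand-in for a cluster node holding a copy of the model; counts the requests it serves. */
    static final class Node<I, O> implements InferenceModel<I, O> {
        private final Function<I, O> model;
        private final AtomicInteger served = new AtomicInteger();

        Node(Function<I, O> model) {
            this.model = model;
        }

        @Override
        public O predict(I input) {
            served.incrementAndGet();
            return model.apply(input);
        }

        int servedCount() {
            return served.get();
        }
    }

    /** Exposes the same interface and spreads calls over the nodes in round-robin order. */
    static final class RoundRobinGateway<I, O> implements InferenceModel<I, O> {
        private final List<Node<I, O>> nodes;
        private final AtomicInteger next = new AtomicInteger();

        RoundRobinGateway(List<Node<I, O>> nodes) {
            this.nodes = new ArrayList<>(nodes);
        }

        @Override
        public O predict(I input) {
            // floorMod keeps the index valid even if the counter overflows
            int idx = Math.floorMod(next.getAndIncrement(), nodes.size());
            return nodes.get(idx).predict(input);
        }
    }

    public static void main(String[] args) {
        List<Node<Double, Double>> nodes = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            nodes.add(new Node<Double, Double>(x -> 2 * x + 1)); // same toy model on every node
        }
        InferenceModel<Double, Double> gateway = new RoundRobinGateway<>(nodes);
        for (int i = 0; i < 6; i++) {
            gateway.predict((double) i);
        }
        for (Node<Double, Double> node : nodes) {
            System.out.println(node.servedCount()); // each node serves 2 of the 6 requests
        }
    }
}
```

The caller sees only `InferenceModel`, so the number of nodes and the balancing strategy can change without affecting client code.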