Description
We need a convenient API for model inference. The current idea is to use the Service Grid for this purpose. We should have two options: the first is to deliver a model to any node (server or client) and infer the model on that node; the second is to pin a model to a specific server and infer the model on that server, which is useful when the model requires specific hardware, such as a GPU or TPU, that is not available on every server.
So the first approach is suitable for lightweight models, and the second approach is suitable for complex models such as neural networks. A sketch of what both options might look like follows.
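A minimal sketch, assuming the standard Service Grid deployment API. Only the IgniteServices and ServiceConfiguration calls are existing Ignite API; the InferenceService interface, the ModelInferenceService implementation, and the "GPU" node attribute used by the node filter are hypothetical placeholders for illustration.

{code:java}
import java.io.Serializable;
import java.util.function.Function;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceConfiguration;
import org.apache.ignite.services.ServiceContext;

/** Hypothetical caller-facing interface, accessed through a service proxy. */
interface InferenceService {
    double predict(double[] features);
}

/** Hypothetical Service Grid service that hosts a deployed model. */
class ModelInferenceService implements Service, InferenceService, Serializable {
    /** Placeholder for whatever model abstraction the API ends up using. */
    private transient Function<double[], Double> model;

    @Override public void init(ServiceContext ctx) {
        // Load/deserialize the model on the node the service is deployed to.
        model = features -> 0.0; // stub model for illustration
    }

    @Override public void execute(ServiceContext ctx) {
        // Request-driven service: nothing to do in the execution loop.
    }

    @Override public void cancel(ServiceContext ctx) {
        // Release model resources (GPU memory, file handles, etc.).
        model = null;
    }

    @Override public double predict(double[] features) {
        return model.apply(features);
    }
}

public class ModelServingExample {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        // Option 1: deliver the model to any node -- a single instance
        // somewhere in the cluster, placement chosen by Ignite.
        ignite.services().deployClusterSingleton("inference", new ModelInferenceService());

        // Option 2: pin the model to servers with specific hardware, selected
        // by a node filter (the "GPU" node attribute is an assumption).
        ServiceConfiguration cfg = new ServiceConfiguration();
        cfg.setName("gpuInference");
        cfg.setService(new ModelInferenceService());
        cfg.setTotalCount(1);
        cfg.setNodeFilter(node -> Boolean.TRUE.equals(node.attribute("GPU")));
        ignite.services().deploy(cfg);

        // Inference from any node via a service proxy.
        InferenceService svc =
            ignite.services().serviceProxy("inference", InferenceService.class, false);
        System.out.println(svc.predict(new double[] {1.0, 2.0}));
    }
}
{code}

The node filter is the only difference between the two deployments: option 1 lets Ignite place the service on any node, while option 2 restricts candidates to servers whose configuration advertises the required hardware.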
Issue Links
- Is contained by IGNITE-10286 [ML] Umbrella: Model serving (Open)