Details
- Type: Improvement
- Status: To Do
- Priority: Major
- Resolution: Unresolved
- None
Description
Issue:
RNN.hybrid_forward() spends a significant amount of time in _rnn_param_concat.
Root Cause:
A simple LSTM (256 -> 256) was created with Gluon (gluon_lstm.py) and run for inference on CPU with the profiler enabled, which produced the output below. Refer to gluon_lstm.py for the code.
_rnn_param_concat takes more total time than the RNN operator: 103.813 ms over 76 calls versus 72.312 ms over 72 calls.
operator
=================
Name | Total Count | Time (ms) | Min Time (ms) | Max Time (ms) | Avg Time (ms)
---- | ----------- | --------- | ------------- | ------------- | -------------
RNN | 72 | 72.312 | 0.065 | 33.15 | 1.0043
_zeros | 788 | 1.76 | 0.001 | 0.011 | 0.0022
DeleteVariable | 1110 | 0.637 | 0 | 0.002 | 0.0006
SwapAxis | 462 | 2.223 | 0.003 | 0.015 | 0.0048
_rnn_param_concat | 76 | 103.813 | 1.122 | 2.08 | 1.366
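To put the profiler figures above in proportion, here is a minimal sketch that recomputes the per-call averages and compares the operators' total times. The numbers are copied from the table. The dictionary layout and the helper names (`average_ms`, `share_of_total`) are illustrative assumptions and are not part of gluon_lstm.py or of MXNet.

```python
# Rows copied from the profiler table: name -> (call count, total time in ms).
profile = {
    "RNN": (72, 72.312),
    "_zeros": (788, 1.76),
    "DeleteVariable": (1110, 0.637),
    "SwapAxis": (462, 2.223),
    "_rnn_param_concat": (76, 103.813),
}


def average_ms(name):
    """Average time per call in ms (total time divided by call count)."""
    count, total_ms = profile[name]
    return total_ms / count


def share_of_total(name):
    """Fraction of the summed time of the listed operators spent in `name`."""
    grand_total = sum(total_ms for _, total_ms in profile.values())
    return profile[name][1] / grand_total


# Ratio of total time: parameter concatenation versus the RNN operator itself.
concat_vs_rnn = profile["_rnn_param_concat"][1] / profile["RNN"][1]

if __name__ == "__main__":
    for name in profile:
        print(f"{name:20s} avg {average_ms(name):.4f} ms, "
              f"share {share_of_total(name):.1%}")
    print(f"_rnn_param_concat / RNN total time: {concat_vs_rnn:.2f}x")
```

On these figures, _rnn_param_concat costs roughly 1.4 times the total time of the RNN operator and accounts for a bit over half of the time across the five listed operators.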
Attachments