We wanted to use ONNX Runtime as a "model driver" for MD simulations, so that any ML model could be used for molecular dynamics. The problem was that it was way too immature. For example, the ceiling function only works with single precision in ONNX. But the biggest issue was that we could not take derivatives in ONNX Runtime, so any complicated model that uses derivatives internally was a no-go. Does that limitation still exist? Do you know if it can take derivatives in training mode now?
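To make the limitation concrete, here is a hedged sketch of the pattern ML potentials use: the forward pass itself needs a derivative, since forces are the negative gradient of the predicted energy with respect to atomic positions. The toy network and sizes are made up for illustration; this assumes PyTorch is available.

```python
# Toy example of a model that takes derivatives *inside* inference:
# an ML potential where forces = -dE/d(positions). The energy network
# below is a made-up placeholder, not any real MD model.
import torch
import torch.nn as nn

# toy energy model: maps per-atom coordinates to a scalar energy
energy_net = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))

positions = torch.randn(5, 3, requires_grad=True)  # 5 atoms in 3D
energy = energy_net(positions).sum()

# this autograd call is the step a plain ONNX inference graph cannot express
forces = -torch.autograd.grad(energy, positions)[0]
print(forces.shape)
```

The autograd call in the middle of "inference" is exactly what an export to a static ONNX graph has trouble representing.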
Yeah, ONNX Runtime is mostly used for inference. The requirements for training and inference differ quite a lot: training requires a library that can calculate gradients for backpropagation, loop over large datasets, split the model across multiple GPUs, etc. During inference you need to run a (often quantized) version of the model on specific target hardware, whether that's CPU, GPU, or mobile. So typically you use one library for training and convert the model to a different library for deployment.
> And this is a superficial difference carried from old days when we need to do deployment and deployment-specific optimizations.
Is it? From what I understand, to use an analogy, ONNX is the bytecode specification plus the JVM, whereas PyTorch, TF, and other frameworks combined with converter tools are the Java compilers.
That's true if your model is statically shaped. But ONNX is also a collection of C++ code, a C++ library for doing shape inference. You can't easily project that C++ code into another high-level programming language.
The model itself doesn't contain C++ code. The runtime supports Python, JS, etc. I don't think they are shipping a copy of clang as part of the rt. I could be wrong, I haven't looked. Of course it needs the scaffolding to get data in and out.
I just did an install of the runtime on Python (`pip install onnxruntime`). Here are the additional packages it installs.
ONNX is a spec. ONNX Runtime is one implementation of ONNX; there are others too. But ONNX is not a text spec like the RFCs for network protocols: ONNX is also a collection of C/C++ code, and ONNX's implementations rely on that code to do type and shape inference. My point was: if someone wants to implement ONNX (write a library that can load and run ONNX models), they have to reuse this C/C++ code or rewrite it entirely in their favorite programming language (which I don't think is very practical).
If an ONNX implementation wants to do codegen, like XLA does, then it is usually based on LLVM and needs to ship with a copy of LLVM.