
There are two kinds of runtime: training and inference. ONNX Runtime, as far as I know, is only for inference, which is open to all.


The training support is much less mature and much less widely used, but it does exist: https://onnxruntime.ai/docs/get-started/training-on-device.h... https://onnxruntime.ai/docs/get-started/training-pytorch.htm...


We wanted to use ONNX Runtime for a "model driver" for MD simulations, where any ML model can be used for molecular dynamics simulations. The problem was that it was way too immature. For example, the ceiling function only works with single precision in ONNX. But the biggest issue was that we could not take derivatives in ONNX Runtime, so any complicated model that uses derivatives internally was a no-go. Does that limitation still exist? Do you know if it can take derivatives in training mode now?

Eventually we went with PyTorch-only support for the time being, while still exploring OpenXLA in place of ONNX as a universal adapter: https://github.com/ipcamit/colabfit-model-driver
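
To give an idea of what I mean by derivatives inside the model: an interatomic potential predicts an energy, and the forces then have to come from the gradient of that energy with respect to the atomic positions. A toy PyTorch sketch (not our actual model):

    import torch

    def energy(positions):
        # toy pairwise energy, a stand-in for a real ML potential
        d = torch.pdist(positions)
        return torch.sum(torch.exp(-d))

    pos = torch.randn(8, 3, requires_grad=True)
    E = energy(pos)
    # forces = -dE/dpositions; this autograd step is what we couldn't express in ONNX Runtime
    forces = -torch.autograd.grad(E, pos)[0]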


Yea, ONNX Runtime is mostly used for inference. The requirements for training and inference differ quite a lot: training requires a library that can calculate gradients for backpropagation, loop over large datasets, split the model across multiple GPUs, etc. During inference you need to run a quantized version of the model on specific target hardware, whether that's CPU, GPU, or mobile. So typically you train with one library and convert the model to a different one for deployment.
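
As a concrete (illustrative) example of that split, you might train in PyTorch and then export the frozen graph plus weights for an inference engine to consume, roughly:

    import torch
    import torch.nn as nn

    # a toy "trained" model, standing in for whatever you actually trained
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
    model.eval()

    # export graph + weights to ONNX for the deployment side
    dummy = torch.randn(1, 16)
    torch.onnx.export(model, dummy, "model.onnx",
                      input_names=["input"], output_names=["output"])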


There's a training runtime too (and it enables edge training, as a sibling reply hopes for in the next decade).


And this is a superficial difference carried over from the old days when we needed to do deployment and deployment-specific optimizations.

With LoRA / QLoRA, my bet is that edge training capabilities will be just as important in the next decade. I don't have any citations though.
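
For anyone unfamiliar, LoRA freezes the pretrained weights and learns a small low-rank update on top, which is why it's cheap enough to contemplate on edge devices. A rough sketch of the idea (not any particular library's API):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        # y = base(x) + scale * x A^T B^T, with the base weight frozen
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # only the low-rank factors get trained
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))
            self.scale = alpha / rank

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)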


> And this is a superficial difference carried over from the old days when we needed to do deployment and deployment-specific optimizations.

Is it? From what I understand, to use an analogy, ONNX is the bytecode specification and the JVM, whereas PyTorch, TF, and other frameworks combined with conversion tools are the Java compilers.


ONNX is just a serialisation format (using protobuf, iirc) for the network, weights, etc.

Your training framework plus a suitable exporter is the compiler.

ONNX Runtime (which really has various backends), TensorRT, ... (whatever inference engine you are using) is your JVM.
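
You can see this by just loading a model with the reference Python package and walking the protobuf (hypothetical file name):

    import onnx

    model = onnx.load("model.onnx")           # it's a protobuf under the hood
    print(model.ir_version, model.opset_import)
    for node in model.graph.node:             # the operators
        print(node.op_type, list(node.input), list(node.output))
    for init in model.graph.initializer:      # the weights
        print(init.name, list(init.dims))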


That is my understanding: ONNX is the weights and the operators. You could then project that model into SPIR-V or Verilog, or run it via native code.


That is true if your model is statically shaped. ONNX is also a collection of C++ code: a C++ library for doing shape inference. You cannot project that C++ code to another high-level programming language.


The model itself doesn't contain C++ code. The runtime supports Python, JS, etc. I don't think they are shipping a copy of clang as part of the runtime. I could be wrong; I haven't looked. Of course it needs the scaffolding to get data in and out.

I just did an install of the runtime on Python ( pip install onnxruntime ) . Here are the additional packages it installs.

    Package       Version
    ------------- -------
    coloredlogs   15.0.1
    flatbuffers   23.5.26
    humanfriendly 10.0
    mpmath        1.3.0
    numpy         1.25.2
    onnxruntime   1.15.1
    packaging     23.1
    protobuf      4.23.4
    sympy         1.12
https://onnxruntime.ai/docs/install/
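
And using it really is just plain Python scaffolding, numpy in / numpy out (hypothetical file and input names):

    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession("model.onnx")
    x = np.random.randn(1, 16).astype(np.float32)
    # feed inputs by name, get back a list of output arrays
    outputs = sess.run(None, {"input": x})
    print(outputs[0].shape)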


ONNX is a spec. ONNX Runtime is an implementation of ONNX. There are other implementations too. But ONNX is not a text spec like the RFCs for network protocols. ONNX is also a collection of C/C++ code, and ONNX's implementations rely on this code to do type and shape inference. My point was: if someone wants to implement ONNX (write a library that can load and run ONNX models), they have to reuse this C/C++ code or rewrite it entirely in their favourite programming language (which I don't think is very practical).
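
For what it's worth, that shape inference is exposed through the reference Python package (backed, as far as I know, by the C++ implementation), e.g.:

    import onnx
    from onnx import shape_inference

    model = onnx.load("model.onnx")
    # fills in value_info for intermediate tensors where shapes can be deduced
    inferred = shape_inference.infer_shapes(model)
    for vi in inferred.graph.value_info:
        print(vi.name, vi.type.tensor_type.shape)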

If an ONNX implementation wants to do codegen, like what XLA does, then it is usually based on LLVM and needs to ship with a copy of LLVM.


The biggest problem with ONNX models is that you can't reshape them :/
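
One partial workaround, if you control the export, is to mark dynamic axes so a fixed shape isn't baked into the graph; something like this on the PyTorch side (illustrative names):

    import torch
    import torch.nn as nn

    model = nn.Linear(16, 4).eval()
    dummy = torch.randn(1, 16)
    # leave the batch dimension symbolic instead of fixing it to 1
    torch.onnx.export(model, dummy, "model_dyn.onnx",
                      input_names=["input"], output_names=["output"],
                      dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}})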



