Does a decent job at chatting, but it cannot follow output-structure directions, which limits its usefulness somewhat, though I have to test more around that.
That said, it's still a LLaMA fine-tune, so it's mostly not an option for commercial use. They do have a Pythia option, which works worse in every significant way.
The shared reinforcement learning data is extremely valuable though; it will be interesting to see the model trained on it in the coming months.