I think with this, everything OpenAI demonstrated ~5 weeks ago has been recreated by actually-open AI. Even if it runs much, much slower on prosumer hardware and gives worse results, at least it is de-magicked.
It'll work! I just haven't touched any of the 4-bit stuff myself, so I don't personally know how to add it. It's great low-hanging fruit for anyone else to take on.
Do you reckon the 4bit quantized Vicuna just won't do here? https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-1...