Since Folax draws its responses from ChatGPT, it can hallucinate and present incorrect answers, often very confidently. Once again, the only way to remedy this is to upgrade to newer models, such as GPT-4 (or equivalent), which hallucinate less and give more accurate responses.
Counterpoint: Moving to GPT-4 makes it harder to realise when the reply is complete bullshit.
This is why ChatGPT needs to provide sources and references, but since it scraped things indiscriminately, that would land them in legal trouble. There are services like perplexity[.]ai that use an internet search plugin for ChatGPT (I think) and list sources. Much better if you want to check the validity of the things it spits out.
Yeah, this seems like a really tough problem with LLMs. From memory, OpenAI have said they are hoping to see a big improvement next year, which is a pretty long time given the rapid pace of everything else in the AI space.
I really hope they or others can make some big strides here because it really limits the usefulness of these models.
The whole problem I have is the models are rewarded/refined for believability and not for accuracy.
Once there is enough LLM-generated shit on the web, it will be used (most likely inadvertently) to train newer LLMs, and we will be in such a garbage-in, garbage-out deluge of accurate-sounding bullshit that much of the web will become useless.
Yeah, 100% with you on that. I think the folks building these things are also aware of this issue, and maybe that's one of the reasons why ChatGPT's training set still ends in 2021. We'll have to wait and see what new solutions and techniques come along, but for now I think we're going to be stuck with this problem for a while.