semioticbreakdown [she/her] @semioticbreakdown@hexbear.net · 0 posts · 40 comments · joined 3 wk. ago

  • My experience is that with ollama and deepseek r1 it reprocesses the think tags. They get referenced directly.

    This does happen (and I fucked with weird prompts for deepseek a lot, with very weird results), and I think it does cause what you described, but the CoT would get reprocessed in models without think tags too, just by normal CoT prompting, and I also would just straight up get other command tokens showing up in the output even on really short prompts with minimal CoT. So I kind of attributed it to issues with local deepseek being as small as it is. I can't find the paper, but naive CoT prompting works best with models that are already of a sufficient size, and the errors compound on smaller models with less generalization. Maybe something you could try would be parsing the think tags to remove the CoT before re-injection? I was contemplating doing this but I would have to set ollama up again.
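
    For reference, a rough sketch of what I mean by stripping the CoT before re-injection, assuming the ollama Python client and deepseek-r1's <think>...</think> delimiters (the model tag and loop are just illustrative, and the exact response shape may differ by client version):

    ```python
    import re

    import ollama  # assumes the ollama Python client is installed and the server is running

    THINK_RE = re.compile(r"<think>.*?</think>", flags=re.DOTALL)

    def chat_without_cot_reinjection(model: str = "deepseek-r1:8b") -> None:
        """Interactive loop that strips the <think>...</think> block from each reply
        before appending it to the history, so the CoT is never fed back in."""
        history = []
        while True:
            history.append({"role": "user", "content": input("> ")})
            reply = ollama.chat(model=model, messages=history)
            full_text = reply["message"]["content"]
            print(full_text)  # show the full output, CoT included, to the user
            # ...but keep only the post-CoT answer in the conversation history
            cleaned = THINK_RE.sub("", full_text).strip()
            history.append({"role": "assistant", "content": cleaned})
    ```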

    It's tough to say. I think an ideal experiment in my mind would be to measure hallucination rate in a baseline model, the same baseline model with CoT prompting, and the same baseline model tuned by RL to do CoT without prompting. I would also want to measure hallucination rate against conversation length separately for all of those models, and hallucination rate with/without CoT re-injection into chat history for the tuned CoT model. And also hallucination rate across task domains with task-specific fine-tuning...
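
    The kind of grid I'm picturing, as a pure sketch - the generate/is_hallucination stubs below are hypothetical placeholders for the actual model calls and the fact-check, not a real harness:

    ```python
    import random
    from itertools import product

    # Hypothetical stand-ins: a real experiment would call the actual models and a
    # fact-checking judge here; these just return dummy values so the loop runs.
    def generate(model, question, reinject_cot, turns):
        return f"answer from {model}"

    def is_hallucination(question, answer):
        return random.random() < 0.3

    MODELS = ["baseline", "baseline+cot_prompt", "rl_tuned_cot"]  # the three conditions
    REINJECT_COT = [True, False]                                  # keep CoT in history or strip it
    CONVERSATION_LENGTHS = [1, 5, 10, 20]                         # turns before the measured question

    def hallucination_rate(model, reinject_cot, n_turns, questions):
        errors = sum(is_hallucination(q, generate(model, q, reinject_cot, n_turns)) for q in questions)
        return errors / len(questions)

    if __name__ == "__main__":
        questions = [f"q{i}" for i in range(100)]
        for model, reinject, n_turns in product(MODELS, REINJECT_COT, CONVERSATION_LENGTHS):
            rate = hallucination_rate(model, reinject, n_turns, questions)
            print(f"{model:>22}  reinject={reinject!s:5}  turns={n_turns:2d}  rate={rate:.3f}")
    ```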

    Not only that, it hallucinated the character's backstory, which isn't even in the post, to give them a genetic developmental disorder

  • But asking the LLM to provide the CoT doesn't pollute the prompt history any more than the policy being tuned via RL or SFT techniques to generate the chain of thought does - the chain of thought is still being generated either way. I have seen it proposed to use automated CoT prompting, such that CoT examples are automatically injected into the prompt, but I haven't been able to find information on whether this is actually implemented in any of the widely available reasoning models.
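
    To be concrete about the difference: in prompt-side auto-CoT the worked reasoning examples are injected into the input, whereas a tuned reasoning model emits the CoT in its output from the bare question. A toy sketch of the prompt-side version (the exemplar store and wording are made up for illustration):

    ```python
    # Sketch of the prompt-side version of CoT: worked examples with reasoning steps
    # are prepended to the user's question, so the CoT lives in the *input*.
    # An RL/SFT-tuned reasoning model instead produces the CoT in its *output*
    # from the plain question alone. The exemplar below is illustrative.
    COT_EXEMPLARS = [
        {
            "question": "A bat and a ball cost $1.10 and the bat costs $1.00 more. Ball price?",
            "reasoning": "Let the ball be x. Then x + (x + 1.00) = 1.10, so 2x = 0.10 and x = 0.05.",
            "answer": "$0.05",
        },
    ]

    def build_auto_cot_prompt(user_question: str) -> str:
        """Prepend worked CoT exemplars to the question (auto-CoT-style injection)."""
        demos = "\n\n".join(
            f"Q: {ex['question']}\nReasoning: {ex['reasoning']}\nA: {ex['answer']}"
            for ex in COT_EXEMPLARS
        )
        return f"{demos}\n\nQ: {user_question}\nReasoning:"
    ```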

    I'm not saying this can't, or doesn't, affect it significantly as the trajectory of the policy goes wayyyyy outside its normal operating bounds, but I don't think that's what's happening here (and after digging into it I don't think it's model collapse either). If the hallucination rate is increasing across the board regardless of conversation length, then I don't think that's necessarily a result of CoT, but indicative of an issue with the trained policy itself, which might be because of fine-tuning. Especially when you consider that the SimpleQA metric is short-form fact answering.

    And on GPT-4.5's performance: it's also larger than even GPT-4, which had 1.76T params (a 16-expert MoE model), while o1 has something like 300B. o1's accuracy/consistency on SimpleQA also still outperforms GPT-4o, and it has a lower hallucination rate, but 4o is smaller anyway at ~200B. (source for param counts)

    As it turns out, after doing research, o3-mini only has 3B parameters, so I don't even think it's model collapse; it's just a tiny-as-hell model completely dedicated to math and science reasoning, which might be causing a "catastrophic forgetting" effect w.r.t. its ability to perform on other tasks, since it still outperforms GPT-4.5 on math/sci reasoning metrics (but does shit on coding), and based on this metric https://github.com/vectara/hallucination-leaderboard it actually hallucinates the least w.r.t. document summarization. So maybe the performance on SimpleQA should be taken as a reflection of how much "common sense" is baked into the parameters.

    o3-mini and o4-mini both still outperform GPT-4o-mini on SimpleQA despite being smaller, newer, CoT models. And despite the higher hallucination rate for o4-mini, it also has a higher accuracy rate than o3-mini on SimpleQA (per the SimpleQA repo). So I don't think this is telling us anything about CoT or dataset integrity. I think measuring hallucination for CoT vs a standard model will require a specific experiment on a base model tbh

    I am also skeptical that the public reasoning and reasoning safeguards are a cause of the hallucinations beyond the same issues as fine-tuning by RL. AFAIK neither of those changes the prompt at all; public reasoning just exposes the CoT (which is still part of the output), and the reasoning safeguards are trained into the responses or enforced by evaluation of the input by a separate model. So I don't think there are additional turns or prompt rewriting being injected at all (but I could be wrong on this, I had some trouble finding information).

    Ugh. This is all machine woo. I need a drink

  • Information isn't attached to the user's query; the CoT still happens in the output of the model, like in the first example that you gave. This can be done without any fine-tuning of the policy, but reinforcement learning can also be used to encourage the chat output to break the problem down into "logical" steps. Chat models have always passed the chat history back into the next input while appending the user's turn; that's just how they work (I have no idea if o1 passes the CoT into the chat history though, so I can't comment). But it wouldn't solely account for the massive degradation of performance between o1 and o3/o4
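
    The history mechanics I mean, in the most stripped-down form (a plain sketch, not any particular vendor's API):

    ```python
    def next_input(history: list[dict], user_turn: str) -> list[dict]:
        """Every turn, the entire prior conversation plus the new user message
        is what actually goes back into the model; there is no hidden memory."""
        return history + [{"role": "user", "content": user_turn}]

    # Illustrative turn: the model's previous output (including any CoT it emitted,
    # if the implementation keeps it) rides along in the next input.
    history = [
        {"role": "user", "content": "What is 17 * 23?"},
        {"role": "assistant", "content": "17 * 23 = 391."},
    ]
    model_input = next_input(history, "And divided by 2?")
    ```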

  • Well, the models are always refeeding their own output back into themselves recurrently; CoT prompting works by explicitly having the model write out the intermediate steps to reduce logical jumps in the written output. The production of the reasoning model's text is still happening statistically, so it's still prone to hallucination. My money is on the higher hallucination rate being a result of the data being polluted with synthetic information. I think it's model collapse
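
    By "explicitly having the model write out the intermediate steps" I mean the prompt-level trick, roughly like this (wording illustrative):

    ```python
    def zero_shot_cot(question: str) -> str:
        """Zero-shot CoT prompting: ask for the intermediate steps in the output so
        later tokens can condition on them; the answer is still sampled text."""
        return f"{question}\nLet's think step by step, then state the final answer."
    ```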

  • Yeah, it is. Whether the hallucination aligns with reality is incidental. When it does, it's interpreted as "correct output", and when it doesn't, it's interpreted as "hallucination" - but it's all hallucination. There is no thought, no reason, no cognition, no world

  • This is scary as hell to me. They have no idea what is causing these outputs and are cramming every piece of data they can into these things, and clearly the responses are already starting to drift significantly w.r.t. reality. The only thing keeping the scheme from falling apart is the fine-tuning, and I think as they incorporate synthetic data it's going to fuuuuuuuck itself. Even if they don't directly use LLM output in the dataset, the user input is becoming increasingly polluted with bad information produced by LLMs over time, as users enter it in conversations and it proliferates across the internet. Also, I was at a bookstore and caught sight of some new-agey book about using LLMs for magic - there's a common joke about LLMs being oracles, but no, some people are literally using them as oracles

    Also maybe you shouldn't include the whole Woo Canon in your statistical model that is incapable of distinguishing between fact and fiction?

    whaaaaat no way. that's good data baby!!!

  • The thing that really gets me is that even if the computational theory of mind holds, LLMs still don't constitute a cognitive agent. Cognition and consciousness as a form of computation does not mean that all computation is necessarily a form of cognition and consciousness. It's so painful. It's a substitution of signs of the real for the real; a symptom of consciousness in the form of writing has been reinterpreted to mean consciousness insofar as it forms a cohesive writing system. The model of writing has come to stand in for a mind that is writing, even if that model is nothing more than an ungrounded system of sign-interpretation that only gains meaning when it is mapped by conscious agents back into the real. It is neither self-reflective nor phenomenal. Screaming and pissing and shitting myself every time someone anthropomorphizes an LLM

  • The witting or unwitting use of synthetic data to train generative models departs from standard AI training practice in one important respect: repeating this process for generation after generation of models forms an autophagous (“self-consuming”) loop. As Figure 3 details, different autophagous loop variations arise depending on how existing real and synthetic data are combined into future training sets. Additional variations arise depending on how the synthetic data is generated. For instance, practitioners or algorithms will often introduce a sampling bias by manually “cherry picking” synthesized data to trade off perceptual quality (i.e., the images/texts “look/sound good”) vs. diversity (i.e., many different “types” of images/texts are generated). The informal concepts of quality and diversity are closely related to the statistical metrics of precision and recall, respectively [39]. If synthetic data, biased or not, is already in our training datasets today, then autophagous loops are all but inevitable in the future.
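
    A toy version of the loop the paper is describing, using a trivial Gaussian "model" and a mild cherry-picking (quality-over-diversity) bias so the collapse is easy to see - illustrative only, not the paper's actual experimental setup:

    ```python
    import random
    import statistics

    def autophagous_loop(generations: int = 10, n_samples: int = 2000, keep_fraction: float = 0.8):
        """Toy autophagous loop: a Gaussian "model" is refit on its own samples each
        generation, with mild cherry-picking (keep the samples nearest the mode,
        i.e. favor quality over diversity). The fitted spread shrinks steadily."""
        mu, sigma = 0.0, 1.0  # the initial "model", as if fit to real data
        for gen in range(generations):
            synthetic = [random.gauss(mu, sigma) for _ in range(n_samples)]
            # sampling bias: keep only the fraction of samples closest to the mean
            synthetic.sort(key=lambda x: abs(x - mu))
            kept = synthetic[: int(keep_fraction * n_samples)]
            mu, sigma = statistics.fmean(kept), statistics.stdev(kept)
            print(f"gen {gen:2d}: mu={mu:+.3f} sigma={sigma:.3f}")

    autophagous_loop()
    ```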

  • I knew the answer was "Yes" but it took me a fuckin while to find the actual sources again

    https://arxiv.org/pdf/2307.01850 https://www.nature.com/articles/s41586-024-07566-y

    the term is "Model collapse" or "model autophagy disorder" and any generative model is susceptible to it

    As to why it has not happened too much yet: curated datasets of human-generated content with minimal AI content.

    If it does happen: you could switch to an older version, yes, but to train new models with any new information past a certain point you would need to update the dataset while (ideally) introducing as little AI content as possible, which I think is becoming intractable with the widespread deployment of generative models.

  • Baudrillard moment: they're living in a fundamentally different constructed reality that has no relation to the real. It's not about intelligence, it's about subcultural narrative and the willful acceptance of a simulated reality that appeals to them on a fundamental level. Their identity has been reconstructed around consumption of particular strains of media that appeal to their fantasy. They choose deliberately to re-enter the fascist Matrix where reality and imagination have lost any boundary, and if I had to guess I'd point to that as the main reason they love AI slop. Stupidity is the wrong word, but burgerbrained is right. The burger, fries, and the Amerikkkan flag have all become fetishes not to a superstitious religious population, but to a population that has embraced religious superstition and anti-intellectualism on the grounds of consumptive identity.

    e: also I think they could be re-educated, I don't think it's impossible