2mo ago

Qwen3 "Leaked"

qingy2024/Qwen3-0.6B · Hugging Face

Qwen3 was apparently posted early, then quickly pulled from Hugging Face and ModelScope. The large ones are MoEs, per screenshots from Reddit:

These include a 235B model with 22B active parameters and a 30B model with 3B active.
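For a rough sense of what those numbers mean, here is a quick sketch of the arithmetic. It assumes "active" means parameters used per token, as is usual for MoE models, and that the leaked sizes are accurate; the function and dictionary names are just made up for illustration.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters used per token, with both sizes in billions."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


# Sizes as reported in the leak (assumption: they may change at release).
LEAKED_MOE_SIZES = {
    "235B/22B active": (235, 22),
    "30B/3B active": (30, 3),
}

for label, (total_b, active_b) in LEAKED_MOE_SIZES.items():
    share = active_fraction(total_b, active_b)
    print(f"{label}: {share:.1%} of weights active per token")
```

So roughly a tenth of the weights would be used per token in both cases. The full weights still have to be loaded, so memory needs follow the total size, not the active one.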

But it's possible they're still training them to 256K:

Take it all with a grain of salt, since configs could change with the official release, but it appears the release is happening today.