A user on the online forum 4chan has leaked a massive 270GB of data purportedly belonging to The New York Times. This leak includes what is claimed to be the source code for the newspaper’s digital operations.
The user who posted the data claimed that The New York Times has over 5,000 source code...
The Times has filed several Digital Millennium Copyright Act, or DMCA, takedown notices to developers of Wordle-inspired games, which cited infringement on the Times’ ownership of the Wordle name, as well as its look and feel — such as the layout and color scheme of green, gray and yellow tiles.
Numerous impacted developers have also taken to social media to share their frustrations. Many said that their games, which range from Wordle-like offerings in other languages to more guessing games, would be taken down as a result.
Still, Brauneis said he believes the Times’ arguments for Wordle copyright infringement are on “a little bit shaky ground” for several reasons. Rules of a game, for example, are not covered by copyright — and that can include the layout of the game itself, he said.
Anything that may help develop better adblockers/paywall bypasses or exposes how/what of our personal information is collected is a win in my book. And this may very well be none of those things.
Did this leak happen before or after NYT published an investigation detailing how Israeli forces were raping and torturing defenseless Palestinian detainees brought in from the Gaza Strip?
That's a lot of data, but surely it's not all their articles, because I'd very much like to train Mixtral 8x7B on it along with 4chan data and shit from the dark web. Surely there is a project where such a model is public and being trained on literally everything regardless of legality.
you're getting downvoted because LLMs are simply not very good, they consume lots of energy (bad for climate), and seemingly most people involved in ai hype want to replace human creativity or something.
how about instead of training a not very trustworthy or useful LLM on lots of nyt, 4chan, and "dark web", you go read lots of nyt, 4chan, and dark web to train your own (much better) model (your brain).
They are very good; they exceed the capability of many humans in many tasks. If consuming energy = bad for the environment, then all electric vehicles are bullshit, because they have energy inefficiencies that petrol cars don't (thermodynamics is a bitch). You do realise the argument about whether asking an AI to create an image is art is literally the same argument that was had about whether photography is art.
LLMs are decently trustworthy, especially with chain-of-thought reasoning and tool capabilities. And they are extraordinarily useful; people wouldn't be using them and creating a market for them if they weren't. I can't train my brain and then share it for free for everyone on the internet to download. I can with an AI, though.