And the forking Microsoft-owned code warehouse doesn't see this as much of a problem
Researchers at Truffle Security have found, or arguably rediscovered, that data from deleted GitHub repositories (public or private) and from deleted copies (forks) of repositories isn't necessarily deleted.
Joe Leon, a security researcher with the outfit, said in an advisory on Wednesday that being able to access deleted repo data – such as API keys – represents a security risk. And he proposed a new term to describe the alleged vulnerability: Cross Fork Object Reference (CFOR).
"A CFOR vulnerability occurs when one repository fork can access sensitive data from another fork (including data from private and deleted forks)," Leon explained.
For example, the firm showed how one can fork a repository, commit data to it, delete the fork, and then access the supposedly deleted commit data via the original repository.
The researchers also created a repo, forked it, and showed how data not synced with the fork continues to be accessible through the fork after the original repo is deleted. You can watch that particular demo.
Yup. Along with the code from huge organizations. I always thought it was funny that people put their code online, blindly trusting some random company that got gobbled up by Microsoft.
Not just out there. I am regenerating your spaghetti code into a new context with Copilot 🧑‍✈️
Your (ai-regenerated) code will be driving our military nuclear launch code base! Congratulations!
Well, sort of. GitHub certainly could refuse to render orphan commits. They pop up a banner saying so but I don't see why they should show the commit at all. They could still keep the data until it's garbage collected since a user might re-upload the commit in a new branch.
This seems like a non-issue though since someone who hasn't already seen the disclosed information would need to somehow determine the hash of the deleted commit.
Ah - actually reading the article reveals why this is an issue:
What's more, Ayrey explained, you don't even need the full identifying hash to access the commit. "If you know the first four characters of the identifier, GitHub will almost auto-complete the rest of the identifier for you," he said, noting that with just sixty-five thousand possible combinations for those characters, that's a small enough number to test all the possibilities.
So enumerating all the orphan commits wouldn't be that hard.
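For a sense of scale, here is a minimal Python sketch of that search space. It only generates the candidate four-character hex prefixes and makes no network requests; the function name `short_hash_prefixes` is my own invention, and whether any given prefix resolves to a single commit depends on the repository.

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def short_hash_prefixes(length=4):
    """Yield every possible hex prefix of the given length (hypothetical helper)."""
    for digits in product(HEX_DIGITS, repeat=length):
        yield "".join(digits)


# Materialize the full list to count the candidates.
prefixes = list(short_hash_prefixes(4))
print(len(prefixes))  # 16**4 candidates
```

Four hex characters give 16^4 = 65,536 candidates, which lines up with the "sixty-five thousand" figure in the quote.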
In any case if a secret has been publicly disclosed, you should always assume it's still out there. For sure, rotate your keys.