In one of our regular back-and-forth breaks a week or two ago, Lori and I were talking shop, and she shared an excellent report by GitGuardian with me. Frankly, none of the information in the report was a huge surprise to me: public Git repositories are filled with information that companies probably don’t want out there, but that is nearly impossible to track proactively. That’s what GitGuardian tries to do for you, but honestly, I’ve not yet dug into their product, so caveat emptor. Or read someone else’s opinion to help you decide if they’re a good idea; I’m just going to talk about the report.
The key takeaway is that you are asking your DevOps teams to move faster and faster, and shortcuts follow. Putting a section of code they are working on into a public repository is an easy way to “take work home” or share it, especially during lockdowns. It is also a good way to hand critical information about your infrastructure, or even critical bits of data like SSL keys or credentials, to anyone searching for it. And with the current level of cyberattacks against nearly every industry, I’d say there are people looking for that information, no matter what industry you are in.
This is not really a DevOps-specific issue; credentials and such in code have been a problem for IT for as long as I’ve been working in the field. At some point, any automated system needs to perform authentication, authorization and accounting (AAA), and that means the data required is stored somewhere. A multitude of approaches have been attempted to resolve the problem technologically, but none have been a silver bullet. Yet. Eventually, someone will figure out a way to make it work; then a whole new attack vector will arrive, probably aimed at the tool that is settled on to protect this very data.
This is a personnel/policy issue that is exacerbated by the speed of Agile and DevOps. Your best solution, at this time, is to write a clear policy that contains consequences for putting secret data into a public repository. We’re short on “consequences” right now, preferring to say, “Right, let’s make sure that can’t happen again,” but putting secrets out in public is a different level of ‘bad’ than a coding error getting checked in. Those secrets can actively be used against the organization, and if the not-to-be-shared data is clearly delineated, any violation can be viewed as intentional, be it because of convenience or malfeasance.
According to the report, there are millions of bits of privileged information out there. Not all of them are from enterprise IT shops—but some are. And some might be from your organization. Set expectations, get feedback. Make it clear you do not want to stop progress on delivery of systems, but you also don’t want to implement a system with a built-in back door for attackers to use. It is easy enough to use alternative delivery methods for those few bits of data, and put the bulk of the source out where it is needed.
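For what it’s worth, here is a minimal sketch of one such alternative delivery method, assuming the secret is handed to the program through an environment variable set by whatever runs it, rather than being written into the source. The function name `load_secret`, the exception `MissingSecretError`, and the variable name `DB_PASSWORD` are all hypothetical names chosen for illustration; none of them come from the report.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not supplied at runtime."""


def load_secret(name: str) -> str:
    """Return the secret stored in the environment variable `name`.

    Assumption for this sketch: the deployment environment injects the
    value, so it never appears in the repository itself.
    """
    value = os.environ.get(name)
    if not value:
        # Fail loudly instead of falling back to a hard-coded default,
        # which would put the secret right back into the source.
        raise MissingSecretError(f"Required secret {name!r} is not set")
    return value
```

A caller would use something like `password = load_secret("DB_PASSWORD")`. The code can then go into a public repository as-is, because the repository only ever contains the name of the secret and never its value.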
And keep rocking it. Towards Data Science says there are 128 million public repositories out there (Q1, 2021). Think how many lines of code are in those repositories, and how many secrets potentially are in there, too. Just make certain yours are not included in the list. No sense keeping your next great solution from running because your critical data leaked.