Facebook doesn't have the most stellar privacy and security track record, especially given that many of its notable gaffes were avoidable. But with billions of users and a gargantuan platform to defend, it's not easy to catch every flaw in the company's 100 million lines of code. So four years ago, Facebook engineers began building a customized assessment tool that not only checks for known types of bugs but can fully scan the entire codebase in under 30 minutes, helping engineers catch issues in tweaks, changes, or major new features before they go live.
The platform, dubbed Zoncolan, is a "static analysis" tool that maps the behavior and functions of the codebase and looks for potential problems in individual branches, as well as in the interactions of various paths through the program. Having people manually review endless code changes all the time is impractical at such a large scale. But static analysis scales extremely well, because it sets "rules" about undesirable architecture or code behavior, and automatically scans the system for these classes of bugs. See it once, catch it forever. Ideally, the system not only flags potential problems but gives engineers real-time feedback and helps them learn to avoid pitfalls.
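To make the "see it once, catch it forever" idea concrete, here is a minimal sketch of a rule-based static check. It is not Zoncolan, whose internals the article does not describe; it uses Python's standard `ast` module, and the rule itself (flagging calls to `eval` and `exec`) and the function name `find_banned_calls` are assumptions chosen purely for illustration.

```python
import ast
from typing import List, Tuple

# Hypothetical rule: these built-ins are considered undesirable in this codebase.
BANNED_CALLS = {"eval", "exec"}


def find_banned_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, function name) for every direct call to a banned built-in."""
    findings = []
    tree = ast.parse(source)  # parse only; the code under review is never executed
    for node in ast.walk(tree):
        # Match simple calls such as eval(...), not attribute calls such as obj.eval(...)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BANNED_CALLS:
                findings.append((node.lineno, node.func.id))
    return sorted(findings)


if __name__ == "__main__":
    sample = "x = eval(user_input)\ny = len(x)\nexec('print(1)')\n"
    for line, name in find_banned_calls(sample):
        print(f"line {line}: call to {name}() violates the rule")
```

Once a rule like this exists, it can be run against every proposed change and, just as easily, against the whole existing codebase, which is what lets one newly written rule surface older instances of the same mistake.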
"Every time an engineer makes a proposed change to our codebase, Zoncolan will start running in the background, and it will either report to that engineer directly or it will flag to one of our security engineers who’s on call," says Pieter Hooimeijer, a security engineering manager at Facebook. "So it runs thousands of times a day, and found on the order of 1,500 issues in calendar year 2018."
"It is by far the most valuable in the identification of known exposures. However, it doesn’t cover everything."
David Kennedy, TrustedSec
Static analysis tools don't find new types of vulnerabilities on their own; they can only catch things based on the rules they've been directed to follow. But they're a useful workhorse for catching the same types of mistakes again and again, or retroactively pulling out a set of bugs from a single new rule. They're also nowhere near unique to Facebook; static analysis tools are widely used in the security community and broader development industry. But Hooimeijer notes that Zoncolan is especially effective, because it is custom-built to comprehensively map Facebook's specific code. Hooimeijer says that before Facebook disclosed in March that it had accidentally stored hundreds of millions of user passwords in plain text, the company fed a rule about the bug into Zoncolan to scan the codebase for similar issues that could be lurking. And found a few.
"Four years ago we would have had to scramble a bunch of security engineers all at once to start combing the code manually looking for additional issues," Hooimeijer says about the incident. "Instead, we used Zoncolan to ensure there were no additional issues in our code base that were similar in nature. In this case we created new rules that found similar issues in practice." Inspiration for new rules that expand Zoncolan's detection capabilities come from a number of sources within Facebook, including the company's bug bounty program .
Zoncolan has a particularly tailored approach to hunting security bugs, versus more general static analysis tools that look for a broad array of design and performance bugs. It also focuses on recognizable data flows and patterns, as a way of cutting down on the false positives typical of static analysis. Still, Facebook's not the only company to customize a system to its liking; Google has its own custom-built static analysis tool as well, evaluating the company's enormous 2 billion line codebase.
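Focusing on data flows can be illustrated with a toy "taint" check: mark where sensitive data enters, follow it through assignments, and report only when it reaches a place it should never go. The sketch below is a deliberately simplified assumption, not a description of how Zoncolan works. It handles only straight-line, top-level Python code, and the names `get_password` (a source) and `write_log` (a sink) are hypothetical.

```python
import ast
from typing import List, Set

SOURCES = {"get_password"}  # hypothetical: calls that return sensitive data
SINKS = {"write_log"}       # hypothetical: calls that must never receive it


def _is_call_to(node: ast.AST, names: Set[str]) -> bool:
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in names)


def _is_tainted(expr: ast.AST, tainted: Set[str]) -> bool:
    """True if the expression contains a source call or a tainted variable."""
    for sub in ast.walk(expr):
        if _is_call_to(sub, SOURCES):
            return True
        if isinstance(sub, ast.Name) and sub.id in tainted:
            return True
    return False


def find_tainted_sinks(source: str) -> List[int]:
    """Return line numbers where tainted data is passed positionally to a sink."""
    tainted: Set[str] = set()
    findings: List[int] = []
    for stmt in ast.parse(source).body:
        # Check every sink call in this statement against the current taint set.
        for sub in ast.walk(stmt):
            if _is_call_to(sub, SINKS) and any(_is_tainted(a, tainted) for a in sub.args):
                findings.append(sub.lineno)
        # Propagate (or clear) taint through simple variable assignments.
        if isinstance(stmt, ast.Assign):
            value_tainted = _is_tainted(stmt.value, tainted)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if value_tainted:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
    return findings
```

Because the check fires only when a value derived from a source actually reaches a sink, a harmless call such as `write_log('ok')` is not reported, which is the sense in which flow-based rules can reduce false positives compared with flagging every use of a function.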