I’m writing today’s blog post partly because of a conversation I had with Gunnar Peterson and partly because we hit a security code review milestone at Realex Payments this week!
We were discussing application security resourcing and how here at Realex we have roughly one full time application security resource for every ten developers. At the moment this means we have two full time application security resources in our Product Delivery team. I wrote the following tweet to Gunnar whilst I was explaining how we have ended up with this ratio:
And I must admit I made a slight mistake: I only have 3 years of data, not the 4 I mentioned in the tweet, but the rest still applies! I first started formally recording security code review data on the 25th September 2009, in a spreadsheet with the following column headings, after a discussion with an auditor (not a QSA!):
Looking back, this is something I wish I'd started doing sooner, but I'm happy sitting here with nearly 3 years of data instead of none. We used the spreadsheet to record the security code reviews we completed, and this ultimately planted the idea in my head to develop Agnitio; I didn't know why the data would be important, but I knew one day it would be useful. I introduced the first version of Agnitio in early 2010 and we used the spreadsheet alongside the very early versions of Agnitio up until v1.0 was released to the public.

Since mid November 2010 we have used Agnitio to conduct 439 security code reviews covering 443,196 lines of code. When we add the pre-Agnitio data, we have now reviewed over 500,000 lines of code in 614 security code reviews, from the 25th September 2009 up until a review I saved a couple of hours ago. I should point out that means we have manually, with human reviewers, reviewed over 500,000 lines of code since 25th September 2009. We do of course use static analysis tools to assist the human reviewers, but our stance is to review every single line of code changed or introduced in a release.

We can manage this with the amount of code we produce, even with a switch to Agile development (more on this later), because since we started keeping records in late 2009 we have reviewed an average of 14,006 lines of code a month. This is of course an average: some months we have a lot less to review, and other months we have had 4 or 5 times this amount. But taking the average lines of code a month figure, I know I need to allocate at least half of an application security resource to security code reviews each month. By knowing the average lines of code a month that we need to review, the average time taken to security test specific applications and services, and the time needed for work like threat modelling and miscellaneous Security Ninja work (hey, being Security Ninja isn't as easy as I make it look!), we were able to justify hiring more application security staff.
We looked at the above data and compared it to my own knowledge of security code reviews, specifically how long I could review code for before I became less effective. When we looked at research like the "11 proven practices for more effective, efficient peer code review" we decided that our human reviewers should limit themselves to reviewing 1,500 lines of code in a day. This is based on multiple parts of that particular study (review fewer than 200-400 lines of code at one time, take breaks between those review sittings, human concentration levels, other work that needs to be done and so on) and it has served us well over the past three years. We will often exceed 1,500 lines of code a day when it's fine to do so, but in general that's how we estimate security code reviews here. That means on average we need to allocate 10 days of a human's time from our team to security code reviews each month. Without going off track and away from the security code review data I'm focusing on today: when we factored in time for security testing and other SDLC work, we could clearly see we needed more resources. If you don't have the data, you can't make a valid argument for more resources, in my opinion.
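If you want to play with the numbers yourself, here's a minimal sketch of the estimation arithmetic described above. The figures come straight from this post; the function names, and the assumption of a 20-working-day month, are mine, not part of any tooling we actually use:

```python
import math

AVG_LOC_PER_MONTH = 14_006   # average lines of code reviewed per month (from our data)
DAILY_REVIEW_LIMIT = 1_500   # self-imposed per-reviewer daily limit
WORKING_DAYS_PER_MONTH = 20  # assumption: a typical working month


def review_days_needed(loc_per_month: int, daily_limit: int) -> int:
    """Days of one reviewer's time needed per month, rounded up."""
    return math.ceil(loc_per_month / daily_limit)


days = review_days_needed(AVG_LOC_PER_MONTH, DAILY_REVIEW_LIMIT)
fraction = days / WORKING_DAYS_PER_MONTH

print(days)      # 10 days of review time per month
print(fraction)  # 0.5 -> half an application security resource
```

That half-a-resource figure is only the code review slice; security testing, threat modelling and the rest get added on top when justifying headcount.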
The final thing I wanted to touch on was Agile development and how our data helped us have adequate security resources for our switch to Agile. I've said it on Twitter before, but our security deliverables are the same as they were with Waterfall; they just tend to be smaller reviews that happen more frequently than they used to. We didn't jump into the deep end when it came to Agile, which helped us collect data we could use to estimate the impact on application security when we moved all our development teams to Agile. We moved one team to Agile first and used the data we collected from them to figure out what resource changes, if any, we needed to make in application security. What we found surprised us, to be honest: we didn't need much more than we already had (roughly one application security resource for every ten developers) even though a lot more reviews were needed. To give you an idea of how many more: we completed 92 security code reviews between 1st January 2011 and 13th September 2011; in the same period in 2012 we have completed 290.
One final point I wanted to make was around data access. We have created a small tool which accesses the Agnitio database and gives us (not just application security, but anyone in the department) many different views of the three years' worth of data, including date range searches, per component searches and per team searches. If you have the data, don't hide it.
I know people have been asking on Twitter recently whether anyone had data to back up whether training developers helps produce secure code (or something similar, I think). Well, I took a look at the data I've collected over the past three years, during which we have had mandatory on-hire and periodic application security training for developers, in addition to all the other good things you'd expect in an SDLC. Can I say security awareness training is helping us to produce secure code? It's hard to say which bit of the SDLC makes the biggest (or smallest) difference, but working with the data I do have, we have a first time pass rate of 93.8% (something I'm proud of but still want to increase), with a pass being granted when no security bugs are found (literally none).
I’m not sure what people would like me to discuss on the data we have collected so if you have any questions feel free to ask!