Which Classes Contain the Most Errors?

I love reading the book Code Complete, by Steve McConnell. The following comes from Chapter 22, Developer Testing.

It’s natural to assume that defects are distributed evenly throughout your source code. If you have an average of 10 defects per 1000 lines of code, you might assume that you’ll have one defect in a class that contains 100 lines of code. This is a natural assumption, but it’s wrong.

Capers Jones reported that a focused quality-improvement program at IBM identified 31 of 425 classes in the IMS system as error-prone. The 31 classes were repaired or completely redeveloped, and, in less than a year, customer-reported defects against IMS were reduced ten to one. Total maintenance costs were reduced by about 45 percent. Customer satisfaction improved from “unacceptable” to “good” (Jones 2000).

Most errors tend to be concentrated in a few highly defective routines. Here is the general relationship between errors and code:

  • Eighty percent of the errors are found in 20 percent of a project’s classes or routines (Endres 1975, Gremillion 1984, Boehm 1987b, Shull et al. 2002).
  • Fifty percent of the errors are found in 5 percent of a project’s classes (Jones 2000).
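
To check whether your own project follows a similar pattern, you could measure how concentrated its defects are. The following is a minimal Python sketch, not something from the book: the function name defect_share and the per-class defect counts are hypothetical, and in practice the counts would come from your bug tracker.

```python
import math


def defect_share(defects_by_class, top_fraction):
    """Return the fraction of all defects found in the most defective classes.

    defects_by_class maps a class name to its defect count (hypothetical data).
    top_fraction is the share of classes to examine, e.g. 0.20 for the top 20%.
    """
    counts = sorted(defects_by_class.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Round before taking the ceiling to guard against floating-point noise,
    # and always examine at least one class.
    top_n = max(1, math.ceil(round(len(counts) * top_fraction, 9)))
    return sum(counts[:top_n]) / total


# Made-up defect counts for ten classes, for illustration only.
sample = {
    "Parser": 40, "Scheduler": 25, "Cache": 5, "Logger": 3, "Config": 2,
    "Auth": 2, "Report": 1, "Mailer": 1, "Utils": 1, "Cli": 0,
}

print(f"Top 20% of classes hold {defect_share(sample, 0.20):.0%} of defects")
```

With this invented data set, the two most defective classes out of ten account for 65 of the 80 defects. Real projects will not match the 80/20 figure exactly; the point is to measure the distribution rather than assume it is even.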

These relationships might not seem so important until you recognize a few corollaries. First, 20% of a project’s routines contribute 80% of the cost of development (Boehm 1987b). That doesn’t necessarily mean that the 20% that cost the most are the same as the 20% with the most defects, but it’s pretty suggestive.

Second, regardless of the exact proportion of the cost contributed by highly defective routines, highly defective routines are extremely expensive. In a classic study in the 1960s, IBM performed an analysis of its OS/360 operating system and found that errors were not distributed evenly across all routines but were concentrated into a few.

Third, the implication of expensive routines for development is clear. As the old expression goes, “time is money”. The corollary is that “money is time”, and if you can cut close to 80 percent of the cost by avoiding troublesome routines, you can cut a substantial amount of the schedule as well.

Fourth, the implication of avoiding troublesome routines for maintenance is equally clear. Maintenance activities should be focused on identifying, redesigning, and rewriting from the ground up those routines that have been identified as error-prone.
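
One possible way to build that list of error-prone routines is to rank routines by defect count and keep the smallest set that covers a chosen share of all defects. This is only a sketch under my own assumptions: the name error_prone_routines, the 50 percent default threshold, and the sample counts are all hypothetical, and the book does not prescribe this procedure.

```python
def error_prone_routines(defects_by_routine, target_share=0.5):
    """Return the most defective routines that together cover target_share of defects.

    defects_by_routine maps a routine name to its defect count (hypothetical data).
    """
    ranked = sorted(defects_by_routine.items(), key=lambda item: item[1], reverse=True)
    total = sum(count for _, count in ranked)
    selected = []
    covered = 0
    for name, count in ranked:
        # Stop once the selected routines already cover the target share.
        if total == 0 or covered >= target_share * total:
            break
        selected.append(name)
        covered += count
    return selected


# Made-up defect counts, for illustration only.
routine_defects = {
    "parse_input": 40, "schedule_jobs": 25, "update_cache": 5,
    "write_log": 3, "load_config": 2, "send_mail": 1,
}

print(error_prone_routines(routine_defects))        # candidates covering half the defects
print(error_prone_routines(routine_defects, 0.8))   # candidates covering 80 percent
```

Raw defect counts favor large routines, so a variation would rank by defects per line of code instead. Either way, the output is a short list of candidates for redesign, not a verdict.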



