Cyclomatic Complexity
Have you ever opened up a file in an older piece of software and seen this?
# rubocop:ignore Metrics/CyclomaticComplexity
def complex_method(x, y, z)
if x > 0
if y > 0
if z > 0
return "All positive"
else
return "z is not positive"
end
else
if z > 0
return "y is not positive, but z is positive"
else
return "y and z are not positive"
end
end
else
if y > 0
if z > 0
return "x is not positive, but y and z are positive"
else
return "x and z are not positive, but y is positive"
end
else
if z > 0
return "x and y are not positive, but z is positive"
else
return "All are not positive"
end
end
end
end
# did you actually read the whole thing? I hope not!
# rubocop enable:Metrics/CyclomaticComplexity
What is Cyclomatic Complexity?
Looking at that method, it’s, well, bad. It’s hard to read and you’re likely lost by the second or third conditional.
It’s easy to say, complexity here is bad. But what does the linter mean when it calls this “Cyclomatic Complexity”?
In typical Wikipedia fashion, the first sentence on cyclomatic complexity uses the word “complexity” to define it:
Cyclomatic complexity is a software metric used to indicate the complexity of a program.
Errrr…okay. Cyclomatic complexity is…complexity? Not helpful.
Let’s read on:
It is a quantitative measure of the number of linearly independent paths through a program's source code.
Ah! That’s a bit more helpful. Cyclomatic Complexity is a number grading system to tell you how messy your code is! 😆
Let’s dig into this.
The Math (or not)
There are formulae behind computing cyclomatic complexity, and you can map those formulas onto the code. We’ll take a look at this in a minute, but first for those of us who think of things in connected graphs rather than formulas. There’s another way of thinking about cyclomatic complexity.
How many tests do you have to write to get it right?
In the disastrous example above, I can see at least 9 test cases because there are 8 return
statements and the implicit return
of the method itself. That’s a lot to keep track of!
You can even visualize this! Here’s an example from the Wikipedia page:
The formulae
The caption on the image above uses a formula where
E = the number of edges of the graph.
N = the number of nodes of the graph.
P = the number of connected components.
And calculates the complexity with E - N + 2P
. You can determine the number of edges and nodes by counting the control flow statements (conditionals).
Rubocop uses a slightly simpler calculation.
Control flow statements (I.e., conditionals, or if
statements) count towards cyclomatic complexity. Every method has a cyclomatic complexity of 1 if the code path is linear. That means there are no control flow statements and no alternate outcomes.
In Rubocop, every control flow statement adds 1 to the cyclomatic complexity. Control flow statements are if
, unless
, ternary statements using ?
and :
, &&
, and ||
.
You might think those last two are odd because you can write if true && true
and that looks like one statement. But it can be broken down into two if
statements so it should be treated the same.
if true
if true
return someting
end
end
Rubocop also counts enumerations towards cyclomatic complexity because code can branch based on the contents of the enumerator. However, enumerations do not multiply the complexity. For example, if an enumerator has 5 entities and there is an if
inside the enumeration block, the if
only counts 1
point towards cyclomatic complexity, not 5. In most cases, Rubocop won’t know how many entities are in the enumerator anyway.
The Rubocop docs have a great example of how they calculate cyclomatic complexity:
def each_child_node(*types) # count begins: 1
unless block_given? # unless: +1
return to_enum(__method__, *types)
children.each do |child| # each{}: +1
next unless child.is_a?(Node) # unless: +1
yield child if types.empty? || # if: +1, ||: +1
types.include?(child.type)
end
self
end # total: 6
How do I fix cyclomatic complexity?!
Each case will be slightly different, so here’s the approach I usually take.
- Start with breaking the method down into smaller methods that have a single responsibility. Keep human-readability in mind!
- Bring early
return
statements to the top. These are called guard clauses. If there are any cases where there is a single condition likeif var == 1
, you can convert this to a guard clause. - If you begin to see a pattern where some groups of method have related functionality or handle specific types of data, consider breaking these off into their own class.
- Skim through design pattern docs and see if any would help you. I’m bad at remembering names of design patterns and their examples. But I often find them helpful when I’m untangling complex code. So do a bit of searching or use your favourite AI to decide if a design pattern can help.
- Did I say keep human-readability in mind? We’re writing code for humans to read and computers to execute. Computers don’t care if your code is messy, but humans will get conflustered if your code is hard to read and it causes a bug.