Long Method Smell: When is a method too big?
Unfortunately most people still measure size of code in number of Lines of Code (LoC). We all know LoC is a professional malpractice. Now, how do you objectively identify a long method? If we are not supposed to count LoC, then how can we define a long method?
Some people say, if the code does not fit in one screen and if you have to hit page down, then the method is long. How many times have you looked at code that fits in one screen, but still felt that code was long? Happens to me all the time.
Joshua Kerievsky says “If one cannot quickly and easily understand what a method does and how the method does it, it is a long method”. I really like this definition. But is a little wage to me and I don’t quite understand the theory behind why and when can something be hard to easily understand.
One approach I’ve found to rationalize long method smell is by using the Single Responsibility Principle (SRP). If the method violates SRP, there is a good chance that its Long Method.
If I need to parse the method’s code more than once, then its a good indication that the method is complicated to understand.
Cyclomatic Complexity can also give some interesting data points to under/measure when a piece of code is long. Usually large methods have a higher CC.
Recently I stumbled upon “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information“, a 1956 paper by the cognitive psychologist George A. Miller of Princeton University’s Department of Psychology.
In this paper, Miller showed a number of remarkable coincidences between the channel capacity of a number of human cognitive and perceptual tasks. In each case, the effective channel capacity is equivalent to between 5 and 9 equally-weighted error-less choices: on average, about 2.5 bits of information. – Source WikiPedia
What does this mean? In a layman’s world, this means that 7+/-2 is the number of things (concepts) we can hold in our brain. So when I look at a piece of code and if it has more than 9 things in there, it exceeds my brain capacity to hold it in my memory and actually understand what is going on. I often notice that 7 or less things in the code is easy to manage. Once it starts cross that number, its gets exponentially difficult to hold it in my mind and to understand what is going on.
So if you are thinking of deleting elements from an array if they match a set of to-be-deleted elements, then that’s a good method for me. Why? Coz : I have an array, a set, an iterator, a loop, current values, a comparator and a delete operation. Around 7 things. That’s the max I can hold in my brain. But now if all of a sudden you throw thread synchronization into this, I may end up taking the loop, matching the current elements and the deletion out into another method.
So size has nothing to do with LoC, its a measure of related concepts that you need to hold in your brain.