Of course, it’s impossible to answer this question in a single post, or even in a single book (the best-known book on the subject is Michael Feathers’ “Working Effectively with Legacy Code”). In fact, it’s impossible to answer this question in general, because everything depends on the current state of the code you have to refactor, and every situation is specific. Still, there are a few things that can be said, and that can (and should) be applied, when it comes to refactoring.
The first thing to do is to establish a dependency graph. It does not have to be precise (although the more precise, the better), but it should be detailed enough to show the general modules and the main classes inside those modules. This will let you identify the coupling and the constraints of your development.
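As a rough illustration, a dependency graph can be kept as a simple mapping from each class to the classes it depends on. The class names below are hypothetical and only serve the example; `reverse_edges` is a helper introduced here to find who depends on a given class.

```python
# Hypothetical dependency graph: each key depends on the classes in its set.
dependencies = {
    "LoginViewController": {"AuthService", "UserStore"},
    "AuthService": {"NetworkClient", "UserStore"},
    "UserStore": {"Database"},
    "NetworkClient": set(),
    "Database": set(),
}


def reverse_edges(graph):
    """Return a map from each class to the set of classes that depend on it."""
    dependents = {name: set() for name in graph}
    for source, targets in graph.items():
        for target in targets:
            # setdefault covers targets that are not listed as keys.
            dependents.setdefault(target, set()).add(source)
    return dependents


dependents = reverse_edges(dependencies)
print(sorted(dependents["UserStore"]))  # classes that depend on UserStore
```

Even a coarse graph like this one is enough to start counting the arrows going in and out of each class.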
Once you have established a dependency graph, questions will arise, the main one being: where should I start?
The first reaction is generally to rely on your experience and your intuition. I’m not saying that’s a bad thing, but it’s not the easiest path, as it will certainly be quite subjective. It will depend on things that you cannot always rely on, like:
- I know the app, so I know where the bad things are (and they are generally in someone else’s classes)
- This class looks big so it should be refactored
- This view controller uses some logic so let’s start with it
In fact, intuition should work together with (in parallel with) objective factors, and the good news (generally not well known) is that there are metrics that can give you a better idea of where to start.
One of these metrics is called the Instability Metric. It is calculated like this:
Instability = Out / (In + Out)
where Out is the number of arrows coming out of the class (the class depends on the classes they point to), and In is the number of arrows pointing to the class (those classes depend on it).
Let’s take some examples:
In Fig-1, ClassA has 3 arrows Out and 1 arrow In, so its Instability Metric equals 0.75 (3 / (1 + 3)).
In Fig-2, ClassB has 3 arrows Out and no arrow In, so its Instability Metric equals 1 (3 / (0 + 3)). As we can see, this class depends (a lot) on other classes, but no one depends on it. It’s a really unstable class: it is in danger whenever one of its dependencies changes, but if we change it, no one will care.
Let’s take a last example. In Fig-3, ClassC has no arrow Out but 3 arrows In. Its Instability Metric is thus 0 (0 / (3 + 0)). Three classes depend on it, but it does not depend on anyone. It’s a very stable class, as only internal changes will affect it. On the other hand, a change to this class will potentially impact a lot of other classes (at least 3).
So the closer the Instability Metric is to 0, the more stable the class is.
Note: for those who are wondering what happens if a class has no arrows In or Out: well, the Instability Metric cannot be calculated. But in that case, either your whole app is coded in one single file and the developer who did this should be fired, or this class is of no interest for your development as it is not used at all, so you can (and should) delete it.
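The formula and the three examples can be sketched in a few lines of Python. The function name and the choice of returning `None` for a class without any arrows are assumptions made for this illustration.

```python
from typing import Optional


def instability(outgoing: int, incoming: int) -> Optional[float]:
    """Instability = Out / (In + Out); None when the class has no arrows."""
    total = incoming + outgoing
    if total == 0:
        return None  # isolated class: the metric cannot be calculated
    return outgoing / total


print(instability(outgoing=3, incoming=1))  # ClassA (Fig-1): 0.75
print(instability(outgoing=3, incoming=0))  # ClassB (Fig-2): 1.0
print(instability(outgoing=0, incoming=3))  # ClassC (Fig-3): 0.0
print(instability(outgoing=0, incoming=0))  # no arrows at all: None
```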
So now that you have this nice metric, what can you do with it? Well, that’s the nice part: you can apply the Stable Dependencies Principle (SDP), which says:
Stability should follow dependency
which means that Instability should decrease as you follow the dependency relations (the arrows, to make it short).
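One possible way to check this rule automatically is to compute the Instability of every class and flag each arrow that points from a more stable class to a less stable one. This is only a sketch: the graph and its class names are hypothetical, and it assumes every class appears as a key of the dictionary.

```python
def instability_by_class(graph):
    """Compute Out / (In + Out) for every class of the graph."""
    fan_in = {name: 0 for name in graph}
    for targets in graph.values():
        for target in targets:
            fan_in[target] += 1
    scores = {}
    for name, targets in graph.items():
        total = fan_in[name] + len(targets)
        scores[name] = len(targets) / total if total else None
    return scores


def sdp_violations(graph):
    """List the arrows along which Instability goes up instead of down."""
    scores = instability_by_class(graph)
    violations = []
    for source, targets in graph.items():
        for target in sorted(targets):
            # Both ends of an arrow always have a defined score.
            if scores[source] < scores[target]:
                violations.append((source, target))
    return violations


# Hypothetical graph: each key depends on the classes in its set.
graph = {
    "Screen": {"Service", "Formatter"},
    "Service": {"Repository"},
    "Formatter": set(),
    "Repository": {"Logger", "Cache", "Clock"},
    "Logger": set(),
    "Cache": set(),
    "Clock": set(),
}

print(sdp_violations(graph))  # [('Service', 'Repository')]
```

Here Service (0.5) depends on Repository (0.75), so Instability rises along that arrow, which is exactly what the principle asks you to avoid.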
Let’s apply this to a concrete example and you will see how useful (powerful?) it is.
In Fig-4, you see a dependency graph, and the question you are asked is: is there any problem in this graph? If your answer is “nope, everything is fine”, then you have a problem with architecture and dependency cycles. If your answer is “there are dependency cycles in this graph, and a cycle is something that should not happen”, then you can keep on reading this blog post and move to the next question: “What should I do and what should I fix?”. Most of the time, people will focus on ClassB, mostly because it stands in the middle of the graph and because it has a lot of arrows in and out.
In that case, the best thing to do is to calculate Instability and apply the Stable Dependencies Principle. In our case, Fig-5 will give you a hint about what to refactor.
What we can see here is that Instability goes up between Class D and Class C, and thus violates the SDP. So if you were the software architect, you should propose to refactor Class C in order to fix the cycles and respect the SDP.
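If you prefer not to rely on your eyes to spot the cycles, a depth-first search can find them for you. The sketch below uses a hypothetical graph (it is not the graph of Fig-4) and returns the first cycle it meets.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of class names, or None."""
    NEW, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: NEW for name in graph}
    path = []

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for target in sorted(graph.get(node, ())):
            if state.get(target, NEW) == IN_PROGRESS:
                # The target is already on the current path: cycle found.
                return path[path.index(target):] + [target]
            if state.get(target, NEW) == NEW:
                found = visit(target)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for name in graph:
        if state[name] == NEW:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical graph: each key depends on the classes in its set.
cyclic = {
    "Orders": {"Billing"},
    "Billing": {"Customers"},
    "Customers": {"Orders"},
    "Reports": {"Orders"},
}
print(find_cycle(cyclic))  # ['Orders', 'Billing', 'Customers', 'Orders']
```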
Now, what can you propose to fix this? A good example of a bad idea is to delete the dependency between Class C and Class B. If you compute Instability and apply the SDP again, you will immediately see the problem (see Fig-6).
Another idea is to eliminate the dependency between Class D and Class C.
This is an excellent choice, except that eliminating a dependency is generally very difficult. The good point of this idea is that you start to draw frontiers inside your architecture, and you can see that there is obviously a boundary between two modules here: one including classes C and F, and another one including the rest of the classes.
Once this boundary is defined, the good choice (and generally the easiest) is to use the DIP (Dependency Inversion Principle) to decouple the two modules, resolve the dependency cycles and follow the SDP, as in Fig-8.
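To give an idea of what the DIP looks like in code, here is a minimal sketch. It does not reproduce Fig-8: all the class names are hypothetical. The stable module owns an abstraction, and the class on the other side of the boundary implements it, so the arrow across the boundary now points toward the stable side.

```python
from abc import ABC, abstractmethod
from typing import List


# Stable module: owns the abstraction and depends only on it.
class ReportSource(ABC):
    @abstractmethod
    def rows(self) -> List[str]:
        """Return the lines to print."""


class ReportPrinter:
    def __init__(self, source: ReportSource) -> None:
        self._source = source  # injected, never instantiated here

    def render(self) -> str:
        return "\n".join(self._source.rows())


# Other module: implements the abstraction defined by the stable module.
class DatabaseReportSource(ReportSource):
    def rows(self) -> List[str]:
        return ["alpha", "beta"]  # placeholder data for the example


# Wiring happens at the edge of the app (or in an injection framework).
printer = ReportPrinter(DatabaseReportSource())
print(printer.render())
```

ReportPrinter no longer knows anything about DatabaseReportSource; only the wiring code knows both sides.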
Now you have two decoupled parts with clean boundaries, and the impact on your code is potentially minimal, as the implementation should not be too difficult. The main drawback is that you now have a cycle in terms of dependency injection, but any decent injection framework can generally handle this.
I hope this post is of some help to someone. Let’s be SOLID. Have a nice day.