Wednesday, March 04, 2015

Does refactoring worsen code?

Thought-provoking: Study finds that refactoring doesn’t improve code quality. The opening:

Refactoring software, that is, restructuring existing source code to make it more readable, efficient, and maintainable, is something all developers do every now and again. Of course, the implicit assumption behind refactoring is that the benefits (time and headaches saved in the future) outweigh the costs (time and effort spent now). However, new experimental research suggests that this may not be the case and that software code quality may not be improved much, if at all, by refactoring.

The study was done by researchers in Sri Lanka and recently published in the International Journal of Software Engineering & Applications titled An Empirical Evaluation of Impact of Refactoring On Internal and External Measures of Code Quality. The goal was to test whether common refactoring techniques resulted in measurable improvements in software quality, both externally (e.g., Is the code more maintainable?) and internally (e.g., Number of lines of code).

The researchers selected a small-scale application (about 4,500 lines of C# code) used by the academic staff at the University of Kelaniya for scheduling events and managing online documents for evaluation. 10 common refactoring techniques were applied to the code (e.g., Replace Type Code with Subclasses, Replace Conditional with Polymorphism).

Reading the press account, I expect objections such as:

C#?
Well, language should not matter here.
Too small an example!
This is a fair complaint: the impact of refactoring scales with code base size.
Those weren't professionals
The reviewers were a mix, but this complaint overlooks how much code is written and maintained by "non-programmers" (an oxymoron, since programming is an activity, not a type of person).

My own objection is that the refactoring did not go far enough. Deepening a type hierarchy often does make code "worse"; delegation is often a better choice than inheritance. In a code base larger than the sample used, this becomes more obvious.
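To make the delegation point concrete, here is a minimal sketch in Python (not the study's C#, and not its code). Every name in it (Employee, PayPolicy, SalariedPay, ManagerPay) is hypothetical and invented for illustration. Instead of replacing a type code with one subclass per kind of employee, the varying behavior lives in a small policy object that Employee delegates to:

```python
from dataclasses import dataclass
from typing import Protocol


class PayPolicy(Protocol):
    """Hypothetical interface for the behavior that varies by employee kind."""

    def pay(self, base: float) -> float: ...


class SalariedPay:
    """Pays the base amount unchanged."""

    def pay(self, base: float) -> float:
        return base


class ManagerPay:
    """Pays the base amount plus a fixed bonus."""

    def __init__(self, bonus: float) -> None:
        self.bonus = bonus

    def pay(self, base: float) -> float:
        return base + self.bonus


@dataclass
class Employee:
    """One Employee class; no subclass per type code."""

    name: str
    base: float
    policy: PayPolicy  # the delegate that carries the varying behavior

    def pay(self) -> float:
        # Delegate instead of overriding pay() in a subclass.
        return self.policy.pay(self.base)


employee = Employee("Ada", 1000.0, SalariedPay())
salaried_amount = employee.pay()

# A promotion swaps the delegate; the Employee object keeps its identity.
employee.policy = ManagerPay(250.0)
manager_amount = employee.pay()
```

With subclasses, a promotion would mean constructing a new object of a different class; here it is a single assignment, and the Employee hierarchy stays flat. Whether that trade is worth it in a 4,500-line application is a separate question.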
