Code duplication
Code duplication is hard to define precisely. A duplicated fragment generally matches one of the following definitions, ordered from the strictest to the loosest.
- Identical code fragments, except for variations in whitespace, layout and comments.
- Structurally / syntactically identical fragments, except for variations in identifiers, literals, types, layout and comments. The reserved words and the statement structure are essentially the same.
- Same as the previous definition, but with further modifications: statements can be changed, added and / or deleted, in addition to variations in identifiers, literals, types, layout and comments.
- Code fragments that perform the same computation but are implemented with different syntax.
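The four definitions above can be illustrated with a small sketch. The functions below are hypothetical examples written for this page, not code from a real project; each one is a duplicate of the first function under one of the definitions.

```python
# Reference fragment.
def sum_of_squares(values):
    total = 0
    for v in values:
        total += v * v
    return total


# Definition 1: identical except for whitespace, layout and comments.
def sum_of_squares_copy(values):
    total = 0  # running total
    for v in values:

        total += v*v
    return total


# Definition 2: same structure; identifiers renamed and a literal changed.
def squared_norm(xs):
    acc = 0.0
    for x in xs:
        acc += x * x
    return acc


# Definition 3: same as definition 2, with statements added.
def squared_norm_of_positives(xs):
    acc = 0.0
    for x in xs:
        if x < 0:
            continue  # added: negative inputs are skipped
        acc += x * x
    return acc


# Definition 4: same computation as the reference, different syntax.
def sum_of_squares_functional(values):
    return sum(v * v for v in values)
```

The further down the list a duplicate falls, the harder it tends to be to spot by plain text comparison, since less and less of the original text survives.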
Why care?
- Propagation of bugs: if a code fragment contains a bug and this fragment is copied, then the bug will exist in all pasted fragments. More generally, duplicating code will also duplicate the associated technical debt.
- Increased maintenance cost: any maintenance required on a copied code fragment will most likely need to be applied to every pasted copy as well, i.e. duplication multiplies the work to be done.
- Increased time to understand, and thus to improve or modify, an existing system that contains a lot of duplication: developers must study the differences between the copies before changing any of them.
- Indicator of poor design: duplication often points to a missing abstraction or a lack of a good inheritance structure.
- Possible indicator of copyright infringement, when the duplicated code was copied from a third-party source.
How to see it?
Sonar provides tools that let you locate these duplicated blocks in your code.
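To give an intuition of what such a tool looks for, here is a minimal sketch of a duplicate detector. It is not Sonar's algorithm: it is a naive illustration that only catches the first kind of duplication (identical code except for whitespace, layout and comment lines). The function names and the `window` parameter are assumptions made for this example.

```python
from collections import defaultdict


def normalize(source):
    """Return (line_number, text) pairs, skipping blank and comment-only lines.

    Whitespace is collapsed so that indentation and spacing differences
    are ignored. Trailing inline comments are NOT stripped in this sketch.
    """
    result = []
    for number, line in enumerate(source.splitlines(), start=1):
        text = " ".join(line.split())
        if text and not text.startswith("#"):
            result.append((number, text))
    return result


def find_duplicates(source, window=4):
    """Map each block of `window` normalized lines seen more than once
    to the list of line numbers where the block starts."""
    lines = normalize(source)
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        block = tuple(text for _, text in lines[i:i + window])
        seen[block].append(lines[i][0])
    return {block: starts for block, starts in seen.items() if len(starts) > 1}


SAMPLE = """
def a(values):
    total = 0
    for v in values:
        total += v
    return total

def b(values):
    # same body, with an extra comment line
    total = 0
    for v in values:
        total += v
    return total
"""

for block, starts in find_duplicates(SAMPLE).items():
    print("duplicated block starting at lines", starts)
```

A fixed-size window is the simplest possible choice; catching the looser kinds of duplication (renamed identifiers, added statements) would require comparing tokens or syntax trees rather than lines of text.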