Code duplication

Code duplication is hard to define precisely. A duplicate generally matches one of the following definitions, listed from the strictest to the loosest.

  1. Code fragments that are identical except for variations in whitespace, layout and comments.
  2. Fragments that are structurally and syntactically identical except for variations in identifiers, literals, types, layout and comments. The reserved words and the sentence structures are essentially the same.
  3. As in the previous definition, but with further modifications: statements may be changed, added or deleted, in addition to variations in identifiers, literals, types, layout and comments.
  4. Code fragments that perform the same computation but are implemented with different syntactic constructs.
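
The first two definitions can be checked mechanically by comparing token streams. The sketch below is only an illustration for Python source, built on the standard `tokenize` module. The function names `tokens` and `clone_type` and the placeholders `ID` and `LIT` are hypothetical, and definitions 3 and 4 are out of its reach.

```python
import io
import keyword
import tokenize

# Token kinds that carry only comments, blank lines or end-of-file
IGNORED = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}
# Token kinds that describe block structure; their exact text is irrelevant
STRUCTURE = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}


def tokens(source, abstract=False):
    """Return the token strings of `source`, ignoring comments and layout.

    With abstract=True, identifiers become "ID" and literals become "LIT",
    so only reserved words, operators and structure remain.
    """
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in IGNORED:
            continue
        if tok.type in STRUCTURE:
            out.append(tokenize.tok_name[tok.type])
        elif abstract and tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")
        elif abstract and tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append("LIT")
        else:
            out.append(tok.string)
    return out


def clone_type(a, b):
    """Return 1 or 2 if `a` and `b` match definition 1 or 2, else None."""
    if tokens(a) == tokens(b):
        return 1
    if tokens(a, abstract=True) == tokens(b, abstract=True):
        return 2
    return None


original = (
    "def total(prices):\n"
    "    s = 0\n"
    "    for p in prices:\n"
    "        s += p\n"
    "    return s\n"
)
# Same code with an extra comment, a blank line and deeper indentation
reformatted = (
    "def total(prices):\n"
    "    # sum up\n"
    "    s = 0\n"
    "\n"
    "    for p in prices:\n"
    "            s += p   # add\n"
    "    return s\n"
)
# Same structure with different identifiers and a different literal
renamed = (
    "def count(items):\n"
    "    n = 1\n"
    "    for i in items:\n"
    "        n += i\n"
    "    return n\n"
)

print(clone_type(original, reformatted))  # definition 1
print(clone_type(original, renamed))      # definition 2
```

Indentation tokens are kept (by kind, not by width) because indentation is part of Python's block structure; for a language with braces they could be dropped along with the other layout tokens.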

Why care?

  • Propagation of bugs: if a code fragment contains a bug and the fragment is copied, the bug exists in every pasted copy. More generally, duplicating code also duplicates the associated technical debt.
  • Increased maintenance cost: any maintenance required on a copied code fragment will most likely need to be applied to the pasted copies as well, i.e. duplication multiplies the work to be done.
  • Increased time to understand, and thus to improve or modify, an existing system that contains a lot of duplication, because developers must study the differences between copies before making changes.
  • Indicator of bad design: duplication often points to a missing abstraction or the lack of a good inheritance structure.
  • Indicator of possible copyright infringement.

How to see it?

Sonar provides tools for finding and viewing these duplicated blocks.
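
To give an intuition for what a duplicated-block report contains, here is a deliberately naive sketch. It is not Sonar's algorithm: it only finds runs of identical lines after trimming whitespace and skipping blank lines. The function name `duplicated_blocks`, the `window` parameter and the sample file names are assumptions made for illustration.

```python
from collections import defaultdict


def duplicated_blocks(files, window=3):
    """Find runs of `window` consecutive non-blank lines that occur more than once.

    `files` maps a file name to its source text. The result maps each
    repeated block (a tuple of trimmed lines) to a list of
    (file name, starting line number) locations.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Trim each line and drop blank ones, keeping the original line numbers
        lines = [
            (number, line.strip())
            for number, line in enumerate(text.splitlines(), 1)
            if line.strip()
        ]
        # Slide a fixed-size window over the remaining lines
        for i in range(len(lines) - window + 1):
            block = tuple(content for _, content in lines[i:i + window])
            seen[block].append((name, lines[i][0]))
    return {block: locations for block, locations in seen.items() if len(locations) > 1}


sample = {
    "a.py": "x = 1\ny = 2\nz = x + y\nprint(z)\n",
    "b.py": "\nx = 1\n  y = 2\nz = x + y\nprint('done')\n",
}

for block, locations in duplicated_blocks(sample).items():
    print(locations, block)
```

A real detector works on tokens rather than raw lines, so that it also catches copies that differ in identifiers or literals.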