What are code duplicates, how do they arise and why should they be eliminated?
Code duplicate – identical or similar source code
Source code that is used in identical form several times within a piece of software is called a code duplicate – alternatively also a software clone or source code clone. Similar code sections or fragments are also considered duplicates.
Code duplicates are created when existing functionality is copied from one place to another within a piece of software. This is also called copy-and-paste programming.
Reasons for code duplicates
There are several reasons that lead to code duplicates:
- Existing code is copied in order to adapt it elsewhere (e.g. by renaming variables or deleting or adding lines of code) or to test it.
- Functioning code is reused to minimise the risk of new errors.
- Developers are under time pressure and want to use existing code to save implementation time.
- Developers lack the knowledge to write a specific piece of code themselves and therefore duplicate existing code.
- Code is generated automatically, e.g. by Model Driven Development or Software Factories.
- The use of libraries can also lead to the creation of software clones.
Reasons for eliminating code duplicates
- To reduce maintenance costs. A source code clone must be read and understood at every point of use. Unlike with other code, the reader must also find out whether minor differences between the copies (e.g. in identifiers, whitespace or comments) are intended. Once an improvement has been identified, it must be implemented at every point of use.
- To facilitate bug fixing. It is easy to overlook clones and the errors copied along with them. The result is inconsistent changes: an error is fixed in one copy but remains in the others.
- To minimise memory requirements. Cloning increases the amount of code and thus the amount of memory required. This is particularly critical in embedded systems, where usually only little memory is available. Duplication also increases the time required for compilation.
- To improve the readability of the code.
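The bug-fixing problem can be illustrated with a minimal sketch. The function names and the discount logic are hypothetical and only serve as an example: the second function is a clone of the first with renamed identifiers, and a later fix was applied to one copy only.

```python
# Hypothetical example: the same discount logic was copied into two places.

def invoice_total(prices, discount_percent):
    subtotal = sum(prices)
    # The fix was made here: the discount is clamped to the range 0..100.
    discount_percent = max(0, min(discount_percent, 100))
    return subtotal * (100 - discount_percent) / 100


def quote_total(item_prices, rebate):
    # Clone of invoice_total with renamed identifiers.
    # The clamping fix was never applied to this copy.
    sub = sum(item_prices)
    return sub * (100 - rebate) / 100


print(invoice_total([50, 50], 150))  # 0.0
print(quote_total([50, 50], 150))    # -50.0
```

Both functions look almost identical, so a reader has to compare them line by line to notice that one still returns a negative total for an invalid discount.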
Tools can help with identifying code duplicates, which is also known as clone detection. Depending on the tool, the analysis is textual, lexical (token-based) or based on a more abstract representation of the code, such as its syntax tree. Tools often also support the use of metrics.
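As a rough illustration of the textual approach, the following sketch compares sliding windows of normalised lines. It is not how any particular tool works; the function name `find_textual_clones`, the window size and the sample input are assumptions made for this example.

```python
from collections import defaultdict


def find_textual_clones(source, window=3):
    """Return groups of 1-based start line numbers whose next `window`
    normalised lines are identical."""
    # Keep (line number, stripped text) pairs; skip blank lines and comments.
    lines = [
        (number, text.strip())
        for number, text in enumerate(source.splitlines(), start=1)
        if text.strip() and not text.strip().startswith("#")
    ]
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        chunk = tuple(text for _, text in lines[i:i + window])
        seen[chunk].append(lines[i][0])
    # Only chunks that occur more than once are clone candidates.
    return [starts for starts in seen.values() if len(starts) > 1]


SAMPLE = """\
total = 0
for p in prices:
    total += p
print(total)

total = 0
for p in prices:
    total += p
save(total)
"""

print(find_textual_clones(SAMPLE))  # [[1, 6]]
```

A purely textual comparison like this one only finds copies that match after whitespace and comments are ignored. Clones with renamed identifiers would go unnoticed, which is where lexical or syntax-based analysis becomes useful.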
Software clones can be resolved with the help of abstractions (e.g. by extracting recurring algorithms into procedures or methods) or by using a common base class. If this cannot be done easily and quickly, it is recommended to monitor the code duplicate in order to at least avoid further unwanted, inconsistent changes.
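Extracting a recurring step into a procedure could look like the following sketch. All function names are hypothetical; the "before" variants are kept only to show where the duplicate came from.

```python
# Before: the same validation is duplicated in two functions (hypothetical).
def register_user_before(name):
    name = name.strip()
    if not name:
        raise ValueError("name must not be empty")
    return f"registered {name}"


def rename_user_before(name):
    name = name.strip()
    if not name:
        raise ValueError("name must not be empty")
    return f"renamed to {name}"


# After: the recurring steps live in a single helper.
def clean_name(name):
    name = name.strip()
    if not name:
        raise ValueError("name must not be empty")
    return name


def register_user(name):
    return f"registered {clean_name(name)}"


def rename_user(name):
    return f"renamed to {clean_name(name)}"


print(register_user("  Ada "))  # registered Ada
```

After the refactoring, a change to the validation rule has to be made in `clean_name` only, so the two callers can no longer drift apart.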