Detecting Code Duplicates in C/C++ with CCFinderX
Over time, as your source code repository and software team(s) grow, you may end up with more and more code that does the same thing. This is not desirable: when several people work on code that does the same thing, you pay twice for development and debugging. To avoid this, proper team communication and management must be in place (e.g. discourage copy/paste of source code, use a common source control repository). However, it can be difficult to detect where the duplicates are. Luckily, code duplication analysis tools such as CCFinderX are here to help.
As described on the CCFinderX website:
CCFinderX is a code-clone detector, which detects code clones (duplicated code fragments) from source files written in Java, C/C++, COBOL, VB, C#.
CCFinderX is a major version up of CCFinder, and it has been totally re-designed and re-implemented from scratch. Its new design and technologies aim at improving performance, enabling a user-side customization of a preprocessor, and providing an interactive analysis based on metrics.
CCFinderX can be installed on Windows XP/Vista or Ubuntu Linux.
- For Windows, download ccfx-win32-en.zip.
- For Linux, download ccfx-src.7z and karmicmakefileetc.7z (Experimental)
For Windows, you’ll also need to download and install Python 2.6 in order to run CCFinderX version 10.2.7.4. N.B.: I had Python 2.5.2, but I had to upgrade to Python 2.6.5 to make it work. CCFinderX will not work with Python 3.x, even if you change gemx.bat.
After extracting ccfx-win32-en.zip to C:\Program Files, double-click C:\Program Files\ccfx-win32-en\bin\gemx.bat to start CCFinderX.
To start detecting clones, click File->Detect Clones, then select the directory where your source code is located and the preprocess script (cpp, cobol, csharp, java, plaintext or visualbasic).
For testing, I created a directory containing two C files, both of which include the same function, ConvertEncodings.
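To give an idea of what such a test input could look like, here is a minimal sketch of a function that might be pasted identically into both files. Only the name ConvertEncodings comes from the test described above; the signature and the body (a Latin-1 to UTF-8 conversion) are assumptions made up for illustration, not the actual test code.

```c
#include <stddef.h>

/*
 * Hypothetical body for ConvertEncodings (only the name comes from the
 * article). Converts ISO-8859-1 bytes to a NUL-terminated UTF-8 string.
 * Returns the number of bytes written, excluding the terminating NUL,
 * or (size_t)-1 if the destination buffer is too small.
 */
size_t ConvertEncodings(const unsigned char *src, size_t src_len,
                        unsigned char *dst, size_t dst_cap)
{
    size_t out = 0;

    for (size_t i = 0; i < src_len; ++i) {
        unsigned char c = src[i];

        if (c < 0x80) {
            /* ASCII: one byte, keep room for the final NUL */
            if (out + 1 >= dst_cap)
                return (size_t)-1;
            dst[out++] = c;
        } else {
            /* 0x80..0xFF: two-byte UTF-8 sequence */
            if (out + 2 >= dst_cap)
                return (size_t)-1;
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }

    if (out >= dst_cap)
        return (size_t)-1;
    dst[out] = '\0';
    return out;
}
```

Saving the same function in two separate .c files within one directory should be enough to give the detector an obvious clone pair to report.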
Once the duplicate detection is complete, I usually click on the “Clone Set Table” tab in the second column, where you can sort the duplicates by length, and then on the “Source Text” tab to view the duplicates side by side.
I then use a file comparison tool such as BeyondCompare or WinMerge to handle the code duplicates, either by deleting the extra function or, when parts of the code have been heavily copied and pasted, by extracting them into a shared function.
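As a rough sketch of the second option, the copied logic can be moved into one helper that every former copy calls. All names below (sum_positive, invoice_total, score_total) are hypothetical and only illustrate the shape of the refactoring.

```c
#include <stddef.h>

/*
 * Shared helper (hypothetical): the loop that was previously copied
 * and pasted into both callers now lives in one place.
 */
static long sum_positive(const int *values, size_t count)
{
    long total = 0;

    for (size_t i = 0; i < count; ++i) {
        if (values[i] > 0)
            total += values[i];
    }
    return total;
}

/* First former copy: now delegates to the shared helper. */
long invoice_total(const int *amounts, size_t n)
{
    return sum_positive(amounts, n);
}

/* Second former copy: same delegation, so a fix is made only once. */
long score_total(const int *points, size_t n)
{
    return sum_positive(points, n);
}
```

In a real code base the helper would normally be declared in a common header instead of being static, so that both source files can use it.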