bug #760014: autogenerate check documentation for duplicate code from javadoc
merged the existing texts in old config_duplicates.xml to package.html and java class doc
parent da1cb4309e
commit 08d574454c
@@ -29,23 +29,73 @@ import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * Checks for duplicate code that only differs by indentation.
 * <p>
 * Performs a line-by-line comparison of all code lines and reports
 * duplicate code if a sequence of lines differs only in
 * indentation. All import statements in Java code are ignored, any
 * other line - including javadoc, whitespace lines between methods,
 * etc. - is considered (which is why the check is called
 * <em>strict</em>).
 * </p>
 *
 * <subsection name="Properties">
 * <table>
 * <tr>
 * <th>name</th>
 * <th>description</th>
 * <th>type</th>
 * <th>default value</th>
 * </tr>
 * <tr>
 * <td>min</td>
 * <td>how many lines must be equal to be considered a duplicate</td>
 * <td><a href="property_types.html#int">int</a></td>
 * <td><span class="default">12</span></td>
 * </tr>
 * <tr>
 * <td>charset</td>
 * <td>name of the file charset</td>
 * <td><a href="property_types.html#string">String</a></td>
 * <td>System property "file.encoding"</td>
 * </tr>
 * </table>
 * </subsection>
 *
 * <subsection name="Examples">
 * <p> To configure the check: </p>
 * <source>
 * <module name="StrictDuplicateCode"/>
 * </source>
 *
 * <p>
 * There are many approaches for detecting duplicate code. Some involve
 * parsing a file of a programming language and analyzing the source trees
 * of all files. This is a very powerful approach for a specific programming
 * language (such as Java), as it can potentially even detect duplicate code
 * where linebreaks have been changed, variables have been renamed, etc.
 * To configure the check so that it allows larger equivalent blocks:
 * </p>
 * <source>
 * <module name="StrictDuplicateCode">
 * <property name="min" value="15"/>
 * </module>
 * </source>
 *
 * <p>
 * This copy and paste detection implementation works differently.
 * It cannot detect copy and paste code where the author deliberately
 * tries to hide his copy+paste action. Instead it focusses on the standard
 * corporate problem of reuse by copy and paste. Usually this leaves linebreaks
 * and variable names intact. Since we do not need to analyse a parse tree
 * our tool is not tied to a particular programming language.
 * To configure the check so that it handles files with the <span
 * class="code">UTF-8</span> charset:
 * </p>
 * <source>
 * <module name="StrictDuplicateCode">
 * <property name="charset" value="UTF-8"/>
 * </module>
 * </source>
 * </subsection>
 *
 * <subsection name="Package">
 * <p>com.puppycrawl.tools.checkstyle.checks.duplicates</p>
 * </subsection>
 *
 * <subsection name="Parent Module">
 * <p>
 * <a href="config.html#Checker">Checker</a>
 * </p>
 * </subsection>
 *
 * @author Lars Kühne
 */
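The class comment in this hunk describes a line-by-line comparison that ignores indentation and Java import statements and reports runs of at least `min` equal lines. A minimal sketch of that idea follows, assuming a brute-force comparison of two files already held in memory. The names `StrictDuplicateSketch`, `findDuplicates`, `sameLine` and `normalize` are hypothetical, and this is not the Checkstyle implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the code touched by this commit. */
final class StrictDuplicateSketch {

    /** Drops indentation; returns null for import statements, which are ignored. */
    static String normalize(String line) {
        String trimmed = line.trim();
        return trimmed.startsWith("import ") ? null : trimmed;
    }

    /** Two lines match if they are equal after trimming and neither is an import. */
    static boolean sameLine(String x, String y) {
        String nx = normalize(x);
        String ny = normalize(y);
        return nx != null && nx.equals(ny);
    }

    /**
     * Returns {startA, startB, length} (0-based) for every maximal run of at
     * least min matching lines between the two files.
     */
    static List<int[]> findDuplicates(List<String> a, List<String> b, int min) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < a.size(); i++) {
            for (int j = 0; j < b.size(); j++) {
                // Skip positions that only continue a run starting one line earlier.
                if (i > 0 && j > 0 && sameLine(a.get(i - 1), b.get(j - 1))) {
                    continue;
                }
                int len = 0;
                while (i + len < a.size() && j + len < b.size()
                        && sameLine(a.get(i + len), b.get(j + len))) {
                    len++;
                }
                if (len >= min) {
                    result.add(new int[] {i, j, len});
                }
            }
        }
        return result;
    }
}
```

Blank lines and javadoc lines take part in the comparison like any other line, which mirrors the "strict" behaviour described in the comment. The quadratic scan is chosen for readability only; with `min` set to the documented default of 12, shorter runs are not reported.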
@@ -1,7 +1,7 @@
<html>
<body>
<p>
<span class="xdocspagetitle">Duplicate code detection</span> allows you to find
<span class="xdocspagetitle">Duplicate code</span> detection allows you to find
code that has been generated by Copy/Paste programming. Duplicate code typically
leads to higher maintenance cost because bugs will need to be fixed twice,
more code needs to be tested, etc.
@@ -28,16 +28,49 @@ Note that there are brilliant commercial implementations of duplicate code
detection tools. One that is particularly noteworthy is
<a href="http://www.redhillconsulting.com.au/products/simian/">Simian</a>
from RedHill Consulting, Inc.
</p>
<p>
Simian has managed to find a very good balance of the above tradeoffs.
It is superior to the checks in this package in many respects.
Simian is reasonably priced (free for noncommercial projects)
and includes a Checkstyle plugin.
</p>
<p>
The following table summarizes the characteristics of the available
Checkstyle plugins for duplicate code detection:
</p>

<table>
<tr>
<th>Name</th>
<th>Speed</th>
<th>Memory Usage</th>
<th>False Alarms</th>
<th>Supported languages</th>
<th>Fuzzy matches</th>
</tr>
<tr>
<td>StrictDuplicateCode</td>
<td>Medium</td>
<td>Very Low</td>
<td>Possible but very unlikely</td>
<td>any language</td>
<td>No</td>
</tr>
<tr>
<td>Simian</td>
<td>Very high</td>
<td>Low</td>
<td>Possible but very unlikely</td>
<td>many languages, including Java and C/C++/C#</td>
<td>Limited support</td>
</tr>
</table>

<p>
<strong>
We encourage all users of Checkstyle to evaluate Simian as an
alternative to the Checks we offer in our distribution.
</strong>
</p>

</body>
</html>
@@ -1,143 +0,0 @@
<?xml version="1.0" encoding="ISO-8859-1"?>

<document>

<properties>
<title>Duplicate Code</title>
<author email="checkstyle-devel@lists.sourceforge.net">Checkstyle Development Team</author>
</properties>

<body>

<p>
There are many trade-offs when writing a duplicate code detection tool.
Some of the conflicting goals are:
<ul>
<li>Fast</li>
<li>Low memory usage</li>
<li>Avoid false alarms</li>
<li>Support multiple/arbitrary languages</li>
<li>
Support Fuzzy matches (comments, whitespace, linebreaks, variable
renaming, etc.)
</li>
</ul>
</p>

<p>
Note that there are brilliant commercial implementations of duplicate
code detection tools. One that is particularly noteworthy is <a
href="http://www.redhillconsulting.com.au/products/simian/">Simian</a>
from RedHill Consulting, Inc.
</p>

<p>
Simian is reasonably priced (free for noncommercial projects) and
includes a Checkstyle plugin. We encourage all users of Checkstyle to
evaluate Simian as an alternative to the Checks we offer in our
distribution.
</p>

<p>
The following table summarizes the characteristics of the available
Checkstyle plugins for duplicate code detection:
</p>

<table>
<tr>
<th>Name</th>
<th>Speed</th>
<th>Memory Usage</th>
<th>False Alarms</th>
<th>Supported languages</th>
<th>Fuzzy matches</th>
</tr>
<tr>
<td>StrictDuplicateCode</td>
<td>Medium</td>
<td>Very Low</td>
<td>Possible but very unlikely</td>
<td>any language</td>
<td>No</td>
</tr>
<tr>
<td>Simian</td>
<td>Very high</td>
<td>Low</td>
<td>Possible but very unlikely</td>
<td>many languages, including Java and C/C++/C#</td>
<td>Limited support</td>
</tr>
</table>

<section name="StrictDuplicateCode">
<p>
Performs a line-by-line comparison of all code lines and reports
duplicate code, i.e. a sequence of lines that differ only in
indentation. All import statements in Java code are ignored, any
other line - including javadoc, whitespace lines between methods,
etc. - is considered (which is why the check is called
<em>strict</em>).
</p>

<subsection name="Properties">
<table>
<tr>
<th>name</th>
<th>description</th>
<th>type</th>
<th>default value</th>
</tr>
<tr>
<td>min</td>
<td>how many lines must be equal to be considered a duplicate</td>
<td><a href="property_types.html#int">int</a></td>
<td><span class="default">12</span></td>
</tr>
<tr>
<td>charset</td>
<td>name of the file charset</td>
<td><a href="property_types.html#string">String</a></td>
<td>System property "file.encoding"</td>
</tr>
</table>
</subsection>

<subsection name="Examples">
<p> To configure the check: </p>
<source>
<module name="StrictDuplicateCode"/>
</source>

<p>
To configure the check so that it allows larger equivalent blocks:
</p>
<source>
<module name="StrictDuplicateCode">
<property name="min" value="15"/>
</module>
</source>

<p>
To configure the check so that it handles files with the <span
class="code">UTF-8</span> charset:
</p>
<source>
<module name="StrictDuplicateCode">
<property name="charset" value="UTF-8"/>
</module>
</source>
</subsection>

<subsection name="Package">
<p> com.puppycrawl.tools.checkstyle.checks.duplicates </p>
</subsection>

<subsection name="Parent Module">
<p>
<a href="config.html#checker">Checker</a>
</p>
</subsection>
</section>
</body>
</document>
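The properties table documents `charset` as defaulting to the system property "file.encoding". One way such a fallback could be resolved before reading a file is sketched below. The class `CharsetSketch` and its methods are hypothetical names for illustration, and the final fallback to the JVM default charset is an assumption, not something the documentation states.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical illustration only; not the Checkstyle implementation. */
final class CharsetSketch {

    /** Uses the configured name, else the "file.encoding" system property. */
    static Charset resolve(String configured) {
        String name = (configured == null || configured.isEmpty())
                ? System.getProperty("file.encoding")
                : configured;
        // Assumption: if the property is unset too, use the JVM default charset.
        return name == null ? Charset.defaultCharset() : Charset.forName(name);
    }

    /** Reads all lines of a file with the resolved charset. */
    static List<String> readLines(Path file, String configured) throws IOException {
        return Files.readAllLines(file, resolve(configured));
    }
}
```

Passing `"UTF-8"` corresponds to the `<property name="charset" value="UTF-8"/>` example; passing `null` corresponds to leaving the property unset.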