bug #760014: autogenerate check documentation for duplicate code from javadoc

Merged the existing text from the old config_duplicates.xml into package.html and the Java class documentation.
Lars Kühne 2005-01-09 14:55:41 +00:00
parent da1cb4309e
commit 08d574454c
3 changed files with 98 additions and 158 deletions


@ -29,23 +29,73 @@ import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
/**
* Checks for duplicate code that only differs by indentation.
* <p>
* Performs a line-by-line comparison of all code lines and reports
* duplicate code if a sequence of lines differs only in
* indentation. All import statements in Java code are ignored, any
* other line - including javadoc, whitespace lines between methods,
* etc. - is considered (which is why the check is called
* <em>strict</em>).
* </p>
*
* <subsection name="Properties">
* <table>
* <tr>
* <th>name</th>
* <th>description</th>
* <th>type</th>
* <th>default value</th>
* </tr>
* <tr>
* <td>min</td>
* <td>how many lines must be equal to be considered a duplicate</td>
* <td><a href="property_types.html#int">int</a></td>
* <td><span class="default">12</span></td>
* </tr>
* <tr>
* <td>charset</td>
* <td>name of the file charset</td>
* <td><a href="property_types.html#string">String</a></td>
* <td>System property &quot;file.encoding&quot;</td>
* </tr>
* </table>
* </subsection>
*
* <subsection name="Examples">
* <p> To configure the check: </p>
* <source>
* &lt;module name=&quot;StrictDuplicateCode&quot;/&gt;
* </source>
*
* <p>
* There are many approaches for detecting duplicate code. Some involve
* parsing a file of a programming language and analyzing the source trees
* of all files. This is a very powerful approach for a specific programming
* language (such as Java), as it can potentially even detect duplicate code
* where linebreaks have been changed, variables have been renamed, etc.
* To configure the check so that it allows larger equivalent blocks:
* </p>
* <source>
* &lt;module name=&quot;StrictDuplicateCode&quot;&gt;
* &lt;property name=&quot;min&quot; value=&quot;15&quot;/&gt;
* &lt;/module&gt;
* </source>
*
* <p>
* This copy and paste detection implementation works differently.
* It cannot detect copy and paste code where the author deliberately
* tries to hide his copy+paste action. Instead it focusses on the standard
* corporate problem of reuse by copy and paste. Usually this leaves linebreaks
* and variable names intact. Since we do not need to analyse a parse tree
* our tool is not tied to a particular programming language.
* To configure the check so that it handles files with the <span
* class="code">UTF-8</span> charset:
* </p>
* <source>
* &lt;module name=&quot;StrictDuplicateCode&quot;&gt;
* &lt;property name=&quot;charset&quot; value=&quot;UTF-8&quot;/&gt;
* &lt;/module&gt;
* </source>
* </subsection>
*
* <subsection name="Package">
* <p>com.puppycrawl.tools.checkstyle.checks.duplicates</p>
* </subsection>
*
* <subsection name="Parent Module">
* <p>
* <a href="config.html#Checker">Checker</a>
* </p>
* </subsection>
*
* @author Lars K&uuml;hne
*/
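
The Javadoc above describes the check as a line-by-line comparison that ignores indentation and import statements and reports runs of at least `min` equal lines. The following is a minimal sketch of that idea, not Checkstyle's actual implementation: the class name `StrictDuplicateSketch`, the helpers `normalize` and `findDuplicates`, the `startsWith("import ")` test, and the quadratic table are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and algorithm are hypothetical. */
final class StrictDuplicateSketch {

    /**
     * Strips indentation. Returns null for import statements so that
     * they never take part in a match (a simplification of "ignored").
     */
    static String normalize(String line) {
        String trimmed = line.trim();
        return trimmed.startsWith("import ") ? null : trimmed;
    }

    /**
     * Finds maximal runs of lines that are equal after normalization.
     * Each result is {startLineInA, startLineInB, length}, 1-based.
     * Blank lines and comments are compared like any other line,
     * which is what makes the comparison "strict".
     */
    static List<int[]> findDuplicates(List<String> a, List<String> b, int min) {
        // run[i][j] = length of the equal run ending at a[i-1] and b[j-1]
        int[][] run = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            String left = normalize(a.get(i - 1));
            if (left == null) {
                continue; // import line: cannot match anything
            }
            for (int j = 1; j <= b.size(); j++) {
                if (left.equals(normalize(b.get(j - 1)))) {
                    run[i][j] = run[i - 1][j - 1] + 1;
                }
            }
        }

        List<int[]> result = new ArrayList<>();
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                int len = run[i][j];
                // Report a run only at its last line, so each block appears once.
                boolean continues =
                    i < a.size() && j < b.size() && run[i + 1][j + 1] > 0;
                if (len >= min && !continues) {
                    result.add(new int[] {i - len + 1, j - len + 1, len});
                }
            }
        }
        return result;
    }
}
```

Calling `findDuplicates(linesOfFileA, linesOfFileB, 12)` would mirror the documented default of `min`. The table uses memory proportional to the product of the two file lengths, so this sketch does not reflect the "Very Low" memory usage the package documentation attributes to the real check.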


@ -1,7 +1,7 @@
<html>
<body>
<p>
<span class="xdocspagetitle">Duplicate code detection</span> allows you to find
<span class="xdocspagetitle">Duplicate code</span> detection allows you to find
code that has been generated by Copy/Paste programming. Duplicate code typically
leads to higher maintenance cost because bugs will need to be fixed twice,
more code needs to be tested, etc.
@ -28,16 +28,49 @@ Note that there are brilliant commercial implementations of duplicate code
detection tools. One that is particularly noteworthy is
<a href="http://www.redhillconsulting.com.au/products/simian/">Simian</a>
from RedHill Consulting, Inc.
</p>
<p>
Simian has managed to find a very good balance of the above tradeoffs.
It is superior to the checks in this package in many respects.
Simian is reasonably priced (free for noncommercial projects)
and includes a Checkstyle plugin.
</p>
<p>
The following table summarizes the characteristics of the available
Checkstyle plugins for duplicate code detection:
</p>
<table>
<tr>
<th>Name</th>
<th>Speed</th>
<th>Memory Usage</th>
<th>False Alarms</th>
<th>Supported languages</th>
<th>Fuzzy matches</th>
</tr>
<tr>
<td>StrictDuplicateCode</td>
<td>Medium</td>
<td>Very Low</td>
<td>Possible but very unlikely</td>
<td>any language</td>
<td>No</td>
</tr>
<tr>
<td>Simian</td>
<td>Very high</td>
<td>Low</td>
<td>Possible but very unlikely</td>
<td>many languages, including Java and C/C++/C#</td>
<td>Limited support</td>
</tr>
</table>
<p>
<strong>
We encourage all users of Checkstyle to evaluate Simian as an
alternative to the Checks we offer in our distribution.
</strong>
</p>
</body>
</html>


@ -1,143 +0,0 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<document>
<properties>
<title>Duplicate Code</title>
<author email="checkstyle-devel@lists.sourceforge.net">Checkstyle Development Team</author>
</properties>
<body>
<p>
There are many trade-offs when writing a duplicate code detection tool.
Some of the conflicting goals are:
<ul>
<li>Fast</li>
<li>Low memory usage</li>
<li>Avoid false alarms</li>
<li>Support multiple/arbitrary languages</li>
<li>
Support Fuzzy matches (comments, whitespace, linebreaks, variable
renaming, etc.)
</li>
</ul>
</p>
<p>
Note that there are brilliant commercial implementations of duplicate
code detection tools. One that is particularly noteworthy is <a
href="http://www.redhillconsulting.com.au/products/simian/">Simian</a>
from RedHill Consulting, Inc.
</p>
<p>
Simian is reasonably priced (free for noncommercial projects) and
includes a Checkstyle plugin. We encourage all users of Checkstyle to
evaluate Simian as an alternative to the Checks we offer in our
distribution.
</p>
<p>
The following table summarizes the characteristics of the available
Checkstyle plugins for duplicate code detection:
</p>
<table>
<tr>
<th>Name</th>
<th>Speed</th>
<th>Memory Usage</th>
<th>False Alarms</th>
<th>Supported languages</th>
<th>Fuzzy matches</th>
</tr>
<tr>
<td>StrictDuplicateCode</td>
<td>Medium</td>
<td>Very Low</td>
<td>Possible but very unlikely</td>
<td>any language</td>
<td>No</td>
</tr>
<tr>
<td>Simian</td>
<td>Very high</td>
<td>Low</td>
<td>Possible but very unlikely</td>
<td>many languages, including Java and C/C++/C#</td>
<td>Limited support</td>
</tr>
</table>
<section name="StrictDuplicateCode">
<p>
Performs a line-by-line comparison of all code lines and reports
duplicate code, i.e. a sequence of lines that differ only in
indentation. All import statements in Java code are ignored, any
other line - including javadoc, whitespace lines between methods,
etc. - is considered (which is why the check is called
<em>strict</em>).
</p>
<subsection name="Properties">
<table>
<tr>
<th>name</th>
<th>description</th>
<th>type</th>
<th>default value</th>
</tr>
<tr>
<td>min</td>
<td>how many lines must be equal to be considered a duplicate</td>
<td><a href="property_types.html#int">int</a></td>
<td><span class="default">12</span></td>
</tr>
<tr>
<td>charset</td>
<td>name of the file charset</td>
<td><a href="property_types.html#string">String</a></td>
<td>System property &quot;file.encoding&quot;</td>
</tr>
</table>
</subsection>
<subsection name="Examples">
<p> To configure the check: </p>
<source>
&lt;module name=&quot;StrictDuplicateCode&quot;/&gt;
</source>
<p>
To configure the check so that it allows larger equivalent blocks:
</p>
<source>
&lt;module name=&quot;StrictDuplicateCode&quot;&gt;
&lt;property name=&quot;min&quot; value=&quot;15&quot;/&gt;
&lt;/module&gt;
</source>
<p>
To configure the check so that it handles files with the <span
class="code">UTF-8</span> charset:
</p>
<source>
&lt;module name=&quot;StrictDuplicateCode&quot;&gt;
&lt;property name=&quot;charset&quot; value=&quot;UTF-8&quot;/&gt;
&lt;/module&gt;
</source>
</subsection>
<subsection name="Package">
<p> com.puppycrawl.tools.checkstyle.checks.duplicates </p>
</subsection>
<subsection name="Parent Module">
<p>
<a href="config.html#checker">Checker</a>
</p>
</subsection>
</section>
</body>
</document>