575 lines
24 KiB
XML
575 lines
24 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
|
|
<document xmlns="http://maven.apache.org/XDOC/2.0"
|
|
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
|
|
xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
|
|
|
|
<properties>
|
|
<title>Writing Checks</title>
|
|
<author>Lars Kühne</author>
|
|
</properties>
|
|
|
|
<body>
|
|
|
|
<section name="Overview">
|
|
|
|
<p>
|
|
OK, so you have finally decided to write your own Check. Welcome
|
|
aboard, this is really a easy thing to do. Very basic Java knowledge is required
|
|
to write a Check, it is good practice for even for student.
|
|
There are actually two
|
|
kinds of Checks, so before you can start, you have to find out
|
|
which kind of Check you want to implement.
|
|
</p>
|
|
|
|
<p>
|
|
The functionality of Checkstyle is implemented in modules that can
|
|
be plugged into Checkstyle. Modules can be containers for other
|
|
modules, i.e. they form a tree structure. The toplevel modules
|
|
that are known directly to the Checkstyle kernel (which is also a
|
|
module and forms the root of the tree) implement the <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/FileSetCheck.html">FileSetCheck</a>
|
|
interface. These are pretty simple to grasp: they take a set of
|
|
input files and fire error messages.
|
|
</p>
|
|
|
|
<p>
|
|
Checkstyle provides a few FileSetCheck implementations by default
|
|
and one of them happens to be the <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/TreeWalker.html">TreeWalker</a>. A
|
|
TreeWalker supports submodules that are derived from the <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/Check.html">Check</a>
|
|
class. The TreeWalker operates by separately transforming each of
|
|
the Java input files into an abstract syntax tree and then handing
|
|
the result over to each of the Check submodules which in turn have
|
|
a look at a certain aspect of the tree.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Writing Checks">
|
|
|
|
<p>
|
|
Most of the functionality of Checkstyle is implemented as
|
|
Checks. If you know how to write your own Checks, you can extend
|
|
Checkstyle according to your needs without having to wait for the
|
|
Checkstyle development team. You are about to become a Checkstyle
|
|
Expert.
|
|
</p>
|
|
|
|
<p>
|
|
Suppose you have a convention that the number of methods in a
|
|
class should not exceed a certain limit, say 30. This rule makes
|
|
sense, a class should only do one thing and do it well. With a
|
|
zillion methods chances are that the class does more than one
|
|
thing. The only problem you have is that your convention is not
|
|
checked by Checkstyle, so you'll have to write your own Check
|
|
and plug it into the Checkstyle framework.
|
|
</p>
|
|
|
|
<p> This chapter is organized as a tour that takes you
|
|
through the process step by step and explains both the theoretical
|
|
foundations and the <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/package-summary.html">Checkstyle
|
|
API</a> along the way. </p>
|
|
|
|
</section>
|
|
|
|
<section name="Java Grammar">
|
|
|
|
<p>
|
|
Every Java Program is structured into files, and each of these
|
|
files has a certain structure. For example, if there is a
|
|
package statement then it is the first line of the file that is
|
|
not comment or whitespace. After the package statement comes a
|
|
list of import statements, which is followed by a class or
|
|
interface definition, and so on.
|
|
</p>
|
|
<p>
|
|
If you have ever read an introductory level Java book you probably
|
|
knew all of the above. And if you have studied computer science,
|
|
you probably also know that the rules that specify the Java language
|
|
can be formally specified using a grammar (statement is simplified
|
|
for didactic purposes).
|
|
</p>
|
|
<p>
|
|
There are tools which read a grammar definition and produce a parser
|
|
for the language that is specified in the grammar. In other
|
|
words, the output of the tool is a program that can transform a
|
|
stream of characters (a Java file) into a tree representation
|
|
that reflects the structure of the file. Checkstyle uses the
|
|
parser generator <a href="http://www.antlr.org">ANTLR</a> but
|
|
that is an implementation detail you do not need to worry about
|
|
when writing Checks, as well tested parser will parse Java file for you.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="The Checkstyle SDK Gui">
|
|
|
|
<p>
|
|
Still with us? Great, you have mastered the basic theory so here
|
|
is your reward - a GUI that displays the structure of a Java source file. To run it type
|
|
</p>
|
|
<source>
|
|
java -cp checkstyle-${projectVersion}-all.jar com.puppycrawl.tools.checkstyle.gui.Main
|
|
</source>
|
|
|
|
<p>
|
|
on the command line. Click the button at the bottom of the frame
|
|
and select a syntactically correct Java source file. The frame
|
|
will be populated with a tree that corresponds to the structure
|
|
of the Java source code.
|
|
</p>
|
|
|
|
<p>
|
|
<img alt="screenshot" src="images/gui_screenshot.png"/>
|
|
</p>
|
|
|
|
<p> In the leftmost column you can open and close branches
|
|
of the tree, the remaining columns display information about each node
|
|
in the tree. The second column displays a token type for each node. As
|
|
you navigate from the root of the tree to one of the leafs, you'll
|
|
notice that the token type denotes smaller and smaller units of your
|
|
source file, i.e. close to the root you might see the token type <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/TokenTypes.html#CLASS_DEF">CLASS_DEF</a>
|
|
(a node that represents a class definition) while you will see token
|
|
types like IDENT (an identifier) near the leaves of the tree. </p>
|
|
|
|
<p>
|
|
We'll get back to the details in the other columns later, they
|
|
are important for implementing Checks but not for understanding
|
|
the basic concepts. For now it is sufficient to know that the
|
|
gui is a tool that lets you look at the structure of a Java file, i.e. you can see the Java grammar 'in action'.
|
|
</p>
|
|
|
|
<p> If you use <a href="https://eclipse.org">Eclipse</a> you can install
|
|
<a href="https://github.com/sevntu-checkstyle/checkstyle-ast-eclipse-viewer">
|
|
Checkstyle AST Eclipse Viewer</a>
|
|
plugin to launch that application from context menu on any file in Eclipse.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Understanding the visitor pattern">
|
|
|
|
<p>
|
|
Ready for a bit more theory? The last bit
|
|
that is missing before you can start writing Checks is understanding
|
|
the <a href="http://en.wikipedia.org/wiki/Visitor_pattern">Visitor pattern</a>.
|
|
</p>
|
|
|
|
<p>
|
|
When working with <a href="http://en.wikipedia.org/wiki/Abstract_syntax_tree">
|
|
Abstract Syntax Tree (AST)</a>, a simple approach to define check operations
|
|
on them would be to add a <code>check()</code> method to the Class that defines
|
|
the AST nodes. For example, our AST type could have a method
|
|
<code>checkNumberOfMethods()</code>. Such an approach would suffer from a few
|
|
serious drawbacks. Most importantly, it does not provide an extensible
|
|
design, i.e. the Checks have to be known at compile time; there is no
|
|
way to write plugins.
|
|
</p>
|
|
|
|
<p> Hence Checkstyle's AST classes do not have any
|
|
methods that implement checking functionality. Instead,
|
|
Checkstyle's <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/TreeWalker.html">TreeWalker</a>
|
|
takes a set of objects that conform to a <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/Check.html">Check</a>
|
|
interface. OK, you're right - actually it's not an interface
|
|
but an abstract class to provide some helper methods. A Check provides
|
|
methods that take an AST as an argument and perform the checking process
|
|
for that AST, most prominently <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/Check.html#visitToken(com.puppycrawl.tools.checkstyle.api.DetailAST)"><code>visitToken()</code></a>. </p>
|
|
|
|
<p> It is important to understand that the individual
|
|
Checks do no drive the AST traversal (it possible to traverse itself, but not recommended).
|
|
Instead, the TreeWalker initiates
|
|
a recursive descend from the root of the AST to the leaf nodes and calls
|
|
the Check methods. The traversal is done using a <a
|
|
href="http://mathworld.wolfram.com/Depth-FirstTraversal.html">depth-first</a>
|
|
algorithm. </p>
|
|
|
|
<p> Before any visitor method is called, the TreeWalker
|
|
will call <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/Check.html#beginTree(com.puppycrawl.tools.checkstyle.api.DetailAST)"><code>beginTree()</code></a> to give the Check a chance to do
|
|
some initialization. Then, when performing the recursive descend from
|
|
the root to the leaf nodes, the <code>visitToken()</code>
|
|
method is called. Unlike the basic examples in the pattern book, there
|
|
is a <code>visitToken()</code> counterpart called <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/Check.html#leaveToken(com.puppycrawl.tools.checkstyle.api.DetailAST)"><code>leaveToken()</code></a>. The TreeWalker will call that
|
|
method to signal that the subtree below the node has been processed and
|
|
the TreeWalker is backtracking from the node. After the root node has
|
|
been left, the TreeWalker will call <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/Check.html#finishTree(com.puppycrawl.tools.checkstyle.api.DetailAST)"><code>finishTree()</code></a>. </p>
|
|
|
|
</section>
|
|
|
|
<section name="Visitor in action">
|
|
|
|
<p>
|
|
Let's get back to our example and start writing code - that's why
|
|
you came here, right?
|
|
When you fire up the Checkstyle GUI and look at a few source
|
|
files you'll figure out pretty quickly that you are mainly
|
|
interested in the number of tree nodes of type METHOD_DEF. The
|
|
number of such tokens should be counted separately for each
|
|
CLASS_DEF / INTERFACE_DEF.
|
|
</p>
|
|
|
|
<p>
|
|
Hence we need to register the Check for the token types
|
|
CLASS_DEF and INTERFACE_DEF. The TreeWalker will only call
|
|
visitToken for these token types. Because the requirements of
|
|
our tasks are so simple, there is no need to implement the other
|
|
fancy methods, like <code>finishTree()</code>, etc., so here is our first
|
|
shot at our Check implementation:
|
|
</p>
|
|
|
|
<source>
|
|
package com.mycompany.checks;
|
|
import com.puppycrawl.tools.checkstyle.api.*;
|
|
|
|
public class MethodLimitCheck extends Check
|
|
{
|
|
private static final int DEFAULT_MAX = 30;
|
|
private int max = DEFAULT_MAX;
|
|
|
|
@Override
|
|
public int[] getDefaultTokens()
|
|
{
|
|
return new int[]{TokenTypes.CLASS_DEF, TokenTypes.INTERFACE_DEF};
|
|
}
|
|
|
|
@Override
|
|
public void visitToken(DetailAST ast)
|
|
{
|
|
// find the OBJBLOCK node below the CLASS_DEF/INTERFACE_DEF
|
|
DetailAST objBlock = ast.findFirstToken(TokenTypes.OBJBLOCK);
|
|
|
|
// count the number of direct children of the OBJBLOCK
|
|
// that are METHOD_DEFS
|
|
int methodDefs = objBlock.getChildCount(TokenTypes.METHOD_DEF);
|
|
|
|
// report error if limit is reached
|
|
if (methodDefs > this.max) {
|
|
String message = "too many methods, only " + this.max + " are allowed";
|
|
log(ast.getLineNo(), message);
|
|
}
|
|
}
|
|
}
|
|
</source>
|
|
|
|
</section>
|
|
|
|
<section name="Navigating the AST">
|
|
|
|
<p>
|
|
In the example above you already saw that the DetailsAST class
|
|
provides utility methods to extract information from the tree,
|
|
like <code>getChildCount()</code>. By now you have
|
|
probably consulted the API documentation and found that
|
|
<a href="https://github.com/checkstyle/checkstyle/blob/master/src/main/java/com/puppycrawl/tools/checkstyle/api/DetailAST.java">DetailsAST</a> additionally provides methods for navigating around
|
|
in the syntax tree, like finding the next sibling of a node, the
|
|
children of a node, the parent of a node, etc.
|
|
</p>
|
|
|
|
<p>
|
|
These methods provide great power for developing complex
|
|
Checks. Most of the Checks that Checkstyle provides by default
|
|
use these methods to analyze the environment of the ASTs that
|
|
are visited by the TreeWalker. Don't abuse that feature for
|
|
exploring the whole tree, though. Let the TreeWalker drive the
|
|
tree traversal and limit the visitor to the neighbours of a
|
|
single AST.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Defining Check Properties">
|
|
|
|
<p>
|
|
|
|
OK Mr. Checkstyle, that's all very nice but in my company we
|
|
have several projects, and each has another number of allowed
|
|
methods. I need to control my Check through properties, so where
|
|
is the API to do that?
|
|
</p>
|
|
|
|
<p>
|
|
Well, the short answer is, there is no API. It's magic. Really!
|
|
</p>
|
|
|
|
<p>
|
|
If you need to make something configurable, just add a setter method
|
|
to the Check:
|
|
</p>
|
|
|
|
<source>
|
|
public class MethodLimitCheck extends Check
|
|
{
|
|
// code from above omitted for brevity
|
|
public void setMax(int limit)
|
|
{
|
|
max = limit;
|
|
}
|
|
}
|
|
</source>
|
|
|
|
<p>
|
|
With this code added, you can set the property <code>max</code> for the MethodLimitCheck module in the
|
|
configuration file. It doesn't get any simpler than that. The secret is
|
|
that Checkstyle uses <a href="https://docs.oracle.com/javase/tutorial/reflect/member/fieldValues.html">
|
|
JavaBean reflection to set the JavaBean properties</a>. That works for all primitive types like boolean,
|
|
int, long, etc., plus Strings, plus arrays of these types.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Logging errors">
|
|
|
|
<p>
|
|
Detecting errors is one thing, presenting them to the user is
|
|
another. To do that, the Check base class provides several log
|
|
methods, the simplest of them being <code>Check.log(String)</code>. In your
|
|
Check you can simply use a verbatim error string like in <code>log("Too many methods, only " + mMax +
|
|
" are allowed");</code> as the argument. That will
|
|
work, but it's not the best possible solution if your Check is
|
|
intended for a wider audience.
|
|
</p>
|
|
|
|
<p>
|
|
If you are not living in a country where people speak English,
|
|
you may have noticed that Checkstyle writes internationalized
|
|
error messages, for example if you live in Germany the error
|
|
messages are German. The individual Checks don't have to do
|
|
anything fancy to achieve this, it's actually quite easy and the
|
|
Checkstyle framework does most of the work.
|
|
</p>
|
|
|
|
<p>
|
|
To support internationalized error messages, you need to create or reuse existing
|
|
a messages.properties file alongside your Check class
|
|
(<a href="https://github.com/checkstyle/checkstyle/blob/master/src/main/resources/com/puppycrawl/tools/checkstyle/checks/sizes/messages.properties">example</a>)
|
|
, i.e. the
|
|
Java file and the properties files should be in the same
|
|
directory. Add a symbolic error code and an English
|
|
representation to the messages.properties. The file should
|
|
contain the following line: <code>too.many.methods=Too many methods, only {0} are
|
|
allowed</code>. Then replace the verbatim error message with
|
|
the symbolic representation and use one of the log helper
|
|
methods to provide the dynamic part of the message (mMax in this
|
|
case): <code>log("too.many.methods",
|
|
mMax);</code>. Please consult the documentation of Java's <a
|
|
href="http://java.sun.com/j2se/1.4.1/docs/api/java/text/MessageFormat.html">MessageFormat</a>
|
|
to learn about the syntax of format strings (especially about
|
|
those funny numbers in the translated text).
|
|
</p>
|
|
|
|
<p>
|
|
Supporting a new language is very easy now, simply create a new
|
|
messages file for the language, e.g. messages_fr.properties to
|
|
provide French error messages. The correct file will be chosen
|
|
automatically, based on the language settings of the user's
|
|
operating system.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Integrate your Check">
|
|
|
|
<p>
|
|
The great final moment has arrived, you are about to run your
|
|
Check. To integrate your Check, add a new subentry under the
|
|
TreeWalker module of your configuration file. Use the full
|
|
classname of your Check class as the name of the module.
|
|
Your configuration file <code>config.xml</code> should look something like this:
|
|
</p>
|
|
|
|
<source>
|
|
<?xml version="1.0"?>
|
|
<!DOCTYPE module PUBLIC
|
|
"-//Puppy Crawl//DTD Check Configuration 1.3//EN"
|
|
"http://www.puppycrawl.com/dtds/configuration_1_3.dtd">
|
|
<module name="Checker">
|
|
<module name="TreeWalker">
|
|
<!-- your standard Checks that come with Checkstyle -->
|
|
<module name="UpperEll"/>
|
|
<module name="MethodLength"/>
|
|
<!-- your Check goes here -->
|
|
<module name="com.mycompany.checks.MethodLimitCheck">
|
|
<property name="max" value="45"/>
|
|
</module>
|
|
</module>
|
|
</module>
|
|
</source>
|
|
|
|
<p>
|
|
To run the new Check on the command line compile your Check,
|
|
create a jar that contains the classes and property files,
|
|
e.g. <code>mycompanychecks.jar</code>. Then run
|
|
(with the path separator adjusted to your platform):
|
|
</p>
|
|
|
|
<source>
|
|
java -classpath mycompanychecks.jar:checkstyle-${projectVersion}-all.jar \
|
|
-c config.xml
|
|
</source>
|
|
|
|
<p>
|
|
Did you see all those errors about "too many methods"
|
|
flying over your screen? Congratulations. You can now consider
|
|
yourself a Checkstyle expert. Go to your fridge. Have a beer.
|
|
</p>
|
|
|
|
<p>
|
|
Please consult the <a href="config.html#Packages">Checkstyle
|
|
configuration manual</a> to learn how to integrate your Checks
|
|
into the package configuration so that you can use <code>MethodLimit</code> instead of the full class name.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Limitations">
|
|
|
|
<p>
|
|
OK, so you have written your first Check, and you have found
|
|
several flaws in many of your programs. You now know that your
|
|
boss does not follow the coding conventions he wrote. And you
|
|
know that you are the king of the world. To become a programming
|
|
god, you want to write your second Check - now wait, first you
|
|
should know what your limits are.
|
|
</p>
|
|
|
|
<p>
|
|
There are basically only few of them:
|
|
</p>
|
|
<ul>
|
|
<li>Java code should be written with <a href="http://en.wikipedia.org/wiki/ASCII">ASCII</a> characters only, no UTF-8 support.</li>
|
|
<li>To get valid violations, code have to be compilable, in other case you can get not easy to understand parse errors.</li>
|
|
<li>You cannot determine the type of an expression. Example: "getValue() + getValue2()"</li>
|
|
<li>You cannot determine the full inheritance hierarchy of type.</li>
|
|
<li>You cannot see the content of other files. You have content of one file only during all Checks execution. All files are processed one by one.</li>
|
|
</ul>
|
|
<p>
|
|
This means that you cannot implement some of the code inspection
|
|
features that are available in advanced IDEs like <a
|
|
href="http://www.intellij.com/idea/">IntelliJ IDEA</a>,
|
|
<a href="http://findbugs.sourceforge.net/">Findbug</a>,
|
|
<a href="http://pmd.sourceforge.net/">PMD</a>,
|
|
<a href="http://www.sonarqube.org/">Sonarqube</a>.
|
|
</p>
|
|
|
|
<p>
|
|
For example you will not be able to implement:
|
|
<br/>
|
|
- a Check that finds redundant type casts or unused public methods.
|
|
<br/>
|
|
- a Check that validate that user custom Exception class inherited from java.lang.Exception class.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Writing FileSetChecks">
|
|
|
|
<p>Writing a FileSetCheck usually required when you do not need parse Java file
|
|
to get inner structure, or you are going to validate non "*.java" files.
|
|
</p>
|
|
|
|
<p> Writing a FileSetCheck is pretty straightforward: Just
|
|
inherit from <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/AbstractFileSetCheck.html">AbstractFileSetCheck</a>
|
|
and override the abstract <a
|
|
href="apidocs/com/puppycrawl/tools/checkstyle/api/AbstractFileSetCheck.html#processFiltered(java.io.File,%20java.util.List)"><code>processFiltered(java.io.File, java.util.List)</code></a> method and you're
|
|
done. A very simple example could fire an error if the number of files
|
|
exceeds a certain limit. Here is a FileSetCheck that does just that:</p>
|
|
|
|
<source>
|
|
package com.mycompany.checks;
|
|
import java.io.File;
|
|
import java.util.List;
|
|
import com.puppycrawl.tools.checkstyle.api.*;
|
|
|
|
public class LimitImplementationFiles extends AbstractFileSetCheck
|
|
{
|
|
private static final int DEFAULT_MAX = 100;
|
|
private int fileCount;
|
|
private int max = DEFAULT_MAX;
|
|
public void setMax(int aMax)
|
|
{
|
|
this.max = aMax;
|
|
}
|
|
|
|
@Override
|
|
public void beginProcessing(String aCharset)
|
|
{
|
|
super.beginProcessing(aCharset);
|
|
|
|
//reset the file count
|
|
this.fileCount = 0;
|
|
}
|
|
|
|
@Override
|
|
public void processFiltered(File file, List<String> aLines)
|
|
{
|
|
this.fileCount++;
|
|
|
|
if (this.fileCount > this.max) {
|
|
// log the message
|
|
log(0, "max.files.exceeded", Integer.valueOf(this.max));
|
|
// you can call log() multiple times to flag multiple
|
|
// errors in the same file
|
|
}
|
|
}
|
|
}
|
|
</source>
|
|
|
|
<p>
|
|
Note that the configuration via bean introspection also applies
|
|
here. By implementing the <code>setMax()</code>
|
|
method the FileSetCheck automatically makes "max" a
|
|
legal configuration parameter that you can use in the Checkstyle
|
|
configuration file.
|
|
</p>
|
|
<p>
|
|
There are virtually no limits what you can do in
|
|
FileSetChecks, but please do not be creazy.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Huh? I can't figure it out!">
|
|
<p>
|
|
That's probably our fault, and it means that we have to provide
|
|
better documentation. Please do not hesitate to ask questions on
|
|
the user <a href="http://checkstyle.sourceforge.net/mail-lists.html">
|
|
mailing lists</a>, this will help us to improve this
|
|
document. Please ask your questions as precisely as possible.
|
|
We will not be able to answer questions like "I want to
|
|
write a Check but I don't know how, can you help me?". Tell
|
|
us what you are trying to do (the purpose of the Check), what
|
|
you have understood so far, and what exactly you are getting stuck
|
|
on.
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section name="Contributing">
|
|
|
|
<p>
|
|
We need <em>your</em> help to keep improving Checkstyle.
|
|
|
|
Whenever you write a Check or FileSetCheck that you think is
|
|
generally useful, please consider
|
|
<a href="contributing.html">contributing</a> it to the
|
|
Checkstyle community and submit it for inclusion in the next
|
|
release of Checkstyle.
|
|
|
|
</p>
|
|
</section>
|
|
|
|
|
|
</body>
|
|
</document>
|
|
|