<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Writing your own checks</title>
<link rel="stylesheet" type="text/css" href="mystyle.css"/>
</head>

<body>

<!-- The header -->
<table border="0" width="100%" summary="header layout">
<tr>
<td>
<h1>Writing your own Checks</h1>
</td>
<td align="right">
<img src="logo.png" alt="Checkstyle Logo"/>
</td>
</tr>
</table>

<!-- content -->
<table border="0" width="100%" cellpadding="5" summary="body layout">
<tr>
<!--Left menu-->
<td class="menu" valign="top">
<p><a href="#overview">Overview</a></p>
<p><a href="#checks">Writing Checks</a></p>
<ul>
<li><a href="#grammar">Java Grammar</a></li>
<li><a href="#gui">Checkstyle GUI</a></li>
<li><a href="#visitor">Visitor Pattern</a></li>
<li><a href="#regtokens">Visitor in Action</a></li>
<li><a href="#astnav">Navigating the AST</a></li>
<li><a href="#logerrors">Logging Errors</a></li>
<li><a href="#integrate">Integrating Checks</a></li>
<li><a href="#limitations">Limitations</a></li>
</ul>
<p><a href="#filesetchecks">Writing FileSetChecks</a></p>
<p><a href="#huh">Huh?</a></p>
</td>

<!--Content-->
<td class="content" valign="top" align="left">
<a name="overview"></a>
<h2>Overview</h2>
<p class="body">
OK, so you have finally decided to write your own check.
Welcome aboard, this is really a fun thing to do. There are
actually two kinds of checks, so before you start you have
to find out which kind of check you want to implement.
</p>

<p class="body">
The functionality of Checkstyle is implemented in modules that
can be plugged into Checkstyle. Modules can be containers for
other modules, i.e. they form a tree structure. The top-level
modules that are known directly to the Checkstyle kernel (which
is also a module and forms the root of the tree) are
FileSetChecks. These are pretty simple to grasp: they take a set
of input files and fire error messages.
</p>

<p class="body">
Checkstyle provides a few FileSetCheck implementations by
default, and one of them happens to be the TreeWalker. A
TreeWalker typically has some submodules, called Checks. The
TreeWalker operates by separately transforming each of the Java
input files into an abstract syntax tree and then handing the
result over to each of the Check submodules, which in turn
look at a certain aspect of the tree.
</p>

<a name="checks"></a>
<h2>Writing Checks</h2>
<p class="body">
Most of the functionality of Checkstyle is implemented as
Checks. If you know how to write your own Checks, you can extend
Checkstyle according to your needs without having to wait for
the Checkstyle development team. You are about to become a
Checkstyle Expert.
</p>

<p class="body">
Suppose you have a convention that the number of methods in a
class should not exceed a certain limit, say 30. This rule makes
sense: a class should only do one thing and do it well. With a
zillion methods, chances are that the class does more than one
thing. The only problem you have is that your convention is not
checked by Checkstyle, so you'll have to write your own check
and plug it into the Checkstyle framework.
</p>

<p class="body">
This chapter is organized as a tour that takes you through the
process step by step and explains both the theoretical
foundations and the Checkstyle API along the way.
</p>

<a name="grammar"></a>
<h3>Java Grammar</h3>
<p class="body">
Every Java program is structured into files, and each of these
files has a certain structure. For example, if there is a
package statement then it is the first line of the file that is
not a comment or whitespace. After the package statement comes a
list of import statements, which is followed by a class or
interface definition, and so on.
</p>
<p class="body">
If you have ever read an introductory level Java book you probably
knew all of the above. And if you have studied computer science,
you probably also know that the rules of the Java language
can be formally specified using a grammar (this statement is
simplified for didactic purposes).
</p>
<p class="body">
Tools exist which read a grammar definition and produce a parser
for the language that is specified in the grammar. In other
words, the output of the tool is a program that can transform a
stream of characters (a Java file) into a tree representation
that reflects the structure of the file. Checkstyle uses the
parser generator <a href="http://www.antlr.org">ANTLR</a>, but
that is an implementation detail you do not need to worry about
when writing checks. Several other parser generators exist and
they all work well.
</p>

<a name="gui"></a>
<h3>The Checkstyle SDK GUI</h3>
<p class="body">
Still with us? Great, you have mastered the basic theory, so here
is your reward: a GUI that displays the structure of a Java
source file. To run it, type
<pre>
java -classpath checkstyle-all-${version}.jar com.puppycrawl.tools.checkstyle.gui.Main
</pre>
</p>
<p class="body">
on the command line. Click the button at the bottom of the frame
and select a syntactically correct Java source file. The frame
will be populated with a tree that corresponds to the structure
of the Java source code.
</p>

<p class="body">
TODO: screenshot
</p>

<p class="body">
In the leftmost column you can open and close branches of the
tree; the remaining columns display information about each node
in the tree. The second column displays a token type for each
node. As you navigate from the root of the tree to one of the
leaves, you'll notice that the token type denotes smaller and
smaller units of your source file, i.e. close to the root you
might see the token type CLASS_DEF (a node that represents a
class definition) while you will see token types like IDENT (an
identifier) near the leaves of the tree.
</p>

<p class="body">
We'll get back to the details in the other columns later; they
are important for implementing checks but not for understanding
the basic concepts. For now it is sufficient to know that the
GUI is a tool that lets you look at the structure of a Java
file, i.e. you can see the Java grammar 'in action'.
</p>

<a name="visitor"></a>
<h3>Understanding the visitor pattern</h3>
<p class="body">
TODO: A brief explanation of the Visitor pattern, xref to
GoF/pattern wiki.
</p>

<a name="regtokens"></a>
<h3>Visitor in action</h3>
<p class="body">
When you fire up the Checkstyle GUI and look at a few source
files, you'll figure out pretty quickly that you are mainly
interested in the number of tree nodes of type METHOD_DEF. The
number of such tokens should be counted separately for each
CLASS_DEF / INTERFACE_DEF.
</p>

<p class="body">
Now you have to decide how constructors are treated. Do they
count as methods for the purposes of your Check? Maybe you
should make that configurable, and we have good news for you:
Checkstyle lets you control the token types for which your
visitor methods are called.
</p>

<p class="body">
TODO: Explain how. Explain the visitor methods
(visitToken, leaveToken, beginTree, endTree).
</p>
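<p class="body">
Until that explanation is written, the counting scheme described
above can be sketched without the Checkstyle API. In the following
self-contained Java sketch, Node, Visitor, walk and MethodLimit are
hypothetical stand-ins invented for illustration. Only the method
names visitToken and leaveToken and the token type names are taken
from Checkstyle, CTOR_DEF is assumed to be the token type for
constructors, and the real Check base class may look different.
</p>

```java
/** Self-contained sketch; none of these types are Checkstyle classes. */
final class MethodCountSketch {

    /** A syntax tree node: a token type name plus child nodes. */
    static final class Node {
        final String type;
        final Node[] children;

        Node(String type, Node... children) {
            this.type = type;
            this.children = children;
        }
    }

    /** Callbacks a walker invokes for the token types a check asks for. */
    interface Visitor {
        String[] tokens();
        void visitToken(Node node);
        void leaveToken(Node node);
    }

    /** Depth-first walk that only notifies the visitor about requested types. */
    static void walk(Node node, Visitor visitor) {
        boolean wanted = false;
        for (String type : visitor.tokens()) {
            if (type.equals(node.type)) {
                wanted = true;
            }
        }
        if (wanted) {
            visitor.visitToken(node);
        }
        for (Node child : node.children) {
            walk(child, visitor);
        }
        if (wanted) {
            visitor.leaveToken(node);
        }
    }

    /** Counts methods per class and remembers how many classes exceed the limit. */
    static final class MethodLimit implements Visitor {
        private final int max;
        private final boolean countConstructors;
        private final int[] counts = new int[64]; // one counter per class nesting level
        private int depth = -1;                    // -1: not inside any class yet
        int violations = 0;

        MethodLimit(int max, boolean countConstructors) {
            this.max = max;
            this.countConstructors = countConstructors;
        }

        public String[] tokens() {
            if (countConstructors) {
                return new String[] {"CLASS_DEF", "METHOD_DEF", "CTOR_DEF"};
            }
            return new String[] {"CLASS_DEF", "METHOD_DEF"};
        }

        public void visitToken(Node node) {
            if (node.type.equals("CLASS_DEF")) {
                depth++;
                counts[depth] = 0;
            } else if (depth >= 0) {
                counts[depth]++; // a METHOD_DEF or CTOR_DEF inside the current class
            }
        }

        public void leaveToken(Node node) {
            if (node.type.equals("CLASS_DEF")) {
                if (counts[depth] > max) {
                    violations++;
                }
                depth--;
            }
        }
    }

    /** Walks the tree once and returns the number of classes over the limit. */
    static int countViolations(Node root, int max, boolean countConstructors) {
        MethodLimit check = new MethodLimit(max, countConstructors);
        walk(root, check);
        return check.violations;
    }

    public static void main(String[] args) {
        Node tree = new Node("CLASS_DEF",
            new Node("IDENT"),
            new Node("OBJBLOCK",
                new Node("CTOR_DEF"),
                new Node("METHOD_DEF"),
                new Node("METHOD_DEF")));
        System.out.println(countViolations(tree, 2, false)); // prints 0
        System.out.println(countViolations(tree, 2, true));  // prints 1
    }
}
```

<p class="body">
Whether constructors count is decided purely by the list of token
types the visitor asks for, which is why making that list
configurable is enough to make the behaviour configurable.
</p>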

<a name="astnav"></a>
<h3>Navigating the Abstract Syntax Tree (AST)</h3>
<p class="body">
TODO: Explain the navigation methods in DetailAST and how to
use them.
</p>
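<p class="body">
As a placeholder, here is a rough sketch of what navigating such a
tree can look like. The Node class below is hypothetical and only
assumes that each node knows its parent, its first child and its
next sibling. The helper names findFirstChild, countChildren and
enclosing are invented for this example, the tree shape is
simplified, and the real DetailAST methods may be named and behave
differently.
</p>

```java
/** Self-contained sketch of first-child / next-sibling navigation. */
final class AstNavigationSketch {

    /** Hypothetical tree node; not the Checkstyle DetailAST class. */
    static final class Node {
        final String type;
        final String text;
        Node parent;
        Node firstChild;
        Node nextSibling;

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }

        /** Appends a child after the existing children and returns this node. */
        Node add(Node child) {
            child.parent = this;
            if (firstChild == null) {
                firstChild = child;
            } else {
                Node last = firstChild;
                while (last.nextSibling != null) {
                    last = last.nextSibling;
                }
                last.nextSibling = child;
            }
            return this;
        }
    }

    /** Returns the first direct child of the given type, or null if there is none. */
    static Node findFirstChild(Node node, String type) {
        for (Node child = node.firstChild; child != null; child = child.nextSibling) {
            if (child.type.equals(type)) {
                return child;
            }
        }
        return null;
    }

    /** Counts the direct children of the given type. */
    static int countChildren(Node node, String type) {
        int count = 0;
        for (Node child = node.firstChild; child != null; child = child.nextSibling) {
            if (child.type.equals(type)) {
                count++;
            }
        }
        return count;
    }

    /** Walks up the parent chain to the nearest ancestor of the given type, or null. */
    static Node enclosing(Node node, String type) {
        for (Node up = node.parent; up != null; up = up.parent) {
            if (up.type.equals(type)) {
                return up;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node barName = new Node("IDENT", "bar");
        Node body = new Node("OBJBLOCK", "{")
            .add(new Node("METHOD_DEF", "METHOD_DEF").add(barName))
            .add(new Node("METHOD_DEF", "METHOD_DEF").add(new Node("IDENT", "baz")));
        Node classDef = new Node("CLASS_DEF", "CLASS_DEF")
            .add(new Node("IDENT", "Foo"))
            .add(body);

        System.out.println(findFirstChild(classDef, "IDENT").text);     // prints Foo
        System.out.println(countChildren(body, "METHOD_DEF"));          // prints 2
        System.out.println(enclosing(barName, "CLASS_DEF") == classDef); // prints true
    }
}
```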

<a name="logerrors"></a>
<h3>Logging errors</h3>
<p class="body">
Detecting errors is one thing, presenting them to the user is
another. To do that, the Check base class provides several log
methods, the simplest of which is Check.log(String). In your
check you can simply pass a verbatim error string as the argument, as in <span
class="code">log("Too many methods, only " + mMax +
" are allowed");</span>. That will
work, but it's not the best possible solution if your check is
intended for a wider audience.
</p>

<p class="body">
If you are not living in a country where people speak English,
you may have noticed that Checkstyle writes internationalized
error messages. For example, if you live in Germany the error
messages are in German. The individual checks don't have to do
anything fancy to achieve this; it's actually quite easy and the
Checkstyle framework does most of the work.
</p>

<p class="body">
To support internationalized error messages, you need to create
a messages.properties file alongside your Check class, i.e. the
Java file and the properties file should be in the same
directory. Add a symbolic error code and an English
representation to messages.properties; the file should
contain the following line: <span
class="code">too.many.methods=Too many methods, only {0} are
allowed</span>. Then replace the verbatim error message with
the symbolic representation and use one of the log helper
methods to provide the dynamic part of the message (mMax in this
case): <span class="code">log("too.many.methods",
mMax);</span>. Please consult the documentation of Java's <a
href="http://java.sun.com/j2se/1.4.1/docs/api/java/text/MessageFormat.html">MessageFormat</a>
to learn about the syntax of format strings (especially about
those funny numbers in the translated text).
</p>
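<p class="body">
To see how the symbolic key, the properties line and the {0}
placeholder fit together, here is a small sketch that uses only the
JDK. It is not Checkstyle's own lookup code; the class name
MessageFormatSketch and the render helper are made up for this
example, and the properties text is passed in as a string instead of
being read from a file.
</p>

```java
import java.io.IOException;
import java.io.StringReader;
import java.text.MessageFormat;
import java.util.Locale;
import java.util.Properties;

/** Self-contained sketch: look up a message pattern by key and fill in {0}. */
final class MessageFormatSketch {

    /** Loads properties from text, finds the pattern for key and formats it with args. */
    static String render(String propertiesText, String key, Object... args)
            throws IOException {
        Properties messages = new Properties();
        messages.load(new StringReader(propertiesText));
        String pattern = messages.getProperty(key);
        // A fixed locale keeps the number formatting of the example predictable.
        return new MessageFormat(pattern, Locale.ENGLISH).format(args);
    }

    public static void main(String[] args) throws IOException {
        String text = "too.many.methods=Too many methods, only {0} are allowed\n";
        // prints: Too many methods, only 30 are allowed
        System.out.println(render(text, "too.many.methods", Integer.valueOf(30)));
    }
}
```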

<p class="body">
Supporting a new language is now very easy: simply create a new
messages file for the language, e.g. messages_fr.properties to
provide French error messages. The correct file will be chosen
automatically, based on the language settings of the user's
operating system.
</p>
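<p class="body">
The file naming convention comes from Java's ResourceBundle
mechanism. The sketch below asks the JDK which properties file names
it would consider for a given locale, from most to least specific.
It is a simplification: the real lookup also falls back to the JVM
default locale before settling for the base file, and the class name
BundleLookupSketch and the candidateFiles helper are made up for this
example.
</p>

```java
import java.util.Locale;
import java.util.ResourceBundle;

/** Self-contained sketch: list the properties files considered for a locale. */
final class BundleLookupSketch {

    /** Returns candidate file names for baseName, most specific first. */
    static String[] candidateFiles(String baseName, Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        Object[] candidates = control.getCandidateLocales(baseName, locale).toArray();
        String[] files = new String[candidates.length];
        for (int i = 0; i != candidates.length; i++) {
            Locale candidate = (Locale) candidates[i];
            files[i] = control.toBundleName(baseName, candidate) + ".properties";
        }
        return files;
    }

    public static void main(String[] args) {
        // prints messages_fr_FR.properties, messages_fr.properties, messages.properties
        for (String file : candidateFiles("messages", Locale.FRANCE)) {
            System.out.println(file);
        }
    }
}
```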

<a name="integrate"></a>
<h3>Integrate your Check</h3>
<p class="body">
TODO: Explain the config system and how to integrate a user check.
</p>

<a name="limitations"></a>
<h3>Limitations</h3>
<p class="body">
OK, so you have written your first Check, and you have found
several flaws in many of your programs. You now know that your
boss does not follow the coding conventions he wrote. And you
know that you are the king of the world. To become a programming
god, you want to write your second check. Now wait, first you
should know what your limits are.
</p>

<p class="body">
There are basically only two of them:
<ul>
<li>You cannot determine the type of an expression.</li>
<li>You cannot see the content of other files.</li>
</ul>
TODO: Explain the practical consequences of these limitations.
</p>


<a name="filesetchecks"></a>
<h2>Writing FileSetChecks</h2>
<p class="body">
Writing a FileSetCheck is pretty straightforward: just inherit
from AbstractFileSetCheck and implement the process(File[]
files) method and you're done. A very simple example could fire
an error if the number of files that are passed in exceeds a
certain limit.
</p>
<p class="body">
TODO: Implement that FSC and provide it as an example. Sketch:
<pre>
private int max = 100;

public void setMax(int aMax)
{
    max = aMax;
}

public void process(File[] files)
{
    if (files != null &amp;&amp; files.length &gt; max)
    {
        // assumption: report the error against the first file beyond the limit
        final String path = files[max].getPath();

        // build the error list; assumption: the limit fills the {0} placeholder
        Object[] key = new Object[]{new Integer(max)};
        LocalizedMessage[] errors = new LocalizedMessage[1];
        final String className = getClass().getName();
        final int pkgEndIndex = className.lastIndexOf('.');
        final String pkgName = className.substring(0, pkgEndIndex);
        final String bundle = pkgName + ".messages";
        errors[0] = new LocalizedMessage(
            0, bundle, "max.files.exceeded", key);

        // fire the errors to the AuditListeners
        getMessageDispatcher().fireErrors(path, errors);
    }
}
</pre>
</p>
<p class="body">
Note that by implementing the setMax() method, the FileSetCheck
automatically makes "max" a legal configuration
parameter that you can use in the Checkstyle configuration file.
</p>
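<p class="body">
One way a framework can turn a setter into a configuration parameter
is plain Java reflection. The sketch below illustrates that idea
only: Checkstyle's actual configuration code is not shown in this
document and may work differently. The LimitFiles class, the getMax()
accessor and the configure helper are hypothetical, and the helper
handles int properties only.
</p>

```java
import java.lang.reflect.Method;

/** Self-contained sketch: map a property name and string value onto a setter. */
final class SetterPropertySketch {

    /** Hypothetical module with one configurable property called "max". */
    public static final class LimitFiles {
        private int max = 100;

        public void setMax(int aMax) {
            max = aMax;
        }

        public int getMax() {
            return max;
        }
    }

    /** Calls setXxx(int) on the module for a property named xxx. */
    static void configure(Object module, String property, String value)
            throws Exception {
        String setter = "set"
            + Character.toUpperCase(property.charAt(0))
            + property.substring(1);
        // getMethod fails with NoSuchMethodException if the property is unknown.
        Method method = module.getClass().getMethod(setter, int.class);
        method.invoke(module, Integer.valueOf(value));
    }

    public static void main(String[] args) throws Exception {
        LimitFiles check = new LimitFiles();
        configure(check, "max", "45");
        System.out.println(check.getMax()); // prints 45
    }
}
```

<p class="body">
With this kind of lookup, a misspelled property name is rejected
simply because no matching setter exists.
</p>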
<p class="body">
There are virtually no limits to what you can do in
FileSetChecks. The craziest ideas we've had so far are
<ul>
<li class="body">to find global code problems like unused public methods.</li>
<li class="body">to find duplicate code.</li>
<li class="body">to port the TreeWalker solution to check C#
instead of Java.</li>
</ul>
</p>

<a name="huh"></a>
<h2>Huh? I can't figure it out!</h2>
<p class="body">
That's probably our fault, and it means that we have to provide
better docs. Please do not hesitate to ask questions on the user
mailing list; this will help us to improve this document.
Please make your question as precise as possible. We will not
be able to answer questions like "I want to write a check
but I don't know how, can you help me?".
</p>

</td>
</tr>
</table>
<hr />
</body> </html>