Checker Framework logo

This part of the tutorial shows how the Checker Framework can detect and help correct missing input validation.

1. The RegexExample program

If you have not already done so, download the source files for the tutorial.

The RegexExample.java program is called with two arguments: a regular expression and a string. The program prints the text from the string that matches the first capturing group in the regular expression.
Compile the program:

$ javac RegexExample.java

2. Run the RegexExample program

Run the program with a valid regular expression and a matching string:

$ java RegexExample '[01]?\d-([0123]?\d)-\d{4}+' '01-24-2013'
Group 1: 24

Run the program with an invalid regular expression and any string:

$ java RegexExample '[01]?[\d-[0123]?\d-\d{4}+' '01-24-2013'
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 24
[01]?[d-[0123]?d-d{4}+
                     ^
        at java.util.regex.Pattern.error(Pattern.java:1924)
        at java.util.regex.Pattern.clazz(Pattern.java:2493)
        at java.util.regex.Pattern.sequence(Pattern.java:2030)
        at java.util.regex.Pattern.expr(Pattern.java:1964)
        at java.util.regex.Pattern.compile(Pattern.java:1665)
        at java.util.regex.Pattern.<init>(Pattern.java:1337)
        at java.util.regex.Pattern.compile(Pattern.java:1022)
        at RegexExample.main(RegexExample.java:13)

Good programming style dictates that the user should not see a stack trace, even if the user supplies invalid output.

3. Run the Regex Checker

The Regex Checker prevents, at compile time, use of syntactically invalid regular expressions and access of invalid capturing groups. In other words, it prevents you from writing code that would throw certain exceptions at run time. Next run the Regex Checker to see how it could have spotted this issue at compile time.

$ javacheck -processor org.checkerframework.checker.regex.RegexChecker RegexExample.java
RegexExample.java:13: error: [argument.type.incompatible] incompatible types in argument.
        Pattern pat = Pattern.compile(regex);
                                      ^
  found   : String
  required: @Regex String
1 errors

The "incompatible types" warning indicates that variable regex is not of type @Regex String which is required for strings passed to Pattern.compile().

4. Fix the Code

The right way to fix the problems is for the code to issue a user-friendly message at run time. It should verify the user input using the RegexUtil.isRegex(String, int) method. If the string is not a valid regular expression, it should print an error message and halt. If it is a valid regular expression, perform as before.

You need to make two changes to RegexExample.java to correctly handle invalid user input. At the top of the file, add

import org.checkerframework.checker.regex.util.RegexUtil;

After variable regex is defined but before it is used (that is, before Pattern pat = Pattern.compile(regex);), add

    if (!RegexUtil.isRegex(regex, 1)) {
        System.out.println("Input is not a regular expression \"" + regex
            + "\": " + RegexUtil.regexException(regex).getMessage());
        System.exit(1);
    }

5. Re-run the Regex Checker

$ javacheck -processor org.checkerframework.checker.regex.RegexChecker RegexExample.java

There should be no warnings. This shows that the code will not throw a PatternSyntaxException at compile time.

6. Run the Example

Run the program as before (but adding checker-qual.jar to the classpath) to verify that the program prints a user-friendly warning.

$ java -cp ".:$CHECKERFRAMEWORK/checker/dist/checker-qual.jar" RegexExample '[01]?[\d-\([0123]?\d\)-\d{4}+' '01-24-2013'
Input is not a regular expression "[01]?[d-([0123]?d)-d{4}+": Illegal character range near index 24

For a full discussion of the Regex Checker, please see the Regex Checker chapter of the Checker Framework manual.