Debugging PMD?

I’m setting up PMD using the built-in Gradle plugin, having previously run it in our Maven build before we migrated to Gradle. I’m using:

  • Gradle 8.13
  • PMD 6.55.0
  • A custom PMD rule which is packaged into a jar
  • A ruleset xml which only includes the one single custom rule, no others

I have added config to my top level build.gradle, in my subprojects closure, to apply PMD currently just for one sub module, like this:

// Apply PMD plugin only to selected modules
if (project.name in ['my-module']) {
    apply plugin: 'pmd'

    dependencies {
        // custom rules
        pmd 'com.something:custom-pmd-rules:1.170-SNAPSHOT'
        pmd files(projectDir)
    }

    pmd {
        toolVersion = "6.55.0"
        ruleSets = [] // clear default rule sets
        ruleSetFiles = files("pmd-ruleset.xml")
        consoleOutput = true
    }

    // disable pmdTest and pmdTestFixture tasks.
    tasks.matching { task ->
        task.name in ["pmdTest", "pmdTestFixtures"] }
            .each { task ->
                task.onlyIf { false }
            }
    // Run pmdMain task when property is specified. We'll run this on CI
    tasks.matching { task ->
        task.name.equals("pmdMain")}
            .each { task ->
                    task.onlyIf { rootProject.hasProperty("runPMD") }
        }
}

PMD runs, but I’m having some bad issues. It is taking 30 minutes to run on my Mac, without any tests, just PMD. Normally my whole build with 11,000 tests takes about 3 mins. So I suspect something is badly wrong, but I can’t even figure out how to run this via a debugger. If I configure Gradle for debugging, and to wait for a debugger on port 5005, yes it does that, but it never hits any breakpoints in PMD code or my custom rule. So I think it is forking a Gradle worker for PMD which is not passing on any JVM debug args?

If I look at the PMD task it extends from AbstractCodeQualityTask which has this code:

    protected void configureForkOptions(JavaForkOptions forkOptions) {
        forkOptions.setMinHeapSize((String)this.getMinHeapSize().getOrNull());
        forkOptions.setMaxHeapSize((String)this.getMaxHeapSize().getOrNull());
        forkOptions.setExecutable(((JavaLauncher)this.getJavaLauncher().get()).getExecutablePath().getAsFile().getAbsolutePath());
        maybeAddOpensJvmArgs((JavaLauncher)this.getJavaLauncher().get(), forkOptions);
    }

    private static void maybeAddOpensJvmArgs(JavaLauncher javaLauncher, JavaForkOptions forkOptions) {
        if (JavaVersion.toVersion(javaLauncher.getMetadata().getJavaRuntimeVersion()).isJava9Compatible()) {
            forkOptions.jvmArgs(new Object[]{"--add-opens", "java.prefs/java.util.prefs=ALL-UNNAMED"});
        }

    }

I can’t see anything here related to passing debug args to a forked JVM, which is making me suspect a JVM is being forked, but without any debug settings. So I am just debugging the original Gradle JVM, not the one running PMD.

In my PMD output, I see a huge pause before any of my rule violations come up, with repeats of a worrying output:

Analysis cache loaded
Analysis cache updated

This repeats hundreds, if not thousands of times, which seems wrong!?

So my questions are:

  1. How can I run this in proper debug mode? Right now I’m thinking I will have to create a custom Java exec task rather than using the built in PMD plugin. Or is there a way to do it using the built in plugin?
  2. Anybody got any idea why the analysis cache is constantly loading and updating? Is this right or wrong?
  3. Config I could pass to the PMD task to enable more debug logging?

So currently I have switched this to a JavaExec task. It now runs in 10 seconds.

        tasks.register('ourPMD', JavaExec) {
            // we need our custom pmd configuration plus the project dir,
            // because each project dir will contain the pmd-ruleset.xml file
            // for that module
            classpath = configurations.pmd + files(projectDir)

            // debug
            logger.quiet("PMD JARs: ${configurations.pmd.files*.name}")

            mainClass = 'net.sourceforge.pmd.PMD'

            jvmArgs = [
                    // uncomment here if you need to debug the process
                    // note you will also need to add all of the PMD libs on the custom pmd
                    // classpath to an implementation scope in the module you want to debug,
                    // otherwise IntelliJ cannot load the PMD classes.
                    // '-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5006',
                    '-Dorg.slf4j.simpleLogger.defaultLogLevel=DEBUG',
                    '-Dorg.slf4j.simpleLogger.log.net.sourceforge.pmd=DEBUG',
                    '-Dorg.slf4j.simpleLogger.showDateTime=true',
                    '-Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss.SSS'
            ]

            args = [
                    '-d', sourceSets.main.java.srcDirs.first(),
                    '-R', "pmd-ruleset.xml",
                    '-f', 'text'
            ]
        }

This has unblocked me for now. But I’m still curious why I could not debug the official Gradle PMD plugin, or get it working correctly.

  1. How can I run this in proper debug mode? Right now I’m thinking I will have to create a custom Java exec task rather than using the built in PMD plugin. Or is there a way to do it using the built in plugin?

From a quick look, I also did not find a way to debug the PMD execution or to set any other JVM arguments, which really makes me wonder.
It seems to me you should open a feature request for this.
In the meantime, you could try injecting the debug options through the JAVA_TOOL_OPTIONS environment variable. The JVM itself picks it up automatically, so if a separate process is started, it should use the JVM args defined there, including the debug settings.

  2. Anybody got any idea why the analysis cache is constantly loading and updating? Is this right or wrong?

No idea. This is probably more a question for the PMD community or maintainers, as it is not a Gradle thing but a PMD thing. Maybe it was always done like that for your project and you just didn’t see the log messages. :man_shrugging:

  3. Config I could pass to the PMD task to enable more debug logging?

From the Gradle side, the most you can do is use --debug. Whether PMD supports something else to increase logging, I have no idea.

or get it working correctly.

If you suspect the Gradle PMD plugin is doing something to cause this, you should definitely open a bug report about it.


Btw., are you aware that the snippets you showed contain many bad practices that have quite some bad effects and make your build way slower and less reliable than it needs to be?

Hi Bjorn. Thanks for getting back to me and your continued support (you helped me with the original move of this build from Maven to Gradle).

I’ve opened a feature request to be able to pass debug properties to the forked JVM for code quality tasks:

As regards the PMD cache, I have become aware that there is a boolean property that controls this. So I wonder if it is defaulting to true when invoked via the Gradle PMD plugin and (by fluke) defaulting to false when I have run PMD via my JavaExec fork.

The docs say “Incremental analysis is enabled automatically once a location to store the cache has been defined.” I’m not setting this in my code, but in the Gradle PMD class I see:

public File getIncrementalCacheFile() {
    return new File(this.getTemporaryDir(), "incremental.cache");
}

So I think if I (or anyone) wanted to run the Gradle PMD plugin but turn off the cache, we could pass the boolean to disable it.

You mention that I’ve got some bad practices in my code. I would definitely like to get some feedback on this - what have I done wrong and how could I improve it? Thanks.

So I think if I (or anyone) wanted to run the Gradle PMD plugin but turn off the cache, we could pass the boolean to disable it.

Sounds like it, yes.
But usually you would want to have incremental processing, so that subsequent runs are faster I’d guess.
Not used PMD much myself in the past.
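
For anyone who wants to try this, here is a minimal sketch of switching the cache off through the plugin's extension. It assumes the `incrementalAnalysis` property on the `pmd` extension is the boolean in question and that it controls whether the plugin hands a cache location to PMD in the Gradle version in use. The other settings are copied from the snippet in the question.

```groovy
// In the build script of the module that applies the 'pmd' plugin
pmd {
    toolVersion = "6.55.0"
    // Assumption: with this set to false, the plugin no longer passes a cache file to PMD,
    // so the "Analysis cache loaded/updated" messages should no longer appear
    incrementalAnalysis = false
}
```

Whether this also fixes the 30 minute runtime is untested here; it only removes the cache from the equation so the two ways of running PMD can be compared.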

You mention that I’ve got some bad practices in my code. I would definitely like to get some feedback on this - what have I done wrong and how could I improve it? Thanks.

Well, you asked for it, so don’t complain. :smiley:

  • “my subprojects closure” => highly discouraged bad practice. Any form of cross-project configuration, such as subprojects { ... }, allprojects { ... }, project(...) { ... }, or similar means, is bad. It immediately introduces project coupling, which works against more sophisticated Gradle features and optimizations like configure-on-demand or the upcoming isolated projects, and it makes builds harder to understand and maintain. If you want centralized build logic, you should use convention plugins, for example in buildSrc or (what I prefer) in an included build, for example implemented as precompiled script plugins, and then apply those convention plugins directly to the projects that should get their effect. Even querying mutable information from a different project’s model is almost as bad.
  • do not use the legacy way to apply plugins using apply..., but always use the plugins { ... } block
  • do not use files or fileTree for dependencies, ever. They have significant drawbacks and quirks, and besides that, files(projectDir) does not sound too helpful to me from a cursory look anyway
  • Never use .each or any other iterating method on a lazy domain collection like the one returned from tasks.matching. This has significant problems. For example, you totally destroy task-configuration avoidance and waste much time on every run, because you force each and every task to be realized on each and every run to check the matching condition, which needs the actual task. If you really need to do something like that, at least restrict the set further first, for example by using withType before using matching. It also has other problems: the normal each is eager, so you only get the tasks that are already registered. If you used .configureEach instead, it would be task-configuration-avoidance safe and would also cover tasks added later, as then only the tasks realized anyway for some other reason are checked for a match. If you know the tasks are already created, you can also just do tasks.named("...") { ... }, which also spares you checking the names of all tasks.
  • When using the Groovy DSL, you don’t need to use equals, in Groovy that is equal to using == and actually the Groovier way.
  • don’t use onlyIf { false } if you do not need to evaluate something that is only clear right before the task is going to execute, but just set enabled = false
  • rootProject.hasProperty("runPMD") is exactly such a case of cross-project querying that you also shouldn’t do. Additionally, if you consider using the configuration cache (which you should), it probably will not work, as onlyIf is evaluated during the execution phase, at which time you no longer have access to the project model. If you only use the standard ways to set project properties, you can store a providers.gradleProperty result in a local variable and evaluate that in the onlyIf, which should then be a configuration-cache-safe variant.
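
Put together, the points above could look roughly like the following convention plugin. This is only a sketch under assumptions, not a tested drop-in: the file name and plugin id `my.pmd-conventions` are made up, it assumes buildSrc (or an included build) applies the `groovy-gradle-plugin` plugin so the script is compiled as a precompiled script plugin, and it assumes the module is a plain Java project. The dependency coordinates and PMD settings are copied from the question.

```groovy
// buildSrc/src/main/groovy/my.pmd-conventions.gradle   (hypothetical file name)

plugins {
    id 'java' // assumption: needed so that the pmdMain and pmdTest tasks exist
    id 'pmd'
}

dependencies {
    // custom rules, same coordinates as in the question
    pmd 'com.something:custom-pmd-rules:1.170-SNAPSHOT'
}

pmd {
    toolVersion = '6.55.0'
    ruleSets = [] // clear default rule sets
    ruleSetFiles = files('pmd-ruleset.xml') // resolved against the applying project's directory
    consoleOutput = true
}

// Restrict by type first, then by name, and configure lazily:
// only tasks that get realized anyway are checked and configured.
// Pmd is org.gradle.api.plugins.quality.Pmd, available without an explicit import in Gradle scripts.
tasks.withType(Pmd).matching { it.name in ['pmdTest', 'pmdTestFixtures'] }.configureEach {
    enabled = false
}

// Capture the property as a provider in a local variable instead of asking rootProject
def runPmd = providers.gradleProperty('runPMD')
tasks.named('pmdMain') {
    // the closure only references the provider, not the project model
    onlyIf { runPmd.present }
}
```

A module that should be checked would then apply it with `plugins { id 'my.pmd-conventions' }` in its own build.gradle, and CI would pass the property, for example as `-PrunPMD=true`. The `pmd files(projectDir)` dependency from the question is left out on the assumption that `ruleSetFiles` is enough for PMD to find the ruleset.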

And finally, a more personal recommendation: switch to Kotlin DSL. By now it is the default DSL, and you immediately get type-safe build scripts, actually helpful error messages if you mess up the syntax, and amazingly better IDE support if you are using a good IDE like IntelliJ IDEA or Android Studio.