How to declare task inputs not known at configuration time

OK, I must be missing something simple here, but I cannot figure out how to do something that seems pretty straightforward. We have a simple data file that is run through the C preprocessor to generate a file in which some constants have been picked up from the header files that the data file includes. While it is easy to implement an Exec task to perform this processing, I cannot figure out how to specify the inputs correctly for this task. Specifying the original data file as an input is simple; it is the included header files that are proving difficult. I can easily read the data file during the configuration phase, match any #include lines, and add those files as task inputs, but then the data file is read on every build, even when nothing has changed, which is a waste of time.

The documentation clearly states that task inputs and outputs can only be specified during the configuration phase, but I tried it anyway and, alas, the docs did not lie. This seems like a common scenario; in fact, one would imagine the C/C++ plugin has similar needs, but from some experimentation it appears that the header files included by a source file being compiled are not automatically added as inputs either.

Is there any way to do this that does not require the files to be read every time during configuration?

I would use a custom task that knows how to parse the data file. You can annotate getter methods with @InputFiles, which is similar to adding files to inputs.files, except that the method is only called when Gradle needs the task's inputs, rather than every time the build runs.

Here’s an example:

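class CustomTask extends DefaultTask {
    // The data file itself is a plain, eagerly configured input.
    @InputFile
    File fileToRead

    // Holds the parsed result so the file is read at most once per build.
    private FileCollection cachedOtherInputs

    // Called lazily by Gradle when it needs this task's inputs.
    @InputFiles
    FileCollection getOtherInputs() {
        if (cachedOtherInputs == null) {
            // code to read the file
            cachedOtherInputs = project.files("a", "b", "c")
        }
        println "Getting otherInputs"
        return cachedOtherInputs
    }
}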

task foo(type: CustomTask) {
    fileToRead = file("includes")
}

task otherTask { doLast { println "Running $name" } }

task dependentTask { doLast { println "Running $name" } }
dependentTask.dependsOn foo

If you run ‘dependentTask’ or ‘foo’, you’ll see that the file is parsed (“Getting otherInputs” is printed). If you run ‘otherTask’, it is not.

The C/C++ plugins are doing something similar to gather the header inputs for compilation, but we’re just now starting to optimize for it.

I think you can also accomplish this within a build script without a custom task, but it’ll be messy. If you wrap your code to figure out the list of includes in a Closure, it shouldn’t be evaluated immediately.
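
As a rough sketch of that closure idea (untested, and the names here are placeholders rather than anything from your build): includedHeaders is a hypothetical helper that assumes headers are included with quotes and live relative to the data file's directory, and the cpp command line is only an assumption about how the preprocessor is invoked. Passing a closure to inputs.files defers the parsing until Gradle resolves the task's inputs.

```groovy
// Hypothetical helper: collects the files named in #include "..." lines.
// Assumption: quoted includes, resolved relative to the data file's directory.
def includedHeaders = { File dataFile ->
    def headers = []
    dataFile.eachLine { line ->
        def m = line =~ /^\s*#\s*include\s+"([^"]+)"/
        if (m.find()) {
            headers << new File(dataFile.parentFile, m.group(1))
        }
    }
    headers
}

task preprocess(type: Exec) {
    // Placeholder file names, borrowed from this thread.
    def dataFile = file('data.file')
    def outputFile = file("$buildDir/output.file")

    inputs.file dataFile
    // The closure is stored here and only called when the inputs are resolved,
    // so the data file is not read while the build is being configured.
    inputs.files { includedHeaders(dataFile) }
    outputs.file outputFile

    // Assumed invocation; adjust to however you run the preprocessor.
    commandLine 'cpp', '-P', dataFile, '-o', outputFile
}
```

As with the custom task, the data file is still parsed whenever this task is part of the build; it just isn't parsed when you run unrelated tasks.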

Thanks. While this is certainly better than reading the file every time during configuration, it is still not exactly what I was hoping for. What I am striving for is reading the file once to determine the included headers, and not needing to read it again unless the data file itself has changed.

The goal is to have a task that:

  • has an input data file (data.file) that #includes several header files (include1.h and include2.h),
  • has a single output file (output.file), and
  • is only executed if one or more of data.file, include1.h, or include2.h has changed, or if the output file does not exist.

This is exactly how you would want a C/C++ compile task to operate, so I am wondering what technique the C/C++ plugin uses to accomplish this.

I have had success with an approach similar to the above (I only needed a custom task that extends Exec) and get the desired behavior. The data file is still read each time the task is executed, to see whether the set of include files has changed, but at least that only happens when the task needs to run. It still seems a shame, though, to spend time reading a file that we know has not changed.