Creating a Meta-Task

I want to create a simple task that does a few things in sequence:

  1. Run a command.
  2. Create a .tar.gz file from the output of (1)
  3. Upload the tarball to S3

I can make all of this work without too much trouble by creating a plugin and using tasks like Exec and Tar and a series of dependsOn relationships. Note that the command in (1) is never considered “up to date”, so every time the upload-to-S3 task is called, all the tasks it depends on have to run. Also, there is never a use case for running any task but the final one that uploads to S3. Unfortunately, the plugin approach ends up creating 5 tasks, and the user should never call any of them but the last one.

What I think I really want is just a single task. I can almost do that by creating a task class that extends DefaultTask, and it ends up being much less code and less confusing. There is, unfortunately, one big blocker: since you can’t directly instantiate and call Task classes from other tasks, what was easy with the Tar task becomes a lot of code, as you have to create InputStreams and OutputStreams, add a tar and compression library dependency, etc.

2 questions:

  1. It seems like a design flaw that you can’t re-use Task classes in other tasks. Why does this limitation exist and is there any workaround?
  2. What’s the best way to do this?

For reference, here’s what my plugin looks like:

class MakeDependencySnapshot implements Plugin<Project> {
    private static final File snapshotVersionFile = new File('bower-snapshot-version')
    private static final String bowerSnapshotGroup = 'Bower Dependency Snapshots'

    private String getVersionFromFile() {
        String fileContents = snapshotVersionFile.text
        String result = fileContents.trim()
        assert !result.contains("\n")
        return result
    }

    @Override
    void apply(Project project) {
        project.extensions.create('jsSnapshot', MakeDependencySnapshotOpts)

        def version = null;
        if (project.properties.containsKey('snapshotVersion')) {
            project.logger.info("User specified version as: {}", project.snapshotVersion);
            version = project.snapshotVersion
        } else {
            version = getVersionFromFile()
            project.logger.info('Obtained version information from {}. Current version is {}',
                snapshotVersionFile, version)
            version = (version.toInteger() + 1).toString()
        }
        project.logger.info('Will set the snapshot version to {}', version)

        def tempDir = File.createTempDir()
        def snapShotName = "snapshot-${version}.tar.gz"
        def tarball = new File("$tempDir/$snapShotName")
        project.logger.info("Will save tar of bower components to {}", tempDir)

        project.task('runBower', type: Exec) {
            group bowerSnapshotGroup
            description 'Reruns Bower to update all dependencies'
            // Each bower run can download new patch fixes, etc. so this is never considered up to date
            outputs.upToDateWhen { false; }
            commandLine 'bower', 'install'
        }

        project.task('tarBower', type: Tar, dependsOn: project.tasks.runBower) {
            group bowerSnapshotGroup
            description "Creates a tar.gz file of all the Bower dependencies"
            from '.bower_components'
            compression Compression.GZIP
            archiveName tarball.toString()
        }

        project.afterEvaluate {
            project.task('uploadBowerDepsToS3', dependsOn: project.tasks.tarBower) << {
                def s3Client = new AmazonS3Client()
                s3Client.putObject(project.jsSnapshot.s3Bucket, snapShotName, tarball)
            }

            project.uploadBowerDepsToS3 {
                group bowerSnapshotGroup
                description 'Uploads the .tar.gz file containing all the dependencies to S3'
            }

            project.task('updateSnapshotVersionNumber', dependsOn: project.tasks.uploadBowerDepsToS3) << {
                snapshotVersionFile.write "$version\n"
            }

            project.updateSnapshotVersionNumber {
                group bowerSnapshotGroup
                description "Writes the updated version number to $snapshotVersionFile"
            }

            project.task('updateSnapshot', dependsOn: project.tasks.updateSnapshotVersionNumber) {
                project.logger.info('Running snapshot update tasks.')
            }

            project.updateSnapshot {
                group bowerSnapshotGroup
                description 'Causes all bower snapshot update tasks (re-running bower, creating the tarball, etc.) ' +
                    'to run'
            }
        }
    }
}

class MakeDependencySnapshotOpts {
    String s3Bucket
}

And here is what my task COULD look like if I could instantiate the Tar task and call it:

class UpdateSnapshot extends DefaultTask {
    @TaskAction
    def update() {
        Project project = getProject();
        project.exec {
            commandLine 'bower', 'install'
        }
        def tarFile = File.createTempFile('dependencies', '.tar.gz')
        project.logger.info('Will create tarball at {}', tarFile);
        // This fails with "Task ... has been instantiated directly which is not supported...."
        Tar tar = new Tar()
        tar.from '.bower_components'
        tar.compression = Compression.GZIP
        tar.execute()
        // etc.
    }
}

It is worth noting another issue with the plugin setup: the path to the tarball is needed in several tasks. The easy way to handle this is to run def tempDir = File.createTempDir() during the configuration phase, which means the temporary directory gets created even if none of the tasks that use it are ever executed. This is much easier to handle with the single-task approach.

Just some thoughts that might convince you that tasks are actually a good idea :)

Except when it fails and you try to debug what went wrong.

The user won’t see the “private” tasks (runBower, tarBower) by default if you don’t give them a group. They only appear if the user runs gradle tasks --all.
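A minimal sketch of that layout as a plain build script, assuming the older `task name(type: ...)` DSL used elsewhere in this thread. The description text is a placeholder; only the entry-point task is given a group.

```groovy
// Helper task: no group, so (per the reply above) plain `gradle tasks` omits it.
task runBower(type: Exec) {
    commandLine 'bower', 'install'
    // Bower may fetch new patch releases, so never treat this as up to date
    outputs.upToDateWhen { false }
}

// Entry point: the only task given a group and description.
task updateSnapshot(dependsOn: runBower) {
    group = 'Bower Dependency Snapshots'
    description = 'Re-runs Bower and publishes a snapshot of the dependencies'
}
```

Running `gradle tasks --all` should still list `runBower`, which helps when debugging a failed step.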

Thanks @st_oehme. Good points. There may be some value in breaking this up into some tasks. However, it’d still be nice if there was a way to call one task from another. For example, having a task that does nothing but create a temporary directory is possible but seems awkward. It seems nicer if I could have a task that creates a temporary directory and then creates a tarball in it. But I can’t make that a single task easily since I can’t call the Tar task from within my new task (granted I could call project.exec{ commandLine 'tar', 'zcvf' ... } or write a bunch of Java/Groovy code to generate the tarball but these have obvious downsides).

Yes, there are certainly tasks whose implementation would be nice to reuse. There are some examples of that, e.g. using Project.copy() instead of a Copy task. The problem for us is that once we provide such helper methods/classes, we need to maintain them. So we need to weigh the number of use cases vs. how much we want to blow up the API surface. Generally, we are rather conservative about this, since we have a strong commitment to backwards compatibility.
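In the same spirit as Project.copy(), one possible workaround for the tarball case is Ant's built-in `tar` task, which Gradle exposes through `project.ant`. This was not suggested by anyone in the thread, so treat it as an untested sketch: it assumes the class lives in buildSrc or a plugin, and it leaves the S3 upload as a comment.

```groovy
import org.gradle.api.DefaultTask
import org.gradle.api.tasks.TaskAction

class UpdateSnapshot extends DefaultTask {
    @TaskAction
    def update() {
        // Step 1: run the command
        project.exec {
            commandLine 'bower', 'install'
        }

        // Step 2: build the tarball with Ant's tar task instead of Gradle's Tar task
        File tarFile = File.createTempFile('dependencies', '.tar.gz')
        project.logger.info('Will create tarball at {}', tarFile)
        project.ant.tar(
            destfile: tarFile.absolutePath,
            basedir: project.file('.bower_components').absolutePath,
            compression: 'gzip'
        )

        // Step 3: upload tarFile to S3 here
    }
}
```

The trade-off is the one described above: the Ant task is not part of Gradle's task graph, so it gets no up-to-date checking and no separate entry in the build output.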

The output directory of a task is automatically created, so there is nothing extra to do here if you use a normal Tar task.
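To make that concrete, here is a hedged sketch in which the Tar task writes into the build directory instead of a temporary directory created at configuration time. The `snapshots` directory, the archive name, and the `my-bucket` value are placeholders. `destinationDir`, `archiveName`, and `archivePath` are the property names from the Gradle versions of this thread's era; later versions use `destinationDirectory`, `archiveFileName`, and `archiveFile`.

```groovy
// Needs the AWS SDK on the buildscript classpath
import com.amazonaws.services.s3.AmazonS3Client

task tarBower(type: Tar) {
    from '.bower_components'
    compression = Compression.GZIP
    // Gradle creates this directory when the task runs; nothing is created at configuration time
    destinationDir = file("$buildDir/snapshots")   // placeholder location
    archiveName = 'snapshot.tar.gz'                // placeholder name
}

task uploadBowerDepsToS3(dependsOn: tarBower) {
    doLast {
        // The archive location is read from the Tar task, so no shared tempDir variable is needed
        def s3Client = new AmazonS3Client()
        s3Client.putObject('my-bucket', tarBower.archiveName, tarBower.archivePath)
    }
}
```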

Why not just provide a general mechanism for instantiating any task and running it? Tasks are pretty much just classes. I fully expected new Tar() to work and was pretty surprised to find that, although these are just Groovy classes, they are special and can only be instantiated via the DSL. If you can find a way to let users instantiate and run any task without the DSL, you wouldn’t have this tradeoff: there wouldn’t be anything extra to maintain. In fact, there would no longer be a need for project.copy or project.exec.

As far as I can tell, tasks are instantiated via task(.., type: <TaskClass>) followed by the configuration closure. So why not a method like project.getTask(Tar.class) { // config here }? Would that be possible?

Tasks are pretty strongly bound to the project and its lifecycle. A lot of the task API would not make any sense for a task that is not part of the project’s task graph. For instance, people would probably start instantiating detached tasks, using dependsOn on them, and then complaining that it does not work when they call execute.

That makes sense, though I think the project.getTask proposal above at least mitigates the concern that users would instantiate a detached task: the task would be bound to the project instance on which you called getTask.

I meant “detached” as in “not part of the task graph”. If they are part of the task graph, then users could reference them, which is not what you want. If they are not part of the task graph, then things like dependsOn won’t work, which would lead to the confusion I described earlier.

So in summary: It’s the implementation of a task that you might want to reuse, but not the task itself. I encourage you to expose the implementation of your own tasks as helper classes that your users can reuse, if you are okay with the maintenance overhead. But for most use cases, having normal tasks is just fine and actually helpful.

Thank you for understanding my more conservative standpoint on this :)