Best practice for api vs implementation in multi-module project

Reposting my SO question here, as this looks like a better place for advanced topics.

Let's say I have the following modules in an application:

  • library
  • base
  • feature1
  • feature2
  • app

Now the relations between the modules are:

base wraps library

feature1 and feature2 make use of (depend on) base

app puts together feature1 and feature2

Everything in this multi module structure should be able to work using Gradle’s implementation dependencies and there’s no need to use the api clause anywhere.

Now, let's say feature1 needs to access an implementation detail of base that lives in library.

In order to make library available to feature1 we have two options as far as I can tell:

  1. Change implementation to api in base, which leaks the dependency to modules that depend on base
  2. Add library as an implementation dependency to feature1 without having base leak the dependency on library
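For concreteness, the two options might look roughly like the sketch below. The project paths (':library', ':base', ':feature1') and the use of the java-library plugin are assumptions made for illustration; in an Android build the modules would apply the Android library plugin instead. Each commented section stands for a separate build.gradle file.

```groovy
// ---- Option 1: base leaks library to its consumers ----

// base/build.gradle
plugins { id 'java-library' }          // assumption: plain JVM module
dependencies {
    api project(':library')            // library lands on consumers' compile classpath
}

// feature1/build.gradle
plugins { id 'java-library' }
dependencies {
    implementation project(':base')    // library is visible here through base
}

// ---- Option 2: feature1 declares library itself ----

// base/build.gradle
plugins { id 'java-library' }
dependencies {
    implementation project(':library') // hidden from consumers at compile time
}

// feature1/build.gradle
plugins { id 'java-library' }
dependencies {
    implementation project(':base')
    implementation project(':library') // declared explicitly where it is used
}
```

In option 1 a single line in base changes what every consumer can compile against; in option 2 each module that touches library says so in its own build file.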

Of course the example has been simplified for the sake of the question, but you can see how this can become a configuration hell with a large number of modules and 4 or 5 levels of dependencies.

We could create an intermediate base-feature module that wraps base and provides another level of abstraction for feature1 to consume without leaking library, but let’s leave that solution out of the scope of this problem and focus on the setup of the dependencies.

Some trade-offs that I see in the above options:

Option 1) pros

  • Smaller build.gradle files, as there is no need to repeat implementation clauses
  • Faster build script edits. Just make a single change to the api clause and see it propagated to all consumer modules

Option 1) cons

  • Classes might become available in modules that shouldn’t have access to them.
  • Prone to misuse by developers, as they have the implementations available and not only the module interfaces.

Option 2) pros

  • It makes crystal clear which dependencies the module has.
  • No guessing where the classes are coming from (think 4 or 5 levels of modules leaking dependencies), as their origin is always declared in the module dependencies.

Option 2) cons

  • Makes updating a dependency more tedious, as all the modules with the implementation clause have to be modified. Even though I believe this is a good thing because it keeps track exactly of how a change modified the project, I see how it can take more time.

Now the questions:

  • Are there any trade-offs in terms of compilation in this multi-module scenario?
  • Is a module that leaks a dependency “faster” to compile for consumer modules?
  • Does it make a substantial difference in build times?
  • What other side effects or pros/cons am I missing?

Thanks for your time.

What you describe is a fairly common discussion about layered architecture systems, also known as “strict” vs “loose” layering, or “open” vs “closed” layers. See this (hopefully free for you too) chapter from Software Architecture Patterns for some semiotics, which is unlikely to help you much with your choice.

From my point of view, if a module needs to break layering, I’d model the project structure to expose this in the most direct and visible way. In this case it means adding library as an implementation dependency of feature1. Yes, it makes the diagram uglier, and yes, it forces you to touch a few more files on upgrade, and that is the point - your design has a flaw and it is now visible.

If a few modules need to break the layer encapsulation in the same way, I may consider adding a separate base module exposing that functionality, with a name such as base-xyz. Adding a new module is a big thing, not because of the technical work, but because our brain can handle only so many “things” at a time (chunking). I believe the same would hold for Gradle “variants” when they become available, but I can’t claim that yet as I haven’t tried them hands-on.

If all clients of the base module need to access library (e.g. because you use classes or exceptions from library in your public signatures), then you should expose library as an api dependency of base. The downside is that library becomes part of the public API of base, and it is probably bigger than you would like and not under your control. A public API is something you are responsible for, and you want to keep it small, documented, and backwards compatible.

At this point you may be thinking about jigsaw modules (good), osgi (err… don’t), or wrapping the parts of lib that you need to expose in your own classes (maybe?).

Wrapping only for the sake of breaking dependencies is not always a great idea. For one, it increases the amount of code you maintain and (hopefully) document. If you start doing small adaptations in the base layer, and the library is a well-known library, you introduce (value added) inconsistencies - one always needs to be on guard about whether their assumptions for lib still hold. Finally, thin wrappers often end up leaking the library design, so even if they wrap the API, you are still forced to touch the client code when you replace/upgrade lib, at which point you may have been better off using lib directly.

So, as you can see, it is about trade-offs and usability. The CPU doesn’t care where your module boundaries lie, and all developers are different - some cope better with a large number of simple things, some cope better with a small number of highly abstract concepts.

Don’t obsess about the best (as in What Would Uncle Bob Do) design when any good design would work. The amount of extra complexity that is justified for the sake of introducing order is a fuzzy quantity, and it is something that you are in charge of deciding. Make your best call and don’t be afraid to change it tomorrow :)


@ddimitrov thank you for a very detailed answer and the link to the chapter (it was indeed free for me as well).

I was aware that, at this point, it’s a matter of personal preference, as I don’t see a clear “best” option. However, I was curious to read other points of view and arguments for/against each of the approaches.

I would agree with you. I believe that being expressive and declaring every dependency in the setup makes it easier to read and understand, at the cost of being slightly slower to change and/or update.

3rd party libs are indeed tricky.

My concern about build performance was based on the fact that this is an Android project with ~60 modules and up to 5 layers. Most of the modules are Android libraries with resources that need to be processed, Kotlin code, etc., which take a while to build. If there were any considerable saving in build times, I would definitely consider it as part of the decision, as the developer experience improves with faster builds.

Thank you again for your answer, and let me know whether you would be interested in posting it on SO as an answer, or whether you are OK with me doing so.

Done, with some editing.


What kind of leak are we talking about?
If I use implementation in libraries, then in the host application I will be forced to write something like
implementation library
implementation wraps
implementation base

Otherwise the app will crash at runtime.

That is, there will be a wraps and base leak anyway.

If I use api in libraries, then in the host application I will be forced to write something like

implementation library

and that’s all. And here, too, is your “leak”.

Not exactly - in Gradle, you need to list as implementation only what you compile against. For example, in the case of log4j you only list “log4j-api”.

Taking it further, if the library listed in implementation has dependencies of its own (wraps and base in your example), you don’t need to repeat them if your code does not call them directly. This way they won’t be accessible at compile time, but they will still be present in the runtime classpath. The benefits are faster compilation times and improved maintainability as you have fewer dependencies to worry about.
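As a rough sketch of what that means for the original example, the app build file could list only the two feature modules. The project paths and the plain java plugin are assumptions here; an Android app would apply the Android application plugin instead.

```groovy
// app/build.gradle
plugins {
    id 'java'   // assumption: plain JVM module for illustration
}

dependencies {
    // only what app code compiles against
    implementation project(':feature1')
    implementation project(':feature2')
    // no entries for base or library: if feature1 and feature2 declare them
    // as implementation, they stay off app's compile classpath but are still
    // resolved onto its runtime classpath
}
```

If you want to check this on your own build, comparing the output of `gradle :app:dependencies --configuration compileClasspath` with `gradle :app:dependencies --configuration runtimeClasspath` should show base and library only in the second one.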

The “leak” I was talking about is when you decide to hide a library behind an API of your own design, ostensibly to allow transparent switching to a different backing library. Much too often I’ve seen such efforts end with an API that closely resembles the wrapped API, as in a custom tracing solution that is so tightly coupled to Log4j that there is no way to plug in Zipkin or Jaeger. This is a general warning against overcomplicating for the sake of design - not specific to Gradle.

Let me know if it is clearer now :)

Here is my case: https://stackoverflow.com/q/58318259/345810

And I have to list all the dependencies, or use the “api” keyword.

I think this is a very common case, since no one designs the API in advance. At first it’s just an application (perhaps with an MVP or Clean architecture), and then the application develops like this:
application -> modules -> libraries -> API.

If you want to hide it, then declare it as implementation and it will work, BUT you cannot have methods that take arguments or return results with types that are contained in or derived from the “hidden” library, as that would amount to exposing it and hence failing to hide it. A dependency declared as implementation, together with its transitive dependencies, still ends up on the runtime classpath of the depending project.

If you don’t want to hide the library, then it is part of your API and you should declare it as such. So, make up your mind and use the right scope for your intent.

And by the way, these are not keywords but configurations. There is a section about them in the manual. The api configuration is defined by the java-library plugin, and implementation by the Java plugins it builds on; if you have a better idea of how things are supposed to work, it is trivial to add your own configurations and try it out.
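To give an idea of how little is involved, here is a minimal sketch of a custom configuration. The name sharedInternal and the ':library' project path are hypothetical, and this is just one way to wire it up, not a recommended setup.

```groovy
// base/build.gradle
plugins {
    id 'java-library'
}

configurations {
    // hypothetical custom configuration, created by naming it here
    sharedInternal
    // whatever is declared in sharedInternal is also seen by implementation
    implementation.extendsFrom sharedInternal
}

dependencies {
    // behaves like an implementation dependency, but under its own name,
    // so the build can treat this group of dependencies separately later
    sharedInternal project(':library')
}
```

Because implementation extends sharedInternal, the module compiles and runs as if library were a regular implementation dependency.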

I don’t need to retell the documentation here. Your advice is irrelevant, and completely useless, sorry.

With this attitude you are not going to get a lot of help. Have fun, my builds are fine :)