We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
There was an error while loading. Please reload this page.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The resource leak increased the number of threads with every execution of Exclipse-based formatter step in a Maven module. It happened because the SpotlessCache wasn't properly utilized. Every cache key was based on randomized file names because Maven plugin relocates config files into the target directory with a random file name. This resulted in all cache accesses being cache misses.
SpotlessCache
This PR fixes the problem by:
making the Maven plugin use non-random file names for output files. FileLocator is changed to construct names based on a hash of the input path. Input path can be a URL, file in a JAR, or a regular local file
FileLocator
changing FileSignature (used for SpotlessCache keys) to use filenames instead of paths and file hashes instead of last modified timestamps
FileSignature
These two changes allow FileSignature to uniquely identify a set of files by their content and not take file paths into account. As a result, created keys for SpotlessCache are properly comparable which results in lots of cache hits and decreased number of created threads.
Changed FileCignature only accepts files, not directories. NPM formatters changed to have their signatures based on package.json file instead of the whole node_modules directory.
FileCignature
Fixes #559
Sorry, something went wrong.
There was a problem hiding this comment.
The reason will be displayed to describe this comment to others. Learn more.
Thanks! It's great to see the consequences of the "content hash" choice, and there's a good chance that it will be the only possibility. Turns out there are some problems with sorting on filename, and even some problems with sorting on content, which I described in my review below. The problems aren't show-stoppers though, and it might be our best path.
What do you think about FileLocator first checking to see if it's a local config file, and only using copy-to-hashed-path for remote URLs? I pushed up a demo in #571) or not, it seems like #559 on its own.
In order to support the Gradle Build Cache, we might need to do the FileSignature change you've implemented here no matter what, so it's super useful to see it play out. But it hurts me to see how much extra IO it forces us to do. It's probably premature optimization to worry about it, but I think I have a way that we could keep the current FileSignature implementation, and only use content hashing when it's needed.
If we can solve #572 in the short-term, then I'd prefer to do that and buy some time to think about keeping FileSignature more IO-efficient.
Unfortunately, #572 because settings files in openhab-addons project come from a plugin dependency. Those files are relocated from a JAR file by FileLocator. The plugin dependency is configured here. I need to think a bit more about the problems you highlighted in this PR.
openhab-addons
Hey #328 (comment):
$ gradle clean --stacktrace executing gradlew instead of gradle FAILURE: Build failed with an exception. * Where: Build file '/Projects/spotless/ide/build.gradle' line: 11 * What went wrong: A problem occurred evaluating project ':ide'. > A problem occurred configuring project ':lib'. > java.lang.NullPointerException (no error message) * Try: Run with --info or --debug option to get more log output. Run with --scan to get full insights. * Exception is: org.gradle.api.GradleScriptException: A problem occurred evaluating project ':ide'. [...] Caused by: org.gradle.api.ProjectConfigurationException: A problem occurred configuring project ':lib'. [...] Caused by: java.lang.NullPointerException at org.eclipse.jgit.internal.storage.file.UnpackedObjectCache$Table.index(UnpackedObjectCache.java:115) at org.eclipse.jgit.internal.storage.file.UnpackedObjectCache$Table.contains(UnpackedObjectCache.java:76) at org.eclipse.jgit.internal.storage.file.UnpackedObjectCache.isUnpacked(UnpackedObjectCache.java:31) at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openObject(ObjectDirectory.java:398) at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:132) at org.eclipse.jgit.lib.ObjectReader.open(ObjectReader.java:203) at org.eclipse.jgit.revwalk.RevWalk.parseAny(RevWalk.java:908) at org.eclipse.jgit.revwalk.RevWalk.parseCommit(RevWalk.java:818) at com.diffplug.gradle.spotless.GitRatchet.rootTreeShaOf(GitRatchet.java:187) at com.diffplug.gradle.spotless.SpotlessTaskBase.setupRatchet(SpotlessTaskBase.java:91) at com.diffplug.gradle.spotless.FormatExtension.setupTask(FormatExtension.java:681) at com.diffplug.gradle.spotless.JavaExtension.setupTask(JavaExtension.java:196) at com.diffplug.gradle.spotless.SpotlessExtension.lambda$createFormatTasks$0(SpotlessExtension.java:71) [...]
I usually have two git remotes:
upstream
diffplug/spotless
origin
lutovich/spotless
and I clone the original repo first, then rename upstream from origin to upstream.
This rename seems to be what's causing the NPE. Steps to reproduce:
$ git clone [email protected]:diffplug/spotless.git $ cd spotless $ gradle clean # works fine $ git remote rename origin upstream $ gradle clean # fails with NPE
Not sure if it's a bug or a problem with my local setup. That's why I describe it here instead of creating an issue.
The workaround for me is to not rename the remote :) I'll proceed with fixing this PR.
Thanks for reporting! I bet part of this is the master -> main migration, which is a minor problem for the in-progress PRs. We've also improved the git-related error messages a lot in the newer versions of Spotless, not sure if the latest spotless is in our dogfooded settings.gradle version. I bet if you merge main into your feature branch, it will resolve.
master
main
settings.gradle
The steps to reproduce use a clean clone of diffplug/spotless and do not use a stale feature branch like this one. That's why I thought it might be a bug.
553cffe
52a7a6c
@nedtwigg review comments should now be addressed. I fixed them and squashed everything into the same commit
Awesome, thanks very much! I'm gonna break this commit into two (one for the npm part, another for the rest) and then add a few things. Do you mind if I force-push to your branch, to maintain this discussion history?
NPM formatters changed to have their signatures based on package.json…
b695622
… file instead of the whole node_modules directory.
Fix resource leak in Maven plugin
823c2c3
The resource leak increased the number of threads with every execution of Exclipse-based formatter step in a Maven module. It happened because the `SpotlessCache` wasn't properly utilized. Every cache key was based on randomized file names because Maven plugin relocates config files into the target directory with a random file name. This resulted in all cache accesses being cache misses. This commit fixes the problem by: * making the Maven plugin use non-random file names for output files. `FileLocator` is changed to construct names based on a hash of the input path. Input path can be a URL, file in a JAR, or a regular local file * changing `FileSignature` (used for `SpotlessCache` keys) to use filenames instead of paths and file hashes instead of last modified timestamps These two changes allow `FileSignature` to uniquely identify a set of files by their content and not take file paths into account. As a result, created keys for `SpotlessCache` are properly comparable which results in lots of cache hits and decreased number of created threads. Changed `FileSignature` only accepts files, not directories.
Sure, no problem. Let me know if I can help with anything on this PR
Use a central cache for the FileSignature, so that we keep the local …
e4db705
…performance based on lastModified that we had prior to this PR.
0c96ee3
No need for the FileSignature cache to store multiple versions per-fi…
e872905
…le, once a file has changed, we can discard the old one.
Add a descriptive error-message for the highly-unexpected case
aaddff8
where a FileSignature is generated for multiple files with the same filename.
Fix SpotbugsWarnings (and terrible file-reading typos).
cb724e9
aaddff8). And the second thing I did was add a cache, so that when we're doing JarState, we aren't hashing the same jars over and over and over. If this LGTY, then it LGTM.
JarState
b695622? I think it's a good change. We're changing FileSignature so that it only handles files (not folders). The upside is that caching will work better (remote build cache for Gradle, fixes other caching issues for Maven).
can you take a peek at b695622? I think it's a good change. We're changing FileSignature so that it only handles files (not folders). The upside is that caching will work better (remote build cache for Gradle, fixes other caching issues for Maven).
This looks good to me. Thank you for the change.
Update the changelog for lib and plugin-maven, still need to do plugi…
5ca3b18
…n-gradle
Merge branch 'main' into fix-mvn-resource-usage
267767a
Update the plugin-gradle changelog (last changelog that needed updates).
51647ab
No rush @lutovich, but I'm waiting for a positive confirmation from you on the changes I made here before merging this, which allows other merges to go through. I'm only echoing because the GitHub UI doesn't allow me to ask for a code review from you (since it is your PR), so wanted to make sure it didn't get lost :)
Looks good, made two minor comments.
Just curious, why do you prefer to read the file this way instead of Files#readAllBytes()? Also, maybe the code could be simplified a bit by using DigestInputStream.
Files#readAllBytes()
DigestInputStream
Some of the eclipse jars push into the tens of MB, and SHA-256 only works on 512 bytes at a time. It's a micro-optimization to worry about that memory pressure, very possible that it's slower than Files.readAllBytes, and it wasn't my motivation. The real reason is that I wanted to protect against the race condition where you read a file, someone else modifies it, and then you check the timestamp (or the opposite order) and end up with a lying cache. By opening it to read, and checking the timestamp while it's open, you can be (more) sure that the content you're reading is the content for that timestamp. I think it's still not guaranteed on unix systems, so it's not perfect, but that was the motivation.
Files.readAllBytes
Make the FileSignature internal cache private final.
private final
5bf0896
Suppress unused warnings.
2395492
07d5ab0
Released in plugin-maven 2.0.0
plugin-maven 2.0.0
nedtwigg nedtwigg left review comments
Successfully merging this pull request may close these issues.