module: add clearCache for CJS and ESM#61767
Conversation
|
Review requested:
|
90303e6 to
1d0accc
Compare
|
The
notable-change
Please suggest a text for the release notes if you'd like to include a more detailed summary, then proceed to update the PR description with the text or a link to the notable change suggested text comment. Otherwise, the commit will be placed in the Other Notable Changes section. |
|
I’m relatively +1 on having this in Node.js, but I recall having a a lot of discussions about this @GeoffreyBooth and @nodejs/loaders teams about this, and it would massively break the spec, expectations, and invariants regarding ESM. (Note, this is what people have been asking us to add for a long time). My personal objection to this API is that it would inadvertently leak memory at every turn, so while this sounds good in theory, in practice it would significantly backfire in long-running scenarios. An option could be to expose it only behind a flag, putting the user in charge of choosing this behavior. Every single scenario where I saw HMR in Node.js ends up in memory leaks. This is the reason why I had so much interest and hopes for ShadowRealm. |
benjamingr
left a comment
There was a problem hiding this comment.
I am still +1 on the feature from a user usability point of view. Code lgtm.
We're giving users a tool, it may be seen as a footgun by some but hopefully libraries that use the API correctly and warn users about incorrect usage emerge. |
|
@mcollina Thanks for the feedback. I agree the ESM semantics concerns are real. This API doesn’t change the core ESM invariants (single instance per URL); it only removes Node's internal cache entries to allow explicit reloads in opt‑in workflows. Even with that, existing references (namespaces, listeners, closures) can keep old graphs alive, so this is still potentially leaky unless the app does explicit disposal. I’ll make sure the docs call out the risks and the fact that this only clears Node’s internal caches, and I’d like loader team input on the final shape of the API. This commit should address some of your concerns. b3bd79a
Thanks for the review @benjamingr. Would you mind re-reviewing again so I can trigger CI? |
|
Thanks a lot for this ❤️ |
|
Rather than violating ESM invariants, can't node just provide a function that imports a url? i.e. While the given example of: const url = new URL('./mod.mjs', import.meta.url);
await import(url.href);
clearCache(url);
await import(url.href); // re-executes the moduleis indeed not spec compliant, it's perfectly legal to have something like: import { clearCache, importModule } from "node:module";
await importModule(someUrl);
clearCache();
await importModule(someUrl); // reexecute |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #61767 +/- ##
==========================================
- Coverage 89.72% 89.71% -0.02%
==========================================
Files 676 679 +3
Lines 206065 207821 +1756
Branches 39508 39839 +331
==========================================
+ Hits 184897 186448 +1551
- Misses 13315 13463 +148
- Partials 7853 7910 +57
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
While I am +1 to the idea in general, I am afraid the current API may bring more problem than it solves...see the comments.
(Granted it isn't really a problem unique to this specific design, I think the issue is more that this is not a very well solved problem so far, I don't really know what it should look like, though I think I might be able to point out what it should not look like to avoid adding/re-introducing leaks/use-after-frees that user land workarounds can already manage)
|
I was the one requesting this while sitting next to yagiz today. We take advantage of Module Federation which allows us to distribute code at runtime. However, when parts of the distributed system are updated, it gets stuck in module cache. I've had some workarounds, like attempting to purge require cache - however when it comes to esm, it's a difficult problem. Since we do this distribution primarily in production, and there can be thousands of updates a day, I block esm from being supported because it'll leak memory - which was fine for several years but becoming more problematic in modern tooling. On lambda we cannot just exit a process and bring a new one up without triggering a empty deploy, which has generally been a perf hit to cold start a new lambda vs try and "reset" the module cache for primitive hot reload. Now, I know this might be controversial, or not recommended - but the reality is that many large companies use federation, most fortune 50 companies use it heavily. All of them are relying on userland cobbling I've created. If there is a solution, it would be greatly appreciated by all of my users. I believe this would also be very useful in general for tooling like rspack etc where we have universal dev serves. If invalidation of specific modules causes complexity, I'd be more than happy with a nuclear option like resetModuleCache() which just clears everything entirely. Would be a little slower, but nothing is slower than killing a process and bringing up a new one. "Soft Restart" node without killing it. Don't have much opinion on spec compliance etc, can go through NAPI as well if that would avoid any spec concerns or pushback. |
|
Chiming in to say that re-loading a module is very helpful in tests. We can do this with the fabulous CJS paradigm, but ESM does not have a viable equivalent and it should. |
|
I think there are still quite a few places that need updates/tests - I tried my best to find them, but there are some dusty corners in the module loader that I have never poked at, you might want to take a heap snapshot or write more tests with
|
|
I think I addressed all of your concerns @joyeecheung. Let me know if I missed anything! |
Just pinging @guybedford to speak on the spec concerns. I think we should wait for him or someone similarly knowledgeable about the spec to comment before landing. In general I'm +1 on the feature, assuming it can be safely implemented. My (dim) recollection was that the last time we considered it, it was impossible to modify an ES module after it had been loaded into V8. Has that changed in recent years? How do you handle cases like |
A unit test environment should be using ephemeral realms. Otherwise shared global state would cause tests to interfere with each other. Module graphs created via
I see. It's not the deployment model I would choose in an ideal world but I can see how a team would end up here. I think the stakeholders need to make a call on whether or not encouraging the specification violation is important. This is super bike sheddy but I want to point out that the term Arguably hot reloaders are a significant enough departure from a classic module graph that they should be implemented directly via |
My posts throughout this thread have not had any test frameworks in mind. When we consider the operation of |
I don't believe that's a goal of this PR. This PR provides a means to fix a memory leak. |
|
Would you mind reviewing this pull-request one more time so that I can trigger CI and land? |
| * `caches` {string} Specifies which caches to clear. Must be one of: | ||
| * `'resolution'` — only clear the resolution cache entry for this specifier. | ||
| * `'module'` — clear the cached module everywhere in Node.js (not counting | ||
| JS-level references). | ||
| * `'all'` — clear both resolution and module caches. |
There was a problem hiding this comment.
Do we really need this feature, or could we leave this customization out for now?
To fully understand the resolve cache case - this means we clear the resolution for specifier, parentURL, attributes only right? It seems to me like that has to be expected here?
There was a problem hiding this comment.
I notice this was marked resolved but it's not clear what the answer was?
There was a problem hiding this comment.
Unfortunately that commit removes attributes and not the caches option this was referring to.
There was a problem hiding this comment.
@joyeecheung would you be open to having the caches option be reconsidered for a follow-on? It would be good to hear the justification for this further. To align spec caching with this PR will be some work yet, and it would help if we can align on a simple model to start.
There was a problem hiding this comment.
Did you mean to only have "all" semantics at all times? I think that's fine, resolution-only clearing is mostly for cache-busting resolutions.
To explain why I thought it would be useful to have separate module cahe clearing - while you don't want leaks, you also won't want use-after-frees and practices in vm.Script/vm.Module suggests that it works alright to allow V8 to retain the references, because once Node.js release them the rest are only retained through user code, when user module graph goes away it will go away, and libraries have been managing that alright. If you try to clear it while user code still rely on old modules you get use after frees and people will be annoyed. With cache busting solutions the library could use some help avoiding clearing the module cache too early even when the no longer allow new resolutions to old modules - and if they can't clear them separately they might be forced into a "leak-or-use-after-free" choice (even though resolution leak is usually minor compared to module leaks, so it's natural to choose resolution leak over module use-after-free which can be a lot more annoying)
|
@guybedford Are there any minutes for the TC39 modules meeting? Will this topic be discussed at the upcoming TC39 plenary meeting?
|
There was a problem hiding this comment.
I've got a number of concerns with this that I'd like to see addressed before it lands.
-
There is a V8-level memory leak risk. For the cache-busting URLs with dynamic
import(), old modules should be collectible and there likely is not a problem. However, for static import graphs I believe this introduces a definite memory leak risk. -
The docs should be stronger about the leak risk. Static import graphs create permanent V8-internal references that prevent collection.
-
There's some missing test coverage. For instance, a test verifying what happens when you clear a module that was statically imported by a still-alive parent. The test should either demonstrate that it is leaked or collected.
-
I'm not convinced the loadCJSModuleWithModuleLoad race test covers all the edge cases while a concurrent dynamic
import()of CJS is in-flight. -
It's not clear if the hash_to_module_map cleanup is sufficient for same-URL re-import. There might be a memory leak case in hash_to_module_map that needs to be evaluated.
Demonstrating that the memory-leak risk is inconsequential, or that the existing tests adequately cover it, would be enough to clear this objection.
|
@jasnell Thanks for the detailed review. Addressing each point: 1. V8-level memory leak risk for static import graphs You are correct that this is a real concern. When a parent module P statically imports M, `v8::Module::InstantiateModule` creates a permanent V8-internal strong reference from P's compiled module record to M's. Node.js's `clearCache` only removes M from JS-level caches (`loadCache`, `resolveCache`, `require.cache`). It cannot sever the V8-internal link — M's `ModuleWrap` (and its `v8::Globalv8::Module`) remains alive for as long as P is alive. The leak is bounded, not unbounded. Each clear+reimport cycle with a live static parent produces exactly one stale `ModuleWrap` (the original), which stays alive for P's lifetime. It does not accumulate per cycle — see point 5 below. For dynamic imports (`await import()` with no static parent), the old `ModuleWrap` becomes eligible for GC once it is removed from the loadCache and all JS-land namespace references are dropped. The existing module-wrap-leak test and the new static-import-leak test confirm this. 2. Docs should be stronger about the leak risk Done. Added a dedicated "Memory retention and static imports" section that explicitly states:
3. Missing test coverage for cleared statically-imported modules Added `test/es-module/test-module-clear-cache-static-import-leak.mjs` with a fixture `test/fixtures/module-cache/esm-static-parent.mjs` that statically imports the counter. Part 1 — static parent retention: Loads the static parent (which pins counter v1). Calls `clearCache(counterURL)` then `await import(counterURL)` to get counter v2. Then:
Part 2 — dynamic-only modules are freed: Uses `checkIfCollectableByCounting` across 8×4 cache-busting import/clear cycles to confirm that without a static parent, cleared `ModuleWrap`s are garbage-collected. 4. CJS concurrent import race test coverage Rewrote `test-module-clear-cache-import-cjs-race.mjs` to cover three scenarios:
5. hash_to_module_map cleanup for same-URL re-import Added `test/es-module/test-module-clear-cache-hash-map.mjs` that evaluates this directly. `hash_to_module_map` is an `unordered_multimap<int, ModuleWrap*>`. Entries are added in the `ModuleWrap` constructor and removed in `~ModuleWrap()`. `clearCache` does not directly touch it — cleanup happens when `ModuleWrap` objects are GC'd.
`GetFromModule` compares `v8::Local` handles directly, so the multimap correctly distinguishes old-M from new-M even when both coexist. No correctness issue from multiple entries. The map grows by one per cycle with a static parent, but those entries are removed when the parent is eventually collected — bounded by the number of live static parents. |
|
|
||
| #### Memory retention and static imports | ||
|
|
||
| `clearCache` only removes references from the Node.js **JavaScript-level** caches |
There was a problem hiding this comment.
Is there a possibility we could track this and emit a warning? Specifically, if we know that a module is only held at the javascript-level cache (e.g. dynamic import, etc) vs. if we know it likely would leak then we can emit a warning at the very least when clearing the cache may lead to a leak. Doesn't necessarily need to be done in this PR so consider it a non-blocking suggestion/question.
There was a problem hiding this comment.
I can take a look but we can tackle this after the initial version has landed. It's marked as "active development".

Introduce Module.clearCache() to invalidate CommonJS and ESM module caches with optional resolution context, enabling HMR-like reloads. Document the API and add tests/fixtures to cover cache invalidation behavior.