Fix DoFnInvoker cache collision for generic types#37355
Fix DoFnInvoker cache collision for generic types#37355Eliaaazzz wants to merge 3 commits intoapache:masterfrom
Conversation
This fixes a bug where ByteBuddyDoFnInvokerFactory would return the same cached invoker for different generic instantiations of the same DoFn class. Changes: 1. Introduced InvokerCacheKey with TypeDescriptors to ensure unique cache entries. 2. Updated generateInvokerClass to append type-based hash suffix. 3. Added regression test (testCacheKeyCollisionProof). Fixes apache#37351
6fcdeef to
fe957ea
Compare
|
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
fe957ea to
4d337b6
Compare
This PR fixes a critical issue where ByteBuddyDoFnInvokerFactory failed to distinguish between different generic instantiations of the same DoFn class (e.g., MyFn<String> vs MyFn<Integer>). 1. Cache Key Strategy: Introduced InvokerCacheKey to include input/output TypeDescriptors in the cache lookup. 2. Class Naming: Updated generateInvokerClass to append a type-based hash suffix to ensure unique class names. 3. Robustness (The Fix): Added defensive try-catch blocks when accessing TypeDescriptors. - Some internal transforms (like MapElements) throw IllegalStateException if getOutputTypeDescriptor() is called after serialization. - In these cases, the factory now gracefully falls back to using Object.class (legacy behavior), ensuring backward compatibility for transforms that do not retain type information at runtime. Fixes apache#37351
4d337b6 to
f0271f0
Compare
|
Assigning reviewers: R: @chamikaramj for label java. Note: If you would like to opt out of this review, comment Available commands:
The PR bot will only process comments in the main thread (not review comments). |
kennknowles
left a comment
There was a problem hiding this comment.
Nice! This is great. I just had a few bits to polish it up.
|
|
||
| @Override | ||
| public int hashCode() { | ||
| int result = fnClass.hashCode(); |
There was a problem hiding this comment.
I suggest using Objects.hash(fnClass, inputType, outputType) along with Objects.equals which makes it very easy to maintain consistency and clarity.
|
|
||
| @Override | ||
| public String toString() { | ||
| return String.format( |
There was a problem hiding this comment.
Here using ToStringHelper keeps a consistent format across all our classes.
| try { | ||
| inputType = fn.getInputTypeDescriptor(); | ||
| } catch (Exception e) { | ||
| // Some DoFns (like MapElements) throw IllegalStateException if queried after |
There was a problem hiding this comment.
I'm interested in digging in to this. Will we still have cache collision problems in this case?
| // different generic instantiations of the same DoFn class. | ||
| // Format: <DoFn class name>$<DoFnInvoker>$<type hash> | ||
| TypeDescriptor<Void> voidType = new StableNameTestDoFn().getInputTypeDescriptor(); | ||
| String expectedTypeSuffix = |
There was a problem hiding this comment.
Rather than repeat the logic of producing this esoteric suffix, go ahead and make it a package-private, or even public method. (the whole module should be @Internal so public methods are not frozen APIs)
There was a problem hiding this comment.
@kennknowles Thanks for the review! I have updated the code to use ToStringHelper and Objects.hash as suggested, and extracted the suffix logic to clean up the tests.
Regarding the cache collision in the fallback case:
You are exactly right—if type lookup fails, different generic instantiations like MyDoFn and MyDoFn will map to the same CacheKey.
However, it's important to highlight that type lookup failure is often an objective side effect of deserialization in a distributed environment. When a DoFn is serialized and transmitted across workers, specific generic type information can be lost due to Java's type erasure or classloader limitations on the worker side. This is an inherent constraint we must handle.
This collision is safe and intentional for two reasons:
Class Isolation: The cache key still includes fnClass, so different DoFn implementations will never share an invoker.
Erasure Compatibility: When we fall back to Object, we generate a "Raw Invoker". Due to Java type erasure, the underlying method in the bytecode acts as processElement(Object). Thus, a single shared "Raw Invoker" is perfectly compatible with any generic instantiation of that class.
I have added a comment in the catch block to explicitly mention that this fallback is a resilient design for cases where type information is lost during deserialization, ensuring the system remains functional even when reflection is limited.
bd5a070 to
f3a1596
Compare
- Replace String.format in toString() with MoreObjects.toStringHelper for consistency. - Update hashCode() and equals() to use java.util.Objects utility methods. - Extract type suffix generation logic into a reusable static method to avoid duplication in tests.
f3a1596 to
648ab3c
Compare
|
Reminder, please take a look at this pr: @chamikaramj |
|
Assigning new set of reviewers because Pr has gone too long without review. If you would like to opt out of this review, comment R: @ahmedabu98 for label java. Available commands:
|
This fixes a bug where ByteBuddyDoFnInvokerFactory would return the same cached invoker for different generic instantiations of the same DoFn class.
Changes:
Fixes #37351
Description
This PR addresses a critical bug in the
ByteBuddyDoFnInvokerFactorywhere the caching mechanism for generatedDoFnInvokerclasses failed to distinguish between different generic instantiations of the sameDoFnclass.1. The Problem: Cache Collision on Generic Types
The previous implementation utilized a
ConcurrentHashMap<Class<?>, Constructor<?>>to cache the generated invoker constructors. The cache key was solely theDoFnclass itself.In scenarios involving generic
DoFnclasses (e.g.,class MyGenericFn<T> extends DoFn<T, T>), the raw class object is identical regardless of the type parameterT.MyGenericFn<String>and laterMyGenericFn<Integer>, the factory would generate an invoker forStringand cache it.MyGenericFn<Integer>is requested, the factory hits the cache using the sameMyGenericFn.classkey and returns theString-specialized invoker.ClassCastExceptionor incorrectCoderinference, as the bytecode for the invoker is specialized for the wrong type.2. The Solution: Type-Aware Caching & Naming
This PR introduces a composite cache key and updates the class naming strategy to ensure full isolation between generic types.
A. Introduced
InvokerCacheKeyI replaced the simple
Class<?>key with a new static inner classInvokerCacheKey. This key encapsulates three components:fnClass: The raw class of the DoFn.inputType: TheTypeDescriptorfor the input element.outputType: TheTypeDescriptorfor the output element.By implementing
equals()andhashCode()based on all three fields, we ensure thatMyGenericFn<String>andMyGenericFn<Integer>map to distinct cache entries.B. Unique Class Naming Strategy
ByteBuddy cannot define two different classes with the exact same name in the same
ClassLoader. To support multiple generated invokers for the same rawDoFnclass, I modifiedgenerateInvokerClass.<DoFnClassName>$DoFnInvoker<DoFnClassName>$DoFnInvoker$<TypeHash>The suffix now includes a hexadecimal hash derived from the input and output
TypeDescriptors. This guarantees that the generated bytecode class names are unique for each generic instantiation.Test Plan
I have added a regression test
testCacheKeyCollisionProoftoDoFnInvokersTest.java.DynamicTypeDoFn<T>and manually forces specificTypeDescriptorreturns.String, once forInteger) and asserts that the generated Invoker classes are not the same (assertNotSame).Existing tests passed:
DoFnInvokersTestThank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>instead.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.