-
Notifications
You must be signed in to change notification settings - Fork 12
Setup DistABLPLoader for GraphStore mode #485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
/e2e_test |
|
/integration_test |
|
/unit_test_py |
GiGL Automation@ 19:58:53UTC : 🔄 @ 21:23:22UTC : ✅ Workflow completed successfully. |
GiGL Automation@ 19:58:54UTC : 🔄 @ 21:07:19UTC : ✅ Workflow completed successfully. |
GiGL Automation@ 19:58:59UTC : 🔄 @ 21:10:08UTC : ✅ Workflow completed successfully. |
mkolodner-sc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kyle, left a few comments/questions
| # Graph Store mode inputs | ||
| dict[int, tuple[torch.Tensor, torch.Tensor, Optional[torch.Tensor]]], | ||
| tuple[ | ||
| NodeType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to double check, how would this interplay if I have user as my anchor node type and item as my supervision node type?
Tbh it would feel a bit weird to specify {user: {0: (0, 1, 1)}} with the 0 corresponding to the user node type but the two 1 node types are items. Ideally, if a user is providing the positive and negative labels here, I feel like they should have to provide the node type in some way. This may be even more true once we will need to support multiple supervision edge types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good callout!
Hmmm, wdyt about dict[int, tuple[tuple[NodeType, Tensor], tuple[NodeType, Tensor], Optional[tuple[NodeType, Tensor]]?
Or honest that's so complicated probably worth creating a dataclass to wrap all the data?
I know we want to avoid dataclasses here but frankly this is so complicated I think it's worth it. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is growing complicated enough here honestly where I think a dataclass specifically explaining this input (and potentially having the RemoteDataset.get_ablp_inputs() return this dataclass) makes sense.
Re: dict[int, tuple[tuple[NodeType, Tensor], tuple[NodeType, Tensor], Optional[tuple[NodeType, Tensor]]
Not to add complexity unfortunately to this already complex object, but for multiple supervision edge types, we may need to make it list[tuple[NodeType, Tensor]] for the positive and negatives :')
Alternatively, another option could be to expose positive_nodes, negative_nodes, or label_nodes as direct arguments. It seems like it's a tradeoff between
- adhering to GLT by only having one
input_nodesbut having a fairly complex structure to provide as input :') - having multiple arguments but now we've evolved the API beyond GLT/PyG and further added complexity to the API
I need to think more about which approach I prefer but these are my thoughts for now
Scope of work done
Where is the documentation for this feature?: N/A
Did you add automated tests or write a test plan?
Updated Changelog.md? NO
Ready for code review?: NO