init commit, tldr the cache_dir or files should probably indicate tab…#835

Draft
jhnwu3 wants to merge 4 commits into master from fix/base_dataset_cache_name
Conversation

jhnwu3 (Collaborator) commented Feb 9, 2026

The problem is that, when users want to specify a new cache location, it can be hard for them to tell exactly which tables are cached inside each cache file.

jhnwu3 (Collaborator, Author) commented Feb 9, 2026

I don't know how to convert this to a draft, but I will keep iterating on it as I go.

EricSchrock marked this pull request as draft February 9, 2026 23:15
jhnwu3 requested review from EricSchrock and Logiquo and removed request for Logiquo February 11, 2026 02:12
Comment on lines 376 to 381
if self._cache_dir is None:
    # No cache_dir provided: use default pyhealth cache directory
    cache_dir = Path(platformdirs.user_cache_dir(appname="pyhealth")) / "datasets"
    cache_dir.mkdir(parents=True, exist_ok=True)
    logger.info(f"No cache_dir provided. Using default cache dir: {cache_dir}")
    self._cache_dir = cache_dir
Collaborator

All task files still go under the same .../tasks/{task.task_name}_{uuid.uuid5(uuid.NAMESPACE_DNS, task_params)}/... path.

So the same task with the same schema and the same processor, run on a different dataset, will still collide.
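
A minimal sketch of the collision, using only the path pattern quoted above. It assumes `task_params` is a string that serializes the schema and processors; its exact contents are not shown in this thread. The helper names, the task name `mortality`, the dataset names `dataset_a` and `dataset_b`, and the variant that mixes a dataset identifier into the hash are all hypothetical, included only to illustrate one possible way to separate the directories.

```python
import uuid


def task_cache_name(task_name: str, task_params: str) -> str:
    # Mirrors the quoted pattern: {task_name}_{uuid5(NAMESPACE_DNS, task_params)}.
    # Nothing about the dataset enters the name.
    return f"{task_name}_{uuid.uuid5(uuid.NAMESPACE_DNS, task_params)}"


def task_cache_name_with_dataset(
    dataset_name: str, task_name: str, task_params: str
) -> str:
    # Hypothetical variant (an assumption, not the PR's code): hash the
    # dataset identifier together with the task parameters.
    key = f"{dataset_name}:{task_params}"
    return f"{task_name}_{uuid.uuid5(uuid.NAMESPACE_DNS, key)}"


# Placeholder parameters: same schema and same processor for both datasets.
params = '{"schema": "same", "processor": "same"}'

# uuid5 is deterministic, so two different datasets get the same directory name.
name_for_dataset_a = task_cache_name("mortality", params)
name_for_dataset_b = task_cache_name("mortality", params)
collides = name_for_dataset_a == name_for_dataset_b

# With the dataset identifier in the hashed key, the names differ.
separated = (
    task_cache_name_with_dataset("dataset_a", "mortality", params)
    != task_cache_name_with_dataset("dataset_b", "mortality", params)
)
```

Whether the dataset identifier belongs in the hashed key or in a parent directory is a design choice for the PR; the sketch only shows that the current name cannot distinguish datasets.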
