# SPDX-FileCopyrightText: euronion
#
# SPDX-License-Identifier: MIT
from pathlib import Path

# Rules with hardcoded paths can lead to mistakes
# if the output name is misspelled in another rule
#
# Running this rule will fail, because the file it creates is not the output that Snakemake expects:
# > snakemake -F -c1 -s Snakefile_FileNotFound create_first_file_hardcoded_name
# > snakemake -F -c1 -s Snakefile_FileNotFound data/file1.txt
#
# If you check the `data` directory, you will see that `file.txt` was created instead of `file1.txt`.
# Snakemake does not remove it, because it is not linked to the workflow/output of the rule that just failed:
# > ls data/
rule create_first_file_hardcoded_name:
    message:
        "Creating the first file with a hardcoded name"
    output:
        "data/file1.txt",
    run:
        # The hardcoded path does not match the declared output (`file.txt` vs. `file1.txt`)
        Path("data/file.txt").touch()

# Instead do this: refer to the outputs of a rule via the `output` variable
# instead of hardcoding the path again.
# Outputs can be accessed by index, or by key if they are named.
# This is also how you will usually encounter it in e.g. PyPSA-Eur or PyPSA-Earth.
# (I generally recommend using named outputs. They improve readability
# and allow you to change the order of outputs later without breaking things.)
# > snakemake -F -c1 -s Snakefile_FileNotFound create_second_file_with_output_variable
rule create_second_file_with_output_variable:
    output:
        fn="data/file2.txt",
    run:
        Path(output[0]).touch()
        # This also works:
        Path(output["fn"]).touch()
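
# A sketch of why named outputs help, assuming a hypothetical rule with two outputs
# (the rule name, the output names `first`/`second` and the file names are illustrative only).
# Named outputs can be accessed as attributes or by key, and both stay correct
# if the order of the outputs is changed later.
# > snakemake -F -c1 -s Snakefile_FileNotFound create_two_files_with_named_outputs
rule create_two_files_with_named_outputs:
    output:
        first="data/file2_first.txt",
        second="data/file2_second.txt",
    run:
        # Attribute access and key access both resolve the named outputs
        Path(output.first).touch()
        Path(output["second"]).touch()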

# Wildcards are commonly used in Snakemake workflows to generalise rules.
# Run (it will fail, because of a missing wildcard value):
# > snakemake -F -c1 -s Snakefile_FileNotFound create_third_file_with_wildcard_output
# Run instead (this will succeed):
# > snakemake -F -c1 -s Snakefile_FileNotFound data/file3.txt
# Nasty trick: you can also specify the wildcard value on the command line (don't do this in production, internal trick only):
# > snakemake -F -c1 -s Snakefile_FileNotFound --target-jobs create_third_file_with_wildcard_output:ext=txt
rule create_third_file_with_wildcard_output:
    message:
        "This rule creates a file with the file extension given by the wildcard `ext`='{wildcards.ext}'"
    output:
        fn="data/file3.{ext}",
    run:
        Path(output["fn"]).touch()

# Wildcards can lead to mistakes if the wildcard is not properly specified in the input/output of another rule.
# Here we want to create all three files with different extensions using the rule above.
# (Without an explicit target, Snakemake would run the first rule in the file instead.)
# > snakemake -F -c1 -s Snakefile_FileNotFound create_all_third_files_hardcoded
rule create_all_third_files_hardcoded:
    input:
        # here the extensions are mapped to the {ext} wildcard of the rule above
        "data/file3.txt",
        "data/file3.csv",
        "data/file3.data",

# But typos can happen and will cause the rule to fail:
# > snakemake -F -c1 -s Snakefile_FileNotFound create_all_third_files_with_typos
rule create_all_third_files_with_typos:
    input:
        "data/file3.txt",
        "data/file3.csv",
        # These will fail (typos)
        "data/fil3edata",
        "data/fiIe3.dat",

# Typos can be avoided, e.g. by using expand(...) if a wildcard has to be mapped to multiple values.
# > snakemake -F -c1 -s Snakefile_FileNotFound create_all_third_files_with_expand
rule create_all_third_files_with_expand:
    input:
        expand("data/file3.{ext}", ext=["txt", "csv", "dat"]),
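
# A small sanity check of the `expand(...)` call above, as a sketch: inside a Snakefile,
# `expand` is an ordinary function that returns a list of strings, so the resulting
# paths can be inspected with plain Python when the Snakefile is parsed.
# (The variable name `_expanded_third_files` is illustrative only.)
_expanded_third_files = expand("data/file3.{ext}", ext=["txt", "csv", "dat"])
assert _expanded_third_files == ["data/file3.txt", "data/file3.csv", "data/file3.dat"]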

# You can also refer to the output of other rules directly without having to spell out the path again.
# Notice that we also have to use `expand(...)` here because the output still contains a wildcard.
# > snakemake -F -c1 -s Snakefile_FileNotFound create_all_third_files_with_rules_output
rule create_all_third_files_with_rules_output:
    input:
        expand(
            rules.create_third_file_with_wildcard_output.output["fn"],
            ext=["txt", "csv", "dat"],
        ),