Skip to content

Environment variables defined in code#4977

Draft
happz wants to merge 1 commit into
mainfrom
envvar-definitions
Draft

Environment variables defined in code#4977
happz wants to merge 1 commit into
mainfrom
envvar-definitions

Conversation

@happz

@happz happz commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Environment variables in tmt suffer from several issues:

  • Documentation is hand-written in overview.rst which makes it detached from envvars as they are used in code: we can end up with documentation for variables that do not exist anymore, or lack documentation for variables used in code.
  • Envvar names in code are ad-hoc strings, and therefore prone to typos: one part of code may expose TMT_TEST_FOO, another part TMT_TEST_FO0, which would be hard to discover without a test extensive enough.
  • Envvars are spread throughout the code without clear ownership: which piece of code "owns" TMT_TEST_STEP_DATA?

The patch adds a new class, EnvVar, which bundles together some important info about a single environment variable. Environment now accepts not just string key, but also instances of EnvVar class, taking the envvar name from the instance.

  • Bundling all envvar-related info into one package should lower the chance of getting out of sync.
  • Using EnvVar instances prevents typos: environment['TMT_TEST_FO0'] is trivial mistake, environment[ENV_TMT_TEST_FOO] leaves no space for weird fonts or low lighting.
  • Eventually, we can simply scan tmt package and discover all environment variables recognized by tmt, both consumed and exported by, and generate their documentation.
  • scope attribute should allow nice grouping, easily answering questions like "what envvars are exposed to by prepare/shell phase?"

As a PoC, I modified execute step alone, there is no documentation generated yet.

Pull Request Checklist

  • implement the feature
  • write the documentation
  • extend the test coverage
  • update the specification
  • adjust plugin docstring
  • modify the json schema
  • mention the version
  • include a release note

Environment variables in tmt suffer from several issues:

* Documentation is hand-written in `overview.rst` which makes it
  detached from envvars as they are used in code: we can end up with
  documentation for variables that do not exist anymore, or lack
  documentation for variables used in code.
* Envvar names in code are ad-hoc strings, and therefore prone to typos:
  one part of code may expose `TMT_TEST_FOO`, another part
  `TMT_TEST_FO0`, which would be hard to discover without a test
  extensive enough.
* Envvars are spread throughout the code without clear ownership: which
  piece of code "owns" `TMT_TEST_STEP_DATA`?

The patch adds a new class, `EnvVar`, which bundles together some
important info about a single environment variable. `Environment` now
accepts not just string key, but also instances of `EnvVar` class,
taking the envvar name from the instance.

* Bundling all envvar-related info into one package should lower the
  chance of getting out of sync.
* Using `EnvVar` instances prevents typos: `environment['TMT_TEST_FO0']`
  is trivial mistake, `environment[ENV_TMT_TEST_FOO]` leaves no space
  for weird fonts or low lighting.
* Eventually, we can simply scan tmt package and discover all
  environment variables recognized by tmt, both consumed and exported
  by, and generate their documentation.
* `scope` attribute should allow nice grouping, easily answering
  questions like "what envvars are exposed to by `prepare/shell` phase?"

As a PoC, I modified `execute` step alone, there is no documentation
generated yet.
@happz happz added the ci | full test Pull request is ready for the full test execution label Jun 9, 2026
@happz happz added this to planning Jun 9, 2026
@github-project-automation github-project-automation Bot moved this to backlog in planning Jun 9, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a structured EnvVar class to represent environment variables and updates the Environment class and various steps to use these typed constants instead of raw strings. Feedback includes making EnvVar hashable by defining __hash__ and __eq__, overriding __contains__, get, and pop in Environment for dictionary compatibility, reverting the redundant copy() change, and moving ENV_TMT_PLAN_ENVIRONMENT_FILE to a central module to avoid circular dependencies.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread tmt/utils/__init__.py
Comment on lines +492 to +504
def __init__(
self,
*,
name: str,
scope: Optional[set[str]] = None,
doc: Optional[str] = None,
) -> None:
self.name = name
self.scope = scope or set()

self.__doc__ = (
textwrap.dedent(doc).strip() if doc else 'This environment variable is undocumented.'
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Define __hash__ and __eq__ methods for EnvVar to make it hashable. Since Environment accepts EnvVar instances as dictionary keys, they must be hashable to prevent a TypeError: unhashable type: 'EnvVar' runtime crash.

    def __init__(
        self,
        *,
        name: str,
        scope: Optional[set[str]] = None,
        doc: Optional[str] = None,
    ) -> None:
        self.name = name
        self.scope = scope or set()

        self.__doc__ = (
            textwrap.dedent(doc).strip() if doc else 'This environment variable is undocumented.'
        )

    def __hash__(self) -> int:
        return hash(self.name)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, EnvVar):
            return NotImplemented
        return self.name == other.name

Comment thread tmt/utils/__init__.py
Comment on lines +541 to +545
def __getitem__(self, key: Union[EnvVar, EnvVarName]) -> EnvVarValue:
return super().__getitem__(key.name if isinstance(key, EnvVar) else key)

def __setitem__(self, key: Union[EnvVar, EnvVarName], value: EnvVarValue) -> None:
super().__setitem__((key.name if isinstance(key, EnvVar) else key), value)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Override __contains__, get, and pop methods in Environment to ensure full dictionary interface compatibility when using EnvVar keys. Otherwise, operations like key in environment or environment.get(key) will fail or return incorrect results because they bypass __getitem__.

Suggested change
def __getitem__(self, key: Union[EnvVar, EnvVarName]) -> EnvVarValue:
return super().__getitem__(key.name if isinstance(key, EnvVar) else key)
def __setitem__(self, key: Union[EnvVar, EnvVarName], value: EnvVarValue) -> None:
super().__setitem__((key.name if isinstance(key, EnvVar) else key), value)
def __getitem__(self, key: Union[EnvVar, EnvVarName]) -> EnvVarValue:
return super().__getitem__(key.name if isinstance(key, EnvVar) else key)
def __setitem__(self, key: Union[EnvVar, EnvVarName], value: EnvVarValue) -> None:
super().__setitem__((key.name if isinstance(key, EnvVar) else key), value)
def __contains__(self, key: object) -> bool:
return super().__contains__(key.name if isinstance(key, EnvVar) else key)
def get(self, key: Union[EnvVar, EnvVarName], default: Any = None) -> Any:
return super().get(key.name if isinstance(key, EnvVar) else key, default)
def pop(self, key: Union[EnvVar, EnvVarName], *args: Any) -> Any:
return super().pop(key.name if isinstance(key, EnvVar) else key, *args)

Comment thread tmt/utils/__init__.py

def copy(self) -> 'Environment':
return Environment(self)
return Environment.from_dict(self)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Revert the change to copy() and use Environment(self) directly. Since Environment.__init__ handles self (which is a dict[str, EnvVarValue]) correctly, using from_dict() adds redundant overhead by stringifying keys and values again.

Suggested change
return Environment.from_dict(self)
return Environment(self)

Comment thread tmt/guest/__init__.py
# more.
if self.plan_environment_path:
environment['TMT_PLAN_ENVIRONMENT_FILE'] = EnvVarValue(self.plan_environment_path)
from tmt.steps.execute import ENV_TMT_PLAN_ENVIRONMENT_FILE

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Move ENV_TMT_PLAN_ENVIRONMENT_FILE to a central module (such as a dedicated environment module) to avoid a layering violation and circular dependency caused by importing from tmt.steps.execute in tmt.guest.

Comment thread tmt/utils/__init__.py
Comment on lines +502 to +504
self.__doc__ = (
textwrap.dedent(doc).strip() if doc else 'This environment variable is undocumented.'
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Counter design proposal, you could instead use __inti_subclass__ and define all of these by (singleton) class definitions. The doc is then defined more directly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't that require custom subclasses for individual envvars?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but would that be problematic in a way though? We have better built-in support for documenting those. It think that would be the only difference in these two approaches. Anything else that you could do with instances you can do with subclasses and vice-versa.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say I would prefer such a design. To me, using instances seems more fitting to what they represent, individual instances of some "environment variable" template. I suppose it's possible to achieve the similar outcome with subclasses, it's not what I would pick. The eventual documentation would employ custom formatting, bold text here, empty line there, foo.__doc__ would be available in both cases. So any benefit would be in tmt code and the design, and having a subclass per envvar does not feel as right tool to me.

Comment thread tmt/utils/__init__.py
Comment on lines +460 to +466
class Scope:
"""
Scopes of environment variables.
"""

#: Environment variable is consumed by tmt process itself.
TMT = {'tmt'}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you see it being expanded and why not Enum classes?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you see it being expanded...

In the final docs? So far my idea is to collect all envvars, and then pick and document those for the tmt scope, then those for the execute step, then... And so on.

... and why not Enum classes?

Mostly the readability preference on my side. With Enum defining scopes like TEST or EXECUTE, envvars needed to scope={EnvVar.Scope.TEST, EnvVar.Scope.EXECUTE}, i.e. actually define a set. With scopes being sets already, scope=EnvVar.Scope.TEST | EnvVar.Scope.EXECUTE ` seemed more readable to me. Similar to bit flags, or file/directory permissions.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha. Can't we use one (or more) of the enum classes. enum.Flag seems to fit your constraint there?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes, that should work. If I drop the set,I'm not sure about the right annotation of FOO | BAR union, but yes, it should fit the job description.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci | full test Pull request is ready for the full test execution

Projects

Status: backlog

Development

Successfully merging this pull request may close these issues.

2 participants