Skip to content

Conversation

@yiyixuxu
Copy link
Collaborator

@yiyixuxu yiyixuxu commented Jan 10, 2026

This PR adds a utility script utils/modular_auto_docstring.py that automatically generates docstrings for modular pipeline block classes from their doc property.

Usage

Mark classes with # auto_docstring comment:

# auto_docstring
class QwenImageAutoVaeEncoderStep(AutoPipelineBlocks):
    block_classes = [QwenImageInpaintVaeEncoderStep, QwenImageImg2ImgVaeEncoderStep]
    block_names = ["inpaint", "img2img"]
    block_trigger_inputs = ["mask_image", "image"]

    @property
    def doc(self):
        return (
            "Vae encoder step that encodes image inputs into latent representations.\n"
            "This is an auto pipeline block.\n"
            " - `QwenImageInpaintVaeEncoderStep` (inpaint) is used when `mask_image` is provided.\n"
            " - `QwenImageImg2ImgVaeEncoderStep` (img2img) is used when `image` is provided."
        )

Run the script to insert docstrings:

python utils/modular_auto_docstring.py --fix_and_overwrite

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu yiyixuxu changed the title [modular] more doc related refactors [modular] add auto_docstring & more doc related refactors Jan 10, 2026
# ======================================================

@classmethod
def prompt(cls) -> "InputParam":
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our pipeline parameter are pretty consistent across different pipelines, e.g. you always have prompt, height, width, num_inference_steps, etc. I made template for these common ones, so that it is easier to define

before you need

InputParam(
    name="prompt",
    type_hint=str,
    required=True,
    description="The prompt or prompts to guide image generation."
)

now you do

InputParam.prompt()
InputParam.height(default=1024)
InputParam.num_inference_steps(default=28)
InputParam.generator()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit apprehensive about introducing dedicated class methods for common parameters in this way. I think the class can become quite large as common inputs expand.

I would prefer to keep current syntax (IMO this ensures InputParams are defined in a consistent way) and use post init on the dataclass to automatically add a description. e.g

# centralised descriptions would live somewhere like constants.py 
# can be used for modular + non-modular
INPUT_PARAM_TEMPLATES = {
      "prompt": {"type_hint": str, "required": True, "description": "The prompt or prompts to guide image generation."},
      "height": {"type_hint": int, "description": "The height in pixels of the generated image."},
      "width": {"type_hint": int, "description": "The width in pixels of the generated image."},
      "generator": {"type_hint": torch.Generator, "description": "Torch generator for deterministic generation."},
      # ...
  }

  @dataclass
  class InputParam:
      name: str = None
      type_hint: Any = None
      required: bool = False
      default: Any = None
      description: str = None

  def __post_init__(self):
      if not self.name or self.name not in INPUT_PARAM_TEMPLATES:
          return

      template = INPUT_PARAM_TEMPLATES[self.name]
      if self.type_hint is None:
          self.type_hint = template.get("type_hint")
      if self.description is None:
          self.description = template.get("description")

If we feel that methods for these inputs are necessary, one way to address it without adding individual methods to the InputParam is to use a metaclass. It would result in the InputParam object being less crowded.

  class InputParamMeta(type):
      def __getattr__(cls, name: str):
          if name in INPUT_PARAM_TEMPLATES:
              def factory(**overrides):
                  return cls(name=name, **overrides)
              return factory
          raise AttributeError(f"No template named '{name}'")


  @dataclass
  class InputParam(metaclass=InputParamMeta):

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the class methods!
also moved the common parameter definitions out of InputParam/OutputParam as well. They're currently still in the same file, but we can move them elsewhere later.

To use a predefined param:

InputParam.template("image")

i went with this API, instead of the __post_init__ because you have to explicitly opt into using the template, even though it's slightly more verbose.

The main thing I wanted to avoid is with __post_init__ auto-filling, if you make a param that exists in our template but actually intend to customize its docstring later, e.g. you do something likeInputParam(name="image"), you will silently get the template defaults, and it is hard to notice.

With .template(), it's more clear when the user intends to use the pre-defined definition vs write a custom param. so for the same example, if someone writes InputParam(name="image") without filling in fields, the docstring will show "TODO: Add description".

we can also override fields in the template like this:

InputParam.template("image", note="resized")  # appends a note to the description
InputParam.template("image", required=False)


# auto_docstring
class QwenImageAutoTextEncoderStep(AutoPipelineBlocks):
"""
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are all auto generated docstring

# ====================


# auto_docstring
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add this mark and then run

python utils/modular_auto_docstring.py --fix_and_overwrite

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add this to make style and error out in the CI if it's not the case? 👀

This will help us enforce consistency.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we should do that eventually!
(won't include in this PR though)

@yiyixuxu yiyixuxu requested review from DN6 and sayakpaul January 10, 2026 11:34
# ======================================================

@classmethod
def prompt(cls) -> "InputParam":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit apprehensive about introducing dedicated class methods for common parameters in this way. I think the class can become quite large as common inputs expand.

I would prefer to keep current syntax (IMO this ensures InputParams are defined in a consistent way) and use post init on the dataclass to automatically add a description. e.g

# centralised descriptions would live somewhere like constants.py 
# can be used for modular + non-modular
INPUT_PARAM_TEMPLATES = {
      "prompt": {"type_hint": str, "required": True, "description": "The prompt or prompts to guide image generation."},
      "height": {"type_hint": int, "description": "The height in pixels of the generated image."},
      "width": {"type_hint": int, "description": "The width in pixels of the generated image."},
      "generator": {"type_hint": torch.Generator, "description": "Torch generator for deterministic generation."},
      # ...
  }

  @dataclass
  class InputParam:
      name: str = None
      type_hint: Any = None
      required: bool = False
      default: Any = None
      description: str = None

  def __post_init__(self):
      if not self.name or self.name not in INPUT_PARAM_TEMPLATES:
          return

      template = INPUT_PARAM_TEMPLATES[self.name]
      if self.type_hint is None:
          self.type_hint = template.get("type_hint")
      if self.description is None:
          self.description = template.get("description")

If we feel that methods for these inputs are necessary, one way to address it without adding individual methods to the InputParam is to use a metaclass. It would result in the InputParam object being less crowded.

  class InputParamMeta(type):
      def __getattr__(cls, name: str):
          if name in INPUT_PARAM_TEMPLATES:
              def factory(**overrides):
                  return cls(name=name, **overrides)
              return factory
          raise AttributeError(f"No template named '{name}'")


  @dataclass
  class InputParam(metaclass=InputParamMeta):

Copy link
Member

@sayakpaul sayakpaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it will be better to review the PR after https://github.com/huggingface/diffusers/pull/12958/files#r2689367739 ? 👀

Copy link
Member

@sayakpaul sayakpaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super cool feature!

# ====================


# auto_docstring
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add this to make style and error out in the CI if it's not the case? 👀

This will help us enforce consistency.

Comment on lines +120 to +122
return "Text encoder step that encodes the text prompt into a text embedding. This is an auto pipeline block."
" - `QwenImageTextEncoderStep` (text_encoder) is used when `prompt` is provided."
" - if `prompt` is not provided, step will be skipped."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be autogenerated as well provided a class name?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea!
I think it should be doable, I will look into it (not in this PR)

Inputs:
mask_image (`Image`):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong opinion but I find PIL.Image.Image a tad bit more informative as a type hint.

processed_image (`None`):
image_latents (`Tensor`):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly torch.Tensor?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will look into improving them in a follow-up

width (`int`, *optional*):
The width in pixels of the generated image.
generator (`Generator`, *optional*):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

torch.Generator perhaps?

@yiyixuxu
Copy link
Collaborator Author

@bot /style

@github-actions
Copy link
Contributor

github-actions bot commented Jan 19, 2026

Style bot fixed some files and pushed the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants