@sayakpaul sayakpaul commented Jan 14, 2026

What does this PR do?

This PR assesses whether we can switch our CI back to transformers main. It will also help us migrate to transformers v5.

Notes

For the FAILED tests/pipelines/test_pipelines.py::DownloadTests::test_textual_inversion_unload - AttributeError: CLIPTokenizer has no attribute _added_tokens_decoder. Did you mean: 'added_tokens_decoder'? error, see this internal discussion.

@sayakpaul sayakpaul requested a review from DN6 January 14, 2026 09:23
- "tests/pipelines/test_pipelines_common.py"
- "tests/models/test_modeling_common.py"
- "examples/**/*.py"
- ".github/**.yml"

Temporary. For this PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul sayakpaul marked this pull request as draft January 15, 2026 12:05
@sayakpaul sayakpaul changed the title switch to transformers main again. [main] switch to transformers main again. Jan 15, 2026
@sayakpaul sayakpaul changed the title [main] switch to transformers main again. [wip] switch to transformers main again. Jan 15, 2026
logger.addHandler(stream_handler)


@unittest.skipIf(is_transformers_version(">=", "4.57.5"), "Size mismatch")

torch.nn.ConvTranspose2d,
torch.nn.ConvTranspose3d,
torch.nn.Linear,
torch.nn.Embedding,

Happening because of the way weight loading is done in v5.

Comment on lines +23 to +25
model = AutoModel.from_pretrained(
    "hf-internal-testing/tiny-stable-diffusion-torch", subfolder="text_encoder", use_safetensors=False
)

Comment on lines +281 to +283
input_ids = (
    input_ids["input_ids"] if not isinstance(input_ids, list) and "input_ids" in input_ids else input_ids
)
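
The expression above unwraps a tokenizer output that may be either a mapping carrying an `input_ids` entry or the IDs themselves. A minimal sketch of the same idea as a standalone helper, assuming the mapping case behaves like a `collections.abc.Mapping`; the name `unwrap_input_ids` is hypothetical and not part of the PR:

```python
from collections.abc import Mapping


def unwrap_input_ids(encoded):
    """Return the raw IDs whether `encoded` is a mapping or already the IDs."""
    # Mapping-like tokenizer outputs carry the IDs under the "input_ids" key.
    if isinstance(encoded, Mapping) and "input_ids" in encoded:
        return encoded["input_ids"]
    # Anything else (for example a plain list of IDs) is returned unchanged.
    return encoded
```

Checking for a mapping rather than excluding `list` is a design choice of this sketch: it avoids evaluating `"input_ids" in ...` on objects that are not key-value containers.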

  inputs = {
      "prompt": "dance monkey",
-     "negative_prompt": "",
+     "negative_prompt": "bad",

Otherwise, the corresponding tokenizer outputs:

negative_prompt=[' ']
prompt=[' ']
text_input_ids=tensor([], size=(1, 0), dtype=torch.int64)

which leads to:

E       RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1, 8] because the unspecified dimension size -1 can be any value and is ambiguous
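
The reshape error follows from how a `-1` dimension is inferred: the total number of elements is divided by the product of the known dimensions, and when that product is zero every size satisfies the equation. A pure-Python sketch of that inference rule, for illustration only; `infer_missing_dim` is a hypothetical helper and not PyTorch's implementation:

```python
from math import prod


def infer_missing_dim(numel, shape):
    """Infer the single -1 entry of `shape` for a tensor holding `numel` elements."""
    if shape.count(-1) != 1:
        raise ValueError("expected exactly one -1 entry")
    known = prod(d for d in shape if d != -1)
    if known == 0:
        # 0 * x == 0 holds for every x, so the missing size is undetermined.
        raise ValueError(f"cannot infer -1 in {shape}: known dims multiply to 0")
    if numel % known != 0:
        raise ValueError(f"shape {shape} is invalid for {numel} elements")
    return numel // known
```

With a non-empty prompt the sequence dimension is non-zero and the inference is well defined, which is why changing the negative prompt avoids the error.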

Comment on lines +569 to +580
if is_transformers_version("<=", "4.58.0"):
    token = tokenizer._added_tokens_decoder[token_id]
    tokenizer._added_tokens_decoder[last_special_token_id + key_id] = token
    del tokenizer._added_tokens_decoder[token_id]
elif is_transformers_version(">", "4.58.0"):
    token = tokenizer.added_tokens_decoder[token_id]
    tokenizer.added_tokens_decoder[last_special_token_id + key_id] = token
    del tokenizer.added_tokens_decoder[token_id]
if is_transformers_version("<=", "4.58.0"):
    tokenizer._added_tokens_encoder[token.content] = last_special_token_id + key_id
elif is_transformers_version(">", "4.58.0"):
    tokenizer.added_tokens_encoder[token.content] = last_special_token_id + key_id

Still doesn't solve the following issue:

FAILED tests/pipelines/test_pipelines.py::DownloadTests::test_textual_inversion_unload - AttributeError: CLIPTokenizer has no attribute _added_tokens_decoder. Did you mean: 'added_tokens_decoder'?

Internal thread: https://huggingface.slack.com/archives/C014N4749J9/p1768536480412119
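
An alternative to gating on the version string is to probe for the attribute itself. The sketch below assumes only what the error message states, namely that the tokenizer exposes either `_added_tokens_decoder` or `added_tokens_decoder`. The helper `get_added_tokens_decoder` and the stand-in classes are hypothetical, and whether mutating the public mapping on transformers main has the intended effect is not verified here:

```python
def get_added_tokens_decoder(tokenizer):
    """Return the added-tokens mapping under whichever name the tokenizer exposes."""
    # Prefer the private name used by the older code path, then the public one.
    for name in ("_added_tokens_decoder", "added_tokens_decoder"):
        if hasattr(tokenizer, name):
            return getattr(tokenizer, name)
    raise AttributeError("tokenizer exposes no added-tokens decoder mapping")


# Stand-in objects used only to exercise the helper; they are not transformers classes.
class OldStyleTokenizer:
    def __init__(self):
        self._added_tokens_decoder = {49408: "<tok>"}


class NewStyleTokenizer:
    def __init__(self):
        self.added_tokens_decoder = {49408: "<tok>"}


print(get_added_tokens_decoder(OldStyleTokenizer()))
print(get_added_tokens_decoder(NewStyleTokenizer()))
```

The later `_update_trie` failure in this thread suggests the private-API differences go beyond this one attribute, so this sketch covers only the decoder lookup.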

@sayakpaul

@DN6 I have fixed most of the issues caused by v5 / main. The main remaining culprits (at least for the tests we're checking in this PR) seem to be the assertions on expected values. The obvious fix would be to either relax the tolerances or update the expected slices, but do you have any other suggestions?
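
To make the trade-off concrete, most of these assertions boil down to a maximum absolute difference compared against a threshold. A minimal sketch, using the expected and actual slices from the Kandinsky3 `test_kandinsky3` failure reported later in this thread; `max_abs_diff` is a hypothetical helper, not the test suite's own code:

```python
def max_abs_diff(expected, actual):
    """Largest element-wise absolute difference between two equal-length slices."""
    if len(expected) != len(actual):
        raise ValueError("slices must have the same length")
    return max(abs(e - a) for e, a in zip(expected, actual))


# Values copied from the reported failure message.
expected_slice = [0.3768, 0.4373, 0.4865, 0.489, 0.4299, 0.5122, 0.4921, 0.4924, 0.5599]
actual_slice = [0.37947, 0.36586, 0.47927, 0.53466, 0.44487, 0.47918, 0.5042, 0.52899, 0.54898]

diff = max_abs_diff(expected_slice, actual_slice)
print(f"max abs diff: {diff:.5f}")
```

A gap of roughly 0.07 is well beyond rounding noise, so for a case like this updating the expected slice seems more plausible than loosening the tolerance; failures that miss the threshold by a small margin are the ones where a relaxed tolerance may be defensible.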

Comment on lines +94 to +95
config = AutoConfig.from_pretrained("hf-internal-testing/tiny-random-t5")
text_encoder_2 = T5EncoderModel(config)

@sayakpaul

A bunch of tests actually fail on Diffusers main and with transformers 4.57.5 as well. Example:

FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_cfg - RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when che...
FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_attention_slicing_forward_pass - RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when che...
FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_callback_inputs - RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when che...
FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_dict_tuple_outputs_equivalent - RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when che...
FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_inference_batch_single_identical - RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when che...
FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_pipeline_with_accelerator_device_map - RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when che...
FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_save_load_float16 - AssertionError: False is not true : `safety_checker.dtype` switched from `float16` to torch.float32 after loading.
FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_sequential_cpu_offload_forward_pass - AssertionError: False is not true : Not offloaded: ['safety_checker']
FAILED tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_sequential_offload_forward_pass_twice - AssertionError: False is not true : Not offloaded: ['safety_checker']

@sayakpaul

sayakpaul commented Jan 20, 2026

As we can see here and here, some tests fail across both transformers main and 4.57.3. We should likely avoid fixing that overlap in this PR and investigate it later.

To summarize:

With 4.57.3 we have a total of 132 failures, and 157 with main. Of the 157 failures on main, 93 are new and 64 are common to both runs.

Unfold to see the failures with main
  • tests/pipelines/deepfloyd_if/test_if_inpainting.py::IFInpaintingPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.53655016) < 0.01
  • tests/pipelines/visualcloze/test_pipeline_visualcloze_combined.py::VisualClozePipelineFastTests::test_save_load_local - AssertionError: np.float32(0.00095283985) not less than 0.0005
  • tests/pipelines/deepfloyd_if/test_if_superresolution.py::IFSuperResolutionPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.3238154) not less than 0.01 : Attention slicing should not affect the inference results
  • tests/pipelines/deepfloyd_if/test_if_img2img.py::IFImg2ImgPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.005825639) not less than 0.0001
  • tests/pipelines/cogvideo/test_cogvideox_fun_control.py::CogVideoXFunControlPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.0007969141) not less than 0.0005
  • tests/pipelines/kandinsky3/test_kandinsky3.py::Kandinsky3PipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.17856759) < 0.01
  • tests/pipelines/consisid/test_consisid.py::ConsisIDPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0007199049) not less than 0.0001
  • tests/pipelines/cosmos/test_cosmos.py::CosmosTextToWorldPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.003921598) not less than 0.0005
  • tests/pipelines/hidream_image/test_pipeline_hidream.py::HiDreamImagePipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.0025759935) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/qwenimage/test_qwenimage_controlnet.py::QwenControlNetPipelineFastTests::test_qwen_controlnet - AssertionError: False is not true
  • tests/pipelines/mochi/test_mochi.py::MochiPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.0032176077) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/deepfloyd_if/test_if_img2img_superresolution.py::IFImg2ImgSuperResolutionPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0026221275) not less than 0.0001
  • tests/pipelines/consisid/test_consisid.py::ConsisIDPipelineFastTests::test_encode_prompt_works_in_isolation - AssertionError: False is not true
  • tests/pipelines/qwenimage/test_qwenimage_controlnet.py::QwenControlNetPipelineFastTests::test_qwen_controlnet_multicondition - AssertionError: False is not true
  • tests/pipelines/pixart_alpha/test_pixart.py::PixArtAlphaPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0004608631) not less than 0.0001
  • tests/pipelines/skyreels_v2/test_skyreels_v2_df_video_to_video.py::SkyReelsV2DiffusionForcingVideoToVideoPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.00086069107) < 0.0001
  • tests/pipelines/cosmos/test_cosmos.py::CosmosTextToWorldPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.003921598) not less than 0.0001
  • tests/pipelines/hidream_image/test_pipeline_hidream.py::HiDreamImagePipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.0024747849) < 0.0003
  • tests/pipelines/cosmos/test_cosmos.py::CosmosTextToWorldPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.003921598) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.003921598) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/skyreels_v2/test_skyreels_v2.py::SkyReelsV2PipelineFastTests::test_save_load_local - AssertionError: np.float32(0.006394446) not less than 0.0005
  • tests/pipelines/qwenimage/test_qwenimage.py::QwenImagePipelineFastTests::test_inference - AssertionError: False is not true
  • tests/pipelines/kandinsky3/test_kandinsky3.py::Kandinsky3PipelineFastTests::test_kandinsky3 - AssertionError: expected_slice [0.3768 0.4373 0.4865 0.489 0.4299 0.5122 0.4921 0.4924 0.5599], but got [0.37947 0.36586 0.47927 0.53466 0.44487 0.47918 0.5042 0.52899 0.54898]
  • tests/pipelines/skyreels_v2/test_skyreels_v2_df.py::SkyReelsV2DiffusionForcingPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.0063945055) not less than 0.0005
  • tests/pipelines/cogvideo/test_cogvideox.py::CogVideoXPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.0022651553) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/chroma/test_pipeline_chroma_img2img.py::ChromaImg2ImgPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00013199449) not less than 0.0001
  • tests/pipelines/skyreels_v2/test_skyreels_v2_df_image_to_video.py::SkyReelsV2DiffusionForcingImageToVideoPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00016736984) not less than 0.0001
  • tests/pipelines/cogvideo/test_cogvideox_fun_control.py::CogVideoXFunControlPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.0010365248) < 0.001
  • tests/pipelines/hidream_image/test_pipeline_hidream.py::HiDreamImagePipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0025759935) not less than 0.0001
  • tests/pipelines/visualcloze/test_pipeline_visualcloze_combined.py::VisualClozePipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.000942111) not less than 0.0001
  • tests/pipelines/deepfloyd_if/test_if_img2img.py::IFImg2ImgPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0048977137) not less than 0.0001
  • tests/pipelines/deepfloyd_if/test_if_superresolution.py::IFSuperResolutionPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.12945855) not less than 0.0001
  • tests/pipelines/skyreels_v2/test_skyreels_v2_df.py::SkyReelsV2DiffusionForcingPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0064334273) not less than 0.0001
  • tests/pipelines/deepfloyd_if/test_if_superresolution.py::IFSuperResolutionPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.2777151) not less than 0.0001
  • tests/pipelines/deepfloyd_if/test_if_inpainting.py::IFInpaintingPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.4335382) not less than 0.0001
  • tests/pipelines/bria/test_pipeline_bria.py::BriaPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00012546778) not less than 0.0001
  • tests/pipelines/hidream_image/test_pipeline_hidream.py::HiDreamImagePipelineFastTests::test_inference - AssertionError: False is not true
  • tests/pipelines/consisid/test_consisid.py::ConsisIDPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.00055527687) not less than 0.0005
  • tests/pipelines/kandinsky3/test_kandinsky3_img2img.py::Kandinsky3Img2ImgPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float64(0.05619632472991942) not less than 0.0001
  • tests/pipelines/qwenimage/test_qwenimage_edit_plus.py::QwenImageEditPlusPipelineFastTests::test_inference - AssertionError: False is not true
  • tests/pipelines/ltx/test_ltx_condition.py::LTXConditionPipelineFastTests::test_encode_prompt_works_in_isolation - AssertionError: False is not true
  • tests/pipelines/cogvideo/test_cogvideox_fun_control.py::CogVideoXFunControlPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.0013612509) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/kandinsky3/test_kandinsky3.py::Kandinsky3PipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.12960485) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/skyreels_v2/test_skyreels_v2_df_image_to_video.py::SkyReelsV2DiffusionForcingImageToVideoPipelineFastTests::test_encode_prompt_works_in_isolation - AssertionError: False is not true
  • tests/pipelines/deepfloyd_if/test_if_img2img_superresolution.py::IFImg2ImgSuperResolutionPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.0010265112) not less than 0.0001
  • tests/pipelines/deepfloyd_if/test_if_inpainting_superresolution.py::IFInpaintingSuperResolutionPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0022739172) not less than 0.0001
  • tests/pipelines/mochi/test_mochi.py::MochiPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.0005039573) not less than 0.0005
  • tests/pipelines/deepfloyd_if/test_if_inpainting.py::IFInpaintingPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.52717006) not less than 0.01 : Attention slicing should not affect the inference results
  • tests/pipelines/deepfloyd_if/test_if_inpainting.py::IFInpaintingPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.41405028) not less than 0.0001
  • tests/pipelines/ltx/test_ltx.py::LTXPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00020048022) not less than 0.0001
  • tests/pipelines/kandinsky3/test_kandinsky3_img2img.py::Kandinsky3Img2ImgPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.3048774) < 0.01
  • tests/pipelines/test_pipelines.py::DownloadTests::test_textual_inversion_unload - AttributeError: CLIPTokenizer has no attribute _update_trie
  • tests/pipelines/ltx/test_ltx.py::LTXPipelineFastTests::test_encode_prompt_works_in_isolation - AssertionError: False is not true
  • tests/pipelines/cogvideo/test_cogvideox_image2video.py::CogVideoXImageToVideoPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0019074678) not less than 0.0001
  • tests/pipelines/ltx/test_ltx_condition.py::LTXConditionPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00018006563) not less than 0.0001
  • tests/pipelines/kandinsky3/test_kandinsky3_img2img.py::Kandinsky3Img2ImgPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.19884104) not less than 0.0005
  • tests/pipelines/deepfloyd_if/test_if_inpainting_superresolution.py::IFInpaintingSuperResolutionPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.00093853474) not less than 0.0001
  • tests/pipelines/deepfloyd_if/test_if.py::IFPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.28527367) not less than 0.0001
  • tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00013393164) not less than 0.0001
  • tests/pipelines/chroma/test_pipeline_chroma.py::ChromaPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.00023895502) < 0.0001
  • tests/pipelines/qwenimage/test_qwenimage_edit.py::QwenImageEditPipelineFastTests::test_inference - AssertionError: False is not true
  • tests/pipelines/deepfloyd_if/test_if.py::IFPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.28645945) not less than 0.0001
  • tests/pipelines/chroma/test_pipeline_chroma_img2img.py::ChromaImg2ImgPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.00013142824) < 0.0001
  • tests/pipelines/skyreels_v2/test_skyreels_v2_df_video_to_video.py::SkyReelsV2DiffusionForcingVideoToVideoPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00050631166) not less than 0.0001
  • tests/pipelines/deepfloyd_if/test_if_superresolution.py::IFSuperResolutionPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.20875877) < 0.01
  • tests/pipelines/chroma/test_pipeline_chroma_img2img.py::ChromaImg2ImgPipelineFastTests::test_save_load_optional_components - AssertionError: np.float32(0.00014954805) not less than 0.0001
  • tests/pipelines/ltx/test_ltx_image2video.py::LTXImageToVideoPipelineFastTests::test_encode_prompt_works_in_isolation - AssertionError: False is not true
  • tests/pipelines/hunyuan_video/test_hunyuan_image2video.py::HunyuanVideoImageToVideoPipelineFastTests::test_inference - AssertionError: False is not true : The generated video does not match the expected slice.
  • tests/pipelines/deepfloyd_if/test_if.py::IFPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.44406703) not less than 0.01 : Attention slicing should not affect the inference results
  • tests/pipelines/hidream_image/test_pipeline_hidream.py::HiDreamImagePipelineFastTests::test_save_load_local - AssertionError: np.float32(0.0044341087) not less than 0.0005
  • tests/pipelines/cogvideo/test_cogvideox_image2video.py::CogVideoXImageToVideoPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.00066673756) not less than 0.0005
  • tests/pipelines/deepfloyd_if/test_if.py::IFPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.61046284) < 0.01
  • tests/pipelines/bria/test_pipeline_bria.py::BriaPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.00015372038) < 0.0001
  • tests/pipelines/pixart_alpha/test_pixart.py::PixArtAlphaPipelineFastTests::test_encode_prompt_works_in_isolation - AssertionError: False is not true
  • tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.003921598) not less than 0.0005
  • tests/pipelines/chroma/test_pipeline_chroma.py::ChromaPipelineFastTests::test_save_load_optional_components - AssertionError: np.float32(0.0002564788) not less than 0.0001
  • tests/pipelines/mochi/test_mochi.py::MochiPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0032176077) not less than 0.0001
  • tests/pipelines/cogview3/test_cogview3plus.py::CogView3PlusPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00010609627) not less than 0.0001
  • tests/pipelines/kandinsky3/test_kandinsky3.py::Kandinsky3PipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.12960485) not less than 0.0001
  • tests/pipelines/cosmos/test_cosmos_video2world.py::CosmosVideoToWorldPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.003921598) not less than 0.0001
  • tests/pipelines/cogvideo/test_cogvideox_image2video.py::CogVideoXImageToVideoPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.0019074678) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/skyreels_v2/test_skyreels_v2.py::SkyReelsV2PipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.0064077973) < 0.0001
  • tests/pipelines/cogvideo/test_cogvideox_fun_control.py::CogVideoXFunControlPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0013612509) not less than 0.0001
  • tests/pipelines/cogvideo/test_cogvideox.py::CogVideoXPipelineFastTests::test_save_load_local - AssertionError: np.float32(0.00089266896) not less than 0.0005
  • tests/pipelines/skyreels_v2/test_skyreels_v2.py::SkyReelsV2PipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0064335465) not less than 0.0001
  • tests/pipelines/skyreels_v2/test_skyreels_v2_df.py::SkyReelsV2DiffusionForcingPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.0064079165) < 0.0001
  • tests/pipelines/kandinsky3/test_kandinsky3_img2img.py::Kandinsky3Img2ImgPipelineFastTests::test_attention_slicing_forward_pass - AssertionError: np.float32(0.2647573) not less than 0.001 : Attention slicing should not affect the inference results
  • tests/pipelines/hunyuan_image_21/test_hunyuanimage.py::HunyuanImagePipelineFastTests::test_inference_guider - AssertionError: np.False_ is not true : output_slice: [0.60681087 0.48715815 0.5984413 0.60241336 0.48849434 0.56244814
  • tests/pipelines/cogvideo/test_cogvideox.py::CogVideoXPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.0022651553) not less than 0.0001
  • tests/pipelines/cogvideo/test_cogvideox.py::CogVideoXPipelineFastTests::test_inference_batch_single_identical - assert np.float32(0.0014221072) < 0.001
  • tests/pipelines/chroma/test_pipeline_chroma.py::ChromaPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00024789572) not less than 0.0001
  • tests/pipelines/kandinsky3/test_kandinsky3.py::Kandinsky3PipelineFastTests::test_save_load_local - AssertionError: np.float32(0.094926864) not less than 0.0005
  • tests/pipelines/ltx/test_ltx_image2video.py::LTXImageToVideoPipelineFastTests::test_dict_tuple_outputs_equivalent - AssertionError: np.float32(0.00021559) not less than 0.0001

Little utility to obtain these reports:

import re

def extract_failed_tests(log_file_path):
    """
    Parses a pytest log file and returns, for every failed test, the text
    after the 'FAILED ' prefix: the test ID plus any trailing error message.
    """
    failed_tests = []
    
    # This regex looks for the 'FAILED ' prefix used in pytest's
    # 'short test summary info' section.
    failure_pattern = re.compile(r'^FAILED\s+(.*)$')

    with open(log_file_path, 'r') as file:
        for line in file:
            line = line.strip()
            match = failure_pattern.match(line)
            if match:
                # The first group is everything after 'FAILED ': the test ID
                # (script_path.py::ClassName::test_name) plus any ' - <error>' suffix.
                failed_tests.append(match.group(1))
    
    return failed_tests

if __name__ == "__main__":
    path1 = "/Users/sayakpaul/Downloads/pr_pytorch_pipelines_torch_cpu_pipelines_transformers_4.57.3_test_reports/tests_torch_cpu_pipelines_summary_short_4_57_3.txt"
    path2 = "/Users/sayakpaul/Downloads/pr_pytorch_pipelines_torch_cpu_pipelines_transformers_main_test_reports/tests_torch_cpu_pipelines_summary_short_main.txt"

    failures_one = set(extract_failed_tests(path1))
    failures_two = set(extract_failed_tests(path2))
    print(f"Old: {len(failures_one)}, New: {len(failures_two)}")
    difference = failures_two.difference(failures_one)
    intersection = failures_two.intersection(failures_one)
    print(f"{len(difference)} new test failures. Common: {len(intersection)}")
    
    diff_list = "\n".join(f"- `{diff}`" for diff in difference)
    print(diff_list)
    

Will work on fixing the new ones where possible.
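
One caveat with the utility above: it compares whole summary lines, so a test that fails in both runs with different error messages is counted as new rather than common. A hedged variant that keys on the test ID alone is sketched below. The names `parse_failures` and `compare_runs` are hypothetical, and the regex assumes test IDs contain no spaces, which may not hold for some parametrized tests:

```python
import re

# Matches pytest short-summary lines such as
# "FAILED path/to/test.py::Class::test_name - SomeError: message".
FAILED_LINE = re.compile(r"^FAILED\s+(\S+)(?:\s+-\s+(.*))?$")


def parse_failures(lines):
    """Map each failed test ID to its (possibly truncated) error message."""
    failures = {}
    for line in lines:
        match = FAILED_LINE.match(line.strip())
        if match:
            failures[match.group(1)] = match.group(2) or ""
    return failures


def compare_runs(old_lines, new_lines):
    """Return (new-only test IDs, common test IDs, common IDs whose message changed)."""
    old, new = parse_failures(old_lines), parse_failures(new_lines)
    new_only = sorted(set(new) - set(old))
    common = sorted(set(new) & set(old))
    changed = [test_id for test_id in common if old[test_id] != new[test_id]]
    return new_only, common, changed


old_run = [
    "FAILED tests/pipelines/test_pipelines.py::DownloadTests::test_textual_inversion_unload"
    " - AttributeError: CLIPTokenizer has no attribute _added_tokens_decoder.",
]
new_run = [
    "FAILED tests/pipelines/test_pipelines.py::DownloadTests::test_textual_inversion_unload"
    " - AttributeError: CLIPTokenizer has no attribute _update_trie",
    "FAILED tests/pipelines/ltx/test_ltx.py::LTXPipelineFastTests::test_encode_prompt_works_in_isolation"
    " - AssertionError: False is not true",
]
new_only, common, changed = compare_runs(old_run, new_run)
print(f"{len(new_only)} new, {len(common)} common, {len(changed)} with a changed message")
```

The sample lines are made up from messages quoted in this thread purely to exercise the parser; they are not the contents of the actual report files. Counts produced this way can differ from the 93 / 64 split above, since those were computed on whole lines.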
