Skip to content

refactor unet_1d tests#13898

Merged
sayakpaul merged 3 commits into
huggingface:mainfrom
akshan-main:tests-refactor-unet-1d
Jun 10, 2026
Merged

refactor unet_1d tests#13898
sayakpaul merged 3 commits into
huggingface:mainfrom
akshan-main:tests-refactor-unet-1d

Conversation

@akshan-main

Copy link
Copy Markdown
Contributor

What does this PR do?

Part of the ongoing modeling-test migration (following #13369 and #13153). Migrates the UNet1DModel test suites (the standard UNet and the RL value-function variant) to the mixin-based structure (Config + ModelTesterMixin).

UNet1DModel has no attention processors and doesn't support gradient checkpointing, so only ModelTesterMixin applies. The hub/pretrained integration tests are kept.

Before submitting

Who can review?

@sayakpaul @DN6

@github-actions github-actions Bot added tests size/L PR with diff > 200 LOC labels Jun 9, 2026
@sayakpaul

Copy link
Copy Markdown
Member

@askserge could you please review the PR?

(testing)

@sergereview sergereview Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤗 Serge says:

The migration to the mixin-based test structure is mostly correct, but there are issues with the output_shape property values.

Correctness

  • UNetRLModelTesterConfig.output_shape is set to (4, 14, 1) but the RL value-function model actually outputs (batch_size, 1) — i.e. (4, 1) for batch_size=4. The old test explicitly checked torch.Size((inputs_dict["sample"].shape[0], 1)). This is wrong regardless of whether the convention is per-sample or full-batch shape. It doesn't cause failures today because the base test_output has an operator-precedence bug (== … or self.output_shape is always truthy), and the custom test_output override doesn't use self.output_shape. But it will break when the base class is fixed, and it's misleading to readers.

  • UNet1DModelTesterConfig.output_shape is (4, 14, 16) (includes batch dim), while the repo convention in other migrated tests (e.g. AutoencoderKLTesterConfig) is to use per-sample shape without the batch dimension — (3, 32, 32) not (4, 3, 32, 32). Should be (14, 16) to match the convention.

Minor

  • In TestUNetRLModel.test_output (line 173), self.get_dummy_inputs() is called a second time just to read shape[0] (which is the hardcoded batch_size = 4). This allocates a full set of tensors unnecessarily.

model: claude-opus-4-6 · 28 LLM turns · 30 tool calls · 143.1s · 627103 in / 6681 out tokens

@property
def output_shape(self):
def output_shape(self) -> tuple:
return (4, 14, 16)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output_shape includes the batch dimension (4, 14, 16), but the repo convention (see e.g. AutoencoderKLTesterConfig) is to use the per-sample shape without the batch dim. Should be (14, 16) to stay consistent.

Suggested change
return (4, 14, 16)
return (14, 16)

@property
def output_shape(self):
def output_shape(self) -> tuple:
return (4, 14, 1)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output_shape is (4, 14, 1) but the RL value-function model outputs shape (batch_size, 1). The old test explicitly asserted torch.Size((batch_size, 1)). This value is wrong regardless of convention (per-sample it should be (1,), full-batch it should be (4, 1)).

The base test_output currently has an operator-precedence bug that makes the assertion vacuously true, and the custom test_output override doesn't reference self.output_shape, so this doesn't fail today — but it will break when the base class is fixed.

Suggested change
return (4, 14, 1)
return (1,)

with torch.no_grad():
output = model(**self.get_dummy_inputs()).sample

assert output.shape == (self.get_dummy_inputs()["sample"].shape[0], 1), "Input and output shapes do not match"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: self.get_dummy_inputs() is called a second time just to read shape[0], which is always the hardcoded batch_size = 4. This unnecessarily allocates a full set of tensors. Consider storing the inputs in a local variable:

Suggested change
assert output.shape == (self.get_dummy_inputs()["sample"].shape[0], 1), "Input and output shapes do not match"
inputs = self.get_dummy_inputs()
assert output.shape == (inputs["sample"].shape[0], 1), "Input and output shapes do not match"

(and pass inputs to the model call above as well)

@sayakpaul sayakpaul merged commit 0d56193 into huggingface:main Jun 10, 2026
12 of 13 checks passed
DN6 pushed a commit that referenced this pull request Jun 10, 2026
* refactor unet_1d tests

* use per-sample output_shape for unet_1d tests

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L PR with diff > 200 LOC tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants