@gmmkmtgk gmmkmtgk commented Oct 2, 2025

…scheduler_kwargs)

What does this PR do?

This PR fixes the bug where Union[dict, str] fields, such as --lr_scheduler_kwargs in TrainingArguments, could not be parsed correctly from the command line.

Root Cause:
The issue was introduced when the type of lr_scheduler_kwargs changed from Optional[Union[dict[str, Any], str]] to Union[dict[str, Any], str]. HfArgumentParser previously filtered out str from Union types when None was not present, which caused argparse to attempt parsing the argument as a dict directly, leading to errors when passing JSON strings via CLI.
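The failure mode can be reproduced with plain argparse, without transformers installed. The following is a minimal sketch, assuming the argument ends up registered with `type=dict`; `parse_kwargs` is a hypothetical helper introduced only for illustration and is not part of HfArgumentParser.

```python
import argparse
import json


def parse_kwargs(arg_type, argv):
    """Hypothetical helper: parse --lr_scheduler_kwargs using the given argparse type."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr_scheduler_kwargs", type=arg_type)
    return parser.parse_args(argv).lr_scheduler_kwargs


argv = ["--lr_scheduler_kwargs", '{"min_lr": 1e-6}']

# With type=dict, argparse calls dict('{"min_lr": 1e-6}'), which raises
# ValueError; argparse reports that as a usage error and exits.
try:
    parse_kwargs(dict, argv)
    dict_type_failed = False
except SystemExit:
    dict_type_failed = True

# With type=str, the raw JSON text survives parsing and can be decoded later.
raw = parse_kwargs(str, argv)
decoded = json.loads(raw)
```

In this sketch the `dict` variant exits with a usage error, while the `str` variant returns the JSON text unchanged for later decoding.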

Fix:

  • Modified src/transformers/hf_argparser.py to detect Union[dict, str] types and keep the type as str for argparse.
  • Conversion from JSON string → dict now happens later in TrainingArguments.__post_init__.
  • Maintains backward compatibility with Optional[Union[dict, str]] fields.
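The approach described in the list above could look roughly like the sketch below. It is not the code from src/transformers/hf_argparser.py: `argparse_type_for`, `SketchArguments`, and `parse_sketch` are hypothetical names, and the detection logic is an assumption about how such a check might be written using only the standard library.

```python
import argparse
import json
from dataclasses import dataclass, field, fields
from typing import Any, Optional, Union, get_args, get_origin


def argparse_type_for(annotation):
    """Hypothetical helper: pick the callable argparse applies to the raw CLI string."""
    if get_origin(annotation) is Union:
        # Drop NoneType so Optional[Union[dict, str]] is treated the same way.
        members = [a for a in get_args(annotation) if a is not type(None)]
        origins = {get_origin(m) or m for m in members}
        if origins == {dict, str}:
            return str  # keep the raw string; decode it after parsing
        if len(members) == 1:
            return get_origin(members[0]) or members[0]
    return annotation


@dataclass
class SketchArguments:
    """Hypothetical stand-in for a dataclass with a Union[dict, str] field."""

    lr_scheduler_kwargs: Union[dict[str, Any], str] = field(default_factory=dict)

    def __post_init__(self):
        # Deferred conversion: a JSON string from the CLI becomes a dict here.
        if isinstance(self.lr_scheduler_kwargs, str):
            self.lr_scheduler_kwargs = json.loads(self.lr_scheduler_kwargs)


def parse_sketch(argv):
    """Hypothetical helper: build a parser for SketchArguments and parse argv."""
    annotation = fields(SketchArguments)[0].type
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--lr_scheduler_kwargs", type=argparse_type_for(annotation), default="{}"
    )
    namespace = parser.parse_args(argv)
    return SketchArguments(lr_scheduler_kwargs=namespace.lr_scheduler_kwargs)


args = parse_sketch(["--lr_scheduler_kwargs", '{"min_lr": 1e-6}'])
```

Keeping argparse on `str` and decoding in `__post_init__` means the same field accepts a dict passed from Python code and a JSON string passed from the command line.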

Tests added:

  • test_17_union_dict_str_parsing(): verifies basic Union[dict, str] CLI parsing.
  • test_18_lr_scheduler_kwargs_parsing(): regression test specifically for the lr_scheduler_kwargs issue.

Fixes the behavior reported in #41296.


Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
