In the initialization of FluxAttention, we determine whether to create to_out based on the parameter pre_only.
However, during inference, we decide whether to call to_out based on whether encoder_hidden_states is provided.
This asymmetry can be somewhat confusing for beginners, although it does not actually cause any runtime errors.
Only FluxTransformerBlock sets pre_only to False, and likewise, only FluxTransformerBlock passes encoder_hidden_states during inference.
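Roughly, the pattern looks like this (a simplified sketch for illustration only, not the actual diffusers code; the class name, layer names, and attention math are stand-ins):

```python
import torch
import torch.nn as nn

class ToyFluxAttention(nn.Module):
    """Simplified stand-in for the pattern described above, not the real module."""

    def __init__(self, dim: int, pre_only: bool = False):
        super().__init__()
        self.pre_only = pre_only
        self.to_qkv = nn.Linear(dim, dim * 3)
        # Construction of to_out is gated on pre_only ...
        self.to_out = None if pre_only else nn.Linear(dim, dim)

    def forward(self, hidden_states, encoder_hidden_states=None):
        q, k, v = self.to_qkv(hidden_states).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        out = attn @ v
        # ... but the call is gated on whether encoder_hidden_states was passed,
        # which is the asymmetry this issue is about.
        if encoder_hidden_states is not None:
            out = self.to_out(out)
        return out
```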
Another issue is that context_pre_only appears to be an unused parameter.
I was wondering if it might be better to:

- Remove context_pre_only
- During inference, rely on self.pre_only to decide whether to_out should be called (see the sketch below)
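For the second point, the forward pass would gate on the same flag that controlled construction. A hypothetical version of the toy example above (again just a sketch, not a proposed diff against the real module):

```python
class ToyFluxAttentionProposed(ToyFluxAttention):
    def forward(self, hidden_states, encoder_hidden_states=None):
        q, k, v = self.to_qkv(hidden_states).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        out = attn @ v
        # Gate on self.pre_only so that __init__ and forward stay symmetric,
        # regardless of whether encoder_hidden_states is supplied.
        if not self.pre_only:
            out = self.to_out(out)
        return out
```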