There have been quite a few issues and questions about how to use the Encoder-Decoder model, e.g. #4483 and #15479. The main reason is that the model docs are quite outdated, and we could use a nice how-to guide.
As a first step, the model docs should cover:
a) How to create a model: we should show how to use `from_encoder_decoder_pretrained(...)` and then how to save the model.
b) How to fine-tune the model: we should mention that it can then be fine-tuned just like any other encoder-decoder model (Bart, T5, ...).
c) A big warning that the config values have to be set correctly, along with how to set them, e.g. see #15479.
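To make points a) to c) concrete, here is a rough sketch of what the docs example could look like. It is only an illustration, not the final docs text. The helper names (`build_pretrained_model`, `build_tiny_model`, `set_special_token_ids`, `training_loss`, `save_and_reload`), the `bert-base-uncased` checkpoint, and the token ids are assumptions chosen for the example. The tiny randomly initialised model is only there so the sketch runs without downloading weights.

```python
import tempfile

import torch
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel


def build_pretrained_model(name="bert-base-uncased"):
    # a) Warm-start encoder and decoder from pretrained checkpoints.
    # The checkpoint name is an assumption; this call downloads weights.
    return EncoderDecoderModel.from_encoder_decoder_pretrained(name, name)


def build_tiny_model():
    # Offline stand-in for a): same class, tiny random weights.
    def tiny():
        return BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                          num_attention_heads=2, intermediate_size=64)

    # from_encoder_decoder_configs marks the second config as a decoder
    # with cross-attention.
    config = EncoderDecoderConfig.from_encoder_decoder_configs(tiny(), tiny())
    return EncoderDecoderModel(config=config)


def set_special_token_ids(model, decoder_start_token_id, pad_token_id):
    # c) These values are not set automatically. With a BERT tokenizer one
    # would typically pass tokenizer.cls_token_id and tokenizer.pad_token_id.
    model.config.decoder_start_token_id = decoder_start_token_id
    model.config.pad_token_id = pad_token_id
    return model


def training_loss(model, input_ids, labels):
    # b) As with Bart or T5, passing labels returns the LM loss; the decoder
    # inputs are built by shifting the labels to the right.
    return model(input_ids=input_ids, labels=labels).loss


def save_and_reload(model):
    # a) Save encoder, decoder and the joint config, then load them back.
    with tempfile.TemporaryDirectory() as path:
        model.save_pretrained(path)
        return EncoderDecoderModel.from_pretrained(path)


if __name__ == "__main__":
    # Hypothetical ids for the tiny model (1 plays the role of [CLS], 0 of [PAD]).
    model = set_special_token_ids(build_tiny_model(), decoder_start_token_id=1, pad_token_id=0)
    ids = torch.tensor([[1, 5, 6, 2]])
    loss = training_loss(model, ids, ids.clone())
    loss.backward()  # one fine-tuning step would follow with an optimizer
    reloaded = save_and_reload(model)
    print(type(reloaded).__name__, float(loss))
```

If `decoder_start_token_id` is left unset, the forward pass with `labels` raises an error when it tries to build the decoder inputs, which seems to be the kind of confusion the warning in c) is meant to prevent.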
This text should be specific to `EncoderDecoderModel` and kept short and concise.
In a second step, we should then write a how-to guide that goes into much more detail.
More than happy to help someone tackle this good first issue!