I have two fine-tuned GPT-2 345M models trained with different learning rates. Is there a way to compare them and find out which one is better?
It’s hard to answer as it depends on the task. Can you provide more details?
The models were trained with learning rates of 1e-4 and 1e-6, respectively. Does that help?
You can use a metric such as BLEU to measure the quality of the generated samples. For sample diversity, you need to measure how closely the predicted probability distribution matches the true distribution.
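One common way to compare a predicted distribution against the true one is perplexity on held-out text: the exponential of the average negative log-likelihood the model assigns to the real tokens. Below is a minimal sketch, assuming you have already extracted per-token log-probabilities from each model on the same held-out text. The function name, the variable names, and the numbers are made up for illustration and are not results from the models in question.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity from the natural-log probabilities a model assigned
    to each token of a held-out text."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood: a cross-entropy estimate in nats per token.
    cross_entropy = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(cross_entropy)


# Hypothetical per-token log-probabilities from each fine-tuned model,
# computed on the SAME held-out text (illustrative values only).
log_probs_lr_1e4 = [math.log(p) for p in (0.25, 0.5, 0.125, 0.25)]
log_probs_lr_1e6 = [math.log(p) for p in (0.125, 0.25, 0.125, 0.125)]

ppl_a = perplexity(log_probs_lr_1e4)
ppl_b = perplexity(log_probs_lr_1e6)

# Lower perplexity means the model's predictions fit the held-out text better.
better = "lr=1e-4" if ppl_a < ppl_b else "lr=1e-6"
print(f"lr=1e-4: {ppl_a:.2f}, lr=1e-6: {ppl_b:.2f}, lower perplexity: {better}")
```

Perplexity only tells you how well each model fits the held-out data, so it is worth reading it alongside a sample-quality metric such as BLEU rather than on its own.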