1. Additional metrics to test
   * [high priority] BERTScore-sentence, MNLI, {RoBERTa, DeBERTa}, Entail - Contradict, top-k and top-p
   * [medium] BERTScore-original, DeBERTa -- to see how much language models impact it
   * <del>[low] BERTScore-sentence, cosine, DeBERTa -- again, to see the impact of language models</del>
2. How to print results into Google Sheets. Done: https://github.com/SigmaWe/EvalBase/pull/4 (Add pretty print, EvalBase#4)