Evaluating different training methods on QA datasets

We compare a variety of training configurations for fine-tuning a T5 model on two QA datasets (BioQA and GSM) to understand how each configuration performs.

Charvi Gupta, Rushabh Musthyala

Dec 2, 2022 Large Language Model, GPU Training, Deep Learning
