distribution finetuning
Technical Report: How Distribution Fine-Tuning (DFT) Improves LLM Writing Quality
Abstract/TLDR: LLMs are notoriously formulaic writers, overusing certain tokens and phrases. I show that models trained with SFT fail to match the distribution of the training data by using Maximum… [+40738 chars]