We see that the approximations of the true posterior become better with more training steps and a lower cross-entropy loss, as we would expect for a powerful model, as outlined in Section 2. The losses are negative because we are in a regression setting, where the predictive density can exceed 1.
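The point about negative losses can be made concrete with a small sketch (not from the paper; the Gaussian predictive density and the value sigma = 0.1 are illustrative assumptions): when a regression model's predictive density at the target exceeds 1, the negative log-likelihood, i.e. the cross-entropy loss, drops below zero.

```python
import math

def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under a Gaussian N(mu, sigma^2).
    This is the per-sample cross-entropy loss in a regression setting."""
    return 0.5 * math.log(2 * math.pi * sigma**2) + (y - mu) ** 2 / (2 * sigma**2)

# A sharp predictive distribution (small sigma) assigns density > 1 near its
# mean: here the density at y = mu is 1 / (0.1 * sqrt(2*pi)) ~ 3.99, so the
# loss is negative.
loss = gaussian_nll(y=0.0, mu=0.0, sigma=0.1)
print(loss)  # ~ -1.38, below zero
```

In classification the loss is bounded below by 0, since probabilities never exceed 1; only densities can, which is why negative losses are unremarkable here.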


Source publication
Preprint
Traditionally, neural network training has been viewed primarily as an approximation of maximum likelihood estimation (MLE). This interpretation originated at a time when training for multiple epochs on small datasets was common and performance was data-bound, but it falls short in the era of large-scale single-epoch training ushered in by large s...

Context in source publication

Context 1
... this prior, we performed an ablation over different training times to see whether models tend to converge to the Bayes-optimal prediction as training goes on. In Figure 4 we see that the approximation of the posterior improves with more training steps, but appears to suffer diminishing returns. The losses are negative because we are in a regression setting, where the predictive density can exceed 1. ...

Similar publications

Preprint
A fundamental question in interpretability research is to what extent neural networks, particularly language models, implement reusable functions via subnetworks that can be composed to perform more complex tasks. Recent developments in mechanistic interpretability have made progress in identifying such subnetworks, often referred to as circuits, which...