Categories
Machine Learning
Forking Paths: How Language Models Encode the Roads Not Taken
5. Predicting Outcome Distributions from Hidden States The Hypothesis The text up to token t contains semantic information about the eventual answer. But do…
Read More