The Matrix: A Bayesian learning model for LLMs
Computer Science > Machine Learning
arXiv:2402.03175 (cs)
Beyond the Black Box: A Statistical Model for LLM Reasoning and Inference, by Siddhartha Dalal and Vishal Misra
Abstract: This paper introduces a novel Bayesian learning model to explain the behavior of Large Language Models (LLMs), focusing on their core optimization metric of next-token prediction. We develop a theoretical framework based on an ideal generative text model, represented by a multinomial transition probability matrix with a prior, and examine how LLMs approximate this matrix. Key contributions include: (i) a continuity theorem relating embeddings to multinomial distributions, (ii) a demonstration that LLM text generation aligns with Bayesian learning principles, (iii) an explanation for the emergence of in-context learning in larger models, and (iv) empirical validation using visualizations of next-token probabilities from an instrumented Llama model. Our findings provide new insights into how LLMs function, offering a statistical foundation for understanding their capabilities and limitations. This framework has implications for LLM design, training, and application, potentially guiding future developments in the field.
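To make the abstract's central idea concrete, here is a minimal sketch of Bayesian updating for a single row of a multinomial next-token matrix. It is an illustration only, not the authors' method: the abstract does not specify the form of the prior, so the conjugate Dirichlet prior used here is an assumption, and the function name `posterior_next_token`, the three-token vocabulary, and all numbers are hypothetical.

```python
import numpy as np


def posterior_next_token(prior_alpha, observed_counts):
    """Posterior mean of a multinomial next-token distribution.

    Assumes a Dirichlet prior (an assumption of this sketch, not a detail
    taken from the abstract). With Dirichlet parameters `prior_alpha` and
    observed next-token counts, the posterior is again Dirichlet with
    parameters alpha + counts, and its mean is the normalized sum.
    """
    alpha = np.asarray(prior_alpha, dtype=float)
    counts = np.asarray(observed_counts, dtype=float)
    posterior_alpha = alpha + counts
    return posterior_alpha / posterior_alpha.sum()


# Hypothetical three-token vocabulary; the prior favors token 0.
prior_alpha = [2.0, 1.0, 1.0]

# With no in-context evidence, the prediction equals the prior mean.
no_evidence = posterior_next_token(prior_alpha, [0, 0, 0])

# After token 1 follows this context six times, the prediction shifts toward it.
with_evidence = posterior_next_token(prior_alpha, [0, 6, 0])

print(no_evidence)
print(with_evidence)
```

In this toy setting, the prior plays the role of what was learned in training and the counts play the role of evidence supplied in the prompt, which is one simple way to picture the Bayesian reading of in-context learning described in the abstract.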
Submission history
From: Vishal Misra
[v1] Mon, 5 Feb 2024 16:42:10 UTC (305 KB)
[v2] Tue, 24 Sep 2024 13:30:25 UTC (2,800 KB)