Does large language models' understanding of time and space come from superficial statistics, or have they formed a coherent, genuine world model?
MIT researchers Wes Gurnee and Max Tegmark released a paper this month titled "Language Models Represent Space and Time." By analyzing how the Llama-2 family of models represents three spatial datasets (places in the world, the U.S., and New York City) and three temporal datasets (historical figures, artworks, and news headlines), they argue that large language models construct their own world models.
The authors also report individual "space neurons" and "time neurons" in these models that reliably encode spatial and temporal coordinates.
The paper is quite academic, and I don't fully understand it either: https://arxiv.org/abs/2310.02207
The image below shows the spatial and temporal world model constructed by Llama-2-70b.
The paper finds that these representations are unified across entity types and robust to prompt variations, and that individual neurons are highly sensitive to these features. When the activation dataset is projected onto the weights of the neurons most similar to the probe directions, those neurons indeed show high sensitivity to the true spatial or temporal position of an entity. In other words, single neurons in the model act as predictive feature probes in their own right.
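To make the "neuron as probe" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the activations are synthetic, `time_direction` stands in for a learned probe direction, and neuron 7 is deliberately planted to align with it. It is not the authors' code. It only shows the mechanics of picking the neuron whose weights are most similar to a probe direction and checking how well the projection onto that neuron tracks the true year.

```python
import numpy as np

def cosine_similarity_rows(weights, direction):
    """Cosine similarity between each row of `weights` and `direction`."""
    row_norms = np.linalg.norm(weights, axis=1)
    return weights @ direction / (row_norms * np.linalg.norm(direction))

rng = np.random.default_rng(0)
n_entities, d_model, n_neurons = 1000, 32, 200

# Hypothetical ground truth: one year per entity.
years = rng.uniform(1000, 2000, size=n_entities)
years_z = (years - years.mean()) / years.std()

# Stand-in for a learned probe direction in activation space (assumption).
time_direction = rng.normal(size=d_model)
time_direction /= np.linalg.norm(time_direction)

# Synthetic activations: the year is written along time_direction, plus noise.
acts = np.outer(years_z, time_direction) + 0.3 * rng.normal(size=(n_entities, d_model))

# Synthetic neuron weights; neuron 7 is planted to align with the direction.
neuron_weights = rng.normal(size=(n_neurons, d_model))
neuron_weights[7] = time_direction + 0.05 * rng.normal(size=d_model)

# Pick the neuron whose weights are most similar to the probe direction.
sims = cosine_similarity_rows(neuron_weights, time_direction)
best = int(np.argmax(np.abs(sims)))

# Project the activation dataset onto that neuron's weights and compare
# the projection with the true years.
projection = acts @ neuron_weights[best]
corr = np.corrcoef(projection, years)[0, 1]
print(f"best neuron: {best}, correlation with year: {corr:.3f}")
```

On a real model, `acts` would be activations collected at entity tokens and `neuron_weights` would be the input or output weights of MLP neurons. Here both are made up.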
With sufficient model and data scale, next-token prediction alone appears to be enough to learn a textual map of the world.
By training linear regression probes on the model's internal activations, the authors found that the model learns linear spatial and temporal representations. These become more accurate as the model scales up and are built up gradually over the earlier layers of the model. Nonlinear probes do not perform better than linear probes, which suggests that the model uses linear representations. These representations are also relatively stable and not significantly affected by changes in the prompt.
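For readers who have not seen probing before, the setup can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's pipeline: the "activations" are synthetic, the two-dimensional `coords` array stands in for (latitude, longitude), and `fit_ridge_probe` and `r2_score` are hypothetical helper names. The probe is a closed-form ridge regression from activations to coordinates, scored by R² on held-out entities.

```python
import numpy as np

def fit_ridge_probe(acts, targets, alpha=1.0):
    """Closed-form ridge regression from activations to targets.

    Returns (W, b) such that acts @ W + b approximates targets.
    """
    x_mean = acts.mean(axis=0)
    y_mean = targets.mean(axis=0)
    X = acts - x_mean
    Y = targets - y_mean
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)
    b = y_mean - x_mean @ W
    return W, b

def r2_score(y_true, y_pred):
    """Coefficient of determination, pooled over all target columns."""
    ss_res = ((y_true - y_pred) ** 2).sum()
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n_entities, d_model = 2000, 64

# Hypothetical targets: two coordinates per entity, standing in for lat/lon.
coords = rng.uniform(-1.0, 1.0, size=(n_entities, 2))

# Synthetic activations: coordinates embedded linearly, plus noise.
mixing = rng.normal(size=(2, d_model))
acts = coords @ mixing + 0.1 * rng.normal(size=(n_entities, d_model))

# Fit on one set of entities, evaluate on held-out entities.
train, test = slice(0, 1500), slice(1500, None)
W, b = fit_ridge_probe(acts[train], coords[train], alpha=1.0)
probe_r2 = r2_score(coords[test], acts[test] @ W + b)
print(f"held-out R^2 of the linear probe: {probe_r2:.3f}")
```

Because the synthetic activations are linear in the coordinates by construction, a linear probe recovers them almost perfectly. The interesting empirical question in the paper is whether real Llama-2 activations behave the same way, and whether a nonlinear probe would do any better.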
For comparison, place cells and grid cells, which encode spatial position, are among the most extensively studied cells in the brain. This points to a possible comparative cognitive science of humans and AI: comparative psychology has traditionally compared chimpanzees with humans, but in the future it may compare the human brain with AI.