The evaluation of cognitive capacities in large language models (LLMs) is a topic of both theoretical and practical importance in the field of artificial intelligence (AI). As LLMs, such as OpenAI's GPT series, become more sophisticated, the question of whether these models possess cognitive competencies similar to those of humans becomes increasingly relevant. However, assessing these competencies is not straightforward, primarily due to inherent biases that influence our understanding and interpretation of LLM capabilities.

Raphaël Millière and Charles Rathkopf, in their paper "Anthropocentric Bias and the Possibility of Artificial Cognition," argue that the evaluation of LLMs is plagued by anthropocentric biases. These biases stem from the tendency to assess LLMs using human-centric standards and frameworks, leading to skewed interpretations of what these models can and cannot do.
Understanding Cognitive Competence and Performance
To fully grasp the arguments presented in the paper, it is essential first to understand the distinction between competence and performance—a concept introduced by Noam Chomsky in 1965. In the context of cognitive science, competence refers to the internal knowledge or capacity a system possesses to perform a task, while performance is the observable behavior or output when that system is engaged in the task. This distinction is crucial because a system may have the competence to perform a task but fail to exhibit that competence due to various factors affecting its performance.
For example, a human might be competent in solving complex mathematical equations (competence) but might fail to do so correctly under stress or time constraints (performance). This divergence between competence and performance is a well-acknowledged phenomenon in cognitive science, but when it comes to LLMs, the distinction becomes blurred due to anthropocentric biases.
Anthropocentric Bias: Definition and Types
Anthropocentric bias refers to the tendency to evaluate non-human systems, such as LLMs, through a human lens—applying human cognitive standards and frameworks to assess the capabilities of these systems. Millière and Rathkopf identify two primary forms of anthropocentric bias in their paper:
Type-I Anthropocentrism: This bias arises when failures in an LLM’s performance are immediately interpreted as evidence of a lack of competence, without considering that external factors might have impeded the model's performance. This perspective fails to account for auxiliary factors that could affect the model's output, thereby underestimating its true capabilities.
Type-II Anthropocentrism: This bias occurs when an LLM achieves a task using a method different from that used by humans, leading to the assumption that the model's method is less valid or less competent. Essentially, this bias suggests that cognitive competence must resemble human cognitive processes to be considered valid, disregarding the potential for alternative, non-human methods of problem-solving that may be equally or more effective.
The Performance/Competence Distinction in LLMs
The application of the competence-performance distinction to LLMs is fraught with challenges due to the inherent differences between human cognition and machine learning processes. LLMs, such as GPT, operate based on patterns learned from vast datasets, employing statistical correlations rather than understanding in the human sense.
When evaluating LLMs, researchers often fall into the trap of assuming that poor performance on a given task indicates a lack of competence. However, as Millière and Rathkopf point out, this is not necessarily the case. Various auxiliary factors—such as the design of the task, the context in which the LLM operates, or the computational resources available—can influence performance without reflecting the model's true capabilities.
For instance, an LLM might fail to solve a complex analogy task not because it lacks the underlying competence, but because the task requires intermediate computational steps that exceed the model's current capacity or because the task was designed with human cognitive processes in mind, which the LLM does not share.
Type-I Anthropocentrism: Misinterpreting Performance Failures
Type-I Anthropocentrism is a bias that arises when external factors influencing LLM performance are overlooked. This bias leads to the incorrect conclusion that an LLM’s poor performance on a task necessarily indicates a lack of competence.
To understand this bias, it is essential to recognize the role of auxiliary factors—external influences that can affect a model's performance without directly reflecting its underlying competence. The paper categorizes these auxiliary factors into three main types:
Task Demands: Some tasks designed to measure LLM competence may inadvertently include additional demands that are unrelated to the competence being tested. For example, a task requiring an LLM to make explicit grammaticality judgments might impose additional cognitive load that obscures the model's true linguistic competence. Research by Hu & Frank (2024) highlights this by comparing different methods of evaluating LLMs’ sensitivity to syntax, finding that direct probability estimation tasks provide a more accurate measure of competence than tasks requiring explicit judgments.
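To make the contrast concrete, here is a minimal sketch of the direct-probability style of evaluation, using the Hugging Face transformers library with GPT-2 as a stand-in model; the minimal-pair sentences and the model choice are illustrative assumptions on our part, not materials from Hu & Frank (2024).

```python
# Minimal sketch: compare the probability a model assigns to a grammatical vs.
# an ungrammatical minimal pair, instead of asking it for an explicit judgment.
# Model choice (GPT-2) and the example sentences are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Sum of token log-probabilities the model assigns to a sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Predict token i from the tokens before it; the first token has no context.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    return log_probs[torch.arange(targets.numel()), targets].sum().item()

grammatical = "The keys to the cabinet are on the table."
ungrammatical = "The keys to the cabinet is on the table."

print(sentence_logprob(grammatical) > sentence_logprob(ungrammatical))
# A model sensitive to subject-verb agreement should prefer the grammatical form.
```

The point of this style of probe is that the model is never asked to verbalize a grammaticality judgment; its linguistic competence is read off the probabilities it assigns, which removes the extra task demand of producing a meta-linguistic answer.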
Computational Limitations: LLMs may have latent competence for certain tasks but be limited by computational constraints, such as the number of tokens they can generate or the depth of computation available before an answer must be produced. For instance, Pfau et al. (2024) show that transformer language models can solve certain synthetic reasoning tasks with near-perfect accuracy when allowed to emit meaningless filler tokens, which give the model additional computational steps. When these filler tokens are withheld, performance drops, not because the competence is lacking but because the available computation is.
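The rough sketch below illustrates the shape of that manipulation. The query_model function is a hypothetical placeholder rather than any real API, and the filler-token benefit Pfau et al. report required models trained to exploit such tokens, so this shows the experimental contrast rather than reproducing their result.

```python
# Sketch of the filler-token manipulation described by Pfau et al. (2024).
# `query_model` is a hypothetical stand-in for a completion API; the reported
# effect required models trained to use filler tokens, so this only illustrates
# the two conditions being compared.

def query_model(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its completion."""
    raise NotImplementedError("wire this up to your model of choice")

problem = "Do any three of these numbers sum to zero? 4, -7, 3, 9, -2"

# Condition 1: the model must answer immediately, with no extra forward passes
# spent on intermediate tokens.
direct_prompt = f"{problem}\nAnswer with Yes or No only:"

# Condition 2: the model may emit a fixed run of meaningless filler tokens ('.')
# before answering, giving it extra sequential computation.
filler = " ." * 50
filler_prompt = f"{problem}\n{filler}\nAnswer with Yes or No only:"

# Comparing accuracy across many problems in the two conditions isolates the
# contribution of the extra computation, separating competence from the token
# budget the task happens to allow.
```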
Mechanistic Interference: LLMs might develop mechanisms for solving tasks whose operation is disrupted by other, competing processes within the model. For example, in modular addition tasks, a Transformer model might form a generalizable algorithm to solve the problem, but its performance could still be impaired by interference from previously learned, less general algorithms. This interference, as shown in research by Nanda et al. (2023), can lead to performance failures even when the underlying competence is intact.
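For readers who want a concrete anchor, the modular addition task at the center of this line of work can be generated in a few lines; the modulus and train/test split below are illustrative choices rather than the paper's exact configuration.

```python
# Sketch of the modular addition task from the grokking literature
# (Nanda et al., 2023): a model must learn (a + b) mod p from examples.
# The modulus and split fraction are illustrative, not the paper's exact setup.
import random

p = 113                      # prime modulus
pairs = [(a, b) for a in range(p) for b in range(p)]
random.seed(0)
random.shuffle(pairs)

split = int(0.3 * len(pairs))            # train on a fraction, test on the rest
train = [(a, b, (a + b) % p) for a, b in pairs[:split]]
test = [(a, b, (a + b) % p) for a, b in pairs[split:]]

# A small Transformer trained on `train` can first memorize these examples and
# only later "grok" a general algorithm that also solves `test`; the memorizing
# and generalizing mechanisms coexist, which is how interference between them
# can depress performance even after the general solution has formed.
```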
These examples illustrate how Type-I Anthropocentrism can lead to the misinterpretation of LLM capabilities by failing to account for the complex factors that influence performance.
Type-II Anthropocentrism: Disregarding Non-Human Cognitive Strategies
Type-II Anthropocentrism involves the assumption that if an LLM solves a problem using a method different from that of humans, its solution is inherently less valid or competent. This bias is grounded in the belief that genuine cognitive competence must resemble human cognition, which is a restrictive and potentially misleading assumption.
Millière and Rathkopf argue that this bias obscures the real capabilities of LLMs by discounting the validity of alternative computational strategies that these models might employ. For instance, an LLM might develop a robust and generalizable method for solving mathematical problems that differs from the human approach. If this method is effective, it should be considered competent, regardless of whether it aligns with human cognitive processes.
An example of this bias in action can be seen in the debate over LLMs’ ability to perform analogical reasoning tasks. While some researchers, such as Webb et al. (2023), have reported that LLMs can match or even surpass human performance on novel analogical tasks, others, like Lewis & Mitchell (2024), have argued that the models lack genuine competence because their performance degrades on counterfactual variants of the tasks and their strategies differ from those used by humans. This debate underscores the importance of recognizing and valuing non-human cognitive strategies rather than dismissing them due to their divergence from human methods.
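A minimal sketch of how such a counterfactual variant can be constructed, roughly in the spirit of Lewis & Mitchell (2024), is shown below; the permutation, example strings, and rule are our own illustrative choices.

```python
# Sketch of a counterfactual letter-string analogy: the same abstract rule
# ("increment the last letter") is posed over the standard alphabet and over a
# permuted one. The permutation and example strings are illustrative assumptions.
import random
import string

alphabet = list(string.ascii_lowercase)
random.seed(42)
permuted = alphabet.copy()
random.shuffle(permuted)

def successor(letter: str, order: list[str]) -> str:
    """Return the next letter in the given alphabet ordering."""
    return order[(order.index(letter) + 1) % len(order)]

# Standard problem: "abc" -> "abd"; what does "ijk" map to?
standard_answer = "ij" + successor("k", alphabet)          # "ijl"

# Counterfactual problem: the same rule, but under the permuted ordering.
i, j, k = permuted[8], permuted[9], permuted[10]
counterfactual_answer = i + j + successor(k, permuted)

print(f"Permuted alphabet: {''.join(permuted)}")
print(f"Standard: ijk -> {standard_answer}")
print(f"Counterfactual: {i}{j}{k} -> {counterfactual_answer}")

# If a model's analogical competence is genuinely abstract, its accuracy should
# not collapse merely because the alphabet ordering is unfamiliar.
```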
Addressing Anthropocentric Bias
To overcome these biases, Millière and Rathkopf advocate an empirically driven, iterative approach to evaluating LLMs (a skeletal sketch of such an evaluation loop follows the list below). This approach involves:
Designing Behavioral Experiments: Behavioral experiments should be carefully designed to minimize anthropocentric bias by focusing on tasks that are relevant to the model’s architecture and capabilities rather than those that merely replicate human cognitive processes.
Mechanistic Studies: These studies should further evaluate the internal workings of LLMs to understand the mechanisms that drive their performance. By investigating the specific circuits and processes within the model, researchers can identify the factors that contribute to success or failure on a given task.
Iterative Refinement: The evaluation of LLMs should be an iterative process, where initial findings are used to refine the tasks and methods used in subsequent studies. This process allows for the gradual development of a more accurate understanding of LLM capabilities, free from the constraints of anthropocentric bias.
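As a rough illustration of how these three strands might fit together in practice, the skeleton below probes the same items under several task formats and flags format-dependent outcomes for redesign; the score_item hook and format names are hypothetical placeholders, and nothing here is drawn from the paper itself.

```python
# Skeletal sketch of an iterative, bias-aware evaluation loop: the same items
# are probed under several task formats, and items where formats disagree are
# flagged for redesign rather than read as evidence about competence.
# `score_item` is a hypothetical hook into whatever model and scorer is used.
from typing import Callable

def score_item(item: dict, task_format: str) -> bool:
    """Placeholder: evaluate one item under one task format; return pass/fail."""
    raise NotImplementedError("plug in your model, prompt template, and scorer")

def evaluate(items: list[dict], formats: list[str],
             scorer: Callable[[dict, str], bool]) -> list[dict]:
    """Return items whose outcome depends on the task format used."""
    flagged = []
    for item in items:
        results = {fmt: scorer(item, fmt) for fmt in formats}
        if len(set(results.values())) > 1:        # formats disagree
            flagged.append({"item": item, "results": results})
    return flagged

# Usage sketch: disagreement between, say, an explicit-judgment format and a
# direct-probability format is a cue to refine the task (a Type-I concern),
# not to conclude that the competence is absent.
# flagged = evaluate(items, ["explicit_judgment", "log_probability"], score_item)
```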
Broader Implications for AI and Cognitive Science
The insights provided by Millière and Rathkopf have significant implications for the field of AI and cognitive science. By highlighting the limitations of current methodologies rooted in human-centered thinking, they call for a reevaluation of how we assess machine intelligence.
If researchers and practitioners adopt the recommendations in this paper, it could lead to the development of new methodologies that better capture the unique capabilities of LLMs. This, in turn, could pave the way for breakthroughs in AI research and applications, as we begin to recognize and leverage the strengths of these models without being constrained by anthropocentric assumptions.
Conclusion
"Anthropocentric Bias and the Possibility of Artificial Cognition" is a pivotal paper that challenges the current paradigms used to evaluate the cognitive capacities of LLMs. By identifying and addressing the biases that skew our understanding of these models, Millière and Rathkopf provide a roadmap for more accurate and objective assessments of artificial cognition.
This paper not only contributes to the theoretical discourse on AI and cognitive science but also has practical implications for the development and application of AI technologies. As we continue to explore the potential of LLMs, it is crucial to move beyond anthropocentric biases and embrace a more nuanced understanding of what these models can achieve. By doing so, we can unlock new possibilities for AI and its role in society.
This blog post summarizes the paper and situates it in its broader context, with the aim of making Millière and Rathkopf's insights accessible to a wide audience, from AI researchers to industry practitioners. Readers are nonetheless encouraged to read the paper itself, along with the related research cited below.
References:
Millière, R., & Rathkopf, C. (2024). Anthropocentric Bias and the Possibility of Artificial Cognition. Proceedings of the 41st International Conference on Machine Learning.
Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT Press.
Firestone, C. (2020). Performance vs. Competence in Human–Machine Comparisons. Proceedings of the National Academy of Sciences, 117(43), 26562–26571.
Hu, J., & Frank, M. C. (2024). Auxiliary Task Demands Mask the Capabilities of Smaller Language Models.
Pfau, J., Merrill, W., & Bowman, S. R. (2024). Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models.
Webb, T., Holyoak, K. J., & Lu, H. (2023). Emergent Analogical Reasoning in Large Language Models. Nature Human Behaviour.
Lewis, M., & Mitchell, M. (2024). Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in Large Language Models.
Nanda, N., Chan, L., Lieberum, T., Smith, J., & Steinhardt, J. (2023). Progress Measures for Grokking via Mechanistic Interpretability. In Proceedings of the Eleventh International Conference on Learning Representations.