Conference Paper

Toward Autonomy: Metacognitive Learning for Enhanced AI Performance

Authors:

Abstract

Large Language Models (LLMs) lack robust metacognitive learning abilities and depend on human-provided algorithms and prompts for learning and output generation. Metacognition involves processes that monitor and enhance cognition. Learning how to learn, or metacognitive learning, is crucial for adapting and optimizing learning strategies over time. Although LLMs possess limited metacognitive abilities, they cannot autonomously refine or optimize these strategies. Humans possess innate mechanisms for metacognitive learning that enable at least two unique abilities: discerning which metacognitive strategies are best and automatizing learning strategies. These processes have been effectively modeled in the ACT-R cognitive architecture, offering insight into a path toward greater learning autonomy in AI. Incorporating human-like metacognitive learning abilities into AI could lead to more autonomous and versatile learning mechanisms, as well as improved problem-solving capabilities and performance across diverse tasks.
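
One way to picture "discerning which metacognitive strategies are best" is as reward-driven utility learning. The sketch below is an illustration under stated assumptions, not the model described in the paper: it uses a difference-learning update of the kind found in ACT-R's utility learning, U(n) = U(n-1) + alpha * (R(n) - U(n-1)), and substitutes Gaussian selection noise for simplicity. The strategy names, success probabilities, and parameter values are all hypothetical.

```python
import random


def update_utility(utility, reward, alpha=0.2):
    """Difference-learning update: move the utility a fraction alpha toward the reward."""
    return utility + alpha * (reward - utility)


def choose_strategy(utilities, noise_sd, rng):
    """Pick the strategy whose utility plus random noise is highest."""
    return max(utilities, key=lambda name: utilities[name] + rng.gauss(0.0, noise_sd))


def run(trials=500, seed=0):
    rng = random.Random(seed)
    # Hypothetical strategies and success probabilities (assumptions for illustration).
    success_prob = {"reread": 0.3, "self_test": 0.7}
    utilities = {name: 0.0 for name in success_prob}
    for _ in range(trials):
        name = choose_strategy(utilities, noise_sd=0.2, rng=rng)
        reward = 1.0 if rng.random() < success_prob[name] else 0.0
        # Only the strategy that was actually used gets its utility updated.
        utilities[name] = update_utility(utilities[name], reward)
    return utilities


if __name__ == "__main__":
    print(run())
```

Because each update is a weighted average of the old utility and a reward in [0, 1], the learned utilities stay in that range; the noise term keeps the learner occasionally sampling the strategy that currently looks worse.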


Preprint
Log-based insider threat detection (ITD) detects malicious user activities by auditing log entries. Recently, large language models (LLMs) with strong common sense knowledge have emerged in the domain of ITD. Nevertheless, diverse activity types and overlong log files pose a significant challenge for LLMs in directly discerning malicious activities among the many normal ones. Furthermore, the faithfulness hallucination problem of LLMs makes them harder to apply to ITD, as the generated conclusion may not align with user commands and activity context. In response to these challenges, we introduce Audit-LLM, a multi-agent log-based insider threat detection framework comprising three collaborative agents: (i) the Decomposer agent, breaking down the complex ITD task into manageable sub-tasks using Chain-of-Thought (CoT) reasoning; (ii) the Tool Builder agent, creating reusable tools for sub-tasks to overcome context length limitations in LLMs; and (iii) the Executor agent, generating the final detection conclusion by invoking constructed tools. To enhance conclusion accuracy, we propose a pair-wise Evidence-based Multi-agent Debate (EMAD) mechanism, where two independent Executors iteratively refine their conclusions through reasoning exchange to reach a consensus. Comprehensive experiments conducted on three publicly available ITD datasets (CERT r4.2, CERT r5.2, and PicoDomain) demonstrate the superiority of our method over existing baselines and show that the proposed EMAD significantly improves the faithfulness of explanations generated by LLMs.
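
The pair-wise debate described above can be pictured as a loop in which two executors exchange verdicts until their labels agree. This is a rough sketch under assumptions, not Audit-LLM's implementation: the `Verdict` type, the `debate` function, the round limit, and both stub executors are hypothetical stand-ins for LLM-backed agents.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    label: str     # e.g. "malicious" or "benign"
    evidence: str  # reasoning offered to the other executor


# An executor maps (task, peer's previous verdict or None) to its own verdict.
Executor = Callable[[str, Optional[Verdict]], Verdict]


def debate(task: str, first: Executor, second: Executor, max_rounds: int = 3) -> Optional[str]:
    """Exchange verdicts until both labels agree; return None if no consensus is reached."""
    a = first(task, None)
    b = second(task, None)
    for _ in range(max_rounds):
        if a.label == b.label:
            return a.label
        # Each executor revises its verdict after seeing the other's previous one.
        a, b = first(task, b), second(task, a)
    return a.label if a.label == b.label else None


# Hypothetical stub executors standing in for LLM calls.
def stubborn(task: str, peer: Optional[Verdict]) -> Verdict:
    return Verdict("malicious", "hypothetical evidence: unusual bulk file copy")


def persuadable(task: str, peer: Optional[Verdict]) -> Verdict:
    if peer is None:
        return Verdict("benign", "no rule matched")
    return Verdict(peer.label, "accepted peer evidence: " + peer.evidence)


if __name__ == "__main__":
    print(debate("audit one user-day of logs", stubborn, persuadable))
```

Returning `None` when the round limit is reached keeps "no consensus" distinct from either label, so a caller can decide how to handle unresolved cases.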
Conference Paper
Metacognition can improve with practice, yet the mechanisms underlying metacognitive skill learning remain unclear and lack a robust theoretical framework. We propose that metacognitive skill learning can be largely explained by the skill acquisition model advanced by Fitts (1964) and Anderson (2013). While this model has been successful in the domains of motor skill and cognitive skill, it has not yet been applied to metacognitive skill. This novel framework can help explain metacognitive skill learning and its cognitive underpinnings, and shed light on otherwise unexplainable empirical data.
Article
A theory is presented about how instruction and experience combine to produce human fluency in a complex skill. The theory depends critically on 4 aspects of the ACT-R architecture. The first is the timing of various modules, particularly motor timing, which results in behavior that closely matches human behavior. The second is the ability to interpret declarative representations of instruction so that they lead to action. The third aspect concerns how practice converts this declarative knowledge into a procedural form so that appropriate actions can be quickly executed. The fourth component, newly added to the architecture, is a Controller module that learns the setting of control variables for actions. The overall theory is implemented in a computational model that is capable of simulating human learning. Its predictions are confirmed in a first experiment involving 2 games derived from the experimental video game Space Fortress. The second experiment tests predictions from the Controller module about lack of transfer between video games. Across the 2 experiments a single model, with the same parameter settings, is shown to simulate human learning of 3 video games.
Article
The purpose of this article is to begin the process of engaging the international research community in developing what can be called a standard model of the mind, where the mind we have in mind here is human-like. The notion of a standard model has its roots in physics, where over more than a half-century the international community has developed and tested a standard model that combines much of what is known about particles. This model is assumed to be internally consistent, yet still have major gaps. Its function is to serve as a cumulative reference point for the field while also driving efforts to both extend and break it.
Book
This book uses the educational theory of metacognitive learning and its instructional implications to describe and illustrate how learners can become effective or self-directed learners. It first discusses three levels of general knowledge of the learning process through an overview of research studies, then describes how learners can develop along these levels and learn to plan their learning effectively. It also includes study and educational material centered on the learning and instruction of general knowledge of the learning process.
Article
The paper describes the ACT theory of learning. The theory is embodied as a computer simulation program that makes predictions about human learning of various cognitive skills such as language fluency, study skills for social science texts, problem-solving skills in mathematics, and computer programming skills. The learning takes place within the ACT theory of the performance of such skills. This theory involves a propositional network representation of general factual knowledge and a production system representation of procedural knowledge. Skill learning mainly involves addition and modification of the productions. There are five mechanisms by which this takes place: designation, strengthening, generalization, discrimination, and composition. Each of these five learning mechanisms is discussed in detail and related to available data in procedural learning.
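
Two of the five mechanisms, composition and strengthening, lend themselves to a small illustration. The sketch below is a simplified toy under assumptions, not the ACT simulation program: productions are modeled as sets of condition and addition strings, composition merges two productions that fire in sequence into one, and strengthening adds a fixed increment. The `Production` class, the additive increment, and the example rules are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Production:
    name: str
    conditions: frozenset  # facts that must hold for the rule to fire
    additions: frozenset   # facts the rule adds when it fires
    strength: float = 1.0


def compose(first: Production, second: Production) -> Production:
    """Merge two rules that fire in sequence into one rule with the same net effect."""
    # Conditions of the second rule that the first rule supplies need not be tested again.
    conditions = first.conditions | (second.conditions - first.additions)
    return Production(
        name=first.name + "+" + second.name,
        conditions=conditions,
        additions=first.additions | second.additions,
    )


def strengthen(rule: Production, increment: float = 1.0) -> Production:
    """Return a copy of the rule with its strength raised (assumed additive increment)."""
    return replace(rule, strength=rule.strength + increment)


# Hypothetical rules for column addition, used only as an example.
add_column = Production(
    "add-column",
    frozenset({"goal:add", "column:ones"}),
    frozenset({"sum:computed"}),
)
set_carry = Production(
    "set-carry",
    frozenset({"sum:computed", "sum>=10"}),
    frozenset({"carry:set"}),
)

if __name__ == "__main__":
    combined = compose(add_column, set_carry)
    print(sorted(combined.conditions), sorted(combined.additions))
```

The composed rule no longer tests `sum:computed`, because the first rule produces that fact itself; this is the sense in which composition shortens a sequence of steps into one.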