Conference Paper

Multi-Agent Reasoning with Large Language Models for Effective Corporate Planning

... Multi-agent systems have a long history in artificial intelligence [62,89]. Now there is substantial interest in multi-agent LLM systems [16,38,39,45,54,64,84,87]. Our system incorporates aspects of these systems such as debate [40] and the idea of role-based communication [38,54,64,100]. ...
... Just as in human deliberation, it is helpful to have some summary of what transpired. In many multi-agent systems, one Agent aggregates the communications of others [16,39,87,100]. The motivation for adding auto-moderators (a feature where Moderators come up with their own moderation instructions based on the task) is based on the paradigm of "auto-prompting" in DSPy [45]. ...
Preprint
Full-text available
Recent debates raised concerns that language models may favor certain viewpoints. But what if the solution is not to aim for a 'view from nowhere' but rather to leverage different viewpoints? We introduce Plurals, a system and Python library for pluralistic AI deliberation. Plurals consists of Agents (LLMs, optionally with personas) which deliberate within customizable Structures, with Moderators overseeing deliberation. Plurals is a generator of simulated social ensembles. Plurals integrates with government datasets to create nationally representative personas, includes deliberation templates inspired by democratic deliberation theory, and allows users to customize both information-sharing structures and deliberation behavior within Structures. Six case studies demonstrate fidelity to theoretical constructs and efficacy. Three randomized experiments show simulated focus groups produced output resonant with an online sample of the relevant audiences (chosen over zero-shot generation in 75% of trials). Plurals is both a paradigm and a concrete system for pluralistic AI. The Plurals library is available at https://github.com/josh-ashkinaze/plurals and will be continually updated.
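As a rough illustrative sketch only (this is not the Plurals library's actual API; the call_llm stub, the deliberate helper, and the personas below are assumptions made for illustration), the Agents-plus-Moderator pattern the abstract describes might look like this in Python:

def call_llm(system: str, user: str) -> str:
    # Stand-in for any chat-completion client; returns a canned string so the
    # sketch runs end to end without network access.
    return f"[simulated reply under persona '{system[:40]}...' to: {user[:60]}...]"

def deliberate(task: str, personas: list[str], moderator_instructions: str) -> str:
    # Each Agent answers the task from its own persona (an ensemble-style Structure).
    responses = [call_llm(f"You are {p}. Answer from this perspective.", task)
                 for p in personas]
    # A Moderator sees the full transcript and synthesizes it per its instructions.
    transcript = "\n\n".join(f"Agent {i + 1}: {r}" for i, r in enumerate(responses))
    return call_llm(moderator_instructions, f"Task: {task}\n\nResponses:\n{transcript}")

summary = deliberate(
    task="What concerns would you have about a new workplace AI policy?",
    personas=["a rural small-business owner", "a union representative", "a recent graduate"],
    moderator_instructions="Summarize points of agreement and disagreement neutrally.",
)
print(summary)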
... SocraSynth's applications span various fields, including geopolitical analysis [28], medical diagnostics [34], sales strategy [138], and Wikipedia article enhancement [30]. These applications demonstrate expanded perspectives and enhanced argumentation quality, along with significant reductions in biases and hallucinations, thereby demonstrating SocraSynth's efficacy in fostering balanced and well-reasoned discourse. ...
... SocraSynth, our previous work [25] presented in Chapter 6, addresses LLM limitations through structured multi-agent dialogues. By leveraging both adversarial and collaborative interactions between LLMs, SocraSynth demonstrates quantifiable improvements across various domains, including healthcare [34], news analysis [32], geopolitical analysis [28], corporate planning [138], and emotional behavior modeling [23,27]. These results highlight SocraSynth's potential for advancing towards AGI's generalized problem-solving capabilities. ...
Book
Full-text available
This booklet, "Unlocking the Wisdom of LLM Collaborative Intelligence," serves as an introduction to the full-length work, "The Path to Artificial General Intelligence." Through ten carefully crafted aphorisms, it distills the core insights and guiding principles that underpin the broader exploration of AI’s future through LLM Collaborative Intelligence (LCI). The author presents this framework as a promising pathway toward achieving artificial general intelligence (AGI). As the global AI community races toward AGI, key figures like Yann LeCun argue that LLMs alone are insufficient for reaching AGI. LeCun, a pioneer in deep learning, suggests that text-based models are inherently limited due to their lack of persistent memory, physical interaction, and planning abilities. He insists that true intelligence requires direct interaction with the physical world—such as through robots or sensors—arguing that language-only systems cannot achieve the grounding needed for general intelligence. In contrast, this book proposes an alternative view: LCI demonstrates that the exchange of information, contextual adaptation, and collaborative dialogue can overcome many of the limitations LeCun highlights. The key is not to rely on isolated LLMs but to leverage their synergy through structured communication, expanding their capacities beyond any individual model.
... SocraSynth's adversarial component promotes the exploration of diverse perspectives, while its collaborative component fosters rigorous reasoning to reach well-reasoned conclusions. This synergy has yielded measurable gains beyond healthcare and bias mitigation, extending to geopolitical analysis (Chang, 2023b), corporate planning (Tsao, 2023), investment banking (Chang, 2024b), and emotional behavior modeling (Chang, 2024a). These results demonstrate SocraSynth's effectiveness in mitigating LLM limitations and achieving substantial performance improvements across various applications, highlighting its potential for advancing towards AGI's generalized problem-solving capabilities. ...
... Our empirical validation demonstrates EVINCE's effectiveness in improving prediction accuracy across various domains, notably achieving a 5% improvement in medical diagnosis tasks. The framework has also shown promise in identifying biases in news articles (Chang, 2024c), showcasing its potential for broader applications in fields such as geopolitical analysis (Chang, 2023b), corporate planning (Tsao, 2023), and emotional behavior modeling (Chang, 2024a). ...
Article
Full-text available
This paper introduces EVINCE (Entropy and Variation IN Conditional Exchanges), a dialogue framework advancing Artificial General Intelligence (AGI) by enhancing versatility, adaptivity, and reasoning in large language models (LLMs). Leveraging adversarial debate and a novel dual entropy theory, EVINCE improves prediction accuracy, robustness, and stability in LLMs by integrating statistical modeling, information theory, and machine learning to balance diverse perspective exploration with strong prior exploitation. The framework's effectiveness is demonstrated through consistent convergence of information-theoretic metrics, particularly improved mutual information, fostering productive LLM collaboration. We apply EVINCE to healthcare, showing improved disease diagnosis, and discuss its broader implications for decision-making across domains. This work provides theoretical foundations and empirical validation for EVINCE, paving the way for advancements in LLM collaboration and AGI development.
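The abstract reports convergence of information-theoretic metrics across dialogue rounds but does not give the formulas here, so the following is only a hedged sketch of how per-round entropy and a symmetric divergence between two agents' predictive distributions (say, over candidate diagnoses) could be tracked; the metric choices and example numbers are assumptions, not EVINCE's actual formulation:

import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import entropy

def round_metrics(p_agent_a, p_agent_b):
    # Entropy of each agent's distribution plus the Jensen-Shannon divergence between them.
    p = np.asarray(p_agent_a, dtype=float); p /= p.sum()
    q = np.asarray(p_agent_b, dtype=float); q /= q.sum()
    return {
        "H(A)": entropy(p, base=2),                    # agent A's uncertainty (bits)
        "H(B)": entropy(q, base=2),                    # agent B's uncertainty (bits)
        "JSD(A,B)": jensenshannon(p, q, base=2) ** 2,  # disagreement between the agents
    }

# Three hypothetical dialogue rounds over four candidate answers; falling entropy
# and divergence would indicate the agents are converging on a shared answer.
rounds = [([0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]),
          ([0.5, 0.3, 0.1, 0.1], [0.3, 0.4, 0.2, 0.1]),
          ([0.7, 0.2, 0.05, 0.05], [0.6, 0.3, 0.05, 0.05])]
for i, (pa, pb) in enumerate(rounds, 1):
    print(f"round {i}: {round_metrics(pa, pb)}")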
... EVINCE and its predecessor have proven effective across diverse domains, including healthcare (Chang et al., 2023), business planning (Tsao, 2023), and geopolitical analysis (Chang, 2023b). In healthcare, for example, GPT-4 and Gemini LLMs have been successfully employed to address misdiagnosis. ...
Preprint
Full-text available
Biases and errors in human-labeled data present significant challenges for machine learning, especially in supervised learning reliant on potentially flawed ground truth data. These flaws, including diagnostic errors and societal biases, risk being propagated and amplified through models trained using maximum likelihood estimation. We present the Reflective LLM Dialogue Framework (RLDF), which leverages structured adversarial dialogues between multiple instances of a single LLM or different LLMs to uncover diverse perspectives and correct inconsistencies. By conditioning LLMs to adopt opposing stances, RLDF enables systematic bias detection through conditional statistics, information theory, and divergence metrics. Experiments show RLDF successfully identifies potential biases in public content while exposing limitations in human-labeled data. Our framework supports measurable progress tracking and explainable remediation actions, offering a scalable approach for improving content neutrality through transparent, multi-perspective analysis.
... By leveraging layered memory processing and consistent information exchange, this framework demonstrates augmented adaptability to historical trades and real-time market cues, significantly enhancing automated trading outcomes. Aside from trading tasks, SocraPlan [277] leverages multi-agent reasoning with LLMs for effective corporate planning. This framework conducts comprehensive market research, customer profiling, product usage analysis, and sales strategy formulation. ...
Preprint
Full-text available
Recent advances in large language models (LLMs) have unlocked novel opportunities for machine learning applications in the financial domain. These models have demonstrated remarkable capabilities in understanding context, processing vast amounts of data, and generating human-preferred content. In this survey, we explore the application of LLMs to various financial tasks, focusing on their potential to transform traditional practices and drive innovation. We provide a discussion of the progress and advantages of LLMs in financial contexts, analyzing their advanced technologies as well as prospective capabilities in contextual understanding, transfer learning flexibility, complex emotion detection, etc. We then structure this survey by categorizing the existing literature into key application areas, including linguistic tasks, sentiment analysis, financial time series, financial reasoning, agent-based modeling, and other applications. For each application area, we delve into specific methodologies, such as textual analysis, knowledge-based analysis, forecasting, data augmentation, planning, decision support, and simulations. Furthermore, a comprehensive collection of datasets, model assets, and useful code associated with mainstream applications is presented as a resource for researchers and practitioners. Finally, we outline the challenges and opportunities for future research, particularly emphasizing a number of distinctive aspects in this field. We hope our work can help facilitate the adoption and further development of LLMs in the financial sector.
... SocraSynth has been successfully applied to several application domains, including healthcare [8], sales planning [59], and mitigating context biases [6]. ...
Conference Paper
Full-text available
This paper introduces a three-branch checks-and-balances framework for ethical alignment of Large Language Models (LLMs). Inspired by governmental systems, the framework implements three independent yet interacting components: LLMs as the executive branch for knowledge generation, DIKE (named after the goddess of justice) as the legislative branch establishing ethical guardrails, and ERIS (the goddess of discord) as the judicial branch for contextual interpretation. The DIKE-ERIS duality, through its adversarial interaction, enables adaptation to diverse cultural contexts while maintaining consistent ethical principles. This architecture addresses fundamental limitations of reinforcement learning with human feedback (RLHF) by providing interpretable, adaptable, and culturally-aware ethical reasoning. Through self-supervised learning and adversarial testing, our framework demonstrates how emotional modeling can guide linguistic behaviors toward ethical outcomes while preserving the independence of knowledge generation, ethical oversight, and contextual interpretation.
... For an automated evaluation, we employed the CRIT algorithm [7], which utilizes the Socratic method to assess the quality of reasoning. The CRIT algorithm has been previously applied in numerous studies, e.g., [6,8,9,35], where it has proven to be an effective tool for evaluating the reasoning quality of arguments. CRIT employed GPT-4 and Bard to conduct two evaluations. ...
Conference Paper
Full-text available
SocraPlan introduces a sophisticated methodology that leverages the capabilities of multiple Large Language Models (LLMs) for strategic sales planning in today's fast-paced sales environment. This method focuses on tailoring sales playbooks to the specific needs and situations of each customer, harnessing the power of Generative AI (GAI). Its primary objectives are to improve customer satisfaction by deeply understanding their unique requirements, refine sales strategies through targeted market analysis, and enhance the efficiency of the sales process. SocraPlan distinguishes itself with a collaborative and debate-driven framework that engages multiple LLMs, enabling a level of analysis, adversarial reasoning, and strategy development beyond traditional AI-based approaches that focus solely on data collection. As a result, SocraPlan stands out as a groundbreaking tool in AI-driven sales strategies, offering bespoke, impactful solutions for intricate sales planning challenges and supporting more effective deal conclusions.
... SocraSynth has been successfully applied to several application domains, including healthcare [8], sales planning [60], and mitigating context biases [6]. ...
Conference Paper
This research addresses the dual objectives of improving the quality and reducing bias in Wikipedia and news articles. Quality is evaluated in terms of breadth, depth, accuracy, and neutrality, reflecting both the soundness of the content and the authority of references. We conceptualize bias as any tilt or slant in the information presented. Our methodology employs multiple Large Language Models (LLMs) in a novel way to appraise and refine articles from different standpoints. One LLM acts as an advocate for the article's current state, promoting its strengths and integrity. Concurrently, other LLMs scrutinize and challenge the article, applying defined metrics for quality and impartiality. This dialectical approach culminates in a synthesized enhancement that consolidates diverse insights, thus advancing the article's quality by diminishing bias. Our empirical findings substantiate the effectiveness of this technique in concurrently advancing the neutrality and caliber of content.
Conference Paper
Full-text available
This study introduces SocraHealth, an innovative method using Large Language Models (LLMs) for medical diagnostics. By engaging LLM-based agents in structured debates, SocraHealth not only refines diagnoses but also corrects historical record inaccuracies, utilizing patient data effectively. The case study, featuring GPT-4 and Bard across two experiments, showcases this approach's success in producing logical, hallucination-free debates. Demonstrating a significant advancement over traditional diagnostic techniques, SocraHealth highlights the transformative power of LLMs in healthcare, especially in enhancing diagnostic accuracy and rectifying past diagnostic errors.
Conference Paper
Full-text available
This study explores the architectural advancements of large language models (LLMs), with a particular focus on the GPT-4 model. We begin with a thorough analysis of GPT-4's distinctive features, including its polydisciplinary and polymodal data representation, the balanced approach in its algorithmic training, and the synergistic blend of human-driven insights with data-centric learning processes. Building upon these insights, we introduce SocraSynth, a reasoning layer thoughtfully crafted to augment knowledge discovery and bolster analytical reasoning across an ensemble of LLMs. SocraSynth is designed to facilitate a generative process through multi-agent analytical discussions, followed by the evaluation of the resultant arguments for their "reasonableness." This approach significantly enhances interdisciplinary information discovery and complex reasoning, strategically addressing major challenges faced by LLMs, such as the production of contextually inaccurate responses (hallucinations) and entrenched statistical biases. Implementing SocraSynth across various application domains marks a significant advancement in overcoming the limitations of current LLMs, paving the way for more reliable and sophisticated AI-driven analytical tools.
Conference Paper
Full-text available
Human knowledge, vast as it is, often falls short in grasping intricate interdisciplinary domains fully. In contrast, foundation models like GPT-4, endowed with extensive multidisciplinary knowledge, can potentially bridge this gap. Significantly, we leverage the vast expanses of GPT-4's knowledge, banking on its ability to frame questions that might elude human intuition, thus paving the way for the emergence of fresh insights and potentially novel knowledge. In this study, we convened a unique committee comprising a moderator (the authors) and two GPT-4 agents. The dialogue is ignited by the ancient narrative of Adam and Eve, setting the stage for a rich exchange between the GPT-4 agents. This conversation derives from the age-old tale, as the agents delve into three intertwined domains: the significance of myths in ecological interpretation, the intricate ethical and philosophical quandaries surrounding AI, and the enigmatic realm of the human brain as complemented by technology. This dialogue not only unveils captivating insights but also underscores the indispensable value of interdisciplinary exchanges. Foundation models, as demonstrated, can catalyze such dialogues, equipping us to traverse expansive knowledge landscapes and explore domains previously beyond human comprehension.
Article
Full-text available
The CoCoMo model proposes a computational solution to the challenge of incorporating ethical and emotional intelligence considerations into AI systems, with the aim of creating AI agents that combine knowledge with compassion. To achieve this goal, CoCoMo prioritizes fairness, beneficence, non-maleficence, empathy, adaptability, transparency, and critical and exploratory thinking abilities. The model employs consciousness modeling, reinforcement learning, and prompt template formulation to support these desired traits. By incorporating ethical and emotional intelligence considerations, a generative AI model can potentially lead to improved fairness, reduced toxicity, and increased reliability.
Conference Paper
Full-text available
This paper presents a systematic approach to using the Socratic method in developing prompt templates that effectively interact with large language models, including GPT-3. Various methods are examined, and those that yield precise answers and justifications while fostering creativity and imagination to enhance creative writing are identified. Techniques such as definition, elenchus, dialectic, maieutics, generalization, and counterfactual reasoning are discussed for their application in engineering prompt templates and their connections to inductive, deductive, and abductive reasoning. Through examples, the effectiveness of these dialogue and reasoning methods is demonstrated. An interesting observation is made that when the task's goal and user intent are conveyed to GPT-3 via ChatGPT before the start of a dialogue, the large language model seems to connect to the external context expressed in the intent and perform more effectively. Index Terms: large language model, natural language processing, prompting, the Socratic method.