Language Modeling - Science topic

Explore the latest publications in Language Modeling, and find Language Modeling experts.
Publications related to Language Modeling (10,000)
Sorted by most recent
Article
Full-text available
Text clustering is an important method for organising the increasing volume of digital content, aiding in the structuring and discovery of hidden patterns in uncategorised data. The effectiveness of text clustering largely depends on the selection of textual embeddings and clustering algorithms. This study argues that recent advancements in large l...
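The abstract above does not specify which embeddings or clustering algorithms the study used, but the pipeline it describes, grouping documents by the geometry of their embedding vectors, can be sketched with a minimal k-means over toy 2-D "embeddings" (pure Python; all names and data are illustrative):

```python
import math
import random

def kmeans(vectors, k, iters=20, seed=0):
    """Minimal k-means over embedding vectors (illustrative sketch)."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iters):
        # Assign each vector to its nearest centroid (Euclidean distance).
        clusters = [[] for _ in range(k)]
        for v in vectors:
            j = min(range(k), key=lambda i: math.dist(v, centroids[i]))
            clusters[j].append(v)
        # Recompute each centroid as the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return [min(range(k), key=lambda i: math.dist(v, centroids[i])) for v in vectors]

# Toy "document embeddings": two well-separated groups in 2-D.
docs = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels = kmeans(docs, k=2)
```

In practice the vectors would come from a sentence-embedding model rather than hand-written lists; the clustering step itself is unchanged.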
Research Proposal
Full-text available
Special Issue Information Dear Colleagues, The fields of Machine Learning (ML), Deep Learning (DL), and Artificial Intelligence (AI) have undergone remarkable expansions over the past few decades. This surge in growth can be largely attributed to significant advancements in computing power and the unprecedented availability of vast amounts of data...
Conference Paper
Full-text available
We evaluate Google's open-weight Gemma 2 language models for Aspect-Based Sentiment Analysis (ABSA) on Italian tourism texts. The study tests both 2-billion and 9-billion parameter variants of Gemma 2 on hotel reviews from the EVALITA 2018 ABSITA dataset, comparing different prompting strategies with and without detailed aspect descriptions.
Conference Paper
Full-text available
This intermediate-level tutorial, titled "Gen-RecSys," merges both industrial and academic perspectives on recent advances in Generative AI for recommender systems (beyond LLMs). It aims to highlight the transformative role of generative models in modern recommender systems, which have significantly impacted the AI field, particularly with the rise...
Research Proposal
Full-text available
Dear Colleagues, We are presently experiencing a profound utilization of Artificial Intelligence techniques across human society. At the core of this epochal shift stands the Deep Learning methodology, serving as a pivotal enabling technology. Deep neural networks thrive on the abundance of extensive datasets and accessible computing resources. De...
Conference Paper
Full-text available
The keys to accurately predicting future price trends and correlations between assets in a portfolio are not only learning from historical price data, but also considering the current market sentiment among investors. Yet a lot of news is published by different media every day, some of which may describe irrelevant or even contradictory inform...
Conference Paper
Full-text available
The integration of Large Language Models (LLMs) into the assessment processes of sustainable forest investment projects is a compelling prospect, given the limitations present in manual assessment. This paper examines how such an LLM-based assessment tool can be designed and whether such a tool can serve as a viable alternative to human experts in...
Article
Full-text available
Sentiment analysis of user-generated content on social media sites reveals important information about public attitudes toward emerging technologies. Researchers face challenges in understanding these impressions, ranging from cursory evaluations to in-depth analyses. Analyzing detailed, long-form reviews exacerbates the difficulty of achieving acc...
Article
Full-text available
In the context of Industry 4.0, ensuring the compatibility of digital twins (DTs) with existing software systems in the manufacturing sector presents a significant challenge. The Asset Administration Shell (AAS), conceptualized as the standardized DT for an asset, offers a powerful framework that connects the DT with the established software infras...
Preprint
Full-text available
Assistive technologies for people with visual impairments (PVI) have made significant advancements, particularly with the integration of artificial intelligence (AI) and real-time sensor technologies. However, current solutions often require PVI to switch between multiple apps and tools for tasks like image recognition, navigation, and obstacle det...
Conference Paper
Full-text available
Collaborative ideation is a key aspect of many innovation processes. However, a lack of proper support can hinder the process and limit the ability of participants to generate innovative ideas. Thus, we introduce AI-deation, a digital environment for collaborative ideation. At the heart of the system is an AI collaborator powered by Generative Arti...
Article
Full-text available
Large language models (LLMs) have exhibited great potential in fault diagnosis of heating, ventilation, and air conditioning systems. However, the fault diagnosis accuracy of LLMs is still unsatisfactory, owing to the lack of effective accuracy-enhancement methods for LLMs. To fill this gap, this study proposes an LLM fine-tuning method supe...
Article
Full-text available
Salloch and Eriksen (2024) present a compelling case for including patients as co-reasoners in medical decision-making involving artificial intelligence (AI). Drawing on O'Neill's neo-Kantian framework (1989), they argue that the "human in the loop" concept should extend beyond physicians to encompass patients as active participants in the reasoni...
Article
Full-text available
The integration of artificial intelligence (AI), particularly large language models (LLMs) like OpenAI’s ChatGPT, into clinical research could significantly enhance the informed consent process. This paper critically examines the ethical implications of employing LLMs to facilitate consent in clinical research. LLMs could offer considerable benefit...
Conference Paper
Full-text available
Extended generativity theory states that while generativity leads to more users, more users also affect the product boundaries of a platform. We seek to uncover this complex relationship by turning to large language model (LLM) platforms, such as ChatGPT, Gemini or GPT4. LLM platforms are unique because they draw from an unbounded supply of complements...
Conference Paper
Full-text available
The advancement of large language models (LLMs) has greatly facilitated math instruction, with the generated textual content serving as verbal responses to address student inquiries. However, in instructional settings, teachers often provide both verbal responses and board writing (BW) simultaneously to enhance students' knowledge construction. To...
Conference Paper
Full-text available
We present a patient-centric system integrating Large Language Models (LLMs) into medical applications, focusing on a diverse set of use cases. An initial use case for symptom reporting was explored using natural language, addressing the limitations of traditional questionnaires. This collected data can be analysed by healthcare professionals durin...
Conference Paper
Full-text available
Organizations constantly face changes arising from the environment and from within. The degree of environmental change has increased due to technological innovation and other factors, like the introduction of new regulations. These changes directly influence the business processes of organizations and often make transformations necessary. To carry...
Article
Full-text available
In large language models (LLMs), full-parameter fine-tuning is crucial for task-specific adaptation. Traditionally, this relies on deep learning training frameworks utilizing the back-propagation scheme. However, this scheme presents inherent issues, e.g. activation memory bottlenecks and backward locking, which limit the efficient computational re...
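The abstract is cut off before naming its training scheme, but one well-known family of back-propagation-free alternatives estimates gradients from forward passes alone, so no activations need to be stored and there is no backward pass to wait on. The following SPSA-style zeroth-order sketch illustrates that family on a toy loss; it is not claimed to be this paper's method:

```python
import random

def spsa_gradient(loss, params, eps=1e-3, seed=0):
    """Zeroth-order (SPSA-style) gradient estimate from two forward passes.
    No back-propagation, hence no activation memory or backward locking."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in params]            # random direction
    plus = loss([p + eps * ui for p, ui in zip(params, u)])
    minus = loss([p - eps * ui for p, ui in zip(params, u)])
    scale = (plus - minus) / (2 * eps)                   # directional derivative
    return [scale * ui for ui in u]                      # project back onto u

# Toy quadratic loss; the true gradient at [1, 2] is [2, 4].
loss = lambda p: p[0] ** 2 + p[1] ** 2
g = spsa_gradient(loss, [1.0, 2.0])
```

A single estimate is noisy, but it is unbiased: averaging estimates over many random directions recovers the true gradient, which is what makes the scheme usable as a drop-in for back-propagation at much lower memory cost.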
Article
Full-text available
The emergence of Large Language Models (LLMs) is currently creating a major paradigm shift in societies and businesses in the way digital technologies are used. While the disruptive effect is especially observable in the information and communication technology field, there is a clear lack of systematic studies focusing on the application and impac...
Conference Paper
Full-text available
We tested the ability of generative AI (GAI) to serve as a non-expert grader in the context of school-wide curriculum assessment. OpenAI's ChatGPT-4o Large Language Model was used to create diverse student personas. Synthetic artefacts based on exam questions from two Undergraduate courses on religion and philosophy were created. A custom GPT model...
Article
Full-text available
Degrees, unlike entities or events, refer to comparative qualities and are closely tied to gradable adjectives such as “tall.” Degree expressions have been explored in second language (L2) research, covering areas such as learnability, first language (L1) transfer, contrastive analysis, and acquisition difficulty. However, a computational approach...
Article
Full-text available
Large Language Models (LLMs) show promise in medical diagnosis, but their performance varies with prompting. Recent studies suggest that modifying prompts may enhance diagnostic capabilities. This study aimed to test whether a prompting approach that aligns with general clinical reasoning methodology—specifically, using a standardized template to f...
Article
Full-text available
LLMs show high performance in multiple fields and applications and could benefit healthcare for better patient management, prediction and decision support. However, the sensitivity and complexity of healthcare data make it challenging to use clinical data in these models. Here, we examine these issues with reference to the four domains of data pri...
Preprint
Full-text available
In this paper, we introduce the MediaSpin dataset, which aims to support the development of models that detect different forms of media bias in news headlines; it was developed through human-supervised and -validated Large Language Model (LLM) labeling of media bias. This corpus comprises 78,910 pairs of news headlines and annotations with explanati...
Article
Full-text available
This paper is a reflection on the role of generative AI chatbots in the referred web traffic that reaches academic journals. Scenarios on informational search behaviour within the academic environment are explored, regarding the influence of these new tools based on Large Language Models (LLM) and on the generation of direct responses resulting fro...
Preprint
Full-text available
We present MAPPE, a novel algorithm integrating a k-nearest neighbor (KNN) similarity network with co-occurrence matrix analysis to extract evolutionary insights from protein language model (PLM) embeddings. The KNN network captures diverse evolutionary relationships and events, while the co-occurrence matrix identifies directional evolutionary pat...
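MAPPE's full pipeline cannot be reconstructed from this snippet, but its starting point, a k-nearest-neighbor network built from embedding similarity, is easy to illustrate over toy vectors. A pure-Python sketch with hypothetical helper names, using cosine similarity:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def knn_graph(embeddings, k):
    """Directed edges from each vector to its k most cosine-similar neighbors."""
    edges = {}
    for i, v in enumerate(embeddings):
        neighbors = sorted(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: cosine(v, embeddings[j]),
            reverse=True,
        )
        edges[i] = neighbors[:k]
    return edges

# Toy "PLM embeddings": two similar pairs.
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_graph(vecs, k=1)
```

Real protein-language-model embeddings are high-dimensional, but the graph construction is identical; MAPPE's co-occurrence analysis would then operate on this network.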
Article
Full-text available
The advent of large language models (LLMs) like ChatGPT has the potential to revolutionize healthcare, offering opportunities for improved patient engagement, clinical decision support, and administrative efficiency. This pioneering study aimed to capture the early perspectives of Nigerian medical professionals on LLMs, particularly ChatGPT, immedi...
Article
Full-text available
Large Language Models (LLMs) enable a future in which certain types of legal documents may be generated automatically. This has a great potential to streamline legal processes, lower the cost of legal services, and dramatically increase access to justice. While many researchers focus on proposing and evaluating LLM-based applications supporting tas...
Preprint
Full-text available
As prompt engineering research rapidly evolves, evaluations beyond accuracy are crucial for developing cost-effective techniques. We present the Economical Prompting Index (EPI), a novel metric that combines accuracy scores with token consumption, adjusted by a user-specified cost concern level to reflect different resource constraints. Our study e...
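The snippet does not give the EPI formula, so the following is only a hypothetical score with the shape the abstract describes: accuracy discounted by token consumption, weighted by a user-specified cost-concern level. It is illustrative, not the published metric:

```python
def economical_prompting_index(accuracy, tokens, cost_concern, token_budget=1000):
    """Hypothetical cost-adjusted score: accuracy penalized by token use,
    scaled by a cost-concern level in [0, 1]. NOT the published EPI formula."""
    return accuracy - cost_concern * (tokens / token_budget)

# A concise prompt can beat a slightly more accurate but verbose one
# once the user cares about cost...
concise = economical_prompting_index(0.80, tokens=200, cost_concern=0.5)
verbose = economical_prompting_index(0.85, tokens=900, cost_concern=0.5)

# ...while at zero cost concern the ranking reduces to plain accuracy.
concise0 = economical_prompting_index(0.80, tokens=200, cost_concern=0.0)
verbose0 = economical_prompting_index(0.85, tokens=900, cost_concern=0.0)
```

The point of any such metric is exactly this rank reversal: which prompting technique "wins" depends on the resource constraints the user declares.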
Article
Full-text available
Large language models provide high-dimensional representations (embeddings) of word meaning, which allow quantifying changes in the geometry of the semantic space in mental disorders. A pattern of a more condensed (‘shrinking’) semantic space marked by an increase in mean semantic similarity between words has been recently documented in psychosis a...
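The "shrinking" pattern the abstract describes, an increase in mean semantic similarity between words, can be made concrete with a small sketch over toy embedding vectors (pure Python; the data are illustrative):

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def mean_pairwise_similarity(embeddings):
    """Mean cosine similarity over all word pairs; a higher value indicates
    a more condensed ('shrinking') semantic space."""
    pairs = list(combinations(embeddings, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)

# Toy word embeddings: a spread-out space vs. a condensed one.
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
condensed = [[1.0, 0.1], [1.0, 0.0], [0.9, 0.2]]
```

In the studies the abstract alludes to, the embeddings come from a large language model applied to patient speech, and the statistic is compared between clinical groups.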
Article
Full-text available
Value alignment is essential for ensuring that AI systems act in ways that are consistent with human values. Existing approaches, such as reinforcement learning with human feedback and constitutional AI, however, exhibit power asymmetries and lack transparency. These “authoritarian” approaches fail to adequately accommodate a broad array of human o...
Preprint
Full-text available
We introduce HackSynth, a novel Large Language Model (LLM)-based agent capable of autonomous penetration testing. HackSynth's dual-module architecture includes a Planner and a Summarizer, which enable it to generate commands and process feedback iteratively. To benchmark HackSynth, we propose two new Capture The Flag (CTF)-based benchmark sets util...
Preprint
Full-text available
Capability evaluations play a critical role in ensuring the safe deployment of frontier AI systems, but this role may be undermined by intentional underperformance or ``sandbagging.'' We present a novel model-agnostic method for detecting sandbagging behavior using noise injection. Our approach is founded on the observation that introducing Gaussia...
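The noise-injection idea can be illustrated with a deliberately simple toy: a "model" whose accuracy depends on one sabotage gate parameter. Parameter noise can only hurt an honestly performing model, but it can accidentally disable a sandbagger's suppression, so performance improving under noise is the detection signal. All names here are hypothetical, not the paper's implementation:

```python
import random

def accuracy(gate):
    """Toy 'model': scores 0.9 with the sabotage gate off, 0.3 with it on."""
    return 0.3 if gate > 0.5 else 0.9

def detect_sandbagging(gate, sigma=1.0, trials=200, seed=0):
    """Flag the model if Gaussian parameter noise ever *improves* its score."""
    rng = random.Random(seed)
    clean = accuracy(gate)
    noised = max(accuracy(gate + rng.gauss(0.0, sigma)) for _ in range(trials))
    return noised > clean

honest = detect_sandbagging(gate=0.0)    # noise cannot help -> not flagged
sandbag = detect_sandbagging(gate=1.0)   # noise knocks the gate off -> flagged
```

The appeal of the approach is that it needs no reference model or ground-truth capability estimate: only the model's own scores under increasing noise.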
Article
Full-text available
In recent years, large language models (LLMs) have made significant progress in natural language processing (NLP). These models not only perform well in a variety of language tasks but also show great potential in the medical field. This paper aims to explore the application of LLMs in clinical dialogues, analyzing their role in improving the effic...
Conference Paper
Full-text available
The advent of generative artificial intelligence, and in particular large language models, has opened up new possibilities for information processing in a multitude of domains. Nevertheless, it is essential to verify their output to ensure its validity within the specified context. This is due to their nature as probabilistic models of l...
Article
Full-text available
Can we turn AI black boxes into code? Although this mission sounds extremely challenging, we show that it is not entirely impossible by presenting a proof-of-concept method, MIPS, that can synthesize programs based on the automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algor...
Preprint
Full-text available
Large language models (LLMs) gained immense popularity due to their impressive capabilities in unstructured conversations. However, they underperform compared to previous approaches in task-oriented dialogue (TOD), wherein reasoning and accessing external information are crucial. Empowering LLMs with advanced prompting strategies such as reasoning...
Article
Full-text available
Dense Passage Retrieval (DPR) serves as a crucial initial step in improving the performance of the Retrieval Augmented Generation paradigm for large language models. While DPRs are challenging to train and typically involve fine-tuning (FT) on pre-trained models to enhance embedding similarity between queries and associated textual data, the utiliz...
Article
Full-text available
This study investigated the effectiveness of using ChatGPT, a large language model (LLM), to enhance critical thinking and argumentation skills among undergraduate students studying international relations in a developing nation context. A total of 95 participants were randomly assigned to an experimental group (n = 48) and a control group (n = 47)...
Preprint
Full-text available
To protect large-scale computing environments necessary to meet increasing computing demand, cloud providers have implemented security measures to monitor Operations and Maintenance (O&M) activities and therefore prevent data loss and service interruption. Command interception systems are used to intercept, assess, and block dangerous Command-line...
Preprint
Full-text available
Free associations have been extensively used in cognitive psychology and linguistics for studying how conceptual knowledge is organized. Recently, the potential of applying a similar approach for investigating the knowledge encoded in LLMs has emerged, specifically as a method for investigating LLM biases. However, the absence of large-scale LLM-ge...
Preprint
Full-text available
Large Language Models (LLMs) have been widely used in code completion, and researchers are focusing on scaling up LLMs to improve their accuracy. However, larger LLMs will increase the response time of code completion and decrease the developers’ productivity. In this paper, we propose a lightweight and effective LLM for code completion named aiXco...
Article
Full-text available
Purpose The purpose of this study is to evaluate the capability of large language models (LLMs) to perform data quality assessment on background data with respect to a study scenario. Methods LLMs generate coherent and contextually relevant text in response to prompts. Using a chat interface and prompting the model in a conversational style, OpenA...
Preprint
Full-text available
In this report, we introduce INTELLECT-1, the first 10 billion parameter language model collaboratively trained across the globe, demonstrating that large-scale model training is no longer confined to large corporations but can be achieved through a distributed, community-driven approach. INTELLECT-1 was trained on 1 trillion tokens using up to 14...
Preprint
Full-text available
Electronic healthcare records (EHR) contain a huge wealth of data that can support the prediction of clinical outcomes. EHR data is often stored and analysed using clinical codes (ICD10, SNOMED), however these can differ across registries and healthcare providers. Integrating data across systems involves mapping between different clinical ontologie...
Preprint
Full-text available
Our aim for the ML Contest for Chip Design with HLS 2024 was to predict the validity, running latency in the form of cycle counts, utilization rate of BRAM (util-BRAM), utilization rate of lookup tables (util-LUT), utilization rate of flip flops (util-FF), and the utilization rate of digital signal processors (util-DSP). We used Chain-of-thought tec...
Article
Full-text available
The emergence of artificial intelligence (AI) is transforming how humans live and interact, raising both excitement and concerns—particularly about the potential for AI consciousness. For example, Google engineer Blake Lemoine suggested that the AI chatbot LaMDA might become sentient. At that time, GPT-3 was one of the most powerful publicly availa...
Article
Full-text available
We explore the automated generation of open-ended questions from technical domain textbooks. These questions are more diverse than those typically examined in the field of question generation (QG) for reading comprehension. To facilitate this endeavor, we curate EngineeringQ, a prompt-based QG dataset that contains triples of (1) Context: a segment...
Article
Full-text available
Conversational skills, which are essential for effective social interactions and typically pose difficulties for individuals with autism spectrum disorder (ASD), include abilities such as initiating topics, engaging in back-and-forth dialog, and responding to conversational cues. Chatbots have been used in mental health fields, and the development...
Conference Paper
Full-text available
The large-scale adoption of large language models for the integration of generative artificial intelligence capabilities is occurring across several domains. This also applies to the domain of conceptual modeling, where a number of approaches are currently being investigated for the creation and interpretation of models utilizing this technology. H...
Preprint
Full-text available
Integrated Gradients is a well-known technique for explaining deep learning models. It calculates feature importance scores by employing a gradient based approach computing gradients of the model output with respect to input features and accumulating them along a linear path. While this works well for continuous features spaces, it may not be the m...
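For reference, Integrated Gradients attributes F(x) - F(x') to input features by accumulating gradients of F along the straight line from a baseline x' to x, scaling each by (x_i - x'_i). A minimal sketch with an analytic toy gradient, approximating the path integral by a midpoint Riemann sum:

```python
def integrated_gradients(grad_f, x, baseline, steps=50):
    """IG_i = (x_i - x'_i) * integral over alpha in [0,1] of
    dF/dx_i at baseline + alpha*(x - baseline), via a midpoint Riemann sum."""
    n = len(x)
    attributions = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps                         # midpoint of step k
        point = [baseline[i] + alpha * (x[i] - baseline[i]) for i in range(n)]
        g = grad_f(point)
        for i in range(n):
            attributions[i] += (x[i] - baseline[i]) * g[i] / steps
    return attributions

# Toy model F(x) = x0^2 + 3*x1, with analytic gradient [2*x0, 3].
grad = lambda p: [2 * p[0], 3.0]
attr = integrated_gradients(grad, x=[1.0, 2.0], baseline=[0.0, 0.0])
# Completeness axiom: attributions sum to F(x) - F(baseline) = 7.
```

The discreteness problem the abstract raises is visible here: the linear path passes through intermediate points that may be meaningless for categorical or tokenized features, which is what motivates alternative path or baseline choices.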
Preprint
Full-text available
The adoption of large language models (LLMs) in many applications, from customer service chat bots and software development assistants to more capable agentic systems necessitates research into how to secure these systems. Attacks like prompt injection and jailbreaking attempt to elicit responses and actions from these models that are not compliant...
Preprint
Full-text available
Clinical decision making (CDM) is a complex, dynamic process crucial to healthcare delivery, yet it remains a significant challenge for artificial intelligence systems. While Large Language Model (LLM)-based agents have been tested on general medical knowledge using licensing exams and knowledge question-answering tasks, their performance in the CD...
Article
Full-text available
Background Stomas present significant lifestyle and psychological challenges for patients, requiring comprehensive education and support. Current educational methods have limitations in offering relevant information to the patient, highlighting a potential role for artificial intelligence (AI). This study examined the utility of AI in enhancing sto...
Article
Full-text available
Natural climate solutions (NCS) play a critical role in climate change mitigation. NCS can generate win–win co-benefits for biodiversity and human well-being, but they can also involve trade-offs (co-impacts). However, the massive evidence base on NCS co-benefits and possible trade-offs is poorly understood. We employ large language models to asses...
Article
Full-text available
Large language models (LLMs) and multi-modal large language models (MLLMs) represent the cutting-edge in artificial intelligence. This review provides a comprehensive overview of their capabilities and potential impact on radiology. Unlike most existing literature reviews focusing solely on LLMs, this work examines both LLMs and MLLMs, highlighting...
Preprint
Full-text available
In-context generation is a key component of large language models' (LLMs) open-task generalization capability. By leveraging a few examples as context, LLMs can perform both in-domain and out-of-domain tasks. Recent advancements in auto-regressive vision-language models (VLMs) built upon LLMs have showcased impressive performance in text-to-image g...
Preprint
Full-text available
Code summarization facilitates program comprehension and software maintenance by converting code snippets into natural-language descriptions. Over the years, numerous methods have been developed for this task, but a key challenge remains: effectively evaluating the quality of generated summaries. While human evaluation is effective for assessing co...
Preprint
Full-text available
Multimodal LLMs (MLLMs) equip language models with visual capabilities by aligning vision encoders with language models. Existing methods to enhance the visual perception of MLLMs often involve designing more powerful vision encoders, which requires exploring a vast design space and re-aligning each potential encoder with the language model, result...
Article
Full-text available
Lexical Substitution (L.S.) replaces the target word or phrase with its synonym alternatives that are equivalent in meaning. Despite the richness of the Arabic language, Arabic L.S. received little attention as there are no benchmark evaluation datasets, even though researchers in many languages showed interest in this task. This paper presents an...
Preprint
Full-text available
While Large Vision Language Models (LVLMs) have become masterly capable in reasoning over human prompts and visual inputs, they are still prone to producing responses that contain misinformation. Identifying incorrect responses that are not grounded in evidence has become a crucial task in building trustworthy AI. Explainability methods such as gra...
Article
Full-text available
Background: Large language models (LLMs) like GPT-3.5-Turbo and GPT-4 show potential to transform medical diagnostics through their linguistic and analytical capabilities. This study evaluates their diagnostic proficiency using English and German medical examination datasets. Methods: We analyzed 452 English and 637 German medical examina...
Preprint
Full-text available
Recent advancements of generative LLMs (Large Language Models) have exhibited human-like language capabilities but have shown a lack of domain-specific understanding. Therefore, the research community has started the development of domain-specific LLMs for many domains. In this work we focus on discussing how to build mining domain-specific LLMs, a...
Preprint
Full-text available
Several power-law critical properties involving different statistics in natural languages -- reminiscent of scaling properties of physical systems at or near phase transitions -- have been documented for decades. The recent rise of large language models (LLMs) has added further evidence and excitement by providing intriguing similarities with notio...