Daniele Quercia’s research while affiliated with Polytechnic University of Turin and other places


Publications (263)


SI: Dream content discovery from social media using natural language processing
  • Data
  • File available

May 2025

·

1 Read

·

[...]

·

Daniele Quercia

Supplementary Information to "Dream Content Discovery from Social Media Using Natural Language Processing"


The framework of our study. Stage 1 involves data curation. Stage 2 consists of three steps that form the core of our proposed unsupervised mixed-method approach for dream content discovery: Step 2.1 is topic extraction using the unsupervised BERTopic method, Step 2.2 is grouping similar topics into themes, and Step 2.3 is a semi-manual filtering and adjustment step that preserves only relevant dream content. Stages 3-5 demonstrate various applications enabled by our methodology: Stage 3 uncovers the topic taxonomy in a particular dream report collection, Stage 4 finds topics and themes that are specific to a given subset of dream reports in that collection, and Stage 5 tracks topic and theme trends through time.
Reddit dream reports statistics: (a) number of reports of different types through time, and (b) number of reports per dreamer
The taxonomy of dream themes that we automatically uncovered: a map of dreams. Nodes represent dream themes (i.e., groups of topics); node size corresponds to the frequency of the theme in our reports; node color represents the theme's centrality in the network (the color scale goes from light yellow to green to dark blue, and the darker the color, the more central the theme); finally, edge thickness corresponds to how often the themes' topics co-occur in the dream reports
Odds-ratio (OR) analysis of themes in the four dream types. The OR scores (y-axis) are equal to the probability of the theme (x-axis) appearing in dreams labeled with a given dream type (e.g., nightmares), divided by the same probability calculated on dreams that are not labeled with that type. Scores higher/lower than one indicate that dreams of the given type feature the theme more/less often than other dreams
Temporal evolution of themes. The value of each cell is proportional to the average importance of a theme t across all the dream reports posted in a given month m (see Formula (4)). The importance of a theme in a dream is defined as the fraction of sentences in the dream report that belong to that theme (see Formula (2)). Values are standardized within each theme (see Formula (5)) so that positive (negative) values encode importance that is higher (lower) than the global average for that theme. Last, we smoothed values over a window of 5 months to reduce noise (see Formula (6)). During period 1), the themes of Sight and vision, Outdoor locations, Movement and action, Mental reflections, Life events, Space, and Work plummeted, while in period 2) the themes of People and relationships and Feelings also dropped. During period 3), we see an increase in themes such as Supernatural entities, Human body (especially teeth and blood), Religious and spiritual, Indoor locations, and Violence and death. In period 4), the last two themes (Violence and death, and Other) peaked around the time of the start of the war in Ukraine
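The steps named in the temporal-evolution caption (per-report importance, monthly averaging, within-theme standardization, smoothing) can be sketched as follows. Formulas (2) and (4)-(6) are not reproduced on this page, so the details here are assumptions: standardization is taken to be a z-score, smoothing a centered moving average that shrinks at the edges, and all function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def theme_importance(sentence_themes, theme):
    """Fraction of a report's sentences assigned to `theme`.

    `sentence_themes` holds one theme label per sentence of the report.
    """
    if not sentence_themes:
        return 0.0
    return sum(label == theme for label in sentence_themes) / len(sentence_themes)


def monthly_importance(months, theme):
    """Average importance of `theme` over the reports of each month.

    `months` is a list of months; each month is a non-empty list of reports.
    """
    return [mean(theme_importance(report, theme) for report in reports)
            for reports in months]


def standardize(values):
    """Z-score within one theme (assumed form of the standardization)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def smooth(values, window=5):
    """Centered moving average; the window is truncated at the series edges."""
    half = window // 2
    return [mean(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


# Toy, made-up data: three months of reports, each report a list of
# per-sentence theme labels.
months = [
    [["Work", "Feelings"], ["Work", "Work"]],
    [["Feelings", "Feelings"]],
    [["Work", "Feelings", "Feelings", "Feelings"]],
]
raw = monthly_importance(months, "Work")
z = standardize(raw)
trend = smooth(z, window=5)
print(raw)
```

Positive entries in `z` mark months in which the theme was more prominent than its own long-run average, which is how the caption says the cell values should be read.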


Dream content discovery from social media using natural language processing

May 2025

·

14 Reads

EPJ Data Science

Dreaming is a fundamental but not fully understood part of human experience. Traditional dream content analysis practices, while popular and aided by over 130 unique scales and rating systems, have limitations. Often based on retrospective surveys or lab studies, and sometimes on in-home dream reports collected over some days, they struggle to be applied on a large scale or to show the importance of, and connections between, different dream themes. To overcome these issues, we conducted a data-driven mixed-method analysis that identifies topics in free-form dream reports through natural language processing. We applied this analysis to 44,213 dream reports from Reddit’s r/Dreams subreddit, where we uncovered 217 topics, grouped into 22 larger themes: the most extensive collection of dream topics to date. We validated our topics by comparing them to the widely used Hall and van de Castle scale. Going beyond traditional scales, our method can find unique patterns in different dream types (like nightmares or recurring dreams), understand topic importance and connections (like finding a greater predominance of indoor location settings in Reddit dreams than previous work generally stipulated), and observe changes in collective dream experiences over time and around major events (like the COVID-19 pandemic and the recent Russo-Ukrainian war). We envision that the applications of our method will provide valuable insights into the complex nature of dreaming and its interplay with our waking experiences.
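The comparison across dream types mentioned in the abstract is scored, according to the odds-ratio caption in the Supplementary Information entry, as the probability of a theme appearing in dreams of a given type divided by the same probability in all other dreams. The sketch below follows that wording literally (a ratio of probabilities rather than the textbook odds formula); the function name, the data layout, and the toy reports are hypothetical and not taken from the paper.

```python
import math


def theme_ratio(reports, theme, dream_type):
    """P(theme | dream_type) divided by P(theme | not dream_type).

    `reports` is a list of (themes, types) pairs, each a set of labels.
    """
    with_type = [themes for themes, types in reports if dream_type in types]
    without_type = [themes for themes, types in reports if dream_type not in types]
    if not with_type or not without_type:
        raise ValueError("need reports both inside and outside the dream type")

    p_with = sum(theme in themes for themes in with_type) / len(with_type)
    p_without = sum(theme in themes for themes in without_type) / len(without_type)
    if p_without == 0:
        # The theme never appears outside the type: the ratio is unbounded
        # (or undefined when it never appears at all).
        return math.inf if p_with > 0 else math.nan
    return p_with / p_without


# Toy, made-up reports: (themes found in the report, dream-type labels).
reports = [
    ({"Violence and death"}, {"nightmare"}),
    ({"Violence and death", "Feelings"}, {"nightmare"}),
    ({"Violence and death"}, {"lucid"}),
    ({"Feelings"}, {"recurring"}),
]
score = theme_ratio(reports, "Violence and death", "nightmare")
print(score)  # 2.0
```

A score above one means the theme shows up more often in that dream type than elsewhere; a score below one means it shows up less often.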


The Experience of Running: Recommending Routes Using Sensory Mapping in Urban Environments

May 2025

·

3 Reads

Depending on the route, runners may experience frustration, freedom, or fulfilment. However, finding routes that are conducive to the psychological experience of running remains an unresolved task in the literature. In a mixed-method study, we interviewed 7 runners to identify themes contributing to running experience, and quantitatively examined these themes in an online survey with 387 runners. Using Principal Component Analysis on the survey responses, we developed a short experience sampling questionnaire that captures the three most important dimensions of running experience: performance & achievement, environment, and mind & social connectedness. Using path preferences obtained from the online survey, we clustered them into two types of routes: scenic (associated with nature and greenery) and urban (characterized by the presence of people); and developed a routing engine for path recommendations. We discuss challenges faced in developing the routing engine, and provide guidelines to integrate it into mobile and wearable running apps.


PACMHCI, V9, N2, April 2025 CSCW Editorial

May 2025

·

1 Read

Proceedings of the ACM on Human-Computer Interaction

We are again thrilled to be able to present the Computer-Supported Cooperative Work and Social Computing (CSCW) community with an issue of the Proceedings of the ACM on Human-Computer Interaction, containing very interesting and relevant scholarship from its members. This issue includes 211 papers, of which 192 were accepted from the January 2024 cycle, and 19 were accepted from the July 2024 cycle. It reflects great efforts and contributions from external reviewers, Associate Chairs and Editors, who together have conducted a rigorous review process to select contributions of the highest quality advancing the CSCW field. As Track Chairs, we are grateful for the community's collective efforts to continue shaping and sharing CSCW's tradition of high-quality scholarship across the years.







RiskRAG: A Data-Driven Solution for Improved AI Model Risk Reporting

April 2025

·

8 Reads

Risk reporting is essential for documenting AI models, yet only 14% of model cards mention risks, and 96% of those copy content from a small set of cards, leading to a lack of actionable insights. Existing proposals for improving model cards do not resolve these issues. To address this, we introduce RiskRAG, a Retrieval Augmented Generation based risk reporting solution guided by five design requirements we identified from the literature and from co-design with 16 developers: identifying diverse model-specific risks, clearly presenting and prioritizing them, contextualizing for real-world uses, and offering actionable mitigation strategies. Drawing from 450K model cards and 600 real-world incidents, RiskRAG pre-populates contextualized risk reports. A preliminary study with 50 developers showed that they preferred RiskRAG over standard model cards, as it better met all the design requirements. A final study with 38 developers, 40 designers, and 37 media professionals showed that RiskRAG improved the way they selected an AI model for a specific application, encouraging more careful and deliberative decision-making. The RiskRAG project page is accessible at: https://social-dynamics.net/ai-risks/card.


Citations (49)


... Potential agent failures significantly shape user perception, trust, and adoption [38]. Unpredictable agent behavior often elicits skepticism and deep concerns about losing control, heightened by societal anxieties about AI safety [32,52]. Consequently, users often prefer an "AI as augmentation" model: agents assisting under human oversight rather than replacing human judgment [16,76,77]. ...

Reference:

Characterizing Unintended Consequences in Human-GUI Agent Collaboration for Web Browsing
The Hall of AI Fears and Hopes: Comparing the Views of AI Influencers and those of Members of the U.S. Public Through an Interactive Platform
  • Citing Conference Paper
  • April 2025

... They are then presented with a set of human-written principles that capture culture-specific harms so that they can engage in a multi-turn process of critiquing and revising originally harmful generations to harmless generations (or vice versa), to create culture-specific preference pairs for alignment training. Enabling constitutional AI for multilingual and multicultural alignment data generation requires close collaboration among linguists, cultural anthropologists and AI researchers to co-create three key components: (1) culturally-informed constitutional principles that reflect diverse value systems and ethical frameworks across different societies [Kirk et al., 2024; Pistilli et al., 2025]; (2) sufficiently capable multilingual LLMs that can both understand these principles and generate high-quality content in target languages [Qin et al., 2024]; and (3) evaluation protocols involving native speakers and cultural experts to validate both the constitutional principles and the resulting synthetic data [Kyrychenko et al., 2025]. This direction offers a pathway toward scalable, culturally grounded alignment practices that make LLM safety more inclusive and globally relevant. ...

C3AI: Crafting and Evaluating Constitutions for Constitutional AI
  • Citing Conference Paper
  • April 2025

... For example, an analysis over 70,000 patent citations to HCI publications revealed that only a small fraction of academic work directly influences patents [11]. Similarly, an analysis of Responsible AI research found that while approximately 18.7% of AI research papers transitioned into code repositories, only 7.2% ended up in patents [34]. These findings demonstrate that HCI research often struggles to transition from theory to practice. ...

The Impact of Responsible AI Research on Innovation and Development
  • Citing Article
  • October 2024

Proceedings of the AAAI/ACM Conference on AI Ethics and Society

... The need for reporting AI risks both at the level of models and specific uses is partly driven by standards like the NIST AI Risk Management Framework [51] and regulations like the EU AI Act [14], which mandate risk documentation based on the particular use and context [26,31]. Consequently, various AI impact assessment reports [1,6,65] and cards [25] have been proposed to help AI developers prepare the required documentation, particularly for high-risk systems. ...

Co-designing an AI Impact Assessment Report Template with AI Practitioners and AI Compliance Experts
  • Citing Article
  • October 2024

Proceedings of the AAAI/ACM Conference on AI Ethics and Society

... Credo AI [13] provides a set of Policy Packs with associated online questionnaires to identify areas of non-compliance with one or more AI regulations. Herdel et al. [21] discuss the use of large language models to assist in envisioning the uses and risks of AI. Wang et al. [58] describe and test a system that helps users identify potential AI risks during prompt-based prototyping. ...

ExploreGen: Large Language Models for Envisioning the Uses and Risks of AI Technologies
  • Citing Article
  • October 2024

Proceedings of the AAAI/ACM Conference on AI Ethics and Society

... A growing body of work emphasizes the importance of task-specific assessment and documentation of potential harms related to particular AI applications (and several frameworks have been developed for this purpose, e.g., RiskCards [23], ethics sheets [77], and RAI Guidelines [18]). Our findings complement this call for action, demonstrating that bias can and does manifest in unique ways in product description generation-problems that are of critical importance but would have been difficult or impossible to identify with a more general analysis. ...

RAI Guidelines: Method for Generating Responsible AI Guidelines Grounded in Regulations and Usable by (Non-)Technical Roles
  • Citing Article
  • November 2024

Proceedings of the ACM on Human-Computer Interaction

... It has also been shown that anticipating the risks of an AI system or a model is a hard task even for practitioners and researchers with knowledge of AI [8,19]. The increasing frequency of real-world AI incidents and harms [45,68] is likely partly due to the lack of transparency regarding risks associated with models deployed [4,13]. Risks can be reported as "model-specific", i.e., risks arising from the model's unique capabilities or limitations (e.g., perpetuating harmful biases from training data), or "use-specific", i.e., contextualized risks tied to specific applications (e.g., biases affecting attendees when transcribing virtual meeting). ...

Decoding Real-World Artificial Intelligence Incidents
  • Citing Article
  • November 2024

Computer

... Wang et al. [69] introduced FarSight, another LLM-based tool designed specifically to support prompt developers working with LLMs. Bogucka et al. [7] compiled risks of various AI uses that have led to real-world harms, presenting them in a visualization appealing to the broader public. All of these tools leveraged LLMs to identify potential uses or risks for a given AI system. ...

Atlas of AI Risks: Enhancing Public Understanding of AI Risks

Proceedings of the AAAI Conference on Human Computation and Crowdsourcing

... The thorny area of data use, privacy, and sovereignty, as discussed, is one that mobile AI providers are seeking to get ahead of. So while mobile AI brings potential risks and problems with a new, ramped-up phase of data gathering and use (Constantinides et al., 2024) that we don't yet fully understand, there is the promise of mobile AI devices to provide better options for data privacy and security with on-device housing of data at the "edge" of Internet and mobile networks, rather than relying on cloud storage and services. Another area, also evident from the case studies above, arises from the fascinating ways in which mobile AI is a focal point for the next wave of digitalization and personalization in products and services. ...

Good Intentions, Risky Inventions: A Method for Assessing the Risks and Benefits of AI in Mobile and Wearable Uses
  • Citing Article
  • September 2024

Proceedings of the ACM on Human-Computer Interaction

... To measure the impact of AI on occupational tasks, Septiandri et al. [24] defined the AI Impact (AII) as a measure of how much AI could impact a job's tasks by looking at how closely these tasks are associated with patents. For each task, their method finds the patent most similar to it using a cosine similarity score. ...

The potential impact of AI innovations on US occupations
  • Citing Article
  • September 2024

PNAS Nexus
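The citing snippet for this entry says that, for each occupational task, the method finds the most similar patent using a cosine similarity score. A minimal sketch of that matching step is shown below. The snippet does not say how tasks and patents are turned into vectors, so the bag-of-words counts used here are only a stand-in assumption, and the function names and example texts are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Stand-in representation: lowercase word counts (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_patent(task, patents):
    """Return (score, patent) for the patent closest to the task text."""
    task_vec = vectorize(task)
    scored = [(cosine(task_vec, vectorize(patent)), patent) for patent in patents]
    return max(scored, key=lambda pair: pair[0])


# Toy, made-up task and patent descriptions.
task = "schedule patient appointments"
patents = [
    "system to schedule patient appointments automatically",
    "method for welding steel pipes",
]
best_score, best_patent = most_similar_patent(task, patents)
print(best_patent, round(best_score, 3))
```

In practice a dense text embedding would likely replace `vectorize`; the cosine and arg-max steps stay the same.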