Customer projects are major do-it-yourself (DIY) undertakings that involve considerable planning, money, and effort. Examples include installing a paver patio, tiling kitchen walls, building a backyard football toss, and renovating a bathroom. Such projects require several cross-category purchases over multiple shopping trips (Wolf and McQuitty 2011). Identifying and understanding customer projects provides novel insights into customer behavior, which, in turn, improve advertising planning and the development of effective marketing decisions. In this work, we seek to contribute to the growing literature that uses graph mining techniques to identify persistent multi-trip purchase patterns (Dhar et al. 2014; Kim, Kim, and Chen 2012; Oestreicher-Singer et al. 2013; Videla-Cavieres and Ríos 2014). We do so by presenting an analytical method for identifying customer projects from retail sales transactions and by demonstrating the utility, validity, and replicability of our approach using a data set of 20,000 customers across two years from a Fortune 500 specialty retailer.