Paul Brenner
University of Notre Dame | ND · Center for Research Computing

Doctor of Philosophy

About

83 Publications
5,442 Reads
982 Citations

Publications (83)
Preprint
The emergence of Multimodal Foundation Models (MFMs) holds significant promise for transforming social media platforms. However, this advancement also introduces substantial security and ethical concerns, as it may facilitate malicious actors in the exploitation of online users. We aim to evaluate the strength of security protocols on prominent soc...
Preprint
Emerging technologies, particularly artificial intelligence (AI), and more specifically Large Language Models (LLMs) have provided malicious actors with powerful tools for manipulating digital discourse. LLMs have the potential to affect traditional forms of democratic engagements, such as voter choice, government surveys, or even online communicat...
Article
The emergence of Large Language Models (LLMs) has great potential to reshape the landscape of many social media platforms. While this can bring promising opportunities, it also raises many threats, such as biases and privacy concerns, and may contribute to the spread of propaganda by malicious actors. We developed the "LLMs Among Us" experimental f...
Article
Full-text available
In the advancing world of blockchain technology, the paramount need for trustworthy oracles, the entities responsible for providing external data to blockchain’s smart contracts, is becoming more apparent. This paper introduces a novel protocol, the Persona Preserving Reputation Protocol (P2RP), which leverages Decentralized Identity (DID)-based re...
Article
Full-text available
We introduce a global-scale migration model centered on neoclassical economic migration theory and leveraging Python and Jupyter as the base modeling platform. Our goals focus on improving social scientists’ understanding of migration and their access to visually and computationally robust infrastructure. This will enhance a scientist’s capability...
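At its core, a neoclassical migration step moves population toward higher-wage regions whenever the wage differential exceeds the cost of moving. A minimal Python sketch of that idea follows; the function name and the `cost` and `rate` parameters are illustrative assumptions, not the published model:

```python
def migrate_step(populations, wages, cost=0.1, rate=0.05):
    """One step of a toy neoclassical migration model: a fixed fraction of
    each region's population moves to the highest-wage region, provided the
    wage differential exceeds the migration cost. Illustrative sketch only."""
    best = max(range(len(wages)), key=lambda i: wages[i])
    delta = [0] * len(populations)
    for i, pop in enumerate(populations):
        if i != best and wages[best] - wages[i] > cost:
            movers = int(pop * rate)   # truncate to whole persons/agents
            delta[i] -= movers
            delta[best] += movers
    return [p + d for p, d in zip(populations, delta)]
```

In an agent-based version the same decision rule would run per agent rather than per region, which is where the computational cost discussed in the abstract comes from.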
Article
The dark web has become an increasingly important landscape for the sale of illicit cyber goods. Given the prevalence of malware and tools that are used to steal data from individuals on these markets, it is crucial that every company, governing body, and cyber professional be aware of what information is sold on these marketplaces. Knowing this in...
Article
Background and Context: Differences in children’s and adolescents’ initial attitudes about computing and other STEM fields may form during middle school and shape decisions leading to career entry. Early emerging differences in career interest may propagate a lack of diversity in computer science and programming fields. Objective: Though middle scho...
Chapter
We perform an analysis of SVM, BERT, and Longformer NLP tools as applied to large volumes of unclassified news articles given small volumes of labeled news articles for training. Analysis of the target machine learning tools is performed through a case study of global trigger events; specifically triggers of state-led mass killings. The goal of the...
Preprint
Full-text available
Analysis cyberinfrastructure refers to the combination of software and computer hardware used to support late-stage data analysis in High Energy Physics (HEP). For the purposes of this white paper, late-stage data analysis refers specifically to the step of transforming the most reduced common data format produced by a given experimental collaborat...
Conference Paper
Machine learning (ML) can in theory be used to personalize educational content by identifying online activities aligned with learners' interests. Yet, are learners' self-reported ratings of activities associated with a machine learning generated recommender score? In the current study we sought to address this question using learner's ratings of ac...
Chapter
The Journalist’s Creed, a declaration of the principles, values, and standards of a journalist, states that a journalist should “believe that clear thinking and clear statement, accuracy and fairness are fundamental to good journalism”. However, in recent years there has been concern that personal, corporate, and government biases and opinions have...
Preprint
Full-text available
Robust visualization of complex data is critical for the effective use of NLP for event classification, as the volume of data is large and the high-dimensional structure of text makes data challenging to summarize succinctly. In event extraction tasks in particular, visualization can aid in understanding and illustrating the textual relationships f...
Article
Full-text available
High Performance Computing (HPC) facilities provide vast computational power and storage, but generally work on fixed environments designed to address the most common software needs locally, making it challenging for users to bring their own software. To overcome this issue, most HPC facilities have added support for HPC friendly container technolo...
Article
Full-text available
The NSF-funded Scalable CyberInfrastructure for Artificial Intelligence and Likelihood Free Inference (SCAILFIN) project aims to develop and deploy artificial intelligence (AI) and likelihood-free inference (LFI) techniques and software using scalable cyberinfrastructure (CI) built on top of existing CI elements. Specifically, the project has exten...
Conference Paper
We introduce a global scale parallel python modeling platform alongside an example global migration model. Our goals focus on improving social scientist access to computationally robust infrastructure, which enhances a scientist's capability to model more complex macro-scale global effects or larger numbers of micro-scale agents and behaviors. We b...
Article
Full-text available
The University of Notre Dame (ND) CMS group operates a modest-sized Tier-3 site suitable for local, final-stage analysis of CMS data. However, through the ND Center for Research Computing (CRC), Notre Dame researchers have opportunistic access to roughly 25k CPU cores of computing and a 100 Gb/s WAN network link. To understand the limits of what mi...
Article
Full-text available
We previously described Lobster, a workflow management tool for exploiting volatile opportunistic computing resources for computation in HEP. We will discuss the various challenges that have been encountered while scaling up the simultaneous CPU core utilization and the software improvements required to overcome these challenges. Categories: Workfl...
Conference Paper
While many high performance and high throughput scientific computations have been successfully ported to the cloud, coupling data intensive instruments, experiments and sensors to IaaS remains a challenge. We walk through the process of cloud migrating computations on physical IT infrastructure that are integrated with mass spectrometry instruments...
Conference Paper
Global scale human simulations have application in diverse fields such as economics, anthropology and marketing. The sheer number of agents, however, makes them extremely sensitive to variations in algorithmic complexity resulting in potentially prohibitive computational resource costs. In this paper we show that the computational capability of mod...
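The sensitivity to algorithmic complexity noted above is easy to see in agent neighbor search: comparing every agent pair costs O(n²), while hashing agents into radius-sized spatial cells brings the cost near O(n) for uniformly spread agents. A sketch under those assumptions (agent layout and function names are hypothetical):

```python
from collections import defaultdict

def neighbors_bruteforce(agents, radius):
    """All-pairs neighbor search: O(n^2) distance comparisons."""
    r2 = radius * radius
    pairs = []
    for i in range(len(agents)):
        for j in range(i + 1, len(agents)):
            (x1, y1), (x2, y2) = agents[i], agents[j]
            if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= r2:
                pairs.append((i, j))
    return pairs

def neighbors_bucketed(agents, radius):
    """Spatial hashing: bin agents into radius-sized cells so each agent is
    compared only against the 3x3 surrounding cells."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(agents):
        grid[(int(x // radius), int(y // radius))].append(idx)
    r2 = radius * radius
    pairs = set()
    for (cx, cy), members in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for i in members:
                    for j in grid.get((cx + dx, cy + dy), ()):
                        if i < j:
                            (x1, y1), (x2, y2) = agents[i], agents[j]
                            if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= r2:
                                pairs.add((i, j))
    return sorted(pairs)
```

Both functions return the same pair set; only the asymptotic cost differs, which is exactly the kind of constant-factor-versus-complexity trade-off that dominates at global agent counts.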
Article
Full-text available
Analysis of high energy physics experiments using the Compact Muon Solenoid (CMS) at the Large Hadron Collider (LHC) can be limited by availability of computing resources. As a joint effort involving computer scientists and CMS physicists at Notre Dame, we have developed an opportunistic workflow management tool, Lobster, to harvest available cycle...
Conference Paper
Continued growth in cloud IaaS adoption motivates greater transparency into IaaS network performance to effectively leverage intra and intercloud service and data migration needs of a global consumer base. In this work, we produce a baseline set of network performance measurements for cloud WANs both internal to individual IaaS provider's globally...
Conference Paper
Full-text available
Increasingly complex biomedical data from diverse sources demands large storage, efficient software and high performance computing for the data’s computationally intensive analysis. Cloud technology provides flexible storage and data processing capacity to aggregate and analyze complex data; facilitating knowledge sharing and integration from diffe...
Conference Paper
The computing needs of high energy physics experiments like the Compact Muon Solenoid experiment at the Large Hadron Collider currently exceed the available dedicated computational resources, hence motivating a push to leverage opportunistic resources. However, access to opportunistic resources faces many obstacles, not the least of which is making...
Article
With building energy consumption rising in industrial nations, new approaches for energy efficiency are required. Similarly, the data centers that house information and communications technology continue to consume significant amounts of energy, especially for cooling the equipment, which in turn produces vast amounts of waste heat. A new strategy...
Article
Reducing energy consumption is a critical step in lowering data center operating costs for various institutions. As such, with the growing popularity of cloud computing, it is necessary to examine various methods by which energy consumption in cloud environments can be reduced. We analyze the effects of global virtual machine allocation on energy c...
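A common way VM allocation affects energy is through consolidation: packing VM demands onto as few hosts as possible so the remaining hosts can be powered down. A generic first-fit-decreasing sketch of that heuristic is below; it illustrates the technique class, not the paper's specific allocator:

```python
def first_fit_decreasing(vm_loads, host_capacity):
    """Pack VM CPU demands onto as few hosts as possible (first-fit
    decreasing). Hosts that receive no VMs can be powered down, reducing
    idle energy draw. Generic consolidation sketch, not the paper's model."""
    hosts = []       # remaining capacity of each powered-on host
    placement = {}   # vm index -> host index
    for vm, load in sorted(enumerate(vm_loads), key=lambda kv: -kv[1]):
        for h, free in enumerate(hosts):
            if load <= free:
                hosts[h] -= load
                placement[vm] = h
                break
        else:
            hosts.append(host_capacity - load)   # power on a new host
            placement[vm] = len(hosts) - 1
    return placement, len(hosts)
```

With a linear power model (idle draw plus a utilization-proportional term), every host that stays off saves the full idle component, which is why allocation policy alone can move the energy bill.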
Conference Paper
Environmentally Opportunistic Computing is an approach to use the waste heat from containerized data center nodes to offset the heating needs of adjacent buildings or facilities. The computational load is then distributed across a number of data center nodes based on where the waste heat is required. In this work, a prototype data center node that...
Chapter
Full-text available
The energy consumed by data centers is growing every year. Significant energy and cost savings are possible through modest gains in efficiency, which yield both economic benefits and an improved environmental footprint. One solution is Environmentally Opportunistic Computing (EOC), which is a sustainable computing concept t...
Conference Paper
Reducing energy consumption is a critical step in lowering data center operating costs for various institutions. As such, with the growing popularity of cloud computing, it is necessary to examine various methods by which energy consumption in cloud environments can be reduced. We analyze the effects of virtual machine allocation on energy consumpt...
Chapter
Full-text available
Opportunistic techniques have been widely used to create useful computational infrastructures and have demonstrated an ability to deliver computing resources to large applications. However, the management of disk space and usage in such systems is often neglected. While overarching filesystems provide structures and policies for their data...
Data
The MSM transition matrix for the Extended 1 dataset. (0.05 MB XLS)
Data
Rex estimation for WW residues using the H-bond based Method 1 versus experimental Rex/Rex(12). (0.02 MB EPS)
Data
Rex estimation for WW residues for a 2 state Markov State Model versus experimental Rex/Rex(12) and the 40 state MSM using Extended 2 data set. (0.02 MB EPS)
Data
Hydrogen bonds present in the minor state, according to Exchange State Identification Method 1. Atom names according to CHARMM 27 force field. (0.04 MB PDF)
Data
Full-text available
Dihedral values for Arg-12, Ser-13, and Gly-15 of representative structures for each macrostate of the MSM. (0.03 MB PDF)
Data
(H-bond based) Exchange State Identification Algorithm. (0.07 MB PDF)
Data
Stationary distribution π of the transition probability matrix T. (0.57 MB EPS)
Data
Representative structures from 40 macrostates, with the side chain of Arg-12 shown. The state population is indicated along with the macrostate index. These structures were selected using the microstate within each macrostate with the densest population, i.e. the most probable microstate. Macrostate 16, found to be a kinetic hub is shown in orange....
Data
Full-text available
Loop and whole protein RMSD values of representative macrostate structures with respect to APO and HOLO experimental structures. (0.03 MB PDF)
Data
Full-text available
Additional details for methods and results sections as well extra figures and tables. (0.15 MB PDF)
Data
This movie depicts the 3-D structures of each of the representative conformations of the Markov State Model of Pin1 WW domain. (3.75 MB MOV)
Data
Implied time scales for the MSM macrostates. The figure shows the slowest time scale (top envelope) and the fourth slowest time scale (bottom envelope). Bootstrapping was used to compute error bars: the initial trajectory was split into 10 different pieces to allow random re-sampling with replacement. (0.25 MB EPS)
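The bootstrapping procedure described in that figure (split the trajectory into 10 pieces, resample with replacement, recompute, report the spread) is a standard block bootstrap. A generic sketch, assuming an arbitrary `statistic` callable rather than the MSM timescale estimator itself:

```python
import random
import statistics

def bootstrap_error(trajectory, statistic, n_blocks=10, n_resamples=200, seed=0):
    """Block bootstrap for time-series error bars: split the trajectory into
    n_blocks contiguous pieces, resample blocks with replacement, recompute
    the statistic on each resample, and report mean and standard deviation.
    Generic sketch; the MSM implied-timescale estimator is not reproduced."""
    rng = random.Random(seed)
    size = len(trajectory) // n_blocks
    blocks = [trajectory[i * size:(i + 1) * size] for i in range(n_blocks)]
    estimates = []
    for _ in range(n_resamples):
        sample = []
        for _ in range(n_blocks):
            sample.extend(rng.choice(blocks))   # resample with replacement
        estimates.append(statistic(sample))
    return statistics.mean(estimates), statistics.stdev(estimates)
```

Block (rather than pointwise) resampling preserves the short-range time correlation within each piece, which is why it is the appropriate bootstrap for trajectory data.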
Data
Complexity of the correlation maximization algorithm for different number of MSM macrostates relative to the complexity of maximizing correlation for 40 macrostates. (0.24 MB EPS)
Data
Full-text available
(Hybrid H-bond and MSM based) Exchange State Identification Algorithm. (0.12 MB PDF)
Data
The MSM transition matrix for the Extended 2 dataset. (0.06 MB XLS)
Data
Correlation of Rex estimation for different number of macrostates and using different simulation datasets. (0.01 MB EPS)
Article
Full-text available
Protein-protein interactions are often mediated by flexible loops that experience conformational dynamics on the microsecond to millisecond time scales. NMR relaxation studies can map these dynamics. However, defining the network of inter-converting conformers that underlie the relaxation data remains generally challenging. Here, we combine NMR rel...
Conference Paper
Increasing economic and environmental costs of running data centers has driven interest in sustainable computing. Environmentally Opportunistic Computing (EOC) is an approach to harvesting heat produced by computing clusters. Our Green Cloud EOC prototype serves as an operational demonstration that IT resources can be integrated with the dominant e...
Conference Paper
Full-text available
The United States Environmental Protection Agency forecasts the 2011 national IT electric energy expenditure will grow toward $7.4 billion [1]. In parallel to economic IT energy concerns, the general public and environmental advocacy groups are demanding proactive steps toward sustainable green processes. Our contribution to the solution of this pr...
Conference Paper
Energy is a significant and growing component of the cost of running a large computing facility. A grid workload consisting of millions of jobs running on thousands of processors may consume millions of kilowatt hours of electricity. However, because a grid workload generally consists of many independent sequential processes, we may shape its execu...
Article
Computationally complex and data intensive atomic scale biomolecular simulation is enabled via processing in network storage (PINS): a novel distributed system framework to overcome bandwidth, compute, storage, organizational, and security challenges inherent to the wide-area computation and storage grid. PINS is presented as an eff...
Article
As distributed storage systems grow, the response time between the occurrence of a fault, detection, and repair becomes significant. Systems built on shared servers have additional complexity because of the high rate of service outages and revocation. Managing high replica counts in this environment becomes very costly in terms of the storage requ...
Conference Paper
We introduce Student Engineers Reaching Out (SERO), an EPICS team at the University of Notre Dame committed to Service Learning founded in engineering curricula. Two SERO case studies highlight the framework, implementation, challenges, and shared benefits of our unique Service Learning course developed specifically for engineers. The first study d...
Article
Full-text available
The authors accelerate the replica exchange method through an efficient all-pairs replica exchange. A proof of detailed balance is shown along with an analytical estimate of the enhanced exchange efficiency. The new method provides asymptotically four fold speedup of conformation traversal for replica counts of 8 and larger with typical exchange ra...
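The detailed-balance proof mentioned above rests on the standard Metropolis acceptance test for swapping configurations between two replicas. A textbook-form sketch of that pairwise criterion (the paper's all-pairs scheme generalizes which pairs may swap; it is not reproduced here):

```python
import math
import random

def attempt_exchange(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Metropolis acceptance test for exchanging configurations between two
    replicas at inverse temperatures beta_i and beta_j with potential
    energies energy_i and energy_j. Accepting with probability
    min(1, exp((beta_i - beta_j) * (energy_i - energy_j))) satisfies
    detailed balance for the pairwise swap move. Standard textbook form."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return delta >= 0 or rng.random() < math.exp(delta)
```

When the colder replica (larger beta) holds the higher energy, `delta` is non-negative and the swap is always accepted; otherwise it is accepted with Boltzmann-weighted probability.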
Conference Paper
Full-text available
Computationally complex and data intensive atomic scale biomolecular simulation is enabled via Processing in Network Storage (PINS): a novel distributed system framework to overcome bandwidth, compute, storage, and security challenges inherent to the wide area computation and storage grid. High throughput data generation requirements for our...
Conference Paper
Full-text available
Distributed computation systems have become an important tool for scientific simulation, and a similarly distributed replica management system may be employed to increase the locality and availability of storage services. While users of such systems may have low expectations regarding the security and reliability of the computation involved, they...
Conference Paper
Full-text available
Many modern storage systems used for large-scale scientific systems are multiple use, independently administrated clusters or grids. A common technique to gain storage reliability over a long period of time is the creation of data replicas on multiple servers, but in the presence of server failures, ongoing corrective action must be taken to preven...
Chapter
Full-text available
We compare the effectiveness of different simulation sampling techniques by illustrating their application to united atom butane, alanine dipeptide, and a small solvated protein, BPTI. We introduce an optimization of the Shadow Hybrid Monte Carlo algorithm, a rigorous method that removes the bias of molecular dynamics. We also evaluate the ability...
Conference Paper
We describe a service-learning program within the CSE department at the University of Notre Dame. Started in 1997 as the first affiliate of the Purdue EPICS program, this service-learning program involves volunteer faculty-mentored teams of students applying engineering skills in local, national, and international consulting projects for various co...
Conference Paper
Full-text available
Biomolecular simulations produce more output data than can be managed effectively by traditional computing systems. Researchers need distributed systems that allow the pooling of resources, the sharing of simulation data, and the reliable publication of both tentative and final results. To address this need, we have designed GEMS, a system that ena...
Conference Paper
Full-text available
Sharing data and storage space in a distributed system remains a difficult task for ordinary users, who are constrained to the fixed abstractions and resources provided by administrators. To remedy this situation, we introduce the concept of a tactical storage system (TSS) that separates storage abstractions from storage resources, leaving users fr...
Article
The mechanical properties and dislocation microstructure of single crystals with a range of compositions within the Fex-Ni60–x-Al40 pseudobinary system have been investigated, with the purpose of bridging the behavior from FeAl to NiAl. Experiments are focused on the compression testing of <001> oriented single crystals with compositions where x =...
Article
From 2006 to 2011 the U.S. national energy consumption for powering and cooling IT servers is estimated to grow from a cost of 4.5 to 7.4 billion dollars as reported by a recent EPA study [1] which included current efficiency improvement trends. With growing national concern for energy efficiency and environmental stewardship, current power utiliza...
Article
Full-text available
The goal of scientific grid computing is to enable researchers to attack problems that were previously impossible. Cooperative storage systems have been assembled to provide users with access to surprisingly large amounts of useful space, but locating, accessing, and efficiently employing such distributed data sets in scientific simulation or analy...
Article
Conformational sampling from proteins and other molecules is performed using Monte Carlo protocols, with modifications to include random motions of key torsion angles and functionality in the presence of rings. Verification of the implementation is achieved via analysis of the sampled dihedral distributions, sampling rates, and geometric configurat...
Article
Full-text available
Modern computer simulations of biomolecular systems are long running programs that produce an abundance of data in the form of large output files. This data could be reused several times by different researchers, reducing wasted computer time recomputing the output and introducing the capability for apples to apples simulation evaluations. To rea...
