Thesis

Answering meta-analytic questions on heterogeneous and uncertain neuroscientific data with probabilistic logic programming

Abstract

This thesis contributes to the development of a probabilistic logic programming language specific to the domain of cognitive neuroscience, named NeuroLang, and presents some of its applications to the meta-analysis of the functional brain mapping literature. By relying on logic formalisms such as Datalog and their probabilistic extensions, we show how NeuroLang makes it possible to combine uncertain and heterogeneous data to formulate rich meta-analytic hypotheses. We encode the Neurosynth database into a NeuroLang program and formulate probabilistic logic queries that yield term-association and coactivation brain maps similar to those obtained with existing tools, highlighting known brain networks. We prove the correctness of our model using the joint probability distribution defined by the Bayesian network translation of probabilistic logic programs, showing that queries lead to the same estimates as Neurosynth. Then, we show that modeling term-to-study associations probabilistically, based on term frequency-inverse document frequency (TF-IDF) measures, results in better accuracy on simulated data and better consistency on real data for two-term conjunctive queries at smaller sample sizes. Finally, we use NeuroLang to formulate and test concrete functional brain mapping hypotheses, reproducing past results. By solving segregation logic queries combining the Neurosynth database, topic models, and the data-driven functional atlas DiFuMo, we find supporting evidence of a heterogeneous organisation of the frontoparietal control network (FPCN), and that the subregion of the fusiform gyrus called the visual word form area (VWFA) is recruited in attentional tasks in addition to language-related cognitive tasks.
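As a rough illustration of the kind of estimation involved (not NeuroLang's actual syntax or implementation), the sketch below computes a Neurosynth-style term-association map from a toy study-by-voxel activation table, first with binary term-to-study associations and then with hypothetical TF-IDF-derived weights:

    # Minimal sketch: Neurosynth-style forward inference on synthetic data, and a
    # probabilistic variant where term-to-study association is weighted by TF-IDF.
    import numpy as np

    rng = np.random.default_rng(0)
    n_studies, n_voxels = 200, 50

    # Binary activation reports: activation[s, v] = 1 if study s reports voxel v.
    activation = (rng.random((n_studies, n_voxels)) < 0.1).astype(float)
    # Binary term occurrence: does study s mention the term (e.g. "memory")?
    mentions_term = rng.random(n_studies) < 0.3
    # TF-IDF score of the term in each study's abstract (0 when absent).
    tfidf = mentions_term * rng.random(n_studies)

    # Forward inference, binary model: P(activation at v | term) estimated as the
    # proportion of term-associated studies reporting activation at v.
    p_act_given_term = activation[mentions_term].mean(axis=0)

    # Probabilistic variant: weight each study by a probability derived from its
    # TF-IDF score (a simple normalisation here; the thesis evaluates such choices).
    weights = tfidf / tfidf.sum()
    p_act_weighted = weights @ activation

    print(p_act_given_term[:5], p_act_weighted[:5])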

... com/NeuroLang/NeuroLang; Wassermann et al., 2022. In-depth details on NeuroLang are found in Abdallah et al., 2022 and Iovene, 2021. All code was developed based on open-source, publicly available software packages. ...
Article
Full-text available
The lateral prefrontal cortex (LPFC) of humans enables flexible goal-directed behavior. However, its functional organization remains actively debated after decades of research. Moreover, recent efforts aiming to map the LPFC through meta-analysis are limited, either in scope or in the inferred specificity of structure-function associations. These limitations are in part due to the limited expressiveness of commonly-used data analysis tools, which restricts the breadth and complexity of questions that can be expressed in a meta-analysis. Here, we adopt NeuroLang, a novel approach to more expressive meta-analysis based on probabilistic first-order logic programming, to infer the organizing principles of the LPFC from 14,371 neuroimaging studies. Our findings reveal a rostrocaudal and a dorsoventral gradient, respectively explaining the most and second most variance in meta-analytic connectivity across the LPFC. Moreover, we identify a unimodal-to-transmodal spectrum of coactivation patterns along with a concrete-to-abstract axis of structure-function associations extending from caudal to rostral regions of the LPFC. Finally, we infer inter-hemispheric asymmetries along the principal rostrocaudal gradient, identifying hemisphere-specific associations with topics of language, memory, response inhibition, and sensory processing. Overall, this study provides a comprehensive meta-analytic mapping of the LPFC, grounding future hypothesis generation on a quantitative overview of past findings.
Article
Full-text available
Inferring reliable brain-behavior associations requires synthesizing evidence from thousands of functional neuroimaging studies through meta-analysis. However, existing meta-analysis tools are limited to investigating simple neuroscience concepts and expressing a restricted range of questions. Here, we expand the scope of neuroimaging meta-analysis by designing NeuroLang: a domain-specific language to express and test hypotheses using probabilistic first-order logic programming. By leveraging formalisms found at the crossroads of artificial intelligence and knowledge representation, NeuroLang provides the expressivity to address a larger repertoire of hypotheses in a meta-analysis, while seamlessly modeling the uncertainty inherent to neuroimaging data. We demonstrate the language’s capabilities in conducting comprehensive neuroimaging meta-analysis through use-case examples that address questions of structure-function associations. Specifically, we infer the specific functional roles of three canonical brain networks, support the role of the visual word-form area in visuospatial attention, and investigate the heterogeneous organization of the frontoparietal control network.
Preprint
Full-text available
The sharing of research data is essential to ensure reproducibility and maximize the impact of public investments in scientific research. Here we describe OpenNeuro, a BRAIN Initiative data archive that provides the ability to openly share data from a broad range of brain imaging data types following the FAIR principles for data sharing. We highlight the importance of the Brain Imaging Data Structure (BIDS) standard for enabling effective curation, sharing, and reuse of data. The archive presently shares more than 500 datasets including data from more than 18,000 participants, comprising multiple species and measurement modalities and a broad range of phenotypes. The impact of the shared data is evident in a growing number of published reuses, currently totalling more than 150 publications. We conclude by describing plans for future development and integration with other ongoing open science efforts.
Article
Full-text available
We introduce DeepProbLog, a neural probabilistic logic programming language that incorporates deep learning by means of neural predicates. We show how existing inference and learning techniques of the underlying probabilistic logic programming language ProbLog can be adapted for the new language. We theoretically and experimentally demonstrate that DeepProbLog supports (i) both symbolic and subsymbolic representations and inference, (ii) program induction, (iii) probabilistic (logic) programming, and (iv) (deep) learning from examples. To the best of our knowledge, this work is the first to propose a framework where general-purpose neural networks and expressive probabilistic-logical modeling and reasoning are integrated in a way that exploits the full expressiveness and strengths of both worlds and can be trained end-to-end based on examples.
Thesis
Full-text available
Within the field of brain mapping, we identified the need for a tool which is grounded in the detailed knowledge of individual variability of sulci. In this thesis, we develop a new brain mapping tool called NeuroLang, which utilises the spatial geometry of the brain. We approached this challenge with two perspectives: firstly, we grounded our theory firmly in classical neuroanatomy. Secondly, we designed and implemented methods for sulcus-specific queries in the domain-specific language, NeuroLang. We tested our method on 52 subjects and evaluated the performance of NeuroLang for population and subject-specific representations of neuroanatomy. Then, we present our novel, data-driven hierarchical organisation of sulcal stability. To conclude, we summarise the implications of our method within the current field, as well as our overall contribution to the field of brain mapping.
Article
Full-text available
Population imaging markedly increased the size of functional-imaging datasets, shedding new light on the neural basis of inter-individual differences. Analyzing these large datasets entails new scalability challenges, computational and statistical. For this reason, brain images are typically summarized in a few signals, for instance reducing voxel-level measures with brain atlases or functional modes. A good choice of the corresponding brain networks is important, as most data analyses start from these reduced signals. We contribute finely-resolved atlases of functional modes, comprising from 64 to 1024 networks. These dictionaries of functional modes (DiFuMo) are trained on millions of fMRI functional brain volumes of total size 2.4TB, spanning 27 studies and many research groups. We demonstrate the benefits of extracting reduced signals on our fine-grain atlases for many classic functional data analysis pipelines: stimuli decoding from 12,334 brain responses, standard GLM analysis of fMRI across sessions and individuals, extraction of resting-state functional-connectome biomarkers for 2,500 individuals, data compression, and meta-analysis over more than 15,000 statistical maps. In each of these analysis scenarios, we compare the performance of our functional atlases with that of other popular references, and with a simple voxel-level analysis. Results highlight the importance of using high-dimensional “soft” functional atlases to represent and analyse brain activity while capturing its functional gradients. Analyses on high-dimensional modes achieve similar statistical performance as voxel-level analyses, but with much reduced computational cost and higher interpretability. In addition to making these modes available, we provide meaningful names for them based on their anatomical location, which will facilitate the reporting of results.
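A minimal sketch of what extracting reduced signals on a soft atlas such as DiFuMo can look like with nilearn is given below; the fetcher and masker names (fetch_atlas_difumo, NiftiMapsMasker) follow recent nilearn releases and may differ in other versions, and 'func.nii.gz' is a placeholder input:

    # Sketch of reducing voxel-level fMRI signals onto a soft functional atlas such
    # as DiFuMo using nilearn. Names below reflect recent nilearn releases and are
    # assumptions if your installed version differs; treat this as an illustration,
    # not the pipeline used in the cited work.
    from nilearn import datasets
    from nilearn.maskers import NiftiMapsMasker

    difumo = datasets.fetch_atlas_difumo(dimension=64)   # 64 probabilistic modes
    masker = NiftiMapsMasker(maps_img=difumo.maps, standardize=True)

    # 'func.nii.gz' is a placeholder 4D fMRI image; the result has shape
    # (n_timepoints, 64): one reduced time series per functional mode.
    reduced_signals = masker.fit_transform("func.nii.gz")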
Article
Full-text available
The human insular cortex is a heterogeneous brain structure which plays an integrative role in guiding behavior. The cytoarchitectonic organization of the human insula has been investigated over the last century using postmortem brains but there has been little progress in noninvasive in vivo mapping of its microstructure and large-scale functional circuitry. Quantitative modeling of multi-shell diffusion MRI data from 413 participants revealed that human insula microstructure differs significantly across subdivisions that serve distinct cognitive and affective functions. Insular microstructural organization was mirrored in its functionally interconnected circuits with the anterior cingulate cortex that anchors the salience network, a system important for adaptive switching of cognitive control systems. Furthermore, insular microstructural features, confirmed in Macaca mulatta, were linked to behavior and predicted individual differences in cognitive control ability. Our findings open new possibilities for probing psychiatric and neurological disorders impacted by insular cortex dysfunction, including autism, schizophrenia, and fronto-temporal dementia.
Article
Full-text available
Data analysis workflows in many scientific domains have become increasingly complex and flexible. Here we assess the effect of this flexibility on the results of functional magnetic resonance imaging by asking 70 independent teams to analyse the same dataset, testing the same 9 ex-ante hypotheses¹. The flexibility of analytical approaches is exemplified by the fact that no two teams chose identical workflows to analyse the data. This flexibility resulted in sizeable variation in the results of hypothesis tests, even for teams whose statistical maps were highly correlated at intermediate stages of the analysis pipeline. Variation in reported results was related to several aspects of analysis methodology. Notably, a meta-analytical approach that aggregated information across teams yielded a significant consensus in activated regions. Furthermore, prediction markets of researchers in the field revealed an overestimation of the likelihood of significant findings, even by researchers with direct knowledge of the dataset²–⁵. Our findings show that analytical flexibility can have substantial effects on scientific conclusions, and identify factors that may be related to variability in the analysis of functional magnetic resonance imaging. The results emphasize the importance of validating and sharing complex analysis workflows, and demonstrate the need for performing and reporting multiple analyses of the same data. Potential approaches that could be used to mitigate issues related to analytical variability are discussed.
Article
Full-text available
Reaching a global view of brain organization requires assembling evidence on widely different mental processes and mechanisms. The variety of human neuroscience concepts and terminology poses a fundamental challenge to relating brain imaging results across the scientific literature. Existing meta-analysis methods perform statistical tests on sets of publications associated with a particular concept. Thus, large-scale meta-analyses only tackle single terms that occur frequently. We propose a new paradigm, focusing on prediction rather than inference. Our multivariate model predicts the spatial distribution of neurological observations, given text describing an experiment, cognitive process, or disease. This approach handles text of arbitrary length and terms that are too rare for standard meta-analysis. We capture the relationships and neural correlates of 7547 neuroscience terms across 13459 neuroimaging publications. The resulting meta-analytic tool, neuroquery.org, can ground hypothesis generation and data-analysis priors on a comprehensive view of published findings on the brain.
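The prediction-focused idea can be approximated, in a deliberately simplified form, by regressing study-level activation maps on TF-IDF text features and predicting a map for new text; the sketch below uses scikit-learn with made-up inputs and omits the smoothing and reweighting of the actual NeuroQuery model:

    # Simplified sketch of text-to-brain-map prediction in the spirit of NeuroQuery:
    # regress study-level activation maps on TF-IDF text features, then predict a
    # map for new text. corpus_texts and study_maps are hypothetical inputs.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge

    corpus_texts = ["verbal working memory task", "reward anticipation", "face perception"]
    study_maps = np.random.rand(3, 1000)   # one flattened brain map per study

    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(corpus_texts)

    model = Ridge(alpha=1.0).fit(X, study_maps)
    predicted_map = model.predict(vectorizer.transform(["memory for faces"]))
    print(predicted_map.shape)   # (1, 1000): a predicted brain map for the query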
Article
Full-text available
SciPy is an open-source scientific computing library for the Python programming language. Since its initial release in 2001, SciPy has become a de facto standard for leveraging scientific algorithms in Python, with over 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories and millions of downloads per year. In this work, we provide an overview of the capabilities and development practices of SciPy 1.0 and highlight some recent technical developments.
Article
Full-text available
While predominant models of visual word form area (VWFA) function argue for its specific role in decoding written language, other accounts propose a more general role of VWFA in complex visual processing. However, a comprehensive examination of structural and functional VWFA circuits and their relationship to behavior has been missing. Here, using high-resolution multimodal imaging data from a large Human Connectome Project cohort (N = 313), we demonstrate robust patterns of VWFA connectivity with both canonical language and attentional networks. Brain-behavior relationships revealed a striking pattern of double dissociation: structural connectivity of VWFA with lateral temporal language network predicted language, but not visuo-spatial attention abilities, while VWFA connectivity with dorsal fronto-parietal attention network predicted visuo-spatial attention, but not language abilities. Our findings support a multiplex model of VWFA function characterized by distinct circuits for integrating language and attention, and point to connectivity-constrained cognition as a key principle of human brain organization. The visual word form area (VWFA) is a brain region associated with written language, but it has also been linked to visuospatial attention. Here, the authors reveal distinct structural and functional circuits linking VWFA with language and attention networks, and demonstrate that these circuits separately predict language and attention abilities.
Article
Full-text available
To map the neural substrate of mental function, cognitive neuroimaging relies on controlled psychological manipulations that engage brain systems associated with specific cognitive processes. In order to build comprehensive atlases of cognitive function in the brain, the field must assemble maps for many different cognitive processes, which often evoke overlapping patterns of activation. Such data aggregation faces contrasting goals: on the one hand, finding correspondences across vastly different cognitive experiments, while on the other hand precisely describing the function of any given brain region. Here we introduce a new analysis framework that tackles these difficulties and thereby enables the generation of brain atlases for cognitive function. The approach leverages ontologies of cognitive concepts and multi-label brain decoding to map the neural substrate of these concepts. We demonstrate the approach by building an atlas of functional brain organization based on 30 diverse functional neuroimaging studies, totaling 196 different experimental conditions. Unlike conventional brain mapping, this functional atlas supports robust reverse inference: predicting the mental processes from brain activity in the regions delineated by the atlas. To establish that this reverse inference is indeed governed by the corresponding concepts, and not idiosyncrasies of experimental designs, we show that it can accurately decode the cognitive concepts recruited in new tasks. These results demonstrate that aggregating independent task-fMRI studies can provide a more precise global atlas of selective associations between brain and cognition.
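A toy version of the multi-label decoding step, reduced to one-vs-rest logistic regression on synthetic condition-by-region data, is sketched below; it is a simplification of the ontology-driven framework described above:

    # Toy multi-label "reverse inference" sketch: predict which cognitive concepts
    # are engaged from activation features, using one-vs-rest logistic regression.
    # All data below are synthetic.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OneVsRestClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(196, 256))                   # 196 conditions x 256 atlas regions
    Y = (rng.random((196, 4)) < 0.3).astype(int)      # 4 concept labels per condition

    decoder = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
    concept_probs = decoder.predict_proba(X[:1])      # concept probabilities for one map
    print(concept_probs)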
Article
Full-text available
A defining aspect of brain organization is its spatial heterogeneity, which gives rise to multiple topographies at different scales. Brain parcellation — defining distinct partitions in the brain, be they areas or networks that comprise multiple discontinuous but closely interacting regions — is thus fundamental for understanding brain organization and function. The past decade has seen an explosion of in vivo MRI-based approaches to identify and parcellate the brain on the basis of a wealth of different features, ranging from local properties of brain tissue to long-range connectivity patterns, in addition to structural and functional markers. Given the high diversity of these various approaches, assessing the convergence and divergence among these ensuing maps is a challenge. Inter-individual variability adds to this challenge but also provides new opportunities when coupled with cross-species and developmental parcellation studies.
Article
Full-text available
Despite a growing body of research suggesting that task-based functional magnetic resonance imaging (fMRI) studies often suffer from a lack of statistical power due to too-small samples, the proliferation of such underpowered studies continues unabated. Using large independent samples across eleven tasks, we demonstrate the impact of sample size on replicability, assessed at different levels of analysis relevant to fMRI researchers. We find that the degree of replicability for typical sample sizes is modest and that sample sizes much larger than typical (e.g., N=100) produce results that fall well short of perfectly replicable. Thus, our results join the existing line of work advocating for larger sample sizes. Moreover, because we test sample sizes over a fairly large range and use intuitive metrics of replicability, our hope is that our results are more understandable and convincing to researchers who may have found previous results advocating for larger samples inaccessible.
Article
Full-text available
A central goal of cognitive neuroscience is to decode human brain activity—that is, to infer mental processes from observed patterns of whole-brain activation. Previous decoding efforts have focused on classifying brain activity into a small set of discrete cognitive states. To attain maximal utility, a decoding framework must be open-ended, systematic, and context-sensitive—that is, capable of interpreting numerous brain states, presented in arbitrary combinations, in light of prior information. Here we take steps towards this objective by introducing a probabilistic decoding framework based on a novel topic model—Generalized Correspondence Latent Dirichlet Allocation—that learns latent topics from a database of over 11,000 published fMRI studies. The model produces highly interpretable, spatially-circumscribed topics that enable flexible decoding of whole-brain images. Importantly, the Bayesian nature of the model allows one to “seed” decoder priors with arbitrary images and text—enabling researchers, for the first time, to generate quantitative, context-sensitive interpretations of whole-brain patterns of brain activity.
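As a simplified stand-in for this kind of topic modelling, the sketch below fits a standard LDA model to a toy study-by-term count matrix; GC-LDA additionally couples each topic with a spatial distribution over brain coordinates, which is not reproduced here:

    # Standard LDA on a study-by-term count matrix, as a simplified stand-in for
    # topic modelling of the neuroimaging literature; abstracts are placeholders.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    abstracts = ["working memory load manipulation",
                 "fearful face perception amygdala",
                 "reward prediction error striatum"]
    counts = CountVectorizer().fit_transform(abstracts)

    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
    study_topic_loadings = lda.transform(counts)   # per-study topic proportions
    print(study_topic_loadings.round(2))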
Article
Full-text available
Stereoelectroencephalography (SEEG) is a method for invasive study of patients with refractory epilepsy. In the 1950s, localization of the epileptogenic zone with SEEG relied on anatomo-electro-clinical hypotheses constrained by X-ray imaging, analog electroencephalography (EEG), and seizure semiology. Modern neuroimaging and digital video-EEG have since refined these hypotheses, aiming at more precise localization of the epileptic network. Certain clinical scenarios favor SEEG over subdural EEG (SDEEG). SEEG can cover extensive areas of both hemispheres with highly accurate sampling from sulcal areas and deep brain structures. A hybrid technique of SEEG and subdural strip electrode placement has been reported to overcome the SEEG limitation of poor functional mapping. Technological advances, including acquisition of three-dimensional angiography and magnetic resonance imaging (MRI) in frameless conditions, advanced multimodal planning, and robot-assisted implantation, have contributed to the accuracy and safety of electrode implantation in a simplified fashion. A recent meta-analysis of the safety of SEEG reported a low pooled prevalence of complications, significantly lower than that of SDEEG. The removal of electrodes for SEEG is much simpler than for SDEEG and allows sufficient time for data analysis, discussion, and consensus between patients and physicians before proceeding to treatment. Furthermore, SEEG is applicable as a therapeutic alternative for deep-seated lesions, e.g., nodular heterotopia, in nonoperative epilepsies using SEEG-guided radiofrequency thermocoagulation. We review the SEEG method and the technological advances for planning and implantation of electrodes, and highlight the indications and efficacy, as well as the advantages and disadvantages, of SEEG compared with SDEEG.
Conference Paper
Full-text available
Turing complete languages can express unbounded computations over unbounded structures, either directly or by a suitable encoding. In contrast, Domain Specific Languages (DSLs) are intended to simplify the expression of computations over structures in restricted contexts. However, such simplification often proves irksome, especially for constructing more elaborate programs where the domain, though central, is one of many considerations. Thus, it is often tempting to extend a DSL with more general abstractions, typically to encompass common programming tropes borrowed from favourite languages. The question then arises: once a DSL becomes Turing complete, in what sense is it still domain specific?
Article
Full-text available
Neuroimaging meta-analysis is an area of growing interest in statistics. The special characteristics of neuroimaging data render classical meta-analysis methods inapplicable and therefore new methods have been developed. We review existing methodologies, explaining the benefits and drawbacks of each. A demonstration on a real dataset of emotion studies is included. We discuss some still-open problems in the field to highlight the need for future research.
Article
Full-text available
The development of magnetic resonance imaging (MRI) techniques has defined modern neuroimaging. Since its inception, tens of thousands of studies using techniques such as functional MRI and diffusion weighted imaging have allowed for the non-invasive study of the brain. Despite the fact that MRI is routinely used to obtain data for neuroscience research, there has been no widely adopted standard for organizing and describing the data collected in an imaging experiment. This renders sharing and reusing data (within or between labs) difficult if not impossible and unnecessarily complicates the application of automatic pipelines and quality assurance protocols. To solve this problem, we have developed the Brain Imaging Data Structure (BIDS), a standard for organizing and describing MRI datasets. The BIDS standard uses file formats compatible with existing software, unifies the majority of practices already common in the field, and captures the metadata necessary for most common data processing operations.
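The core of the BIDS convention is that files are organised by subject, session, and data type, with key-value entities encoded in the filename; the short sketch below builds one such functional-run path as an illustration (it is not a substitute for a BIDS validator):

    # Minimal illustration of the BIDS naming convention: data files are organised
    # by subject/session/datatype, with key-value entities encoded in the filename.
    from pathlib import Path

    def bids_func_path(root, sub, ses, task, run):
        name = f"sub-{sub}_ses-{ses}_task-{task}_run-{run}_bold.nii.gz"
        return Path(root) / f"sub-{sub}" / f"ses-{ses}" / "func" / name

    print(bids_func_path("/data/study", "01", "01", "rest", "1"))
    # /data/study/sub-01/ses-01/func/sub-01_ses-01_task-rest_run-1_bold.nii.gz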
Article
Full-text available
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative condition characterized by degeneration of upper motor neurons (UMN) arising from the motor cortex in the brain and lower motor neurons (LMN) in the brainstem and spinal cord. Cerebral changes create differences in brain activity captured by functional magnetic resonance imaging (fMRI), including the spontaneous and simultaneous activity occurring between regions known as the resting state networks (RSNs). Progressive neurodegeneration as observed in ALS may lead to a disruption of RSNs which could provide insights into the disease process. Previous studies have reported conflicting findings of increased, decreased, or unaltered RSN functional connectivity in ALS and do not report the contribution of UMN changes to RSN connectivity. We aimed to bridge this gap by exploring two networks, the default mode network (DMN) and the sensorimotor network (SMN), in 21 ALS patients and 40 age-matched healthy volunteers. An UMN score dichotomized patients into UMN+ and UMN- groups. Subjects underwent resting state fMRI scan on a high field MRI operating at 4.7 tesla. The DMN and SMN changes between subject groups were compared. Correlations between connectivity and clinical measures such as the ALS Functional Rating Scale—Revised (ALSFRS-R), disease progression rate, symptom duration, UMN score and finger tapping were assessed. Significant group differences in resting state networks between patients and controls were absent, as was the dependence on degree of UMN burden. However, DMN connectivity was increased in patients with greater disability and faster progression rate, and SMN connectivity was reduced in those with greater motor impairment. These patterns of association are in line with literature supporting loss of inhibitory interneurons.
Article
Full-text available
Systems neuroscience has identified a set of canonical large-scale networks in humans. These have predominantly been characterized by resting-state analyses of the task-unconstrained, mind-wandering brain. Their explicit relationship to defined task performance is largely unknown and remains challenging. The present work contributes a multivariate statistical learning approach that can extract the major brain networks and quantify their configuration during various psychological tasks. The method is validated in two extensive datasets (n=500 and n=81) by model-based generation of synthetic activity maps from recombination of shared network topographies. To study a use case, we formally revisited the poorly understood difference between neural activity underlying idling versus goal-directed behavior. We demonstrate that task-specific neural activity patterns can be explained by plausible combinations of resting-state networks. The possibility of decomposing a mental task into the relative contributions of major brain networks, the "network co-occurrence architecture" of a given task, opens an alternative access to the neural substrates of human cognition.
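The idea of expressing a task map as a weighted combination of canonical network templates can be illustrated with non-negative least squares on synthetic data, as sketched below; the published method relies on a richer multivariate learning approach:

    # Illustration of decomposing a task activation map into contributions of
    # canonical network templates ("network co-occurrence"), here via non-negative
    # least squares on synthetic data.
    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(0)
    n_voxels, n_networks = 5000, 7
    network_templates = rng.random((n_voxels, n_networks))      # canonical networks
    true_weights = np.array([0.0, 0.5, 0.0, 1.2, 0.0, 0.3, 0.0])
    task_map = network_templates @ true_weights + 0.01 * rng.normal(size=n_voxels)

    weights, residual = nnls(network_templates, task_map)
    print(weights.round(2))   # recovered per-network contributions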
Article
Full-text available
The overall aim of this thesis is the development of novel electroencephalography (EEG) and magnetoencephalography (MEG) analysis methods to provide new insights into the functioning of the human brain. MEG and EEG are non-invasive techniques that measure, outside of the head, the electric potentials and the magnetic fields induced by neuronal activity, respectively. The objective of these functional brain imaging modalities is to localize in space and time the origin of the measured signal. To do so, very challenging mathematical and computational problems need to be tackled. The first part of this work proceeds from the biological origin of the M/EEG signal to the resolution of the forward problem. Starting from Maxwell's equations in their quasi-static formulation and from a physical model of the head, the forward problem predicts the measurements that would be obtained for a given configuration of current generators. With realistic head models the solution is not known analytically and is obtained with numerical solvers. The first contribution of this thesis introduces a solution of this problem using a symmetric boundary element method (BEM), which has excellent precision compared to alternative standard BEM implementations. Once a forward model is available, the next challenge consists in recovering the current generators that produced the measured signal. This problem is referred to as the inverse problem. Three types of approaches exist for solving this problem: parametric methods, scanning techniques, and image-based methods with distributed source models. The latter technique offers a rigorous formulation of the inverse problem without making strong modeling assumptions; however, it requires solving a severely ill-posed problem, whose resolution classically requires imposing constraints or priors on the solution. The second part of this thesis presents robust and tractable inverse solvers, with a particular interest in efficient convex optimization methods using sparse priors. The third part of this thesis is the most applied contribution: a detailed exploration of the problem of retinotopic mapping with MEG measurements, from experimental protocol design to data exploration and resolution of the inverse problem using time-frequency analysis. The next contribution of this thesis aims to go one step further than simple source localization by providing an approach to investigate the dynamics of cortical activations. Starting from spatiotemporal source estimates, the proposed algorithm provides a way to robustly track the "hot spots" over the cortical mesh in order to give a clear view of cortical processing over time. The last contribution of this work addresses the very challenging problem of single-trial data processing. We propose to make use of recent progress in graph-based methods to achieve parameter estimation on single-trial data and therefore reduce the estimation bias produced by standard multi-trial data averaging. Both the source code of our algorithms and the experimental data are freely available to reproduce the results presented. The retinotopy project was done in collaboration with the LENA team at the hôpital La Pitié-Salpêtrière (Paris).
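As a concrete example of the simplest prior-constrained inverse solver alluded to above, the sketch below computes a Tikhonov-regularised minimum-norm estimate on synthetic data; the sparse-prior convex solvers developed in the thesis are not shown:

    # L2 minimum-norm (Tikhonov-regularised) inverse solution, the simplest example
    # of a prior-constrained M/EEG inverse solver, on synthetic data.
    import numpy as np

    rng = np.random.default_rng(0)
    n_sensors, n_sources = 64, 2000
    G = rng.normal(size=(n_sensors, n_sources))        # forward (lead-field) matrix
    measurements = rng.normal(size=n_sensors)          # one time sample of M/EEG data

    lam = 1.0                                          # regularisation strength
    # x_hat = G.T (G G.T + lam I)^{-1} m : the L2 minimum-norm estimate
    x_hat = G.T @ np.linalg.solve(G @ G.T + lam * np.eye(n_sensors), measurements)
    print(x_hat.shape)   # (2000,) estimated source amplitudes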
Book
Recent advances in the area of lifted inference, which exploits the structure inherent in relational probabilistic models. Statistical relational AI (StaRAI) studies the integration of reasoning under uncertainty with reasoning about individuals and relations. The representations used are often called relational probabilistic models. Lifted inference is about how to exploit the structure inherent in relational probabilistic models, either in the way they are expressed or by extracting structure from observations. This book covers recent significant advances in the area of lifted inference, providing a unifying introduction to this very active field. After providing necessary background on probabilistic graphical models, relational probabilistic models, and learning inside these models, the book turns to lifted inference, first covering exact inference and then approximate inference. In addition, the book considers the theory of liftability and acting in relational domains, which allows the connection of learning and reasoning in relational domains. Contributors Babak Ahmadi, Hendrik Blockeel, Hung Bui, Yuqiao Chen, Arthur Choi, Jaesik Choi, Adnan Darwiche, Jesse Davis, Rodrigo de Salvo Braz, Pedro Domingos, Daan Fierens, Martin Grohe, Fabian Hadiji, Seyed Mehran Kazemi, Kristian Kersting, Roni Khardon, Angelika Kimmig, Jacek Kisyński, Daniel Lowd, Wannes Meert, Martin Mladenov, Raymond Mooney, Sriraam Natarajan, Mathias Niepert, David Poole, Scott Sanner, Pascal Schweitzer, Nima Taghipour, Guy Van den Broeck
Article
Statistical models of real world data typically involve continuous probability distributions such as normal, Laplace, or exponential distributions. Such distributions are supported by many probabilistic modelling formalisms, including probabilistic database systems. Yet, the traditional theoretical framework of probabilistic databases focuses entirely on finite probabilistic databases. Only recently, we set out to develop the mathematical theory of infinite probabilistic databases. The present paper is an exposition of two recent papers which are cornerstones of this theory. In (Grohe, Lindner; ICDT 2020) we propose a very general framework for probabilistic databases, possibly involving continuous probability distributions, and show that queries have a well-defined semantics in this framework. In (Grohe, Kaminski, Katoen, Lindner; PODS 2020) we extend the declarative probabilistic programming language Generative Datalog, proposed by (Bárány et al. 2017) for discrete probability distributions, to continuous probability distributions and show that such programs yield generative models of continuous probabilistic databases.
Article
Large-scale probabilistic knowledge bases are becoming increasingly important in academia and industry. They are continuously extended with new data, powered by modern information extraction tools that associate probabilities with knowledge base facts. The state of the art to store and process such data is founded on probabilistic databases. Many systems based on probabilistic databases, however, still have certain semantic deficiencies, which limit their potential applications. We revisit the semantics of probabilistic databases, and argue that the closed-world assumption of probabilistic databases, i.e., the assumption that facts not appearing in the database have the probability zero, conflicts with the everyday use of large-scale probabilistic knowledge bases. To address this discrepancy, we propose open-world probabilistic databases, as a new probabilistic data model. In this new data model, the probabilities of unknown facts, also called open facts, can be assigned any probability value from a default probability interval. Our analysis entails that our model aligns better with many real-world tasks such as query answering, relational learning, knowledge base completion, and rule mining. We make various technical contributions. We show that the data complexity dichotomy between polynomial time and #P for evaluating unions of conjunctive queries on probabilistic databases can be lifted to our open-world model. This result is supported by an algorithm that computes the probabilities of the so-called safe queries efficiently. Based on this algorithm, we prove that evaluating safe queries is in linear time for probabilistic databases, under reasonable assumptions. This remains true in open-world probabilistic databases for a more restricted class of safe queries. We extend our data complexity analysis beyond unions of conjunctive queries, and obtain a host of complexity results for both classical and open-world probabilistic databases. We conclude our analysis with an in-depth investigation of the combined complexity in the respective models.
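The default probability interval can be illustrated on a toy tuple-independent database: for a monotone query, evaluating with open facts placed at the endpoints of the interval brackets the query probability, as in the sketch below (toy facts and interval):

    # Toy illustration of open-world probabilistic databases: facts absent from the
    # database get a probability anywhere in a default interval [0, lam], so a
    # query's probability becomes an interval obtained from the endpoints.
    # Query: Q = exists x. R(x), over domain {a, b, c}; R(c) is an open fact.
    p_known = {"a": 0.4, "b": 0.7}     # closed facts with their probabilities
    lam = 0.2                          # upper bound of the default interval

    def prob_exists(probs):
        # P(exists x. R(x)) under tuple independence = 1 - prod(1 - p).
        out = 1.0
        for p in probs:
            out *= (1.0 - p)
        return 1.0 - out

    lower = prob_exists(list(p_known.values()) + [0.0])    # open fact at 0
    upper = prob_exists(list(p_known.values()) + [lam])    # open fact at lam
    print(round(lower, 3), round(upper, 3))                # probability interval for Q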
Article
State-of-the-art inference approaches in probabilistic logic programming typically start by computing the relevant ground program with respect to the queries of interest, and then use this program for probabilistic inference using knowledge compilation and weighted model counting. We propose an alternative approach that uses efficient Datalog techniques to integrate knowledge compilation with forward reasoning with a non-ground program. This effectively eliminates the grounding bottleneck that so far has prohibited the application of probabilistic logic programming in query answering scenarios over knowledge graphs, while also providing fast approximations on classical benchmarks in the field.
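What grounding followed by weighted model counting computes can be made concrete by brute-force enumeration on a tiny probabilistic program, as below; real systems avoid this enumeration by compiling the ground program into circuits:

    # What "grounding + weighted model counting" computes, by brute force, for the
    # tiny program: 0.6::edge(a,b). 0.7::edge(b,c).
    #   path(X,Y) :- edge(X,Y).   path(X,Y) :- edge(X,Z), path(Z,Y).
    # Query: path(a, c), which holds iff both probabilistic edges are true.
    from itertools import product

    facts = {("a", "b"): 0.6, ("b", "c"): 0.7}

    def path_holds(chosen_edges):
        # Fixpoint over the ground rules: path is the transitive closure of edge.
        path = set(chosen_edges)
        changed = True
        while changed:
            changed = False
            for (x, y) in list(path):
                for (y2, z) in chosen_edges:
                    if y == y2 and (x, z) not in path:
                        path.add((x, z))
                        changed = True
        return ("a", "c") in path

    query_prob = 0.0
    for assignment in product([True, False], repeat=len(facts)):
        chosen = [e for e, keep in zip(facts, assignment) if keep]
        weight = 1.0
        for (e, p), keep in zip(facts.items(), assignment):
            weight *= p if keep else (1.0 - p)
        if path_holds(chosen):
            query_prob += weight

    print(round(query_prob, 3))   # 0.42 = 0.6 * 0.7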
Conference Paper
Recent growth in public bioinformatic databases has facilitated the analysis of genomic and proteomic data. However, the large size of the datasets makes it hard for nonexpert programmers to perform the analysis. In this paper we present B-Log, a high-level query language for bioinformatic data analysis. Based on Datalog, B-Log can express graph analysis algorithms simply; it is extended with nested tables, recursive aggregations, and foreign functions, which help with quick exploratory analyses. We implemented several analysis algorithms in B-Log; we also implemented a prototype system to explore the TCGA dataset. We find B-Log to be useful for exploratory analysis and quick prototyping.
Article
Objective: Patients with medically refractory localization-related epilepsy (LRE) may be candidates for surgical intervention if the seizure onset zone (SOZ) can be well localized. Stereoelectroencephalography (SEEG) offers an attractive alternative to subdural grid and strip electrode implantation for seizure lateralization and localization; yet there are few series reporting the safety and efficacy of SEEG in pediatric patients. Methods: The authors review their initial 3-year consecutive experience with SEEG in pediatric patients with LRE. SEEG coverage, SOZ localization, complications, and preliminary seizure outcomes following subsequent surgical treatments are assessed. Results: Twenty-five pediatric patients underwent 30 SEEG implantations, with a total of 342 electrodes placed. Ten had prior resections or ablations. Seven had no MRI abnormalities, and 8 had multiple lesions on MRI. Based on preimplantation hypotheses, 7 investigations were extratemporal (ET), 1 was only temporal-limbic (TL), and 22 were combined ET/TL investigations. Fourteen patients underwent bilateral investigations. On average, patients were monitored for 8 days postimplant (range 3-19 days). Nearly all patients were discharged home on the day following electrode explantation. There were no major complications. Minor complications included 1 electrode deflection into the subdural space, resulting in a minor asymptomatic extraaxial hemorrhage; and 1 in-house and 1 delayed electrode superficial scalp infection, both treated with local wound care and oral antibiotics. SEEG localized the hypothetical SOZ in 23 of 25 patients (92%). To date, 18 patients have undergone definitive surgical intervention. In 2 patients, SEEG localized the SOZ near eloquent cortex and subdural grids were used to further delineate the seizure focus relative to mapped motor function just prior to resection. At last follow-up (average 21 months), 8 of 15 patients with at least 6 months of follow-up (53%) were Engel class I, and an additional 6 patients (40%) were Engel class II or III. Only 1 patient was Engel class IV. Conclusions: SEEG is a safe and effective technique for invasive SOZ localization in medically refractory LRE in the pediatric population. SEEG permits bilateral and multilobar investigations while avoiding large craniotomies. It is conducive to deep, 3D, and perilesional investigations, particularly in cases of prior resections. Patients who are not found to have focally localizable seizures are spared craniotomies.
Book
Cognitive Neuroscience, by Marie T. Banich (Cambridge University Press).
Article
Meta-analysis is the quantitative, scientific synthesis of research results. Since the term and modern approaches to research synthesis were first introduced in the 1970s, meta-analysis has had a revolutionary effect in many scientific fields, helping to establish evidence-based practice and to resolve seemingly contradictory research outcomes. At the same time, its implementation has engendered criticism and controversy, in some cases general and others specific to particular disciplines. Here we take the opportunity provided by the recent fortieth anniversary of meta-analysis to reflect on the accomplishments, limitations, recent advances and directions for future developments in the field of research synthesis.
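The arithmetic core of a classical fixed-effect meta-analysis is inverse-variance pooling of study effect sizes, as in the short example below (effect sizes and variances are made up):

    # Fixed-effect meta-analysis: pool study effect sizes with inverse-variance
    # weights. Effect sizes and variances below are illustrative only.
    import numpy as np

    effects = np.array([0.30, 0.45, 0.10, 0.25])      # per-study effect sizes
    variances = np.array([0.02, 0.05, 0.04, 0.03])    # per-study sampling variances

    weights = 1.0 / variances
    pooled_effect = np.sum(weights * effects) / np.sum(weights)
    pooled_se = np.sqrt(1.0 / np.sum(weights))
    print(round(pooled_effect, 3), round(pooled_se, 3))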
Article
Significance: The frontoparietal control network (FPCN) contributes to executive control, the ability to deliberately guide action based on goals. While the FPCN is often viewed as a unitary domain general system, it is possible that the FPCN contains a fine-grained internal organization, with separate zones involved in different types of executive control. Here, we use graph theory and meta-analytic functional profiling to demonstrate that the FPCN is composed of two separate subsystems: FPCN A is connected to the default network and is involved in the regulation of introspective processes, whereas FPCN B is connected to the dorsal attention network and is involved in the regulation of perceptual attention. These findings offer a distinct perspective on the systems-level circuitry underlying cognitive control.
Article
Neuroimaging has evolved into a widely used method to investigate the functional neuroanatomy, brain-behaviour relationships, and pathophysiology of brain disorders, yielding a literature of more than 30,000 papers. With such an explosion of data, it is increasingly difficult to sift through the literature and distinguish spurious from replicable findings. Furthermore, due to the large number of studies, it is challenging to keep track of the wealth of findings. A variety of meta-analytical methods (coordinate-based and image-based) have been developed to help summarise and integrate the vast amount of data arising from neuroimaging studies. However, the field lacks specific guidelines for the conduct of such meta-analyses. Based on our combined experience, we propose best-practice recommendations that researchers from multiple disciplines may find helpful. In addition, we provide specific guidelines and a checklist that will hopefully improve the transparency, traceability, replicability and reporting of meta-analytical results of neuroimaging data.
Article
Neuroimaging evidence suggests that executive functions (EF) depend on brain regions that are not closely tied to specific cognitive demands but rather to a wide range of behaviors. A multiple-demand (MD) system has been proposed, consisting of regions showing conjoint activation across multiple demands. Additionally, a number of studies defining networks specific to certain cognitive tasks suggest that the MD system may be composed of a number of sub-networks each subserving specific roles within the system. We here provide a robust definition of an extended MDN (eMDN) based on task-dependent and task-independent functional connectivity analyses seeded from regions previously shown to be convergently recruited across neuroimaging studies probing working memory, attention and inhibition, i.e., the proposed key components of EF. Additionally, we investigated potential sub-networks within the eMDN based on their connectional and functional similarities. We propose an eMDN network consisting of a core whose integrity should be crucial to performance of most operations that are considered higher cognitive or EF. This then recruits additional areas depending on specific demands.
Article
Probabilistic data is motivated by the need to model uncertainty in large databases. Over the last twenty years or so, both the Database community and the AI community have studied various aspects of probabilistic relational data. This survey presents the main approaches developed in the literature, reconciling concepts developed in parallel by the two research communities. The survey starts with an extensive discussion of the main probabilistic data models and their relationships, followed by a brief overview of model counting and its relationship to probabilistic data. After that, the survey discusses lifted probabilistic inference, which are a suite of techniques developed in parallel by the Database and AI communities for probabilistic query evaluation. Then, it gives a short summary of query compilation, presenting some theoretical results highlighting limitations of various query evaluation techniques on probabilistic data. The survey ends with a very brief discussion of some popular probabilistic data sets, systems, and applications that build on this technology.
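For safe queries on tuple-independent databases, probabilities can be computed extensionally, without enumerating possible worlds; the toy example below evaluates Q = exists x. R(x) AND S(x) this way:

    # Extensional ("lifted") evaluation of a safe query over a toy tuple-independent
    # probabilistic database. Because tuples are independent,
    # P(Q) = 1 - prod_x (1 - P(R(x)) * P(S(x))), computed in time linear in the data.
    R = {"a": 0.9, "b": 0.5}            # P(R(x)) per tuple
    S = {"a": 0.4, "b": 0.8, "c": 0.7}  # P(S(x)) per tuple

    p_not_q = 1.0
    for x in set(R) & set(S):
        p_not_q *= 1.0 - R[x] * S[x]
    print(round(1.0 - p_not_q, 3))      # P(Q)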
Article
The dorsal frontoparietal network (dFPN) of the human brain assumes a puzzling variety of functions, including motor planning and imagery, mental rotation, spatial attention, and working memory. How can a single network engage in such a diversity of roles? We propose that cognitive computations relying on the dFPN can be pinned down to a core function underlying offline motor planning: action emulation. Emulation creates a dynamic representation of abstract movement kinematics, sustains the internal manipulation of this representation, and ensures its maintenance over short time periods. Based on these fundamental characteristics, the dFPN has evolved from a pure motor control network into a domain-general system supporting various cognitive and motor functions.
Article
Accumulating evidence suggests that many findings in psychological science and cognitive neuroscience may prove difficult to reproduce; statistical power in brain imaging studies is low and has not improved recently; software errors in analysis tools are common and can go undetected for many years; and, a few large-scale studies notwithstanding, open sharing of data, code, and materials remain the rare exception. At the same time, there is a renewed focus on reproducibility, transparency, and openness as essential core values in cognitive neuroscience. The emergence and rapid growth of data archives, meta-analytic tools, software pipelines, and research groups devoted to improved methodology reflect this new sensibility. We review evidence that the field has begun to embrace new open research practices and illustrate how these can begin to address problems of reproducibility, statistical power, and transparency in ways that will ultimately accelerate discovery.
Article
Whole-brain functional magnetic resonance imaging (fMRI), in conjunction with multiband acceleration, has played an important role in mapping the functional connectivity throughout the entire brain with both high temporal and spatial resolution. Ultrahigh magnetic field strengths (7 T and above) allow functional imaging with even higher functional contrast-to-noise ratios for improved spatial resolution and specificity compared to traditional field strengths (1.5 T and 3 T). High-resolution 7 T fMRI, however, has primarily been constrained to smaller brain regions given the amount of time it takes to acquire the number of slices necessary for high resolution whole brain imaging. Here we evaluate a range of whole-brain high-resolution resting state fMRI protocols (0.9, 1.25, 1.5, 1.6 and 2 mm isotropic voxels) at 7 T, obtained with both in-plane and slice acceleration parallel imaging techniques to maintain the temporal resolution and brain coverage typically acquired at 3 T. Using the processing pipeline developed by the Human Connectome Project, we demonstrate that high resolution images acquired at 7 T provide increased functional contrast to noise ratios with significantly less partial volume effects and more distinct spatial features, potentially allowing for robust individual subject parcellations and descriptions of fine-scaled patterns, such as visuotopic organization.
Article
The ability to discriminate signal from noise plays a key role in the analysis and interpretation of functional magnetic resonance imaging (fMRI) measures of brain activity. Over the past two decades, a number of major sources of noise have been identified, including system-related instabilities, subject motion, and physiological fluctuations. This article reviews the characteristics of the various noise sources as well as the mechanisms through which they affect the fMRI signal. Approaches for distinguishing signal from noise and the associated challenges are also reviewed. These challenges reflect the fact that some noise sources, such as respiratory activity, are generated by the same underlying brain networks that give rise to functional signals that are of interest.
Conference Paper
Soufflé is an open source programming framework that performs static program analysis expressed in Datalog on very large code bases, including points-to analysis on OpenJDK7 (1.4M program variables, 350K objects, 160K methods) in under a minute. Soufflé is being successfully used for Java security analyses at Oracle Labs due to (1) its high-performance, (2) support for rapid program analysis development, and (3) customizability. Soufflé incorporates the highly flexible Datalog-based program analysis paradigm while exhibiting performance results that are on-par with manually developed state-of-the-art tools. In this tool paper, we introduce the Soufflé architecture, usage and demonstrate its applicability for large-scale code analysis on the OpenJDK7 library as a use case.
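The bottom-up, fixpoint evaluation underlying such Datalog engines can be sketched in a few lines on the classic reachability program; the sketch below is illustrative only and reflects none of Soufflé's syntax, indexing, or parallel execution:

    # Core of bottom-up Datalog evaluation, sketched in Python for the program
    #   reach(x, y) :- edge(x, y).
    #   reach(x, z) :- reach(x, y), edge(y, z).
    # Real engines add semi-naive evaluation, indexing, and parallelism.
    edge = {("a", "b"), ("b", "c"), ("c", "d")}

    reach = set(edge)                  # rule 1
    changed = True
    while changed:                     # iterate rule 2 to a fixpoint
        changed = False
        for (x, y) in list(reach):
            for (y2, z) in edge:
                if y == y2 and (x, z) not in reach:
                    reach.add((x, z))
                    changed = True

    print(sorted(reach))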
Conference Paper
We present ProbLog2, the state-of-the-art implementation of the probabilistic programming language ProbLog. The ProbLog language allows the user to intuitively build programs that encode not only complex interactions between large sets of heterogeneous components but also the inherent uncertainties that are present in real-life situations. The system provides efficient algorithms for querying such models as well as for learning their parameters from data. It is available as an online tool on the web and for download. The offline version offers both command-line access to inference and learning and a Python library for building statistical relational learning applications from the system's components.
Article
An intelligent agent interacting with the real world will encounter individual people, courses, test results, drug prescriptions, chairs, boxes, etc., and needs to reason about properties of these individuals and relations among them, as well as cope with uncertainty. Uncertainty has been studied in probability theory and graphical models, and relations have been studied in logic, in particular in the predicate calculus and its extensions. This book examines the foundations of combining logic and probability into what are called relational probabilistic models. It introduces representations, inference, and learning techniques for probability, logic, and their combinations. The book focuses on two representations in detail: Markov logic networks, a relational extension of undirected graphical models and weighted first-order predicate calculus formulas, and ProbLog, a probabilistic extension of logic programs that can also be viewed as a Turing-complete relational extension of Bayesian networks.
Article
This article charts the tractability frontier of two classes of relational algebra queries in tuple-independent probabilistic databases. The first class consists of queries with join, projection, selection, and negation but without repeating relation symbols and union. The second class consists of quantified queries that express the following binary relationships among sets of entities: set division, set inclusion, set equivalence, and set incomparability. Quantified queries are expressible in relational algebra using join, projection, nested negation, and repeating relation symbols. Each query in the two classes has either polynomial-time or #P-hard data complexity and the tractable queries can be recognised efficiently. Our result for the first query class extends a known dichotomy for conjunctive queries without self-joins to such queries with negation. For quantified queries, their tractability is sensitive to their outermost projection operator: They are tractable if no attribute representing set identifiers is projected away and #P-hard otherwise.
Article
The fusiform gyrus (FG) is commonly included in anatomical atlases and is considered a key structure for functionally-specialized computations of high-level vision such as face perception, object recognition, and reading. However, it is not widely known that the FG has a contentious history. In this review, we first provide a historical analysis of the discovery of the FG and why certain features, such as the mid-fusiform sulcus, were discovered and then forgotten. We then discuss how observer-independent methods for identifying cytoarchitectonical boundaries of the cortex revolutionized our understanding of cytoarchitecture and the correspondence between those boundaries and cortical folding patterns of the FG. We further explain that the co-occurrence between cortical folding patterns and cytoarchitectonical boundaries are more common than classically thought and also, are functionally meaningful especially on the FG and probably in high-level visual cortex more generally. We conclude by proposing a series of alternatives for how the anatomical organization of the FG can accommodate seemingly different theoretical aspects of functional processing, such as domain specificity and perceptual expertise.
Article
Starting with the work of Cajal more than 100 years ago, neuroscience has sought to understand how the cells of the brain give rise to cognitive functions. How far has neuroscience progressed in this endeavor? This Perspective assesses progress in elucidating five basic brain processes: visual recognition, long-term memory, short-term memory, action selection, and motor control. Each of these processes entails several levels of analysis: the behavioral properties, the underlying computational algorithm, and the cellular/network mechanisms that implement that algorithm. At this juncture, while many questions remain unanswered, achievements in several areas of research have made it possible to relate specific properties of brain networks to cognitive functions. What has been learned reveals, at least in rough outline, how cognitive processes can be an emergent property of neurons and their connections.
Article
The brain's default mode network consists of discrete, bilateral and symmetrical cortical areas, in the medial and lateral parietal, medial prefrontal, and medial and lateral temporal cortices of the human, nonhuman primate, cat, and rodent brains. Its discovery was an unexpected consequence of brain-imaging studies first performed with positron emission tomography in which various novel, attention-demanding, and nonself-referential tasks were compared with quiet repose either with eyes closed or with simple visual fixation. The default mode network consistently decreases its activity when compared with activity during these relaxed nontask states. The discovery of the default mode network reignited a longstanding interest in the significance of the brain's ongoing or intrinsic activity. Presently, studies of the brain's intrinsic activity, popularly referred to as resting-state studies, have come to play a major role in studies of the human brain in health and disease. The brain's default mode network plays a central role in this work.