(abridged) The large-scale distribution of galaxies is generally analyzed using the two-point correlation function. However, this statistic does not capture the topology of the distribution, and it is necessary to resort to higher-order correlations to break degeneracies. We demonstrate that an alternative approach using network analysis can discriminate between topologically different distributions that have similar two-point correlations. We investigate two galaxy point distributions, one produced by a cosmological simulation and the other by a Lévy walk. For the cosmological simulation, we adopt the redshift $z = 0.58$ slice from Illustris (Vogelsberger et al. 2014A) and select galaxies with stellar masses greater than $10^8\,M_\odot$. The two-point correlation function of these simulated galaxies follows a single power law, $\xi(r) \sim r^{-1.5}$. We then generate Lévy walks that match the correlation function and abundance of the simulated galaxies. We find that, while the two simulated galaxy point distributions have the same abundance and two-point correlation function, their spatial distributions are very different; most prominently, *filamentary structures* are absent in Lévy fractals. To quantify these missing topologies, we adopt network analysis tools and measure diameter, giant component, and transitivity from networks built by a conventional friends-of-friends recipe with various linking lengths. Unlike the abundance and two-point correlation function, these network quantities reveal a clear separation between the two simulated distributions; therefore, the galaxy distribution simulated by Illustris is quantitatively not a Lévy fractal. We find that the described network quantities offer an efficient tool for discriminating topologies and for comparing observed and theoretical distributions.
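The friends-of-friends recipe and the three network quantities named in the abstract can be illustrated with a small sketch. This is not the authors' code: the function names, the brute-force pair search, and the choice to measure the diameter inside the largest component are assumptions made for illustration, and the approach only suits small point sets.

```python
import math
from collections import deque
from itertools import combinations


def fof_graph(points, linking_length):
    """Adjacency sets: i and j are linked when their distance <= linking_length."""
    adj = {i: set() for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= linking_length:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def components(adj):
    """Connected components as lists of vertices, found by breadth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        comps.append(comp)
    return comps


def giant_component_fraction(adj):
    """Fraction of all vertices that belong to the largest component."""
    return max(len(c) for c in components(adj)) / len(adj)


def transitivity(adj):
    """Ratio of closed triples to connected triples (3 * triangles / triples)."""
    closed = 0   # each triangle is counted three times, once per centre vertex
    triples = 0
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


def diameter(adj):
    """Longest shortest path, in hops, within the largest component (assumption)."""
    giant = max(components(adj), key=len)
    best = 0
    for s in giant:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best
```

Calling `fof_graph` for a range of linking lengths and recording the three quantities for each gives curves that could be compared between two point sets, which is the kind of comparison the abstract describes.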

... Lemma 4 (Invariant property [19]). ForwardPush has the following invariant property ...

... We observe strong signals from the curve of Wuhan, correlating to the initial COVID-19 outbreak and the declaration of pandemic (footnote 18). In addition, we observed a peak from the curve of Chengdu around 8/18/20 when the U.S. closed the consulate in Chengdu, China, reflecting the U.S.-China diplomatic tension (footnote 19). ...

... Again, the embedding movement Dist(·, ·) is defined as 1 − cos(·, ·). Footnote 18, COVID-19: https://www.who.int/news/item/27-04-2020-who-timeline---covid-19. Footnote 19, US Consulate: https://china.usembassy-china.org.cn/embassy-consulates/chengdu/ ...

Dynamic graph representation learning is the task of learning node embeddings over dynamic networks, and has many important applications, from knowledge graphs and citation networks to social networks. Graphs of this type are usually large-scale, but only a small subset of vertices is relevant to downstream tasks. Current methods are too expensive for this setting, as their complexity is at best linearly dependent on both the number of nodes and the number of edges. In this paper, we propose a new method, namely Dynamic Personalized PageRank Embedding (\textsc{DynamicPPE}), for learning a target subset of node representations over large-scale dynamic networks. Based on recent advances in local node embedding and a novel computation of the dynamic personalized PageRank vector (PPV), \textsc{DynamicPPE} has two key ingredients: 1) the per-PPV complexity is $\mathcal{O}(m \bar{d} / \epsilon)$, where $m$, $\bar{d}$, and $\epsilon$ are the number of edges received, the average degree, and the global precision error, respectively. Thus, the per-edge event update of a single node depends only on $\bar{d}$ on average; and 2) by using these high-quality PPVs and hash kernels, the learned embeddings have properties of both locality and global consistency. Together, these two ingredients make it possible to capture the evolution of graph structure effectively. Experimental results demonstrate both the effectiveness and efficiency of the proposed method over large-scale dynamic networks. We apply \textsc{DynamicPPE} to capture the embedding change of Chinese cities in the Wikipedia graph during the ongoing COVID-19 pandemic (https://en.wikipedia.org/wiki/COVID-19_pandemic). Our results show that these representations successfully encode the dynamics of the Wikipedia graph.
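For readers unfamiliar with the push-style computation of a personalized PageRank vector that the snippets above refer to as ForwardPush, the following is a minimal sketch of the standard static local-push procedure on an undirected graph. It is not the dynamic update proposed in the paper; the function name, the queue-based scheduling, and the default values of `alpha` and `eps` are assumptions for illustration.

```python
from collections import deque


def forward_push(adj, source, alpha=0.15, eps=1e-4):
    """Approximate the personalized PageRank vector of `source` by local pushes.

    adj: dict mapping each node to a list of neighbours (undirected graph,
         every node is assumed to have at least one neighbour).
    Returns (p, r): the estimate and the residual, both as dicts.
    """
    p = {}
    r = {source: 1.0}
    queue = deque([source])
    queued = {source}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        deg = len(adj[u])
        ru = r.get(u, 0.0)
        if ru <= eps * deg:
            continue  # residual already below the push threshold
        # Move a fraction alpha of the residual into the estimate ...
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = 0.0
        # ... and spread the remainder evenly over the neighbours.
        share = (1.0 - alpha) * ru / deg
        for v in adj[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] > eps * len(adj[v]) and v not in queued:
                queue.append(v)
                queued.add(v)
    return p, r
```

Each push only moves mass between `p` and `r`, so the total `sum(p) + sum(r)` stays at 1, and on termination every residual satisfies `r[u] <= eps * deg(u)`. Only nodes near the source are ever touched, which is why this style of computation is attractive when embeddings are needed for a small target subset.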

Cosmological studies of large-scale structure have relied on two-point statistics, not fully exploiting the rich structure of the cosmic web. In this paper we show how to capture some of this cosmic web information by using the minimum spanning tree (MST), for the first time using it to estimate cosmological parameters in simulations. Discrete tracers of dark matter such as galaxies, $N$-body particles or haloes are used as nodes to construct a unique graph, the MST, that traces skeletal structure. We study the dependence of the MST on cosmological parameters using haloes from a suite of COmoving Lagrangian Acceleration (COLA) simulations with a box size of $250\ h^{-1}\, {\rm Mpc}$, varying the amplitude of scalar fluctuations ($A_{\rm s}$), matter density ($\Omega_{\rm m}$), and neutrino mass ($\sum m_{\nu}$). The power spectrum $P$ and bispectrum $B$ are measured for wavenumbers between 0.125 and 0.5 $h\, {\rm Mpc}^{-1}$, while a corresponding lower cut of $\sim 12.6\ h^{-1}\, {\rm Mpc}$ is applied to the MST. The constraints from the individual methods are fairly similar, but when combined we see improved $1\sigma$ constraints of $\sim 17$ per cent ($\sim 12$ per cent) on $\Omega_{\rm m}$ and $\sim 12$ per cent ($\sim 10$ per cent) on $A_{\rm s}$ with respect to $P$ ($P + B$), thus showing the MST is providing additional information. The MST can be applied to current and future spectroscopic surveys (BOSS, DESI, Euclid, PFS, WFIRST, and 4MOST) in 3D and photometric surveys (DES and LSST) in tomographic shells to constrain parameters and/or test systematics.

... To go around this theoretical limitation we instead carry out an analysis similar to that of Hong et al. (2016), comparing the Illustris (Nelson et al. 2015; Vogelsberger et al. 2014) simulations (see Section 3.2.1) to an adjusted Lévy flight (ALF) simulation that is tuned to have almost identical 2PCF but different higher order information. ...

... This ensures that its 2PCF will follow a power law (see Mandelbrot 1982) similar to that found for galaxies. However, although a standard Lévy flight scheme may be able to replicate the 2PCF at large scales, at small scales the 2PCF eventually plateaus (see Hong et al. 2016). Since the MST is sensitive to small scales, it is important that the Lévy flight simulation match that of the Illustris sample at small scales. ...

... We use the sub-halo catalogue of the Illustris-1 snap 100 sample and follow Hong et al. (2016) to include only sub haloes which are large and dark matter dominated: ...

Cosmological studies of large-scale structure have relied on two-point statistics, not fully exploiting the rich structure of the cosmic web. In this paper we show how to capture some of this information by using the Minimum Spanning Tree (MST), for the first time using it to estimate cosmological parameters in simulations. Discrete tracers of dark matter such as galaxies, $N$-body particles or haloes are used as nodes to construct a unique graph, the MST, which is defined to be the minimum weighted spanning graph. We study the dependence of the MST statistics on cosmological parameters using haloes from a suite of COLA simulations with a box size of $250\ h^{-1}{\rm Mpc}$ that vary the amplitude of scalar fluctuations $\left(A_{\rm s}\right)$, matter density $\left(\Omega_{\rm m}\right)$ and neutrino mass $\left(\sum m_{\nu}\right)$. The power spectrum $P(k)$ and bispectrum $B(k_{1}, k_{2}, k_{3})$ are measured between $k \sim 0.125$ and $0.5\ h{\rm Mpc}^{-1}$, while a corresponding lower cut of $\sim 12.6\ h^{-1}{\rm Mpc}$ is applied to the MST. The constraints from the individual methods are fairly similar, but when combined we see improved $1\sigma$ constraints on $\Omega_{\rm m}$ of $\sim 17\%$ with respect to $P(k)$ and $\sim 12\%$ with respect to $P(k)+B(k_{1}, k_{2}, k_{3})$, thus showing the MST is providing additional information not present in the power spectrum and bispectrum. The MST is a tool which can be used to constrain parameters and/or to test systematics in current and future galaxy surveys. This is especially applicable to spectroscopic surveys (BOSS, DESI, Euclid, PFS, WFIRST and 4MOST), where the MST can be applied in 3D comoving coordinates, and photometric surveys (DES and LSST) in tomographic shells. The Python code, MiSTree, used to construct the MST and run the analysis, is made publicly available at https://knaidoo29.github.io/mistreedoc/.
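To make the construction concrete, here is a minimal sketch of building an MST over a point set with Prim's algorithm and summarising it by edge lengths and node degrees. It does not use MiSTree and is not the paper's pipeline: the function names and the quadratic-time implementation on the complete Euclidean graph are assumptions chosen for brevity, and only two simple summaries are shown.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (i, j, length) edges; O(n^2) time, fine for small samples.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Update the attachment cost of the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_summary(points):
    """Edge-length and degree summaries of the MST (illustrative only)."""
    edges = minimum_spanning_tree(points)
    degree = [0] * len(points)
    for i, j, _ in edges:
        degree[i] += 1
        degree[j] += 1
    lengths = [w for _, _, w in edges]
    return {
        "n_edges": len(edges),
        "total_length": sum(lengths),
        "mean_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "degree": degree,
    }
```

For catalogue-sized inputs, a spatial index or a k-nearest-neighbour graph would normally replace the all-pairs distance loop.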

... A large number of computational methods have been proposed in the literature which aim to aid the understanding of the topological and geometrical pattern of the universe as a whole (Zeldovich et al. 1982; Dekel & West 1985; Klypin & Shandarin 1993; Sahni et al. 1997, 1998; Basilakos et al. 2001; Basilakos 2003; Sheth et al. 2003; Shandarin et al. 2004; Basilakos et al. 2006; Van de Weygaert et al. 2011; Sousbie 2011; Park et al. 2013) confirming the picture of a web-like network, but also to study the intrinsic properties of the cosmic network, namely connectivity and complexity (cf. Edelsbrunner et al. 2002; Van de Weygaert & Bond 2008; Aragón-Calvo et al. 2010b,a; Edelsbrunner & Harer 2010; Cautun et al. 2014; Hong et al. 2016; Pranav et al. 2017; Libeskind et al. 2017; Codis et al. 2018; Feldbrugge et al. 2019; Kono et al. 2020; Biagetti et al. 2021; Wilding et al. 2021). For a review see also Wasserman (2018) and references therein. ...

... In a paper that follows-up this work, Hong et al. (2016) used network analysis to discriminate between topologically different distributions that have similar two-point correlations. In particular, they compared two galaxy distributions with similar two-point correlation statistics but different topologies, one derived from a cosmological simulation and the other from a Lévy walk (Mandelbrot 1975). ...

In this paper we explore the use of spatial clustering algorithms as a new computational approach for modeling the cosmic web. We demonstrate that such algorithms are efficient in terms of computing time needed. We explore three distinct spatial methods which we suitably adjust for (i) detecting the topology of the cosmic web and (ii) categorizing various cosmic structures as voids, walls, clusters and superclusters based on a variety of topological and physical criteria such as the physical distance between objects, their masses and local densities. The methods explored are (1) a new spatial method called Gravity Lattice; (2) a modified version of another spatial clustering algorithm, the ABACUS; and (3) the well-known spatial clustering algorithm HDBSCAN. We utilize HDBSCAN in order to detect cosmic structures and categorize them using their overdensity. We demonstrate that the ABACUS method can be combined with the classic DTFE method to obtain similar results in terms of the achieved accuracy with about an order of magnitude less computation time. To further solidify our claims, we draw insights from the computer science domain and compare the quality of the results with and without the application of our method. Finally, we further extend our experiments and verify their effectiveness by showing their ability to scale well with different cosmic web structures that formed at different redshifts.

... Different techniques in network theory support a variety of analyses such as visualization, link prediction, and clustering. The network construction from a dataset plays an important role and it has been applied in different areas from biology and neuroscience [28] (e.g., brain networks [5]) to modeling and analyzing galaxy distributions [18], and quantifying reputation in art [16]. ...

... Initiating from these studies, in particular from [1], we followed a similar approach in order to define our network construction models. Furthermore, in a similar approach Hong et al. [18] have also focused on different networks to represent the cosmic web in order to investigate the architecture of the universe. They introduced three network models each with their own linkage model. ...

We consider the problem of automating network generation from inter-organizational research collaboration data. The resulting networks promise to yield crucial insights. In this paper, we propose a method to convert relational data to a set of networks using a single parameter, called Linkage Threshold (LT). To analyze the impact of the LT-value, we apply standard network metrics such as network density and centrality measures to each network produced. The feasibility and impact of our approach are demonstrated by using a real-world collaboration data set from an established research institution. We show how the produced network layers can reveal insights and patterns by presenting a correlation matrix.

... As a new way to quantify the elusive topological structure of the Universe, here we apply graph theory (or, network science) to cosmological datasets (Hong & Dey 2015; Hong et al. 2016, 2019). The basic idea is to associate galaxies with the vertices of a graph and to connect nearby galaxies with graph edges. ...

... To build a network from each halo distribution, we use the conventional FoF recipe (Huchra & Geller 1982; Hong & Dey 2015; Hong et al. 2016, 2019). For a given linking length l, the adjacency matrix of the FoF recipe can be written as, ...

By utilizing large-scale graph analytic tools implemented in the modern Big Data platform, Apache Spark, we investigate the topological structure of gravitational clustering in five different universes produced by cosmological $N$-body simulations with varying parameters: (1) a WMAP 5-year compatible $\Lambda$CDM cosmology, (2) two different dark energy equation of state variants, and (3) two different cosmic matter density variants. For the Big Data calculations, we use a custom-built stand-alone Spark/Hadoop cluster at Korea Institute for Advanced Study (KIAS) and Dataproc Compute Engine in Google Cloud Platform (GCP), with the sample size ranging from 7 million to 200 million. We find that among the many possible graph-topological measures, three simple ones can effectively discriminate among the five model universes: (1) the average number of neighbors (the so-called average vertex degree) $\alpha$, (2) the closed-to-connected triple fraction (the so-called transitivity) $\tau_\Delta$, and (3) the cumulative number density $n_{s\ge5}$ of subcomponents with connected component size $s \ge 5$. Since these graph-topological measures are in direct relation with the usual $n$-point correlation functions of the cosmic density field, graph-topological statistics powered by Big Data computational infrastructure open a new, intuitive, and computationally efficient window into the dark Universe.
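Two of the measures named in this abstract, the average vertex degree and the cumulative number density of components with size $s \ge 5$, are simple to compute once an edge list exists. The sketch below is a single-machine illustration, not the Spark implementation described above; the function names, the union-find approach, and the `volume` argument used to turn a count into a density are assumptions.

```python
from collections import Counter


def average_degree(n_vertices, edges):
    """Mean number of neighbours per vertex; each edge contributes to two vertices."""
    return 2 * len(edges) / n_vertices


def component_sizes(n_vertices, edges):
    """Sizes of connected components, computed with a union-find structure."""
    parent = list(range(n_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    return list(Counter(find(v) for v in range(n_vertices)).values())


def cumulative_number_density(n_vertices, edges, volume, s_min=5):
    """Number of components with at least s_min members, per unit volume."""
    sizes = component_sizes(n_vertices, edges)
    return sum(1 for s in sizes if s >= s_min) / volume
```

The edge list is assumed to contain each undirected edge once, as produced, for example, by linking all pairs of points closer than a chosen linking length.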

... Graph Theory, standing out as a foundation concept in Network Science [15], is one of the oldest branches of Mathematics, with remarkable interdisciplinary applicability in diverse areas, spanning from Social and Political Sciences, Biology, Chemistry to Neuroscience [16] and Astrophysics/Cosmology [17][18][19]. A graph can be used to model any system of entities that are pairwise related to each other. ...

In this article, we present a unified framework for the analysis and characterization of a complex system and demonstrate its application in two diverse fields: neuroscience and astrophysics. The framework brings together techniques from graph theory, applied mathematics, and dimensionality reduction through principal component analysis (PCA), separating linear PCA and its extensions. The implementation of the framework maps an abstract multidimensional set of data into reduced representations, which enable the extraction of its most important properties (features) characterizing its complexity. These reduced representations can be sign-posted by known examples to provide meaningful descriptions of the results that can spur explanations of phenomena and support or negate proposed mechanisms in each application. In this work, we focus on the clustering aspects, highlighting relatively fixed stable properties of the system under study. We include examples where clustering leads to semantic maps and representations of dynamic processes within the same display. Although the framework is composed of existing theories and methods, its usefulness is exactly that it brings together seemingly different approaches, into a common framework, revealing their differences/commonalities, advantages/disadvantages, and suitability for a given application. The framework provides a number of different computational paths and techniques to choose from, based on the dimension reduction method to apply, the clustering approaches to be used, as well as the representations (embeddings) of the data in the reduced space. Although here it is applied to just two scientific domains, neuroscience and astrophysics, it can potentially be applied in several other branches of sciences, since it is not based on any specific domain knowledge.

... Statistical moments of the edge length distribution are proposed to quantify aspects of the higher order spatial clustering of the sample points. Expecting that these measures pertain to aspects of their web-like distribution, network analysis is obtaining increasing attention (Hong & Dey 2015; Hong et al. 2016; de Regt et al. 2018; Tsizh et al. 2020) in large-scale structure studies. ...

We trace the overall connectivity of the cosmic web as defined by haloes in the Planck-Millennium simulation using the persistence and Betti curve analysis developed in our previous papers. We consider the presence of clustering in excess of the second-order correlation function, and investigate the extent to which the dark matter haloes reflect the intricate web-like pattern of the underlying dark matter distribution. With our systematic topological analysis we correlate local information and halo properties with the multi-scale geometrical environment of the cosmic web, delineated by elongated filamentary bridges and sheetlike walls that connect compact clusters at the nodes and define the boundaries of near-empty voids. We capture the multi-scale topology traced by the discrete spatial halo distribution through filtering the distance field of the corresponding Delaunay tessellation. The tessellation is naturally adaptive to the local density, perfectly outlining the local geometry. The resulting nested alpha shapes contain the complete information on the multi-scale topology. Normalising second-order clustering, we find a remarkable linear relationship between halo masses and topology: haloes of different mass trace environments with different topological signature. This is topological bias, a bias independent of the halo clustering bias associated with the two-point correlation function. Topological bias can be viewed as an environmental structure bias. We quantify it through a linear relation accounting for selection effects in the analysis and interpretation of the spatial distribution of galaxies. This mass-dependent scaling relation allows us to take clustering into account and determine the overall connectivity based on a limited sample of galaxies. This is of particular relevance with large upcoming galaxy surveys such as DESI, Euclid, and the Vera Rubin telescope surveys.

... Some exciting ideas emerged in geoscience-related fields [23,24], where spatial applications of the network theory have been used to describe sediment pathways [25,26], morphological partition and connectivity of river basins [27,28], landscape planning [29], cave passages [30], soil porous architecture [31,32], and rock fracture networks [33,34]. Interesting reports about structural connectivity analyses are also emerging in astronomy and have been used to describe large-scale topological structure of the Universe [35,36] and for the characterization of the internal structure of the superclusters of galaxies [37]. Notably, network analyses are increasingly used to represent the Earth's climatic patterns [38,39] and were proven useful for optimizing the spatial network structure of hydrometric stations [40]. ...

Technological advances in imaging techniques and biometric data acquisition have enabled us to apply methods of network science to study the morphology and structural design of organelles, organs, and tissues, as well as the coordinated interactions among them that yield a healthy physiology at the level of whole organisms. We here review research dedicated to these advances, in particular focusing on networks between cells, the topology of multicellular structures, neural interactions, fluid transportation networks, and anatomical networks. The percolation of blood vessels, structural connectivity within the brain, the porous structure of bones, and relations between different anatomical parts of the human body are just some of the examples that we explore in detail. We argue and show that the models, methods, and algorithms developed in the realm of network science are ushering in a new era of network-based inquiry into the morphology and structural design of living systems in the broadest possible terms. We also emphasize that the need and applicability of this research is likely to increase significantly in the years to come due to the rapid progress made in the development of bioartificial substitutes and tissue engineering.

... Given a set of kinematic variables, the LHC collision events form a topological structure in kinematic space, and one should expect there to be significant differences between the topological structures predicted for SM events, and those predicted for signal events. Motivated by studies of galaxy topology [8], we build a series of network graphs from simulated LHC events, each of which uses a particular distance metric to define "friendship" between events based on their proximity in a chosen space of kinematic variables. For example, considering only the missing energy values reconstructed for each event, the SM events will cluster in a group at lower missing energy, each having many "friends" close by, while SUSY events can be expected to be few and far from the main part of the distribution, giving a small number of "friends" when viewed as part of a network. ...

We present a novel technique for the analysis of proton-proton collision events from the ATLAS and CMS experiments at the Large Hadron Collider. For a given final state and choice of kinematic variables, we build a graph network in which the individual events appear as weighted nodes, with edges between events defined by their distance in kinematic space. We then show that it is possible to calculate local metrics of the network that serve as event-by-event variables for separating signal and background processes, and we evaluate these for a number of different networks that are derived from different distance metrics. Using supersymmetric electroweakino and stop production as examples, we construct prototype analyses that take account of the fact that the number of simulated Monte Carlo events used in an LHC analysis may differ from the number of events expected in the LHC dataset, allowing an accurate background estimate for a particle search at the LHC to be derived. For the electroweakino example, we show that the use of network variables outperforms both cut-and-count analyses that use the original variables and a boosted decision tree trained on the original variables. The stop example, deliberately chosen to be difficult to exclude due to its kinematic similarity with the top background, demonstrates that network variables are not automatically sensitive to BSM physics. Nevertheless, we identify local network metrics that show promise if their robustness under certain assumptions of node-weighted networks can be confirmed.

... For example, the key question for centrality could be: what is flowing through the network? Does the actor retain the good to pass on to others (information, diseases, etc.), or does a higher score in centrality measures give nodes more power or importance (Alper et al., 2013; Hong et al., 2016)? According to this, the following criteria were considered for the stakeholder analysis. ...

This study aimed to investigate water quality and to identify, prioritize, and analyze the network of stakeholders of the Anzali International Wetland, Iran. In the wetland watershed, a wide range of stakeholders benefit from ecosystem services and, through the activities they carry out, affect the functions of the wetland in the lower part of the watershed. The different stakeholders were identified and categorized according to parameters including power, interest, co-stakeholder, and interaction. Finally, the linkage between the stakeholders was investigated using network analysis (Gephi software). To determine water quality, Dissolved Oxygen (DO), Temperature, Turbidity and Electrical Conductivity (EC) were measured in situ, while the number of Fecal Coliform and the amounts of Biological Oxygen Demand (BOD), Chemical Oxygen Demand (COD), Phosphate and Nitrate were measured in the laboratory at 4 sites, monthly from Dec 2018 to Feb 2020. The results of the parameters and the water quality index indicated relatively high pollution in the watershed. PirBazar station, the most important source of water for the wetland, was much more polluted than the others. The role of local communities, the participation of stakeholders and their awareness are very effective in the protection and rehabilitation of the wetland. The identification process led to 39 stakeholders categorized into five groups, with government sectors having the highest number of stakeholders. Weighted out-degree showed that many stakeholders that directly use wetland services, such as the local people, were placed in low-power categories; their importance was determined by closeness and betweenness criteria. Based on the stakeholders' attributes in the study, the most important and influential stakeholders can be used by the related managers for decision-making and for improving water quality.

... A number of different classification schemes with different levels of sophistication have been proposed in the literature to analyse the Cosmic Web based on the Hessian of the cosmic density field (Aragón-Calvo et al. 2007;Hahn et al. 2007;Forero-Romero et al. 2009;Sousbie et al. 2009;Zhu & Feng 2017;Cui et al. 2018), watershed segmentation of the cosmic density field (Aragón-Calvo et al. 2010), the velocity shear tensor (Hoffman et al. 2012;Fisher, Faltenbacher & Johnson 2016;Pomarède et al. 2017), a combination of density and kinematic information (Cautun, van de Weygaert & Jones 2013), Bayesian reconstruction of the density field from a population of tracers (Leclercq, Jasche & Wandelt 2015), gradient-based methods that detect filaments through density ridges (Chen et al. 2016), network analysis (Hong et al. 2016), analysis of the dark matter flip-flop field (Shandarin & Medvedev 2017), the identification of caustics (Feldbrugge et al. 2018), and Cosmic Web skeleton construction using Morse theory (Sousbie 2013;Codis et al. 2018). Libeskind et al. (2018) recently presented a thorough comparison of multiple methods. ...

We analyse the IllustrisTNG simulations to study the mass, volume fraction, and phase distribution of gaseous baryons embedded in the knots, filaments, sheets, and voids of the Cosmic Web from redshift z = 8 to redshift z = 0. We find that filaments host more star-forming gas than knots, and that filaments also have a higher relative mass fraction of gas in this phase than knots. We also show that the cool, diffuse intergalactic medium [IGM; $T < 10^5 \, {\rm K}$, $n_{\rm H} < 10^{-4}(1+z) \, {\rm cm^{-3}}$] and the warm-hot intergalactic medium [WHIM; $10^5 < T < 10^7 \, {\rm K}$, $n_{\rm H} < 10^{-4}(1+z)\, {\rm cm^{-3}}$] constitute $\sim 39$ and $\sim 46$ per cent of the baryons at redshift z = 0, respectively. Our results indicate that the WHIM may constitute the largest reservoir of missing baryons at redshift z = 0. Using our Cosmic Web classification, we predict the WHIM to be the dominant baryon mass contribution in filaments and knots at redshift z = 0, but not in sheets and voids where the cool, diffuse IGM dominates. We also characterize the evolution of WHIM and IGM from redshift z = 4 to redshift z = 0, and find that the mass fraction of WHIM in filaments and knots evolves only by a factor of ∼2 from redshift z = 0 to 1, but declines faster at higher redshift. The WHIM only occupies 4–11 per cent of the volume at redshift 0 ≤ z ≤ 1. We predict the existence of a significant number of currently undetected O vii and Ne ix absorption systems in cosmic filaments, which could be detected by future X-ray telescopes like Athena.

... cosmic web. This entropy can be computed from the number of connections for each point in the graph, which is a property that can be computed for commonly used graphs in this context, such as minimal spanning trees (Barrow et al. 1985), Delaunay tessellations (Romano-Díaz & van de Weygaert 2007), neighbor networks within a fixed linking length (Hong et al. 2016) or the β-skeleton (Fang et al. 2019). For any of those graphs one can estimate the probability of a point having $n$ connections, $P_n$, and from these values a global entropy can be defined as $S = \sum_{P_n>0} -P_n \log_2 P_n$. ...

We explore the information theory entropy of a graph as a scalar to quantify the cosmic web. We find entropy values in the range between 1.5 and 3.2 bits. We argue that this entropy can be used as a discrete analogue of scalars used to quantify the connectivity in continuous density fields. After showing that the entropy clearly distinguishes between clustered and random points, we use simulations to gauge the influence of survey geometry, cosmic variance, redshift space distortions, redshift evolution, cosmological parameters and spatial number density. Cosmic variance shows the least important influence while changes from the survey geometry, redshift space distortions, cosmological parameters and redshift evolution produce larger changes on the order of $10^{-2}$ bits. The largest influence on the graph entropy comes from changes in the number density of clustered points. As the number density decreases, and the cosmic web is less pronounced, the entropy can diminish up to 0.2 bits. The graph entropy is simple to compute and can be applied both to simulations and observational data from large galaxy redshift surveys; it is a new statistic that can be used in a complementary way to other kinds of topological or clustering measurements.
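
As a minimal sketch of the entropy described above, the following Python computes $S$ from the connection counts of a neighbour graph built with a fixed linking length. The function names `fixed_radius_degrees` and `degree_entropy`, the brute-force pair search, and the toy points are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def fixed_radius_degrees(points, linking_length):
    """Count, for each point, the neighbours within linking_length.

    Brute-force O(N^2) pair search; adequate for a small illustration only.
    """
    n = len(points)
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= linking_length:
                degrees[i] += 1
                degrees[j] += 1
    return degrees


def degree_entropy(degrees):
    """Shannon entropy (in bits) of the distribution P_n of connection counts.

    Only values of n that actually occur (P_n > 0) enter the sum.
    """
    counts = Counter(degrees)
    total = len(degrees)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy input: three points in a chain plus one isolated point.
toy_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
toy_degrees = fixed_radius_degrees(toy_points, linking_length=1.5)
toy_entropy = degree_entropy(toy_degrees)
```

For the toy input the connection counts are 1, 2, 1 and 0, so $P_1 = 1/2$ and $P_0 = P_2 = 1/4$, which gives an entropy of 1.5 bits.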

... Network structures have drawn significant attention in big data due to the possibility to apply network theory and analysis to obtain extra insights from the data. Networks are ubiquitous [27] in research areas from biology and neuroscience (e.g., brain networks [6]) to modeling and analyzing galaxy distributions [17], and quantifying reputation in art [15]. These examples are use cases where network layers play an important role to represent and analyze the data. ...

We consider the problem of automatically generating networks from data of collaborating researchers. The objective is to apply network analysis on the resulting network layers to reveal supplemental patterns and insights of the research collaborations. In this paper, we describe our data-to-networks method, which automatically generates a set of logical network layers from the relational input data using a linkage threshold. We, then, use a series of network metrics to analyze the impact of the linkage threshold on the individual network layers. Moreover, results from the network analysis also provide beneficial information to improve the network visualization. We demonstrate the feasibility and impact of our approach using real-world collaboration data. We discuss how the produced network layers can reveal insights and patterns to direct the data analytics more intelligently.

... A number of different classification schemes with different levels of sophistication have been proposed in the literature to analyse the Cosmic Web (Aragón-Calvo et al. 2007; Hahn et al. 2007; Sousbie et al. 2009; Forero-Romero et al. 2009; Aragón-Calvo et al. 2010; Hoffman et al. 2012; Cautun et al. 2013; Leclercq et al. 2015; Chen et al. 2016; Fisher et al. 2016; Hong et al. 2016; Shandarin & Medvedev 2017; Pomarède et al. 2017; Feldbrugge et al. 2018; Zhu & Feng 2017; Libeskind et al. 2018; Cui et al. 2018). For this paper we choose to implement our own version of the method as developed by Forero-Romero et al. (2009) which classifies the Cosmic Web using the deformation tensor and is based on results from the analysis of the growth of the large scale structure using the Zel'dovich approximation. ...

We analyze the state of gas embedded in different parts of the Cosmic Web using the IllustrisTNG simulations. We focus on the mass and volume fractions of baryons in different phases in knots, filaments, sheets and voids in the Cosmic Web from redshift $z=8$ to redshift $z=0$. We characterise the density-temperature distribution of different phases and inspect their metallicity, finding evidence for early metal enrichment in the Intracluster Medium (ICM) of clusters already at $z=4$. Not only do we find that filaments host more star-forming gas than knots, but also that filaments have a higher relative mass fraction of gas in this phase than knots. In agreement with previous predictions, we show that the cool, diffuse Intergalactic Medium (IGM; $T<10^5 \, {\rm K}$, $ n_{\rm H}<10^{-4} \, {\rm cm^{-3}}$) and the Warm-Hot Intergalactic Medium (WHIM; $ 10^5 \, {\rm K} <T<10^7 \, {\rm K}$, $ n_{\rm H} <10^{-4} \, {\rm cm^{-3}}$) constitute $\sim 38\%$ and $\sim 47\%$ of the baryons at redshift $z=0$, respectively. Our results indicate that the WHIM can indeed constitute the largest reservoir of {\it missing} baryons at redshift $z=0$. Using our Cosmic Web classification, we predict the WHIM to be the dominant baryon mass contribution in filaments and knots at redshift $z=0$, but not in sheets and voids where the cool, diffuse IGM dominates. We also characterise the evolution of WHIM and IGM from redshift $z=4$ to redshift $z=0$, and find that the mass fraction of WHIM in filaments and knots evolves only by a factor $\sim 2$ from redshift $z=0$ to $z=1$, but declines faster at higher redshift. The WHIM only occupies $5-13\%$ of the volume at redshift $0\leq z \leq 1$. We predict the existence of a significant number of currently undetected OVII and NeIX absorption systems in cosmic filaments which could be detected by future X-ray telescopes like ATHENA.

By utilizing large-scale graph analytic tools implemented in the modern big data platform Apache Spark, we investigate the topological structure of gravitational clustering in five different universes produced by cosmological N-body simulations with varying parameters: (1) a WMAP 5-yr compatible ΛCDM cosmology, (2) two different dark energy equation of state variants, and (3) two different cosmic matter density variants. For the big data calculations, we use a custom-built standalone Spark/Hadoop cluster at the Korea Institute for Advanced Study and Dataproc Compute Engine in Google Cloud Platform, with sample sizes ranging from 7 to 200 million. We find that, among the many possible graph-topological measures, three simple ones can effectively discriminate among the five model universes: (1) the average number of neighbours (the so-called average vertex degree) $\alpha$, (2) the closed-to-connected triple fraction (the so-called transitivity) $\tau_\Delta$, and (3) the cumulative number density $n_{s \geq 5}$ of subgraphs with connected component size $s \geq 5$. Since these graph-topological measures are directly related to the usual n-point correlation functions of the cosmic density field, graph-topological statistics powered by big data computational infrastructure open a new, intuitive, and computationally efficient window into the dark Universe.
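
The first two measures named above can be illustrated on a small graph. The sketch below is plain Python, not the Spark pipeline used in the paper; the adjacency-dictionary representation, the function names, and the toy graph are assumptions made for illustration.

```python
from itertools import combinations


def average_degree(adj):
    """Mean number of neighbours per vertex (average vertex degree)."""
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)


def transitivity(adj):
    """Fraction of connected triples that are closed.

    A connected triple is a vertex together with an unordered pair of its
    neighbours; it is closed when those two neighbours are linked as well.
    Each triangle therefore contributes three closed triples.
    """
    closed = 0
    triples = 0
    for neighbours in adj.values():
        k = len(neighbours)
        triples += k * (k - 1) // 2
        for a, b in combinations(neighbours, 2):
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical toy graph: a triangle (0, 1, 2) with one extra vertex hanging off 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
toy_alpha = average_degree(toy_graph)
toy_tau = transitivity(toy_graph)
```

In the toy graph there are five connected triples, three of which are closed, so the transitivity is 0.6, while the average degree is 2.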

We explore the information theory entropy of a graph as a scalar to quantify the cosmic web. We find entropy values in the range between 1.5 and 3.2 bits. We argue that this entropy can be used as a discrete analogue of scalars used to quantify the connectivity in continuous density fields. After showing that the entropy clearly distinguishes between clustered and random points, we use simulations to gauge the influence of survey geometry, cosmic variance, redshift space distortions, redshift evolution, cosmological parameters, and spatial number density. Cosmic variance shows the least important influence while changes from the survey geometry, redshift space distortions, cosmological parameters, and redshift evolution produce larger changes of the order of $10^{-2}$ bits. The largest influence on the graph entropy comes from changes in the number density of clustered points. As the number density decreases, and the cosmic web is less pronounced, the entropy can diminish up to 0.2 bits. The graph entropy is simple to compute and can be applied both to simulations and observational data from large galaxy redshift surveys; it is a new statistic that can be used in a complementary way to other kinds of topological or clustering measurements.

We perform an analysis of the cosmic web as a complex network, which is built on a Λ cold dark matter (ΛCDM) cosmological simulation. For each of the nodes, which are in this case dark matter haloes formed in the simulation, we compute 10 network metrics, which characterize the role and position of a node in the network. The relation of these metrics to the topological affiliation of the halo, i.e. to the type of large-scale structure to which it belongs, is then investigated. In particular, the correlation coefficients between network metrics and topology classes are computed. We have applied different machine learning methods to test the predictive power of the obtained network metrics and to check if one could use network analysis as a tool for establishing the topology of the large-scale structure of the Universe. Results of such predictions, combined in the confusion matrix, show that it is not possible to give a good prediction of the topology of the cosmic web (the score is ≈70 per cent on average) based only on coordinates and velocities of nodes (haloes), yet network metrics can give a hint about the topological landscape of matter distribution.

Modern cosmology predicts that matter in our universe today has assembled into a vast network of filamentary structures colloquially termed the "cosmic web." Because this matter is either electromagnetically invisible (i.e., dark) or too diffuse to image in emission, tests of this cosmic web paradigm are limited. Wide-field surveys do reveal web-like structures in the galaxy distribution, but these luminous galaxies represent less than 10% of baryonic matter. Statistics of absorption by the intergalactic medium (IGM) via spectroscopy of distant quasars support the model yet have not conclusively tied the diffuse IGM to the web. Here, we report on a new method inspired by the Physarum polycephalum slime mold that is able to infer the density field of the cosmic web from galaxy surveys. Applying our technique to galaxy and absorption-line surveys of the local universe, we demonstrate that the bulk of the IGM indeed resides in the cosmic web. From the outskirts of cosmic web filaments, at approximately the cosmic mean matter density ($\rho_m$) and 5 virial radii from nearby galaxies, we detect an increasing H i absorption signature toward higher densities and the circumgalactic medium, to $200\rho_m$. However, the absorption is suppressed within the densest environments, suggesting shock-heating and ionization deep within filaments and/or feedback processes within galaxies. © 2020. The American Astronomical Society. All rights reserved.

In the near future, more than two thirds of the world’s population is expected to be living in cities. In this interconnected world, data collection from various sensors is both easier and unavoidable. Handling the right data is an important factor in decision making and improving services, while keeping the right level of privacy for end users is crucial. This position paper discusses the necessary trade-off between privacy needs and data handling for the improvement of services. Pseudo-anonymization techniques have shown their limits, and local computation and aggregation of data seem to be the way forward. To illustrate the opportunity, the case is made for a novel generation of clustering algorithms that implement a privacy-by-design approach. Preliminary results from a use case of such a clustering algorithm show that our approach exhibits a high degree of elasticity.

Percolation analysis has long been used to quantify the connectivity of the cosmic web. Most of the previous work is based on density fields on grids. By smoothing into fields, we lose information about galaxy properties like shape or luminosity. The lack of a mathematical model also limits our understanding of percolation analysis. In order to overcome these difficulties, we have studied percolation analysis based on discrete points. Using a Friends-of-Friends (FoF) algorithm, we generate the S-bb relation, between the fractional mass of the largest connected group (S) and the FoF linking length (bb). We propose a new model, the Probability Cloud Cluster Expansion Theory (PCCET), to relate the S-bb relation to correlation functions. We show that the S-bb relation reflects a combination of all orders of correlation functions. Using N-body simulations, we find that the S-bb relation is robust against redshift distortion and incompleteness in observation. From the Bolshoi simulation, with Halo Abundance Matching (HAM), we have generated a mock galaxy catalogue. Good matching of the projected two-point correlation function with observation is confirmed. However, comparing the mock catalogue with the latest galaxy catalogue from SDSS DR12, we have found significant differences in their S-bb relations. This indicates that the mock galaxy catalogue cannot accurately retain correlation functions of higher order than the two-point correlation function, which reveals the limit of the HAM method. As a new measurement, the S-bb relation is applicable to a wide range of data types, fast to compute, robust against redshift distortion and incompleteness, and it contains information from all orders of the correlation function.
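
The relation between the largest-group fraction and the linking length can be sketched with a small Friends-of-Friends grouping based on a union-find structure. This is a minimal illustration under stated assumptions: all points carry equal mass (so the mass fraction reduces to a number fraction), pairs are searched by brute force, and the function name `largest_group_fraction` and the toy points are hypothetical.

```python
import math


def largest_group_fraction(points, linking_length):
    """Fraction of points in the largest Friends-of-Friends group.

    Two points are 'friends' when their separation is at most linking_length;
    groups are the connected components of the resulting graph.
    """
    n = len(points)
    parent = list(range(n))
    size = [1] * n

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri == rj:
            return
        if size[ri] < size[rj]:
            ri, rj = rj, ri
        parent[rj] = ri
        size[ri] += size[rj]

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= linking_length:
                union(i, j)

    return max(size[find(i)] for i in range(n)) / n


# Hypothetical toy input: a chain of three points and a separate pair.
toy_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
toy_relation = [(b, largest_group_fraction(toy_points, b)) for b in (0.5, 1.0, 8.0)]
```

Scanning the linking length over a grid, as in `toy_relation`, traces how the largest group grows from a single point to the whole sample.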

The complex network analysis of the COSMOS galaxy field for R.A. = 149.4 deg - 150.4 deg and Decl. = 1.7 deg - 2.7 deg is presented. 2D projections of spatial distributions of galaxies in three redshift slices 0.88-0.91, 0.91-0.94 and 0.94-0.97 are studied. We analyse network similarity/peculiarity of different samples and correlations of galaxy astrophysical properties (colour index and stellar mass) with their topological environments. For each slice the local and global network measures are calculated. Results indicate a high level of similarity between geometry and topology of different galaxy samples. We found no clear evidence of evolutionary change in network measures for different slices. Most local network measures have non-Gaussian distributions, often bi- or multi-modal. The distribution of the local clustering coefficient C manifests three modes which allow for discrimination between stand-alone singlets and dumbbells (0 <= C < 0.1), intermediately packed galaxies (0.1 <= C < 0.9) and cliques (0.9 <= C <= 1). Analysing astrophysical properties, we show that mean values and distributions of galaxy colour index and stellar mass are similar in all slices. However, statistically significant correlations are found if one selects galaxies according to different modes of the C distribution. The distribution of stellar mass for galaxies with interim C differs from the corresponding distributions for stand-alone and clique galaxies. This difference holds for all redshift slices. By contrast, the analogous difference in the colour index distributions is observed only in the central redshift interval.

Even as our measurements of cosmological parameters improve, the physical nature of the dark sector of the universe largely remains a mystery. Many effects of dark sector models are most prominent at very large scales and will rely on future galaxy surveys to elucidate. In this paper we compare the topological properties of the large scale dark matter distribution in a number of cosmological models using hydrodynamical simulations and the cosmological genus statistic. Genus curves are computed from z = 11 to z = 0 for {\Lambda}CDM, Quintessence and Warm Dark Matter models, over a scale range of 1 to 20 Mpc/h. The curves are analysed in terms of their Hermite spectra to describe the power contained in non-Gaussian deformations to the cosmological density field. We find that the {\Lambda}CDM and {\Lambda}WDM models produce nearly identical genus curves indicating no topological differences in structure formation. The Quintessence model, which differs solely in its expansion history, produces significant differences in the strength and redshift evolution of non-Gaussian modes associated with higher cluster abundances and lower void abundances. These effects are robust to cosmic variance and are characteristically different from those produced by tweaking the parameters of a {\Lambda}CDM model. Given the simplicity and similarity of the models, detecting these discrepancies represents a promising avenue for understanding the effect of non-standard cosmologies on large-scale structure.

We present a science forecast for the eBOSS survey, part of the SDSS-IV
project, which is a spectroscopic survey using multiple tracers of large-scale
structure, including luminous red galaxies (LRGs), emission line galaxies
(ELGs) and quasars (both as a direct probe of structure and through the
Ly-$\alpha$ forest). Focusing on discrete tracers, we forecast the expected
accuracy of the baryonic acoustic oscillation (BAO), the redshift-space
distortion (RSD) measurements, the $f_{\rm NL}$ parameter quantifying the
primordial non-Gaussianity, the dark energy and modified gravity parameters. We
also use the line-of-sight clustering in the Ly-$\alpha$ forest to constrain
the total neutrino mass. We find that eBOSS LRGs ($0.6<z<1.0$) (combined with
the BOSS LRGs at $z>0.6$), ELGs ($0.6<z<1.2$) and Clustering Quasars (CQs)
($0.6<z<2.2$) can achieve precisions of 1%, 2.2% and 1.6%,
respectively, for spherically averaged BAO distance measurements. Using the
same samples, the constraint on $f\sigma_8$ is expected to be 2.5%, 3.3% and
2.8% respectively. For primordial non-Gaussianity, eBOSS alone can reach an
accuracy of $\sigma(f_{\rm NL})\sim10-15$, depending on the external
measurement of the galaxy bias and our ability to model large-scale systematic
errors. eBOSS can at most improve the dark energy Figure of Merit (FoM) by a
factor of $3$ for the Chevallier-Polarski-Linder (CPL) parametrisation, and can
well constrain three eigenmodes for the general equation-of-state parameter
(Abridged).

A survey is given of theories for the origin of large-scale structure in the universe: clusters and superclusters of galaxies, and vast black regions practically devoid of galaxies. Special attention is paid to the theory of a neutrino-dominated universe: a cosmology in which electron neutrinos with a rest mass of a few tens of electron volts would contribute the bulk of the mean density. The evolution of small perturbations is discussed, and estimates are made for the temperature anisotropy of the microwave background radiation on various angular scales. The nonlinear stage in the evolution of smooth irrotational perturbations in a low-pressure medium is described in detail. Numerical experiments simulating large-scale structure formation processes are discussed, as well as their interpretation in the context of catastrophe theory.

We study the convergence properties of smoothed particle hydrodynamics (SPH)
using numerical tests and simple analytic considerations. Our analysis shows
that formal numerical convergence is possible in SPH only in the joint limit $N
\rightarrow \infty$, $h \rightarrow 0$, and $N_{nb} \rightarrow \infty$, where
$N$ is the total number of particles, $h$ is the smoothing length, and $N_{nb}$
is the number of neighbor particles within the smoothing volume used to compute
smoothed estimates. Previous work has generally assumed that the conditions $N
\rightarrow \infty$ and $h \rightarrow 0$ are sufficient to achieve
convergence, while holding $N_{nb}$ fixed. We demonstrate that if $N_{nb}$ is
held fixed as the resolution is increased, there will be a residual source of
error that does not vanish as $N \rightarrow \infty$ and $h \rightarrow 0$.
Formal numerical convergence in SPH is possible only if $N_{nb}$ is increased
systematically as the resolution is improved. Using analytic arguments, we
derive an optimal compromise scaling for $N_{nb}$ by requiring that this source
of error balance that present in the smoothing procedure. For typical choices
of the smoothing kernel, we find $N_{nb} \propto N^{1/2}$. This means that if
SPH is to be used as a numerically convergent method, it does not scale with
particle number as $O(N)$, but rather as $O(N^{1+\delta})$, where $\delta
\approx 1/2$, with a weak dependence on the form of the smoothing kernel.
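
The quoted exponent can be checked with one worked step, under the assumption (not stated in the abstract) that the cost of an SPH time-step is proportional to the number of particle-neighbour interactions, $N \, N_{nb}$:

```latex
\begin{align}
  \text{cost} &\propto N \, N_{nb}, \qquad N_{nb} \propto N^{1/2} \\
  \Rightarrow\quad \text{cost} &\propto N \cdot N^{1/2} = N^{3/2} = N^{1+\delta},
  \qquad \delta = \tfrac{1}{2}.
\end{align}
```

With $N_{nb}$ held fixed the same assumption gives the familiar $O(N)$ scaling, which is the case the abstract argues is not formally convergent.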

BICEP1 is a millimeter-wavelength telescope designed specifically to measure
the inflationary B-mode polarization of the Cosmic Microwave Background (CMB)
at degree angular scales. We present results from an analysis of the data
acquired during three seasons of observations at the South Pole (2006 to 2008).
This work extends the two-year result published in Chiang et al. (2010), with
additional data from the third season and relaxed detector-selection criteria.
This analysis also introduces a more comprehensive estimation of band-power
window functions, improved likelihood estimation methods and a new technique
for deprojecting monopole temperature-to-polarization leakage which reduces
this class of systematic uncertainty to a negligible level. We present maps of
temperature, E- and B-mode polarization, and their associated angular power
spectra. The improvement in the map noise level and polarization spectra error
bars are consistent with the 52% increase in integration time relative to
Chiang et al. (2010). We confirm both self-consistency of the polarization data
and consistency with the two-year results. We measure the angular power spectra
at $21 \leq \ell \leq 335$ and find that the EE spectrum is consistent with $\Lambda$ Cold
Dark Matter ($\Lambda$CDM) cosmology, with the first acoustic peak of the EE spectrum
now detected at $15\sigma$. The BB spectrum remains consistent with zero. From
B-modes only, we constrain the tensor-to-scalar ratio to $r = 0.03^{+0.27}_{-0.23}$, or
$r < 0.70$ at 95% confidence level.

The density distribution arising at the nonlinear stage of gravitational instability is similar to intermittency phenomena in acoustic turbulence. Initially small-amplitude density fluctuations of Gaussian type transform into thin dense pancakes, filaments, and compact clumps of matter. It is perhaps surprising that the motion of self-gravitating matter in the expanding universe is like that of noninteracting matter moving by inertia. A similar process is the distribution of light reflected or refracted from rippled water. The similarity of gravitational instability to acoustic turbulence is highlighted by the fact that late nonlinear stages of density perturbation growth can be described by the Burgers equation, which is well known in the theory of turbulence. The phenomena discussed in this article are closely related to the problem of the formation of large-scale structure of the universe, which is also discussed.

A graph-theoretical technique for assessing intrinsic patterns in point
data sets is described. A unique construction, the minimal spanning
tree, can be associated with any point-data set, given all the
interpoint separations. This construction enables the skeletal pattern
of galaxy clustering to be singled out in quantitative fashion, and
differs from other statistics applied to these data sets. This technique
is applied to two- and three-dimensional distributions of galaxies, and
also to comparable random samples and numerical simulations. The
observed CfA and Zwicky (1961-1968) data exhibit characteristic
distributions of edge-lengths in their minimal spanning trees which are
distinct from those found in random samples. These statistics are also
reevaluated after normalizing to account for the level of clustering in
the samples.
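
As an illustration of the construction described above, the sketch below builds a minimal spanning tree with Prim's algorithm and returns its edge lengths, whose distribution is the statistic the abstract refers to. The function name `mst_edge_lengths`, the $O(N^2)$ implementation, and the toy points are assumptions for illustration, not the procedure used in the paper.

```python
import math


def mst_edge_lengths(points):
    """Sorted edge lengths of the Euclidean minimal spanning tree (Prim, O(N^2))."""
    n = len(points)
    if n < 2:
        return []

    in_tree = [False] * n
    in_tree[0] = True
    # best[j] holds the shortest distance from point j to any point already in the tree.
    best = [math.inf] * n
    for j in range(1, n):
        best[j] = math.dist(points[0], points[j])

    lengths = []
    for _ in range(n - 1):
        # Attach the outside point that is closest to the current tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        lengths.append(best[u])
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[u], points[j])
                if d < best[j]:
                    best[j] = d
    return sorted(lengths)


# Hypothetical toy input in two dimensions.
toy_lengths = mst_edge_lengths([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.0, 4.0)])
```

A tree over N points always has N - 1 edges, so histograms of `mst_edge_lengths` for a galaxy sample and for a random sample of the same size can be compared directly.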

We present cosmological parameter constraints based on the final nine-year
WMAP data, in conjunction with additional cosmological data sets. The WMAP data
alone, and in combination, continue to be remarkably well fit by a
six-parameter $\Lambda$CDM model. When WMAP data are combined with measurements of the
high-$\ell$ CMB anisotropy, the BAO scale, and the Hubble constant, the densities,
$\Omega_b h^2$, $\Omega_c h^2$, and $\Omega_\Lambda$, are each determined to a precision of ~1.5%.
The amplitude of the primordial spectrum is measured to within 3%, and there is
now evidence for a tilt in the primordial spectrum at the $5\sigma$ level,
confirming the first detection of tilt based on the five-year WMAP data. At the
end of the WMAP mission, the nine-year data decrease the allowable volume of
the six-dimensional $\Lambda$CDM parameter space by a factor of 68,000 relative to
pre-WMAP measurements. We investigate a number of data combinations and show
that their $\Lambda$CDM parameter fits are consistent. New limits on deviations from
the six-parameter model are presented, for example: the fractional contribution
of tensor modes is limited to $r<0.13$ (95% CL); the spatial curvature parameter
is limited to $-0.0027^{+0.0039}_{-0.0038}$; the summed mass of neutrinos is $<0.44$
eV (95% CL); and the number of relativistic species is found to be $3.84 \pm 0.40$
when the full data are analyzed. The joint constraint on $N_{\rm eff}$ and the
primordial helium abundance agrees with the prediction of standard Big Bang
nucleosynthesis. We compare recent PLANCK measurements of the
Sunyaev-Zel'dovich effect with our seven-year measurements, and show their
mutual agreement. Our analysis of the polarization pattern around temperature
extrema is updated. This confirms a fundamental prediction of the standard
cosmological model and provides a striking illustration of acoustic
oscillations and adiabatic initial conditions in the early universe.

We apply simple analysis techniques developed for the study of complex networks to the study of the cosmic web, the large-scale
galaxy distribution. In this paper, we measure three network centralities (ranks of topological importance): degree centrality
(DC), closeness centrality (CL), and betweenness centrality (BC) from a network built from the Cosmological Evolution Survey
(COSMOS) catalogue. We define eight galaxy populations according to the centrality measures: void, wall, and cluster by DC;
main branch and dangling leaf by BC; and kernel, backbone, and fracture by CL. We also define three populations by Voronoi
tessellation density to compare these with the DC selection. We apply the topological selections to galaxies in the (photometric)
redshift range 0.91 < z < 0.94 from the COSMOS survey, and explore whether the red and blue galaxy populations show differences in colour, star formation
rate, and stellar mass in the different topological regions. Despite the limitations and uncertainties associated with using
photometric redshift and indirect measurements of galactic parameters, the preliminary results illustrate the potential of
network analysis. Future surveys will provide better statistical samples to test and improve this ‘network cosmology’.

Previous simulations of the growth of cosmic structures have broadly reproduced the 'cosmic web' of galaxies that we see in the Universe, but failed to create a mixed population of elliptical and spiral galaxies, because of numerical inaccuracies and incomplete physical models. Moreover, they were unable to track the small-scale evolution of gas and stars to the present epoch within a representative portion of the Universe. Here we report a simulation that starts 12 million years after the Big Bang, and traces 13 billion years of cosmic evolution with 12 billion resolution elements in a cube of 106.5 megaparsecs a side. It yields a reasonable population of ellipticals and spirals, reproduces the observed distribution of galaxies in clusters and characteristics of hydrogen on large scales, and at the same time matches the 'metal' and hydrogen content of galaxies on small scales.

Generalized techniques for determining density enhancements in redshift
space are presented, and one of these techniques is used to examine the
effects of varying selection criteria on the dynamical parameters
defined for groups of galaxies. There is a broad range of selection
parameters which yields well defined and well behaved groups. A whole
sky catalog of nearby groups with outer number density enhancement
exceeding 20 is presented; the median M/L for these groups is
approximately 170, corresponding to a cosmological density Omega = 0.1.
A two-dimensional projection of several contours near the Virgo cluster
is examined, and the clustering is found to exhibit both concentric and
hierarchical structure.

We present a general method for calculating the bias and variance of
estimators for w(theta) based on galaxy-galaxy (DD), random-random (RR),
and galaxy-random (DR) pair counts and describe a procedure for quickly
estimating these quantities given an arbitrary two-point correlation
function and sampling geometry. These results, based conditionally upon
the number counts, are accurate for both high and low number counts. We
show explicit analytical results for the variances in the estimators
DD/RR, DD/DR, which turn out to be considerably larger than the common
wisdom Poisson estimate and report a small bias in DD/DR in addition to
that due to the integral constraint. Further, we introduce and recommend
an improved estimator (DD - 2DR + RR)/RR, whose variance is nearly
Poisson.
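
The recommended estimator can be written down directly from pair counts in a single separation bin. The sketch below assumes a normalisation the abstract does not spell out: each raw count is divided by the total number of available pairs of that type. The function name `improved_estimator` and the example counts are hypothetical.

```python
def improved_estimator(dd, dr, rr, n_data, n_random):
    """Evaluate (DD - 2DR + RR) / RR for one bin from raw pair counts.

    dd, dr, rr are raw counts of data-data, data-random and random-random
    pairs in the bin; each is normalised by the total number of such pairs.
    """
    dd_norm = dd / (n_data * (n_data - 1) / 2)
    dr_norm = dr / (n_data * n_random)
    rr_norm = rr / (n_random * (n_random - 1) / 2)
    return (dd_norm - 2 * dr_norm + rr_norm) / rr_norm


# Hypothetical counts: 10 data points, 20 random points.
unclustered = improved_estimator(dd=9, dr=40, rr=38, n_data=10, n_random=20)
clustered = improved_estimator(dd=18, dr=40, rr=38, n_data=10, n_random=20)
```

In the first call every normalised count equals 0.2, so the estimate vanishes, as expected for unclustered data; doubling the data-data count in the second call raises the estimate to 1.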

The development of a nondynamical computer model universe designed to
match the character of the galaxy distribution in the Lick survey is
described. The model assigns 'galaxy' positions in a three-dimensional
clustering hierarchy, fixes absolute magnitudes, and projects angular
positions of objects brighter than m = 18.9 onto the sky of an imaginary
observer. This yields a galaxy map that can be compared to that of the
Lick data. In the model there are 7.5 million galaxies at a mean space
density of $0.065\,h^3\,{\rm Mpc^{-3}}$ ($H = 100\,h$ km/s per Mpc), and 386,000
galaxies are visible at apparent magnitudes of no more than 18.9 and
galactic latitudes of at least 40 deg. By adjusting parameters in the
model within the limits allowed by the correlation functions to fourth
order, a galaxy map with a visual appearance that seems a reasonable
first approximation to that of the Lick data is obtained.

During the past several years there has been considerable work on
large-scale redshift surveys in selected regions of the sky. The
Harvard-Smithsonian Center for Astrophysics (CfA) survey is the largest
presently available sample. Part of the motivation for the survey was to
provide a sample for studies of the general statistics of the galaxy
distribution and motions. The present investigation is concerned with
the results of an analysis of the two-point position and velocity
correlation functions in the CfA sample. It is pointed out that the CfA
sample considerably improves the empirical understanding of several
important aspects of the galaxy two-point correlation functions. It is
known from angular distributions that the spatial correlation function
closely approximates a power law. That result is confirmed by direct
inversion of the observed function. The bias due to peculiar motion is
eliminated by integrating along the line of sight.

The large-scale structure (LSS) found in galaxy redshift surveys and in computer simulations of cosmic structure formation shows a very complex network of galaxy clusters, filaments and sheets around large voids. Here, we introduce a new algorithm, based on a minimal spanning tree, to find basic structural elements of this network and their properties. We demonstrate how the algorithm works using simple test cases and then apply it to haloes from the Millennium Run simulation. We show that about 70 per cent of the total halo mass is contained in a structure composed of more than 74 000 individual elements, the vast majority of which are filamentary, with lengths of up to 15 $h^{-1}$ Mpc preferred. Spatially more extended structures do exist, as do examples of what appear to be sheet-like configurations of matter. What is more, LSS appears to be composed of a fixed set of basic building blocks. The LSS formed by mass selected subsamples of haloes shows a clear correlation between the threshold mass and the mean extent of major branches, with cluster-size haloes forming structures whose branches can extend to almost 200 $h^{-1}$ Mpc – the backbone of LSS to which smaller branches consisting of smaller haloes are attached.

We present an assortment of methods for finding and counting simple cycles of a given length in directed and undirected graphs.
Most of the bounds obtained depend solely on the number of edges in the graph in question, and not on the number of vertices.
The bounds obtained improve upon various previously known results.
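
For the shortest cycles (triangles), a simple baseline is the trace of the cubed adjacency matrix. This is a hedged sketch of that textbook identity, not the faster methods the abstract refers to. It assumes a simple undirected graph with a symmetric 0/1 adjacency matrix and no self-loops, and the function names are illustrative.

```python
import numpy as np


def count_triangles(adjacency):
    """Count triangles in a simple undirected graph via trace(A^3) / 6."""
    a = np.asarray(adjacency, dtype=np.int64)
    # Each triangle yields 6 closed walks of length 3 (3 starts x 2 directions).
    return int(np.trace(a @ a @ a)) // 6


def transitivity(adjacency):
    """Fraction of connected triples that are closed into triangles."""
    a = np.asarray(adjacency, dtype=np.int64)
    degrees = a.sum(axis=1)
    # Ordered paths of length 2 centred on each vertex: k * (k - 1).
    triples = int((degrees * (degrees - 1)).sum())
    if triples == 0:
        return 0.0
    return float(np.trace(a @ a @ a)) / triples


# Complete graph on 4 vertices and a 4-cycle as small test cases.
k4 = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
c4 = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])
```

The dense matrix product costs O(n^3) time for n vertices, so this form is only practical for small graphs.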

We provide simple, faster algorithms for the detection of cliques and dominating sets of fixed order. Our algorithms are based on reductions to rectangular matrix multiplication. We also describe an improved algorithm for diamonds detection.

We discuss cosmological hydrodynamic simulations of galaxy formation
performed with the new moving-mesh code AREPO, which promises higher accuracy
compared with the traditional SPH technique that has been widely employed for
this problem. We use an identical set of physics in corresponding simulations
carried out with the well-tested SPH code GADGET, adopting also the same
high-resolution gravity solver. We are thus able to compare both simulation
sets on an object-by-object basis, allowing us to cleanly isolate the impact of
different hydrodynamical methods on galaxy and halo properties. In accompanying
papers, we focus on an analysis of the global baryonic statistics predicted by
the simulation codes (Vogelsberger et al. 2011) and complementary idealized
simulations that highlight the differences between the hydrodynamical schemes
(Sijacki et al. 2011). Here we investigate their influence on the baryonic
properties of simulated galaxies and their surrounding haloes. We find that
AREPO leads to significantly higher star formation rates for galaxies in
massive haloes and to more extended gaseous disks in galaxies, which also
feature a thinner and smoother morphology than their GADGET counterparts.
Consequently, galaxies formed in AREPO have larger sizes and higher specific
angular momentum than their SPH correspondents. The more efficient cooling
flows in AREPO yield higher densities and lower entropies in halo centers (and
the opposite trend in halo outskirts) leading to higher star formation rates of
massive galaxies. While both codes agree to acceptable accuracy on a number of
baryonic properties of cosmic structures, our results clearly demonstrate that
galaxy formation simulations greatly benefit from the use of more accurate
hydrodynamical techniques such as AREPO.

We present a detailed comparison between the well-known SPH code GADGET and
the new moving-mesh code AREPO on a number of hydrodynamical test problems.
Through a variety of numerical experiments we establish a clear link between
test problems and systematic numerical effects seen in cosmological simulations
of galaxy formation. Our tests demonstrate deficiencies of the SPH method in
several sectors. These accuracy problems not only manifest themselves in
idealized hydrodynamical tests, but also propagate to more realistic simulation
setups of galaxy formation, ultimately affecting gas properties in the full
cosmological framework, as highlighted in papers by Vogelsberger et al. (2011)
and Keres et al. (2011). We find that an inadequate treatment of fluid
instabilities in GADGET suppresses entropy generation by mixing, underestimates
vorticity generation in curved shocks and prevents efficient gas stripping from
infalling substructures. In idealized tests of inside-out disk formation, the
convergence rate of gas disk sizes is much slower in GADGET due to spurious
angular momentum transport. In simulations where we follow the interaction
between a forming central disk and orbiting substructures in a halo, the final
disk morphology is strikingly different. In AREPO, gas from infalling
substructures is readily depleted and incorporated into the host halo
atmosphere, facilitating the formation of an extended central disk. Conversely,
gaseous sub-clumps are more coherent in GADGET simulations, morphologically
transforming the disk as they impact it. The numerical artefacts of the SPH
solver are particularly severe for poorly resolved flows, and thus inevitably
affect cosmological simulations due to their hierarchical nature. Our numerical
experiments clearly demonstrate that AREPO delivers a physically more reliable
solution.

We present the Smoothed Hessian Major Axis Filament Finder (SHMAFF), an algorithm that uses the eigenvectors of the Hessian matrix of the smoothed galaxy distribution to identify individual filamentary structures. Filaments are traced along the Hessian eigenvector corresponding to the largest eigenvalue, and are stopped when the axis orientation changes more rapidly than a preset threshold. In both N-body simulations and the Sloan Digital Sky Survey (SDSS) main galaxy redshift survey data, the resulting filament length distributions are approximately exponential. In the SDSS galaxy distribution, using smoothing lengths of 10 h^{-1} Mpc and 15 h^{-1} Mpc, we find filament lengths per unit volume of 1.9x10^{-3} h^2 Mpc^{-2} and 7.6x10^{-4} h^2 Mpc^{-2}, respectively. The filament width distributions, which are much more sensitive to non-linear growth, are also consistent between the real and mock galaxy distributions using a standard cosmology. In SDSS, we find mean filament widths of 5.5 h^{-1} Mpc and 8.4 h^{-1} Mpc on 10 h^{-1} Mpc and 15 h^{-1} Mpc smoothing scales, with standard deviations of 1.1 h^{-1} Mpc and 1.4 h^{-1} Mpc, respectively. Finally, the spatial distribution of filamentary structure in simulations is very similar between z=3 and z=0 on smoothing scales as large as 15 h^{-1} Mpc, suggesting that the outline of filamentary structure is already in place at high redshift. Comment: 10 pages, 11 figures, accepted to MNRAS

A quantitative measure of the topology of large-scale structure: the genus of density contours in a smoothed density distribution, is described and applied. For random phase (Gaussian) density fields, the mean genus per unit volume exhibits a universal dependence on threshold density, with a normalizing factor that can be calculated from the power spectrum. If large-scale structure formed from the gravitational instability of small-amplitude density fluctuations, the topology observed today on suitable scales should follow the topology in the initial conditions. The technique is illustrated by applying it to simulations of galaxy clustering in a flat universe dominated by cold dark matter. The technique is also applied to a volume-limited sample of the CfA redshift survey and to a model in which galaxies reside on the surfaces of polyhedral 'bubbles'. The topology of the evolved mass distribution and 'biased' galaxy distribution in the cold dark matter models closely matches the topology of the density fluctuations in the initial conditions. The topology of the observational sample is consistent with the random phase, cold dark matter model.
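
For reference, the "universal dependence on threshold density" for a Gaussian random field is usually written in the following standard form. The symbols are assumed notation rather than taken from the abstract: $\nu$ is the density threshold in units of the standard deviation of the smoothed field, and $\langle k^2 \rangle$ is the mean squared wavenumber weighted by the smoothed power spectrum.

$$
g(\nu) = A \left(1 - \nu^2\right) e^{-\nu^2/2},
\qquad
A = \frac{1}{4\pi^2} \left( \frac{\langle k^2 \rangle}{3} \right)^{3/2}
$$

The curve is positive for $|\nu| < 1$ (a sponge-like, multiply connected contour) and negative for $|\nu| > 1$ (isolated clusters or voids). Only the amplitude $A$ depends on the power spectrum.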

We present a new ansatz which can successfully be used to determine the morphological properties of the supercluster-void network. The ansatz is based on a surface modelling scheme SURFGEN, which generates a triangulated surface from a discrete data set representing (say) the distribution of galaxies in real (or redshift) space. Four Minkowski functionals -- surface area, volume, extrinsic curvature and genus -- describe the geometry and topology of the supercluster-void network. Ratios of Minkowski functionals -- Shapefinders -- provide us with an excellent diagnostic of three dimensional shapes of clusters, superclusters and voids. Minkowski functionals and Shapefinders are determined for a triangulated iso-density surface using SURFGEN. SURFGEN is tested against both simply and multiply connected eikonal surfaces such as triaxial ellipsoids and tori. Remarkably, the first three Minkowski functionals are computed to better than 1% accuracy while the fourth (genus) is known exactly. SURFGEN also gives excellent results when applied to Gaussian random fields. Our results indicate that the surface modelling scheme SURFGEN is accurate and robust and can successfully be used to quantify the topology and morphology of the supercluster-void network in the universe. We apply SURFGEN to three cosmological models, $\Lambda$CDM, $\tau$CDM and SCDM and obtain interesting new results pertaining to the geometry, morphology and topology of large scale structure. Comment: 28 MNRAS-style pages, 24 figures -- Revised with enhanced discussion and added references. An appendix included which shows that SURFGEN gives excellent results when applied to Gaussian Random Fields. Accepted for publication in MNRAS

We present an analysis of the Minkowski Functionals (MFs) describing the Wilkinson Microwave Anisotropy Probe (WMAP) 3-yr temperature maps to place limits on possible levels of primordial non-Gaussianity. In particular, we apply perturbative formulae for the MFs to give constraints on the usual non-linear coupling constant $f_{\rm NL}$. The theoretical predictions are found to agree with the MFs of simulated cosmic microwave background (CMB) maps including the full effects of radiative transfer. The agreement is also very good even when the simulation maps include various observational artefacts, including the pixel window function, beam smearing, inhomogeneous noise and the survey mask. We accordingly find that these analytical formulae can be applied directly to observational measurements of $f_{\rm NL}$ without relying on non-Gaussian simulations. Considering the bin-to-bin covariance of the MFs in WMAP in a chi-square analysis, we find that the primordial non-Gaussianity parameter is constrained to lie in the range $-70 < f_{\rm NL} < 91$ [95 per cent confidence level (C.L.)] using the Q+V+W co-added maps.

We present measurements of the redshift-space three-point correlation function of 50,967 Luminous Red Galaxies (LRGs) from Data Release 3 (DR3) of the Sloan Digital Sky Survey (SDSS). We have studied the shape dependence of the reduced three-point correlation function (Qz(s,q,theta)) on three different scales, s=4, 7 and 10 h^{-1} Mpc, and over the range of 1 < q < 3 and 0 < theta < 180. On small scales (s=4 h^{-1} Mpc), Qz is nearly constant, with little change as a function of q and theta. However, there is evidence for a shallow U-shaped behaviour (with theta) which is expected from theoretical modeling of Qz. On larger scales (s=7 and 10 h^{-1} Mpc), the U-shaped anisotropy in Qz (with theta) is more clearly detected. We compare this shape-dependence in Qz(s,q,theta) with that seen in mock galaxy catalogues which were generated by populating the dark matter halos in large N-body simulations with mock galaxies using various Halo Occupation Distributions (HOD). We find that the combination of the observed number density of LRGs, the (redshift-space) two-point correlation function and Qz provides a strong constraint on the allowed HOD parameters (M_min, M_1, alpha) and breaks key degeneracies between these parameters. For example, our observed Qz disfavors mock catalogues that overpopulate massive dark matter halos with many LRG satellites. We also estimate the linear bias of LRGs to be b=1.87+/-0.07 in excellent agreement with other measurements.

We present an analytic model for the galaxy two-point correlation function in redshift space. The model is constructed within the framework of the Halo Occupation Distribution (HOD), which quantifies galaxy bias on linear and non-linear scales. We model one-halo pairwise velocities by assuming that satellite galaxy velocities follow a Gaussian distribution with dispersion proportional to the virial dispersion of the host halo. Two-halo velocity statistics are a combination of virial motions and host halo motions. The velocity distribution function (DF) of halo pairs is a complex function with skewness and kurtosis that vary substantially with scale. Using a series of collisionless N-body simulations, we demonstrate that the shape of this DF is determined primarily by the distribution of local densities around a halo pair, and at fixed density the velocity DF is close to Gaussian and nearly independent of halo mass. We calibrate a model for the conditional probability function of densities around halo pairs on these simulations. With this model, the full shape of the halo velocity DF can be accurately calculated as a function of halo mass, radial separation, angle, and cosmology. The HOD approach to redshift-space distortions utilizes clustering data from linear to non-linear scales to break the standard degeneracies inherent in previous models of redshift-space clustering. The parameters of the occupation function are well constrained by real-space clustering alone, separating constraints on bias and cosmology. We demonstrate the ability of the model to separately constrain Omega_m, sigma_8, and galaxy velocity bias in models that are constructed to have the same value of beta at large scales as well as the same finger-of-god distortions at small scales. [Abridged]

- G Kulkarni

Kulkarni, G., et al. 2007, MNRAS, 378, 1196

- T Delubac

Delubac, T., et al. 2015, A&A, 574, 59

- J Tinker

Tinker, J. 2007, MNRAS, 374, 477

- J M Colberg

Colberg J. M. 2007, MNRAS, 375, 337

- G Csardi
- T Nepusz

Csardi G., Nepusz T., 2006, InterJournal, Complex Systems,
1695 (http://igraph.org)

- A Ducout

Ducout, A. et al. 2013, MNRAS, 429, 2104

- M Levi

Levi, M. et al. 2013, arXiv:1308.0847

Lidz, A. et al. 2010, ApJ, 718, 199

Mandelbrot, B. 1975, C. R. Acad. Sci. (Paris) A280, 1551

- M A Aragón-Calvo
- B J T Jones
- R Van De Weygaert
- J M Van Der Hulst

Aragón-Calvo M. A., Jones B. J. T., van de Weygaert R., van der
Hulst J. M., 2007, A&A, 474, 315

- S More

More, S., et al. 2011, ApJS, 195, 4

- Zhu

Zhu et al. 2015, ApJ, 800, 6

- T H Cormen

Cormen, T. H., et al. 2009, Introduction to Algorithms, 3rd
Edition, The MIT Press, Cambridge, Massachusetts

Davis, M. & Peebles, P. J. E. 1983, ApJ, 267, 465

- C Hikage

Hikage C. et al. 2008, MNRAS, 389, 1439

- J R Gott
- D H Weinberg
- A L Melott

Gott J. R., Weinberg D. H., Melott A. L., 1987, ApJ, 319, 1

- R Albert
- A.-L Barabási

Albert, R., & Barabási A.-L., 2002, Reviews of Modern Physics,
74, 47

- F Eisenbrand
- F Grandoni

Eisenbrand, F., & Grandoni, F. 2004, Theoretical Computer
Science, 326, 57

- H Gil-Marin

Gil-Marin, H., et al. 2015, MNRAS, 451, 539

- A A Berlind
- D H Weinberg

Berlind, A. A., & Weinberg, D. H. 2002, ApJ, 575, 587

- V Springel

Springel, V. et al. 2005, MNRAS, 361, 776

- K L Adelberger

Adelberger, K. L., et al. 2005, ApJ, 619, 697

- S Genel

Genel, S., et al. 2014, MNRAS, 445, 175

- M Vogelsberger

Vogelsberger, M., et al. 2013, MNRAS, 436, 3031

- M Vogelsberger

Vogelsberger, M., et al. 2012, MNRAS, 425, 3024

- D Sijacki

Sijacki, D., et al. 2012, MNRAS, 380, 877

- G Hinshaw

Hinshaw, G. et al. 2013, ApJS, 208, 19