Lists of selected DM-only features for the different targets after applying GSFIA (algorithm 1).

Source publication
Preprint
In this paper we study the applicability of a set of supervised machine learning (ML) models specifically trained to infer observation-related properties of the baryonic component (stars and gas) from a set of features of dark matter only cluster-size halos. The training set is built from THE THREE HUNDRED project which consists of a series of zoomed...

Contexts in source publication

Context 1
... in red colour are the reduced set of features that will be considered for further analysis. These features are summarised in Table 1. ...
Context 2
... shown in Table 1, we expect the selected variables to generally come from different correlation blocks, as shown in Fig. 1. This is expected, since variables within the same block are correlated, so once the algorithm picks one feature it skips the remaining variables that carry the same information. ...
Context 3
... there were a clear redshift dependence on any target variable, the scale factor feature would show a higher contribution. However, as shown in Table 1 and Fig. 4, the scale factor contributes only weakly to the normalised loss function L. Furthermore, we have to highlight that although we have used Random Forest for the GSFIA, other Machine Learning algorithms might also be used. However, GSFIA is computationally expensive given the fact that its computing time increases with the number of features D as O(D²). ...
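The greedy sequential feature introduction described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the function names (`gsfia`, `normalised_mse`) are hypothetical, and for brevity an ordinary least-squares fit stands in for the Random Forest regressor used in the paper. The O(D²) cost is visible in the nested loop: each of up to D selection rounds evaluates up to D candidate features.

```python
import numpy as np

def normalised_mse(X, y):
    """Fit ordinary least squares (stand-in for Random Forest) and return
    MSE(model) / MSE(mean predictor), i.e. the normalised loss L."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return np.mean(resid**2) / np.var(y)

def gsfia(X, y, max_features=3):
    """Greedy sequential feature introduction: at each round, add the feature
    that most reduces the normalised loss.  Up to D rounds over up to D
    candidates each -> O(D^2) model fits for D features."""
    selected, remaining, losses = [], list(range(X.shape[1])), []
    while remaining and len(selected) < max_features:
        trials = [(normalised_mse(X[:, selected + [j]], y), j) for j in remaining]
        best_loss, best_j = min(trials)
        selected.append(best_j)
        remaining.remove(best_j)
        losses.append(best_loss)
    return selected, losses

# Toy data: features 0 and 1 form a correlated block, target depends on 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=500)   # near-duplicate of feature 0
y = 2 * X[:, 0] + X[:, 2]
sel, losses = gsfia(X, y, max_features=2)
```

On this toy data the algorithm picks one member of the correlated block (feature 0 or 1) and then skips its near-duplicate in favour of feature 2, mirroring the behaviour described for the correlation blocks of Fig. 1.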
Context 4
... order to determine the accuracy of our ML models, we have trained our 3 models on the dataset composed of all features and on the dataset with the reduced set of features summarised in Table 1 using the experimental setup described in the previous section. The average performance of the models is shown in Fig. 5. ...
Context 5
... dashed lines represent the average value of the normalised MSE for 10 different k-folds and error bars correspond to the standard deviation. The selected features for each target are highlighted in red and shown in Table 1. ... the worst for all targets even when all features are considered. After the previous analysis, we can conclude that XGBoost gives the most accurate model predictions. ...
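The evaluation protocol behind Fig. 5 (mean and standard deviation of the normalised MSE over 10 folds) can be sketched as below. This is a hedged illustration: the helper name `kfold_normalised_mse` is hypothetical, and a least-squares model stands in for the paper's three regressors.

```python
import numpy as np

def kfold_normalised_mse(X, y, k=10, seed=0):
    """Average normalised MSE (MSE divided by the target variance) over k
    folds; the mean and std correspond to the dashed lines and error bars
    of Fig. 5.  A least-squares fit stands in for the trained ML model."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        A_tr = np.column_stack([X[train], np.ones(len(train))])
        coef, *_ = np.linalg.lstsq(A_tr, y[train], rcond=None)
        A_te = np.column_stack([X[test], np.ones(len(test))])
        mse = np.mean((y[test] - A_te @ coef) ** 2)
        scores.append(mse / np.var(y[test]))
    return np.mean(scores), np.std(scores)

# Toy data with a small noise floor: the normalised MSE should be well below 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
mean_loss, std_loss = kfold_normalised_mse(X, y, k=10)
```

Normalising by the target variance makes scores comparable across targets: a value of 1 matches a mean-only predictor, and smaller is better.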
Context 6
... is important to note that the scatter of the scaling law for The300 simulations is generally larger when comparing it with the values shown in Table 2, where the scatter (standard deviation of the relative difference) is reduced by a factor of 0.5 for the gas temperature, 0.3 for Y_X and 0.45 for Y_SZ. Moreover, the most relevant variables for each gas property presented in Table 1 can be used for finding analytical expressions for scaling laws with a reduced MSE using genetic algorithms (Wadekar et al., 2022). ...
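The scatter metric used in this comparison is stated in the text: the standard deviation of the relative difference between predicted and true values. A minimal sketch (the function name `relative_scatter` is illustrative, not from the paper):

```python
import numpy as np

def relative_scatter(pred, true):
    """Scatter as defined in the text: standard deviation of the relative
    difference (pred - true) / true."""
    pred, true = np.asarray(pred, dtype=float), np.asarray(true, dtype=float)
    return np.std((pred - true) / true)

# A perfect prediction has zero scatter; +/-10% errors give scatter below 0.1.
true = np.array([1.0, 2.0, 4.0])
pred = np.array([1.1, 2.0, 3.6])
s = relative_scatter(pred, true)
```

Under this definition, "reduced by a factor of 0.5" means the ML model's scatter is half that of the corresponding scaling law.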
Context 7
... the parameter α_T = 0.3 cannot be ignored. This indicates that the evolution of T_gas is relevant, as it can also be appreciated in Table 1, where the scale factor a(24) is the second most important variable, reducing the normalised loss function L from 1 to 0.6. ...
Context 8
... trained models and data products for MDPL2, UNITSIM2048 and UNITSIM4096 are publicly available at https://github.com/The300th/DarkML. Table A1. The feature variables used in this text from the R catalogue. ...