Fig 6 - uploaded by Stephen France
Swiss Roll t-SNE Animation: Cut

Source publication
Article
This paper gives a review and synthesis of methods of evaluating dimensionality reduction techniques. Particular attention is paid to rank-order neighborhood evaluation metrics. A framework is created for exploring dimensionality reduction quality through visualization. An associated toolkit is implemented in R. The toolkit includes scatterplots, h...
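
The rank-order neighborhood metrics mentioned in the abstract compare each point's nearest neighbors before and after dimensionality reduction. A minimal Python sketch of one such measure, a k-nearest-neighbor agreement rate, is given here for illustration only. The function names and the toy data are assumptions and are not taken from the paper's R toolkit.

```python
import math


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to p; ties go to the lower index.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def neighborhood_agreement(high, low, k):
    """Mean fraction of k-nearest neighbors shared by two configurations."""
    if len(high) != len(low):
        raise ValueError("both configurations must contain the same points")
    if not 1 <= k < len(high):
        raise ValueError("k must satisfy 1 <= k < number of points")
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    shared = sum(len(a & b) for a, b in zip(nn_high, nn_low))
    return shared / (k * len(high))


# Toy data (invented): four collinear points in 3-D and two 1-D embeddings.
high_dim = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
faithful = [(0,), (1,), (2,), (3,)]   # preserves the ordering along the line
scrambled = [(0,), (3,), (1,), (2,)]  # breaks every nearest-neighbor relation

good_score = neighborhood_agreement(high_dim, faithful, k=1)
bad_score = neighborhood_agreement(high_dim, scrambled, k=1)
```

A score of 1 means every k-neighborhood is preserved and 0 means none is. Computing the score over a range of k values gives the kind of curve that can be inspected when tuning a parameter.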

Context in source publication

Context 1
... example of a QVisVis animation for examining parameter tuning is given in Figure 6. Here, t-SNE solutions for k = 1 . . . ...

Citations

... 3.2. Python Implementation of EvoMap. Different mapping methods can be superior in different empirical contexts and for different tasks (France and Akkucuk 2021). Therefore, we designed EvoMap independent of a particular static mapping method (the choice of which only determines the specification of C_static, its corresponding component in the gradient, and potentially its optimization technique). ...
Article
A common element of market structure analysis is the spatial representation of firms’ competitive positions on maps. Such maps typically capture static snapshots in time. Yet, competitive positions tend to change. Embedded in such changes are firms’ trajectories, that is, the series of changes in firms’ positions over time relative to all other firms in a market. Identifying these trajectories contributes to market structure analysis by providing a forward-looking perspective on competition, revealing firms’ (re)positioning strategies and indicating strategy effectiveness. To unlock these insights, we propose EvoMap, a novel dynamic mapping framework that identifies firms’ trajectories from high-frequency and potentially noisy data. We validate EvoMap via extensive simulations and apply it empirically to study the trajectories of more than 1,000 publicly listed firms over 20 years. We find substantial changes in several firms’ positioning strategies, including Apple, Walmart, and Capital One. Because EvoMap accommodates a wide range of mapping methods, analysts can easily apply it in other empirical settings and to data from various sources.
... Specifically, DR methods generically produce distortions in their representations of data, and these distortions are inhomogeneous across a representation [30, 40, 43-47]; are often stochastic and non-linear, meaning that the robustness and reproducibility of results is hard to assess [41]; and often require user specification of hyperparameters, where this specification is often based on heuristics rather than quantitative principles [10, 48-50]. Addressing these issues provides the motivation for this work, as recovering the ability to separate signal and noise in DR output is essential for their utilization in quantitative analyses. ...
... Others have addressed the problem of DR quality assessment: work has been done to provide heuristic guidelines on how to appropriately use DR algorithms [10, 48-50] and to make improvements to the algorithms themselves [51-57]. Several efforts to characterize the quality of DR methods have been pursued [41, 46, 58], which can roughly be categorized as being global [30, 58-65] or local [29, 45, 46, 66, 67] in scope, and either based on preserving distances [65], neighborhoods [30, 46, 58-60, 62, 68, 69], or topology [64, 70, 71], but in all cases they attempt to summarize the extent to which a given DR algorithm preserves some aspect of the original data's structure. ...
Article
Single-cell “omics”-based measurements are often high dimensional so that dimensionality reduction (DR) algorithms are necessary for data visualization and analysis. The lack of methods for separating signal from noise in DR outputs has limited their utility in generating data-driven discoveries in single-cell data. In this work we present EMBEDR, which assesses the output of any DR algorithm to distinguish evidence of structure from algorithm-induced noise in DR outputs. We apply EMBEDR to DR-generated representations of single-cell omics data of several modalities to show where they visually show real—not spurious—structure. EMBEDR generates a “p” value for each sample, allowing for direct comparisons of DR algorithms and facilitating optimization of algorithm hyperparameters. We show that the scale of a sample’s neighborhood can thus be determined and used to generate a novel “cell-wise optimal” embedding. EMBEDR is available as a Python package for immediate use.
... Moreover, these data can present some outliers and abnormalities [13]. To deal with these situations, dimensionality reduction algorithms have emerged as a successful alternative [14]. ...
Article
The detection and classification of heavy metals is a growing need to guarantee the quality of process water in different industries. However, the official methodologies to evaluate the presence of these contaminants require sample pre-processing, making them time-consuming and expensive; these elements do not allow online monitoring. For this reason, new technologies are required for online monitoring and evaluation. In this work, a new methodology is presented for the detection and classification of different heavy metal ions such as As, Pb, and Cd. Commercial graphite sensors modified with 2D molybdenite were used, applying the electroanalytical technique of square wave voltammetry. Subsequently, signal processing based on pattern recognition and machine learning methods was carried out. This classification methodology includes the following steps: data display and arrangement, dimensionality reduction through the t-distributed stochastic neighbor embedding (t-SNE) method, which serves as feature extraction, and the support vector machines (SVM) method as a classifier. The validation is carried out with a data set of 118 aqueous samples. Leave-one-out cross-validation (LOOCV) was used to obtain classification accuracy. The results showed a classification accuracy of 98.31%, with only two errors in the experimental validation with this data set. It is concluded that this methodology is a useful tool for detecting the presence of these ions in aqueous samples with MoS2-2D.
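
The leave-one-out validation and the accuracy figure quoted in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: a 1-nearest-neighbor rule stands in for the SVM, the 2-D points are invented stand-ins for t-SNE coordinates, and all function names are hypothetical.

```python
import math


def loocv_accuracy(features, labels, classify):
    """Leave-one-out CV: hold out each sample once and train on the rest."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


def nearest_neighbor(train_x, train_y, query):
    """Stand-in classifier (1-NN); the abstract describes an SVM instead."""
    best = min(range(len(train_x)), key=lambda j: math.dist(train_x[j], query))
    return train_y[best]


# Invented 2-D features playing the role of t-SNE coordinates.
toy_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
toy_y = ["As", "As", "As", "Pb", "Pb", "Pb"]
toy_accuracy = loocv_accuracy(toy_x, toy_y, nearest_neighbor)

# Arithmetic behind the reported figure: 2 errors among 118 held-out samples.
reported_accuracy = round(100 * (118 - 2) / 118, 2)
```

With 118 samples, leave-one-out validation fits the classifier 118 times, so two misclassified samples correspond to 116/118, which rounds to the 98.31% stated above.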
Article
Although popularly used in big-data analytics, dimensionality reduction is a complex, black-box technique whose outcome is difficult to interpret and evaluate. In recent years, a number of quantitative and visual methods have been proposed for analyzing low-dimensional embeddings. On the one hand, quantitative methods associate numeric identifiers to qualitative characteristics of these embeddings; and, on the other hand, visual techniques allow users to interactively explore these embeddings and make decisions. However, in the former case, users do not have control over the analysis, while in the latter case, assessment decisions are entirely dependent on the users' perception and expertise. In order to bridge the gap between the two, in this article, we present VisExPreS, a visual interactive toolkit that enables a user-driven assessment of low-dimensional embeddings. VisExPreS is based on three novel techniques, namely PG-LAPS, PG-GAPS, and RepSubset, that generate interpretable explanations of the preserved local and global structures in embeddings. In the first two techniques, the VisExPreS system proactively guides users during every step of the analysis. We demonstrate the utility of VisExPreS in interpreting, analyzing, and evaluating embeddings from different dimensionality reduction algorithms using multiple case studies and an extensive user study.