Model selected based on maximizing each operation strength independently.

Source publication
Preprint
Differentiable neural architecture search (NAS) has attracted significant attention in recent years due to its ability to quickly discover promising architectures of deep neural networks even in very large search spaces. Despite its success, DARTS lacks robustness in certain cases, e.g. it may degenerate to trivial architectures with excessive para...

Contexts in source publication

Context 1
... the original DARTS α score is weakly and inversely correlated with the oracle scores, further supporting arguments in prior work that it is not an effective operation scoring method. Table 1 shows which subnetwork is found with each of our seven operation scoring functions. As expected, best-acc chooses the best subnetwork from the NAS-Bench-201 supernet. ...
Context 2
... iteration i, only edge i is discretized; then the scores for all operations on the remaining edges are computed and correlated against best-acc. Table A1: Comparison to previous work on the ImageNet classification task. We performed our search using the CIFAR-10 dataset as described in Section 5, similarly to previous work. ...
Context 3
... A2 (at the end of this Appendix) shows all operation scores at iteration 0. This data was used to compute Spearman-ρ in Figure 2. Note that we compute Spearman-ρ per edge and average over all edges; this summarizes how well each score tracks our "oracle" best-acc score. Table A1 shows the ImageNet classification accuracy for architectures searched on CIFAR-10. Our Zero-Cost-PT random algorithm (from Table 5) is able to find architectures with comparable accuracy much faster than previous work. ...
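The per-edge Spearman-ρ averaging described in Context 3 could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`average_ranks`, `spearman_rho`, `mean_per_edge_spearman`), the `(num_edges, num_ops)` array layout, and the toy score values are all assumptions made for the example.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    rx = rx - rx.mean()
    ry = ry - ry.mean()
    denom = np.sqrt((rx ** 2).sum() * (ry ** 2).sum())
    # Undefined when one side is constant (all ranks tied).
    return float((rx * ry).sum() / denom) if denom > 0 else float("nan")

def mean_per_edge_spearman(scores, oracle):
    """Correlate each edge's operation scores with the oracle scores,
    then average the per-edge coefficients.

    Both inputs are assumed to have shape (num_edges, num_ops).
    """
    rhos = [spearman_rho(s, o) for s, o in zip(scores, oracle)]
    return float(np.mean(rhos))

# Hypothetical toy data: 2 edges, 5 candidate operations per edge.
oracle = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                   [0.9, 0.7, 0.5, 0.3, 0.1]])
same_order = oracle * 10.0        # same ranking on every edge
reversed_order = -oracle          # opposite ranking on every edge
half_right = np.stack([oracle[0], -oracle[1]])  # one edge agrees, one disagrees
```

Because only ranks matter, rescaling a score leaves its correlation with the oracle unchanged, while a score that agrees on one edge and disagrees on another averages toward zero.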
