Conference Paper

Filling the Gap: Fault-Tolerant Updates of On-Satellite Neural Networks Using Vector Quantization

Authors:
  • Merantix Momentum GmbH
Article
Full-text available
Artificial intelligence (AI) is paving the way for a new era of algorithms focusing directly on the information contained in the data, autonomously extracting relevant features for a given application. While the initial paradigm was to have these applications run by a server-hosted processor, recent advances in microelectronics provide hardware accelerators with an efficient ratio between computation and energy consumption, enabling the implementation of AI algorithms “at the edge.” In this way only the meaningful and useful data are transmitted to the end-user, minimizing the required data bandwidth and reducing the latency with respect to the cloud computing model. In recent years, the European Space Agency (ESA) has been promoting the development of disruptive innovative technologies on-board earth observation (EO) missions. In this field, the most advanced experiment to date is the Φ-sat-1, which has demonstrated the potential of AI as a reliable and accurate tool for cloud detection on-board a hyperspectral imaging mission. The activities involved included demonstrating the robustness of the Intel Movidius Myriad 2 hardware accelerator against ionizing radiation, developing the CloudScout segmentation neural network (NN), run on Myriad 2, to identify, classify, and eventually discard on-board the cloudy images, and assessing the innovative HyperScout-2 hyperspectral sensor. This mission represents the first official attempt to successfully run an AI deep convolutional NN (CNN) directly inferencing on a dedicated accelerator on-board a satellite, opening the way for a new era of discovery and commercial applications driven by the deployment of on-board AI.
Conference Paper
Full-text available
The growing numbers of both active satellites and space debris are putting significant strain on Space Situational Awareness (SSA) and collision avoidance activities. One way to address this may be to introduce higher levels of automation. In this work, we discuss one possible implementation of autonomous on-board collision avoidance capabilities for satellites. The system is divided into two blocks: the decision-making algorithm, based on machine-learning techniques, and the manoeuvre design, relying on highly efficient analytical methods. Furthermore, a ground-based source of SSA data is required for the system to operate. The key requirements and development status of each of these elements are discussed. Finally, a proposed CubeSat demonstration mission is briefly presented.
Article
Full-text available
The environmental challenges the world faces nowadays have never been greater or more complex. Global areas covered by forests and urban woodlands are threatened by natural disasters that have increased dramatically during the last decades, in terms of both frequency and magnitude. Large-scale forest fires are one of the most harmful natural hazards affecting climate change and life around the world. Thus, to minimize their impacts on people and nature, the adoption of well-planned, closely coordinated, and effective prevention, early warning, and response approaches is necessary. This paper presents an overview of the optical remote sensing technologies used in early fire warning systems and provides an extensive survey on both flame and smoke detection algorithms employed by each technology. Three types of systems are identified, namely terrestrial, airborne, and spaceborne-based systems, while various models aiming to detect fire occurrences with high accuracy in challenging environments are studied. Finally, the strengths and weaknesses of fire detection systems based on optical remote sensing are discussed, aiming to contribute to future research projects for the development of early warning fire systems.
Article
Full-text available
The capabilities of small satellites have improved significantly in recent years. Specifically, multi-satellite systems have become increasingly popular, since they allow the support of new applications. The development and testing of these multi-satellite systems is a new challenge for engineers and requires the implementation of appropriate development and testing environments. In this paper, ESTNeT, a modular network simulation framework for space–terrestrial systems, is presented. It enables discrete event simulations for the development and testing of communication protocols, as well as mission-based analysis of other satellite system aspects, such as power supply and attitude control. ESTNeT is based on the discrete event simulator OMNeT++ and will be released under an open source license.
Article
Full-text available
Given the increasing number of space-related applications, research in the emerging space industry is becoming more and more attractive. One compelling area of current space research is the design of miniaturized satellites, known as CubeSats, which are enticing because of their numerous applications and low design-and-deployment cost. The new paradigm of connected space through CubeSats makes possible a wide range of applications, such as Earth remote sensing, space exploration, and rural connectivity. CubeSats further provide a complementary connectivity solution to the pervasive Internet of Things (IoT) networks, leading to a globally connected cyber-physical system. This paper presents a holistic overview of various aspects of CubeSat missions and provides a thorough review of the topic from both academic and industrial perspectives. We further present recent advances in the area of CubeSat communications, with an emphasis on constellation-and-coverage issues, channel modeling, modulation and coding, and networking. Finally, we identify several future research directions for CubeSat communications, including Internet of space things, low-power long-range networks, and machine learning for CubeSat resource allocation.
Article
Full-text available
In the past decade deep neural networks (DNNs) have shown state-of-the-art performance on a wide range of complex machine learning tasks. Many of these results have been achieved while growing the size of DNNs, creating a demand for efficient compression and transmission of them. In this work we present DeepCABAC, a universal compression algorithm for DNNs that is based on applying the Context-based Adaptive Binary Arithmetic Coder (CABAC) to the DNN parameters. CABAC was originally designed for the H.264/AVC video coding standard and became the state of the art for the lossless compression part of video compression. DeepCABAC applies a novel quantization scheme that minimizes a rate-distortion function while simultaneously taking the impact of quantization on the DNN performance into account. Experimental results show that DeepCABAC consistently attains higher compression rates than previously proposed coding techniques for DNN compression. For instance, it is able to compress the VGG16 ImageNet model by a factor of 63.6× with no loss of accuracy, thus being able to represent the entire network with merely 9 MB. The source code for encoding and decoding can be found at https://github.com/fraunhoferhhi/DeepCABAC.
Conference Paper
Full-text available
The Attitude Determination and Control System (ADCS) is considered one of the most critical subsystems of a spacecraft, and must be carefully calibrated and monitored to ensure mission success. Many emerging small satellite missions feature large constellations, creating a need for new design philosophies and operational approaches to accommodate the management of many ADCS subsystems simultaneously. Planet currently operates approximately 190 Dove Cubesats for Earth Observation, with only a small team of ADCS engineers and satellite operators responsible for the performance of the entire constellation. Since Planet’s first launches in 2013, on-orbit data and diverse experiences have contributed to the evolution of techniques and tools to support a fleet of this size. Today, Planet’s ADCS engineers and operators rely on automated systems to enable on-orbit calibration, nominal operations, performance monitoring, and anomaly detection. The systems are intended to minimize the need for humans-in-the-loop, but where it is required they are designed to enable agile decision making. This paper shares techniques, insights and lessons learned from calibrating and monitoring ADCS subsystems at scale.
Conference Paper
Full-text available
Northwest Nazarene University (NNU) undergraduate engineering students and faculty designed and built Idaho's first CubeSat, MakerSat-0, which NASA launched into orbit on Nov. 18, 2017 from Vandenberg AFB aboard a Delta II rocket. MakerSat-0 was one of five CubeSats chosen by NASA in 2016 for the ELaNa XIV mission. It is the first in a series of proof-of-concept missions that will demonstrate the advantages of on-orbit manufacturing, assembly, and deployment of CubeSats from the International Space Station (ISS). This project is in collaboration with Made In Space, makers of the ISS 3D printer. For the past nine months, MakerSat-0 has been operating in a sun-synchronous polar orbit with a 97 min period, an altitude of 830 km x 480 km, an inclination of 97.71 degrees, and an LTAN of 13:20. It has already travelled 110 million miles in 3900 orbits and is expected to orbit for at least eight years. MakerSat-0 hosts two onboard experiments: an ionizing radiation particle counter built by Caldwell High School (CHS) students and a 3D printed polymer degradation experiment built by NNU students. Four different 3D printed polymer samples (ABS, Nylon 12, PEI/PC, and PLA) are being exposed to long-term spaceflight and are experiencing ongoing erosion and mass loss due to monoatomic oxygen radicals, outgassing, extreme temperatures, ultraviolet (UV) radiation, solar & cosmic ionizing radiation, and even micrometeor impacts. A novel vibrational cantilever test system was designed by the NNU team to continuously measure fractional mass losses from these polymer samples over a long time period in the harsh space environment. This will determine which materials are adequately robust for future use in 3D printed spacecraft. Early orbital data from this polymer degradation experiment shows that mass loss occurs at different rates from these various polymers, with the most robust (least mass loss) also being the densest material, PLA. Radiation data and satellite health data are analyzed, producing key lessons learned that have already been applied to the upcoming MakerSat-1 mission.
Article
Full-text available
We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions to build lightweight deep neural networks. We introduce two simple global hyper-parameters that efficiently trade off between latency and accuracy. These hyper-parameters allow the model builder to choose the right-sized model for their application based on the constraints of the problem. We present extensive experiments on resource and accuracy tradeoffs and show strong performance compared to other popular models on ImageNet classification. We then demonstrate the effectiveness of MobileNets across a wide range of applications and use cases including object detection, fine-grained classification, face attributes, and large-scale geo-localization.
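As a rough illustration of the depth-wise separable building block described in this abstract, the following PyTorch sketch (not the authors' code; layer sizes and kernel choices are placeholders) factorizes a standard convolution into a depth-wise stage followed by a 1x1 point-wise stage.

    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        # One MobileNet-style block: depth-wise conv followed by a 1x1 point-wise conv
        def __init__(self, in_ch, out_ch, stride=1):
            super().__init__()
            # Depth-wise: one 3x3 filter per input channel (groups=in_ch)
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                       padding=1, groups=in_ch, bias=False)
            self.bn1 = nn.BatchNorm2d(in_ch)
            # Point-wise: 1x1 convolution that mixes channels
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
            self.bn2 = nn.BatchNorm2d(out_ch)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            x = self.relu(self.bn1(self.depthwise(x)))
            return self.relu(self.bn2(self.pointwise(x)))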
Conference Paper
Full-text available
During the last years, nano- and microsatellite technologies have developed rapidly, with thousands of satellites projected to be launched into Low-Earth Orbit (LEO) in the near future. However, the ground segment has not seen a similar development, leaving a gap in the ability to communicate with and operate this increased number of satellites. The advent of huge constellations (e.g. OneWeb, PlanetLabs, Spire, Satellogic, etc.) has made clear the necessity of downloading massive amounts of data from space while simplifying the management of multi-satellite missions. In addition, real-time access to downloaded data will be needed for a number of scenarios (e.g. refugee monitoring, disaster management, environmental surveillance, etc.), which would also foster downstream applications. In this paper a solution to these issues is identified by envisioning a global network of 20 autonomous ground stations operating on different frequency bands (VHF, UHF, S-Band, X-Band) with different protocols and modulations, connected to the Internet and operated remotely. The proposed solution has been developed by Leaf Space srl, an Italian startup based in Milan, in collaboration with the Université de Picardie Jules Verne, and is based on innovative ground station scheduling algorithms and task automation on commercial hardware, in order to provide a reliable, high-quality, and cost-saving alternative to the current state of the art.
Conference Paper
Full-text available
Nowadays, Normalized Difference Vegetation Index (NDVI) time series have been successfully used in research on global environmental change. NDVI time series have proven to be a useful means of indicating drought-related vegetation conditions, due to their real-time coverage across the globe at relatively high spatial resolution. In this paper, we propose a method for detecting rare events in NDVI series. These events are particularly rare and infrequent, which increases the complexity of detecting and analyzing them. The proposed method is based on analyzing the random component with the Jarque-Bera test to verify the presence of rare events and obtain their features (time and amplitude). For validation, we have used a database for regions in northwestern Tunisia. These data come from MODIS for the period from 18 February 2000 to 17 November 2013 at a spatial resolution of 250 m by 250 m.
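A minimal sketch of the statistical step described above, assuming the random (residual) component of the NDVI series has already been extracted; it applies the Jarque-Bera normality test via SciPy and flags residuals that depart from normality as possibly containing a rare event. The significance level and the toy data are placeholders, not the authors' pipeline.

    import numpy as np
    from scipy.stats import jarque_bera

    def flag_rare_event(residual, alpha=0.05):
        # Reject normality of the random component -> candidate rare event in the series
        stat, p_value = jarque_bera(residual)
        return p_value < alpha

    rng = np.random.default_rng(0)
    residual = rng.normal(0.0, 0.02, size=320)   # synthetic residual of a 16-day NDVI composite series
    residual[200] -= 0.4                         # injected abrupt NDVI drop
    print(flag_rare_event(residual))             # True for this toy anomaly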
Article
Full-text available
Recently, convolutional neural networks (CNN) have demonstrated impressive performance in various computer vision tasks. However, high performance hardware is typically indispensable for the application of CNN models due to the high computation complexity, which prohibits their further extensions. In this paper, we propose an efficient framework, namely Quantized CNN, to simultaneously speed-up the computation and reduce the storage and memory overhead of CNN models. Both filter kernels in convolutional layers and weighting matrices in fully-connected layers are quantized, aiming at minimizing the estimation error of each layer's response. Extensive experiments on the ILSVRC-12 benchmark demonstrate 4–6× speed-up and 15–20× compression with merely a one percentage point loss of classification accuracy. With our quantized CNN model, even mobile devices can accurately classify images within one second.
Article
Full-text available
In recent years increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost performance on image recognition tasks. However, the gains in performance have come at the cost of a substantial increase in compute resources, model size, and processing time of the network for training and evaluation. Fixed point implementation of these networks has the potential to alleviate some of the burden of these additional complexities. In this paper, we propose a quantizer design for fixed point implementation of DCNs. We then formulate an optimization problem to identify the optimal fixed point bit-width allocation across DCN layers. We perform experiments on a recently proposed DCN architecture for the CIFAR-10 benchmark that generates a test error of less than 7%. We evaluate the effectiveness of our proposed fixed point bit-width allocation for this DCN. Our experiments show that in comparison to equal bit-width settings, the fixed point DCNs with optimized bit-width allocation offer a >20% reduction in model size without any loss in performance. We also demonstrate that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model. In doing so, we report a new state-of-the-art fixed point performance of 6.78% error rate on the CIFAR-10 benchmark.
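The basic operation this paper optimizes, mapping floating-point values onto a b-bit fixed-point grid with a chosen fractional length, can be sketched as follows. This is an illustrative NumPy quantizer, not the paper's bit-width allocation algorithm, and the bit widths shown are placeholders.

    import numpy as np

    def to_fixed_point(x, total_bits=8, frac_bits=6):
        # Signed fixed-point: total_bits wide, frac_bits of them fractional
        scale = 2 ** frac_bits
        q_min = -(2 ** (total_bits - 1))
        q_max = 2 ** (total_bits - 1) - 1
        q = np.clip(np.round(x * scale), q_min, q_max)
        return q / scale                          # de-quantized value used to simulate the DCN

    w = np.random.randn(4, 4).astype(np.float32)
    w_q = to_fixed_point(w, total_bits=8, frac_bits=6)
    print(np.max(np.abs(w - w_q)))                # rounding error <= 2**-(frac_bits+1) unless clipped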
Conference Paper
Full-text available
Multilayer Neural Networks (MNNs) are commonly trained using gradient descent-based methods, such as BackPropagation (BP). Inference in probabilistic graphical models is often done using variational Bayes methods, such as Expectation Propagation (EP). We show how an EP-based approach can also be used to train deterministic MNNs. Specifically, we approximate the posterior of the weights given the data using a "mean-field" factorized distribution, in an online setting. Using online EP and the central limit theorem we find an analytical approximation to the Bayes update of this posterior, as well as the resulting Bayes estimates of the weights and outputs. Despite a different origin, the resulting algorithm, Expectation BackPropagation (EBP), is very similar to BP in form and efficiency. However, it has several additional advantages: (1) Training is parameter-free, given initial conditions (prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge. (2) The weights can be restricted to have discrete values. This is especially useful for implementing trained MNNs in precision-limited hardware chips, thus improving their speed and energy efficiency by several orders of magnitude. We test the EBP algorithm numerically in eight binary text classification tasks. In all tasks, EBP outperforms: (1) standard BP with the optimal constant learning rate, and (2) the previously reported state of the art. Interestingly, EBP-trained MNNs with binary weights usually perform better than MNNs with continuous (real) weights, if we average the MNN output using the inferred posterior.
Article
Full-text available
As deep nets are increasingly used in applications suited for mobile devices, a fundamental dilemma becomes apparent: the trend in deep learning is to grow models to absorb ever-increasing data set sizes; however mobile devices are designed with very little memory and cannot store such large models. We present a novel network architecture, HashedNets, that exploits inherent redundancy in neural networks to achieve drastic reductions in model sizes. HashedNets uses a low-cost hash function to randomly group connection weights into hash buckets, and all connections within the same hash bucket share a single parameter value. These parameters are tuned to adjust to the HashedNets weight sharing architecture with standard backprop during training. Our hashing procedure introduces no additional memory overhead, and we demonstrate on several benchmark data sets that HashedNets shrink the storage requirements of neural networks substantially while mostly preserving generalization performance.
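The weight-sharing idea can be made concrete with a small sketch: a cheap hash maps every (i, j) connection of a virtual weight matrix to one of K real parameters, so the layer stores only K values. This NumPy reconstruction is illustrative only; the integer mixing used as a stand-in hash and the sizes are assumptions, not the authors' implementation.

    import numpy as np

    def hashed_weight_matrix(real_params, shape, seed=0):
        # Expand K shared parameters into a full (virtual) weight matrix via hashing
        rows, cols = shape
        i, j = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
        bucket = (i * 2654435761 + j * 40503 + seed) % real_params.size
        return real_params[bucket]

    K = 16                                        # number of real (trainable) parameters
    real_params = np.random.randn(K)
    W_virtual = hashed_weight_matrix(real_params, shape=(64, 128))
    print(W_virtual.shape, np.unique(W_virtual).size)   # (64, 128) but at most 16 distinct values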
Article
Full-text available
Training of large-scale deep neural networks is often constrained by the available computational resources. We study the effect of limited precision data representation and computation on neural network training. Within the context of low-precision fixed-point computations, we observe the rounding scheme to play a crucial role in determining the network's behavior during training. Our results show that deep networks can be trained using only 16-bit wide fixed-point number representation when using stochastic rounding, and incur little to no degradation in the classification accuracy. We also demonstrate an energy-efficient hardware accelerator that implements low-precision fixed-point arithmetic with stochastic rounding.
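Stochastic rounding, the ingredient this paper identifies as crucial, rounds a value up or down with probability proportional to its distance from the two neighbouring grid points, so the result is unbiased in expectation. A minimal NumPy sketch, not tied to the paper's hardware accelerator; the grid step is a placeholder.

    import numpy as np

    def stochastic_round(x, step=2 ** -8, rng=np.random.default_rng(0)):
        # Round x to a multiple of `step` such that E[rounded] == x
        scaled = x / step
        floor = np.floor(scaled)
        prob_up = scaled - floor                  # fractional part decides P(round up)
        return (floor + (rng.random(x.shape) < prob_up)) * step

    x = np.full(100_000, 0.3 * 2 ** -8)           # value 30% of the way to the next grid point
    print(stochastic_round(x).mean() / 2 ** -8)   # ~0.3 on average; round-to-nearest would give 0.0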
Article
The advent of powerful edge devices and AI algorithms has already revolutionized many terrestrial applications; however, for both technical and historical reasons, the space industry is still striving to adopt these key enabling technologies in new mission concepts. In this context, the current work evaluates a heterogeneous multi-core system-on-chip processor for use on-board future spacecraft to support novel, computationally demanding digital signal processing and AI functionalities. Given the importance of low power consumption in satellites, we consider the Intel Movidius Myriad2 system-on-chip and focus on SW development and performance aspects. We design a methodology and framework to accommodate efficient partitioning, mapping, parallelization, code optimization, and tuning of complex algorithms. Furthermore, we propose an avionics architecture combining this commercial off-the-shelf chip with a field programmable gate array device to facilitate, among others, interfacing with traditional space instruments via SpaceWire transcoding. We prototype our architecture in the lab targeting vision-based navigation tasks. We implement a representative computer vision pipeline to track the 6D pose of ENVISAT using megapixel images during hypothetical spacecraft proximity operations. Overall, we achieve 2.6 to 4.9 FPS with only 0.8 to 1.1 W on Myriad2, i.e., 10-fold acceleration versus modern rad-hard processors. Based on the results, we assess various benefits of utilizing Myriad2 instead of conventional field programmable gate arrays and CPUs.
Book
This book provides a comprehensive introduction to the OMNeT++ simulation environment and an overview of its ecosystem of ever-growing frameworks, which provide simulation models for diverse communication systems, protocols, and standards. The book covers the most recent advances of the three key points in the OMNeT++ environment: (1) The latest features that are being added to OMNeT++ itself, including improvements in the visualization options, in data processing, etc. (2) A comprehensive description of the current state of development and the work in progress of the main simulation frameworks, covering several aspects of communication such as vehicular, cellular, and sensor networks. (3) The latest advances and novel developments coming from a large research community. The presentation is guided through use cases and examples, always keeping in mind the practical and research purposes of the simulation process. • Includes an introduction to the OMNeT++ simulation framework and its main features; • Gives a comprehensive overview of ongoing research topics that exploits OMNeT++ as the simulation environment; • Provides examples and uses cases focusing on the practical aspects of simulation.
Article
The market for remote sensing space-based applications is fundamentally limited by up- and downlink bandwidth and onboard compute capability for space data handling systems. This article details how the compute capability on these platforms can be vastly increased by leveraging emerging commercial off-the-shelf (COTS) system-on-chip (SoC) technologies. The orders of magnitude increase in processing power can then be applied to consuming data at source rather than on the ground allowing the deployment of value-added applications in space, which consume a tiny fraction of the downlink bandwidth that would be otherwise required. The proposed solution has the potential to revolutionize Earth observation (EO) and other remote sensing applications, reducing the time and cost to deploy new added value services to space by a great extent compared with the state of the art. This article also reports the first results in radiation tolerance and power/performance of these COTS SoCs for space-based applications and maps the trajectory toward low Earth orbit trials and the complete life-cycle for space-based artificial intelligence classifiers on orbital platforms and spacecraft.
Chapter
The rapid development of remote sensing has made it possible to study environmental processes and changes in agriculture and also to provide important assistance in relevant practices, even operationally. This chapter describes the latest developments in remote sensing for precision agriculture, with particular emphasis placed on the use of hyperspectral sensors. This chapter provides practical information regarding the identification of research challenges, limitations, and advantages of different platforms and sensors for precision agriculture. Hyperspectral remote sensing (HRS) is more effective than multispectral remote sensing because it records radiation in narrow contiguous spectral channels reflected from any feature or target. More accurate spectral information retrieved using HRS can be combined with other techniques to retrieve useful information for precision agriculture. The chapter includes information about HRS sensors and also includes a discussion on the advancements and challenges faced by spaceborne satellites during agricultural monitoring. It concludes by summarizing the hurdles faced in agricultural research using hyperspectral data and discussing possible pathways along which relevant research should be directed.
Article
Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning. One aspect of the field receiving considerable attention is efficiently executing deep models in resource-constrained environments, such as mobile or embedded devices. This paper focuses on this problem, and proposes two new compression methods, which jointly leverage weight quantization and distillation of larger teacher networks into smaller student networks. The first method we propose is called quantized distillation and leverages distillation during the training process, by incorporating distillation loss, expressed with respect to the teacher, into the training of a student network whose weights are quantized to a limited set of levels. The second method, differentiable quantization, optimizes the location of quantization points through stochastic gradient descent, to better fit the behavior of the teacher model. We validate both methods through experiments on convolutional and recurrent architectures. We show that quantized shallow students can reach similar accuracy levels to full-precision teacher models, while providing order of magnitude compression, and inference speedup that is linear in the depth reduction. In sum, our results enable DNNs for resource-constrained environments to leverage architecture and accuracy advances developed on more powerful devices.
Article
Research on deep neural networks with discrete parameters and their deployment in embedded systems has been an active and promising topic. Although previous works have successfully reduced precision in inference, transferring both training and inference processes to low-bitwidth integers has not been demonstrated simultaneously. In this work, we develop a new method termed "WAGE" to discretize both training and inference, where weights (W), activations (A), gradients (G) and errors (E) among layers are shifted and linearly constrained to low-bitwidth integers. To perform pure discrete dataflow for fixed-point devices, we further replace batch normalization by a constant scaling layer and simplify other components that are arduous for integer implementation. Improved accuracies can be obtained on multiple datasets, which indicates that WAGE somehow acts as a type of regularization. Empirically, we demonstrate the potential to deploy training in hardware systems such as integer-based deep learning accelerators and neuromorphic chips with comparable accuracy and higher energy efficiency, which is crucial to future AI applications in variable scenarios with transfer and continual learning demands.
Conference Paper
We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state-of-the-art accuracy rating on the ImageNet dataset.
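One function-preserving transformation of the kind described, widening a hidden layer by duplicating units and splitting their outgoing weights, can be checked numerically on a toy ReLU MLP. This is an illustrative reconstruction of the widening idea under assumed layer sizes, not the authors' code.

    import numpy as np

    def widen_hidden_layer(W1, b1, W2, new_width, rng=np.random.default_rng(0)):
        # Copy randomly chosen hidden units and divide their outgoing weights among the copies
        old_width = W1.shape[1]
        mapping = np.concatenate([np.arange(old_width),
                                  rng.integers(0, old_width, new_width - old_width)])
        counts = np.bincount(mapping, minlength=old_width)
        W1_new, b1_new = W1[:, mapping], b1[mapping]
        W2_new = W2[mapping, :] / counts[mapping][:, None]
        return W1_new, b1_new, W2_new

    relu = lambda z: np.maximum(z, 0.0)
    x = np.random.randn(5, 8)
    W1, b1, W2 = np.random.randn(8, 16), np.random.randn(16), np.random.randn(16, 4)
    W1n, b1n, W2n = widen_hidden_layer(W1, b1, W2, new_width=24)
    print(np.allclose(relu(x @ W1 + b1) @ W2, relu(x @ W1n + b1n) @ W2n))   # True: same function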
Conference Paper
We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. These models deliver impressive accuracy but each image evaluation requires millions of floating point operations, making their deployment on smartphones and Internet-scale clusters problematic. The computation is dominated by the convolution operations in the lower layers of the model. We exploit the linear structure present within the convolutional filters to derive approximations that significantly reduce the required computation. Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2x, while keeping the accuracy within 1% of the original model.
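The "linear structure" exploited here can be illustrated with a rank-k truncated SVD of a flattened filter bank, replacing one expensive linear map with two cheaper ones; the paper itself uses more elaborate tensor decompositions plus fine-tuning, so this NumPy snippet is only a sketch with placeholder sizes.

    import numpy as np

    def low_rank_filters(W, rank):
        # Approximate a (num_filters x fan_in) filter matrix by a rank-`rank` factorization
        U, S, Vt = np.linalg.svd(W, full_matrices=False)
        A = U[:, :rank] * S[:rank]        # num_filters x rank
        B = Vt[:rank, :]                  # rank x fan_in
        return A, B                       # apply as A @ (B @ x): two small matmuls instead of one large one

    W = np.random.randn(256, 3 * 3 * 128)         # e.g. 256 filters over 3x3x128 patches
    A, B = low_rank_filters(W, rank=32)
    rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
    print(A.size + B.size, "params instead of", W.size, "; relative error", round(rel_err, 3))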
Technical Report
Deep convolutional neural networks (CNNs) have become the most promising method for object recognition, repeatedly demonstrating record-breaking results for image classification and object detection in recent years. However, a very deep CNN generally involves many layers with millions of parameters, making the storage of the network model extremely large. This prohibits the usage of deep CNNs on resource-limited hardware, especially cell phones or other embedded devices. In this paper, we tackle this model storage issue by investigating information-theoretical vector quantization methods for compressing the parameters of CNNs. In particular, we have found that, in terms of compressing the most storage-demanding densely connected layers, vector quantization methods have a clear gain over existing matrix factorization methods. Simply applying k-means clustering to the weights or conducting product quantization can lead to a very good balance between model size and recognition accuracy. For the 1000-category classification task in the ImageNet challenge, we are able to achieve 16-24 times compression of the network with only 1% loss of classification accuracy using the state-of-the-art CNN.
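Since vector quantization of network weights is the theme of the present paper, the k-means variant mentioned in this abstract is worth sketching: scalar weights of a layer are clustered, and each weight is then stored as a small codebook index. This uses scikit-learn's KMeans as a stand-in and is not the authors' implementation; the layer size and codebook size are placeholders.

    import numpy as np
    from sklearn.cluster import KMeans

    def kmeans_quantize(weights, n_clusters=16):
        # Cluster the scalar weights; keep a tiny codebook plus per-weight indices
        flat = weights.reshape(-1, 1)
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)
        codebook = km.cluster_centers_.ravel()            # n_clusters centroid values
        indices = km.labels_.astype(np.uint8)             # log2(n_clusters) bits per weight
        return codebook, indices.reshape(weights.shape)

    def dequantize(codebook, indices):
        return codebook[indices]

    W = np.random.randn(256, 128).astype(np.float32)
    codebook, idx = kmeans_quantize(W, n_clusters=16)     # 4 bits per weight instead of 32
    print(float(np.mean((W - dequantize(codebook, idx)) ** 2)))   # reconstruction MSE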
Conference Paper
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we introduce a three-stage pipeline, consisting of pruning, quantization, and Huffman encoding, that works together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy. Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce weight sharing; finally, we apply Huffman encoding. After the first two steps we retrain the network to fine-tune the remaining connections and the quantized centroids. Pruning reduces the number of connections by 9x to 13x; quantization then reduces the number of bits that represent each connection from 32 to 5. On the ImageNet dataset, our method reduced the storage required by AlexNet by 35x, from 240 MB to 6.9 MB, without loss of accuracy. Our method reduced the size of VGG16 by 49x, from 552 MB to 11.3 MB, again with no loss of accuracy. This allows fitting the model into on-chip SRAM cache, which has 180x less access energy, rather than off-chip DRAM memory.
Conference Paper
A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets. Caruana and his collaborators have shown that it is possible to compress the knowledge in an ensemble into a single model which is much easier to deploy and we develop this approach further using a different compression technique. We achieve some surprising results on MNIST and we show that we can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model. We also introduce a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse. Unlike a mixture of experts, these specialist models can be trained rapidly and in parallel.
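The distillation objective described here is commonly written as a blend of the usual hard-label cross-entropy and a KL term against the teacher's temperature-softened outputs. The PyTorch sketch below follows that standard formulation; the temperature and mixing weight are placeholders, not values from the paper.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
        # Hard-label cross-entropy plus KL divergence to the softened teacher distribution
        hard = F.cross_entropy(student_logits, labels)
        soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                        F.softmax(teacher_logits / T, dim=1),
                        reduction="batchmean") * (T * T)   # T^2 keeps gradient magnitudes comparable
        return alpha * hard + (1.0 - alpha) * soft

    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    distillation_loss(student_logits, teacher_logits, labels).backward()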
Article
Traditionally, the space industry produced large and sophisticated spacecraft handcrafted by large teams of engineers and budgets within the reach of only a few large government-backed institutions. However, over the last decade, the space industry experienced an increased interest towards smaller missions and recent advances in commercial-off-the-shelf (COTS) technology miniaturization spurred the development of small spacecraft missions based on the CubeSat standard. CubeSats were initially envisioned primarily as educational tools or low cost technology demonstration platforms that could be developed and launched within one or two years. Recently, however, more advanced CubeSat missions have been developed and proposed, indicating that CubeSats clearly started to transition from being solely educational and technology demonstration platforms to offer opportunities for low-cost real science missions with potential high value in terms of science return and commercial revenue. Despite the significant progress made in CubeSat research and development over the last decade, some fundamental questions still habitually arise about the CubeSat capabilities, limitations, and ultimately about their scientific and commercial value. The main objective of this review is to evaluate the state of the art CubeSat capabilities with a special focus on advanced scientific missions and a goal of assessing the potential of CubeSat platforms as capable spacecraft. A total of over 1200 launched and proposed missions have been analyzed from various sources including peer-reviewed journal publications, conference proceedings, mission webpages as well as other publicly available satellite databases and about 130 relatively high performance missions were downselected and categorized into six groups based on the primary mission objectives including “Earth Science and Spaceborne Applications”, “Deep Space Exploration”, “Heliophysics: Space Weather”, “Astrophysics”, “Spaceborne In Situ Laboratory”, and “Technology Demonstration” for in-detail analysis. Additionally, the evolution of CubeSat enabling technologies are surveyed for evaluating the current technology state of the art as well as identifying potential areas that will benefit the most from further technology developments for enabling high performance science missions based on CubeSat platforms.
Conference Paper
We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks. In Binary-Weight-Networks, the filters are approximated with binary values resulting in 32× memory savings. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This results in 58× faster convolutional operations (in terms of number of the high precision operations) and 32× memory savings. XNOR-Nets offer the possibility of running state-of-the-art networks on CPUs (rather than GPUs) in real-time. Our binary networks are simple, accurate, efficient, and work on challenging visual tasks. We evaluate our approach on the ImageNet classification task. The classification accuracy with a Binary-Weight-Network version of AlexNet is the same as the full-precision AlexNet. We compare our method with recent network binarization methods, BinaryConnect and BinaryNets, and outperform these methods by large margins on ImageNet, more than 16% in top-1 accuracy. Our code is available at: http://allenai.org/plato/xnornet.
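The Binary-Weight-Network approximation replaces each real-valued filter W by alpha * sign(W), with alpha the mean absolute value of the filter (the closed-form least-squares choice). A minimal NumPy sketch of that per-filter step, with an arbitrary filter shape as a placeholder:

    import numpy as np

    def binarize_filter(W):
        # Approximate W ~= alpha * B with B in {-1, +1} and alpha = mean(|W|)
        alpha = np.mean(np.abs(W))
        B = np.where(W >= 0, 1.0, -1.0)
        return alpha, B

    W = np.random.randn(3, 3, 64)                 # one 3x3x64 filter
    alpha, B = binarize_filter(W)
    print(np.linalg.norm(W - alpha * B) / np.linalg.norm(W))   # relative approximation error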
Article
High computational complexity hinders the widespread usage of Convolutional Neural Networks (CNNs), especially in mobile devices. Hardware accelerators are arguably the most promising approach for reducing both execution time and power consumption. One of the most important steps in accelerator development is hardware-oriented model approximation. In this paper we present Ristretto, a model approximation framework that analyzes a given CNN with respect to numerical resolution used in representing weights and outputs of convolutional and fully connected layers. Ristretto can condense models by using fixed point arithmetic and representation instead of floating point. Moreover, Ristretto fine-tunes the resulting fixed point network. Given a maximum error tolerance of 1%, Ristretto can successfully condense CaffeNet and SqueezeNet to 8-bit. The code for Ristretto is available.
Article
Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, 8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
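A minimal residual block of the kind the abstract describes, written in PyTorch with an identity shortcut; this is an illustrative basic block with placeholder channel counts, not the exact ImageNet architecture.

    import torch.nn as nn

    class BasicResidualBlock(nn.Module):
        # y = F(x) + x: the stacked layers learn a residual F instead of an unreferenced mapping
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return self.relu(out + x)             # identity shortcut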
Article
Deep Neural Networks (DNN) have achieved state-of-the-art results in a wide range of tasks, with the best results obtained with large training sets and large models. In the past, GPUs enabled these breakthroughs because of their greater computational speed. In the future, faster computation at both training and test time is likely to be crucial for further progress and for consumer applications on low-power devices. As a result, there is much interest in research and development of dedicated hardware for Deep Learning (DL). Binary weights, i.e., weights which are constrained to only two possible values (e.g. -1 or 1), would bring great benefits to specialized DL hardware by replacing many multiply-accumulate operations by simple accumulations, as multipliers are the most space and power-hungry components of the digital implementation of neural networks. We introduce BinaryConnect, a method which consists in training a DNN with binary weights during the forward and backward propagations, while retaining precision of the stored weights in which gradients are accumulated. Like other dropout schemes, we show that BinaryConnect acts as regularizer and we obtain near state-of-the-art results with BinaryConnect on the permutation-invariant MNIST, CIFAR-10 and SVHN.
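The training scheme described, using binary weights in the forward and backward passes while accumulating gradient updates in full-precision weights, can be condensed into one update step. The NumPy sketch below is illustrative (deterministic binarization, a toy quadratic loss), not the authors' implementation.

    import numpy as np

    def binaryconnect_step(w_real, grad_fn, lr=0.01):
        # Binarize for the pass, but apply the update to the stored real-valued weights
        w_bin = np.where(w_real >= 0, 1.0, -1.0)
        grad = grad_fn(w_bin)                           # gradient evaluated with binary weights
        return np.clip(w_real - lr * grad, -1.0, 1.0)   # clipping keeps real weights in [-1, 1]

    target = np.random.randn(10)
    grad_fn = lambda w: 2.0 * (w - target)              # gradient of a toy quadratic loss
    w = np.random.uniform(-1, 1, 10)
    for _ in range(100):
        w = binaryconnect_step(w, grad_fn)
    print(np.where(w >= 0, 1.0, -1.0))                  # deployed binary weights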
Article
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; as a result, training cannot improve the architecture. To address these limitations, we describe a method to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy, by learning only the important connections. Our method prunes redundant connections using a three-step method. First, we train the network to learn which connections are important. Next, we prune the unimportant connections. Finally, we retrain the network to fine tune the weights of the remaining connections. On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9x, from 61 million to 6.7 million, without incurring accuracy loss. Similar experiments with VGG16 found that the network as a whole can be reduced 6.8x just by pruning the fully-connected layers, again with no loss of accuracy.
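The "train, prune, retrain" loop above hinges on a simple magnitude criterion for deciding which connections are unimportant. A sketch of that pruning step in NumPy; the sparsity target is a placeholder, and retraining with the returned mask is left out.

    import numpy as np

    def magnitude_prune(W, sparsity=0.9):
        # Zero out the smallest-magnitude weights until roughly `sparsity` of them are removed
        k = int(round(sparsity * W.size))
        threshold = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
        mask = np.abs(W) > threshold
        return W * mask, mask                     # reuse the mask to keep pruned weights at zero while retraining

    W = np.random.randn(1000, 1000).astype(np.float32)
    W_pruned, mask = magnitude_prune(W, sparsity=0.9)
    print(1.0 - mask.mean())                      # fraction of connections removed (~0.9)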
Article
The theory of quantization began in 1898 with Sheppard's study of roundoff error, but its importance for modulation and analog-to-digital conversion was first recognized during the early development of pulse code modulation systems, especially in the 1948 paper of Oliver, Pierce, and Shannon. Also in 1948, Bennett published the first high resolution analysis of quantization and an exact analysis of quantization noise for Gaussian processes, and Shannon published the beginnings of rate-distortion theory, which would provide a theory for quantization as analog-to-digital conversion and as data compression. Beginning with these three papers of fifty years ago, we trace the history of quantization from its origins through this decade, and we survey the fundamentals of the theory and many of the popular and promising techniques for quantization.