Article

FDI(R) for satellites: How to deal with high availability and robustness in the space domain?


Abstract

The European leader for satellite systems and at the forefront of orbital infrastructures, Thales Alenia Space, is a joint venture between Thales (67%) and Finmeccanica (33%) and forms with Telespazio a Space Alliance. Thales Alenia Space is a worldwide reference in telecoms, radar and optical Earth observation, defence and security, navigation and science. It has 11 industrial sites in 4 European countries (France, Italy, Spain and Belgium) with over 7200 employees worldwide. Satellite evolution and the wish to design more autonomous missions call for an enhanced satellite architecture and special attention to fault management (i.e., Fault Detection, Isolation and Recovery, or FDIR, in space). Nevertheless, the constraints on FDIR techniques and strategies remain the same as for standard missions: robustness, reactive detection, quick isolation/identification and validation. This paper gives an introduction to Fault Tolerance (FT) in the space domain and some principles for the coming FT architectures. The current context of FDIR is presented by describing the approach implemented on telecommunication satellites and, more precisely, on one of the most FDIR-sensitive subsystems: the AOCS (Attitude and Orbit Control System). Following the current state of FDIR in the space domain, some perspectives are given, such as a centralized distributed FDIR strategy for the next generation of autonomous satellites, as well as some research tracks and hybrid diagnosis.


... Approaches in Space Systems. Generally speaking, FDIR systems act as supervisory controllers and ensure that the mission objectives and dependability/safety requirements are met, so that spacecraft (S/C) are protected from failures leading to a service deterioration or, even worse, to mission loss [127,128]. The survival of the spacecraft generally has priority over its service availability. ...
... FDIR system conception, design, implementation, verification, and validation are very complex tasks [127]. They strongly depend on both system-level and operational requirements (such as the S/C operational modes and mission phases [128]). Moreover, there is also a strong connection between FDIR systems and S/C on-board autonomy, which is becoming the keypoint in the design of future space missions [130,131]. ...
... Today's rapid progress in on-board resources (i.e., memories and computational power) is actually fostering the transfer of FDIR functions from the ground to the flight segment, and thus the enhancement of on-board autonomous failure management capabilities [131,132]. This section describes innovative approaches and solutions for on-board FDIR systems that are likely to be implemented within a short/medium term (or at least industrial efforts and interests driven by the space agencies would prove this statement [137,138]) in order to address the limitations of currently used FDIR solutions [127,128] and the needs of future space missions. First, we discuss the evolution of FDIR systems in the upcoming years and highlight all the limitations of currently used FDIR systems and the needs of future space missions. ...
Chapter
The following chapter gives an overview on modern techniques for guidance, navigation, and control (GNC). In particular, an overview of artificial intelligence (AI) techniques is provided in light of a tailored application to the space domain. Thanks to their enormous success in a great variety of applications and fields, modern AI techniques can be found in almost every aspect of science and engineering as well as everyday life. AI enables the automation of tasks previously limited to humans, even surpassing human performance on many tasks. Consequently, the terms AI, machine learning, and deep learning are nowadays ubiquitous and are often used interchangeably to describe computer systems which are designed to act in an intelligent way. Among the modern applications, a thorough description of innovative methods for GNC failure (or fault) detection, isolation, and recovery is presented, highlighting the latest novelties. Finally, the emerging topic of CubeSats and nanosatellites, in general, is treated by underlining the peculiar challenges that such missions pose.
... interwoven with all other features of the system, and this is one of the main causes of FDIR system design complexity [1,2]. The European Space Agency (ESA) has recognized the system-level role of the FDIR. ...
... FDIR systems act as supervisory controllers, keeping the spacecraft behavior away from undesired scenarios. FDIR systems basically feature the following capabilities [1,2]: ...
... FDIR requirements have to be scaled and deployed between the flight and the ground segments according to some factors, such as mission type and objectives, spacecraft orbit and ground visibility profile, operations concepts, communication bandwidth, and latency [1][2][3]. Moreover, important constraints (e.g., robustness, reactive detection, quick isolation/identification, and limited on-board resources [central processing unit (CPU) and memory]) on FDIR solutions have to be taken into account. ...
Chapter
This chapter presents technical solutions and industrial processes used by the Space Industry to design, develop, test, and operate health (or failure) management systems, which are needed to devise and implement space missions with the required levels of dependability and safety. The overall chapter is inspired by Failure (or Fault) Detection, Isolation and Recovery (FDIR) systems designed for European Space Agency missions; however, the presentation is maintained at a proper level of detail so that its contents are in line with the FDIR practices adopted by other space agencies.
... Failures and faults are usually deployed on a set of five hierarchical levels (see Fig. 2) which are characterized by clearly defined interfaces between them, their severity, the functions involved in the detection, and the recovery sequence ([25] and [12]). The highest FDIR level is in charge of the execution of vital functions to ensure the S/C integrity, while lower level hierarchies operate on subsystem or unit level and are usually software driven. ...
... Space and scientific communities are strongly aware of all these issues and many investigations have been carried out recently. Improvements in FDIR programmatic aspects and development strategies have been already highlighted in section II and can also be found in [38], where the usage of concurrent engineering techniques to conduct simultaneously the design of the FDIR system and the spacecraft and the development of a [25] and [12]). The introduction of new approaches has to be justified by a cost-benefit analysis in order to provide evidence that the overall mission and operational cost and maintenance effort can be reduced when the system availability is increased. ...
... robust residual generation) are necessary in order to design FDIR systems which are insensitive to noise, and at the same time sensitive to faults. To make things more difficult, upcoming missions require more accurate FDIR techniques to detect some faults with order of magnitude similar to the measurement noise level, whereas the faults considered today are at least one order of magnitude higher than the noise level [25]. Moreover, the model paradigm should allow modular, hierarchical and reusable models. ...
Article
Fault detection, isolation, and recovery (FDIR) systems are addressed, since the very beginning of any space mission design, and play a relevant role in the definition of their reliability, availability, and safety objectives. Their primary purposes are the safety of spacecraft/mission life and the improvement of its service availability. In this survey paper, current FDIR system engineering and programmatic approaches are investigated along with their strong connection with the wider concept of onboard autonomy, which is becoming the key point in the design of new-generation spacecraft. Different perspectives are presented, covering the whole lifecycle of FDIR system development, which is currently regarded as a self-standing and system-level discipline. Special attention is given to the FDIR early lifecycle phases and FDIR system hierarchy-based architecture. Recent space projects have brought to light some flaws in the current FDIR system design approaches. These findings pave the way for innovative solutions (e.g., qualitative and quantitative model-based methods, formal verification, and analytical redundancy), which can support and not rule out conventional industrial practices. The various model-based FDIR methods are not addressed in this paper; however, exhaustive surveys dealing with this topic are mentioned for further investigation. The experience and the lessons learned in the FDIR field during the manufacturing of the Galileo full operational capability (FOC) satellites at OHB System AG are reported. In particular, it is highlighted how well-established and accepted common practices have been exploited to their maximum extent to sort out the aforesaid issues.
... The latter falls outside the scope of this paper, and it shall be the subject of further works. In the literature [12][13][14], we identified two FDIR strategies: half-satellite's and hierarchical FDIR strategies. In the half-satellite's FDIR strategy, no isolation is performed when an anomaly is detected; instead, the spacecraft is fully configured to switch to redundant units. ...
... Then, the failed components are recovered during the next contact with the ground. This strategy is quite simple as it requires ...
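The half-satellite strategy quoted above can be illustrated with a minimal Python sketch. The `Spacecraft` class, the unit names and the side labels "A"/"B" are hypothetical and are not taken from the cited papers; the sketch only shows the control flow described in the snippet: no on-board isolation, a full switch to the redundant side, and recovery deferred to the next ground contact.

```python
from dataclasses import dataclass, field


@dataclass
class Spacecraft:
    """Toy model (assumption): every unit exists on a nominal side A and a redundant side B."""
    units: tuple = ("gyro", "star_tracker", "reaction_wheel")
    active_side: str = "A"
    pending_ground_actions: list = field(default_factory=list)


def half_satellite_recovery(sc: Spacecraft, anomaly: str) -> None:
    """On any detected anomaly, switch all units to the other side.

    No isolation is attempted on board: the whole previously active side is
    marked as suspect and left for the ground to investigate and recover.
    """
    suspect_side = sc.active_side
    sc.active_side = "B" if suspect_side == "A" else "A"
    sc.pending_ground_actions.append(
        {"anomaly": anomaly, "suspect_side": suspect_side, "units": list(sc.units)}
    )


sc = Spacecraft()
half_satellite_recovery(sc, "attitude error above threshold")
print(sc.active_side)                  # B
print(len(sc.pending_ground_actions))  # 1
```

The simplicity comes at a price: a single anomaly consumes the whole redundant side, which is one reason the hierarchical strategy is discussed as the alternative.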
Article
Full-text available
While there is no rigorous framework to develop nanosatellite flight software, this manuscript aimed to explore and establish processes to design a reliable and reusable flight software architecture for cost-efficient student CubeSat missions such as Masat-1. Masat-1 is a 1-unit (1U) CubeSat, developed using a systems engineering approach, off-the-shelf components and open-source software tools. It was our aim to use it as a test-bed platform and as an initial reference for CubeSat flight software development in Morocco. The command and data handling system chosen for Masat-1 is a system-on-module embedded computer running FreeRTOS. A real-time operating system was used in order to simplify the real-time onboard management. To ensure software design reliability, modularity, reusability and extensibility, our solution follows a layered service-oriented architectural pattern, and it is based on a finite state machine in the application layer to execute the mission functionalities in a deterministic manner. Moreover, a client-server model was selected to ensure inter-process communication and resource access while using uniform APIs to enhance cross-platform data exchange. A hierarchical fault tolerance architecture was also implemented after a systematic assessment of the Masat-1 mission risks using reliability block diagrams (RBDs) and functional failure mode, effect and criticality analysis (FMECA).
... Space missions need to perform their tasks with a high level of autonomy under extreme environmental conditions. In order to reach similar robustness levels for AWE systems, we tailored a hierarchical FDIR architecture which was developed for European Earth observation satellites [14]. ...
... We adapted the FDIR architecture from space industry [14], which has five hierarchical levels, as illustrated in Figure 2. ...
Article
Full-text available
Airborne wind energy (AWE) systems use tethered flying devices to harvest wind energy beyond the height range accessible to tower-based turbines. AWE systems can produce electric energy at a lower cost by operating at high altitudes where the wind regime is more stable and stronger. For the commercialization of AWE, system reliability and safety have become crucially important. To reach the required availability and safety levels, we adapted a fault detection, isolation and recovery (FDIR) architecture from the space industry. This work focuses on the "flight anomaly detection" layer of the FDIR. Tests verify that the proposed architecture is capable of detecting flight anomalies without generating false alarms.
... The addition of autonomous failure detection and recovery systems can reduce the number of mission outages, which occur in spacecraft characterized by low level of autonomous fault management. In such a case, the spacecraft enters a known safe configuration for each detected anomaly, while further investigations and recovery procedures can be carried out by the ground operators [20]. However, designing spacecraft with highly autonomous on-board capabilities should be balanced against other alternatives. ...
... Three spacecraft categories can be identified: communication satellites, earth observation satellites, and spacecraft used in science and robotics exploration missions. Current European space missions typically use level E2, while some advanced earth observation and science missions implement level E3 [20]. Sequences of preplanned commands (level E2) are fairly simple in structure, and the execution by the OBSW is straightforward. ...
Article
Full-text available
Different drivers are nowadays leading spacecraft toward an increased level of on-board autonomy. In this paper, we survey model-based techniques as a vehicle to implement highly autonomous on-board capabilities for spacecraft mission planning and execution. In this respect, spacecraft reconfiguration approaches based on Markovian Decision Process are explored, and then compared with other model-based alternatives. The integration of planning systems and dynamic reprogramming capabilities into the on-board software is presented. Finally, operational concepts for mission planning and execution in recent European space projects as well as the implementation of in-flight adaptive mission operations via on-board control procedures are also analyzed.
... In such a scenario the FDIR appears as one of the most important subsystems for the safety assurance of space missions. The importance of such subsystems is underlined by the following statement regarding FDIR: "It is a cornerstone for the satellite's autonomy enhancement" [4]. This means that the FDIR subsystem design, test, and implementation are seen as imperative to guarantee a dependable and autonomous system with a minimal risk of harmful failures. ...
... Assuming that the FDIR execution process is organized in a set of levels, it is possible to identify the level where each module is executed. The levels are (i) the equipment level, where the signal processor module is located; (ii) the functional level, where the data interpreter module is located; (iii) the operation level, where the AOCS simulation module is located; (iv) the decision level, where the result comparison and the fault pattern matching modules are located [4]. ...
Article
Full-text available
The main objective of this paper is the study of a FDIR for an IMU aiming at space applications with focus on the gyro signal analysis and the tests of the filtering algorithms. The algorithms have been tested by using lab data provided by the DMC LABSIM (Physical’s Simulation Laboratory of the Space Mechanics and Control Division of INPE). The results have demonstrated good agreement with the concepts applied in this study. Automatic detection procedures are very important in the characterization of occurrence, definition of criteria, and device types in the scenario of AOCS FDIR. An IMU comprised of four gyros in a tetrahedral configuration is one of the assumed components for the AOCS (attitude and orbit control subsystem) considered in this work. The types of failures considered in this paper are the step abrupt change, ramp/drift/slow, stuck, cyclic, erratic, spike, and finally the stuck for variance alteration noise. An appropriate algorithm for the automatic detection of each type of fault is developed. The approach includes the mapping capability of fault event indicators to the IMU. This mapping is very important in the characterization of the occurrence, definition of criteria, and device types as well as associated fault identification for an AOCS.
... These failures can also occur due to interface communication issues within the system or power supply problems related to the sensor or equipment. [12] ...
Article
Full-text available
The impact of disruptive factors originating from the space environment on spacecraft is highly significant in terms of design, operations, and safety. Factors such as solar wind, ionizing radiation, atomic corrosion, particle collisions, extreme vacuum, low gravity, and temperature changes affect spacecraft. Space agencies and research institutes have been working for years to model these effects. In the context of Fault-Tolerant Control (FTC) in spacecraft, numerous preventive measures are taken against disruptive factors of the space environment, aiming for spacecraft to perform their missions reliably for extended periods in space. This study presents the effects of the space environment in Earth's orbit and the precautions taken against these effects during the development and operation of spacecraft systems.
... The need for a new strategy taking into account the relevant role of the FDIR design in addressing the new and emerging mission challenges is well identified in literature ([35,36]). It is widely agreed that the goal will be to design a combined centralized and distributed FDIR system, having a single location gathering the whole satellite status but keeping local FDIR functions at unit level. ...
Article
Full-text available
The paper presents the initial outcomes of a project, currently ongoing under the supervision of the European Space Agency, having the main objective to specify and design a Fault Detection Isolation and Recovery (FDIR) system by making use of relevant RAMS (Reliability, Availability, Maintainability, Safety) analyses for missions in non-deterministic environment with limited resources. The initial project tasks have been to select a study case represented by a CubeSat complex mission, analyse in detail both its mission and system requirements and, based on them, define a set of relevant RAMS analyses to be carried out in the second phase of the project, as inputs for the development of a FDIR concept aimed at a careful balance of the limited spacecraft resources in case of critical failures. Two possible study cases have been identified: LUMIO, a 12U CubeSat mission for the observation of micro-meteoroid impacts on the Lunar farside, and M-ARGO, a 12U deep-space CubeSat which will rendezvous with a near-Earth asteroid and characterize its physical properties for the presence of in-situ resources. Although both missions are characterized by a high level of autonomy and complexity in a harsh environment, LUMIO has been eventually selected as study case for the project. In the paper, the challenges and features of this mission are shortly presented. The specificities of the RAMS analysis and FDIR concept for this specific class of small satellite missions (including the selected study case) are highlighted in the paper, looking in particular at aspects such as the improvement of reliability while maintaining the CubeSat philosophy, the tuning of mission and system requirements in view of facilitating the design and implementation of the FDIR concept, and the current gaps within the RAMS/FDIR body of knowledge. 
The conclusions drawn during this first project phase provide a real view of how systems engineering must work in tandem with RAMS analyses and FDIR to achieve a more robust and functional mission architecture, thus improving the mission reliability.
... Different missions apply their specific FDIR architecture. A publicly available FDIR architecture for AWES was presented in [14], building on an architecture that was originally developed for space applications [15]. The SAVOIR-FDIR working group at ESA is working on a FDIR guideline for the system level [16]. ...
Article
Full-text available
Integrating the operation of airborne wind energy systems safely into the airspace requires a systematic qualification process. It seems likely that the European Union Aviation Safety Agency will approve commercial systems as unmanned aircraft systems within the "specific" category, requiring risk-based operational authorization. In this paper, we interpret the risk assessment methodology for airborne wind energy systems, going through the ten required steps of the recommended procedure and discussing the particularities of tethered energy-harvesting systems. Although the described process applies to the entire field of airborne wind energy, we detail it for a commercial flexible-wing airborne wind energy system. We find that the air risk mitigations improve the consolidated specific assurance and integrity level by a factor of two. It is expected that the framework will increase the safety level of commercial airborne wind energy systems and ultimately lead to operation approval.
... The availability of accurate information provided by the FDI may be crucial to preserve the overall integrity of the system, so that adequate recovery countermeasures (e.g. activation of backup components or reconfiguration) can be applied in a timely manner (see for instance [2]). ...
Article
The integrity of complex dynamic systems often relies on the ability to detect, during operation, the occurrence of faults, or, in other words, to diagnose the system. The feasibility of this task, also known as diagnosability, depends on the nature of the system dynamics, the impact of faults, and the availability of a suitable set of sensors. Standard techniques for analyzing the diagnosability problem rely on a model of the system and on proving the absence of a faulty trace that cannot be distinguished by a non-faulty one (this pair of traces is called critical pair). In this paper, we tackle the problem of verifying diagnosability under the presence of fairness conditions. These extend the expressiveness of the system models enabling the specification of assumptions on the system behavior such as the infinite occurrence of observations and/or faults. We adopt a comprehensive framework that encompasses fair transition systems, temporally extended fault models, delays between the occurrence of a fault and its detection, and rich operational contexts. We show that in presence of fairness the definition of diagnosability has several interesting variants, and discuss the relative strengths and the mutual relationships. We prove that the existence of critical pairs is not always sufficient to analyze diagnosability, and needs to be generalized to critical sets. We define new notions of critical pairs, called ribbon-shape, with special looping conditions to represent the critical sets. Based on these findings, we provide algorithms to prove the diagnosability under fairness. The approach is built on top of the classical twin plant construction, and generalizes it to cover the various forms of diagnosability and find sufficient delays. The proposed algorithms are implemented within the xSAP platform for safety analysis, leveraging efficient symbolic model checking primitives. 
An experimental evaluation on a heterogeneous set of realistic benchmarks from various application domains demonstrates the effectiveness of the approach.
... As FDIR has a long-standing history in the space domain, there exists a large body of excellent literature about the topic in general, e.g. for single satellites [10,11,12,13,14,15,16] and whole formations [17], just to name a few. ...
Conference Paper
In recent years modular satellite architectures have become more and more prevalent, thanks to their advantages in comparison with mission-tailored single use architectures. Fractionated satellite architectures go even further by increasing the subsystem autonomy and interconnectivity even more, e.g. by going from a wired satellite harness to a wireless one. If the OBDH of a fractionated satellite in an earth observing formation fails, the other satellites in the formation could take direct control of the broken satellite's attitude control subsystem and therefore the original mission objective could still be continued. However, this approach imposes different challenges to the satellites. First, the failure in a subsystem has to be detected and an appropriate replacement has to be selected and assigned. Second, if for example closed-loop control is performed via wireless connection links, the communication channel properties have to be taken into account in the controller design. This paper presents solutions for both described problems. The solution of the first problem requires regular intra-satellite communication e.g. via alive messages to recognize failures in subsystems. For the selection of an appropriate replacement subsystem, a bidding method from the multi-robot-systems domain is proposed. For the second problem of control via communication links, we propose a networked control approach based on a Lyapunov Model-Predictive Control (MPC) methodology. As demonstration example, MPC is adapted and applied to a satellite attitude control problem of a typical nano-satellite in LEO. Using a high-fidelity attitude-dynamic simulation, a comparison of stability and control performance of a traditional Lyapunov-based method and a networked MPC is performed. Figures of merit for stability and performance are presented.
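The two steps named in this abstract, failure recognition via alive messages and replacement selection via bidding, can be sketched as follows. This is a simplified illustration, not the method of the paper: the timeout rule, the scalar cost bids, the "lowest cost wins" rule and all names are assumptions made for the example.

```python
def detect_silent(last_alive: dict, now: float, timeout: float) -> list:
    """Return the subsystems whose last alive message is older than `timeout` seconds."""
    return sorted(name for name, t in last_alive.items() if now - t > timeout)


def select_replacement(bids: dict) -> str:
    """Single-round auction (assumption): the candidate with the lowest cost bid wins."""
    return min(bids, key=bids.get)


# Time stamps (seconds) of the last alive message received from each OBDH (made-up values).
last_alive = {"sat_A/obdh": 100.0, "sat_B/obdh": 91.0, "sat_C/obdh": 99.5}
failed = detect_silent(last_alive, now=101.0, timeout=5.0)
print(failed)  # ['sat_B/obdh']

# Healthy candidates bid a cost for taking over the failed subsystem's task,
# e.g. a mix of link quality and current processor load (hypothetical metric).
bids = {"sat_A/obdh": 0.7, "sat_C/obdh": 0.4}
print(select_replacement(bids))  # sat_C/obdh
```

A real formation would also have to handle lost alive messages and ties between bids; the sketch leaves both out.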
... Furthermore, adequate Ground control could be compromised due to communication delays and required Ground decision-making time, endangering the system, although safing procedures are strictly adhered to. To meet the needs of future missions and increase their scientific return, future space systems will require an increased level of intelligence on-board [83], and an effective fault detection, isolation and recovery strategy [69]. Taking autonomous decisions by fixed number of threads, implementing a predefined scheduling policy and taking into account the possibility of pre-emption. ...
Article
Deep space missions are characterized by severely constrained communication links. To meet the needs of future missions and increase their scientific return, future space systems will require an increased level of autonomy on-board. In this work, we propose a comprehensive approach to on-board autonomy. We rely on model-based reasoning, and we consider many important (on-line and off-line) reasoning capabilities such as plan generation, validation, execution and monitoring, runtime diagnosis, and fault detection, identification, and recovery. The controlled platform is represented symbolically, and the reasoning capabilities are seen as symbolic manipulation of such formal model. We have developed a prototype of our framework, and we have integrated it within an on-board Autonomous Reasoning Engine. Finally, we have evaluated our approach on three case-studies inspired by real-world projects and characterized it in terms of reliability, availability, and performance.
... The FDIR system is designed after an analysis stage where the critical conditions that can be reached during the mission are identified, and the procedures that can be applied to solve them are defined. A brief description of how to address the design of an FDIR system can be found in [3]. ...
Article
In space missions, boot software is in charge of the initialisation sequence of flight computers. The processor module in which it runs has a high tolerance to radiation, although not all devices have the same tolerance level. A boot software design capable of recovering from errors in the most vulnerable devices shall provide greater system reliability. This work has been carried out in the context of the boot software development for the control unit of the Energetic Particle Detector instrument on-board the Solar Orbiter mission. This mission operates close to the Sun where high-energy particles can cause single event effects on electronic devices, especially SDRAM and EEPROM, which show lower radiation tolerance than the other devices. This fact motivates this work, where a sensitivity analysis of the incidence of single event effects on the behaviour of the boot software is carried out. Specifically, a fault injection environment has been used to analyse the effect of ‘‘stuck-at’’ bits on the boot software ability to deploy and pass control to the application software. The results show the boot software vulnerability to this kind of permanent effects and have led to the implementation of a reliability-oriented design, presented in this paper.
... We adapt a hierarchical FDIR architecture that is used in the space industry because it fits well to the investigated AWE system. Satellites, for example, also have high reliability requirements, incorporate a safe mode, and use holistic anomaly detection. ...
Article
Full-text available
Airborne wind energy systems use tethered flying devices to harvest wind energy beyond the height range accessible to tower‐based wind turbines. Current commercial prototypes have reached power ratings of up to several hundred kilowatts, and companies are aiming at long‐term operation in relevant environments. As consequence, system reliability, operational robustness, and safety have become crucially important aspects of system development. In this study, we analyze the reliability and safety of a 100‐kW technology development platform with the objective of achieving continuous automatic operation. We first outline the different components of the kite power system and its operational modes. In the next step, we identify failure modes, their causes, and effects by means of failure mode and effects analysis (FMEA) and fault tree analysis (FTA). Potentially hazardous situations and mechanisms which can render the system nonoperational are identified, and mitigation measures are proposed. We find that the majority of these measures can be performed by a failure detection, isolation, and recovery (FDIR) system for which we present a hierarchical architecture adapted from space industry.
... Space exploration missions require critical autonomous proximity operations. Mission safety is usually guaranteed via hierarchical implementation of Fault/Failure Detection, Isolation and Recovery (FDIR) approach (see for instance Olive [2012], Zolghadri [2012]). Fault detection and isolation are performed by simple cross checks between redundant units, limit checking, voting mechanisms, etc. ...
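Two of the simple mechanisms listed in this snippet, limit checking and voting between redundant units, can be sketched in a few lines of Python. The sensor names, limits and tolerance are made-up values, and mid-value (median) voting is one common voting variant chosen here for illustration; the cited works may use a different scheme.

```python
from statistics import median


def limit_check(value: float, low: float, high: float) -> bool:
    """Return True when the reading lies inside its allowed range."""
    return low <= value <= high


def vote(readings: dict, tolerance: float) -> tuple:
    """Mid-value voting across redundant units.

    Returns the voted value and the names of the units whose reading
    deviates from the median by more than `tolerance`.
    """
    voted = median(readings.values())
    suspects = [name for name, r in readings.items() if abs(r - voted) > tolerance]
    return voted, suspects


# Three redundant rate measurements in deg/s (made-up values).
rates = {"gyro_1": 0.101, "gyro_2": 0.099, "gyro_3": 0.350}
voted, suspects = vote(rates, tolerance=0.05)
print(voted, suspects)                          # 0.101 ['gyro_3']
print(limit_check(rates["gyro_3"], -0.2, 0.2))  # False
```

With three units the median tolerates one arbitrary faulty reading, which is why triple redundancy is the usual minimum for voting.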
Conference Paper
Full-text available
The presented work is a result of a research collaboration between European Space Agency, Thales Alenia Space and IMS Laboratory with the aim of promoting fault-tolerant control strategies to advance spacecraft autonomy. A multiple observer based scheme is proposed jointly with an online constrained allocation algorithm to detect, isolate and accommodate a single thruster fault affecting the propulsion system of an autonomous spacecraft. Robust residual generator with enhanced robustness to time delays induced by the propulsion drive electronics and uncertainties on thruster rise times is used for fault detection purposes. A decision test on the residual of the fault detector triggers a bank of nonlinear unknown input observers which is in charge of confining the fault to a subset of possible faults. The faulty thruster isolation is achieved by matching the residual and the thruster force directions using the direction cosine approach. Finally, the fault is accommodated by redistributing the desired forces and torques among the remaining (healthy) thrusters and closing the isolated thruster. Simulation results from the "high-fidelity" industrial simulator, provided by Thales Alenia Space, demonstrate the fault-tolerance capabilities of the proposed scheme.
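The isolation step described above, matching the residual against the thruster force directions through direction cosines, can be illustrated with a small sketch. The thruster layout, the residual values and the sign convention (a fault shows up as a residual aligned with the thruster direction) are assumptions for the example; the observer bank and the allocation algorithm of the paper are not reproduced.

```python
import math


def cosine(u, v) -> float:
    """Direction cosine between two non-zero 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def isolate_thruster(residual, directions: dict) -> tuple:
    """Return the thruster whose force direction is best aligned with the residual.

    Assumes a fault has already been detected, so the residual is non-zero.
    """
    scores = {name: cosine(residual, d) for name, d in directions.items()}
    return max(scores, key=scores.get), scores


# Hypothetical unit force directions in the body frame.
directions = {
    "T1": (1.0, 0.0, 0.0),
    "T2": (0.0, 1.0, 0.0),
    "T3": (0.0, 0.0, 1.0),
    "T4": (-1.0, 0.0, 0.0),
}
residual = (0.05, 0.92, -0.08)  # made-up force residual after detection

best, scores = isolate_thruster(residual, directions)
print(best)  # T2
```

Thrusters with nearly parallel directions give nearly equal scores, so in practice the match can only confine the fault to a subset of candidates.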
... Several FDIR schemes exist, with different degrees of complexity in both realization and implementation. One of them is the so-called "half-satellite" strategy [1,2]: when a fault is detected, the satellite is fully reconfigured and the failed unit is switched off in favour of the redundant one. The ground stations are in charge of planning appropriate corrective actions. ...
Article
In recent years, thanks to the increase of the know-how on machine-learning techniques and the advance of the computational capabilities of on-board processing, expensive computing algorithms, such as Model Predictive Control, have begun to spread in space applications even on small on-board processor. The paper presents an algorithm for an optimal fault recovery of a 3U CubeSat, developed in MathWorks Matlab & Simulink environment. This algorithm involves optimization techniques aiming at obtaining the optimal recovery solution, and involves a Model Predictive Control approach for the attitude control. The simulated system is a CubeSat in Low Earth Orbit: the attitude control is performed with three magnetic torquers and a single reaction wheel. The simulation neglects the errors in the attitude determination of the satellite, and focuses on the recovery approach and control method. The optimal recovery approach takes advantage of the properties of magnetic actuation, which gives the possibility of the redistribution of the control action when a fault occurs on a single magnetic torquer, even in absence of redundant actuators. In addition, the paper presents the results of the implementation of Model Predictive approach to control the attitude of the satellite.
... Fault Diagnosis and Isolation (FDI) techniques have received considerable attention over the past three decades in the academic community (see for example Blanke et al. (2006); Chen and Patton (1999); Hwang et al. (2010); Zhang and Jiang (2008)), and model-based FDI is attracting growing interest for potential applications in the aerospace field, as demonstrated by the recent flourishing of publications: see for example Tipaldi and Bruenjes (2015) or Yin (2016) for a recent survey, or Falcoz et al. (2010); Fonod et al. (2015); Olive (2012); Wu et al. (2010) for satellite application cases. For instance, a two-stage Kalman filtering algorithm was presented in Hou et al. (2008) to precisely estimate reaction flywheel faults. ...
Article
This paper deals with a practical implementation for the detection and isolation of multiple transient faults of thrusters used to control the attitude of a satellite. The approach proposed in the paper only takes into account attitude control signals to detect the faults regardless of the thrusters management algorithm and modulator. Two methods are compared to compute the residuals for the detection and isolation steps: one based on a geometric analysis of faults directions, the other using observers with unknown inputs. Luenberger’s observers allow the estimation of the efficiency losses providing delta-V information without orbit restitution or acceleration measurements. The algorithms are illustrated on representative closed-loop simulations with simultaneous transient losses of efficiency, and on flight telemetry where transient losses of efficiency are suspected with respect to a posteriori orbit restitution.
... However, this approach breaks down under increasing context uncertainty, which features deep space exploration systems or critical operational phases such as automated spacecraft docking. Further drives for rich onboard autonomy are the overall improvement of the spacecraft efficiency and reliability, the need to overcome long communication delays and outages, and the reduction of costs in ground segment operations, which can address longterm planning instead of day-to-day procedures [2]. ...
Conference Paper
Spacecraft autonomy is a crucial aspect of currently developed and future space projects. This paper presents a Markovian Decision Process (MDP) based framework as a way of modeling spacecraft on-board autonomy mechanisms. Its applicability to the three layered autonomous space systems architecture is shown. Special attention is given to its deliberative layer, where Approximate Dynamic Programming (ADP) state aggregation techniques are applied to determine the corresponding sub-optimal policies. An example of such approach is presented, where it is shown how the MDP structure can determine some important properties of the calculated policy, such as the balancing of the spacecraft safety versus the completion of relevant mission objectives.
... Failures are usually deployed on a set of five hierarchical levels (see Fig. 2) which are characterized by clearly defined interfaces between them, their severity, the function involved in the detection and the recovery sequence ([3]). The highest FDIR level is in charge of the execution of vital functions to ensure the S/C integrity, while lower levels operate on subsystem or unit level and are usually software driven. ...
... They will be endowed with an ergonomic interface so that ground operators can easily define and transmit such high level requests. In [3] and [4], the integration of planning systems and dynamic reprogramming capabilities into the flight software is addressed. The authors propose to organize the DHSW architecture along three hierarchical levels, that is to say the decisional level, the operational level and the functional level (see Fig. 2). ...
Conference Paper
This paper describes an approach for the modelling of the characterization and configuration data yielded when developing a Satellite Data Handling Software (DHSW). The model can then be used as an input for the preparation of the logical and physical representation of the Satellite Reference Database (SRDB) contents and related SW suite, an essential product that allows transferring the information between the different system stakeholders, but also to produce part of the DHSW documentation and artefacts. Special attention is given to the shaping of the general Parameter concept, which is shared by a number of different entities within a Space System.
... Drivers for autonomy are mainly the overall improvement of the spacecraft efficiency and reliability (e.g. a faster reaction to its environmental change without waiting for ground intervention), spacecraft survival in case of on-board faults and cost reduction in the ground segment operations, which can address long-term planning instead of day-to-day procedures. Spacecrafts can be featured by different level of on-board autonomy depending on factors such as mission type, mission objectives and priorities, type of the spacecraft orbit, spacecraft ground visibility profile, operations concepts and communication constraints ( [2] and [3]). Three spacecraft categories can be identified: communication satellites, earth observation satellites and science & robotic exploration missions. ...
Conference Paper
On-board autonomy is becoming a crucial aspect of currently developed and future space projects, especially for deep space exploration missions. In the near future, spacecrafts will be able to receive, process and achieve high-level goals even in an uncertain or dynamically varying context. This paper presents a Markovian based approach in order to model on-board autonomy mechanisms. This approach fits the three layered autonomous space systems architecture and integrates a partially observable non-homogenous Markov model for the decisional layer with a Markov decision process for the operational layer. Autonomous spacecraft reconfigurability is particularly addressed.
... This definition is shown in Table I. Spacecrafts can be featured by different level of on-board autonomy depending on factors such as mission type, mission objectives and priorities, type of the spacecraft orbit, spacecraft ground visibility profile, operations concepts and communication constraints ([7] and [8]). Three spacecraft categories can be identified [9]: communication satellites, earth observation satellites and science & robotic exploration missions. ...
Conference Paper
On-board Control Procedures (OBCPs) provide a useful means to implement functional as well as operational flight procedures (e.g. payload switch-on/switch-off and spacecraft mode management) which can be modified, if necessary, even during the mission itself. This paper addresses the OBCP implementation in the Meteosat Third Generation (MTG) satellite, where OBCPs are written in a specific language called On-board Command Language (OCL) and executed by a virtual machine installed in the spacecraft on-board SW. Before being uploaded, OBCPs are implemented and tested on ground by means of a dedicated OBCP development environment. Thanks to the OCL features and the OBCP operational approach, OBCP usage can help raise spacecraft robustness against environmental changes, as well as reliability and autonomy capabilities.
... The autonomy required for active debris removal as discussed in this paper, is qualified as level E4, with a "Execution of goal-oriented mission operation on-board", resulting in the function of "goal-oriented mission re-planning". Modern spacecraft nowadays reach level E2 [16], some interplanetary spacecraft have decision making capability implemented after launch, using on-board control procedures, and can be considered to reach level E3 [17]. ...
Article
The awareness of space debris as a threat for the safe operation of satellites in Earth's orbit increased rapidly over the last few years. Attempts such as improving trajectory predictions of non-functional objects in space, guidelines for safer launches nowadays, and post-mission disposal maneuver, however, will not stop the growth in debris numbers, as simulations predict. Mitigation needs therefore the realization of removal missions. This paper introduces an exemplary removal mission for 5 Russian SL-8 rocket bodies at an inclination of 83° orbiting at an altitude of 970 km - an area crowded with space debris and thus involving a high collision risk. By removing large objects, the potential for the creation of smaller fragments due to collisions shall be reduced. The mission itself consists of a main satellite (Autonomous Debris Removal Satellite - ADReS-A) and smaller De-orbit Kits being launched together into an orbit close to the targets position. While the De-orbit Kits are equipped with a de-orbit thruster, the task of ADReS-A is, to approach the uncooperative target, perform berthing operations, stabilize the compound system and attach one De-orbit Kit onto the rocket body. The main satellite will take each De-orbit Kit separately to the individual targets, shuttling between the parking orbit and the target orbits. Future investigations concentrate on autonomy for highly critical situations resulting from the interaction with an uncooperative target. A prospect is given towards the end of the paper with a preliminary design for a decision process for autonomy.
... The AOCS and its associated FDIR subsystem are used in navigation control of satellites, rockets and others space vehicles [1] [14]. Those subsystems include inertial navigation sensors such as accelerometers and gyroscopes which provide electrical signals to the AOCS OBSW [9] [10]. Such sensors can be strategically combined to work together to obtain a more precise navigation [7]. ...
Article
The FDIR software subsystem may be part of the attitude and orbit control subsystem, AOCS. The AOCS quite often includes inertial navigation sensors, physically implemented by accelerometers and gyroscopes, which provide electrical signals to the AOCS on-board software (OBSW) which, in turn, generates the commands to the control actuators. In general, hardware like sensors and actuators present nonlinearities which sometimes make it difficult to properly interpret the output signals. In the scenario of space applications, filters are used to eliminate noise and to increase the reliability for the correct interpretation of those signals. In this paper we present a collection of filters used in inertial navigation subsystems enabling the fusion of data from sensors. Fundamentally, the filters are composed of the Kalman filter in its derivations. The filters can be used for state estimation of a system as well as for noise filtering. In this work the filters are configured with respect to their different orders of execution, their sampling rate, and their cut-off frequency. The filter configurations can be changed by software so as to allow a flexible structure that can be adjusted for the best quality of output signal and consequently the best analysis of the satellite behaviour. The main purpose of this paper is to test the algorithm that combines several signal filters considered in this study. To accomplish this goal we developed an experiment encompassing an accelerometer and a wireless communication system so as to provide input signals to be filtered by the filtering algorithm.
... This is so in the sense that the former should be ideally solved on-line and use only local information given that faults occur at unknown times, have unknown patterns, and the existing fault detection and isolation (FDI) module in the team information may be available only locally, while the latter problem can be solved off-line and by potentially using the entire system information. Moreover, due to the information sharing structure of multi-agent systems, the fault tolerant control approaches that have extensively been studied in the literature for single agent systems [13], [15], [21], [29], [30] will not be directly applicable to multi-agent systems. ...
Article
In this work, an $H_{\infty}$ performance fault recovery control problem for a team of multi-agent systems that is subject to actuator faults is studied. Our main objective is to design a distributed control reconfiguration strategy such that a) in absence of disturbances the state consensus errors either remain bounded or converge to zero asymptotically, b) in presence of actuator fault the output of the faulty system behaves exactly the same as that of the healthy system, and c) the specified $H_{\infty}$ performance bound is guaranteed to be minimized in presence of bounded energy disturbances. The gains of the reconfigured control laws are selected first by employing a geometric approach where a set of controllers guarantees that the output of the faulty agent imitates that of the healthy agent and the consensus achievement objectives are satisfied. Next, the remaining degrees of freedom in the selection of the control law gains are used to minimize the bound on a specified $H_{\infty}$ performance index. The effects of uncertainties and imperfections in the FDI module decision in correctly estimating the fault severity as well as delays in invoking the reconfigured control laws are investigated and a bound on the maximum tolerable estimation uncertainties and time delays is obtained. Our proposed distributed and cooperative control recovery approach is applied to a team of five autonomous underwater vehicles to demonstrate its capabilities and effectiveness in accomplishing the overall team requirements subject to various actuator faults, delays in invoking the recovery control, fault estimation and isolation imperfections and unreliabilities under various control recovery scenarios.
... In the recent decades, due to the increased complexity, as well as, the need for reliability, safety, and efficient operation, a great deal of attention has been paid to the subject of Fault/Failure Detection Isolation and Recovery (FDIR) in space systems, see for instance [1], [2]. Literature reports that conventional FDIR approaches suffer from significant shortcomings, like increased mass and system complexity, often missing on-board isolation of the faults, ground intervention is not always possible due to large communication delays or visibility issues, and knowledge about the operational capabilities of the system is not present on-board. ...
Conference Paper
In this paper, the problem of Nonlinear Unknown Input Observer (NUIO) based Fault Detection and Isolation (FDI) scheme design for a class of nonlinear Lipschitz systems is studied. The proposed FDI method is applied to detect, isolate and accommodate thruster faults of an autonomous spacecraft involved in the rendezvous phase of the Mars Sample Return (MSR) mission. Considered fault scenarios represent fully closed thruster and thruster efficiency loss. The FDI scheme consists of a bank of NUIOs with adjustable error dynamics, a robust fault detector that is based on a judiciously chosen frame and an isolation logic. The bank of observers is in charge of confining the fault to a subset of possible faults and the isolation logic makes the final decision about the faulty thruster index. Finally, a thruster fault is accommodated by re-allocating the desired forces and torques among the remaining healthy thrusters and closing the associated thruster valve. Monte Carlo results from the 'high-fidelity' MSR industrial simulator demonstrate that the proposed fault tolerant strategy is able to accommodate thruster faults that may have an effect on the final rendezvous criteria.
... Many space exploration missions require critical autonomous proximity operations. Mission safety is usually guaranteed through a hierarchical implementation of fault/failure detection, isolation and recovery (FDIR) approach with several levels of fault containments defined from local component/equipment up to global system, i.e., through various equipments (sensors, thrusters, reaction wheels etc.) redundancy paths and ground intervention [1][2][3]. In the case of the Mars Sample Return (MSR) mission, the hierarchical implementation of FDIR is concerned at three levels [4,5]: (i) based exclusively on sensor measurements in the fault detection and isolation (FDI) function which are mainly signal-based techniques, (ii) relying on both actuator commands and sensor measurements in the model-based FDI function, and (iii) based on navigation outputs for the monitoring of the trajectory in the safety monitoring function. ...
Article
This paper deals with performance and reliability evaluation of a fault diagnosis scheme based on two distinct models to detect and isolate a single thruster fault affecting a chasing spacecraft during rendezvous with a passive target in a circular orbit. The analysis is conducted in the frame of a terminal rendezvous sequence of the Mars Sample Return mission. A complete description of a robust residual generation design approach based on eigenstructure assignment is presented. Unknown time-varying delays, induced by the thruster drive electronics and uncertainties on thruster rise times, are considered as unknown inputs. A particular novelty of the work is a new method for estimating the unknown input directions used to enhance the robustness properties of the diagnosis scheme. Monte Carlo results from a high-fidelity industrial simulator and carefully selected performance and reliability indices allow us to evaluate the effectiveness of both schemes. The obtained results reveal that the proposed fault diagnosis scheme based on a position model is a justified competitor to the conventionally used attitude model-based scheme.
... During the definition and early development phases, FDIR concepts are further detailed and translated into a FDIR architecture and FDIR function identification and allocation. Failures are usually deployed on a set of five hierarchical levels (see Fig. 2) which are characterized by clearly defined interfaces between them, their severity, the function involved in the detection and the recovery sequence ( [3]). The highest FDIR level is in charge of the execution of vital functions to ensure the S/C integrity, while lower levels operate on subsystem or unit level and are usually software driven. ...
Conference Paper
Spacecraft health monitoring and management systems (also referred to as FDIR (Fault Detection, Isolation and Recovery) systems) are addressed from the very beginning of any space mission design and play a relevant role in the definition of their reliability, availability and safety objectives. Their primary purposes are the safety of spacecraft/mission life and the improvement of its service availability. In this paper current technical and programmatic FDIR strategies are presented along with their strong connection with the wider concept of on-board autonomy, which is becoming the key-point in the design of new-generation spacecraft. Recent projects developed at OHB System AG have brought to light some issues in the current FDIR system design approaches. These findings pave the way for innovative solutions, which can support and not rule out conventional industrial practices.
... Like many existing studies [71,72], we define four levels of failure severity according to the available recovery capacities (Fig. 4): ...
Article
Experience demonstrates that autonomous mobile robots running in the field in a dynamic environment often break down. Generally, mobile robots are not designed to efficiently manage faulty or unforeseen situations. Even if some research studies exist, there is a lack of a global approach that really integrates dependability and particularly fault tolerance into the mobile robot design. This paper presents an approach that aims to integrate fault tolerance principles into the design of a robot real-time control architecture. A failure mode analysis is firstly conducted to identify and characterize the most relevant faults. Then the fault detection and diagnosis mechanisms are explained. Fault detection is based on dedicated software components scanning faulty behaviors. Diagnosis is based on the residual principle and signature analysis to identify faulty software or hardware components and faulty behaviors. Finally, the recovery mechanism, based on the modality principle, proposes to adapt the robot's control loop according to the context and current operational functions of the robot. This approach has been applied and implemented in the control architecture of a Pioneer 3DX mobile robot.
... Following Olive [30], today's spacecraft operations reach autonomy level E2, defined by the European Cooperation for Space Standardization (ECSS) [31] and listed in TAB 3. Some interplanetary spacecraft, however, have decision making capability implemented after launch, using On-Board Control Procedures, and are therefore considered as level E3 on subsystem level [32]. ...
Conference Paper
Even though the infinite vastness of the universe is an accepted theory, apparently infinity ends when it comes to orbits surrounding the Earth. This was a hard lesson to learn when Iridium 33 and Cosmos 2251 collided in the low Earth orbit in February 2009. Not at least due to this event, the threat of uncontrolled objects in space is subject to a series of activities for the stabilization of the space environment. Besides improved collision propagation and mitigation measurements currently adopted by major space agencies, the active removal of space debris (ADR) needs to be addressed and further developed within the next few years. Based on an introduced reference scenario, this paper introduces autonomy in space for such missions. Existing problems are addressed and possible approaches concerning autonomous remediation of space debris are presented.
... The European Cooperation for Space Standardization (ECSS) [33] has defined mission execution levels (see Table 2). Following Olive [35], modern spacecraft reach level E2; some interplanetary spacecraft have a decision making capability implemented after launch, using On-Board Control Procedures, and can be considered to reach level E3 [36]. The goal of the presented concept is reaching level E4. ...
Conference Paper
With respect to an increasing threat of space debris for the space environment, uncontrolled objects orbiting the Earth are subject to a series of activities for stabilization on the amount of man-made debris. It is well-known that one of the mitigation means, active space debris removal, (ADR) will be inevitable to achieve this goal. Mission scenarios for ADR involve docking and grabbing as well as formation flying operations with non-cooperative targets. Prospectively, these missions will require a higher level of autonomy than implemented nowadays. Especially during close approach and capture, time-critical manoeuvres due to unforeseen events have the need for autonomous situation based reactions which will alleviate this challenge. Based on a reasoned reference scenario, this paper introduces cognitive automation for critical manoeuvres with the capability to react on unexpected changes during mission time. This concept is found to be suitable for high level autonomy; for situation based action with the capability of weighing potentially contradictory goals. Its basics and system architecture are presented in this paper.
Conference Paper
When a cubesat is launched into the orbit, anomalies and failures are prone to occur due to the outer space harsh environment, and in-orbit corrective actions are in most cases not an option. These risks are further exacerbated due to the lack of quantitative/statistical data on components reliability in the literature, the potential lack of efficient fault tolerance architectures implemented onboard, and ad hoc limited testing. Therefore, while there is no rigorous or standard framework to devise a risk analysis plan for cubesats, this manuscript aimed to report the adopted approach, results, and lessons learned from Masat-1 project, the first initiative to develop a 1U cubesat mission in Morocco. This work is serving as our reference for a systematic mission risk assessment when going forward to develop a 3U cubesat mission. It will also serve as a complement to the current literature on cubesats reliability research.
Article
Current trends in the aerospace industry for the digitization of data, tools, and services call for novel solutions that can keep track of the latest technological advancements. Moreover, the increasing drive toward model-based approaches imposes the use of models for the development of complex systems, such as satellites. New types of methods and tools are needed in order to manage the still-growing complexity of embedded systems and their corresponding models. This paper addresses issues related to space systems (in-service) operations and proposes a multimodel approach to address them. The multimodel approach aims at facilitating the task of operational diagnosis by creating a monitoring and diagnosis-dedicated system model derived from existing system design (architecture, functional behavior, and safety) models. The purpose of the diagnosis-dedicated model is to enable the codesign of the system and its diagnosis tools in order to improve the performance of diagnosis activities and the system’s availability. This paper demonstrates the interests of the suggested approach with regard to the limitations of current practices.
Article
This paper presents new perspectives on the application of Artificial Intelligence (AI) solutions to process Spacecraft (S/C) flight data in order to augment currently used operational S/C health monitoring and diagnostics systems. It captures the growing general interest in the usage of such techniques in the Space engineering domain and applications. Jointly with the AI approach, the operational usage of S/C simulation models (referred to as “discipline models”) is also explored. During S/C development and testing activities, significant efforts are made by the discipline experts to build such models. However, using discipline-specific knowledge to support complex S/C operational activities (e.g., anomaly root cause analysis) remains a challenging task. Based on the current needs of Space Agencies and Industry and by exploiting the advances in AI-based solutions and technologies, this paper proposes an operational S/C model-based diagnostics framework, which can serve as basis for future developments. Such framework combines AI-based techniques, S/C flight data information, and discipline models. Three main needs are addressed: S/C anomaly root cause analysis, S/C prediction behavior, and discipline model refinement. Concrete operational case studies from the Project for On-Board Autonomy (PROBA) satellite family are presented to show the applicability of the proposed framework.
Chapter
The differences of onboard faults characteristics and severity result in different models, methods and interfaces of fault diagnosis. Thus, FDIR (Fault Discovery, Identification and Recovery) systems usually use a hierarchical architecture in centralized or distributed styles. It is difficult for the centralized FDIR to guarantee the timeliness and coverage of fault diagnosis simultaneously, and the distributed one would bring safety problems. Both of them only focus on the health states of spacecrafts, while not considering its own reliability and reusability. Taking advantage of the above two, the architecture proposed by this paper keeps synthetic views of the spacecraft health states at higher levels and distributes local FDIR at lower levels to improve the timeliness and coverage of fault diagnosis simultaneously, which is based on the hierarchical architecture of spacecrafts and fault severity levels. To ensure the safety and reliability of the FDIR system, a highly decoupled runtime model is proposed. To improve the reusability of the architecture, a unified FDIR model is proposed, which includes hierarchical programming interfaces, etc.
Article
This paper aims at providing a brief perspective of advanced model-based Fault Detection, Identification and Recovery (FDIR) for aerospace and flight-critical systems. A number of practical key factors for designing credible technological options are emphasized. Such considerations are decisive for the survivability of the design during ground/flight Validation & Verification (V&V) activities. The views reported in this paper are based on lessons learnt and results achieved through actions undertaken with Airbus during the last decade. As an illustrative example, a model-based fault monitoring technique is presented which has reached level 5 on Technological Readiness Level scale under V&V investigations at Airbus.
Chapter
This chapter presents briefly the motivations and the book outline. This book presents a number of advanced fault detection and diagnosis and reconfiguration technologies for aerospace vehicles. An attempt is made to develop useful solutions that can be relevant and viable candidates for future space and aeronautical systems. The presented techniques have been tested and validated on highly representative benchmarks, real flight data, or real-world aerospace systems. The examples presented in this book are taken mainly from four recent projects related to fault detection and diagnosis and fault-tolerant control and guidance of aircraft and space systems:
Article
Current and future space missions demand highly reliable on-board computing systems, which are capable to carry out high-performance data processing. At present no single computing scheme could efficiently tackle high-performance computing as well as reliability. This paper aims to address that gap. In the first part of the paper, a detailed survey of fault-tolerant distributed computing systems for space applications is presented. Fault types and performance parameters for assessment of a fault-tolerant system are introduced. Redundancy schemes for distributed systems are analysed. A review of the state-of-the-art on fault-tolerant distributed systems is presented and limitations of current approaches are discussed. In the second part of the paper, a new fault-tolerant distributed computing platform with wireless links among the computing nodes is proposed. Novel algorithms, enabling important aspects of the architecture, such as time slot priority adaptive fault-tolerant channel access and fault-tolerant distributed computing using task migration are introduced.
Chapter
This chapter is dedicated to techniques for ensuring fault tolerance in redundant aircraft sensors involved in the computation of flight control laws. The objective is to switch off the faulty sensor and to compute a reliable (a.k.a. "consolidated") parameter using data from valid sensors, in order to eliminate any anomaly before it propagates in the control loop. The benefit of the presented method is to improve the consolidation process with a fault detection and isolation approach when only a few sources (fewer than three) are valid. Different techniques are compared to accurately detect any behavioral change of the sensor outputs. The approach is tested on a recorded flight dataset. This chapter is dedicated to fault detection and isolation of redundant aircraft sensors involved in the computation of flight control laws. The objective is to switch off the erroneous sensor and to compute a so-called consolidated parameter using data from valid sensors, in order to eliminate any anomaly before it propagates in the control loop. We will focus on oscillatory failures and present a method for integrity control based on the processing of any flight parameter measurement in the flight control computer (FCC), e.g., anemometric and inertial data. One of the main tasks dedicated to the FCC is the flight control laws (FCL) computation, which generates a command (position order) to servo-control each moving surface (see Fig. 5.1). The comparison between the pilot commands (or the piloting objectives) and the aircraft state is used for FCL computation. The aircraft state is measured by a set of sensors delivering, e.g., anemometric and inertial measurements that characterize the aircraft attitude, speed, and altitude. The data is acquired using an acquisition system composed of several dedicated redundant units (usually three).
The FCC receives three redundant values of each flight parameter data from the sensors and must compute unique and valid flight parameters required for the FCL computation. This specific data fusion processing, called “consolidation,” classically consists of two simultaneous steps (Fig. 5.2): selection or computation of one unique parameter from the three available sources, and, in parallel, monitoring of each of the three independent sources to discard any faulty one. As a consequence, the consolidation allows reliable flight parameters computation with the required accuracy by discarding any involved failed source.
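The two consolidation steps described above (selection and monitoring) can be sketched as follows. This is a minimal illustration only, not the method of the chapter: the median vote, the single `threshold` parameter and the function name `consolidate` are assumptions made for the example, and a real flight control computer would also use confirmation times before discarding a source.

```python
from statistics import median

def consolidate(values, valid, threshold):
    """Return (consolidated_value, updated_valid_flags) for redundant sources.

    values    : list of floats, one measurement per source
    valid     : list of bools, True while a source is still trusted
    threshold : hypothetical monitoring threshold on |source - consolidated|
    """
    active = [v for v, ok in zip(values, valid) if ok]
    if not active:
        raise ValueError("no valid source left")
    # Selection step: the median of three rejects a single outlier;
    # with two valid sources statistics.median returns their average.
    chosen = median(active)
    # Monitoring step: discard any source too far from the chosen value.
    new_valid = [ok and abs(v - chosen) <= threshold
                 for v, ok in zip(values, valid)]
    return chosen, new_valid

# Example: the third source has drifted away from the other two.
value, flags = consolidate([100.0, 100.4, 130.0], [True, True, True], 2.0)
```

With only two valid sources left, the vote degenerates into an average and can no longer tell which source is wrong, which is the situation the chapter's fault detection and isolation approach targets.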
Article
In this study, the robust detection problem of intermittent faults (IFs) for linear stochastic systems subject to time-varying parametric perturbations is addressed. The authors consider the case in which an IF appears and disappears non-deterministically, and lasts for random periods of active time with unknown magnitudes. A novel robust fault detection method is presented to detect all the appearance and disappearance times of an IF. Based on the output of an observer-type residual generator, a novel robust residual is constructed by utilising a sliding-time window. Two hypothesis tests are provided to detect the appearance and the disappearance times, respectively. Moreover, the detectability of the IF using the proposed robust detection scheme is defined in a probabilistic sense, and a sufficient detectability condition is presented within the given framework. The false alarm rates and missed detection rates are rigorously analysed. Finally, the application of the presented scheme is illustrated on a simplified radial flight control system, and the results show that the IF can be effectively detected in the presence of perturbations and noise.
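The idea of a sliding-time window with separate tests for appearance and disappearance can be illustrated with a much simpler scheme than the one in the paper. The sketch below is an assumption-laden stand-in: it averages the absolute residual over a window and uses two hypothetical thresholds, `thr_on` and `thr_off`, in place of the paper's hypothesis tests.

```python
from collections import deque

def detect_intermittent(residuals, window, thr_on, thr_off):
    """Return a list of (appear_index, disappear_index) pairs.

    residuals : sequence of residual samples from some residual generator
    window    : length of the sliding window, in samples
    thr_on    : windowed mean above this value declares the fault active
    thr_off   : windowed mean below this value declares it gone
    """
    buf = deque(maxlen=window)
    active, start, events = False, None, []
    for k, r in enumerate(residuals):
        buf.append(abs(r))
        if len(buf) < window:
            continue                      # wait until the window is full
        avg = sum(buf) / window
        if not active and avg > thr_on:
            active, start = True, k       # appearance detected at sample k
        elif active and avg < thr_off:
            events.append((start, k))     # disappearance detected at sample k
            active = False
    if active:
        events.append((start, None))      # fault still active at end of data
    return events

# Two fault episodes (samples 5-9 and 15-19) in an otherwise zero residual.
trace = [0.0] * 5 + [1.0] * 5 + [0.0] * 5 + [1.0] * 5 + [0.0] * 5
episodes = detect_intermittent(trace, window=3, thr_on=0.6, thr_off=0.4)
```

The detected indices lag the true ones by a delay that grows with the window length, which is the usual trade-off between noise rejection and reaction time.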
Chapter
This chapter starts with some basic definitions and concepts as well as a quick literature review on academic FDIR methods. The main concepts of the industrial state of practice for space and avionics systems will also be briefly presented. An attempt will be made to analyze the major reasons for the slow progress in applying advanced model-based techniques to real-world aerospace systems. Fault detection and diagnosis (FDD) is an important aspect of process engineering. The primary objective of an FDD system is early detection of faults, isolation of their location, and diagnosis of their causes, enabling correction of the faults before additional damage to the system or loss of service occurs. Abnormal situations occur when processes deviate significantly (outside the allowed range) from their normal regime during online operation. A fault can be defined as an unpermitted deviation of at least one characteristic property or parameter of the system from the standard condition [1]. A failure is a permanent interruption of a system's ability to perform a required function under specified operating conditions. Within the academic literature, the terminology is now more or less standardized. Such malfunctions may occur in individual units of the plant, sensors, actuators, or other devices and adversely affect the local or global behavior of the system. Process abnormalities are usually classified into additive or multiplicative faults according to their effects on a process. In general, additive faults affect processes as unknown inputs, while multiplicative faults usually have important effects on the process dynamics and can cause unstable behaviors. Abrupt faults are sudden (step-like) changes in the behavior of the system, while incipient faults are gradual, slowly drifting faults.
Permanent faults lead to the total failure of the equipment (once they occur they do not disappear), transient faults are temporary malfunctioning (appear for a short time and then disappear), and intermittent faults are the repeated occurrences of transient faults (they appear, disappear, and then reappear). Hidden faults are those which are present on standby equipment and visible only when this equipment is activated.
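The fault categories defined above can be made concrete with a few toy signal generators. This is a sketch for illustration only: the function names, the linear drift and the periodic on/off pattern are assumptions chosen for simplicity, not models taken from the chapter.

```python
def abrupt(k, k0, size):
    """Step-like fault: zero before sample k0, constant afterwards."""
    return size if k >= k0 else 0.0

def incipient(k, k0, slope):
    """Slow drift: grows linearly once the fault has started at k0."""
    return slope * (k - k0) if k >= k0 else 0.0

def intermittent(k, k0, period, duty, size):
    """Fault that appears, disappears and reappears with a fixed period."""
    if k < k0:
        return 0.0
    return size if (k - k0) % period < duty else 0.0

def measure(x, additive=0.0, gain_loss=0.0):
    """Sensor output: a multiplicative fault scales x, an additive one offsets it."""
    return (1.0 - gain_loss) * x + additive

# A healthy reading of 10.0 seen through two different sensor faults.
biased = measure(10.0, additive=abrupt(k=7, k0=5, size=0.5))
degraded = measure(10.0, gain_loss=0.2)
```

An additive fault shifts the output regardless of the signal level, whereas the multiplicative one scales with `x`, which is why it interacts with the process dynamics.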
Chapter
This chapter is dedicated to space applications. Three application cases will be presented: an Earth observation satellite, a deep space mission and an atmospheric re-entry vehicle. The design method is based on H∞/H− tools and is associated with a suitable post-analysis process, the so-called generalized μ-analysis. It is shown that the resulting design/analysis procedure provides an iterative refinement cycle which allows the designer to get "as close as possible" to the required robustness/performance specifications and trade-offs. This chapter is dedicated to actuator fault detection and diagnosis in space applications. Fault tolerance in terms of control and guidance will also be discussed. The design method is based on H∞/H− and robust pole assignment tools. Three space applications will be studied:
Article
This paper presents a novel failure-tolerant architecture for future robotic spacecraft. It is based on the Time and Space Partitioning (TSP) principle as well as a combination of Artificial Intelligence (AI) and traditional concepts for system failure detection, isolation and recovery (FDIR). Unlike a classic payload, which is separated from the platform, robotic devices attached to a satellite become an integral part of the spacecraft itself. Hence, the robot needs to be integrated into the overall satellite FDIR concept in order to prevent fatal damage upon hardware or software failure. In addition, complex dexterous manipulators as required for on-orbit servicing (OOS) tasks may reach unexpected failure states, where classic FDIR methods reach the edge of their capabilities with respect to successfully detecting and resolving them. Combining, and partly replacing, traditional methods with flexible AI approaches aims to yield a control environment that features increased robustness, safety and reliability for space robots. The developed architecture is based on a modular on-board operational framework that features deterministic partition scheduling, an OS abstraction layer and a middleware for standardized inter-component and external communication. The supervisor (SUV) concept is utilized for exception and health management as well as deterministic system control and error management. In addition, a Kohonen self-organizing map (SOM) approach was implemented, yielding a real-time robot sensor confidence analysis and failure detection. The SOM features unsupervised training given a typical set of defined world states. By compiling a set of reviewable three-dimensional maps, alternative strategies in case of a failure can be found, increasing operational robustness. As a demonstrator, a satellite simulator was set up featuring a client satellite that is to be captured by a servicing satellite with a 7-DoF dexterous manipulator.
The avionics and robot control were integrated on an embedded, space-qualified Airbus e.Cube on-board computer. The experiments showed that the integration of SOM for robot failure detection positively complemented the capabilities of traditional FDIR methods.
Conference Paper
FDIR functionalities are investigated from the very beginning of a space mission and play a relevant role in the definition of its autonomy, reliability and availability objectives. In this paper, an analytical methodology derived from the Timed Failure Propagation Graph (ETFPG) is proposed. TFPG is a causal model that captures the temporal aspects of failure propagation in a wide variety of engineering systems. It has been extended in order to incorporate the recovery actions as well as to accommodate the dependencies on the mission phases and spacecraft operational modes in the related graphs. The proposed methodology has proved to be very useful for fault identification and troubleshooting in complex systems such as a satellite.
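To make the notion of a timed, mode-dependent failure propagation graph more tangible, the sketch below computes when each downstream discrepancy could be observed after a failure mode occurs. It is a simplified reading of the TFPG idea and not the methodology of the paper: the tuple layout, the OR semantics for nodes with several parents, and all node and mode names are assumptions for illustration, and recovery actions are not modelled.

```python
def propagation_windows(edges, source, t0, mode):
    """Earliest/latest activation time of each node reachable from `source`.

    edges : list of (src, dst, tmin, tmax, modes) tuples with non-negative
            delays; an edge only propagates while the system is in `modes`
    source: failure mode assumed to occur at time t0
    mode  : current operational mode of the spacecraft
    """
    windows = {source: (t0, t0)}
    changed = True
    while changed:                        # simple fixed-point relaxation
        changed = False
        for src, dst, tmin, tmax, modes in edges:
            if src not in windows or mode not in modes:
                continue
            lo, hi = windows[src]
            cand = (lo + tmin, hi + tmax)
            old = windows.get(dst)
            # Assumed OR semantics: a node fires as soon as any parent reaches it.
            new = cand if old is None else (min(old[0], cand[0]),
                                            min(old[1], cand[1]))
            if new != old:
                windows[dst] = new
                changed = True
    return windows

# Hypothetical graph: a wheel failure shows up on speed, then on attitude.
EDGES = [
    ("FM_wheel", "D_speed", 1, 3, {"NOMINAL", "SAFE"}),
    ("D_speed", "D_attitude", 2, 5, {"NOMINAL"}),
    ("FM_wheel", "D_attitude", 10, 20, {"NOMINAL", "SAFE"}),
]
nominal = propagation_windows(EDGES, "FM_wheel", 0, "NOMINAL")
```

An alarm observed outside its computed window would be inconsistent with the hypothesised failure mode, which is the kind of check such a graph supports.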
Conference Paper
Ensuring the reliability of complex systems requires taking into account possible failures and strategies to detect and recover from system faults. This leads designers to consider models and algorithms capable of simulating and verifying fault detection, isolation and recovery (FDIR) strategies in different scenarios characterized by uncertainty and partial information. Different solutions have been proposed. In this paper we present Trouble, a domain-specific language aimed at describing and simulating troubleshooting algorithms. Different examples highlight the advantages of such an approach.
Article
Estimating the state of a hybrid system means accounting for the mode of operation or failure and the current state of the continuously valued entities concurrently. Existing hybrid estimation schemes try to overcome the problem of an exponentially growing number of possible mode-sequence/continuous-state combinations by merging hypotheses and/or deducing likelihood measures to identify tractable sets of the most likely hypotheses. However, they still suffer from unnecessarily high computational costs as the number of possible modes increases. Hybrid diagnosis schemes, on the other hand, estimate the current mode of operation/failure only, thus leaving the continuous evolution of the system implicit. This paper proposes a novel scheme that uses a combination of both approaches in order to define posterior transition probabilities between the specified modes of the hybrid system, hence focusing better on relevant hypotheses. In order to demonstrate the effectiveness of the proposed method, the algorithm is applied to a satellite attitude control system and compared with existing hybrid estimation/diagnosis schemes, such as the Interacting Multiple Model (IMM) algorithm, a purely parity-based method (HyDiag), and an existing Hybrid Mode Estimation (HME) algorithm.
Conference Paper
Fault detection and diagnosis is a crucial aspect in spacecraft operations and on-board software with respect to safety, reliability and performance. The success of a space mission depends on the adequate and timely system reaction to unexpected environmental changes and fault or failure of components or subsystems. Within this paper, case studies of current spacecraft applications are categorized with respect to the level of autonomy that is reached. The potential of further concepts that were not yet studied in the context of spacecraft fault detection, isolation and recovery (FDIR), but in industrial and aerospace applications are examined. As a result, the cognitive automation approach, developed by the Institute of Flight Systems of the Bundeswehr University München, is identified as highly promising technology for enhancing spacecraft on-board autonomy: Context sensitive reactions in case of unexpected failure are enabled by knowledge about the current system and environmental state, system operational capabilities and the impact of faults and recovery actions on the system and system performance. A study that applies the cognitive automation concept to the power subsystem of an interplanetary spacecraft is proposed.
Conference Paper
This paper gives an overview of recent progress in theory and methods to analyze and design fault diagnosis and fault tolerant control techniques for aerospace systems. Passive and active approaches are presented and analyzed. The strengths and shortcomings of each approach are pointed out. Open problems related to the topic are also highlighted. The paper is written in a tutorial fashion to summarize some of the recent results in the subject area without going into details. A bibliographical review summarizing a decade of references is provided to allow interested readers to obtain more detailed information about the recent contributions in the field. Since the general areas of Fault Tolerant Control (FTC) and Fault Detection and Isolation (FDI) draw from a number of different technical areas in engineering and applied mathematics, no survey paper could hope to capture all existing contributions in the field.
Article
This paper describes an implementation of the 3T robot architecture which has been under development for the last eight years. The architecture uses three levels of abstraction and description languages which are compatible between levels. The makeup of the architecture helps to coordinate planful activities with real-time behaviours for dealing with dynamic environments. In recent years, other architectures have been created with similar attributes but two features distinguish the 3T architecture: (1) a variety of useful software tools have been created to help implement this architecture on multiple real robots; and (2) this architecture, or parts of it, have been implemented on a variety of very different robot systems using different processors, operating systems, effectors and sensor suites.
Conference Paper
This paper briefly describes a robot architecture that has been under development for the last eight years. This architecture uses several levels of abstraction and description languages that are compatible between levels. The makeup of the architecture helps to coordinate planful activities with real-time behaviors for dealing with dynamic environments. In recent years, many architectures have been created with similar attributes. The two features that distinguish this architecture from most of those are: 1) a variety of useful software tools have been created to help implement this architecture on multiple real robots; and 2) this architecture, or parts of it have been implemented on over half a dozen very different robot systems using a variety of processors, operating systems, effectors and sensor suites.
Conference Paper
This paper describes an implementation of the 3T robot architecture which has been under development for the last eight years. The architecture uses three levels of abstraction and description languages which are compatible between levels. The makeup of the architecture helps to coordinate planful activities with real-time behaviours for dealing with dynamic environments. In recent years, other architectures have been created with similar attributes but two features distinguish the 3T architecture: (1) a variety of useful software tools have been created to help implement this architecture on multiple real robots; and (2) this architecture, or parts of it, have been implemented on a variety of very different robot systems using different processors, operating systems, effectors and sensor suites.
Article
In this paper we propose a hybrid system modeling framework aimed at analyzing diagnosability. In this framework, the hybrid system is seen as the composition of an underlying discrete event system and an underlying continuous system. The diagnosability of these two underlying systems is fully analyzed and new results are provided for the underlying continuous system (called the multimode system). Based on these results, a hybrid language that contains 'natural' discrete events and discrete events capturing the continuous dynamics is defined. On the basis of this language, the diagnosability definition of hybrid systems is provided. With respect to this definition, we prove that the diagnosability of the underlying continuous or discrete event system is only a sufficient condition. Diagnosability of hybrid systems must be decided by coupling both discrete event and continuous information. Finally, the necessary and sufficient condition of hybrid diagnosability is given.
Article
An autonomous robot offers a challenging and ideal field for the study of intelligent architectures. Autonomy within a rational behavior could be evaluated by the robot's effectiveness and robustness in carrying out tasks in different and ill-known environments. It raises major requirements on the control architecture. Furthermore, a robot as a programmable machine brings up other architectural needs, such as the ease and quality of its specification and programming. This article describes an integrated architecture that allows a mobile robot to plan its tasks, taking into account temporal and domain constraints, to perform corresponding actions and to control their execution in real time, while being reactive to possible events. The general architecture is composed of three levels: a decision level, an execution level, and a functional level. The latter is composed of modules that embed the functions achieving sensor-data processing and effector control. The decision level is goal and event driven, and it may have several layers, according to the application; their basic structure is a planner/supervisor pair that enables the architecture to integrate deliberation and reaction. The proposed architecture relies naturally on several representations, programming paradigms, and processing approaches, which meet the precise requirements that are specified for each level. The authors have developed proper tools to meet these specifications and implement each level of the architecture: a temporal planner, IxTeT; a procedural system for task refinement and supervision, PRS; Kheops for the reactive control of the functional level; and GenoM for the specification and integration of modules at that level. Validation of the temporal and logical properties of the reactive parts of the system, through these tools, is presented. Instances of the proposed architecture have been integrated into several indoor and outdoor robots.
Examples from real-world experimentations are provided and analyzed.
Article
Hybrid systems serve as a powerful modeling paradigm for representing complex continuous controlled systems that exhibit discrete switches in their dynamics. The system and the models of the system are nondeterministic because they operate in an uncertain environment. Bayesian belief update approaches to stochastic hybrid system state estimation face a blow-up in the number of state estimates. Therefore, most popular techniques try to maintain an approximation of the true belief state by either sampling or maintaining a limited number of trajectories. These limitations can be avoided by using bounded intervals to represent the state uncertainty. This alternative leads to splitting the continuous state space into a finite set of possibly overlapping geometrical regions that, together with the system modes, form configurations of the hybrid system. As a consequence, the true system state can be captured by a finite number of hybrid configurations. A set of dedicated algorithms that can efficiently compute these configurations is detailed. Results are presented on two systems from the hybrid systems literature.
Conference Paper
A model-based approach for on-line fault diagnosis in satellites is presented in this paper. The satellite studied is named MICROSCOPE. MICROSCOPE is a microsatellite under development at CNES (Centre National d'Etudes Spatiales, France). Its attitude and linear acceleration are controlled by 12 thrusters. The class of faults to be diagnosed consists of thruster faults, which occur when a thruster blocks or closes during operation. The method used to tackle the problem is based on H fault estimation techniques. The solution involves linear matrix inequality (LMI) optimization techniques. Simulation results illustrate the potential of the method.
Article
Fault detection and isolation is a crucial and challenging task in the automatic control of large complex systems. We propose a discrete-event system (DES) approach to the problem of failure diagnosis. We introduce two related notions of diagnosability of DESs in the framework of formal languages and compare diagnosability with the related notions of observability and invertibility. We present a systematic procedure for detection and isolation of failure events using diagnosers and provide necessary and sufficient conditions for a language to be diagnosable. The diagnoser performs diagnostics using online observations of the system behavior; it is also used to state and verify off-line the necessary and sufficient conditions for diagnosability. These conditions are stated on the diagnoser or variations thereof. The approach to failure diagnosis presented in this paper is applicable to systems that fall naturally in the class of DESs; moreover, for the purpose of diagnosis, most continuous variable dynamic systems can be viewed as DESs at a higher level of abstraction.
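The online role of a diagnoser described above can be sketched as a state estimator that tracks, for every state consistent with the observations, whether a fault event must have occurred. The code below is a minimal illustration under stated assumptions, not the construction of the paper: the automaton is deterministic and given as a hypothetical `trans` dictionary, a single fault label "F" is used, and the diagnoser is computed on the fly rather than compiled off-line.

```python
def unobservable_reach(estimate, trans, observable, faults):
    """Close a set of (state, label) pairs under unobservable events."""
    stack, seen = list(estimate), set(estimate)
    while stack:
        state, label = stack.pop()
        for (src, event), dst in trans.items():
            if src != state or event in observable:
                continue
            # Crossing a fault event switches the label to "F" for good.
            nxt = (dst, "F" if event in faults else label)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def diagnose(trans, initial, observable, faults, observations):
    """Return 'normal', 'faulty', 'ambiguous' or 'inconsistent'."""
    estimate = unobservable_reach({(initial, "N")}, trans, observable, faults)
    for obs in observations:
        stepped = {(trans[(s, obs)], lab)
                   for s, lab in estimate if (s, obs) in trans}
        estimate = unobservable_reach(stepped, trans, observable, faults)
    labels = {lab for _, lab in estimate}
    if not labels:
        return "inconsistent"             # the model cannot explain the trace
    if labels == {"N"}:
        return "normal"
    if labels == {"F"}:
        return "faulty"
    return "ambiguous"

# Hypothetical model: unobservable fault "f" leads to a branch that emits "c".
TRANS = {(0, "a"): 1, (1, "b"): 2, (0, "f"): 3, (3, "a"): 4, (4, "c"): 5}
verdict = diagnose(TRANS, 0, {"a", "b", "c"}, {"f"}, ["a", "c"])
```

After observing only "a" the verdict is still ambiguous; the second observation settles it, which mirrors the finite-delay flavour of diagnosability.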
Article
This paper deals with the problem of diagnosing systems that exhibit both continuous and discrete event dynamics. The proposed approach combines techniques from both continuous and discrete event diagnosis fields. On the one hand, an extension of the parity space approach is used to associate signatures to every operational mode of the system. On the other hand, signature switches arising from the transition from one mode to another are abstracted in the form of a set of events that capture the continuous dynamics. These events are merged into the original discrete dynamic model of the system, allowing us to apply the well-known discrete-event-systems diagnoser approach. This is illustrated on an example that shows the diagnosability improvement of the hybrid approach. Copyright © 2007 International Federation of Automatic Control. All Rights Reserved.
Conference Paper
We address the problem of fault diagnosis in discrete-event systems. Our contribution is the development of a set of specialised diagnosers whose computation is much more realistic than that of the classical diagnoser. A specialised diagnoser is devoted to the diagnosis of one particular type of fault and is based on the observation of only a subpart of the system.
Conference Paper
This paper deals with the problem of diagnosing systems that exhibit both continuous and discrete event dynamics. The proposed approach combines techniques from both continuous and discrete event diagnosis fields. On the one hand, an extension of the parity space approach is used to associate signatures to every operational mode of the system. On the other hand, signature switches arising from the transition from one mode to another are abstracted in the form of a set of events that capture the continuous dynamics. These events are merged into the original discrete dynamic model of the system, allowing us to apply the well-known discrete-event-systems diagnoser approach. This is illustrated on an example that shows the diagnosability improvement of the hybrid approach.
Article
We address the problem of fault diagnosis in discrete-event systems. Our contribution is the development of a set of specialised diagnosers whose computation is much more realistic than that of the classical diagnoser. A specialised diagnoser is devoted to the diagnosis of one particular type of fault and is based on the observation of only a subpart of the system. Copyright © 2006 IFAC
Article
Renewed motives for space exploration have inspired NASA to work toward the goal of establishing a virtual presence in space, through heterogeneous fleets of robotic explorers. Information technology, and Artificial Intelligence in particular, will play a central role in this endeavor by endowing these explorers with a form of computational intelligence that we call remote agents. In this paper we describe the Remote Agent, a specific autonomous agent architecture based on the principles of model-based programming, on-board deduction and search, and goal-directed closed-loop commanding, that takes a significant step toward enabling this future. This architecture addresses the unique characteristics of the spacecraft domain that require highly reliable autonomous operations over long periods of time with tight deadlines, resource constraints, and concurrent activity among tightly coupled subsystems. The Remote Agent integrates constraint-based temporal planning and scheduling, robust multi-threaded execution, and model-based mode identification and reconfiguration. The demonstration of the integrated system as an on-board controller for Deep Space One, NASA's first New Millennium mission, is scheduled for a period of a week in mid 1999. The development of the Remote Agent also provided the opportunity to reassess some of AI's conventional wisdom about the challenges of implementing embedded systems, tractable reasoning, and knowledge representation. We discuss these issues, and our often contrary experiences, throughout the paper.
Decisional architecture for autonomous space systems
  • S Lemai
  • M Charmeau
  • X Olive
Lemai, S., Charmeau, M. and Olive, X. (2006). Decisional architecture for autonomous space systems, 9th ESA Workshop on Advanced Space Technologies for Robotics and Automation (ASTRA 2006), ESTEC, Noordwijk, The Netherlands.
Space engineering: Space segment operability, European Cooperation for Space Standardization Standard
ECSS (2005). Space engineering: Space segment operability, European Cooperation for Space Standardization Standard, August.
Java as a standardized on-board control procedures platform?
  • G Garcia
  • C Roubion
  • S Prunier
Garcia, G., Roubion, C. and Prunier, S. (2004). Java as a standardized on-board control procedures platform?, Actes de DAta Systems In Aerospace (DASIA), Nice, France.
Decisional architecture for autonomous space systems
  • S. Lemai