J. Torin

Chalmers University of Technology, Göteborg, Västra Götaland, Sweden

Publications (37); Total impact: 2.64

  • ABSTRACT: This paper presents how state consistency among distributed control nodes is maintained in the presence of faults. We analyze a fault-tolerant semi-synchronous architecture concept of a Distributed Flight Control System (DFCS). This architecture has been shown to be robust against transient faults of continuous signals through inherent replica consistency. This approach necessitates neither atomic broadcast nor replica determinism. Here, we extend the analysis of the replica consistency property to confirm robustness against transient faults in discrete signals in the presence of a single permanent fault in a control node. The paper is based on a case study on the JAS 39 Gripen, a modern fourth-generation multi-purpose combat aircraft, presently operating with a centralized FCS. Our goal is to design the DFCS fault management mechanisms so that the distributed treatment of faults corresponds to the existing non-distributed FCS. In particular, fault management mechanisms not existing in the present centralized system but only in the distributed system are considered.
    System Sciences, 2005. HICSS '05. Proceedings of the 38th Annual Hawaii International Conference on; 02/2005
  • ABSTRACT: This work presents how state consistency among distributed control nodes is maintained in the presence of faults. We analyze a fault-tolerant semi-synchronous architecture concept of a distributed flight control system (DFCS). This architecture has been shown to be robust against transient faults of continuous signals through inherent replica consistency. This approach necessitates neither atomic broadcast nor replica determinism. Here, we extend the analysis of the replica consistency property to confirm robustness against transient faults in discrete signals in the presence of a single permanent fault in the DFCS components. The paper is based on a case study on the JAS 39 Gripen, a modern fourth-generation multi-purpose combat aircraft, presently operating with a centralized FCS. Our goal is to design the DFCS fault management mechanisms so that the distributed treatment of faults corresponds to the existing non-distributed FCS.
    Digital Avionics Systems Conference, 2004. DASC 04. The 23rd; 11/2004
  • ABSTRACT: Volvo Car Corporation and the Royal Institute of Technology initiated a joint project named FAR in October 2002. FAR stands for function and architecture integration. Ten M.Sc. students took part in the project as a special class. The focus of the project was the development of a portable drive-by-wire system using model-based development and reference architectures. The deliverables from the project were a tool chain for automatic code generation from Matlab Simulink models and a prototype vehicle in scale 1:5. The project was very successful and the result was delivered to Volvo Cars in June 2003. The project deliverables have been further developed at Volvo Cars.
    Mechatronics, 2004. ICM '04. Proceedings of the IEEE International Conference on; 07/2004
  • ABSTRACT: We present the RedCAN concept to achieve fault tolerance against node and link failures in a CAN-bus system by means of configurable switches. The basic idea in RedCAN is to isolate faulty nodes or bus segments by configuring switches that will evade a faulty node or segment and exclude it from bus access. We propose changes to the original centralized protocol, which is vulnerable to single-point failures, and show that a new distributed algorithm achieves considerably more efficiency, also as network size grows. The distributed algorithm introduces redundancy and thereby increases the robustness of the system. Furthermore, the new algorithm has logarithmic complexity, as opposed to the centralized algorithm's linear complexity, as the number of nodes increases. The results were gathered through a new simulator, the "RedCAN Simulation Manager", also presented. Simulations allow assessing the break-even point between the reconfiguration latencies of the centralized and distributed algorithms, and give ideas for further research.
    Dependable Computing, 2004. Proceedings. 10th IEEE Pacific Rim International Symposium on; 04/2004
  • Article: Unknown
    ABSTRACT: This paper presents the RedCAN concept to achieve fault tolerance against node and link failures in a CAN-bus system by means of configurable switches. The basic idea in RedCAN is to isolate faulty nodes or bus segments by configuring switches that will evade a faulty node or segment and exclude it from bus access. We propose changes to the original centralized protocol, which is vulnerable to single-point failures, and show that a new distributed algorithm achieves considerably more efficiency, also as network size grows. The distributed algorithm introduces redundancy and thereby increases the robustness of the system. Furthermore, the new algorithm has logarithmic complexity, as opposed to the centralized algorithm's linear complexity, as the number of nodes increases. The results were gathered through a new simulator, the "RedCAN Simulation Manager", also presented in the paper. Simulations allow assessing the break-even point between the reconfiguration latencies of the centralized and distributed algorithms, and give ideas for further research.
    01/2004
  • ABSTRACT: In the early stages of a design process, a detailed hazard analysis should be performed, particularly for safety-critical systems. In this paper an actuator-based hazard analysis method is presented. Since it is the actuators that affect the system's environment, this actuator-based approach is the logical approach for an early hazard analysis when only limited information about the system implementation is available. This approach is also unique in that all identified failures are distributed over four different severities. A criticality ranking is assigned to each failure as a combination of the severities and their distribution. This ranking is also used to give an indication of the preferred fail states. For hazards with a high criticality that need to be handled, the method supports a solvability analysis between different design solutions. This solvability analysis rewards design concepts that handle hazards with high criticality numbers.
    Computer Safety, Reliability, and Security, 23rd International Conference, SAFECOMP 2004, Potsdam, Germany, September 21-24, 2004, Proceedings; 01/2004
  • ABSTRACT: Our paper presents a new membership agreement algorithm that addresses asymmetric timing faults and includes a new tool simulating TTP/C clusters. The proposed algorithm flags deviating or slightly untimely messages to assure that single marginal transmitting faults are detected and that only the faulty node will be expelled. The tool can demonstrate the behavior of membership agreement algorithms such as the original TTP-C1 algorithm or our modified flagging algorithm. The performed simulations use timing faults logged during heavy-ion fault injection experiments. The gathered results show that the rare faults that made a network using the original algorithm either collapse or become degraded are detected and handled with the new algorithm without loss of more than the faulty node.
    International Federation for Information Processing Digital Library; Design Methods and Applications for Distributed Embedded Systems; 01/2004
  • ABSTRACT: Research in distributed dependable control systems within the automotive industry is of high importance today. One reason is the introduction of more mechatronic systems. Volvo Car Corporation and the Royal Institute of Technology initiated a joint project in October 2002 to target this technology change. The project was named FAR, which stands for Function and ARchitecture integration. FAR focused on the development of drive-by-wire systems using model-based development. The deliverables from the project were a tool chain for automatic code generation from Matlab Simulink and Matlab Stateflow models and also a prototype vehicle in scale 1:5. It was a very successful project and the result was delivered to Volvo Cars in June 2003. The project deliverables have been further developed at Volvo Cars since then. Primarily, a new hazard analysis method has been developed and new fault tolerance mechanisms have been implemented.
    Design Methods and Applications for Distributed Embedded Systems, IFIP 18th World Computer Congress, TC10 Working Conference on Distributed and Parallel Embedded Systems (DIPES 2004), 22-27 August 2004, Toulouse, France; 01/2004
  • ABSTRACT: Not Available
    Dependable Systems and Networks, 2003. Proceedings. 2003 International Conference on; 07/2003
  • ABSTRACT: This article describes heavy-ion fault injections in a fault-tolerant communication controller (TTP/C1). The fault-injected device was part of a time-triggered distributed system consisting of four to nine nodes, utilizing a physical broadcast bus. The size of clusters was studied in order to investigate whether fault tolerance depends on the number of nodes when faults are injected in one node, and whether validation results gathered in a four-node system would remain valid for a larger cluster. The results show that a larger cluster became more robust against fail-silence violations of the fault-injected node and that the appearance rate of so-called slightly-out-of-specification errors does not change with larger cluster sizes. One type of failure, namely reintegration errors, was detected only in a four-node system.
    04/2003
  • ABSTRACT: In dependable distributed systems, the communication link is a critical component with strict dependability requirements. The Time-Triggered Protocol (TTP/C) was developed to meet these requirements. To validate this design, one node in a TTP/C cluster was injected with faults using heavy ions. It was a prototype implementation, and cluster sizes of four and five nodes were tested. The experimental results show that arbitrary faults in one node can cause inconsistencies in the cluster and jeopardize the operation of correctly working nodes and the whole cluster. Further, the system's vulnerability to arbitrary failures in single nodes for a cluster with a broadcast bus is shown. Experiments with varying cluster sizes indicate a relationship between cluster size and system vulnerability; thus it seems important to further analyze if and why cluster sizes need to be taken into account when validating distributed systems. The described inconsistencies resulted from asymmetric value faults, asymmetric timing faults or arbitrary single-node failures.
    Dependable Computing, First Latin-American Symposium, LADC 2003, Sao Paulo, Brazil, October 21-24, 2003, Proceedings; 01/2003
  • ABSTRACT: This article describes heavy-ion fault injections in a fault-tolerant communication controller (TTP/C1). The fault-injected device was part of a time-triggered distributed system consisting of four to nine nodes, utilizing a physical broadcast bus. The size of clusters was studied in order to investigate whether fault tolerance depends on the number of nodes when faults are injected in one node, and whether validation results gathered in a four-node system would remain valid for a larger cluster. The results show that a larger cluster became more robust against fail-silence violations of the fault-injected node and that the appearance rate of so-called slightly-out-of-specification errors does not change with larger cluster sizes. One type of failure, namely reintegration errors, was detected only in a four-node system.
    6th IEEE Workshop, Design & Diagnostics of Electronic Circuits & Systems (DDECS); 01/2003
  • ABSTRACT: This paper describes results from fault injection experiments using heavy ions in the time-triggered communication protocol for safety-critical distributed systems (TTP/C, C1 implementation). The observed results show that arbitrary faults in one erroneous node could cause inconsistencies in the cluster and thus jeopardize correctly working nodes and the whole communication system. The described inconsistencies resulted from either asymmetric value faults or slightly-out-of-specification timing faults. This system behavior can be partly explained by overly strict constraints on the fault handling algorithms using the membership agreement protocol.
    01/2003
  • K. Ahlstrom, J. Torin
    ABSTRACT: The development of fault-tolerant embedded control systems such as flight control systems (FCS) is currently highly specialized and time-consuming. We introduce a conceptual architecture for the next-decade control system where all control and logic are distributed to a number of computer nodes locally linked to actuators and connected via a communication network. In this way, we substantially reduce the life-cycle cost of embedded systems and attain scalable fault tolerance. All fault tolerance is based on redundancy. Our philosophy is to cover permanent faults with hardware replication and handle all error processing caused by both permanent and transient faults with software techniques. With intelligent nodes and use of inherent redundancy we introduce a robust and simple fault-tolerant system that utilizes minimum hardware and has bandwidth requirements of less than 300 kbit/s, which can be met with an electrical bus. The study is based on an FCS for JAS 39 Gripen, a multi-role combat aircraft that is statically unstable at subsonic speed.
    IEEE Aerospace and Electronic Systems Magazine; 01/2003 · 0.34 Impact Factor
  • ABSTRACT: In the design of fault-tolerant real-time systems, the most important issue is fault handling and redundancy management. Adding hardware as well as software in order to tolerate faults requires a redundancy strategy to attain and prove the expected as well as the required fault tolerance. This paper presents fault handling strategies of a future distributed architecture for a flight control system (FCS) designed for the JAS 39 Gripen, a modern 4th generation multi-purpose combat aircraft. The results are based on knowledge of and experience from the JAS 39 Gripen, with over 15000 flight hours. Consequently, a highly dependable real-time control system is addressed. However, the principles of the distributed system are general and can be applied to other combat and commercial aircraft as well as to other embedded control systems, e.g. in cars and trains. The distributed architecture aims to tolerate permanent and transient physical faults, whereas software design faults are not catered for. Simulations give experimental results for validation of the fault tolerance qualities of the distributed control system. The fault handling simulations include transient fault recovery, exploring three redundancy principles, and also tests of time limits for permanent fault handling, i.e. system reconfiguration. The results are based on experiments on a simulator validated against the actual aircraft.
    Digital Avionics Systems Conference, 2002. Proceedings. The 21st; 02/2002
  • K. Ahlstrom, J. Torin
    ABSTRACT: The development of fault-tolerant embedded control systems, such as flight control systems (FCS), is currently highly specialized and time-consuming. We introduce a conceptual architecture for the next-decade control system where all control and logic is distributed to a number of computer nodes locally linked to actuators and connected via a communication network. In this way we substantially decrease the life-cycle cost of such embedded systems and acquire scalable fault tolerance. Fault tolerance is based on redundancy, and in our concept permanent faults are covered by hardware replication, while transient faults, fault detection and processing are covered by software techniques. With intelligent nodes and the use of inherent redundancy, a robust and simple fault-tolerant system is introduced with a minimum of both hardware and bandwidth requirements. The study is based on an FCS for JAS 39 Gripen, a multirole combat aircraft that is statically unstable at subsonic speed.
    Digital Avionics Systems, 2001. DASC. 20th Conference; 11/2001
  • ABSTRACT: Mass-produced products are becoming more and more complex, which forces designers to model the functionality early in the design process. UML use cases were found to be a useful method for this purpose at Volvo Cars and are currently used for modeling all functions implemented in the electrical network. When using use cases in the design of complex safety-critical systems, there is still an unmet demand for early hazard analysis at a functional level. This work integrates a modified functional hazard assessment method and use cases. The analysis generates valuable results used as design requirements and dependability analysis input. The method's results have exceeded our expectations. An example is included, showing how the method works.
    Dependable Systems and Networks, 2001. DSN 2001. International Conference on; 08/2001
  • ABSTRACT: Safety-critical mass-produced products are today being implemented without mechanical backup. Most of these will have distributed real-time computer networks, which must be ultra-dependable. A development method for future control-by-wire systems is presented. By using a scalable software architecture and the systems' intrinsic redundancy, it is possible to achieve dependability requirements cost-effectively. The method starts with analyzing a functional task-graph, which gives input for developing a non-redundant architecture. The redundant architecture is modified to meet dependability requirements by adding hardware redundancy. The functionality implemented in software is allocated to the network according to some optimization criteria. The proposed method has been used in two cases: a fly-by-wire aircraft and a drive-by-wire car.
    Engineering of Complex Computer Systems, 2001. Proceedings. Seventh IEEE International Conference on; 02/2001
  • ABSTRACT: An architecture for a highly dependable real-time computer network is presented. The architecture and communication protocol are suited for cyclic time-deterministic applications typically found in embedded control systems. Particular attention has been directed towards the requirements of safety-critical automotive control systems. The scheduling of both network communication and application processes is determined at compile time and is thus completely deterministic. A DACAPO system consists of a number of nodes communicating over two serial buses. Each node is composed of two sets of functionally identical fail-silent units, thus providing tolerance against any single fault. A high degree of error detection coverage and the tolerance towards transient faults inherently associated with cyclic operation combine to yield an architecture with a very high safety level. 1. Introduction DACAPO [1] is a fault-tolerant distributed real-time computer system developed in collab...
    11/1998
  • ABSTRACT: An architecture for a highly dependable real-time computer network is presented. The architecture and communication protocol are suited for cyclic time-deterministic applications typically found in embedded control systems. Particular attention has been directed towards the requirements of safety-critical automotive control systems. The scheduling of both network communication and application processes is determined at compile time and is thus completely deterministic. A DACAPO system consists of a number of nodes communicating over two serial buses. Each node is composed of two sets of functionally identical fail-silent units, thus providing tolerance against any single fault. A high degree of error detection coverage and the tolerance towards transient faults inherently associated with cyclic operation combine to yield an architecture with a very high safety level.
    Intelligent Vehicles '95 Symposium, Proceedings of the; 10/1995