SYMIAN: Analysis and performance improvement of the IT incident management process

HP Labs., Palo Alto, CA, USA
IEEE Transactions on Network and Service Management 10/2010; 7(3):132 - 144. DOI: 10.1109/TNSM.2010.1009.I9P0321
Source: IEEE Xplore


Incident Management is the process through which IT support organizations restore normal service operation after a service disruption. The complexity of real-life enterprise-class IT support organizations makes it extremely hard to understand how organizational, structural, and behavioral components affect the performance of the currently adopted incident management strategy and, consequently, which actions could improve it. This paper presents SYMIAN, a decision support tool for improving the performance of the incident management function in IT support organizations. SYMIAN simulates the effect of corrective measures before their actual implementation, saving time, effort, and cost. To this end, SYMIAN models the IT support organization as an open queuing network, which enables the evaluation of both the system-wide dynamics and the behavior of the individual organization components and their interactions. Experimental results demonstrate SYMIAN's effectiveness in the performance analysis and tuning of the incident management process for real-life IT support organizations.
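
As a rough illustration of what modeling a support organization as an open queuing network can mean, the sketch below solves the traffic equations for a hypothetical two-level support organization and derives per-group utilization and mean time to resolution. It is not SYMIAN's implementation: SYMIAN is a simulator, whereas this sketch assumes a Jackson-style network with single-server groups and exponential service times so that closed-form M/M/1 results apply. All group names, rates, and routing probabilities are invented for the example.

```python
def solve_arrival_rates(external, routing, iterations=200):
    """Solve the traffic equations
        rate[j] = external[j] + sum_i rate[i] * routing[i][j]
    by fixed-point iteration (converges for an open network in which
    every incident eventually leaves)."""
    n = len(external)
    rates = list(external)
    for _ in range(iterations):
        rates = [
            external[j] + sum(rates[i] * routing[i][j] for i in range(n))
            for j in range(n)
        ]
    return rates


def mm1_metrics(arrival_rate, service_rate):
    """Return (utilization, mean time in group) for an M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("group is overloaded: arrival rate >= service rate")
    utilization = arrival_rate / service_rate
    mean_time = 1.0 / (service_rate - arrival_rate)
    return utilization, mean_time


# Hypothetical organization: incidents enter at level 1; 25% are escalated
# to level 2; everything else is closed and leaves the network.
groups = ["level-1 help desk", "level-2 specialists"]
external = [4.0, 0.0]          # new incidents per hour, per group (assumed)
service = [5.0, 2.0]           # incidents resolved per hour, per group (assumed)
routing = [
    [0.0, 0.25],               # level 1 -> level 2 with probability 0.25
    [0.0, 0.0],                # level 2 never forwards
]

rates = solve_arrival_rates(external, routing)
metrics = [mm1_metrics(lam, mu) for lam, mu in zip(rates, service)]

# Little's law over the whole network: mean incidents in the system divided
# by the total external arrival rate gives the mean time to resolution.
incidents_in_system = sum(lam * w for lam, (_, w) in zip(rates, metrics))
mean_time_to_resolution = incidents_in_system / sum(external)

for name, lam, (rho, w) in zip(groups, rates, metrics):
    print(f"{name}: arrival rate {lam:.2f}/h, utilization {rho:.0%}, "
          f"mean time in group {w:.2f} h")
print(f"mean time to resolution: {mean_time_to_resolution:.2f} h")
```

Changing a service rate or an escalation probability and rerunning the script gives a crude "what-if" estimate of a corrective measure, which is the kind of question a simulation-based tool addresses with far fewer simplifying assumptions.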

Available from: Mauro Tortonesi, Jul 03, 2015
    • "We adopted Symian, a state-of-the-art simulator which we developed in the context of our research [7] "
    ABSTRACT: Business-driven IT management practices often involve the performance optimization of a system according to business criteria. The increased attention to dynamical aspects of the system behavior, the preference for simulative approaches rather than analytical ones, and the increased level of complexity posed by business-driven performance evaluation significantly complicate the optimization of BDIM systems and demand a radical rethinking of methodologies and tools. This raises the opportunity to devise and implement common methodology and tools that could be used for a large class of different BDIM optimization problems. This paper proposes a generic framework for the dynamic and adaptive optimization of BDIM systems, introduces the Open Source ruby-mhl metaheuristics library, and provides an experimental evaluation in the context of a realistic case study.
    Full-text · Article · Jun 2015
    • "There have also been relevant research efforts to improve other aspects of IT service management. For example, Bartolini et al. [17] address the simulation and optimization of the IT incident management process to improve the handling of tickets in a service organization. Sauvé et al. [18] propose the prioritization of changes based on the assessed risks. "
    ABSTRACT: The success of businesses in modern organizations heavily depends on the high availability of information technology (IT) infrastructures. To prevent business disruption, IT operators have worked hard to ensure that any changes to this infrastructure are properly and efficiently deployed. Change management—a discipline of the Information Technology Infrastructure Library (ITIL)—provides important guidance to help achieve this end. As IT infrastructures grow larger, however, ensuring that changes are harmless to business continuity becomes increasingly complex. In fact, previous research has shown that existing approaches for verifying changes suffer from severe scalability issues. This problem can become a serious threat to most organizations, as it can lead for example to customer dissatisfaction due to missed deadlines in service change deployment. To bridge this gap, we propose a partial-order reduction model checking paradigm and algorithm for efficiently detecting harmful change operations. Our model improves the complexity of verifying a set of concurrent change activities against safety constraints by reducing—without losing effectiveness—the verification scope. To prove concept and technical feasibility, we carried out an extensive performance evaluation of our algorithm considering a variety of change activities, safety constraints, and configuration scenarios. The results obtained from 32 benchmarks have shown that our algorithm significantly outperformed state-of-the-art, general purpose model checkers, improving the runtime complexity from polynomial/exponential to linear. In summary, the results evidenced that change verification finally became feasible and efficient for larger IT infrastructures.
    Full-text · Article · Sep 2014 · IEEE Transactions on Network and Service Management
    • "The importance of IT Service Management (ITSM) has increased [5] since it promotes a better alignment between IT and business needs, managing efficiency through the provision of IT services [6]. As a result, organizations adopt best practices from IT Infrastructure Library (ITIL) [5] to ensure the delivery of IT services with quality, efficiency, effectiveness, and less cost [5] [7] [8]. "
    ABSTRACT: Information Technology Infrastructure Library (ITIL) has become the best and most adopted practice framework to implement IT Service Management (ITSM) within organizations. The Service Catalog is a fundamental basis for ITIL adoption because it is the source of information about services. However, organizations often fail in the service identification activity, compromising the creation of the service catalog. Moreover, ITIL neither states how to develop a service catalog nor offers a standard one. In this paper, we propose an IT Service Reference Catalog (ITSRC) as a basis to start creating a Service Catalog or as a reference to adapt. We evaluated the proposal in real-world settings using Design Science Research methodology, and positive results were achieved.
    Full-text · Article · Jan 2013