Article

SYMIAN: Analysis and performance improvement of the IT incident management process

HP Labs, Palo Alto, CA, USA
IEEE Transactions on Network and Service Management 10/2010; DOI: 10.1109/TNSM.2010.1009.I9P0321
Source: IEEE Xplore

ABSTRACT Incident Management is the process through which IT support organizations restore normal service operation after a service disruption. The complexity of real-life enterprise-class IT support organizations makes it extremely hard to understand how organizational, structural, and behavioral components affect the performance of the currently adopted incident management strategy and, consequently, which actions could improve it. This paper presents SYMIAN, a decision support tool for improving the performance of the incident management function in IT support organizations. SYMIAN simulates the effect of corrective measures before their actual implementation, saving time, effort, and cost. To this end, SYMIAN models the IT support organization as an open queuing network, which enables the evaluation of both the system-wide dynamics and the behavior of the individual organization components and their interactions. Experimental results demonstrate SYMIAN's effectiveness in the performance analysis and tuning of the incident management process for real-life IT support organizations.
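
To make the open queuing network idea concrete, the following is a minimal sketch and not SYMIAN itself. SYMIAN is a simulator, whereas this example swaps in a closed-form Jackson-network calculation and treats every support group as a single-server M/M/1 queue. The two support levels, the arrival and service rates, the escalation probability, and all function names are hypothetical values chosen for illustration.

```python
# Hypothetical two-level support organization analysed as an open
# (Jackson) queuing network. All rates and names are illustrative
# assumptions, not figures from the SYMIAN paper.

def solve_traffic(gamma, routing, iterations=200):
    """Solve lambda_j = gamma_j + sum_i lambda_i * routing[i][j] by fixed-point iteration."""
    n = len(gamma)
    lam = list(gamma)
    for _ in range(iterations):
        lam = [gamma[j] + sum(lam[i] * routing[i][j] for i in range(n))
               for j in range(n)]
    return lam


def mm1_metrics(lam, mu):
    """Return (utilisation, mean number in system) for a stable M/M/1 queue."""
    rho = lam / mu
    if rho >= 1.0:
        raise ValueError("unstable queue: arrival rate >= service rate")
    return rho, rho / (1.0 - rho)


def analyse(gamma, routing, mu):
    """Per-group and system-wide steady-state metrics for the network."""
    lam = solve_traffic(gamma, routing)
    per_group = [mm1_metrics(l, m) for l, m in zip(lam, mu)]
    total_in_system = sum(length for _, length in per_group)
    return {
        "arrival_rates": lam,
        "utilisation": [rho for rho, _ in per_group],
        "mean_in_system": [length for _, length in per_group],
        # Little's law applied to the whole network: W = L / total external rate.
        "mean_time_to_resolve": total_in_system / sum(gamma),
    }


if __name__ == "__main__":
    # Group 0: service desk, receives 4 incidents/hour, escalates 30% of them.
    # Group 1: specialists, receive only escalated incidents.
    result = analyse(
        gamma=[4.0, 0.0],
        routing=[[0.0, 0.3], [0.0, 0.0]],
        mu=[6.0, 2.0],  # incidents/hour each group can close
    )
    for key, value in result.items():
        print(key, value)
```

Changing one input, for example raising the specialists' service rate or lowering the escalation probability, and re-running `analyse` mimics at a very small scale the what-if evaluation of a corrective measure before it is implemented.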

  • ABSTRACT: Simulation modelling is widely used to support decision-making in different business areas and management tasks. Given the growing importance for real-world organizations to improve Information Technology Service Management (ITSM), this paper focuses on the application of these techniques to support decision-making in this field. A review of published research articles that describe an application case has been conducted and it shows that different simulation approaches are extensively used to solve particular problems in the context of several processes. However, in these works there is no evidence of a systematic use of both ITSM frameworks and simulation model development methodologies. Given their importance to build valid simulation models, this paper proposes a novel decision-making framework whose main component is a specific methodology to systematically build simulation models that help solve real-world organization problems applying ITIL recommendations. To illustrate the usefulness of this framework, two application cases in the context of the ITIL Capacity Management and Incident Management processes are summarized. The model simulations provide information about the process results, performance and behaviour with different process configurations. Moreover, optimization experiments allow managers to determine the optimal process configuration that meets the established objectives.
    Decision Support Systems 06/2014; 66. DOI:10.1016/j.dss.2014.06.002 · 2.04 Impact Factor
  • ABSTRACT: Change Management, a core process of the Information Technology Infrastructure Library (ITIL), is concerned with the management of changes to networks and services to minimize costly disruptions to the business. As part of Change Management, IT changes need to be planned. Previous approaches to automatically generating IT change plans struggle, in terms of scalability, to properly deal with large Configuration Management Databases (CMDBs). To enable IT change planning in the large, in this paper we discuss and analyze optimizations for refinement-based IT change planning over object-oriented CMDBs. Our optimizations reduce the runtime complexity of several key operations that are part of refinement-based IT change planning algorithms. A sensitivity analysis shows that our optimizations outperform SHOP2 - the winner of a previous comparison among IT change planners - in terms of runtime complexity for several important characteristics of IT changes and CMDBs. A cloud deployment case study of a three-tier application and a virtual network configuration case study demonstrate the feasibility of our approach and confirm the results from the sensitivity analysis: IT change planning has evolved from planning in the small to planning in the large.
    2012 8th International Conference on Network and Service Management (CNSM) and 2012 Workshop on Systems Virtualization Management (SVM); 01/2012
  • ABSTRACT: The success of businesses in modern organizations heavily depends on the high availability of information technology (IT) infrastructures. To prevent business disruption, IT operators have worked hard to ensure that any changes to this infrastructure are properly and efficiently deployed. Change management, a discipline of the Information Technology Infrastructure Library (ITIL), provides important guidance to help achieve this end. As IT infrastructures grow larger, however, ensuring that changes are harmless to business continuity becomes increasingly complex. In fact, previous research has shown that existing approaches for verifying changes suffer from severe scalability issues. This problem can become a serious threat to most organizations, as it can lead, for example, to customer dissatisfaction due to missed deadlines in service change deployment. To bridge this gap, we propose a partial-order reduction model checking paradigm and algorithm for efficiently detecting harmful change operations. Our model improves the complexity of verifying a set of concurrent change activities against safety constraints by reducing the verification scope without losing effectiveness. To demonstrate the concept and its technical feasibility, we carried out an extensive performance evaluation of our algorithm considering a variety of change activities, safety constraints, and configuration scenarios. The results obtained from 32 benchmarks show that our algorithm significantly outperformed state-of-the-art, general-purpose model checkers, improving the runtime complexity from polynomial/exponential to linear. In summary, the results show that change verification has become feasible and efficient for larger IT infrastructures.
    IEEE Transactions on Network and Service Management 09/2014; 11(3):292-306. DOI:10.1109/TNSM.2014.2346074