
John Murphy - University College Dublin
236 publications, 81,389 reads, 3,644 citations
Publications (236)
SQL-on-Hadoop systems have been gaining popularity in recent years. One popular example is Apache Hive, the pioneer of SQL-on-Hadoop systems. Hive sits at the top of the big data stack as an application layer. Besides the application layer, the Hadoop ecosystem is composed of three main layers: storage, the resource...
The Hadoop Distributed File System (HDFS) is responsible for storing very large data-sets reliably on clusters of commodity machines. The HDFS takes advantage of replication to serve data requested by clients with high throughput. Data replication is a trade-off between better data availability and higher disk usage. Recent studies propose differen...
Over the years, Human Occupancy Measurement has received, and continues to receive, a fair share of attention from both the research and industry communities. This long-term interest has been supported by recent technological advances, such as the emergence of the Internet of Things (IoT), which offers a cheap alternative for gathering and processing vari...
The security of mobile communication largely depends on the strength of the authentication key exchange protocol. The 3GPP (3rd Generation Partnership Project) group has standardized the 5G AKA (Authentication and Key Agreement) protocol for the next generation of mobile communications. It has been recently shown that the current version of this pr...
There is an increasing user demand for high-quality content-rich multimedia services. Despite their advantages, current wireless networks in general and wireless mesh networks in particular have limitations in terms of quality of service (QoS) provisioning, especially when dealing with increased amounts of time sensitive traffic such as video. This...
Real-time applications, such as video conferences, have strong Quality of Service requirements for ensuring a decent Quality of Experience. Nowadays, most of these conferences are performed over wireless devices. Thus, an appropriate management of both heterogeneous mobile devices and network dynamics is necessary. Software Defined Networking enabl...
Human occupancy measurement has become a topic of increasing interest in the past few years, due to the important role it plays in controlling a number of demand-driven applications like smart lighting and smart heating, as well as improving the energy efficiency of these applications in a broader sense. Office occupancy monitoring in commercial bu...
The Hadoop Distributed File System (HDFS) is the storage of choice when it comes to large-scale distributed systems. In addition to being efficient and scalable, HDFS provides high throughput and reliability through the replication of data. Recent work exploits this replication feature by dynamically varying the replication factor of in-demand dat...
The massive growth in the volume of data and the demand for big data utilisation has led to an increasing prevalence of Hadoop Distributed File System (HDFS) solutions. However, the performance of Hadoop and indeed HDFS has some limitations and remains an open problem in the research community. The ultimate goal of our research is to develop an ada...
The identification of workload-dependent performance issues, as well as their root causes, is a time-consuming and complex process which typically requires several iterations of tests (as this type of issues can depend on the input workloads), and heavily relies on human expert knowledge. To improve this process, this paper presents an automated ap...
Periodic beacon messages are one of the building blocks that enable the operation of vehicular ad hoc networks (VANETs) applications. In vehicular networks environments, congestion and awareness control mechanisms are key for a reliable and efficient functioning of vehicular applications. In order to control the channel load, a reliable mechanism a...
Real-time interactive communications, such as videoconferencing, have strong QoS requirements. Adapting such communications to network dynamics requires network management capabilities that traditional networks cannot provide. Recently, SDN has been shown to be a potential solution for solving these network adaptation challenges. In a companion paper writt...
Videoconferencing applications have strong latency requirements and consume large portions of a network's bandwidth. Current videoconferencing solutions are not efficiently implemented as they often rely on a central server and do not leverage network layering services. In this paper, we investigate the impact of using Scalable Video Coding and So...
Plagiarism in programming assignments is an extremely common problem in universities. While there are many tools that automate the detection of plagiarism in source code, users still need to inspect the results and decide whether there is plagiarism or not. Moreover, users often rely on a single tool (using it as "gold standard" for all cases), w...
The advent of the Internet of Things (IoT) has led to a major change in the way we interact with increasingly ubiquitous connected devices such as smart objects and cyber-physical systems. It has also led to an exponential increase in the number of such Internet-connected devices over the last few years. Conducting extensive functional and performa...
Live video streams, such as those of videoconferencing, have strict delay constraints and consume a great deal of bandwidth. To handle these constraints, videoconferencing applications most often use a centralized server and do not exploit the services offered by the network layers. With the emergence of S...
Performance testing is used to assess if an enterprise application can fulfil its expected Service Level Agreements. However, since some performance issues depend on the input workloads, it is common to use time-consuming and complex iterative test methods, which heavily rely on human expertise. This paper presents an automated approach to dynamica...
The identification of performance issues and the diagnosis of their root causes are time-consuming and complex tasks, especially in clustered environments. To simplify these tasks, researchers have been developing tools with built-in expertise for practitioners. However, various limitations exist in these tools that prevent their efficient usage in...
Generating synthetic data is useful in multiple application areas (e.g., database testing, software testing). Nevertheless, existing synthetic data generators are either limited to generating data that only respect the database schema constraints, or they are not accurate in terms of representativeness, unless a complex set of inputs are given from...
Delivering high-quality audio conferences over the Internet is a complex task. Indeed, the heterogeneity of mobile terminals and the dynamics of the network must be taken into account by MVoIP (Multiparty VoIP) systems in order to ensure a sufficient quality of experience for users. In this contribution, we present a new system...
Achieving high-quality voice conference calls over the Internet is a difficult task. Heterogeneous mobile devices and network dynamics must be properly managed by multiparty VoIP systems to ensure a good quality of experience. In this paper, we propose a multiparty VoIP system based on SDN technology that uses both multicast distribution and dynami...
With the ubiquitous usage of mobile devices, most communications are now impacted by the users' mobility. Therefore, applications and services must be designed to cope with network dynamics produced by those mobility patterns. Software research and development would benefit from taking device mobility into account. However, implementing and testing...
The continuous growth in user demand for high quality rich media services puts pressure on Wireless Mesh Network (WMN) resources. Solutions such as those which increase the capacity of the mesh network by equipping mesh routers with additional wireless interfaces provide better Quality of Service (QoS) for video deliveries, but result in higher ove...
Nowadays, clustered environments are commonly used in high-performance computing and enterprise-level applications to achieve faster response time and higher throughput than single machine environments. Nevertheless, how to effectively manage the workloads in these clusters has become a new challenge. As a load balancer is typically used to distrib...
Channel congestion is a well-known problem in wireless networks in general and Vehicular Ad Hoc Networks (VANETs) in particular. Literature solutions propose to alleviate this problem by controlling the network load based on parameters like vehicle density or packet collision rate. In other words, each vehicle will observe the density of vehicles (...
One of the most popular and efficient methods for conserving energy in Wireless Sensor Networks (WSNs) is data aggregation. This technique usually introduces an additional delay in the transmission of data packets. The inherent trade-off between energy consumption and end-to-end delay imposes an important decision to be made by the nodes, mainly to...
Nowadays, the unprecedented increase in road traffic congestion has led to severe consequences for individuals, the economy and the environment, especially in urban areas in most big cities worldwide. The most critical among the above consequences is the delay of emergency vehicles, such as ambulances and police cars, leading to increased deaths on roads...
Data aggregation techniques have emerged as promising solutions for extending Wireless Sensor Networks (WSNs) lifetime. However, this approach suffers from a design issue in delivering the strict requirements needed by some monitoring applications. Carefully balancing Energy, Delay and Accuracy is essential for achieving these requirements. In this...
Generating synthetic data is useful in multiple application areas (e.g., database testing, software testing). Nevertheless, existing synthetic data generators generally lack the necessary mechanism to produce realistic data, unless a complex set of inputs are given from the user, such as the characteristics of the desired data. An automated and e...
Wireless mesh networks (WMNs) are becoming increasingly popular mostly due to their deployment flexibility. The main drawback of these networks is their lack of guaranteeing high Quality of Service (QoS) levels to their clients. The latest ubiquitous mobile and wireless support and significant growth in smartphone features have fueled user demand f...
Due to the limited capacity of road networks and sporadic on-route events, road traffic congestion poses serious problems in most big cities worldwide and results in a considerable number of casualties and financial losses. In order to deal efficiently with these problems and alleviate their impact on individuals, environment, and economic ac...
As part of the process to test a new release of an application, the performance testing team need to confirm that the existing functionalities do not perform worse than those in the previous release, a problem known as performance regression anomaly. Most existing approaches to analyse performance regression testing data vary according to the appli...
The growing size of cities and increasing population mobility have led to a rapid increase in the number of vehicles on the roads, which has resulted in many challenges for road traffic management authorities in relation to traffic congestion, accidents and air pollution. Over the recent years, researchers from both industry and academia were f...
It is foreseeable that in the next few years, real-time traffic information, including road incident notifications, will be collected and disseminated by mobile vehicles, thanks to their plethora of embedded sensors. Each vehicle can thus actively participate in sharing the collected information with the other peers forming an infrastructure-l...
Highly accurate event detection makes Wireless Sensor Networks popular for real time monitoring applications. Wireless sensor systems that monitor physical and environmental conditions are expected to be deployed with high density, a situation which leads to spatial correlations and redundancy of the collected data. Eliminating these redundancies e...
Wireless Sensor Networks (WSNs) have gained wide-scale popularity for real-time events monitoring and detection due to their high accuracy and ease of deployment. Therefore, they have become increasingly prevalent solutions in several application domains such as health-care, transportation, etc. Many studies in the literature have focused on optim...
Large amounts of data often require expensive and time-consuming analysis. Therefore, highly scalable and efficient techniques are necessary to process, analyze and discover useful information. Database sampling has proven to be a powerful method to surpass these limitations. Using only a sample of the original large database brings the benefit of...
In road networks, the most common metrics to determine the optimal route linking two points are either the path length or the travel time. However, as autonomous smart cars are expected to emerge in future smart cities and lead to an unprecedented growth in the spectrum of mobile applications for both drivers and passengers, we argue that other metrics c...
Wireless Sensor Networks (WSNs) have gained wide-scale popularity for real-time events monitoring and detection due to their high accuracy and ease of deployment. Therefore, they have become increasingly prevalent solutions in several application domains such as health-care, transportation, etc. Many studies in the literature have focused on optimi...
Server sprawl is a problem faced by data centers, which causes unnecessary waste of hardware resources, collateral costs of space, power and cooling systems, and administration. This is usually combated by virtualization based consolidation, and both industry and academia have put many efforts into solving the underlying virtual machine (VM) placem...
In this work, we are interested in periodic beacons transmission, the main cause of the Control Channel (CCH) congestion and the major obstacle delaying the progress of safety messages dissemination in VANETs. In order to offload the network, solutions that range from transmit rate adaptation to transmit power adaptation including hybrid solutions...
Recently, increasing road traffic congestion has attracted a lot of attention from the research community, which aims to propose innovative solutions to reduce the huge economic loss incurred by this problem. In this paper, we first evaluate the impact of random road incidents on the commuters' travel time and the overall traffic congestion level u...
Performance testing in distributed environments is challenging. Specifically, the identification of performance issues and their root causes are time-consuming and complex tasks which heavily rely on expertise. To simplify these tasks, many researchers have been developing tools with built-in expertise. However, limitations exist in these tools, suc...
Data center optimization, mainly through virtual machine (VM) placement, has received considerable attention in the past years. A lot of heuristics have been proposed to give quick and reasonably good solutions to this problem. However, it is difficult to compare them as they use different datasets, while the distribution of resources in the dataset...
Modern computer applications, especially at enterprise-level, are commonly deployed with a big number of clustered instances to achieve a higher system performance, in which case single machine based solutions are less cost-effective. However, how to effectively manage these clustered applications has become a new challenge. A common approach is to...
Wireless Mesh Networks (WMNs) are becoming increasingly popular mostly due to their ease of deployment. One of the main drawbacks of these networks is that they suffer with respect to Quality of Service (QoS) provisioning to their clients. Equipping wireless mesh nodes with multiple radios in order to increase the available network bandwidth has be...
Cloud computing is causing a paradigm shift in the provision and use of software. This has changed the way of obtaining, managing and delivering computing services and solutions. Similarly, it has brought new challenges to software testing. A particular area of concern is the performance of cloud-based applications. This is because the increased co...
Cloud computing is becoming increasingly prevalent, and more and more software providers are offering their applications as Software-as-a-Service solutions rather than traditional on-premises installations. In order to ensure the efficacy of the testing phase, it is critical to create a test environment that sufficiently emulates the production environ...
Increasing and variable traffic demands due to triple play services pose significant Internet Protocol Television (IPTV) resource management challenges for service providers. Managing subscriber expectations via consolidated IPTV quality reporting will play a crucial role in guaranteeing return-on-investment for players in the increasingly competit...
Ensuring fast and reliable dissemination of safety messages is a prerequisite for the establishment of safety applications in Vehicular Ad Hoc Networks (VANETs). However, fulfilling this requirement is a challenging task due to the specific characteristics of VANETs and the increasing density of vehicles in urban areas. Moreover, VANETs may carry...
In the Cloud computing environment, enterprise applications can produce high volumes of data, which need to be effectively analyzed for administrators to understand system behaviors. Many run-time data analysis tools can give the most up-to-date knowledge of the system to administrators. However, when troubleshooting a problem in depth, the offline...
More and more drivers use on-board units to help them navigate in the increasingly urbanised environment they live and work in. These systems (e.g., routing applications on smart phones) are now very often on-line, and use information about the traffic situation (e.g., accidents, congestion) to get the best route. We can now envisage a world where all...
Bugs in a project, at any stage of the software development life cycle, are costly and difficult to find and fix. Moreover, the later a bug is found, the more expensive it is to fix. Static analysis tools ease the process of finding bugs, but their results are hard to filter for critical errors and time-consuming to analyze. To solve...
Database sampling has become a popular approach to handle large amounts of data in a wide range of application areas such as data mining or approximate query evaluation. Using database samples is a potential solution when using the entire database is not cost-effective, and a balance between the accuracy of the results and the computational cost of...
In a wide range of application areas (e.g. data mining, approximate query evaluation, histogram construction), database sampling has proved to be a powerful technique. It is generally used when the computational cost of processing large amounts of information is extremely high, and a faster response with a lower level of accuracy for the results is...
Performance regression testing is an important step in the production process of enterprise applications. Yet, analysing this type of testing data is mainly conducted manually and depends on the load applied during the test. To ease such a manual task we present an automated, load-independent technique to detect performance regression anomalies bas...
Wireless Mesh Networks (WMNs) are becoming increasingly popular and user demand for high-quality rich media services is continuously growing. Despite the fact that WMNs offer significant flexibility, they suffer in respect to Quality of Service (QoS) provisioning. This paper proposes a novel mechanism for providing enhanced QoS support to video ser...
Server consolidation is an important problem in any enterprise, where capital allocators (CAs) must approve any cost saving plans involving the acquisition or allocation of new assets and the decommissioning of inefficient assets. Our paper describes iVMp an interactive VM placement algorithm, that allows CAs to become 'agile' capital allocators th...
Nowadays, enterprise software systems store a large amount of operational information in logs. Manually analysing these data can be time-consuming and error-prone. Although a static knowledge database eases the task to capture recurring problems, maintaining such a knowledge repository requires periodic knowledge updates by domain experts. Moreover...
The main wireless technology used for event sensing and data collection is wireless sensor devices. These sensors are mounted on vehicles or at the roadside and send the collected data periodically or upon incident detection. In the latter case, ensuring low transmission delay from the detecting sensor to the WSN gateway is a real challenge. Indeed, faste...
One of the key features of the media independent handover (MIH) framework, introduced by the IEEE 802.21 standard, is the support for events, including network degradation events which can be triggered based on link layer metrics and propagated to upper layer mobility protocols. As a framework, MIH does not provide specifics on how these events are...
Performance evaluation through regression testing is an important step in the software production process. It aims to make sure that the performance of new releases does not regress under a field-like load. The main outputs of regression tests are the metrics that represent the response time of various transactions as well as the resource utilization...
The vehicle routing problem (VRP) is a generic name referring to optimization problems in the transportation, distribution and logistics industries. These problems mainly focus on serving a number of customers with a number of vehicles. Route planning is one of the main tasks of VRP; it aims to find an optimal route from a starting point to a destination on...
In this paper, we deal with backoff cheating technique in IEEE802.11 based MANETs and propose a novel scheme, dubbed HsF-MAC (Hash Function based MAC protocol), to cope with it. In contrast to the existing solutions, HsF-MAC allows MANET nodes to re-calculate the backoff value used by their 1-hop neighbors and immediately detect the misbehaving one...
This paper offers an overview of the performance engineering field, including some of its latest challenges. Then, it briefly describes the research area of enhancing the performance of JEE systems through leveraging its "Extended Information" and some recent investigation trends on that front. Finally, some future research ideas are presented.
A wireless mesh network is characterized by dynamicity. It needs to be monitored permanently to make sure its properties remain within certain limits in order to provide Quality-of-Service to the end users or to identify possible faults. To establish at every moment the appropriate reporting interval for the measured information and the way...
Enterprise systems produce a vast amount of logging data. This critical and valuable information must be processed automatically for timely system analysis and recovery. As a result of industry demands, a standard database containing known issues has been introduced - a symptom database. Each symptom consists of a rule pattern and corresponding sol...
Deployed software applications use log files to keep a record of system events. Log analysis provides support for system administrators to gain the knowledge of system health and behavior. As a result, the ability to efficiently search for patterns in historical events has become a major requirement for timely analysis. Enterprise systems today pro...
Creating fine tuned and stable systems is very important and requires use of a list of testing tools that analyze various resources (like GC logs, heap dumps, native memory, etc). Due to the nature of those tools, this kind of analysis can only be performed by a small group of expert users that have high technical skills. In this paper we present a...
Large enterprises possess massive quantities of computing systems. The costs of running and managing these systems, and capital investments for new hardware, are significant. Optimizing the utilization of those computing systems can produce significant savings for the enterprise. However, the optimization in a multinational, multi-divisional, multi...
Understanding system utilization is currently a difficult challenge for industry. Current monitoring tools tend to focus on monitoring critical servers and databases within a narrow technical context, and have not been designed to manage extremely heterogeneous IT infrastructure such as desktops, laptops, and servers, where the number of devices...
Large applications often suffer from excessive memory consumption. The nature of these heaps, their scale and complex interconnections, makes it difficult to find the low hanging fruit. Techniques relying on dominance or allocation tracking fail to account for sharing, and overwhelm users with small details. More fundamentally, a programmer still n...
Composing large enterprise applications from reusable software components has become a major software development technique. Components are considered as black boxes and are composed together by their provided and required services. In comparison to other software development approaches, this allows for rapid development, clean and explicit archite...
Media Independent Handover (MIH) is an emerging standard which supports the communication of network-critical events to upper layer mobility protocols. One of the key features of MIH is the event service, which supports predictive network degradation events that are triggered based on link layer metrics. For set route vehicles, the constrained natu...
As Mick Jagger said "You can't always get what you want but if you try sometimes you might find you get what you need." This is an attitude that seems to prevail in the provision of web services. The provision of quality of service is seen as a compromise between the customer requirements and the ability of the service providers and the underlying...
The emerging Media Independent Handover (MIH) standard proposes to support session continuity during handover between heterogeneous networks. One of the critical features provided by MIH is an Event Service which includes predictive network degradation events, such as Link_Going_Down (LGD), which are triggered based on link layer metrics. Our resul...
In heterogeneous network environments, the network connections of a multi-homed device may have significant bandwidth differential. For a multi-homed transmission protocol designed for network failure tolerance, such as SCTP, path selection algorithms for data transmission drastically affect performance. This article studies the effect of path band...
In this paper, the system capacity of a multi-cell IEEE 802.16j system operating in transparent mode is investigated. A previously published analytical model is used, which incorporates interference from neighbouring cells. The model can be used to determine downlink performance under max-min fairness constraints for both sectored and omnidirectional sy...
Monitoring the status of running applications is a real-life requirement and an important research area. In particular, log analysis is often required to understand how the system is behaving during execution. For example, it is common for system administrators to collect and view logs from different hardware and software components to gain an understan...
A robust mechanism to enable seamless handover of streamed IPTV in a WLAN is presented. Handover in a wireless network is usually based on signal strength measurements, but that approach does not consider levels of congestion within the network. Here, the case of stationary nodes with varying levels of network congestion is considered. A scheme tha...
Network centric handover solutions for all IP wireless networks usually require modifications to network infrastructure which can stifle any potential rollout. This has led researchers to begin looking at alternative approaches. Endpoint centric handover solutions do not require network infrastructure modification, thereby alleviating a large barri...
The Stream Control Transmission Protocol (SCTP) is a transport layer protocol which can support mobility through its multi-homing feature. SCTP's mobility support can be subdivided into 2 areas: path performance evaluation and switch implementation. As a transport layer protocol, SCTP's path performance evaluation is limited by its end-to-end netwo...
Today's enterprise applications can produce vast amounts of information both during system testing and in production. Correlation of this information can be difficult as it is generally stored in a range of different event logs, the format of which can be application or vendor specific. Furthermore these large logs can be physically distributed acr...
In this paper, the system capacity of IEEE 802.16j systems operating in transparent mode is investigated under varying numbers of relays and associated transmit power. The study is based on an extended variant of an analytical model defined in previous work and used to determine the throughput gain that can be achieved under a max-min fairness cons...
Performance Assurance is a methodology that, when applied during the design and development cycle, will greatly increase the chances of an e-Commerce project satisfying user performance requirements first time round. This paper discusses the primary risk factors in development projects, the keys to a successful risk management programme, and the to...
Ethernet passive optical networks (EPONs) have attracted considerable attention from the industry, as they offer a simple, highly flexible and cost-effective solution to the problem of providing broadband access to a customer. The authors present a new approach to bandwidth allocation in EPONs where the optical line terminator (OLT) has full contro...
In this paper, an interference-aware analytical model of IEEE 802.16j systems operating in transparent mode is described. The model can be used to determine the throughput gain that can be achieved by 802.16j relay-based systems under a max-min fairness constraint in which the difference between the data rate delivered to all subscribers is minimi...
Monitoring, analysing and understanding component based enterprise software systems are challenging tasks. These tasks are essential in solving and preventing performance and quality problems. Obtaining component level interactions which show the relationships between different software entities is a necessary prerequisite for such efforts. This pa...