ABSTRACT: Time series prediction has been extensively researched in both the statistical and computational intelligence literature, with robust methods being developed that can be applied across any given application domain. A much less researched problem is multiple time series prediction, where the objective is to simultaneously forecast the values of multiple variables which interact with each other in time-varying amounts continuously over time. In this paper we describe the use of a novel integrated multi-model framework (IMMF) that combines models developed at three different levels of data granularity, namely the global, local and transductive models, to perform multiple time series prediction. The IMMF is implemented by training a neural network to assign relative weights to predictions from the models at the three different levels of data granularity. Our experimental results indicate that the IMMF significantly outperforms well-established methods of time series prediction when applied to the multiple time series prediction problem.
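The abstract does not give the IMMF's internals, but the core idea of weighting predictions from three granularity levels can be sketched as follows. The softmax gating and the function names are assumptions for illustration, not the paper's actual design; in the paper the weights come from a trained neural network rather than fixed logits.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: turns raw logits into weights summing to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

def combine(global_pred, local_pred, transductive_pred, gate_logits):
    """Blend forecasts from the three granularity levels with learned weights.

    In the IMMF a neural network would produce `gate_logits` per input;
    here they are supplied directly to keep the sketch self-contained.
    """
    w = softmax(np.asarray(gate_logits, dtype=float))
    preds = np.array([global_pred, local_pred, transductive_pred])
    return float(w @ preds)

# Example: the gate favours the local model for this input region.
print(combine(10.0, 12.0, 11.0, gate_logits=[0.1, 1.5, 0.3]))
```

With equal logits the combination reduces to the plain average of the three model outputs, which is a useful sanity check for any gating scheme.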
Conference Proceeding: Mining Software Metrics from Jazz
ABSTRACT: In this paper, we describe the extraction of source code metrics from the Jazz repository and the application of data mining techniques to identify the most useful of those metrics for predicting the success or failure of an attempt to construct a working instance of the software product. We present results from a systematic study using the J48 classification method. The results indicate that only a relatively small number of the available software metrics that we considered have any significance for predicting the outcome of a build. These significant metrics are discussed and the implications of the results are considered, particularly the relative difficulty of predicting failed build attempts.
Software Engineering Research, Management and Applications (SERA), 2011 9th International Conference on; 09/2011
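J48 is Weka's implementation of the C4.5 decision tree learner, which ranks attributes by how much information they contribute about the class. A minimal sketch of that ranking criterion, applied to the build-outcome setting, is below; the metric name and data values are invented for illustration and are not taken from the Jazz study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(values, labels, threshold):
    """Gain from splitting a numeric metric at `threshold`, as a
    C4.5/J48-style learner does when choosing a split point."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    split_entropy = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - split_entropy

# Hypothetical example: a complexity metric versus build outcome.
outcomes = ["ok", "ok", "fail", "ok", "fail", "ok"]
complexity = [3, 5, 14, 4, 12, 6]
print(round(info_gain(complexity, outcomes, threshold=10), 3))
```

A metric whose best split yields near-zero gain contributes nothing to the tree, which is consistent with the paper's finding that only a small subset of the available metrics mattered.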
International Journal of Information Sciences and Computer Engineering. 01/2010; 1(2):26-35.
Conference Proceeding: Improving web search using contextual retrieval
ABSTRACT: Contextual retrieval is a critical technique for today's search engines in terms of facilitating queries and returning relevant information. This paper reports on the development and evaluation of a system designed to tackle some of the challenges associated with contextual information retrieval from the World Wide Web (WWW). The developed system has been designed with a view to capturing both implicit and explicit user data, which is used to develop a personal contextual profile. Such profiles can be shared across multiple users to create a shared contextual knowledge base. These are used to refine search queries and improve both the search results for a user and their search experience. An empirical study has been undertaken to evaluate the system against a number of hypotheses. In this paper, results related to one hypothesis are presented that support the claim that users can find information more readily using the contextual search system.
Proceedings of the 6th International Conference on Information Technology: New Generations (ITNG), Las Vegas NV, USA; 01/2009
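The abstract does not specify how the contextual profile refines a query, but one common realisation of the idea is term-based query expansion from a weighted profile. The profile representation, the expansion rule, and all terms below are assumptions for illustration only.

```python
from collections import Counter

def expand_query(query_terms, profile, k=2):
    """Append the k highest-weighted profile terms not already in the query.

    `profile` is a Counter mapping terms to weights, notionally mined from
    implicit signals (pages viewed) and explicit ones (stated interests).
    """
    extras = [t for t, _ in profile.most_common() if t not in query_terms][:k]
    return query_terms + extras

# Hypothetical profile for a user who mostly reads Python data-analysis pages.
profile = Counter({"python": 5, "pandas": 3, "recipes": 1})
print(expand_query(["dataframe", "merge"], profile))
# ['dataframe', 'merge', 'python', 'pandas']
```

Sharing such profiles across users, as the paper describes, would amount to merging their term weights into a common knowledge base before expansion.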
Conference Proceeding: Use of Hoeffding trees in concept based data stream mining
S. Hoeglinger, R. Pears
ABSTRACT: Recent research in data mining has focussed on developing new algorithms for mining high-speed data streams. Most real-world data streams have in common that the underlying data generation mechanism changes over time, introducing so-called concept drift into the data. Many current algorithms incorporate a time-based window to be able to cope with drift in order to keep their model up-to-date with the data stream. A major problem with this approach is the potential loss of valuable information as data slides out of the time window. This is particularly a concern in those environments where patterns recur. In this paper, we present a concept-based window approach, which is integrated with a high-speed decision tree learner. Our approach uses the content of the data stream itself in order to decide which information is to be erased. Several methodologies, all based around minimising the overall information loss when pruning the decision tree, are discussed.
Information and Automation for Sustainability, 2007. ICIAFS 2007. Third International Conference on; 01/2008
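The high-speed decision tree learner referred to here is the Hoeffding tree family, named after the statistical bound it uses to decide when enough stream examples have been seen to commit to a split. A sketch of that bound follows; the parameter values are illustrative, not taken from the paper.

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: after n observations of a quantity with range R,
    the true mean lies within epsilon of the sample mean with
    probability at least 1 - delta."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

# For information gain on binary labels the range R is 1 (gain lies in [0, 1]).
# A node is split once the gain advantage of the best attribute over the
# runner-up exceeds epsilon, which shrinks as more examples arrive.
for n in (100, 1000, 10000):
    print(n, round(hoeffding_bound(1.0, 1e-6, n), 4))
```

Because epsilon decreases with n, the learner defers split decisions until the stream itself has provided enough evidence, which is what makes the approach viable at high data rates.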