ABSTRACT: In information retrieval, data fusion has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility, since different weights can be assigned to different component systems so as to obtain better fusion results. However, how to obtain suitable weights for all the component retrieval systems remains an open problem. In this paper, we use the multiple linear regression technique to obtain optimum weights for all involved component systems. The weights are optimum in the least-squares sense: they minimize the difference between the scores estimated by linear combination and the judged scores of the documents. Our experiments with four groups of runs submitted to TREC show that the linear combination method with such weights steadily outperforms the best component system and other major data fusion methods, such as CombSum, CombMNZ, and the linear combination method with performance-level or performance-square weighting schemes, by large margins.
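The least-squares weighting described above can be sketched with an ordinary least-squares solve. In the toy data below (illustrative, not the TREC runs used in the paper), the judged scores are constructed as an exact linear combination of the component scores, so the recovered weights are easy to check; on real data `lstsq` returns the best least-squares fit instead:

```python
import numpy as np

# Rows: documents; columns: normalized scores from each component system.
scores = np.array([
    [0.9, 0.8, 0.7],
    [0.4, 0.5, 0.6],
    [0.2, 0.1, 0.3],
    [0.7, 0.6, 0.9],
])

# Judged relevance scores for the same documents (toy values, built here
# as 0.5*s1 + 0.3*s2 + 0.2*s3 so the solution is exact).
relevance = np.array([0.83, 0.47, 0.19, 0.71])

# Least-squares weights: minimize ||scores @ w - relevance||^2.
weights, *_ = np.linalg.lstsq(scores, relevance, rcond=None)

# The fused score of each document is the weighted linear combination.
fused = scores @ weights
```

With judged scores that are not an exact combination, the same call yields the weights that minimize the squared difference, which is exactly the optimality criterion stated in the abstract.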
ABSTRACT: In this paper we present a new data fusion method in information retrieval that uses the ranking information of the resultant documents. Our method models the rank-probability of relevance of documents in the resultant document lists using logarithmic models. The proposed method is more effective than other data fusion methods that also use ranking information, and is as effective as some data fusion methods that rely on reliable scoring information.
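A minimal sketch of rank-based fusion with a logarithmic rank-probability model; the coefficients `a` and `b`, which would be fitted on training queries, and the function names are illustrative assumptions:

```python
import math

def log_model_score(rank, a=0.6, b=-0.1):
    """Modeled probability of relevance at a given 1-based rank,
    using a logarithmic model a + b * ln(rank); a and b are
    illustrative values standing in for fitted parameters."""
    return max(a + b * math.log(rank), 0.0)

def fuse(rankings):
    """Fuse ranked lists (each a list of document ids, best first):
    a document's fused score is the sum of the modeled
    rank-probabilities over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for i, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + log_model_score(i)
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `fuse([["d1", "d2"], ["d2", "d3", "d1"]])` ranks `d2` first, since a top rank in one list outweighs two middling ranks under the decreasing logarithmic model.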
Advanced Computer Theory and Engineering (ICACTE), 2010 3rd International Conference on; 09/2010
ABSTRACT: The computation of the sensitivity of a Madaline's output to perturbation of its parameters is systematically discussed. First, according to the discrete nature of Adalines, a method based on a discrete stochastic technique is proposed, which derives analytical formulas for computing Adalines' sensitivity. The method can theoretically solve some problems that are unsolvable by the existing methods based on continuous stochastic techniques, relaxes some impractical constraints, and makes it possible to theoretically analyze the approximation error of Adalines' sensitivity. Second, on the basis of the sensitivity of Adalines and the structural characteristics of Madalines, a new selection strategy based on a type of dedication degree is proposed for computing Madalines' sensitivity, which is superior to the currently popular approach of simple averaging in both precision and complexity. The proposed formulas and algorithm have the advantages of simplicity, low computational complexity, small approximation error, and high generality, as verified by a large number of experimental simulations.
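The quantity being computed can be illustrated for a single Adaline: the fraction of discrete inputs on which a weight perturbation flips the output. The brute-force enumeration below is only a stand-in for the paper's analytical formulas, and the weights and perturbation are illustrative:

```python
import itertools

def adaline(x, w):
    # Adaline output: bipolar threshold of the weighted sum.
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= 0 else -1

def sensitivity(w, dw):
    """Sensitivity of an Adaline to a weight perturbation dw:
    the fraction of the 2^n bipolar inputs on which the output
    flips. Exhaustive enumeration is feasible only for small n;
    the paper derives analytical formulas for this quantity via
    a discrete stochastic technique."""
    n = len(w)
    inputs = itertools.product((-1, 1), repeat=n)
    w_pert = [wi + di for wi, di in zip(w, dw)]
    flips = total = 0
    for x in inputs:
        total += 1
        flips += adaline(x, w) != adaline(x, w_pert)
    return flips / total
```

A zero perturbation gives sensitivity 0, while a perturbation that reverses the dominant weight flips the output on every input.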
Science China Information Sciences 01/2010; 53:2399-2414.
ABSTRACT: The sensitivity of a neural network's output to its parameter variation is an important issue in both theoretical research and practical applications of neural networks. This paper proposes a quantified sensitivity measure of Radial Basis Function Neural Networks (RBFNNs) to input variation. The sensitivity is defined as the mathematical expectation of the squared output deviations caused by input variations. To quantify the sensitivity, the input is treated as a statistical variable and a numerical integration technique is employed to approximately compute the expectation. Experimental verifications show very good agreement between the proposed sensitivity computation and computer simulation. The quantified sensitivity measure could serve as a general tool for evaluating RBFNNs' performance.
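The definition above, the expectation of the squared output deviation under input variation, can be sketched directly. The tiny RBF network, Gaussian width, and Monte Carlo approximation below are illustrative assumptions (the paper uses a numerical integral rather than sampling):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny RBF network with fixed, illustrative centers and weights.
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
weights = np.array([1.0, -0.5])
width = 1.0

def rbfnn(x):
    # Output: weighted sum of Gaussian basis functions.
    d2 = ((x - centers) ** 2).sum(axis=-1)
    return weights @ np.exp(-d2 / (2 * width ** 2))

def sensitivity(x, sigma=0.1, n=5000):
    """Quantified sensitivity at input x: the expectation of the
    squared output deviation under random input perturbation,
    approximated here by Monte Carlo sampling (sigma and n are
    illustrative; the paper computes the expectation by a
    numerical integral)."""
    base = rbfnn(x)
    deltas = rng.normal(0.0, sigma, size=(n, x.size))
    outs = np.array([rbfnn(x + d) for d in deltas])
    return float(np.mean((outs - base) ** 2))
```

As expected of such a measure, the sensitivity is non-negative and grows with the magnitude of the input variation.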
International Joint Conference on Neural Networks, IJCNN 2010, Barcelona, Spain, 18-23 July, 2010; 01/2010
ABSTRACT: Ensemble learning is one of the main directions in machine learning and data mining; it allows learners to achieve higher training accuracy and better generalization ability. In this paper, with the aim of improving generalization performance, a novel approach to constructing an ensemble of neural networks is proposed. The main contributions of the approach are its diversity measure for selecting diverse individual neural networks and its weighted fusion technique for assigning proper weights to the selected individuals. Experimental results demonstrate that the proposed approach is effective.
Intelligent Data Engineering and Automated Learning - IDEAL 2008, 9th International Conference, Daejeon, South Korea, November 2-5, 2008, Proceedings; 01/2008
ABSTRACT: In information retrieval, the linear combination method is a very flexible and effective data fusion method, since different weights can be assigned to different component systems. However, it remains an open question which weighting scheme is best. Previously, a simple weighting scheme was often used: a system's weight is set to its average performance over a group of training queries. In this paper, we investigate the weighting issue through extensive experiments. We find that a series of power functions of average performance, which can be implemented as efficiently as the simple weighting scheme, is more effective than the simple scheme for data fusion.
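A power-function weighting scheme can be sketched as follows; the system names, average performances, and the exponent are illustrative values, not those from the experiments (the simple scheme is the special case `power = 1.0`):

```python
# Average performance of each component system over training queries
# (illustrative values, e.g. mean average precision).
avg_performance = {"sysA": 0.35, "sysB": 0.25, "sysC": 0.15}
power = 2.0  # exponent of the power function; 1.0 is the simple scheme

# Power-function weights: a system's weight is performance ** power.
weights = {s: p ** power for s, p in avg_performance.items()}

def combine(doc_scores):
    """Linear combination: doc_scores maps a system name to that
    system's normalized score for one document."""
    return sum(weights[s] * doc_scores.get(s, 0.0) for s in weights)

score = combine({"sysA": 0.8, "sysB": 0.6, "sysC": 0.9})
```

Raising the performance to a power greater than one widens the gap between strong and weak systems, which is the effect the abstract reports as more effective than the simple (linear) scheme, at the same computational cost.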
Foundations of Intelligent Systems, 17th International Symposium, ISMIS 2008, Toronto, Canada, May 20-23, 2008, Proceedings; 01/2008
ABSTRACT: Ensemble learning for constructing learners in regression and classification has been shown, both practically and theoretically, to improve the generalization capability of the learners. Nowadays, most neural network ensembles are obtained by manipulating the training data or the networks' architectures, as in Bagging, Boosting, and evolutionary techniques. In this paper, a new method to construct neural network ensembles is presented, which aims at selecting, by means of the output sensitivity of each individual network, the most diverse members from a pool of trained networks. Conceptually, the sensitivity reflects a network's output behavior at a given data point, for example, the trend of the network's output nearby. The sensitivity can therefore be used to explicitly measure the output diversity among individuals in the pool. In our research, we focus on Multilayer Perceptrons (MLPs), and the sensitivity is taken as the partial derivative of an MLP's output with respect to its input at a data point. Based on this sensitivity, we developed four different measures for selecting the most diverse individuals from a given pool of trained MLPs. Experiments on UCI benchmark data have been conducted, and comparisons of our results with those from Bagging and Boosting show that our method has advantages over existing ensemble methods in ensemble size and generalization performance.
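The sensitivity used for selection, the partial derivative of an MLP's output with respect to its input at a data point, can be sketched with a finite difference; the tiny tanh network and its weights are illustrative, and the paper's four selection measures built on top of this quantity are not reproduced:

```python
import math

def mlp(x, w1, w2):
    """A tiny one-hidden-layer MLP with tanh units and scalar
    input/output; w1 are input-to-hidden weights, w2 are
    hidden-to-output weights (illustrative architecture)."""
    hidden = [math.tanh(w * x) for w in w1]
    return sum(v * h for v, h in zip(w2, hidden))

def output_sensitivity(x, w1, w2, eps=1e-6):
    """Sensitivity at data point x: the partial derivative of the
    MLP's output with respect to its input, approximated here by
    a central finite difference (an analytic derivative would do
    equally well for a known architecture)."""
    return (mlp(x + eps, w1, w2) - mlp(x - eps, w1, w2)) / (2 * eps)
```

Two trained networks whose sensitivities differ markedly at the data points respond differently to nearby inputs, which is the notion of output diversity the selection measures exploit.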
Proceedings of the International Joint Conference on Neural Networks, IJCNN 2007, Celebrating 20 years of neural networks, Orlando, Florida, USA, August 12-17, 2007; 01/2007