New Content-Based Classification Technique Using NN-PCA for Objectionable Web Page Filtering
ABSTRACT The Internet has become an open source of almost any type of information. Opening the door to the Internet lets the bad enter the network along with the good. A great deal of useful information exists on the Internet, but offensive and harmful material is also available, and it can pollute the healthy minds of children and teenagers. The need to protect children and teenagers from objectionable material has led to the development of technology for filtering web content. Several web content filtering applications are currently available. Most existing approaches perform poorly at distinguishing objectionable web pages from pages related to medical consultation, because the two kinds of page have highly similar content. In this paper we introduce a new content-based filtering approach that addresses this issue by combining a modified entropy term weighting scheme, PCA for feature selection, and a neural network as the classifier. Our empirical evaluation shows that our approach, which has provable performance guarantees, performs well in comparison with other commonly used classification techniques.
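As a rough illustration of the first two stages of such a pipeline, the sketch below applies the standard log-entropy term weighting (not the modified scheme proposed in this paper, whose definition is not given in the abstract) and then projects the weighted document vectors onto their leading principal components. The function names `log_entropy_weights` and `pca_project` and the toy term-count matrix are assumptions made purely for illustration; the reduced features would then be passed to a neural network classifier, which is not shown.

```python
import numpy as np

def log_entropy_weights(counts):
    """Standard log-entropy weighting of a (n_docs, n_terms) count matrix."""
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[0]
    # Global frequency of each term over all documents.
    gf = counts.sum(axis=0)
    gf[gf == 0] = 1.0  # guard against terms that never occur
    p = counts / gf
    # p * log(p), treating 0 * log(0) as 0.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    # Global weight: 1 for a term found in a single document,
    # 0 for a term spread uniformly over all documents.
    global_w = 1.0 + plogp.sum(axis=0) / np.log(n_docs)
    # Local weight log(1 + tf) scaled by the global weight.
    return np.log1p(counts) * global_w

def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)  # centre each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

# Hypothetical toy data: 4 documents, 3 terms.
counts = np.array([[3, 0, 1],
                   [0, 2, 1],
                   [0, 0, 1],
                   [0, 1, 1]])
W = log_entropy_weights(counts)
Z = pca_project(W, 2)  # reduced features for a downstream classifier
```

In this toy example the third term occurs equally in every document, so its entropy-based weight is zero and it contributes nothing to the reduced representation. This is the behaviour that makes entropy weighting attractive when two classes share much of their vocabulary.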