This paper addresses the problem of script identification for handwritten text, which facilitates clustering documents by script type. A collection of handwritten text documents in three scripts, namely Devanagari, Gurumukhi, and Roman, is taken as input, and the documents are then clustered according to script type. The proposed scheme for clustering handwritten multi-script documents is divided into two phases. The first phase extracts features from the given text images. In the second phase, the extracted features are used for clustering with the k-means algorithm. In the feature extraction phase, we extract four types of features: a circular curvature feature, a horizontal stroke density feature, a pixel density feature, and a zoning-based feature. In this study, we consider 4,850 samples of isolated characters of Devanagari, Gurumukhi, and Roman scripts.
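
The two-phase scheme can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only two of the four features (pixel density and zoning), since the abstract does not define the circular curvature or horizontal stroke density features. The function names, the 2x2 zoning grid, the farthest-point seeding, and the toy 4x4 binary images are all assumptions made for illustration.

```python
from math import dist

def pixel_density(img):
    """Fraction of foreground (1) pixels in a binary image (list of rows)."""
    total = sum(len(row) for row in img)
    return sum(map(sum, img)) / total

def zoning_features(img, zones=2):
    """Foreground density in each cell of a zones x zones grid (assumed layout)."""
    h, w = len(img), len(img[0])
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            r0, r1 = zr * h // zones, (zr + 1) * h // zones
            c0, c1 = zc * w // zones, (zc + 1) * w // zones
            cell = [row[c0:c1] for row in img[r0:r1]]
            feats.append(pixel_density(cell))
    return feats

def extract_features(img):
    """Phase 1: concatenate zoning densities with the global pixel density."""
    return zoning_features(img) + [pixel_density(img)]

def kmeans(points, k, iters=50):
    """Phase 2: Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy stand-ins for character images (hypothetical data, not from the paper):
# three rough "shapes", each with one slightly noisy variant.
top   = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
left  = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
full  = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
top2  = [[1, 1, 1, 1], [1, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
left2 = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
full2 = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 0]]

samples = [top, left, full, top2, left2, full2]
features = [extract_features(img) for img in samples]
labels, centers = kmeans(features, k=3)
print(labels)
```

With k set to 3 (one cluster per script in the paper's setting), each toy image should land in the same cluster as its noisy variant. Real character images would need binarization and size normalization before feature extraction, which this sketch leaves out.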