Fig 1 - uploaded by Hana Řezanková
Two-dimensional map of voting parliament members. Thin lines: borders of clusters. ♦-UR,-CPRF,-LDPR, •-ML,-ID.


Source publication
Article
The contribution describes some possible solutions for finding overlapping clusters of binary variables. The first approach applies standard statistical procedures; the second uses a developed Hopfield-like neural network. In the first case, we suggest applying factor analysis or multiple correspondence analysis and interpreting the factor loading ma...

Context in source publication

Context 1
... Euclidean distance, cosine, Jaccard and Dice. Both hierarchical and k-means clustering gave clusters far from the parliament fractions: all fractions intersected in clusters, and fraction LDPR could not be separated from ER at all. Second, we mapped the parliament members by multidimensional scaling. The results are shown in Fig. 1. This map was clustered. The borders of the clusters are shown by thin lines. Generally, like the factors obtained before, the clusters coincide with the parliament fractions except for independent deputies. The results of clustering and factorization are compared in Table III. The mean F-measure amounted to 0.95, which is slightly smaller than that ...
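The four measures named in this excerpt can all be written in terms of the 2x2 contingency counts of two binary vectors. The following is a minimal sketch, not the authors' implementation; the function names and the two example voting records are hypothetical and serve only as illustration.

```python
from math import sqrt

def counts(x, y):
    """Return (a, b, c, d): joint ones, ones only in x, ones only in y, joint zeros."""
    pairs = list(zip(x, y))
    a = sum(1 for p, q in pairs if p == 1 and q == 1)
    b = sum(1 for p, q in pairs if p == 1 and q == 0)
    c = sum(1 for p, q in pairs if p == 0 and q == 1)
    d = sum(1 for p, q in pairs if p == 0 and q == 0)
    return a, b, c, d

def euclidean_distance(x, y):
    # For 0/1 vectors the squared distance is the number of mismatches.
    _, b, c, _ = counts(x, y)
    return sqrt(b + c)

def cosine_similarity(x, y):
    # Undefined (division by zero) if either vector is all zeros.
    a, b, c, _ = counts(x, y)
    return a / sqrt((a + b) * (a + c))

def jaccard_similarity(x, y):
    # Undefined if both vectors are all zeros.
    a, b, c, _ = counts(x, y)
    return a / (a + b + c)

def dice_similarity(x, y):
    # Like Jaccard, but joint ones are weighted twice.
    a, b, c, _ = counts(x, y)
    return 2 * a / (2 * a + b + c)

# Hypothetical voting records (1 = yes, 0 = no) for two deputies.
deputy_1 = [1, 1, 0, 1, 0, 0]
deputy_2 = [1, 0, 0, 1, 1, 0]

print(euclidean_distance(deputy_1, deputy_2))
print(cosine_similarity(deputy_1, deputy_2))
print(jaccard_similarity(deputy_1, deputy_2))
print(dice_similarity(deputy_1, deputy_2))
```

Note that cosine, Jaccard and Dice do not use the joint-zero count d, whereas the Euclidean distance treats both kinds of mismatch symmetrically and is likewise unaffected by d.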

Citations

... attributes). Because the common occurrence of higher values is more important than the common occurrence of lower (especially zero) values, the cosine measure is better than the correlation coefficient for this purpose [12]. The cosine measure matrix for the set of patterns is shown in Table III. ...
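One way to see the argument in this excerpt is to pad two binary vectors with extra positions where both are zero. The sketch below, with hypothetical vectors and helper names, shows that the cosine measure is unchanged by such joint zeros while the Pearson correlation coefficient is not; it illustrates the general property only and is not taken from the cited paper.

```python
from math import sqrt

def cosine(x, y):
    """Cosine of the angle between two vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    return dot / (sqrt(sum(p * p for p in x)) * sqrt(sum(q * q for q in y)))

def pearson(x, y):
    """Pearson correlation coefficient of two equally long vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical binary patterns sharing two ones.
x = [1, 1, 1, 0]
y = [1, 1, 0, 1]

# The same patterns with six additional attributes absent from both.
padded_x = x + [0] * 6
padded_y = y + [0] * 6

print(cosine(x, y), cosine(padded_x, padded_y))    # same value twice
print(pearson(x, y), pearson(padded_x, padded_y))  # sign flips after padding
```

In this example the correlation moves from negative to positive purely because of shared absences, which is the behaviour the excerpt argues against for this kind of data.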
... We compared results mentioned above with the results obtained by means described in [12]. We applied factor analysis and we used factor loadings for two components as an input for fuzzy cluster analysis in the S-PLUS system. ...
... Experimental results on Web pages suggest the effectiveness of our approach. In future, we will also provide a unified view on binary clustering [12], [11] by establishing the connections among various clustering approaches. ...
Conference Paper
Hana Rezanková, Dušan Húsek, Václav Snášel, Miloš Kudelka, Ondrej Lehecka, "Cluster, SOM and NMF Analyses of Web Patterns", International Conference on Next Generation Web Services Practices (NWeSP), 2009, doi:10.1109/NWeSP.2009.11
This paper focuses on clustering web pages as a tool for finding and using typical Web patterns. Traditional methods of cluster analysis, self-organizing maps, and nonnegative matrix factorization were applied. Web pages on product sales and automatically detected Web patterns were used as test data. The application of GD-CLS (gradient descent constrained least squares), which combines some of the best features of other methods, was evaluated as the best solution.
Thesis
This doctoral thesis is devoted to binary data and their factorization, which is a special kind of data analysis. Binary Factor Analysis (BFA) is a nonlinear analysis of binary data, where neither classical linear algebra nor mathematical (functional) analysis can be used. It is a binary variant of a commonly used statistical method called factor analysis. Classical factor analysis was originally developed and used by psychologists to detect hidden psychic disorders by observing visible symptoms. Classical factor analysis works with real-valued, normally distributed data. Binary Factor Analysis uses the same notation with a different underlying algebra to express the same kind of analysis for binary-valued data. It has been shown that although classical factor analysis often works seamlessly even for data with other distributions, it cannot effectively express symptom–factor relations in binary data, as found for example in psychology, medicine, or sociology. The thesis aims to cover BFA from several different aspects and to specialize in algorithms for solving the problem. It starts from the underlying algebra and fundamental definitions. This first part of the work is rather mathematical, but only fundamental definitions are given, to keep the text understandable. The second and main part is devoted to algorithms. Several original algorithms for BFA are proposed and described. They range from main factorization algorithms through important underlying algorithms to small supporting ones, with most space devoted to the main factorization. Because of the large computational complexity of BFA, considerable effort is also put into the investigation of parallel and distributed algorithms. The third part is devoted to experimental results. The last part is the user's manual for BiF, a reference implementation of all presented algorithms.
The manual contains not only a technical description but also guidelines intended as a starting point for an analyst, e.g. a sociologist or a psychologist, trying to find out how he or she can benefit from BFA. Most of the presented algorithms are the results of my own work. They are based on a number of different fields of computer science and mathematics, and the main benefits of binary factorization are expected to be seen in the human sciences. This makes the work truly interdisciplinary; it also forced the notation to be unified throughout all chapters, so it may be less common in some particular cases.
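The "different underlying algebra" mentioned in the abstract can be illustrated with a Boolean matrix product, in which multiplication is replaced by logical AND and addition by logical OR. The abstract does not spell out the algebra, so this is a minimal sketch under that assumption and is not taken from the thesis or from BiF; the names `boolean_product`, `reconstruction_error`, `scores` and `loadings` are hypothetical.

```python
def boolean_product(scores, loadings):
    """Boolean product of an n x k score matrix and a k x m loading matrix.

    Entry (i, j) is 1 if at least one factor is present in object i
    (scores[i][f] == 1) and contains variable j (loadings[f][j] == 1).
    """
    n_factors = len(loadings)
    n_vars = len(loadings[0])
    return [
        [int(any(row[f] and loadings[f][j] for f in range(n_factors)))
         for j in range(n_vars)]
        for row in scores
    ]

def reconstruction_error(data, scores, loadings):
    """Number of cells where the Boolean product differs from the data."""
    approx = boolean_product(scores, loadings)
    return sum(d != a
               for data_row, approx_row in zip(data, approx)
               for d, a in zip(data_row, approx_row))

# Two hypothetical factors over four binary variables; they overlap in variable 1.
loadings = [
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
# Four objects: factor 0 only, factor 1 only, both factors, no factor.
scores = [
    [1, 0],
    [0, 1],
    [1, 1],
    [0, 0],
]

data = boolean_product(scores, loadings)
for row in data:
    print(row)
```

For the object with both factors, the overlapping variable stays at 1 rather than becoming 2, which is the point where this algebra departs from an ordinary linear factor model.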
Chapter
In this paper, web pages concerning product sales are analyzed with the aim of creating clusters of similar web pages and characterizing them by GUI patterns. We applied the GD-CLS (gradient descent - constrained least squares) method, which combines some of the best features of other methods. Both traditional methods for finding clusters and nonnegative matrix factorization are used.
Conference Paper
Some methods for object group identification that are applicable to social group identification are compared. We suppose that people are characterized by their actions; for example, deputies are characterized by their voting habits. We are interested in binary data analysis (e.g. the result of a vote is yes or no). A dataset consisting of the roll-call vote records in the Russian parliament in 2004 was analyzed. Methods of hierarchical and fuzzy clustering and Boolean factor analysis are applied. In the first case, we propose a two-step analysis in which the factor loadings obtained in the first step (as the result of factor analysis of objects) are interpreted by cluster analysis in the second step. For determining the number of clusters, both traditional and modified coefficients are used. Further, we suggest using Boolean factor analysis based on a Hopfield-like neural network for this purpose. This proposed method gives the best results for grouping deputies.