Triplet Clustering of One-Mode Two-Way Proximities



Some researchers have noted that proximities among three objects are useful for disclosing relationships among objects. One-mode three-way proximities are sometimes harder to obtain than one-mode two-way proximities. Hence, a procedure for assembling one-mode three-way proximities from one-mode two-way proximities is introduced, together with a method for hierarchical clustering of the resulting one-mode three-way proximities in which three clusters (objects) form a new cluster at each step. The procedure was applied to one-mode two-way dissimilarities among kinship terms, and the resulting one-mode three-way dissimilarities (dissimilarities among three kinship terms) were analyzed by a cluster analysis method for one-mode three-way dissimilarities that is comparable to complete linkage. The one-mode two-way dissimilarities from which the three-way dissimilarities were assembled were analyzed by complete linkage cluster analysis. A comparison of the two results shows that the present analysis reveals aspects that the one-mode two-way cluster analysis does not disclose.
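The abstract does not say how the three-way dissimilarities are assembled or how the linkage is computed, so the following is only a minimal sketch under two stated assumptions: the three-way dissimilarity of a triple is taken to be the sum of its three pairwise dissimilarities (a perimeter-style rule), and the dissimilarity among three clusters is taken to be the largest three-way dissimilarity over triples with one object from each cluster (a complete-linkage analogue). The function names `assemble_three_way` and `triplet_complete_linkage` are hypothetical and do not come from the paper.

```python
from itertools import combinations, product


def assemble_three_way(d2):
    """Build three-way dissimilarities from a symmetric two-way matrix.

    Assumption (not from the paper): d_ijk = d_ij + d_ik + d_jk.
    Returns a dict keyed by sorted index triples (i, j, k) with i < j < k.
    """
    n = len(d2)
    return {
        (i, j, k): d2[i][j] + d2[i][k] + d2[j][k]
        for i, j, k in combinations(range(n), 3)
    }


def triplet_complete_linkage(d3, n):
    """Merge three clusters per step until fewer than three remain.

    Assumption (not from the paper): the dissimilarity among three clusters
    is the maximum d_ijk over triples with one object from each cluster.
    Returns (merges, remaining_clusters); each merge is (level, new_cluster).
    """
    clusters = [frozenset([i]) for i in range(n)]
    merges = []
    while len(clusters) >= 3:
        best = None
        for a, b, c in combinations(clusters, 3):
            # Complete-linkage analogue: the worst triple across the clusters.
            level = max(d3[tuple(sorted(t))] for t in product(a, b, c))
            if best is None or level < best[0]:
                best = (level, a, b, c)
        level, a, b, c = best
        merged = a | b | c
        clusters = [x for x in clusters if x not in (a, b, c)] + [merged]
        merges.append((level, merged))
    return merges, clusters
```

Because each step reduces the number of clusters by two, an odd number of objects ends in a single cluster, while an even number leaves two clusters that this sketch does not join. The search over all cluster triples is cubic in the number of clusters per step, which is acceptable only for small object sets such as a list of kinship terms.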

This article proposes a model that facilitates the analysis of triadic relationships among three objects. Recently, interest in studies of three-way data models has increased, and many significant contributions have been made in this area. However, one-mode, three-way models have yet to be considered. This study focuses on a one-mode, three-way model in which a three-way distance is expressed as the sum of the three squared distances among the objects minus the smallest of those squared distances. This formulation is used to illustrate the idea that relationships with many differences carry more information than relationships with few differences. Moreover, the distance between two objects is weighted more heavily when the two objects differ greatly, and this weight indicates the salience of their dyadic distance. Finally, the model and algorithm are applied to purchase data for convenience stores. The proposed model of multidimensional scaling clearly identifies differences among groups of objects.
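As a small illustration of the formulation described above, the sketch below computes the sum of the three squared dyadic distances minus the smallest of them. The function name `three_way_quantity` is hypothetical, and the abstract does not state whether this quantity is the three-way distance itself or its square, so the function simply returns the raw value.

```python
def three_way_quantity(d_ij, d_ik, d_jk):
    """Sum of the three squared dyadic distances minus the smallest one.

    Whether the model equates this to the three-way distance or to its
    square is not stated in the abstract; that choice is left to the caller.
    """
    squared = [d_ij ** 2, d_ik ** 2, d_jk ** 2]
    return sum(squared) - min(squared)
```

Dropping the smallest squared distance means the two largest dyadic differences determine the value, which matches the stated idea that relationships with many differences carry more information.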
This study compares two basic variants of the sorting method: single-sort in which each respondent is given only one opportunity to sort the items; and multiple-sort in which the respondent is given several opportunities to sort, each time on a different basis. Kinship terms serve as stimulus materials. Multidimensional scaling solutions show large differences between the two methods with respect to the degree to which the kinship dimensions are used as a basis for sorting. In particular, most respondents ignore the most obvious dimension (sex of the terms) when they believe they have only one opportunity to indicate the dimensions in the set. Similar observations of pairwise judgments in another stimulus domain (consonant phonemes) suggest the same bias may be present in such judgments. Moreover, in both instances hierarchical clustering completely fails to represent the minority of judges who do not ignore the given dimension. These results indicate that a multiple set of judgments from each set of respondents may be superior to a single set of judgments for certain stimulus domains. Finally, the kinship data also indicate that male and female respondents emphasize different kinship dimensions but that aggregated multiple-sort data do appear to reflect the cognitive dimensions present in any given individual.
This paper deals with the question whether the quality of different clustering algorithms can be compared by a general, scientifically sound procedure which is independent of particular clustering algorithms. In our opinion, the major obstacle is the difficulty to evaluate a clustering algorithm without taking into account the context: why does the user cluster his data in the first place, and what does he want to do with the clustering afterwards? We suggest that clustering should not be treated as an application-independent mathematical problem, but should always be studied in the context of its end-use. Different techniques to evaluate clustering algorithms have to be developed for different uses of clustering. To simplify this procedure it will be useful to build a "taxonomy of clustering problems" to identify clustering applications which can be treated in a unified way. Preamble: Every year, dozens of papers on clustering algorithms get published. Researchers continuously invent new clustering algorithms and work on improving existing ones. People who work on end-use problems ("applications") remain rather untouched by most of these papers. They continue to use their favorite algorithms, usually k-means and linkage algorithms. Researchers who publish papers about new clustering algorithms always struggle with the same question: How can they convince a reader that their algorithm is "good"? Applied people don't really care. They don't believe that there is an algorithm which can always discover what they are looking for, and they don't think that there exists "the true clustering" of a data set anyway. They continue to use k-means. Researchers treat clustering as if it were a scientific discipline. They try to come up with various scores to assess the quality of clustering algorithms. Applied people consider clustering rather as an art or a craft. If used with skill, it can be a useful tool. No more, no less.
An individual differences model for multidimensional scaling is outlined in which individuals are assumed differentially to weight the several dimensions of a common “psychological space”. A corresponding method of analyzing similarities data is proposed, involving a generalization of “Eckart-Young analysis” to decomposition of three-way (or higher-way) tables. In the present case this decomposition is applied to a derived three-way table of scalar products between stimuli for individuals. This analysis yields a stimulus by dimensions coordinate matrix and a subjects by dimensions matrix of weights. This method is illustrated with data on auditory stimuli and on perception of nations.
We present a new model and associated algorithm, INDCLUS, that generalizes the Shepard-Arabie ADCLUS (ADditive CLUStering) model and the MAPCLUS algorithm, so as to represent in a clustering solution individual differences among subjects or other sources of data. Like MAPCLUS, the INDCLUS generalization utilizes an alternating least squares method combined with a mathematical programming optimization procedure based on a penalty function approach to impose discrete (0,1) constraints on parameters defining cluster membership. All subjects in an INDCLUS analysis are assumed to have a common set of clusters, which are differentially weighted by subjects in order to portray individual differences. As such, INDCLUS provides a (discrete) clustering counterpart to the Carroll-Chang INDSCAL model for (continuous) spatial representations. Finally, we consider possible generalizations of the INDCLUS model and algorithm.
We present a new algorithm, MAPCLUS (MAthematical Programming CLUStering), for fitting the Shepard-Arabie ADCLUS (for ADditive CLUStering) model. MAPCLUS utilizes an alternating least squares method combined with a mathematical programming optimization procedure based on a penalty function approach, to impose discrete (0,1) constraints on parameters defining cluster membership. This procedure is supplemented by several other numerical techniques (notably a heuristically based combinatorial optimization procedure) to provide an efficient general-purpose computer implemented algorithm for obtaining ADCLUS representations. MAPCLUS is illustrated with an application to one of the examples given by Shepard and Arabie using the older ADCLUS procedure. The MAPCLUS solution uses half as many clusters to achieve nearly the same level of goodness-of-fit. Finally, we consider an extension of the present approach to fitting a three-way generalization of the ADCLUS model, called INDCLUS (INdividual Differences CLUStering).
Techniques for partitioning objects into optimally homogeneous groups on the basis of empirical measures of similarity among those objects have received increasing attention in several different fields. This paper develops a useful correspondence between any hierarchical system of such clusters, and a particular type of distance measure. The correspondence gives rise to two methods of clustering that are computationally rapid and invariant under monotonic transformations of the data. In an explicitly defined sense, one method forms clusters that are optimally “connected,” while the other forms clusters that are optimally “compact.”
The present paper introduces an overlapping cluster analysis model and an associated algorithm that can analyze one-mode three-way similarities. The present model is an extension of the ADCLUS model, and the present algorithm is based on the MAPCLUS algorithm. In the present model, one-mode three-way similarities are represented by the sum of the numerical weights of the clusters to which any triplet of objects belongs. The present model and algorithm were applied to joint purchase data, and the result was compared with that of MAPCLUS to show that the present model is effective in representing one-mode three-way similarities.
Distance models for three-way proximity data, which consist of numerical values assigned to triples of objects that indicate their joint (lack of) homogeneity or resemblance, require a generalization of the usual distance concept defined on pairs of objects. An axiomatic framework is given for characterizing triadic dissimilarity, triadic similarity, and triadic distance, where the term triadic implies that each element of the triple is treated on an equal footing. Two kinds of distance models are studied in detail: the Minkowski-p or Mp model, which is based upon dyadic components and includes the perimeter model as an important special case, and several models based on presence-absence variables. They are shown to satisfy the tetrahedral inequality, a condition that is characteristic for the present axiomatization. Two monotonically convergent algorithms are described that find weighted least squares representations of three-way proximity data under the Euclidean M1 model and the Euclidean M2 model. To enable a scale-free evaluation of the quality of the fit, an additive decomposition of the sum of squares of the dissimilarities is derived. As illustrated in one of the examples, distance analysis of three-way, three-mode tables is possible by a suitable manipulation of the least squares weights.
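The abstract above does not write the Mp model out. As a hedged reading of "based upon dyadic components", a Minkowski-p combination of the three dyadic distances would take the following form, with the perimeter model as the case p = 1. The symbols d_{ij} and d_{ijk} are notation assumed here for illustration, not quoted from the paper.

```latex
% Assumed form of a Minkowski-p triadic distance built from dyadic distances
d_{ijk} = \left( d_{ij}^{\,p} + d_{ik}^{\,p} + d_{jk}^{\,p} \right)^{1/p}, \qquad p \ge 1,
% Perimeter model: the special case p = 1
d_{ijk} = d_{ij} + d_{ik} + d_{jk}.
```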
Triadic distance models can be used to analyse proximity data defined on triples of objects. Three-way symmetry is a common assumption for triadic distance models. In the present study three-way symmetry is not assumed. Triadic distance models are presented for the analysis of asymmetric three-way proximity data that result in a simultaneous representation of symmetry and asymmetry in a low-dimensional configuration. An iterative majorization algorithm is developed for obtaining the coordinates and the representation of the asymmetry. The models are illustrated by an example using longitudinal categorical data.