Modified Collaborative Coefficient: a new measure
for quantifying degree of research collaboration
Kiran Savanur (1), R. Srikanth (1, 2)
(1) Raman Research Institute, Sadashivanagar, Bangalore 560080, India.
(2) Poornaprajna Institute of Scientific Research, Bangalore 562110, India.
Abstract
The Collaborative Coefficient (CC) is a measure of collaboration in research
that reflects both the mean number of authors per paper and the proportion
of multi-authored papers. Although it lies between the values 0 and 1, and
is 0 for a collection of purely single-authored papers, it is not 1 for the
case where all papers are maximally authored, i.e., every publication in the
collection has all authors in the collection as coauthors. We propose a
simple modification of CC, which we call the Modified Collaborative
Coefficient (or MCC, for short), which improves its performance in this
respect.
1 Introduction
Collaboration is an intense form of interaction that allows for effective
communication as well as the sharing of competence and other resources
(Melin and Persson [1]). However, the complex nature of the human
interaction that takes place between collaborators, and the magnitude of
their collaboration, are not easily captured by quantitative tools. For
example, the precise relationship between quantifiable activities (e.g. data
analysis) and intangible contributions (e.g. ideas), and their weightage in
the final product of the collaboration (e.g. a research paper), is extremely
difficult to determine. Science indicators, however, provide additional
quantitative information of a more direct and objective nature on the
geographical patterns of cooperation among scientific institutions (Gupta
et al. [2]).
To compare the extent of collaboration in two fields (or subfields), or to
show the trend towards multiple authorship in a discipline, many studies
have used the mean number of authors per paper, termed the Collaborative
Index by Lawani [3], and/or the proportion of multiple-authored papers,
called the Degree of Collaboration by Subramanyam [4], as a measure of the
strength of collaboration in a discipline. Ajiferuke et al. [5] showed these
two measures to be inadequate and derived a single measure, the
Collaborative Coefficient, that incorporates some of the merits of both. The
Collaborative Coefficient as defined by Ajiferuke et al. lies between 0 and
1, with 0 corresponding to single-authored papers. However, it is not 1 for
the case where all papers are maximally authored, i.e. every publication in
the collection has all authors in the collection as coauthors.
Let the collection k be the research papers published in a discipline or in
a journal during a certain period of interest. In the following, we write
$f_j$ = the number of papers having j authors in collection k;
N = the total number of papers in k, $N = \sum_j f_j$; and
A = the total number of authors in collection k.
2 Present measures
One of the early measures of the degree of collaboration is the
Collaborative Index (CI), given by:
$$ \mathrm{CI} = \frac{\sum_{j=1}^{A} j f_j}{N} \tag{1} $$
It is the mean number of authors per paper. Although it is easily
computable, it is not easily interpretable as a degree, for it has no upper
limit. Moreover, it gives a non-zero weight to single-authored papers, which
involve no collaboration.
The Degree of Collaboration (DC), a measure of the proportion of
multiple-authored papers, is given by:
$$ \mathrm{DC} = 1 - \frac{f_1}{N} \tag{2} $$
DC is easy to calculate and easily interpretable as a degree (for it lies
between zero and one), gives zero weight to single-authored papers, and
always ranks higher a discipline (or period) with a higher percentage of
multiple-authored papers. However, DC does not differentiate among levels
of multiple authorship.
The Collaborative Coefficient (CC) was designed to remove the above
shortcomings pertaining to CI and DC. It is given by:
$$ \mathrm{CC} = 1 - \frac{\sum_{j=1}^{A} (1/j) f_j}{N} \tag{3} $$
It vanishes for a collection of single-authored papers, and distinguishes
between single-authored, two-authored, etc., papers. However, CC fails to
yield 1 for maximal collaboration, except when the number of authors is
infinite. We note that DC, by contrast, does equal 1 for maximal
collaboration.
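As an illustration of Eqs. (1) to (3), the following is a minimal Python sketch, not taken from the paper, that evaluates CI, DC and CC from an authorship distribution. The function name `collaboration_measures` and the dictionary layout (number of authors mapped to number of papers) are assumptions made for this example; the counts are those of the 1971 column of Table 1.

```python
def collaboration_measures(freq):
    """Return (CI, DC, CC) for a distribution freq = {j: f_j},
    where f_j is the number of papers having j authors."""
    n_papers = sum(freq.values())                            # N = sum_j f_j
    ci = sum(j * f for j, f in freq.items()) / n_papers      # Eq. (1)
    dc = 1 - freq.get(1, 0) / n_papers                       # Eq. (2)
    cc = 1 - sum(f / j for j, f in freq.items()) / n_papers  # Eq. (3)
    return ci, dc, cc


# 1971 column of Table 1 (hypothetical variable name).
freq_1971 = {1: 1968, 2: 232, 3: 54, 4: 15, 5: 8, 6: 1, 7: 0, 8: 1}
ci, dc, cc = collaboration_measures(freq_1971)
print(f"CI = {ci:.3f}, DC = {dc:.3f}, CC = {cc:.3f}")
```

For a collection of single-authored papers only, the same function returns CI = 1 and DC = CC = 0, in line with the remarks above.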
3 The proposed measure, MCC
The derivation of the new measure is almost the same as that of CC, as
given in Ajiferuke et al. [5].
Imagine that each paper carries with it a single "credit", this credit being
shared among the authors. Thus if a paper has a single author, the author
receives one credit; with 2 authors, each receives 1/2 credit and, in
general, if a paper has X authors, each receives 1/X credit (this is the
same as the idea of fractional productivity, defined by Price and Beaver [6]
as the score of an author when he is assigned 1/n of a unit for one item for
which n authors have been credited).
Hence the average credit awarded to each author of a random paper is
E[1/X], where X is the number of authors of that paper, a value that lies
between 0 and 1. Since we wish 0 to correspond to single authorship, we
define the Modified Collaborative Coefficient (MCC), κ, as:
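$$ \kappa = \alpha \left\{ 1 - E[1/X] \right\} = \alpha \left\{ 1 - \sum_{j=1}^{A} (1/j)\, P(X = j) \right\} = \alpha \left\{ 1 - \frac{\sum_{j=1}^{A} (1/j) f_j}{N} \right\} \tag{4} $$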
where α is a normalization constant to be determined. Setting α = 1
yields the measure CC. The requirement that κ = 0 for single authorship
does not restrict α.
If all N articles involve all the A authors, then E[1/X] = 1/A. If we
want κ to satisfy the requirement that κ = 1 for maximal collaboration,
then we must set
$$ \alpha = \left( 1 - \frac{1}{A} \right)^{-1} = \frac{A}{A - 1} \tag{5} $$
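We thus obtain from Eqs. (4) and (5) the final expression for MCC, which
is:
$$ \kappa = \left( 1 - \frac{1}{A} \right)^{-1} \left\{ 1 - E[1/X] \right\} = \frac{A}{A - 1} \left\{ 1 - \sum_{j=1}^{A} (1/j)\, P(X = j) \right\} = \frac{A}{A - 1} \left\{ 1 - \frac{\sum_{j=1}^{A} (1/j) f_j}{N} \right\} \tag{6} $$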
The above equation is not defined for the trivial case A = 1, which is not
a problem since collaboration is meaningless unless at least two authors are
available. CC approaches MCC only when $A \to \infty$, but is otherwise
strictly less than MCC by the factor $\left( 1 - \frac{1}{A} \right)$.
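To make the role of the factor concrete, here is a small worked case (an illustration added under the assumption of a collection with only A = 2 authors who co-author every one of the N papers, so that $f_2 = N$ and all other $f_j$ vanish):

```latex
\begin{aligned}
\mathrm{CC}  &= 1 - \frac{(1/2)\,N}{N} = \frac{1}{2}, \\
\mathrm{MCC} &= \frac{A}{A-1}\left\{1 - \frac{(1/2)\,N}{N}\right\}
              = \frac{2}{2-1}\cdot\frac{1}{2} = 1, \\
\frac{\mathrm{CC}}{\mathrm{MCC}} &= 1 - \frac{1}{A} = \frac{1}{2}.
\end{aligned}
```

Collaboration is maximal in this case, yet CC reports only 1/2, while MCC reports 1.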
Table 1: Distribution of authorships for Library and Information Science
Abstracts (reproduced from Ref. [5], except for the last column). Figures in
parentheses are percentages of the column total; "-" marks a cell with no
entry.
Number of
authors  1961           1966           1971           1976           1981           1986           1991
1        783 (94.11)    1021 (94.28)   1968 (86.35)   2771 (87.06)   3697 (83.47)   4971 (82.88)   0
2        43 (5.17)      48 (4.43)      232 (10.18)    312 (9.80)     559 (12.62)    786 (13.10)    0
3        6 (0.72)       10 (0.92)      54 (2.37)      65 (2.04)      123 (2.78)     170 (2.83)     0
4        -              3 (0.28)       15 (0.66)      23 (0.72)      33 (0.75)      36 (0.60)      0
5        -              1 (0.09)       8 (0.35)       6 (0.19)       8 (0.18)       17 (0.28)      0
6        -              -              1 (0.04)       5 (0.16)       5 (0.11)       10 (0.17)      0
7        -              -              0 (0.00)       1 (0.03)       4 (0.09)       3 (0.05)       0
8        -              -              1 (0.04)       -              -              -              0
9        -              -              -              -              -              -              0
10       -              -              -              -              -              -              7
Total    832            1083           2279           3183           4429           5998           7
4 Examples
MCC for the distribution of authorships for 1966 in Table 1 is calculated
thus:
$$
\begin{aligned}
\kappa &= \frac{A}{A - 1} \left\{ 1 - \frac{\sum_{j=1}^{A} (1/j) f_j}{N} \right\} \\
&= \frac{1083}{1083 - 1} \left\{ 1 - \frac{(1 \times 1021) + (1/2 \times 48) + (1/3 \times 10) + (1/4 \times 3) + (1/5 \times 1)}{1083} \right\} \\
&= 1.0009 \left\{ 1 - \frac{1021 + 24 + 3.333 + 0.75 + 0.2}{1083} \right\} \\
&= 1.0009\,(1 - 1049.283/1083) \\
&= 1.0009\,(1 - 0.9689) \\
&= 1.0009 \times 0.0311 \\
&= 0.0311
\end{aligned} \tag{7}
$$
Similarly, values of MCC for 1961, 1971, 1976, 1981, 1986 and 1991 are
calculated and displayed along with the corresponding values of CI, DC and
CC in Table 2.
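The calculation for 1966 can also be checked numerically. The sketch below is an illustration rather than code from the paper: the function name `mcc` and its arguments are assumptions, and A is set to 1083 only because that is the value used in the worked example.

```python
def mcc(freq, n_authors):
    """Modified Collaborative Coefficient of Eq. (6).

    freq      -- {j: f_j}, number of papers having j authors
    n_authors -- A, the total number of authors in the collection
    """
    if n_authors < 2:
        raise ValueError("MCC is not defined for A < 2")
    n_papers = sum(freq.values())                              # N
    mean_credit = sum(f / j for j, f in freq.items()) / n_papers  # E[1/X]
    return n_authors / (n_authors - 1) * (1 - mean_credit)


# 1966 column of Table 1, with A = 1083 as in the worked example.
freq_1966 = {1: 1021, 2: 48, 3: 10, 4: 3, 5: 1}
kappa_1966 = mcc(freq_1966, n_authors=1083)
print(f"MCC(1966) = {kappa_1966:.3f}")

# Maximal collaboration: every paper is written by all A = 4 authors.
print(f"MCC(maximal) = {mcc({4: 5}, n_authors=4):.3f}")
```

Because A is large here, the prefactor A/(A - 1) is close to 1 and MCC differs from CC only in the later decimal places; the difference becomes visible when A is small, as in the second call.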
Table 2: Measures of collaboration obtained using Eqs. (1), (2), (3) and
(6). Note that MCC, unlike CC, attains 1 for maximal collaboration.
Year  CI      DC      CC      MCC
1961  1.0660  0.0590  0.0306  0.0306
1966  1.0748  0.0570  0.0311  0.0311
1971  1.1880  0.1365  0.0752  0.0752
1976  1.1778  0.1294  0.0711  0.0711
1981  1.2224  0.1653  0.0904  0.0904
1986  1.2356  0.1712  0.0938  0.0938
1991  10      1       0.857   1
5 Application to some probability distributions
It is sometimes convenient if a relationship can be established between a
measure of strength or inequality and a theoretical distribution which fits
the observed distribution of a social phenomenon. In most cases, then, the
measure can be estimated from the parameters of the distribution.
While there has been no generally accepted model for the distribution of
authorships, a few have been suggested: Price and Beaver [6] suggested the
Poisson distribution, while Goffman and Warren [7] suggested the geometric
distribution. The MCC, along with the other two measures, is given below for
these and two other commonly used probability distributions, the binomial
and the negative binomial.
If the distribution variable is unbounded, then A = ∞, so that α = 1
from Eq. (5). In this case, MCC reduces to CC. This is the case in the
following two distributions.
Geometric
$$ P(X = j) = p(1 - p)^{j-1}; \quad j = 1, 2, \ldots $$
$$ E(X) = \sum_{j=1}^{\infty} j\, p (1 - p)^{j-1} = 1/p $$
where p may be interpreted as the probability of completion of a research
work without collaborators.
$$ E(1/X) = \sum_{j=1}^{\infty} (1/j)\, p (1 - p)^{j-1} = -\frac{p}{1 - p} \log p $$
Hence,
$$ \mathrm{MCC} = 1 - E[1/X] = 1 + \frac{p}{1 - p} \log p $$
Note that MCC → 1 as p → 0 and
MCC → 0 as p → 1.
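The closed form for the geometric case can be checked against a direct, truncated summation of the series for E(1/X). The following Python sketch is an illustrative check only; the function names and the truncation length of 10,000 terms are assumptions made for this example.

```python
import math


def mcc_geometric_closed(p):
    """Closed form 1 + (p / (1 - p)) * log(p), valid for 0 < p < 1."""
    return 1 + p / (1 - p) * math.log(p)


def mcc_geometric_series(p, terms=10_000):
    """1 - E[1/X], with E[1/X] summed term by term over j = 1..terms."""
    expected_inverse = sum(
        p * (1 - p) ** (j - 1) / j for j in range(1, terms + 1)
    )
    return 1 - expected_inverse


for p in (0.1, 0.5, 0.9):
    closed = mcc_geometric_closed(p)
    series = mcc_geometric_series(p)
    print(f"p = {p}: closed form {closed:.6f}, series {series:.6f}")
```

The two columns should agree to within rounding, and the values fall from near 1 towards 0 as p increases, matching the limits stated above.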
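Shifted Poisson
$$ P(X = j) = \frac{e^{-\lambda} \lambda^{j-1}}{(j - 1)!}; \quad j = 1, 2, \ldots $$
$$ E[X] = \sum_{j=1}^{\infty} j\, \frac{e^{-\lambda} \lambda^{j-1}}{(j - 1)!} = \lambda + 1 $$
where λ may be interpreted as the average number of colleagues consulted by
a scholar before the completion of a research work.
$$ E[1/X] = \sum_{j=1}^{\infty} (1/j)\, \frac{e^{-\lambda} \lambda^{j-1}}{(j - 1)!} = \frac{1 - e^{-\lambda}}{\lambda} $$
Hence,
$$ \mathrm{MCC} = 1 - E[1/X] = 1 - \frac{1 - e^{-\lambda}}{\lambda} $$
Note that MCC → 0 as λ → 0 and
MCC → 1 as λ → ∞.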
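The closed form for E[1/X] in the shifted Poisson case follows in a few steps, sketched here for completeness. It uses only the identity $j\,(j-1)! = j!$ and the power series of $e^{\lambda}$:

```latex
\begin{aligned}
E[1/X] &= \sum_{j=1}^{\infty} \frac{1}{j}\,
          \frac{e^{-\lambda}\lambda^{j-1}}{(j-1)!}
        = \frac{e^{-\lambda}}{\lambda} \sum_{j=1}^{\infty} \frac{\lambda^{j}}{j!} \\
       &= \frac{e^{-\lambda}}{\lambda}\left(e^{\lambda} - 1\right)
        = \frac{1 - e^{-\lambda}}{\lambda}.
\end{aligned}
```

Expanding $e^{-\lambda} \approx 1 - \lambda + \lambda^{2}/2$ for small λ gives $E[1/X] \approx 1 - \lambda/2$, so MCC ≈ λ/2, which is consistent with the limit MCC → 0 as λ → 0.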
Similarly, MCC is the same as CC for other distributions in which the number
of authors is unbounded, such as the Lotka distribution, given as:
$$ f(k) = \frac{c}{k^{\alpha}} \tag{8} $$
(where α denotes the Lotka exponent, not the normalization constant of
Eq. (5)), the zero-truncated Poisson distribution, given as:
$$ h(k) = \frac{\theta^{k}}{k!\,(e^{\theta} - 1)} \tag{9} $$
the shifted negative binomial, given as:
$$ P(X = j) = \binom{v + j - 2}{j - 1} p^{v} (1 - p)^{j-1}; \quad j = 1, 2, \ldots \tag{10} $$
and the shifted inverse Gaussian-Poisson. However, when A is finite,
MCC > CC, as illustrated for the following distribution.
Shifted binomial
$$ P(X = j) = \binom{n}{j - 1} p^{j-1} (1 - p)^{n-(j-1)}; \quad j = 1, 2, \ldots, n + 1 $$
where p may be interpreted as the probability of a scholar working with
another colleague on a research work, and n may be taken as the greatest
number of colleagues with whom it is possible to collaborate within a field.
For example, while it is possible for a scientist to work with as many as
100 colleagues on a research project, it is hardly conceivable for a
humanist to collaborate with more than four colleagues (Ajiferuke et al.
[5]).
$$ P(X = 1) = (1 - p)^{n} $$
$$ E(X) = \sum_{j=1}^{n+1} j \binom{n}{j - 1} p^{j-1} (1 - p)^{n-(j-1)} = np + 1 $$
Here we set A = n + 1. Hence α > 1, given by:
$$ \alpha = \frac{A}{A - 1} = \frac{n + 1}{(n + 1) - 1} = \frac{n + 1}{n} \tag{11} $$
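$$ E[1/X] = \sum_{j=1}^{n+1} (1/j) \binom{n}{j - 1} p^{j-1} (1 - p)^{n-(j-1)} = \frac{1}{p(n + 1)} \left[ 1 - (1 - p)^{n+1} \right] $$
$$ \mathrm{MCC} = \alpha \left[ 1 - E[1/X] \right] = \frac{n + 1}{n} \left[ 1 - \frac{1 - (1 - p)^{n+1}}{p(n + 1)} \right] $$
Note that MCC → 0 as p → 0, and that, as p → 1,
CC → n/(n + 1) but MCC → 1.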
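The gap between CC and MCC for a bounded number of authors can be seen numerically. The Python sketch below is an illustration only: the function names are assumptions, and n = 4 is chosen to echo the humanities example given above. It sums the shifted binomial probabilities directly and compares the result with the closed form.

```python
import math


def shifted_binomial_pmf(j, n, p):
    """P(X = j) for j = 1, ..., n + 1."""
    return math.comb(n, j - 1) * p ** (j - 1) * (1 - p) ** (n - (j - 1))


def cc_and_mcc(n, p):
    """CC and MCC by direct summation, with A = n + 1."""
    expected_inverse = sum(
        shifted_binomial_pmf(j, n, p) / j for j in range(1, n + 2)
    )
    cc = 1 - expected_inverse
    return cc, (n + 1) / n * cc  # alpha = (n + 1) / n, Eq. (11)


def mcc_closed_form(n, p):
    """Closed-form MCC for the shifted binomial, valid for p > 0."""
    return (n + 1) / n * (1 - (1 - (1 - p) ** (n + 1)) / (p * (n + 1)))


n = 4
for p in (0.1, 0.5, 0.9, 1.0):
    cc, mcc = cc_and_mcc(n, p)
    print(f"p = {p}: CC = {cc:.4f}, MCC = {mcc:.4f}, "
          f"closed form = {mcc_closed_form(n, p):.4f}")
```

At p = 1 every paper has n + 1 = 5 authors, so CC stops at n/(n + 1) = 0.8 while MCC reaches 1.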
6 Conclusion
CC is an interesting measure of collaborative strength in a discipline that
has the merit of lying between 0 and 1 (unlike previous measures of
collaboration) and tends to 0 as single-authored papers dominate. Both these
virtues are inherited by the new measure, MCC. However, unlike CC, which
remains strictly less than 1 for finitely many authors, MCC smoothly tends
to 1 as the degree of collaboration becomes maximal. This quantitatively
captures our intuitive expectation that any quantification of collaborative
strength must become 100% when the collaboration is maximal.
References
[1] G. MELIN, O. PERSSON, Studying research collaboration using
co-authorships, Scientometrics, 36 (1996) 363-377.
[2] B. M. GUPTA, SURESH KUMAR, C. R. KARISIDDAPPA, Collaboration
profile of theoretical population genetics speciality, Scientometrics,
39 (1997) 293-314.
[3] S. M. LAWANI, Quality, Collaboration and Citations in Cancer Research:
A Bibliometric Study, Ph.D. Dissertation, Florida State University,
1980, p. 395.
[4] K. SUBRAMANYAM, Bibliometric studies of research collaboration:
A review, Journal of Information Science, 6 (1983) 33-38.
[5] I. AJIFERUKE, Q. BURRELL, J. TAGUE, Collaborative coefficient: A
single measure of the degree of collaboration in research, Scientometrics,
14 (1988) 421-433.
[6] D. DE SOLLA PRICE, D. DE B. BEAVER, Collaboration in an invisible
college, American Psychologist, 21 (1966) 1011.
[7] W. GOFFMAN, K. S. WARREN, A mathematical analysis of a medical
literature: Schistosomiasis 1852-1962. In: K. CHESHIRE (Ed.), A
Symposium: Information in the Health Sciences. Working to the Future,
Cleveland Medical Library Association of the Cleveland Health
Sciences Library, Dec. 3-4, 1969.