Assessment of Research Quality
J. Randall Brown
Graduate School of Management
Kent State University
Kent, Ohio 44242
Tel: (330) 672-2750
E-mail: Rbrown@bsa3.kent.edu
Ceyhun Ozgur*
College of Business Administration
Valparaiso University
Valparaiso, IN 46383
Tel: (219) 464-5178
Fax: (219) 464-5789
E-mail: Ceyhun.Ozgur@valpo.edu
Assessment of Research Quality
ABSTRACT
This paper considers assessment of research quality by focusing on definition and
solution of research problems. We develop and discuss, across different classes of
problems, a set of general problem solution criteria for evaluating research solutions.
In academia, we frequently evaluate the quality of research by the journal in
which a research article is published rather than by the content of the article itself. This
paper develops basic guidelines and principles for identifying "good research" which are
independent of journals in a particular field, and which can be applied to all fields.
Research quality is considered by focusing on the definitions and solutions of
research problems. In the area of problem definition, the following problem classes are
discussed: definition, description, theory, data, methodology/technique, and construction.
Certain classes of research problems are classified as "interclass" problems because of
their interaction across different problem classes. Interclass problems discussed in this
paper include: criteria, integration, extension, and comparison. In addition to "class",
problems also have what might be called "order". This paper also explores research
problems based on their "order" to determine the connectivity of a problem to a "major"
problem area (broadest problem area possible).
A set of general problem solution criteria for evaluating research solutions is
developed. The criteria include innovation, generalization, verification, longevity, and
usefulness. Using realistic examples, we discuss the solution criteria across different
classes of problems and present a framework for assessing and rating research.
1 Introduction
In academia, we all want to do “good research” but do not always agree on what that
constitutes. Perhaps the best way to judge the quality of any piece of research is to wait
fifty to one hundred years and let history supply the answer. Unfortunately, we are
forced to judge the quality of research over a much shorter time horizon in order to make
decisions on tenure, promotion, and merit pay increases. Indeed, we must make
judgments on research that is usually less than three years old and often has only just been
accepted for publication. In addition to those making tenure, promotion, and merit pay
decisions in academia, journal editors and reviewers must decide whether or not to publish
a research paper. Beyond judging the quality of completed research, researchers must
continually evaluate candidate research problems and pick the most promising ones to work on.
Popper said that one must develop several problems before finding even one “good”
problem [12]. In this approach to research, one needs to be able to generate research
problems and then evaluate them. Thus, we are faced with the problem of determining
what is “good research” in almost every phase of research, including inception.
To begin, let us list some common ways people have suggested to measure the
quality of a researcher’s publications. These include the number of refereed publications,
the number of refereed publications in top journals, and a weighted average of the
researcher’s total output possibly including refereed articles, trade journal articles, books,
proceedings, and monographs. Unfortunately, all of these methods are “after the fact.”
Furthermore, these approaches do not allow us to compare the quality of two articles in
different fields. The defenders of the common quality measurement approaches listed
above say that we can only judge the quality of research relative to its particular field.
Although this evaluation technique may seem easy to implement, there are a
number of disadvantages to this simplistic approach.
1. The researcher’s main focus is on having articles published in “top” journals,
rather than on trying to advance the field. Researchers are encouraged to do
the minimum necessary to get a “hit”. In addition, researchers are channeled
into publishing safe papers in the mainstream of the field, rather than breaking
away into new and unexplored territory for which there is little reward.
Researchers are also often encouraged to make small incremental changes in
methodology and more or less repeat a research study with minor changes or
to address the same research question with the same methodology in a different
setting.
2. A few editors or department heads can impose their ideas of what constitutes
"good research" on the majority of the researchers in the field. This stifles
independent thinking.
3. Comparing the quality of papers in different fields is almost impossible. A
paper that is “bad research” can be labeled “good research” by being
published in what some people consider to be a top journal in the field.
Seeing this, many researchers in the field do not try to increase the quality of
their research. Instead, they simply try to produce a high quantity of similar
research.
4. Most departments and fields divide the journals into categories of A, B, and C,
while also subjectively identifying the top journals. Each researcher adopts
the attitude that any journal in which he or she publishes is a top journal, or
at least in category A or B. The game is to find a journal classified as a
top journal that will allow you to publish a large number of articles.
5. Researchers are discouraged from thinking about what is “good research” and
what will advance the field. Instead, they concentrate on the surrogate
represented by the top journals.
As the above discussion points out, the evaluation and assessment of research
quality is a very complex task. Due to the subjective nature of the task and the differences
between fields, or even among researchers within the same field, it is very difficult to
develop a generally accepted research quality assessment tool. However, we need some
guidelines for identifying "good research" that are independent of the journals in
a particular field, and which can be applied to all fields. In other words, we are
searching for some underlying principles that will help us identify “good research.” We
will develop a set of criteria and guidelines that will hopefully act as a focal point for
further discussion and development. Only in this manner can we begin to solve the
difficult task of identifying “good research.”
Research can be decomposed into two main categories: problem definition and
problem solution. We will first consider what is a good problem (problem definition) and
then what is a good solution. As Root-Bernstein notes, not all problems are created
equal. Problems differ on a number of characteristics, including "class" and "order" [13].
2 Research Problem Classes
Root-Bernstein [14] defines eight “classes” of problems based on subject matter. These
eight classes are: (1) problem definition, (2) theory problems, (3) problems of data, (4)
methodology problems, (5) problems of criteria, (6) problems of integration, (7) problems
of extension and, (8) problems of comparison. We will add two more problem classes to
Root-Bernstein’s list: (9) construction and (10) description. The problem classes of
definition, theory, data, methodology or technique, construction, and description provide
the basis for the rest of the problem classes and are defined in Table 1.
Table 1
Problem Classes

Definition: Classification or distinction. Example: "What species is this?" or "What is life?"

Theory: Explanation of data according to some unifying concept. Example: the explanation of the fossil record and the geographical distribution of species by evolution, or the explanation of the observed characteristics of falling bodies by gravity.

Data: The verification and collection of data in sufficient quantities and of sufficient quality to verify or falsify a theory. Example: collecting the amount of deflection in the light from a star caused by its passing close to the sun (the first verification of the theory of relativity), or a mathematical proof.

Methodology or technique: The means by which data are collected. Example: "I need the following kind of data: how do I get it?"

Construction: Construct an object or procedure. Example: inventing the light bulb or inventing the simplex procedure.

Description: Describe what you observe. Example: "I came, I saw, I described."
Problem classes presented in Table 2 concern interactions between the basic
problems and might be classified as “interclass” problems. For all classes of problems
provided in Tables 1 and 2, we can begin to evaluate problem quality by answering the
following questions:
1. Is it an important research problem worth solving?
2. What will the solution of this problem add to the field?
3. What does it build upon and what is needed next?
4. How does it advance the field?
5. Is it exploratory, breakthrough, or confirming?
Table 2
Interclass Problems

Criteria: The interpretation, meaning, or validity of any of the first four classes. Example: some definitions, such as that of velocity, are very useful to scientists, while others, such as entelechy (actuality), are not; clear criteria must exist by which one may evaluate definitions. Similarly, a theory must provide criteria for the evaluation of data as fact (data that verify a theory), artifact (data that are inapplicable to testing a theory because they result from causes defined to be outside the bounds covered by the theory), or anomaly (data that contradict the theory but can be demonstrated to fall within the bounds in which the theory should be valid). Scientists spend a great deal of time, often unknowingly, resolving the boundaries within which problem statements and resolutions are valid.

Integration: Problems that arise from attempts to integrate two or more theories or databases in a coherent fashion. Example: the reduction of biological theories to those of chemistry and physics.

Extension: The export of a technique, criterion, or theory to a new field of science in which the application is problematic. Example: does genetics provide a sufficient base upon which to build sociobiology?

Comparison: Dealing with the competition between two or more definitions, theories, etc. Example: if two data sets are incompatible, how does one establish which set is valid for any particular use? Or, how does one test the relative merits of two theories invented to explain the same database, but with contradictory assumptions?
To paraphrase Root-Bernstein [14], each of these problem "classes" requires a
different methodology for its solution. Experiments will not provide conceptions upon
which to build theoretical explanations or new techniques by which new experiments
may be devised. Experiments only yield data. Techniques make experimentation
possible; data makes the invention of verified theoretical concepts possible. In short, the
experimental method is only one of a series of procedures used by scientists. Thus, we
need to develop a diversity of scientific methodologies of sufficient scope to handle the
diversity of problem “classes.” If the “class” of a problem can be identified correctly,
then the appropriate solution methods can also be identified and employed for a given
“class” of problems.
Although most of the classes are easy to identify, there is a special relationship
between the classes of theory and data that should be discussed. A theory is not a
collection of loosely related hypotheses, rather it is a small conceptual system that
explains something and from which deductions can be made (e.g. Darwin’s Theory of
Evolution or Einstein’s Theory of Relativity). The data class defined above only has
meaning in reference to a theory. In other words, making observations and analyzing
them can only be classified as “data” if the observations are used to prove or disprove a
theory. From this discussion, it might appear that the data class consists entirely of
making observations, but that is not correct. We should instead interpret anything that proves or
disproves a theory as a data problem. This means that almost all mathematical proofs are
data problems because they prove a theory (a theorem). Theories and data go hand in
hand and one without the other is just speculation, but together they are very powerful.
With the possible exception of the description problem class, the class of a
problem tells us very little about the possible quality of a problem solution. However,
categorizing problems into classes is very useful because we will be able to develop
specific guidelines within a problem class for evaluating problem solutions.
3 Two Examples: Cancer and New Color Dye
As we proceed, we will need to illustrate the concepts being developed with sample
problems. Rather than complicate the development in the early stages with actual papers,
we propose two “hypothetical” problems and solutions that will allow us to illustrate the
concepts, while keeping the discussion to a minimum. Suppose we have two problems
for which there are no known solutions: (1) To find a cure for cancer and (2) to find a
way to manufacture a new color dye. Both are problems of construction:
(1) construct a procedure/device to cure anyone who has cancer, and (2) construct a
procedure/device to manufacture a new color dye for five dollars or less per gallon. In
terms of importance, almost everyone would agree that problem (1) is much more
important than problem (2).
Now that the problems are defined, a researcher is assigned to each problem. For
problem (1), the researcher selects at random five people who have cancer and produces a
case study of the progress of the disease over time in each person. For problem (2), the
researcher invents a device, along with a manufacturing procedure, that enables the new
color dye to be produced at a cost of ten dollars per gallon. Note that neither researcher
solved the original problem, but each solved a related problem. For problem (1), a cure
for cancer was not devised, but the first researcher solved a problem that is hopefully
related to finding a cure for cancer. For problem (2), a procedure/device to produce the
new dye for five dollars per gallon was not found, but the researcher constructed a
procedure/device that produced the new color dye for ten dollars per gallon. This
researcher solved a problem of construction that is hopefully related to constructing a
procedure/device that produces the new color dye for five dollars per gallon. Although
neither researcher actually solved their original problems, both found a solution to a
related problem. Therefore, we need to develop guidelines that will enable us to compare
and evaluate the quality of each solution.
Some researchers would argue that since the original problem (1) is more
important than original problem (2), then the solution of the first researcher must be
better than the solution of the second researcher. Although seemingly easy to implement,
this simplistic approach is obviously wrong as the rest of the paper will demonstrate.
Indeed, one difficulty with this lexicographic approach is that it ignores the actual
solutions and only compares the original problems. This means that the best research is
just a function of the importance of the problem area and does not depend on the
creativeness of the researcher. If the above statement were true, then the way to
academic fame would be to do “mundane” research in a very important problem area. To
analyze the lexicographic approach above, we must consider how the problems actually
solved by each researcher relate to the original problem areas.
4. Other Examples
Example 3 involves a way to predict how much the Dow Jones Average will increase or
decrease in the next month. Example 4 involves constructing a new technique for
scheduling production that reduces labor cost by ten percent. Example 5 involves
constructing a method to predict where terrorists will strike next.
Example 6 involves developing a new distribution system for a retail chain that reduces
distribution costs by twenty percent. Even though examples 3 and 5 are very
important, finding a reasonable solution to them seems highly unlikely. Although
examples 4 and 6 are not as important, it is much more likely that a reasonable
solution to them can be found. Therefore, it would be wrong to classify a research study based on the
importance of the research problem without taking into account the possible solution to
the problem.
5. Research Problem Order
In addition to “class,” problems also have what might be called “order.” Most problems
are defined within a very broad “problem area” or what Danielli [3] has called a “major
or first-order problem.” An example of a “problem area” or “first-order problem” would
be “How does the immune system work?” There are clearly subproblems in all “classes”
here. What constitutes the “immune system” (definitional)? How does one describe an
immune reaction (data and techniques)? How does one explain the description one
obtains experimentally (theoretical)? Is the mechanism describable or explainable in
terms of already established mechanisms (integration)? To resolve these lower order
problems, one must, in turn, address yet further subproblems. For example, to define
what constitutes an immune system, one must have anatomical and physiological data; to
get data, one must do experiments; to do experiments one must develop techniques; to
develop appropriate techniques, one must have appropriate criteria to define a particular
problem and its boundaries; etc. What results from this process of “nesting” questions
might be called an “ordered problem tree” as illustrated in Figure 1. “Ordered problem
trees” may be made up of any combination of problem “classes.”
INSERT FIGURE 1 (About here)
The “order” of a problem in a “problem tree” is extremely important. Problems
may only be solved when the techniques, data, theories, or concepts exist for solving
them. The trick of problem solving, then, becomes the ability to propose a tree of
logically connected (i.e. “nested”) problems constructed so that one or more branches or
twigs connect with the known. The solution of one or more subproblems may then
provide the basis for the solution of the next problem in the “order.” In a very well
connected “logical problem tree,” the solution of a single minor problem may create a
“domino effect” or chain reaction leading to the solution of an entire problem area. The
problem is then to develop a “logical problem tree” in which a particular subproblem
provides the key to the remaining problems.
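To make the idea of an ordered problem tree slightly more concrete, the sketch below represents nested subproblems as a small tree and lists the chains of subproblems that reach the "known" (established techniques or data). The node labels, problem classes, and the flag marking a connection to the known are illustrative assumptions based on the immune-system example above, not part of the framework itself.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class ProblemNode:
        """One problem in an ordered problem tree."""
        statement: str
        problem_class: str                      # e.g. "definition", "data", "theory", "technique"
        connects_to_known: bool = False         # does this subproblem touch established results?
        subproblems: List["ProblemNode"] = field(default_factory=list)

    def chains_to_known(node: ProblemNode, path: Tuple[str, ...] = ()):
        """Yield every chain of nested subproblems that ends at the known."""
        path = path + (node.statement,)
        if node.connects_to_known:
            yield path
        for sub in node.subproblems:
            yield from chains_to_known(sub, path)

    # Hypothetical tree for the first-order problem "How does the immune system work?"
    tree = ProblemNode(
        "How does the immune system work?", "theory",
        subproblems=[
            ProblemNode("What constitutes the immune system?", "definition",
                subproblems=[
                    ProblemNode("Gather anatomical and physiological data", "data",
                        subproblems=[
                            ProblemNode("Develop experimental techniques",
                                        "technique", connects_to_known=True),
                        ]),
                ]),
            ProblemNode("How does one describe an immune reaction?", "data"),
        ],
    )

    for chain in chains_to_known(tree):
        print(" -> ".join(chain))

A problem whose tree yields no such chain cannot yet be grafted onto established techniques, data, theories, or concepts.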
Several criteria for problem evaluation follow immediately from consideration of
problem “order.” A problem, even if valid, is not worth pursuing if a logical tree of
subproblems cannot be built or grafted between it and established techniques, data,
theories, or concepts. A problem that meets this criterion may be worth pursuing if it
meets other criteria.
The problem should not be trivial. Triviality may be difficult to determine a
priori, but clues do exist. The importance of a problem is at least partially a function of
its connection to lower order problems, or to a problem area. In other words, it is not
sufficient that a problem connect to the known; it should have a strong connection to
other unsolved problems so that its solution promises their solutions. Otherwise, a
scientist may spend his whole life posing and resolving problems that have no logical
importance to the integration or extension of knowledge.
Many researchers justify trivial problems by stating that “if we understand this,
then we will understand (e.g.) cancer.” To argue in this manner is to make direct
connections between nth order problems and a problem area without specifying the
“logical problem tree” in between. One way of evaluating a problem is to analyze the
strength of the logical connections linking the problem to its problem area. The stronger
and more numerous the logical connections, and the better those connections are
specified, the greater the probability that the resolution of the problem will not be trivial.
A final criterion that may be used in problem evaluation concerns the specificity
of the problem statement itself. A well-defined problem states, implicitly if not
explicitly, the criteria by which the solution can be recognized. By contrast, a
problem that is stated so that its solution set cannot be imagined a priori is unlikely to be
resolved. Thus, it is important to reduce a problem to known constituents. For example,
if one asks how the ABO blood system works, then the extensive data on the ABO blood
groups provide the criteria for evaluating the success of any explanation: can the theory
explain all of the available data in terms of fact or artifact? It follows that the greater the
database, body of techniques, theories or concepts that a problem tree can be linked to,
the more clear-cut the criteria will be for recognizing an appropriate resolution to the
problems at the unknown/known juncture.
6. Problem Order for Cancer versus New Color Dye
In problem (1) a cure for cancer was not devised, but the first researcher solved a
description problem by producing a case study of the progress of the disease over time in
each of the five people. In problem (2) the researcher constructed a procedure/device that
produced the new color dye for ten dollars per gallon. When we consider the ordered
problem tree of problem (1), we find that the logical connections between the description
problem that was solved and the problem area (construct a cure for cancer) are very
weak. Indeed, it would take quite an imagination to conclude that a case study of five
people with cancer would even indirectly lead to a cure, and even if it did, all the
recognition, awards and a prominent place in history would go to the researcher who
discovered the cure and not the researcher who performed the case study.
In the ordered problem tree of problem (2), the procedure/device constructed to
produce the new color dye for ten dollars per gallon is probably very strongly connected
to the problem because the only difference is in the cost to produce a gallon of dye. For
these two examples, the ordered problem tree would indicate that the problem (2)
solution is better than the problem (1) solution. This conclusion is the exact opposite of the
one reached by the lexicographic approach, which only considers the importance of the problem area.
Until now, we have mainly considered the importance of the problem area and the
relationship of the problem to the problem area (ordered problem tree). We now consider
the problem solution itself by first developing a set of general criteria which apply to all
solutions.
7. Evaluating Problem Solutions
After the problem has been classified within an order tree, we must then evaluate
the quality of the problem solution. Although this is done “class” by “class,” there are
some criteria that cut across the problem “class.” A high quality solution must show
some innovation (the solution might even be surprising) on the part of the author and not
be the result of applying standard techniques. It must be able to be generalized (solve a
general problem comprising an infinite number of specific problems, not just one
specific problem). A high quality solution must be proven or verified to solve the
specified problem, even if the specified parameters are changed. A high quality solution
must also be durable (have no half-life; the solution will still be good many years from
now). A high quality solution must be useful. For example, formulating a problem as an
integer mathematical programming problem with thousands of variables and constraints is
not useful because we cannot solve such a problem. Table 3 presents a set of general
criteria for assessing the quality of a solution to any type of research problem.
Table 3
General Solution Criteria

Innovative: Not the result of applying standard techniques (the solution might even be surprising). Example: the big bang theory of the origin of the universe is both innovative (not like anything else) and surprising.

Generalized: Solves a general problem comprising an infinite number of specific problems, not just one specific problem. Example: Darwin's theory of natural selection or Dantzig's simplex method.

Verified: The proposed solution must be verified or proven to be, indeed, a solution. Example: Newton's laws were verified over the next 100 years by many researchers, while Dantzig's simplex procedure was proven correct by mathematical methods.

Durable: The solution is good forever. Example: Newton's laws are just as good today as they were 300 years ago, and Dantzig's simplex procedure is still correct.

Useful: The solution is potentially useful. Example: Edison's electric light bulb and Dantzig's simplex procedure are still being used today.
Of the five criteria listed above, the most important is innovation. The problem
solution must be innovative in order to be of high quality. If a researcher uses standard
techniques to devise a solution that is generalizable, verified, durable, and
useful, then the problem solution is not innovative and could reasonably be arrived at by
any other person who applied the same standard techniques. Thus, at best this solution
can only be of average quality. High quality solutions must show some creativity
(innovation).
8. Evaluating the Solutions of Cancer and New Dye
Table 4 contains an analysis of the two problems, cancer and new color dye, on the five
general solution criteria discussed in the last section. For every criterion, the solution for
(2) new color dye is better than the solution for (1) cancer. When we consider the
problem class, the ordered problem tree, and the quality of the solution for each problem,
the research produced for problem (1), cancer, must be classified as low quality
research, and the research for problem (2), new color dye, must be classified as high
quality research.
Table 4
Application of the General Solution Criteria to the Cancer and New Color Dye Solutions

Innovative. (1) Cancer: not very innovative, because standard techniques are used in a standard way. (2) New color dye: the solution could be very innovative if the procedure/device was not produced by standard techniques.

Generalized. (1) Cancer: not able to be generalized, because the solution only applies to those five people. (2) New color dye: moderately generalized, unless the solution can be extended to other problems.

Verified. (1) Cancer: since it is not able to be generalized, does it matter? (2) New color dye: yes, because we assume the solution has been verified by testing.

Durable. (1) Cancer: in 100 years the case study will have no meaning because the way people live will have changed so much. (2) New color dye: in 100 years the procedure/device will still work, even though better procedures/devices may have been discovered.

Useful. (1) Cancer: not very useful, because we still cannot do anything that we could not do before. (2) New color dye: very useful, because we can now do something that we could not do before the solution.
Now we can utilize the above criteria to evaluate and compare the quality of
actual research papers.
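To illustrate how such a comparison might be made more systematic, the sketch below turns the five general solution criteria into a simple weighted score, with innovation weighted most heavily in line with the discussion in Section 7. The 0-3 scale, the weights, and the scores assigned to the two hypothetical solutions are our own illustrative assumptions; the paper proposes the criteria themselves, not these numbers.

    # Illustrative sketch only: rate a problem solution on the five general criteria.
    # Scores use a hypothetical 0-3 scale (0 = absent, 3 = strong); weights are assumptions.
    CRITERIA_WEIGHTS = {
        "innovative": 3,   # weighted most heavily, per the discussion above
        "generalized": 2,
        "verified": 2,
        "durable": 1,
        "useful": 2,
    }

    def rate_solution(scores: dict) -> float:
        """Return a weighted-average rating of a solution over the five criteria."""
        total_weight = sum(CRITERIA_WEIGHTS.values())
        return sum(w * scores.get(name, 0) for name, w in CRITERIA_WEIGHTS.items()) / total_weight

    # Hypothetical scores echoing Table 4.
    cancer_case_study = {"innovative": 0, "generalized": 0, "verified": 1, "durable": 0, "useful": 1}
    new_dye_process   = {"innovative": 3, "generalized": 2, "verified": 3, "durable": 3, "useful": 3}

    print(f"Cancer case study rating: {rate_solution(cancer_case_study):.2f}")
    print(f"New color dye rating:     {rate_solution(new_dye_process):.2f}")

Any such numerical scheme is only as good as the judgments behind the individual scores; the framework in this paper is meant to guide those judgments, not to replace them.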
9. Evaluation of the Solutions to Other Problems
Let’s assume that the researcher that is trying to predict the stock market utilizes the past
performance of the market to predict future outcomes. Even though, this approach is
reasonable, it would not generate predictable results because the stock market is very
difficult to predict. Likewise, forecasting where terrorists will strike next would be a
nearly impossible task, even with all of the information and intelligence available to us.
However, finding a solution to example 4 (constructing a new technique for scheduling
production that reduces labor cost by ten percent) and example 6 (developing a new
distribution system for a retail chain that reduces distribution costs by twenty percent) are
much more reasonable. Therefore, even though the problems themselves are not as important, their
successful solutions may make examples 2, 4, and 6 more important than
examples 1, 3, and 5.
10. Narrow Versus Broad Focus
Until now, we have not considered the issue of whether a research problem has a narrow
or broad focus. This is an important consideration since a common criticism is “his/her
research is too narrow and therefore of low quality.” By implication, the critic usually
means that broadly focused research is of high quality, while narrowly focused research
is of low quality. Let us examine this approach by considering the two research problems
discussed earlier: (1) cancer and (2) new color dye. Almost everyone would
agree that (1) cancer is a research problem with a broad focus while (2) new color dye
has a narrow focus. However, we have seen that the researcher’s solution to (1) cancer is
of very low quality while the solution to (2) new color dye is a high quality solution.
In general, we prefer a solution to a broad problem over a solution to a narrow
problem, but only if the broad solution contains the narrow solution and the narrow
solution does not have some properties that are better than the broad solution. Thus, we
prefer the simplex algorithm, which solves all linear programming problems, to a trial
and error approach for a one-constraint linear programming problem. However, we
prefer Goldberg and Tarjan's algorithm for maximal flow problems (a subset of linear
programming problems) to the simplex method because it solves
maximal flow problems much faster.
However, science usually proceeds by taking broad problems, which cannot be
easily solved and breaking them into narrow problems that are hopefully easier to solve.
Indeed, this is the main technique for producing an ordered problem tree. We then attack
some of the narrower problems with the hope that if enough of them are solved, a broad
problem will then be solved. Thus, criticizing a research problem as being too narrow is
not valid unless the critic knows a high quality solution to a broader problem. In other
words, mundane research in a broad problem area is not as good as innovative research in
a narrow problem area. Science usually advances incrementally by solving narrow
problems with innovative solutions.
11. Usefulness as a Criterion
Most people would say that research must be useful in order to be of high quality and that
usefulness is the most important criterion in rating problem solutions. There is much
variety in what researchers consider useful. Even though usefulness is important,
many useful applications do not meet the other criteria (i.e., innovation, generalization,
verification, and durability). In addition, we can define usefulness at two levels: external
and internal. The external usefulness refers to visible improvements or savings gained
from the research project, while internal usefulness deals with usefulness at a deeper
level. Internal usefulness addresses the question of whether the research or the results
obtained will easily assist solving other similar problems without extensive additional
work or research. The higher the transferability of the research and its results to other
situations or problems, the higher the internal usefulness. Even though the achievement
of a high level of internal and external usefulness is the most desirable outcome for any
research project, internal usefulness can be more important than external usefulness
because of the transferability characteristic.
12. Explanation and Understanding
George Gale [4] divides the goals of science into two groups: (1) prediction and control
and (2) explanation and understanding. Prediction and control involves statements that
use only correlation, while explanation and understanding involves statements that use
causal connections. In one sense, explanation and understanding answers the question
“why” while prediction and control only enables us to predict or control. To illustrate the
differences, let us consider the development of astronomy.
In ancient Egypt, the irrigation system was huge and needed a vast work force in
order to function properly. However, this work force was only needed during the annual
flood lasting several weeks. Rather than keep most of its population standing around
waiting for the flood, the Egyptians developed ways to predict when the flood would
occur and in the process developed astronomy. After centuries of simple observation,
correlations were made between the various positions of the stars and what was
happening on the face of the earth. In particular, they observed that every year, just
before the onset of the flood, the star Sirius rises into view at just the instant the sun rises.
Thus the Egyptians were able to formulate the following generalization of the correlation:
“If Sirius becomes visible just as the sun rises, then the river will flood, and so it is time
to man the floodgates.” Since Sirius occupies the dawn position only once each year, this
generalization is a very effective predictor.
However, it is crucial to note exactly what this conditional statement, this
prediction, is not. This statement is not an explanation; it does not claim any causal
relationship between the position of the sun and Sirius, and the occurrence of flooding.
The Egyptians did not go on to develop any kind of a sophisticated theory to explain why
the correlation occurred. Thus, their “science” here was merely of the sort, “at any time,
if the stars do x, then y will occur, and we should do z.” In effect, what they were
claiming was just “Whenever the clock of the sky says summer, then the flood is coming;
we open up the irrigation system.” This is the cookbook science that does not involve
any explanation and understanding. In astronomy, the explanation and understanding
phase started only about five hundred years ago.
Scientific explanations involve conceptual systems that postulate the existence of
particular sorts of individual objects and the interactions that take place between these
objects. When the conceptual system of a scientific explanation is wedded to the
empirical correlations, this compound produces the kind of total science that we are used
to. In other words, we cannot only predict and control, but we know why things work the
way they do. It is only with explanation and understanding that we can then build
generalizations and procedures that operate on a broad spectrum of phenomena. Indeed,
a theory is usually an explanation of why things work the way they do, and is, therefore,
the highest level of research. Thus, theory and data problems are very important, while
description problems are only important when they lead to explanation and
understanding.
Usually, explanation and understanding also means prediction and control, but
there are some notable exceptions. Darwin’s theory of evolution and theory of natural
selection both offer explanations as to how and why animals and plants developed, but
neither allow us to predict what will happen in any one particular instance. In general,
usefulness can be defined in terms of prediction and control and explanation and
understanding. A problem solution is useful if it enables us to predict and control, but it
is even more useful if it enables us to explain and understand. Seen in this light, a typical
new product survey, for example, is only marginally useful because it only partially predicts
what new products will be produced and offers no control mechanism. In addition, such a
survey offers no explanation or understanding of the underlying process for the
development of new products.
Although explanation and understanding is the highest form of problem solution,
good explanation and understanding results can come in a variety of forms. For example,
the explanation of why the simplex algorithm of linear programming works combines
theory (the statement of the optimality condition) and data (the proof of the optimality
condition). In other words, the explanation and understanding of a mathematical system
has many of the same characteristics as explanation and understanding of a physical
system.
To further explore prediction and control as well as explanation and
understanding in science, consider two examples from management science.
The first article (1) constructs a linear programming model for a company and uses the
results of the simplex algorithm to help schedule production. The second article (2)
investigates the structure of linear programming problems by proving that an optimal
solution must occur at an extreme point of the convex polyhedron of feasible solutions.
Many researchers would rate (1) as higher quality than (2) because (1) is more “useful.”
In analyzing these articles, let us determine their problem “class” and place them in an
“ordered problem tree.” Problem (1) is not a problem of definition, theory, data, or
methodology, but is a problem of description (use a linear programming model to
describe a real world situation) and construction (use the simplex method to construct the
best solution). Problem (2) on the other hand is a problem of theory (show the
underlying structure of linear programming problems by determining where the optimal
solutions must occur) and data (proves that the theory is correct for all linear
programming models). In the ordered problem tree of linear programming, problem (1)
is an application of the theory to a particular problem and is just an example of how the
general theory can be used. Problem (1) is used for prediction and control in a specific
situation, but does not contribute to our ability to predict and control in a broader sense. In
addition, problem (1) makes no contribution to explanation and understanding and thus
must be considered of low importance in the ordered problem tree. On the other hand,
problem (2) directly contributes to our explanation and understanding of the underlying
structure of linear programming models and how the simplex method works. Thus,
problem (2) must be considered of high importance in the ordered problem tree. Finally,
let us discuss the quality of the problem solution. The solution to problem (1) is neither
innovative nor able to be generalized, and it has a definite half-life. On the other hand, it
is verified (the linear programming solution is used to help schedule the plant) and useful
to the company, even though no one else uses it as anything but verification that linear
programming is a technology that can be used to help solve real world problems. The
solution to problem (2) is innovative (developing the correct structure with non-
standardized techniques—thinking), able to be generalized (applies to all linear
programming models), verified (the mathematical proof of the theory), durable, and
useful (directly contributes to our explanation and understanding of linear programming
models). Thus, Problem (2) must be rated as much higher quality than Problem (1).
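The structural fact that article (2) is said to prove, namely that an optimal solution of a linear program lies at an extreme point of the feasible region, can be illustrated numerically. The following sketch is ours, not part of either hypothetical article; it solves a tiny two-variable linear program with SciPy and checks how many constraints hold with equality at the optimum.

    # Minimal illustration: maximize 3x + 2y subject to x + y <= 4, x + 3y <= 6, x, y >= 0.
    # SciPy's linprog minimizes, so the objective coefficients are negated.
    from scipy.optimize import linprog

    c = [-3, -2]                       # negated objective
    A_ub = [[1, 1], [1, 3]]            # left-hand sides of the two constraints
    b_ub = [4, 6]                      # right-hand sides

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method="highs")
    x, y = res.x
    print(f"optimum at x = {x:.1f}, y = {y:.1f}, objective = {-res.fun:.1f}")

    # At a vertex of this two-variable polyhedron, two of the four inequalities
    # (two structural constraints plus two non-negativity bounds) hold with equality.
    tight = [
        abs(x + y - 4) < 1e-8,
        abs(x + 3 * y - 6) < 1e-8,
        abs(x) < 1e-8,
        abs(y) < 1e-8,
    ]
    print("tight constraints at the optimum:", sum(tight))

Running the sketch places the optimum at the vertex (4, 0), with two tight constraints, which is the extreme-point behavior that the theoretical result in article (2) guarantees for every linear program.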
13. Theory versus Practical Research
The last section has shown that the goal of research is to produce conceptual systems
from which deductions may be made and which fulfill our explanation-understanding
need. Unfortunately, many researchers do not make attempts to connect their research
findings to any theory or conceptual systems. Indeed, the quality of a research project
does not lie in the sophistication of the methodology or the uniqueness of the sample
data, but in how the results of the paper are innovatively tied to theory, conceptual
systems, outcomes that are able to be generalized, or new measurement techniques.
An article by Agrawal [1] uses the rankings of U.S. News and World
Report as a surrogate to measure the quality of three journals in operations management.
Even though this paper attempts to measure the quality of journals rather than the quality
of the articles published in them, it is still indicative of how flawed the process of
measuring journal quality can be. Agrawal's attempt to measure the quality of journals is
inadequate because simply looking at the percentage of articles published by the top 20
business programs in the U.S., the top 10 programs in production and operations
management, and the 30 U.S. universities ranked between 21 and 50 does not help in
rating the quality of these journals. The author makes various linkages in terms of quality
that are not substantiated. First of all, the paper assumes that the higher the percentage of
publications coming from top-rated schools, the higher the perceived quality of the
journal. This assumption is wrong from several perspectives. First, it implicitly assumes
that the quality of the articles published by authors from top-rated institutions is higher
than the quality of articles published by authors from schools not rated in the top. This
assumption overlooks the review process associated with each paper. The author is trying
to connect the perceived reputation of the top schools with the percentage of articles
published by authors from those schools, which does not help assess the quality of the
journal. Indeed, it provides an artificial classification of top schools versus non-top
schools in assessing the quality of the journal. Just because JOM (Journal of Operations
Management) has a lower percentage of articles authored at top-ranked institutions does
not mean that the quality of the papers it publishes is not as good as the quality of articles
published by POM (Production and Operations Management). We should not assess the
quality of journals by the authors' institutions but by the inherent quality of the articles
published. It would be more informative, though still inadequate, to consider the make-up
of the editorial review boards of the various journals and what percentage of each board
comes from top-rated institutions. However, any kind of classification by institution is
inappropriate because it implies that a person from a top-rated school is more qualified as
an author or editorial review board member than a person from a lower-ranked
institution. This type of classification is faulty: for example, a non-tenured person at a
top-ranked institution may not yet have proven himself or herself, while a tenured,
well-published professor from an unranked institution may be more qualified to write or
judge a paper. Moreover, this classification scheme implies that each author submits his
or her best work to the journals in question, which may or may not be true.
An article by Smith [15] takes a better look at this issue by measuring the
average number of citations, from the Social Sciences Citation Index, of articles in 15
leading finance journals. He looks at type I error (a top article rejected by the top three
journals) and type II error (a non-top article accepted as a top article). He concludes that,
based on citations, we need to look beyond the three top finance journals and examine
each article more carefully for its intrinsic quality. Im, Kim, and Kim [9], [10] assess the
quality of MIS research by individuals and institutions by counting published articles and
computing a productivity score, which involves counting the number of pages of articles
in six core MIS journals. The articles are counted in two different ways: (1) normal
count, in which each co-author gets one full unit of credit for a published article, and
(2) modified count, in which each co-author gets a fraction of a unit reflecting his or her
contribution; for instance, when a paper has three authors, each author gets 1/3 credit
toward the total count of papers. This is a very elaborate assessment procedure that was
criticized by Guimaraes [6], who stated that Im, Kim, and Kim [9], [10] made two
questionable assumptions: (1) the selection of journals and (2) the definition of research
productivity. Guimaraes [6] strongly disagrees with Im, Kim, and Kim's approach of
defining research quality by counting articles or numbers of pages. He claims that, using
Im, Kim, and Kim's criteria, the research productivity of some senior researchers will be
inflated because they tend to pressure junior faculty or doctoral students to include them
as authors [6]. Im, Kim, and Kim [10] responded to Guimaraes and stated that while their
approach to measuring research productivity and quality was far from perfect, it provides
a starting point for assessing the quality of research. Im, Kim, and Kim [10] modified
their assessment procedure on the basis of the criticism by Guimaraes [6] and updated the
calculation of their productivity score; in their second paper, they measure productivity
based on two different surveys [10]. Even though Im, Kim, and Kim used a very
elaborate assessment procedure, we believe it falls short of a true assessment of quality
because of the shortcomings stated by Guimaraes [6]. Even though it is not perfect, we
believe Smith's approach [15], which assesses the quality of research based on citations
rather than by counting articles in top journals, is far more accurate.
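To pin down the difference between the two counting rules, the short sketch below computes both a normal count and a modified (fractional) count from a hypothetical list of papers and author lists; the paper titles and author names are made up for illustration.

    from collections import defaultdict

    # Hypothetical publication records: (paper title, list of co-authors).
    papers = [
        ("Paper A", ["Author X", "Author Y", "Author Z"]),
        ("Paper B", ["Author X"]),
        ("Paper C", ["Author Y", "Author Z"]),
    ]

    normal = defaultdict(float)     # each co-author receives one full unit per paper
    modified = defaultdict(float)   # each co-author receives 1/n of a unit per paper

    for _, authors in papers:
        for author in authors:
            normal[author] += 1.0
            modified[author] += 1.0 / len(authors)

    for author in sorted(normal):
        print(f"{author}: normal = {normal[author]:.2f}, modified = {modified[author]:.2f}")

Under the normal count a three-author paper adds a full unit to each author's total, while under the modified count it adds only one third of a unit to each.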
In the OR/MS discipline there has been a debate about the split between
theoretical and practical research [8]. Corbett, Hansen and Meredith refer to this split as
“academic drift” and discuss it extensively in their respective papers [2], [7], [11].
Academic drift refers to the academicians’ increased focus on theoretical research that
has very little or no practical relevance. One possible reason why academic drift is
occurring is that theoretical papers were not evaluated in the context of proper evaluation
criteria. In other words, theoretical papers with no current practical implications and with
few possible future practical implications were being incorrectly assessed as good
research when, indeed, these research papers did not meet the usefulness criterion. Tinker
and Lowe [16] criticized OR/MS for having too much technical specialization and no
coherent purpose, and criticized research in the OR/MS discipline as having little or no
practical usefulness. However, we have to be very
careful when assessing the usefulness of a research piece because what seems to not be
useful in the current context may prove to be useful later in solving a related problem or
an entirely different research problem.
Another possible explanation for the academic drift expressed by Meredith
focuses on the difference between the realist and relativist philosophies. Meredith states
that the difference between realists and relativists neither stems from differences in
research methods nor is it related to epistemological differences regarding how
humans obtain knowledge [11]. According to Meredith, the differences occur as a result of
how individuals view reality [11]. Realists believe that one single reality exists, whereas
relativists are convinced that there are multiple versions of reality because reality is
constructed by human thinking based on each individual's perception of it [11].
Meredith goes on to discuss the two philosophies in detail and attempts to show how each
philosophy views research and theory within the context of Operations
Research/Management Science (OR/MS) field [11]. Since the OR/MS discipline initially
evolved from the successful application of theory in real-world settings, assessment of
research using the relativist philosophy matches well with the research assessment framework
and criteria that we have developed in this paper. On the other hand, we believe that
the research performed by "realists" can also be successfully evaluated using the criteria
we have presented earlier in this paper. Even "realists" conducting so-called
"pure research" would not have a problem with our criteria and would welcome the
validation of their research through a successful practical application.
To what degree does a researcher have to test the model in order to demonstrate
its usefulness? Can the usefulness of the model be potentially demonstrated later through
a separate practical application or real-world test? Taking the relativist
philosophy to an extreme has the potential pitfall of emphasizing the use of standard
procedures for collecting and analyzing data and generating research papers that do not
advance the field. Geoffrion demonstrates how good theory
facilitates good practice [5]. As long as we are able to demonstrate the usefulness of
theoretical contributions, we are able to bridge the gap between theory and practice.
Therefore, we can justify pure theoretical contributions as long as the research clearly
shows at least the potential usefulness (i.e. possible area of practical application).
14. New Research Fields
Most new fields start by using description. Unfortunately, many fields stay in the
descriptive stage for too many years before moving into the discovery stage. Some
researchers claim their research, although descriptive and using standard procedure, is
high quality because the field is new. The first phase in any new field after a few months
spent in observation and description is to develop basic concepts and ways to measure
these concepts (definition and measurement problems). For example, when scientists
considered hot and cold, they first developed the concept of temperature to explain hot
and cold. Next, they invented a way to quantify and measure the concept of temperature
by eventually inventing the thermometer. Unfortunately, many new fields languish in
description for many years. The type of research needed by a particular field in part
depends on the maturity of the field. Fields in their early stages of maturity may
justifiably engage in descriptive research to explore possible subjects for research.
In other words, the maturity of the field dictates the type of research needed.
However, in some of these new fields, unfortunately, it takes too long to complete this
first phase of inventing general concepts and ways to measure them.
15. Construction
Construction or development of something can be considered either low or high quality
research depending on many factors. For example, constructing a linear programming
model to successfully schedule a particular plant, as was discussed in Section 12, is a great
consulting outcome that may also lead to a good academic case study. Unfortunately,
this type of work does not meet the criteria for high quality research. On the other hand,
development of the simplex algorithm would be considered high quality research because
it is innovative, able to be generalized (applies to all linear programming models),
verified (the proof that the algorithm will always determine an optimal solution), durable,
and useful (the simplex algorithm can be used to construct the optimal solution to any
linear programming model). Generally, the stronger the connection to theory or
development of new concepts, the higher the quality of research.
16. Evaluation of Research Studies
We have discussed many factors affecting the quality of research. It is important that we
are able to evaluate research quality in terms of a fairly objective model. As a first
alternative, we propose that an individual's research be carefully read and judged based on
the principles discussed in this paper. If the evaluator does not have time to read the articles,
then the average number of citations can be used as a surrogate measure.
17. Conclusions
In this paper we discuss different classes of research problems and the classification of these
problems in the "ordered problem tree" by analyzing the strength of their connections to the
original problem area. Then, we develop a set of general criteria to evaluate the quality of
a solution procedure for any class of research problems. In an attempt to provide clear
guidelines for assessing the quality of research, we discuss issues related to depth versus breadth,
theory versus practice, and new research fields, and provide a basic framework for rating
research. We hope that this paper will provide the necessary information to assess
research quality and provide evaluators and reviewers of research projects a better
understanding of the criteria and factors affecting the quality of research.
References
1. Agrawal, V. K. "Constituencies of Journals in Production and Operations Management: Implications on Reach and Quality," Production and Operations Management, 11, 2, 2002, 101-108.
2. Corbett, C. J. and L. N. Van Wassenhove, "The Natural Drift: What Happened to Operations Research?" Operations Research 42, 1993, 625-640.
3. Danielli, J. F. The Future of Biology, The State University of New York Press, New York, 1966.
4. Gale, G. Theory of Science: An Introduction to the History, Logic, and Philosophy of Science, McGraw-Hill Book Company, New York, 1979.
5. Geoffrion, A. M. "Forces, Trends and Opportunities in OR/MS," Operations Research 40, 1992, 423-445.
6. Guimaraes, T. "Assessing Research Productivity: Important But Neglected Considerations," Decision Line, May 1998, 18 and 22.
7. Hall, R. W. "What is so Scientific About OR/MS?" Interfaces 15, 1985, 40-45.
8. Hansen, P. "A Short Discussion of the OR Crisis," European Journal of Operational Research 38, 1989, 277-281.
9. Im, K. S., K. Y. Kim, and J. S. Kim, "A Response to 'Assessing Research Productivity: Important But Neglected Considerations'," Decision Line, Sep./Oct. 1998, 12-15.
10. Im, K. S., K. Y. Kim, and J. S. Kim, "An Assessment of Individual and Institutional Research Productivity in MIS," Decision Line, Dec./Jan. 1998, 8-11.
11. Meredith, J. R. "Reconsidering the Philosophical Basis of OR/MS," Operations Research 49, 2001, 325-333.
12. Popper, K. R. The Logic of Scientific Discovery, Basic Books, New York, 1959.
13. Root-Bernstein, R. S. "The Problem of Problems," Journal of Theoretical Biology 99, 1982, 193-201.
14. Root-Bernstein, R. S. Discovering, Harvard University Press, Cambridge, Massachusetts, 1989.
15. Smith, S. D. "Is an Article in a Top Journal a Top Article?" Financial Management, Winter 2004, 133-149.
16. Tinker, T. and T. Lowe, "One Dimensional Management Science: The Making of Technocratic Consciousness," Interfaces 14, 1984, 40-56.
Figure 1
Model of the Ordered Problem Tree

[Figure 1 shows a first-order problem (the problem area) branching into second-, third-, fourth-, and fifth-order problems, linked through concepts, theory, data, and the development of techniques, with an infinite regress into the unknown at the lowest levels.]