Introduction to Predictive Psychodiagnostics
Peter G. Menshih, Igor V. Niesov
The article discusses the theoretical and practical features of constructing predictive
classifiers based on the results of psychological tests using machine learning methods.
The purpose of creating such classifiers is to obtain models that can predict the
presence or absence of certain personal and professional qualities in a
psychodiagnostic test respondent, as well as the respondent's success in a particular
type of activity. Such forecasts help reduce hiring errors and support more informed
hiring decisions.
Forming forecasts from the results of a completed psychodiagnostic test is a fairly
common practice in medicine, sports, psychology, transport, the army [5,
6], and other areas. In such cases, the results of psychodiagnostic tests based, for
example, on methods such as the MMPI or the Rorschach test [8–10] are taken as initial
data. The test data are paired with labels: information about whether the respondent
has the properties that make up the predicted metric.
One of the disadvantages of the traditional approach to building predictive systems is
that the predictors obtained in this way are not universal and require separate
research and development in each specific case, depending on the field of application.
In addition, the predictive assessment is usually derived from a psychodiagnostic
technique that was not originally intended to produce prognostic assessments. At the
same time, the respondent's answers to the test questions give a complete description
of the person. However, in most cases these data are not used directly but only through
the values of the calculated scales. The scales, strictly typified by the methodology,
describe the respondent as if in a vacuum, whereas the answers to the questions
themselves often contain information about a person in his direct, living relationship
with the world.
Suppose we use all the results of the psychodiagnostic test: the values of the scales,
the calculated psychotype, and the answers chosen by the respondent. Suppose we also
add answers to questions that are not included in the questionnaire but contain
valuable information, for example, about the respondent's field of activity and
position. Then, using machine learning algorithms, it is possible to build and train a
model that predicts numeric indicators (regression) or classifies a respondent
according to researcher-defined criteria that are inaccessible to the psychodiagnostic
method itself.
Thus, predictive psychodiagnostics can be described as the process of forming a
predictive assessment of a metric based on psychodiagnostic testing data using
machine learning methods.
The development of methods capable of extracting and visualizing data inaccessible to
traditional psychodiagnostic methods is highly relevant. Given sufficient statistical
significance and predictive power, such data can qualitatively change the management of
personnel assessment and selection, career guidance, and the forecasting of a person's
“work path”, taking into account the industry specifics of his field of activity.
The approach to creating a predictive psychodiagnostic system described in this paper
is based on the synergy of two sources: objective data, such as the values of scales
produced by a basic psychodiagnostic technique (in this work, KPMI), and a predictive
assessment of additional properties and qualities of the respondent, obtained by
approximating survey data with supervised machine learning algorithms.
The process of creating models for a predictive psychodiagnostic system generally
consists of the following set of steps:
1. definition of metrics calculated using machine learning;
2. adding questions to the basic questionnaire, the answers to which clearly
describe the selected metrics;
3. collection of a sufficient amount of data for training the ML model (i.e.,
questionnaires filled out by respondents);
4. model training.
The model obtained as a result of the described steps takes input data on the
respondent's answers to the survey questions and the scale values calculated
according to the rules of the basic psychodiagnostic methodology.
The output of the model contains a predictive estimate of the target metric.
The basic psychodiagnostic technique on which a predictive system is built can
be any, and its choice should depend on the specific tasks that the system must
solve. At the same time, the basic technique should, to one degree or another,
correlate with the subject area of the metrics being predicted.
For example, if the task is to forecast effectiveness in the field of people
management, then methods for assessing intellectual abilities alone will not be
enough, since they lack the items needed to identify the emotional intelligence
features that are significant for interpersonal interaction.
The features used by the model can be both a set of respondents' answers to the
questionnaire questions and scale values calculated following the basic
psychodiagnostic methodology. In addition, it is possible to use a combination of
answers to questions and scale values, as well as construct new features based on
them. The choice of a specific approach depends on the algorithms used, target
metrics, and the amount of data for training and requires a separate study in the context
of a specific problem being solved.
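The combination of answers and scale values described above can be sketched as a simple feature-assembly step. The following Python fragment is only an illustration under stated assumptions: the answer options `ANSWER_OPTIONS`, the scale names `scale_1` and `scale_2`, and the one-hot encoding of answers are hypothetical choices that do not come from the KPMI methodology or from the study itself.

```python
from typing import Dict, List

# Hypothetical answer options shared by every questionnaire item (an assumption).
ANSWER_OPTIONS = ["a", "b", "c"]


def one_hot(answer: str, options: List[str]) -> List[float]:
    """Encode one chosen answer as a 0/1 vector over the possible options."""
    if answer not in options:
        raise ValueError(f"unknown answer option: {answer!r}")
    return [1.0 if answer == option else 0.0 for option in options]


def build_features(answers: List[str], scales: Dict[str, float],
                   scale_order: List[str]) -> List[float]:
    """Concatenate one-hot encoded answers with scale values in a fixed order."""
    features: List[float] = []
    for answer in answers:
        features.extend(one_hot(answer, ANSWER_OPTIONS))
    # Scale values are appended in a fixed order so that every respondent
    # produces a vector with the same layout.
    features.extend(scales[name] for name in scale_order)
    return features


respondent_answers = ["a", "c"]                       # illustrative answers
respondent_scales = {"scale_1": 0.7, "scale_2": 0.2}  # illustrative scale values
vector = build_features(respondent_answers, respondent_scales,
                        ["scale_1", "scale_2"])
print(vector)
```

Whether raw answers, scale values, or derived features work best remains, as noted above, a question for a separate study in each specific problem.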
It is important to note that a good psychodiagnostic technique rests on a model that
describes the psychological characteristics of a person. Furthermore, this model is
comprehensive: in one way or another, it includes the various factors that determine
human behavior. This, in turn, means that it will most likely contain the factors that
determine the measured metrics.
A practical example of a predictive psychodiagnostics model is the following.
The predictive model is a three-layer neural network
with a layer configuration of 60-30-15 neurons.
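To make the 60-30-15 configuration more concrete, the sketch below runs a forward pass through a network of that shape in plain Python. Only the layer sizes come from the text. The number of input features (`N_FEATURES`), the ReLU activations, the single sigmoid output unit, and the random untrained weights are assumptions made for illustration and should not be read as the authors' actual implementation.

```python
import math
import random

N_FEATURES = 20              # hypothetical input size, not stated in the text
HIDDEN_SIZES = [60, 30, 15]  # layer configuration given in the text


def make_layer(n_in, n_out, rng):
    """Create a dense layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer followed by an activation function."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


rng = random.Random(0)
sizes = [N_FEATURES] + HIDDEN_SIZES
hidden_layers = [make_layer(n_in, n_out, rng)
                 for n_in, n_out in zip(sizes, sizes[1:])]
# Assumed output layer: one sigmoid unit, so the result lies between 0 and 1.
output_layer = make_layer(HIDDEN_SIZES[-1], 1, rng)


def predict(features):
    """Forward pass: three hidden layers, then a single sigmoid output."""
    activations = features
    for weights, biases in hidden_layers:
        activations = dense(activations, weights, biases, relu)
    weights, biases = output_layer
    return dense(activations, weights, biases, sigmoid)[0]


score = predict([0.5] * N_FEATURES)
print(score)  # an untrained score strictly between 0 and 1
```

In practice such a network would be trained with a deep learning library. The fragment only shows how data flow through layers of the stated sizes.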
The model was created to produce a predictive assessment of the level of involvement of
sales employees in a large retail chain. The input data for the model are the
results of the KPMI psychodiagnostic test. The output is a value from 0 to 1
that characterizes the predicted level of the respondent's involvement.
The answer to the NPS-type question "I am ready to recommend a job in the company
to my friends and acquaintances" was used as the label during training, with
possible values ranging from -100 to 100 in increments of 20. Respondents who chose
values from -100 to -20 inclusive were considered critics, that is, uninvolved
employees, and those who chose 80 or 100 were considered promoters, that is, involved
employees.
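The labeling rule can be written down as a small function. This is a minimal sketch: the thresholds follow the text, while the function name `nps_to_label`, the 0/1 encoding of the classes, and the decision to leave intermediate answers (0 to 60) without a label are assumptions, since the text does not say how those answers were handled.

```python
from typing import Optional

# Answer scale from the text: -100 to 100 in increments of 20.
NPS_VALUES = list(range(-100, 101, 20))


def nps_to_label(value: int) -> Optional[int]:
    """Map an NPS-type answer to a class label.

    Returns 0 for critics (uninvolved), 1 for promoters (involved), and
    None for intermediate answers, which are assumed here to be unlabeled.
    """
    if value not in NPS_VALUES:
        raise ValueError(f"not a valid answer option: {value}")
    if value <= -20:
        return 0  # critics: -100 to -20 inclusive
    if value >= 80:
        return 1  # promoters: 80 and 100
    return None   # 0 to 60: no label under this sketch's assumption


labels = {value: nps_to_label(value) for value in NPS_VALUES}
print(labels)
```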
To reduce the influence of randomness on the result, both in generating the training
and validation samples and in training the neural network itself, the procedures for
generating the samples and for training and validating the model were repeated
100 times.
The average accuracy value obtained from 100 iterations was used as a performance
metric of the resulting model. In this case, accuracy is understood as the ratio of the
number of correctly classified samples to the total size of the test sample.
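The evaluation procedure can be sketched as a repeated random hold-out. The fragment below is an illustration under assumptions: the function names are hypothetical, the default test share of 0.15 mirrors the test sample size reported in the article, and `majority_class_trainer` is a deliberately trivial placeholder standing in for the neural network.

```python
import random
import statistics


def accuracy(y_true, y_pred):
    """Share of correctly classified samples in the test sample."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def majority_class_trainer(train_x, train_y):
    """Placeholder model: always predicts the most frequent training label."""
    majority = max(set(train_y), key=train_y.count)
    return lambda x: majority


def repeated_holdout(xs, ys, train_fn, n_repeats=100, test_fraction=0.15,
                     seed=0):
    """Repeat split, training, and validation; return mean and std of accuracy."""
    rng = random.Random(seed)
    n_test = max(1, round(len(xs) * test_fraction))
    scores = []
    for _ in range(n_repeats):
        indices = list(range(len(xs)))
        rng.shuffle(indices)  # a new random split on every repetition
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = train_fn([xs[i] for i in train_idx],
                         [ys[i] for i in train_idx])
        predictions = [model(xs[i]) for i in test_idx]
        scores.append(accuracy([ys[i] for i in test_idx], predictions))
    return statistics.mean(scores), statistics.stdev(scores)


# Toy data for illustration only: 40 samples, 28 of them labeled 1.
toy_x = list(range(40))
toy_y = [1] * 28 + [0] * 12
mean_acc, std_acc = repeated_holdout(toy_x, toy_y, majority_class_trainer)
print(mean_acc, std_acc)
```

Averaging over many random splits reduces the dependence of the reported figure on any single favorable or unfavorable split.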
The total volume of initial data was 2,095 questionnaires, and the test sample made up
15% of it. We trained the model using a balanced dataset from the original sample by
IV. Results and Discussion
Over 100 iterations of training and validation, the average accuracy of the resulting
model is 66%, with a standard deviation of 0.05. During training, the model takes an
average of 100 epochs to reach peak performance.
The model accuracy of 66% is at least no worse than that of psychodiagnostic
predictive methods [8, 13, 14]. At the same time, the predictive model described in this
study is essentially not optimized. Its reported performance also includes predictions
that fall in the zone of model uncertainty, with outputs in the range from 0.4 to 0.6.
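One way to examine the effect of the uncertainty zone is to compute accuracy only for predictions that fall outside it and to report the share of samples that remain. The sketch below is an assumption-based illustration: the function name, the inclusive band boundaries, the 0.5 decision threshold, and the toy data are hypothetical and are not taken from the study.

```python
def accuracy_outside_band(y_true, scores, low=0.4, high=0.6, threshold=0.5):
    """Accuracy on predictions outside [low, high], plus the share kept.

    Scores inside the band are treated as "uncertain" and skipped
    (assumption: the band boundaries are inclusive).
    """
    if len(y_true) != len(scores) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    kept = [(t, s) for t, s in zip(y_true, scores) if not (low <= s <= high)]
    coverage = len(kept) / len(y_true)
    if not kept:
        return None, coverage
    # A score above the threshold is read as class 1, otherwise class 0.
    correct = sum(1 for t, s in kept if (s > threshold) == (t == 1))
    return correct / len(kept), coverage


# Toy example: two of the five scores fall inside the 0.4-0.6 band.
toy_true = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.1, 0.55, 0.45, 0.3]
band_accuracy, band_coverage = accuracy_outside_band(toy_true, toy_scores)
print(band_accuracy, band_coverage)
```

Accuracy computed this way must be read together with the coverage, because a narrower set of confident predictions is being scored.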
Finer tuning of the neural network architecture and careful selection of its
hyperparameters, coupled with excluding predictive estimates in the range of 0.4–0.6,
make it possible to form models with an accuracy of 75% or more. This level is
already hard to reach for classical methods for constructing predictive
Using the results of psychological testing to predict a person's qualities within a
specific activity does not give sufficiently reliable results when classical methods
of psychodiagnostics are used. This is especially true when the target metric is
specific. Accuracy can be improved with machine learning methods, which take into
account not only scale values but also response patterns, testing time, and
other meta-parameters. The tandem of psychodiagnostics and machine learning can
become a new milestone in determining personality traits in general and career
prospects in particular, and its capabilities are being actively studied [17–20].
In addition, a predictive system built on a classical psychodiagnostic basis
must be tailored specifically to one predictive metric. This prevents the
developed test from being used for other problems, for example, assessing the
involvement or satisfaction of employees. Machine learning methods, in turn, make it
possible to build predictive systems from existing tests, either in their original form
or with a minimum number of additions, provided that the test questions contain the
necessary information.
In conclusion, it is worth noting that neural networks are "black boxes": they do not
offer an easy-to-understand representation of the model as a whole. The output of
the neural network is used "as is", without an explanation of how it was obtained. In the
context of career guidance, personnel assessment, and psychodiagnostics, this can in
certain situations make neural networks difficult to use, solely because of a lack of
understanding of "how it works."
1. Small, A. C., Madero, J., Teagno, L., & Ebert, M. H. (1983). Intellect, perceptual
characteristics, and weight gain in anorexia nervosa. Journal of Clinical Psychology.
2. Lawless, J., & Grobbelaar, H. (n.d.). Sport psychological skills profile of track and field athletes
and less successful track athletes. Retrieved November 15, 2022, from
3. Pastushenya, A., Vasishchev, A., Filaretov, S., & Zharkikh, A. (2019). Improvement of
psychological tests to identify persons prone to escape from correctional institutions and places of
detention. International Penitentiary Journal, 1(2), 118–136.
4. Zaoral, A. (2009, November 30). Manual of Recommended Psychodiagnostic Methods for
Examination and Assessment of Mental Competence to Drive Motor Vehicles.
5. Muhammad, R. S., Wolters, H. M. K., & Jayne, B. S. (2020). Personality testing: enhancing
in-service selection of mid-career soldiers. Military Psychology, 32(1), 71–80.
6. DeLuca, J. (1968). Predicting the Full Scale WAIS IQ of Army Basic Trainees. The Journal of
Psychology, 68(1), 83–86. https://doi.org/10.1080/00223980.1968.10544132
7. Marek, R., Tarescavage, A., Ben-Porath, Y., Ashton, K., Heinberg, L., & Merrell Rish, J. (2015).
Presurgical Psychological Testing: Incremental Contribution to Predicting Failure to Follow
through with Bariatric Surgery. Surgery for Obesity and Related Diseases, 11(6), S157–S159.
8. Cmelic, S., & Henig, L. (1975). Prognostic value of the Rorschach psychodiagnostic test in
psychotherapy. Socijalna Psihijatrija, 3(4), 357–363.
9. Newmark, C. S., Konanc, J. T., Simpson, M., Boren, R. B., & Prillaman, K. (1979). Predictive validity
of the Rorschach prognostic rating scale with schizophrenic patients. Journal of Nervous and
Mental Disease, 167(3), 135–143. https://doi.org/10.1097/00005053-197903000-00001
10. Predictive validity of MMPI-2 and Rorschach in the diagnosis of depression and schizophrenia -
11. KPMI (Keys to Personal Mastery Inventory) (2016). https://psycho.ru/library/228
12. Menshih, P. (2022, June 2). Engagement prediction for retail chain salespeople. Kaggle.
13. Golyanich, V. M., Bondaruk, A. F., Shapoval, V. A., & Tulupyeva, T. V. (2018). Value contradictions
as psychodiagnostic criteria of professional competence and an intrapersonal conflict.
Experimental Psychology (Russia), 11(3), 120–139. https://doi.org/10.17759/exppsy.2018110309
14. Zhan'ko, Chulaevskiĭ. (2006, January 1). Ability to process information as a factor of professional
success - Europe PMC. https://europepmc.org/article/med/16755764
15. Catron J. Leo. Diagnostic utility of the Minnesota Multiphasic Personality Inventory for objective
diagnostic classifications - ProQuest. (1982).
16. Botek, M., & Sládek, P. (2018). USE OF PSYCHODIAGNOSTICS IN HIRING – COMPARISON
ON STUDENTS FROM PRAGUE UNIVERSITIES. 3rd International Thematic Monograph -
Thematic Proceedings: Modern Management Tools and Economy of Tourism Sector in Present
Era, 755–768. https://doi.org/10.31410/tmt.2018.755
17. Dolce, P., Marocco, D., Maldonato, M. N., & Sperandeo, R. (2020). Toward a Machine Learning
Predictive-Oriented Approach to Complement Explanatory Modeling. An Application for
Evaluating Psychopathological Traits Based on Affective Neurosciences and Phenomenology.
Frontiers in Psychology, 11. https://doi.org/10.3389/fpsyg.2020.00446
18. Gonzalez, O. (2020). Psychometric and machine learning approaches for diagnostic assessment
and tests of individual classification. Psychological Methods. https://doi.org/10.1037/met0000317
19. Fardouly, J., Crosby, R. D., & Sukunesan, S. (2022). Potential benefits and limitations of machine
learning in the field of eating disorders: current research and future directions. Journal of Eating
Disorders, 10(1). https://doi.org/10.1186/s40337-022-00581-2
20. Littlefield, A. K., Cooke, J. T., Bagge, C. L., Glenn, C. R., Kleiman, E. M., Jacobucci, R., Millner,
A. J., & Steinley, D. (2021). Machine Learning to Classify Suicidal Thoughts and Behaviors:
Implementation Within the Common Data Elements Used by the Military Suicide Research
Consortium. Clinical Psychological Science, 9(3), 467–481.
21. Chollet, F. (2018). Deep Learning with Python. Manning.