OSEMN PROCESS FOR WORKING OVER DATA ACQUIRED BY IOT DEVICES MOUNTED IN BEEHIVES

Current Trends in Natural Sciences Vol. 7, Issue 13, pp. 47-53, 2018
Current Trends in Natural Sciences (on-line): ISSN 2284-953X, ISSN-L 2284-9521
Current Trends in Natural Sciences (CD-Rom): ISSN 2284-9521, ISSN-L 2284-9521
http://www.natsci.upit.ro
*Corresponding author, E-mail address: k.dineva@iit.bas.bg
Kristina Dineva¹*, Tatiana Atanasova¹
¹ Institute of Information and Communication Technologies - Bulgarian Academy of Sciences
Acad. G. Bonchev St., Bl. 2, Sofia, 1113 Bulgaria
Abstract
Obtaining, cleaning, exploring, modelling, and interpreting collected IoT data is an important issue and
a serious challenge for many researchers. The introduction of a standardized model of work, OSEMN, organizes the
process of solving these problems. Beekeeping is a sub-sector of agriculture, and it needs a unified process for working
with data obtained from sensors located in beehives. After proper data processing, significant knowledge
about the behaviour of individual bee colonies is gained, helping to identify correlations between different events
and the causes that provoke them. The purpose of this article is to describe the OSEMN model and its integration into
beekeeping.
Keywords: Beekeeping, internet of things, OSEMN.
1. INTRODUCTION
Technology is increasingly being used in the environmental care of the planet, for
example in agriculture and in protecting endangered species such as bees. Beekeeping is one
of the agricultural sub-sectors where new technologies, models and processes can be
successfully adapted and implemented. Their use in beekeeping improves knowledge about
bees and their various conditions through measured parameters.
To obtain up-to-date data on bee colonies, Internet of Things (IoT) sensor devices are built to
collect data on parameters inside and outside the hive over a certain time interval. The data
obtained from each beehive need to go through a number of processing steps before they are
ready for extracting knowledge from which the future states of the bee colonies can be predicted.
Integrating different methods, technologies, and processes allows correct and accurate
organization of work with the data gathered from the beehives. Good data organization and
adherence to standardized processes such as OSEMN (Mason and Wiggins, 2010) increase the
probability of accurate analyses and make it easy to go back to a particular step of the
data processing procedure. The information obtained after processing the extracted data enables
beekeepers to be informed in a timely manner about deviations in the beehive parameters and about
the probability of a specific event related to bee behaviour. By integrating new
technologies into beekeeping, beekeepers can make the industry more efficient by
reducing costs, increasing existing sources of income and integrating new ones.
2. MAJOR PROBLEMS
Bees play the biggest role in pollinating fruits, vegetables, flowers and farm crops such as alfalfa,
which is used to feed many farm animals. More than one third of the world's crops depend on bees
for pollination. A bee colony can be considered a superorganism consisting of 40 to 50
thousand individual bees. Bees are a very important part of the Earth's ecosystem.
There has been a 300% increase globally in crops that need bees for pollination. At the same time,
the mortality rate of bee colonies has been increasing on a global scale for three consecutive years. The
losses reach 80% in some places. Many factors, independently of each other, lead to the death of
bee colonies.
Researchers in the field highlight three major reasons for the increasing mortality (Figure 1).
Figure 1. Major problems of bee families in the world. Source: Syngenta, 2016 [adapted]
Unregulated pesticide spraying of crops located near beehives during the pollination period is one
of the major problems (Syngenta, 2016). According to Greenpeace International, more than two-thirds
of the pollen that bees collect and carry to their hives is contaminated with a cocktail of 17 different
toxic pesticides. The chemicals found in pollen come from insecticides, acaricides, fungicides and
herbicides. After the pollen is brought to the hive, bees that do not go out to collect nectar and
pollen can also be poisoned. Bees can die in the field from fast-acting pesticides, or in or near the
hive from slow-acting ones. The poisonous substance can be picked up from treated plants that bees
visit, or it can be carried by the wind from treated crops that do not attract bees onto crops in
bloom, flowering weeds or other honey plants (Krupke et al., 2012; Allsopp et al.,
2014). This problem is not limited to Europe. It is a global problem, for which countries such as
England, France and Germany have taken temporary measures that are subject to monitoring and
analysis (Neumann and Carreck, 2015).
Another major problem directly related to the death of bees is the Varroa mite. Varroa mites are
external parasites of honeybees and can be seen with the naked eye. The mite is found inside the
honeycomb cells and on the bees' bodies. Either way, it feeds on the
hemolymph (blood) of the bees through the body wall. In cases of severe infestation, affected bees
develop defects and do not grow into normal adults. Bees
that show no defects are still weakened and have shorter lives. The Varroa mite lives on the
body of the bee, but its development takes place inside capped cells, especially capped drone
(male) cells. The Varroa mite is now a major concern for the beekeeping industry worldwide
(Mohammadreza et al., 2015). The parasite is very prevalent in Europe and North
America and has led to catastrophic losses since 2006 (Le Conte et al., 2010).
Besides these two factors, there is a third one that most strongly affects the health and
productivity of bee colonies: the beekeeper's competence and skills, as well as the time and
resources invested in raising the colonies. Regular inspection of each beehive is of
particular importance. Hive inspections are often one of the most complex tasks, even for
professional beekeepers.
Studies show that more than half of the world's beehives are located outside urban areas. This
makes it difficult to carry out the required number of inspections of each beehive, for
reasons such as bad weather, poor infrastructure or increasing transportation
costs, which in turn reduce profitability. On the other hand, carrying out inspections has
an adverse effect on the overall condition of the bee colony. Studies have shown that after an
inspection bees need three to four days to stabilize and restore the ideal temperature and humidity
in the hive (Verboven et al., 2014).
The rapid development of new technologies and the ease of integrating them into beekeeping enable
remote monitoring of both the internal condition of hives and the conditions outside them. Deploying
IoT devices in each individual hive allows sensors to collect large volumes of data about
the hive's parameters; analyzing these data makes it possible to detect and classify the
reasons for the increasing bee mortality. However, working with such large data volumes is
often a very hard process (Balabanov et al., 2016). With standardized workflow processes such
as OSEMN, it is easier to apply accurate models, and the results of analyzing large data
volumes can predict deviations in the behaviour of the bee colony at a much earlier stage,
allowing beekeepers to take the right measures in time.
3. OSEMN WORKFLOW
The OSEMN process is a standardized and widely accepted model for organizing research in the
field of Data Science. It addresses Data Science/Analytics problems
(Byrne, 2017) at a large scale.
The process of retrieving and manipulating beehive data needs to be organized, and the data need to be
well prepared and pre-processed. The OSEMN process provides a clear order of activities: Obtain, Scrub,
Explore, Model and iNterpret the data (Figure 2). By following these steps, the entire
process can be well planned and organized, from data acquisition to visualization of the analysis
results in a specially developed software platform for honey beehive monitoring such as
www.smartbeehives.eu (Dineva and Atanasova, 2017a).
The workflow consists of several logically consecutive
activities through which the original goals are achieved.
Obtain data
Data are collected from sensors located inside and outside the hives. The sensors inside the
hives collect data about temperature, humidity, weight, noise levels, and more. These data are used
to monitor the condition of the bee colonies. External sensors are situated at different locations in
the apiary and collect environmental information (temperature, humidity and CO2) that gives a clear
and accurate idea of the local weather, air pollution, and so on.
Figure 2. OSEMN workflow
Sensors are grouped according to the specific data they collect and the user's needs. A group of
sensors is connected to a common microcontroller, thus forming a node. Each node collects specific
data types, which allows different nodes to operate simultaneously, with a virtually unlimited number
of nodes. Depending on the system load and the size of the apiary, several types of microcontroller
boards can be used: Arduino, MSP430 LaunchPad, Nanode, Pinguino PIC32, STM32 Discovery and others.
The most popular is Arduino because it offers the necessary quality at a very good price. The honeybee
monitoring system (Dineva and Atanasova, 2017b) is designed so that it can easily integrate different
types of microcontrollers, making it extremely flexible and adaptable to the different use cases that
arise according to the needs of the end customer.
It is desirable for the programming language to be a scripting language that helps automate data
extraction and allows asynchronous handling of the received data. Several programming languages
meet these requirements, and Python is a prominent choice among them.
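The asynchronous handling of node data mentioned above can be illustrated with a minimal sketch. It is not the monitoring system's actual code: the node names, the `Reading` structure, the `read_node` and `collect` helpers and the randomly generated values are all assumptions made for illustration, and real nodes would be read over a serial or network link rather than simulated. Only the Python standard library is used.

```python
import asyncio
import random
from dataclasses import dataclass


@dataclass
class Reading:
    node_id: str     # hypothetical node identifier
    parameter: str   # e.g. "temperature"
    value: float


async def read_node(node_id, parameters, delay):
    """Simulate one node (microcontroller plus sensor group) returning its readings."""
    await asyncio.sleep(delay)        # stands in for serial / network latency
    rng = random.Random(node_id)      # fake values, repeatable per node
    return [Reading(node_id, p, round(rng.uniform(0.0, 100.0), 2)) for p in parameters]


async def collect(nodes):
    """Poll all nodes concurrently and flatten their readings into one list."""
    tasks = [read_node(node_id, params, delay=0.01) for node_id, params in nodes.items()]
    per_node = await asyncio.gather(*tasks)   # results keep the order of `tasks`
    return [reading for readings in per_node for reading in readings]


# Hypothetical apiary layout: one node inside a hive, one outside in the apiary.
NODES = {
    "hive-1-inside": ["temperature", "humidity", "weight"],
    "apiary-outside": ["temperature", "humidity", "co2"],
}

readings = asyncio.run(collect(NODES))
for r in readings:
    print(r.node_id, r.parameter, r.value)
```

Because each node is awaited independently, a slow node does not block the others, which matches the idea of nodes operating simultaneously.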
Scrub data
Before the data obtained from the honey bee monitoring system are processed and analyzed, several
actions need to be carried out: merging the individual data columns into a single table, clearing the
data of invalid values, normalizing the data and processing the extreme values (Vander, 2016).
- Merging the individual data columns into a single table - the intelligent
beehive monitoring system collects data for different parameters (temperature, humidity, weight, sound,
etc.), which then need to be merged into a single common table where the parameters
remain as column variables.
- Clearing data of invalid values - real-world data often contain null, NaN or NA values
for a variety of reasons. As a result, the analysis may be incorrect
and the data models may be built the wrong way, because most predictive
algorithms cannot cope well with missing or invalid data. The most common approaches to
the missing data problem are to replace the missing values with average values or to delete them.
If the data array is sorted, the missing data can be replaced with the next closest value in
ascending or descending order. For the purposes of the bee-monitoring project, the data sequence is
important and array sorting is not performed, so missing values are not replaced.
- Data normalization - the data obtained from the monitoring system need to be normalized
because of their different types and ranges. The type of normalization used is min-max. Min-max
normalization performs a linear transformation of the initial data (Kantardzic, 2011), where $\min_a$
and $\max_a$ are the minimum and maximum values of the attribute $a$. Min-max normalization maps
a value $v$ in the range $[\min_a, \max_a]$ into a particular range
$[\mathrm{new\_min}_a, \mathrm{new\_max}_a]$ by computing:
$$v' = \frac{v - \min_a}{\max_a - \min_a}\,(\mathrm{new\_max}_a - \mathrm{new\_min}_a) + \mathrm{new\_min}_a \qquad (1)$$
- Extreme value processing - a RANSAC (RANdom SAmple Consensus) algorithm is used
to determine the extremes in a dataset. This algorithm provides a statistical estimate of the
probability of obtaining reliable forecasts, i.e. forecasts within a predetermined number of
standard deviations from the true values. The algorithm can also be interpreted as a method of
detecting emergency situations. It is a non-deterministic algorithm: it produces a reasonable result
only with a certain probability, and this probability increases as more iterations are
allowed. The algorithm produces and validates a linear QSAR (Quantitative Structure–Activity
Relationships) model based on the Minimum Least Square (LMS) method (Kaspi et al., 2017) by:
  - filtering noisy samples (i.e., outliers);
  - selecting the best features (i.e., descriptors);
  - deriving a QSAR model from training set samples;
  - predicting the activity of test set samples while invoking the concept of applicability
    domain.
All of this is done in a single process, without the need for complementary processes.
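The four scrubbing steps listed above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's implementation: the sample values, the helper names (`merge_columns`, `drop_invalid`, `min_max`, `ransac_line`) and the 0.5 kg threshold are invented, rows with missing values are simply deleted, and RANSAC is shown in its generic form (fitting a straight line to a time series) rather than as the QSAR variant described by Kaspi et al. Only the standard library is used; pandas (`merge`, `DataFrame.dropna`) and scikit-learn (`MinMaxScaler`, `RANSACRegressor`) offer comparable functionality.

```python
import random

# Hypothetical per-parameter readings keyed by timestamp (hour index).
temperature = {0: 34.8, 1: 35.0, 2: 35.1, 3: None, 4: 35.2, 5: 35.4, 6: 35.5, 7: 35.6}
weight = {0: 40.0, 1: 40.2, 2: 40.4, 3: 40.6, 4: 40.8, 5: 47.0, 6: 41.2, 7: 41.4}


def merge_columns(**columns):
    """Merge per-parameter series into one table: one row per timestamp."""
    timestamps = sorted(set().union(*(series.keys() for series in columns.values())))
    return [
        {"t": t, **{name: series.get(t) for name, series in columns.items()}}
        for t in timestamps
    ]


def drop_invalid(rows):
    """Delete rows holding None or NaN; the original row order is preserved."""
    return [row for row in rows if all(v is not None and v == v for v in row.values())]


def min_max(values, new_min=0.0, new_max=1.0):
    """Equation (1): linear map of [min_a, max_a] onto [new_min_a, new_max_a]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def ransac_line(xs, ys, threshold, iterations=200, seed=0):
    """Minimal RANSAC for a line y = a*x + b; returns (a, b, inlier_mask)."""
    rng = random.Random(seed)
    best = (0.0, 0.0, [False] * len(xs))
    for _ in range(iterations):
        i, j = rng.sample(range(len(xs)), 2)      # minimal sample: two points
        if xs[i] == xs[j]:
            continue
        a = (ys[j] - ys[i]) / (xs[j] - xs[i])
        b = ys[i] - a * xs[i]
        mask = [abs(y - (a * x + b)) <= threshold for x, y in zip(xs, ys)]
        if sum(mask) > sum(best[2]):              # keep the largest consensus set
            best = (a, b, mask)
    return best


table = drop_invalid(merge_columns(temperature=temperature, weight=weight))
ts = [row["t"] for row in table]
weights = [row["weight"] for row in table]
weight_scaled = min_max(weights)
_, _, inliers = ransac_line(ts, weights, threshold=0.5)
outlier_times = [t for t, ok in zip(ts, inliers) if not ok]
print(outlier_times)
```

Points outside the consensus set are the candidate extreme values; whether they are sensor faults or real events in the hive is a question for the later steps.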
Explore data (EDA)
Finding, structuring, and enriching are operations that are extremely useful for exploring the
gathered data. Observing the raw data set helps in choosing the best approach for conducting
analytical research. It allows the discovery and understanding of unique data elements, such as
extreme or unusual values. This approach is used to generalize the data obtained from beehives
and to summarize their main characteristics. The main objectives of applying this approach to the
sensor readings are:
- creating hypotheses about the causes of observed phenomena;
- assisting in the selection of appropriate statistical tools and techniques;
- building a basis for future data collection through surveys and experiments;
- identifying relationships between variables.
The perspective of exploratory data analysis (EDA) is described in a simple formula:
Data = Smooth + Rough (2)
This means that data should be divided into two parts. The first part is called “smooth” and refers to
the patterns that can be extracted from raw data using different techniques. EDA techniques focus on
extracting the “smooth” from each set of data. The smooth, whatever it is, comes from the data and does
not stem from our expectations or assumptions about the data. This means that the first step in
the EDA process is to extract the smooth from the data. A variable may have more than one pattern, or
smooth. Retrieving the smooth from raw data may require more than one pass through the data, because
the data may contain more than one pattern.
The second part of the formula, “rough”, is the residual that has no pattern at all.
Residuals are what is left after all patterns are extracted from a dataset. However, it is very
important to look closely at the “rough”, because this set of values may contain additional patterns
that need to be considered (Borcard et al., 2011; Waltenburg and McLauchlan, 2012).
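Formula (2) can be made concrete with a short sketch. The smoother chosen here, a running median of three points, is only one of many possible techniques and is an assumption of this example, as are the temperature values and the `running_median` helper.

```python
from statistics import median


def running_median(values, window=3):
    """Smooth a series with a centred running median (the window shrinks at the ends)."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical hourly brood temperatures (degrees C) with one spike at index 3.
data = [35.0, 35.1, 35.0, 38.9, 35.2, 35.1, 35.3]

smooth = running_median(data)                       # the pattern extracted from the data
rough = [d - s for d, s in zip(data, smooth)]       # what is left: Data - Smooth

# The largest residual points at the reading that deserves a closer look.
suspect = max(range(len(rough)), key=lambda i: abs(rough[i]))
print(suspect, round(rough[suspect], 2))
```

Running the smoother again on `rough` would correspond to the second pass mentioned above, in case the residuals still contain a pattern.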
Model data
In the machine learning paradigm, a model is a mathematical expression of the model
parameters together with placeholders for the inputs; its output is a prediction, a class or an
action for the regression, classification, and reinforcement learning categories, respectively.
Modelling in the bee monitoring system is used for prediction, hence it requires good theoretical and
mathematical knowledge. Models range from classical logistic regression to more complex ones, such as
a state machine or a random forest, depending on whether the aim is classification, prediction or
establishing trends.
The generated model receives inputs with a predefined structure (prepared, cleaned, normalized, etc.).
The Python module scikit-learn (Hackeling, 2014) was used to easily create and use models in the
bee-monitoring project.
Scikit-learn is used for data modelling in the beehive monitoring system as a module that integrates
classical machine learning algorithms with several scientific Python packages:
- NumPy – base n-dimensional array package;
- SciPy – fundamental library for scientific computing;
- Matplotlib – 2D and 3D plotting;
- Seaborn – visualization of statistical models;
- IPython – enhanced interactive console;
- SymPy – symbolic mathematics;
- Pandas – data structures and data analysis.
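To show what the simplest model in this range looks like, the sketch below fits a logistic regression by batch gradient descent using only the standard library. It is a stand-in for illustration, not the project's code: in a scikit-learn workflow the same role would be played by `sklearn.linear_model.LogisticRegression` (or `sklearn.ensemble.RandomForestClassifier` for a random forest). The features, labels, learning rate and number of epochs are all invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, x):
    """Return class 1 when the estimated probability is at least 0.5."""
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)


# Hypothetical, already min-max-normalized inputs: [hive temperature, weight change].
# Label 1 marks a made-up "deviation" event, label 0 a normal state.
X = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.7, 0.3], [0.8, 0.2], [0.9, 0.1]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
print(predictions)
```

The inputs are expected in the predefined structure described above, i.e. already cleaned and normalized, which is why the example values lie in the range [0, 1].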
Interpret the data
The last and perhaps the most important step in the OSEMN model is the interpretation of the data.
The interpretation phase must answer, completely or partially, the questions that motivated the data
modelling. This is the stage at which everything learned from the
collected and processed data from the beehive monitoring system is used.
This step includes:
- drawing conclusions from the data;
- evaluation of results;
- dissemination / reporting of results.
Interpretation of research results is important for understanding the effectiveness of the study. The
results must be described clearly, in a way that allows other researchers to compare them with their
own results. Proper understanding of the methodology and survey statistics is necessary for
correct interpretation. The results are analyzed using appropriate statistical methods to determine
the probability that they are not due to chance and can be reproduced in larger studies.
The results should be interpreted objectively and critically before assessing the implications
and drawing conclusions.
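One way to estimate how likely a result is to be due to chance is a permutation test, sketched below. The choice of test is an assumption of this example (the paper does not name a specific statistical method), and the weight-gain values, the `permutation_p_value` helper and the number of permutations are invented for illustration.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means between two samples."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                       # random relabelling of the readings
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so the estimate is never zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical daily weight gains (kg) of hives without and with a detected deviation.
normal = [0.42, 0.38, 0.45, 0.40, 0.44, 0.39, 0.41, 0.43]
flagged = [0.21, 0.25, 0.19, 0.28, 0.22, 0.24, 0.20, 0.26]

p_value = permutation_p_value(normal, flagged)
print(p_value)
```

A small p-value suggests that the difference between the two groups is unlikely to arise from random labelling alone; it does not by itself show that the result will reproduce in a larger study.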
4. CONCLUSIONS
Choosing a standardized data processing workflow involves a combination of expected functionality and
ease of implementation. The orientation towards standardized processes is a philosophy of work
that shifts the focus from activities to results, because activities are considered in their coordinated
aggregation in creating value for the end result. The end result of obtaining, cleaning, exploring,
modelling, and interpreting the IoT data collected from the beehives is to predict events and to find
correlations between the analyzed data and events occurring in the beehives. The easily understood and
logically consistent steps of the data processing workflow (OSEMN), enriched with additional
instructions, notes and sample documents, ensure that different participants perform the activities
and achieve the results in the same way. A higher maturity of
standardization and harmonization of practices at different stages is achieved, and errors due to
insufficient awareness are avoided. One of the most important features of standardized work
processes is that they unequivocally describe not only the sequence of steps but also the
responsibilities. At each stage of the process, it is clear what is expected, who will receive a
request for a particular activity and to whom the outcome should be provided. OSEMN can significantly
facilitate the identification of existing good practices in beekeeping.
Future investigations are directed at achieving transparency at each stage of the process, which will
allow easy detection of errors and a quick step back to a certain stage of the process if needed.
5. REFERENCES
Allsopp, M., Tirado, R., Johnston, P. (2014). Plan bee – living without pesticides, Greenpeace, pp. 56.
Balabanov, T., Zankinski, I., Barova, M. (2016). Strategy for individuals distribution by incident nodes participation in
star topology of distributed evolutionary algorithms. Cybernetics and Information Technologies, 16(1), 80-88.
Borcard, D., Gillet, F., Legendre, P. (2011). Numerical Ecology with R, Springer, pp. 9 – 30.
Byrne, C. (2017). Development workflows for Data Scientists. O’Reilly, pp.28.
Dineva, K., Atanasova, T. (2017a). Computer System Using Internet of Things for Monitoring of Bee Hives, Vienna,
SGEM.GeoConference (Vol. 17, Issue 63, pp. 169-176).
Dineva, K., Atanasova, T. (2017b). Model of Modular IoT-based Bee-Keeping System. ESM'2017, Lisbon, EUROSIS-ETI,
404-406.
Hackeling, G. (2014). Mastering Machine Learning with scikit-learn, PACKT, pp. 238.
Kantardzic, M. (2011). Data Mining: Concepts, Models, Methods, and Algorithms, IEEE, pp. 552.
Kaspi, O., Yosipof, A., Senderowitz, H. (2017). Random Sample Consensus (RANSAC) algorithm for material –
informatics: application to photovoltaic solar cells, Springer, 6, 2 – 15.
Krupke, Christian H., Hunt, Greg J., Eitzer, Brian D. (2012). Multiple Routes of pesticide exposure for honey bees
living near agricultural fields, PLOS.
Le Conte, Y., Ellis, M., Ritter, W. (2010). Varroa mites and honey bee health: can Varroa explain part of the colony
losses?. Apidologie, 41(3), 353-363.
Maadinia, M., Shabestari, B., Mahmoudi, R., Kaboudari, A., Rahimi Pir-Mahalleh, S. F. (2015). Evaluation of Varroa
Mites in the Apiaries from Iran. International Journal of Food Nutrition and Safety, 6(2), 74-81.
Mason, H., Wiggins, C. H. (2010). A Taxonomy of Data Science. Retrieved November 2017, from
http://www.dataists.com/2010/09/a-taxonomy-of-data-science.
Neumann, P., Carreck, N. L. (2015). Honey bee colony losses, Journal of Apicultural Research.
Syngenta, (2016). Financial Report 2016, Retrieved February 2017 from
https://www.syngenta.com/~/media/Files/S/Syngenta/ar-2016/syngenta-financial-report-2016.pdf
Vander, J. (2016). Python data science handbook, O’Reilly, pp.541.
Verboven, H. A., Uyttenbroeck, R., Brys, R., Hermy, M. (2014). Different responses of bees and hoverflies to land use
in an urban–rural gradient show the importance of the nature of the rural land use. Landscape and Urban
Planning, 126, 31-41.
Waltenburg, E., McLauchlan, W. (2012). Exploratory Data Analysis: A primer for undergraduates, Purdue e-Pubs,
pp.81.
... In addition, considering that, with respect to business resources and external factors, the questionnaire used in the innovation survey included several questions about cooperation, sources of information, human resources and R&D, a structure underlying the variables referring to these groups was hypothesized. Once the sample was obtained, a preliminary exploration of the data was performed, to detect extreme values, invalid values and other anomalies [71,72]. Firms with at least 20% blank responses were discarded from the analysis, due to the dependency structure that was hypothesized in the data [73]; thus, the sample for Ecuador was composed of 472 firms, and that of Peru, of 691 companies. ...
Article
Full-text available
: The purpose of this study was to analyze the current state and dynamics of the innovative behavior of medium and large manufacturing firms in Peru and Ecuador. It has been shown that the factors that enhance or enable the possibilities of innovation in organizations can be internal or external. This study took a quantitative approach, and regression models were applied to samples composed of firms. The relationships between external factors and business resources following the implementation of innovation were analyzed, as was the impact that these factors had on sales performance, considering the effect of the size and age of the firms. The innovations most implemented in firms in Ecuador were processes, and in Peru, organizational innovations were predominant. There were no external factors or business resources statistically related to these types of innovation for each country. For Peruvian firms, the age of the firm presented an inverse relationship to its performance. The study confirms the results of other studies conducted in Peru, and for Ecuador, these findings represent one of the first contributions on this topic. This study contributes to the discussion of the effects, in emerging Latin American countries, of a firm’s age on its ability to innovate.
... Untuk musiman periodenya sangat banyak di runtun waktu. Misalnya, 5 hari kerja dalam seminggu dapat menghasilkan efek pada rangkaian waktu yang diulangi setiap minggu, sedangkan jadwal liburan dan liburan sekolah dapat menghasilkan efek yang diulangi setiap tahun (Taylor & Letham, 2017 (Dineva & Atanasova, 2018). Ini adalah fase terpenting namun tidak teknis karena berkaitan dengan memahami data dengan memahami cara menyederhanakan dan meringkas hasil dari semua model yang telah dibangun. ...
... Interpretasi data merupakan tahap terakhir dan tahap paling penting di model OSEMN (Lau, 2019). Interpretasi hasil penelitian penting untuk memahami efektivitas penelitian dan mendeskripsikan secara jelas hasil perbandingan penelitian sendiri dengan penelitian lain (Dineva & Atanasova, 2018). ...
... The spectrum of application of such solutions is very wide. Since not only in cryptography, but also in other technology areas [6][7][8] good PRNG solutions are needed. Such solutions would therefore also help in the creation and refinement of the PRNG, as well as for more precise determination of the spectrum of tasks that the generator can perform well. ...
Chapter
Full-text available
A random number generator (RNG) and cryptographic algorithm is the base of each encryption system. For this, it can be considered that, no matter how complex cryptographic algorithms are applied, they become as vulnerable as the random number generator which is at the root of this system. The theme of RNG and PRNG deserves particular attention, especially because cryptographic protection in information systems relies on it. It can also be said that this is also the basis for cyber security as a whole. For this, it is extremely important to be sure of the quality of a system or mechanism for generating random numbers. In our study, as a means of assessing reliability, we rely on the mathematics of time series. The results of the proposed method are discussed and possibilities for ensuring cryptographic protection in information systems are shown.
... Today, thanks to the sensor networks and Internet of Things (IoT) paradigms, beekeepers and researchers can remotely monitor bee colonies [Meikle and Holst 2015, Kridi et al. 2016, Zogovic et al. 2017]. Remote monitoring via wireless sensors is one of the most important characteristics of the precision beekeeping [Zacepins et al. 2015] which basically involves beehives data collection, data analysis and support decision making in an apiary management context [Dineva andAtanasova 2018, Braga et al. 2019]. Once the sensors are installed in the hives, the apiary can be monitored without disturbance, even during periods when invasive inspections of the hives are contraindicated, such as during the winter [Meikle et al. 2017]. ...
Conference Paper
Honey bees, important pollinators, are threatened by a variety of pests, pathogens and extreme climatic events, such as the winter period. This paper proposes a two-stages model that seeks to define and predict evolutionary scenarios for improving the bee colonies’ well-being. The used dataset has data from both internal and external beehive sensors, and on-site inspection of beekeepers from six apiaries between the years 2016-2018. In the first stage, three evolutionary scenarios were obtained (pessimistic, conservative and optimistic) through the clustering technique. In the second one, aiming to classify these scenarios, an elastic net penalty logistic regression model was obtained with an accuracy of ~99.5%.
... In this work, we have employed a data science process that closely derives from the OSEMN framework as followed in [15] in order to beget the best possible analysis and results from the data. The OSEMN framework is a popular data science process that follows the step-wise methodology of Obtain, Scrub, Explore, Model and Interpret. ...
... It is through visual inspection that the beekeeper can detect a number of problems, including diseases (Spivak and Reuter, 2001;vanEngelsdorp et al., 2013;Mumbi et al., 2014). However, this kind of verification is time consuming and requires beekeeping skill and knowledge that can take years to acquire (Dineva and Atanasova, 2018b). If not done properly, inspections can disrupt the microclimate balance inside the hive and the work of the nest bees (the workers responsible for the internal work of the nest). ...
Article
Full-text available
The use of information and communication technologies (ICT) in agriculture is far from their potential. In this article, we consider how to facilitate and systematize the process of transforming traditional agriculture into digital agriculture; Agriculture 4.0. Among the different technologies, we focus on the IoT aspects. In the article, we propose a new approach for the design of intelligent agricultural management and supervision systems. The proposed approach is illustrated as an example of application in the beekeeping sector. Indeed, this sector is affected by a crisis due to the disappearance of bees and the different actors need support to make their decisions. As an example of decisions that can be made, we can cite: treatment planning or policy planning. An architecture based on sensors and open data is proposed to help them make decisions. An implementation of it is shown; it is based on a device with sensors, as well as an interface to collect the data on beehives and show notifications and alerts to beekeepers. The proposed architecture is flexible, and it can be used in the context of different levels of technology maturity. The final objective is to develop a reusable architecture for Agriculture 4.0.
Chapter
Pandemic situations require analysis, rapid decision making by managers and constant monitoring of the effectiveness of collective health-related approaches. This work can be made more efficient with clearer and more representative views of the data, as well as by applying other epidemiological measures and projections to the information. However, performing such aggregations of data can become a major challenge in contexts with little or no integration between databases, or when there is no technological core mature enough to feed and integrate technological advances into the workflow of health professionals. This paper presents the results of combining project approaches such as the OSEMN framework, a software architecture based on microservices, and data science technologies. These tools are aligned so that the descriptive and predictive analysis of epidemic data, still dominated by manual processes, can evolve towards automation, reliability and the application of machine learning, with the aim of organizing the structured data and adding value to the results. The project was validated on documents describing the Covid-19 pandemic situation in the region of the city of Brasília, Federal District, capital of Brazil.
Article
Bees are the main pollinators of most wild plant species and insect-pollinated crops and are essential for both the maintenance of plant ecosystems and human food production. Among the crops used for human feeding, 75% depend on pollination. Uncertainty around the beekeeping activity could jeopardize the economic value of pollination; moreover, data on honey bee colony losses exist but have not been thoroughly and systematically analyzed to identify potential causal factors. Recognition of seasonal honey bee data patterns can be useful for a number of purposes, such as swarming observations and forecasting colony absconding, especially for those hives where resources are scarce. Here we propose a method to identify honey bee seasonal patterns. The main aim of this research in identifying these patterns is to assist beekeepers in the management and maintenance of their hives and, additionally, to show that with machine learning, and in particular unsupervised learning, it is possible to detect seasonal honey bee patterns. We applied a clustering technique to two real datasets from HiveTool.net covering brood temperature, relative humidity, and beehive weight. Using a clustering validation index and the k-means algorithm, we found 6 coherent patterns related to seasons. From the patterns found, we compared three well-known classification algorithms (Naive Bayes, k-NN, and Random Forest) to propose a high-accuracy classification model (hit rates up to 99.67%) that suggests seasonal honey bee patterns for remote monitoring computing systems.
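To make the clustering step concrete, the following is a minimal k-means sketch over (brood temperature, relative humidity, hive weight) triples. It is not the authors' implementation: the six readings are invented, k is fixed at 2 instead of being chosen with a validation index, the min-max scaling is an added assumption, and the centroids are seeded from the first k points purely to keep the example deterministic.

```python
import math

# Hypothetical readings: (brood temperature °C, relative humidity %, hive weight kg).
READINGS = [
    (35.0, 60.0, 45.0), (25.0, 75.0, 30.0),
    (34.8, 62.0, 46.0), (24.0, 78.0, 29.0),
    (35.2, 58.0, 44.0), (26.0, 74.0, 31.0),
]

def min_max_scale(points):
    """Rescale every feature to [0, 1] so no single unit dominates the distance."""
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(point, lows, highs))
        for point in points
    ]

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

labels, centroids = kmeans(min_max_scale(READINGS), k=2)
```

The resulting `labels` list assigns each reading to one of the two groups; in a real study the number of clusters would be selected with a validation index, as the abstract describes, and the labels could then serve as training targets for a classifier.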
Conference Paper
The most common and popular approaches for solving problems with bee populations using Internet of Things (IoT) tools are analyzed. These solutions can be improved by using new types of hardware infrastructure components and the modular architecture proposed in the paper. The proposed solution is aimed at small bee producers, with an emphasis on economic and practical benefits.
Article
An important aspect of chemoinformatics and material-informatics is the usage of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning datasets from noise. RANSAC could be used as a “one stop shop” algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development and predictions for test set samples using applicability domain. For “future” predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate for the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
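For readers unfamiliar with RANSAC, its core loop (fit a model to a small random sample, count the points that agree with it, keep the best consensus) can be sketched on a toy line-fitting problem. This is the generic algorithm only, not the QSAR workflow of the abstract above; the data, the threshold, the trial count and the function names are all assumptions chosen for the illustration.

```python
import random

# Hypothetical data: eight points on y = 2x + 1 plus two gross outliers.
POINTS = [(x, 2 * x + 1) for x in range(8)] + [(2, 20), (5, -10)]

def fit_line(p, q):
    """Slope and intercept of the line through two points with distinct x."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

def ransac_line(points, threshold=0.5, trials=100, seed=0):
    """Return the line with the largest consensus set, and that set."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_model, best_inliers = None, []
    for _ in range(trials):
        p, q = rng.sample(points, 2)
        if p[0] == q[0]:
            continue  # vertical pair: not expressible as y = slope * x + intercept
        slope, intercept = fit_line(p, q)
        # Consensus set: points whose vertical residual is within the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers

model, inliers = ransac_line(POINTS)
```

Points left outside the returned consensus set are the ones treated as outliers, which is the outlier-removal role the abstract attributes to RANSAC.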
Article
One of the strongest advantages of Distributed Evolutionary Algorithms (DEAs) is that they can be implemented in a distributed environment of heterogeneous computing nodes. Usually such computing nodes differ in hardware and operating systems. Distributed systems are limited by network latency. Some Evolutionary Algorithms (EAs) are well suited to distributed computing, because of their high level of parallelism and relatively low network communication demands. One of the most widely used topologies for distributed computing is the star topology. In a star topology there is a central node with the global EA population and many remote computation nodes, each working on a local population (usually a sub-population of the global population). This model of distributed computing is also known as the island model. What is common to DEAs is an operation called migration that transfers some individuals between local populations. In this paper, the term 'distribution' will be used instead of the term 'migration', because it is more accurate for the model proposed. This research proposes a strategy for the distribution of EA individuals in a star topology based on incident node participation (INP). Solving the Rubik's cube with a Genetic Algorithm (GA) is used as a benchmark. It is a combinatorial problem, and experiments are done with a C++ program which uses OpenMPI.
Article
Varroa is one of the most important pests of the honey bee. This mite (Acari: Varroidae) causes major economic losses to the beekeeping industry by feeding on the haemolymph of European honey bee (Apis mellifera L.) colonies during different developmental stages and by transmitting viruses. In the past few years, the use of chemical pesticides to control the varroa mite has led to resistance in the parasite and contamination of hive products. This study was conducted on 30 apiaries, with 10 hives from each apiary. Apiaries were chosen from 4 different regions of Qazvin province and were assessed in three phases, from March 2014 to September 2014. Samples were collected randomly and evaluated by observation of Varroa infestation. As anticipated, the highest and lowest rates of infestation were in the warm season and the cold season, respectively. The results confirmed this hypothesis. Infestation depends on the use of anti-parasitic treatments and the location of the hives. In brief, the results show that the infestation rate in hives was 23.7% in March, 25.3% in May and 31.3% in August, and that in total 50% of apiaries were not infested in the 3 seasons.
Article
Since 2006, disastrous colony losses have been reported in Europe and North America. The causes of the losses were not readily apparent and have been attributed to overwintering mortalities and to a new phenomenon called Colony Collapse Disorder. Most scientists agree that there is no single explanation for the extensive colony losses but that interactions between different stresses are involved. As the presence of Varroa in each colony places an important pressure on bee health, we here address the question of how Varroa contributes to the recent surge in honey bee colony losses.
Article
Apiculture has been in decline in both Europe and the USA over recent decades, as is shown by the decreasing numbers of managed honey bee (Apis mellifera L.) colonies (Ellis et al., 2010; Potts et al., 2010). It therefore is crucial to make beekeeping a more attractive hobby and a less laborious profession, in order to encourage local apiculture and pollination. Apart from socio-economic factors, which can only be addressed by politicians, sudden losses of honey bee colonies have occurred, and have received considerable public attention. Indeed, in the last few years, the world's press has been full of eye catching but often uninformative headlines proclaiming the dramatic demise of the honey bee, a world pollinator crisis and the spectre of mass human starvation. "Colony Collapse Disorder" (CCD) in the USA has attracted great attention, and scientists there and in Europe are working hard to provide explanations for these extensive colony losses. Colony losses have also occurred elsewhere (Figs 1 and 2), but examination of the historical record shows that such extensive losses are not unusual (vanEngelsdorp and Meixner, 2009). African honey bees and Africanized ones in the Americas survive without Varroa destructor treatment, whilst the mite has not yet been introduced into Australia. This global picture indicates a central role of this particular ectoparasitic mite for colony losses.
Article
Populations of honey bees and other pollinators have declined worldwide in recent years. A variety of stressors have been implicated as potential causes, including agricultural pesticides. Neonicotinoid insecticides, which are widely used and highly toxic to honey bees, have been found in previous analyses of honey bee pollen and comb material. However, the routes of exposure have remained largely undefined. We used LC/MS-MS to analyze samples of honey bees, pollen stored in the hive and several potential exposure routes associated with plantings of neonicotinoid treated maize. Our results demonstrate that bees are exposed to these compounds and several other agricultural pesticides in several ways throughout the foraging period. During spring, extremely high levels of clothianidin and thiamethoxam were found in planter exhaust material produced during the planting of treated maize seed. We also found neonicotinoids in the soil of each field we sampled, including unplanted fields. Plants visited by foraging bees (dandelions) growing near these fields were found to contain neonicotinoids as well. This indicates deposition of neonicotinoids on the flowers, uptake by the root system, or both. Dead bees collected near hive entrances during the spring sampling period were found to contain clothianidin as well, although whether exposure was oral (consuming pollen) or by contact (soil/planter dust) is unclear. We also detected the insecticide clothianidin in pollen collected by bees and stored in the hive. When maize plants in our field reached anthesis, maize pollen from treated seed was found to contain clothianidin and other pesticides; and honey bees in our study readily collected maize pollen. These findings clarify some of the mechanisms by which honey bees may be exposed to agricultural pesticides throughout the growing season. These results have implications for a wide range of large-scale annual cropping systems that utilize neonicotinoid seed treatments.
Article
Most studies focusing on the effects of urban land use on pollinators have compared urban sites with one type of rural site. However, there is a lot of variation in the amount of natural habitats or intensive agriculture in rural areas. The position of urban areas within that continuum in terms of pollinator communities remains unclear. In this work, we studied bee and hoverfly communities (abundance, diversity, and species composition) in three site types along two river systems crossing urban areas, rural areas dominated by agriculture (termed rural-agricultural) and rural areas with high amounts of semi-natural land use (termed rural-natural). Pollinators were caught in August 2011. Abundance and diversity were highest in rural-natural sites for both taxonomic groups. Our data also indicate that hoverflies and bees responded differently to the surrounding land use, with bee abundance and diversity only significantly reduced in rural-agricultural sites but not in urban sites, and hoverfly abundance and diversity only significantly reduced in urban sites but not in rural-agricultural sites. The observed differences in the response of pollinators point out the importance of incorporating different types of rural land use and clearly defining the rural end of an urban–rural gradient in getting a clear view on how urban land use affects a specific pollinator group. Year-round sampling of these pollinators would, however, probably enable a more accurate view on these responses.
Article
These are the first four chapters of a very basic description of Exploratory Data Analysis techniques. They cover univariate and bivariate techniques, with a chapter on each set of techniques and a chapter applying those techniques to various sets of empirical data. These latter chapters illustrate the value of these techniques for understanding data.
Borcard, D., Gillet, F., Legendre, P. (2011). Numerical Ecology with R, Springer, pp. 9-30.