All content in this area was uploaded by Danny Kowerko on Jul 10, 2017

A web-based application for data visualisation and non-linear regression analysis including error calculation for laboratory classes in natural and life sciences

Titus Keller and Danny Kowerko

Chemnitz University of Technology,

Endowed Professorship Media Computing,

D-09111 Chemnitz, Germany

Email: titus.keller@s2012.tu-chemnitz.de, danny.kowerko@informatik.tu-chemnitz.de

Abstract—In practical laboratory classes students traditionally

receive data by reading from a measurement device (ruler, clock, voltmeter, etc.) or digitally as files in exchange formats such as CSV (comma-separated values). In many cases these data have to be processed later using non-linear regression, here referred to as curve fitting. Therefore, analog data first have to be digitised and imported into a data analysis and visualisation program, which is often commercial and requires installation. In this paper we present an alternative concept that fuses open-source community tools into a single-page web application facilitating data acquisition, visualisation, analysis via non-linear regression and further post-processing usable for error calculations. We demonstrate the e-learning potential of this web application, accessible at curvefit.tu-chemnitz.de, using data as typically acquired in undergraduate physics laboratory classes. A prototype workflow for the topic 'specific electrical resistance determination' is presented along with a technical description of the underlying web technology. Restrictions of traditional software applications, such as limited portability or cumbersome ways to share results electronically between student and supervisor, are overcome by enabling export via URL.

The discussion is complemented by a thorough comparison of curve fitting web applications with a focus on their adaptability to user-specific models (equations) as faced by (undergraduate) students in laboratory classes in natural and life sciences, such as physics, biology and chemistry.

I. INTRODUCTION

The term regression analysis (often also referred to as curve fitting) describes mathematical methods which determine the relationship between the dependent and independent variables of a mathematical model (typically an explicit equation). The most commonly used approaches to this problem are based on the method of least squares [1].

$$y = a \cdot x + b \qquad (1)$$

As an example, consider equation (1) as the model, with y as the dependent and x as the independent variable. The least-squares approach determines the values of a and b which minimise the sum of squared vertical distances (residuals) between the function and the data points of the respective data set [2].
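
As an illustration of the least-squares idea for equation (1), the following minimal Python sketch computes a and b in closed form. The function names and the data values are made up for this example and are not taken from the web application.

```python
def fit_line(xs, ys):
    """Return (a, b) minimising sum((a*x + b - y)**2) over all data points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """The quantity that the least-squares approach minimises."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Made-up data scattered around y = 2x + 1 (illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b = fit_line(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, SSR = {sum_squared_residuals(xs, ys, a, b):.4f}")
```

Any other pair (a, b) yields a larger sum of squared residuals for the same data, which can be checked by calling sum_squared_residuals with slightly changed values.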

Curve fitting algorithms solving this problem can generally be separated into two categories:

1) Linear regression algorithms, which are only applicable to models that are linear combinations of their parameters, but yield a unique solution in closed form.

2) Non-linear regression algorithms (for example the Levenberg-Marquardt algorithm), which are applicable to generic models, but work iteratively, so that the result depends on the chosen start values and is not guaranteed to be the global optimum [3].
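
To make the iterative character of the second category more tangible, the following Python sketch implements a small damped Gauss-Newton iteration in the style of the Levenberg-Marquardt algorithm. It is a simplified illustration under stated assumptions, not the implementation used by the web application: the exponential model, the synthetic data and the start values are all hypothetical.

```python
import math


def model(x, amplitude, rate):
    """Hypothetical non-linear model y = amplitude * exp(-rate * x)."""
    return amplitude * math.exp(-rate * x)


def sum_squared_residuals(xs, ys, amplitude, rate):
    return sum((y - model(x, amplitude, rate)) ** 2 for x, y in zip(xs, ys))


def levenberg_marquardt(xs, ys, amplitude, rate, damping=1e-3, max_iter=200, tol=1e-12):
    """Iteratively refine (amplitude, rate), starting from the given start values."""
    cost = sum_squared_residuals(xs, ys, amplitude, rate)
    for _ in range(max_iter):
        # Accumulate J^T J (symmetric 2x2) and J^T r, where J holds the partial
        # derivatives of the model and r = y - model are the residuals.
        jtj_aa = jtj_ar = jtj_rr = jtr_a = jtr_r = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(-rate * x)
            residual = y - amplitude * e
            d_amp = e                      # d(model)/d(amplitude)
            d_rate = -amplitude * x * e    # d(model)/d(rate)
            jtj_aa += d_amp * d_amp
            jtj_ar += d_amp * d_rate
            jtj_rr += d_rate * d_rate
            jtr_a += d_amp * residual
            jtr_r += d_rate * residual
        # Damped normal equations: (J^T J + damping * diag(J^T J)) * step = J^T r
        m_aa = jtj_aa * (1.0 + damping)
        m_rr = jtj_rr * (1.0 + damping)
        det = m_aa * m_rr - jtj_ar * jtj_ar
        if det == 0.0:
            break
        step_a = (m_rr * jtr_a - jtj_ar * jtr_r) / det
        step_r = (m_aa * jtr_r - jtj_ar * jtr_a) / det
        new_cost = sum_squared_residuals(xs, ys, amplitude + step_a, rate + step_r)
        if new_cost < cost:
            # Accept the step and reduce the damping.
            improvement = cost - new_cost
            amplitude, rate, cost = amplitude + step_a, rate + step_r, new_cost
            damping /= 10.0
            if improvement < tol:
                break
        else:
            # Reject the step and increase the damping (shorter, safer step).
            damping *= 10.0
            if damping > 1e12:
                break
    return amplitude, rate, cost


# Noise-free synthetic data generated from amplitude = 5.0 and rate = 0.7.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0 * math.exp(-0.7 * x) for x in xs]
amplitude_fit, rate_fit, final_cost = levenberg_marquardt(xs, ys, amplitude=1.0, rate=0.1)
print(f"amplitude = {amplitude_fit:.4f}, rate = {rate_fit:.4f}, SSR = {final_cost:.2e}")
```

In contrast to the closed-form linear case, the outcome here depends on the start values passed to the function. Poorly chosen start values can lead to slow convergence or to a different local minimum.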

The generic use case of curve fitting can be described as statistical analysis. Accordingly, it is used in a wide range of disciplines, such as natural, life, human, social and economic sciences or data mining [4], [5], [6], [7]. A concrete example of non-linear regression using equations with six or more parameters is thermal melting curve analysis, a widespread method used in biochemistry to study the stability of DNA (deoxyribonucleic acid) and proteins [4], [8]. Using the web application presented herein, it was recently demonstrated that such complex multi-parameter equations can be fitted to experimental RNA (ribonucleic acid) thermal melting curve data, including the required post-processing calculations derived from the fit parameters [9].

Given this broad variability of use cases, this paper discusses the practicability of the curve fitting web application as an e-learning tool in practical lab classes which are part of basic studies, e.g. in physics. This proof of concept is exemplified using data and equations from a real lab class [10]. Existing open-source or open-access computation methods are merged into a browser-based web application including data import, visualisation, regression analysis and URL-based export of data and results, which can be used to share results quickly and systematically between supervisor and student/user. Using non-proprietary resources makes the application attractive to be offered by computing centers, such as the 'Universitätsrechenzentrum' of the Chemnitz University of Technology, as a university-wide service for students and academic staff.

TABLE I
E-LEARNING RELEVANT CRITERIA AND THEIR DESCRIPTION USED TO COMPARE THE FUNCTIONALITY OF WEB-BASED CURVE FITTING APPLICATIONS.

Criteria              Description
1. Non-commercial     Is the application non-commercial?
2. Help               Is further information about the use of the software available?
3. Input options      Can the user define explicit equations and choose between regression algorithms?
4. Graphic output     Can the result and the data set be plotted as a graph?
5. Export             Are there options to export the results, for example as PDF?
6. Error measures     Are error measures displayed which can help to determine the goodness of the fit?
7. Post-processing    Is it directly possible to perform further calculations using the regression results?

II. ANALYSIS OF EXISTING CURVE FITTING WEB APPLICATIONS

Widely used programs for calculation and visualisation of results in practical lab classes are often commercial, such as MS Office/Excel (Microsoft Corporation), OriginPro (OriginLab) or Igor Pro (WaveMetrics), and require installation. For e-learning, non-commercial and installation-free applications are relevant because both properties save money: (i) on software licenses and (ii) on their maintenance (installation, upgrades, ...). Accordingly, we studied freely available existing web applications in a structured and systematic manner according to the defined criteria summarised in Table I.

More than twenty functional curve fit web applications were identified. Although there may be more, we focus on only four representatives which fulfil a maximum of the relevant functionality from Table I, namely fitteia¹, WolframAlpha², mycurvefit³ and statpages⁴. The respective analysis results are summarised in Table II.

It can be concluded that several regression web applications exist, each with various limitations. Aside from these quantifiable results, it has to be noted that there are other important criteria which are not straightforward to measure, such as ease of use or GUI (graphical user interface) design. For example, WolframAlpha, as a more generic mathematical software, requires the user to know which functions exist and the syntax for using them. An approach to assess such aspects could be based on software ergonomics standards, such as ISO 9241 [11].

¹ http://fitter.ist.utl.pt/, 17.03.2017
² https://www.wolframalpha.com/, 17.03.2017
³ http://mycurvefit.com/, 17.03.2017
⁴ http://statpages.info/nonlin.html, 17.03.2017

TABLE II
A COMPARISON OF REPRESENTATIVE WEBSITES, WHICH CAN BE USED TO SOLVE CURVE FITTING PROBLEMS, BASED ON THE CRITERIA DEFINED IN TABLE I.

Criteria ﬁtteia WolframAlpha mycurveﬁt statpages

1. - -

2.

3. partly partly partly

4. -

5. partly partly -

6. - partly partly

7. - - -

III. RESULTS AND DISCUSSION

The web application developed by the authors (available at curvefit.tu-chemnitz.de⁵) consists of two main components: a GUI providing the regular curve fitting functionality and a curve fit evaluation tool, which is not discussed here in detail.

With regard to the contents of Table I, the web application presented in this paper fulfils all criteria at least on an elementary functional level. A special property of the software is the possibility to choose between various implementations of regression algorithms, such as those provided by MATLAB's Curve Fitting Toolbox, a Java library⁶ or GNU Octave (optim package⁷). Note that the latter two are open-source or open-access tools and thus free to use in education.

A. User interface for curve ﬁtting

The graphical user interface covering the full curve fitting workflow is separated into five elements, as shown in Fig. 1. The following layout is realised:

• (left, top) Definition of the data set with options for further processing and data import (for example from a CSV file).
• (left, middle) Input of the mathematical model via ASCII characters, together with the rendered formula.
• (left, bottom) Parameters of the function with their start values, results and confidence values.
• (right, top) The resulting function and the data set, plotted as an exportable graph.
• (right, bottom) Post-processing of curve fit results and other parameters can be conducted through this element. Alternatively, the residuals, i.e. the differences between the data set and the resulting function, can be displayed.

B. Technical background of the web application

The back end of the application is written in Java and based on Jetty as HTTP server and servlet container. Due to the lack of implementations of regression algorithms written in JavaScript, the respective functionality is executed by the server (alternatively, a compiler targeting JavaScript, such as Emscripten, could be utilised). Hence, the application programming interface (API) for the necessary asynchronous calls is implemented based on REST (Representational State Transfer) using Jersey as a servlet. For the persistent storage of data on the server side (for example to share results by URL), a MongoDB database is connected to the back end.

⁵ http://curvefit.tu-chemnitz.de/, 17.03.2017
⁶ https://www.ee.ucl.ac.uk/~mflanaga/java/, 20.03.2017
⁷ https://octave.sourceforge.io/optim/, 20.03.2017

Fig. 1. Screenshot of the graphical user interface for the curve fitting functionality of the developed application. Data taken from a template protocol of a lab class in physics at TU Chemnitz (available at https://www.tu-chemnitz.de/physik/PGP/allgemein.php, 20.03.2017). For further description, see Section III-C.

The front end uses the standard web technologies HTML, CSS and JavaScript. To simplify development, the code is mainly written using the MVC framework AngularJS. Its advantages include the possibility to write reusable components and the use of data binding to connect HTML and JavaScript content [12].

Other important libraries which have been utilised are:

• JS Expression Evaluator to parse and evaluate mathematical functions in a secure manner.
• MathJax to render the formulas entered by the user.
• JSXGraph to plot the data set and regression results in a graph.
• Plotly to display the heatmap of the evaluation tool.
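
The idea behind evaluating user-entered formulas "in a secure manner" can be sketched in Python as follows. This is not the API of the JS Expression Evaluator library. It is a hypothetical, minimal illustration of the same principle: the input is parsed into a syntax tree, and only whitelisted operators, functions and names are evaluated. All names in the sketch are assumptions made for this example.

```python
import ast
import math
import operator

# Whitelists: anything not listed here is rejected.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
           ast.Div: operator.truediv, ast.Pow: operator.pow}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCTIONS = {"exp": math.exp, "log": math.log, "sqrt": math.sqrt,
              "sin": math.sin, "cos": math.cos}
_CONSTANTS = {"pi": math.pi, "e": math.e}


def evaluate(expression, variables):
    """Evaluate an arithmetic expression string without using eval()."""
    tree = ast.parse(expression, mode="eval")

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.Name):
            if node.id in variables:
                return variables[node.id]
            if node.id in _CONSTANTS:
                return _CONSTANTS[node.id]
            raise ValueError(f"unknown name: {node.id}")
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](visit(node.operand))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCTIONS and not node.keywords):
            return _FUNCTIONS[node.func.id](*[visit(arg) for arg in node.args])
        # Attribute access, imports, strings, comparisons etc. end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return visit(tree)


# A user-defined model with user-supplied parameter values (Python syntax,
# so powers are written with ** in this sketch).
print(evaluate("a*x + b", {"a": 2, "x": 3, "b": 1}))
```

The sketch does not limit computation time, so a production evaluator would additionally need safeguards against very expensive expressions such as deeply nested powers.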

C. Determination of specific electrical resistance as model application used in laboratory classes

Among the multitude of potential applications of the presented curve fitting web tool is its use in education, for example in laboratory classes where experiments produce data that have (i) to be visualised and (ii) to be evaluated using mathematical models or used to test the correctness of such models. This setting is well known from physics or chemistry courses in secondary school or at college/university. Especially at the higher levels of education, the use of dedicated curve fitting software is indispensable [10].

Here, a typical use case was chosen from a lab course in physics, namely the problem of determining the specific electrical resistance ρ of a wire from the correlation between its length and its resistance. A concrete workflow can be described as follows:

• Wires of different lengths l are probed.
• An electric circuit is built to measure the current I and the voltage U with the respective devices.
• The electrical resistance R is calculated from the measured current and voltage according to the following equation:

$$R = \frac{U}{I} \qquad (2)$$

• The data set of wire lengths and respective resistances is entered into a data visualisation and analysis software.
• Regression analysis using a linear function is applied to the data to calculate an average ratio of resistance to length. This is the slope m of the fit equation R = m·l:

$$m = \frac{dR}{dl} \qquad (3)$$

• From the regression analysis, the slope m is used to calculate the specific electrical resistance according to:

$$\rho = m \cdot A = m \cdot \frac{\pi}{4} \cdot d^2, \qquad (4)$$

where A is the cross section of the wire defined by its diameter d in units of meters.
• Regression also provides the 95% confidence value of m, here denoted Δm. Together with the error Δd of the diameter measurement, the relative error of the specific electrical resistance is calculated according to:

$$\frac{\Delta\rho}{\rho} = \frac{\Delta m}{m} + \frac{2\,\Delta d}{d}. \qquad (5)$$
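
The workflow above can be traced numerically with a short Python sketch of equations (2) to (5). All measurement values below, including Δm and Δd, are hypothetical numbers chosen for illustration; they are not the data shown in Fig. 1. In the web application Δm would come from the regression itself, whereas here it is simply assumed.

```python
import math


def resistance(voltage, current):
    """Equation (2): R = U / I."""
    return voltage / current


def slope_through_origin(lengths, resistances):
    """Least-squares slope m of the fit equation R = m * l (no intercept)."""
    sum_ll = sum(l * l for l in lengths)
    sum_lr = sum(l * r for l, r in zip(lengths, resistances))
    return sum_lr / sum_ll


def specific_resistance(m, d):
    """Equation (4): rho = m * A with A = pi / 4 * d**2."""
    return m * math.pi / 4 * d ** 2


def relative_error(m, delta_m, d, delta_d):
    """Equation (5): delta_rho / rho = delta_m / m + 2 * delta_d / d."""
    return delta_m / m + 2 * delta_d / d


# Hypothetical measurements (illustration only).
lengths = [0.2, 0.4, 0.6, 0.8, 1.0]          # wire lengths l in m
voltages = [0.10, 0.20, 0.30, 0.40, 0.50]    # U in V
currents = [0.20, 0.20, 0.20, 0.20, 0.20]    # I in A
d, delta_d = 0.5e-3, 0.01e-3                 # wire diameter and its error in m
delta_m = 0.1                                # assumed confidence value of m in Ohm/m

resistances = [resistance(u, i) for u, i in zip(voltages, currents)]
m = slope_through_origin(lengths, resistances)
rho = specific_resistance(m, d)
rel = relative_error(m, delta_m, d, delta_d)

print(f"m   = {m:.3f} Ohm/m")
print(f"rho = {rho * 1e6:.3f} +/- {rho * rel * 1e6:.3f} microOhm*m ({rel * 100:.1f} %)")
```

Because the diameter enters equation (4) squared, its relative error is counted twice in equation (5), so an imprecise diameter measurement can dominate the total error even when the fit itself is good.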

The experimentalist may use the post-processing module shown at the bottom right of Fig. 1 to provide d and Δd as values, while equations (4) and (5) have to be typed into the GUI using their ASCII representations. Results are generated automatically and can be used in the protocol. In the above-mentioned example, the specific electrical resistance can be read directly from Fig. 1 (bottom, right) as ρ_el = (0.51 ± 0.03) µΩ·m, corresponding to a relative error of 5.7%. Note that values may be fixed to a user-defined number of digits using the command fixed(). The example provided in Fig. 1 is available online⁸ and is usable as an exchange format between supervisor and student, e.g. to evaluate the correctness of the equations and calculations entered by the students into a previously empty GUI.

IV. CONCLUSION

The web application available at curvefit.tu-chemnitz.de was presented in the context of its application in e-learning. We exemplified how typical tasks which are part of laboratory classes are fully covered, i.e. data import and visualisation as a scatter plot, regression analysis using topic-specific functions and post-processing of user-defined (given) parameters and parameters obtained from curve fitting. The latter allows for mathematically reproducible error calculation. Compared to other web applications, we have overcome various limitations to provide a generic, easy-to-use single-page tool that can be widely used in laboratory classes.

V. OUTLOOK

The presented curve fit tool also offers a comprehensive solution to evaluate regression algorithms utilising simulated data, which is not discussed here in detail. Since errors based on regression are determined numerically, this tool may be used in teaching to visualise in practice the influence of measurement uncertainties or statistical noise on data and the consequences for the accuracy of regression-based parameter determination.
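
The influence of statistical noise on a fitted parameter can be demonstrated with a small simulation. The following Python sketch is a hypothetical illustration and not the evaluation tool of the web application: it repeatedly adds Gaussian noise to an assumed true line, fits the slope each time and reports how strongly the fitted slope scatters. The true parameters, the noise levels and the number of trials are arbitrary assumptions.

```python
import random
import statistics


def fit_slope(xs, ys):
    """Closed-form least-squares slope of y = a*x + b."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_spread(noise_sigma, n_trials=500, seed=1):
    """Return mean and standard deviation of the fitted slope over many noisy data sets."""
    rng = random.Random(seed)              # fixed seed for reproducible results
    xs = [0.1 * i for i in range(1, 11)]   # ten x values from 0.1 to 1.0
    true_slope, true_intercept = 2.5, 0.0  # assumed "true" model
    slopes = []
    for _ in range(n_trials):
        ys = [true_slope * x + true_intercept + rng.gauss(0.0, noise_sigma) for x in xs]
        slopes.append(fit_slope(xs, ys))
    return statistics.mean(slopes), statistics.stdev(slopes)


for sigma in (0.01, 0.05, 0.1):
    mean_slope, sd_slope = slope_spread(sigma)
    print(f"noise sigma = {sigma:.2f}: slope = {mean_slope:.3f} +/- {sd_slope:.3f}")
```

The scatter of the fitted slope grows with the noise level, which is the kind of relationship between measurement uncertainty and parameter accuracy that such a simulation can make visible in class.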

⁸ http://curvefit.tu-chemnitz.de/#?58d14390468c6a0adc2ee4c2, 20.03.2017

To increase the scope of the curve fit application, a sustainable data and result storage management system is currently in preparation. The input for data sets is at present based on text boxes and could benefit from a change to the familiar worksheet-like table structure well known from Excel, Google Spreadsheets or Apache OpenOffice Calc. More comprehensive data management would also include multiple columns and multiple worksheets including cross-calculations, e.g. for data pre-processing. In the example given in Section III-C, the electrical resistance R could then be calculated automatically from the measured voltage U and current I data according to equation (2). The common issue of outliers could be addressed by an assisted detection method or by fully automatic support using methods such as random sample consensus [13].
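
How random sample consensus (RANSAC) could flag outliers before a linear fit is sketched below in Python. This is a minimal illustration of the general technique and not a feature of the current web application; the function names, the threshold, the number of iterations and the data are assumptions made for this example.

```python
import random


def fit_line(points):
    """Closed-form least-squares fit of y = a*x + b to a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, threshold, n_iterations=200, seed=0):
    """Fit a line to the largest consensus set and return (a, b, outliers)."""
    rng = random.Random(seed)  # fixed seed for reproducible results
    best_inliers = []
    for _ in range(n_iterations):
        # 1. Draw a minimal sample: two points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # 2. Collect all points that agree with the candidate line.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= threshold]
        # 3. Keep the candidate supported by the most points.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no consensus set found")
    # 4. Refit using all inliers of the best candidate.
    a, b = fit_line(best_inliers)
    outliers = [p for p in points if p not in best_inliers]
    return a, b, outliers


# Made-up data on the line y = 2x + 1 with one gross outlier at x = 4.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points[4] = (4.0, 30.0)
a, b, outliers = ransac_line(points, threshold=0.5)
print(f"a = {a:.3f}, b = {b:.3f}, outliers = {outliers}")
```

In an assisted mode, the points returned as outliers could be highlighted for the user to confirm instead of being removed automatically.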

As a generic navigation structure similar to file systems, a tree view can be used to permanently save and organise data. In combination with user management, this makes it possible to offer cross-platform web storage, which could be used for collaborative work.

To systematically improve the application, user evaluations in multiple iterations are in preparation, using for example questionnaires or observation studies.

Acknowledgments.

This work was partially accomplished within the project

localizeIT (funding code 03IPT608X) funded by the Federal

Ministry of Education and Research (BMBF, Germany) in the

program of Entrepreneurial Regions InnoProﬁle-Transfer.

REFERENCES

[1] K. Backhaus, Multivariate Analysemethoden eine anwendungsorientierte Einführung. Berlin: Springer, 2006. [Online]. Available: http://dx.doi.org/10.1007/3-540-29932-7

[2] H. Skala, "Will the real best fit curve please stand up?" The College Mathematics Journal, vol. 27, no. 3, pp. 220–223, 1996.

[3] M. I. Lourakis, "A brief description of the Levenberg-Marquardt algorithm implemented by levmar," Foundation of Research and Technology, vol. 4, no. 1, 2005.

[4] A. Böttcher, D. Kowerko, and R. K. Sigel, "Explicit analytic equations for multimolecular thermal melting curves," Biophysical Chemistry, vol. 202, pp. 32–39, 2015. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0301462215000757

[5] D. G. Kleinbaum, L. L. Kupper, A. Nizam, and E. S. Rosenberg, Applied regression analysis and other multivariable methods, 5th ed. Boston, MA: Cengage Learning, 2013.

[6] R. Ramcharan, "Regressions: Why Are Economists Obsessed with Them?" Finance Dev, vol. 43, 2006. [Online]. Available: http://www.ecostat.unical.it/aiello/didattica/Econometria/Regressions %20IMF.pdf

[7] D. J. Hand, H. Mannila, and P. Smyth, Principles of data mining. MIT Press, 2001. [Online]. Available: https://books.google.de/books?hl=de&lr=&id=SdZ-bhVhZGYC&oi=fnd&pg=PR17&dq=Data-Mining++curve+fitting&ots=yxP8BjqumY&sig=ZRnkbwFJ2edTfds 6LrMPF9ZGg

[8] J.-L. Mergny and L. Lacroix, "Analysis of Thermal Melting Curves," Oligonucleotides, vol. 13, no. 6, pp. 515–537, Dec. 2003. [Online]. Available: http://www.liebertonline.com/doi/abs/10.1089/154545703322860825

[9] T. Keller, D. Kowerko, and M. Ritter, "Entwicklung eines webbasierten Curve-fitting Tools für komplexe Multiparameter-Funktionen," in Studierendensymposium Informatik 2016 der TU Chemnitz. Chemnitz: Univ.-Verl, May 2016, pp. 75–85. [Online]. Available: http://nbn-resolving.de/urn:nbn:de:bsz:ch1-qucosa-201104

[10] W. Schenk, F. Kremer, G. Beddies, T. Franke, P. Galvosas, and P. Rieger, Physikalisches Praktikum, W. Schenk and F. Kremer, Eds. Wiesbaden: Springer Fachmedien Wiesbaden, 2014. [Online]. Available: http://link.springer.com/10.1007/978-3-658-00666-2

[11] C.-C. E. de Normalisation, Ergonomische Anforderungen für Bürotätigkeiten mit Bildschirmgeräten Teil 10: Grundsätze der Dialoggestaltung. Februar, 1995.

[12] M. Heinrich and M. Gaedke, "Data binding for standard-based web applications," in Proceedings of the 27th Annual ACM Symposium on Applied Computing. ACM, 2012, pp. 652–657.

[13] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.