Quadratic Discriminant Analysis
Presentation Edited
by
Saba Awan
(FA19-RSE-041)
Department of Computer Science
(Date: 14/2/2022)
Under the Supervision of
Professor Dr. Nadeem Javaid
Department of Computer Science
COMSATS University Islamabad, Islamabad Pakistan
Introduction
In statistics, a quadratic classifier is a statistical classifier that uses a quadratic decision surface to separate measurements of two or more classes of objects or events. It is a more general version of the linear classifier.
Statistical classification considers a set of vectors of observations x of an object or event, each of which has a known type y. This set is referred to as the training set. The problem is then to determine, for a given new observation vector, what the best class should be.
The standard form of a quadratic decision surface is
$$x^T A x + b^T x + c = 0.$$
Quadratic discriminant analysis allows the classifier to assess non-linear relationships.
QDA assumes that each class has its own covariance matrix: the predictor variables are not assumed to have common variance across each of the k (class) levels in Y (label).
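For a concrete starting point, the sketch below fits scikit-learn's QuadraticDiscriminantAnalysis on a small synthetic two-class dataset; the data, means, and covariances are illustrative assumptions, not taken from these slides.

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# Two illustrative classes with different covariance structure,
# which is exactly the situation QDA is designed for.
rng = np.random.default_rng(0)
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.0], [0.0, 1.0]], size=100)
X1 = rng.multivariate_normal([2, 2], [[1.0, 0.8], [0.8, 1.0]], size=100)
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

qda = QuadraticDiscriminantAnalysis()
qda.fit(X, y)                       # estimates a per-class mean and covariance
print(qda.predict([[1.0, 1.0]]))    # class label for a new observation
```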
QDA Derivation (1/12)
Step 1:
Given a training dataset of N input variables x with corresponding target variables t, the QDA model assumes that the class-conditional densities have a Gaussian distribution:
$$p(x \mid t = c) = \mathcal{N}(x \mid \mu_c, \Sigma_c) = \frac{1}{(2\pi)^{D/2} |\Sigma_c|^{1/2}} \exp\Big(-\frac{1}{2}(x - \mu_c)^T \Sigma_c^{-1} (x - \mu_c)\Big), \tag{1}$$
where $\mu_c$ is the class-specific mean vector and $\Sigma_c$ is the class-specific covariance matrix.
Step 2:
Calculate the posterior by using Bayes' theorem,
$$p(t = c \mid x) = \frac{p(x \mid t = c)\, p(t = c)}{\sum_{k} p(x \mid t = k)\, p(t = k)}, \tag{2}$$
and classify x into the class with the highest posterior,
$$\hat{c} = \operatorname*{argmax}_c \; p(t = c \mid x). \tag{3}$$
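A minimal numerical sketch of Eqs (1)-(3), assuming two classes with hand-picked illustrative parameters:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative class-specific parameters (assumptions, not from the slides).
priors = [0.5, 0.5]
means = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
covs = [np.eye(2), np.array([[1.0, 0.5], [0.5, 1.0]])]

x = np.array([1.0, 1.0])

# Eq (1): class-conditional Gaussian densities p(x | t = c).
densities = [multivariate_normal.pdf(x, means[c], covs[c]) for c in range(2)]

# Eq (2): posterior p(t = c | x) via Bayes' theorem.
joint = np.array([priors[c] * densities[c] for c in range(2)])
posterior = joint / joint.sum()

# Eq (3): classify x into the class with the highest posterior.
print(posterior, np.argmax(posterior))
```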
QDA Derivation (2/12)
Step 3:
Assuming the data points are drawn independently, the joint probability of a single point and its class is
$$p(x_n, t_n = c) = \pi_c\, \mathcal{N}(x_n \mid \mu_c, \Sigma_c), \tag{4}$$
so the likelihood (conditional probability) function can be defined as
$$p(t \mid \pi, \mu, \Sigma) = \prod_{n=1}^{N} \prod_{c=1}^{K} \big(\pi_c\, \mathcal{N}(x_n \mid \mu_c, \Sigma_c)\big)^{t_{nc}}, \tag{5}$$
where t denotes all our target variables (one-hot encoded, so $t_{nc} = 1$ if point n belongs to class c) and $\pi_c$ is the prior of class c.
To simplify, let θ denote all class priors, class-specific mean vectors, and covariance matrices:
$$\theta = \{\pi_1, \ldots, \pi_K, \mu_1, \ldots, \mu_K, \Sigma_1, \ldots, \Sigma_K\}.$$
Maximizing the likelihood is equivalent to maximizing the log-likelihood, so Eq (5) becomes
$$\ln p(t \mid \theta) = \ln \prod_{n=1}^{N} \prod_{c=1}^{K} \big(\pi_c\, \mathcal{N}(x_n \mid \mu_c, \Sigma_c)\big)^{t_{nc}}. \tag{6}$$
QDA Derivation (3/12)
To simplify the natural log of the likelihood, the following log rules are used:
$$\ln(ab) = \ln a + \ln b, \qquad \ln\frac{a}{b} = \ln a - \ln b, \qquad \ln a^b = b \ln a, \qquad \ln e^a = a.$$
QDA Derivation (4/12)
Step 4:
Simplifying the log using the log rules explained on the previous slide, Eq (6) becomes
$$\ln p(t \mid \theta) = \sum_{n=1}^{N} \sum_{c=1}^{K} t_{nc} \big(\ln \pi_c + \ln \mathcal{N}(x_n \mid \mu_c, \Sigma_c)\big). \tag{7}$$
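To make Eq (7) concrete, the short sketch below evaluates the log-likelihood term by term on toy data; the inputs, targets, and parameters θ are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Toy inputs X (N, D), one-hot targets T (N, K), and assumed parameters theta.
X = np.array([[0.5, 0.2], [-0.3, 0.1], [2.1, 1.9]])
T = np.array([[1, 0], [1, 0], [0, 1]])
priors = [0.5, 0.5]
means = [np.zeros(2), np.array([2.0, 2.0])]
covs = [np.eye(2), np.eye(2)]

# Eq (7): sum_n sum_c t_nc * (ln pi_c + ln N(x_n | mu_c, Sigma_c)).
ll = sum(T[n, c] * (np.log(priors[c])
                    + multivariate_normal.logpdf(X[n], means[c], covs[c]))
         for n in range(X.shape[0]) for c in range(T.shape[1]))
print(ll)
```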
QDA Derivation (5/12)
Step 5:
Expanding Eq (7) with the Gaussian density from Eq (1) gives
$$\ln p(t \mid \theta) = \sum_{n=1}^{N} \sum_{c=1}^{K} t_{nc} \Big(\ln \pi_c - \frac{D}{2}\ln 2\pi - \frac{1}{2}\ln|\Sigma_c| - \frac{1}{2}(x_n - \mu_c)^T \Sigma_c^{-1} (x_n - \mu_c)\Big). \tag{8}$$
Step 6:
The goal is to find the maximum likelihood solution for the class-specific priors, means, and covariance matrices, so the derivative of Eq (8) is taken with respect to each of them in turn, starting with the derivative of the class-specific prior.
QDA Derivation (6/12)
Constraint: the prior probabilities must sum to 1, $\sum_{c=1}^{K} \pi_c = 1$.
To maximize the likelihood under this constraint, the Lagrange multiplier helps:
$$L = \ln p(t \mid \theta) + \lambda\Big(1 - \sum_{c=1}^{K} \pi_c\Big). \tag{9}$$
Step 7:
Putting the value of $\ln p(t \mid \theta)$ from Eq (8) into (9) and taking the derivative w.r.t. the class-specific prior $\pi_c$:
$$\frac{\partial L}{\partial \pi_c} = \sum_{n=1}^{N} \frac{t_{nc}}{\pi_c} - \lambda = 0.$$
QDA Derivation (7/12)
Step 8:
Solving for $\pi_c$ gives
$$\pi_c = \frac{1}{\lambda}\sum_{n=1}^{N} t_{nc} = \frac{N_c}{\lambda}. \tag{10}$$
Applying the constraint that the prior probabilities sum to 1 yields $\lambda = N$. Substituting $\lambda = N$ back into (10) gives us
$$\pi_c = \frac{N_c}{N}. \tag{11}$$
Eq (11) tells us that the class prior is simply the proportion of data points that belong to the class.
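A one-line NumPy check of Eq (11), assuming one-hot targets T as defined in Step 3:

```python
import numpy as np

# One-hot targets T with shape (N, K): t_nc = 1 if point n belongs to class c.
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

N = T.shape[0]
N_c = T.sum(axis=0)        # per-class counts, N_c = sum_n t_nc
priors = N_c / N           # Eq (11): pi_c = N_c / N
print(priors)              # [0.5 0.5]
```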
QDA Derivation (8/12)
Step 9:
Now maximize the log-likelihood with respect to the class-specific mean $\mu_c$.
To solve the derivative, the identity $\frac{\partial}{\partial \mu}(x - \mu)^T \Sigma^{-1} (x - \mu) = -2\,\Sigma^{-1}(x - \mu)$ (for symmetric $\Sigma$) is used. Setting the derivative of Eq (8) to zero and solving for $\mu_c$:
$$\sum_{n=1}^{N} t_{nc}\, \Sigma_c^{-1}(x_n - \mu_c) = 0 \quad\Rightarrow\quad \mu_c = \frac{1}{N_c} \sum_{n=1}^{N} t_{nc}\, x_n. \tag{12}$$
QDA Derivation (9/12)
Step 9 (continued):
$t_{nc}$ is only equal to 1 for the data points that belong to class c. Therefore:
1. The sum on the L.H.S. of Eq (12) only includes the input variables x that belong to class c.
2. Dividing that sum of vectors by the number of data points in the class, $N_c$, is the same as taking the average of the vectors.
3. Hence the class-specific mean vector is just the mean of the vectors of the class, as shown in the sketch below.
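The same toy setup illustrates Eq (12): with one-hot T, the matrix product $T^T X$ accumulates exactly the per-class sums.

```python
import numpy as np

# Toy inputs X of shape (N, D) and one-hot targets T of shape (N, K).
X = np.array([[1.0, 2.0], [3.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

N_c = T.sum(axis=0)                  # class counts
means = (T.T @ X) / N_c[:, None]     # Eq (12): mu_c = (1/N_c) * sum_n t_nc * x_n
print(means)                         # row c is the average of the x_n in class c
```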
QDA Derivation (10/12)
Step 10:
To maximize the log-likelihood with respect to the class-specific covariance matrix:
1. The derivative of Eq (8) w.r.t. $\Sigma_c$ is taken.
2. By following the derivative identity: $\frac{\partial}{\partial \Sigma_c} \ln|\Sigma_c| = \Sigma_c^{-1}$ (for symmetric $\Sigma_c$).
3. Property of a trace product: $(x_n - \mu_c)^T \Sigma_c^{-1} (x_n - \mu_c) = \operatorname{tr}\big(\Sigma_c^{-1} (x_n - \mu_c)(x_n - \mu_c)^T\big)$, using the cyclic permutation $\operatorname{tr}(ABC) = \operatorname{tr}(CAB)$.
4. Matrix calculus identity: $\frac{\partial}{\partial \Sigma_c} \operatorname{tr}\big(\Sigma_c^{-1} B\big) = -\,\Sigma_c^{-1} B\, \Sigma_c^{-1}$ (for symmetric $\Sigma_c$ and $B$).
QDA Derivation (11/12)
Applying these identities, setting the derivative to zero, and solving for $\Sigma_c$ gives
$$\Sigma_c = \frac{1}{N_c} \sum_{n=1}^{N} t_{nc}\, (x_n - \mu_c)(x_n - \mu_c)^T. \tag{13}$$
The class-specific covariance matrix is just the covariance of the vectors of the class.
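A short sketch of Eq (13) on the same illustrative toy data; each class covariance is built from the centered vectors of that class.

```python
import numpy as np

# Continuing from the mean example: X (N, D), one-hot T (N, K), means (K, D).
X = np.array([[1.0, 2.0], [3.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
N_c = T.sum(axis=0)
means = (T.T @ X) / N_c[:, None]

# Eq (13): Sigma_c = (1/N_c) * sum_n t_nc (x_n - mu_c)(x_n - mu_c)^T
covs = []
for c in range(T.shape[1]):
    Xc = X[T[:, c] == 1] - means[c]   # centered points of class c
    covs.append(Xc.T @ Xc / N_c[c])
print(covs[0])
```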
QDA Derivation (12/12)
Step 11:
Lastly, combining Eqs (11), (12), and (13), a new observation x is assigned to the class with the largest posterior:
$$\hat{c} = \operatorname*{argmax}_c \; \pi_c\, \mathcal{N}(x \mid \mu_c, \Sigma_c), \tag{14}$$
where argmax is an operation that finds the argument that gives the maximum value from a target function.
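Finally, a from-scratch sketch that ties Eqs (11)-(14) together; the toy data are illustrative assumptions, and the prediction works in log space, which leaves the argmax of Eq (14) unchanged.

```python
import numpy as np
from scipy.stats import multivariate_normal

def qda_fit(X, T):
    """Maximum likelihood QDA estimates from Eqs (11)-(13)."""
    N, K = T.shape
    N_c = T.sum(axis=0)                      # data points per class
    priors = N_c / N                         # Eq (11)
    means = (T.T @ X) / N_c[:, None]         # Eq (12)
    covs = []
    for c in range(K):                       # Eq (13), one covariance per class
        Xc = X[T[:, c] == 1] - means[c]
        covs.append(Xc.T @ Xc / N_c[c])
    return priors, means, covs

def qda_predict(x, priors, means, covs):
    """Eq (14): argmax_c of pi_c * N(x | mu_c, Sigma_c), in log space."""
    scores = [np.log(priors[c]) + multivariate_normal.logpdf(x, means[c], covs[c])
              for c in range(len(priors))]
    return int(np.argmax(scores))

# Toy data: three points per class so each covariance is well defined.
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],
              [5.0, 5.0], [6.0, 4.0], [5.0, 6.0]])
T = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])

priors, means, covs = qda_fit(X, T)
print(qda_predict(np.array([2.0, 2.0]), priors, means, covs))   # -> 0
```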
Conclusion
Quadratic Discriminant Analysis (QDA) is a generative model.
QDA assumes that each class follows a Gaussian distribution.
The class-specific prior is simply the proportion of data points that belong to the class.
The class-specific mean vector is the average of the input variables that belong to the class.
The class-specific covariance matrix is just the covariance of the vectors that belong to the class.
References
https://cookieblues.github.io/guides/2021/04/01/bsmalea-notes-3b/
https://towardsdatascience.com/quadratic-discriminant-analysis-ae55d8a8148a
http://uc-r.github.io/discriminant_analysis#nonlinear
Thank You !!!