
Presentation Prepared by Saba Awan, February 14, 2022

Department of Computer Science, COMSATS University Islamabad, Islamabad-Pakistan

Quadratic Discriminant Analysis

Presentation Edited

by

Saba Awan

(FA19-RSE-041)

Department of Computer Science

(Date: 14/2/2022)

Under the Supervision of

Professor Dr. Nadeem Javaid

Department of Computer Science

COMSATS University Islamabad, Islamabad Pakistan


Introduction

In statistics, a quadratic classifier is a statistical classifier that uses a quadratic decision surface to separate measurements of two or more classes of objects or events. It is a more general version of the linear classifier.

Statistical classification considers a set of vectors of observations x of an object or event, each of which has a known type y. This set is referred to as the training set. The problem is then to determine, for a given new observation vector, what the best class should be.

The standard form of a quadratic is $y = ax^2 + bx + c$.

Quadratic discriminant analysis allows the classifier to assess non-linear relationships.

QDA assumes that each class has its own covariance matrix.

The predictor variables are not assumed to have common variance across each of the k (class) levels in Y (label).
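These assumptions can be seen concretely in code. Below is a minimal sketch using scikit-learn's QuadraticDiscriminantAnalysis on synthetic two-class data in which each class has its own covariance matrix; the data and parameters are made up for illustration and do not come from the slides.

```python
# Illustrative sketch: QDA on synthetic data where each class is drawn from
# its own Gaussian with its own covariance matrix -- the setting QDA assumes.
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

rng = np.random.default_rng(0)
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.0], [0.0, 1.0]], size=200)
X1 = rng.multivariate_normal([3, 3], [[2.0, 0.8], [0.8, 0.5]], size=200)
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

qda = QuadraticDiscriminantAnalysis(store_covariance=True)
qda.fit(X, y)

print(len(qda.covariance_))  # 2 -- one fitted covariance matrix per class
print(qda.score(X, y))       # training accuracy on well-separated classes
```

Note that `store_covariance=True` makes the per-class covariance matrices available after fitting, which is the defining difference from linear discriminant analysis (one shared covariance).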


QDA Derivation (1/12)


Step 1:

Given a training dataset of N input variables x with corresponding target variables t, the QDA model assumes that the class-conditional densities have a Gaussian distribution:

$$p(x \mid t = c) = \mathcal{N}(x \mid \mu_c, \Sigma_c) = \frac{1}{(2\pi)^{D/2}\,|\Sigma_c|^{1/2}} \exp\!\left(-\frac{1}{2}(x - \mu_c)^T \Sigma_c^{-1} (x - \mu_c)\right) \quad (1)$$

where $\mu_c$ is the class-specific mean vector and $\Sigma_c$ is the class-specific covariance matrix.

Step 2:

Calculate the posterior by using Bayes' theorem,

$$p(t = c \mid x) = \frac{p(x \mid t = c)\,p(t = c)}{\sum_{k} p(x \mid t = k)\,p(t = k)} \quad (2)$$

and classify x into the class with the highest posterior probability:

$$\hat{c} = \operatorname*{argmax}_{c} \; p(t = c \mid x) \quad (3)$$
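Steps 1 and 2 can be evaluated numerically. The sketch below assumes toy parameters (the means, covariances, and priors are made up for illustration) and uses SciPy's multivariate normal density for Eq. (1):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Toy parameters, assumed purely for illustration: two classes in 2-D.
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.array([[2.0, 0.8], [0.8, 0.5]])]
priors = np.array([0.5, 0.5])

x = np.array([2.5, 2.8])  # new observation to classify

# Eq. (1): class-conditional Gaussian densities p(x | t = c)
densities = np.array([multivariate_normal(m, S).pdf(x) for m, S in zip(means, covs)])

# Eq. (2): Bayes' theorem gives the posterior p(t = c | x)
posterior = priors * densities / np.sum(priors * densities)

# Eq. (3): classify x into the class with the largest posterior
print(int(np.argmax(posterior)))  # 1 -- x lies much closer to the second mean
```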

QDA Derivation (2/12)


Step 3:

Assuming the data points are drawn independently and identically distributed,

$$p(t \mid \theta) = \prod_{n=1}^{N} p(t_n \mid \theta) \quad (4)$$

the likelihood (conditional probability) function can be defined as

$$p(t \mid \theta) = \prod_{n=1}^{N} \prod_{c=1}^{C} \left(\pi_c\, \mathcal{N}(x_n \mid \mu_c, \Sigma_c)\right)^{t_{nc}} \quad (5)$$

where t denotes all our target variables (with $t_{nc} = 1$ if data point n belongs to class c, and 0 otherwise), and $\pi$ the prior with a subscript denoting the class.

To simplify, let $\theta$ denote all class priors, class-specific mean vectors, and covariance matrices:

$$\theta = \{\pi_1, \ldots, \pi_C,\; \mu_1, \ldots, \mu_C,\; \Sigma_1, \ldots, \Sigma_C\}$$

Maximizing the likelihood is equivalent to maximizing the log-likelihood, so Eq. (5) becomes

$$\ln p(t \mid \theta) = \sum_{n=1}^{N} \sum_{c=1}^{C} t_{nc} \left(\ln \pi_c + \ln \mathcal{N}(x_n \mid \mu_c, \Sigma_c)\right) \quad (6)$$
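Eq. (6) can be computed directly for a tiny dataset. The data points, one-hot targets, and parameters below are assumptions made up for illustration only:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Toy data: rows of T are the one-hot targets t_nc (assumed for illustration).
X = np.array([[0.1, -0.2], [0.0, 0.3], [3.1, 2.9]])
T = np.array([[1, 0], [1, 0], [0, 1]])

priors = np.array([2 / 3, 1 / 3])
means = [np.zeros(2), np.array([3.0, 3.0])]
covs = [np.eye(2), np.eye(2)]

# Eq. (6): ln p(t|theta) = sum_n sum_c t_nc (ln pi_c + ln N(x_n | mu_c, Sigma_c))
log_gauss = np.column_stack(
    [multivariate_normal(m, S).logpdf(X) for m, S in zip(means, covs)]
)
log_lik = float(np.sum(T * (np.log(priors) + log_gauss)))
print(log_lik < 0)  # True: every contributing log term is negative here
```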

QDA Derivation (3/12)


To simplify the natural log, the rules used are as follows:

$$\ln(ab) = \ln a + \ln b, \qquad \ln\!\left(\frac{a}{b}\right) = \ln a - \ln b, \qquad \ln(a^b) = b \ln a$$

QDA Derivation (4/12)


Step 4:

Simplifying the log by using the log rules explained on the previous slide,

$$\ln p(t \mid \theta) = \sum_{n=1}^{N} \sum_{c=1}^{C} t_{nc} \ln \pi_c + \sum_{n=1}^{N} \sum_{c=1}^{C} t_{nc} \ln \mathcal{N}(x_n \mid \mu_c, \Sigma_c) \quad (7)$$

QDA Derivation (5/12)


Step 5:

Expanding Eq. (7) using the Gaussian density,

$$\ln p(t \mid \theta) = \sum_{n=1}^{N} \sum_{c=1}^{C} t_{nc} \left(\ln \pi_c - \frac{D}{2}\ln(2\pi) - \frac{1}{2}\ln|\Sigma_c| - \frac{1}{2}(x_n - \mu_c)^T \Sigma_c^{-1} (x_n - \mu_c)\right) \quad (8)$$

Step 6:

The goal is to find the maximum likelihood solution for the class-specific priors, means, and covariance matrices. The derivative of Eq. (8) will therefore be solved for each in turn, starting with the class-specific prior.

QDA Derivation (6/12)


Constraint: the class priors must sum to 1, i.e. $\sum_{c=1}^{C} \pi_c = 1$.

To maximize the likelihood under this constraint, a Lagrange multiplier is used:

$$L(\pi, \lambda) = \ln p(t \mid \theta) + \lambda\left(1 - \sum_{c=1}^{C} \pi_c\right) \quad (9)$$

Step 7:

Putting the value of $\ln p(t \mid \theta)$ from Eq. (8) into Eq. (9) and taking the derivative with respect to the class-specific prior $\pi_c$,

$$\frac{\partial L}{\partial \pi_c} = \sum_{n=1}^{N} \frac{t_{nc}}{\pi_c} - \lambda = \frac{N_c}{\pi_c} - \lambda = 0$$

where $N_c = \sum_{n} t_{nc}$ is the number of data points in class c.

QDA Derivation (7/12)


Step 8:

Solving for the prior gives

$$\pi_c = \frac{N_c}{\lambda} \quad (10)$$

Summing Eq. (10) over all classes and using the constraint $\sum_c \pi_c = 1$ gives $\lambda = \sum_c N_c = N$. Substituting $\lambda = N$ back into (10) gives us

$$\pi_c = \frac{N_c}{N} \quad (11)$$

Eq. (11) tells us that the class prior is simply the proportion of data points that belong to the class.
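Eq. (11) amounts to counting class members. A minimal sketch with hypothetical labels:

```python
import numpy as np

# Hypothetical labels for N = 10 data points across three classes.
t = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2])

# Eq. (11): pi_c = N_c / N -- count the members of each class and normalise.
classes, counts = np.unique(t, return_counts=True)
priors = counts / t.size
print(priors.tolist())  # [0.3, 0.4, 0.3]
```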

QDA Derivation (8/12)


Step 8:

Now maximize the log-likelihood with respect to the class-specific mean $\mu_c$. To solve the derivative, the identity $\frac{\partial}{\partial s}(x - s)^T W (x - s) = -2W(x - s)$ (for symmetric W) is used:

$$\frac{\partial \ln p(t \mid \theta)}{\partial \mu_c} = \sum_{n=1}^{N} t_{nc}\, \Sigma_c^{-1}(x_n - \mu_c) = 0 \quad \Rightarrow \quad \mu_c = \frac{1}{N_c} \sum_{n=1}^{N} t_{nc}\, x_n \quad (12)$$

QDA Derivation (9/12)


Step 8:

$t_{nc}$ is only equal to 1 for the data points that belong to class c, so:

1. The sum on the L.H.S. of Eq. (12) only includes the input variables x that belong to class c.
2. Dividing that sum of vectors by the number of data points in the class, $N_c$, is the same as taking the average of the vectors.
3. The class-specific mean vector is just the mean of the vectors of the class.
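The three observations above reduce Eq. (12) to a per-class average. A minimal sketch with hypothetical points:

```python
import numpy as np

# Hypothetical points: the first three belong to class 0, the last two to class 1.
X = np.array([[1.0, 2.0], [3.0, 0.0], [2.0, 1.0], [8.0, 8.0], [10.0, 6.0]])
t = np.array([0, 0, 0, 1, 1])

# Eq. (12): mu_c is simply the average of the vectors that belong to class c.
mu_0 = X[t == 0].mean(axis=0)
mu_1 = X[t == 1].mean(axis=0)
print(mu_0.tolist())  # [2.0, 1.0]
print(mu_1.tolist())  # [9.0, 7.0]
```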

QDA Derivation (10/12)


Step 9:

To maximize the log-likelihood with respect to the class-specific covariance matrix, take the derivative w.r.t. $\Sigma_c$, using:

1. The log-determinant derivative identity: $\frac{\partial}{\partial \Sigma} \ln|\Sigma| = \Sigma^{-1}$ (for symmetric $\Sigma$)
2. The property of a trace product: $a^T B a = \operatorname{Tr}(B\, a a^T)$
3. The matrix calculus identity: $\frac{\partial}{\partial \Sigma} \operatorname{Tr}(\Sigma^{-1} B) = -\Sigma^{-1} B\, \Sigma^{-1}$

QDA Derivation (11/12)


Setting the derivative to zero and solving gives

$$\Sigma_c = \frac{1}{N_c} \sum_{n=1}^{N} t_{nc}\, (x_n - \mu_c)(x_n - \mu_c)^T \quad (13)$$

The class-specific covariance matrix is just the covariance of the vectors of the class.
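Eq. (13) is the biased (divide-by-$N_c$, maximum-likelihood) covariance estimate. A minimal check against NumPy's `np.cov` with `bias=True`, using hypothetical points:

```python
import numpy as np

# Hypothetical points that all belong to one class c.
Xc = np.array([[1.0, 2.0], [3.0, 0.0], [2.0, 1.0]])
mu_c = Xc.mean(axis=0)  # class mean, per Eq. (12)

# Eq. (13): Sigma_c = (1/N_c) sum_n (x_n - mu_c)(x_n - mu_c)^T
diff = Xc - mu_c
Sigma_c = diff.T @ diff / Xc.shape[0]

# This matches NumPy's biased (divide-by-N) covariance estimate.
print(np.allclose(Sigma_c, np.cov(Xc, rowvar=False, bias=True)))  # True
```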

QDA Derivation (12/12)


Step 10:

Lastly, from Eqs. (11), (12), and (13), the fitted classifier assigns a new observation x to the class

$$\hat{c} = \operatorname*{argmax}_{c} \left(\ln \pi_c - \frac{1}{2}\ln|\Sigma_c| - \frac{1}{2}(x - \mu_c)^T \Sigma_c^{-1}(x - \mu_c)\right) \quad (14)$$

where argmax is an operation that finds the argument that gives the maximum value of a target function.
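The whole derivation can be sketched from scratch: estimate the parameters by maximum likelihood with Eqs. (11)-(13), then classify with the argmax rule of Eq. (14). The function names and data below are assumptions for illustration, not part of the slides:

```python
import numpy as np

def qda_fit(X, t):
    """Maximum-likelihood estimates: one (pi_c, mu_c, Sigma_c) per class."""
    params = {}
    for c in np.unique(t):
        Xc = X[t == c]
        mu = Xc.mean(axis=0)                        # Eq. (12): class mean
        diff = Xc - mu
        Sigma = diff.T @ diff / len(Xc)             # Eq. (13): class covariance
        params[c] = (len(Xc) / len(X), mu, Sigma)   # Eq. (11): class prior
    return params

def qda_predict(params, x):
    """Eq. (14): argmax over the quadratic discriminant scores."""
    def score(pi, mu, S):
        d = x - mu
        return (np.log(pi) - 0.5 * np.log(np.linalg.det(S))
                - 0.5 * d @ np.linalg.solve(S, d))
    return max(params, key=lambda c: score(*params[c]))

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(4.0, 1.0, (100, 2))])
t = np.array([0] * 100 + [1] * 100)

params = qda_fit(X, t)
print(qda_predict(params, np.array([4.0, 4.0])))  # 1
print(qda_predict(params, np.array([0.0, 0.0])))  # 0
```

Note the score drops the constant $-\frac{D}{2}\ln(2\pi)$ term from Eq. (8), since it is the same for every class and does not affect the argmax.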

Conclusion


Quadratic Discriminant Analysis (QDA) is a generative model.

QDA assumes that each class follows a Gaussian distribution.

The class-specific prior is simply the proportion of data points that belong to the class.

The class-specific mean vector is the average of the input variables that belong to the class.

The class-specific covariance matrix is just the covariance of the vectors that belong to the class.


Thank You !!!
