Page Comparison

...

The probability for class j, with the exception of the last class, is

\[
P_j(X_i) = \frac{\exp(X_i B_j)}{\sum_{l=1}^{k-1} \exp(X_i B_l) + 1}
\]

The last class has probability

\[
1 - \sum_{j=1}^{k-1} P_j(X_i) = \frac{1}{\sum_{j=1}^{k-1} \exp(X_i B_j) + 1}
\]
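
As a rough illustration of the two probability formulas, the following Python sketch computes all k class probabilities for a single instance. The function name `class_probabilities`, the list-based layout of `x` and `B`, and the sample numbers are assumptions made for this example; they are not taken from the Weka source.

```python
import math

def class_probabilities(x, B):
    """Return [P_1(x), ..., P_k(x)] for one instance.

    x : list of m attribute values (hypothetical layout).
    B : list of k-1 coefficient vectors, each of length m,
        i.e. the columns B_j of the m*(k-1) matrix B (assumed layout).
    """
    # Linear scores X_i * B_j for the first k-1 classes.
    scores = [sum(xa * ba for xa, ba in zip(x, b_j)) for b_j in B]
    # Shared denominator: sum of the exponentiated scores, plus 1.
    denom = sum(math.exp(s) for s in scores) + 1.0
    probs = [math.exp(s) / denom for s in scores]
    # The last class takes the remaining mass, which equals 1 / denom.
    probs.append(1.0 / denom)
    return probs

# Made-up example with m = 2 attributes and k = 3 classes.
x = [1.0, 2.0]
B = [[0.5, -0.25], [0.1, 0.3]]
p = class_probabilities(x, B)
print(p)
```

The sketch does not guard against overflow in `math.exp` for large scores; a production implementation would normally need to.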

The (negative) multinomial log-likelihood is thus:

\[
L = -\sum_{i=1}^{n} \left\{ \sum_{j=1}^{k-1} Y_{ij} \ln P_j(X_i) + \left(1 - \sum_{j=1}^{k-1} Y_{ij}\right) \ln\left(1 - \sum_{j=1}^{k-1} P_j(X_i)\right) \right\} + \mathrm{ridge} \cdot B^2
\]
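
The objective can be sketched in Python as follows. This is a minimal sketch under two assumptions that the text above does not spell out: Yij is treated as a 0/1 indicator that instance i belongs to class j, and B^2 is read as the sum of the squared entries of B. The function name `neg_log_likelihood` and the data layout are hypothetical.

```python
import math

def neg_log_likelihood(X, Y, B, ridge=0.0):
    """Ridge-penalised negative multinomial log-likelihood.

    X : n rows, each a list of m attribute values.
    Y : n rows, each a list of k-1 indicators (assumed 0/1).
    B : list of k-1 coefficient vectors, each of length m.
    """
    total = 0.0
    for x, y in zip(X, Y):
        scores = [sum(xa * ba for xa, ba in zip(x, b_j)) for b_j in B]
        denom = sum(math.exp(s) for s in scores) + 1.0
        probs = [math.exp(s) / denom for s in scores]
        # Contribution of the first k-1 classes.
        total += sum(y_j * math.log(p_j) for y_j, p_j in zip(y, probs))
        # Contribution of the last class.
        total += (1.0 - sum(y)) * math.log(1.0 - sum(probs))
    # Assumption: B^2 means the sum of squared coefficients.
    penalty = ridge * sum(b * b for b_j in B for b in b_j)
    return -total + penalty

# Made-up data: n = 3 instances, m = 2 attributes, k = 3 classes.
X = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]
Y = [[1, 0], [0, 1], [0, 0]]
B_zero = [[0.0, 0.0], [0.0, 0.0]]
nll_zero = neg_log_likelihood(X, Y, B_zero)
print(nll_zero)
```

With all coefficients at zero every class has probability 1/3, so the value reduces to n * ln(3), which gives a quick sanity check.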

To find the matrix B that minimizes L, a Quasi-Newton method searches for the optimal values of the m*(k-1) variables. Before the optimization procedure is run, the matrix B is 'squeezed' (flattened) into a vector of length m*(k-1). For details of the optimization procedure, see the weka.core.Optimization class.
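
The 'squeeze' step can be pictured with a short Python sketch that flattens B into a single vector and restores it afterwards. The helper names `flatten` and `unflatten`, and the choice to place the m coefficients of each class next to each other, are assumptions for illustration; the ordering actually used by Weka is not stated here.

```python
def flatten(B):
    """Flatten B, stored as k-1 columns of length m, into one list."""
    return [b for column in B for b in column]

def unflatten(vec, m):
    """Rebuild the k-1 columns of length m from the flat vector."""
    if m <= 0 or len(vec) % m != 0:
        raise ValueError("vector length must be a multiple of m")
    return [vec[start:start + m] for start in range(0, len(vec), m)]

# Made-up example with m = 3 attributes and k - 1 = 2 classes.
B = [[0.5, -0.25, 1.0], [0.1, 0.3, -2.0]]
vec = flatten(B)          # length m*(k-1) = 6
restored = unflatten(vec, 3)
print(vec)
print(restored == B)
```

An optimizer that works on plain vectors would then see only `vec`, and the objective would call `unflatten` before evaluating L.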

...

Version	Old Version 2	New Version Current
Changes made by	Former user	Former user
Saved on	Dec 04, 2008	Dec 11, 2008
