Tuesday, August 6, 2013

Kernel Machines.

http://crsouza.blogspot.com/2010/03/kernel-functions-for-machine-learning.html

kernel methods:
  1. Map the data into a higher-dimensional space, in the hope that the data are more easily separated or better structured there.
  2. The mapping function never needs to be computed explicitly, thanks to the kernel trick.
  3. The kernel trick can be applied to any algorithm that depends only on dot products: wherever a dot product is used, it is replaced by a kernel function.
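The trick in point 3 can be seen concretely with a small sketch (NumPy, degree-2 polynomial kernel as an illustrative choice): the kernel value $(x^T y)^2$ equals the dot product of the explicit feature maps, so the feature map never has to be materialized.

```python
import numpy as np

def phi(x):
    # Explicit degree-2 feature map: all pairwise products x_i * x_j.
    return np.outer(x, x).ravel()

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

explicit = phi(x) @ phi(y)   # dot product computed in feature space
kernel   = (x @ y) ** 2      # kernel trick: same value, no phi needed

assert np.isclose(explicit, kernel)
```

For an input of dimension $n$, the explicit map here has $n^2$ components, while the kernel evaluation stays $O(n)$ — which is the whole point of the trick.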
Kernel properties:
  • Kernel functions must be continuous, symmetric, and have a positive (semi-)definite Gram matrix. Kernels satisfying Mercer's theorem are PSD. The PSD property ensures that the optimization problem is convex and its solution unique.
  • Non-PSD kernels, such as the sigmoid kernel, sometimes work better in practice.
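The PSD property above can be checked numerically: build the Gram matrix for a sample and verify that it is symmetric with non-negative eigenvalues. A minimal sketch (the `gram` helper is illustrative, not from any library):

```python
import numpy as np

def gram(kernel, X):
    # Gram matrix K[i, j] = kernel(X[i], X[j]) over the sample X.
    n = len(X)
    return np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))

linear = lambda x, y: x @ y   # linear kernel: Gram matrix is X @ X.T, always PSD
K = gram(linear, X)

eigvals = np.linalg.eigvalsh(K)
assert np.allclose(K, K.T)        # symmetric
assert eigvals.min() > -1e-9      # PSD up to numerical tolerance
```

Running the same check with a sigmoid kernel on an adversarial sample can produce a negative eigenvalue, which is exactly why it falls outside Mercer's conditions.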
Choosing the right kernel:
  • The choice is often guided by intuition about what kind of structure we expect to extract from the data.
Kernel functions:
  • Linear kernel, $ k(x, y) = x^T y + c $.
  • Polynomial kernel, $ k(x, y) = (\alpha x^T y + c)^d $.
  • Gaussian kernel, $ k(x, y) = \exp(-\|x - y\|^2 / (2\sigma^2)) $; carefully tune the parameter $ \sigma $.
  • Exponential kernel, $ k(x, y) = \exp(-\|x - y\| / (2\sigma^2)) $.
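The kernels listed above are one-liners in NumPy; a minimal sketch (parameter defaults here are arbitrary illustrative values):

```python
import numpy as np

def linear(x, y, c=0.0):
    # k(x, y) = x^T y + c
    return x @ y + c

def polynomial(x, y, alpha=1.0, c=1.0, d=3):
    # k(x, y) = (alpha * x^T y + c)^d
    return (alpha * (x @ y) + c) ** d

def gaussian(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)); sigma needs tuning
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))

def exponential(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y|| / (2 * sigma^2))
    return np.exp(-np.linalg.norm(x - y) / (2 * sigma ** 2))

x = np.array([1.0, 2.0])
y = np.array([2.0, 1.0])
print(linear(x, y))          # 4.0
print(gaussian(x, x))        # 1.0 — Gaussian kernel of a point with itself
```

Note how the Gaussian kernel always returns 1 for identical inputs and decays toward 0 as points move apart, with $ \sigma $ controlling the decay rate.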