Tuesday, August 6, 2013

Kernel Machines.

http://crsouza.blogspot.com/2010/03/kernel-functions-for-machine-learning.html

Kernel methods:
  1. Map the data into a higher-dimensional space, in the hope that the data is more easily separated or better structured there.
  2. The mapping function doesn't need to be computed explicitly, thanks to the kernel trick.
  3. The kernel trick can be applied to any algorithm that depends solely on dot products between data points. Wherever a dot product is used, it is replaced by a kernel function.
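To make the kernel trick concrete, here is a minimal sketch for one specific case: the degree-2 polynomial kernel k(x, y) = (x . y)^2 on 2-D inputs. The helper names (dot, phi, poly2_kernel) are my own, not from the linked post. The point is only that the kernel value equals a dot product in the mapped space, without ever building the mapping.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map for 2-D input (hypothetical helper).

    It is chosen so that dot(phi(x), phi(y)) == (x . y)^2.
    """
    x1, x2 = x
    return [x1 * x1, 2 ** 0.5 * x1 * x2, x2 * x2]

def poly2_kernel(x, y):
    """k(x, y) = (x . y)^2, computed in the original 2-D space."""
    return dot(x, y) ** 2

x = [1.0, 2.0]
y = [3.0, 4.0]

explicit = dot(phi(x), phi(y))   # goes through the 3-D feature space
implicit = poly2_kernel(x, y)    # never leaves the 2-D input space

print(explicit)
print(implicit)
```

Both values agree up to floating-point rounding (the exact answer is 121). For higher degrees or for the Gaussian kernel the explicit map gets very large or infinite-dimensional, which is why skipping it matters.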
Kernel properties:
  • Kernel functions must be continuous and symmetric, and should have a positive (semi-)definite Gram matrix. Kernels that satisfy Mercer's theorem are positive semi-definite (PSD). The PSD property ensures that the optimization problem is convex and the solution is unique.
  • Some non-PSD kernels, such as the sigmoid kernel, nevertheless work well in practice.
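A rough way to see the PSD property on a small example: build the Gram matrix K for a few points and check that z^T K z >= 0 for many vectors z. This is only a spot check, not a proof; a proper test would look at the eigenvalues of K. The sigmoid kernel form tanh(alpha * x^T y + c), the parameter values, and all function names below are assumptions for illustration.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(x, y, c=1.0):
    return dot(x, y) + c

def sigmoid_kernel(x, y, alpha=1.0, c=-1.0):
    # Assumed form of the sigmoid kernel; c = -1 is picked to show a failure.
    return math.tanh(alpha * dot(x, y) + c)

def gram_matrix(points, kernel):
    return [[kernel(p, q) for q in points] for p in points]

def passes_psd_spot_check(K, trials=1000, tol=1e-9, seed=0):
    """Return False as soon as some z gives z^T K z < -tol."""
    n = len(K)
    rng = random.Random(seed)
    # Unit vectors first (they test the diagonal), then random vectors.
    candidates = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    candidates += [[rng.gauss(0, 1) for _ in range(n)] for _ in range(trials)]
    for z in candidates:
        form = sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))
        if form < -tol:
            return False
    return True

points = [[0.0, 0.0], [1.0, 0.5], [-0.5, 2.0], [0.3, -1.2]]
linear_ok = passes_psd_spot_check(gram_matrix(points, linear_kernel))
sigmoid_ok = passes_psd_spot_check(gram_matrix(points, sigmoid_kernel))

print(linear_ok)
print(sigmoid_ok)
```

With these points the linear kernel passes and the sigmoid kernel fails, because k(0, 0) = tanh(-1) is negative and a PSD matrix cannot have a negative diagonal entry.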
Choosing the right kernel:
  • The choice can be guided by intuition about what kind of information we expect to extract from the data.
Kernel functions:
  • Linear kernel, $k(x, y) = x^T y + c$.
  • Polynomial kernel.
  • Gaussian kernel; the parameter $\sigma$ must be tuned carefully.
  • Exponential kernel. 
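A small sketch of the four kernels listed above, in plain Python. Only the linear formula is written out in these notes; the polynomial, Gaussian, and exponential forms below are the usual textbook ones as I remember them, and the parameter names and defaults (alpha, c, d, sigma) are assumptions.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def sq_dist(x, y):
    """Squared Euclidean distance ||x - y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def linear_kernel(x, y, c=0.0):
    return dot(x, y) + c

def polynomial_kernel(x, y, alpha=1.0, c=1.0, d=2):
    return (alpha * dot(x, y) + c) ** d

def gaussian_kernel(x, y, sigma=1.0):
    return math.exp(-sq_dist(x, y) / (2.0 * sigma ** 2))

def exponential_kernel(x, y, sigma=1.0):
    # Same as the Gaussian kernel but without squaring the distance.
    return math.exp(-math.sqrt(sq_dist(x, y)) / (2.0 * sigma ** 2))

x = [1.0, 2.0]
y = [2.0, 0.0]

print(linear_kernel(x, y))
print(polynomial_kernel(x, y))
print(gaussian_kernel(x, y))
print(exponential_kernel(x, y))

# Why sigma needs tuning: too small and every pair looks unrelated,
# too large and every pair looks identical.
narrow = gaussian_kernel(x, y, sigma=0.1)
wide = gaussian_kernel(x, y, sigma=10.0)
print(narrow)
print(wide)
```

With a very small sigma the Gaussian kernel value for two distinct points is essentially 0, and with a very large sigma it is close to 1, so in both extremes the kernel stops carrying useful information about distance.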

Monday, May 27, 2013

UFLDL sparse auto-encoder exercise.

The hardest part is writing the backpropagation; with that done, everything else is fine.

Not sure why the final training always hits the maximum of 400 iterations,
but the final visualization is fine.

Friday, May 24, 2013

python google class.

python strings:
-------------------------------------------------------------------------
1. 
The built-in string class is named str. Single or double quotes are both OK. Use """ for multi-line strings.

2. 
Strings are immutable.

3. 
Zero-based indexing with [] to access the characters of a string.
ex:
s = 'hello'   s[1] is 'e'

4. 
Don't use "len" as a variable name; it shadows the built-in len() function!

5. 
Use + to concatenate two strings.

6.
A raw string (backslashes are kept literally):
raw = r'this\t\n and that'
print raw

7. 
string slicing:

s = 'hello' 
print s[1:4]
print s[1:]
print s[:]
print s[1:100]

Note the last one: an out-of-range slice end is clipped to the end of the string, so it prints 'ello'.

8. 
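String methods don't modify the original string (strings are immutable); they return a new string.
List methods such as append() do modify the list in place, and they return None:

lst = [1, 2, 3]
print lst.append(4) # this doesn't work: it prints None, lst itself is now [1, 2, 3, 4]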
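
The string notes above, collected into one small runnable sketch. It uses print(...) with a single argument so it runs the same way under Python 2.7 and Python 3; the variable names are just for illustration.

```python
s = 'hello'

# Indexing is zero-based.
second_char = s[1]             # 'e'

# Slices that run past the end are clipped instead of raising an error.
middle = s[1:4]                # 'ell'
clipped = s[1:100]             # 'ello'

# A raw string keeps the backslashes as ordinary characters.
raw = r'this\t\n and that'

# Strings are immutable: methods return a new string, s is untouched.
upper = s.upper()              # 'HELLO'

# Item assignment on a string raises TypeError.
try:
    s[0] = 'j'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# list.append() changes the list in place and returns None.
numbers = [1, 2, 3]
result = numbers.append(4)

print(second_char)
print(middle)
print(clipped)
print(raw)
print(upper)
print(s)
print(assignment_failed)
print(result)
print(numbers)
```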


python sorting 
---------------------------------------------------------------------------------
1. 
sorted(a) returns a new sorted list and doesn't change the original list.

2. 
The sorted() function is recommended over the older list.sort() method, because it accepts any iterable collection as input.

3. 
sorted() compares strings lexicographically by character code (ASCII),
so capital letters sort before lowercase ones.

4. 
sort by length:

strs = ['ccc', 'aaa', 'd', 'bb']
print sorted(strs, key=len)

5. 
custom key function. 

strs = ['xc', 'zb', 'yd', 'wa']
def myFn(s):
    return s[-1]
print sorted(strs, key = myFn)
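
A short sketch that pulls the sorting notes together. The case-insensitive sort with key=str.lower is an extra I added, not something from the notes above; everything else mirrors the points in this section.

```python
a = [5, 1, 4, 3]

# sorted() returns a new list and leaves the original alone.
b = sorted(a)

# sorted() accepts any iterable, for example a tuple or a string.
from_tuple = sorted((3, 1, 2))
letters = sorted('bca')

# Uppercase letters have smaller character codes than lowercase ones.
mixed = ['bb', 'CC', 'aa', 'BB']
by_code = sorted(mixed)

# Extra (my addition): a key function gives a case-insensitive sort.
ignoring_case = sorted(mixed, key=str.lower)

# Sort by length; ties keep their original order because the sort is stable.
by_length = sorted(['ccc', 'aaa', 'd', 'bb'], key=len)

# list.sort() sorts in place and returns None.
sort_result = a.sort()

print(b)
print(from_tuple)
print(letters)
print(by_code)
print(ignoring_case)
print(by_length)
print(sort_result)
print(a)
```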



Tuples
-----------------------------------------------------------------
1.
fixed size grouping of elements. 

2. 
Tuples are immutable and don't change size.

3. 
A function that needs to return a fixed number of values can just return a tuple.

4.
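t = (1, 2, 'hi')   # named t, because "tuple" would shadow the built-in type
print len(t)
print t[2]
t[2] = 'bye' # not going to work: raises TypeError, tuples are immutable
t = (1, 2, 'bye') # this works: it rebinds the name to a new tuple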

5.
To create a single-element tuple, the lone element has to be followed by a comma:
t = ('hi',)
The comma distinguishes the tuple from a parenthesized string.
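
A small sketch of the tuple points above: returning several values from a function, the single-element comma, and immutability. The function min_max is a made-up example.

```python
def min_max(values):
    """Return two values at once by packing them into a tuple."""
    return min(values), max(values)

# The returned tuple can be unpacked directly into separate names.
low, high = min_max([3, 1, 4, 1, 5])

# With the trailing comma this is a tuple; without it, just a string.
single = ('hi',)
not_a_tuple = ('hi')

# Item assignment on a tuple raises TypeError.
t = (1, 2, 'hi')
try:
    t[2] = 'bye'
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(low)
print(high)
print(type(single).__name__)
print(type(not_a_tuple).__name__)
print(assignment_failed)
```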

----------------------------------
done 






Thursday, May 23, 2013

Installation of Theano.

I am learning deep learning, and the first thing to do is to install Theano. This is the process:
1.
 go to the link: http://deeplearning.net/software/theano/install.html#install 

2.
 I get this error: dfftpack missing Fortran compiler. I download the gfortran pkg for Mac and install gfortran.

 3. run sudo pip install Theano
Now it is running with gfortran...
but there are still a lot of errors.

4. Then I get a lot of errors about missing scipy dependencies.

5. run the command sudo port install py27-numpy +atlas py27-scipy +atlas py27-pip 

but I get these two errors:
Error: org.macports.build for port llvm-3.3 returned: command execution failed
Error: Failed to install llvm-3.3

6.
run sudo port install llvm-2.9 



==================================
Alright, so I failed to install it on the Mac; I spent the whole afternoon on Ubuntu 12.04 instead.

1. Install libblas and liblapack from the Software Center.
2. Install scipy with the command python setup.py install --user; this installs scipy into my local user directory.
3. Install numpy similarly.
4. Install python nose from the Software Center.
5. Then I can test numpy using python -c "import numpy;  numpy.test()", which gives the following output:

NumPy version 1.7.0
NumPy is installed in /fs/narahomes/wenhoujx/.local/lib/python2.7/site-packages/numpy
Python version 2.7.3 (default, Aug  1 2012, 05:14:39) [GCC 4.6.3]
nose version 1.1.2
6. Test scipy using python -c "import scipy; scipy.test()". The output:
Running unit tests for scipy
NumPy version 1.7.0
NumPy is installed in /fs/narahomes/wenhoujx/.local/lib/python2.7/site-packages/numpy
SciPy version 0.12.0
SciPy is installed in /fs/narahomes/wenhoujx/.local/lib/python2.7/site-packages/scipy
Python version 2.7.3 (default, Aug  1 2012, 05:14:39) [GCC 4.6.3]
nose version 1.1.2
/fs/narahomes/wenhoujx/.local/lib/python2.7/site-packages/numpy/lib/utils.py:139: DeprecationWarning: `scipy.lib.blas` is deprecated, use `scipy.linalg.blas` instead!
  warnings.warn(depdoc, DeprecationWarning)
/fs/narahomes/wenhoujx/.local/lib/python2.7/site-packages/numpy/lib/utils.py:139: DeprecationWarning: `scipy.lib.lapack` is deprecated, use `scipy.linalg.lapack` instead!
  warnings.warn(depdoc, DeprecationWarning)

7. Now install Theano: download it from https://pypi.python.org/pypi/Theano#downloads
then untar it and run python setup.py install --user

8. test theano 


wenhoujx@narawks16:/scratch0/pylib$ python -c "import theano; theano.test()"
Theano version 0.6.0rc3
theano is installed in /fs/narahomes/wenhoujx/.local/lib/python2.7/site-packages/Theano-0.6.0rc3-py2.7.egg/theano
NumPy version 1.7.0
NumPy is installed in /fs/narahomes/wenhoujx/.local/lib/python2.7/site-packages/numpy
Python version 2.7.3 (default, Aug  1 2012, 05:14:39) [GCC 4.6.3]
nose version 1.1.2

----------------------------------------------------------------------
Ran 0 tests in 0.002s

OK

The problem is that the tests don't run: zero tests.

9. Run pip uninstall theano to remove Theano,
then pip install --user theano to install it in $HOME.
Then run python -c "import theano; theano.test()"
Now everything runs, but with warnings.

It will probably take ~30 min; it is 9:24 pm now.
Finished at 10:31 pm.

Ran 2015 tests in 3131.603s

10. done.
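
A quick way to see which of the packages from the steps above can be imported, and in which version, before running the long test suites. This is only a sketch: installed_version is a made-up helper, and reading __version__ is a convention that most packages follow, not a guarantee.

```python
import importlib

def installed_version(module_name):
    """Return the module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Covers a missing package as well as a broken install.
        return None
    return getattr(module, '__version__', 'unknown')

for name in ('numpy', 'scipy', 'nose', 'theano'):
    version = installed_version(name)
    if version is None:
        print('%s: not importable' % name)
    else:
        print('%s: %s' % (name, version))
```

If a package shows up as not importable after a --user install, the user site-packages directory may not be on the Python path for that interpreter.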











tar error

download a foo.tar.gz

unzip foo.tar.gz

Running tar -xzf foo.tar gives the error:

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

solution:
I renamed foo.tar.gz to foo.tar, then ran tar -zxf foo.tar to get the untarred folder.

source: https://bbs.archlinux.org/viewtopic.php?id=98278
Maybe it's just a simple tar archive and not gzipped and tar does detect this but file roller does not. Have you tried renaming it to archname.tar and then installing it -- maybe gzipping it in between?
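
To check the forum suggestion that the file was a plain tar with a misleading .gz name, one can look at the first two bytes: gzip data always starts with 0x1f 0x8b. Below is a sketch using Python's standard tarfile module, whose 'r:*' mode picks the decompression from the file contents rather than the file name. The helper names and the demo file are made up, and the demo builds its own mislabeled archive in a temporary directory.

```python
import os
import tarfile
import tempfile

GZIP_MAGIC = b'\x1f\x8b'

def is_gzip(path):
    """True if the file starts with the gzip magic bytes."""
    with open(path, 'rb') as f:
        return f.read(2) == GZIP_MAGIC

def extract_any_tar(path, dest):
    """Extract a tar archive whatever its compression (or lack of it).

    Only use extractall() on archives you trust.
    """
    with tarfile.open(path, 'r:*') as archive:
        archive.extractall(dest)

# Demo: create an uncompressed tar that is wrongly named foo.tar.gz.
work = tempfile.mkdtemp()
member = os.path.join(work, 'hello.txt')
with open(member, 'w') as f:
    f.write('hi\n')

mislabeled = os.path.join(work, 'foo.tar.gz')
with tarfile.open(mislabeled, 'w') as archive:   # 'w' means no compression
    archive.add(member, arcname='hello.txt')

out_dir = os.path.join(work, 'out')
os.mkdir(out_dir)
extract_any_tar(mislabeled, out_dir)

print(is_gzip(mislabeled))
print(os.listdir(out_dir))
```

On the command line, GNU tar does the same detection when the z flag is left out, so tar -xf on the file should work for both plain and gzipped archives.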

install theano