Project1: Feature extraction and Clustering

1.Task

  • Extract features (fc7) from a set of images (cifar10)
  • Cluster the extracted features and visualize the clusters

2.Analysis

I found that some people use conv5 features for face clustering, because conv5 activations inherently contain spatial information.

In a CNN model, the fully connected representations have a great deal of this spatial information removed, but they contain more abstract features.

So choosing between conv5 and fc7 may be a trade-off between spatial detail and abstraction.

3.Solutions

Step1. Download the PNG dataset and preprocess it

  • I downloaded the training set (cifar10) and its labels from https://www.kaggle.com/c/cifar-10/data and unzipped the dataset into:

    caffe-master/data/mycifar10

  • After that, I used resizing.py to resize the images from 32x32 to 256x256 for the ImageNet model.
  • Then, to make filename_list.txt, I used the following bash command (run from the caffe-master directory):

    $ find "$(pwd)"/data/mycifar10/resize/train -type f -exec echo {} \; > data/mycifar10/resize/filename_list.txt

  • To strip the absolute path prefix from every line, use:

    $ sed 's#/home/tony/caffe-master/data/mycifar10/resize/##g' data/mycifar10/resize/filename_list.txt > data/mycifar10/resize/filename_list_new.txt
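The find and sed steps above can also be done in one pass from Python. This is only a minimal sketch written for Python 3; the helper name list_relative_files is hypothetical and not part of the project's scripts. It walks the image directory and writes each path relative to the parent of that directory, which is the form the sed command produces (for example train/1.png).

```python
import os
import tempfile

def list_relative_files(root):
    """Return sorted paths of all files under root, relative to root's parent."""
    base = os.path.dirname(os.path.abspath(root))
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            paths.append(os.path.relpath(full, base))
    return sorted(paths)

# Small demo on a throwaway directory instead of the real dataset.
demo_root = os.path.join(tempfile.mkdtemp(), "train")
os.makedirs(demo_root)
for name in ("1.png", "2.png"):
    open(os.path.join(demo_root, name), "w").close()

for path in list_relative_files(demo_root):
    print(path)
```

Writing the returned list to a file, one path per line, gives the equivalent of filename_list_new.txt.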

Step2. Compute the mean file of cifar10 (256x256)

I don't think using data/ilsvrc12/imagenet_mean.binaryproto to extract fc7 from cifar10 is a good idea, so I compute a mean file from the resized cifar10 images instead.

  • Replace the labels in label.csv with numbers from 0 to 9
  • Convert label.csv into the following format:

    number.png labelnumber

    such as:

    1111.png 1

  • Edit and run create_imagenet.sh to make cifar10_256_lmdb
  • Edit and run make_imagenet_mean.sh to compute cifar10_256_mean.binaryproto
  • Edit the paths in binaryproto2npy.py and run it to convert the .binaryproto mean file into .npy format
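The two label-conversion bullets above could be scripted roughly as follows. This is a sketch for Python 3 under stated assumptions: the CSV is assumed to have a header with columns named id and label, and the mapping from class name to number (the ten CIFAR-10 class names in alphabetical order) is an assumption, since the notes do not say which number was given to which class.

```python
import csv
import io

# Assumed mapping: the ten CIFAR-10 class names, numbered 0 to 9 in this order.
CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]
LABEL_TO_INDEX = {name: i for i, name in enumerate(CLASSES)}

def convert_labels(csv_file):
    """Yield 'number.png labelnumber' lines from a CSV with 'id' and 'label' columns."""
    for row in csv.DictReader(csv_file):
        yield "%s.png %d" % (row["id"], LABEL_TO_INDEX[row["label"]])

# Demo on an in-memory sample instead of the real label file.
sample = io.StringIO("id,label\n1,frog\n2,truck\n")
for line in convert_labels(sample):
    print(line)
```

For the real file, the StringIO object would be replaced by an opened label.csv, and the yielded lines written to the list file that create_imagenet.sh reads.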

Step3. Extract features and store them in a txt file

  • Edit and use feature_extract.py to extract fc7 features into .txt files (one txt file per image)
  • Edit and use feature_numpy_combine.py to merge the 50000 txt feature files into one file named fc7_feature_txt_total.txt, which holds a 50000x4097 matrix: 4096 columns are the fc7 features, and the one remaining column is the filename number of the image (the feature_numpy_combine.py listing below stores it as the first column)
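Before clustering, the combined matrix has to be split back into image numbers and features. The sketch below shows one way to do that on a toy array; the function name split_ids_and_features is hypothetical, and the id_column parameter exists because the position of the filename column (first or last) should be checked against the file actually produced.

```python
import numpy as np

def split_ids_and_features(table, id_column=0):
    """Split an N x (D+1) table into an N-vector of ids and an N x D feature matrix."""
    ids = table[:, id_column].astype(int)
    features = np.delete(table, id_column, axis=1)
    return ids, features

# Toy stand-in for the real 50000 x 4097 file: 3 images, 4 feature dimensions.
toy = np.array([[7, 0.1, 0.2, 0.3, 0.4],
                [3, 0.5, 0.6, 0.7, 0.8],
                [9, 0.9, 1.0, 1.1, 1.2]])
ids, feats = split_ids_and_features(toy)
print(ids)
print(feats.shape)
```

With the real file, toy would be replaced by the result of np.loadtxt on fc7_feature_txt_total.txt; pass id_column=-1 if the filename number is stored last.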

Step4. Use t-SNE to reduce the dimension from 4096 to 2

t-SNE first runs PCA to reduce the dimension from 4096 to 20 or 30, then uses t-SNE itself to compute a 50000x2 matrix.
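The PCA stage can be illustrated with a short NumPy sketch. This is not the code from the downloaded tsne.py; it is a variant that uses numpy.linalg.eigh, which is meant for symmetric matrices and returns eigenvalues in ascending order, so the components can be sorted explicitly by variance. Only a D x D scatter matrix is decomposed (4096x4096 for fc7), never an N x N one.

```python
import numpy as np

def pca(X, no_dims=50):
    """Project the N x D array X onto its no_dims leading principal components."""
    X = X - X.mean(axis=0)                       # center every feature
    scatter = np.dot(X.T, X)                     # D x D symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(scatter)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:no_dims]  # indices of the largest eigenvalues
    return np.dot(X, eigvecs[:, order])

# Demo on random data standing in for the fc7 features.
rng = np.random.RandomState(0)
toy = rng.randn(100, 8)
reduced = pca(toy, no_dims=3)
print(reduced.shape)
```

The first output column carries the most variance, the second the next most, and so on.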

  • Download the Python version of t-SNE from https://lvdmaaten.github.io/tsne/
  • Edit tsne.py for PCA and t-SNE
  • The run fails with an out-of-memory error
    • Analysis: I think the SVD function consumes too much memory, e.g. sigma is a 4096*4096 matrix (my personal computer has 8GB of memory and a GTX960)
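A rough size estimate helps to check where the memory may go. The sketch below assumes dense float64 arrays (NumPy's default) and ignores temporaries; the helper name dense_matrix_gib is hypothetical. The tsne.py listing in this document builds several n x n arrays (D and P in x2p, num and Q in the main loop). For n = 50000 each of those needs about 18.6 GiB, while a 4096x4096 matrix needs only 0.125 GiB, so the pairwise matrices may be a more likely cause of the failure on an 8GB machine than the 4096x4096 matrix.

```python
def dense_matrix_gib(rows, cols, bytes_per_value=8):
    """Size in GiB of a dense rows x cols array with the given element size."""
    return rows * cols * bytes_per_value / float(2 ** 30)

n, d = 50000, 4096  # number of images, fc7 dimension
print("features   (n x d): %.3f GiB" % dense_matrix_gib(n, d))
print("scatter    (d x d): %.3f GiB" % dense_matrix_gib(d, d))
print("pairwise P (n x n): %.3f GiB" % dense_matrix_gib(n, n))
```

Under these assumptions, running t-SNE on a random subset of a few thousand images would keep the n x n arrays within memory.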

Step5. Do K-means and visualize the results

Run K-means on the fc7 features (4096 dimensions) with kmeans.py to get a new label for every feature vector, then select 20 examples from each new class to plot.
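The kmeans.py script is not listed in these notes, so the following is only a minimal NumPy sketch of Lloyd's algorithm, not the script that produced the results. The function names, the random initialisation, and the iteration limit are all assumptions. For the full 50000x4096 matrix a library implementation such as scikit-learn's KMeans is likely more practical.

```python
import numpy as np

def assign(X, centers):
    """Return the index of the nearest center for every row of X."""
    d2 = ((X ** 2).sum(axis=1)[:, None]
          - 2.0 * X.dot(centers.T)
          + (centers ** 2).sum(axis=1)[None, :])
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Cluster the rows of X into k groups; returns (labels, centers)."""
    rng = np.random.RandomState(seed)
    # Start from k distinct data points chosen at random.
    centers = X[rng.choice(len(X), k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

# Demo: two tight, well-separated groups of 2-D points.
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.3, 0.1],
                   [10.0, 10.0], [10.2, 9.9], [9.8, 10.3]])
labels, centers = kmeans(points, k=2)
print(labels)
```

For the project, X would be the 50000x4096 feature matrix and k would be 10.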

With count.py, count how many images of each original label fall into each new class produced by K-means.
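count.py is not listed either; a counting step of this kind could look like the sketch below, written for Python 3. The function name and the toy inputs are hypothetical. It builds one histogram of original labels per K-means class.

```python
from collections import Counter, defaultdict

def count_labels_per_cluster(cluster_ids, true_labels):
    """Map each cluster id to a Counter of the original labels it contains."""
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        counts[cluster][label] += 1
    return counts

# Toy example: six images, two K-means classes, three original labels.
clusters = [0, 0, 1, 1, 1, 0]
labels = [3, 3, 5, 3, 5, 7]
table = count_labels_per_cluster(clusters, labels)
for cluster in sorted(table):
    hist = dict(sorted(table[cluster].items()))
    print("New Class %d %s %d" % (cluster, hist, sum(hist.values())))
```

Each printed line has the same layout as the rows in the results section: the class, its label histogram, and its size.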

Step6. Results and analysis

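Data results: each row is one new class from K-means, listing how many images of each original label (0L to 9L, printed as Python 2 long integers) it contains, followed by the size of the class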


New Class 0{0L: 520, 1L: 504, 2L: 509, 3L: 512, 4L: 483, 5L: 512, 6L: 511, 7L: 512, 8L: 559, 9L: 453}  5075
New Class 1{0L: 417, 1L: 423, 2L: 430, 3L: 418, 4L: 423, 5L: 417, 6L: 371, 7L: 439, 8L: 401, 9L: 418}  4157
New Class 2{0L: 691, 1L: 646, 2L: 695, 3L: 703, 4L: 676, 5L: 678, 6L: 653, 7L: 681, 8L: 648, 9L: 710}  6781
New Class 3{0L: 470, 1L: 466, 2L: 456, 3L: 481, 4L: 462, 5L: 458, 6L: 461, 7L: 489, 8L: 475, 9L: 489}  4707
New Class 4{0L: 512, 1L: 514, 2L: 488, 3L: 472, 4L: 510, 5L: 477, 6L: 426, 7L: 492, 8L: 464, 9L: 489}  4844
New Class 5{0L: 423, 1L: 477, 2L: 449, 3L: 426, 4L: 448, 5L: 421, 6L: 451, 7L: 432, 8L: 510, 9L: 422}  4459
New Class 6{0L: 401, 1L: 356, 2L: 393, 3L: 374, 4L: 401, 5L: 410, 6L: 370, 7L: 396, 8L: 349, 9L: 380}  3830
New Class 7{0L: 635, 1L: 608, 2L: 606, 3L: 634, 4L: 623, 5L: 615, 6L: 663, 7L: 595, 8L: 609, 9L: 646}  6234
New Class 8{0L: 595, 1L: 677, 2L: 604, 3L: 643, 4L: 636, 5L: 626, 6L: 708, 7L: 609, 8L: 636, 9L: 627}  6361
New Class 9{0L: 336, 1L: 329, 2L: 370, 3L: 337, 4L: 338, 5L: 386, 6L: 386, 7L: 355, 8L: 349, 9L: 366}  3552
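One way to summarise the table above in a single number is cluster purity: the share of images that carry the majority label of their cluster. This metric is not part of the original scripts; the sketch below just recomputes it from the counts listed above. With ten equally sized original classes purity cannot fall below 0.10, and for this table it comes out at about 0.108, which suggests that the K-means classes barely separate the original labels.

```python
# Rows: new classes 0..9 from K-means. Columns: original labels 0..9.
counts = [
    [520, 504, 509, 512, 483, 512, 511, 512, 559, 453],
    [417, 423, 430, 418, 423, 417, 371, 439, 401, 418],
    [691, 646, 695, 703, 676, 678, 653, 681, 648, 710],
    [470, 466, 456, 481, 462, 458, 461, 489, 475, 489],
    [512, 514, 488, 472, 510, 477, 426, 492, 464, 489],
    [423, 477, 449, 426, 448, 421, 451, 432, 510, 422],
    [401, 356, 393, 374, 401, 410, 370, 396, 349, 380],
    [635, 608, 606, 634, 623, 615, 663, 595, 609, 646],
    [595, 677, 604, 643, 636, 626, 708, 609, 636, 627],
    [336, 329, 370, 337, 338, 386, 386, 355, 349, 366],
]

total = sum(sum(row) for row in counts)
# Purity: for each cluster take its most frequent label, then sum and normalise.
purity = sum(max(row) for row in counts) / float(total)
print("images: %d, purity: %.4f" % (total, purity))
```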

20 randomly selected images from each class produced by K-means

Python and Shell Scripts

  • resizing.py
#!/usr/bin/python
# -*- coding:utf8 -*-

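import os
import shutil
from PIL import Image  # Pillow; the bare "import Image" only works with the legacy PIL package

to_scale = 0.5      # default scale factor, overwritten by the user's input below
processIndex = 0    # running count of files visited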
def resizeImg(imgPath):
    global processIndex
    fileList = []
    files = os.listdir(imgPath)
    for f in files:
        filePath = imgPath + os.sep + f
        if(os.path.isfile(filePath)):
            fileList.append(f)
        elif(os.path.isdir(filePath)):
            resizeImg(filePath)
    for fileName in fileList:
        processIndex+=1
        fileFullName = imgPath+os.sep+fileName
        suffix = fileName[fileName.rfind('.'):]
        if(suffix == '.png' or suffix == '.jpg'):
            print 'processing the '+str(processIndex)+'th file:'+fileFullName
            img = Image.open(fileFullName)
            w,h = img.size
            tw = int(w * to_scale)
            th = int(h * to_scale)
            reImg = img.resize((tw,th),Image.ANTIALIAS)
            reImg.save(fileFullName)
            del reImg
if __name__ == '__main__':
    # Images are resized in place, so enter 8 to go from 32x32 to 256x256.
    scaleStr = raw_input('input to_scale: ')
    to_scale = float(scaleStr)
    scaledPath = '/home/tony/caffe-master/data/mycifar10/resize/train'
    # The copy step (shutil.copytree) is disabled, so the directory must already
    # hold the images. It must not be deleted here, or there is nothing left to resize.
    if not os.path.isdir(scaledPath):
        print('image dir does not exist: ' + scaledPath)
        raise SystemExit(1)
    resizeImg(scaledPath)
    raw_input("resize success")
  • create_imagenet.sh
#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs

EXAMPLE=/home/tony/caffe-master/examples/fc7clustering
DATA=/home/tony/caffe-master/data/mycifar10/resize
TOOLS=/home/tony/caffe-master/build/tools

TRAIN_DATA_ROOT=/home/tony/caffe-master/data/mycifar10/resize/train/
# VAL_DATA_ROOT=/path/to/imagenet/val/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=false
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

#if [ ! -d "$VAL_DATA_ROOT" ]; then
#  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
#  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
#       "where the ImageNet validation data is stored."
#  exit 1
#fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train_new.txt \
    $EXAMPLE/mycifar10_train_lmdb

#echo "Creating val lmdb..."

#GLOG_logtostderr=1 $TOOLS/convert_imageset \
#    --resize_height=$RESIZE_HEIGHT \
#    --resize_width=$RESIZE_WIDTH \
#    --shuffle \
#    $VAL_DATA_ROOT \
#    $DATA/val.txt \
#    $EXAMPLE/ilsvrc12_val_lmdb

echo "Done."
  • make_imagenet_mean.sh
#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12

EXAMPLE=/home/tony/caffe-master/examples/fc7clustering
DATA=/home/tony/caffe-master/data/mycifar10/resize
TOOLS=/home/tony/caffe-master/build/tools

$TOOLS/compute_image_mean $EXAMPLE/mycifar10_train_lmdb \
  $DATA/mycifar10_mean.binaryproto

echo "Done."
  • binaryproto2npy.py
#!/usr/bin/python
# -*- coding:utf8 -*-
import caffe
import numpy as np

MEAN_PROTO_PATH = '/home/tony/caffe-master/data/mycifar10/resize/mycifar10_mean.binaryproto'               # path of the protobuf-format image mean file to convert
MEAN_NPY_PATH = '/home/tony/caffe-master/data/mycifar10/resize/mycifar10_mean.npy'                         # path of the converted numpy-format image mean file

blob = caffe.proto.caffe_pb2.BlobProto()           # create a protobuf blob
data = open(MEAN_PROTO_PATH, 'rb' ).read()         # read the contents of the mean.binaryproto file
blob.ParseFromString(data)                         # parse the file contents into the blob

array = np.array(caffe.io.blobproto_to_array(blob))# convert the mean in the blob to numpy format; array has shape (mean_number, channel, height, width)
mean_npy = array[0]                                # one array can hold several sets of means, so select one set by index
np.save(MEAN_NPY_PATH ,mean_npy)
  • feature_extract.py
#coding:utf-8
import os
import sys

caffe_root = '../'
# Make the pycaffe package importable before importing caffe
sys.path.insert(0, caffe_root + 'python')

import caffe
import numpy as np

# prototxt that defines the network to run
deployPrototxt =  '/home/tony/caffe-master/models/bvlc_reference_caffenet/deploy.prototxt'
# the corresponding model weights file to load
modelFile = '/home/tony/caffe-master/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
# mean file; a self-generated one can be used as well
meanFile = '/home/tony/caffe-master/data/mycifar10/resize/mycifar10_mean.npy'
# list of images to extract features from
imageListFile = '/home/tony/caffe-master/data/mycifar10/resize/filename_list_1.txt'
imageBasePath = '/home/tony/caffe-master/data/mycifar10/resize/train/'
gpuID = 0
#postfix = '.classify_allCar1716_fc6'
postfix = 'fc7.txt'

# Initialization: select the GPU and load the network
def initilize():
    print('initilize ... ')
    caffe.set_mode_gpu()
    caffe.set_device(gpuID)
    net = caffe.Net(deployPrototxt, modelFile,caffe.TEST)
    return net
# Extract features and save them to the corresponding files
def extractFeature(imageList, net):
    # Adjust the input data to what the network expects (channel order, size, etc.)
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2,0,1))                       # HxWxC -> CxHxW
    transformer.set_mean('data', np.load(meanFile).mean(1).mean(1)) # mean pixel
    transformer.set_raw_scale('data', 255)                          # [0,1] -> [0,255]
    transformer.set_channel_swap('data', (2,1,0))                   # RGB -> BGR
    # Set the net to a batch size of 1; with many images, choose a suitable batch size
    net.blobs['data'].reshape(1,3,227,227)      # set as needed; adjust if it does not match the network
    num=0
    for imagefile in imageList:

        imagefile_abs = os.path.join(imageBasePath, imagefile)
        print imagefile_abs
        net.blobs['data'].data[...] = transformer.preprocess('data', caffe.io.load_image(imagefile_abs))
        out = net.forward()
        fea_file = imagefile_abs.replace('.png',postfix)
        num +=1
        print 'Num ',num,' extract feature ',fea_file
        # with  open(fea_file,'wb') as f:
        #     for x in xrange(0, net.blobs['fc7'].data.shape[0]):
        #         for y in xrange(0, net.blobs['fc7'].data.shape[1]):
        #             f.write(struct.pack('f', net.blobs['fc7'].data[x,y]))
        np.savetxt(fea_file, net.blobs['fc7'].data)

# Read the list of image files
def readImageList(imageListFile):
    imageList = []
    with open(imageListFile,'r') as fi:
        while(True):
            line = fi.readline().strip().split()# every line is a image file name
            if not line:
                break
            imageList.append(line[0]) 
    print 'read imageList done image num ', len(imageList)
    return imageList

if __name__ == "__main__":
    net = initilize()
    imageList = readImageList(imageListFile) 
    extractFeature(imageList, net)
  • feature_numpy_combine.py
#coding:utf-8
import numpy as np
import matplotlib.pyplot as plt
import os
import sys
import pickle
import struct
featureTxtListFile = '/home/tony/caffe-master/data/mycifar10/resize/fc7_feature_txt_filenamelist_1.txt'
featureTxtBasePath = '/home/tony/caffe-master/data/mycifar10/resize/fc7_feature_txt/'
featureTxtStoreBasePath = '/media/tony/Seagate Backup Plus Drive/zsy/'
postfix = ''

def narray_vstack(featuretxtList):
    rows = []  # one 4097-element row per image: [image number, 4096 fc7 values]
    num = 0
    for featuretxtfile in featuretxtList:
        featuretxtfile_abs = os.path.join(featureTxtBasePath,featuretxtfile)
        # the image number is the file name without the 'fc7.txt' suffix
        imagenumber = float(int(featuretxtfile.replace('fc7.txt', postfix)))

        temp_array = np.loadtxt(featuretxtfile_abs)
        rows.append(np.append([imagenumber], temp_array))
        num += 1
        if num % 500 == 0:
            print('txt loaded number: %d' % num)

    # Stack once at the end: faster than growing the array row by row, and it
    # avoids leaving a placeholder first row in the saved file.
    total_array = np.vstack(rows)
    np.savetxt(os.path.join( featureTxtStoreBasePath,'fc7_feature_txt_total.txt'), total_array)


def readFeatureTxtList(featureTxtListFile):
    featuretxtList = []
    with open(featureTxtListFile,'r') as fi:
        while(True):
            line = fi.readline().strip().split()# every line is a image file name
            if not line:
                break
            featuretxtList.append(line[0]) 
    print 'read feattxtList done featuretxt num ', len(featuretxtList)
    return featuretxtList

if __name__ == '__main__':
    featuretxtList = readFeatureTxtList(featureTxtListFile)
    narray_vstack(featuretxtList)
  • tsne.py
#
#  tsne.py
#
# Implementation of t-SNE in Python. The implementation was tested on Python 2.7.10, and it requires a working
# installation of NumPy. The implementation comes with an example on the MNIST dataset. In order to plot the
# results of this example, a working installation of matplotlib is required.
#
# The example can be run by executing: `ipython tsne.py`
#
#
#  Created by Laurens van der Maaten on 20-12-08.
#  Copyright (c) 2008 Tilburg University. All rights reserved.

import numpy as Math
import pylab as Plot

def Hbeta(D = Math.array([]), beta = 1.0):
    """Compute the perplexity and the P-row for a specific value of the precision of a Gaussian distribution."""

    # Compute P-row and corresponding perplexity
    P = Math.exp(-D.copy() * beta);
    sumP = sum(P);
    H = Math.log(sumP) + beta * Math.sum(D * P) / sumP;
    P = P / sumP;
    return H, P;


def x2p(X = Math.array([]), tol = 1e-5, perplexity = 30.0):
    """Performs a binary search to get P-values in such a way that each conditional Gaussian has the same perplexity."""

    # Initialize some variables
    print "Computing pairwise distances..."
    (n, d) = X.shape;
    sum_X = Math.sum(Math.square(X), 1);
    D = Math.add(Math.add(-2 * Math.dot(X, X.T), sum_X).T, sum_X);
    P = Math.zeros((n, n));
    beta = Math.ones((n, 1));
    logU = Math.log(perplexity);

    # Loop over all datapoints
    for i in range(n):

        # Print progress
        if i % 500 == 0:
            print "Computing P-values for point ", i, " of ", n, "..."

        # Compute the Gaussian kernel and entropy for the current precision
        betamin = -Math.inf;
        betamax =  Math.inf;
        Di = D[i, Math.concatenate((Math.r_[0:i], Math.r_[i+1:n]))];
        (H, thisP) = Hbeta(Di, beta[i]);

        # Evaluate whether the perplexity is within tolerance
        Hdiff = H - logU;
        tries = 0;
        while Math.abs(Hdiff) > tol and tries < 50:

            # If not, increase or decrease precision
            if Hdiff > 0:
                betamin = beta[i].copy();
                if betamax == Math.inf or betamax == -Math.inf:
                    beta[i] = beta[i] * 2;
                else:
                    beta[i] = (beta[i] + betamax) / 2;
            else:
                betamax = beta[i].copy();
                if betamin == Math.inf or betamin == -Math.inf:
                    beta[i] = beta[i] / 2;
                else:
                    beta[i] = (beta[i] + betamin) / 2;

            # Recompute the values
            (H, thisP) = Hbeta(Di, beta[i]);
            Hdiff = H - logU;
            tries = tries + 1;

        # Set the final row of P
        P[i, Math.concatenate((Math.r_[0:i], Math.r_[i+1:n]))] = thisP;

    # Return final P-matrix
    print "Mean value of sigma: ", Math.mean(Math.sqrt(1 / beta));
    return P;


def pca(X = Math.array([]), no_dims = 50):
    """Runs PCA on the NxD array X in order to reduce its dimensionality to no_dims dimensions."""

    print "Preprocessing the data using PCA..."
    (n, d) = X.shape;
    X = X - Math.tile(Math.mean(X, 0), (n, 1));
    (l, M) = Math.linalg.eig(Math.dot(X.T, X));
    Y = Math.dot(X, M[:,0:no_dims]);
    return Y;


def tsne(X = Math.array([]), no_dims = 2, initial_dims = 50, perplexity = 30.0):
    """Runs t-SNE on the dataset in the NxD array X to reduce its dimensionality to no_dims dimensions.
    The syntaxis of the function is Y = tsne.tsne(X, no_dims, perplexity), where X is an NxD NumPy array."""

    # Check inputs
    if isinstance(no_dims, float):
        print "Error: array X should have type float.";
        return -1;
    if round(no_dims) != no_dims:
        print "Error: number of dimensions should be an integer.";
        return -1;

    # Initialize variables
    X = pca(X, initial_dims).real;
    (n, d) = X.shape;
    max_iter = 1000;
    initial_momentum = 0.5;
    final_momentum = 0.8;
    eta = 500;
    min_gain = 0.01;
    Y = Math.random.randn(n, no_dims);
    dY = Math.zeros((n, no_dims));
    iY = Math.zeros((n, no_dims));
    gains = Math.ones((n, no_dims));

    # Compute P-values
    P = x2p(X, 1e-5, perplexity);
    P = P + Math.transpose(P);
    P = P / Math.sum(P);
    P = P * 4;                                    # early exaggeration
    P = Math.maximum(P, 1e-12);

    # Run iterations
    for iter in range(max_iter):

        # Compute pairwise affinities
        sum_Y = Math.sum(Math.square(Y), 1);
        num = 1 / (1 + Math.add(Math.add(-2 * Math.dot(Y, Y.T), sum_Y).T, sum_Y));
        num[range(n), range(n)] = 0;
        Q = num / Math.sum(num);
        Q = Math.maximum(Q, 1e-12);

        # Compute gradient
        PQ = P - Q;
        for i in range(n):
            dY[i,:] = Math.sum(Math.tile(PQ[:,i] * num[:,i], (no_dims, 1)).T * (Y[i,:] - Y), 0);

        # Perform the update
        if iter < 20:
            momentum = initial_momentum
        else:
            momentum = final_momentum
        gains = (gains + 0.2) * ((dY > 0) != (iY > 0)) + (gains * 0.8) * ((dY > 0) == (iY > 0));
        gains[gains < min_gain] = min_gain;
        iY = momentum * iY - eta * (gains * dY);
        Y = Y + iY;
        Y = Y - Math.tile(Math.mean(Y, 0), (n, 1));

        # Compute current value of cost function
        if (iter + 1) % 10 == 0:
            C = Math.sum(P * Math.log(P / Q));
            print "Iteration ", (iter + 1), ": error is ", C

        # Stop lying about P-values
        if iter == 100:
            P = P / 4;

    # Return solution
    return Y;


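if __name__ == "__main__":
    print "Run Y = tsne.tsne(X, no_dims, perplexity) to perform t-SNE on your dataset."
    print "Running data on fc7 of cifar10 train dataset..."
    # X must be the N x 4096 feature matrix and labels the matching N-vector of class ids.
    # The two file names are kept as in the original script; check which file holds which.
    X = Math.loadtxt("fc7_feature_label.txt");
    labels = Math.loadtxt("fc7_feature.txt");
    Y = tsne(X, 2, 50, 20.0);    # 2 output dims, 50 PCA dims, perplexity 20
    Plot.scatter(Y[:,0], Y[:,1], 20, labels);
    Plot.show();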
