Project1: Feature extraction and Clustering
1.Task
- Extract features (fc7) from a set of images (cifar10)
- Do clustering and visualize the clusters
2.Analysis
I found that some people use conv5 features for face clustering, because conv5 inherently contains spatial information.
In a CNN model, the fully connected representations have a great deal of this spatial information removed, but they contain more abstract features.
So the choice of layer is probably a trade-off.
3.Solutions
Step1. Download PNG dataset and Preprocessing
- I downloaded the train dataset (cifar10) and its labels from https://www.kaggle.com/c/cifar-10/data and unzipped the cifar10 dataset into:
caffe-master/data/mycifar10
- After that, I used resizing.py to resize the images from 32x32 to 256x256 for the ImageNet model (a scale factor of 8).
- Then, to make filename_list.txt, I used the following bash command:
$ find `pwd`/data/mycifar10/resize/train -type f -exec echo {} \; > data/mycifar10/resize/filename_list.txt
- To delete the absolute path prefix, use:
$ sed 's#/home/tony/caffe-master/data/mycifar10/resize/##g' data/mycifar10/resize/filename_list.txt > data/mycifar10/resize/filename_list_new.txt
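The find and sed commands above can also be combined into one pass. The following is a minimal sketch in Python 3 (unlike the Python 2 scripts listed later), assuming the directory layout described in this step. The function name `write_filename_list` is hypothetical and is not one of the project scripts.

```python
import os


def write_filename_list(root_dir, list_path):
    """Write every file under root_dir to list_path, one path per line.

    Paths are written relative to the parent of root_dir, e.g. 'train/1.png',
    which matches the result of stripping the absolute prefix with sed.
    """
    root_abs = os.path.abspath(root_dir)
    base = os.path.dirname(root_abs)
    rel_paths = []
    for dirpath, _dirnames, filenames in os.walk(root_abs):
        for name in filenames:
            rel_paths.append(os.path.relpath(os.path.join(dirpath, name), base))
    rel_paths.sort()  # os.walk order is not guaranteed, so sort for stability
    with open(list_path, "w") as out:
        for rel in rel_paths:
            out.write(rel + "\n")
    return rel_paths
```

For example, `write_filename_list('data/mycifar10/resize/train', 'data/mycifar10/resize/filename_list_new.txt')` would be the equivalent call under the paths used above.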
Step2. Compute mean file of cifar10(256)
I don't think using data/ilsvrc12/imagenet_mean.binaryproto to extract fc7 from cifar10 is a good idea, so I computed a mean file from the resized cifar10 images instead.
- Replace the labels in label.csv with numbers from 0 to 9, and convert label.csv into the following format:
number.png labelnumber
such as:
1111.png 1
- Edit and run create_imagenet.sh to make cifar10_256_lmdb
- Edit and run make_imagenet_mean.sh to compute cifar10_256_mean.binaryproto
- Edit the paths and run binaryproto2npy.py to convert the .binaryproto mean file into .npy format
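The label conversion described above could be scripted roughly as follows. This is a Python 3 sketch under two assumptions that the text does not confirm: that the CSV has a header row `id,label`, and that class names map to 0..9 in alphabetical order. The function name `convert_labels` is hypothetical.

```python
import csv

# Assumed class order: the ten CIFAR-10 class names sorted alphabetically.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]
CLASS_TO_INDEX = {name: idx for idx, name in enumerate(CLASS_NAMES)}


def convert_labels(csv_path, out_path):
    """Turn CSV rows like '1111,automobile' into lines like '1111.png 1'."""
    lines = []
    with open(csv_path, newline="") as src:
        reader = csv.DictReader(src)  # assumes a header row: id,label
        for row in reader:
            lines.append("%s.png %d" % (row["id"], CLASS_TO_INDEX[row["label"]]))
    with open(out_path, "w") as dst:
        dst.write("\n".join(lines) + "\n")
    return lines
```

An unknown class name raises a `KeyError`, which is deliberate: a silent default would put wrong labels into the LMDB.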
Step3. Extract features and Store features into a txt file
- Edit and use feature_extract.py to extract the fc7 features into .txt files (one txt file per image)
- Edit and use feature_numpy_combine.py to merge the 50000 txt feature files into one file named fc7_feature_txt_total.txt, which holds a 50000x4097 array: 4096 columns are the fc7 features, and the remaining column is the filename number of the image (feature_numpy_combine.py as listed writes it as the first column)
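feature_numpy_combine.py, as listed in the scripts section, grows the result with `np.vstack` inside the loop, which copies every accumulated row on each iteration. A sketch of the same merge with a preallocated array is shown here. It is written for Python 3, the function name `combine_features` is hypothetical, and it assumes the image number goes in column 0 as in the listed script.

```python
import numpy as np


def combine_features(feature_paths, image_numbers, dim=4096):
    """Stack per-image feature files into one (N, dim + 1) array.

    Column 0 holds the image number; columns 1..dim hold the features.
    """
    total = np.empty((len(feature_paths), dim + 1), dtype=np.float64)
    for row, (path, number) in enumerate(zip(feature_paths, image_numbers)):
        total[row, 0] = number
        # ravel() accepts files saved as a single row or a single column
        total[row, 1:] = np.loadtxt(path).ravel()
    return total
```

The array can then be written once with `np.savetxt`, or with `np.save` if a binary file is acceptable; the binary format is an alternative to the text file used in this project, not what the original script does.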
Step4. Use t-SNE to reduce dimension from 4096 to 2
The t-SNE script first runs PCA to reduce the dimension from 4096 to 20 or 30, then uses t-SNE to compute a 50000x2 matrix.
- Download the Python version of t-SNE from https://lvdmaaten.github.io/tsne/
- Edit tsne.py for PCA and t-SNE
- Result: it ran out of memory
- Analysis: I think the SVD function consumes too much memory, e.g. sigma is a 4096*4096 matrix (my personal computer: 8GB memory and a GTX960)
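A rough back-of-the-envelope check of where the memory goes may help here. The sketch below assumes dense float64 arrays (NumPy's default) and the matrix shapes that appear in the tsne.py listing; it ignores temporaries, so real usage would be higher. The helper `array_bytes` is hypothetical.

```python
def array_bytes(rows, cols, bytes_per_value=8):
    """Size in bytes of a dense rows x cols array (float64 by default)."""
    return rows * cols * bytes_per_value


GIB = 1024 ** 3
n, d = 50000, 4096  # number of images, fc7 dimension

sizes = [
    ("X (n x d feature matrix)", array_bytes(n, d)),
    ("X^T X (d x d, input to eig in pca)", array_bytes(d, d)),
    ("D or P (n x n, built in x2p)", array_bytes(n, n)),
]
for name, size in sizes:
    print("%-36s %8.2f GiB" % (name, size / GIB))
```

Under these assumptions the d x d matrix is small, the feature matrix and its centered copy take a few GiB, and each n x n pairwise matrix needs far more than 8GB. So even if the PCA step succeeded, this t-SNE implementation would probably still not fit 50000 points in memory; running it on a subsample is one possible workaround.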
Step5. Do Kmeans and visualize the results
Run k-means on the fc7 features (4096 dimensions) with kmeans.py to get a new label for every feature vector, then select 20 examples from each new class to plot.
With count.py, count how many images of each original label fall into each new class assigned by k-means.
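kmeans.py is not reproduced in this write-up. For reference, the clustering step can be sketched as plain Lloyd's k-means in NumPy. This is an illustrative Python 3 sketch, not the author's script; the function name, the random initialization, and the iteration limit are all assumptions.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm. Returns (labels, centers)."""
    rng = np.random.RandomState(seed)
    # Initialize the centers with k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(np.float64)
    labels = np.zeros(len(X), dtype=np.int64)
    for it in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((X ** 2).sum(axis=1)[:, None]
                 - 2.0 * X.dot(centers.T)
                 + (centers ** 2).sum(axis=1)[None, :])
        new_labels = dists.argmin(axis=1)
        if it > 0 and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers
```

With the combined feature array, the call would look like `labels, centers = kmeans(features, 10)`, where `features` is the 50000x4096 block without the image-number column.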
Step6. Results and analysis
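Data results (for each new class from k-means: the number of images per original label, followed by the class total)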
New Class 0{0L: 520, 1L: 504, 2L: 509, 3L: 512, 4L: 483, 5L: 512, 6L: 511, 7L: 512, 8L: 559, 9L: 453} 5075
New Class 1{0L: 417, 1L: 423, 2L: 430, 3L: 418, 4L: 423, 5L: 417, 6L: 371, 7L: 439, 8L: 401, 9L: 418} 4157
New Class 2{0L: 691, 1L: 646, 2L: 695, 3L: 703, 4L: 676, 5L: 678, 6L: 653, 7L: 681, 8L: 648, 9L: 710} 6781
New Class 3{0L: 470, 1L: 466, 2L: 456, 3L: 481, 4L: 462, 5L: 458, 6L: 461, 7L: 489, 8L: 475, 9L: 489} 4707
New Class 4{0L: 512, 1L: 514, 2L: 488, 3L: 472, 4L: 510, 5L: 477, 6L: 426, 7L: 492, 8L: 464, 9L: 489} 4844
New Class 5{0L: 423, 1L: 477, 2L: 449, 3L: 426, 4L: 448, 5L: 421, 6L: 451, 7L: 432, 8L: 510, 9L: 422} 4459
New Class 6{0L: 401, 1L: 356, 2L: 393, 3L: 374, 4L: 401, 5L: 410, 6L: 370, 7L: 396, 8L: 349, 9L: 380} 3830
New Class 7{0L: 635, 1L: 608, 2L: 606, 3L: 634, 4L: 623, 5L: 615, 6L: 663, 7L: 595, 8L: 609, 9L: 646} 6234
New Class 8{0L: 595, 1L: 677, 2L: 604, 3L: 643, 4L: 636, 5L: 626, 6L: 708, 7L: 609, 8L: 636, 9L: 627} 6361
New Class 9{0L: 336, 1L: 329, 2L: 370, 3L: 337, 4L: 338, 5L: 386, 6L: 386, 7L: 355, 8L: 349, 9L: 366} 3552
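One way to summarize the table above is cluster purity: the fraction of images whose original label is the majority label of their k-means class. The sketch below copies the counts from the table; the purity metric itself is an addition for illustration and is not part of count.py.

```python
# Rows: k-means class 0..9; columns: original CIFAR-10 label 0..9.
COUNTS = [
    [520, 504, 509, 512, 483, 512, 511, 512, 559, 453],
    [417, 423, 430, 418, 423, 417, 371, 439, 401, 418],
    [691, 646, 695, 703, 676, 678, 653, 681, 648, 710],
    [470, 466, 456, 481, 462, 458, 461, 489, 475, 489],
    [512, 514, 488, 472, 510, 477, 426, 492, 464, 489],
    [423, 477, 449, 426, 448, 421, 451, 432, 510, 422],
    [401, 356, 393, 374, 401, 410, 370, 396, 349, 380],
    [635, 608, 606, 634, 623, 615, 663, 595, 609, 646],
    [595, 677, 604, 643, 636, 626, 708, 609, 636, 627],
    [336, 329, 370, 337, 338, 386, 386, 355, 349, 366],
]


def purity(counts):
    """Fraction of images whose label is the majority label of their class."""
    total = sum(sum(row) for row in counts)
    majority = sum(max(row) for row in counts)
    return majority / float(total)


print("images: %d" % sum(sum(row) for row in COUNTS))
print("purity: %.4f" % purity(COUNTS))
```

With ten equally sized original classes, a random assignment would give a purity near 0.1. The value computed from this table is about 0.108, which suggests that these k-means classes carry almost no information about the original labels.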
[Figure: 20 randomly selected images from each class assigned by k-means]
Python and Shell Scripts
- resizing.py
#!/usr/bin/python
# -*- coding:utf8 -*-
# Resize every .png/.jpg under a directory in place (Python 2, PIL/Pillow).
import os
import sys
from PIL import Image

to_scale = 0.5    # default scale factor, overwritten by the user input below
processIndex = 0  # running count of files visited

def resizeImg(imgPath):
    global processIndex
    fileList = []
    files = os.listdir(imgPath)
    for f in files:
        filePath = imgPath + os.sep + f
        if(os.path.isfile(filePath)):
            fileList.append(f)
        elif(os.path.isdir(filePath)):
            resizeImg(filePath)  # recurse into subdirectories
    for fileName in fileList:
        processIndex+=1
        fileFullName = imgPath+os.sep+fileName
        suffix = fileName[fileName.rfind('.'):]
        if(suffix == '.png' or suffix == '.jpg'):
            print 'processing the '+str(processIndex)+'th file:'+fileFullName
            img = Image.open(fileFullName)
            w,h = img.size
            tw = int(w * to_scale)
            th = int(h * to_scale)
            reImg = img.resize((tw,th),Image.ANTIALIAS)
            reImg.save(fileFullName)  # overwrite the original file
            del reImg

if __name__ == '__main__':
    scaleStr = raw_input('input to_scale: ')  # e.g. 8 for 32x32 -> 256x256
    to_scale = float(scaleStr)
    scaledPath = '/home/tony/caffe-master/data/mycifar10/resize/train'
    # The images are resized in place, so the directory must already exist.
    # The original prompt to delete it was dropped: with the copytree step
    # commented out, deleting the directory would remove the images themselves.
    if not os.path.isdir(scaledPath):
        sys.exit('directory not found: ' + scaledPath)
    resizeImg(scaledPath)
    raw_input("resize success")
- create_imagenet.sh
#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
EXAMPLE=/home/tony/caffe-master/examples/fc7clustering
DATA=/home/tony/caffe-master/data/mycifar10/resize
TOOLS=/home/tony/caffe-master/build/tools
TRAIN_DATA_ROOT=/home/tony/caffe-master/data/mycifar10/resize/train/
# VAL_DATA_ROOT=/path/to/imagenet/val/
# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=false
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

#if [ ! -d "$VAL_DATA_ROOT" ]; then
#  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
#  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
#       "where the ImageNet validation data is stored."
#  exit 1
#fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train_new.txt \
    $EXAMPLE/mycifar10_train_lmdb
#echo "Creating val lmdb..."
#GLOG_logtostderr=1 $TOOLS/convert_imageset \
# --resize_height=$RESIZE_HEIGHT \
# --resize_width=$RESIZE_WIDTH \
# --shuffle \
# $VAL_DATA_ROOT \
# $DATA/val.txt \
# $EXAMPLE/ilsvrc12_val_lmdb
echo "Done."
- make_imagenet_mean.sh
#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12
EXAMPLE=/home/tony/caffe-master/examples/fc7clustering
DATA=/home/tony/caffe-master/data/mycifar10/resize
TOOLS=/home/tony/caffe-master/build/tools
$TOOLS/compute_image_mean $EXAMPLE/mycifar10_train_lmdb \
$DATA/mycifar10_mean.binaryproto
echo "Done."
- binaryproto2npy.py
#!/usr/bin/python
# -*- coding:utf8 -*-
import caffe
import numpy as np
MEAN_PROTO_PATH = '/home/tony/caffe-master/data/mycifar10/resize/mycifar10_mean.binaryproto' # path of the mean image file in protobuf format, to be converted
MEAN_NPY_PATH = '/home/tony/caffe-master/data/mycifar10/resize/mycifar10_mean.npy' # path of the converted mean image file in numpy format
blob = caffe.proto.caffe_pb2.BlobProto() # create a protobuf blob
data = open(MEAN_PROTO_PATH, 'rb' ).read() # read the contents of the mean.binaryproto file
blob.ParseFromString(data) # parse the file contents into the blob
array = np.array(caffe.io.blobproto_to_array(blob)) # convert the mean in the blob to numpy format; array has shape (mean_number, channel, height, width)
mean_npy = array[0] # one array can hold several sets of means, so select one of them by index
np.save(MEAN_NPY_PATH ,mean_npy)
- feature_extract.py
#coding:utf-8
import numpy as np
import matplotlib.pyplot as plt
import os
import caffe
import sys
import pickle
import struct
#import sys,cv2
caffe_root = '../'
# prototxt used to run the model
deployPrototxt = '/home/tony/caffe-master/models/bvlc_reference_caffenet/deploy.prototxt'
# model file to load
modelFile = '/home/tony/caffe-master/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
# mean file; one generated from your own data can also be used
meanFile = '/home/tony/caffe-master/data/mycifar10/resize/mycifar10_mean.npy'
# list of images to extract features from
imageListFile = '/home/tony/caffe-master/data/mycifar10/resize/filename_list_1.txt'
imageBasePath = '/home/tony/caffe-master/data/mycifar10/resize/train/'
gpuID = 0
#postfix = '.classify_allCar1716_fc6'
postfix = 'fc7.txt'

# initialization
def initilize():
    print 'initilize ... '
    sys.path.insert(0, caffe_root + 'python')
    caffe.set_mode_gpu()
    caffe.set_device(gpuID)
    net = caffe.Net(deployPrototxt, modelFile,caffe.TEST)
    return net

# extract the features and save them to the corresponding files
def extractFeature(imageList, net):
    # adjust the input data as needed: channel order, size, and so on
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2,0,1))
    transformer.set_mean('data', np.load(meanFile).mean(1).mean(1)) # mean pixel
    transformer.set_raw_scale('data', 255)
    transformer.set_channel_swap('data', (2,1,0))
    # set net to batch size of 1; with many images, choose a suitable batch size
    net.blobs['data'].reshape(1,3,227,227) # set as needed; adjust if it does not match the network
    num=0
    for imagefile in imageList:
        imagefile_abs = os.path.join(imageBasePath, imagefile)
        print imagefile_abs
        net.blobs['data'].data[...] = transformer.preprocess('data', caffe.io.load_image(imagefile_abs))
        out = net.forward()
        fea_file = imagefile_abs.replace('.png',postfix)
        num +=1
        print 'Num ',num,' extract feature ',fea_file
        # with open(fea_file,'wb') as f:
        #     for x in xrange(0, net.blobs['fc7'].data.shape[0]):
        #         for y in xrange(0, net.blobs['fc7'].data.shape[1]):
        #             f.write(struct.pack('f', net.blobs['fc7'].data[x,y]))
        np.savetxt(fea_file, net.blobs['fc7'].data)

# read the file list
def readImageList(imageListFile):
    imageList = []
    with open(imageListFile,'r') as fi:
        while(True):
            line = fi.readline().strip().split() # every line is an image file name
            if not line:
                break
            imageList.append(line[0])
    print 'read imageList done image num ', len(imageList)
    return imageList

if __name__ == "__main__":
    net = initilize()
    imageList = readImageList(imageListFile)
    extractFeature(imageList, net)
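In `extractFeature`, the expression `np.load(meanFile).mean(1).mean(1)` collapses the mean image into one mean value per channel, presumably to sidestep the size mismatch between the 256x256 mean image and the 227x227 network input. A small Python 3 sketch with a synthetic array shows the shapes involved; the values are made up for illustration and stand in for the real (3, 256, 256) mean image.

```python
import numpy as np

# Synthetic stand-in for the (channels, height, width) mean image.
mean_image = np.zeros((3, 256, 256), dtype=np.float32)
mean_image[0] += 100.0
mean_image[1] += 110.0
mean_image[2] += 120.0

# Averaging over height, then over width, leaves one value per channel.
mean_pixel = mean_image.mean(1).mean(1)
print(mean_pixel.shape)
print(mean_pixel)
```

The first `mean(1)` removes the height axis, which turns the width axis into axis 1, so the second `mean(1)` removes the width axis.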
- feature_numpy_combine.py
#coding:utf-8
import numpy as np
import os
featureTxtListFile = '/home/tony/caffe-master/data/mycifar10/resize/fc7_feature_txt_filenamelist_1.txt'
featureTxtBasePath = '/home/tony/caffe-master/data/mycifar10/resize/fc7_feature_txt/'
featureTxtStoreBasePath = '/media/tony/Seagate Backup Plus Drive/zsy/'
postfix = ''

def narray_vstack(featuretxtList):
    total_array = np.arange(4097) # placeholder first row, dropped before saving
    num = 0
    for featuretxtfile in featuretxtList:
        featuretxtfile_abs = os.path.join(featureTxtBasePath,featuretxtfile)
        # the image number is the file name without the 'fc7.txt' suffix
        imagenumber = float(int(featuretxtfile.replace('fc7.txt', postfix)))
        temp_array = np.loadtxt(featuretxtfile_abs)
        # row layout: [image number, 4096 fc7 values]
        temp_array = np.append([imagenumber], temp_array)
        total_array = np.vstack((total_array, temp_array))
        num += 1
        if num % 500 == 0:
            print 'txt loaded number:', num
    total_array = total_array[1:] # drop the placeholder row
    np.savetxt(os.path.join( featureTxtStoreBasePath,'fc7_feature_txt_total.txt'), total_array)

def readFeatureTxtList(featureTxtListFile):
    featuretxtList = []
    with open(featureTxtListFile,'r') as fi:
        while(True):
            line = fi.readline().strip().split() # every line is a feature file name
            if not line:
                break
            featuretxtList.append(line[0])
    print 'read feattxtList done featuretxt num ', len(featuretxtList)
    return featuretxtList

if __name__ == '__main__':
    featuretxtList = readFeatureTxtList(featureTxtListFile)
    narray_vstack(featuretxtList)
- tsne.py
#
# tsne.py
#
# Implementation of t-SNE in Python. The implementation was tested on Python 2.7.10, and it requires a working
# installation of NumPy. The implementation comes with an example on the MNIST dataset. In order to plot the
# results of this example, a working installation of matplotlib is required.
#
# The example can be run by executing: `ipython tsne.py`
#
#
# Created by Laurens van der Maaten on 20-12-08.
# Copyright (c) 2008 Tilburg University. All rights reserved.
import numpy as Math
import pylab as Plot
def Hbeta(D = Math.array([]), beta = 1.0):
    """Compute the perplexity and the P-row for a specific value of the precision of a Gaussian distribution."""
    # Compute P-row and corresponding perplexity
    P = Math.exp(-D.copy() * beta);
    sumP = sum(P);
    H = Math.log(sumP) + beta * Math.sum(D * P) / sumP;
    P = P / sumP;
    return H, P;
def x2p(X = Math.array([]), tol = 1e-5, perplexity = 30.0):
    """Performs a binary search to get P-values in such a way that each conditional Gaussian has the same perplexity."""
    # Initialize some variables
    print "Computing pairwise distances..."
    (n, d) = X.shape;
    sum_X = Math.sum(Math.square(X), 1);
    D = Math.add(Math.add(-2 * Math.dot(X, X.T), sum_X).T, sum_X);
    P = Math.zeros((n, n));
    beta = Math.ones((n, 1));
    logU = Math.log(perplexity);
    # Loop over all datapoints
    for i in range(n):
        # Print progress
        if i % 500 == 0:
            print "Computing P-values for point ", i, " of ", n, "..."
        # Compute the Gaussian kernel and entropy for the current precision
        betamin = -Math.inf;
        betamax = Math.inf;
        Di = D[i, Math.concatenate((Math.r_[0:i], Math.r_[i+1:n]))];
        (H, thisP) = Hbeta(Di, beta[i]);
        # Evaluate whether the perplexity is within tolerance
        Hdiff = H - logU;
        tries = 0;
        while Math.abs(Hdiff) > tol and tries < 50:
            # If not, increase or decrease precision
            if Hdiff > 0:
                betamin = beta[i].copy();
                if betamax == Math.inf or betamax == -Math.inf:
                    beta[i] = beta[i] * 2;
                else:
                    beta[i] = (beta[i] + betamax) / 2;
            else:
                betamax = beta[i].copy();
                if betamin == Math.inf or betamin == -Math.inf:
                    beta[i] = beta[i] / 2;
                else:
                    beta[i] = (beta[i] + betamin) / 2;
            # Recompute the values
            (H, thisP) = Hbeta(Di, beta[i]);
            Hdiff = H - logU;
            tries = tries + 1;
        # Set the final row of P
        P[i, Math.concatenate((Math.r_[0:i], Math.r_[i+1:n]))] = thisP;
    # Return final P-matrix
    print "Mean value of sigma: ", Math.mean(Math.sqrt(1 / beta));
    return P;
def pca(X = Math.array([]), no_dims = 50):
    """Runs PCA on the NxD array X in order to reduce its dimensionality to no_dims dimensions."""
    print "Preprocessing the data using PCA..."
    (n, d) = X.shape;
    X = X - Math.tile(Math.mean(X, 0), (n, 1));
    (l, M) = Math.linalg.eig(Math.dot(X.T, X));
    Y = Math.dot(X, M[:,0:no_dims]);
    return Y;
def tsne(X = Math.array([]), no_dims = 2, initial_dims = 50, perplexity = 30.0):
    """Runs t-SNE on the dataset in the NxD array X to reduce its dimensionality to no_dims dimensions.
    The syntaxis of the function is Y = tsne.tsne(X, no_dims, perplexity), where X is an NxD NumPy array."""
    # Check inputs
    if isinstance(no_dims, float):
        print "Error: array X should have type float.";
        return -1;
    if round(no_dims) != no_dims:
        print "Error: number of dimensions should be an integer.";
        return -1;
    # Initialize variables
    X = pca(X, initial_dims).real;
    (n, d) = X.shape;
    max_iter = 1000;
    initial_momentum = 0.5;
    final_momentum = 0.8;
    eta = 500;
    min_gain = 0.01;
    Y = Math.random.randn(n, no_dims);
    dY = Math.zeros((n, no_dims));
    iY = Math.zeros((n, no_dims));
    gains = Math.ones((n, no_dims));
    # Compute P-values
    P = x2p(X, 1e-5, perplexity);
    P = P + Math.transpose(P);
    P = P / Math.sum(P);
    P = P * 4; # early exaggeration
    P = Math.maximum(P, 1e-12);
    # Run iterations
    for iter in range(max_iter):
        # Compute pairwise affinities
        sum_Y = Math.sum(Math.square(Y), 1);
        num = 1 / (1 + Math.add(Math.add(-2 * Math.dot(Y, Y.T), sum_Y).T, sum_Y));
        num[range(n), range(n)] = 0;
        Q = num / Math.sum(num);
        Q = Math.maximum(Q, 1e-12);
        # Compute gradient
        PQ = P - Q;
        for i in range(n):
            dY[i,:] = Math.sum(Math.tile(PQ[:,i] * num[:,i], (no_dims, 1)).T * (Y[i,:] - Y), 0);
        # Perform the update
        if iter < 20:
            momentum = initial_momentum
        else:
            momentum = final_momentum
        gains = (gains + 0.2) * ((dY > 0) != (iY > 0)) + (gains * 0.8) * ((dY > 0) == (iY > 0));
        gains[gains < min_gain] = min_gain;
        iY = momentum * iY - eta * (gains * dY);
        Y = Y + iY;
        Y = Y - Math.tile(Math.mean(Y, 0), (n, 1));
        # Compute current value of cost function
        if (iter + 1) % 10 == 0:
            C = Math.sum(P * Math.log(P / Q));
            print "Iteration ", (iter + 1), ": error is ", C
        # Stop lying about P-values
        if iter == 100:
            P = P / 4;
    # Return solution
    return Y;
if __name__ == "__main__":
    print "Run Y = tsne.tsne(X, no_dims, perplexity) to perform t-SNE on your dataset."
    print "Running data on fc7 of cifar10 train dataset..."
    X = Math.loadtxt("fc7_feature_label.txt");
    labels = Math.loadtxt("fc7_feature.txt");
    Y = tsne(X, 2, 50, 20.0);
    Plot.scatter(Y[:,0], Y[:,1], 20, labels);
    Plot.show();