University of Innsbruck
Institute of Computer Science
Intelligent and Interactive Systems

Kronecker Decomposition for Image Classification

Sabrina Fontanella 1,2, Antonio Rodríguez-Sánchez 1, Justus Piater 1, and Sandor Szedmak 3

1 University of Innsbruck
2 University of Salerno
3 Aalto University

Évora, September 2016

Outline

- Image classification: The problem
- Decomposing the environment
- The tensor decomposition: What is it; Compression; Interpretation of the image components
- Learning approach: Maximum Margin Regression
- Experimental evaluation: ImageCLEF 2015
- Experimental evaluation: Pascal and Flickr

Antonio Rodríguez-Sánchez (CLEF 2016)

Image classification I

- Images are classified according to their visual content
- Applicability:
  1. Recognition of specific objects
  2. Indoor/outdoor recognition
  3. Analysis of medical images

Image classification II

- Example of a classification algorithm, Bag of Words:
  1. Feature extraction, stored in feature vectors
  2. Approximation of the feature distribution by a histogram
  3. Application of a classification algorithm (Support Vector Machine, Neural Network, Markov Random Field, etc.)

Relations between objects are of interest

- Is it possible to recognize relationships between the objects appearing in a scene?
- This is of interest, since these relationships can provide knowledge necessary to identify and classify the image
  - E.g., a car is quite likely to appear in an image that also contains buildings and people.
  - E.g., a zebra is quite likely to be outdoors, surrounded by savanna plants or animals.


Decomposing the environment

- Structured decomposition of the environment
  - Learning structured output is a popular stream of machine learning
- By decomposing the matrix that represents the image, the structure behind the scene can be captured
- Let us consider 2D image decomposition
  - Points close to each other within continuous 2D blocks can relate strongly to each other


Tensor decomposition

- A tensor is a multidimensional or N-way array
  - An N-way or Nth-order tensor is an element of the tensor product of N vector spaces
- Tensor decomposition can be considered a higher-order generalization of the matrix singular value decomposition (SVD) and principal component analysis (PCA)
- The tensor decomposition of a given image is not unique. Given an RGB image of size (256,256,3), it is possible to perform, for example, the following decompositions:
  - (16,16,3), (16,16,1): tensor + matrix (2 components)
  - (8,8,3), (8,8,1), (4,4,1): tensor + 2 matrices (3 components)

Tensor decomposition

- In computer vision, the tensor decomposition can be used to represent:
  - Color images, where three matrices express the R, G and B channels and we can use a tensor of order three (for example (1024,1024,3)).
  - Video streams of color images, where the dimensions are R, G, B and time.

The Kronecker product

Given two matrices A ∈ R^{m_A × n_A} and B ∈ R^{m_B × n_B}, the Kronecker product X can be expressed as:

X = A ⊗ B =
[ A_{1,1}B    A_{1,2}B    ...  A_{1,n_A}B  ]
[ A_{2,1}B    A_{2,2}B    ...  A_{2,n_A}B  ]
[    ...         ...      ...     ...      ]
[ A_{m_A,1}B  A_{m_A,2}B  ...  A_{m_A,n_A}B ]

with m_X = m_A × m_B and n_X = n_A × n_B.

- If X is given (the image), how can we compute A and B (its components)?
- B can be considered a 2D filter of the image represented by the matrix X
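As a quick dimension check, the product above can be exercised directly with NumPy's `kron` (a minimal sketch, not part of the slides):

```python
# The Kronecker product of a (2 x 3) matrix A and a (2 x 2) matrix B
# has shape (4, 6): m_X = m_A * m_B, n_X = n_A * n_B.
import numpy as np

A = np.arange(6).reshape(2, 3)   # m_A = 2, n_A = 3
B = np.array([[1, 0], [0, 1]])   # m_B = 2, n_B = 2
X = np.kron(A, B)
print(X.shape)  # (4, 6)
# Each (i, j) block of X is A[i, j] * B:
print(np.array_equal(X[0:2, 2:4], A[0, 1] * B))  # True
```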

The Kronecker decomposition and SVD

- The Kronecker decomposition can be carried out by Singular Value Decomposition (SVD)
- Given an arbitrary matrix X of size m × n, the SVD is given by X = USV^T, where
  - U ∈ R^{m×m} is an orthogonal matrix of left singular vectors, with UU^T = I_m
  - V ∈ R^{n×n} is an orthogonal matrix of right singular vectors, with VV^T = I_n
  - S ∈ R^{m×n} is a diagonal matrix with the nonnegative singular values on its diagonal

Note

- The algorithm solving the SVD does not depend on the order of the elements of the matrix
- Thus, any permutation (reordering) of the columns and/or rows preserves the same solution
- We can therefore work on a reordered representation of the matrix X

Algorithm for solving the Kronecker decomposition

1. Reorder the matrix
2. Compute the SVD
3. Compute the approximation of X
4. Invert the reordering

Nearest Kronecker Product (NKP)

- Given a matrix X ∈ R^{m×n}, the NKP problem involves minimizing:
  φ(A, B) = ‖X − A ⊗ B‖_F
  where ‖·‖_F is the Frobenius norm
- This problem can be solved using SVD, working on a reordered representation of X

Step 1: Reorder matrix X

Let X = A ⊗ B, with X ∈ R^{6×6}, A ∈ R^{3×3} and B ∈ R^{2×2}. Then X consists of 3×3 blocks X_ij = a_ij B, each of size 2×2. Stacking the vectorized blocks vec(X_ij)^T as the rows of a new matrix X̃ reorders X into

X̃ = vec(A) vec(B)^T,

so the Kronecker structure of X becomes a rank-1 structure in X̃.

C. F. Van Loan. The ubiquitous Kronecker product. Journal of Computational and Applied Mathematics, 123:85-100, 2000.

Approximation of X and reordering

- Minimize ‖X̃ − vec(A) vec(B)^T‖_F
  - vec(·) is the vectorization operator, which stacks the columns of a matrix on top of each other
- This is the problem of finding the nearest rank-1 matrix to X̃
- Well-known solutions exist using SVD

Step 2: Compute the SVD

Minimize ‖X̃ − vec(A) vec(B)^T‖_F:

- Let X̃ = USV^T be the SVD of X̃
- The best Ã and B̃ are then given by
  vec(Ã) = √σ₁ U(:,1) and vec(B̃) = √σ₁ V(:,1),
  where σ₁ is the largest singular value and U(:,1), V(:,1) are the corresponding singular vectors

Steps 3 and 4: Approximation and reordering

- Once we have Ã and B̃, it is possible to compute the approximation of X
- Since we changed the order of the values in the matrix at the beginning, inverting the reordering is necessary to obtain the original A and B
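The four steps (reorder, SVD, approximate, invert the reordering) can be sketched in NumPy as follows; this is a minimal illustration under the assumption that the block sizes of A and B are given, not the authors' implementation:

```python
# Nearest Kronecker Product via SVD (Van Loan's rearrangement method).
import numpy as np

def nkp(X, ma, na, mb, nb):
    """Find A (ma x na) and B (mb x nb) minimizing ||X - kron(A, B)||_F."""
    # Step 1: reorder X. Row (j*ma + i) of Xt is vec(X_ij), where X_ij is
    # the (i, j)-th mb x nb block of X; for X = kron(A, B) this makes
    # Xt = vec(A) vec(B)^T, a rank-1 matrix.
    Xt = np.empty((ma * na, mb * nb))
    for j in range(na):
        for i in range(ma):
            block = X[i * mb:(i + 1) * mb, j * nb:(j + 1) * nb]
            Xt[j * ma + i, :] = block.flatten(order="F")
    # Step 2: SVD; the best rank-1 approximation uses the largest singular value.
    U, s, Vt = np.linalg.svd(Xt)
    a = np.sqrt(s[0]) * U[:, 0]
    b = np.sqrt(s[0]) * Vt[0, :]
    # Steps 3 and 4: un-vectorize (invert the reordering) to recover A and B.
    A = a.reshape((ma, na), order="F")
    B = b.reshape((mb, nb), order="F")
    return A, B

# If X is an exact Kronecker product, it is reconstructed exactly
# (up to a scalar factor that cancels in kron(A, B)).
A0 = np.array([[1.0, 2.0], [3.0, 4.0]])
B0 = np.array([[0.0, 1.0], [1.0, 0.0]])
X = np.kron(A0, B0)
A, B = nkp(X, 2, 2, 2, 2)
print(np.allclose(np.kron(A, B), X))  # True
```

For a general X the same call returns the best Frobenius-norm rank-1 fit of the reordered matrix, i.e. the nearest Kronecker product.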

Components and factorization

- The number of components and the factorization influence the level of detail
- Given, for example, a gray image of size (1024,1024):
  - If it has many details, it is better to choose many components with small factorization:
    Example: (4,4)(4,4)(4,4)(4,4)(4,4)
  - If it is less detailed, fewer components with high factorization:
    Example: (32,32)(32,32)


Compression I

- The tensor decomposition can provide a very high level of image compression
  - It takes into consideration only the largest singular values (Eckart-Young theorem)
- The level of compression is given by the ratio between the total number of elements in the image matrix and the total number of elements of the components in the decomposition

Compression II

- Let
  - n_sv: number of singular values taken into consideration
  - n_f: number of factors per component
  - v: value of the factors
  - n_c: number of components used
- Then the total number of elements of the components is given by: n_sv · n_c · v^{n_f}
- To simplify the notation, we assume that all factors are equal for every component
  - Decompositions with different factors can also be considered, for example (32,28)(16,8)(2,4)

Compression III: Example

- Given an image of size (1024,1024), it can be compressed with components:
  - (32,32)(32,32) with 10 singular values, by a factor of 1024² / (10 · 2 · 32²) = 51.2
  - (4,4),(4,4),(4,4),(4,4),(4,4) with 10 singular values, by a factor of 1024² / (10 · 5 · 4²) = 1310.72
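Both ratios above follow from the counting formula n_sv · n_c · v^{n_f}; a small sketch (the helper name `compression_ratio` is illustrative, not from the slides):

```python
# Compression ratio: total image elements divided by the number of
# elements stored in the decomposition, n_sv * n_c * v**n_f
# (all component factors assumed equal, n_f = 2 for square factors).
def compression_ratio(image_side, n_sv, n_c, v, n_f=2):
    return image_side ** 2 / (n_sv * n_c * v ** n_f)

print(compression_ratio(1024, n_sv=10, n_c=2, v=32))  # 51.2
print(compression_ratio(1024, n_sv=10, n_c=5, v=4))   # 1310.72
```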

Compression IV: Example

Figure: Example of compression on the toys room image (compression ratios 202 and 99).


Interpretation of image components I

X = A ⊗ B

- B can be interpreted as an image filter
  - It finds the boundary of the critical regions where most of the structural information concentrates
- This represents a big advantage:
  - In general, in image filtering, a predetermined filter is used
  - The Kronecker decomposition automatically tries to predict the optimal filters

Interpretation of image components II

Figure: Toys room picture and its components. The highest components (A) and the lowest components (B) correspond to the matrices A1, ... and B1, ..., respectively.


Learning

- Sample set of pairs of output and input objects: {(y_i, x_i) : y_i ∈ Y, x_i ∈ X, i = 1, ..., m}
- Define two functions, φ and ψ, that map the input and output objects respectively into linear vector spaces
  - the feature space in the case of the input
  - the label space in the case of the output

φ : X → H_φ and ψ : Y → H_ψ

Objective

- Find a linear function acting on the feature space, f(φ(x)) = Wφ(x) + b, that produces a prediction for every input object in the label space
- The output corresponding to x is: y = ψ⁻¹(f(φ(x)))

MMR (Maximum Margin Regression) vs SVM (Support Vector Machine)

- MMR is a framework for multilabel classification
- It is based on the Support Vector Machine (SVM)
  - Key idea: reinterpretation of the normal vector w

SVM:
- w is the normal vector of the separating hyperplane
- y_i ∈ {−1, +1}: binary outputs
- The labels are equal to the binary outputs

Extended view (MMR):
- W is a linear operator projecting the feature space into the label space
- y_i ∈ Y: arbitrary outputs
- ψ(y_i) ∈ H_ψ are the labels, the embedded outputs in a linear vector space


ImageCLEF dataset

- Task: multi-label classification

Figure: The hierarchy of classes in the ImageCLEF multi-label challenge.

Results on ImageCLEF

Figure: F1 scores for six filter sizes (4, 8, 12, 18, 20 and 32) using 3 components, training with two different kernels: (a) polynomial, (b) Gaussian. The parameter varied is the degree of the polynomial (from 1 to 10) for the polynomial kernel, and the standard deviation for the Gaussian kernel.


Pascal and Flickr: Features to compare to

Feature          Dimension  Source   Descriptor
Hsv              4096       color    HSV
Lab              4096       color    LAB
Rgb              4096       color    RGB
HsvV3H1          5184       color    HSV
LabV3H1          5184       color    LAB
RgbV3H1          5184       color    RGB
DenseHue         100        texture  hue
HarrisHue        100        texture  hue
DenseHueV3H1     300        texture  hue
HarrisHueV3H1    300        texture  hue
DenseSift        1000       texture  SIFT
HarrisSift       1000       texture  SIFT
DenseSiftV3H1    3000       texture  SIFT
HarrisSiftV3H1   3000       texture  SIFT

Table: Features compared against tensor decomposition on the Pascal07 and Flickr datasets.¹

1 Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, and Cordelia Schmid. TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation, 2009.

Results on Pascal07 dataset

Gaussian kernel:

Feature          P        R        F1
TD               0.4158   0.2877   0.3400
HarrisSiftV3H1   0.4623   0.4491   0.4552
HarrisSift       0.4202   0.4895   0.4522
DenseSiftV3H1    0.4189   0.4886   0.4510
DenseSift        0.3750   0.5044   0.4302
LabV3H1          0.3911   0.3366   0.3618
DenseHueV3H1     0.3884   0.3282   0.3558
HarrisHueV3H1    0.3274   0.3884   0.3552
RgbV3H1          0.3907   0.3224   0.3533
HsvV3H1          0.4080   0.3048   0.3489
Hsv              0.3911   0.3085   0.3449
Lab              0.4135   0.2920   0.3423
Rgb              0.3857   0.2985   0.3350
HarrisHue        0.3930   0.2887   0.3328
DenseHue         0.3962   0.2828   0.3299

Polynomial kernel:

Feature          P        R        F1
TD               0.3931   0.2855   0.3308
HarrisSiftV3H1   0.4002   0.5520   0.4640
HarrisSift       0.3728   0.5523   0.4449
DenseSiftV3H1    0.3592   0.5663   0.4396
DenseSift        0.3442   0.5337   0.4184
HsvV3H1          0.3815   0.3295   0.3536
RgbV3H1          0.3479   0.3551   0.3515
LabV3H1          0.3106   0.3868   0.3434
HarrisHueV3H1    0.3110   0.3894   0.3417
DenseHueV3H1     0.3166   0.3607   0.3363
Hsv              0.3390   0.3232   0.3309
HarrisHue        0.3037   0.3597   0.3241
Rgb              0.2906   0.3420   0.3135
Lab              0.2800   0.3389   0.3031
DenseHue         0.2808   0.3329   0.2995

Table: Comparing tensor decomposition (TD) with other features (Guillaumin et al., 2009) on the Pascal07 dataset with Gaussian and polynomial kernels. The decomposition chosen is 3 components with factorization (22,22).

Results on Flickr dataset

Gaussian kernel:

Feature          P        R        F1
TD               0.3164   0.3780   0.3118
HarrisSiftV3H1   0.5470   0.3842   0.4512
DenseSift        0.5438   0.3862   0.4515
HarrisSift       0.5368   0.3780   0.4435
DenseSiftV3H1    0.5475   0.3807   0.4491
LabV3H1          0.4693   0.3200   0.3806
HarrisHueV3H1    0.4368   0.3288   0.3752
DenseHueV3H1     0.4221   0.3333   0.3723
HsvV3H1          0.4570   0.3062   0.3667
HarrisHue        0.3753   0.3435   0.3587
RgbV3H1          0.4150   0.3089   0.3542
Lab              0.4153   0.3016   0.3494
DenseHue         0.3854   0.3187   0.3477
Rgb              0.4181   0.2824   0.3371
Hsv              0.4152   0.2762   0.3317

Polynomial kernel:

Feature          P        R        F1
TD               0.2311   0.2615   0.2453
HarrisSiftV3H1   0.5289   0.4646   0.4940
DenseSiftV3H1    0.5328   0.4415   0.4828
HarrisSift       0.5260   0.4447   0.4819
DenseSift        0.5132   0.4316   0.4688
LabV3H1          0.4508   0.3533   0.3961
HsvV3H1          0.3961   0.3655   0.3798
HarrisHueV3H1    0.4115   0.3490   0.3777
DenseHueV3H1     0.4086   0.3445   0.3737
RgbV3H1          0.3996   0.3460   0.3704
Lab              0.2717   0.5600   0.3658
DenseHue         0.2698   0.5249   0.3564
HarrisHue        0.3294   0.4159   0.3561
Hsv              0.3603   0.3602   0.3540
Rgb              0.3495   0.3406   0.3443

Table: Comparing tensor decomposition (TD) with other features (Guillaumin et al., 2009) on the Flickr dataset with Gaussian and polynomial kernels. The decomposition chosen is 3 components with factorization (22,22).

Conclusions

- We have presented a method for feature extraction based on a decomposition of the environment
- Pros:
  1. Compression
  2. Automatic prediction of the best filters to use for extracting features
- Cons:
  1. Different decompositions can strongly influence the final result
  2. Lack of a mechanism for automatically choosing the best parameters

institute of computer science

intelligent and interactive systems

Kronecker Decomposition for Image Classification Sabrina Fontanella1,2 , Antonio Rodr´ıguez-S´anchez 1 , Justus Piater1 , and Sandor Szedmak3 1 University 2 University 3 Aalto

of Innsbruck of Salerno

University

´ Evora, September 2016

Outline Image classification The problem Decomposing the environment The tensor decomposition What is it Compression Interpretation of the image components Learning approach Maximum Margin Regression Experimental evaluation ImageCLEF 2015 Experimental evaluation Pascal and Flickr Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

1/41

Outline Image classification The problem Decomposing the environment The tensor decomposition What is it Compression Interpretation of the image components Learning approach Maximum Margin Regression Experimental evaluation ImageCLEF 2015 Experimental evaluation Pascal and Flickr Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

2/41

Image classification I I I

Images are classified according to their visual content Applicability: 1. Recognition of specific objects 2. Indoor/outdoor recognition 3. Analysis of medical images

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

3/41

Image classification II I

Example of classification algorithm, Bag of Words: 1. Features extraction, stored into feature vectors 2. Approximation of the distribution of the features by an histogram 3. Apply a classification algorithm (Support Vector Machine, Neural Network, Markov Random Field, etc)

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

4/41

Relations between objects are of interest

I

Is it possible to recognize relationships between the objects appearing in a scene? I

This is of interest, since this relationship can provide knowledge necessary to identify and classify the image

I

E.g. A car is quite likely to be in an image where there is also buildings and people. E.g. A zebra is quite likely to be outdoors, surrounded by Savanna plants or animals.

I

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

5/41

Outline Image classification The problem Decomposing the environment The tensor decomposition What is it Compression Interpretation of the image components Learning approach Maximum Margin Regression Experimental evaluation ImageCLEF 2015 Experimental evaluation Pascal and Flickr Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

6/41

Decomposing the environment

I

Structured decomposition of the environment I

I

I

Learning structured output is a popular stream of machine learning

By decomposing the matrix that represent the image, the structure behind the scene could be captured Let us consider 2D image decomposition I

Points close to each other within continuous 2D blocks can strongly relate to each other

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

7/41

Outline Image classification The problem Decomposing the environment The tensor decomposition What is it Compression Interpretation of the image components Learning approach Maximum Margin Regression Experimental evaluation ImageCLEF 2015 Experimental evaluation Pascal and Flickr Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

8/41

Tensor decomposition I

A tensor is a multidimensional or N-way array I

an N- way or Nth-order tensor is an element of the tensor product of N vector spaces

I

Tensor decomposition can be considered as a higher- order generalization of the matrix singular value decomposition (SVD) and principal component analysis (PCA)

I

The tensor decomposition for a same image is not unique Given an RGB image of size (256,256,3), it is possible to perform the following decompositions:

I

I

I

(16,16,3),(16,16,1) tensor + matrix (2 components) (8,8,3), (8,8,1), (4,4,1) tensor + 2 matrices (3 components)

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

9/41

Tensor decomposition

I

Concerning computer vision, the tensor decomposition could be used to represent: I

I

Color images, where three matrices express the RGB images and we can use a tensor of order three (for example (1024,1024,3)). Video stream of color images where the dimensions are R, G, B and the time.

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

10/41

The Kronecker product Given two matrices A ∈ RmA ×nA and B ∈ RmB ×nB , the Kronecker product X can be expressed as: A1,1 B A1,2 B · · · A1,nA B A2,1 B A2,2 B · · · A2,nA B X =A⊗B .. .. .. . . . . . . AmA ,1 B AmA ,2 B · · · AmA ,nA B with mX = mA × mB , nX = nA × nB I

If X is given (the image), how can we compute A and B (its components)? I

B can be considered as a 2D filter of the image represented by the matrix X components)?

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

11/41

The Kronecker decomposition and SVD I

The Kronecker decomposition can be carried out by Singular Value Decomposition(SVD)

I

Given an arbitrary matrix X with size m × n the SVD is given by X = USV T where I

I

I

U ∈ Rmxm is an orthogonal matrix of left singular vectors, where UU T = Im , V ∈ Rnxn , is an orthogonal matrix of right singular vectors, where VV T = In , S ∈ Rmxn , is a diagonal matrix containing the singular values with nonnegative components in its diagonal

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

12/41

Note

I

The algorithm solving the SVD does not depend on the order of the elements of the matrix

I

Thus, any permutation of the indexes, reordering, of the columns and (or) rows preserves the same solution

I

We can then work on a reordered representation of the matrix X

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

13/41

Algorithm for solving Kronecker decomposition

1. Reorder the matrix 2. Compute SVD decomposition 3. Compute the approximation of X 4. Invert the reordering

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

14/41

Nearest Kronecker Product (NKP)

I

Given a matrix X ∈ Rmxn , the NKP problem involves minimizing: φ(A, B) = kX − A ⊗ BkF

I

F is the Frobenius norm

This problem can be solved using SVD, working on a reordered representation of X

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

15/41

Step 1: Reorder matrix X

1 =A⊗B

X x 11 x21 x31 x41 x51 x61

x12 x22 x32 x42 x52 x62

x13 x23 x33 x43 x53 x63

x14 x24 x34 x44 x54 x64

x15 x25 x35 x45 x55 x65

x16 x26 x36 x46 x56 x66

a11 = a21 a31

a12 a22 a32

a13 b11 a23 ⊗ b21 a33

b12 b22

,

can be reordered into

x11 x13 x12 x14 x21 x23 x22 x24 b11 b12 = b21 b22 1

x15 x16 x25 x26

x31 x32 x41 x42

⊗ a11

x33 x34 x43 x44 a12

X˜ = A˜ ⊗ B˜ x35 x51 x36 x52 x45 x61 x46 x62 a13

a21

x53 x54 x63 x64 a22

x55 x56 x65 x66 a23

, a31

a32

a33

C.F.V. Loan. The ubiquitous Kronecker product. Journal of Computational and Applied Mathematics,

123:85-100, 2000. Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

16/41

Approximation of X and reordering

I

kX˜ − vec(A) ⊗ vec(B)kF I

Vec() is a vectorization operator which stacks columns of a matrix on top of each other

I

Problem of finding the nearest rank-1 matrix to X˜

I

Well known solutions using SVD

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

17/41

Step 2: Compute SVD decomposition

kX˜ − vec(A) ⊗ vec(B)kF I I

Let X˜ = USV T the decomposition of X˜ The best A˜ and B˜ are defined as: √ √ A˜ = σ1 U(:, 1) and B˜ = σ1 V (:; 1) where σ1 is the largest singular value and U and V are the corresponding singular vectors

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

18/41

Steps 3 and 4: Approximation and reordering

I

Once we have A˜ and B˜ is possible to compute the approximation of X

I

Since at beginning we have changed the order of values into matrix, invert the reordering is necessary for obtain the original A and B

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

19/41

Components and factorization

I

I

The number of components and factorization influence the level of details Given, for example, a gray image of size (1024,1024): I

If it has many details, is better chose many components with small factorization: I

I

Example: (4,4)(4,4)(4,4)(4,4)(4,4)

If is less detailed, less component with high factorization: I

Example: (32,32)(32,32)

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

20/41

Outline Image classification The problem Decomposing the environment The tensor decomposition What is it Compression Interpretation of the image components Learning approach Maximum Margin Regression Experimental evaluation ImageCLEF 2015 Experimental evaluation Pascal and Flickr Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

21/41

Compression I

I

The tensor decomposition can provide a very high level of images’ compression I

I

It takes consideration only the largest singular values (Eckart-Young theorem)

The level of compression is given by the total number of: elements in image matrix elements of components in the decomposition

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

22/41

Compression II I

Let I I I I

nsv number of singular values taken in consideration nf number of factors for component v value of factors nc the number of components used

Then the total number of elements of components is given by: nsv ∗ nc ∗ v nf I

For simplify the notation we assume that the all factors are equal for every component I

I

Decomposition with different factors can be taken in consideration For example (32,28)(16,8)(2,4)

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

23/41

Compression III: Example

I I

Given an image of size (1024,1024). It can be compressed with components I

(32,32)(32,32) and with 10 singular values by: 10242 = 51.2 10 ∗ 2 ∗ 322

I

(4,4),(4,4),(4,4),(4,4),(4,4) and with 10 singular values by: 10242 = 1310.72 10 ∗ 5 ∗ 42

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

24/41

Compression IV: Example Compression ratio: 202

Compression ratio: 99

Figure: Example of compression on toys room image.

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

25/41

Outline Image classification The problem Decomposing the environment The tensor decomposition What is it Compression Interpretation of the image components Learning approach Maximum Margin Regression Experimental evaluation ImageCLEF 2015 Experimental evaluation Pascal and Flickr Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

26/41

Interpretation of image components I

X=A⊗B

I

B can be interpreted like an image filter

I

It finds the boundary of the critical regions where most of the structural information concentrates This represents a big advantage:

I

I

I

In general, in image filtering processes, a predetermined filter is used The Kronecker decomposition automatically tries to predict the optimal filters

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

27/41

Interpretation of image components II

Highest components (A)

Lowest components (B)

Figure: Toys room picture and its components. The Highest component and the Lowest component correspond to the matrices A1, ... and B1, ... respectively.

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

28/41

Outline Image classification The problem Decomposing the environment The tensor decomposition What is it Compression Interpretation of the image components Learning approach Maximum Margin Regression Experimental evaluation ImageCLEF 2015 Experimental evaluation Pascal and Flickr Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

29/41

Learning

I

Sample set of pairs of output and input objects {(yi , xi ) : yi ∈ Y, xi ∈ X , i = 1, ..., m}

I

Define two functions, φ and ψ , that map the input and output objects respectively into linear vector spaces I I

feature space in case of the input label space in case of the output

φ : X → Hφ and ψ : Y → Hψ

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

30/41

Objective

I

Find a linear function acting on the feature space f (φ(x)) = Wφ(x) + b that produces a prediction of every input object in the label space

I

The output corresponding to X is: y = ψ −1 (f (φ(x)))

Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

31/41

MMR (Maximum Margin Regression) vs SVM (Support Vector Machine) I I

MMR is a framework for multilabel classification Is based on Support Vector Machine (SVM) I

Key idea: reinterpretation of the normal vector w

SVM I w is the normal vector of the separating hyperplane.

Extended View I W is a linear operator projecting the feature space into the label space

I yi ∈ {−1, +1} binary outputs. I The labels are equal to the binary objects. Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

I yi ∈ Y arbitrary outputs I ψ(yi ) ∈ Hψ are the labels, the embedded outputs in a linear vector space 32/41

Outline Image classification The problem Decomposing the environment The tensor decomposition What is it Compression Interpretation of the image components Learning approach Maximum Margin Regression Experimental evaluation ImageCLEF 2015 Experimental evaluation Pascal and Flickr Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

33/41

ImageCLEF dataset I

Task: multi-label classification

Figure: The hierarchy of classes in ImageCLEF multi-label challenge. Antonio Rodr´ıguez-S´ anchez (CLEF 2016)

34/41

Results on ImageCLEF

Figure: F1 score for six filter sizes (4, 8, 12, 18, 20 and 32) using 3 components, trained with two different kernels: (a) polynomial, where the varied parameter is the polynomial degree (1 to 10); (b) Gaussian, where the varied parameter is the standard deviation.
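For reference, the F1 measure plotted above is the harmonic mean of precision and recall; for one multi-label prediction it can be computed as follows (a generic sketch, not tied to the ImageCLEF evaluation code):

```python
def f1_score(predicted, actual):
    """F1 over one set of predicted vs. ground-truth labels."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)          # true positives
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)       # fraction of predictions that are correct
    recall = tp / len(actual)             # fraction of true labels that were found
    return 2 * precision * recall / (precision + recall)
```

For example, predicting {dog, cat} when the ground truth is {dog} gives precision 0.5, recall 1.0, and F1 = 2/3.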


Pascal and Flickr: Features to compare to

Feature          Dimension  Source   Descriptor
Hsv              4096       color    HSV
Lab              4096       color    LAB
Rgb              4096       color    RGB
HsvV3H1          5184       color    HSV
LabV3H1          5184       color    LAB
RgbV3H1          5184       color    RGB
DenseHue         100        texture  hue
HarrisHue        100        texture  hue
DenseHueV3H1     300        texture  hue
HarrisHueV3H1    300        texture  hue
DenseSift        1000       texture  sift
HarrisSift       1000       texture  sift
DenseSiftV3H1    3000       texture  sift
HarrisSiftV3H1   3000       texture  sift

Table: Features1 used for comparison on the Pascal07 and Flickr datasets.

1. Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, and Cordelia Schmid. TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation, 2009.

Results on Pascal07 dataset

Gaussian kernel:
Feature          P       R       F1
TD               0.4158  0.2877  0.3400
HarrisSiftV3H1   0.4623  0.4491  0.4552
HarrisSift       0.4202  0.4895  0.4522
DenseSiftV3H1    0.4189  0.4886  0.4510
DenseSift        0.3750  0.5044  0.4302
LabV3H1          0.3911  0.3366  0.3618
DenseHueV3H1     0.3884  0.3282  0.3558
HarrisHueV3H1    0.3274  0.3884  0.3552
RgbV3H1          0.3907  0.3224  0.3533
HsvV3H1          0.4080  0.3048  0.3489
Hsv              0.3911  0.3085  0.3449
Lab              0.4135  0.2920  0.3423
Rgb              0.3857  0.2985  0.3350
HarrisHue        0.3930  0.2887  0.3328
DenseHue         0.3962  0.2828  0.3299

Polynomial kernel:
Feature          P       R       F1
TD               0.3931  0.2855  0.3308
HarrisSiftV3H1   0.4002  0.5520  0.4640
HarrisSift       0.3728  0.5523  0.4449
DenseSiftV3H1    0.3592  0.5663  0.4396
DenseSift        0.3442  0.5337  0.4184
HsvV3H1          0.3815  0.3295  0.3536
RgbV3H1          0.3479  0.3551  0.3515
LabV3H1          0.3106  0.3868  0.3434
HarrisHueV3H1    0.3110  0.3894  0.3417
DenseHueV3H1     0.3166  0.3607  0.3363
Hsv              0.3390  0.3232  0.3309
HarrisHue        0.3037  0.3597  0.3241
Rgb              0.2906  0.3420  0.3135
Lab              0.2800  0.3389  0.3031
DenseHue         0.2808  0.3329  0.2995

Table: Comparing tensor decomposition (TD) with other features1 on the Pascal07 dataset with Gaussian and polynomial kernels. The decomposition chosen is 3 components with factorization (22, 22).

1. Guillaumin et al., TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation, 2009.
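Each tabulated F1 value can be sanity-checked from its P and R columns via F1 = 2PR / (P + R); for instance, for the TD row under the Gaussian kernel:

```python
p, r = 0.4158, 0.2877              # TD row, Gaussian kernel
f1 = 2 * p * r / (p + r)           # harmonic mean of precision and recall
# f1 ≈ 0.3401, agreeing with the tabulated 0.3400 up to rounding
```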

Results on Flickr dataset

Gaussian kernel:
Feature          P       R       F1
TD               0.3164  0.3780  0.3118
HarrisSiftV3H1   0.5470  0.3842  0.4512
DenseSift        0.5438  0.3862  0.4515
HarrisSift       0.5368  0.3780  0.4435
DenseSiftV3H1    0.5475  0.3807  0.4491
LabV3H1          0.4693  0.3200  0.3806
HarrisHueV3H1    0.4368  0.3288  0.3752
DenseHueV3H1     0.4221  0.3333  0.3723
HsvV3H1          0.4570  0.3062  0.3667
HarrisHue        0.3753  0.3435  0.3587
RgbV3H1          0.4150  0.3089  0.3542
Lab              0.4153  0.3016  0.3494
DenseHue         0.3854  0.3187  0.3477
Rgb              0.4181  0.2824  0.3371
Hsv              0.4152  0.2762  0.3317

Polynomial kernel:
Feature          P       R       F1
TD               0.2311  0.2615  0.2453
HarrisSiftV3H1   0.5289  0.4646  0.4940
DenseSiftV3H1    0.5328  0.4415  0.4828
HarrisSift       0.5260  0.4447  0.4819
DenseSift        0.5132  0.4316  0.4688
LabV3H1          0.4508  0.3533  0.3961
HsvV3H1          0.3961  0.3655  0.3798
HarrisHueV3H1    0.4115  0.3490  0.3777
DenseHueV3H1     0.4086  0.3445  0.3737
RgbV3H1          0.3996  0.3460  0.3704
Lab              0.2717  0.5600  0.3658
DenseHue         0.2698  0.5249  0.3564
HarrisHue        0.3294  0.4159  0.3561
Hsv              0.3603  0.3602  0.3540
Rgb              0.3495  0.3406  0.3443

Table: Comparing tensor decomposition (TD) with other features1 on the Flickr dataset with Gaussian and polynomial kernels. The decomposition chosen is 3 components with factorization (22, 22).

1. Guillaumin et al., TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation, 2009.

Conclusions

- We have presented a method for feature extraction based on the decomposition of the environment
- Pros:
  1. Compression
  2. Automatic prediction of the best filters to use for extracting features
- Cons:
  1. Different decompositions can strongly influence the final result
  2. Lack of a mechanism for automatically choosing the best parameters
