***************************
MMH: Max-margin Harmonium Models
***************************

Ning Chen, Jun Zhu
chenn07[at]mails.tsinghua.edu.cn, junzhu[at]cs.cmu.edu

(C) Copyright 2011, Ning Chen (chenn07 [at] mails [dot] tsinghua [dot] edu [dot] cn)
Jun Zhu (junzhu [at] cs [dot] cmu [dot] edu)

This file is part of MMH.

MMH is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

MMH is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

------------------------------------------------------------------------

This is a C++ implementation of the max-margin harmonium (MMH) model, a
model of multi-view data that is fully described in Chen et al. (NIPS 2010),
"Predictive Subspace Learning for Multi-view Data: A Large Margin Approach".

------------------------------------------------------------------------

TABLE OF CONTENTS

A. COMPILING

B. TOPIC ESTIMATION

   1. SETTINGS FILE

   2. DATA FILE FORMAT

C. ESTIMATION AND INFERENCE

D. QUESTIONS, COMMENTS, PROBLEMS, AND UPDATE ANNOUNCEMENTS

------------------------------------------------------------------------

A. COMPILING

   1. For Windows users:
	Open "MMH.sln" in Visual Studio 2005, configure the project so
	that it can find the Boost library (http://www.boost.org/), and compile.
   2. For Linux users:
	g++ *.cpp svmlight/*.cpp svm_multiclass/*.cpp -o MMH -lm

------------------------------------------------------------------------

B. TOPIC ESTIMATION

Estimate the model by executing (on Linux, replace MMH.exe with ./MMH, the binary built in Section A):

  MMH.exe estinf [Topic Number, e.g., 10] [candidate C value, e.g., 15] [dLambda, e.g., 0.1] [dDeltaEll, e.g., 1] [setting file]
  
where [candidate C value] is the value of the parameter "C2" in Eq. (4) of the paper, [dLambda] is the value of the parameter "C1" in Eq. (4),
and [dDeltaEll] is the value of \delta L_d in the hinge loss of Eq. (4). [setting file] is the name of the settings file, as explained below.
The data used for estimation is also specified in the settings file.

The model and variational parameters (e.g., \alpha, \beta, W, U) are saved in a separate folder for each setting.
The folder name has the form "<feature type><number of classes>c_<topic number>_<C value>_<Lambda>_<DeltaEll>" (e.g., img5c_5_10_2_8).
The algorithm runs until the relative change in the score is less than "em convergence" (from
the settings file) or "em max iter" iterations are reached.
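
The stopping rule above can be sketched as follows. This is only an
illustration in Python, not the code used by MMH: the function names
has_converged and run_until_converged are hypothetical, and the list of
objective values stands in for the scores that the real program computes
at each EM iteration. The relative-change formula is the one given under
[em convergence] in the settings description.

```python
import math


def has_converged(prev_obj, cur_obj, tol):
    """Relative-change test: fabs(1 - prev_obj / cur_obj) < tol."""
    if cur_obj == 0.0:
        # Avoid division by zero; treat as not converged (an assumption).
        return False
    return math.fabs(1.0 - prev_obj / cur_obj) < tol


def run_until_converged(objective_values, tol, max_iter):
    """Return how many iterations run before one of the two stops fires.

    objective_values is a stand-in for the per-iteration score.
    """
    prev = None
    iterations = 0
    for value in objective_values:
        if iterations >= max_iter:
            break  # "em max iter" reached
        iterations += 1
        if prev is not None and has_converged(prev, value, tol):
            break  # relative change fell below "em convergence"
        prev = value
    return iterations
```

With tol = 1e-6 the loop stops as soon as two consecutive scores agree
to roughly six significant digits, or after max_iter iterations,
whichever comes first.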

For each setting, the classification result is saved in the file:

<evl_res[Number of classes]class.txt> (e.g., evl_res5class.txt)

The classification results of all settings are collected in the file:

<overallRes[Type of features].txt>
(e.g., for image features only: overallResimg.txt;
 for tag + image features: overallResTagImg.txt;
 for SIFT + image features: overallResSiftImg.txt)
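
The naming conventions above can be sketched in a few lines of Python,
for example to locate the output of a run from a script. The helper
names below are hypothetical and are not part of MMH. How MMH formats
non-integer values such as a Lambda of 0.1 inside the folder name is
not stated here, so the sketch simply inserts the values as given.

```python
def model_folder_name(feature_type, num_classes, num_topics,
                      c_value, lam, delta_ell):
    """<feature type><number of classes>c_<topic number>_<C value>_<Lambda>_<DeltaEll>"""
    return f"{feature_type}{num_classes}c_{num_topics}_{c_value}_{lam}_{delta_ell}"


def per_setting_result_file(num_classes):
    """evl_res[Number of classes]class.txt"""
    return f"evl_res{num_classes}class.txt"


def overall_result_file(feature_type):
    """overallRes[Type of features].txt"""
    return f"overallRes{feature_type}.txt"


# Reproduces the examples given in this README.
folder = model_folder_name("img", 5, 5, 10, 2, 8)
per_setting = per_setting_result_file(5)
overall = overall_result_file("TagImg")
```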

The settings file and data format are described below.

1. Settings file

See settings.txt for a sample. These are placeholder values; they
should be experimented with.

The settings file has the following form:

		 cd max iter [positive integer, e.g., 10]
		 cd convergence [positive float, e.g., 1e-3]
		 em max iter [positive integer, e.g., 20]
		 em convergence [positive float, e.g., 1e-6]
		 class number [positive integer, e.g., 5]
		 xdata dimension [non-negative integer, e.g., 1000]
		 zdata dimension [non-negative integer, e.g., 165]
		 sift dimension [non-negative integer, e.g., 500]
		 train-data: [string, e.g., ..\TrainSet_Trecvid_ImageData.dat]
		 test-data: [string, e.g., ..\TestSet_Trecvid_ImageData.dat]

where the settings are

     [cd max iter]

     The maximum number of iterations of Contrastive Divergence variational
     inference for a single document.

     [cd convergence]

     The convergence criterion for Contrastive Divergence variational inference.  Stop if
     fabs( 1 - dkl_sleep_old / dkl_sleep ) is less than this value (or
     after the maximum number of iterations).

     [em max iter]

     The maximum number of iterations of variational EM.

     [em convergence]

     The convergence criterion for variational EM.  Stop if fabs( 1 - dpreobj / dobjval )
     is less than this value (or after the maximum number of iterations).  Note that the "score"
     is the lower bound on the likelihood for the whole corpus.
     
     [class number]
     
     The number of classes in the whole corpus.

     [xdata dimension]
     
     The dimension of binary text-view features.
     
     [zdata dimension]
     
     The dimension of real-valued view features.
     
     [sift dimension]
     
     The dimension of binary SIFT-view features.
     
     [train-data]

     The file name of the training data.

     [test-data]

     The file name of the testing data.
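
For scripts that need to read or check a settings file before launching
MMH, the layout above can be parsed with a short Python sketch such as
the one below. It is not the parser used by MMH. It assumes that each
numeric entry is written as the key followed by whitespace and the
value, and that the two data entries use "key: path"; the sample
settings.txt should be consulted for the exact layout. The name
parse_settings and the SAMPLE text are hypothetical.

```python
# Assumed layout: "<key> <value>" for numeric entries, "<key>: <path>" for data files.
NUMERIC_KEYS = {
    "cd max iter": int,
    "cd convergence": float,
    "em max iter": int,
    "em convergence": float,
    "class number": int,
    "xdata dimension": int,
    "zdata dimension": int,
    "sift dimension": int,
}
PATH_KEYS = ("train-data", "test-data")


def parse_settings(text):
    """Return a dict mapping each recognized setting name to its value."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        # Data-file entries: split on the first colon only, so that
        # Windows paths such as C:\data\x.dat survive intact.
        head, sep, tail = line.partition(":")
        if sep and head.strip() in PATH_KEYS:
            settings[head.strip()] = tail.strip()
            continue
        # Numeric entries: the last whitespace-separated token is the value.
        parts = line.rsplit(None, 1)
        if len(parts) == 2:
            key = " ".join(parts[0].split())
            if key in NUMERIC_KEYS:
                settings[key] = NUMERIC_KEYS[key](parts[1])
    return settings


SAMPLE = r"""
cd max iter 10
cd convergence 1e-3
em max iter 20
em convergence 1e-6
class number 5
xdata dimension 1000
zdata dimension 165
sift dimension 500
train-data: ..\TrainSet_Trecvid_ImageData.dat
test-data: ..\TestSet_Trecvid_ImageData.dat
"""

settings = parse_settings(SAMPLE)
```

Unrecognized lines are silently skipped in this sketch; a stricter
version could raise an error instead.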


2. Data format

For multi-view data analysis, the data format supports three views: a binary text view (e.g., bag-of-words features),
a binary SIFT view (e.g., image SIFT features converted to 0/1 from zero/nonzero), and a real-valued view (e.g., image color features).
Each data sample is thus represented as a single vector containing the features of all three views.
The data is a file in which each line is of the form:

[Sample ID] [Label] [Number of Non-zero Text Features] [Number of Non-zero SIFT Features] [text_term_id1:count1] ... [text_term_idN:countN] [SIFT_term_id1:count1] ... [SIFT_term_idM:countM] [N-dim real valued low-level features]

where [Sample ID] is the index of the data sample and [Label] is its true label. [Number of Non-zero Text Features] is the number of unique terms in the
text-view features, and the count in each [text_term_idN:countN] is how many times that term appeared in the text. Similarly, [Number of Non-zero SIFT Features]
is the number of unique terms in the SIFT-view features, and the count in each [SIFT_term_idM:countM] is how many times that term appeared in the SIFT features.
Note that [text_term_idN] and [SIFT_term_idM] are integers that index the terms; they are not strings. [N-dim real valued low-level features] is the real-valued view,
a vector of real-valued features (e.g., 500-dim color features) whose dimension is given by "zdata dimension" in the settings file.

Please refer to 'http://www.cs.cmu.edu/~junzhu/data/readme.txt' for data examples.
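
As an illustration of the line layout described above, the following
Python sketch splits one data line into its three views. It is not part
of MMH. It assumes that fields are separated by whitespace, that the
label and the counts are integers, and that the real-valued features
are written as plain numbers at the end of the line; the data examples
at the URL above are the authoritative reference. The name parse_sample
and the sample line are hypothetical.

```python
def parse_sample(line):
    """Split one data line into ID, label, and the three feature views."""
    tokens = line.split()
    if len(tokens) < 4:
        raise ValueError("expected at least four header fields")
    sample_id = tokens[0]
    label = int(tokens[1])
    n_text = int(tokens[2])   # number of non-zero text features
    n_sift = int(tokens[3])   # number of non-zero SIFT features
    if len(tokens) < 4 + n_text + n_sift:
        raise ValueError("fewer id:count pairs than the header announces")

    def read_pairs(start, count):
        # Each token has the form term_id:count.
        pairs = {}
        for token in tokens[start:start + count]:
            term_id, value = token.split(":")
            pairs[int(term_id)] = int(value)
        return pairs

    text_view = read_pairs(4, n_text)
    sift_view = read_pairs(4 + n_text, n_sift)
    # Everything after the id:count pairs is the real-valued view.
    real_view = [float(t) for t in tokens[4 + n_text + n_sift:]]
    return {
        "id": sample_id,
        "label": label,
        "text": text_view,
        "sift": sift_view,
        "real": real_view,
    }


# Made-up line: two text terms, one SIFT term, three real-valued features.
sample = parse_sample("7 2 2 1 3:1 10:2 5:1 0.5 0.25 1.0")
```

A caller could additionally check that len(sample["real"]) equals the
"zdata dimension" value from the settings file.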

------------------------------------------------------------------------

C. ESTIMATION AND INFERENCE

For simplicity, a command is provided for doing both estimation and inference.  
Usage is:

  MMH.exe estinf [Topic Number, e.g., 10] [candidate C value, e.g., 15] [dLambda, e.g., 0.1] [dDeltaEll, e.g., 1] [setting file]
  
  (e.g., MMH.exe estinf 5  10 1 1 trecvid_image_settings.txt)
  
where [candidate C value] is the value of the parameter "C2" in Eq. (4) of the paper, [dLambda] is the value of the parameter "C1" in Eq. (4),
and [dDeltaEll] is the value of \delta L_d in the hinge loss. [setting file] is the name of the settings file, as explained in Section B.
The data used for estimation is also specified in the settings file.
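
When several candidate C values are to be tried, the command line above
can be assembled from a script. The Python sketch below only builds the
argument lists and does not run anything; the function names and the
list of candidate values are hypothetical. Each resulting list could be
passed to subprocess.run from the Python standard library.

```python
def build_estinf_command(executable, num_topics, c_value,
                         d_lambda, d_delta_ell, settings_file):
    """Argument list in the order given in the usage line above."""
    return [
        executable,
        "estinf",
        str(num_topics),
        str(c_value),
        str(d_lambda),
        str(d_delta_ell),
        settings_file,
    ]


def commands_for_candidates(executable, num_topics, c_values,
                            d_lambda, d_delta_ell, settings_file):
    """One command per candidate C value; all other arguments are fixed."""
    return [
        build_estinf_command(executable, num_topics, c,
                             d_lambda, d_delta_ell, settings_file)
        for c in c_values
    ]


# Reproduces the example invocation from this section.
example = build_estinf_command("MMH.exe", 5, 10, 1, 1,
                               "trecvid_image_settings.txt")
# Hypothetical sweep over three candidate C values.
sweep = commands_for_candidates("MMH.exe", 5, [5, 10, 15], 1, 1,
                                "trecvid_image_settings.txt")
```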

------------------------------------------------------------------------

D. QUESTIONS, COMMENTS, PROBLEMS, AND UPDATE ANNOUNCEMENTS

Questions, comments, and problems should be addressed to
chenn07@mails.tsinghua.edu.cn and junzhu@cs.cmu.edu.