Jun Zhu
Assoc. Prof. @ THU
Adj. Faculty @ CMU

I am leading the TSAIL group. We are interested in developing machine learning theories, algorithms, and applications to problems in science, engineering and computing. We use the tools of statistical inference and large-scale computing to deal with uncertainty and information in various domains, including text mining, image & video processing, network analysis, and neuroscience. Our recent projects include deep learning, scalable (regularized) Bayesian inference, topic models, learning from crowds, and their applications in various domains.

Representation Learning
We develop statistical models to learn latent (single-layer or multi-layer, a.k.a. deep) representations for data analysis tasks, ranging from the simple tasks of classification and regression, to the slightly more complex problems of multi-modal data fusion and multi-task learning, and to the even more complex problems of social network data analysis and web recommendation. We focus in particular on solving several important problems in representation learning, including
  • model complexity inference by developing nonparametric Bayesian methods
  • max-margin latent variable models for learning predictive representations
  • learning sparse representations with sparse regularization techniques
  • scalable algorithms on CPU and GPU clusters
The cool thing is that we have found many interesting interplays between these topics: for example, Bayesian nonparametrics and max-margin learning are no longer isolated from each other in our RegBayes framework, and sparse coding can discover hierarchical topic representations with effective sparsity control. We also address the fundamental challenge of scaling our techniques up to large applications by developing efficient Monte Carlo and variational inference algorithms.

    Regularized Bayesian Inference
    Regularized Bayesian inference (RegBayes) is a computational framework that allows Bayesian and nonparametric Bayesian models to incorporate rich side knowledge into the inference process by defining an appropriate posterior regularization term. When the posterior regularization follows the max-margin principle, RegBayes lets you learn Bayesian and nonparametric Bayesian models discriminatively, much as in support vector machines; but here everything is done via probabilistic inference, which easily handles noise, ambiguity, and missing values and discovers statistical structures hidden in complex data. Combined with nonparametric techniques, this gives us infinite support vector machines and infinite latent support vector machines. Details can be found in our papers and my recent tutorial talks at ADL 2016, ACML 2013, and the one I gave at MLA 2013. A highlight of my work on Bayesian methods was published in IEEE Intelligent Systems under the title of AI's 10 to Watch.
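    In symbols, RegBayes can be sketched as a regularized variational optimization problem (an informal sketch of the formulation; see the papers for the precise statement):

    ```latex
    \min_{q(M),\,\xi} \;\; \mathrm{KL}\big(q(M)\,\|\,\pi(M)\big)
      \;-\; \mathbb{E}_{q(M)}\big[\log p(\mathcal{D} \mid M)\big]
      \;+\; U(\xi)
    \quad \text{s.t.} \quad q(M) \in \mathcal{P}_{\mathrm{post}}(\xi),
    ```

    where $\pi(M)$ is the prior over the model $M$, $U(\xi)$ penalizes the slack variables $\xi$, and $\mathcal{P}_{\mathrm{post}}(\xi)$ encodes the side knowledge; in the max-margin case it is a set of margin constraints on expectations under $q(M)$.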

    Here are some of my favorite examples showing what we can get by doing Bayesian inference and max-margin learning jointly.

    • MedLDA: a max-margin supervised topic model with efficient and scalable inference algorithms;
    • MMH: a max-margin latent space model for multi-view data analysis;
    • iSVM: a Dirichlet process (DP) mixture of large-margin kernel machines that lets you discover clustering structures among SVM classifiers;
    • iLSVM: an SVM model that learns latent features good for classification and multi-task learning, and determines the feature dimension automatically;
    • MedLFRM: a link prediction model that learns latent features and determines the feature dimension automatically;
    • BM3F: a nonparametric Bayesian formulation of max-margin matrix factorization, with applications in collaborative prediction and recommendation.
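    The max-margin ingredient shared by the models above can be illustrated with a toy sketch: subgradient descent on the hinge loss over fixed latent features. This is purely illustrative (random stand-in features, a plain linear classifier), not our MedLDA or iSVM implementations.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Pretend these are per-document latent representations (e.g. topic mixtures)
    X_pos = rng.normal(loc=+1.0, scale=0.5, size=(50, 5))
    X_neg = rng.normal(loc=-1.0, scale=0.5, size=(50, 5))
    X = np.vstack([X_pos, X_neg])
    y = np.array([1] * 50 + [-1] * 50)

    # Subgradient descent on lam/2 * ||w||^2 + mean hinge loss
    w, b, lam, lr = np.zeros(5), 0.0, 0.01, 0.1
    for _ in range(200):
        margins = y * (X @ w + b)
        viol = margins < 1  # examples violating the margin
        grad_w = lam * w - (y[viol, None] * X[viol]).mean(axis=0) if viol.any() else lam * w
        grad_b = -y[viol].mean() if viol.any() else 0.0
        w -= lr * grad_w
        b -= lr * grad_b

    accuracy = np.mean(np.sign(X @ w + b) == y)
    ```

    In the actual models, the features are themselves random variables, and the hinge loss enters as a posterior regularizer on their distribution rather than as a loss on fixed vectors.
    
    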
    Sparse Topical Coding
    Probabilistic methods are great at inferring latent representations from complex data, but they often fall short in sparsity control and computational cost. We have developed sparse topical coding, or STC, a non-probabilistic formulation of topic models. STC relaxes the normalization constraint of a probability distribution and thus effectively controls the sparsity of latent representations.
    • STC: a topic model that learns sparse representations for words and documents
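    The sparsity mechanism behind this non-probabilistic view can be sketched with plain L1-regularized coding, solved by iterative soft-thresholding (ISTA). The dictionary and data below are random stand-ins, not learned topics; this is a minimal sketch of the idea, not the STC algorithm itself.

    ```python
    import numpy as np

    # ISTA for min_s 0.5 * ||x - D s||^2 + lam * ||s||_1
    rng = np.random.default_rng(1)
    D = rng.normal(size=(20, 50))            # 20-dim observations, 50 basis "topics"
    D /= np.linalg.norm(D, axis=0)           # unit-norm columns
    s_true = np.zeros(50)
    s_true[[3, 17, 42]] = [1.5, -2.0, 1.0]   # a genuinely sparse code
    x = D @ s_true

    lam, step = 0.1, 1.0 / np.linalg.norm(D, 2) ** 2
    s = np.zeros(50)
    for _ in range(500):
        grad = D.T @ (D @ s - x)             # gradient of the quadratic term
        z = s - step * grad
        s = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold

    sparsity = np.count_nonzero(np.abs(s) > 1e-3)
    residual = np.linalg.norm(x - D @ s)
    ```

    Unlike a probability simplex, nothing forces the code to be dense: the soft-thresholding step zeros out small coefficients directly, which is the kind of explicit sparsity control that normalized topic proportions lack.
    
    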
    Acknowledgements: My research is supported by the National Key Project for Basic Research (973), the National Natural Science Foundation of China (NSFC), the National Youth Top-Notch Talents Support Program, the Tsinghua Initiative Scientific Research Program, the Basic Research Foundation of the Tsinghua National Lab for Information Science and Technology, and the Tsinghua 221 Basic Research Plan for Young Talents.
    Here is a poster to summarize some of our recent work.

    Last updated on Dec. 4th, 2012.