Yi-Hsuan Yang's Blog: October 2011

Tuesday, October 25, 2011

Shotgun: Parallel Lasso and Sparse Logistic Regression

http://select.cs.cmu.edu/code/index.html

Shotgun outperforms other published solvers on a range of large problems, proving to be one of the most scalable algorithms for L1-regularized loss minimization.

Joseph K. Bradley, Aapo Kyrola, Danny Bickson, and Carlos Guestrin. "Parallel Coordinate Descent for L1-Regularized Loss Minimization." International Conference on Machine Learning (ICML 2011).
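To make the idea concrete, here is a minimal NumPy sketch of coordinate descent for the Lasso in which several randomly chosen coordinates are updated from the same residual, which is the basic idea behind parallel coordinate descent. It is not the Shotgun code from the link above: the function names, the objective scaling 0.5 * ||Ax - y||^2 + lam * ||x||_1, and the parameters n_parallel and n_rounds are assumptions made for illustration, and the "parallel" updates are only simulated in a single thread.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1 (scaling chosen for this sketch)."""
    r = A @ x - y
    return 0.5 * float(r @ r) + lam * float(np.sum(np.abs(x)))


def parallel_cd_lasso(A, y, lam, n_parallel=1, n_rounds=500, seed=0):
    """Coordinate descent where n_parallel coordinates share one residual.

    With n_parallel=1 this is plain stochastic coordinate descent.
    Larger values imitate simultaneous updates and can diverge when the
    columns of A are strongly correlated.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    d = A.shape[1]
    col_sq = np.sum(A * A, axis=0)  # squared column norms, assumed nonzero
    x = np.zeros(d)
    r = y.copy()                    # residual y - A x for x = 0
    for _ in range(n_rounds):
        idx = rng.choice(d, size=n_parallel, replace=False)
        # Every selected coordinate is updated from the same residual r.
        rho = A[:, idx].T @ r + col_sq[idx] * x[idx]
        new = soft_threshold(rho, lam) / col_sq[idx]
        r -= A[:, idx] @ (new - x[idx])
        x[idx] = new
    return x


# Small synthetic example (made-up data, for illustration only).
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.5, 1.0]
y = A @ x_true
x_hat = parallel_cd_lasso(A, y, lam=0.1, n_parallel=1)
print(lasso_objective(A, y, np.zeros(20), 0.1), lasso_objective(A, y, x_hat, 0.1))
```

With n_parallel=1 each update can only lower the objective. How far n_parallel can be raised before the updates start to interfere depends on how correlated the columns of A are.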

Thursday, October 20, 2011

Advanced chroma features

http://www.mpi-inf.mpg.de/resources/MIR/chromatoolbox/

Implementation of novel chroma features proposed in the following articles:

Audio matching via chroma-based statistical features. ISMIR 2005
Making chroma features more robust to timbre changes. ICASSP 2009
Towards timbre-invariant audio features for harmony-based music. TASLP 2010

An article describing this toolbox appears in the proceedings of ISMIR this year:

Chroma Toolbox: MATLAB Implementations for Extracting Variants of Chroma-Based Audio Features. ISMIR 2011

The author of this toolbox, Meinard Müller, will give a tutorial on "Audio Content-based Music Retrieval" together with Joan Serrà, a researcher at MTG.

Monday, October 17, 2011

K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation, IEEE TSP 2006

The sparse representation problem can be viewed as a generalization of the VQ objective, in which we allow each input signal to be represented by a linear combination of codewords, which we now call dictionary elements. Therefore the coefficients vector is now allowed more than one nonzero entry, and these can have arbitrary values.
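Written out, the two objectives differ only in the constraint on the coefficient vectors. The notation below is an assumption of this note rather than something quoted above: Y holds the input signals as columns, C is the VQ codebook, D is the dictionary, x_i is the i-th column of the coefficient matrix X, e_k is the k-th standard basis vector, and T_0 is the allowed number of nonzero coefficients.

```latex
\begin{align*}
\text{VQ:} \quad & \min_{C,\,X} \; \lVert Y - CX \rVert_F^2
  \quad \text{s.t.} \quad \forall i,\; x_i = e_k \text{ for some } k, \\
\text{Sparse representation:} \quad & \min_{D,\,X} \; \lVert Y - DX \rVert_F^2
  \quad \text{s.t.} \quad \forall i,\; \lVert x_i \rVert_0 \le T_0 .
\end{align*}
```

Setting T_0 = 1 and forcing the single nonzero coefficient to equal 1 recovers the VQ case.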

Music & Emotion in ISMIR 2011

http://ismir2011.ismir.net/program.html

9 out of 129 papers (7%) are related to affective analysis in music:

Modeling Dynamic Patterns for Emotional Content in Music
Identifying Emotion Segments in Music by Discovering Motifs in Physiological Data
Music Emotion Classification of Chinese Songs based on Lyrics Using TFIDF and Rhyme
Modeling Musical Emotion Dynamics with Conditional Random Fields
Mining the Correlation between Lyrical and Audio Features and the Emergence of Mood
Exploring The Relationship Between Mood and Creativity in Rock Lyrics
A Comparative Study of Collaborative vs. Traditional Musical Mood Annotation
Music Mood Classification of Television Theme Tunes
Musical Moods: A Mass Participation Experiment for Affective Classification of Music

Learned dictionaries for sparse image representation: Properties and results, SPIE 2011

Compare MOD, K-SVD, LS-DLA, ODL, RLS-DLA
Propose some dictionary properties: mutual coherence, distribution ratio, gap, sparse representation capabilities (SRC), dictionary distance
However, the authors were not able to find a clear correlation between any of these properties and the performance of the dictionary in an image compression application.
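Of the properties listed, mutual coherence is the easiest to compute: it is the largest absolute inner product between two distinct, unit-normalized atoms. Below is a minimal NumPy sketch under the assumption that the atoms are stored as the columns of a matrix D. The function name is hypothetical and the code is not taken from the paper.

```python
import numpy as np


def mutual_coherence(D):
    """Largest |<d_i, d_j>| over distinct unit-normalized columns of D."""
    D = np.asarray(D, dtype=float)
    norms = np.linalg.norm(D, axis=0)
    if np.any(norms == 0):
        raise ValueError("dictionary contains a zero atom")
    Dn = D / norms                # normalize each atom (column)
    G = np.abs(Dn.T @ Dn)         # absolute Gram matrix
    np.fill_diagonal(G, 0.0)      # ignore each atom's product with itself
    return float(G.max())


# Example with a random overcomplete dictionary (made-up data).
rng = np.random.default_rng(0)
print(mutual_coherence(rng.standard_normal((16, 64))))
```

An orthonormal basis has coherence 0, and a dictionary with two parallel atoms has coherence 1.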

Continuous Prediction of Spontaneous Affect from Multiple Cues and Modalities in Valence-Arousal Space, IEEE TAC 2011

The work introduced here converges with this recent shift in affect recognition, from recognizing posed expressions in terms of discrete and basic emotion categories, to the recognition of spontaneous expressions in terms of dimensional and continuous descriptions.

Contributions:

Fuse facial expression, shoulder gesture and speech cues in analysis of human affect.
Propose an output-associative fusion framework that incorporates correlations and covariances between the emotion dimensions.
Demonstrate that capturing temporal correlations and remembering the temporally distant events (or storing them in memory) is of utmost importance for continuous affect prediction.

Challenges mentioned: reliability of ground truth, baseline problem, unbalanced data.

Interesting observation:

Arousal can be much better predicted than valence using audio cues.
For the valence dimension, in contrast, visual cues (facial expressions and shoulder movements) appear to perform better.

Sunday, October 16, 2011

Online resources for sparse encoding of signals

A nice review:
"A Review of Fast l1-Minimization Algorithms for Robust Face Recognition," by Allen Yang, Arvind Ganesh, Zihan Zhou, Shankar Sastry, and Yi Ma.

SPAMS: a SPArse Modeling Software
http://www.di.ens.fr/willow/SPAMS/doc/html/doc_spams.html

SparseLab
http://sparselab.stanford.edu/

l1-ls
http://www.stanford.edu/~boyd/l1_ls/

l1-benchmark
http://www.eecs.berkeley.edu/~yang/software/l1benchmark/

First post

I want a place to put down some random thoughts, and obviously FB is not a good one for archival purposes. So I blog.