by M. I. Mandel et al.
- We show that users agree more on tags applied to clips temporally “closer” to one another; that conditional restricted Boltzmann machine models of tags predict related tags more accurately when they take context into account; and that when training data is “smoothed” using context, support vector machines rank these clips against the original, unsmoothed tags more accurately than three standard multi-label classifiers
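The paper's exact smoothing procedure is not reproduced here, but the idea of "smoothing" a clip's tags using temporal context can be sketched as averaging each clip's binary tag vector with those of its temporally adjacent clips. The function name, neighborhood size, and mixing weight below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def smooth_tags(tag_matrix, weight=0.5):
    """Smooth each clip's tag vector toward its temporal neighbors.

    tag_matrix: (n_clips, n_tags) binary tags for consecutive clips of one
    track, in temporal order. Returns soft tag targets in [0, 1].
    NOTE: a hypothetical sketch, not the procedure from the paper.
    """
    n = tag_matrix.shape[0]
    smoothed = np.zeros(tag_matrix.shape, dtype=float)
    for i in range(n):
        lo, hi = max(0, i - 1), min(n, i + 2)       # clip i plus immediate neighbors
        neighbors = tag_matrix[lo:hi].mean(axis=0)  # local average of tag vectors
        smoothed[i] = (1 - weight) * tag_matrix[i] + weight * neighbors
    return smoothed

# Three consecutive clips, two tags: the untagged middle clip
# "borrows" tag mass from the clips on either side of it.
clips = np.array([[1, 0],
                  [0, 0],
                  [0, 1]])
print(smooth_tags(clips))
```

A classifier (e.g. an SVM per tag) can then be trained on these soft targets instead of the raw, noisier binary labels.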
- This article discusses and tests two different kinds of tag language models, one based on an information-theoretic formulation of this inference [Schifanella et al. 2010], and the second based on restricted Boltzmann machines (RBMs) [Mandel et al. 2010; 2011].
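To make the RBM idea concrete: an RBM over binary tag vectors can "complete" a partial tagging with a single mean-field up-down pass, yielding per-tag probabilities that rank related tags. This is a sketch of plain-RBM inference only; the paper's conditional RBMs additionally condition on context (e.g. via context-dependent biases), which is omitted here, and the weights below are random placeholders rather than trained parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_complete_tags(v, W, b_vis, b_hid):
    """One mean-field up-down pass through an RBM over binary tag vectors.

    v: (n_tags,) observed 0/1 tags; W: (n_tags, n_hidden) weights.
    Returns (n_tags,) reconstruction probabilities, usable for ranking
    related tags. Sketch only -- not the paper's conditional RBM.
    """
    h = sigmoid(v @ W + b_hid)       # infer hidden units from observed tags
    return sigmoid(h @ W.T + b_vis)  # reconstruct per-tag probabilities

# Toy example with random (untrained) weights, just to show the shapes.
rng = np.random.default_rng(0)
n_tags, n_hidden = 5, 3
W = rng.normal(scale=0.1, size=(n_tags, n_hidden))
probs = rbm_complete_tags(np.array([1, 0, 1, 0, 0]), W,
                          np.zeros(n_tags), np.zeros(n_hidden))
print(probs)
```

With trained weights, tags that co-occur with the observed ones receive higher reconstruction probabilities; the conditional variant would shift the biases as a function of the clip's context.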
- Assuming that tags applied to an artist apply equally well to all of the clips of music that the artist has released (as is commonly done [Bertin-Mahieux et al. 2008]) implies that up to 50% noise is being introduced into those tags.
- A visual scene might be analogous to a musical genre, as the priors over instruments, moods, etc. found in a song should depend on the genre of the song.
- Spatial context in images could correspond to temporal context in music