Friday, November 25, 2011

A Connotative Space for Supporting Movie Affective Recommendation

A Connotative Space for Supporting Movie Affective Recommendation, Sergio Benini, Luca Canini, and Riccardo Leonardi, IEEE TMM 2011

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5962360&tag=1

"there are at least three possible levels of description for a given object, a video in our case: the denotative meaning (what is the described concept), the connotative one (by which terms the concept is described), and the affective response (how the concept is perceived by a person)."

"Connotation is essential in cinematography, as in any other art discipline. It is given by the set of conventions (such as editing, music, mise-en-scene elements, color, sound, lighting, etc.) ..."

"using connotation properties can be more reliable than exploiting emotional annotations by other users."

"A set of conventions, known as film grammar [22], governs the relationships between these elements and influences how the meanings conveyed by the director are inferred by the audience."

"borrowing the theoretical approach from art and design, ... the affectivemeaning of a movie varies along three axes which account for the natural (warm/cold), temporal (dynamic/slow), and energetic (energetic/minimal) dimension"

"For self-assessment, the emotation wheel is preferred to other models, such as PAD, since it is simpler for the users to provide a unique emotional label than to express their emotional state by a combination of values of pleasure, arousal, and dominance."

"Exploiting distances between emotions, for each scene, we then turn emotations into a 1-to-5 bipolar scale by unfolding the wheel only on the five most voted contiguous emotions, as shown in Fig. 8."

"Moreover, the choice of discarding, separately for each scene, the three least voted contiguous emotions is supported by Osgood,..."

Tuesday, November 15, 2011

Law of two-and-a-half

http://jhimusic.com/blog/?p=121



Emotion representation, analysis, and synthesis in continuous space: a survey

"Emotions are complex constructs with fuzzy boundaries and with substantial individual variations in expression and experience."

"To guarantee a more complete description of affective colouring, some resedarchers include expectation (the degree of anticipating or being taken unaware) as the fourth dimension, and intensity (how far a person is away from a state of pure, cool rationality) as the fifth dimension."

"Despite the existence of diverse affect models, search for optimal low-dimensional representation of affect, for analysis and synthesis, and for each modality or cue, remains open."

"While visual signals appear to be better for interpreting valence, audio signals seem to be better for interpreting arousal."

"There are also spin off companies emerging out of collaborative research at well-known universities (e.g., Affectiva established R. Picard and colleagues of MIT Media Lab)."

Monday, November 7, 2011

Music Discovery with Social Networks

by Cédric S. Mesnage et al., WOMRAD 2011

  • Study "social shuffle" (or flooding, diffusion) over Facebook by using Starnet Ap on FB
  • Definition of a successful music discovery: "it occurs when the user of the application likes a track that s/he has never heard before."
  • Conclusion: social recommendations > non-social recommendations > random recommendations
  • Prototype system: apps.facebook.com/music_valley
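
A toy sketch of the flooding idea, assuming a track spreads to a user's friends whenever that user likes it; the friend graph, like-probability, and re-sharing rule are all assumptions, not the paper's Starnet implementation:

    import random

    random.seed(0)
    friends = {"ann": ["bob", "eve"], "bob": ["ann"], "eve": ["ann", "bob"]}
    like_prob = 0.3  # chance a user likes a track s/he has never heard before

    def social_shuffle(seed_user):
        """Flood a track outward from seed_user; count successful discoveries,
        i.e., users who like a track they had never heard before."""
        heard, discoveries = set(), 0
        frontier = [seed_user]
        while frontier:
            user = frontier.pop()
            if user in heard:
                continue
            heard.add(user)
            if random.random() < like_prob:   # a successful music discovery
                discoveries += 1
                frontier.extend(friends.get(user, []))  # liking re-shares it
        return discoveries

    print(social_shuffle("ann"))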

Friday, November 4, 2011

Exploring Automatic Music Annotation with Acoustically-Objective Tags

Exploring Automatic Music Annotation with Acoustically-Objective Tags, Derek Tingle, Youngmoo E. Kim, and Douglas Turnbull, MIR 2010

http://cosmal.ucsd.edu/cal/projects/CAL10K/

  • CAL10K consists of 10,870 songs annotated using a vocabulary of 475 acoustic tags and 153 genre tags from Pandora’s Music Genome Project
  • use Echo Nest API for feature extraction
  • train on CAL10K and test on CAL500 (see the evaluation sketch after the tag list below)
the 55 overlapping tags between the vocabularies of CAL10K and CAL500:

  1. acoustic
  2. acoustic guitar
  3. aggressive
  4. alternative
  5. ambient sounds
  6. bebop
  7. bluegrass
  8. blues
  9. breathy
  10. call and response
  11. catchy
  12. classic rock
  13. cool jazz
  14. country
  15. dance pop
  16. danceable
  17. distorted electric guitar
  18. drum set
  19. duet
  20. electric
  21. electric blues
  22. electronica
  23. emotional
  24. female lead vocals
  25. folk
  26. funk
  27. gospel
  28. gravelly
  29. hand drums
  30. harmonica
  31. heavy beat
  32. hip hop
  33. jazz
  34. light beat
  35. low pitched
  36. major
  37. male lead vocals
  38. mellow
  39. minor
  40. organ
  41. piano
  42. pop
  43. punk
  44. r&b
  45. rock
  46. saxophone
  47. slow
  48. soul
  49. string ensemble
  50. studio recording
  51. swing
  52. synthesized
  53. synthesizer
  54. trumpet
  55. vocal harmonies
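
A hedged sketch of that train/test protocol restricted to the 55 overlapping tags: one binary classifier per tag, trained on CAL10K features and scored on CAL500 with per-tag AUC. The logistic-regression model and the feature/label matrices are assumptions; the paper's actual annotation models differ.

    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    def evaluate_overlap(X_train, Y_train, X_test, Y_test, tags):
        """Train one binary classifier per tag on CAL10K (X_train, Y_train)
        and report per-tag AUC on CAL500 (X_test, Y_test). Y_* are binary
        song-by-tag matrices restricted to the 55 shared tags."""
        aucs = {}
        for j, tag in enumerate(tags):
            clf = LogisticRegression(max_iter=1000).fit(X_train, Y_train[:, j])
            scores = clf.predict_proba(X_test)[:, 1]  # P(tag applies to song)
            aucs[tag] = roc_auc_score(Y_test[:, j], scores)
        return aucs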

Unifying Low-Level and High-Level Music Similarity Measures

This paper proposes three distance measures based on the audio content:
  1. A low-level measure based on tempo-related description
  2. A high-level semantic measure based on the inference of different musical dimensions by support vector machines. These dimensions include genre, culture, moods, instruments, rhythm, and tempo annotations
  3. A hybrid measure which combines the above two (see the sketch below)
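
A minimal sketch of how such measures could be combined, assuming normalized low-level descriptor vectors and SVM-derived semantic probability vectors; the actual distances and combination weights are defined in the paper:

    import numpy as np

    def low_level_dist(a, b):
        """Euclidean distance over (assumed normalized) audio descriptors."""
        return np.linalg.norm(a - b)

    def semantic_dist(p, q):
        """L1 distance over SVM-inferred label probabilities (genre, mood,
        instrument, rhythm, tempo dimensions)."""
        return np.linalg.norm(p - q, ord=1)

    def hybrid_dist(a, b, p, q, alpha=0.5):
        """Convex combination of the two measures (alpha is an assumption)."""
        return alpha * low_level_dist(a, b) + (1 - alpha) * semantic_dist(p, q)
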
Evaluation:
  1. Objective evaluation: using classification benchmarks as ground truth: "For each collection, we considered songs from the same class to be similar and songs from different classes to be dissimilar, and assessed the relevance of the songs’ rankings returned by each approach." (see the ranking sketch after this list)
  2. Subjective evaluation: Each listener was presented with 5 different playlists (one for each measure) generated from the same seed song. Independently for each playlist, listeners provided 1) a playlist similarity rating (six-point scale) and 2) a boolean playlist-inconsistency answer (bipolar).
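
A small sketch of the objective protocol under the same-class-is-relevant assumption, scoring one ranked result list with average precision (the paper's exact relevance metric may differ):

    def average_precision(query_class, ranked_classes):
        """Treat songs sharing the query's class as relevant and score the
        ranking returned by a similarity measure."""
        hits, precision_sum = 0, 0.0
        for i, c in enumerate(ranked_classes, start=1):
            if c == query_class:
                hits += 1
                precision_sum += hits / i
        return precision_sum / hits if hits else 0.0

    print(average_precision("jazz", ["jazz", "rock", "jazz", "pop"]))  # ~0.833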