Department of Information & Library Science (ILS) Research Talk featuring Ms. Kahyun Choi
Title: Computational Lyricology: Towards a Quantitative Understanding of Music Lyrics and Utilizing Crowdsourced Interpretations
Abstract: Metadata play an important role for users browsing and searching music in a large-scale music digital library. Among the many kinds of descriptive metadata, the “subject” or “topic” of the lyrics is challenging to extract automatically because of the poetic nature of music lyrics. In fact, some lyrics are hard even for humans to understand, let alone machines. For that reason, music lovers have voluntarily shared and discussed their understandings of music lyrics in the form of comments on the web, which contain deep and diverse interpretations. Although the crowdsourced nature of these comments calls for additional processing of the data, their straightforward language can be more suitable for quantitative analysis by a machine. This talk first compares lyrics and their interpretations in terms of their performance in automatic classification tasks. Then, a systematic way to filter out noisy information from users’ interpretations is proposed in the context of probabilistic topic modeling. Traditional readability measures, e.g. concreteness of words, are also discussed to show that directly applying an objective metric might not give an accurate understanding of the difficulty of lyrics. As a remedy to this problem, a novel social data-driven method is presented, which aims to discover the level of disagreement among users by quantifying the diversity of topics in their interpretations.
Bio: Kahyun Choi is a PhD candidate in the School of Information Sciences at the University of Illinois at Urbana-Champaign. Prior to her PhD, she worked as a software engineer at Naver, a search engine company in Korea. Her dissertation research examines computational methods to understand music lyrics with the help of crowdsourced interpretations. Other topics of her research include computational analysis of the Music Library Association Mailing archive, music mood/genre/subject metadata enrichment using machine learning, an informetric study of the International Society for Music Information Retrieval (ISMIR) conference, cross-cultural studies of K-Pop, and evaluation frameworks for MIR systems.