Towards Measuring Intonation Quality of Choir Recordings: A Case Study on Bruckner's Locus Iste
Christof Weiß, Sebastian J. Schlecht, Sebastian Rosenzweig, and Meinard Müller

Overview
Unaccompanied vocal music is a central part of Western art music, yet achieving proper intonation demands considerable skill from the singers. In this contribution, we analyze intonation deficiencies by introducing an intonation cost measure that can be computed from choir recordings and may help to assess the singers' intonation quality. Our approach measures the deviation between the recording's local salient frequency content and an adaptive reference grid based on the equal-tempered scale. The adaptivity makes the local intonation measure invariant to global intonation drifts. In our experiments, we compute this measure for several recordings of Anton Bruckner's choir piece Locus Iste. We demonstrate the robustness of the proposed measure by comparing scenarios of different complexity regarding the availability of aligned scores and multi-track recordings, as well as the number of singers per part. Even without using score information, our cost measure shows interesting trends, thus indicating the potential of our method for real-world applications.

Music scenario:
Anton Bruckner, Gradual Locus iste (4-part polyphony)
Multi-track recording [1] with 16 individual singers:
4 sopranos S1…S4
4 altos A1…A4
4 tenors T1…T4
4 basses B1…B4

An Intonation Cost Measure
Intonation cost measure based on the 12-tone equal-tempered scale [2], with an optimal, cost-minimizing grid shift and a Gaussian deviation of 16 cents.
[Equation and figure: adaptive grid with shift; not preserved in the extracted text]

Intonation scenarios (figure panels):
Fixed grid: good intonation
Fixed grid: global offset
Fixed grid: global drift
Adaptive grid: global drift
Adaptive grid: global drift + local deviations

References & Acknowledgments
This work was supported by the German Research Foundation (DFG MU 2686/12-1).
The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institut für Integrierte Schaltungen IIS. We thank Helena Cuesta and colleagues from UPF Barcelona for creating and publishing the Choral Singing Dataset.

[1] H. Cuesta, E. Gómez, A. Martorell, F. Loáiciga: "Analysis of intonation in unison choir singing." In Proceedings of the International Conference of Music Perception and Cognition (ICMPC), 2018.
[2] T. Nakano, M. Goto, Y. Hiraga: "An automatic singing skill evaluation method for unknown melodies using pitch interval accuracy and vibrato features." In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), 2006.
[3] J. Salamon, E. Gómez: "Melody extraction from polyphonic music signals using pitch contour characteristics." IEEE Transactions on Audio, Speech, and Language Processing, 2012.

Selected Scenarios
Estimating frequencies and amplitudes from salience representation [3]
Different conditions / technical scenarios (A … D):
Score constraints for frequency estimation or no score constraints
Individual tracks from multi-track recording [1] or mixed signal
Only first voices of each part (S1, A1, T1, B1) or all voices

(A) First voices S1, A1, T1, B1 (individual tracks), score constraints
(B) First voices S1, A1, T1, B1 (individual tracks), no score constraints
(C) First voices SATB1 (mixed signal), score constraints
(D) All voices SATB (mixed signal), score constraints

Local Intonation Quality: Choral Singing Dataset [1]
[Figure: piano roll with deviations of first voices S1, A1, T1, B1]

Global Intonation Quality: Different Recordings
[Figure: cost values per recording; plot labels "fine", "note", "orig", "orig"]

Synthetic Examples:
Harmonic tones*, detuning = 0 cents
Harmonic tones*, detuning = 15 cents
Harmonic tones*, detuning = 30 cents
Choir Samples SibeliusSounds
*16 partials with decaying amplitudes

Choral Singing Dataset [1]:
(C) First voices SATB1, corrected
(C) First voices SATB1, corrected
(C) First voices SATB1, original
(D) All voices SATB, original
All voices SATB, with strong reverb

Commercial Recordings:
Internet Archive 2013
Philharmonia Vocalensemble 1979
Chor des Bayerischen Rundfunks 2012
Choir of St John's College 1996
NDR Chor Hamburg 2000
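
The idea behind the intonation cost measure (deviation of salient frequencies from a 12-tone equal-tempered grid, Gaussian tolerance of 16 cents, and a cost-minimizing grid shift) can be illustrated with a small sketch. The exact formula from the poster is not reproduced here. The amplitude-weighted form below, the function names, the 440 Hz reference, and the 1-cent search step are all assumptions made for illustration only.

```python
import math

def deviation_cents(freq_hz, shift_cents=0.0, ref_hz=440.0):
    """Signed distance (in cents) from freq_hz to the nearest pitch of a
    12-tone equal-tempered grid that is shifted by shift_cents.
    ref_hz = 440 is an assumed reference, not taken from the poster."""
    cents = 1200.0 * math.log2(freq_hz / ref_hz) - shift_cents
    # Wrap into the interval [-50, 50) around the nearest grid pitch.
    return (cents + 50.0) % 100.0 - 50.0

def intonation_cost(freqs_hz, amps, shift_cents=0.0, sigma_cents=16.0):
    """Assumed cost form: 1 minus the amplitude-weighted mean of a Gaussian
    closeness term. 0 means all frequencies lie on the grid."""
    weighted = 0.0
    for f, a in zip(freqs_hz, amps):
        d = deviation_cents(f, shift_cents)
        weighted += a * math.exp(-d * d / (2.0 * sigma_cents ** 2))
    return 1.0 - weighted / sum(amps)

def adaptive_cost(freqs_hz, amps, sigma_cents=16.0):
    """Search grid shifts from -50 to 49 cents (1-cent steps, an assumed
    resolution) and return (minimal cost, cost-minimizing shift)."""
    best_shift = min(
        range(-50, 50),
        key=lambda s: intonation_cost(freqs_hz, amps, float(s), sigma_cents),
    )
    best = intonation_cost(freqs_hz, amps, float(best_shift), sigma_cents)
    return best, float(best_shift)

# Toy frame: four equal-tempered pitches, all sharp by the same 20 cents
# (a global offset without local deviations).
chord = [440.0 * 2.0 ** ((100 * n + 20) / 1200.0) for n in (-9, -5, -2, 3)]
amps = [1.0, 1.0, 1.0, 1.0]

fixed = intonation_cost(chord, amps)          # fixed grid penalizes the offset
adaptive, shift = adaptive_cost(chord, amps)  # adaptive grid absorbs it
print(f"fixed grid cost: {fixed:.3f}")
print(f"adaptive grid cost: {adaptive:.3f} at shift {shift:.0f} cents")
```

In this toy case the fixed grid reports a clearly nonzero cost, while the adaptive grid finds the 20-cent shift and reports a cost close to zero. This mirrors the distinction between the "fixed grid: global offset" and "adaptive grid" panels. Applying the same search per analysis frame would be one way to follow a global drift over time.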