TY - GEN
T1 - Improved Acoustic Modeling for Automatic Piano Music Transcription Using Echo State Networks
AU - Steiner, Peter
AU - Jalalvand, Azarakhsh
AU - Birkholz, Peter
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Automatic music transcription (AMT) is one of the challenging problems in Music Information Retrieval with the goal of generating a score-like representation of a polyphonic audio signal. Typically, the starting point of AMT is an acoustic model that computes note likelihoods from feature vectors. In this work, we evaluate the capabilities of Echo State Networks (ESNs) in acoustic modeling of piano music. Our experiments show that the ESN-based models outperform state-of-the-art Convolutional Neural Networks (CNNs) by an absolute improvement of 0.5 F1-score without using an extra language model. We also discuss that a two-layer ESN, which mimics a hybrid acoustic and language model, achieves better results than the best reference approach that combines Invertible Neural Networks (INNs) with a biGRU language model by an absolute improvement of 0.91 F1-score.
AB - Automatic music transcription (AMT) is one of the challenging problems in Music Information Retrieval with the goal of generating a score-like representation of a polyphonic audio signal. Typically, the starting point of AMT is an acoustic model that computes note likelihoods from feature vectors. In this work, we evaluate the capabilities of Echo State Networks (ESNs) in acoustic modeling of piano music. Our experiments show that the ESN-based models outperform state-of-the-art Convolutional Neural Networks (CNNs) by an absolute improvement of 0.5 F1-score without using an extra language model. We also discuss that a two-layer ESN, which mimics a hybrid acoustic and language model, achieves better results than the best reference approach that combines Invertible Neural Networks (INNs) with a biGRU language model by an absolute improvement of 0.91 F1-score.
KW - Acoustic modeling
KW - Automatic piano transcription
KW - Echo state network
UR - http://www.scopus.com/inward/record.url?scp=85115199523&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115199523&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-85099-9_12
DO - 10.1007/978-3-030-85099-9_12
M3 - Conference contribution
AN - SCOPUS:85115199523
SN - 9783030850982
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 143
EP - 154
BT - Advances in Computational Intelligence - 16th International Work-Conference on Artificial Neural Networks, IWANN 2021, Proceedings
A2 - Rojas, Ignacio
A2 - Joya, Gonzalo
A2 - Catala, Andreu
PB - Springer Science and Business Media Deutschland GmbH
T2 - 16th International Work-Conference on Artificial Neural Networks, IWANN 2021
Y2 - 16 June 2021 through 18 June 2021
ER -