Transcription of notes on instruments

I have been interested in audio processing and music theory for a while, so I tried writing a music transcription algorithm. The algorithm takes an audio file as input and outputs the location of each detected note on the instrument of choice (guitar, bass, or piano).

Doing so required several steps.

  • First, I had to find out how to extract notes from an audio file. A note is essentially a wave, and a melody is made of superimposed waves. The audio file can therefore be read as a signal and plotted as amplitude against time.

Plot of a signal from an audio file
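
As a rough illustration of this step, here is a minimal sketch of reading an audio file into a list of amplitude samples. It assumes a 16-bit PCM WAV file and uses only the Python standard library; the function name read_wav_mono is hypothetical, and the original project may well have loaded its audio differently.

```python
import struct
import wave


def read_wav_mono(source):
    """Return (sample_rate, samples), with samples scaled to [-1.0, 1.0].

    `source` is a file path or a binary file-like object.
    Assumption: the file holds 16-bit PCM data.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())

    # WAV samples are little-endian signed 16-bit integers.
    values = struct.unpack("<%dh" % (len(frames) // 2), frames)

    # Average the channels of each frame to get a single mono signal.
    samples = [
        sum(values[i:i + channels]) / (channels * 32768.0)
        for i in range(0, len(values), channels)
    ]
    return sample_rate, samples
```

Plotting samples against their index divided by sample_rate gives an amplitude-versus-time plot of the signal.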


  • To extract the frequencies of the waves in the previous plot, I used the fast Fourier transform (FFT), which converts a time-domain signal into its frequency-domain representation.

Frequency domain of the original plot
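
The FFT step can be sketched roughly as follows, assuming NumPy is available. The function name dominant_frequencies, the Hann window, and the relative threshold min_rel_height are illustrative choices of mine, not details taken from the original project. The threshold is one simple way to ignore weak peaks that are more likely noise than notes.

```python
import numpy as np


def dominant_frequencies(samples, sample_rate, count=3, min_rel_height=0.1):
    """Return up to `count` peak frequencies (Hz), sorted in ascending order."""
    samples = np.asarray(samples, dtype=float)

    # A Hann window reduces spectral leakage from the edges of the excerpt.
    window = np.hanning(len(samples))
    spectrum = np.abs(np.fft.rfft(samples * window))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)

    # Keep local maxima that reach a fraction of the strongest peak.
    threshold = min_rel_height * spectrum.max()
    inner = spectrum[1:-1]
    is_peak = (inner > spectrum[:-2]) & (inner >= spectrum[2:]) & (inner >= threshold)
    peak_idx = np.nonzero(is_peak)[0] + 1

    # Select the strongest peaks, then report them from low to high frequency.
    strongest = peak_idx[np.argsort(spectrum[peak_idx])[::-1][:count]]
    return sorted(float(freqs[i]) for i in strongest)
```

For a one-second excerpt the frequency resolution is about 1 Hz, which is enough to separate neighboring notes in most of the guitar range but becomes tight for the lowest bass notes.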

  • Then I linked those frequencies to the right notes. I already had some knowledge of where notes are located on each instrument, so once a frequency was matched to a note I could map that note to its position.

Table of note frequencies used for the matching (https://pages.mtu.edu/~suits/notefreqs.html)
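
Instead of a lookup table, the frequency-to-note mapping can also be computed directly. The sketch below assumes equal temperament with A4 = 440 Hz and uses MIDI note numbers (A4 = 69) as an intermediate step. The function name frequency_to_note is hypothetical, and sharps are used for all accidentals.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz, a4_hz=440.0):
    """Return the name of the nearest equal-tempered note, e.g. 'A4'."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")

    # Each semitone multiplies the frequency by 2 ** (1 / 12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))

    # MIDI note 60 is C4, so the octave number is midi // 12 - 1.
    octave = midi // 12 - 1
    return NOTE_NAMES[midi % 12] + str(octave)
```

Rounding to the nearest semitone tolerates slightly detuned input, which a strict table lookup would not.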

Tkinter was used to draw the fretboard and visualize the notes.
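
Before anything can be drawn, each note has to be turned into fretboard coordinates. Here is a minimal sketch of that lookup, assuming a six-string guitar in standard tuning (E2 A2 D3 G3 B3 E4) and notes given as MIDI numbers. The names STANDARD_TUNING and fret_positions and the 20-fret limit are my own assumptions. The drawing itself is left out.

```python
# MIDI numbers of the open strings, from low E (string 0) to high E (string 5).
STANDARD_TUNING = [40, 45, 50, 55, 59, 64]


def fret_positions(midi_note, tuning=STANDARD_TUNING, fret_count=20):
    """Return every (string_index, fret) where `midi_note` can be played."""
    positions = []
    for string_index, open_note in enumerate(tuning):
        # Each fret raises the open string by one semitone.
        fret = midi_note - open_note
        if 0 <= fret <= fret_count:
            positions.append((string_index, fret))
    return positions
```

Each returned pair could then be drawn as a marker at the matching string and fret on the canvas.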

It is important to mention that I am still struggling with some issues. More precisely:

  • Some notes are correct but are detected one octave too low (the frequency is halved).
  • Noise in the audio file is often mistaken for notes, which adds random extra notes.
  • For guitars, the same note can be played at multiple locations; thus, we should find the set of locations that minimizes the total distance traveled by the fingers.
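
One possible way to approach the last issue is dynamic programming over the candidate positions of each note. The sketch below is not something the project currently implements. The function name best_fingering and the cost function (string difference plus fret difference) are assumptions chosen for illustration, and a realistic cost would also need to account for open strings and hand span.

```python
def best_fingering(candidates, cost=None):
    """Pick one (string, fret) per note so the total movement cost is minimal.

    candidates[i] lists the possible (string, fret) positions of note i.
    Assumption: every note has at least one candidate position.
    """
    if not candidates:
        return []
    if cost is None:
        # Hypothetical cost: strings crossed plus frets travelled.
        cost = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])

    # best[j] holds (total cost, path) of the cheapest path ending at option j.
    best = [(0, [pos]) for pos in candidates[0]]
    for options in candidates[1:]:
        new_best = []
        for pos in options:
            # Extend the cheapest previous path that can reach this position.
            total, path = min(
                ((prev_total + cost(prev_path[-1], pos), prev_path)
                 for prev_total, prev_path in best),
                key=lambda item: item[0],
            )
            new_best.append((total, path + [pos]))
        best = new_best

    return min(best, key=lambda item: item[0])[1]
```

Because only the best path per candidate position is kept at each step, the running time grows linearly with the number of notes rather than exponentially with it.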