The Web Site of L.A.P.

Realizing MIDI Music With GNU/Linux Tools

The storage of audio material, such as music performances, is usually done by writing the actual sampled waveforms to a digital file. This will allow for an accurate reproduction of the original audio whenever desired but will also necessitate very large storage files. Compression techniques, both lossless and lossy, are often used to reduce the file size, but even with compression the storage space for even a modest collection of music or other audio material can be considerable.

Is there a way to greatly reduce the storage space required?

With MIDI the answer is a definite “Yes!”

This article will demonstrate that large music collections can be acquired as MIDI files and can be rendered at will with highly realistic sound quality.

What is MIDI?

A MIDI file stores only the information about a musical composition, i.e., the notes, the duration of notes, and the type of instrument used to produce the notes, as well as aspects such as keypress velocity. The stored information can be realized as music by using a hardware or software synthesizer to produce sounds that are equivalent to those that were intended by the original composition or performance.

MIDI can perhaps be viewed as essentially an electronic, digitized form of the antique piano roll.

Much information is available about MIDI and the MIDI format and it is well beyond the scope of this article to provide a comprehensive description. A couple of the many links are given here:

The utility of MIDI is that by storing only the musical information the file storage size is literally orders of magnitude smaller than that of digitized audio waveforms. As a comparison, the complete Beethoven Piano Sonata No. 14 (the “Moonlight” sonata) can be stored in a MIDI file with about 10 kilobytes of space whereas an uncompressed WAV file, at 44100 Hertz sample rate and 16 bits/sample in stereo (CD quality), requires about 10 megabytes for every minute of music, or well over 100 megabytes for the complete sonata. This is roughly 4 orders of magnitude difference.
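The size of an uncompressed recording follows directly from the sample format, so the comparison is easy to check. Here is a minimal shell sketch; the function name pcm_bytes and the one-minute duration are my own illustrative assumptions, and the small WAV header is ignored.

```shell
# Size in bytes of uncompressed PCM audio:
#   sample rate x bytes per sample x channels x duration in seconds
pcm_bytes() {
    rate=$1; bits=$2; channels=$3; seconds=$4
    echo $(( rate * bits / 8 * channels * seconds ))
}

# One minute of CD-quality stereo (44100 Hz, 16 bits, 2 channels)
pcm_bytes 44100 16 2 60    # prints 10584000
```

That is roughly 10.6 megabytes per minute of stereo audio, and longer works scale in proportion.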

Information, however, is not sound. The musical information within a MIDI must be rendered, or translated, into sound, and this can be accomplished with a personal computer by using a software synthesizer. The remainder of this article will describe how to utilize GNU/Linux software to render into sound, at hopefully very high quality, the musical compositions that are stored in the MIDI format.

Together, MIDI files and software synthesizers can allow an extremely efficient storage and reproduction system for personal music collections.

An Initial Caveat

Before I proceed further an initial caveat should be introduced. The goal is to store and reproduce music at high quality that is preferably no different from conventional recordings or performances. Not all music is amenable to this approach. In particular, most popular music and music with prominent human vocals (singing) are not suitable. MIDI cannot store information about the human voice (aside from choral “oohs” and “aahs”). However, the genre of classical instrumental music, and especially that of classical keyboard compositions, can be handled very effectively with MIDI.

Indeed, this article would be more aptly titled “Realizing MIDI Classical Keyboard Music,” as I intend mainly to concentrate on this specific genre. As I will indicate below, the soundfont, or the digitized representation of a particular musical instrument, is critical to rendering a MIDI file and most high quality, open-source soundfonts exist mainly for keyboard instruments.

J.S. Bach with GNU/Linux

I will begin with an example of the end result. The following links contain a rendition of J.S. Bach’s Toccata and Fugue in D-Minor for organ. A compressed OGG file is given of a MIDI rendering of this composition. The original MIDI file and a reference to the soundfont that were used to create the rendition will be provided later in this article. (Note that being a staunch advocate of free and open-source software (FOSS) I will not offer any MP3 or other proprietary format.)

To my ears, the reproduction is superlative, and the MIDI rendering is an excellent substitute for a recorded performance. Of course, any comments or criticisms to the contrary are certainly welcome.

GNU/Linux Software Synthesizers

The software synthesizers used in rendering MIDI are TiMidity++ and fluidsynth:

Both synthesizers are command-line driven although TiMidity++ offers several graphical interfaces. Since I prefer the simple command line I will only describe this method of control.

Both synthesizers offer a wide range of options for processing MIDI files but I illustrate only a basic few. The interested reader is encouraged to study the comprehensive man pages for each program to learn more. As more experience is gained in working with MIDI the extensive processing options can be very useful.

As already mentioned, a high quality soundfont is essential for MIDI rendering. Soundfonts are discussed in the next section. The purpose here is to describe the rendering commands assuming that a suitable soundfont, or soundbank, is configured.

To render a MIDI file with TiMidity++ the following basic commands are used:

timidity -Ow1lSs -s44100 -A90 -a -o file.wav file.mid

timidity -Os1lSs -s44100 -A90 -a file.mid

The only difference between these variants is that the first produces a WAV format output file ("-Ow...") while the second sends audio output to the ALSA device ("-Os..."). The other command options, "-O...1lSs" and "-s44100," specify CD quality: 16-bit ("1"), linear encoding ("l"), stereo ("S"), and signed samples ("s") at a 44100 Hz sampling rate (see man page for extensive options). The "-A" option specifies the output volume level and "-a" specifies that an anti-aliasing filter be applied, which may not be necessary in all cases.
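When many files are to be rendered, the WAV variant can be wrapped in a small shell function. This is only a sketch: the function name render_wav and the TIMIDITY variable (which allows another binary to be substituted) are my own hypothetical names, and the options are simply those of the command shown above.

```shell
# Render one MIDI file to a WAV file of the same base name,
# using the same options as the basic TiMidity++ command.
render_wav() {
    midi=$1
    wav="${midi%.mid}.wav"          # song.mid -> song.wav
    "${TIMIDITY:-timidity}" -Ow1lSs -s44100 -A90 -a -o "$wav" "$midi"
}

# Example (requires timidity): render every MIDI file in the current directory.
# for f in *.mid; do render_wav "$f"; done
```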

TiMidity++ will read soundfonts from a location that is specified in its configuration file, which is usually a per-user ~/.timidity.cfg or a system-wide timidity.cfg (for example /etc/timidity.cfg), depending on the build configuration.
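Since the location varies between builds, it can help to check which candidate files actually exist. The following is a sketch only; the function name find_timidity_cfg is hypothetical and the paths listed are assumptions about common installations, not a definitive list.

```shell
# Print each candidate configuration file that actually exists.
find_timidity_cfg() {
    for cfg in "$@"; do
        [ -f "$cfg" ] && echo "$cfg"
    done
    return 0
}

# Candidate locations are assumptions; adjust them for the local build.
find_timidity_cfg /etc/timidity.cfg /etc/timidity/timidity.cfg \
    /usr/local/share/timidity/timidity.cfg \
    "$HOME/.timidity.cfg" "$HOME/timidity.cfg"
```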

Now the basic command for fluidsynth is given:

fluidsynth -i -n -a alsa -g 1 -o synth.cpu-cores=4 /path/to/soundfont.sf2 file.mid

Fluidsynth is actually an interactive program that accepts commands from its own built-in shell. In this case, however, the "-i" option disables that shell so that the command runs without interaction, and the "-n" option indicates that no MIDI input events are to be read.

Again, ALSA is used as the audio output and the "-o" option defines a setting that specifies the number of CPU cores to use for the synthesis. The output gain is set via the "-g" option. Finally, the path to a soundfont or soundfonts is given and then the name of the MIDI file to be rendered.

As with TiMidity++ it is possible to output directly to many different audio file formats by using the "-O," "-r," and "--fast-render" options. Here is the example for CD quality:

fluidsynth -i -n --fast-render=file.wav -O s16 -r 44100 -g 1 -o synth.cpu-cores=4 /path/to/soundfont.sf2 file.mid
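The same command can be wrapped so that the output name is derived from the MIDI file name and the gain is easy to vary. This is a sketch under my own assumptions: render_fluid, FLUIDSYNTH, and GAIN are hypothetical names, and the options are those of the command above.

```shell
# Render one MIDI file to a 16-bit, 44100 Hz WAV file with fluidsynth.
# GAIN defaults to 1, matching the "-g 1" of the basic command.
render_fluid() {
    soundfont=$1
    midi=$2
    wav="${midi%.mid}.wav"          # toccata.mid -> toccata.wav
    "${FLUIDSYNTH:-fluidsynth}" -i -n --fast-render="$wav" -O s16 -r 44100 \
        -g "${GAIN:-1}" -o synth.cpu-cores=4 "$soundfont" "$midi"
}

# Example (requires fluidsynth):
# GAIN=0.8 render_fluid /path/to/soundfont.sf2 file.mid
```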

I find that both fluidsynth and TiMidity++ produce nearly identical rendering results but, again, the key ingredient is the soundfont. With a soundfont of sufficient quality either program is useful but fluidsynth seems to be the more actively developed project.

Soundfonts

A soundfont is merely a digital file that describes the properties of a musical instrument. It will contain various digitized waveforms for a particular instrument and other necessary parameters to create a realistic sound. For a thorough background check here:

A software synthesizer uses the information within a soundfont or soundfonts to create sound from a MIDI file. Needless to say, only a soundfont of superior quality will ensure that a MIDI file will be rendered into very realistic and hence pleasing sound.

Superior soundfonts are usually obtained commercially but there are many high quality open source soundfonts available as well. Since I am a strong advocate of GNU/Linux and FOSS I will consider only open source soundfonts.

The file formats for a soundfont are many: PAT, SF2, SF3, SFZ, and GIG. Both TiMidity++ and fluidsynth can utilize the SF2 format and since this is most common with open-source soundfonts SF2 is the only format considered here.

SF2 soundfonts can exist for only a single instrument such as a piano or organ, or they can exist as a set of individual soundfonts. The most common sets are collections that cover the entire General MIDI range of 128 instruments.

Before any music can be made with MIDI a suitable set of SF2 soundfonts must be installed. The question therefore becomes: where are SF2 soundfonts obtained?

Because digital music is such a popular pastime there are a lot of sources. Just a few involving keyboards are given in the following list. The interested reader is encouraged to perform a search.

Formerly available on the Internet was a large collection (45 GB), called the ZSF Distribution, that contained many different versions of all the General MIDI instruments. It has now disappeared but I had managed to acquire it previously. If any readers are interested I could provide this collection by posting on Usenet.

Each SF2 soundfont, either individually or within a collection, contains a program number (0-127) that identifies the particular instrument, be it a piano, organ, or other. The configuration file for TiMidity++ needs to associate this MIDI program number with the desired soundfont.

As an example, here would be the TiMidity++ config file for a string quartet:

dir /opt/midi/timidity

soundfont FlorestanStringQuartet.sf2 order=0
soundfont ViolaSTR.sf2 order=0
soundfont ViolaPhil.sf2 order=0
soundfont ConcertoCello.sf2 order=0
soundfont FlorestanDblBase.sf2 order=0
soundfont FlorestanStringQuartet.sf2 order=0
soundfont RolandPizzicato.sf2 order=0

bank 0

40 FlorestanStringQuartet
41 ViolaPhil rate=:::90
42 ConcertoCello
43 FlorestanDblBase

In this file, the "dir" keyword indicates the location of the soundfonts. The SF2 soundfonts used are listed next. For MIDI bank 0 the instrument numbers (0-127) are then associated with the individual soundfonts. Optional parameters can also address aspects of the soundfont such as attack rate, decay rate, etc. (See the timidity.cfg man page for the gory details.)

Usually, an entire GM soundfont collection can be specified with the following kind of config file:

dir /opt/midi
bank 0
0 %font FluidR3_GM.sf2 0 0 amp=29 pan=0
1 %font FluidR3_GM.sf2 0 1 amp=41 pan=0
2 %font FluidR3_GM.sf2 0 2 amp=64 pan=0
...
[snip lengthy list of instruments including different banks
and drumsets]
...

In this configuration file, the entire General MIDI range of instruments is included in one large file. The example is the FluidR3 GM soundfont collection.
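Because such a file repeats one line per instrument, its skeleton can be generated rather than typed. The following sketch assumes that every program N maps to preset N of bank 0 in a single SF2 collection, as in the lines shown above; the function name gm_bank_cfg is hypothetical, and the per-instrument amp= and pan= values, other banks, and drumsets are deliberately left out.

```shell
# Emit a TiMidity++ bank 0 mapping for all 128 General MIDI programs,
# each taken from the same preset number of one SF2 collection.
gm_bank_cfg() {
    dir=$1
    font=$2
    echo "dir $dir"
    echo "bank 0"
    prog=0
    while [ "$prog" -le 127 ]; do
        echo "$prog %font $font 0 $prog"
        prog=$((prog + 1))
    done
}

# Example: write the skeleton to a file for later hand tuning.
# gm_bank_cfg /opt/midi FluidR3_GM.sf2 > timidity.cfg
```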

However, individual soundfonts, as opposed to those within collections, tend to be of much higher quality. A GM collection, IMO, is useful only for quickly previewing various MIDI files with unknown program information. For serious, final rendering, combining individual soundfonts as in the first TiMidity++ config example is the best way to go.

Regarding fluidsynth, either single soundfonts or entire collections can be specified in the command line:

fluidsynth ... /path/to/soundfont_1.sf2 /path/to/soundfont_2.sf2 ... file.mid

fluidsynth ... /path/to/GM_soundfont.sf2 file.mid

MIDI File Sources

To obtain actual MIDI files for rendering there are many sources. I provide only a small list for classical MIDI files. All readers are encouraged to actively search as old links may vanish and new ones often appear.

A very large source for classical MIDI is the Classical Archives. This fine site used to be free and now requires a subscription, although I believe that a small ration can still be downloaded freely on a daily basis.

A truly outstanding and benevolent contribution of classical MIDI files was made by John Sankey, the former “Harpsichordist to the Internet.” Some of Sankey’s fine work is rendered by me in the examples below.

Other links:

MIDI Renditions with GNU/Linux

The sound files given here were produced with either TiMidity++ or fluidsynth using the commands shown above to create an audio file in WAV format. The audio data from the WAV file was then compressed to high quality OGG format using the oggenc utility from the vorbis-tools package:

oggenc --quality=10 file.wav
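For a batch of rendered files the encoding step can be looped. This is a sketch only; encode_all and the OGGENC variable are hypothetical names of my own, and the "-o" option is used just to make the output name explicit.

```shell
# Encode each WAV file given as an argument to a high quality OGG file
# of the same base name.
encode_all() {
    for wav in "$@"; do
        "${OGGENC:-oggenc}" --quality=10 -o "${wav%.wav}.ogg" "$wav"
    done
}

# Example (requires oggenc from vorbis-tools):
# encode_all *.wav
```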

To ensure that the audio output has maximized the dynamic range without sample clipping, the rendering was done at several volume/gain levels and then the data was checked for range and clipping using the Sound eXchange (SoX) utility:

sox file.wav -n stats

As a consequence, these audio files should play without clipping distortion on any audio system.

The peak amplitude reported is shown for each of the MIDI renderings below.
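When several gain levels are being compared, the peak figure can be pulled out of the statistics automatically. A minimal sketch follows; peak_db and below_full_scale are hypothetical helper names, and treating any peak below 0 dB as free of clipping is my own simplifying assumption.

```shell
# Extract the overall peak level (in dB) from "sox FILE -n stats" output
# read on standard input. The overall value is the first number on the line.
peak_db() {
    awk '/^Pk lev dB/ { print $4 }'
}

# Succeed when the given peak is below 0 dB, i.e. no sample reached full scale.
below_full_scale() {
    awk -v pk="$1" 'BEGIN { exit !(pk < 0) }'
}

# Example (requires sox; the statistics are printed on standard error):
# pk=$(sox file.wav -n stats 2>&1 | peak_db)
# below_full_scale "$pk" && echo "peak $pk dB, below full scale"
```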

If deemed necessary, the audio data was processed with the normalize utility to promote maximum dynamic range:

normalize file.wav

Optionally, the resultant audio waveform can be examined with the graphical sound file editor mhWaveEdit.


The Music

I begin with the Scarlatti Sonata in D minor, K.9, from John Sankey (linked above). I have lost the origin of the harpsichord soundfont that I used but the file metadata states that it is in the public domain. I like the very harsh and strident sound of this harpsichord. To me, it seems that such stridency is appropriate to the intensity of the music. Others may disagree.

Dynamic Range Data:

             Overall     Left      Right
Pk lev dB      -0.07     -0.07     -0.11
RMS lev dB    -14.73    -14.84    -14.62
RMS Pk dB      -8.75     -8.75     -9.07
RMS Tr dB     -78.70    -75.31    -78.70

However, Sankey recommends using his own harpsichord soundfont (link here) for the rendering of his files. To conform to his wishes I include another rendering of Scarlatti K.9 using the Sankey soundfont.

Dynamic Range Data:

             Overall     Left      Right
Pk lev dB      -0.33     -0.33     -0.33
RMS lev dB    -15.48    -15.48    -15.48
RMS Pk dB      -9.80     -9.80     -9.80
RMS Tr dB     -94.55    -94.55    -94.55

Next up is the Bartok Piano Sonata, 1st Movement using the Yamaha Disklavier Pro grand piano SoundFont.

Dynamic Range Data:

             Overall     Left      Right
Pk lev dB      -0.58     -0.58     -0.94
RMS lev dB    -20.07    -19.41    -20.86
RMS Pk dB      -9.31     -9.31    -10.11
RMS Tr dB     -53.08    -50.95    -53.08

MIDI is well suited to microtonal music, which employs unconventional scales and intervals. Rendered next is a microtonal piece by Jeff Harrington for 19-tone equal temperament. The title of this work is "Prelude 3 for 19ET Piano". The soundfont used is named "Giga Steinway B" but the origin has been lost to me. There is no copyright information given within the file but the author is given as "Vienna Master."

Dynamic Range Data:

             Overall     Left      Right
Pk lev dB      -1.08     -1.08     -3.88
RMS lev dB    -21.39    -19.76    -24.02
RMS Pk dB     -10.47    -10.47    -14.24
RMS Tr dB     -70.19    -67.72    -70.19

Now comes the rather fierce Ligeti Piano Etude 14a, again using the Giga Steinway B soundfont.

Dynamic Range Data:

             Overall     Left      Right
Pk lev dB      -0.70     -0.70     -1.63
RMS lev dB    -15.01    -13.68    -16.94
RMS Pk dB     -10.09    -10.09    -12.39
RMS Tr dB     -24.84    -18.52    -24.84

More conventional, perhaps, is the Beethoven Piano Sonata No. 1, First Movement. The soundfont is the YDP Grand Piano that is referenced above.

Dynamic Range Data:

             Overall     Left      Right
Pk lev dB      -0.11     -0.11     -0.16
RMS lev dB    -20.02    -20.30    -19.75
RMS Pk dB      -8.22     -8.22     -8.63
RMS Tr dB     -87.61    -87.58    -87.61


As promised earlier, the dynamic range information and MIDI file for the J.S. Bach Toccata are now provided. The soundfont used for the rendering is named "Church Organ 1 (Mono-Econo)" and is copyright 1997 by Sonido Media. Its origin, however, has been lost to me.

Dynamic Range Data for J.S. Bach Toccata and Fugue in D minor:

             Overall     Left      Right
Pk lev dB      -0.11     -0.11     -0.16
RMS lev dB    -20.02    -20.30    -19.75
RMS Pk dB      -8.22     -8.22     -8.63
RMS Tr dB     -87.61    -87.58    -87.61

Accessory GNU/Linux Audio Software

MIDI files contain information that is largely not readable by humans, but various GNU/Linux programs are available that allow the editing or viewing of MIDI data.

For editing, the excellent midicsv/csvmidi programs, by John Walker, convert MIDI data to human readable text and back again. The idea is to convert MIDI to text, then make any changes with a text editor, and then convert back to MIDI.
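Simple edits to the text form can also be scripted instead of made by hand. The sketch below rewrites every program change record in midicsv output; set_program is a hypothetical helper name, the record layout "Track, Time, Program_c, Channel, Program_num" is as I understand the midicsv format, and the target of 19 (Church Organ when General MIDI programs are counted from 0) is only an illustrative choice.

```shell
# Rewrite every program change in midicsv text (read on standard input)
# so that it selects the given program number. Other records pass through.
set_program() {
    awk -F', ' -v OFS=', ' -v prog="$1" \
        '$3 == "Program_c" { $5 = prog } { print }'
}

# Round trip (requires midicsv and csvmidi):
# midicsv file.mid | set_program 19 | csvmidi > file-organ.mid
```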

A more direct, graphical approach to both editing and viewing is MidiEditor:

For generating a musical score from a MIDI file, as well as performing edits, the NtEd program is very useful:

GNU/Linux has far more sophisticated MIDI software available but since my interest in MIDI is only for the casual rendering of existing MIDI files I will make no reference to anything beyond these simple utilities.

Since SF2 soundfonts are critical to rendering MIDI a soundfont editor is another important utility. On GNU/Linux systems the Swami editor is a fine choice:

The Swami editor can access any and all information within a soundfont but, for me, the most common use is to quickly change the internal program number. As indicated above, each MIDI instrument is assigned a program number from 0-127 but some individual soundfonts are distributed with their program numbers set at 0. For an organ soundfont, for example, changing this to an organ program number such as 19 (Church Organ, when General MIDI programs are counted from 0) is necessary for it to be used to render organ MIDI music.
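To find out which program numbers a given MIDI file actually requests, and hence what a soundfont must be set to, the midicsv text form can be filtered. This is a sketch assuming the midicsv record layout "Track, Time, Program_c, Channel, Program_num"; list_programs is a hypothetical helper name.

```shell
# List the distinct program numbers requested in midicsv text
# read on standard input, in ascending order.
list_programs() {
    awk -F', ' '$3 == "Program_c" { print $5 }' | sort -n | uniq
}

# Example (requires midicsv):
# midicsv file.mid | list_programs
```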

Epilogue

My intention in this article was to demonstrate that MIDI files can be an efficient storage format and reproduction medium for select types of classical music, especially keyboard compositions.

I have acquired an extensive collection of classical piano, organ, and harpsichord works from all historical periods that is contained in only a few megabytes of digital storage space using MIDI files. These files can be rendered into music at will that is virtually identical to a real performance or recording.

To have acquired an equivalent collection through the purchase of audio CDs or other media would have cost many thousands of dollars and would have necessitated extensive storage space. Even lossless FLAC format audio files would require hundreds of gigabytes of storage.

I am aware that using MIDI files exclusively for some music types may be considered a bit unconventional by many people, but the interested reader is encouraged to at least experiment with this format to judge for himself whether it can be useful.

Furthermore, this article also attempts to demonstrate the power and utility of GNU/Linux software to process and render MIDI files.