Does AAMS really master audio automatically and how is this done ?
Yes, AAMS Auto Audio Mastering System is a program designed for audio mastering.
Since the idea of automatic mastering on a computer was first written into the AAMS V1.0 software in 2004, there have been many improvements.
With AAMS V2.0, automatic mastering became a fact, mainly through the use of DSP-EQ, DSP-Compressor and DSP-Loudness.
With the AAMS Reference Database of over 200 reference styles, AAMS V2.6 became a tool that lets most users master audio in a single go.
The complete internal processing of AAMS was enough to do everything inside the AAMS program, with a few button clicks.
AAMS also became more versatile: for beginners and musicians alike it became a fast tool.
More professionally minded users can adjust AAMS to their needs, or use it as a tool between mixing and mastering.
First and foremost, the AAMS software is there to give users a one-stop tool for mastering their audio, but it is also a learning tool.
One of the most important aspects of mastering audio is shaping the sound, especially making it sound the way you want it to be heard.
There are also many different kinds of playback systems to consider: radios, mobile phones, speakers, headsets and car stereos.
The AAMS Analyzer instructs the AAMS DSP Processors to act according to audio mastering rules.
The file that the user inputs into AAMS is called the source file.
The user can then choose a reference file, typically from the database of 200 reference files supplied with AAMS.
But how does AAMS calculate the settings ?
AAMS internal analyzer will first read and analyze your mix, song or track and output a Source file.
This Source file will represent your own audio material.
When you load the Source back into AAMS and choose a matching Reference from the database of 200 available style presets,
AAMS will automatically generate suggestions for mastering your audio material.
These are displayed in charts and information screens for the user to check what AAMS is suggesting.
The internal DSP-Processing consists of a variable 1-100 Band Equalizer, 1-8 Multi-Band Compressor and a Loudness Maximiser.
These three modules will automatically balance, EQ and compress your audio material when chosen, and carefully make it as loud as possible without losing quality.
AAMS can process a single master or even multiple masters automatically.
A Mastered Audio file will be written to your hard disk and you can listen to this Audio file directly.
Does AAMS master accurately and is it easy to learn ?
Great mastered mixes sound much better, but it’s hard to master with speakers and listening environments that don't represent a good flat frequency spectrum.
AAMS takes the mastering out of your hands and lets you do it in a few simple, fast and easy steps with no fiddling around.
AAMS does this by matching your audio material with the provided Reference Styles.
AAMS works automatically and all mastering functions can be done with a minimum of effort.
The learning curve of AAMS is such that you can be creating great masters faster and more accurately than before.
AAMS does not simply assist you when mastering; it works out and calculates all necessary changes using internal Audio functions and DSP-Processing, and writes the results as a complete mastered audio file.
AAMS provides a convenient way to get you going and making a good sounding master from your mix.
What is the main file format for loading audio into AAMS ?
The main file format is Wave format, 16-bit integer, 24-bit or 32-bit float, 44,100 Hz stereo.
Although AAMS can convert MP3 Files to Wave Format for import, we recommend using Wave format files.
The supported audio formats are WAV, WavPack, MP3, MP2, AAC, WMA, M4A, OGG Vorbis and APE.
How does AAMS analyze audio material ?
When an audio file is imported into AAMS, the AAMS Spectrum Analyzer analyzes the file in chunks of audio and can handle any frequency from 5 Hz to 22.5 kHz.
The frequency spectrum is set up in logarithmic order, so that the bottom end frequencies are shown in more detail and the spectrum builds upwards to the high frequencies.
This makes it easier for the user and AAMS to recognize if the bottom end frequencies are correctly matched or see the whole spectrum or equalization points.
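As a rough illustration of the logarithmic layout described above, the sketch below spaces analysis bands evenly on a logarithmic axis between 5 Hz and 22.5 kHz. The band count of 100 and the function name `log_band_centers` are assumptions made for this example only; the actual band layout used inside the AAMS Analyzer is not described in this FAQ.

```python
def log_band_centers(f_low=5.0, f_high=22500.0, bands=100):
    """Return band centre frequencies spaced evenly on a logarithmic axis.

    The defaults are illustrative assumptions, not AAMS internals.
    """
    if bands < 2:
        return [f_low]
    # A constant ratio between neighbouring bands gives logarithmic spacing.
    ratio = (f_high / f_low) ** (1.0 / (bands - 1))
    return [f_low * ratio ** i for i in range(bands)]

centers = log_band_centers()
# Compare how many bands describe the bottom end and the top of the spectrum.
bands_below_100hz = sum(1 for f in centers if f < 100.0)
bands_above_10khz = sum(1 for f in centers if f > 10000.0)
print(bands_below_100hz, bands_above_10khz)
```

With this kind of spacing, the narrow range below 100 Hz receives more bands than the much wider range above 10 kHz, which is why the bottom end shows up in more detail.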
The Bottom End is usually not heard on small monitoring speaker systems and most household stereo equipment.
Sometimes there is a lack of highs and some speakers produce some distortion.
The spectrum range from 0Hz to 50Hz is often not heard correctly on most speakers, but AAMS Analyzer effectively calculates the spectrum and its EQ so that it can calculate the differences consistently and accurately.
AAMS calculates the whole frequency spectrum from 5 Hz to 22.5 kHz for the analyzed audio.
This is separated in left and right signals so that if there was a balancing problem this is also resolved within the calculations.
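To make the idea of a left/right balancing problem concrete, here is a minimal sketch that compares the RMS level of the two channels and reports the offset in dB. It works on the whole signal rather than per frequency band, and the helper names `rms` and `balance_db` are hypothetical; how AAMS itself performs this calculation is not documented here.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples in the range -1.0..1.0."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def balance_db(left, right, floor=1e-12):
    """Level difference in dB; a positive value means the left channel is louder."""
    # The floor avoids log10(0) and division by zero for silent channels.
    return 20.0 * math.log10(max(rms(left), floor) / max(rms(right), floor))

# Hypothetical test signal: the right channel is at half the left amplitude.
left = [0.5, -0.5] * 100
right = [0.25, -0.25] * 100
offset = balance_db(left, right)
print(round(offset, 2))  # 6.02
```

An offset near 0 dB indicates a balanced signal; here the right channel would need roughly 6 dB of gain to match the left.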
A parametric equalizer suggestion is calculated and from that comes the EQ Final section and EQ Preset (Graphic EQ) setup from user defaults.
They represent a parametric equalizer and a graphical equalizer.
After this the Multi-Band Compression is calculated so that the user is informed about compression bands and how to reduce peaks and how to get your tracks more even.
The loudness display will make it easy to see if the overall volume is loud enough to match a level that is as loud as possible.
How does the AAMS Analyzer work ?
The AAMS Analyzer is what sets AAMS apart: it is not just a spectrum analyzer, but is programmed to make the mastering process easy and to carry out calculations beyond the normal spectral views.
AAMS also provides information about Multi-Band Compression settings and Loudness.
AAMS shows you how to set up the mastering chain and gives detailed information, allowing you to do the job in a minimal amount of time and with great accuracy.
When you load an audio file into AAMS it will investigate and hunt down frequency information and save it to disk.
AAMS will then do some heavy calculations and create a new Source Preset.
When you load a Source preset and a Reference preset AAMS automatically calculates equalization, Multi-Band compression and loudness settings.
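The matching of a Source preset against a Reference preset can be pictured as a per-band difference. The sketch below assumes that both presets are simply tables of band levels in dB and that the suggested gain is "reference minus source"; the data layout, the values and the function name `eq_suggestion` are illustrative assumptions, not the AAMS preset format or its real calculation.

```python
def eq_suggestion(source_db, reference_db):
    """Per-band gain in dB that moves the source spectrum onto the reference."""
    return {
        band: round(reference_db[band] - source_db[band], 2)
        for band in sorted(source_db)
        if band in reference_db
    }

# Hypothetical band levels in dB, keyed by centre frequency in Hz.
source = {60: -14.0, 250: -20.0, 1000: -24.0, 8000: -34.0}
reference = {60: -18.0, 250: -20.5, 1000: -23.0, 8000: -30.0}

suggestion = eq_suggestion(source, reference)
print(suggestion)  # {60: -4.0, 250: -0.5, 1000: 1.0, 8000: 4.0}
```

In this example the source has too much bottom end and too little top compared with the reference, so the suggestion cuts 4 dB at 60 Hz and boosts 4 dB at 8 kHz. Swapping the two arguments inverts every gain, which is why it matters which file is loaded as Source and which as Reference.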
How do I register the AAMS Software with the returned Keycode ?
The AAMS software is freeware, but users are encouraged to register for the Full Professional Version.
You can show your support and appreciation for AAMS and future development by registering.
To make full use of the AAMS Software Package, register via the Registration page.
You can also use the AAMS Contact Page and ask for an invoice; be sure to add your own email address.
A registered and licensed user can make use of all AAMS V3 functions, with no professional options blocked!
Follow these instructions and send us your registration for a license!
You need to download AAMS (www.curioza.com) from the download page.
Then open the 'AAMS.zip' file and start 'AAMS V3 Setup.exe'.
When AAMS is fully installed, start AAMS.
Go to the 'About' tab (1).
(2) Fill in your username (for example 'Denis van der Velde') and copy it, then (3) copy the Installcode.
We will respond to your registration and payment by email with the corresponding keycode...
Do you have your username and Installcode, and have you received the Keycode ?
Open the AAMS software and fill in your username, for instance 'Denis van der Velde' (this is an example; use your own username instead).
Fill in the Keycode we have sent you, for instance 'AAMS-XX-XX-XXX' (this is an example; use your own Keycode instead).
Use the 'Registration' button in the About tab of the AAMS software and follow the instructions.
You are now a Registered and Full Professional AAMS user!
We put a lot of effort into the programming, testing and development of AAMS.
And we are still developing to make AAMS better.
So support AAMS by registering and obtaining a full professional license.
Denis van der Velde
Sined Supplies Inc.
Does AAMS tell me what I have been doing right or wrong in my mix or already made masters ?
The AAMS Analyzer function can handle any mixed or mastered audio file in the correct format.
It displays the frequency spectrum or EQ data against a reference style of your choice, effectively showing you what you are doing right or wrong.
As there is basically no difference between a source preset and a reference preset, you can make either of them yourself or load a reference from the prepared reference style database.
After choosing a source and a reference, AAMS will show you the complete bottom end (0-120 Hz), low mids (120 Hz-2 kHz), mids (2-10 kHz) and high frequencies (10-20 kHz).
With AAMS you won’t have to bother with equalization settings yourself as AAMS calculates an equalization preset for you that is very precise and accurate.
What are the requirement specifications for AAMS ?
AAMS is not really a processor and memory hungry application.
AAMS runs best on a fast computer with 2 GB of memory.
The AAMS Analyzer will take some time to analyze an audio file.
With the recommended setup it will take about 2 to 5 Minutes before AAMS Analyzer finishes.
AAMS also uses memory to store the audio file that is processed and the amount of memory needed depends on the length and size of the loaded audio file.
The DSP-Processors need less time and less memory, but having a fast processor and free memory will mean that Windows uses the swap file less and uses RAM memory instead.
Please be patient and wait for AAMS to complete processing and you will end up with a good sounding master.
Recommended setup:
Processor - Pentium 4
Processor Speed - 3 GHz
RAM Memory - 4 GB
Hard Disk Space - 1 GB
Minimum setup:
Processor - Pentium 3
Processor Speed - 1.5 GHz
RAM Memory - 1024 MB
Hard Disk Space - 500 MB
What about single audio tracks ?
AAMS can import and analyze single audio tracks in the same way as full mixes or any other audio material.
All you have to do is choose a reference style that fits the source audio track well.
You can either look in the reference style database or generate a new reference preset.
Let’s say you'd like to equalize a guitar-instrumental track with AAMS but you don't find a good reference style preset in the database.
You can generate a new reference style preset by importing more guitar-instrumental audio tracks.
Use a number of recordings that best match the intended result as the source material and import them into AAMS.
Now AAMS will average the references and you can save a new preset that you can now use as a new reference style.
When you load up your newly made reference against the source, AAMS will calculate the differences.
This is basically called matching, but AAMS can take this quite a bit further, as we introduce Batch Matching.
What is a Batch ?
A batch is a selection of audio files.
When you want to make a new reference style you can select any audio material and place it in a single directory.
You can now let AAMS carry out the analysis.
AAMS will analyze each file separately and save a reference preset accordingly in the directory next to the audio material.
When the analysis is finished, you can generate a new reference style by selecting all analyzed references and batching them into one single preset.
The Batch function creates an average of all reference presets in the list.
When you save this new reference style preset in the database directory the new style can now be used against your source audio material.
This way you can make new matching style presets for every source.
These functions create endless possibilities to add new styles to the database.
New styles that are shared among the users of AAMS will be included in the general internet database of AAMS and in subsequent versions of AAMS.
In this way we hope to encourage users to send in their own presets and make the database bigger and better with every new release of AAMS.
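The batch averaging described above can be sketched as a band-by-band mean over several analyzed presets. The example assumes each preset is a table of band levels in dB and uses a plain arithmetic mean of those dB values; both the data layout and the averaging method are assumptions for illustration, since this FAQ does not specify how AAMS computes its average.

```python
def batch_average(presets):
    """Average several presets (band -> level in dB) into one preset."""
    if not presets:
        raise ValueError("need at least one preset to batch")
    # Only keep the bands that every preset contains.
    shared = set(presets[0])
    for preset in presets[1:]:
        shared &= set(preset)
    return {
        band: sum(p[band] for p in presets) / len(presets)
        for band in sorted(shared)
    }

# Hypothetical analysis results of three guitar-instrumental recordings.
guitar_takes = [
    {100: -20.0, 1000: -24.0, 5000: -30.0},
    {100: -22.0, 1000: -23.0, 5000: -33.0},
    {100: -21.0, 1000: -25.0, 5000: -30.0},
]
new_style = batch_average(guitar_takes)
print(new_style)  # {100: -21.0, 1000: -24.0, 5000: -31.0}
```

Averaging more recordings smooths out the quirks of any single one, which is the reason for batching several files into a reference style.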
How can I set up my equipment for use with AAMS ?
The recommended Effect chaining is as follows:
1. Equalization
2. Multi-Band Compression
3. Loudness Maximizing
When you use a hardware Equalizer or any VST, DirectX, RTAS or AudioUnit plugin, the routing or chaining is always subject to change, as each user can decide what equipment to use alongside AAMS.
Actually AAMS does not make use of a soundcard.
So the soundcard on your system is free to do the mastering process in any given Audio Editor / Sequencer / Plugin or Hardware that can handle these effects.
AAMS just helps you to work much faster and more accurately.
What can I do to check my Master ?
If you just mastered a track and want to compare it with the original, you can analyze the master again.
You can check your Master against any Reference.
You can see in the EQpreset what EQ differences there are and you can also make a new Reference while you are mastering your track.
Right after the Equalization and Multi-Band Compression processing is applied to your Mix, save the results.
When you analyze and compare this to your Master you can see the differences in EQ.
For instance, when I was using Izotope Ozone's Loudness Maximiser, there was a distinct peak of 4 to 6 dB at 160 Hz when I was boosting the Master.
For other processing like Reverb or Delay (if you use them while Mastering your track) it's best to make a comparison like this.
You must load the Master as Source and compare it to a Reference of choice.
Then you can see in the EQ-preset Tab of AAMS what differences there are in EQ.
As a final mastering step you could correct using the EQ-Preset and equalize your Master again, then normalize it to 0 dB; this would be your Final Master.
The final sound is what is important and correcting the EQ of your Master is usually more important than loudness.
Comparing is a good thing to do and is the main function of AAMS. But remember to be aware of what Source and Reference you are comparing.
When you swap source and reference, you might decide on the wrong EQ settings and your Master will sound worse.
You have to understand what you are doing, so play around with it and learn from it.
Then you can decide whether comparing your Master is a good idea and add it to your mastering routine, in particular if you are using other plugins or outboard equipment like Delay, Reverb or anything else that can change the EQ frequency spectrum of your Master.
Comparing and correcting is a very good thing to do.
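The final step mentioned above, normalizing the corrected Master to 0 dB, amounts to scaling the audio so that its highest peak just reaches full scale. The sketch below shows this on a plain list of samples in the range -1.0 to 1.0; the function name `normalize_peak` and the sample values are hypothetical, and an audio editor would do the same on the whole file.

```python
def normalize_peak(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak (1.0 = 0 dB)."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silence cannot be normalized
    gain = target_peak / peak
    return [s * gain for s in samples]

# Hypothetical samples of a corrected master that peaks at half of full scale.
corrected_master = [0.1, -0.5, 0.25, 0.4]
final_master = normalize_peak(corrected_master)
print(final_master)  # [0.2, -1.0, 0.5, 0.8]
```

Normalizing applies one gain to the whole file, so it changes the level but not the balance of the frequency spectrum.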
What can I do to check my mix before mastering ?
A 'mix' is exactly that, all tracks mixed together.
It doesn’t really matter what levels there are on the master-fader vu-meter, as long as your mix stays below 0db.
A mistake would be trying to make the mix as loud as possible at this stage.
Mixing means managing all instruments and vocals, and it is more important to get all tracks to sound good together than to make all tracks as loud as possible.
Don't overload your mix, and keep the master fader at 0 dB.
Don't touch the master fader and instead correct levels by using the track faders.
A good reference point is the bass drum (if there is one in your mix).
It's good to start looking at the bass drum on the master fader VU meter.
Set the peaks of the bass drum at about -6 to -10 dB maximum peak while you're soloing the bass drum (so the only thing you should hear over the master-fader is the bass drum).
The -6 to -10db space that is left should be sufficient to leave enough room for all other tracks and avoid hitting 0db on the master fader.
When mixing tracks try to set them lower than the bass drum track and make sure you don't go over 0db on the master fader.
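The bass drum guideline above can be checked numerically. This minimal sketch converts the highest sample peak to dB relative to full scale and tests whether it falls inside the -10 to -6 dB window; the sample values and the helper names `peak_db` and `within_target` are hypothetical, and in practice the master fader meter of your sequencer shows the same reading.

```python
import math

def peak_db(samples, floor=1e-12):
    """Peak level in dB relative to full scale (a sample value of 1.0 = 0 dB)."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(max(peak, floor))

def within_target(samples, low_db=-10.0, high_db=-6.0):
    """True if the peak sits inside the -10 to -6 dB window from the text."""
    return low_db <= peak_db(samples) <= high_db

# Hypothetical soloed bass drum hit peaking at 0.4 of full scale.
kick = [0.0, 0.4, -0.3, 0.1, 0.0]
print(round(peak_db(kick), 2), within_target(kick))  # -7.96 True
```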
Another thing to keep in mind is Balancing the mix.
You can use panning on all tracks to make a nice wide mix and most stereo mixes use this kind of panning.
Check the levels on the master fader and keep them balanced (left and right signals should be balanced).
When you output or render your mix, check that it peaks somewhere between about -6 dB and just below 0 dB.
There are some other things to keep in mind after the mix is done.
Don't use reverb while you are mastering; instead, step back to the mix, use reverb or delay inside the mix and render it again.
Any effect like chorus, flanger or other type of effect plugin should remain inside the mix.
When it is not possible to change the mix you can use effects/plugins inside the mastering chain.
Remember that then the spectrum will change and you need to create a new import file for AAMS to “see” the new spectrum.
It's better to try to make your mix as balanced as you can, using all kinds of effects inside the mix to make it sound better.
Make all tracks work well together, keep the levels good and concentrate on making the music / mix.
The mastering is just to make the mix louder so it comes to 'commercial levels'.
How loud must music sound ? (Loudness War)
For most types of music the level should be about -10 to -12 dB RMS, and for softer music about -15 dB RMS.
The “Loudness Race” for the loudest sound has been going on for many years now and some music goes as loud as -4 dB RMS or -6 dB RMS, but this is still not the case for most music.
So how do we see 'how far we can go' with our own songs and music material?
Using a Loudness Maximiser in an Audio Editor gives a clear indication in the waveform view: as you add more loudness, the average RMS should rise towards -10 dB, or stop just before the waveform starts to clip.
With a reading of -10db RMS most music material looks like it has been clipped and in an Audio Editor this is indicated by straight cut-offs at the 0db line.
Although this may sound good at -10db RMS, there is a consideration to deal with here.
First of all, most Loudness Maximizers change your Equalization a bit and although this is not much, it might be enough to make a difference.
For example, Izotope’s Loudness Maximiser, boosting at -6db to get to -10db RMS, shows an Equalization difference of 2db in the 160Hz range which is quite a lot (read about Comparing elsewhere in this FAQ). Although the rest of the spectrum seems quite unharmed, when you try boosting 2db at the 160Hz range inside your mix it makes a difference.
So with most Loudness Maximizers boosting to the loudest possible sound is making slight changes to the frequency spectrum (Equalization) and therefore changes how your music sounds.
Can we measure this behavior? Yes, first take an audio track and start mastering it until you are done with the Multi-Band Compression, then save it as 'Compressed Master.wav' or similar.
Then continue with mastering using the Loudness Maximiser and finish up mastering, saving the results as 'Loudness Master.wav' or similar.
Now you can first import your 'Compressed Master.wav' into AAMS as a Reference.
Then import the 'Loudness Master.wav' file as a Source.
Now you can see in the Spectrum Tab or EQ-Preset Tab what differences there are when using a Loudness Maximiser.
When you want to subtract that EQ-preset, load the 'Loudness Master.wav' as Reference and 'Compressed Master.wav' as Source.
You can see what you need to Equalize to make your 'Loudness Master.wav' sound the same as 'Compressed Master.wav'.
Use an Equalizer and the EQ-Preset will show what you need to do.
You can now use some Equalization and then Normalize to 0db and finish off a master without the differences in Equalization behavior caused by a Loudness Maximiser.
Though this is a more complex method in finishing a master, it is recommended that you try this and see for yourself if it suits your needs.
You can also use AAMS to measure other plugin or equipment behavior.
When using a Loudness Maximiser it is recommended that you do not use too much loudness.
When you see in an Audio Editor that your waveform view is reaching or hitting the 0db line and the RMS reads about -10db, you might just consider reducing the Loudness and move the Threshold down a bit.
As long as you’re not clipping your Audio Material at the 0 dB line, fewer Equalization differences should be applied to the Audio Material.
It is usually best to aim at -10 to -12 dB RMS, but you need to play around with the Threshold of the Loudness Maximiser, measuring and correcting the Loudness Maximiser's behavior.
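For readers who want to see what an RMS figure such as -10 to -12 dB means, the following sketch computes the RMS level of a signal in dB relative to full scale. The test tone is hypothetical, and the calculation uses the plain mathematical convention in which a full-scale sine reads about -3 dB; some meters are calibrated so that a full-scale sine reads 0 dB, which shifts all readings by about 3 dB. Which convention AAMS or your Loudness Maximiser uses is not stated in this FAQ.

```python
import math

def rms_db(samples, floor=1e-12):
    """RMS level in dB relative to full scale (a sample value of 1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    # The floor avoids log10(0) for digital silence.
    return 20.0 * math.log10(max(math.sqrt(mean_square), floor))

# Hypothetical test tone: a sine at 0.4 of full scale, ten full cycles.
n = 4410
tone = [0.4 * math.sin(2.0 * math.pi * 10 * i / n) for i in range(n)]
level = rms_db(tone)
print(round(level, 1))  # -11.0
```

Real music has a much larger gap between peak and RMS than a sine, which is why reaching -10 dB RMS usually requires a limiter or Loudness Maximiser.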
What is the difference between a Graphic EQ and Parametric EQ ?
The difference is that the Graphic EQ (EQ-Preset) is automatically calculated by AAMS and you don’t need to adjust the Q-Smooth in AAMS or the Q-Factor of your Graphic EQ.
When you use a Graphic Equalizer with 31 Bands or 50 Bands, you will notice that it is much easier and faster to set up.
A Parametric EQ (EQ Final), by contrast, is difficult to set up: you need a corresponding Q-Factor, and most users will have problems doing this correctly, so the outcome is less accurate.
When in doubt, it is strongly recommended that you use a Graphic Equalizer.
Can I use Cubase, Sonar, Logic or a sequencer to master my mix ?
It’s much easier to master audio outside these apps, using Wavelab, Soundforge or similar audio editor.
When you use a midi/audio sequencer to master, you are not in control of some necessary mastering steps.
Yes, you could use a Graphic Equalizer / Multi-Band Compression and Loudness inside a sequencer.
But you do not have the same degree of control as when you are using an audio editor.
You cannot normalize or check if your output goes above 0db and that might cause unwanted effects.
As you place effects as a serial chain on the master-out bus, you should check levels in between the effects that you use.
Check levels by turning Mastering Effects on and off and see if the audio still plays correctly.
We do recommend using a real standalone audio editor like Wavelab or Soundforge instead of mastering inside a sequencer.
When do I need to reconsider my mix ?
You can’t polish a turd.
There are a few points to consider when revisiting a mix.
The first thing is when an EQ-Preset results in several EQ points going over 10db or -10db.
You could still master this way but it's better to adjust your mix and then return to AAMS.
It's quite common to overdo bottom end frequencies and most of the time AAMS EQ-preset will show a major cut in this section.
You should consider doing another mix and cutting out some bass and bass drums or single tracks that are heavy on the lower frequencies.
When you mix, take the bass and bass drum as they come, but lower some of the bottom end frequencies on other instruments or vocals,
which should help to give a better and confident bottom end.
Just keep checking with AAMS how far you can go. Next, the average compression levels should not exceed -4 to -5 dB or 4 to 5 dB.
This is quite a heavy setting for compressors and you should do some additional compression inside your mix.
When you listen to your mix and single tracks, after a while you should hear what tracks or instruments need more compression and you will learn a great deal by returning to AAMS and checking again.
The loudness of your tracks should not exceed 0db peak level as this will give strange clipping effects.
When you do a good job on Equalization and Compression it's easy to get the right level for Loudness between -10db and -12db RMS.
There is a warning system built into AAMS that you can switch on in the Options Tab.
Then AAMS will check your imported track and warn you to consider redoing your mix.
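The rules of thumb from this section can be collected into a small checklist. The sketch below is not the warning system built into AAMS; it only applies the limits mentioned above (EQ points beyond 10 dB, compression beyond about 5 dB, peaks above 0 dB) to hypothetical values. The function name `mix_warnings`, its parameters and the choice that "several" means two or more EQ points are all assumptions.

```python
def mix_warnings(eq_db, compression_db, peak_db,
                 eq_limit=10.0, comp_limit=5.0, min_eq_points=2):
    """Return a list of reasons to revisit the mix; an empty list means none found."""
    warnings = []
    # Bands where the suggested EQ gain is more extreme than +/- eq_limit.
    extreme = [band for band, gain in eq_db.items() if abs(gain) > eq_limit]
    if len(extreme) >= min_eq_points:
        warnings.append(f"EQ beyond +/-{eq_limit} dB at bands {sorted(extreme)}")
    if any(abs(c) > comp_limit for c in compression_db):
        warnings.append(f"compression beyond +/-{comp_limit} dB")
    if peak_db > 0.0:
        warnings.append("peak level above 0 dB")
    return warnings

# Hypothetical suggestions for a mix with too much bottom end.
eq = {40: -12.5, 80: -11.0, 1000: 1.5, 8000: 3.0}
comp = [-3.0, -2.5, -1.0, -0.5]
print(mix_warnings(eq, comp, peak_db=-0.3))
```

In this example only the EQ rule triggers, pointing at the 40 Hz and 80 Hz bands, so the advice would be to cut some bottom end in the mix before mastering again.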
Furthermore, Mastering a Mix (while the mix is unfinished) can also be useful, as AAMS will help you with suggestions.
You can hear what instruments or tracks stand out of the mix and what instruments are too soft.
You can also listen to your panning and maybe hear better what you have to do to finish off your mix.
Can I adjust the mix if I think it could be better ?
Yes, after Equalization and/or Multi-Band Compression you can adjust your master.
I would use a parametric EQ and sweep around a little bit.
If you like a touch more bass or highs you can adjust them afterwards.
AAMS is doing an average on your mix to master it, so some users like to compensate for their own speaker-systems or their own hearing.
You can use AAMS to adjust your mix towards a commercial sound and then adjust it with a slight touch.
Sometimes the Loudness Maximiser of choice is introducing some EQ side effects in the Mids and Highs that you may want to compensate for.
Most of the time AAMS settings are spot on but some Plugins introduce Analog Style EQ and introduce some sound changes.
The Elemental Audio Systems - Firium plugin is a Natural EQ and that is about the best you can get.
It's a decent plugin that won't introduce side effects from using the EQ.
Some Plugins have an Analog Style EQ that alters the sound with Tube effects or Analog Style (Tape) EQ.
Especially when you go heavy on the settings on those Plugins.
This will corrupt the suggestion that AAMS has given and lead to a different result.
Most Graphic Equalizers are naturally programmed and it is best to use the settings and suggestions AAMS is giving.
Most parametric Equalizers have some special sound curve like Tape or Tube effects.
So you should generally try to use Plugins and Equalizers that give a natural sound.
What Plugins do you recommend ?
Needless to say AAMS internal DSP-Processing functions are specially designed to make this job automatic and easy.
AAMS can create a fully mastered track from start to end and you can listen to it directly.
If however you want to use other plugins, a mastering equalizer should have a lot of frequency bands (a Graphic Equalizer with at least 31 EQ bands is recommended).
The more EQ bands the better the results are.
An equalizer with a lot of frequency bands is produced by Elemental Audio Systems - Firium (Stereo 50 Band EQ) and is highly recommended for use with AAMS.
With AAMS you can export Firium settings and load them into Firium directly, but any parametric or graphic equalizer plugin would be sufficient.
A Graphic Equalizer is faster and easier to set up, while a Parametric Equalizer is more difficult and less accurate.
A Multi-Band Compressor should have at least 3 Multi-Bands, but 4 or even 5 Multi-Bands are recommended.
The AAMS DSP-Compressor has a maximum of 8 Multi-Bands but these are hard to find as a plugin.
For Multi-Band compression the Waves C4 / Waves LinMB and Izotope Ozone Multi-Band Compressor are great tools.
They give peaked compression gain reduction information per Multi-Band, which is a great way to check the average compression settings AAMS is suggesting.
But there are many other good Multi-Band compressors around.
For making the overall volume in RMS loud enough we recommend Izotope Ozone Loudness Maximiser, Waves L1/L2/L3 Ultramaximizer or any other good working Maximiser, gain or limiter plugin.
Does AAMS automatically balance (pan) the audio material ?
Yes, but maybe not in the way you might expect as AAMS refers to the reference file for panning. This makes an overall panning justification. So yes, AAMS balances overall panning so that the outcome will not have a left or right bias.
Even if some tracks inside the mix are panned left and right completely, this will not be affected by AAMS mastering.
But the justification of the calculations inside AAMS will re-direct the whole audio as one.
This does not mean that the averages of the balance are zero panned as AAMS will match the panning of the reference, so you get the same spectral panning.
If the Reference is recorded unbalanced the outcome will be unbalanced in the same way.
Let’s say the balance of the reference is copied onto the source audio material.
The only way to not compensate for panning differences in equalization is to use the Mono button on the EQ preset window.
Doing so means that an average of left and right is displayed in Mono, and panning is not compensated for.
When Stereo EQ is used there will always be a balanced outcome, allowing AAMS to recognize spectral panning.
How about frequencies that are panned left and right? Again, these reflect the frequency balance of the reference chosen.
If the reference is balanced in a certain way, the outcome will be balanced accordingly.
A Reference Style that consists of multiple audio recordings is preferable, since the higher the count of batched audio Reference files, the better the general balancing outcome.
That is why we suggest that a minimum of 4 to 8 recordings be used to make a new Reference Style.
When audio is taken from multiple commercial recordings like commercial CDs, the balancing is basically true to zero.
Does AAMS do automatic harmonic balancing ?
AAMS modifies the source towards the reference style, and this includes harmonic balancing.
Harmonic balancing is applied when the reference style preset comes from enough different recordings.
The AAMS calculation system applies average controls on the audio source material that refer to the reference style preset.
It balances the frequency spectrum and will balance panning and harmonic equalization.
The success rate is theoretically 100%, but it is affected by the equalization used, in particular by the number of EQ bands in the user's setup.
The more audio clips the reference is based on, the more accurate the style becomes.
Can AAMS handle a compilation for a full album ?
It does not matter much how many tracks or what you have on a full album, AAMS will match your tracks to a reference.
So when you choose a style from the reference database each track is mastered against it.
Master every track how you want, as you would do with a single track.
Usually this is enough to make the full album sound good.
But you could take it one step further and create an average of all tracks using the batch source function.
When you have multiple songs ready for an album and you want them all to sound the same, it’s possible to analyze the complete set of songs in AAMS and make a new averaged Source Preset from all of them.
The Source Batch will calculate and compile the full album with all tracks into a new single Source Preset.
When loaded with AAMS this works the same way as any other source preset and AAMS will calculate the overall difference of the full album.
Use the mastering preset given by AAMS on all tracks and re-master them to get a nice overall sound on your full album.
How was the pre-made Reference Styles Database created ?
The downloadable version of AAMS Auto Audio Mastering System is not stripped down in any basic way and contains the complete style database of 200 reference presets.
Therefore the user can make a new Reference Preset in exactly the same way as most AAMS Reference Styles have been made.
When enough audio is collected, presets are made with the audio batch Analyzer to import incoming audio in a single batch.
When choosing audio material or recordings/songs for a style it’s important to choose at least 5 to 8 recordings of between 3 and 8 minutes in length.
The more recordings you put in, the better results AAMS will produce, creating a very accurate Reference preset.
The collected files for the new Style can be batched as a new Source Preset or Reference Preset.
Use the batch functions and start the batch, when AAMS finishes batching it will let you save it under a new name.
Due to the amount of audio material being processed, AAMS will take some time to finish.
Options are available to speed up the process and make the analysis faster, but less accurate as a result.
The default option is recommended as the best balance of speed and accuracy.
It’s possible to do 1 to 1 analysis that takes the most time but is highly accurate.
Processing time can take from several seconds to several hours depending on the amount of audio material in the collection.
There is no user input needed for processing, so be patient and have a good cup of coffee.
At the end of processing you can save a new Reference Preset (a reference style).
You can load the preset back into AAMS and use it over and over again.
AAMS will load its own preset much faster (within a few seconds), making it fast and easy to check and try settings with presets.
It’s also possible to batch Source Presets and Reference Presets together and join them into a new preset.
This is how the overall RMS.aam Preset came about: it is the result of all 200 styles batched together with AAMS.
The batching system offers simple but limitless use of presets and gives every user the chance to make presets themselves,
based on any preset from the Styles Database or on any other AAMS preset.
New presets will be shared on the web site and will be redistributed with newer versions of AAMS.
AAMS is programmed as an open database for the users, so they can expand the styles database, generate new presets of their own and share them if they wish.
Can AAMS be used to check equipment behavior ?
Yes. Let’s say you are running a hardware loudness Maximiser or plugin to affect your mastering routing.
Does this deliver the required rise in volume without adding artifacts in the equalizer spectrum?
Does your equipment deliver the goods as is expected?
Or does it introduce differences or artifacts in the equalization spectrum that you did not expect or hear?
If you want to check this, let AAMS analyze the audio material you are testing without the effect, so a clean version will display as source.
Then apply the effect on the audio material and let AAMS analyze this and import it as reference.
Now you can see the differences that are introduced by the effect or equipment you used.
The spectrum display and equalizer suggestions will accurately calculate and display the difference between all kinds of audio and equipment.
AAMS is a great testing tool for all kinds of audio.
Comparing differences between source and reference is about 100% effective and a clear way to view audio behavior between source and reference.
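The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the AAMS analyzer itself: the function names, the test tones and the chosen frequencies are all assumptions. It measures the level of a clean (source) and a processed (reference) signal at a few frequencies and reports the difference in dB.

```python
import math

def band_level_db(samples, sample_rate, freq_hz):
    """Level (dB relative to amplitude 1.0) of one frequency, via a single-bin DFT."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(samples))
    amplitude = 2 * math.hypot(re, im) / n
    return 20 * math.log10(max(amplitude, 1e-12))  # floor avoids log(0)

def spectrum_difference_db(source, reference, sample_rate, freqs_hz):
    """Reference level minus source level, per frequency, in dB."""
    return {f: band_level_db(reference, sample_rate, f)
               - band_level_db(source, sample_rate, f)
            for f in freqs_hz}

def tone_mix(amplitudes, sample_rate=8000, count=800):
    """Hypothetical test signal: a sum of sine tones given as {freq_hz: amplitude}."""
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate)
                for f, a in amplitudes.items())
            for i in range(count)]

# "Clean" material, and the same material after an imaginary effect
# that doubles the level of the 1 kHz tone only.
clean = tone_mix({100: 0.5, 1000: 0.5})
processed = tone_mix({100: 0.5, 1000: 1.0})

diff = spectrum_difference_db(clean, processed, 8000, [100, 1000])
for freq, delta in diff.items():
    print(f"{freq} Hz: {delta:+.2f} dB")  # roughly 0 dB at 100 Hz, about +6 dB at 1000 Hz
```

In a real test the two sample lists would come from the same audio material recorded without and with the effect.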
Can the Equalization preset (EQ-Preset) be changed ?
The output for equalization presets can be changed into any configurable equalizer settings.
So if you have an equalizer or plugin with non-standard equalizer settings or parameters AAMS can be configured to match.
When you create a new equalizer preset you can first set the total number of equalization bands.
Then you will be asked to fill in each equalizer band. Just copy the parameters from the equalizer you want to use.
You can choose any frequency for each band between 5 Hz and 20 kHz in 0.1 Hz steps.
If you only want to equalize the bottom end of your audio material, you can set up frequencies from 5 Hz to 120 Hz in steps of 0.1 Hz, up to a total of 99 bands.
This is a feature not found on any other equalizer or plugin.
AAMS also comes with useful equalizer presets, so there is no need to program a 'bottom end' equalizer with 50 or 100 bands.
Just load a suitable equalizer preset from the AAMS EQ database.
Users can load or save their own presets and edit all existing presets.
EQ presets can be shared with other users and are released with every new version of AAMS.
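To give an idea of what such a band layout looks like, here is a small Python sketch that generates centre frequencies for a 'bottom end' preset. It is not the AAMS preset format: the function name, the logarithmic spacing and the treatment of 99 as the upper limit for the band count are assumptions made for illustration. Only the 5 Hz to 20 kHz range and the 0.1 Hz step come from the text above.

```python
def eq_band_frequencies(low_hz, high_hz, bands, step_hz=0.1):
    """Log-spaced band centres, snapped to the step_hz grid (assumed layout)."""
    if not 1 <= bands <= 99:
        raise ValueError("bands must be between 1 and 99")
    if not 5.0 <= low_hz < high_hz <= 20000.0:
        raise ValueError("frequencies must lie between 5 Hz and 20 kHz")
    if bands == 1:
        return [low_hz]
    ratio = (high_hz / low_hz) ** (1.0 / (bands - 1))
    raw = [low_hz * ratio ** i for i in range(bands)]
    # Snap to the grid; the final round(..., 1) matches the 0.1 Hz step only.
    return [round(round(f / step_hz) * step_hz, 1) for f in raw]

bottom_end = eq_band_frequencies(5.0, 120.0, 99)
print(len(bottom_end), bottom_end[0], bottom_end[-1])
```

The resulting list can be copied band by band into a new equalizer preset.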
The equalizer inside AAMS is a useful tool to have fun with equalization and allows you to try professional techniques that can’t be done with other equalizer or mastering systems.
The main point is that AAMS is highly configurable in many ways; even if you are an experienced engineer, there are plenty of new ways to do things.
If you are inexperienced with analyzing, equalization, Multi-Band compression or mastering, AAMS can carry out these tasks for you.
The main aim of AAMS is to automatically carry out audio tasks that would have had to be done manually in the past.
Load a preset and AAMS will use this automatically until you change it.
If you close AAMS and start it up the next day, your settings are saved and loaded back.
And there is the option to save full data. So you can keep track of your projects and save all data.
How accurately does AAMS master ?
It’s always difficult to say how audio material must sound, with some listeners preferring more high frequencies and some more bass frequencies.
Because AAMS matches a source and a reference, the sound relies on the chosen Reference.
That is why AAMS includes a database of 200 pre-made styles to help you define your sound.
AAMS is unsurpassed as the best matching tool available today.
In every aspect AAMS will effectively work with greater than 97% accuracy.
AAMS corrections are calculated at between 0.1-0.3 dB RMS difference at any given frequency.
In fact, when the Q-factor is changed to 1000 or higher, the overall effectiveness reaches an almost perfect 0.1 dB, almost 100% effective.
The only reason why AAMS is not 100% effective is that the effectiveness value is rounded off.
In calculating frequencies and differences the AAMS calculation code is 100% effective,
but it is hindered by rounding errors when values are rounded off to a certain number of decimal places.
Such rounding is inherent to finite-precision arithmetic on computers and cannot be avoided entirely.
This means in theory AAMS coding is 100% correct about calculating differences between source and reference.
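The rounding effect can be seen in any programming language. The short Python example below is a general illustration and has nothing to do with the AAMS code itself; the variable names are made up. It shows that a value stored in floating point can differ very slightly from the decimal value you expect, and that rounding it for display hides that tiny error.

```python
# 0.1 and 0.2 cannot be stored exactly in binary floating point,
# so their sum is very slightly larger than 0.3.
measured_difference_db = 0.1 + 0.2
displayed_db = round(measured_difference_db, 1)  # rounded to one decimal for display

print(measured_difference_db == 0.3)  # False
print(displayed_db == 0.3)            # True
```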
The figures are found in the Multi-Bands Display grid called 'Leff' and 'Reff'.
For each band the effectiveness in dB is shown in the grid.
This means overall equalization displaying and calculations are near to 100% perfect.
You can easily see this is the case in the Multi-Band Graphic Display when the overall frequency range seems to be nearly balanced at the 0dB line.
The Multi-Band Spectrum is horizontal and not angled like the Spectrum Display.
In transferring differences in equalization spectrums from source to reference, AAMS is a very accurate matching tool.
You can also do some nice comparisons with AAMS, as it’s all about sources, references and comparing the differences.
Can I change the Options and Q-factor ?
When you first start using AAMS it is recommended that you don't change options.
It's best to read and understand more about how AAMS works with a couple of mixes or masters before changing options.
When you fiddle around with the Smooth-Q factors you should really know what you are doing.
For example, some users have used the EQ-Preset for parametric EQ when the EQ-Preset is really meant for Graphic EQ.
Some users have tried the Waves Q10 plugin and made an Equalizer preset for it, but this won’t work if you don't know how.
It's better to use the EQ-Final on parametric EQ's and the EQ-preset on Graphic EQ's.
Another thing to avoid is turning off the Automatic Adjust Deviation in the options.
This means that the EQ settings will jump and AAMS will not consider Gain Adjustments any more.
This would mean that your Equalizer Setup and Compression setup must adjust for any deviation in Gain when that is meant for Loudness.
The best thing to do is leave the Q-factor and Options alone until you know what you’re doing.
What should I do when AAMS is buggy ?
When AAMS is Analyzing and the red 'Please Wait' sign is on it is best not to use your computer for other tasks.
AAMS is doing some heavy calculations and it's recommended that you leave AAMS and your computer alone while processing takes place.
This is especially the case if you have been experiencing problems when using AAMS.
I think I found a bug!
Before writing to us about a problem, please check our web page to see if there might be a newer version of the software (we may have already fixed the bug).
If you still need to write to us about a bug or some problem, please include the following information:
•the exact steps of what you were doing (or trying to do) with our software, what you were expecting to happen, and what actually happened
•the name and version of the operating system that you're using
•the type of computer that you're using (manufacturer, CPU type, etc.)
•the name and version of the host software.
•the version of our software that you're using.
•the format of our software that you are using.
Also, if you have experienced a crash with our software, crash logs can often be very helpful, so please send the crash log to us if you can.
Can you give a few good reasons for buying AAMS ?
0. AAMS is very easy to use, once you know how AAMS works.
1. AAMS Auto Audio Mastering System is a self-contained package that will master your music automatically.
The automatic mastering functions of AAMS will result in a completely mastered audio file with the help of internal AAMS Analyzer and DSP-Processing functions.
It is possible to master a single audio file or even multiple audio files on the fly.
Also it is possible to do semi-automatic mastering, making the user a part of the mastering process.
The aim of AAMS is to do mastering automatically with the best sound possible, making your music sound good on all audio systems.
2. AAMS is a fully featured Analyzer specially designed for automatic audio mastering, resulting in a great sounding master.
AAMS Auto Audio Mastering System can be used on single tracks or mixes, used to re-master or master the overall sound on a full album.
As a matching system it is very effective, automatic and fast.
Save Time and get a great sound! AAMS Analyzer is specially built to do mastering and analyzing in the best way possible.
It may be slower than some other programs, but it is more accurate and serves as a highly detailed spectrum Analyzer.
3. AAMS system allows you to do mastering in a fast and accurate manner.
The ease of use will be great if you are looking for a fast way to do Equalization, Multi-Band Compression and Loudness Maximizing,
making your audio sound as good and loud as commercial releases.
The main feature of AAMS is to provide fast, easy and accurate mastering.
It requires minimal user input but creates maximum quality output.
4. AAMS allows the user to use their own equipment in the form of hardware or software, although AAMS can do this automatically and internally.
It allows users to check, view and correct audio using their own equipment and progressively learn to use it better.
Any hardware equipment or DirectX/VST/Rtas/AudioUnit software plugin can be used with a sequencer or audio editor alongside AAMS.
It’s the perfect studio companion for you!
5. AAMS allows the user to see differences in the different suggestion displays.
AAMS automatically calculates the required changes with a minimal amount of time needed and no fiddling around.
Spending too much time listening to equalizers and fiddling with Multi-Band compression settings?
AAMS makes this mastering process automatic, without fiddling around or the need to have access to speaker systems or environments specially designed for mastering.
6. AAMS provides a database of over 200 musical preset styles. These Reference Styles are an accurate way to do fast and easy mastering based on numerous musical styles.
The user can also create new reference styles or modify those provided in the database at installation.
7. AAMS will match the Source to the Reference with near 100% accuracy, especially where problem areas exist, like the 'bottom end' or ‘highs’ that cannot be heard on most common speaker systems, or any frequency that seems out of place.
AAMS calculates the whole frequency spectrum from 5 Hz to 22.5 kHz and aims for full mastering sound compensation.
The lows, Mids and highs are cleared of any annoyances; the whole frequency spectrum is flattened out and gets a much better performance on most common speaker systems. Make your audio sound best on all musical systems!
AAMS is the best tool for mastering a great overall sound, accurately and in the fastest time. Whenever you need to make your sound better, AAMS is the most accurate choice, with ease of use and spot on performance.
8. For users who are inexperienced with mastering and who would like to concentrate more on the mix, instead of being busy with mastering, AAMS provides a fast, accurate way to master.
The more experienced users will also get more information from AAMS that they were looking for as AAMS is very detailed.
AAMS presents every aspect of the mastering process and keeps user input as simple as possible.
But AAMS does more than aid mastering, as it includes a lot of new features not found in any other mastering systems.
AAMS accurately calculates differences between Source and Reference audio material when testing equipment behavior.
So when working with AAMS you might find out that it lets you do much more than mastering alone, for example, getting the sound right on single tracks within your mix. Anyone who needs to understand the basics about equalizing, Multi-Band compression and changing volume levels will learn from this innovative application. When using AAMS you will learn more about mixing and mastering as you progress using the information AAMS is providing.
9. Whenever you need a fast and accurate mastering matching system, whenever you need to analyze your audio material, whenever you need ease of use without fiddling around and spending time on listening, whenever you need to be freed of the horrors of mastering and concentrate on mixing, whether you are experienced or not, whenever you need to work with your own equipment or software sequencers and plugins,
AAMS is a new innovative way to get you there fast, accurately and automatically.
Let AAMS do the work for you!
Can we put your software on a website or include it with our magazine ?
The answer to that question is usually yes, with the following stipulations:
•send us 1 complimentary copy of that issue of your magazine.
•include versions for all supported Windows platforms
•make sure that you go to our web page at the last minute and get the most recent versions of our stuff
•follow the redistribution terms of the software's license
•if possible, include some piece about us in your magazine (review, news, interview)
•contact us first (we will probably say "yes" if you agree to these stipulations, but please still contact us first)
•Only AAMS Freeware / Shareware version can be distributed (as can be downloaded on our site).
•You cannot promise users a registration for free.
What do I do with the files that I downloaded ?
The files at this web page are compressed archives.
This means that several files are wrapped up into one single file (archive),
which is shrunk down in size (compressed) so that it doesn't take as long to download.
You need to decompress the archives before you do anything else.
If the files don't automatically decompress (or "expand") when you download them, and if double clicking on them after downloading doesn't work either,
then that probably means that you need to install software that can decompress these archives.
We use ZIP (files ending in .zip).
If you have successfully decompressed the archives, then read our installation instructions to learn how to install our software.
Will you make 64-bit versions of your software ?
We are currently working on improving the 32-bit version.
We will work on 64-bit versions of our software later.
We do not have a timeline for any of this at the moment.
Basically AAMS V3.0 works well on both 32-bit and 64-bit Windows.
So there is actually no need to make two versions of AAMS right now.
Can I master mix tracks (STEMS) in AAMS ?
So I'm sure a lot of people have groaned in frustration while trying to master stems in AAMS.
You'll also find that if you use the automatic side of the software it will gain each track separately based on an individual analyzer for each one which you DO NOT want.
So I found a solution.
You're not going to like the solution because it's annoying.
Step 1- create an analyzer file for the mixdown (not master) of the whole song (not an individual stem). If you don't have an audio file for the whole song with every track on it, make one.
Step 2- go to your settings and change it from automatic to semi-automatic
Step 3- go to the source tab and load that analyzer file I told you to make as the source
Step 4- go to the reference tab and load whatever rms preset you're using (I used the movie theme one)
...okay now AAMS has loaded the suggestions for the whole song not just the stem
Step 5- go to the dsp-eq tab, load the audio file, copy the suggestions (there's just a button you click), then hit the red record button..
...it will save an audio file that looks like this "yourtrackname_eq.wav"
Step 6- go to the dsp-compressor tab, load the "yourtrackname_eq.wav" audio file, copy the suggestions, then click the record button
...it will save an audio file like this "yourtrackname_eq_c.wav"
Step 7- go to the dsp-loudness tab, load the "yourtrackname_eq_c.wav" audio file, copy the gain suggestions AND the balance suggestions (two buttons), then hit the "auto record dsp loudness" button instead of the record button (it will do both the gain and balance in one shot this way)
...it will save an audio file like this "yourtrackname_eq_c_b_l.wav". This is the final mastered stem.
There your first stem is done!
Now repeat the process for each one!
notes: I don't think you actually have to keep copying the suggestions..
I believe AAMS does that automatically on semi-automatic. In fact I believe that the suggestions that you copy are based on the audio file, not the reference, but either way it automatically changes the suggestions to the reference when you hit the record button, so it doesn't really matter.
one more note: I haven't completely finished my stems yet so I can't be completely sure I won't still run into yet more problems,
but the first 8 stems that I got done so far sounded perfect compared to the ones that failed trying to do it the automatic way the first time so I think this will work ;)
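If you have a lot of stems, it is easy to lose track of which ones have gone through every step. The Python sketch below is a small helper, not part of AAMS, that lists the files each stem should end up with, using the file names quoted in the steps above. The function names and the idea of checking a folder are assumptions for illustration.

```python
from pathlib import Path

# Suffixes taken from the file names quoted in steps 5 to 7 above.
STAGE_SUFFIXES = ("_eq", "_eq_c", "_eq_c_b_l")

def expected_stage_files(track_file):
    """File names a stem should produce after the EQ, compressor and loudness steps."""
    base = Path(track_file).stem
    return [f"{base}{suffix}.wav" for suffix in STAGE_SUFFIXES]

def missing_stage_files(track_file, folder):
    """Stage files that are not (yet) present in the given folder."""
    folder = Path(folder)
    return [name for name in expected_stage_files(track_file)
            if not (folder / name).exists()]

print(expected_stage_files("yourtrackname.wav"))
```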
How to master your own tracks with your favorite tools using AAMS ?
1. Cut it
Mixdown at -6db so there's plenty of headroom to work with.
Always work in an uncompressed format like 44.1 kHz / 16-bit WAV or AIFF.
Higher bit depths and sampling frequencies are nice but they'll be a pain in the bum later on.
2. Analyze it
Run it through AAMS against the master.ams profile to do a Frequency Comparison Analysis (it'll also give you some compressor settings that might be handy, or not), save recommendations, DON'T PROCESS.
You can also do a test run and normalize it to -5db to see how the clips and RMS check in with a Cool Edit Pro analysis (don't save the hot signal though).
3. EQ it (if you must)
If the mix is way off in AAMS do a preliminary EQ using whatever you're comfortable using. Most wave editors have a few options. AAMS can be configured to give you recommendations for just about any numbers of bands for Graphic EQ.
If you're lazy let AAMS EQ it for you on 100 bands. EQ'ing twice can be destructive..
So (if it's a weak mix) you may want to start a second set of masters one with Pre-EQ and one without as the frequency response is likely to change during mastering.
4. Trim it
You can lowpass filter the mix at 20khz to open up even more headroom and squeeze every last bit of volume into the final mix.
You could also Highpass filter it at 20Hz, which might work depending on the mix, but you really need sub monitors to feel how it turns out.
If you don't have the tools don't mess with the subs.
5. Excite it
Tons of options here.. Currently I'm using oZone 3's 4 Band Dance Master preset plus DC Offset Correction & 24 bit dither..
this does way more than it should be at this step..
in theory you just want to wet it a little and tie the mix together with a gentle reverb and/or expand the stereo acoustics.
For me oZone also runs some compression during this step which changes the frequency spread but you can configure it not to..
my preferences (and final sound) often change from project to project.
6. Check it
Run it through AAMS to see how the dynamics have shifted. Check up on the RMS, then zoom in real close so you can see the waveform scroll by while it's playing and make sure the signal is looking clean. If you don't know what you're looking for watch some tracks that sound good and eventually it'll click. Now's also a good time to EQ it if you didn't mess with it before and it's just a little bit off... if it's still spot on count your blessings and prepare to squish the h3ll out of it.
7. Compress it
I swear by Nomad Factory's E-3B Multiband Compressor on "Mastering Class A" (4:1 RM/Peal/RMS). It's old, probably outdated, won't even load properly on Vista in a DAW- but it works like magic.
Feel free to explore and find one that "sounds" right to you, or just take my word for it and beg borrow or steal to get your hands on the E-3B.
Anyways this thing will buff up a clean signal into the makings of a commercial quality powerhouse. It's definitely not transparent, but in 9 out of 10 cases it's going to sound a helluva lot better once it's had its way with it.
8. Normalize it
Normalize by peak (Not RMS!) to -.5db , all of a sudden all the work should be paying off and it'll be a massive speaker shaking stormer. You want to get a signal with an RMS in the range of -12db to -6db. But don't normalize by RMS to buff up the signal as it'll tear chunks out of your dynamics.
For me what I end up with will tend to be based on how well I've created the track.
If I hit -10db RMS at -.5db peak I'm right happy.
That extra half decibel of head room is there to compensate for the strange peak variations that show up when you compress the track to a lossy format like mp3 or wma.
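For anyone who wants to see the difference between peak and RMS in numbers, here is a small Python sketch of peak normalization to -0.5 dB. It is a generic illustration, not what AAMS or any particular wave editor does internally. The function names are made up, samples are assumed to be floats with full scale at 1.0 and the input must not be silent. Note that some editors report RMS against a different reference, so their readings can differ by about 3 dB.

```python
import math

def peak_db(samples):
    """Highest absolute sample value in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def normalize_peak(samples, target_db=-0.5):
    """Scale the whole signal so its highest peak lands on target_db."""
    gain = 10 ** (target_db / 20) / max(abs(s) for s in samples)
    return [s * gain for s in samples]

# One second of a quiet 1 kHz sine at 8 kHz sample rate as stand-in material.
quiet_sine = [0.25 * math.sin(2 * math.pi * 1000 * i / 8000) for i in range(8000)]
mastered = normalize_peak(quiet_sine)

print(f"peak {peak_db(mastered):.2f} dB, RMS {rms_db(mastered):.2f} dB")
```

Peak normalization applies one gain to everything, so the dynamics stay as they were; a pure sine ends up with its RMS about 3 dB below its peak.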
9. Check it
AAMS and E-3B won't agree. It's close, but AAMS would like to castrate the track if you gave it a chance. It's worth taking a look see and it'll probably always have just about the same discrepancy in the levels at this point. Use the force and trust your compressor of choice; in essence that will be your signature "sound".
Now what you should have is some serious "product" ready for distribution, compilation or even vinyl if you've watched your phase (long story, different thread).
BUT we could take it a step further.. sometimes you might fall short of -12db RMS or just need a killer VIP HOTMIX to devastate the PA , this is where the Maximizer comes in handy.
10. MAXIMIZE IT
(CAUTION!) This step can and will most likely blow your mix dynamics three sheets to the wind,
if you're presenting the mix to a label I would rather go back to square one and fix the mixdown rather than depend on the maximizer. But when you absolutely positively have to rock everything in the room the Nomad's E-3B Maximizer is there to turn it out.
Every preset on that thing is way too hot for a final master. I've toyed with using it instead of a compressor but it demolishes the target frequency spectrum. Still, for our personal arsenal or netcast there's a relatively quick workaround.
The "CD Master" setting is killer.. but saying it's overly aggressive doesn't even begin to describe the mayhem. So what you want to do is cut the threshold in half from -6db to -3db , nerf the output boost from 3db to 1.5db and fiddle with the attack from 42ish to 30.5 ms (changing the attack doesn't really matter, it's just so we played with most the sliders, it's a fast attack/release either way).
And BOOM.. instant ultra RMS sonic powerdriver without a single clip that looks like it's been tortured by the meanest hard limiter on the block. Actual mileage may vary depending on the track and genre but it's a pretty safe bet in the electronic arena.
Referring to the License Agreement AAMS is Limited Freeware.
When you do not use AAMS any more please remove AAMS.
Use AAMS Installer to uninstall.
Or remove the program directory c:\program files\AAMS (or c:\program files (x86)\AAMS).
Remove all Icons on the desktop and start menu.
AAMS Test on Deep Neural Networks for Dynamic Range
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/296704118
Deep Neural Networks for Dynamic Range
Compression in Mastering Applications
Conference Paper · June 2016
All content following this page was uploaded by Stylianos Ioannis Mimilakis on 14 February 2017.
Deep Neural Networks for Dynamic Range Compression in Mastering Applications
Stylianos Ioannis Mimilakis1, Konstantinos Drossos2, Tuomas Virtanen2, and Gerald Schuller1
1Fraunhofer IDMT, Ilmenau, Germany
2Audio Research Group, Dept. of Signal Processing, Tampere University of Technology, Tampere, Finland
The process of audio mastering often, if not always, includes various audio signal processing techniques such as
frequency equalization and dynamic range compression. With respect to the genre and style of the audio content,
the parameters of these techniques are controlled by a mastering engineer, in order to process the original audio
material. This operation relies on musical and perceptually pleasing facets of the perceived acoustic characteristics,
transmitted from the audio material under the mastering process. Modelling such dynamic operations, which
involve adaptation regarding the audio content, becomes vital in automated applications since it significantly
affects the overall performance. In this work we present a system capable of modelling such behavior focusing
on the automatic dynamic range compression. It predicts frequency coefficients which allow the dynamic range
compression, via a trained deep neural network, and applies them to unmastered audio signal served as input. Both
dynamic range compression and the prediction of the corresponding frequency coefficients take place inside the
time-frequency domain, using magnitude spectra acquired from a critical band filter bank, similar to human’s
peripheral auditory system. Results from conducted listening tests, incorporating professional music producers
and audio mastering engineers, demonstrate on average an equivalent performance compared to professionally
mastered audio content. Improvements were also observed, when compared to relevant and commercial software.
Audio production often includes a final stage of process
which is placed just before the stage of replication
and commercial distribution of the audio material. It is
entitled mastering and involves a series of audio signal
processing algorithms, aiming to provide an overall audio
enhancement in order to link the professional audio
with the hi-fidelity / home-entertainment industries.
Mastering consists of two main signal processing methods:
i) equalization of the frequency content, and ii)
dynamic range control. These two operations require
a considerable amount of parameters that have to be
defined and controlled, in order to process the audio
signals. Main ambition of this processing is to aesthetically
enhance perceived acoustic characteristics of the
signals. The selection and the adjustment of these
parameters relies solely on a continuous interaction between
the audio / mastering engineer and the apparatus
that handles the audio signals.
During the above interaction takes place an acoustic
monitoring of the processed audio, driven through the
mastering apparatus. The aforementioned parameters
are adjusted until convergence to the desired result,
based on auditory feedback and a set of subjective criteria,
which are dependent on musical facets of the audio
corpus. On one hand this fact imposes an extensive
human effort but, on the other, it is the essence of a
successful procedure. Consequently, these criteria have
been proved to be substantial in audio production 
and especially in the design of intelligent systems that
automatically perform various tasks in different stages
of audio and music production, i.e. audio mixing or
There have been published works concerned with providing
automated solutions to the above-mentioned time-consuming
routines. They aim to unveil a correlation
between various audio signal features contained
inside the original audio material and the one processed
by the engineer [3, 5]. In most cases though, the
focus is in automated processes of audio mixing where
observations of the independent channels and the target
mixture signals are available [3, 4, 5].
For automated procedures in audio mastering, where
only the original (unmastered) and processed (mastered)
audio mixtures are available, only two approaches
exist. The first tries to exploit statistical properties of
the tracked fundamental frequency of the audio content,
in order to derive a set of frequency bands that will be
enhanced. In that case the fundamental frequency
was extracted from the time-domain representation of
the unmastered audio signals. The extracted information
was then used to compute histograms and the most
prominent observations of frequencies were served as
information to second order peaking-type filters, boosting
these particular frequency regions.
The second focuses on statistical properties of audio
signals which are used to control parameters for dynamic
range compression. In more detail, it takes into
account that dynamic range compression significantly
modifies the probability density function (PDF) of the
root mean square energy of the audio signal. Thus,
by minimizing the difference of the PDFs between the
mastered and unmastered audio signals, in short time
frames, parameters for the dynamic range control can
be acquired.
These two approaches can be understood as an operation
of simulating the process of audio mastering by
a recording or audio mastering engineer. It is not trivial
to define a feature space which will model such
complex and adaptive operations. Neither fundamental
frequency nor basic statistical properties could sufficiently
yield enough information for complex modelling
purposes, especially when the prior knowledge
of the audio corpus is limited, i.e. the observed two
channel mixtures before and after the processing.
A solution to the imposed difficulty from the limited
knowledge of the feature space could be given by factorization
techniques and especially non-negative matrix
factorization (NMF). Its application to observed
mixtures of audio magnitude spectral representations
can provide decompositions of the individual components
consisted inside the mixture. In addition to
this, the signal representation obtained by NMF also
allows various implementations of audio signal processing
techniques. Nevertheless, in the case of audio
mastering, where much dynamic range compression
and gain processes are usually applied, different probability
distributions should be assumed in the NMF
model, resulting into a much more complicated model
Deep neural networks (DNNs) seem to offer a straightforward
method that encompasses the benefits from the
above techniques [11, 12]. Especially with their capabilities
in learning non-linear mappings from low-level
features to high-level ones. More specifically, a
fundamental architecture of DNNs entitled auto encoders
is capable of establishing various associations of
the presented data in an unsupervised fashion similarly
to NMF, while these auto-associated representations
can be served as features that provide predictions or
solutions to a specific problem.
In this work we try to expand the existing technologies
for automated mastering process by proposing a novel
system for off-line automated dynamic range compression.
Our system is based on a DNN formed by two
pre-trained fully connected auto encoders. In particular,
we try to map low-level, magnitude features to dynamic
range compression factors, in such a way that it
simulates the aesthetics of dynamic range processing
in audio mastering. This mapping is performed by a
trained DNN and is later used to compute gain factors
that modify the input magnitude spectra.
The rest of this paper is organized as follows. Section II
gives a detailed overview of the proposed system. Section
3 describes the experimental procedure followed
for training the DNN. Obtained results are presented
and discussed in Section 4. Section 5 concludes the
paper and proposes possible future directions of research.
II Proposed System
The proposed system consists of two components. The
first one is responsible for spectral analysis and synthesis
of the input audio signal while the second is responsible
for the prediction and utilization of necessary
factors that will be used to transform the original spectra.
The stages of analysis and synthesis consist of the short-time
Fourier transform (STFT) and its inverse
(ISTFT), followed by an overlap-and-add operation.
From the output of the analysis stage, the magnitude
information is given to the second component, while
the phase is kept for the re-synthesis stage.
The second component takes the magnitude
information, warps its linear frequency resolution with a
filter bank and then drives it through a DNN, yielding
exponent coefficients, which are then used to transform
the warped spectra. Both the transformed and the unprocessed
warped spectra are interpolated back to their original
linear-scale resolution, with their ratio providing
estimates of the gain factors.
Finally, the gain factors are used to transform the complex
spectra captured by the first component, and the
time-domain synthesis of the corresponding
signal follows. An illustration of the proposed system
is given in Figure 1.
The detailed processing in the proposed system operates
as follows. A time-domain signal x(n), with discrete
samples n, is transformed into a two-dimensional time-frequency
representation X[m, k], under the assumption
that x(n) is stationary inside each of the m short-time frames and
independent over the k sub-band channels / frequency bins.
To do so, an STFT is used by evaluating Equation 1 for
0 ≤ k < N:
$X[m, k] = \sum_{n=0}^{N-1} x(n + mR)\, w(n)\, e^{-j 2 \pi k n / N}$ (1)
Fig. 1: System Overview.
In the above equation, N denotes the number of samples
for the discrete Fourier transform (DFT), R
the analysis step size and w(n) a Hamming window
function. The resulting representation has a linear frequency
resolution. Since our interest is to investigate and
model a perceptual process, the magnitude
information |X[m, k]| is warped to a non-linearly
scaled frequency resolution, denoted as Xw. This scaled
frequency resolution includes information on critical
frequency bands, similar to the human peripheral auditory
system.
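The analysis and synthesis stage can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard STFT form, takes the window, DFT and step sizes from Table 1, and normalizes the overlap-add by the summed window, which the text does not specify. The function names `stft` and `istft` are hypothetical.

```python
import numpy as np

def stft(x, n_fft=4096, win_len=2049, hop=1025):
    """Complex STFT frames X[m, k], 0 <= k < n_fft (frames along axis 0)."""
    if len(x) < win_len:
        raise ValueError("signal must be at least one window long")
    w = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    X = np.empty((n_frames, n_fft), dtype=complex)
    for m in range(n_frames):
        frame = x[m * hop : m * hop + win_len] * w
        X[m] = np.fft.fft(frame, n=n_fft)  # zero-padded to the DFT size
    return X

def istft(X, win_len=2049, hop=1025):
    """Inverse STFT with overlap-add, normalized by the summed window."""
    w = np.hamming(win_len)
    n_frames = X.shape[0]
    out = np.zeros((n_frames - 1) * hop + win_len)
    norm = np.zeros_like(out)
    for m in range(n_frames):
        frame = np.fft.ifft(X[m]).real[:win_len]  # windowed time frame
        out[m * hop : m * hop + win_len] += frame
        norm[m * hop : m * hop + win_len] += w
    return out / norm  # the Hamming window is strictly positive

# Magnitude goes to the prediction stage, phase is kept for re-synthesis.
x = np.random.default_rng(0).standard_normal(20000)
X = stft(x)
magnitude, phase = np.abs(X), np.angle(X)
y = istft(magnitude * np.exp(1j * phase))
```

Without any gain applied, the round trip returns the input over the span covered by complete frames.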
It has to be noted that we are concerned with an offline
process, thus Xw is a matrix containing all time frames
m over the warped sub-bands c, derived from the input
audio signal. The warping procedure is performed
in two steps: i) compute triangular frequency
responses for each sub-band of the linear frequency
resolution, which form a matrix W, and ii) perform a matrix
multiplication between these basis functions and the
magnitude spectra, defined as:
$X_w = W |X|.$ (2)
The dimensions of matrix W are C × M, with C and
M being the total number of sub-bands and short-time
frames, respectively. The center frequencies and bandwidths
employed for the basis functions, following
the human peripheral auditory system, are
$b_c = 0.108 f_c + 24.7$ Hz, (3)
$f_c = 229 \left[ 10^{(a_1 c + a_0)/21.4} - 1 \right],$ (4)
where c is an integer denoting the sub-band index,
c = 0, 1, 2, ..., C − 1, and a0 = 1.5 and a1 = 0.79 are constants
that determine the center frequency of the lowest
band and the band density in critical bandwidth units,
respectively.
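To make the warping step concrete, the sketch below evaluates Equations (3) and (4) for C = 52 bands and builds a triangular filter matrix. Several choices are assumptions made only for illustration: W is laid out with one row per sub-band and one column per one-sided DFT bin so that the product in Equation (2) is defined, the sampling rate is taken as 44.1 kHz, and each triangle reaches zero at a distance b_c from its center. The function names are hypothetical.

```python
import numpy as np

def erb_centers(C=52, a0=1.5, a1=0.79):
    """Center frequencies f_c (Eq. 4) and bandwidths b_c (Eq. 3) in Hz."""
    c = np.arange(C)
    fc = 229.0 * (10.0 ** ((a1 * c + a0) / 21.4) - 1.0)
    bc = 0.108 * fc + 24.7
    return fc, bc

def triangular_filterbank(fc, bc, n_fft=4096, fs=44100.0):
    """One unit-height triangle per band, zero at fc - bc and fc + bc (assumed)."""
    freqs = np.arange(n_fft // 2 + 1) * fs / n_fft  # one-sided bin frequencies
    dist = np.abs(freqs[None, :] - fc[:, None])
    return np.maximum(0.0, 1.0 - dist / bc[:, None])

fc, bc = erb_centers()
W = triangular_filterbank(fc, bc)

# Placeholder magnitude spectra |X| with shape (bins, frames).
mag = np.abs(np.random.default_rng(0).standard_normal((2049, 10)))
Xw = W @ mag  # Eq. (2): warped spectra, one row per sub-band
```

With these constants the lowest band sits near 40 Hz and the highest near 20.3 kHz, so the 52 bands span roughly the audible range.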
Then Xw is used as input to the trained DNN, which
outputs estimates of the exponent factor R̂. The latter
is utilized in the next stage for transforming Xw. More
specifically, the estimates are obtained by simply
feed-forwarding the warped spectra, leading to a series
of matrix-vector multiplications defined as:
$h^{l}_{ij} = g(X_w W^{l}_{ij} + b^{l}),$ (5)
$\hat{R}_{ij} = g(h^{l} W^{L}_{ij} + b^{L}),$ (6)
where l is the index of the corresponding layer of the
network (l = 1, ..., L), g is an activation function, which
in this work is the rectified linear unit (ReLU), and W^l and
b^l are the weights and biases of each layer l, respectively.
The index i corresponds to a vector containing
short-time frames, matching the input dimensions of
the DNN, and j to the dimensions of the hidden layer
representation h^l.
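The feed-forward pass described above amounts to a chain of affine maps, each followed by the ReLU. The following sketch assumes seven fully connected layers of width 260 and uses random placeholder weights in place of trained parameters; applying the ReLU at the output layer as well is an assumption. `feed_forward` is a hypothetical name.

```python
import numpy as np

def relu(z):
    """Rectified linear activation g."""
    return np.maximum(z, 0.0)

def feed_forward(x, weights, biases):
    """Propagate the rows of x through every layer: h = g(h W + b)."""
    h = x
    for W, b in zip(weights, biases):
        h = relu(h @ W + b)
    return h

rng = np.random.default_rng(0)
sizes = [260] * 8  # input plus seven layers, all of width 260 (assumed)
weights = [rng.uniform(-0.05, 0.05, (n_in, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

x = rng.random((20, 260))  # 20 rows of stacked warped spectra (placeholder)
R_hat = feed_forward(x, weights, biases)
```

The output has the same shape as the input, matching the statement that the predicted coefficients share the dimensions of Xw.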
The predicted coefficients R̂ are in a matrix form of
the same dimensions as Xw. The transformation
is then performed by raising all the elements of Xw to the
power of R̂, yielding Yw. For computing the gain factors G, both
warped spectra Xw and Yw must be transferred back to the
original linear scale. This can be performed using
Gain factors G can now be computed by the elementwise
division of the above quantities, leading to:
G = fs(
where f_s is a bounding sigmoid function, which
ensures distortion-free reconstruction, defined as:
− 1, for b = 2. (8)
Finally, an element-wise multiplication between the
computed gains and the original complex spectra is
performed, followed by the ISTFT and overlap-add synthesis
procedure. In the case of multichannel audio input,
the prediction is performed using the magnitude spectra
averaged over the channels, while the gain
is applied to all input channels.
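The gain stage can be sketched as below. Two parts of the source are incomplete, so the sketch fills them with labeled assumptions: the interpolation back to the linear scale is done with plain linear interpolation (`np.interp`), and the bounding sigmoid is taken to be the logistic form b / (1 + e^(-x)) - 1, which is only a guess consistent with the surviving fragment "− 1, for b = 2". All function names and the placeholder frequencies are hypothetical.

```python
import numpy as np

def bounding_sigmoid(x, b=2.0):
    """ASSUMED form of f_s: b / (1 + exp(-x)) - 1, bounded below 1 for b = 2."""
    return b / (1.0 + np.exp(-x)) - 1.0

def gains_from_exponents(Xw, R_hat, fc, freqs, eps=1e-12):
    """Raise Xw to R_hat, map both spectra to linear bins, bound their ratio."""
    Yw = Xw ** R_hat  # transformed warped spectra
    n_frames = Xw.shape[1]
    # ASSUMPTION: linear interpolation from band centers to bin frequencies.
    X_lin = np.stack([np.interp(freqs, fc, Xw[:, m]) for m in range(n_frames)], axis=1)
    Y_lin = np.stack([np.interp(freqs, fc, Yw[:, m]) for m in range(n_frames)], axis=1)
    return bounding_sigmoid(Y_lin / (X_lin + eps))

def apply_gains(X_channels, G):
    """Apply one gain matrix (bins, frames) to every channel's complex spectra."""
    return X_channels * G[None, :, :]

rng = np.random.default_rng(0)
fc = np.linspace(50.0, 20000.0, 52)      # placeholder band centers in Hz
freqs = np.linspace(0.0, 22050.0, 2049)  # placeholder bin frequencies in Hz
X = rng.standard_normal((2, 2049, 4)) + 1j * rng.standard_normal((2, 2049, 4))
mag_avg = np.abs(X).mean(axis=0)         # magnitude averaged over the channels
Xw = rng.random((52, 4)) + 0.1           # placeholder warped spectra
R_hat = np.full((52, 4), 0.9)            # placeholder DNN output
G = gains_from_exponents(Xw, R_hat, fc, freqs)
Y = apply_gains(X, G)
```

Because the gains are real and positive, the phase of the original spectra is left untouched, which is what allows it to be reused in the re-synthesis stage.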
III Experimental Procedure
The experimental procedure is divided into two stages.
The first one is concerned with the training procedure
of the DNN, including training data preparation, network
topologies and the strategies followed in order
to perform the mapping from low-level acoustic features
to the factors R. The second stage consists of the
preparation of another audio corpus, containing files
processed by various operations, including professional
ones and commercial software.
III.a Training Procedure
The overall training process is performed in three steps.
The first two incorporate an unsupervised learning approach
and the third one, henceforth called fine-tuning,
is done in a supervised fashion. During the fine-tuning
step, the input and target functions are matrices of the
same dimensions that contain the warped spectra Xw
and true estimates R, respectively. These are given as
objectives to the DNN.
In order to acquire the target function we implemented
an iterative analysis of the training dataset, which
was acquired from an online source. The latter
contains both mastered and unmastered versions
of audio tracks from various genres. Thus, for each
version, i.e. mastered and unmastered, of all the audio
tracks we computed the warped spectra with the described methodology
(Xw for the unmastered and Yw for the mastered version).
Having analyzed pairs of unmastered
and mastered audio signals, their logarithmic ratio
$R = \log_{10}(Y_w) \, (\log_{10}(X_w))^{-1}$ can provide the dynamic
range factor for the corresponding frequency sub-bands.
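The logarithmic ratio is the exponent that maps the unmastered warped spectrum onto the mastered one, since Xw raised to the power R gives Yw element by element. A minimal sketch, with made-up numbers and a hypothetical function name:

```python
import numpy as np

def dynamic_range_factor(Xw, Yw):
    """Element-wise R = log10(Yw) / log10(Xw).

    Entries of Xw equal to exactly 0 or 1 would need special handling
    (log of zero, division by zero); that handling is not covered here.
    """
    return np.log10(Yw) / np.log10(Xw)

# Toy values only: build Yw from a known exponent and recover it.
Xw = np.array([[0.1, 0.5], [2.0, 4.0]])
R_true = np.array([[0.8, 1.2], [0.9, 1.1]])
Yw = Xw ** R_true
R = dynamic_range_factor(Xw, Yw)
```

For magnitudes above 1, an exponent below 1 shrinks the value, which is the compressive behaviour the factor is meant to capture.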
In practice, mapping Xw directly to the dynamic range factor
resulted in poor function fitting. In addition,
it was experimentally observed that time fluctuations
of the magnitude spectra also penalized the fitting
procedure in an undesired manner. To deal with the
mapping issue, two prior steps of unsupervised learning
relying on auto encoders were introduced. With this
technique the initial parameters of the DNN for the fine-tuning
stage can be learned, resulting in better
convergence to the desired result.
As for the time fluctuation, the matrices used in the
objective of each training instance were reshaped so
each column contained five short time frames of Xw.
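The reshaping can be illustrated as follows. The sketch assumes that consecutive, non-overlapping groups of five frames are concatenated and that leftover frames are dropped; the source does not state either detail. With C = 52 bands, each column then has 5 × 52 = 260 entries, which would agree with the 260-node layers of the training steps, although the text does not make that link explicit. `stack_frames` is a hypothetical name.

```python
import numpy as np

def stack_frames(Xw, n=5):
    """Concatenate each run of n consecutive frames (columns) into one column."""
    C, M = Xw.shape
    groups = M // n                                  # leftover frames are dropped
    blocks = Xw[:, :groups * n].reshape(C, groups, n)  # [band, group, frame]
    # Reorder to [group, frame, band], flatten frame and band, put groups in columns.
    return blocks.transpose(1, 2, 0).reshape(groups, n * C).T

Xw = np.arange(52 * 12, dtype=float).reshape(52, 12)  # 52 bands, 12 frames
S = stack_frames(Xw)
```

Column 0 of the result holds frames 0 to 4 one after another, column 1 holds frames 5 to 9, and the last two frames are discarded.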
The training procedure consists of the following steps:
1. Train a deep auto encoder, with four layers of 260
fully connected, ReLU, nodes using Xw as input
and target functions.
2. Train a deep auto encoder, with three layers of
260 fully connected nodes using R as input and
target functions. For the first two layers, the ReLU
activation function g is used. The number of nodes
of the hidden layer representation is equal to 350.
3. Construct a new DNN with seven layers in total,
using the same dimensions and activation functions
as above. Initialize this DNN with the pre-trained
parameters W^l and b^l, acquired from the
first two steps. Train this network with Xw as input
and R as target function.
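The initialization in step 3 can be pictured as chaining the layer parameters of the two auto encoders. The sketch below is only a structural illustration: freshly drawn random values stand in for the parameters that steps 1 and 2 would have learned, every layer is given width 260, and the 350-node hidden representation mentioned in step 2 is left out because the text does not say where it sits. `init_layers` is a hypothetical helper.

```python
import numpy as np

def init_layers(sizes, rng, scale=0.05):
    """Uniform pseudo-random (W, b) pairs for a chain of fully connected layers."""
    return [(rng.uniform(-scale, scale, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

rng = np.random.default_rng(0)

# Stand-ins for the parameters learned in steps 1 and 2 (not actually trained here).
ae_x = init_layers([260] * 5, rng)  # step 1: four layers, Xw -> Xw
ae_r = init_layers([260] * 4, rng)  # step 2: three layers, R -> R

# Step 3: the seven-layer network starts from copies of those parameters,
# so fine-tuning it does not alter the auto encoders themselves.
dnn = [(W.copy(), b.copy()) for W, b in ae_x + ae_r]
```

Fine-tuning would then update `dnn` with Xw as input and R as target.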
Each of the above training steps was performed
over 150 iterations, i.e. epochs, through the dataset,
while the parameter updates were performed in
small batches of 20 matrix rows i. For all the layers l
during the first two steps, a uniform distribution was selected
to pseudo-randomly initialize all the parameters.
The optimization criterion of the employed technique
was the mean squared error (MSE).
Finally, both auto encoders, i.e. the ones from steps 1
and 2, were trained using the dropout technique
with a probability of 0.3 for a neural unit to stop contributing
to the training at each epoch. The selection
of the aforementioned parameters and techniques was
based on informal experimentation and empirical observations.
A comprehensive overview of the parameters
used throughout the described procedure can be
found in Table 1.
Table 1: Employed system parameters.
Parameter                      Value
Window size (w(n))             2049 samples
DFT size (N)                   4096 samples
Step size (R)                  1025 samples
Number of critical bands (C)   52
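The batching, dropout and error criterion described above can be sketched with a few NumPy helpers. The optimizer itself is omitted. Rescaling the kept units by 1 / (1 - p) ("inverted dropout") and shuffling the rows before batching are assumptions, as are all function names.

```python
import numpy as np

def minibatches(X, R, batch_size=20, rng=None):
    """Yield shuffled (input, target) batches of at most batch_size rows."""
    if rng is None:
        rng = np.random.default_rng()
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], R[idx]

def dropout(h, p=0.3, rng=None):
    """Zero each unit with probability p and rescale the rest (assumed variant)."""
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)

def mse(pred, target):
    """Mean squared error criterion."""
    return np.mean((pred - target) ** 2)

rng = np.random.default_rng(0)
X = rng.random((50, 260))  # placeholder inputs, one row per training vector
R = rng.random((50, 260))  # placeholder targets
batches = list(minibatches(X, R, batch_size=20, rng=rng))
dropped = dropout(np.ones((1000, 260)), p=0.3, rng=rng)
```

One pass over `batches` corresponds to a single epoch; the text reports 150 such passes per training step.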
III.b Audio Corpus Preparation and Subjective Evaluation
For the evaluation of the proposed system we utilized a
different dataset obtained from an online source.
This consisted of different unmastered audio tracks in
multi-channel form, which can be categorized into various
music genres, e.g. jazz, pop, rock, ethnic, electronic
etc. Each audio track was mixed by the authors
using a typical digital audio workstation (DAW).
The mixing process yielded four stems (groups) of the
aforementioned multiple channels: vocals, percussion,
bass and other.
From these stems we exported two versions of the eight
audio tracks. One version contained the mixture of the
stems alongside a professional mastering procedure,
following guidelines and best practices for dynamic
range compression and equalization described in [1, 2].
For the second version only the mixing process was
considered. Table 2 demonstrates the utilized apparatus
for mixing and mastering the audio corpus.
Table 2: Utilized apparatus.
Usage of apparatus    Brand & model
Monitoring system     Audio-Technica ATH-M40fs
I/O interface         N.I. Komplete Audio 6
DAW                   Pro Tools First
The version which contained only the mixture
served as input to the proposed system and to one commercial
software package that is acknowledged to perform automated
procedures in audio mastering. In more
detail, this software, denoted as AAMS, performs
spectral equalization and dynamic range compression
for audio mastering purposes, by defining the
music genre of the input audio signal. After the genre
definition based on descriptions of , the automatic
procedure took place and the outcome was stored in an
From the above procedure the three resulting versions,
i.e. professionally mastered, processed by the proposed
method and processed by the AAMS software, were segmented
into instances of 30 seconds. The segmentation was
performed for each individual audio track, but the same
time regions were selected for all the versions of each track.
The criterion for segmentation was the contribution
of all the stems to the mixture. In addition,
all the versions were normalized to have equal
RMS energy, since loudness is outside the scope of this
work.
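The segmentation and level matching can be illustrated with a short sketch. The target RMS value of 0.1 and the 44.1 kHz sampling rate are arbitrary assumptions, the signals are random placeholders, and the function names are hypothetical.

```python
import numpy as np

def rms(x):
    """Root-mean-square level of a signal."""
    return np.sqrt(np.mean(np.square(x)))

def normalize_rms(x, target=0.1):
    """Scale x so that its RMS equals the (arbitrarily chosen) target."""
    return x * (target / rms(x))

def segment(x, fs, start_s, dur_s=30.0):
    """Cut dur_s seconds starting at start_s; use the same region for every version."""
    a = int(round(start_s * fs))
    return x[a:a + int(round(dur_s * fs))]

fs = 44100
rng = np.random.default_rng(0)
version_a = rng.standard_normal(2 * fs)        # placeholder for one version
version_b = 0.2 * rng.standard_normal(2 * fs)  # placeholder for a quieter version

seg_a = normalize_rms(segment(version_a, fs, start_s=0.5, dur_s=1.0))
seg_b = normalize_rms(segment(version_b, fs, start_s=0.5, dur_s=1.0))
```

After normalization both excerpts have the same RMS, so listeners compare dynamics and spectral balance rather than level.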
Nine experienced professional music producers and
mixing and mastering engineers, with relevant studies,
participated in a subjective evaluation experiment. The
main objective was to grade each version according to
their subjective preference, with 1 as the lowest
grading point, denoting poor performance, and
10 as the highest, denoting best performance. All grades were given with
respect to the dynamic range and spectral balance of the
audio material. The versions were randomly shuffled
before the experiment, while the number
of playback repetitions and the monitoring/audio
reproduction hardware used were left to each participant.
The only requirement was the usage of studio quality
IV Results & Discussion
Results from the subjective evaluation are illustrated in
Figure 2. The lower and upper quartiles are depicted
by the lower and upper horizontal lines of each box.
The red line indicates the median grade, while a cross
denotes an outlier in the observations.
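The quantities shown in such a box plot can be computed as in the sketch below. The grades are invented purely for illustration and are not the ratings collected in this study. Flagging outliers beyond 1.5 times the interquartile range is the common box-plot convention and is assumed here, since the text does not state the rule used. `box_stats` is a hypothetical name.

```python
import numpy as np

def box_stats(grades):
    """Return lower quartile, median, upper quartile and flagged outliers."""
    g = np.asarray(grades, dtype=float)
    q1, med, q3 = np.percentile(g, [25, 50, 75])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # assumed outlier fences
    outliers = g[(g < low) | (g > high)]
    return q1, med, q3, outliers

# Invented example grades on the 1 to 10 scale (NOT data from the experiment).
example_grades = [2, 5, 6, 6, 7, 7, 8, 8, 9]
q1, med, q3, outliers = box_stats(example_grades)
```

Comparing `q1`, `med` and `q3` across versions is the kind of reading applied to Figure 2 in the discussion that follows.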
By observing Figure 2 it can be seen that the proposed
system performs worse than professional mastering
operations, but on average equally well with the AAMS
commercial system that we utilized. A closer inspection
of the figure also reveals that although
our system and the AAMS system have an equal median rating,
the former received more high ratings than the latter.
The difference between the upper quartiles is on the order
of one degree in the employed rating scale. This
indicates that the upper 25% of the ratings were
significantly higher than those of the reference system.
A similar trend can be seen in the lower quartile,
where the proposed system exhibits greater minimum
ratings than the AAMS one. The difference in the lower
quartile values between the proposed and the reference
systems is on the order of 1.5 points in the used rating
scale.
Fig. 2: Variation Analysis of Subjective Grading for
the three versions (Professionally Mastered, Proposed System, AAMS).
Finally, one more interesting observation is that the upper
quartile value in Figure 2 for the proposed system
is almost the same as that of the professionally
mastered versions, and the lower quartile
is less than one rating degree below the corresponding
one of the professionally mastered versions. This
demonstrates the improvement in the resulting
dynamic range compression and spectral balance
of the proposed system over the existing state of the
art, where the reference system had an upper quartile
lower by about one rating degree and a lower quartile
lower by almost two rating degrees.
V Conclusions
In the work at hand we focused on automated audio
signal processing for audio mastering applications. We
utilized DNNs, relying on the useful initialization provided
by auto encoders, for predicting dynamic range
compression and spectral balance enhancement parameters.
The latter were automatically applied to unmastered
audio tracks. The resulting automatically mastered
audio material was compared to professionally mastered
versions of the same musical compositions. In
addition, we also created automatically mastered versions
of the same audio tracks with another,
commercial system for automated mastering.
In order to evaluate our system we compared the above-mentioned
mastered versions, i.e. the professionally
mastered one, the one from the proposed system and the one from the
reference system, by conducting subjective evaluation
tests, in which currently active
professional mastering and recording engineers participated. The results
of the subjective evaluation tests showed that the
proposed system achieves the same average rating as the
reference one and a lower rating than the professionally mastered
versions. Nevertheless, the proposed system clearly
received more high ratings than the reference one, as
illustrated in the resulting box plots of the subjective
evaluation.
Still, there are significant improvements to
be implemented in the existing automated mastering
systems in order to achieve a subjective rating similar
to the one that a professional mastering engineer would
achieve.
Acknowledgements
The research leading to these results has received funding
from the European Union’s H2020 Framework
Programme (H2020-MSCA-ITN-2014) under grant
agreement no. 642685 MacSeNet.
References
[1] Owsinski, B., The Mastering Engineer’s Handbook: The Audio Mastering Handbook, Artistpro, 2nd edition, 2007.
[2] Katz, B., Mastering Audio: The Art and the Science, Focal Press, 2nd edition, 2007.
[3] De Man, B., Leonard, B., King, R., and Reiss, J. D., “An Analysis and Evaluation of Audio Features for Multitrack Music Mixtures,” in 15th International Society for Music Information Retrieval Conference (ISMIR 2014), 2014.
[4] Reiss, J. D., “Intelligent systems for mixing multichannel audio,” in 17th International Conference on Digital Signal Processing (DSP), pp. 1–6, Corfu, Greece, 2011.
[5] Ma, Z., De Man, B., Pestana, P. D. L., Black, D. A. A., and Reiss, J. D., “Intelligent Multitrack Dynamic Range Compression,” J. Audio Eng. Soc, 63(6), pp. 412–426, 2015.
[6] Mimilakis, S.-I., Drossos, K., Floros, A., and Katerelos, D., “Automated Tonal Balance Enhancement for Audio Mastering Applications,” in Audio Engineering Society Convention 134, Audio Engineering Society, 2013.
[7] Hilsamer, M. and Herzog, S., “A Statistical Approach to Automated Offline Dynamic Processing in the Audio Mastering Process,” in Proc. of the 17th International Conference on Digital Audio Effects (DAFx-14), pp. 35–40, Erlangen,
[8] Févotte, C., Bertin, N., and Durrieu, J.-L., “Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis,” Neural Comput., 21(3), pp. 793–830, 2009.
[9] Sarver, R. and Klapuri, A., “Application of Non-Negative Matrix Factorization to Signal-Adaptive Audio Effects,” in Proc. of the 14th Conference on Digital Audio Effects (DAFx-11), volume 45, pp. 249–252, Paris, France, 2011.
[10] Simsekli, U., Liutkus, A., and Cemgil, T., “Alpha-Stable Matrix Factorization,” IEEE Signal Processing Letters, p. 5, 2015.
[11] Bengio, Y., “Learning deep architectures for AI,” Foundations and Trends in Machine Learning, 2(1), pp. 1–127, 2009.
[12] Smaragdis, P., “NMF? Neural Nets? It’s all the same...” http://youtube.com/watch?v=wfmpViJIjWw, November, 2015, presentation; Accessed December-2015.
[13] Moore, B. C., editor, Hearing (Handbook of Perception and Cognition), Academic Press, San Diego, California, 2nd edition, 1995.
[14] Dimensions, A., “Mastering Audio Samples- before and after mastering.” 2015, online; Accessed
[15] Zoelzer, U., Digital Audio Signal Processing, John Wiley & Sons, 2nd edition, 2008.
[16] Kingma, D. P. and Ba, J., “Adam: A Method for Stochastic Optimization,” CoRR, abs/1412.6980,
[17] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R., “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” Journal of Machine Learning Research, 15, pp. 1929–1958, 2014.
[18] Senior, M., Mixing Secrets For the Small Studio, Focal Press, 2011, online Dataset; http:
[19] Curioza, S. F., “AAMS: Auto Audio Mastering System,” http://curioza.com, 2011.
Freeware to download, with strong encouragement to register for the AAMS V3 full professional version. Registration entitles users to upgrade to the AAMS V3 full version, with all options unlocked and full control. Fill in our contact form for registrations or questions. Go to our Shop now!