Subject: Decimation bug when processing / saving in 16 bit
Posted by: JonP01
Date: 1/14/2011 3:57:26 AM
Hello, I have been a Sound Forge Pro 10 user for some months and have always wondered why I suffer a significant drop in sound quality when saving 16 bit files. This happens even when using the MBIT+ algorithm, which is known to be one of the best decimation and dithering algorithms in existence. Whilst I realise a quality drop when going from 24 bit to 16 bit is inevitable, the loss of quality has been quite serious - far more than I expected.

I decided to investigate further and I believe I have discovered quite a serious bug in Sound Forge Pro 10 that occurs whenever a 16 bit file is saved to the hard drive (using the File / Save function). If I am right about this bug (and I can reproduce the problem 100% of the time on my particular machine), then it has serious implications for anyone producing 16 bit files with this application - for example, files to be used for the creation of a CD - because the quality of the MBIT+ algorithm is completely lost.

Listed below is a sequence of steps which reveals the bug on my machine. What you should find is that after following these steps, Sound Forge Pro 10 will attempt to decimate any file that is 16 bits (even though once it is 16 bits, no decimation or dithering is even required, since the file is already at the target bit depth). The result is that when the 16 bit file is actually saved, a series of artefacts appears in the file - artefacts typical of what one would see if a 24 bit file were simply decimated without any noise shaping or dithering.

1. With Sound Forge Pro 10 open, open a new (blank) file.

2. Right-click on the window containing the blank file. If the properties do not reflect a 48 kHz file at 24 bit, change them so that they do. You should now have a blank 24 bit, 48 kHz file.

3. Select Insert / Synthesis / Simple.

4. Set the Amplitude to 0 dBFS. Set the Waveform shape to Sine. Change the length to 60 seconds. Change the start frequency to 1,000 Hz. Ensure the End Frequency and Log Sweep checkboxes are unchecked.

5. Click on the OK button. You should now have a 60 second file containing a 1 kHz sine wave at 0 dBFS with a bit depth of 24 bits and a sample rate of 48 kHz.

6. If the Spectrum Analysis window is not already open, open it (View / Spectrum Analysis).

7. Highlight, say, 50 seconds of the sine wave file you created in the first five steps. Then click on the Refresh icon in the Spectrum Analysis window. This icon is the fourth object along the top of the window and looks very similar to the Refresh icon in Windows Internet Explorer 8.

8. You should now see the spectrogram results - a clean looking sine wave showing at 1 kHz with no artefacts (the sine wave will appear to "spread out" below -120 dB, but importantly there are no actual artefacts showing).

9. Now save this file. When you save it, make sure you select the template 48,000 Hz, 24 Bit, Stereo PCM. You can call it any name you like - even the default one - but make sure you actually save it using the 24 bit, 48 kHz template as described.

10. Refresh the spectrogram once again. You will see that it has not changed - it looks precisely the same as it did before you saved the file.

11. Now we are going to create a 16 bit version of this file using MBIT+. Select Process / Bit Depth / iZotope MBIT+ Dither. Select bit depth 16, Dither mode MBIT+, Noise shaping Medium, Dither amount Normal.

12. Press the OK button. (You may get a warning message that it will convert the entire file and not just the highlighted area, assuming you still have the file highlighted. This is of course fine - let it convert the file.)

13. You should now have a nicely decimated, dithered and noise shaped file. Once again, refresh your spectrogram (remember to highlight around 50 seconds of the file). Note how the file has been beautifully converted to 16 bits - you should see added noise at very low levels throughout the file, a "peak" of noise starting near the top of the spectrum roughly around 19 kHz, and certainly no horrible artefacts anywhere.

14. So far so good - everything is looking lovely for our newly created 16 bit file. But now here is the kicker. Save this file - you can either go File / Save, File / Save As, or even just click on the Save icon. It won't matter whether you use the 48 kHz 24 bit template or the 48 kHz 16 bit template - the results will be exactly the same in every case.

15. Now refresh your spectrogram. Look at the decimation artefacts which have suddenly appeared as soon as you saved the file! There is now a whole heap of "spikes" in the file, starting at 3 kHz and repeating at every 1 kHz increment. This file looks nothing like the dithered 16 bit file you had just before you clicked on the Save icon.

I should add that I am unable to reproduce this problematic behaviour if I record a file at 16 bits to begin with. For example, if I start a brand new recording at 48 kHz, 16 bit and then save it, the artefacts do not appear upon saving the just-recorded file. So the problem seems to manifest itself only when processing has occurred on a 24 bit file and the file is then decimated, dithered, noise shaped and saved.

In other words, not only are we not getting any benefit at all out of MBIT+ when using Sound Forge Pro 10, but saving a 16 bit file that was originally at 24 bits is extremely harmful to the sound, since Sound Forge Pro effectively creates decimation artefacts at the point when the file is finally saved.

I hope that Sony can look into this and reproduce the steps I have outlined. At the moment - at least on my installation - it is not possible to do any processing in 16 bits unless the file is never actually saved, because as soon as it is saved, artefacts suggestive of a decimation process that isn't even required are added to the file. Thank you.

Message last edited on 1/14/2011 4:45:58 AM by JonP01.
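[Editor's note: the scenario in the steps above can be reproduced numerically outside Sound Forge. The sketch below is only an approximation - plain TPDF dither stands in for the proprietary MBIT+ algorithm, and the "save" step is modelled as round-to-nearest followed by bounding to the 16-bit range. It shows that dithering a 0 dBFS sine in floating point produces samples just over full scale, some of which clip when the file is finally quantized - the kind of correlated error that appears as harmonic spikes in a spectrogram.]

```python
import math
import random

FS = 48_000          # sample rate (Hz), as in the repro steps
F0 = 1_000           # sine frequency (Hz)
N = FS               # one second of audio is enough to see the effect
FULL_SCALE = 32767   # positive full scale for 16-bit PCM

random.seed(0)       # reproducible dither sequence

def tpdf_dither():
    """Triangular-pdf dither, 2 LSB peak-to-peak (a generic stand-in for MBIT+)."""
    return random.random() - random.random()

# A 0 dBFS sine, modelled in floating point (the editor's internal format).
sine = [math.sin(2 * math.pi * F0 * n / FS) for n in range(N)]

# Dither toward a 16-bit target but keep the result in float: values just
# over full scale ("overs") survive, because float can represent them.
internal = [s * FULL_SCALE + tpdf_dither() for s in sine]
overs = sum(1 for v in internal if v > FULL_SCALE)

# Model File / Save: round to nearest (Python's round() is round-half-to-even,
# i.e. banker's rounding) and bound to the 16-bit range. Overs become clips.
clips = sum(1 for v in internal if round(v) > FULL_SCALE)
saved = [max(-32768, min(32767, round(v))) for v in internal]

print(f"overs in float result: {overs}, samples clipped on save: {clips}")
```

Running this shows overs at the sine peaks in the floating-point result, a fraction of which round past 32767 and clip on the modelled save; clipping a periodic signal is exactly what produces spikes at harmonics of the fundamental.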
Subject: RE: Decimation bug when processing / saving in 16 bit
Reply by: ForumAdmin
Date: 1/14/2011 9:17:17 AM
Short version: Add headroom before you dither (Volume -0.01 dB).

Long version, per your steps:

1-10: Generate a 24-bit PCM source.

11-12: Apply MBIT+. The 24-bit data is converted to floating point as it is processed, and the internal result is stored as floating point. Now, adding appropriate dither (probability density function spanning 2 LSB peak-to-peak) to cancel out quantization artifacts on a 0 dB sine will result in overs at some of the peaks, no matter how the noise is shaped. So there's an internal result that contains overs, and the target bit depth is set to 16-bit. While things that request 16-bit data (playback, waveform, levels toolbar) will get the correct 16-bit value, the internal result remains float. That is, the last 8 bits of mantissa still retain data.

13: Analyze prior to Save. The spectrum analysis tool (and, as alluded to above, pretty much any processing operation) converts all input to floating point. Since the internal result is still stored as float, there's no conversion to perform; the audio remains continuous (though it contains overs) and the analysis appears smooth. Here is the crux of the problem. One could easily argue that Spectrum Analysis should always read/convert to the target bit depth, then re-convert back up to its processing resolution. There are some good arguments not to do so as well, depending on your workflow. So this should probably be a user option.

14: Save. At this point, the floating point data (with overs) is converted to 16-bit PCM with standard banker's rounding (aka round-to-nearest-even), which is the statistically appropriate standard method for audio. The overs present in the floating point source are now bound to PCM resolution ([-32768, 32767] for 16-bit PCM). That is, they're now clips.

15: Analysis of the re-quantized 16-bit result - clips and all, hence the aliasing in the spectrum.

Summarily:

A) Don't dither peak signals. Make sure you have at least 1 LSB of headroom.

B) Spectrum Analysis would benefit from a user option to pre-quantize to the current bit depth so you don't have to resort to Save.

J.

Message last edited on 1/14/2011 9:36:26 AM by ForumAdmin.
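[Editor's note: the "at least 1 LSB of headroom" rule above can be sanity-checked with a small pure-Python sketch. TPDF dither is used as a generic stand-in for MBIT+; the save step is modelled as round-to-nearest plus bounding to the 16-bit range. With peaks at exactly full scale some samples clip; with just one LSB of headroom, none can.]

```python
import math
import random

FS, F0, N = 48_000, 1_000, 48_000   # 1 kHz sine, one second at 48 kHz

def clipped_samples(peak):
    """Dither a sine whose peak is `peak` (in 16-bit LSB units), then model
    Save as round-to-nearest plus bounding to [-32768, 32767]. Returns how
    many samples land outside that range before bounding, i.e. would clip."""
    random.seed(0)  # identical dither sequence for a fair comparison
    clips = 0
    for n in range(N):
        v = peak * math.sin(2 * math.pi * F0 * n / FS)
        v += random.random() - random.random()   # TPDF dither, 2 LSB pk-pk
        if round(v) > 32767 or round(v) < -32768:
            clips += 1
    return clips

no_headroom = clipped_samples(32767)   # 0 dBFS peaks: dither creates overs
one_lsb = clipped_samples(32766)       # 1 LSB headroom: dither spans at most
                                       # +/-1 LSB, so no peak can pass 32767
print(no_headroom, one_lsb)
```

Since the dither never exceeds one LSB in either direction, a single LSB of headroom is mathematically sufficient to guarantee that no sample clips on save.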
Subject: RE: Decimation bug when processing / saving in 16 bit
Reply by: JonP01
Date: 1/14/2011 10:04:33 AM
Thanks so much for the very prompt and thorough reply. Your advice works - I tried it with no problems at all. In my projects I have generally been normalising using the peak level function and saw no reason not to go to 0 dBFS whilst still working in 24 bit (since I see this done quite ubiquitously on commercial CD releases - even "audiophile" classical ones). But yes, peak normalisation to -0.1 dBFS prevents the issue from happening. I agree with what you say about "B" ("...Spectrum Analysis would benefit from a user option to pre-quantize to the current bit-depth so they don't have to resort to Save..."). Perhaps this might be looked into for a future update, with a checkbox to turn that behaviour on or off. Thanks again.
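[Editor's note: both figures mentioned in this thread - the "Volume -0.01 dB" suggestion and peak normalisation to -0.1 dBFS - leave far more than the 1 LSB minimum. A quick back-of-envelope check using the standard dB-to-linear conversion; nothing here is specific to Sound Forge.]

```python
# How many 16-bit LSB of headroom does a given peak level in dBFS leave
# below positive full scale (32767)?
def headroom_lsb(peak_dbfs):
    peak = 32767 * 10 ** (peak_dbfs / 20)   # standard dB-to-linear conversion
    return 32767 - peak

h_save_tip = headroom_lsb(-0.01)    # the "Volume -0.01 dB" suggestion
h_normalise = headroom_lsb(-0.1)    # peak normalisation to -0.1 dBFS

print(f"-0.01 dBFS leaves about {h_save_tip:.0f} LSB of headroom")
print(f"-0.1 dBFS leaves about {h_normalise:.0f} LSB of headroom")
```

Both leave comfortably more than the single LSB needed to absorb 2 LSB peak-to-peak dither around the peaks (roughly 38 and 375 LSB respectively).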
Subject: RE: Decimation bug when processing / saving in 16 bit
Reply by: musicvid10
Date: 1/14/2011 9:02:12 PM
I appreciate the depth of this discussion, and I have learned from it. One of the first pieces of advice I received when making the transition to digital audio was, "Never normalize peaks to 0 dBFS." Only later did I find out about intersample peaks. Now I have yet one more good reason not to do so.

Message last edited on 1/14/2011 9:26:05 PM by musicvid10.