Fix librosa.load() Returning Values Greater Than 1.0 – Python Librosa Tutorial

admin

2 years ago

Usually, we use librosa.load() in Python to read a WAV file, and the NumPy array it returns contains values between -1.0 and 1.0. Here is the tutorial:

Understand librosa.load() is Between -1.0 and 1.0 – Librosa Tutorial

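However, librosa.load() may sometimes return samples whose absolute value is greater than 1.0. This can happen, for example, when the audio is resampled to a different sample rate and the resampling overshoots near full-scale peaks. In this tutorial, we will show you how to fix it.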

For example:

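import librosa
import numpy as np

wav = r"audio_data/speech-us-gov-0028.wav"
# Load as mono and resample to 8000 Hz
wav_data, sr = librosa.load(wav, sr=8000, mono=True)
print(sr)
print(wav_data)
# Largest absolute sample value
print(np.abs(wav_data).max())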

Running this code, we may see:

8000
[-2.5019117e-05 -9.3096860e-06  2.3915986e-06 ...  4.3445010e-02
  2.1312233e-02  0.0000000e+00]
1.1275722

This means the sample rate is 8000 and the maximum absolute value in wav_data is 1.1275722, which is greater than 1.0.
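To see how far the signal strays outside the expected range, it can help to count the offending samples as well as print the peak. Here is a minimal sketch. It uses a small synthetic array in place of wav_data so that it runs without an audio file, and the helper name count_out_of_range is our own, not part of librosa.

```python
import numpy as np

def count_out_of_range(samples, limit=1.0):
    """Return (peak, n_over): the largest absolute value and the number of
    samples whose absolute value exceeds `limit`."""
    magnitudes = np.abs(samples)
    peak = float(magnitudes.max())
    n_over = int(np.count_nonzero(magnitudes > limit))
    return peak, n_over

# Synthetic stand-in for wav_data (illustrative values, not read from a file)
demo = np.array([0.0, 0.5, -1.1275722, 0.25, 1.02], dtype=np.float32)
peak, n_over = count_out_of_range(demo)
print(peak, n_over)
```

If only a handful of samples are above the limit, the overshoot is probably confined to a few loud peaks rather than the whole recording.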

This may cause errors. For example, if we use webrtcvad to process this file, it needs the values to be between -1.0 and 1.0. You may get an error:

ValueError: when data.type is float, data must be -1.0 <= data <= 1.0.
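An error like this presumably comes from a simple range test on float input. The sketch below is a hypothetical validator that imitates this behavior for illustration only. The name check_float_range and its exact logic are our assumptions, not the actual webrtcvad code.

```python
import numpy as np

def check_float_range(data):
    """Hypothetical validator: raise ValueError if float data leaves [-1.0, 1.0]."""
    data = np.asarray(data)
    is_float = np.issubdtype(data.dtype, np.floating)
    if is_float and data.size > 0 and np.abs(data).max() > 1.0:
        raise ValueError("float data must satisfy -1.0 <= data <= 1.0")
    return data

# In-range data passes through unchanged
check_float_range(np.array([0.2, -0.9], dtype=np.float32))

# A single sample above 1.0 is enough to trigger the error
try:
    check_float_range(np.array([0.2, 1.1275722], dtype=np.float32))
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

A check of this kind looks at the whole array, so even one overshooting sample makes the call fail.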

How to fix this error?

We should scale the values so that they all lie between -1.0 and 1.0. We can do it as follows:

# Rescale only if the peak exceeds 1.0, leaving a little headroom below 1.0
peak = np.abs(wav_data).max()
if peak > 1.0:
    wav_data *= 0.99 / peak
print(wav_data)
print(np.abs(wav_data).max())

Running this code, we will see:

[-2.1966598e-05 -8.1738353e-06  2.0998059e-06 ...  3.8144395e-02
  1.8711982e-02  0.0000000e+00]
0.99

The maximum absolute value in wav_data is now 0.99, which is lower than 1.0.
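If you need this fix in several places, it may be convenient to wrap it in a function. Here is a minimal sketch of the same scaling idea. The name normalize_peak and the target_peak parameter are our own choices, and the demo array is synthetic. For comparison, it also shows np.clip(), which is a different technique: it cuts off only the out-of-range samples instead of scaling the whole signal.

```python
import numpy as np

def normalize_peak(samples, target_peak=0.99):
    """Return a float32 copy scaled so its peak equals `target_peak`, but only
    when the original peak exceeds 1.0; otherwise return an unchanged copy."""
    samples = np.asarray(samples, dtype=np.float32)
    peak = np.abs(samples).max() if samples.size else 0.0
    if peak > 1.0:
        # Multiply every sample by the same factor to keep the waveform shape
        return samples * np.float32(target_peak / peak)
    return samples.copy()

# Synthetic stand-in for wav_data (illustrative values)
demo = np.array([0.1, -0.5, 1.1275722], dtype=np.float32)
scaled = normalize_peak(demo)        # all samples shrink by the same factor
clipped = np.clip(demo, -1.0, 1.0)   # only the out-of-range sample changes
print(np.abs(scaled).max(), np.abs(clipped).max())
```

Scaling keeps the relative amplitudes of all samples but makes the whole signal slightly quieter. Clipping keeps the level of in-range samples but distorts the peaks. Which one is better depends on your application.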

With this change, the error is fixed.