The Difference Between scipy.io.wavfile.read() and librosa.load() in Python – Python Tutorial

August 9, 2021

To read an audio file in Python, we can use scipy.io.wavfile.read() or librosa.load(). In this tutorial, we will introduce the differences between them.

scipy.io.wavfile.read()

scipy.io.wavfile.read(filename, mmap=False)

This function opens a wav file and returns its sample rate and its audio data as a NumPy array.
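As a minimal sketch of that return value, the snippet below writes a synthetic tone to a temporary wav file and reads it back. The tone, the 8000 Hz rate, and the file name demo_tone.wav are assumptions made for illustration; they are not part of the original example.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Build one second of a 440 Hz tone as 16-bit PCM (synthetic test data).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

# Write it to a temporary file, then read it back.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_tone.wav")  # hypothetical file name
wavfile.write(demo_path, sample_rate, tone)

fs, wavdata = wavfile.read(demo_path)
print(fs)             # sample rate stored in the file header
print(wavdata.dtype)  # dtype follows the sample format stored in the file
print(wavdata.shape)  # one entry per sample for a mono file
```

Note the return order: wavfile.read() gives (sample rate, data), while librosa.load() gives (data, sample rate).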

librosa.load()

librosa.load(path, sr=22050, mono=True, offset=0.0, duration=None, dtype=<class 'numpy.float32'>, res_type='kaiser_best')

This function loads an audio file, resamples it to the sample rate sr (unless sr is None), and returns the audio data and the sample rate.

We will compare them using some examples.

scipy.io.wavfile.read() Vs librosa.load()

scipy.io.wavfile.read() cannot read a wav file at a custom sample rate, but librosa.load() can.

For example:

from scipy.io import wavfile
import librosa

audio_file = './waihu/6eb2612c-fc23-4ead-b2dd-05009817f7e7.wav'

# scipy returns (sample rate, data)
fs, wavdata = wavfile.read(audio_file)
print(fs)
print(type(wavdata))

# librosa returns (data, sample rate); sr is not passed, so the default is used
audio, fs = librosa.load(audio_file)
print(fs)

# ask librosa for a custom sample rate
audio, fs = librosa.load(audio_file, sr = 4000)
print(fs)

Run this code and you will get:

8000
<class 'numpy.ndarray'>
22050
4000

It means:

  • scipy.io.wavfile.read() can only read a wav file at its original sample rate (8000 here).
  • If sr is not passed, librosa.load() resamples the audio to its default sample rate, 22050. Passing sr = None keeps the original sample rate of the file instead.
  • If we set sr, librosa.load() resamples the audio to that sample rate.
  • If you have many wav files with different sample rates, librosa.load() is a good choice for reading them all at one common sample rate.
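To illustrate what reading at a different sample rate does to the data, here is a rough sketch that resamples a signal with scipy.signal.resample_poly. This is a stand-in only: librosa.load() uses its own resampler (selected by res_type), so the sample values will not match librosa exactly. The helper name resample_to and the synthetic 440 Hz signal are assumptions for illustration.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample_to(samples, orig_sr, target_sr):
    """Resample a 1-D signal from orig_sr to target_sr (hypothetical helper)."""
    if orig_sr == target_sr:
        return samples
    # Reduce the rate ratio to the smallest integer up/down factors.
    g = gcd(orig_sr, target_sr)
    return resample_poly(samples, target_sr // g, orig_sr // g)


# One second of synthetic audio at 8000 Hz, standing in for wavfile.read() data.
orig_sr = 8000
samples = np.sin(2 * np.pi * 440 * np.arange(orig_sr) / orig_sr)

half = resample_to(samples, orig_sr, 4000)
up = resample_to(samples, orig_sr, 22050)

# The number of samples scales with the sample rate; the duration stays 1 second.
print(len(samples), len(half), len(up))
```

The duration of the audio does not change. Only the number of samples per second does, which is why the same file yields arrays of different lengths at sr = 4000 and sr = 22050.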

Next, compare the values that the two functions return:

from scipy.io import wavfile
import librosa
import numpy as np
np.set_printoptions(threshold=np.inf)

audio_file = './waihu/6eb2612c-fc23-4ead-b2dd-05009817f7e7.wav'

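# raw sample values as stored in the file
fs, wavdata = wavfile.read(audio_file)
print(wavdata[5000:5100])

# sr = 8000 matches the file, so the same 100 samples are compared
audio, fs = librosa.load(audio_file, sr = 8000)
print(audio[5000:5100])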

Run this code and you will see:

[-4261 -1797   585  1701  2108  1668   928   191   294  1228  2165  2229
  1134  -127  -664   -77  1101  2242  2704  2309  1328   442   371   914
  1594  1855  1493   855   660   732   632 -1586 -4957 -7701 -7927 -4847
  -367  2493  1150 -2137 -4518 -3791 -1486   492  1239  1453  1512  1122
   563   344  1263  2205  2379  1207   -45  -426   277  1300  1835  1960
  1740  1441   994   810   902  1335  1583  1363   733   598   988  1133
  -457 -4040 -7262 -8377 -5986 -1513  2121  1995 -1100 -4103 -4409 -2127
   287  1418  1419  1223   950   645   325   882  2011  2640  1896   261
  -648  -225  1215  2075]
[-0.1300354  -0.05484009  0.01785278  0.0519104   0.06433105  0.05090332
  0.02832031  0.00582886  0.00897217  0.03747559  0.06607056  0.06802368
  0.03460693 -0.00387573 -0.02026367 -0.00234985  0.03359985  0.06842041
  0.08251953  0.07046509  0.04052734  0.01348877  0.01132202  0.02789307
  0.04864502  0.05661011  0.04556274  0.02609253  0.0201416   0.02233887
  0.01928711 -0.04840088 -0.15127563 -0.23501587 -0.24191284 -0.1479187
 -0.01119995  0.07608032  0.03509521 -0.06521606 -0.13787842 -0.11569214
 -0.04534912  0.01501465  0.03781128  0.04434204  0.04614258  0.03424072
  0.0171814   0.01049805  0.0385437   0.06729126  0.07260132  0.03683472
 -0.00137329 -0.01300049  0.00845337  0.03967285  0.05599976  0.05981445
  0.05310059  0.04397583  0.03033447  0.02471924  0.02752686  0.04074097
  0.04830933  0.04159546  0.02236938  0.01824951  0.03015137  0.03457642
 -0.01394653 -0.12329102 -0.22161865 -0.25564575 -0.18267822 -0.0461731
  0.06472778  0.06088257 -0.03356934 -0.12521362 -0.134552   -0.06491089
  0.00875854  0.04327393  0.04330444  0.037323    0.0289917   0.01968384
  0.00991821  0.0269165   0.06137085  0.08056641  0.05786133  0.00796509
 -0.01977539 -0.00686646  0.03707886  0.06332397]

We can see that:

scipy.io.wavfile.read() returns the raw sample values stored in the file, which are 16-bit integers here. librosa.load() returns floating-point values (float32 by default) scaled to the range -1 to +1.
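The two printed arrays are consistent with a simple scaling: each 16-bit integer divided by 32768 (2 to the power 15) gives the corresponding float. The sketch below checks this on the first six samples copied from the output above. It assumes a 16-bit mono file loaded at its original sample rate, as in this example; with resampling, the values would no longer line up one to one.

```python
import numpy as np

# First six int16 samples from the wavfile.read() output shown above.
int_samples = np.array([-4261, -1797, 585, 1701, 2108, 1668], dtype=np.int16)

# Divide by 2**15 to map the int16 range onto [-1.0, 1.0).
float_samples = int_samples.astype(np.float32) / 32768.0
print(float_samples)

# First six values from the librosa.load() output shown above.
librosa_samples = np.array(
    [-0.1300354, -0.05484009, 0.01785278, 0.0519104, 0.06433105, 0.05090332],
    dtype=np.float32,
)

# The scaled integers agree with the librosa values.
print(np.allclose(float_samples, librosa_samples))  # True
```

So if you read a 16-bit wav file with scipy.io.wavfile.read() and need values comparable to librosa.load(), you can scale the data in this way yourself.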
