The skimage.feature.peak_local_max() function finds peaks (local maxima) in an image. In this tutorial, we will use some examples to show how to use it.
Syntax
skimage.feature.peak_local_max() is defined as:
skimage.feature.peak_local_max(image, min_distance=1, threshold_abs=None, threshold_rel=None, exclude_border=True, indices=True, num_peaks=inf, footprint=None, labels=None, num_peaks_per_label=inf, p_norm=inf)
It finds the peaks in an image and returns them as a coordinate list or a boolean mask.
Parameters
Here are some important parameters we should notice:
image: the image data; it should be a numpy ndarray, for example one with shape [60, 300].
min_distance: int, the minimum allowed distance separating peaks. To find the maximum number of peaks, use min_distance=1. Larger values such as 4 or 8 return fewer, more widely spaced peaks.
threshold_abs: float, minimum intensity of peaks. By default, the absolute threshold is the minimum intensity of the image.
threshold_rel: float, minimum intensity of peaks, calculated as max(image) * threshold_rel.
indices: boolean, controls the form of the output. The output will be:
- If indices = True : an array of (row, column, …) coordinates of peaks, sorted by peak value (larger first).
- If indices = False : a boolean array shaped like image (image.shape), with peaks represented by True values.
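To make the effect of min_distance and threshold_rel concrete, here is a minimal sketch on a small synthetic image. The image size, peak positions and peak values are made up for illustration, and the calls use only the parameters described above.

```python
import numpy as np
from skimage.feature import peak_local_max

# Synthetic 20x20 image with three isolated bright pixels (values are made up).
image = np.zeros((20, 20))
image[8, 8] = 10.0    # strong peak
image[8, 10] = 6.0    # weaker peak, 2 pixels away from the strong one
image[14, 14] = 0.3   # very weak peak

# min_distance=1 keeps every local maximum above the default threshold.
all_peaks = peak_local_max(image, min_distance=1)

# threshold_rel=0.05 drops peaks below 0.05 * max(image) = 0.5.
strong_peaks = peak_local_max(image, min_distance=1, threshold_rel=0.05)

# min_distance=3 suppresses the weaker of two peaks that are too close together.
spaced_peaks = peak_local_max(image, min_distance=3)

print(len(all_peaks), len(strong_peaks), len(spaced_peaks))
```

Each result is an array with one (row, column) pair per peak, so its length is the number of peaks found.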
You can learn about the other parameters here:
https://scikit-image.org/docs/stable/api/skimage.feature.html?highlight=peak_local_max#skimage.feature.peak_local_max
How to use skimage.feature.peak_local_max()?
We will use some examples to show you how to use it.
For example, we can read an audio file and treat its spectrogram as an image.
import numpy as np
import librosa
import librosa.display
from matplotlib import pyplot as plt
from skimage.feature import peak_local_max

wav_file = "query_recordings/pop.00085-snippet-10-20.wav"

def normalise(wave):
    # scale the waveform to the range [0, 1]
    wave = (wave - np.min(wave)) / (np.max(wave) - np.min(wave))
    return wave

def calculate_stft(file, n_fft=512, plot=True):
    # read wav data
    y, sr = librosa.load(file, sr=8000)
    win_length = int(0.025*sr)
    hop_length = int(0.01*sr)
    print(win_length, hop_length)
    signal = normalise(y)
    # compute and plot STFT spectrogram
    D = np.abs(librosa.stft(signal, n_fft=n_fft, window='hann', win_length=win_length, hop_length=hop_length))
    if plot:
        plt.figure(figsize=(10, 5))
        # pass hop_length so that the time axis matches the STFT frames
        librosa.display.specshow(librosa.amplitude_to_db(D, ref=np.max), y_axis='linear', x_axis='time', cmap='gray_r', sr=sr, hop_length=hop_length)
        plt.show()
    print(D.shape)
    return D
Here D is the STFT magnitude of the wav file:
D = calculate_stft(wav_file)
Running this code plots the spectrogram and prints the shape of D, which is (257, 1004). We can treat D as an image with width = 1004 and height = 257.
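The shape of D follows from the STFT settings. The sketch below assumes librosa's default centered framing, where the number of frequency bins is 1 + n_fft // 2 and the number of frames is 1 + n_samples // hop_length. The helper stft_shape and the exact 10-second clip length are hypothetical, used only for illustration.

```python
def stft_shape(n_samples, n_fft=512, hop_length=80):
    """Expected STFT shape, assuming centered framing (librosa's default)."""
    n_bins = 1 + n_fft // 2                  # frequency bins (image height)
    n_frames = 1 + n_samples // hop_length   # time frames (image width)
    return n_bins, n_frames

sr = 8000
win_length = int(0.025 * sr)   # 25 ms window
hop_length = int(0.01 * sr)    # 10 ms hop

# A clip of exactly 10 seconds is an assumption, not the tutorial's file.
shape = stft_shape(10 * sr, n_fft=512, hop_length=hop_length)
print(win_length, hop_length, shape)
```

Under these assumptions, the 1004 frames in the tutorial would mean a clip slightly longer than 10 seconds.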
Then, we can use skimage.feature.peak_local_max() to get peaks.
def calculate_constellation_map(D, min_distance=4, threshold_rel=0.05, plot=True):
    # detect peaks from STFT and plot constellation map
    coordinates = peak_local_max(np.log(D), min_distance=min_distance, threshold_rel=threshold_rel, indices=False)
    # If indices = False : Boolean array shaped like image, with peaks represented by True values.
    print(coordinates.shape)  # (257, 1004)
    print(type(coordinates))  # <class 'numpy.ndarray'>
    print(coordinates)
    if plot:
        plt.figure(figsize=(10, 5))
        plt.imshow(coordinates, cmap=plt.cm.gray_r, origin='lower')
        plt.show()
    return coordinates

coordinates = calculate_constellation_map(D)
print(coordinates)
print(coordinates.shape)
Running this code plots the constellation map. Because indices=False, coordinates will be:
[[False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]
 ...
 [False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]]
[[False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]
 ...
 [False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]]
(257, 1004)
We can see that the values in coordinates are boolean.
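A boolean mask like this can be turned back into peak positions in physical units. The sketch below assumes the STFT settings used earlier (sr=8000, n_fft=512, hop_length=80) and centered frames. The helper mask_to_time_freq and the demo mask are hypothetical, not part of the original code.

```python
import numpy as np

def mask_to_time_freq(mask, sr=8000, n_fft=512, hop_length=80):
    """Convert a boolean peak mask (freq bins x frames) to (seconds, Hz) pairs."""
    bins, frames = np.nonzero(mask)       # row = frequency bin, column = frame
    freqs_hz = bins * sr / n_fft          # centre frequency of each bin
    times_s = frames * hop_length / sr    # start-aligned frame time in seconds
    return np.column_stack([times_s, freqs_hz])

# Demo mask with two made-up peaks, shaped like D in this tutorial.
demo_mask = np.zeros((257, 1004), dtype=bool)
demo_mask[64, 100] = True
demo_mask[128, 500] = True

points = mask_to_time_freq(demo_mask)
print(points)
```

Each row of points is one peak, given as time in seconds followed by frequency in Hz.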
If indices=True:
coordinates = peak_local_max(np.log(D), min_distance=min_distance, threshold_rel=threshold_rel, indices=True)
# If indices = True : (row, column, …) coordinates of peaks.
print(coordinates.shape)  # (961, 2)
print(type(coordinates))  # <class 'numpy.ndarray'>
print(coordinates)
coordinates will be:
[[114 575]
 [136 571]
 [116 804]
 ...
 [186 521]
 [169 199]
 [112 439]]
[[114 575]
 [136 571]
 [116 804]
 ...
 [186 521]
 [169 199]
 [112 439]]
(961, 2)
The shape of coordinates is (961, 2): each of the 961 rows holds the (row, column) coordinates of one peak.
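The two output forms carry the same information, so a coordinate array can be converted into a boolean mask by hand. This may be useful because, as far as we know, newer scikit-image releases deprecated and then removed the indices argument and always return coordinates. The sketch below is one way to do the conversion. The helper peaks_as_mask and the demo image are hypothetical.

```python
import numpy as np
from skimage.feature import peak_local_max

def peaks_as_mask(image, **kwargs):
    """Return peak coordinates and the equivalent boolean mask."""
    coords = peak_local_max(image, **kwargs)   # array of (row, column) pairs
    mask = np.zeros(image.shape, dtype=bool)
    mask[tuple(coords.T)] = True               # mark each peak position
    return coords, mask

# Small made-up image with two isolated peaks.
demo = np.zeros((12, 12))
demo[3, 4] = 2.0
demo[8, 7] = 5.0

coords, mask = peaks_as_mask(demo, min_distance=2)
print(coords.shape, mask.shape, mask.sum())
```

np.argwhere(mask) goes the other way and recovers the coordinates from the mask, although in row-major order rather than sorted by peak value.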