Understand skimage.feature.peak_local_max() with Examples – Python Tutorial

By | November 16, 2022

The skimage.feature.peak_local_max() function finds peaks (local maxima) in an image. In this tutorial, we will use some examples to show you how to use it.

Syntax

skimage.feature.peak_local_max() is defined as:

skimage.feature.peak_local_max(image, min_distance=1, threshold_abs=None, threshold_rel=None, exclude_border=True, indices=True, num_peaks=inf, footprint=None, labels=None, num_peaks_per_label=inf, p_norm=inf)

It finds the peaks in an image and returns them either as a list of coordinates or as a boolean mask.
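To make this concrete, here is a minimal sketch on a small synthetic array. The 9x9 image and its two bright pixels are made up for illustration, and threshold_abs=1 is chosen only so that the flat background is ignored.

```python
import numpy as np
from skimage.feature import peak_local_max

# Hypothetical 9x9 test image: flat background with two isolated bright pixels.
image = np.zeros((9, 9))
image[2, 2] = 5.0
image[6, 5] = 3.0

# threshold_abs=1 discards the flat background. By default the result is an
# array of (row, column) coordinates with one row per peak.
coords = peak_local_max(image, min_distance=1, threshold_abs=1)
print(coords.shape)  # two peaks, located at (2, 2) and (6, 5)
print(coords)
```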

Parameters

Here are some important parameters we should notice:

image: the image data. It should be a numpy ndarray, for example one with shape [60, 300].

min_distance: int, the minimum allowed distance between peaks. To find the maximum number of peaks, use min_distance=1. Larger values, such as 4 or 8, return fewer and more widely spaced peaks.

threshold_abs: float, minimum intensity of peaks. By default, the absolute threshold is the minimum intensity of the image.

threshold_rel: float, minimum intensity of peaks, calculated as max(image) * threshold_rel

indices: boolean, if True, the output will be an array of peak coordinates, sorted by peak value (largest first). If False, the output will be a boolean array with the same shape as the image, with True at each peak. Note that this parameter was deprecated in later scikit-image releases and has since been removed, so check the documentation for your installed version.

It means the output will be:

  • If indices = True : (row, column, …) coordinates of peaks.
  • If indices = False : Boolean array shaped like image, with peaks represented by True values.
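The effect of min_distance and threshold_rel is easiest to see on a toy image. The sketch below uses a made-up 15x15 array with two strong neighbouring peaks and one weak peak. We pass exclude_border=False only to keep border handling out of the comparison.

```python
import numpy as np
from skimage.feature import peak_local_max

# Hypothetical image: two strong peaks two pixels apart and one weak peak.
image = np.zeros((15, 15))
image[5, 5] = 10.0
image[5, 7] = 8.0
image[10, 10] = 0.3

# min_distance=1: every local maximum above the image minimum is returned.
all_peaks = peak_local_max(image, min_distance=1, exclude_border=False)

# min_distance=3: the weaker of the two neighbouring peaks is suppressed.
spaced = peak_local_max(image, min_distance=3, exclude_border=False)

# threshold_rel=0.1: peaks must exceed 0.1 * image.max() = 1.0,
# so the weak peak at (10, 10) is dropped.
strong = peak_local_max(image, min_distance=1, threshold_rel=0.1,
                        exclude_border=False)

print(len(all_peaks), len(spaced), len(strong))
```

A larger min_distance thins out peaks that sit close together, while threshold_rel removes peaks that are weak compared with the brightest pixel.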

You can read about the other parameters here:

https://scikit-image.org/docs/stable/api/skimage.feature.html?highlight=peak_local_max#skimage.feature.peak_local_max

How to use skimage.feature.peak_local_max()?

We will use some examples to show you how to use it.

For example, we can load an audio file and treat its spectrogram as an image.

import numpy as np
import librosa
import librosa.display
from matplotlib import pyplot as plt
from skimage.feature import peak_local_max


wav_file = "query_recordings/pop.00085-snippet-10-20.wav"


def normalise(wave):
    # rescale the waveform linearly to the range [0, 1]
    wave = (wave - np.min(wave)) / (np.max(wave) - np.min(wave))
    return wave


def calculate_stft(file, n_fft=512, plot=True):
    # read wav data, resampled to 8 kHz
    y, sr = librosa.load(file, sr=8000)
    win_length = int(0.025 * sr)  # 25 ms window
    hop_length = int(0.01 * sr)   # 10 ms hop
    print(win_length, hop_length)

    signal = normalise(y)
    # compute the magnitude of the STFT
    D = np.abs(librosa.stft(signal, n_fft=n_fft, window='hann', win_length=win_length, hop_length=hop_length))
    if plot:
        # plot the spectrogram in dB; hop_length is passed so the time axis is scaled correctly
        plt.figure(figsize=(10, 5))
        librosa.display.specshow(librosa.amplitude_to_db(D, ref=np.max), y_axis='linear',
                                 x_axis='time', cmap='gray_r', sr=sr, hop_length=hop_length)
        plt.show()
    print(D.shape)
    return D

The function returns D, the magnitude of the short-time Fourier transform (STFT) of the wav data. We can call it as follows:

D = calculate_stft(wav_file)

Running this code displays the spectrogram:

[Figure: audio data as an image]

Here the shape of D is (257, 1004). We can treat it as an image with width = 1004 and height = 257.
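The two numbers in this shape follow from the STFT settings. As a rough check, the sketch below computes the expected shape, assuming librosa's default centred framing (center=True). The helper stft_shape and the sample count 80240 are hypothetical and used only for illustration.

```python
def stft_shape(n_samples, n_fft=512, hop_length=80):
    """Expected (frequency bins, frames) of an STFT with centred frames."""
    n_bins = 1 + n_fft // 2                # 512-point FFT -> 257 bins
    n_frames = 1 + n_samples // hop_length
    return n_bins, n_frames


# Hypothetical signal length: about 10.03 seconds of audio at 8000 Hz.
print(stft_shape(80240))  # (257, 1004)
```

The height depends only on n_fft, while the width grows with the length of the audio and shrinks as hop_length increases.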

Then, we can use skimage.feature.peak_local_max() to get peaks.

def calculate_constellation_map(D, min_distance=4, threshold_rel=0.05, plot=True):
    # detect peaks on the log-magnitude STFT and plot the constellation map
    coordinates = peak_local_max(np.log(D), min_distance=min_distance, threshold_rel=threshold_rel, indices=False)
    # If indices = False: boolean array shaped like the image, with peaks represented by True values.

    print(coordinates.shape) # (257, 1004)
    print(type(coordinates)) # <class 'numpy.ndarray'>
    print(coordinates)
    if plot:
        # draw the boolean mask; origin='lower' puts low frequencies at the bottom
        plt.figure(figsize=(10, 5))
        plt.imshow(coordinates, cmap=plt.cm.gray_r, origin='lower')
        plt.show()

    return coordinates

coordinates = calculate_constellation_map(D)
print(coordinates)
print(coordinates.shape)

When we run this code with indices=False, coordinates will be:

[[False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]
 ...
 [False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]]
[[False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]
 ...
 [False False False ... False False False]
 [False False False ... False False False]
 [False False False ... False False False]]
(257, 1004)

We can see that the values in coordinates are boolean.
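A boolean mask and a coordinate array carry the same information, so we can convert between them with plain numpy. This is a minimal sketch on a made-up 4x6 mask. It may also be useful if your scikit-image version does not accept the indices argument and only returns coordinates.

```python
import numpy as np

# Hypothetical peak mask with two peaks.
mask = np.zeros((4, 6), dtype=bool)
mask[1, 2] = True
mask[3, 4] = True

# Mask -> coordinates: one (row, column) pair per True element.
coords = np.argwhere(mask)

# Coordinates -> mask: set True at every listed position.
rebuilt = np.zeros_like(mask)
rebuilt[tuple(coords.T)] = True

print(coords)
print(np.array_equal(mask, rebuilt))
```

Note that np.argwhere returns positions in row-major order, not sorted by peak value.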

If indices=True:

    coordinates = peak_local_max(np.log(D), min_distance=min_distance, threshold_rel=threshold_rel, indices=True)
    # If indices = True: array of (row, column) coordinates, one row per peak.

    print(coordinates.shape) # (961, 2)
    print(type(coordinates)) # <class 'numpy.ndarray'>
    print(coordinates)

coordinates will be:

[[114 575]
 [136 571]
 [116 804]
 ...
 [186 521]
 [169 199]
 [112 439]]
[[114 575]
 [136 571]
 [116 804]
 ...
 [186 521]
 [169 199]
 [112 439]]
(961, 2)

The shape of coordinates is (961, 2). Each of the 961 rows holds the (row, column) coordinates of one peak.
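Because the image is a spectrogram, each row index is a frequency bin and each column index is a frame. As a rough sketch, we can convert the first three peaks from the output above into Hz and seconds, assuming the settings used in calculate_stft (sr=8000, n_fft=512, hop_length=80).

```python
import numpy as np

# STFT settings taken from calculate_stft above.
sr, n_fft, hop_length = 8000, 512, 80

# First three peak coordinates from the printed output.
coords = np.array([[114, 575], [136, 571], [116, 804]])

# Row index -> frequency in Hz, column index -> time in seconds.
freqs_hz = coords[:, 0] * sr / n_fft
times_s = coords[:, 1] * hop_length / sr

for f, t in zip(freqs_hz, times_s):
    print(f"{f:.2f} Hz at {t:.2f} s")
```

For the first peak this gives 1781.25 Hz at 5.75 s.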
