Visualizing Clustered and Labeled Data With Different Colors in Matplotlib Scatter Plots

February 24, 2021

In text processing, we often need to visualize labeled or clustered data in a Matplotlib scatter plot. This tutorial uses an example to show how to draw each label in its own color.

Preliminary

First, we will create some labeled data using NumPy.

#-*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt

label_num = 5

data = np.random.rand(30,2)
labels = [str(i) for i in np.array(range(data.shape[0])) % label_num + 1]

This example creates 30 two-dimensional data points and assigns them 5 labels, '1' through '5', in rotation.
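To check what the generated data looks like, the following minimal sketch rebuilds it and prints its shape, the first few labels, and the number of points per label. The name `label_counts` is introduced here for illustration only, and the labels are built with a plain list comprehension that should be equivalent to the NumPy expression above.

```python
from collections import Counter

import numpy as np

label_num = 5

# 30 random 2-D points with coordinates in [0, 1)
data = np.random.rand(30, 2)
# Cycle through the labels '1'..'5' by row index
labels = [str(i % label_num + 1) for i in range(data.shape[0])]

# Illustrative helper variable: how many points carry each label
label_counts = Counter(labels)

print(data.shape)                    # (30, 2)
print(labels[:7])                    # ['1', '2', '3', '4', '5', '1', '2']
print(sorted(label_counts.items()))  # [('1', 6), ('2', 6), ('3', 6), ('4', 6), ('5', 6)]
```

Because the labels follow the row index, every label ends up with exactly 6 of the 30 points.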

Next, we will write a function that displays these points in a scatter plot, using a different color for each label.

Visualize labeled data using different colors in scatter plots

Here is the function:

def plot_with_labels(low_dim_embs, labels):
    # Map each label to a color; unknown labels fall back to red
    colors = {'1': 'r', '2': 'b', '3': 'g', '4': 'purple', '5': 'orange'}
    plt.figure(figsize=(5, 5))  # figure size in inches
    for i, label in enumerate(labels):
        x, y = low_dim_embs[i, :]  # each row is a 2-D point
        plt.scatter(x, y, color=colors.get(label, 'r'))
        # Write the label text next to the point
        plt.annotate(label,
            xy=(x, y),  # the point being annotated
            xytext=(5, 2),  # text offset from the point, in points
            textcoords='offset points',
            ha='right',
            va='bottom')
    plt.show()

To change the color assigned to each label, refer to this tutorial:

List of Matplotlib Common Used Colors – Matplotlib Tutorial
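If the number of labels is not known in advance, hard-coding one color per label becomes tedious. One possible alternative, sketched below, is to take the colors from a qualitative Matplotlib colormap. The helper `build_color_map` and its `cmap_name` parameter are hypothetical names introduced for this example, and the sketch assumes a listed colormap such as `tab10`, where neighboring indices give clearly different colors.

```python
import matplotlib.pyplot as plt


def build_color_map(labels, cmap_name="tab10"):
    """Map each distinct label to a color from a qualitative colormap.

    Hypothetical helper, not part of the tutorial's original code.
    """
    cmap = plt.get_cmap(cmap_name)
    unique_labels = sorted(set(labels))
    # tab10 has 10 entries; wrap around if there are more labels than colors
    return {label: cmap(i % cmap.N) for i, label in enumerate(unique_labels)}


color_map = build_color_map(["1", "2", "3", "4", "5", "1", "2"])
for label, rgba in color_map.items():
    print(label, rgba)  # each color is an (r, g, b, a) tuple
```

The resulting dictionary can be passed wherever a label-to-color lookup is needed, for example as the `color` argument of `plt.scatter`.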

Now we can display the labeled data.

plot_with_labels(data, labels)

Run this code and you will get a scatter plot like the following.

[Figure: Visualizing Clustered and Labeled Data With Different Colors in Matplotlib Scatter Plots]
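The function above calls `plt.scatter` once per point. As a variation, the sketch below draws one scatter call per label, which also makes it easy to add a legend. The helper `plot_by_group`, the `palette` dictionary, and the gray fallback color are assumptions made for this example rather than part of the tutorial's code.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_by_group(points, labels, colors):
    """Draw one scatter call per label so each label gets a legend entry.

    Hypothetical helper; `colors` maps label -> Matplotlib color.
    """
    labels = np.asarray(labels)
    fig, ax = plt.subplots(figsize=(5, 5))
    for label in sorted(set(labels.tolist())):
        mask = labels == label  # boolean mask selecting this label's rows
        ax.scatter(points[mask, 0], points[mask, 1],
                   color=colors.get(label, "gray"), label=label)
    ax.legend(title="label")
    return fig, ax


points = np.random.rand(30, 2)
point_labels = [str(i % 5 + 1) for i in range(points.shape[0])]
# Illustrative palette; any valid Matplotlib color names would work
palette = {"1": "r", "2": "b", "3": "g", "4": "purple", "5": "orange"}

fig, ax = plot_by_group(points, point_labels, palette)
# plt.show()  # uncomment to display the figure
```

Grouping by label keeps the number of draw calls equal to the number of labels, so the legend shows one entry per label instead of one per point.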
