Compute the Cosine Similarity Matrix of Two NumPy Arrays – NumPy Tutorial

By | August 13, 2021

Computing the cosine similarity of two vectors in NumPy is easy. Here is a tutorial:

Best Practice to Calculate Cosine Distance Between Two Vectors in NumPy – NumPy Tutorial

However, if you have two 2-D NumPy arrays, how do you compute their cosine similarity matrix? In this tutorial, we will use an example to show you how.

For example:

import numpy as np

x = np.random.random([4, 7])
y = np.random.random([4, 7])

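Here we have created two NumPy arrays, x and y, each with shape 4 * 7, so each one holds four 7-dimensional row vectors.

Their cosine similarity matrix has shape 4 * 4: the entry at row i, column j is the cosine similarity between row i of x and row j of y. How do we compute it?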
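To make the meaning of each entry concrete, here is a minimal loop-based sketch. The helper name cos_sim_pairwise_loop and the sample arrays a and b are hypothetical and only used for illustration. It is slow, but it shows exactly what the 4 * 4 matrix contains.

```python
import numpy as np

def cos_sim_pairwise_loop(x, y):
    # Hypothetical reference helper: fill the matrix one entry at a time.
    out = np.empty((x.shape[0], y.shape[0]))
    for i in range(x.shape[0]):
        for j in range(y.shape[0]):
            # Cosine similarity = dot product divided by the product of the norms.
            out[i, j] = np.dot(x[i], y[j]) / (
                np.linalg.norm(x[i]) * np.linalg.norm(y[j])
            )
    return out

# Sample data for illustration only (seeded so the run is repeatable).
rng = np.random.default_rng(0)
a = rng.random((4, 7))
b = rng.random((4, 7))

sim = cos_sim_pairwise_loop(a, b)
print(sim.shape)  # (4, 4)
```

A vectorized function should return the same values as this loop, only faster.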

How to compute the cosine similarity matrix of two NumPy arrays?

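We will create a function that scales every row to unit length and then multiplies the two normalized arrays. The dot product of two unit vectors is their cosine similarity.

Here is an example: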

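def cos_sim_2d(x, y):
    # Divide each row by its L2 norm; keepdims=True keeps shape (n, 1) for broadcasting
    norm_x = x / np.linalg.norm(x, axis=1, keepdims=True)
    norm_y = y / np.linalg.norm(y, axis=1, keepdims=True)
    # Entry (i, j) is the dot product of unit row i of x and unit row j of y
    return np.matmul(norm_x, norm_y.T)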
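One caveat: if a row contains only zeros, its norm is 0 and the division produces NaN values. The following is a sketch of one possible guard, not part of the original function. The name cos_sim_2d_safe and the eps parameter are assumptions made for this example.

```python
import numpy as np

def cos_sim_2d_safe(x, y, eps=1e-12):
    # Hypothetical variant: clamp each norm to at least eps (an assumed
    # small constant) so that all-zero rows do not cause division by zero.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_len = np.linalg.norm(x, axis=1, keepdims=True)
    y_len = np.linalg.norm(y, axis=1, keepdims=True)
    unit_x = x / np.maximum(x_len, eps)
    unit_y = y / np.maximum(y_len, eps)
    return np.matmul(unit_x, unit_y.T)

# The first row of p is all zeros.
p = np.array([[0.0, 0.0], [1.0, 0.0]])
q = np.array([[1.0, 0.0], [0.0, 1.0]])
safe = cos_sim_2d_safe(p, q)
print(safe)
```

With this guard, a zero row gets similarity 0 with every other row. Whether 0 is the right value for that case depends on your application.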

We can call it as follows:

print(cos_sim_2d(x, y))

Run this code and we will get a matrix like this one. The exact values will differ on each run because x and y are random:

[[0.76589312 0.86447536 0.76700721 0.77745416]
 [0.88567881 0.87581097 0.69686004 0.78686957]
 [0.7997874  0.96608392 0.71164122 0.79662708]
 [0.78752701 0.78548681 0.74697223 0.89676639]]
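A few quick sanity checks can help confirm the result. The sketch below repeats the function so that it runs on its own. The arrays a and b are hypothetical sample data with different row counts, showing that the two inputs only need the same number of columns.

```python
import numpy as np

def cos_sim_2d(x, y):
    norm_x = x / np.linalg.norm(x, axis=1, keepdims=True)
    norm_y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return np.matmul(norm_x, norm_y.T)

# Sample data for illustration only: 3 rows and 5 rows, both with 7 columns.
rng = np.random.default_rng(0)
a = rng.random((3, 7))
b = rng.random((5, 7))

sim = cos_sim_2d(a, b)
print(sim.shape)  # (3, 5)

# Each row compared with itself should give a similarity of 1.
self_sim = cos_sim_2d(a, a)
print(np.allclose(np.diag(self_sim), 1.0))  # True

# Non-negative inputs can only produce similarities between 0 and 1.
in_range = bool(np.all((sim >= 0.0) & (sim <= 1.0 + 1e-12)))
print(in_range)  # True
```

This also explains why every value in the matrix above is positive: np.random.random only produces non-negative numbers, so no two rows can point in opposite directions.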
