Implement Pearson Correlation Coefficient Loss in TensorFlow – TensorFlow Tutorial

March 3, 2021

The Pearson correlation coefficient measures the strength of the linear relationship between two variables. Here is a tutorial:

A Beginner Guide to Pearson Correlation Coefficient – Machine Learning Tutorial

We can use it as a loss that measures the correlation between two distributions in a deep learning model. In this tutorial, we will create this loss function using TensorFlow.

Preliminary

First, we will create two distributions in TensorFlow.

import numpy as np
import tensorflow as tf
a = np.array([[0.15, 0.16, 0.9], [0.8, 4.15, 0.15]])
b = np.array([[0.7, 0.12, 0.1], [0.15, 0.19, 0.05]])

aa = tf.convert_to_tensor(a, tf.float32)
bb = tf.convert_to_tensor(b, tf.float32)

\(aa\) and \(bb\) are two distributions, each with 2 rows and 3 columns. We will compute their Pearson correlation coefficient loss row by row.

Pearson Correlation Coefficient Loss

Similar to cosine distance loss, the Pearson correlation coefficient loss is defined as:

\(loss = 1 - p\)

where \(p\) is the Pearson correlation coefficient.
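For reference, the standard definition of the coefficient can be written as follows. The symbols \(x\), \(y\), \(n\), \(\bar{x}\) and \(\bar{y}\) are notation introduced here for illustration: \(x\) and \(y\) are two vectors of length \(n\), and \(\bar{x}\), \(\bar{y}\) are their means.

\(p = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}\)

Since \(p\) lies in \([-1, 1]\), the loss lies in \([0, 2]\): it is 0 for a perfect positive correlation and 2 for a perfect negative correlation.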

How to compute Pearson correlation coefficient loss in TensorFlow?

We will create a function to calculate it. Here is an example:

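def pearson_r(y_true, y_pred):
    # Returns the loss 1 - p averaged over rows (uses the TensorFlow 1.x tf.losses API).
    x = y_true
    y = y_pred
    # Center each row by subtracting its mean.
    mx = tf.reduce_mean(x, axis=1, keepdims=True)
    my = tf.reduce_mean(y, axis=1, keepdims=True)
    xm, ym = x - mx, y - my
    # L2-normalize the centered rows so their dot product is the Pearson coefficient.
    t1_norm = tf.nn.l2_normalize(xm, axis=1)
    t2_norm = tf.nn.l2_normalize(ym, axis=1)
    # cosine_distance computes 1 - sum(t1_norm * t2_norm) per row, then reduces to a scalar.
    cosine = tf.losses.cosine_distance(t1_norm, t2_norm, axis=1)
    return cosine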

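In this example, we use the cosine distance loss to compute the Pearson correlation coefficient loss, because the Pearson coefficient of two vectors equals the cosine similarity of their mean-centered versions. For details, see: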

Understand the Relationship Between Pearson Correlation Coefficient and Cosine Similarity – Machine Learning Tutorial
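The equivalence can be checked numerically without TensorFlow. The following is a minimal sketch using only the Python standard library; the helper names pearson_definition and centered_cosine are hypothetical and chosen for illustration. It compares the textbook formula with the center, normalize, dot-product steps used in pearson_r, on the first rows of a and b.

```python
import math

def pearson_definition(x, y):
    # Textbook form: sum of centered products divided by the product of centered norms.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

def centered_cosine(x, y):
    # Mirror the steps of pearson_r: center, L2-normalize, then take the dot product.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xm = [xi - mx for xi in x]
    ym = [yi - my for yi in y]
    nx = math.sqrt(sum(v * v for v in xm))
    ny = math.sqrt(sum(v * v for v in ym))
    return sum((u / nx) * (v / ny) for u, v in zip(xm, ym))

x = [0.15, 0.16, 0.9]  # first row of a
y = [0.7, 0.12, 0.1]   # first row of b
p_def = pearson_definition(x, y)
p_cos = centered_cosine(x, y)
print(p_def, p_cos)  # the two values agree up to floating-point rounding
```

Both functions assume that neither input is constant, since a constant vector has zero norm after centering and the division would fail.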

Then we can compute the Pearson loss between \(aa\) and \(bb\).

a_s = pearson_r(aa, bb)

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()
with tf.Session() as sess:
    sess.run([init, init_local])
    np.set_printoptions(precision=4, suppress=True)

    # Store the result under a new name so the NumPy array a is not overwritten.
    loss_value = sess.run(a_s)

    print('loss=')
    print(loss_value)

Running this code gives the loss:

0.85890067

Evaluate our Pearson correlation coefficient loss function

To make sure our function is correct, we will compare its result with scipy.stats.pearsonr(), applied to each row of the NumPy arrays a and b.

Here is the example code:

from scipy.stats import pearsonr
p1, _ = pearsonr(a[0,:], b[0,:])
p2, _ = pearsonr(a[1,:], b[1,:])
print(p1)
print(p2)
print(p1+p2)

d = 1-(p1+p2)/2
print(d)

Running this code, \(d\) is:

0.8589005906554071

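It is almost the same as the value of \(a_s\) computed in TensorFlow, which means our function is correct. The small difference in the last digits is consistent with TensorFlow using float32 here while scipy uses float64.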
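The comparison averages the row-wise coefficients before subtracting from 1. This matches the TensorFlow value because the mean of \(1 - p\) over rows equals 1 minus the mean of \(p\). A minimal standard-library sketch of that batch reduction follows; the helper name pearson is hypothetical, and the data are the same a and b as above.

```python
import math

def pearson(x, y):
    # Pearson coefficient of two equal-length sequences (assumes neither is constant).
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

a = [[0.15, 0.16, 0.9], [0.8, 4.15, 0.15]]
b = [[0.7, 0.12, 0.1], [0.15, 0.19, 0.05]]

# One coefficient per row, as in the scipy check.
row_p = [pearson(ra, rb) for ra, rb in zip(a, b)]

# Per-row loss 1 - p, then the mean over rows.
loss = sum(1.0 - p for p in row_p) / len(row_p)

print(row_p)
print(loss)
```

The printed loss should agree with \(d\) from the scipy check up to floating-point rounding.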
