Feature Fusion: Pointwise Addition Or Concatenate Vectors?

In deep learning, we often use a vector to express a target feature, however, how to fuse them if we have got some features? In this tutorial, we will discuss this topic.

Feature Fusion

Feature fusion is a kind of algorithm, which can merge some independence features to an unique feature in order to process easily.

Here is an example, we have two feature: audio feature and visual feature, both of them are vectors. In order process them easily, we should fuse them. How to fuse?

How to fuse features?

There are two important methods to fuse features: Add or Concatenate. We will explain them one by one.

Add features

It means we will add two vectors to be one. For example:

\(A + B = C\)

Element-wise Addition Explained – A Beginner Guide – Machine Learning Tutorial

You shoud be sure the shapes of \(A\) and \(B\) are the same, if they are not, you should project them to be the same by matrix multiplication.

An Introduction to Matrix Multiplication – Deep Learning Tutorial

Concatenate features

It means we will concatenate features to a vector. For example:

\(A\) is 1*m, \(B\) is 1*n, concatenate \(A\) and \(B\), we will get a 1*(m+n) vector.

Understand tf.concat(): Concatenates Tensors for Beginners – TensorFlow Tutorial

The relationship of adding features and concatenate features

The method of adding features is disordered, if the order is an important feature, it is not a good choice. For example, we compute window vector \(c\) in word2vec by adding word vectors. However, we can concatenate them to get a better window vector \(c\) to improve the performance of word2vec.

Adding features is a special form of concatenating features, we can use:

\(y = w \otimes x + b\)

to reduce dimensionality and to get the almost same effect of adding features, \(w\) and \(b\) should be training.

Adding features can help to reduce the number of parameters in model, it can speed up the model training.

When can we use add features or concatenate features?

As to us, if these features are computed from the same algorithm or data sources, you can add or average them. Otherwise, you can concatenate them.

However, that is not a fixed rule. We also can add features that are computed from different sources or algorithms. Under that situation, we should use gated method or attention mechanism.

As example above, we also can add audio feature and visual feature. However, we should estimate which feature is more important. We should use gated method to implement it. Here is a tutorial:

Understand Gated End-to-End Memory Networks – Deep Learning Tutorial

Feature Fusion: Pointwise Addition Or Concatenate Vectors? – Deep Learning Tutorial