The multivariable chain rule is a useful tool for analyzing the derivatives of a machine learning model. This tutorial introduces it for machine learning beginners.
Multivariable Chain Rule
Let \(z=f(x,y)\), \(x=g(t)\) and \(y=h(t)\), where \(f\), \(g\) and \(h\) are differentiable functions. Then \(z=f(x,y)=f(g(t),h(t))\) is a function of \(t\). To compute the derivative of \(z\) with respect to \(t\), we can use this formula:
\[\frac{dz}{dt}=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}\]
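You can understand it as follows: a small change in \(t\) changes \(z\) through two routes, once through \(x\) and once through \(y\). Each route contributes the product of the derivatives along it, and the two contributions are added.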
Here is an example:
Let \(z=x^y+x\), where \(x=\sin(t)\) and \(y=e^{5t}\). Find \(\frac{dz}{dt}\) using the chain rule.
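One way to work through this example, as a sketch rather than the only possible route: first take the two partial derivatives of \(z\) and the two ordinary derivatives of \(x\) and \(y\), then combine them. This assumes \(\sin(t)>0\), so that \(\ln(x)\) is defined.
\[\frac{\partial z}{\partial x}=yx^{y-1}+1,\qquad \frac{\partial z}{\partial y}=x^y\ln(x),\qquad \frac{dx}{dt}=\cos(t),\qquad \frac{dy}{dt}=5e^{5t}\]
\[\frac{dz}{dt}=\left(yx^{y-1}+1\right)\cos(t)+5e^{5t}\,x^y\ln(x),\qquad x=\sin(t),\; y=e^{5t}\]
The result can be sanity-checked numerically. The snippet below compares the expression above with a central finite difference; the function names and the test point `t0 = 0.3` are illustrative choices, not part of the original example.

```python
import math


def z_of_t(t):
    """Evaluate z = x^y + x directly as a function of t."""
    x = math.sin(t)
    y = math.exp(5 * t)
    return x ** y + x


def dz_dt_chain_rule(t):
    """Evaluate dz/dt using the multivariable chain rule (requires sin(t) > 0)."""
    x = math.sin(t)
    y = math.exp(5 * t)
    dz_dx = y * x ** (y - 1) + 1      # partial derivative of z with respect to x
    dz_dy = x ** y * math.log(x)      # partial derivative of z with respect to y
    dx_dt = math.cos(t)
    dy_dt = 5 * math.exp(5 * t)
    return dz_dx * dx_dt + dz_dy * dy_dt


def central_difference(func, t, h=1e-6):
    """Approximate the derivative of func at t with a symmetric difference."""
    return (func(t + h) - func(t - h)) / (2 * h)


t0 = 0.3  # arbitrary test point where sin(t) > 0
analytic = dz_dt_chain_rule(t0)
numeric = central_difference(z_of_t, t0)
print(f"chain rule: {analytic:.8f}, finite difference: {numeric:.8f}")
```

The two printed values should agree to several decimal places; a large gap would point to a mistake in one of the four derivatives.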
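Multivariable Chain Rule in a Diagram
We can understand the multivariable chain rule using a dependency diagram. Put \(z\) at the top, \(x\) and \(y\) below it, and \(t\) at the bottom, with an arrow for each dependency. Every path from \(z\) down to \(t\) contributes one term to \(\frac{dz}{dt}\): the product of the derivatives along that path. The terms from all paths are then added.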
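The diagram view can also be expressed in code. The sketch below stores each arrow of the dependency diagram together with its local derivative, then sums the products along every path from \(z\) to \(t\). It reuses \(z=x^y+x\), \(x=\sin(t)\), \(y=e^{5t}\) from the example above; the `edges` dictionary and the `path_sum_derivative` helper are hypothetical names introduced only for illustration.

```python
import math


def path_sum_derivative(edges, source, target):
    """Sum, over every path from source to target, the product of edge derivatives."""
    if source == target:
        return 1.0
    total = 0.0
    for (parent, child), local_derivative in edges.items():
        if parent == source:
            total += local_derivative * path_sum_derivative(edges, child, target)
    return total


t = 0.3  # arbitrary test point where sin(t) > 0
x, y = math.sin(t), math.exp(5 * t)

# Each arrow in the diagram, labeled with its local derivative at this point.
edges = {
    ("z", "x"): y * x ** (y - 1) + 1,   # dz/dx
    ("z", "y"): x ** y * math.log(x),   # dz/dy
    ("x", "t"): math.cos(t),            # dx/dt
    ("y", "t"): 5 * math.exp(5 * t),    # dy/dt
}

dz_dt = path_sum_derivative(edges, "z", "t")
print(f"dz/dt from the two paths z->x->t and z->y->t: {dz_dt:.8f}")
```

The same helper would work unchanged for a diagram with more intermediate variables, since it only follows arrows and multiplies the labels it finds.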
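Vectors in the Multivariable Chain Rule
Rather than thinking of \(x(t)\) and \(y(t)\) as being separate functions, it’s common to package them together into a single, vector-valued function:
\[\vec{v}(t)=\begin{bmatrix}x(t)\\y(t)\end{bmatrix}\]
The multivariable chain rule then becomes a dot product between the gradient of \(f\) and the derivative of \(\vec{v}\):
\[\frac{d}{dt}f(\vec{v}(t))=\nabla f(\vec{v}(t))\cdot\vec{v}'(t)\]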
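In vector form, the derivative of \(f(\vec{v}(t))\) is the dot product of the gradient of \(f\), evaluated at \(\vec{v}(t)\), with the derivative \(\vec{v}'(t)\). The sketch below writes this out in plain Python for the earlier example \(z=x^y+x\), \(x=\sin(t)\), \(y=e^{5t}\). The names `grad_f`, `v_prime` and `dot` are illustrative helpers, and the code assumes \(\sin(t)>0\).

```python
import math


def f(point):
    """Scalar function f(x, y) = x^y + x, taking a 2-component point."""
    x, y = point
    return x ** y + x


def v(t):
    """Vector-valued function v(t) = (x(t), y(t))."""
    return (math.sin(t), math.exp(5 * t))


def grad_f(point):
    """Gradient of f: (df/dx, df/dy). Requires x > 0 because of log(x)."""
    x, y = point
    return (y * x ** (y - 1) + 1, x ** y * math.log(x))


def v_prime(t):
    """Component-wise derivative of v with respect to t."""
    return (math.cos(t), 5 * math.exp(5 * t))


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def chain_rule_derivative(t):
    """d/dt f(v(t)) computed as grad f(v(t)) . v'(t)."""
    return dot(grad_f(v(t)), v_prime(t))


t0 = 0.3  # arbitrary test point where sin(t) > 0
h = 1e-6
finite_difference = (f(v(t0 + h)) - f(v(t0 - h))) / (2 * h)
print(f"gradient form: {chain_rule_derivative(t0):.8f}")
print(f"finite difference: {finite_difference:.8f}")
```

Because `dot` works for any length, the same structure carries over if \(\vec{v}(t)\) has more than two components, which is the usual situation for the parameters of a machine learning model.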