Dear Professor Tao,
In the week 7 notes, on page 19, near the bottom of the page,
D(kh)(x_0) is written as a product of two matrices: one is the column of gradients, and the other is the row (g(x_0), f(x_0)). Shouldn't they be interchanged, so that it is the row multiplied by the column?
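To spell out what I mean, here is a sketch of the product in the order I would expect. I am assuming (this is my reading of the notes, not a quotation) that h = (f, g) maps R^n to R^2, that k(a, b) = ab, and that each gradient is written as a 1 x n row:

```latex
% Assumed setup (my reading, not copied from the notes):
%   h = (f, g) : R^n -> R^2,   k(a, b) = ab,   so (k \circ h)(x) = f(x) g(x).
\[
D(k \circ h)(x_0)
  = Dk(h(x_0)) \, Dh(x_0)
  = \begin{pmatrix} g(x_0) & f(x_0) \end{pmatrix}
    \begin{pmatrix} \nabla f(x_0) \\ \nabla g(x_0) \end{pmatrix}
  = g(x_0) \, \nabla f(x_0) + f(x_0) \, \nabla g(x_0).
\]
% Sizes: (1 x 2)(2 x n) = 1 x n, which matches the derivative of a scalar function on R^n.
```

Under these assumptions, the product in the other order would be (2 x n)(1 x 2), which is not even defined unless n = 1.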
Also, I have a suggestion about the definition of a linear transformation. Instead of treating vectors as rows and then transposing them into columns, why can't we define vectors as columns and then define a linear transformation simply as y_i = A_ki*x_k? As far as I know, this is a common approach, and it makes the proofs about linear transformations a little easier, because we then don't need to deal with matrix transposition.
Mikhail.