Good question. For practical purposes, how should one evaluate the trace and determinant of the Hessian?
• It is not essential to evaluate the eigenvalues of a matrix to get its trace and determinant. The trace is simply the sum of the diagonal elements, so once the matrix has been produced, it can be read off directly. Similarly, the determinant is often produced as a byproduct of inverting the matrix. So eigenvalues can be useful for expository purposes, but they are not necessarily the best route in a practical implementation.
• Notice that the formulae for finding the optimal hyperparameters alpha and beta involve only the trace and not the determinant. They are thus insensitive to the precise values of eigenvalues that are close to zero.
• We know by definition that if we are at a local optimum of M(w) then the curvature grad grad M(w) must be positive definite, i.e. all its eigenvalues must be positive. Otherwise it isn't an optimum. If grad grad M(w) can be decomposed into two terms, alpha C plus beta B, say, then for a general model we do not have any guarantee that both terms