Sunday, July 8, 2018

TensorFlow: first book

Some first impressions after finishing "TensorFlow for Deep Learning" (Ramsundar and Zadeh). 

The book introduces the concept of tensors, primitives and architectures for deep learning, and the basics of regression, various neural networks, hyperparameter optimization, and reinforcement learning. The artwork in the figures is beautiful (something that convinced me to buy the book). The TensorFlow code examples can be downloaded from the book's website, making it easy to follow along with the discussion in the book.

The book falls a bit short on detailed explanation, however. Many times, just when the discussion was about to get interesting, the book referred to other work for the details instead. Several architectures were merely "explained" with a figure, with no accompanying details in the text.

In addition, although I realize how hard it is to avoid errors in a book, the linear regression example has a very unfortunate bug. The TensorFlow code given in the book fits some toy data with the linear regression shown on the left (accompanied by a discussion of how gradient descent algorithms are sometimes trapped in a local minimum). However, with a minor fix that corrects a shape mismatch in the loss function, the much better linear regression shown on the right is computed instead.

Linear Regression (original)
Linear Regression (bug fix)
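
To illustrate how a wrong shape in a loss function can quietly ruin a fit, here is a minimal NumPy sketch. It is not the book's code: I am assuming the mismatch is of the common broadcasting kind, where labels of shape (N,) are subtracted from predictions of shape (N, 1). The toy data (slope 5, intercept 2, N = 100) and all variable names are made up for the illustration.

```python
import numpy as np

# Hypothetical toy data: y = 5x + 2 plus a little noise (assumed values).
rng = np.random.default_rng(0)
N = 100
x = rng.random((N, 1))                                   # inputs, shape (N, 1)
y = 5.0 * x[:, 0] + 2.0 + 0.1 * rng.standard_normal(N)   # labels, shape (N,)

# Predictions from the "true" parameters, shape (N, 1) like the inputs.
w, b = 5.0, 2.0
y_pred = w * x + b

# Mismatched shapes: (N,) minus (N, 1) silently broadcasts to (N, N),
# so every label is compared against every prediction.
bad_residual = y - y_pred
bad_loss = np.sum(bad_residual ** 2)

# Matching shapes: one residual per data point, shape (N,).
good_residual = y - y_pred[:, 0]
good_loss = np.sum(good_residual ** 2)

print(bad_residual.shape, good_residual.shape)
print(bad_loss, good_loss)
```

No error is raised in the mismatched case, which is what makes this kind of bug easy to miss. A loss summed over all N x N pairs is smallest when every prediction equals the mean of the labels, so gradient descent on it is pulled toward a flat line rather than toward the true slope.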

Finally, although I of course understand the generality of gradient descent algorithms, on first reading I was a bit surprised that the TensorFlow code needs 8000 iterations to arrive at an approximation that a simple least-squares regression could have found in no time.
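
For comparison, a least-squares fit of a line is a single call in NumPy. The sketch below again uses made-up toy data (slope 5, intercept 2, N = 100), not the data from the book, and the variable names are my own.

```python
import numpy as np

# Hypothetical toy data: y = 5x + 2 plus a little noise (assumed values).
rng = np.random.default_rng(0)
N = 100
x = rng.random(N)
y = 5.0 * x + 2.0 + 0.1 * rng.standard_normal(N)

# Design matrix with a column of ones so the intercept is fitted too.
A = np.column_stack([x, np.ones(N)])

# Solve min ||A @ [w, b] - y||^2 directly, without any iterations.
coef, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
w, b = coef

print(w, b)
```

The fitted slope and intercept should land close to the values used to generate the data. Of course, this shortcut only exists because the model is linear in its parameters; gradient descent earns its keep on models where no such closed-form solution is available.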

Please let me know your thoughts, and stay tuned for my impressions of the other books.