If the statistics are boring, then you've got the wrong numbers. Edward R. Tufte
An introduction to TensorFlow
Written on July 3, 2018 by Hoang Nguyen
Abstract: TensorFlow has been widely used for many applications in machine learning and deep learning. However, TensorFlow is more than that: it is a general-purpose computing library, and a rich ecosystem for quickly developing models has been built on top of it. In this talk, I will show how statisticians can get the most out of the main features of TensorFlow, such as automatic differentiation, optimization, and Bayesian analysis, through a simple linear regression example.
Introduction
TensorFlow is a machine learning framework from Google. It has been developed by the Google Brain team, was open-sourced in November 2015, and reached version 1.0 in February 2017. It is now used for many applications in machine learning and deep learning. It has APIs for Python, R, and C.
TensorFlow is not only used for deep learning. As statisticians, we can take advantage of many of its features.
TensorFlow = general purpose computing library.
TensorFlow in R = Interface to TensorFlow library.
Computations are expressed as a graph: input data (tensors, i.e. generalized matrices or multidimensional arrays) flow through nodes (mathematical operators) to the output data.
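To make the data-flow idea concrete, here is a minimal sketch in plain Python. It is not TensorFlow itself: the names Placeholder, Constant, Op and run are hypothetical and only mimic the behaviour described above, where the graph is declared first and nothing is computed until data are fed in.

```python
class Placeholder:
    """A named slot whose value is supplied only when the graph is run."""
    def __init__(self, name):
        self.name = name


class Constant:
    """A node holding a fixed value."""
    def __init__(self, value):
        self.value = value


class Op:
    """A node applying a function (the operator) to its input nodes."""
    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs


def run(node, feed_dict):
    """Evaluate a node by recursively evaluating the nodes it depends on."""
    if isinstance(node, Placeholder):
        return feed_dict[node]
    if isinstance(node, Constant):
        return node.value
    args = [run(inp, feed_dict) for inp in node.inputs]
    return node.fn(*args)


# Declare the graph y = A * x + b; no arithmetic happens here.
x = Placeholder("x")
A = Constant(2.0)
b = Constant(1.0)
y = Op(lambda u, v: u + v, Op(lambda u, v: u * v, A, x), b)

# Data flow through the graph only when it is run with a feed dictionary.
print(run(y, {x: 3.0}))
```

The same graph can be run again with a different feed dictionary, which is the role placeholders play later in the regression example.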
TensorFlow features:
Reverse-mode automatic differentiation.
Multicore CPU and GPU support.
Official Python API and C API, third-party packages for Julia, R.
An ecosystem with a number of machine learning packages, such as tfestimators and keras.
Graphical probabilistic modelling with TensorFlow Probability.
Monitoring and metrics with TensorBoard.
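Reverse-mode automatic differentiation, the first feature in the list, can be illustrated with a small sketch in plain Python. This is a toy under stated assumptions, not how TensorFlow is implemented: the Var class and the backward function are hypothetical names, and only addition and multiplication are supported.

```python
class Var:
    """A value that remembers which values it was computed from."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Propagate derivatives from the output back to every input."""
    order, seen = [], set()

    def visit(node):
        # Depth-first traversal giving a topological order of the graph.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk the graph in reverse, applying the chain rule at each node.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


A, x, b = Var(2.0), Var(3.0), Var(1.0)
y = A * x + b
backward(y)
print(y.value, A.grad, x.grad, b.grad)
```

A single backward pass yields the derivative of the output with respect to every input, which is why the reverse mode suits loss functions with many parameters.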
Install TensorFlow in R
We summarize the main steps for installing the TensorFlow package in R.
For the full instructions, please go to:
a. Download and install Python (choose the options to add Python to the PATH and to install pip3).
b. Open cmd with administrator privileges and execute,
Ubuntu
Install python, pip3 and TensorFlow,
macOS
Check pip3 version:
If pip or pip3 version 8.1 or later is not installed, issue the following commands to install or upgrade:
Once TensorFlow is installed, go to RStudio and install the R API package.
Install R package TensorFlow
Hello TensorFlow
Test your installation with this chunk of code
If everything works, we are ready to go.
TensorFlow API from R
We start with how to declare variables, constants and placeholders in TensorFlow.
We assign an object (sess) pointing to tf$Session()
and close the session with sess$close(). Here tf is the top-level API object, which provides access to the TensorFlow modules.
There are several ways to evaluate a TensorFlow variable.
A temporary tf$Session(),
tf$Session()$run() within a tf$Session(),
object_name$eval() within a tf$InteractiveSession(),
Linear regression
Gradient descent algorithm
We analyze an example of simple linear regression to see how to use TensorFlow to optimize over a loss function.
Then we use TensorBoard to monitor the loss function in each iteration.
For a simple linear regression, we fit a linear function,
\[y = A x + b + \epsilon\]
such that it minimizes the distance between the predicted values ($\hat{y}_i$) and the observed values ($y_i$) in terms of mean squared error (MSE).
To illustrate how to solve this optimization problem, we use the iris data (collected by Edgar Anderson and made famous by Ronald Fisher's well-known 1936 paper).
We want to fit a linear model relating Petal.Width to Petal.Length.
We first create placeholders (x_data, y_data) for (Petal.Length, Petal.Width),
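Written out for $N$ observations, with $\hat{y}_i = A x_i + b$ (the symbol $N$ is notation introduced here for illustration), the criterion is the usual mean squared error:

\[MSE(A, b) = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - A x_i - b)^2\]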
Then, we derive the prediction $\hat{y} = A x + b$.
Secondly, we define a loss function (MSE) and an optimizer from the tf$train submodule, tf$train$GradientDescentOptimizer,
with a learning rate $\gamma = 0.03$. Other optimizers, such as AdagradOptimizer, MomentumOptimizer, and RMSPropOptimizer, are available and can be chosen depending on the problem of interest. In each iteration, the GradientDescentOptimizer updates the parameters ($A$ is shown here; $b$ is updated in the same way) by,
\[A_{n+1} = A_{n} - \gamma \nabla MSE(A_n)\]
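As a worked step, assuming the MSE is the average of the squared residuals over $N$ observations ($N$ is notation introduced here), the partial derivatives that the optimizer obtains by automatic differentiation are

\[\frac{\partial MSE}{\partial A} = -\frac{2}{N} \sum_{i=1}^{N} x_i (y_i - A x_i - b), \qquad \frac{\partial MSE}{\partial b} = -\frac{2}{N} \sum_{i=1}^{N} (y_i - A x_i - b)\]

so that one iteration for both parameters reads

\[A_{n+1} = A_{n} - \gamma \frac{\partial MSE}{\partial A}(A_n, b_n), \qquad b_{n+1} = b_{n} - \gamma \frac{\partial MSE}{\partial b}(A_n, b_n)\]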
Finally, we feed the data to the placeholders using feed_dict and update the parameters along the negative gradient a few thousand times.
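The training loop can be sketched in plain Python without TensorFlow. This is an illustration only: the gradients are written by hand instead of being derived automatically, the toy data are hypothetical stand-ins for (Petal.Length, Petal.Width), and the function names mse, mse_gradient and fit are invented for this example. Only the learning rate 0.03 is taken from the text.

```python
# Hypothetical toy data, generated exactly from y = 0.4 * x - 0.3,
# so the parameters the loop should recover are known in advance.
x_data = [1.0, 2.0, 3.0, 4.0, 5.0]
y_data = [0.4 * x - 0.3 for x in x_data]


def mse(A, b):
    """Mean squared error of the line y = A * x + b on the toy data."""
    n = len(x_data)
    return sum((y - (A * x + b)) ** 2 for x, y in zip(x_data, y_data)) / n


def mse_gradient(A, b):
    """Hand-derived partial derivatives of the MSE with respect to A and b."""
    n = len(x_data)
    residuals = [y - (A * x + b) for x, y in zip(x_data, y_data)]
    grad_A = -2.0 / n * sum(r * x for r, x in zip(residuals, x_data))
    grad_b = -2.0 / n * sum(residuals)
    return grad_A, grad_b


def fit(learning_rate=0.03, num_iterations=3000):
    """Plain gradient descent starting from A = 0, b = 0."""
    A, b = 0.0, 0.0
    for _ in range(num_iterations):
        grad_A, grad_b = mse_gradient(A, b)
        A -= learning_rate * grad_A
        b -= learning_rate * grad_b
    return A, b


A_hat, b_hat = fit()
print(A_hat, b_hat, mse(A_hat, b_hat))
```

On data of a different scale the same learning rate may diverge or converge slowly, which is one reason the other optimizers mentioned above exist.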
Monitoring with TensorBoard
TensorBoard is a metrics module that helps to monitor the learning process. For a complex model, TensorBoard helps not only to visualize but also to debug and optimize the objective function. Most of the code in this section is inherited from the previous section, with a few extra lines that add variables to our watch list.
Here are a few things that we can summarize from TensorBoard. The algorithm reaches convergence after 1000 iterations. In the graph structure, each node represents an operator, and along the edges we can see the flow of the data, which can be a scalar, as for $A$ and $b$, or a vector, as for $x$ and $y$.
Maximum likelihood with TensorFlow
TensorFlow contains a large collection of probability distributions. tf$contrib$distributions provides common distributions such as Bernoulli, Binomial, Uniform, Normal, Student-t, … The interesting feature of these functions is that they support automatic differentiation (TensorFlow uses reverse-mode automatic differentiation). Thus, we just need to specify the likelihood function of the model and let TensorFlow take care of its derivatives.
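To make the link with the regression example concrete, here is the log-likelihood under the assumption that the errors are independent with $\epsilon_i \sim N(0, \sigma^2)$ (this distributional assumption and the symbols $\sigma$ and $N$ are not stated in the text and are introduced here for illustration):

\[\ell(A, b, \sigma) = \sum_{i=1}^{N} \log p(y_i \mid x_i) = -\frac{N}{2} \log(2 \pi \sigma^2) - \frac{1}{2 \sigma^2} \sum_{i=1}^{N} (y_i - A x_i - b)^2\]

For a fixed $\sigma$, maximizing $\ell$ over $A$ and $b$ is the same as minimizing the MSE, so under this assumption the maximum likelihood estimates of $A$ and $b$ should agree with the gradient descent fit.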
In general, we have the following workflow,
Define the graph (variables, placeholders for data).
Define the flow of the graph and the operations on it.
Calculate the loss function and choose the optimizer engine.
Execute the graph.
Bayesian with TensorFlow_Probability
TensorFlow_Probability contains recent Bayesian inference algorithms used in machine learning and deep learning. TensorFlow_Probability makes probabilistic reasoning and statistical analysis easier.
The TensorFlow package in R does not provide an API to TensorFlow_Probability yet, so we run Python code through the reticulate package, which connects R and Python.
In this section, we will work with a graphical probabilistic model using tfp$edward2 and make inference with Hamiltonian Monte Carlo (tfp.mcmc.HamiltonianMonteCarlo). More examples can be found at Github/tfp.
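For readers new to Hamiltonian Monte Carlo, the following plain-Python sketch shows the idea on a one-dimensional standard normal target. It does not use tfp.mcmc.HamiltonianMonteCarlo: the target, the function names (log_prob, grad_log_prob, hmc_step, run_hmc), the step size and the number of leapfrog steps are all assumptions chosen for illustration.

```python
import math
import random


def log_prob(q):
    """Log density of the target, up to a constant (standard normal)."""
    return -0.5 * q * q


def grad_log_prob(q):
    """Derivative of log_prob; an autodiff library would supply this."""
    return -q


def hmc_step(q, step_size, num_leapfrog_steps, rng):
    """One HMC transition: draw a momentum, simulate, accept or reject."""
    p = rng.gauss(0.0, 1.0)
    q_new, p_new = q, p
    # Leapfrog integration of the Hamiltonian dynamics.
    p_new += 0.5 * step_size * grad_log_prob(q_new)
    for i in range(num_leapfrog_steps):
        q_new += step_size * p_new
        if i < num_leapfrog_steps - 1:
            p_new += step_size * grad_log_prob(q_new)
    p_new += 0.5 * step_size * grad_log_prob(q_new)
    # Metropolis correction based on the change in total energy.
    h_old = -log_prob(q) + 0.5 * p * p
    h_new = -log_prob(q_new) + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return q_new
    return q


def run_hmc(num_samples=4000, step_size=0.2, num_leapfrog_steps=8, seed=0):
    """Run the chain from q = 0 and return all sampled positions."""
    rng = random.Random(seed)
    q = 0.0
    samples = []
    for _ in range(num_samples):
        q = hmc_step(q, step_size, num_leapfrog_steps, rng)
        samples.append(q)
    return samples


samples = run_hmc()
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

The sample mean and variance should be close to 0 and 1. In a real model, the gradient of the log posterior comes from automatic differentiation rather than being written by hand.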