This tutorial explains how to install the GPU version of TensorFlow 1.7.0, together with CUDA Toolkit 9.1 and cuDNN 7.1.2. At the time of writing, 1.7.0 is the latest version of TensorFlow. This tutorial builds TensorFlow from source. If you want to use the official pre-built pip package instead, I recommend another post, How to install Tensorflow 1.7.0 using official pip package.
TensorFlow is an open source software library developed and used by Google. It is widely used by students, researchers, and developers for deep learning applications such as neural networks. It is available in both CPU and GPU versions, and although the CPU version works quite well, realistically you will need a GPU for deep learning. To use the GPU version of TensorFlow, you need an NVIDIA GPU with a compute capability of 3.0 or higher. While it is technically possible to install the TensorFlow GPU version in a virtual machine, you cannot access the full power of your GPU from a virtual machine. So, if you don’t already have Ubuntu, I recommend doing a fresh install of Ubuntu before starting the tutorial.
You must have 64-bit Python installed; TensorFlow does not work on a 32-bit Python installation.
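One way to confirm that an interpreter is 64-bit is to ask Python for its pointer size. The sketch below is only an illustration: the helper name python_bits is hypothetical, and it assumes the interpreter you plan to build against is available as python3 on your PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any tool): prints the pointer width,
# in bits, of the Python interpreter given as the first argument.
python_bits() {
    "$1" -c 'import struct; print(struct.calcsize("P") * 8)'
}

# Assumption: the interpreter is called python3; adjust the name if yours differs.
if command -v python3 >/dev/null 2>&1; then
    echo "python3 is a $(python_bits python3)-bit build"
else
    echo "python3 not found on PATH"
fi
```

A 64-bit interpreter reports 64; anything else means a different Python installation is needed before continuing.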
Step 1: Update and Upgrade your system:
sudo apt-get update
sudo apt-get upgrade
Step 2: Verify You Have a CUDA-Capable GPU:
lspci | grep -i nvidia
Note the GPU model, e.g. GeForce 840M.
If you do not see any settings, update the PCI hardware database that Linux maintains by entering update-pciids (generally found in /sbin) at the command line and rerun the previous lspci command.
If your graphics card is from NVIDIA, go to http://developer.nvidia.com/cuda-gpus and verify that it is listed in the CUDA-enabled GPU list.
Note down its Compute Capability, e.g. GeForce 840M: 5.0.
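Once you have the Compute Capability written down, comparing it against the minimum can be scripted. This is a minimal sketch, not an official check: the helper name cc_at_least is hypothetical, the 3.0 threshold is taken from the requirement mentioned earlier, and treating the threshold as inclusive is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when compute capability $1 is numerically
# greater than or equal to the minimum $2 (both given as decimals like 5.0).
cc_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN { exit !(have + 0 >= need + 0) }'
}

# Example with the GeForce 840M value from the step above.
if cc_at_least "5.0" "3.0"; then
    echo "compute capability 5.0 meets the 3.0 minimum"
else
    echo "compute capability 5.0 is below the 3.0 minimum"
fi
```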
Step 3: Verify You Have a Supported Version of Linux:
To determine which distribution and release number you’re running, type the following at the command line:
uname -m && cat /etc/*release
The x86_64 line indicates you are running on a 64-bit system, which is supported by CUDA 9.1.
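The same check can be written as a small script. This is a sketch under assumptions: the helper name is_supported_arch is hypothetical, and it relies on /etc/os-release being present, which is the case on recent Ubuntu releases but not guaranteed everywhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for the 64-bit x86 architecture string.
is_supported_arch() {
    [ "$1" = "x86_64" ]
}

arch="$(uname -m)"
if is_supported_arch "$arch"; then
    echo "$arch: 64-bit x86, OK"
else
    echo "$arch: not x86_64"
fi

# /etc/os-release is a shell-sourceable file; read it in a subshell so its
# variables do not leak into the current shell.
if [ -r /etc/os-release ]; then
    ( . /etc/os-release; echo "distribution: ${ID:-unknown} ${VERSION_ID:-unknown}" )
fi
```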
Step 4: Install Dependencies:
Required to compile from source:
sudo apt-get install build-essential
sudo apt-get install cmake git unzip zip
sudo apt-get install python2.7-dev python3.5-dev python3.6-dev pylint
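After installing, you may want to confirm that the build tools are actually on your PATH. The following is a rough sketch: the helper name missing_tools is hypothetical, and the list of commands (gcc, g++ and make from build-essential, plus cmake, git, unzip and zip) is my reading of what the packages above provide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints every command name from its arguments that
# cannot be found on PATH, one per line. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing="$(missing_tools gcc g++ make cmake git unzip zip)"
if [ -z "$missing" ]; then
    echo "all build tools found"
else
    echo "missing tools:"
    echo "$missing"
fi
```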
Step 5: Install Linux kernel headers:
Open a terminal and type:
uname -r
You will get something like “4.10.0-42-generic”. Note down the Linux kernel version.
To install the Linux headers that match your kernel, run the following:
sudo apt-get install linux-headers-$(uname -r)
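To confirm the headers package for the running kernel is installed, you can query dpkg. This is a minimal sketch: the helper name headers_pkg_name is hypothetical, and the check assumes a Debian-based system where dpkg-query is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the headers package name for a kernel release.
headers_pkg_name() {
    printf 'linux-headers-%s\n' "$1"
}

pkg="$(headers_pkg_name "$(uname -r)")"

# dpkg-query prints the package status; "install ok installed" means present.
if command -v dpkg-query >/dev/null 2>&1 &&
   dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then
    echo "$pkg is installed"
else
    echo "$pkg is not installed (or dpkg-query is unavailable)"
fi
```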
Step 6: Download the NVIDIA CUDA Toolkit:
Go to https://developer.nvidia.com/cuda-downloads and download the installer for Linux Ubuntu 16.04 x86_64 deb[network]. I highly recommend the network installer, so that you get an up-to-date GPU driver supported by your Linux kernel.
For direct download:
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
Installation Instructions:
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo dpkg -i cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda-9.1
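After the packages are installed, the CUDA tools usually need to be added to your environment before they can be found. The sketch below assumes the toolkit landed in /usr/local/cuda-9.1, which is the usual location for these packages but should be verified on your machine; the variable name CUDA_HOME_GUESS is hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: the packages installed the toolkit under /usr/local/cuda-9.1.
CUDA_HOME_GUESS=/usr/local/cuda-9.1

# Prepend the CUDA directories, keeping any existing entries.
export PATH="${CUDA_HOME_GUESS}/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="${CUDA_HOME_GUESS}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Report whether the compiler and the driver utility can be found.
for tool in nvcc nvidia-smi; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool found at $(command -v "$tool")"
    else
        echo "$tool not found on PATH"
    fi
done
```

To make the settings permanent, the two export lines can be added to ~/.bashrc. A reboot is commonly needed before a newly installed driver is loaded, so nvidia-smi may report errors until then.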
Thanks for sharing this post, it is a very helpful article.
What’s the version of gcc?
This tutorial was tested with gcc 5.4 on Ubuntu 16.04 and with gcc 7.3 on Ubuntu 18.04.
If I enter nvidia-smi,
I’m getting a not found error.
Kernel version 4.15.0-23-generic
Thank you sir! Impressive clean tutorial!
All the best to you!
regards,
Arman
Do this!!! if your bazel build is failing.. thank me later..
bazel build --config=opt --config=cuda --config=monolithic //tensorflow/tools/pip_package:build_pip_package
Thanks!
Got Tensorflow 1.8 with Cuda 9.2 to work with this flag.
Hi all,
Great article as usual.
However, I get an error when running: “bazel build --config=opt --config=cuda --incompatible_load_argument_is_label=false //tensorflow/tools/pip_package:build_pip_package”
What should I do?
Sorry, this is the error I get from the command above: “The ‘build’ command is only supported from within a workspace.”
Hi again, I finally got past the last step; I had to run the command from inside the tensorflow directory.
But now I get:
“~/tensorflow-1.7.0$ bazel build --config=opt --config=cuda --incompatible_load_argument_is_label=false //tensorflow/tools/pip_package:build_pip_package
WARNING: /home/mxn/.cache/bazel/_bazel_mxn/7ba1c52055aebe9313fd1fcde736a239/external/protobuf_archive/WORKSPACE:1: Workspace name in /home/mxn/.cache/bazel/_bazel_mxn/7ba1c52055aebe9313fd1fcde736a239/external/protobuf_archive/WORKSPACE (@com_google_protobuf) does not match the name given in the repository’s definition (@protobuf_archive); this will cause a build error in future versions
ERROR: /home/mxn/tensorflow-1.7.0/util/python/BUILD:5:1: no such package ‘@local_config_python//’: Traceback (most recent call last):
File “/home/mxn/tensorflow-1.7.0/third_party/py/python_configure.bzl”, line 291
_create_local_python_repository(repository_ctx)
File “/home/mxn/tensorflow-1.7.0/third_party/py/python_configure.bzl”, line 253, in _create_local_python_repository
_check_python_lib(repository_ctx, python_lib)
File “/home/mxn/tensorflow-1.7.0/third_party/py/python_configure.bzl”, line 196, in _check_python_lib
_fail((“Invalid python library path: %…))
File “/home/mxn/tensorflow-1.7.0/third_party/py/python_configure.bzl”, line 27, in _fail
fail((“%sPython Configuration Error:%…)))
Python Configuration Error: Invalid python library path: /usr/bin/python3
and referenced by ‘//util/python:python_headers’
ERROR: Analysis of target ‘//tensorflow/tools/pip_package:build_pip_package’ failed; build aborted: Loading failed
INFO: Elapsed time: 3.854s
FAILED: Build did NOT complete successfully (0 packages loaded)
currently loading: tensorflow/core … (13 packages)
Fetching https://mirror.bazel.build/…e-amalgamation-3200000.zip; 27,447b
Fetching https://mirror.bazel.build/…/archive/4.4.0.tar.gz; 22,629b
Fetching https://mirror.bazel.build/…4aac68bc8559736e53f.tar.gz; 25,358b
Fetching https://mirror.bazel.build/…/get/2355b229ea4c.tar.gz; 26,105b
”
How can I fix this problem?
Thanks ! 🙂
Hello,
I used the GPU build command as well as the CUDA config:
```
bazel build --config=opt --config=cuda --incompatible_load_argument_is_label=false //tensorflow/tools/pip_package:build_pip_package
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
```
However, when I try to see if my GPU is used in TensorFlow, I see the following:
>>> print(device_lib.list_local_devices())
2018-04-28 17:46:05.352118: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 11606757544495911915
]
Can you please help me here? Thank you very much!
It seems like a previous TensorFlow installation is still present. Create a new virtual env and install the built wheel there. Are you able to run nvidia-smi?
Hi Arun,
Thanks for the response.
Yes, I can run nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.30                 Driver Version: 390.30                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0  On |                  N/A |
|  0%   49C    P8    12W / 215W |   1042MiB /  8116MiB |     12%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1305      G   /usr/lib/xorg/Xorg                           579MiB |
|    0      2288      G   …izichen/pycharm-2018.1.1/jre64/bin/java      17MiB |
|    0      2881      G   compiz                                       186MiB |
|    0      9013      G   …-token=B651794E15148804B3E0F54CE3A8E6FE      24MiB |
|    0     21627      G   …opt/mendeleydesktop/bin/mendeleydesktop      61MiB |
|    0     32176      G   …-token=2850830C0C1E4A8985F4B065ED057328     149MiB |
+-----------------------------------------------------------------------------+
When you say install the built wheel in a new virtual env, do you mean I simply do this command? pip install tensorflow-1.7.0-cp36-cp36m-linux_x86_64.whl
I did, and the log says requirement already satisfied:
(udacity) [email protected]:~/tensorflow-1.7.0/tensorflow_pkg$ pip install tensorflow-1.7.0-cp36-cp36m-linux_x86_64.whl
Requirement already satisfied: tensorflow==1.7.0 from file:///home/LinuxUser/tensorflow-1.7.0/tensorflow_pkg/tensorflow-1.7.0-cp36-cp36m-linux_x86_64.whl in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (1.7.0)
Requirement already satisfied: tensorboard=1.7.0 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (1.7.0)
Requirement already satisfied: six>=1.10.0 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (1.11.0)
Requirement already satisfied: termcolor>=1.1.0 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (1.1.0)
Requirement already satisfied: absl-py>=0.1.6 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (0.2.0)
Requirement already satisfied: astor>=0.6.0 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (0.6.2)
Requirement already satisfied: wheel>=0.26 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (0.31.0)
Requirement already satisfied: grpcio>=1.8.6 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (1.11.0)
Requirement already satisfied: protobuf>=3.4.0 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (3.5.2.post1)
Requirement already satisfied: gast>=0.2.0 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (0.2.0)
Requirement already satisfied: numpy>=1.13.3 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorflow==1.7.0) (1.14.2)
Requirement already satisfied: html5lib==0.9999999 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorboard=1.7.0->tensorflow==1.7.0) (0.9999999)
Requirement already satisfied: bleach==1.5.0 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorboard=1.7.0->tensorflow==1.7.0) (1.5.0)
Requirement already satisfied: werkzeug>=0.11.10 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorboard=1.7.0->tensorflow==1.7.0) (0.14.1)
Requirement already satisfied: markdown>=2.6.8 in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from tensorboard=1.7.0->tensorflow==1.7.0) (2.6.11)
Requirement already satisfied: setuptools in /home/LinuxUser/anaconda3/lib/python3.6/site-packages (from protobuf>=3.4.0->tensorflow==1.7.0) (39.0.1)
However, when I import tensorflow, there is still no sign of running GPU…
Use
pip install --upgrade --force-reinstall tensorflow-1.7.0-cp36-cp36m-linux_x86_64.whl
to install to /home/LinuxUser/anaconda3/lib/python3.6/site-packages,
or create a new conda env using
conda create -n tf170gpu python=3.6
and activate it using
source activate tf170gpu
then install tensorflow using
pip install tensorflow-1.7.0-cp36-cp36m-linux_x86_64.whl
Use
source deactivate
to deactivate the environment.
Cool! Thank you so much Arun! Creating a new environment resolves the problem! (Although I don’t know why…) Thanks a lot!
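For anyone who lands here with the same CPU-only listing, the check discussed in this thread can be scripted. This is only a sketch: the helper name has_gpu_device is hypothetical, it assumes the device listing prints a device_type: "GPU" entry when a GPU is visible, and the live check runs only if the active python can import tensorflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a device listing on stdin and succeeds if it
# contains a GPU entry.
has_gpu_device() {
    grep -q 'device_type: "GPU"'
}

# Sample input: the CPU-only listing reported earlier in the thread.
sample='name: "/device:CPU:0"
device_type: "CPU"'
if printf '%s\n' "$sample" | has_gpu_device; then
    echo "sample: GPU listed"
else
    echo "sample: no GPU device listed"
fi

# Live check against whichever python is active in the current environment.
if python -c 'import tensorflow' >/dev/null 2>&1; then
    if python -c 'from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())' 2>/dev/null | has_gpu_device; then
        echo "active environment: GPU listed"
    else
        echo "active environment: no GPU device listed"
    fi
fi
```

Running it inside and outside the new environment shows which installation each python picks up.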