Step 9: Install cuDNN 7.3.1:
NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks.
Go to https://developer.nvidia.com/cudnn to download cuDNN. A login and acceptance of the license agreement are required.
After logging in and accepting the agreement, download the following:
cuDNN v7.3.1 Library for Linux [ cuda 10.0]
Go to the download folder and run the following in a terminal:
tar -xf cudnn-10.0-linux-x64-v7.3.1.20.tgz
sudo cp -R cuda/include/* /usr/local/cuda-10.0/include
sudo cp -R cuda/lib64/* /usr/local/cuda-10.0/lib64
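One way to confirm that the headers were copied is to read the version macros from cudnn.h. The following is a minimal Python sketch, assuming the header ended up at /usr/local/cuda-10.0/include/cudnn.h and defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as cuDNN 7 headers do. The helper name parse_cudnn_version is made up for illustration.

```python
import re
from pathlib import Path

# Assumed install location, matching the copy commands above.
CUDNN_HEADER = Path("/usr/local/cuda-10.0/include/cudnn.h")


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) read from the text of cudnn.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + name + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None  # macro missing: not a cuDNN header we understand
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if CUDNN_HEADER.exists():
        text = CUDNN_HEADER.read_text(errors="ignore")
        print("cuDNN version:", parse_cudnn_version(text))
    else:
        print("cudnn.h not found at", CUDNN_HEADER)
```

If the printed version is not the one you downloaded, an older header is probably still in place.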
Step 10: Install NCCL 2.3.5:
NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node collective communication primitives that are performance optimized for NVIDIA GPUs.
Go to https://developer.nvidia.com/nccl/nccl-download and complete the survey to download NVIDIA NCCL.
After completing the survey, download the following:
NCCL v2.3.5, for CUDA 10.0 -> NCCL 2.3.5 O/S agnostic and CUDA 10.0
Go to the download folder and run the following in a terminal:
tar -xf nccl_2.3.5-2+cuda10.0_x86_64.txz
cd nccl_2.3.5-2+cuda10.0_x86_64
sudo cp -R * /usr/local/cuda-10.0/targets/x86_64-linux/
sudo ldconfig
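To check that the copy step put NCCL where the build will look for it, a small Python sketch like the one below can help. It assumes the archive unpacks into include/ and lib/ subdirectories containing nccl.h and libnccl.so; that layout is an assumption about the download, not something stated above. The function name missing_nccl_files is hypothetical.

```python
from pathlib import Path

# Relative paths the copy step is expected to create under the target
# directory. This layout is an assumption about the NCCL archive.
EXPECTED_FILES = ("include/nccl.h", "lib/libnccl.so")


def missing_nccl_files(target_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under target_dir."""
    root = Path(target_dir)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    target = "/usr/local/cuda-10.0/targets/x86_64-linux"
    missing = missing_nccl_files(target)
    if missing:
        print("Missing under %s: %s" % (target, ", ".join(missing)))
    else:
        print("All expected NCCL files found under", target)
```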
Step 11: Install Dependencies
Use the following if you are not in an active virtual environment.
pip install -U --user pip six numpy wheel mock
pip3 install -U --user pip six numpy wheel mock
pip install -U --user keras_applications==1.0.5 --no-deps
pip3 install -U --user keras_applications==1.0.5 --no-deps
pip install -U --user keras_preprocessing==1.0.3 --no-deps
pip3 install -U --user keras_preprocessing==1.0.3 --no-deps
Use the following if you are in an active virtual environment.
pip install -U pip six numpy wheel mock
pip install -U keras_applications==1.0.5 --no-deps
pip install -U keras_preprocessing==1.0.3 --no-deps
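Because the two Keras packages are pinned to exact versions, it can be worth confirming what actually got installed. The sketch below compares the pins against the installed versions. It assumes a Python 3.8+ interpreter (importlib.metadata is not available in older versions) and assumes the package names are resolved as written; check_pins and installed_version are illustrative names.

```python
from importlib import metadata  # requires Python 3.8+ (assumption)

# Pins taken from the pip commands above.
PINS = {"keras_applications": "1.0.5", "keras_preprocessing": "1.0.3"}


def installed_version(name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    problems = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    problems = check_pins(PINS)
    if problems:
        for name, (wanted, found) in problems.items():
            print("%s: wanted %s, found %s" % (name, wanted, found))
    else:
        print("All pinned packages match")
```

Run it with the same interpreter that pip installed into, otherwise it reports on a different environment.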
Step 12: Configure Tensorflow from source:
Download bazel:
cd ~/
wget https://github.com/bazelbuild/bazel/releases/download/0.17.2/bazel-0.17.2-installer-linux-x86_64.sh
chmod +x bazel-0.17.2-installer-linux-x86_64.sh
./bazel-0.17.2-installer-linux-x86_64.sh --user
echo 'export PATH="$PATH:$HOME/bin"' >> ~/.bashrc
Reload environment variables
source ~/.bashrc
sudo ldconfig
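Since the build depends on this specific bazel release, a quick check that the shell now picks up 0.17.2 can save a failed build later. This sketch runs bazel version and reads the "Build label" line from its output. The function names are illustrative, and it assumes bazel is reachable on PATH after reloading the environment.

```python
import re
import subprocess


def parse_bazel_version(output):
    """Extract the version from the 'Build label:' line of `bazel version`."""
    match = re.search(r"^Build label:\s*(\S+)", output, re.MULTILINE)
    return match.group(1) if match else None


def bazel_version(executable="bazel"):
    """Run `bazel version` and return the reported version, or None."""
    try:
        result = subprocess.run(
            [executable, "version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        return None  # bazel is not installed or not on PATH
    return parse_bazel_version(result.stdout)


if __name__ == "__main__":
    found = bazel_version()
    print("bazel version:", found)
    if found != "0.17.2":
        print("Expected 0.17.2 for this tutorial")
```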
Start the process of building TensorFlow by downloading the TensorFlow 1.12 release branch.
cd ~/
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r1.12
./configure
Give the Python path when prompted:
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Press Enter twice.
Do you wish to build TensorFlow with Apache Ignite support? [Y/n]: Y
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: Y
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with ROCm support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 10.0
Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-10.0
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.3.1
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-10.0]: /usr/local/cuda-10.0
Do you wish to build TensorFlow with TensorRT support? [y/N]: N
Please specify the NCCL version you want to use. If NCCL 2.2 is not installed, then you can use version 1.3 that can be fetched automatically but it may have worse performance with multiple GPUs. [Default is 2.2]: 2.3.5
Now enter the compute capability you noted in Step 1, e.g. 5.0:
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 5.0] 5.0
Do you want to use clang as CUDA compiler? [y/N]: N
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: /usr/bin/gcc
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:N
Configuration finished
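The answers given to ./configure are recorded in .tf_configure.bazelrc in the source directory as --action_env settings, so they can be reviewed before starting a long build. The following Python sketch parses that file and reports values that differ from what this walkthrough expects. The EXPECTED values mirror the answers above and are an assumption about how configure records them; parse_action_env and mismatches are illustrative names.

```python
import re
from pathlib import Path

# Values the answers above are assumed to produce; adjust to your own answers.
EXPECTED = {
    "TF_NEED_CUDA": "1",
    "TF_CUDA_VERSION": "10.0",
    "CUDA_TOOLKIT_PATH": "/usr/local/cuda-10.0",
    "TF_CUDA_COMPUTE_CAPABILITIES": "5.0",
}

# Matches "--action_env KEY=value" (value without spaces, optionally quoted).
ACTION_ENV = re.compile(r"--action_env[ =]([A-Za-z_][A-Za-z0-9_]*)=(\S*)")


def parse_action_env(bazelrc_text):
    """Collect KEY -> value from every --action_env KEY=value in the text."""
    settings = {}
    for key, value in ACTION_ENV.findall(bazelrc_text):
        settings[key] = value.strip("\"'")
    return settings


def mismatches(bazelrc_text, expected=EXPECTED):
    """Return {key: (expected, found)} for settings that differ or are absent."""
    found = parse_action_env(bazelrc_text)
    return {k: (v, found.get(k)) for k, v in expected.items() if found.get(k) != v}


if __name__ == "__main__":
    rc = Path(".tf_configure.bazelrc")  # run from the tensorflow source directory
    if rc.exists():
        print(mismatches(rc.read_text()) or "Configuration matches the walkthrough")
    else:
        print("No .tf_configure.bazelrc here; run ./configure first")
```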
Step 13: Build Tensorflow using bazel
The next step in the process to install tensorflow GPU version will be to build tensorflow using bazel. This process takes a fairly long time.
To build a pip package for TensorFlow you would typically invoke the following command:
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
Note:
Add "--config=mkl" if you want Intel MKL support, which speeds up training on newer Intel CPUs.
Add "--config=monolithic" if you want a static monolithic build (try this if the build fails).
Add "--local_resources 2048,.5,1.0" if your PC is low on RAM and the build fails with a segmentation fault or related errors.
The build takes a long time: 3-4 hours, possibly more.
If you get an error such as Segmentation Fault, try again; a retry usually works.
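The optional flags in the note can be combined with the base command. As an illustration only, this small Python helper assembles the command line from the flags listed above; build_command and its parameter names are made up for this sketch, and it only prints the command rather than running it.

```python
import shlex

# Base command and target, as given in this step.
BASE = ["bazel", "build", "--config=opt", "--config=cuda"]
TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def build_command(mkl=False, monolithic=False, low_ram=False):
    """Return the bazel build command as a list of arguments."""
    cmd = list(BASE)
    if mkl:
        cmd.append("--config=mkl")
    if monolithic:
        cmd.append("--config=monolithic")
    if low_ram:
        cmd += ["--local_resources", "2048,.5,1.0"]
    cmd.append(TARGET)
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command for a low-RAM machine.
    print(" ".join(shlex.quote(part) for part in build_command(low_ram=True)))
```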
The bazel build command builds a script named build_pip_package. Run this script as follows to build a .whl file in the tensorflow_pkg directory:
bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow_pkg
To install tensorflow with pip:
cd tensorflow_pkg
For an existing virtual environment:
pip install tensorflow*.whl
With a new virtual environment using virtualenv:
sudo apt-get install virtualenv
virtualenv tf_1.12_cuda10.0 -p /usr/bin/python3
source tf_1.12_cuda10.0/bin/activate
pip install tensorflow*.whl
For Python 2 (use sudo if required):
pip2 install tensorflow*.whl
For Python 3 (use sudo if required):
pip3 install tensorflow*.whl
Note: if you get an error such as "unsupported platform", make sure you are running the pip command associated with the Python you specified while configuring the TensorFlow build.
You can check the pip version and its associated Python with the following command:
pip -V
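The "unsupported platform" error usually comes down to the tags in the wheel's file name not matching the interpreter. A wheel file name ends in python tag, ABI tag and platform tag, so the mismatch can be spotted without pip. The sketch below does that comparison for CPython only. The file name used in the example is hypothetical; yours depends on your build.

```python
import sys


def wheel_tags(filename):
    """Split a wheel file name into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return tuple(parts[-3:])


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython tag for an interpreter version, e.g. cp36."""
    return "cp%d%d" % (version_info[0], version_info[1])


def matches_interpreter(filename, version_info=sys.version_info):
    """True if the wheel's python tag includes this CPython version."""
    python_tag = wheel_tags(filename)[0]
    return interpreter_tag(version_info) in python_tag.split(".")


if __name__ == "__main__":
    # Hypothetical file name for illustration.
    name = "tensorflow-1.12.0-cp36-cp36m-linux_x86_64.whl"
    print(wheel_tags(name), "matches this Python:", matches_interpreter(name))
```

Run it with the Python you intend to install into; a False result means that interpreter's pip will reject the wheel.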
Step 14: Verify Tensorflow installation
Run in terminal
python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
If the system prints Hello, TensorFlow! (shown as b'Hello, TensorFlow!' under Python 3), then you are ready to begin writing TensorFlow programs.
Success! You have now successfully installed tensorflow 1.12 on your machine. If you are on Windows OS, you might want to check out our other post here, How to install Tensorflow 1.7.0 GPU with CUDA 9.1 and cuDNN 7.1.2 for Python 3 on Windows OS. Cheers!!
For prebuilt wheels, go to this link.
Great tutorial!
Many thanks! It was very useful!
I can’t download this file (Tensorflow 1.12 whl for CUDA 10.0 + CUDNN 7.3.1 + NCCL 2.3.5 + python 3.6 + Ubuntu). Can you send it to my mailbox?
https://drive.google.com/drive/folders/1jKPh34x3Jkdav2LQNx9IuHUo8h4-zaeS
Hi,
Thanks for the tutorial… I’m still unable to do it!
I’m stuck at step 13:
[email protected]:~/master/tensorflow/tensorflow$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
/home/diicic/master/tensorflow/tools/bazel.rc
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=204
INFO: Reading rc options for 'build' from /home/diicic/master/tensorflow/.tf_configure.bazelrc:
'build' options: --action_env PYTHON_BIN_PATH=/usr/bin/python3 --action_env PYTHON_LIB_PATH=/usr/lib/python3/dist-packages --python_path=/usr/bin/python3 --define with_ignite_support=true --define with_xla_support=true --action_env TF_NEED_OPENCL_SYCL=0 --action_env TF_NEED_ROCM=0 --action_env TF_NEED_CUDA=1 --action_env CUDA_TOOLKIT_PATH=/usr/local/cuda-10.0 --action_env TF_CUDA_VERSION=10.0 --action_env CUDNN_INSTALL_PATH=/usr/local/cuda-10.0 --action_env TF_CUDNN_VERSION=7 --action_env NCCL_INSTALL_PATH=/usr/lib/x86_64-linux-gnu --action_env NCCL_HDR_PATH=/usr/include --action_env TF_NCCL_VERSION=2 --action_env TF_CUDA_COMPUTE_CAPABILITIES=7.5 --action_env TF_CUDA_CLANG=0 --action_env GCC_HOST_COMPILER_PATH=/usr/bin/gcc --config=cuda
ERROR: Config value cuda is not defined in any .rc file
I had to install NCCL from deb package since I couldn’t find a tgz file…
please help!
If you configured CUDA while running ./configure, you can leave out --config=cuda. Also use the same version of bazel mentioned in the tutorial, and use a fresh virtualenv or uninstall the previous tensorflow using pip. Also use the release branch.
Thanks a lot for this tutorial. I was finally able to set up my system!
Works on Ubuntu 18.04 and Nvidia GTX 1060. Thanks! 😉
During bazel build I still have the following error:
ERROR: /home/me/tensorflow/cc/BUILD:422:1: Executing genrule //tensorflow/cc:user_ops_genrule failed (Exit 127)
bazel-out/host/bin/tensorflow/cc/ops/user_ops_gen_cc: error while loading shared libraries: libcublas.so.9.2: cannot open shared object file: No such file or directory
Is there any solution for this?
I think you configured the build for CUDA 9.2, so it is looking for the 9.2 libraries. If you want to build for CUDA 9.2, then install CUDA 9.2 first.
How do I configure tensorflow to search for CUDA 9.2 instead? I have the same problem here.
Provide the path of CUDA 9.2 while running ./configure.
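To see which CUDA versions are actually installed before answering the ./configure prompts, it may help to list the libcublas files present. This is a rough Python sketch, assuming the toolkits live under /usr/local/cuda*/lib64 as in this tutorial; cublas_versions is an illustrative name.

```python
from pathlib import Path

PREFIX = "libcublas.so."


def cublas_versions(lib_dir):
    """Return sorted version suffixes of libcublas.so.* files in lib_dir."""
    directory = Path(lib_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name[len(PREFIX):] for p in directory.glob(PREFIX + "*"))


if __name__ == "__main__":
    # Assumed toolkit locations; adjust if CUDA is installed elsewhere.
    for lib_dir in sorted(Path("/usr/local").glob("cuda*/lib64")):
        print(lib_dir, "->", cublas_versions(lib_dir) or "no libcublas found")
```

If the version named in the error message (9.2 in the report above) is not listed, the build was configured for a CUDA version that is not installed.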
Worked perfectly on my new RTX 2080. I have done about 10 failed attempts but this one was spot-on. Hero!
Thanks! Gustaf
What version of Ubuntu are you using? 18.04 or 16.04?
Dear Mr Mandal
It’s my first time installing a Linux OS (Ubuntu 16.04 LTS) and tensorflow with GPU support (actually, I want to install Keras and Theano). Following your instructions step by step in the terminal, everything was OK. However, when I close the terminal and open it again, Step 14 can’t be completed successfully. When I type “python” and “import tensorflow as tf”, errors happen:
———————
python
Python 2.7.12 (default, Dec 4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named tensorflow
———————————-
When I type “python3”, “import tensorflow as tf” and “hello = tf.constant('Hello, TensorFlow!')”, errors happen:
————————————————————–
python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'tensorflow' has no attribute 'constant'
——————————————————————
If I run in terminal
—————————————-
cd tensorflow
cd tensorflow_pkg
source tf_1.12_cuda10.0/bin/activate
python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
————————-
then the terminal outputs the right results. It seems the problem is related to the installation path, but I do not know how to solve it.
When I install Theano by typing “sudo pip install theano” or “sudo pip3 install theano”, errors happen:
————————————————————
sudo pip install theano
Traceback (most recent call last):
File "/usr/bin/pip", line 9, in <module>
from pip import main
ImportError: cannot import name main
—————————————
sudo pip3 install theano
Traceback (most recent call last):
File "/usr/bin/pip3", line 9, in <module>
from pip import main
ImportError: cannot import name 'main'
——————————————
When I install Keras by typing “sudo pip install keras” or “sudo pip3 install keras”, errors happen:
—————————————–
sudo pip install keras
Traceback (most recent call last):
File "/usr/bin/pip", line 9, in <module>
from pip import main
ImportError: cannot import name main
——————————————–
sudo pip3 install keras
Traceback (most recent call last):
File "/usr/bin/pip3", line 9, in <module>
from pip import main
ImportError: cannot import name 'main'
————————————————
Alternatively, I install Keras from GitHub:
git clone https://github.com/fchollet/keras
cd keras
sudo python3 setup.py install
——————
The installation is completed:
—————————
sudo python3 setup.py install
——————————–
However, when I run the example, errors happen:
————————————-
python3 examples/mnist_cnn.py
Using TensorFlow backend.
Traceback (most recent call last):
File "examples/mnist_cnn.py", line 9, in <module>
import keras
File "/usr/local/lib/python3.5/dist-packages/Keras-2.2.4-py3.5.egg/keras/__init__.py", line 3, in <module>
File "/usr/local/lib/python3.5/dist-packages/Keras-2.2.4-py3.5.egg/keras/utils/__init__.py", line 6, in <module>
File "/usr/local/lib/python3.5/dist-packages/Keras-2.2.4-py3.5.egg/keras/utils/conv_utils.py", line 9, in <module>
File "/usr/local/lib/python3.5/dist-packages/Keras-2.2.4-py3.5.egg/keras/backend/__init__.py", line 89, in <module>
File "/usr/local/lib/python3.5/dist-packages/Keras-2.2.4-py3.5.egg/keras/backend/tensorflow_backend.py", line 5, in <module>
ImportError: No module named 'tensorflow'
———————————–
I can’t solve these problems. Can you help me? Thanks.
I think you ran python from the tensorflow source directory. It is clearly written in the tutorial. Cd to any other directory, then run python.
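A quick way to tell whether Python is picking up the source checkout instead of the installed package is to look at where the tensorflow module resolves from. The sketch below is only an illustration of that check, with made-up helper names; it does not import tensorflow itself, it only asks Python where it would load it from.

```python
import importlib.util
import os


def is_shadowed_by(directory, module_origin):
    """True if module_origin lies inside directory (e.g. the current one)."""
    directory = os.path.realpath(directory)
    origin = os.path.realpath(module_origin)
    return os.path.commonpath([directory, origin]) == directory


def tensorflow_location():
    """Return the path Python would load tensorflow from, or None."""
    spec = importlib.util.find_spec("tensorflow")
    if spec is None:
        return None
    if spec.origin and spec.origin != "namespace":
        return spec.origin
    # A plain directory named tensorflow has no file origin.
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else None


if __name__ == "__main__":
    location = tensorflow_location()
    if location is None:
        print("tensorflow is not importable from this interpreter")
    elif is_shadowed_by(os.getcwd(), location):
        print("tensorflow resolves inside the current directory:", location)
        print("cd to another directory and try again")
    else:
        print("tensorflow resolves to:", location)
```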
What does that mean? I am running the program in /home/pi/Documents/.
And keras is installed in /usr/local/lib/python3.5/dist-packages (tensorflow as well as keras).
How do I move the installation outside /usr/local/lib/python3.5?
I understand that it is unable to check for the Python program within this directory and so is unable to run Python itself.
Then is there any way to keep the keras and tensorflow installation targets outside of this place?
Hello, this is a great post, but I get the following error at step 13 and do not know what to do:
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
Error:tensorflow/BUILD:533:1: Executing genrule //tensorflow:tf_python_api_gen_v1 failed (Exit 1)