
Building TensorFlow 2.10.0 from source

Cloning a conda environment

https://blog.csdn.net/ft_sunshine/article/details/92215164
https://stackoverflow.com/questions/44112457/how-to-backup-anaconda-added-packages
https://stackoverflow.com/questions/48016351/how-to-make-new-anaconda-env-from-yml-file

  1. Export the old conda env to a yml file
conda env export > path/to/environment.yml

# Edit the .yml
name: new_env_name
prefix: /path/...../new_env_name

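  2. Create the new env with a different python version (conda env create does not take a package spec, so first change the python= entry under dependencies in the .yml, e.g. to python=3.8)
conda env create -n new_env_name --file environment.yml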
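
The manual edit of the exported .yml can also be scripted. A minimal sketch, assuming a sed that supports `-E` and a hypothetical helper name `retarget_env_yml` (not a conda command): it rewrites `name:`, the last path component of `prefix:`, and the `python=` pin. Other packages pinned to builds of the old Python may still have to be loosened by hand.

```bash
# Hypothetical helper: rewrite an exported environment.yml for a new env.
# usage: retarget_env_yml IN.yml NEW_NAME PYTHON_VERSION > OUT.yml
retarget_env_yml() (
  in_file=$1
  new_name=$2
  py_version=$3
  sed -E \
    -e "s|^name:.*|name: ${new_name}|" \
    -e "s|^(prefix:.*/)[^/]*\$|\1${new_name}|" \
    -e "s|^([[:space:]]*-[[:space:]]*)python=.*|\1python=${py_version}|" \
    "$in_file"
)

# Typical use:
#   retarget_env_yml environment.yml new_env_name 3.8 > new_environment.yml
#   conda env create -n new_env_name --file new_environment.yml
```

Only the line that starts with `- python=` is touched, so entries such as `python-dateutil` keep their pins.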

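conda environment isolation

The Python environments need to be isolated, so when creating the anaconda environment, pin a specific python version for the tensorflow-related packages. This keeps pip-managed packages from conflicting with other anaconda environments.

For example, to create the anaconda environment dl_course_env bound to python 3.8: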

conda create -n dl_course_env python=3.8

Reference

Some commonly used commands:

conda env remove -n ENV_NAME
python -c 'import tensorflow as tf; print(tf.__version__)'  # for Python 2
python3 -c 'import tensorflow as tf; print(tf.__version__)'  # for Python 3
# use tf1.15new env - 1.15.5
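
To confirm that the interpreter being run really belongs to the isolated environment, its path can be compared with the environment prefix. A minimal sketch; the helper name `interp_in_env` is made up for this note, and it relies on `CONDA_PREFIX`, which conda sets on activation.

```bash
# Hypothetical helper: succeed when interpreter path $1 lives in env prefix $2.
interp_in_env() {
  case "$1" in
    "$2"/bin/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Typical use after `conda activate dl_course_env`:
#   interp_in_env "$(command -v python3)" "$CONDA_PREFIX" && echo "isolated"
```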

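Using an existing wheel

This option applies when a whl built for exactly the same configuration can be found on GitHub. In that case it can be installed directly, and the self-compilation stage is skipped.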
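
Part of "the same configuration" can be read off the wheel's file name, which ends in a Python tag, an ABI tag and a platform tag. A small sketch with a hypothetical helper `wheel_tags` that splits those three fields off the end of the name. The CUDA and cuDNN versions are not encoded in the name, so matching tags are necessary but not sufficient.

```bash
# Hypothetical helper: print "PYTHON_TAG ABI_TAG PLATFORM_TAG" for a wheel path.
wheel_tags() (
  base=${1##*/}          # drop the directory part
  base=${base%.whl}      # drop the extension
  platform=${base##*-}   # last dash-separated field
  rest=${base%-*}
  abi=${rest##*-}        # second to last field
  rest=${rest%-*}
  pytag=${rest##*-}      # third to last field
  printf '%s %s %s\n' "$pytag" "$abi" "$platform"
)

# Typical use:
#   wheel_tags tensorflow-2.11.0-cp38-cp38-linux_x86_64.whl
```

For a python 3.8 environment on 64-bit Linux the fields to look for are `cp38`, `cp38` and `linux_x86_64`.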

Building tf 2.10.0 yourself

  • Hardware

    • Intel Pentium Gold G5400
    • Nvidia FE RTX3070
  • Environment

    • Ubuntu 20.04 LTS
    • nvidia-driver 515
    • Anaconda installed (too lazy to uninstall it)
    • CUDA version: 11.7
    • cuDNN: 8.4.1.50

      cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive.tar.xz

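    • CUDA toolkit 11.7
      • Must be installed globally on the machine. The toolkit from a conda virtual environment cannot be used, otherwise it is not detected during the bazel configuration.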
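
Before starting, it is worth checking that a system-wide toolkit is really there and which release it is. A rough sketch, assuming the toolkit root is /usr/local/cuda; both helper names are hypothetical.

```bash
# Hypothetical helper: read `nvcc --version` text on stdin, print e.g. "11.7".
cuda_release_from_nvcc() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed when directory $1 looks like a toolkit root.
has_cuda_toolkit() {
  [ -x "$1/bin/nvcc" ] && [ -d "$1/include" ]
}

# Typical use (the path is an assumption):
#   has_cuda_toolkit /usr/local/cuda \
#     && /usr/local/cuda/bin/nvcc --version | cuda_release_from_nvcc
```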
  • Commands to build tensorflow 2.10.0

    1. Update the packages to the latest state and install curl
    sudo apt update
    sudo apt full-upgrade
    sudo apt install curl
    2. Install bazel
      Bazelisk
    # Installing Bazel using Bazelisk
    # Bazelisk automatically downloads the appropriate Bazel version for TensorFlow
    bazel version
    3. Install the packages needed for the build
    sudo apt install git
    sudo apt-get install python3-dev
    sudo apt-get install python3-pip
    pip3 install six
    pip3 install numpy
    pip3 install wheel
    pip3 install setuptools
    pip3 install mock
    # (use pip3, not pip)
    pip3 install -U --user 'future>=0.17.1'
    pip3 install -U --user keras_applications --no-deps
    pip3 install -U --user keras_preprocessing --no-deps
    4. Download the tensorflow source code
    curl -LO https://github.com/tensorflow/tensorflow/archive/v2.10.0.tar.gz
    tar xvfz v2.10.0.tar.gz
    rm v2.10.0.tar.gz
    cd tensorflow-2.10.0
    5. Confirm the bazel version
    bazel version
    # shows Build label: 5.3.0
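    • Make sure the bazel version matches the source code. Otherwise errors such as "Bazel Tensorflow installation from source: Unrecognized option: host_force_python=py2" appear (error 1, error 2).
    • I tried other fixes, but in the end only the officially recommended installation method from step 2 solved it.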
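
The version match can be checked mechanically. As far as I know, Bazelisk picks the version from the `USE_BAZEL_VERSION` environment variable if it is set, and otherwise from the `.bazelversion` file in the source tree. The sketch below mirrors only those two lookups. Both helper names are hypothetical.

```bash
# Hypothetical helper: print the Bazel version that source tree $1 asks for.
# Assumes the USE_BAZEL_VERSION override first, then the .bazelversion file.
wanted_bazel_version() (
  src_dir=$1
  if [ -n "${USE_BAZEL_VERSION:-}" ]; then
    printf '%s\n' "$USE_BAZEL_VERSION"
  elif [ -f "$src_dir/.bazelversion" ]; then
    head -n 1 "$src_dir/.bazelversion" | tr -d '[:space:]'
    printf '\n'
  else
    exit 1
  fi
)

# Hypothetical helper: read `bazel version` output on stdin, print the label.
bazel_build_label() {
  sed -n 's/^Build label: *//p'
}

# Typical use from the tensorflow source directory:
#   [ "$(bazel version | bazel_build_label)" = "$(wanted_bazel_version .)" ] \
#     || echo "bazel version does not match the source tree"
```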
    6. Configure the build
    python3 ./configure.py

    (this also works inside a virtual environment)
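
configure.py asks its questions interactively. To my knowledge it also takes answers from environment variables such as `PYTHON_BIN_PATH` and `TF_CUDA_PATHS`, so the prompts can be answered ahead of time. The sketch below mirrors this note's setup. The helper name is hypothetical, the compute capability 8.6 for the RTX 3070 is my own addition, and any prompt not covered here is still asked.

```bash
# Sketch: pre-answer configure.py prompts through environment variables.
# The helper name is hypothetical; the values mirror this note's machine.
export_tf_configure_env() {
  env_prefix=$1                              # e.g. "$CONDA_PREFIX"
  export PYTHON_BIN_PATH="$env_prefix/bin/python"
  export TF_NEED_CUDA=1
  export TF_CUDA_VERSION=11.7
  export TF_CUDNN_VERSION=8
  export TF_CUDA_COMPUTE_CAPABILITIES=8.6    # RTX 3070 (assumed value)
  # Comma-separated base paths for the CUDA libraries and headers.
  export TF_CUDA_PATHS="/lib,/lib/x86_64-linux-gnu,/usr,/usr/local/cuda,/usr/local/cuda/targets/x86_64-linux/lib,$env_prefix/lib"
}

# Typical use inside the activated env, from the source directory:
#   export_tf_configure_env "$CONDA_PREFIX"
#   python3 ./configure.py
```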

    • At the prompt "Please specify the location of python.", enter the path of the virtual environment's python
    • For the remaining detailed settings, see https://iter01.com/556024.html
      /lib,/lib/x86_64-linux-gnu,/usr,/usr/local/cuda,/usr/local/cuda/targets/x86_64-linux/lib,/home/sean/anaconda3/envs/dl_course_env/lib
    7. Start the build
    # GPU support
    bazel build --config=cuda //tensorflow/tools/pip_package:build_pip_package
    # (the build takes roughly 24 hr)
    • Output when finished: (screenshot not available)
    # Build the package
    ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    • Output when finished: (screenshot not available)

    8. Install from the wheel
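
The wheel's file name depends on the version and the python it was built for, so typing it by hand is error-prone. A small sketch with a hypothetical helper `find_tf_wheel` that prints the first tensorflow wheel found in the output directory:

```bash
# Hypothetical helper: print the first tensorflow wheel found in directory $1.
find_tf_wheel() (
  for whl in "$1"/tensorflow-*.whl; do
    if [ -e "$whl" ]; then
      printf '%s\n' "$whl"
      exit 0
    fi
  done
  exit 1
)

# Typical use inside the activated env:
#   pip3 install "$(find_tf_wheel /tmp/tensorflow_pkg)"
#   python3 -c 'import tensorflow as tf; print(tf.__version__, tf.config.list_physical_devices("GPU"))'
```

The last line reuses the version check from the common commands and additionally lists the GPUs that tensorflow can see.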
    # Install after entering the virtual environment
    # (use the file name actually generated in /tmp/tensorflow_pkg)
    pip3 install --user /tmp/tensorflow_pkg/tensorflow-2.11.0-cp38-cp38-linux_x86_64.whl