
Tutorial: building the distributed, parallel version of Caffe with Open MPI (distributed C++ build)


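This article walks through building the distributed, parallel version of Caffe with Open MPI. It also collects related notes on the Caffe OpenCV linker error, on using Caffe with OpenCV but without lmdb, on protobuf-related build problems when installing Caffe and Caffe2, and on the Caffe build problem "ImportError: libopencv_core.so.3.4: cannot open shared object file: No such file or d...".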


Caffe version: https://github.com/yjxiong/caffe

Environment:

CentOS release 6.6 (Final)
CUDA 8.0
cuDNN 6.0
Open MPI 3.1.3
OpenCV 3.1.0

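CUDA 8.0, cuDNN 6.0, OpenCV 3.1.0 and the other Caffe dependencies are already installed, so only Open MPI 3.1.3 needs to be installed here. The steps are as follows.

Installing Open MPI 3.1.3

1. Unpack openmpi-3.1.3, change into the extracted directory (openmpi-3.1.3), and run the following in a terminal:

./configure --prefix=/storage/student5/usr/local/openmpi --with-cuda --enable-mpi-thread-multiple
# the path after --prefix is the Open MPI installation path
sudo make all install
# run make all install with sudo, otherwise the installation may fail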

2. Test whether the installation succeeded:

cd openmpi-3.1.3/examples
make
mpirun -np 4 hello_c
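Because a system-wide Open MPI may already be installed, it can help to confirm that the `mpirun` and `mpicxx` found on `PATH` are the ones under the new prefix. The following is a minimal sketch, not part of the original tutorial; the function name `resolve_tool` and its `search_path` parameter are hypothetical, and the prefix in the usage lines is simply the one passed to `--prefix` above.

```python
import os
import shutil


def resolve_tool(name, prefix, search_path=None):
    """Return (path_found, is_under_prefix) for an executable.

    search_path defaults to the PATH environment variable; it is a
    parameter only so the check can be exercised on a fixed directory list.
    """
    found = shutil.which(name, path=search_path)
    if found is None:
        return None, False
    expected_dir = os.path.realpath(os.path.join(prefix, "bin"))
    actual_dir = os.path.realpath(os.path.dirname(found))
    return found, actual_dir == expected_dir


if __name__ == "__main__":
    # Assumed prefix: the one given to ./configure --prefix in step 1.
    prefix = "/storage/student5/usr/local/openmpi"
    for tool in ("mpirun", "mpicxx"):
        path, ok = resolve_tool(tool, prefix)
        print("%s -> %s (%s)" % (tool, path, "ok" if ok else "not under prefix"))
```

If a tool resolves elsewhere, putting `<prefix>/bin` first on `PATH` (and `<prefix>/lib` on `LD_LIBRARY_PATH`) is the usual remedy.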

Installing Caffe

1. Download Caffe, save Makefile.config.example as Makefile.config, and edit it to read as follows:

## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
USE_OPENCV := 1
USE_LEVELDB := 1
USE_LMDB := 1

# Uncomment if you're using OpenCV 3
OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH :=     -gencode arch=compute_30,code=sm_30 \
        -gencode arch=compute_35,code=sm_35 \
        -gencode arch=compute_50,code=sm_50 \
        -gencode arch=compute_50,code=compute_50

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
BLAS_INCLUDE := /usr/include
BLAS_LIB := /usr/lib64/atlas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
MATLAB_DIR := /usr/local/MATLAB/R2014a
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
        /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
        # $(ANACONDA_HOME)/include/python2.7 \
        # $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \

# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @
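Since most switches in Makefile.config are toggled by commenting or uncommenting a line, it is easy to lose track of what is actually active. The sketch below is an illustration only, not part of the original tutorial: the function name `active_settings` is hypothetical, and the parser handles only plain `NAME := value` style assignments with backslash continuations, not the full Make syntax. It lists the variables that are really set.

```python
import re

# NAME := value, NAME ?= value, NAME += value or NAME = value
_ASSIGN = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(:=|\?=|\+=|=)\s*(.*)$")


def active_settings(text):
    """Map each uncommented variable to its value.

    Lines ending in a backslash are joined with the next line first.
    A later assignment simply overwrites an earlier one ("+=" is not
    accumulated), which is enough for a quick review of the switches.
    """
    settings = {}
    logical = ""
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.endswith("\\"):
            logical += line[:-1] + " "
            continue
        logical += line
        match = _ASSIGN.match(logical)  # commented lines never match
        if match:
            settings[match.group(1)] = " ".join(match.group(3).split())
        logical = ""
    return settings


if __name__ == "__main__":
    import sys
    # Usage (assumed invocation): python check_config.py Makefile.config
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for name, value in sorted(active_settings(handle.read()).items()):
                print("%s = %s" % (name, value))
```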

2. In the Caffe directory, run:

mkdir build && cd build

3. Configure the Caffe build

  To enable the MATLAB interface, first edit line 24 of CMakeLists.txt in the Caffe root directory:

caffe_option(BUILD_matlab "Build Matlab wrapper" OFF IF UNIX OR APPLE)

  Change it to:

caffe_option(BUILD_matlab "Build Matlab wrapper" ON IF UNIX OR APPLE)

  Then, or directly if you do not need the MATLAB interface, run the following in caffe/build:

cmake -DUSE_MPI=ON -DMPI_CXX_COMPILER=/path/to/your/openmpi/bin/mpicxx ..
# USE_MPI=ON enables Open MPI
# The path after -DMPI_CXX_COMPILER must be the mpicxx in the bin directory of your Open MPI installation.
# There is also an mpicxx in /usr/bin, so take care not to point to the wrong one.
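To avoid pointing CMake at the wrong `mpicxx`, the command can be assembled from the Open MPI prefix and checked before it runs. This is a minimal sketch under stated assumptions: the helper name `caffe_cmake_args` is hypothetical, and the only CMake options it uses are the two shown in the command above.

```python
import os


def caffe_cmake_args(openmpi_prefix, source_dir=".."):
    """Build the cmake argument list for an MPI-enabled Caffe build.

    Raises FileNotFoundError if <openmpi_prefix>/bin/mpicxx does not exist,
    so a mistyped prefix is caught before CMake is invoked.
    """
    mpicxx = os.path.join(openmpi_prefix, "bin", "mpicxx")
    if not os.path.isfile(mpicxx):
        raise FileNotFoundError("mpicxx not found at %s" % mpicxx)
    return ["cmake", "-DUSE_MPI=ON", "-DMPI_CXX_COMPILER=" + mpicxx, source_dir]


if __name__ == "__main__":
    # Assumed prefix: the one used when installing Open MPI earlier.
    try:
        print(" ".join(caffe_cmake_args("/storage/student5/usr/local/openmpi")))
    except FileNotFoundError as exc:
        print(exc)
```

The returned list can be handed to `subprocess.check_call(args, cwd="build")` from the Caffe root directory.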

4. Install Caffe. In the Caffe root directory, run:

make all -j8
make install
# in my case, make install was not needed after make all
make runtest
# as in the referenced tutorial, two tests did not pass

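5. Build the Python interface:

  a. Open ~/.bashrc to add an environment variable:

gedit ~/.bashrc

  b. Add this line to the file:

export PYTHONPATH=$PYTHONPATH:/path/to/your/caffe/python

  c. Apply the change:

source ~/.bashrc

  d. In the Caffe root directory:

make pycaffe
# the tutorial uses sudo here, but it also worked for me without sudo

  e. Test the Python interface by entering the following in a terminal:

python
import caffe
# if no error appears, the Python interface was built successfully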
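The same import test can be scripted so that the reason for a failure is printed rather than lost in an interactive session. The following is a small sketch; the helper name `try_import` is hypothetical and it uses only the standard library.

```python
import importlib


def try_import(name):
    """Try to import a module; return (ok, error_message)."""
    try:
        importlib.import_module(name)
    except ImportError as exc:
        # Covers both a missing module and a shared library that
        # fails to load while the extension module is imported.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    ok, message = try_import("caffe")
    if ok:
        print("caffe imported successfully")
    else:
        print("import caffe failed: %s" % message)
```

A "No module named" message usually points at `PYTHONPATH`, while a "cannot open shared object file" message points at the dynamic linker search path.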

 

Problems encountered:

1. While configuring the Caffe build, the following command failed:

cmake -DUSE_MPI=ON -DMPI_CXX_COMPILER=/path/to/your/openmpi/bin/mpicxx ..

  Problem 1:

CMake Warning at /usr/local/opencv-3.1.0/cmake/OpenCVConfig.cmake:166 (message):
  Found OpenCV Windows Pack but it has no binaries compatible with your
  configuration.

  You should manually point CMake variable OpenCV_DIR to your build of OpenCV
  library.
Call Stack (most recent call first):
  cmake/Dependencies.cmake:62 (find_package)
  CMakeLists.txt:31 (include)


CMake Error at cmake/Dependencies.cmake:62 (find_package):
  Found package configuration file:

    /usr/local/opencv-3.1.0/cmake/OpenCVConfig.cmake

  but it set OpenCV_FOUND to FALSE so package "OpenCV" is considered to be
  NOT FOUND.
Call Stack (most recent call first):
  CMakeLists.txt:31 (include)


-- Configuring incomplete, errors occurred!
See also "/storage/student5/usr/local/caffe/build/CMakeFiles/CMakeOutput.log".
See also "/storage/student5/usr/local/caffe/build/CMakeFiles/CMakeError.log".

  Solution:

    Attempt 1: add set(OpenCV_DIR /path/to/your/OpenCV/build) to CMakeLists.txt. This did not work.

    Attempt 2: go back to the Caffe root directory, run make clean, set the following environment variable for the current session, and start again from mkdir build && cd build. This worked.

export OpenCV_DIR=/path/to/your/opencv/build
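The warning above shows CMake picking up an OpenCVConfig.cmake that does not describe the local build, so it helps to see every candidate directory before choosing a value for `OpenCV_DIR`. The sketch below is an illustration, not the author's method; the function name `find_opencv_config` is hypothetical, and the search root in the usage lines is taken from the path in the log.

```python
import os


def find_opencv_config(root):
    """Return the directories under root that contain OpenCVConfig.cmake."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "OpenCVConfig.cmake" in filenames:
            hits.append(dirpath)
    return sorted(hits)


if __name__ == "__main__":
    # Assumed location of the OpenCV source tree, as it appears in the log.
    for candidate in find_opencv_config("/usr/local/opencv-3.1.0"):
        print(candidate)
```

Of the directories printed, the one inside the OpenCV build tree is the one the environment variable should point to.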

  Problem 2:

CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:108 (message):
  Could NOT find Atlas (missing: Atlas_LAPACK_LIBRARY)
Call Stack (most recent call first):
  /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:315 (_FPHSA_FAILURE_MESSAGE)
  cmake/Modules/FindAtlas.cmake:43 (find_package_handle_standard_args)
  cmake/Dependencies.cmake:74 (find_package)
  CMakeLists.txt:31 (include)

  Solution:

    Attempt 1: specify the Atlas path. Go back to the Caffe root directory, run make clean, set export Atlas_ROOT_DIR=/your/Atlas/Root for the current session, and start again from mkdir build && cd build. This did not work.

    Attempt 2: go back to the Caffe root directory, run make clean, start again from mkdir build && cd build, and run the following command in the terminal before continuing. This worked.

cmake -DBLAS=open .

 

2. Problems during make all -j8:

  Problem 1:

/usr/bin/ld: .build_release/examples/cpp_classification/classification.o: undefined reference to symbol '_ZN2cv6imreadERKNS_6StringEi'
/usr/local/lib/libopencv_imgcodecs.so.3.1: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [.build_release/examples/cpp_classification/classification.bin] Error 1
make: *** Waiting for unfinished jobs....

  Solution: with opencv-3.x, libopencv_imgcodecs.so must be linked as well. In the Makefile, at line 172, change:

LIBRARIES += glog gflags protobuf leveldb snappy \
    lmdb boost_system hdf5_hl hdf5 m \
    opencv_core opencv_highgui opencv_imgproc

  to:

LIBRARIES += glog gflags protobuf leveldb snappy \
    lmdb boost_system hdf5_hl hdf5 m \
    opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs
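The Makefile line number differs between Caffe versions (172 here, 187 in the Q&A further down), so searching for the library name is more robust than editing by line number. The following is a minimal sketch of that edit; the function name `add_imgcodecs` is hypothetical, and it assumes the OpenCV libraries are listed as whitespace-separated names exactly as in the excerpt above.

```python
def add_imgcodecs(makefile_text):
    """Append opencv_imgcodecs after opencv_imgproc where it is missing.

    Lines that already list opencv_imgcodecs are left untouched, so the
    function can safely be applied more than once.
    """
    patched = []
    for line in makefile_text.splitlines(keepends=True):
        tokens = line.split()
        if "opencv_imgproc" in tokens and "opencv_imgcodecs" not in tokens:
            line = line.replace("opencv_imgproc",
                                "opencv_imgproc opencv_imgcodecs", 1)
        patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    import sys
    # Usage (assumed invocation): python patch_makefile.py Makefile > Makefile.new
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            sys.stdout.write(add_imgcodecs(handle.read()))
```

Writing to a new file and diffing it against the original keeps the change easy to review before it replaces the Makefile.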

  Problem 2:

nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be
removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

  Solution: delete the following lines from Makefile.config:

-gencode arch=compute_20,code=sm_20 \
-gencode arch=compute_20,code=sm_21 \
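When the first of these flags shares a line with `CUDA_ARCH :=`, deleting the whole line would also delete the variable name. The sketch below removes only the deprecated flags and keeps the rest of the assignment intact. It is an illustration under assumptions: the function name `drop_fermi_targets` is hypothetical, and it expects the flags to be spelled exactly as in the two lines above.

```python
import re

# Matches the two deprecated flags plus any whitespace that follows them.
_OLD_FLAG = re.compile(r"-gencode\s+arch=compute_20,code=sm_2[01]\s*")


def drop_fermi_targets(text):
    """Remove the compute_20 -gencode flags from Makefile.config text."""
    kept = []
    for line in text.splitlines():
        cleaned = _OLD_FLAG.sub("", line)
        # Drop a line that contained nothing but the flag and a backslash.
        if cleaned != line and cleaned.strip() in ("", "\\"):
            continue
        kept.append(cleaned)
    result = "\n".join(kept)
    return result + "\n" if text.endswith("\n") else result


if __name__ == "__main__":
    sample = (
        "CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \\\n"
        "        -gencode arch=compute_20,code=sm_21 \\\n"
        "        -gencode arch=compute_30,code=sm_30\n"
    )
    print(drop_fermi_targets(sample))
```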

 

Reference tutorials:

1. https://blog.csdn.net/whyerdiku/article/details/78842498 (Python + MATLAB interfaces)

2. http://www.cnblogs.com/beihaidao/p/6866342.html (Python + MATLAB interfaces)

3. https://blog.csdn.net/qq_21368481/article/details/81257265?tdsourcetag=s_pctim_aiomsg (MATLAB interface)

 

C++ - Caffe: OpenCV error

I built OpenCV 3.0 from source and can run a few sample applications that build against its headers, so I assume it installed successfully.

I am also using Python 3, and I am now installing and building Caffe. I set some variables in Makefile.config because I use Anaconda and, having an AMD GPU, build for CPU only.

When I run make all, I get this error:

$ make all
CXX/LD -o .build_release/examples/cpp_classification/classification.bin
/usr/bin/ld: .build_release/examples/cpp_classification/classification.o: undefined reference to symbol '_ZN2cv6imreadERKNS_6StringEi'
//usr/local/lib/libopencv_imgcodecs.so.3.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
Makefile:565: recipe for target '.build_release/examples/cpp_classification/classification.bin' failed
make: *** [.build_release/examples/cpp_classification/classification.bin] Error 1

From searching, I think this is related to using OpenCV 3, but I do not know where to start looking for a solution. Any help?

And yes, I am one of those inexperienced users hoping to fiddle with Google's Inception learning technique.

Solution

You can also add opencv_imgcodecs to the Makefile at line 187; see this pull request.

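C++ - Caffe with OpenCV but without lmdb

When using Caffe, a training dataset of images normally has to be stored in a special database format such as lmdb. Is there any option to pass Caffe a batch of images directly, for example as a vector<cv::Mat>?

To clarify, I am looking for a solution that can handle a large number of images that do not fit in memory (assume that a single training batch, say 50 images, does fit in memory).

Solution

Caffe accepts several kinds of input, depending on which input layer is used.
Some of the available input layers are:

- Data
- MemoryData
- HDF5Data
- ImageData, etc.

In a model file, the first layer you will usually find is of type Data, which reads from lmdb or leveldb. Converting a set of images into these databases is easy, because Caffe already ships tools for converting images.

Layer type MemoryData reads data directly from memory. This is useful when, for example, camera input is passed to Caffe during the test phase. Using this layer for training is strongly discouraged.

Layer type ImageData takes a text file as input. The text file lists every image name with its full path and its class number. Caffe uses OpenCV to read the images in this layer, and it also takes care of all image transformations. So instead of reading images with OpenCV yourself and passing them to a MemoryData layer, use ImageData.

The .txt file that the ImageData layer reads must have this format:

/path/to/the/image/imageName.jpg classNumber

LMDB or LevelDB is strongly recommended, because ImageData does not work properly if an image path or name contains spaces, or if any image is corrupted.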
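A list file in the format above can be generated with a few lines of code, and paths containing whitespace can be filtered out at the same time, since they are the case the paragraph above warns about. This is a minimal sketch: the function name `imagedata_lines` and the output file name `train.txt` are hypothetical, and it does not check whether image files are corrupted.

```python
def imagedata_lines(samples):
    """Turn (image_path, class_number) pairs into ImageData list lines.

    Returns (lines, skipped), where skipped holds the paths that contain
    whitespace and would therefore break the "path label" format.
    """
    lines, skipped = [], []
    for path, label in samples:
        if any(ch.isspace() for ch in path):
            skipped.append(path)
            continue
        lines.append("%s %d" % (path, int(label)))
    return lines, skipped


if __name__ == "__main__":
    # Made-up sample entries, for illustration only.
    samples = [("/data/cats/001.jpg", 0), ("/data/my dogs/002.jpg", 1)]
    lines, skipped = imagedata_lines(samples)
    with open("train.txt", "w") as handle:
        handle.write("\n".join(lines) + "\n")
    for path in skipped:
        print("skipped (whitespace in path): %s" % path)
```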

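Details of the various layers can be found here.

Memory is allocated on the GPU according to the model and the batch size. If you run out of memory, try reducing the batch size. Caffe handles the ImageNet database of 1.2 million images without difficulty, so with a suitable batch size the algorithm should run without problems.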

Protobuf-related build problems when installing Caffe and Caffe2

A protobuf version problem when installing Caffe is caused by Anaconda having installed a different protobuf version. Compare the two versions:

pip show protobuf
protoc --version

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install python-protobuf
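The version comparison described at the start of this section can be automated by parsing the output of `protoc --version` and the version string of the Python package. The sketch below covers only the parsing and comparison; the function names `parse_version` and `same_version` are hypothetical, and treating any difference in the three version numbers as a mismatch is an assumption.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from text such as 'libprotoc 3.5.0'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number in %r" % text)
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def same_version(protoc_output, python_package_version):
    """True when the compiler and the Python package report the same version."""
    return parse_version(protoc_output) == parse_version(python_package_version)


if __name__ == "__main__":
    # Sample strings, for illustration only.
    print(same_version("libprotoc 3.5.0", "3.5.0"))
    print(same_version("libprotoc 2.6.1", "3.5.0"))
```

In practice the first argument would come from `subprocess.check_output(["protoc", "--version"])` and the second from `google.protobuf.__version__`.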

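If ICU-related problems appear, rebuild and reinstall ICU:

http://www.linuxfromscratch.o...

When building protobuf yourself, the error "a local symbol' can not be used when making a shared object; recompile with -fPIC" appears

Cause

protobuf was built with its default installation procedure (shown below). The static library libprotobuf.a produced this way is not a PIC object, so other shared libraries cannot link against it. A 64-bit .so must be compiled with -fPIC, which means every static library it links must be compiled with -fPIC as well.

./autogen.sh
./configure --prefix=INSTALL_PATH
make
make install

Solution

Add two variables at the configure step:

./configure CFLAGS="-fPIC" CXXFLAGS="-fPIC" --prefix=INSTALL_PATH

Note that after reconfiguring you must run make clean to remove the previously generated files, otherwise the change has no effect.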

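Building Caffe2 fails at from google.protobuf import symbol_database as _symbol_database

Cause

The protobuf you compiled has not been installed into the Python environment.

Solution

Download protobuf 3.5.0, build it as described above, then install its Python package:

cd PROTOBUF_PATH/python
python setup.py build
sudo python setup.py install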

Building Caffe2 fails with the following errors

../lib/libgtest.a(gtest-all.cc.o): In function `testing::internal::FormatDeathTestOutput(std::string const&)':
gtest-all.cc:(.text+0xca9): undefined reference to `std::__throw_out_of_range_fmt(char const*, ...)'
gtest-all.cc:(.text+0xce3): undefined reference to `std::__throw_out_of_range_fmt(char const*, ...)'
../lib/libgtest.a(gtest-all.cc.o): In function `testing::TestResult::GetTestPartResult(int) const':
gtest-all.cc:(.text+0x1faf): undefined reference to `std::__throw_out_of_range_fmt(char const*, ...)'
../lib/libgtest.a(gtest-all.cc.o): In function `testing::TestResult::GetTestProperty(int) const':
gtest-all.cc:(.text+0x230f): undefined reference to `std::__throw_out_of_range_fmt(char const*, ...)'
../lib/libgtest.a(gtest-all.cc.o): In function `testing::internal::UnitTestImpl::ConfigureStreamingOutput()':
gtest-all.cc:(.text+0x1bffe): undefined reference to `std::__throw_out_of_range_fmt(char const*, ...)'
../lib/libgtest.a(gtest-all.cc.o):gtest-all.cc:(.text+0x1d2d7): more undefined references to `std::__throw_out_of_range_fmt(char const*, ...)' follow
../lib/libcaffe2.so: undefined reference to `std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)())'
../lib/libcaffe2.so: undefined reference to `__cxa_throw_bad_array_new_length'

Cause

The default gcc/g++ version on Ubuntu 14.04 is 4.8; it needs to be upgraded to 4.9:

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-4.9 g++-4.9
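Before and after the upgrade, the compiler version can be checked against the 4.9 minimum stated above. The following is a small sketch of the comparison only; the function name `version_at_least` is hypothetical, and the version string is expected to be dotted numbers such as the output of `gcc -dumpversion`.

```python
def version_at_least(version_text, minimum):
    """Compare a dotted version string with a minimum given as a tuple."""
    parts = tuple(int(p) for p in version_text.strip().split(".") if p.isdigit())
    return parts >= tuple(minimum)


if __name__ == "__main__":
    # Sample version strings, for illustration only.
    for text in ("4.8.4", "4.9.4"):
        verdict = "ok" if version_at_least(text, (4, 9)) else "too old"
        print("%s: %s" % (text, verdict))
```

The string can be obtained with `subprocess.check_output(["gcc", "-dumpversion"]).decode()`.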

Telling CMake which gcc/g++ version to use

Method 1: set environment variables

This does not always take effect.
export CC=/usr/local/bin/gcc-4.9
export CXX=/usr/local/bin/g++-4.9

Method 2: pass CMake parameters on the command line

cmake -D CMAKE_C_COMPILER=/usr/local/bin/gcc-4.9 -D CMAKE_CXX_COMPILER=/usr/local/bin/g++-4.9 ..

Method 3: set the variables in CMakeLists.txt

set(CMAKE_C_COMPILER "/usr/local/bin/gcc-4.9")
set(CMAKE_CXX_COMPILER "/usr/local/bin/g++-4.9")

To check whether Caffe2 was installed successfully (prints Success if so):

cd ~ && python -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"

To check whether the GPU is working (prints 1 if so):

python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'

Building protobuf >= 3.2.0 yourself

# CAFFE2=/path/to/caffe2
cd $CAFFE2/third_party/protobuf/cmake
mkdir -p build && cd build
cmake .. \
  -DCMAKE_INSTALL_PREFIX=$HOME/c2_tp_protobuf \
  -Dprotobuf_BUILD_TESTS=OFF \
  -DCMAKE_CXX_FLAGS="-fPIC"
make install

Then pass these flags when building Caffe2:

# add your other Caffe2 CMake flags to this command
cmake .. \
  -DPROTOBUF_PROTOC_EXECUTABLE=$HOME/c2_tp_protobuf/bin/protoc \
  -DPROTOBUF_INCLUDE_DIR=$HOME/c2_tp_protobuf/include \
  -DPROTOBUF_LIBRARY=$HOME/c2_tp_protobuf/lib64/libprotobuf.a

libgfortran.so.1 not found when installing mxnet

ImportError: libgfortran.so.1: cannot open shared object file: No such file or directory

Solution

conda install libgfortran=1

conda lets you specify the version to install.

Caffe build problem: ImportError: libopencv_core.so.3.4: cannot open shared object file: No such file or d...

Problem description

>>> import caffe
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xxx/Downloads/caffe/python/caffe/__init__.py", line 1, in <module>
    from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver, NCCL, Timer
  File "/home/xxx/Downloads/caffe/python/caffe/pycaffe.py", line 13, in <module>
    from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
ImportError: libopencv_core.so.3.4: cannot open shared object file: No such file or directory

Cause

There are two possibilities. Either the file does not exist on the system, in which case you have to track it down yourself; or the file exists but the dynamic linker does not know about it, in which case the following may help.

Attempted fix

~/Downloads/caffe$ sudo ldconfig

This did not solve the problem.
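To tell the two causes apart, one can look for the file in the directories the dynamic linker is likely to search. `ldconfig` only indexes the trusted system directories and those listed in /etc/ld.so.conf (including its included files), so a library installed under another prefix stays invisible until its directory is added there or to `LD_LIBRARY_PATH`. The sketch below is an illustration only; the function names and the list of extra directories are assumptions, not part of the original post.

```python
import os


def ld_library_path_dirs(env=None):
    """Return the non-empty entries of LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(":") if d]


def locate_library(filename, directories):
    """Return the directories that actually contain the given file."""
    return [d for d in directories if os.path.isfile(os.path.join(d, filename))]


if __name__ == "__main__":
    # Assumed extra candidates; adjust to where OpenCV was installed.
    candidates = ld_library_path_dirs() + ["/usr/local/lib", "/usr/lib", "/usr/lib64"]
    hits = locate_library("libopencv_core.so.3.4", candidates)
    print(hits if hits else "libopencv_core.so.3.4 not found in the candidate directories")
```

If the file turns up in a directory that is neither listed in the `ldconfig` configuration nor on `LD_LIBRARY_PATH`, adding that directory to one of the two and running `sudo ldconfig` again is a reasonable next step.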

 

