Hi, this is a friend of Farsthary,
Some months ago, Farsthary was talking about OpenCL, an open, multi-platform programming framework (still maturing) that harnesses the computing power of all the available computing devices (CPU, recent graphics cards from ATI or NVIDIA…), giving you the power of a small cluster in a single computer.
Indeed, recent graphics cards allow general-purpose computing instead of only graphical operations, and are perfectly suited to massively parallel operations.
If your program is not bound by memory transfers and capacity, you can expect a 5- to 10-fold speed boost (single precision) from a recent graphics card compared with a standard quad-core CPU, at roughly similar prices (around $100) and power consumption (100-150 W).
As it is only a matter of months before Farsthary begins to investigate this, here is a procedure for those who want to set up their environment to test OpenCL source code. I will give the one that works on my system (Ubuntu 9.10 32-bit, ATI 4770) and will try to update it for other systems later:
* download and install the latest Catalyst drivers for OpenCL support. In the terminal:
sudo sh ./ati-driver-installer-10-2-x86.x86_64.run
..and follow the instructions
* download the latest SDK (you must register first (free); you can also check there whether your graphics card is an OpenCL device), extract it in your home directory (e.g. /home/farsthary for him) and compile it (type “make” in the terminal once you are in the uncompressed folder). You now have the OpenCL libraries, but your system is not yet aware of them (so if you run OpenCL code, no devices will be detected). To fix that, follow the installation notes (Win, OSX, Linux).
For my system (32-bit):
When you load an OpenCL program, it looks for the OpenCL libraries in the default system folder, so you need to create symbolic links between that folder and the SDK folder containing the libraries. First create the appropriate folders:
sudo mkdir /usr/lib/OpenCL
sudo mkdir /usr/lib/OpenCL/vendors
Then create the link:
sudo ln -sf ~/ati-stream-sdk-v2.01-lnx32/lib/x86/libatiocl32.so /usr/lib/OpenCL/vendors/libatiocl32.so
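If no device is detected afterwards, a common cause is a link that points to a library that is no longer there (for example, after moving the SDK folder). Here is a minimal sketch of how such a link could be checked. The helper name check_vendor_link is made up for this illustration, and the demo runs in a temporary directory that only imitates the layout above, so it does not touch /usr/lib:

```python
import os
import tempfile


def check_vendor_link(link_path):
    """Classify a vendor library link as 'ok', 'missing', 'broken' or 'not a link'.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    if not os.path.lexists(link_path):
        return "missing"
    if not os.path.islink(link_path):
        return "not a link"
    # os.path.exists follows the link, so it is False for a dangling link.
    return "ok" if os.path.exists(link_path) else "broken"


# Demo in a throwaway directory that imitates the folders created above.
with tempfile.TemporaryDirectory() as root:
    sdk_lib = os.path.join(root, "libatiocl32.so")  # stands in for the SDK library
    open(sdk_lib, "w").close()
    vendors = os.path.join(root, "OpenCL", "vendors")
    os.makedirs(vendors)
    link = os.path.join(vendors, "libatiocl32.so")
    os.symlink(sdk_lib, link)  # same idea as `ln -sf`

    status_good = check_vendor_link(link)
    os.remove(sdk_lib)  # simulate an SDK folder that was moved or deleted
    status_dangling = check_vendor_link(link)
    status_absent = check_vendor_link(os.path.join(vendors, "nope.so"))

print(status_good, status_dangling, status_absent)
```

On a real installation you would call check_vendor_link("/usr/lib/OpenCL/vendors/libatiocl32.so") instead of running the demo.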
You can now test OpenCL samples, such as the didactic examples from the great David Bucciarelli (aka Dade, LuxRender developer). As an example, download SmallptGPU2 (for Windows users, the executable is already included), an unbiased raytracer that uses all the available OpenCL devices. First, you need to install some additional libraries (Boost Threads: look for and install libboost-thread1.40-dev in Synaptic). Then you must edit the Makefile: replace the path variable with yours:
..and replace the boost library name (-lboost_thread-mt-1_38_0) with “-lboost_thread-mt” if the compilation fails.
.. and compile it (“make”) and launch it (“./smallptGPU”): at start-up, the program detects my OpenCL devices and their properties in order to share the computational load. In my case this prints:
OpenCL Platform 0: Advanced Micro Devices, Inc.
OpenCL Device name 0: AMD Phenom(tm) II X4 925 Processor
OpenCL Device type 0: TYPE_CPU
OpenCL Device units 0: 4
OpenCL Device name 1: ATI RV770
OpenCL Device type 1: TYPE_GPU
OpenCL Device units 1: 8
So there are two computing devices, and my graphics card (RV770 = ATI 4770) is equivalent in this case to an eight-core processor (120 stream units per processor in this case).
With the CPU alone (2.8 GHz), around 840 samples (12 rays per sample) are computed per second.
With only the graphics card (processors at 750 MHz, no memory limitations), around 4500 samples/s. So it is as if I had 5 additional Phenom II 925s in my rig!
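The "5 additional Phenom II" figure is simply the ratio of the two measured rates. A small sketch of the arithmetic, using only the numbers quoted above (the function name speedup is just for illustration):

```python
def speedup(gpu_rate, cpu_rate):
    """Return how many times faster the GPU rate is than the CPU rate."""
    return gpu_rate / cpu_rate


cpu_samples_per_s = 840   # CPU alone, as measured above
gpu_samples_per_s = 4500  # graphics card alone, as measured above
rays_per_sample = 12

ratio = speedup(gpu_samples_per_s, cpu_samples_per_s)
cpu_rays_per_s = cpu_samples_per_s * rays_per_sample
gpu_rays_per_s = gpu_samples_per_s * rays_per_sample

print(f"GPU vs CPU: {ratio:.2f}x")
print(f"CPU: {cpu_rays_per_s} rays/s, GPU: {gpu_rays_per_s} rays/s")
```

The ratio comes out a little above 5, which is where the comparison with five extra CPUs comes from.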
Here are examples of areas of Blender that would quickly benefit from such a technology:
* physics, as most of the algorithms do massively repetitive tasks, perfectly suited for stream processors:
– Newtonian physics, SPH: the Bullet library created by Erwin Coumans and used in Blender is currently being rewritten to take advantage of OpenCL, thanks to support from AMD.
– Smoke simulator: Daniel has said that he is interested in porting his implementation to OpenCL.
* Surface calculation: very fast subdivision surface (subsurf) modifier, useful for high-quality animation previz (cf. checking for mesh intersections, expressions)
* Rendering of course, but this would require a lot of manpower for significant speed-up:
But don’t forget: good algorithms come first (cf. V-Ray, which outperforms many competitors running on more powerful platforms), then hardware.
As a simple example, if you want to calculate the sum of the numbers from one to n, the naive algorithm uses a loop (on the order of n calculations), while the good algorithm is based on the math: n(n+1)/2 (a few calculations, cf. Carl Friedrich Gauss).
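To make the comparison concrete, here is a short sketch of both approaches (the function names are chosen for this illustration). Both return the same result, but the second needs a fixed handful of operations whatever the value of n:

```python
def sum_loop(n):
    """Naive approach: add the numbers 1..n one by one (n additions)."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_gauss(n):
    """Closed form n(n+1)/2: one multiplication, one addition, one division."""
    return n * (n + 1) // 2


n = 100
print(sum_loop(n), sum_gauss(n))
```

No amount of hardware makes the loop competitive with the formula for large n, which is the point: pick the algorithm first, then throw the graphics card at it.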
UPDATE1: for the Windows version, just install the latest ATI drivers and the SDK, and follow the installation note mentioned above (the document explains the steps for all the different systems). If you want to compile everything on Windows, the process is more tedious (not difficult, but more steps if you are not familiar with it) as you have to set up a lot of things. You can search Google for MinGW to get Linux-like compilation under Windows. However, the OpenCL example from Dade already contains the Windows binary, so you don’t need to compile the program. As long as you have the OpenCL drivers properly installed, you will be able to launch future OpenCL-tweaked builds from Farsthary.
UPDATE2: I don’t have a configuration with an NVIDIA card to test with, but here is the link with fresh drivers, SDK and information.