Knowledge Base/CUDA/Installation

Acknowledgements

Most of the work of getting CUDA up and running on Linux was done by Jeremy Cohen, who produced the original instructions; these were later edited by Paul Bilokon.

Background

What is CUDA?

NVIDIA® CUDA™ is a general purpose parallel computing architecture that leverages the parallel compute engine in NVIDIA graphics processing units (GPUs) to solve many complex computational problems. It includes the CUDA Instruction Set Architecture (ISA) and the parallel compute engine in the GPU.

For more information, you should definitely visit NVIDIA's excellent CUDA Zone website: http://www.nvidia.com/object/cuda_home.html

Which programming languages are supported?

To program for the CUDA architecture, developers currently use C with NVIDIA's extensions. NVIDIA are planning to add support for other programming languages, including FORTRAN and C++.
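
To give a flavour of what CUDA C looks like, here is a minimal sketch of our own (not taken from NVIDIA's documentation): a kernel that adds two vectors, using the __global__ qualifier and the <<<blocks, threads>>> launch syntax that CUDA adds to C. It would be compiled with, e.g., nvcc add.cu -o add.

// add.cu -- our own minimal CUDA C sketch. The __global__ qualifier and
// the <<<blocks, threads>>> launch syntax are CUDA's extensions to C.
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per element
    if (i < n) c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1024;
    size_t bytes = n * sizeof(float);
    float ha[1024], hb[1024], hc[1024];
    for (int i = 0; i < n; ++i) { ha[i] = i; hb[i] = 2 * i; }

    float *da, *db, *dc;
    cudaMalloc((void **)&da, bytes);
    cudaMalloc((void **)&db, bytes);
    cudaMalloc((void **)&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    add<<<(n + 255) / 256, 256>>>(da, db, dc, n);   // 4 blocks of 256 threads

    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    printf("c[100] = %f (expected 300)\n", hc[100]);

    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}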

Who is using CUDA?

NVIDIA have sold over 100 million CUDA-enabled GPUs so far (as of 29 August, 2009).

Which GPUs are CUDA-enabled?

Here is the list so far (29 August, 2009):

  • NVIDIA GeForce 8, 9, 100, and 200-series GPUs with a minimum of 256MB of local graphics memory. — These are generally regarded as lower-priced gaming cards, though they are still very powerful. They are often used by NVIDIA to roll out the newest GPUs and architectures, such as the GeForce GTX 295 card.
    • GeForce GTX 295
    • GeForce GTX 285
    • GeForce GTX 285 for Mac
    • GeForce GTX 280 — Supports double precision. The multiprocessor has eight single-precision floating-point ALUs (one per core) but only one double-precision ALU (shared by the eight cores). Thus, for applications whose execution time is dominated by floating-point computation, switching from single to double precision will increase the runtime by a factor of approximately eight. For applications which are memory-bound, enabling double precision will only decrease performance by a factor of about two. (A code sketch illustrating this trade-off follows this list.)
    • GeForce GTX 275
    • GeForce GTX 260 — Supports double precision; the same eight-to-one ALU ratio and performance considerations as for the GeForce GTX 280 above apply.
    • GeForce GTS 250
    • GeForce GTS 240
    • GeForce GT 220
    • GeForce G210
    • GeForce GTS 150
    • GeForce GT 130
    • GeForce GT 120
    • GeForce G100
    • GeForce 9800 GX2
    • GeForce 9800 GTX+
    • GeForce 9800 GTX
    • GeForce 9800 GT
    • GeForce 9600 GSO
    • GeForce 9600 GT
    • GeForce 9500 GT
    • GeForce 9400GT
    • GeForce 8800 Ultra
    • GeForce 8800 GTX
    • GeForce 8800 GTS
    • GeForce 8800 GT
    • GeForce 8800 GS
    • GeForce 8600 GTS
    • GeForce 8600 GT
    • GeForce 8500 GT
    • GeForce 8400 GS
    • GeForce 9400 mGPU
    • GeForce 9300 mGPU
    • GeForce 8300 mGPU
    • GeForce 8200 mGPU
    • GeForce 8100 mGPU
  • NVIDIA GeForce mobile products. — NVIDIA GeForce variants for mobile devices.
    • GeForce GTX 280M
    • GeForce GTX 260M
    • GeForce GTS 260M
    • GeForce GTS 250M
    • GeForce GTS 160M
    • GeForce GTS 150M
    • GeForce GT 240M
    • GeForce GT 230M
    • GeForce GT 130M
    • GeForce G210M
    • GeForce G110M
    • GeForce G105M
    • GeForce G102M
    • GeForce 9800M GTX
    • GeForce 9800M GT
    • GeForce 9800M GTS
    • GeForce 9800M GS
    • GeForce 9700M GTS
    • GeForce 9700M GT
    • GeForce 9650M GS
    • GeForce 9600M GT
    • GeForce 9600M GS
    • GeForce 9500M GS
    • GeForce 9500M G
    • GeForce 9400M G
    • GeForce 9300M GS
    • GeForce 9300M G
    • GeForce 9200M GS
    • GeForce 9100M G
    • GeForce 8800M GTS
    • GeForce 8700M GT
    • GeForce 8600M GT
    • GeForce 8600M GS
    • GeForce 8400M GT
    • GeForce 8400M GS
  • NVIDIA Quadro. — Compared with NVIDIA GeForce, these products offer corporate pricing, better CAD support, more thorough testing, and more memory. However, they tend to use the same GPUs (e.g. the Quadro FX 5800 uses the same GPU as the GeForce GTX 280).
    • Quadro FX 5800
    • Quadro FX 5600
    • Quadro FX 4800 — see NVIDIA Quadro FX 4800: Workstation Graphics At Its Finest?, an article by Uwe Scheffel.
    • Quadro FX 4800 for Mac
    • Quadro FX 4700 X2
    • Quadro FX 4600
    • Quadro FX 3800
    • Quadro FX 3700
    • Quadro FX 1800
    • Quadro FX 1700
    • Quadro FX 580
    • Quadro FX 570
    • Quadro FX 470
    • Quadro FX 380
    • Quadro FX 370
    • Quadro FX 370 Low Profile
    • Quadro CX
    • Quadro NVS 450
    • Quadro NVS 420
    • Quadro NVS 295
    • Quadro NVS 290
    • Quadro Plex 2100 D4
    • Quadro Plex 2200 D2
    • Quadro Plex 2100 S4
    • Quadro Plex 1000 Model IV
  • NVIDIA Quadro mobile products. — NVIDIA Quadro variants for mobile devices.
    • Quadro FX 3700M
    • Quadro FX 3600M
    • Quadro FX 2700M
    • Quadro FX 1700M
    • Quadro FX 1600M
    • Quadro FX 770M
    • Quadro FX 570M
    • Quadro FX 370M
    • Quadro FX 360M
    • Quadro NVS 320M
    • Quadro NVS 160M
    • Quadro NVS 150M
    • Quadro NVS 140M
    • Quadro NVS 135M
    • Quadro NVS 130M
  • NVIDIA Tesla. — These products are CUDA computing cards with no video output. Tesla C1060, for example, is about the same as GeForce GTX 280, and pretty much exactly the same as Quadro FX 5800.
    • Tesla S1070
    • Tesla C1060
    • Tesla C870
    • Tesla D870
    • Tesla S870
  • NVIDIA ION. — These products were designed for compact, low-power PCs with CPUs like the Intel Atom; NVIDIA claim that these GPUs deliver up to 10X the performance of comparable basic PC designs.

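Since the eight-to-one single- to double-precision ratio noted for the GTX 260/280 above matters a great deal for numerical work, a code sketch may help. This is our own illustration, not NVIDIA's code; on compute-capability-1.3 cards such as the GTX 260/280, the double-precision kernel must be compiled with nvcc -arch=sm_13.

// dp_demo.cu -- our own sketch, not from the SDK. Build for double
// precision support with: nvcc -arch=sm_13 dp_demo.cu -o dp_demo
#include <stdio.h>
#include <cuda_runtime.h>

// The same SAXPY computation in float and in double. On GT200-class parts
// the float version can use all eight SP ALUs per multiprocessor; the
// double version serialises onto the single DP ALU, so its peak arithmetic
// throughput is roughly one eighth of the float version's.
__global__ void saxpy_f(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

__global__ void saxpy_d(int n, double a, const double *x, double *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    float  *xf, *yf;
    double *xd, *yd;
    cudaMalloc((void **)&xf, n * sizeof(float));
    cudaMalloc((void **)&yf, n * sizeof(float));
    cudaMalloc((void **)&xd, n * sizeof(double));
    cudaMalloc((void **)&yd, n * sizeof(double));
    cudaMemset(xf, 0, n * sizeof(float));  cudaMemset(yf, 0, n * sizeof(float));
    cudaMemset(xd, 0, n * sizeof(double)); cudaMemset(yd, 0, n * sizeof(double));

    saxpy_f<<<(n + 255) / 256, 256>>>(n, 2.0f, xf, yf);
    saxpy_d<<<(n + 255) / 256, 256>>>(n, 2.0,  xd, yd);
    cudaThreadSynchronize();   // CUDA 2.x-era synchronisation call

    printf("Ran float and double SAXPY over %d elements\n", n);
    cudaFree(xf); cudaFree(yf); cudaFree(xd); cudaFree(yd);
    return 0;
}

Note that SAXPY itself is memory-bound, so in practice the observed slowdown would be closer to the factor of two mentioned above than to eight; the eight-fold penalty applies to arithmetic-dominated kernels.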

This list is bound to go out of date very quickly, so you should check the authoritative list here: http://www.nvidia.com/object/cuda_learn_products.html

Installation on Linux: bob05.doc.ic.ac.uk

We shall now describe the installation and setup process for the NVIDIA CUDA software on the machines at the Department of Computing, Imperial College London, that we used for the 11 September, 2009, Thalesian Workshop. We believe that these notes may be of use to others, so we publish them here. Of course, many of the issues that we faced are configuration-specific.

The entire installation in this case could be performed remotely. So we logged into bob05.doc.ic.ac.uk from a Windows machine using PuTTY.

Our System

We are running Ubuntu 8.04.

bob05% uname --all

gives us

Linux bob05 2.6.24-19-generic #1 SMP Fri Jul 11 23:41:49 UTC 2008 i686 GNU/Linux
bob05% cat /proc/cpuinfo

tells us, among other things,


model name      : Intel(R) Core(TM)2 Duo CPU     E7400  @ 2.80GHz
cpu MHz         : 1600.000
cache size      : 3072 KB

and

bob05% cat /proc/meminfo

tells us, among other things,

MemTotal:      3368008 kB
MemFree:       2882276 kB

What about the video card?

bob05% lspci

tells us that we have

01:00.0 VGA compatible controller: nVidia Corporation Unknown device 06e0 (rev a1)

The device shows as "Unknown", probably because the PCI ID database on this system predates the card. Is an NVIDIA driver loaded and, if so, which one?

bob05% lsmod | grep nv

shows us

nvidia               7825536  24
agpgart                34760  1 nvidia
i2c_core               24832  1 nvidia

We have found an NVIDIA driver. Which version is this?

bob05% /sbin/modprobe -l nvidia

We have

/lib/modules/2.6.24-19-generic/volatile/nvidia.ko

The path tells us the kernel version (2.6.24-19-generic) and, from the volatile directory, that this is the stock restricted driver shipped with Ubuntu's linux-restricted-modules. (Once an NVIDIA driver is loaded, its exact version can also be read from /proc/driver/nvidia/version.) The stock driver is not CUDA-enabled, so we have to install a new one.

Installing a CUDA-enabled driver

We are going to make a directory for all installation packages:

bob05% mkdir -p ~pb401/thalesians/workshops/2009-09-11/install

Downloading the driver

Let us download the CUDA-enabled driver from http://www.nvidia.com/object/cuda_get.html and put it under this directory.

We have chosen to download the CUDA Driver, with

  • Operating System: Linux 32-bit
  • Linux Version: Ubuntu 8.04

We have downloaded CUDA 2.2 NVIDIA Driver for Linux (Ubuntu 8.04) 185.18.14, NVIDIA-Linux-x86-185.18.14-pkg1.run.

Stopping the X server

First we need to stop the X server if one is running. Is it?

bob05% ps aux | grep /usr/bin/X

Yes, it is:

root      7189  0.1  0.6 811904 20568 tty7     SLs+ 21:38   0:02 /usr/bin/X :0 -br -audit 0 -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7

We need to kill the X server as root, so

bob05% ksu

which will show something like

bob05% ksu
WARNING: Your password may be exposed if you enter it here and are logged
         in remotely using an unsecure (non-encrypted) channel.
Kerberos password for pb401/root@DOC.IC.AC.UK: :
Authenticated pb401/root@DOC.IC.AC.UK
Account root: authorization for pb401/root@DOC.IC.AC.UK successful
Changing uid to root (0)

Now let's stop the X server:

root@bob05:/homes/pb401# /etc/init.d/gdm stop

and...

 * Stopping GNOME Display Manager...                                   [ OK ]

just as we wanted.

When we check again with

bob05% ps aux | grep /usr/bin/X

we see that there is no such process.

Driver installation

Next,

root@bob05:/homes/pb401# cd /homes/pb401/thalesians/workshops/2009-09-11/install/
root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# sh NVIDIA-Linux-x86-185.18.14-pkg1.run

You should see

Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86 185.18.14........
................................................................................
................................................................................
................................................................................
................................................

Then a console application ("NVIDIA Software Installer for Unix/Linux") will come up with a text-based message box:

Please read the following LICENSE and then select either "Accept" to accept the license and continue with the installation, or select
"Do Not Accept" to abort the installation.

We choose "Accept" (navigate using arrow keys and press [Enter]).

Another message will appear:

No precompiled kernel interface was found to match your kernel; would you like the installer to attempt to download a kernel interface
for your kernel from the NVIDIA ftp site (ftp://download.nvidia.com)?

Our answer is "Yes".

We get

No matching precompiled kernel interface was found on the NVIDIA ftp site; this means that the installer will need to compile a kernel
interface for your kernel.

Well, so be it. We select "OK" (there is, of course, no other option).

Then it will go through a number of steps:

  • Building kernel module
  • Searching for conflicting X files
  • Searching for conflicting OpenGL files
  • Installing NVIDIA Accelerated Graphics Driver for Linux-x86 (185.18.14)

And then another message box will appear:

Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be
used when you restart X?  Any pre-existing X configuration file will be backed up.

We select "Yes".

Hopefully you will then see (as we did):

Your X configuration file has been successfully updated.  Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86 (version:
185.18.14) is now complete.

and select "OK".

Driver status check

So far, so good. However,

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# /sbin/modprobe -l nvidia

still shows

/lib/modules/2.6.24-19-generic/volatile/nvidia.ko
/lib/modules/2.6.24-19-generic/kernel/drivers/video/nvidia.ko

so it looks like the old driver is still present. A reboot will be needed eventually, but not just yet.

If we were to reboot now, the system would still pick up the old NVIDIA driver, and any attempt to run the NVIDIA CUDA examples (more on them later) would report that there is no CUDA-capable card in the machine. We can verify this by running

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# /sbin/modprobe -nv nvidia

If the response is

install /sbin/lrm-video nvidia

then the configuration is still pointing to the old driver. This brings us to our next step...

Disabling the original NVIDIA driver

Let us remove the current driver module:

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# /sbin/modprobe -r nvidia

And unmount the "volatile" filesystem containing the old drivers:

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# umount /lib/modules/2.6.24-19-generic/volatile

And

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# /usr/sbin/update-rc.d -f linux-restricted-modules-common remove

which should result in

 Removing any system startup links for /etc/init.d/linux-restricted-modules-common ...
   /etc/rc0.d/S01linux-restricted-modules-common
   /etc/rc6.d/S01linux-restricted-modules-common
   /etc/rcS.d/S07linux-restricted-modules-common

The file lrm-video in /etc/modprobe.d tells modprobe to load the nvidia module via the /sbin/lrm-video script. Remove this file, then run depmod to update the module dependencies so that the new CUDA driver is picked up:

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# rm /etc/modprobe.d/lrm-video
root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# /sbin/depmod

Now running

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# /sbin/modprobe -nv nvidia

we shall see that modprobe would install a different module, the CUDA-enabled driver. The response will be

insmod /lib/modules/2.6.24-19-generic/kernel/drivers/i2c/i2c-core.ko
insmod /lib/modules/2.6.24-19-generic/ubuntu/char/intel-agp-ich9m/agpgart.ko
insmod /lib/modules/2.6.24-19-generic/kernel/drivers/video/nvidia.ko NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=44 NVreg_DeviceFileMode=0660

Adding the driver to /etc/modules

Rebooting the machine at this stage would result in X failing to start, so we make sure the NVIDIA driver is loaded on boot by adding nvidia to /etc/modules:

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# echo "nvidia" >> /etc/modules

Rebooting the machine

Now we can reboot the machine:

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# /sbin/reboot
Broadcast message from pb401@bob05
        (/dev/pts/0) at 22:34 ...

The system is going down for reboot NOW!

Another driver status check

We have rebooted bob05. Let's check:

bob05% /sbin/modprobe -l nvidia

This is now showing

/lib/modules/2.6.24-19-generic/kernel/drivers/video/nvidia.ko

instead of

/lib/modules/2.6.24-19-generic/volatile/nvidia.ko

Looks like we are in business.

However, a proper test would be to run the CUDA SDK examples on this machine. Therefore we proceed to our next step.

Installing the CUDA Toolkit

Downloading the toolkit

Go back to http://www.nvidia.com/object/cuda_get.html and download the CUDA Toolkit.

Select

  • Operating System: Linux 32-bit
  • Linux Version: Ubuntu 8.04

and download the CUDA Toolkit 2.2 for Linux (Ubuntu 8.04).

We have saved the file, cudatoolkit_2.2_linux_32_ubuntu8.04.run, under ~pb401/thalesians/workshops/2009-09-11/install.

Running the installer

Again we need to

bob05% ksu
WARNING: Your password may be exposed if you enter it here and are logged
         in remotely using an unsecure (non-encrypted) channel.
Kerberos password for pb401/root@DOC.IC.AC.UK: :
Authenticated pb401/root@DOC.IC.AC.UK
Account root: authorization for pb401/root@DOC.IC.AC.UK successful
Changing uid to root (0)

Now we can

root@bob05:/homes/pb401# cd ~pb401/thalesians/workshops/2009-09-11/install/
root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# sh cudatoolkit_2.2_linux_32_ubuntu8.04.run
Verifying archive integrity... All good.
Uncompressing NVIDIA CUDA.......................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
.......................................
Enter install path (default /usr/local/cuda, '/cuda' will be appended):

Accept the default install path, /usr/local/cuda (simply press [Enter]).

You will then see

...
`man/man1' -> `/usr/local/cuda/man/man1'
`man/man1/nvcc.1' -> `/usr/local/cuda/man/man1/nvcc.1'
`lib' -> `/usr/local/cuda/lib'
`lib/libcufftemu.so.2.2' -> `/usr/local/cuda/lib/libcufftemu.so.2.2'
`lib/libcublasemu.so.2.2' -> `/usr/local/cuda/lib/libcublasemu.so.2.2'
`lib/libcublasemu.so.2' -> `/usr/local/cuda/lib/libcublasemu.so.2'
`lib/libcufftemu.so' -> `/usr/local/cuda/lib/libcufftemu.so'
`lib/libcublas.so.2' -> `/usr/local/cuda/lib/libcublas.so.2'
`lib/libcufftemu.so.2' -> `/usr/local/cuda/lib/libcufftemu.so.2'
`lib/libcudart.so.2' -> `/usr/local/cuda/lib/libcudart.so.2'
`lib/libcublas.so' -> `/usr/local/cuda/lib/libcublas.so'
`lib/libcufft.so' -> `/usr/local/cuda/lib/libcufft.so'
`lib/libcufft.so.2' -> `/usr/local/cuda/lib/libcufft.so.2'
`lib/libcufft.so.2.2' -> `/usr/local/cuda/lib/libcufft.so.2.2'
`lib/libcublasemu.so' -> `/usr/local/cuda/lib/libcublasemu.so'
`lib/libcudart.so.2.2' -> `/usr/local/cuda/lib/libcudart.so.2.2'
`lib/libcudart.so' -> `/usr/local/cuda/lib/libcudart.so'
`lib/libcublas.so.2.2' -> `/usr/local/cuda/lib/libcublas.so.2.2'
`include' -> `/usr/local/cuda/include'
`include/texture_types.h' -> `/usr/local/cuda/include/texture_types.h'
`include/cudaGL.h' -> `/usr/local/cuda/include/cudaGL.h'
`include/driver_types.h' -> `/usr/local/cuda/include/driver_types.h'
`include/cufft.h' -> `/usr/local/cuda/include/cufft.h'
`include/math_functions_dbl_ptx1.h' -> `/usr/local/cuda/include/math_functions_dbl_ptx1.h'
`include/sm_12_atomic_functions.h' -> `/usr/local/cuda/include/sm_12_atomic_functions.h'
`include/common_functions.h' -> `/usr/local/cuda/include/common_functions.h'
`include/cuda.h' -> `/usr/local/cuda/include/cuda.h'
`include/host_defines.h' -> `/usr/local/cuda/include/host_defines.h'
`include/cublas.h' -> `/usr/local/cuda/include/cublas.h'
`include/common_types.h' -> `/usr/local/cuda/include/common_types.h'
`include/device_types.h' -> `/usr/local/cuda/include/device_types.h'
`include/driver_functions.h' -> `/usr/local/cuda/include/driver_functions.h'
`include/cuda_runtime_api.h' -> `/usr/local/cuda/include/cuda_runtime_api.h'
`include/sm_11_atomic_functions.h' -> `/usr/local/cuda/include/sm_11_atomic_functions.h'
`include/cuComplex.h' -> `/usr/local/cuda/include/cuComplex.h'
`include/builtin_types.h' -> `/usr/local/cuda/include/builtin_types.h'
`include/host_config.h' -> `/usr/local/cuda/include/host_config.h'
`include/cuda_runtime.h' -> `/usr/local/cuda/include/cuda_runtime.h'
`include/channel_descriptor.h' -> `/usr/local/cuda/include/channel_descriptor.h'
`include/math_constants.h' -> `/usr/local/cuda/include/math_constants.h'
`include/vector_functions.h' -> `/usr/local/cuda/include/vector_functions.h'
`include/vector_types.h' -> `/usr/local/cuda/include/vector_types.h'
`include/crt' -> `/usr/local/cuda/include/crt'
`include/crt/device_runtime.h' -> `/usr/local/cuda/include/crt/device_runtime.h'
`include/crt/storage_class.h' -> `/usr/local/cuda/include/crt/storage_class.h'
`include/crt/host_runtime.h' -> `/usr/local/cuda/include/crt/host_runtime.h'
`include/crt/func_macro.h' -> `/usr/local/cuda/include/crt/func_macro.h'
`include/device_functions.h' -> `/usr/local/cuda/include/device_functions.h'
`include/math_functions.h' -> `/usr/local/cuda/include/math_functions.h'
`include/__cudaFatFormat.h' -> `/usr/local/cuda/include/__cudaFatFormat.h'
`include/texture_fetch_functions.h' -> `/usr/local/cuda/include/texture_fetch_functions.h'
`include/math_functions_dbl_ptx3.h' -> `/usr/local/cuda/include/math_functions_dbl_ptx3.h'
`include/device_launch_parameters.h' -> `/usr/local/cuda/include/device_launch_parameters.h'
`include/sm_13_double_functions.h' -> `/usr/local/cuda/include/sm_13_double_functions.h'
`include/cuda_gl_interop.h' -> `/usr/local/cuda/include/cuda_gl_interop.h'
`include/cuda_texture_types.h' -> `/usr/local/cuda/include/cuda_texture_types.h'
`open64' -> `/usr/local/cuda/open64'
`open64/lib' -> `/usr/local/cuda/open64/lib'
`open64/lib/bec' -> `/usr/local/cuda/open64/lib/bec'
`open64/lib/inline' -> `/usr/local/cuda/open64/lib/inline'
`open64/lib/be' -> `/usr/local/cuda/open64/lib/be'
`open64/lib/gfec' -> `/usr/local/cuda/open64/lib/gfec'
`open64/bin' -> `/usr/local/cuda/open64/bin'
`open64/bin/nvopencc' -> `/usr/local/cuda/open64/bin/nvopencc'
`bin' -> `/usr/local/cuda/bin'
`bin/bin2c' -> `/usr/local/cuda/bin/bin2c'
`bin/cudafe++' -> `/usr/local/cuda/bin/cudafe++'
`bin/cudafe' -> `/usr/local/cuda/bin/cudafe'
`bin/ptxvars.cu' -> `/usr/local/cuda/bin/ptxvars.cu'
`bin/nvcc' -> `/usr/local/cuda/bin/nvcc'
`bin/ptxas' -> `/usr/local/cuda/bin/ptxas'
`bin/fatbin' -> `/usr/local/cuda/bin/fatbin'
`bin/nvcc.profile' -> `/usr/local/cuda/bin/nvcc.profile'
`bin/filehash' -> `/usr/local/cuda/bin/filehash'
`src' -> `/usr/local/cuda/src'
`src/fortran.c' -> `/usr/local/cuda/src/fortran.c'

========================================

* Please make sure your PATH includes /usr/local/cuda/bin
* Please make sure your LD_LIBRARY_PATH includes /usr/local/cuda/lib
*   or add /usr/local/cuda/lib to /etc/ld.so.conf and run ldconfig as root

* Please read the release notes in /usr/local/cuda/doc/

* To uninstall CUDA, delete /usr/local/cuda
* Installation Complete

Configuring the toolkit

First, we need to add the lib directory, /usr/local/cuda/lib, to /etc/ld.so.conf:

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# echo "/usr/local/cuda/lib" >> /etc/ld.so.conf

then run

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# /sbin/ldconfig

so this change is picked up.

Fixing the file permissions

You may have problems with the default permissions assigned to the installed files. The executable and library files must be accessible by non-root users to run the CUDA examples. As a quick workaround, you should make at least the following changes in order to run the examples and build CUDA applications:

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11/install# cd /usr/local/cuda
root@bob05:/usr/local/cuda# find . -perm 700 -exec chmod 755 {} \;
root@bob05:/usr/local/cuda# find . -perm 600 -exec chmod 644 {} \;

Addressing the device node issue

We are almost there, but there is still a minor problem: the device nodes /dev/nvidiactl and /dev/nvidia0 that are used to access the video card have their permissions set to 660. This results in an error if the user attempting to run CUDA code is not a member of the group video. One way to fix this is to add oneself to the video group by editing /etc/group; however, that change will be undone when /etc/group is overwritten by the Imperial College London Department of Computing system maintenance processes. An alternative is to chmod 666 /dev/nvidiactl and chmod 666 /dev/nvidia0, which is also undone on a reboot. For now, we

root@bob05:/usr/local/cuda# chmod 666 /dev/nvidiactl
root@bob05:/usr/local/cuda# chmod 666 /dev/nvidia0

bearing in mind that these changes will be undone on the next reboot. (A permanent solution is yet to be found.)
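
If the nodes are not accessible, the symptom is that the very first CUDA runtime call fails. A small check of our own (the file name devcheck.cu is hypothetical) makes the failure explicit rather than mysterious:

// devcheck.cu -- our own sanity check, not part of the toolkit or SDK.
// Build with: nvcc devcheck.cu -o devcheck
// If /dev/nvidiactl or /dev/nvidia0 are not readable and writable by the
// current user, the first runtime call will typically fail, often with a
// "no CUDA-capable device" style error.
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Visible CUDA devices: %d\n", count);
    return 0;
}

Running it once as root and once as an ordinary user confirms whether both can reach the device.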

Installing the CUDA SDK code examples

Downloading the examples

Let's go back to http://www.nvidia.com/object/cuda_get.html to download the CUDA SDK code examples. Select

  • Operating System: Linux 32-bit
  • Linux Version: Ubuntu 8.04

and download the "CUDA SDK 2.2.1 code samples for Linux (Ubuntu 8.04)".

We place the file cudasdk_2.21_linux.run under /homes/pb401/thalesians/workshops/2009-09-11/install.

Running the installer

Next:

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11# exit
exit
bob05% cd ~pb401/thalesians/workshops/2009-09-11/
bob05% mkdir -p NVIDIA/CUDA/SDK/V2.21
bob05% cd install
bob05% sh cudasdk_2.21_linux.run
Verifying archive integrity... All good.
Uncompressing NVIDIA CUDA SDK...................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
..............................................................

Enter install path (default ~/NVIDIA_CUDA_SDK):

We enter

~/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21

Then:

Located CUDA at /usr/local/cuda
If this is correct, choose the default below.
If it is not correct, enter the correct path to CUDA

Enter CUDA install path (default /usr/local/cuda):

Indeed, this is correct (we installed CUDA Toolkit to the default location), so we press [Enter].

Then a lot of files will be installed and eventually

`sdk/releaseNotesData/GEF9_2D_wte.gif' ->
    `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/releaseNotesData/GEF9_2D_wte.gif'
`sdk/releaseNotesData/tesla.gif' ->
    `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/releaseNotesData/tesla.gif'
`sdk/releaseNotesData/GEFGTX200_2D_wte.gif' ->
    `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/releaseNotesData/GEFGTX200_2D_wte.gif'
`sdk/tools' ->
    `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/tools'
`sdk/tools/CUDA_Occupancy_calculator.xls' ->
    `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/tools/CUDA_Occupancy_calculator.xls'

========================================

Configuring SDK Makefile (/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/common/common.mk)...

========================================

* Please make sure your PATH includes /usr/local/cuda/bin
* Please make sure your LD_LIBRARY_PATH includes /usr/local/cuda/lib

* To uninstall the NVIDIA CUDA SDK, please delete /homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21
* Installation Complete

Configuring the environment

To ensure that /usr/local/cuda/bin has been added to PATH and /usr/local/cuda/lib has been added to LD_LIBRARY_PATH, we

bob05% emacs .cshrc

(as we are using the tcsh shell in this case) to make sure that we have

set path = ( \
  . \
  /usr/bin \
  .
  .
  .
  /usr/local/cuda/bin \
)

setenv LD_LIBRARY_PATH /usr/local/cuda/lib

(In tcsh the lowercase path shell variable is tied to the PATH environment variable; LD_LIBRARY_PATH, on the other hand, must be exported with setenv, not set, so that it is visible to the programs we run.)

Then we close our PuTTY window and reconnect to bob05 to make sure that everything is clean.
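
After reconnecting, a quick smoke test confirms that nvcc is found on the PATH and that the runtime library resolves at run time. This is our own minimal example (the file name smoke.cu is hypothetical):

// smoke.cu -- our own smoke test. Build and run with:
//   nvcc smoke.cu -o smoke && ./smoke
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void square(int *x) { *x = (*x) * (*x); }  // runs on the GPU

int main(void)
{
    int h = 7, *d;
    cudaMalloc((void **)&d, sizeof(int));
    cudaMemcpy(d, &h, sizeof(int), cudaMemcpyHostToDevice);
    square<<<1, 1>>>(d);                       // a single thread is enough here
    cudaMemcpy(&h, d, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    printf("7 squared on the GPU: %d\n", h);   // expect 49
    return (h == 49) ? 0 : 1;
}

If the binary compiles but then fails to start with an error about loading libcudart.so.2, LD_LIBRARY_PATH (or /etc/ld.so.conf) has not been set up correctly.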

Prerequisite packages

These packages are now present on all the Workshop machines, including bob05. However, in the past we worked out that, in order to make the examples, we needed...

root@bob05:/homes/pb401/thalesians/workshops/2009-09-11# apt-get install build-essential g++ libsdl1.2debian libsdl1.2-dev libgl1-mesa-dev libglu1-mesa-dev libsdl-image1.2 libsdl-image1.2-dev libxi-dev libxmu-dev glutg3-dev

For convenience, here is a list of the packages that you may need in order to make all the examples successfully (our next step):

  • build-essential — an informational list of the packages considered essential for building Debian packages; installing it pulls in the basic compiler toolchain. (The authoritative definition of build-essential is in the Debian Policy Manual; this package merely depends on the packages on the list.)
  • g++ — a freely redistributable C++ compiler. It is part of GCC, the GNU compiler collection.
  • libsdl1.2debian — SDL is a library that allows programs portable low level access to a video framebuffer, audio output, mouse, and keyboard.
  • libsdl1.2-dev — This package contains the files needed to compile and link programs which use SDL.
  • libgl1-mesa-swx11 (optional?) — Mesa is a 3-D graphics library with an API which is very similar to that of OpenGL. To the extent that Mesa utilizes the OpenGL command syntax or state machine, it is being used with authorization from Silicon Graphics, Inc. However, the author makes no claim that Mesa is in any way a compatible replacement for OpenGL or associated with Silicon Graphics, Inc.
  • libgl1-mesa-dev — This package includes headers and static libraries for compiling programs with Mesa.
  • libglu1-mesa (optional?) — GLU offers simple interfaces for building mipmaps; checking for the presence of extensions in the OpenGL (or other libraries which follow the same conventions for advertising extensions); drawing piecewise-linear curves, NURBS, quadrics and other primitives (including, but not limited to, teapots); tesselating surfaces; setting up projection matrices and unprojecting screen coordinates to world coordinates.
  • libglu1-mesa-dev — Includes headers and static libraries for compiling programs with GLU.
  • libsdl-image1.2 — This is a simple library to load images of various formats as SDL surfaces. This library currently supports BMP, PPM, PCX, GIF, JPEG, PNG, TIFF, and XPM formats.
  • libsdl-image1.2-dev — This package contains the include files and static libraries required to support development using the SDL 1.2 image loading library.
  • libxi6 (optional?) — libXi provides an X Window System client interface to the XINPUT extension to the X protocol. The Input extension allows setup and configuration of multiple input devices, and will soon allow input devices to be hotplugged, i.e. added and removed on the fly.
  • libxi-dev — This package contains the development headers for the library found in libxi6.
  • libxmu6 (optional?) — libXmu provides a set of miscellaneous utility convenience functions for X libraries to use.
  • libxmu-dev — This package contains the development headers for the library found in libxmu6.
  • glutg3 (optional?) — GLUT (as in "gluttony") is a window system independent toolkit for writing OpenGL programs. It implements a simple windowing API, which makes life considerably easier when learning about and exploring OpenGL programming.
  • glutg3-dev — This package contains the development headers for the library found in glutg3.

make-ing the SDK examples

bob05% cd /homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21
bob05% make

Eventually you should see

make -C projects/MonteCarlo/
make[1]: Entering directory `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/projects/MonteCarlo'
make[1]: Leaving directory `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/projects/MonteCarlo'
make -C projects/BlackScholes/
make[1]: Entering directory `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/projects/BlackScholes'
make[1]: Leaving directory `/homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/projects/BlackScholes'
Finished building all

Testing

It's time to test our installation.

deviceQuery

bob05% cd /homes/pb401/thalesians/workshops/2009-09-11/NVIDIA/CUDA/SDK/V2.21/bin/linux/release
bob05% ./deviceQuery

Here is the output that we got:

CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: "GeForce 9300 GE"
  CUDA Capability Major revision number:         1
  CUDA Capability Minor revision number:         1
  Total amount of global memory:                 267714560 bytes
  Number of multiprocessors:                     1
  Number of cores:                               8
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.30 GHz
  Concurrent copy and execution:                 No
  Run time limit on kernels:                     Yes
  Integrated:                                    No
  Support host page-locked memory mapping:       No
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit...

Excellent!
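
The figures above come from the runtime API call cudaGetDeviceProperties; a minimal sketch of the same query (our own, not the SDK's deviceQuery source) is below. Note the reported compute capability of 1.1: double precision requires compute capability 1.3 (e.g. GTX 260/280), so this GeForce 9300 GE is single-precision only.

// devprops.cu -- our own cut-down analogue of deviceQuery.
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int d = 0; d < count; ++d) {
        struct cudaDeviceProp p;
        cudaGetDeviceProperties(&p, d);
        printf("Device %d: \"%s\"\n", d, p.name);
        printf("  Compute capability:      %d.%d\n", p.major, p.minor);
        printf("  Global memory:           %lu bytes\n", (unsigned long)p.totalGlobalMem);
        printf("  Multiprocessors:         %d\n", p.multiProcessorCount);
        printf("  Shared memory per block: %lu bytes\n", (unsigned long)p.sharedMemPerBlock);
        printf("  Warp size:               %d\n", p.warpSize);
        printf("  Max threads per block:   %d\n", p.maxThreadsPerBlock);
        printf("  Clock rate:              %.2f GHz\n", p.clockRate * 1e-6);  // clockRate is in kHz
    }
    return 0;
}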

BlackScholes

Another test:

bob05% ./BlackScholes

Here is the output:

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.
Executing Black-Scholes GPU kernel (512 iterations)...
Options count             : 8000000
BlackScholesGPU() time    : 23.268410 msec
Effective memory bandwidth: 3.438138 GB/s
Gigaoptions per second    : 0.343814
Reading back GPU results...
Checking the results...
...running CPU calculations.
Comparing the results...
L1 norm: 1.991672E-07
Max absolute error: 1.239777E-05
TEST PASSED
Shutting down...
...releasing GPU memory.
...releasing CPU memory.
Shutdown done.

Press ENTER to exit...
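
For orientation, here is a stripped-down sketch of the kind of kernel this example benchmarks. This is our own simplification, not the SDK's actual code: each thread prices one European call via the Black-Scholes formula C = S N(d1) - X exp(-rT) N(d2), with the cumulative normal N approximated by the usual Abramowitz and Stegun polynomial.

// bs_sketch.cu -- our own simplification of a Black-Scholes call-pricing
// kernel; not the SDK's code. Each thread prices one option.
#include <stdio.h>
#include <cuda_runtime.h>

__device__ float cnd(float d)            // cumulative normal distribution,
{                                        // Abramowitz & Stegun polynomial fit
    const float A1 = 0.31938153f, A2 = -0.356563782f, A3 = 1.781477937f,
                A4 = -1.821255978f, A5 = 1.330274429f;
    float k = 1.0f / (1.0f + 0.2316419f * fabsf(d));
    float c = rsqrtf(2.0f * 3.14159265f) * expf(-0.5f * d * d) *
              (k * (A1 + k * (A2 + k * (A3 + k * (A4 + k * A5)))));
    return d > 0 ? 1.0f - c : c;
}

__global__ void blackScholesCall(float *call, const float *S, const float *X,
                                 const float *T, float r, float v, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float sqrtT = sqrtf(T[i]);
    float d1 = (logf(S[i] / X[i]) + (r + 0.5f * v * v) * T[i]) / (v * sqrtT);
    float d2 = d1 - v * sqrtT;
    call[i] = S[i] * cnd(d1) - X[i] * expf(-r * T[i]) * cnd(d2);
}

int main(void)
{
    // Price a single at-the-money call: S=100, X=100, T=1y, r=5%, vol=20%.
    float hS = 100.0f, hX = 100.0f, hT = 1.0f, hC = 0.0f;
    float *dS, *dX, *dT, *dC;
    cudaMalloc((void **)&dS, sizeof(float));
    cudaMalloc((void **)&dX, sizeof(float));
    cudaMalloc((void **)&dT, sizeof(float));
    cudaMalloc((void **)&dC, sizeof(float));
    cudaMemcpy(dS, &hS, sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dX, &hX, sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dT, &hT, sizeof(float), cudaMemcpyHostToDevice);
    blackScholesCall<<<1, 1>>>(dC, dS, dX, dT, 0.05f, 0.20f, 1);
    cudaMemcpy(&hC, dC, sizeof(float), cudaMemcpyDeviceToHost);
    printf("Call price: %f (expect roughly 10.45)\n", hC);
    cudaFree(dS); cudaFree(dX); cudaFree(dT); cudaFree(dC);
    return 0;
}

The CPU comparison in the output above reruns the same arithmetic serially on the host; the L1 norm and maximum absolute error quantify the difference between the two result sets.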

oceanFFT

Now let's run a test that produces graphical output. Please note that we are using the PC X Server Xming 6.9.0.31 and we enabled X11 forwarding in PuTTY as per our Knowledge Base instructions.

However, when we try

bob05% ./oceanFFT

we get a blank X server window and

[CUDA FFT Ocean Simulation]

Left mouse button          - rotate
Middle mouse button        - pan
Left + middle mouse button - zoom
'w' key                    - toggle wireframe
[CUDA FFT Ocean Simulation]
freeglut (./oceanFFT): Unable to create direct context rendering for window 'CUDA FFT Ocean Simulation'
This may hurt performance.
ERROR: Support for necessary OpenGL extensions missing.
Press ENTER to exit...

The reason is most likely that OpenGL over X11 forwarding uses indirect rendering, which does not provide the OpenGL extensions the example requires. The example runs successfully when executed locally on the machine.
