CudaParticles v0.2

I've created a simple particle system powered by NVIDIA CUDA as a class project for Hardware for Computer Graphics.

Win32 binaries: CudaParticles_v0.2.zip [1MB]





Scene 1: 4 million particles with random velocity


Scene 2: 1 million particles with 3 magnetic accelerating fountain-jets and an angry sweeper :), 4x multisampling, and simple alpha blending of particles (no sorting), running at 100 FPS at 1920x1080 on a GeForce GTX 460


It even runs on an ION netbook (65,536 particles) at ~40 FPS.


User documentation


CudaParticles v0.2
==================

The angry sweeper is actually an invisible ball that dives in and out.
The accelerating fountain-jets attract particles on the ground and accelerate them into the air.

Controls:
---------

 Mouse:
  Hold a mouse button to control the angry sweeper.
  The sweeper is easier to control with a static camera (press the "c" key).

 Keyboard:
  spacebar: play/pause
  c : static/dynamic camera
  m : enable/disable multisampling (works only when the program was started with the --multisample option)
  t : enable/disable blending of particles
  v : turn on/off VSYNC
  f : fullscreen on/off
  ESC, q : exit
  s : enable/disable Phong shader on fountain-jets
  * : turn on the testing CPU kernel
  r : reset scene

Program parameters:
-------------------

CudaParticles [--multisample] [--renderFloor] [--cpukernel] [--help]
  Turn on multisampling: --multisample
  Turn on floor rendering: --renderFloor
  Turn on testing CPU kernel: --cpukernel
  Show help: --help
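
A minimal sketch of how flags like these could be parsed, in plain C++. It is not the program's actual code: the `Options` struct and the `parse_args` function are hypothetical names, and the sketch assumes the `--multisample` spelling used in the keyboard section.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical option set mirroring the documented flags.
struct Options {
    bool multisample = false;
    bool renderFloor = false;
    bool cpuKernel = false;
    bool help = false;
    std::vector<std::string> unknown;  // arguments that matched no flag
};

// Parse the arguments that follow the program name.
inline Options parse_args(const std::vector<std::string>& args) {
    Options opt;
    for (const std::string& a : args) {
        if (a == "--multisample") opt.multisample = true;
        else if (a == "--renderFloor") opt.renderFloor = true;
        else if (a == "--cpukernel") opt.cpuKernel = true;
        else if (a == "--help") opt.help = true;
        else opt.unknown.push_back(a);  // keep it so the caller can report it
    }
    return opt;
}
```

From `main`, the call could look like `parse_args(std::vector<std::string>(argv + 1, argv + argc))`. Collecting unrecognized arguments instead of silently dropping them makes a mistyped flag easy to report.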
  
  
Notes:
------

It is worth comparing the blended and non-blended versions with the "t" key.
You can reset the scene at any time with the "r" key and play/pause (spacebar) with a static or dynamic camera ("c" key).
It is also nice to hold the mouse button and let the fountain-jets actually work :)
If you see a "hiccup!" message, either your CPU/GPU is slow or there was a hiccup in the system.


Benchmarks


// --------------------------------------------
// GPU BENCHMARK 01 - Random velocity
// Scene: 4,194,304 particles
// 
// GeForce GTX 460 @ 1.5 GHz
// -------------------------------------------

//dim3 block(32, 32, 1);  // 44 FPS
//dim3 block(16, 16, 1);  // 52 FPS
//dim3 block(8, 8, 1);    // 41 FPS

//dim3 block(1024, 1, 1); // 43 FPS
//dim3 block(512, 2, 1);  // 43 FPS
//dim3 block(256, 4, 1);  // 44 FPS
//dim3 block(128, 8, 1);  // 42 FPS
//dim3 block(16, 64, 1);  // 41 FPS
//dim3 block(8, 128, 1);  // 38 FPS
//dim3 block(4, 256, 1);  // 27 FPS
//dim3 block(2, 512, 1);  // 13 FPS
//dim3 block(1, 1024, 1); // 6 FPS

//dim3 block(32, 4, 1);   // 51 FPS
//dim3 block(32, 2, 1);   // 45 FPS
//dim3 block(128, 1, 1);  // 51 FPS
//dim3 block(512, 1, 1);  // 51 FPS
	
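dim3 block(256, 1, 1); // 52 FPS
// Integer division: this covers every particle only when a_width and
// a_height are multiples of block.x and block.y.
dim3 grid(a_width / block.x, a_height / block.y, 1);

// execute the kernel
kernel<<<grid, block>>>(pos, dvel, dcol, a_width, a_height, delta, time);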
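
One plausible reading of the block-shape numbers above is memory coalescing. CUDA numbers the threads of a block with x varying fastest and groups every 32 consecutive threads into a warp, so a narrow `block.x` spreads one warp over many rows of the particle array. The sketch below counts those rows in plain host-side C++. It assumes the kernel stores particle (x, y) at index `y * width + x`, which the source does not show; `rows_per_warp` is a hypothetical helper name.

```cpp
#include <cassert>
#include <set>

// Number of distinct rows of a row-major particle array touched by the first
// warp of a block. Assumption (kernel body not shown in the source): each
// thread handles particle (x, y) stored at index y * width + x.
inline unsigned rows_per_warp(unsigned block_x, unsigned block_y,
                              unsigned warp_size = 32) {
    assert(block_x > 0 && block_y > 0);
    std::set<unsigned> rows;
    const unsigned threads = block_x * block_y;
    for (unsigned tid = 0; tid < warp_size && tid < threads; ++tid) {
        // CUDA linearizes threads with x fastest: tid = x + y * block_x,
        // so the row (y) of thread tid is tid / block_x.
        rows.insert(tid / block_x);
    }
    return static_cast<unsigned>(rows.size());
}
```

Under that assumption, `block(256, 1, 1)` and `block(32, 32, 1)` keep a warp within one row, while `block(4, 256, 1)`, `block(2, 512, 1)` and `block(1, 1024, 1)` spread it over 8, 16 and 32 rows. That tracks the drop from 27 to 13 to 6 FPS. The count does not explain every measurement (16x16 reaches 52 FPS with two rows per warp), so occupancy and scheduling presumably matter as well.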

// =======================================================================
// CPU test
// 
// Default scene: 1,048,576 particles
// ----------------------------------
// avg CPU kernel call time for Core2Duo @ 2.8 GHz:      ~500ms ->   2 FPS
// avg GPU kernel call time for GeForce GTX 460 @ 1.5 GHz: ~8ms -> 125 FPS
// 
// -----------------------------------------------------------------------
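
The FPS figures in the CPU test appear to be derived from the average kernel call time alone (1000 ms divided by the time per call), so they presumably leave out rendering. A small sketch of that arithmetic follows; `fps_from_ms` and `speedup` are hypothetical helper names.

```cpp
#include <cassert>

// Frames per second if each frame costs `ms` milliseconds of kernel time.
inline double fps_from_ms(double ms) {
    assert(ms > 0.0);
    return 1000.0 / ms;
}

// How many times faster the second timing is than the first.
inline double speedup(double slow_ms, double fast_ms) {
    assert(fast_ms > 0.0);
    return slow_ms / fast_ms;
}
```

With the timings above, `fps_from_ms(500.0)` gives 2 and `fps_from_ms(8.0)` gives 125, and `speedup(500.0, 8.0)` puts the GPU kernel at 62.5 times the speed of the CPU kernel for the 1,048,576-particle scene.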
	

2011, Petr Kadlecek