Niko's Project Corner


Bruteforcing Countdown numbers game with CUDA

(19th April 2023)

YouTuber Another Roof posed an interesting question in his video "The Surprising Maths of Britain's Oldest* Game Show". The challenge was stated in the video's description: "I want to see a list of the percentage of solvable games for ALL options of large numbers. Like I did for the 15 options of the form {n, n+25, n+50, n+75}, but for all of them. The options for large numbers should be four distinct numbers in the range from 11 to 100. As I said there are 2555190 such options so this will require a clever bit of code, but I think it’s possible!". His reference Python implementation would have taken 1055 days (about 25000 hours) of CPU time when all four large numbers are used in the game, but with CUDA and an RTX 4080 card it could be solved in just 0.8 hours, or about 31000x faster!

Languages: Python CUDA Numba
Tags: Applied mathematics
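
The figures quoted in the summary above can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch and not the article's CUDA code: it assumes the range 11 to 100 is inclusive (90 candidate numbers), and the variable names are made up for illustration.

```python
from math import comb

# Four distinct large numbers chosen from 11..100 inclusive (assumed): 90 candidates.
candidates = range(11, 101)
n_options = comb(len(candidates), 4)

# Run times as quoted in the summary.
cpu_hours = 1055 * 24          # 1055 days of CPU time for the reference Python code
gpu_hours = 0.8                # CUDA on an RTX 4080
speedup = cpu_hours / gpu_hours

# Average GPU time spent on a single choice of four large numbers.
ms_per_option = gpu_hours * 3600 * 1000 / n_options

print(n_options)               # 2555190, matching the number in the quote
print(cpu_hours)               # 25320, i.e. roughly 25000 hours
print(f"{speedup:.0f}x, {ms_per_option:.2f} ms per option")
```

The exact ratio comes out slightly above the rounded 31000x because the summary rounds 25320 hours down to 25000 before dividing.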

Very fuzzy searching with CUDA

(2nd November 2015)

This is an alternative answer to a question I encountered on Stack Overflow about fuzzy searching of hashes on Elasticsearch. My original answer used locality-sensitive hashing. Superior speed and a simpler implementation were gained by using Nvidia's CUDA via the Thrust library.

Languages: C++ CUDA
Tags: Thrust Databases GitHub Stack Overflow
GitHub: nikonyrh/stackoverflow-scripts

Real-time interest point tracking

(15th July 2013)

As mentioned in another article about omnidirectional cameras, the main topic of my Master's thesis was real-time interest point extraction and tracking on an omnidirectional image in a challenging forest environment. I found OpenCV's routines mostly rather slow and running in a single thread, so I ended up implementing everything myself to gain more control over the data flow and the threads' dependencies. The implemented code simultaneously uses 4 threads on the CPU and a few hundred on the GPU, executing interest point extraction and matching at 27 fps (37 ms/frame) for a 1800 × 360 pixel (≈0.65 Mpix) panoramic image.

Languages: C++ FFTW CUDA
Tags: Computer Vision FFT

CUDA realtime rendering engine

(9th July 2013)

So far I've written a basic rendering engine using Nvidia's CUDA (Compute Unified Device Architecture). It renders reflective surfaces with environmental mapping, anti-aliasing and motion blur at 200 fps, with minimal use of 3rd party libraries such as OpenGL. This let me fully implement the cross-platform rendering pipeline from data transfer to pixel-level RGB calculations, all in C-like syntax.

Languages: C++ CUDA SDL
Tags: Rendering