Niko's Project Corner

Simulating gravitational field near a torus

(19th April 2016)

Many games put a strong emphasis on gravity, and some even simulate multi-body trajectories. However, they all seem to be based on spherical geometry (as planets are shaped by gravity), although other shapes should create interesting trajectories. Because a torus has rotational symmetry, its gravity field can be modelled on a 2D cross-section. In this project the torus's field is estimated in 3D and projected to 2D, and interpolation functions are fitted to it. The resulting space- and time-efficient model could be used in a game to run a physics simulation in real time.
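
As a rough illustration of the "estimate in 3D, project to 2D" idea, here is a minimal Python sketch (the project itself is in Matlab, and this is not its code). It assumes a solid torus of uniform density approximated by random point masses; the names `sample_torus` and `field_rz`, the radii and the unit constants are all made up for the example. Thanks to the rotational symmetry, only the radial and vertical components at a point (r, z) need to be kept.

```python
import math
import random

def sample_torus(n, R=1.0, a=0.3, seed=0):
    """Draw n points uniformly from a solid torus (major radius R, minor radius a)."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        # Rejection sampling from the bounding box of the torus.
        x = rng.uniform(-(R + a), R + a)
        y = rng.uniform(-(R + a), R + a)
        z = rng.uniform(-a, a)
        if (math.hypot(x, y) - R) ** 2 + z ** 2 <= a ** 2:
            points.append((x, y, z))
    return points

def field_rz(points, r, z, total_mass=1.0, G=1.0):
    """Return (g_r, g_z) at the cylindrical position (r, z).

    The field point is placed at angle 0, so the radial direction is the x axis.
    The tangential component is only sampling noise and is dropped.
    """
    m = total_mass / len(points)
    g_r = g_z = 0.0
    for px, py, pz in points:
        dx, dy, dz = px - r, py, pz - z
        d3 = (dx * dx + dy * dy + dz * dz) ** 1.5
        g_r += G * m * dx / d3
        g_z += G * m * dz / d3
    return g_r, g_z

points = sample_torus(5000)
print(field_rz(points, 0.0, 5.0))  # on the symmetry axis, well above the torus
```

Evaluating `field_rz` on a grid of (r, z) values outside the torus would give the kind of 2D table to which interpolation functions could then be fitted.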

Languages: Matlab
Tags: Applied mathematics

Nginx docker image for easy file access via HTTP

(16th April 2016)

I often find myself with an SSH connection to a remote server and wanting to retrieve some files to my own machine. Common methods for this include a Windows/Samba share, SSHFS and uploading to the cloud (which isn't trivial to do via plain cURL). An easy-to-use alternative is described here: a single-line command that loads and runs a Docker image containing a pre-configured Nginx instance. The files can then be accessed via plain HTTP at the user-assigned port (assuming a firewall isn't blocking it).

Languages: Bash
Tags: Docker Spark Nginx GitHub
GitHub: nikonyrh/docker-scripts
DockerHub: nikonyrh/nginx_bridge

Scalable analytics with Docker, Spark and Python

(23rd December 2015)

Traditionally, data scientists installed software packages directly on their machines, wrote code, trained models, saved results to local files and applied models to new data in a batch-processing style. New data-driven products require rapid development of new models, scalable training and easy integration with other aspects of the business. Here I propose one (perhaps already well-known) cloud-ready architecture to meet these requirements.

Languages: Bash Python
Tags: Architecture Docker Spark Nginx GitHub JVM
GitHub: nikonyrh/docker-scripts

Very fuzzy searching with CUDA

(2nd November 2015)

This is an alternative answer to a question I encountered on Stack Overflow about fuzzy searching of hashes in Elasticsearch. My original answer used locality-sensitive hashing. Superior speed and a simple implementation were achieved by using NVIDIA's CUDA via the Thrust library.
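
To give a feel for why a brute-force scan can be both simple and fast, here is a small Python sketch. It is not the CUDA/Thrust code from the project. It assumes that the hashes are fixed-length bit strings stored as integers and that the distance is the Hamming distance, which the summary above does not state; `hamming` and `brute_force_search` are names made up for the example. Every comparison is independent of the others, which is what would make such a scan map well onto a GPU.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def brute_force_search(hashes, query, max_distance=8):
    """Return (index, distance) pairs for all hashes within max_distance of query."""
    results = []
    for i, h in enumerate(hashes):
        d = hamming(h, query)
        if d <= max_distance:
            results.append((i, d))
    return results
```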

Languages: C++ CUDA
Tags: Thrust Databases GitHub Stack Overflow
GitHub: nikonyrh/stackoverflow-scripts

Very fuzzy searching with Elasticsearch

(21st October 2015)

I encountered an interesting question on Stack Overflow about fuzzy searching of hashes in Elasticsearch and decided to give it a go. Elasticsearch has native support for fuzzy text searches, but for performance reasons it only supports an edit distance of up to 2. In this context the maximum allowed distance was eight, so an alternative solution was needed. One was found in locality-sensitive hashing.
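
The general idea of locality-sensitive hashing can be sketched in a few lines of Python. This is an in-memory illustration using bit sampling, not the Elasticsearch-based solution from the answer. It assumes 64-bit integer hashes compared by Hamming distance, and the class name and default parameters are hypothetical. Each table keys a hash by a random subset of its bits, so similar hashes are likely to share a bucket in at least one table, and the candidates are then verified exactly.

```python
import random
from collections import defaultdict

class BitSamplingLSH:
    def __init__(self, n_bits=64, n_tables=20, bits_per_table=8, seed=0):
        rng = random.Random(seed)
        # One random bit mask per table; the masked bits form the bucket key.
        self.masks = []
        for _ in range(n_tables):
            mask = 0
            for pos in rng.sample(range(n_bits), bits_per_table):
                mask |= 1 << pos
            self.masks.append(mask)
        self.tables = [defaultdict(list) for _ in self.masks]
        self.hashes = []

    def add(self, h):
        idx = len(self.hashes)
        self.hashes.append(h)
        for mask, table in zip(self.masks, self.tables):
            table[h & mask].append(idx)

    def query(self, q, max_distance=8):
        # Collect candidates sharing a bucket with q in any table ...
        candidates = set()
        for mask, table in zip(self.masks, self.tables):
            candidates.update(table.get(q & mask, ()))
        # ... then keep only those that really are within max_distance.
        return sorted(i for i in candidates
                      if bin(self.hashes[i] ^ q).count("1") <= max_distance)
```

The search is approximate: a hash within the allowed distance can be missed if it differs from the query on at least one sampled bit in every table. More tables or fewer bits per table reduce that risk at the cost of more candidates to verify.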

Languages: Python
Tags: Elasticsearch Databases GitHub Stack Overflow
GitHub: nikonyrh/stackoverflow-scripts

Anonymous and secure information storing and sharing

(25th April 2015)

Nowadays encryption is standard practice on the web when data is in transit, and there are even a few services which offer client-side encryption and thus are truly end-to-end. Nevertheless, for some reason they all require you to create an account by providing your email and password, although this is not strictly necessary for storing and sharing data. In this system the document id, encryption key and HMAC key are generated ad hoc on the client, and only the minimal necessary information is revealed to the server. A live demo should be available at noknowledgenotes.nikonyrh.org.
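
A minimal sketch of the client-side secret generation, written in Python rather than the project's PHP and not taken from its code. The key sizes, the use of HMAC-SHA256 to authenticate the ciphertext and all function names are assumptions for illustration; the encryption step itself is left out because the Python standard library has no cipher for it.

```python
import hashlib
import hmac
import secrets

def new_document_secrets():
    """Generate the id and both keys locally; no account or server is involved."""
    return {
        "doc_id": secrets.token_urlsafe(16),   # random, unguessable identifier
        "enc_key": secrets.token_bytes(32),    # would key the cipher (not shown)
        "hmac_key": secrets.token_bytes(32),   # authenticates the ciphertext
    }

def sign(hmac_key, ciphertext):
    """Return a hex HMAC-SHA256 tag for the given ciphertext bytes."""
    return hmac.new(hmac_key, ciphertext, hashlib.sha256).hexdigest()

def verify(hmac_key, ciphertext, tag):
    """Check a tag in constant time."""
    return hmac.compare_digest(sign(hmac_key, ciphertext), tag)
```

In a setup like this, the server would only ever need the document id and the opaque ciphertext with its tag, while the keys stay with whoever holds the link to the document.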

Languages: PHP
Tags: GitHub Encryption
GitHub: nikonyrh/noknowledgenotes

Automated image capturing + API

(10th April 2015)

Out of interest in nature observation, computer vision, image processing and so forth, I developed an automated system to capture one photo per minute and store it on disk. The project also has Bash and PHP scripts coordinating external tools such as montage for image stitching and mencoder for video generation. PHP also provides an HTTP API for image generation and file size statistics.

Languages: Bash PHP
Tags: GitHub
GitHub: nikonyrh/webcammon

Approximating planets' orbits in closed-form

(12th October 2014)

I wanted to find or create a formula which accepts an epoch timestamp, latitude and longitude and produces the Sun's observed azimuth and altitude in radians. It needs to take into account details such as Earth's axial tilt and its position on its orbit around the Sun. To my surprise I wasn't able to find such a formula, so I had to develop it from scratch. Luckily Earth's orbit (and orbits in general) is a well-studied and documented problem, so I could take some shortcuts.
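
For comparison, the interface described above can be sketched with a common textbook approximation. This is not the formula developed in the project (which is in Matlab). It assumes a circular orbit, ignores the equation of time and atmospheric refraction, and the function name `sun_position` is made up for the example.

```python
import math
from datetime import datetime, timezone

def sun_position(epoch, lat_deg, lon_deg):
    """Approximate (azimuth, altitude) of the Sun in radians.

    Azimuth is measured clockwise from north, longitude is positive east.
    """
    t = datetime.fromtimestamp(epoch, tz=timezone.utc)
    hours = t.hour + t.minute / 60 + t.second / 3600
    day = t.timetuple().tm_yday - 1 + hours / 24  # days since 1 January, 00:00 UTC

    # Declination from the axial tilt; the +10 aligns the cosine with the
    # December solstice.
    decl = -math.radians(23.44) * math.cos(2 * math.pi * (day + 10) / 365)

    # Hour angle: zero at local solar noon, 15 degrees per hour.
    solar_hours = hours + lon_deg / 15
    hour_angle = math.radians(15 * (solar_hours - 12))

    # Direction of the Sun in local east / north / up coordinates.
    lat = math.radians(lat_deg)
    east = -math.cos(decl) * math.sin(hour_angle)
    north = (math.cos(lat) * math.sin(decl)
             - math.sin(lat) * math.cos(decl) * math.cos(hour_angle))
    up = (math.sin(lat) * math.sin(decl)
          + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))

    azimuth = math.atan2(east, north) % (2 * math.pi)
    altitude = math.asin(max(-1.0, min(1.0, up)))
    return azimuth, altitude
```

An approximation like this can be off by a few degrees, mostly because it ignores the equation of time.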

Languages: Matlab
Tags: Astronomy Applied mathematics

Automatic map stitching

(10th September 2014)

Nowadays there are many HTML5-based map services, but typically they don't offer any export functionality. To create a full view of the desired region, one can either zoom out (and lose map details) or take many screenshots of different locations and manually stitch them together. This project automatically loads all stored screenshots, detects the map, crops the relevant regions, determines the images' relative offsets and generates the high-res output, with zero configuration and for any map service.
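
The offset-detection step can be illustrated with a deliberately naive Python sketch. It is not the project's Matlab code: it tries every shift within a small window and keeps the one with the lowest mean absolute difference, whereas an FFT-based cross-correlation would typically be used to find the same peak much faster. Images are assumed to be grayscale lists of rows, and `best_offset` is a hypothetical name.

```python
def best_offset(a, b, max_shift):
    """Find (dy, dx) such that b[y][x] best matches a[y + dy][x + dx]."""
    h, w = len(a), len(a[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total = count = 0
            # Only visit pixels where the two images overlap for this shift.
            for y in range(max(0, -dy), min(len(b), h - dy)):
                for x in range(max(0, -dx), min(len(b[0]), w - dx)):
                    total += abs(a[y + dy][x + dx] - b[y][x])
                    count += 1
            if count and (best is None or total / count < best[0]):
                best = (total / count, dy, dx)
    return best[1], best[2]
```

Keeping `max_shift` well below the image size matters here, because a very small overlap can produce a misleadingly low error.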

Languages: Matlab
Tags: Computer Vision Rendering FFT

Publishing internal services behind a NAT

(1st September 2014)

Even in desktop applications it is becoming more and more common to provide HTTP-based APIs or full user interfaces. For example, BitTorrent's μTorrent and BitTorrent Sync don't have any built-in UI, and instead users just point their preferred web browser to http://localhost:8080 or http://localhost:8888. However, they typically lack HTTPS encryption, and each port needs to be configured on the NAT router individually. This solution uses an Nginx instance on a virtual machine to provide an HTTPS reverse proxy to all these services on a single port under different sub-domains.

Languages: Bash
Tags: Nginx NAT SSH

Image distortion estimation and compensation

(9th August 2014)

This project's goal was to automatically and robustly estimate and compensate for distortion in any receipt photo. The user would be able to just snap a photo, and OCR could accurately identify the bought products and their prices. However, this task is somewhat challenging because receipts tend to get crumpled and bent, so they won't lie nicely flat on a surface for easy analysis. This set of algorithms solves that problem and produces distortion-free thresholded images for the subsequent OCR step.

Languages: Matlab
Tags: Computer Vision

Cheap off-site backup at Amazon Glacier

(17th July 2014)

In addition to a mirrored and check-summed ZFS-based backup server, I wanted to have backups outside my premises to be safer against hazards such as burglary, fire and water damage. ZFS can already survive a single disk failure and can repair silent data corruption, but for important memories that isn't a sufficient level of protection. My ever-growing data set is currently 150k files with a total size of 520 GB. Amazon's Glacier seems to be the most cost-efficient solution with sophisticated APIs and SDKs.

Languages: Bash
Tags: AWS Encryption Backups

Real-time car tracking and counting

(7th June 2014)

From my office window I've got an unblocked side view of Ring Road I (Kehä I) in Espoo, Finland. It is one of the busiest roads in Finland, carrying up to 100,000 cars per day. I wanted to create a program which would receive a video feed from a webcam and process the images in real time on common hardware.

Languages: Matlab
Tags: Computer Vision FFT

Server monitoring and analytics

(26th April 2014)

Many server monitoring and logging systems already exist, but I was interested in developing and deploying my own. It was also a good chance to learn about Elasticsearch's aggregation queries (new in v1.0.0). Originally Elasticsearch was designed to provide scalable document-based storage and efficient search, but it is now gaining more capabilities. The project consists of a cron job which pushes new metrics to Elasticsearch, a RESTful JSON API for querying statistics on the recorded numbers, and a browser front end (based on Highcharts) which plots the results.

Languages: PHP
Tags: Elasticsearch Databases

Efficient in-memory analytical database

(1st December 2013)

Traditional databases such as MySQL are not designed to perform well on analytical queries, which require access to possibly all of the rows of the selected columns. This results in a full table scan which cannot benefit from any indexes. Column-oriented engines try to circumvent this issue, but I went one step further and made the storage column-value oriented, similar to an inverted index. This results in a 2 to 10× speedup over optimized columnar solutions and 80× the speed of MySQL.
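
The column-value-oriented layout can be sketched in Python as follows. This is a toy illustration of the inverted-index idea, not the project's C++ engine, and the class and method names are hypothetical. Each column stores, for every distinct value, the list of row ids holding that value, so counts and sums only need to touch each distinct value once.

```python
from collections import defaultdict

class ValueOrientedColumn:
    """Stores one column as a mapping from value to ascending row ids."""

    def __init__(self, values):
        self.rows_by_value = defaultdict(list)
        for row_id, value in enumerate(values):
            self.rows_by_value[value].append(row_id)

    def count_by_value(self):
        # A GROUP BY ... COUNT(*) reads only the list lengths.
        return {v: len(rows) for v, rows in self.rows_by_value.items()}

    def total(self):
        # SUM over the column: one multiplication per distinct value.
        return sum(v * len(rows) for v, rows in self.rows_by_value.items())

    def rows_where(self, predicate):
        # The predicate is evaluated once per distinct value, not once per row.
        rows = []
        for value, ids in self.rows_by_value.items():
            if predicate(value):
                rows.extend(ids)
        return sorted(rows)
```

For example, `ValueOrientedColumn([3, 1, 3, 2, 1, 3]).rows_where(lambda v: v >= 2)` evaluates the predicate three times instead of six. The benefit of a layout like this grows as the number of distinct values gets smaller relative to the number of rows.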

Languages: C++
Tags: FastCGI SQL Databases
