Benchmarking Elasticsearch and MS SQL on NYC Taxis (7th May 2017)

The NYC Taxi dataset has been used in quite a few benchmarks (for example by Mark Litwintschik), perhaps because it has a fairly rich set of columns whose meaning is mostly trivial to understand. I developed a Clojure project which generates Elasticsearch and SQL queries from three different filter templates and four different aggregation templates. This should give a decent indication of these databases' performance under a typical workload, although the test did not run queries concurrently and did not mix different query types within a single benchmark run. However, benchmarks are always tricky to design and execute properly, so I'm sure there is room for improvement. In this project the tested database engines were Elasticsearch 5.2.2 (with Oracle JVM 1.8.0_121) and MS SQL Server 2014.
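
To make the idea of pairing filter templates with aggregation templates concrete, here is a minimal Python sketch. It is not the Clojure project's code: the template functions, the `trips` table name, and the column names (`passenger_count`, `trip_distance`, `total_amount`, `payment_type`) are assumptions made for illustration, and it uses only two templates of each kind. Each template returns both an Elasticsearch fragment and the corresponding SQL fragment, so every filter and aggregation combination yields a comparable query pair.

```python
import itertools
import json

# Hypothetical filter templates: each returns
# (Elasticsearch filter clause, SQL WHERE fragment).
def filter_passengers(n):
    return {"term": {"passenger_count": n}}, f"passenger_count = {n}"

def filter_distance(lo, hi):
    return ({"range": {"trip_distance": {"gte": lo, "lt": hi}}},
            f"trip_distance >= {lo} AND trip_distance < {hi}")

# Hypothetical aggregation templates: each returns
# (Elasticsearch aggs body, SQL SELECT list, SQL GROUP BY column or None).
def agg_avg_total():
    return ({"result": {"avg": {"field": "total_amount"}}},
            "AVG(total_amount) AS result", None)

def agg_count_by_payment():
    # Note: a terms aggregation returns only the top buckets by default,
    # so it matches the SQL GROUP BY only for low-cardinality columns.
    return ({"result": {"terms": {"field": "payment_type"}}},
            "payment_type, COUNT(*) AS result", "payment_type")

def build_queries(flt, agg, table="trips"):
    """Combine one filter and one aggregation into an (ES body, SQL text) pair."""
    es_filter, sql_where = flt
    es_aggs, sql_select, sql_group = agg
    es_query = {"size": 0,
                "query": {"bool": {"filter": [es_filter]}},
                "aggs": es_aggs}
    sql = f"SELECT {sql_select} FROM {table} WHERE {sql_where}"
    if sql_group:
        sql += f" GROUP BY {sql_group}"
    return es_query, sql

filters = [filter_passengers(2), filter_distance(1, 5)]
aggs = [agg_avg_total(), agg_count_by_payment()]

# Every filter is paired with every aggregation.
queries = [build_queries(f, a) for f, a in itertools.product(filters, aggs)]

for es_query, sql in queries:
    print(json.dumps(es_query))
    print(sql)
```

In a real benchmark the filter parameters would be randomized for each query so that neither engine can answer from a cached result.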

Efficient in-memory analytical database (1st December 2013)

Traditional databases such as MySQL are not designed to perform well on analytical queries, which may need to access every row of the selected columns. Such a query results in a full table scan and cannot benefit from any indexes. Column-oriented engines try to circumvent this issue, but I went one step deeper and made the storage column-value oriented, similar to an inverted index. This results in a 2 to 10× speedup over optimized columnar solutions and 80× the speed of MySQL.
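
As a rough illustration of what "column-value oriented, similar to an inverted index" could mean, the Python sketch below maps every distinct value of every column to the set of row ids that contain it. This is one interpretation of the idea, not the actual engine: the function names, the use of Python sets instead of compressed id lists, and the sample rows are all assumptions. With this layout an equality filter becomes a set lookup, and a filtered `COUNT(*) ... GROUP BY` becomes a series of set intersections, without touching the original rows.

```python
from collections import defaultdict

def build_value_index(rows, columns):
    """For each column, map each distinct value to the set of row ids holding it."""
    index = {col: defaultdict(set) for col in columns}
    for row_id, row in enumerate(rows):
        for col in columns:
            index[col][row[col]].add(row_id)
    return index

def count_by(index, group_col, where=None):
    """COUNT(*) GROUP BY group_col, with optional equality filters {col: value}."""
    matching = None  # None means "no filter applied yet"
    for col, value in (where or {}).items():
        ids = index[col].get(value, set())
        matching = ids if matching is None else matching & ids
    counts = {}
    for value, ids in index[group_col].items():
        n = len(ids) if matching is None else len(ids & matching)
        if n:  # skip groups that have no matching rows
            counts[value] = n
    return counts

# Hypothetical sample data, for illustration only.
rows = [
    {"country": "FI", "device": "mobile"},
    {"country": "FI", "device": "desktop"},
    {"country": "SE", "device": "mobile"},
    {"country": "FI", "device": "mobile"},
]
index = build_value_index(rows, ["country", "device"])
print(count_by(index, "device", where={"country": "FI"}))
```

A production engine would likely store sorted integer arrays or bitmaps rather than hash sets, since those intersect faster and use far less memory.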