blog-2024-01-08-1brc-kotlin

i tried the [1brc] challenge in [kotlin]. the naive implementation was simple enough, though not exactly like the [java] version: kotlin's `groupingBy` function works differently, in that it returns a `Grouping` instance, which then provides the `aggregate`, `fold` and `reduce` functions.
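a minimal sketch of what `groupingBy` followed by `fold` can look like, assuming the usual 1brc line format `station;temperature`. `Stats` and `aggregateStations` are hypothetical names picked for illustration, and the parsing is deliberately naive:

```kotlin
// hypothetical accumulator for one station (illustration only)
data class Stats(val min: Double, val max: Double, val sum: Double, val count: Long) {
    fun add(v: Double) = Stats(minOf(min, v), maxOf(max, v), sum + v, count + 1)
    val mean: Double get() = sum / count
}

// assumes every line looks like "station;temperature"
fun aggregateStations(lines: Sequence<String>): Map<String, Stats> =
    lines
        .map { line ->
            val sep = line.indexOf(';')
            line.substring(0, sep) to line.substring(sep + 1).toDouble()
        }
        // groupingBy is lazy: it only returns a Grouping, nothing is iterated yet
        .groupingBy { it.first }
        // fold walks the sequence once and keeps one accumulator per key;
        // the operation is also applied to the first element of each group,
        // so the initial value starts with sum = 0 and count = 0
        .fold(
            initialValueSelector = { _, first -> Stats(first.second, first.second, 0.0, 0) },
            operation = { _, acc, element -> acc.add(element.second) }
        )
```

calling `aggregateStations(sequenceOf("a;1.0", "b;2.0", "a;3.0"))` yields one `Stats` entry per station.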

the use of a `Sequence` (akin to lazily evaluated java streams) is necessary, as a list wouldn't fit in memory. interestingly, kotlin's standard library doesn't provide parallel sequence processing (unlike java's `List.stream().parallel()`).
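to illustrate the lazy part, here is a small sketch that streams a file through a `Sequence` with `File.useLines`, so only one line at a time needs to be held in memory. `countPerStation` is a hypothetical helper, and the `station;temperature` line format is an assumption:

```kotlin
import java.io.File

// counts measurements per station without loading the whole file
fun countPerStation(file: File): Map<String, Int> =
    // useLines hands over a Sequence<String> and closes the reader afterwards
    file.useLines { lines ->
        lines
            .groupingBy { it.substringBefore(';') }
            .eachCount()
    }
```

the sequence is consumed sequentially by a single thread, which is exactly the limitation mentioned above.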

chunking the input into sub-lists for parallel processing isn't an option either, as the whole file would have to be loaded into RAM before chunking (which produces `List`s, not `Sequence`s). my next step, if i were to follow up on this, would be to divide the file into `n` regions, let one thread per region build a map, and finally merge the maps.
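a rough sketch of that region idea, untested against the real 1b-line file. everything here is an assumption for illustration: the names (`Stats`, `regionBounds`, `processRegion`, `aggregateParallel`), `\n`-terminated utf-8 lines in `station;temperature` format, and one pool thread per region:

```kotlin
import java.io.File
import java.io.FileInputStream
import java.io.RandomAccessFile
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// hypothetical mutable accumulator for one station
class Stats(
    var min: Double = Double.POSITIVE_INFINITY,
    var max: Double = Double.NEGATIVE_INFINITY,
    var sum: Double = 0.0,
    var count: Long = 0
) {
    fun add(v: Double) { min = minOf(min, v); max = maxOf(max, v); sum += v; count++ }
    fun merge(o: Stats) { min = minOf(min, o.min); max = maxOf(max, o.max); sum += o.sum; count += o.count }
}

// returns n + 1 offsets; every inner offset sits right after a '\n',
// so each region [bounds[i], bounds[i + 1]) starts at the beginning of a line
fun regionBounds(file: File, n: Int): List<Long> =
    RandomAccessFile(file, "r").use { raf ->
        val size = raf.length()
        val bounds = mutableListOf(0L)
        for (i in 1 until n) {
            var pos = maxOf(size * i / n, bounds.last())
            raf.seek(pos)
            // move forward until the newline that ends the current line
            while (pos < size && raf.read() != '\n'.code) pos++
            bounds += minOf(pos + 1, size)
        }
        bounds += size
        bounds
    }

// reads only the lines that start inside [start, end)
fun processRegion(file: File, start: Long, end: Long): Map<String, Stats> {
    val stats = HashMap<String, Stats>()
    FileInputStream(file).use { fis ->
        fis.channel.position(start) // jump to the region start before buffering
        val reader = fis.bufferedReader()
        var consumed = 0L
        while (start + consumed < end) {
            val line = reader.readLine() ?: break
            consumed += line.toByteArray().size + 1 // +1 for the '\n'
            val sep = line.indexOf(';')
            if (sep < 0) continue // skip lines that don't match the format
            stats.getOrPut(line.substring(0, sep)) { Stats() }
                .add(line.substring(sep + 1).toDouble())
        }
    }
    return stats
}

fun aggregateParallel(file: File, n: Int): Map<String, Stats> {
    val bounds = regionBounds(file, n)
    val pool = Executors.newFixedThreadPool(n)
    try {
        // each task builds its own map, so no locking is needed while reading
        val futures = (0 until n).map { i ->
            pool.submit(Callable { processRegion(file, bounds[i], bounds[i + 1]) })
        }
        val merged = HashMap<String, Stats>()
        for (f in futures) {
            f.get().forEach { (k, v) -> merged.getOrPut(k) { Stats() }.merge(v) }
        }
        return merged
    } finally {
        pool.shutdown()
    }
}
```

counting bytes per line via `toByteArray()` is wasteful and only there to keep the sketch short; a real attempt would probably parse raw bytes or memory-map the regions instead.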

interestingly, with 100k lines (for testing), on my machine [Hotspot] takes ~160ms, while graal's community native image completes in ~30ms (!) (excluding vm startup, but without [JIT] warmup).

even more interestingly, with 1b lines, it's the other way round: 200 sec for [GraalVM] vs. 115 sec for [Hotspot].

edited by: stefs at Monday, January 8, 2024, 2:49:42 PM Coordinated Universal Time

