"Whereas Europeans generally pronounce my name the right way ('Ni-klows Wirt'), Americans invariably mangle it into 'Nick-les Worth'. This is to say that Europeans call me by name, but Americans call me by value."
rest in peace.
Pascal was the first programming language in which i could program productively.
i can't really say i'm truly happy with EclipseStore, because i didn't grasp some of the core concepts behind it, couldn't get the necessary information from the docs and haven't gotten involved with the community yet. while it worked flawlessly, i always had the nagging feeling of using it the wrong way.
the biggest problem is the binary format and its opaqueness. with a traditional database system i've got external tools to inspect and manipulate the content - and then there's the unreasonable effectiveness of plain text.
the pepperino wiki engine was, initially, a demo for testing EclipseStore, but now i ported it to a plain text file based backend. the file format is quite simple: one line for the title, one line for the date, one line for the username and then the content lines prefixed by a character until an empty line. this format is only practical if all pages can be kept in memory after an initial loading/parsing phase but afterwards, storing a new page is just a file append operation.
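a minimal java sketch of the format described above, not the actual pepperino code: all names (PageLog, Page, format, parse, append) are hypothetical, and the prefix character '|' is an assumption since the real one isn't stated here. the prefix is what allows empty content lines inside a page, because a prefixed line is never empty.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; only the record layout comes from the text above.
final class PageLog {
    // Assumed prefix character for content lines; the real one is not stated.
    static final char PREFIX = '|';

    record Page(String title, String date, String user, List<String> content) {}

    /** Serializes one page: title, date, user, prefixed content lines, empty line. */
    static String format(Page p) {
        StringBuilder sb = new StringBuilder();
        sb.append(p.title()).append('\n')
          .append(p.date()).append('\n')
          .append(p.user()).append('\n');
        for (String line : p.content()) {
            sb.append(PREFIX).append(line).append('\n');
        }
        return sb.append('\n').toString();
    }

    /** Parses all pages from the lines of a page file (the initial loading phase). */
    static List<Page> parse(List<String> lines) {
        List<Page> pages = new ArrayList<>();
        int i = 0;
        while (i + 2 < lines.size()) {
            String title = lines.get(i++);
            String date = lines.get(i++);
            String user = lines.get(i++);
            List<String> content = new ArrayList<>();
            while (i < lines.size() && !lines.get(i).isEmpty()) {
                content.add(lines.get(i++).substring(1)); // strip the prefix
            }
            i++; // skip the empty separator line
            pages.add(new Page(title, date, user, content));
        }
        return pages;
    }

    /** Storing a new page is a plain file append. */
    static void append(Path file, Page p) throws IOException {
        Files.writeString(file, format(p), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

on startup, something like PageLog.parse(Files.readAllLines(file)) would rebuild the in-memory page list; later revisions of a page simply appear further down in the file.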
The thing is that tests are code, and all code is technical debt, but unlike normal code tests can grow without bounds, and they don't always get seen as a something with a cost because they don't get shipped to production.
-- https://www.brandons.me/blog/thoughts-on-testing
i use tests sparingly. for complicated, algorithmic tasks i use test driven design, but otherwise i've stopped striving for high coverage. too many of the tests i see are badly written and more of a liability than a source of value; they exist to satisfy code coverage requirements.
The_Egg, a very short very fun story by Andy_Weir: https://galactanet.com/oneoff/theegg_mod.html
https://github.com/binkley/modern-java-practices HN
Modern Java/JVM Build Practices is an article-as-repo on building modern Java/JVM projects using Gradle and Maven, and a starter project for Java.
The focus is best build practices and project hygiene.
https://www.morling.dev/blog/one-billion-row-challenge/ HN / github
read one billion rows of city: temperature tuples and calculate the min/max/avg. i can't say i like it as it's almost purely a test of string parsing speed and little else.
the results are staggering though: the naive java solution with Files.lines and a stream collector takes 4m13s, while the currently fastest solutions take less than 8s.
it's also interesting that graalvm takes most of the top spots.
see 1brc
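for reference, a sketch of what such a naive Files.lines plus collector solution can look like. this is not the challenge's baseline code: the class and record names are hypothetical, the ';' separator is an assumption about the input format, and it uses the stock summarizingDouble collector instead of a hand-written one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.DoubleSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class NaiveAverages {
    // Hypothetical value type for one input row.
    record Measurement(String city, double temperature) {
        // Assumes rows of the form "city;temperature".
        static Measurement parse(String line) {
            int sep = line.indexOf(';');
            return new Measurement(line.substring(0, sep),
                    Double.parseDouble(line.substring(sep + 1)));
        }
    }

    /** Groups rows by city and keeps min/max/sum/count per city. */
    static Map<String, DoubleSummaryStatistics> summarize(Stream<String> lines) {
        return lines
                .map(Measurement::parse)
                .collect(Collectors.groupingBy(
                        Measurement::city,
                        TreeMap::new, // sorted by city name
                        Collectors.summarizingDouble(Measurement::temperature)));
    }

    public static void main(String[] args) throws IOException {
        // Files.lines reads lazily, so the file never has to fit in memory.
        try (Stream<String> lines = Files.lines(Path.of(args[0]))) {
            summarize(lines).forEach((city, s) -> System.out.printf(
                    "%s=%.1f/%.1f/%.1f%n", city, s.getMin(), s.getAverage(), s.getMax()));
        }
    }
}
```

everything here runs on a single thread and allocates a String and a Measurement per row, which is where the fast solutions get most of their headroom.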
i feel like most of my apps are home-cooked meals.
my current tech stack for most private (web) projects:
i tried the 1brc challenge in kotlin. the naive implementation was simple enough, though not exactly like the java version - the groupingBy function works differently, in that it emits a Grouping instance which then provides aggregate, fold and reduce functions.
the use of a Sequence (akin to lazily evaluated java streams) is necessary as a list wouldn't fit in memory. interestingly, kotlin doesn't provide parallel sequence processing (unlike java's List.stream().parallel()).
chunking the input into sub-lists for parallel processing isn't possible, as this would lead to the whole file having to be loaded into RAM before chunking (which produces Lists, not Sequences). my next step, if i were to follow up on this, would be to divide the file into n regions and let threads build maps for the regions and finally merge the maps.
interestingly, with 100k lines (for testing), on my machine the Hotspot takes ~160ms, while graal's community native image completes in ~30ms (!) (excluding vm startup, but without JIT warmup).
even more interestingly, with 1b lines, it's the other way round: 200sec for GraalVM vs. 115sec for the Hotspot.
Milei's policies are starting to take effect. Prices are exploding, consumption is collapsing.
i'm very conflicted about what's happening here. i don't think milei's neoliberal policies will work at all, so i'm expecting a terrible spiral where a few will profit immensely while most people in argentina will suffer greatly. as the right is on the rise everywhere in the world, the same play will repeat in other countries in the future. there's a small chance voters in other countries will see milei's policies fail and refrain from trying to implement them locally, but i'm not optimistic enough to believe that will happen.
further reading: https://senecaeffect.substack.com/p/the-dark-face-of-degrowth-argentines
https://leerob.io/blog/css - could writing CSS be fun (again)?
i'll have to read the whole thing, but:
in the end it's mostly a question of whether it needs a build step. i don't like build steps if avoidable and imo, for most small web projects, a build step for CSS is absolutely avoidable. why? because build tools break after some time. i don't want to spend hours to fix the tool chain and dependencies when i touch a project for the first time in a year.
reading list: https://blakewatson.com/journal/surveying-the-landscape-of-css-micro-frameworks/
edit: http://getskeleton.com/ looks promising
how to identify mysterious network devices: unplug it and see who starts screaming.
The Brothers Sun: funny and not too dumb.
The Road To Honest AI
AIs sometimes lie.
They might lie because their creator told them to lie. For example, a scammer might train an AI to help dupe victims.
Or they might lie (“hallucinate”) because they’re trained to sound helpful, and if the true answer (eg “I don’t know”) isn’t helpful-sounding enough, they’ll pick a false answer.
Or they might lie for technical AI reasons that don’t map to a clear explanation in natural language.
two papers about how to spot and manipulate AI honesty.
in the first paper, Representation Engineering by Dan Hendrycks, they seem to have managed to change an AI's answering characteristics by manipulating vector weights. apparently this works not only for honesty and lying, but also for any other characteristic (fairness, happiness, fear, power, ...). this means you could directly change an AI's "character" by boosting certain nodes. if this works reliably, it would be an absolute game changer that solves many of the most vexing problems.
the other paper, about "spotting lies", is a bit weaker imo and tries to exploit malicious models (i.e. those trained for scamming) having to be in a "frame of mind" for lying, which leads them to lie not only about the topic they're supposed to lie about, but also about other facts, which are known to the person being lied to. apparently this only works with simple models.
the crucial notions in language understanding: compositionality, systematicity, productivity
https://aiguide.substack.com/p/an-ai-breakthrough-on-systematic
Indeed, it has been shown in many research efforts over the years that neural networks struggle with systematic generalization in language. While today’s most capable large language models (e.g., GPT-4) give the appearance of systematic generalization—e.g., they generate flawless English syntax and can interpret novel English sentences extremely well—they often fail on human-like generalization when given tasks that fall too far outside their training data, such as the made-up language in Puzzle 1.
A recent paper by Brenden Lake and Marco Baroni offers a counterexample to Fodor & Pylyshyn’s claims, in the form of a neural network that achieves “human-like systematic generalization.” In short, Lake & Baroni created a set of puzzles similar to Puzzle 1 and gave them to people to solve. They also trained a neural network to solve these puzzles using a method called “meta-learning” (more on this below). They found that not only did the neural network gain a strong ability to solve such puzzles, its performance was very similar to that of people, including the kinds of errors it made.
I'll start by saying that this article is not meant to be a retrospective on LLMs. It's clear that 2023 was a special year for artificial intelligence: to reiterate that seems rather pointless. Instead, this post aims to be a testimony from an individual programmer. Since the advent of ChatGPT, and later by using LLMs that operate locally, I have made extensive use of this new technology. The goal is to accelerate my ability to write code, but that's not the only purpose. There's also the intent to not waste mental energy on aspects of programming that are not worth the effort.
- antirez, http://antirez.com/news/140
I find most descriptions of WebAssembly to be uninspiring: if you start with a phrase like “assembly-like language” or a “virtual machine”, we have already lost the plot. That’s not to say that these descriptions are incorrect, but it’s like explaining what a dog is by starting with its circulatory system. You’re not wrong, but you should probably lead with the bark.
I have a different preferred starting point which is less descriptive but more operational: WebAssembly is a new fundamental abstraction boundary. WebAssembly is a new way of dividing computing systems into pieces and of composing systems from parts. ... Like the Linux syscall interface, WebAssembly defines an interface language in which programs rely on host capabilities to access system features. Like the C ABI, calling into WebAssembly code has a predictable low cost. Like HTTP, you can arrange for WebAssembly code to have no shared state with its host, by construction.
read: http://wingolog.org/archives/2024/01/08/missing-the-point-of-webassembly HN
tags: wingolog WebAssembly
"epoll: The API that powers the modern internet"
https://darkcoding.net/software/epoll-the-api-that-powers-the-modern-internet/
conditional git configuration
https://blog.scottlowe.org/2023/12/15/conditional-git-configuration/
i use this to manage profiles for different roles, each of which has a different project directory
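a minimal sketch of such a setup, assuming a hypothetical ~/work/ project directory and made-up names and addresses; the includeIf section with a gitdir: condition is the git feature the linked post is about.

```ini
# ~/.gitconfig
[user]
    name = Jane Doe
    email = jane@example.org

# applies to every repository below ~/work/ (note the trailing slash)
[includeIf "gitdir:~/work/"]
    path = ~/.gitconfig-work

# ~/.gitconfig-work (a separate file)
[user]
    email = jane.doe@example.com
```

inside a repository below ~/work/, git config user.email should then report the work address, and everywhere else the default one.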
"Perhaps Emotional Dependence on Celebrities Has Gone Too Far" by Freddie_deBoer
https://freddiedeboer.substack.com/p/perhaps-emotional-dependence-on-celebrities
"‘Magical’ Error Correction Scheme Proved Inherently Inefficient"
https://www.quantamagazine.org/magical-error-correction-scheme-proved-inherently-inefficient-20240109/
"Database-Instance using half of available CPU cores" dba StackExchange
https://dba.stackexchange.com/questions/334701/database-instance-using-half-of-available-cpu-cores
Song: Frazey Ford - Done
https://www.youtube.com/watch?v=PXRrySTujn8
read: https://shiftmag.dev/kotlin-vs-java-2392/ HN
there have been several languages targeted at the JVM attempting to improve developer ergonomics (groovy, scala, clojure, ...).
the author argues that java tends to wait and see and then cherry pick the features that prove themselves valuable.
the upstart projects usually have less manpower to innovate much beyond their initial set of foundational ideas that made them take off. after a while, java tends to integrate the juiciest bits, diminishing the unique selling point of its competitors.
does the same apply to kotlin? has kotlin's downfall already begun? imo, no / not yet.
for one, there's been massive buy-in from google for the Android platform. this alone significantly changes the game.
then kotlin has corporate backing from Jet_Brains, and, via android, indirectly also from Google.
and lastly, in my personal opinion, kotlin has a few killer features that aren't present in java and would be very hard to steal, as they'd fundamentally change java's characteristics and compatibility with earlier versions.
nullability handling: compared to kotlin's ? the Optional wrapper feels clunky and, more importantly, it's not enforced. if it's not enforced by the type system, guarantees are weak and developer adoption is leaky.
everything is an expression: no idea whether java could implement this without breaking compatibility. this and implicit returns are an improvement on the scale of anonymous objects vs. lambdas.
scope function blocks: may be possible in java, but i see no attempts to emulate them.
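to make the nullability point concrete, a small java sketch with hypothetical names: nothing in the type system stops a method that returns Optional from returning null, so the caller gains no guarantee.

```java
import java.util.Optional;

final class OptionalNotEnforced {
    // Hypothetical lookup. It compiles without complaint although it returns
    // null for unknown users instead of Optional.empty().
    static Optional<String> findNickname(String user) {
        return "niklaus".equals(user) ? Optional.of("nick") : null;
    }

    static String greet(String user) {
        try {
            return findNickname(user).map(n -> "hi " + n).orElse("hi stranger");
        } catch (NullPointerException e) {
            return "NPE despite Optional";
        }
    }

    public static void main(String[] args) {
        System.out.println(greet("niklaus")); // hi nick
        System.out.println(greet("someone")); // NPE despite Optional
    }
}
```

in kotlin the equivalent function would be declared as returning String?, and the compiler would reject a call that dereferences the result without a check.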
also kotlin is similar enough to java to have a very low barrier to entry for new developers.
as for coroutines vs. virtual_threads: kotlin did profit from virtual threads for free. that's not an argument for java, it's one for kotlin.
all in all, i see the author's point, but i don't think we'll reach the point of kotlin's decline for at least a few more years.
tags: kotlin java IntelliJ_IDEA
looks like SQLDelight had the same idea i had with scovy.
SQLDelight generates typesafe Kotlin APIs from your SQL statements. It verifies your schema, statements, and migrations at compile-time and provides IDE features like autocomplete and refactoring which make writing and maintaining SQL simple.
asteroid is not a steroid!
Ralative is a very uncommon misspelling, appearing only in the xml configuration file of a single application. It was added as a token, but after tokenization, xml files were excluded from the training data set. Now there's a token without training data and GPT 3.5 crashes when prompted to use that token. They're called glitch tokens and there are lists of them.
Science fiction awards held in China under fire for excluding authors
There's a worrying tendency of (cash strapped or just greedy?) western institutions to collaborate with totalitarian regimes while submitting to their rules. This lends legitimacy to the regime while subverting the prestige of the institution.
There was no communication between the Hugo administration team and the Chinese government in any official manner.
This almost sounds like an admission that there was communication in an unofficial manner.
looking forward to the remake of roadhouse. hopefully i can do a movie marathon: original, then remake. Watchlist_2024
is the american "grilled cheese sandwich" the same as the european "schinken-käse toast"? a question that forever gnawed at me but i've always been too afraid to ask. today tho i finally searched for an answer, and the answer is: no, but yes. as in: they're very similar, but there are a few differences.
basically: white bread instead of toast, type of cheese, yes or no to ham, buttered tops, optional additional ingredients.