Hacker News new | past | comments | ask | show | jobs | submit login
The seven specification ur-languages (buttondown.email/hillelwayne)
132 points by BerislavLopac on May 9, 2023 | hide | past | favorite | 67 comments



This is a good list but is almost more accurately called a list of modeling languages. At the very least it’s a very particular interpretation of specification, which makes sense from a TLA+ expert but is incomplete.

In particular I think there are a few candidate ur-specification languages that could be included:

* first-order logic, this captures what almost every deductive verifier uses, and has a surprisingly rich diversity in features.

* concurrent separation logic, this was a breakthrough development and every language for specifying concurrent programs is some flavor of this now.

* dependent type theory? The big issue is that as the author points out, taxonomies fail. Dependent type theories give rise to languages which are both “specification” and “programming” languages.


Based on the state of academic conferences in formal methods, I would have 4 languages: temporal logic, first order logic, concurrent separation logic and dependent type theory. This categorization provides generalizing powers because tools, techniques and limitations mostly map onto it.

* temporal logic: almost exclusively used by model checkers. Usually propositional, has limited/no support for quantifiers. Also usually only able to express path properties. This for example makes it hard to state “is_sorted”.

* first-order logic: used by deductive (semi-automatic) verifiers. Can reason about complex properties of state (ie : `is_sorted`), harder but not impossible, to reason about temporal properties. Ill-suited to describe properties involving shared state or concurrency. Used by SMT solvers in particular (good automation). Also used by most deductive verifiers and liquid type systems. Examples: Dafny, Why3, SPARK (spec part), liquid Haskell.

* concurrent separation logic: can be either propositional, first-order or higher order. Does much of the same kinds of properties as FOL, but it can reason about shared state and concurrency, in fact it’s quite good for that. This expressivity usually means it’s harder to reason about / prove specifications. Examples: Iris, Viper, Verifast.

* dependent type theory: when you start pondering what it even means to specify something, the gods of CS bestow upon you a vision of the haskell-curry isomorphism and your programs become specifications and your specifications become programs. This is the “ultimate” level of specification, and the foundation for proof assistants. Examples: Coq, Lean

Each of these categories is well represented at conferences and pretty much every paper falls into one of these four buckets. These also map roughly onto verification methodologies and problem domains.


On second thought, these are mostly too closely related, I think that if you want to truly find the ur-specification languages this is closely related to Girard's research program. Specification languages are intrinsically tied to a notion of truth and also of proof (method of establishing / showing truth), these are central questions in Girard's work.

I think a revised list should also include proof-nets, and other weirder specification languages like the [flower calculus](http://www.lix.polytechnique.fr/Labo/Pablo.DONATO/drafts/FJ....) or stellar resolution. These also stretch the usual interpretation of language as a syntax tree.


The feasibility of automatic verification and ease of specification played incredibly important roles in the historical development of specification languages. This raises an interesting question: what new computational tools are coming on the scene, and what role will those tools play in the next generation of specification/programming languages?

Reading this article, I was struck by how noneterminism was all over the place in ur-specification languages. I think the answer to the question above will turn out to be something about: 1. nondeterminism, and 2. LLMs, RL, and diffusion as a way of resolving/refining nondeterminism.

That aside, I like the taxonomy. Some personal footnotes:

1. I would list Hoare Logic as the ur-language instead of GCL. Dynamic logics are in this family. I had never considered Spin and dynamic logic as being related in this way. That's kind of neat, and I'm glad I read the article.

2. I would probably give type theories a place in the list, even though you have to squint a little bit to make them fit in a list of specification languages.

3. The history of temporal logics is fascinating, and https://www.cs.rice.edu/~vardi/papers/etaps01-ver13.pdf is probably on my top ten list of papers on formal methods.

4. I feel exactly the same way about Petri Nets -- kind of neat, feel powerful, but never quite know what to do with them. Maybe now that we're returning to office I will get a copy of "Petri Net Theory and the Modeling of Systems" for the train ride.


Heh, eh, well, ah, ... (and other strange noises).

Dijkstra's Guarded Command Language is a specification language, sort of, but it's better looked at as a very simple programming language for use with Floyd/Hoare/Dijkstra's axiomatic semantics. (Which didn't get mentioned at all. Sniff.)

Anyway, if you want a better specification language based on GCL, check out Jayadev Misra's UNITY (A Discipline of Multiprogramming, Springer-Verlag, 2001.) or Mohamed Gouda's Abstract Protocols (Elements of network protocol design, Whiley 2008.)


> Dijkstra's Guarded Command Language is a specification language, sort of, but it's better looked at as a very simple programming language for use with Floyd/Hoare/Dijkstra's axiomatic semantics. (Which didn't get mentioned at all. Sniff.)

This struck me too. There is an ur-language there, but it's not GCL rather it's Hoare triples and subsequently predicate transformer semantics and Boolean structures, as defined by Dijkstra and Scholten (among others, but they're the ones best known to me). So in that sense the ur-language for program specification is something we might call "mathematical logic." And predicate transformer semantics, when you get right down to it, is leaning on Leibniz's principle about as hard as biology leans on H-bonds. As you say, GCL is just a simple language that has a well defined predicate transformer semantic and an interesting approach to nondeterminism that is suitable for the kind of research Dijkstra was doing. One could theoretically define predicate transformer semantics for any language, although it would be a prohibitively difficult challenge for most if not all languages in common use today. Whether that's a problem with predicate transformer semantics or the languages we have is a matter of opinion I suppose.

Also predicate transformer semantics were extended to support concurrency by Lamport[1].

I'm still hoping for the day when assertional arguments fully banish behavioral ones, but it looks like it's still a long way off.

[1] https://lamport.azurewebsites.net/pubs/lamport-win.pdf


For some reason, I ended up calling those "axiomatic semantics"; predicate transformer semantics is a much better name.

One of the weaknesses of PTS for common languages is that statements and expressions have very complicated semantics, compared to the relatively simple semantics of GCL and other simple treatments. One of the nice things about Frama-C (and I think the GNAT SPARK tools) is that, when you are using the WP (Weakest Precondition) module to prove properties of C code, it can automatically generate assertions that describe overflow/underflow, which then forces you to put the appropriate preconditions on function parameters. :-)

By the way, have you run across K. Rustan M. Leino's recent book, Program Proofs, and the Dafny language. It's the most recent and so far, best book I've seen on the style and technology.


I was doing some work on defining PTS for a subset of Go, so I'm definitely aware of how tricky it can get. While handling early return requires some care, the existence of goto and break in particular caused me considerable trouble defining semantics for the loop construct. Needless to say that's barely scratching the surface, and Go is a pretty small and intentionally simple language as far as popular ones go.

If I were to try again, I think I'd look at building GCL as as simple Lisp and then build higher level constructs as macros that would inherit their semantics from the well-defined GCL's composition rules. Then of course if I wanted to succeed in the marketplace I'd throw that out, rewrite the Lisp part as an AST library, probably in Rust or Go, and then lex and parse it from some squiggly brace syntax. That syntax in turn would probably be a subset of the grammar of some popular language to ease practical use. At then point it could get grafted on as a kind of linter maybe.

I'm also particularly excited about the prospect of deducing appropriate preconditions from stated postconditions/assertions. The very concept of working backward from the desired end to the known beginning is so elegant and intuitive that millions of school children independently discovered it when solving mazes.

I am somewhat familiar with Dafny. It looks really neat and I've written some toy code. I'm not familiar with that book though, so I'll have to add it to my reading list.


there are too many specification languages


Ooohhhh, a specification language for specification languages.

The grammar that ties together graphical things like Petri Nets and TLA+ is going to be fantastic!


Unrelated but the coolest thing I discovered while researching this piece was that Petri Net reachability is decidable but Ackermann-complete. ACK-COMPLETE is a thing. That's wild.


what does that mean exactly? I skimmed the paper you linked briefly, but couldn't quite tell.

Does it mean that deciding reachability of an N-state petri net takes Ack(N) steps in the worst case? Or that you can can construct a Petri net which has a node reachable after Ack(N) steps?

Those are different I think since you can compute Ack(N) in fewer than Ack(N) steps.

I also wonder if you can more efficiently rule out polynomial-time reachability of a Petri net node, even if it is eventually reachable.


The former, though it grows a little slower than Ack(N). If I read the paper write, they showed that an 18-state Petri Net has a worst case reachability problem on the order of Ack(3), and then every six states you add, n goes up by one. So a 24 state Petri net can take Ack(4) steps to decide, for a 30 state Petri net Ack(5), etc. This is just the decision problem, not the length of the path.

(The paper is actually works on *Vector Addition Systems*, but VASes are isomorphic to Petri nets.)

Note these are loose boundaries: adding at most six states will definitely bump the hardness by at least 1, but it might grow faster than that. At the end of the paper they note some people have gotten more precise boundaries, that 10-states is "Tower-hard" (Ack(3)), and every 2 states bumps it by 1. So worst case for a 12-state net is on the order of Ack(4), etc.


> All taxonomies are broken, full stop. Your categories are gonna be completely wrong and everybody’s going to argue over every single thing. There is no such thing as a tree.

Is this a serious statement? If so, wouldn't it be incompatible with the theory of evolution? Would an alien taxonomy of human binary numbers not be a legitimate tree?


The thing with taxonomies is that trying to make a category more precise tends to exclude things you want to include in it, and vice-versa. It is especially easy to find examples of this in nature, because nature has existed since long before humans had opinions about how things should be organized.

What is a cat? It's a small furry quadruped in family Felidae.

- Exception: cats may be quite large (lions, tigers).

- Exception: cats may have no fur (sphinx cats).

- Exception: cats may have fewer than four legs (e.g. due to injury).

Even questions that seem like they should be quite easy to settle, like "are these two gulls of the same or different species?", might be impossible to define formally due to things like <https://en.wikipedia.org/wiki/Ring_species>.

With the specific example of "no such thing as a tree", the category of plants that humans call "trees" isn't a genetically coherent group. Lots of different types of plants have woody stems and bark and leaves, and you can't group together all the things humans call trees without also including things that we definitely don't.


I definitely understand that the majority of taxonomies are problematic for the reasons you cited. When OP said "no such thing as a tree", I thought OP meant a taxonomic tree, not a literal plant tree, hence my example of binary numbers! Thanks for clarifying.

That said, taxonomic groupings can have both wide consensus and be useful, can't they? (Hand on chin... monotremes? hominids??)


The CS specific analogy might be, "Abstractions are leaky," or more broadly the, "map is not the territory."

Any complex-phenomena that is modeled with with a simplification will have places where that simplification fails. But models can still be highly useful, you just need to choose an appropriate level of abstraction, accept and manage any tradeoffs with exceptions, and move to better paradigm if one emerges.


Although if you want to get get technical, even evolutionary relationships are only trees if you throw out some of the information that doesn't fit the tree. It's truer to speak of phylogenetic networks that can take into account things such as horizontal transfer of genes and recombination.

https://en.wikipedia.org/wiki/Phylogenetic_network


As George Box supposedly said, "All models are wrong, but some are useful."


I've pondered the "taxonomies as trees" as insufficient and broken. What I arrived at as "maybe this will work" is taxonomy as a high-dimensional sponge, where a thing may rest at a given point in high dimensional space (where the dimensions are characteristics of that thing) and may or may not be clustered with other things on certain axes.

Sort of like how word/semantic clustering works in LLMs.

This obviously isn't a fully formed idea, but it might make creating taxonomies easier? Taxonomic clusters? Something like that.


should probably also reference metacrap here https://chnm.gmu.edu/digitalhistory/links/pdf/preserving/8_1... especially section 2.5 Schemas aren't neutral


Well, this clearly references this wonderful article: https://eukaryotewritesblog.com/2021/05/02/theres-no-such-th...

The idea is that our taxonomy selects a group named "trees", but there's little internal coherence in that group, which probably results in more online hilly wars in the plant-loving communities than we the laypeople can think of.

See also: is Pluto a planet?


I don't think this article is referring to physical organisms known as trees (as your article does). It's talking about abstract tree structures, in particular:

1. There is a coherent parent-child relationship

2. Children have only one parent

3. Children have no lateral relationships with other children

4. Relationships are mediated through parents, not e.g. grandparent-child directly.

5. We can coherently differentiate between different children

etc.

All of these are assumptions that greatly simplify analysis and therefore useful in any situations, but almost always fail in some way on closer inspection.

More generally we go to the saying "all models are wrong, some are useful."


I'm 95% sure that the author (who is on HN[1]) is at least referring to the article "There's No Such Thing As A Tree" in his line "There is no such thing as a tree." - regardless of the fact that the article as a whole isn't about trees made of wood.

It's possible that he's additionally making a double entendre about abstract tree structures.

[1] https://news.ycombinator.com/user?id=hwayne


Yeah, I was thinking about biological trees. In retrospect, I should've gone with "there's no such thing as a fish."


My favorite example is "berry". Everything that you think is a berry isn't botanically a berry... but tomatos and bananas are.

OK... technically it's almost everything you think is a berry isn't. But it sure feels like it's everything.


Haha yes please! I thought you were talking about literal taxonomic trees.


> our taxonomy

What exactly do you mean by "our taxonomy". There's no plant order, family, or genus called trees. It's just a common plant form. Do you just mean some people's personal mental model?


Yeah I definitely did not get the reference, I thought OP literally meant that taxonomic trees in general are useless and/or don't exist!



I spent some time working on replacing the formulation of descent as a tree of species with a chain-complete partial order of organisms. Then you start trying to define things like "species" or "strain" or "genus" on that and realize that they don't correspond to any typical clumping of graphs.

Someone else already linked to ring species. In microbiology, the definition of species is "stop asking, we agreed to stop fighting about that, no, really, la-la-la-la." Horizontal gene transfer between species is ubiquitous.

In the end I started talking about populations occupying a niche in a specific place at a specific time, and very cautiously tracing properties among linkages of those. But I'm also the one who kept insisting to my labmates that a gene is not a locus of DNA.


The theory of evolution requires large and fundamental changes in organisms over time due to natural selection over diverse populations with characteristics being passed on by reproduction. Such that very different organisms can have common reproductive ancestors.

It does not in fact require a specific number of clearly delineated categories of organisms where all individual organisms are in one and only one category.

That part is a human invention to try to make sense of the world, rather than part of the mechanics of the world itself. The categorization is probably a necessary invention to try to make sense of the world in a scientific systematic way, and is also inevitably a broken inconsistent subjective biased over-simplified map of the territory.


I have a couple of sources that agree with the proposition that taxonomies are broken, linked below. The short summary is that they can be useful, but they are always just models of the real world. As such, they will always be broken in some way. Even the example of number is incomplete. Is infinity odd, even, or something else?

Voodoo categorisation and dynamic ontologies in the world of OER: https://web.archive.org/web/20190505064655/https://blog.mood...

Ontology is Overrated: Categories, Links, and Tags: https://web.archive.org/web/20070119210402/http://www.shirky...


Taxonomy is just bureaucracy, evolution is real and breaks taxonomy, because evolution works on the level of individuals and genes, when taxonomy works on leaky abstractions called species and groups of species.


Where in a taxonomy of humans would you put the estimated 8% of our genome that is from viruses trapped in our DNA?

https://www.science.org/content/article/ancient-virus-may-be... https://www.discovermagazine.com/planet-earth/mammals-made-b...


This would be incompatible only if we require realistic interpretation of our taxonomies and theory of evolution can be true regardless of our (anti)realistic commitments.


> If so, wouldn’t it be incompatible with the theory of evolution?

It would be incompatible with viewing some past views of the mechanisms of evolution as complete; the “theory of evolution” beyond broad outline is something of a moving target that accommodates things like the ways in which taxonomies are approximate abstractions that aid in discussion rather than exact descriptions of reality.


What is the difference between a biological mutation and a new species? There are thousands of years of grey area between the two. The Theory of Evolution is a tree-like model to summarize billions of years of biological mutations into neat little branches, but we essentially have to ignore the vast periods of grey area in order to turn that into a tree.


Are you sure evolution is tree like? https://en.m.wikipedia.org/wiki/Convergent_evolution

That said, some categories are real, precise, and useful, like integers vs reals vs complex numbers.


Convergent evolution doesn't make evolution non-tree-like, that's just a tree with similar-looking branches or leaves at different locations. Pretty standard really. Convergent evolution makes taxonomy less tree-like.

Carcinization is a good example: it makes lots of things crab-like, so from a surface taxonomic point of view a flattop crab, a coconut crab, and marbled crab are all pretty crabby. But they're completely different evolutionary lineages, and only one of them is a "true crab".

Where evolution gets less tree-like is MGEs (https://en.wikipedia.org/wiki/Mobile_genetic_elements) as some of them can move between branches.


> Convergent evolution makes taxonomy less tree-like.

Yes, exactly. "All taxonomies are don't broken" is not at all "incompatible with the theory of evolution," is the point. The taxonomic layer is pasted on top.


Sorry, as stated above I misunderstood "There is no such thing as a tree." to mean that OP though taxonomic trees were fundamentally broken in some way, I misunderstood OP!

That said, if we had perfect knowledge of the speciation process over the years, would our taxonomy not be extremely close to a perfect tree, where every node has 2+ branches, and branches don't converge to being species-compatible for breeding?

I get convergent evolution, but among large (let's say 10g+) organisms, I'm not aware of convergent evolution resulting in compatible species that would not otherwise have been compatible?

I'm super rusty on this topic, but if there is theory that large organism actual DNA-level speciation (resulting in individuals who cannot reproduce together) has eventuated to convergence back to a new species (who can reproduce together), I'd love a source. I definitely could have very rusty knowledge on this but it seems intuitive to me?


I've been saying it for years. Life is not a tree, it's a DAG.


Are real numbers actually "real" or do they just have good PR?


> If so, wouldn't it be incompatible with the theory of evolution?

It's very much not. The thing is that every model is a lie, but they can often be very useful lies. The reason we use taxonomic trees is because they're "good enough" but there are tons of places where this really breaks down.

For example, horizontal gene transfer is a huge problem in microorganisms. If you take a soil fungi from one environment and put it in a completely new one it'll become so stressed out that it somehow increases it's rate of HGT and can borrow genes from completely different clades of life like bacteria or algae. HGT actually happens at more macro scales as well. Many of our GMOs take genes from bacteria and put them into plants. Parasitic plants like dodders are known for taking (and spreading) genes from the wide variety of plants it can parasitize (though this might be the wrong word to use given that we now know it plays a host of beneficial ecological roles to its hosts like acting as an above-ground myccorhizal network allowing plants to "talk" to each other). We also know that HGT is quite common across completely unrelated fish and sometimes we even have certain animals, like the hoatzin (Opisthocomus hoazin), that push scientists to completely reevaluate their "tree of life" assumptions[0] (aside: this one's particularly bizarre because it even looks like a medieval depiction of a hybrid beast that took different parts of different animals and mashed them together).

Another more obvious problem is just simple hybridization. It can happen across species with regularity (especially in some plant families), but over the deep history of time much larger jumps seem less "rare". On the micro scale it happens so often that "species" is rarely a useful category. Both botany and microbiology often refer to "species complex" instead

Additionally, it's not really clear what an "organism" often is. For example, lichen are actually a partnership between algae and fungi but are given their own species name. Neither of the two can live without each other. And actually we're finding that it's basically a whole ecosystem of many different algae and a fungi. That might not seem as complex but consider that millions of years ago a germ ate another germ and that eaten germ continued to live and reproduce inside it and eventually came to be known as "mitochondria" and form the basis of basically all animals and plants. To this day they have their own DNA but their reproduction is completely tied to us. Also consider the fact that a human is actually mostly germs. Germs outnumber the number of cells in your body (well this can vary based on antibiotic treatments or how long since your last shit). These complex ecosystems that play critical roles in our skin, eyes, guts, and even brains are necessary to our survival. We wouldn't be able to EAT without them! Lastly, take the example of the man o' war. Similar to lichen, it seems at some point ~13 or so different animals came together and worked together so strongly that they essentially merged and became a single organism. Some of the parts of a man o war can even survive without the rest of the "colony" for a short while (imagine if your kidney could just do it's own thing). But, like lichen, it gets its own species name

As you can see there's a number of flaws in the "tree" view of evolution. It works well enough for most cases and that's why we still use it. But try looking up the debates around a taxonomy of human languages. In theory the same approaches should be able to be applied and the same problems (cultural equivalents of "HGT", "hybridization", and blurry lines between "symbiosis and dependency") can apply.

Edit: oh hole #4: ring species!

[0] https://www.newyorker.com/science/elements/the-bizarre-bird-...


I get convergent evolution, but among large (let's say 10g+) organisms, I'm not aware of convergent evolution resulting in compatible species (with fertile offsping) that would not otherwise have been compatible?

I'm super rusty on this topic, but if there is theory that large organism actual DNA-level speciation (resulting in individuals who cannot reproduce together) has eventuated to convergence back to a new species (who can reproduce together with fertile offspring), I'd love a source. I definitely could have very rusty knowledge on this but it seems intuitive to me?


I never once mentioned convergent evolution. Convergent evolution is irrelevant in modern phylogeny-based taxonomy. I'm talking about the actual genes themselves


One word: siphonophores


(I mentioned them already :P)


Oh, I missed it. Is man o' war (Physalia physalis) a siphonophore?


Yes exactly! Sorry, should've specified that


Have you ever heard a discussion betweeen biologists about where to put a given species in the family tree?

Taxonomies are a human construction.


I am confused - what does "ur-" prefix mean?


It means "proto", "primitive" or "original". See the Wikitionary entry here: https://en.wiktionary.org/wiki/ur-


I too am confused about what both authors think the definition of "ur-" is in both of their contexts. https://news.ycombinator.com/item?id=35818607

Nor do I understand why for TLA+, the second author needs to demarcate a dichotomy between language/programming language and specification language. And then the next thing they do is Tear Down the Taxonomy! Tear it down from the beginning.


> I too am confused about what both authors think the definition of "ur-" is in both of their contexts.

I'm confused why you're confused. The original author (of "The seven programming ur-languages") stated what they meant by an ur-language:

> I am aware of seven ur-languages in software today. I’ll name them for a type specimen, the way a species in paleontology is named for a particular fossil that defines it and then other fossils are compared to the type specimen to determine their identity. [emphasis in original]

They didn't hide their meaning, they laid it out right before listing the languages and then explaining why they selected them.


Original, primordial, the source. Comes from German.


Glad you got responses because I sat there thinking it could mean un-readable from the vantage of different styles.


Typically it means very old, and the progenitor of all that followed.

1. https://en.wikipedia.org/wiki/Ur


The etymology is unrelated to the Sumerian city you link to.


I’m assuming it is something like a Cradle of Civilization (the city of Ur is recognized as one of a handful). They are civilizations of which all other civilizations are derived…they have no predecessor civilizations.


You're assuming wrong, I'm afraid. No relation to the city of Ur.


I get that the prefix entered our lexicon from german, but frankly you don't have enough information to say that there is no relation. The city of Ur is 4000 years older than Old High German, and Ur has been used as a metaphor for the origin of things for thousands of years, even ancient Greece. You can't definitively say that the idea of Ur as an origin of civilization had no influence on german.


But in Old German, the prefix ir-/ur- meant "thoroughly", from Proto-Germanic uz-, meaning "out", ultimately from Proto-Indo-European úd-, meaning "outward"/"upward", which is also the origin of the English word "out", as well as the prefix "or-", as in "ordeal". Proto-Indo-European coexisted with Ur, and it doesn't really make sense that Ur would lead to the PIE prefix úd-.


> frankly you don't have enough information to say that there is no relation

I'm not a scholar of this subject. If there is good scholarship out there presenting good arguments in your direction I'll take it. I was just helping out a fellow that has a doubt with my best knowledge of the subject, which is not just a guess.

> has been used as a metaphor for the origin of things for thousands of years, even ancient Greece

Has it really? I'd love to see an example. Sure it's listed in the Bible along with a bunch of other place names, but as a metaphor for the origin of things?

Even if there are examples, I'd really love to see an etymological trace of how it would end up as a prefix. Was it used as such in ancient Greek? In Latin? Sounds like a folk etymology.


Yeah, on second glance I might be confusing some of my sources here. I can’t seem to find what I thought I was remembering.

Regardless, there seems to be a significant distinction between words using a concatenated ur prefix and those using a hyphenated ur-(noun) prefix. In this case, the usage does seem to imply an original language from which all others are derived, and the metaphor seems apt even if it is not historically accurate to say that’s where it came from.


> Everything past this point is much shakier ground and why this is a newsletter and not a blog post, despite being 2500+ words.

ok but lets be real, with substack becoming a thing and hillel’s buttondown hitting HN every week, is there any difference?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: