Jun. 9th, 2009 @ 10:27 am: Dividing the Indivisible
I'm reading François-René Rideau of TUNES and find myself agreeing with a lot of his comments on language/OS design (but not on economics or politics, sorry). Eg: http://fare.livejournal.com/139755.html
Found via his Livejournal: this paper on historical data models.
I particularly like this comment on Lisp's central state as its major limitation. It's the problem I see with 'packages' too. Packages share too much state; they are not network-scalable. The True Language has to be able to work at all scales, from 'programming in the large' to 'programming in the small', probably with just one unit of abstraction. Whenever we invent these multiple different types of abstractions: lambdas vs objects vs files vs modules vs packages vs namespaces - we're overcomplicating something that is actually really simple.
And the problem is that when you add complexity you are actually reducing the expressiveness of your language not increasing it. The problem we have is a lack of a sufficiently expressive core meta-language for representing interdependent, time-space distributed data linked by functional relations. This is not just a perceptual issue. It's not a matter of aesthetics or which language we 'prefer' to code in. It's more fundamental than that. It's that we don't have a way of talking about the relationship of one set of changing, computed data with another - not in a well-defined, recursive way. And until we find that language, we'll be unable to reason sensibly about the problem, instead creating chaos, dividing something which is not divisible - creating artificial distinctions.
Let me try to say that again in the hope it will make sense: creating a new entity type means creating a distinction. A distinction is a barrier to communication; it does not add 'richness'. The best language is one which has the fewest core entities rather than forcing them to multiply. The challenge for the future is to remove elements from our languages which are obscuring our vision, not to add them. Whenever we look at our languages or OS environments and see lots of not-quite-the-same things: files, objects, pointers, streams, components, processes, threads, interrupts - we should intuitively realise that something here is very broken and needs to be fixed - or at the very least, if we must tolerate a broken foundation in the short term as an expedient measure to bootstrap our real next system, a unifying layer must be written to expose all of these entities as one correctly defined abstraction.
However, when we start raising this question, we usually get derailed because there is a very strong, pervasive philosophy in both the science and industry of computing at the moment which claims not only 1) that fixing this problem of fragmented languages is fundamentally impossible ('you can't satisfy everyone at once'), but also 2) that the fragmentation is actually desirable ('there's more than one way to do it', 'use the right tool for the right job').
The first objection would be correct if we were simply incapable of even imagining all these different systems. But obviously we can imagine them, since we built them. All these apparently mutually exclusive approaches have been produced in the first place by people using the same mental 'operating system' toolset - human thought. So why can't we formalise *how* we built them and what they do, in a single mechanical language?
The second would be correct if all of those 'right tools' were mutually convertible. But they're not. That's what makes them different tools - not just different applications of the same tool, which is what I'm arguing for - in the first place.
And so we're stuck at this philosophical impasse.
It is annoying that those of us who seem to see this problem in the current situation seem to be stuck at the 'unfocused rant' stage - I know I am and I know it sucks. I can *almost* feel what it is I'm trying to say, but it's enormously frustrating to not be able to put the pieces together.
All I can say is that there's a new paradigm trying to emerge which seems to me to be based on:

* typelessness (or types as first-class functions)
* 100% pure-functionality with I/O handled via functional reactivity (ie, functions over time-varying signals rather than constants)
* massively distributed, parallel, concurrent object-like modules
* completely extensible syntax
* data-centrality and declarative semantics

All of these pieces have to coexist because they work together as a coherent framework, and without them all you don't have the model. We don't yet have a language, or even a paradigm, which brings all of these together. It has to be 100% pure-functional, and that requires functional reactivity or dataflow.
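To make 'functions over time-varying signals' a little more concrete, here is a minimal sketch in Python. It assumes a signal can be modelled as a pure function from time to value. The names `constant`, `lift` and `clock` are invented for this illustration and are not taken from Cells, Flapjax or FrTime.

```python
# Assumption: a "signal" is modelled as a pure function from a time t to a value.
# constant, lift and clock are hypothetical names, invented for this sketch.

def constant(value):
    """A signal that has the same value at every time."""
    return lambda t: value

def lift(fn, *signals):
    """Apply an ordinary function pointwise to one or more signals."""
    return lambda t: fn(*(s(t) for s in signals))

# The identity signal: its value at time t is t itself.
clock = lambda t: t

# Derived signals are stated declaratively, as function application.
# Nothing is mutated when time advances; we just sample at another t.
doubled = lift(lambda x: 2 * x, clock)
total = lift(lambda a, b: a + b, doubled, constant(1))

print(total(0.0), total(1.5))
```

Sampling `total` at two different times gives two different values without any assignment taking place, which is the sense in which I/O and 'change' can stay inside a pure-functional model.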
However, all the dataflow languages I've seen (such as LabVIEW, Pure Data or Yahoo Pipes) seem to be weirdly bound to a visual paradigm and have no serialisation as a language; while the FRP frameworks (Cells, Flapjax, FrTime) are all built on top of existing, non-pure-functional languages. Cells in particular requires CLOS, which makes a number of inappropriate assumptions (like a hard class/object divide); Flapjax is limited to the browser; etc.
It has to be declarative, not imperative, and so C, C++, Java, Scheme, Lua, Factor, Io and Common Lisp fail because of their lack of guaranteed purity.
It has to be parallel and concurrent at the instruction level. Erlang has some of this property, but is not pure functional.
It has to be modular at a very atomic level - no shared namespaces, no special 'package' mechanism other than 'object' - which suggests a small prototype OO language like JavaScript, Self or Io, but most OO languages make a point of not being at all functional, let alone pure-functional. Where they have lambdas, they are added on as an additional abstraction, which breaks the one-abstraction concept.
It has to NOT have hardcoded types at a level separate from functions, because that makes it impossible to model the real world of network-distributed messages where new message 'types' are created on-the-fly and at 'runtime' - because on the Internet, it's always runtime somewhere. You can never 'escape the program' by creating another meta-level, and that's what type systems try to do. So Haskell fails here.
For the same reason, you need to provide everyone - user, data modeller, application developer, language developer - with exactly the same toolset, language and level of expressive power, and that means full metaprogrammability. Natural languages provide this feature: anyone can make a new word, at any time, just by using it. A computer language equal to the challenge of describing a time-varying, computed, mutable yet functional web of arbitrary data also requires the same ability. Forth, Factor and Lisp have this feature, but not the others.
Some signs of hope: Google Wave, Yahoo Pipes, Flapjax, JavaFX, Joy, Lua, Xanadu, Prolog, SQL Views, RDF.
But none of these approach closure; they all fail to 'close the loop', to reduce the problem to one entity. They're all partial, incomplete, and mutually contradictory. They divide the problem space into 'programming in the large' vs 'programming in the small', into 'code' vs 'data', 'client' vs 'server', 'desktop' vs 'net', and refuse to look at the whole, accepting contradictions (in the postmodernist vein) as required, inescapable, and necessary. But that approach means accepting defeat right from the beginning. If you have a mutual contradiction you have nothing at all: no common, complete data-description environment that can make statements - and formal guarantees - about the connection between Internet-distributed, computable data sets.
This language exists. And I don't yet know what it is. But I am sure that it is not quite reducible to any of the current mainstream paradigms, yet is capable of expressing all of them. And it will be very, very simple.
Edit: I'm really echoing David Bohm's thoughts from Wholeness and the Implicate Order where he makes much the same point about language (in the context of the fragmentation of the scientific disciplines).
One of the fixes Bohm suggests is to use verbs rather than nouns to emphasise the process nature of spacetime. I think this is roughly analogous, if not identical, to the idea of composability of functions, as we see for example in operational transformation, where a document consists of transformations over the empty document, or in concatenative languages, where even data literals are functions over a stack (or a stream). All of these possess the property that we can merge two entities - objects or processes - and get a new one. This property is vitally important to 'remixing' and repurposing of data across the Internet, or for automatically translating from one data format to another without prior knowledge of its 'type'.
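The concatenative half of that claim can be sketched in a few lines of Python. This is only an illustration, assuming a stack represented as an immutable tuple. The names `lit`, `add`, `dup` and `concat` are made up here and are not Joy or Factor words as implemented.

```python
from functools import reduce

# Assumption: a "program" is any function from a stack (a tuple) to a new stack.

def lit(value):
    """A data literal as a function: it returns the stack with value pushed."""
    return lambda stack: stack + (value,)

def add(stack):
    """Replace the top two items with their sum."""
    *rest, a, b = stack
    return (*rest, a + b)

def dup(stack):
    """Duplicate the top item."""
    return stack + (stack[-1],)

def concat(*words):
    """Concatenating programs is function composition, applied left to right."""
    return lambda stack: reduce(lambda s, w: w(s), words, stack)

two_plus_three = concat(lit(2), lit(3), add)
double = concat(dup, add)

# Two programs merge into a new program of exactly the same kind.
merged = concat(two_plus_three, double)

print(merged(()))
```

Because literals, operators and whole programs are all the same kind of thing here, `concat` never needs to know what it is joining, which is the merge property described above.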
Object orientation as we have it today fundamentally doesn't work here because it still keeps an artificial idea of object identity - whereas in the real universe identity is a very problematic thing - and because of this, objects are very brittle and aren't composable like functions are. But they could be made composable if they were turned into pure functions. That's basically my core idea: object-like functions, expressed as declarative statements of function application, which exist somehow in the computational ether and 'magically' update themselves to reflect their sources as those sources change (where 'change' is really with respect to the observer; in reality, a single 'changing' object is a series of immutable objects seen in succession as the observer shifts in space or time; the same 'delta' mechanism produces both 'inheritance', if the shift is in space, and 'mutation', if the shift is in time).
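A rough way to picture the 'same delta mechanism' is the Python sketch below. It assumes an object is nothing more than a pure function from slot name to value. `empty` and `extend` are hypothetical names for this illustration, and the automatic updating described above is deliberately left out.

```python
# Assumption: an "object" is a pure function from a slot name to a value.
# empty and extend are hypothetical names; this is not the 'mu function' design.

def empty(key):
    """The object with no slots."""
    raise KeyError(key)

def extend(base, **delta):
    """Return a new object-function: the delta's slots, falling back to base.
    The base object is never modified."""
    def obj(key):
        return delta[key] if key in delta else base(key)
    return obj

point = extend(empty, x=1, y=2)

# A shift 'in space': a variant that inherits everything else from point.
point3d = extend(point, z=3)

# A shift 'in time': the next version of point; the old version still exists.
point_later = extend(point, x=10)

print(point("x"), point3d("z"), point_later("x"), point_later("y"))
```

Both `point3d` and `point_later` are produced by the same `extend` call; whether we call the result 'inheritance' or 'mutation' depends only on how the observer reads it.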
This seems like a very easy concept to describe, but in practice most of our functional languages tend to make the assumption (for efficiency) that the environment of a function will never change once the function is created; they provide lookup access and binding (like lambda calculus), but no way for a function to truly operate upon its own environment (not to mutate, because that's a contradiction, but to provide a successor). But of course if you have a long chain of mutations or transformations, it gets increasingly inefficient over time, so you also need a built-in way of pruning that delta chain. A concatenative language such as Joy seems to almost solve this problem by making 'programs' first-class, runtime-modifiable lists. But it fails to achieve closure because it doesn't define a canonical mapping of environments to programs - 'define' is not itself a program.
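The successor-and-pruning idea can be sketched as follows, assuming an environment is represented as an immutable tuple of deltas, oldest first. `successor`, `lookup` and `prune` are names invented for this example, and a real language would want pruning built in rather than called by hand.

```python
# Assumption: an environment is a tuple of dict deltas, oldest first.
# successor, lookup and prune are hypothetical names for this sketch.

def successor(env, **delta):
    """Return a new environment: the old chain plus one more delta."""
    return env + (dict(delta),)

def lookup(env, name):
    """Search the chain from the newest delta back to the oldest."""
    for delta in reversed(env):
        if name in delta:
            return delta[name]
    raise KeyError(name)

def prune(env):
    """Collapse the whole chain into a single equivalent delta."""
    merged = {}
    for delta in env:  # oldest first, so newer bindings overwrite older ones
        merged.update(delta)
    return (merged,)

env = ()
for i in range(1000):
    env = successor(env, counter=i)
env = successor(env, name="mu")

compact = prune(env)

print(len(env), len(compact), lookup(compact, "counter"))
```

The pruned environment answers every lookup the same way as the long chain, but a lookup of an old binding no longer has to walk a thousand deltas to find it.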
Factor, as a practical implementation of Joy, takes the concatenative idea but takes the Lisp route of turning everything into an operation on a Big Shared State - the system image - which is precisely the Wrong Thing to do and won't scale to the Internet. Yes, you can run a server on it. But that's answering the wrong question; I want a network which doesn't even have the *concept* of 'server'.
Then we get into single vs multiple inheritance, and static vs dynamic scoping, which are contradictory ways of looking at this 'delta' mechanism and neither of which seems quite appropriate as a fundamental, low-level formal language feature.
This is what I'm trying to describe with my inappropriately-named 'mu functions' - the concept still needs work, I think, but it seems almost there. And yet...