Subscripts and invocations
Posted: December 20th, 2009 | Author: Mars | Filed under: Design | 4 Comments
In C, the name of a function returns a pointer to the function. A function call is a combination of the function-reference expression with a parameter subscript. Thus, parentheses are required whether they contain any arguments or not; it is the parentheses that distinguish a call from simple reference. In the same language, however, the name of a variable returns that variable’s value. In order to get a reference to the variable, you must prefix the name with the ampersand. This seems a little inconsistent, but in practice it works well, and it makes the use of function pointers feel natural and convenient.
I started out with the same system for the Radian grammar, but decided it made less sense here. I believe that heavy reliance on punctuation tends to make the learning process more difficult. It’s much easier to look up an unfamiliar term or to consult the documentation for some unfamiliar module than it is to guess at the meaning of some novel piece of punctuation. I banished the empty parentheses, therefore, and decided that naming a function invokes it. One must use the capture operator to get a reference to some function.
The problem is that I want to be able to subscript container types (like tuples, arrays, and maps) in order to get element values back, like this:
var foo = ["zero", "one", "two", "three"]
io->print(foo(1))
This doesn’t work, because foo is a var, not a function. The subscript expression is no longer an independent operator, but an adjunct to the act of invoking the function, and has no definition for symbols which are not functions.
One solution would be to define a meaning for the parentheses, when applied to a variable or constant name. This would work, but it gets ugly fast. You can’t tell, when you look at a name followed by a subscript, whether that is a function call with a parameter, or a reference to an element of some container. REALbasic had this problem, since Basic traditionally uses parentheses for both types of subscript, and I was never happy with the grammar compromise we were stuck with.
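To make the ambiguity concrete, here is a small Python sketch (not Radian, and not REALbasic's actual implementation). The class name `ParenIndexed` and the function `bar` are made up for illustration; the point is only that once containers answer to parentheses, the two expressions at the bottom are indistinguishable at the call site.

```python
class ParenIndexed(list):
    """A list that also answers foo(i), mimicking Basic-style parenthesized subscripts."""

    def __call__(self, index):
        # Treat a "call" with one argument as an element lookup.
        return self[index]


def bar(index):
    """An ordinary function that happens to take one integer parameter."""
    return ["zero", "one", "two", "three"][index]


foo = ParenIndexed(["zero", "one", "two", "three"])

# Both lines have the shape name(argument); nothing in the syntax says
# which one is a container subscript and which one is a function call.
results = (foo(1), bar(1))
print(results)  # ('one', 'one')
```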
Instead, I’m going to introduce a second type of subscript, using square brackets. I’m already using square brackets in a non-subscript context as an array literal, as in Python or JavaScript, so I think it makes sense to borrow the subscript syntax as well. This will be a postfix operator, not bound to an identifier, so it can be applied to any expression.
io->print(foo[1])
The semantics are not completely clear. Radian has an intrinsic type, the tuple, which I have intended to be a primitive container and not an object. Once you’ve created a tuple, the only thing you can do is ask for one of its elements, by index. The implementation is that the tuple is a function which accepts one parameter, the index. This suggests that the subscript operation should simply call the function reference the expression yields, passing in the value as the sole parameter: exactly the same thing the invoke operator already does. Is there any need for an invoke operator, then? It seems an unfortunate conflation: the square brackets feel right for “get an element from this container”, but arbitrary for “invoke this function reference”.
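As a rough illustration of the "tuple is a function of its index" idea, here is a Python sketch. It is only a model under stated assumptions: `make_tuple` and `subscript` are hypothetical names, and `subscript` stands in for what the square-bracket operator might desugar to, not for anything Radian actually defines.

```python
def make_tuple(*elements):
    """Build a 'tuple' that is nothing more than a function from index to element."""

    def lookup(index):
        return elements[index]

    return lookup


def subscript(container, index):
    """Hypothetical desugaring of container[index]: call the container with the index."""
    return container(index)


foo = make_tuple("zero", "one", "two", "three")

# Under this model the subscript and a direct invocation do the same thing,
# which is exactly the conflation discussed above.
print(subscript(foo, 1))  # one
print(foo(1))             # one
```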
Further, it’s less clear that this implementation would work for more complex containers, which are likely to be objects. An object is a function which accepts a single parameter, which is a selector identifying a member; the object returns a reference to the function representing that member. A container object, then, would need to accept either a selector representing a member, or an index value representing one of the contained values – how is it to know the difference?
Perhaps it doesn’t matter. Instead of thinking of containers as one subtype of objects, perhaps objects are a subtype of containers! Perhaps the object member access syntax is just a quick shorthand for a common use of a common type of container. The interesting consequence of this approach is that you could create objects out of other containers: if you had some existing map/dictionary type, you could stuff it full of symbol keys mapped to function reference values, and that would be just as legitimate an object as any created through the built-in syntax.
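Here is a minimal Python sketch of that last idea, assuming an object is just a function from a selector to a member function. The names `make_object`, `point`, and its members are invented for the example; a plain dict stands in for "some existing map/dictionary type".

```python
def make_object(members):
    """Wrap a mapping of selector -> function as an 'object': a function of one selector."""

    def dispatch(selector):
        # Return a reference to the member function; the caller decides when to invoke it.
        return members[selector]

    return dispatch


# An ordinary dictionary, filled with symbol keys mapped to function references.
point = {"x": lambda: 3, "y": lambda: 4}
point["norm"] = lambda: (point["x"]() ** 2 + point["y"]() ** 2) ** 0.5

obj = make_object(point)

# Member access is a lookup followed by an invocation.
print(obj("x")())     # 3
print(obj("norm")())  # 5.0
```

In this model an index-taking tuple and a selector-taking object have the same shape; they differ only in what kind of key they accept.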
Funny you should suggest that objects might be a subtype of containers. I’ve been musing about such a language off and on for the last several months, in which lists, sets, and maps are the primitive building blocks for most everything else. Obviously, I haven’t gotten very far…
I like the idea of [] to express key-value operations. Were it supported for objects of any sort, foo.bar would invoke the object method, and foo['bar'] would return a function reference (perhaps a delegate bound to foo). And one could add methods and properties to an object at runtime.
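A rough Python sketch of that split, for illustration only: the class name `DynamicObject` and the members `bar` and `baz` are hypothetical, and this says nothing about how Radian would implement it. Naming a member with a dot invokes it, while the subscript returns a reference bound to the object.

```python
import functools


class DynamicObject:
    """Members live in a dict; foo.bar invokes a member, foo['bar'] returns a bound reference."""

    def __init__(self, **members):
        self._members = dict(members)

    def __getitem__(self, key):
        # Subscript: hand back the function bound to this object, without calling it.
        return functools.partial(self._members[key], self)

    def __setitem__(self, key, function):
        # Members can be added or replaced at runtime.
        self._members[key] = function

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_") or name not in self._members:
            raise AttributeError(name)
        # Dot access: naming the member invokes it.
        return self[name]()


foo = DynamicObject(bar=lambda self: 42)

print(foo.bar)        # 42
ref = foo["bar"]      # a bound function reference, not yet called
print(ref())          # 42

foo["baz"] = lambda self: self.bar + 1  # method added at runtime
print(foo.baz)        # 43
```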
Having things so that naming a function invokes the function introduces an asymmetry in the way functions are handled and the way other entities are handled, though.
Assuming ‘now’ is the name of a function returning the current time, and ‘first_primes’ is the name of a list containing the first 3 prime numbers, in Python we would have:
date = now()
thing = first_primes[1]
Now ‘date’ names a time, say when this comment was written, and ‘thing’ names a number. In both cases the fancy punctuation does something to the entity in order to get another value. Also:
date2 = now
thing2 = first_primes
Now date2 is a function that returns the time, just like ‘now’, and thing2 is a list of the first three primes. In both cases naming the entity just gets you the entity.
However, in Radian it looks like you’d have:
var date = now
var thing = first_primes[1]
var date2 = capture now
var thing2 = first_primes
i.e. you need to use fancy punctuation to produce a value from a list, but not in the case of functions, whereas you can create a synonym for a list simply by naming it but you need to to use a fancy keyword to produce a synonym for a function.
To me this seems very confusing. To know what an assignment expression means, I need to know what the types of the entities involved are. To get a name’s value I have to do something different depending on the type.
In my view, functions should be treated just as any other value, and names for functions should behave just as any other name.
(Python of course even has this when creating values and assigning them:
from datetime import datetime
now = lambda: datetime.now()
first_primes = [2, 3, 5]
i.e. we have fancy syntax to create both functions and lists. That’s not the usual way of creating a function, of course… )
Although I’d like to add I share your intuition that foo() is a clunky, clunky piece of syntax.
However, bar('a', 2) isn’t awful, and foo() as a limiting case of this does make sense. It’s also what people are used to. While I think you’re right and weird punctuation is offputting and hard to understand, function invocation using brackets for the 0-argument case isn’t so very bad, especially as it’s consistent with the case of 1+ arguments.
(LISP’s function invocation syntax (foo) does seem like a distinct improvement here. )
Thinking about this a bit more, it’s the verbosity that bothers me, not the punctuation. I want the common operation to be the concise one, and the less common operation to have the extra syntax.
In C, the name of some item yields its value, if it happens to be a variable or a constant, and its reference, if it happens to be a function. I would prefer that the name of some item, absent any additional syntax, always yields its value. That is, when you say x = y, the value of x must be equal to the value of y, at the point of the assignment.