Access specifiers

Posted: March 31st, 2011 | Author: Mars | Filed under: Uncategorized | 1 Comment »

Radian objects and modules currently expose all of their members, since I’ve yet to invent any way to hide them. This is obviously not great, since it means there is no way to distinguish between the elements of an object which are supposed to be part of its public interface, and elements which are mere implementation details, not to be used by outsiders. Radian clearly needs some kind of member access specifier scheme, but what should it look like?

Java and its cousin C#, as well as my old friend REALbasic, use a leading keyword on each declaration. One declares a private void foo() or a public int bar(). This is certainly clear, but it’s verbose, and it makes scope a more prominent feature of the declaration than it really deserves to be.

In C++ and Ruby, access specifiers are independent statements. One typically declares groups of public or private members together, and a single private or public changes access rights for all declarations which follow, until the next access specifier. This tends to result in classes where all the public declarations are grouped together, which is convenient, but in long class declarations it can become difficult to remember what the current access level is.

Python, typically the first place I look for language ideas, has an unusual approach: the language semantics do not actually include any such thing as a private member, and the distinction between public and private members is maintained by convention. Everyone agrees that members whose names begin with a double underscore are private, and that’s that. Private identifiers look different enough from regular identifiers that it’s hard to do the wrong thing by accident.

Go also uses a naming convention, though it is language-enforced and not just a convention: methods beginning with uppercase letters are public, while methods beginning with lowercase letters are private.

What’s the right choice for Radian? I am drawn to the cleverness of the Go approach, but it’s a bit too clever, and I think the distinction will be too easy to miss. I’ve mulled over the idea of a private: / end private statement block, all of whose contents would become private members, but attempting to implement it quickly became a hairy mess, which I tend to take as a sign that the concept is fundamentally flawed. Using per-declaration access specifier keywords, while time-tested, just doesn’t look good to me: it’s too noisy, it doesn’t suit the language.

This leaves me with something like the Python approach: define a naming convention for private members, make it obvious enough that you can’t violate the convention by accident, then leave it up to Radian programmers to respect the rule and do the right thing. I can always come back later and build in some language-level enforcement if I think of an elegant way to implement it, but for now the language can get by without it.

What, then, should this convention be? It’d be easy enough to copy Python’s convention and use a leading double underscore, but I don’t really like it. There are a handful of unused ASCII punctuation marks still: @, $, and % are all too bulky, and neither ;, `, nor \ feel right, but ~ or ^ could work. It’s pretty much arbitrary, so I will try a few things out and codify whatever looks best.


Operator implementation

Posted: March 29th, 2011 | Author: Mars | Filed under: Design | Comments Off

The infix operator implementations have always been something of a temporary hack. There was a built in list of arithmetic operations and comparisons for a while, then they became a series of intrinsic functions in the standard library. Over the weekend I reimplemented them in a more generic style. The infix operators are now nothing more than syntactic sugar: each one is associated with a well-known method name, and an object which wants to implement that operator simply provides a method with the appropriate name. This is the same basic idea as the compare_to method I described a couple weeks ago.

The implementations have not changed: the object member invocation syntax is a lookup operation anyway, so I’ve simply made the process of identifying the correct add, subtract, multiply, or divide function explicit instead of hardcoding the linkage into the compiler.

The operator-to-method correspondences are:

+ => add
-  => subtract
* => multiply
รท => divide
mod => modulus
** => exponentiate
<< => shift_left
>> => shift_right


Progress on the list object

Posted: March 29th, 2011 | Author: Mars | Filed under: Progress | Comments Off

I’ve implemented the lookup and element-assignment operations on the list now. Lookup uses the subscript protocol: instead of calling a method, you use brackets to look up the element by index. That is, the list’s default operation is the lookup, like this:

var names = ["Joe", "Erin", "Bill", "Colin", "Tessa"]
io->print( names[2] )
…prints the text Bill to the console.

In order to replace one of these values, one calls the assign method:
names->assign(1, "Jessie")

I am thinking about sugaring this so that one can assign to a bracket subscript instead, but it’s not a high priority. It would look like this:
names[1] = "Jessie"

The remaining methods the list object needs are insert, remove, concatenate, and of course iterate.


Programming language: ooc

Posted: March 28th, 2011 | Author: Mars | Filed under: Reference | Comments Off

Not sure why I have never noticed the ooc language before, but its buzzword list is well chosen: “Object-Oriented, Functional, Concurrent, Modular, Portable – Awesome!”. The syntax is a trifle funky but not hard to pick up. Looks like the language is developed to the point that they have written a self-hosting compiler, which is a pretty neat milestone.


Partial implementation of the list

Posted: March 24th, 2011 | Author: Mars | Filed under: Design, Progress | Comments Off

Incorporating Charles Y.’s suggestion, I’ve renamed the “array” object to “list”. It plays a similar role to an array, but there’s no array anywhere in its implementation, and it’s really more interested in the fact that items occur in some order than in the fact that you can look those items up using a numeric index.

I’ve also taken Guyren H.’s suggestion for method nomenclature: append inserts a new item at the tail of the list and chop removes it. Thus one can use the list as a stack through the push/pop/head functions, as a queue through the push/chop/tail functions, or as a deque by using all six of push/pop/head/append/chop/tail. All of these functions operate in constant time or amortized constant time.

These six functions are all I’ve built so far. You will eventually be able to insert, remove, or [lookup] a specific index, not to mention iterate over all the items or get the size of the list, but those operations require an extra level of bookkeeping I haven’t had time to implement yet.


Array as deque: API thoughts

Posted: March 22nd, 2011 | Author: Mars | Filed under: Design | 4 Comments »

Now that the map object is finished, the array is the only remaining intrinsic data type on the development plan. The point of an array, of course, is to store a series of values in some specific order, and to allow efficient access to those values by index. Radian’s arrays will use a densely-packed zero-based ascending index, which is to say they will not have gaps, and that an array containing N items will store values at indexes 0..(N-1). Perhaps someday it will make sense to allow a variable lower bound, or to allow gaps in the index, but for now I don’t want to worry about those complications.

An array is a container, so it will have empty? and size functions. It is a sequence, so it will have iterate. It is also an indexed container, so it will also have a [lookup] function and the insert and remove methods.

The best way to implement this data structure is, without a doubt, the finger tree. This is an immutable data structure which offers some highly appealing performance characteristics and has enough versatility to serve as the foundation of a deque, a string, an array, a priority queue, and more.

Ah yes, the deque. I grew up on arrays, but these days I more often find myself reaching for queue, stack, or deque classes when I need to keep a list of values in some order. Most of the time, it turns out, all I really need is a way to construct a list and review it in order, or possibly in reverse order. The Radian library definitely needs to include such tools.

So why not, I wonder, fold the deque right into the array? The finger tree offers amortized constant-time read and write access to the first and last item in the array, so the deque operations would practically come along for free if I decided to expose them. Methods push, pop, and head would let you insert, remove, or lookup the item at one end of the list, and add, shift, and tail would let you work with the item at the other end. Push/pop/head gives you a stack; push/shift/tail gives you a queue going in one direction, and add/pop/head gives you a queue going in the other direction.

I’m not completely satisfied with this nomenclature: in particular, I don’t think shift is quite right for an operation which removes the last entry from the array. It feels like it ought to remove the first entry, thus shifting all the other elements back. That would certainly work, but then “pop” has to move to the end of the list, which means “push” has to come with it, and now the head and the tail are backwards. Could the head really be the last element of the array, and the tail represent element zero? No matter which way I try to arrange this, something ends up sounding a little bit wrong.


Map implementation finished

Posted: March 16th, 2011 | Author: Mars | Filed under: Uncategorized | 2 Comments »

Radian’s built in map object is done, and it’s hooked up to the brace syntax for construction. The construction expression should be a tuple; the constructor adds each item in the tuple to the map as a separate element. This is exactly the same as calling the map’s add method, passing that element in as its parameter.

var some_data = { "foo" => 1, "bar" => 2, "baz" => 3}
is the same thing as

var some_data = { "foo" => 1 }
some_data->add( "bar" => 2 )
some_data->add( "baz" => 3 )

Lookup uses the indexed-container protocol – that is, it works the same way for a map as it does for a tuple, using postfix square brackets:

var num = some_data["bar"]

This is the same protocol used for object members, which are looked up using symbols, so you can actually make your own object-like maps by associating symbols with captured functions.

Finally, you can remove an element from the map:

some_data->remove( "foo" )

I decided not to have remove raise an exception if the key is not present. Instead, it simply leaves the map unmodified. Similarly, adding an element whose key is already present does not raise an exception; it replaces the old element’s value with the new one.

Map is a container, so it can tell you if it is empty?, but it’s not a sized container, so it can’t (currently) tell you what its size might be. Nor is it currently possible to iterate over the map elements, though I expect that will come along eventually as it is the sort of thing which can’t be implemented without access to the map’s internal data structure.


Map syntax revised again

Posted: March 14th, 2011 | Author: Mars | Filed under: Uncategorized | 2 Comments »

The original syntax for map containers used a special overloading of the colon token as a key-value separator:

var map = {1: "foo", 2: "bar", 3 : "baz"}

While I like the way this reads, I don’t like the way I’ve given the colon three different syntactic roles, and I feel stingy introducing a useful syntax that only works with the built-in map object. Surely people will want to implement their own key-value map datatypes and initialize them in a familiar-looking way!

The solution I’ve chosen is to use the Ruby/Perl style => token to express the key-value grouping. This is a right-associative binary operator, with precedence above the comma and below the logical operators, and its function is to group its operands into a 2-tuple. With this change, the above code sample reads like this:

var map = {1 => "foo", 2 => "bar", 3 => "baz"}

The map-construction syntax is now much simpler: it accepts any expression, and calls the map constructor with that expression as the single argument. The map constructor expects its parameter to be an N-tuple, where each element represents one key-value pair; if the element is a 2-tuple, element 0 is the key and element 1 the value, otherwise the element itself is used as both key and value. This way the map can also be used as a set with no additional fuss.


Comparison operator overloading

Posted: March 14th, 2011 | Author: Mars | Filed under: Design, Progress | 1 Comment »

I’ve just redesigned Radian’s suite of comparison operators so they can be implemented by any object. A “comparable” object is one which implements a compare_to method; this method must express the relative ordering of the self object and the method’s parameter. This is essentially the same as the Operator_Compare scheme I developed for REALbasic. I’ve added compare_to methods to the number and symbol atoms, and will do the same for the string atom too.

This came up because the map container needs to sort its keys in some specific order. Obviously the map can’t know about all possible key datatypes ahead of time, so the language needs a standard interface for comparability, and with that in place there’s no reason to implement the comparison operators any other way.


Map policy

Posted: March 13th, 2011 | Author: Mars | Filed under: Design, Progress | 1 Comment »

I’ve been making some progress on the map datatype, one of a handful of built-in data structures. I decided to take the cheater’s way out and write it in C, inside the runtime library, instead of following my original plan and building it up from primitives (tuples) in Radian itself. I can always come back and rewrite it later; for now I am more interested in getting the language bootstrapped.

I’m considering a policy decision: what happens when you try to insert a value to a key which is already present? Does the map raise an exception, since you’re trying to do something whose meaning is somewhat ambiguous, or does it simply replace the existing value with the new one? The same question applies to the other map operations: what should happen if you try to remove a key which is not present? Does the map remain unchanged or raise an exception? Is the operation “make a map which lacks this key”, or “remove this key from the map”?

I am leaning toward raising exceptions in all of these ambiguous cases, because it seems more correct than making a policy decision which might be wrong sometimes. I am concerned that this will be annoying, though, and that the first thing I will end up doing is building a wrapper around the stock map that has the “just do it” behaviors.