Different languages compared

Posted: November 29th, 2010 | Author: Mars | Filed under: Reference | 2 Comments »

A number of web sites offer comparisons of the same task implemented in different programming languages:

  • 99 Bottles of Beer: a program to print out the text of the song “99 Bottles of Beer”, implemented in 1,348 languages
  • Rosetta Code: several hundred reasonably common programming tasks represented in a few hundred different languages
  • Langref: several dozen common problems implemented in the fifteen currently-hip languages
  • PLEAC: all the examples from the Perl Cookbook, implemented to varying degrees in 28 languages; has lots of content but appears to be dead (no updates since early 2007)

Library design

Posted: November 23rd, 2010 | Author: Mars | Filed under: Uncategorized | Comments Off

With symbol literals and the “has” operator in place, I can start building a type system for the standard library. The library consists of modules, which live in a central directory; each module represents some related package of utilities.

I’ve decided to build a module around each abstract datatype; for example, you can import sequence from radian to use functions relevant to sequences, or import stack from radian to get a collection of stack functions. Each of these modules has a type? function, which examines some arbitrary object and tells you whether it implements the member functions necessary for that interface. I’ve also included an empty function which returns an appropriate null object for the interface. The empty sequence, for example, can still be iterated over, but it ends immediately and yields no values; the empty stack has no items.

For a type like stack, the empty function works like a constructor: you start with stack.empty, then push items onto it. After you’ve popped all the items back off, what you end up with is – once again – stack.empty. I intend to do the same for all the stock container types: dictionary, set, array, stack, queue, deque, tuple, and anything else I end up including.

There’s nothing obligating you to start with the built-in stack.empty, of course: as long as your own object implements the appropriate methods, stack.type?(your_object) will return true, and any function which works on stacks will accept it.

Beginning of a type system

Posted: November 20th, 2010 | Author: Mars | Filed under: Design, Progress, Syntax | Comments Off

Last night’s checkin added a new binary operator: has lets you inquire about a container’s attributes. The has operator has the same precedence as the comparison operators. Its left operand must be a container and its right operand may be any value. The result is a boolean, which is true if looking up the value in the container would result in a non-exceptional value, false if the lookup would fail.

Combining this with the new symbol-literal syntax lets you ask whether a given object implements some method:

if foo has :bar:
end if

Testing for a set of related methods lets you determine whether an object implements some interface:

function stack?(foo):
    result = foo has :push
    result = result and foo has :pop
    result = result and foo has :head
end stack?

Syntax for a symbol literal

Posted: November 19th, 2010 | Author: Mars | Filed under: Syntax | 6 Comments »

Symbols are a kind of atomic datatype which can be used to identify things within a program. You can think of them as a unique-ID. The Radian compiler generates symbols internally as part of the object structure, using them as keys in the member index; member access on an object then uses the same symbol to look up the method.

Examples of symbol literals in other languages – these are the only examples I know about, as most languages don’t use symbols:

  • Ruby: :foo
  • Scheme: 'foo
  • Clojure: :foo
  • Scala: 'foo

I’m not happy with either of these: the apostrophe feels too much like an unbalanced single-quote, and the colon already has two other meanings in the Radian syntax. Using it to begin a symbol would not introduce ambiguity, but I am concerned that reusing the same character in too many different situations would confuse people.

Determining whether an object has some method or property

Posted: November 17th, 2010 | Author: Mars | Filed under: Design | 2 Comments »
  • Ruby: has = foo.respond_to?('bar')
  • Python: has = getattr(foo, "bar", None)
  • JavaScript: has = foo.hasOwnProperty('bar')
  • PHP: has = method_exists($foo, "bar");
  • Perl: has = $foo->can("bar");
  • C#: has = foo.GetType().GetMethod("bar") != null;

This is also possible in Java, using the reflection API, which is so verbose that I refuse to waste time typing out an example. I don’t know whether Scala or Clojure has better syntax than Java does, but presumably they can at least use the same API.

New intrinsic parameter: argv

Posted: November 15th, 2010 | Author: Mars | Filed under: Design | Comments Off

I previously discussed the idea of an IO continuation: a two-parameter function which accepts a world-state and some value, then combines them, creating a sequence of IO actions to return. Each of these actions, when executed, may yield another continuation, which can in its turn generate more actions. It’s a self-extending sequence of little programlets.

The first programlet the IO engine executes is of course the main body of the program itself. This is no different than any other IO continuation; it just happens to be the place we start. It may start up as many IO state-machines as it likes, which will keep working and passing data along until the program has accomplished its job.

The parameter value could be a buffer read from a file, a packet received from a socket, a directory listing, a success/failure code, or whatever it was that the IO action was intended to produce or retrieve. For the main program body, the parameter value is the list of command-line arguments. I’ve followed the Ruby approach here and defined an intrinsic global parameter named argv: this and io are the two parameters the main program receives from the IO engine.

This simple program will print all of its arguments to the console:

for s in argv:
    io->print( s )
end s

Modules, which are singleton objects and not programs or IO continuations, do not have any such parameters; the only place argv is directly available is the main program file.

Rewritten IO, new input function

Posted: November 15th, 2010 | Author: Mars | Filed under: Progress | Comments Off

The io object I’ve been using all year is a simple stub, the barest minimum necessary to get some testing under way. It did not implement the queued IO strategy I described in my last post, but simply violated functional purity and hoped nobody would notice. This worked out fine since it had no way to read data, and until recently the language had no exceptions, but now it’s time to start doing a proper job of IO.

The new IO object looks almost exactly the same, but it implements the fancy queued, exception-safe IO scheme, and it has a more general internal architecture which will allow me to quickly create a whole array of useful IO functions. In the long term I would like to rewrite this module in Radian itself, using a foreign-function interface to call into the standard C library, but for now it’s enough to get the functionality in place.

The only new function so far is an input method, which reads one line from the standard input. I’ve implemented it, for now, to call a captured function rather than a method on an object. This is a functional style API, rather than the object-oriented style I’ve been favoring so far, but in the long run I plan to automate the whole callback process so I don’t think the choice will matter to anyone but power users.

This means that it’s now possible to write Radian programs that transform data. You’ll have to pipe the data in and back out, since there are no file-management methods yet, but it’s a start – this is the first time Radian can be said to actually have some utility as a programming language, marginal though it may still be.

IO system design

Posted: November 10th, 2010 | Author: Mars | Filed under: Uncategorized | 2 Comments »

Functional languages are good at describing pure data transformations, but how do you write a functional program that causes changes to occur in the rest of the computer, or even in the world outside the computer? The whole point of a purely functional language is that functions can only compute values; they cannot cause changes to occur. Doesn’t this mean that actions taken outside the program have to be inherently “impure”?

The solution is to think of the entire state of the outside world as an object; a value which can be described and manipulated. The program itself becomes a function which accepts a state-of-the-world object as a parameter, transforms it, and returns a new state-of-the-world as its result value. The job of a purely functional program is to describe the transformation that should occur on the state of the world outside itself: how to get from point A to point B. It’s up to the operating system, which has invoked the program, to decide what to do with that information.

Radian already has a system for describing transformations of objects; we can represent IO actions the same way. The program receives an implicit parameter named io, which is its representation of the state-of-the-world; the program can do whatever it wants to this object, and whatever value is left in the io variable when the program ends becomes the program’s description of the IO state it wants to create. (We’ll ignore for the moment the question of this object’s internal structure; suffice it to say this is just a pretty UI wrapped around all the old familiar IO streams and system calls.)

This means that an IO action doesn’t actually occur at the point where the program calls an IO method. Instead, the IO object adds the action to its internal queue. As the program continues, the IO queue grows; when the program is finished, it executes the queue, thus applying its state-changes. This is analogous to buffered I/O, where writes may actually live in a buffer for a while, only going to disk via flush() or when the IO subsystem decides it’s time to commit. This buffering approach is what makes Radian’s IO system thread-safe and exception-safe: you can always roll back to a previous IO state.

The problem with this pretty plan – a very big problem! – is that programs don’t just read from the outside world, perform computations, then write their results back to the outside world. It’s usually more of a conversation, where the data that comes in affects the data that goes out and vice versa. If your IO object only represents the state of the world as it was before the program began, how can you implement an interactive process? How, for that matter, can you read in data when the act of reading data itself has side effects? (Think of a socket buffer.)

One option might be to stop the program and flush the IO object before reading any data. The problem is that this doesn’t play well with parallelism, and gives us no way to roll back if an exception occurs – once you’ve flushed a set of IO actions, there’s no way to recall them.

Instead, I’m going to implement an asynchronous IO system. We can think of a read operation as an IO task like any other: the difference is that it doesn’t just change the world-state, but also creates some new information. We can’t just fire and forget; something needs to receive that information and do something with it. That something will be an IO continuation: a function which receives, as its parameters, an IO state and some value. This function is effectively a little program of its own: its job is to interpret the value, do something interesting with it, and return a new IO-state in its turn.

So, you want to read some data: you call io->read, specifying the input stream and the completion. This queues up an IO action which is to read data from that stream; when the read action finishes, it will call the completion routine, passing in the newly-modified world (moving the read pointer may be a tiny change but it is a “side effect” all the same!), plus the data that was read from the stream. Now it’s the completion function’s job: it can parse the data, add it to a buffer, do whatever it wants – write it to a socket, read more data, whatever. In the end it all comes down to another queue of IO actions, which it can return back to its caller. The process continues until some completion, somewhere down the line, decides that all the work is done and returns the IO object without changing it. If there are no actions left to perform, that’s the end – the program exits.

One of the larger problems with this architecture is that I believe people will find it inconvenient. I have observed that most people approaching IO like to think in terms of steps, and like to lay their operations out in a series of commands; recasting the work as a completion-driven state machine does not come naturally. I think I am going to have to implement another code-transformation engine, like the loop hoisting system, that can unroll a function into an IO state machine. There would be some new statement which marked a break, a wait point: it would first commit the IO actions up to that point, then await the result and resume executing.

It’s possible that I can build a single architecture which does something like Python’s sequence generators and this IO system all at once. For what is this IO queue but another sequence, after all? The problem of yielding a sequence of data values asynchronously is not so different from the problem of yielding a sequence of IO actions asynchronously; the key difference is that the IO completion function needs to get a new world-state object along with the data it has read.

Posted: November 6th, 2010 | Author: Mars | Filed under: Uncategorized | Comments Off

IO is a big problem, so I spent a little while poking at the test suite instead. The tests all pass again, and I’ve added a new one that gives the new for-loop sequence hoisting feature a little workout. I still want to set up an automatic pull / build / validate robot one of these days, but the language has been primitive enough so far that it hasn’t been a high enough priority.

Finished induction variable hoisting algorithm

Posted: November 4th, 2010 | Author: Mars | Filed under: Progress | Comments Off

Radian’s compiler can now hoist multiple independent subexpressions out of the body of a loop into a mapping sequence, extracting parallelizable computations from the loop body regardless of their complexity, order, or location, thereby allowing the parallel scheduler to distribute the work across multiple cores. This completes the feature I described on Tuesday.

There’s a mountain of work remaining before the compiler and the support library become anything that could be called a useful tool, but the combination of the hoisting algorithm and the parallel scheduler forms the centerpiece of the language concept, and I’m glad to have finished it. I think I’m going to miss it, too; I’ve been thinking over the eventual design of this piece of code for years, and it’s a little strange now to have actually built it.

The next project should probably be development of the IO system, which will necessarily include some support for concurrency. I’ll start with the standard input, output, and error streams, plus a way of reading command-line arguments.