Posted: June 29th, 2011 | Author: Mars | Filed under: Reference | 1 Comment »
Interesting reference material: it seems obvious in retrospect, but it had never occurred to me that the process of rendering a floating-point number into a string was the sort of project people might write papers about.
Here’s something I bet you never think about, and for good reason: how are floating-point numbers rendered as text strings? This is a surprisingly tough problem, but it’s been regarded as essentially solved since about 1990.
References to the “Dragon4” algorithm, and to a recent improvement, “Grisu3”.
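To make the problem concrete, here is a small Python sketch of the property these algorithms are after: the shortest digit string that still parses back to exactly the same float. It relies only on Python's built-in repr() and format(); it does not implement Dragon4 or Grisu3, and I am not claiming Python uses either of them internally.

```python
# The same double, rendered two ways.
x = 0.1

# repr() gives the shortest decimal string that round-trips to x.
shortest = repr(x)

# Seventeen significant digits always round-trip for a double,
# but they expose the binary approximation error.
padded = format(x, ".17g")

# Both strings parse back to exactly the same value.
assert float(shortest) == x
assert float(padded) == x

print(shortest)  # 0.1
print(padded)    # 0.10000000000000001
```

The hard part, and the reason there are papers about it, is finding that short string quickly and correctly for every possible value.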
Posted: June 27th, 2011 | Author: Mars | Filed under: Reference | 1 Comment »
Comparing language performance and memory usage in C and Python, using a simple program to remove duplicate lines from a text file.
Radian’s current feature set ought to be enough to build this test; I wonder how it would stack up.
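For reference, the Python side of such a test might look something like the sketch below. I am assuming the program keeps the first occurrence of each line and preserves input order; the function name unique_lines is mine, and the benchmark's actual code may well differ.

```python
import io

def unique_lines(lines):
    """Yield each line the first time it appears, preserving input order."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line

# An in-memory stand-in for a text file, so the sketch is self-contained.
sample = io.StringIO("a\nb\na\nc\nb\n")
result = list(unique_lines(sample))
print(result)  # ['a\n', 'b\n', 'c\n']
```

Memory use grows with the number of distinct lines, since every one of them has to be remembered in the set.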
Posted: May 27th, 2011 | Author: Mars | Filed under: Reference | Comments Off
Multicore Garbage Collection with Local Heaps, by Simon Marlow and Simon Peyton Jones:
In a parallel, shared-memory, language with a garbage collected heap, it is desirable for each processor to perform minor garbage collections independently. Although obvious, it is difficult to make this idea pay off in practice, especially in languages where mutation is common. We present several techniques that substantially improve the state of the art. We describe these techniques in the context of a full-scale implementation of Haskell, and demonstrate that our local-heap collector substantially improves scaling, peak performance, and robustness.
Posted: May 25th, 2011 | Author: Mars | Filed under: Reference | Comments Off
Bruno Jouhier’s “Yield – Resume vs. Asynchronous Callbacks – An Equivalence” continues a series of interesting explorations on asynchronous behavior using JavaScript.
Y-R Javascript is a small extension to Javascript. It introduces a new operator that can be applied to function definitions and function calls. The @ sign is the yield and resume operator. When present in a function definition, it means that the function executes asynchronously. Somewhere in its body, the function yields to the system and the system calls it back with a result or an error. It may actually yield and resume more than once before returning to its caller. I will call such a function an asynchronous function, in contrast with normal Javascript functions (defined without @) that I will naturally call synchronous functions.
In this post I want to investigate the relationship between this fictional Y-R Javascript language on one side and Javascript with asynchronous callbacks on the other side.
I will show that any program written in Y-R Javascript can be mechanically translated into an equivalent Javascript program with asynchronous callbacks, and vice versa.
Posted: May 11th, 2011 | Author: Mars | Filed under: Uncategorized | 2 Comments »
The foreign-function interface can just about do something useful now. You can load a function pointer from some external library, explain how you’d like to marshal values, then invoke it as an IO action. The value it returns will be marshalled back, according to your previous specification, into some kind of Radian object.
I’ve been sort of picking gently at the remaining bugs in the system while I think about a better syntax for doing asynchronous I/O. All Radian I/O operations are asynchronous, by nature, since the only way to execute an IO action is to yield it back to the system. Thus you must construct I/O activity as a callback-driven state machine.
This is actually the way I learned to do I/O back in the ’80s, on the non-preemptive classic Mac OS, and I’ve used similar techniques in microcontroller code. Nostalgic as the style may be, however, Radian’s lack of shared mutable state makes doing actual work this way something of a mind-bender.
There’s no getting around the fact that IO is asynchronous and callback-driven, but the pattern for writing sequential IO code in such an environment is predictable, and it should be possible to build in some support that makes the process less tedious. I’d like to introduce an async statement, or an async operator, which does all the relevant housekeeping work. It would split the current function apart: anything downstream of the returned value would be captured and passed in as an implicit callback function, which would carry on with the results of the process whenever the initial IO action had completed.
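To illustrate the transformation, here is a rough Python sketch of the same two-step IO sequence written twice: once as hand-built callbacks, and once in sequential style, with a generator's yield standing in for the proposed async operator. Everything here is hypothetical scaffolding (the tuple-based "actions", perform, and both runners); none of it is Radian syntax or Radian's actual runtime.

```python
# Hypothetical IO actions: plain tuples that a runtime interprets.
def read_action(key):
    return ("read", key)

def write_action(text):
    return ("write", text)

# Callback style: each step returns an action plus the function
# that should receive that action's result.
def greet_callbacks():
    def after_read(name):
        def after_write(_):
            return None, None  # nothing left to do
        return write_action("hello, " + name), after_write
    return read_action("name"), after_read

# Sequential style: each yield is a split point, and the code after
# it plays the role of the implicit callback.
def greet_sequential():
    name = yield read_action("name")
    yield write_action("hello, " + name)

def perform(action, store, log):
    """Toy runtime: reads come from a dict, writes go to a list."""
    kind, arg = action
    if kind == "read":
        return store[arg]
    log.append(arg)
    return None

def run_callbacks(start, store):
    log = []
    action, callback = start()
    while action is not None:
        action, callback = callback(perform(action, store, log))
    return log

def run_sequential(start, store):
    log = []
    steps = start()
    try:
        action = next(steps)
        while True:
            action = steps.send(perform(action, store, log))
    except StopIteration:
        return log

store = {"name": "Mars"}
print(run_callbacks(greet_callbacks, store))
print(run_sequential(greet_sequential, store))
```

Both runners drive the same sequence of actions; the difference is only in who writes the nested callbacks, the programmer or the language.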
It has been difficult to wrap my head around this semantic transformation, but I think I’ve come up with a way to do it. Along the way I accidentally found a way to implement C-style continue, break, and return statements, and worked out most of what I would need to do in order to introduce Python-style generator functions (yield, or in C# yield return), so it’s been some productive thinking-time even if I haven’t written much code.
It always feels like there is a lot of work piled up ahead of me, but I’ve been keeping a to-do list, and after the async system is done, the only significant task left before “first public beta” is the garbage collector. At that point I plan to spend some time polishing things up, writing documentation, and preparing an installer package.
Posted: April 15th, 2011 | Author: Mars | Filed under: Progress | Comments Off
The first batch of marshaling objects is in the library now. They all live in the ffi module at the moment, but they will be useful in any situation where one might want to render values as bytes or retrieve values from bytes, so I may end up breaking them out into their own module. The current list is uint64, int64, uint32, int32, uint16, int16, uint8, int8, ascii, and utf8. A marshaling object is one that implements the to_bytes and from_bytes methods: to_bytes accepts a value and returns a sized sequence of bytes, and from_bytes accepts a byte buffer and returns a value.
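As a rough picture of that interface, here is a Python sketch of two kinds of marshaling object built on the standard struct module and str.encode. The class names, the little-endian byte order, and the Python spelling of the methods are all my assumptions for illustration; this is not the Radian library code.

```python
import struct

class IntMarshaler:
    """Fixed-width integer marshaler built on a struct format code."""

    def __init__(self, code):
        # "<" pins little-endian byte order; that choice is an assumption.
        self._struct = struct.Struct("<" + code)

    def to_bytes(self, value):
        return self._struct.pack(value)

    def from_bytes(self, buffer):
        # Read only as many bytes as this type needs.
        return self._struct.unpack(bytes(buffer[:self._struct.size]))[0]

class TextMarshaler:
    """Text marshaler: an encoding maps abstract characters to bytes."""

    def __init__(self, encoding):
        self._encoding = encoding

    def to_bytes(self, value):
        return value.encode(self._encoding)

    def from_bytes(self, buffer):
        return bytes(buffer).decode(self._encoding)

uint32 = IntMarshaler("I")
int16 = IntMarshaler("h")
utf8 = TextMarshaler("utf-8")

print(uint32.to_bytes(1))              # b'\x01\x00\x00\x00'
print(int16.from_bytes(b"\xfe\xff"))   # -2
print(utf8.from_bytes(utf8.to_bytes("radian")))  # radian
```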
Posted: April 11th, 2011 | Author: Mars | Filed under: Uncategorized | Comments Off
Now that the built-in container types are finished, the next most important item on the to-do list appears to be a foreign-function interface: that is, a mechanism for calling functions from external libraries.
We can break this problem down into several pieces. We need:
- IO function to load a function pointer, by name, from some library file
- Mechanism to marshal Radian values into C types, and to create Radian objects from C values
- Annotation scheme so we can describe the parameter and return value signature for some function pointer
- IO function to invoke some annotated function pointer with some values to be used as parameters
It’s not clear what use a bare function pointer would be, so perhaps the annotation operation should occur at the same time as function-loading. Even so, the result cannot be an invokable, since we can’t know which external functions might have side effects or thread interactions; the external function pointer must be a type of IO action.
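For comparison, Python's ctypes module walks through the same four pieces: load a library, look up a function by name, annotate its signature, then invoke it. The sketch below assumes a POSIX system where the C runtime can be located, and it uses strlen purely as a convenient example; it says nothing about how Radian's interface will actually look.

```python
import ctypes
import ctypes.util

# Piece 1: load the library. If find_library returns None on a POSIX
# system, CDLL(None) falls back to the symbols of the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Piece 2: look the function pointer up by name.
strlen = libc.strlen

# Piece 3: annotate the parameter and return types so values are
# marshaled correctly in both directions.
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

# Piece 4: invoke it. The bytes object is passed as a C string and
# the size_t result comes back as a Python int.
length = strlen(b"radian")
print(length)  # 6
```

Note that ctypes lets you call the function directly, as if it were pure; that is exactly the step Radian cannot take, which is why the result has to be an IO action there.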
As usual with a big, complex problem, I’m going to start by implementing the piece that seems simplest and most obvious. That looks like the marshaling scheme: I will develop a system of type-objects, which can produce a byte buffer from a Radian value, or can create a Radian value from a byte buffer. I will need to consider endianness, and provide some support for variable-length buffers. This system will probably also incorporate support for text encodings, since those can be seen as schemes for marshaling an abstract string of characters into some concrete byte representation.
Posted: April 9th, 2011 | Author: Mars | Filed under: Uncategorized | Comments Off
It turned out to be a much bigger project than I had expected, but that really just demonstrates why this list object is such an important part of the Radian language. The Radian list does everything you’d expect a list to do, in a language of this type, but it’s an immutable (or “persistent”) data structure, so it is memory efficient and plays nicely with multiple threads.
Immutability has another advantage: it’s a bad idea to return an array from an object method in REALbasic, or a list from a method in Python, because you must either make a copy of the array every time or give any caller the ability to modify your object’s internal data storage. With an immutable list, however, giving away a reference to a list is effectively the same as making a complete copy, without actually taking up any additional memory. You can safely return a list from a method – the caller can make any changes they like to their new copy of the list without altering your object’s internal data.
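The Python half of that problem is easy to demonstrate. In this sketch the class names are made up, and a tuple stands in for an immutable list; unlike a persistent list, building a modified tuple copies every element, so it shows the safety but not the memory efficiency.

```python
class MutableRoster:
    def __init__(self):
        self._names = ["ada", "grace"]

    def names(self):
        # Hands out the internal list itself, not a copy.
        return self._names

class ImmutableRoster:
    def __init__(self):
        self._names = ("ada", "grace")

    def names(self):
        # Safe to share: a tuple cannot be changed in place.
        return self._names

leaky = MutableRoster()
leaky.names().append("intruder")   # the caller edits the object's storage
print(leaky.names())               # ['ada', 'grace', 'intruder']

safe = ImmutableRoster()
extended = safe.names() + ("guest",)   # builds a new tuple instead
print(safe.names())                # ('ada', 'grace')
print(extended)                    # ('ada', 'grace', 'guest')
```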
Posted: April 7th, 2011 | Author: Mars | Filed under: Reference | Comments Off
PEP 3151 takes a look at the organization of exception classes in the Python library and suggests a new, more useful arrangement.
While Radian has an exception mechanism, I have thought very little about an organization of exception objects yet. Radian doesn’t really have a class hierarchy, so it’s not clear how I would replicate the Python approach, much less a system like Java’s. It seems like exception values ought to have a type identifier as well as some data chunk; perhaps I’ll define an “exception” object with those two fields, then use a set of well-known symbols for the type identifiers.
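Sketched in Python, the two-field idea might look like the following. Plain strings stand in for symbols, and every name here (ExceptionValue, KEY_NOT_FOUND, IO_FAILURE, lookup) is invented for the example; it is only a guess at the shape, not a design decision.

```python
from dataclasses import dataclass
from typing import Any

# Well-known type identifiers; plain strings stand in for symbols.
KEY_NOT_FOUND = "key_not_found"
IO_FAILURE = "io_failure"

@dataclass(frozen=True)
class ExceptionValue:
    kind: str   # which well-known identifier this exception carries
    data: Any   # arbitrary payload describing what went wrong

def lookup(table, key):
    """Return the value for key, or an ExceptionValue if it is missing."""
    if key not in table:
        return ExceptionValue(KEY_NOT_FOUND, key)
    return table[key]

result = lookup({"a": 1}, "b")

# Handlers dispatch on the identifier rather than on a class hierarchy.
handled = isinstance(result, ExceptionValue) and result.kind == KEY_NOT_FOUND
print(handled)  # True
```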
Posted: April 5th, 2011 | Author: Mars | Filed under: Progress | 2 Comments »
Implementing the log-time finger tree algorithms has proven to be a substantial challenge. Examples are few, and most are written in Haskell, which I find largely unintelligible. After several false starts, however, I am making some progress. I’ve re-implemented the indexed element lookup, in a way that will survive splits and concatenations, and have roughed out the concatenate method. With that and a split function, I can easily build insert, remove, and assign.
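The reason split and concatenate are enough is easy to show with a stand-in. The Python sketch below uses tuples in place of the finger tree, so each operation here costs linear time rather than logarithmic, and the function names are mine; only the way insert, remove, and assign are composed carries over.

```python
def split(seq, index):
    """Return the elements before index and the elements from index on."""
    return seq[:index], seq[index:]

def concat(*parts):
    """Join any number of sequences into one new sequence."""
    out = ()
    for part in parts:
        out += part
    return out

def insert(seq, index, value):
    left, right = split(seq, index)
    return concat(left, (value,), right)

def remove(seq, index):
    left, right = split(seq, index)
    _, rest = split(right, 1)       # drop the element at index
    return concat(left, rest)

def assign(seq, index, value):
    left, right = split(seq, index)
    _, rest = split(right, 1)       # drop the old element
    return concat(left, (value,), rest)

letters = ("a", "b", "c")
print(insert(letters, 1, "x"))   # ('a', 'x', 'b', 'c')
print(remove(letters, 1))        # ('a', 'c')
print(assign(letters, 1, "x"))   # ('a', 'x', 'c')
print(letters)                   # ('a', 'b', 'c')
```

The original sequence is never modified, which is the same persistence property the real list provides.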
The list object has been more of a slog than I expected; the finger tree is substantially more complex than the Andersson tree used in the map object. It is still clearly the right approach, however, and I am pleased to have such a powerful tool built into the foundation of the language.