Posted: April 30th, 2013 | Author: Mars | Filed under: Design | Comments Off
Every programming language which manipulates strings offers a pair of functions which convert text to upper or lower case. These functions are often used to perform case-insensitive comparisons – you just convert both strings to either upper or lower case first. This generally works, but it fails for some scripts, and so the Unicode standard defines a case-folding transform which produces a normalized string suitable for caseless matching.
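To make the failure concrete, here is a small sketch in Python (used purely for illustration; this is not Radian code). It applies the lower-casing shortcut described above to a German word and to the two lower-case forms of the Greek letter sigma. The helper name naive_caseless_equal is hypothetical.

```python
def naive_caseless_equal(a: str, b: str) -> bool:
    """Compare two strings by lower-casing both sides, the common shortcut."""
    return a.lower() == b.lower()

# Plain ASCII text behaves as expected.
ascii_ok = naive_caseless_equal("Radian", "RADIAN")

# German sharp s: "STRASSE" lower-cases to "strasse", which is not the
# same string as "stra\u00dfe", so the comparison reports a mismatch.
german_ok = naive_caseless_equal("stra\u00dfe", "STRASSE")

# Greek sigma has a word-final lower-case form (U+03C2) in addition to
# the ordinary one (U+03C3); lower-casing leaves both as they are.
greek_ok = naive_caseless_equal("\u03c2", "\u03c3")

print(ascii_ok, german_ok, greek_ok)
```

The last two pairs are ones a reader would normally expect to match in a caseless comparison, which is the gap a dedicated case-folding transform is meant to close.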
Radian’s string library implements to_upper and to_lower, and it seems reasonable that it should offer a case-folding function too. But what to call it? Nobody else seems to be offering such a function: I can’t find one in .NET, in Python, in PHP, in Ruby – Go has a mysterious function called SimpleFold, but whatever it’s doing, it isn’t what I’m trying to do.
Posted: April 24th, 2013 | Author: Mars | Filed under: Uncategorized | Comments Off
- Fixed a bug in the garbage collector copy-forwarding mechanism which caused strange runtime library assertion failures after sync statements.
- No longer misassigns a terminator byte when performing concatenations of certain very short strings.
- const statement renamed to def. The semantics are the same; only the keyword has changed. The statement never created a constant in the mathematical sense; it simply defines a name for a value which cannot be redefined.
- Importing a file whose name begins with an underscore no longer fails; the compiler had been trying to load a mangled version of the file name instead. While the compiler does not enforce this convention, prefixing a file name with an underscore is a way to show that the file contains private implementation details and should not be imported from outside the directory which contains it.
- No longer misprocesses hex character escapes inside string literals.
- More specific error message when you try to modify the self object inside a function rather than a method.
- More helpful error message when an import statement fails: now points at the location of the import statement which failed. This might happen if you mistype the import name and accidentally specify a file which does not exist.
- Fixed a race condition in the parallel work dispatcher which sometimes caused a null dereference crash.
Posted: December 17th, 2012 | Author: Mars | Filed under: Progress | Comments Off
I have added two new entries under Documentation: a writeup about the map object and another about the string object. These pages describe the syntax, list the methods and functions offered by the objects, and discuss the computational complexity of the available operations.
Posted: December 3rd, 2012 | Author: Mars | Filed under: Reference | Comments Off
Interesting research from Microsoft on equipping compilers with information necessary to automatically parallelize code via reference immutability tagging:
A key challenge for concurrent programming is that side-effects (memory operations) in one thread can affect the behavior of another thread. In this paper, we present a type system to restrict the updates to memory to prevent these unintended side-effects. We provide a novel combination of immutable and unique (isolated) types that ensures safe parallelism (race freedom and deterministic execution). The type system includes support for polymorphism over type qualifiers, and can easily create cycles of immutable objects. Key to the system’s flexibility is the ability to recover immutable or externally unique references after violating uniqueness without any explicit alias tracking.
Posted: October 26th, 2012 | Author: Mars | Filed under: Reference | Comments Off
A slideshow by Rob Pike, one of the principal inventors of Go. I was surprised by the number of places I found myself nodding along in agreement, having independently arrived at similar opinions about the Way Things Ought To Be Done.
Go has definitely taken over a big piece of the ground I had originally intended to cover with Radian, but I still think there’s room for a comparable language on the more dynamic / scripty end of the scale. Go is a tool for concurrent programming in the large, but there are lots of people who work in the terrain originally occupied by shell scripts and currently dominated by Python, Ruby, and to a decreasing degree Perl, and I think Radian has something to offer there.
Posted: October 26th, 2012 | Author: Mars | Filed under: Reference | Comments Off
- Perl:
$low = lc $str; $hi = uc $str;
- Python:
low, hi = (str.lower(), str.upper())
- Ruby:
low, hi = str.downcase, str.upcase
- Java:
String low = str.toLowerCase(); String hi = str.toUpperCase();
- Go:
var low, hi = strings.ToLower(str), strings.ToUpper(str)
- C#/.NET:
string low = str.ToLower(); string hi = str.ToUpper();
Posted: October 25th, 2012 | Author: Mars | Filed under: Design, Progress, Syntax | 2 Comments »
Radian offers two simple symbol types: var lets you define a symbol to which you can later assign a new value, while const is a definition which cannot later be changed. I had expected to make heavy use of const in Radian code since it echoes a pattern I use frequently in C or C++, but in practice I’ve found myself shying away from it. The reason is entirely superficial: it doesn’t feel right, because the values I would be assigning just aren’t constants. Instead, most of the consts I would define are intermediate values – things that will change on every invocation of the function or every pass through the loop, but which can remain unchanged once I’ve defined them. As such it just feels weird to call them constants, and so I tend to define them as var even if I have no intention of ever redefining them.
I still think that const has a good place; in fact I think that using it heavily is good style. I’ve decided therefore to rename it. Stealing a keyword from Python, “constants” are now “definitions”, using the keyword def. I’d avoided def since Python uses it for function definitions, specifically, while Radian functions use function, but sometimes one’s nice clean abstract ideas don’t pan out in practice.
It’s about time to freeze the syntax for a while. Aside from the half-finished regex literals, which are actually present in 0.6, I don’t see any further syntax changes on the horizon. All the upcoming work is in libraries and the toolchain.
Posted: October 10th, 2012 | Author: Mars | Filed under: Uncategorized | Comments Off
A new version of Radian is available for download. Changes since 0.5.0:
- No longer handles sync expressions in the ‘predicate’ term of a list comprehension improperly; this could have led to a compiler crash.
- Former IO object method read_file has been moved to the file module and renamed read_bytes, since its job is to read the file in as a byte buffer. There is no longer a filespec object to provide as a parameter – just use a path string to identify the file.
- ffi.load_external function no longer requires its arguments to be string literals; it will now accept any string for the file path or function name.
- sync expression no longer returns an incorrect value; it had been returning the last IO action applied instead of the result of that action.
- Concatenating a string onto a string literal no longer fails when the right-hand string is not also a string literal; it had been raising an assertion.
- Text encoding objects live in a new encoding library; encodings convert strings to and from byte streams. Encodings currently offered are ASCII, UTF-8, and UTF-16 in its big- and little-endian variants.
- FFI library no longer includes specific types for each supported string encoding: instead, there’s a single ffi.string object which accepts a parameter specifying the text encoding to use.
- file.read_string is a new function which reads a whole file in as a string; you specify the file path and the text encoding to use.
- string.length function computes the number of codepoints in a string.
- string.from_codepoint function creates a single-character string from an integer codepoint value.
- Line continuation after binops will now work outside of a function or other indented block; it used to fail when used on a root-level statement.
- String concatenation algorithmic complexity is now amortized O(log N) on the length of the composite string; it was previously O(N). Iteration time per character is still approximately constant.
- string.is_empty function returns true if the string contains no characters.
- string.split_lines function breaks a single string into a sequence of lines, separated by linebreak; the function accepts any of LF, CRLF, or CR as valid line breaks. An empty file will produce an empty sequence, but characters left on the end of the file with no trailing linebreak will be returned as their own line.
- sequence.take function accepts a sequence and a number of elements, then returns the first N elements of the sequence, or the whole sequence if it has N or fewer elements. A corresponding string.take function returns the first N characters of a string.
- sequence.drop function accepts a sequence and a number of elements, skips the first N elements of the sequence, then returns the remaining sequence. A corresponding string.drop function does the same job for characters in a string.
- sequence.singleton(X) function returns a one-element sequence consisting of the specified value X.
- sequence.replicate(X, N) function returns a sequence N elements long where every value is equal to the specified X.
- sequence.length(seq) counts the number of elements in the sequence.
- assert statement evaluation is no longer deferred until after a sync: if the assert fails, the sync will return an appropriate exception instead of a meaningless value.
- string.slice(str, begin, length) returns a substring of length no greater than the specified number of characters, beginning at the specified number of characters after the beginning of the argument string. If begin is equal to or greater than the length of the string, slice will return an empty string.
- string.replicate(str, count) creates a string by repeating the argument a specified number of times. If the count is zero or the string is empty, replicate will return an empty string.
- set.union(a, b), set.intersection(a, b), and set.difference(a, b) implement three of the four basic set operators. (It remains to be seen whether there is a practical implementation for complement.)
- New queue module implements a queue data structure: append items to the tail of the queue, then pop them from the head. The queue object also implements the sequence interface, so you can also iterate over a queue instead of using head/pop.
- io.write_file function moved to file module and renamed file.write_bytes. A related file.write_string accepts an encoding parameter, allowing you to write a string out as a text file.
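The behaviour described above for split_lines, take, drop, slice, and replicate can be sketched in Python. This only illustrates the documented semantics and is not Radian's implementation: the Python function names simply mirror the Radian ones, Python lists and strings stand in for Radian sequences and strings, and counts are assumed to be non-negative.

```python
import re

def split_lines(text):
    """Split on LF, CRLF, or CR; a trailing partial line is kept."""
    if text == "":
        return []  # empty input produces an empty sequence
    lines = re.split(r"\r\n|\r|\n", text)
    # A final linebreak ends the last line; it does not start a new one.
    if lines[-1] == "":
        lines.pop()
    return lines

def take(seq, n):
    """First n elements, or everything if there are n or fewer."""
    return seq[:n]

def drop(seq, n):
    """Everything after the first n elements."""
    return seq[n:]

def slice_string(s, begin, length):
    """At most `length` characters starting `begin` characters in."""
    return s[begin:begin + length]

def replicate(s, count):
    """The string repeated `count` times."""
    return s * count

print(split_lines("one\ntwo\r\nthree\rfour"))
print(take([1, 2, 3], 5), drop([1, 2, 3], 1))
print(slice_string("hello", 3, 10), replicate("ab", 3))
```

Python string slicing counts code points, which matches the codepoint-based string.length described in the list.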
Posted: September 27th, 2012 | Author: Mars | Filed under: Design, Progress | Comments Off
The regex system is turning out to be a larger project than I had anticipated. It’s still important, but as the length of time it appears likely to consume continues to grow, its immediate priority is dropping. I’m still working on it, but I’m not going to let it delay the long list of smaller pieces of functionality impeding other use-cases.
I am continuing to move away from the original monadic IO system. The latest change is the file-input mechanism: the function that used to be io.read_file is now file.read_bytes. I want it to be clear that the result of this function is a byte buffer, not a string. The buffer object implements the sequence interface, so if I just called it file.read an unobservant ASCII-using programmer might be able to get disturbingly far along without noticing that what they’d read was not actually text decoded from its byte form, but merely a string of bytes. By naming the function read_bytes I hope to plant a seed of puzzlement which will lead the programmer to its eventual sibling, read_string, which will require you to specify the encoding of the text file you are reading.
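The distinction is easy to see in a language that already separates the two. The following Python sketch is only an analogy: read_bytes and read_string here are local helper functions named after the Radian ones, not Radian's API, and the sample file is created in a temporary directory so that the example is self-contained.

```python
import os
import tempfile

def read_bytes(path):
    """Return the raw contents of the file as a byte buffer."""
    with open(path, "rb") as f:
        return f.read()

def read_string(path, encoding):
    """Return the contents decoded as text; the caller names the encoding."""
    return read_bytes(path).decode(encoding)

text = "na\u00efve caf\u00e9"  # ten characters, two of them non-ASCII

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))
    raw = read_bytes(path)
    decoded = read_string(path, "utf-8")

# The buffer is longer than the text because each accented letter
# occupies two bytes in UTF-8, and its elements are integers.
print(len(raw), len(decoded), raw[0])
```

With pure ASCII input the two lengths would agree, which is exactly how the unobservant programmer gets so far before noticing.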
Another change is the elimination of the filespec object. I’d intended to use an abstract mechanism for describing a file, but it’s ultimately nothing but a thin wrapper around a path string. Since every platform I care about uses path strings to identify files, I’ve decided to drop the wrapper. Perhaps there will eventually be a module in the library which implements platform-localized transformations on path strings.
Posted: September 9th, 2012 | Author: Mars | Filed under: Progress | Comments Off
A new version of Radian is available for download. Changes since 0.4.0:
- --dump switch now supports llvm option, producing LLVM IR as output.
- IO system rewritten to use asynchronous tasks. IO methods no longer mutate the implicit IO object, but simply return asynchronous task objects which you can then sync to execute. It is no longer necessary to pass in a separate callback expression; the program will continue when task execution completes.
- Former IO object methods load_external, describe_function, and call have been moved to the FFI (“foreign function interface”) module.
- Number type predicates have been renamed from number? to is_number, integer? to is_integer, and rational? to is_rational.
- All type? functions in the standard library have been renamed to type.
- Question marks are no longer allowed as identifier and symbol suffixes. The category “suffix character” no longer exists. An identifier may begin with any character in the Unicode category XID_Start, or an underscore, and may continue with any number of characters in the Unicode category XID_Continue.
- Methods of built-in objects check the number of incoming arguments and report an exception when there are too many or too few. Previous behavior was undefined.
- List member indexed lookup no longer dies with strange “member not found” exception after the list grows larger than 8 items.
- A list, once reversed, can now concatenate another list without throwing an “unimplemented” exception.
- Number module now offers a range_with_step function, accepting parameters min, max, and step. Like the normal range function, this counts from min to max. If step is positive, it continues while current <= max; if step is negative, the sequence continues while current >= max.
- sync operator no longer needs to be the root of its expression: you can now use the result of the sync in a compound expression involving other values, other function calls, and even other syncs. Expressions are processed in deepest-to-shallowest, left-to-right order, and syncs are currently the only expression operator which can cause an observable side-effect.
- No longer fails to include line number and position when reporting errors with parameter definitions.
- Functions inside a module no longer refer to the module as self; instead they refer to it using the module’s name, derived from its file name, just as other files which import that module would do.
- set object in the library no longer returns an exception when you try to add an element: that is, the set object will now actually work as a set container.
- No longer accepts linebreak characters inside a string literal: that is now an error, as it should have been all along.
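The termination rule given for range_with_step in the list above can be sketched as a Python generator. This illustrates the stated rule and is not Radian's implementation. The parameter names are spelled out to avoid shadowing Python built-ins, and rejecting a zero step is an assumption, since the release notes do not say what happens in that case.

```python
def range_with_step(minimum, maximum, step):
    """Count from minimum toward maximum, inclusive, in increments of step."""
    if step == 0:
        # Assumption: a zero step would never terminate, so reject it.
        raise ValueError("step must be non-zero")
    current = minimum
    # Positive step: continue while current <= maximum.
    # Negative step: continue while current >= maximum.
    while (current <= maximum) if step > 0 else (current >= maximum):
        yield current
        current += step

print(list(range_with_step(1, 10, 3)))
print(list(range_with_step(10, 1, -4)))
```

Note that, following the rule as written, the upper bound is included when the step lands on it exactly.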