It Lives


At last, I have “finished” the Weave interpreter.

I spent the last few evenings playing with the fruits of my labor, and I think it’s pretty damn cool - even if I do say so myself.

Check this out!

A one-line conversion from CSV to JSON:

read("input.csv", :csv) |> ^(rows) { write("output.json", rows, :json) }

First-class lambdas and pipe operations make for easy function composition.

We can put generically useful logic inside a reusable module instead of reinventing it for each script.

So given some SQL-ish functions coming from a module, a Weave script can do real work with very straightforward syntax:

# Imports look like Python's syntax.
from sqlish import select, where, sum_by


# Column names are long, let's pull them out of the way.
cols = ["First Name", "Last Name", "State", "Amount"]
group_cols = cols - ["Amount"]
sum_col = ["Amount"]

# A little lambda to check if a given row's data indicates it's a WA customer
in_washington = ^(row) { row["State"] == "WA" }

# And a nested lambda to support writing our output
to_json_file = ^(f) { ^(data) { write(f, data, :json) } }

# Data Pipeline
customers = read("customer_purchases.csv", :csv) 
top_customers = customers |> select(cols) |> where(in_washington) |> sum_by(group_cols, sum_col, :total) |> sort(:total) |> take(100)

# Now we can do whatever with this set
top_customers |> to_json_file("top_customers.json")

Reflection

The goal of Weave from the beginning was to create a language that made it trivial to work with data files - making it simple to compose functions into a pipeline.

What Worked Out

By and large, I feel like this project was a successful experiment.

As I had originally intended, the Weave syntax makes it trivial to define data pipelines to work with common file formats.

Expressive Nature

In the sample above, I used Weave itself to implement the SQL-ish API for working with Containers. While the implementation of sum_by won’t win any code-beauty contests, it only took around 15 lines of code to express, including two internal lambda functions. Beyond that, however, the actual data pipeline itself was simple to express. In an equivalent - say, Python - program, just the “Data Pipeline” section would be - if not necessarily unclear - at least significantly more verbose.

Lambdas

I really like the terse syntax I landed on for Weave’s lambdas - the single ^ character makes it feel very natural to use them, even nest them, as needed.

As a result, even nested, curry-style constructions like to_x = ^(fmt) { ^(f) { ^(data) { write(f, data, fmt) } } } end up feeling like a natural use of the language.
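For comparison, the same curry-style nesting translates fairly directly to Rust closures. This is a hypothetical illustration only - boxed closures standing in for Weave’s first-class lambdas, not code from the interpreter:

```rust
// Hypothetical Rust analogue of the curried `to_x` lambda above.
// Instead of writing the file, it returns the call it would make,
// so the nesting itself is the point of the example.
fn to_x(fmt: &'static str) -> impl Fn(&'static str) -> Box<dyn Fn(&'static str) -> String> {
    // Each layer captures one argument and returns the next layer.
    move |f| Box::new(move |data| format!("write({}, {}, {})", f, data, fmt))
}

fn main() {
    let to_json = to_x(":json");          // partially applied: format chosen
    let to_json_file = to_json("out.json"); // partially applied: file chosen
    println!("{}", to_json_file("rows"));   // prints: write(out.json, rows, :json)
}
```

The ergonomics are noticeably worse than Weave’s ^-nesting - return types and boxing get in the way - which is part of why a terse lambda syntax felt worth building.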

What Could Be Better

Of course, as my first language, I learned a lot over the course of this project. If nothing else, this has given me an all-new appreciation for the languages I admire. After this, even Ruby’s performance strikes me as miraculous!

Nevertheless, were I to start again, here are some of the things I would try to improve:

Performance

I prioritized being able to understand what I was doing over all else - and that included some design decisions which had a major impact on performance. For instance, the VM kernel is stack-based instead of register-based. That is much easier to implement and leads to a less complicated instruction set - but it’s also slower for many operations. Consider the &> reduce operation. In a register-based VM, I could keep the accumulator in one register while filling others with the next n input values, then apply the function to each input before loading the next set. In the current stack-based VM, however, I have to pop the accumulator, the index, and the collection from the stack; use the index to locate the next value in the collection; push the collection, the incremented index, the next item, and the accumulator back onto the stack; then call the function (which pops its operands, operates, and pushes the updated accumulator) - and repeat the cycle until at last the index tells me there are no more items in the collection to apply.

All of the overhead in this process for a single element of a potentially large collection is simply slower than a register-based approach would be.
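The cycle above can be sketched in miniature. The following Rust toy model is illustrative only - a two-variant value enum and a Vec standing in for Weave’s real value stack - but it walks through the same pop/push choreography:

```rust
// Toy model of the stack-based `&>` reduce cycle; not Weave's actual VM code.
enum Value {
    Int(i64),
    List(Vec<i64>),
}

fn reduce(items: Vec<i64>, init: i64, f: fn(i64, i64) -> i64) -> i64 {
    // Initial stack layout: collection, index, accumulator.
    let mut stack = vec![Value::List(items), Value::Int(0), Value::Int(init)];
    loop {
        // Pop the accumulator, the index, and the collection.
        let Value::Int(acc) = stack.pop().unwrap() else { unreachable!() };
        let Value::Int(idx) = stack.pop().unwrap() else { unreachable!() };
        let Value::List(items) = stack.pop().unwrap() else { unreachable!() };

        // The index tells us when there are no more items to apply.
        if idx as usize >= items.len() {
            return acc;
        }
        let item = items[idx as usize];

        // Push the collection, the incremented index, the next item,
        // and the accumulator back onto the stack.
        stack.push(Value::List(items));
        stack.push(Value::Int(idx + 1));
        stack.push(Value::Int(item));
        stack.push(Value::Int(acc));

        // "Call" the function: pop both operands, push the new accumulator,
        // then the whole cycle repeats.
        let Value::Int(acc) = stack.pop().unwrap() else { unreachable!() };
        let Value::Int(item) = stack.pop().unwrap() else { unreachable!() };
        stack.push(Value::Int(f(acc, item)));
    }
}

fn main() {
    println!("{}", reduce(vec![1, 2, 3, 4], 0, |acc, x| acc + x)); // prints 10
}
```

Even in this toy, one element costs six pops and five pushes; a register-based loop would touch the accumulator register once per element.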

…Not to mention a million other improvements or cleverer designs which could be applied to the system.

Memory Management

(Hoo boy, this was a whole can of worms)

Early on, I tried to be as memory-efficient as possible in the Weave interpreter - borrows only, or copy-on-write - avoiding .clone like the plague! …Until I started fighting with the compiler over lifetimes. While Rust’s memory management system is significantly safer than C-style manual management, there are times when the compiler cannot automatically determine whether a borrowed piece of memory will live long enough (e.g. passing a borrow into a closure - will the reference held by the closure outlive the variable’s original scope?). At that point, it is left up to the programmer to explicitly annotate their variables’ lifetimes so that the compiler can assert memory safety. Those annotations rapidly became an infection, spreading <'a> throughout the codebase, all to avoid adding a handful of .clone()s.

So I eventually gave up. Rather than trying to manually prove out every memory access was safe and lifetimes were properly aligned, I caved to the siren call of cloning values where convenient.
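A minimal, hypothetical example of the trade-off (not code from the interpreter): returning a closure that captures a borrow forces a lifetime annotation onto the signature, while cloning makes the annotation disappear at the cost of an allocation.

```rust
// Borrow-only version: the closure captures `&str`, so the lifetime 'a
// must be threaded through the signature to promise the borrow lives
// at least as long as the returned closure does.
fn greeter_borrowed<'a>(name: &'a str) -> impl Fn() -> String + 'a {
    move || format!("hello, {}", name)
}

// Clone version: copy the data into the closure and the lifetime vanishes -
// at the cost of an extra String allocation per call to the constructor.
fn greeter_cloned(name: &str) -> impl Fn() -> String {
    let owned = name.to_string(); // the ".clone()" escape hatch
    move || format!("hello, {}", owned)
}

fn main() {
    let name = String::from("weave");
    println!("{}", greeter_borrowed(&name)()); // prints: hello, weave
    println!("{}", greeter_cloned(&name)());   // prints: hello, weave
}
```

In a codebase full of values threaded through closures and call frames, that one 'a multiplies quickly - which is exactly the infection described above.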

As a result, Weave spends more time than it strictly needs to on shuffling bits from one place to another and keeping multiple copies of the same bits around for no true benefit.

Error Handling

Strictly speaking, this was out of scope from the start - good error handling is a complicated thing to include in a compiler or interpreter, as tracking source code through bytecode to what-the-VM-actually-does is both non-trivial and computationally expensive.

Yet now that I’m trying to write even trivial Weave programs, features like stack traces - or simply better “compilation error” messages - would be very helpful.

Streams / Coroutines

File this under things-that-would-make-it-better rather than a true mistake, but it rapidly became apparent that Weave would be well served by options for processing data in smaller chunks.

As it stands, loading a large data file into memory and then funnelling it through a pipeline can leave intermediate copies of the data on the heap until processing completes, which contributes to both Weave’s performance and memory-utilization issues.

Providing language-level support for things like Streams, and/or Coroutine support for reusing call frames, could improve the efficiency of processing large data files.
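In Rust terms, the difference might look like the sketch below: a lazy, line-by-line pipeline in which each element is dropped before the next is read, so no intermediate collection is ever materialized. This is illustrative only - `sum_odd_lines` is a made-up stand-in, not a proposed Weave API.

```rust
use std::io::{BufRead, Cursor};

// Stream a "file" one line at a time through a pipeline. Each line is
// read, parsed, filtered, and folded into the sum - then dropped -
// before the next line is pulled, so memory use stays constant.
fn sum_odd_lines<R: BufRead>(reader: R) -> i64 {
    reader
        .lines()
        .map(|line| line.unwrap().parse::<i64>().unwrap())
        .filter(|n| n % 2 == 1) // keep odd values only
        .sum()
}

fn main() {
    // Cursor stands in for a real File handle.
    let data = "1\n2\n3\n4\n5\n";
    println!("{}", sum_odd_lines(Cursor::new(data))); // prints 9 (1 + 3 + 5)
}
```

A Weave pipeline built on streams could behave the same way: each |> stage pulls one element at a time instead of receiving the whole collection at once.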

What’s Next?

Oh man, I dunno. I’m having fun playing with Weave. If I end up actually using the language for Real Work (tm) maybe I’ll invest some time in improvements to some of these issues.

If I don’t, well, I’m still pretty happy with it. Last year I set out to learn Lisp and ended up writing a language of my own instead. As I write this from January 2026 - well, maybe this will turn out to be the year I finally, truly learn Lisp.
