By Zed A. Shaw

BBS, Lua, Coroutines, Mongrel2: Part 1

UPDATE: Got the curl command to get to the BBS. Should work now if you use curl -o client.py http://dpaste.de/xtDj/raw/ && python client.py mongrel2.org 80

I was busy hacking on v1.4 of Mongrel2 and seriously got bored. I can't live on C alone, and rather than fix various performance problems in the code (screw you getaddrinfo!) I decided to write a BBS. I wanted to do the following things in the least amount of code:

  • I wanted to re-learn Lua since it seemed to have improved over the years since I last used it in 2004.
  • I had this crazy idea that I could use Lua's really great coroutines to do a kind of Seaside style web application.
  • I wanted to stress out the new JSON/XML routing protocol I just built.
  • I wanted to do it without a web browser.

Impossible you say?! Read on for how easy it actually was as I explain the code, my missteps, and the next phase of the project.

BBSes For Youngsters

BBS stands for Bulletin Board System. For you young kids out there who don't know what that is, let me explain to you how awesome it was.

When I was a kid there wasn't the internet. What we did was even better. We'd dial other people's computers with a phone built into our computer called a "modem" and then we'd get a terminal connection to that computer. We could then work a menu that was displayed entirely with text, ASCII art, and ANSI color sequences to do amazing things. Like, leave messages for people, read messages people left, send Fidonet mail, play door games, and chat. It was great.

Yes, you dialed one computer. It was kind of like how you go to sheddingbikes.com to read what I post, but you had to disconnect your internet connection and then redial my home phone number to get to it first. Each BBS was its own tiny little world, with its own people and friends. It was great stuff, and lots of fun.

The BBS was driven by a hacker culture of making awesome software with very little. Hackers built the software, most of the hardware, and figured out how to do amazing things. During the period BBSes existed (1978-1994 or so) there were about 150,000 created. The BBS laid the groundwork for the commercial internet to exist because when it was finally released to the public, everyone had phone lines and modems, and was ready to dial. Just hand them some cool new software and they could be "online" instead of tied to just one computer.

In 1992 or thereabouts, that all changed when the commercial internet arrived and then everyone just stopped dialing BBSes. It's kind of a weird feeling actually trying to explain what happened. Here I was a member of this community. Frequenting different boards and playing games, and then the second I got internet access I just left. Quit dialing the BBSes where my friends hung out. I was just completely swept over into this whole new land like teleportation or something. A few years later I tried to actually dial a few BBSes to see what was going on and they were all gone.

I'll always remember my BBS days though because it got me my first "borrowed" copies of compilers for C and Assembler and sample code I could learn from. Without the BBS I probably would have just tinkered with computers for a year and then gone on to do something else entirely different.

However, describing a BBS like this still doesn't show you what I mean, so why don't you try the one I built out first. Run this:

curl -o client.py http://dpaste.de/xtDj/raw/ && python client.py mongrel2.org 80

Make sure you look at http://dpaste.de/xtDj/ first to see that nobody's ganked it. Shouldn't be able to but you never know.

When you connect you should see something like this:

Connecting to localhost:6767
 __  __ ____  ____  ____ ____  
|  \/  |___ \| __ )| __ ) ___| 
| |\/| | __) |  _ \|  _ \___ \ 
| |  | |/ __/| |_) | |_) |__) |
|_|  |_|_____|____/|____/____/ 
Welcome to the Mongrel2 BBS.
What's your name?
> zed
What's your password?
> XXXXXX
MOTD: There's not much going on here currently, and we're mostly just trying out
this whole Lua with Mongrel2 thing. If you like it then help out by leaving a 
message and trying to break it. -- Zed
Do you want to continue? (Y/n)
> y
---(((MAIN MENU))---
1. Leave a message.
2. Read messages left.
3. Send to someone else.
4. Read messages to you.
Q. Quit the BBS.
> Q
Alright, see ya later.

Yep, that's what a BBS looked like. They got fancier but it was all simple text, prompts, and menus. But with just that they invented a huge amount of cool stuff that was fun and interesting. I think because it was so hard to make money from a BBS the only thing you could do was have fun with it.

In that spirit I decided to try and make one in a little bit of code using Mongrel2.

Using Lua's Coroutines

Here's a problem with this kind of application: the messages are all async, but the conversation they carry streams by and has state. The user connects, and using just that connection you have them navigate through menus and prompts so they can do stuff. People don't really code this way anymore because most of the web is async without state, and then they bolt state onto request/response.

To do a BBS you'd need some kind of way to maintain the state of a particular user. Now Mongrel2 uses the socket connection with the user and establishes an id. That's the first step, and it looks like this:

*-----*
|M2BBS| <-- 1276 <-- [m2] <-- (Zed)
*-----*
       \____ 45 ____/^   \___ (Frank)

In this totally awesome BBS inspired diagram you probably can't read, I've got Zed and Frank talking to Mongrel2 (m2) and they have an actual TCP/IP connection. Mongrel2 then translates that to 0MQ messages with connection IDs. That way the M2BBS backend can know who to talk to when it sends replies.

The magic here is that M2BBS works asynchronously and can send as many or as few messages as it wants. It's not strict request/reply, and so with that we can make our own protocol that's more like a conversation. We need a way to model this though so that each person who's connected thinks they're getting an actual connection when really it's just a stream of tagged 0MQ messages.

Enter Lua's coroutines, which are really the only full and complete coroutines. What a coroutine gives you is a simple little micro thread that you communicate with like a pipe. It's a little mind bending at first, but think about it this way.

You know how in event loop style systems you have to do something like this:

  1. Receive an event from the event gods.
  2. Do something with the event, then reply with some value and the next function the event gods should call.
  3. Do this, and hope that when the gods call your function it's with the right stuff.

Coroutines let you do this:

  1. Yield to the event gods when you're ready for an event. Your function pauses.
  2. Event gods tuck you away, and when an event is ready, they resume you and pass in the event on the resume.
  3. You wake up from the yield, deal with the event without leaving the function, and then yield again.

Using coroutines basically gets rid of the "Event System Spaghetti Code" you see everywhere because they can just pause for an event in the normal logic.
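To make the yield/resume steps above concrete, here's a rough sketch using a Python generator, which is a close cousin of a Lua coroutine rather than the same thing. The names conversation and drive are made up for this example, and send() is playing the role of coroutine.resume().

```python
def conversation():
    """A handler that pauses at each yield until somebody resumes it."""
    log = []
    name = yield "What's your name?"      # pause; resumed with the next event
    log.append("name=" + name)
    color = yield "Favorite color?"       # pause again; local state survives
    log.append("color=" + color)
    return log                            # finishing raises StopIteration


def drive(handler, events):
    """Stand-in for the 'event gods': resume the handler once per event."""
    prompts = [next(handler)]             # run up to the first yield
    for event in events:
        try:
            prompts.append(handler.send(event))   # resume with the event
        except StopIteration as finished:
            return prompts, finished.value        # the handler returned
    return prompts, None


prompts, result = drive(conversation(), ["zed", "green"])
```

Notice the handler reads top to bottom like normal procedural code, and the only thing the loop has to know is how to resume it.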

How It Looks

To give you a simple taste of this, here's a small cut from the M2BBS code:

local function m2bbs(conn, req)
    req = ui.prompt(conn, req, 'welcome')
    local user = req.data.msg
    req = ui.prompt(conn, req, 'password')
    local password = req.data.msg
    if not db.user_exists(user) then
        new_user(conn, req, user, password)
    elseif not db.auth_user(user, password) then
        ui.exit(conn, req, 'bad_pass')
        return
    end
    req = ui.prompt(conn, req, 'motd')
    if req.data.msg == "n" then
        ui.exit(conn, req, 'bye')
        return
    else
        repeat
            req = ui.prompt(conn, req, 'menu')
            local selection = MAINMENU[req.data.msg]
            print("message:", req.data.msg)
            if selection then
                selection(conn, req, user)
            else
                ui.screen(conn, req, 'menu_error')
            end
        until req.data.msg == "Q"
    end
end

That's Lua code, and you can see where it prompts you for a username/password, asks if you want to continue, and then goes into a loop over the MAINMENU. Yet, you don't see any coroutines here, do you?

The coroutine is inside those ui.prompt() function calls, which look like this:

function ask(conn, request, data, pchar)
    conn:reply_json(request, {
            type = 'prompt',
            msg = data, pchar = pchar})
    return coroutine.yield()
end
function prompt(conn, request, name)
    return ask(conn, request, SCREENS[name], '> ')
end

See the return coroutine.yield() there? That's the magic that lets the above m2bbs function run, pause, run, pause, run until it exits. What happens is the following:

  1. The engine fires up the m2bbs function as a coroutine.
  2. m2bbs runs and uses ui.prompt() to get the username from whoever is connected.
  3. ui.prompt() calls ui.ask() to do the real work.
  4. ui.ask() sends a reply to whoever is connected, which our client displays, and then yields.
  5. Our engine is then given control, and it takes this suspended coroutine and tucks it away in a table (Lua's name for dict/hash...sorta).
  6. The user, back over the internet, types in "joe", and that triggers a message to hit the engine.
  7. The engine pulls up this suspended coroutine, and calls coroutine.resume() on it with the request that came in off the internet from joe.
  8. Finally, the return coroutine.yield() wakes up and returns this request that the engine has piped to us through the coroutine.resume() and the m2bbs function goes on to the next step.
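Here's roughly what that same prompt-then-pause flow could look like as a Python sketch. This is not the M2BBS code: the SCREENS contents, the outbox list standing in for conn:reply_json(), and the login function are all assumptions for illustration. One real difference worth knowing is that a Python generator can only pause its own frame, so the helpers need yield from, while Lua's coroutine.yield() can pause from any call depth.

```python
# Hypothetical screen text; the real SCREENS table isn't shown in this post.
SCREENS = {"welcome": "What's your name?", "password": "What's your password?"}


def ask(outbox, text, pchar):
    """Queue a prompt for the client, then pause until the engine resumes us."""
    outbox.append({"type": "prompt", "msg": text, "pchar": pchar})
    request = yield                       # like `return coroutine.yield()`
    return request


def prompt(outbox, name):
    # yield from passes the pause in ask() up through this frame.
    return (yield from ask(outbox, SCREENS[name], "> "))


def login(outbox):
    user = (yield from prompt(outbox, "welcome"))["msg"]
    password = (yield from prompt(outbox, "password"))["msg"]
    outbox.append({"type": "exit", "msg": "hello " + user})
    return user, password


outbox = []
session = login(outbox)
next(session)                             # steps 1-4: run to the first prompt
session.send({"msg": "joe"})              # steps 6-8: resume with joe's request
try:
    session.send({"msg": "secret"})
except StopIteration as done:
    credentials = done.value              # login() ran to completion
```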

The end result of all this though is that once you get it working you stop worrying about it. You know that ui.prompt() is going to cause your coroutine to pause until a new message comes in, and then you'll continue. That's how you get a conversation going and deal with asynchronous messages.

The Engine

Alright so what's this engine look like?

local STATE = {}
function run(conn, engine)
    while true do
        local good, request = pcall(conn.recv_json, conn)
        if good then
            local data = request.data
            local done = false
            local eng = STATE[request.conn_id]
            local ok, err  -- results of coroutine.resume()
            if data.type == 'disconnect' then
                print("disconnect", request.conn_id)
                done = true
            elseif data.type == 'msg' then
                if eng then
                    print "RESUME!"
                    ok, err = coroutine.resume(eng, request)
                else
                    print "CREATE"
                    eng = coroutine.create(engine)
                    ok, err = coroutine.resume(eng, conn, request)
                end
                done = coroutine.status(eng) == "dead"
                print("status", coroutine.status(eng))
                if not ok then
                    print("ERROR", err)
                end
            else
                print("invalid message.")
            end
            print("done", done, "eng", eng)
            if done then
                if data.type ~= 'disconnect' then
                    ui.exit(conn, request, 'error')
                end
                STATE[request.conn_id] = nil
            else
                STATE[request.conn_id] = eng
            end
        end
    end
end

That is the entire engine code that's powering the M2BBS off Mongrel2. Now remember that I was learning Lua again while I wrote this, so there are some problems with it, but that's the general idea.

What I'm doing here is actually fairly straightforward once you know about the coroutines:

  1. Receive a JSON message from Mongrel2.
  2. Get the data and look for a coroutine in STATE by the request.conn_id.
  3. If there is one then coroutine.resume() it.
  4. If not then coroutine.create() it and then resume it.
  5. Handle any errors and clean up any coroutines that die.

This engine does have an obvious flaw in that it doesn't find coroutines that have been sitting suspended for too long and kill them, so it sort of leaks them. That will have to be added. There's also a serious flaw in this kind of architecture I'll get into later.
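One way that leak might get plugged is to remember when each connection was last heard from and sweep out the stale ones. Here's a minimal Python sketch of that idea using generators in place of Lua coroutines. The Engine class, the IDLE_LIMIT value, and echo_session are all hypothetical, and nothing here is part of M2BBS.

```python
import time

IDLE_LIMIT = 300.0   # seconds; an arbitrary number picked for this sketch


class Engine:
    def __init__(self, handler_factory, clock=time.monotonic):
        self.handler_factory = handler_factory
        self.clock = clock
        self.state = {}                   # conn_id -> (generator, last_seen)

    def dispatch(self, conn_id, request):
        """Resume the session for conn_id, creating it on first contact."""
        entry = self.state.get(conn_id)
        try:
            if entry is None:
                session = self.handler_factory(conn_id, request)
                next(session)             # run to the first yield
            else:
                session = entry[0]
                session.send(request)     # resume with the new request
        except StopIteration:
            self.state.pop(conn_id, None) # session finished, forget it
            return
        self.state[conn_id] = (session, self.clock())

    def reap_idle(self):
        """Close and drop sessions that have been suspended too long."""
        now = self.clock()
        stale = [cid for cid, (_, seen) in self.state.items()
                 if now - seen > IDLE_LIMIT]
        for cid in stale:
            session, _ = self.state.pop(cid)
            session.close()               # raises GeneratorExit at the yield
        return stale


def echo_session(conn_id, first_request):
    """Toy session that just collects whatever it is resumed with."""
    seen = [first_request]
    while True:
        seen.append((yield))
```

You'd call reap_idle() every so often from the main loop. Passing the clock in is only there so the sweep can be exercised without actually waiting five minutes.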

The Client

Once you have this engine and the coroutines you're running, and you've got your screens worked out, you need some kind of client that talks to this thing. You could point your browser at it, but this is a BBS. We need old school terminal window action, not some damn fancy graphics and HTML5 bullshit.

What you ran at the top of this blog post is just that, a simple BBS client that uses the native JSON/XML sockets protocol Mongrel2 uses (accessible from jssockets and flash) but it just does it itself rather than using all that gear. Here's the entire simple little client:

#!/usr/bin/env python
import sys
import socket
from base64 import b64decode
try:
    import json
except ImportError:
    import simplejson as json
import getpass
host = sys.argv[1]
port = int(sys.argv[2])
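def read_msg():
    reply = ""
    ch = CONN.recv(1)
    while ch != '\0':
        if not ch:
            # recv() returns '' once the server closes the connection
            raise EOFError("connection closed")
        reply += ch
        ch = CONN.recv(1)
    return json.loads(b64decode(reply))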
def post_msg(data):
    msg = '@bbs %s\x00' % (json.dumps({'type': 'msg', 'msg': data}))
    CONN.send(msg)
print "Connecting to %s:%d" % (host, port)
CONN = socket.socket()
CONN.connect((host, port))
USER = getpass.getuser()
post_msg("connect")
while True:
    try:
        reply = read_msg()
        if 'msg' in reply and reply['msg']:
            print reply['msg']
        if reply['type'] == "prompt":
            msg = raw_input(reply['pchar'])
            post_msg(msg)
        if reply['type'] == 'exit':
            sys.exit(0)
    except EOFError:
        print "\nBye."
        break

This is just a classic quick little hack to have something to work with, but it already shows you how to connect directly to a Mongrel2 server and get at the JSON message routing directly. Here's how it works:

  1. It makes a normal socket connection (not 0MQ) to the server.
  2. It needs a small read_msg() function that decodes the base64 encoded responses and converts the json. Don't ask about this base64 crap, it's something about flash or jssockets or whatever cluster of stupid is involved. You can turn it off if you want raw sockets with an option.
  3. Then just loop doing read and send operations. In our client we use message type fields to indicate if this message is a thing to be displayed ('msg') or a prompt asking for something. That way we control the client so it doesn't block when it shouldn't.
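If you want to poke at the wire format on its own, here's a small Python 3 sketch of the framing as the client above uses it: requests go out as a route, a space, a JSON body, and a NUL byte, and replies come back as base64 encoded JSON followed by a NUL. The function names are mine, and encode_reply is only my guess at the server side so the round trip can be checked. It's inferred from what the client decodes, not taken from Mongrel2.

```python
import json
from base64 import b64decode, b64encode


def encode_request(route, text):
    """Build one outgoing frame: route, space, JSON body, NUL terminator."""
    body = json.dumps({"type": "msg", "msg": text})
    return ("%s %s" % (route, body)).encode("utf-8") + b"\x00"


def encode_reply(reply):
    """Assumed server side: base64 of the JSON, then a NUL terminator."""
    return b64encode(json.dumps(reply).encode("utf-8")) + b"\x00"


def decode_replies(buffer):
    """Split a receive buffer into decoded replies plus leftover bytes."""
    *frames, rest = buffer.split(b"\x00")
    return [json.loads(b64decode(frame)) for frame in frames], rest
```

Splitting a buffer on the NUL byte like this would also let a client read in bigger chunks instead of one byte per recv() call, with the leftover bytes carried over to the next read.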

And with that you have your little BBS working. It's actually running (as you hopefully tried it) and while it doesn't do much, it is a fairly small amount of code to implement it.

The Point Of This

Apart from just making something fun, it tests out an idea for a natural way to handle async events using coroutines rather than event callbacks. Callback based event systems always suffer from problems merging events and spaghetti code. If you've ever worked on a large Twisted, libevent, or similar codebase you know it gets gnarly as time goes on.

My typical solution to this problem is finite state machines since they are designed to normalize random events into a finite set of cleanly possible states. The problem with FSMs is that they are scary and many programmers don't get them.

The next thing I use is coroutines, because they give you back your natural procedural style of programming but still give you the speed of async event based processing. We do this inside Mongrel2 and it works great, reducing code size and complexity quite a lot.

Writing this little BBS helped me work out the gear that could be a nice little coroutine based framework for Mongrel2 backends, but there's one gigantic very obvious flaw with all coroutine systems like this:

Sharing Coroutine State Sucks

Let's say I'm running my little BBS and it becomes popular because I've managed to bring back the BBS. The internet collapses and Facebook is just a memory. Now I have one of those good problems to have and I need to scale my single Lua process up to 1000 processes.

We now have a huge problem because these little coroutines get suspended and live only inside one process. As far as I know it's very hard to share a coroutine or move it between processes. This means you've got problems like this:

  1. If a user connects they have to be constantly routed to the same backend so they keep resuming the same coroutine.
  2. Taking down a process means actually taking down users.
  3. Upgrading code means booting people off because you can't change function contents and keep the coroutine going. You have to restart it.
  4. It becomes hard to manage because you have to know about who's connected where and what they're doing, and many times there isn't enough status to do that.
  5. If you're being slammed you can't just add machines because all your currently overloaded users are "stuck" on your slammed machines. New machines only adopt new users.
  6. Coroutines usually can't be migrated between processes, and definitely not if you have different architectures.

Everyone who does a coroutine based system runs into these problems, and usually has odd solutions to them. They invent cookies galore, have complex upgrade processes, run specialized web servers to do complex routing for both of those, and end up with systems that just generally don't feel very "web". I'm actually not sure how to solve these but it could be interesting to try it out anyway.

When you have event based systems that use callbacks, it's much easier to store the name of the callback to run next, and then make all your callbacks stateless. When a request comes in, any backend can simply look up the callback to run next and run it. There's no need to resume anything like a coroutine. FSMs are the same way since their state is usually an integer or a name, so it doesn't matter where the FSM lives.
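To show what "store the name of the callback to run next" might look like, here's a small Python sketch of the same login conversation done statelessly. Everything in it is hypothetical: SESSIONS stands in for whatever shared store every backend can reach, and the step names are invented.

```python
# Stand-in for a database or cache shared by all backend processes.
SESSIONS = {}   # conn_id -> {"step": name of the next handler, plus data}


def step_name(session, text):
    session["user"] = text
    return "password", "What's your password?"


def step_password(session, text):
    session["password"] = text
    return "done", "Welcome, " + session["user"]


STEPS = {"name": step_name, "password": step_password}


def handle(conn_id, text):
    """Any backend can run this: nothing lives in a suspended stack."""
    session = SESSIONS.get(conn_id)
    if session is None:                   # first contact, start the flow
        SESSIONS[conn_id] = {"step": "name"}
        return "What's your name?"
    handler = STEPS.get(session["step"])
    if handler is None:                   # conversation already finished
        return "Nothing more to do."
    session["step"], reply = handler(session, text)
    return reply
```

The price is visible right away: the flow is chopped into separate functions and the "what happens next" logic lives in return values instead of reading top to bottom.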

One Solution, The Busy Signal

In the case of the BBS we can use the busy signal and a list of phone numbers. An idea I had was that each backend gets an entry in the routing table for Mongrel2 that's something like this:

@bbs.10
@bbs.11
@bbs.12
...

Clients could then hit a directory URL to get the current phone numbers, and try them at random until they hit one that's available. The backends then keep track of the number of connected users and set a hard limit. Again, much like a BBS used to do when it only had a limited number of user connections at a time, so you'd get a busy signal.
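As a back-of-the-napkin sketch of the busy signal idea in Python, with everything simulated in one process: Backend, dial, and the limit values are all made up here, and a real version would go through Mongrel2's routing and a directory URL rather than a list of objects.

```python
import random


class Backend:
    """One 'phone line': accepts callers until it hits its hard limit."""

    def __init__(self, route, limit):
        self.route = route
        self.limit = limit
        self.users = set()

    def connect(self, user):
        if len(self.users) >= self.limit:
            return False                  # busy signal
        self.users.add(user)
        return True


def dial(directory, user, rng=random):
    """Try the listed backends in random order; return the route that answered."""
    for backend in rng.sample(directory, len(directory)):
        if backend.connect(user):
            return backend.route
    return None                           # every line was busy


directory = [Backend("@bbs.10", 1), Backend("@bbs.11", 1)]
first = dial(directory, "zed")
second = dial(directory, "frank")
third = dial(directory, "joe")            # both lines are full by now
```

Once a caller lands on a route they stay on it, which is the point: the coroutine never has to move.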

Phase 2

The next step in this BBS demo is to put it right on the internet with a browser and try out the busy signal solution mentioned above. I'll be crafting a simple single-page HTML and JavaScript application that will look like a very basic modem connection, probably with ugly green on black too. When I'm done I'll blog about the solution and how it worked out.

In the meantime, you can grab the source to the BBS by downloading Mongrel2, untarring it, and looking in examples/bbs. When it's more polished I'll post it in a few more places.