Bongo has joined the Software Freedom Conservancy

September 22nd, 2008 | by | bongo

It has taken a while for the paperwork and stuff to work through, and I’m sure people who followed the discussion on our mailing lists will have almost entirely forgotten about this by now – but today’s big news is that Bongo is now officially a member of the Software Freedom Conservancy. This is strangely exciting – it’s another step forward for our project which re-affirms our commitment to what we’re trying to achieve.

Being part of the Conservancy will make it clearer to everyone that we’re dedicated to a free software stack. Indeed, if we did anything non-free, we would probably get thrown out, since it would jeopardise the tax status of the overall organisation.

We’re also going to be under the same umbrella as some seriously cool free software projects: Samba are a well-known member, but the related OpenChange project (attempting to implement the Microsoft MAPI mail APIs) is also there – and the Conservancy strongly encourages its member projects to interact and help each other.

In terms of what you can expect from the project and how it operates, basically nothing will change: it puts our project on some kind of official legal footing, being part of a US 501(c)(3) organisation, but the project will continue to operate as it has done. Pat Felt and I are the contacts with the SFC itself, but that is really an administrative thing.

It does mean that we can accept tax-deductible donations from Americans now. At some point, I will set up a donations link on the website, which will allow anyone to send some money our way. I’m not under much illusion that we’ll be getting much any time soon, but it will be interesting to see what happens. Having a little money about would actually help a great deal – for example, membership of standards organisations like CalConnect (who run a CalDAV interoperability workshop each year) would cost Bongo a minimum of $2500 annually. Realistically, if we wanted to send developers to a workshop, we’d actually be talking about double that or more. Getting sponsorship for that is well within the realms of possibility, and that’s the kind of thing I see us spending money on.

If you have any other questions about our membership of the Conservancy, please drop me a line either privately or on the lists.

Whither Zimbra?

September 22nd, 2008 | by | bongo

I noticed “boycott novell” made mention of Bongo today – they mistook our M3 release for the announcement of the project – but it’s nice to see another mention in a new place. I’d like to comment slightly on the Yahoo! situation, though.

The boycott site is quite wrong when it talks of “pulling a Hula”: Hula was never sold because it was competitive with Microsoft Exchange. Anyone who was involved with the project, or used Hula (it’s still available in Ubuntu!), would know that’s not the case. But, unintentionally, there is a point there: the fact is, Microsoft could/would be in a position of owning two groupware systems, which is similar to the position Novell found themselves in (actually, not as bad, because Novell originally had three ;) .

If Microsoft do take over Yahoo! (and that doesn’t seem certain to me, yet), they will be in the position of effectively having two products in the same basic market twice over – Exchange and Zimbra, and MSN Mail (or whatever it is – Hotmail?) and Yahoo! Mail. And for the most part, a business selling two products which do the same thing ends up doing not much more than confusing its customers.

My guess: First, Yahoo! Mail will get merged into the Microsoft version, somehow. I’m sure the UIs will be slowly aligned, and then people switched over without realising it.

Zimbra is a tougher call, though. I don’t expect them to kill it. However, I also don’t see them keeping it – if I was to guess, I would say Zimbra would get spun out into a standalone company. I don’t think Microsoft is going to want to piss off a good number of customers, I don’t think it will want to keep Zimbra, and I don’t think those customers want Exchange. So the obvious answer to me is, get rid.

However, in that scenario, it may well end up a bit like Hula – the people involved in the “open source” side of the software may effectively find themselves in a whole new world they didn’t expect themselves to be in. Whenever you have a software project that has a single corporate sponsor (if you will), you are always in that situation, because that type of sponsorship has to be continually justified.

In virtually all ways, this doesn’t affect Bongo: if those people using Zimbra right now have to change (and I’m not at all convinced they’d have to), I’d say they are more likely to jump to Exchange than Bongo. But, I sure feel a lot of compassion for the situation: with two disparate communities (commercial users and open source users, in this case with slightly different products) there is always going to be one side which “loses out” somehow.

Welcome to 2008

September 22nd, 2008 | by | bongo

We all have New Year’s resolutions in some form or another; I haven’t really got mine straight in my head yet but there are a couple of things I’m committing to over the next month or two – one of which is to spend more time regularly hacking Bongo. I had hoped to get some hacking in over Christmas; family events overtook that and as a result I’ve spent virtually no time on Bongo at all. So, my immediate goals are:

  • chase the M3 release relentlessly. We’re really close, and have been for a little while – reaching the finishing line has been hard, but it’s mostly testing now really.
  • get IMAP support going properly. See above. This is me introducing a bug in the MIME code from the sounds of it :(
  • look into synchronisation support. If I can get Bongo syncing with my new mobile, I’m going to be in a serious dogfooding mood.

Border suggested that one of us write up a kind of look back over our first year, as well as a look ahead – Pat did a little bit of that, but I do feel the need to take stock in more measured fashion. One goal I set which we totally haven’t achieved is releasing often, and I think my main goal with Bongo this year is to do that: if I achieve nothing else, making steady, perhaps reliable, but recognisable progress I think is going to be very important to this project.

In other news, my Bongo hacking time today has been spent in a couple of ways. I’ve been getting my builds warmed up again – somehow they always bit-rot and there are things to fix – and I’ve been and renewed the project’s domain name. Somehow there were only four days left on it, ho hum.

Various rag-bag of stuff.

September 22nd, 2008 | by | bongo

There’s a number of things I’ve been meaning to blog about recently. Thanks to Andrew, who has been blogging about stuff, people will hopefully know about the web UI meeting we had recently. While relatively informal, it does feel like everyone wants to move in the same direction: a light-weight reliable UI framework pulling together useful components to provide a compelling app. We’re going to start with “Flasher”, the UI for non-Bongo users, and I started storyboarding some use cases. The key is simplicity as usual, but after a brief look around I realised something – no other free software package is doing this. A few are doing something similar, but not this.

I was also interested to read Ten Ways to make OSS More Humane. There’s a lot of stuff in there, but it’s mostly stuff I agree with. In particular, I was quite happy to see that we do most of the “do” statements – I thought it was particularly apropos given the web UI discussions – and hopefully none of the “do not”. We’ve had a few new faces in IRC recently, and they’ve all stuck around: I still think one of our primary strengths is that we’re a really friendly community, and long may that continue.

Somewhat worried to read that David Bienvenu and Scott MacGregor – the driving forces behind Mozilla Thunderbird – are both leaving the Mozilla Foundation. After all the discussion about setting up a new “MailCo” to drive the product in the same way Firefox gets driven, I was hoping for great things. Now, with the very boiler-plate text on both their blogs and both wishing MailCo all the best, it seems they are not to be involved. I really hope I’m wrong, but I would assume that with MailCo on the table they wouldn’t jump to some other Tbird vehicle – so it sounds like they’ll both be reducing their involvement in the project. That’s the last thing Tbird needs, and I wonder if that does spell the end.

Locking revisited

September 22nd, 2008 | by | bongo

Sorry – another slightly technical post. This is yet another visit to the subject of locking: a while ago I briefly described the problems with it, and then wrote some new code to help alleviate them.

One thing I didn’t explain at all, looking back at those posts, is the different reasons for locking. We can simplify this down to two major reasons:

  1. Internal database consistency. SQL databases are usually “ACID compliant”, and SQLite is no different. This means that every reader and writer gets a consistent view of the database; for example, if you’re writing data to the database, you can’t ever run a query which gets back partially-written data – you either see the data in its complete state, or you see nothing at all.
  2. External store consistency. A bit like internal views, we don’t want clients accessing the store to see “inconsistent” data – we don’t want one client to be writing a mail to a user’s Inbox, and have another client read the half-written data and think it has the whole mail.

Those problems seem very similar, and indeed, our previous issues are mostly to do with the fact that those problems were conflated: the SQLite locking was used to ensure external consistency in the store. That’s actually a really common – and often desirable – way of doing things. The way it works is you wrap all your SQL operations in an SQL transaction, and whether or not your operation (e.g., writing a file) succeeds stands or falls on whether or not the transaction succeeds. The database guarantees that when you use a transaction, you can make many alterations at once but no other database client gets an intermediate view of the partially-written data.

However, at this point, we run into an issue with SQLite. “Proper” SQL servers give you the ability to lock single tables, or even rows within tables, to help ensure that transactions succeed wherever possible. SQLite doesn’t give us that: it basically ensures consistency by locking the whole database. This makes the external consistency above suddenly very expensive to implement at the SQL level.

Now, surprisingly, there is also another set of locks within the store. At the moment, for example, collections are locked at a higher level when you read/write to them, because we also access other data – document contents on disk, the search index, etc. These locks are needed for the external consistency also, and almost make the SQL locks redundant.

So, my work on the store now is mainly to decouple the SQL-level locking from the provision of external consistency. We’ll effectively only use the locks at that level specifically when we’re reading/writing from the database to ensure that the data is written consistently when it matters (for example, allocating IDs) but not when it doesn’t (for example, creating a document and then later adding the correct metadata to it). External consistency will then be provided by the higher-level locks which we already use.
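
To give a rough idea of what a higher-level lock looks like, here is a minimal sketch in C of a per-collection “many readers or one writer” lock. It is an illustration only, not the Store’s actual code: the `CollectionLock` type and the `collection_*` function names are made up, and the sketch is deliberately not thread-safe (a real version would protect the counters with a mutex).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock state for ONE collection (e.g. a user's Inbox).
 * Each collection gets its own lock, so a writer in one collection does
 * not block clients of another, unlike a whole-database lock. */
typedef struct {
    int  readers;   /* number of clients currently reading */
    bool writer;    /* true while a single client is writing */
} CollectionLock;

/* Take a shared (read) lock; fails only while a writer holds the lock. */
bool collection_try_read(CollectionLock *lock)
{
    assert(lock != NULL);
    if (lock->writer) {
        return false;
    }
    lock->readers++;
    return true;
}

/* Take an exclusive (write) lock; fails if anyone else holds the lock. */
bool collection_try_write(CollectionLock *lock)
{
    assert(lock != NULL);
    if (lock->writer || lock->readers > 0) {
        return false;
    }
    lock->writer = true;
    return true;
}

/* Release a read lock taken with collection_try_read(). */
void collection_unlock_read(CollectionLock *lock)
{
    assert(lock != NULL && lock->readers > 0);
    lock->readers--;
}

/* Release a write lock taken with collection_try_write(). */
void collection_unlock_write(CollectionLock *lock)
{
    assert(lock != NULL && lock->writer);
    lock->writer = false;
}
```

With something like this in place, a client reading the Trash is not held up by a mail being delivered to the Inbox, and the SQL transaction only needs to cover the short database write itself.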

More work in the Store

September 22nd, 2008 | by | bongo

Since my last techie post on locks was appreciated and since I’m still crook, here comes another one. Again, making Thunderbird work better with Bongo was the main aim – after my locking changes, I found that we ran into a whole new set of problems: mostly, that the IMAP agent wasn’t responding in a timely manner.

So, my next step was to address the progress reporting with a small new piece of code which sits in the various command loops inside the IMAP agent. Basically, as we spin around doing work, it checks to see when the agent last spoke to the IMAP client, and if it was too long ago it responds again. I also found out that there was some code already in place which did something similar, but it wasn’t working because the socket wasn’t being flushed: in non-techie terms, the responses were waiting on the server because the OS hadn’t sent the data to the client. So, in those cases, the data is now sent, and things are happier.
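
As a rough sketch of the idea (not the actual patch), the check inside a command loop could look something like the C below. The `ClientProgress` struct, the `progress_tick()` name and the “still working” text are all invented for illustration, and a plain `FILE *` stands in for the agent’s real connection object. The important part is the `fflush()`: without it the response can sit in a buffer on the server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical per-connection progress state. */
typedef struct {
    FILE  *out;          /* stream to the IMAP client (stand-in for a socket) */
    time_t last_spoke;   /* when we last sent the client anything */
    double max_silence;  /* seconds of silence we tolerate */
} ClientProgress;

/* Call this on each pass of a long-running command loop. If we have been
 * quiet for too long, send an untagged OK response and flush it so that
 * it actually leaves the server. Returns true if something was sent. */
bool progress_tick(ClientProgress *p, time_t now)
{
    assert(p != NULL && p->out != NULL);
    if (difftime(now, p->last_spoke) < p->max_silence) {
        return false;               /* spoke recently enough */
    }
    fputs("* OK still working\r\n", p->out);
    fflush(p->out);                 /* the step the old code was missing */
    p->last_spoke = now;
    return true;
}
```

A loop copying thousands of mails would call `progress_tick(&progress, time(NULL))` once per message, which keeps the cost of the check tiny compared to the work being done.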

The Tbird problems aren’t totally done with yet – I do occasionally get errors still which I’m trying to track down – but it is much better. Certainly, copying large amounts of mail around is much better, and setting read/unread status works much more quickly. It’s getting to the point of being actually usable long-term: I now have some tens of thousands of mails in my system, and it works, which is good news.

Going further with the “actually usable” thing, one thing I wanted to get some better information on was the performance of the Store in various places. From having read the code, I knew there were some places in the code which were obviously poorly performing: the algorithms in use are very naive, and there are much better ways of doing some things. But in order to know where to look, I decided I needed to instrument the Store a little bit. I’ve written a patch which I’ll send along to -devel, but let me show you a little bit of output:

Command COPY took 0.726920 seconds.
 - Out of band messages (0.000033)
 - Starting document copy (0.000295)
 - Files linked (0.033004)
 - Processed doc (0.024436)
 - Parse contents (0.602086)
 - Index contents (0.049533)
 - Commit transaction (0.016867)
 - Finished processing (0.000047)

What happens is that every time a command is run, and during the command run, it pockets away little bits of data about the time performance so far. With enough bits of logging inside you get a reasonably granular idea of what’s going on.
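
For the curious, here is a minimal sketch in C of how this kind of checkpoint logging can work. It is not the patch itself: the `TimingLog` type and the `timing_*` functions are hypothetical names, and the caller passes in the current time in seconds (from whatever clock is handy) so that the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TIMING_MAX_MARKS 16

/* One checkpoint: a label plus the time since the previous checkpoint. */
typedef struct {
    const char *label;
    double      elapsed;   /* seconds */
} TimingMark;

/* Hypothetical per-command timing log. */
typedef struct {
    double     started;    /* time the command started */
    double     last;       /* time of the most recent checkpoint */
    size_t     count;      /* number of checkpoints recorded */
    TimingMark marks[TIMING_MAX_MARKS];
} TimingLog;

/* Begin timing a command. */
void timing_start(TimingLog *log, double now)
{
    assert(log != NULL);
    log->started = now;
    log->last = now;
    log->count = 0;
}

/* Record a checkpoint AFTER the work it describes has finished. */
void timing_mark(TimingLog *log, const char *label, double now)
{
    assert(log != NULL && label != NULL);
    if (log->count < TIMING_MAX_MARKS) {
        log->marks[log->count].label = label;
        log->marks[log->count].elapsed = now - log->last;
        log->count++;
    }
    log->last = now;
}

/* Total time from the start to the last checkpoint. */
double timing_total(const TimingLog *log)
{
    assert(log != NULL);
    return log->last - log->started;
}

/* Print a report in the same shape as the output shown above. */
void timing_report(const TimingLog *log, const char *command, FILE *out)
{
    assert(log != NULL && command != NULL && out != NULL);
    fprintf(out, "Command %s took %f seconds.\n", command, timing_total(log));
    for (size_t i = 0; i < log->count; i++) {
        fprintf(out, " - %s (%f)\n", log->marks[i].label, log->marks[i].elapsed);
    }
}
```

Each checkpoint only stores a pointer and a double, so sprinkling a dozen of them through a command costs next to nothing compared to the work being measured.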

This example is of a particularly poor-performing COPY command. You may think you don’t copy mail very often, but actually a lot of operations turn into COPY: deleting a mail is copying it to the Trash, filing a mail is copying it to the new folder, etc. (IMAP doesn’t know about moving mail). This command took 0.7 seconds: most of the time it’s quicker than that, but that’s close to a mail per second. That’s horrible performance.

If you’d have asked me before where all the time was going, I would have pointed the finger directly at CLucene. However, looking at the timings above, we can see it’s not the main problem: far and away, accounting for 83% of the runtime, is actually the “Parse Contents” stage. The code at that point looks like this:

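    /* Each TCLog() call records the time taken by the code above it,
     * so "Processed doc" labels the work done before this point. */
    TCLog(client, "Processed doc");
    errmsg = StoreProcessDocument(client, &info, collection, dstpath, 0);
    if (errmsg) {
        ccode = ConnWriteStr(client->conn, errmsg);
        goto finish;
    }
    /* "Parse contents" therefore measures the StoreProcessDocument() call. */
    TCLog(client, "Parse contents");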

(The log comes after the code it’s measuring for technical reasons)

So, we can see that single StoreProcessDocument() call is actually the big deal. And what is that call doing? It’s actually just filling in the SQLite database (but not writing to it – that happens later, and doesn’t account for the speed). Re-generating data we already have is accounting for 4/5ths of the run-time cost. Ouch.
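
If you want to check the arithmetic, the little C sketch below takes the timings from the COPY output above and works out which stage dominates and what share of the total it takes. The `Stage` type and the function names are made up for this illustration; only the numbers come from the log.

```c
#include <assert.h>
#include <stddef.h>

/* One stage of a command and how long it took, in seconds. */
typedef struct {
    const char *label;
    double      seconds;
} Stage;

/* Timings copied from the COPY output shown earlier in this post. */
const Stage copy_stages[] = {
    { "Out of band messages",   0.000033 },
    { "Starting document copy", 0.000295 },
    { "Files linked",           0.033004 },
    { "Processed doc",          0.024436 },
    { "Parse contents",         0.602086 },
    { "Index contents",         0.049533 },
    { "Commit transaction",     0.016867 },
    { "Finished processing",    0.000047 },
};
const size_t copy_stage_count = sizeof copy_stages / sizeof copy_stages[0];
const double copy_total = 0.726920;   /* "Command COPY took ..." */

/* Index of the slowest stage. */
size_t dominant_stage(const Stage *stages, size_t count)
{
    assert(stages != NULL && count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (stages[i].seconds > stages[best].seconds) {
            best = i;
        }
    }
    return best;
}

/* Fraction of the total run time spent in one stage. */
double stage_share(const Stage *stage, double total_seconds)
{
    assert(stage != NULL && total_seconds > 0.0);
    return stage->seconds / total_seconds;
}
```

The slowest entry is “Parse contents” at roughly 0.83 of the total, while “Index contents” comes out at only about 0.07.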

No, I don’t have a patch which fixes COPY yet. But I’m thinking about it :)

Diagrams and stuff

September 22nd, 2008 | by | bongo

I do love a good diagram, and I noticed that I have a number of Bongo-related diagrams stuck on my hard-drive. Most of them have seen action in a past blog post of mine, but my blog isn’t really a Bongo development resource, and the ones which haven’t yet been seen really ought to be a bit more widely available.

In general, though, programmers aren’t great at drawing things and worse, updating diagrams in drawing programs is often tedious and time-consuming. So, I’ve committed some graphviz support into our build system which runs when you build the development documentation.

The first – and only – fruit of this endeavour is a UML-like diagram I’ve done to try to figure out what the Store’s SQLite database structure should be. Behold:

[Diagram: UML-like schema of the Store’s SQLite database]

I should at this point take a little bit of time to explain why I did this. I’ve decided that the SQLite usage in the store needs to be pretty much refactored, for a number of reasons. Amongst those is the fact that we store a binary blob within the tables which represents much of this data: some of it, like “mime_lines” in mail documents, is completely hidden within this structure, while other data, like “time_sent”, appears both in this blob and as a column in a table.

So, job number one has basically been getting rid of this evil binary blob. This schema, then, isn’t me really redesigning it per se – I’m simply pulling the structure out of this blob and putting it in tables, where it belongs.

I’ve also denormalised the table a little. In our current database, event and mail attributes exist for all documents: and if the document isn’t an event or mail, those fields are just empty. That’s a bit of a waste, hence the one-to-one relations between “StoreObject” (which represents any object in a store – a mail item, a collection, etc.) and “MailDocument” et al.

The one big change I have made is the addition of the many-to-many “links” table. The purpose of this is simply to group together store objects. In our current store, we have a concept of “calendars” and “events”, and events are linked into calendars rather than calendars being collections which hold events. Rather than having something specific to calendars, I’ve generalised this out. I’m not totally sure I like this as a way of representing calendars – I presume that it was done this way because collections can’t be “typed” like a calendar document can – but it would also allow us to do stuff like linking conversations with events, so you could associate a meeting with the invitation conversation or something.
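
To show what a generic “links” table buys us, here is a small sketch in C that treats the table as a list of pairs of store object IDs. It is purely illustrative: the `StoreId` and `Link` types, the `from`/`to` field names and the helper functions are assumptions, not the real schema or Store code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ID of a row in the StoreObject table. */
typedef unsigned long StoreId;

/* One row of the many-to-many "links" table: "from" groups "to".
 * For example, a calendar linking to an event, or a conversation
 * linking to a meeting. */
typedef struct {
    StoreId from;
    StoreId to;
} Link;

/* Is there a link from one store object to another? */
bool is_linked(const Link *links, size_t count, StoreId from, StoreId to)
{
    assert(links != NULL);
    for (size_t i = 0; i < count; i++) {
        if (links[i].from == from && links[i].to == to) {
            return true;
        }
    }
    return false;
}

/* Collect the objects linked from one object (e.g. the events in a
 * calendar). Writes at most out_max IDs and returns how many it wrote. */
size_t links_from(const Link *links, size_t count, StoreId from,
                  StoreId *out, size_t out_max)
{
    assert(links != NULL && out != NULL);
    size_t found = 0;
    for (size_t i = 0; i < count && found < out_max; i++) {
        if (links[i].from == from) {
            out[found++] = links[i].to;
        }
    }
    return found;
}
```

Because a link is just a pair of IDs, the same event can appear in several calendars, and nothing in the structure cares whether the thing doing the grouping is a calendar, a conversation or something else entirely.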

Notes on the end of Bongo Hackweek

September 22nd, 2008 | by | bongo

Well, my informal little hackweek is coming to a close today, and it has been really good. The experimental store-sqlite branch is now well and truly merged onto trunk, and I’ve been running it the past couple of days. It’s not managing my personal mail yet, but I will probably switch over this weekend: as well as being a lot more performant, a number of serious issues for IMAP users are now totally gone. You don’t have to worry that too much mail is queued up for you any more, or that your mail filters will crap out because of all the re-processing happening on the server.

There is still plenty of work left to do in the store: currently, it’s missing a few functions needed to support the web UI correctly, but I’m still planning on making a release soon. The release will likely be some kind of server-only release, for those people (like me) who are mainly running it for the SMTP/IMAP loveliness.

Going back to the dogfood; I’ve set up the new trunk on shell.bongo-project.org and it’s running well. I’ve signed up to LKML, a couple of other mailing lists (I was already subscribed to OpenSUSE’s discussion list, which must have been bouncing, so I’ve asked Andy Wafaa to pass on my profuse apologies to their ML admin: if it’s any consolation to him, receiving backed up posts in batches of ~10 simultaneously really helped me debug a locking problem we had!), and at the moment I have a few hundred mails in there from running just over a day. I’ve also been testing with my mailbox backup, which is nearer 10 thousand e-mails, and there isn’t even a slight difference in performance.

I would love to see someone test Bongo against some of the mail test harnesses that are out there, so we have some good idea of how large a mailbox we can handle, how many simultaneous mails, etc. I’m really hoping that people will start using Bongo in progressively more serious situations because widespread testing is the best way of finding a lot of our bugs.

Thoughts about MDB

September 22nd, 2008 | by | bongo

I had separate chats with people about MDB yesterday. They started out being about how we’re going to fix a potential security problem with Bongo, but turned into a wider-ranging discussion about MDB. For those who don’t know, MDB is the LDAP-like API we use to store virtually all configuration.

There are a couple of issues with our code base right now:

  • The mdb.conf is world-readable, which is the security problem. This is necessitated by the current Dragonfly setup, which runs in the Apache process;
  • Ideally, we want full configuration access from Hawkeye (the new web admin tool), also in a secure manner;
  • Making MDB schema changes etc. is hard, and we don’t really have an upgrade strategy in place;
  • Bongo was designed to be able to run in a clustered fashion – e.g., having IMAP run on a separate server to the store – but, at the moment, there are a number of hard-codings which make this virtually impossible. The main one being, the list of agents in bongo-manager is hardcoded;
  • The above fact also makes it difficult to see how we could integrate third-party agents easily, which is sad.

It feels to me that we’re on something of a sticky wicket with MDB (translations into colloquial English involve creeks and paddles, I’m given to understand).

There are a number of ways that we can think about solving the problems above in isolation, but the more I think about it, the more I think MDB is hamstringing us. The work involved in solving each individually also seems to be less than the work of a more complete solution. MDB was always something we were planning on getting rid of in the long term, but that was going to be a post-1.0 story.

I’m not going to put forward any potential solutions at this point, because this needs thinking about. In an ideal world, I would punt as much of this as possible: e.g., clustering isn’t a feature I would be sad about dropping from 1.0. However, I also don’t want to see us applying big band-aids: if we built Hawkeye over existing MDB, we’re basically doing work we’d end up throwing away later anyway, or at least rewriting majorly.

This needs a lot of thought :)

Hacking Update

September 22nd, 2008 | by | bongo

There has been a little bit of Bongo hacking. In an ideal world, Pat would get with the programme and get himself a blog. People should keep bugging him on IRC to start blogging, and he could be writing some of this stuff instead of me :)

First up, the big connio-on-GNUTLS patch has landed. Pat has worked extremely hard on this, and while I’ve helped out on some setup stuff, it’s basically his work. This means that connio-using agents now use GNU TLS as their secure sockets and TLS implementation (pretty much, SMTP is the only renegade).

In practice, this won’t make a huge amount of difference to the end user, but it does mean we’re both GPL-compliant and secure again.

Like a machine, Pat is now looking at the SMTP agent to bring it into line. This is awesome work, and will hopefully mean that our SMTP implementation is much simpler than it was.

I’m still hopeful that our M1 release is more or less on schedule. To be fair, we’re not working directly towards it at this point: there are a few things which ought to be part of later releases which are being helped along, but there are three main features we really need for M1:

  1. We need antispam/antivirus working again. Mainly, there are some schema updates that we need to put in place first. I have 50% of my patch for this done, hopefully Pat can have a look after that’s in place (we’re working him really hard…)
  2. We need better logging. Anthony C. has taken a look at it, and I’m hoping we’ll have some mostly working logging for M1. Logging will be an on-going thing, but it’s great Anthony has picked this up.
  3. Luis posted to -dev bemoaning the state of our tools. I’ve written up a short proposal for how they ought to look, and I’ll be putting some patches forward in that direction. There would be fewer tools, but the split will be cleaner, and the tools more orthogonal – it will hopefully make documentation a lot easier.

It’s a fair amount of work, but possibly sounds worse than it is. I’m hoping M1 will be a pretty useable release for developers, and I intend to migrate my Hula server to Bongo with it (technically, now we have GNU TLS, I could do that now). Later milestones will be a bit more straightforward.
