Amazon Simple Storage Service Outage - Some Learnings

February 16, 2008 – 7:42 am by daniel

Amazon’s Simple Storage Service (S3) experienced an outage earlier today, which affected KnowledgeTreeLive and its users. The outage was quite widely reported.

KnowledgeTreeLive is in beta, so this was a great way to learn about our contingency planning, from a technology perspective as well as from a process and communication perspective.

The pundits appear to be pointing to major holes in the cloud computing model, which I think somewhat exaggerates the impact and significance of today’s events. All systems have issues, and this is not entirely unexpected. To provide some context here, we recently experienced outages with our (expensive) hosting provider, RackSpace, who are supposed to be the best in the business (and we were by no means the only parties affected). Amazon has had a really great track record of keeping S3 up and running for the last few years (over 99.993% of the time), and one or two small, isolated outages are acceptable and, indeed, expected.

There are however some learnings suggested by others that I do hope Amazon will take to heart.

This is all well and good, but it is up to us, the companies that leverage cloud computing technologies to provide our customers with innovative (and reasonably priced) services, to ensure that we engineer our systems to deal gracefully with these situations.

  What this outage meant for KnowledgeTreeLive users

  • During the period of the outage, all documents stored in Amazon S3 were safe and unaffected.
  • We experienced problems with the creation of new KnowledgeTreeLive accounts, particularly if you asked for demo data to be placed into your repository. Our support guys picked up on these pretty quickly and contacted the users affected by this.
  • Users weren’t able to upload new documents. This is not a great state, but we think it is appropriate behavior: we want users to be certain that their documents are stored safely in persistent storage.
  • Users weren’t able to download documents they had previously stored. This is certainly not an ideal situation and we’ll be investigating how we can implement a cache of documents within our cluster, probably utilizing the distributed filesystem between the various front-end web server appliances.
  • We couldn’t start up new Amazon Elastic Compute Cloud AMIs. Had a significant increase in load required additional instances, we would have had to take the entire system into maintenance mode. Maintenance mode and other fail-safes are managed from outside of the Amazon cloud.

Some learnings for KnowledgeTreeLive 

  • We need to investigate a “Hot Cache” for documents uploaded to KnowledgeTreeLive, most likely leveraging a distributed filesystem running between our web server appliances. This will allow our customers to continue to have access to their documents during an S3 outage.
  • We need to be better at keeping users informed about what’s going on. We have a KnowledgeTreeLive Beta blog and send an RSS feed of the blog to the KnowledgeTreeLive dashboard, but didn’t do it fast enough this time around.
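
To make the “Hot Cache” idea a little more concrete, here is a minimal sketch in Python. It is not our implementation: the `HotCache`, `FakeStore` and `StoreUnavailable` names are invented for illustration, the remote store is a stand-in for S3 rather than a real S3 client, and a plain local directory stands in for the distributed filesystem. It shows the behavior described above: uploads fail if persistent storage is unreachable, while downloads fall back to the cached copy.

```python
import hashlib
import tempfile
from pathlib import Path


class StoreUnavailable(Exception):
    """Raised by a backing store that cannot be reached (e.g. during an outage)."""


class FakeStore:
    """Hypothetical in-memory stand-in for the remote store, with an outage switch."""

    def __init__(self):
        self.blobs = {}
        self.available = True

    def put(self, key, data):
        if not self.available:
            raise StoreUnavailable(key)
        self.blobs[key] = data

    def get(self, key):
        if not self.available:
            raise StoreUnavailable(key)
        return self.blobs[key]


class HotCache:
    """Write-through, read-through cache in front of a remote document store."""

    def __init__(self, store, cache_dir):
        self.store = store
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Hash the key so any document key maps to a safe, unique file name.
        return self.cache_dir / hashlib.sha256(key.encode("utf-8")).hexdigest()

    def put(self, key, data):
        # Persist remotely first. If that raises, the upload fails and
        # nothing is cached, so users never see an unpersisted document.
        self.store.put(key, data)
        self._path(key).write_bytes(data)

    def get(self, key):
        try:
            data = self.store.get(key)
        except StoreUnavailable:
            cached = self._path(key)
            if cached.exists():
                return cached.read_bytes()  # serve the cached copy during the outage
            raise
        self._path(key).write_bytes(data)   # refresh the cache on every successful read
        return data


if __name__ == "__main__":
    store = FakeStore()
    cache = HotCache(store, tempfile.mkdtemp())
    cache.put("account-1/contract.doc", b"document bytes")
    store.available = False                     # simulate the outage
    print(cache.get("account-1/contract.doc"))  # still readable from the cache
```

A real version would also need cache eviction and some way of invalidating cached copies when a document changes, which this sketch ignores.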

We’re meeting early this coming week to discuss how we can plug these technology and process holes and I’m likely to blog about the outcomes.

Data in the Cloud

December 17, 2007 – 9:13 am by daniel

Amazon have announced the next piece of the puzzle for enterprise application cloud computing: Amazon SimpleDB, the database in the cloud. SimpleDB is a quasi-relational database accessed via web services. You can grow your datamodel and data on the fly without worrying about indexing, storage capacity etc. All the underlying infrastructure is taken care of by Amazon.

We’ve had the ability to store Binary Large Object (BLOB) data for some time in Amazon’s Simple Storage Service. SimpleDB is a significant piece in the enterprise cloud computing puzzle as it provides a persistent, addressable storage medium for non-BLOB data.

While SimpleDB has great potential, the beta has limitations on the number of domains, size of domain datasets and maximum query time. We can’t yet port KnowledgeTree to SimpleDB and our MySQL clustering technology will still be around for a while yet. We will however most certainly continue building out KnowledgeTreeLive’s (our on-demand/SaaS document management software) management, clustering and potentially “self-healing” capabilities using SimpleDB. We’ve been lucky to be invited to the closed beta and will be starting to muck around with SimpleDB over the next few days.

KnowledgeTreeLive Architecture Webinar and Storage, Language Pack Updates

November 16, 2007 – 11:20 am by daniel

I recently participated in a webinar conducted by rPath as part of their Webinar Series. The webinar focussed on how we had built out KnowledgeTreeLive, our on-demand document management software, using Amazon’s EC2 and virtual appliances. rPath have a recording of the webinar available on their website.

On the subject of KnowledgeTreeLive, we’ve increased the amount of storage each user is allowed to 10GB, making the total minimum storage available to an account 50GB! We’ve also added several language packs to KnowledgeTreeLive: French, German, Spanish, Catalan and Simplified Chinese.

rPath’s blurb for the webinar follows:

On this webinar COO Daniel Chalef explains how the company strategically used the combination of scalable, virtual infrastructure from Amazon EC2 and virtual appliances to offer KnowledgeTreeLive as a hosted solution. You’ll learn how the company:

  • Saved time by using rPath technology to build its virtual appliance format;
  • Saved money by avoiding the need to build its own hosted datacenter infrastructure and to re-architect its software for multi-tenancy; and
  • Increased its marketshare potential in the crowded space of document management products with a competitive, on-demand offering.

Guest speaker Phil Wainewright, strategist for emerging software industry trends and author of the ZDNet Software as Services blog, also presents his independent perspective on virtual appliances as a fast and cost-effective route to market for SaaS.

rPath Software as a Service Webinar

November 12, 2007 – 12:13 pm by daniel

I’ll be participating in a webinar tomorrow, organized by rPath, on the topic of virtual appliances and software as a service. Phil Wainewright, strategist for emerging software industry trends and author of the ZDNet Software as Services blog, will also present his perspective on virtual appliances as a route to market for SaaS.

Register to watch the webinar on the rPath website.

KnowledgeTree Running In IIS

October 31, 2007 – 6:04 pm by daniel

Having found an old Windows XP laptop on our systems manager’s desk I decided to try installing KnowledgeTree 3.5.1 within IIS 5.1 and lo and behold, it was quite simple. By 11pm last night I had got KnowledgeTree 3.5.1 Open Source running under Apache, IIS FastCGI and IIS ISAPI, with IIS ISAPI being the clear winner in terms of performance.

Our recent move to “modern” PHP (i.e. PHP5) has certainly made these sorts of things far easier. I’ve written a very high-level HOWTO on the KnowledgeTree wiki.

Edit Documents in the KnowledgeTree Repository Online

October 19, 2007 – 8:00 am by daniel

Rene Kanzler has released a new version of his View All Online plugin for KnowledgeTree which now allows for the editing of .doc, .sxw, .odt, .xls, .sxc, .odp, .sxi, .ppt and .pps documents that are stored in the KnowledgeTree repository using Zoho Office. There are a number of other cool KnowledgeTree extensions available within the plugin. It’s great to see community contributions like this!

We’ve come so far already…

October 17, 2007 – 3:11 pm by Brandon

KnowledgeTree has kept me insanely busy since we started commercializing such that this is the first bit of time I’ve had to fill everyone in on our conquests in the sales channel.

We now have many commercial customers from a multitude of industries across the globe. The learning curve has been very steep but I’m pleased with the way our sales team has grown, how the systems have evolved, and with the sales team’s level of commitment to servicing offshore customers from our Cape Town office (they often work late into the night to get it right).

As is often the case, those closest to a system don’t appreciate the way others outside the system perceive it. A short conversation with 2 prospective customers recently led me to download KnowledgeTree 3.0 in order to gauge the progression since our first foray into the world of commercial open source. I was amazed at how far we’ve come in the past 12 months… and with the team growing monthly, we expect more and faster going forward.

To the administrators & management, to the engineers, to the support team, to the technical writers, to our community, to the coffee manufacturers & suppliers, and to anyone else who has contributed even in the tiniest way… thank you!

KnowledgeTree Open Source Goes GPL v3

October 17, 2007 – 12:31 pm by daniel

I mentioned in a previous post our intention to release forthcoming versions of KnowledgeTree Open Source Edition under an OSI-approved license. With the release of the GPL v3 and its recent OSI-approval, and the OSI-approval of the Common Public Attribution License we felt that the time was right for a change. My previous post also looked at what we wanted from a new open source license, covering both the community and commercial drivers that are important to us.

We have settled on version 3 of the GNU General Public License and will be releasing all future releases of KnowledgeTree Open Source Edition under this license, starting with 3.5.0 next week. This wasn’t an easy decision to make: the CPAL and several other licenses are compelling. CPAL in particular, would be a relatively easy switch for us: the KnowledgeTree Public License is an MPL+ license. The “+” denotes that we have added extra terms to the license, usually requiring attribution through the use of a logo and copyright notice. The CPAL is also based on the Mozilla Public License and the switch would therefore have been far easier to manage, whether it be educating our sales and support team, or our community.

Switching licenses isn’t something you want to undertake too often and is certainly not an easy task. To this end we wanted to be absolutely certain that the license we utilized was appropriate for our needs. As previously mentioned, we workshopped what we wanted from the license terms and I’ll cover off our thinking around three of these below.

Firstly, we wanted a license that would be widely accepted by our community and the open source community at large. We did not want to risk the license we were using being, over time, relegated to the peripheries of the open source world. We wanted to use a license that would have wide acceptance and momentum behind it. What this would mean is that our community would fully understand their rights and obligations around utilizing the software and would not be dissuaded from doing so because they felt they would need to undertake a lengthy and costly legal exercise to determine if they could use our code. Acceptance would also mean that a legal precedent would develop around the license and that bodies would spring up to defend the rights of licensors and licensees (the GPL Violations project is a good example of this).

In terms of acceptance, the CPAL had a difficult birth and is certainly not yet widely accepted. The Affero GPL v3 (more below) has not yet been published and is also not that likely to be as immediately recognizable and understood as the GPL v3.

There are however problems with the GPL v3. Despite being the latest in the line of the most widely-used open source licenses it is still untested: we have no idea how different courts will interpret its terms. In fact, in an effort to become somewhat “jurisdiction independent”, the GPL v3 steers away from using commonly utilized legal terms such as “derivative work” and attempts to define analogues that are likely to be interpreted more uniformly (“modified version”). In my opinion this has been somewhat at the cost of readability and density.

Two major issues that attracted much debate within our team were: user interface attribution provisions (as provided for in the CPAL, and in MPL+ licenses such as the KnowledgeTree Public License and the SocialText and SugarCRM Public License) and network use provisions (to require redistribution of source code in Software-as-a-Service scenarios and found in the Affero GPL, Open Software License and CPAL). There were strong arguments on our team for and against these requirements, coming from both business and ideological perspectives (open source is, for many of us, far more than just business).

Strong user interface attribution (which is enforced in most MPL+ licenses) has recently come in for some serious criticism from leading members of the open source community. These licenses require elements of the majority copyright holder’s branding to remain visible on the application’s user interface. Many in the open source community have never been comfortable with what they perceive as “badgeware”. A little over a year ago we felt that we needed this sort of protection against the forking of our source code or (more likely and more threatening to a commercial open source company) the utilization of the KnowledgeTree source code by an OEM-type entity without them giving “something” back (money, code, publicity). Section 5 of the GPL v3 anticipates the need for some level of attribution and requires that “conveying modified source versions” include the retention of “Appropriate Legal Notices” (which may include a copyright statement on the user interface). [UPDATE: As pointed out by Richard Fontana from the SFLC, section 7(b) permits copyright holders to optionally require preservation of “reasonable author attributions” in the Appropriate Legal Notices (in addition to copyright notices). This is something we are in fact doing.]
We’ve also matured our thinking, built out our community, learnt a lot more about our business and now believe that a strong copyleft license is more appropriate for us: it is far more friendly to an open source community and far more likely to dissuade commercial use of the code in circumstances where profit is involved (if you, a commercial user of the KnowledgeTree source code, want to ensure that your derivative work does not need to be redistributed, you’re going to have to license the code from us under new terms).

Another important discussion for us was whether we wanted and needed the license to view redistribution very broadly and thus interpret the serving of an application over a network as distribution of the code. Unfortunately the Affero GPL v3 (which would provide for this) has not yet been published. We discussed this at length and came to the conclusion that even if it were available, it would still be somewhat exotic (along with several other licenses that attempt to address “network use” and redistribution). We evaluated both the community and commercial aspects of not having this sort of control over the code. The primary concern was in fact commercial: competition with our own SaaS offering, KnowledgeTreeLive. We were however comfortable that the GPL v3’s “Appropriate Legal Notices” provisions mitigated some of this risk.

To summarize a relatively long “brain dump”:

  • the GPL v3 is very likely to gain significant momentum and acceptance and we think it makes good business and community sense to be part of this momentum;
  • the GPL, being strongly copyleft, is a very community-friendly license as it strongly supports the redistribution of source code, more so than the MPL. This also makes it a very friendly license for commercial open source vendors who would like to dual-license their software;
  • the GPL v3 provides for a level of attribution that makes us comfortable;
  • the GPL v3’s lack of “network use provision” could be mitigated by the license’s copyright attribution requirements.

It’s (a)Live!

October 12, 2007 – 8:16 am by daniel

On Tuesday we announced the launch of KnowledgeTreeLive, the hosted on-demand offering of KnowledgeTree. We got some really great press coverage and have had a good number of sign-ups (far surpassing my expectations).

The launch was certainly not free of hiccups: we had been sending sign-up emails directly from our EC2 cluster for a few days when we realized that a large number of the emails weren’t actually being delivered. It turns out that many of the EC2 dynamic IP addresses are already blacklisted by numerous email block-lists for spam violations and it just so happened that our servers had been assigned those IPs. We’re now smart-hosting email through one of our servers that isn’t in the cloud and hopefully registration emails are now getting through!
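
For anyone who runs into the same problem, here is roughly what relaying through a smart host looks like from application code. This is a sketch in Python and not what we run: the relay host name, port, sender address and message text are all placeholders I have made up, and in practice the relaying is often configured in the mail server on each instance and not in the application.

```python
import smtplib
from email.message import EmailMessage

# Placeholder values: a relay outside the cloud whose IP is not on the block-lists.
RELAY_HOST = "mail.example.com"
RELAY_PORT = 25
SENDER = "signup@example.com"


def build_signup_email(to_addr, account_name):
    """Build the registration email; no network access happens here."""
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = to_addr
    msg["Subject"] = "Your new account"
    msg.set_content(f"Your account '{account_name}' has been created.")
    return msg


def send_via_relay(msg, host=RELAY_HOST, port=RELAY_PORT):
    """Hand the message to the relay instead of delivering it directly
    from the cloud instance to the recipient's mail server."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(msg)


if __name__ == "__main__":
    message = build_signup_email("new.user@example.org", "acme")
    print(message["To"], "|", message["Subject"])
    # send_via_relay(message)  # left commented out: needs a reachable relay
```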

We live and learn!

Integrate with the KnowledgeTree Webservice using Borland Delphi

October 10, 2007 – 2:02 pm by Conrad

Bjarte Kalstveit Vebjørnsen has contributed a port of the KTWSAPI library to Borland Delphi. The KTWSAPI is an object model based on the functions exposed via the KnowledgeTree SOAP Webservice.

Bjarte has created a port of the PHP KTWSAPI object model to Delphi 2006 using components from the Indy Project. His contribution can be found in the KnowledgeTree installation in the folder called ktapi/delphi.

Although Delphi developers could easily access the webservice by creating a SOAP proxy object from the WSDL, having a Delphi port allows developers to work with KnowledgeTree documents and folders as object-oriented constructs, making the development process far richer.
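
The difference between a raw SOAP proxy and an object model can be illustrated with a small sketch. It is written in Python purely for illustration, and every class and method name in it (`FlatClient`, `Session`, `Folder`, `Document`, `get_folder`, `get_document`) is hypothetical: none of them are the actual KTWSAPI or webservice function names, and the client is an in-memory fake, not a SOAP client.

```python
class FlatClient:
    """Hypothetical stand-in for a generated SOAP proxy: flat calls, plain dicts."""

    def __init__(self):
        self._folders = {1: {"id": 1, "name": "Root", "documents": [10]}}
        self._documents = {10: {"id": 10, "title": "Contract", "folder_id": 1}}

    def get_folder(self, session, folder_id):
        return self._folders[folder_id]

    def get_document(self, session, document_id):
        return self._documents[document_id]


class Document:
    """A document as an object, not a raw response structure."""

    def __init__(self, data):
        self.id = data["id"]
        self.title = data["title"]


class Folder:
    """A folder that knows how to fetch its own documents."""

    def __init__(self, client, session, data):
        self._client = client
        self._session = session
        self.id = data["id"]
        self.name = data["name"]
        self._document_ids = data["documents"]

    def documents(self):
        # The object hides the session token and the per-document calls.
        return [
            Document(self._client.get_document(self._session, doc_id))
            for doc_id in self._document_ids
        ]


class Session:
    """Entry point that carries the session token for every later call."""

    def __init__(self, client, token):
        self._client = client
        self._token = token

    def folder(self, folder_id):
        return Folder(self._client, self._token,
                      self._client.get_folder(self._token, folder_id))


if __name__ == "__main__":
    root = Session(FlatClient(), "session-token").folder(1)
    for doc in root.documents():
        print(root.name, "/", doc.title)
```

With the flat proxy, the caller passes the session token and ids around on every call; with the object model, those details live inside the `Folder` and `Document` objects.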