Sunday, 24 January 2010

In which software that wasn't properly engineered is shown to have killed people... again.

My wife showed me an article in today’s online New York Times (yesterday’s print edition) that got my blood boiling: a radiation therapy machine manufactured and programmed by Varian Medical Systems wound up massively overdosing and killing two patients in 2005, just like Therac-25 did back in the eighties. The bad part is that Therac-25 is a standard case study for CS students throughout the continent (and probably worldwide), so this shouldn’t have happened in the first place. The worst is that Varian Medical Systems said it was pilot error. As a software developer and a computer scientist, I find incidents like these particularly poignant, as they demonstrate the urgent need for a cultural sea change within the industry.

A quick backgrounder: Therac-25 was involved in six known massive overdoses of radiation, at various sites, that killed two patients. A careful study of the machines and their operating circumstances was conducted, and it ultimately revealed that the control software was designed so poorly that it allowed almost precisely the same problem outlined in the Times’ article. In Therac-25’s case, a higher-powered beam could be activated without a beam spreader in place to disperse the radiation, much like how the Varian Medical Systems machine allowed the collimator to be left wide open during therapy. Furthermore, the Therac-25 user interface, like the Varian interface, was found to occasionally show the operator a different configuration than the one active in the machine, as the article mentions: “when the computer kept crashing, …the medical physicist did not realize that her instructions for the collimator had not been saved.” The same issue occurred with Therac-25, where operator instructions never arrived at the therapy machine. Finally, both interfaces allowed potentially lethal configurations without any kind of unmistakable warning of the danger to the operator.

The Therac-25 incident quickly became a standard case study in computer science and software engineering programs throughout Canadian and American universities, so the fact that this problem was able to happen again is shocking, as is the fact that Varian Medical Systems and the hospitals in question deflected the blame onto the operators. The fact of the matter is that Varian’s machine is one that, when it fails to operate correctly, can maim or even kill people. In such a system, there is simply no room for operator error and it must be safeguarded against. Unfortunately, in shifting the blame, Varian Medical Systems has denied their responsibility in untold more tragedies.

Had a professional engineer been overseeing the machine’s design, these deaths could have been prevented. Unfortunately, most professional engineering societies do not yet recognise the discipline of software engineering, primarily because it is exceedingly difficult to define exactly what software engineering entails. For decades, software systems have been in positions where life and death are held in the balance of their proper operation, and it is critical, in these cases, that professional engineers be involved in their design. These tragedies underscore the need for engineering societies to begin licensing and regulating the proper engineering of such software. By comparison, an aerospace engineer certifies that an airplane’s design is sound, and when an airframe fails, those certified plans typically reveal that the construction of (or more often, later repairs to) the plane did not adhere to the design. Correspondingly, in computer-controlled systems that can fail catastrophically, as Varian’s has, it is imperative that a professional engineer certify that the design—and in a computer program’s case, the implementation—is sound.

Varian Medical Systems’ response—to merely send warning letters reminding “users to be especially careful when using their equipment”—is appallingly insufficient. Varian Medical Systems is responsible for these injuries and deaths, due to their software’s faulty design and implementation, and I urge them to admit their fault. I recognise that it would be bad for their business, but it is their business practices that have cost lives and livelihoods. I think the least they could do is offer a mea culpa with a clear plan for how they will redesign their systems to prevent these incidents in the future.

The IEEE, the ACM, and professional engineering societies need to sit up and take notice of incidents such as this. That they are still happening, even with the careful studies that have been performed of similar tragedies, is undeniable proof that software engineers are necessary in our ever more technologically dependent society, and that software companies must, without exception, be willing to accept the blame when their poor designs cause such incidents. Medical therapy technology must be properly engineered, or we will certainly see history continue to repeat itself.

If this reads like a submission to the Op-Ed page, there’s a very good reason for that! It was, but they decided not to publish it. Oh well.

Friday, 22 January 2010

In which the author reflects on his career

Without trying to sound like a couple of cartoon nine-year-olds, I realised something today—I’ve done a lot of pretty cool stuff in this industry. I might be young, but there’s something to be said for starting early, and always trying to challenge yourself. Let’s see what I can remember… but I warn you, this is a long, long post.

  • Portico/Project McLuhan CMS
    Late last year, I delivered, to a freelance client, an early build of an original CMS that I called Portico. I refer to it as Project McLuhan when I’m coming up with new features, but it’s certainly not just a random exercise—it’s currently being used to serve up bellabalm.com, the website of one of my wife’s colleagues. Since delivery, I’ve been working on enhancements to the user interface (lots of use of AJAX) as well as the core functionality, to make it a little more usable, but it was pretty complete on delivery. It links to PayPal for purchase completion, and Canada Post for shipping cost calculation, and honestly, the shipping calculator was probably the hardest to write, because it's all XML, and I hadn't used XSLT for about six months before I started into it.
  • Project Alchemy toolkit
    In 2008 I worked for a company I describe as a “social advertising agency”. At the time, I was one of two developers, and we jointly decided to standardise the company on Zend Framework, based primarily (if my memory serves me) on the fact that ZF was created by the same group of programmers who wrote PHP, so we had good reason to believe that it would be the most effective for the job. A few months later, a couple of other developers were brought in who knew Ruby on Rails, and with it, CakePHP. I’ll say this for Cake: it certainly enhances developer time for developers familiar with its API, and it seems to be a bit more clearly documented than Zend. However, from my work with it, it’s not as efficient in terms of database usage, so I can't really say I’m a fan (also, hearing one of these Ruby guys say, on about a weekly basis, “I HATE CAKE!” amused me). However, I did find really quickly that the Zend API is somewhat lacking. Project Alchemy was born out of my desire to merge the two: to take everything I liked about Cake, and make it available in a Zend context, along with some other features that I’ve long thought were important. Of particular note is a full audit trail of all database changes (and optionally, queries), which is easier than you might think to write, and will benefit you in the long run in case Something Bad happens. I'm also trying to decide on the best approach to implement model caching; a course in operating system design late last year brought the notion of shared memory into my mind, so now it’s just a matter of actually putting it together. Project Alchemy’s by no means complete, in terms of everything I want to do with it, but the essentials are all there, and it’s been an amazing learning process.
  • Innumerable changes to CIBC.com, including a core role in the October 2009 redesign
    I was on contract at CIBC for quite a while, but in a purely front-end role. Nevertheless, I got exposed to some fun technologies while I was there—XSLT, XPath, XForms—that I’ve been meaning to investigate but otherwise haven’t had the time to properly study. When the project to update the website’s layout and accessibility came up, I was glad to be involved in the project, because I got a chance to really stretch and refresh my front-end skills, including some neat CSS and JavaScript techniques. I refamiliarised myself with basic AJAX, and I’ve been using it a lot in Alchemy and McLuhan.
  • The back-end image manager for the “Radio Perez” mobile podcast site and iPhone app.
    While I was working at that ad agency, I was asked to write a pretty bare-bones CMS to serve up images and ads to an iPhone app, as well as mobile-ready pages. Two cool things I got to do with this were using the GPS data from the iPhone to provide geographically-relevant data (specifically, serving up a particular button graphic depending on your proximity to one of the client's radio transmitters), and using the scaffolding technique to get the required functionality going early, before worrying about making it look right. I first saw this technique in use by the Cake developers at that agency, thought it was a good idea, and tried it out. Turns out that it’s a great idea, and I've been using it ever since. It helped me deliver bellabalm.com in time without worrying too much about making the back end look exactly perfect. It did what it needed to do well before launch, it just wasn’t too pretty.
  • An inter-social-network photo album (ad agency)
    This one was only partially completed before the Radio Perez thing came up, so it got shelved (there were only a few developers in the company at the time), but it was going to be a pretty cool photo album, that users could get at from Facebook, MySpace, iGoogle… wherever, really. I did my best to make it work like Facebook’s photo albums, but I couldn’t figure out how to make photos draggable to reorder them (might have been because I hadn’t properly researched JavaScript frameworks at the time), so instead I implemented a priority queue in JS to update the sequence numbers correctly in real time. As I said, it never got completed while I was there.
  • An inter-social-network discussion board (ad agency)
    This was the first project the company did with the technique of opening up a social app to multiple social networks simultaneously. I think it’s gone on to become one of their flagship products, so I have to laugh that I came up with the original design, and figured out all the potential problems and issues in doing this kind of thing, and now they’re making a mint… and then I get sad, because I remember that I’m still bound to a noncompetition agreement and I probably can’t do anything with that knowledge for a while yet. RIM used it, MTV used it, and I think Astral Media used it before it got refactored from Zend to Cake. It doesn’t seem to be running in its original form anywhere anymore, so it looks like it’s been pretty thoroughly rewritten.
  • RMAtic Return Merchandise Authorisation tracker
    My first attempt at writing an issue tracker, and if I do say so myself, a successful one! This is where I first got the chance to write the audit trail I mentioned above with Project Alchemy, because if anything needs that kind of tracking, it’s an issue tracker. Some weird bit rot that I haven’t been able to isolate has set in and won't let me update the RMAs in the database, but if I ever want to do an issue tracker again, I’d be rewriting it from scratch; this was done before I had a chance to use Zend, so all the links between things are procedural. It’s not super-pretty code, when you get down to it. But then, I honestly didn’t expect to ever have to see it again, so I wasn’t concerned with code re-use. I’m told it takes three times as much effort to write reusable code as it does to write a one-off, and that’s exactly what this became! If you’re curious, though, check it out. The username and password are both “admin”. It's still branded with the client’s livery.
  • A client relationship and project manager
    This was the last project I worked on at a custom software company north of the city. Based in equal parts on the existing in-house CRM, their in-house CMS, and Basecamp, the notion was that I’d rewrite the CRM to be something that could be resold. I very carefully wrote up a specification for everything it was going to do, how everything would interact, how the permissions model would operate, a huge E-R diagram for it, and got to work. I was easily 80-90% of the way through it (all the core stuff was there; I think I was down to making it easy to install add-ins, and maybe one or two other features that I can’t quite recall) when the contract ended, for a variety of reasons. It was hugely ambitious, I was the only person on the project, and I think I did a great job of it. It was able to handle all the different clients the company had, with infinitely-overlappable user groups, all the various projects, work tasks (it had a built-in issue-tracker), notes, phone calls, anything billable… invoicing. The whole shebang, with a two-dimensional access control scheme that depended on what permissions the user/user group had on individual items and their parent items. Hugely complex, and I’ve realised in retrospect that (a) role-based access control is a far better idea in terms of software complexity, database load, and usability, and (b) the database table that held the permissions really needed to be indexed and stored with a different storage engine. You live, you learn. I’ve been meaning to write myself a simple CRM for my freelance clients, and the little snippets of memory I’ve still got about that project should at least help me get started. It’ll be done in either Project Alchemy or even Django, if I’ve become familiar enough with Python by the time I get started.
  • A shopping cart for an existing CMS (custom software company)
    The client wanted to add a store to their website, so I had to learn the old version of the company’s CMS in a hurry and add a new store module to it. It wasn’t easy, particularly when trying to put the shopping cart together, and make sure that all the right information appeared on the right pages, but I did it, and it worked, and allowed customers to select from a bunch of different customisable options, as well as customising certain products. I realised, in the last month, that the user experience certainly left something to be desired, but again… you live, you learn. I think it was, more than anything, an issue of nomenclature. I used “option” to refer to, say, a specific T-shirt size, and “option group” to refer to offering size as an option. I just didn't make it very intuitive, and I think I know how I’d do it better in the future.
  • An inventory manager for a warehousing and shipping company (custom software company)
    This client stored materials for their clients, and needed a way to allow the clients to see how many boxes of each they had warehoused, order some to be shipped, and be alerted when the level was dropping low so that new items could be made. Fairly straightforward, and served as a basis for the issue tracker I built into the CRM later. Users could get things shipped to the address of their choice, and it would remember past addresses to make things easier. Got the opportunity to put together a PDF template that exactly matched their existing shipping labels, so that was fun. First project I did there.
  • A really elemental blogging tool for my own use.
    This was bad, hacky code. I put in only what features I needed, and it also served up static content stored in HTML files within a specific directory structure. I just wanted to write my own blog without having to edit the HTML every time. There were comments, but I started getting spammed, so I had to disable them. The only feature I actually built was writing new blog posts. Couldn’t even edit after the fact. But it was also the first database-driven thing I tried writing… back in first year. I wanted to learn what it took to do it. There may be a reason why I use Blogger for it now… it’s a more complex problem than I wanted to get into at the time. Maybe later, though.
  • While I was in high school, I came up with a way of handling a multi-page ticket-ordering process for a theatre that didn’t require sessions. I didn’t really have access to the server that hosted it, and I didn’t understand session cookies at the time anyway (not that I could use them without being able to get at the server), so I used a frameset, with an invisible frame full of hidden inputs. The hidden inputs were updated by JavaScript, and the last page emailed their contents to the box office. It was quite interesting, and it was superseded a while ago.
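
The audit trail idea mentioned under Project Alchemy and RMAtic can be sketched in a few lines. This is not the Alchemy or RMAtic code (those were PHP); it is a minimal Python illustration using the standard sqlite3 module, and the table names (rma, audit_log), the column allowlist, and the function names are all made up for the example. The point it shows is that the old values, the new values, and the change itself are written in one transaction, so the log can never disagree with the data.

```python
import json
import sqlite3

# Hypothetical allowlist of columns that may be changed on the example table.
RMA_COLUMNS = {"status", "notes"}


def create_schema(conn):
    """Create a toy data table and the audit table that shadows it."""
    conn.executescript(
        """
        CREATE TABLE rma (id INTEGER PRIMARY KEY, status TEXT, notes TEXT);
        CREATE TABLE audit_log (
            id INTEGER PRIMARY KEY,
            table_name TEXT NOT NULL,
            row_id INTEGER NOT NULL,
            changed_by TEXT NOT NULL,
            old_values TEXT NOT NULL,
            new_values TEXT NOT NULL,
            changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        """
    )


def update_rma(conn, rma_id, changes, changed_by):
    """Apply `changes` to one row and record before/after values."""
    if not changes:
        raise ValueError("no changes given")
    unknown = set(changes) - RMA_COLUMNS
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    columns = sorted(changes)
    # The connection context manager commits on success and rolls back
    # on an exception, so the update and its log entry succeed or fail together.
    with conn:
        row = conn.execute(
            f"SELECT {', '.join(columns)} FROM rma WHERE id = ?", (rma_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no rma with id {rma_id}")
        old_values = dict(zip(columns, row))
        new_values = {column: changes[column] for column in columns}
        assignments = ", ".join(f"{column} = ?" for column in columns)
        conn.execute(
            f"UPDATE rma SET {assignments} WHERE id = ?",
            [new_values[column] for column in columns] + [rma_id],
        )
        conn.execute(
            "INSERT INTO audit_log"
            " (table_name, row_id, changed_by, old_values, new_values)"
            " VALUES (?, ?, ?, ?, ?)",
            ("rma", rma_id, changed_by,
             json.dumps(old_values), json.dumps(new_values)),
        )
```

Column names are interpolated into the SQL only after being checked against the allowlist; every value goes through a bound parameter. A real framework would presumably hook this into the model layer rather than writing one function per table.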

That’s about it for proper projects that really saw the light of day, that I did something interesting with. A few things I came up with that either didn't go anywhere or just weren’t really a technical achievement follow:

  • You know how Jillian Michaels, one of the coaches on The Biggest Loser, now has a website where you can track your weight-training goals online, and interact with other people trying to do the same thing? I had that same idea while taking a strength training class in high school. Many, many moons ago. Used it as a project idea. Shelved it because I didn’t have the money to start it up as a business and advertise it. Now I’m kicking myself.
  • In grade school, I had the opportunity to make the school’s website. I wanted to coordinate with the teachers to put lesson plans up online, so that parents could keep aware of what their children were learning. Nobody really wanted to go along with it. In my final year of high school, I tried designing and implementing the same solution, on a high school level, in a way that would empower the teachers to put up what content they wanted… it was really more ambitious than I expected, and all I have left are a few design documents. Then I started university, and discovered that WebCT and Blackboard existed… they do exactly what I was trying to make for my school. D'oh!
  • Also in my last year of high school, I had the “brilliant” idea of trying to create a bash-like shell for MS-DOS… in Perl. Like the course management tool I just described, this was far more ambitious of an idea than I ever expected it to be, probably because at the time, I didn't have much experience with Unix, and didn’t understand the domain of the problem. But it serves as a good example of the level of idea I tend to come up with: BIG. Sometimes revolutionary, sometimes just complex, but I think big. I like big problems that require a lot of thought.
  • Back in the bad old days, just when DVDs had first come out, my cousin and I designed a website for a video store in our hometown. Not very revolutionary there, but we had an idea that no one else did: show the movie trailers on the website. It’s obvious now, sure, but it wasn’t in 1997; only a small fraction of movies even had websites to begin with!

So, yeah… I’ve been around. I’ve seen quite a bit, now that I think of it, and there’s probably a lot of stuff that I can’t properly remember that’s getting left out, fairly or unfairly. I should remember this.

Saturday, 16 January 2010

In which opportunities are taken

The coming of the new year, like everyone is always told, has shown itself to be a time of new opportunities. However, what’s unusual this time around is that I’ve also been in a wonderful position to take the opportunities with which I’ve been presented. The software I’ve been slowly writing over the past roughly six months (dubbed, depending on what specific functions I’m working on, either Project Alchemy or Project McLuhan) has forced me into a position to really clean up some of the core things (that is, Project Alchemy) I originally put off implementing for the sake of getting it out the door, and an outside project has finally given me an opportunity to make good on my threat to learn Python.

First things last.

Given the nature of this blog so far, and the fact that I’ve been using it to talk about technical matters from a reasonably professional point of view, I kind of wish this opportunity to learn Python had a more professional set of circumstances. On the other hand, it’s not as though I’ve ever tried to hide this particular circumstance; I’ve even discussed it in this very blog. In my continuing attempts to finish my BCSc, I’m required to take an introductory software engineering course, and I’m part of a development team that will be making a product virtually guaranteed to be delivered over the web. At first, I thought that I might be able to make an argument for integrating Project Alchemy into the codebase to speed up development, but then I discovered that I’m the only member of the team who has no experience with Python, and no one else seems to know PHP. As a result, it’s kind of a given that we’ll be using Python, so I have a fairly steep learning curve ahead of me to get caught up. The University of Toronto’s Computer Science department uses Python as the language of choice for its first year classes, so everyone else has had at least a year of active (albeit toy) development with the language, and I have less than two-and-a-half months on this project, in which to provide real, tangible support. No time to lose!

On the other hand, I think I have more project planning experience than the rest of the team (I remember how slapdash my programming practice was at Dalhousie, but this is a real-world product), so I think I might be able to make up my gaps in language experience by taking care of more of the behind-the-scenes things in the management of the project. I’ve been kind of spearheading that already, but it hasn’t really been particularly set in stone who’s doing it. I guess we’ll see how this works out. With any luck, our client will get a solid piece of code and I’ll get some realistic experience in project management, which can only ever serve to benefit me professionally, for reasons that seem fairly obvious.

In terms of Project Alchemy, like I said, I put off implementing Alchemy functions for the sake of getting Project McLuhan (and more specifically, my client’s website) out the door on schedule. As a result, fairly important architectural things like multi-model find calls were put on the back burner while I dealt with end-user issues like a shipping rate calculator. Now that it’s been delivered, I’m trying to clean up the McLuhan interface to improve its usability, and I’m realising just how critical that functionality is.

Admittedly, part of my initial hesitance in implementing it up-front was a certain degree of intimidation by the problem. A large part of the design involved coming up with a way of compartmentalising WHERE clauses: rather than simply requiring the find conditions to be input as a string, the conditions are abstracted out to better insulate against malformed data from the end user. As it turned out, it wasn’t exceedingly difficult (with two caveats: I still have yet to come up with a way of handling subqueries for the IN/NOT IN operators, and I’m not convinced that I’ve done it in a particularly efficient manner), but now I have to go back through all my controllers and models and rewrite the find calls to use the new schema. It’s a bit of a pain (yet another example of why Doing It The Right Way First will invariably save you time in the long run), but it’s ultimately worth it, because it’s also providing a good opportunity to review my own code and look for bad algorithms.
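
To make the idea of compartmentalised WHERE clauses concrete, here is a rough sketch in Python rather than PHP, since that is the language I am trying to learn. It is not Alchemy's actual find API; the function name build_where, the tuple format for conditions, and the operator allowlist are all assumptions made for illustration. Each condition is a (column, operator, value) triple: columns and operators are checked against allowlists, and values only ever travel as bound parameters. Like the real thing, it handles IN/NOT IN with a list of values but not with a subquery.

```python
# Hypothetical operator allowlist; anything else is rejected outright.
ALLOWED_OPERATORS = {"=", "!=", "<", "<=", ">", ">=", "LIKE", "IN", "NOT IN"}


def build_where(conditions, allowed_columns):
    """Turn [(column, operator, value), ...] into (sql_fragment, params).

    Conditions are joined with AND. The fragment contains only vetted
    column names, vetted operators, and "?" placeholders.
    """
    clauses = []
    params = []
    for column, operator, value in conditions:
        if column not in allowed_columns:
            raise ValueError(f"unknown column: {column!r}")
        operator = operator.upper()
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator!r}")
        if operator in ("IN", "NOT IN"):
            # One placeholder per list element; subqueries are not supported.
            values = list(value)
            if not values:
                raise ValueError("IN / NOT IN needs at least one value")
            placeholders = ", ".join("?" for _ in values)
            clauses.append(f"{column} {operator} ({placeholders})")
            params.extend(values)
        else:
            clauses.append(f"{column} {operator} ?")
            params.append(value)
    return " AND ".join(clauses), params
```

A find call would then pass something like [("status", "=", "open"), ("id", "in", [1, 2, 3])] along with the model's known columns, append the returned fragment after WHERE, and hand the parameter list to the database driver. Nothing the end user typed ever becomes part of the SQL text itself.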

So, yeah, there have been a couple of great opportunities so far this year that I feel I’ve really been able to take advantage of. One great opportunity to learn a language I’ve been meaning to learn for more than a year, and another to finish up something important and learn a good deal more about what really goes into writing an MVC framework. This can only possibly be good for me, so stay tuned to see what develops!