This used to be where I'd let off steam after uncovering a nasty bug in the Illustra object-relational database management system. And, in fact, sometimes I reached such poetic heights of vitriol that I'm leaving the old stuff at the bottom (also, it might be useful if you are still running Illustra for some reason).
However, there really aren't any good reasons to pick on Illustra anymore. The company was bought by Informix, one of the "big three" traditional RDBMS vendors (Oracle and Sybase being the other two). Informix basically folded the interesting features of the old Illustra system into their industrial-strength enterprise-scale RDBMS and calls the result "Informix Universal Server" (IUS). To the extent that IUS is based on old code, it is based on Informix's tried and true Online Server, which has been keeping banks and insurance companies with thousands of simultaneous users up and running for many years.
I plan to be experimenting with IUS in some heavily accessed sites during the latter portion of 1997. I'm going to record my experiences here and hope to have lots of tips and source code to distribute.
Another good place to look is the on-line edition of my book and the accompanying virtual CD-ROM.
One of the first horrors that confronts the Oracle SQL or ANSI SQL programmer is the lack of a boolean data type. Illustra has one. It is much cleaner than using CHAR(1) filled with 'f' or 't'.
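Here is a minimal sketch of the CHAR(1) workaround, in Python with SQLite purely for illustration (the table and column names are made up): a pseudo-boolean column with a CHECK constraint so the database at least rejects anything other than 't' or 'f'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHAR(1) workaround: a pseudo-boolean constrained to 't' or 'f'
conn.execute("""CREATE TABLE users (
    user_id INTEGER,
    on_mailing_list CHAR(1) CHECK (on_mailing_list IN ('t', 'f')))""")

conn.execute("INSERT INTO users VALUES (1, 't')")

try:
    conn.execute("INSERT INTO users VALUES (2, 'x')")  # not a legal pseudo-boolean
except sqlite3.IntegrityError:
    print("rejected non-boolean value")

n = conn.execute("SELECT count(*) FROM users").fetchone()[0]
print(n)  # 1
```

A real boolean type makes all of this ceremony (and every `= 't'` comparison in every query) unnecessary.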
Note: Informix Universal Server has at least two and maybe three full-text search Blades, one of which is PLS and none of which is the Doc Blade which I revile below. Also, IUS has the nice clean boolean data type.
[Note: the Oracle text searching products provide a custom list of stopwords per index.]
Rule 2 (particular case of Rule 1): don't depend on the Illustra/Informix Web Blade.
The Web DataBlade is yet another server-side extension to HTML. The best of these extensions is Meta-HTML, a carefully designed programming language with HTML syntax. The simplest are the basic server-parsed HTML facilities that you get with standard Web servers (.shtml files).
The Web DataBlade is an ad-hoc design that lets you embed SQL queries in HTML. You are supposed to edit that HTML/SQL hybrid in a Netscape TEXTAREA form input and then submit it to be stored in a database table. Then you can link to Web Blade-backed pages and the WebDaemon will pull the relevant piece of extended HTML out of the database, parse it, do the SQL queries spec'd by your HTML, and then return standard HTML to the requesting client.
Compare this to AOLserver Tcl or CGI. If you make a change in either a CGI script or an AOLserver Tcl API program, you can do it in Emacs and the change is immediately reflected next time the page is loaded by a user. Since the CGI and Tcl scripts are stored in the Unix or NT file systems, you can use Emacs, grep, or perl to quickly find and replace pieces of your programs.
Databases are better than file systems, but not if the database vendor hasn't put any thought into what happens to the data once it goes into the RDBMS. If Informix had written a little bit of Emacs Lisp code so that you could edit your Web Blade pages from a standard editor, then having the database storage for code might be nice. But because they only implemented half of a solution, the RDBMS becomes a prison for your code.
<?MISQL SQL = "SELECT email, subject FROM bboard where msg_id = 37;">
$2 (from $1)
<?/MISQL>
The business end of this code is the $2 (from $1). This is horrible data abstraction. It might not look that bad now, but what if you had a complicated query that produced 30 lines of HTML? Do you really want to see $13 on line 28 and try to figure out which column it is? In Oraperl, AOLserver Tcl, or PL/SQL for Oracle WebServer 2.0, you'd instead be able to say $subject (from $email).
Conditionals and control structure appear to be even worse. The heart and soul of this thing is declarative, but a lot of the time you need to do procedural stuff. Feast your eyes on this:
cond=$(OR,$(NXST,$email),$(NXST,$name),$(NXST,$subject))
I thought nothing could be worse than Tcl or Perl quoting, but I was wrong. To keep a stray apostrophe from breaking your SQL, you have to write

<?mivar delimit="'" replace="''" name=subject>$subject<?/mivar>

for every variable that you insert into the database. Ugh!
[Note: you could define a function call to do this a little more cleanly but it would still be much more painful than my Tcl hack.]
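For contrast, here is a minimal sketch of hand-escaping versus bind variables, in Python with SQLite purely for illustration (the table and strings are made up; Illustra and the Web Blade are not involved):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bboard (msg_id INTEGER, subject TEXT)")

subject = "it's broken"  # contains a single quote

# The Web Blade way: escape by hand, for every variable, every time
escaped = subject.replace("'", "''")
conn.execute("INSERT INTO bboard VALUES (1, '%s')" % escaped)

# The sane way: a bind variable; the driver handles quoting for you
conn.execute("INSERT INTO bboard VALUES (2, ?)", (subject,))

rows = conn.execute("SELECT subject FROM bboard ORDER BY msg_id").fetchall()
print(rows)  # both rows hold the original string, apostrophe intact
```

With bind variables the quoting problem simply disappears from application code, which is why every sane database API offers them.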
There is a way to mush the Web Blade code into the Netscape server family using the Netscape API. This should substantially increase performance but I've seen a few tales of woe about reliability and I'm not sure that I'd trust the overall system (since a bug in Illustra's code could bring down services for all Web users; that's the downside of a C API and that's why I only use safe APIs like Tcl, Java, and JavaScript).
Note: The Informix Universal Server version of the Web Blade allegedly has the NSAPI and ISAPI interface under control. There are production sites running this (efficient) way, seemingly quite reliably.
This isn't a problem in Oraperl or AOLserver Tcl. You probably don't need the whole page to run as a transaction and if you don't do anything special then the SELECT up top and the UPDATE down below are separate. You cannot possibly lose in this way.
But it turns out to be a serious problem at this Web Blade-backed site. Call this Bug 1: every page is forced to run as a transaction.
So I turned on MI_TRACE planner.1 in the MiParams file so that I'd see all the queries in the miserver.log and could figure out which pages were causing the deadlock. Nothing showed up. It turned out that this was a known bug, let's call it Bug 2: tracing doesn't work for WebBlade queries. This greatly compounds Bug 1.
Finally, it turns out that being deadlocked triggers a bug in the Illustra 3.2 database code whereby miserver processes will grow to consume all the virtual memory on the machine. Now you can't fork processes even to see what is going on. Call this Bug 3, greatly compounding Bug 1 and Bug 2.
How would this work on AOLserver? First, you'd not deadlock unless you explicitly opened a transaction yourself and then did the classically stupid thing of reading before writing. Second, you'd see the queries that were sent to Illustra, in order, written in the AOLserver's own log file (and also probably in Illustra's miserver.log if you configured the RDBMS properly). You wouldn't have to grope around in the dark.
Note: The Informix Universal Server version of the Web Blade lets you choose whether to have each page execute as a transaction or not (I would choose "not", duh!). Also, even if you do have piles of transactions, IUS is about 10 times better at handling locks than Illustra ever was so deadlocks shouldn't be a problem. Finally, the newest Web Blade has much better tracing facilities.
I was then able to touch up the Perl output in Emacs at the rate of about 20 db-backed URLs/day. End result? The site was about 10 times more responsive at times of light load and was no longer plagued with deadlocks. The site went from being able to perform no more than 1 query at a time to being able to easily handle 10 or more queries/second.
[Note: If you have a suffering Web Blade site and would like a copy of my Tcl and Perl scripts to assist in a conversion to AOLserver, you can download them.]
IUS Note: The above paragraph seems harsh on a second reading. It is probably no more harsh than the Illustra Web Blade deserves. However, the new Informix Universal Server Web Blade 3.3 fixes all of the site-killing bugs/features in the original Web Blade. So I wouldn't tell someone with a working site to switch.
Moon was a superb hardware engineer and nobody doubted that he would get the Foonly up and running. Still, people were a bit surprised when a huge steel cylinder came crashing through the machine room wall. The cause of the crash had been one of those washing-machine-sized Control Data T-300 disk packs. The cylindrical missile had been the spindle holding together the bad 12" platters. Moon had hurled it through the wall after determining its guilt in the crime of the Foonly crash. I went back to my office and taped up a poster.
The point of this story is to illustrate the prevailing lack of patience among the early developers and users of object systems. One of the things that drove them crazy about the object systems of the 1970s (Smalltalk, Lisp Machine Flavors) was that if you changed a class definition, the existing instances of that class did not get modified. You'd have to restart your program, maybe even reboot your computer, if you changed your mind about how to represent something. You could lose 20 minutes or even more.
So the object systems of the 1980s, e.g., Common Lisp Object System, were designed to touch up running instances of classes if the class definition changed.
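The same behavior is easy to demonstrate in any dynamic language that keeps instances linked to their class. A sketch in Python (not CLOS, and the class is made up, but the instance-updating behavior is analogous):

```python
class Product:
    """A toy class standing in for a CLOS class definition."""
    def price(self):
        return 10

p = Product()          # an existing, live instance
assert p.price() == 10

# "Redefine the class" by replacing the method on the class object.
# The live instance picks up the change immediately: no restart,
# no reboot, no dump-and-rebuild.
Product.price = lambda self: 12
assert p.price() == 12
```

Instances delegate to their class at call time, so changing the class changes every existing instance; that is exactly the property the 1980s object systems were designed to guarantee.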
When I read the Illustra User's Guide, I came to it with 15 years of experience with object systems. It took me about an hour to figure out how to take all the columns that were common to a bunch of tables and put them in a supertable, then make subtables inherit from them. In a Web site that sold magazines, videos, books, and CDs, I made a products table with columns product_id, short_name, description, etc. Then the magazines table would inherit from products plus add a column for issues_per_year.
Everything was great until six months later, when I wanted to facilitate integration with the MIS department by adding a foreign_key column to the products supertable. You can't change a table with dependents! I had to dump the data out of my tables, drop all of them, and then rebuild them in order to change the schema. I ended up rebuilding them without using inheritance, which is a potentially useful feature, but not when you end up with such a brittle data model.
In short, Illustra failed to absorb the object-oriented lessons of the 1970s: class definitions change. This is much worse for a database than for those old Lisp systems. It took a few minutes to restart a Lisp Machine program; you might be off the Web for days trying to rebuild a big database.
I do not want to be around Dave Moon if he ever gets hold of a copy of Illustra.
IUS Note: I do not think that the "brittle data model" problem is fixed in IUS. Be careful and talk to your Informix sales and support people.
Illustra won't help them. It picks up the design flaws in the ANSI SQL date-time standard and then adds a few implementation bugs.
select classified_ad_id from classified_ads
where posted > (current_date - interval '1' month)
Since posted is a column containing the date that an ad was placed, you'd expect this to return rows of ads placed within the last month. Well, it was working great for my photography magazine's classified ad section, so I forgot about it. Then I got a pile of complaints from users in my mailbox. The service had been dead for a couple of days because March 30 minus 1 month = February 30 according to Illustra. This wouldn't have been so bad if Illustra didn't then check the validity of the date it had just generated and fail with an error message.
The incredible thing about this is that it is the behavior specified by the ANSI standard and documented both in the Illustra User's Guide and the "red standard" book (Date/Darwen). Oracle has nonstandard syntax, but the same semantics would have returned February 28 or February 29 depending on the year.
Workaround: Use current_date - interval '30' day. Watch it fail with another error when you try current_date - interval '100' day.

Workaround for the workaround: Manually tell Illustra that you want three digits of precision on the day field (though you'd think it could see that): current_date - interval '100' day(3).
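The two semantics are easy to see side by side. Here is a sketch in Python; subtract_one_month is a made-up helper, with clamp=True giving the Oracle-style answer and clamp=False the ANSI/Illustra-style error:

```python
import calendar
import datetime

def subtract_one_month(d, clamp=True):
    """Subtract one calendar month from date d.

    clamp=True  -> Oracle-style: an impossible day (e.g., February 30)
                   is pulled back to the month's last real day.
    clamp=False -> ANSI/Illustra-style: raise on the impossible date.
    """
    year, month = (d.year, d.month - 1) if d.month > 1 else (d.year - 1, 12)
    last_day = calendar.monthrange(year, month)[1]
    if d.day > last_day and not clamp:
        raise ValueError("%d-%02d-%02d does not exist" % (year, month, d.day))
    return datetime.date(year, month, min(d.day, last_day))

print(subtract_one_month(datetime.date(1996, 3, 30)))  # 1996-02-29 (leap year)
# subtract_one_month(datetime.date(1996, 3, 30), clamp=False) raises ValueError
```

Both behaviors are defensible in isolation; the killer is generating February 30 and then failing your users' queries on it.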
Then I patted myself on the back for constructing a studly "self-join with subquery" to get the intervals (yes, I think this really is the simplest query that will produce the desired data). First, the table and its contents:

create table foobar (
        sample_time    timestamp,  -- when did the cookie have the value
        sample_value   integer
);

select * from foobar order by sample_time;

------------------------------------------
|sample_time               |sample_value |
------------------------------------------
|1996-02-15 19:02:08.000000|75000        |
|1996-02-16 19:02:08.000000|76000        |
|1996-02-17 19:02:08.000000|77000        |
|1996-02-18 19:02:08.347617|77276        |
------------------------------------------

And the query:

select h1.sample_time as s1,
       h2.sample_time as s2,
       h2.sample_time - h1.sample_time as gap_length
from foobar h1, foobar h2
where s2 > s1
and s2 = (select min(h3.sample_time)
          from foobar h3
          where h3.sample_time > h1.sample_time)
order by s1;

---------------------------------------------------------------------
|s1                        |s2                        |gap_length   |
---------------------------------------------------------------------
|1996-02-15 19:02:08.000000|1996-02-16 19:02:08.000000|0-0          |
|1996-02-16 19:02:08.000000|1996-02-17 19:02:08.000000|0-0          |
|1996-02-17 19:02:08.000000|1996-02-18 19:02:08.347617|0-0          |
---------------------------------------------------------------------

I had planned to rush out into the clubs of Manhattan with a printout of this result, expecting women to melt when they saw it. But it was not to be. Subtracting two timestamps a day apart produces a gap that is 0 years and 0 months long. I never really did find a workaround.
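For what it's worth, the self-join idea itself is sound when the database subtracts timestamps sensibly. A sketch in Python with SQLite, purely for illustration (julianday turns the difference into days; data abbreviated from the table above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foobar (sample_time TEXT, sample_value INTEGER)")
conn.executemany("INSERT INTO foobar VALUES (?, ?)", [
    ("1996-02-15 19:02:08", 75000),
    ("1996-02-16 19:02:08", 76000),
    ("1996-02-17 19:02:08", 77000),
])

# Same self-join-with-subquery idea: pair each row with its successor,
# then subtract the timestamps to get the gap (in days, via julianday).
gaps = conn.execute("""
    SELECT h1.sample_time,
           h2.sample_time,
           julianday(h2.sample_time) - julianday(h1.sample_time) AS gap_days
    FROM foobar h1, foobar h2
    WHERE h2.sample_time = (SELECT min(h3.sample_time) FROM foobar h3
                            WHERE h3.sample_time > h1.sample_time)
""").fetchall()

for s1, s2, gap in gaps:
    print(s1, "->", s2, "gap:", round(gap, 6), "days")
```

The query is structurally the one above; the difference is that the timestamp subtraction actually yields a useful number.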
IUS Note: The syntax is very different. I'm pretty sure that the bugs are gone. I'm pretty sure that the "feature" described above (subtracting 1 month from March 30 results in an error) is still a potential problem.
There are several problems with archived rows. The inflexibility of Illustra data models means that at some point you will probably need to dump your data out of your tables, drop them, and rebuild them. There is no easy way to preserve the archived rows during this procedure, so you lose your history. Some bugs in the Illustra storage manager and backup/restore software can be ameliorated by purging archived rows. Of course, here again you lose your history.
My advice: If you need to keep a history, do it in a "status" column like you would in Oracle or Sybase. Your queries will be slightly uglified but at least you'll have your data when you need it.
IUS Note: Archiving was an Illustra feature that Informix chose not to incorporate in IUS, wisely so in my opinion. So you've no choice but to keep your own status column.
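Here is a minimal sketch of the status-column pattern, in Python with SQLite purely for illustration (the table, columns, and data are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per version of each ad; status marks which version is current.
conn.execute("""CREATE TABLE classified_ads (
    ad_id INTEGER, version INTEGER, body TEXT, status TEXT)""")

conn.execute("INSERT INTO classified_ads VALUES (1, 1, 'Nikon F3, $500', 'live')")

# An "update" retires the old row and inserts a new live one,
# so the history survives dumps, rebuilds, and everything else.
conn.execute("""UPDATE classified_ads SET status = 'archived'
                WHERE ad_id = 1 AND status = 'live'""")
conn.execute("INSERT INTO classified_ads VALUES (1, 2, 'Nikon F3, $450', 'live')")

current = conn.execute(
    "SELECT body FROM classified_ads WHERE ad_id = 1 AND status = 'live'").fetchone()
history = conn.execute(
    "SELECT count(*) FROM classified_ads WHERE ad_id = 1").fetchone()
print(current, history)
```

Every query for current data must now say status = 'live', which is the "slightly uglified" part, but the history is plain rows in a plain table that any tool can read.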
insert into products values ( next_product_id(), "AL'S HAND CREAM", ... );
The Unix box started to thrash. Web services ground to a halt. Since my products table was indexed by the product_id with a primary key constraint, Illustra had to check whether the new ID was already present in a row. In real programming systems, the semantics of functions are well-defined. For example, in Lisp, the args are evaluated and then passed down. So next_product_id would have been evaluated once and the result compared to the existing rows. Instead, Illustra sequenced through the rows, calling my next_product_id over and over again. Each call tried to insert into the sequence numbers table. For some reason, this was just too much for Illustra 2.4 and it caused a complete failure of the database. Inserts into an audit table specified by an alerter weren't done, and internal errors showed up in the server log.
My advice: program anything with a side effect in C, Tcl, perl, or whatever else you're using to talk to Illustra.
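Concretely: evaluate the side-effecting function once in the host language and hand the database a plain value. A sketch in Python with SQLite, purely for illustration, where next_product_id is a made-up stand-in for the Illustra function:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, short_name TEXT)")
conn.execute("CREATE TABLE sequence_numbers (last_id INTEGER)")
conn.execute("INSERT INTO sequence_numbers VALUES (0)")

calls = {"n": 0}

def next_product_id():
    """Side-effecting ID generator (stand-in for the Illustra function)."""
    calls["n"] += 1
    conn.execute("UPDATE sequence_numbers SET last_id = last_id + 1")
    return conn.execute("SELECT last_id FROM sequence_numbers").fetchone()[0]

# Evaluate ONCE in the client, then pass the result as a bind variable;
# the database never gets a chance to re-run the function per row.
new_id = next_product_id()
conn.execute("INSERT INTO products VALUES (?, ?)", (new_id, "AL'S HAND CREAM"))

print(new_id, calls["n"])  # 1 1
```

The counter proves the function ran exactly once, no matter how the database chooses to execute the INSERT.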
[Note: PL/SQL has the elegance of a Soviet locomotive, but this Ada derivative is quite solid in my experience. Not only that, Oracle is incredibly good about caching the results of PL/SQL function calls. So you can put a complex query into PL/SQL and it will be very slow the first time it executes, but will return immediately for subsequent calls (until you update one of the tables on which it depends).]
IUS Note: Illustra functions have been killed (mercifully). You can rewrite them in "SPL" which is the Informix equivalent of PL/SQL. This is a tremendous improvement and greatly reduces the need to write C functions.
Now you never have to back up to protect against media failure. However, you may still want snapshots of your database in case someone gets a little excited with a DELETE FROM statement. Or in case your facility is torched by a distraught user. The way the Oracle studs at Boston Children's Hospital do this with their 60 GB database is to break the mirror, back up from the disks that are off-line as far as the database is concerned, and then reestablish the mirror. For both performance and recovery reasons, they presumably keep their redo log on a separate disk from the rest of the database so they probably don't even risk losing the transactions that occur when the mirror is broken.
Note: this is probably good practice with any RDBMS. I've decided that one really needs at least five disks for a db-backed Web server: one for the root disk/swap; one for writing the Web server Unix file system log; one for reading static image files; one for the database rollback segment; one for the database per se. Then five more to mirror them. If this sounds excessive, note that the Oracle DBA Handbook recommends a 7x2 disk configuration as a minimum compromise for a machine doing nothing but database service (their solutions start at 9x2 disks and go up to 22x2).
You cannot in general restore a dump from one Illustra version into another. You'd certainly expect them to have engineered in upward compatibility (so that a 3.2 dump would restore into 3.3, if not vice versa). In practice, they put you through a tortuous process in which you have to have two Illustra instances up and running simultaneously (e.g., one 3.2 and one 3.3) and transfer the data with a magic script of theirs.
It is every bit as painful as it sounds. An MIT undergraduate presided over this process once. He did not follow the instructions to the letter and trashed everything. To Illustra's everlasting credit, they TELNETed in and spent two days fixing the installation (including doing the upgrade from 3.2 to 3.3).
If you don't have that many tables, it is probably best to micopy them all out and load them into the new clean version.