May 20 10

SBS 2008 restarts unexpectedly when backup starts

by Nicholas Piasecki

Today, our SBS 2008 server restarted itself at 5:00 p.m. sharp. I mean on the dot.

That was disturbing enough in itself. When the system came back up, it helpfully asked me to type in “Why did the system shut down unexpectedly?” and I enthusiastically typed in “Fuck if I know you jackass.” Then, I headed straight for the event log.

The event log was full of terrifying messages such as

The system failed to flush data to the transaction log. Corruption may occur.

or

An error was detected on device \Device\Harddisk1\DR2  during a paging operation.

Hmm. There was no blue screen, no bug check, no minidump. It was as if the power had been cut.

I looked accusingly at the UPS since I have had problems with bad UPSs interrupting the power supply in the past. I held down its self-test button, it made that satisfying buzzing noise, and … everything stayed up.

But while crouched down next to the UPS, I heard an odd swishing noise, like a tiny man was running his finger across a sheet of Saran Wrap. Then I noticed that the external Western Digital hard drive that we use for SBS 2008 backup was doing its swooshing-lights mode, not its solid-lights mode, and I knew from previous experience that it only did that when it was starting up or shutting down.

I had a hunch–in SBS 2008, backup uses Volume Shadow Copy, and I had seen similar disk errors when another of our external hard drives cooked itself (though instead of rebooting, that server became unresponsive). I unplugged the external drive and the event log messages stopped.

I then promptly threw the external hard drive into the trash, drove straight to Best Buy and bought a new external hard drive with the company credit card. (Aside: Why do 90% of external hard drives come with craptastic backup software or “one-touch” buttons? I just want a drive in a box. I finally found one in the “Seagate Expansion” line.)

Then I plugged in the new external hard drive and re-ran the “Configure server backup” wizard from the SBS 2008 console. I unchecked the old, now non-existent drive, checked the new one, and off it went. And all seems happy now. (I ran chkdsk for good measure on the system and data drives and they checked out OK, so it does all seem related to the external backup drive cooking itself.)

Should SBS 2008 be capable of handling faulty backup hardware more gracefully? Sure. And I wish that it had the option to use the old ntbackup utility because then at least you could back up to network-attached storage. It’s been my experience that external hard drives really are not that reliable and have an average lifespan of only about two years, but maybe I have just been glaring at them the wrong way.

May 5 10

I Hate Software

by Nicholas Piasecki

Today, I realized that I hate software.

First, it was FedEx

It has been 45 days since I reported to FedEx that the new SmartPost integration offered by their Web services simply does not work when an EPL2 label is requested:

I Hate Software, Part 1

The problem is that the USPS Delivery Confirmation barcode does not print when an EPL2 label is requested. That’s because in the EPL2 document that FedEx sends back, quotation marks in the barcode command are not properly escaped:

A90,473,0,3,1,2,N,"ZIP - USPS DELIVERY CONFIRMATION e-VS"
B50,535,0,1,3,4,175,N,""F1"42023221"F1"0000000000000000000000"

That should read "\"F1\"420..." for it to print correctly. (The ZPLII version of the label works fine.)
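
To make the missing backslashes concrete, here is a minimal sketch of the escaping that the response needs. The Epl2Escaping class and its method names are hypothetical, and the sketch assumes the EPL2 rule that a backslash inside a quoted data field marks the next character as a literal, so both the double quote and the backslash itself get a backslash in front of them.

```csharp
using System.Text;

// Hypothetical helper, not part of any FedEx or Zebra library.
public static class Epl2Escaping
{
    // Escapes a value so that it can sit inside the quoted data field
    // of an EPL2 A (text) or B (barcode) command.
    public static string EscapeData(string value)
    {
        var builder = new StringBuilder(value.Length + 8);

        foreach (char c in value)
        {
            // Assumption: only the backslash and the double quote
            // need to be prefixed with a backslash.
            if (c == '\\' || c == '"')
            {
                builder.Append('\\');
            }

            builder.Append(c);
        }

        return builder.ToString();
    }

    // Wraps the escaped value in the surrounding quotation marks
    // that end an A or B command line.
    public static string QuoteData(string value)
    {
        return "\"" + EscapeData(value) + "\"";
    }
}
```

Passing the barcode data from the label above through QuoteData() would put a backslash in front of each of the four inner quotation marks, which is the form that prints.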

I understand that larger corporations have fixed software release cycles and different bug triage mechanisms, but having to go through three levels of support and waiting 45+ days to simply say “this simply doesn’t work please add a backslash” is somewhat frustrating.

Then, there was UPS

Similarly, I’ve discovered a recently-introduced error with the UPS Web services. If you request a 4″ x 6″ EPL shipping label, then the UPS Web service will happily ignore the request and send back a 4″ x 8″ label instead. That’s because they’re sending back the wrong width and height settings in the label response:

q795
Q1600,24

That would be 795 dots / 203 dpi == 3.91″ wide (OK) and 1600 dots / 203 dpi == 7.88″ high (hmm, not what I asked for). (The ZPLII version of the label works fine.)
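
The same arithmetic can be turned into a quick sanity check on a label response. This is only a sketch: the Epl2LabelSize class is hypothetical, it assumes a 203 dpi print head as in the calculation above, and it reads nothing more than the first number after the q (label width) and Q (form length) commands. For reference, a true 6″ label at 203 dpi would need a Q value of about 6 x 203 = 1218 dots.

```csharp
using System.Globalization;

// Hypothetical helper for checking the size an EPL2 document asks for.
public static class Epl2LabelSize
{
    // Assumption: a 203 dpi print head.
    public const double DotsPerInch = 203.0;

    public static double DotsToInches(int dots)
    {
        return dots / DotsPerInch;
    }

    // Scans the EPL2 text for the q and Q commands and reports the
    // requested width and height in inches. Returns false if either
    // command is missing.
    public static bool TryGetSizeInInches(string epl, out double width, out double height)
    {
        width = 0;
        height = 0;
        bool haveWidth = false;
        bool haveHeight = false;

        foreach (string rawLine in epl.Split('\n'))
        {
            string line = rawLine.Trim();
            if (line.Length < 2 || (line[0] != 'q' && line[0] != 'Q'))
            {
                continue;
            }

            // Only the first comma-separated value matters here;
            // for Q, the second value is the gap between labels.
            string firstArgument = line.Substring(1).Split(',')[0];
            int dots;
            if (!int.TryParse(firstArgument, NumberStyles.None, CultureInfo.InvariantCulture, out dots))
            {
                continue;
            }

            if (line[0] == 'q')
            {
                width = DotsToInches(dots);
                haveWidth = true;
            }
            else
            {
                height = DotsToInches(dots);
                haveHeight = true;
            }
        }

        return haveWidth && haveHeight;
    }
}
```

An integration test could call TryGetSizeInInches() on the returned label and fail when the height is not within a small tolerance of the size that was requested.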

Finally, SmartFTP sent me over the edge

Since updating to the latest version of SmartFTP, I found myself frequently being unable to connect to the Skiviez private FTP server that is used to manage software updates, e-mails, and product catalog images. It would fail about 80% of the time with

[13:51:38] 234 Using authentication type TLS
[13:51:38] SSL: Error (Error=0x80090308).

I’m sure that this means something. And even if I knew what it was, sometimes it would just work without me changing anything (about 20 percent of the time). Further still, I hadn’t touched the FTP server for some time.

That’s when I noticed in the SmartFTP change log:

FTP: Completely rewrote SSL layer

Sigh. Downgraded to the version prior to that changelog entry, and it works fine.

Conclusions and delusions

I have certainly done my part to contribute buggy, crappy software to the world. I continue to spew out more buggy, crappy software with each passing day. But it is extra depressing and disheartening to know that I, some idiot working at a small company, can run across such simply-does-not-work bugs (and, in FedEx’s case, never-actually-worked-ever bugs) in software produced by large corporations and used by presumably hundreds to thousands of people around the world.

Apr 15 10

Quick Tip: Sharing a FedEx ZP 500 printer attached to a Windows XP computer to a Windows Vista/7 machine

by Nicholas Piasecki

At Skiviez/WFS, we have a FedEx ZP 500 ZPL printer on the shipping desk. This is what FedEx is migrating everyone to now that the tried-and-true Zebra/Eltron LP2844* series is getting a little long in the tooth. (Along with a gradual migration to ZPL over EPL2, but that’s a rant for another day.)

FedEx ZP 500

The FedEx ZP 500 is a bit of a white elephant in that Zebra doesn’t mention it on their Web site; it’s some sort of special contract job with FedEx to produce and jointly brand these devices. It’s probably just a re-branded version of the Zebra GK420d, but in reality we have the printer manufacturer pretending that they don’t make the printer (e.g., “call FedEx for support”) and a shipping company who has no idea how to support the printer (e.g., “call Zebra for the printer driver”). But I’m getting distracted.

The real issue was that the shipping desk is running Windows XP and shares the printer via the native Windows printer sharing mechanism so that it’s listed in the Active Directory. I do this so that I can run integration tests from the workstation in my office and test the label generation functions of our software without needing to have a thermal label printer hooked up to my workstation solely for this purpose. These printers aren’t cheap, you know.

New operating system, new drivers required

I recently upgraded my workstation to Windows 7. When I tried to add the shared FedEx thermal printer, I was greeted with error code 0x00000007a along with an error message that generally amounted to “something didn’t work.” I suspected a driver problem since Vista is when Microsoft locked down on the mandate that printer drivers run in user mode, not kernel mode–which is a good thing in terms of system stability, since a poorly-written printer driver can no longer trigger a BSOD and a reboot, but a bad thing in terms of backwards compatibility.

The problem is that

  • the Windows XP machine is offering the Windows XP drivers to my Windows 7 install;
  • the Windows 7 printer wizard doesn’t give me a chance to supply my own printer drivers, and instead happily installs the XP ones, which don’t work;
  • the FedEx-supplied Vista drivers are mutually exclusive in terms of compatibility with the XP drivers, so I can’t install them on the XP machine via the Server Properties thingie; and
  • even if I could do that, I am hesitant to dick around with the printer drivers on a critical machine.

Adding the printer

The solution was to add a printer in a different, counter-intuitive way. Here’s what I did:

  1. From the Windows 7 Control Panel, I went to “View devices and printers” and then “Add a printer”.
  2. When asked “What type of printer do you want to install?” I chose Add a local printer, even though I know full damn well that I’m not actually adding a local printer.
  3. For “Choose a printer port,” I chose “Create a new port” with “Type of port” set to “Local Port”.
  4. In the “Enter a port name” dialog, I entered the UNC share name for the printer, which looks like \\{MACHINE-NAME}\{PRINTER-SHARE-NAME}. In my case, it was \\ASHWHWS003\FedEx ZP 500 Plus.
  5. When asked for a driver, I chose “Have Disk” and navigated to the *.inf file in the ZD directory of the Zebra Designer drivers available from the FedEx Web site.

This allowed me to use locally-available printer drivers on a printer attached to another machine. Good luck!

Jan 19 10

Buying Software Sucks

by Nicholas Piasecki

What follows is a rant about the state of marketing in the software industry.

We happen to like our system

So we’ve branched out and offered fulfillment services to other merchants, as I’ve mentioned before. Essentially, other e-commerce stores or merchants can store their items in our warehouse and then transmit orders as “fulfillment requests” to us. We either ship the request or mark it as invalid with a little message (“item has insufficient stock, delivery point validation failed, tax identification number required, etc.”). If they want to re-attempt, then they re-submit an entirely new fulfillment request.

This simple model has worked surprisingly well: the system doesn’t track “orders” (since each merchant handles backordering and cancellations differently), and while it does maintain an inventory log, it doesn’t track the cost of goods in inventory or anything like that, since that is not necessary for us to do our job. (Instead, it tracks a whole boatload of other information that people don’t normally think about, like HS codes or unit weight.)

Some of our merchants need some hand-holding

We have an API so merchants can integrate with our system. And so I’ve written a few plug-ins for Magento and osCommerce that auto-transmit their orders as fulfillment requests, sync outbound shipments back, and deduct inventory from their systems (since we are the authoritative inventory count). This works great for their retail businesses.

We have a merchant that we’ve been doing retail business with for some time who now wants to do wholesale stuff with us. He wants a sales order/invoicing/inventory management solution, and it needs to be able to track multiple inventories across multiple warehouses (our integration, if any, would only be adjusting inventory for a particular warehouse). He wants to enter sales orders remotely, press a button that shows him how much is ordered so he knows how much to make, have that manufactured and sent to us, click another button to transmit the sales order as a fulfillment request to us once it’s in stock, and then have us sync back with the shipment info, creating an invoice.

“Shouldn’t they have had these features prior to switching to you guys?” you ask. Well, yes. In this case, all of these features were provided in an all-in-one system provided by their previous fulfillment warehouse. They have since learned their lesson about keeping all of their data in the hands of a third party because when that relationship went south, so did their access to their own data.

We would rather not add these features to our system. Since all merchants have different ways of handling backorders, pre-orders, cancellations, and cost of goods sold (FIFO, LIFO, average, priority), we’ve been maintaining the position that–unlike our competitors–our system is essentially feature complete, since it’s ours and does what we need it to do to ship things out. The features that I’ve mentioned should be things that the merchants are keeping track of themselves–since that’s their business–and integrate with our system via the programming API. While an argument could be made that our system would be abso-freaking-fantastic for merchants who need an all-in-one software and data solution (yes, it certainly would), the reality is that our competitors have an outsourced team of software developers. We, on the other hand, are a small business that the “new economy” has pushed into an area tangential to our core business, with a software development team of just one person (me), and we can’t even begin to dream of hiring any more until we start seeing some serious cash flow.

In any case, to land this deal, we need to find a system for this merchant, and fast, because there are some important trade shows coming up.

Welcome to marketing Hell

Now for the rant, because you would think that these requirements are not exotic:

  • Let salespeople enter sales orders remotely.
  • Keep track of inventory in multiple locations, and track the cost of inventory.
  • Provide integration hooks so the user can send orders to the warehouse and so the warehouse can send shipment data back.
  • Keep track of sales-order-to-invoice conversions and payments.
  • Provide reporting features.

QuickBooks 2010: Same as last year! Now with more shininess!

You might take a look at integrating with QuickBooks, but once you’ve penetrated the marketing speak you’d realize that the software in 2010 is essentially no different in terms of fundamental feature set than it was in 2006, and that it doesn’t support inventory in multiple locations and doesn’t scale well. In fact, QuickBooks performance once you start approaching 10,000 SKUs is so bad that they sell an “enterprise” version that essentially–aside from some fine-grained access permissions–has no added features other than the feature of not crashing when dealing with large lists of information.

We could pay another couple of thousand dollars for Fishbowl Inventory, which would add multiple location support to QuickBooks, but then we’d have created a Rube Goldberg machine straight out of the gate, with me synchronizing with Fishbowl which is then synchronizing with QuickBooks. I’m sure nothing would go wrong there. That would be insane; we might as well just stick a few fax machines into the sync process and call it an insurance company.

A gap in the market

The reality is that there is a huge gap in the marketplace between merchants who are moving $200k or less per year–just use commercial off-the-shelf (COTS) QuickBooks, you can do most things manually and use your e-commerce system’s native order management functions–and merchants who are moving $5m or more–just use SAP or some other enterprisey software. If you’re in between, like we are and like the merchant that I’m researching for is, the options available to you are not pretty.

I’m not sure why this is. All I can think of is that perhaps companies historically did not spend much time in this space–they either stayed small or had venture capital to acquire the big boy systems and grow quickly. People aren’t exactly lending money anymore, so I suspect that this is a segment that is only going to grow.

If you try to look for COTS software in this segment, you’ll never find the feature matrix that you need:

  • inFlow Inventory is pretty, but offers no integration features, as if an entire business could be run out of one app, and doesn’t offer Web access.
  • WorkingPoint is Web-based but doesn’t offer inventory tracking in multiple locations or an integration API.
  • QuickBooks has a Web-based extension that lets QuickBooks understand multiple inventories but costs thousands of dollars, assumes that the company owns its own warehouse (that is, needing picking/packing/shipping capabilities), and still does the same style of synchronization as Fishbowl does. You’d think Intuit would just add the @#$#@ feature to QuickBooks itself!

No COTS to sleep in

The market seems to have determined that people in this segment have outgrown COTS software and need some consulting help. So any Web sites that advertise products will be full of pages and pages of impenetrable marketing bullshit that use obnoxious acronyms like ERP, CRM, MRP, and WMS, promise the moon, and coyly make no reference to pricing or contract requirements so you can’t even tell if you’re dealing with the right league of product, when the reality is at the end of the day you could look at two or three screenshots and the SDK’s API and immediately tell if the product fit your needs.

Instead, I notice a disturbing trend of “pretty Web site, crap product,” such as Sage’s Simply Accounting, which certainly appears to have an impressive array of features but in reality doesn’t even know the difference between a sales order and an invoice. You can try going to Microsoft’s Dynamics site, but good luck figuring out what the differences between Dynamics AX, Dynamics CRM, Dynamics NAV, and Dynamics GP are: you’ll be told to contact your “Microsoft Dynamics solutions representative” for help. At that point, you’re thinking “Microsoft solutions representative? Who said I committed to Microsoft?! I’m just trying to figure out what in the blue hell your product even does.”

If you do find a vendor that maybe sorta-kinda-hard-to-tell meets your needs, then you can expect days of scheduling WebEx teleconferences and meetings and run-around with your “account rep” so that they can determine how much you’re worth and willing to pay so that they can charge you a completely different amount than what they charged Bob next door for the same services and bits. Trying to extract “$X/user” and “the login starts working on MM/DD/YYYY” and “the developer gets a demo account so you can know if this is feasible” answers from these people seems to require a hammer in one hand and their genitals in the other. We both know that to add a new account, they’re pressing a button that says “they really bought into that ‘enterprise’ crap” and poof! a new account is created. Let’s quit pretending that the world’s carbon footprint has increased ten-fold by us merely asking to be on the platform.

Trying to extract technical capabilities from these salespeople is nigh-impossible, too. I think part of the problem is that they seem to actually believe that the features that they are promising really exist, when in reality I just need them to show me what the data dictionary looks like and how the session needs to be handled and then I can tell for myself whether or not my scenario is actually supported. Instead? I’m waiting on a “discovery session” teleconference with an “engineer” tomorrow.

Conclusions and Delusions

It has to be easier than this. No wonder there are so many not-invented-here software solutions in the world today–custom crap that barely works at home may yet indeed be better than generic crap that you have to waste hundreds of dollars on in productivity time and research before you even get it in your hands and realize that it is also crap, just with a maintenance contract.

If it takes a consultant to help people decide what software to buy, or which of your products is right for them, or whether or not your product even applies to their problem domain, then your marketing simply does not work.

Jan 4 10

Integration Testing with SQL Server Express 2008, NHibernate, and MSTEST

by Nicholas Piasecki

When dealing with NHibernate, I find that it’s important to write tests to make sure that I’ve created my mapping files correctly–that is, that properties actually are mapped correctly, that I remembered to specify an IUserType where appropriate, that I remembered to make my properties virtual, that I remembered to add a parameterless constructor, that I set the right cascade option on an association, and so on.

It would be nice if I could run an automated test that would “exercise” my NHibernate mapping file against an actual database to see if it all worked. It’s not really a unit test–I’m not testing NHibernate itself, I’m testing my configuration of NHibernate–but it’s still appropriate as an integration test.

Ayende Rahien has a spiffy implementation of an in-memory database test that uses NHibernate’s SchemaExport to build up an empty, in-memory SQLite database at the start of each NUnit test. His implementation serves as the inspiration for my example.

In my case, I’m working on a little moonlighting project that uses some database-specific features of SQL Server Express 2008, in particular the GEOGRAPHY type and SPATIAL indexes. SQLite wouldn’t help me here–unless I just didn’t want to use it for those particular tests–but I figure that if I’m bothering to write an integration test, then it should actually test the integration against the database that I’m actually using. Additionally, unlike Ayende’s example, I’m using MSTest instead of NUnit because it’s built into Visual Studio and I am a masochist.

Understanding SQL Server Express User Instances

SQL Server Express has the concept of a “user instance” that allows a non-administrator user to attach and detach an *.mdf file to the local SQL Server Express instance. I thought this was what I needed. But there are some things to understand about user instances that confused me and probably confuse other people as well:

  • They’re not the same thing as an in-memory database or even a “file-based” database like SQLite. It’s a regular SQL Server database file.
  • The file has to exist. Unlike SQLite, if you tell SQL Server Express to attach to an *.mdf that isn’t there, you’ll get an error instead of having the database created for you just in time.
  • To access the database, it has to be attached to a SQL Server instance. This means downloading SQL Server Express and installing the service on the machine. It’s not as if your application can just directly access the *.mdf file through some special DLL or anything like that. It really is still client/server.
  • Permissions can still be an issue. The SQL Server instance will need to be able to read the *.mdf file, so the NTFS permissions need to be set appropriately; in most cases, this means that NETWORK SERVICE will need to be able to read and write to it.

After spending a few hours learning such things the hard way, I realized that user instances were not necessary for me (VS already runs as administrator, can’t debug in IIS without it). Instead I’ll just use SQL Server Management Objects (SMO) to programmatically set up the database at the start of each test. (More on this later.)

Understanding MSTest Execution Order

On my first attempt, I put my database and NHibernate set-up and tear-down code in methods decorated with the MSTest [TestInitialize] and [TestCleanup] attributes. And this does indeed work well when a single [TestMethod] is executed in isolation. But if I executed more than one test in a single test run, all hell would break loose because the [TestInitialize] for a test that is scheduled to execute might get executed before the [TestCleanup] from a test that has already run.

This seemed like astonishing behavior to me and seems to be just a quirk of Microsoft’s design of MSTest. In NUnit and pretty much any other xUnit framework I’ve tried, the initialize/test/cleanup triumvirate is guaranteed to execute “atomically” before any part of any other test triumvirate is executed. In other words, if you’re running a test run with two tests A and B, then you might see this behavior in the two testing frameworks:

  • NUnit: Initialize A > Test A > Cleanup A > Initialize B > Test B > Cleanup B
  • MSTest: Initialize A > Test A > Initialize B > Cleanup A > Test B > Cleanup B

That’s because MSTest executes tests in multiple threads, and while for a particular test method you’re guaranteed that its [TestInitialize] and [TestCleanup] methods will be called before and after the test itself executes, respectively, there is no guarantee about its relationship with the other simultaneously executing test methods.

This causes a problem when the test method is assuming exclusive access to a shared system resource, like a test SQL Server Express database that has just been set up for that test method’s exclusive use. My brute force workaround is to use Monitor.Enter() and Monitor.Exit() in the [TestInitialize] and [TestCleanup] methods to force MSTest to execute them in the correct order:

private static readonly object lockObject = new object();
 
[TestInitialize]
public void TestInitialize()
{
	Monitor.Enter(lockObject);
 
	try
	{
		// TODO: Delete any pre-existing SQL Server Express Instance
		// TODO: Set up the SQL Server Express Instance
		// TODO: Set up NHibernate
	}
	catch
	{
		// If something went horribly wrong, release the lock
		// or else MSTest will never finish the test run!
		Monitor.Exit(lockObject);
 
		throw;
	}
}
 
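[TestCleanup]
public void TestCleanup()
{
	try
	{
		// TODO: Delete the SQL Server Express instance
	}
	finally
	{
		// Always release the lock, even if the clean-up itself throws,
		// or else the next test will wait on it forever
		Monitor.Exit(lockObject);
	}
}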
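
Outside of MSTest, the effect of the lock can be sketched with two plain threads standing in for two concurrently scheduled tests. SerializedFixtureDemo and its members are made up for illustration. Unlike the fixture above, each thread here enters and exits the monitor inside a single method. That matters because a monitor has to be released by the thread that acquired it, so the fixture above relies on MSTest running a test's initialize and cleanup methods on the same thread.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for two tests that share one resource.
public static class SerializedFixtureDemo
{
    private static readonly object lockObject = new object();
    private static readonly List<string> log = new List<string>();

    private static void RunTest(string name)
    {
        // Plays the role of Monitor.Enter() in [TestInitialize]
        Monitor.Enter(lockObject);

        try
        {
            log.Add("Initialize " + name);

            // Give the other thread every chance to interleave
            Thread.Sleep(10);

            log.Add("Test " + name);
            log.Add("Cleanup " + name);
        }
        finally
        {
            // Plays the role of Monitor.Exit() in [TestCleanup]
            Monitor.Exit(lockObject);
        }
    }

    // Runs tests "A" and "B" on separate threads and returns the
    // order in which their steps were recorded.
    public static List<string> Run()
    {
        log.Clear();

        var a = new Thread(() => RunTest("A"));
        var b = new Thread(() => RunTest("B"));

        a.Start();
        b.Start();
        a.Join();
        b.Join();

        return new List<string>(log);
    }
}
```

Whichever thread wins the race, the recorded steps come out as one complete initialize/test/cleanup triple followed by the other, never interleaved.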

Using SMO to Bind It All Together

All that was left was to fill in those TODOs.

Deleting an Existing Database

First up is deleting an existing database, which is pretty trivial with SMO (just add a reference to Microsoft.SqlServer.Smo.dll, which is available after you install SQL Server Express, and you’re set):

private static readonly string serverInstance = @"(local)\SQLEXPRESS";
 
private static readonly string databaseName = "EquineTest";
 
private string dataFilePath;
 
private string logFilePath;
 
private void DeleteDatabaseIfExists()
{
	var server = new Server(serverInstance);
 
	if (server.Databases.Contains(databaseName))
	{
		// Something might be caching an open connection, so tell everyone
		// to screw off by forcing their connections shut. If we don't do this,
		// the DetachDatabase() call could fail.
		var sql = string.Format(
			CultureInfo.InvariantCulture,
			"ALTER DATABASE {0} SET SINGLE_USER WITH ROLLBACK IMMEDIATE",
			databaseName);
		server.Databases[databaseName].ExecuteNonQuery(sql);
 
		server.DetachDatabase(databaseName, true);
 
		File.Delete(this.dataFilePath);
		File.Delete(this.logFilePath);
	}
}

Why check for and delete an existing database at the start of a test? In case a previous test failed. Shockingly, MSTest won’t call [TestCleanup] if a test method explodes with an exception. I know, I’m in total agreement with you, what were they thinking?

Figuring Out Where to Put the Files

Simple enough, but you’ll notice that my dataFilePath and logFilePath variables, which should be pointing to my *.mdf and *.ldf database files, respectively, aren’t initialized in my above example. Prior to calling this function, I make sure that my InitializePaths() method is called:

private string GetExecutingDirectory()
{
	var path = Assembly.GetExecutingAssembly().GetName().CodeBase;
	return Path.GetDirectoryName(new Uri(path).LocalPath);
}
 
private void InitializePaths()
{
	var directoryPath = this.GetExecutingDirectory();
 
	this.dataFilePath = Path.Combine(directoryPath, databaseName + ".mdf");
	this.logFilePath = Path.Combine(directoryPath, databaseName + "_log.ldf");
}

This is a convoluted way of determining the directory that MSTest created for the current test run. (MSTest sets the working directory to some nonsense in Program Files, so no help there. And [DeploymentItem()] is an absolute nightmare!)

Creating the Test Database

The code for creating the database via SMO is similarly straightforward:

private void CreateDatabase()
{
	var server = new Server(serverInstance);
	var database = new Database(server, databaseName);
 
	var fileGroup = new FileGroup(database, "PRIMARY");
	database.FileGroups.Add(fileGroup);
 
	var dataFile = new DataFile(fileGroup, databaseName, this.dataFilePath);
	dataFile.Growth = 10;
	dataFile.GrowthType = FileGrowthType.Percent;
	fileGroup.Files.Add(dataFile);
 
	var logFile = new LogFile(database, databaseName + "_log", this.logFilePath);
	logFile.Growth = 10;
	logFile.GrowthType = FileGrowthType.Percent;
	database.LogFiles.Add(logFile);
 
	database.Create();
}

The only quirk is that I explicitly set the Growth and GrowthType because on some machines it was defaulting to None, which under some circumstances caused the integration tests to fail when lots of data was inserted. It seems to be a per-server default setting, but I haven’t bothered to figure out why the default behavior is different among the multiple machines that I own; setting it here guarantees the tests will work consistently.

Initializing the Database and Configuring NHibernate

The last piece of the puzzle is to configure NHibernate and set up the freshly-minted database’s schema. I use NHibernate’s SchemaExport tool to create the schema for me based on my mapping files. The only caveat is that when using SQL SCHEMAs, the SchemaExport tool won’t create those for you, so they have to be handled explicitly:

[TestInitialize]
public void TestInitialize()
{
	Monitor.Enter(lockObject);
 
	try
	{
		this.InitializePaths();
		this.DeleteDatabaseIfExists();
		this.CreateDatabase();
 
		// The configuration is a static variable, I don't care if the
		// configuration is shared among test methods
		if (configuration == null)
		{
			// Disabling connection pooling is important
			var connectionString = string.Format(
				CultureInfo.InvariantCulture,
				@"Data Source={0}; Integrated Security=true; Initial Catalog={1}; Connection Timeout=120; Pooling=false;",
				serverInstance,
				databaseName);
 
			configuration = new Configuration()
				.SetProperty(Environment.ReleaseConnections, "on_close")
				.SetProperty(Environment.Dialect, typeof(MsSql2008Dialect).AssemblyQualifiedName)
				.SetProperty(Environment.ConnectionDriver, typeof(SqlClientDriver).AssemblyQualifiedName)
				.SetProperty(Environment.ConnectionString, connectionString)
				.SetProperty(Environment.ProxyFactoryFactoryClass, typeof(ProxyFactoryFactory).AssemblyQualifiedName)
				.AddAssembly("EquineBusinessBureau.Model");
 
			sessionFactory = configuration.BuildSessionFactory();
		}
 
		// We definitely need a new session for each test method, though		
		this.session = sessionFactory.OpenSession();
 
		// SchemaExport won't create the database schemas for us
		foreach (string schema in schemas)
		{
			var sql = string.Format(CultureInfo.InvariantCulture, "CREATE SCHEMA {0}", schema);
			this.session.CreateSQLQuery(sql).ExecuteUpdate();
		}
 
		new SchemaExport(configuration).Execute(
			true,
			true,
			false,
			session.Connection,
			Console.Out);
	}
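	catch
	{
		// If anything above failed, release the lock before rethrowing
		// so that the remaining tests are not blocked forever
		Monitor.Exit(lockObject);
 
		throw;
	}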
}

Conclusions and Delusions

Wrap all of this up in an abstract class called IntegrationTestBase, and you’ve got an easy way to write integration tests in a style that’s similar to what Ayende originally posted:

[TestClass]
public class RoleTest : IntegrationTestBase
{
    [TestMethod]
    public void CanSaveAndLoadRole()
    {
        object identity;
 
        using (var tx = this.session.BeginTransaction())
        {
            identity = session.Save(new Role(EquineRoles.Administrator, "Administrator"));
 
            tx.Commit();
        }
 
        session.Clear();
 
        using (var tx = this.session.BeginTransaction())
        {
            var role = session.Get<Role>(identity);
 
            Assert.AreEqual<string>(EquineRoles.Administrator, role.RoleEnum);
            Assert.AreEqual<string>("Administrator", role.RoleName);
 
            tx.Commit();
        }
    }
}

The first test in a test run takes a while to run, on the order of 6 – 8 seconds. Painful, but this is also an integration test, so it’s not like I’m running them every five minutes, and it’s faster than spinning up the Web site and seeing if it crashed. Subsequent tests, since the NHibernate configuration is already initialized in the static variable, execute much more quickly, about one second per test.

Hope this saves someone a few hours of putting the puzzle pieces together.

Download the IntegrationTestBase example source file.

Dec 22 09

Quick Tip: BinaryWriter’s Write(string) overload prepends the binary data with the length of the string

by Nicholas Piasecki

If you’re like me and you’re using the .NET Framework’s BinaryWriter class to write bytes to a stream, you might call the overload that takes a string and expect it to write just the characters of that string, encoded as bytes according to the encoding set on the BinaryWriter instance.

In other words, I was expecting the BinaryWriter’s Write(string) implementation to look something like this pseudocode:

public void Write(string text)
{
     this.buffer.Append(this.encoding.GetBytes(text));
}

After all, when you call Write() with an int, you get four bytes written. If I call Write() with the string “TEST” and the BinaryWriter is using the ASCII encoding, I’d expect four bytes to be written. But instead it writes five.

That’s because the BinaryWriter first writes the length of the encoded string as a prefix (a 7-bit encoded integer, which fits in a single byte for short strings like this one) and then writes the bytes that GetBytes() returns.

Now that I think about it, this makes perfect sense: strings in the .NET framework are typically not thought of as being null-terminated, they’ve got a length, and in order for the BinaryReader’s ReadString() method to work, it needs to know the length of the string to determine how many bytes to read.
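If you want to see the extra byte for yourself, here is a small sketch that captures what each overload writes to a MemoryStream. The class and method names are made up for illustration:

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixDemo
{
    // Bytes produced by BinaryWriter.Write(string): length prefix + data.
    public static byte[] WriteAsString(string text)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            bw.Write(text);
            bw.Flush();
            return ms.ToArray();
        }
    }

    // Bytes produced by encoding the string ourselves: data only.
    public static byte[] WriteAsBytes(string text)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            bw.Write(Encoding.ASCII.GetBytes(text));
            bw.Flush();
            return ms.ToArray();
        }
    }
}
```

Passing the result of `LengthPrefixDemo.WriteAsString("TEST")` to `BitConverter.ToString` gives `04-54-45-53-54`, while `WriteAsBytes("TEST")` gives `54-45-53-54`.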

In my case, I was writing data to an Epson TM-T88III receipt printer, and given the structure of the commands that the printer expects, it doesn’t need or want the length of the string in this way. Because I didn’t read the MSDN documentation closely, I was left scratching my head as to why weird characters were showing up or characters were being omitted in my output.

The workaround is to just call GetBytes() myself and pass the byte buffer to the BinaryWriter’s Write(byte[]) overload:

// "bw" is an instance of a BinaryWriter
// "barcode" is a string
 
bw.Write(AsciiControlChars.GroupSeparator);
bw.Write('k');
bw.Write((byte)4);
bw.Write('*');
bw.Write(Encoding.ASCII.GetBytes(barcode)); // NOT bw.Write(barcode);
bw.Write('*');
bw.Write(AsciiControlChars.Null);

Hope this saves someone from a forehead-slapping moment.

Dec 15 09

Dear FedEx

by Nicholas Piasecki

Puerto Rico is not a country.

Dec 9 09

Sending a bit image to an Epson TM-T88III receipt printer using C# and ESC/POS

by Nicholas Piasecki

At Skiviez/WFS, our “low-volume fulfillment requests” (which usually simply means “retail orders”) get a simple receipt printed out via an Epson TM-T88III receipt printer.

TM-T88III Receipt Printer

You’ve probably seen the TM-T88III before, though the current model is the TM-T88IV–they’re ubiquitous and especially common in restaurants. We chose it because the receipt prints quickly (which reduces the probability that a stack of order receipts gets missorted such that an employee puts the wrong receipt in the wrong parcel), is thermal so there is no ink to replace (which reduces the cost of consumables), uses standard thermal receipt paper (which is cheap, and we can always run out to Staples if we run out), and, perhaps most importantly, we found one in a box in the back of the warehouse.

At the top of the retail receipt, I print out the Skiviez logo. This is stored in a non-volatile area of memory in the printer (called NV RAM in EPSON parlance), and I dole out a command that essentially says “print out the image stored in slot #1,” having previously uploaded the bitmap via the printer driver’s flash utility. This worked great for as long as we only shipped Skiviez orders and only had one TM-T88III printer. With the advent of WFS, however, Skiviez now ships orders (we call them “fulfillment requests”) for multiple e-commerce stores, each requiring a different logo to appear at the top of the receipt. So I now had two options:

  • I could manually upload the logos for all of the merchants that we ship for into all of the receipt printers that we use, making sure to upload each merchant image into the same NV memory slot for each printer.
  • I could store the merchant’s logo in our database, figure out how to use the “select bit image mode” receipt printer command, and just send the image data just-in-time whenever I generated a receipt.

The second option has obvious maintenance benefits, so that’s the path I started hacking my way down.

A brief primer on ESC/POS and data

Like the Zebra LP2844 and its brethren, the TM-T88III comes with a Windows printer driver that allows Windows and applications to see it as any other GDI-based printer. You can print a Word document to the TM-T88III and it will print, albeit slowly, and it’ll look like crap. That’s because, as with any other GDI printer, the Word document gets rendered as a bitmap and the entire big-ass bitmap gets sent to the poor little receipt printer which then prints it 24 dots at a time. A better method is to use the printer for its intended purpose and use the native printer language to generate the receipt. This allows us to select printer fonts that are specifically designed for the printer’s resolution and speed characteristics and use other advanced features of the printer such as the paper cutter.

That printer language is called ESC/POS, which is entirely unpronounceable and a stupid name. The “POS” stands for “Point of Sale” since this printer belongs to a class of devices commonly known as point of sale devices–cash drawers, VFDs, barcode scanners, and the like. The “ESC” stands for “escape” because the printer treats any data that is sent to it as text–passed straight through–unless it is escaped with a special character to indicate that a command instruction follows in the input stream. The most common escape character used is the ASCII ESC character, and so we have “ESC/POS” as our language name.

If you’ve just picked up one of these printers on eBay, you may find that getting documentation on this language is hard to come by because you need to get a copy of the “ESC/POS Application Programming Guide” from your “EPSON Authorized Dealer,” whoever the hell that is. (Again, we found ours in a box in the back of the warehouse, so my “authorized dealer” is long gone.) Still, some Googling will give you more or less complete and current copies of these guides, which I’ve mirrored here. Go ahead and download them:

(I particularly like how the programming guide has “proprietary” and “confidential” stamped all over it and then proceeds to describe how great EPSON is for “taking initiative” for “expandability” and “universal applicability” on page 7 of the same document. You can’t make this crap up–it’s full of oxymorons.)

Having just come from writing in the EPL2 printer language for our Zebra LP2844 and Zebra ZP 500 Plus thermal label printers, I found the ESC/POS language to be confusing. My confusion stemmed partly from the way the documentation was written and partly from being used to the way EPL2 worked. Let’s take a look at the documentation for the bit image command that I’m trying to use:

Documentation Snipped

If this were EPL2, I would actually send the document as a string to the printer, so if a command needed the integer 33 as a parameter, I would send the string “33” (two bytes of data). In ESC/POS, each parameter is a single byte, so if I want to send 33 as a parameter for m in the bit image command above, then I need to send 33 in a single byte: that is, 0b100001, which is 2^5 + 2^0 = 32 + 1 = 33.
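To make the difference concrete, here is a small sketch that produces both representations of the same parameter. The helper names are mine, not anything from EPSON or Zebra:

```csharp
using System;
using System.Globalization;
using System.Text;

public static class ParameterEncodingDemo
{
    // EPL2 style: the number travels as text, one byte per digit.
    public static byte[] AsText(int value)
    {
        return Encoding.ASCII.GetBytes(value.ToString(CultureInfo.InvariantCulture));
    }

    // ESC/POS style: the number travels as a single raw byte.
    public static byte[] AsRawByte(int value)
    {
        if (value < 0 || value > 255)
        {
            throw new ArgumentOutOfRangeException("value");
        }

        return new[] { (byte)value };
    }
}
```

`AsText(33)` yields the two bytes 0x33 0x33 (the ASCII digit '3' twice), while `AsRawByte(33)` yields the single byte 0x21, which happens to be the ASCII code for '!'.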

The reason that this is super confusing is that other parameters in the documentation are specified to be ASCII characters, such as the ‘*’ parameter in the bit image command above. This is because low-level programmers, such as those who designed the ESC/POS language, tend to blur the lines between data types: it’s all bytes at the end of the day. As a result, if you’re using C#, you might be tempted to use a StringBuilder to build up your document to send to the printer. Don’t do it! You’ll inevitably get confused by its overloads. Let’s take that m = 33 parameter as an example:

var sb = new StringBuilder();
 
// ASCII escape
sb.Append((char)27); 
 
// ASCII '*' for the bit image command
sb.Append('*'); 
 
// oops! this appends the string "33" in two bytes, doesn't work
sb.Append(33); 
 
// this is what I want, but semantically weird
sb.Append((char)33);

A much better solution, and a less confusing one in the long run, is to eschew any notion that we are dealing with text. Instead, let’s use the semantics that we are sending bytes of raw binary data directly to the printer:

using (var ms = new MemoryStream())
using (var bw = new BinaryWriter(ms))
{
     bw.Write(AsciiControlChars.Escape);
     bw.Write('*');         // bit-image mode
     bw.Write((byte)33);    // 24-dot double-density
}

But remember that cast to byte! Otherwise, you’ll get a C# int, which is four bytes, written to the stream, when we only wanted one. Yes, that means that any parameter that an ESC/POS command takes has a maximum value of 255.

You could get by with the StringBuilder method for a while without it burning you, because all the characters that you’re appending happen to be less than 128. Then you try the bit image command, where byte values above 127 are routine, and you start getting into values that don’t map cleanly to a Unicode character for quirky historical reasons (the range 128 to 159, for example), and you’ll be pulling your hair out as to why some of your data is getting “lost.” Just use the BinaryWriter method. You can thank me later.
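Here is one way the “lost” data can happen, as a sketch. It assumes the StringBuilder’s text is eventually converted to bytes with Encoding.ASCII, which is my assumption for illustration. Other encodings mangle the data differently:

```csharp
using System;
using System.IO;
using System.Text;

public static class LostBytesDemo
{
    // Treats each value as a character, then encodes the text as ASCII.
    public static byte[] ViaStringBuilder(params int[] values)
    {
        var sb = new StringBuilder();

        foreach (var value in values)
        {
            sb.Append((char)value);
        }

        return Encoding.ASCII.GetBytes(sb.ToString());
    }

    // Writes each value as a raw byte, with no text encoding involved.
    public static byte[] ViaBinaryWriter(params int[] values)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            foreach (var value in values)
            {
                bw.Write((byte)value);
            }

            bw.Flush();
            return ms.ToArray();
        }
    }
}
```

For the values 33 and 200, the StringBuilder route produces 0x21 0x3F, because the ASCII encoder swaps anything above 127 for a question mark. The BinaryWriter route produces 0x21 0xC8, which is what the printer needs to see.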

Why is the bit image command designed this way?

The m parameter in the bit image command as defined in the previous section has 4 values, and each value changes the way we construct the image data bytes and the way the printer interprets them. The first three values are of dubious value, so let’s throw them out for now. We’ll focus on m = 33, which means “24-dot double density mode.” So keep this in mind and set this aside.

When we think of drawing a bitmap in high level software, we usually think of drawing it pixel by pixel, left to right, top to bottom. So the order that the dots get printed as indicated by the programming guide seems strange:

Relationship between image data and the print result

That is, the dots are rendered from top to bottom, then left to right.

Why in the blue hell do I need to rearrange my nice and neat array of bitmap data into this Connect Four scheme of dots? It all comes down to the size of the print head.

For our mental model, the thermal print head in the receipt printer is physically 1 dot wide by 24 dots high where there are 203 dots per inch (square pixels). It moves left to right across the receipt paper, burning up to 24 dots in the vertical (y) direction for each dot in the horizontal (x) direction.

Since we’re talking about a thermal printer here, there’s no concept of gray scale tones–we’ve either burned a dot into the paper (black) or we haven’t. So if we’re sending image data to the printer in bytes, it would make sense to say that a bit set to “1” means “burn a dot” and a bit set to “0” means do nothing.

But a byte is only 8 bits wide, and we have 24 dots to burn in each “slice” of the image. 24 just happens to be divisible by 8, so we can send 3 bytes of data for each slice to represent our 24 dots. (If the bitmap I want to draw is taller than 24 dots/pixels, then I need to send multiple bit image commands, effectively doing multiple stripes of the image that are 24 dots high; more on this later.)

So, now, the diagram in the programming guide makes sense: we’re not burning 1 dot at a time, we’re burning a vertical stripe that is 1 dot wide and 24 dots high at a time, and then moving to the right. So our printer reads our first 3 bytes of image data, burns the dots specified by those 24 bits, then reads the next 3 bytes, and so on.
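As a sketch of that idea, here is a helper that packs one vertical slice of up to 24 dots into the 3 bytes the printer reads for it. `SlicePacker` is a name I made up, and I am assuming the top dot lands in the most significant bit of the first byte, per the diagram in the programming guide:

```csharp
using System;

public static class SlicePacker
{
    // Packs one vertical slice into 3 bytes. column[0] is the top dot and
    // ends up in the most significant bit of the first byte. Columns
    // shorter than 24 dots are padded with zero bits (no dot burned).
    public static byte[] Pack(bool[] column)
    {
        if (column == null)
        {
            throw new ArgumentNullException("column");
        }

        if (column.Length > 24)
        {
            throw new ArgumentException("A slice holds at most 24 dots.", "column");
        }

        var bytes = new byte[3];

        for (int y = 0; y < column.Length; y++)
        {
            if (column[y])
            {
                bytes[y / 8] |= (byte)(0x80 >> (y % 8));
            }
        }

        return bytes;
    }
}
```

A slice with only its top dot set packs to 0x80 0x00 0x00, and a slice with only its bottom (24th) dot set packs to 0x00 0x00 0x01.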

Aside: Converting the bitmap to monochrome

Right. So a merchant has given me a *.bmp file, and I need to convert that into an array of bits that I can send to the receipt printer. It’s a good bet that the bitmap that the merchant sent me for the logo is not monochrome (that is, every pixel is either 100% black or 100% white).

So what we can do is look at each pixel in the bitmap, determine its luma, and if that’s below a certain threshold, count that pixel as black. How did I write the code to do this? I looked the formula up via the intertubes and modified it to fit my needs:

private static BitmapData GetBitmapData(string bmpFileName)
{
    using (var bitmap = (Bitmap)Bitmap.FromFile(bmpFileName))
    {
        var threshold = 127;
        var index = 0;
        var dimensions = bitmap.Width * bitmap.Height;
        var dots = new BitArray(dimensions);
 
        for (var y = 0; y < bitmap.Height; y++)
        {
            for (var x = 0; x < bitmap.Width; x++)
            {
                var color = bitmap.GetPixel(x, y);
                var luminance = (int)(color.R * 0.3 + color.G * 0.59 + color.B * 0.11);
                dots[index] = (luminance < threshold);
                index++;
            }
        }
 
        return new BitmapData()
            {
                Dots = dots,
                Height = bitmap.Height,
                Width = bitmap.Width
            };
    }
}

BitmapData is just a little struct that contains the three properties that you see here. After all, once I have my BitArray with 1 indicating black dots and 0 indicating white dots, I don’t need the original Bitmap instance anymore.
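For reference, here is a guess at what such a struct could look like, together with the threshold test pulled out into its own function so that it can be tried without a bitmap file. The real definition lives in the downloadable sample and may differ. `Monochrome.IsBlack` is a name I made up:

```csharp
using System.Collections;

// A guess at the shape of BitmapData, based only on the three
// properties used in GetBitmapData.
public struct BitmapData
{
    public BitArray Dots { get; set; }
    public int Height { get; set; }
    public int Width { get; set; }
}

public static class Monochrome
{
    // Same weights and default threshold as GetBitmapData: a pixel
    // counts as black when its luma falls below the threshold.
    public static bool IsBlack(byte r, byte g, byte b, int threshold = 127)
    {
        var luminance = (int)(r * 0.3 + g * 0.59 + b * 0.11);
        return luminance < threshold;
    }
}
```

With these weights, pure red (luma 76) and pure blue (luma 28) come out black, while pure green (luma 150) comes out white. That is worth knowing if a merchant’s logo is mostly green.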

Just to clarify, this means that we’re holding onto a BitArray of monochrome data in the Dots property that conceptually looks something like this:

Conceptual Overview

Marshalling the monochrome data

Now for the trickiest part of all. We need to take our BitArray of monochrome data, divide it up into 8-dot chunks, represent those 8-dot chunks as bytes, and send them to the printer in the order required by the bit image command, discussed above. That is, I’ll need to send the data as bytes in this order:

Capture

So the printer will draw them one vertical stripe of three bytes each at a time. (If you don’t get it, print out that image, cut up the bytes, and then arrange them in the order shown in the diagram from the EPSON documentation. It’ll instantly make sense once you see it.) The X’s in the diagram correspond to bits that aren’t in our original image that we have to send anyway as padding. Remember, since we selected the 24-dot double density mode, the printer is going to draw a 1 x 24 dot slice as it moves the print head. My example bitmap is only 4 pixels tall, so I have to send 20 zero bits as padding.

And thus we would end up with our pretty image.

Unfortunately, there is some math involved in translating the bits in our BitArray into these bytes that we need to send. Assuming bw is our reference to a BinaryWriter, here’s the code that does just that:

// So we have our bitmap data sitting in a bit array called "dots."
// This is one long array of 1s (black) and 0s (white) pixels arranged
// as if we had scanned the bitmap row by row: left to right, then top to bottom.
// The printer wants to see these arranged in bytes stacked three high.
// So, essentially, we need to read 24 bits for x = 0, generate those
// bytes, and send them to the printer, then keep increasing x. If our
// image is more than 24 dots high, we have to send a second bit image
// command to draw the next slice of 24 dots in the image.
 
// Set the line spacing to 24 dots, the height of each "stripe" of the
// image that we're drawing. If we don't do this, and we need to
// draw the bitmap in multiple passes, then we'll end up with some
// whitespace between slices of the image since the default line
// height--how much the printer moves on a newline--is 30 dots.
bw.Write(AsciiControlChars.Escape);
bw.Write('3'); // '3' just means 'change line height command'
bw.Write((byte)24);
 
// OK. So, starting from x = 0, read 24 bits down and send that data
// to the printer. The offset variable keeps track of our global 'y'
// position in the image. For example, if we were drawing a bitmap
// that is 48 pixels high, then this while loop will execute twice,
// once for each pass of 24 dots. On the first pass, the offset is
// 0, and on the second pass, the offset is 24. We keep making
// these 24-dot stripes until we've run past the height of the
// bitmap.
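// The loop below reads the monochrome bits from 'dots' and the two
// width bytes (nL, nH) from 'width', so set those up first.
BitArray dots = data.Dots;
byte[] width = { (byte)(data.Width % 256), (byte)(data.Width / 256) };
int offset = 0;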
 
while (offset < data.Height)
{
    // The third and fourth parameters to the bit image command are
    // 'nL' and 'nH'. The 'L' and the 'H' refer to 'low' and 'high', respectively.
    // All 'n' really is is the width of the image that we're about to draw.
    // Since the width can be greater than 255 dots, the parameter has to
    // be split across two bytes, which is why the documentation says the
    // width is 'nL' + ('nH' * 256).
    bw.Write(AsciiControlChars.Escape);
    bw.Write('*');         // bit-image mode
    bw.Write((byte)33);    // 24-dot double-density
    bw.Write(width[0]);  // width low byte
    bw.Write(width[1]);  // width high byte
 
    for (int x = 0; x < data.Width; ++x)
    {
        // Remember, 24 dots = 24 bits = 3 bytes. 
        // The 'k' variable keeps track of which of those
        // three bytes that we're currently scribbling into.
        for (int k = 0; k < 3; ++k)
        {
            byte slice = 0;
 
            // A byte is 8 bits. The 'b' variable keeps track
            // of which bit in the byte we're recording.                 
            for (int b = 0; b < 8; ++b)
            {
                // Calculate the y position that we're currently
                // trying to draw. We take our offset, divide it
                // by 8 so we're talking about the y offset in
                // terms of bytes, add our current 'k' byte
                // offset to that, multiply by 8 to get it in terms
                // of bits again, and add our bit offset to it.
                int y = (((offset / 8) + k) * 8) + b;
 
                // Calculate the location of the pixel we want in the bit array.
                // It'll be at (y * width) + x.
                int i = (y * data.Width) + x;
 
                // If the image (or this stripe of the image)
                // is shorter than 24 dots, pad with zero.
                bool v = false;
                if (i < dots.Length)
                {
                    v = dots[i];
                }
 
                // Finally, store our bit in the byte that we're currently
                // scribbling to. Our current 'b' is actually the exact
                // opposite of where we want it to be in the byte, so
                // subtract it from 7, shift our bit into place in a temp
                // byte, and OR it with the target byte to get it into there.
                slice |= (byte)((v ? 1 : 0) << (7 - b));
            }
 
            // Phew! Write the damn byte to the buffer
            bw.Write(slice);
        }
    }
 
    // We're done with this 24-dot high pass. Render a newline
    // to bump the print head down to the next line
    // and keep on trucking.
    offset += 24;
    bw.Write(AsciiControlChars.Newline);
}
 
// Restore the line spacing to the default of 30 dots.
bw.Write(AsciiControlChars.Escape);
bw.Write('3');
bw.Write((byte)30);

This code looks confusing, and is, but I’ve tried to document it with explanatory inline comments.
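Two bits of arithmetic in that listing are easy to check in isolation: splitting the width into nL and nH, and working out how many 24-dot stripes an image needs. Here is a sketch with helper names of my own invention. It does not check the printer’s own maximum width, which you would need to look up in the programming guide:

```csharp
using System;

public static class BitImageMath
{
    // Splits the width into the 'nL' and 'nH' parameter bytes, so that
    // width == nL + (nH * 256).
    public static byte[] SplitWidth(int width)
    {
        if (width < 0 || width > 0xFFFF)
        {
            throw new ArgumentOutOfRangeException("width");
        }

        return new[] { (byte)(width % 256), (byte)(width / 256) };
    }

    // Number of 24-dot-high passes needed to cover the image height.
    public static int StripeCount(int height)
    {
        return (height + 23) / 24;
    }

    // Image data bytes sent in total (3 per column per stripe), not
    // counting the command bytes or newlines.
    public static int DataByteCount(int width, int height)
    {
        return StripeCount(height) * width * 3;
    }
}
```

A 300-dot-wide image gives nL = 44 and nH = 1. A 4-pixel-tall image still takes one full stripe, and a 49-pixel-tall image takes three.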

Sending the document to the printer

Finally, there is the task of sending this array of bytes directly to the printer. I’ve talked about this before and the code is uninteresting, requiring some P/Invokes into native APIs. Essentially, you pass in the name of the printer as it appears in the Control Panel, the array of bytes from our BinaryWriter, and you’re all set:

private static void Print(string printerName, byte[] document)
{
    NativeMethods.DOC_INFO_1 documentInfo;
    IntPtr printerHandle;
 
    documentInfo = new NativeMethods.DOC_INFO_1();
    documentInfo.pDataType = "RAW";
    documentInfo.pDocName = "Bit Image Test";
 
    printerHandle = new IntPtr(0);
 
    if (NativeMethods.OpenPrinter(printerName.Normalize(), out printerHandle, IntPtr.Zero))
    {
        if (NativeMethods.StartDocPrinter(printerHandle, 1, documentInfo))
        {
            int bytesWritten;
            byte[] managedData;
            IntPtr unmanagedData;
 
            managedData = document;
            unmanagedData = Marshal.AllocCoTaskMem(managedData.Length);
            Marshal.Copy(managedData, 0, unmanagedData, managedData.Length);
 
            if (NativeMethods.StartPagePrinter(printerHandle))
            {
                NativeMethods.WritePrinter(
                    printerHandle,
                    unmanagedData,
                    managedData.Length,
                    out bytesWritten);
                NativeMethods.EndPagePrinter(printerHandle);
            }
            else
            {
                throw new Win32Exception();
            }
 
            Marshal.FreeCoTaskMem(unmanagedData);
 
            NativeMethods.EndDocPrinter(printerHandle);
        }
        else
        {
            throw new Win32Exception();
        }
 
        NativeMethods.ClosePrinter(printerHandle);
    }
    else
    {
        throw new Win32Exception();
    }
}

The P/Invoke declarations can be found in the sample code at the end of the article, and the documentation is available on MSDN. But, again, there’s nothing sexy going on here.
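If you don’t have the sample handy, here is a sketch of what such declarations typically look like for the winspool.drv spooler functions. This is my reconstruction from the Win32 function signatures, not a copy of the sample code, so treat the parameter names and marshalling attributes as assumptions:

```csharp
using System;
using System.Runtime.InteropServices;

internal static class NativeMethods
{
    // Mirrors the Win32 DOC_INFO_1 structure. Declared as a class so it
    // can be passed to StartDocPrinter by reference without 'ref'.
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    internal class DOC_INFO_1
    {
        [MarshalAs(UnmanagedType.LPWStr)]
        public string pDocName;
        [MarshalAs(UnmanagedType.LPWStr)]
        public string pOutputFile;
        [MarshalAs(UnmanagedType.LPWStr)]
        public string pDataType;
    }

    [DllImport("winspool.drv", EntryPoint = "OpenPrinterW", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    internal static extern bool OpenPrinter(string printerName, out IntPtr printerHandle, IntPtr printerDefaults);

    [DllImport("winspool.drv", ExactSpelling = true, SetLastError = true)]
    internal static extern bool ClosePrinter(IntPtr printerHandle);

    // The native function returns a job ID (zero on failure); marshalling
    // it as bool turns any nonzero ID into true.
    [DllImport("winspool.drv", EntryPoint = "StartDocPrinterW", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    internal static extern bool StartDocPrinter(IntPtr printerHandle, int level, [In] DOC_INFO_1 documentInfo);

    [DllImport("winspool.drv", ExactSpelling = true, SetLastError = true)]
    internal static extern bool EndDocPrinter(IntPtr printerHandle);

    [DllImport("winspool.drv", ExactSpelling = true, SetLastError = true)]
    internal static extern bool StartPagePrinter(IntPtr printerHandle);

    [DllImport("winspool.drv", ExactSpelling = true, SetLastError = true)]
    internal static extern bool EndPagePrinter(IntPtr printerHandle);

    [DllImport("winspool.drv", ExactSpelling = true, SetLastError = true)]
    internal static extern bool WritePrinter(IntPtr printerHandle, IntPtr buffer, int count, out int written);
}
```

The SetLastError = true flags matter: they are what lets a parameterless Win32Exception pick up the real error code when one of these calls fails.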

Conclusions and delusions

And that’s a wrap. I always find it interesting when I encounter a problem at work–which is a small business by any definition of the phrase–and the Google indicates that no one has ever encountered or documented this problem before. And it’s these types of integration problems that I find fascinating; after years of playing in castles in the sky, dealing with databases and exceptions and HTML forms, it’s nice to know that I can dust off the corners of my brain and my college degree and deal with bits and bytes when I need to. Writing software that makes hardware do stuff is always fun, too.

And after hours of watching the receipt printer spew out lines and lines of gibberish, there’s nothing like the feeling of seeing the bitmap print out and knowing that, finally, things have simply started to work.

Good luck!

Download the code used in this article.

Nov 10 09

Thank You, Comcast

by Nicholas Piasecki

There is nothing quite like dashing to work at night to restart the “SMC Comcast Business IP Gateway” because it stopped routing packets for one of our static IP addresses.

The reason? Oh, it just does that sort of thing every 25 days or so.

Nov 7 09

Programmatically updating software deployed via Group Policy

by Nicholas Piasecki

At work, I’ve written a small application called the “Fulfillment Manager.” From a user’s perspective, it’s an extremely simple application. It shows the current order counts for all of the stores that we ship for, and if you scan a barcode, it figures out what store that barcode belongs to, determines if the order the barcode corresponds to needs to be packed or shipped, and prints out a receipt/packing slip or USPS/FedEx shipping label and supporting shipping documentation automatically. Most operations involve just scanning the barcode and pressing enter.

Yes, it epitomizes Battleship Grey. You love it.

But, behind the scenes, it’s not quite as simple as all of that. It’s aggregating order data from heterogeneous data sources–some in our legacy database, some in our new fulfillment system, some in a custom integration with a third party. It has to figure out which postage account to pay for postage with or which FedEx account number to use to ship a package. For orders that aren’t coming from our new fulfillment system, it has to “cleanse” the address against the current USPS address database. It has to figure out the cheapest way to ship a package, compute customs values correctly, generate certificates of origin and commercial invoices for international shipments, and determine what box types an order is allowed to be packed in. And it has to write shipment information back to one of those three disparate data sources.

What this means is that I’m frequently making adjustments and bug fixes to the application. And managing the deployment and installation of those bug fixes had been, up until now, a pain.

A brief interlude on ClickOnce

The other internal application that we use (“Undies Client”) for our long-time running e-commerce store is deployed via ClickOnce, which is essentially the Java Web Start of the .NET world.

While ClickOnce is a neat technology and has its applications, to be sure, I probably wouldn’t use it again on Undies Client if I were starting that application over today, just as I decided not to use it for the Fulfillment Manager (which is an effort to divorce the processing and shipping features from Undies Client and make them simpler and applicable to multiple e-commerce stores).

First, there’s user confusion. If I deploy a Windows Installer MSI via Group Policy, then the application is magically there on all computers in the office. But for Undies Client, you have to go to a special Web page and click on a link. With ClickOnce, the installation happens per user, so employees can get confused if they go to another computer one day, log into their account, and see that the app isn’t there (“but Jennifer runs it on this computer so I thought it was already installed”).

Second, there are deployment headaches. Like Java Web Start, you get an obnoxious warning if the deployment manifest wasn’t signed with an expensive code signing certificate. To mitigate that, you either buy one or start diddling with the self-signing certificate capability within the context of your own Active Directory domain. Not a show-stopper, and it makes sense, I guess, but it’s One More Thing that you have to deal with.

Third, when that certificate expires and needs to be renewed, you’re in for a world of hurt, because essentially all users will need to uninstall and re-visit the Web site download link and reinstall. Otherwise, the application simply stops seeing the newer updated versions and doesn’t update itself.

Fourth, the distribution of your app now has a dependency on an IIS installation somewhere, so that’s something else to maintain–both the configuration of that virtual directory in IIS as well as the shared drive to which Visual Studio dumps its files when clicking the “Publish” button.

Fifth, the installation can’t do much. Until recently, you couldn’t even create an icon on the desktop as part of the installation process. Nor can you do anything that would require elevated permissions for actions that you might typically do when running an installer, such as registering a COM DLL, or installing some third-party dependency. So the Web page at which you download the app usually contains things like “ooh be sure to install this that and the other first,” defeating the deployment simplicity of ClickOnce. And if you need to update one of those third party dependencies and your app becomes dependent on one of those updates, you have no way to update that dependency with ClickOnce, unless you take it upon yourself to have your application manage the upgrade during its next run. That’s just more work that you shouldn’t have to do.

After writing all of this, it may seem like I am saying that ClickOnce is a half-baked load of crap; it’s not half-baked. I’m saying it’s a fully-baked, complete load of crap. (Kidding. ClickOnce has its applications for applications that can be completely self-contained, but if at any point you become dependent on anything COM, it’s time to move on to real deployment technology.)

Using WiX to create a Windows Installer MSI file

I’ve blogged about WiX before. It’s a great open source tool put together by some guys who decided to write a reasonable mechanism for generating Windows installers because, for some reason, the Windows installer team has seemed to think that editing database tables in a cheeseball editor called Orca was sufficient. This would be like saying that our warehouse workers could ship orders by updating data in a Microsoft Excel spreadsheet.

You could also pay lots of money for InstallShield or something similar, which would create MSI installers for you, but installation is a convenience for me–as a small business whose primary focus is not end-user software, paying for that doesn’t make much sense. There’s also NSIS, but, oh–I just threw up all over myself. We’ll save NSIS for another post.

Additionally, the WiX guys have realized that installers usually need to do useful things, like install certificates, set up Web sites in IIS, and run database scripts, whereas the Windows Installer team seems to have been trying to make writing custom actions harder, not easier, with their subsequent releases, because that’s where most of the crashes and problems in setup packages happen. With WiX, we now have a suite of well-tested custom actions that lots of people are using; this should have been the Installer team’s original response instead of depending on the community and third parties to fill in this gap for them, but it is what it is.

The point is that WiX enables a whole class of small business developers like me to build first-class deployment methods into their applications. With a 300-odd line XML file, the Fulfillment Manager now builds to an MSI file. And since WiX integrates with Visual Studio, I can generate that XML file as part of my build process.

Indeed, I’ve set up TeamCity and use this as a continuous integration server. Whenever I commit a change to Subversion, TeamCity picks up the change, compiles the solution, runs the tests, and if they pass, copies the newly generated MSI file to a network share for potential deployment. It’s pretty sweet.

The missing piece of the puzzle, then, is actually getting this freshly baked MSI file onto all of the client machines in the office.

Deploying via the Group Policy Software Installation Extension

My first instinct was to use the Group Policy Software Installation Extension. This is the thingie where when you open up a group policy in the Group Policy Management Editor, you can drill down to the Software Installation thing under Computer Configuration, specify an MSI file for deployment, and (presto!) any computers linked to that GPO will install the MSI on next boot.

This worked swimmingly well for the first release of Fulfillment Manager.

We pause for another brief interlude on update strategies

Let me explain the design of the installation for a moment. I’ve written my WiX file such that each new MSI file that it generates is a “major upgrade” in the Windows Installer parlance–it has a different product code, a different package code, and a different version number (since my version numbers are a combination of an incrementing build number and the Subversion revision number). But they all have the same upgrade code and I schedule RemoveExistingProducts during the install.

This means that if you have an older version of the Fulfillment Manager on your machine and double-click the MSI file for a newer, updated version, you don’t have to do anything–the existing version is completely uninstalled and the new version is installed on top of it.

The Windows Installer has support for “minor upgrades” and “small updates”, but I can never keep the damn things straight. Can I add a new component? Can I change a file? Can I reorganize a feature? Do I really want to be thinking about this every time I press Build in Visual Studio? My application is small, so I think it’s far easier and more reassuring to just blow away the whole thing and install again during an update, starting with a clean slate each time. In fact, I think this is a reasonable approach for many reasonably-sized applications (Paint.NET, for example, does this) and only becomes a problem when you start getting really large (such as Visual Studio or Microsoft Word).

Getting the update out there

OK. So all I really need to do is get all of the client computers to run msiexec /i FulfillmentManager.msi on the MSI and I’ll be good to go.

You might think that I could just overwrite the old MSI file on the network share with the new one, and the computers would notice the change at the next boot. But you would be, as I was, a fool–machines that already had the install would not notice the change and machines that did not have the install would freak out because they could not find the correct MSI file. After posting my question on ServerFault, I discovered that the Group Policy Software Installation Extension works by creating an advertisement script (*.aas file) and referencing that script via an object sitting in the Active Directory. That LDAP entry and the script file both do annoying things like reference a specific package code and product code, both of which I change with every new build of my software. So this method is out for the count.

Similarly, lugging out the Group Policy Editor and trudging down to the package entry and clicking “Redeploy application…” won’t work for the reasons described above–except that it’ll break the machines that already have the software installed, too.

What works is lugging out the Group Policy Editor, trudging down to the package entry, clicking “Remove” and “Immediately remove”, and then adding the package right back. This creates an updated *.aas file and correspondingly updated LDAP entries, and the old LDAP entry is flagged with a “remove me now please” flag. This works, but there are three things that I don’t like about it:

  • It’s a manual process, so I have to remember to do it every time I create a new deployable build.
  • References to all of the old versions hang around by necessity, since it’s recording the fact that “hey, if I see this particular product code then I need to uninstall it.” Indeed, upon inspecting the SYSVOL share, I saw that there were about 45 such files sitting in there since development of this app started in early July.
  • There is no way to perform this process programmatically. (Well, there is, it’s documented as part of the EU anti-trust settlement, but let’s get real now: if it has LDAP in the spec then I’m not touching it with a ten-foot pole. Plus, this would be work that is totally tangential to my problem, which is automating a 1-minute task that annoys the crap out of me. At several days’ worth of work, it’d take me a long time to climb out of that time deficit to realize any savings.)

Nirvana: Automatic updating

The Software Installation extension for Group Policy can do a lot of things that I don’t need, like using patches or transforms, or shifting installed software by just moving a computer to a different OU, when at the end of the day all I really wanted was something that looked like this:

Must this be so difficult?

Since my installer will uninstall any previous versions, all I need to do is run the installation package via msiexec. This is easy enough to do via a batch script that I’ve configured to run at startup via group policy, since those batch scripts run as SYSTEM and will have the necessary privileges to complete successfully.

You would think that in addition to /i and /x (and the idiotic /vomus), msiexec would have a somewhat useful parameter called, oh, I don’t know, /install-it-only-if-the-damn-thing-isn't-already-installed, but that would be a useful feature, so of course the Windows Installer team didn’t actually implement it.

Now, granted, there is actually no harm in running my installer when my app is already installed. It would just check that all the components are indeed installed and exit. But this still leaves a bad taste in my mouth. Not all MSIs that I want to use in this way might behave like this. And, if I have any custom actions, that means that they’ll also get run on every boot, which seems like a waste.

While I could spelunk through the registry to try and see if I can find my current product code in the list, I’d rather not do that, because (1) that might not work consistently on all versions of Windows and (2) really, I’m just not that comfortable with batch scripts to begin with.

What I ended up doing is writing a little command-line program called msicheck. The source file is boring and decently commented. Because all of the Windows Installer APIs are in C, my utility is in C. I briefly thought about writing it in C# with some interop, because I hate C just that much, but that did seem like awful overkill for such a simple application.

Ah, C in Windows. I had long forgotten the days of Hungarian variable names that sound like Yosemite Sam cussing (“lpszPackageVer”) and the land of if ("swizzle" == "swizzle") returning false. Heck, I had forgotten the days of not even having false! I digress; the code is probably awful, but seems to work well enough for my purposes.

> msicheck.exe /?
 
Determines if a given Windows Installer package is installed.
 
msicheck package
 
Exit Codes:
        0       Exact version installed
        1       General failure
        2       Path to MSI package not found or not accessible
        3       Error determining package status
        4       Newer version installed
        5       No version installed
        6       Older version installed

All msicheck does is take the full path to an MSI file as an argument and return an exit code that indicates the status of that particular package by looking at the product code and the version. (That is, if the product code is not installed, it returns 5; if the same product code is found, it compares the version and returns 0, 4, or 6.) I call it in my batch script (which, since I don’t do this for a living, is probably terrible) like so:

@ECHO OFF
 
REM ---------------------------------------------------------------------
REM VARIABLES
REM ---------------------------------------------------------------------
 
REM Use a name other than PATH so the command search path is left intact.
SET share=\\SKIVIEZSBS2008\Group Policy Installations\Fulfillment Manager\
SET package=%share%Skiviez.FulfillmentManager.Installer.WinForms.msi
SET msicheck=%share%msicheck.exe
 
ECHO Checking installed version of Fulfillment Manager...
 
"%msicheck%" "%package%"
 
IF ERRORLEVEL 0 IF NOT ERRORLEVEL 1 (
	echo Fulfillment Manager is up to date.
) ELSE (
	IF ERRORLEVEL 5 (
		echo Installing latest version...
		%SYSTEMROOT%\system32\msiexec /qn /i "%package%"
		REM IF ERRORLEVEL 0 is true for any exit code of 0 or greater, so test for failure instead.
		IF NOT ERRORLEVEL 1 (
			echo Fulfillment Manager successfully updated.
		) ELSE (
			echo Errors may have occurred during installation.
		)
	) ELSE (
		echo Newer version installed or error occurred.
	)
)

Keeping in mind that IF ERRORLEVEL 5 evaluates to true for error levels of 5 or greater, this means that I call msiexec only if the product isn’t installed or if an older version is installed.
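For the curious, the decision logic described above can be sketched in a few lines of portable C. This is not the actual msicheck source: the real utility asks the Windows Installer API about the installed product, while this sketch takes the version strings as plain arguments. The function names (next_field, compare_versions, package_status, should_install) and the assumption that versions are dotted numeric strings are made up for illustration; only the exit codes come from the usage text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Exit codes copied from the msicheck usage text. */
enum {
    EXACT_VERSION_INSTALLED = 0,
    NEWER_VERSION_INSTALLED = 4,
    NO_VERSION_INSTALLED    = 5,
    OLDER_VERSION_INSTALLED = 6
};

/* Hypothetical helper: read one numeric field and step past a trailing '.'.
   If no digits are found, the rest of the string is skipped and 0 returned. */
static unsigned long next_field(const char **p)
{
    char *end;
    unsigned long value = strtoul(*p, &end, 10);
    if (end == *p) {
        while (**p != '\0')
            (*p)++;
        return 0;
    }
    *p = (*end == '.') ? end + 1 : end;
    return value;
}

/* Compare dotted versions field by field; missing fields count as 0.
   Returns a negative value, 0, or a positive value, like strcmp. */
int compare_versions(const char *a, const char *b)
{
    while (*a != '\0' || *b != '\0') {
        unsigned long field_a = next_field(&a);
        unsigned long field_b = next_field(&b);
        if (field_a != field_b)
            return field_a < field_b ? -1 : 1;
    }
    return 0;
}

/* installed_version is NULL when the product code is not installed
   (an assumption of this sketch, standing in for the real API query). */
int package_status(const char *installed_version, const char *package_version)
{
    int order;
    assert(package_version != NULL);
    if (installed_version == NULL)
        return NO_VERSION_INSTALLED;
    order = compare_versions(installed_version, package_version);
    if (order == 0)
        return EXACT_VERSION_INSTALLED;
    return order > 0 ? NEWER_VERSION_INSTALLED : OLDER_VERSION_INSTALLED;
}

/* Mirrors the batch script: IF ERRORLEVEL 5 is true for 5 or greater. */
int should_install(int exit_code)
{
    return exit_code >= 5;
}
```

With these definitions, package_status("1.0.41.1300", "1.0.42.1337") gives 6 (older version installed), and should_install(6) is true, which is the case where the batch script goes on to call msiexec.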

Conclusions and Delusions

So, by using a batch script and my msicheck utility, I have things the way I want them. I commit a change; TeamCity builds it, runs the tests, and copies the resulting MSI to the network share that is referenced in my batch script; and the batch script runs at the next boot, uses msicheck to note that that package is not installed, and so runs msiexec to install the update (which removes the old version as part of the install process).

It’s certainly not applicable to every software deployment scenario, much like ClickOnce, but hopefully it’ll help someone out there who wanted to do something similar.

Download MsiCheck, which requires Windows Installer 4.5 to run. Complete with the “works on my machine!” guarantee of absolutely no warranties.

Good luck!