Dear Authorize.Net

I walked in this morning to find multiple Authorize.Net accounts failing all transactions with a General Error. I called Authorize.Net around 8 am Eastern and got connected to a woman who said I was breaking up. Then I launched a Live Chat, and Brandon S. told me that I needed to contact my bank. He confirmed that lots of other people were having the errors, but that I needed to contact my bank for resolution. So I called my bank (Wells Fargo), who had no idea what was going on and said to try back at 9 am. When I called again, they admitted they were getting a lot of calls about this but weren’t sure what was going on. Twitter started lighting up with people reporting similar problems. All for an outage that, according to our logs, had been going on since around 4 am Eastern.

So now it looks like a First Data change broke Authorize.Net, or the other way around, or something else happened that I shouldn’t have to care about. Either way, it still took you half a day to admit that you had a problem and then get it resolved. That’s what happens when you play the blame game instead of investigating issues yourself.

Here’s what you should have done:

  • You should have noticed a high percentage of these errors occurring during the night.

  • You should have noticed that they were all related to a particular processor.

  • You should have alerted that processor, posted a status message to your home page, and alerted your customer service staff.

Instead, you told individual customers to call their banks and fend for themselves. You said, “it’s not my problem.”

This is typical Authorize.Net; time and time again, I see you passing the blame onto someone else:

  • Instead of implementing two-factor security, you require us to change inane security questions and passwords constantly.

  • Instead of improving account security through alerts for suspicious behavior, you make merchants sign a release form for ECC because it’s always easier to pass the buck than it is to make your system better.

  • Instead of improving the very real use case problems with CIM, you let developer complaints on the forums go without action for years.

  • Instead of making refunds easy to handle like YOUR PARENT COMPANY CyberSource, you document that each individual merchant needs to void and recapture instead of implementing this functionality once, for everybody, yourselves in the API. Which means storing the card details (hi PCI DSS!) or battling CIM.

I’m beginning to think that nobody actually works on customer feedback or new features at Authorize.Net. Instead, it’s all maintenance on a decade-old platform and API that, because it’s been running this long, is assumed to have nothing wrong with it.

You are setting your company up to be disrupted by a new player. You need to act now: stop thinking of yourselves as just a payment gateway and start being a company that simply helps customers accept credit cards, and that means figuring out what’s wrong any time, every time.

Otherwise, you’re just another middleman passing the buck, and you will be replaced.

Thanks for listening.

Using the EPL2 GW command to send an image to a Zebra thermal printer

How’s that for a title? While that’s gibberish for most, if you’re the unlucky soul who’s been tasked with dumping an image to your Zebra thermal printer, it could be just what you’re looking for.

The ubiquitous Zebra thermal printer.

A little background

These thermal printers are commonly used in shipping environments to print out USPS, FedEx, and UPS labels. The printers generally speak two languages natively: EPL2 and ZPLII. EPL (Eltron Programming Language) is older than ZPL (Zebra Programming Language) but is also a bit simpler.

Zebra bought Eltron and has kept EPL around for backward compatibility reasons; the tired LP2844 only speaks EPL, but newer printers can speak both EPL and ZPL. While ZPL has advanced drawing features and proportional fonts, I tend to favor EPL just because the command set is simpler and it ends up getting the job done.

The EPL language consists of printer commands, one per line, in an ASCII text file. Each of the commands is described in the EPL Manual on Zebra’s site. For example, here’s a quick-and-dirty document that prints out some text on a 3″ x 1″ thermal label:

 
N
q609
Q203,26
A26,26,0,5,1,2,N,"HI, MOM!"
P1,1

Each command generally starts with a letter and is followed by comma-separated parameters, each of which is described in the manual. There are commands for drawing text, drawing lines, and drawing barcodes: the basic kind of stuff that you need to do in a warehouse environment.

OK. But what about images?

These printers come with a Windows printer driver that lets them work with Windows like any other GDI-based printer. You can print a Word document, the driver translates it into a bitmap, and then the driver dishes out the image to the printer.

Sometimes, though, you want to use the EPL language (after all, your printer has tons of barcode formatting built into it already, so you might as well use that instead of buying a third-party library) while also dishing out a bitmap (such as your company’s logo). You look in the handy-dandy manual and see that you need to use the GW command to send out an image, and you start by …

The GW Command

… scratching your head. Ugh, looks like we’ll have to do some math.

But not yet

First, though, let’s pound and poke the image that we want to draw into a format that we can work with. Since we’re dealing with a thermal printer, there is no concept of grayscale here: we either burn a dot (represented by bit 0) or don’t burn one (represented by bit 1). (If you’ve seen my post about printing images to an ESC/POS receipt printer, you’ll note that Epson’s convention is conveniently the direct opposite of this.)

There’s a good chance that the bitmap that we’re trying to draw is not monochrome. In a monochrome image, each pixel is either 100% black (burn a dot) or 100% white (don’t burn a dot). Our image is probably grayscale or even in color, so we need to figure out how to snap each pixel to either being pure black or pure white.

The way one figures out how to do this is to search for it on the Internet and hope that some graphics nerd has done this for you. And, happily, there’s apparently this thing called luma that will serve this purpose nicely:

/// <summary>
/// Gets a <see cref="BitmapData"/> instance for a given image.
/// </summary>
/// <param name="bytes">The image data.</param>
/// <returns>The <see cref="BitmapData"/> instance.</returns>
private static BitmapData GetBitmapData(byte[] bytes)
{
    using (var ms = new MemoryStream(bytes))
    using (var bitmap = (Bitmap)Bitmap.FromStream(ms))
    {
        var threshold = 127;
        var index = 0;
        var dimensions = bitmap.Width * bitmap.Height;
        var dots = new BitArray(dimensions);
 
        for (var y = 0; y < bitmap.Height; y++)
        {
            for (var x = 0; x < bitmap.Width; x++)
            {
                var color = bitmap.GetPixel(x, y);
                var luminance = (int)((color.R * 0.3) + (color.G * 0.59) + (color.B * 0.11));
                dots[index] = luminance < threshold;
                index++;
            }
        }
 
        return new BitmapData()
        {
            Dots = dots,
            Height = bitmap.Height,
            Width = bitmap.Width
        };
    }
}

Given our bitmap as an array of bytes, we load it into a GDI+ bitmap using .NET’s System.Drawing namespace. Then we apply our luminance formula to determine how “bright” each pixel is, snapping it into a binary value. Then, we return a BitmapData struct that just contains the properties that you see here: the bitmap height, the bitmap width, and the now-binary pixels of our image strung out in one long array.
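
The BitmapData type itself never appears in the article; here’s a minimal sketch of what it could look like, inferred purely from how GetBitmapData() and InsertImage() use it (the property names come from that usage):

/// <summary>
/// Holds the monochrome pixel data extracted from the source image.
/// (A sketch; this type isn't shown in the article and is inferred
/// from how GetBitmapData() and InsertImage() use it.)
/// </summary>
public struct BitmapData
{
    /// <summary>The pixels, row by row; true means "burn a dot".</summary>
    public BitArray Dots { get; set; }

    /// <summary>The image height, in pixels.</summary>
    public int Height { get; set; }

    /// <summary>The image width, in pixels.</summary>
    public int Width { get; set; }
}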

That’s nice, but you’re avoiding the math

So now we need to generate the actual GW command. The code here is remarkably similar to that of the article I wrote about sending images to ESC/POS printers. Here we go:

/// <summary>
/// Inserts a GW command image.
/// </summary>
/// <param name="bw">The binary writer.</param>
/// <param name="top">The top location.</param>
/// <param name="left">The left location.</param>
/// <param name="image">The image bytes.</param>
private static void InsertImage(BinaryWriter bw, int top, int left, byte[] image)
{
    var encoding = Encoding.ASCII;
    var data = GetBitmapData(image);
    var dots = data.Dots;
    var bytes = (int)Math.Ceiling((double)data.Width / 8);
 
    bw.Write(encoding.GetBytes(string.Format("GW{0},{1},{2},{3},", top, left, bytes, data.Height)));
 
    var imageWidth = data.Width;
    var canvasWidth = bytes * 8;
 
    for (int y = 0; y < data.Height; ++y)
    {
        for (int x = 0; x < canvasWidth; )
        {
            byte s = 0;
 
            for (int b = 0; b < 8; ++b, ++x)
            {
                bool v = false;
 
                if (x < imageWidth)
                {
                    int i = (y * data.Width) + x;
                    v = data.Dots[i];
                }
 
                s |= (byte)((v ? 0 : 1) << (7 - b));
            }
 
            bw.Write(s);
        }
    }
 
    bw.WriteNewLine();
}

So what the hell is going on here?

The first thing of note is that we calculate the p3 parameter by converting our bitmap width (in pixels) into a number of bytes. Since each pixel is represented by one bit (a 1 or 0), then each byte represents 8 pixels. That means that our image’s width must be a multiple of 8. We handle this by using Math.Ceiling in the conversion so that we’ll end up just padding with extra white space if our bitmap width is not a multiple of 8.

The second thing of note is that we calculate the p4 parameter. This is referred to in the documentation as the “print length” which is just a confusing way of asking “how many dots tall is it”. This is just the height of our bitmap.

Finally, we need to dump out our pixel data. I’m using .NET’s BinaryWriter class, so I have to write out data to the stream in bytes. The outer loop loops through each horizontal stripe of the bitmap, starting from the top. And the next loop draws each dot in that line, starting from the left. And the innermost loop fills up a byte, since we have to “gather” 8 pixels at once to write out as a byte to the BinaryWriter. There’s an extra if check there to account for the case where our bitmap image width is not a multiple of 8; if so, we need to make sure to pad the extra space instead of marching off to the next line in our dots array.

The s |= (byte)((v ? 0 : 1) << (7 - b)); line looks terrifying but is really just working to build up the byte. I discussed the mechanics of this in detail in my post about printing images to an ESC/POS receipt printer.

If you open the file in Notepad, you’ll see gibberish. That’s because the image data you just encoded isn’t going to map neatly into ASCII characters, and this is where the design of the EPL language starts to break down. It’s nice that most of it can be expressed in simple ASCII characters and can be edited in a text editor, but this isn’t one of those cases. If you open the file in a text editor and save it, you might “bake” the binary image data portion in the wrong encoding and end up with gibberish on your printer. Be sure to send it directly to the printer as is, without loading it into a string or StringBuilder!

Putting it all together

In my case, I was sick of a certain provider constantly breaking their EPL label with every software update and instead wanted to dump their PNG version of the label (which they seem to actually test) directly to the printer on 4″ x 6″ stock. Given the bitmap as a byte[] array, here’s a quick-and-dirty function to dump it out:

/// <summary>
/// Converts an image, represented by the given binary payload, into an EPL document.
/// </summary>
/// <param name="source">The image to convert.</param>
/// <returns>The EPL document payload.</returns>
internal static byte[] AsEplImageDocument(this byte[] source)
{
    using (var ms = new MemoryStream())
    using (var bw = new BinaryWriter(ms, Encoding.ASCII))
    {
        // Clear out any bogus commands
        bw.WriteNewLine();
 
        // Start a new document
        bw.Write(Encoding.ASCII.GetBytes("N"));
        bw.WriteNewLine();
 
        // Label width is 4"
        bw.Write(Encoding.ASCII.GetBytes("q812"));
        bw.WriteNewLine();
 
        // Label height is 6" ... only important in ZB mode
        bw.Write(Encoding.ASCII.GetBytes("Q1218,20"));
        bw.WriteNewLine();
 
        // From earlier in the article
        InsertImage(bw, 0, 0, source);
 
        // Print one copy of the label
        bw.Write(Encoding.ASCII.GetBytes("P1,1\n"));
        bw.WriteNewLine();
 
        bw.Flush();
 
        return ms.ToArray();
    }    
}

You can dish the resulting document out directly to the printer using the RawPrinterHelper class from MSDN.
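
One helper used throughout these snippets but never shown is WriteNewLine(). I’d guess it’s just a small extension method on BinaryWriter that terminates the current EPL command line; a sketch along those lines:

/// <summary>
/// Extension methods for <see cref="BinaryWriter"/>.
/// (A sketch; the article never shows WriteNewLine(), so this is an
/// assumption about what it does.)
/// </summary>
public static class BinaryWriterExtensions
{
    /// <summary>
    /// Writes a line terminator so that the next EPL command starts
    /// on its own line.
    /// </summary>
    public static void WriteNewLine(this BinaryWriter bw)
    {
        bw.Write(Encoding.ASCII.GetBytes("\n"));
    }
}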

Hope this helps someone!

DKIM Signing Outbound Messages in Exchange Server 2007

DomainKeys Identified Mail is a proposed standard that, in its own words, allows a domain (such as skiviez.com) to assert responsibility for the contents of an e-mail message.

The outgoing mail transport agent (“mail server”) computes a hash of the body of the mail message and adds a special, cryptographically signed DKIM-Signature header to the mail message that contains this hash as well as some other information about the message. The public part of the key, used by receiving mail transport agents to verify the signature on incoming messages, is stored as a TXT record in the DNS for the signing domain.

The result of all of this is that the recipient can now know that the body and certain headers of the e-mail message were not tampered with or changed by a third party while the message was in transit. If the signature contained within the DKIM-Signature header doesn’t verify, then it’s possible that the message is a phishing message, spam, or some other false representation of members of the signing domain. (IMHO, however, it is more likely to just indicate a broken DKIM implementation, as there are several obnoxious corner cases that we will see.)

Where is the Exchange support?

No currently released version of Microsoft Exchange supports DKIM natively, not even 2010. This is purportedly because Microsoft threw its weight behind another standard called SPF, or Sender Policy Framework. SPF is much simpler to implement because it does not make a statement about the integrity of the message contents; that is, there is no signing step that requires processing each outbound message and stamping it with a special header. Instead, a TXT record in the DNS for the sending domain lists the IP addresses from which e-mail messages from that domain are allowed to originate.

For example, let’s say that a store has a mail transport agent at mail.example.com and a Web site at www.example.com. An SPF record for example.com might make a statement like the following:

“E-mail messages purporting to be from @example.com should only originate from either mail.example.com or www.example.com. All other originating IP addresses should be treated with suspicion.”
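
Expressed as an actual DNS entry, that policy might look something like this (a sketch for illustration only; the host names and mechanisms would obviously depend on your own setup):

example.com.    IN  TXT     "v=spf1 a:mail.example.com a:www.example.com -all"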

You can see why this is easier to implement; you just add an entry to DNS, and you don’t have to configure the outbound mail transport agent to do anything. The only reason a mail transport agent would need to change is to gain the ability to inspect incoming messages and validate the envelope MAIL FROM address against the corresponding SPF TXT records in DNS. And that is what Exchange does today.

While SPF does not validate the integrity of the message–a hacked, legitimate company mail server that spews out spam unwittingly will still pass an SPF–it is a simple, convenient way to mark phishing and spoofing attempts originating from other locations, like a random house in Wisconsin that is unwittingly part of a botnet, with suspicion.

But, then again, a hacked mail transport agent using DKIM that spews out spam unwittingly will still pass a DKIM verification. Spam can be signed just like regular mail. So the advantage of DKIM over SPF is that DKIM provides the ability to prove that the contents of the message were not modified in transit, and if an organization so chose, they could sign different messages with different keys or decide to not sign certain kinds of messages at all, allowing the recipient to interpret these different kinds of mail in different ways.

In reality, most e-mail doesn’t need to be that secure, so the utility-to-difficulty ratio may be a little out-of-whack here, especially when other standards for signing messages already exist.

So DKIM is more complicated and not supported by Microsoft. Why implement DKIM on Exchange?

Because fuck Yahoo!, that’s why.

Actually, let’s step back a minute.

Yahoo! invented something called Domain Keys, which predates DKIM and which DKIM is largely based on, in an attempt to combat spam in the ways described previously. The aim was for Yahoo! mail servers to be able to easily identify entire classes of suspect e-mail and send them to people’s spam folders (which, in Yahoo! weasel words, is referred to as the “bulk mail folder”). Then they integrated it into the Yahoo! mail system.

If your business sends mail of any kind of volume to Yahoo! mail servers, such as an e-mail newsletter that users subscribe to, you’ll quickly see a message along the lines of the following:

421 4.16.55 [TS01] Messages from x.x.x.x temporarily deferred due to excessive user complaints

And you might also notice your e-mail messages going straight to the spam folder on Yahoo! mail accounts by default, with an X-Yahoo-Filtered-Bulk: x.x.x.x header appearing in the message headers. If this has happened, it is because Yahoo! has decided that you are naughty and placed the IP address of your mail transport agent on an internal blacklist. If a Yahoo! user has previously marked e-mail from your domain as “Not Spam,” then that user will continue to receive your e-mail, but any other Yahoo! users will find your e-mail going to the spam folder by default until they either click the “Not Spam” button on one of them or add your from address to their contact list.

“But we’re a legitimate small business,” you say. “Only about 5,000 of the e-mails on our mailing list are actual Yahoo! addresses, the newsletter list is opt-in, and there are ‘Unsubscribe’ links in every e-mail. It’s not like we’re spamming users out of the blue,” you continue. “These are users who have asked for this service and can cancel at any time. So why has Yahoo! blacklisted us and treated us like spammers?”

The answer, unfortunately, is that Yahoo! is incompetent and can’t build a mail service that can handle spam like Gmail can handle spam. Instead, their e-mail system depends on the following metrics:

  • If a large burst of e-mail comes from the same IP address within a certain time period, temporarily consider it spam and defer the requests. The mail will still arrive, but the “fat pipe” will be squashed down so that overall delivery takes longer to complete. The idea is that if the mail is originating from a spammer, they are unlikely to try again.
  • If a “significant” number of Yahoo! users mark mail messages originating from the same IP address as spam by clicking on the “Spam” button in their account, add that IP address to the X-Yahoo-Filtered-Bulk blacklist.

In other words, if enough people mark your e-mail as spam, your IP address is fucked, and you’ll soon be fielding complaints from customers who insist that they aren’t receiving your Web site’s transactional e-mails, such as a “forgot my password” or “shipping confirmation” e-mail, because Yahoo! has now decided to deliver all e-mails that originate from your IP address to the spam folder, which many users do not think to check or cannot find. So they blame you.

If you think that Yahoo! has blacklisted you in error, or if you have improved your e-mail sending practices and wish to have them consider you for removal, you can fill out this form with incorrect JavaScript required field validation to request that the Yahoo! postmaster take you off the naughty list. About 27 hours later, a Yahoo! representative with a sub-room temperature IQ will e-mail you back a long, canned reply that generally amounts to “No.”

But in that “no” message, they offer the possibility of signing up to the Complaint Feedback Loop. When you participate in this program, if a Yahoo! user presses the “Spam” button on one of your e-mails, then Yahoo! will e-mail you the e-mail address of that user along with a copy of the e-mail that you sent. You can then unsubscribe that user from your mailing list yourself (since they neglected to use the unsubscribe link in your e-mail and will continue to damage the reputation of your IP address if you continue to send mail to them), or you can see which kinds of e-mails you are sending are creating the most problems.

But participation in their feedback loop requires that you use DKIM to sign all of your outbound mail messages. The reasons for this are not clear, but I assume it has to do with ensuring that you are only notified about the problematic e-mails that legitimately came from your organization, and not ones generated by a spammer. Or, it could simply be a way to drive adoption of Yahoo!’s DKIM standard.

At any rate, Yahoo! mail is the most widely used e-mail service on the planet, so when the idiots in the higher echelons of Yahoo! say “Jump!” then we must ask “How high?” When they say “implement DKIM”, we implement DKIM.

So what are options for implementing DKIM on Exchange?

One option is to simply not do it in Exchange and set up a relaying mail server that has DKIM support, like hMailServer or Postfix with dkim-milter. But if you’re at a small business, the idea of maintaining yet another server for what is conceptually a simple task is not a pleasant thought.

Another option is to use a dedicated device like a Barracuda or an IronPort. This device would sit in front of Exchange and rewrite the mail headers in transit, adding the DKIM-Signature header as it flies out of your office. But these devices are not cheap and are out of reach of many small businesses. And the thought of acquiring a specialized device for doing something any mail transport agent should be able to do natively is not a pleasant thought.

You can buy an off-the-shelf plug-in for Exchange like this one from a company in Hong Kong. The reviews on the Internet do generally seem to indicate that it does work. But do we really want to spend $300 – $800 on a component from a company without a reputation and give that component read access to all of our organization’s outbound mail messages? Trust is certainly a driving factor here.

A fourth option, and the option I pursued since I was prepared to give up on DKIM if Yahoo! didn’t change its ways after our implementation, is to take advantage of Exchange 2007’s new Transport Agents functionality. This allows you to write custom managed code running on the .NET Framework that integrates with the Exchange message processing pipeline. In our case, we could write a custom transport agent that appends a DKIM-Signature header to outgoing MIME messages.

Setting up the project

In Visual Studio, we just need to use a plain old “C# Class Library” project. The project must target version 2.0 of the CLR (that’s .NET Framework 2.0, .NET Framework 3.0, or .NET Framework 3.5, thanks to the dipshits in Microsoft’s marketing department) and not the .NET Framework 1.x or, maddeningly, the .NET Framework 4.x. This is because the transport agent process provided by Exchange Server 2007 that will load our transport agent is running on CLR 2.0, and a process that’s been loaded with an earlier version of the CLR can’t load a later version.

We also need to reference two assemblies provided by the Exchange Server, Microsoft.Exchange.Data.Common.dll and Microsoft.Exchange.Data.Transport.dll. You would think that you would be able to find these in, say, the Exchange Server SDK that’s available from Microsoft Downloads. But you’d be wrong. Microsoft keeps diddling with the version of these assemblies with various update rollups for Exchange, and the only way to get a copy is to pull them off an actual Exchange Server 2007 installation. Mine were located in C:\Program Files\Microsoft\Exchange Server\Public; I just copied them to my local computer and referenced them in my new class library assembly for local development.

To start, we just need to create two classes. The first is our actual agent, which derives from the RoutingAgent class defined in the DLLs we just copied:

public sealed class DkimSigningRoutingAgent : RoutingAgent
{
	public DkimSigningRoutingAgent()
	{
		// What "Categorized" means in this sense,
		// only the Exchange team knows
		this.OnCategorizedMessage += this.WhenMessageCategorized;
	}
 
	private void WhenMessageCategorized(
		CategorizedMessageEventSource source,
		QueuedMessageEventArgs e)
	{
		// This is where the magic will happen
	}
}

The second is a factory class that the Exchange server’s little plug-in agent looks for; it’ll instantiate new copies of our agent:

public sealed class DkimSigningRoutingAgentFactory : RoutingAgentFactory
{
	public override RoutingAgent CreateAgent(SmtpServer server)
	{
		return new DkimSigningRoutingAgent();
	}
}

You’re probably thinking, “Wow! The documentation provided by the Exchange team sure blows donkey chunks, how did you ever figure out that that was what you needed to do?” The answer is by spelunking through the Exchange Server SDK examples and by religiously following this guy’s blog.

Now we’re ready to start reading mail messages and “canonicalizing” them into the format that the DKIM spec expects.

Exchange and “Internet Mail”

If there is one thing that is annoying about Exchange, it is that 20-some years after the fact it seems skeptical that this whole “Internet Mail” thing is going to catch on. Getting Exchange to give you the raw message in MIME format is not as simple as one might think.

You see, to be able to hash the body of the message and sign certain headers in the message, we need to know exactly how Exchange is going to format it when it is sent. Even a single space or newline that is added after we compute the hash can throw the whole thing off.

In our routing agent’s OnCategorizedMessage event listener, the event arguments give us access to an instance of the MailItem class. It has a boatload of properties for accessing the body and headers of the message programmatically. Unfortunately, we can’t use these properties because they represent the semantic values, not the raw ones. Instead, we’ll need to use the GetMimeReadStream() and GetMimeWriteStream() methods to read the raw mail message and write out the modified version, respectively.

Implementing the routing agent

Let’s start by completing the Routing Agent implementation. We’ll keep it simple by moving all of the hard signing stuff into an IDkimSigner interface, which we’ll worry about implementing later:

public sealed class DkimSigningRoutingAgent : RoutingAgent
{
    private static ILog log = LogManager.GetLogger(
        MethodBase.GetCurrentMethod().DeclaringType);
 
    private IDkimSigner dkimSigner;
 
    public DkimSigningRoutingAgent(IDkimSigner dkimSigner)
    {
        if (dkimSigner == null)
        {
            throw new ArgumentNullException("dkimSigner");
        }
 
        this.dkimSigner = dkimSigner;
 
        this.OnCategorizedMessage += this.WhenMessageCategorized;
    }
 
    private void WhenMessageCategorized(
        CategorizedMessageEventSource source,
        QueuedMessageEventArgs e)
    {
        try
        {
            this.SignMailItem(e.MailItem);
        }
        catch (Exception ex)
        {
            log.Error(
                Resources.DkimSigningRoutingAgent_SignFailed,
                ex);
        }
    }
 
    private void SignMailItem(MailItem mailItem)
    {
        if (!mailItem.Message.IsSystemMessage &&
            mailItem.Message.TnefPart == null)
        {
            using (var inputStream = mailItem.GetMimeReadStream())
            {
                if (this.dkimSigner.CanSign(inputStream))
                {
                    using (var outputStream = mailItem.GetMimeWriteStream())
                    {
                        this.dkimSigner.Sign(inputStream, outputStream);
                    }
                }
            }
        }
    }
}

The only real quirk is the if statement in the SignMailItem() function, which I mostly discovered through trial and error. If the mail item is a “system message” (whatever that means), then all of the mailItem’s methods will be read-only (throwing exceptions if we try to mutate), so we shouldn’t even bother. And if the mail item has a TNEF part, then it’s in a bizarro proprietary Microsoft format, and the DKIM spec just isn’t going to work with that. Finally, if something blows up, we catch the exception and log it–better to send a message without a signature than not send it at all.

Defining an interface for DKIM signing

So the next step is to make up that IDkimSigner implementation and make it do the dirty work. You can see that I’ve made it simple in that we only need to write two methods:

public interface IDkimSigner : IDisposable
{
    bool CanSign(Stream inputStream);
 
    void Sign(Stream inputStream, Stream outputStream);
}

A method for sanity checking

The first method scans our mail item’s content stream and does a sanity check to ensure that we can actually sign the message. For example, if our IDkimSigner implementation is configured to sign messages originating from warehouse1.example.com and we pass CanSign() a message from warehouse2.example.com, then we can return false to indicate that we just don’t know what to do with the message. Let’s implement that method.

private string domain;
 
public bool CanSign(Stream inputStream)
{
    bool canSign;
    string line;
    StreamReader reader;
 
    if (this.disposed)
    {
        throw new ObjectDisposedException("DomainKeysSigner");
    }
 
    if (inputStream == null)
    {
        throw new ArgumentNullException("inputStream");
    }
 
    canSign = false;
    reader = new StreamReader(inputStream);
 
    inputStream.Seek(0, SeekOrigin.Begin);
 
    line = reader.ReadLine();
    while (line != null)
    {
        string header;
        string[] headerParts;
 
        // We've reached the end of the headers (headers are
        // separated from the body by a blank line).
        if (line.Length == 0)
        {
            break;
        }
 
        // Read a line. Because a header can be continued onto
        // subsequent lines, we have to keep reading lines until we
        // run into the end-of-headers marker (an empty line) or another
        // line that doesn't begin with a whitespace character.
        header = line + "\r\n";
        line = reader.ReadLine();
        while (!string.IsNullOrEmpty(line) &&
            (line.StartsWith("\t", StringComparison.Ordinal) ||
            line.StartsWith(" ", StringComparison.Ordinal)))
        {
            header += line + "\r\n";
            line = reader.ReadLine();
        }
 
        // Extract the name of the header. We keep scanning rather than
        // stopping at the first match because DKIM mandates that we
        // only sign the LAST instance of any header that occurs.
        headerParts = header.Split(new char[] { ':' }, 2);
        if (headerParts.Length == 2)
        {
            string headerName;
 
            headerName = headerParts[0];
 
            if (headerName.Equals("From", StringComparison.OrdinalIgnoreCase))
            {
                // We don't break here because we want to read the bottom-most
                // instance of the From: header (there should be only one, but
                // if there are multiple, it's the last one that matters).
                canSign = header
                    .ToUpperInvariant()
                    .Contains("@" + this.domain.ToUpperInvariant());
            }
        }
    }
 
    inputStream.Seek(0, SeekOrigin.Begin);
 
    return canSign;
}

Barf. But we have to do this style of ghetto parsing because, after all, we’re dealing with the raw e-mail message format. All we’re doing is scanning through the headers until we reach the last From: header, and then we make sure that the From: e-mail address belongs to the domain that our instance knows how to sign. Then we seek back to the beginning of the stream to be polite.

A method for signing

The second method that we have to implement is the one that actually does all of the dirty work. And in DKIM signing, we can break it down into five steps:

  1. Compute a hash of the body of the message.
  2. Create an unsigned version of the DKIM-Signature header that contains that body hash value and some other information, but has the signature component set to an empty string.
  3. “Canonicalize” the headers that we are going to sign. By “canonicalize”, we mean “standardize capitalization, whitespace, and newlines into a format required by the spec, since other mail transport agents who get their grubby paws on this message might reformat the headers”.
  4. Slap our unsigned version of the DKIM-Signature header to the end of our “canonicalized” headers, sign that data, and slap the resulting signature to the end of the DKIM-Signature header.
  5. Write this signed DKIM-Signature into the headers of the mail message, and send it on its merry way.

Divide and conquer!

Implementing the Sign() method

Our implementation for the Sign() method will tackle each step in turn:

public void Sign(Stream inputStream, Stream outputStream)
{
    if (this.disposed)
    {
        throw new ObjectDisposedException("DomainKeysSigner");
    }
 
    if (inputStream == null)
    {
        throw new ArgumentNullException("inputStream");
    }
 
    if (outputStream == null)
    {
        throw new ArgumentNullException("outputStream");
    }
 
    var bodyHash = this.GetBodyHash(inputStream);
    var unsignedDkimHeader = this.GetUnsignedDkimHeader(bodyHash);
    var canonicalizedHeaders = this.GetCanonicalizedHeaders(inputStream);
    var signedDkimHeader = this.GetSignedDkimHeader(unsignedDkimHeader, canonicalizedHeaders);
 
    WriteSignedMimeMessage(inputStream, outputStream, signedDkimHeader);
}

Computing the body hash

The first step, computing the hash of the body, is actually pretty easy. There is only one quirk in that DKIM spec says that if the body ends with multiple empty lines, then the body should be normalized to just one terminating newline for the purposes of computing the hash. The code is not exciting, and you can download it at the end of this article.
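
For illustration, here’s a rough sketch of what GetBodyHash() could look like. This is my reconstruction rather than the downloadable code, and it assumes a this.hashAlgorithm field holding the configured HashAlgorithm instance (a SHA1Managed or SHA256Managed):

private string GetBodyHash(Stream inputStream)
{
    // A sketch, not the downloadable implementation. Assumes a
    // this.hashAlgorithm field (e.g., a SHA256Managed instance).
    inputStream.Seek(0, SeekOrigin.Begin);

    var reader = new StreamReader(inputStream);

    // Skip the headers; the body starts after the first empty line.
    string line;
    while ((line = reader.ReadLine()) != null && line.Length > 0)
    {
    }

    var body = reader.ReadToEnd();

    // "Simple" body canonicalization: reduce any trailing empty lines
    // to a single CRLF, and make sure the body ends with a CRLF.
    while (body.EndsWith("\r\n\r\n", StringComparison.Ordinal))
    {
        body = body.Substring(0, body.Length - 2);
    }

    if (!body.EndsWith("\r\n", StringComparison.Ordinal))
    {
        body += "\r\n";
    }

    var bodyBytes = Encoding.ASCII.GetBytes(body);
    var hashBytes = this.hashAlgorithm.ComputeHash(bodyBytes);

    inputStream.Seek(0, SeekOrigin.Begin);

    return Convert.ToBase64String(hashBytes);
}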

Creating the “unsigned” header

The next step is to create the “unsigned” DKIM-Signature header. This is where the DKIM spec is just weird. The DKIM-Signature header contains a lot of information in it, such as the selector, domain, and the hashing algorithm (SHA1 or SHA256) being used. Since that information is vital to ensuring the integrity of the signature, it’s important that that information be a part of the DKIM signature.

If I were designing this, I would append two headers to e-mail messages: a DKIM-Information header that contains all of the above information and is part of the data that is signed, and a DKIM-Signature header that contains just the signature data. But the DKIM spec makes use of only the one DKIM-Signature header, and for the purposes of signing, we treat the “signature part” of the header (b=) as an empty string:

private string GetUnsignedDkimHeader(string bodyHash)
{
    return string.Format(
        CultureInfo.InvariantCulture,
        "DKIM-Signature: v=1; a={0}; s={1}; d={2}; c=simple/simple; q=dns/txt; h={3}; bh={4}; b=;",
        this.hashAlgorithmDkimCode,
        this.selector,
        this.domain,
        string.Join(" : ", this.eligibleHeaders.OrderBy(x => x, StringComparer.Ordinal).ToArray()),
        bodyHash);
}

You can see here that I’ve got some instance variables that were set in our IDkimSigner implementation’s constructor, such as the hash algorithm to use, the selector, domain, headers to include in the signature, and so on. We also insert our recently-computed hash of the body here.

You can also see that I’m using “simple” body canonicalization and “simple” header canonicalization. The DKIM spec gives us a few options in determining how the message is represented for signing and verification purposes. For the “simple” body canonicalization, it means “exactly as written, except for the weird rule about multiple newlines at the end of the body”. For the “simple” header canonicalization, it means “exactly as written, whitespace, newlines, and everything”.

There is a “relaxed” canonicalization method, but it’s more work, since you have to munge the headers and body into a very particular format, and I didn’t feel like writing a MIME parser.

Extracting “canonicalized” headers

The third step is to get a list of the canonicalized headers. In the constructor, I accept a list of headers to sign: From, To, Message-ID, and so on. (From is always required to be signed.) Then I use parsing code similar to that used in the CanSign() method and build a list of the raw headers. The only real gotcha to watch out for is that headers can be wrapped onto more than one line, and since we’re using the “simple” canonicalization algorithm, we need to preserve that whitespace and those newlines exactly as we extract them from the stream. Then I sort the headers alphabetically, since that’s how I specified them in the GetUnsignedDkimHeader() method shown above.
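
Again for illustration, a sketch of what GetCanonicalizedHeaders() might look like (my reconstruction, not the downloadable code; it assumes this.eligibleHeaders is the list of header names handed to the constructor):

private IEnumerable<string> GetCanonicalizedHeaders(Stream inputStream)
{
    // A sketch: capture the raw text of each header, preserving folding
    // whitespace and newlines exactly as written ("simple" header
    // canonicalization), and keeping only the last instance of any
    // header name that appears more than once.
    var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    inputStream.Seek(0, SeekOrigin.Begin);
    var reader = new StreamReader(inputStream);

    var line = reader.ReadLine();
    while (!string.IsNullOrEmpty(line))
    {
        var header = line + "\r\n";
        line = reader.ReadLine();

        // Headers can be folded onto continuation lines that begin
        // with whitespace; those lines are part of the same header.
        while (!string.IsNullOrEmpty(line) &&
            (line.StartsWith("\t", StringComparison.Ordinal) ||
             line.StartsWith(" ", StringComparison.Ordinal)))
        {
            header += line + "\r\n";
            line = reader.ReadLine();
        }

        var parts = header.Split(new[] { ':' }, 2);
        if (parts.Length == 2)
        {
            headers[parts[0].Trim()] = header;
        }
    }

    inputStream.Seek(0, SeekOrigin.Begin);

    // Emit the headers we were asked to sign in the same alphabetical
    // order used to build the h= list in GetUnsignedDkimHeader().
    return this.eligibleHeaders
        .OrderBy(x => x, StringComparer.Ordinal)
        .Where(x => headers.ContainsKey(x))
        .Select(x => headers[x])
        .ToList();
}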

Signing the message

The logic behind signing the message is not that difficult. We smash all of the canonicalized headers together, add our unsigned DKIM-Signature header to the end, and compute our signature on this. Then we append the signature to the b= element, previously empty, of our DKIM-Signature header:

private string GetSignedDkimHeader(
    string unsignedDkimHeader,
    IEnumerable<string> canonicalizedHeaders)
{
    byte[] signatureBytes;
    string signatureText;
    StringBuilder signedDkimHeader;
 
    using (var stream = new MemoryStream())
    {
        using (var writer = new StreamWriter(stream))
        {
            foreach (var canonicalizedHeader in canonicalizedHeaders)
            {
                writer.Write(canonicalizedHeader);
            }
 
            writer.Write(unsignedDkimHeader);
            writer.Flush();
 
            stream.Seek(0, SeekOrigin.Begin);
 
            signatureBytes = this.cryptoProvider.SignData(stream, this.hashAlgorithmCryptoCode);
        }
    }
 
    signatureText = Convert.ToBase64String(signatureBytes);
    signedDkimHeader = new StringBuilder(unsignedDkimHeader.Substring(0, unsignedDkimHeader.Length - 1));
 
    signedDkimHeader.Append(signatureText);
    signedDkimHeader.Append(";\r\n");
 
    return signedDkimHeader.ToString();
}

The only gotcha here, which I lost a few hours to, is a weird quirk of the .NET Framework 3.5 implementation of the SignData() function of the RSACryptoServiceProvider class. One of the overloads of the SignData() function accepts an instance of a HashAlgorithm to specify the kind of hash to use. The SHA-256 implementation was added in .NET 3.5 SP1, but it was done in such a way that an internal switch statement used by the .NET crypto classes wasn’t updated until .NET 4.0 to recognize the new SHA256CryptoServiceProvider type. Some guy blogs about why this is, but what it essentially means is that if you pass a SHA256CryptoServiceProvider instance to the SignData() method on .NET 2.0/3.0/3.5/3.5SP1, you get an exception, and on .NET 4.0 you don’t. Since Exchange 2007 uses .NET 3.5 SP1, we have to use the recommended workaround of using the overload that accepts a string representation of the hash algorithm.

Writing out the message

The last step is to write out the message with our newly created DKIM-Signature header. This really is as simple as taking the output stream, writing the DKIM-Signature header, and then dumping in the entire contents of the input stream.
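
In code, that could look something like the following sketch (again my reconstruction; the real implementation is in the download). Note the manual copy loop, since Stream.CopyTo() doesn’t exist until .NET 4.0:

private static void WriteSignedMimeMessage(
    Stream inputStream,
    Stream outputStream,
    string signedDkimHeader)
{
    // Prepend the freshly signed DKIM-Signature header, then copy the
    // original message through to the output stream untouched.
    var headerBytes = Encoding.ASCII.GetBytes(signedDkimHeader);
    outputStream.Write(headerBytes, 0, headerBytes.Length);

    inputStream.Seek(0, SeekOrigin.Begin);

    var buffer = new byte[16 * 1024];
    int read;

    while ((read = inputStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        outputStream.Write(buffer, 0, read);
    }
}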

Getting a key to sign messages with

Let us take a brief interlude from our DKIM circles of hell and obtain a key with which we will actually sign the DKIM-Signature header we’ve worked so hard to create.

We need to generate an RSA public/private key pair: a public key to store in DNS in the format required by the DKIM spec, and a private key to actually sign the messages with. The nice folks over at Port25 have a DKIM wizard that does exactly that.

It’s smashingly simple–just enter your domain name (say, “example.com“), a “selector” (say, “key1“), and select a key size (bigger is better, right?). The “selector” is a part of the DKIM spec that allows a single signing domain to use multiple keys. For example, you could use a key with selector name “newsletters” to sign all of the crap newsletter e-mails that you send out, and another key with selector name “tx” to sign all of the transactional e-mails that you send out.

It then spits out the syntax of the TXT records that you need to add to DNS for that selector:

key1._domainkey.example.com IN  TXT     "k=rsa\; p={BIG HONKING PUBLIC KEY HERE}"
_domainkey.example.com      IN  TXT     "t=y; o=~;"

The first record is where the public part of the key is stored. Whenever a mail transport agent sees one of our DKIM-Signature headers with a selector of key1, it’ll know to go hunting in DNS for a TXT record named key1._domainkey.example.com and pull the public key for verification from there.

The second record is part of the older DomainKeys specification and it is not strictly necessary. As written here, it means that we’re in testing mode (“t=y”)–that is, don’t freak out if you see a bad signature because we’re still dicking around with the setup of our implementation–and that not all messages originating from this domain will be signed (“o=~”)–maybe we won’t bother signing our newsletter e-mails, for example.

We’ll also have the private key specified in a format similar to the one below:

-----BEGIN RSA PRIVATE KEY-----
MIIBOwIBAAJBANXBbZybdmjKDTONFVqAWXmGzR6GSZX5LV3OF//1jRz7dzGWTCKK
jembqBxqhr0Y2ua2l4D4EZi6FwDmdqgLS6MCAwEAAQJAD4qhypovEM1oClB+tfbR
Cpn3ffmrjgDxAHoEmrKi0PGBn8fumW22bad2tmrAjWWTVmeXJvQyEy1awq0M2PMR
0QIhAPEnqivb5dKZbTeKhiF4c6IUHfwEq8wNf2LWZvdH3ROrAiEA4un604mDss4Q
qAVEx686pUttfWyJrYkcZ/tx7kOoL+kCICEysqyDAypw0KY6vahR6qk/V7lf8z6O
BSFYHqigDgEtAiEAsK9r5UcQSyv1AD+J/MpOqeJ/kMfwtDUs7zJ01gfMb/ECIQDg
8d/XVJDi4Cqbt4wfcHZxADAgqyK8Z5M69fBecnExVg==
-----END RSA PRIVATE KEY-----

One thing I have glossed over in the code discussion until now was how that this.cryptoProvider instance that actually computes the signature got created.

We’ll need to read this key and load it into the cryptography classes used by the .NET Framework and by Windows to actually sign mail messages and get that this.cryptoProvider instance. Surely there is a simple API for this, yes?

Instancing a CryptoProvider

One problem is that the documentation in MSDN for the CryptoAPI is bad. I say “bad” because it certainly seems like .NET and Windows don’t expose native support for processing a PEM-encoded key, and if they do, well, I couldn’t find the documentation for it. Instead, the RSACryptoServiceProvider prefers to store its keys in an XML format that nothing else in the world seems to use.

This means that our implementation is so close to being finished that we can almost taste it, but now we have to complete a side quest to actually read our damn key and get an instance of the RSACryptoServiceProvider. Or, we could generate a certificate ourselves and store it in the Certificates MMC snap-in, but why should we have to do that? I’d rather just plop the damn key in the application configuration file like the rest of the goddamned world does it, “secure container” my ass.

We can thank the moon and the stars that some guy has written a PEM reader for us. How does it work? I have no idea, but I tested it on several keys and it seemed to work fine, which is good enough for me. I tossed this code into a static CryptHelper class, and now getting an instance of the RSACryptoServiceProvider is as simple as

this.cryptoProvider = CryptHelper
     .GetProviderFromPemEncodedRsaPrivateKey(encodedKey);

Loading the routing agent into Exchange Server 2007

I took all of this code and then added boring administrative stuff like logging and moving some hardcoded values (such as the PEM-encoded key, the selector, and the domain) into the usual .NET App.config file mechanism.

Installing the agent on the Exchange Server is surprisingly simple. After compiling the project and futzing with the configuration file, we just copy the DLLs and configuration file to a folder on the Exchange Server, say, C:\Program Files\Skiviez\Wolverine\DKIM Signer for Exchange\.

Then we launch the Exchange Management Shell (remember to right-click it and “Run as Administrator”) and execute a command to tell Exchange to actually register our agent:

Install-TransportAgent `
     -Name "DKIM Signer for Exchange" `
     -TransportAgentFactory "Skiviez.Wolverine.Exchange.DkimSigner.DkimSigningRoutingAgentFactory" `
     -AssemblyPath "C:\Program Files\Skiviez\Wolverine\DKIM Signer for Exchange\Skiviez.Wolverine.Exchange.DkimSigner.dll"

followed by

Enable-TransportAgent -Name "DKIM Signer for Exchange"

Interestingly, there will be a note telling you to close the PowerShell window. It is not kidding. For some reason, the Install-TransportAgent cmdlet will keep a file handle open on our DLL, preventing Exchange from actually loading it until we close the PowerShell window.

To make it actually work, we need to restart the Microsoft Exchange Transport service. I’ve found that restarting the Microsoft Exchange Mail Submission service right after that is a good idea; otherwise, there can be a delay of about 15 minutes before people’s Outlooks attempt to send outbound mail again.
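
If you’d rather do that from the same elevated PowerShell prompt than from the Services snap-in, something like the following should do it (the service names shown are the standard Exchange 2007 ones; double-check yours in the Services list):

Restart-Service MSExchangeTransport
Restart-Service MSExchangeMailSubmission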

Testing the implementation

To make sure things are actually working, Port25 comes to the rescue with their verification tool. You just send an e-mail to check-auth@verifier.port25.com and within a few minutes, they’ll send you an e-mail back with a boatload of debugging information. If it’s all good, you’ll see a result like the following:

Summary of Results
SPF check:          pass
DomainKeys check:   neutral
DKIM check:         pass
Sender-ID check:    pass
SpamAssassin check: ham

(What you’re looking for is the “pass” next to the DKIM check. The DomainKeys part being neutral is OK, since DomainKeys is the older standard and we’re choosing not to implement it.)

Conclusions and Delusions

I’ve been using this code for a few weeks now and it seems to work fine–the messages that I’ve sent through the server to Port25 and my Yahoo! test account all end up showing the DKIM check passing. The usual “it works on my machine!” disclaimers apply, however, as I’m sure there are myriad configuration differences in Exchange that could cause this not to work. Bug fixes are welcome, but don’t come crying if it sends all of your e-mail to that big junk folder in the sky.

And thanks to some blowhards at Yahoo!, the world now has a public domain implementation of DKIM signing for Exchange to play with.

And in case you’re curious–after doing all this work to set up DKIM and participate in the Complaint Feedback Loop at their suggestion–their answer is still “no,” without elaboration. When Yahoo! finally goes under, I won’t be shedding a nostalgic tear.

There are some unit tests, but they do have our private key in them, and I couldn’t be bothered to siphon those out. The code below is just the bits that do the actual signing.

Download the code used in this article.

Windows Mobile Device Center hangs on splash screen

At Skiviez, we use two HandHeld Honeywell Dolphin D7600 Mobile Computers to pick orders. They’re devices with a barcode scanner, a touch screen, and WiFi capability. They run a Platform Builder variant of Windows CE 5.0. And in Windows XP, they used ActiveSync to connect to Windows and provide the basic service of being able to access the file system of the device (the other services provided by ActiveSync, like syncing mail, contacts, and calendars, don’t really make sense for an industrial device like this).

ActiveSync is a pretty wonky and ugly looking program, but it worked.

In Windows Vista and Windows 7, ActiveSync has been replaced–though the tradition of fragility and wonkiness continued–by an abomination called the Windows Mobile Device Center. (Indeed, it is telling that the new Windows Phone 7 drops ActiveSync/Windows Mobile Device Center completely and instead uses its own synchronization mechanism through the Zune software.) Which, like its predecessor, still worked for providing the basic service of accessing the file system and allowing Visual Studio to connect a debugger.

Except, one day, it stopped working. And I had no idea why.

Suddenly, plunking the device into its cradle would have the following behavior:

  • The device would authenticate and think it is connected.
  • The PC’s USB subsystem would recognize the device and think it is connected.
  • Windows Mobile Device Center would permanently hang at the splash screen:

Which is just awesome.

If I re-launched Windows Mobile Device Center via the Start menu, the program would open, but it would insist that the device is “Not Connected.”

So, I began to troubleshoot.

  • I looked in the Event Log. (My first mistake. I’m renaming the Event Log the “Red Herrings Viewer”.)
  • I tried enabling verbose logging for Windows Mobile Device Center (which reports nothing useful).
  • I tried uninstalling the device and letting Windows reinstall its drivers.
  • I tried uninstalling Windows Mobile Device Center and reinstalling it.
  • I tried soft resetting the device.
  • I tried hard resetting the device.
  • I made sure that the “Windows Mobile *” firewall rules existed in Windows Firewall.
  • I tried a different USB port.
  • I tried a different USB cable.
  • I tried a different D7600 device.
  • I tried a different D7600 cradle.
  • I tried merging the registry settings from another Windows 7 machine that the device would successfully connect to.
  • I tried switching to the RNDIS connectivity model.
  • I tried granting additional permissions on the “Windows CE Services” registry keys.
  • I tried diddling with the various “Windows CE Services” registry keys.

You might think that any one of these things would contain the solution. But you’d be wrong.

The problem was that I had FileZilla FTP Server installed on my machine, configured to allow FTPS connections. (We use an FTP server to manage images and files on the Skiviez Web site, and I had a local copy on my machine from when I was testing the configuration.)

Now, some people might ask “Why the hell would an FTP server break Windows Mobile device connectivity?” Apparently, Windows Mobile Device Center uses port 990 to orchestrate the connection.

Port 990 just so happens to be the standard control port for FTPS connections. If anything else is consuming port 990, then Windows Mobile Device Center either hangs, reports that the device is not connected, or stupidly tries to keep connecting to it. (A message like “whoops, port 990 is in use or does not seem to be a mobile device” would go a long way.)
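
A quick way to find out whether something else has grabbed the port is to run something like this from a command prompt; if port 990 is in use, the last column is the PID of the offending process:

netstat -ano | findstr :990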

So make sure nothing is using port 990; then go pour yourself a g&t.

SBS 2008 restarts unexpectedly when backup starts

Today, our SBS 2008 server restarted itself at 5:00 p.m. sharp. I mean on the dot.

That was disturbing enough in itself. When the system came back up, it helpfully asked me to type in “Why did the system shut down unexpectedly?” and I enthusiastically typed in “Fuck if I know you jackass.” Then, I headed straight for the event log.

The event log was full of terrifying messages such as

The system failed to flush data to the transaction log. Corruption may occur.

or

An error was detected on device \Device\Harddisk1\DR2  during a paging operation.

Hmm. There was no blue screen, no bug check, no minidump. It was as if the power had been cut.

I looked accusingly at the UPS since I have had problems with bad UPSs interrupting the power supply in the past. I held down its self-test button, it made that satisfying buzzing noise, and … everything stayed up.

But while crouched down next to the UPS, I heard an odd swishing noise, like a tiny man was running his finger across a sheet of Saran Wrap. Then I noticed that the external Western Digital hard drive that we use for SBS 2008 backup was doing its swooshing-lights mode, not its solid-lights mode, and I knew from previous experience that it only did that when it was starting up or shutting down.

I had a hunch–in SBS 2008, backup uses Volume Shadow Copy, and I had seen similar disk errors when another of our external hard drives cooked itself (though instead of rebooting, that server became unresponsive). I unplugged the external drive and the event log messages stopped.

I then promptly threw the external hard drive into the trash, drove straight to Best Buy and bought a new external hard drive with the company credit card. (Aside: Why do 90% of external hard drives come with craptastic backup software or “one-touch” buttons? I just want a drive in a box. I finally found one in the “Seagate Expansion” line.)

Then I plugged in the new external hard drive and re-ran the “Configure server backup” wizard from the SBS 2008 console. I unchecked the old, now non-existent drive, checked the new one, and off it went. And all seems happy now. (I ran chkdsk for good measure on the system and data drives and they checked out OK, so it does all seem related to the external backup drive cooking itself.)

Should SBS 2008 be capable of handling faulty backup hardware more gracefully? Sure. And I wish that SBS 2008 had the option to use the old ntbackup utility, because then at least you could back up to network-attached storage. It’s been my experience that external hard drives really are not that reliable and have an average lifespan of only about two years, but maybe I have just been glaring at them the wrong way.

Quick Tip: Sharing a FedEx ZP 500 printer attached to a Windows XP computer to a Windows Vista/7 machine

At Skiviez/WFS, we have a FedEx ZP 500 ZPL printer on the shipping desk. This is what FedEx is migrating everyone to now that the tried-and-true Zebra/Eltron LP2844* series is getting a little long in the tooth. (Along with a gradual migration to ZPL over EPL2, but that’s a rant for another day.)

FedEx ZP 500

The FedEx ZP 500 is a bit of a white elephant in that Zebra doesn’t mention it on their Web site; it’s some sort of special contract job with FedEx to produce and jointly brand these devices. It’s probably just a re-branded version of the Zebra GK420d, but in reality we have the printer manufacturer pretending that they don’t make the printer (e.g., “call FedEx for support”) and a shipping company who has no idea how to support the printer (e.g., “call Zebra for the printer driver”). But I’m getting distracted.

The real issue was that the shipping desk is running Windows XP and shares the printer via the native Windows printer sharing mechanism so that it’s listed in the Active Directory. I do this so that I can run integration tests from the workstation in my office and test the label generation functions of our software without needing to have a thermal label printer hooked up to my workstation solely for this purpose. These printers aren’t cheap, you know.

New operating system, new drivers required

I recently upgraded my workstation to Windows 7. When I tried to add the shared FedEx thermal printer, I was greeted with error code 0x00000007a along with an error message that generally amounted to “something didn’t work.” I suspected a driver problem since Vista is when Microsoft locked down on the mandate that printer drivers run in user mode, not kernel mode–which is a good thing in terms of system stability, since a poorly-written printer driver can no longer trigger a BSOD and a reboot, but a bad thing in terms of backwards compatibility.

The problem is that

  • the Windows XP machine is offering the Windows XP drivers to my Windows 7 install;
  • the Windows 7 printer wizard doesn’t give me a chance to supply my own printer drivers, and instead happily installs the XP ones, which don’t work;
  • the FedEx-supplied Vista drivers are mutually exclusive in terms of compatibility with the XP drivers, so I can’t install them on the XP machine via the Server Properties thingie; and
  • even if I could do that, I am hesitant to dick around with the printer drivers on a critical machine.

Adding the printer

The solution was to add the printer in a different, counter-intuitive way. Here’s what I did (a rough scripted equivalent follows the list):

  1. From the Windows 7 Control Panel, I went to “View devices and printers” and then “Add a printer”.
  2. When asked “What type of printer do you want to install?” I chose Add a local printer, even though I know full damn well that I’m not actually adding a local printer.
  3. For “Choose a printer port,” I chose “Create a new port” with “Type of port” set to “Local Port”.
  4. In the “Enter a port name” dialog, I entered the UNC share name for the printer, which looks like \\{MACHINE-NAME}\{PRINTER-SHARE-NAME}. In my case, it was \\ASHWHWS003\FedEx ZP 500 Plus.
  5. When asked for a driver, I chose “Have Disk” and navigated to the *.inf file in the ZD directory of the Zebra Designer drivers available from the FedEx Web site.
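
If I ever have to repeat this on more machines, the same thing can in principle be scripted with the printui.dll helper that ships with Windows. This is an untested sketch: the INF path and the /m model name are placeholders for whatever the extracted ZebraDesigner package actually contains, and it assumes the local port from steps 3 and 4 already exists (I haven’t checked whether printui will create it for you).

REM Untested sketch: install the "local" printer from the command line.
REM The INF path and the model name are placeholders; the port must already exist.
rundll32 printui.dll,PrintUIEntry /if ^
    /b "FedEx ZP 500 Plus" ^
    /f "C:\Drivers\ZDesigner\ZDesigner.inf" ^
    /r "\\ASHWHWS003\FedEx ZP 500 Plus" ^
    /m "ZDesigner ZP 500 (ZPL)"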

This allowed me to use locally-available printer drivers on a printer attached to another machine. Good luck!

Buying Software Sucks

What follows is a rant about the state of marketing in the software industry.

We happen to like our system

So we’ve branched out and offered fulfillment services to other merchants, as I’ve mentioned before. Essentially, other e-commerce stores or merchants can store their items in our warehouse and then transmit orders as “fulfillment requests” to us. We either ship the request or mark it as invalid with a little message (“item has insufficient stock, delivery point validation failed, tax identification number required, etc.”). If they want to re-attempt, then they re-submit an entirely new fulfillment request.

This simple model has worked surprisingly well: the system doesn’t track “orders” (since each merchant handles backordering and cancellations differently), and while it does maintain an inventory log, it doesn’t track the cost of goods in inventory or anything like that, since that is not necessary for us to do our job. (Instead, it tracks a whole boatload of other information that people don’t normally think about, like HS codes or unit weight.)

Some of our merchants need some hand-holding

We have an API so merchants can integrate with our system. And so I’ve written a few plug-ins for Magento and osCommerce that auto-transmit their orders as fulfillment requests, sync outbound shipments back, and deduct inventory from their systems (since we are the authoritative inventory count). This works great for their retail businesses.

We have a merchant that we’ve been doing retail business with for some time who now wants to do wholesale stuff with us. He wants a sales order/invoicing/inventory management solution, and it needs to be able to track multiple inventories across multiple warehouses (our integration, if any, would only be adjusting inventory for a particular warehouse). He wants to enter sales orders remotely, press a button that shows him how much is ordered so he knows how much to make, have that manufactured and sent to us, click another button to transmit the sales order as a fulfillment request to us once it’s in stock, and then have us sync back with the shipment info, creating an invoice.

“Shouldn’t they have had these features prior to switching to you guys?” you ask. Well, yes. In this case, all of these features were provided in an all-in-one system provided by their previous fulfillment warehouse. They have since learned their lesson about keeping all of their data in the hands of a third party because when that relationship went south, so did their access to their own data.

We would rather not add these features to our system. Since every merchant handles backorders, pre-orders, cancellations, and cost of goods sold (FIFO, LIFO, average, priority) differently, we’ve maintained the position that, unlike our competitors’, our system is essentially feature complete: it’s ours, and it does what we need it to do to ship things out. The features I’ve mentioned are things that merchants should be tracking themselves, since that’s their business, and integrating with our system via the programming API. And while an argument could be made that our system would be abso-freaking-fantastic for merchants who need an all-in-one software and data solution (yes, it certainly would), the reality is that our competitors have outsourced teams of software developers, while we are a small business working in an area tangential to our core business (thanks, “new economy”), with a software development team of exactly one person (me), and we can’t even begin to dream of hiring more until we start seeing some serious cash flow.

In any case, to land this deal, we need to find a system for this merchant, and fast, because there are some important trade shows coming up.

Welcome to marketing Hell

Now for the rant, because you would think that these requirements are not exotic:

  • Let salespeople enter sales orders remotely.
  • Keep track of inventory in multiple locations, and track the cost of inventory.
  • Provide integration hooks so the user can send orders to the warehouse and so the warehouse can send shipment data back.
  • Keep track of sales-order-to-invoice conversions and payments.
  • Provide reporting features.

QuickBooks 2010: Same as last year! Now with more shininess!

You might take a look at integrating with QuickBooks, but once you’ve penetrated the marketing speak, you’d realize that the 2010 software is essentially no different in fundamental feature set from the 2006 version, that it doesn’t support inventory in multiple locations, and that it doesn’t scale well. In fact, QuickBooks performance once you start approaching 10,000 SKUs is so bad that they sell an “enterprise” version that essentially, aside from some fine-grained access permissions, has no added features other than the feature of not crashing when dealing with large lists of information.

We could pay another couple of thousand dollars for Fishbowl Inventory, which would add multiple-location support to QuickBooks, but then we’d have created a Rube Goldberg machine straight out of the gate, with me synchronizing with Fishbowl, which is then synchronizing with QuickBooks. I’m sure nothing would go wrong there. That would be insane; we might as well just stick a few fax machines into the sync process and call it an insurance company.

A gap in the market

The reality is that there is a huge gap in the marketplace between merchants who are moving $200k or less per year–just use commercial off-the-shelf (COTS) QuickBooks, you can do most things manually and use your e-commerce system’s native order management functions–and merchants who are moving $5m or more–just use SAP or some other enterprisey software. If you’re in between, like we are and like the merchant that I’m researching for is, the options available to you are not pretty.

I’m not sure why this is. All I can think of is that perhaps companies historically did not spend much time in this space–they either stayed small or had venture capital to acquire the big boy systems and grow quickly. People aren’t exactly lending money anymore, so I suspect that this is a segment that is only going to grow.

If you try to look for COTS software in this segment, you’ll never find the feature matrix that you need:

  • inFlow Inventory is pretty, but offers no integration features, as if an entire business could be run out of one app, and doesn’t offer Web access.
  • WorkingPoint is Web-based but doesn’t offer inventory tracking in multiple locations or an integration API.
  • QuickBooks has a Web-based extension that lets QuickBooks understand multiple inventories, but it costs thousands of dollars, assumes that the company owns its own warehouse (that is, that it needs picking/packing/shipping capabilities), and still does the same style of synchronization as Fishbowl does. You’d think Intuit would just add the @#$#@ feature to QuickBooks itself!

No COTS to sleep in

The market seems to have determined that people in this segment have outgrown COTS software and need some consulting help. So any Web site that advertises a product will be full of pages and pages of impenetrable marketing bullshit that uses obnoxious acronyms like ERP, CRM, MRP, and WMS, promises the moon, and coyly makes no reference to pricing or contract requirements, so you can’t even tell if you’re dealing with the right league of product. The reality is that, at the end of the day, you could look at two or three screenshots and the SDK’s API and immediately tell whether the product fits your needs.

Instead, I notice a disturbing trend of “pretty Web site, crap product,” such as Sage’s Simply Accounting, which certainly appears to have an impressive array of features but in reality doesn’t even know the difference between a sales order and an invoice. You can try going to Microsoft’s Dynamics site, but good luck figuring out what the differences between Dynamics AX, Dynamics CRM, Dynamics NAV, and Dynamics GP are: you’ll be told to contact your “Microsoft Dynamics solutions representative” for help. At that point, you’re thinking, “Microsoft solutions representative? Who said I committed to Microsoft?! I’m just trying to figure out what in the blue hell your product even does.”

If you do find a vendor that maybe sorta-kinda-hard-to-tell meets your needs, then you can expect days of scheduling WebEx teleconferences and meetings and run-around with your “account rep” so that they can determine how much you’re worth and willing to pay, and then charge you a completely different amount than what they charged Bob next door for the same services and bits. Trying to extract answers like “it costs $X/user,” “the login starts working on MM/DD/YYYY,” and “the developer gets a demo account so you can find out whether this is even feasible” from these people seems to require a hammer in one hand and their genitals in the other. We both know that to add a new account, they’re pressing a button that says “they really bought into that ‘enterprise’ crap” and poof! a new account is created. Let’s quit pretending that the world’s carbon footprint increases ten-fold merely because we asked to be on the platform.

Trying to extract technical capabilities from these salespeople is nigh-impossible, too. I think part of the problem is that they seem to actually believe that the features they are promising really exist, when in reality I just need them to show me what the data dictionary looks like and how the session needs to be handled; then I can tell for myself whether or not my scenario is actually supported. Instead? I’m waiting on a “discovery session” teleconference with an “engineer” tomorrow.

Conclusions and Delusions

It has to be easier than this. No wonder there are so many not-invented-here software solutions in the world today: custom crap that barely works at home may well be better than generic crap that you have to waste hundreds of dollars’ worth of productivity and research time on before you even get it in your hands and realize that it is also crap, just with a maintenance contract.

If it takes a consultant to help people decide what software to buy, or which of your products is right for them, or whether or not your product even applies to their problem domain, then your marketing simply does not work.

Programmatically updating software deployed via Group Policy

At work, I’ve written a small application called the “Fulfillment Manager.” From a user’s perspective, it’s an extremely simple application. It shows the current order counts for all of the stores that we ship for, and if you scan a barcode, it figures out what store that barcode belongs to, determines if the order the barcode corresponds to needs to be packed or shipped, and prints out a receipt/packing slip or USPS/FedEx shipping label and supporting shipping documentation automatically. Most operations involve just scanning the barcode and pressing enter.


Yes, it epitomizes Battleship Grey. You love it.

But, behind the scenes, it’s not quite as simple as all of that. It’s aggregating order data from heterogeneous data sources–some in our legacy database, some in our new fulfillment system, some in a custom integration with a third party. It has to figure out which postage account to pay for postage with or which FedEx account number to use to ship a package. For orders that aren’t coming from our new fulfillment system, it has to “cleanse” the address against the current USPS address database. It has to figure out the cheapest way to ship a package, compute customs values correctly, generate certificates of origin and commercial invoices for international shipments, and determine what box types an order is allowed to be packed in. And it has to write shipment information back to one of those three disparate data sources.

What this means is that I’m frequently making adjustments and bug fixes to the application. And managing the deployment and installation of those bug fixes had been, up until now, a pain.

A brief interlude on ClickOnce

The other internal application that we use (“Undies Client”) for our long-running e-commerce store is deployed via ClickOnce, which is essentially the Java Web Start of the .NET world.

While ClickOnce is a neat technology and has its applications, to be sure, I probably wouldn’t use it again on Undies Client if I were starting that application over today, just as I decided not to use it for the Fulfillment Manager (which is an effort to divorce the processing and shipping features from Undies Client and make them simpler and applicable to multiple e-commerce stores).

First, there’s user confusion. If I deploy a Windows Installer MSI via Group Policy, then the application is magically there on all computers in the office. But for Undies Client, you have to go to a special Web page and click on a link. With ClickOnce, the installation happens per user, so employees can get confused if they go to another computer one day, log into their account, and see that the app isn’t there (“but Jennifer runs it on this computer so I thought it was already installed”).

Second, there’s deployment headaches. Like Java Web Start, you get a retarded warning if the deployment manifest wasn’t signed with an expensive code signing certificate. To mitigate that, you either buy one or start diddling with the self-signing certificate capability within the context of your own Active Directory domain. Not a show-stopper, and it makes sense, I guess, but it’s One More Thing that you have to deal with.

Third, when that certificate expires and needs to be renewed, you’re in for a world of hurt, because essentially all users will need to uninstall and re-visit the Web site download link and reinstall. Otherwise, the application simply stops seeing the newer updated versions and doesn’t update itself.

Fourth, the distribution of your app now has a dependency on an IIS installation somewhere, so that’s something else to maintain–both the configuration of that virtual directory in IIS as well as the shared drive to which Visual Studio dumps its files when clicking the “Publish” button.

Fifth, the installation can’t do much. Until recently, you couldn’t even create an icon on the desktop as part of the installation process. Nor can you do anything that would require elevated permissions, such as registering a COM DLL or installing some third-party dependency, the kinds of actions you might typically do when running an installer. So the Web page at which you download the app usually contains things like “ooh, be sure to install this, that, and the other first,” defeating the deployment simplicity of ClickOnce. And if you need to update one of those third-party dependencies and your app becomes dependent on that update, you have no way to update the dependency with ClickOnce, unless you take it upon yourself to have your application manage the upgrade during its next run. That’s just more work that you shouldn’t have to do.

After writing all of this, it may seem like I am saying that ClickOnce is a half-baked load of crap; it’s not half-baked. I’m saying it’s a fully-baked, complete load of crap. (Kidding. ClickOnce has its applications for applications that can be completely self-contained, but if at any point you become dependent on anything COM, it’s time to move on to real deployment technology.)

Using WiX to create a Windows Installer MSI file

I’ve blogged about WiX before. It’s a great open source tool put together by some guys who decided to write a reasonable mechanism for generating Windows installers because, for some reason, the Windows Installer team seems to think that editing database tables in a cheeseball editor called Orca is sufficient. This would be like saying that our warehouse workers could ship orders by updating data in a Microsoft Excel spreadsheet.

You could also pay lots of money for InstallShield or something similar, which would create MSI installers for you, but installation is a convenience for me–as a small business whose primary focus is not end-user software, paying for that doesn’t make much sense. There’s also NSIS, but, oh–I just threw up all over myself. We’ll save NSIS for another post.

Additionally, the WiX guys have realized that installers usually need to do useful things, like install certificates, set up Web sites in IIS, and run database scripts, whereas the Windows Installer team seems to have been trying to make writing custom actions harder, not easier, with their subsequent releases, because that’s where most of the crashes and problems in setup packages happen. With WiX, we now have a suite of well-tested custom actions that lots of people are using; this should have been the Installer team’s original response instead of depending on the community and third parties to fill in this gap for them, but it is what it is.

The point is that WiX enables a whole class of small business developers like me to build first-class deployment methods into their applications. With a 300-odd line XML file, the Fulfillment Manager now builds to an MSI file. And since WiX integrates with Visual Studio, I can compile that XML file into the MSI as part of my build process.
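
For what it’s worth, outside of Visual Studio that build step amounts to running the two WiX command-line tools against the authoring file; roughly something like this, with an illustrative .wxs file name and without the -ext flags you’d add for any WiX extensions you use:

REM Compile the WiX authoring, then link it into the MSI (assumes the WiX 3.x tools are on the PATH).
candle.exe Product.wxs
light.exe -out Skiviez.FulfillmentManager.Installer.WinForms.msi Product.wixobj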

Indeed, I’ve set up TeamCity and use it as a continuous integration server. Whenever I commit a change to Subversion, TeamCity picks up the change, compiles the solution, runs the tests, and, if they pass, copies the newly generated MSI file to a network share for potential deployment. It’s pretty sweet.

The missing piece of the puzzle, then, is actually getting this freshly baked MSI file onto all of the client machines in the office.

Deploying via the Group Policy Software Installation Extension

My first instinct was to use the Group Policy Software Installation Extension. This is the thingie where when you open up a group policy in the Group Policy Management Editor, you can drill down to the Software Installation thing under Computer Configuration, specify an MSI file for deployment, and (presto!) any computers linked to that GPO will install the MSI on next boot.

This worked swimmingly well for the first release of Fulfillment Manager.

We pause for another brief interlude on update strategies

Let me explain the design of the installation for a moment. I’ve written my WiX file such that each new MSI file that it generates is a “major upgrade” in the Windows Installer parlance–it has a different product code, a different package code, and a different version number (since my version numbers are a combination of an incrementing build number and the Subversion revision number). But they all have the same upgrade code and I schedule RemoveExistingProducts during the install.

This means that if you have an older version of the Fulfillment Manager on your machine and double-click the MSI file for a newer, updated version, you don’t have to do anything–the existing version is completely uninstalled and the new version is installed on top of it.

The Windows Installer has support for “minor upgrades” and “small updates”, but I can never keep the damn things straight. Can I add a new component? Can I change a file? Can I reorganize a feature? Do I really want to be thinking about this every time I press Build in Visual Studio? My application is small, so I think it’s far easier and more reassuring to just blow away the whole thing and install again during an update, starting with a clean slate each time. In fact, I think this is a reasonable approach for many reasonably-sized applications (Paint.NET, for example, does this) and only becomes a problem when you start getting really large (such as Visual Studio or Microsoft Word).
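
To make the contrast concrete: with the blow-it-all-away major-upgrade approach, moving a machine to a new build is just a plain install of the new package, whereas a minor upgrade requires the reinstall switches that I can never remember. Both file names below are made up for illustration:

REM Major upgrade: just install the new package; RemoveExistingProducts takes care of the old one.
msiexec /i FulfillmentManager-newbuild.msi
REM Minor upgrade or small update of an already-installed product: the "vomus" incantation.
msiexec /i FulfillmentManager-patched.msi REINSTALL=ALL REINSTALLMODE=vomus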

Getting the update out there

OK. So all I really need to do is get all of the client computers to run msiexec /i on the latest FulfillmentManager.msi and I’ll be good to go.

You might think that I could just overwrite the old MSI file on the network share with the new one, and the computers would notice the change at the next boot. But you would be, as I was, a fool: machines that already had the install would not notice the change, and machines that did not have the install would freak out because they could not find the correct MSI file. After posting my question on ServerFault, I discovered that the way the Group Policy Software Installation Extension works is by creating an advertisement script (*.aas file) and referencing that script via an object sitting in the Active Directory. That LDAP entry and the script file both do annoying things like reference a specific package code and product code, both of which I change with every new build of my software. So this method is out for the count.

Similarly, lugging out the Group Policy Editor and trudging down to the package entry and clicking “Redeploy application…” won’t work, for the reasons described above; worse, it’ll also break the machines that already have the software installed.

What works is lugging out the Group Policy Editor, trudging down to the package entry, clicking “Remove” and “Immediately remove”, and then adding the package right back. This creates an updated *.aas file and correspondingly updated LDAP entries, and the old LDAP entry is flagged with a “remove me now please” flag. This works, but there are three things that I don’t like about it:

  • It’s a manual process, so I have to remember to do it every time I create a new deployable build.
  • References to all of the old versions hang around by necessity, since it’s recording the fact that “hey, if I see this particular product code then I need to uninstall it.” Indeed, upon inspecting the SYSVOL share, I saw that there were about 45 such files sitting in there since development of this app started in early July.
  • There is no way to perform this process programmatically. (Well, there is, it’s documented as part of the EU anti-trust settlement, but let’s get real now: if it has LDAP in the spec then I’m not touching it with a ten-foot pole. Plus, this would be work that is totally tangential to my problem, which is automating a 1-minute task that annoys the crap out of me. At several days’ worth of work, it’d take me a long time to climb out of that time deficit to realize any savings.)

Nirvana: Automatic updating

The Software Installation extension for Group Policy can do a lot of things that I don’t need, like using patches or transforms, or shifting installed software by just moving a computer to a different OU, when at the end of the day all I really wanted was something that looked like this:

Must this be so difficult?


Since my installer will uninstall any previous versions, all I need to do is run the installation package via msiexec. This is easy enough to do via a batch script that I’ve configured to run at startup via group policy, since those startup scripts run as SYSTEM and have the necessary privileges to complete successfully.

You would think that in addition to /i and /x (and the idiotic /vomus), msiexec would have a somewhat useful parameter called, oh, I don’t know, /install-it-only-if-the-damn-thing-isn't-already-installed, but that would be a useful feature, so of course the Windows Installer team didn’t actually implement it.

Now, granted, there is actually no harm in running my installer when my app is already installed. It would just check that all the components are indeed installed and exit. But this still leaves a bad taste in my mouth. Not every MSI that I might want to deploy this way is guaranteed to behave so benignly. And if I have any custom actions, they’ll also get run on every boot, which seems like a waste.

While I could spelunk through the registry to try and see if I can find my current product code in the list, I’d rather not do that, because (1) that might not work consistently on all versions of Windows and (2) really, I’m just not that comfortable with batch scripts to begin with.

What I ended up doing was writing a little command-line program called msicheck. The source file is boring and decently commented. Because all of the Windows Installer APIs are in C, my utility is in C. I briefly thought about writing it in C# with some interop, because I hate C just that much, but that seemed like awful overkill for such a simple application.

Ah, C in Windows. I had long forgotten the days of Hungarian variable names that sound like Yosemite Sam cussing (“lpszPackageVer”) and the land of if ("swizzle" == "swizzle") returning false. Heck, I had forgotten the days of not even having false! I digress; the code is probably awful, but it seems to work well enough for my purposes.

> msicheck.exe /?
 
Determines if a given Windows Installer package is installed.
 
msicheck package
 
Exit Codes:
        0       Exact version installed
        1       General failure
        2       Path to MSI package not found or not accessible
        3       Error determining package status
        4       Newer version installed
        5       No version installed
        6       Older version installed

All msicheck does is take the full path to an MSI file as an argument and return an exit code that indicates the status of that particular package by looking at the product code and the version. (That is, if the product code is not installed, it returns 5; if the same product code is found, it compares the version and returns 0, 4, or 6.) I call it in my batch script (which, since I don’t do this for a living, is probably retarded) like so:

@ECHO OFF

REM ---------------------------------------------------------------------
REM VARIABLES
REM ---------------------------------------------------------------------

REM (Don't name this variable "path": that would clobber the real PATH.)
SET sharedir=\\SKIVIEZSBS2008\Group Policy Installations\Fulfillment Manager\
SET package=%sharedir%Skiviez.FulfillmentManager.Installer.WinForms.msi
SET msicheck=%sharedir%msicheck.exe

ECHO Checking installed version of Fulfillment Manager...

REM msicheck exits with 0 if this exact version is installed, 5 if nothing
REM is installed, and 6 if an older version is installed (see exit codes above).
"%msicheck%" "%package%"

IF ERRORLEVEL 0 IF NOT ERRORLEVEL 1 (
    echo Fulfillment Manager is up to date.
) ELSE (
    IF ERRORLEVEL 5 (
        echo Installing latest version...
        %SYSTEMROOT%\system32\msiexec /qn /i "%package%"
        REM "IF ERRORLEVEL 0" is always true, so test for failure instead.
        IF NOT ERRORLEVEL 1 (
            echo Fulfillment Manager successfully updated.
        ) ELSE (
            echo Errors may have occurred during installation.
        )
    ) ELSE (
        echo Newer version installed or error occurred.
    )
)

Keeping in mind that IF ERRORLEVEL 5 evaluates to true for error levels of 5 or greater, this means that I call msiexec only if the product isn’t installed or if an older version is installed.

Conclusions and Delusions

So, by using a batch script and my msicheck utility, I have things the way I want them. I commit a change; TeamCity builds it, runs the tests, and copies the resulting MSI to the network share that is referenced in my batch script; and the batch script runs at the next boot, uses msicheck to note that that package is not installed, and so runs msiexec to install the update (which removes the old version as part of the install process).

It’s certainly not applicable to every software deployment scenario, much like ClickOnce, but hopefully it’ll help someone out there who wanted to do something similar.

Download MsiCheck, which requires Windows Installer 4.5 to run. Complete with the “works on my machine!” guarantee of absolutely no warranties.

Good luck!

"The data area passed to a system call is too small" and other idiocies when installing a Zebra LP2844 printer driver

For reasons that would bore most to tears, I had to swap out a workstation at work today. This workstation had one Zebra LP2844 thermal printer attached to it, and along with the “new” (well, new-to-that-particular-desk, not new-to-the-world) workstation, it was getting another.

Simple! The “new” workstation already had the Zebra LP2844 printer drivers installed, so I’ll just Start > Printers > Add Printer > Yep, LPT1: > Mmhmm, ZEBRA EPL > (scrolling … scrolling … Why is this dialog box so small? … scrolling) ah, LP2844 > Continue Anyway (Why can’t a million dollar company get their drivers signed?) > Finish, insto presto, and….

Boom. “Printer Driver was not installed. The operation could not be completed.”

Hum.

I’ll try it again. This is an indication of insanity–expecting a different result from the same operation–but it’s 7:30 a.m. and I can’t be experiencing problems already. Come on.

Boom. “Printer Driver was not installed. The operation could not be completed.”

Okay, I’ll just download Zebra’s driver setup utility and install the printer that way. Next > Next > Add a printer > Next > Next, and….

Boom. “The data area passed to a system call is too small.”

Hum. That’s a new one. But different!

I know this printer works, why is this not working?

The Solution

After about 30 minutes of head-scratching, I stumbled upon the solution. These types of errors are apparently the result of corrupted or just plain buggy printer drivers that are already installed on the machine. They can be completely unrelated to the printer that you’re trying to install; it doesn’t matter. They’ll cause the error when the “add a printer” mechanism enumerates the list of installed drivers.

To fix it, it’s time to blow away some printer drivers. Go to Printers under the Control Panel. Click File > Server Properties and switch to the Drivers property sheet. Remove any suspicious-looking drivers and try to add your printer again. (For me, the problematic driver was a custom LP2844 driver that came on a UPS WorldShip disk.)
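
If you’d rather not click through the Control Panel, the same dialog (and the deletion itself) can be driven from a command prompt. The driver name in the second command is a placeholder; use whatever shows up in the Drivers list on your machine, and note that a driver won’t delete while a printer is still using it:

REM Open Print Server Properties directly on the Drivers tab:
rundll32 printui.dll,PrintUIEntry /s /t2
REM Or delete a specific installed driver by its model name (placeholder name shown):
rundll32 printui.dll,PrintUIEntry /dd /m "UPS Thermal 2844"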

Getting WINS-like computer name resolution over VPN in SBS 2008

So this week concluded several sleepless nights and much heartburn as I migrated Skiviez’s SBS 2003 machine (running as our domain controller and our mail server) to SBS 2008. As far as these things go, it went relatively smoothly, and the remainder of the week was spent dealing with lots of small niceties that I had forgotten I had set up on the 2003 server and now needed to set up once again.

One of these was something that I used for my convenience over a VPN connection from home. You see, the internal order processing application that I wrote uses some shared folders to store some temporary data, such as e-mails that are generated but not yet released to Exchange, or a local copy of images that are available on the Web site. This software–and our users–are used to referring to Windows file shares as \\COMPUTER-NAME\SHARE-NAME; for example, \\CYRUS\Pickup Holding, because for some reason some of the older servers are named after my boss’s dead cats.

When connecting through VPN to SBS 2008, however, that “suffix-less” name resolution was not working. So while \\CYRUS\Pickup Holding failed to resolve to anything, \\cyrus.skiviez.com\Pickup Holding would work fine. This was super annoying.

The reason this worked previously with our SBS 2003 installation is that it was acting as a WINS server, which provided this type of computer name resolution for us. SBS 2008 finally retires this ancient technology by default, however, so I had two choices: I could either install the WINS server role on SBS 2008, or I could just figure out how to get the 015 DNS Domain Name option from DHCP to relay through the VPN connection.

I chose the latter option, since it’s certainly less confusing to be able to say to someone in the future “we don’t use WINS, DNS does everything.” So here’s how to do it (with a quick client-side check after the list):

  1. On the SBS 2008 server, click Start > Administrative Tools > Routing and Remote Access.
  2. In the tree view, drill down past the server name to IPv4 > General. Right-click the General node, choose “New Routing Protocol”, and select DHCP Relay Agent.
  3. Now right-click the newly added “DHCP Relay Agent” node and choose Properties. Add the IP address of your DHCP server (which is probably your SBS server itself), and click OK. Then right-click the node again, choose “New Interface”, and add the “Internal” interface.
  4. Now if you connect through VPN, an ipconfig /all should show your domain name as a “Connection-specific DNS suffix” and pinging machines by their suffix-less computer names should work. (If it doesn’t, make sure your DHCP server is using that 015 DNS Domain Name option, which the SBS 2008 wizards set up by default.)
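
For the record, that quick check from a VPN-connected client in step 4 boils down to a couple of commands (CYRUS is one of our machine names; substitute your own):

REM Confirm the VPN connection picked up the DNS suffix, then resolve a bare computer name:
ipconfig /all | findstr /C:"Connection-specific DNS Suffix"
ping CYRUS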

Happy file sharing!