Our MS-CRM deployment has a fairly simple, custom integration with our mainframe system. Nothing fancy: effectively just shunting XML files around to perform overnight and occasional ad hoc data syncs.
We farmed the coding out to a well-known implementation partner and received code which had been tested in their development environment. A little reconfiguration and we were good to go. The first few test files worked just great, so I started throwing some bigger files at it. Suddenly we were getting problems…
“The underlying connection was closed: unable to connect to the remote server.”
Strange – this hadn’t been seen in the development environment. The connection details were clearly correct too, since half of the batch had been processed OK. There was little else to go on. Now we all know, usually Google Is Your Friend, but on this occasion all we could find were details of generic ASP.Net web service development issues, almost exclusively caused by firewall and proxy problems – and again, that couldn’t be the case here, since the batch was partway complete.
The developer was at a loss – he had done nothing out of the ordinary. We noted that the service would fail at roughly (though not exactly) the same point each time, so his only suggestion was to introduce a forced delay between each call to the CRM web services. This was far from satisfactory, but fundamentally it worked (and we’d already exceeded the time allocated to this simple service), so that was that.
Unfortunately I’m a persistent little bugger, so I couldn’t rest knowing there was an issue out there that we hadn’t identified! What seemed apparent was that this network issue was not at the transport layer or lower. I’m not a hardcore developer, but I remembered that network ports can be opened/allocated on a dynamic basis. Don’t berate me, but I believe these are what’s meant by “ephemeral ports”. Suffice to say, this seemed to fit the scenario: each web service call opens a new client connection, and when that connection closes it lingers in the TIME-WAIT state for several minutes before its port can be reused. Fire off enough calls in quick succession and the server simply runs out of dynamic ports (or connections) to allocate to our batch service.
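If you want to confirm this theory on your own server, a quick sanity check (from a Windows command prompt, while the batch is running) is to count the sockets stuck in TIME-WAIT – I’d expect the number to climb into the thousands just before the failures start:

```
REM Count connections currently in the TIME_WAIT state
netstat -an | find /c "TIME_WAIT"
```

If that count approaches the size of the dynamic port range (roughly 4,000 ports by default on Windows 2000/2003), port exhaustion is almost certainly your problem.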
Anyway – I guess that’s mostly just hot air: if you’re reading this post, there’s a good chance you’re having the same issue. The cause is acknowledged in Microsoft’s KB article KB913515, but unfortunately you’ll need access to PartnerSource or CustomerSource to view it.
The fix is a couple of changes to the registry to amend the machine’s TCP/IP configuration. If you can’t access the KB article above, Microsoft’s publicly available documentation on the TCP TIME-WAIT delay (Win2k version here) explains why you might need to amend the TcpTimedWaitDelay and MaxUserPort keys in the registry.
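For reference, both values live under the Tcpip\Parameters key. The sketch below uses the commonly recommended settings (shorten TIME-WAIT to its 30-second minimum and raise the top of the dynamic port range to its maximum) – check the KB article for the values Microsoft suggests for your exact scenario, and note a reboot is needed for the changes to take effect:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
; Seconds a closed connection sits in TIME-WAIT (valid range 30-300; default 240)
"TcpTimedWaitDelay"=dword:0000001e
; Highest port number usable for dynamic allocation (default 5000; max 65534)
"MaxUserPort"=dword:0000fffe
```

(0x1e is 30 decimal, 0xfffe is 65534.) Save that as a .reg file and double-click it, or set the same two DWORD values by hand in regedit.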
Please comment and let everyone know if this helped 🙂