Category Archives: devops

A Condemnation of Invoke-SqlCmd

Recently, Mike Fal penned a post defending the Invoke-SqlCmd cmdlet. This is a response to that post. (Update: Apparently Mike Fal’s post was in response to a post by Drew Furgiuele that was also critical of Invoke-SqlCmd.) I want to give Mike proper credit: he states multiple times that Invoke-SqlCmd is not the right tool for every SQL-related job. Kudos for that. However, in this post I’m going to make the argument that Invoke-SqlCmd is the wrong tool for any job, period.

I believe that Invoke-SqlCmd is not only deficient, but also potentially dangerous to use in production. Let me immediately clarify: I’m not saying the cmdlet is buggy. I’m simply saying that the way it’s intended to be used encourages bad coding practices that, at the very least, can produce unintended results, and at worst can introduce serious security issues.

The root cause of all the issues with the Invoke-SqlCmd cmdlet is that it’s the classic sqlcmd.exe utility with some Powershell hammered onto it. As a result, most of the patterns and conventions used for cmdlets are thrown out the window in order to maintain backwards compatibility with the base executable. Here’s a great example:

PS c:\> Invoke-SqlCmd -ServerInstance "MyServer\MyInstance" -Database "db1" -Username "ctigeek" -Password "whyAmIPassingInAPassword???" -Query "select * from sometable1"

Now I challenge you to find one other cmdlet, published by Microsoft, that accepts a clear-text password on the command line. Any other cmdlet that needs this information takes in a Credential object. Another example is how it uses the AbortOnError switch instead of the standard ErrorAction parameter. All of these proprietary parameters are intended to maintain parity with the underlying executable. The MSDN documentation even itemizes which options from sqlcmd.exe are included and how they map to the powershell cmdlet. I find all this completely intolerable. The standards by which Powershell commands are built should not be bent (or in this case, flat-out broken) in order to conform to the underlying executable. If DBAs want to use sqlcmd.exe, then they can use it directly without this layer of obfuscation that doesn’t appear to add any real value.
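
For contrast, here is a sketch of the conventional pattern: credentials travel as a PSCredential object, never as a clear-text string. The prompt and the unwrapping below are illustrative only.

# A sketch of the conventional credential pattern. Get-Credential builds a
# PSCredential without the password ever appearing on the command line.
$cred = Get-Credential -UserName "ctigeek" -Message "SQL login"

# A conventional cmdlet accepts the object directly, e.g.:
#   Invoke-Command -ComputerName MyServer -Credential $cred -ScriptBlock { ... }

# Invoke-SqlCmd forces you to unwrap it back into clear text:
Invoke-SqlCmd -ServerInstance "MyServer\MyInstance" -Database "db1" `
    -Username $cred.UserName `
    -Password $cred.GetNetworkCredential().Password `
    -Query "select * from sometable1"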

Sqlcmd.exe was originally designed for DBAs to execute DDL and other scripts related to the administration of the database. Yes, it can also return data, but it was built to generate the "flat" static-font reports you see printed out in any 80s TV show that had a computer in it. This is how most db reports were generated up until relatively recently. My point is this: sqlcmd was never intended to query data in the kind of environment in which it is now used with Powershell. In other words, it is completely lacking in the features necessary to be a useful and safe tool in a modern production application.

  1. It doesn’t support ANY standard cmdlet parameters. I would expect any cmdlet published by Microsoft, especially one that interacts with a database, to have ErrorAction, Verbose, WhatIf, and Confirm. If it can consume user IDs and/or passwords, I expect it to accept a Credential object.
  2. It doesn’t support non-queries or scalar queries. If I execute an update or delete using Invoke-SqlCmd, I have no idea how many rows I just affected. If confirmation is a requirement, I then have to execute a second, possibly expensive, table-scanning query to confirm.
  3. It doesn’t support transactions. I can’t overemphasize how big a deal this is. This is one of those things people don’t think about because it’s not even available. The only way to do transactions with Invoke-SqlCmd is to have the entire transaction inside the SQL command being sent. That usually isn’t practical, and it sometimes means the script has to dynamically generate the SQL with all the correct commits, rollbacks, and checks around them. That’s insane. And it completely prevents the Powershell script from doing other things within the scope of the transaction. Did you know that Powershell supports distributed transactions? Yup. You can start a transaction, make changes on multiple database servers, change a file, a registry key, and put a message into an MSMQ queue, all as part of one transaction. But Invoke-SqlCmd doesn’t support that. (A sketch of what a Powershell-native transaction looks like follows this list.)
  4. Support for SQL parameters is awful. This is the dangerous part I was alluding to earlier. How bad is it? The variable and the value are defined in a single string, e.g. "var1='val1'", and you have to escape single quotes in the value. This means that if the values are already in proper Powershell variables, you have to escape each value, build strings, put those into an array, pass that into …. never mind. It’s an unusable mess, and that’s the point. Anyone with 5 minutes of Powershell experience will just build the SQL using string concatenation based on the function’s input parameters, and of course that’s begging for a whole litany of security issues. If proper SQL parameters were supported, people would use them. (See the sketch after this list.)
  5. It only supports Sql Server. Do you really only EVER work with Sql Server? Ever? If so, good for you, but in 20 years that’s never been the case for me. The great thing about SQL is that it’s (mostly) universal. I want a Powershell command that works the same way, with the same options, across any relational database.
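
To make the transaction point concrete, here is a minimal sketch of a Powershell-native transaction of the kind Invoke-SqlCmd can’t participate in. It uses the registry provider, which supports transactions in Windows Powershell (the feature was dropped in Powershell Core), and the key path is hypothetical.

# A minimal sketch of a Powershell transaction (Windows Powershell only).
# The registry key path is hypothetical.
Start-Transaction

# Transacted provider operations join the ambient transaction:
New-Item -Path "HKCU:\Software\MyApp" -UseTransaction
Set-ItemProperty -Path "HKCU:\Software\MyApp" -Name "Version" -Value "2.0" -UseTransaction

# A database change could enlist in the same distributed transaction via ADO.NET,
# but there is no -UseTransaction (or anything like it) on Invoke-SqlCmd.

Complete-Transaction   # or Undo-Transaction to roll everything back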
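
And to show just how awkward the parameter support is, here is a sketch of the -Variable syntax as I understand it (the variable name and value are made up), next to the string-concatenation shortcut people inevitably fall back on:

# The -Variable syntax: each entry is a single "name='value'" string, and any
# single quote inside the value has to be escaped by hand.
$firstName = "O'Brien"
$escaped = $firstName.Replace("'", "''")
$params = @("firstName='$escaped'")

Invoke-SqlCmd -ServerInstance "MyServer\MyInstance" -Database "db1" `
    -Query 'select * from sometable1 where first_name = $(firstName)' `
    -Variable $params

# The shortcut most people take instead -- a textbook SQL injection waiting to happen:
Invoke-SqlCmd -ServerInstance "MyServer\MyInstance" -Database "db1" `
    -Query "select * from sometable1 where first_name = '$firstName'"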

So what’s the harm if Invoke-SqlCmd supports your current needs? The problem is that your needs tomorrow will be different. Your code is going to change, and sooner or later you’re going to hit the much-closer-than-you-think wall of usefulness. So what’s the alternative? Lots of .NET coding?
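
To be fair about what "lots of .NET coding" means, here is roughly what the raw ADO.NET route looks like from Powershell. It’s a sketch only: the server, database, and table come from the earlier example, the columns are made up, and it assumes integrated security.

# A sketch of raw ADO.NET from Powershell: real SQL parameters, a transaction,
# and an affected-row count. Server, database, table, and columns are illustrative.
$connString = "Server=MyServer\MyInstance;Database=db1;Integrated Security=True"
$conn = New-Object System.Data.SqlClient.SqlConnection $connString
$conn.Open()
$tran = $conn.BeginTransaction()
try {
    $cmd = $conn.CreateCommand()
    $cmd.Transaction = $tran
    $cmd.CommandText = "update sometable1 set status = @status where id = @id"
    [void]$cmd.Parameters.AddWithValue("@status", "archived")
    [void]$cmd.Parameters.AddWithValue("@id", 42)
    $rowsAffected = $cmd.ExecuteNonQuery()   # a real affected-row count, unlike Invoke-SqlCmd
    $tran.Commit()
    "Updated $rowsAffected row(s)."
}
catch {
    $tran.Rollback()
    throw
}
finally {
    $conn.Close()
}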

Two years ago I realized all this and started the OSS project InvokeQueryPowershellModule. It’s written in C#, based on ADO.NET, and supports all of the functionality above and more. There are lots of examples in the Readme, and it’s available from the Powershell Gallery now. I’ve been using it in a production environment for over a year with no issues. I strongly encourage you to consider it as a superior alternative.

Persistent Connections and F5 iRules

At Rackspace Cloud Office we rely on the amazing power of F5’s BigIP network devices for most of our application routing. In case you don’t know, BigIP (usually referred to simply as ‘F5’) is an application gateway device that can perform a variety of functions that bridge the gap between your network and the application. In our case, we use the F5 to route traffic to a variety of web applications, but really it can be any type of network application. A user’s HTTP request is routed into our data center and terminates at the F5. The F5 owns the resolvable public IP for our URLs, and it also owns the SSL cert for the domain. The F5 then “reverse proxies” the traffic to an internal server. This may be a single server, but more likely it’s one of multiple servers in a pool (i.e. a web farm), and the F5 will round-robin traffic to those servers.

The F5 can also execute user-created scripts, called iRules, for each request, allowing you to make intelligent decisions based on data in the request, e.g. the User-Agent header or the path in the URL. One of our applications uses an iRule to select a destination based on the path in the URL: one destination is a pool of servers running an ASP.NET application, the other is a pool of servers that simply serve static files. The iRule is executed for every HTTP request and if, for example, the path starts with `/static/`, then the static-content pool is selected for the request.
Here is an example of an iRule that works as described.

when HTTP_REQUEST {
  if { [string tolower [HTTP::uri]] starts_with "/static/" } {
     pool pool-static
  }
}

If no pool is selected by the iRule, then a configured default pool is selected, which for this example would be the application-server pool. Once the pool is selected, the F5 then selects a server from the pool and forwards the HTTP request to the server, and the response from that server is sent back to the requester. Simple enough.

But what about persistent connections (a.k.a. Keep-Alive)? Think about that for a second and you can see how persistent connections could throw a few wrenches into how this should work. Just to review: the HTTP 1.1 protocol declares that the underlying TCP connection should stay open until the client or server closes it. This saves a lot of time, especially since a client can make multiple calls to the server to fetch sub-resources for a page. In this day and age of SSL-everywhere, persistent connections are a godsend. (Side note: HTTP/2 goes further and multiplexes multiple concurrent streams over a single persistent connection!) Thankfully the F5 handles HTTP persistent connections quite well, but in order to stay out of trouble you need to know how it works under the hood. More on that in a minute.

Quick digression: the interesting part about how application gateways work is that they are sometimes acting as a true network device, providing a NAT into or out of a network. This is what a consumer wifi router does to provide Internet access to your home network, and quite often that exact same functionality is used to provide internet access to an isolated application network. However, there are other times when an application gateway is simulating the job of a network device. This is the case as outlined above: incoming HTTP requests from the client are terminated at the F5, including any encryption via SSL. Then, once a pool and destination server are selected, a totally separate TCP connection is created, on a different (internal) network, from the F5 to the destination server. Once the second connection is up, a new HTTP request is created that is basically a copy of the incoming request, and sent to the destination server. On the return trip, the response from the server is copied to create an HTTP response back to the client to fulfill the original request. This sounds like a lot of work, and it is! (F5s manage to do a surprising amount of this work at the hardware level.) But the net effect is that it appears as though the F5 is routing the HTTP request at the network layer; however, routing is a layer 3 function, and the F5 is operating up at layer 5/6. A more correct way to think about it is that it’s proxying the request, but even that, as we will see, isn’t entirely accurate.

So what happens to our simulated request routing when we have persistent connections? At this point in the discussion it’s not too complicated: as long as the TCP connection from the client stays open, the F5 will keep the second TCP connection to the internal server open. Any subsequent HTTP request that comes in over the existing external connection will be proxied over the existing internal connection. The F5 keeps a map of which external connections map to which internal connections, and internal connections are never re-used across different external connections (i.e. no connection pooling), due to obvious security concerns. If the client closes the external connection, then the corresponding internal connection is also closed. I’m almost positive the opposite is true as well: if the internal server closes its connection, the external connection to the client is closed too. (I haven’t had time to verify this, but I can think of some serious security concerns if it weren’t the case.)

Now let’s take it a step further: what if we have an iRule that can split incoming requests between two pools of servers? What if, for every request, an iRule has to determine whether we are proxying to an application-server pool or a static-content server pool? What happens to the persistent connection? This is where the F5 has to behave in a way that is transparent to the client, but that may have an impact on how the request is routed to an internal pool.

Here’s where I have to put a great big disclaimer up: I’m not an expert at F5 routing, nor have I had time to exhaustively research this. What I’m about to state is based on circumstantial evidence from troubleshooting this issue for the past couple of days. I’ll update accordingly if an expert tells me I’m wrong, which is likely. 🙂

Let’s take it step-by-step:

  1. A client creates a connection to the F5 and makes an HTTP request.
  2. The F5 runs an iRule which explicitly selects an application pool server, which fulfills the request. The internal connection is left open and is mapped as the “persistent connection” to the external connection.
  3. Using the existing external connection, the client makes another HTTP request for static content.
  4. The F5 runs the same iRule as before, which explicitly selects a static-content pool server, which fulfills the request. This new internal connection to the static-content pool server is separate and distinct from the internal connection created in step 2. Because it is the internal connection used for the most recent request for this external connection, the F5 now maps the internal connection to the static-content server as the “persistent connection” to the external connection. The other internal connection to the application server is no longer mapped as the “persistent connection”.
  5. The client makes a third request, again over the existing external connection.
  6. Again, the F5 runs the same iRule, but this time no pool is explicitly selected. Normally that wouldn’t matter, because the endpoint always has a default pool that is used whenever the iRule doesn’t pick one. But this scenario isn’t normal: there is already an established, mapped internal connection, and when one exists the default pool is not used. The F5 keeps using the existing mapped internal connection unless the iRule explicitly selects a different pool. (A client-side sketch of this sequence follows.)
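
Here is a client-side sketch of that sequence. The host name and paths are hypothetical; .NET’s HttpClient keeps the underlying connection alive across sequential requests to the same host, which is exactly the situation described above.

# A sketch only: host name and paths are hypothetical. One HttpClient instance
# reuses a single keep-alive connection for sequential requests to the same host.
Add-Type -AssemblyName System.Net.Http
$client = New-Object System.Net.Http.HttpClient
$base = "https://myapp.example.com"

# Request 1: no /static/ match, served by the default application pool.
$client.GetAsync("$base/app/home").Result.StatusCode

# Request 2: matches the iRule, served by the static-content pool. That internal
# connection is now the mapped "persistent connection".
$client.GetAsync("$base/static/site.css").Result.StatusCode

# Request 3: no /static/ match, but the F5 reuses the mapped connection to the
# static-content server instead of the default pool -- hence the surprise 404s.
$client.GetAsync("$base/app/other").Result.StatusCode

$client.Dispose()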

This is why I said that proxying isn’t exactly an accurate description. Application gateways have to be far more intelligent about how they handle things. It’s not just forwarding bits and it’s not just a store-and-forward algorithm. It has to track state and make some unique choices about how it implements the HTTP standard, but do it in a way that is compatible with the usability and flexibility offered by iRules and other configuration.

I had to figure this out when the web application in our dev environment started returning random 404s. After some investigation we determined that the requests returning 404 had been routed to the static-content server. This was very odd because the URLs for these requests didn’t match the criteria in the iRule for that pool. After quite a bit of digging (i.e. wireshark) we found that the requests were going to the static-content server because a persistent connection to that server already existed. The iRule wasn’t explicitly selecting the application-server pool at the end; we didn’t think we needed to, since that was the default pool for this endpoint. However, the default pool wasn’t being used because there was already an existing mapped connection. When we changed the iRule to select the default pool explicitly at the end of the script, the random 404s stopped occurring. Example:

when HTTP_REQUEST {
  # requests for static content go to the static pool...
  if { [string tolower [HTTP::uri]] starts_with "/static/" } {
     pool pool-static
  } else {
     # ...everything else explicitly selects the application pool, even when a
     # persistent connection to a static-content server already exists
     pool pool-application
  }
}

TL;DR flowchart:
[iRuleFlowchart image]