It's a topic that comes up frequently for us here at UpGuard. Our customers are always keen to know how they can take control and simplify their configuration management processes. We've all experienced at some time or another that issue that was the result of a database migration that didn't complete, a column that has mysteriously changed data type or an old version of a stored proc or view being restored to a new database.
The worst bug is not the one that brings your site down, it is the bug that affects your data over time like a virus causing damage that is difficult to repair.
We have now introduced a new scan option to UpGuard which enables custom SQL queries to be written allowing customers to monitor database configuration drift without needing to create and monitor schema files.
You receive a call from the sales team informing you that the sales reports being exported from the production and QA systems are not matching up. They have manually trawled through the monthly sales figures and everything looks normal.
The replication from Prod to QA seems to have run without error for as long as the logs show. You inspect the reports and indeed the numbers just don't add up.
Something isn't right here.
Luckily, UpGuard is on hand to diagnose the problem.
Step 1: Scan A Node
Navigate to Manage | Nodes and click the View button for the node you wish to scan.
On the following page click the Scan button.
Step 2: Add SQL Queries To Scan
The SQL Queries section can be expanded as per below and allows for SQL queries to be added to subsequent scans. Here we are querying the sales table schema to establish a baseline for our table attribute data. Detailed instructions on how to setup windows database connections can be found here and for Unix/Linux/OSX database connections here. Click Scan Now to continue.
Step 3: View The Schema Differences
The results from our scan are shown on the following page. If we click the Select Node button on the left we can compare our scan of Prod to our problem QA machine.
Immediately, we can see a number of potential areas to investigate. The green sections indicate items that exist on the production machine but not on QA, the red sections indicate items that exist on the QA machine but not on production and the orange items indicate items that exist on both, but differ.
Let's switch to the table view for easier reading. We can click the Diff Only button to show only the items that have changed which allows us to zoom in and see only the important items.
Looking through this list, we can see a few things that stand out, but mostly things that we can disregard for this specific case. Although they do raise the question as to why these are different from our QA server. These servers are supposed to be identical, right?
We can dig deeper into that later. For now, lets get back to our problem at hand.
Down the bottom of that list, we can see something of interest. The test to compare our sales table schema has failed and the sales table attributes on our QA server have been modified.
Step 4: View The Output Differences
We've detected a change between our table schema on these two machines. That's already enough for us to go off, but in many cases these machines may have dozens of schema tables that you really can't be bothered sorting through.
We really need to just show the exact lines that have changed. UpGuard has you covered here, simply click the File Diff button.
The problem is staring us right in the face about now.
The amount attribute type from the problem QA server (on the left) has been changed to an int. The amount type from production on the right is a decimal.
This would definitely cause our mismatched figures and we know exactly how to fix the problem now.
Nodes are scanned daily by default on UpGuard and had we added this SQL query to the scan earlier, we could have gone back in time and determined exactly when this database schema changed occured.
We've detected a small but critical bug in our SQL configuration in a matter of minutes. I know from experience that tracking down this type of issue without a tool like UpGuard can take hours, if not days. A typical production database will have dozens of tables, triggers, stored procs and views that your applications depend on heavily. Keeping track of them across environments is no mean feat.
Using this scan option, you can detect changes in all of these and react accordingly.
We've really only scratched the surface here in terms of UpGuard capability. From here we could move onto to creating policies that will perform queries against the database every 15 minutes and ensure an expected output. We could also create security policies to ensure that our SQL Server is CIS compliant and be alerted if we fall out of compliance.
With UpGuard in your toolbelt the possibilities are endless.