Archive for Technology

Removing duplicate rows in small batches based on a date column

Due to double scheduling, some duplicate rows were inserted into the Windows.TableStats table in JiMetrics. Here is the T-SQL script I used to confirm that the table has duplicates. A similar query should work in other major RDBMS platforms such as MySQL or Oracle, although the date-truncation syntax may differ (Oracle, for example, would use TRUNC on the date column instead of CAST). Adjust table and column names to fit your needs.

SELECT HostID,
       InstanceID,
       DbName,
       SchemaName,
       TableName,
       CAST (CollectionDate AS DATE),
       COUNT (*)
  FROM Windows.TableStats
GROUP BY HostID,
         InstanceID,
         DbName,
         SchemaName,
         TableName,
         CAST (CollectionDate AS DATE)
HAVING COUNT (*) > 1

So the duplicates need to be removed. In Microsoft SQL Server, that can be accomplished with a Common Table Expression (CTE) using the ROW_NUMBER ranking function, which is a pretty elegant solution.

However, when using DELETE to remove a potentially large number of rows, say millions, it is always advisable to do so in small batches. Otherwise, you risk running out of tempdb space, transaction log space, or even disk space, not to mention taking bigger and coarser-grained locks than necessary.
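
As an aside, a common generic pattern for batched deletes in SQL Server looks roughly like the sketch below; the table name, predicate, and batch size are placeholders, and the date-based approach I actually used for this cleanup follows further down.

-- Generic batched-delete sketch: the table, predicate, and batch size
-- are placeholders, not part of the JiMetrics cleanup itself
DECLARE @batchSize INT = 5000;

WHILE 1 = 1
BEGIN
   -- Delete at most @batchSize matching rows per pass to keep the
   -- transaction log usage and lock footprint small
   DELETE TOP (@batchSize)
     FROM dbo.SomeLargeTable
    WHERE SomeFlag = 'obsolete';

   -- Stop once a pass deletes nothing
   IF @@ROWCOUNT = 0
      BREAK;
END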

So I decided to remove the duplicates one day at a time to lessen the impact on the database instance. A @runningDate variable is incremented by one day until it reaches the desired end date, and within each iteration of the loop that day’s duplicates are removed. Tweak it to suit your needs.

I think problems like this make good job interview questions for a database administrator.

DECLARE @runningDate DATE;
SET @runningDate = '20140101';

WHILE (@runningDate < '20140406')
BEGIN
   -- Rank rows within each logical key for the current day; any row
   -- with RankID > 1 is a duplicate (the lowest RecordID is kept)
   WITH dupRows
        AS (SELECT RecordID,
                   ROW_NUMBER ()
                      OVER (PARTITION BY HostID,
                                         InstanceID,
                                         DbName,
                                         SchemaName,
                                         TableName,
                                         CAST (CollectionDate AS DATE)
                            ORDER BY RecordID)
                      AS RankID
              FROM Windows.TableStats
             WHERE CAST (CollectionDate AS DATE) = @runningDate)
   -- Deleting from the CTE removes the underlying duplicate rows
   DELETE dupRows
    WHERE RankID > 1;

   -- Move on to the next day
   SET @runningDate = DATEADD (DAY, 1, @runningDate);
END


WordPress post update services and Nginx upstream time out

Starting around last Friday, I think, this site became unresponsive. Here is a sample of what I saw when checking the Tengine/Nginx error logs:

2014/03/xx 08:43:40 [error] 3837#0: *6149 connect() to unix:/var/run/php-fastcgi/php-fastcgi.socket failed (11: Resource temporarily unavailable) while connecting to upstream, client: xxx.xx.xxx.xxx, server: haidongji.com, request: "POST /xmlrpc.php HTTP/1.0", upstream: "fastcgi://unix:/var/run/php-fastcgi/php-fastcgi.socket:", host: "www.haidongji.com"
2014/03/xx 08:45:13 [error] 3837#0: *6151 upstream timed out (110: Connection timed out) while reading response header from upstream, client: x.xxx.xx.xxx, server: haidongji.com, request: "GET /wiki/doku.php?id=wiki:dokuwiki HTTP/1.0", upstream: "fastcgi://unix:/var/run/php-fastcgi/php-fastcgi.socket", host: "www.haidongji.com", referrer: "http://www.haidongji.com/"
2014/03/xx 08:46:00 [error] 3837#0: *6159 upstream timed out (110: Connection timed out) while reading response header from upstream, client: xx.xxx.xx.xxx, server: haidongji.com, request: "GET /category/technology/mysql/feed/ HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fastcgi/php-fastcgi.socket", host: "www.haidongji.com"

So I started researching. A number of web pages, including Stack Overflow posts, recommended increasing the values of some or all of the following Tengine/Nginx parameters so that the timeouts would not occur:

  • fastcgi_read_timeout
  • proxy_connect_timeout
  • proxy_read_timeout
  • proxy_send_timeout

Those suggestions appeared reasonable, so I tried raising those parameter values one at a time. Their default value is 60 seconds; I raised them to 600 just to test.

In my case, the problem persisted. The only difference was that the site suffered a slower death: for the few minutes it managed to stay up, it was terribly, unacceptably slow.

At this point I’d spent hours debugging this. I decided to give myself a break and come back to it the next day. Before falling asleep, I wrote down a note to check some reputable forums that search engines may not necessarily index, such as Linode’s forums. Failing that, I was determined to read more documentation and code to get to the bottom of it.

It turned out that Linode’s forum provided the clue needed to finally resolve this. The problem lay in the WordPress update services feature. The idea behind it is to notify blog aggregators like Technorati and FeedBurner (remember them?) as soon as you publish a post. One of the aggregator sites listed in my WordPress settings had stopped accepting those notifications. Not knowing this, WordPress kept trying to send the notifications in vain, which caused the whole mess. After removing that URL, everything went back to normal.

On your WordPress admin page, you can find which URLs your blog pings by going to Settings -> Writing -> Update Services. Update: the same problem happened again, so I removed pingomatic.com as well. Now my WordPress installation does not ping anybody, and so far the site has been stable.

Hope this helps!

Lesson learned/reinforced:
When you have worked on a problem for a long time and are stuck, take a break, go for a walk, or find a friend to talk to and describe the issue and what you’ve done so far. You’d be surprised at how effective this can be.


Collecting Windows BIOS and Host Serial Number

After improving JiMetrics yesterday so that it tries to determine whether a host is a VM, I made another improvement today: collecting the host server’s BIOS-related information and serial number.

I don’t know about you, but in the past, when I needed to gather a Windows server’s BIOS version or serial number for troubleshooting, I typically rebooted the machine, pressed whatever function key was necessary to get into the BIOS, and copied things down. Wouldn’t it be nice if we could get that information programmatically and store it somewhere for easy reference?

So today I made another enhancement to JiMetrics to address that need. Here is what I did:

  1. Added additional columns to the Windows.Host table: SMBIOSVersion, BIOSReleaseDate, and SerialNumber (a sketch of the schema change follows this list);
  2. Improved the PowerShell collection script so that it now uses the Win32_BIOS WMI class to gather that information and store it in the JiMetrics database.
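
For reference, the schema change was roughly along these lines (the exact column data types in JiMetrics may differ):

-- Sketch of the Windows.Host schema change; the column data types
-- shown here are illustrative and may not match the actual definition
ALTER TABLE Windows.Host
  ADD SMBIOSVersion   VARCHAR(50)  NULL,
      BIOSReleaseDate DATE         NULL,
      SerialNumber    VARCHAR(100) NULL;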

Together with HardwareVendor and HardwareModel, this makes it easy to find the latest BIOS release for your server host. In addition, based on SMBIOSVersion and BIOSReleaseDate, you can determine whether your BIOS is out of date and decide whether to update it. In my experience, a lot of companies have servers whose BIOS is way out of date; armed with this information, they won’t be.
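
For example, a query along the lines below lists hosts whose BIOS release is more than five years old (the HostName column and the five-year cutoff are illustrative; adjust to your own schema and policy):

-- Hosts whose BIOS release date is more than five years old
-- (HostName and the five-year cutoff are illustrative placeholders)
SELECT HostName,
       HardwareVendor,
       HardwareModel,
       SMBIOSVersion,
       BIOSReleaseDate
  FROM Windows.Host
 WHERE BIOSReleaseDate < DATEADD (YEAR, -5, GETDATE ())
 ORDER BY BIOSReleaseDate;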

With this improvement, it is even easier for system admins to stay on top of things. JiMetrics is not just for SQL Server DBAs; a Windows admin will also find it extremely valuable. What are you waiting for? Go to jimetrics.com and get it :)


Determining if a Windows host is a VM in JiMetrics

I’ve been using my own SQL Server metrics collection package, JiMetrics, for a couple of years. It is easy to set up. All you need are two things: a SQL Server instance and an account that has admin access to both the servers and the instances you care about.

JiMetrics doesn’t do anything that would be hard to undo: no binary installs, no registry changes, no cookies or temp files to store state, none of the junk that pollutes your environment. Its footprint is small, and it just quietly collects important metrics data at an interval of your choosing. Follow the steps here and start today. As a system administrator, you will be glad to have the metrics data for analysis and decision making.

Today I made the database a bit better: I changed the database schema by adding a new column, IsVM, to the Windows.Host table. Here is the idea:

  1. IsVM tells you whether a particular Windows host is a virtual machine;
  2. IsVM can have only one of two values, “Y” or “N”, and is not NULLable;
  3. IsVM is a persisted computed column. The value is derived from the Manufacturer property of the Win32_ComputerSystem WMI class, which is stored in the HardwareVendor column;
  4. Currently, if the manufacturer is one of “microsoft”, “xen”, or “innotek GmbH”, then IsVM is “Y”; otherwise it is “N”.

Here is the relevant portion of the computed column definition:

	[IsVM] AS (CASE HardwareVendor 
			WHEN 'innotek GmbH' THEN 'Y'
			WHEN 'microsoft' THEN 'Y'
			WHEN 'xen' THEN 'Y'
			ELSE 'N'
		  END) PERSISTED,
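
If you want to apply the same idea to an existing table of your own, an ALTER TABLE roughly like this should do it, assuming the HardwareVendor column is already in place:

-- Sketch: add the persisted computed column to an existing Windows.Host
-- table; assumes HardwareVendor already exists
ALTER TABLE Windows.Host
  ADD [IsVM] AS (CASE HardwareVendor
                    WHEN 'innotek GmbH' THEN 'Y'
                    WHEN 'microsoft' THEN 'Y'
                    WHEN 'xen' THEN 'Y'
                    ELSE 'N'
                 END) PERSISTED NOT NULL;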

Enjoy! If you have any comments on either of these questions, I’d love to hear them:
1. What other values of Manufacturer in Win32_ComputerSystem indicate a VM, besides the three I listed above?
2. Are there better ways to detect whether a Windows host is a VM?


Setting up ownCloud

I’ve spent some of my spare time lately researching ways to get better control of my calendar, contacts, important files that I sync to different places, and so on. My objective is to move those things away from big cloud providers like Google, Microsoft, Dropbox, and Skype. I am happy to report that the open-source solution ownCloud provides just what I needed. I was able to set it up today, and the tests were successful. Here are some noteworthy points.

  • ownCloud provides clients for Windows, Mac, Linux, iPhone, and Android devices, which is really nice;
  • In my case, the only PHP component missing was the PHP GD module. After installing it, I restarted my VM and everything worked fine. For the database, I used MySQL;
  • Setup was easy. I did a manual install by following the instructions here. My web server is Tengine, a fork of Nginx. I used the sample Nginx configuration on the instructions page as a starting point, and it worked just fine;
  • ownCloud recommends using SSL, and I concur. I spent some time researching certificate options and decided to use a self-signed certificate. The reason is simple: the intended customers are me and my family, so I don’t need a Certificate Authority (CA) to tell me that I am using a site that I own :) Not forking out the cash is an added bonus. Note, though, that if you are running a commercial site that conducts transactions online, it is necessary to purchase certificates and enable SSL. That is not the time to be penny-pinching (or a 铁公鸡, an “iron rooster”, as the Chinese idiom for a cheapskate goes);
  • To generate my own self-signed certificate, I followed the instructions here. I used a 2048-bit key, and I followed the optional step 5 so that I could start Tengine without providing a password;
  • I think ownCloud will be great for companies and teams. If you have control of a web server, which most companies and teams do, then more than likely you already have the infrastructure for your own cloud. For a company or team, I do recommend purchasing SSL certificates, unless the team is pretty small;
  • Don’t forget to renew your self-signed certificate as its expiration date approaches. In my case, the validity period is one year, and I’ve already set a reminder for myself.
  • I’ve really enjoyed using dnsimple. Separating DNS from the hosting service gives me great flexibility in learning and testing out different web technologies.

Hope this helps. Let me know if you have comments/suggestions.

