Analyzing low performance SQL code

As an independent consultant and trainer , I found myself doing a lot of existing code analysis and enhancement, mostly for stored procedures and ad-hoc SQL statements. I suspect a lot of people do the same thing as well, so I am really interested in learning how you do it. That’s the main purpose of this post. This post is tagged with Oracle, Sql Server, and MySQL, as the principals should be the same for all platforms.

Let me share with you how I do it. Notice that I look at table/column statistics and indexes in almost all the steps below. Therefore I purposely left them out in the discussion.

1. I always talk to the original author or the current owner of the code, asking him/her to walk me through it. I listen mostly, trying to understand why s/he writes code this way. I may ask a few questions, just to help me understand. I almost never do any lecturing at this stage. If the code is a stored procedure, I ask the author or stakeholder to give me sample execution statements, with parameters filled with typical values for production load. Sometime it is not possible to talk to the original author, because s/he may have left the company;

2. After that, I will get the database in question and restore it on my own workstation, with the data as close to production as possible. I will start running the code, and gather IO statistics and execution plan, and save the results for benchmark comparison later on;

3. Here is the raw code analysis stage, where code is put into an editor for analysis. In my case, it is VI or VIM. I find myself using the * command on all variable names and tables (make sure you are doing case insensitive search for Sql Server code), to see where they are used. This is also where I weed out the historical garbage code that is there but never being used anymore, such as unused variables, tables, etc. In the case of Sql Server, I mostly look for cursors, temp tables, and user defined functions;

4. Code simplification stage. I found a lot of bad performing code is unwieldy and unnecessarily complex. At this stage, I look for redundant WHERE clauses, convoluted AND and/or OR operators. This can be a tedious and time-consuming process. At this point, it is important to talk to the author or owner of the code, as a good understanding of data and entity relationship is crucial. Also, since I have gained some understanding from the steps above, it is easier to have an intelligent conversation with the code owner, with fresh perspective and momentum. Usually something good will come out of the conversation, and we will have something concrete to work on next;

5. I will make the changes discussed in step 4, once again, collect statistics and execution plan for comparison;

6. Repeat Step 4 and 5 a few times, when necessary. I usually start talking about better database design, code/design refactoring, set based operations, etc., as hopefully I have some credibility and gained the customer’s trust.

So what do you do to break down complex, low performance SQL code? What methodology, tools, tips, tricks you can share with me?

Migrating from one RDBMS to another

Here is some of my thoughts on migrating MySQL to Sql Server. It came out of an email discussion. I’d love to hear your thoughts on migrating to a different database platform, not just MySQL to Sql Server.

I actually thought about writing a white paper or even a course on migrating from MySQL to Sql Server, but never got the time to do it. Sometimes a project doing similar things can serve as a launchpad for this endeavour, but that never came along, at least not yet. I am very interested in database interoperability field though. I’ve done MySQL and Oracle admin in the past and have published some MySQL and Oracle stuff in blogs. I have much better technical skills on Sql Server than any other RDBMS platforms, primarily because I’ve worked on it longer.

Here are some of my thoughts. I think most of it applies equally on migration from Oracle, DB2, Sybase, Postgresql, etc., to Sql Server, or the other way around. It might be slightly easier to migrate from Sybase to Sql Server, considering their common root.

1. It is not easy to migrate existing app, unless the app is a simple one. Even for that, there are enough quirks that can throw people off and cause enough frustration to derail the whole projects. I’ve seen that happening twice, having engaged in moving 2 apps from Sql Server to Oracle;

2. Therefore, the best way to migrating to a new database RDBMS, in my opinion, is to start from a new initiative, probably not big initially. When you start things from a clean slate, you don’t have the historical garbage to worry about. Furthermore, you will give the team enough time to learn the new platform, and prepare the team for future migration, if you choose to do so;

3. Having open-minded team members is crucial to a migration project’s success. Too often people have emotional attachments to the platform they are familiar with, possibly out of job security concerns and lack of general curiosity toward new things.

I generally adopt a platform agnostic attitude, and don’t get religious and too carried away on the platform I work on. Having said that, I think these are points that marketing people can spin for persuasion purposes:

1. MySQL has too many storage platforms: MyISAM, InnoDB, MaxDb, and the newly introduced Maria. This can be viewed as a plus, as it provides choice. The downside of it is that it causes confusion for end users;

2. MySQL’s support for relational model is fairly recent. For example, for a long time, MySQL didn’t support Stored Procedures, Views, Triggers, Foreign Keys, etc. One could argue that MySQL is not mature in this area since it is new for them, but I think it is difficult to find evidence to substantiate that claim. Also, running the risk of offending some people, I think the importance of relational model got overblown a bit;

3. Sql Server offers the CLR integration. This can be a great selling point;

4. Sql Server offers tight integration with Visual Studio, Windows network, and all other things Microsoft. This is a huge advantage.

5. Too many people find *nix environment intimidating. Although MySQL works on Windows, but the perception in the marketplace is MySQL works better on *nix.

As far as migrating MySQL to Sql Server in a hosting web environment, my honest opinion is Sql Server will be fighting an uphill battle, because MySQL excels in this arena, especially for small and medium-sized, or departmental organizations, with the proliferation of such LAMP app like blogs, wikies, discussion boards, etc. I believe Microsoft’s weapon of choice in this arena should be SharePoint. Given Microsoft’s clout, it is certainly a battle worth fighting.

Oracle 10g Express Install Part 2

After my last failed Oracle 10g Express install due to insufficient memory, I purchased some memory from eBay, and started it over.

This time, the first thing the installer noticed was that the swap file is not big enough. I fixed that by doing a total re-install of Fedora Core 2. Yeah, I know that was kind of lame, but I got a few machines with me, so I didn’t waste time waiting for it.

Started again, this time the install went through, but I got some error messages at the end. The message was “groupadd: unable to lock group file”. The problem was that there were passwd.lock and gshadow.lock files under /etc. I then removed those files.

I tried to reinstall the RPM package. The message I got back was that it was already setup. I then ran:

rpm -e oracle-xe-10.2.0.1-1.0.i386.rpm

to remove it.

Finally, I was able to install it correctly.

It was a good exercise. I’ve done some Oracle database administration stuff, but never installation and setup stuff. I would like to bring my Oracle skill to the next level, so this is a good way to get started. 11g will come out within a week or so, but hopefully the skills I learned here can be transferred. I will give 11g a run when it comes out.

A few things learned:

1. yum is worth looking into. It is probably a better way to install or update dependencies than compiling binary or doing manual rpm install;
2. Got to be careful when typing. During profile setup, when I tried to put . /usr/lib/oracle/xe/app/oracle/product/10.2.0/server/bin/oracle_env.sh into /etc/profile, I missed the blank between the dot and forward slash. This shell script sets up $ORACLE_HOME and $ORACLE_SID for you;
3. Here is a good site for 10g Express setup on Linux.

Problems with Oracle Migration Workbench

Database vendors like to bash each other, sponsoring dubious “objective industry studies” to “prove” they are better than others. All of them do it. In my opinion, Oracle is particularly bad in this regard, compared against IBM DB2 or Microsoft Sql Server. Talking about “unbreakable Oracle” and software full of bugs, and in many cases you need to pay the Metalink membership to simply get to know that the issue you are dealing with is a bug. That is not to say Sql Server and DB2 don’t have bugs in their software, though.

Each of them wants you to convert your database to their platform. To that end, they provide some program to help you along. Oracle has a thing called Oracle Migration Workbench. Sql Server has something similar called Sql Server Migration Assistant.

More often than not, marketing people from those companies will tell you how great their migration program is. That it is easy to migrate, you will see performance improvement, etc. Don’t believe them.

A couple of months ago, I worked on a project to migrate a database from Sql Server to Oracle. This database has a lot of stored procedures, user defined functions, and linked server stuff. I talked to Oracle people, and they recommended their own Migration Workbench, with Sql Server plug-in. I downloaded them and started working.

Basically, the Migration Workbench tries to go through database code in T-Sql. For code that uses built-in T-Sql functions, the Migration Workbench creates functions with the same name in Oracle, and try to writes something similar in PL/Sql that makes an attempt to do what T-Sql function does. I didn’t look too deep into this, but I am suspicious at all of them. The reason I didn’t look too deep into them was that the whole effort was derailed by a bigger problem. Let me explain.

In T-Sql, all variable names start with @ symbol. In fact, in many places, variable names are simply column names prefixed with the @ symbol. This can be pretty easy to read. And it actually works very well.

However, PL/Sql variable names don’t follow that convention. So, as Migration Workbench goes through T-Sql stored procedures, it simply strips off the @ symbol from the variable names. This basically renders all the code useless.

This is just very, very dumb. Oracle is a multi-billion dollar company, and you would think they should know this. Instead of stripping off the @ symbol, it could replace it with some kind of prefix. But it does not do that.

Another problem is with the identity field. Once again, one would expect that the Migration Workbench converts it to Oracle sequence, but it doesn’t. It changes that to NUMBER.

Eventually, the project was canceled. The moral of the story: migration from one RDBMS to another is not as easy as it sounds. It is doable. Sometimes it is probably easier to just write everything from scratch.

Failed Oracle 10g R2 Express Linux Installation Notes

My first ever computer, the 10-year old HP Pavilion 4455 desktop, purchased after I finished my master degree in Economics, has seen very limited action in the last few years. My better half had suggested that I donate it away, since we really don’t have enough space to put it, but I refused, because I always thought I would use it in the future. Now I am working on my own, so I will need some spare machines to mess around with as my sandboxes. I finally got some time in the last two days to show it some love.

I decided to load Linux on it. Neither Ubuntu 7.04 nor Ubuntu 6.01 worked. (They might work better for newer desktops, I just don’t know) I have Fedora Core 2, so I installed that. I know the latest Fedora Core is version 6. I am kinda old school in this regard. Anyway, Fedora Core 2 worked on this computer.

I then decided to load Oracle 10g R2 on it. Here is a brief note of my process. It failed, because this machine has 160 MB of memory, not reaching the 256 MB required. I think I will purchase some memory tomorrow and try again.

1. Verify dependencies

[root@localhost haidong]# rpm -i oracle-xe-10.2.0.1-1.0.i386.rpm  --test
error: Failed dependencies:
        libaio >= 0.3.96 is needed by oracle-xe-10.2.0.1-1.0

2. Downloaded libaio source. libaio is Oracle’s library for asynchronous IO. configure failed, therefore I couldn’t compile and install it. Below is the last output on the screen.

                                                                                                                             
ecking for libaio >= 0.3... no
checking for libaio 0.1... no
configure: error:
*** libaio is required.

3. I didn’t bother to look through the configure log file. Instead, I searched for “rpm libaio fedora core 2” and got this link:

http://download.fedora.redhat.com/pub/fedora/linux/core/2/i386/os/Fedora/RPMS/

Downloaded the package for libaio.

                                                                                                                             
[root@localhost haidong]# rpm -ivh libaio-0.3.99-2.i386.rpm
warning: libaio-0.3.99-2.i386.rpm: V3 DSA signature: NOKEY, key ID 4f2a6fd2
Preparing...                ########################################### [100%]
   1:libaio                 ########################################### [100%]

4. Install. It failed. I will try again, if/when I get more memory for this old machine.

[root@localhost haidong]# rpm -ivh oracle-xe-10.2.0.1-1.0.i386.rpm
Preparing...                ########################################### [100%]
Oracle Database 10g Express Edition requires a minimum of 256 MB of physical
memory (RAM).  This system has 155 MB of RAM and does not meet minimum
requirements.
error: %pre(oracle-xe-10.2.0.1-1.0) scriptlet failed, exit status 1
error:   install: %pre scriptlet failed (2), skipping oracle-xe-10.2.0.1-1.0