SQL code for SQL and Relational Theory

I am reading SQL and Relational Theory by C. J. Date. Baron Schwartz wrote a nice review of it. I am reading the online version and am about halfway through, so I am not sure whether the book comes with an accompanying CD of source code. In any case, if you want to play with the SQL code listed in the book, here is a script to generate the tables and rows (or should I say relations at a certain point in time, with tuples?):

CREATE TABLE S
   ( SNO    VARCHAR(5)   NOT NULL ,
     SNAME  VARCHAR(25)  NOT NULL ,
     STATUS INTEGER      NOT NULL ,
     CITY   VARCHAR(20)  NOT NULL ,
     UNIQUE ( SNO ) ) ;

CREATE TABLE P
   ( PNO    VARCHAR(6)   NOT NULL ,
     PNAME  VARCHAR(25)  NOT NULL ,
     COLOR  CHAR(10)     NOT NULL ,
     WEIGHT NUMERIC(5,1) NOT NULL ,
     CITY   VARCHAR(20)  NOT NULL ,
     UNIQUE ( PNO ) ) ;

CREATE TABLE SP
   ( SNO    VARCHAR(5)   NOT NULL ,
     PNO    VARCHAR(6)   NOT NULL ,
     QTY    INTEGER      NOT NULL ,
     UNIQUE ( SNO , PNO ) ,
     FOREIGN KEY ( SNO )
        REFERENCES S ( SNO ) ,
     FOREIGN KEY ( PNO )
        REFERENCES P ( PNO ) ) ;

INSERT INTO S (SNO, SNAME, STATUS, CITY) VALUES ('S1', 'Smith', 20, 'London');
INSERT INTO S (SNO, SNAME, STATUS, CITY) VALUES ('S2', 'Jones', 10, 'Paris');
INSERT INTO S (SNO, SNAME, STATUS, CITY) VALUES ('S3', 'Blake', 30, 'Paris');
INSERT INTO S (SNO, SNAME, STATUS, CITY) VALUES ('S4', 'Clark', 20, 'London');
INSERT INTO S (SNO, SNAME, STATUS, CITY) VALUES ('S5', 'Adams', 30, 'Athens');

INSERT INTO P (PNO, PNAME, COLOR, WEIGHT, CITY) VALUES ('P1', 'Nut', 'Red', 12.0, 'London');
INSERT INTO P (PNO, PNAME, COLOR, WEIGHT, CITY) VALUES ('P2', 'Bolt', 'Green', 17.0, 'Paris');
INSERT INTO P (PNO, PNAME, COLOR, WEIGHT, CITY) VALUES ('P3', 'Screw', 'Blue', 17.0, 'Oslo');
INSERT INTO P (PNO, PNAME, COLOR, WEIGHT, CITY) VALUES ('P4', 'Screw', 'Red', 14.0, 'London');
INSERT INTO P (PNO, PNAME, COLOR, WEIGHT, CITY) VALUES ('P5', 'Cam', 'Blue', 12.0, 'Paris');
INSERT INTO P (PNO, PNAME, COLOR, WEIGHT, CITY) VALUES ('P6', 'Cog', 'Red', 19.0, 'London');

INSERT INTO SP (SNO, PNO, QTY) VALUES ('S1', 'P1', 300);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S1', 'P2', 200);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S1', 'P3', 400);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S1', 'P4', 200);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S1', 'P5', 100);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S1', 'P6', 100);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S2', 'P1', 300);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S2', 'P2', 400);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S3', 'P2', 200);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S4', 'P2', 200);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S4', 'P4', 300);
INSERT INTO SP (SNO, PNO, QTY) VALUES ('S4', 'P5', 400);
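
If you want a quick sanity check after loading the rows, a simple join will do. This query is not from the book, just a throwaway example against the tables above: it returns the names of suppliers who supply part P2 (Smith, Jones, Blake, and Clark).

-- Suppliers who supply part P2
SELECT DISTINCT S.SNAME
  FROM S JOIN SP
    ON S.SNO = SP.SNO
 WHERE SP.PNO = 'P2' ;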

Moving data from Sql Server to MySQL

To move data from Sql Server to MySQL, it is certainly possible to use tools that can connect to both data stores and manipulate the data that way, such as Access, Excel, or SSIS. Here I will introduce a process that does not need any special tools or data drivers. Instead, we can use the utilities and methods that come with standard Sql Server and MySQL installs to accomplish the task.

With this approach, it is assumed that matching tables already exist in MySQL. If not, they need to be created first.
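
If you are not sure what already exists on the MySQL side, information_schema makes that easy to check. A quick sketch, with 'mydb' standing in for whatever the target schema is called:

-- List the tables already present in the target MySQL schema
SELECT table_name
  FROM information_schema.tables
 WHERE table_schema = 'mydb';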

The process consists of these steps: first, bcp commands are generated from the Sql Server database metadata (sysobjects; think information_schema in MySQL); then the generated bcp commands are executed; the resulting text files are transferred to the MySQL server, optionally compressed if they are large; and finally the files are imported into MySQL with LOAD DATA LOCAL INFILE.

1. Run the code below on Sql Server to generate the bcp commands. Customize the data dump folder name and database name as needed; for the database name, replace MyDb with the proper database name (it appears in two places):

set nocount on

/* Set BackupFolder name here */
declare @BackupFolder varchar(100)
set @BackupFolder = 'c:\MyDataDumpFolder\'

/* Initialize variables here */
declare @TableNames table (TableTempID smallint IDENTITY(1,1) PRIMARY KEY CLUSTERED, TableName varchar(50))
declare @TableName varchar(50)
declare @BackupFileName char(50)
declare @BackupFolderFile varchar(150)
declare @Counter smallint
declare @MaxTableCount smallint
declare @BCPOutCommand varchar(500)

/* Get the list of tables that we want to dump out as csv */
insert into @TableNames (TableName)
select name from MyDb.dbo.sysobjects
where xtype = 'U' and name not in ('dtproperties')

select @MaxTableCount = max(TableTempID) from @TableNames
set @Counter = 1

/* Loop through each table individually, generate the bcp
command, and print it out so it can be run in the next step */
while @Counter <= @MaxTableCount
begin
   /* Create backup file name */
   select @TableName = ltrim(rtrim(TableName)) from @TableNames where TableTempID = @Counter
   select @BackupFileName = ltrim(rtrim(@TableName)) + '.txt'

   /* Combine backup folder name and file name */
   select @BackupFolderFile = @BackupFolder + @BackupFileName

   /* Create BCP command */
   select @BCPOutCommand = 'bcp ' + '"MyDb.dbo.' + @TableName + '" out "'
      + ltrim(rtrim(@BackupFolderFile)) + '" -c -q -S' + @@Servername + ' -T'

   print @BCPOutCommand

   set @Counter = @Counter + 1
end

2. Run the bcp commands generated above at the Windows command prompt. Suppose we only have one table called t1; this is what the bcp command would look like:

bcp "test.dbo.t1" out "c:\junk\t1.txt" -c -q -SSqlServerInstanceName -T

3. Gather the text files and move them to the MySQL server;

4. Run the command below in mysql to import the data. Adjust the directory and file name as needed:

mysql> LOAD DATA LOCAL INFILE '/home/haidong/Desktop/t1.txt' INTO TABLE t1 LINES TERMINATED BY '\r\n';
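
One note on delimiters: with the -c switch, bcp writes tab-delimited fields and \r\n row terminators by default, which matches LOAD DATA's default for fields but not for lines. If anything looks off after the import, spelling both out explicitly is safer. This sketch just reuses the table and file path from the example above:

mysql> LOAD DATA LOCAL INFILE '/home/haidong/Desktop/t1.txt'
    -> INTO TABLE t1
    -> FIELDS TERMINATED BY '\t'
    -> LINES TERMINATED BY '\r\n';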

I've successfully tested this with tables containing numeric, character, datetime, and sequence data. Your mileage may vary. Hope this helps.

Some notes on Sql Server and MySQL

I will be taking the MySQL certification exams, so I am going through the MySQL 5.0 Certification Study Guide. It is a pretty cool book in that it gets to the point right away without many wasted words. Too many technical books dance around the topic and fill the volume with screenshots, but do not deliver the goods in the end. I am reading the Safari online version; with so many pages, I can see the physical book being bulky.

Anyway, some comments and observations:

1. Sybase, MySQL, and Sql Server's definitions and implementations of database and schema are somewhat close. I dislike Microsoft's definition and usage of schema, especially in Sql Server 2005. It is very confusing to new DBAs and to DBAs coming from other platforms.

Sybase, MySQL, and Sql Server all support the USE Database statement;
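
For what it is worth, MySQL treats DATABASE and SCHEMA as synonyms, which I find much less confusing than Sql Server 2005's schema-within-a-database model. A tiny illustration (the database name here is made up):

CREATE DATABASE sales;   -- identical to: CREATE SCHEMA sales;
USE sales;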

2. I like the mysql command line tool and how it displays result sets. It looks clean and neat to me, although, like command line tools on other database platforms, it gets messy when the result set is wide;

3. Similar to Sql Server, MySQL also has system and status variables that start with @@. And, like Sql Server, you can just do SELECT @@variable to retrieve a value, for example, SELECT @@sql_mode;
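
A couple of concrete examples, one from each side (these are standard built-in variables on both platforms):

-- MySQL
SELECT @@sql_mode, @@version;

-- Sql Server
SELECT @@VERSION, @@SERVERNAME;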

4. I played with MySQL Query Browser a little bit. Ctrl-E is the keyboard shortcut for statement execution, the same as in Sql Server. However, F5 does not execute the statement in MySQL Query Browser.

Analyzing low-performance SQL code

As an independent consultant and trainer, I find myself doing a lot of analysis and enhancement of existing code, mostly stored procedures and ad-hoc SQL statements. I suspect a lot of people do the same thing, so I am really interested in learning how you do it; that is the main purpose of this post. This post is tagged with Oracle, Sql Server, and MySQL, as the principles should be the same on all platforms.

Let me share with you how I do it. Note that I look at table/column statistics and indexes in almost all of the steps below, so I have purposely left them out of the discussion.

1. I always talk to the original author or the current owner of the code and ask him/her to walk me through it. I listen mostly, trying to understand why s/he wrote the code this way. I may ask a few questions, just to help me understand; I almost never do any lecturing at this stage. If the code is a stored procedure, I ask the author or stakeholder to give me sample execution statements, with the parameters filled in with typical values for a production load. Sometimes it is not possible to talk to the original author, because s/he may have left the company;

2. After that, I get the database in question and restore it on my own workstation, with data as close to production as possible. I start running the code, gather IO statistics and the execution plan, and save the results as a baseline for later comparison (see the first sketch after this list);

3. Then comes the raw code analysis stage, where the code is put into an editor, in my case VI or VIM. I find myself using the * command on all variable and table names (make sure you are doing a case-insensitive search for Sql Server code) to see where they are used. This is also where I weed out the historical garbage code that is still there but no longer used, such as unused variables, tables, etc. In the case of Sql Server, I mostly look for cursors, temp tables, and user-defined functions;

4. Code simplification stage. I find that a lot of poorly performing code is unwieldy and unnecessarily complex. At this stage, I look for redundant WHERE clauses and convoluted AND/OR combinations (see the second sketch after this list). This can be a tedious and time-consuming process. At this point, it is important to talk to the author or owner of the code again, as a good understanding of the data and entity relationships is crucial. Also, since I have gained some understanding from the steps above, it is easier to have an intelligent conversation with the code owner, with fresh perspective and momentum. Usually something good comes out of the conversation, and we end up with something concrete to work on next;

5. I make the changes discussed in step 4 and, once again, collect statistics and the execution plan for comparison;

6. Repeat steps 4 and 5 a few times when necessary. I usually start talking about better database design, code/design refactoring, set-based operations, etc., since by then I hopefully have some credibility and have gained the customer's trust.
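
For step 2, here is roughly what gathering the baseline looks like on each platform. The join below is just a stand-in for whatever statement is being tuned:

-- Sql Server: capture IO and timing statistics for the statement being tuned
-- (and turn on the actual execution plan in the query window)
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT S.SNAME, SP.PNO, SP.QTY
  FROM S JOIN SP ON S.SNO = SP.SNO;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

-- MySQL: the closest equivalent is the optimizer's plan
EXPLAIN SELECT S.SNAME, SP.PNO, SP.QTY
          FROM S JOIN SP ON S.SNO = SP.SNO;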
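
And for step 4, a made-up illustration of the kind of redundancy I mean, using the SP table from the first section. The two statements are logically equivalent (all columns involved are NOT NULL), but the second one is far easier to reason about:

-- Before: conditions accumulated over time, several of them redundant
SELECT SNO, QTY
  FROM SP
 WHERE ( QTY >= 200 AND QTY >= 100 )
    OR ( QTY >= 200 AND PNO = 'P2' )
    OR PNO = 'P2';

-- After: the same result set, without the noise
SELECT SNO, QTY
  FROM SP
 WHERE QTY >= 200
    OR PNO = 'P2';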

So what do you do to break down complex, low-performance SQL code? What methodology, tools, tips, and tricks can you share with me?

Migrating from one RDBMS to another

Here are some of my thoughts on migrating from MySQL to Sql Server; they came out of an email discussion. I'd love to hear your thoughts on migrating to a different database platform, not just from MySQL to Sql Server.

I actually thought about writing a white paper or even a course on migrating from MySQL to Sql Server, but never found the time to do it. Sometimes a project doing something similar can serve as a launchpad for such an endeavour, but one never came along, at least not yet. I am very interested in the database interoperability field, though. I've done MySQL and Oracle administration in the past and have published some MySQL and Oracle material on blogs. I have much better technical skills on Sql Server than on any other RDBMS platform, primarily because I've worked with it longer.

I think most of what follows applies equally to migrating from Oracle, DB2, Sybase, Postgresql, etc., to Sql Server, or the other way around. It might be slightly easier to migrate from Sybase to Sql Server, considering their common root.

1. It is not easy to migrate an existing app, unless the app is a simple one. Even then, there are enough quirks that can throw people off and cause enough frustration to derail the whole project. I've seen that happen twice, having been engaged in moving two apps from Sql Server to Oracle;

2. Therefore, the best way to migrate to a new RDBMS, in my opinion, is to start with a new initiative, probably not a big one initially. When you start from a clean slate, you don't have historical garbage to worry about. Furthermore, it gives the team enough time to learn the new platform and prepares them for a future migration, if you choose to do one;

3. Having open-minded team members is crucial to a migration project's success. Too often people have emotional attachments to the platform they are familiar with, possibly out of job security concerns or a lack of general curiosity toward new things.

I generally adopt a platform-agnostic attitude and don't get religious or too carried away about the platform I work on. Having said that, I think these are points that marketing people can spin for persuasion purposes:

1. MySQL has too many storage engines: MyISAM, InnoDB, MaxDB, and the newly introduced Maria. This can be viewed as a plus, since it provides choice; the downside is that it causes confusion for end users;

2. MySQL's support for the relational model is fairly recent. For example, for a long time MySQL didn't support stored procedures, views, triggers, foreign keys, etc. One could argue that MySQL is not mature in this area since these features are new to it, but I think it is difficult to find evidence to substantiate that claim. Also, at the risk of offending some people, I think the importance of the relational model gets overblown a bit;

3. Sql Server offers CLR integration. This can be a great selling point;

4. Sql Server offers tight integration with Visual Studio, Windows networking, and all other things Microsoft. This is a huge advantage;

5. Too many people find the *nix environment intimidating. Although MySQL works on Windows, the perception in the marketplace is that MySQL works better on *nix.

As far as migrating from MySQL to Sql Server in a web hosting environment goes, my honest opinion is that Sql Server will be fighting an uphill battle, because MySQL excels in this arena, especially for small and medium-sized or departmental organizations, given the proliferation of LAMP apps like blogs, wikis, and discussion boards. I believe Microsoft's weapon of choice in this arena should be SharePoint. Given Microsoft's clout, it is certainly a battle worth fighting.