Archive for February, 2009

Moving data from Sql Server to MySQL

To move data from Sql Server to MySQL, it is certainly possible to use tools that can make connections to both data stores and manipulate data that way, such as Access, Excel, or SSIS. Here I will introduce a process that does not need any special tools or data drivers. Instead, we can use the utilities and methods that come with a standard Sql Server and MySQL install to accomplish that task.

With this approach, it is assumed that matching tables already exist on MySQL. If not, they need to be created first.

This process is comprised of these steps: first bcp command will be generated based on Sql Server database meta data (sysobjects, think information_schema in MySQL); then the generated bcp commands will be executed; the resulting csv files can then be transferred to the MySQL server, optionally it is possible to compress them if the size is big; and finally the csv files will be imported into MySQL with LOAD DATA LOCAL INFILE.

1. Run the code below on Sql Server to generate bcp commands, customize the data dump folder name and database name as needed. For database name, replace MyDb with the proper database name, 2 such places need to be changed:

set nocount on

/* Set BackupFolder name here */
declare @BackupFolder varchar(100)
set @BackupFolder = 'c:\MyDataDumpFolder\'

/* Initialize variables here */
declare @TableNames table (TableTempID
smallint IDENTITY(1,1) PRIMARY KEY CLUSTERED, TableName varchar(50))
declare @TableName varchar(50)
declare @BackupFileName char(50)
declare @BackupFolderFile varchar(150)
declare @Counter smallint
declare @MaxTableCount smallint
declare @BCPOutCommand varchar(500)

/* Get the list of tables that we want to dump out as csv */
insert into @TableNames (TableName) select name from
MyDb.dbo.sysobjects where xtype = 'U'
and name not in ('dtproperties')

select @MaxTableCount = max(TableTempID) from @TableNames
set @Counter = 1

/* Loop through all each table individually, generate bcp
commands and run bcp commands to export data */
while @Counter < = @MaxTableCount

Begin

/* Create backup file name */
select @TableName = ltrim(rtrim(TableName)) from
@TableNames where TableTempID = @Counter

select @BackupFileName = ltrim(rtrim(@TableName)) + '.txt'

/* Combine backup folder name and file name */
select @BackupFolderFile = @BackupFolder + @BackupFileName

/* Create BCP command */
select @BCPOutCommand = 'bcp ' + '"MyDb.dbo.' + @TableName +
'" out "' + ltrim(rtrim(@BackupFolderFile)) +
'" -c -q -S' + @@Servername + ' -T'

print @BCPOutCommand
set @Counter = @Counter + 1
end

2. Run the bcp commands generated above on Windows DOS command line. Suppose we only have one table called t1, below is what the bcp command would look like:

bcp “test.dbo.t1″ out “c:\junk\t1.txt” -c -q -SSqlServerInstanceName -T

3. Gather the text files and move them to MySQL server;
4. Run the command in mysql to import the data in. Adjust the directory name as needed:

mysql> LOAD DATA LOCAL INFILE ‘/home/haidong/Desktop/t1.txt’ INTO TABLE t1 LINES TERMINATED BY ‘\r\n’;

I’ve tested this for tables that have number, character, datetime, and sequence data successfully. Your mileage may vary. Hope this helps.

Comments (13)

Dropping a table does not remove permissions granted to it in MySQL

If:

1. A table t1 is created in database test;
2. A login is granted select permission on t1;
3. t1 is dropped and then recreated.

Then that login would still be able to read the newly recreated table t1, even if t1 has totally different columns. The reason is that the table select privilege is stored in tables_priv in mysql database, and when t1 is dropped, that privilege will not be cleared in tables_priv.

Personally, I think this needs to be changed. In other words, when a drop table command is issued, not only the table needs to be dropped, this command should also go to tables_priv in mysql to remove that select permission. Not too sure how big a task that is, perhaps I should try to see if I can implement it myself, after I get comfortable with the source code and development process of MySQL.

Comments (5)

Perl script to convert MySQL character set to utf8

“Groovy, baby!”, as international man of mystery likes to say.

It seems converting character and text data to utf8 character set is a common task for MySQL, especially during an upgrade. In fact, I had trouble with it during server and WordPress database upgrade for this blog site. I wrote about it in this post, where I explained how to do it step-by-step using a command line tool such as mysql, taking advantage of some nifty code generation trick with information_schema.

One drawback of that method is that it’s a manual process, therefore time-consuming and error-prone.

I spent some time today to cook up a simple Perl script to automate this task. See below. Remember to change MyDatabase to your database name (there are 3 places that need to be changed), and adjust the user name and password accordingly. This script also prints out sql statements used during the process so you will know what has been done to the database.

I have tested this successfully on my own WordPress blog. Hopefully it will help somebody out there. Enjoy!

#!/usr/bin/perl
# MyUtf8Converter.pl - convert all character data to utf8 character set
use strict;
use warnings;
use DBI;

# data source name, username, password, connection attributes
my $dsn = "DBI:mysql:MyDatabase:localhost";
my $user_name = "MySQLUserName";
my $password = "MySQLPassword";
my %conn_attrs = (RaiseError => 1, PrintError => 0, AutoCommit => 1);

# connect to database
my $dbh = DBI->connect ($dsn, $user_name, $password, \%conn_attrs);

# Convert char types first. Use information_schema to generate alter table code to convert data to binary first.
my $sth = $dbh->prepare ("SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'char', 'binary')) FROM information_schema.columns WHERE table_schema = 'MyDatabase' and data_type LIKE '%char%'");

$sth->execute ();

# Run alter table stamement. First convert char types to binary types, then convert those to char types utf8 character set
while (my ($val) = $sth->fetchrow_array ())
    {
        print $val, "\n";
        $dbh->do ($val);
        $val =~ s/binary/char/;
        $val = $val . " CHARACTER SET utf8";
        print $val, "\n";
        $dbh->do ($val);
    }
$sth->finish ();
# Now we convert text data. Use information_schema to generate alter table code to convert data to blob first.
$sth = $dbh->prepare ("SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'text', 'blob')) FROM information_schema.columns WHERE table_schema = 'MyDatabase' and data_type LIKE '%text%'");

$sth->execute ();

# Run alter table stamement. First convert text types to blob types, then convert those to text types utf8 character set
while (my ($val) = $sth->fetchrow_array ())
    {
        print $val, "\n";
        $dbh->do ($val);
        $val =~ s/blob/text/;
        $val = $val . " CHARACTER SET utf8";
        print $val, "\n";
        $dbh->do ($val);
    }
$sth->finish ();

$dbh->disconnect ();

Comments (12)

峨嵋小馆

Rockville是华盛顿特区在马里兰州的郊区。之前我来这儿好几次了。这次来之前,我查了一下这儿的中餐馆,找到了一个名字叫“峨嵋小馆”的地方。它离我的旅馆很近,走得到,挺好的。这儿交通也很方便,坐往Shady Grove方向的Metro红线,Twinbrook站下车,再走六七分钟就到。

Joe’s Noodle House
1488-C Rockville Pike
Rockville, Maryland
301-881-5518
http://www.joesnoodlehouse.com

这里的菜不错,服务员也好,很讲礼貌。和她们闲聊,感觉得到她们的友好与温暖。她们有从河北和广东来的。在那儿做打扫工作擦桌子的是一位拉美裔的女士。看着他们用基本的英语和手势交流,开玩笑的一个片段,和他们脸上的笑容,真好。就想,我们能不能抛开肤色,外貌,宗教,语言等的差异,摈弃对与自己不同的群体的贬低、非人化和妖魔化或盲目崇拜,向以人为本的方向努力?

贴几张这个小馆的菜的照片。明天有可能和网上认识的Prince Roy会面,期待中。现在该回去做一些Linux,MySQL,和Perl之类的东西。千里之行,始于足下。

峨嵋小馆
梅酒,拌豆腐,红油抄手,和辣椒小炒

峨嵋小馆
蒜炒菠菜和夫妻肺片

Comments (3)

Masking a CSV file through shuffling

I wrote a bit on data masking here. Below is a simple C# program to reshuffle data in a csv file. It uses FileHelpers library for the .Net Framework.

To play with this simple code, first download FileHelpers library. Then save the text below the C# code into a file, modify the code to so it points to the text file you just saved. And it should work. Nothing fancy, just a proof of concept kind of work.

using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using FileHelpers;

namespace FileTest
{
    class Program
    {
        static Random randomNumber = new Random();
        static void Main(string[] args)
        {

            FileHelperEngine engine = new FileHelperEngine(typeof(People));
            engine.Options.IgnoreEmptyLines = true;
            People[] res = engine.ReadFile(@"c:\junk\PeopleName.txt") as People[];
            for (int i = 0; i < res.Length; i++)
            {
                //Console.WriteLine(GenerateRandomString(9) +
                    Console.WriteLine(res[randomNumber.Next(res.Length)].FirstName +
                    res[randomNumber.Next(res.Length)].LastName);
            }
        }
        public static long CountLinesInFile(string fileName)
        {
            long count = 0;
            using (StreamReader reader = new StreamReader(fileName))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    count++;
                }
            }
            return count;
        }
        public static string GenerateRandomString(int intLenghtOfString)
        {
            //Create a new StrinBuilder that would hold the random string.
            StringBuilder randomString = new StringBuilder();
            //Create a new instance of the class Random
            //Create a variable to hold the generated charater.
            Char appendedChar;
            //Create a loop that would iterate from 0 to the specified value of intLenghtOfString
            for (int i = 0; i <= randomNumber.Next(1, intLenghtOfString); ++i)
            {
                //Generate the char and assign it to appendedChar
                appendedChar = Convert.ToChar(Convert.ToInt32(26 * randomNumber.NextDouble()) + 97);
                //Append appendedChar to randomString
                randomString.Append(appendedChar);
            }
            //Convert randomString to String and return the result.
            return randomString.ToString();
        }
        [FixedLengthRecord(FixedMode.AllowLessChars)]
        class People
        {
            [FieldFixedLength(9)]
            public string FirstName;

            [FieldFixedLength(14)]
            public string LastName;

            [FieldFixedLength(27)]
            public string Address1;

            [FieldFixedLength(18)]
            public string City;
        }
    }
}
Robert   Foster         123 Roosevolt Ave.         Chicago
Edward Smith 93874 Madison Oak Park
James Kane 324 Grant Geneva
Mei Liu 587 Lincoln River Forest
Jane Doe 6565 Truman Maywood
Peter Imanov 456 Wilson Berkeley
Daming Wang 897 Jackson Aurora
Spiro Agnew 233 Jefferson Naperville
Osama BinLaden 6431 Kennedy Arlington Heights
George Bush 2356 Grant Evanston
Dick Cheney 6498 Coolidge Skokie
Barack Obama 8439 Adams Glenview

Comments