Installing ack on Windows with ActiveState Perl

Search is important! Having a good tool for text search can lower or eliminate barriers, speed up work, and generally make life more enjoyable and fulfilling. That handy search tool, for me on Linux, has been ack for the past few years. One still needs to know some grep and regex; in fact, the better you know regex and grep, the better off you are. But for handy, quick pattern matching against source files, I use ack all the time. By default, it ignores Subversion, Git, and other VCS directories and files, which is nice.

On Linux systems, it’s really easy to get ack. On CentOS/Red Hat based systems, yum install ack will get it for you. On Ubuntu based systems, one needs to run apt-get install ack-grep. Not surprisingly, on Ubuntu the default program name is ack-grep, although you can change that.

You can use ack on Windows systems as well. Over the years, I’ve played with Strawberry Perl and ActiveState Perl on Windows. Although I haven’t done much Perl work on Windows lately, I think ActiveState Perl would be my pick if given a choice, simply because its ppm program is nice and the precompiled PPM packages are pretty extensive. I think most users don’t want to mess with MinGW, dmake, and whatnot. Even cpan can be a bit frightening for beginners. This is my opinion; your mileage may vary!

Anyway, if you have ActiveState Perl and ppm on Windows, run this to get ack. Afterwards, just invoke ack from the command line (if you get your PATH right). Enjoy!

ppm install ack

Installing Perl DBI and DBD-mysql on Windows 64 bit

I had trouble getting Perl DBI and DBD-mysql working on Windows in the past. In addition, on Windows 64-bit, you sometimes see recommendations to use 32-bit Perl instead.

Today I got to test the latest 64-bit ActiveState Perl distro for Windows; I tested it on Windows 2008 R2 64-bit. I am happy to report that it works. I am not categorically recommending the installation of 64-bit Perl on Windows, though.

Here are the steps:
1. Get the ActiveState Perl 64-bit package for Windows and install it, following all the default options;
2. On command prompt, do:
cd c:\perl64\bin
ppm install DBI
ppm install DBD-mysql

I then tested it against both Oracle’s MySQL 5.5 Community Server and MariaDB 5.2.7 on Windows with Maatkit’s mk-table-checksum to confirm. It worked fine:

C:\Users\Administrator\Downloads\maatkit-7540\maatkit-7540\bin>c:\Perl64\bin\perl.exe mk-table-checksum --databases mysql h=localhost,u=root,p=password

Having fun with MySQL and Python: converting MySQL character set to utf8

Lately I have worked quite a bit with Python and Linux, writing monitoring and automation utilities. I am in a transition period, so I thought I ought to write some Python code interfacing with MySQL for fun and, I hope, start positioning myself for expanded career horizons.

To get started, I thought it would be fun to rewrite a Perl utility I wrote before in Python. That script converts MySQL character sets to utf8, a very common task for wikis and blogs during an upgrade. This time, I did everything from scratch: firing up an Amazon EC2 Linux instance, hand-installing and configuring MySQL 5.1.50 (creating the mysql user and group, wget-ing the tarball, setting directory ownership and permissions, creating symbolic links to MySQL binaries, editing my.cnf, setting up /etc/init.d/ and chkconfig automatic startup, environment variables, the works), compiling and configuring Python 2.7, setting up Python easy_install, compiling and installing the MySQLdb module for Python, and finally successfully rewriting and testing the utility in Python. It is listed at the bottom of this post.

The original Perl program used an SQL code-generation technique based on metadata within information_schema. As is hopefully well known by now, information_schema should be used with caution because it can lead to server lockup. In fact, I noticed this when I ran my Perl program on a shared web hosting server. See Baron’s post here for an alternative way of doing it. My Python program listed below does not use that alternative, though, just to stay consistent with the Perl version. I had a lot of fun going through this exercise.
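To make the code-generation idea concrete without needing a server: given column metadata of the kind information_schema.columns returns, the two-step ALTER statements can be built as plain strings. A minimal sketch (the table and column names are made up for illustration):

```python
# Sketch of the SQL-generation idea: given (table, column, type) tuples
# such as information_schema.columns would return, build the two-step
# ALTER statements. Table and column names here are illustrative only.
def charset_alters(columns):
    stmts = []
    for table, column, coltype in columns:
        # Step 1: char type -> its binary counterpart (varchar -> varbinary)
        to_binary = "ALTER TABLE %s MODIFY %s %s" % (
            table, column, coltype.replace('char', 'binary'))
        # Step 2: back to the char type, now with an explicit character set
        back_to_char = "ALTER TABLE %s MODIFY %s %s CHARACTER SET utf8" % (
            table, column, coltype)
        stmts.append((to_binary, back_to_char))
    return stmts

for to_bin, to_utf8 in charset_alters([('posts', 'title', 'varchar(255)')]):
    print(to_bin)
    print(to_utf8)
```

The real script below lets MySQL’s CONCAT() and REPLACE() do this string building server-side instead.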

Here are some notes I took.
1. It would be really nice if there were a catalog of all publicly available AMIs containing detailed metadata that we could query, such as:
a. Version and distro of *nix;
b. What has been installed? Apache, MySQL, Python, PHP, Nagios, etc., etc.
c. For the software that has been installed, what is the version? What version(s) of MySQL are installed? What patches have been applied? Is it Python 2.4, 2.5, 2.6, 2.7, or 3.0? Things like that.
d. Hardware related stuff: memory, disk size, etc.

Granted, I used the AWS Management Console web interface to fire up and terminate instances, and accessed the servers via PuTTY. I lost my Linux workstation and haven’t built a new one yet. I do recall there were some EC2 command line tools on Linux that could do some of that when I last tried them a few months back. So it could be that things have improved and/or I just didn’t know the right way to get that information out. If that is the case, I’d really appreciate it if you could give me some pointers. As it stands, the XML manifest file is too vague and not detailed enough.

2. Trying to do an in-place Python upgrade, say from 2.4 to 2.6 or 2.7, is generally not a good idea. Having multiple versions of Python side by side is probably the better way to go. Yes, it can be confusing, but I think in most cases the cost/benefit analysis weighs against an upgrade: broken libraries and modules, failed cron jobs, you name it.

When keeping multiple versions of Python side by side, you can define a default version with some creative use of symbolic links. For instance, I ran
./configure && make && sudo make install
for Python 2.7 on a server that has 2.4 pre-installed. I then did:
ln -sf /usr/local/bin/python2.7 /usr/bin/python
to make Python 2.7 the default version. If I need Python 2.4, I will just call python2.4 binary.
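After re-pointing the symlink, a quick sanity check from inside the interpreter confirms which binary and version plain python now resolves to:

```python
import sys

# Print which interpreter binary and version "python" actually resolved to,
# so a misdirected symlink is caught immediately.
print(sys.executable)
print("%d.%d" % sys.version_info[:2])
```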

While on that note, I found it surprising that the two AMI instances (ami-3e836657 and ami-01b75668) I used both had Python 2.4 installed, which seems pretty old to me.

3. Installing MySQLdb Python module was a bit problematic. On one server,
sudo easy_install mysql-python
did not work for me. I didn’t spend time digging because the following worked:
sudo yum install MySQL-python
On another server, easy_install mysql-python did work, but when I did import MySQLdb, I got:
ImportError: cannot open shared object file: No such file or directory
To fix that, do:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/mysql/lib
Here is the Python code:

import MySQLdb

db = MySQLdb.connect(host="localhost", user="root", passwd="",
                     db="charset_test", unix_socket="/tmp/mysql.sock")
cursor = db.cursor()

# Generate ALTER statements that turn char columns into their binary
# counterparts (varchar -> varbinary, etc.).
sql = "SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'char', 'binary')) FROM information_schema.columns WHERE table_schema = 'charset_test' AND data_type LIKE '%char%'"
cursor.execute(sql)
results = cursor.fetchall()

for row in results:
    # Convert char types to binary types first
    cursor.execute(row[0])
    # Now convert the binary type back to char with CHARACTER SET utf8 defined
    cursor.execute(row[0].rpartition(' ')[0] + ' '
                   + row[0].rpartition(' ')[2].replace('binary', 'char')
                   + ' CHARACTER SET utf8')

# Same two-step dance for the text types (text -> blob -> text utf8).
sql = "SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'text', 'blob')) FROM information_schema.columns WHERE table_schema = 'charset_test' AND data_type LIKE '%text%'"
cursor.execute(sql)
results = cursor.fetchall()

for row in results:
    # Convert text types to blob types first
    cursor.execute(row[0])
    # Now convert the blob type back to text with CHARACTER SET utf8 defined
    cursor.execute(row[0].rpartition(' ')[0] + ' '
                   + row[0].rpartition(' ')[2].replace('blob', 'text')
                   + ' CHARACTER SET utf8')

cursor.close()
db.close()
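The rpartition trick above is the whole second step: split the generated ALTER at its last space, swap the binary type back to its char counterpart, and append the character set clause. On a concrete (made-up) statement:

```python
# The generated statement ends with the column type, so rpartition(' ')
# splits off that final token; swapping 'binary' back to 'char' and
# appending the charset clause yields the second-step ALTER.
stmt = "ALTER TABLE posts MODIFY title varbinary(255)"
head, _, last = stmt.rpartition(' ')
second_step = head + ' ' + last.replace('binary', 'char') + ' CHARACTER SET utf8'
print(second_step)
```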


Perl script to split sql script into separate files

Below is a quick script I put together to split a single file that contains many stored procedures into many files, each containing just a single stored procedure. It can easily be modified for views, user-defined functions, etc.

1. It puts BEGIN and END before and after the procedure creation. Because some scripts have GRANT statements in them, and I want to put END right after the procedure creation, there is a variable called $FirstGrant;
2. The new files created will be named like StoredProcedureName.sql.

use strict;

open(MYDATA, "c:\\junk\\MyDb\\AllProc.sql") or
  die("Error: cannot open file 'AllProc.sql'\n");

my $FirstGrant = 0;
while ( my $line = <MYDATA> ) {
  if ($line =~ /^create\s+proc/i) {
    close(FILE);
    $FirstGrant = 0;
    (my $filename) = $line =~ /create\s+proce?d?u?r?e?\s+\[?d?b?o?\]?\.?\[?(\w+)/i;
    open(FILE, ">", "c:\\junk\\MyDb\\".$filename.".sql");
    print FILE "IF NOT EXISTS (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[$filename]') AND OBJECTPROPERTY(id,N'IsProcedure') = 1)\n";
    print FILE "BEGIN\n";
    print FILE $line;
  }
  else {
    if (($line =~ /^grant\s+execute/i) && ($FirstGrant == 0)) {
      print FILE "END\n";
      $FirstGrant = 1;
    }
    print FILE $line;
  }
}
close(FILE);
close(MYDATA);
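For comparison, the same splitting idea can be sketched in Python. This is a rough equivalent under simplified assumptions (it emits only the BEGIN/END bracketing, not the IF NOT EXISTS header, and the sample input is made up):

```python
import re

# Split a script containing many CREATE PROCEDURE statements into
# per-procedure chunks. Mirrors the Perl logic: a new chunk starts at
# each CREATE PROC line, and END is emitted before the first GRANT EXECUTE.
def split_procs(lines):
    chunks = {}              # procedure name -> list of output lines
    name = None
    first_grant_done = False
    for line in lines:
        m = re.match(r'create\s+proc(?:edure)?\s+\[?(?:dbo)?\]?\.?\[?(\w+)',
                     line, re.I)
        if m:
            name = m.group(1)
            first_grant_done = False
            chunks[name] = ["BEGIN\n", line]
        elif name is not None:
            if re.match(r'grant\s+execute', line, re.I) and not first_grant_done:
                chunks[name].append("END\n")
                first_grant_done = True
            chunks[name].append(line)
    return chunks

script = [
    "CREATE PROCEDURE [dbo].[GetUsers]\n",
    "AS SELECT 1\n",
    "GRANT EXECUTE ON GetUsers TO public\n",
]
print(sorted(split_procs(script)))
```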

Generate insert scripts for SQL Server tables with Perl

Here is a quick hack with Perl to generate insert statements that reproduce the data already in a given table. It is similar to the output mysqldump generates.

A few things worth mentioning here:
1. This script uses a text file, RefList.txt, that contains a list of tables for which you want insert statements created. Each line in the list contains the database name and table name, separated by space(s). This is a quick hack; one could easily pass the file name in as an input;
2. Obviously, the connection information needs to be modified for your own needs;
3. The script escapes single quotes in character fields properly;
4. Redirect the script's output (perl scriptname.pl > c:\MyFolder\MyTableInsert.sql) to collect the results into a file;
5. Thanks to Summit Amar and the people who commented on this article for the SQL script ideas;
6. Update: modified the script so it handles an identity field if one is present in a table.

By the way, Microsoft’s new scripting language, PowerShell, which is pretty powerful, borrowed a lot of good stuff from Perl and Unix shell scripting. My friends Yan Pan and Ananthakumar Muthusamy wrote a book, Microsoft SQL Server 2008 Administration with Windows PowerShell, which I helped to tech-edit. If you are looking to learn more automation and scripting tricks with PowerShell, definitely check out that book, which will come out next month.

use strict;
use Win32::SqlServer;

# Get server name from the command line
# my $ServerName = shift or die "Please enter server name as the first only parameter needed";

open(MYDATA, "RefList.txt") or 
  die("Error: cannot open file 'RefList.txt'\n");

my $db = "";
while ( my $line = <MYDATA> ) {
  next unless $line =~ /(\w+)\s+(\w+)/;
  my $database = $1;
  my $table    = $2;
  if ($db ne $database) {
    print "-----Tables in $database-----\n";
    $db = $database;
  }
  my $sqlsrv = sql_init(".", "sa", "s3cr3t", $database);

  my $HasIdentityColumn = $sqlsrv->sql_one("select objectproperty(object_id('$table'), 'TableHasIdentity')");
  if ($HasIdentityColumn == 1) {
    print "SET IDENTITY_INSERT $table ON\n";
  }

# Our SQL statement.
  my $stmnt = <<SQLEND;
--Declare a cursor to retrieve column specific information for the specified table
DECLARE \@string nvarchar(3000) --for storing the first half of INSERT statement
DECLARE \@stringData nvarchar(3000) --for storing the data (VALUES) related statement
DECLARE \@dataType nvarchar(1000) --data types returned for respective columns
DECLARE \@colName nvarchar(50)
DECLARE cursCol CURSOR FAST_FORWARD FOR
SELECT column_name,data_type FROM information_schema.columns WHERE table_name = '$table'
OPEN cursCol
SET \@string='INSERT '+ '$table' +'('
SET \@stringData=''

FETCH NEXT FROM cursCol INTO \@colName,\@dataType

IF \@\@fetch_status<>0
BEGIN
	print 'Table $table not found, processing skipped.'
	CLOSE cursCol
	DEALLOCATE cursCol
	RETURN
END

WHILE \@\@fetch_status=0
BEGIN
IF \@dataType in ('varchar','char','nchar','nvarchar')
        SET \@stringData=\@stringData+''''+'''+isnull('''''+'''''+REPLACE('+\@colName+','''''+''''','''''+'''''+'''''+''''')+'''''+''''',''NULL'')+'',''+'
ELSE IF \@dataType in ('text','ntext') --if the datatype is text or something else 
        SET \@stringData=\@stringData+'''''''''+isnull(REPLACE(cast('+\@colName+' as varchar(2000)),'''''+''''','''''+'''''+'''''+'''''),'''')+'''''',''+'
ELSE IF \@dataType = 'money' --because money doesn't get converted from varchar implicitly
	SET \@stringData=\@stringData+'''convert(money,''''''+isnull(cast('+\@colName+' as varchar(200)),''0.0000'')+''''''),''+'
ELSE IF \@dataType='datetime'
	SET \@stringData=\@stringData+'''convert(datetime,'+'''+isnull('''''+'''''+convert(varchar(200),'+\@colName+',121)+'''''+''''',''NULL'')+'',121),''+'
ELSE IF \@dataType='image' 
	SET \@stringData=\@stringData+'''''''''+isnull(cast(convert(varbinary,'+\@colName+') as varchar(6)),''0'')+'''''',''+'
ELSE --presuming the data type is int,bit,numeric,decimal 
	SET \@stringData=\@stringData+''''+'''+isnull('''''+'''''+convert(varchar(200),'+\@colName+')+'''''+''''',''NULL'')+'',''+'

SET \@string=\@string+\@colName+','

FETCH NEXT FROM cursCol INTO \@colName,\@dataType
END

DECLARE \@query nvarchar(4000)
SET \@query ='SELECT '''+substring(\@string,0,len(\@string)) + ') VALUES(''+ ' + substring(\@stringData,0,len(\@stringData)-2)+'''+'')'' Col1 FROM '+ '$table'
exec sp_executesql \@query

CLOSE cursCol
DEALLOCATE cursCol
SQLEND

# Run query. The return value is a reference to an array.
  my $result = $sqlsrv->sql($stmnt);
# Print results. Each array entry is a hash reference.
  foreach my $row (@$result) {
	print "$$row{Col1}\n";
  }

  if ($HasIdentityColumn == 1) {
    print "SET IDENTITY_INSERT $table OFF\n";
  }
}
close MYDATA;
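The trickiest part of any such generator is the quoting: doubling embedded single quotes in character data and emitting NULL unquoted. Stripped of the cursor machinery, that piece looks like this in Python (the table, column names, and data are made up for illustration):

```python
# Build one INSERT statement per row, doubling embedded single quotes
# and emitting NULL for missing values. Names here are illustrative only.
def make_insert(table, columns, row):
    vals = []
    for v in row:
        if v is None:
            vals.append("NULL")
        elif isinstance(v, str):
            # Double embedded single quotes so the value survives re-parsing
            vals.append("'" + v.replace("'", "''") + "'")
        else:
            vals.append(str(v))
    return "INSERT %s(%s) VALUES(%s)" % (table, ",".join(columns), ",".join(vals))

print(make_insert("authors", ["id", "name"], [1, "O'Brien"]))
```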