Initial impressions of SQL Server v.Next Public Preview

Microsoft announced its SQL Server on Linux public preview yesterday. I’m really excited to check it out. Here are some interesting things I found during my testing. I’ll write more as I play with it further.

  • If you want to try it on Ubuntu, it needs to be 16.04 or above. I didn’t pay attention to that initially and started installing on Ubuntu 14. Below is the typical message you would get:

    The following packages have unmet dependencies:
     mssql-server : Depends: openssl (>= 1.0.2) but 1.0.1f-1ubuntu2.21 is to be installed
    E: Unable to correct problems, you have held broken packages.
    

    Running sudo apt-get dist-upgrade brought my Ubuntu to 16.04. The install was smooth afterwards.

  • The instructions for Red Hat Enterprise Linux also work for Fedora. I tested them on Fedora 23. I think they should also work on CentOS, although I didn’t test that myself.
  • The machine needs to have at least 3.25 GB of memory. On Ubuntu, install won’t continue if that condition is not satisfied:
    Preparing to unpack .../mssql-server_14.0.1.246-6_amd64.deb ...
    ERROR: This machine must have at least 3.25 gigabytes of memory to install Microsoft(R) SQL Server(R).
    dpkg: error processing archive /var/cache/apt/archives/mssql-server_14.0.1.246-6_amd64.deb (--unpack):
     subprocess new pre-installation script returned error exit status 1
    Processing triggers for libc-bin (2.21-0ubuntu4.3) ...
    Errors were encountered while processing:
     /var/cache/apt/archives/mssql-server_14.0.1.246-6_amd64.deb
    E: Sub-process /usr/bin/dpkg returned an error code (1)
    

    On Fedora, installation finishes, but you won’t be able to start the service:

    [hji@localhost ~]$ sudo /opt/mssql/bin/sqlservr-setup 
    Microsoft(R) SQL Server(R) Setup
    
    You can abort setup at anytime by pressing Ctrl-C. Start this program
    with the --help option for information about running it in unattended
    mode.
    
    The license terms for this product can be downloaded from
    http://go.microsoft.com/fwlink/?LinkId=746388 and found
    in /usr/share/doc/mssql-server/LICENSE.TXT.
    
    Do you accept the license terms? If so, please type "YES": YES
    
    Please enter a password for the system administrator (SA) account: 
    Please confirm the password for the system administrator (SA) account: 
    
    Setting system administrator (SA) account password...
    sqlservr: This program requires a machine with at least 3250 megabytes of memory.
    Microsoft(R) SQL Server(R) setup failed with error code 1. 
    Please check the setup log in /var/opt/mssql/log/setup-20161117-122619.log
    for more information.
    
  • Some simple testing :) From the output below, we learn that: 1) in sys.sysfiles, the full file name is presented like “C:\var\opt\mssql\data\TestDb.mdf”; 2) the database name, at least inside sqlcmd, is not case-sensitive. By the way, the login is also case-insensitive: SA is sA.
    1> create database TestDb;
    2> go
    
    Network packet size (bytes): 4096
    1 xact[s]:
    Clock Time (ms.): total       447  avg   447.0 (2.2 xacts per sec.)
    1> use testdb;
    2> go
    Changed database context to 'TestDb'.
    
    Network packet size (bytes): 4096
    1 xact[s]:
    Clock Time (ms.): total         3  avg   3.0 (333.3 xacts per sec.)
    1> select filename from sys.sysfiles
    2> go
    filename                                                                                                                                                                                                                                                            
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    C:\var\opt\mssql\data\TestDb.mdf                                                                                                                                                                                                                               
    C:\var\opt\mssql\data\TestDb_log.ldf                                                                                                                                                                                                                                
    
  • I then did a quick test of an advanced feature, the Clustered Columnstore Index (CCI). Yes, it’s available in SQL Server for Linux!
    1> create table Person (PersonID int, LastName nvarchar(255), FirstName nvarchar(255))
    2> go
    
    Network packet size (bytes): 4096
    1 xact[s]:
    Clock Time (ms.): total        28  avg   28.0 (35.7 xacts per sec.)
    1> create clustered columnstore index Person_CCI on Person;
    2> go
    
    Network packet size (bytes): 4096
    1 xact[s]:
    Clock Time (ms.): total        25  avg   25.0 (40.0 xacts per sec.)
    1> 
    
    Network packet size (bytes): 4096
    1 xact[s]:
    Clock Time (ms.): total         1  avg   1.0 (1000.0 xacts per sec.)
    
    

Overall, it looks pretty nice! I’ve got to say, I’m really impressed with Microsoft’s embrace of Linux. By the way, if you use Windows 10, I recommend Bash on Ubuntu on Windows. It’s in beta, but it works for me pretty well so far.

Stay tuned for more. I’ll definitely write more as I play with this new toy!

Java regex Matcher’s first group is the whole pattern

I didn’t realize that Java’s regex class, Matcher, uses m.group(0) to denote the entire pattern. I spent some time debugging it. Hence this note.

As is stated in the documentation, “Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.group(0) is equivalent to m.group().”
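
To make the difference concrete, here is a minimal sketch comparing group(0) and group(1) on the same match. The class name GroupDemo, the helper firstMentionGroups, and the sample string are made up for illustration; only the java.util.regex calls are the real API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {

    /**
     * Hypothetical helper: returns {group(0), group(1)} for the first
     * "@name" found in s, or null when there is no match.
     */
    public static String[] firstMentionGroups(String s) {
        Matcher m = Pattern.compile("@([a-zA-Z0-9_-]+)").matcher(s);
        if (!m.find()) {
            return null;
        }
        // group(0) is the whole match; group(1) is the first parenthesized group.
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] g = firstMentionGroups("thanks @Alice!");
        System.out.println(g[0]); // prints "@Alice" (entire pattern)
        System.out.println(g[1]); // prints "Alice"  (capturing group only)
    }
}
```

If you want only the captured text, ask for group(1); group(0) will always include everything the pattern consumed, here the leading @.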

Here is some sample code that picks Twitter user names out of a string using Java. A set is returned, so if a user name is mentioned more than once, it is stored only once. All user names are returned in lowercase. This function has gone through pretty thorough testing and works pretty well.

In addition, this is also a pretty good sample of negative lookbehind regex usage: we do not want to match where @ is preceded by any valid Twitter user name character.

Update: Angle brackets in Java code caused my code formatter to add some junk inside the code. Be aware! I need to look for a good code formatter for WordPress…

package com.haidongji.java;

import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Tweet {

    /**
     * Get usernames mentioned in a string.
     * 
     * @param s
     *          string, a tweet
     * @return the set of usernames mentioned in the text of the tweet.
     *         A username-mention is "@" followed by a Twitter username
     *         A Twitter username is composed of:
     *          English letters, digits, dash, and underscore
     *         The username-mention cannot be immediately preceded or followed
     *         by any character valid in a Twitter username. Therefore:
     *         user@example.com does NOT contain a mention of the username example.
     *         Twitter usernames are case-insensitive
     */
    public static Set<String> getMentionedUsersFromString(String s) {
        Set<String> set = new HashSet<String>();
        String pattern = "(?<![a-zA-Z0-9_-])@([a-zA-Z0-9_-]+)";    // see spec above
        Matcher m = Pattern.compile(pattern).matcher(s);
        while (m.find()) {
            set.add(m.group(1).toLowerCase());
        }
        return set;
    }
}
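
To see what the negative lookbehind buys, here is a small sketch that runs the same character class with and without the (?<!...) prefix. The class LookbehindDemo, the mentions helper, and the sample sentence are hypothetical and exist only for this comparison; they are not part of the Tweet class above.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {

    // Characters assumed valid in a user name, same class as in the post.
    static final String NAME = "[a-zA-Z0-9_-]";

    /** Hypothetical helper: collects lowercase mentions, optionally guarding with a lookbehind. */
    public static Set<String> mentions(String s, boolean useLookbehind) {
        String prefix = useLookbehind ? "(?<!" + NAME + ")" : "";
        Matcher m = Pattern.compile(prefix + "@(" + NAME + "+)").matcher(s);
        Set<String> found = new LinkedHashSet<>(); // keeps insertion order for printing
        while (m.find()) {
            found.add(m.group(1).toLowerCase());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Mail user@example.com or ping @Bob and @bob";
        System.out.println(mentions(text, false)); // prints [example, bob]
        System.out.println(mentions(text, true));  // prints [bob]
    }
}
```

Without the lookbehind, the e-mail address contributes a bogus mention of "example". With it, the @ that follows "user" is rejected, and the two spellings of Bob collapse into one lowercase entry.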

Xi’s China is really progressive and impressive

Below is a comment I made regarding Chinese leader Xi Jinping on Kaiser Kuo’s Facebook page, with very slight changes.

In my view, domestically, China under Xi is on the right path, unlike, say, Brazil, India, and the United States.

I believe Xi is a real believer in socialism and a real progressive. I believe his ideology and moral compass are far better aligned with those of Bernie Sanders and Jeremy Corbyn than those of Hillary Clinton and Theresa May will ever be. His heart is in the right place. I recommend this interview highly. It was done in 2004, when he was the leader of Zhejiang province, way before anybody could know for sure that he would be the future leader of China.

As can be seen in the interview, he honed his leadership skills in a small rural village in Shaanxi (along with the villagers, he built the first biogas/methane facility using biowaste in Shaanxi!), as a deputy in the central military council, as a boss in a small county, Zhengding in Hebei, and in various positions in Fujian and Zhejiang. That leadership showed as soon as he was picked by the party committee as the leader. The proof is in the pudding. Here is a list I quickly came up with:
1. The unrelenting anti-corruption campaign, very effective and with tremendous popular support. At the moment, Premier Li is in Canada, negotiating an extradition treaty to go after corrupt officials in that country. Similar discussions have been held with the other Five Eyes countries (the NSA term for the US, UK, Canada, Australia, NZ…);
2. Emission and pollution control. This encompasses many areas: environmental data transparency, more effective regulation, and the new Environmental Protection Law and its enforcement. This has produced results already: CO2, PM2.5, and other pollutant numbers are on a downward trend. Beijing is having more blue-sky days than in years past. The rest of the country is also getting better. For example, during my last trip to my hometown, I noticed that all the polluting diesel buses had been replaced with clean electric buses;
3. An anti-domestic-violence law was passed and became the law of the land early this year. It was badly, badly needed in a huge developing country like China. It’s hard to overstate the significance of this development, although sadly it’s not reported that much;
4. Supply-side reform (供给侧改革). This is badly named, and it is not to be confused with Reagan’s supply-side economics. Essentially it means curbing overcapacity and polluting sectors, such as the steel and chemical industries;
5. Poverty alleviation. Targeted poverty alleviation (精准扶贫) is an effort to gather accurate and detailed statistics on poverty levels and then provide fiscal assistance based on those numbers;
6. The anti-corruption campaign now also targets corruption in the election process, imagine that! Just last week, the election results for Liaoning People’s Congress deputies (辽宁人大代表) were nullified due to campaign corruption. This is a very encouraging sign;
7. This is hard to quantify, but like neocons and neoliberals who make s**t up and create a toxic social environment for everybody else in the United States, China has its own share of wackos who make things up all the time. I feel that toxic element (money worshiping, consumerism, lying, cheating) is abating since Xi came to power.

Anyhow, I’m too cynical to worship any leader, but I feel really good about this Xi/Li administration.

UMDC: Recommended

I learned about Quincy Carroll’s debut novel, Up to the Mountains and Down to the Countryside, through Jocelyn Eikenburg’s wonderful blog. I really enjoyed it! Quincy Carroll’s levelheaded and nuanced depiction of the two main characters’ experience in China gives us a wonderful, honest perspective that is rarely offered in similar novels, memoirs, or news reports.

It is not unusual for a white person to exhibit a superiority complex in a developing country like China (Guillard, the old male teacher from Minnesota in the novel). Similarly, it is not unusual for a non-white person from a small place in a developing country (Bella, the scheming student from Hunan province) to display signs of an inferiority complex when engaging with white people from the almighty America (or UK, Canada, Australia, Germany…). Superiority and inferiority complexes are two sides of the same coin. When they collide, things happen, sometimes comical, sometimes awkward, sometimes sad, mostly sad.

The author (the other main character in the novel) is honest and compassionate. He is also curious and humble: he took the time to learn the language and speaks it fluently. This enables him to understand and appreciate the local culture and be effective in his teaching. I think that’s the reason he is beloved and respected in that high school. Thanks to his observant eyes and his ability to put things down on paper, we ended up with a wonderful book to learn from and enjoy.

I got my bachelor’s degree in China in the 90s, and had various English teachers from the US in my college. I think I met both types. One was a Vietnam vet, who must have been traumatized by that war. He taught English and made a decent living by simply being a white American without solid skills and/or certifications, from what I could tell. I also had young Peace Corps volunteers, recent college graduates, who were friendly and helpful. Looking back now, they must have been tired of the same questions being asked again and again, yet they were patient enough not to show it and stayed helpful.

Comparing my experience with what’s depicted in the novel, I think the fact that Guillard-types had to go to a small town in Hunan for a job is a sign of progress. It’s tougher for him to make a living through his whiteness and native English language ability in coastal, more developed areas :)

By the way, I think the toxic combination of superiority and inferiority complexes is the reason that we have crappy foreign news reporting from mainstream media: on one hand, we have the arrogant journalists from a developed country with plenty of attitude and preconceived notions, but without the necessary openness and curiosity; on the other hand, we have the local interpreter/comprador types who have their own complex motives. I’m not saying all journalists are like that, but there are enough of them that the western audience is misinformed on many important issues.

Anyway, great book, highly recommended!

Also posted at Amazon, with some edits here.