Yanking and sorting lines matching a pattern

One of the best investments I’ve ever made was becoming proficient with a good cross-platform editor, in my case Vim. It took a good few months before I became really comfortable with it, but those months of struggle have paid huge dividends ever since!

After years of Vim usage, I consider myself a power user. Yet from time to time I come across a nifty tip that reminds me why I fell in love with it in the first place: the sense of wonder, awe, and beauty, and the intelligence of its creators!

Here are two things I learned recently:

  • Sort based on a pattern
    I use :sort and :sort u all the time. :sort does what the word implies: it sorts all lines in the buffer. :sort u (unique) also sorts, but in addition removes duplicate lines. Both commands are extremely useful.

    Yesterday I was doing some email log analysis, and had a bunch of email addresses in my file. And I thought, wouldn’t it be nice if I could sort those addresses based on domain names? So I searched the web, then looked through :help sort. Sure enough, I can absolutely do that.

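    Say you’ve got a buffer with one email address per line. To sort the lines by domain name, type :sort /.\+@/ from normal mode. Vim skips the text matched by the pattern (everything up to and including the @) and sorts on what comes after it.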

  • Yank all matching lines into a register
    I use :g/pattern/d fairly often. It deletes every line in the document that matches the pattern. Since the pattern can be a regular expression, this can be pretty powerful.

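    However, before deleting them, it is sometimes a good idea to save them away. To do that, run
    :g/pattern/yank CapitalLetter

    This command puts the matching lines into a register. The capital letter matters: an uppercase register name appends to the register, whereas a lowercase name would overwrite it on every match and leave only the last matching line. Let’s use X as an example. If register x already holds something, empty it first (for example with qxq). Then, in a different buffer, you can run
    "xp

    And it’ll paste those lines!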
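
For comparison, the domain sort from the first tip can also be tried outside Vim. This is only a sketch: the sample addresses and the sort_by_domain helper are made up for illustration, and it uses sort(1) rather than Vim's :sort, splitting each line at the @ and sorting on what follows.

```shell
# sort_by_domain: read addresses on stdin, write them sorted by domain.
# -t@ makes "@" the field separator; -k2 sorts on everything after it.
# LC_ALL=C keeps the ordering byte-wise, so results do not depend on locale.
sort_by_domain() {
    LC_ALL=C sort -t@ -k2
}

# Hypothetical sample data, one address per line.
printf '%s\n' 'alice@zeta.org' 'bob@example.com' 'carol@mail.net' | sort_by_domain
# prints:
#   bob@example.com
#   carol@mail.net
#   alice@zeta.org
```

Unlike the Vim pattern, which is greedy and skips up to the last @ on the line, this splits at the first @. For ordinary addresses with a single @ the two behave the same.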


1. How to update Ubuntu Linux:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential
2. How to install Python packages:
sudo apt-get install python-pip
sudo apt-get install python-dev
sudo pip install numpy
sudo pip install ggplot…
3. How to install R and R packages:
sudo apt-get install r-base
sudo apt-get install openjdk-7-jdk
sudo R
4. How to download videos quickly and easily:
Firefox with the DownThemAll add-on





Using rsync to back up remote n00 files

I had trouble using rsync to copy remote Linux files with mode 600 (rw-------) today. I knew I had come across this issue before but couldn’t remember how I resolved it, so I had to waste time looking for and verifying a solution. Hence this blog post.

This is the problem I had earlier:

rsync -zr userA@remoteServer:/var/www/website/ /home/user/Documents/webSiteBackup/website/www/
rsync: send_files failed to open "/var/www/website/wp-config.php": Permission denied (13)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1655) [generator=3.1.0]

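So the issue is that wp-config.php has mode 600, meaning only its owner (root in this case) can read and write it. Although userA@remoteServer has sudo privileges, sudo normally prompts for a password, which a non-interactive rsync session cannot answer. I therefore needed to edit the sudoers file with visudo so that sudo won’t ask for a password when this user runs rsync.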
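
To see what mode 600 looks like in practice, here is a small local sketch. The temporary file is only a stand-in for wp-config.php, and stat -c is the GNU coreutils form, so this assumes a Linux box.

```shell
# Create a throwaway file and give it owner-only read/write permissions.
f=$(mktemp)
chmod 600 "$f"

# GNU stat: %a is the octal mode, %A the symbolic form.
stat -c '%a %A' "$f"    # prints: 600 -rw-------
```

Any user other than the owner (and root) gets "Permission denied" on such a file, which is what rsync reported above.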

Here is the line I added with visudo:

userA    ALL=(root) NOPASSWD: /usr/bin/rsync

And here is the slightly modified bash command to run:

rsync --rsync-path="sudo rsync" -zr userA@remoteServer:/var/www/website/ /home/user/Documents/webSiteBackup/website/www/

Hope it helps you as well.

sed tricks

Over the last two days I helped a charity rebuild a MySQL server and restore a database containing a lot of longblob data. Fortunately there was a dump backup file for the database in question.

However, the tables with longblob columns were not defined with “ROW_FORMAT=COMPRESSED”. I wanted to restore the database with row compression in place before inserting the data, so I needed to modify the dump SQL file. The problem was that the file was 2.5 GB and the server only had 4 GB of memory, so opening it in an editor was a challenge. Fortunately, Linux has sed to save the day. Don’t you love free, open source software?

I am a power Vi/Vim user, so I am familiar with sed and have used it in the past. But there were still a few things I had to search around for, so I’ll record the noteworthy points here. I can’t count how many times my own past blog entries have helped me over the years, and I hope you’ll find this helpful too!

  • The -n switch is important. sed is a stream editor, and by default it prints every line it reads to stdout. In many cases you’d like to suppress that, and -n does so. This was especially important in my case, because a) the file is large, and b) it contains blob data that may or may not be “fit to print”;
  • To see a particular line, say line a (where a stands for a line number), use the p (print) command: sed -n 'ap' file
  • To see all lines between line a and line b: sed -n 'a,bp' file
  • To see multiple, non-adjacent lines, say lines a, e, and g: sed -n 'ap;ep;gp' file
  • To edit big files, you’d like to make in-place changes, hence the -i switch. For example, to put in InnoDB row compression, this is the command I used: sed -i 's/CHARSET=utf8;/CHARSET=utf8 ROW_FORMAT=COMPRESSED;/' file
  • Similarly, to delete line a: sed -i 'ad' file. A range delete works the same way: sed -i 'a,bd' file
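
The commands above can be tried safely on a tiny file before touching a real dump. This is a sketch only: the four-line file is a made-up stand-in for a dump, the line numbers are arbitrary, and the bare -i form assumes GNU sed as found on Linux.

```shell
# Hypothetical miniature "dump" file in a temporary location.
dump=$(mktemp)
printf '%s\n' \
    'CREATE TABLE t (' \
    '  data longblob' \
    ') ENGINE=InnoDB DEFAULT CHARSET=utf8;' \
    'INSERT INTO t VALUES (1);' > "$dump"

sed -n '3p' "$dump"        # print only line 3
sed -n '1,2p' "$dump"      # print lines 1 through 2
sed -n '1p;4p' "$dump"     # print the non-adjacent lines 1 and 4

# In-place edit: append the row format to the table options on every matching line.
sed -i 's/CHARSET=utf8;/CHARSET=utf8 ROW_FORMAT=COMPRESSED;/' "$dump"
sed -n '3p' "$dump"        # line 3 now ends with ROW_FORMAT=COMPRESSED;

# Delete line 4 in place; the file is left with three lines.
sed -i '4d' "$dump"
```

On a multi-gigabyte file it can help to stop early once the wanted line has been printed, for example sed -n '3{p;q}' file, so sed does not read the rest of the file.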

By the way, when restoring an InnoDB database with a lot of blob data, it makes a lot of sense to enable the following in my.cnf, if it is not enabled already. It’ll make administration much easier down the road:
innodb_file_format = Barracuda

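On MySQL versions that have the innodb_file_format setting, compressed rows also require innodb_file_per_table to be enabled. You may also need to tweak the max_allowed_packet and innodb_log_file_size parameters for a successful restore.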

Something else to pay attention to:
If you restore the database with

mysql -uuser -p database < dump.sql

the program may report the wrong line when it hits a loading problem. In most cases you need to search the surrounding lines to find where the problem actually is. Additionally, if you are in a hurry and want to load the data without troubleshooting loading issues, you can try adding the -f switch to the command above, so that the restore ignores the errors it encounters and moves on to the next statement.
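
Since the reported line number may be off, it helps to look at a few lines on either side of it without opening the whole file. Here is a minimal sketch using the sed print command from earlier; show_context is a hypothetical helper name, and the default radius of 2 is an arbitrary choice.

```shell
# show_context FILE LINE [RADIUS]: print LINE plus RADIUS lines on each side.
show_context() {
    file=$1
    line=$2
    radius=${3:-2}

    start=$((line - radius))
    [ "$start" -lt 1 ] && start=1    # do not go before the first line
    end=$((line + radius))

    # Print the range, then quit at its last line so sed does not
    # keep reading the remainder of a very large file.
    sed -n "${start},${end}p;${end}q" "$file"
}
```

For example, show_context dump.sql 1234 5 would print lines 1229 through 1239 of dump.sql. Keep in mind that lines holding blob data may be very long and not pleasant to print.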