Archive for Technology

Setting up Windows development environment with VirtualBox

Over the years, I’ve used many virtualization products for testing, hacking, and development work. So far my favorite is VirtualBox.

I’ve written about using Vagrant and Veewee to make that process easier. I’ve since changed my strategy slightly:

  • I don’t recommend using Veewee anymore, because of silly Ruby compatibility issues. I like the idea of Veewee, but in my opinion jumping through hoops to get it to work with Vagrant is no longer worth it. It is often quicker to just build a VirtualBox VM and then create a Vagrant box out of it than to take the extra Veewee step and remember those extra commands. I can’t find the link now, but I thought I read somewhere that Vagrant has plans to make VM box creation easier;
  • I don’t use Vagrant for managing Windows VMs anymore. I tried, but my efforts have not been very successful so far. Again, I read somewhere that Windows support will be provided in a future release of Vagrant. We will see.

My use cases in the past mostly involved a single VM. For networking among Linux VMs, the instructions in my Vagrant post still work. But what if you want to build a Windows VM environment with a domain controller, database servers, web servers, and other application servers, all based on a single VM image? So far I’ve built an environment with a domain controller and a separate application server (I will soon install SQL Server on it). Here are my notes for reference. The host machine is a Lenovo W520 running Linux Mint 14.

  1. Download Windows 2012 ISO here;
  2. Start creating a base Windows 2012 VM in VirtualBox. Use the vdi disk type to make it easier to work with VirtualBox;
  3. I could have installed the Datacenter eval edition, but chose Standard instead. I didn’t choose the Core edition. Using the Core edition might make more sense in a real production environment, but I’d like to get something going quickly and also want to use the same image for all future VM needs, so the Standard non-Core install works best for now. I may come back and play with the Core edition down the road;

    Note: There are sites that talk about the possibility of activating Windows with an official license key, so that future VMs based on this image no longer need activation. My advice is: don’t bother with it. Licensing issues aside, my reasoning is that if you decide the VMs are going to be part of a domain, they each need a unique SID, which I think requires a separate Windows key. Plus, I’ve run eval editions of Windows that had passed their expiration date and still functioned, probably in some limited capacity, but so far it hasn’t affected me. So I am hopeful that, for my purposes, the eval edition of Windows 2012 will be enough.

  4. Build the base VM with NAT networking. It is convenient for the VM to have direct access to the web. We will add additional network card(s) when we clone this base image.

    Once the base VM is built, copy that single vdi file somewhere else. Then remove/delete this base VM from your system altogether;

  5. Make the base vdi immutable by running this:
    VBoxManage modifyhd pathTo/myBase.vdi --type immutable
    
  6. Build a domain controller based on this image. When creating a VirtualBox hard disk, use the base vdi file. At this point, add in a second network adapter, which should be type “Internal Network”. Take the VirtualBox default value for the rest of the fields;
  7. Finish building the VM, then give it a proper name, such as DCServer or whatever you feel is appropriate;
  8. Shut down the VM, then run this command against the vdi file in your new VM’s Snapshots folder (part of the path is quoted because it contains a space):
    VBoxManage modifyhd ~/"Virtual Machines/myNewVM/Snapshots/{someName}.vdi" --autoreset false
    

    This step is necessary, otherwise VirtualBox resets the VM to the base image after every reboot;

  9. Now install the appropriate software and set this VM as a domain controller. Follow instructions here.
  10. Configure the network card for the “Internal Network” by assigning a fixed IP address. If you’ve followed the instructions, the second network card should be called Ethernet 2. I didn’t bother with IPv6, and gave my VM the fixed IP address of 192.168.1.10. I left everything else to its default value. Note that after entering the IP address, the Subnet mask will be 255.255.255.0. I didn’t change that and didn’t touch the rest of the fields;
  11. Turn off all Windows firewalls. In addition, open up “Group Policy Management”, expand to Domains->MyDomain->Default Domain Policy. Right click and choose “Edit…”. Group Policy Management Editor opens up. Under “Computer Configuration”, expand to Policies->Windows Settings->Security Settings->Account Policies. Change password policy accordingly.
  12. Take a snapshot of the machine. Mark every disk file under the Snapshots folder so that its autoreset is false.
  13. Now build another VM. Add the second network adapter for “Internal Network”. When done, shut it down and change its vdi disk file’s autoreset property, as described above;
  14. Because this VM and the domain controller VM use the same base image, by default they will have the same SID, which will prevent them from being in the same domain. Fix that by starting this new VM and running sysprep. Follow the instructions here, if you need them.
  15. Reboot the new VM and configure its “Ethernet 2” adapter so that it can join the domain we created earlier. Here is my configuration. Once again I didn’t touch IPv6:
    IP address: 192.168.1.11
    Subnet mask: 255.255.255.0
    Default gateway: 192.168.1.10
    Preferred DNS server: 192.168.1.10
  16. Rename the VM to an appropriate name, for example, sql1. Then join the domain we created earlier. Obviously the domain controller VM needs to be up, and you need to have run sysprep on this second VM. If you haven’t assigned a different SID to this new VM, you will receive an error message like “the domain join cannot be completed because the SID of the domain you attempted to join was identical to the SID of this machine …”
  17. Repeat the process of building other VMs to join the domain, if you have the need.
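
If you build several VMs this way, the two VBoxManage calls from steps 5 and 8 are worth scripting. Below is a minimal Python sketch that only builds the argument lists and prints them; it does not run VirtualBox. The helper names and the example paths are hypothetical, and the flags are simply the ones used in the steps above. Passing the arguments as a list sidesteps shell quoting problems with the space in "Virtual Machines".

```python
import os

def immutable_cmd(base_vdi):
    """Argument list for step 5: mark the base disk immutable."""
    return ["VBoxManage", "modifyhd", base_vdi, "--type", "immutable"]

def autoreset_off_cmd(diff_vdi):
    """Argument list for step 8: stop the differencing disk from resetting."""
    # expanduser replaces a leading "~" because no shell is involved to do it.
    return ["VBoxManage", "modifyhd", os.path.expanduser(diff_vdi),
            "--autoreset", "false"]

# Hypothetical paths, for illustration only.
commands = [
    immutable_cmd("pathTo/myBase.vdi"),
    autoreset_off_cmd("~/Virtual Machines/myNewVM/Snapshots/someName.vdi"),
]
for cmd in commands:
    print(cmd)
    # To actually run a command: subprocess.run(cmd, check=True)
```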

Any comments/suggestions welcome. Good luck!


Install sqlps as a PowerShell module

Most SQL Server automation scripts written in PowerShell use SMO directly. For example, one would do something like:

# Load SMO, connect to the server named HOME, then create a Database object on it
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO")
$MyServer = New-Object ('Microsoft.SqlServer.Management.Smo.Server') 'HOME'
$MyDataBase = New-Object ('Microsoft.SqlServer.Management.Smo.Database') ($MyServer, "MyDataBase")

That’s a lot of typing and looks messy and cumbersome.

Since SQL Server 2008, Microsoft has provided a sqlps shell that exposes many SQL Server objects and functions for easy browsing and inspection, but one couldn’t import it into existing scripts for easy automation. With the release of the SQL Server 2012 Feature Pack, it is now possible to expose sqlps functions in your scripts by running:

Import-Module sqlps

Here is how to get sqlps installed so it is available to import as a module. I’ve tested this successfully on both Windows Server 2008 R2 and Windows Server 2012. To get some idea on how to use it, perhaps some of my scripts here can get you started :)

1. Go to SQL Server 2012 Feature Pack site

2. Download and install the 3 items below
Microsoft System CLR Types for Microsoft SQL Server 2012
Microsoft SQL Server 2012 Shared Management Objects
Microsoft Windows PowerShell Extensions for Microsoft SQL Server 2012

BeginDigression
Given a choice, I’d rather write automation programs on Windows in either Python or Perl, because both languages come with comprehensive testing frameworks that support TDD. With test cases and test harnesses, it is much easier to build solid, high-quality software.

However, if for whatever reason it is difficult to use Python or Perl, then the natural choice on Windows is PowerShell, because it comes with the OS. Too bad PowerShell does not have a solid testing framework. I have been using PSUnit, which leaves a lot to be desired for people who are used to xUnit frameworks, but it is still worthwhile to use.
EndDigression


Gathering SQL Server database free space and space usage by schema

I analyzed SQL Server database free space and space usage by schema twice in the last few months, without any outside monitoring tools. I am writing it down here for future reference. A significant part of the code came from this article by The MAK, with slight modifications, and I’ve verified that it gives the correct results.

T-SQL to get row count and space usage by schemas

--To get a particular schema, add "AND a3.name = 'schemaName'" to the WHERE clause
SELECT
 --(row_number() over(order by a3.name, a2.name))%2 as l1,
 a3.name AS [schemaname],
 count(a2.name ) as NumberOftables,
 sum(a1.rows) as row_count,
 sum((a1.reserved + ISNULL(a4.reserved,0))* 8) AS reservedKB,
 sum(a1.data * 8) AS dataKB,
 sum((CASE WHEN (a1.used + ISNULL(a4.used,0)) > a1.data THEN
   (a1.used + ISNULL(a4.used,0)) - a1.data ELSE 0 END) * 8 )AS index_sizeKB,
 sum((CASE WHEN (a1.reserved + ISNULL(a4.reserved,0)) > a1.used THEN
   (a1.reserved + ISNULL(a4.reserved,0)) - a1.used ELSE 0 END) * 8) AS unusedKB
FROM
 (SELECT
  ps.object_id,
  SUM (
   CASE
    WHEN (ps.index_id < 2) THEN row_count
    ELSE 0
   END
   ) AS [rows],
  SUM (ps.reserved_page_count) AS reserved,
  SUM (
   CASE
    WHEN (ps.index_id < 2) THEN
     (ps.in_row_data_page_count + ps.lob_used_page_count + ps.row_overflow_used_page_count)
    ELSE (ps.lob_used_page_count + ps.row_overflow_used_page_count)
   END
   ) AS data,
  SUM (ps.used_page_count) AS used
 FROM sys.dm_db_partition_stats ps
 GROUP BY ps.object_id) AS a1
LEFT OUTER JOIN
 (SELECT
  it.parent_id,
  SUM(ps.reserved_page_count) AS reserved,
  SUM(ps.used_page_count) AS used
  FROM sys.dm_db_partition_stats ps
  INNER JOIN sys.internal_tables it ON (it.object_id = ps.object_id)
  WHERE it.internal_type IN (202,204)
  GROUP BY it.parent_id) AS a4 ON (a4.parent_id = a1.object_id)
INNER JOIN sys.all_objects a2  ON ( a1.object_id = a2.object_id )
INNER JOIN sys.schemas a3 ON (a2.schema_id = a3.schema_id)
WHERE a2.type <> 'S' and a2.type <> 'IT'
group by a3.name
ORDER BY a3.name
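
The arithmetic in the outer SELECT is easier to follow on one table's numbers. Here is a small Python sketch of the same calculation. It assumes 8 KB pages (hence the `* 8` in the query) and ignores the internal-table join (a4) for simplicity; the function name and the sample page counts are made up for illustration.

```python
PAGE_KB = 8  # SQL Server pages are 8 KB, which is why the query multiplies by 8

def space_usage_kb(reserved_pages, used_pages, data_pages):
    """Mirror the CASE expressions: index = used - data, unused = reserved - used."""
    index_pages = max(used_pages - data_pages, 0)      # never negative
    unused_pages = max(reserved_pages - used_pages, 0)  # never negative
    return {
        "reservedKB": reserved_pages * PAGE_KB,
        "dataKB": data_pages * PAGE_KB,
        "index_sizeKB": index_pages * PAGE_KB,
        "unusedKB": unused_pages * PAGE_KB,
    }

# Made-up page counts for a single table.
usage = space_usage_kb(reserved_pages=1000, used_pages=900, data_pages=600)
print(usage)
```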

T-SQL to get database free space

-- database free space by file
select
      name
    , filename
    , convert(decimal(12,2),round(a.size/128.000,2)) as FileSizeMB
    , convert(decimal(12,2),round(fileproperty(a.name,'SpaceUsed')/128.000,2)) as SpaceUsedMB
    , convert(decimal(12,2),round((a.size-fileproperty(a.name,'SpaceUsed'))/128.000,2)) as FreeSpaceMB
from dbo.sysfiles a
--sum of all database free space, excluding log file free space
select
    sum(convert(decimal(12,2),round((a.size-fileproperty(a.name,'SpaceUsed'))/128.000,2)))/1024 as FreeSpaceGB
from dbo.sysfiles a where groupid > 0
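
The division by 128 in both queries is a unit conversion: `size` and `SpaceUsed` are counts of 8 KB pages, and 128 pages make 1 MB. A minimal Python sketch of that conversion, with a hypothetical function name and made-up page counts:

```python
PAGES_PER_MB = 128  # 128 pages * 8 KB = 1024 KB = 1 MB

def file_space_mb(size_pages, used_pages):
    """Return (FileSizeMB, SpaceUsedMB, FreeSpaceMB), rounded to 2 decimals."""
    size_mb = round(size_pages / PAGES_PER_MB, 2)
    used_mb = round(used_pages / PAGES_PER_MB, 2)
    free_mb = round((size_pages - used_pages) / PAGES_PER_MB, 2)
    return size_mb, used_mb, free_mb

# Made-up numbers: a 12800-page file with 9600 pages in use.
print(file_space_mb(size_pages=12800, used_pages=9600))  # (100.0, 75.0, 25.0)
```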


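On Software Testing: Continuous Integration Testing

Having covered unit testing, I'd like to share what I consider another important testing step for a development team: continuous integration testing.

Software development of any real size, commercial or open source, is generally done by a team. When a team writes software, the work inevitably gets divided up. For example, Zhang San might write libraries, classes, and/or functions for handling customer orders; Li Si writes the inventory management code; Wang Wu is responsible for logistics and purchasing; and so on. Of course, Zhang San may also find and fix a problem in Li Si's code. We assume here that the engineers on the team have already standardized on one unit testing framework and do write unit tests. Eventually, though, all of this code has to be brought together through the code repository, to make sure that, as a whole, it passes all unit tests, has no conflicts or compile errors, and that the compiled binary/executable meets the agreed specification. This merging of source code, passing all unit tests as a whole, linking and compiling the code, and running some tests after the build is integration testing.

In a team development environment, the components written by different team members depend on one another. If we integrate the code only once every few days, once a week, or even less often, we are likely to run into situations like these: components from different members conflict; changes to interfaces or to the production environment are not communicated to every team member in time; a bug fix in one part of the code breaks another part; and so on. This introduces bugs, forces partial or full code rollbacks, leaves the code uncompilable, and can even delay a product release or upgrade. To solve this common problem, we need to run integration tests as early as possible, as often as possible, and automatically. That is how continuous integration testing came about.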

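Automated continuous integration testing is done by software. There are quite a few continuous integration testing tools in the industry today, both open source and commercial, such as Buildbot (written in Python), Jenkins (written in Java), TeamCity, CruiseControl, and others. At the bottom of this post I give two links to real-world integration testing sites: one is MariaDB's, which uses Buildbot; the other is Percona's, which uses Jenkins. After reading this post you can visit those two sites and work out what their integration tests do, what reports they produce, and so on; I hope you find them inspiring. An integration testing setup generally has the following characteristics:

1. It is hooked into the code repository. A code commit/check-in automatically triggers integration testing;
2. Besides tests triggered automatically by commits, integration tests can usually also be scheduled to run periodically, for example once every night;
3. During or after compiling and building the binary, the setup can hook into other mechanisms, such as generating documentation, packaging the software, running other tests, or even releasing to production;
4. It has a web interface that reports integration test results. The results include the integration test status, which code changes, bug fixes, and new features the build includes, code coverage reports, and so on.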

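In general, if the team and product are of a certain size, consider setting up a dedicated build server to carry out the steps above. Continuous integration testing brings the following advantages:
1. It finds conflicts between different team members' code as early as possible, reducing or even eliminating the stubborn problems that hurt the whole project when code cannot be integrated quickly;
2. It promotes communication within the team;
3. In an emergency, if you need to deploy the latest software to put out a fire in production, you already have a binary that can be deployed directly.

I personally think setting up continuous integration testing is very important for a team. From my understanding and experience, the following points deserve attention:
1. As mentioned above, it is important for the team to use one unified unit testing framework. With this framework in place, engineers need to write unit tests;
2. Get into the habit of pulling updates from the code repository into your own code often, at least once every working day. You can commit code several times a day;
3. Before committing code, run all unit tests in full, including those written by other team members. If the tests do not pass, do not commit your code. Besides, if your code is committed without passing the unit tests, the integration testing setup will catch the problem and publish it on the web page, which will be rather embarrassing for you.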

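OK, that's all for this time. To be continued next time, when I may write about acceptance testing, stress/performance testing, and the like. As mentioned above, here is the link to MariaDB's continuous integration testing site (Buildbot); here is the link to Percona's continuous integration testing site (Jenkins).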


Cliffs Notes: administrating Active Directory with PowerShell

1. Install ActiveDirectory modules by running PowerShell as Administrator and executing the commands below:

PS C:\Windows\system32> Import-Module ServerManager
PS C:\Windows\system32> Add-WindowsFeature RSAT-AD-PowerShell
Success Restart Needed Exit Code Feature Result
------- -------------- --------- --------------
True    No             Success   {Active Directory module for Windows Power...

2. Link for newly added Active Directory cmdlets after installation, pretty useful;

3. There are two types of Active Directory groups: distribution and security. The former is for email distribution and the latter is what most people have in mind: groups used for security and access management. So if you are not sure, use a security group;

4. LDAP search strings and DNs (Distinguished Names) follow the convention of putting the contained item first. In other words, scope increases from left to right. For example, “CN=eastRegion,OU=Benefits,OU=HR,DC=research,DC=hardware,DC=acme,DC=com” can be a security/permission group called eastRegion that belongs to OU (Organizational Unit) Benefits, which belongs to OU HR, under the domain research.hardware.acme.com. DC stands for Domain Component, by the way.

5. This site has info on filtering of Active Directory. Very useful. Below is an example of getting all security groups. You may need to run Import-Module ActiveDirectory first:
$a = Get-ADGroup -SearchBase "OU=Benefits,OU=HR,DC=research,DC=hardware,DC=acme,DC=com" -Filter 'GroupCategory -eq "security"'

6. Oh, dsa.msc, if you have it installed, is a useful GUI tool. Good for verification and quick glances.
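
To make the left-to-right scoping of a DN from point 4 concrete, here is a small Python sketch that splits the example DN into its components and rebuilds the domain name from the DC parts. It is deliberately naive: it assumes no escaped commas or equals signs in the values, and the function names are my own, not part of any Active Directory library.

```python
def parse_dn(dn):
    """Split a DN into (attribute, value) pairs, most specific first.

    Naive on purpose: assumes values contain no escaped commas or equals signs.
    """
    pairs = []
    for part in dn.split(","):
        attr, _, value = part.strip().partition("=")
        pairs.append((attr.upper(), value))
    return pairs

def domain_of(dn):
    """Join the DC components, in order, into a DNS-style domain name."""
    return ".".join(value for attr, value in parse_dn(dn) if attr == "DC")

dn = "CN=eastRegion,OU=Benefits,OU=HR,DC=research,DC=hardware,DC=acme,DC=com"
print(parse_dn(dn)[0])   # ('CN', 'eastRegion')
print(domain_of(dn))     # research.hardware.acme.com
```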


PowerShell TDD with PSUnit: some usage examples

I discussed setting up PSUnit for unit testing PowerShell before. This is a quick followup for my own record and consumption in the future. I will update this post as I find more interesting things to record.

1. Put PowerShell functions in one file such as myBaseFunctions.ps1;
2. Create a test directory and under that directory, create a test case file called myBaseFunctions.Test.ps1;
3. Here are the first two lines of the test case file:

. PSUnit.ps1
. "c:\properPath\myBaseFunctions.ps1"

4. A sample test case:

function Test.getFileName()
{
    #Arrange
    $expectedResult = "X:\somePath\someName.sql"
    #Act
    $actualResult = functionBeingTested -Para1 "someValue" -Para2 "someValue"
    #Assert
    Assert-That -ActualValue $actualResult -Constraint {$actualResult -eq $expectedResult}
}

5. Running tests on command line:
PS C:\myDirectory> PSUnit.Run.ps1 -PSUnitTestFile properPath\myBaseFunctions.Test.ps1


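On Software Testing: Unit Testing

In my last post I wrote about the importance of testing. Now let me talk about the most basic kind of testing in software development: unit testing.

Unit testing is mainly used to check the correctness of a class, function, or method. I think the vast majority of engineers do all sorts of unit testing while programming. But it was not until around the end of the last century and the beginning of this one that systematic, automated unit testing frameworks were really established (see the note below). Here we have to mention Kent Beck and Erich Gamma.

Kent Beck was programming in Smalltalk in the late 1990s, and during that period he wrote SUnit, a framework written in Smalltalk for unit testing Smalltalk programs. Building on that, he and Erich Gamma together created JUnit, a framework written in Java for unit testing Java. Starting with JUnit, the family of testing frameworks collectively known as xUnit spread like wildfire to other languages: Python (unittest), PHP (PHPUnit), .NET languages such as C# and VB.NET (NUnit), and so on.

The architecture and basic idea of xUnit are actually very simple: Arrange-Act-Assert:
1. Set up the input values and expected values, faking it (mocking) where necessary. When your unit test needs the support of large, heavy, or complex external resources (such as a database or another complex API), you can substitute a fake stand-in, a mock. This step is Arrange, where the expected values are set;
2. Call the function/method/class under test and get the actual value. This is Act;
3. Compare the expected value with the actual value and report the test result. This is Assert.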
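
Here is what those three steps might look like in Python's standard unittest module, with unittest.mock standing in for a database. This is only a sketch: `order_total` and the `fetch_prices` method are hypothetical names made up for the example.

```python
import unittest
from unittest import mock

def order_total(order_id, db):
    """Hypothetical function under test: sums line item prices from a database."""
    return sum(db.fetch_prices(order_id))

class OrderTotalTest(unittest.TestCase):
    def test_order_total(self):
        # Arrange: the expected value, plus a mock standing in for the real database
        db = mock.Mock()
        db.fetch_prices.return_value = [10, 20, 30]
        expected = 60
        # Act: call the code under test
        actual = order_total(42, db)
        # Assert: compare expected and actual values
        self.assertEqual(expected, actual)
        db.fetch_prices.assert_called_once_with(42)

suite = unittest.TestLoader().loadTestsFromTestCase(OrderTotalTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

The mock keeps the test fast and independent of any real database, which is the point of step 1.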

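You may say: what's the big deal, isn't that exactly what the ad hoc tests I do while programming look like? True, but a testing framework like xUnit provides automation, repeatability, and convenience.

1. Automation
A good xUnit testing framework comes with a so-called test harness, that is, the scaffolding, tools, and other supporting machinery bundled with it. Once test cases are written according to the xUnit conventions, you can use this machinery to run all or some of the test cases with ease, without clumsy manual adjustments;

2. Immediacy
Closely related to the above is the immediacy xUnit brings. After writing one test case, you can easily copy/paste/adjust to write more. Because of the automation, running these cases is quick and convenient, and you see at once whether they pass. That way you can strike while the iron is hot (趁热打铁, as the Chinese idiom goes) and solve the problem quickly and effectively while the ideas and information are still fresh in your mind. By the way, English has the same saying. I don't know whether it was translated or borrowed in one direction or the other; I suspect the two arose independently.

Unit tests are your safety net. As your unit tests grow in number, the net gets denser and fewer fish slip through. Tests give you solid assurance of your software's quality.

If we agree on the importance of unit testing and automation, I have the following suggestions to share, so we can learn and improve together. Getting testing right is, as the proverb says, like sharpening the axe, which never delays the cutting of firewood: it will greatly increase your skill!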

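1. Search for and install the xUnit testing framework that fits your programming language. For example, if you write PHP, you can use PHPUnit. If you use Python, version 2.7 or later, you can use the standard unittest module for testing; note that you should install nose as the harness for these tests, so you can conveniently invoke your test cases. Most other languages have an xUnit too; a quick search will find it;

2. Spend time learning how to write unit test cases and how to run unit tests effectively. There should be plenty of tutorials, how-tos, and the like;

3. When doing new development, write unit tests for new functions/methods/classes. When you have time or need to modify existing code, and that code has no unit tests, introduce unit tests gradually;

4. Unit tests are part of the code too, so of course they belong in the code repository;

5. Use copy and paste to write test cases covering edge-case parameters, typical parameters, and other situations. Disk is cheap, and storing a few more cases takes little space. Note that good documentation and naming are a great help in managing test cases;

6. If a problem with your function/method/class shows up in integration testing, quality assurance, or even production, first write a unit test case to catch the fish that slipped through the net, and then modify the code to make the test pass. That way your safety net is woven a little tighter.
After finishing the fix, reflect on why you did not think to write this test in the first place. Why did you not imagine a fish with this shape and this trick for escaping the net you had woven? Through this kind of reflection you will gradually sharpen your programming insight, instincts, and acuity.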

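OK, that's it for now. To be continued in the next post. I may write about integration testing and continuous integration testing, or devote a post to practical examples of unit testing, for instance in Python. If you have questions or comments, please share them, so we can learn from each other and move forward together.

Note: Perl's TAP (Test Anything Protocol) deserves a special mention. As early as 1987, with the release of the first version of Perl, it already had its own unit testing mechanism. That is about 10 years earlier than SUnit (the predecessor of xUnit). Since then, most modules published on CPAN have gone through TAP-compliant unit tests with Test::More. I personally think this is one of the main reasons Perl, as a veteran scripting language, remains impressive and ever fresh to this day. So if you program in Perl, I strongly recommend that you master Test::More and the related Test::Harness module.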


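On Software Testing: The Importance of Testing

Preface: At the Velocity conference held last December by Taobao and O'Reilly, I made many good friends. I had many ideas I wanted to share at the time, and after coming back I wrote a series called "Some experiences and suggestions on participating in open source software and interacting with the community". After that I had been planning to write a series on testing, because I firmly believe good testing is a necessary condition for producing high-quality software. Later, with family visiting from various places, plus poor time management (mainly laziness and a lack of self-discipline) and other reasons, I am only now getting started on this series. This is the opening post.

From my own hands-on experience and from reading about others' experience, I feel we can conclude that producing high-quality, highly reliable software and hardware products mostly follows this process:

  1. Build a prototype as soon as possible;
  2. Get the prototype into a production environment as soon as possible for testing and feedback;
  3. Feed the data and lessons learned in production into the next round of bug fixes and performance improvements as soon as possible;
  4. Get the improved product back into the production environment as soon as possible for more testing and feedback;
  5. Repeat the steps above.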

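In such a process, the following aspects deserve special attention:

1. Build a prototype as soon as possible and test it in real use (or in an environment as close to production as possible).
Interacting with users, understanding real-world usage scenarios, searching for, reading, and understanding the articles and accumulated experience of peers in the industry, and searching for, reading, and analyzing online and offline documentation on various languages and solutions are all very important. But until the ideas and the information gathered are put into practice, those theories, analyses, and studies remain armchair strategy. Once you put things into practice, you deepen and consolidate your understanding of what you already know, you run into and solve problems that neither you nor others imagined or encountered, and you adapt the theory and procedures you have learned and apply them to your own environment. This is an important principle of Kent Beck's Extreme Programming (XP), and one that is extremely effective in practice. It makes the same point as Lu You's poem "Reading on a Winter Night, Shown to My Son Ziyu", written in 1199:

The ancients spared no effort in their learning; the work of youth bears fruit only in old age.
What is learned from paper always feels shallow; to truly know a thing, you must practice it yourself.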

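2. Fast testing and feedback require a testing setup that is reliable, convenient, and as automated as possible.
Once the prototype is out, it will certainly have problems of one kind or another: some parts need improvement, some need new features, and so on. While making improvements we must also make sure existing functionality does not degrade (regression bugs). The only effective way to achieve this is to build effective, systematic, automated test scaffolding. With such a testing setup, together with an effective source control system, we dare to improve and refactor code, and we have enough confidence to guarantee the quality and reliability of the code.

A while ago I read Kent Beck's book on TDD (Test-Driven Development), where he gives an example that I find quite apt. We all know that drawing water with a windlass takes less effort than hauling it straight up from the well, and is an effective way to draw water. Suppose that with the bucket halfway up you get tired, or something else happens and you cannot hold on: all your effort is wasted. But if you install a ratchet on the windlass that locks in your progress and keeps the rope from slipping back, you can draw a lot of water easily and confidently. An effective, automated testing setup is that ratchet on your windlass. It gives you confidence and room to maneuver, lets you fix, improve, refactor, and add new features boldly and with peace of mind, gives you the confidence and the practical means to build and improve the product, and lets you extend your gains while preserving the functionality you already have, fighting a clean, crisp battle with no worries behind you (or with worries minimized)!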

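In programming, if we treat every class and every function as a product, then by the rule above, "build a prototype as soon as possible and test it in real use", we can easily understand the principle of TDD:
a. We have a preliminary plan for the interface of some class or function, that is, we have already designed a prototype in our head;
b. Before implementing the class or function, we write a unit test case that exercises the prototype of the function/class in real use;
c. We run the unit test, and this first test should fail;
d. OK, next we write code to implement the prototype of the function or class until the unit test passes;
e. We then write more tests to harden the code of the function/class, until the new unit tests pass;
f. Repeat the steps above to build other functions/classes.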
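
Steps a through d can be sketched in a few lines of Python with the standard unittest module. Everything here is hypothetical and chosen only for illustration: the `slugify` function, its expected behavior, and the `run_test` helper. The same test is run first against a stub (it fails) and then against a real implementation (it passes).

```python
import os
import unittest

def run_test(slugify):
    """Run one unit test against the given implementation; True if it passes."""
    class SlugifyTest(unittest.TestCase):
        def test_spaces_become_dashes(self):
            self.assertEqual(slugify("Hello World"), "hello-world")

    suite = unittest.TestLoader().loadTestsFromTestCase(SlugifyTest)
    with open(os.devnull, "w") as sink:  # keep the runner's report quiet
        return unittest.TextTestRunner(stream=sink).run(suite).wasSuccessful()

# Steps a-c: only the interface exists, so the first test run fails.
def slugify_stub(text):
    raise NotImplementedError

# Step d: implement the function until the test passes.
def slugify(text):
    return "-".join(text.lower().split())

print(run_test(slugify_stub))  # False
print(run_test(slugify))       # True
```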

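Written this way, these unit tests become the ratchet and scaffolding that lock in what has already been achieved. They give us quality assurance and confidence, and lay a solid foundation for the stability, reliability, and high quality of the whole product.

That's it for now. In the next post of this series, I will write about unit testing.

Postscript: This virtuous cycle of building a prototype, testing it in real use and gathering feedback, and then improving applies not only to software development. I personally think it can also be put to use in hardware, industrial manufacturing, the design and development of energy and environmental protection products, and many other areas. Examples include the development and improvement of energy-efficient cars, the development and application of components for high-speed rail and large aircraft, feasibility studies and applications of energy products such as algae/seaweed-based energy and wind, geothermal, and solar power, and the transformation of many industries toward innovation and high added value. All of this has very practical significance for China and many other late-developing countries. Some things you can learn online, from books, and at conferences; some things others will hide away and not want you to learn; but unless you apply and try things yourself, they will never become your own. Be neither arrogant nor self-deprecating. Build up a sound system of development, testing, feedback, and improvement step by step, have the courage to try boldly and act, and take every step steadily, and in the end the one in the lead will be you.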


Thinking of changing hosting company

I purchased shared hosting service from Midphase years ago and that is where this site is hosted. It is up for renewal before 10/11.

Every year around this time, for the last 2 or 3 years, I have toyed with the idea of switching to a different hosting company, because of the inflexibility of not having root access. The current annual price from Midphase is around $136, with this breakdown: haidongji.com (MPan) $96 and DevPack – Unlimited MySQL & Subdomains $40. Here are some server specs:

# Percona Toolkit System Summary Report ######################
        Date | 2012-09-27 21:13:49 UTC (local TZ: MDT -0600)
    Hostname | some server name
      Uptime | 197 days, 12:31,  1 user,  load average: 5.87, 4.95, 4.89
    Platform | Linux
     Release | CloudLinux Server release 5.8
      Kernel | 2.6.18-374.18.1.el5.lve0.8.57
Architecture | CPU = 64-bit, OS = 64-bit
   Threading | NPTL 2.5
    Compiler | GNU CC version 4.1.2 20080704 (Red Hat 4.1.2-52).
     SELinux | Disabled
 Virtualized | No virtualization detected
# Processor ##################################################
  Processors | physical = 1, cores = 8, virtual = 1, hyperthreading = no
      Speeds | 1x2400.000
      Models | 1xAMD Opteron(tm) Processor 6136
      Caches | 1x512 KB
# Memory #####################################################
       Total | 15.5G
        Free | 500.9M
        Used | physical = 15.0G, swap allocated = 4.0G, swap used = 397.2M, virtual = 15.4G
     Buffers | 835.2M
      Caches | 5.3G
       Dirty | 254020 kB
     UsedRSS | 18.6G
  Swappiness | 0
 DirtyPolicy | 10, 80
 DirtyStatus | 0, 0

I looked at some Virtual Private Server (VPS) providers. Linode’s smallest offering, Linode 512, provides 512 MB of RAM and 20 GB of storage at a cost of $20 per month. NTC Hosting’s VPS Hosting vBox 1 provides 1 GB of RAM and 10 GB of storage at a cost of $20 per month. I’ve also browsed Rackspace, InMotion, and others. If I do decide to go with a VPS, I will probably pick between Linode and NTC Hosting. Either way, it looks like I would need to pay about $100 extra per year if I go that route. And I would need to transfer the domain name, which I’ve never done before. There is also the question of whether my site would still be accessible in China if I change hosting companies…
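
For the record, the "about $100 extra" figure is just this arithmetic, written out as a few lines of Python using only the prices quoted above (the variable names are mine):

```python
current_annual = 96 + 40   # Midphase: hosting plus DevPack, in dollars per year
vps_annual = 20 * 12       # either VPS plan at $20 per month

print(current_annual)               # 136
print(vps_annual)                   # 240
print(vps_annual - current_annual)  # 104
```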

Anyway, if you have any recommendations on hosting service providers and comments, I am all ears.


One example of query performance problem due to data type conversion

In many RDBMSs, if the value passed to a query for filtering/matching is of a different data type than the column it is compared against, an implicit data conversion occurs. This conversion can render the index defined on that column less useful or entirely useless. I encountered one such problem on SQL Server today.

Here is the relevant “WHERE CLAUSE”:

WHERE DomainNTLogin = SUSER_NAME()

Column DomainNTLogin is defined as varchar(100), but SQL Server’s SUSER_NAME() returns nvarchar(128). Note the letter “n” in nvarchar: the difference is subtle but significant. nvarchar stores Unicode data and is used for languages such as Arabic, Hebrew, Chinese, Japanese, and others. Because nvarchar has higher data type precedence than varchar, it is the column, not the function result, that gets converted. As a result, the index defined on DomainNTLogin is not used, and one sees a CONVERT_IMPLICIT in the query’s execution plan:

CONVERT_IMPLICIT(nvarchar(100),[DbName].[SchemaName].[TableName].[ColumnName],0)=SUSER_NAME()

What that tells us is that the query engine needs to read all values in ColumnName, which is a scan operation, and convert each of them to nvarchar before the comparison can be made. Ideally, we’d like to see a seek instead of a scan!

Fortunately, this problem is easy to fix. Instead of letting the query engine do the implicit conversion, we do the conversion for it:

WHERE DomainNTLogin = CAST(SUSER_NAME() AS VARCHAR(100))

This change turns a table scan into an index seek, decreasing this query’s execution time from around 16 minutes to less than 1 second!
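
The difference between the two plans can be illustrated with a toy model in Python. To be clear, this is only an analogy and not how SQL Server is implemented: a sorted list of byte strings stands in for the index on the varchar column, a Python str stands in for the nvarchar parameter, and all names are made up. Converting every indexed value forces a walk over the whole list, while converting the parameter once lets us binary-search it.

```python
import bisect

# A sorted list of byte strings stands in for an index on a varchar column.
index = sorted(name.encode("ascii") for name in ["alice", "bob", "carol", "dave"])

def scan_lookup(value):
    """Convert every indexed value to match the parameter: a full scan."""
    conversions = 0
    found = False
    for key in index:
        conversions += 1
        if key.decode("ascii") == value:
            found = True
    return found, conversions

def seek_lookup(value):
    """Convert the parameter once, then binary-search the index: a seek."""
    key = value.encode("ascii")
    pos = bisect.bisect_left(index, key)
    found = pos < len(index) and index[pos] == key
    return found, 1

print(scan_lookup("carol"))  # (True, 4): one conversion per indexed row
print(seek_lookup("carol"))  # (True, 1): one conversion in total
```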

Years ago, I seem to remember that a JDBC driver for SQL Server from a certain vendor turns all varchar values to nvarchar by default, and I believe/hope there is a setting to turn that off. Inform me, dear reader, if this is the case and you know how to adjust that setting, as needed.

