Thursday, January 31, 2013

bash versus Powershell

Occasionally, I find myself dumbfounded at how difficult something is in Powershell that is just brain-dead simple in bash. Then I remember that the various Unix shell flavors and their POSIX toolset have had many more years to mature than even I have had thus far. The Unix tool philosophy is "do one thing and do it well", and doing things well in this context surely means enabling the most common usage scenarios with minimal ceremony and surprise.

So, let's say you wanted a one-liner that gave the number of lines in a bunch of C# files:
find . -name '*.cs' | xargs wc -l

Now, let's see how to get the equivalent output from Powershell:

gci -filter *.cs -Recurse | 
select @{Name="Lines";Expression={(gc $_.FullName | measure).Count }}, @{Name="Path";Expression={ resolve-path $_.FullName -Relative }} | 
sort Lines | 
ft -HideTableHeaders -Auto

Believe it or not, that is all one line. Of course, wc also gave us a summary row in its output, and we can get that with Powershell, too.
gci -Filter *.cs -Recurse | % {gc $_.FullName | measure }  | measure Count -Sum | select -expand Sum

Now we have a two-liner and tired fingers!

To be fair, I am not playing to Powershell's strengths here: .NET accessibility, structured scripting, object pipelining, and a wonderful extensibility story. With Powershell you won't need the analog of Perl, sed, and awk when the built-in shell functions reveal their limitations.

In fact our Powershell script is a lot more impressive in one respect than its competition; it allowed me to cobble together a "light" version of wc. I was nearly able to duplicate its output. If the roles were reversed, one could almost certainly contrive some Powershell I/O that would make bash look like the verbose, stilted challenger.

So, which shell wins? There's certainly a lot to be said for bash (and most Unix shells); its power, succinctness, and ubiquity have been honed over more than 30 years. At the end of the day, I'm grateful for the GitHub for Windows release of a shell that incorporates Powershell and the POSIX tools, so I don't have to choose.


Unknown said...

Your PowerShell example is much longer than it needs to be and it finds and iterates the files twice. This should output individual and total line counts much like wc would:

"`t"+((ls *.cs | %{ [pscustomobject] @{ f = resolve-path $_ -rel; l =(gc $_ | measure).count }} | %{ write-host "`t" $_.l "`t" $_.f; $_ }) | measure l -sum).sum+" total"

But, what you're basically doing is rewriting the entire wc utility in a Powershell one-liner, something that would be even harder to do in bash's scripting language. The wc util doesn't care if it's invoked from bash, cmd, or Powershell. Why not use the strengths of the Powershell shell while using the wc util, since you want wc's output format anyway?

wc -l (ls -r *.cs)

Thanks to Powershell's shell, it's even shorter than the bash version!

If you have git installed, you already have wc available to you, and it's available from many other sources too.

gubtest said...

"find -name *.cs | xargs wc -l" doesn't work with filenames containing whitespace. You can use "find -name '*.cs' -print0 | xargs -0 wc -l" instead.
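
A minimal sketch of the difference, assuming a find and xargs that support -print0 and -0 (the GNU and BSD versions both do). The temporary directory and the file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Build a throwaway tree with one file name that contains a space.
# (The directory and file names here are hypothetical.)
demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/Plain.cs"
printf 'c\nd\ne\n' > "$demo_dir/Has Space.cs"

# Newline-delimited: xargs splits "Has Space.cs" on the space, so wc is
# handed two paths that do not exist and the pipeline exits non-zero.
if find "$demo_dir" -name '*.cs' | xargs wc -l >/dev/null 2>&1; then
  unsafe_ok=yes
else
  unsafe_ok=no
fi

# NUL-delimited: every path reaches the command intact.
# tr strips the padding that some wc implementations add.
safe_total=$(find "$demo_dir" -name '*.cs' -print0 | xargs -0 cat | wc -l | tr -d ' ')

echo "plain pipeline succeeded: $unsafe_ok"
echo "total lines with -print0: $safe_total"

rm -rf "$demo_dir"
```

The plain pipeline should report a failure here, while the -print0 variant counts all five lines across the two files.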

In fact, whitespace handling is a pain in UNIX shells, and I hope that Powershell improves on this.

BTW, if you are using zsh (best shell ever :)), you can use "wc -l **/*.cs".
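
For what it's worth, bash can do something similar with its globstar option, which is off by default. A minimal sketch, assuming bash 4.0 or later (older versions such as the 3.2 that ships with macOS lack globstar); the demo tree and file names are hypothetical.

```shell
#!/usr/bin/env bash
# globstar makes ** match any number of directory levels;
# nullglob makes an unmatched pattern expand to nothing.
shopt -s globstar nullglob

# Hypothetical demo tree, including a directory and a file with spaces.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src/sub dir"
printf 'a\n' > "$demo_dir/Top.cs"
printf 'b\nc\n' > "$demo_dir/src/sub dir/Deep File.cs"

# The glob expands inside the shell, so names with spaces stay intact.
cs_files=( "$demo_dir"/**/*.cs )
glob_count=${#cs_files[@]}

# Guard against an empty list: wc with no arguments would read stdin.
if [ "$glob_count" -gt 0 ]; then
  wc -l "${cs_files[@]}"
  glob_total=$(cat "${cs_files[@]}" | wc -l | tr -d ' ')
fi

rm -rf "$demo_dir"
```

Because no xargs is involved, there is no word splitting to worry about, which is the same reason the zsh form is safe.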