PS for .NET devs part 4: Source digging

Posted on Dec 4, 2011

The kind of adhoc scripting you can do with PowerShell lends it’s self very well for getting an overview of the code-base you find yourself in. When you find yourself in a project or you’re just interested in some basic statistics/metrics about your code-base you could follow something like the path I’m about to uncover. For brevity and a real life flavor I’m using the short form aliases of the Cmdlets look them up with Get-Alias if you find any that are unfamiliar.

How many files are actually in the archive?

1
>ls -rec | measure

How many files of which type are there in the archive?

1
>ls -rec | group Extension 
1
2
>ls -rec | group Extension | sort -desc Count |
>>select -First 10

Ignoring directories the top 10 becomes:

1
2
>ls -rec | ?{ -not $_.PSIsContainer } | 
>>group Extension | sort -desc Count | select -First 10

C# is the main language here.

How many classes do we have?

1
2
3
>ls -rec -inc *.cs | 
>>sls -pattern '\bclass\b' -allmatches | 
>>measure
Explanation

Consider all C# files search for the ‘whole word’ class and count all the matches.

Do we have any tests?

1
>ls -rec -inc *Test*.cs | measure

Let’s look at the first one to see which testing framework were using:

1
2
3
>ls -rec -inc *Test*.cs | 
>>select -first 1 | 
>>gc | select -first 20

Ah, NUnit so,

How many test fixtures do we have:

1
2
3
>ls -rec -inc *.cs | 
>>sls -pattern '\[TestFixture\]' -allmatches |
>>measure

Lets check some basic line counts

1
>ls -rec -exc *.exe,*.dll | measure -line -ignorw
Explanation

Going through all files in the archive exclude some known binary types, count the number of lines excluding lines containing only whitespace.

How are these lines distributed across the different file-types?

1
2
3
4
5
6
>ls -rec -exc *.exe,*.dll,*.zip,*.recipe,*.mdb,*.png,*.ov,*.bin,*.cd,*.pdb,*.mat |
>>?{ -not $_.PSIsContainer } | 
>>group Extension | 
>>Select Name, @{Name="LoC";Expr={ $_.group | %{ gc $_ } | 
>>measure -line -ignorew|select -expa Lines }} | sort LoC -desc | 
>>select -first 10 |ft -autosize Name,LoC
Explanation
  • retrieve all files excluding a large number of binary file-types specific to this project
  • filter out the directories
  • group the files by their extension
  • select the Name of the group which will be the extension, calculate the total lines of code for all files in the group by
    • retrieving the contents of each file using gc an alias for Get-Content
    • counting the number of lines
    • extracting the Lines property for the Measure-Object return value as a value, rather than an object containing only a Lines property
  • sort in descending order by the number of lines
  • select the top 10
  • format them in a table with auto-sized columns

What about the ratio of test vs. production code?

1
2
3
4
5
6
7
>ls -rec -inc *.cs | 
>>group { sls -path $_ -pattern '\[TestFixture\]' -quiet } | 
>>Select @{Name="TestOrProduction";
>>Exp={if ($_.Name -eq "True") { "Test" } else {"Production"}}},`
>>@{Name="LoC";Expr={ $_.group | %{ gc $_ } | measure -line -ignorew|
>> select -expa Lines }} | 
>>ft -autosize TestOrProduction,LoC
Explanation

The code follows the same pattern as above except that the grouping is done on the fact whether the file is a [TestFixture] or not.

Epilogue

I believe that this shows what kind of awesome stuff you can do with adhoc scripting using PowerShell. Until next time.

References

Scripting with Windows PowerShell NUnit a .NET unit-testing framework