LINQ Extension Method Potpourri
One of the most useful things about the new C# 3.0 language features is the ability to create “extension methods”.
These methods have a few curious properties:
- They must be static
- They must be declared in a static class
- Their first argument must be declared with the “this” keyword.
After being declared, they essentially add the method to the type declared with the “this” keyword, without actually altering the code for that type. A really good example (with screenshots) is over at Scott Guthrie’s blog.
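To make the three rules concrete, here is a minimal sketch. The StringExtensions class and the Shout method are hypothetical names picked purely for illustration; nothing like them ships with the framework.

```csharp
using System;

// Rule 2: the containing class is static.
public static class StringExtensions
{
    // Rule 1: the method itself is static.
    // Rule 3: the first parameter is marked with "this".
    public static string Shout(this string text)
    {
        return text.ToUpper() + "!";
    }
}

public static class Program
{
    public static void Main()
    {
        // Called as if it were an instance method on string...
        Console.WriteLine("hello".Shout());                 // HELLO!

        // ...but it is still an ordinary static method underneath.
        Console.WriteLine(StringExtensions.Shout("hello")); // HELLO!
    }
}
```

Both calls do the same thing; the first form is just syntax that the compiler translates into the second.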
The recently released LINQ framework makes heavy use of this feature by providing a number of extension methods for the IEnumerable<T> interface:
string[] names = new string[] {
    "Matt", "Bob", "James", "Erin", "Billy", "Ashley", "Jennie"};
IEnumerable<string> coolPeople =
    names
        .Where(s => s.StartsWith("M"))
        .Select(s => s.ToUpper());
foreach (string s in coolPeople)
{
    Console.WriteLine(s);
}
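In this example, we start by creating an array of strings (names of people that I may or may not know). All collections (arrays, List<T>s, etc.) implement the IEnumerable<T> interface, so the Where and Select extension methods can be called directly on the array. Where keeps only the names that satisfy its lambda (here, names starting with “M”), and Select maps each remaining name through its lambda (here, converting it to upper case).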
After this selective filtering and mapping, all of the capitalized names that start with “M” are dumped to the console. In this example, only “MATT” is output.
All of this is fine and dandy and is extremely flexible. There are quite a few built-in extension methods to do various things that satisfy 90% of what I normally need to do. The funny thing is, there is that last 10% of common things that simply seem to be missing from the framework. Did you notice how in the example, I had to store the list of “Cool People” in another variable before using a foreach loop to dump it to the screen? That seems rather silly, since what I really want to do is: “Start with the list of names, keep all of the names that start with M, make them all upper case, and print them out.” I didn’t want to “Start with the list of names, keep all of the names that start with M, make them all upper case, and store them in a new variable. Then go through each name in the new variable and print it out”.
One way that I like to think about this is through a “pipeline” analogy — Start with the list of names and “pipe” them into a box that only keeps names that start with “M”, then “pipe” the resulting data into another box that capitalizes each name, then pipe the resulting data into a box that prints them out.
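The “boxes” in this analogy can be sketched as iterator methods. The Keep and Transform methods below are hypothetical, simplified stand-ins for Where and Select, written only to show the idea; the framework's own operators do more (argument checks, for instance) and this is not their actual implementation.

```csharp
using System;
using System.Collections.Generic;

public static class PipelineSketch
{
    // A "box" that only lets matching items through (a stand-in for Where).
    public static IEnumerable<T> Keep<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }

    // A "box" that transforms each item as it passes through (a stand-in for Select).
    public static IEnumerable<TResult> Transform<T, TResult>(this IEnumerable<T> source, Func<T, TResult> selector)
    {
        foreach (T item in source)
        {
            yield return selector(item);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string[] names = { "Matt", "Bob", "James" };

        // Each name is pulled through both boxes, one at a time.
        foreach (string s in names.Keep(n => n.StartsWith("M")).Transform(n => n.ToUpper()))
        {
            Console.WriteLine(s); // MATT
        }
    }
}
```

Because each box takes an IEnumerable<T> and returns one, any number of them can be chained, which is what makes the pipeline picture work.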
I looked and looked, but could not find an extension method that could be “stuck onto” the end of the pipeline. I just wanted to call a method (in this case, Console.WriteLine) on each element in the result. It seems to me that this would be a very common occurrence: grab a set of data, perform operations on it, and whittle it down to the point where you want to do something for each element in the result set without returning anything.
Well, thankfully we can define our own extension methods! We can write our own “ForEach” method that simply calls a generic Action<T> delegate on each element:
public static class ExtensionMethods
{
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
        }
    }
}
That wasn’t so bad, was it? We can use it to make our initial example much more readable:
string[] names = new string[] {
"Matt", "Bob", "James", "Erin", "Billy", "Ashley", "Jennie"};
names
.Where(s => s.StartsWith("M"))
.Select(s => s.ToUpper())
.ForEach(s => Console.WriteLine(s));
That reads like a waterfall: at each point you can see exactly what is going on. Start with the list of names, keep only the ones that start with “M”, make each result upper-case, and print each one out to the console. Beautiful.
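Two small variations on this are worth sketching. First, the lambda s => Console.WriteLine(s) can be replaced by the method group Console.WriteLine. Second, List<T> already has a built-in ForEach method, so calling ToList() first is an alternative that needs no custom extension, at the cost of copying the results into a list. The ForEach extension is repeated here only so the sketch is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionMethods
{
    // Same ForEach extension as in the post, repeated for completeness.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string[] names = { "Matt", "Bob", "James", "Erin", "Billy", "Ashley", "Jennie" };

        // Variation 1: pass the method group instead of a lambda.
        names
            .Where(s => s.StartsWith("M"))
            .Select(s => s.ToUpper())
            .ForEach(Console.WriteLine);

        // Variation 2: materialize a List<string> and use the built-in List<T>.ForEach.
        names
            .Where(s => s.StartsWith("M"))
            .Select(s => s.ToUpper())
            .ToList()
            .ForEach(Console.WriteLine);
    }
}
```

Each variation prints MATT once.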
Since we’re on a roll here, another common theme that kept coming up in my code was that I needed to peek into the data “pipeline” and see what was there at each step in the process. What I needed was an extension method exactly like the ForEach method I just described, but with the additional feature that allowed the original data to “pass through”. I needed this a lot for debugging my LINQ queries, so I named it Debug:
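public static IEnumerable<T> Debug<T>(this IEnumerable<T> source, Action<T> action)
{
    // Note: this runs the action over the whole sequence up front;
    // the caller then enumerates source a second time.
    foreach (T item in source)
    {
        action(item);
    }
    return source;
}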
We can “peek” into the query above using the Debug extension method like so:
names
.Where(s => s.StartsWith("M"))
.Debug(s => Console.WriteLine("After Where: {0}", s))
.Select(s => s.ToUpper())
.ForEach(s => Console.WriteLine("After Select: {0}", s));
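One thing to be aware of: the Debug method above runs its action over the entire sequence before returning, so every “After Where” line appears before any “After Select” line, and the Where filter ends up being evaluated twice. A lazy variant that uses yield return lets each item flow through the whole pipeline one at a time. The sketch below calls this variant Tap, a hypothetical name, and adds “Mary” to the sample data so the interleaving is visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyExtensions
{
    // Hypothetical lazy variant of Debug: runs the action as each item flows past.
    public static IEnumerable<T> Tap<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
            yield return item;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string[] names = { "Matt", "Bob", "Mary" };

        IEnumerable<string> query = names
            .Where(s => s.StartsWith("M"))
            .Tap(s => Console.WriteLine("After Where: {0}", s))
            .Select(s => s.ToUpper());

        // Nothing has been printed yet: the query only runs when it is enumerated.
        foreach (string s in query)
        {
            Console.WriteLine("After Select: {0}", s);
        }
        // Prints, in order:
        //   After Where: Matt
        //   After Select: MATT
        //   After Where: Mary
        //   After Select: MARY
    }
}
```

The trade-off is that a lazy method does nothing at all unless something downstream enumerates the result, so it works as a pass-through step but not as the final step of a pipeline.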
That was pretty easy. I’ve become hooked on this “functional style” of programming where you start with a collection, perform a number of operations on it, and then do something with the results. The possibilities are endless. As the last step (usually requiring the ForEach method), you could do a number of things such as adding them to a listbox, sticking them in a database, or printing them to the console. Here’s another example:
DirectoryInfo di = new DirectoryInfo(@"C:\");
di.GetFiles()
.Where(fileinfo => fileinfo.Name.Contains(".sys"))
.Select(fileinfo => fileinfo.FullName)
.ForEach(s => Console.WriteLine(s));
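Note that Name.Contains(".sys") also matches names such as “b.sys.bak”. If only files whose extension is exactly .sys are wanted, comparing FileInfo.Extension is one way to tighten the filter. The sketch below is an assumption-laden illustration: HasExtension is a hypothetical helper, and the files are created in a scratch directory so that the code does not depend on the contents of C:\.

```csharp
using System;
using System.IO;
using System.Linq;

public static class FileInfoExtensions
{
    // Hypothetical helper: true when the file's extension equals ext, ignoring case.
    public static bool HasExtension(this FileInfo file, string ext)
    {
        return string.Equals(file.Extension, ext, StringComparison.OrdinalIgnoreCase);
    }
}

public static class Program
{
    public static void Main()
    {
        // Work in a scratch directory rather than a real system folder.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        try
        {
            foreach (string name in new[] { "a.sys", "b.sys.bak", "c.txt" })
            {
                File.WriteAllText(Path.Combine(dir, name), "");
            }

            var matches = new DirectoryInfo(dir).GetFiles()
                .Where(f => f.HasExtension(".sys"))
                .Select(f => f.Name);

            foreach (string name in matches)
            {
                Console.WriteLine(name); // a.sys
            }
        }
        finally
        {
            // Clean up the scratch directory and its files.
            Directory.Delete(dir, true);
        }
    }
}
```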
If you’re interested in learning more, there’s an excellent book by Joe Rattz called Pro LINQ: Language Integrated Query in C# 2008. This book is extremely informative and has excellent examples for every one of the query operators (extension methods). If you want to jump into using these cool C# 3.0 features like extension methods, this is a great place to start. Not only does it cover “LINQ-to-objects” (what I’ve basically been talking about here), it also explains “LINQ-to-XML” (also amazing; look for an upcoming post on this) and “LINQ-to-SQL” in depth as well. I just finished reading it and really enjoyed it.
I’m sure you’ll come up with even better extension methods (these two were pretty trivial and straightforward). I’d be happy to hear of any useful ones you might want to share.
Hope someone finds that useful!
November 16th, 2008 at 12:22 am
Hey, to conform to LINQ (http://msdn.microsoft.com/en-us/library/bb308959.aspx) standards, and to better support concurrency going forward, you should rewrite your ForEach method as follows:
public static IEnumerable<T> ForEach<T>(this IEnumerable<T> list, Action<T> action)
{
    foreach (T i in list)
    {
        action(i);
        yield return i;
    }
}
November 16th, 2008 at 12:40 am
I didn’t mean to offend; both methods are completely valid. I should have phrased my comment differently. This post actually helped me when researching my issue. I just thought it might be nice to mention the different approaches.