A look at the upcoming improvements to LINQ in .NET 6


This article is an expanded exploration of several items mentioned by Richard Lander in Microsoft’s May 2021 announcement of .NET 6 Preview 4. I would be remiss if I didn’t give credit where credit is due.


Overview

When .NET Framework 3.5 was released back in 2007 it included a new feature known as Language Integrated Query, or LINQ for short. LINQ allows .NET developers to write concise C# code using lambda expressions to query in-memory collections of objects, or even databases through libraries like Entity Framework Core.

Like all things with .NET, LINQ continues to evolve over time. The upcoming release of .NET 6 brings a number of really interesting features, including a suite of new LINQ capabilities.

In this article we’ll take a quick look at the major LINQ enhancements that are coming soon to .NET developers with .NET 6.

In order to use these new LINQ APIs you must be on .NET 6 Preview 4 or later. The full release of .NET 6 should occur late in 2021, but if you want to play with these features before then you can download an official preview from Microsoft.

Chunking

Chunking is probably the biggest addition to LINQ in .NET 6. If you’re anything like me, you’ve had to take a large collection of objects and work through it in small pages or “chunks”.

Previously, when you wanted to chunk large collections you had to use loops and conditional logic and watch out for a few edge cases:

List<List<Movie>> pages = new List<List<Movie>>();

const int PAGE_SIZE = 5;

List<Movie> currentPage = null;
int spaceRemaining = 0;

foreach (Movie movie in movies) {
    // Check to see if we're at the start of a new page
    if (spaceRemaining <= 0) {
        // Move to a new page and add it to our list
        currentPage = new List<Movie>();
        pages.Add(currentPage);
        spaceRemaining = PAGE_SIZE;
    }

    // Add items to the current page and decrease the count allowable in this page
    currentPage.Add(movie);
    spaceRemaining--;
}

This code is admittedly a lot for the simple concept of splitting a large collection into several smaller chunks. On top of the large line count, chunking code required special testing to ensure that all items were included in a chunk, even if the last chunk wasn’t completely filled to capacity.

Thankfully, in .NET 6 LINQ now supports all of this in a single line of C# code via its Chunk method:

const int PAGE_SIZE = 5;

IEnumerable<Movie[]> chunks = movies.Chunk(PAGE_SIZE);

This new method is much more concise and easier to follow than trying to write and test your own chunking code anytime you need it. I can see this feature being incredibly helpful for people trying to build a paged API or some form of batch processing application, for example.
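To make the edge case concrete, here is a minimal sketch of how Chunk handles a collection that does not divide evenly. The sample data (twelve generated titles standing in for Movie objects) is an assumption made for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: 12 generated titles standing in for Movie objects.
string[] movieTitles = Enumerable.Range(1, 12).Select(i => $"Movie {i}").ToArray();

const int PAGE_SIZE = 5;

// Chunk yields arrays of up to PAGE_SIZE items; the final array holds the remainder.
string[][] pages = movieTitles.Chunk(PAGE_SIZE).ToArray();

for (int i = 0; i < pages.Length; i++)
{
    Console.WriteLine($"Page {i + 1}: {pages[i].Length} item(s), starting with {pages[i][0]}");
}
// Page 1: 5 item(s), starting with Movie 1
// Page 2: 5 item(s), starting with Movie 6
// Page 3: 2 item(s), starting with Movie 11
```

The last page simply comes back shorter than the others, so no special handling is needed for a partially filled chunk.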

Index Support for ElementAt

The next feature is simpler, but still effective. Previously, if you wanted to get something from the end of a collection, you had to calculate the length of the collection and then get the item out by that index, subtracting as needed:

Movie lastMovie = movies.ElementAt(movies.Count() - 1);

With .NET 6, LINQ now allows you to use an Index overload to the ElementAt method as follows:

Movie lastMovie = movies.ElementAt(^1);

Indexes have been with us since .NET Core 3.0 and allow for a simple syntax to grab elements starting from the back of a collection. What is new in .NET 6 is that ElementAt accepts them. See Microsoft’s guide on Indices for more information.
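As a small sketch of the Index overloads, the following uses a lazily filtered sequence of integers (an assumption for illustration, in place of the movies collection) to show that from-the-end indexes work even when the source has no indexer of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A lazily evaluated sequence with no indexer: 2, 4, 6, 8, 10
IEnumerable<int> evens = Enumerable.Range(1, 10).Where(n => n % 2 == 0);

int last = evens.ElementAt(^1);          // first element counting from the end
int secondToLast = evens.ElementAt(^2);  // second element counting from the end

// ElementAtOrDefault also accepts an Index and returns default(int) when out of range.
int outOfRange = evens.ElementAtOrDefault(^20);

Console.WriteLine($"{last}, {secondToLast}, {outOfRange}"); // 10, 8, 0
```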

Range Support for Take

Building on the prior idea of using Indices to improve code, a member of the .NET community expanded the Take method to allow it to function with a Range parameter:

IEnumerable<Movie> middleMovies = movies.Take(6..10);

This code is elegantly simple: it skips the first six elements and then grabs the following four (indexes 6 through 9, since the end of a range is exclusive). Other range expressions such as ..3 or 5.. also work, which significantly expands your ability to grab elements from collections by specific indexes.

If you’d like a refresher on how to work with Ranges, check out Microsoft’s documentation.
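The range expressions mentioned above can be sketched as follows. The sequence of integers 0 through 11 is an assumption chosen so that each value equals its own index, which makes the results easy to read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Values 0..11, so each value matches its index in the sequence.
IEnumerable<int> numbers = Enumerable.Range(0, 12);

int[] middle = numbers.Take(6..10).ToArray();   // indexes 6 up to, but not including, 10
int[] firstThree = numbers.Take(..3).ToArray(); // everything before index 3
int[] fromFive = numbers.Take(5..).ToArray();   // index 5 through the end
int[] lastTwo = numbers.Take(^2..).ToArray();   // the final two elements

Console.WriteLine(string.Join(", ", middle));     // 6, 7, 8, 9
Console.WriteLine(string.Join(", ", firstThree)); // 0, 1, 2
Console.WriteLine(string.Join(", ", fromFive));   // 5, 6, 7, 8, 9, 10, 11
Console.WriteLine(string.Join(", ", lastTwo));    // 10, 11
```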

3 Parameter Zip Overload

Previously, LINQ offered .NET developers a Zip method that would allow them to enumerate through two collections in parallel:

string[] titles = { "A Tale of Two Variables", "The Heisenbug", "Pride, prejudice, and semicolons" };
string[] genres = { "Drama", "Horror", "Romance" };

foreach ((string title, string genre) in titles.Zip(genres)) {
    Console.WriteLine($"{title} is a {genre} film");
}

This was great and prevented developers from having to create additional anonymous or named types to loop through several collections. However, there are some cases where you may want to enumerate through three collections together.

To address this need, LINQ now has an additional overload that allows three collections to work in tandem:

string[] titles = { "A Tale of Two Variables", "The Heisenbug", "Pride, prejudice, and semicolons" };
string[] genres = { "Drama", "Horror", "Romance" };
float[] ratings = { 5f, 3.5f, 4.5f };

foreach ((string title, string genre, float rating) in titles.Zip(genres, ratings)) {
    Console.WriteLine($"{title} is a {genre} film that got a rating of {rating}");
}
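One behavior worth knowing when zipping several collections is that Zip stops as soon as the shortest sequence runs out. The sketch below reuses the sample data from above but deliberately shortens the ratings array (an assumption for illustration) to show this.

```csharp
using System;
using System.Linq;

string[] titles = { "A Tale of Two Variables", "The Heisenbug", "Pride, prejudice, and semicolons" };
string[] genres = { "Drama", "Horror", "Romance" };
float[] ratings = { 5f, 3.5f }; // deliberately one entry short

// The result is a sequence of tuples with elements named First, Second, and Third.
var rows = titles.Zip(genres, ratings).ToList();

Console.WriteLine(rows.Count);     // 2
Console.WriteLine(rows[1].First);  // The Heisenbug
Console.WriteLine(rows[1].Second); // Horror
```

The third title is silently dropped because it has no matching rating, so it can be worth checking the lengths up front if every item matters.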

Default Parameters for Common Methods

The ever-present LINQ methods of FirstOrDefault, SingleOrDefault, and LastOrDefault are a mainstay in LINQ development in C#.

Put simply, these methods will look at a collection and return a match if a condition is met. If the condition is not met, the default value for that type is returned: null for reference types, 0 for numeric types, and false for booleans.

For example, let’s say we tried to find the first movie that included this author in its cast:

Movie movie = movies.FirstOrDefault(m => m.Cast.Contains("Matt Eland"));

Since I’ve never been in a movie, FirstOrDefault would go with its default value and movie would be set to null.

However, in .NET 6, LINQ now allows you to specify a custom default value to use in case nothing matches the condition. This avoids having to deal with null values, and instead specifies a safe alternative as the following code illustrates:

Movie defaultValue = movies.First();

Movie movie = movies.FirstOrDefault(m => m.Cast.Contains("Matt Eland"), defaultValue);

In this case, when FirstOrDefault doesn’t hit a match it will use the defaultValue parameter.

This change also applies to SingleOrDefault and LastOrDefault in similar ways where you are now able to specify a custom defaultValue parameter.
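Here is a brief sketch of all three methods with a custom default. It uses an array of integers and a sentinel value of -1, both assumptions for illustration, since with value types the built-in default of 0 can be indistinguishable from a real result.

```csharp
using System;
using System.Linq;

int[] scores = { 3, 8, 12 };

int firstOver20 = scores.FirstOrDefault(s => s > 20, -1);   // no match, so -1
int lastOver5 = scores.LastOrDefault(s => s > 5, -1);       // last match is 12
int singleOver10 = scores.SingleOrDefault(s => s > 10, -1); // exactly one match: 12

// The overloads without a predicate fall back to the default when the source is empty.
int fromEmpty = Array.Empty<int>().FirstOrDefault(-1);      // -1

Console.WriteLine($"{firstOver20}, {lastOver5}, {singleOver10}, {fromEmpty}"); // -1, 12, 12, -1
```

Note that SingleOrDefault still throws an InvalidOperationException when more than one element matches; the custom default only covers the "no match" case.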

Avoiding Enumeration with TryGetNonEnumeratedCount

When working with LINQ, you’re not always working with a List or other type of collection that makes it easy to count the length. In fact, when working with certain types of collections, such as those implementing IQueryable, even a simple operation like calling the Count() method may cause the entire query to be re-evaluated.

To combat this, .NET 6 adds the very specialized TryGetNonEnumeratedCount method. This method will check to see if determining the count of items in the collection will cause the collection to be enumerated. If it doesn’t, the count is produced and stored in an out parameter and a value of true is returned from the method. If evaluating the count would cause the collection to be enumerated, the method simply returns false instead and leaves the out parameter at 0.

if (movies.TryGetNonEnumeratedCount(out int count))
{
    Console.WriteLine($"The count is {count}");
}
else
{
    Console.WriteLine("Could not get a count of movies without enumerating the collection");
}

If you find this code or concept confusing, this is a very similar pattern to how int.TryParse and other TryX APIs currently work in the .NET base class library. The only difference is that TryGetNonEnumeratedCount is focused on avoiding potentially slow operations due to enumerating a very large collection or re-querying a database.
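To see both outcomes side by side, the sketch below compares a List, which already knows its own count, with a sequence produced by an iterator method. The StreamTitles helper is hypothetical and exists only to produce a sequence whose count cannot be known without enumerating it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<string> list = new() { "a", "b", "c" };

// A List tracks its own Count, so no enumeration is required.
bool listKnown = list.TryGetNonEnumeratedCount(out int listCount);

// An iterator method produces items on demand; its count is unknown until enumerated.
IEnumerable<string> streamed = StreamTitles();
bool streamKnown = streamed.TryGetNonEnumeratedCount(out int streamCount);

Console.WriteLine($"{listKnown}, {listCount}");     // True, 3
Console.WriteLine($"{streamKnown}, {streamCount}"); // False, 0

// Hypothetical helper used only for this illustration.
static IEnumerable<string> StreamTitles()
{
    yield return "a";
    yield return "b";
}
```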

MaxBy and MinBy

Finally, .NET 6 gives .NET developers the MinBy and MaxBy extension methods in LINQ. These two methods allow you to look at your collection and find the element with the largest or smallest value of something, based on a key selector lambda expression you provide.

Before .NET 6, in order to get the entity that had the largest or smallest of something, you would have to use Max or Min to find the value, then query again to find the related entity:

int numBattles = movies.Max(m => m.NumSpaceBattles);

Movie mostAction = movies.First(m => m.NumSpaceBattles == numBattles);

This worked, but it takes 2 lines of code instead of 1 and can enumerate the collection twice: once to find the maximum value and again to find the element that has it.

With .NET 6’s MinBy and MaxBy LINQ extension methods we can now do it more quickly in a single line of code:

Movie mostAction = movies.MaxBy(m => m.NumSpaceBattles);
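As a self-contained sketch, the following uses named tuples in place of the Movie class (an assumption, as are the battle counts) to show that MaxBy and MinBy return the whole element rather than just the selected value.

```csharp
using System;
using System.Linq;

// Hypothetical data: named tuples standing in for Movie objects.
var movies = new[]
{
    (Title: "A Tale of Two Variables", NumSpaceBattles: 0),
    (Title: "The Heisenbug", NumSpaceBattles: 4),
    (Title: "Pride, prejudice, and semicolons", NumSpaceBattles: 1),
};

// Each call returns the element itself, selected by the key the lambda produces.
var mostAction = movies.MaxBy(m => m.NumSpaceBattles);
var leastAction = movies.MinBy(m => m.NumSpaceBattles);

Console.WriteLine(mostAction.Title);  // The Heisenbug
Console.WriteLine(leastAction.Title); // A Tale of Two Variables
```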

Conclusion

While none of these improvements fundamentally alter how C# developers work with LINQ in their .NET code, each one offers some significant quality of life improvements, which was Microsoft’s goal in implementing these community-driven features. These changes also continue to reaffirm Microsoft’s commitment to .NET’s place in the open source software ecosystem and .NET developers.

If you’d like to learn more about these changes, you can check out the user story in the .NET runtime repository on GitHub for additional context, discussion, and even the exact source code changes involved in implementing these features by contributors from the .NET community.

If you’re curious about other changes in .NET 6, check out Microsoft’s .NET blog which contains the original announcement of these features and more content and context on what else is going into .NET 6.