84 - LINQ - Group - Aggregate Data

 

There are many ways to group data in a LINQ query, it is one of the most powerful features of LINQ. You can group by a single property, a substring of a property, a computed numeric range, a Boolean predicate, other expressions and a composite key. You can even nest groups within each other and use them as sub queries.

The information or a subset of the information you group can then be stored using objects of a defined class, or you can use anonymous objects.

The group clause is used to group data; it returns a sequence of IGrouping<TKey, TElement> objects. Items that match the group key value, if any, will be stored in the object sequence. The grouping key can be of any type. Let’s say that you have a sequence of Film objects that you want to group on production year, you could then group the LINQ query on the Year property of the elements, which will make year the grouping key. The data type of the grouping key will be inferred by the compiler and the group clauses will be translated into calls to the GroupBy method.

It is possible to end a query with a group clause. The into keyword can be used to perform additional query operations on each group; when doing this the query must eventually end with either a select or a group statement.

Because a query result essentially is a list within a list, IEnumerable<IGrouping<TKey, TElement>>, you use a nested foreach loop to get to the fetched elements, where the outer loop loops over the grouping key and the inner loop loops over the existing elements, if any.

Let’s look at a few examples of how the group clause can be used. In the examples we use two extension methods that we have created to print the results and to fetch property values with reflection, as well as a Film class and a list of films. The methods, class and list are described at the end of this article.

Group by a single property

You can use the group clause to group elements based on a property of the elements. In this example we will group all films by the film studios that have produced the films; to do this we use the Studio property of the Film objects that are fetched.

private void BySingleProperty()

{

    // filmsQuery is an IEnumerable<IGrouping<string, Film>>. 

    var filmsQuery =

        from film in films

        group film by film.Studio into filmGroup

        orderby filmGroup.Key

        select filmGroup;

 

    filmsQuery.Print("By Single Property");

 

    /*

        OUTPUT:

            Group: Lucas Arts

                    Star Wars, Lucas Arts

            Group: MGM

                    The Hobbit, MGM

                    Mad Max, MGM

            Group: Pixar

                    Monsters Inc., Pixar

                    Toy Story, Pixar

            Group: Sony Pictures

                    RoboCop, Sony Pictures

                    Captain Phillips, Sony Pictures

        */

}

 

Group by a substring

You can group elements based on a substring of an element property, this can be done either by using the Substring method on the string or by using the square brackets syntax to fetch items in an array; which a string essentially is.

var string_value = ”Some string”;

var substring_value = string_value.Substring(0, 1);

// is the same as

var substring_value = string_value[0];

 

In this example we will use the Substring method to group the elements by the first character of the film title using the Title property of the elements.  To order the result we would have to add an orderby clause.

private void BySubstring()

{

    var filmsQuery =

        from film in films

        group film by film.Title.Substring(0, 1);

        // or

        // group film by film.Title[0];

 

    filmsQuery.Print("By Substring");

 

    /*

        OUTPUT:

            Group: S

                    Star Wars, Lucas Arts

            Group: T

                    The Hobbit, MGM

                    Toy Story, Pixar

            Group: M

                    Mad Max, MGM

                    Monsters Inc., Pixar

            Group: R

                    RoboCop, Sony Pictures

            Group: C

                    Captain Phillips, Sony Pictures

        */

}

 

Group by a computed numeric range

Elements can be grouped based on a computed value by storing the result temporarily in-memory using the let keyword and order by that result in the group clause. In this example we calculate the average rating of the films using the Average method on the Rank array of the elements. We then use these averages as key values to group the elements on. An anonymous class is used to define the objects that will be the result of the query.

private void ByComputedNumericRange()

{

    var filmsQuery =

        from film in films

        let avearge = (film.Rank.Average() > 0

            ? (int)film.Rank.Average() : 0).ToString()

        group new { film.Title, film.Studio } by avearge into aveargeGroup

        orderby aveargeGroup.Key

        select aveargeGroup;

 

    filmsQuery.Print("Computed Numeric Range");

 

    /*

        OUTPUT:

            Group: 6

                    Mad Max, MGM

                    Toy Story, Pixar

            Group: 7

                    Star Wars, Lucas Arts

                    The Hobbit, MGM

                    RoboCop, Sony Pictures

            Group: 8

                    Monsters Inc., Pixar

                    Captain Phillips, Sony Pictures

    */

}

 

Group by a Boolean predicate

Any expression that yields a Boolean result can be used when grouping elements. In this example we calculate the average ranking foreach film and check if the result is greater than eight; if it is then it will be added to the ”True” group, otherwise it will be added to the ”False” group. Note also that we use an anonymous class to create new objects that contain the title and film studio.

private void ByBooleanPredicate()

{

    var filmsQuery = from film in films

                        group new { film.Title, film.Studio }

                        by (film.Rank.Average() > 8).ToString() into filmGroup

                        select filmGroup;

 

    filmsQuery.Print("By Boolean Predicate");

 

    /*

        OUTPUT:

            Group: False

                    Star Wars, Lucas Arts

                    The Hobbit, MGM

                    Mad Max, MGM

                    Toy Story, Pixar

                    RoboCop, Sony Pictures

            Group: True

                    Monsters Inc., Pixar

                    Captain Phillips, Sony Pictures

        */

}

Group by composite key

Sometimes you want to group the result by more than one key value, in these cases you use a composite key of several values. As you can see in the example output the key actually contains the values that are used as a composite key, these values can then be used when looping through the result.

In this example we use the first character of the Title property value and the average rank of the film as a composite key when grouping the result. Note that we create an anonymous object for the composite key that contains two properties FirstLetter and IsGreater; these properties are not part of the element objects but the key itself. We can however, as stated before, use the values of these properties when we loop on the group.

private void ByCompositeKey()

{

    var filmsQuery =

        from film in films

        group film by new

        {

            FirstLetter = film.Title[0],

            IsGreater = film.Rank.Average() > 8,

        } into filmGroup

        orderby filmGroup.Key.IsGreater, filmGroup.Key.FirstLetter

        select filmGroup;

 

    filmsQuery.Print("By Compound Key");

 

    /*

        OUTPUT:

            Group: { FirstLetter = M, IsGreater = False }

                    Mad Max, MGM

            Group: { FirstLetter = R, IsGreater = False }

                    RoboCop, Sony Pictures

            Group: { FirstLetter = S, IsGreater = False }

                    Star Wars, Lucas Arts

            Group: { FirstLetter = T, IsGreater = False }

                    The Hobbit, MGM

                    Toy Story, Pixar

            Group: { FirstLetter = C, IsGreater = True }

                    Captain Phillips, Sony Pictures

            Group: { FirstLetter = M, IsGreater = True }

                    Monsters Inc., Pixar

        */

}

 

Nested groups

Nested groups can be used to subdivide the result of a group into a more granular result. In this example we first group the result by the average rank of the films, and then subdivide that result into the different film studios that are available. The elements of the result are Film objects.

The first thing we do is to use the let keyword to temporarily store the calculated averages and use those values for the first grouping. We then use a second nested query to group by film studio using the result from the averages query. The entire query result is then grouped by the Key property of the averages group query.

private void ByNestedGroups()

{

    var filmsQuery =

        from film in films

        let avearge = (film.Rank.Average() > 0 ? (int)film.Rank.Average() : 0).ToString()

        group film by avearge into aveargeGroup

        orderby aveargeGroup.Key

        from studioGroup in

            (from studio in aveargeGroup

                group studio by studio.Studio)

        group studioGroup by aveargeGroup.Key;

 

    Console.WriteLine();

    Console.WriteLine("--- By Nested Groups ---");

    foreach (var outerGroup in filmsQuery)

    {

        Console.WriteLine("Rank = {0}", outerGroup.Key);

        outerGroup.Print("");

        Console.WriteLine();

    }

 

    /*

        OUTPUT:

            Rank = 6

            Group: Pixar

                    Toy Story, Pixar

 

            Rank = 7

            Group: Lucas Arts

                    Star Wars, Lucas Arts

            Group: MGM

                    The Hobbit, MGM

                    Mad Max, MGM

            Group: Sony Pictures

                    RoboCop, Sony Pictures

 

            Rank = 8

            Group: Pixar

                    Monsters Inc., Pixar

            Group: Sony Pictures

                    Captain Phillips, Sony Pictures

        */

}

 

Group with sub query

A sub query can be used for instance when we want to calculate values that are going to be part of the element objects in the query result. To achieve this we simply assign the result of a second query to a property within the curly braces where the values are assigned to the element objects.

In this example we group the result by the film studios and want to find out what the highest average ranking the respective film studio has received for a film.

private void GroupWithSubquery()

{

    var filmsQuery =

        from film in films

        group film by film.Studio into filmGroup

        select new

        {

            Studio = filmGroup.Key,

            HighestScore =

                (from rank in filmGroup

                    select rank.Rank.Average()).Max()

        };

 

    int count = filmsQuery.Count();

    Console.WriteLine();

    Console.WriteLine("--- Group With Subquery ---");

    Console.WriteLine("Number of studios = {0}", count);

 

    foreach (var item in filmsQuery)

    {

        Console.WriteLine("    {0} Highest Score={1}", item.Level, item.HighestScore);

    }

 

    /*

        OUTPUT:

            Number of studios = 4

                Lucas Arts Highest Score=7,63333333333333

                MGM Highest Score=7,43333333333333

                Pixar Highest Score=8,2

                Sony Pictures Highest Score=8,5

        */

}

 

The Film class

This class defines the film objects that are the result of the LINQ queries.

class Film

{

    public string Title { get; set; }

    public string Studio { get; set; }

    public List<double> Rank { get; set; }

}

 

Example data

The following list was used in all the examples.

List<Film> films = new List<Film>()

{

    new Film{Title="Star Wars",  Studio= "Lucas Arts", Rank = new List<double> {7.8, 5.6, 9.5}},

    new Film{Title="The Hobbit", Studio= "MGM", Rank = new List<double> {7.2, 6.6, 8.5}},

    new Film{Title="Mad Max", Studio = "MGM", Rank = new List<double> {6.8, 5.6, 9.0}},

    new Film{Title="Monsters Inc.", Studio = "Pixar", Rank = new List<double> {7.8, 7.6, 9.2}},

    new Film{Title="Toy Story", Studio = "Pixar", Rank = new List<double> {6.8, 6.6, 6.5}},

    new Film{Title="RoboCop", Studio = "Sony Pictures", Rank = new List<double> {7.6, 7.7, 7.5}},

    new Film{Title="Captain Phillips", Studio = "Sony Pictures", Rank = new List<double> {8.4, 8.5, 8.6}}

};

 

Print and proeprty methods

The following extension methods were used when printing values and fetching property values in the Print method. We use reflection to fetch the property values in the Print method to make it as generic as possible.

public static class Extensions

{

    public static string GetPropertyValue(this object item, string propertyName)

    {

        var property = item.GetType().GetProperty(propertyName).GetValue(item);

        return property == null ? String.Empty : property.ToString();

    }

 

    public static bool HasProperty(this object item, string propertyName)

    {

        var property = item.GetType().GetProperty(propertyName);

        return property == null;

    }

 

    public static void Print(this IEnumerable<IGrouping<object, object>> groups, string header)

    {

        if (header.Length > 0) Console.WriteLine("\r\n--- {0} ---", header);

 

        string groupMessage = String.Empty;

 

        foreach (var filmGroup in groups)

        {

            Console.WriteLine("Group: {0}", filmGroup.Key);

            foreach (var film in filmGroup)

            {

                Console.WriteLine("\t{0:-20}, {1}", film.GetPropertyValue("Title"), film.GetPropertyValue("Studio"));

            }

        }

    }

 

}

 

Stay connected with news and updates!

Join our mailing list to receive the latest news and updates from our team.
Don't worry, your information will not be shared.

Subscribe
Close

50% Complete

Two Step

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.