LINQ - Join

 

LINQ Join

Sometimes you need to combine values from two data sequences that don’t have a direct relationship connecting them, using LINQ; this is where the join clause comes to the rescue. To be able to connect the two sources it has to be possible to compare equality on a shared value in both sources. For example, a video store clerk wants to match his list of arrived films with a list if film studio that has released films. A join clause could connect these two lists on the film studio id.

When a join clause compares the equality of a single property or a property of an object in the two sources it uses the special equals keyword. All joins made with the join keyword in LINQ are equijoins, which means that equality is used for comparison of the values.

LINQ joins are optimized for equijoins, but you can perform non-equijoins by using multiple from clauses and use a where clause to apply the predicate expression. The predicate expression could call the Contains method on the list produced by the inner from clause to see if it contains the current element in the outer from clause. You should however be aware that using non-equijoins can have a negative impact on performance.

In this article we will look at the following joins:

  • Inner join
  • Group join
  • Left outer join
  • Join using composite keys
  • Cross join
  • Non-equijoin

We will be using the following sources in the code samples.

List<FilmStudio> filmStudios = new List<FilmStudio>()

{

    new FilmStudio(){Name="Lucas Arts", ID=001},

    new FilmStudio(){Name="MGM", ID=002},

    new FilmStudio(){Name="Pixar", ID=003},

    new FilmStudio(){Name="HBO Films", ID=004},

    new FilmStudio(){Name="Sony Pictures", ID=005}           

};

 

List<Actor> actors = new List<Actor>()

{

    new Actor(){Name="Mel Gibson", ID=001},

    new Actor(){Name="Elijah Woods", ID=002},

    new Actor(){Name="Tom Hanks", ID=003}

};

 

List<Film> films = new List<Film>()

{

    new Film{Title="Star Wars",  StudioID=001},

    new Film{Title="The Hobbit", StudioID=002, ActorID = 002},

    new Film{Title="Mad Max", StudioID=002, ActorID = 001},

    new Film{Title="Monsters Inc.", StudioID=003},

    new Film{Title="Toy Story", StudioID=003},

    new Film{Title="RoboCop", StudioID=005},

    new Film{Title="Captain Phillips", StudioID=005, ActorID = 003}

};

 

We will be using the following methods to print the results to the Console window.

private string GetPropertyValue(object item, string propertyName)

{

    var property = item.GetType().GetProperty(propertyName).GetValue(item);

    return property == null ? String.Empty : property.ToString();

}

 

private void PrintResult<T>(IEnumerable<T> result, string typeOfJoin)

{

    int totalItems = 0;

    object type = result.FirstOrDefault().GetType();

 

    foreach (var item in result)

    {

        if(item is Result)

            Console.WriteLine("{0,-18}{1,-16}{2}",

GetPropertyValue(item, "Film"),

GetPropertyValue(item, "Studio"),

GetPropertyValue(item, "Actor"));

        else if (item is GroupResult)

        {

            Console.WriteLine("Group: {0}", GetPropertyValue(item, "Studio"));

 

            foreach (var film in (ICollection<Film>)item.GetType().GetProperty("Films").GetValue(item))

            {

                totalItems++;

                Console.WriteLine("{0,-18}{1}", GetPropertyValue(film, "Title"), GetPropertyValue(film, "StudioID"));

            }

 

            Console.WriteLine();

        }

    }

 

    if (result is IList)

    {

        int counter = 0;

        foreach (object value in result)

        {

            Console.Write("{0, -3}", value);

 

            counter++;

            if (counter == 4)

            {

                counter = 0;

                Console.WriteLine();

            }

        }

    }

 

    Console.WriteLine();

 

    Console.WriteLine(totalItems == 0 ?

        String.Format("{0}: {1} items.", typeOfJoin, result.Count()) :

        String.Format("{0}: {1} items in {2} groups.", typeOfJoin, totalItems, result.Count()));

 

    Console.WriteLine();

}

Inner join

Using an inner join will produce a flat list. The following code sample will list film titles and the name of their corresponding film studio. If a studio has no corresponding films it will not appear in the list; you can note that HBO Films has been excluded from the list for this reason.

public void InnerJoin()

{

    var result =

        from studio in filmStudios

        join film in films on studio.ID equals film.StudioID

        select new Result {

           StudioID = studio.ID, Studio = studio.Name, Film = film.Title };

 

    PrintResult(result, "InnerJoin");

 

    /*

        OUTPUT:

        InnerJoin: 7 items.

        Star Wars         Lucas Arts

        The Hobbit        MGM

        Mad Max           MGM

        Monsters Inc.     Pixar

        Toy Story         Pixar

        RoboCop           Sony Pictures

        Captain Phillips  Sony Pictures

    */

}

Group join

A group join is a way to group all elements of the joined in source based on the elements in the from source creating a hierarchical result sequence consisting of object arrays. If no matching elements are found in the joined in source an empty array is added.

A group join is essentially an inner-equijoin that produces a result that is organized into groups.

Because the group join cannot identify the key that the elements match on it can be a good idea to store the result in a new type that contains the key name of the elements in the from source, like in the first example.

The following code sample writes the films grouped by the film studios, to the Console window.

public void GroupJoin()

{

    var result =

        from studio in filmStudios

        join film in films on studio.ID equals film.StudioID into filmGroup

        select new GroupResult { Studio = studio.Name, Films = filmGroup };

 

 

    PrintResult(result, "Simple GroupJoin");

 

    /*

        OUTPUT:

        Group: Lucas Arts

        Star Wars         1

 

        Group: MGM

        The Hobbit        2

        Mad Max           2

 

        Group: Pixar

        Monsters Inc.     3

        Toy Story         3

 

        Group: HBO Films

 

        Group: Sony Pictures

        RoboCop           5

        Captain Phillips  5

 

        Simple GroupJoin: 7 items in 5 groups.

    */

}

 

It is also possible to use the result of a group join as a sub query. The following sample code lists all films produced by a film studio with an id less than three.

public void GroupJoinWithSubQuery()

{

    var result =

        from studio in filmStudios

        join film in films on studio.ID equals film.StudioID into filmGroup

           from filmItem in filmGroup

           where filmItem.StudioID < 3

           select new Result {

              StudioID = studio.ID, Studio = studio.Name, Film = filmItem.Title };

 

    PrintResult(result, "GroupJoin With Sub Query");

 

    /*

        OUTPUT:

        Star Wars         Lucas Arts

        The Hobbit        MGM

        Mad Max           MGM

 

        GroupJoin With Sub Query: 3 items.

    */

}

Left outer join

A Left outer join returns all elements in the source in the from clause (the left source). To be able to perform a left join, a group join has to be used in conjunction with the DefaultIfEmpty method on the source that is being joined in; this ensures that empty elements can be returned if none exist for the current element in the left source. You can pass in null or a user-defined type to the DefaultIfEmpty method. In the code sample we use a user-defined type.

In this code sample all studios are listed with their corresponding films. Note that the HBO Films don’t have any films associated with it.

public void LeftOuterJoin()

{

    var result =

        from studio in filmStudios

        join film in films on studio.ID equals film.StudioID into filmGroup

            from item in filmGroup.DefaultIfEmpty(

               new Film { Title = "<No titles found>", StudioID = 0 })

        select new Result { 

            Studio = studio.Name, StudioID = studio.ID,  Film= item.Title };

 

    PrintResult(result, "Left Outer Join");

 

    /*

        OUTPUT:

        Star Wars         Lucas Arts

        The Hobbit        MGM

        Mad Max           MGM

        Monsters Inc.     Pixar

        Toy Story         Pixar

        <No titles found> HBO Films

        RoboCop           Sony Pictures

        Captain Phillips  Sony Pictures

 

        Simple GroupJoin: 8 items.

    */

}

Join using composite keys

Composite keys are used with joins when you want to test equality more than one value. When creating the composite key, you can use an anonymous or a named type containing the values you want to compare. The property names must be the same and they must be in the same order in the composite keys; you must change the property names in the anonymous type if they differ in the composite keys.

Named types must be used when a query is passed between method boundaries.

In this code sample the film title, studio and actor is displayed for the films that has an actor listed.

public void JoinUsingCompositeKeys()

{

    var result =

        from studio in filmStudios

        from actor in actors

        join film in films on new { StudioID = studio.ID, ActorID = actor.ID }

            equals new { StudioID = film.StudioID, ActorID = film.ActorID }

            into details

            from film in details

            select new Result { StudioID = studio.ID, Studio = studio.Name,

               Film = film.Title, Actor = actor.Name };

 

    PrintResult(result, "Join Using Composite Keys");

 

    /*

        Mad Max           MGM             Mel Gibson

        The Hobbit        MGM             Elijah Woods

        Captain Phillips  Sony Pictures   Tom Hanks

 

        Join Using Composite Keys: 3 items.

    */

}

Cross join

If you want to create simple test data fast you can use a cross join between two sources, the result of a cross join is the Cartesian product of the two sources. All items from the first (left) source will be combined with all the items of the second (right) source. The result will always have a size equal to the product of the sizes of the two sources.

To create a cross join you add two from clauses, one for each source.

public void CrossJoin()

{

    var result =

        (from letter in "ABCD".ToCharArray()

            from numbers in "1234".ToCharArray()

            select letter.ToString() + numbers).ToList();

 

    PrintResult(result, "CrossJoin");

 

    /*

        OUTPUT:

        A1 A2 A3 A4

        B1 B2 B3 B4

        C1 C2 C3 C4

        D1 D2 D3 D4

    */

}

Non-equijoin

You have to use multiple from clauses when creating a non-equijoin to introduce the sources. The range variables for the sources are then compared in the where clause; this can be done with a method call, like the Contains method of a collection.

There are cases when a join cannot be used:

  • When inequality comparison is used (a non-equijoin)
  • When more than one expression of equality or inequality is used to predicate a join.
  • When a temporary range is used for the inner (right) source before the join.

Note that using non-equijoins can have a negative impact on performance.

public void NonEquijoin()

{

    var result =

            from film in films

            let actorIDs = from actor in actors

                           select actor.ID

            where actorIDs.Contains(film.ActorID)

            join actor in actors on film.ActorID equals actor.ID

            select new Result { Film = film.Title, Actor = actor.Name };

 

    PrintResult(result, "NonEquijoin");

 

    /*

        OUTPUT:

        The Hobbit                        Elijah Woods

        Mad Max                           Mel Gibson

        Captain Phillips                  Tom Hanks

 

        NonEquijoin: 3 items.

    */

}

 

Stay connected with news and updates!

Join our mailing list to receive the latest news and updates from our team.
Don't worry, your information will not be shared.

Subscribe
Close

50% Complete

Two Step

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.