Modern programming languages and software development
architectures increasingly rely on object-oriented design and
development. As a result, we often need to query and manage objects and
collections of items rather than records and data tables. We also need tools
and languages that are independent of specific data sources or persistence layers.
Language Integrated Query (LINQ) allows developers to query and manage
sequences of items (objects, entities, database records, XML nodes, and so on)
within their software solutions, using a single programming language that is
independent of the original persistence medium. The key feature of LINQ is
its integration with widely used programming languages, made possible by the
use of a syntax common to all kinds of content.
In this article, we will describe the main classes and operators on
which LINQ is based, in order to understand its architecture and learn
its syntax. As we described in “LINQ Introduction,” LINQ provides a basic
infrastructure for many different
implementations of querying engines, such as LINQ to Objects, LINQ to SQL, LINQ
to DataSet, LINQ to Entities, LINQ to XML, and so on. All these query engines
are based on specialized extension methods, which you will read about in
this article. The examples in this article mainly use LINQ to Objects so that
we can focus on queries and operators rather than on specific internal
implementations of the various flavors of LINQ.
LINQ Queries
LINQ is based on a set of query operators, defined as
extension methods, that work mainly with any object that implements
IEnumerable<T>. (For more details about extension methods, see “C#
Language Features” and “Microsoft Visual Basic 9.0 Language Features.”)
This approach makes LINQ a
general-purpose querying framework because many collection types implement
IEnumerable<T>, and developers can implement it in their own types.
This query infrastructure is also very
extensible. Given the architecture of extension methods, developers can
specialize a method’s behavior based on the type of data they are querying. For
instance, LINQ to SQL and LINQ to XML have specialized LINQ operators to handle
relational data and XML nodes, respectively.
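To make the idea of specializing an operator concrete, here is a minimal sketch. The DeveloperCollection type, the DeveloperCollectionExtensions class, and the WhereCalls counter are hypothetical names introduced only for this illustration; they are not part of LINQ. The sketch relies on the fact that the compiler binds the where clause to the most specific Where method it can find for the source type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Developer
{
    public string Name { get; set; }
    public string Language { get; set; }
}

// Hypothetical collection type: more specific than IEnumerable<Developer>.
public class DeveloperCollection : List<Developer> { }

public static class DeveloperCollectionExtensions
{
    // Hypothetical counter, used only to show which Where was bound.
    public static int WhereCalls;

    // Specialized Where: chosen for DeveloperCollection sources
    // instead of the general Enumerable.Where.
    public static IEnumerable<Developer> Where(
        this DeveloperCollection source, Func<Developer, bool> predicate)
    {
        WhereCalls++;
        // Delegate the actual filtering to the standard operator.
        return Enumerable.Where((IEnumerable<Developer>)source, predicate);
    }
}

public static class SpecializationDemo
{
    public static List<string> Run()
    {
        DeveloperCollection developers = new DeveloperCollection {
            new Developer { Name = "Paolo", Language = "C#" },
            new Developer { Name = "Marco", Language = "C#" },
            new Developer { Name = "Frank", Language = "VB.NET" } };

        // The where clause binds to the specialized Where above;
        // select still uses Enumerable.Select.
        IEnumerable<string> query =
            from d in developers
            where d.Language == "C#"
            select d.Name;

        return query.ToList();
    }

    public static void Main()
    {
        foreach (string name in Run())
        {
            Console.WriteLine(name);
        }
        Console.WriteLine("Custom Where calls: " +
            DeveloperCollectionExtensions.WhereCalls);
    }
}
```

The query text itself is unchanged; only the static type of the source decides which implementation runs, which is the mechanism that specialized providers build on.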
Query Syntax
To understand query syntax, we will start with a simple
example. Consider the following Developer type:
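The listing for the type is not reproduced in this text. A minimal sketch that is consistent with the object initializers used in Listing 4-1, assuming nothing more than two public string properties, might look like this:

```csharp
// Hypothetical reconstruction: only Name and Language are implied by Listing 4-1.
public class Developer
{
    public string Name { get; set; }
    public string Language { get; set; }
}
```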
Imagine that you need to query an array of objects of the
Developer type, using LINQ to Objects, extracting the developers who
use C# as their main programming language. The code you might use is shown in
Listing 4-1.
Listing 4-1: A simple LINQ query
    using System;
    using System.Linq;
    using System.Collections.Generic;

    class app {
        static void Main() {
            // Source sequence: an array of Developer objects
            Developer[] developers = new Developer[] {
                new Developer {Name = "Paolo", Language = "C#"},
                new Developer {Name = "Marco", Language = "C#"},
                new Developer {Name = "Frank", Language = "VB.NET"}};

            // The query expression
            IEnumerable<string> developersUsingCsharp =
                from d in developers
                where d.Language == "C#"
                select d.Name;

            // Enumerating the result runs the query
            foreach (string item in developersUsingCsharp) {
                Console.WriteLine(item);
            }
        }
    }
Running this code prints the names of the developers
Paolo and Marco.
The syntax of this query (the from ... where ... select expression in
Listing 4-1) reads something like an SQL statement, although its style
is a bit different. To become familiar with this new syntax,
we will deconstruct its definition.
Query Expression
A query expression is an expression tree that operates on one
or more information sources by applying one or more query operators from either
the group of standard query operators or domain-specific operators. In general,
the evaluation of a query expression results in a sequence of values. A query
expression is evaluated only when its contents are enumerated.
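The point about evaluation happening only on enumeration can be illustrated with a short sketch. The DeferredDemo class and its sample numbers are assumptions made for this example, using LINQ to Objects over a List<int>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static List<int> Run()
    {
        List<int> numbers = new List<int> { 1, 2, 3 };

        // Defining the query does not run it; no element is examined yet.
        IEnumerable<int> evens =
            from n in numbers
            where n % 2 == 0
            select n;

        // A change made to the source after the definition is still observed.
        numbers.Add(4);

        // Enumeration (here through ToList) is what triggers evaluation.
        return evens.ToList();
    }

    public static void Main()
    {
        foreach (int n in Run())
        {
            Console.WriteLine(n); // prints 2, then 4
        }
    }
}
```

Because the query is evaluated each time it is enumerated, calling ToList or ToArray is a common way to take a snapshot of the results when that is what you need.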
The expression is defined by a selection command (select d.Name)
applied to a set of items (from d in developers),
where the from clause targets any instance
of a class that implements the IEnumerable<T> interface.
The selection applies a specific filtering condition (where d.Language == "C#").
The filtering condition simply translates to an invocation of the
Where extension method of the Enumerable class,
defined in the System.Linq namespace. The
select clause likewise translates to an invocation of another extension
method, named Select, also provided by the Enumerable class.
Tip: The Enumerable class, defined in the
System.Linq namespace, provides many query operators for the LINQ to
Objects implementation, defining them as extension methods for types that
implement IEnumerable<T>.
Starting from the considerations just mentioned, we can rewrite the
query expression and resolve its definition into basic elements:
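The rewritten form is not reproduced in this text. A sketch of what it might look like, assuming the Developer type and the data of Listing 4-1 and writing the lambda expressions inline, follows. The MethodSyntaxDemo class name is an assumption made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Developer
{
    public string Name { get; set; }
    public string Language { get; set; }
}

public static class MethodSyntaxDemo
{
    public static IEnumerable<string> Run(Developer[] developers)
    {
        // Same query as Listing 4-1, written as direct extension method calls.
        return developers
            .Where(d => d.Language == "C#")   // filtering condition
            .Select(d => d.Name);             // selection
    }

    public static void Main()
    {
        Developer[] developers = new Developer[] {
            new Developer { Name = "Paolo", Language = "C#" },
            new Developer { Name = "Marco", Language = "C#" },
            new Developer { Name = "Frank", Language = "VB.NET" } };

        foreach (string name in Run(developers))
        {
            Console.WriteLine(name);
        }
    }
}
```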
The Where method and the Select
method both receive lambda expressions as arguments. These lambda expressions are converted to delegates that
are based on a set of generic delegate types, defined within the
System namespace.
Here is the entire family of generic delegate types available:
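The declarations are not reproduced in this text. As a sketch of how the family is used, the Func delegates in .NET Framework 3.5 take from zero to four input arguments, with the last type parameter always being the return type. The FuncFamilyDemo class and the lambdas assigned below are assumptions made for this example.

```csharp
using System;

public static class FuncFamilyDemo
{
    // No arguments, returns int.
    public static readonly Func<int> Zero = () => 42;

    // One argument.
    public static readonly Func<int, int> One = x => x * 2;

    // Two arguments.
    public static readonly Func<int, int, int> Two = (x, y) => x + y;

    // Three arguments.
    public static readonly Func<int, int, int, int> Three =
        (x, y, z) => x + y + z;

    // Four arguments.
    public static readonly Func<int, int, int, int, int> Four =
        (a, b, c, d) => a + b + c + d;

    public static void Main()
    {
        Console.WriteLine(Zero());            // 42
        Console.WriteLine(One(21));           // 42
        Console.WriteLine(Two(2, 3));         // 5
        Console.WriteLine(Three(1, 2, 3));    // 6
        Console.WriteLine(Four(1, 2, 3, 4));  // 10
    }
}
```

The predicate passed to Where in the earlier query is a Func<Developer, bool>, and the argument passed to Select is a Func<Developer, string>.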
Many extension methods of the Enumerable class
accept these delegates as arguments, and we will use them throughout the
examples in this article. A final deconstruction of our initial query might be
something like Listing 4-2.
Listing 4-2: The first LINQ query translated into basic elements
    // Delegate used by Where to filter the items
    Func<Developer, bool> filteringPredicate = d => d.Language == "C#";
    // Delegate used by Select to project each item
    Func<Developer, string> selectionPredicate = d => d.Name;

    IEnumerable<string> expr =
        developers
        .Where(filteringPredicate)
        .Select(selectionPredicate);
The C# 3.0 compiler, like the Visual Basic 9.0 compiler,
translates the LINQ statement (Listing 4-1)
into something like the statement shown in Listing 4-2.
Once you are familiar with it, the query syntax of Listing 4-1
is simpler to write and maintain. It is optional, however:
you can always use the equivalent, more verbose version.
Nevertheless, sometimes it is necessary to use the direct call to an extension
method because query syntax does not cover all possible extension methods.
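As an illustration, C# query syntax has no keyword for operators such as Count or Distinct, so they are called directly on the result of a query expression. The following is a minimal sketch that assumes the Developer type and data of Listing 4-1; the MixedSyntaxDemo class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Developer
{
    public string Name { get; set; }
    public string Language { get; set; }
}

public static class MixedSyntaxDemo
{
    public static int CountCsharp(Developer[] developers)
    {
        // Count has no query keyword, so it wraps the query expression.
        return (from d in developers
                where d.Language == "C#"
                select d).Count();
    }

    public static int DistinctLanguageCount(Developer[] developers)
    {
        // Distinct is likewise available only as a method call.
        return (from d in developers
                select d.Language).Distinct().Count();
    }

    public static void Main()
    {
        Developer[] developers = new Developer[] {
            new Developer { Name = "Paolo", Language = "C#" },
            new Developer { Name = "Marco", Language = "C#" },
            new Developer { Name = "Frank", Language = "VB.NET" } };

        Console.WriteLine(CountCsharp(developers));           // 2
        Console.WriteLine(DistinctLanguageCount(developers)); // 2
    }
}
```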
Full Query Syntax
In the previous section, we defined and deconstructed a
simple query over a list of objects. However, the full LINQ query syntax is
richer than that example suggests. Every query starts with a from
clause and ends with either a select clause or a
group clause. A query starts with a from clause
rather than with a select statement, as in SQL syntax, so that
Microsoft IntelliSense can work within the
remaining part of the query, which makes writing conditions, selections, and
any other LINQ query clauses easier. A select clause
projects the result of an expression into an enumerable object. A
group clause projects the result of an expression into a set of groups,
based on a grouping condition, where each group is an enumerable object. The
following code shows a prototype of the full syntax of a LINQ query expression:
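The syntax prototype is not reproduced in this text. As a concrete illustration of the two ways a query can end, the following sketch shows one query that ends with select and one that ends with group, assuming the Developer type and data of Listing 4-1. The QueryEndingsDemo class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Developer
{
    public string Name { get; set; }
    public string Language { get; set; }
}

public static class QueryEndingsDemo
{
    // Ends with select: projects each item into an enumerable of names.
    public static List<string> Names(Developer[] developers)
    {
        IEnumerable<string> query =
            from d in developers
            select d.Name;
        return query.ToList();
    }

    // Ends with group: produces one enumerable group per language.
    public static Dictionary<string, int> CountByLanguage(Developer[] developers)
    {
        IEnumerable<IGrouping<string, Developer>> groups =
            from d in developers
            group d by d.Language;

        // Each group exposes its Key and can itself be enumerated.
        return groups.ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        Developer[] developers = new Developer[] {
            new Developer { Name = "Paolo", Language = "C#" },
            new Developer { Name = "Marco", Language = "C#" },
            new Developer { Name = "Frank", Language = "VB.NET" } };

        foreach (KeyValuePair<string, int> pair in CountByLanguage(developers))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```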
The first from clause can be followed by
zero or more from, let, or
where clauses. A let clause applies a name to
the result of an expression, while a where clause
defines a filter that will be applied to include specific items in the results.
Each from clause is a generator that represents an
iteration over a sequence on which query operators (such as the extension
methods of System.Linq.Enumerable) are applied.
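A short sketch can show how several from clauses combine with let and where. The GeneratorsDemo class and its sample arrays are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GeneratorsDemo
{
    public static List<string> Run()
    {
        int[] xs = { 1, 2, 3 };
        int[] ys = { 10, 20 };

        IEnumerable<string> query =
            from x in xs            // first generator
            from y in ys            // second generator: iterates ys for every x
            let sum = x + y         // names the result of an expression
            where sum % 2 == 1      // keeps only the odd sums
            select x + "+" + y + "=" + sum;

        return query.ToList();
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
        // Prints:
        // 1+10=11
        // 1+20=21
        // 3+10=13
        // 3+20=23
    }
}
```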
A from clause can be followed by any number
of join clauses. The final select
or group clause can be preceded by an orderby
clause that applies an ordering to the results:
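The syntax prototype for these clauses is not reproduced in this text. The following sketch shows a join clause followed by an orderby clause before the final select. The Project type, its sample data, and the JoinOrderDemo class are hypothetical and introduced only for this illustration; the Developer data matches Listing 4-1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Developer
{
    public string Name { get; set; }
    public string Language { get; set; }
}

// Hypothetical second type, used only to have something to join with.
public class Project
{
    public string Title { get; set; }
    public string Language { get; set; }
}

public static class JoinOrderDemo
{
    public static List<string> Run()
    {
        Developer[] developers = new Developer[] {
            new Developer { Name = "Paolo", Language = "C#" },
            new Developer { Name = "Marco", Language = "C#" },
            new Developer { Name = "Frank", Language = "VB.NET" } };

        Project[] projects = new Project[] {
            new Project { Title = "Billing", Language = "C#" },
            new Project { Title = "Reports", Language = "VB.NET" } };

        IEnumerable<string> query =
            from d in developers
            join p in projects on d.Language equals p.Language
            orderby d.Name            // ascending by default
            select d.Name + ": " + p.Title;

        return query.ToList();
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
        // Prints:
        // Frank: Reports
        // Marco: Billing
        // Paolo: Billing
    }
}
```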
We will use query expressions throughout this article. Refer to
this section when you want to check the syntax of a LINQ query.