C# 3.0 Features
C# 3.0 moves C# in the direction of a functional language,
supporting a more declarative style of coding. LINQ makes extensive use of all
the new features, which also let you use a higher level of abstraction in your
code in areas other than LINQ.
Local Type Inference
Type inference is a wonderful feature for any language. It
preserves type safety while allowing you to write more “relaxed” code. In other
words, you can define variables and use them without worrying too much about
their types, leaving it to the compiler to determine the correct type of a
variable by inferring it from the expression assigned to the variable itself.
The price of type inference is that the code states less explicitly which
types you intend to use, but in our opinion this feature simplifies the
maintenance of local variables for which an explicit type declaration is not
particularly meaningful.
C# 3.0 offers type inference that allows you to define a variable
by using the var keyword instead of a specific type.
This might seem to be equivalent to defining a variable of type
object, but it is not. The following code shows you that an
object type requires the boxing of a value type (see b
declaration), and in any case it requires a cast operation when you want to
operate with the specific type (see d assignment):
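A minimal sketch of that contrast follows. Only the roles of b and d come from
the text above; the other names and the values are illustrative assumptions.

```csharp
using System;

static class ObjectVersusVar {
    public static int Demo() {
        var a = 2;         // a is inferred as int
        object b = 2;      // boxing: the int is wrapped in an object
        int c = a;         // no cast and no unboxing needed
        int d = (int) b;   // a cast is required, and an unboxing is performed
        return c + d;
    }
}
```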
When var is used, the compiler infers the
type from the expression used to initialize the variable. The compiled IL code
contains only the inferred type. In other words, consider this code:
It is perfectly equivalent to this example:
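A minimal sketch of the two equivalent forms, with illustrative variable names
that are not taken from the original listings:

```csharp
using System;

static class VarEquivalence {
    public static bool Demo() {
        // Implicitly typed declarations
        int a = 5;
        var x = a * 2;            // inferred as int
        var name = "Marco";       // inferred as string

        // Explicitly typed declarations that compile to the same local types
        int y = a * 2;
        string name2 = "Marco";

        return x.GetType() == y.GetType()
            && name.GetType() == name2.GetType();
    }
}
```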
Why is this important? The var keyword
calls to mind the Component Object Model (COM) type VARIANT, which was used
pervasively in Visual Basic 6.0, but in reality it is completely
different because it is a type-safe declaration. In fact, the compiler infers
the type exactly as if you had written it yourself.
To some, var might seem to be a tool for
the lazy programmer. Nevertheless, var is the only way
to define an anonymous type variable, as we will describe later.
Note
Variants were a way in COM to implement late binding on the
type of a variable. There was no compile-time check when using variants, and
this caused a lot of nasty bugs that were revealed only when the code was
executed (most of the time, only when it was executed by end users).
The var keyword can be used only within a
local scope. In other words, a local variable can be defined in this way, but
not a member or a parameter. The following code shows some examples of valid
uses of var: x,
y, and r are double types;
d and w are decimal;
s and p are string;
and l is an int. Note
that the constant 2.3 determines the type inferred for
three variables, and that the default keyword produces a “typed”
null from which the correct type of p is inferred.
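The declarations described above can be sketched as follows. The variable names
and inferred types follow the text; the method name, the string literal, and the
returned summary string are assumptions added for illustration.

```csharp
using System;

static class ValidVarUses {
    public static string Demo( decimal d ) {
        var x = 2.3;                // double, from the constant
        var y = x;                  // double
        var r = x / y;              // double
        var s = "sample";           // string
        var l = s.Length;           // int
        var w = d;                  // decimal
        var p = default( string );  // string, holding a null reference
        return String.Format( "{0},{1},{2},{3},{4},{5},{6}",
            x.GetType().Name, y.GetType().Name, r.GetType().Name,
            s.GetType().Name, l.GetType().Name, w.GetType().Name,
            p == null );
    }
}
```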
The next sample shows some cases in which the var
keyword is not allowed:
The k type can be inferred by the constant
initializer, but var is not allowed on type members.
The result type of InvalidUseResult could be inferred
by the internal return statement, but even this syntax
is not allowed.
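As a sketch of these restrictions, the invalid declarations are shown commented
out so that the class still compiles. The names k and InvalidUseResult come from
the text; every other name is a hypothetical placeholder.

```csharp
using System;

class VarRestrictions {
    // var k = 0;                                  // not allowed: type member
    int k = 0;                                     // an explicit type is required

    // public void InvalidUseParameter( var x ) {} // not allowed: parameter
    // public var InvalidUseResult() { return 2; } // not allowed: result type
    public int ValidUseResult() { return k + 2; }

    public static int Demo() {
        // var x;         // not allowed: an initializer is required
        // var y = null;  // not allowed: no type can be inferred from null
        var instance = new VarRestrictions();      // allowed: local scope
        return instance.ValidUseResult();
    }
}
```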
This simple language feature lets us eliminate almost all local variable type
declarations. Although this simplifies writing code, it can make reading code
more difficult. For example, if you call an overloaded method whose versions
differ only in parameter types, it might be unclear from reading the code
which version of the method is being called. Similar problems, however, arise
from the poor use of method overloading: you should use different method names
when the behavior (and the meaning) of the methods is different.
Lambda Expressions
C# 2.0 introduced the capability to “pass a pointer to some
code” as a parameter by using anonymous methods. This concept is a powerful
one, but what you really pass in this way is a reference to a method, not
exactly a piece of code. That reference points to strongly typed code that is
generated at compile time. Using generics, you can obtain more flexibility, but
it is hard to apply standard operators to a generic type.
C# 3.0 introduces lambda expressions, which allow the definition of
anonymous methods using more concise syntax. Lambda expressions can also
optionally postpone code generation by creating an expression
tree that allows further manipulation before code is actually
generated, which happens at execution time. An expression tree can be generated
only for the particular “pieces of code” that are expressions.
The following code shows a simple use of an anonymous method:
In the following examples, we use similar versions of the
Aggregate method, so we will not reproduce it each time. The anonymous
method passed as a parameter to Aggregate defines the
aggregate operation that is executed for each element of the List
object that is used.
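A minimal sketch of such an Aggregate method and of a call that uses an
anonymous method might look like the following. The Numbers container, the
sample values, and the body of Aggregate are assumptions; only the shape of the
call and the two-parameter Func<T> delegate are suggested by the text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container exposing a Values list, as l.Values suggests
class Numbers {
    public List<int> Values = new List<int>();
}

static class AnonymousMethodSketch {
    public delegate T Func<T>( T a, T b );

    // Combines the elements by applying f to the running result and the next one
    public static T Aggregate<T>( List<T> values, Func<T> f ) {
        T result = default( T );
        bool first = true;
        foreach (T value in values) {
            if (first) { result = value; first = false; }
            else { result = f( result, value ); }
        }
        return result;
    }

    public static int Demo() {
        Numbers l = new Numbers();
        l.Values.AddRange( new int[] { 1, 2, 3, 4 } );
        // C# 2.0 anonymous method passed as the aggregate operation
        int sum = Aggregate(
            l.Values,
            delegate( int a, int b ) { return a + b; }
        );
        return sum;
    }
}
```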
Using lambda expression syntax, we can write the Aggregate
call as shown in Listing 2-14.
Listing 2-14: Explicitly
typed parameter list
sum = Aggregate(
    l.Values,
    ( int a, int b ) => { return a + b; }
);
You can read this formula as “given a and b, both integers, return
a+b, which is the sum of a and b.”
We removed the delegate keyword before the
parameter list and added the => token between the
parameter list and the method code. At this stage, the difference is only
syntactical because the compiled code is identical to the result of the
anonymous method syntax. However, lambda expression syntax allows you to write
the same code as shown in Listing 2-15.
Listing 2-15: Implicitly
typed parameter list
sum = Aggregate(
    l.Values,
    ( a, b ) => { return a + b; }
);
Note
The pronunciation of the => token
has no official definition. A few developers use “such that” when the lambda
expression is a predicate and “becomes” when it is a projection. Other
developers say generically “goes to.”
You can read this formula as “given a and b, return a+b, whatever
‘+’ means for the type of a and b.” (The “+” operator must exist for the
concrete type of a and b, which is inferred from the context; otherwise, the
code will not compile.)
Although we removed parameter types from the parameter list, the
compiler will infer parameter types from the Aggregate call.
We are calling a generic method, but the generic type T
is defined from the l.Values parameter, which is a
List<int> type. In this call, T is an
int; therefore, the Func<T> delegate is a
Func<int>, and both a and
b are of type int.
You can think of this syntax as more similar to a var
declaration than to another form of generic use. The type resolution is made at
compile time. If a parameter type is generic, you
cannot access operators and members other than those allowed by type
constraints. If it is a regular type, you have full access to the operators
(such as the “+” operator we are using) and to any members defined on that type.
A lambda expression can define a body in two ways. We have seen the
statement body, which requires braces like any other block of code and a
return statement before the expression that has to be returned. The
other form is the expression body, which can be used when the code inside the
block is only a return followed by an expression. You
can simply omit the braces and the return statement,
as shown in Listing 2-16.
Listing 2-16: Expression
body
sum = Aggregate(
    l.Values,
    ( a, b ) => a + b
);
When we worked with lambda expressions for the first time, we felt
some confusion until we realized that they are only a more powerful syntax with
which to write an anonymous method. This is an important concept to remember,
because you can always access identifiers that are not defined in the parameter
list. In other words, remember that the parameter list defines the parameters
of the anonymous method. Any other identifier inside the body (either a
statement or an expression) of a lambda expression has to be resolved within
the anonymous method definition. The following code shows an example of this.
(The AggregateSingle<T> method uses a slightly
different delegate for the second parameter, declared as delegate
T FuncSingle<T>( T a )).
This lambda expression has only the x parameter;
sum is a local variable of the containing method, and
its lifetime is extended over the lifetime of the delegate instance that points
to the anonymous method defined by the lambda expression itself. Remember that
the result of the corresponding return sum += x statement
is the value of sum after x has been added to it.
When a lambda expression has only one parameter, the parentheses
can be omitted from the parameter list, as in this example:
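The following sketch covers both points: a lambda expression that captures the
local variable sum, written first with and then without parentheses around its
single parameter. The FuncSingle<T> delegate matches the declaration quoted
above; the body of AggregateSingle and the sample values are assumptions.

```csharp
using System;
using System.Collections.Generic;

static class ClosureSketch {
    public delegate T FuncSingle<T>( T a );

    // Applies f to every element and returns the result of the last call
    public static T AggregateSingle<T>( List<T> values, FuncSingle<T> f ) {
        T result = default( T );
        foreach (T value in values) {
            result = f( value );
        }
        return result;
    }

    public static int Demo() {
        List<int> values = new List<int> { 1, 2, 3, 4 };
        int sum = 0;
        // sum belongs to the containing method; x is the only parameter
        int first = AggregateSingle( values, ( x ) => sum += x );
        sum = 0;
        // Same lambda expression, parentheses omitted
        int second = AggregateSingle( values, x => sum += x );
        return first == second ? sum : -1;
    }
}
```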
If there are no parameters for a lambda expression, two parentheses
are required before the => token. The code in
Listing 2-17 shows some of the possible syntaxes.
Listing 2-17: Lambda
expression examples
( int a, int b ) => { return a + b; }  // Explicitly typed, statement body
( int a, int b ) => a + b              // Explicitly typed, expression body
( a, b ) => { return a + b; }          // Implicitly typed, statement body
( a, b ) => a + b                      // Implicitly typed, expression body
( x ) => sum += x                      // Single parameter with parentheses
x => sum += x                          // Single parameter no parentheses
() => sum + 1                          // No parameters
A practical use of lambda expressions is in writing small pieces of
code inside the parameter list of a method call. The following code shows an
example of a predicate passed as a parameter to a generic Display
method that iterates an array of elements and displays only those that make the
predicate true. The predicate and its use are highlighted in the code. The
Func delegate shown in Listing 2-18 is explained in the following
pages.
Listing 2-18: Lambda
expression as a predicate
public static void Demo() {
    string[] names = { "Marco", "Paolo", "Tom" };
    Display( names, s => s.Length > 4 );
}
public static void Display<T>( T[] names, Func<T, bool> filter ) {
    foreach( T s in names ) {
        if (filter( s )) Console.WriteLine( s );
    }
}
The execution results in a list of names having more than four
characters. The conciseness of this syntax is one reason for using lambda
expressions in LINQ; the other reason is the potential to create an expression
tree.
To this point, we have considered the difference between the
statement body and the expression body only as a different syntax that can be
used to produce the same code, but there is something more. A lambda
expression can also be assigned to a variable of these delegate types:
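As a minimal sketch, assuming the generic Func delegate types that .NET
Framework 3.5 declares in the System namespace (the variable names and values
are illustrative):

```csharp
using System;

static class FuncDelegates {
    public static int Demo() {
        Func<int> constant = () => 40;                  // no parameters
        Func<int, int> increment = x => x + 1;          // one parameter
        Func<int, int, int> add = ( a, b ) => a + b;    // two parameters
        Func<string, bool> isLong = s => s.Length > 4;  // predicate
        return isLong( "Marco" ) ? add( increment( constant() ), 1 ) : 0;
    }
}
```

In each declaration the last type argument is the return type and the preceding
ones are the parameter types.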
There are no requirements for defining these delegates in a
particular way. The .NET Framework 3.5 declares a set of generic Func delegates
in the System namespace, and LINQ relies on them, but lambda expression
functionality does not depend on these declarations. You can make your own,
even with a name other than Func. A lambda expression can also be converted to
an expression tree: in that case, instead of IL code for the lambda body, the
compiler emits a data representation of the lambda
expression that can be manipulated and converted into executable code at
execution time. An expression tree is an instance of a System.Linq.Expressions.Expression<T>
class, where T is the delegate type that the expression tree
represents.
In many ways, the use of lambda expressions to create an expression
tree makes lambda expressions similar to generic methods. The difference is
that generic methods are already described as IL code at compile time (only the
type parameters used are not completely specified), while an expression tree
becomes IL code only at execution time. Only lambda expressions with an
expression body can be converted into an expression tree, and this conversion
is not possible if the lambda expression contains a statement body.
Listing 2-19 shows how the same lambda expression can be converted into
either a delegate or an expression tree. The highlighted lines show the
assignment of the expression tree and its use.
Listing 2-19: Use
of an expression tree
using System;
using System.Linq.Expressions;

class ExpressionTree {
    delegate T Func<T>( T a, T b );
    public static void Demo() {
        Func<int> x = (a, b) => a + b;
        Expression<Func<int>> y = (a, b) => a + b;
        Console.WriteLine( "Delegate" );
        Console.WriteLine( x.ToString() );
        Console.WriteLine( x( 29, 13 ) );
        Console.WriteLine( "Expression tree" );
        Console.WriteLine( y.ToString() );
        Console.WriteLine( y.Compile()( 29, 13 ) );
    }
}
Here is the output of Demo execution. The
result of the invocation is the same (42), but the output of the
ToString() invocation is different.
The expression tree maintains a representation of the
expression in memory. You cannot invoke an expression tree directly, as we did
with the x delegate.
When you want to evaluate the expression, you need to compile it. The
invocation of the Compile method returns a delegate
that can be invoked through the Invoke method (or the
compact delegate invocation syntax we used in the preceding example). We do not
have space here for a deeper investigation of deferred query evaluation, but it
is an important foundation for many parts of LINQ. For example, LINQ to SQL has
methods that navigate an expression tree and convert it into an SQL statement.
That conversion is made at execution time and not at compile time.
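To make the idea of navigating an expression tree more concrete, here is a
minimal sketch that inspects the nodes of a tree before compiling it. It assumes
the System.Func<int, int, int> delegate rather than the custom delegate of
Listing 2-19, and the string it builds is only an illustration, not what a LINQ
provider actually generates.

```csharp
using System;
using System.Linq.Expressions;

static class ExpressionTreeSketch {
    public static string Demo() {
        Expression<Func<int, int, int>> tree = ( a, b ) => a + b;

        // The tree is data: its nodes can be examined before any code runs
        BinaryExpression body = (BinaryExpression) tree.Body;
        string shape = String.Format( "{0}({1},{2})",
            body.NodeType,
            ((ParameterExpression) body.Left).Name,
            ((ParameterExpression) body.Right).Name );

        // Compile turns the tree into an invocable delegate at execution time
        Func<int, int, int> compiled = tree.Compile();
        return shape + "=" + compiled( 29, 13 );
    }
}
```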
Extension Methods
C# is an object-oriented programming language that allows the
extension of a class through inheritance. Nevertheless, designing a class that
can be inherited in a safe way and maintaining that class in the future is hard
work. A safe way to write such code is to declare all classes as sealed, unless
they are designed as inheritable. In that case, safety is set against agility.
More Info
Microsoft .NET allows class A in assembly X.DLL to be
inherited by class B in assembly Y.DLL. This implies that a new version of
X.DLL should be designed to be compatible even with older versions of Y.DLL. C#
and .NET have many tools to help in this effort. However, we can say that a
class has to be designed as inheritable if you want to allow its derivation;
otherwise, you run the risk that making a few changes in the base classes will
break existing code in derived classes. If you do not design a class to be
inheritable, it is better to make the class sealed, or
at least private or internal.
C# 3.0 introduces a syntax that conceptually extends an existing
type (either reference or value) by adding new methods without deriving it into
a new type. Some might consider the results of this change to be only syntactic
sugar, but this capability makes LINQ code more readable and easier to write.
The methods that extend a type can use only the public members of the type
itself, just as you can do from any piece of code outside the target type.
The following code shows a traditional approach to writing two
methods (FormattedUS and FormattedIT)
that convert a decimal value into a string formatted
with a specific culture:
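A minimal sketch of that traditional approach follows. The method names come
from the text; the class name and the format string, borrowed from Listing
2-20, are assumptions about the original listing.

```csharp
using System;
using System.Globalization;

static class Traditional {
    static CultureInfo formatUS = new CultureInfo( "en-US" );
    static CultureInfo formatIT = new CultureInfo( "it-IT" );

    // Plain static helpers: the decimal value is just an ordinary parameter
    public static string FormattedUS( decimal d ) {
        return String.Format( formatUS, "{0:#,0.00}", d );
    }
    public static string FormattedIT( decimal d ) {
        return String.Format( formatIT, "{0:#,0.00}", d );
    }

    public static void Demo() {
        decimal x = 1234.568M;
        Console.WriteLine( FormattedUS( x ) );
        Console.WriteLine( FormattedIT( x ) );
    }
}
```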
There is no link between these methods and the decimal
type other than the methods’ parameters. We can change this code to extend the
decimal type. It is a value type and not inheritable, but we can add
the this keyword before the first parameter type of our
methods, and in this way use each method as if it were defined inside the
decimal type. Changes are highlighted in the code shown in Listing 2-20.
Listing 2-20: Extension
methods declaration
using System;
using System.Globalization;

static class ExtensionMethods {
    public static void Demo() {
        decimal x = 1234.568M;
        Console.WriteLine( x.FormattedUS() );
        Console.WriteLine( x.FormattedIT() );
        Console.WriteLine( FormattedUS( x ) ); // Traditional call allowed
        Console.WriteLine( FormattedIT( x ) ); // Traditional call allowed
    }
    static CultureInfo formatUS = new CultureInfo( "en-US" );
    static CultureInfo formatIT = new CultureInfo( "it-IT" );
    public static string FormattedUS( this decimal d ) {
        return String.Format( formatUS, "{0:#,0.00}", d );
    }
    public static string FormattedIT( this decimal d ) {
        return String.Format( formatIT, "{0:#,0.00}", d );
    }
}
An extension method must be static, must be declared inside a non-nested,
non-generic static class,
and must have the keyword this before the first
parameter type, which is the type that the method extends. Extension methods
are usually public because they can be (and normally are) called from outside
the class where they are declared.
Although this is not a big revolution, one advantage is
Microsoft IntelliSense support, which can show all extension methods
accessible to a given identifier. Moreover, the result type of the extension
method might be the extended type itself. In this case, we can extend a type
with many methods, all working on the same data, and chain their calls. LINQ
very frequently uses extension methods in this way.
We can write a set of extension methods to decimal
as shown in Listing 2-21.
Listing 2-21: Extension
methods for native value types
static class ExtensionMethods {
    public static decimal Double( this decimal d ) {
        return d + d;
    }
    public static decimal Triple( this decimal d ) {
        return d * 3;
    }
    public static decimal Increase( this decimal d ) {
        return ++d;
    }
    public static decimal Decrease( this decimal d ) {
        return --d;
    }
    public static decimal Half( this decimal d ) {
        return d / 2;
    }
    // …
}
In Listing 2-22, we can compare the two calling syntaxes, the
classical one (x) and the new one (y).
Listing 2-22: Extension
methods call order
decimal x = 14M, y = 14M;
x = Half( Triple( Decrease( Decrease( Double( Increase( x ) ) ) ) ) );
y = y.Increase().Double().Decrease().Decrease().Triple().Half();
The result for both x and y
is 42. The classical syntax requires several nested
calls that have to be read from the innermost to the outermost. The new syntax
acts as though our new methods are members of the decimal type. The call order
follows the read order (left to right) and is much easier to understand.
Note
It is important to recognize that extension methods come at a
price. When you call an instance method of a type, you can expect that the
call might modify the state of the instance. An extension method, however, can
do that only by calling public members of the extended type, as we
already said. When an extension method returns the same type that it extends,
callers will tend to assume that it leaves the state of the original instance
unchanged and returns a new value. This is a reasonable recommendation when
extending value types, but we cannot assume the same for every reference type,
because the related cost (creating a copy of an object for each call) could be
too high.
An extension method is not automatically considered. Its resolution
follows some rules. Here is the order of evaluation used to resolve a method
for an identifier:
- Instance method: If an instance method exists, it
has priority.
- Extension method: The search for an extension method
is made through all static classes in the “current namespace” and in all
namespaces included in active using directives. (Current
namespace refers to the closest enclosing namespace declaration. This
is the namespace that contains the static class with the extension method
declaration.) If two types contain the same extension method, the compiler
raises an error.
The most common use of extension methods is to define them in
static classes in specific namespaces, importing them into the calling code by
specifying one or more using directives in the module.
These precedence rules used to resolve a method call define a
feature that is not apparent at first sight. When you call an extension method
on a class, it can always be replaced by a specific version of the method
defined as a member method for a particular type. In other words, the extension
method represents a “default” implementation for a method, which can always be
overridden by a specialized version for specific classes.
We can see this behavior in a few examples. The first code example
contains an extension method for the object type; in
this way, you can call Display on an instance of any
type. We call it on our own Customer class instance:
The result of executing this code is the class name
Customer.
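A minimal sketch of this first example follows. The Customer class is reduced to
a hypothetical single field, and the DisplayDemo class is an assumption; the
extension method itself matches the first method of Listing 2-23.

```csharp
using System;

// Hypothetical, simplified Customer: it does not override ToString
public class Customer {
    public string Name;
}

static class Visualizer {
    // Extension method on object: callable on an instance of any type
    public static void Display( this object o ) {
        string s = o.ToString();
        Console.WriteLine( s );
    }
}

static class DisplayDemo {
    public static void Demo() {
        Customer c = new Customer { Name = "Marco" };
        c.Display();   // object.ToString returns the type name
    }
}
```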
We can customize the behavior of the Display
method for the Customer class, defining an overloaded
extension method, as shown in Listing 2-23. (We could define an overloaded
extension method in another namespace if this namespace
had a higher priority in the resolution order.)
Listing 2-23: Extension
methods overload
static class Visualizer {
    public static void Display( this object o ) {
        string s = o.ToString();
        Console.WriteLine( s );
    }
    public static void Display( this Customer c ) {
        string s = String.Format( "Name={0}", c.Name );
        Console.WriteLine( s );
    }
}
This time the more specialized version is executed, as we can see
from the execution output, shown here:
Without removing these extension methods, we can add other special
behavior to Display by implementing it as an instance
method in the Customer class. This implementation,
shown in Listing 2-24, will have precedence over any other extension method
for a type equal to or derived from Customer.
Listing 2-24: Instance
method over extension methods
public class Customer {
    protected int Id;
    public string Name;
    public Customer( int id ) {
        this.Id = id;
    }
    public void Display() {
        string s = String.Format( "{0}-{1}", Id, Name );
        Console.WriteLine( s );
    }
}
The execution output, shown here, illustrates that the instance
method is now called:
At first glance, this behavior seems to overlap functionality
provided by virtual methods. It does not, however, because an extension method
is resolved at compile time, while virtual methods are resolved at
execution time. This means that if you call an extension method on an object
through a variable declared as a base class, the run-time type of the
referenced object is not relevant. If a compatible extension method
exists for the declared type, it is used even when the actual instance belongs
to a derived class that defines an instance method with the same name. The code
in Listing 2-25 illustrates this concept.
Listing 2-25: Extension
methods resolution
public class A {
    public virtual void X() {}
}
public class B : A {
    public override void X() {}
    public void Y() {}
}
public static class E {
    static void X( this A a ) {}
    static void Y( this A b ) {}
    public static void Demo() {
        A a = new A();
        B b = new B();
        A c = new B();
        a.X(); // Call A.X
        b.X(); // Call B.X
        c.X(); // Call B.X
        a.Y(); // Call E.Y
        b.Y(); // Call B.Y
        c.Y(); // Call E.Y
    }
}
The X method is always resolved by the
instance method. It is a virtual method, and for this reason c.X()
calls the B.X overridden implementation. The extension
method E.X is never called on these objects.
As an instance method, Y is defined only on the
B class, while E.Y is an extension method for the A class;
therefore, only b.Y() calls the B.Y
implementation. Note that c.Y() calls E.Y:
the c identifier is declared as an
A type, and Y is not defined as an instance method in class A,
even though c references an instance of type B.
A final point to consider regarding a generic extension method is
that when you use a generic type as the parameter that you mark with the
this keyword, you are extending not only a class but a whole set of
classes. We found that this operation is not very intuitive when you are
designing a components library, but it is a comfortable approach when you are
writing the code that uses them. The following code is a slightly modified
version of a previous example of lambda expressions. We added the
this keyword to the names parameter and changed
the invocation of the Display method. Important changes
are highlighted in the code shown in Listing 2-26.
Listing 2-26: Lambda
expression as predicate
public static void Display<T>( this T[] names, Func<T, bool> filter ) {…}
public static void Demo() {
    string[] names = { "Marco", "Paolo", "Tom" };
    names.Display( s => s.Length > 4 );
    // It was: Display( names, s => s.Length > 4 );
}
The Display method can be used with a
different class (for example, an array of type int),
and it will always require a predicate with a parameter that is the same type
as the array. The following code uses the same Display method,
showing only the even values:
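A minimal sketch of that call follows. The body of Display is taken from Listing
2-18, while the class name and the sample values are assumptions.

```csharp
using System;

static class GenericDisplay {
    public static void Display<T>( this T[] items, Func<T, bool> filter ) {
        foreach (T item in items) {
            if (filter( item )) Console.WriteLine( item );
        }
    }

    public static void Demo() {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };   // hypothetical sample data
        numbers.Display( n => n % 2 == 0 );     // T is inferred as int
    }
}
```

Because T is inferred as int, the parameter n of the predicate is an int, and
the remainder operator can be applied to it without any cast.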
As you learn more about extension methods, you can start to
see a language that is more flexible but still strongly typed.
Object Initialization Expressions
C# 1.x allows the initialization of a
field or a local variable in a single statement. The syntax shown here can
initialize a single identifier:
When an initialization statement of this kind is applied to a
reference type, it requires a call to a class constructor that has parameters
that specify how to initialize the inner state of the instance created. You can
use an object initializer on both reference and value types.
When you want to initialize an object (either a reference or value
type), you need a constructor with enough parameters to specify the initial
state of the object you want to initialize. Consider this code:
The customer instance is initialized
through the Customer constructor, but we set only the
Name and Age fields. If we want to set
Country but not Age, we need to write code such
as that shown in Listing 2-27.
Listing 2-27: Standard
syntax for object initialization
Customer customer = new Customer();
customer.Name = "Marco";
customer.Country = "Italy";
C# 3.0 introduces a shorter form of object initialization syntax
that generates functionally equivalent code, shown in Listing 2-28.
Listing 2-28: Object
initializer
// Implicitly calls default constructor before object initialization
Customer customer = new Customer { Name = "Marco", Country = "Italy" };
Note
The two syntaxes used to initialize an object (standard and
object initializer) are equivalent after the code is compiled. Object
initializer syntax produces a call to a constructor of the specified type
(either a reference or value type): this is the default constructor whenever
you do not place parentheses between the type name and the opening brace. If
that constructor assigns values to member fields that the initializer then sets
again, the constructor's assignments are still executed, although their values
are overwritten. An object initializer has no additional cost if the called
constructor of the initialized type is empty.
The names assigned in an initialization list can correspond to
either fields or properties that are public members of the initialized object.
The syntax also allows for specifying a call to a nondefault constructor, which
might be necessary if the default constructor is not available for a type.
Listing 2-29 shows an example.
Listing 2-29: Explicit
constructor call in object initializer
// Explicitly specify constructor to call before object initialization
Customer c1 = new Customer() { Name = "Marco", Country = "Italy" };
// Explicitly specify nondefault constructor
Customer c2 = new Customer( "Paolo", 21 ) { Country = "Italy" };
The c2 assignment above is equivalent to
this one:
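A minimal sketch of the two equivalent forms, assuming a hypothetical Customer
class that exposes the members and constructors used in the listings:

```csharp
using System;

// Hypothetical Customer, matching the members used in the listings
public class Customer {
    public string Name;
    public int Age;
    public string Country;
    public Customer() {}
    public Customer( string name, int age ) { Name = name; Age = age; }
}

static class InitializerEquivalence {
    public static Customer WithInitializer() {
        return new Customer( "Paolo", 21 ) { Country = "Italy" };
    }

    public static Customer WithStatements() {
        // Constructor call first, then one assignment per initialized member
        Customer c2 = new Customer( "Paolo", 21 );
        c2.Country = "Italy";
        return c2;
    }
}
```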
Note
The real implementation of an object initializer creates and
initializes the object in a temporary variable and copies the reference to the
destination variable only at the end. In this way, the object is
not visible to another thread until it is completely initialized.
One of the advantages of the object initializer is that it allows
for writing a complete initialization in a functional form: you can put it
inside an expression without using different statements. Therefore, the syntax
can also be nested, repeating the syntax for the initial value of a member into
an initialized object. The classic Point and
Rectangle class example shown in Listing 2-30 (part of the C# 3.0
specification document) illustrates this.
Listing 2-30: Nested
object initializers
public class Point {
    int x, y;
    public int X { get { return x; } set { x = value; } }
    public int Y { get { return y; } set { y = value; } }
}
public class Rectangle {
    Point tl, br;
    public Point TL { get { return tl; } set { tl = value; } }
    public Point BR { get { return br; } set { br = value; } }
}
// Possible code inside a method
Rectangle r = new Rectangle {
    TL = new Point { X = 0, Y = 1 },
    BR = new Point { X = 2, Y = 3 }
};
The compiled initialization code for r is
equivalent to the following:
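The expansion can be sketched as follows. The names point1 and point2 are the
ones used in the discussion of this code; the rectangle temporary, the class
name, and the automatic properties (another C# 3.0 feature, used here only for
brevity) are assumptions.

```csharp
using System;

public class Point {
    public int X { get; set; }
    public int Y { get; set; }
}
public class Rectangle {
    public Point TL { get; set; }
    public Point BR { get; set; }
}

static class NestedInitializerExpansion {
    public static Rectangle Demo() {
        Rectangle rectangle = new Rectangle();   // temporary for the outer object
        Point point1 = new Point();
        point1.X = 0;
        point1.Y = 1;
        rectangle.TL = point1;
        Point point2 = new Point();
        point2.X = 2;
        point2.Y = 3;
        rectangle.BR = point2;
        Rectangle r = rectangle;   // assigned only after full initialization
        return r;
    }
}
```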
Now that you have seen this code, it should be clear that the
shorter syntax has a real advantage in terms of code readability. The two
temporary variables, point1 and point2,
are also created in the object initializer form, but we do not need to
define them explicitly.
The previous example used the nested object initializers with
reference types. The same syntax also works for value types, but you have to
remember that a copy of a temporary Point object is
made when the TL and BR members
are initialized.
Note
Copying value types can have performance implications on
large value types, but this is not related to the use of object initializers.
The object initializer syntax can be used only for assignment of
the initial value of a field or variable. The new keyword
is required only for the final assignment. Inside an initializer, you can skip
the new keyword in an object member’s initialization.
In this case, the code uses the object instance created by the constructor of
the containing object, as shown in Listing 2-31.
Listing 2-31: Initializers
for owned objects
public class Rectangle {
    Point tl = new Point();
    Point br = new Point();
    public Point TL { get { return tl; } }
    public Point BR { get { return br; } }
}
// Possible code inside a method
Rectangle r = new Rectangle {
    TL = { X = 0, Y = 1 },
    BR = { X = 2, Y = 3 }
};
The TL and BR member
instances are implicitly created by the Rectangle class
constructor. The object initializer for TL and
BR does not have the new keyword. In this way,
the initializer works on the existing instance of TL and
BR.
In the examples so far, we have used only constants within the
object initializers. You can also use other calculated values, as shown here:
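A minimal sketch of an initializer built from calculated values, assuming a
hypothetical Customer class and arbitrary expressions chosen for illustration:

```csharp
using System;

// Hypothetical Customer with the members used in the surrounding listings
public class Customer {
    public string Name;
    public int Age;
    public string Country;
}

static class ComputedInitializers {
    public static Customer Demo( Customer c1, Customer c2 ) {
        // Each member is initialized from an expression evaluated at run time
        return new Customer {
            Name = c1.Name.ToUpper(),
            Age = c2.Age + 1,
            Country = c2.Country
        };
    }
}
```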
C# 1.x included the concept of initializers
that used a similar syntax, but it was limited to arrays:
The same new object initializer syntax can also be used for
collections. The internal list can be made of constants, expressions, or other
initializers, like any other object initializer we have already shown. If the
collection class implements the System.Collections.Generic.ICollection<T>
interface, for each element in the initializer a call to ICollection<T>.Add(T)
is made with the same order of the elements. If the collection class implements
the IEnumerable interface, the Add()
method is called for each element in the initializer. The code in Listing
2-32 shows some examples of using collection initializers.
Listing 2-32: Collection
initializers
// Collection classes that implement ICollection<T>
List<int> integers = new List<int> { 1, 3, 9, 18 };
List<Customer> list = new List<Customer> {
    new Customer( "Jack", 28 ) { Country = "USA" },
    new Customer { Name = "Paolo" },
    new Customer { Name = "Marco", Country = "Italy" },
};
// Collection classes that implement IEnumerable
// (renamed so that both groups can be declared in the same scope)
ArrayList untypedIntegers = new ArrayList() { 1, 3, 9, 18 };
ArrayList untypedList = new ArrayList {
    new Customer( "Jack", 28 ) { Country = "USA" },
    new Customer { Name = "Paolo" },
    new Customer { Name = "Marco", Country = "Italy" },
};
In summary, object and collection initializers allow the
creation and initialization of a set of objects (possibly nested) within a
single expression. LINQ makes extensive use of this feature, especially through
anonymous types.
Anonymous Types
An object initializer can also be used without specifying the
class that will be created with the new operator. In that case,
a new class (an anonymous type) is created. Consider the example shown in
Listing 2-33.
Listing 2-33: Anonymous
types definition
Customer c1 = new Customer { Name = "Marco" };
var c2 = new Customer { Name = "Paolo" };
var c3 = new { Name = "Tom", Age = 31 };
var c4 = new { c2.Name, c2.Age };
var c5 = new { c1.Name, c1.Country };
var c6 = new { c1.Country, c1.Name };
The variables c1 and c2
are of the Customer type, but the type of variables
c3, c4, c5, and
c6 cannot be inferred simply by reading the printed code. The
var keyword should infer the variable type from the assigned
expression, but this one has a new keyword without a
type specified. As you might expect, that kind of object initializer generates
a new class.
The generated class has a public property and an underlying private
field for each member listed in the initializer; the name and type of each
property are inferred from the object initializer itself. When the name is not
explicit, it is inferred from the initialization expression, as in the definitions for
c4, c5, and c6. This
shorter syntax is called a projection initializer because
it projects not just a value but also the name of the value.
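The shape of the generated class can be inspected through reflection. The following is a minimal sketch, with illustrative values that are not taken from the listing. It builds an anonymous type with a projection initializer and reports the name and type of a property. It also reports whether the property is writable, which goes beyond the text above: in C# the generated properties are read-only.

```csharp
using System;
using System.Reflection;

public static class AnonymousTypeShape
{
    public static object MakeProjection()
    {
        var source = new { Name = "Paolo", Age = 40 };
        // Projection initializer: the property names Name and Age
        // are taken from the member names on the right-hand side.
        return new { source.Name, source.Age };
    }

    // Describes one property as "Name:Type:access".
    public static string Describe(object value, string propertyName)
    {
        PropertyInfo p = value.GetType().GetProperty(propertyName);
        string access = p.CanWrite ? "read-write" : "read-only";
        return p.Name + ":" + p.PropertyType.Name + ":" + access;
    }

    public static void Main()
    {
        object projected = MakeProjection();
        Console.WriteLine(Describe(projected, "Name"));
        Console.WriteLine(Describe(projected, "Age"));
    }
}
```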
That class is the same for all possible anonymous types whose
properties have the same names and types in the same order. We can see the type
names used and generated in this code:
The following is the output that is generated:
The anonymous type name cannot be referenced in code (you do
not know the generated name), but it can be queried on an object instance. The
variables c3 and c4 are of the
same anonymous type because they have the same properties, with the same names
and types, in the same order. Even though
c5 and c6 have the same properties (type and
name), they are in a different order, and that is enough for the compiler to
create two different anonymous types.
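These identity rules can be checked without knowing the generated names, by comparing the runtime types directly. The sketch below assumes a hypothetical Customer class with Name, Age, and Country properties; the values are illustrative.

```csharp
using System;

// Hypothetical Customer shape, assumed for this sketch.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
    public string Country { get; set; }
}

public static class AnonymousTypeIdentity
{
    private static bool SameType(object a, object b)
    {
        return a.GetType() == b.GetType();
    }

    // Returns { c3 and c4 share a type, c5 and c6 share a type }.
    public static bool[] Compare()
    {
        Customer c1 = new Customer { Name = "Marco", Country = "Italy" };
        var c2 = new Customer { Name = "Paolo", Age = 40 };
        var c3 = new { Name = "Tom", Age = 31 };
        var c4 = new { c2.Name, c2.Age };        // string Name, int Age
        var c5 = new { c1.Name, c1.Country };    // Name first
        var c6 = new { c1.Country, c1.Name };    // Country first
        return new bool[] { SameType(c3, c4), SameType(c5, c6) };
    }

    public static void Main()
    {
        bool[] result = Compare();
        Console.WriteLine("c3/c4 same type: " + result[0]);
        Console.WriteLine("c5/c6 same type: " + result[1]);
    }
}
```

The first comparison reports True and the second False, because c5 and c6 list the same properties in a different order.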
Important: Usually in C# the order of members inside a type is not
important; even standard object initializers are based on member names and not
on their order. The need for LINQ to get a different type for two classes that
differ only in the order of their members derives from the need to represent an
ordered set of fields, as in a SELECT statement.
The syntax to initialize a typed array has been enhanced in C# 3.0.
Now you can declare an array initializer and infer the type from the
initializer content. This mechanism can be combined with anonymous types and
object initializers, as in the code shown in Listing 2-34.
Listing 2-34: Implicitly typed arrays
var ints = new[] { 1, 2, 3, 4 };
var ca1 = new[] {
    new Customer { Name = "Marco", Country = "Italy" },
    new Customer { Name = "Tom", Country = "USA" },
    new Customer { Name = "Paolo", Country = "Italy" }
};
var ca2 = new[] {
    new { Name = "Marco", Sports = new[] { "Tennis", "Spinning" } },
    new { Name = "Tom", Sports = new[] { "Rugby", "Squash", "Baseball" } },
    new { Name = "Paolo", Sports = new[] { "Skateboard", "Windsurf" } }
};
Note: The syntax of C# 1.x requires the
assigned variable to have an explicitly declared type. The syntax of C# 3.0 allows the use of
the var keyword to define the variable initialized in
such a way.
While ints is an array of
int and ca1 is an array of Customer instances,
ca2 is an array of anonymous type instances, each containing a
string (Name) and an array of strings (Sports).
You do not see a type in the ca2 definition because all
types are inferred from the initialization expression. Once again, note that
the ca2 assignment is a single expression, which could
be embedded in another one.
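Although no type appears in the source, the inferred types remain fully usable. As a minimal sketch, the code below reads the Name and Sports members of each ca2 element in a strongly typed way and confirms that ints is an ordinary int array. The class and method names are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ImplicitArrayDemo
{
    public static List<string> Summaries()
    {
        var ca2 = new[] {
            new { Name = "Marco", Sports = new[] { "Tennis", "Spinning" } },
            new { Name = "Tom", Sports = new[] { "Rugby", "Squash", "Baseball" } },
            new { Name = "Paolo", Sports = new[] { "Skateboard", "Windsurf" } }
        };

        List<string> lines = new List<string>();
        // var is required here: the element type has no name you can write.
        foreach (var c in ca2)
        {
            // c.Name is a string and c.Sports is a string[], both inferred.
            lines.Add(c.Name + ": " + string.Join(", ", c.Sports));
        }
        return lines;
    }

    public static bool IntsIsIntArray()
    {
        var ints = new[] { 1, 2, 3, 4 };
        return ints.GetType() == typeof(int[]);
    }

    public static void Main()
    {
        foreach (string line in Summaries())
        {
            Console.WriteLine(line);
        }
        Console.WriteLine(IntsIsIntArray());
    }
}
```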
Query Expressions
C# 3.0 also introduces query expressions,
which have a syntax similar to SQL and are used to manipulate
data. This syntax is converted into regular C# 3.0 syntax that makes use of
specific classes, methods, and interfaces that are part of the LINQ libraries.
We will not cover all the keywords in detail because that is beyond the scope of
this article.
In this section, we want to introduce the transformation that the
compiler applies to a query expression, just to describe how the code is
interpreted.
Here is a typical LINQ query:
A query expression begins with a from clause
(in C#, all query expression keywords are case sensitive) and ends with either
a select or group clause. The
from clause specifies the object on which LINQ operations are applied,
which must be an instance of a class that implements the IEnumerable<T>
interface.
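The two permitted shapes, a query ending with select and a query ending with group, can be sketched as follows. The array of names is illustrative data assumed for this example, not the article's own; an array qualifies as a source because string[] implements IEnumerable<string>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryExpressionSketch
{
    // Illustrative data, assumed for this sketch.
    private static readonly string[] Names = { "Paolo", "Marco", "Tom", "Frank" };

    // Begins with from, ends with select.
    public static List<string> LongNamesUpper()
    {
        IEnumerable<string> query =
            from n in Names
            where n.Length > 3
            orderby n
            select n.ToUpperInvariant();
        return query.ToList();
    }

    // Begins with from, ends with group.
    public static Dictionary<int, int> CountByLength()
    {
        var query =
            from n in Names
            group n by n.Length;

        Dictionary<int, int> counts = new Dictionary<int, int>();
        foreach (var g in query)
        {
            counts[g.Key] = g.Count();
        }
        return counts;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", LongNamesUpper().ToArray()));
        foreach (KeyValuePair<int, int> pair in CountByLength())
        {
            Console.WriteLine(pair.Key + " letters: " + pair.Value);
        }
    }
}
```

With these assumed names, the first query yields FRANK, MARCO, and PAOLO, and the second groups three names under length 5 and one under length 3.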
That code produces the following results:
C# 3.0 interprets the query assignment as
if it were written in this way:
Each query expression clause corresponds to a generic method, which
is resolved through the same rules that apply to an extension method.
Therefore, the query expression syntax is similar to a macro expansion, even if
it is a more intelligent one because it infers many definitions, like the names
of parameters in lambda expressions.
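As a minimal sketch of this translation, the code below writes one query three ways: as a query expression, as a chain of extension method calls, and as nested calls to the static methods of System.Linq.Enumerable. The data and member names are assumptions for the example. The chain corresponds to what the compiler produces from the query expression, and the nested form shows what the chain would look like without extension method syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryTranslationSketch
{
    // Illustrative data, assumed for this sketch.
    private static readonly string[] Names = { "Paolo", "Marco", "Tom", "Frank" };

    public static IEnumerable<string> AsQueryExpression()
    {
        return from n in Names
               where n.Length > 3
               orderby n
               select n.ToUpperInvariant();
    }

    // Each clause maps to one method; the range variable n
    // becomes the parameter of each lambda expression.
    public static IEnumerable<string> AsExtensionMethodChain()
    {
        return Names
            .Where(n => n.Length > 3)
            .OrderBy(n => n)
            .Select(n => n.ToUpperInvariant());
    }

    // The same calls without extension method syntax: each call
    // takes the result of the previous one as its first argument.
    public static IEnumerable<string> AsNestedStaticCalls()
    {
        return Enumerable.Select(
            Enumerable.OrderBy(
                Enumerable.Where(Names, n => n.Length > 3),
                n => n),
            n => n.ToUpperInvariant());
    }

    public static bool AllFormsAgree()
    {
        return AsQueryExpression().SequenceEqual(AsExtensionMethodChain())
            && AsExtensionMethodChain().SequenceEqual(AsNestedStaticCalls());
    }

    public static void Main()
    {
        Console.WriteLine(AllFormsAgree());
    }
}
```

Reading the nested form from the inside out makes it easier to see why the chained syntax is preferred for long queries.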
At this point, it should be clear why the features of C# 3.0
that allow you to write complex actions into a single expression are so
important to LINQ. A query expression calls many methods in a chain, where each
call uses the result of the previous call as a parameter. Extension methods
simplify the syntax, avoiding nested calls. Lambda expressions define the logic
for some operations (such as where, orderby,
and so on). Anonymous types and object initializers define how to store the
results of a query. Local type inference is the glue that holds these pieces
together.