C# 3.0 Features



C# 3.0 moves C# in the direction of a functional language, supporting a more declarative style of coding. LINQ makes extensive use of all the new features, which also let you use a higher level of abstraction in your code in areas other than LINQ.

Local Type Inference

Type inference is a wonderful feature for any language. It preserves type safety while allowing you to write more “relaxed” code. In other words, you can define variables and use them without worrying too much about their types, leaving it to the compiler to determine the correct type of a variable by inferring it from the expression assigned to the variable itself.

The price of type inference is code that is less explicit about the types it uses, but in our opinion, this feature simplifies the maintenance of local variables where an explicit type declaration is not particularly meaningful.

C# 3.0 offers type inference that allows you to define a variable by using the var keyword instead of a specific type. This might seem to be equivalent to defining a variable of type object, but it is not. The following code shows you that an object type requires the boxing of a value type (see b declaration), and in any case it requires a cast operation when you want to operate with the specific type (see d assignment):

var a = 2;       // a is declared as int
object b = 2;    // Boxing an int into an object
int c = a;       // No cast, no unboxing
int d = (int) b; // Cast is required, an unboxing is done

When var is used, the compiler infers the type from the expression used to initialize the variable. The compiled IL code contains only the inferred type. In other words, consider this code:

int a = 5;
var b = a;

It is perfectly equivalent to this example:

int a = 5;
int b = a;

Why is this important? The var keyword calls to mind the Component Object Model (COM) type VARIANT, which was used pervasively in Visual Basic 6.0, but in reality it is completely different because it is a type-safe declaration. The compiler infers the type exactly as if you had written it explicitly.
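The difference between var and object can be made visible with a small sketch. The StaticTypeOf helper and the class name are hypothetical, introduced here only for illustration; the helper simply reports the compile-time type that the compiler selected for its generic parameter.

```csharp
using System;

static class VarInferenceSketch {
    // Hypothetical helper: reports the static (compile-time) type chosen for T.
    public static Type StaticTypeOf<T>( T value ) {
        return typeof( T );
    }

    static void Main() {
        var a = 5;      // inferred as int
        object b = 5;   // declared as object; the int is boxed

        Console.WriteLine( StaticTypeOf( a ) ); // System.Int32
        Console.WriteLine( StaticTypeOf( b ) ); // System.Object

        // a = "text";  // would not compile: a is an int, not a variant
    }
}
```

The commented-out assignment is the key point: unlike a VARIANT, a variable declared with var cannot change type after its declaration.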

To some, var might seem to be a tool for the lazy programmer. Nevertheless, var is the only way to define an anonymous type variable, as we will describe later.

Note 

Variants were a way in COM to implement late binding with the type of a variable. There was no compile check using variants, and this caused a lot of nasty bugs that were revealed only when code was executed (most of the time, only when it was executed by end users).

The var keyword can be used only within a local scope. In other words, a local variable can be defined in this way, but not a member or a parameter. The following code shows some examples of valid uses of var: x, y, and r are double types; d and w are decimal; s and p are string; and l is an int. Please note that the constant 2.3 defines the type inferred by three variables, and the default keyword is a “typed” null that infers the correct type to p.

public void ValidUse( decimal d ) {
    var x = 2.3;             // double
    var y = x;               // double
    var r = x / y;           // double
    var s = "sample";        // string
    var l = s.Length;        // int
    var w = d;               // decimal
    var p = default(string); // string
}

The next sample shows some cases in which the var keyword is not allowed:

class VarDemo {
    // invalid token 'var' in class, struct or interface member declaration
    var k = 0;

    // type expected in parameter list
    public void InvalidUseParameter( var x ){}

    // type expected in result type declaration
    public var InvalidUseResult() {
        return 2;
    }
    public void InvalidUseLocal() {

        var x;          // Syntax error, '=' expected
        var y = null;   // Cannot infer local variable type from 'null'
    }
   // 
}

The k type can be inferred by the constant initializer, but var is not allowed on type members. The result type of InvalidUseResult could be inferred by the internal return statement, but even this syntax is not allowed.

This simple language feature allows us to write code that eliminates almost all local variable type declarations. Although this simplifies code writing, it can make reading code more difficult. For example, if you call an overloaded method whose versions differ in parameter types, it could be unclear from reading the code which version is being called. Similar problems, however, arise from the poor use of method overloading: you should use different method names when the behavior (and the meaning) of the methods is different.
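The following sketch illustrates the readability concern. The Describe overloads are hypothetical; the point is only that the overload chosen depends on an inferred type that is not spelled out at the call site.

```csharp
using System;

static class OverloadSketch {
    // Two hypothetical overloads that differ only in parameter type.
    public static string Describe( int value )    { return "int overload"; }
    public static string Describe( double value ) { return "double overload"; }

    static void Main() {
        var x = 2;    // inferred as int
        var y = 2.0;  // inferred as double

        Console.WriteLine( Describe( x ) ); // int overload
        Console.WriteLine( Describe( y ) ); // double overload
    }
}
```

A reader has to look back at each initializer to know which Describe is called, which is why distinct method names are preferable when the overloads behave differently.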

Lambda Expressions

C# 2.0 introduced the capability to “pass a pointer to some code” as a parameter by using anonymous methods. This concept is a powerful one, but what you really pass in this way is a reference to a method, not exactly a piece of code. That reference points to strongly typed code that is generated at compile time. Using generics, you can obtain more flexibility, but it is hard to apply standard operators to a generic type.

C# 3.0 introduces lambda expressions, which allow the definition of anonymous methods using more concise syntax. Lambda expressions can also optionally postpone code generation by creating an expression tree that allows further manipulation before code is actually generated, which happens at execution time. An expression tree can be generated only for the particular “pieces of code” that are expressions.

The following code shows a simple use of an anonymous method:

public class AggDelegate {
    public List<int> Values;
    delegate T Func<T>( T a, T b );

    static T Aggregate<T>( List<T> l, Func<T> f ) {
        T result = default(T);
        bool firstLoop = true;
        foreach( T value in l ) {
            if (firstLoop) {
                result = value;
                firstLoop = false;
            }
            else {
                result = f( result, value );
            }
        }
        return result;
    }

    public static void Demo() {
        AggDelegate l = new AggDelegate();
        int sum;
        sum = Aggregate(
                l.Values,

                delegate( int a, int b ) { return a + b; }
            );
        Console.WriteLine( "Sum = {0}", sum );
    }

    // 
}

In the following examples, we use similar versions of the Aggregate method, so we will not reproduce it each time. The anonymous method passed as a parameter to Aggregate defines the aggregate operation that is executed for each element of the List object that is used.

Using lambda expression syntax, we can write the Aggregate call as shown in Listing 2-14.

Listing 2-14: Explicitly typed parameter list

sum = Aggregate(
        l.Values,
        ( int a, int b ) => { return a + b; }
    );

You can read this formula as “given a and b, both integers, return a+b that is the sum of a and b.”

We removed the delegate keyword before the parameter list and added the => token between the parameter list and the method code. At this stage, the difference is only syntactical because the compiled code is identical to the result of the anonymous method syntax. However, lambda expression syntax allows you to write the same code as shown in Listing 2-15.

Listing 2-15: Implicitly typed parameter list

sum = Aggregate(
        l.Values,
        ( a, b ) => { return a + b; }

    );
Note 

The pronunciation of the => token has no official definition. A few developers use “such that” when the lambda expression is a predicate and “becomes” when it is a projection. Other developers say generically “goes to.”

You can read this formula as “given a and b, return a+b, whatever ‘+’ means for the type of a and b.” (The “+” operator must exist for the concrete type of a and b, which is inferred from the context; otherwise, the code will not compile.)

Although we removed parameter types from the parameter list, the compiler will infer parameter types from the Aggregate call. We are calling a generic method, but the generic type T is defined from the l.Values parameter, which is a List<int> type. In this call, T is an int; therefore, the Func<T> delegate is a Func<int>, and both a and b are of type int.

You can think of this syntax as more similar to a var declaration than to another form of generic use. The type resolution is made at compile time. If a parameter type is generic, you cannot access operators and members other than those allowed by type constraints. If it is a regular type, you have full access to operators (such as the “+” operator we are using) and to any members defined on that type.
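As a minimal sketch of the restriction on generic parameters, the hypothetical Max method below cannot use the > operator on T, so it relies on a type constraint to make CompareTo available. It assumes a nonempty list.

```csharp
using System;
using System.Collections.Generic;

static class ConstraintSketch {
    // "value > result" would not compile for a generic T;
    // the IComparable<T> constraint allows CompareTo instead.
    // Assumption: the list contains at least one element.
    public static T Max<T>( List<T> values ) where T : IComparable<T> {
        T result = values[0];
        foreach( T value in values ) {
            if (value.CompareTo( result ) > 0) result = value;
        }
        return result;
    }

    static void Main() {
        List<int> numbers = new List<int> { 3, 42, 7 };
        Console.WriteLine( Max( numbers ) ); // 42
    }
}
```

By contrast, a lambda such as ( a, b ) => a + b passed for a Func<int> is compiled against int, so the operator is resolved directly.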

A lambda expression can define a body in two ways. We have seen the statement body, which requires braces like any other block of code and a return statement before the expression that has to be returned. The other form is the expression body, which can be used when the code inside the block is only a return followed by an expression. You can simply omit the braces and the return statement, as shown in Listing 2-16.

Listing 2-16: Expression body

sum = Aggregate(
        l.Values,
        ( a, b ) => a + b

    );

When we worked with lambda expressions for the first time, we felt some confusion until we realized that they are only a more powerful syntax with which to write an anonymous method. This is an important concept to remember, because you can always access identifiers that are not defined in the parameter list. In other words, remember that the parameter list defines the parameters of the anonymous method. Any other identifier inside the body (either a statement or an expression) of a lambda expression has to be resolved within the anonymous method definition. The following code shows an example of this. (The AggregateSingle<T> method uses a slightly different delegate for the second parameter, declared as delegate T FuncSingle<T>( T a ).)

int sum = 0;
sum = AggregateSingle(
          l.Values,
          ( x ) => sum += x
       );

This lambda expression has only the x parameter; sum is a local variable of the containing method, and its lifetime is extended over the lifetime of the delegate instance that points to the anonymous method defined by the lambda expression itself. Remember that the result of the corresponding return sum += x statement will be the value of sum after x has been added to it.
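The extended lifetime of a captured local variable can be seen in isolation with a short sketch. The MakeAccumulator method is hypothetical, and the code assumes the Func<T, TResult> delegate that ships in the System namespace of .NET Framework 3.5 and later, rather than the FuncSingle<T> delegate mentioned above.

```csharp
using System;

static class ClosureSketch {
    // Returns a delegate that keeps using the local variable "total"
    // after MakeAccumulator itself has returned.
    public static Func<int, int> MakeAccumulator() {
        int total = 0;
        return x => total += x;
    }

    static void Main() {
        Func<int, int> add = MakeAccumulator();
        Console.WriteLine( add( 10 ) ); // 10
        Console.WriteLine( add( 32 ) ); // 42
    }
}
```

Each call to MakeAccumulator produces a delegate with its own total, so two accumulators do not interfere with each other.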

When a lambda expression has only one parameter, the parentheses can be omitted from the parameter list, as in this example:

int sum = 0;
sum = AggregateSingle(
          l.Values,
          x => sum += x
       );

If there are no parameters for a lambda expression, two parentheses are required before the => token. The code in Listing 2-17 shows some of the possible syntaxes.

Listing 2-17: Lambda expression examples

( int a, int b ) => { return a + b; } // Explicitly typed, statement body
( int a, int b ) => a + b             // Explicitly typed, expression body
( a, b ) => { return a + b; }         // Implicitly typed, statement body
( a, b ) => a + b                     // Implicitly typed, expression body
( x ) => sum += x                     // Single parameter with parentheses
x => sum += x                         // Single parameter no parentheses
() => sum + 1                         // No parameters


A practical use of lambda expressions is in writing small pieces of code inside the parameter list of a method call. The following code shows an example of a predicate passed as a parameter to a generic Display method that iterates an array of elements and displays only those that make the predicate true. The predicate and its use are highlighted in the code. The Func delegate shown in Listing 2-18 is explained in the following pages.

Listing 2-18: Lambda expression as a predicate

public static void Demo() {
    string[] names = { "Marco", "Paolo", "Tom" };
    Display( names, s => s.Length > 4 );
}

public static void Display<T>( T[] names, Func<T, bool> filter ){
    foreach( T s in names) {
        if (filter( s )) Console.WriteLine( s );
    }
}


The execution results in a list of names having more than four characters. The conciseness of this syntax is one reason for using lambda expressions in LINQ; the other reason is the potential to create an expression tree.

To this point, we have considered the statement body and the expression body only as two syntaxes that produce the same code, but there is something more. A lambda expression can also be assigned to a variable of these delegate types:

public delegate T Func< T >();
public delegate T Func< A0, T >( A0 arg0 );
public delegate T Func<A0, A1, T> ( A0 arg0, A1 arg1 );
public delegate T Func<A0, A1, A2, T >( A0 arg0, A1 arg1, A2 arg2 );
public delegate T Func<A0, A1, A2, A3, T> ( A0 arg0, A1 arg1, A2 arg2, A3 arg3 );

There are no requirements for defining these delegates in a particular way. LINQ defines such delegates within the System.Linq namespace, but lambda expression functionality does not depend on these declarations. You can make your own, even with a name other than Func, except in one case: if you convert a lambda expression to an expression tree, the compiler emits a binary representation of the lambda expression that can be manipulated and converted into executable code at execution time. An expression tree is an instance of a System.Linq.Expressions.Expression<T> class, where T is the delegate that the expression tree represents.

In many ways, the use of lambda expressions to create an expression tree makes lambda expressions similar to generic methods. The difference is that generic methods are already described as IL code at compile time (only the type parameters used are not completely specified), while an expression tree becomes IL code only at execution time. Only lambda expressions with an expression body can be converted into an expression tree, and this conversion is not possible if the lambda expression contains a statement body.

Listing 2-19 shows how the same lambda expression can be converted into either a delegate or an expression tree. The highlighted lines show the assignment of the expression tree and its use.

Listing 2-19: Use of an expression tree


class ExpressionTree {
    delegate T Func<T>( T a, T b );
    public static void Demo() {
        Func<int> x = (a, b) => a + b;
        Expression<Func<int>> y = (a, b) => a + b;

        Console.WriteLine( "Delegate" );
        Console.WriteLine( x.ToString() );
        Console.WriteLine( x( 29, 13 ) );
        Console.WriteLine( "Expression tree" );
        Console.WriteLine( y.ToString() );
        Console.WriteLine( y.Compile()( 29, 13 ) );
    }
}

Here is the output of Demo execution. The result of the invocation is the same (42), but the output of the ToString() invocation is different.

Delegate
ExpressionTree+Func`1[System.Int32]
42
Expression tree
(a, b) => (a + b)
42

The expression tree maintains a representation of the expression in memory. You cannot invoke an expression tree directly, as we did with the x delegate. When you want to evaluate the expression, you need to compile it first. The Compile method returns a delegate that can be invoked through its Invoke method (or through the compact delegate invocation syntax we used in the preceding example). We do not have space here for a deeper investigation of deferred query evaluation, but it is an important foundation for many parts of LINQ. For example, LINQ to SQL has methods that navigate an expression tree and convert it into an SQL statement. That conversion is made at execution time and not at compile time.
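To give an idea of what "navigating" an expression tree means, here is a minimal sketch that inspects the nodes of a simple sum before compiling it. The class and method names are hypothetical, and the code assumes the Func<T1, T2, TResult> delegate from the System namespace instead of the custom delegate used in Listing 2-19.

```csharp
using System;
using System.Linq.Expressions;

static class ExpressionInspectionSketch {
    // The compiler converts the lambda into a tree because of the return type.
    public static Expression<Func<int, int, int>> BuildSum() {
        return (a, b) => a + b;
    }

    static void Main() {
        Expression<Func<int, int, int>> sum = BuildSum();

        // The body is a binary node whose operands are the two parameters.
        BinaryExpression body = (BinaryExpression) sum.Body;
        Console.WriteLine( body.NodeType );        // Add
        Console.WriteLine( body.Left );            // a
        Console.WriteLine( body.Right );           // b
        Console.WriteLine( sum.Parameters.Count ); // 2

        // Compile turns the tree into a delegate at execution time.
        Func<int, int, int> compiled = sum.Compile();
        Console.WriteLine( compiled( 29, 13 ) );   // 42
    }
}
```

A query provider works along the same lines on a larger scale: it walks the nodes and emits something other than IL, such as SQL text.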

Extension Methods

C# is an object-oriented programming language that allows the extension of a class through inheritance. Nevertheless, designing a class that can be inherited in a safe way and maintaining that class in the future is hard work. A safe way to write such code is to declare all classes as sealed, unless they are designed as inheritable. In that case, safety is set against agility.

More Info 

Microsoft .NET allows class A in assembly X.DLL to be inherited by class B in assembly Y.DLL. This implies that a new version of X.DLL should be designed to be compatible even with older versions of Y.DLL. C# and .NET have many tools to help in this effort. However, we can say that a class has to be designed as inheritable if you want to allow its derivation; otherwise, you run the risk that making a few changes in the base classes will break existing code in derived classes. If you do not design a class to be inheritable, it is better to make the class sealed, or at least private or internal.

C# 3.0 introduces a syntax that conceptually extends an existing type (either reference or value) by adding new methods without deriving it into a new type. Some might consider the results of this change to be only syntactic sugar, but this capability makes LINQ code more readable and easier to write. The methods that extend a type can use only the public members of the type itself, just as you can do from any piece of code outside the target type.

The following code shows a traditional approach to writing two methods (FormattedUS and FormattedIT) that convert a decimal value into a string formatted with a specific culture:


static class Traditional {
    public static void Demo() {
        decimal x = 1234.568M;
        Console.WriteLine( FormattedUS( x ) );
        Console.WriteLine( FormattedIT( x ) );
    }

    public static string FormattedUS( decimal d ) {
        return String.Format( formatUS, "{0:#,0.00}", d );
    }

    public static string FormattedIT( decimal d ) {
        return String.Format( formatIT, "{0:#,0.00}", d );
    }

    static CultureInfo formatUS = new CultureInfo( "en-US" );
    static CultureInfo formatIT = new CultureInfo( "it-IT" );
}

There is no link between these methods and the decimal type other than the methods’ parameters. We can change this code to extend the decimal type. It is a value type and not inheritable, but we can add the this keyword before the first parameter type of our methods, and in this way use the method as if it were defined inside the decimal type. Changes are highlighted in the code shown in Listing 2-20.

Listing 2-20: Extension methods declaration

static class ExtensionMethods {
    public static void Demo() {
        decimal x = 1234.568M;
        Console.WriteLine( x.FormattedUS() );
        Console.WriteLine( x.FormattedIT() );
        Console.WriteLine( FormattedUS( x ) ); // Traditional call allowed
        Console.WriteLine( FormattedIT( x ) ); // Traditional call allowed
    }

    static CultureInfo formatUS = new CultureInfo( "en-US" );
    static CultureInfo formatIT = new CultureInfo( "it-IT" );
    public static string FormattedUS( this decimal d ){
        return String.Format( formatUS, "{0:#,0.00}", d );
    }

    public static string FormattedIT( this decimal d ){
        return String.Format( formatIT, "{0:#,0.00}", d );
    }
}


An extension method must be static, must be declared inside a static class, and must have the keyword this before the first parameter type, which is the type that the method extends. Extension methods are usually public because they can be (and normally are) called from outside the class where they are declared.

Although this is not a big revolution, one advantage could be Microsoft IntelliSense support, which could show all extension methods accessible to a given identifier. Moreover, the result type of an extension method might be the extended type itself. In this case, we can extend a type with many methods, all working on the same data, and chain their calls. LINQ very frequently uses extension methods in this way.

We can write a set of extension methods to decimal as shown in Listing 2-21.

Listing 2-21: Extension methods for native value types

static class ExtensionMethods {
    public static decimal Double( this decimal d ) {
        return d + d;
    }
    public static decimal Triple( this decimal d ) {
        return d * 3;
    }
    public static decimal Increase( this decimal d ) {
        return ++d;
    }
    public static decimal Decrease( this decimal d ) {
        return --d;
    }
    public static decimal Half( this decimal d ) {
        return d / 2;
    }
    // 
}

In Listing 2-22, we can compare the two calling syntaxes, the classical one (x) and the new one (y).

Listing 2-22: Extension methods call order

decimal x = 14M, y = 14M;
x = Half( Triple( Decrease( Decrease( Double( Increase( x ) ) ) ) ) );
y = y.Increase().Double().Decrease().Decrease().Triple().Half();

The result for both x and y is 42. The classical syntax requires several nested calls that have to be read from the innermost to the outermost. The new syntax acts as though our new methods are members of the decimal type. The call order follows the read order (left to right) and is much easier to understand.

Note 

It is important to recognize that extension methods come at a price. When you call an instance method of a type, you can expect that the instance state can be modified by your call. But keep in mind that an extension method can do that only by calling public members of the extended type, as we already said. When the extension method returns the same type as it extends, you can assume that the instance state of the type should not be changed. This might be a recommendation for extending value types, but we cannot assume the same for any reference type because the related cost (creating a copy of an object for each call) could be too high.
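For value types, the point about instance state can be sketched as follows. The Counter struct and both extension methods are hypothetical. In C# 3.0 the parameter marked with this is passed by value, so an extension method on a struct works on a copy of the caller's instance.

```csharp
using System;

struct Counter {
    public int Value;
}

static class CounterExtensions {
    // "c" is a copy of the caller's struct, so this change is lost on return.
    public static void IncrementInPlace( this Counter c ) {
        c.Value++;
    }

    // Returning the modified copy makes the result explicit
    // and leaves the original instance untouched.
    public static Counter Incremented( this Counter c ) {
        c.Value++;
        return c;
    }
}

static class CounterSketch {
    static void Main() {
        Counter counter = new Counter();
        counter.IncrementInPlace();
        Console.WriteLine( counter.Value );               // 0
        Console.WriteLine( counter.Incremented().Value ); // 1
        Console.WriteLine( counter.Value );               // 0
    }
}
```

With a reference type, the same IncrementInPlace pattern would modify the caller's object through its public members, which is exactly the case the note warns about.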

An extension method is not automatically considered. Its resolution follows some rules. Here is the order of evaluation used to resolve a method for an identifier:

  1. Instance method: If an instance method exists, it has priority.

  2. Extension method: The search for an extension method is made through all static classes in the “current namespace” and in all namespaces included in active using directives. (Current namespace refers to the closest enclosing namespace declaration. This is the namespace that contains the static class with the extension method declaration.) If two types contain the same extension method, the compiler raises an error.

The most common use of extension methods is to define them in static classes in specific namespaces, importing them into the calling code by specifying one or more using directives in the module.
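A minimal sketch of this arrangement follows. The namespace names, the StringExtensions class, and the Shout method are all hypothetical; the example only shows that the using directive is what brings the extension method into scope.

```csharp
using System;

namespace Sketch.Extensions {
    static class StringExtensions {
        // Hypothetical extension method for string.
        public static string Shout( this string s ) {
            return s.ToUpperInvariant() + "!";
        }
    }
}

namespace Sketch.App {
    // Without this directive, Shout is not found on string.
    using Sketch.Extensions;

    static class Program {
        static void Main() {
            Console.WriteLine( "marco".Shout() ); // MARCO!
        }
    }
}
```

Keeping extension methods in a dedicated namespace lets calling code opt in to them one using directive at a time.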

These precedence rules used to resolve a method call define a feature that is not apparent at first sight. When you call an extension method on a class, it can always be replaced by a specific version of the method defined as a member method for a particular type. In other words, the extension method represents a “default” implementation for a method, which can always be overridden by a specialized version for specific classes.

We can see this behavior in a few examples. The first code example contains an extension method for the object type; in this way, you can call Display on an instance of any type. We call it on our own Customer class instance:

public class Customer {
    protected int Id;
    public string Name;

    public Customer( int id ) {
        this.Id = id;
    }
}

static class Visualizer {
    public static void Display( this object o ) {
        string s = o.ToString();
        Console.WriteLine( s );
    }
}

static class Program {
    static void Main() {
        Customer c = new Customer( 1 );
        c.Name = "Marco";
        c.Display();

    }
}

The result of executing this code is the class name Customer.

We can customize the behavior of the Display method for the Customer class, defining an overloaded extension method, as shown in Listing 2-23. (We could define an overloaded extension method in another namespace if this namespace had a higher priority in the resolution order.)

Listing 2-23: Extension methods overload

static class Visualizer {
    public static void Display( this object o ) {
        string s = o.ToString();
        Console.WriteLine( s );
    }
    public static void Display( this Customer c ) {
        string s = String.Format( "Name={0}", c.Name );

        Console.WriteLine( s );
    }
}

This time the more specialized version is executed, as we can see from the execution output, shown here:

Name=Marco

Without removing these extension methods, we can add other special behavior to Display by implementing it as an instance method in the Customer class. This implementation, shown in Listing 2-24, will have precedence over any other extension method for a type equal to or derived from Customer.

Listing 2-24: Instance method over extension methods

public class Customer {
    protected int Id;
    public string Name;

    public Customer( int id ) {
        this.Id = id;
    }
     public void Display() {
        string s = String.Format( "{0}-{1}", Id, Name );

        Console.WriteLine( s );
    }
}

The execution output, shown here, illustrates that the instance method is now called:

1-Marco

At first glance, this behavior seems to overlap functionality provided by virtual methods. It does not, however, because an extension method is resolved at compile time, while virtual methods are resolved at execution time. This means that if you call an extension method on a variable declared as a base class, the run-time type of the object it contains is not relevant. If a compatible extension method exists for the declared type, it is used even when the derived class of the actual instance defines an instance method with the same name. The code in Listing 2-25 illustrates this concept.

Listing 2-25: Extension methods resolution

public class A {
    public virtual void X() {}
}
public class B : A {
    public override void X() {}
    public void Y() {}
}

static public class E {
    static void X( this A a ) {}
    static void Y( this A b ) {}

    public static void Demo() {
        A a = new A();
        B b = new B();
        A c = new B();

        a.X(); // Call A.X
        b.X(); // Call B.X
        c.X(); // Call B.X

        a.Y(); // Call E.Y
        b.Y(); // Call B.Y
        c.Y(); // Call E.Y
    }
}


The X method is always resolved by the instance method. It is a virtual method, and for this reason c.X() calls the B.X overridden implementation. The extension method E.X is never called on these objects.

The Y instance method is defined only in the B class, while E.Y is an extension method for the A class; therefore, only b.Y() calls the B.Y implementation. Note that c.Y() calls E.Y because the c identifier is declared as an A type and Y is not defined in class A, even though c contains an instance of type B.

A final point to consider regarding a generic extension method is that when you use a generic type as the parameter that you mark with the this keyword, you are extending not only a class but a whole set of classes. We found that this operation is not very intuitive when you are designing a components library, but it is a comfortable approach when you are writing the code that uses them. The following code is a slightly modified version of a previous example of lambda expressions. We added the this keyword to the names parameter and changed the invocation of the Display method. Important changes are highlighted in the code shown in Listing 2-26.

Listing 2-26: Lambda expression as predicate

public static void Display<T>( this T[] names, Func<T, bool> filter ) {}

public static void Demo() {
    string[] names = { "Marco", "Paolo", "Tom" };

    names.Display( s => s.Length > 4 );
    // It was: Display( names, s => s.Length > 4 );
}

The Display method can be used with a different class (for example, an array of type int), and it will always require a predicate with a parameter that is the same type as the array. The following code uses the same Display method, showing only the even values:

int[] ints = { 19, 16, 4, 33 };
ints.Display( i => i % 2 == 0 );

As you learn more about extension methods, you can start to see a language that is more flexible but still strongly typed.

Object Initialization Expressions

C# 1.x allows the initialization of a field or a local variable in a single statement. The syntax shown here can initialize a single identifier:

int i = 3;
string name = "Unknown";
Customer c = new Customer( "Tom", 32 );

When an initialization statement of this kind is applied to a reference type, it requires a call to a class constructor that has parameters that specify how to initialize the inner state of the instance created. You can use an object initializer on both reference and value types.

When you want to initialize an object (either a reference or value type), you need a constructor with enough parameters to specify the initial state of the object you want to initialize. Consider this code:

public class Customer {
    public int Age;
    public string Name;
    public string Country;
    public Customer( string name, int age ) {
        this.Name = name;
        this.Age = age;
    }
    // …
}

The customer instance is initialized through the Customer constructor, but we set only the Name and Age fields. If we want to set Country but not Age, we need to write code such as that shown in Listing 2-27.

Listing 2-27: Standard syntax for object initialization

Customer customer = new Customer();
customer.Name = "Marco";
customer.Country = "Italy";

C# 3.0 introduces a shorter form of object initialization syntax that generates functionally equivalent code, shown in Listing 2-28.

Listing 2-28: Object initializer

// Implicitly calls default constructor before object initialization
Customer customer = new Customer { Name = "Marco", Country = "Italy" };
Note 

The syntaxes used to initialize an object (standard and object initializers) are equivalent after code is compiled. Object initializer syntax produces a call to a constructor for the specified type (either a reference or value type): this is the default constructor whenever you do not place parentheses between the type name and the opening brace. If that constructor assigns member fields that the initializer subsequently sets, the constructor's assignments still execute, although their values are overwritten. An object initializer does not have an additional cost if the called constructor of the initialized type is empty.
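The order of these operations can be observed with a small sketch. The Visitor class and its placeholder values are hypothetical and are not part of the Customer example.

```csharp
using System;

class Visitor {
    public string Name;
    public string Country;

    public Visitor() {
        // Runs first; an object initializer may then overwrite these values.
        Name = "(unnamed)";
        Country = "(unknown)";
    }
}

static class InitializerOrderSketch {
    public static Visitor Create() {
        // Default constructor is called, then Country is assigned.
        return new Visitor { Country = "Italy" };
    }

    static void Main() {
        Visitor v = Create();
        Console.WriteLine( v.Name );    // (unnamed)
        Console.WriteLine( v.Country ); // Italy
    }
}
```

The constructor's assignment to Country is the wasted work that the note describes: it executes, but its result is replaced immediately.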

The names assigned in an initialization list can correspond to either fields or properties that are public members of the initialized object. The syntax also allows for specifying a call to a nondefault constructor, which might be necessary if the default constructor is not available for a type. Listing 2-29 shows an example.

Listing 2-29: Explicit constructor call in object initializer

// Explicitly specify constructor to call before object initialization
Customer c1 = new Customer() { Name = "Marco", Country = "Italy" };

// Explicitly specify nondefault constructor
Customer c2 = new Customer( "Paolo", 21 ) { Country = "Italy" };

The c2 assignment above is equivalent to this one:

Customer c2 = new Customer( "Paolo", 21 );
c2.Country = "Italy";
Note 

The real implementation of an object initializer creates and initializes the object into a temporary variable and, only at the end, copies the reference to the destination variable. In this way, the object is not visible to another thread until it is completely initialized.
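One observable consequence of the temporary variable can be sketched without any threading. The Label class is hypothetical. Inside the second initializer, the destination variable still refers to the previous object, because it is reassigned only after the new object has been initialized.

```csharp
using System;

class Label {
    public string Text;
}

static class TemporarySketch {
    public static string Demo() {
        Label label = new Label { Text = "A" };

        // "label.Text" on the right side still reads the OLD object:
        // label is reassigned only after the new object is initialized.
        label = new Label { Text = label.Text + "B" };
        return label.Text;
    }

    static void Main() {
        Console.WriteLine( Demo() ); // AB
    }
}
```

If the reference were stored in label before the member assignment, the expression would read the Text of the new, still empty object and the result would be "B" instead.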

One of the advantages of the object initializer is that it allows for writing a complete initialization in a functional form: you can put it inside an expression without using different statements. Therefore, the syntax can also be nested, repeating the syntax for the initial value of a member into an initialized object. The classic Point and Rectangle class example shown in Listing 2-30 (part of the C# 3.0 specification document) illustrates this.

Listing 2-30: Nested object initializers

public class Point {
    int x, y;
    public int X { get { return x; } set { x = value; } }
    public int Y { get { return y; } set { y = value; } }
}

public class Rectangle {
    Point tl, br;
    public Point TL { get { return tl; } set { tl = value; } }
    public Point BR { get { return br; } set { br = value; } }
}

// Possible code inside a method
Rectangle r = new Rectangle {
    TL = new Point { X = 0, Y = 1 },
    BR = new Point { X = 2, Y = 3 }
};

The compiled initialization code for r is equivalent to the following (the variable names are compiler-style placeholders, with rectangle1 playing the role of r):

Rectangle rectangle2 = new Rectangle();
Point point1 = new Point();
point1.X = 0;
point1.Y = 1;
rectangle2.TL = point1;
Point point2 = new Point();
point2.X = 2;
point2.Y = 3;
rectangle2.BR = point2;
Rectangle rectangle1 = rectangle2;

Now that you have seen this code, it should be clear that the shorter syntax has a real advantage in terms of code readability. The two temporary variables, point1 and point2, are also created in the object initializer form, but you do not need to define them explicitly.

The previous example used the nested object initializers with reference types. The same syntax also works for value types, but you have to remember that a copy of a temporary Point object is made when the TL and BR members are initialized.

Note 

Copying value types can have performance implications on large value types, but this is not related to the use of object initializers.
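The copy semantics can be seen in a short sketch. The types below are hypothetical variants of those in Listing 2-30, with Point declared as a struct and automatic properties used for brevity.

```csharp
using System;

// Hypothetical value-type variant of the Point class from Listing 2-30.
public struct Point {
    public int X { get; set; }
    public int Y { get; set; }
}

public class Rectangle {
    public Point TL { get; set; }
    public Point BR { get; set; }
}

public static class ValueTypeInitializerDemo {
    public static int CopiedX() {
        Point corner = new Point { X = 0, Y = 1 };

        // Assigning a struct to TL stores a copy of it inside the Rectangle.
        Rectangle r = new Rectangle { TL = corner, BR = new Point { X = 2, Y = 3 } };

        // Changing the local struct does not affect the copy held by r.
        corner.X = 99;
        return r.TL.X;
    }

    public static void Main() {
        Console.WriteLine(CopiedX()); // prints 0
    }
}
```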

The object initializer syntax can be used only to assign the initial value of a field or variable. The new keyword is required only for the outermost object. Inside an initializer, you can omit the new keyword when initializing a member that is itself an object. In this case, the code uses the object instance already created by the constructor of the containing object, as shown in Listing 2-31.

Listing 2-31: Initializers for owned objects

public class Rectangle {
    Point tl = new Point();
    Point br = new Point();
    public Point TL { get { return tl; } }
    public Point BR { get { return br; } }
}

// Possible code inside a method
Rectangle r = new Rectangle {
    TL = { X = 0, Y = 1 },
    BR = { X = 2, Y = 3 }
};


The TL and BR member instances are implicitly created by the Rectangle class constructor. The object initializer for TL and BR does not have the new keyword. In this way, the initializer works on the existing instance of TL and BR.
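As a rough sketch of what this form expands to, the example below compares the initializer with the equivalent statements. Rectangle follows Listing 2-31; Point is written with automatic properties for brevity, and the class and method names of the demo itself are hypothetical.

```csharp
using System;

public class Point {
    public int X { get; set; }
    public int Y { get; set; }
}

// Same shape as the Rectangle in Listing 2-31: read-only properties
// that return instances created by the class itself.
public class Rectangle {
    Point tl = new Point();
    Point br = new Point();
    public Point TL { get { return tl; } }
    public Point BR { get { return br; } }
}

public static class OwnedObjectDemo {
    public static Rectangle WithInitializer() {
        return new Rectangle {
            TL = { X = 0, Y = 1 },
            BR = { X = 2, Y = 3 }
        };
    }

    // Roughly what the initializer above expands to: no Point is created
    // here, and the TL and BR getters are used to reach the existing ones.
    public static Rectangle WithStatements() {
        Rectangle r = new Rectangle();
        r.TL.X = 0;
        r.TL.Y = 1;
        r.BR.X = 2;
        r.BR.Y = 3;
        return r;
    }

    public static void Main() {
        Rectangle a = WithInitializer();
        Rectangle b = WithStatements();
        Console.WriteLine("{0},{1} {2},{3}", a.TL.X, a.TL.Y, a.BR.X, a.BR.Y);
        Console.WriteLine("{0},{1} {2},{3}", b.TL.X, b.TL.Y, b.BR.X, b.BR.Y);
    }
}
```

Note that TL and BR have no setter in this version, which is why only the form without new is allowed for them.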

In the examples so far, we have used only constants within the object initializers. You can also use calculated values, as shown here:

Customer c3 = new Customer{
        Name = c1.Name, Country = c2.Country, Age = c2.Age };

C# 1.x included the concept of initializers that used a similar syntax, but it was limited to arrays:

int[] integers = { 1, 3, 9, 18 };
string[] customers = { "Jack", "Paolo", "Marco" };

The same object initializer syntax can also be used for collections. The elements of the list can be constants, expressions, or other initializers, like any other object initializer we have already shown. In the released C# 3.0 compiler, the collection type must implement the System.Collections.IEnumerable interface and expose an accessible Add method; one call to Add is made for each element in the initializer, in the order of the elements. For a class such as List<T>, which implements System.Collections.Generic.ICollection<T>, this is the Add(T) method; for a nongeneric class such as ArrayList, it is Add(object). The code in Listing 2-32 shows some examples of using collection initializers.

Listing 2-32: Collection initializers
Image from book

// Collection classes that implement ICollection<T>
List<int> integers = new List<int> { 1, 3, 9, 18 };

List<Customer> list = new List<Customer> {
    new Customer( "Jack", 28 ) { Country = "USA"},
    new Customer { Name = "Paolo" },
    new Customer { Name = "Marco", Country = "Italy" },
};

// Collection classes that implement IEnumerable
ArrayList integerList = new ArrayList() { 1, 3, 9, 18 };

ArrayList customerList = new ArrayList {
    new Customer( "Jack", 28 ) { Country = "USA"},
    new Customer { Name = "Paolo" },
    new Customer { Name = "Marco", Country = "Italy" },
};
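
The translation into Add calls can be sketched with a user-defined collection. NameLog below is a hypothetical class written only for this illustration; the Dictionary example relies on the standard Add(key, value) method, whose arguments are supplied by a nested pair of braces.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: it implements IEnumerable and exposes Add,
// which is what the collection initializer calls for each element.
public class NameLog : IEnumerable {
    private readonly List<string> names = new List<string>();

    public void Add(string name) {
        names.Add(name);
    }

    public int Count { get { return names.Count; } }

    public string this[int index] { get { return names[index]; } }

    public IEnumerator GetEnumerator() {
        return names.GetEnumerator();
    }
}

public static class CollectionInitializerDemo {
    public static NameLog BuildLog() {
        // Translated into three Add calls, in the order written.
        return new NameLog { "Jack", "Paolo", "Marco" };
    }

    public static Dictionary<string, int> BuildAges() {
        // Each inner pair of braces supplies the arguments of Add(key, value).
        return new Dictionary<string, int> {
            { "Jack", 28 },
            { "Paolo", 21 }
        };
    }

    public static void Main() {
        NameLog log = BuildLog();
        Console.WriteLine("{0} names, first is {1}", log.Count, log[0]);
        Console.WriteLine("Paolo is {0}", BuildAges()["Paolo"]);
    }
}
```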


In summary, object and collection initializers allow the creation and initialization of a set of objects (possibly nested) within a single expression. LINQ makes extensive use of this feature, especially through anonymous types.

Anonymous Types

An object initializer can also be used without specifying the class that will be created with the new operator. Doing so creates a new class, called an anonymous type. Consider the example shown in Listing 2-33.

Listing 2-33: Anonymous types definition

Customer c1 = new Customer { Name = "Marco" };
var c2 = new Customer { Name = "Paolo" };
var c3 = new { Name = "Tom", Age = 31 };
var c4 = new { c2.Name, c2.Age };
var c5 = new { c1.Name, c1.Country };
var c6 = new { c1.Country, c1.Name };

The variables c1 and c2 are of the Customer type, but the type of variables c3, c4, c5, and c6 cannot be inferred simply by reading the printed code. The var keyword should infer the variable type from the assigned expression, but this one has a new keyword without a type specified. As you might expect, that kind of object initializer generates a new class.

The generated class has a public property and an underlying private field for each argument contained in the initializer: its name and type are inferred from the object initializer itself. When the name is not explicit, it is inferred from the initialization expression, as in the definitions for c4, c5, and c6. This shorter syntax is called a projection initializer because it projects not just a value but also the name of the value.

That class is the same for all possible anonymous types whose properties have the same names and types in the same order. We can see the type names used and generated in this code:

Console.WriteLine( "c1 is {0}", c1.GetType() );
Console.WriteLine( "c2 is {0}", c2.GetType() );
Console.WriteLine( "c3 is {0}", c3.GetType() );
Console.WriteLine( "c4 is {0}", c4.GetType() );
Console.WriteLine( "c5 is {0}", c5.GetType() );
Console.WriteLine( "c6 is {0}", c6.GetType() );

The following is the output that is generated:

c1 is Customer
c2 is Customer
c3 is <>f__AnonymousType0`2[System.String,System.Int32]
c4 is <>f__AnonymousType0`2[System.String,System.Int32]
c5 is <>f__AnonymousType5`2[System.String,System.String]
c6 is <>f__AnonymousTypea`2[System.String,System.String]

The anonymous type name cannot be referenced by the code (you do not know the generated name), but it can be queried on an object instance. The variables c3 and c4 are of the same anonymous type because they have the same fields and properties. Even if c5 and c6 have the same properties (type and name), they are in a different order, and that is enough for the compiler to create two different anonymous types.
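The type-identity rule can be checked directly by comparing GetType() results, without relying on the generated names. The following is a minimal sketch; the class and method names are hypothetical.

```csharp
using System;

public static class AnonymousTypeDemo {
    // Same property names and types in the same order: one shared type.
    public static bool SameOrderSharesType() {
        var a = new { Name = "Tom", Age = 31 };
        var b = new { Name = "Paolo", Age = 21 };
        return a.GetType() == b.GetType();
    }

    // Same properties in a different order: two distinct types.
    public static bool DifferentOrderSharesType() {
        var a = new { Name = "Marco", Country = "Italy" };
        var b = new { Country = "Italy", Name = "Marco" };
        return a.GetType() == b.GetType();
    }

    public static void Main() {
        Console.WriteLine(SameOrderSharesType());      // prints True
        Console.WriteLine(DifferentOrderSharesType()); // prints False
    }
}
```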

Important 

Usually in C# the order of members inside a type is not important; even standard object initializers are based on member names and not on their order. The need for LINQ to get a different type for two classes that differ only in the order of their members derives from the need to represent an ordered set of fields, as in a SELECT statement.

The syntax to initialize a typed array has been enhanced in C# 3.0. Now you can declare an array initializer and infer the type from the initializer content. This mechanism can be combined with anonymous types and object initializers, as in the code shown in Listing 2-34.

Listing 2-34: Implicitly typed arrays

var ints = new[] { 1, 2, 3, 4 };
var ca1 = new[] {
    new Customer { Name = "Marco", Country = "Italy" },
    new Customer { Name = "Tom", Country = "USA" },
    new Customer { Name = "Paolo", Country = "Italy" }
};
var ca2 = new[] {
    new { Name = "Marco", Sports = new[] { "Tennis", "Spinning"} },
    new { Name = "Tom", Sports = new[] { "Rugby", "Squash", "Baseball" } },
    new { Name = "Paolo", Sports = new[] { "Skateboard", "Windsurf" } }
};
Note 

The syntax of C# 1.x needs the assigned variable to be a definite type. The syntax of C# 3.0 allows the use of the var keyword to define the variable initialized in such a way.

While ints is an array of int and ca1 is an array of Customer objects, ca2 is an array of anonymous types, each containing a string (Name) and an array of strings (Sports). You do not see a type in the ca2 definition because all types are inferred from the initialization expression. Once again, note that the ca2 assignment is a single expression, which could be embedded in another one.
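Even though no type name appears in the declaration, the elements of such an array are strongly typed. The sketch below reuses the ca2 initialization from Listing 2-34 and reads its members inside a loop; the surrounding class and method names are hypothetical.

```csharp
using System;

public static class ImplicitArrayDemo {
    public static int CountSports() {
        var ca2 = new[] {
            new { Name = "Marco", Sports = new[] { "Tennis", "Spinning" } },
            new { Name = "Tom", Sports = new[] { "Rugby", "Squash", "Baseball" } },
            new { Name = "Paolo", Sports = new[] { "Skateboard", "Windsurf" } }
        };

        int total = 0;
        // Each element exposes strongly typed Name and Sports properties.
        foreach (var c in ca2) {
            Console.WriteLine("{0}: {1}", c.Name, String.Join(", ", c.Sports));
            total += c.Sports.Length;
        }
        return total;
    }

    public static void Main() {
        Console.WriteLine("Total: {0}", CountSports()); // prints "Total: 7"
    }
}
```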

Query Expressions

C# 3.0 also introduces query expressions, which have a syntax similar to the SQL language and are used to manipulate data. This syntax is converted into regular C# 3.0 syntax that makes use of specific classes, methods, and interfaces that are part of the LINQ libraries. We will not cover all the keywords in detail because that is beyond the scope of this article.

In this section, we want to introduce the transformation that the compiler applies to a query expression, just to describe how the code is interpreted.

Here is a typical LINQ query:

// Declaration and initialization of an array of anonymous types
var customers = new []{
    new {  Name = "Marco", Discount = 4.5 },
    new {  Name = "Paolo", Discount = 3.0 },
    new {  Name = "Tom", Discount = 3.5 }
};

var query =
    from c in customers
    where c.Discount > 3
    orderby c.Discount
    select new { c.Name, Perc = c.Discount / 100 };

foreach( var x in query ) {
    Console.WriteLine( x );
}

A query expression begins with a from clause (in C#, all query expression keywords are case sensitive) and ends with either a select or group clause. The from clause specifies the object on which LINQ operations are applied, which must be an instance of a class that implements the IEnumerable<T> interface.

That code produces the following results:

{ Name = Tom, Perc = 0.035 }
{ Name = Marco, Perc = 0.045 }

C# 3.0 interprets the query assignment as if it were written this way:

var query = customers
            .Where( c => c.Discount > 3 )
            .OrderBy( c => c.Discount )
            .Select( c => new { c.Name, Perc = c.Discount / 100 } );

Each query expression clause corresponds to a generic method, which is resolved through the same rules that apply to an extension method. Therefore, the query expression syntax is similar to a macro expansion, even if it is a more intelligent one because it infers many definitions, like the names of parameters in lambda expressions.
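The equivalence between the two forms can be verified with a small sketch. It uses a plain array of discount values rather than the anonymous-type array above, and the class and method names are hypothetical; only the standard System.Linq extension methods are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryTranslationDemo {
    static readonly double[] discounts = { 4.5, 3.0, 3.5 };

    // Query expression syntax.
    public static List<double> WithQuerySyntax() {
        var query =
            from d in discounts
            where d > 3
            orderby d
            select d * 2;
        return query.ToList();
    }

    // The method calls that the query expression above is translated into.
    public static List<double> WithMethodSyntax() {
        return discounts
            .Where(d => d > 3)
            .OrderBy(d => d)
            .Select(d => d * 2)
            .ToList();
    }

    public static void Main() {
        // Both forms produce the same sequence: 7, 9.
        Console.WriteLine(WithQuerySyntax().SequenceEqual(WithMethodSyntax())); // prints True
    }
}
```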

At this point, it should be clear why the features of C# 3.0 that allow you to write complex actions into a single expression are so important to LINQ. A query expression calls many methods in a chain, where each call uses the result of the previous call as a parameter. Extension methods simplify the syntax, avoiding nested calls. Lambda expressions define the logic for some operations (such as where, orderby, and so on). Anonymous types and object initializers define how to store the results of a query. Local type inference is the glue that holds these pieces together.

