ЗНАЕТЕ ЛИ ВЫ?

Virtual, override, and abstract methods



When an instance method declaration includes a virtual modifier, the method is said to be a virtual method. When no virtual modifier is present, the method is said to be a non-virtual method.

When a virtual method is invoked, the run-time type of the instance for which that invocation takes place determines the actual method implementation to invoke. In a nonvirtual method invocation, the compile-time type of the instance is the determining factor.

A virtual method can be overridden in a derived class. When an instance method declaration includes an override modifier, the method overrides an inherited virtual method with the same signature. Whereas a virtual method declaration introduces a new method, an override method declaration specializes an existing inherited virtual method by providing a new implementation of that method.

An abstract method is a virtual method with no implementation. An abstract method is declared with the abstract modifier and is permitted only in a class that is also declared abstract. An abstract method must be overridden in every non-abstract derived class.

The following example declares an abstract class, Expression, which represents an expression tree node, and three derived classes, Constant, VariableReference, and Operation, which implement expression tree nodes for constants, variable references, and arithmetic operations. (This is similar to, but not to be confused with the expression tree types introduced in section §4.6).

using System;
using System.Collections;

public abstract class Expression
{
public abstract double Evaluate(Hashtable vars);
}

public class Constant: Expression
{
double value;

public Constant(double value) {
this.value = value;
}

public override double Evaluate(Hashtable vars) {
return value;
}
}

public class VariableReference: Expression
{
string name;

public VariableReference(string name) {
this.name = name;
}

public override double Evaluate(Hashtable vars) {
object value = vars[name];
if (value == null) {
throw new Exception("Unknown variable: " + name);
}
return Convert.ToDouble(value);
}
}

public class Operation: Expression
{
Expression left;
char op;
Expression right;

public Operation(Expression left, char op, Expression right) {
this.left = left;
this.op = op;
this.right = right;
}

public override double Evaluate(Hashtable vars) {
double x = left.Evaluate(vars);
double y = right.Evaluate(vars);
switch (op) {
case '+': return x + y;
case '-': return x - y;
case '*': return x * y;
case '/': return x / y;
}
throw new Exception("Unknown operator");
}
}

The previous four classes can be used to model arithmetic expressions. For example, using instances of these classes, the expression x + 3 can be represented as follows.

Expression e = new Operation(
new VariableReference("x"),
'+',
new Constant(3));

The Evaluate method of an Expression instance is invoked to evaluate the given expression and produce a double value. The method takes as an argument a Hashtable that contains variable names (as keys of the entries) and values (as values of the entries). The Evaluate method is a virtual abstract method, meaning that non-abstract derived classes must override it to provide an actual implementation.

A Constant’s implementation of Evaluate simply returns the stored constant. A VariableReference’s implementation looks up the variable name in the hashtable and returns the resulting value. An Operation’s implementation first evaluates the left and right operands (by recursively invoking their Evaluate methods) and then performs the given arithmetic operation.

The following program uses the Expression classes to evaluate the expression x * (y + 2) for different values of x and y.

using System;
using System.Collections;

class Test
{
static void Main() {

Expression e = new Operation(
new VariableReference("x"),
'*',
new Operation(
new VariableReference("y"),
'+',
new Constant(2)
)
);

Hashtable vars = new Hashtable();

vars["x"] = 3;
vars["y"] = 5;
Console.WriteLine(e.Evaluate(vars)); // Outputs "21"

vars["x"] = 1.5;
vars["y"] = 9;
Console.WriteLine(e.Evaluate(vars)); // Outputs "16.5"
}
}

Method overloading

Method overloading permits multiple methods in the same class to have the same name as long as they have unique signatures. When compiling an invocation of an overloaded method, the compiler uses overload resolution to determine the specific method to invoke. Overload resolution finds the one method that best matches the arguments or reports an error if no single best match can be found. The following example shows overload resolution in effect. The comment for each invocation in the Main method shows which method is actually invoked.

class Test
{
static void F() {
Console.WriteLine("F()");
}

static void F(object x) {
Console.WriteLine("F(object)");
}

static void F(int x) {
Console.WriteLine("F(int)");
}

static void F(double x) {
Console.WriteLine("F(double)");
}

static void F<T>(T x) {
Console.WriteLine("F<T>(T)");
}

static void F(double x, double y) {
Console.WriteLine("F(double, double)");
}

static void Main() {
F(); // Invokes F()
F(1); // Invokes F(int)
F(1.0); // Invokes F(double)
F("abc"); // Invokes F(object)
F((double)1); // Invokes F(double)
F((object)1); // Invokes F(object)
F<int>(1); // Invokes F<T>(T)
F(1, 1); // Invokes F(double, double) }
}

As shown by the example, a particular method can always be selected by explicitly casting the arguments to the exact parameter types and/or explicitly supplying type arguments.

Other function members

Members that contain executable code are collectively known as the function members of a class. The preceding section describes methods, which are the primary kind of function members. This section describes the other kinds of function members supported by C#: constructors, properties, indexers, events, operators, and destructors.

The following table shows a generic class called List<T>, which implements a growable list of objects. The class contains several examples of the most common kinds of function members.

public class List<T> {
const int defaultCapacity = 4; Constant
T[] items; int count; Fields
public List(int capacity = defaultCapacity) { items = new T[capacity]; } Constructors
public int Count { get { return count; } } public int Capacity { get { return items.Length; } set { if (value < count) value = count; if (value != items.Length) { T[] newItems = new T[value]; Array.Copy(items, 0, newItems, 0, count); items = newItems; } } } Properties

 

public T this[int index] { get { return items[index]; } set { items[index] = value; OnChanged(); } } Indexer
public void Add(T item) { if (count == Capacity) Capacity = count * 2; items[count] = item; count++; OnChanged(); } protected virtual void OnChanged() { if (Changed != null) Changed(this, EventArgs.Empty); } public override bool Equals(object other) { return Equals(this, other as List<T>); } static bool Equals(List<T> a, List<T> b) { if (a == null) return b == null; if (b == null || a.count != b.count) return false; for (int i = 0; i < a.count; i++) { if (!object.Equals(a.items[i], b.items[i])) { return false; } } return true; } Methods
public event EventHandler Changed; Event
public static bool operator ==(List<T> a, List<T> b) { return Equals(a, b); } public static bool operator !=(List<T> a, List<T> b) { return !Equals(a, b); } Operators
}

 

Constructors

C# supports both instance and static constructors. An instance constructor is a member that implements the actions required to initialize an instance of a class. A static constructor is a member that implements the actions required to initialize a class itself when it is first loaded.

A constructor is declared like a method with no return type and the same name as the containing class. If a constructor declaration includes a static modifier, it declares a static constructor. Otherwise, it declares an instance constructor.

Instance constructors can be overloaded. For example, the List<T> class declares two instance constructors, one with no parameters and one that takes an int parameter. Instance constructors are invoked using the new operator. The following statements allocate two List<string> instances using each of the constructors of the List class.

List<string> list1 = new List<string>();
List<string> list2 = new List<string>(10);

Unlike other members, instance constructors are not inherited, and a class has no instance constructors other than those actually declared in the class. If no instance constructor is supplied for a class, then an empty one with no parameters is automatically provided.

Properties

Properties are a natural extension of fields. Both are named members with associated types, and the syntax for accessing fields and properties is the same. However, unlike fields, properties do not denote storage locations. Instead, properties have accessors that specify the statements to be executed when their values are read or written.

A property is declared like a field, except that the declaration ends with a get accessor and/or a set accessor written between the delimiters { and } instead of ending in a semicolon. A property that has both a get accessor and a set accessor is a read-write property, a property that has only a get accessor is a read-only property, and a property that has only a set accessor is a write-only property.

A get accessor corresponds to a parameterless method with a return value of the property type. Except as the target of an assignment, when a property is referenced in an expression, the get accessor of the property is invoked to compute the value of the property.

A set accessor corresponds to a method with a single parameter named value and no return type. When a property is referenced as the target of an assignment or as the operand of ++ or --, the set accessor is invoked with an argument that provides the new value.

The List<T> class declares two properties, Count and Capacity, which are read-only and read-write, respectively. The following is an example of use of these properties.

List<string> names = new List<string>();
names.Capacity = 100; // Invokes set accessor
int i = names.Count; // Invokes get accessor
int j = names.Capacity; // Invokes get accessor

Similar to fields and methods, C# supports both instance properties and static properties. Static properties are declared with the static modifier, and instance properties are declared without it.

The accessor(s) of a property can be virtual. When a property declaration includes a virtual, abstract, or override modifier, it applies to the accessor(s) of the property.

Indexers

An indexer is a member that enables objects to be indexed in the same way as an array. An indexer is declared like a property except that the name of the member is this followed by a parameter list written between the delimiters [ and ]. The parameters are available in the accessor(s) of the indexer. Similar to properties, indexers can be read-write, read-only, and write-only, and the accessor(s) of an indexer can be virtual.

The List class declares a single read-write indexer that takes an int parameter. The indexer makes it possible to index List instances with int values. For example

List<string> names = new List<string>();
names.Add("Liz");
names.Add("Martha");
names.Add("Beth");
for (int i = 0; i < names.Count; i++) {
string s = names[i];
names[i] = s.ToUpper();
}

Indexers can be overloaded, meaning that a class can declare multiple indexers as long as the number or types of their parameters differ.

Events

An event is a member that enables a class or object to provide notifications. An event is declared like a field except that the declaration includes an event keyword and the type must be a delegate type.

Within a class that declares an event member, the event behaves just like a field of a delegate type (provided the event is not abstract and does not declare accessors). The field stores a reference to a delegate that represents the event handlers that have been added to the event. If no event handles are present, the field is null.

The List<T> class declares a single event member called Changed, which indicates that a new item has been added to the list. The Changed event is raised by the OnChanged virtual method, which first checks whether the event is null (meaning that no handlers are present). The notion of raising an event is precisely equivalent to invoking the delegate represented by the event—thus, there are no special language constructs for raising events.

Clients react to events through event handlers. Event handlers are attached using the += operator and removed using the -= operator. The following example attaches an event handler to the Changed event of a List<string>.

using System;

class Test
{
static int changeCount;

static void ListChanged(object sender, EventArgs e) {
changeCount++;
}

static void Main() {
List<string> names = new List<string>();
names.Changed += new EventHandler(ListChanged);
names.Add("Liz");
names.Add("Martha");
names.Add("Beth");
Console.WriteLine(changeCount); // Outputs "3"
}
}

For advanced scenarios where control of the underlying storage of an event is desired, an event declaration can explicitly provide add and remove accessors, which are somewhat similar to the set accessor of a property.

Operators

An operator is a member that defines the meaning of applying a particular expression operator to instances of a class. Three kinds of operators can be defined: unary operators, binary operators, and conversion operators. All operators must be declared as public and static.

The List<T> class declares two operators, operator == and operator !=, and thus gives new meaning to expressions that apply those operators to List instances. Specifically, the operators define equality of two List<T> instances as comparing each of the contained objects using their Equals methods. The following example uses the == operator to compare two List<int> instances.

using System;

class Test
{
static void Main() {
List<int> a = new List<int>();
a.Add(1);
a.Add(2);
List<int> b = new List<int>();
b.Add(1);
b.Add(2);
Console.WriteLine(a == b); // Outputs "True"
b.Add(3);
Console.WriteLine(a == b); // Outputs "False"
}
}

The first Console.WriteLine outputs True because the two lists contain the same number of objects with the same values in the same order. Had List<T> not defined operator ==, the first Console.WriteLine would have output False because a and b reference different List<int> instances.

Destructors

A destructor is a member that implements the actions required to destruct an instance of a class. Destructors cannot have parameters, they cannot have accessibility modifiers, and they cannot be invoked explicitly. The destructor for an instance is invoked automatically during garbage collection.

The garbage collector is allowed wide latitude in deciding when to collect objects and run destructors. Specifically, the timing of destructor invocations is not deterministic, and destructors may be executed on any thread. For these and other reasons, classes should implement destructors only when no other solutions are feasible.

The using statement provides a better approach to object destruction.

Structs

Like classes, structs are data structures that can contain data members and function members, but unlike classes, structs are value types and do not require heap allocation. A variable of a struct type directly stores the data of the struct, whereas a variable of a class type stores a reference to a dynamically allocated object. Struct types do not support user-specified inheritance, and all struct types implicitly inherit from type object.

Structs are particularly useful for small data structures that have value semantics. Complex numbers, points in a coordinate system, or key-value pairs in a dictionary are all good examples of structs. The use of structs rather than classes for small data structures can make a large difference in the number of memory allocations an application performs. For example, the following program creates and initializes an array of 100 points. With Point implemented as a class, 101 separate objects are instantiated—one for the array and one each for the 100 elements.

class Point
{
public int x, y;

public Point(int x, int y) {
this.x = x;
this.y = y;
}
}

class Test
{
static void Main() {
Point[] points = new Point[100];
for (int i = 0; i < 100; i++) points[i] = new Point(i, i);
}
}

An alternative is to make Point a struct.

struct Point
{
public int x, y;

public Point(int x, int y) {
this.x = x;
this.y = y;
}
}

Now, only one object is instantiated—the one for the array—and the Point instances are stored in-line in the array.

Struct constructors are invoked with the new operator, but that does not imply that memory is being allocated. Instead of dynamically allocating an object and returning a reference to it, a struct constructor simply returns the struct value itself (typically in a temporary location on the stack), and this value is then copied as necessary.

With classes, it is possible for two variables to reference the same object and thus possible for operations on one variable to affect the object referenced by the other variable. With structs, the variables each have their own copy of the data, and it is not possible for operations on one to affect the other. For example, the output produced by the following code fragment depends on whether Point is a class or a struct.

Point a = new Point(10, 10);
Point b = a;
a.x = 20;
Console.WriteLine(b.x);

If Point is a class, the output is 20 because a and b reference the same object. If Point is a struct, the output is 10 because the assignment of a to b creates a copy of the value, and this copy is unaffected by the subsequent assignment to a.x.

The previous example highlights two of the limitations of structs. First, copying an entire struct is typically less efficient than copying an object reference, so assignment and value parameter passing can be more expensive with structs than with reference types. Second, except for ref and out parameters, it is not possible to create references to structs, which rules out their usage in a number of situations.

Arrays

An array is a data structure that contains a number of variables that are accessed through computed indices. The variables contained in an array, also called the elements of the array, are all of the same type, and this type is called the element type of the array.

Array types are reference types, and the declaration of an array variable simply sets aside space for a reference to an array instance. Actual array instances are created dynamically at run-time using the new operator. The new operation specifies the length of the new array instance, which is then fixed for the lifetime of the instance. The indices of the elements of an array range from 0 to Length - 1. The new operator automatically initializes the elements of an array to their default value, which, for example, is zero for all numeric types and null for all reference types.

The following example creates an array of int elements, initializes the array, and prints out the contents of the array.

using System;

class Test
{
static void Main() {
int[] a = new int[10];
for (int i = 0; i < a.Length; i++) {
a[i] = i * i;
}
for (int i = 0; i < a.Length; i++) {
Console.WriteLine("a[{0}] = {1}", i, a[i]);
}
}
}

This example creates and operates on a single-dimensional array. C# also supports multi-dimensional arrays. The number of dimensions of an array type, also known as the rank of the array type, is one plus the number of commas written between the square brackets of the array type. The following example allocates a one-dimensional, a two-dimensional, and a three-dimensional array.

int[] a1 = new int[10];
int[,] a2 = new int[10, 5];
int[,,] a3 = new int[10, 5, 2];

The a1 array contains 10 elements, the a2 array contains 50 (10 × 5) elements, and the a3 array contains 100 (10 × 5 × 2) elements.

The element type of an array can be any type, including an array type. An array with elements of an array type is sometimes called a jagged array because the lengths of the element arrays do not all have to be the same. The following example allocates an array of arrays of int:

int[][] a = new int[3][];
a[0] = new int[10];
a[1] = new int[5];
a[2] = new int[20];

The first line creates an array with three elements, each of type int[] and each with an initial value of null. The subsequent lines then initialize the three elements with references to individual array instances of varying lengths.

The new operator permits the initial values of the array elements to be specified using an array initializer, which is a list of expressions written between the delimiters { and }. The following example allocates and initializes an int[] with three elements.

int[] a = new int[] {1, 2, 3};

Note that the length of the array is inferred from the number of expressions between { and }. Local variable and field declarations can be shortened further such that the array type does not have to be restated.

int[] a = {1, 2, 3};

Both of the previous examples are equivalent to the following:

int[] t = new int[3];
t[0] = 1;
t[1] = 2;
t[2] = 3;
int[] a = t;

Interfaces

An interface defines a contract that can be implemented by classes and structs. An interface can contain methods, properties, events, and indexers. An interface does not provide implementations of the members it defines—it merely specifies the members that must be supplied by classes or structs that implement the interface.

Interfaces may employ multiple inheritance. In the following example, the interface IComboBox inherits from both ITextBox and IListBox.

interface IControl
{
void Paint();
}

interface ITextBox: IControl
{
void SetText(string text);
}

interface IListBox: IControl
{
void SetItems(string[] items);
}

interface IComboBox: ITextBox, IListBox {}

Classes and structs can implement multiple interfaces. In the following example, the class EditBox implements both IControl and IDataBound.

interface IDataBound
{
void Bind(Binder b);
}

public class EditBox: IControl, IDataBound
{
public void Paint() {...}

public void Bind(Binder b) {...}
}

When a class or struct implements a particular interface, instances of that class or struct can be implicitly converted to that interface type. For example

EditBox editBox = new EditBox();
IControl control = editBox;
IDataBound dataBound = editBox;

In cases where an instance is not statically known to implement a particular interface, dynamic type casts can be used. For example, the following statements use dynamic type casts to obtain an object’s IControl and IDataBound interface implementations. Because the actual type of the object is EditBox, the casts succeed.

object obj = new EditBox();
IControl control = (IControl)obj;
IDataBound dataBound = (IDataBound)obj;

In the previous EditBox class, the Paint method from the IControl interface and the Bind method from the IDataBound interface are implemented using public members. C# also supports explicit interface member implementations, using which the class or struct can avoid making the members public. An explicit interface member implementation is written using the fully qualified interface member name. For example, the EditBox class could implement the IControl.Paint and IDataBound.Bind methods using explicit interface member implementations as follows.

public class EditBox: IControl, IDataBound
{
void IControl.Paint() {...}

void IDataBound.Bind(Binder b) {...}
}

Explicit interface members can only be accessed via the interface type. For example, the implementation of IControl.Paint provided by the previous EditBox class can only be invoked by first converting the EditBox reference to the IControl interface type.

EditBox editBox = new EditBox();
editBox.Paint(); // Error, no such method
IControl control = editBox;
control.Paint(); // Ok

Enums

An enum type is a distinct value type with a set of named constants. The following example declares and uses an enum type named Color with three constant values, Red, Green, and Blue.

using System;

enum Color
{
Red,
Green,
Blue
}

class Test
{
static void PrintColor(Color color) {
switch (color) {
case Color.Red:
Console.WriteLine("Red");
break;
case Color.Green:
Console.WriteLine("Green");
break;
case Color.Blue:
Console.WriteLine("Blue");
break;
default:
Console.WriteLine("Unknown color");
break;
}
}

static void Main() {
Color c = Color.Red;
PrintColor(c);
PrintColor(Color.Blue);
}
}

Each enum type has a corresponding integral type called the underlying type of the enum type. An enum type that does not explicitly declare an underlying type has an underlying type of int. An enum type’s storage format and range of possible values are determined by its underlying type. The set of values that an enum type can take on is not limited by its enum members. In particular, any value of the underlying type of an enum can be cast to the enum type and is a distinct valid value of that enum type.

The following example declares an enum type named Alignment with an underlying type of sbyte.

enum Alignment: sbyte
{
Left = -1,
Center = 0,
Right = 1
}

As shown by the previous example, an enum member declaration can include a constant expression that specifies the value of the member. The constant value for each enum member must be in the range of the underlying type of the enum. When an enum member declaration does not explicitly specify a value, the member is given the value zero (if it is the first member in the enum type) or the value of the textually preceding enum member plus one.

Enum values can be converted to integral values and vice versa using type casts. For example

int i = (int)Color.Blue; // int i = 2;
Color c = (Color)2; // Color c = Color.Blue;

The default value of any enum type is the integral value zero converted to the enum type. In cases where variables are automatically initialized to a default value, this is the value given to variables of enum types. In order for the default value of an enum type to be easily available, the literal 0 implicitly converts to any enum type. Thus, the following is permitted.

Color c = 0;

Delegates

A delegate type represents references to methods with a particular parameter list and return type. Delegates make it possible to treat methods as entities that can be assigned to variables and passed as parameters. Delegates are similar to the concept of function pointers found in some other languages, but unlike function pointers, delegates are object-oriented and type-safe.

The following example declares and uses a delegate type named Function.

using System;

delegate double Function(double x);

class Multiplier
{
double factor;

public Multiplier(double factor) {
this.factor = factor;
}

public double Multiply(double x) {
return x * factor;
}
}

class Test
{
static double Square(double x) {
return x * x;
}

static double[] Apply(double[] a, Function f) {
double[] result = new double[a.Length];
for (int i = 0; i < a.Length; i++) result[i] = f(a[i]);
return result;
}

static void Main() {
double[] a = {0.0, 0.5, 1.0};

double[] squares = Apply(a, Square);

double[] sines = Apply(a, Math.Sin);

Multiplier m = new Multiplier(2.0);
double[] doubles = Apply(a, m.Multiply);
}
}

An instance of the Function delegate type can reference any method that takes a double argument and returns a double value. The Apply method applies a given Function to the elements of a double[], returning a double[] with the results. In the Main method, Apply is used to apply three different functions to a double[].

A delegate can reference either a static method (such as Square or Math.Sin in the previous example) or an instance method (such as m.Multiply in the previous example). A delegate that references an instance method also references a particular object, and when the instance method is invoked through the delegate, that object becomes this in the invocation.

Delegates can also be created using anonymous functions, which are “inline methods” that are created on the fly. Anonymous functions can see the local variables of the sourrounding methods. Thus, the multiplier example above can be written more easily without using a Multiplier class:

double[] doubles = Apply(a, (double x) => x * 2.0);

An interesting and useful property of a delegate is that it does not know or care about the class of the method it references; all that matters is that the referenced method has the same parameters and return type as the delegate.

Attributes

Types, members, and other entities in a C# program support modifiers that control certain aspects of their behavior. For example, the accessibility of a method is controlled using the public, protected, internal, and private modifiers. C# generalizes this capability such that user-defined types of declarative information can be attached to program entities and retrieved at run-time. Programs specify this additional declarative information by defining and using attributes.

The following example declares a HelpAttribute attribute that can be placed on program entities to provide links to their associated documentation.

using System;

public class HelpAttribute: Attribute
{
string url;
string topic;

public HelpAttribute(string url) {
this.url = url;
}

public string Url {
get { return url; }
}

public string Topic {
get { return topic; }
set { topic = value; }
}
}

All attribute classes derive from the System.Attribute base class provided by the .NET Framework. Attributes can be applied by giving their name, along with any arguments, inside square brackets just before the associated declaration. If an attribute’s name ends in Attribute, that part of the name can be omitted when the attribute is referenced. For example, the HelpAttribute attribute can be used as follows.

[Help("http://msdn.microsoft.com/.../MyClass.htm")]
public class Widget
{
[Help("http://msdn.microsoft.com/.../MyClass.htm", Topic = "Display")]
public void Display(string text) {}
}

This example attaches a HelpAttribute to the Widget class and another HelpAttribute to the Display method in the class. The public constructors of an attribute class control the information that must be provided when the attribute is attached to a program entity. Additional information can be provided by referencing public read-write properties of the attribute class (such as the reference to the Topic property previously).

The following example shows how attribute information for a given program entity can be retrieved at run-time using reflection.

using System;
using System.Reflection;

class Test
{
static void ShowHelp(MemberInfo member) {
HelpAttribute a = Attribute.GetCustomAttribute(member,
typeof(HelpAttribute)) as HelpAttribute;
if (a == null) {
Console.WriteLine("No help for {0}", member);
}
else {
Console.WriteLine("Help for {0}:", member);
Console.WriteLine(" Url={0}, Topic={1}", a.Url, a.Topic);
}
}

static void Main() {
ShowHelp(typeof(Widget));
ShowHelp(typeof(Widget).GetMethod("Display"));
}
}

When a particular attribute is requested through reflection, the constructor for the attribute class is invoked with the information provided in the program source, and the resulting attribute instance is returned. If additional information was provided through properties, those properties are set to the given values before the attribute instance is returned.


Lexical structure

Programs

A C# program consists of one or more source files, known formally as compilation units (§9.1). A source file is an ordered sequence of Unicode characters. Source files typically have a one-to-one correspondence with files in a file system, but this correspondence is not required. For maximal portability, it is recommended that files in a file system be encoded with the UTF-8 encoding.

Conceptually speaking, a program is compiled using three steps:

1. Transformation, which converts a file from a particular character repertoire and encoding scheme into a sequence of Unicode characters.

2. Lexical analysis, which translates a stream of Unicode input characters into a stream of tokens.

3. Syntactic analysis, which translates the stream of tokens into executable code.

Grammars

This specification presents the syntax of the C# programming language using two grammars. The lexical grammar (§2.2.2) defines how Unicode characters are combined to form line terminators, white space, comments, tokens, and pre-processing directives. The syntactic grammar (§2.2.3) defines how the tokens resulting from the lexical grammar are combined to form C# programs.

Grammar notation

The lexical and syntactic grammars are presented using grammar productions. Each grammar production defines a non-terminal symbol and the possible expansions of that non-terminal symbol into sequences of non-terminal or terminal symbols. In grammar productions, non-terminal symbols are shown in italic type, and terminal symbols are shown in a fixed-width font.

The first line of a grammar production is the name of the non-terminal symbol being defined, followed by a colon. Each successive indented line contains a possible expansion of the non-terminal given as a sequence of non-terminal or terminal symbols. For example, the production:

while-statement:
while ( boolean-expression ) embedded-statement

defines a while-statement to consist of the token while, followed by the token “(”, followed by a boolean-expression, followed by the token “)”, followed by an embedded-statement.

When there is more than one possible expansion of a non-terminal symbol, the alternatives are listed on separate lines. For example, the production:

statement-list:
statement
statement-list statement

defines a statement-list to either consist of a statement or consist of a statement-list followed by a statement. In other words, the definition is recursive and specifies that a statement list consists of one or more statements.

A subscripted suffix “opt” is used to indicate an optional symbol. The production:

block:
{ statement-listopt }

is shorthand for:

block:
{ }
{ statement-list }

and defines a block to consist of an optional statement-list enclosed in “{” and “}” tokens.

Alternatives are normally listed on separate lines, though in cases where there are many alternatives, the phrase “one of” may precede a list of expansions given on a single line. This is simply shorthand for listing each of the alternatives on a separate line. For example, the production:

real-type-suffix: one of
F f D d M m

is shorthand for:

real-type-suffix:
F
f
D
d
M
m

Lexical grammar

The lexical grammar of C# is presented in §2.3, §2.4, and §2.5. The terminal symbols of the lexical grammar are the characters of the Unicode character set, and the lexical grammar specifies how characters are combined to form tokens (§2.4), white space (§2.3.3), comments (§2.3.2), and pre-processing directives (§2.5).

Every source file in a C# program must conform to the input production of the lexical grammar (§2.3).

Syntactic grammar

The syntactic grammar of C# is presented in the chapters and appendices that follow this chapter. The terminal symbols of the syntactic grammar are the tokens defined by the lexical grammar, and the syntactic grammar specifies how tokens are combined to form C# programs.

Every source file in a C# program must conform to the compilation-unit production of the syntactic grammar (§9.1).

Lexical analysis

The input production defines the lexical structure of a C# source file. Each source file in a C# program must conform to this lexical grammar production.

input:
input-sectionopt

input-section:
input-section-part
input-section input-section-part

input-section-part:
input-elementsopt new-line
pp-directive

input-elements:
input-element
input-elements input-element

input-element:
whitespace
comment
token

Five basic elements make up the lexical structure of a C# source file: Line terminators (§2.3.1), white space (§2.3.3), comments (§2.3.2), tokens (§2.4), and pre-processing directives (§2.5). Of these basic elements, only tokens are significant in the syntactic grammar of a C# program (§2.2.3).

The lexical processing of a C# source file consists of reducing the file into a sequence of tokens which becomes the input to the syntactic analysis. Line terminators, white space, and comments can serve to separate tokens, and pre-processing directives can cause sections of the source file to be skipped, but otherwise these lexical elements have no impact on the syntactic structure of a C# program.

When several lexical grammar productions match a sequence of characters in a source file, the lexical processing always forms the longest possible lexical element. For example, the character sequence // is processed as the beginning of a single-line comment because that lexical element is longer than a single / token.

Line terminators

Line terminators divide the characters of a C# source file into lines.

new-line:
Carriage return character (U+000D)
Line feed character (U+000A)
Carriage return character (U+000D) followed by line feed character (U+000A)
Next line character(U+0085)
Line separator character (U+2028)
Paragraph separator character (U+2029)

For compatibility with source code editing tools that add end-of-file markers, and to enable a source file to be viewed as a sequence of properly terminated lines, the following transformations are applied, in order, to every source file in a C# program:

· If the last character of the source file is a Control-Z character (U+001A), this character is deleted.

· A carriage-return character (U+000D) is added to the end of the source file if that source file is non-empty and if the last character of the source file is not a carriage return (U+000D), a line feed (U+000A), a line separator (U+2028), or a paragraph separator (U+2029).

Comments

Two forms of comments are supported: single-line comments and delimited comments. Single-line comments start with the characters // and extend to the end of the source line. Delimited comments start with the characters /* and end with the characters */. Delimited comments may span multiple lines.

comment:
single-line-comment
delimited-comment

single-line-comment:
// input-charactersopt

input-characters:
input-character
input-characters input-character

input-character:
Any Unicode character except a new-line-character

new-line-character:
Carriage return character (U+000D)
Line feed character (U+000A)
Next line character(U+0085)
Line separator character (U+2028)
Paragraph separator character (U+2029)

delimited-comment:
/* delimited-comment-textopt asterisks /

delimited-comment-text:
delimited-comment-section
delimited-comment-text delimited-comment-section

delimited-comment-section:
/
asterisksopt not-slash-or-asterisk

asterisks:
*
asterisks *

not-slash-or-asterisk:
Any Unicode character except / or *

Comments do not nest. The character sequences /* and */ have no special meaning within a // comment, and the character sequences // and /* have no special meaning within a delimited comment.

Comments are not processed within character and string literals.

The example

/* Hello, world program
This program writes “hello, world” to the console
*/
class Hello
{
static void Main() {
System.Console.WriteLine("hello, world");
}
}

includes a delimited comment.

The example

// Hello, world program
// This program writes “hello, world” to the console
//
class Hello // any name will do for this class
{
static void Main() { // this method must be named "Main"
System.Console.WriteLine("hello, world");
}
}

shows several single-line comments.

White space

White space is defined as any character with Unicode class Zs (which includes the space character) as well as the horizontal tab character, the vertical tab character, and the form feed character.

whitespace:
Any character with Unicode class Zs
Horizontal tab character (U+0009)
Vertical tab character (U+000B)
Form feed character (U+000C)

Tokens

There are several kinds of tokens: identifiers, keywords, literals, operators, and punctuators. White space and comments are not tokens, though they act as separators for tokens.

token:
identifier
keyword
integer-literal
real-literal
character-literal
string-literal
operator-or-punctuator





Последнее изменение этой страницы: 2016-12-14; Нарушение авторского права страницы

infopedia.su Все материалы представленные на сайте исключительно с целью ознакомления читателями и не преследуют коммерческих целей или нарушение авторских прав. Обратная связь - 3.236.156.32 (0.072 с.)