C# Language Design
Peter Hallam
Software Design Engineer
C# Compiler
Microsoft Corporation
Overview




Introduction to C#
Design Problems
Future Directions
Questions
Hello World
using System;
class Hello
{
    static void Main()
    {
        Console.WriteLine("Hello, world!");
    }
}
C# Program Structure

Namespaces
  Contain types and other namespaces
Type declarations
  Classes, structs, interfaces, enums, and delegates
Members
  Constants, fields, methods, operators, constructors, destructors
  Properties, indexers, events
Organization
  No header files, code written “in-line”
Program Structure Example
namespace System.Collections
{
    using System;
    public class Stack: Collection
    {
        Entry top;
        public void Push(object data) {
            top = new Entry(top, data);
        }
        public object Pop() {
            if (top == null) throw new InvalidOperationException();
            object result = top.data;
            top = top.next;
            return result;
        }
    }
}
Predefined Types

C# predefined types
  The “root”: object
  Logical: bool
  Signed: sbyte, short, int, long
  Unsigned: byte, ushort, uint, ulong
  Floating-point: float, double, decimal
  Textual: char, string
Textual types use Unicode (16-bit characters)
C# Classes



Single inheritance
Can implement multiple interfaces
Members
  Constants, fields, methods, operators, constructors, destructors
  Properties, indexers, events
  Nested types
  Static and instance members
Member access
  public, protected, internal, private
Interfaces



Can contain method declarations; no code or data
Defines a “contract” that a class must support
Classes have one base class, but can implement many interfaces
interface IFormattable {
    string Format(string format);
}
class DateTime: IFormattable {
    public string Format(string format) {…}
}
Statements and Expressions

Very similar to C++, with some changes to increase robustness
  No ‘->’ or ‘::’; all qualification uses ‘.’
  Local variables must be initialized before use
  if, while, do require bool condition
  goto can’t jump into blocks
  switch statement – no fall through
  Expression statements must do something useful (assignment or call)
void Foo() {
    i == 1;      // error
}
void Foo() {
    if (i = 1)   // error
    ...
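The rules above can be seen together in a small sketch. The class and method names (StatementsDemo, Describe) are hypothetical and chosen only for illustration; the commented-out forms are the ones the compiler rejects.

```csharp
using System;

static class StatementsDemo
{
    // Hypothetical helper showing definite assignment, switch, and bool conditions.
    public static string Describe(int n)
    {
        string result;               // must be assigned on every path before it is read
        switch (n)
        {
            case 0:
                result = "zero";
                break;               // required: a non-empty case cannot fall through
            case 1:
            case 2:                  // stacking empty case labels is still allowed
                result = "small";
                break;
            default:
                result = "large";
                break;
        }
        if (n == 1)                  // condition must be bool; "if (n = 1)" does not compile
            result += "!";
        return result;
    }

    static void Main()
    {
        Console.WriteLine(Describe(0));
        Console.WriteLine(Describe(1));
        Console.WriteLine(Describe(5));
    }
}
```

Removing any of the break statements, or reading result before the switch, turns the sketch into a compile-time error rather than a runtime surprise.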
C# Design Goals

Simple, Extensible Type System

1st Class Component Support

Robust and Versionable

Preserve existing investments
Problem:
How to Unify the Type System

A single universal base type (“object”)
  All types ultimately inherit from object
  Object variable can hold any value
  Any piece of data can be stored, transported, and manipulated with no extra work
Unification enables:
  Calling virtual functions on any value
  Collection classes for any type
Unifying the Type System
Desired picture: a single hierarchy rooted at object, containing Stream (with MemoryStream and FileStream), Hashtable, int, and double
How to deal with the primitive types without losing performance?
How to create user-defined types that are as efficient as “int” or “double”?
How to Unify:
A traditional approach (Smalltalk)
Make everything a real object
Performance implications
  All objects have a type descriptor or virtual function table
  May require all objects to be heap-allocated to prevent dangling pointers
Behavior and expectation mismatch
  “int” variables can be “null”
How to Unify:
Don’t do it (Eiffel, Java)

Intrinsic types are not classes
  Good performance
  Can’t convert “int” to “Object” – the primitive types are in a separate world
  Requires special wrapper classes (e.g., “Integer”) to “wrap” a primitive type so that it works in the “Object” world
  Not extensible – the set of primitive types is fixed
How to Unify:
C# Approach


Types are divided into two kinds: Reference types and Value types
Reference types are full-featured:
  Always allocated in heap
  Arbitrary derivation
Value types have restrictions:
  Only inherit from object
  Can’t be used as base classes
  Allocated from stack or inline inside other objects
  Assignment copies value, not reference
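The difference in assignment semantics can be sketched as follows. PointValue, PointRef, and CopySemanticsDemo are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical value type: assignment copies the data.
struct PointValue
{
    public int X;
}

// Hypothetical reference type: assignment copies the reference.
class PointRef
{
    public int X;
}

static class CopySemanticsDemo
{
    public static int ValueCopy()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;    // b is an independent copy of a
        b.X = 2;
        return a.X;          // a is unchanged
    }

    public static int ReferenceCopy()
    {
        PointRef a = new PointRef();
        a.X = 1;
        PointRef b = a;      // b refers to the same heap object as a
        b.X = 2;
        return a.X;          // the change is visible through a
    }

    static void Main()
    {
        Console.WriteLine(ValueCopy());      // 1
        Console.WriteLine(ReferenceCopy());  // 2
    }
}
```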
Unification



Value types don’t need type descriptors or vtables (efficient!)
“object” does need a type descriptor, because it can contain any type
Value types become reference types when they are converted to “object”
  Value is copied to heap, type descriptor attached
  Process is called “boxing”
  When cast back to value type, “unboxing” occurs, value is copied out of heap
Boxing and Unboxing

“Everything is an object”
  Any type can be stored as an object
int i = 123;
object o = i;      // “Boxing”
int j = (int)o;    // “Unboxing”
Diagram: i holds 123; o refers to a heap box holding the type “int” and the value 123; j holds 123
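That boxing copies the value, rather than aliasing the original variable, can be checked with a short sketch. BoxingDemo and Run are hypothetical names.

```csharp
using System;

static class BoxingDemo
{
    public static int Run()
    {
        int i = 123;
        object o = i;        // boxing: the value is copied into a heap object
        i = 456;             // changing i afterwards does not touch the boxed copy
        int j = (int)o;      // unboxing: the value is copied back out of the heap
        return j;
    }

    static void Main()
    {
        Console.WriteLine(Run());                  // 123
        object boxed = 42;
        Console.WriteLine(boxed is int);           // True: the box carries its exact type
        Console.WriteLine(boxed.GetType().Name);   // Int32
    }
}
```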
User-Defined Types


C# allows user-defined types to be either reference or value types
Classes (reference)
  Used for most objects
Structs (value)
  Objects that are like primitive data (Point, Complex, etc.)
struct Point { int x, y; ... }
Point sp = new Point(10, 20);
C# Type System



Natural User Model
Extensible
Performant
Problem:
Additional Declarative Information

How do you associate information with types and members?
  XML persistence mapping for a type
  External code interop information
  Remoting information
  Transaction context for a method
  Visual designer information (how should property be categorized?)
  Security constraints (what permissions are required to call this method?)
Other Approaches

Add keyword or pragma
  Requires updating the compiler for each new piece of information
Use external file
  Information clumsy to find/see
  Require duplication of names
  Example: IDL files for remote procedures
Use naming patterns
  Create a new class or constant that describes the class/members
  Example: Java “BeanInfo” classes
C# Solution: Attributes



Attach named attributes (with optional arguments) to language elements
Uses simple bracketed syntax
Arguments must be constants of string, number, enum, type-name
Attributes - Examples
public class OrderProcessor {
    [WebMethod]
    public void SubmitOrder(PurchaseOrder order) {...}
}
public class PurchaseOrder
{
    [XmlElement("shipTo")] public Address ShipTo;
    [XmlElement("billTo")] public Address BillTo;
    [XmlElement("items")] public Item[] Items;
    [XmlAttribute("date")] public DateTime OrderDate;
}
public class Button {
    [Category(Categories.Layout)]
    public int Width { get {…} set {…} }
    [Obsolete("Use DoStuff2 instead")]
    public void DoStuff() {…}
}
Attributes

Attached to types, members, parameters, and libraries
Present in the compiled metadata
Can be examined by the common language runtime, by compilers, by the .NET Frameworks, or by user code (using reflection)
Extensible
Type-safe
Extensively used in .NET Frameworks
  XML, Web Services, security, serialization, component model, transactions, external code interop…
Creating an Attribute

Attributes are simply classes
  Derived from System.Attribute
  Class functionality = attribute functionality
  Attribute arguments are constructor arguments
public class ObsoleteAttribute : System.Attribute
{
    public ObsoleteAttribute () { … }
    public ObsoleteAttribute (string descrip) { … }
}
Using the Attribute
[Obsolete]
void Foo() {…}
[Obsolete("Use Baz instead")]
void Bar(int i) {…}

When a compiler sees an attribute it:
1. Finds the constructor, passing in args
2. Checks the types of arguments against the constructor
3. Saves a reference to the constructor and values of the arguments in the metadata
Querying Attributes

Use reflection to query attributes
Type type = typeof(MyClass);
foreach (Attribute attr in type.GetCustomAttributes(true))
{
    if (attr is ObsoleteAttribute) {
        ObsoleteAttribute oa = (ObsoleteAttribute) attr;
        // Description is assumed to expose the descrip constructor argument
        Console.WriteLine("{0} is obsolete: {1}",
            type, oa.Description);
    }
}
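Defining, applying, and querying an attribute can be put together in one self-contained sketch. NoteAttribute, OldWidget, and AttributeQueryDemo are hypothetical names; a custom attribute is used here instead of ObsoleteAttribute so that the example does not depend on the members of the framework's own class.

```csharp
using System;

// Hypothetical attribute: an ordinary class derived from System.Attribute.
[AttributeUsage(AttributeTargets.Class)]
public class NoteAttribute : Attribute
{
    private readonly string text;

    // The attribute argument is simply a constructor argument.
    public NoteAttribute(string text) { this.text = text; }

    public string Text { get { return text; } }
}

[Note("Use NewWidget instead")]
public class OldWidget { }

public static class AttributeQueryDemo
{
    // Returns the note attached to a type, or null if the type has none.
    public static string NoteFor(Type type)
    {
        foreach (object attr in type.GetCustomAttributes(typeof(NoteAttribute), false))
        {
            return ((NoteAttribute)attr).Text;
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(NoteFor(typeof(OldWidget)));        // Use NewWidget instead
        Console.WriteLine(NoteFor(typeof(string)) == null);   // True
    }
}
```

The attribute instance is only constructed when reflection asks for it; until then the compiled metadata holds just the constructor reference and the argument values.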
Problem : Versioning


Once a class library is released, can we add functionality without breaking users of the class library?
Very important for system level components!
Versioning Problems

Versioning is overlooked in most languages
  C++ and Java produce fragile base classes
  Users unable to express versioning intent
Adding a virtual method can break a derived class
  If the derived class already has a method of the same name, breakage can happen
Versioning: C# solution

C# allows intent to be expressed
  Methods are not virtual by default
  C# keywords “virtual”, “override” and “new” provide context
  Adding a base class member never breaks a derived class
  Adding or removing a private member never breaks another class
C# can't guarantee versioning
  Can enable (e.g., explicit override)
  Can encourage (e.g., smart defaults)
Versioning Example
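// Base, version 1: no Foo
class Base
{
}
// Base, version 2: adds a virtual Foo
class Base
{
    public virtual void Foo() {
        Console.WriteLine("Base.Foo");
    }
}
// Derived, version 1: written against Base version 1
class Derived: Base
{
    public virtual void Foo() {
        Console.WriteLine("Derived.Foo");
    }
}
// Derived, version 2a: "new" states that Foo is unrelated to Base.Foo
class Derived: Base
{
    new public virtual void Foo() {
        Console.WriteLine("Derived.Foo");
    }
}
// Derived, version 2b: "override" states that Foo specializes Base.Foo
class Derived: Base
{
    public override void Foo() {
        base.Foo();
        Console.WriteLine("Derived.Foo");
    }
}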
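The practical difference between “new” and “override” in the versioning example shows up when the method is called through a Base reference. The sketch below uses hypothetical class names (DerivedNew, DerivedOverride) and returns strings instead of printing so that the result is easy to inspect.

```csharp
using System;

class Base
{
    public virtual string Foo() { return "Base.Foo"; }
}

// "new": this Foo hides Base.Foo and starts an unrelated virtual method.
class DerivedNew : Base
{
    public new virtual string Foo() { return "DerivedNew.Foo"; }
}

// "override": this Foo replaces Base.Foo for virtual dispatch.
class DerivedOverride : Base
{
    public override string Foo() { return "DerivedOverride.Foo"; }
}

static class VersioningDemo
{
    static void Main()
    {
        Base viaNew = new DerivedNew();
        Base viaOverride = new DerivedOverride();

        Console.WriteLine(viaNew.Foo());             // Base.Foo
        Console.WriteLine(viaOverride.Foo());        // DerivedOverride.Foo
        Console.WriteLine(new DerivedNew().Foo());   // DerivedNew.Foo
    }
}
```

Code written against the base class keeps its old behavior when the derived method is marked “new”, which is what lets a library add Foo to Base without silently changing existing callers.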
Interface Implementation

Private interface implementations
Resolve interface member conflicts
interface I {
    void foo();
}
interface J {
    void foo();
}
class C: I, J {
    void I.foo() { /* do one thing */ }
    void J.foo() { /* do another thing */ }
}
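A minimal sketch of how such private (explicit) implementations are called. The methods return strings here, which is an assumption made only so the two implementations can be told apart.

```csharp
using System;

interface I { string Foo(); }
interface J { string Foo(); }

class C : I, J
{
    // Explicit implementations: reachable only through the interface type.
    string I.Foo() { return "I.Foo"; }
    string J.Foo() { return "J.Foo"; }
}

static class ExplicitInterfaceDemo
{
    static void Main()
    {
        C c = new C();
        // c.Foo();  // does not compile: C has no public member named Foo
        Console.WriteLine(((I)c).Foo());  // I.Foo
        Console.WriteLine(((J)c).Foo());  // J.Foo
    }
}
```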
foreach Statement

Iteration of arrays
public static void Main(string[] args) {
    foreach (string s in args) Console.WriteLine(s);
}

Iteration of user-defined collections
foreach (Customer c in customers.OrderBy("name")) {
    if (c.Orders.Count != 0) {
        ...
    }
}
Extending foreach
IEnumerable
interface IEnumerable {
    IEnumerator GetEnumerator();
}
interface IEnumerator {
    bool MoveNext();
    object Current { get; }
}
Extending foreach
IEnumerable
foreach (int v in collection) {
    // use element v …
}
// is equivalent to:
IEnumerable ie = (IEnumerable) collection;
IEnumerator e = ie.GetEnumerator();
while (e.MoveNext()) {
    int v = (int) e.Current;
    …
}
foreach

Problems with IEnumerable
  no compile time type checking
  boxing when enumerating value types
Solution : A pattern-based approach
The C# compiler looks for:
  GetEnumerator() on the collection
  bool MoveNext() on the enumerator type
  Strongly typed Current on the enumerator type
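The pattern can be sketched with a collection that implements neither IEnumerable nor IEnumerator. Countdown and its nested Enumerator are hypothetical types; the only things foreach relies on are the three members listed above.

```csharp
using System;

// Hypothetical collection that plugs into foreach purely by pattern.
class Countdown
{
    private readonly int start;

    public Countdown(int start) { this.start = start; }

    public Enumerator GetEnumerator() { return new Enumerator(start); }

    public struct Enumerator
    {
        private int next;
        private int current;

        internal Enumerator(int start) { next = start; current = 0; }

        // Strongly typed Current: elements are read as int, with no boxing.
        public int Current { get { return current; } }

        public bool MoveNext()
        {
            if (next <= 0) return false;
            current = next;
            next--;
            return true;
        }
    }
}

static class PatternForeachDemo
{
    public static int Sum(Countdown c)
    {
        int total = 0;
        foreach (int v in c) total += v;   // binds to GetEnumerator/MoveNext/Current
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new Countdown(3)));  // 6 (3 + 2 + 1)
    }
}
```

Writing foreach (string v in c) against this collection is rejected at compile time, because the element type comes from the declared type of Current rather than from object.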
foreach - Summary

Some Complexity
  interface
  pattern
Extensible
  User collections can plug into foreach
User Model
  Compile-time type checking
Performance
  Value type access without boxing
Future Directions
Generics - Prototype

Implemented by MSR Cambridge
  Don Syme
  Andrew Kennedy
Published Paper at PLDI 2001
Generics - Prototype
class Stack<T>
{
    T[] data;
    void Push(T top) { … }
    T Pop() { … }
}
Stack<string> ss = new Stack<string>();
ss.Push("Hello");
Stack<int> si = new Stack<int>();
si.Push(4);
Other Approaches – C++






Templates are really typed macros
Compile time instantiations only
Require source for new instantiations
Type Parameter Bounds Inferred
Good Execution Speed
Bad Code Size
Other Approaches - Java







Type Erasure
No VM modifications
Compile time type checking
No instantiations on primitive types
Execution Speed – Casts
Good Code Size
Type Identity Problems
  List<String> vs. List<Object>
C# .NET Generics Prototype





Prototype .NET runtime is generics aware
All objects carry exact runtime type
Instantiations on reference and value types
Type Parameters bounded by base class and/or interfaces
Runtime performs specialization
C# .NET Generics Prototype

Compile Time Experience
  Separate compilation of generic types
  Instantiations checked at compile time
  Can Instantiate on all types – int, string
Runtime Experience
  Dynamic Type Specialization
  Execution Speed – No Extra Casts
  Code Size – Code Sharing reduces bloat
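The properties listed for the prototype can be illustrated with a sketch. It uses the generics syntax that later shipped in C# 2.0, which may differ in detail from the prototype described here. SimpleStack, Largest, and GenericsDemo are hypothetical names (SimpleStack avoids clashing with the framework's own Stack types).

```csharp
using System;

// Hypothetical generic stack; one definition serves every element type.
class SimpleStack<T>
{
    private T[] data = new T[4];
    private int count;

    public void Push(T item)
    {
        if (count == data.Length) Array.Resize(ref data, count * 2);
        data[count++] = item;
    }

    public T Pop()
    {
        if (count == 0) throw new InvalidOperationException();
        return data[--count];
    }
}

static class Largest
{
    // Type parameter bounded by an interface.
    public static T Of<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }
}

static class GenericsDemo
{
    static void Main()
    {
        SimpleStack<string> ss = new SimpleStack<string>();
        ss.Push("Hello");

        SimpleStack<int> si = new SimpleStack<int>();   // instantiation on a value type
        si.Push(4);                                      // stored as int, no boxing
        // si.Push("x");                                 // rejected at compile time

        Console.WriteLine(ss.Pop());                     // Hello
        Console.WriteLine(si.Pop() + 1);                 // 5
        Console.WriteLine(Largest.Of(3, 7));             // 7

        // Each instantiation keeps its exact runtime type (no erasure).
        Console.WriteLine(typeof(SimpleStack<int>) == typeof(SimpleStack<string>));  // False
    }
}
```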