Mango - Welcome [Savannah]

Mango
A General Purpose Programming
Language
My Background
 Early experience on Apple II
 University of Illinois – Champaign –
Urbana.
 Bachelor's degree in computer
engineering
 Three years at Neoglyphics: software
 Three years at Alpha: hardware
How Mango Got Started
 Frustrated with C/C++/Java
 There had to be a better way
 Foolishly began designing my own
language
 Foolishness can be a virtue
 It’s been a seven year voyage
The Problem
 Common computing infrastructure is
written in C/C++
 C/C++ is inadequate
 Lack of higher level abstractions
 Programmers must use low level constructs
 Results of C/C++ use
 Unsafe/unstable software
 Slower development times
 Higher development costs
A lack of alternatives
 Java, Python, Perl, Pascal, Ada, Modula,
C#
 Not viable replacements
 Lack C’s virtues in performance and flexibility
 Dependent on C for core tasks
 Lack of widespread appeal (clumsy, i.e. Ada?)
 Not sufficiently different to switch
The Solution: Core Goals
 Provide higher level abstractions
 Avoid low level constructs when not needed
 Make programming easier, more enjoyable
 Retain performance and flexibility
 Allow unrestricted operations as necessary
 Avoid overhead
 Match machine execution model
 Overall: make a better experience for
programmers
Overview
 High level design goals and decisions
 Feature walk through
 Future directions
Design Goals
Design goals
 Syntax
 Static Typing vs. Dynamic Typing
 How Vs. What
 Large Languages Vs. Small Languages
 Object Orientation: Yes or No
Goal #1: A Good Syntax
 Syntax is key




It’s underrated (focus on semantics)
Makes the language easier to learn
Makes it accessible to non-programmers
Makes the language self-documenting
 Marketing versus engineering
 Bad marketing of a good product will fail
 Bad syntax around good semantics will have
a harder time gaining acceptance
Static versus Dynamic Typing
 Dynamic languages are very popular
 Due to poor implementations of static
languages
 Advantages of static typing




Critical for performance
Types act as documentation
They catch many errors at compile time
Types allows overloading of names
How versus What
 CS fantasy to forget how and focus on what
 How is a hard problem
 Distinguish features that are theoretically
equivalent but practically different
 Everything can be a list, but it’ll be slow
 Rich set of primitive and aggregate types
 Side effects to use memory more effectively
 Reduce copying
 Manual memory management
 GC is not possible for some applications
 Done right beats a garbage collector
Large Vs. Small Languages
 Small language
 Secondary features are in a standard library
 Advantages: easier to learn core features, make a compiler
 Large language
 First class treatment of secondary features
 Allows specialized operations, makes programs more readable
 Advantage: a smoother user experience
 Ease of learning is dependent on more than language
size
 You still have to learn library API’s
 Intangible quality: how the language corresponds to human
cognition
 Open source makes compilers easier to write
 Open front end acts as the spec
Object-Orientation: Yes or No?
 Stepanov’s criticism
 programming = data structures + algorithms
 multi-sorted algebras
 Object orientation has shown itself useful in
certain circumstances
 Offer OO as an option
 Leave inheritance behind
 Inheritance hierarchies are difficult to follow
 Fragile base class problem requires reanalysis of
class behavior
Mango walk through





















Influences
Syntax plus some basic examples
Module system + incremental compilation (Include requires clause, parameter and options)
(platform and foundation files)
Naming and overloading
Literals (include string format)
Primitive Types
Memory model (include pointer/reference syntax)
Records and Abstracts
Procedures, functions, constants (pure functions?)
Statements: Control flow
Statements: I/O
Abstract Data Type (objects)
Iterators and iteration operators
Mutexes
Exception Handling
Strings, arrays, buffers
Collection Types
Global variables/External symbols/Module Constructors

Calling convention part of the type system
Packet/Device types
Development aids
Genericity













SETL - set operations
ALGOL - imperative block structure and syntax
C - low level ops, low overhead
ML - type inference, type syntax
ADA - fine grained control over primitives
PYTHON - indentation based syntax
JAVA - interfaces
C++ - STL, operator overloading, IO syntax
CLU - iterators
PASCAL - sets (bit masks)
PERL - richness of expressibility
COBOL - readable syntax
SIMULA - objects
Syntax
Mango’s Syntax
 Mango looks like pseudo code
 Indentation based syntax
 Reduces clutter and typing
 Allows more code to be shown on the screen
 Blocks can be enclosed with begin/end delimiters if desired
 All directives, definitions, statements begin with a
keyword
 Simple style that is easy to remember
 User’s symbolic names cannot conflict with reserved words
 This makes the language easy to extend
 There are no exclusively reserved keywords
 Legacy of lex?
Modules and Naming
Modules




Module is a set of symbols
Each file is a module
Module name must correspond to pathname
A modules symbols can be public or private
 Public: symbols are visible to other modules
 Private: symbols are invisible to other modules
 Modules may import and include other modules
 Import: foreign symbols are localized (private)
 Include: foreign symbols are exported (public)
Incremental Compilation
 Mango supports incremental compilation
 Module has changed
 A dependency has changed
 Major or minor revision?
 Compares syntax trees
 Major revision: public change
 Minor revision: private change
 Comparing syntax
 Advantage: Eases implementation
 Disadvantage: Set of major revisions is obviously
larger
Naming
 Naming is not hierarchical, but geometric
 Symbol names exist within a four-d grid
 Namespace, keyword, extension, auxiliary
 Only namespace and keyword are mandatory
 Each module is part of a namespace
 Public symbols use declared namespace
 Private symbols use special namespace “local”
 Format:
 namespace::keyword@extension$auxiliary
Shallow Overloading
 Overloading occurs for every imported or included






module
4d namespace is collapsed into 1d namespace
 Utilizing keyword or extension
Other partial namespaces as well
Symbol can be access using proper name or alias
Ensures all overloaded symbols have a unique name
As a result, all overloading is superficial or shallow
Operator overloading is also supported
Memory Model
Mango’s Memory Model
 Value semantics
 Put stuff on the stack, particularly primitives
 Key for performance
 Offers more flexibility
 Three types of memory
 Static:
 Compiled data, global variables
 Heap items that are never deleted
 Arbitrary:
 Heap items that are eventually deleted
 Local:
 Items that live on the stack
Safe manual memory management
 Static datums
 Always safe to use
 Arbitrary datums
 Need to be guarded to avoid dangling pointer
references
 Local datums
 Compiler must enforce restrictions to avoid
dangling pointer references
Sentries
 Pointer guards are called sentries
 Pointers to arbitrary datums are fat
 One address to the datum on the heap
 One address to the datum’s sentry
 Sentries live on the heap too
 Have a static lifetime (i.e. never deallocated)
 They are very small ~= 5 bytes
Sentry Performance
 When # sentries is small
 Good performance on modern hardware
 Sentries stay in the cache
 Half of the processor’s time is spent waiting
on memory
 As # sentries increases
 cache starts to overflow
 We need to reduce the number of sentries
Arenas and Recycling
 Method #1: Allocate in pools
 A group of datums share a sentry
 Allocated arbitrarily, but deallocated at once
 Method #2: Recycling
 Use static datums instead
 When static datums are deleted
 Initialized to zero
 Stay on the heap until datum of the same type is requested
 Incorrect results are possible, catastrophic failures
are not
 The program cannot break the type system
Literals
Literals
 Definition
 A value which is known at compile time
 Types
 Immediate values
 Numeric primitives and text
 Literal expressions
 Interpreted during compilation
 Parameters
 Values used to configure the compiler
 Options
 User supplied values to customize a build
 Literals can be named
Numeric Literals
 Six types
 Integer, Decimal, Hex, Address, Binary, Signal
 Integers and decimals also include complex plane signifiers
 Can be anonymous
 Anonymous literals are stored as untyped strings
 Converted to a real value when type is known
 There are no constraints on range
 Can be typed
 Type is specified with value
 Converted immediately to the desire type value
 There are no constraints on range
Text Literals
 Two types
 Characters and Strings
 Stored as untyped strings until desired type is known
 Characters enclosed with the back tick
 Text string enclosed with the double quote
 Literals can be inserted into characters and text strings





Named literals
Parameters
Options
Character codes
Character aliases
Literal Expressions
 Two forms of literal expressions
 Normal literals: immediate evaluation
 Macro literals: deferred evaluation
 Macros are evaluated over arguments
 Result value is optionally typed
 Expressions can include
 Condition construct
 Conversion construct
 Local aliasing
Parameters and Options
 Changes cause recompilation of module
 Part of the public interface of the module
 Checked when comparing syntax
 Only options that are used included in
dependency analysis
 Parameters included in dependency
analysis
 Specified by user
 Have a major impact on compilation
Core Types
Core Types
 Primitives
 Tuples and Unions
 Addresses, Pointers, References
 Polymorphic Types
 Strings, Arrays and Buffers
 Collections
 Records and Abstracts
Type Qualifiers
 Pliancy: Immutable, Mutable
 Reactivity: Volatile, Inert
 Duration: Local, Static, Arbitrary
 Memory: IO, Virtual, Physical?
 Useful for embedded system with multiple
memories
 Undecided: means to access hardware
registers directly
Primitives
 Logical
 Bit (2 State), Boolean (3 State), Signal (4 state)
 Ordinal
 Range from 0 to N, where N is user specified
 Character
 ASCII, UTF 8, UTF 16, UTF 32, 8 Bit Data
 Register
 Binary register, user specified dimensions
 Signal
 Signal bus, user specified dimensions
Primitives (cont’d)
 Cardinal
 Unsigned integer, 1/2/4/8/16/32/64 bits
 Subrange
 Signed range, upper/lower bound, default value
 Integer
 Signed integer, 8/16/32/64 bits
 Rational
 Signed rational, fixed or floating denominator
 Decimal
 Fixed point decimal number
 Whole and fractional component
 Number
 Floating point number, specified size
Primitives (cont’d)
 Complex numbers
 Primitives qualified with units
 Enumerations
 Matrices
 Coordinates
Primitive Modifiers
 Conversion
 Automatic, manual, none, universal
 Evaluation
 None, Fixed, Fluid
 Approximation
 Round, Truncate, Conserve
 Overflow
 Check, Limit, Wrap
 Byte Order
 Big, Little, Host, Network
Tuples and Unions
 Anonymous type products
 Fields can be labeled
 Unions
 Each term of product is overlapped
 Unsafe
 Elaborate types decompose into tuples
Addresses, Pointers, References
 Addresses have bitwise resolution
 Address is 2 product tuple
 Upper value: byte count
 Lower value: bit count
 Addresses still optimal
 types with byte-wise alignment will drop bit count
 Pointers
 Address of type where address is significant
 References
 Address of type where type is significant
Polymorphic Types
 Sums
 Superposition of multiple types
 Qualified with type tag to ensure safety
 Handle
 Anonymous pointer (points to anything)
 Anything
 Stores any type value
Strings, Arrays, Buffers
 Arrays
 Dimension type can be customized
 Slices of arrays preserve range information
 Strings
 Array of items from 1 to x
 Dimension type can be customized
 Slices of strings begin at 1
 Buffers
 Fixed length string that wraps around itself
 Varying start and end positions
 Dimension type can be customized
Collections
 Entry

a node of a linked list

a combination of string and pointer
 Segment
 List
 appends new data at the end
 elements allocated in pages
 Stack
 FIFO
 Can be prioritized
 Sequence


inserts new data
elements allocated in pages
 Queue
 LIFO
 Can be prioritized
Collections (cont’d)
 Mask
 A bit mask
 Range
 A numeric range with upper and lower bounds
 Set
 A hashed set
 Doubles as a one-to-one map
 Table
 A hashed one-to-many mapping
 Group
 A special collected used for comparisons
 Graph
 Used for frequency tables
Records
 Three visibility states
 Public: accessible anywhere
 Protected: accessible by modules that declare
access
 Private: accessible within the module
 Two layout types
 Fixed: in order of declaration
 Packed: ordered to reduce record size
 Fields can be qualified to remove them
from the build
Abstracts
 Interface to multiple record types
 Records mapped to abstracts using link directive
 Gives more flexibility in mapping abstracts
 Mappings can occur after declaration
 i.e. library types can still be mapped to abstracts
 Simplifies remapping of fields
 Fields can be qualified to remove them from the
build
 Medley: combines abstract with a group of
records
Abstract Data Types
 Mango can bind any type to a class
 Objects: Classes + Record
 Properties
 getter/setter methods
 Parameters
 Properties that can only be set at instantiation
 Delegates
 Multiple dispatch for aspects and aggregates
 Normal classes do not support inheritence
Object Construction
 Constructor (prelude)
 Destructor (finale)
 New style constructor
 Parameters and properties can be set before
the constructor is called
 Only one constructor per class – no
overloading
 Constructor spans entire object.
 Variable initialization at declaration
Object Interfaces
 Interfaces (mixins)
 Separates sub typing concerns from
implementation
 Interfaces are purely abstract. No
associated code
 Uses link directive like record abstracts
 Bind operator links types to interfaces
automatically
Aggregates
 Composition instead of hierarchical
inheritance
 Flattens the hierarchy: easier to
understand and maintain
 Overriding of methods is explicit
 Compiler can statically check for problems
 All publicly accessible methods must
explicitly exported
Built in I/O Devices
 Mango’s built in I/O devices are typed
 Six types of I/O devices






file: an operating system file
text: a line oriented text file
stream: an I/O stream
port: a connection listener
database: a database
document: a document (XML)
Procedures and Functions
Procedures
 Multiple inputs and output
 Call by localized reference
 Compiler makes a duplicate of a value on the
stack if necessary
 Nested procedures
 Can reference variables within parents scope
 Return values are held in a named
variable
 Possible to return values with assignment
Functions and Constants
 Real functions
 Essentially a single expression
 Extended with two case constructs
 Condition and conversion
 Constants




Similar format to functions
Evaluated only once, result is cached
AKA Eiffel once function
Clean way to instantiate global variables
Procedures and Functions (cont’d)
 Mango supports function/procedure
pointers
 Both global and local procedures
 Invokation through expressions or
statements
 Statement invokations automatically infer
return types
 Labeled arguments possible
Method Pointers
 Method pointers
 combine class state with class method
 When a method is selected from a class
 Result is a method pointer
 Special attribute and receptor pointers
 Automatically evaluated
 Method pointers are a supertype of
function pointers
 Will absorb function pointers
Expressions
Arithmetic
 Addition, Subtraction, Negation, Identity
 Multiplication, Division (Truncate)
 Remainder, Division (Conserve)
 Apply (Exponent), Absolute Value
 Shift Left, Shift Right
Comparison/Logical
 Equal, Not Equal
 Less, Less Equal, Greater, Greater Equal
 Test, Not, And, Or, Exclusive Or, Reduce
Casting
 Static casting
 Type is specified
 Dynamic casting
 Type is infered
 Cast is specified with dynamic cast operator
 Implicit casting
 From subtype to supertype
 Class to interface, Function to method, etc
 Polymorphic types
Membership
 in, not_in
 String, array, buffer, entry, segment
 List, stack, queue, sequence
 Mask, range, set, table, graph
 is, is_not
 Evaluates left value with right function
 Returns boolean
 like, not_like
 Tests left value is equivalent to right hand type
Selections
 Selection of record fields/class methods
 By name
 By number (access to anonymous fields)
 Class fields
 Special attribute operator
 Dynamic method operator
 Looks for class that can operate on target
type, with named method
Array
 A[x] to index an array
 Slice operator
 Before and after some position
 Within a range
 ~=, /= for string comparisons
 Size operator returns size (# elements)
 Domain operator returns dimensions
Production
 Create and initialize data types
 Local stack, static heap, dynamic heap
 Production arguments
 Where clause
 Initialize class properties/record fields
 With clause
 Class constructor arguments
 Aggregate members
 Within/upto
 Dynamic string/array dimensions
 From
 Target arena
Sequence Operators
 Generate sequence
 Pick: select one value
 Every: select group values
 Test sequence
 Some: one term in set matches
 All: all terms in set match
 Compute sequence
 Product: product of terms
 Sum: sum of terms
General Iteration Form
[a of t] in x at h using u over p where z
while q
a in x at h using u over p where z while q
<a,b> in x at h using u over p where z while q
Iterations
 Iterator targets





String, array, buffer, entry, segment
List, stack, queue, sequence
Mask, range, set, table, graph
File, text, stream, database, document
Iterator construct
Statements
Statements Overview
 Sequence of statements
 Nested block delimitation
 Begin/end pair
 Indentation
 Each statement begins with keyword
 Possible to group statements under a single
keyword
 Delimited with indentation or begin/end pair
Overview (cont’d)
 Variables can be named anywhere
 Renaming can occur
 In current block
 In nested blocks
 No goto’s
 Blocks can be labeled
 Blocks can be exited
 Block labels can be used to exit multiple
levels
Overview: Cont’d
 Conditional predication
 For many statements
 Often with alternate evaluation
Basic Blocks
 Execute: nested block
 Exit: exit blocks
 Return: exit procedure and set results
 Repeat: repeating block
 Continue: repeat loop
Production
 Declare: declare local variable
 Allocate: make temporary
 Deallocated when leaving block
 Get: declare static type
 New: declare dynamic type
 Delete: recycle or delete heap type
 Trash: delete static heap type
Placement
 Overlay: place a type over another type
 Create: create a type from string
 Place: place a type at an address
 Map: map a device type
 Open: open an I/O type
 Close: close an I/O type
Assignment
 Assign: copy value
 Optional declared result with inferred type
 Swap: swap two values
 Let: reference alias of value
 Refine: update multiple fields of a record
 Touch: Mark variable as used
 Nothing: Do nothing
Arithmetic
 Reset: reset value to default
 Set: set value to 1
 Incr: Increment by target, else 1
 Decr: Decrement by target, else 1
 Magnify: Multiply by target
 Reduce: Reduce by target
Invokations
 Call: call function with arguments
 Optional declared result with inferred type
 Eval: evaluate arguments with function
 Optional declared result with inferred type
 Initialize: initialize object with arguments
 Finalize: finalize object
 Function: local function
 Procedure: local procedure
Conditional Blocks
 If: If-then-else
 Select: Multiple conditions
 When: Defers compilation
 During: Conditional debug block
 Compiler option removes them
 While: Loop on pre-condition
 Until: Loop on post-condition
Case Constructs
 Where: case of values




Handles multiple matched terms
Case values can include ranges and groups
Character strings have contents compared
Wild cards: Case value can be ignored
 Convert: covert one value into another
 Case values can include ranges and groups
 Character strings have contents compared
Specialized Blocks
 Given: Reduces polymorphic type
 Handles multiple matched terms\
 Wild cards: Case value can be ignored
 Cast: Cast value, Execute Block
 Foreach: Iterate and execute block
 Acquire: Acquire mutex and execute block
Collection Operators
 Add: add item into set
 Replace: exchange item with another
 Remove: remove items from set
 Take: take (arbitrary) item from set
 Trim: reduce set y by set x
 Filter: intersect set y by set x
Collection Iterators
 Find: Return item
 Option to insert and return new item
 Extract: Remove and return item
 Fill: Populate collection with iterator
 Compute: Arithmetic sum over iteration
 Copy: Copy from one set to another
 Modify: Update items within collection
I/O
 Write: write terms to I/O device
 Read: read term from I/O device
 Strings, Buffers, Lists read series of terms
 Continue/Stop masks for character devices
 Print: write to standard out
 Log: Atomic write to I/O device
 Accept: accept stream from listener
 Monitor: monitor group of listeners
Error Handling
 Activate: active exception handler
 Try: trap exceptions with block
 Throw: throw exception
 Require: assertion statement
 Quit: quit program
 Raise: rethrow exception
Iterators
Iterators
 Procedures to iterate over type
 Consists of operand and result type
 Iterator activation:
 By name
 By type (operand and/or result)
 Iteration deactivation
 Result is <stop>
 Result type is tagged
Iterators (cont’d)
 Iterator sections




State: local state while iterator active
Prelude: initialization segment
Query: compute result or signal stop
Finale: finalization segment
 Local state is pushed on stack
Exceptions
Activated Handler Style
 Exception event code contained in global
handler object
 Handler object pushed onto activation
stack
 On exception, handlers are processed
until exception is handled
 Exception handling is immediate, before
stack is unwound
Exceptions (cont’d)
 Four exception flavors:




Fatal: Quit program without unwinding stack
Severe: Quit program but unwind stack
Normal: Unwind stack to last trap
Soft: Program can continue
Advantages
 Exception is processed before unwinding
 No side effects before exception handling
 Exception code can be pushed into
libraries
 Still possible to localize with arguments
passed during activation
 Reduces clutter
 No need for try-catch blocks
 No need for ‘throws’ clauses
Global variables
Global Variables
 Variable: global value
 Dataset: string of values
 Symbol: external (linked) value
 External: external (linked) function
Global Constructors
 Global Initialization/Finalization code
 Executed before and after program
 Contents:
 Local state
 Prelude procedure
 Finale procedure
 Priority value sorts contructors
Misc
Mutexes
 Formal means of acquiring locks
 Mutex object + Acquire statement
 Object selected by name or by type
Mutex Components
 State: local state
 Prelude: initialization
 Acquire: acquire lock
 Retry: reacquire flag
 Release: release lock
 Finale: finalization
Packets
 Specialized record
 Header
 Payload of variable size
 Footer
 Packet type is polymorphic pointer
 Point to packet of any size
Devices
 Used to define set of hardware registers
 Record of records
 Register placement:
 In sequence
 At specified offset
Package
 Used to define group of related functions
 Elements of packages pushed into
extended namespace
 Particularly useful for operators
 Means of reducing namespace clutter
 Means of grouping related generic
operators
Genericity
Genericity
Builds
Builds
 Assemblies, Platforms, and Prologues
Module Suffixes
 Runtime type identification
 Requests types that are used
 Therefore not all types need to be instantiated
Future Directions
Future Directions
 Finish the compiler
 Mango is the first in a family of languages
 Sharing type system and syntax
 Targeting a virtual machine
Application Language
 In the spirit of C# or Java
 Reference semantics
 i.e. no user defined pointers
 Automated memory management
 With user intervention
 Less complicated type system
 More accessible language
Functional Language
 Functional/Dataflow semantics
 Strong focus on doing distributed
computations
 Gateway to bring functional programming
to the masses
 Accessible because of shared syntax and
types
Record Processing Language
 Basically a replacement for COBOL
 “The use of COBOL cripples the mind”
 “working within the limitations of the original
COBOL programming model can make for
robust and maintainable code”
 Overlooked area by language designers
Command Language
 Cross between TCL and LISP
 Can be used as OS shell
 Eliza – MIT LL4 conference
 Classification of Functions into 4 Types:
<G> Generator
<C> Concentrator
<T> Transformer
<F> Filter
 Connected with pipes (aka unix shells)
Miscellaneous
 Standard Development Environment
 Text Editor, Debugger, Static Analysis, Etc.
 Programming is about community
 Programmers Organization




Builds community
Standards body
Funds open source projects
Earns revenue by supplying directory services
The End
Thanks for listening