Mango A General Purpose Programming Language My Background Early experience on Apple II University of Illinois – Champaign – Urbana. Bachelor's degree in computer engineering Three years at Neoglyphics: software Three years at Alpha: hardware How Mango Got Started Frustrated with C/C++/Java There had to be a better way Foolishly began designing my own language Foolishness can be a virtue It’s been a seven year voyage The Problem Common computing infrastructure is written in C/C++ C/C++ is inadequate Lack of higher level abstractions Programmers must use low level constructs Results of C/C++ use Unsafe/unstable software Slower development times Higher development costs A lack of alternatives Java, Python, Perl, Pascal, Ada, Modula, C# Not viable replacements Lack C’s virtues in performance and flexibility Dependent on C for core tasks Lack of widespread appeal (clumsy, i.e. Ada?) Not sufficiently different to switch The Solution: Core Goals Provide higher level abstractions Avoid low level constructs when not needed Make programming easier, more enjoyable Retain performance and flexibility Allow unrestricted operations as necessary Avoid overhead Match machine execution model Overall: make a better experience for programmers Overview High level design goals and decisions Feature walk through Future directions Design Goals Design goals Syntax Static Typing vs. Dynamic Typing How Vs. What Large Languages Vs. Small Languages Object Orientation: Yes or No Goal #1: A Good Syntax Syntax is key It’s underrated (focus on semantics) Makes the language easier to learn Makes it accessible to non-programmers Makes the language self-documenting Marketing versus engineering Bad marketing of a good product will fail Bad syntax around good semantics will have a harder time gaining acceptance Static versus Dynamic Typing Dynamic languages are very popular Due to poor implementations of static languages Advantages of static typing Critical for performance Types act as documentation They catch many errors at compile time Types allows overloading of names How versus What CS fantasy to forget how and focus on what How is a hard problem Distinguish features that are theoretically equivalent but practically different Everything can be a list, but it’ll be slow Rich set of primitive and aggregate types Side effects to use memory more effectively Reduce copying Manual memory management GC is not possible for some applications Done right beats a garbage collector Large Vs. Small Languages Small language Secondary features are in a standard library Advantages: easier to learn core features, make a compiler Large language First class treatment of secondary features Allows specialized operations, makes programs more readable Advantage: a smoother user experience Ease of learning is dependent on more than language size You still have to learn library API’s Intangible quality: how the language corresponds to human cognition Open source makes compilers easier to write Open front end acts as the spec Object-Orientation: Yes or No? Stepanov’s criticism programming = data structures + algorithms multi-sorted algebras Object orientation has shown itself useful in certain circumstances Offer OO as an option Leave inheritance behind Inheritance hierarchies are difficult to follow Fragile base class problem requires reanalysis of class behavior Mango walk through Influences Syntax plus some basic examples Module system + incremental compilation (Include requires clause, parameter and options) (platform and foundation files) Naming and overloading Literals (include string format) Primitive Types Memory model (include pointer/reference syntax) Records and Abstracts Procedures, functions, constants (pure functions?) Statements: Control flow Statements: I/O Abstract Data Type (objects) Iterators and iteration operators Mutexes Exception Handling Strings, arrays, buffers Collection Types Global variables/External symbols/Module Constructors Calling convention part of the type system Packet/Device types Development aids Genericity SETL - set operations ALGOL - imperative block structure and syntax C - low level ops, low overhead ML - type inference, type syntax ADA - fine grained control over primitives PYTHON - indentation based syntax JAVA - interfaces C++ - STL, operator overloading, IO syntax CLU - iterators PASCAL - sets (bit masks) PERL - richness of expressibility COBOL - readable syntax SIMULA - objects Syntax Mango’s Syntax Mango looks like pseudo code Indentation based syntax Reduces clutter and typing Allows more code to be shown on the screen Blocks can be enclosed with begin/end delimiters if desired All directives, definitions, statements begin with a keyword Simple style that is easy to remember User’s symbolic names cannot conflict with reserved words This makes the language easy to extend There are no exclusively reserved keywords Legacy of lex? Modules and Naming Modules Module is a set of symbols Each file is a module Module name must correspond to pathname A modules symbols can be public or private Public: symbols are visible to other modules Private: symbols are invisible to other modules Modules may import and include other modules Import: foreign symbols are localized (private) Include: foreign symbols are exported (public) Incremental Compilation Mango supports incremental compilation Module has changed A dependency has changed Major or minor revision? Compares syntax trees Major revision: public change Minor revision: private change Comparing syntax Advantage: Eases implementation Disadvantage: Set of major revisions is obviously larger Naming Naming is not hierarchical, but geometric Symbol names exist within a four-d grid Namespace, keyword, extension, auxiliary Only namespace and keyword are mandatory Each module is part of a namespace Public symbols use declared namespace Private symbols use special namespace “local” Format: namespace::keyword@extension$auxiliary Shallow Overloading Overloading occurs for every imported or included module 4d namespace is collapsed into 1d namespace Utilizing keyword or extension Other partial namespaces as well Symbol can be access using proper name or alias Ensures all overloaded symbols have a unique name As a result, all overloading is superficial or shallow Operator overloading is also supported Memory Model Mango’s Memory Model Value semantics Put stuff on the stack, particularly primitives Key for performance Offers more flexibility Three types of memory Static: Compiled data, global variables Heap items that are never deleted Arbitrary: Heap items that are eventually deleted Local: Items that live on the stack Safe manual memory management Static datums Always safe to use Arbitrary datums Need to be guarded to avoid dangling pointer references Local datums Compiler must enforce restrictions to avoid dangling pointer references Sentries Pointer guards are called sentries Pointers to arbitrary datums are fat One address to the datum on the heap One address to the datum’s sentry Sentries live on the heap too Have a static lifetime (i.e. never deallocated) They are very small ~= 5 bytes Sentry Performance When # sentries is small Good performance on modern hardware Sentries stay in the cache Half of the processor’s time is spent waiting on memory As # sentries increases cache starts to overflow We need to reduce the number of sentries Arenas and Recycling Method #1: Allocate in pools A group of datums share a sentry Allocated arbitrarily, but deallocated at once Method #2: Recycling Use static datums instead When static datums are deleted Initialized to zero Stay on the heap until datum of the same type is requested Incorrect results are possible, catastrophic failures are not The program cannot break the type system Literals Literals Definition A value which is known at compile time Types Immediate values Numeric primitives and text Literal expressions Interpreted during compilation Parameters Values used to configure the compiler Options User supplied values to customize a build Literals can be named Numeric Literals Six types Integer, Decimal, Hex, Address, Binary, Signal Integers and decimals also include complex plane signifiers Can be anonymous Anonymous literals are stored as untyped strings Converted to a real value when type is known There are no constraints on range Can be typed Type is specified with value Converted immediately to the desire type value There are no constraints on range Text Literals Two types Characters and Strings Stored as untyped strings until desired type is known Characters enclosed with the back tick Text string enclosed with the double quote Literals can be inserted into characters and text strings Named literals Parameters Options Character codes Character aliases Literal Expressions Two forms of literal expressions Normal literals: immediate evaluation Macro literals: deferred evaluation Macros are evaluated over arguments Result value is optionally typed Expressions can include Condition construct Conversion construct Local aliasing Parameters and Options Changes cause recompilation of module Part of the public interface of the module Checked when comparing syntax Only options that are used included in dependency analysis Parameters included in dependency analysis Specified by user Have a major impact on compilation Core Types Core Types Primitives Tuples and Unions Addresses, Pointers, References Polymorphic Types Strings, Arrays and Buffers Collections Records and Abstracts Type Qualifiers Pliancy: Immutable, Mutable Reactivity: Volatile, Inert Duration: Local, Static, Arbitrary Memory: IO, Virtual, Physical? Useful for embedded system with multiple memories Undecided: means to access hardware registers directly Primitives Logical Bit (2 State), Boolean (3 State), Signal (4 state) Ordinal Range from 0 to N, where N is user specified Character ASCII, UTF 8, UTF 16, UTF 32, 8 Bit Data Register Binary register, user specified dimensions Signal Signal bus, user specified dimensions Primitives (cont’d) Cardinal Unsigned integer, 1/2/4/8/16/32/64 bits Subrange Signed range, upper/lower bound, default value Integer Signed integer, 8/16/32/64 bits Rational Signed rational, fixed or floating denominator Decimal Fixed point decimal number Whole and fractional component Number Floating point number, specified size Primitives (cont’d) Complex numbers Primitives qualified with units Enumerations Matrices Coordinates Primitive Modifiers Conversion Automatic, manual, none, universal Evaluation None, Fixed, Fluid Approximation Round, Truncate, Conserve Overflow Check, Limit, Wrap Byte Order Big, Little, Host, Network Tuples and Unions Anonymous type products Fields can be labeled Unions Each term of product is overlapped Unsafe Elaborate types decompose into tuples Addresses, Pointers, References Addresses have bitwise resolution Address is 2 product tuple Upper value: byte count Lower value: bit count Addresses still optimal types with byte-wise alignment will drop bit count Pointers Address of type where address is significant References Address of type where type is significant Polymorphic Types Sums Superposition of multiple types Qualified with type tag to ensure safety Handle Anonymous pointer (points to anything) Anything Stores any type value Strings, Arrays, Buffers Arrays Dimension type can be customized Slices of arrays preserve range information Strings Array of items from 1 to x Dimension type can be customized Slices of strings begin at 1 Buffers Fixed length string that wraps around itself Varying start and end positions Dimension type can be customized Collections Entry a node of a linked list a combination of string and pointer Segment List appends new data at the end elements allocated in pages Stack FIFO Can be prioritized Sequence inserts new data elements allocated in pages Queue LIFO Can be prioritized Collections (cont’d) Mask A bit mask Range A numeric range with upper and lower bounds Set A hashed set Doubles as a one-to-one map Table A hashed one-to-many mapping Group A special collected used for comparisons Graph Used for frequency tables Records Three visibility states Public: accessible anywhere Protected: accessible by modules that declare access Private: accessible within the module Two layout types Fixed: in order of declaration Packed: ordered to reduce record size Fields can be qualified to remove them from the build Abstracts Interface to multiple record types Records mapped to abstracts using link directive Gives more flexibility in mapping abstracts Mappings can occur after declaration i.e. library types can still be mapped to abstracts Simplifies remapping of fields Fields can be qualified to remove them from the build Medley: combines abstract with a group of records Abstract Data Types Mango can bind any type to a class Objects: Classes + Record Properties getter/setter methods Parameters Properties that can only be set at instantiation Delegates Multiple dispatch for aspects and aggregates Normal classes do not support inheritence Object Construction Constructor (prelude) Destructor (finale) New style constructor Parameters and properties can be set before the constructor is called Only one constructor per class – no overloading Constructor spans entire object. Variable initialization at declaration Object Interfaces Interfaces (mixins) Separates sub typing concerns from implementation Interfaces are purely abstract. No associated code Uses link directive like record abstracts Bind operator links types to interfaces automatically Aggregates Composition instead of hierarchical inheritance Flattens the hierarchy: easier to understand and maintain Overriding of methods is explicit Compiler can statically check for problems All publicly accessible methods must explicitly exported Built in I/O Devices Mango’s built in I/O devices are typed Six types of I/O devices file: an operating system file text: a line oriented text file stream: an I/O stream port: a connection listener database: a database document: a document (XML) Procedures and Functions Procedures Multiple inputs and output Call by localized reference Compiler makes a duplicate of a value on the stack if necessary Nested procedures Can reference variables within parents scope Return values are held in a named variable Possible to return values with assignment Functions and Constants Real functions Essentially a single expression Extended with two case constructs Condition and conversion Constants Similar format to functions Evaluated only once, result is cached AKA Eiffel once function Clean way to instantiate global variables Procedures and Functions (cont’d) Mango supports function/procedure pointers Both global and local procedures Invokation through expressions or statements Statement invokations automatically infer return types Labeled arguments possible Method Pointers Method pointers combine class state with class method When a method is selected from a class Result is a method pointer Special attribute and receptor pointers Automatically evaluated Method pointers are a supertype of function pointers Will absorb function pointers Expressions Arithmetic Addition, Subtraction, Negation, Identity Multiplication, Division (Truncate) Remainder, Division (Conserve) Apply (Exponent), Absolute Value Shift Left, Shift Right Comparison/Logical Equal, Not Equal Less, Less Equal, Greater, Greater Equal Test, Not, And, Or, Exclusive Or, Reduce Casting Static casting Type is specified Dynamic casting Type is infered Cast is specified with dynamic cast operator Implicit casting From subtype to supertype Class to interface, Function to method, etc Polymorphic types Membership in, not_in String, array, buffer, entry, segment List, stack, queue, sequence Mask, range, set, table, graph is, is_not Evaluates left value with right function Returns boolean like, not_like Tests left value is equivalent to right hand type Selections Selection of record fields/class methods By name By number (access to anonymous fields) Class fields Special attribute operator Dynamic method operator Looks for class that can operate on target type, with named method Array A[x] to index an array Slice operator Before and after some position Within a range ~=, /= for string comparisons Size operator returns size (# elements) Domain operator returns dimensions Production Create and initialize data types Local stack, static heap, dynamic heap Production arguments Where clause Initialize class properties/record fields With clause Class constructor arguments Aggregate members Within/upto Dynamic string/array dimensions From Target arena Sequence Operators Generate sequence Pick: select one value Every: select group values Test sequence Some: one term in set matches All: all terms in set match Compute sequence Product: product of terms Sum: sum of terms General Iteration Form [a of t] in x at h using u over p where z while q a in x at h using u over p where z while q <a,b> in x at h using u over p where z while q Iterations Iterator targets String, array, buffer, entry, segment List, stack, queue, sequence Mask, range, set, table, graph File, text, stream, database, document Iterator construct Statements Statements Overview Sequence of statements Nested block delimitation Begin/end pair Indentation Each statement begins with keyword Possible to group statements under a single keyword Delimited with indentation or begin/end pair Overview (cont’d) Variables can be named anywhere Renaming can occur In current block In nested blocks No goto’s Blocks can be labeled Blocks can be exited Block labels can be used to exit multiple levels Overview: Cont’d Conditional predication For many statements Often with alternate evaluation Basic Blocks Execute: nested block Exit: exit blocks Return: exit procedure and set results Repeat: repeating block Continue: repeat loop Production Declare: declare local variable Allocate: make temporary Deallocated when leaving block Get: declare static type New: declare dynamic type Delete: recycle or delete heap type Trash: delete static heap type Placement Overlay: place a type over another type Create: create a type from string Place: place a type at an address Map: map a device type Open: open an I/O type Close: close an I/O type Assignment Assign: copy value Optional declared result with inferred type Swap: swap two values Let: reference alias of value Refine: update multiple fields of a record Touch: Mark variable as used Nothing: Do nothing Arithmetic Reset: reset value to default Set: set value to 1 Incr: Increment by target, else 1 Decr: Decrement by target, else 1 Magnify: Multiply by target Reduce: Reduce by target Invokations Call: call function with arguments Optional declared result with inferred type Eval: evaluate arguments with function Optional declared result with inferred type Initialize: initialize object with arguments Finalize: finalize object Function: local function Procedure: local procedure Conditional Blocks If: If-then-else Select: Multiple conditions When: Defers compilation During: Conditional debug block Compiler option removes them While: Loop on pre-condition Until: Loop on post-condition Case Constructs Where: case of values Handles multiple matched terms Case values can include ranges and groups Character strings have contents compared Wild cards: Case value can be ignored Convert: covert one value into another Case values can include ranges and groups Character strings have contents compared Specialized Blocks Given: Reduces polymorphic type Handles multiple matched terms\ Wild cards: Case value can be ignored Cast: Cast value, Execute Block Foreach: Iterate and execute block Acquire: Acquire mutex and execute block Collection Operators Add: add item into set Replace: exchange item with another Remove: remove items from set Take: take (arbitrary) item from set Trim: reduce set y by set x Filter: intersect set y by set x Collection Iterators Find: Return item Option to insert and return new item Extract: Remove and return item Fill: Populate collection with iterator Compute: Arithmetic sum over iteration Copy: Copy from one set to another Modify: Update items within collection I/O Write: write terms to I/O device Read: read term from I/O device Strings, Buffers, Lists read series of terms Continue/Stop masks for character devices Print: write to standard out Log: Atomic write to I/O device Accept: accept stream from listener Monitor: monitor group of listeners Error Handling Activate: active exception handler Try: trap exceptions with block Throw: throw exception Require: assertion statement Quit: quit program Raise: rethrow exception Iterators Iterators Procedures to iterate over type Consists of operand and result type Iterator activation: By name By type (operand and/or result) Iteration deactivation Result is <stop> Result type is tagged Iterators (cont’d) Iterator sections State: local state while iterator active Prelude: initialization segment Query: compute result or signal stop Finale: finalization segment Local state is pushed on stack Exceptions Activated Handler Style Exception event code contained in global handler object Handler object pushed onto activation stack On exception, handlers are processed until exception is handled Exception handling is immediate, before stack is unwound Exceptions (cont’d) Four exception flavors: Fatal: Quit program without unwinding stack Severe: Quit program but unwind stack Normal: Unwind stack to last trap Soft: Program can continue Advantages Exception is processed before unwinding No side effects before exception handling Exception code can be pushed into libraries Still possible to localize with arguments passed during activation Reduces clutter No need for try-catch blocks No need for ‘throws’ clauses Global variables Global Variables Variable: global value Dataset: string of values Symbol: external (linked) value External: external (linked) function Global Constructors Global Initialization/Finalization code Executed before and after program Contents: Local state Prelude procedure Finale procedure Priority value sorts contructors Misc Mutexes Formal means of acquiring locks Mutex object + Acquire statement Object selected by name or by type Mutex Components State: local state Prelude: initialization Acquire: acquire lock Retry: reacquire flag Release: release lock Finale: finalization Packets Specialized record Header Payload of variable size Footer Packet type is polymorphic pointer Point to packet of any size Devices Used to define set of hardware registers Record of records Register placement: In sequence At specified offset Package Used to define group of related functions Elements of packages pushed into extended namespace Particularly useful for operators Means of reducing namespace clutter Means of grouping related generic operators Genericity Genericity Builds Builds Assemblies, Platforms, and Prologues Module Suffixes Runtime type identification Requests types that are used Therefore not all types need to be instantiated Future Directions Future Directions Finish the compiler Mango is the first in a family of languages Sharing type system and syntax Targeting a virtual machine Application Language In the spirit of C# or Java Reference semantics i.e. no user defined pointers Automated memory management With user intervention Less complicated type system More accessible language Functional Language Functional/Dataflow semantics Strong focus on doing distributed computations Gateway to bring functional programming to the masses Accessible because of shared syntax and types Record Processing Language Basically a replacement for COBOL “The use of COBOL cripples the mind” “working within the limitations of the original COBOL programming model can make for robust and maintainable code” Overlooked area by language designers Command Language Cross between TCL and LISP Can be used as OS shell Eliza – MIT LL4 conference Classification of Functions into 4 Types: <G> Generator <C> Concentrator <T> Transformer <F> Filter Connected with pipes (aka unix shells) Miscellaneous Standard Development Environment Text Editor, Debugger, Static Analysis, Etc. Programming is about community Programmers Organization Builds community Standards body Funds open source projects Earns revenue by supplying directory services The End Thanks for listening
© Copyright 2025 ExpyDoc