A Language-Independent Approach to Software

Research Directions
Towards
Software Maintenance, Adaptation
and Modernization
Suman Roychoudhury
Grammar-based Software Technology Spring 2006 Programming Lang. Seminar
1
Maintaining Legacy Code –
Tool Support ?
Legacy
Commercial + Scientific
Applications
Cobol
200 Billion Lines of
Cobol Source
Fortran
Object Pascal
Software
Migration is not an
easy task !
Infrastructure, money
effort already spent
2
Current Techniques and Tools





OMG MDA - Model Driven Architecture
OMG KDM - Knowledge Discovery
Metamodel
Aspect-Oriented Software Development
Software Refactoring/Mining/Migration
Eclipse IDE/ AspectJ
3
Aspect Weavers: Current Limitations
Several tools and language extensions have been
developed to enable aspect weaving :-
However –
1.
Specific to a few popular languages (e.g., Java)
2.
New tools are developed from scratch without
preserving knowledge of previous construction
3.
No emphasis on re-usability in a different language
and platform context
4
Research Questions

Is there a technique to construct aspect weavers for
legacy languages without extending or inventing a
new parser (or compiler) from scratch?

Can such construction be supported in a more
generic or language-parametric (independent) way?

Can the knowledge of building a weaver from a
previous construction be reused in a different
language and platform context?
5
Challenges in Language-Parametric
Legacy Adoption
Challenge C1 – The Parser Construction Problem
Challenge C2 – Type Analysis Problem
Challenge C3 – The Code Transformation Problem
Challenge C4 – Language-Parametric Generalization of
Transformation Objectives
Challenge C5 – Accidental Complexities of Transformation
Specifications
6
Challenge C1 – The Parser Construction
Problem




Scalabilty
Extensibility
Availabilty of AST
Framework
Reusabilty
Transformation
System
Parser
Concrete
Syntax
Intermediate
Type
Analysis
Legacy System







C++
Object Pascal
Fortan
Cobol
Ada
Jovial
Intermediate
Syntax
User Inputs
Aspect Language
VHDL
7
Challenge C2 – Type Analysis Problem

Interaction of aspects
with Type System (e.g.
computing sub-types
for a given type)
Transformation
System
Parser
Intermediate
Syntax
Concrete
Syntax
Intermediate
Type
Analysis
Legacy System
User Inputs
Aspect Language
8
Challenge C3 – The Code Transformation
Problem
Transformation
System
Intermediate
Syntax
pattern probe_id_pattern1()
: unqualified_id = " printf ".
rule insert_probe1
(s: statement_seq):
function_body
->
Legacy System
function_body
= " { \s } " ->
"{ \probe_id_pattern1\(\)
(\"Entering Method…\");
{ \s }
}".
Parser
Concrete
Syntax
Intermediate
Type
Analysis
User Inputs
Aspect Language
9
Challenge C4 – Language-Parametric
Generalization of Transformation Objectives
Transformation
System
Intermediate
Syntax
pattern probe_id_pattern1()
: IUnqualifiedId = "IFunctionName".
rule insert_generic_probe1
(s: IStatementList):
IFunctionDecl
->
Legacy System
IFunctionDecl
= " \Block_Begin
\s
\Block_End"
->
"\Block_Begin
\probe_id_pattern1\(\)
(\"Entering Method…\");
\Block_Begin \s \Block_End
\Block_End ".
Parser
Concrete
Syntax
Intermediate
Type
Analysis
User Inputs
Aspect Language
10
Challenge C5 – Accidental Complexities
of Transformation Specifications

A high level aspect language
layered on top of the base system

Translates to low level
specifications
aspect insert_probe1 {
before():
execution
(function_declaration)
{
// insert print
statement
}
}
Transformation
System
Parser
Intermediate
Syntax
Concrete
Syntax
Intermediate
Type
Analysis
Legacy System
User Inputs
Aspect Language
11
Goal Statement
Goal 1 : Raise the level of
abstraction (aspect layering)
for end-programmers (C5)
Goal 2 : Re-use core transformations across
language domains using language
parametric transforms (C2, C4)
Program
Transformation
System
the
level
of
abstraction and reuse the
core
High
Lowlevel
Levellanguage
xForm
constructs
(C1,C3)
Legacy Software
Increase
Parametric
Transforms
transformation
Aspect Weavers
functions across language
domains
We use a commercial PTE from
Semantic Designs
(the Design Maintenance System - DMS)
12
Current State-of-the-Art:
Rewriting Techniques

object-based transforms, such as a visitor object applied
to an object model

intermediate representations that permit primitive
transformations to be applied to a set of languages (e.g.,
.Net CodeDOM)

XML-based transforms that use an XML DOM structure

term rewriting, such as a transformation rule
13
Current State-of-the-Art:
SourceWeave.Net

Uses Microsoft’s CodeDom
Architecture as the underlying
AST Representation

Uses XML descriptor to
specify interaction between
aspects and .Net components

Limited availability of
CodeDOM providers

CodeDOM biased towards
C#, hence lack of
expressiveness for other
languages

Verbosity of XML representations
a serious concern for scalability
http:// aosd.net/2004/archive/JacksonPoster.pdf
14
Current State-of-the-Art:
Weave.Net

Uses existing .Net Binary
Component and cross-cutting
specifications in an xml file as input

Advice is given separately in
another .Net assembly

Limited to applications
hosted within .Net

Aspect specification is difficult to
comprehend due to lack of
expressiveness of XML constructs
http://www.dsg.cs.tcd.ie/uploads/category144/119.pdf
15
Current State-of-the-Art:
Aspect Cobol

Parse trees are exported to XML

Weaving boils down to XML DOM tree
processing in Java.

XML representation of source
code (intermediate form) may not
scale with code size

It has been reported that such
internal representations are 50
times larger and much slower to
transform [Ger01]

Limited only to COBOL i.e., a
more generalized approach is
missing
Roy Germon, “Using XML as an Intermediate Form for Compiler Development,” XML Conference and Exposition, Orlando, FL, December 2001.
16
http://homepages.cwi.nl/~ralf/AspectCobol/
Current State-of-the-Art:
Aspierce + GCC 4.0

Proposes to use GCC 4.0
GENERIC trees

Weaving is done by
modifying the GENERIC
AST representations

Lack of support for .Net
Languages like C#, VB.Net and
other languages like Object
Pascal, Cobol etc

Still in a proposal stage, not clear
whether API support is available
for modifying GENERIC trees

Very little documentation
available from project site
http://users.ugent.be/~badams/publications/2005/language-independentaspectweaving.pdf
17
Background:
Term Rewriting Based Program Transformation

Term rewriting is a paradigm that is used in fields such as program
transformation and theorem proving

In term rewriting, rules define a refinement to a structure by
specifying a pattern to be matched and the resulting effect (closer to
logic and algebraic programming)
Example:
private rule
refine_expression_add(e1:expression@InfixArithmetic,
e2:expression@InfixArithmetic):
expression -> expression
= "+ \e1\:expression \e2\:expression" -> "(\e1) + (\e2)".
18
Genericity of Aspect Transformations (Object Pascal)
– Database Error Handler Synchronization
Object Pascal Source File
………
function
TExNullField.Handle
(ServerType: TServerType;
E: EDBEngineError): Integer;
begin
TExHandleCollection
(Collection).LockHandle;
try
<database error handling code
omitted here>
finally
TExHandleCollection
(Collection).UnLockHandle;
end;
end;
………
DMS Rewrite Rule
……
rule sync_OP_meths
(sl:stmt_list, id:IDENTIFIER,
fps: formal_params,
frt:func_result_type):
qual_func_header_decl 
qual_func_header_decl =
"function \method_id\(\id\)
\fps : \frt ;
begin \sl end;" 
"function \method_id\(\id\)
\fps : \frt ;
begin \LockStmt\(\);
try \sl
finally
\UnLockStmt\(\);
end;“
19
Genericity of Aspect Transformations ( Java)
– Database Error Handler Synchronization
Java Source File
………
public Integer HandleDBError(
TServerType ServerType,
EDBEngineError E)
{
ErrHandlerCollction
. LockHandle(Collection);
try {
<database error handling code
omitted here>
finally {
ErrHandlerCollection
. UnLockHandle(Collection);
}
}
DMS Rewrite Rule
……
rule sync_Java_meths
(s_seq:stmt_seq,m_mods:
meth_modifiers,
t:type,id:IDENTIFIER,
fp:fparams) :
method_declaration 
method_declaration =
"\m_mods \type
\method_id\(\id\) \fp
{ \s_seq } ;" 
"\m_mods \type
\method_id\(\id\) \fp
{ \LockStmt\(\);
try { \s_seq }
finally {
\UnLockStmt\(\) }
};”
20
Transformation Map
DMS Rewrite Rule for Java
DMS Rewrite Rule for Object Pascal
……
rule sync_Java_meths
(s_seq:stmt_seq,m_mods:
meth_modifiers,
t:type,id:IDENTIFIER,
fp:fparams) :
method_declaration 
method_declaration =
……
rule sync_OP_meths
(sl:stmt_list, id:IDENTIFIER,
fps: formal_params,
frt:func_result_type):
"\m_mods \type
\method_id\(\id\) \fp
{ \s_seq } ;" 
"\m_mods \type
\method_id\(\id\) \fp
{ \LockStmt\(\);
try { \s_seq }
finally {
\UnLockStmt\(\) }
};”
"function \method_id\(\id\)
\fps : \frt ;
begin \sl end;" 
"function \method_id\(\id\)
\fps : \frt ;
begin \LockStmt\(\);
try \sl
finally
\UnLockStmt\(\);
end;“
qual_func_header_decl 
qual_func_header_decl =
21
Language-Parametric Synchronization
Rule – Intermediate AST
Mapping to Intermediate AST
st_list@iAst  stml_list@OP
ID_list@iAst  id@OP
param_list@iAst 
paramformal_params@OP
ret_type@iAst 
func_result_type@OP
meth_signature@iAst 
qual_func_header_decl@OP
---------------------------st_list@iAst  stmt_seq@Java
ID_list@iAst  id@Java
param_list@iAst 
fparams@Java
ret_type@iAst  type@Java
meth_signature@iAst 
method_declaration@Java
Generic Transformation Rule
rule sync_generic (sL:st_list,
i:ID_list, pl:
param_list, rt:ret_type):
std_func_decl ->
std_fun_decl =
"meth_signature(\id\,\pl\,\rt)
\std_start
\sL
\std_end" ->
"meth_signature(\id\,\pl\,\rt)
\std_start
\LockStmt\(\);
\try_finally_decl
\sL
\UnLockStmt\(\);
\std_end“
22
System Workflow
Input
Source
File
Lexer/
Parser
AST
Graph
Symbol
Table
Intermediate
AST Reader
parser
definitions
Lexer/
Parser
AST
Graph
Pattern
Transformer
Analyzer
Attribute
Evaluator
Domain
Reader
Transformation
Engine
Generic
Transforms
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
Phase 1
Phase 2
23
GenAWeave in Action
Input
Source
File
Lexer/
Parser
AST
Graph
Symbol
Table
Intermediate
AST Reader
parser
definitions
Lexer/
Parser
AST
Graph
Pattern
Transformer
Analyzer
Attribute
Evaluator
Domain
Reader
Transformation
Engine
Generic
Transforms
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
Phase 1
Phase 2
24
GenAWeave in Action
Input
Source
File
Lexer/
Parser
AST
Graph
Symbol
Table
Intermediate
AST Reader
parser
definitions
Lexer/
Parser
AST
Graph
Pattern
Transformer
Analyzer
Attribute
Evaluator
Domain
Reader
Transformation
Engine
Generic
Transforms
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
Phase 1
Phase 2
25
GenAWeave in Action
Input
Source
File
Lexer/
Parser
AST
Graph
Symbol
Table
Intermediate
AST Reader
parser
definitions
Lexer/
Parser
AST
Graph
Pattern
Transformer
Analyzer
Attribute
Evaluator
Domain
Reader
Transformation
Engine
Generic
Transforms
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
Phase 1
Phase 2
26
GenAWeave in Action
Input
Source
File
Lexer/
Parser
AST
Graph
Symbol
Table
Intermediate
AST Reader
parser
definitions
Lexer/
Parser
AST
Graph
Pattern
Transformer
Analyzer
Attribute
Evaluator
Domain
Reader
Transformation
Engine
Generic
Transforms
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
Phase 1
Phase 2
27
GenAWeave in Action
Input
Source
File
Lexer/
Parser
AST
Graph
Symbol
Table
Intermediate
AST Reader
parser
definitions
Lexer/
Parser
AST
Graph
Pattern
Transformer
Analyzer
Attribute
Evaluator
Domain
Reader
Transformation
Engine
Generic
Transforms
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
Phase 1
Phase 2
28
GenAWeave in Action
Input
Source
File
Lexer/
Parser
AST
Graph
Symbol
Table
Intermediate
AST Reader
parser
definitions
Lexer/
Parser
AST
Graph
Pattern
Transformer
Analyzer
Attribute
Evaluator
Domain
Reader
Transformation
Engine
Generic
Transforms
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
Phase 1
Phase 2
29
GenAWeave in Action
Input
Source
File
Lexer/
Parser
AST
Graph
Symbol
Table
Intermediate
AST Reader
parser
definitions
Lexer/
Parser
AST
Graph
Pattern
Transformer
Analyzer
Attribute
Evaluator
Domain
Reader
Transformation
Engine
Generic
Transforms
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
Phase 1
Phase 2
30
GenAWeave in Action
Input
Source
File
C4
Lexer/
Parser
Symbol
Table
C1
Intermediate
AST Reader
AST
Graph
parser
definitions
C4
Lexer/
Parser
C2
AST
Graph
Pattern
Transformer
C2
Analyzer
Attribute
Evaluator
C2
Domain
Reader
Transformation
Engine
Generic
Transforms
C4
unparser definitions
Core
Transforms
Pretty
Printer
Revised
Source
File
Transformation
Engine
C3
Phase 1
C1
Phase 2
31
Aspect Layering – Challenge C5



Remove accidental complexities arising from term rewriting
Provide an easy customized aspect language using the
Join Point Model (JPM) adapted from AspectJ
The formalizations of transformation rules showed the
relationship between the JPM model of an aspect language
with respect to rewrite rules
aspect generic_synchronize {
before(): execution (mname@meth_signature) {
// insert base language specific lock statement
}
after(): execution (mname@meth_signature) {
// insert base language specific unlock statement
}
32
Object Management Group(OMG)
ASTM –Abstract Syntax Tree Meta-model
“ It has long been observed that although there are
many differences between the statements in many
programming languages there is a very large set of
statements that are common across most languages.
The Object Management Group (OMG) Architecture
Driven Modernization Task Force has issued an RFP
for the Abstract Syntax Tree Meta Modeling standard.
The ASTM seeks to establish a single comprehensive
set of modeling elements for capturing how many
software languages represent the same software
language constructs and concepts…”
33
34
35
36
37
38
39
40
41
42
43
44