Translation of Tree-processing Programs
into Stream-processing Programs
based on Ordered Linear Type
Koichi Kodama (Tokyo Institute of Technology)
Kohei Suenaga (University of Tokyo)
Naoki Kobayashi (Tohoku University)
Two main methods of XML processing
Tree-processing
Views XML documents as tree structures
Manipulates tree structure of input data
Trees are kept on memory in many implementations
e.g., DOM API, XDuce, CDuce
Stream-processing
Views XML documents as stream of tokens
Read/Write tokens from/to streams
e.g., SAX
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
2
Tree-processing
node
leaf
node
3
4
leaf 3
leaf 2
memory
leaf
Programmers
write here
disk, network
<node>
<leaf>2</leaf>
<leaf>3</leaf>
</node>
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
<node>
<leaf>3</leaf>
<leaf>4</leaf>
</node>
3
Stream-processing
memory
disk, network
Programmers
write here <node>
<node>
<leaf>2</leaf>
<leaf>3</leaf>
</node>
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
<leaf>3</leaf>
<leaf>4</leaf>
</node>
4
Example of programs in each style
Incrementing the value of each leaf
Tree-processing
fix f. λt.
case t of
leaf x → leaf (x + 1)
| node x1 x2 → node (f x1) (f x2)
Stream-processing
fix f. λt.
case read() of
leaf → write(leaf); write(read() + 1)
| node → write(node); f (); f ()
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
5
Comparison of two styles
Tree-processing
Stream-processing
Readability, writability
better
worse
Memory efficiency
worse
better
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
6
Our goal
Taking the best of both world
Readability and writability of tree-processing
High memory efficiency of stream-processing
Approach
Automatic translation of tree-processing
programs into stream-processing programs
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
7
Key observation (1/2)
In order for stream-processing to be effective, a
program should access an input tree from left to
right, in the depth-first order.
fix f. λt.
case t of
leaf x → leaf (x + 1)
| node x1 x2 → node (f x2) (f x1)
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
8
Key observation (2/2)
If a program accesses an input tree in that order,
then there is a simple, structure-preserving
translation into an equivalent stream-processing
program.
fix f. λt.
case t of
leaf x →
leaf (x + 1)
| node x1 x2 →
node (f x1) (f x2)
2015/10/1
fix f. λt.
case read() of
leaf →
write(leaf);
write(read() + 1)
| node →
write(node); f (); f ()
The Second Asian Symposium on Programming
Language and Systems
9
Our solution
Use the idea of ordered linear types
[Polakow 2000]
Guarantee “correct” access order by assigning
ordered linear types to input trees
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
10
Outline
Syntax of languages
Type system
Translation algorithm
Extension
Related work
Conclusion
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
11
Outline
Syntax of languages
Type system
Translation algorithm
Extension
Related work
Conclusion
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
12
Languages
Source language
Call-by-value λ-calculus +
primitives for binary tree processing
Target language
Call-by-value λ-calculus +
primitives for stream processing
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
13
Source language
M (terms) ::= i
(integer)
| M1 + M2
(addition)
|x
(variable)
| λx.M
(abstraction)
| M1 M2
(application)
| fix f. M
(recursive function)
| leaf M
(leaf with an integer)
| node M1 M2
(branch)
| (case M of
(case analysis)
leaf x → M1
| node x1 x2 → M2)
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
14
Target language
e (terms) ::= i
| e1 + e2
| ()
|x
|λx.e
| e1 e2
| fix f. e
| leaf | node
| read () | write e
| (case e of
leaf → e1
| node → e2)
2015/10/1
(integer)
(addition)
(unit)
(variable)
(abstraction)
(application)
(recursive function)
(tag)
(stream manipulation)
(case analysis)
The Second Asian Symposium on Programming
Language and Systems
15
Representation of trees in each language
node
Source language
node (leaf 1) (leaf 2)
Target language
node; leaf; 1; leaf; 2
leaf
leaf
1
2
We do not consider closing tags
because we only focus on binary trees for now
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
16
Example of programs
Incrementing the value of each leaf
Source language
fix f. λt.
case t of
leaf x → leaf (x + 1)
| node x1 x2 → node (f x1) (f x2)
Target language
fix f. λt.
case read() of
leaf → write(leaf); write(read() + 1)
| node → write(node); f (); f ()
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
17
Outline
Syntax of languages
Type system
Translation algorithm
Extension
Related work
Conclusion
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
18
Type system
Utilizes ordered linear type
Properties of well-typed programs:
access each node of the input tree exactly once
in left-to-right, depth-first order
do not construct trees on memory
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
19
Types
τ (types) ::= Int
(integer)
| τ1→τ2 (function)
| InTree
(input tree)
| OutTree
(output tree)
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
20
Type judgment
Γ | Δ├ M :τ
(Non-ordered) type environment: Γ
{ x1:τ1, x2:τ2, ..., xn:τn }
(τi ∈ {InTree, OutTree} )
Set of bindings
Represents that
each of x1, x2, ..., xn is
accessed exactly once
in the order
Ordered linear type environment: Δ
x1:InTree, x2:InTree, ..., xn:InTree
Sequence of bindings
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
21
Example of type judgment
Γ = f : InTree → OutTree
Δ = x1 : InTree, x2 : InTree
Γ | Δ├ node (f x1) (f x2) : OutTree
Γ | Δ├ node (f x2) (f x1) : OutTree
Γ | Δ├ node (f x1) (f x1) : OutTree
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
22
Typing rules
Γ|Δ1├M1 :OutTree Γ|Δ2├M2 :OutTree
(T-NODE)
Γ|Δ1,Δ2├ node M1 M2 :OutTree
Γ|Δ1├M :InTree
Γ,x:Int|Δ2├M1 :τ
Γ|x1:InTree, x2:InTree, Δ2├M2 :τ
(T-CASE)
Γ|Δ1,Δ2├ case M of leaf x → M1 |node x1 x2 → M2 :τ
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
23
Outline
Syntax of languages
Type system
Translation algorithm
Extension
Related work
Conclusion
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
24
Translation algorithm (1/2)
A(i) = i
A(x) = x
A(M1 + M2) = A(M1) + A(M2)
A(λx.M) = λx. A(M)
A(M1 M2) = A(M1) A(M2)
A(fix f.M) = fix f. A(M)
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
25
Translation algorithm (2/2)
A(leaf M) = write(leaf);write(A(M))
A(node M1 M2) =
write(node);A(M1);A(M2)
A(case M of leaf x → M1
| node x1 x2 → M2)
= (case A(M);read() of
leaf → let x=read() in A(M1)
| node → [()/x1,()/x2]A(M2))
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
26
Correctness of the translation
If φ | x : InTree├ M : OutTree
(M, {x=V}) * W
iff
(A(M), <V>, ) * ( ( ), , <W>)
input stream
2015/10/1
output stream
The Second Asian Symposium on Programming
Language and Systems
27
Outline
Syntax of languages
Type system
Translation algorithm
Extension
Related work
Conclusion
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
28
Extensions
Introduce buffering of trees
Introduce primitives for buffered tree
construction and destruction
Extend typing rules and translation algorithm for
those primitives
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
29
Source language
M (terms) ::= ...
| mleaf M
(buffered leaf)
| mnode M1 M2
(buffered branch)
| (mcase M of
(case analysis
mleaf x → M1
for buffered tree)
| mnode x1 x2 → M2)
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
30
Target language
e (terms) ::= ...
| mleaf M
(buffered leaf)
| mnode M1 M2
(buffered branch)
| (mcase M of
(case analysis
mleaf x → M1
for buffered tree)
| mnode x1 x2 → M2)
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
31
Types
τ (types) ::= ... | MTree
Type of buffered trees
(kept in non-ordered environment)
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
32
Example of type judgment
Γ = f : MTree → OutTree
Γ’ = g : InTree → OutTree
Γ, x : MTree | φ├ node (f x) (f x) : OutTree
Γ’ | x : InTree ├ node (g x) (g x) : OutTree
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
33
Typing rules
Γ|Δ1├M : MTree
Γ,x:Int|Δ2├M1 :τ
Γ, x1:MTree ,x2:MTree | Δ2├M2 :τ
(T-MCASE)
Γ|Δ1,Δ2├ mcase M of mleaf x → M1
|mnode x1 x2 → M2 :τ
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
34
Translation algorithm
A(mleaf M) = mleaf A(M)
A(mnode M1 M2) = mnode A(M1) A(M2)
A(mcase M of mleaf x → M1
| mnode x1 x2 → M2)
= (mcase A(M) of
leaf x → A(M1)
| node x1 x2 → A(M2))
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
35
Example of programs
fix s2m. λt. case t of
leaf x → mleaf x
| node x1 x2 →
mnode (s2m x1) (s2m x2)
fix m2s. λt. mcase t of
mleaf x → leaf x
| mnode x1 x2 →
node (m2s x1) (m2s x2)
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
36
Outline
Syntax of languages
Type system
Translation algorithm
Extension
Related work
Conclusion
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
37
Related work
Attribute grammar based approach
[Nakano et al. 2004]
Translates tree-processing attribute grammars into
stream-processing attribute grammars
Pros: need not be conscious which part of stream
to be buffered
Cons: cannot specify evaluation order
Introduction of side-effect may be problematic
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
38
Related work
Deforestation [Wadler 1988]
Remove intermediate trees from programs
Put syntactic restrictions (treelessness) on
programs
Variables have to occur only once
Only variables can be passed to functions
Do not require order restrictions
Translated programs may not suitable
for stream-processing
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
39
Related work
Ordered linear logic [Polakow 2000]
Formalizes ordered linear logic
Type system for memory allocation and
data layout [Petersen et al. 2003]
Uses ordered linear type to express spatial
order on memory
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
40
Outline
Syntax of languages
Type system
Translation algorithm
Extension
Related work
Conclusion
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
41
Conclusion
Proposed a method of translating treeprocessing programs to stream-processing
ones utilizing ordered linear type
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
42
Future work
Providing automatic insertion of buffering
primitives
Extension to general XML documents
Dealing with rose trees
Considering closing tags
2015/10/1
The Second Asian Symposium on Programming
Language and Systems
43
Fin
© Copyright 2026 ExpyDoc