Regularity of Hairpin Completions of crossing words

Regularity of Iterative Hairpin Completion of crossing words.
!
Kayoko Shikishima-Tsuji
Center for Liberal Arts Education and Research Tenri University
!
1 Introduction The mathematical hairpin notion introduced in [5] is a word in which some
suffix is the mirrored complement of a middle factor of the word. This operation were
introduced which are very similar in nature to it. Mirrored complementary sequences
occur frequently in DNA and are often found at functionally interesting locations such
as replication origins or operator sites. The most basic question about hairpin
completion is “given a word, can we decide whether the iterated hairpin completion of
the word is regular?” The situation is very complicated. In the special case when the
word is non-crossing, some results were given ([1],[3]). In section 3, we add some
results in the case when the word is crossing.
!
2 Preliminaries
We assume the reader to be familiar with basic concepts as alphabet, word,
language and regular expression (for more details see [2]).
Words together with the operation of concatenation form a free monoid, which
is usually denoted by Σ∗ for an alphabet Σ. Repeated concatenation of a word w with
itself is denoted by wi for nonnegative integer i. The length of a finite word w is the
number of not necessarily distinct symbols it consists of and is written by w . Let θ be an antimorphic involution, i.e. θ : Σ∗ → Σ∗ is a function, such that for
θ(θ(a)) = a for all a ∈ Σ, and θ(uv) = θ(v)θ(u) for all u, v ∈ Σ+. Then, w is a
(θ-)pseudopalindrome if w = θ(w). To make notation cleaner, we write u for θ(u),
when θ is understood. Throughout this paper, let k be a fixed integer. For a word w= γαβα for some
k
α ∈Σ and γ , β ∈Σ * , we define right (k-)hairpin completion γαβαγ of w= γαβα
(with respect to α ). By the notation w → r u (or w k → r u ), we mean that u is right
(k-)hairpin completion of w . The left hairpin completion is defined analogously. By
the notation w →l u , we mean that u is left (k-)hairpin completion of w . The relation
of hairpin completion → is defined as → r or →l . By →* , →*r and →*l , we denote
the reflexive transitive closure of → , → r and →l , respectively.
1
For a language L ⊆ Σ + , we define the right (k-)hairpin completion of L by
RH k (L) = {γαβαγ | γαβα ∈L, γ ∈Σ + , β ∈Σ * and | α |= k} and the left (k-)hairpin
completion of L by LH k (L) = {γαβαγ | γαβα ∈L, γ ∈Σ + , β ∈Σ * | α |= k} . The
(k-)hairpin completion of L is H k (L) = RH k (L) ∪ LH k (L) . And we define the iterated
(right or left) (k-)hairpin completion H *k (L) (or RH *k (L) or LH *k (L) ) of L, inductively
as follows:
H k0 (L) = H k (L) , H kn+1 (L) = H k (H kn (L)) and H *k (L) = ∪ n≥0 H kn (L ) ,
!
RH k0 (L) = RH k (L) , RH kn+1 (L) = RH k (RH kn (L)) and RH *k (L) = ∪ n≥0 RH kn (L ) ,
LH k0 (L) = LH k (L) , LH kn+1 (L) = LH k (LH kn (L)) and LH *k (L) = ∪ n≥0 LH kn (L ) ,
3 Iterated hairpin completion of crossing words
A word w ∈Σ + is an (m, n)- α -word (or simply (m, n)-word) if the numbers of
occurrence of α and α in w, are m and n respectively. We say that an (m, n)- α -word
w is non- α -crossing if the rightmost occurrence of α precedes the leftmost
occurrence of α on w. Otherwise, the word is α -crossing.
Let w ∈αΣ * ∩ Σ *β and | α |=| β |= k . By one-step hairpin completion with
respect to α and β , we get two words w ′ ∈ βΣ * ∩ Σ *β and w ′′ ∈αΣ * ∩ Σ *α .
Regularity of the iterative hairpin completion of w is depend on those of H *k (w ′ ) and
H *k (w ′′ ) . Our problem in this paper is “for a given word w , whether the iterated
hairpin completion of w is regular?” In the rest of this paper, we may assume that the
word w is in αΣ * ∩ Σ *α . Let w be a crossing (m, n)- α -word. Then we have m,n ≥ 2 . !
The following example shows that the iterated hairpin completion of a (3, 2)- α
crossing word is not always regular ([4]).
!
Example 1. (Kopecki) We consider a crossing word w = α bααα cα . Let u = b α ,
v = αα bα and R = wu + vu + wu + w. Then, R ∩ H k* ({w}) = {wu r vu r wu r w | r ≥ 1} . Therefore,
the iterated hairpin completion of a word is not always regular.
!
We have some results for the iterated hairpin completion of a (2, 2)- α crossing
word.
!
Proposition 1. Let w ∈αΣ * ∩ Σ *α be (2, 2)- α -crossing-word such that w = xvy
where x and y are (1,1)- α -words in αΣ *α and v ∈Σ * . If x and y are pseudopalindromes, then H k* (w) is regular.
2
!
Proof) Let R = {x(vx)* (vy(vx)+ (vx)* )* vy(vx)+ } and L = {(yv )+ xv((yv)* (yv )+ xv)* (yv)* y} .
Since these languages are regular and H k* (w) = H k* (xvyvx) ∪ H k* (yvxvy) ∪ {w} , we show
that H k* (w) = R ∪ L ∪ {w} . First, we prove that H k* (xvyvx) = R. We show that H k* (xvyvx) ⊃ R. Let Rn = x(vx)* (vy(vx)+ (vx)* )n (vy)(vx)+ for every
nonnegative integer n. Then it is obvious that R = ∪ n≥0 Rn . We show that
Rn ⊂ H k* (xvyvx) by induction on n ≥ 0 . It is clear that R0 = x(vx)* vy(vx)+ ⊂ H k* (xvyvx) by
the following: xvyvx →il x(vx)i vy(vx) → rj x(vx)i vy(vx) j+1 .
Suppose that Rn = x(vx)* (vy(vx)+ (vx)* )n (vy)(vx)+ is contained in H k* (xvyvx) for some
nonnegative integer n. Every word w in Rn+1 is written by the form
x(vx)r vy(vx)m (vx)t !vy(vx)m (vx)t vy(vx)s where m1,!, mn+1 > 0 , t1,!,t n+1 ≥ 0 , and
r, s ≥ 0 . We show that the word w1 = xvy(vx)m (vx)t !vy(vx)m (vx)t vy(vx) is in
H k* (xvyvx) .
(1)
When m1 > t n+1 , let w ′ = xvy(vx)m (vx)t !vy(vx)m (vx)t vy(vx) . We have
w ′ = xvy(vx)m (vx)t !vy(vx)m (vx)t vy(vx) → rm −1 xvy(vx)m (vx)t !vy(vx)m (vx)t vy(vx)m = xvyv (xv )t ⋅ x(vx)m −t −1 (vx)t !vy(vx)m (vx)t vy(vx)m → r xvyv (xv )t ⋅ x(vx)m −t −1 (vx)t !vy(vx)m (vx)t vy(vx)m ⋅(vx)t vyvx = w1 Since w ′ is in Rn , it is contained in H k* (xvyvx) . Then the word w1 is also in H k* (xvyvx) . (2)
When m1 ≤ t n+1 , let w ′′ = xvy(vx)m (vx)t !vy(vx)m (vx)t vy(vx) . We have w ′′ = xvy(vx)m (vx)t !vy(vx)m (vx)t vy(vx) →tr x(vx)t vy(vx)m (vx)t !vy(vx)m (vx)t vy(vx) = x(vx)t vy(vx)m (vx)t !vy(vx)m (vx)t −m +1 ⋅(vx)m −1 vy(vx) → r xvyv (xv )m −1 ⋅ x(vx)t ⋅ vy(vx)m (vx)t !vy(vx)m (vx)t −m +1 ⋅(vx)m −1 vy(vx) = w1 Since w ′′ is in Rn , it is contained in H k* (xvyvx) . Then the word w1 is also in H k* (xvyvx) . It is clear that w ∈H k* (w1 ) ⊂ H k* (H k* (xvyvx)) = H k* (xvyvx) . We proved that
H k* (xvyvx) ⊃ R. To prove that H k* (xvyvx) ⊂ R is easy. For a nonnegative integer n, suppose that
H kn (xvyvx) ⊂ R . Then we have H kn+1 (xvyvx) = H k1 (H kn (xvyvx)) ⊂ H k1 (R) ⊂ R .
As the proof of H k* (yvxvy) = L is similar, we omit it. ! 1
1
n+1
n+1
1
1
1
1
n
n+1
1
n+1
n+1
1
1
2
1
n
1
2
1
2
2
n
n
n
n
n+1
n+1
n+1
n+1
n+1
n+1
2
1
n+1
2
n+1
n+1
1
1
n
1
n+1
n
n+1
n+1
2
n
n
2
2
n+1
1
1
n+1
1
n+1
2
1
1
n+1
n+1
1
1
!
The following example is that the iterated hairpin completion of a (2, 2)- α
crossing word w which is not satisfies the condition of Proposition 1, that is w = xvy
such that y ≠ y , is not always regular.
!
3
n+1
Example 2 We consider a (2, 2)- α -crossing word w = α bα cα dα where a,b,c,d ∈Σ ,
a ≠ a,b = b,c ≠ c,d ≠ d and α = aa . For integers i, j > 0 , suppose a word
wi, j = xcy(c x )i (cx) j cy c x where x = α bα and y = α dα is in H k* (w) . Since w = xcy
appears only one time as a factor of wi, j , we have wi, j ∈RH k* (w) . We have w = α bα cα dα = xcy → rm α bα cα dα ⋅(cα b α )m = xcy ⋅(c x)m →1r xcy ⋅(c x)m+1 or
xcy(c x)m ⋅(cx)s cy c x where m > 0 and m ≥ s > 0 . Let S = xcy(c x)* (cx)* cy c x . It is
easy to see that H k* (w) ∩ S = RH k* (w) ∩ S = {xcy(c x)i (cx) j cy c x | i ≥ j ≥ 0} . Since S
is regular and H k* (w) ∩ S is not regular, H k* (w) is not regular.
!
!
We have the following corollaries by the proof of Proposition 1.
Corollary 1. Let w ∈αΣ * ∩ Σ *α be (3, 3)- α -crossing-word such that w = xvyv x
where x and y are (1,1)- α -words in αΣ *α and v ∈Σ * . If x and y are pseudopalindromes, then the iterated hairpin completion H k* (w) of w is regular and H k* (w) =
x(vx)* (vy(vx)+ (vx)* )* vy(vx)+ .
!
Corollary 2. Let w ∈αΣ * ∩ Σ *α be (2, 2)- α -crossing-word such that w = xvy where
x and y are (1,1)- α -words in αΣ *α and v ∈Σ * . If x, y and v are pseudo-palindromes,
then the iterative hairpin completion H k* (w) is regular and H k* (w) = {w} ∪
x(vx)* (vy(vx)+ )* vy(vx)+ ∪ (yv)+ xv((yv)+ xv)* (yv)* y .
!
!
The following theorem is proved by the similar way of Proposition 1.
Theorem 1. Let w be (2, 2)- α -crossing-word such that w = xΣ + ∩ Σ + y where x and y
are (1,1)- α -words in αΣ * ∩ Σ *α . If x and y are pseudo-palindromes, then H k* (w) is
regular.
!
Proof) If w ≥ x + y , then we already proved in Proposition 1. If w ≤ x + y − 2 α ,
then w is non-crossing. We may assume that x + y − 2 α < w < x + y . Since α
overlaps with α , there exist factors u,v ∈Σ + of α such that α = vu , αα = uvu = u vu
and w = vx0 vy0 v where x = vx0 v , y = vy0 v , x0 , y0 ∈Σ * . Let R = (vx0 )+ (vy0 (vx0 )+ )* vy0 (vx0 )+ v and L = v(y0 v)+ x0 v((y0 v)+ x0 v)* (y0 v)+ . Since
H k* (w) = H k* (vx0 vy0 vx0 v) ∪ H k* (vy0 vx0 vy0 v) ∪ {w} , to prove the theorem we show that
R = H k* (vx0 vy0 vx0 v) and L = H k* (vy0 vx0 vy0 v) .
4
First, we prove that R = H k* (vx0 vy0 vx0 ) . Let Rn = (vx0 )+ (vy0 (vx0 )+ )n vy0 (vx0 )+ v for
every nonnegative integer n. Then it is obvious that R = ∪ n≥0 Rn . We show that Rn ⊂ H k* (vx0 vy0 vx0 v) by induction on n ≥ 0 . It is clear that
R0 = (vx0 )+ vy0 (vx0 )+ v ⊂ H k* (vx0 vy0 vx0 v) . Suppose that Rn = (vx0 )+ (vy0 (vx0 )+ )n vy0 (vx0 )+ v is
contained in Rn ⊂ H k* (vx0 vy0 vx0 v) for some nonnegative integer n. Every word w in
Rn+1 is written by the form (vx0 )r vy0 (vx0 )m !vy0 (vx0 )m vy0 (vx0 )s v where m1,!, mn+1 > 0
and r, s > 0 . We show that the word w1 = vx0 vy0 (vx0 )m !vy0 (vx0 )m vy0 (vx0 )v is in
H k* (vx0 vy0 vx0 v) . Let w ′ = vx0 vy0 (vx0 )m !vy0 (vx0 )m vy0 (vx0 )v . We have
w ′ = vx0 vy0 (vx0 )m !vy0 (vx0 )m vy0 (vx0 )v k → rm −1 vx0 vy0 (vx0 )m !vy0 (vx0 )m vy0 (vx0 )m v = vx0 vy0 ⋅(vx0 )(vx0 )m −m !vy0 (vx0 )m vy0 (vx0 )m v → r vx0 vy0 ⋅(vx0 )m !vy0 (vx0 )m vy0 (vx0 )m vy0 vx0 v = w1 Since w ′ is in Rn , it is contained in H k* (vx0 vy0 vx0 v) . Then the word w1 is also in
H k* (vx0 vy0 vx0 v) and then R ⊂ H k* (vx0 vy0 vx0 v) .
On the other hand, for a nonnegative integer n, suppose that H kn (vx0 vy0 vx0 v)
⊂ R , then we have H k1 (H kn (vx0 vy0 vx0 v)) = H k1 (H kn (vx0 vy0 vx0 v)) ⊂ H k1 (R) ⊂ R .
As the proof of H k* (vy0 vx0 vy0 v) = L is similar, we omit it.
! 1
n+1
1
1
n
1
n
1
1
n+1
n+1
n+1
n
n
1
n
n+1
n+1
n+1
!
References.
!
[1] V. Diekert, S. Kopecki, and V. Mitrana. On the hairpin completion of regular languages. In ICTAC, vol. 5684 of Lect. Notes Comput. Sci., 170–184, 2009.
[2] M. A. Harrison. Introduction to Formal Language Theory. Addison-Wesley,
Reading, Massachusetts, 1978. [3] Lila Kari, Steffen Kopecki, Shinnosuke Seki: Iterated Hairpin Completions of
Non-crossing Words. SOFSEM 2012: 337-348
[4] S. Kopecki, On the iterated hairpin completion, Theoretical Computer Science 412
(2011) 3629.
[5] G. Paun, G. Rozenberg, and T. Yokomori. Hairpin languages. Int. J. Found. Comput. Sci., pages 837–847, 2001.
!
!
Center for Liberal Arts Education and Research
Tenri University
1050 Somanouchi, Tenri, Nara 632-8510, Japan
E-mail address: tsuji @sta.tenri-u.ac.jp
5
天理大学・総合教育研究センター 佳代子