Partial Word Representation

Partial Word Representation
F. Blanchet-Sadri
Fields Institute Workshop
This material is based upon work supported by the National
Science Foundation under Grant No. DMS–1060775.
Partial words and compatibility
I
A partial word is a sequence that may have undefined
positions, called holes and denoted by ’s, that match any
letter of the alphabet A over which it is defined (a full word
is a partial word without holes); we also say that is
compatible with each a ∈ A.
abaab is a partial word with two holes over {a, b}
I
Two partial words w and w 0 of equal length are compatible,
denoted by w ↑ w 0 , if w[i] = w 0 [i] whenever w[i], w 0 [i] ∈ A.
a b a
a b a
↑
6↑
b a a
a a a
Factor and subword
I
A partial word u is a factor of the partial word w if u is a
block of consecutive symbols of w.
a is a factor of aaab
I
A full word u is a subword of the partial word w if it is
compatible with a factor of w.
aaa, aab, baa, bab are the subwords of aaab
corresponding to the factor a
Some computational problems
I
We define R EP, or the problem of deciding whether a set S
of words of length n is representable, i.e., whether
S = subw (n) for some integer n and partial word w.
I
If h is a non-negative integer, we also define h-R EP, or the
problem of deciding whether S is h-representable, i.e.,
whether S = subw (n) for some integer n and partial word
w with exactly h holes.
Rauzy graph of S = {aaa, aab, aba, baa, bab}
aa
aaa
aab
ab
aba
baa
bab
ba
S is 0-representable by w = aaababaa
Why partial words? (Compression of representations)
S = {aaa, aab, aba, baa, bab}
is representable by
aaababaa
Why partial words? (Compression of representations)
S = {aaa, aab, aba, baa, bab}
is representable by
aabab
Why partial words? (Compression of representations)
S = {aaa, aab, aba, baa, bab}
is representable by
aa
Why partial words? (Representation of non0-representable sets)
I
Set S of 30 words of length six:
1
2
3
4
5
I
aaaaaa
aaaaab
aaaabb
aaabba
aaabbb
6
7
8
9
10
aabbaa
aabbba
aabbbb
ababbb
abbaab
11
12
13
14
15
abbbaa
abbbab
abbbba
abbbbb
baabba
16
17
18
19
20
baabbb
bababb
babbba
babbbb
bbaabb
21
22
23
24
25
bbabab
bbabbb
bbbaaa
bbbaab
bbbaba
26
27
28
29
30
bbbabb
bbbbaa
bbbbab
bbbbba
bbbbbb
Rauzy graph (V , E) of S, where E = S and V = subS (5)
consists of 20 words of length five:
1
2
3
4
aaaaa
aaaab
aaabb
aabba
5
6
7
8
aabbb
ababb
abbaa
abbba
9
10
11
12
abbbb
baabb
babab
babbb
13
14
15
16
bbaaa
bbaab
bbaba
bbabb
17
18
19
20
bbbaa
bbbab
bbbba
bbbbb
4 aaabba
5 aaabbb
aaabb
4 aaabba
5 aaabbb
aaabb
Membership of h-R EP in P
Theorem
h-R EP is in P for any fixed non-negative integer h.
F. Blanchet-Sadri and S. Simmons, Deciding representability of
sets of words of equal length. Theoretical Computer Science
475 (2013) 34–46
Membership of R EP in P
Theorem
R EP is in P.
F. Blanchet-Sadri and S. Munteanu, Deciding representability of
sets of words of equal length in polynomial time. Submitted
Open problems
I
Characterize the sets of words that are representable.
I
Characterize minimal representing partial words (can they
be constructed efficiently?)