Mark Dow

Geek art

Minimal arrays containing all sub-array combinations of symbols: De Bruijn sequences and tori

    For n symbols in m groups, what are the smallest arrays that contain all combinations of adjacent elements?

    This is a symbolic combinatorics question. It shares some characteristics with the necklace problem. I know very little about this topic, and have not attempted to formalize the problem in standard mathematical language. I don't even understand the standard mathematical language used. The observations and algorithms below were derived from tinkering and intuition.

    Since I wrote most of this, I found that these problems have been worked on, and many of the simpler cases fully characterized. The one-dimensional cases are called De Bruijn sequences, and the two-dimensional cases De Bruijn tori:

De Bruijn sequence, Wikipedia
De Bruijn graph, Wikipedia
There is a nice link between these sequences and dynamical systems.
De Bruijn Sequence, MathWorld



1-D 2-symbol, every triplet, link to 2-D 2-symbol, every 2x2, link to Corners tilings, link to 
1-D 2-symbol, every pair
1-D 2-symbol, every triplet
1-D 2-symbol, every quadruplet
1-D 2-symbol, every quintuplet
1-D 3-symbol, every triplet
1-D 3-symbol, every sextet
1-D 4-symbol, every pair
1-D 4-symbol, every triplet
2-D 2-symbol, every 2x2
Meshing method
2-D 3-symbol, every 2x2
2-D 4-symbol, every 2x2
2-D 2-symbol, every 3x3
Corners tilings

Programs and code
References


    This problem came up when designing motifs for aperiodic tesselations, for example Thue-Morse patterns. In a 1-D 2-symbol Thue-Morse sequence all pairs of symbols occur ( aa, ab, ba, bb ) but no triplets (of symbols or sets of sequential symbols). Extended to 2-D and more symbols, aperiodic sequences can have sub-arrays of symbols that occur infrequently. [To Do: 4-symbol cyclic system example.]  It would be nice to be able to draw all possible neighborhoods (sets of adjacent motifs) with no repetitions.

    It is convenient to consider the symbols as sequential integers starting at 0 because the possible pairs can be enumerated as all possible integers with a base of the number of symbols.

[To Do: Draw corresponding directed graphs for all 1-D cases.]

1-D 2-symbol, every pair (k = 2, n = 2)

    There are four unique pairs of two symbols. Using {0,1} as symbols they correspond to the binary (base 2) numbers between 0 and 3:

{ 00, 01, 10, 11 }          


 A bit of trial and error yields a string with all adjacent pairs and no duplications:
[ 0 0 1 1 0 ]

The mirror and complements of this string also are solutions:
[ 0 1 1 0 0 ]
[ 1 0 0 1 1 ]
[ 1 1 0 0 1 ]

Note that the first symbol is the same as the last, so they can wrap around as a length four ring. The solutions are all cyclic permutation of a single ring:
[ ... 0 0 1 1 ... ] 2-symbol doublet necklace
The graphical representation (right) uses black and white to represent 0 and 1 respectively, and is read clock-wise from the top. It is the only solution, as any mirror or complement is the same as a cyclic permutation of this solution.

1-D 2-symbol, every triplet  (k = 2, n = 3)

     There are eight unique triplets of two symbols. Using {0,1} as symbols they correspond to the binary (base 2) numbers between 0 and 7:
       
{ 000, 001, 010, 011, 100, 101, 110, 111 }

     A solution can be found by starting with [ 0 0 0 ], and iteratively add a symbol to the right with these conditions:

(a) Choose the symbol that will form a triple that is not yet in the string
(b) If both symbols can be added to result in a new triplet, use the one that will most closely balance the number of each symbol.
(c) If the number of symbols is balanced, arbitrarily select the next symbol as 0.

Here's the sequence of adding to the string, with the appropriate conditions:
[ 0 0 0                ]  
[ 0 0 0 1              ]  (a)
[ 0 0 0 1 1            ]  (b)
[ 0 0 0 1 1 1          ]  (b)
[ 0 0 0 1 1 1 0        ]    (a), because 111 is already in the string
[ 0 0 0 1 1 1 0 1      ]    (b), because both 100 and 101 aren't in the string, but there are more 0's than 1's
[ 0 0 0 1 1 1 0 1 0    ]  (a) Note that the last symbol is the first symbol wrapped around.
[ 0 0 0 1 1 1 0 1 0 0  ]  (a) Note that the last two symbols are the first symbols wrapped around.

Condition (c) is not used here, but will be for longer sub-string solutions.

    Does this algorithm result in a De Bruijn sequence for all 2-symbol strings, of any length? Is there a generalization for more than 2 symbol strings? Does it matter what triplet is used as the initial state?

    Similar to the "every pair" solution, the first two symbols are the same as the last, so all triplets occur in the eight term ring:
[ ... 0 0 0 1 1 1 0 1 ...  ] 2-symbol triplet necklace

and in any cyclic permutation. This and its mirror/complement are the only solutions.

    The mirror or complement (cyclic permutation of symbols) of this ring is another unique solution -- the mirror solution is the same as the complement. Considered as a 3-D necklace (allowing a rotation about an axis in the plane), there is only a single solution.

    Ted demonstrated a nice method of finding solution to some of these problems by construction of a bilaterally symmetric directed graph. Solutions are Hamiltonian cycles on the graph; each vertex is visited once and the path returns to a starting vertex. The general problem of finding Hamiltonian cycles on graphs is well studied, with a complexity class of NP-complete in FNP. While this particular graph is planar, the graphs for longer sub-strings or more symbols are not. But any graph for similar problems has symmetries that might reduce the problem to polynomial time. There is a reflection symmetry about the vertical centerline, both in the graph layout and in the sub-string triplets, and there is an anti-symmetric reflection symmetry about the horizontal centerline. These are De Bruijn graphs.
2-symbol triplet graph 2-symbol triplet Hamiltonian cycle 1 2-symbol triplet Hamiltonian cycle 2


    This directed graph can be mapped onto the corners of a cube, and a solution will visits all corners of the cube on a restricted set of directed edges:
directed graph on a cube, 2-symbol every triplet Hamiltonian cycle, 2-symbol every triplet Hamiltonian cycle, 2-symbol every triplet
    Diagram of all triplets as corner coordinates (3-tuples) of a unit cube, and allowed neighbors as a graph on the cube (left). There are two solutions, corresponding to two Hamiltonian cycles on this graph (center and right).
    Treated as 3-dimensional graphs, the two solutions are related by a parity transformation, equivalent to a mirroring and a rotation by pi radians. Thus they are achiral. This is equivalent to the complement, or swapping of the symbols 0 and 1.

    The blue edge in the left figures correspond to the length 4 substrings (0 1 1 0) and (1 0 0 1) that can't appear in the solutions: if these edges are traversed there is no way of reaching (1,1,1) and (0,0,0) respectively without visiting these edge vertices twice.

    This graph on a cube is related to Gray codes, in that they also form a Hamiltonian cycle on the corners of an n-cube, where each bit is viewed as one dimension. But in the case of Gray codes the graph is undirected and the edges are all edges of the cube (no diagonals).
    There is a unique Eularian cycle on the left graphs, including edges 000->000 and 111->111 (not shown), that here gives the cycle  ...0000101111010011... . These paths include all combinations exactly twice.

1-D 2-symbol, every quadruplet (k = 2, n = 4)

     Using the algorithm described in "every triplet", all sixteen possible quadruplets:
       
{ 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111 }

appear in this string:

[ 0 0 0 0 1 1 1 1 0 1 0 1 1 0 0 1 0 0 0 ]

The sixteen term ring is:

[ ... 0 0 0 0 1 1 1 1 0 1 0 1 1 0 0 1 ...  ]

    Is this is a unique minimal solution, up to cyclic permutations and mirroring? No! Ted Bell found another:

[ ... 0 0 0 0 1 0 0 1 1 1 1 0 1 0 1 1 ...  ]

Ted found this as a Hamiltonian cycle on a bilaterally symmetric directed graph. [To Do: Draw directed graph.]

    By brute force search, I found the 16 solutions

0000100110101111
0000100111101011
0000101001101111
0000101001111011
0000101100111101
0000101101001111
0000101111001101
0000101111010011
0000110010111101
0000110100101111
0000110101111001
0000110111100101
0000111100101101
0000111101001011
0000111101011001
0000111101100101

with, 4? unique solutions up to mirroring and cyclic permutation of the two symbols.


1-D 2-symbol, every quintuplet (k = 2, n = 5)

     Using the same procedures as for the "every triplet" string:
       
[ 0 0 0 0 0 1 1 1 1 1 0 1 0 1 0 0 1 1 0 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0 0 0 ]

The 32-term ring is:

[ ... 0 0 0 0 0 1 1 1 1 1 0 1 0 1 0 0 1 1 0 1 1 0 0 1 0 1 1 1 0 0 0 1 ...  ]

Is this is a unique minimal solution, up to cyclic permutations and mirroring? No, there are 2048 solutions, 512 solutions up to symbol permutation (swapping 1 and 0) and mirroring. The can be found by a brute force search of all strings represented by binary integers from 0 to 227. There are 232 binary integers with 32 digits, but we can find only those with leading digits 000001... and all cyclic permutations, symbol swaps,  and mirrors will also be solutions. This number of checks (227) in a brute force search is nearing the limit of what is practicle on a serial processor.

1-D 3-symbol, every triplet (k = 3, n = 3)

    There are 216 solutions, but eliminating permutations and mirrors leaves 12?

    Text files with sequence lists, generated by brute force search (see programs and code):

1-D 3-symbol, every sextet (k = 3, n = 6)

A universal cycle, by Anna Virágvölgyi
A universal cycle
, by Anna Virágvölgyi

"A picture of an unwrapped cylinder. The universal cycle "a b c a b a b a b c b c b c a b c a b c b c a c b a c a b a c b c a c a c a b a b c a c a b c b" includes all possible words of six length from the alphabet (a, b, c) in which no letters of alphabet are paired. The picture is created by substituting stripes for letters of the universal cycle. Due to of the nature of universal cycles all possible diagonal striped square tiles (with six stripes - each differ from its neighbour) can be found on this picture."

1-D 4-symbol, every pair (k = 4, n = 2)

    There are 20736 solutions, but eliminating the 24 symbol permutations (all combinations of  4 symbols) and mirrors leaves 20736/48 = 432 unique sequences.

    Text files with sequence lists, generated by brute force search (see programs and code):

k4_n2_no_permutes.txt   (List of 20736/4  = 5184 sequences that have 001 as a subsequence. Note that all 20736 sequences can be generated by with the permutations 01-> 02, 03, 23, 32.)
k4_n2_unique.txt   (List of 432 unique sequences up to symbol permutation and mirroring.)

1-D 4-symbol, every triplet (k = 4, n = 3)


Ted Bell sent me this solution, which he found by visually searching it's graph for a Hamiltonian cycle:
... 0001110100202221211220030333232233131133013203210310231201230213 ...

This is way too long for my brute force (with principled lowest estimate) programming approach.

    The mother of one of my programming kids taught me an interesting trick: once you've found one solution there are ways of finding more solutions, and perhaps all solutions. The basic transform for this one goes like:
- all pairs of symbols happen several times. For example the 00 pairs are:
0001110100202221211220030333232233131133013203210310231201230213
- the sequences between these pairs (for example 11101 and 20222121122) can be swapped giving
0002022212112200111010030333232233131133013203210310231201230213
This works with any 1-D solution, and I think it could be used to count the number of unique solutions. But it is still a difficult combinatorics problem that I don't know how to solve.


2-D 3-symbol, every 2x2

    Here is a solution for this case from "On the de Bruijn Torus Problem", where they have used a constructive proof (Theorem 1.3, Cock's construction). This is possible for any odd number of symbols; all 2x2's, a dBk( k2, k2; 2, 2 ) member where k is odd. They step through the actual construction, but it's hard for me to follow.

2 0 0 2 1 1 2 0 0
1 0 2 1 2 2 1 2 0
0 1 2 0 1 1 0 2 1

2 1 0 2 0 0 2 0 1
2 2 0 2 2 2 2 0 2
0 1 1 0 2 2 0 1 1
0 0 1 0 0 0 0 1 0
1 2 2 1 0 0 1 2 2
1 2 1 1 1 1 1 1 2

3-symbol, all 2x2s  

The top row and left column are repeated at the bottom and right.


    It's about as symmetric as a 3 symbol set can be: bilaterally symmetric about the vertical axes between the fourth and fifth column, and about the center of the ninth column. The 0's and 2's are exactly the same pattern, with 2's shifted two units up relative to the 1's. I don't know how many unique solutions exist.

2-D 2-symbol, every 2x2

     There are sixteen possible 2x2 sub-arrays with two symbols. Using {0, 1} as symbols they correspond to the binary (base 2) numbers between 0 and 15:
{ [ 0 0    [ 0 0    [ 0 0    [ 0 0    [ 0 1    [ 0 1    [ 0 1    [ 0 1
    0 0 ]
,   0 1 ],   1 0 ],   1 1 ],   0 0 ],   0 1 ],   1 0 ],   1 1 ],

  [ 1 0    [ 1 0    [ 1 0    [ 1 0    [ 1 1    [ 1 1    [ 1 1    [ 1 1
    0 0 ],   0 1 ],   1 0 ],   1 1 ],   0 0 ],   0 1 ],   1 0 ],   1 1 ], }


I gave up on using a tesselation on a torus (wrap-around in both directions), because it appeared to be ugly, requiring many duplicates.
7/1/08 But Ted Bell didn't, and showed that there aren't duplicates:
Brigid's cross (octomino) is a torroidal solution.  I was fussing with the big hierarchical set of solutions last night, I got too frustrated and decided to count the number of 2x2 squares in a torroidal matrix of size 4.
It's 16, so I thought I'd try to fit the 16 possible 2x2 2 symbol squares into it. Only a symmetrical solution would do and Brigid's cross is one of them, I think the others are all row and column swaps.
0 1 0 0
0 1 1 1
1 1 1 0
0 0 1 0

Brigid's octomino array
Putting the left column and top row on top of the originals (to show all sub-arrays when tesselated on a torus):

0 0 1 0 |0
1 1 1 0 |0
0 1 1 1 |1
0 1 0 0 |0
___________
0 1 0 0 | 0
Brigids octomino tesselation

This array's tiling is symmetric and periodic with respect the symbols 0/1, or black/white.


      


    The tesseract's adjacency diagram (which faces are mutually adjacent) can be mapped onto a 4x4 toroidal grid. So the tesseract faces can be colored using this pattern such that each of the 16 vertices (which are all at intersections of four faces) has a uniquely colored neighborhood (counting each of four orthogonal orientations as different).
[To Do: A stereo projection of a tesseract (similar to this) with labelled vertices and a corresponding 4x4 adjacency grid (see bottom of this). Try illustrating the coloring of faces.]
[To Do: Construct a single projected "corner" of a tesseract, composed of four interpenetrating parallelpipeds. This corresponds to a cube's single corner projected onto a plane, composed of three diamonds.]

    To construct a "non-wrapping" array that contains all sub-arrays, a 5x5 array seems natural because it has exactly 16 2x2 sub-arrays. It has 25 elements so must include an unequal number of the two symbols -- it didn't seem possible since the set of 2x2's is equivalent with a swap of symbols so there must be an equal number of symbols. But there is another solution:

[ 0 0 0 0 1
  0 0 1 0 1
  1 1 1 1 0
  1 1 0 1 0
  0 0 0 0 1
]

There are 14-0's and 11-1's, but the central elements (with more 1's) appear in more sub-arrays that the side and corner elements (with more 0's). Are there other 5x5 solutions?
 
    All columns, but no rows, are 1-D 2-symbol solutions. The top and bottoms rows are the same, so the 5x4 array (without the top or bottom row) tesselates a cylinder. The left and right columns are complements, so it and its complement forms an 8x4 tesselation on a torus that has a uniform distribution of all 2x2 elements:
 0 0 0 0 1 1 1 1
 0 0 1 0 1 1 0 1
 1 1 1 1 0 0 0 0
 1 1 0 1 0 0 1 0

765 octomino tile
The 8x4 tile.
765 octomino tesselation
Tesselation of the tile.
 


    Any 5x5 subset of these tesselations will be a minimal solution (contain all 2x2 sub-arrays), and any 4x8 subset of the latter will tesselate in the same way.

    The periodicity of the tesselations is curious. Both the rows and the columns of the first are periodic on the string 0 1 1 1 or its complement 1 0 0 0 (and cyclic permutations). The columns are all periodic on 0 0 1 1 and its complement, but the rows are periodic on either 0 0 0 0 1 1 1 1 or 0 0 1 0 1 1 0 1 or their complements (and cyclic permutations).

    It turns out that there are offset tesselations of  these 4x8 arrays that also have an equal distribution of 2x2 sub-arrays. For example:

 1 0 0 1
 0 0 1 1
 0 0 1 1
 1 1 0 0
 0 1 1 0
 1 1 0 0
 1 1 0 0
 0 0 1 1

4x8 tile pair

A subset of the tesselation (above) rotated by pi/2 radians, and its pi rotation.
4x8 offset tesselation
An offset tesselation of the two tiles, with horizontal rows of the same tile. By matching the 2x2 sub-arrays that occur on horizontal boundaries, the equal frequency of sub-arrays is preserved.
[To Do: Make a rectangular region that periodically tiles the plane.]
 motif outline
A tile constructed by symbol replacement with a double diagonal motif:
symbol replacement with double-diagonal motif

(Rotated extension not shown.)
Kinky frieze, link to
A frieze that based on this tiling and symbol replacement, followed by coloring contiguous regions.
Pattern and process, link to
A fractal elaboration of the frieze. [Broken link: synthetic ripples - ripple distortion description and code.]


[To Do: L_system_tiling( 'test', 1, 2, 1, 0,'zig_zag_eel_motif_5.png', '2x2_2s_a.png', 0, 1 ); ]

Of course it works with a swap of the symbols, or with rotations/mirrors of any 5x5 subset of the tesselations. Are there other solutions? There are only so many 5x5 arrays and I tried just about every one that contained [ 0 0; 0 0 ] and [ 1 1; 1 1 ].

2-D 4-symbol, every 2x2

    There are 256 (44) possible 2x2's, and a minimal toroidal solution would fit on a 16x16 grid. Ted and I couldn't do this one. Can a brute force search find a solution?

    We can imagine a simple brute force search algorithm that requires somewhat less than 231 checks to find (or not find) a solution. Is this too large to be computationally tractable? How much less to find at least one solution? What is a good algorithm, and can it be guaranteed to find all solutions?

    There are several good reasons for having a positive solution. The most basic is to make figures like this Truchet-like tiling that show all combinations of 2x2 neighbors with no repeats.

    I found a paper that constructs a dB4( 42, 42; 2, 2 ), which is the formal notation for this pattern. In "A Meshing Technique for de Bruijn Tori". (One of the main topics of this paper is to show another result implying (among other things) that there is also exists a 8 x 32 matrix with all 2x2's of 4 symbols.) The Section 2 "Meshing method" describes the construction (due to Ivanyi and Toth, 1988).

Meshing method

    The meshing method uses a 1-D 4-symbol every pair De Bruijn sequence to construct a 2-D 4-symbol every 2x2 De Bruin torus. The 1-D sequences can be found by brute force searching of strings.  Only a fraction of these sequences, those that are "even", can be used with the meshing method. Here's what "even" and "odd" mean in this context: if ab and ba occurs in a sequence, then if the distance between the occurances is odd, then the sequence is odd. So aba is odd because ab and ba are one (index) apart. But abcba is also odd, as they are three apart. The subsequence abba can occur in an even sequence.

    Only 104 of 432 unique 1-D 4-symbol every pair De Bruijn sequences are even.

    Are all possible 2-D 4-symbol every pair De Bruijn tori generated by this method? I don't know, and I don't know if this is known.


   0 0 1 1 0 2 1 3 3 1 2 0 3 2 2 3

0  0 0 1 0 0 0 1 0 3 0 2 0 3 0 2 0
0  0 0 0 1 0 2 0 3 0 1 0 0 0 2 0 3
1  0 1 1 1 0 1 1 1 3 1 2 1 3 1 2 1
1  1 0 1 1 1 2 1 3 1 1 1 0 1 2 1 3
0  0 0 1 0 0 0 1 0 3 0 2 0 3 0 2 0
2  2 0 2 1 2 2 2 3 2 1 2 0 2 2 2 3
1  0 1 1 1 0 1 1 1 3 1 2 1 3 1 2 1
3  3 0 3 1 3 2 3 3 3 1 3 0 3 2 3 3
3  0 3 1 3 0 3 1 3 3 3 2 3 3 3 2 3
1  1 0 1 1 1 2 1 3 1 1 1 0 1 2 1 3
2  0 2 1 2 0 2 1 2 3 2 2 2 3 2 2 2
0  0 0 0 1 0 2 0 3 0 1 0 0 0 2 0 3
3  0 3 1 3 0 3 1 3 3 3 2 3 3 3 2 3
2  2 0 2 1 2 2 2 3 2 1 2 0 2 2 2 3
2  0 2 1 2 0 2 1 2 3 2 2 2 3 2 2 2
3  3 0 3 1 3 2 3 3 3 1 3 0 3 2 3 3





4-symbol all 2x2s De Bruijn torus



4-symbol all 2x2s De Bruijn torus



4-symbol all 2x2s De Bruijn torus
The 2D pattern formed by the "mesh method" uses a logical combination of  a 1D De Bruijn sequence. The 1D sequence must have an "even" property for the torus to have the De Bruin property. This torus with colored squares as symbols. The bottom row and right column are repeated at the bottom and right, so every 2x2 pattern occurs exactly once. Two different cyclic rotations of the same De Bruijn torus.
   0 0 1 1 0 2 1 2 2 0 3 1 3 2 3 3

0  0 0 1 0 0 0 1 0 2 0 3 0 3 0 3 0
0  0 0 0 1 0 2 0 2 0 0 0 1 0 2 0 3
1  0 1 1 1 0 1 1 1 2 1 3 1 3 1 3 1
1  1 0 1 1 1 2 1 2 1 0 1 1 1 2 1 3
0  0 0 1 0 0 0 1 0 2 0 3 0 3 0 3 0
2  2 0 2 1 2 2 2 2 2 0 2 1 2 2 2 3
1  0 1 1 1 0 1 1 1 2 1 3 1 3 1 3 1
2  2 0 2 1 2 2 2 2 2 0 2 1 2 2 2 3
2  0 2 1 2 0 2 1 2 2 2 3 2 3 2 3 2
0  0 0 0 1 0 2 0 2 0 0 0 1 0 2 0 3
3  0 3 1 3 0 3 1 3 2 3 3 3 3 3 3 3
1  1 0 1 1 1 2 1 2 1 0 1 1 1 2 1 3
3  0 3 1 3 0 3 1 3 2 3 3 3 3 3 3 3
2  2 0 2 1 2 2 2 2 2 0 2 1 2 2 2 3
3  0 3 1 3 0 3 1 3 2 3 3 3 3 3 3 3
3  3 0 3 1 3 2 3 2 3 0 3 1 3 2 3 3



4-symbol, not a De Bruijn torus
The De Bruijn sequence used here is "odd", as it contains the substring 313, so the mesh method does not result in a De Bruijn torus. Not a De Bruijn torus, constructed by the mesh method with a De Bruijn sequence, but one that is "odd".
4-symbol 2x2 De Bruijn torus, link to 4-symbol 2x2 De Bruijn torus, link to
A De Bruijn torus using Truchet's tile set, the four rotations of a diagonal division of a square. Any permutation of tiles, mirror, or cyclic rotation of the torus is also a De Bruijn torus. The same torus after a cyclic rotation.


  

2-D 2-symbol, every 3x3

    There are 512 (29) 2-symbol 3x3 arrays. There is no minimal square solution that tesselates a torus (because 512 is not the square of an integer), but there may be 24 x 25 (16 x 32), 23 x 26 (8 x 64), or 22 x 27 (4 x 128) solutions.

       I don't know any solutions for this, or have any ideas for methods of constructing one -- except for brute force search of all combinations. But the brute force search space is far too large.


2-D 2-symbol, Thue-Morse 2x2s

    Is there a toroidal array that contains all 2x2's that can occur in a Thue-Morse pattern, with no repeats? All L-arrangements don't occur, all others do.


Corners tilings

 [To Do:   An example of a combinatorial system with a second neighborhood constraint. ]

Brick wall corners, link to

The pattern is formed from these rotations and inversions of a single "corner motif":
corner motifs, labelled

The left half of the pattern is this arrangement:

corner 5x5 array

    This pattern has a nice set of symmetries and anti-symmetries [ To Do: Describe in B/W matrix.]. But it doesn't include the fifth set [ To Do: Show ].
Is there an square, or nearly square matrix of motifs that contains all 2x2 patterns in an arrangement where the white and black (a's and b's) have the same relationship?


Entropy of De Bruijn patterns

    In a random string, where every letter has an equal probability of occurring at every location, the expected value of the number of times each n-tuple occurs is the same for any given n -- a random string is a normal number. De Bruijn sequences have equal frequencies of n-tuples for one particular n, but not for all n. Tuples of length > n never occur. Is it true for tuples of length < n? What is the information entropy of De Bruijn strings, and how does it change with n? As n increases, do De Bruin strings asymptotically approach normal numbers? Are normal numbers De Bruin strings as n approaches infinity?

    Consider the 2-symbol every quadruple De Bruijn sequences in the context of all 16 bit cyclic strings. If we randomly pick a large number of 16 bit strings, the distribution of substrings will asymptoptically approach the distribution of all 16 bit strings. This distribution for each n will be binomial:

[To Do: Distributions for n = 1 to ~6 for all 16 bit strings, compared with the De Bruijn strings.]


Programs and code

[To Do: Generalize, fix up and document these.]
MATLAB code:
find_de_Bruin_strings.m
filter_de_Bruijn.m

References

A Meshing Technique for de Bruijn Tori
Glenn Hulbert and Garth Isaak, Contemporary Math. 178 (1994) 153-160.


On the de Bruijn Torus Problem, Glenn Hulbert, Garth Isaak


Garth Isaak (accessed 2/8/10)

"Perfect Maps: how determine your location in n dimensions

List the string 00011101 cyclically. Each triple occurs exactly once. This is known as a perfect map or de Bruijn cycle. Many questions can be asked about higher dimensional versions of this and version with an alphabet of size k (instead of 2). In particular are the `obvious' necessary conditions sufficient? Kenny Patterson in London has answered this completely in 2-dimensions for k a prime power. Many other interesting questions arise about related structures called perfect factors and perfect multi-factors which arise in the study of these objects. For more information and references to other work see these papers my recent paper (which is partially survey and partially new unifying results and notation Constructing Higher Dimensional Perfect Factors (to appear in Aequationes Mathematicae)"

On the Existence of de Bruijn Tori with Two by Two Windows


Research problems on Gray codes and universal cycles
Brad Jackson, Brett Stevens, Glenn Hurlbert
Discrete Mathematics 309 (2009) 5341-5348

Open problems arising from the Workshop on Generalizations of de Bruijn cycles and Gray
Codes at the Banff International Research Station in December, 2004.

PROBLEM 480. De Bruijn Tori

Ron Graham
Dept. of Comp. Sci. & Engineering, Univ. of CaliforniaSan Diego, La Jolla, CA, USA
graham@ucsd.edu

We consider a 2-dimensional generalization of de Bruijn cycles. An R-by-S k-ary array A embedded on a torus is a de Bruijn torus with window size (m; n) if every m-by-n array occurs once. For short, we say that A is a k-ary (R,S,m,n)-de Bruijn torus.

Question 1: Is it possible to construct a k-ary (R,S,m,n)-de Bruijn torus whenever R, S, m, n, k are positive integers with k > 1, RS = kmn, R > m, and S > n?

Comment: This question has been answered affirmatively for "square'' tori (the case R = S and m = n). The binary case (k = 2) was solved by Fan, Fan, Ma, and Siu [1] and the general case by Hurlbert and Isaak [2]. The problem was also solved for m = n = 2 by Hurlbert, Mitchell, and Paterson [3].

As with universal cycles, one can also consider de Bruijn tori for many other combinatorial structures.

Question 2: For which positive integers R, S, m, n, k such that k > mn, R >=  m, S >=  n and RS = [Product from i = 0 to k-1]{mn-i) do there exist de Bruijn tori of the mn-permutations of a k-set?

Question 3: For which positive integers R, S, m, n, k such that k > mn, R >=  m, S >=  n and RS = [k choose mn] do there exist de Bruijn tori of the mn-subsets of a k-set?

References
[1] C.T. Fan, S.M. Fan, S.L. Ma, and M.K. Siu, On de Bruijn arrays, Ars Combin. 19A (1985), 205213
[2] G. Hurlbert and G. Isaak, On the de Bruijn torus problem, J. Combin. Th. (A) 64 (1993), 50-62.
[3] G. Hurlbert, C. Mitchell, and K. Paterson, On the existence of de Bruijn tori with 2 x 2 windows, J. Combin. Th. (A) 76 (1996), 213-30.


Information on necklaces, unlabelled necklaces, Lyndon words, De Bruijn sequences

Using de Bruijn Sequences to Index a 1 in a Computer Word

Charles E. Leiserson, Harald Prokop, Keith H. Randall

Some computers provide an instruction to find the index of a 1 in a computer word, but many do not. This paper provides a fast and novel algorithm based on de Bruijn sequences to solve this problem. The algorithm involves little more than an integer multiply and a lookup in a small table. We compare the performance of our algorithm with other popular strategies that use table lookups or floating-point conversion.

Automorphisms of subword-posets
P. Ligeti and P. Sziklai
Discrete Mathematics Volume 305, Issues 1-3, 6 December 2005, Pages 372-378

In this paper the automorphism group of two posets, Dk,n and Bm,n is determined. Dk,n is the poset of DNA strands of length at most n, built up with k complement pairs of letters, and partially ordered by the subsequence relation. Bm,n is the set of all subsequences of the word um,n=a1an defined over the alphabet {0,1,…,(m-1)}, where Click to view the MathML source. The automorphism group of Bm,n was known already (see G. Burosch, H.-D.O.F. Gronau, J.-M. Laborde, The automorphism group of the subsequence poset Bm,n, Order 16 (2) (1999) 179–194 (2000)), here a short proof is presented as an illustration of the method used in the first part.

Keywords: Subword; Poset; Automorphism



A038219     A maximally unpredictable sequence.
The sequence starts 0,1,0 and continues according to the following rule: find the longest sequence at the end that has occurred at least once previously. If there are more than one previous occurrences select the last one. The next digit of the sequence is the opposite of the one following the previous occurrence.

A079101       A repetition-resistant sequence.

a(n) = 0 or 1, chosen so as to maximize the number of different subsequences that are formed.

a(n+1)=1 if and only if (a(1),a(2),...,a(n),0), but not (a(1),a(2),...,a(n),1), has greater length of longest repeated segment than (a(1),a(2),...,a(n)) has.

In Feb, 2003, Alejandro Dau solved Problem 3 on the Unsolved Problems and Rewards website, thus establishing that every binary word occurs infinitely many times in this sequence.

Klaus Sutmer remarks (Jun 26 2006) that this sequence is very similar to the Ehrenfeucht-Mycielski sequence A007061. Both sequences have every finite binary word as a factor; in fact, essentially the same proof works for both sequences.

Cumulative distribution function for order 7 de Bruijn weight classes

Mayhew, G.L.

Aerospace conference, 2009 IEEE

Order n de Bruijn sequences are the period 2n binary sequences from n-stage feedback shift registers. The de Bruijn sequences have good randomness and complexity properties. The quantity of de Bruijn sequences in a weight class of the order n generating functions is an unsolved NP complete problem. Weight class distributions for small n have been obtained by exhaustive searches. This paper uses cumulative distribution function to obtain a high resolution projection of the quantity of de Bruijn sequences in each order 7 weight class. The weight class probability mass function is a shifted Binomial probability mass function which in the limit is accurately represented as a Normal probability density function scaled by a Beta probability density function. The order 7 weight class cumulative distribution function can be modeled as a weighted sum of two Normal cumulative distribution functions.

From de Bruijn sequence (Mathworld, accessed 3/1/2010)

The lexicographic sequence of Lyndon words of lengths divisible by n gives the lexicographically smallest de Bruijn sequence (Ruskey). de Bruijn sequences can be generated by feedback shift registers (Golomb 1967; Ronse 1984; Skiena 1990, p. 196).

de Bruijn, N. G. "A Combinatorial Problem." Koninklijke Nederlandse Akademie v. Wetenschappen 49, 758-764, 1946.

Golomb, S. W. Shift Register Sequences. San Francisco, CA: Holden-Day, 1967.

Good, I. J. "Normal Recurring Decimals." J. London Math. Soc. 21, 167-172, 1946.

Knuth, D. E. "Oriented Subtrees of an Arc Digraph." J. Combin. Th. 3, 309-314, 1967.

Ronse, C. Feedback Shift Registers. Berlin: Springer-Verlag, 1984.

Ruskey, F. "Information on Necklaces, Lyndon Words, de Bruijn Sequences." http://www.theory.csc.uvic.ca/~cos/inf/neck/NecklaceInfo.html.

Skiena, S. Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Reading, MA: Addison-Wesley, pp. 195-196, 1990. 



------------------------------------

 Public Domain Dedication This work is dedicated to the Public Domain.
There are no restrictions on use of the text, images or code on this page. Claiming to be the originator of the material, explicitly or implicitly, is bad
karma. A link (if appropriate), a note to dow[at]uoregon.edu, and credit are appreciated but not required.


Comments are welcome (dow[at]uoregon.edu).