L-graphs and what they represent in TOC

 L-graphs and what they represent in TOC

Prerequisite – Finite automata introduction
All programming languages can be represented as a finite automata. C, Paskal, Haskell, C++, all of them have a specific structure, grammar, that can be represented by a simple graph. Most of the graphs are NFA’s or DFA’s. But NFA’s and DFA’s determine the simplest possible language group: group of regular languages [Chomsky’s hierarchy]. This leaves us with a question: what about all other types of languages? One of the answers is Turing machine, but a Turing machine is hard to visualize. This is why in this article I will tell you about a type of finite automata called an L-graph.

In order to understand how L-graphs work we need to know what type of languages L-graphs determine. To put it simply, L-graphs represent context-sensitive type of languages [and every other type that the context-sensitive group contains]. If you don’t know what “context-sensitive” means, let me show you an example of a language that can be represented by an L-graph and not by any easier type of finite automata.

This language is L = \{ a^nb^nc^n | n \geqslant 1 \}. Corresponding L-graph looks like this:


As you can see the brackets after the symbol ‘|’ control the numbers of symbols that come after the symbols ‘a’. This leads us to the two features that all L-graphs possess: all L-graphs have up to two independent from each other and from input symbols bracket groups, both bracket groups have to be right [string from a Dyck language] in order for the string of input symbols to be accepted by the given L-graph.



You can see that an L-graph is just a version of finite automata with an added couple of bracket groups. To help you get an understanding of why the languages determined by L-graphs are context-sensitive, check what strings the L-graph shown above has to accept.

abc: \{\varepsilon, \varepsilon, \varepsilon\} \rightarrow \{a, (, \varepsilon} \rightarrow \{ab, (), <\} \rightarrow \{abc, (), <>\}\\* a^2b^2c^2: \{\varepsilon, \varepsilon, \varepsilon\} \rightarrow \{a, (, \varepsilon\} \rightarrow \{aa, ((, \varepsilon\} \rightarrow \{aab, ((), <\} \rightarrow \{aabb, (()), <<\} \rightarrow \{aabbc, (()), <<>\} \rightarrow \{aabbcc, (()), <<>>\}\\* a^5b^5c^5: \{\varepsilon, \varepsilon, \varepsilon\} \rightarrow \{a, (, \varepsilon\} \rightarrow \ldots \rightarrow \{aaaaa, (((((, \varepsilon\} \rightarrow \{aaaaab, (((((), <\} \rightarrow \ldots \rightarrow \{aaaaabbbbb, ((((())))), <<<<<\} \rightarrow \{aaaaabbbbbc, ((((())))), <<<<<>\} \rightarrow \ldots \rightarrow \{aaaaabbbbbccccc, ((((())))), <<<<<>>>>>\}

To conclude, I would like to add three other definitions that I’ll be using in the future. These definitions are very important for the hypothesis [and its future proof or disproof]. Refer – Hypothesis (language regularity) and algorithm (L-graph to NFA)

We will call a path in the L-graph neutral, if both bracket strings are right. If a neutral path T can be represented like this, T = T_1T_2T_3, where T_1 and T_3 are cycles and T_2 is a neutral path (T_1, T_2 or T_3 can be empty), T is called a nest. We can also say that the three (T_1, T_2, T_3) is a nest or that T_1 and T_3 form a nest in the path T.

(\omega, d)-core in an L-graph G, defined as Core(G, \omega, d), is a set of (\omega, d)-canons. (\omega, d)-canon, where \omega and d are positive whole numbers, is a path that contains at most m, m \leqslant \omega, neutral cycles and at most k, k k \leqslant d d, nests that can be represented this way: T_1_1...T_1_kT_2_1T_3_1...T_3_k is part of the path T, T_i_l, i = 1 or 3, 1 \leqslant l \leqslant k, are cycles, every path T_1_jT_2_jT_3_j is a nest, where T_2_j = T_1_(_j_-_1_)T_2_(_j_-_1_)T_3_(_j_-_1_), 2 \leqslant j \leqslant k-1.

The last definition is about a context free L-graph. An L-graph G is called context free if G has only one bracket group (all rules in the L-graph have only one look of these two: [‘symbol’ | ‘bracket’, ?] or [‘symbol’ | ?, ‘bracket’]).

[Definition of a Dyck language. \Sigma_( and \Sigma_) are disjoint alphabets. There exists a bijection (function that for every element from the 1st set matches one and only one element from the 2nd set) \phi: \Sigma_( \rightarrow \Sigma_). Then the language defined by the grammar S \rightarrow \varepsilon | aSbS, a \in \Sigma_(, b = \phi(a), we will call a Dyck language.]

Post a Comment

0 Comments