Formal language

Formal languages (mathematics, logic, computing):
In mathematics, logic and computer science, a formal language is a set of finite-length
words (i.e. character strings) drawn from some finite alphabet, and the
scientific theory that deals with these entities is known as formal language theory.
Note that we can talk about formal language in many contexts (scientific, legal,
linguistic and so on), meaning a mode of expression more careful and accurate, or
more mannered than everyday speech. The sense of formal language dealt with in
this article is the precise sense studied in formal language theory.
An alphabet might be {a, b}, and a string over that alphabet might be ababba.
A typical language over that alphabet, containing that string, would be the set of all
strings which contain the same number of symbols a and b.
The empty word (that is, length-zero string) is allowed and is often denoted by e, ε or
Λ. While the alphabet is a finite set and every string has finite length, a language may
very well have infinitely many member strings (because the length of words in it may
be unbounded).
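The example above can be made concrete with a short membership test. This is a minimal sketch, assuming the alphabet {a, b} and the "same number of a and b" language described in the text; the names ALPHABET, is_word and in_balanced_language are hypothetical and chosen only for illustration.

```python
# Assumed alphabet from the running example.
ALPHABET = {"a", "b"}


def is_word(s, alphabet=ALPHABET):
    """Return True if every symbol of s is drawn from the alphabet."""
    return all(ch in alphabet for ch in s)


def in_balanced_language(s):
    """Membership test for the language of words over {a, b}
    that contain the same number of symbols a and b."""
    return is_word(s) and s.count("a") == s.count("b")
```

Under these assumptions the empty word belongs to the language (it has zero occurrences of each symbol), and the language is infinite even though each member is a finite string.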
Some examples of formal languages:
the set of all words over {a, b};
the set {a^n : n is a prime number}, where a^n means a repeated n times;
the set of syntactically correct programs in a given programming language; or
the set of inputs upon which a certain Turing machine halts.
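The first two examples can be sketched in code. The following is an illustration only, assuming the alphabet {a, b}; the helper names words_up_to, is_prime and in_prime_power_language are hypothetical. Since the set of all words is infinite, the enumeration is cut off at a chosen maximum length.

```python
from itertools import product


def words_up_to(alphabet, max_len):
    """Yield every word over the alphabet of length 0..max_len,
    shortest first (the full set of words is infinite)."""
    for n in range(max_len + 1):
        for symbols in product(sorted(alphabet), repeat=n):
            yield "".join(symbols)


def is_prime(n):
    """Trial-division primality test, adequate for small lengths."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def in_prime_power_language(s):
    """Membership test for { a^n : n is a prime number }."""
    return set(s) <= {"a"} and is_prime(len(s))
```

For instance, words_up_to({"a", "b"}, 2) enumerates the empty word, then a and b, then the four words of length two.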
A formal language can be specified in a great variety of ways, such as:
Strings produced by some formal grammar (see Chomsky hierarchy);
Strings produced by a regular expression;
Strings accepted by some automaton, such as a Turing machine or finite state
automaton;
Those instances of a set of related YES/NO questions for which the answer is YES
(see decision problem).
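To illustrate that one language can be specified in more than one way, the sketch below describes the same language twice: once by a regular expression and once by a finite state automaton. The language chosen here (words over {a, b} with an even number of a's) is an assumption made for the example and does not come from the text; the names EVEN_A, TRANSITIONS, regex_accepts and dfa_accepts are likewise hypothetical.

```python
import re
from itertools import product

# Regular expression for words over {a, b} with an even number of a's.
EVEN_A = re.compile(r"(b*ab*a)*b*")


def regex_accepts(s):
    """Accept s if the whole string matches the regular expression."""
    return EVEN_A.fullmatch(s) is not None


# Transition table of a two-state automaton tracking the parity of a's.
TRANSITIONS = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}


def dfa_accepts(s, start="even", accepting=frozenset({"even"})):
    """Run the automaton on s; reject on any symbol outside {a, b}."""
    state = start
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in accepting


# Compare the two specifications on every word of length at most 6.
agree = all(
    regex_accepts("".join(w)) == dfa_accepts("".join(w))
    for n in range(7)
    for w in product("ab", repeat=n)
)
```

Agreement on short words is only a sanity check, not a proof that the two specifications define the same language.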
Several operations can be used to produce new languages from given ones.
Suppose L1 and L2 are languages over some common alphabet.
The concatenation L1L2 consists of all strings of the form vw where v is a string from
L1 and w is a string from L2.
The intersection of L1 and L2 consists of all strings which are contained in L1 and also
in L2.
The union of L1 and L2 consists of all strings which are contained in L1 or in L2.
The complement of the language L1 consists of all strings over the alphabet which are
not contained in L1.
The right quotient L1 / L2 of L1 by L2 consists of all strings v for which there exists a
string w in L2 such that vw is in L1.
The Kleene star L1* consists of all strings which can be written in the form w1w2...wn
with strings wi in L1 and n ≥ 0. Note that this includes the empty string ε because
n = 0 is allowed.
The reverse L1^R contains the reversed versions of all the strings in L1.
The shuffle of L1 and L2 consists of all strings which can be written in the form
v1w1v2w2...vnwn where n ≥ 1 and v1,...,vn are strings such that the concatenation
v1...vn is in L1 and w1,...,wn are strings such that w1...wn is in L2.
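For finite languages, the operations listed above can be tried out directly with Python sets of strings. This is a sketch under that assumption, not a general implementation: union and intersection are simply the set operators | and &, the Kleene star is truncated at a maximum length because L1* is usually infinite, and all function names are hypothetical.

```python
from functools import lru_cache


def concatenation(l1, l2):
    """All strings vw with v in l1 and w in l2."""
    return {v + w for v in l1 for w in l2}


def right_quotient(l1, l2):
    """All strings v such that vw is in l1 for some w in l2."""
    return {u[:len(u) - len(w)] for u in l1 for w in l2 if u.endswith(w)}


def kleene_star(l1, max_len):
    """Members of l1* of length at most max_len."""
    result = {""}          # n = 0 contributes the empty string
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor from l1.
        frontier = {
            u + w for u in frontier for w in l1 if len(u + w) <= max_len
        } - result
        result |= frontier
    return result


def reverse(l1):
    """Reversed versions of all strings in l1."""
    return {w[::-1] for w in l1}


@lru_cache(maxsize=None)
def shuffle_words(v, w):
    """All interleavings of v and w that keep each word's symbol order."""
    if not v:
        return frozenset({w})
    if not w:
        return frozenset({v})
    return frozenset(
        {v[0] + s for s in shuffle_words(v[1:], w)}
        | {w[0] + s for s in shuffle_words(v, w[1:])}
    )


def shuffle(l1, l2):
    """Shuffle of two finite languages."""
    return {s for v in l1 for w in l2 for s in shuffle_words(v, w)}
```

For example, concatenation({"a", "ab"}, {"b", ""}) yields the three strings a, ab and abb, and shuffle({"ab"}, {"c"}) yields abc, acb and cab. The complement is omitted here because it is infinite whenever L1 is finite and the alphabet is non-empty.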
A question often asked about formal languages is "how difficult is it to decide whether
a given word belongs to the language?" This is the domain of computability theory
and complexity theory.
Source used: Wikipedia encyclopedia