GENERATIVE, MTT, AND CONSTRAINT IDEAS IN COMPARISON

In this book, three major approaches to linguistic description have been discussed so far, with different degrees of detail: (1) the generative approach developed by N. Chomsky, (2) the Meaning ⇔ Text approach developed by I. Mel’čuk, and (3) the constraint-based approach exemplified by the HPSG theory. In the ideal case, they produce equivalent results on identical language inputs. However, they differ deeply in their underlying ideas. In addition, they use similar terminology, but with different meanings, which may be misleading. In this section, we compare their underlying ideas and terminology. To make such different paradigms comparable, we take only a bird’s-eye view of them, emphasizing the crucial commonalities and differences, without pretending to a deeper description of any of these approaches just now.

Perhaps the most important commonality of the three approaches is that they can all be viewed in terms of linguistic signs. All of them describe the structure of the signs of the given language. All of them are used in computational practice to find the Meaning corresponding to a given Text and vice versa. However, the way they describe the signs of the language, and consequently the way those descriptions are used to build computer programs, is different.

Generative idea. The initial motivation for the generative idea was the observation that describing a language is a much more difficult, labor-consuming, and error-prone task than writing a program that uses such a description for text analysis. Thus, the formalism for the description of the language should be oriented to the process of describing and not to the process of practical application. Once created, such a description can then be applied in some way.

Now, what does it mean to describe a given language? In the view of the generative tradition, it means, roughly speaking, to list all the signs in it (this is what is frequently referred to as the generative idea). Clearly, for a natural language it is impossible to literally list all its signs, since their number is infinite. Thus, more strictly speaking, a generative grammar describes an algorithm that lists only the correct signs of the given language and lists them all, in the sense that any given sign will appear in its output after some time, perhaps a long one. The very name generative grammar is due to the fact that such a grammar describes the process of generating all language signs, one at a time.
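The enumeration idea can be illustrated with a small sketch in Python. The toy grammar below is invented for this illustration only (it is not the grammar discussed in this book). The loop rewrites the leftmost nonterminal in all possible ways, breadth first, so that any given string of the toy language is printed after a finite number of steps:

```python
from collections import deque

# A toy context-free grammar invented only for this illustration.
# Uppercase symbols are nonterminals; lowercase words are terminals.
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "near", "NP"]],   # recursion: infinitely many signs
    "VP": [["V"], ["V", "NP"]],
    "N":  [["dog"], ["cat"]],
    "V":  [["sees"], ["sleeps"]],
}

def enumerate_language(max_signs=10):
    """List the strings of the toy language one at a time (breadth first)."""
    queue = deque([("S",)])                       # start from the initial symbol
    while queue and max_signs > 0:
        form = queue.popleft()
        i = next((k for k, s in enumerate(form) if s in RULES), None)
        if i is None:                             # no nonterminal left: a finished sign
            print(" ".join(form))
            max_signs -= 1
        else:                                     # rewrite the leftmost nonterminal
            for rhs in RULES[form[i]]:
                queue.append(form[:i] + tuple(rhs) + form[i + 1:])

enumerate_language()
```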

FIGURE IV.16. Generative idea.


There can be many ways to generate language signs. The specific kind of generative grammar suggested by N. Chomsky constructs each sign gradually, through a series of intermediate, half-finished sign “embryos” of different degrees of maturity (see Figure IV.16). All of them are built starting from the same “egg-cell,” called the initial symbol, which is not a sign of the given language. A very simple example of the rules for such gradual building is given on pages 35 to 39; in this example, the tree structure can be roughly considered the Meaning of the corresponding string.
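A minimal sketch of such gradual building, again with a toy grammar invented here (not the one from pages 35 to 39), is given below; each printed line is one of the half-finished “embryos,” and the last line is the finished sign:

```python
# Toy rules invented for this illustration only.
RULES = {
    "S":  [["NP", "VP"]], "NP": [["the", "N"]],
    "VP": [["V", "NP"]],  "N":  [["dog"], ["cat"]], "V": [["sees"]],
}

def derive(choices):
    """Rewrite the leftmost nonterminal at every step, starting from the
    initial symbol S and printing each intermediate, half-finished form."""
    form = ["S"]
    for choice in choices:
        print(" ".join(form))                     # an "embryo" of the sign
        i = next(k for k, s in enumerate(form) if s in RULES)
        form[i:i + 1] = RULES[form[i]][choice]    # one derivation step
    print(" ".join(form))                         # the finished sign

derive([0, 0, 0, 0, 0, 0, 1])   # -> "the dog sees the cat"
```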

Where does the infinity of generated signs come from? At each step, called a derivation step, the generation can be continued in different ways, and there is no bound on the number of such steps. Thus, there exists an infinite number of signs with very long derivation paths, though for each specific sign its derivation process is finite.

However, this whole generation process is only imaginary and serves as the formalism for the description of the language. It is not, and is not intended to be, applied in practice for the generation of an infinitely long list of language expressions, which would be senseless. The use of the description, once created, for passing from Text to Meaning and vice versa is indirect. A program called a parser is developed by a mathematician (not a linguist) by means of automatic “reversing” of the original description of the generative process.

FIGURE IV.17. Practical application of the generative idea.


This program can answer the questions: What signs having the given Text as their signifier would the grammar generate? What signs having the given Meaning as their signified would it generate? (See Figure IV.17.)

The parser does not really try to generate any signs; instead, it solves such an equation using data structures and algorithms quite different from the original description of the generation process.

The result produced by such a black box is, however, exactly the same: given a Text, the parser finds a Meaning such that the corresponding sign belongs to the given language, i.e., would be generated by the imaginary generation algorithm. At the same time, the description of the imaginary generation process is much clearer than the description of the internal structures automatically built by the parser for practical applications.
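To make the contrast concrete, here is a minimal sketch (with invented data) of one simple way a parser can answer such questions for a toy grammar of the kind used above: given a Text, it searches for a parse tree, which in this simple setting plays the role of the Meaning. Parsers automatically derived from real grammars typically use quite different internal structures, as noted above; this sketch only shows what question is being answered.

```python
# The same style of toy grammar as above, invented for this illustration.
RULES = {
    "S":  [["NP", "VP"]], "NP": [["the", "N"]],
    "VP": [["V"], ["V", "NP"]],
    "N":  [["dog"], ["cat"]], "V": [["sees"], ["sleeps"]],
}

def parse(symbol, words, start):
    """Try to derive words[start:] from `symbol`; yield (tree, next_position)."""
    if symbol not in RULES:                       # a terminal word
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in RULES[symbol]:                     # try every production
        for children, pos in expand(rhs, words, start):
            yield (symbol, children), pos

def expand(rhs, words, start):
    if not rhs:
        yield [], start
        return
    for first, pos in parse(rhs[0], words, start):
        for rest, end in expand(rhs[1:], words, pos):
            yield [first] + rest, end

text = "the dog sees the cat".split()
for tree, pos in parse("S", text, 0):
    if pos == len(text):                          # the whole Text was consumed
        print(tree)                               # the tree stands for the Meaning
```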

Meaning ⇔ Text idea. Like any other grammar, an MTT grammar is aimed at practical application in language analysis and synthesis. Unlike a generative grammar, it does not concentrate on the enumeration of all possible language signs, but rather on the laws of correspondence between the Text and the Meaning in any sign of the given language. Whereas for a given text a generative grammar can answer the question Do any signs with such a Text exist, and if so, what are their Meanings?, an MTT grammar only guarantees an answer to the question If signs with such a Text existed, what would be their Meanings?

In practice, the MTT models usually can distinguish existing signs from ungrammatical ones, but mainly as a side effect. This makes the MTT models more robust in parsing.

Another idea underlying the MTT approach is that linguists are good enough at intuitively understanding the correspondence between Texts and Meanings and can describe such correspondences directly. This allows one to avoid the complications of generative grammars related to the reversal of rules. Instead, the rules are applied to the corresponding data structures directly, as written down by the linguist (such a property of a grammar is sometimes called type transparency [47]). Direct application of the rules greatly simplifies debugging of the grammar. In addition, the direct description of the correspondence between Text and Meaning is supposed to better suit the linguistic reality and thus results in a smaller number of rules.
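A minimal sketch of this kind of type transparency, with data invented for the illustration, is the rule table below: each correspondence between a fragment of Text and a fragment of Meaning is written down once and is looked up directly in either direction, with no reversing step:

```python
# Invented Text <-> Meaning correspondences, written once and used directly
# both for analysis (Text -> Meaning) and for synthesis (Meaning -> Text).
RULES = [
    ("dog",    ("DOG",   {"number": "singular"})),
    ("dogs",   ("DOG",   {"number": "plural"})),
    ("sleeps", ("SLEEP", {"person": 3, "number": "singular"})),
]

def analyze(text_fragment):
    """Text -> Meaning: read the rules from left to right."""
    return [meaning for text, meaning in RULES if text == text_fragment]

def synthesize(meaning_fragment):
    """Meaning -> Text: read the very same rules from right to left."""
    return [text for text, meaning in RULES if meaning == meaning_fragment]

print(analyze("dogs"))                                             # [('DOG', {'number': 'plural'})]
print(synthesize(("SLEEP", {"person": 3, "number": "singular"})))  # ['sleeps']
```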

Similarly to the situation with generative grammars, there can be many ways to describe the correspondence between Text and Meaning. The specific kind of MTT grammar suggested by I. Mel’čuk describes this correspondence gradually, through many intermediate, half-finished almost-Meanings, half-Meanings, half-Texts, and almost-Texts, as if they were located inside the same sign between its Meaning and Text (see Figure IV.18).

Since the MTT and the generative approach developed rather independently, they happen to use similar terms with quite different and unrelated meanings. Below we explain the differences in the use of some of these terms, though these informal explanations are not strict definitions.

· In generative grammar (see Figure IV.16):

- Transformation: a term used in early works by N. Chomsky for a specific kind of non-context-free derivation.

- Deep structure, in the transformational grammar, is a half-finished sign with a special structure to which a transformation is applied to obtain a “readier” sign. It is nearer to the initial symbol than the surface structure.

FIGURE IV.18. Meaning ⇔ Text idea.


- Surface structure is a half-finished sign obtained as the result of the transformation. It is nearer to the ready sign than the deep structure.

- Generation is used roughly as a synonym of derivation, to refer to the process of enumeration of the signs in the given language.

· In the MTT (see Figure IV.18):

- Transformation is sometimes used for equative correspondences between representations on different levels.

- Deep structure concerns the representation nearer to Meaning.

- Surface structure concerns the representation nearer to Text.

- Generation (of text) is sometimes used as a synonym of synthesis, i.e., the construction of a Text for a given Meaning.

Constraint-based idea. Similarly to a generative grammar, a constraint-based grammar describes which signs exist in the given language, however not by means of explicitly listing (generating) all such signs, but rather by stating the conditions (constraints) that each sign of the given language must satisfy.

It can also be viewed as specifying which signs do not exist in the given language: if you remove one rule (generation option) from a generative grammar, it will generate fewer signs; if you remove one rule (constraint) from a constraint-based grammar, it will allow more signs (i.e., it will allow some signs that are really ungrammatical in the given language). Hence the name constraint-based. (See also page 44.)
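The effect of removing a constraint can be shown with a small sketch; all features and candidate signs below are invented for the illustration. A candidate sign is admitted only if it satisfies every constraint, so dropping a constraint can only enlarge the set of admitted signs:

```python
# Invented candidate "signs", described by a few features.
CANDIDATES = [
    {"subject_number": "singular", "verb_number": "singular", "has_verb": True},
    {"subject_number": "plural",   "verb_number": "singular", "has_verb": True},
    {"subject_number": "singular", "verb_number": "singular", "has_verb": False},
]

# Invented constraints; every sign of the "language" must satisfy all of them.
CONSTRAINTS = {
    "agreement":     lambda s: s["subject_number"] == s["verb_number"],
    "verb_required": lambda s: s["has_verb"],
}

def admitted(constraints):
    return [s for s in CANDIDATES if all(check(s) for check in constraints.values())]

print(len(admitted(CONSTRAINTS)))          # 1: only the fully grammatical candidate
relaxed = {name: check for name, check in CONSTRAINTS.items() if name != "agreement"}
print(len(admitted(relaxed)))              # 2: one removed constraint, more signs allowed
```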

Since constraint-based grammars do not use the generation process shown in Figure IV.16, their rules are applied within one and the same sign rather than to obtain one sign from another, half-finished one.

This makes them similar to the MTT. Indeed, though the constraint-based approach originated within the generative tradition, modern constraint-based grammars such as HPSG show less and less similarity to the Chomskian tradition and more and more similarity, not in the formalism but in the meaningful linguistic structures, to the MTT.

A constraint-based grammar is like a system of equations. Let us consider a simple mathematical analogy.

Each sheet of this book is numbered on both sides. Consider the sides with even numbers. Looking at a page number, say 32, you can guess that it is printed on the 16th sheet of the book. Let what you see be the Text and what you guess be the Meaning; then this page number corresponds to a “sign” <32, 16>, where <T, M> denotes a sign with Text T and Meaning M. To describe such a “language,” the three approaches would use different mathematical constructions (of course, in a very rough analogy; a small code sketch after the following list illustrates all three):

· A generative grammar is like a recurrent formula: the sign <2, 1> (the analogue of the initial symbol) belongs to this “language,” and if <x, y> belongs to it, then <x + 2, y + 1> also belongs to it (the analogue of a generation rule). Note that some effort is needed to figure out from this description how to find the sheet number from a given page number.

· An MTT grammar is like an algorithm: given a page number x, its sheet number is calculated as x / 2; given a sheet number y, its page number is calculated as 2 × y. Note that we have made no attempt to describe dealing with, or excluding, odd page numbers x, which in fact do not belong to our “language.”

· A constraint-based grammar is like an equation or a system of equations: exactly those signs for which x = 2y belong to our “language.” Note that this description is the most elegant and simple one, it describes our “language” completely and accurately, and it requires less reversing effort for practical application than the first one. However, it requires more such effort than the second one.
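The three constructions of this analogy can be rendered in code as follows (a rough sketch, for illustration only):

```python
# Generative style: a recurrent rule enumerating the signs <page, sheet>.
def generate(limit):
    sign = (2, 1)                              # the analogue of the initial symbol
    while sign[0] <= limit:
        yield sign
        sign = (sign[0] + 2, sign[1] + 1)      # the analogue of a generation rule

# MTT style: two directly written procedures, one per direction.
def sheet_of(page):   return page // 2         # Text -> Meaning
def page_of(sheet):   return 2 * sheet         # Meaning -> Text

# Constraint-based style: one equation that every sign must satisfy.
def is_sign(page, sheet):
    return page == 2 * sheet

print(list(generate(8)))                 # [(2, 1), (4, 2), (6, 3), (8, 4)]
print(sheet_of(32), page_of(16))         # 16 32
print(is_sign(32, 16), is_sign(33, 16))  # True False
```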

The constraint-based idea is a very promising approach adopted by the majority of contemporary grammar formalisms. Probably, with time, the linguistic findings of the MTT will be re-formulated in the form of constraint-based rules, possibly through a kind of merging of the linguistic heritage of the MTT with the formalisms developed within the HPSG framework. However, for the time being we consider the MTT more mature and thus richer in detailed descriptions of a vast variety of linguistic phenomena. In addition, this approach is the most directly applicable, i.e., it does not need any reversing.

As to the practical implementation of HPSG parsers, it is still an ongoing effort.
