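A Computational Theory of Writing Systems takes as its starting point the practical problem of building text-to-speech conversion systems; its aim is not to survey the world's writing systems. The most important theoretical claims are all made in Chapter 1. The two basic claims are: (1) the mapping from word forms to their written representation is a regular relation; (2) the linguistic information expressed by the writing system of a given language is consistent (consistency). The remaining chapters mainly elaborate on and support these two claims through examples viewed from different angles. Chapter 2 discusses the regularity of writing systems in more detail. Chapter 3 explains in detail how particular scripts express linguistic information and addresses the consistency of the information expressed. Chapter 4 introduces several classifications of writing systems commonly used in modern linguistics and then proposes a two-dimensional classification of writing systems. Chapter 5 briefly describes how psycholinguistic methods are used to analyze the way native readers convert text to speech, and checks the theory proposed in the book against psycholinguistic findings. Chapter 6 first discusses how scripts and writing systems are borrowed and passed on from one language to another, then covers how abbreviations and numerals are represented in writing and converted, and closes with a summary of the book.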
Reading Guide
Preface
List of Figures
List of Tables
1 Reading Devices
1.1 Text-to-Speech Conversion: A Brief Introduction
1.2 The Task of Pronouncing Aloud: A Model
1.2.1 A Simple Example from Russian
1.2.2 Formal Definitions
1.2.2.1 AVMs and Annotation Graphs
1.2.2.2 Definitions
1.2.2.3 Axioms
1.2.3 Central Claims of the Theory
1.2.3.1 Regularity
1.2.3.2 Consistency
1.2.4 Further Issues
1.2.4.1 Why a Constrained Theory of Writing Systems?
1.2.4.2 Orthography and the “Segmental” Assumption
1.3 Terminology and Conventions
1.A Appendix: An Overview of Finite State Automata and Transducers
1.A.1 Regular Languages and Finite State Automata
1.A.2 Regular Relations and Finite State Transducers
2 Regularity
2.1 Planar Regular Languages and Planar Regular Relations
2.2 The Locality Hypothesis
2.3 Planar Arrangements: Examples
2.3.1 Korean Hankul
2.3.2 Devanagari
2.3.3 Pahawh Hmong
2.3.4 Chinese
2.3.5 A Counterexample from Ancient Egyptian
2.4 Cross Writing System Variation in the SLU
2.5 Macroscopic Catenation: Text Direction
2.A Sample Chinese Characters and Their Analyses
3 ORL Depth and Consistency
3.1 Russian and Belarusian Orthography: A Case Study
3.1.1 Vowel Reduction
3.1.2 Regressive Palatalization
3.1.3 Lexical Marking in Russian and Other Issues
3.1.4 Summary of Russian and Belarusian
3.2 English
3.3 The Orthographic Representation of Serbo-Croatian Consonant Devoicing
3.3.1 Methods and Materials
3.3.2 Results
3.4 Cyclicity in Orthography
3.5 Surface Orthographic Constraints
3.A English Deep and Shallow ORLs
3.A.1 Lexical Representations
3.A.2 Rules for the Deep ORL
3.A.3 Rules for the Shallow ORL
4 Linguistic Elements
4.1 Taxonomies of Writing Systems: A Brief Overview
4.1.1 Gelb
4.1.2 Sampson
4.1.3 DeFrancis
4.1.3.1 No Full Writing System Is Semasiographic
4.1.3.2 All Full Writing Is Phonographic
4.1.3.3 Hankul Is Not Featural
4.1.4 A New Proposal
4.1.5 Summary
4.2 Chinese Writing
4.3 Japanese Writing
4.4 Some Further Examples
4.4.1 Syriac Syame
4.4.2 Reduplication Markers
4.4.3 Cancellation Signs
5 Psycholinguistic Evidence
5.1 Multiple Routes and the Orthographic Depth Hypothesis
5.1.1 Evidence for the Orthographic Depth Hypothesis
5.1.2 Evidence against the Orthographic Depth Hypothesis
5.2 "Shallow" Processing in "Deep" Orthographies
5.2.1 Phonological Access in Chinese
5.2.2 Phonological Access in Japanese
5.2.3 Evidence for the Function of Phonetic Components in Chinese
5.2.4 Summary
5.3 Connectionist Models: The Seidenberg-McClelland Model
5.3.1 Outline of the Model
5.3.2 What Is Wrong with the Model?
5.4 Summary
6 Further Issues
6.1 Adaptation of Writing Systems: The Case of Manx Gaelic
6.2 Orthographic Reforms: The Case of Dutch
6.2.1 The 1954 Spelling Rules
6.2.2 The 1995 Spelling Rules
6.3 Other Forms of Notation: Numerical Notation and Its Relation to Number Names
6.4 Abbreviatory Devices
6.5 Non-Bloomfieldian Views on Writing
6.6 Postscript
Bibliography
Index
Our starting point for this study of writing systems is text-to-speech synthesis (TTS), and more specifically the computational problem of converting from written text into a linguistic representation. While the connection between TTS systems on the one hand and writing systems on the other may not be immediately apparent, a moment's reflection will make it clear that the problem to be solved by a TTS system, namely the conversion of written text into speech, is exactly the same problem as a human reader must solve when presented with a text to be read aloud. And just as writing systems, their properties, and the ways in which they encode linguistic information are of interest to psycholinguists who study how people read, so (in principle) should such considerations be of interest to those who develop TTS technology: at the very least, it ought to be of as much interest as, for example, understanding the physiology and acoustics underlying speech production, something that early speech synthesis researchers such as Fant (1960) were heavily involved in.
Since my starting point is TTS, and since I assume that most readers will not be familiar with this field, I will start this chapter with a review of some of the issues relevant to the development of TTS systems, particularly as they relate to the problem of analyzing input text. This will be the topic of Section 1.1. In Section 1.2 I will informally introduce, by way of a simple example, the model that I shall be developing throughout the rest of this book. Finally, Section 1.3 will introduce some aspects of the formalism and the conventions that will be used throughout this book.
……