Notes on the (interim) lexical database

1. Overview

The dictionary is currently stored in an MS Access database, available for download at http://www.wulfila.be/gothic/download. Here is an overview of the relevant tables and relations:

[Figure: overview of the relational structure of the database]

The structure of the entire database is available here.

In its present form, the main table Lemmata is actually a conflation of two different dictionaries: a ‘flattened’ view of the headwords in Streitberg's Gotisch-Griechisch-Deutsches Wörterbuch (1910) and a more formal dictionary used internally for generating paradigms and tagging the text. The former is human-oriented and aimed at glossing words; the latter is machine-oriented and aimed at searching and analysis.

Ultimately, the goal is to separate these entities, keeping the formal dictionary in the database and storing Streitberg's dictionary in an external XML document, encoded using the TEI DTD [Text Encoding Initiative P4, chapter 12: Print Dictionaries]. They would be linked by means of corresponding identifiers. In the future, other TEI-encoded dictionaries could be linked to the database as well, e.g. the glossary to Wright's Grammar of the Gothic Language (1910), available in ad-hoc XML at the Germanic Lexicon Project.

  1. Fields concerning Streitberg's dictionary are prefixed with WS. The main table has separate entries for every lemma, related lemma (TEI <re>), homonym (TEI <superEntry> or <hom>) or cross-reference found in Streitberg's dictionary. It contains only information that can reasonably be stored in a relational DB, i.e. orthographical form(s), additional information on the form, grammatical class and additional information, location in the dictionary, sort key, flags marking the lemma as reconstructed, conjectured or ‘besserungsbedürftig’ etc. The list of headwords is complete and has been manually checked twice, as recorded in the timestamps. The translation field, however, is an unstructured ‘stub’, providing basic, plain-text transcriptions of the layout of the entry, German glosses and notes and occasionally a few examples.

    Table WSClasses lists the most common grammatical labels used by Streitberg (e.g. ‘Akk.Plur.’). Less frequently used labels are stored directly in field WSClassOverride (e.g. ‘Akk.Pl.’), overriding the value of WSClass; WSClassOverrideReason names the reason for doing so (in this case: alternate spelling). The distinction is a leftover from earlier attempts to normalize Streitberg's labels. It is probably unnecessary and can be discarded, since grammatical information is more precisely described in the formal dictionary. It could, however, be used to provide normalized values for some labels in a TEI edition, e.g. <gram norm="Akk.Plur.">Akk.Pl.</gram>.

  2. The formal, machine-oriented dictionary defines ‘normalized’ lexical data, used to generate the lexicon (the set of all possible word-forms) and tag the text (the set of all recorded word-forms or tokens). It is closely tied to the Gomorph application, a formal description of Gothic morphology. In contrast to Streitberg's dictionary, where Part-of-Speech (TEI <pos>) and inflection (TEI <iType>) are generally merged into one label (e.g. ‘st.V.3’), each lexeme is linked to exactly one PoS and one inflectional category, by means of foreign keys pointing to other tables. When there are several alternative reconstructions, one is chosen as the primary form, the others are stored as separate lemmata with a pointer to the main lemma (in field Parent). Table Gomorph lists the inflectional classes defined in the Gomorph specification. POSTags defines Part-of-Speech classes. LexicalParameters contains input for the Gomorph engine, typically features that override the class definition (such as noun aba (Mn) having irregular dative/genitive plural abnam/abne).
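
The way LexicalParameters entries override the class definition can be pictured as a dictionary merge. The following is only a minimal sketch: the slot names, the CLASS_ENDINGS table and the build_paradigm function are hypothetical and do not reflect the actual Gomorph engine or the layout of table LexicalParameters. The forms abnam/abne come from the note above; the regular endings -am/-ane and the sample noun guma are assumed from the standard masculine n-stem paradigm.

```python
# Hypothetical class definition: endings per paradigm slot for masculine
# n-stems (Mn). Only the two plural slots discussed above are modelled.
CLASS_ENDINGS = {
    "Mn": {"dat.pl": "am", "gen.pl": "ane"},
}


def build_paradigm(stem, gomorph_class, overrides=None):
    """Generate forms from the class endings, then let lexical parameters win."""
    endings = CLASS_ENDINGS[gomorph_class]
    forms = {slot: stem + ending for slot, ending in endings.items()}
    # Lexical parameters replace the generated form for the slots they name.
    forms.update(overrides or {})
    return forms


# A regular Mn noun needs no parameters.
regular = build_paradigm("gum", "Mn")

# aba (Mn) overrides both plural slots with its irregular forms.
aba = build_paradigm("ab", "Mn", overrides={"dat.pl": "abnam", "gen.pl": "abne"})

print(regular)
print(aba)
```

Keeping the exceptions as per-lemma data leaves the class definition itself free of special cases.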

The exact nature of an entry in the combined table is determined by field LemmaType. Most of the time, there is a simple one-to-one correspondence between the formal lemma and Streitberg's entry. As mentioned above, this is not the case for homonyms, related lemmata and alternative spellings and/or reconstructions, which are usually grouped together in Streitberg's dictionary (many-to-one relation). Streitberg's cross-references to other entries are omitted from the formal dictionary (zero-to-one relation). A few headwords are missing in Streitberg's book (one-to-zero relation).
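
The correspondence between the two dictionaries can be read off the LemmaType ID. The sketch below derives it from the "Streitberg only" and "NOT in Streitberg" remarks in table LemmaTypes (section 3). Treating all remaining types as present in both dictionaries is an assumption, and dictionaries_for is a hypothetical helper, not part of the database.

```python
# IDs taken from table LemmaTypes (section 3).
STREITBERG_ONLY = {2, 3, 4, 6, 7}  # described as "Streitberg only"
FORMAL_ONLY = {12, 13}             # described as "NOT in Streitberg"


def dictionaries_for(lemma_type):
    """Return (in_streitberg, in_formal) for a LemmaType ID.

    Assumption: any type not explicitly restricted belongs to both
    dictionaries.
    """
    in_streitberg = lemma_type not in FORMAL_ONLY
    in_formal = lemma_type not in STREITBERG_ONLY
    return in_streitberg, in_formal


for lemma_type in (1, 2, 13):
    print(lemma_type, dictionaries_for(lemma_type))
```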

Using the type and parent fields, it is relatively straightforward to generate the ‘skeleton’ of a valid TEI document, including all entries and super-entries but without the actual <sense> element, which would initially be limited to a partial, plain-text transcription.
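
A rough sketch of such a skeleton generator is given below, restricted to normal lemmata (type 1) and sublemmata (type 10); super-entries and homographs are left out. The column names ID, Form, LemmaType and Parent, the placeholder headwords and the id scheme are assumptions made for illustration, not the actual export of the Lemmata table.

```python
import xml.etree.ElementTree as ET

NORMAL, SUBLEMMA = 1, 10  # IDs from table LemmaTypes

# Placeholder rows; the column names and headwords are hypothetical.
rows = [
    {"ID": 1, "Form": "headword-a", "LemmaType": NORMAL, "Parent": None},
    {"ID": 2, "Form": "headword-a-related", "LemmaType": SUBLEMMA, "Parent": 1},
    {"ID": 3, "Form": "headword-b", "LemmaType": NORMAL, "Parent": None},
]


def add_form(element, orth):
    """Attach <form><orth>...</orth></form> to an entry-like element."""
    form = ET.SubElement(element, "form")
    ET.SubElement(form, "orth").text = orth


def build_skeleton(rows):
    body = ET.Element("body")
    entries = {}
    # First pass: one <entry> per normal lemma.
    for row in rows:
        if row["LemmaType"] == NORMAL:
            entry = ET.SubElement(body, "entry", id=f"L{row['ID']}")
            add_form(entry, row["Form"])
            entries[row["ID"]] = entry
    # Second pass: nest each sublemma as <re> inside its parent entry.
    for row in rows:
        if row["LemmaType"] == SUBLEMMA:
            parent = entries[row["Parent"]]
            related = ET.SubElement(parent, "re", id=f"L{row['ID']}")
            add_form(related, row["Form"])
    return body


skeleton = build_skeleton(rows)
print(ET.tostring(skeleton, encoding="unicode"))
```

Two passes are used so that a sublemma can be attached even when its row precedes the parent row in the input.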

2. Definition of the main table (Lemmata)

[Figure: MS Access screenshot of the definition of table Lemmata]

3. Contents of related tables

Table LemmaTypes
ID Name Description
1 Normal Normal lemma.
2 Link, spelling Streitberg only: reference to another lemma, for which it has an alternative spelling.
3 Link, hyphen Streitberg only: reference to another, identical lemma, the only difference being the use of a hyphen to separate root and prefix.
4 Link, derived Streitberg only: derived form of another lemma listed as a separate lemma with reference to the main lemma, e.g. "wait" linked to "witan".
5 Link, suppletive Suppletive form of another lemma listed as a separate lemma, e.g. "batiza" linked to "goþs".
6 Links, spelling Streitberg only: combined references to other lemmas, e.g. "leikeis, leikeinassus s. lekeis etc."
7 Link, conjecture Streitberg only: form marked as "besserungsbedürftig" referring to a corresponding conjecture (or the other way round).
9 Homograph Separated lemma, one item out of a set of homographs that is treated as one entry in Streitberg's lexicon (corresponds to TEI <hom> element).
10 Sublemma Separated lemma, mentioned as a related or derived lemma in Streitberg's lexicon (corresponds to TEI <re> element). Field [Parent] contains the primary key of the parent entry.
11 Component Lemma cannot be instantiated directly, e.g. enclitics or roots ("-hun" or "-kunnan")
12 Alternative reconstruction NOT in Streitberg: alternative reconstruction for the lemma specified by [Parent].
13 Missing in Streitberg NOT in Streitberg: missing

Table POSTags
ID Name Description
1 Noun, proper
2 Noun, common Generic (unspecified subcategory).
3 Noun, common, masculine
4 Noun, common, feminine
5 Noun, common, neuter
6 Adjective
7 Adjective, Comparative
8 Adjective, Superlative
9 Verb
10 Participle, present
11 Participle, past
12 Adverb
13 Adverb, Comparative
14 Adverb, Superlative
15 Adverb, manner
16 Adverb, locative
17 Adverb, directional
18 Adverb, temporal
19 Preposition Generic (unspecified subcategory).
20 Preposition, +A
21 Preposition, +D
22 Preposition, +G
23 Preposition, +AD
24 Preposition, +DG
25 Preposition, +ADG
26 Pronoun Generic (unspecified subcategory).
27 Pronoun, personal
28 Pronoun, personal, reflexive
29 Pronoun, personal, relative
30 Pronoun, possesive
31 Pronoun, possesive, reflexive
32 Pronoun, demonstrative
33 Pronoun, relative
34 Pronoun, interrogative
35 Pronoun, indefinite
36 Numeral, cardinal
37 Numeral, ordinal
38 Numeral, other Collective numeral, distributive numeral, ...?
39 Conjunction
40 Conjunction, enclitic
41 Particle
42 Particle, enclitic
43 Interjection
44 Foreign word
45 Multiple functions Temporary tag: the lemma can have different functions and should be split (e.g. "nu": adverb or conjunction).
46 Unassigned Temporary tag: the lemma is problematic (or unique as in EAGLES unique/unassigned).
47 <none> Not applicable, e.g. when the entry is a link to another lemma.
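
The preposition tags 20-25 encode the governed cases in their names. A small sketch of how such a name might be decoded follows. Reading A, D and G as accusative, dative and genitive is an assumption, and governed_cases is a hypothetical helper that is not part of the database.

```python
# Assumed reading of the case letters used in tags 20-25.
CASES = {"A": "accusative", "D": "dative", "G": "genitive"}


def governed_cases(tag_name):
    """Return the cases named by a preposition tag such as 'Preposition, +AD'.

    The generic tag 'Preposition' yields an empty list.
    """
    head, _, suffix = tag_name.partition(", +")
    if head != "Preposition":
        raise ValueError(f"not a preposition tag: {tag_name!r}")
    return [CASES[letter] for letter in suffix]


print(governed_cases("Preposition, +AD"))
print(governed_cases("Preposition"))
```
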

Table Gomorph
ID Name Description
1 Indeclinable Indeclinable word (preposition, conjunction, ...)
2 Noun Unspecified noun
3 Ma Masculine a-stems [St §145; BE §90-91]
4 Na Neuter a-stems [St §145; BE §93-94]
5 Mja Masculine short ja-stems [St §146; BE §90/92]
6 Mia Masculine long ja-stems [St §146; BE §90/92]
7 Nja Neuter short ja-stems [St §146; BE §93/95]
8 Nia Neuter long ja-stems [St §146; BE §93/95]
9 Mwa Masculine wa-stems [St §147]
10 Nwa Neuter wa-stems [St §147]
11 Fo Pure ō-stems (feminine) [St §149; BE §96-97]
12 Fjo Short jō-stems (feminine) [St §150; BE §96-97]
13 Fio Long jō-stems (feminine) [St §150; BE §98]
14 Fwo wō-stems (feminine) [St §151; BE §97.A1]
15 Mi Masculine i-stems [St §152; BE §100-101]
16 Fi Feminine i-stems [St §152; BE §102-103]
17 Fi-o Feminine i/o-stems [St §152.A6; BE §103.A1]
18 u Masculine OR Feminine u-stems [St §153; BE §104: 'Bei einigen ist das Geschlecht zweifelhaft']
19 Mu Masculine u-stems [St §153; BE §104-105]
20 Fu Feminine u-stems [St §153; BE §104-105]
21 Nu Neuter u-stems [St §153; BE §106]
22 Mu-i Masculine u/i-stems [St §163; BE §120.A1]
23 Mn Masculine n-stems [St §155-156; BE §107-108]
24 Fn Feminine n-stems [St §155-157; BE §111-113]
25 Nn Neuter n-stems [St §155; BE §109-110]
26 Mr Masculine r-stems [St §158; BE §114]
27 Fr [Feminine r-stems]
28 Mnd Masculine nd-stems [St §159; BE §115]
29 Fkons Feminine root nouns [St §160; BE §116]
30 Mkons Masculine root nouns [St §161; BE §117]
31 Adjective Unspecified adjective
32 Adj.a Pure a/o-stems [St §181; BE §123-124]
33 Adj.ja Short ja/jo-stems [St §182; BE §125-126]
34 Adj.ia Long ja/jo-stems [St §182; BE §127-128]
35 Adj.i Partial i-stems ('nur noch in Resten vorhanden'; most forms went over into ja-declension) [St §183; BE §130]
36 Adj.u Partial u-stems ('nur noch in Resten vorhanden'; most forms went over into ja-declension) [St §184; BE §131]
37 Part.Pres. Present participle [St §187.3; BE §133]
38 Comparative Adjectives: comparative degree [St §188; BE §135-136]
39 Part.Perf. Past participle [St §186.2; BE §134]
40 Superlative Adjectives: superlative degree [St §189; BE §137]
41 Verb Unspecified verb
42 abl.V.1 Strong verb (class 1: Ablautreihe ei/ái/i[aí]/i[aí]) [St §203; BE §172]
43 abl.V.2 Strong verb (class 2: Ablautreihe iu/áu/u[aú]/u[aú]) [St §204; BE §173]
44 abl.V.3 Strong verb (class 3: Ablautreihe i[aí]/a/u[aú]/u[aú]) [St §205-206; BE §174]
45 abl.V.4 Strong verb (class 4: Ablautreihe i[aí]/a/e/u[aú]) [St §207; BE §175]
46 abl.V.5 Strong verb (class 5: Ablautreihe i[aí]/a/e/i[aí]) [St §208; BE §176]
47 abl.V.6 Strong verb (class 6: Ablautreihe a/o/o/a) [St §209; BE §177]
48 red.V.1 Reduplicating strong verb (class 1: present stem contains -aiC-) [St §211; BE §178-179]
49 red.V.2 Reduplicating strong verb (class 2: present stem contains -auC-) [St §211; BE §178-179]
50 red.V.3 Reduplicating strong verb (class 3: present stem contains -ǎCC- or -āC-) [St §211; BE §178-179]
51 red.V.4 Reduplicating strong verb (class 4: present stem contains -ēC-) [St §211; BE §178-179]
52 red.V.5 Reduplicating strong verb (class 5: present stem contains -ō[C]-)
53 red.-abl.V. Reduplicating strong verb with Ablaut [St §212; BE §180-182]
54 sw.V.1-j Weak verb (class 1: -ja- / short stems) [St §216; BE §185-188]
55 sw.V.1-i Weak verb (class 1: -ja- / long stems) [St §216; BE §185-188]
56 sw.V.2 Weak verb (class 2: -o-) [St §217; BE §189-190]
57 sw.V.3 Weak verb (class 3: -ai-) [St §218; BE §191-193]
58 sw.V.4 Weak verb (class 4: -na-) [St §219; BE §194-195]
59 V.prt.-prs. Preterite-presents ('Die Verba präterito-präsentia haben Perfektform aber Präsensbedeutung') [St §220; BE §196]
60 Pers.Pron Personal pronouns ('ungeschlechtige Pronomina') [St §164; BE §150]
61 Pron. Enumeration of selected pronominal declensions [St §165-178; BE §152-166]
62 Num.1 'Die drei ersten Zahlen sind in allen Kasus und Geschlechtern deklinierbar.' [BE §140]
63 Num.2 'Die Zahlen 4-19 sind eingeschlechtig. [...] Diese Zahlen werden unflektiert gebraucht, im Gen. und Dat. können sie jedoch flektierte Formen [...] bilden.' [BE §141]
64 <none> Do NOT generate a paradigm, e.g. because the entry in Streitberg's lexicon is a link.

Table WSClasses
ID Name
1 <none>
2 <unassigned>
3 <multiple tags>
4 abl.V.1
5 abl.V.2
6 abl.V.3,1
7 abl.V.3,2
8 abl.V.4
9 abl.V.5
10 abl.V.6
11 Adj.
12 Adj.a
13 Adj.i/ja
14 Adj.ia
15 Adj.ja
16 Adj.u
17 Adj.wa
18 Adv.
19 Adv. d. Richtung
20 Adv. d. Ruhe
21 Akk.
22 Akk.Plur.
23 Akk.Sing.
24 Dat.
25 Dat.Plur.
26 Dat.Sing.
27 Eigenn.
28 Enklitikon
29 F
30 Fi
31 Fi/kons
32 Fi/o
33 Fio
34 Fjo
35 Fkons
36 Fn
37 Fo
38 Fr
39 Fragepartikel
40 Fu
41 Fwo
42 Gen.
43 Gen.Plur.
44 Gen.Sing.
45 i
46 Imperat.
47 Indefin.
48 indekl.
49 Interjektion
50 Kompar.
51 Kompar.-Adv.
52 Konj.
53 kons.
54 M
55 Ma
56 Ma/i
57 Mi
58 Mia
59 Mia/i
60 Mja
61 Mkons
62 Mn
63 Mnd
64 Mr
65 Mu
66 Mu/i
67 Mwa
68 n
69 N
70 Na
71 Nia
72 Nja
73 Nn
74 Nom.
75 Nom.Plur.
76 Nu
77 Nwa
78 Ortsname
79 Part.
80 Possess.
81 Präp.
82 Pron.
83 Pt.Pf.
84 Pt.Prs.
85 red.-abl.V.
86 red.V.1
87 red.V.2
88 red.V.3
89 red.V.4
90 red.V.5
91 rel. Adv.
92 Superl.
93 sw.Adj.
94 sw.Pt.Pf.
95 sw.Pt.Prs.
96 sw.V.1
97 sw.V.2
98 sw.V.3
99 sw.V.4
100 u
101 u/i
110 V.
111 V. prt.-prs.
112 Vok.Plur.

Table WSClassesOverrideReasons
ID Name Description
1 Variation Alternate spelling or wording, e.g. "Akk.Pl." for "Akk.Plur."
2 Subclassing Subset of the generic class, e.g. "st.Adj.a" for "Adj.a"
3 Uncertain Hypothesis, e.g. "M?a" or "M(i)"
4 Irregular Irregular instance of the generic class, e.g. "unreg.abl.V.1"
5 Unassigned Unclassified, e.g. "FNn"
6 Error Obvious typo/error, e.g. spillon "sw.V.1" for "sw.V.2" ([WSClass] has the correct value)
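
The normalized <gram> rendering suggested in section 1 could be produced along the following lines. This is a sketch under assumptions: gram_element is a hypothetical helper, its arguments stand for the label looked up via WSClass and the text of WSClassOverride, and the override reason is ignored, although a real export might treat the reasons listed above differently.

```python
from xml.sax.saxutils import escape, quoteattr


def gram_element(ws_class, override=None):
    """Render a grammatical label as a TEI <gram> element.

    With an override, the override is the printed label and the WSClass
    label becomes the normalized value; otherwise WSClass is printed as is.
    """
    if override:
        return f"<gram norm={quoteattr(ws_class)}>{escape(override)}</gram>"
    return f"<gram>{escape(ws_class)}</gram>"


print(gram_element("Akk.Plur.", "Akk.Pl."))
print(gram_element("Mn"))
```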