While much of my work going back ten years or more was on the nominals, for the last few years I’ve been focused on verbal morphology. For my SBL paper, however, I decided to revisit some of my noun work and ended up exploring some ideas afresh.

By nominals I mean nouns, adjectives, determiners, pronouns, proforms and participles: basically anything marked for case (see Morphological Parts of Speech in Greek).

At the very least, I wanted to generate themes and distinguishers for the nominals. But once you have those, you have a nice setup for exploring stems, endings and sandhi. This is a nice interface into some of the general (i.e. not language-specific) morphology I was doing for my PhD. Finally, it enables me to get back to my long-running goal of laying out a system of inflectional classes that improves on Funk, Mounce and others.

You can see the work in progress at https://github.com/morphgnt/morphological-lexicon/tree/master/projects/nominal_distinguishers.

The first phase involved enumerating the possible distinguishers for each combination of case/number/gender. This was done incrementally, by repeatedly running a Python script that (a) showed me forms that weren’t covered by the existing list and (b) showed me lexemes that had more than one theme. In some cases, multiple themes reflected legitimate suppletion, but in others they meant I hadn’t gotten the theme/distinguisher split right. Because I had them in electronic form, I also used Mounce’s inflectional classes as a hint to disambiguate distinguishers.
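
As a rough illustration of that loop (not the actual script in the repository), here is a minimal Python sketch of the two checks. The rule table, the form list, and the names `split_form` and `check` are all hypothetical, and the forms are unaccented to keep the example small. The GSF rule is deliberately too short so that one lexeme ends up with two themes.

```python
from collections import defaultdict

# Hypothetical rule table: case/number/gender key -> known distinguishers.
# The GSF entry is deliberately too short ("ς" rather than "ης").
DISTINGUISHERS = {
    "NSM": ["ος", "ας"],
    "GSM": ["ου"],
    "NSF": ["η"],
    "GSF": ["ς"],
}

# Hypothetical (lexeme, key, form) triples, unaccented for simplicity.
FORMS = [
    ("λογος", "NSM", "λογος"),
    ("λογος", "GSM", "λογου"),
    ("προφητης", "NSM", "προφητης"),
    ("φωνη", "NSF", "φωνη"),
    ("φωνη", "GSF", "φωνης"),
]


def split_form(form, endings):
    """Split form into (theme, distinguisher) on the longest matching ending."""
    for ending in sorted(endings, key=len, reverse=True):
        if form.endswith(ending):
            return form[: len(form) - len(ending)], ending
    return None  # no listed distinguisher covers this form


def check(forms, distinguishers):
    """Return (uncovered forms, lexemes that ended up with several themes)."""
    uncovered = []
    themes = defaultdict(set)
    for lexeme, key, form in forms:
        split = split_form(form, distinguishers.get(key, []))
        if split is None:
            uncovered.append((lexeme, key, form))
        else:
            themes[lexeme].add(split[0])
    multiple = {lex: ts for lex, ts in themes.items() if len(ts) > 1}
    return uncovered, multiple


uncovered, multiple = check(FORMS, DISTINGUISHERS)
print("uncovered:", uncovered)
print("multiple themes:", multiple)
```

In this toy data, προφητης is reported as uncovered because no NSM rule ends in ης, and φωνη is reported with two themes, which is the hint that the GSF split is wrong rather than that the lexeme is suppletive.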

So the first phase involved creating a file that looked something like this (just a very small subset of what is currently an 851-line file):

NSM:
    - ας n-1d α+ς
    - ης n-1f η+ς
    - ος n-2a ο+ς
    - ψ n-3a(1) π+ς
    - ψ n-3a(2) β+ς
    - ξ n-3b(1) κ+ς
    - ξ n-3b(2) γ+ς
    - ξ n-3b(3) χ+ς
    - ους n-3c(2-OD) οδ+ς
    - ς n-3c(1) τ+ς
    - ς n-3c(2) δ+ς
    - ς n-3c(3) θ+ς

You’ll notice I annotated each distinguisher with the underlying stem ending and inflectional ending. You can see I needed to use Mounce’s codes (for now) to disambiguate distinguishers like ψ, ξ and ς. You’ll also notice I had to invent my own temporary extensions to Mounce in the case of οδ+ς → ους because there are deliberately no sandhi rules built into my scripts (more on that later).
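
To make the shape of those entries concrete, here is a small Python sketch that parses a few of them and groups them by surface distinguisher. It assumes each entry is three whitespace-separated fields (distinguisher, class code, and a `stem+ending` analysis); the `Rule` type and `parse_rule` function are names made up for this illustration, and the class code is treated as an opaque label.

```python
from collections import defaultdict
from typing import NamedTuple


class Rule(NamedTuple):
    distinguisher: str        # surface ending, e.g. "ψ"
    class_code: str           # opaque disambiguating label, e.g. "n-3a(1)"
    stem_ending: str          # underlying stem-final material, e.g. "π"
    inflectional_ending: str  # underlying ending, e.g. "ς"


def parse_rule(line):
    """Parse one '- distinguisher code stem+ending' entry."""
    distinguisher, class_code, analysis = line.strip().lstrip("- ").split()
    stem_ending, _, inflectional_ending = analysis.partition("+")
    return Rule(distinguisher, class_code, stem_ending, inflectional_ending)


SAMPLE = """
- ος n-2a ο+ς
- ψ n-3a(1) π+ς
- ψ n-3a(2) β+ς
"""

rules = [parse_rule(line) for line in SAMPLE.splitlines() if line.strip()]

# Group by surface distinguisher to see which ones are ambiguous without a code.
by_distinguisher = defaultdict(list)
for rule in rules:
    by_distinguisher[rule.distinguisher].append(rule)

ambiguous = sorted(d for d, rs in by_distinguisher.items() if len(rs) > 1)
print(ambiguous)
```

Here ψ shows up as ambiguous because it can come from either π+ς or β+ς, which is why some extra hint, such as an inflectional class code, is needed to pick the right underlying analysis.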

My initial script takes the above file, runs across all forms in the MorphGNT SBLGNT and produces entries like the following:

ἀγαλλίασις:
    forms:
        F:
            theme(s): ἀγαλλιασ
            NS: ἀγαλλίασις ἀγαλλίασ|ις ϳ+ς
            GS: ἀγαλλιάσεως ἀγαλλιάσ|εως ϳ+ος
            DS: ἀγαλλιάσει ἀγαλλιάσ|ει ϳ+ι
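
One detail visible in that entry is that the theme is listed without an accent even though the accent moves between forms. The sketch below shows one way that could work, assuming the theme is simply the pre-distinguisher part of each form with acute, grave and circumflex removed and breathings kept. That assumption, and the names `strip_accents` and `split_on_distinguisher`, are mine rather than taken from the actual scripts.

```python
import unicodedata

# Combining acute, grave and Greek perispomeni (circumflex); breathings are kept.
ACCENTS = {"\u0301", "\u0300", "\u0342"}


def strip_accents(text):
    """Remove accent marks but keep breathings and other diacritics."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in ACCENTS)
    return unicodedata.normalize("NFC", kept)


def split_on_distinguisher(form, distinguisher):
    """Return (accented stem part, ending), comparing the ending accent-free."""
    form = unicodedata.normalize("NFC", form)
    target = strip_accents(distinguisher)
    for i in range(len(form) + 1):
        if strip_accents(form[i:]) == target:
            return form[:i], form[i:]
    return None


# The three forms from the entry above, paired with their distinguishers.
ENTRY = {
    "NS": ("ἀγαλλίασις", "ις"),
    "GS": ("ἀγαλλιάσεως", "εως"),
    "DS": ("ἀγαλλιάσει", "ει"),
}

themes = set()
for key, (form, distinguisher) in ENTRY.items():
    stem, ending = split_on_distinguisher(form, distinguisher)
    themes.add(strip_accents(stem))
    print(f"{key}: {form} {stem}|{ending}")

print("theme(s):", " ".join(sorted(themes)))
```

Normalizing to NFD before filtering means precomposed and decomposed input are treated alike, and all three forms collapse to a single theme once the accent is ignored.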

In some (not necessarily immediately) following posts, I’ll talk more about additional outputs and other scripts in the pipeline.

This mini-project is a great example of where having a deterministic verification process on manually tweaked rules works well (over, say, trying to automate the generation of the rules entirely).