• A lexical transducer for North Slope Iñupiaq

      Bills, Aric R.; Tuttle, Siri; Levin, Lori; Berge, Anna; Kaplan, Lawrence (2011-05)
      This thesis describes the creation and evaluation of software designed to analyze and generate North Slope Iñupiaq words. Given a complete Iñupiaq word as input, it attempts to identify the word's stem and suffixes, including the grammatical category and any inflectional information contained in the word. Given a stem and list of suffixes as input, it attempts to produce the corresponding Iñupiaq word, applying phonological processes as necessary. Innovations in the implementation of this software include Iñupiaq-specific formats for specifying lexical data, including a table-based format for specifying inflectional suffixes in paradigms; a treatment of phonologically-conditioned irregular allomorphy which leverages the pattern-recognition capabilities of the xfst programming language; and an idiom for composing morphographemic rules together in xfst which captures the state of the software each time a new rule is added, maximizing feedback during software compilation and facilitating troubleshooting. In testing, the software recognized 81.2% of all word tokens (78.3% of unique word types) and guessed at the morphology of an additional 16.8% of tokens (19.4% of types). Analyses of recognized words were largely accurate; a heuristic for identifying accurate parses is proposed. Most guesses were at least partly inaccurate. Improvements and applications are proposed.
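
      The rule-composition idiom mentioned in the abstract can be loosely illustrated outside xfst. The following is a minimal Python sketch, not the thesis's implementation: it applies an ordered cascade of rewrite rules and records the intermediate output after each rule is added, so a faulty rule can be spotted at the step where it first changes a form. The helper names (`make_rule`, `compose_with_snapshots`), the rule names, and the test strings are all hypothetical placeholders and do not represent actual Iñupiaq phonology or the xfst syntax used in the thesis.

```python
import re
from typing import Callable, List, Tuple

# A rule is a (name, function) pair; the function rewrites one string.
Rule = Tuple[str, Callable[[str], str]]


def make_rule(name: str, pattern: str, replacement: str) -> Rule:
    """Build a named rewrite rule from a regular expression.

    This is a toy stand-in for a replace rule; it is an assumption of
    this sketch, not the thesis's xfst code.
    """
    compiled = re.compile(pattern)
    return name, lambda form: compiled.sub(replacement, form)


def compose_with_snapshots(
    rules: List[Rule], test_forms: List[str]
) -> List[Tuple[str, List[str]]]:
    """Apply rules cumulatively, in order.

    After each rule is added to the cascade, record the current output
    for every test form, so the state at each step can be inspected.
    """
    snapshots = []
    current = list(test_forms)
    for name, apply_rule in rules:
        current = [apply_rule(form) for form in current]
        snapshots.append((name, list(current)))
    return snapshots


# Hypothetical toy rules over placeholder symbols ("+" marks a boundary).
RULES = [
    make_rule("delete-vowel-before-boundary-vowel", r"a\+a", "+a"),
    make_rule("assimilate-t-before-boundary-n", r"t\+n", "n+n"),
    make_rule("remove-boundaries", r"\+", ""),
]

snapshots = compose_with_snapshots(RULES, ["kata+a", "kat+na"])
for rule_name, forms in snapshots:
    print(f"{rule_name}: {forms}")
```

      Keeping one snapshot per added rule trades some extra computation for traceability: when a final form is wrong, comparing consecutive snapshots shows which rule introduced the change.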