• Gaussian process convolutions for Bayesian spatial classification

      Best, John K.; Short, Margaret; Goddard, Scott; Barry, Ron; McIntyre, Julie (2016-05)
      We compare three models for their ability to perform binary spatial classification. A geospatial data set consisting of observations that are either permafrost or not is used for this comparison. All three use an underlying Gaussian process. The first model considers this process to represent the log-odds of a positive classification (i.e. as permafrost). The second model uses a cutoff. Any locations where the process is positive are classified positively, while those that are negative are classified negatively. A probability of misclassification then gives the likelihood. The third model depends on two separate processes. The first represents a positive classification, while the second represents a negative classification. Of these two, the process with greater value at a location provides the classification. A probability of misclassification is also used to formulate the likelihood for this model. In all three cases, realizations of the underlying Gaussian processes were generated using a process convolution. A grid of knots (whose values were sampled using Markov chain Monte Carlo) was convolved using an anisotropic Gaussian kernel. All three models provided adequate classifications, but the single and two-process models showed much tighter bounds on the border between the two states.
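      The process-convolution construction described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the knot grid, the kernel length scales `sx` and `sy`, the unnormalized kernel, and the random knot values (standing in for MCMC draws) are all assumptions made for the example.

```python
import math
import random


def gaussian_kernel(dx, dy, sx, sy):
    """Unnormalized anisotropic Gaussian kernel: separate length scales in x and y."""
    return math.exp(-0.5 * ((dx / sx) ** 2 + (dy / sy) ** 2))


def process_convolution(knots, values, site, sx=1.0, sy=0.5):
    """Value of the process at `site`: a kernel-weighted sum of the knot values."""
    x, y = site
    return sum(
        v * gaussian_kernel(x - kx, y - ky, sx, sy)
        for (kx, ky), v in zip(knots, values)
    )


rng = random.Random(0)
knots = [(i, j) for i in range(5) for j in range(5)]   # hypothetical 5x5 knot grid
values_a = [rng.gauss(0.0, 1.0) for _ in knots]        # stand-in for sampled knot values
values_b = [rng.gauss(0.0, 1.0) for _ in knots]        # second process (two-process model)

site = (2.3, 1.7)
z_a = process_convolution(knots, values_a, site)
z_b = process_convolution(knots, values_b, site)

p_positive = 1.0 / (1.0 + math.exp(-z_a))  # model 1: process read as log-odds
label_cutoff = z_a > 0.0                   # model 2: sign of a single process
label_two_process = z_a > z_b              # model 3: larger of two processes wins
```

      The misclassification probability that turns the second and third rules into a likelihood is omitted here.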
    • The geometry in geometric algebra

      Kilpatrick, Kristopher N.; Maxwell, David A.; Williams, Gordon I.; Rhodes, John A. (2014-12)
      We present an axiomatic development of geometric algebra. One may think of a geometric algebra as allowing one to add and multiply subspaces of a vector space. Properties of the geometric product are proven and derived products called the wedge and contraction product are introduced. Linear algebraic and geometric concepts such as linear independence and orthogonality may be expressed through the above derived products. Some examples with geometric algebra are then given.
    • A geostatistical model based on Brownian motion to Krige regions in R² with irregular boundaries and holes

      Bernard, Jordy; McIntyre, Julie; Barry, Ron; Goddard, Scott (2019-05)
      Kriging is a geostatistical interpolation method that produces predictions and prediction intervals. Classical kriging models use Euclidean (straight line) distance when modeling spatial autocorrelation. However, for estuaries, inlets, and bays, shortest-in-water distance may capture the system’s proximity dependencies better than Euclidean distance when boundary constraints are present. Shortest-in-water distance has been used to krige such regions (Little et al., 1997; Rathbun, 1998); however, the variance-covariance matrices used in these models have not been shown to be mathematically valid. In this project, a new kriging model is developed for irregularly shaped regions in R². This model incorporates the notion of flow-connected distance into a valid variance-covariance matrix through the use of a random walk on a lattice, process convolutions, and the non-stationary kriging equations. The model developed in this paper is compared to existing methods of spatial prediction over irregularly shaped regions using water quality data from Puget Sound.
    • An investigation into the effectiveness of simulation-extrapolation for correcting measurement error-induced bias in multilevel models

      Custer, Christopher (2015-04)
      This paper is an investigation into correcting the bias introduced by measurement errors into multilevel models. The proposed method for this correction is simulation-extrapolation (SIMEX). The paper begins with a detailed discussion of measurement error and its effects on parameter estimation. We then describe the simulation-extrapolation method and how it corrects for the bias introduced by the measurement error. Multilevel models and their corresponding parameters are also defined before performing a simulation. The simulation involves estimating the multilevel model parameters using our true explanatory variables, the observed measurement error variables, and two different SIMEX techniques. The estimates obtained from our true explanatory values were used as a baseline for comparing the effectiveness of the SIMEX method for correcting bias. From these results, we were able to determine that the SIMEX was very effective in correcting the bias in estimates of the fixed effects parameters and often provided estimates that were not significantly different than those from the estimates derived using the true explanatory variables. The simulation also suggested that the SIMEX approach was effective in correcting bias for the random slope variance estimates, but not for the random intercept variance estimates. Using the simulation results as a guideline, we then applied the SIMEX approach to an orthodontics dataset to illustrate the application of SIMEX to real data.
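      The simulation-extrapolation idea can be illustrated with a deliberately simplified sketch. It assumes a single-level simple linear regression rather than the multilevel models studied in the paper, a known measurement-error standard deviation `sigma_u`, and an exact quadratic through three noise levels instead of a fitted extrapolant over a finer grid. All names and settings are hypothetical.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simex_slope(w, y, sigma_u, rng, n_rep=30):
    """SIMEX estimate of the slope from an error-prone covariate w."""

    def avg_slope(lam):
        # Simulation step: add extra error with variance lam * sigma_u**2.
        if lam == 0:
            return slope(w, y)
        sd = (lam ** 0.5) * sigma_u
        total = 0.0
        for _ in range(n_rep):
            w_star = [wi + rng.gauss(0.0, sd) for wi in w]
            total += slope(w_star, y)
        return total / n_rep

    f0, f1, f2 = avg_slope(0), avg_slope(1), avg_slope(2)
    # Extrapolation step: the quadratic through lam = 0, 1, 2 evaluated at
    # lam = -1 (no measurement error) equals 3*f0 - 3*f1 + f2.
    return 3 * f0 - 3 * f1 + f2


rng = random.Random(1)
beta, sigma_u, n = 2.0, 0.5, 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]          # true covariate
y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]    # response
w = [xi + rng.gauss(0.0, sigma_u) for xi in x]       # observed, error-prone covariate

naive = slope(w, y)                       # attenuated toward zero
corrected = simex_slope(w, y, sigma_u, rng)
```

      In this setting the naive slope is attenuated toward zero, and the extrapolated value should land closer to the true `beta`; the quadratic extrapolant typically removes most, but not all, of the bias.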
    • Investigations in phylogenetics: tree inference and model identifiability

      Yourdkhani, Samaneh; Rhodes, John A.; Allman, Elizabeth S.; McIntyre, Julie; Williams, Gordon (2020-05)
      This thesis presents two projects in mathematical phylogenetics. The first presents a new, statistically consistent, fast method for inferring species trees from topological gene trees under the multispecies coalescent model. The algorithm of this method takes a collection of unrooted topological gene trees, computes a novel intertaxon distance from them, and outputs a metric species tree. The second establishes that numerical and non-numerical parameters of a specific Profile Mixture Model of protein sequence evolution are generically identifiable. Algebraic techniques are used, especially a theorem of Kruskal on tensor decomposition.
    • Linear partial differential equations and real analytic approximations of rough functions

      Barry, Timothy J.; Rybkin, Alexei; Avdonin, Sergei; Faudree, Jill (2017-08)
      Many common approximation methods exist such as linear or polynomial interpolation, splines, Taylor series, or generalized Fourier series. Unfortunately, many of these approximations are not analytic functions on the entire real line, and those that are diverge at infinity and therefore are only valid on a closed interval or for compactly supported functions. Our method takes advantage of the smoothing properties of certain linear partial differential equations to obtain an approximation which is real analytic, converges to the function on the entire real line, and yields particular conservation laws. This approximation method applies to any L₂ function on the real line which may have some rough behavior such as discontinuities or points of nondifferentiability. For comparison, we consider the well-known Fourier-Hermite series approximation. Finally, for some example functions the approximations are found and plotted numerically.
    • Minimal covers of the Archimedean tilings, part II Appendices

      Williams, Gordon; Pellicer, Daniel; Mixer, Mark (2013-01-18)
      These files contain full descriptions of the relations in the presentations of the monodromy groups for the (3.3.4.3.4), (3.3.3.4.4), (4.6.12), and (3.3.3.3.6) tilings. This material was prepared to provide additional material or the possibility of verification of our work for the interested reader of the associated article.
    • Moose abundance estimation using finite population block kriging on Togiak National Wildlife Refuge, Alaska

      Frye, Graham G. (2016-12)
      Monitoring the size and demographic characteristics of animal populations is fundamental to the fields of wildlife ecology and wildlife management. A diverse suite of population monitoring methods has been developed and employed during the past century, but challenges in obtaining rigorous population estimates remain. I used simulation to address survey design issues for monitoring a moose population at Togiak National Wildlife Refuge in southwestern Alaska using finite population block kriging. In the first chapter, I compared the bias in the Geospatial Population Estimator (GSPE; which uses finite population block kriging to estimate animal abundance) between two survey unit configurations. After finding that substantial bias was induced through the use of the historic survey unit configuration, I concluded that the “standard” unit configuration was preferable because it allowed unbiased estimation. In the second chapter, I examined the effect of sampling intensity on performance of the GSPE. I concluded that bias and confidence interval coverage were unaffected by sampling intensity, whereas the coefficient of variation (CV) and root mean squared error (RMSE) decreased with increasing sampling intensity. In the final chapter, I examined the effect of spatial clustering by moose on model performance. Highly clustered moose distributions induced a small amount of positive bias, confidence interval coverage lower than the nominal rate, higher CV, and higher RMSE. Some of these issues were ameliorated by increasing sampling intensity, but if highly clustered distributions of moose are expected, then substantially greater sampling intensities than those examined here may be required.
    • Multistate Ornstein-Uhlenbeck space use model reveals sex-specific partitioning of the energy landscape in a soaring bird

      Eisaguirre, Joseph M.; Goddard, Scott; Barry, Ron; McIntyre, Julie; Short, Margaret (2019-12)
      Understanding animals’ home range dynamics is a frequent motivating question in movement ecology. Descriptive techniques are often applied, but these methods lack predictive ability and cannot capture effects of dynamic environmental patterns, such as weather and features of the energy landscape. Here, we develop a practical approach for statistical inference into the behavioral mechanisms underlying how habitat and the energy landscape shape animal home ranges. We validated this approach by conducting a simulation study, and applied it to a sample of 12 golden eagles Aquila chrysaetos tracked with satellite telemetry. We demonstrate that readily available software can be used to fit a multistate Ornstein-Uhlenbeck space use model to make hierarchical inference of habitat selection parameters and home range dynamics. Additionally, the underlying mathematical properties of the model allow straightforward computation of predicted space use distributions, permitting estimation of home range size and visualization of space use patterns under varying conditions. The application to golden eagles revealed effects of habitat variables that align with eagle biology. Further, we found that males and females partition their home ranges dynamically based on uplift. Specifically, changes in wind and the angle of the sun seemed to be drivers of differential space use between sexes, in particular during late breeding season when both are foraging across large parts of their home range to support nestling growth.
    • Non-Normality In Scalar Delay Differential Equations

      Stroh, Jacob Nathaniel; Bueler, Edward (2006)
      Analysis of stability for delay differential equations (DDEs) is a tool in a variety of fields such as nonlinear dynamics in physics, biology, and chemistry, engineering and pure mathematics. Stability analysis is based primarily on the eigenvalues of a discretized system. Situations exist in which practical and numerical results may not match expected stability inferred from such approaches. The reasons and mechanisms for this behavior can be related to the eigenvectors associated with the eigenvalues. When the operator associated to a linear (or linearized) DDE is significantly non-normal, the stability analysis must be adapted as demonstrated here. Example DDEs are shown to have solutions which exhibit transient growth not accounted for by eigenvalues alone. Pseudospectra are computed and related to transient growth.
    • Numerical realization of the generalized Carrier-Greenspan Transform for the shallow water wave equations

      Harris, Matthew W.; Rybkin, Alexei; Williams, Gordon; Nikolsky, Dmitry (2015-08)
      We study the development of two numerical algorithms for long nonlinear wave runup that utilize the generalized Carrier-Greenspan transform. The Carrier-Greenspan transform is a hodograph transform that allows the Shallow Water Wave equations to be transformed into a linear second order wave equation with nonconstant coefficients. In both numerical algorithms the transform is numerically implemented, the resulting linear system is numerically solved and then the inverse transformation is implemented. The first method we develop is based on an implicit finite difference method and is applicable to constantly sloping bays of arbitrary cross-section. The resulting scheme is extremely fast and shows promise as a fast tsunami runup solver for wave runup in coastal fjords and narrow inlets. For the second scheme, we develop an initial value boundary problem corresponding to an inclined bay with U or V shaped cross-sections that has a wall some distance from the shore. A spectral method is applied to the resulting linear equation in order to find a series solution. Both methods are verified against an analytical solution in an inclined parabolic bay with positive results and the first scheme is compared to the 3D numerical solver FUNWAVE with positive results.
    • On the Klein-Gordon equation originating on a curve and applications to the tsunami run-up problem

      Gaines, Jody; Rybkin, Alexei; Bueler, Ed; Nicolsky, Dmitry (2019-05)
      Our goal is to study the linear Klein-Gordon equation in matrix form, with initial conditions originating on a curve. This equation has applications to the Cross-Sectionally Averaged Shallow Water equations, i.e. a system of nonlinear partial differential equations used for modeling tsunami waves within narrow bays, because the general Carrier-Greenspan transform can turn the Cross-Sectionally Averaged Shallow Water equations (for shorelines of constant slope) into a particular form of the matrix Klein-Gordon equation. Thus the matrix Klein-Gordon equation governs the run-up of tsunami waves along shorelines of constant slope. If the narrow bay is U-shaped, the Cross-Sectionally Averaged Shallow Water equations have a known general solution via solving the transformed matrix Klein-Gordon equation. However, the initial conditions for our Klein-Gordon equation are given on a curve. Thus our goal is to solve the matrix Klein-Gordon equation with known conditions given along a curve. Therefore we present a method to extrapolate values on a line from conditions on a curve, via the Taylor formula. Finally, to apply our solution to the Cross-Sectionally Averaged Shallow Water equations, our numerical simulations demonstrate how Gaussian and N-wave profiles affect the run-up of tsunami waves within various U-shaped bays.
    • Phylogenetic trees and Euclidean embeddings

      Layer, Mark; Rhodes, John; Allman, Elizabeth; Faudree, Jill (2014-05)
      In this thesis we develop an intuitive process of encoding any phylogenetic tree and its associated tree-distance matrix as a collection of points in Euclidean space. Using this encoding, we find that information about the structure of the tree can easily be recovered by applying the inner product operation to vector combinations of the Euclidean points. By applying Classical Scaling to the tree-distance matrix, we are able to find the Euclidean points even when the phylogenetic tree is not known. We use the insight gained by encoding the tree as a collection of Euclidean points to modify the Neighbor Joining Algorithm, a method to recover an unknown phylogenetic tree from its tree-distance matrix, to be more resistant to tree-distance proportional errors.
    • Reliability analysis of reconstructing phylogenies under long branch attraction conditions

      Dissanayake, Ranjan; Allman, Elizabeth; McIntyre, Julie; Short, Margaret; Goddard, Scott (2018-05)
      In this simulation study we examined the reliability of three phylogenetic reconstruction techniques in a long branch attraction (LBA) situation: Maximum Parsimony (MP), Neighbor Joining (NJ), and Maximum Likelihood. Data were simulated under five DNA substitution models (JC, K2P, F81, HKY, and GTR) from four different taxa. Two branch length parameters of four-taxon trees ranging from 0.05 to 0.75 with an increment of 0.02 were used to simulate DNA data under each model. For each model we simulated DNA sequences with 100, 250, 500 and 1000 sites with 100 replicates. When we have enough data, the maximum likelihood technique is the most reliable of the three methods examined in this study for reconstructing phylogenies under LBA conditions. We also find that MP is the most sensitive to LBA conditions and that Neighbor Joining performs well under LBA conditions compared to MP.
    • Species network inference under the multispecies coalescent model

      Baños Cervantes, Hector Daniel; Allman, Elizabeth S.; Rhodes, John A.; Barry, Ronald; Faudree, Jill (2019-05)
      Species network inference is a challenging problem in phylogenetics. In this work, we present two results on this. The first shows that many topological features of a level-1 network are identifiable under the network multispecies coalescent model (NMSC). Specifically, we show that one can identify from gene tree frequencies the unrooted semidirected species network, after suppressing all cycles of size less than 4. The second presents the theory behind a new, statistically consistent, practical method for the inference of level-1 networks under the NMSC. The input for this algorithm is a collection of unrooted topological gene trees, and the output is an unrooted semidirected species network.
    • Statistical analysis of species tree inference

      Dajles, Andres; Rhodes, John; Allman, Elizabeth; Goddard, Scott; Short, Margaret; Barry, Ron (2016-05)
      It is known that the STAR and USTAR algorithms are statistically consistent techniques used to infer species tree topologies from a large set of gene trees. However, if the set of gene trees is small, the accuracy of STAR and USTAR in determining species tree topologies is unknown. Furthermore, it is unknown how introducing roots on the gene trees affects the performance of STAR and USTAR. Therefore, we show that when given a set of gene trees of sizes 1, 3, 6 or 10, the STAR and USTAR algorithms with Neighbor Joining perform relatively well for two different cases: one where the gene trees are rooted at the outgroup and the STAR inferred species tree is also rooted at the outgroup, and the other where the gene trees are not rooted at the outgroup, but the USTAR inferred species tree is rooted at the outgroup.
    • Streetlight Halos

      Tape, Walter (2010)
    • A study of saturation number

      Burr, Erika; Faudree, Jill; Williams, Gordon; Berman-Williams, Leah (2017-08)
      This paper seeks to provide complete proofs in modern notation of (early) key saturation number results and give some new results concerning the semi-saturation number. We highlight relevant results from extremal theory and present the saturation number for the complete graph Kₖ and the star K₁,ₜ, elaborating on the proofs provided in the 1964 paper "A Problem in Graph Theory" by Erdős, Hajnal and Moon and the 1986 paper "Saturated Graphs with Minimal Number of Edges" by Kászonyi and Tuza. We discuss the proof of a general bound on the saturation number for a family of target graphs provided by Kászonyi and Tuza. A discussion of related results showing that the complete graph has the maximum saturation number among target graphs of the same order and that the star has the maximum saturation number among target trees of the same order is included. Before presenting our result concerning the semi-saturation number for the path Pₖ, we discuss the structure of some Pₖ-saturated trees of large order as well as the saturation number of Pₖ with respect to host graphs of large order.
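      As a small illustration of the saturation number for the complete graph, the sketch below builds the standard extremal construction from the Erdős, Hajnal and Moon result (a complete graph on k-2 vertices joined to n-k+2 independent vertices) and checks by brute force that it is K_k-saturated with (k-2)n - C(k-1, 2) edges. The function names are hypothetical, and the check confirms saturation of this one graph only; it does not prove that no smaller saturated graph exists.

```python
from itertools import combinations
from math import comb


def ehm_graph(n, k):
    """Edge set of K_{k-2} joined to n-k+2 independent vertices (vertices 0..n-1)."""
    core = range(k - 2)
    # With u < v, an edge exists exactly when u lies in the complete core.
    return {(u, v) for u, v in combinations(range(n), 2) if u in core}


def has_clique(n, edges, k):
    """True if the graph on vertices 0..n-1 contains a copy of K_k."""
    return any(
        all(pair in edges for pair in combinations(c, 2))
        for c in combinations(range(n), k)
    )


def is_saturated(n, edges, k):
    """K_k-saturated: no K_k present, but adding any missing edge creates one."""
    if has_clique(n, edges, k):
        return False
    return all(
        has_clique(n, edges | {e}, k)
        for e in combinations(range(n), 2)
        if e not in edges
    )


n, k = 7, 4
edges = ehm_graph(n, k)
expected_size = (k - 2) * n - comb(k - 1, 2)
print(len(edges), expected_size, is_saturated(n, edges, k))
```

      The brute-force clique search is exponential in n and is intended only for very small graphs.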
    • Testing multispecies coalescent simulators with summary statistics

      Baños Cervantes, Hector Daniel; Allman, Elizabeth; Rhodes, John; Goddard, Scott; McIntyre, Julie; Barry, Ron (2018-12)
      The multispecies coalescent model (MSC) is increasingly used in phylogenetics to describe the formation of gene trees (depicting the direct ancestral relationships of sampled lineages) within species trees (depicting the branching of species from their common ancestor). A number of MSC simulators have been implemented, and these are often used to test inference methods built on the model. However, it is not clear from the literature that these simulators are always adequately tested. In this project, we formulated tools for testing these simulators and used them to show that, of four well-known coalescent simulators (Mesquite, Hybrid-Lambda, SimPhy, and Phybase), only SimPhy performs correctly according to these tests.
    • The linear algebra of interpolation with finite applications giving computational methods for multivariate polynomials

      Olmsted, Coert D.; Gislason, Gary A.; Lambert, J. P.; Lando, C. A.; Olson, J. V.; Piacenca, R. J. (1988)
      Linear representation and the duality of the biorthonormality relationship express the linear algebra of interpolation by way of the evaluation mapping. In the finite case the standard bases relate the maps to Gramian matrices. Five equivalent conditions on these objects are found which characterize the solution of the interpolation problem. This algebra succinctly describes the solution space of ordinary linear initial value problems. Multivariate polynomial spaces and multidimensional node sets are described by multi-index sets. Geometric considerations of normalization and dimensionality lead to cardinal bases for Lagrange interpolation on regular node sets. More general Hermite functional sets can also be solved by generalized Newton methods using geometry and multi-indices. Extended to countably infinite spaces, the method calls upon theorems of modern analysis.