Browsing Mathematics and Statistics by Title
Now showing items 11-30 of 44

Edge detection using Bayesian process convolutions
This project describes a method for edge detection in images. We develop a Bayesian approach for edge detection, using a process convolution model. Our method has some advantages over the classical Sobel edge detector. In particular, our Bayesian spatial detector works well for rich, but noisy, photos. We first demonstrate our approach with a small simulation study, then with a richer photograph. Finally, we show that the Bayesian edge detector gives considerable improvement over the Sobel operator for rich photos.
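
For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of the classical Sobel operator only. It is not the project's Bayesian detector, and the function name `sobel_magnitude` and the toy image are assumptions made for illustration.

```python
import math

# Standard 3x3 Sobel kernels: GX responds to horizontal intensity changes,
# GY to vertical ones.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(image):
    """Return gradient magnitudes for a 2D list of numbers.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += GX[dr][dc] * pixel
                    gy += GY[dr][dc] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out

# Hypothetical toy image: dark left half, bright right half (one vertical edge).
toy = [[0, 0, 1, 1] for _ in range(4)]
edges = sobel_magnitude(toy)
flat = sobel_magnitude([[3, 3, 3] for _ in range(3)])
```

The operator responds only to local intensity differences, which is why it tends to fire on noise as well as on true edges.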

Effect of filling methods on the forecasting of time series with missing values
The Gulf of Alaska Mooring (GAK1) monitoring data set is an irregular time series of temperature and salinity at various depths in the Gulf of Alaska. One approach to analyzing data from an irregular time series is to regularize the series by imputing or filling in missing values. In this project we investigated and compared four methods (denoted as APPROX, SPLINE, LOCF and OMIT) of doing this. Simulation was used to evaluate the performance of each filling method on parameter estimation and forecasting precision for an Autoregressive Integrated Moving Average (ARIMA) model. Simulations showed differences among the four methods in terms of forecast precision and parameter estimate bias. These differences depended on the true values of model parameters as well as on the percentage of data missing. Among the four methods used in this project, the method OMIT performed the best and SPLINE performed the worst. We also illustrate the application of the four methods to forecasting the Gulf of Alaska Mooring (GAK1) monitoring time series, and discuss the results in this project.
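
The abstract names the four filling methods without defining them. The sketch below assumes the usual readings of three of them: APPROX as linear interpolation, LOCF as last observation carried forward, and OMIT as dropping missing values. SPLINE is not shown, and the function names and the sample series are hypothetical.

```python
def fill_locf(series):
    """LOCF (assumed meaning): carry the last observed value forward.

    Leading gaps stay None because there is nothing to carry.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

def fill_linear(series):
    """APPROX (assumed meaning): linearly interpolate interior gaps.

    Leading and trailing gaps stay None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        slope = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + slope * (i - left)
    return filled

def fill_omit(series):
    """OMIT (assumed meaning): drop missing values, shortening the series."""
    return [v for v in series if v is not None]

# Hypothetical temperature readings with gaps marked as None.
temps = [4.0, None, None, 7.0, None, 9.0]
locf = fill_locf(temps)
linear = fill_linear(temps)
omitted = fill_omit(temps)
```

Note that OMIT changes the spacing between observations, whereas the other two keep the series on its original time grid.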

Estimating confidence intervals on accuracy in classification in machine learning
This paper explores various techniques to estimate a confidence interval on accuracy for machine learning algorithms. Confidence intervals on accuracy may be used to rank machine learning algorithms. We investigate bootstrapping, leave-one-out cross-validation, and conformal prediction. These techniques are applied to the following machine learning algorithms: support vector machines, bagging AdaBoost, and random forests. Confidence intervals are produced on a total of nine datasets, three real and six simulated. In general, we found that no technique was particularly successful at always capturing the accuracy. However, leave-one-out cross-validation was the most consistent of the techniques across all datasets.
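
As an illustration of the first technique listed, here is a minimal sketch of a percentile bootstrap interval on accuracy. The abstract does not say which bootstrap variant the paper uses, so the percentile form, the function name `bootstrap_accuracy_ci`, and the sample data are all assumptions.

```python
import random

def bootstrap_accuracy_ci(correct, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for classification accuracy.

    `correct` is a list of 0/1 flags, one per test example, where 1 means
    the classifier's prediction was right.
    """
    rng = random.Random(seed)
    n = len(correct)
    # Resample the test set with replacement and recompute accuracy each time.
    accuracies = sorted(
        sum(rng.choices(correct, k=n)) / n for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = accuracies[int(alpha * (n_resamples - 1))]
    upper = accuracies[int((1.0 - alpha) * (n_resamples - 1))]
    return lower, upper

# Hypothetical outcome flags: 80 correct predictions out of 100.
flags = [1] * 80 + [0] * 20
low, high = bootstrap_accuracy_ci(flags)
perfect = bootstrap_accuracy_ci([1] * 50)
```

An interval like this describes resampling variability on one test set; whether it captures the true accuracy is the question the paper investigates.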

Exact and numerical solutions for Stokes flow in glaciers
We begin with an overview of the fluid mechanics governing ice flow. We review a 1985 result due to Balise and Raymond giving exact solutions for a glaciologically relevant Stokes problem. We extend this result by giving exact formulas for the pressure and for the basal stress. This leads to a theorem giving a necessary condition on the basal velocity of a gravity-induced flow in a rectangular geometry. We describe the finite element method for solving the same problem numerically. We present a concise implementation using FEniCS, a freely available software package, and discuss the convergence of the numerical method to the exact solution. We describe how to fix an error in a recently published model.

An existence theorem for solutions to a model problem with Yamabe-positive metric for conformal parameterizations of the Einstein constraint equations
We use the conformal method to investigate solutions of the vacuum Einstein constraint equations on a manifold with a Yamabe-positive metric. To do so, we develop a model problem with symmetric data on Sⁿ⁻¹ × S¹. We specialize the model problem to a two-parameter family of conformal data, and find that no solutions exist when the transverse-traceless tensor is identically zero. When the transverse-traceless tensor is nonzero, we observe an existence theorem in both the near-constant mean curvature and far-from-constant mean curvature regimes.

Expectation maximization and latent class models
Latent tree models are tree structured graphical models where some random variables are observable while others are latent. These models are used to model data in many areas, such as bioinformatics, phylogenetics, and computer vision, among others. This work contains some background on latent tree models and algebraic geometry with the goal of estimating the volume of the latent tree model known as the 3-leaf model M₂ (where the root is a hidden variable with 2 states, and is the parent of three observable variables with 2 states) in the probability simplex Δ₇, and to estimate the volume of the latent tree model known as the 3-leaf model M₃ (where the root is a hidden variable with 3 states, and is the parent of two observable variables with 3 states and one observable variable with 2 states) in the probability simplex Δ₁₇. For the model M₃, we estimate that the rough percentage of distributions that arise from stochastic parameters is 0.015%, the rough percentage of distributions that arise from real parameters is 64.742% and the rough percentage of distributions that arise from complex parameters is 35.206%. We will also discuss the algebraic boundary of these models and we observe the behavior of the estimates of the Expectation Maximization algorithm (EM algorithm), an iterative method typically used to try to find a maximum likelihood estimator.
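
To make the EM iteration concrete, the following is a minimal sketch for a model shaped like the M₂ described above: one hidden binary root and three observable binary leaves. It is an illustration under stated assumptions, not the author's code. The counts, starting values, and all function names are hypothetical.

```python
import itertools
import math

# The 8 joint outcomes of three binary leaves (the cells of the simplex).
OUTCOMES = list(itertools.product([0, 1], repeat=3))

def joint(x, weight, probs):
    """P(hidden = h, X = x), where `weight` = P(hidden = h) and
    probs[j] = P(X_j = 1 | hidden = h)."""
    p = weight
    for xj, pj in zip(x, probs):
        p *= pj if xj == 1 else 1.0 - pj
    return p

def em_step(counts, pi, theta):
    """One EM update; pi = P(hidden = 1), theta[h][j] = P(X_j = 1 | hidden = h)."""
    weights = (1.0 - pi, pi)
    total = [0.0, 0.0]                     # expected count of each hidden state
    ones = [[0.0] * 3 for _ in range(2)]   # expected count of X_j = 1 per state
    for x, n in zip(OUTCOMES, counts):
        j0 = joint(x, weights[0], theta[0])
        j1 = joint(x, weights[1], theta[1])
        resp = (j0 / (j0 + j1), j1 / (j0 + j1))   # E-step: posterior of hidden state
        for h in range(2):
            total[h] += n * resp[h]
            for j in range(3):
                ones[h][j] += n * resp[h] * x[j]
    new_pi = total[1] / sum(total)                # M-step: re-estimate parameters
    new_theta = [[ones[h][j] / total[h] for j in range(3)] for h in range(2)]
    return new_pi, new_theta

def log_likelihood(counts, pi, theta):
    """Log-likelihood of the observed counts under the current parameters."""
    return sum(
        n * math.log(joint(x, 1.0 - pi, theta[0]) + joint(x, pi, theta[1]))
        for x, n in zip(OUTCOMES, counts)
    )

counts = [30, 8, 9, 5, 7, 6, 10, 25]   # hypothetical counts for the 8 outcomes
pi, theta = 0.4, [[0.3, 0.2, 0.4], [0.6, 0.7, 0.8]]
history = [log_likelihood(counts, pi, theta)]
for _ in range(50):
    pi, theta = em_step(counts, pi, theta)
    history.append(log_likelihood(counts, pi, theta))
```

Each update cannot decrease the log-likelihood, but the iteration may settle at a local maximum or on the boundary of the parameter space, which is one reason the behavior of EM estimates is worth studying.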

An exploration of two infinite families of snarks
In this paper, we generalize a single example of a snark that admits a drawing with even rotational symmetry into two infinite families using a voltage graph construction technique derived from cyclic Pseudo-Loupekine snarks. We expose an enforced chirality in coloring the underlying 5-pole that generated the known example, and use this fact to show that the infinite families are in fact snarks. We explore the construction of these families in terms of the blowup construction. We show that a graph in either family with rotational symmetry of order m has automorphism group of order m2ᵐ⁺¹. The oddness of graphs in both families is determined exactly, and shown to increase linearly with the order of rotational symmetry.

An exposition on the Kronecker-Weber theorem
The Kronecker-Weber Theorem is a classification result from Algebraic Number Theory. Theorem (Kronecker-Weber). Every finite, abelian extension of Q is contained in a cyclotomic field. This result was originally proven by Leopold Kronecker in 1853. However, his proof had some gaps that were later filled by Heinrich Martin Weber in 1886 and David Hilbert in 1896. Hilbert's strategy for the proof eventually led to the creation of the field of mathematics called Class Field Theory, which is the study of finite, abelian extensions of arbitrary fields and is still an area of active research. Not only is the Kronecker-Weber Theorem surprising, its proof is truly amazing. The idea of the proof is that for a finite, Galois extension K of Q, there is a connection between the Galois group Gal(K/Q) and how primes of Z split in a certain subring R of K corresponding to Z in Q. When Gal(K/Q) is abelian, this connection is so stringent that the only possibility is that K is contained in a cyclotomic field. In this paper, we give an overview of field/Galois theory and what the Kronecker-Weber Theorem means. We also talk about the ring of integers R of K, how primes split in R, how splitting of primes is related to the Galois group Gal(K/Q), and finally give a proof of the Kronecker-Weber Theorem using these ideas.

Extending the Lattice-Based Smoother using a generalized additive model
The Lattice-Based Smoother was introduced by McIntyre and Barry (2017) to estimate a surface defined over an irregularly shaped region. In this paper we consider extending their method to allow for additional covariates and non-continuous responses. We describe our extension which utilizes the framework of generalized additive models. A simulation study shows that our method is comparable to the Soap film smoother of Wood et al. (2008), under a number of different conditions. Finally we illustrate the method's practical use by applying it to a real data set.

Gaussian process convolutions for Bayesian spatial classification
We compare three models for their ability to perform binary spatial classification. A geospatial data set consisting of observations that are either permafrost or not is used for this comparison. All three use an underlying Gaussian process. The first model considers this process to represent the log-odds of a positive classification (i.e. as permafrost). The second model uses a cutoff. Any locations where the process is positive are classified positively, while those that are negative are classified negatively. A probability of misclassification then gives the likelihood. The third model depends on two separate processes. The first represents a positive classification, while the second a negative classification. Of these two, the process with greater value at a location provides the classification. A probability of misclassification is also used to formulate the likelihood for this model. In all three cases, realizations of the underlying Gaussian processes were generated using a process convolution. A grid of knots (whose values were sampled using Markov Chain Monte Carlo) was convolved using an anisotropic Gaussian kernel. All three models provided adequate classifications, but the single- and two-process models showed much tighter bounds on the border between the two states.

The geometry in geometric algebra
We present an axiomatic development of geometric algebra. One may think of a geometric algebra as allowing one to add and multiply subspaces of a vector space. Properties of the geometric product are proven, and derived products called the wedge and contraction products are introduced. Linear algebraic and geometric concepts such as linear independence and orthogonality may be expressed through the above derived products. Some examples with geometric algebra are then given.

A geostatistical model based on Brownian motion to Krige regions in R² with irregular boundaries and holes
Kriging is a geostatistical interpolation method that produces predictions and prediction intervals. Classical kriging models use Euclidean (straight line) distance when modeling spatial autocorrelation. However, for estuaries, inlets, and bays, shortest-in-water distance may capture the system’s proximity dependencies better than Euclidean distance when boundary constraints are present. Shortest-in-water distance has been used to krige such regions (Little et al., 1997; Rathbun, 1998); however, the variance-covariance matrices used in these models have not been shown to be mathematically valid. In this project, a new kriging model is developed for irregularly shaped regions in R². This model incorporates the notion of flow-connected distance into a valid variance-covariance matrix through the use of a random walk on a lattice, process convolutions, and the nonstationary kriging equations. The model developed in this paper is compared to existing methods of spatial prediction over irregularly shaped regions using water quality data from Puget Sound.

An investigation into the effectiveness of simulation-extrapolation for correcting measurement error-induced bias in multilevel models
This paper is an investigation into correcting the bias introduced by measurement errors into multilevel models. The proposed method for this correction is simulation-extrapolation (SIMEX). The paper begins with a detailed discussion of measurement error and its effects on parameter estimation. We then describe the simulation-extrapolation method and how it corrects for the bias introduced by the measurement error. Multilevel models and their corresponding parameters are also defined before performing a simulation. The simulation involves estimating the multilevel model parameters using our true explanatory variables, the observed measurement error variables, and two different SIMEX techniques. The estimates obtained from our true explanatory values were used as a baseline for comparing the effectiveness of the SIMEX method for correcting bias. From these results, we were able to determine that SIMEX was very effective in correcting the bias in estimates of the fixed effects parameters and often provided estimates that were not significantly different from those derived using the true explanatory variables. The simulation also suggested that the SIMEX approach was effective in correcting bias for the random slope variance estimates, but not for the random intercept variance estimates. Using the simulation results as a guideline, we then applied the SIMEX approach to an orthodontics dataset to illustrate the application of SIMEX to real data.

Investigations in phylogenetics: tree inference and model identifiability
This thesis presents two projects in mathematical phylogenetics. The first presents a new, statistically consistent, fast method for inferring species trees from topological gene trees under the multispecies coalescent model. The algorithm of this method takes a collection of unrooted topological gene trees, computes a novel intertaxon distance from them, and outputs a metric species tree. The second establishes that numerical and non-numerical parameters of a specific Profile Mixture Model of protein sequence evolution are generically identifiable. Algebraic techniques are used, especially a theorem of Kruskal on tensor decomposition.

Linear partial differential equations and real analytic approximations of rough functions
Many common approximation methods exist such as linear or polynomial interpolation, splines, Taylor series, or generalized Fourier series. Unfortunately, many of these approximations are not analytic functions on the entire real line, and those that are analytic diverge at infinity and therefore are only valid on a closed interval or for compactly supported functions. Our method takes advantage of the smoothing properties of certain linear partial differential equations to obtain an approximation which is real analytic, converges to the function on the entire real line, and yields particular conservation laws. This approximation method applies to any L₂ function on the real line which may have some rough behavior such as discontinuities or points of non-differentiability. For comparison, we consider the well-known Fourier-Hermite series approximation. Finally, for some example functions the approximations are found and plotted numerically.

Minimal covers of the Archimedean tilings, part II Appendices
These files contain full descriptions of the relations in the presentations of the monodromy groups for the (3.3.4.3.4), (3.3.3.4.4), (4.6.12), and (3.3.3.3.6) tilings. This material was prepared to provide additional material or the possibility of verification of our work for the interested reader of the associated article.

Moose abundance estimation using finite population block kriging on Togiak National Wildlife Refuge, Alaska
Monitoring the size and demographic characteristics of animal populations is fundamental to the fields of wildlife ecology and wildlife management. A diverse suite of population monitoring methods has been developed and employed during the past century, but challenges in obtaining rigorous population estimates remain. I used simulation to address survey design issues for monitoring a moose population at Togiak National Wildlife Refuge in southwestern Alaska using finite population block kriging. In the first chapter, I compared the bias in the Geospatial Population Estimator (GSPE; which uses finite population block kriging to estimate animal abundance) between two survey unit configurations. After finding that substantial bias was induced through the use of the historic survey unit configuration, I concluded that the “standard” unit configuration was preferable because it allowed unbiased estimation. In the second chapter, I examined the effect of sampling intensity on performance of the GSPE. I concluded that bias and confidence interval coverage were unaffected by sampling intensity, whereas the coefficient of variation (CV) and root mean squared error (RMSE) decreased with increasing sampling intensity. In the final chapter, I examined the effect of spatial clustering by moose on model performance. Highly clustered moose distributions induced a small amount of positive bias, confidence interval coverage lower than the nominal rate, higher CV, and higher RMSE. Some of these issues were ameliorated by increasing sampling intensity, but if highly clustered distributions of moose are expected, then substantially greater sampling intensities than those examined here may be required.

Multistate Ornstein-Uhlenbeck space use model reveals sex-specific partitioning of the energy landscape in a soaring bird
Understanding animals’ home range dynamics is a frequent motivating question in movement ecology. Descriptive techniques are often applied, but these methods lack predictive ability and cannot capture effects of dynamic environmental patterns, such as weather and features of the energy landscape. Here, we develop a practical approach for statistical inference into the behavioral mechanisms underlying how habitat and the energy landscape shape animal home ranges. We validated this approach by conducting a simulation study, and applied it to a sample of 12 golden eagles (Aquila chrysaetos) tracked with satellite telemetry. We demonstrate that readily available software can be used to fit a multistate Ornstein-Uhlenbeck space use model to make hierarchical inference of habitat selection parameters and home range dynamics. Additionally, the underlying mathematical properties of the model allow straightforward computation of predicted space use distributions, permitting estimation of home range size and visualization of space use patterns under varying conditions. The application to golden eagles revealed effects of habitat variables that align with eagle biology. Further, we found that males and females partition their home ranges dynamically based on uplift. Specifically, changes in wind and the angle of the sun seemed to be drivers of differential space use between sexes, in particular during late breeding season when both are foraging across large parts of their home range to support nestling growth.

Non-Normality In Scalar Delay Differential Equations
Analysis of stability for delay differential equations (DDEs) is a tool in a variety of fields such as nonlinear dynamics in physics, biology, and chemistry, as well as engineering and pure mathematics. Stability analysis is based primarily on the eigenvalues of a discretized system. Situations exist in which practical and numerical results may not match expected stability inferred from such approaches. The reasons and mechanisms for this behavior can be related to the eigenvectors associated with the eigenvalues. When the operator associated to a linear (or linearized) DDE is significantly non-normal, the stability analysis must be adapted as demonstrated here. Example DDEs are shown to have solutions which exhibit transient growth not accounted for by eigenvalues alone. Pseudospectra are computed and related to transient growth.
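
The phenomenon described in this abstract can be seen in a much simpler setting than a DDE. The sketch below iterates a hypothetical 2x2 non-normal matrix whose eigenvalues both lie inside the unit circle. It is a toy illustration of transient growth, not one of the paper's examples and not a discretized DDE.

```python
import math

def step(matrix, vec):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [
        matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
        matrix[1][0] * vec[0] + matrix[1][1] * vec[1],
    ]

# Hypothetical upper-triangular matrix: both eigenvalues equal 0.9, so the
# iteration x_{k+1} = A x_k is asymptotically stable. The large off-diagonal
# entry makes A far from normal (A times its transpose differs from the
# transpose times A).
A = [[0.9, 5.0], [0.0, 0.9]]

x = [0.0, 1.0]                 # unit-norm starting vector
norms = [math.hypot(*x)]
for _ in range(300):
    x = step(A, x)
    norms.append(math.hypot(*x))

peak = max(norms)              # largest norm reached along the trajectory
```

Here the norm rises to roughly 19 times its initial value before decaying, even though the eigenvalues alone predict decay from the first step.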