Predictive Modeling of Avian Influenza in Wild Birds
Over the past 20 years, highly pathogenic avian influenza (HPAI), specifically Eurasian H5N1 subtypes, has caused economic losses to the poultry industry and sparked fears of a human influenza pandemic. Avian influenza virus (AIV) is widespread in wild bird populations in the low-pathogenicity form (LPAI), and wild birds are thought to be the reservoir for AIV. To date, however, nearly all predictive models of AIV focus on domestic poultry and HPAI H5N1 at a small country or regional scale. Clearly, there is a need and an opportunity to explore AIV in wild birds using data-mining and machine-learning techniques. I developed predictive models using the Random Forests algorithm to describe the ecological niche of avian influenza in wild birds. In “Chapter 2 - Predictive risk modeling of avian influenza around the Pacific Rim”, I demonstrated that it was possible to separate an AIV-positivity signal from general surveillance effort. Cold winters, high temperature seasonality, and a long distance from the coast were important predictors. In “Chapter 3 - A global model of avian influenza prediction in wild birds: the importance of northern regions”, northern regions remained areas of high predicted occurrence even when using a global dataset of AIV. In surveillance data, the percentage of AIV-positive samples is typically very low, which can hamper machine learning. For “Chapter 4 - Modeling avian influenza with Random Forests: under-sampling and model selection for unbalanced prevalence in surveillance data”, I wrote custom code in the R statistical programming language to evaluate a balancing algorithm, a model selection algorithm, and an under-sampling method for their effects on model accuracy. Repeated random sub-sampling was found to be the most reliable way to improve unbalanced datasets.
In these models, cold regions consistently bore the highest relative predicted occurrence scores for AIV-positivity, describing a niche for LPAI that is distinct from the niche for HPAI in domestic poultry. These studies represent a novel, initial attempt at constructing models for LPAI in wild birds and demonstrate high predictive power.
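
The repeated random sub-sampling mentioned for Chapter 4 can be illustrated with a minimal sketch. This is not the author's R code. It is a Python illustration that assumes binary labels (1 = AIV-positive, 0 = negative) and a 1:1 class ratio in each draw. The function name `balanced_subsamples` and the example labels are hypothetical.

```python
import random
from typing import Iterator, List, Sequence


def balanced_subsamples(
    labels: Sequence[int], n_repeats: int, seed: int = 0
) -> Iterator[List[int]]:
    """Yield row indices for repeated random under-sampling.

    Each draw keeps every minority-class (positive) row and an equally
    sized random sample of majority-class (negative) rows. The 1:1 ratio
    is an assumption of this sketch, not a detail taken from the thesis.
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    if len(positives) > len(negatives):
        raise ValueError("expected positives to be the minority class")
    for _ in range(n_repeats):
        # Sample negatives without replacement within a single draw.
        drawn = rng.sample(negatives, len(positives))
        yield sorted(positives + drawn)


# Made-up surveillance labels: 5 positive samples among 100.
labels = [1] * 5 + [0] * 95
subsets = list(balanced_subsamples(labels, n_repeats=10))
```

Under these assumptions, a classifier such as a Random Forest could be fitted to each balanced subset and the predictions averaged across draws, so that no single random choice of negative samples dominates the result.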