    An implementation of a numerical advection equation solver on modern graphics cards using compute unified device architecture

    Name:
    Dang_W_2010.pdf
    Size:
    11.38Mb
    Format:
    PDF
    Author
    Dang, Wei
    Keyword
    Graphics processing units
    Computer graphics
    URI
    http://hdl.handle.net/11122/12771
    Abstract
    "In the past decade, the Graphics Processing Unit (GPU) is reported to have become a powerful general-purpose computation platform for various application areas. The Arctic Region Supercomputing Center (ARSC) intends to assess the capability of this emerging computing tool so that they may enlist it as component of supercomputing systems, but at a lower cost. This thesis reports on parallelization, on both GPU and CPU, of a numerical algorithm named the Total Variation Diminishing (TVD) scheme, which is used in the Eulerian Polar Parallel Ionospheric Model (EPPIM) developed at UAF's Geophysical Institute (GI) and ARSC. The GPU (single NVIDIA Tesla® C2050) and CPU (dual Intel Xeon x5560) implementations were parallelized using the Compute Unified Device Architecture (CUDA) language and OpenMP with the C language respectively. A speedup of up to 175x was observed when comparing the CUDA/GPU implementation to the non-parallelized CPU version, and of almost 40x when comparing to the parallelized CPU version. Results also demonstrated an average floating-point-operation rate of 107 GFLOPs, 351 times more than that the CPU version can offer. However, there is still space for improvement as only one tenth of the peak theoretical performance of the C2050 was achieved"--Leaf iii.
    Description
    Thesis (M.S.) University of Alaska Fairbanks, 2010
    Table of Contents
    1. Introduction -- 1.1. Motivation -- 1.2. Similar work -- 1.3. Contribution -- 1.4. Thesis outline -- 2. Background -- 2.1. Evolution of GPU computing -- 2.2. Compute Unified Device Architecture -- 2.2.1. Hardware architecture -- 2.2.2. Software architecture -- 2.2.3. Terminology -- 2.2.4. Compilation workflow -- 2.2.5. CUDA memory model -- 2.2.6. Programming methodology -- 2.2.7. Performance considerations for scientific computing -- 2.3. Mathematical background -- 2.3.1. Continuity equation -- 2.3.2. Numerical schemes -- 2.3.3. The corner transport upwind scheme -- 2.3.4. The Lax-Wendroff scheme -- 2.3.5. The TVD scheme -- 3. Algorithms -- 3.1. Introduction -- 3.2. The serial algorithm -- 3.3. The parallel algorithms -- 4. Performance test and analysis -- 4.1. Hardware configuration -- 4.2. Methodology -- 4.2.1. Testing approach -- 4.2.2. Testing environment -- 4.2.3. Validation -- 4.3. Results and analysis -- 4.3.1. Serial implementation -- 4.3.2. The single-kernel parallel implementation -- 4.3.3. The multi-kernel parallel implementation -- 5. Conclusions and future work -- 5.1. Conclusions -- 5.2. Future work -- References -- Appendix.
    Date
    2010-12
    Type
    Thesis
    Collections
    Engineering

