GPU Erasure Coding for Parallel File Systems
High-capacity, high-availability parallel file systems that must deliver high bandwidth at lower cost benefit from erasure coding, a technique that has been proven to provide low-cost, reliable storage for cloud infrastructure. Exascale High Performance Computing applications require higher bandwidth and lower latencies than cloud applications; GPU-based erasure coding can provide the performance that exascale file systems require.
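To make the idea concrete, the following is a minimal sketch of the simplest erasure code, a single XOR parity block over a stripe of data blocks. It is not the project's GPU code: the function names are hypothetical, it runs on the CPU, and it tolerates only one lost block, whereas production systems typically use stronger codes such as Reed-Solomon. It does show why the technique suits GPUs: every output byte is computed independently of the others.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index from all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Three data blocks plus one parity block; any single block can be rebuilt.
stripe = encode([b"abcd", b"efgh", b"ijkl"])
rebuilt = recover(stripe, 1)
```

Because XOR is its own inverse, the same routine rebuilds either a data block or the parity block.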
PorToL – Porifera Tree of Life
The PorToL – Porifera Tree of Life project is an interdisciplinary, multi-organizational effort to define the family tree for the phylum Porifera (commonly known as sponges), which contains 8,122 valid species with an estimated 4,000 awaiting discovery and/or description. The CCL lab is responsible for designing the PorToL database and building user interfaces and processing pipelines that will allow researchers to submit genetic sequencer data for processing, cataloging, and sharing among the team.
Hi-PaL is a High-Level Parallelization Language developed for capturing the specifications of concurrency from the end-users of FraSPA. It is not tied to FraSPA and can be used in conjunction with other systems as well. It is application-domain neutral and helps increase the productivity of end-users. In their current scope, both FraSPA and Hi-PaL are used for developing applications for distributed memory architectures.
FraSPA is a Framework for Synthesizing Parallel Applications for multiple platforms in a user-guided fashion. This research work is motivated by the complexities associated with the Message Passing Interface (MPI), the widely used parallel programming standard. FraSPA aims to address these complexities and facilitate the synthesis of parallel programs from sequential application and middleware components. The framework design is based upon design patterns and software engineering techniques such as program transformation and code weaving, and has the goal of making parallel programming "quick and easy" by raising the level of abstraction of the widely used low-level parallel programming approach, i.e., MPI. One of the contributions of FraSPA will be the separation of parallel and sequential code constructs in MPI-based applications, with the goal of reducing code complexity and making code maintenance easier.
Domain-Specific Language for Checkpointing
Checkpointing is one of the key requirements for writing fault-tolerant and flexible applications for dynamic and distributed environments like the Grid. This research effort considers the design and development of a Domain-Specific Language (DSL) for abstracting the application-level Checkpointing and Restart (CaR) mechanism. The specifications written in the DSL are used to semi-automatically generate the application-specific code for the CaR mechanism. This DSL not only provides high-level abstraction but also promotes code reuse, code correctness, and non-invasive reengineering of legacy applications to embed the CaR mechanism in them.
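As an illustration of the application-level CaR mechanism itself (not of the DSL or the code it generates), the sketch below shows the kind of hand-written logic such a mechanism involves. The function names, the pickle-based file format, and the toy summation loop are all assumptions made for this example.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def load_checkpoint(path, default):
    """Return the saved state, or default when no checkpoint exists."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as handle:
        return pickle.load(handle)


def run(path, total_steps, stop_after=None):
    """Sum 0..total_steps-1, checkpointing after every step.

    stop_after simulates a failure after that many steps in this call.
    """
    state = load_checkpoint(path, {"step": 0, "total": 0})
    done = 0
    while state["step"] < total_steps:
        if stop_after is not None and done == stop_after:
            return None  # simulated interruption
        state["total"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
        done += 1
    return state["total"]
```

Calling `run` a second time with the same path resumes from the last saved step rather than starting over. A DSL of the kind described would let the developer declare what to save and when, instead of weaving calls like these into the application by hand.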
Designing Parallel Programs using AOP
The most popular approach to developing parallel programs for distributed memory architectures requires adding explicit message passing calls to existing sequential programs for data distribution, coordination, and communication. Aspect-oriented programming provides an option to separate programming concerns and weave code into applications instead of directly modifying the original program. This effort aims to use Aspect Oriented Programming (specifically AspectC++), components, and patterns for data distribution and message passing to develop parallel programs without making any changes to the existing sequential program. This technique has been used to generate a suite of parallel matrix multiplication algorithms, a suite of parallel genetic algorithms, and several simple parallel algorithms without making any changes to the sequential code. The performance results obtained indicate that the desired functionality is achieved without compromising performance.
UABgrid is a distributed campus-wide computing environment that connects HPC resources across campus and offers access to regional resources via SURAgrid, TeraGrid, and beyond. UABgrid leverages Globus technologies for system inter-connectivity and Shibboleth for federated identity management. UABgrid enables the construction of automated research workflows by providing consistent user identities and system interfaces across all HPC resources on campus. Through continuous research, UAB is unifying access to available systems and deploying applications in a manner that integrates grid-related technologies with the tools that applied scientists are accustomed to using. Our research in metaschedulers and workflow systems aims to provide seamless access to bioinformatics applications such as BLAST and NAMD. To maximize resource utilization and reduce application execution times for users, much work has been done on job scheduling. Through innovative scheduling methods, cluster applications have been successfully transitioned to grid nodes in a way that exploits the heterogeneity of the available hardware and improves overall application performance.
Dynamic BLAST is a master-worker, grid-enabled application that executes BLAST searches on available resources. In a distributed system such as the grid, where numerous heterogeneous resources are available, we can do more than simply run the searches on whatever resources are free. By recognizing and accommodating the features of the available resources, we can perform BLAST-specific resource selection that reduces the total execution time. This is accomplished through locally developed metascheduling ideas that complement resource selection with algorithm matching. Resources are ranked so that the most appropriate ones are selected first, and the most appropriate BLAST algorithm is chosen according to the type and size of each resource (e.g., mpiBLAST vs. sequential BLAST). Combining these ideas with sufficiently small query segments (determined automatically by Dynamic BLAST from the user's input data) allows load balancing at a finer granularity, resulting in more accurate turnaround time predictions. The Dynamic BLAST project is currently in the process of being made available on UABgrid as the key application used for BLAST runs by bioinformaticians at UAB.
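The three ideas above (resource ranking, algorithm matching, and query segmentation) can be sketched roughly as follows. This is an illustrative sketch only, not Dynamic BLAST's actual scheduler: the `Resource` fields, the capacity score, the core-count threshold, and the static round-robin assignment are all assumptions made for the example. A real master-worker scheduler would hand out segments as workers become idle.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cores: int
    relative_speed: float  # hypothetical benchmark score; higher is faster


def rank_resources(resources):
    """Order resources by estimated capacity, largest first."""
    return sorted(resources, key=lambda r: r.cores * r.relative_speed, reverse=True)


def choose_algorithm(resource, parallel_threshold=8):
    """Pick a BLAST variant by resource size (the threshold is an assumption)."""
    return "mpiBLAST" if resource.cores >= parallel_threshold else "sequential BLAST"


def split_queries(queries, segment_size):
    """Cut the query list into segments of at most segment_size queries."""
    return [queries[i:i + segment_size] for i in range(0, len(queries), segment_size)]


def plan(resources, queries, segment_size):
    """Assign segments round-robin over the ranked resources."""
    ranked = rank_resources(resources)
    segments = split_queries(queries, segment_size)
    assignments = []
    for i, segment in enumerate(segments):
        target = ranked[i % len(ranked)]
        assignments.append((target.name, choose_algorithm(target), segment))
    return assignments
```

Smaller segments give the scheduler more units to place, which is what makes finer-grained load balancing possible.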
BLAST performance analysis
The choice of algorithm within an application, the application and resource parameters, and the input data all affect the performance of a job. Thus, when submitting a job, a tested and established selection of job parameters can greatly influence application productivity on a single resource, and even more so across a heterogeneous resource pool. The goal of this project is to collect and analyze the possible parameters and the performance variations exhibited when executing BLAST jobs using various algorithms, parameters, parameter values, and resources. Through a set of examples and benchmarks, we are working toward deriving a set of observations about which parameters most influence execution time and the associated resource cost.
Application Specification Language (ASL)
Application Specification Language (ASL) is a new grid language that we have developed for application developers and end-users to describe the details of a given application. The ASL allows an individual application to be represented in the heterogeneous world of the grid by capturing its purpose, functionality, and options. Through the use of ASL, application descriptions can be made available for immediate use or further advancement by tools such as application deployers, automated interface generators, job schedulers, and application-specific on-demand help provisioning. The ASL can also be used to describe how an application compares to and/or combines with other matching services and software. This ability to specify the composition of services can facilitate the creation of new and added functionality, as well as enable further advancement of existing tools that can take advantage of the provided information.
Application Performance Database (AppDB)
The performance of an application is usually closely tied to the hardware and software characteristics of the resource it runs on, as well as to the application parameters chosen at job instantiation. As such, applications and their associated user jobs exhibit heterogeneous performance when executed in heterogeneous environments. Users perceive this variability through inconsistent job execution times and cost for identical or similar jobs when using different resources. To remedy this drawback, we are developing the Application Performance Database (AppDB), a tool that aims to give the community a way to store and later retrieve historical application performance information on a global scale. This enables the extraction of detailed relationships between an application and a resource, such as application architecture preference, CPU speed vs. CPU number preference, resource software stack preference, etc.
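A minimal sketch of the store-and-retrieve idea, using Python's built-in `sqlite3` module, might look like the following. The table layout, column names, and function names are hypothetical and are not taken from AppDB's actual schema.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a database with one table of historical runs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "application TEXT, resource TEXT, cpus INTEGER, "
        "parameters TEXT, runtime_seconds REAL)"
    )
    return conn


def record_run(conn, application, resource, cpus, parameters, runtime_seconds):
    """Store the outcome of one finished job."""
    conn.execute(
        "INSERT INTO runs VALUES (?, ?, ?, ?, ?)",
        (application, resource, cpus, parameters, runtime_seconds),
    )
    conn.commit()


def mean_runtime_by_resource(conn, application):
    """Return (resource, mean runtime) pairs, fastest resource first."""
    rows = conn.execute(
        "SELECT resource, AVG(runtime_seconds) FROM runs "
        "WHERE application = ? GROUP BY resource "
        "ORDER BY AVG(runtime_seconds)",
        (application,),
    )
    return rows.fetchall()
```

Grouping historical runs by resource is one simple way to surface the kind of application-to-resource preference described above; grouping by CPU count or parameter string would follow the same pattern.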
GridAtlas is a tool that hides and automates the process of keeping track of the installation properties of an application across resources. Because existing grid middleware (e.g., Globus Toolkit, GridWay) enables simultaneous access to and invocation of application instances across grid resources, the complexities of the installation properties on the selected resources should not be left for the end-user to manage. Therefore, when a user submits a job, the job submission interface contacts GridAtlas and obtains the information needed to complement the user-provided information, after which the remaining steps of the job submission process can take place.
The Virtual Organization Collaboration System (myVocs) provides a flexible environment for exploring new approaches to security, application development, and access control built from Internet services without a central identity repository. By leveraging the emerging distributed identity management infrastructure, myVocs provides an accessible, secure collaborative environment using standards for federated identity management and open-source software developed through NSF NMI. The Shibboleth software provides the middleware needed to assert identity and attributes across domains so that access control decisions can be made at each resource based on local policy. The myVocs system is also being integrated with Grid authentication and authorization using GridShib.
Adaptive Parallel Genetic Algorithms
Genetic algorithms are a widely used technique for search and optimization problems and belong to the group of evolutionary algorithms. One of their recent applications has been in image clustering. Serial implementations of the algorithm, customized for image clustering, suffer from slow execution. Also, a genetic algorithm that is run for only a small number of generations can get stuck in a local minimum, and therefore requires a larger number of generations to achieve optimal results. These two factors motivated the parallel implementation of the algorithm. We identified the concurrency in the application and partitioned it so that I/O and computation were done in parallel. Both the “Master-Worker” paradigm and the “All-Worker” paradigm were implemented in C and MPI. Distributing the computation-intensive task among multiple processors resulted in a linear speedup and a vastly reduced execution time. While the “Master-Worker” model is better suited for the grid environment, the “All-Worker” model performs better on dedicated clusters. We modified the “Master-Worker” implementation further to run successfully and intelligently in a Grid environment, where resources are heterogeneous, distributed across disparate locations, and can appear and disappear dynamically. This eventually turned out to be a scheduling problem that can be solved using operations research techniques.
The Grid Automation and Generative Environment (GAUGE) uses concepts of domain-specific modeling (DSM) to build a high-level abstract layer that enables users to create Grid applications without knowledge of specific programming languages and without being bound to specific Grid platforms. The goal of GAUGE is to automate the generation of Grid applications so that inexperienced users can exploit the Grid fully. At the same time, GAUGE provides an open framework that experienced users can build upon and extend to tailor their applications to particular Grid environments or specific platforms.
Grid-Flow is a scientific workflow infrastructure that assists researchers in specifying scientific experiments using a Petri-net-based interface. The contributions of Grid-Flow are as follows: (1) a new, lightweight, programmable Grid workflow language, Grid-Flow Description Language (GFDL), to describe the workflow process in a Grid environment; (2) a Petri-net-based user interface, based on the Generic Modeling Environment, to help the user design the workflow process with a Petri-net model; and (3) a program integration component of the Grid-Flow system to integrate all possible programs into the system.
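For readers unfamiliar with Petri nets as a workflow model, the sketch below shows the core mechanics: places hold tokens, and a transition (here standing for a workflow step) may fire only when every one of its input places holds a token. This is a generic illustration, not GFDL or the Grid-Flow implementation. The class, its methods, and the two-step example workflow are hypothetical, and the sketch assumes each input place is listed once per transition.

```python
class PetriNet:
    """Minimal place/transition net: places hold integer token counts."""

    def __init__(self, marking):
        self.marking = dict(marking)
        self.transitions = {}

    def add_transition(self, name, inputs, outputs):
        """Register a transition that consumes one token per input place
        and produces one token per output place."""
        if not inputs:
            raise ValueError("a transition needs at least one input place")
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds a token."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(place, 0) > 0 for place in inputs)

    def fire(self, name):
        """Move tokens from the input places to the output places."""
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] = self.marking.get(place, 0) + 1

    def run(self):
        """Fire enabled transitions until none remain; return the firing order."""
        order = []
        while True:
            ready = [t for t in self.transitions if self.enabled(t)]
            if not ready:
                return order
            self.fire(ready[0])
            order.append(ready[0])


# A two-step workflow: raw data is aligned, then summarized into a report.
net = PetriNet({"raw_data": 1})
net.add_transition("align", ["raw_data"], ["aligned"])
net.add_transition("summarize", ["aligned"], ["report"])
firing_order = net.run()
```

In a workflow system, firing a transition would correspond to launching the program attached to that step, with tokens representing the data or control signals passed between steps.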