GA tasks (Updated 2004-01-09)

 

4. Education and Documentation

 

4.a Provide Tutorial for Understanding Certificates (Y1Q1, 8 weeks)

 

This will be a tutorial on X.509 certificates.  The intended audience is the physics researcher using the National Fusion Grid.  The tutorial covers the basics of public key encryption, digital signatures, and certificates.  It explains how certificates are used on the National Fusion Grid and gives other illustrations of certificate use (web site certificates, for example).  The tutorial also outlines the use of proxy certificates in a grid environment.  By completing this tutorial the physics researcher will understand both why they have certificate and private key files and how to use them on the National Fusion Grid.

 

·      Develop a written tutorial.  This tutorial will be printed in a short booklet format and made available as a PDF document. (3 weeks)

·      Create a one-hour presentation for this tutorial document.  The presentation will be made available in electronic format (PDF and/or PowerPoint).  The presentation will be given at the APS/DPP 2004 conference, TTF 2005, and the Sherwood Theory conference 2005, and through AG meetings.  (2 weeks + 1 week for organization)

·      Create a web page for the tutorial.  The tutorial will be formatted as an easy-to-navigate set of HTML pages and made available through the fusiongrid.org website.  (2 weeks)

 

4.b Complete Documentation (Y2Q3, 27 weeks + 2 weeks/year)

 

There will be a User's Guide, a Programmer's Guide, and an Administrator's Guide for the National Fusion Grid.
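As a quick consistency check on the effort figures, the sketch below sums the week estimates quoted in the headings of 4.b.1, 4.b.2, and 4.b.3 and compares the result with the 27 weeks quoted for 4.b.  The function and variable names are illustrative only; the recurring 2 weeks/year of maintenance is left out of the sum.

```python
# Week estimates copied from the headings of tasks 4.b.1 to 4.b.3.
GUIDE_WEEKS = {
    "4.b.1 User's Guide": 10,
    "4.b.2 Programmer's Guide": 9,
    "4.b.3 Administrator's Guide": 8,
}
HEADLINE_WEEKS = 27  # one-time effort quoted in the 4.b heading


def total_weeks(estimates):
    """Return the sum of the per-guide week estimates."""
    return sum(estimates.values())


def budget_matches(estimates, headline):
    """True when the per-guide estimates add up to the headline figure."""
    return total_weeks(estimates) == headline


if __name__ == "__main__":
    print(total_weeks(GUIDE_WEEKS), budget_matches(GUIDE_WEEKS, HEADLINE_WEEKS))
```

The same check can be repeated for any task whose heading total is broken down into bulleted sub-estimates.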

 

4.b.1 Complete User's Guide to National Fusion Grid Services (Y1Q4, 10 weeks + 2 weeks/year)

 

This user's guide will be written for the physics researcher.  It will provide an overview of grid computing, including the concepts of grid services and authorization.  The user's guide will introduce a list of services available on the National Fusion Grid along with instructions on getting more information about the individual services.  After reading this guide the physics researcher will understand why we have a grid, what they can get from the National Fusion Grid, and how they can use the services available.

 

In addition to explaining Fusion Grid Services, the User's Guide will also document tools for collaboration & remote participation available for the NFG.

 

The guide will also outline visualization tools available through the Collaboratory for use on the NFG.

 

It is anticipated that the user's guide will need to be updated periodically as new services are added to the National Fusion Grid.  While most of the work on the guide will be done up front, follow-up work will be required to keep the documentation up-to-date.

 

The User's Guide will be created as a written booklet, available in electronic format (PDF) and in printed form, as well as HTML.  We need to discuss how to keep the PDF document synchronized with the HTML document.

 

Write an introduction to the National Fusion Grid, focusing on capabilities for the researcher (1.5 weeks)

·      computation on the NFG: the researcher can crunch numbers

·      visualization: the researcher can look at large data sets

·      collaboration & remote participation: the researcher can work with others

 

Provide an overview of grid computing (0.5 weeks)

·      what is a grid?

·      what is a service?

·      basics of certificates & authorization (reference the certificate guide)

 

List services available on National Fusion Grid (1 week)

·      what is the purpose of each service?

·      what are the inputs & outputs?

·      what tools are available for use with each service?

·      where does the researcher find the complete documentation for each service?

·      where does the researcher download (if necessary) utilities for each service?

·      who does the researcher contact for more information about each service?

 

List visualization tools available (1 week)

·      overview of tools available

·      why you would use each tool (better for large data, etc.)

·      where to find documentation for each tool

·      where to download (might ship with NFG software package)

 

Document tools for collaboration & remote participation (1 week)

·      what are AG nodes and why would the researcher want to use them?

·      what control room remote participation capabilities are available?

·      what control room tiled display capabilities are available?

 

Explain prerequisites: getting ready to use National Fusion Grid (1.5 weeks)

·      how does the researcher obtain & install a certificate? (reference certificate documentation here)

·      document the data usage policy (or policies) and other "administrative hoops"

·      does the researcher need a SecurID and if so how does the researcher obtain and use the SecurID?

·      how does the researcher go about installing NFG software? (reference admin guide)

 

Using AG & collaborative software (1.5 weeks)

·      how does the researcher set up an AG node? (reference admin guide)

·      joining/hosting AG session

·      AG etiquette (volume levels, etc.)

·      how does the researcher share their desktop?

·      ? (should identify software to be used, e.g. VRVS, AG1 or AG2, and add details for those items)

 

Maintain HTML and PDF copies.  If automated HTML generation is available, only one copy needs to be maintained.

·      convert guide to a set of HTML pages (2 weeks)

·      maintain both copies (HTML & PDF) of documentation (2 weeks/year)

 

4.b.2 Complete Programmer's Guide to Writing Services for the National Fusion Grid  (Y2Q3, 9 weeks)

 

This guide will be written for the software developer.  The guide will provide a language-neutral explanation of Globus Toolkit version 2 grid services.  The guide will include a walkthrough of adding a grid service to the National Fusion Grid.  This walkthrough will include example source code; source code will be written in C.  Any scripting examples will be written in Bourne Shell.  Examples will be for a UNIX system.

 

The walkthrough will take as a starting point a working application that is to be turned into a grid service.  The walkthrough will illustrate making the application available as a service, invoking the service remotely, shipping inputs to the service, having the service output monitoring information to the Fusion Grid Monitor, and writing outputs when the service is complete.

 

Using GT2 as a starting point, the programmer's guide will introduce GT3 and provide guidance for converting a service from GT2 to GT3.  Differences and incompatibilities between GT2 and GT3 will be discussed in the document.

 

The guide will define any standards to which each National Fusion Collaboratory service must conform, and any conventions to which each service should conform.  The programmer's guide will also provide a standard documentation template to assist developers in the task of grid service documentation (not just a template for comments but a template for documentation).

 

The Programmer's Guide will be made available in HTML format on fusiongrid.org.

 

·      Provide an overview of GT2 services (1 week)

·      Create a walkthrough for adding a service for National Fusion Grid (3.5 weeks)

·      Explain what is needed to migrate a service from GT2 to GT3 (0.5 weeks)

·      Write grid service standards/conventions (2.5 weeks)

·      Provide grid service documentation template (1.5 weeks)

 

4.b.3 Complete Administrator's Guide to the National Fusion Grid (Y2Q1, 8 weeks)

 

This document will be written for systems administrators tasked with installing and maintaining software and systems that will interact with the National Fusion Grid.  This guide will give an overview of grid technology.  A list of systems software for the National Fusion Grid (e.g. Globus Toolkit, MDSplus) will be provided. The guide will include step-by-step instructions for installing required National Fusion Grid software such as the Globus Toolkit and MDSplus.  General instructions will be for UNIX; specific instructions will be for Red Hat 7.x and 9 Linux distributions.

 

It is understood that this document will depend on the development of a National Fusion Grid software distribution package.  This guide should therefore be completed after the software distribution package has been completed.

 

In addition to the above software documentation, the administrator's guide will document the installation and operation of Access Grid nodes and control room tiled wall displays.

 

The guide will be made available in HTML format on fusiongrid.org.

 

Brief overview of grid technology (0.5 weeks)

·      what is a grid?

·      what are services?

·      what are certificates?

·      what is an AG node?

·      what is a tiled display wall?

 

Administration of grid hosts and grid services (4 weeks)

·      How to install & maintain GT2/GT3

·      How to request, install, and maintain host certificates

·      How to install and maintain a grid-enabled MDSplus service

·      How to install/maintain any other required systems software not already listed

·      How to maintain a grid service

·      Firewall requirements

·      Migration from GT2 to GT3

 

AG node administration (2.5 weeks)

·      Hardware requirements for full & personal AG nodes

·      Network & firewall requirements

·      How to install and maintain an AG node

·      How to operate an AG node

·      AG node security issues

 

Control room tiled display wall administration (1 week)

·      Hardware & software requirements

·      Tiled wall server software installation and maintenance

 

4.c National Fusion Grid Software Package (Y1Q3, 13 weeks + 3 weeks/year)

 

This software package will include systems software, documentation, and applications for use with the National Fusion Grid.  Contents include, but are not limited to: Globus Toolkit (those components needed), secure MDSplus, PreTRANSP, SendPost, Firebird web browser (for use with FGM), Understanding Certificates, and User's Guide to National Fusion Grid Services.

 

We must identify supported platforms, e.g. Linux Red Hat 9/32-bit Intel.

 

For those platforms supported, the software package should be as easy to install as possible.  Perhaps this would be an RPM,  or maybe an install script.

 

Any dependencies not included in the software package must be clearly documented.  For example, if we can't include a Java runtime  (needed by SendPost) then we must include instructions on where to obtain Java.

 

The package should include as many prerequisites as possible.  This should provide a "one-click" install.  Users should not need to separately install each prerequisite (e.g. if Globus needs Perl, specific Perl modules, Java, specific Java components, ant, and junit then those should ship with the package if possible).  The package will be built with the assumption that we have those components that ship with the OS but nothing else.
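For prerequisites that cannot be bundled, an install script might check for them up front and tell the user what is missing.  The following is a minimal sketch of such a check, not a planned implementation.  The prerequisite list is a placeholder taken from the examples above; the real list is still to be specified under this task.

```python
import shutil

# Hypothetical prerequisite list for illustration only; the actual list is to
# be determined when the package contents are specified.
PREREQUISITES = ["perl", "java", "ant"]


def missing_prerequisites(names, lookup=shutil.which):
    """Return the names for which no executable is found on the PATH."""
    return [name for name in names if lookup(name) is None]


def report(names, lookup=shutil.which):
    """Build a one-line summary that an install script could print."""
    missing = missing_prerequisites(names, lookup)
    if not missing:
        return "all prerequisites found"
    return "missing: " + ", ".join(missing)


if __name__ == "__main__":
    print(report(PREREQUISITES))
```

The lookup function is passed in as a parameter so the check can be exercised without depending on what happens to be installed on the test machine.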

 

This task will require an initial effort to develop the package as well as ongoing effort to keep the package up-to-date.

 

·      Investigate and specify supported platform(s) (1 week)

·      Investigate and specify software and documentation to include in package (1 week)

·      Decide what to do with commercial software (users' own responsibility or provide run-time and if so how, etc.) if there is any (1 week)

·      Write beta version of package for one platform (4 weeks)

·      Test beta version, fix bugs from beta release, do 1.0 release (6 weeks)

·      Keep package up-to-date (3 weeks/year for one platform, add 2 weeks/year per additional platform)

 

4.d Web Site Maintenance (4 weeks/year)

 

The Collaboratory must allow for web site maintenance for the fusiongrid.org web site.

 

4.e Help Desk (13 weeks/year)

 

The National Fusion Grid will have a single point-of-contact for questions about using the grid.  This person will be expected to field questions from users, grid service developers, and systems administrators.  The help desk person will provide direct support when possible, or refer the question to the appropriate person when direct support is not possible.  The expectation is that if the help desk person can't answer the question, they will know who can.

 

The help desk person will create and maintain a web-based FAQ for the NFG.

 

5. Large Scale Data Management

 

5.b Provide a Scheme for Large Scale Data Management for Simulations (Y2Q2, 15 weeks +)

 

The motivation of this work is the desire to compare theory and experiment.

 

Write several scenarios for large-scale simulation data management.  Each scenario includes how a researcher loads inputs, how the code is invoked, how results are written, and how results are visualized.

 

Scenarios should include inputs and outputs of various sizes.  There will be at least one scenario each for outputs of size 50 MB, 500 MB, 1 GB, 50 GB, and 500 GB.

 

Each scenario will be tested.  Tests will include timing information.   Results will be quantified.

 

Scenarios should test inputs/outputs using flat files, HDF5, and MDSplus (with and without compression); results will be quantified and compared.
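A timing test of this kind could be organized as a matrix of output sizes and storage formats.  The sketch below enumerates that matrix and times only the flat-file case, using the Python standard library.  The HDF5 and MDSplus cases (and the compression variants) would need their own libraries and are not sketched here.  All function names are illustrative, and treating 1 GB as 1000 MB is an assumption.

```python
import os
import tempfile
import time

# Output sizes named in the scenarios, in megabytes (assumes 1 GB = 1000 MB).
SCENARIO_SIZES_MB = [50, 500, 1000, 50000, 500000]
# Storage formats to compare; only "flat file" is exercised in this sketch.
FORMATS = ["flat file", "HDF5", "MDSplus"]


def scenario_matrix(sizes_mb, formats):
    """Enumerate every (size, format) pair that needs a timing run."""
    return [(size, fmt) for size in sizes_mb for fmt in formats]


def time_flat_file(num_bytes, chunk_bytes=1 << 20):
    """Write then read num_bytes in a temporary file; return (write_s, read_s)."""
    chunk = b"\0" * chunk_bytes
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            remaining = num_bytes
            while remaining > 0:
                n = min(remaining, chunk_bytes)
                f.write(chunk[:n])
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # include the cost of reaching the disk
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        read_total = 0
        with open(path, "rb") as f:
            while True:
                data = f.read(chunk_bytes)
                if not data:
                    break
                read_total += len(data)
        read_s = time.perf_counter() - start
        assert read_total == num_bytes
        return write_s, read_s
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(len(scenario_matrix(SCENARIO_SIZES_MB, FORMATS)), "timing runs needed")
    # Scaled-down trial (1 MB) so the sketch runs quickly anywhere.
    write_s, read_s = time_flat_file(1 << 20)
    print("write %.4f s, read %.4f s" % (write_s, read_s))
```

Reads of a freshly written file are likely to be served from the operating system cache, so a real test would need to account for caching before the numbers are compared across formats.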

 

The results of these tests will be used to make intelligent decisions about data management for large files for simulations.  Test results will be used to reach conclusions about which technologies are to be used  and how they are to be used.

 

Preliminary work (3 weeks)

·      Identify requirements for comparison of simulation and experiment (1 week)

·      Identify requirements for sharing data within simulation community (1 week)

·      Develop use cases (1 week)

 

Phase I: understand performance and identify most promising technologies (6 weeks)

·      Design & execute tests for various data sizes for 1 client and 1 server:

o      HDF5 vs. remote MDSplus: find out where MDSplus breaks down in terms of speed, and where HDF5 breaks down because of the size of the file

o      HDF5 vs. parallel MDSplus

o      HDF5 vs. local MDSplus

o      Investigate what is involved for caching MDSplus tree files

o      Write HDF5 to MDSplus, retrieve MDSplus and convert to HDF5

o      GridFTP HDF5 to remote server, write to MDSplus, compare with parallel MDSplus, reverse for retrieval

 

Phase II: continue tests on most promising technologies (6 weeks)

·      Check use cases: using the most promising technologies based on results of Phase I testing, design & execute tests for specific use cases (4 weeks)

·      Site testing: perform tests between various client sites and remote server sites including 3 tokamak sites, NERSC, TCF at PPPL (2 weeks)

 

Conclusion

·      Decide whether one solution fits all or different requirements warrant different solutions

·      Determine roles of NERSC and TCF.

 

6. Computational Processing

 

6.a GYRO grid service available on FusionGrid (Y1Q4, 11 weeks)

 

The National Fusion Collaboratory will create a grid service for the GYRO plasma turbulence simulation code.  The new grid-enabled GYRO code will run on the PPPL Linux cluster.  Monitoring information will be made available through the Fusion Grid Monitor (FGM).

 

Along with the GYRO service, the Collaboratory will develop a GYRO client used to simplify the tasks of preparing a GYRO code run and dispatching the run for execution on FusionGrid.  [will this client have a GUI or is it to be simple scripts?]

 

·      Adapt GYRO for use as a grid service  (4 weeks)

·      Instrument GYRO for monitoring information (1 week)

·      Add page to FGM for GYRO monitoring information (2 weeks)

·      GYRO preparation client (4 weeks)

 

6.b Computation reservation deployed for between-shot computation (Y2Q? depend on ANL-Keahey, 13 weeks)

 

The Collaboratory will develop a tool for reserving computational resources (e.g. CPU time, memory, mass storage) on the National Fusion Grid.  This computation reservation program will provide users with an easy-to-use GUI through which they may make resource reservations in advance of actual resource usage.  The program will be used by researchers for between-shot computation.  The users will specify the code, input size, and reservation date; the application will then estimate computation time and either make the appropriate reservation or indicate to the user that such a reservation is not possible.  The tool must be able to reserve resources for multiple codes.
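The estimate-then-reserve logic described above might look roughly like the sketch below.  Everything in it is an assumption made for illustration: the linear cost model, its coefficients, the code names, and the idea of tracking free capacity as seconds per day.  The actual tool is expected to build on existing reservation software.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical per-code cost model: seconds = base + per_mb * input size.
# The coefficients are placeholders, not measurements.
COST_MODEL = {
    "code_a": (30.0, 0.5),
    "code_b": (60.0, 2.0),
}


@dataclass
class Request:
    code: str        # which code the user wants to run
    input_mb: float  # input size in megabytes
    day: date        # requested reservation date


def estimate_seconds(code, input_mb, model=COST_MODEL):
    """Estimate computation time for one run of the given code."""
    base, per_mb = model[code]
    return base + per_mb * input_mb


def reserve(request, free_seconds_by_day, model=COST_MODEL):
    """Reserve time if the day has enough free capacity.

    Returns the seconds reserved, or None when the reservation is not
    possible (unknown code or insufficient capacity).
    """
    if request.code not in model:
        return None
    needed = estimate_seconds(request.code, request.input_mb, model)
    available = free_seconds_by_day.get(request.day, 0.0)
    if needed > available:
        return None
    free_seconds_by_day[request.day] = available - needed
    return needed


if __name__ == "__main__":
    day = date(2004, 1, 9)
    capacity = {day: 100.0}
    print(reserve(Request("code_a", 100, day), capacity))
    print(reserve(Request("code_a", 100, day), capacity))
```

Keeping the cost model in a per-code table is one way to meet the requirement that the tool support multiple codes: adding a code means adding an entry rather than changing the reservation logic.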

 

It is expected that this tool will leverage existing software used to reserve computational resources.  Much of the work will be in making the system easy to use and flexible enough for general use with different codes.

 

·      User interface design (2 weeks)

·      Calculate computation time based on parameters (1 week)

·      Adapt existing resource reservation software for general use (2 weeks)

·      GUI implementation and testing (2 weeks)

·      Beta version deployment, testing with one code (3 weeks)

·      Modify program based on suggestions from Beta, test with multiple codes (3 weeks)

 

 

7. Collaboratory Control Room tasks

 

7.a One to many within the control room (Y?Q? depend on PCS, 7 weeks + 2 weeks/year)

This will allow one scientist at a computer in the control room to share his or her display or applications on the control room tiled display wall.  The supported platforms will include Linux, Windows and MacOS X.  The primary applications to be supported are any X-windows applications and web browsers.  The display sharing software will be field-tested in the control rooms.

 

·      Install and test server software on the computer that controls the tiled display wall in the DIII-D control room; install and test the client software on one machine for each supported platform (2 weeks)

·      Propagate the client software to all of the personal computers in the DIII-D control room (1 week)

·      Field test the system, give feedback to the developers and iterate (4 weeks)

·      Assist users and maintain the system in the control room (2 weeks/year)

 

7.b One outside to one inside the Control Room  (Y?Q? depend on ANL-Papka, 10 weeks + 4 weeks/year)

 

This will allow one scientist at a computer outside the control room to share his or her display or applications with one scientist inside the control room.  The sharing involves interactive discussion of analysis and therefore requires audio and video.  It will be accomplished using VNC within AG.

 

 

7.c One outside to many inside the Control Room (Y?Q? depend on PCS, 6 weeks + 2 weeks/year)

 

This will allow one scientist at a computer outside the control room to share his or her display or applications on the control room tiled display wall.  This will involve both the AG sharing and the tiled display sharing.

 

 

7.g Solutions for Audio in Control Room (Y1Q1 depend on ANL-Papka, 8 weeks)

 

The Collaboratory will solve the unique audio problem of collaboration between an outside group and a large control room, where the background is noisy and the sharing needs to be switched easily between private and public.

 

 

7.h Experiment Browser (Y1Q2, 10 weeks)

 

This task requires that the Data Analysis Monitor (DAM) be enhanced to provide more information in a concise way.  The enhanced DAM would also have the ability to plot simple XY signals and would provide the user a shot countdown clock.

 

The current state of the art is to use DAM along with the real-time web display, and the shot clock.  This experiment browser would combine the functionality of these three tools.

 

A requirement of the experiment browser is that it fit as much information as possible into as little screen space as possible.  This is to be an information-dense, expert display.  There will be a mechanism for user-customization of the information display.

 

The browser should work on multiple platforms. OS X, Win32, and Linux must be supported.  The browser will be designed such that it may be used for different tokamak experiments.

 

 

9. Advanced Visualization tasks

 

9.b Simulation Community is Target Audience (Beginning Y1Q2, 8 weeks)

 

 

9.c Easy and Robust to use (Y1Q2 depend on Utah, 4 weeks)

 

 

N.x Project Management (15 weeks/year)

 

We should account for project management...ask Dave for more info here.