# Thu Aug 28 2003 16:29:24PM  Thomas Naughton
#
#  OSCAR Meeting @ NCSA
#  August 25-26, 2003

Attendees:
----------
  Jeremy Enos       NCSA    jenos@ncsa.uiuc.edu
  Tom Lehmann       Intel   tom.lehmann@intel.com
  Tim Mattson       Intel   timothy.g.mattson@intel.com
  John Mugler       ORNL    muglerj@ornl.gov
  Brian Finley      BGSW    brian@bgsw.net
  Stephen Scott     ORNL    scottsl@ornl.gov
  Jeff Squyres      IU      jsquyres@lam-mpi.org
  Rich Libby        Intel   rml@hpc.intel.com
  Jason Brechin     NCSA    brechin@ncsa.uiuc.edu
  Terry Fleury      NCSA    tfleury@ncsa.uiuc.edu
  Neil Gorsuch      NCSA    ngorsuch@ncsa.uiuc.edu
  Hrabri Rajic      Intel   hrabri.rajic@intel.com
  Thomas Naughton   ORNL    naughtont@ornl.gov

Summary:
--------
 Thomas: These are in part from notes and in part from memory. If you see
 items that should be corrected/added please let me know.

 The meeting focused primarily on the enhancements proposed by NCSA to the
 Package API and the Configuration changes (see Figure 'new-pkg-diagram').
 The slides by Jeremy Enos outlined the NCSA efforts (good/bad) that led
 to their proposal -- leading to several simplifications (see slides).

 There was also discussion about what the actual GUI interface should look
 like for this new configuration flow. Terry provided suggestions and a
 description of what he's planning for the proposed changes.

 These changes are all slated for the OSCAR 4.x phase of things. The
 upcoming SC'03 release (oscar-3.0) will continue to function under the
 existing package and windowing setup. However, the 4.x design document
 is a target item for this OSCAR 3.0 release (see feature list at the end
 of this summary).

 Also discussed were the issues related to distributing source for things
 like compiling GM, etc. It was decided that in the case that things must
 be dynamically built (e.g., b/c of kernel differences or pkgs that are
 only known based upon what the user selects, such as lam w/ GM support)
 they would remain in the domain of RPMS/SRPMS.
 That way, whatever is ultimately installed will be an RPM, and if needed
 the SRPMS can be rebuilt either in the chroot'ed image area or on the
 nodes themselves if the build must take place on the hardware itself
 (via a tmp build of that node, and then grabbing the RPMS, or some
 similar method).

 The world of IA64 was discussed and basically it was decided that there
 are currently no current/stable ia64 distros for use with OSCAR -- ia32
 and ia64 distros are not evolving at the same pace. This gap between the
 ia32 & ia64 distros leaves support for ia64 behind b/c they're still a
 minority. The only potential distros at present are the upcoming RedHat
 Advanced Server release, which is more current but not yet available,
 and "OSCAR-gold/OSCAR-linux" (not sure about name), which is coming from
 Progeny along w/ NCSA efforts and is a RedHat 7.x-like release. The
 latter doesn't appear to have much of a future and thus is not likely to
 be too significant w.r.t. future OSCAR releases (i.e., SC'03 time
 frame). There was discussion from Intel that there might be support for
 efforts to address this ia64 distro void.

 OSCAR "etiquette" was discussed by Stephen. Basically, things like
 tutorials and proposals for conference BoF's (and the like) need to be
 better coordinated among the group. The point was raised based on
 comments that things didn't look "organized" when OSCAR members
 submitted conflicting/competing proposals for the same event. This was
 not to say that papers, etc. should require some group-wide inclusion,
 rather a "heads up" so that things look professional and organized.

 Neil gave an overview of ODA, including example usage and a summary of
 the design, e.g., shortcuts (see slides). The node/pkg group material
 was listed in the slides -- how it's laid out in the DB.

 There was a brief summary of XCAT by Brian Finley. It is being used by
 the TeraGrid and is also being used to wrap SystemImager (see Notes
 below).
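
 As a rough illustration of the attribute-based node grouping that was
 shown for XCAT (detailed in the Notes below), here is a minimal Python
 sketch that parses a node/attribute table and builds the dynamic groups.
 The table layout, the function names (parse_nodes, nodes_with) and the
 sample data are assumptions made purely for illustration; this is not
 XCAT's actual code or file format.

```python
# Hypothetical sketch of attribute-based node grouping; NOT XCAT code.
from collections import defaultdict

# Assumed table format: "<node> <attr>[,<attr>...]", one node per line.
NODE_TABLE = """\
node3   ia32-compute,debug
node4   ia32-compute
node21  ia64-compute,debug
node22  ia64-compute
"""


def parse_nodes(text):
    """Map each attribute to the list of nodes that carry it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        node, attrs = line.split(None, 1)
        for attr in attrs.split(","):
            groups[attr.strip()].append(node)
    return groups


def nodes_with(groups, attr):
    """Return the nodes in the dynamic group named by one attribute."""
    return list(groups.get(attr, []))


groups = parse_nodes(NODE_TABLE)
print(nodes_with(groups, "ia64-compute"))  # ['node21', 'node22']
```

 A dsh-style wrapper could then run a command on every node returned for
 a given attribute, which is the general idea behind the grouping.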

 Brian also mentioned the tools used at ANL, which seemed very similar to
 the ideas being expressed in the new proposals by NCSA. The 'sanity' and
 ? (forgot name) tools are used after a node builds to check whether the
 node's definition/configuration has changed and, if so, add/del to make
 it match, and then check whether the changes worked (i.e., are sane).
 This includes the addition of RPMS, configuration files or other scripts
 that might need to be run on the nodes. This is all triggered via init.d
 scripts and settings on a master server.

 There was a question of when the next meeting should be -- since many
 will be at SC'03 in Phoenix in November, that seems like a good time. A
 specific time/date wasn't set, but this looks like a likely time for the
 next meeting.

 The following was the final SC'03 target feature list for OSCAR-3.0.
 These changes will be made to the existing 2.x series, with none of the
 proposed package and installer modifications coming until after this 3.x
 release.


 SC'03 Target / 3.0 Feature List:

  CORE:
   o Web updates - registration area
   o multicast toggle option in wizard
      - requires edit to SIS...Prameth do-able?
        i.e., SystemInstaller enhancement
        (Note, Open to others who'd like to maintain/devel on SIS --
         specifically SystemInstaller; Mike CS & Sean would probably
         be cool)
   o black box dependency engine. (AutoUpdate)
   o add/remove pkg (see Table 1 in arch doc)
      - Terry: GUI panel to feed this
      - John:  Uninstall oscar pkg
               get list of rpms for pkg from ODA, 'rpm -e' those
               and run the following after the actual RPM remove
                 post_server_rpm_uninstall -- runs on server's fs
                 post_client_rpm_uninstall -- runs on client's fs
                                              (image & nodes)
               Keep DB consistent!
   o Design docs for 4.0

   o OSCAR RPM Packages:
      o oscar-sss pkgs
      o HPL pkg (configurator to tweak)
      o LAM Chkpt/Restart
      o Infiniband Package

   o Documentation updated
      - Including design/arch documentation for oscar 4.x
      - Offline methods for getting OPD stuffo, adding to
        /var.../packages, useful for those creating CD

   * Maintain existing pkgs
   * Test


Misc. Meeting Notes:
--------------------

#==========================================================================
# Monday, August 25, 2003
#==========================================================================

[IA64]
 + GUI libs not working under IA64
 + Not working, even w/ rh7.2 & rhas2.1, and rhas2.5beta
    - perl-Qt = works under rhas2.5beta
      rhas 2.95beta, perl-Qt = works, MySQL = not works
      (supporting libs, etc.)
 + Windowing envs
    - perl-Tk = works
    - Qt = not works
    - WXWindows (GTK based)
 + New Clusters @ NCSA will have SuSE, therefore not OSCAR
 + Running "OSCAR gold" on dual I2 cluster (currently)
 + Testing of SuSE 8.1
    - update Qt
 + Group comfortable w/ current status of ia64 and w/ not supporting a
   free distro or breaking back to try to. When the rhas comes along it
   will be sufficient.
 + The current "OSCAR-gold/linux" (whatever called) from NCSA/Progeny is
   not going to be upgraded to be another "free ia64" offering.
 + So this upcoming rhas 3.0 will *be* the supported OSCAR distro for ia64
 + OSCAR 2.2.1 + RedHat 7.2     (Renamed OSCAR-Linux)
   OSCAR 2.3   + RedHat AS 3.0  (coming)

#---------
 + Rough Agenda
 + Introductions

#---------
 + Jeremy's time (PPT slides) & (Figure of whiteboard drawings)

#---------
 LACSI - oct. 27-29
  - ha-oscar accepted
  - others to submit abstracts

#---------
 BrianF's overview of XCAT
  - XCAT is available to all who have purchased IBM hardware
    (got an IBM mouse?)
  - It is currently being used to wrap the SystemImager tool.
  - There were several utilities, including one that redirected serial
    terminals to allow for status and console views during the installs.
    (conserver - used by XCAT to get remote serial consoles)
  - The use of attributes on nodes for grouping, via a simple text-based
    configuration file, was demonstrated to show how grouping is done in
    XCAT.
    e.g.,
        node3   ia32-compute,debug
        node4   ia32-compute
        ...
        node21  ia64-compute,debug
        node22  ia64-compute

    So commands/queries could be issued based on the associated
    attributes, thus creating dynamic groups based on the search results
    for, say, 'ia64-compute':

        root# dsh ia64-compute hostname
        node21
        node22

  - There were other utilities that did common tasks like set up PBS,
    gather host SSH keys for building known_hosts files, etc. These all
    stem from previous efforts when setting up clusters and trying to
    reuse wherever possible. << sounds familiar :) >>
  - Brian mentioned that Egan (developer of XCAT) would likely be
    receptive if OSCAR could get similar functionality -- attributes for
    nodes to group and a general framework to function on these
    groupings -- b/c of the open sourced effort.

#==========================================================================
# Tuesday, August 26, 2003
#==========================================================================

#----
 Terry: Selector/Configurator redesign overview

#----
 Neil: ODA Summary (PPT slides)

#----
 * How to handle SIS SCSI/IDE differences w/ same image?
    - edit autoinstall script for the image, adding info on how to set up
      the disk on IDE nodes and then for the SCSI nodes (keeping other
      img stuffo)
    - Discussion:
       + possibly do in a more intelligent way by changing the "language"
         used to represent the partitioning scheme,
         i.e., DISK1,partition1 instead of /dev/hda1
       + Problems arise when you have both IDE & SCSI in the same box:
         which is DISK1?
    - currently, not dealing w/ bootloader or softw raid info --
      deriving this from autoinstall script. but can edit master
      autoinstall scripts and then run mkautoinstall again, pointing
      toward the appropriate script based on nodes w/ IDE & those w/ SCSI

 * Different interface for installation?
   i.e., not nec. eth0
    - Feature request filed to SystemImager & OSCAR by Jeremy S. (Intel)

 * Finalize: Handling src/tars?
    - Stay w/in the RPM world; if needed use SRPMS and possibly rebuild

 * Web updates

#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

NOTICE:
  Much fun was had by all at Neil's OSCAR B-B-Q Bonfire Bash!
  Thanks for having us.

QUOTES:
  "I would love to run for governor of California." - TGM
  "I would be honored to lose to Gary Coleman." - TGM
  "This is the best OSCAR Meeting Ever!" - TJN

# $Id: Meeting-Notes.txt,v 1.4 2003/09/05 20:54:43 naughtont Exp $