Node Build and Configuration Notebook

new abstraction model

This is a third take on the build/config abstraction. It has three parts:
    
1. Node State Management and Correction
   Data:
   node state - current operational/BCM state of a node (run/kernel/init/build)
   node administrative state - online or offline

   Functions include tracking this data. Corrective actions can also be
   generated based on the current state (e.g., the last transition, into
   kernel, happened 10 minutes ago; it must have hung, so it is time to
   reboot). A sketch of this timeout-based correction appears after this
   list.
  
2. Build System
   Data:
   node configuration info - image, attributes, network configuration,
        etc.; everything required to build software on a node once it
        is identified.

   Functions:
   set node
  
3. Cluster Build Infrastructure
   Data:
   power controller setup - node to power controller mapping
   serial console setup - node to serial console mapping
   identification info - node to identifying characteristic
       - this might potentially contain network interface info
   hardware log - node to open or closed issue mapping (generally hw)
   network device setup - same as serial for purposes of identification?
   BIOS configuration

   This component handles all system wiring setup issues and power
   control. The trickier issue is initial identification; this process
   could use any of a number of criteria to allocate identities to
   nodes. A sketch of this data and one possible identification scheme
   also follows this list.
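Here is a minimal Python sketch of the corrective-action idea in part 1.
The timeout values and the "reboot" action name are assumptions for
illustration; the note does not specify how corrections are carried out.

import time

# Illustrative maximums (seconds) a node should stay in each BCM state
# before we assume it has hung; the real thresholds are not specified.
STATE_TIMEOUTS = {"kernel": 600, "init": 900, "build": 3600}

class NodeStateManager:
    def __init__(self):
        self.state = {}   # node -> (current BCM state, time of last transition)
        self.admin = {}   # node -> administrative state ("online"/"offline")

    def record_transition(self, node, new_state):
        """Track a node's operational/BCM state transition."""
        self.state[node] = (new_state, time.time())

    def corrective_actions(self):
        """Yield (node, action) pairs for nodes stuck in a state too long."""
        now = time.time()
        for node, (state, since) in self.state.items():
            if self.admin.get(node) == "offline":
                continue  # leave administratively offline nodes alone
            limit = STATE_TIMEOUTS.get(state)
            if limit is not None and now - since > limit:
                # e.g. the transition into "kernel" happened 10 minutes
                # ago; it must have hung, so it is time to reboot.
                yield node, "reboot"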
 
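And a rough sketch of the per-node wiring data in part 3, plus one
possible identification criterion (matching an observed MAC address).
The field names and the match-on-MAC rule are assumptions; the note
deliberately leaves the identifying characteristic open.

from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class NodeWiring:
    power_controller: str                  # node -> power controller mapping
    serial_console: str                    # node -> serial console mapping
    identifying_mac: Optional[str] = None  # identification info (one possible characteristic)
    hardware_log: List[str] = field(default_factory=list)  # open/closed hw issues
    bios_profile: str = "default"          # BIOS configuration

def identify(unassigned: Dict[str, NodeWiring], observed_mac: str) -> Optional[str]:
    """Allocate an identity to a node based on an observed characteristic."""
    for name, wiring in unassigned.items():
        if wiring.identifying_mac == observed_mac:
            return name   # this hardware is now identified as 'name'
    return None           # unknown hardware; needs manual attention
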
These can be viewed as a stack; nodes enter the system through the
cluster build infrastructure. Initially, nodes are not identified; the
cluster build infrastructure uses some logic to identify them. One part
of this identification process is notifying the other components that a
new node has been identified and introduced into the cluster. At this
point, the node is ready to be built, and control is functionally handed
off to the build system. The build system then builds the node based on
its internal (and probably very implementation-dependent) notion of node
configuration. When this process is completed, the node is actually
operational, and the node state manager can start to track and affect
the node's operational state.
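
A sketch of that hand-off, with each component reduced to the single
call the others rely on; these function names are invented here for
illustration and are not part of the proposed interface.

def introduce_node(node, infra, build_system, state_manager):
    # 1. The cluster build infrastructure identifies the new node and
    #    notifies the other components that it has been introduced.
    infra.identify(node)
    build_system.notify_new_node(node)
    state_manager.notify_new_node(node)

    # 2. Control is handed off to the build system, which builds the
    #    node from its own (implementation-dependent) configuration.
    build_system.build(node)

    # 3. The node is now operational; the node state manager starts
    #    tracking and correcting its operational state.
    state_manager.record_transition(node, "run")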
    
The guiding goal that motivated this rework is the ability to replace
the middle component, the build system, with a different implementation;
i.e., we could replace the Chiba build system with the OSCAR one. I
think it will be reasonable for the Chiba tools to be implemented to
this interface. If it is also reasonable for OSCAR to be wrapped in the
same interface, this would mean that things like package services would
run as part of the build system, but would service internal protocols
that more accurately reflect their internal data layout, etc. Does the
current interface look like enough of a lowest common denominator of
configurations? The current interface is:
    
Node config data:    
image    
tags    
network interface configuration    
    
Image and tags are opaque identifiers used by the build system; the
network interface configuration may or may not be a good thing to
expose externally.
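
A minimal rendering of that interface as a plain record, assuming it is
passed around as simple data; only the three fields named above appear,
and the example values are invented.

class NodeConfig:
    """The proposed lowest-common-denominator node configuration."""
    def __init__(self, image, tags, net_config):
        self.image = image            # opaque identifier used by the build system
        self.tags = tags              # opaque identifiers used by the build system
        self.net_config = net_config  # may or may not belong in the external interface

# Either build system (Chiba or OSCAR) would interpret image and tags
# internally; the caller treats them as opaque.
cfg = NodeConfig(image="compute-default",
                 tags=["mpi", "batch"],
                 net_config=[{"dev": "eth0", "ip": "10.0.1.42"}])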