# Computational Statistics

## A Method for Estimating Occupational Radiation Dose to Individuals, Using Weekly Dosimetry Data

Mitchell, T. J., Ostrouchov, G., Frome, E. L., and Kerr, G. D.
Radiation Research 147:195-207 (1997). PostScript file (456 KB) or
PDF file (254 KB).

**ABSTRACT**
Statistical analyses of data from epidemiologic studies of workers
exposed to radiation have been based on recorded annual radiation
doses. It is usually assumed that the annual dose values are known
exactly, although it is generally recognized that the data contain
uncertainty due to measurement error and bias. We propose the use
of a probability distribution to describe an individual's dose
during a specific period of time. Statistical methods for
estimating this dose distribution are developed. The methods take
into account the "measurement error" that is produced by the
dosimetry system, and the bias that was introduced by policies that
led to left censoring of small doses as zero. The method is
applied to a sample of dose histories over the period 1945 to 1955
obtained from hard copy dosimetry records at Oak Ridge National
Laboratory (ORNL). The result of this evaluation raises serious
questions about the validity of the historical personnel dosimetry
data that are currently being used in low-dose studies of nuclear
industry workers. In particular, it appears that there was a
systematic underestimation of doses for ORNL workers. This could
result in biased estimates of dose-response coefficients and their
standard errors.
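
The censoring bias described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' estimation method: it ignores measurement error entirely, and the weekly readings, the threshold value, and the function name `recorded_doses` are all hypothetical, chosen only to show how recording small readings as zero pushes an accumulated total downward.

```python
def recorded_doses(weekly_doses, threshold):
    """Replace every reading below `threshold` with zero (a censoring policy)."""
    return [d if d >= threshold else 0 for d in weekly_doses]


# Hypothetical weekly readings in arbitrary units, and a hypothetical threshold.
weekly = [10, 25, 40, 5, 0, 28, 35, 12]
threshold = 30

recorded = recorded_doses(weekly, threshold)

true_total = sum(weekly)              # total if every reading were kept
recorded_total = sum(recorded)        # total after small readings become zero
shortfall = true_total - recorded_total

print(true_total, recorded_total, shortfall)
```

Because censoring can only remove dose from the record, the shortfall is never negative, which is why the resulting error is a systematic bias rather than noise that averages out.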

George Ostrouchov and Edward L. Frome
Comp. Stat. & Data Analysis (1993) Adobe Acrobat(pdf) (221K)

**ABSTRACT**
Large data sets cross-classified according to multiple factors are
available in epidemiology and other disciplines. Their analysis often
calls for finding a small set of best hierarchical models to serve as a
basis for further analysis. This selection can be based on some
well-defined model optimality criterion. Fitting all possible models to
find a best set is usually not feasible for as few as five factors
(7581 possible models). We note that the set of hierarchical models
and their relationships can be represented by a graph and develop an
algorithm to generate it efficiently. We further develop a graph
traversal algorithm that requires fitting of only a fraction of
all models to find exactly a best subset of the models. The algorithm
classifies as many models as possible on the basis of each fit. A data
structure implementing the graph of model nodes keeps track of the
information required by the model search algorithm.
- An algorithm for data-driven **global model search** with respect
  to a model optimality criterion within a large user-defined class of
  models (viewgraph, software).
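
To give a feel for how quickly the space of hierarchical models grows, the following sketch enumerates them for a small number of factors. It is an illustration under stated assumptions, not the graph-generation or traversal algorithm from the paper: a model is represented here as a set of terms (each term a set of factor names) that is closed under taking subsets, and the counting convention includes both the empty set of terms and the constant-only model. The function names are hypothetical. Under this convention the count for five factors agrees with the figure quoted in the abstract.

```python
def hierarchical_models(factors):
    """Return every subset-closed family of terms over `factors`.

    Each term is a frozenset of factor names; each model is a frozenset of
    terms. Assumed convention: the empty family and the constant-only
    family both count as models.
    """
    factors = list(factors)
    if not factors:
        return [frozenset(), frozenset([frozenset()])]
    *rest, last = factors
    smaller = hierarchical_models(rest)
    models = []
    # A model splits into terms without `last` and terms with `last`.
    # Stripping `last` from the second part must leave a model contained
    # in the first part, otherwise some lower-order term would be missing.
    for without_last in smaller:
        for with_last in smaller:
            if with_last <= without_last:
                extended = frozenset(term | {last} for term in with_last)
                models.append(without_last | extended)
    return models


def is_hierarchical(model):
    """Check that dropping any one factor from any term gives another term."""
    return all(term - {f} in model for term in model for f in term)


counts = [len(hierarchical_models("ABCDE"[:k])) for k in range(6)]
print(counts)
```

The nested loop makes the growth visible: every added factor pairs up compatible models from the previous level, so the count rises far faster than the number of terms.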

The computing facilities available for our work in computational
statistics include those of the Mathematical Sciences Section (MSR),
which houses approximately 50 networked high-performance workstations
(Sun SPARCstations 5/10/20 and IBM RISC/6000) as well as parallel
computers (Intel, Sequent). Within MSR, there is an Advanced
Visualization Research Center that has a number of high-performance
visualization workstations and other related high-resolution graphics
equipment.
Network access is also available to supercomputers of the
Center for Computational Sciences,
which is housed in a nearby building. The
primary CCS computers are two Intel Paragon computers, one with 512
processors and the other with 66 processors, and a Kendall Square
KSR-2 with 64 processors.
A high-speed link connects MSR with the University of Tennessee (UT)
Computer Science Department and the Joint Institute of Computational
Science to facilitate the use of computers at both sites. UT computers
include a CM-5 with 32 processors and a MasPar MP-2 with 4,096
processors.
There are a number of other smaller massively parallel processing (MPP)
systems available, plus an extensive infrastructure of Sun and IBM
workstations on a local area network with various compute, network,
file, and print servers.