Goals of the Reduced Data Set Subgroup and Volunteers
J. Brau and D. Strom
The goal is to reduce the data to a manageable volume.
There are two obvious ways:
1. eliminate extraneous channels
2. decimate the retained channels
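As a rough illustration of these two approaches, the sketch below first drops unwanted channels and then decimates the ones that remain. The channel names, the dictionary layout, and the use of block averaging as the decimation method are assumptions made for illustration only; a production decimator would normally apply a proper anti-aliasing filter first.

```python
def eliminate(channels, keep):
    """Return only the channels whose names are in `keep`."""
    return {name: samples for name, samples in channels.items() if name in keep}


def decimate(samples, factor):
    """Reduce the sample count by `factor` by averaging consecutive blocks.

    Trailing samples that do not fill a complete block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    n_blocks = len(samples) // factor
    return [
        sum(samples[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


def reduce_data(channels, keep, factor):
    """Eliminate extraneous channels, then decimate the retained ones."""
    return {
        name: decimate(samples, factor)
        for name, samples in eliminate(channels, keep).items()
    }


# Hypothetical channel names and values, for illustration only.
full = {
    "SEIS_X": [1.0, 3.0, 5.0, 7.0],
    "WEATHER_TEMP": [20.0, 20.0, 22.0, 22.0],
    "UNUSED": [0.0, 0.0, 0.0, 0.0],
}
reduced = reduce_data(full, keep={"SEIS_X", "WEATHER_TEMP"}, factor=2)
```

Here the volume shrinks twice: once because one of three channels is dropped, and again because each retained channel keeps half as many samples.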
The word "reduction" is ambiguous. It can mean:
1. to make shorter
2. to transform the data (as in "data reduction")
The task of this subgroup focuses primarily on 1., although some
of 2. may come in naturally.
Primary task:
Define a data set which is significantly smaller than the
full data set, is dynamic in content, and contains a nearly
complete characterization of the detector and of the signals
we are looking for.
Proposal:
Work with three types of data sets:
1. The full data set
2. a "standard" reduced data set
3. "designer" reduced data sets
The "standard" reduced data set will strive to be everything
to everyone.
It will include a complete set of the interesting channels, in a compressed format.
It will be dynamic, so that when it is found to be deficient it can be revised.
It will be designated "standard" by a designated LSC authority.
It is in the light-weight data format.
A "designer" reduced data set is created by a user who wants to
make an alternative reduced data set.
This allows one to test the completeness of the "standard" set.
It is in the light-weight data format.
When the new elements of the "designer" set are found to be
worthy of promotion, they may be added to the "standard"
reduced data set.
It could also be used for short-term purposes:
request two channels with no decimation for a subtle correlation study
record a set of channels with one arm removed from the IFO via gross
misalignment, to study the behavior of the other arm
request the full data set for a window triggered by a satisfied set
of conditions
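The last request, a window of full data triggered by a set of conditions, could be sketched as follows. The function name, the single-channel condition, and the fixed pre- and post-trigger padding are all hypothetical; a real trigger might combine conditions across many channels.

```python
def triggered_windows(samples, condition, pre, post):
    """Return (start, stop) index ranges around samples satisfying `condition`.

    Each trigger keeps `pre` samples before it and `post` samples after it.
    Overlapping or adjacent ranges are merged. `stop` is exclusive.
    """
    windows = []
    for i, value in enumerate(samples):
        if not condition(value):
            continue
        start = max(0, i - pre)
        stop = min(len(samples), i + post + 1)
        if windows and start <= windows[-1][1]:
            # Extend the previous window instead of starting a new one.
            windows[-1] = (windows[-1][0], max(windows[-1][1], stop))
        else:
            windows.append((start, stop))
    return windows


# Hypothetical monitor channel; the trigger condition is "value exceeds 5".
monitor = [0, 1, 0, 9, 0, 0, 0, 0, 8, 7, 0]
windows = triggered_windows(monitor, lambda v: v > 5, pre=1, post=1)
```

The returned index ranges would then be used to copy the corresponding stretches of the full data set, leaving everything outside the windows in reduced form only.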
We have in mind creating these reduced data sets by "eliminating"
or "decimating" each of the channels of the full data set in a prescribed
way, which can be channel dependent, with each channel reduced following
one of a set of flexible, yet specific, algorithms.
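One way to picture a channel-dependent prescription is a table that maps each channel name to a reduction rule, with elimination as the default for unlisted channels. This is only a sketch: the rule names, the channel names, and the decision to decimate by picking every n-th sample (with no filtering) are assumptions, not part of the proposal.

```python
def keep_all(samples):
    """Rule: retain the channel unchanged."""
    return list(samples)


def every_nth(factor):
    """Rule factory: keep every `factor`-th sample (no anti-alias filtering)."""
    def rule(samples):
        return list(samples[::factor])
    return rule


def drop(samples):
    """Rule: eliminate the channel entirely."""
    return None


def apply_rules(channels, rules, default=drop):
    """Reduce each channel with its own rule; unlisted channels use `default`."""
    reduced = {}
    for name, samples in channels.items():
        result = rules.get(name, default)(samples)
        if result is not None:
            reduced[name] = result
    return reduced


# Hypothetical channels and prescription, for illustration only.
channels = {"A": [1, 2, 3, 4, 5, 6], "B": [1, 2, 3], "C": [9, 9]}
rules = {"A": every_nth(3), "B": keep_all}
reduced = apply_rules(channels, rules)
```

Under this arrangement, a "designer" data set differs from the "standard" one only in its rule table, so revising either set means editing the table and not the reduction code.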
A few specific tasks have been planned and are getting under way,
so that real work on this effort can begin.
(I) The first priority is to produce a prototype system based
on the commissioned channels at Hanford. For example, weather
data and seismometer data are currently being written in
Frame Format, along with placeholders for all of the LIGO
channels which are not commissioned. The prototype software
would produce a reduced data set in Frame Format consisting
of only the working channels. Tapes of this data could be produced
for time periods of days, and checks on the data could then
be made. This would allow us to gain experience with the data
and help define the specifications needed for (II).
(II) In parallel to the prototype test, the user interface
for the reduced data sets would be specified. The interface
should allow the user to select an arbitrary subset of
channels with arbitrary transformations (e.g. mean, rms only;
peak values; within limits; decimation; etc.) to be written
to the reduced data stream.
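A user request of this kind might look like the sketch below, in which each selected channel is paired with the transformations to be written out. The request layout, the channel names, and the reading of "within limits" as a yes/no flag over a stretch of data are assumptions for illustration; the actual interface remains to be specified.

```python
import math


def mean(samples):
    return sum(samples) / len(samples)


def rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def peak(samples):
    """Largest absolute value in the stretch of data."""
    return max(abs(x) for x in samples)


def within_limits(low, high):
    """Build a check that is True when every sample lies in [low, high]."""
    def check(samples):
        return all(low <= x <= high for x in samples)
    return check


def summarize(channels, request):
    """Apply each requested transformation to its channel.

    `request` maps a channel name to a {label: function} dictionary.
    """
    return {
        name: {label: fn(channels[name]) for label, fn in transforms.items()}
        for name, transforms in request.items()
    }


# Hypothetical channels and request, for illustration only.
channels = {
    "SEIS_X": [3.0, -3.0, 3.0, -3.0],
    "WEATHER_TEMP": [20.0, 21.0, 19.0],
    "NOT_REQUESTED": [0.0, 0.0],
}
request = {
    "SEIS_X": {"mean": mean, "rms": rms},
    "WEATHER_TEMP": {"peak": peak, "in_range": within_limits(-40.0, 50.0)},
}
summary = summarize(channels, request)
```

Channels that do not appear in the request are simply omitted from the output, which covers the "arbitrary subset" part of the specification.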
(III) Using the tools developed by the Transient Analysis Subgroup,
as well as any other transformations which are needed,
the software needed for part of the specification given in
(II) could be developed.
(IV) Once the first version of this system is complete it
would replace the prototype developed for (I).
People expressing an interest in working with this subgroup:
Jim Brau, Oregon
S. Klimenko, Florida
W. Majid, Caltech
Evan Mauceli, Oregon
David McClelland, Australian Nat. Univ.
G. Mitselmakher, Florida
Fred Raab, Hanford
Robert Schofield, Oregon
Susan Scott, Australian Nat. Univ.
David Strom, Oregon
Rai Weiss, MIT
Bernard Whiting, Australian Nat. Univ.
Natalia Zotov, La. Tech.