A Linear Combination Software Reliability Modeling Tool with a Graphically-Oriented User Interface

Michael R. Lyu
Electrical and Computer Engineering Dep't.
University of Iowa
Iowa City, IA 52242
e-mail: lyu@hitchcock.eng.uiowa.edu

Allen P. Nikora
Jet Propulsion Laboratory
California Institute of Technology
4800 Oak Grove Drive
Pasadena, CA 91109-8099
e-mail: bignuke@spa1.jpl.nasa.gov

Thomas M. Antczak
Jet Propulsion Laboratory
California Institute of Technology
4800 Oak Grove Drive
Pasadena, CA 91109-8099

Abstract

In previous papers, we have shown that forming linear combinations of model results tends to yield more accurate predictions of software reliability. Using linear combinations also simplifies the software reliability practitioner's task of deciding which model or models to apply to a particular development effort. However, no commercially available tool currently permits such combinations to be formed within the environment provided by the tool. In addition, most software reliability modeling tools do not take advantage of the high-resolution displays available today: performing actions within the tool may be awkward, and the output may be understandable only to a specialist. We propose a software reliability modeling tool that allows users to formulate linear combination models, that can be operated by non-specialists, and that produces results in a form understandable to software development and management personnel.

1. Introduction

Over the past twenty years, many software reliability models have appeared in the literature [Littlewood et al. 86]. Many of these models have been shown to apply to a sufficiently large number of failure data sets that development efforts can have some degree of confidence in using one or more of them. Techniques for recalibrating models [Littlewood 90] and for combining model results in a linear fashion [Lyu and Nikora] have been developed that appear to yield more accurate predictions than the use of a single model. However, these models have not been used as widely as one might expect. A principal factor is that it does not seem possible to determine a priori which model or models will be best suited to a particular development effort. Another difficulty has been the lack of modeling tools that are easy for the non-specialist to use. For instance, many of the tools currently available were initially developed prior to the widespread availability of high-resolution displays, and therefore employ character-oriented user interfaces [SMERFS, SRMP, ITT Tool]. This can result in terse and cryptic command sequences, making it difficult for non-specialists or casual users to perform modeling actions with the tool. Because the results are displayed in a character-oriented fashion, they tend to be expressed in ways that are not easily understood by non-specialists (e.g. model parameter values, or tabular displays of interfailure times rather than failure rate curves). Considering the schedule pressures under which software developers and managers frequently operate, there is little incentive to learn how to operate a complicated new tool. Finally, the tools available today do not allow users to form linear combinations of model results within the tool environment.
In earlier papers, we have shown that linear combinations of individual models can yield more accurate reliability predictions than the individual models themselves. To form linear combinations with current tools, the tool must be run several times to obtain the results of the desired component models, and those results must then be combined in an application separate from the tool. This consumes more time than would be required if linear combinations could be formed within the tool environment. In this paper we propose an architecture for a software reliability modeling tool that:

1. Supports the formation of linear combinations of model results within the tool environment.
2. Allows non-specialists to operate the tool and easily interpret the model results.

We refer to this tool as a Computer-Aided Software Reliability Estimation (CASRE) tool.

2. User Analysis

In developing the user interface for CASRE, it was necessary to identify the types of users that would make use of the tool. The following six types of users were identified:

1. Project managers
2. Line managers
3. Software development staff (system, software, and test engineers)
4. Software support staff (configuration management and product assurance personnel)
5. Consultants
6. Researchers

For each of these user categories, we describe below their role in the software reliability measurement task, and further classify them according to schemes suggested in [Sutcliffe and Schneiderman], using the following categories and value ranges:

User Knowledge
- task: knowledge of software reliability measurement techniques, rated as novice, skilled, or expert
- computer: knowledge in the use of computers to accomplish the task, rated as novice, skilled, or expert
- syntax: knowledge of the syntax of actions required to accomplish the task, rated as novice, skilled, or expert

Frequency: how often the user is involved in the software reliability measurement task - hourly, daily, weekly, monthly, or intermittent

Discretion: rated as compulsory or optional

Workload: proportion of time estimated to be dedicated to software reliability measurement - rated as low, medium, or high

Interaction: data entry, low-level functions (e.g. synthesis of new combination models), high-level functions (e.g. execution of one or more pre-specified models), all functions, or uses output only

2.1. Project Managers

Project Managers are typically former engineers who have made the transition to management. As such, they are familiar with basic techniques for interpreting statistical information, but may not be familiar with the details of statistical modeling. These individuals typically receive reports generated by the support staff and use them as input to their decision-making process.

Knowledge - task: skilled-; computer: skilled-; syntax: novice. Frequency: monthly. Discretion: optional. Workload: low. Interaction: receives hardcopy reports; rarely interacts directly with the tool.

2.2. Line Managers

Line Managers are usually also former engineers who have made the transition to management. As with Project Managers, these individuals are familiar with basic techniques for interpreting statistical information. Line Managers receive reports from their support staff and use them as input to their decision process. Since Line Managers are usually closer to the actual development effort, they tend to request reports more frequently than Project Managers. They may also use some of the basic tool capabilities (e.g. running pre-specified models, but not creating new ones).

Knowledge - task: skilled-; computer: skilled-; syntax: novice. Frequency: biweekly. Discretion: optional. Workload: low. Interaction: receives hardcopy reports; occasional use of the tool's high-level functions.
2.3. Development Staff

Users of software reliability measurement techniques within the development organization include system engineers, software engineers, programmers, and test engineers. These individuals typically have degrees in technical disciplines, extensive software development experience, and additional training in the methods and tools that apply to their assignments. They use modern software development tools on a regular basis. They are familiar with the basics of probability theory and statistics, and may have advanced training in statistical modeling techniques. Currently, however, they have rarely had training in software reliability theory, methods, or tools.

Knowledge - task: skilled-; computer: expert; syntax: expert. Frequency: weekly. Discretion: compulsory, subject to Project or Line Management policy. Workload: low+. Interaction: uses the tool's high-level functions.

2.4. Support Staff

Users of software reliability measurement techniques within the support staff include configuration management specialists and quality assurance personnel. These individuals include both clerical staff and technical personnel. Most have extensive experience in configuration management and quality assurance activities across a wide range of projects. Consequently, some support staff members have training or experience with software reliability measurement techniques at various levels.

Knowledge - task: novice+; computer: skilled; syntax: skilled. Frequency: weekly. Discretion: compulsory, subject to Project and Line Management policy. Workload: low+. Interaction: primarily high-level functions; some low-level functions.

2.5. Consultants

A software reliability consultant typically has an advanced degree in a technical discipline, an extensive background in all aspects of software reliability measurement, and significant software development experience. This individual plays a key role in introducing software reliability measurement techniques into a project at all levels. This includes assisting Project and Line Managers in setting software reliability goals and interpreting results, and assisting the development and support staffs in selecting and using models and support tools. Consultants may also work with researchers to transfer academic findings to specific application domains.

Knowledge - task: expert-; computer: expert; syntax: expert. Frequency: intermittent. Discretion: optional. Workload: high. Interaction: all functions.

2.6. Researchers

Researchers are typically members of a university faculty who develop or refine reliability models. Researchers may work with consultants in transferring knowledge from the academic environment to specific application domains.

Knowledge - task: expert; computer: skilled; syntax: expert-. Frequency: daily. Discretion: optional. Workload: high. Interaction: all functions.

2.7. User Analysis Summary and Recommendations

Table 1 summarizes the user analysis given above.
User         Task      Computer  Syntax   Frequency     Discretion  Workload  Interaction
-----------  --------  --------  -------  ------------  ----------  --------  -----------------------------------------
Proj Mgr     skilled-  skilled-  novice   monthly       optional    low       hardcopy reports
Line Mgr     skilled-  skilled-  novice   bi-weekly     optional    low       hardcopy; occasional high-level functions
Dev Staff    skilled-  expert    expert   weekly        compulsory  low+      high-level functions
Supp. Staff  novice+   skilled   skilled  weekly        compulsory  low+      high-level functions; some low-level
Consultant   expert-   expert    expert   intermittent  optional    high      high- and low-level functions
Researcher   expert    skilled   expert   daily         optional    high      high- and low-level functions

Table 1 - Summary User Profiles (Task, Computer, and Syntax are the three user-knowledge ratings)

We see from the table that the reliability measurement task is performed within a software development effort on, at best, a weekly basis (discounting the time that may have been spent with a consultant in setting up a reliability measurement program). Also, some of the users performing the task most frequently have the lowest level of reliability measurement knowledge. Given these factors, low learning time and good retention over time were the primary concerns in designing the CASRE interface. These findings suggest that a menu-oriented or direct-manipulation style of interaction, or perhaps a combination of the two, is appropriate. While important, good speed of performance was not deemed as critical as the other two goals. When running a complicated model, such as the Littlewood-Verrall model, on a set of failure data, it is to be expected that results will not be immediately available. We therefore specified the goal that the throughput of the modeling section be at least comparable to that of the more popular tools currently in use.

3. The CASRE Tool - High-Level Structure and Functionality

In this section we describe the high-level architecture and basic functionality of the CASRE tool. To implement the recommendations resulting from the user analysis, we plan to implement CASRE on top of a windowing system (e.g. X Windows/MOTIF, DOS Windows 3.0). Figure 1 shows the proposed high-level architecture for CASRE, whose major functional areas are:

- Data Modification
- Failure Data Analysis
- Modeling and Measurement
- Modeling/Measurement Results Display

Figure 1: High-Level Architecture for CASRE

Much of CASRE's functionality is available in current software reliability tools [SMERFS, SRMP]. However, a feature unique to CASRE allows users to combine the results of several models in addition to executing a single model. Feedback from the Model Evaluation block assists users in identifying the model or combination of models best suited to the failure data being analyzed. Moreover, the I/O facilities, the user interface, and the measurement procedures are greatly enhanced in this tool.

3.1. Data Modification

CASRE allows users to create new failure data files, modify existing files, and perform global operations on files.

Editing

CASRE allows users to create or alter failure history data files. A simplified spreadsheet-like user interface allows users to enter times between failures, or test interval lengths and failure counts, from the keyboard. Users are also allowed to invoke a preferred editor (e.g. emacs or vi).

3.2. Smoothing

Since input data to the models is often fairly noisy, the following smoothing techniques are proposed:

- Sliding rectangular window
- Hann window
- General polynomial fit
- Cardinal spline
- Specific cubic-polynomial fits (e.g. B-spline, Bezier curve)

Users select smoothing techniques appropriate to the failure data being analyzed. The smoothed input data can be plotted, used as input to a reliability model, or written out to a new file for later use. Summary statistics for the smoothed data can also be displayed (see "Failure Data Analysis" below).
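To make the first two of these techniques concrete, the sketch below applies a sliding rectangular window and a Hann window to failure counts. It is a minimal illustration only: the data layout (a flat list of failure counts per test interval) and the function names are assumptions, not CASRE's actual interfaces; the Hann coefficients follow the standard raised-cosine definition.

```python
# Sketch of two of the proposed smoothing filters: a sliding
# rectangular window (moving average) and a Hann window.
import math

def rectangular_smooth(data, width=5):
    """Replace each point with the mean of a window centered on it."""
    half = width // 2
    out = []
    for i in range(len(data)):
        window = data[max(0, i - half):i + half + 1]
        out.append(sum(window) / float(len(window)))
    return out

def hann_smooth(data, width=5):
    """Weighted moving average using Hann (raised-cosine) weights:
    w(n) = 0.5 * (1 - cos(2*pi*n / (width - 1))), n = 0..width-1."""
    half = width // 2
    weights = [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (width - 1)))
               for n in range(width)]
    out = []
    for i in range(len(data)):
        num = den = 0.0
        for n in range(width):
            j = i - half + n          # data index under weight n
            if 0 <= j < len(data):    # truncate the window at the edges
                num += weights[n] * data[j]
                den += weights[n]
        out.append(num / den)
    return out

# Illustrative failure counts per test interval:
failures_per_interval = [3, 7, 2, 9, 4, 6, 1, 5]
print(rectangular_smooth(failures_per_interval))
print(hann_smooth(failures_per_interval))
```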
3.3. Data Transformation

In some situations, logarithmic, exponential, or linear transformations of the failure data produce better or more understandable results. The following operations, currently available in some tools, allow users to transform an entire set of failure data in this manner (x(i) represents a failure data item; a and b are user-selectable scale factors):

- log(a * x(i) + b)
- exp(a * x(i) + b)
- x(i) ** a
- x(i) + a
- x(i) * a

User-specified transformations might also be allowed. As with smoothing, users select a specific transformation and are able to manipulate transformed data as they would smoothed data.

3.4. Failure Data Analysis

The "Summary Statistics" block in Figure 1 allows users to display the failure data's summary statistics, including the mean and median of the failure data, the 25% and 75% hinge points, skewness, and kurtosis [Hogg and Craig].

3.5. Modeling and Measurement

Figure 1 shows two modeling functions. The "Models" block executes single software reliability models on a set of failure data. The "Model Combination" block allows users to execute several models on the failure data and combine their results. We include this capability because our experience in combining the results of more than one model indicates that such "combination models" may provide more accurate reliability predictions than single models. The block labeled "Model Evaluation" allows users to determine the applicability of a model to a set of failure data.

Single Model Execution

Based on our experience in applying software reliability models, we include the following models in CASRE: BJM, GO, JM, KL, LM, LNHPP, LV, MO, PM, SM, and YM. The models should be implemented to allow input in the form of either interfailure times or failure frequencies. CASRE allows users to choose the parameter estimation method (maximum likelihood, least squares, or method of moments). Model outputs include:

- Current estimates of the failure rate/interfailure time
- Current estimates of reliability
- Model parameter values, including high and low parameter values for a user-selectable confidence bound
- Current values of the pdf and cdf
- The probability integral transform u_i [Littlewood 1986 IEEE]
- The normalized logarithmic transform of u_i, denoted y_i [Littlewood 1986 IEEE]

Users can display these quantities on-screen or write them to disk.
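As an illustration of single-model execution under maximum-likelihood estimation, the sketch below fits the Jelinski-Moranda (JM) model, one of the models listed above, to a set of interfailure times. It follows the standard JM likelihood equations; the function name and data layout are illustrative assumptions, and the implemented tool is expected to rely on existing library routines instead (as discussed later).

```python
# Sketch of maximum-likelihood fitting for the Jelinski-Moranda (JM)
# model, in which the failure rate after the (i-1)-th failure is
# phi * (N - i + 1) for N initial faults.

def jm_fit(times, n_hi=1.0e6, iters=100):
    """Return (N, phi), or None if the data admit no finite estimate."""
    n = len(times)
    total = sum(times)                                  # T = sum of t_i
    weighted = sum(i * t for i, t in enumerate(times))  # sum (i-1)*t_i

    def score(cap):
        # Profile likelihood equation in N alone:
        #   sum_{i=1..n} 1/(N - i + 1) = n*T / (N*T - sum (i-1)*t_i)
        lhs = sum(1.0 / (cap - i) for i in range(n))
        return lhs - n * total / (cap * total - weighted)

    lo = float(n)
    if score(lo) <= 0.0:
        n_hat = lo               # likelihood peaks at the boundary N = n
    elif score(n_hi) >= 0.0:
        return None              # no finite maximum: no growth visible
    else:
        hi = n_hi
        for _ in range(iters):   # bisect for the root of the score
            mid = 0.5 * (lo + hi)
            if score(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        n_hat = 0.5 * (lo + hi)
    phi = n / (n_hat * total - weighted)
    return n_hat, phi

# Interfailure times showing mild reliability growth:
print(jm_fit([10.0, 8.0, 12.0, 14.0]))
```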
3.6. Combination Models

CASRE allows users to combine the results of several models according to the Equally-Weighted Linear Combination (ELC), Median-Oriented Linear Combination (MLC), Unequally-Weighted Linear Combination (ULC), or Dynamically-Weighted Linear Combination (DLC) schemes described in earlier papers. Users may also be allowed to define their own weighting schemes. The resulting combination models can themselves be used as component models to form further combination models.
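The sketch below illustrates the four schemes for a single prediction step, assuming each component model contributes a point prediction such as the predicted next interfailure time. The ELC, MLC, and ULC computations follow directly from the scheme names; for DLC we assume, consistent with the model-evaluation feedback described in the next subsection, that the dynamic weights are proportional to each component's recent prequential likelihood. Names and values are illustrative only.

```python
# Sketch of the four linear combination schemes for one prediction step.
# 'predictions' holds each component model's point prediction.

def elc(predictions):
    """Equally-Weighted Linear Combination: simple arithmetic mean."""
    return sum(predictions) / float(len(predictions))

def mlc(predictions):
    """Median-Oriented Linear Combination: the middle prediction."""
    s = sorted(predictions)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])

def ulc(predictions, weights):
    """Unequally-Weighted Linear Combination with fixed user weights."""
    total = float(sum(weights))
    return sum(w * p for w, p in zip(weights, predictions)) / total

def dlc(predictions, recent_pl):
    """Dynamically-Weighted Linear Combination: weights assumed
    proportional to each component's recent prequential likelihood,
    so they change as the models' relative performance changes."""
    return ulc(predictions, recent_pl)

preds = [105.0, 92.0, 131.0]          # three component models
print(elc(preds))                     # 109.33...
print(mlc(preds))                     # 105.0
print(ulc(preds, [1.0, 2.0, 1.0]))    # 105.0
print(dlc(preds, [0.2, 0.5, 0.3]))    # 106.3
```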
3.7. Model Evaluation

CASRE includes the following statistical methods to help users determine the applicability of a model (including "combination models") to a specific failure data set:

- Computation of the prequential likelihood (PL) function (the "Accuracy" criterion)
- Determination of the probability integral transform u_i, plotted as the u-plot (the "Bias" criterion)
- Computation of y_i to produce the y-plot (the "Trend" criterion)
- Noisiness of model predictions (the "Noise" criterion)
- The Akaike Information Criterion (AIC) [Akaike IEEE], similar in concept to prequential likelihood, could also be implemented

The model evaluation function also computes goodness-of-fit measures (e.g. the Chi-Square test). The PL and AIC outputs are used as input to "Model Combination" to determine the relative contributions of individual models when the user has specified a combination model.
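The sketch below shows how the "Bias" and "Accuracy" criteria might be computed, assuming each one-step-ahead prediction has already been reduced to the probability integral transform u_i = F_i(t_i) and the predictive density value f_i(t_i). The u-plot's Kolmogorov distance and the (log) prequential likelihood follow the definitions in [Littlewood 1986 IEEE]; the function names and numbers are illustrative.

```python
# Sketch of the "Bias" and "Accuracy" computations from u_i and f_i(t_i).
import math

def u_plot_distance(us):
    """Kolmogorov distance between the empirical cdf of the u_i and
    the line of unit slope; a large value indicates a biased model."""
    us = sorted(us)
    n = len(us)
    dist = 0.0
    for i, u in enumerate(us):
        # compare u with the empirical cdf just before and at its step
        dist = max(dist, abs(u - i / float(n)), abs(u - (i + 1) / float(n)))
    return dist

def log_prequential_likelihood(densities):
    """Log of the product of one-step-ahead predictive densities at
    the observed times; larger values mean more accurate predictions."""
    return sum(math.log(f) for f in densities)

# Illustrative values only:
us = [0.31, 0.72, 0.55, 0.90, 0.12]
fs = [0.0040, 0.0031, 0.0012, 0.0057, 0.0044]
print(u_plot_distance(us))
print(log_prequential_likelihood(fs))
```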
3.8. Display of Results

CASRE graphically displays model results in the following forms:

- Interfailure times/failure frequencies, actual and estimated
- Cumulative failures, actual and estimated
- Reliability growth, actual and estimated

Actual and estimated quantities are available on the same plot, and plots include user-specified confidence limits. Users are able to control the range of data to be plotted as well as the usual cosmetic aspects of the plot (e.g. X and Y scaling, titles). In a windowing environment, multiple plots could be displayed simultaneously. CASRE allows users to save plots displayed on-screen as a disk file or to print them. One public-domain tool, SMERFS version 4 [SMERFS], can write the data used to produce a plot to a file that can be imported by a spreadsheet, a DBMS, or a statistics package for further analysis; CASRE includes this capability. The plotting function also produces u-plots and y-plots from Model Evaluation's u_i and y_i outputs. These plots indicate the degree and direction of model bias and the way in which the bias changes over time.

4. Application Procedure

Figures 2-8 show a series of screen images of the described CASRE tool, using simulated failure data. They illustrate how a project's software reliability measurement activity, including extensive use of the linear combination approach elaborated in this paper, can be systematically investigated and engineered to whatever extent the user wishes.

Screen 1 - opening a failure data file

The screen is shown in Figure 2. To choose a set of failure data on which a reliability model will be run, the user selects the "File" menu with the mouse. After selecting the "Open" option in the File menu, a dialogue box for selecting a file appears on the screen. The current directory appears in the editable text window at the top of the dialogue box, and the failure history files in that directory are listed in the scrolling text window. The user selects a file by highlighting its name (scrolling the file name window if necessary) and then pressing the "Open" button. To change the current directory, the user enters the name of the new directory in the "Current Directory" window and presses the "Change Directory" button. Pressing the "Cancel" button removes the dialogue box from the screen.

Screen 2 - preliminary failure data analysis

The screen is shown in Figure 3. After a failure history file is opened, its contents are displayed in tabular and graphic forms. The tabular representation resembles a spreadsheet, and the user can perform similar types of operations (e.g. selecting a range of data, deleting one or more rows of data). All of the fields can be changed by the user except the "Interval Number" field (or the "Error Number" field if the data is interfailure times). In this example, the selected data set is in the form of test interval lengths and numbers of failures per test interval. The user can scroll up and down through this tabular representation and resize it according to the MOTIF or DOS Windows conventions.

The large graphics window displays the same data as the worksheet. If the failure data set is interfailure times, the initial graphical display is interfailure times; if, as in this example, the failure data set is test interval lengths and failure counts, the initial graphical display is the number of failures per test interval. The display type can be changed by selecting one of the items from the "Display Type" menu associated with the graphics window. The user can move forward and backward through the data set by pressing the right-arrow or left-arrow buttons at the bottom of the graphics window.

Finally, the iconified window at the lower left corner of the screen lists the summary statistics for the data. To open this window, the user clicks on the icon. The following information is then displayed in a separate window:

- Number of observations in the data set
- Type of observations made (interfailure times, or test interval lengths and failure counts)
- Mean value of the observations
- Minimum and maximum values
- Median
- 25% and 75% hinges
- Standard deviation and variance
- Skewness and kurtosis

Screen 3 - failure data selection and editing

The screen is shown in Figure 4. The user will frequently use only a portion of the data set to estimate the current reliability of the software, either because testing methods changed during the testing effort or because different portions of the data set represent failures in different portions of the software. To use only a subset of the selected data set, the user may simply "click and drag" on the tabular representation of the data set to highlight a specific range of observations. The user may also select previously-defined data ranges. To do this, the user chooses the "Select Range" option of the Edit menu, which brings up a dialogue box containing a scrolling text window listing the names of previously-defined data ranges and the points they represent. To select a particular range, the user highlights the name of the range in the scrolling text window and presses the "OK" button. Pressing the "Cancel" button removes the dialogue box and the Edit menu from the screen. Once a range has been selected, all future modeling operations apply only to that range. The selected data range is highlighted in the tabular representation, and the graphics display changes to show only the highlighted data range; all other observations are removed from the graphics display.

Screen 4 - data filtering

The screen is shown in Figure 5. After selecting a data range, the user may wish to transform or smooth the data. Software failure data is frequently very noisy; smoothing the data or otherwise transforming it may improve the modeling results. To do this, the user selects one of the options in the "Filter" menu. There are five transformations which the user may apply to the data, and six types of smoothing. Transformations and smoothing operations may be pipelined - for example, the user could select the "ln(A * X(i) + B)" transformation followed by the B-spline smoothing operation. The number of filters that may be pipelined is limited only by the amount of available memory. The tabular representation of the failure data changes to reflect the filter, as does the graphical display of the data. The type of filter applied to the data is listed at the right-hand edge of the graphics display window. In this example, we have applied a B-spline to the data. Once a series of filters has been applied to the data, the user may remove the effect of the most recent filter by selecting the "Undo" option of the Filter menu. To remove the effect of the entire series of filters, the user selects the "Undo All Filters" option of the Filter menu.

Screen 5 - applying software reliability models

The screen is shown in Figure 6. After the user has opened a file, selected a data range, and done any smoothing or other transformation of the data, a software reliability model can be run on the data. In the Model menu, the user has the choice of 13 individual models or a set of models which combine the results of two or more of the individual models. The user may also choose the method of parameter estimation (maximum likelihood, least squares, or method of moments), the confidence bounds that will be calculated for the selected model, and the interval of time over which predictions of future failure behavior will be made.

Screen 6 - selecting the best model(s)

The screen is shown in Figure 7. There are many models from which to choose in this tool, and the user may not know which model is most appropriate for the data set being analyzed. Using CASRE, the user can request, in effect, "display the results of the individual model which best meets the four prioritized criteria of accuracy (based on prequential likelihood), bias, trend, and noisiness of prediction." To do this, the user first selects the "Individual" option of the Model menu. A submenu then appears listing the 13 individual models as well as a "Choose Best" option. Selecting the "Choose Best" option displays a "Selection Criteria" dialogue box. The user moves the four sliders in this dialogue box back and forth to establish the relative priorities of the four criteria; numerical values of the priorities are displayed in the text boxes on the right side of the dialogue box. Once the priorities have been established, the user presses the "OK" button. CASRE then runs all of the individual models against the data set, first warning the user that this is a time-consuming operation and allowing cancellation. If the user continues, CASRE provides the opportunity to cancel at any time if the user decides that the operation is taking too much time. One plausible way of combining the four prioritized criteria into a single selection is sketched below.
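The paper does not fix how the four prioritized criteria are combined into a single choice, so the following sketch shows one plausible realization: rank the models on each criterion (rank 1 = best, with each criterion's values oriented so that smaller is better, e.g. negative log prequential likelihood for accuracy), then select the model with the smallest priority-weighted rank sum. All names and numbers are illustrative.

```python
# Hypothetical realization of "Choose Best": priority-weighted rank sum.

def choose_best(scores, priorities):
    """scores: {model: {criterion: value}}, smaller value = better.
    priorities: {criterion: weight} taken from the four slider settings."""
    ranks = {model: 0.0 for model in scores}
    for criterion, weight in priorities.items():
        # rank models on this criterion, best (smallest value) first
        ordering = sorted(scores, key=lambda m: scores[m][criterion])
        for rank, model in enumerate(ordering, start=1):
            ranks[model] += weight * rank
    return min(ranks, key=ranks.get)

# Illustrative values only:
scores = {
    "JM": {"accuracy": 410.2, "bias": 0.11, "trend": 0.09, "noise": 2.1},
    "GO": {"accuracy": 408.7, "bias": 0.19, "trend": 0.06, "noise": 2.7},
    "LV": {"accuracy": 412.5, "bias": 0.14, "trend": 0.08, "noise": 1.8},
}
priorities = {"accuracy": 4.0, "bias": 3.0, "trend": 2.0, "noise": 1.0}
print(choose_best(scores, priorities))   # "GO" for these numbers
```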
Screen 7 - displaying final results

The screen is shown in Figure 8. Once a model has been run on the failure data, the results are graphically displayed. Actual and predicted data points are shown, as are confidence bounds. The model is identified in the window's title bar; the percent confidence bounds are given at the right side of the graphics window. This concludes one round of software reliability measurement with CASRE.

5. General Experiences and On-Going Work

A prototype of the CASRE interface was first presented at the 14th Minnowbrook Workshop on Software Engineering. Remarkably, none of the suggestions made there would have required any significant reorganization of the tool. Currently, the Air Force Operational Test and Evaluation Center (AFOTEC) is funding the implementation of this tool for a DOS Windows 3.0 environment. The modeling capability of CASRE will be based on the mathematical library of SMERFS version 4. We decided that this would be the most effective way of accomplishing the task within the allocated resources: rather than writing a new set of modeling routines, it made more sense to use an existing modeling library that had been extensively tested. Implementing the linear combination modeling facility will be a straightforward task, since all that is needed is a control mechanism to sequence through the selected models and assign weights to the results of individual models (a sketch of such a mechanism appears at the end of this section).

Since the Minnowbrook presentation, some changes to the original concept have been made. The most significant change is in the model selection and application area (illustrated in Figures 6 and 7). Recall from the previous section that to execute a model, users would choose a model from a sub-menu of the Model pull-down menu. This would have resulted in a sub-menu for individual models and a set of control panels and sub-menus for the linear combination models. Running more than one model would be a tedious exercise with this type of interaction: users would have to choose one model, wait for it to complete, then choose the next model, and so forth. A discussion with the AFOTEC sponsors revealed that a more sensible model selection and execution mechanism would be a checklist, on which users indicate all of the models, individual or combination, to be executed during a modeling run. Upon completing the checklist, users would select an item on the checklist that starts execution of the models, freeing them to perform other tasks while the models execute. As with other applications involving possibly lengthy computations, users would be given the option to terminate execution of the chosen set of models at any time.

This change has led to modifications of the drawing window in which modeling results are displayed. As originally conceived, this window would display the raw data and the results of only one model. Now that the user will be allowed to execute more than one model at a time, the drawing window will allow users to specify which models' results are displayed, via a checklist similar to that used to specify models for execution. In this "display selection" checklist, however, only the models that have been executed are listed. This facility will allow users to easily compare the outputs, and hence the behavior, of two or more models.
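A minimal sketch of that control mechanism follows: run every model ticked on the checklist, collect the results, and form any requested weighted combinations from them. The model registry and runner signature are assumptions for illustration; in the actual implementation the component models would be SMERFS library routines, and execution would run in the background and be cancellable at any time.

```python
# Sketch of the checklist-driven control mechanism. 'registry' maps a
# model name to a callable that fits the model and returns a point
# prediction; real component models would be SMERFS library routines.

def run_checklist(checked, registry, failure_data):
    """Run every checked model and collect its prediction; a real
    implementation would background these runs and allow cancellation."""
    return {name: registry[name](failure_data) for name in checked}

def combine(results, weights):
    """Form a weighted linear combination of already-computed results."""
    total = float(sum(weights.values()))
    return sum(w * results[name] for name, w in weights.items()) / total

# Stand-in component models (stubs returning fixed predictions):
registry = {"JM": lambda data: 101.0, "GO": lambda data: 95.0}
results = run_checklist(["JM", "GO"], registry, [10.0, 8.0, 12.0])
print(combine(results, {"JM": 0.5, "GO": 0.5}))   # 98.0
```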
6. Conclusions and Future Work

We have proposed a set of linear-combination models for more accurate measurement of software reliability. These models have shown promising results when compared with the traditional single-model approaches. To relieve the tedious work involved in applying these approaches, a CASE tool, called CASRE, is proposed to automate the software reliability measurement task.

For the purposes of model validation and determining tool applicability, we need to obtain enough data to compare software reliability models and predictions across various types of software projects. Some data sets can be found in [Musa Data], [Misra IBM 1983], [Troy IEEE 1985 Models], [Troy IEEE 1986], [Levendel 1989], [Levendel 1990], [Stampfel], [Keller Shuttle], [Zinnel], and [Rapp]. In future investigations, we will apply more data sets to the proposed combination models for the purpose of validating them, and for refining the structure and functionality of the CASRE tool. Prototypes of the CASRE tool will be prepared and refined; potential users will be identified and asked to evaluate the tool based on interaction with the prototype. These evaluations will also be used in refining the structure and functionality of the tool.