You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
885 lines
44 KiB
885 lines
44 KiB
6 years ago
|
|
||
|
|
||
|
Failure propagation models are defined by safety engineers and are
|
||
|
usually obtained through manual assessment of the safety of the
|
||
|
system. This is a complicated task since failures can depend on more
|
||
|
than one element of the system; be the result of the interaction
|
||
|
between many faults; be the consequence of the missed detection of
|
||
|
another fault (e.g. a fault inside an element tasked with detecting
|
||
|
faults); etc. To cope with the complexity of the systems and the
|
||
|
scenarios that need to be analysed, several model-based approaches
|
||
|
have been proposed such as AltaRica~\cite{Altarica2.3,arnold1999altarica}, Figaro~\cite{bouissou05}, etc. each with their
|
||
|
associated tooling.
|
||
|
|
||
|
|
||
|
Our work is concerned with the modelling and analysis of failures
|
||
|
propagation in the presence of time constraints. We concentrate on a
|
||
|
particular safety property, called \emph{loss detection convergence},
|
||
|
meaning that the system applies an appropriate and timely response
|
||
|
to the occurrence of a fault before the failure is
|
||
|
propagated and produces unwanted system behaviours.
|
||
|
%
|
||
|
Similar problems
|
||
|
were addressed in~\cite{thomas2013}, where the authors describe a
|
||
|
process to model Failure Detection Isolation and Reconfiguration architecture (for use on-board satellites) that
|
||
|
requires to take into account failure propagation,
|
||
|
detection, and recovery times. However, these needs are not matched by
|
||
|
an effective way to express or check the safety constraints of the system.
|
||
|
|
||
|
|
||
|
Our contribution is as follows. We define a lightweight extension of
|
||
|
AltaRica, meaning that timing constraints are declared separately from
|
||
|
the behaviour of a system (Sect.~\ref{sec:sect2}). Therefore it is easy to reuse a prior
|
||
|
safety model and to define its temporal behaviour afterwards. We
|
||
|
illustrate our method with an example inspired by safety architectures
|
||
|
found in avionic systems. This example illustrate the impact of time
|
||
|
when reasoning about failure propagation, and we use it to
|
||
|
show that taking into accounts timing constraints---in particular
|
||
|
propagation delays---can help finding new failure modes that cannot be
|
||
|
detected in the untimed models currently in use. In the process, we
|
||
|
define two safety properties: \emph{loss detection}; and its temporal
|
||
|
version, \emph{loss detection convergence}, meaning that a system
|
||
|
applies an appropriate and timely response to the occurrence of a
|
||
|
fault before the failure is propagated and produces unwanted system
|
||
|
behaviours. We show that these two properties, which are of interest
|
||
|
in a much broader context, can be reduced to effective model-checking
|
||
|
problems.
|
||
|
|
||
|
In the next chapter we extend this approach, by applying it to an industrial
|
||
|
satellite study case, and analysing the scalability of the methodology to more
|
||
|
complex system models.
|
||
|
|
||
|
|
||
|
\section{Model-Based Safety Analysis with AltaRica}\label{sec:sect2}
|
||
|
|
||
|
|
||
|
\subsection{AltaRica language and versions}
|
||
|
|
||
|
|
||
|
AltaRica is a high level modelling language dedicated to Safety
|
||
|
Analysis. It has been defined to ease the modelling and analysis of
|
||
|
failure propagation in systems. The goal is to identify the
|
||
|
possible failure modes of the system and, for each mode, the chain of
|
||
|
events that lead to an unwanted situation.
|
||
|
|
||
|
|
||
|
In AltaRica, a system is expressed in terms of variables constrained
|
||
|
by formulas and transitions. Several versions of the language have
|
||
|
been defined (see for
|
||
|
instance~\cite{arn00,prosvirnova2013altarica}).
|
||
|
|
||
|
|
||
|
|
||
|
Three main versions of AltaRica have been designed so far. These three versions differ essentially with
|
||
|
two respects: the way variables are updated after each transition firing and the way hierarchies of
|
||
|
components are managed. In the first version, designed at the Computer Science Laboratory of
|
||
|
University of Bordeaux (LaBRI), variables were updated by solving constraint systems. This model,
|
||
|
although very powerful, was too resource consuming for industrial scale applications. A radical turn
|
||
|
was therefore taken with the design of a data-flow version of the language. In AltaRica 2.0 Dataflow,
|
||
|
variables are updated by propagating values in a fixed order. This order is determined at compile time.
|
||
|
At the time this document is written, most of AltaRica tools rely on this Dataflow version. Ongoing investigation aim
|
||
|
at improving AltaRica Dataflow in several ways, like propagating variable values by determining a fixpoint at execution time,
|
||
|
hence justifying the third version of
|
||
|
the language currently being developped: AltaRica 3.0 or OpenAltaRica\cite{prosvirnova2013altarica}.
|
||
|
|
||
|
\subsection{AltaRica modelling}
|
||
|
In this work, we use the AltaRica 2.0 Dataflow language which is a fragment of other
|
||
|
versions and which is sufficient to analyse the behaviour of computer
|
||
|
based systems. (The approach discussed here could be applied to other
|
||
|
AltaRica dialects.) The models used in this work have been edited and
|
||
|
analysed using Cecilia OCAS, a graphical interactive simulator
|
||
|
developed by Dassault Aviation~\cite{bieber2004safety}.
|
||
|
|
||
|
An AltaRica model is made of interconnected \emph{nodes}. A node can
|
||
|
be essentially viewed as a mode automaton~\cite{rauzy02} extended with
|
||
|
guards and actions on data variables. A node comprises three parts: a
|
||
|
declaration of variables and events, the definition of transitions,
|
||
|
and the definition of assertions. We illustrate these concepts with
|
||
|
the example of a simple node called \FUNC. We give the code (textual
|
||
|
definition) of \FUNC in Listing~\ref{ls:altarica} and a schematic
|
||
|
representation in Fig.~\ref{fig:funct2}. The node \FUNC has one input,
|
||
|
\code{I}, and one output, \code{O}.
|
||
|
|
||
|
\lstinputlisting[language=Altarica,float=bht,captionpos=b,caption=\protect{Example
|
||
|
of AltaRica code for the node \FUNC.},frame=single,label=ls:altarica]{algos/function2.alt}
|
||
|
|
||
|
\begin{figure}[thb]
|
||
|
\centering
|
||
|
\begin{minipage}{0.25\textwidth}
|
||
|
\centering
|
||
|
\includegraphics[width=\textwidth]{altarica-function.pdf}
|
||
|
\end{minipage}
|
||
|
\begin{minipage}{0.7\textwidth}
|
||
|
\centering
|
||
|
\vspace{1em}
|
||
|
\includegraphics[width=0.8\textwidth]{altarica-diagrams.pdf}
|
||
|
\end{minipage}
|
||
|
\caption{Graphical representation of node \FUNC (left) and its
|
||
|
associated failure mode automaton (right).\label{fig:funct2}}\vspace{1em}
|
||
|
\end{figure}
|
||
|
|
||
|
\enlargethispage{\baselineskip}
|
||
|
%
|
||
|
Nodes can have an internal state stored in a set of \emph{state
|
||
|
variables}, declared in a heading called \texttt{state}. In its
|
||
|
nominal state (when \code{S = NOMINAL}), the \FUNC node acts as a
|
||
|
perfect relay: it copies on its output the value provided by its input
|
||
|
(we have \code{O = I}); this is expressed in the \code{assert}
|
||
|
block. On the opposite, its output is set to \LOSS or \ERR when
|
||
|
\code{S} equals \code{LOST} or \code{ERROR}, respectively.
|
||
|
%
|
||
|
The assert directive is used to express constraints on the values of
|
||
|
the input and output variables of a node, also called \emph{flow
|
||
|
variables}, establishing a link between the node and its
|
||
|
environment, i.e. the other interconnected nodes. It distinguishes
|
||
|
between input (\texttt{in}) and output (\texttt{out}) variables. An
|
||
|
assertion defines a rule to update the value of output flows according
|
||
|
to the state of the component and the value of input flows.
|
||
|
|
||
|
The state variables of a node can only change when an
|
||
|
event is triggered. The code of \FUNC declares two events: \code{fail\_err}, that
|
||
|
changes the state from \code{NOMINAL} to \code{ERROR}, and
|
||
|
\code{fail\_loss}, that can transition from any state to
|
||
|
\code{LOST}. This behaviour is the one displayed on the mode
|
||
|
automaton of Fig.~\ref{fig:funct2}. Transitions are listed in the
|
||
|
\texttt{trans} block of the node. Each transition has an (event)
|
||
|
name and a definition of the form
|
||
|
\lstinline[language=Altarica]{g |-evt-> e}, where the guard \code{g}
|
||
|
is a Boolean condition that can refer to state and flow
|
||
|
variables. The event \code{evt} can be triggered when the guard is
|
||
|
satisfied. In this case we apply the effect, \code{e}, that is an
|
||
|
expression that modifies the values of state variables.
|
||
|
|
||
|
Events are useful to model the occurrence of failures or the reaction
|
||
|
to conditions on the flow variables.
|
||
|
% It is possible to indicate
|
||
|
% whether an event is triggered internally by a node---as in the case of
|
||
|
% a reconfiguration---or externally by its environment---as in the case
|
||
|
% of the occurrence of a failure. In the latter case,
|
||
|
We can assign a
|
||
|
law of probability on the occurrence of the failure using the heading
|
||
|
\code{extern}. For instance we could assert that event
|
||
|
\code{fail\_loss} follows an exponential distribution with the
|
||
|
declaration:
|
||
|
%%
|
||
|
{\lstinline[language=Altarica]{extern law (<event fail_loss>)="exp 1e-4";}}
|
||
|
%%
|
||
|
In the next section, we propose a way to enrich this syntax to express
|
||
|
timing constraints on events instead of probability distributions. At
|
||
|
the moment, it is not possible to use stochastic events in addition to
|
||
|
time events.
|
||
|
|
||
|
\enlargethispage{\baselineskip}
|
||
|
|
||
|
In the general case, an AltaRica model is composed of several
|
||
|
interconnected node instances, following a component-based
|
||
|
approach. Global assertions relate the input flows of a component to
|
||
|
the output flows of other components. For the sake of brevity, we do
|
||
|
not describe component synchronisation here and we refer the reader to
|
||
|
\cite{Altarica2.3} for further details. More importantly, a
|
||
|
hierarchical AltaRica model can always be ``flattened'', i.e.
|
||
|
represented by a single node containing all the variables, events,
|
||
|
assertions, and transitions from the composite system. We use this
|
||
|
property in our interpretation of AltaRica in Fiacre.
|
||
|
|
||
|
%%
|
||
|
\subsection{Time AltaRica: Adding Timing Constraints to Events}%%\label{sec:adding-timing-constr}
|
||
|
%%
|
||
|
There already is a limited mechanism for declaring timing constraints
|
||
|
in AltaRica. It relies on the use of external law associated with a
|
||
|
\emph{Dirac distribution}. An event with Dirac$(0)$ law denotes an
|
||
|
instantaneous transition, that should be triggered with the highest
|
||
|
priority. Likewise, an event with Dirac($d$) (where $d$ is a
|
||
|
positive constant) models a transition that should be triggered with a
|
||
|
delay of $d$ units of time.
|
||
|
%
|
||
|
In practice, Dirac laws are rather a way to encode priorities between
|
||
|
events than an actual mean to express duration. Moreover, while Dirac
|
||
|
laws are used during simulation, they are not taken into account by
|
||
|
the other analysis tools. Finally, the use of Dirac laws is not
|
||
|
expressive enough to capture non-deterministic transitions that can
|
||
|
occur within time intervals of the form $[a, b]$, where $a \neq
|
||
|
b$. These constraints are useful to reason about failure propagation
|
||
|
delays with different best and worst case traversal time. For this
|
||
|
reason, we propose to extend event properties with \emph{temporal
|
||
|
laws} of the form:
|
||
|
%%
|
||
|
{\lstinline[language=Altarica]{extern law (evt) = "[a,b]";}}
|
||
|
%%
|
||
|
It is also possible to use open and/or unbounded time intervals, such
|
||
|
as \code{]a,}$\infty$\code{[}.%%\scriptstyle
|
||
|
|
||
|
With such a declaration, the transition
|
||
|
\lstinline[language=Altarica]{g |-evt-> e} can be triggered only if
|
||
|
the guard \code{g} is satisfied for a duration (or \emph{waiting
|
||
|
time}) $\delta$, with $\delta \in [a, b]$. A main difference with
|
||
|
the original semantics of AltaRica is that the timing constraint of an event is
|
||
|
not reinitialised unless its guard is set to false.
|
||
|
%%
|
||
|
% Hence the waiting time of an event decreases when time elapse whereas,
|
||
|
% while with a Dirac($d$) law the waiting time is reset to $d$ each time
|
||
|
% the event fires. Finally,
|
||
|
Moreover, our semantics naturally entails a notion of \emph{urgency},
|
||
|
meaning that it is not possible to miss a deadline: when $\delta$
|
||
|
equals $b$, then either \code{evt} is triggered or another transition
|
||
|
should (instantaneously) change the value of the guard \code{g} to
|
||
|
false.
|
||
|
|
||
|
We can illustrate the use of temporal laws with the following example of a new
|
||
|
node, \PRE; see Listing.~\ref{ls:pre}. This node encodes a buffer that
|
||
|
delays the propagation of its input. When the input changes, event
|
||
|
\code{pre\_read} has to be triggered instantaneously. Then, only after
|
||
|
a duration $\delta \in [a, b]$, the value stored by \PRE (in the state
|
||
|
variable \code{Stored}) is propagated to its output.
|
||
|
%
|
||
|
\lstinputlisting[language=Altarica,float=t,captionpos=b,caption=\protect{Example
|
||
|
of Time AltaRica code: the basic delay.},frame=single,label=ls:pre]{algos/pre.alt}
|
||
|
|
||
|
|
||
|
\section{A Definition of Fiacre Using Examples}\label{sec:sect3}
|
||
|
|
||
|
Fiacre~\cite{berthomieu2008fiacre} is a high-level, formal
|
||
|
specification language designed to represent both the behavioural and
|
||
|
timing aspects of reactive systems. Fiacre programs are stratified in
|
||
|
two main notions: \emph{processes}, which are well-suited for modelling
|
||
|
structured activities (like for example simple state machines), and
|
||
|
\emph{components}, which describes a system as a composition of
|
||
|
processes. In the following, we base our presentation of Fiacre on
|
||
|
code examples used in our interpretation of Time AltaRica.
|
||
|
% \subsection{A Definition of Fiacre by Examples}
|
||
|
% \label{sec:defin-fiacre-exampl}
|
||
|
We give a simple example of Fiacre specification in
|
||
|
Listing~\ref{lst:fiacre}. This code defines a process,
|
||
|
\code{Function}, that simulates the behaviour of the AltaRica node
|
||
|
given in Listing~\ref{ls:altarica}.
|
||
|
|
||
|
\lstinputlisting[language=Fiacre,float=hbt,captionpos=b,caption=\protect{Example
|
||
|
of Fiacre code: type, functions and processes},label=lst:fiacre,frame=single]{algos/function.fcr}
|
||
|
|
||
|
Fiacre is a strongly typed language, meaning that type annotations are
|
||
|
exploited in order to avoid unchecked run-time errors. Our example
|
||
|
defines two enumeration types, \code{FState} and \code{FailureType},
|
||
|
that are the equivalent of the namesake AltaRica domains. We also
|
||
|
define a record type, \code{Flows}, that models the environment of the
|
||
|
node \code{Function}, that is an association from flow variables to
|
||
|
values. Fiacre provides more complex data types, such as arrays,
|
||
|
tagged union or FIFO queues. Fiacre also supports native
|
||
|
\emph{functions} that provide a simple way to compute on values. In
|
||
|
our example, function \code{update} is used to compute the state of
|
||
|
the environment after an event is triggered; that is to model the
|
||
|
effect of assertions in AltaRica. It uses two ternary (conditional)
|
||
|
operators to mimic the \code{case}-expression found in the
|
||
|
\code{assert} heading of Listing~\ref{ls:altarica}.
|
||
|
|
||
|
% Functions are evaluated applicatively (there are no side effects) and
|
||
|
% can be recursively defined.
|
||
|
|
||
|
A Fiacre \emph{process} is defined by a set of parameters and {control
|
||
|
states}, each associated with a set of \emph{complex transitions}
|
||
|
(introduced by the keyword \code{from}). Our example defines a process
|
||
|
with two {shared variables}---symbol \code{\&} denotes variables
|
||
|
passed by reference---that can be updated concurrently by other
|
||
|
processes. In our case, variable \code{S} models the (unique) state
|
||
|
variable of node \FUNC.
|
||
|
|
||
|
Complex transitions are expressions that declares how variables are
|
||
|
updated and which transitions may fire. They are built from constructs
|
||
|
available in imperative programming languages (assignments,
|
||
|
conditionals, sequential composition, \dots); non-deterministic
|
||
|
constructs (such as external choice, with the \code{select} operator);
|
||
|
communication on ports; and jump to a state (with the \code{to} or
|
||
|
\code{loop} operators). In Listing~\ref{lst:fiacre}, the \code{select}
|
||
|
statement defines two possible transitions, separated by the symbol
|
||
|
\code{[]}, that loop back to \code{s0}. Each transition maps exactly
|
||
|
to one of the AltaRica events, \code{fail\_loss} and \code{fail\_err},
|
||
|
that we want to translate. Transitions are triggered
|
||
|
non-deterministically and their effects are atomic (they have an ``all
|
||
|
or nothing'' semantics). A transition can also be guarded by a Boolean
|
||
|
condition, using the operator \code{on} or another conditional
|
||
|
construct.
|
||
|
|
||
|
It is possible to associate a time constraint to a transition using
|
||
|
the operator \code{wait}. Actually, the ability to express directly
|
||
|
timing constraints in programs is a distinguishing feature of
|
||
|
Fiacre. We illustrate this mechanism in the code below, that
|
||
|
corresponds to the interpretation of the node \PRE of
|
||
|
Listing~\ref{ls:pre}. Basically, a transition constrained by a (time)
|
||
|
interval $I$ can be triggered after a time $\delta$, with
|
||
|
$\delta \in I$, only if its guard stayed continuously valid during
|
||
|
this time. It is this behaviour that inspired our choice of semantics
|
||
|
for the temporal law.
|
||
|
|
||
|
The last notion that we need to introduce is \emph{components}.
|
||
|
Components allow the hierarchical composition of processes, in the
|
||
|
same way than AltaRica nodes can be composed into classes in order to
|
||
|
build more complex behaviors.
|
||
|
Thus, a Fiacre component defines a parallel composition of components and/or
|
||
|
processes using statements of the form \code{par} \vars{P}$_0$
|
||
|
$\parallel \dots \parallel$ \vars{P}$_n$ \code{end}. It can also be
|
||
|
used to restrict the visibility of variables and ports and to define
|
||
|
priorities between communication events.
|
||
|
% Priorities are explicitly declared as constraints between
|
||
|
% communication ports, as in \code{go1 > go2}.
|
||
|
We give an example of Fiacre component in Listing~\ref{lst:pre2},
|
||
|
where we define an upgraded version of the delay operator \PRE.
|
||
|
%
|
||
|
\lstinputlisting[language=Fiacre,float=bt,captionpos=b,caption=\protect{Wait statement, components and synchronisation on ports.},label=lst:pre2,frame=single]{algos/pre2.fcr}
|
||
|
|
||
|
% %
|
||
|
% \lstinputlisting[language=Fiacre,float=bt,captionpos=b,caption=\protect{Example
|
||
|
% of Fiacre code: declaring timing
|
||
|
% constraints},label=lst:pre1,frame=single]{algos/pre.fcr}
|
||
|
% %
|
||
|
|
||
|
|
||
|
A problem with the implementation of \PRE is that at most one failure
|
||
|
mode can be delayed at a time. Indeed, if the input of \PRE changes
|
||
|
while the state is \code{Full}, then the updated value is not taken
|
||
|
into account until after event \code{pre\_wait} triggers.
|
||
|
%
|
||
|
It is not possible to implement a version that can delay an unbounded
|
||
|
number of events in a bounded time; this would require an unbounded
|
||
|
amount of memory to store the intermediate values. More fundamentally,
|
||
|
this would give rise to undecidable verification problems (see
|
||
|
e.g.~\cite{bouyer2004updatable}). To fix this problem, we can define a
|
||
|
family of operators, \PRE\code{\_k}, that can delay up-to $k$
|
||
|
simultaneous different inputs. Our implementation relies on a
|
||
|
component that uses three process instances: one instance of
|
||
|
\code{front}, that reads messages from the input (variable \code{I}),
|
||
|
and two instances of \code{delay}, that act as buffers for the values
|
||
|
that need to be delayed. Process \code{front} uses the local ports
|
||
|
\code{go1} and \code{go2} to dispatch values to the buffers.
|
||
|
Component \PRE\code{\_2} is enough to model the use case defined in
|
||
|
Sect.~\ref{sec:example-fail-detect-1}. Indeed, any element in the FDI
|
||
|
system may propagate at most two different status, one from \OK to
|
||
|
\LOSS and then from \LOSS to \ERR.
|
||
|
|
||
|
|
||
|
|
||
|
\section{Example of a Failure Detection and Isolation System}
|
||
|
\label{sec:example-fail-detect-1}
|
||
|
|
||
|
We study the example of a safety critical function that illustrates
|
||
|
standard failure propagation problems. We use this example to show the
|
||
|
adverse effects of temporal failure propagation even in the presence
|
||
|
of % system safety improvements. In our case the addition of
|
||
|
Failure Detection and Isolation (FDI) capabilities.
|
||
|
%
|
||
|
This example is inspired by the avionic functions that provide
|
||
|
parameters for Primary Flight Display (PFD), which is located in the
|
||
|
aircraft cockpit. The system of interest is the computer that acquires
|
||
|
sensors measurements and computes the aircraft \emph{calibrated airspeed} (CAS) parameter.
|
||
|
Airspeed is crucial for pilots: it is taken into
|
||
|
account to adjust aircraft engines thrust and it plays a main role in
|
||
|
the prevention of over speed and stall.
|
||
|
|
||
|
%%
|
||
|
\begin{figure}[thb]
|
||
|
\centering
|
||
|
\includegraphics[width=1\textwidth]{figures/AirSpeed_Computation.jpg}
|
||
|
\caption{Functional and physical views of the airspeed computation function.\label{fig:cockpit}}
|
||
|
\end{figure}
|
||
|
%%
|
||
|
|
||
|
CAS is not directly measured by a dedicated sensor, but is computed as
|
||
|
a function of two auxiliary pressure measurements, the static pressure
|
||
|
(Ps) and total pressure (Pt); that is
|
||
|
$\mathrm{CAS} = f(\mathrm{Pt}, \mathrm{Ps})$.
|
||
|
% \begin{equation*}
|
||
|
% CAS = C_{SO}\sqrt{5 \cdot \left( \left(\frac{Pt - Ps}{P_0} + 1\right)^{2/7} - 1\right)}
|
||
|
% \end{equation*}
|
||
|
% with $C_{SO}$ and $P_0$ two
|
||
|
% constants. % the speed of sound under standard day sea level conditions
|
||
|
These two measurements come from sensors located on the aircraft nose,
|
||
|
a pressure probe and a pitot tube.
|
||
|
% Our proposed functional view shows:
|
||
|
% \begin{itemize}
|
||
|
% \item Two external functions \I{1} and \I{2} that measure static and total pressure. They represent the functions allocated to the sensors.
|
||
|
% \item Three inner functions of the system: \F{1} and \F{2} for sensor measurements acquisition by the onboard computer and \F{3} for airspeed computation.
|
||
|
% \item The PFD functions have not been modelled in the sole purpose of simplification.
|
||
|
% \end{itemize}
|
||
|
%
|
||
|
%%
|
||
|
|
||
|
Our proposed functional view is given in Fig.~\ref{fig:cockpit}. It
|
||
|
consists in two external input functions \I{1} and \I{2} that measure static
|
||
|
and total pressure; and three inner functions of the system, \F{1} and
|
||
|
\F{2} for sensor measurements acquisition by the on-board computer and
|
||
|
\F{3} for airspeed computation. For simplification purposes, the
|
||
|
PFD functions have not been modelled.
|
||
|
|
||
|
Next, we propose a first failure propagation view aiming at
|
||
|
identifying the scenarios leading to an erroneous airspeed computation
|
||
|
and display to the pilot (denoted \ERR). Such failure can only be
|
||
|
detected if a failure detector is implemented, for instance by
|
||
|
comparing the outputs of different functions. Undetected, it could
|
||
|
mislead the pilot and, consequently, lead to an inappropriate engine
|
||
|
thrust setting. We also want to identify the scenarios leading to
|
||
|
the loss of the measure (denoted \LOSS). In such a case, the pilot can
|
||
|
easily assess that the measure is missing or false and consequently
|
||
|
rely upon another measure to control the aircraft (note that such
|
||
|
redundancy is not modelled). For example, airspeed out of
|
||
|
bound---indicating that an airliner has crossed the sonic barrier---is
|
||
|
considered to be of kind \LOSS. It can be understood that scenarios leading to the
|
||
|
loss of the airspeed are less critical than the ones leading erroneous
|
||
|
values.
|
||
|
|
||
|
% We consider two possible kinds of fail status: the ``silent'' failure of
|
||
|
%an element (denoted \ERR); or a communication problem with one of the
|
||
|
% functions (denoted \LOSS).
|
||
|
|
||
|
A failure is typically a loss of connectivity or the
|
||
|
detection of (clearly) erroneous transmitted values, for instance a
|
||
|
variable out of bounds. A \LOSS fail is less critical than an \ERR
|
||
|
since it can be detected immediately and isolated. On the other hand,
|
||
|
an \ERR fail can only be detected if a failure detector is
|
||
|
implemented, for instance by comparing the outputs of different
|
||
|
functions. In the following, we consider that the status
|
||
|
are ordered by their level of severity, that is \ERR $<$ \LOSS $<$\OK.
|
||
|
|
||
|
|
||
|
\subsubsection{Safety model of the architecture without FDI}
|
||
|
%%\label{sec:error-prop-simple}
|
||
|
For the sake of understandability, we choose to study a simplified version of
|
||
|
FDI system. While simple, this system is representative of failure
|
||
|
propagation mechanisms found in actual onboard systems.
|
||
|
%%
|
||
|
|
||
|
We provide an AltaRica model corresponding to the functional view
|
||
|
of the CAS function in Fig.~\ref{fig:example0}. This model, tailored
|
||
|
to study failure propagation, is comprised of:
|
||
|
% ( All the functions are modelled according to the node function
|
||
|
% introduced in the Sect.~\ref{sec:sect2}.
|
||
|
%
|
||
|
two external functions, \I{1} and \I{2}, that have no input (so, in
|
||
|
their nominal state, the output is set to \OK); two inner functions,
|
||
|
\F{1} and \F{2}, which are instances of the node \FUNC described in
|
||
|
Sect.~\ref{sec:sect2}; and a function, \F{3}, that is the composition
|
||
|
of two basic elements: a multiplexer, \MIN, representing the
|
||
|
dependence of the output of \F{3} from its two inputs, and a computing
|
||
|
element \F{3Processing} that represents the computation of the
|
||
|
airspeed. \F{3Processing} is also an instance of node \FUNC.
|
||
|
|
||
|
\begin{figure}[t]
|
||
|
\centering
|
||
|
\begin{tikzpicture}
|
||
|
\node[anchor=south west,inner sep=0] at (0,0) { \includegraphics[width=0.75\textwidth]{figures/Modele0}};
|
||
|
\draw[red!50,ultra thick, dashed] (7,2.85) rectangle (2.6,0.5);
|
||
|
\node[red] at (6.5,2.5) {\texttt{\large F3}};
|
||
|
\end{tikzpicture}
|
||
|
\caption{A simple example of failure propagation.\label{fig:example0}}
|
||
|
\end{figure}
|
||
|
%
|
||
|
In case of single failure scenario, \MIN propagates the failure coming
|
||
|
either from one input or the other. In case of multiple failures,
|
||
|
%when both inputs have identical failure, the faillure propagates;
|
||
|
%For instance,
|
||
|
% both inputs to \LOSS lead to an output of \LOSS.
|
||
|
% But when it receives an erroneous fail then the multiplexer will
|
||
|
% propagate an \ERR. The remaining failure combination drew our
|
||
|
% attention during \F{3} modelling.
|
||
|
%on the other hand,
|
||
|
when different failures propagate, one being \LOSS
|
||
|
and the other being \ERR,---and without appropriate FDI---the system
|
||
|
outcome is uncertain.
|
||
|
%
|
||
|
Solving this uncertainty would require a detailed behavioural model of
|
||
|
the on-board computer and a model for all the possible failure modes,
|
||
|
which is rarely feasible with a sufficient level of confidence, except
|
||
|
for time-tested technology. Given this uncertainty, it is usual to
|
||
|
retain the state with the most critical effect, that is to say:
|
||
|
% in this case, \MIN can be thought of as a simple logic operator that
|
||
|
% computes the minimum of its two inputs, and
|
||
|
the output of \F{3} is \ERR.
|
||
|
|
||
|
|
||
|
% Since a \LOSS can be ``easily'' detected, we want to avoid a situation
|
||
|
% where the whole system propagates \ERR while one of its
|
||
|
%internal element propagates a \LOSS. The rationale is that we should be
|
||
|
% able to isolate (or quarantine) the system when we know for sure that
|
||
|
% one of its element does not reliably respond to commands. This can be
|
||
|
% expressed by the following safety property.
|
||
|
%%
|
||
|
|
||
|
% In the next section, we study the system with the addition of FDI
|
||
|
% capabilities.
|
||
|
|
||
|
Our goal is to prevent the computation of an erroneous airspeed while
|
||
|
one of \F{3} input signals is lost. The rationale is that the system
|
||
|
should be able to passivate automatically the airspeed when it detects
|
||
|
that one of its input signals is not reliable. This behaviour can be
|
||
|
expressed with the following property:
|
||
|
|
||
|
%%
|
||
|
\begin{prop}[Loss Detection and Instantaneous Propagation]\label{prop:1}
|
||
|
A function is \emph{loss detection safe} if, when in nominal mode, it
|
||
|
propagates a \LOSS whenever one of its input nodes propagates a
|
||
|
\LOSS.
|
||
|
\end{prop}
|
||
|
|
||
|
% This is a simple system used to detect non-nominal behaviours
|
||
|
% and to trigger observables in order to isolate the failure mode.
|
||
|
% ({The figures are actual screen capture of models edited with Cecilia
|
||
|
% OCAS.})
|
||
|
% \subsubsection{Analysis}
|
||
|
|
||
|
|
||
|
We can show that our example of Fig.~\ref{fig:example0} does not meet
|
||
|
this property using the \emph{Sequence Generation} tool available in
|
||
|
Cecilia OCAS.
|
||
|
%%
|
||
|
% This system has several sources of unreliability. Both its sources and
|
||
|
% the computing elements (\F{1}, \F{2} and \F{3Processing}) can
|
||
|
% experience spontaneous failures, meaning that their outputs can
|
||
|
% transition instantaneously to \LOSS or \ERR. Errors are irrecoverable;
|
||
|
% once a function becomes faulty, its output will always stay equal to
|
||
|
% \LOSS or \ERR. Likewise, if both inputs are ``nominal'' then the
|
||
|
% output of \MIN is \OK. We assume that the multiplexer cannot undergo
|
||
|
% spontaneous failures.
|
||
|
%%
|
||
|
% This can be tested using a perfect detectors, \FF{3}{Loss}, placed at
|
||
|
% the output of the system.
|
||
|
%%
|
||
|
To this end, we compute the minimal cuts for the target equation
|
||
|
$((\text{\F{1.O.Loss}} \vee \text{\F{2.O.Loss}}) \wedge \neg
|
||
|
\text{\F{3.O.Loss}})$, meaning the scenario where \F{3} does not
|
||
|
propagates \LOSS when one of \F{1} or \F{2} does. Hence function \F{3}
|
||
|
is {loss detection safe} if and only if the set is empty.
|
||
|
\footnote{To check the safety of \F{3}, we need to perform the same
|
||
|
analysis for all the elements that can propagate a loss fail, not
|
||
|
only \F{1}.}.
|
||
|
|
||
|
In our example, once we eliminate the cases where \F{3} is not nominal
|
||
|
(that is when \F{3Processing} is in an error state), we find eight
|
||
|
minimal cuts, all of order $2$. In the following section, we correct
|
||
|
the behaviour of \F{3} by considering a new architecture based on
|
||
|
detectors and a switch to isolate the output of \F{3} when faulty:
|
||
|
|
||
|
%%
|
||
|
{%\centering
|
||
|
\begin{small}
|
||
|
\begin{tabular}{c@{\quad\quad}c}
|
||
|
\begin{minipage}[t]{0.42\linewidth}
|
||
|
\begin{verbatim}
|
||
|
{'F1.fail_err', 'F2.fail_loss'}
|
||
|
{'F1.fail_err', 'I2.fail_loss'}
|
||
|
{'F1.fail_loss', 'F2.fail_err'}
|
||
|
{'F1.fail_loss', 'I2.fail_err'}
|
||
|
\end{verbatim}
|
||
|
\end{minipage}
|
||
|
&
|
||
|
\begin{minipage}[t]{0.42\linewidth}
|
||
|
\begin{verbatim}
|
||
|
{'F2.fail_err', 'I1.fail_loss'}
|
||
|
{'F2.fail_loss', 'I1.fail_err'}
|
||
|
{'I1.fail_err', 'I2.fail_loss'}
|
||
|
{'I1.fail_loss', 'I2.fail_err'}
|
||
|
\end{verbatim}
|
||
|
\end{minipage}\\
|
||
|
\end{tabular}\\
|
||
|
\end{small}
|
||
|
}
|
||
|
%
|
||
|
Each of these cuts lead to trivial violations of our safety
|
||
|
property. For instance, we obtain the cut \code{\{'F1.fail\_err',
|
||
|
'F2.fail\_loss'\}} that corresponds to the case where \F{1}
|
||
|
propagates \ERR and \F{2} propagates \LOSS, in which case \F{3}
|
||
|
propagates \ERR. This is exactly the situation that we want to
|
||
|
avoid.
|
||
|
|
||
|
|
||
|
\begin{figure}[bt]
|
||
|
\centering
|
||
|
\begin{tikzpicture}
|
||
|
\node[anchor=south west,inner sep=0] at (0,0) { \includegraphics[width=\textwidth]{figures/Modele2.png}};
|
||
|
\draw[red!50,ultra thick, dashed] (10.8,3.4) rectangle (2,0);
|
||
|
\node[red] at (10.4,3.1) {\texttt{\large F3}};
|
||
|
\end{tikzpicture}
|
||
|
\caption{Model of a FDI function with a switch and an alarm.\label{fig:example1}}
|
||
|
\end{figure}%\vspace*{-1cm}
|
||
|
%\enlargethispage{\baselineskip}
|
||
|
|
||
|
\subsubsection{Safety model of the architecture with FDI}
|
||
|
%%\label{sec:simple-fail-detect}
|
||
|
%%
|
||
|
%In order to satisfy the Prop.~\ref{prop:1},
|
||
|
The updated implementation of \F{3} (see Fig.~\ref{fig:example1})
|
||
|
uses two perfect detectors, \F{1Loss} and \F{2Loss}, that can detect
|
||
|
a loss failure event on the inputs of the function. The (Boolean) outputs of
|
||
|
these detectors are linked to an OR gate ({\ttfamily AtLeastOneLoss})
|
||
|
which triggers an \ALARM when at least one of the detectors outputs
|
||
|
true. The alarm commands a \SWITCH; the output of \SWITCH is the same
|
||
|
as \MIN, unless \ALARM is activated, in which case it propagates a
|
||
|
\LOSS failure. The alarm can fail in two modes, either
|
||
|
continuously signaling a \LOSS or never being activated. The
|
||
|
schema in Fig.~\ref{fig:example1} also includes two delays operators, \D{1} and \D{2},
|
||
|
that model delay propagation at the input of the detectors;
|
||
|
we will not consider them in the following lines, but come back to these timing constraints at the end of the section.
|
||
|
|
||
|
% \TODO{why do we need Alarm ! why does the alarm can fail ! we could
|
||
|
% plug directly the or gate to the switch}- Permanent failure of the alarm.
|
||
|
|
||
|
The FDI function---with a switch and an alarm---is a stable scheme for
|
||
|
failure propagation: when in nominal mode, it detects all the failures
|
||
|
of the system and it is able to disambiguate the case where its inputs
|
||
|
contains both \ERR and \LOSS. Once again, this can be confirmed using
|
||
|
the Sequence Generation tool. If we repeat the same analysis than
|
||
|
before---and if we abstract away the delay nodes---we find $56$
|
||
|
minimal cuts, all involving a failure of either \ALARM or
|
||
|
\F{3Processing}, i.e. a non-nominal mode. This means that, in an untimed model, our new
|
||
|
implementation of \F{3} satisfies the loss detection property, as
|
||
|
desired.
|
||
|
% \begin{figure}[hbt]
|
||
|
% \centering
|
||
|
% \begin{tikzpicture}
|
||
|
% \node[anchor=south west,inner sep=0] at (0,0) { \includegraphics[width=\textwidth]{figures/Modele2.png}};
|
||
|
% \draw[red,ultra thick, dashed] (10.6,3.6) rectangle (2.2,0);
|
||
|
% \node[red] at (10,3) {\texttt{\huge F3}};
|
||
|
% \end{tikzpicture}
|
||
|
% % \includegraphics[width=0.95\textwidth]{figures/Modele2}
|
||
|
% \caption{Taking into consideration propagation delays in the FDI
|
||
|
% system.\label{fig:example2}}
|
||
|
% \end{figure}
|
||
|
% Unfortunately, this model makes unrealistic assumptions about the
|
||
|
% instantaneous failure propagation of detection. Indeed, it may be the
|
||
|
% case that the outputs of \F{1} and \F{2} are delayed as they are
|
||
|
% propagated to the observers \F{1Loss} and \F{2Loss}. Next, we study
|
||
|
% new failure modes that may arise from this situation and how we can
|
||
|
% detect them.
|
||
|
%%
|
||
|
% \subsection{Timed Safety model of the architecture with FDI}
|
||
|
% \label{sec:timed-safety-model}
|
||
|
%%
|
||
|
% In this section, we consider a new AltaRica model that takes into
|
||
|
% account propagation delays. To this end, we may insert two instances
|
||
|
% of the delay node (the element \PRE defined in Sect.~\ref{sec:sect2})
|
||
|
% before the detectors \F{1Loss} and \F{2Loss}.
|
||
|
Even so, it is easy to find a timed scenario where the safety property
|
||
|
is violated.
|
||
|
|
||
|
Assume now that \F{1} and \F{2} propagate respectively the status \LOSS and \ERR,
|
||
|
at the same date. In such a case and considering possible latencies,
|
||
|
while \ERR reaches \MIN instantaneously,
|
||
|
the output of \F{1} might reach \F{1Loss} at successive date.
|
||
|
% of the output of \F{2} reaching \F{2Loss},.
|
||
|
This leads to a transient state where the alarm is not activated whereas
|
||
|
the output of \MIN is set to \ERR. This brings us back to the same
|
||
|
dreaded scenario than in our initial model.
|
||
|
%%
|
||
|
In particular, this scenario corresponds to the cut
|
||
|
\code{\{'F1.fail\_loss', 'F2.fail\_err'\}}, that is not admissible in
|
||
|
an untimed semantics.
|
||
|
%%
|
||
|
|
||
|
This example suggests that we need a more powerful method to compute
|
||
|
the set of cuts in the presence of temporal constraints. On the other
|
||
|
hand, we may also advocate that our safety property is too limiting in
|
||
|
this context, where perfect synchronicity of events is rare. Actually,
|
||
|
it can be proven that the output of \F{3} will eventually converge to a
|
||
|
loss detection and isolation mode (assuming that \F{3} stays nominal and
|
||
|
that its inputs stay stable).
|
||
|
%
|
||
|
To reflect
|
||
|
this si\-tua\-tion, we pro\-pose an improved safety property that takes into
|
||
|
account temporal properties of the system:
|
||
|
%%
|
||
|
\begin{prop}[Loss Detection Convergent]\label{prop:2} A function is
|
||
|
\emph{loss detection convergent} if (when in nominal mode) there
|
||
|
exists a duration $\Delta$ such that it continuously outputs a \LOSS
|
||
|
after the date $\delta_0 + \Delta$ if at least one of its input
|
||
|
nodes continuously propagates a \LOSS starting from $\delta_0$
|
||
|
onward. The smallest possible value for $\Delta$ is called the
|
||
|
\emph{convergence latency} of the function.
|
||
|
\end{prop}
|
||
|
Hence, if the latency
|
||
|
needed to detect the loss failure can be bound, and if the bound is sufficiently
|
||
|
small safety-wise, we can still deem our system as safe.
|
||
|
In the example in Fig.~\ref{fig:cockpit}, this property can indicate for how long an erroneous airspeed is shown on the PFD to the pilot,
|
||
|
before the failure is isolated.
|
||
|
|
||
|
In the next section, we use our approach to generate a list of ``timed
|
||
|
cuts'' (as model-checking counterexamples) that would have exposed the
|
||
|
aforedescribed problems. We also use model-checking to compute
|
||
|
the {convergence latency} for the node \F{3}. In this simple example,
|
||
|
we can show that the latency is equal to the maximal propagation delay
|
||
|
at the input of the detectors. The value of the latency could be much
|
||
|
harder to compute in a more sophisticated scenario, where delays can
|
||
|
be chained and/or depends on the internal state of a component.
|
||
|
|
||
|
|
||
|
\section{Compilation of AltaRica and Experimental evaluation}\label{sec:sect4}
|
||
|
|
||
|
We have implemented the transformation outlined in
|
||
|
Sect.~\ref{sec:sect3}; the result is a compiler that automatically
|
||
|
generates Fiacre code from an AltaRica model. The compilation process
|
||
|
relies on the fact that it is possible to ``flatten'' a composition of
|
||
|
interconnected nodes into an intermediate representation, called a
|
||
|
\emph{Guarded Transition System} (GTS)~\cite{rau08}. A GTS is very
|
||
|
similar to a (single) AltaRica node and can therefore be encoded in a
|
||
|
similar way. Our tool is built using the codebase of the model-checker
|
||
|
EPOCH~\cite{epoch}, which provides the functionalities for the
|
||
|
syntactic analysis and the linking of AltaRica code. After
|
||
|
compilation, the Fiacre code can be checked using
|
||
|
Tina~\cite{BRV04}. As seen in Sect.~\ref{sec:tina}, the core of the Tina toolset is an
|
||
|
exploration engine that can be exploited by dedicated model-checking
|
||
|
and transition analyser tools. Tina offers several abstract state
|
||
|
space constructions that preserve specific classes of properties like
|
||
|
absence of deadlocks, reachability of markings, or linear and
|
||
|
branching time temporal properties. These state space abstractions are
|
||
|
vital when dealing with timed systems that generally have an infinite
|
||
|
state space (due to the use of a dense time model). In our
|
||
|
experiments, most of the requirements can be reduced to reachability
|
||
|
properties, so we can use on-the-fly model-checking techniques.
|
||
|
% and very ``aggressive'' abstractions.
|
||
|
|
||
|
\begin{figure}[thb]
|
||
|
\centering
|
||
|
\includegraphics[width=\textwidth]{figures/passerelle.png}
|
||
|
\caption{Graphical representation of the AltaRica/Fiacre toolchain.\label{fig:toolchain}}
|
||
|
\end{figure}
|
||
|
|
||
|
We interpret a GTS by a Fiacre process whose parameters consist of all
|
||
|
its state and flow variables. Each transition
|
||
|
\lstinline[language=Altarica]{g |-evt-> e} in the GTS is (bijectively)
|
||
|
encoded by a transition that matches the guard \code{g} and updates
|
||
|
the variables to reflect the effect of \code{e} plus the assertions. Each transition can
|
||
|
be labelled with time and priorities constraints to take
|
||
|
into account the \code{extern} declarations of the node. This
|
||
|
translation is straightforward since all the operators available in
|
||
|
AltaRica have a direct equivalent in Fiacre. Hence every
|
||
|
state/transition in the GTS corresponds to a unique state/transition
|
||
|
in Fiacre. This means that the state (reachability) graph of a GTS and
|
||
|
its associated Fiacre model are isomorphic. This is a very strong and
|
||
|
useful property for formal verification, since we can very easily
|
||
|
transfer verification artefacts (such as counterexamples) from one
|
||
|
model back to the other.
|
||
|
|
||
|
The close proximity between AltaRica and Fiacre is not really
|
||
|
surprising. First of all, both languages have similar roots in process
|
||
|
algebra theory and share very similar synchronisation mechanisms. More
|
||
|
deeply, they share formal models that are very close: AltaRica
|
||
|
semantics is based on the product of ``communicating automata'',
|
||
|
whereas the semantics of Fiacre can be expressed using (a time
|
||
|
extension of) one-safe Petri nets. The main difference is that
|
||
|
AltaRica provide support for defining probabilities on events, whereas
|
||
|
Fiacre is targeted towards the definition of timing aspects. This
|
||
|
proximity in both syntax and semantics is an advantage for the
|
||
|
validation of our tool, because it means that our translation should
|
||
|
preserve the semantics of AltaRica on models that do not use extern
|
||
|
laws to define probabilities and time. We have used this property to
|
||
|
validate our translation by comparing the behaviours of the models
|
||
|
obtained using Cecilia OCAS simulation tool and their translation. For
|
||
|
instance, in the case of the CAS system of
|
||
|
Sect.~\ref{sec:example-fail-detect-1}, we can compute the set of cuts
|
||
|
corresponding to Safety Property~\ref{prop:1} (loss detection) by
|
||
|
checking an invariant of the form
|
||
|
$((\text{\F{1.O}} = \text{\code{Loss}}) \vee (\text{\F{2.O}} =
|
||
|
\text{\code{Loss}}) \Rightarrow (\text{\F{3.O}} =
|
||
|
\text{\code{Loss}}))$.
|
||
|
%
|
||
|
%%\TODO{Check if it is necessary to mention the tool, as opposed to counterexamples}
|
||
|
In both cases---with and without FDI---we are able to compute the
|
||
|
exact same set of cuts than Cecilia OCAS. This is done using the
|
||
|
model-checker for modal mu-calculus provided with Tina, which can list
|
||
|
all the counterexamples for a (reachability) formula as a graph. More
|
||
|
importantly, we can use our approach to compute the timed
|
||
|
counterexample described at the end of
|
||
|
Sect.~\ref{sec:example-fail-detect-1}. All these computations can be
|
||
|
done in less than a second on our test machine.
|
||
|
|
||
|
\subsection{Empirical evaluation}
|
||
|
We have used our toolchain to generate the reachable state space of
|
||
|
several AltaRica models~\footnote{All the benchmarks tested in this paper are available at
|
||
|
https://w3.onera.fr/ifa-esa/content/model-checking-temporal-failure-propagation-altarica}:
|
||
|
RUDDER describes a control system for the
|
||
|
rudder of an A340 aircraft~\cite{bernard2007experiments}; ELEC refers
|
||
|
to three simplified electrical generation and power distribution
|
||
|
systems for a hypothetical twin jet aircraft; the HYDRAU model
|
||
|
describes a hydraulic system similar to the one of the A320
|
||
|
aircraft~\cite{bieber2002combination}. The results are reported in
|
||
|
Table~\ref{table:2}. In each case, we indicate the time needed to
|
||
|
generate the whole state space (in seconds) and the number of states
|
||
|
and transitions explored. For information, we also give the number of
|
||
|
state variables as reported by Cecilia OCAS. All tests were run on an
|
||
|
Intel 2.50GHz CPU with 8GB of RAM running Linux. In the case of model
|
||
|
HYDRAU we stopped the exploration after $30$ minutes and more than
|
||
|
$9.10^{9}$ generated states.
|
||
|
The state space is large in this benchmark because it models the physical a-causal propagation of a leak, so a leak can impact both upward and backward components and trigger a reconfiguration, multiplying the number of reachable states.
|
||
|
Moreover, the topology of the system shall be reconfigurated after the detection of some failures and this dynamic reconfiguration combined with a-causal propagation increases again more the size of the state space.
|
||
|
In all cases, the time needed to generate
|
||
|
the Fiacre code is negligible, in the order of 10~ms.
|
||
|
|
||
|
\begin{table}[tb]
|
||
|
\centering
|
||
|
\begin{tabular}{|l|c|c|c|c|c|}
|
||
|
\hline
|
||
|
\textbf{Model} & \textbf{time} (s) & \textbf{\# states} &
|
||
|
\textbf{\#
|
||
|
trans.}
|
||
|
& \textbf{\# state vars}\\ \hline
|
||
|
RUDDER~~~ & 0.85 & $3.3\,10^4$ & $2.5\,10^5$ & 15 \\ \hline
|
||
|
ELEC 01 & 0.40 & 512 & $2.3\,10^3$ & 9 \\ \hline
|
||
|
ELEC 02 & 0.40 & 512 & $2.3\,10^3$ & 9 \\ \hline
|
||
|
ELEC 03 & 101 & $4.2\,10^6$ & $4.6\,10^7$ & 22 \\ \hline
|
||
|
HYDRAU & 1800 & --- & --- & 59 \\ \hline
|
||
|
CAS~ & 0.40 & 729 & $2.9\,10^3$ & 6 \\ \hline
|
||
|
CAS with \PRE~~~ & 46 & $9.7\, 10^5$ & $4.3\, 10^6$ & 10 \\ \hline
|
||
|
\end{tabular}
|
||
|
\vspace*{1em}
|
||
|
\caption{State space size and generation time for several use cases.\label{table:2}}
|
||
|
\end{table}
|
||
|
|
||
|
Our models also include two versions of the complete CAS system
|
||
|
(including the detectors, the alarm and the switch); both with and
|
||
|
without the delay functions \D{1} and \D{2}. The ``CAS with \PRE'' model
|
||
|
is our only example that contains timing constraints. In this case, we
|
||
|
give the size of the state class graph generated by Tina, that is an
|
||
|
abstract version of the state space that preserves LTL properties. We
|
||
|
can use Tina to check temporal properties on this
|
||
|
example. More precisely, we can check that \F{3} has the \emph{loss
|
||
|
detection convergence} property. To this end, a solution is to add a
|
||
|
Time Observer to check the maximal duration between two events: first, a \code{obs\_start} event is triggered when the output of \F{1} or \F{2} changes to \LOSS;
|
||
|
then an \code{obs\_end} event is triggered when the output of \F{3} changes to \LOSS. The
|
||
|
observer has also a third transition (\code{obs\_err}) that acts as a
|
||
|
timeout and is associated with a time interval $I$ and is enabled
|
||
|
concurrently with \code{obs\_end}. Hence,
|
||
|
Time Observer ends up in the state yield by \code{obs\_err}
|
||
|
when the output of \F{3} deviates from its
|
||
|
expected value for more than $d$ units of time, with $d \in I$. We
|
||
|
have used this observer to check that the \emph{convergence latency}
|
||
|
of the CAS system equals $3$, when
|
||
|
we assume that the
|
||
|
delays are in the time interval $[1, 3]$.
|
||
|
The result is that
|
||
|
\code{obs\_err} is fireable for any value of $d$ in the interval
|
||
|
$[0,3]$, while \code{obs\_err} is not fireable if $I = ]3, \infty[$.
|
||
|
% It is enough to indicate to choose $I = [0,3]$.
|
||
|
These two safety properties can be checked on the system (plus the
|
||
|
observer) in less than $0.6$~s.
|
||
|
|
||
|
|
||
|
% \begin{wraptable}{r}{0.35\textwidth}
|
||
|
% %\begin{table}[bth]
|
||
|
% \centering
|
||
|
% \vspace*{-7mm}
|
||
|
% \begin{tabular}{|c|c|l}
|
||
|
% \cline{1-2}
|
||
|
% \textbf{delay}$\mathbf\tau$ & \textbf{safety property} & \\ \cline{1-2}
|
||
|
% % {$\quad[4,4]\quad$} & ok & \\ \cline{1-2}
|
||
|
% {$]3, \infty[$} & holds & \\ \cline{1-2}
|
||
|
% {[}3,3{]} & not holds & \\ \cline{1-2}
|
||
|
% {[}0,3{]} & not holds & \\ \cline{1-2}
|
||
|
% \end{tabular}
|
||
|
% \caption{Latencies check on Eq.~\eqref{eq:mc1}
|
||
|
% with $\delta = [1,3]$.\label{table:1}}
|
||
|
% %\end{table}
|
||
|
% \end{wraptable}
|
||
|
|
||
|
|
||
|
%%% Local Variables:
|
||
|
%%% mode: latex
|
||
|
%%% TeX-master: "main"
|
||
|
%%% End:
|
||
|
|
||
|
%% LocalWords: Fiacre
|