ISO 26262 compliant EBA electronic system

ISO 26262 documentation
Critical safety application programming

This project was developed for a university assignment. The task was to develop an Emergency Braking Assist (EBA) system for a generic city car that complies with the ISO 26262 standard.

ISO 26262 documentation

Given the current vehicle speed V and the distance D between the vehicle and the preceding one, if $\frac{V^2}{100} \geq D$ the EBA activates the brake request and keeps braking until $\frac{V^2}{100} < D$. The brake request is either 0 (no braking is needed) or 1 (braking is needed). The EBA receives the vehicle speed V and the distance D every 100 ms and decides whether braking is needed.
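The decision rule above can be sketched as a small function. This is only an illustration, not the Stateflow-generated code: the name `brake_request` is hypothetical, and the units (km/h for V, metres for D) are an assumption, since the text does not state them. The comparison is rewritten as V² ≥ 100·D so that no integer division is needed.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from the project sources.
// Returns 1 when braking is needed (V^2/100 >= D), 0 otherwise.
// Assumed units: speed in km/h, distance in metres.
inline int brake_request(std::uint32_t speed, std::uint32_t distance)
{
    // Compare V^2 with 100*D instead of V^2/100 with D,
    // so integer division cannot truncate the result.
    const std::uint64_t v_squared = static_cast<std::uint64_t>(speed) * speed;
    const std::uint64_t limit = static_cast<std::uint64_t>(distance) * 100u;
    return (v_squared >= limit) ? 1 : 0;
}
```

With these assumptions, a vehicle at V = 50 and D = 25 is exactly on the limit and gets a brake request, while D = 26 does not.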

Elements of the item

Microcontroller (CPU + embedded memory + CAN interface), radar.

Interaction of the items with other items

The item interacts with the Body Computer through the CAN for receiving the vehicle speed and for providing the brake signal.

Identification of provided functionalities to other items

The item provides the brake signal to the Body Computer through the CAN.

Identification of required functionalities from other items

The item requires the vehicle speed from the Body Computer through the CAN.

Identification of hazards

To identify the hazards correctly, we first perform an FMEA (Failure Mode and Effects Analysis):

FM1: radar breaks → distance is not correctly measured. Note that this failure can lead to different situations: distance measured is smaller than the real one or distance measured is higher than the real one. Anyway the main point is that the measure is not correct anymore due to the failure.
FM2: microcontroller breaks → brake signal is not reliable. This failure can also lead to different situations, but the key point is that the microcontroller cannot execute the algorithm correctly and, as a consequence, the brake signal is not correct.
FM3: CAN interface breaks → information is not correctly exchanged with the CAN.

All these failures lead to two main hazards that we have to take into account for the Hazard Analysis and Risk Assessment of the item:

H1: the item sends the brake signal when it is not required (i.e. V²/100 < D).
H2: the item does not send the brake signal when it is required (i.e. V²/100 ≥ D).

Identification of operational situations

The key point in identifying the operational situations is that the possible presence of a following car must also be considered. Hence we call:

d1 the distance between the car under analysis and the following one
d2 the distance between the car under analysis and the preceding one
As a consequence, we can discriminate the operational situations by considering:
Distance, which can be greater or smaller than the safety distance (safety distance = V²/100).
Speed, which can be higher or lower than a threshold value. We use 30 km/h as the threshold (the speed distinguishes different degrees of severity).

Therefore, the operational situations are the following:

OS1: V < 30 km/h, d1 < safety distance, d2 > safety distance;
OS2: V > 30 km/h, d1 < safety distance, d2 > safety distance;
OS3: V < 30 km/h, d1 > safety distance, d2 < safety distance;
OS4: V > 30 km/h, d1 > safety distance, d2 < safety distance;
OS5: V < 30 km/h, d1 < safety distance, d2 < safety distance;
OS6: V > 30 km/h, d1 < safety distance, d2 < safety distance;
OS7: vehicle moving and a pedestrian or a cyclist crosses the road.
Note that there is no need to consider the operational situations in which:
OS8: V < 30 km/h, d1 > safety distance, d2 > safety distance;
OS9: V > 30 km/h, d1 > safety distance, d2 > safety distance;
In both of them the distance limit is respected and there is no risk.

ASIL determination

Operational situation	H1	ASIL	H2	ASIL	Notes
OS1	S=1,E=4,C=2	A			H2 cannot take place in OS1 since d2 > safety distance
OS2	S=2,E=4,C=3	C			H2 cannot take place in OS2 since d2 > safety distance
OS3	S=1,E=4,C=2	A	S=1,E=4,C=1	QM
OS4	S=2,E=4,C=3	C	S=2,E=4,C=2	B
OS5	S=2,E=4,C=2	B	S=2,E=4,C=1	A
OS6	S=3,E=4,C=3	D	S=3,E=4,C=2	C
OS7	S=3,E=3,C=3	C	S=3,E=3,C=2	B
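The ASIL values in the table can be cross-checked in code. The sketch below assumes the usual shortcut for the ISO 26262-3 ASIL determination table: adding the class indices S + E + C gives D for 10, C for 9, B for 8, A for 7 and QM otherwise. The names `Asil` and `determine_asil` are hypothetical and not part of the project.

```cpp
// ASIL levels, ordered from least to most stringent.
enum class Asil { QM, A, B, C, D };

// Hypothetical helper: maps severity (1..3), exposure (1..4) and
// controllability (1..3) classes to an ASIL using the sum shortcut.
// Class 0 (S0, E0 or C0) means no ASIL is assigned, so QM is returned.
inline Asil determine_asil(int s, int e, int c)
{
    if (s < 1 || e < 1 || c < 1) {
        return Asil::QM;
    }
    switch (s + e + c) {
        case 10: return Asil::D;
        case 9:  return Asil::C;
        case 8:  return Asil::B;
        case 7:  return Asil::A;
        default: return Asil::QM;
    }
}
```

For example, OS6 under H1 (S=3, E=4, C=3) sums to 10 and yields ASIL D, while OS3 under H2 (S=1, E=4, C=1) sums to 6 and yields QM.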

General notes

The exposure is always 4 for OS1 to OS6, since driving either at low speed (V < 30 km/h) or at high speed (V > 30 km/h) is a common condition (> 10% of the driving time).
The controllability has been set according to the following rules:
- H1 is considered less controllable than H2 since the driver does not expect that behaviour and cannot counter it by braking or accelerating. Hence:
  - If V < 30 km/h → C = 2;
  - If V > 30 km/h → C = 3;
- H2 is considered more controllable than H1 since the driver needs only to brake to avoid dangerous situations. Then:
  - If V < 30 km/h → C = 1;
  - If V > 30 km/h → C = 2;
The severity has been set according to the following rules:
- If V < 30 km/h, then:
  - If only one of the two distances is not respected → S = 1;
  - If both the distances are not respected → S = 2;
- If V > 30 km/h, then:
  - If only one of the two distances is not respected → S = 2;
  - If both the distances are not respected → S = 3;
Regarding OS7:
- The exposure is set to 3 since we consider an urban scenario, where a pedestrian or a cyclist crossing the road has medium probability (1% - 10% of driving time).
- The severity is always set to 3.
- The controllability depends on the hazard.
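The severity and controllability rules for OS1 to OS6 can be written down compactly. This is a minimal sketch with hypothetical names (`Hazard`, `severity`, `controllability`); it only encodes the rules listed above and does not cover OS7, whose values are set separately.

```cpp
enum class Hazard { H1, H2 };

// Severity class for OS1..OS6.
// fast:     true when V > 30 km/h, false when V < 30 km/h.
// violated: how many of the two distances (d1, d2) are below
//           the safety distance (1 or 2).
inline int severity(bool fast, int violated)
{
    const int base = fast ? 2 : 1;           // one distance not respected
    return (violated >= 2) ? base + 1 : base; // both not respected
}

// Controllability class for OS1..OS6.
// H1 (unwanted braking) is rated one class worse than H2 (missing braking).
inline int controllability(bool fast, Hazard hazard)
{
    const int base = fast ? 2 : 1;           // values for H2
    return (hazard == Hazard::H1) ? base + 1 : base;
}
```

For OS6 (V > 30 km/h, both distances not respected) this gives S = 3, with C = 3 for H1 and C = 2 for H2, matching the table.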

Definition of Safety goals

The item shall correctly measure the distance to the preceding vehicle and shall send the brake request only if the condition V²/100 ≥ D holds.

Definition of Functional Safety Concepts

Since an ASIL D combination came out during the analysis, a lot of resources have to be spent in the functional safety concepts definition.

FSC1: An ASIL D compatible microprocessor has to be used.
FSC2: the item shall perform self-test in order to check the correct behaviour of all the components. If any misbehaviour is detected, then the item shall transit to safe state.
FSC3: redundancy is crucial in case of ASIL D. Hence two radars shall be used to compare the different measurements, and possibly two microcontrollers in master-slave configuration.
FSC4: a bypass circuit which receives V and D and combines them through logic gates shall be implemented.
FSC5: an external check of the ECU correct working condition shall be implemented.

Definition of the Safe State

If only one of the microcontrollers is working, then the other is disabled and the driver is informed.
If neither microcontroller is working, the item is disabled and the driver is informed of the malfunction (e.g. through an LED on the dashboard).

Implementation of redundancy

In this configuration two boards work in parallel to measure the distance independently and compute the brake request. One board is the master and the other is the slave. Only the master can issue brake requests. If the master malfunctions, the slave is promoted and becomes able to issue brake requests.
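The role handling on each board can be sketched as follows. The types and function names are hypothetical; the single-character commands 'E' (go to error state) and 'P' (promote to master) are the ones the manager sends in the Arduino code later in this document. This is an illustration of the idea, not the Stateflow chart used in the project.

```cpp
enum class Role { Master, Slave };
enum class State { Normal, Error };

// Minimal view of one ECU: its role and its working state.
struct Ecu {
    Role role;
    State state;
};

// Applies one command received from the manager.
// 'E' puts the ECU in error state, 'P' promotes it to master.
// Any other character is ignored.
inline void apply_command(Ecu& ecu, char command)
{
    if (command == 'E') {
        ecu.state = State::Error;
    } else if (command == 'P') {
        ecu.role = Role::Master;
    }
}

// Only a correctly working master may issue brake requests.
inline bool may_issue_brake(const Ecu& ecu)
{
    return ecu.role == Role::Master && ecu.state == State::Normal;
}
```

A slave that receives 'P' starts passing `may_issue_brake`, while a master that receives 'E' stops passing it.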

Critical safety application programming

As ISO 26262 specifies, software and hardware must follow specific rules to operate in a safety-critical environment. For this reason, part of the software is generated by automation tools such as Simulink. The hardware platform for the ECUs is the Freedom K64F, connected to an HC-04 ultrasonic sensor for distance measurement. In addition, an Arduino Nano Pro acts as an external ECU that manages and coordinates the two sensing ECUs.
The code of the K64F boards consists of a discrete-state part built in Stateflow, exported to C code and included in the Mbed project.

K64F code

The program is based on the source code exported from the Stateflow chart: the main code manages the inputs and the outputs depending on the ECU working state and tries to catch internal errors.
The following are possible sources of errors:

No signal from Arduino: the ECU cannot operate because the connection with the manager is broken.
No signal from body computer (simulated on PC): the ECU cannot operate without the speed value.
Board error: the ECU cannot detect this kind of fault through a self-check, so the error is detected by the Arduino and reported to the ECU over the serial connection.

Both network checks (Arduino and body computer connections) rely on a timer that is reset only when a good signal is received. If no good signal arrives within a specific timeout, the board puts itself in the error state.

reading from body computer

void read_bc()
{
  // read speed from body computer (ASCII encoding)
  // - if malfunction set state to ERROR
  if (pc.readable())
  {
    bool malfunc = false;
    int i = 0;
    char buf[10];
    // leave room for the terminating '\0' written after the loop
    while (pc.readable() && i < (int)sizeof(buf) - 1) {
      buf[i] = pc.getc();
      if (buf[i] == '\r') break;
      else if (!isdigit(buf[i])) malfunc = true;
      i++;
    }
    fflush(pc);
    if (!malfunc) {
      // the communication is working as expected then
      // the value is stored and the timer of network fault
      // check is reset
      buf[i] = '\0';
      controller_U.speed = atoi(buf);
      bc_t.reset();
    }
  }
}

The timer check is executed asynchronously to the main code using a scheduled routine.

network timer check

void netcheck()
{
  bc_t.stop();
  if (bc_t.read_ms() > TIMEOUT) {
    // network error
    state = ERROR;
  }
  bc_t.start();
}

The routine that checks Arduino connection is implemented in a similar way.
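The timeout idea behind both checks can also be sketched in a platform-independent way. The example below is not the project's routine: it swaps the Mbed timer for plain millisecond timestamps, and the `LinkMonitor` type and its members are hypothetical. Unsigned subtraction keeps the elapsed time correct even when the millisecond counter wraps around.

```cpp
#include <cstdint>

// Hypothetical link monitor: remembers when the last good message
// arrived and reports a fault when the silence exceeds the timeout.
struct LinkMonitor {
    std::uint32_t timeout_ms;    // maximum allowed silence
    std::uint32_t last_good_ms;  // timestamp of the last good message

    // Call whenever a well-formed message has been received.
    void on_good_message(std::uint32_t now_ms)
    {
        last_good_ms = now_ms;
    }

    // Call periodically; true means the link is considered broken.
    bool timed_out(std::uint32_t now_ms) const
    {
        // Unsigned arithmetic handles counter wrap-around.
        return static_cast<std::uint32_t>(now_ms - last_good_ms) > timeout_ms;
    }
};
```

A periodic routine would call `timed_out` with the current time and set the error state when it returns true, while the reading routine would call `on_good_message` after each valid message.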

Arduino code

The Arduino is responsible for detecting board errors, since the boards cannot fully detect a malfunction by themselves. The following checks are performed on each ECU port (connected to a K64F board):

Presence of signal: if a port does not receive any signal in a defined number of cycles, then it means that the ECU is not working or that the communication cannot take place. After a certain amount of time the ECU is considered faulty.
Coherence of signal: the signal coming from a port is analyzed for errors in the structure of the information. Two checks are executed: first, the first three characters should represent the measured distance, hence they have to be digits; second, the last two characters should represent the state and the role of the ECU, respectively. If they are not consistent with this format, or if the message has an unexpected total length, the ECU could be faulty. However, disturbances can occur along the transmission line, so the ECU is considered faulty only after a reasonable number of errors in the stream.
Coherence of measure: the distances measured by the ECUs are compared, and if the difference is higher than a certain threshold, then a malfunction could have happened in one ECU. In this case there is not any other way to check which is the correctly working ECU. Hence, we decided to assume that the master has a better hardware so the slave will be considered faulty after a certain amount of wrong measurements.
Self check on ECU: the ECUs inform the manager about their state, because they can detect a malfunction also on their own (for example absence of signal from body computer which should provide vehicle current speed, as stated previously). If the master ECU notifies the manager that it is not working correctly, then the manager promotes the slave by sending a promote signal, so that it can issue brake requests to the vehicle.

The software implementation relies on the SoftwareSerial library, which makes it possible to create virtual serial ports, overcoming the limit of a single hardware UART tied to pins 0 and 1. With SoftwareSerial, two software ports are created, but they share the same underlying resources (buffer, clock, etc.), so they have to use them in turn. During the main loop of the Arduino code, control of the serial line is swapped between the two software serial connections.
The message sent from each K64F board to the Arduino is composed as follows:

Block	Size	Type	Description
0	3	text encoding	Distance measured
1	1	text encoding	Current state of ECU
2	1	text encoding	Current role of ECU
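On the sending side, a message with this layout could be built as shown below. This is a sketch under assumptions: the function name `encode_status` is hypothetical, and the accepted characters ('N'/'E' for the state, 'M'/'S' for the role) are the ones checked by the Arduino reading code in this document. The actual K64F code may build the message differently.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical encoder for the 5-character status message:
// three distance digits, one state character, one role character.
// out must have room for 5 characters plus the terminating '\0'.
// Returns false if an argument does not fit the message format.
inline bool encode_status(int distance, char state, char role,
                          char* out, std::size_t out_size)
{
    if (out == nullptr || out_size < 6) return false;
    if (distance < 0 || distance > 999) return false;   // must fit 3 digits
    if (state != 'N' && state != 'E') return false;     // normal or error
    if (role != 'M' && role != 'S') return false;       // master or slave
    // %03d pads the distance with leading zeros to exactly 3 digits.
    std::snprintf(out, out_size, "%03d%c%c", distance, state, role);
    return true;
}
```

Padding the distance to a fixed width keeps the message length constant, which is what lets the receiver treat any other length as a stream error.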

For each port, the following code is executed:

port reading

// Listening for signals from ECU at port one
  i = 0;
  f = false;
  p_one.listen();
  delay(WAITFORDATA);
  if (p_one.available()){
    // A signal is received from ECU at port one
    while(p_one.available() && i < 5){
      if (i < 3){
        // First three characters should be digits
        buff_n[i] = p_one.read();
        // If character is not digit there is an error in the stream
        if (!isDigit(buff_n[i])) f = true;
      }
      else {
        // Other two alphabetic characters for state and role
        buff[i-3] = p_one.read();
        // First is state and should be E (error) or N (normal)
        if ((i == 3) && !((buff[i-3] == 'E') || (buff[i-3] == 'N'))) f = true;
        // Second is role and should be S (slave) or M (master)
        if ((i == 4) && !((buff[i-3] == 'S') || (buff[i-3] == 'M'))) f = true;
      }
      i++;
      delay(1);
    }
    // If the message length is not the one expected or there is a structure error
    // in the stream then the count of errors at port one is incremented
    if ((i != 5) || f ) { 
      p_one_errors++;
    }
    else {
      // If the message is good the count of errors is reset
      p_one_errors = 0;
      // The conversion of distance from characters to integer takes place
      // through the function atoi which accepts a string so the last character
      // has to be a null character
      buff_n[3] = '\0';
      dist_one = atoi(buff_n);
      // If the ECU is already in fault (due to previous communication from manager
      // or failed self check) then it is marked as faulty
      state_one = buff[0];
      if (buff[0] == 'E') fault_one = true;
      // The role of ECU at port one is saved
      if (buff[1] == 'M')
        role_one = 0;
      else
        role_one = 1;
    }
    // A message has arrived, then the counter of steps without message has to be
    // reset
    wait_one = 0;
  } else {
    // No message received, then the counter of steps without message has to be
    // incremented
    wait_one++;  
  }

After the reading of the second serial port, all the checks are performed, and response is sent to the boards:

error check

// If the distances differ too much the counter of measurement
  // errors is incremented, otherwise it is reset
  if (THRESHOLD < abs(dist_two - dist_one)){
    dist_errors++;
  } else {
    dist_errors = 0;
  }
  // If the measurements differ for too long the slave board is marked
  // as faulty, provided the master is not faulty
  if (dist_errors > MAXDISTER) {
    if ((role_one == 0) && !fault_one) fault_two = true;
    if ((role_two == 0) && !fault_two) fault_one = true;
  }
  // If the stream from an ECU is not coherent or is not present at all
  // for some time, the ECU is marked as faulty
  if (p_one_errors > MAXERRORS || wait_one > MAXWAIT) fault_one = true;
  if (p_two_errors > MAXERRORS || wait_two > MAXWAIT) fault_two = true;
  // Messages to the ECUs are sent on the basis of their state:
  // - E stands for error: if an ECU receives E it puts itself in error
  //   state if it is not already in that state
  // - P stands for promote: if an ECU receives P it switches its role to
  //   master if it is not already master (i.e. able to issue
  //   brake requests)
  if (fault_one && !fault_two){
    p_one.print("E");
    if (role_two == 1)
      p_two.print("P");
  }
  if (!fault_one && fault_two){
    if (role_one == 1)
      p_one.print("P");
    p_two.print("E");
  }
  if (fault_one && fault_two){
    Serial.println("No ECUs available. System is deactivated.");
    // An infinite delay means that the component is deactivated
    // Error message is transmitted continuously to be sure that the boards
    // receive it
    while(true){
      p_one.print("E");
      delay(1);
      p_two.print("E");
      delay(1);
    }
    // Moreover a visual indicator, like an LED, could be lit in this case by
    // the manager through a digitalWrite call (placed before the loop above)
    // and a suitable circuit.
  }

To improve the system, a bypass communication channel between master and slave could be added, because if the manager fails the system cannot work at all. In this study, however, the manager is assumed to always work properly.

Source code

The code is available at the following location.