P4BFT: A Demonstration of Hardware-Accelerated
BFT in Fault-Tolerant Network Control Plane
Ermin Sakic
Cristian Bermudez Serna
Siemens AG
Munich, Germany
{firstname.lastname}@siemens.com
Endri Goshi
Nemanja Deric
Wolfgang Kellerer
Technical University Munich
Munich, Germany
{firstname.lastname}@tum.de
CCS Classification:
Networks -> Programmable Networks;
Systems Security -> Distributed Systems Security;
1 INTRODUCTION
ONOS and OpenDaylight deploy RAFT consensus to enforce update ordering and leader fail-over in the face of controller failures. RAFT is, however, unable to distinguish malicious / incorrect (e.g., buggy [7]) from correct controller decisions, and can easily be manipulated by an adversary in possession of the leader [4]. Byzantine Fault Tolerance (BFT)-enabled controller designs support correct consensus in scenarios where a subset of controllers is faulty due to a malicious adversary or internal bugs. Existing proposals [1, 2, 4] base their correctness on the premise that controller outputs are collected by trusted configuration targets, which compare the payloads in order to identify the correct message. Namely, each controller instance of the administrative domain transmits its configuration to the target switch. In in-band [6] deployments, where application flows share the same infrastructure as the control flows, the traffic generated by controller replicas imposes a non-negligible network load [3]. Furthermore, comparing and processing controller messages in the switches' control plane incurs additional CPU load [2, 4] and reconfiguration time.
P4BFT introduces the concept of in-network processing nodes, which intercept and collect individually computed controller outputs for matching client requests. After collecting a sufficient number of packets to identify the correct message payload, a processing node forwards the correct payload to the destined configuration target. By intercepting control flows in processing switches, and establishing point-to-point connections between processing switches and target destinations, P4BFT minimizes the network load imposed by BFT operation. It realizes the processing node functionality purely in a P4 pipeline, responsible for controller packet collection, correct packet identification, and forwarding to the destination nodes at line rate, thus effectively minimizing accesses to the switches' software control plane and vastly outperforming software-based BFT solutions.
2 P4BFT SYSTEM DESIGN
In P4BFT, network controller instances are configured as a set of replicated state machines, i.e., each instance calculates its decision in isolation from other controllers and transmits it to the destination switch. Control packets are intercepted by the processing nodes (i.e., processing switches) responsible for decisions destined for the target switch. Consider Fig. 1. Given the placement of controllers and the processing nodes' capacity, with the objective of minimizing the total control plane footprint and response time incurred for target configuration switches, P4BFT's Reassigner component identifies S2 as the best-fit processing node for S4. The multi-objective formulation further considers the delay metric and the available processing capacities at switches (e.g., a hardware-enabled P4BFT node has a higher throughput than a software-based one), and thus minimizes the total traversed critical path between the controller furthest away from the configuration target (F_D = 3 in the worst case in Fig. 1, assuming a delay weight of 1 per hop). The resulting assignment additionally minimizes the communication overhead to F_C = 11. This is compared to state-of-the-art works [1, 4] that default the processing node assignment to the reconfiguration targets, thus resulting in higher F_D and F_C compared to P4BFT.
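The assignment logic described above can be sketched as a small brute-force search. This is an illustrative Python model, not the paper's actual Reassigner: the graph, hop counts, and weight names are assumptions, and the real formulation additionally accounts for per-node processing capacities.

```python
def assign_processing_node(dist, controllers, candidates, target,
                           delay_weight=1.0, load_weight=1.0):
    """Pick the candidate processing node minimizing a weighted sum of
    footprint F_C and worst-case critical-path delay F_D.

    dist[a][b] holds the hop count from a to b (illustrative input).
    """
    best, best_cost = None, float("inf")
    for node in candidates:
        # F_C: every controller replica sends its message to the processing
        # node; only the single winning message travels on to the target.
        f_c = sum(dist[c][node] for c in controllers) + dist[node][target]
        # F_D: critical path via the controller furthest from the target.
        f_d = max(dist[c][node] for c in controllers) + dist[node][target]
        cost = load_weight * f_c + delay_weight * f_d
        if cost < best_cost:
            best, best_cost = node, cost
    return best
```

Tuning delay_weight versus load_weight mirrors the paper's configurable preference between reconfiguration latency and communication overhead.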
The Reassigner is the component that dynamically reassigns the controller-switch connections based on events collected from the detection mechanism of the P4BFT switches. Upon detection of faulty controllers, it excludes them from the assignment procedure. It determines the minimum number of controllers required to tolerate a configurable number of availability failures F_A and Byzantine failures F_M [4], and assigns the controllers to each switch of its administrative domain. Additionally, the Reassigner maps a processing node, in charge of comparing controller messages, to each destination switch. Thus, switches declared as processing nodes gain the responsibility of collecting and forwarding control packets. The Reassigner executes once during network bootstrapping and on selected control plane changes (i.e., on addition / disconnection of a switch / controller).

SIGCOMM'19, August 19-24, 2019, Beijing, China. Sakic et al.

Figure 1: P4BFT's offloading of processing role capability to intermediate switches leads to decreased packet footprint and control flow delays on the critical path, F_C = 11 and F_D = 3 hops, respectively.

The optimization output is enforced upon the P4 match-action tables of the switches: i) the Processing Table, necessary to identify the switches responsible for comparison of controller messages; and ii) the Forwarding Tables, necessary for forwarding of controller messages to processing nodes and reconfiguration targets. Given the user-configurable parameters of required tolerated failures F_A and F_M, the Reassigner reports to processing nodes the number of matching messages that must be collected prior to marking a controller message as correct. The details of the P4 control flow, as well as the match-action pairs of P4 tables, are presented in [4, 5].
P4BFT's Reassigner can be deployed as a trusted component of a switch, or as a replicated component of the network controller, i.e., in at least 2F_M + F_A + 1 instances, so as to tolerate F_M Byzantine and F_A availability faults of the Reassigner itself.
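The sizing rule above reduces to a one-line formula; the helper name below is ours, not the paper's.

```python
def required_reassigner_instances(f_m: int, f_a: int) -> int:
    """Minimum Reassigner replicas to tolerate f_m Byzantine and
    f_a availability faults, per the 2*F_M + F_A + 1 rule."""
    return 2 * f_m + f_a + 1
```

For example, tolerating one Byzantine and one availability fault requires four Reassigner instances.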
3 P4BFT DEMONSTRATION
The accompanying demonstration showcases the practical advantage of P4BFT in the 34-switch Internet2 and exemplary topologies (ref. Fig. 1) in a testbed equipped with software and physical P4-enabled switches. The significant load footprint and reconfiguration delay improvements over state-of-the-art works are visualized on a real-time dashboard, similarly to Fig. 2. Furthermore, it is shown how hardware-based packet comparison can lower the total reconfiguration delay in scenarios where the processing role capability is centralized in a single P4-enabled hardware node, due to the decreased number of accesses to the software-based control plane. The software switches are instances of the open-source bmv2 reference switch, adhering to the P4_16 language specification. The hardware-based P4_16-enabled P4BFT data plane node comprises the Netronome Agilio CX 10GE SmartNIC. The option to disable the offloading of processing node capability is implemented for the purpose of comparison to methods presented in existing works. The configurable weight parameters allow for fine-tuning of the multi-objective optimization, and thus provide the user with an interface to prefer either minimized communication overhead or reconfiguration latency. The special case, where all control plane packets stemming from replicated controller instances traverse a single P4_16 hardware node, demonstrates the advantages of hardware-accelerated packet hashing and comparison, and thus underlines a case for hybrid deployments where the control plane relies on state-of-the-art control protocols (e.g., OpenFlow / NETCONF+YANG), whereas the P4BFT-equipped edge nodes internalize the processing node capabilities.
Figure 2: P4BFT's performance gains in terms of control plane load and reconfiguration latency (CDF of the switch reconfiguration delay for the state-of-the-art [1, 2, 4], P4BFT on bmv2, and P4BFT with a single processing node on the P4_16 SmartNIC and on bmv2, alongside the mean control plane load improvement in percent).
4 ACKNOWLEDGMENT
This work has received funding from the EU in the context
of the H2020 project SEMIOTICS (grant agreement number
780315).
REFERENCES
[1] He Li et al. 2014. Byzantine-resilient secure SDN with multiple controllers in cloud. IEEE Transactions on Cloud Computing 2, 4 (2014).
[2] Purnima Murali Mohan et al. 2017. Primary-Backup Controller Mapping for Byzantine Fault Tolerance in SDN. In 2017 IEEE Global Communications Conference (IEEE Globecom 2017). IEEE.
[3] Abubakar Siddique Muqaddas et al. 2016. Inter-controller traffic in ONOS clusters for SDN. In 2016 IEEE International Conference on Communications (IEEE ICC 2016). IEEE.
[4] Ermin Sakic et al. 2018. MORPH: An Adaptive Framework for Efficient and Byzantine Fault-Tolerant SDN Control Plane. IEEE Journal on Selected Areas in Communications 36, 10 (2018).
[5] Ermin Sakic et al. 2019. P4BFT: Hardware-Accelerated Byzantine-Resilient Network Control Plane. CoRR abs/1905.04064 (2019). arXiv:1905.04064 http://arxiv.org/abs/1905.04064
[6] Liron Schiff et al. 2016. In-band synchronization for distributed SDN control planes. ACM SIGCOMM CCR 46, 1 (2016).
[7] Petra Vizarreta et al. 2017. An empirical study of software reliability in SDN controllers. In 2017 13th International Conference on Network and Service Management (IEEE CNSM 2017). IEEE.