Context-Specific Middleware Specialization Techniques
for Optimizing Software Product-line Architectures
Arvind S. Krishna
Dept. of Electrical Engineering
and Computer Science
Nashville, TN, USA
Aniruddha S. Gokhale
Dept. of Electrical Engineering
and Computer Science
Nashville, TN, USA
Douglas C. Schmidt
Dept. of Electrical Engineering
and Computer Science
Nashville, TN, USA
Product-line architectures (PLAs) are an emerging paradigm for
developing software families for distributed real-time and embedded
(DRE) systems by customizing reusable artifacts, rather than
handcrafting software from scratch. To reduce the effort of developing
software PLAs and product variants for DRE systems, developers
are applying general-purpose – ideally standard – middleware plat-
forms whose reusable services and mechanisms support a range
of application quality of service (QoS) requirements, such as low
latency and jitter. The generality and flexibility of standard
middleware, however, often result in excessive time/space overhead
for DRE systems, due to a lack of optimizations tailored to meet the
specific QoS requirements of different product variants in a PLA.
This paper provides the following contributions to the study of
middleware specialization techniques for PLA-based DRE systems.
First, we identify key dimensions of generality in standard mid-
dleware stemming from framework implementations, deployment
platforms, and middleware standards. Second, we illustrate how
context-specific specialization techniques can be automated and
used to tailor standard middleware to better meet the QoS needs
of different PLA product variants. Third, we quantify the ben-
efits of applying automated tools to specialize a standard Real-
time CORBA middleware implementation. When applied together,
these middleware specializations improved our application product
variant throughput by ∼65%, average- and worst-case end-to-end
latency measures by ∼43% and ∼45%, respectively, and predictability
by a factor of two over an already optimized middleware
implementation, with little or no effect on portability, standard
middleware APIs, or application software implementations, and inter-
Categories and Subject Descriptors
D.4.8 [Operating Systems]: Performance
∗Work supported by NSF ITR CCR-0312859 and Qualcomm
EuroSys’06, April 18–21, 2006, Leuven, Belgium.
Keywords
Product lines, Middleware, Specializations
Emerging trends and challenges. Product-line architectures (PLAs)
[2, 20] are a promising technology for systematically addressing
key challenges of large-scale software systems. In contrast to
conventional software processes that produce separate point
solutions, i.e., solutions customized on a case-by-case basis, PLA-based
processes create families of product variants that share a common
set of capabilities, patterns, and architectural styles. PLAs can
be characterized using scope, commonality, and variabilities (SCV)
analysis, which identifies the scope of the product families in an
application domain and determines the common and variable
properties among them.
PLAs have been created and applied to a variety of domains [10,
25], including the domain of distributed, real-time and embedded
(DRE) systems [5, 30, 31]. Examples of DRE systems include
applications with hard real-time requirements, such as avionics
mission computing, as well as those with softer real-time
requirements, such as telecommunication call processing and streaming
video. QoS challenges (such as low memory footprint and predictable
or bounded latency) of DRE systems have hitherto led developers
to (re)invent custom applications that are tightly coupled
to specific hardware/software platforms, which is tedious, error-
prone, and hard to evolve over product lifecycles. During the past
decade, therefore, a key technology for alleviating the tight cou-
pling between applications and their underlying platforms has been
middleware, which (1) functionally bridges the gap between ap-
plications and platforms, (2) controls many aspects of end-to-end
QoS, and (3) simplifies the integration of components developed by
Although middleware has been used successfully in DRE sys-
tems [5, 30, 31], key challenges must be overcome before it can
be applied broadly to support the QoS needs of PLA-based DRE
systems. In particular, R&D is needed to help resolve the tension
between (1) the generality of standards-based middleware platforms,
which benefit from reusable architectures designed to satisfy
a broad range of application requirements, and (2) application-specific
product variants, which benefit from highly optimized, custom
middleware implementations. In resolving this tension, solutions
should ideally retain the portability and interoperability
afforded by standard middleware.
Specializing Middleware for PLAs. The chief hypotheses of this
paper are that, even for highly optimized general-purpose standard
middleware frameworks, (1) there are opportunities to further
optimize the system when unwanted generality is removed from the
middleware and (2) these optimizations are not feasible without first
removing the generality. This paper operationalizes these hypotheses
by developing and applying a toolkit to help resolve key aspects
of the generality/specificity tension outlined above. This toolkit
automates the specialization of general-purpose standard middleware
to meet the needs of specific PLA-based DRE systems.
This paper provides the following research contributions:
1. We use a representative PLA case study drawn from Boeing’s
avionics mission computing PLA-based DRE system called
Bold Stroke [30, 31] to identify key dimensions of excessive
generality in standards-based middleware, focusing on Real-time
CORBA used in Bold Stroke.
2. We show how context-specific specialization techniques
(such as code refactoring and code weaving) can
be used to customize the widely used TAO Real-time
CORBA implementation to remove excessive generality and
thus better support application-specific QoS needs of PLA-based
DRE systems, such as Bold Stroke.
3. We describe the design of a domain-specific language, tools,
and a process for automating the specialization techniques
discussed in the paper.
4. We discuss quantitative results that demonstrate the improvement
in performance and predictability of specializations applied
to TAO in the context of our PLA case study.
Our results show that specialization techniques guided by context-specific
information can significantly improve the QoS of a standards-based
middleware implementation that has already been optimized
extensively via general-purpose techniques [22, 24].
General-purpose implementations of standard middleware are designed
to be reusable since they need to satisfy a broad range of
functional and QoS application requirements. PLAs define a fam-
ily of systems that have many common functional and QoS re-
quirements, as well as variability specific to particular products
built using the PLA. Resolving the tension between generality and
specificity is essential to ensure middleware can support the QoS
requirements of PLA-based DRE systems. Unfortunately, imple-
mentations of standards-based, QoS-enabled middleware, such as
Real-time CORBA and Real-time Java, can incur time/space overheads
due to excessive generality. This section uses a representative
PLA-based DRE system scenario to identify and illustrate common
types of excessive generality in standard middleware.
2.1 DRE PLA Case Study
This section uses a representative DRE PLA scenario to (1) illus-
trate how the generality/specificity tension outlined above occurs in
production DRE systems and (2) identify concrete system invari-
ants that drive our specialization approach. The scenario is based
on the Boeing Bold Stroke avionics mission computing PLA,
which is a component-based, publish/subscribe platform built atop
the TAO Real-time CORBA Object Request Broker (ORB). Fig-
ure 1 illustrates the BasicSP application scenario, which is an as-
sembly of avionics mission computing components reused in dif-
ferent Bold Stroke product variants. This scenario involves four
avionics mission computing components that periodically send GPS
position updates to pilot and navigator cockpit displays at a rate of
20 Hz. The time to process inputs to the system and present output
to cockpit displays should thus be less than a single 20 Hz frame.
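As a quick check of the bound stated above, the 20 Hz update rate fixes the frame period. The symbols T_frame and t_e2e are notation introduced here for illustration and do not appear in the original text.

```latex
T_{\text{frame}} = \frac{1}{20\ \text{Hz}} = 50\ \text{ms},
\qquad
t_{\text{e2e}} < T_{\text{frame}} = 50\ \text{ms}
```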
Figure 1: BasicSP Application Scenario
Communication between components uses an event-push/data-pull
model: data-producing components push an event to notify consumers
that new data is available, and data-consuming components pull the
data from the source. A Timer component pulses a GPS navigation
sensor component at a certain rate, which in turn publishes the
data_avail events to an Airframe component that then calls
a method provided by the Read_Data interface of the GPS com-
ponent to retrieve the current location. After formatting the data,
Airframe sends a data_avail event to the Nav_Display
component, which pulls the location and velocity data from the
Airframe component and displays this information on the pilot’s
Commonalities in the BasicSP scenario include the set of reusable
components (such as Display, Airframe, and GPS) in Bold
Stroke and middleware capabilities (such as connection management,
data transfer, concurrency, synchronization, (de)marshaling,
(de)multiplexing, and error-handling) that occur in all product
variants. Variabilities include application-specific component
connections (such as how GPS and Airframe components are connected
in different airplanes), different implementations (such as whether
GPS or inertial navigation algorithms are used), and components
specific to particular customers (such as restrictions on exporting
certain encryption algorithms). The rates at which these components
interact are yet another variability that may change in different
Analysis of commonalities and variabilities in the BasicSP sce-
nario helps identify functional (e.g., specific communication proto-
cols) and QoS (e.g., end-to-end latency) characteristics of PLAs. In
turn, these characteristics map to specific requirements on – and po-
tential optimizations of – the underlying middleware. The remain-
der of this paper focuses on specialized middleware optimizations
of PLA functionality and QoS characteristics.
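The event-push/data-pull interaction described for the BasicSP scenario can be sketched in plain C++ as follows. This is only an illustration under simplifying assumptions: the class names `DataAvailPublisher`, `Gps`, `Airframe`, and `NavDisplay`, the `Position` struct, and the in-process callbacks are hypothetical stand-ins, whereas the Bold Stroke components communicate through the Real-time CORBA ORB.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical payload; the real components exchange richer data.
struct Position { double lat; double lon; };

// Minimal event source: producers push a data_avail notification and
// subscribers react by pulling the data from the producer.
class DataAvailPublisher {
public:
  using Handler = std::function<void()>;
  void subscribe(Handler h) { handlers_.push_back(std::move(h)); }
protected:
  void push_data_avail() const { for (const auto& h : handlers_) h(); }
private:
  std::vector<Handler> handlers_;
};

class Gps : public DataAvailPublisher {
public:
  // Stands in for the Timer pulse: store a fresh reading, then push.
  void pulse(Position fresh) { current_ = fresh; push_data_avail(); }
  // Plays the role of the Read_Data interface (the pull side).
  Position read_data() const { return current_; }
private:
  Position current_{0.0, 0.0};
};

class Airframe : public DataAvailPublisher {
public:
  explicit Airframe(Gps& gps) : gps_(gps) {
    gps_.subscribe([this] { on_data_avail(); });
  }
  Position read_data() const { return formatted_; }
private:
  void on_data_avail() {
    formatted_ = gps_.read_data();  // pull from the GPS
    push_data_avail();              // then notify downstream consumers
  }
  Gps& gps_;
  Position formatted_{0.0, 0.0};
};

class NavDisplay {
public:
  explicit NavDisplay(Airframe& airframe) : airframe_(airframe) {
    airframe_.subscribe([this] {
      shown_ = airframe_.read_data();  // pull from the Airframe
      ++updates_;
    });
  }
  Position shown() const { return shown_; }
  int updates() const { return updates_; }
private:
  Airframe& airframe_;
  Position shown_{0.0, 0.0};
  int updates_ = 0;
};
```

A single call to `Gps::pulse` therefore triggers two push notifications and two pulls, one per hop, which is the end-to-end path whose latency must fit within one frame.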
2.2 Common Types of Excessive Generality in
Using the BasicSP scenario depicted in Figure 1, we describe
key types of excessive middleware generality manifested in PLA-
based DRE systems. The challenges for each type of generality are
shown in Figure 2 and discussed below. The figure depicts a stan-
dard distribution middleware architecture, i.e., Real-time CORBA,
and the numbers in the figure indicate the parts of the middleware
architecture where sources of excessive generality occur.
Challenge 1. Overly extensible object-oriented (OO) frame-
works. Middleware is often developed using OO frameworks that
can be extended and configured with alternative implementations of
key components, such as different types of transport protocols (e.g.,
TCP/IP, VME, or shared memory), event demultiplexing mecha-
nisms (e.g., reactive-, proactive-, or thread-based), request demulti-
plexing strategies (e.g., dynamic hashing, perfect hashing, or active
demuxing), and concurrency models (e.g., thread-per-connection,
thread pool, or thread-per-request). A particular DRE product variant,
however, may only use a small subset of the framework alternatives.
As a result, general-purpose middleware may be overly
extensible, i.e., contain overhead for indirection and
dynamic dispatching that is unnecessary in a particular context.
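The cost of this kind of extensibility can be illustrated with a small C++ sketch. It contrasts a run-time strategy (virtual call on every request) with a variant whose transport is fixed at compile time. The names `Transport`, `VmeTransport`, `GeneralOrb`, and `SpecializedOrb` are hypothetical and chosen for illustration; they are not TAO classes, and the sketch does not claim to reproduce the transformations applied by the tools described in this paper.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// General-purpose form: the transport is a run-time strategy, so every
// send() goes through a virtual call.
class Transport {
public:
  virtual ~Transport() = default;
  virtual std::size_t send(const std::string& msg) = 0;
};

class VmeTransport final : public Transport {
public:
  std::size_t send(const std::string& msg) override {
    bytes_sent_ += msg.size();
    return msg.size();
  }
  std::size_t bytes_sent() const { return bytes_sent_; }
private:
  std::size_t bytes_sent_ = 0;
};

class GeneralOrb {
public:
  explicit GeneralOrb(std::unique_ptr<Transport> t)
      : transport_(std::move(t)) {}
  std::size_t invoke(const std::string& request) {
    return transport_->send(request);  // dynamic dispatch via base pointer
  }
private:
  std::unique_ptr<Transport> transport_;
};

// Specialized form: the product variant is known to use one transport,
// so the concrete type is fixed at compile time and the call can be
// resolved statically (and potentially inlined).
template <typename ConcreteTransport>
class SpecializedOrb {
public:
  std::size_t invoke(const std::string& request) {
    return transport_.send(request);  // call on a concrete, final type
  }
  const ConcreteTransport& transport() const { return transport_; }
private:
  ConcreteTransport transport_;
};
```

Both forms produce the same observable behavior; the specialized one simply gives up the ability to swap transports at run time, which is acceptable only when the transport is an invariant of the product variant.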
In the BasicSP scenario, for instance, the transport protocol is
VME, the event demultiplexing mechanism is reactive, the request
our work, we have identified additional CORBA architectural structures
that are amenable to optimization via specialization. Technical
challenges remaining include extending the automatic C program
techniques described in  to richer object-oriented languages,
such as C++ and Java, that place a greater emphasis on
dynamically created data.
Schultz et al. describe an automatic program specialization
technique for Java wherein they use language-level mechanisms
to eliminate virtual dispatch overhead. While our focus is
also on eliminating such kinds of overhead, our approach relies
on language-independent mechanisms. Le Meur et al. describe
a module-based language similar in syntax to the C language to enable
non-experts to describe the program and data structures that
need to be specialized. A special compiler to synthesize metadata
for the Tempo partial evaluator has been developed. Our approach
is similar to that of Le Meur et al.; however, instead of a special
language and compiler, we use XML, middleware annotations, and
Perl-driven transformations to automate the specializations.
5.2 Aspect-Oriented Programming (AOP)
Prior work [35, 9] has applied aspect-oriented programming (AOP)
techniques to factor out cross-cutting middleware features, such as
portable interceptors, (de)marshaling routines, and dynamic typing. Some
specializations described in this paper can be implemented using
AOP. The primary difference is that our specializations focus more
on the transformations (woven code) required to specialize mid-
dleware, whereas AOP is more of a delivery mechanism to realize
5.3 Empirically-guided Optimizers
The ATLAS numerical algebra library uses an empirical
optimization engine to set the values of optimization parameters
by generating different program versions that are run on various
hardware/OS platforms. The output from these runs is used to
select parameter values that maximize performance. Similarly, our
GNU autoconf specializations run empirical benchmarks on the
target deployment platform to determine the OS, compiler, and
hardware parameters that maximize performance.
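The general idea of empirically guided selection can be sketched as a small harness that measures each candidate and keeps the cheapest one. This is a minimal illustration, not the ATLAS engine and not the GNU autoconf tests used in this work; `Candidate`, `select_best`, and `time_once_ms` are hypothetical names, and the measurement function is injected so it can be replaced by a benchmark appropriate to the target platform.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// One candidate configuration, e.g. a code variant or a parameter setting.
struct Candidate {
  std::string name;
  std::function<void()> run;  // workload executed under this configuration
};

// Returns the index of the candidate with the lowest measured cost.
// Ties are resolved in favor of the earliest candidate.
std::size_t select_best(
    const std::vector<Candidate>& candidates,
    const std::function<double(const Candidate&)>& measure) {
  assert(!candidates.empty());
  std::size_t best = 0;
  double best_cost = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    const double cost = measure(candidates[i]);
    if (cost < best_cost) {
      best_cost = cost;
      best = i;
    }
  }
  return best;
}

// A simple wall-clock measurement of one run, in milliseconds. A real
// harness would repeat the run and use a robust statistic.
double time_once_ms(const Candidate& c) {
  const auto start = std::chrono::steady_clock::now();
  c.run();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

On a target platform one would call `select_best(candidates, time_once_ms)` and record the winning configuration at build time, so that no selection logic remains on the run-time path.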
5.4 Code Synthesis Techniques
`C (tick-C) extends ANSI C to provide dynamic code generation
capabilities. `C provides code specifications that capture values
of run-time constants. The `C implementation, tcc, is a compiler that
translates `C to C and to assembly code. Our FOCUS approach
differs from `C as follows: (1) it captures the code transformations
required to optimize code for a run-time constant and (2) it provides
only a source-to-source transformation.
The Synthesis Kernel generated custom system calls for
specific situations to collapse layers and eliminate unnecessary pro-
cedure calls. In this approach, specialized kernel code is dynami-
cally synthesized to improve performance. This approach has been
extended to use incremental specialization techniques. For exam-
ple,  have identified several invariants for an OS read() call
on HP/UX. Our work extends the range of specializations to en-
compass middleware invariants in the context of PLA-based DRE
systems, which have some different constraints. For example, we
do not consider dynamic re-plugging costs since it would unduly
increase jitter for product variants in many DRE systems.
6. CONCLUDING REMARKS
This paper describes how context-specific specializations can be
automated and applied to optimize excessive generality in standards-based
middleware implementations used for PLAs. We applied
specializations based on the Bold Stroke avionics mission computing
PLA to optimize the TAO Real-time CORBA ORB. Our results
showed the throughput of the Bold Stroke BasicSP scenario improved
by ∼65%, its average- and worst-case end-to-end latency
measures improved by ∼43% and ∼45%, respectively, and its pre-
dictability improved by a factor of two, without affecting porta-
bility, standard middleware APIs, or application software imple-
mentations, while preserving interoperability wherever possible.
These improvements are particularly notable since TAO has already
been tuned via many general-purpose middleware optimizations
[22, 24]. We also described how GNU autoconf and FOCUS
were used to automate the middleware specializations described
in the paper. FOCUS has been integrated with the open-source
TAO release available from www.dre.vanderbilt.edu/
The remainder of this section discusses the consequences and
implications of our specialization techniques and tools.
Implications on QoS. The specializations discussed in this paper
had no inter-dependencies, i.e., the specializations do not overlap in
the end-to-end code path. As middleware and system architects develop
a catalog of specializations, it will be necessary to document
the interplay between the specializations and analyze the implications
of mixing and matching different specializations. Similarly,
not all the specializations will be applicable to every PLA applica-
tion scenario, so PLA developers will need to work in conjunction
with middleware developers to determine the applicability of the
different specialization techniques to product variants.
Quantitative results show that improvements from applying our
specializations can be scenario-specific. For example, the demar-
shaling results showed how a complicated structure benefited more
from the specialization than a simple type. When the specialized
path is traversed more often, therefore, its influence on end-to-end
performance is more significant.
Implications on adaptability. Our specialization mechanisms do
not consider adaptation costs, i.e., the overhead of handling and
recovering from situations where the invariance assumptions are
violated. Adding such mechanisms requires activities (such as loading
new libraries or adding run-time checks) that can incur consider-
able jitter, and thus are not desirable for DRE systems.
Implications on schedulability. In many DRE systems, real-time
tasks are scheduled and analyzed offline to ensure they complete
before their deadlines. Latency overheads caused by general-purpose
middleware implementations may cause deadline misses for criti-
cal tasks scheduled a priori. Applying our specializations could
reduce middleware overhead considerably, helping ensure that crit-
ical tasks complete before their deadlines. Our optimizations might
also enable such tasks to finish well ahead of their deadlines, thereby
increasing the total slack, i.e., time interval available for scheduling
other tasks (such as soft real-time tasks), in the system. More avail-
able slack could potentially increase the number of schedulable soft
real-time tasks in the system.
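The relationship between middleware overhead and slack can be made concrete with a small calculation. The sketch below assumes a deliberately simple model in which the critical tasks of one frame run back to back, so slack is the frame period minus their total execution time. The function names and all numbers in the usage note are hypothetical and are not measurements from this paper.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Idle time left in one frame after the listed tasks have run back to
// back. A negative result means the tasks overrun the frame.
double frame_slack_ms(double frame_ms, const std::vector<double>& exec_ms) {
  const double busy = std::accumulate(exec_ms.begin(), exec_ms.end(), 0.0);
  return frame_ms - busy;
}

// True when the tasks fit into the frame, i.e. the slack is non-negative.
bool fits_in_frame(double frame_ms, const std::vector<double>& exec_ms) {
  return frame_slack_ms(frame_ms, exec_ms) >= 0.0;
}
```

For instance, with a 50 ms frame and hypothetical task times of 10, 15, and 5 ms, the slack is 20 ms; if specialization halved each task time, the slack would grow to 35 ms, leaving more room for soft real-time tasks.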
7. ADDITIONAL AUTHORS
Venkatesh Prasad Ranganath (Kansas State University, email:
email@example.com) and John Hatcliff (Kansas State University,
8. REFERENCES
[1] A.-F. Le Meur, J. L. Lawall, and C. Consel.
Specialization Scenarios: A Pragmatic Approach to
Declaring Specialization Scenarios. Higher-Order and
Symbolic Computation, 17(1), March 2004.
[2] P. Clements and L. Northrop. Software Product Lines:
Practices and Patterns. Addison-Wesley, Boston, 2002.
[3] J. Coplien, D. Hoffman, and D. Weiss. Commonality and
Variability in Software Engineering. IEEE Software, 15(6),
[4] G. Daugherty. A Proposal for the Specialization of HA/DRE
Systems. In Proceedings of the ACM SIGPLAN 2004
Symposium on Partial Evaluation and Program
Manipulation (PEPM 04), Verona, Italy, Aug. 2004. ACM.
[5] B. S. Doerr and D. C. Sharp. Freeing Product Line
Architectures from Execution Dependencies. In Proceedings
of the 11th Annual Software Technology Conference, Apr.
[6] M. Fowler, K. Beck, J. Brant, W. Opdyke, and D. Roberts.
Refactoring - Improving the Design of Existing Code.
Addison-Wesley, Reading, Massachusetts, 1999.
[7] G. Muller, R. Marlet, E.-N. Volanschi, C. Consel,
C. Pu, and A. Goel. Fast, Optimized Sun RPC Using
Automatic Program Specialization. In ICDCS ’98:
Proceedings of the 18th International Conference on
Distributed Computing Systems, page 240, Washington, DC,
USA, 1998. IEEE Computer Society.
[8] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design
Patterns: Elements of Reusable Object-Oriented Software.
Addison-Wesley, Reading, MA, 1995.
[9] C. Zhang, D. Gao, and H.-A. Jacobsen. Towards Just In Time
Middleware Architectures. In Proceedings of the 2005
Aspect Oriented Software Engineering Conference (AOSD),
[10] J. Greenfield, K. Short, S. Cook, and S. Kent. Software
Factories: Assembling Applications with Patterns, Models,
Frameworks, and Tools. John Wiley & Sons, New York,
[11] J. Hatcliff. An Introduction to Online and Offline Partial
Evaluation using a Simple Flowchart Language. Partial
Evaluation – Practice and Theory DIKU 1998 International
Summer School, Springer Verlag, 1706:20 – 82, Jun 1998.
[12] V. Itkin. On Partial and Mixed Program Execution. In
Program Optimization and Transformation, pages 17–30.
CCN, 1983. (In Russian).
[13] N. Jones, C. Gomard, and P. Sestoft. Partial Evaluation and
Automatic Program Generation. Englewood Cliffs, NJ:
Prentice Hall, 1993.
[14] K. Yotov, X. Li, G. Ren, et al. A
Comparison of Empirical and Model-driven Optimization. In
Proceedings of ACM SIGPLAN conference on Programming
Language Design and Implementation, June 2003.
[15] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. V.
Lopes, J.-M. Loingtier, and J. Irwin. Aspect-Oriented
Programming. In Proceedings of the 11th European
Conference on Object-Oriented Programming, pages
220–242, June 1997.
[16] R. Marlet, S. Thibault, and C. Consel. Efficient
Implementations of Software Architectures via Partial
Evaluation. Automated Software Engineering: An
International Journal, 6(4):411–440, October 1999.
[17] M. Poletto, W. C. Hsieh, D. R. Engler, and
M. F. Kaashoek. `C and tcc: A Language and
Compiler for Dynamic Code Generation. ACM Transactions
on Programming Languages and Systems, 21(2):324–369,
[18] Object Management Group. Real-time CORBA Specification,
OMG Document formal/02-08-02 edition, Aug. 2002.
[19] C. O’Ryan, F. Kuhns, D. C. Schmidt, O. Othman, and
J. Parsons. The Design and Performance of a Pluggable
Protocols Framework for Real-time Distributed Object
Computing Middleware. In Proceedings of the Middleware
2000 Conference. ACM/IFIP, Apr. 2000.
[20] D. L. Parnas. On the Design and Development of Program
Families. IEEE Transactions on Software Engineering,
[21] C. Pu, T. Autrey, A. Black, C. Consel, C. Cowan,
J. Inouye, L. Kethana, J. Walpole, and K. Zhang. Optimistic
Incremental Specialization: Streamlining a Commercial
Operating System. In Symposium on Operating Systems
Principles, Copper Mountain Resort, Colorado, Dec. 1995.
[22] I. Pyarali, C. O’Ryan, D. C. Schmidt, N. Wang, V. Kachroo,
and A. Gokhale. Using Principle Patterns to Optimize
Real-time ORBs. IEEE Concurrency Magazine, 8(1), 2000.
[23] I. Pyarali and D. C. Schmidt. An Overview of the CORBA
Portable Object Adapter. ACM StandardView, 6(1), Mar.
[24] I. Pyarali, D. C. Schmidt, and R. Cytron. Techniques for
Enhancing Real-time CORBA Quality of Service. IEEE
Proceedings Special Issue on Real-time Systems, 91(7), July
[25] R. van Ommering, F. van der Linden, J. Kramer, and J. Magee. The
Koala Component Model for Consumer Electronics
Software. IEEE Computer, 33(3):78–85, Mar. 2000.
[26] D. C. Schmidt and S. D. Huston. C++ Network
Programming, Volume 2: Systematic Reuse with ACE and
Frameworks. Addison-Wesley, Reading, Massachusetts,
[27] D. C. Schmidt, D. L. Levine, and S. Mungee. The Design
and Performance of Real-time Object Request Brokers.
Computer Communications, 21(4):294–324, Apr. 1998.
[28] D. C. Schmidt, M. Stal, H. Rohnert, and F. Buschmann.
Pattern-Oriented Software Architecture: Patterns for
Concurrent and Networked Objects, Volume 2. Wiley &
Sons, New York, 2000.
[29] U. P. Schultz, J. L. Lawall, and C. Consel. Automatic
program specialization for Java. ACM Trans. Program. Lang.
Syst., 25(4):452–499, 2003.
[30] D. C. Sharp. Reducing Avionics Software Cost Through
Component Based Product Line Development. In
Proceedings of the 10th Annual Software Technology
Conference, Apr. 1998.
[31] D. C. Sharp and W. C. Roll. Model-Based Integration of
Reusable Component-Based Avionics System. In Proc. of
the Workshop on Model-Driven Embedded Systems in RTAS
2003, May 2003.
[32] J. Sztipanovits and G. Karsai. Model-Integrated Computing.
IEEE Computer, 30(4):110–112, Apr. 1997.
[33] A. van Deursen, P. Klint, and J. Visser. Domain-Specific
[34] B. White, J. Lepreau, et al. An Integrated Experimental
Environment for Distributed Systems and Networks. In
Proceedings of the Fifth Symposium on Operating Systems
Design and Implementation, pages 255–270, Boston, MA,
Dec. 2002. USENIX Association.
[35] C. Zhang and H. Jacobsen. Re-factoring Middleware with
Aspects. IEEE Transactions on Parallel and Distributed
Systems, 14(11):1058–1073, Nov 2003.