Lighting Control System Using an Actor-Critic Type Learning Algorithm
Faculty of Life and Medical Sciences
Graduate School of Engineering
Department of Science and Engineering
Abstract—A novel lighting control system using the Actor-Critic
algorithm was developed, in which users can set the
brightness of the system through sensory operation, such as
“much brighter” or “slightly darker”. To support such operation, the
system must learn two types of state, i.e., the demands of the user and
the brightness around the user. The Actor-Critic algorithm
was applied for this purpose, and a simplified algorithm was
developed. The effectiveness and usefulness of the proposed
algorithm are discussed here through numerical simulations.
Keywords-Lighting; Sensory scale; Reinforcement learning;
I. INTRODUCTION
Traditional lighting systems control multiple banks of
lights at once. In the near future, new devices such as
light-emitting diodes (LEDs) and organic light-emitting diodes
(OLEDs) will change lighting environments, and the number
of lights to be controlled will increase dramatically. At
present, it is difficult to control lights in such numbers
using conventional control systems. Methods that control
these new devices on an individual basis will allow the
system to perform intelligent actions and achieve various
lighting environments. The lighting system user interface (UI)
will also have to change to address this increase in the
number of lights, because adjusting the brightness of many
lights on an individual basis places a large burden on the
user. A UI capable of interpreting users’ sensory indications,
such as “much brighter” or “slightly darker”, would be very
convenient for users. However, the meaning of sensory scale
terms such as “much” or “slightly” differs for each user. To
address this issue, we have developed a learning mechanism
based on an Actor-Critic algorithm, which is a reinforcement
learning method, to allow the system to learn the
users’ sensory lighting requirements. This system changes
the illuminance according to two states: the user’s sensory
indication and the degree of change in illuminance. The
amount of brightness change corresponds to the user’s
sensory indication. However, the appropriate amount of
change can vary with the present brightness level even for
the same sensory indication, because users’ perceptions
differ with variation in the brightness of the surrounding
environment.
Thus, the Actor-Critic algorithm should learn two types of state:
the user’s sensory indication and the degree of change in
brightness. When there are m choices for the sensory indication
and n choices for the degree of change in brightness, the
total number of combined conditions is mn. The conventional
Actor-Critic algorithm must learn all of these combined
conditions. Moreover, the target is a lighting system,
which we assume will have many possible conditions. This
increases the number of possible states and results in huge
computational costs. The time required for learning should
be as short as possible, so an efficient learning algorithm
is necessary for this system. To make learning of the users’
sensory scale more efficient, we propose a Two-Actor-Critic
algorithm in this paper, which is an Actor-Critic
algorithm with two types of Actor applied to the two types
of state.
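The growth in the number of conditions can be made concrete with a small count. The sketch below is only an illustration: the function names, the idea that each Actor keeps one table indexed by a single state type, and the example sizes are assumptions, not details given in this paper.

```python
def joint_entries(m: int, n: int, num_actions: int) -> int:
    """Actor entries when every (indication, brightness) pair is its own state."""
    return m * n * num_actions


def separate_entries(m: int, n: int, num_actions: int) -> int:
    """Entries if one actor is indexed by indication and another by brightness.

    Assumption: this split is an illustration, not the structure defined in the paper.
    """
    return (m + n) * num_actions


if __name__ == "__main__":
    # Hypothetical sizes: 4 sensory indications, 8 brightness levels, 5 actions.
    print(joint_entries(4, 8, 5), separate_entries(4, 8, 5))  # prints "160 60"
```

Under these assumed sizes the combined table grows with the product mn, whereas two separate tables grow with the sum m + n, which is one way to see why fewer entries would need to be learned.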
II. LIGHTING CONTROL SYSTEM USING SENSORY OPERATION
As described in the previous section, it is convenient for
users to operate a lighting system through sensory instructions
such as “much brighter” or “slightly darker”. This section gives
an overview of the lighting control system using sensory
operation and describes its requirements.
An overview of the proposed system is shown in Fig. 1.
This system consists of a control computer, control devices,
lights, and illuminance sensors. Each light can be controlled
by the control computer on an individual basis. The
User 2”, in which we increased the number of states for
the illuminance around the user. Whereas the threshold values
of Virtual User 1 were set at intervals of 1000 lx,
those of Virtual User 2 were set at intervals of
500 lx (0-500 lx, 500-1000 lx, 1000-1500 lx, 1500-2000 lx,
2000-2500 lx, 2500-3000 lx, 3000-3500 lx, and
3500-∞ lx). The experimental results are shown in Fig. 8,
and the parameters of each algorithm were the same as those
in the experiments described in Sections IV-A and IV-B.
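The threshold intervals listed for Virtual User 2 amount to a discretization of illuminance into eight states. A minimal sketch of such a mapping follows; the function names and the choice to treat each lower bound as inclusive are assumptions, since the text does not say how boundary values are assigned.

```python
from bisect import bisect_right
from typing import List


def make_thresholds(step_lx: int, last_edge_lx: int) -> List[int]:
    """Interior bin edges: step, 2*step, ..., last_edge (all in lx)."""
    return list(range(step_lx, last_edge_lx + 1, step_lx))


def illuminance_state(lux: float, thresholds: List[int]) -> int:
    """Index of the interval containing lux (lower bound inclusive, by assumption)."""
    return bisect_right(thresholds, lux)


if __name__ == "__main__":
    # Virtual User 2: edges every 500 lx up to 3500 lx, giving 8 states.
    edges = make_thresholds(500, 3500)
    for lux in (0, 499, 500, 1250, 3500, 10000):
        print(lux, illuminance_state(lux, edges))
```

With seven edges the function returns indices 0 to 7, matching the eight intervals from 0-500 lx to 3500-∞ lx.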
Fig. 8. Comparison of Two-Actor-Critic Algorithm and Conventional Actor-Critic Algorithm for Virtual User 2
As shown in Fig. 8, the difference in convergence speed
between the Two-Actor-Critic algorithm and the
conventional Actor-Critic algorithm was larger than that in
Fig. 7. These results indicate that learning
efficiency can be improved by applying two types of Actor
to two types of state, as in the Two-Actor-Critic algorithm.
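One possible reason that separate Actors could converge faster is that an update made under one brightness state also informs other brightness states sharing the same sensory indication. The sketch below illustrates that effect only. The class name, the additive combination of two preference tables, the softmax policy, and the update rule (the textbook preference update driven by a TD error) are all assumptions and should not be read as the algorithm defined in this paper.

```python
import math
from typing import List


class TwoActorSketch:
    """Hypothetical factored actor: one preference table per state type."""

    def __init__(self, m: int, n: int, num_actions: int, alpha: float = 0.1):
        self.p_indication = [[0.0] * num_actions for _ in range(m)]
        self.p_brightness = [[0.0] * num_actions for _ in range(n)]
        self.alpha = alpha  # actor step size (assumed value)

    def policy(self, s_ind: int, s_bri: int) -> List[float]:
        """Softmax over the sum of the two actors' preferences."""
        prefs = [a + b for a, b in
                 zip(self.p_indication[s_ind], self.p_brightness[s_bri])]
        peak = max(prefs)  # subtract the maximum for numerical stability
        weights = [math.exp(p - peak) for p in prefs]
        total = sum(weights)
        return [w / total for w in weights]

    def update(self, s_ind: int, s_bri: int, action: int, td_error: float) -> None:
        """Move both actors' preferences for the taken action along the TD error."""
        self.p_indication[s_ind][action] += self.alpha * td_error
        self.p_brightness[s_bri][action] += self.alpha * td_error


if __name__ == "__main__":
    agent = TwoActorSketch(m=2, n=2, num_actions=3)
    agent.update(0, 0, action=1, td_error=1.0)  # TD error would come from a critic
    print(agent.policy(0, 0)[1], agent.policy(0, 1)[1], agent.policy(1, 1)[1])
```

In this sketch a single positive update at state pair (0, 0) also raises the probability of the same action at (0, 1), while (1, 1) stays uniform. A combined table over all pairs would have changed only the entry for (0, 0).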
Here, we proposed a lighting control system based on sensory
operation to minimize the operation burden associated
with a lighting system. To realize sensory operation, this
system learns the users’ sensory scale, i.e., what each user
means by terms such as “very” or “slightly”, using an
Actor-Critic algorithm. The system
must learn efficiently to decrease user burden. There are
two types of state in the target environment of this system,
and the conventional Actor-Critic algorithm has to learn every
combination of the two types of state. To improve learning
efficiency, we proposed a Two-Actor-Critic algorithm, i.e., a
learning algorithm that applies two types of Actor to the
two types of state. We verified the effectiveness
of this algorithm by experiments using virtual users. The
results indicated that this algorithm has learning accuracy
equivalent to that of the conventional Actor-Critic algorithm.
In addition, the proposed algorithm was shown to
learn faster than the conventional Actor-Critic
algorithm. Further studies are required to verify its
effectiveness under different environments, and experiments
with real human users should also be performed.