
Yijun Yu
The Open University (UK) · Centre for Research in Computing (CRC)
PhD, Shanghai Fudan University
About
220 Publications
53,371 Reads
4,115 Citations
Introduction
Yijun Yu currently works at the Centre for Research in Computing (CRC), The Open University (UK). Yijun does research in Software Engineering, Programming Languages and Parallel Computing.
Additional affiliations
October 2006 - present
January 2003 - September 2006
January 1999 - December 2002
Publications (220)
In programming, learning code representations has a variety of applications, including code classification, code search, comment generation, bug prediction, and so on. Various representations of code in terms of tokens, syntax trees, dependency graphs, code navigation paths, or a combination of their variants have been proposed; however, existing v...
Video data is multimodal in its nature, where an utterance can involve linguistic, visual and acoustic information. Therefore, a key challenge for video sentiment analysis is how to combine different modalities for sentiment recognition effectively. The latest neural network approaches achieve state-of-the-art performance, but they neglect to a lar...
Video sentiment analysis as a decision-making process is inherently complex, involving the fusion of decisions from multiple modalities and the so-caused cognitive biases. Inspired by recent advances in quantum cognition, we show that the sentiment judgment from one modality could be incompatible with the judgment from another, i.e., the order matt...
Multimodal video sentiment analysis is a rapidly growing area. It combines verbal (i.e., linguistic) and non-verbal modalities (i.e., visual, acoustic) to predict the sentiment of utterances. A recent trend has been geared towards different modality fusion models utilizing various attention, memory and recurrent components. However, there lacks a s...
Context: With the prevalence of publicly available source code repositories to train deep neural network models, neural program models can do well in source code analysis tasks such as predicting method names in given programs that cannot be easily done by traditional program analysis techniques. Although such neural program models have been tested...
Building deep learning models on source code has found many successful software engineering applications, such as code search, code comment generation, bug detection, code migration, and so on. Current learning techniques, however, have a major drawback that these models are mostly trained on datasets labeled for particular downstream tasks, and co...
Recently program learning techniques have been proposed to process source code based on syntactical structures (e.g., Abstract Syntax Trees) and/or semantic information (e.g., Dependency Graphs). Although graphs may be better at capturing various viewpoints of code semantics than trees, constructing graph inputs from code needs static code semantic...
Empowering end-users to program robots is becoming more significant. Introducing software engineering principles into end-user programming could improve the quality of the developed software applications. For example, model-driven development improves technology independence and adaptive systems act upon changes in their context of use. However, en...
With the prevalence of publicly available source code repositories to train deep neural network models, neural program analyzers can do well in source code analysis tasks such as predicting method names in given programs that cannot be easily done by traditional program analyzers. Although such analyzers have been tested on various existing dataset...
While teaching the art of Computer Programming, students with visual impairments (VI) are disadvantaged, because speech is their preferred modality. Existing accessibility assistants can only read out predefined texts sequentially, word-for-word, sentence-for-sentence, whilst the presentations of programming concepts could be conveyed in a more str...
Systems-of-systems are formed by the composition of independently created software components. These components are designed to satisfy their individual requirements, rather than the global requirements of the systems-of-systems. We refer to components that cannot be adapted to meet both individual and global requirements as "defiant" components. I...
Unmanned Aerial Vehicles (UAVs), or drones, are increasingly expected to operate in spaces populated by humans while avoiding injury to people or damaging property. However, incidents and accidents can, and increasingly do, happen. Traditional investigations of aircraft incidents require on-board flight data recorders (FDRs); however, these physica...
To save effort, developers often translate programs from one programming language to another, instead of implementing them from scratch. Translating application program interfaces (APIs) used in one language to functionally equivalent ones available in another language is an important aspect of program translation. Existing approaches facilitate the...
To save manual effort, developers often translate programs from one programming language to another, instead of implementing them from scratch. Translating application program interfaces (APIs) used in one language to functionally equivalent ones available in another language is an important aspect of program translation. Existing approaches facilita...
Drone simulators can provide an abstraction of different applications of drones and facilitate reasoning about distinct situations, in order to evaluate the effectiveness of these applications. In this paper we describe Dragonfly, a simulator of the behaviours of individual drones and collections of drones in various environments, involving random contextu...
Security and privacy can often be considered from two perspectives. The first perspective is that of the attacker who seeks to exploit vulnerabilities of the system to harm assets such as the software system itself or its users. The second perspective is that of the defender who seeks to protect the assets by minimising the likelihood of attacks on...
New challenges such as big data, ultra-large-scale services, and continuously available services are driving the evolution to adaptive software systems, which are able to modify their behavior in response to their environmental and internal changes, in order to achieve their goals. Providing support in all phases of the life cycle of adaptive softw...
This book discusses the problems and challenges in the interdisciplinary research field of self-adaptive software systems. Modern society is increasingly filled with software-intensive systems, which are required to operate in more and more dynamic and uncertain environments. These systems must monitor and control their environment while adapting t...
During software maintenance and evolution, developers need to deal with a large number of change requests by modifying existing code or adding code to the system. Tackling change requests efficiently calls for accurately localising software changes, i.e. identifying which code is problematic and where new files should be added for any typ...
Formal methods have been applied widely to verifying the safety requirements of communication-based train control (CBTC) systems, while the problem situations could be much simplified. In industrial practices of CBTC systems, however, huge complexity arises, which renders those methods nearly impossible to apply. In this paper, we aim to reduce the...
Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code? Information retrieval (IR) approaches see the bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the...
As the number, complexity, and heterogeneity of connected devices in the Internet of Things (IoT) increase, so does our need to secure these devices, the environment in which they operate, and the assets they manage or control. Collaborative security exploits the capabilities of these connected devices and opportunistically composes them to protect...
In an adaptive security-critical system, security mechanisms change according to the type of threat posed by the environment. Specifying the behavior of these systems is difficult because conditions of the environment are difficult to describe until the system has been deployed and used for a length of time. This paper defines the problem of adapta...
Some user needs can only be met by leveraging the capabilities of others to undertake particular tasks that require intelligence and labor. Crowdsourcing such capabilities is one way to achieve this. But providing a service that leverages crowd intelligence and labor is a challenge, since various factors need to be considered to enable reliable ser...
Security bug reports can describe security critical vulnerabilities in software. Bug tracking systems may contain thousands of bug reports, where relatively few of them are security related. Therefore finding unlabelled security bugs among them can be challenging. To help security engineers identify these reports quickly and accurately, text-based...
Towards the vision of automatically translating code that implements an algorithm from one programming language into another, this paper proposes an approach for automated program classifications using bilateral tree-based convolutional neural networks (BiTBCNNs). It is layered on top of two tree-based convolutional neural networks (TBCNNs), each o...
The timing requirements of embedded cyber-physical systems (CPS) constrain CPS behaviors made by scheduling analysis. The lack of modeling of physical entity properties and the need for scheduling analysis require a systematic approach to specify timing requirements of CPS at the early phase of requirements engineering. In this work, we extend the Problem...
Digital evidence needs to be made persistent so that it can be used later. For citizen forensics, sometimes intelligence cannot or should not be made persistent forever. In this position paper, we propose a form of snap forensics by defining an elastic duration of evidence/intelligence validity. Explicitly declaring such a duration could unify the...
This article describes how earlier detection of security problems and the implementation of solutions would be a cost-effective approach for developing secure software systems. Developing, gathering and sharing similar repeatable programming knowledge and solutions led to the introduction of patterns in the 1990s. The same concept has been adopt...
Empowering end-users to wire Internet of Things (IoT) objects (things and services) together would allow them to more easily conceive and realize interesting IoT solutions. A challenge lies in devising a simple end-user development approach to support the specification of transformations, which can bridge the mismatch in the data being exchanged am...
Search based software engineering has been extensively applied to the problem of finding improved modular structures that maximise cohesion and minimise coupling. However, there has, hitherto, been no longitudinal study of developers’ implementations, over a series of sequential releases. Moreover, results validating whether developers respect the...
In this paper we present our ongoing work to build an approach to empower users of IoT-based cyber physical systems to protect their privacy by themselves. Our approach allows users to identify the privacy risks involved in sharing private data with a data consumer, assess the value of their private data based on identified risks and take a pragmat...
The Malaysia Airlines (MH370) aircraft went missing somewhere over the Indian Ocean two years ago. After an intensive search since then, the international team has still not been able to locate any first-hand evidence from the missing plane's flight data recorders (also known as 'black boxes'). To mitigate similar problems, a proposal has been made to ana...
Software applications that are very large-scale can encompass hundreds of complex user interfaces (UIs). Such applications are commonly sold as feature-bloated off-the-shelf products to be used by people with variable needs in the required features and layout preferences. Although many UI adaptation approaches were proposed, several gaps and limit...
Password managers address the usability challenge of authentication, i.e., to manage the effort in creating, memorising, and entering complex passwords for an end-user. Offering features such as creating strong passwords, managing an increasing number of complex passwords, and auto-filling of passwords for variable contexts, their security is as criti...
Police investigations involving digital evidence tend to focus on forensic examination of storage units on personal electronic devices (laptops, smartphones, etc). However, a number of factors are making digital forensic tools increasingly ineffective: (i) storage capacities of electronic devices have increased, and so has the amount of personal in...
Some user needs in real life can only be accomplished by leveraging the intelligence and labor of other people via crowdsourcing tasks. For example, one may want to confirm the validity of the description of a secondhand laptop by asking someone else to inspect the laptop on site. To integrate these crowdsourcing tasks into user applications, it is...
Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, where is it located in the source code files? Information retrieval (IR) approaches see a bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not r...
Cloud computing has become popular because of its benefits to users, which include having access to the resources they need at any time without having to invest in or manage an extensive computing infrastructure. However, users also lose control of the systems they depend on, which creates privacy and security concerns.
Examines how engineers can work to solve major aircraft disasters, focusing on missing flight MH370. It is not yet known what has really happened to the missing flight MH370; the plane could not be located from radar signals. Its last hourly ping signals to the Inmarsat satellite suggested that the flight headed toward the southern Indian Ocean, wh...
Software plays a crucial role in modern societies. People rely on it not only for their daily operations or business, but also for their lives. For this reason, correct and consistent behavior of software systems is a fundamental and critical part of end-user expectations. Additionally, businesses require cost-effective production, maint...
One of the challenges of any adaptive system is to ensure that users can understand how and why the behaviour of the system changes at runtime. This is particularly important for adaptive security behaviours which are essential for applications that are used in many different contexts, such as those hosted in the cloud. In this paper, we propose an...
Adaptive user interfaces (UIs) were introduced to address some of the usability problems that plague many software applications. Model-driven engineering formed the basis for most of the systems targeting the development of such UIs. An overview of these systems is presented and a set of criteria is established to evaluate the strengths and shortco...
In an uncertain and changing environment, a composite service needs to continuously optimize its business process and service selection through runtime adaptation. To achieve the overall satisfaction of stakeholder requirements, quality tradeoffs are needed to adapt the composite service in response to the changing environments. Existing approaches...
Socio-technical systems (STSs) consist of human, hardware and software agents that work in tandem to fulfill stakeholder requirements. A specification for an STS consists of a set of (social) commitments among participating agents that serve as a contract among them. However, by their very nature, STSs are open, dynamic and continuously evolving al...
Security is concerned with the protection of assets from intentional harm. Secure systems provide capabilities that enable such protection to satisfy some security requirements. In a world increasingly populated with mobile and ubiquitous computing technology, the scope and boundary of security systems can be uncertain and can change. A single func...
Self-adaptive access control, in which self-* properties are applied to protecting systems, is a promising solution for the handling of malicious user behaviour in complex infrastructures. A major challenge in self-adaptive access control is ensuring that chosen adaptations are valid, and produce a satisfiable model of access. The contribution of t...
One of the key challenges in cloud computing is the security of the consumer data stored and processed by cloud machines. When the usage context of a cloud application changes, or when the context is unknown, there is a risk that security policies are violated. To minimize this risk, cloud applications need to be engineered to adapt their security...
Many existing enterprise applications are at a mature stage in their development and are unable to easily benefit from the usability gains offered by adaptive user interfaces (UIs). Therefore, a method is needed for integrating adaptive UI capabilities into these systems without incurring a high cost or significantly disrupting the way they functio...
A self-adaptive system uses runtime models to adapt its architecture to the changing requirements and contexts. However, there is no one-to-one mapping between the requirements in the problem space and the architectural elements in the solution space. Instead, one refined requirement may crosscut multiple architectural elements, and its realization...
Goal-driven self-optimization through feedback loops has shown effectiveness in reducing oscillating utilities due to a large number of uncertain factors in the runtime environments. However, such self-optimization is less satisfactory when the predefined requirements goal models contain uncertainty, such as imprecise contributions and un...
A digital forensic investigation aims to collect and analyse the evidence necessary to demonstrate a potential hypothesis of a digital crime. Despite the availability of several digital forensics tools, investigators still approach each crime case from scratch, postulating potential hypotheses and analysing large volumes of data. This paper propose...
Following the “convention over configuration” paradigm, model-driven software development (MDSD) generates code to implement the “default” behaviour that has been specified by a template separate from the input model. On the one hand, developers can produce end-products without a full understanding of the templates; on the other hand, the tacit kno...
Security requirements are concerned with protecting assets of a system from harm. Implemented as code aspects to weave protection mechanisms into the system, security requirements need to be validated when changes are made to the programs during system evolution. However, it was not clear for developers whether existing validation procedures such a...
The principle of Separation of Concerns encourages developers to divide complex problems into simpler ones and solve them individually. Aspect-Oriented Programming (AOP) languages provide mechanisms to modularise concerns that affect several software components, by means of joinpoints, advice and aspect weaving. In a software system with multiple a...
Development of several computing and communication technologies is enabling the widespread availability of pervasive systems. In smart home applications, household appliances (such as security alarms, heating systems, doors and windows) are connected to home digital networks. These applications offer features that are typically developed by disparate...
Purpose – In any information security risk assessment, vulnerabilities are usually identified by information‐gathering techniques. However, vulnerability identification errors – wrongly identified or unidentified vulnerabilities – can occur as uncertain data are used. Furthermore, businesses' security needs are not considered sufficiently. Hence, s...
We propose the use of forensic requirements to drive the automation of a digital forensics process. We augment traditional reactive digital forensics processes with proactive evidence collection and analysis activities, and provide immediate investigative suggestions before an investigation starts. These activities adapt depending on suspicious eve...
Self-repairing approaches have been proposed to alleviate the runtime requirements satisfaction problem by switching to appropriate alternative solutions according to the feedback monitored. However, little has been done formally on analyzing the relations between specific environmental failures and corresponding repairing decisions, making it a ch...
Enterprise applications such as customer relationship management (CRM) and enterprise resource planning (ERP) are very large scale, encompassing millions of lines-of-code and thousands of user interfaces (UI). These applications have to be sold as feature-bloated off-the-shelf products to be used by people with diverse needs in required feature-set...