## Publications

Advances in synthesis and sequencing technologies have made DNA macromolecules an attractive medium for digital information storage. Compared with the ex vivo method that stores data in a non-biological environment, there have been considerations and attempts to store data in living organisms, also known as the in vivo method or live DNA due to sev...

In this paper, we present an explicit construction of list-decodable codes for single-deletion and single-substitution with list size two and redundancy 3log n+4, where n is the block length of the code. Our construction has lower redundancy than the best known explicit construction by Gabrys et al. (arXiv 2021), whose redundancy is 4log n+O(1).
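The error model in this abstract can be illustrated by brute force. The sketch below is not the construction from the paper. It assumes a binary alphabet and an error pattern of exactly one deletion followed by at most one substitution, and the names `error_ball` and `decode_list` are hypothetical. It shows only what it means for a received word to be consistent with several codewords.

```python
def deletion_ball(x):
    """All strings obtained from x by deleting exactly one symbol."""
    return {x[:i] + x[i + 1:] for i in range(len(x))}


def substitution_ball(y, alphabet="01"):
    """All strings obtained from y by substituting at most one symbol."""
    out = {y}
    for i, symbol in enumerate(y):
        for a in alphabet:
            if a != symbol:
                out.add(y[:i] + a + y[i + 1:])
    return out


def error_ball(x, alphabet="01"):
    """Outputs reachable from x by one deletion and at most one substitution."""
    return {z for y in deletion_ball(x) for z in substitution_ball(y, alphabet)}


def decode_list(received, code, alphabet="01"):
    """Brute-force list decoding: every codeword consistent with `received`."""
    return [c for c in code if received in error_ball(c, alphabet)]
```

For the toy code `["000", "111"]`, the received word `"01"` is consistent with both codewords, so the decoder returns a list of size two, while `"00"` is consistent only with `"000"`.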

In this article, under a general cost function $C$, we present a dynamic programming (DP) method to obtain an optimal sequential deterministic quantizer (SDQ) for $q$-ary input discrete memoryless channel (DMC). The DP method has complexity $O(q(N-M)^{2}M)$, where $N$ and $M$ are the alphabet sizes of the DMC output and quantizer...

The process of DNA-based data storage (DNA storage for short) can be mathematically modelled as a communication channel, termed DNA storage channel, whose inputs and outputs are sets of unordered sequences. To design error correcting codes for DNA storage channel, a new metric, termed the sequence-subset distance, is introduced, which generalize...

The process of DNA-based data storage (DNA storage for short) can be mathematically modelled as a communication channel, termed DNA storage channel, whose inputs and outputs are sets of unordered sequences. To design error correcting codes for DNA storage channel, a new metric, termed the sequence-subset distance, is introduced, which generalizes the Hamming distance to a distance between two sets of sequences.

The process of DNA data storage can be mathematically modelled as a communication channel, termed DNA storage channel, whose inputs and outputs are sets of unordered sequences. To design error correcting codes for DNA storage channel, a new metric, termed the sequence-subset distance, is introduced, which generalizes the Hamming distance to a dista...

We propose a coding method to transform binary sequences into DNA base sequences (codewords), namely sequences of the symbols A, T, C and G, that satisfy the following two properties: • Run-length constraint: The maximum run-length of each symbol in each codeword is at most three; • GC-content constraint: The GC-content of each codeword is close to...
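The two constraints named above can be checked on a candidate codeword with a few lines of code. The sketch below is only a checker, not the proposed coding method, and the function names are hypothetical. The GC-content target is truncated in the abstract, so the sketch reports the GC fraction without comparing it to any threshold.

```python
def max_run_length(seq):
    """Length of the longest run of identical consecutive symbols."""
    longest = 0
    current = 0
    previous = None
    for symbol in seq:
        current = current + 1 if symbol == previous else 1
        longest = max(longest, current)
        previous = symbol
    return longest


def gc_content(seq):
    """Fraction of symbols that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(symbol in "GC" for symbol in seq) / len(seq)


def satisfies_run_length(seq, limit=3):
    """True if no symbol repeats more than `limit` times in a row."""
    return max_run_length(seq) <= limit
```

For example, `satisfies_run_length("AAATCG")` holds, whereas `"AAAAT"` violates the constraint because A repeats four times.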

We prove that for any positive integers $n$ and $k$ such that $n\!\geq\! k\!\geq\! 1$, there exists an $[n,k]$ generalized Reed-Solomon (GRS) code that has a sparsest and balanced generator matrix (SBGM) over any finite field of size $q\!\geq\! n\!+\!\lceil\frac{k(k-1)}{n}\rceil$, where sparsest means that each row of the generator matrix has the l...

We consider locally repairable codes (LRC), aiming at sequentially recovering multiple erasures. We define the (n,k,r,t)-SLRC (Sequential Locally Repairable Code) as an [n,k] linear code where any t' (<= t) erasures can be sequentially recovered, each one by r (2<=r<k) other code symbols. Sequential recovery means that the erased symbols are re...

Locally repairable codes (LRC) for distributed storage allow two approaches to
locally repair multiple failed nodes: 1) parallel approach, by which each
newcomer accesses a set of $r$ live nodes ($r$ is the repair locality) to
download data and recover the lost packet; and 2) sequential approach, by which
the newcomers are properly ordered and each...

False data injection attacks (FDIAs) have been introduced as a critical class of cyber attacks against the smart grid's monitoring system. These attacks aim to compromise the readings of grid sensors and phasor measurement units. It was shown that FDIAs can pass the traditional bad data detection. Furthermore, to perform an FDIA, the attacker needs to ac...

Vandermonde and Cauchy matrices are commonly used in the constructions of maximum distance separable (MDS) codes. However, when additional design constraints are imposed on the code construction in addition to the MDS requirement, a Vandermonde or Cauchy matrix may not always suffice. We discuss some related coding problems of that nature that aris...

We consider the problem of designing [n,k] linear codes for distributed
storage systems (DSS) that satisfy the following (r,t)-local repair property:
(r,t)-Local Repair Property: Any t' (<= t) simultaneously failed nodes can be
locally repaired, each with locality r.
The parameters n,k,r,t are positive integers such that r<k<n and t <= n-k. We
cons...

We consider the complexities of repair algorithms for locally repairable
codes and propose a class of codes that repair single node failures using
addition operations only, or codes with addition based repair. We construct two
families of codes with addition based repair. The first family attains distance
one less than the Singleton-like upper boun...

We consider a simple multiple access network (SMAN), where $k$ sources of
unit rates transmit their data to a common sink via $n$ relays. Each relay is
connected to the sink and to certain sources. A coding scheme (for the relays)
is weakly secure if a passive adversary who eavesdrops on less than $k$
relay-sink links cannot reconstruct the data fr...

We consider the locality of encoding and decoding operations in distributed
storage systems (DSS), and propose a new class of codes, called locally
encodable and decodable codes (LEDC), that provides a higher degree of
operational locality compared to currently known codes. For a given locality
structure, we derive an upper bound on the global dist...

We study the network coding problem of sum-networks with 3 sources and n
terminals (3s/nt sum-network), for an arbitrary positive integer n, and derive
a sufficient and necessary condition for the solvability of a family of
so-called terminal-separable sum-network. Both the condition of
terminal-separable and the solvability of a terminal-separable...

We introduce a new family of erasure codes, called group decodable codes
(GDC), for distributed storage systems. Given a set of design parameters
{\alpha; \beta; k; t}, where k is the number of information symbols, each
codeword of an (\alpha; \beta; k; t)-group decodable code is a t-tuple of
strings, called buckets, such that each bucket is a string...

We investigate a simple multiple access network (SMAN) where $k$ independent sources of unit rates multicast their information to a set of sinks, via $n$ commonly shared relays. All links are assumed to have unit capacity. Given such a SMAN, a coding scheme for the relays is called optimal if each sink can retrieve all information from the sources...

The MDS property (aka the $k$-out-of-$n$ property) requires that if a file is
split into several symbols and subsequently encoded into $n$ coded symbols,
each being stored in one storage node of a distributed storage system (DSS),
then a user can recover the file by accessing any $k$ nodes. We study the
so-called $p$-decodable $\mu$-secure erasure...
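The $k$-out-of-$n$ property can be seen in the smallest nontrivial case, $n = 3$ and $k = 2$. The sketch below assumes a single XOR parity symbol and hypothetical function names. It illustrates the MDS property only, not the secure erasure codes studied in the paper.

```python
from itertools import combinations


def encode(a, b):
    """Store file symbols a, b and their XOR parity on three nodes."""
    return [a, b, a ^ b]


def decode(available):
    """Recover (a, b) from any two nodes, given as {node_index: symbol}."""
    if 0 in available and 1 in available:
        return available[0], available[1]
    if 0 in available and 2 in available:
        a = available[0]
        return a, a ^ available[2]
    if 1 in available and 2 in available:
        b = available[1]
        return b ^ available[2], b
    raise ValueError("need symbols from at least two distinct nodes")


def recoverable_from_any_two(a, b):
    """Check that every pair of nodes suffices to rebuild the file."""
    nodes = encode(a, b)
    return all(
        decode({i: nodes[i], j: nodes[j]}) == (a, b)
        for i, j in combinations(range(3), 2)
    )
```

Here each file symbol is an integer, and XOR is addition over a binary extension field, which is why any single missing symbol can be rebuilt from the other two.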

A sum-network is a directed acyclic network where each source independently
generates one symbol from a given field $\mathbb F$ and each terminal wants to
receive the sum (over $\mathbb F$) of the source symbols. For sum-networks
with two sources or two terminals, the solvability is characterized by the
connection condition of each source-termina...

We study the existence over small fields of Maximum Distance Separable (MDS)
codes with generator matrices having specified supports (i.e. having specified
locations of zero entries). This problem unifies and simplifies the problems
posed in recent works of Yan and Sprintson (NetCod'13) on weakly secure
cooperative data exchange, of Halbawi et al....

In wireless networks, getting the global knowledge of channel state information (CSI, e.g., channel gain or link loss probability) is always beneficial for the nodes to optimize the network design. However, the node usually only has the local CSI between itself and other nodes, and lacks the CSI between any pair of other nodes. To enable all the no...

A passive adversary can eavesdrop stored content or downloaded content of
some storage nodes, in order to learn illegally about the file stored across a
distributed storage system (DSS). Previous work in the literature focuses on
code constructions that trade storage capacity for perfect security. In other
words, by decreasing the amount of origina...

The encoding complexity of network coding for single multicast networks has been intensively studied from several aspects: e.g., the time complexity, the required number of encoding links, and the required field size for a linear code solution. However, these issues as well as the solvability are less understood for networks with multiple multicast...

Linear erasure codes with local repairability are desirable for distributed
data storage systems. An [n, k, d] code having all-symbol $(r,
\delta)$-locality, denoted as $(r, \delta)_a$, is considered optimal if it also
meets the minimum Hamming distance bound. The existing results on the existence
and the construction of optimal $(r, \delta)_a$ codes...

The capacity factor, as a useful tool, was used to characterize the
dependence of every link on the capacity changes of a network coding
based network. In this paper, we shall investigate the relationship
between the network capacity and the set of capacity factors. We first
introduce a new concept, the capacity kernel, which is a subnetwork
dedu...

We show that given $n$ and $k$, for $q$ sufficiently large, there always
exists an $[n, k]_q$ MDS code that has a generator matrix $G$ satisfying the
following two conditions: (C1) Sparsest: each row of $G$ has Hamming weight $n
- k + 1$; (C2) Balanced: Hamming weights of the columns of $G$ differ from each
other by at most one.
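Conditions (C1) and (C2) are easy to verify for a given matrix. The sketch below uses hypothetical function names and represents $G$ as a list of rows with entries compared against zero. It checks the two support conditions only and does not test the MDS property.

```python
def row_weights(G):
    """Hamming weight (number of nonzero entries) of each row."""
    return [sum(1 for x in row if x != 0) for row in G]


def column_weights(G):
    """Hamming weight of each column."""
    return [sum(1 for row in G if row[j] != 0) for j in range(len(G[0]))]


def is_sparsest_and_balanced(G):
    """Check (C1) every row has weight n - k + 1 and (C2) column weights differ by at most one."""
    k, n = len(G), len(G[0])
    sparsest = all(w == n - k + 1 for w in row_weights(G))
    cols = column_weights(G)
    balanced = max(cols) - min(cols) <= 1
    return sparsest and balanced
```

For $n = 3$ and $k = 2$, the matrix `[[1, 1, 0], [0, 1, 1]]` passes both checks: each row has weight 2 and the column weights are 1, 2, 1.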

This paper considers the problem of error correction for a cooperative data
exchange (CDE) system, where some clients are compromised or failed and send
false messages. Assuming each client possesses a subset of the total messages,
we analyze the error correction capability when every client is allowed to
broadcast only one linearly-coded message....

In this paper, we consider the problem of minimizing the total transmission
cost for exchanging channel state information. We propose a network coded
cooperative data exchange scheme, such that the total transmission cost is
minimized while each client can decode all the channel information held by all
other clients. In this paper, we first derive...

We consider a directed acyclic network with two source-sink pairs {s1, t1} and {s2, t2}. The source s1 wishes to communicate a message X1 to the sink t1 and the source s2 wishes to communicate two messages X2 and X3 to the sink t2, where Xi, i = 1,2,3, are independent random variables of unit rate. We give a simple characterization for linear solva...

still of extraordinary complexity to obtain a network coding solution. In this paper, we propose an O(|E|) time algorithm to determine the solvability of such networks. Based on our method, a network coding solution of such networks can also be obtained in time O(|E|), where E is the link set of the network. Moreover, we prove that a field of size...

The encoding complexity of network coding for single multicast networks has
been intensively studied from several aspects: e.g., the time complexity, the
required number of encoding links, and the required field size for a linear
code solution. However, these issues as well as the solvability are less
understood for networks with multiple multicast...

The intersession network coding problem, which is also known as the multiple
source network coding problem, is a challenging topic, and has attracted
significant attention from the network coding community. In this paper, we
study the encoding complexity for intersession network coding with two simple
multicast sessions. The encoding complexity is c...