Han Mao Kiah

Nanyang Technological University | NTU · Division of Mathematical Sciences (DMS)

PhD

About

129
Publications
10,109
Reads
1,230
Citations
Introduction
Han Mao Kiah currently works at the Division of Mathematical Sciences (DMS), Nanyang Technological University. Han Mao does research in Applied Mathematics.
Additional affiliations
February 2014 - February 2015
University of Illinois, Urbana-Champaign
Position
  • PostDoc Position
August 2013 - February 2014
Nanyang Technological University
Position
  • Project Officer
Education
August 2010 - February 2014
Nanyang Technological University
Field of study
  • Mathematics
August 2002 - December 2005
National University of Singapore
Field of study
  • Mathematics

Publications

Publications (129)
Conference Paper
In this work, given n,ϵ > 0, two efficient encoding (decoding) methods are presented for mapping arbitrary data to (from) n×n binary arrays in which the weight of every row and every column is within [(1/2–ϵ)n, (1/2+ϵ)n], which is referred to as the ϵ-balanced constraint. The first method combines the divide and conquer algorithm and a modification...
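As a small illustration of the ϵ-balanced constraint stated in this abstract, the following Python sketch checks whether a given n×n binary array satisfies it. The function name `is_eps_balanced` is hypothetical, and the sketch only verifies the constraint; it does not reproduce the encoding or decoding methods of the paper.

```python
def is_eps_balanced(array, eps):
    """Return True if every row and column weight of the n x n binary
    array lies within [(1/2 - eps) n, (1/2 + eps) n]."""
    n = len(array)
    if any(len(row) != n for row in array):
        raise ValueError("array must be n x n")
    lo, hi = (0.5 - eps) * n, (0.5 + eps) * n
    row_weights = [sum(row) for row in array]
    col_weights = [sum(col) for col in zip(*array)]
    return all(lo <= w <= hi for w in row_weights + col_weights)
```

For example, a 4×4 checkerboard has weight 2 in every row and column, so it passes for any ϵ ≥ 0, while the all-ones array fails for any ϵ < 1/2.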
Preprint
Full-text available
We study and propose schemes that map messages onto constant-weight codewords using variable-length prefixes. We provide polynomial-time computable formulas that estimate the average number of redundant bits incurred by our schemes. In addition to the exact formulas, we also perform an asymptotic analysis and demonstrate that our scheme uses $\frac...
Preprint
Full-text available
Transmit a codeword $x$, that belongs to an $(\ell-1)$-deletion-correcting code of length $n$, over a $t$-deletion channel for some $1\le \ell\le t<n$. Levenshtein, in 2001, proposed the problem of determining $N(n,\ell,t)+1$, the minimum number of distinct channel outputs required to uniquely reconstruct $x$. Prior to this work, $N(n,\ell,t)$ is k...
Article
The de Bruijn graph, its sequences, and their various generalizations, have found many applications in information theory, including many new ones in the last decade. In this paper, motivated by a coding problem for emerging memory technologies, a set of sequences which generalize the window property of de Bruijn sequences, on its shorter subsequen...
Article
We propose coding techniques that simultaneously limit the length of homopolymer runs, ensure the GC-content constraint, and are capable of correcting a single edit error in strands of nucleotides in DNA-based data storage systems. In particular, for given ℓ, ϵ > 0, we propose simple and efficient encoders/decoders that transform binary sequences...
Preprint
Full-text available
The problem of computing the permanent of a matrix has attracted interest since the work of Ryser (1963) and Valiant (1979). On the other hand, trellises have been extensively studied in coding theory since the 1960s. In this work, we establish a connection between the two domains. We introduce the canonical trellis $T_n$ that represents all permutations,...
Article
We propose new repair schemes for Reed-Solomon codes that use subspace polynomials and hence generalize previous works in the literature that employ trace polynomials. The Reed-Solomon codes are over $\mathbb{F}_{q^\ell}$ and have redundancy $r = n-k \geq q^m$, $1 \leq m \leq \ell$, where $n$ and $k$ are the code length and dimension, respectively. In particular, for one erasure, we sho...
Article
It is well known that, whenever k divides n, the complete k‐uniform hypergraph on n vertices can be partitioned into disjoint perfect matchings. Equivalently, the set of k‐subsets of an n‐set can be partitioned into parallel classes so that each parallel class is a partition of the n‐set. This result is known as Baranyai's theorem, which guarantees...
Article
An indel refers to a single insertion or deletion, while an edit refers to a single insertion, deletion or substitution. In this article, we investigate codes that correct either a single indel or a single edit and provide linear-time algorithms that encode binary messages into these codes of length n. Over the quaternary alphabet, we provide two l...
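To make the two error models in this abstract concrete, the sketch below enumerates every string reachable from a word by exactly one indel, and by exactly one edit. The function names are hypothetical, and this is only an illustration of the definitions, not of the linear-time encoders the article describes.

```python
def single_indel_neighbors(x, alphabet="01"):
    """All strings obtained from x by one insertion or one deletion."""
    out = set()
    for i in range(len(x)):              # delete the symbol at position i
        out.add(x[:i] + x[i + 1:])
    for i in range(len(x) + 1):          # insert symbol a before position i
        for a in alphabet:
            out.add(x[:i] + a + x[i:])
    return out


def single_edit_neighbors(x, alphabet="01"):
    """All strings obtained from x by one insertion, deletion, or substitution."""
    out = single_indel_neighbors(x, alphabet)
    for i in range(len(x)):              # substitute a different symbol at i
        for a in alphabet:
            if a != x[i]:
                out.add(x[:i] + a + x[i + 1:])
    return out
```

A code corrects a single indel (respectively, a single edit) when the neighbor sets of distinct codewords are pairwise disjoint, which can be checked by brute force for short lengths with these helpers.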
Article
The Hamming ball of radius $w$ in $\{0,1\}^n$ is the set $\mathcal{B}(n,w)$ of all binary words of length $n$ and Hamming weight at most $w$. We consider injective mappings ...
Article
The $k$-deck of a sequence is defined as the multiset of all its subsequences of length $k$. Let $D_k(n)$ denote the number of distinct $k$-decks for binary sequences of length $n$. For bina...
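The definition of the k-deck can be illustrated with a short brute-force sketch. The helper names below are hypothetical, and the enumeration is exponential in n, so it is only meant for very small parameters; it is not the counting method of the article.

```python
from collections import Counter
from itertools import combinations


def k_deck(seq, k):
    """Multiset (as a Counter) of all length-k subsequences of seq."""
    return Counter(
        "".join(seq[i] for i in idx)
        for idx in combinations(range(len(seq)), k)
    )


def count_distinct_k_decks(n, k):
    """Brute-force count of distinct k-decks over all binary sequences
    of length n (feasible for small n only)."""
    decks = set()
    for x in range(2 ** n):
        s = format(x, f"0{n}b")
        decks.add(frozenset(k_deck(s, k).items()))
    return len(decks)
```

As a sanity check, the 1-deck records only the number of zeros and ones, so there are n + 1 distinct 1-decks, while the n-deck is the sequence itself, giving 2^n distinct n-decks.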
Article
The class of multiset combinatorial batch codes (MCBCs) was introduced by Zhang et al. (2018) as a generalization of combinatorial batch codes (CBCs), which are replication-based batch codes. The MCBCs allow multiple users to retrieve items in parallel in a distributed storage and a fundamental objective in this study is to determine the minimum...
Article
We apply the generalized sphere-packing bound to two classes of subblock-constrained codes. À la Fazeli et al. (2015), we make use of automorphisms to significantly reduce the number of variables in the associated linear programming problem. In particular, we study binary constant subblock-composition codes (CSCCs), characterized by the prope...
Preprint
Full-text available
We propose new repair schemes for Reed-Solomon codes that use subspace polynomials and hence generalize previous works in the literature that employ trace polynomials. The Reed-Solomon codes are over $\mathbb{F}_{q^\ell}$ and have redundancy $r = n-k \geq q^m$, $1\leq m\leq \ell$, where $n$ and $k$ are the code length and dimension, respectively. I...
Preprint
Full-text available
It is well known that, whenever $k$ divides $n$, the complete $k$-uniform hypergraph on $n$ vertices can be partitioned into disjoint perfect matchings. Equivalently, the set of $k$-subsets of an $n$-set can be partitioned into parallel classes so that each parallel class is a partition of the $n$-set. This result is known as Baranyai's theorem, wh...
Conference Paper
In this paper, we first propose coding techniques for DNA-based data storage that account for the maximum homopolymer run length and the GC-content. In particular, for arbitrary $\ell,\epsilon > 0$, we propose simple and efficient $(\epsilon, \ell)$-constrained encoders that transform binary sequences into DNA base sequences (codewords), that satisfy t...
Preprint
Full-text available
The de Bruijn graph, its sequences, and their various generalizations, have found many applications in information theory, including many new ones in the last decade. In this paper, motivated by a coding problem for emerging memory technologies, a set of sequences which generalize sequences in the de Bruijn graph are defined. These sequences can be...
Preprint
Full-text available
The sequence reconstruction problem, introduced by Levenshtein in 2001, considers a communication scenario where the sender transmits a codeword from some codebook and the receiver obtains multiple noisy reads of the codeword. Motivated by modern storage devices, we introduced a variant of the problem where the number of noisy reads $N$ is fixed (K...
Article
To equip DNA-based data storage with random-access capabilities, Yazdi et al. (2018) prepended DNA strands with specially chosen address sequences called primers and provided certain design criteria for these primers. We provide explicit constructions of error-correcting codes that are suitable as primer addresses and equip these constructions wi...
Article
In a bus with n wires, each wire has two states, '0' or '1', representing one bit of information. Whenever the state transitions from '0' to '1', or '1' to '0', joule heating causes the temperature to rise, and high temperatures have adverse effects on on-chip bus performance. Recently, the class of low-power cooling (LPC) codes was proposed to con...
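The quantity that drives heating in this model is the number of wires whose state changes between two consecutive transmissions. A minimal sketch of that count follows; the function name is hypothetical and the bus states are assumed to be given as equal-length lists of bits.

```python
def count_transitions(prev_state, next_state):
    """Number of wires that switch ('0' -> '1' or '1' -> '0') between
    two consecutive bus states, i.e. their Hamming distance."""
    if len(prev_state) != len(next_state):
        raise ValueError("both states must describe the same n wires")
    return sum(a != b for a, b in zip(prev_state, next_state))
```

For instance, moving a 4-wire bus from state 0110 to state 1100 switches the first and third wires, so two transitions are counted.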
Preprint
Full-text available
We propose coding techniques that limit the length of homopolymer runs, ensure the GC-content constraint, and are capable of correcting a single edit error in strands of nucleotides in DNA-based data storage systems. In particular, for given $\ell, \epsilon > 0$, we propose simple and efficient encoders/decoders that transform binary sequences i...
Preprint
Full-text available
The sequence reconstruction problem, introduced by Levenshtein in 2001, considers a communication scenario where the sender transmits a codeword from some codebook and the receiver obtains multiple noisy reads of the codeword. The common setup assumes the codebook to be the entire space and the problem is to determine the minimum number of distinct...
Preprint
Full-text available
A robust positioning pattern is a large array that allows a mobile device to locate its position by reading a possibly corrupted small window around it. In this paper, we provide constructions of binary positioning patterns, equipped with efficient locating algorithms, that are robust to a constant number of errors and have redundancy within a cons...
Preprint
Full-text available
The linear complexity of a sequence $s$ is one of the measures of its predictability. It represents the smallest degree of a linear recursion which the sequence satisfies. There are several algorithms to find the linear complexity of a periodic sequence $s$ of length $N$ (where $N$ is of some given form) over a finite field $F_q$ in $O(N)$ symbol f...
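For readers unfamiliar with linear complexity, the sketch below computes it for a finite binary sequence with the classical Berlekamp-Massey algorithm. This is a different, general-purpose technique, not one of the $O(N)$ algorithms for periodic sequences that the abstract refers to, and the restriction to the binary field and the function name are assumptions made for illustration.

```python
def linear_complexity_gf2(s):
    """Length of the shortest linear recursion over GF(2) that generates
    the finite bit sequence s (Berlekamp-Massey algorithm)."""
    n = len(s)
    c = [0] * (n + 1)   # current connection polynomial
    b = [0] * (n + 1)   # connection polynomial before the last length change
    c[0] = b[0] = 1
    L, m = 0, -1        # current complexity, index of last length change
    for i in range(n):
        # discrepancy between s[i] and the prediction of the current recursion
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L = i + 1 - L
                m = i
                b = t
    return L
```

For example, the alternating sequence 1, 0, 1, 0, 1, 0 satisfies the recursion s_i = s_{i-2} and no shorter one, so its linear complexity is 2.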
Preprint
Full-text available
An indel refers to a single insertion or deletion, while an edit refers to a single insertion, deletion or substitution. In this paper, we investigate codes that combat either a single indel or a single edit and provide linear-time algorithms that encode binary messages into these codes of length n. Over the quaternary alphabet, we provide two line...
Article
Write-once memory (WOM) is a storage device consisting of binary cells that can only increase their levels. A t-write WOM code is a coding scheme that makes it possible to write t times to a WOM without decreasing the levels of any of the cells. The sum-rate of a WOM code is the ratio between the total number of bits written to the memory during th...
Conference Paper
The class of multiset combinatorial batch codes (MCBCs) was introduced by Zhang et al. (2018) as a generalization of combinatorial batch codes (CBCs). MCBCs allow multiple users to retrieve items in parallel in a distributed storage and a fundamental objective in this study is to determine the minimum total storage given certain requirements. We r...
Conference Paper
Private Information Retrieval (PIR) array codes were introduced by Fazeli et al. (2015) to reduce the storage overhead in designing PIR protocols. Blackburn and Etzion (2017) introduced the (virtual server) rate to quantify the storage overhead of the codes, and when $s>2$ (here, $\frac{1}{s}$ is the proportion of the database stored on one server...
Conference Paper
Full-text available
An indel refers to a single insertion or deletion, while an edit refers to a single insertion, deletion or substitution. We investigate codes that combat either a single indel or a single edit and provide linear-time algorithms that encode binary messages into these codes of length n. Over the quaternary alphabet, we provide two linear-time encoder...
Book
Full-text available
In 1989 we organized the first Benelux‐Japan workshop on Information and Communication theory in Eindhoven, the Netherlands. This year, 2019, we celebrate 30 years of friendship between Asian and European scientists at the AEW11 in Rotterdam, the Netherlands. Many of the 1989 participants are also present at the 2019 event. This year we have man...
Preprint
Full-text available
This year, 2019, we celebrate 30 years of friendship between Asian and European scientists at the AEW11 in Rotterdam, the Netherlands. Many of the 1989 participants are also present at the 2019 event. This year we have many participants from different parts of Asia and Europe. It shows the importance of this event. It is a good tradition to pay...
Preprint
Full-text available
To equip DNA-based data storage with random-access capabilities, Yazdi et al. (2018) prepended DNA strands with specially chosen address sequences called primers and provided certain design criteria for these primers. We provide explicit constructions of error-correcting codes that are suitable as primer addresses and equip these constructions with...
Preprint
Full-text available
We apply the generalized sphere-packing bound to two classes of subblock-constrained codes. À la Fazeli et al. (2015), we make use of automorphisms to significantly reduce the number of variables in the associated linear programming problem. In particular, we study binary constant subblock-composition codes (CSCCs), characterized by the property tha...
Article
We demonstrate that certain Johnson-type bounds are asymptotically exact for a variety of classes of codes, namely, constant-composition codes, nonbinary constant-weight codes, group divisible codes, and multiply constant-weight codes. We achieve this via an application of the theory of decomposition of edge-colored digraphs.
Article
We investigate constant-composition constrained codes for mitigation of intercell interference for multilevel cell flash memories with dynamic threshold scheme. The first explicit formula for the maximum size of a q-ary F-avoiding code with a given composition and certain families of substrings F is presented. In addition, we provide methods to det...
Preprint
Full-text available
A class of low-power cooling (LPC) codes, to control simultaneously both the peak temperature and the average power consumption of interconnects, was introduced recently. An $(n,t,w)$-LPC code is a coding scheme over $n$ wires that (A) avoids state transitions on the $t$ hottest wires (cooling), and (B) limits the number of transitions to $w$ in ea...
Preprint
Full-text available
The Hamming ball of radius $w$ in $\{0,1\}^n$ is the set ${\cal B}(n,w)$ of all binary words of length $n$ and Hamming weight at most $w$. We consider injective mappings $\varphi: \{0,1\}^m \to {\cal B}(n,w)$ with the following domination property: every position $j \in [n]$ is dominated by some position $i \in [m]$, in the sense that "switching of...
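The size of the Hamming ball defined in this abstract follows directly from the definition, and an injective mapping from $\{0,1\}^m$ into the ball can exist only if $2^m$ does not exceed that size. The sketch below computes both; the function names are hypothetical, and the check is only a necessary counting condition that says nothing about the domination property.

```python
from math import comb


def hamming_ball_size(n, w):
    """Number of binary words of length n with Hamming weight at most w."""
    return sum(comb(n, i) for i in range(min(w, n) + 1))


def injection_possible(m, n, w):
    """Necessary counting condition for an injective map
    {0,1}^m -> B(n, w): the domain must not be larger than the ball."""
    return 2 ** m <= hamming_ball_size(n, w)
```

For n = 4 and w = 1 the ball contains the all-zero word and the four words of weight one, so it has five elements and can host an injective image of at most two message bits.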
Article
FIVE THOUSAND YEARS AGO, a man died in the Alps. It's possible he died from a blow to the head, or he may have bled to death after being shot in the shoulder with an arrow. There's a lot we don't know about Ötzi (named for the Ötztal Alps, where he was discovered), despite the fact that researchers have spent almost 30 years studying him.
Article
Full-text available
Tandem duplication is the process of inserting a copy of a segment of DNA adjacent to the original position. Motivated by applications that store data in living organisms, Jain et al. (2017) proposed the study of codes that correct tandem duplications. Known code constructions are based on irreducible words. We study efficient encoding/decodi...
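The tandem duplication operation described in this abstract can be written as a one-line string transformation. The sketch below is only an illustration of the error model; the function name and the `start`/`length` parameters are hypothetical, and it does not implement the encoders or decoders studied in the article.

```python
def tandem_duplicate(x, start, length):
    """Insert a copy of the segment x[start:start+length] immediately
    after the original segment."""
    if length <= 0 or start < 0 or start + length > len(x):
        raise ValueError("segment must be a non-empty substring of x")
    end = start + length
    return x[:end] + x[start:end] + x[end:]
```

For example, duplicating the segment CG of ACGT (start 1, length 2) yields ACGCGT.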
Article
The class of geometric orthogonal codes (GOCs) was introduced by Doty and Winslow (2016) for more robust macrobonding in DNA origami. They observed that GOCs are closely related to optical orthogonal codes (OOCs). It is possible for GOCs to have size greater than OOCs of corresponding parameters due to slightly more relaxed constraints on correlat...
Article
Full-text available
High temperatures have dramatic negative effects on interconnect performance and, hence, numerous techniques have been proposed to reduce the power consumption of on-chip buses. However, existing methods fall short of fully addressing the thermal challenges posed by high-performance interconnects. In this paper, we introduce new efficient coding sc...
Article
Full-text available
We introduce the notion of weakly mutually uncorrelated (WMU) sequences, motivated by applications in DNA-based data storage systems and for synchronization of communication devices. WMU sequences are characterized by the property that no sufficiently long suffix of one sequence is the prefix of the same or another sequence. WMU sequences used for...
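The defining property of WMU sequences can be checked by brute force. The sketch below assumes that "sufficiently long" means length at least a threshold `k`, that all sequences have the same length, and that only proper suffixes and prefixes are compared; the threshold parameter and the function name are assumptions for illustration, not notation taken from the article.

```python
def is_wmu(seqs, k):
    """Return True if no proper suffix of length >= k of any sequence
    equals a prefix of the same or another sequence in seqs."""
    n = len(seqs[0])
    if any(len(s) != n for s in seqs):
        raise ValueError("all sequences must have the same length")
    for a in seqs:
        for b in seqs:
            for length in range(k, n):   # proper overlaps only
                if a[n - length:] == b[:length]:
                    return False
    return True
```

For instance, 0101 fails the check for k = 2 because its suffix 01 is also its prefix, but it passes for k = 3.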
Article
Full-text available
Tandem duplication in DNA is the process of inserting a copy of a segment of DNA adjacent to the original position. Motivated by applications that store data in living organisms, Jain et al. (2016) proposed the study of codes that correct tandem duplications to improve the reliability of data storage. We investigate algorithms associated with the s...
Conference Paper
Full-text available
The subblock energy-constrained codes (SECCs) have recently been shown to be suitable candidates for simultaneous energy and information transfer, where bounds on SECC capacity were presented for communication over noisy channels. In this paper, we study binary SECCs with given error correction capability, by considering codes with a certain minimu...
Conference Paper
Full-text available
The study of binary constant subblock-composition codes (CSCCs) has recently gained attention due to their application in diverse fields. These codes are a class of constrained codes where each codeword is partitioned into equal sized subblocks, and every subblock has the same fixed weight. We present novel upper and lower bounds on the asymptotic...
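The subblock structure described in this abstract is easy to verify for a given word. The sketch below checks whether a binary word splits into equal-sized subblocks that all have the same fixed weight; the function and parameter names are hypothetical, and the sketch does not touch the asymptotic bounds the paper presents.

```python
def is_cscc_codeword(word, subblock_len, weight):
    """Return True if the binary word splits into subblocks of length
    subblock_len, each containing exactly `weight` ones."""
    if subblock_len <= 0 or len(word) % subblock_len != 0:
        return False
    return all(
        sum(word[i:i + subblock_len]) == weight
        for i in range(0, len(word), subblock_len)
    )
```

For example, 10 01 10 is a valid word for subblock length 2 and weight 1, whereas 11 00 is not, even though its overall weight matches.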
Article
We introduce a new family of codes, termed asymmetric Lee distance (ALD) codes, designed to correct errors arising in DNA-based storage systems and systems with parallel string transmission protocols. ALD codes are defined over a quaternary alphabet and analyzed in this particular setting, but the derived results hold for other alphabet sizes as we...