Rank Modulation for Flash Memories

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 6, JUNE 2009 2659
Anxiao (Andrew) Jiang, Member, IEEE, Robert Mateescu, Member, IEEE, Moshe Schwartz, Member, IEEE, and
Jehoshua Bruck, Fellow, IEEE
Abstract—We explore a novel data representation scheme for
multilevel flash memory cells, in which a set of $n$ cells stores information in the permutation induced by the different charge levels of the individual cells. The only allowed charge-placement mechanism is a “push-to-the-top” operation, which takes a single cell of the set and makes it the top-charged cell. The resulting scheme eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells.
We present unrestricted Gray codes spanning all possible $n$-cell states and using only “push-to-the-top” operations, and also construct balanced Gray codes. One important application of the Gray codes is the realization of logic multilevel cells, which is useful in conventional storage solutions. We also investigate rewriting schemes for random data modification. We present both an optimal scheme for the worst case rewrite performance and an approximation scheme for the average-case rewrite performance.
Index Terms—Asymmetric channel, flash memory, Gray codes,
permutations, rank modulation.
I. INTRODUCTION
FLASH memory is a nonvolatile memory both electrically
programmable and electrically erasable. Its reliability,
high storage density, and relatively low cost have made it a
dominant nonvolatile memory technology and a prominent
candidate to replace the well-established magnetic recording
technology in the near future.
The most conspicuous property of flash storage is its inherent
asymmetry between cell programming (charge placement) and
cell erasing (charge removal). While adding charge to a single
cell is a fast and simple operation, removing charge from a
single cell is very difficult. In fact, today, most (if not all) flash
memory technologies do not allow a single cell to be erased but
rather only a large block of cells. Thus, a single-cell erase op-
eration requires the cumbersome process of copying an entire
block to a temporary location, erasing it, and then programming
all the cells in the block.
Manuscript received September 18, 2008; revised January 28, 2009. Current version published May 20, 2009. This work was supported in part by the Caltech Lee Center for Advanced Networking, by the National Science Foundation (NSF) under Grant ECCS-0802107 and the NSF CAREER Award 0747415, by the GIF under Grant 2179-1785.10/2007, by the NSF-NRI, and by a gift from Ross Brown. The material in this paper was presented in part at the IEEE International Symposium on Information Theory, Toronto, ON, Canada, July 2008.
A. Jiang is with the Department of Computer Science, Texas A&M University, College Station, TX 77843-3112 USA (e-mail: ajiang@cs.tamu.edu).
R. Mateescu and J. Bruck are with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125 USA (e-mail: mateescu@paradise.caltech.edu; bruck@paradise.caltech.edu).
M. Schwartz is with the Department of Electrical and Computer Engineering, Ben-Gurion University, Beer-Sheva 84105, Israel (e-mail: schwartz@ee.bgu.ac.il).
Communicated by T. Etzion, Associate Editor for Coding Theory.
Digital Object Identifier 10.1109/TIT.2009.2018336
To keep up with the ever-growing demand for denser storage, the multilevel flash cell concept is used to increase the number of stored bits in a cell [8]. Instead of the usual single-bit flash memories, where each cell is in one of two states (erased/programmed), each multilevel flash cell stores one of $q$ levels and can be regarded as a symbol over a discrete alphabet of size $q$. This is done by designing an appropriate set of threshold levels which are used to quantize the charge level readings to symbols from the discrete alphabet.
Fast and accurate programming schemes for multilevel flash
memories are a topic of significant research and design efforts
[2], [14], [31]. All these and other works share the attempt to
iteratively program a cell to an exact prescribed charge level
in a minimal number of programming cycles. As mentioned
above, flash memory technology does not support charge re-
moval from individual cells. As a result, the programming cycle
sequence is designed to cautiously approach the target charge
level from below so as to avoid undesired global erases in case of
overshoots. Consequently, these attempts still require many pro-
gramming cycles, and they work only up to a moderate number
of levels per cell.
In addition to the need for accurate programming, the move to
multilevel flash cells also aggravates reliability. The same relia-
bility aspects that have been successfully handled in single-level
flash memories may become more pronounced and translate into
higher error rates in stored data. One such relevant example is
errors that originate from low memory endurance [5], by which
a drift of threshold levels in aging devices may cause program-
ming and read errors.
We therefore propose the rank-modulation scheme, whose aim is to eliminate both the problem of overshooting while programming cells, and the problem of memory endurance in aging devices. In this scheme, an ordered set of $n$ cells stores the information in the permutation induced by the charge levels of the cells. In this way, no discrete levels are needed (i.e., no need for threshold levels) and only a basic charge-comparing operation (which is easy to implement) is required to read the permutation. If we further assume that the only programming operation allowed is raising the charge level of one of the cells above the current highest one (push-to-the-top), then the overshoot problem is no longer relevant. Additionally, the technology may allow in the near future the decrease of all the charge levels in a block of cells by a constant amount smaller than the lowest charge level (block deflation), which would maintain their relative values, and thus leave the information unchanged. This can eliminate a designated erase step, by deflating the entire block whenever
the memory is not in use.
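The read, push-to-the-top, and block-deflation operations described above can be sketched in a few lines of Python. This is only an illustration: the function names, the margin `delta`, and the numeric charge values are assumptions made for the example and do not come from the paper.

```python
def read_permutation(charges):
    """Return the cell names (1-based) sorted by descending charge level."""
    order = sorted(range(len(charges)), key=lambda i: charges[i], reverse=True)
    return [i + 1 for i in order]


def push_to_top(charges, cell, delta=1.0):
    """Raise `cell` above the current maximum; `delta` is a hypothetical margin."""
    new = list(charges)
    new[cell - 1] = max(charges) + delta
    return new


def deflate(charges, amount):
    """Block deflation: lower every cell by one amount below the lowest level."""
    assert 0 <= amount < min(charges)
    return [c - amount for c in charges]


charges = [0.3, 1.7, 0.9]                  # illustrative levels of cells 1, 2, 3
before = read_permutation(charges)         # cell 2 is highest, then 3, then 1
charges = push_to_top(charges, 1)          # any overshoot still leaves cell 1 on top
after = read_permutation(charges)
same = read_permutation(deflate(charges, 0.5))
print(before, after, same)
```

Only comparisons between cells are used to read the state, so the exact size of `delta` does not affect the stored permutation, and deflation leaves the reading unchanged.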
Once a new data representation is defined, several tools are required to make it useful. In this paper, we present Gray codes that bring to bear the full representational power of rank modulation, and data rewriting schemes. The Gray code [13] is an ordered list of distinct length-$n$ binary vectors such that every two adjacent words (in the list) differ by exactly one bit flip. They have since been generalized in countless ways and may now be defined as an ordered set of distinct states for which every state $s_i$ is followed by a state $s_{i+1}$ such that $s_{i+1} = t(s_i)$, where $t \in T$ is a transition function from a predetermined set $T$ defining the Gray code. In the original code, $T$ is the set of all possible single bit flips. Usually, the set $T$ consists of transitions that are minimal with respect to some cost function, thus creating a traversal of the state space that is minimal in total cost. For a comprehensive survey of combinatorial Gray codes, the reader is referred to [33].
0018-9448/$25.00 © 2009 IEEE
Authorized licensed use limited to: Texas A M University. Downloaded on May 29, 2009 at 19:23 from IEEE Xplore. Restrictions apply.
One application of the Gray codes is the realization of logic
multilevel cells with rank modulation. The traversal of states
by the Gray code is mapped to the increase of the cell level in
a classic multilevel flash cell. In this way, rank modulation can
be naturally combined with current multilevel storage solutions.
Some of the Gray code constructions we describe also induce a
simple algorithm for generating the list of permutations. Effi-
cient generation of permutations has been the subject of much
research as described in the general survey [33], and the more
specific [34] (and references therein). In [34], the transitions we
use in this paper are called “nested cycling,” and the algorithms
cited there produce lists that are not Gray codes since some of
the permutations repeat, which makes the algorithms inefficient.
We also investigate efficient rewriting schemes for rank mod-
ulation. Since it is costly to erase and reprogram cells, we try
to maximize the number of times data can be rewritten between
two erase operations [4], [21], [22]. For rank modulation, the
key is to minimize the highest charge level of cells. We present
two rewriting schemes that are, respectively, optimized for the
worst case and the average-case performance.
Rank modulation is a new storage scheme and differs from
existing data storage techniques. There has been some recent
work on coding for flash memories. Examples include floating
codes [22], [23], which jointly record and rewrite multiple vari-
ables, and buffer codes [4], [37], that keep a log of the recent
modifications of data. Both floating codes and buffer codes use
the flash cells in a conventional way, namely, the fixed discrete
cell levels. Floating codes are an extension of the write-once
memory (WOM) codes [6], [10], [11], [17], [32], [36], which are
codes for effective rewriting of a single variable stored in cells
that have irreversible state transitions. The study in this area also
includes defective memories [16], [18], where defects (such as
“stuck-at faults”) randomly happen to memory cells and how to
store the maximum amount of information is considered. In all
the above codes, unlike rank modulation, the states of different
cells do not relate to each other. Also related is the work on per-
mutation codes [3], [35], used for data transmission or signal
quantization.
The paper is organized as follows: Section II describes a Gray
code that is cyclic and complete (i.e., it spans the entire sym-
metric group of permutations); Section III introduces a Gray
code that is cyclic, complete and balanced, optimizing the tran-
sition step and also making it suitable for block deflation; Sec-
tion IV shows a rewriting scheme that is optimal for the worst
case rewrite cost; Section V presents a code optimized for the
average rewrite cost with small approximation ratios; Section VI
concludes this paper.
II. DEFINITIONS AND BASIC CONSTRUCTION
Let $S$ be a state space, and let $T$ be a set of transition functions, where every $t \in T$ is a function $t : S \to S$. A Gray code is an ordered list $s_1, s_2, \ldots, s_m$ of $m$ distinct elements from $S$ such that for every $1 \le i \le m-1$, $s_{i+1} = t(s_i)$ for some $t \in T$. If $s_1 = t(s_m)$ for some $t \in T$, then the code is cyclic. If the code spans the entire space $S$ we call it complete.
Let $[n]$ denote the set of integers $\{1, 2, \ldots, n\}$. An ordered set of $n$ flash memory cells named $1, 2, \ldots, n$, each containing a distinct charge level, induces a permutation of $[n]$ by writing the cell names in descending charge level $[a_1, a_2, \ldots, a_n]$, i.e., the cell $a_1$ has the highest charge level while $a_n$ has the lowest. The state space for the rank modulation scheme is therefore the set of all permutations over $[n]$, denoted by $S_n$.
As described in the previous section, the basic minimal-cost operation on a given state is a “push-to-the-top” operation by which a single cell has its charge level increased so as to be the highest of the set. Thus, for our basic construction, the set $T$ of minimal-cost transitions between states consists of $n-1$ functions pushing the $i$th element of the permutation, $2 \le i \le n$, to the front:
$$t_i([a_1, \ldots, a_{i-1}, a_i, a_{i+1}, \ldots, a_n]) = [a_i, a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n].$$
Throughout this work, our state space $S$ will be the set of permutations over $[n]$, and our set of transition functions will be the set $T$ of “push-to-the-top” functions. We call such a code a length-$n$ rank modulation Gray code ($n$-RMGC).
Example 1: An example of a 3-RMGC is the following:
$$\begin{array}{cccccc} 1 & 2 & 3 & 1 & 3 & 2 \\ 2 & 1 & 2 & 3 & 1 & 3 \\ 3 & 3 & 1 & 2 & 2 & 1 \end{array}$$
where the permutations are the columns being read from left to right. The sequence of operations creating this cyclic code is: $t_2$, $t_3$, $t_3$, $t_2$, $t_3$, $t_3$. This sequence will obviously create a Gray
code regardless of the choice of the first column.
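The claim of Example 1 can be checked mechanically. The following Python sketch applies the transition sequence $t_2, t_3, t_3, t_2, t_3, t_3$ and tests that it is cyclic and complete from every starting column; the helper names `t`, `traverse`, and `is_cyclic_complete` are illustrative choices, not notation from the paper.

```python
from itertools import permutations


def t(i, perm):
    """Push-to-the-top t_i: move the i-th element (1-based) to the front."""
    return (perm[i - 1],) + perm[:i - 1] + perm[i:]


def traverse(start, transitions):
    """Return the states visited before each transition, and the final state."""
    states, cur = [], tuple(start)
    for i in transitions:
        states.append(cur)
        cur = t(i, cur)
    return states, cur


def is_cyclic_complete(start, transitions):
    """True if every permutation is visited exactly once and we return to start."""
    states, end = traverse(start, transitions)
    everything = set(permutations(sorted(start)))
    return end == tuple(start) and len(states) == len(everything) == len(set(states))


example1 = [2, 3, 3, 2, 3, 3]
print(traverse((1, 2, 3), example1)[0])
print(all(is_cyclic_complete(p, example1) for p in permutations((1, 2, 3))))
```

Permutations are stored as tuples with the top-charged cell first, matching the column convention of the example.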
One important application of the Gray codes is the realization of logic multilevel cells. The traversal of states by the Gray code is mapped to the increase of the cell level in a classic multilevel flash cell. As an $n$-RMGC has $n!$ states, it can simulate a cell of up to $n!$ discrete levels. Current data storage schemes (e.g., floating codes [22]) can therefore use the Gray codes as logic cells, as illustrated in Fig. 1, and get the benefits of rank modulation.
We will now show a basic recursive construction for $n$-RMGCs. The resulting codes are cyclic and complete, in the sense that they span the entire state space. Our recursion basis is the simple 2-RMGC: $[1, 2]$, $[2, 1]$.
Now let us assume we have a cyclic and complete $(n-1)$-RMGC, which we call $C_{n-1}$, defined by the sequence of transitions $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!}}$ and where $t_{i_{(n-1)!}} = t_2$, i.e., a “push-to-the-top” operation on the second element in the permutation.1 We further assume that the transition $t_2$ appears at least twice. We will now show how to construct $C_n$, a cyclic and complete $n$-RMGC with the same property.
1 This last requirement merely restricts us to have $t_2$ used somewhere since we can always rotate the set of transitions to make $t_2$ be the last one used.
Fig. 1. Two multilevel flash-memory cells with six levels, currently storing the value “ .” (a) The first is realized using a single multilevel cell with absolute thresholds. The possible transitions between states are shown to its right. (b) The second is realized by combining three flash cells with no thresholds and by using a rank-modulation scheme. The possible transitions between states are given by the 3-RMGC of Example 1.
We set the first permutation of the code to be $[1, 2, \ldots, n]$, and then use the transitions $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!-1}}$ to get a list of $(n-1)!$ permutations which we call the first block of the construction. By our assumption, the permutations in this list are all distinct, and they all share the property that their last element is $n$ (since all the transitions use just the first $n-1$ elements). Furthermore, since $t_{i_{(n-1)!}} = t_2$, we know that the last permutation generated so far is $[2, 1, 3, \ldots, n-1, n]$.
We now use $t_n$ to create the first permutation of the second block of the construction, and then use $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!-1}}$ again to create the entire second block. We repeat this process $n-1$ times, i.e., use the sequence of transitions $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!-1}}, t_n$ a total of $n-1$ times to construct $n-1$ blocks, each containing $(n-1)!$ permutations.
The following two simple lemmas extend the intuition given above.
Lemma 2: The second element in the first permutation in every block is $2$. The first element in the last permutation in every block is also $2$.
Proof: During the construction process, in each block we use the transitions $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!-1}}$ in order. If we were to use the transition $t_{i_{(n-1)!}} = t_2$ next, we would return to the first permutation of the block since $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!}}$ are the transitions of a cyclic $(n-1)$-RMGC. Since the element $2$ is second in the initial permutation of the block, it follows that it is the first element in the last permutation of the block. By the construction, we now use $t_n$, thus making the element $2$ second in the first permutation of the second block. By repeating the above arguments for each block we prove the lemma.
Lemma 3: In any block, the last element of all the permutations is constant. The sequence of last elements in the $n-1$ blocks constructed is $n, n-1, \ldots, 3, 1$. The element $2$ is never a last element.
Proof: The first claim is easily proved by noting that the transitions creating a block, $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!-1}}$, only operate on the first $n-1$ positions of the permutations. Also, by the same logic used in the proof of the previous lemma, if the first permutation of a block is $[a_1, 2, a_3, \ldots, a_{n-1}, a_n]$, then the last permutation in a block is $[2, a_1, a_3, \ldots, a_{n-1}, a_n]$, and thus the first permutation of the next block is $[a_n, 2, a_1, a_3, \ldots, a_{n-1}]$.
It follows that if we examine the sequence containing just the first permutation in each block, the element $2$ remains fixed, and the rest just rotate by one position each time. By the previous lemma, the fixed element is $2$, and therefore, the sequence of last elements is as claimed.
Combining the two lemmas above, the $n-1$ blocks constructed so far form a cyclic (but not complete) $n$-RMGC, that we call $C'$, which may be schematically described as follows (where each row represents a single block, and $\rightsquigarrow$ denotes the sequence of transitions $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!-1}}$):
$$\begin{array}{l}
[1, 2, 3, \ldots, n] \rightsquigarrow [2, 1, 3, \ldots, n] \xrightarrow{t_n} \\
[n, 2, 1, 3, \ldots, n-1] \rightsquigarrow [2, n, 1, 3, \ldots, n-1] \xrightarrow{t_n} \\
\qquad \vdots \\
[3, 2, 4, \ldots, n, 1] \rightsquigarrow [2, 3, 4, \ldots, n, 1] \xrightarrow{t_n} [1, 2, 3, \ldots, n].
\end{array}$$
It is now obvious that $C'$ is not complete because it is missing exactly the $(n-1)!$ permutations containing $2$ as their last element. We build a block $C''$ containing these permutations in the following way: we start by rotating the list of transitions $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!}}$ such that its last transition is $t_{n-1}$.2 For convenience, we denote the rotated sequence by $t_{j_1}, t_{j_2}, \ldots, t_{j_{(n-1)!}}$, where $t_{j_{(n-1)!}} = t_{n-1}$. Assume the first permutation in the block is $[a_1, a_2, \ldots, a_{n-1}, 2]$. We set the following permutations of the block to be the ones formed by the sequence of transitions $t_{j_1}, t_{j_2}, \ldots, t_{j_{(n-1)!-1}}$. Thus, the last permutation in $C''$ is $[a_2, a_3, \ldots, a_{n-1}, a_1, 2]$.
2 The transition $t_{n-1}$ must be present somewhere in the sequence or else the last element would remain constant, thus contradicting the assumption that the sequence generates a cyclic and complete $(n-1)$-RMGC.
In $C'$, we look for a transition of the following form:
$$[a_2, a_3, \ldots, a_{n-1}, 2, a_1] \xrightarrow{t_{n-1}} [2, a_2, a_3, \ldots, a_{n-1}, a_1].$$
We contend that such a transition must surely exist: $C'$ does not contain permutations in which $2$ is last, while it does contain permutations in which $2$ is next to last, and some where $2$ is the first element. Since $C'$ is cyclic, there must be at least one transition $t_{n-1}$ pushing an element $2$ from a next-to-last position to the first position. At this transition we split $C'$ and insert $C''$ as follows:
$$\cdots\ [a_2, \ldots, a_{n-1}, 2, a_1] \xrightarrow{t_n} \underbrace{[a_1, a_2, \ldots, a_{n-1}, 2] \rightsquigarrow [a_2, \ldots, a_{n-1}, a_1, 2]}_{C''} \xrightarrow{t_n} [2, a_2, \ldots, a_{n-1}, a_1]\ \cdots$$
where it is easy to see all transitions are valid. Thus, we have created $C_n$ and to complete the recursion we have to make sure $t_2$ appears at least twice, but that is obvious since the sequence $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!-1}}$ contains at least one occurrence of $t_2$, and is replicated $n-1$ times, $n-1 \ge 2$. We therefore reach the following conclusion.
Theorem 4: For every integer $n \ge 2$ there exists a cyclic and
complete $n$-RMGC.
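The recursive construction behind Theorem 4 can be sketched in Python as follows. The function names are illustrative, the code always starts from $[1, 2, \ldots, n]$, and it splits $C'$ at the first suitable $t_{n-1}$ transition it encounters, which is one arbitrary choice among the several the construction allows.

```python
from itertools import permutations


def t(i, perm):
    """Push-to-the-top t_i: move the i-th element (1-based) to the front."""
    return (perm[i - 1],) + perm[:i - 1] + perm[i:]


def rotate_to_end_with(seq, last):
    """Rotate a cyclic transition sequence so that its final transition is `last`."""
    k = max(idx for idx, x in enumerate(seq) if x == last)
    return seq[k + 1:] + seq[:k + 1]


def rmgc(n):
    """Transition indices of a cyclic, complete n-RMGC (recursive construction)."""
    if n == 2:
        return [2, 2]
    prev = rotate_to_end_with(rmgc(n - 1), 2)       # (n-1)-RMGC ending with t_2
    c_prime = (prev[:-1] + [n]) * (n - 1)           # n-1 blocks: cyclic, incomplete
    c_second = rotate_to_end_with(prev, n - 1)[:-1]  # block of permutations ending in 2
    cur = tuple(range(1, n + 1))
    for k, i in enumerate(c_prime):
        # look for t_{n-1} pushing element 2 from the next-to-last position
        if i == n - 1 and cur[n - 2] == 2:
            return c_prime[:k] + [n] + c_second + [n] + c_prime[k + 1:]
        cur = t(i, cur)
    raise AssertionError("no split point found")


def visits_all(n, seq):
    """Check that seq, started at [1..n], is cyclic and visits each permutation once."""
    start = tuple(range(1, n + 1))
    cur, seen = start, set()
    for i in seq:
        seen.add(cur)
        cur = t(i, cur)
    return cur == start and len(seq) == len(seen) and seen == set(permutations(start))


print(rmgc(3), all(visits_all(n, rmgc(n)) for n in range(2, 7)))
```

The check in `visits_all` is a brute-force enumeration, so it is only practical for small $n$.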
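Example 5: We construct a 4-RMGC by recursively using the 3-RMGC shown in Example 1, to illustrate the construction process. The sequence of transitions for the 3-RMGC in Example 1 is $t_2$, $t_3$, $t_3$, $t_2$, $t_3$, $t_3$. As described in the construction, in order to use this code as a basis for the 4-RMGC construction, we need to have $t_2$ as the last transition. We therefore rotate the sequence of transitions to be $t_3$, $t_3$, $t_2$, $t_3$, $t_3$, $t_2$. The resulting first three blocks, denoted $C'$, are
$$\begin{array}{cccccc|cccccc|cccccc}
1 & 3 & 2 & 3 & 1 & 2 & 4 & 1 & 2 & 1 & 4 & 2 & 3 & 4 & 2 & 4 & 3 & 2 \\
2 & 1 & 3 & 2 & 3 & 1 & 2 & 4 & 1 & 2 & 1 & 4 & 2 & 3 & 4 & 2 & 4 & 3 \\
3 & 2 & 1 & 1 & 2 & 3 & 1 & 2 & 4 & 4 & 2 & 1 & 4 & 2 & 3 & 3 & 2 & 4 \\
4 & 4 & 4 & 4 & 4 & 4 & 3 & 3 & 3 & 3 & 3 & 3 & 1 & 1 & 1 & 1 & 1 & 1
\end{array}$$
To create the missing fourth block, $C''$, the construction requires a transition sequence ending with $t_3$, so we use the original sequence $t_2$, $t_3$, $t_3$, $t_2$, $t_3$, $t_3$ shown in Example 1. To decide the starting permutation of the block, we search for a transition of the form $[a_2, a_3, 2, a_1] \to [2, a_2, a_3, a_1]$ in $C'$. Several such transitions exist, and we arbitrarily choose $[1, 3, 2, 4] \to [2, 1, 3, 4]$ seen in the fifth and sixth columns of $C'$. The resulting missing block, $C''$, is
$$\begin{array}{cccccc}
4 & 1 & 3 & 4 & 3 & 1 \\
1 & 4 & 1 & 3 & 4 & 3 \\
3 & 3 & 4 & 1 & 1 & 4 \\
2 & 2 & 2 & 2 & 2 & 2
\end{array}$$
Inserting $C''$ between the fifth and sixth columns of $C'$ results in the following 4-RMGC given at the bottom of the page.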
III. BALANCED $n$-RMGCS
While the construction for $n$-RMGCs given in the previous section is mathematically pleasing, it suffers from a practical drawback: while the $n-1$ top-charged cells are changed (having their charge level increased while going through the permutations of a single block), the bottom cell remains untouched and a large gap in charge levels develops between the least charged and most charged cells. When eventually, the least charged cell gets “pushed-to-the-top,” in order to acquire the target charge level, the charging of the cell may take a long time or involve large jumps in charge level (which are prone to cause write-disturbs in neighboring cells). The balanced $n$-RMGC described in this section solves this problem.
A. Definition and Construction
In the current models of flash memory, it is sometimes the
case that due to precision constraints in the charge placement
mechanism, the actual possible charge levels are discrete. The
rank-modulation scheme is not governed by such constraints,
since it only needs to order cell levels unambiguously by means
of comparisons, rather than compare the cell levels against pre-
defined threshold values. However, in order to describe the fol-
lowing results, we will assume abstract discrete levels, that can
be understood as counting the number of push-to-the-top op-
erations executed up to the current state. In other words, each
push-to-the-top increases the maximum charge level by one.
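Thus, we define the function $c_i : \mathbb{N} \to \mathbb{N}$, where $c_i(p)$ is the charge level of the $i$th cell after the $p$th programming cycle. It follows that if we use transition $t_j$ in the $p$th programming cycle and the $i$th cell is, at the time, $j$th from the top, then $c_i(p) > \max_k c_k(p-1)$, and for $k \ne i$, $c_k(p) = c_k(p-1)$. In an optimal setting with no overshoots, $c_i(p) = \max_k c_k(p-1) + 1$.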
The jump in the $p$th round is defined as $c_i(p) - c_i(p-1)$, assuming the $i$th cell was the affected one. It is desirable, when programming cells, to make the jumps as small as possible. We define the jump cost of an $n$-RMGC as the maximum jump during the transitions dictated by the code. We say an $n$-RMGC is nondegenerate if it raises each of its cells at least once. A nondegenerate $n$-RMGC is said to be optimal if its jump cost is not larger than any other nondegenerate $n$-RMGC.
Lemma 6: For any optimal nondegenerate $n$-RMGC, $n \ge 3$, the jump cost is at least $n+1$.
Proof: In an optimal $n$-RMGC, $n \ge 3$, we must raise the lowest cell to the top charge level at least $n$ times. Such a jump must be at least of magnitude $n$. We cannot, however, do these jumps consecutively, or else we return to the first permutation after just $n$ steps. It follows that there must be at least one other transition $t_i$, $i \ne n$, and so the first $t_n$ to be used after it jumps by at least a magnitude of $n+1$.
We call an $n$-RMGC with a jump cost of $n+1$ a balanced $n$-RMGC. We now show a construction that turns any $(n-1)$-RMGC (balanced or not) into a balanced $n$-RMGC. The original $(n-1)$-RMGC is not required to be cyclic or complete, but if it is cyclic (complete) the resulting $n$-RMGC will turn out to be also cyclic (complete). The intuitive idea is to base the construction on cyclic shifts that push the bottom to the top, and use them as often as possible. This is desirable because $t_n$ does not introduce gaps between the charge levels, so it does not aggravate the jump cost of the cycle. Moreover, $t_n$ partitions the set of permutations into $(n-1)!$ orbits of length $n$. Theorem 7 gives a construction where these orbits are traversed consecutively, based on the order given by the supporting $(n-1)$-RMGC.
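Theorem 7: Given a cyclic and complete $(n-1)$-RMGC, $C_{n-1}$, defined by the transitions $t_{i_1}, t_{i_2}, \ldots, t_{i_{(n-1)!}}$, then the following transitions define an $n$-RMGC, denoted by $C_n$, that is cyclic, complete and balanced:
$$t_{j_k} = \begin{cases} t_{n - i_{\lceil k/n \rceil} + 1}, & k \equiv 1 \pmod{n} \\ t_n, & \text{otherwise} \end{cases}$$
for all $k \in \{1, \ldots, n!\}$.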
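Proof: Let us define the abstract transition $\bar{t}_i$, $2 \le i \le n-1$, that pushes to the bottom the $i$th element from the bottom:
$$\bar{t}_i([a_1, \ldots, a_{n-i}, a_{n-i+1}, a_{n-i+2}, \ldots, a_n]) = [a_1, \ldots, a_{n-i}, a_{n-i+2}, \ldots, a_n, a_{n-i+1}].$$
Because $C_{n-1}$ is cyclic and complete, using $\bar{t}_{i_1}, \bar{t}_{i_2}, \ldots, \bar{t}_{i_{(n-1)!}}$ starting with a permutation of $[n-1]$ produces a complete cycle through all the permutations of $[n-1]$, and using them starting with a permutation of $[n]$ creates a cycle through all the $(n-1)!$ permutations of $[n]$ with the respective first element fixed, because they operate only on the last $n-1$ elements.
Because of the first element being fixed, those $(n-1)!$ permutations of $[n]$ produced by $\bar{t}_{i_1}, \bar{t}_{i_2}, \ldots, \bar{t}_{i_{(n-1)!}}$, also have the property of being cyclically distinct. Thus, they are representatives of the $(n-1)!$ distinct orbits of the permutations of $[n]$ under the operation $t_n$, since $t_n$ represents a simple cyclic shift when operated on a permutation of $[n]$.
Taking a permutation of $[n]$, then using the transition $t_{n-i+1}$ once, $2 \le i \le n-1$, followed by $n-1$ times using $t_n$, is equivalent to using $\bar{t}_i$:
$$\bar{t}_i = t_n^{\,n-1} \circ t_{n-i+1}.$$
Every transition of the form $t_{n-i+1}$, $2 \le i \le n-1$, moves us to a different orbit of $t_n$, while the $n-1$ consecutive executions of $t_n$ generate all the elements of the orbit. It follows that the resulting permutations are distinct. Schematically, the construction of $C_n$ based on $C_{n-1}$ is
$$\begin{array}{ll}
C_{n-1}: & t_{i_1},\ t_{i_2},\ \ldots,\ t_{i_{(n-1)!}} \\
C_n: & t_{n-i_1+1},\ \underbrace{t_n, \ldots, t_n}_{n-1},\ t_{n-i_2+1},\ \underbrace{t_n, \ldots, t_n}_{n-1},\ \ldots,\ t_{n-i_{(n-1)!}+1},\ \underbrace{t_n, \ldots, t_n}_{n-1}
\end{array}$$
The code $C_n$ is balanced, because in every block of $n$ transitions starting with a $t_{n-i+1}$, we have: the transition $t_{n-i+1}$ has a jump of $n-i+1$; the following $i-1$ transitions $t_n$ have a jump of $n+1$, and the rest a jump of $n$. In addition, because $C_{n-1}$ is cyclic and complete, it follows that $C_n$
is also cyclic and complete.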
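The transition rule of Theorem 7 and the resulting jump cost can be explored with a short Python sketch. It assumes abstract integer charge levels that start at $1, \ldots, n$ and grow by one per push, as in the definitions above; the function names and the choice of $[1, 2, \ldots, n]$ as starting permutation are illustrative.

```python
from itertools import permutations


def balanced_rmgc(n):
    """Transition indices of the balanced n-RMGC built recursively from t_2, t_2."""
    if n == 2:
        return [2, 2]
    seq = []
    for i in balanced_rmgc(n - 1):     # supporting (n-1)-RMGC
        seq.append(n - i + 1)          # k = 1 (mod n): move to the next orbit
        seq.extend([n] * (n - 1))      # otherwise: walk the orbit with t_n
    return seq


def jumps(transitions, n, cycles=2):
    """Jump of every push, with the top cell initially at level n."""
    perm = list(range(1, n + 1))
    level = {cell: n - pos for pos, cell in enumerate(perm)}
    top, out = n, []
    for i in transitions * cycles:
        cell = perm.pop(i - 1)         # i-th cell from the top
        perm.insert(0, cell)           # push it to the top
        top += 1
        out.append(top - level[cell])
        level[cell] = top
    return out


def is_cyclic_complete(transitions, n):
    """Brute-force check starting from [1..n]."""
    start = tuple(range(1, n + 1))
    cur, seen = start, set()
    for i in transitions:
        seen.add(cur)
        cur = (cur[i - 1],) + cur[:i - 1] + cur[i:]
    return cur == start and len(seen) == len(transitions) and seen == set(permutations(start))


for n in range(3, 7):
    seq = balanced_rmgc(n)
    print(n, is_cyclic_complete(seq, n), max(jumps(seq, n)))
```

Running two cycles in `jumps` is a precaution so that the wrap-around from the last permutation back to the first is also measured.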
Theorem 8: For any $n \ge 3$, there exists a cyclic, complete, and balanced $n$-RMGC.
Proof: We can use Theorem 7 to recursively construct all the supporting $n'$-RMGCs, $n' \in \{n-1, \ldots, 3\}$, with the basis of the recursion being the complete cyclic 2-RMGC: $t_2$, $t_2$.
A similar construction, but using a more involved second-
order recursion, was later suggested by Etzion [9].
Example 9: Fig. 2 shows the transitions of a recursive, balanced $n$-RMGC for $n = 4$. The permutations are represented in an $(n-1)! \times n$ matrix, where each row is an orbit generated by $t_4$. The transitions between rows occur when $4$ is the top element. Note how these permutations (the exit points of the orbits), after dropping the $4$ at the top and turning them upside down, form a 3-RMGC:
$$\begin{array}{cccccc} 1 & 2 & 3 & 2 & 1 & 3 \\ 3 & 1 & 2 & 3 & 2 & 1 \\ 2 & 3 & 1 & 1 & 3 & 2 \end{array}$$
This code is equivalent to the code from Example 1, up to a rotation of the transition sequence and the choice of first permutation. Fig. 3 shows the charge levels of the cells for each programming cycle, for the resulting balanced 4-RMGC.
B. Successor Function
The balanced $n$-RMGC can be used to implement a logic cell with $n!$ levels. This can also be understood as a counter that increments its value by one unit at a time. The function $\mathrm{Successor}(n, [a_1, \ldots, a_n])$ takes as input the current permutation, and determines the transition to the next permutation in the balanced recursive $n$-RMGC. If $n = 2$, the next transition is always $t_2$ (line 2). Otherwise, if the top element is not $n$, then the current permutation is not at the exit point of its orbit, therefore the next transition is $t_n$ (line 5). However, if $n$ is the top element, then the transition is defined by the supporting cycle. The function is called recursively, on the reflected permutation
of $[n-1]$ (line 7).
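The rule just described can be written as a short recursive function. The Python sketch below is one reading of it rather than the paper's listing: the names `successor` and `next_permutation` are illustrative, and the second return value (the number of "is the top element equal to $n$?" queries) is added only for counting steps.

```python
from itertools import permutations


def successor(perm):
    """Return (i, steps): the next transition t_i and the number of queries used."""
    n = len(perm)
    if n == 2:
        return 2, 0                       # basis: always t_2, no query needed
    if perm[0] != n:                      # one query on the top-charged cell
        return n, 1                       # not at an exit point: keep shifting
    # exit point: drop n, reflect, and ask the supporting (n-1)-RMGC
    i, steps = successor(perm[:0:-1])
    return n - i + 1, steps + 1


def next_permutation(perm):
    """Apply the push-to-the-top transition chosen by successor()."""
    i, _ = successor(perm)
    return (perm[i - 1],) + perm[:i - 1] + perm[i:]


start = (1, 4, 2, 3)
cycle, cur = [], start
for _ in range(24):
    cycle.append(cur)
    cur = next_permutation(cur)
print(cur == start, len(set(cycle)))
print(sum(successor(p)[1] for p in permutations(range(1, 5))))
```

The slice `perm[:0:-1]` reads the permutation from the bottom up and stops before the top element, which is the reflected permutation of $[n-1]$ used by the recursion.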
Fig. 2. Balanced 4-RMGC.
Fig. 3. Charge level growth for the balanced 4-RMGC.
An important practical aspect is the average number of steps required to decide which transition generates the next permutation from the current one. A step is defined as a single query of the form “what is the $i$th highest charged cell?” namely the comparison in line 4.
The function $\mathrm{Successor}$ is asymptotically optimal with respect to this measure:
Theorem 10: In the function $\mathrm{Successor}$, the asymptotic average number of steps to create the successor of a given permutation is one.
Proof: A fraction of $\frac{n-1}{n}$ of the transitions are $t_n$, and these occur whenever the cell $n$ is not the highest charged one, and they are determined in just one step. Of the cases where $n$ is highest charged, by recursion, a fraction $\frac{n-2}{n-1}$ of the transitions are determined by just one more step, and so on. At the basis of the recursion, permutations over two elements require zero steps. Equivalently, the query “is $a_1$ equal to $n$” is performed for every permutation, therefore $n!$ times; the query “is $a_n$ equal to $n-1$” is performed only for $(n-1)!$ permutations, therefore $(n-1)!$ times, and so on. Thus, the total number of queries is $\sum_{i=3}^{n} i!$. Since $\lim_{n \to \infty} \frac{\sum_{i=3}^{n} i!}{n!} = 1$, the asymptotic average number of steps to generate the next permutation is as
stated.
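For readers who want the limit spelled out, the following short bound is one way to see it, assuming the total query count is $\sum_{i=3}^{n} i!$ (one query per permutation at each recursion level above the two-element basis). For $n \ge 4$, the terms with $i \le n-2$ are each at most $(n-2)!$, so

$$\frac{1}{n!}\sum_{i=3}^{n} i! \;=\; 1 + \frac{1}{n} + \sum_{i=3}^{n-2}\frac{i!}{n!} \;\le\; 1 + \frac{1}{n} + \frac{n-4}{n(n-1)} \;\xrightarrow[n \to \infty]{}\; 1.$$

As a small check, for $n = 4$ the average is $(3! + 4!)/4! = 30/24 = 1.25 = 1 + \tfrac{1}{4}$.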
C. Ranking Permutations
In order to complete the design of the logic cell, we need to define the correspondence between a permutation and its rank in the balanced $n$-RMGC. This problem is similar to that of ranking permutations in lexicographic order. We will first review the factoradic numbering system, and then present a new numbering system that we call b-factoradic, induced by the balanced $n$-RMGC construction.
1) Review of the Factoradic Numbering System: The factoradic is a mixed radix numbering system. The earliest reference appears in [26]. Lehmer [27] describes algorithms that make the correspondence between permutations and factoradic.
Any integer number $m \in \{0, 1, \ldots, n!-1\}$ can be represented in the factoradic system by the digits $d_{n-1} d_{n-2} \ldots d_1 d_0$, where $d_i \in \{0, \ldots, i\}$ for $i \in \{0, \ldots, n-1\}$, and the weight of $d_i$ is $i!$ (with the convention that $0! = 1$). The digit $d_0$ is always $0$, and is sometimes omitted:
$$m = \sum_{i=0}^{n-1} d_i \cdot i!.$$
Any permutation $[a_1, \ldots, a_n] \in S_n$ has a unique factoradic representation that gives its position in the lexicographic ordering. The digits $d_i$ are in this case the number of elements smaller than $a_{n-i}$ that are to the right of $a_{n-i}$. They are therefore inversion counts, and the factoradic representation is an inversion
table (or vector) [15].
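The correspondence between inversion counts and lexicographic rank can be illustrated with a naive quadratic-time Python sketch. The function names are illustrative, and this is the simple counting approach rather than any of the faster algorithms from the literature.

```python
from itertools import permutations
from math import factorial


def factoradic_digits(perm):
    """Digits d_{n-1}, ..., d_0: for each position, count smaller elements to its right."""
    return [sum(1 for y in perm[k + 1:] if y < x) for k, x in enumerate(perm)]


def lex_rank(perm):
    """Lexicographic rank: digit d_i (stored at index n-1-i) has weight i!."""
    n = len(perm)
    return sum(d * factorial(n - 1 - k) for k, d in enumerate(factoradic_digits(perm)))


perm = (3, 1, 4, 2)
print(factoradic_digits(perm), lex_rank(perm))
print(sorted(permutations(range(1, 5))).index(perm))
```

The last line recomputes the rank by brute-force enumeration, which is a convenient cross-check for small $n$.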
There is a large literature devoted to the study of ranking permutations from a complexity perspective. Translating between factoradic and decimal representation can be done in $O(n)$ arithmetic operations. The bottleneck is how to translate efficiently between permutations and factoradic. A naive approach similar to the simple algorithms described in [27] requires $O(n^2)$. This can be improved to $O(n \log n)$ by using merge-sort counting, or a binary search tree, or modular arithmetic, all techniques described in [25]. This can be further improved to $O(n \log n / \log \log n)$ [30], by using the special data structure of Dietz [7]. In [30] linear time complexity is also achieved by departing from lexicographic ordering. A linear time complexity is finally achieved in [28], by using the fact that the word size has to be $O(n \log n)$ in order to represent numbers up to $n!$, and by hiding rich data structures in integers of this size.
2) B-Factoradic—A New Numbering System: We will now
describe how to index permutations of the balanced recursive $n$-RMGC with numbers from $\{0, \ldots, n!-1\}$, such that consecutive permutations in the cycle have consecutive ranks modulo $n!$. The permutation that gets index $0$ is a special permutation that starts a new orbit generated by $t_n$, and also starts a new orbit in any of the recursive supporting $n'$-RMGCs, $n' = n-1, n-2, \ldots, 2$.
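The rank of a permutation is determined by its position in the orbit of $t_n$, and by the rank of the orbit, as given by the rank of the supporting permutation of $[n-1]$. The position of a permutation inside an orbit of $t_n$ is given by the position of $n$. If the current permutation is $[a_1, a_2, \ldots, a_n]$ and $a_i = n$ for $i \in [n]$, then the position in the current orbit of $t_n$ is $(i-2) \bmod n$ (because the orbit starts with $n$ in position $2$). The index of the current orbit is given by the rank of the supporting permutation of $[n-1]$, namely, the rank of $[a_{i-1}, a_{i-2}, \ldots, a_1, a_n, a_{n-1}, \ldots, a_{i+1}]$ (notice that the permutation of $[n-1]$ is reflected). Therefore, if $a_i = n$, then
$$\mathrm{Rank}([a_1, a_2, \ldots, a_n]) = n \cdot \mathrm{Rank}([a_{i-1}, \ldots, a_1, a_n, \ldots, a_{i+1}]) + ((i-2) \bmod n). \qquad (1)$$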
The above formula can be used recursively to determine
the rank of the permutations from the supporting balanced
-RMGCs, for . It now becomes clear
what permutation should take rank . The highest element
in every supporting RMGC should be in the second position,
therefore, , , , , and so
on, and . Therefore,
gets the rank . See Example 9 for the construction of the
recursive and balanced 4-RMGC where the permutation
has rank . Equation (1) induces a new numbering
system that we call b-factoradic (backwards factoradic). A
number $m \in \{0, 1, \ldots, n!-1\}$ can be represented by the digits
$d_{n-1} d_{n-2} \cdots d_1 d_0$, where $d_i \in \{0, 1, \ldots, i\}$ and the
weight of $d_i$ is $n!/(i+1)!$. In this case $d_0$ is always $0$ and
can be omitted. It is easy to verify that this is a valid numbering
system; therefore, any $m \in \{0, 1, \ldots, n!-1\}$ has a unique
b-factoradic representation such that
$$m = \sum_{i=1}^{n-1} d_i \frac{n!}{(i+1)!}.$$
The weights of the b-factoradic are sometimes called
“falling factorials,” and can be represented succinctly by the
Pochhammer symbol.
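The numbering system can be illustrated with a short Python sketch. It is only an illustration under a stated assumption: we read the description above as saying that digit $d_i$ ranges over $\{0, \ldots, i\}$ and carries the falling-factorial weight $n!/(i+1)!$, so that $d_{n-1}$ is the least significant digit and $d_0$ is always $0$. The function names `to_bfactoradic` and `from_bfactoradic` are hypothetical.

```python
from math import factorial


def to_bfactoradic(m, n):
    """Return digits [d_0, ..., d_{n-1}] with 0 <= d_i <= i.

    Assumed weights: d_i has weight n!/(i+1)!, so d_{n-1} is the
    least significant digit (radix n) and d_0 is always 0.
    """
    if not 0 <= m < factorial(n):
        raise ValueError("m must lie in 0 .. n!-1")
    digits = [0] * n
    for i in range(n - 1, 0, -1):
        # Digit d_i has i + 1 possible values.
        m, digits[i] = divmod(m, i + 1)
    return digits


def from_bfactoradic(digits):
    """Inverse of to_bfactoradic under the same assumed weights."""
    n = len(digits)
    return sum(d * (factorial(n) // factorial(i + 1))
               for i, d in enumerate(digits))


digits = to_bfactoradic(5, 3)      # [0, 1, 2], since 5 = 1*3 + 2*1
value = from_bfactoradic(digits)   # 5
```

The round trip covers every value in $\{0, \ldots, n!-1\}$ exactly once, which is the uniqueness property claimed for the numbering system.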
Example 11: Let and be the
current permutation. We can find its b-factoradic representation
as follows. We start from the least significant digit
, which is given by the position of minus modulo , so
(here we keep the elements of the permutation
indexed from to ). We now recurse on the residual permuta-
tion of five elements, (notice the reflected
reading of this permutation, from towards the left). Now is
given by the position of ; . The residual
permutation is , therefore,
. For the next step, and .
Finally, and . As always
. The b-factoradic representation is therefore
, where the subscript indicates the position of
the digit. Going from a b-factoradic representation to a permu-
tation of the balanced -RMGC can follow a similar reversed
procedure.
The procedure of Example 11 can be formalized algorithmi-
cally, however, its time complexity is , similar to the naive
algorithms specific to translations between permutations in lex-
icographic order and factoradic. We can in principle use all the
available results for factoradic, described previously, to achieve
time complexity of or lower. However, we are not
going to repeat all those methods here, but rather describe a
linear time procedure that takes a permutation and its factoradic
as input and outputs the b-factoradic. We can thus leverage di-
rectly all the results available for factoradic, and use them to
determine the current symbol of a logic cell.
The procedure - - exploits the
fact that the inversion counts are already given by the factoradic
representation, and they can be used to compute directly the
digits of the b-factoradic. A b-factoradic digit is a count of
the elements smaller than that lie between and
when the permutation is viewed as a cycle. The direction
of the count alternates for even and odd values of . The inverse
of the input permutation can be computed in time (line
1). The position of every element of the permutation can then
be computed in constant time (lines 2 and 5). The test in line
6 decides if we count towards the right or left starting from the
position that holds element , until we reach position
that holds element . By working out the cases when
and we obtain the formulas in lines 7 and 9. Since this
computation takes a constant number of arithmetic operations,
the entire algorithm takes time.
Unranking, namely, going from a number in
to a permutation in balanced order is likely to never be necessary
in practice, since the logic cell is designed to be a counter. How-
ever, for completeness, we describe the simplest proce-
dure , that takes a b-factoradic as input and produces
the corresponding permutation. The procedure uses variable
to simulate the cyclic counting of elements smaller than the cur-
rent one. The direction of the counting alternates, based on the
test in line 4.
IV. REWRITING WITH RANK-MODULATION CODES
In Gray codes, the states transit along a well-designed path.
What if we want to use the rank-modulation scheme to store
data, and allow the data to be modified in arbitrary ways? Consider
an information symbol $i \in [\ell] = \{1, 2, \ldots, \ell\}$ that is stored using $n$ cells.
In general, $\ell$ might be smaller than $n!$, so we might end up
having permutations that are not used. On the other hand, we
can map several distinct permutations to the same symbol in
order to reduce the rewrite cost. We let $Q \subseteq S_n$ denote the set
of states (i.e., the set of permutations) that are used to represent
information symbols. We define two functions, an interpretation
function, $\phi$, and an update function, $\mu$.
Definition 12: The interpretation function $\phi : Q \to [\ell]$
maps every state $s \in Q$ to a value $\phi(s)$ in $[\ell]$. Given an “old
state” $s \in Q$ and a “new information symbol” $i \in [\ell]$, the
update function $\mu : Q \times [\ell] \to Q$ produces a state $\mu(s, i)$
such that $\phi(\mu(s, i)) = i$.
When we use $n$ cells to store an information symbol, the
permutation induced by the charge levels of the cells repre-
sents the information through the interpretation function. We
can start the process by programming some arbitrary initial per-
mutation in the flash cells. Whenever we want to change the
stored information symbol, the permutation is changed using
the “push-to-the-top” operations based on the update function.
We can keep changing the stored information as long as we
do not reach the maximal charge level possible in any of the
cells. Therefore, the number of “push-to-the-top” operations in
each rewrite operation determines not only the rewriting delay
but also how much closer the highest cell-charge level is to the
system limit (and therefore how much closer the cell block is
to the next costly erase operation). Thus, the objective of the
coding scheme is to minimize the number of “push-to-the-top”
operations.
Definition 13: Given two states $s_1, s_2 \in S_n$, the cost of
changing $s_1$ into $s_2$, denoted $\alpha(s_1 \to s_2)$, is defined as the
minimum number of “push-to-the-top” operations needed to change
$s_1$ into $s_2$.
For example, $\alpha([1,2,3] \to [2,1,3]) = 1$ and
$\alpha([1,2,3] \to [3,2,1]) = 2$. We define two important measures: the worst case
rewrite cost and the average rewrite cost.
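The cost of Definition 13 can be computed directly, as the following Python sketch suggests. It relies on the characterization used later in the proof of Lemma 15: the cells that are never pushed keep their relative order and end up at the bottom, so the cost equals the number of cells minus the length of the longest suffix of the target state that is already in the relative order of the old state. States are written with the top-ranked cell first; the function name `rewrite_cost` is hypothetical.

```python
def rewrite_cost(old, new):
    """Minimum number of push-to-the-top operations turning old into new.

    Both states are sequences listing the same cells, top-ranked first.
    """
    position = {cell: k for k, cell in enumerate(old)}
    n = len(new)
    keep = min(1, n)  # a single bottom cell is trivially "in order"
    # Grow the untouched suffix of `new` leftwards while it respects
    # the relative order the cells had in `old`.
    while keep < n and position[new[n - keep - 1]] < position[new[n - keep]]:
        keep += 1
    return n - keep


cost_a = rewrite_cost([1, 2, 3], [2, 1, 3])  # 1: push cell 2
cost_b = rewrite_cost([1, 2, 3], [3, 2, 1])  # 2: push cell 2, then cell 3
```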
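Definition 14: The worst case rewrite cost is defined as
$\max_{s \in Q,\, i \in [\ell]} \alpha(s \to \mu(s, i))$. Assume input symbols are
independent and identically distributed (i.i.d.) random variables
having value $i \in [\ell]$ with probability $p_i$. Given a fixed
$s \in Q$, the average rewrite cost given $s$ is defined as
$\bar{\alpha}(s) = \sum_{i \in [\ell]} p_i\, \alpha(s \to \mu(s, i))$. If we further assume some
stationary probability distribution over the states $Q$, where we
denote the probability of state $s$ as $q_s$, then the average rewrite
cost of the code is defined as $\bar{\alpha} = \sum_{s \in Q} q_s\, \bar{\alpha}(s)$. (Note that for all
$i \in [\ell]$, $p_i = \sum_{s \in Q : \phi(s) = i} q_s$.)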
In this section, we present a code that minimizes the worst
case rewrite cost. In Section V, we focus on codes with good
average rewrite cost.
A. Lower Bound
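We start by presenting a lower bound on the worst case rewrite
cost. Define the transition graph $G = (V, E)$ as a directed graph
with $V = S_n$, that is, with $n!$ vertices representing the permutations
in $S_n$. For any $u, v \in V$, there is a directed edge from
$u$ to $v$ iff $\alpha(u \to v) = 1$. $G$ is a regular digraph, because every
vertex has $n-1$ incoming edges and $n-1$ outgoing edges. The
diameter of $G$ is $\max_{u, v \in V} \alpha(u \to v) = n-1$.
Given a vertex $u \in V$ and an integer $r \in \{0, 1, \ldots, n-1\}$,
define the ball centered at $u$ with radius $r$ as
$B_r^n(u) = \{ v \in V \mid \alpha(u \to v) \le r \}$, and define the sphere centered at
$u$ with radius $r$ as $S_r^n(u) = \{ v \in V \mid \alpha(u \to v) = r \}$. Clearly
$$B_r^n(u) = \bigcup_{0 \le i \le r} S_i^n(u).$$
By a simple relabeling argument, both $|B_r^n(u)|$ and $|S_r^n(u)|$ are
independent of $u$, and so will be denoted by $|B_r^n|$ and $|S_r^n|$, respectively.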
Lemma 15: For any $0 \le r \le n-1$,
$$|B_r^n| = \frac{n!}{(n-r)!}, \qquad |S_r^n| = \begin{cases} 1, & r = 0 \\ \frac{n!}{(n-r)!} - \frac{n!}{(n-r+1)!}, & 1 \le r \le n-1. \end{cases}$$
Proof: Fix a permutation $u \in S_n$. Let $P_r$ be the set of permutations
having the following property: for each permutation
$v \in P_r$, the elements appearing in its last $n-r$ positions appear
in the same relative order in $u$. For example, if $n = 5$, $r = 2$,
$u = [1,2,3,4,5]$, and $v = [5,2,1,3,4]$, the last three elements
of $v$ (namely, $1, 3, 4$) have the same relative order in $u$. It is
easy to see that given $u$, when the elements occupying the first
$r$ positions in $v \in P_r$ are chosen, the last $n-r$ positions become
fixed. There are $n(n-1)\cdots(n-r+1)$ choices for occupying
the first $r$ positions of $v \in P_r$; hence, $|P_r| = n!/(n-r)!$. We will
show that a vertex $v \in V$ is in $B_r^n(u)$ if and only if $v \in P_r$.
Suppose $v \in B_r^n(u)$. It follows that $v$ can be obtained from $u$
with at most $r$ “push-to-the-top” operations. Those elements
pushed to the top appear in the first $r$ positions of $v$, so the
last $n-r$ positions of $v$ contain elements which have the same
relative order in $u$; thus, $v \in P_r$.
Now suppose $v \in P_r$. For $i \in [n]$, let $v_i$ denote the element in
the $i$th position of $v$. One can transform $u$ into $v$ by sequentially
pushing $v_r, v_{r-1}, \ldots, v_1$ to the top. Hence, $v \in B_r^n(u)$.
We conclude that $|B_r^n(u)| = |P_r| = n!/(n-r)!$. Since
$S_r^n(u) = B_r^n(u) \setminus B_{r-1}^n(u)$ for $r \ge 1$, the second claim follows.
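The ball sizes in Lemma 15 can be checked by brute force for small $n$. The Python sketch below runs a breadth-first search over the transition graph, where each step is one push-to-the-top operation, and counts the states within each radius of a fixed starting permutation. The helper names `push_to_top`, `ball_sizes`, and `lemma15_ball_size` are hypothetical, and the check is an illustration, not a proof.

```python
from collections import deque
from math import factorial


def push_to_top(state, k):
    """Move the cell at 0-based position k to the front (top rank)."""
    return (state[k],) + state[:k] + state[k + 1:]


def ball_sizes(n):
    """Return [|B_0|, ..., |B_{n-1}|] around the identity, found by BFS."""
    start = tuple(range(1, n + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for k in range(1, n):          # k = 0 would leave s unchanged
            t = push_to_top(s, k)
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return [sum(1 for d in dist.values() if d <= r) for r in range(n)]


def lemma15_ball_size(n, r):
    """Closed form n!/(n-r)! for the ball size."""
    return factorial(n) // factorial(n - r)


sizes = ball_sizes(4)  # compare with [lemma15_ball_size(4, r) for r in range(4)]
```

For $n = 4$ the search reaches all $4! = 24$ states within radius $3$, in agreement with the stated diameter $n-1$.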
The following lemma presents a lower bound on the worst
case rewrite cost.
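Lemma 16: Fix integers $n$ and $\ell$, and define $\rho(n, \ell)$ to be
the smallest integer such that $|B_{\rho(n,\ell)}^n| \ge \ell$. For any code
and any state $s \in Q$, there exists $i \in [\ell]$ such that
$\alpha(s \to \mu(s, i)) \ge \rho(n, \ell)$, i.e., the worst case rewrite cost of any code
is at least $\rho(n, \ell)$.
Proof: By the definition of $\rho(n, \ell)$, $|B_{\rho(n,\ell)-1}^n| < \ell$. Hence,
we can choose $i \in [\ell] \setminus \{ \phi(s') \mid s' \in B_{\rho(n,\ell)-1}^n(s) \}$. Clearly, by
our choice, $\alpha(s \to \mu(s, i)) \ge \rho(n, \ell)$.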
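The lower bound of Lemma 16 is easy to evaluate numerically. The following Python sketch computes the smallest radius whose ball contains at least as many states as there are information symbols, using the ball size $n!/(n-r)!$ from Lemma 15. The function name `smallest_radius` and its argument names are hypothetical.

```python
from math import factorial


def smallest_radius(n, num_symbols):
    """Smallest r with n!/(n-r)! >= num_symbols (the bound of Lemma 16)."""
    if not 1 <= num_symbols <= factorial(n):
        raise ValueError("need 1 <= num_symbols <= n!")
    r = 0
    # The loop stops by r = n - 1 at the latest, where the ball is all of S_n.
    while factorial(n) // factorial(n - r) < num_symbols:
        r += 1
    return r


bound_small = smallest_radius(3, 3)   # 1, since the radius-1 ball has 3 states
bound_large = smallest_radius(4, 9)   # 2, since 4 < 9 <= 12
```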
B. Optimal Code
We now present a code construction. It will be shown that the
code has optimal worst case performance. First, let us define the
following notation.
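Definition 17: A prefix sequence $a = [a(1), a(2), \ldots, a(m)]$
is a sequence of $m \le n$ distinct symbols from $[n]$. The prefix
set $P_n(a) \subseteq S_n$ is defined as all the permutations in $S_n$ which
start with the sequence $a$.
We are now in a position to construct the code.
Construction 18: Arbitrarily choose $\ell$ distinct prefix sequences,
$a_1, \ldots, a_\ell$, each of length $\rho(n, \ell)$. Let us define
$Q = \bigcup_{i \in [\ell]} P_n(a_i)$ and map the states of $P_n(a_i)$ to $i$, i.e., for
each $i \in [\ell]$ and $s \in P_n(a_i)$, set $\phi(s) = i$.
Finally, to construct the update function $\mu$, given $s \in Q$ and
some $i \in [\ell]$, we do the following: let
$[a_i(1), a_i(2), \ldots, a_i(\rho(n,\ell))]$ be the first $\rho(n, \ell)$ elements which appear in all the permutations
in $P_n(a_i)$. Apply push-to-the-top on the elements
$a_i(\rho(n,\ell)), \ldots, a_i(2), a_i(1)$ in $s$ to get a permutation $s' \in P_n(a_i)$
for which, clearly, $\phi(s') = i$. Set $\mu(s, i) = s'$.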
Theorem 19: The code in Construction 18 is optimal in terms
of minimizing the worst case rewrite cost.
Proof: First, the number of length-$m$ prefix sequences is
$n!/(n-m)! = |B_m^n|$. By definition, the number of prefix sequences
of length $\rho(n, \ell)$ is at least $\ell$, which allows the first step of the
construction. To complete the proof, it is obvious from the description
of $\mu$ that the worst case rewrite cost of the construction is
at most $\rho(n, \ell)$. By Lemma 16 this is also the best we can hope
for.
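Construction 18 can be sketched in Python as follows. This is a minimal illustration under assumptions of our own: the prefix sequences are simply the first ones in lexicographic order (the construction allows any choice), states are tuples with the top-ranked cell first, and the names `build_code`, `interpret`, and `update` are hypothetical.

```python
from itertools import islice, permutations
from math import factorial


def build_code(n, num_symbols):
    """Return (radius, interpret, update) for a fixed-length prefix code."""
    radius = 0
    while factorial(n) // factorial(n - radius) < num_symbols:
        radius += 1
    # One prefix sequence of length `radius` per symbol 1..num_symbols.
    prefixes = list(islice(permutations(range(1, n + 1), radius), num_symbols))
    symbol_of = {a: i for i, a in enumerate(prefixes, start=1)}

    def interpret(state):
        """Interpretation function: read the symbol off the top cells."""
        return symbol_of[tuple(state[:radius])]  # KeyError if state unused

    def update(state, symbol):
        """Update function: push the prefix cells to the top, last first."""
        cells = list(state)
        for cell in reversed(prefixes[symbol - 1]):
            cells.remove(cell)
            cells.insert(0, cell)
        return tuple(cells)

    return radius, interpret, update


radius, interpret, update = build_code(4, 9)
new_state = update((1, 2, 3, 4), 4)   # (2, 1, 3, 4) with this prefix choice
```

Every update uses exactly `radius` pushes in this sketch, so its cost never exceeds the lower bound of Lemma 16; an implementation could skip pushes when the state already starts with part of the required prefix.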
Example 20: Let , . Since , it fol-
lows that . We partition the states into
sets, which induce the mapping
The cost of any rewrite operation is at most .
V. OPTIMIZING AVERAGE REWRITE COST
In this section, we study codes that minimize the average
rewrite cost. We first present a prefix-free code that is optimal
in terms of its own design objective. Then, we show that this
prefix-free code minimizes the average rewrite cost with an
approximation ratio $3$ if $\ell \le n!/2$, and when $\ell \le n!/6$, the
approximation ratio is further reduced to $2$.
A. Prefix-Free Code
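The prefix-free code we propose consists of $\ell$ prefix sets
$P_n(a_1), \ldots, P_n(a_\ell)$ (induced by $\ell$ prefix sequences $a_1, \ldots, a_\ell$)
which we will map to the $\ell$ input symbols: for every $i \in [\ell]$ and
$s \in P_n(a_i)$, we set $\phi(s) = i$. Unlike in the previous section, the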
prefix sequences are no longer necessarily of the same length.
We do, however, require that no prefix sequence be the prefix
of another.
A prefix-free code can be represented by a tree. First, let us
define a full permutation tree $T$ as follows. The vertices in $T$
are placed in $n$ layers, where the root is in layer $0$ and the $n!$
leaves are in layer $n-1$. Edges only exist between adjacent layers.
For $i = 0, 1, \ldots, n-2$, a vertex in layer $i$ has $n-i$ children. The
edges are labeled in such a way that every leaf corresponds to a
permutation from $S_n$ which may be constructed from the labels
on the edges from the root to the leaf. An example is given in
Fig. 4(a). A prefix-free code corresponds to a subtree of $T$
(see Fig. 4(b) for an example). Every leaf is mapped to a prefix
sequence which equals the string of labels as read on the path
from the root to the leaf.
For $i \in [\ell]$, let $a_i$ denote the prefix sequence representing $i$,
and let $|a_i|$ denote its length. For example, the prefix sequences
in Fig. 4(b) have minimum length $1$ and maximum length $3$. The
average codeword length is defined as
$$\sum_{i \in [\ell]} p_i |a_i|.$$
Here, the probabilities $p_i$ are as defined before, that is, information
symbols are i.i.d. random variables having value $i \in [\ell]$
with probability $p_i$. We can see that with the prefix-free
code, for every rewrite operation (namely, regardless of the old
permutation before the rewriting), the expected rewrite cost is
upper-bounded by $\sum_{i \in [\ell]} p_i |a_i|$. Our objective is to design a
prefix-free code that minimizes its average codeword length.
Fig. 4. Prefix-free rank-modulation code for $n = 4$ and $\ell = 9$. (a) The full permutation tree $T$. (b) A prefix-free code represented by a subtree of $T$. The leaves
represent the prefix sequences, which are displayed beside the leaves.
Example 21: Let and , and let the prefix-free
code be as shown in Fig. 4(b). We can map the information
symbols to the prefix sequences as follows:
Then, the mapping from the permutations to the information
symbols is
Assume that the current state of the cells is
, representing the information symbol . If we want to
rewrite the information symbol as , we can shift cells 3, 4 to the
top to change the state to . This rewrite
cost is , which does not exceed . In general, given any
current state, considering all the possible rewrites, the expected
rewrite cost is always less than , the average code-
word length.
The optimal prefix-free code cannot be constructed with a
greedy algorithm like the Huffman code [19], because the in-
ternal nodes in different layers of the full permutation tree
have different degrees, making the distribution of the vertex de-
grees in the code tree initially unknown. The Huffman code is
a well-known variable-length prefix-free code, and many vari-
ations of it have been studied. In [20], the Huffman code con-
struction was generalized, assuming that the vertex-degree dis-
tribution in the code tree is given. In [1], prefix-free codes for
infinite alphabets and nonlinear costs were presented. When the
letters of the encoding alphabet have unequal lengths, only ex-
ponential-time algorithms are known, and it is not known yet
whether this problem is NP-hard [12]. To construct prefix-free
codes for our problem, which minimize the average codeword
length, we present a dynamic-programming algorithm of time
complexity . Note that without loss of generality, we can
assume the length of any prefix sequence to be at most .
The algorithm computes a set of functions ,
for , , and
. We interpret the meaning
of as follows. We take a subtree of that con-
tains the root. The subtree has exactly leaves in the layers
. It also has at most vertices in the layer .
We let the leaves represent the letters from the alphabet
with the lowest probabilities : the further the leaf is from the
root, the lower the corresponding probability is. Those leaves
also form prefix sequences, and we call their weighted av-
erage length (where the probabilities are weights) the value
of the subtree. The minimum value of such a subtree (among
all such subtrees) is defined to be . In other words,
is the minimum average prefix-sequence length
when we assign a subset of prefix sequences to a subtree of
(in the way described above). Clearly, the minimum average
codeword length of a prefix-free code equals .
Without loss of generality, let us assume that
. It can be seen that the following recursions hold.
• When and
• When and
• When
• When and
Fig. 5. Three cases for computing in Example 22. The solid-line edges are in the subtree. The dotted-line edges are the remaining edges in the
full-permutation tree . The leaves in the subtree are shown as black vertices. (a) No leaf in layer 2. (b) One leaf in layer 2. (c) Two leaves in layer 2.
The last recursion holds because a subtree with leaves in
layers and at most vertices in layer can
have leaves in layer .
The algorithm first computes , then
, and so on, until it finally computes ,
by using the above recursions. Given these values, it is straight-
forward to determine in the optimal code, how many prefix
sequences are in each layer, and therefore determine the optimal
code itself. It is easy to see that the algorithm returns an optimal
code in time .
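To make the dynamic-programming idea concrete, the following Python sketch computes the minimum average codeword length for small parameters. It is not the recursion stated above: it is a top-down variant of our own that walks the layers from the root, decides how many leaves to place in each layer, and passes the number of still-available vertices to the next layer. It uses the same two facts as the text, namely that a vertex in layer $i$ of the full permutation tree has $n-i$ children and that more probable symbols should receive shorter prefix sequences. The name `min_average_length` is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

INF = float("inf")


def min_average_length(n, probs):
    """Minimum average prefix length over codes in the permutation tree."""
    p = sorted(probs, reverse=True)        # most probable symbols first
    total = len(p)
    prefix = [0]
    for x in p:
        prefix.append(prefix[-1] + x)      # prefix[k] = p[0] + ... + p[k-1]

    @lru_cache(maxsize=None)
    def best(layer, used, free):
        """Cost of placing symbols used..total-1 in layers >= layer,
        given `free` available vertices in `layer`."""
        remaining = total - used
        if remaining == 0:
            return 0
        if layer == n - 1:                 # last layer: all must fit here
            if remaining > free:
                return INF
            return (n - 1) * (prefix[total] - prefix[used])
        result = INF
        for t in range(min(free, remaining) + 1):   # t leaves in this layer
            children = min((free - t) * (n - layer), total)
            here = layer * (prefix[used + t] - prefix[used])
            result = min(result, here + best(layer + 1, used + t, children))
        return result

    return best(1, 0, min(n, total))


uniform = min_average_length(4, [Fraction(1, 9)] * 9)
```

For nine equiprobable symbols and four cells the sketch places one leaf in layer 1 and eight leaves in layer 2; a skewed distribution generally leads to a different tree shape.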
Example 22: Let and , and let us assume that
. As an example, let us consider how to
compute .
By definition, corresponds to a subtree of with
a total of four leaves in layer 2 and layer 3, and with at most
three vertices in layer 2. Thus, there are four cases to consider:
either there are zero, one, two, or three leaves in layer 2. The
corresponding subtrees in the first three cases are as shown in
Fig. 5(a)–(c), respectively. The fourth case is actually impos-
sible, because it leaves no place for the fourth leaf to exist in the
subtree.
If layer 2 has leaves , then layer 3 has leaves
and there can be at most vertices in layer 3 of the sub-
tree. To assign to the four leaves and minimize the
weighted average distance of the leaves to the root (which is de-
fined as ), among the four cases mentioned above, we
choose the case that minimizes that weighted average distance.
Therefore
Now assume that after computing all the ’s, we
find that
That means that in the optimal code tree, there are two leaves in
layer 1. If we further assume that
we can determine that there are five leaves in layer 2, and the
optimal code tree will be as shown in Fig. 4(b).
We can use the prefix-free code for rewriting in the following
way: to change the information symbol to $i \in [\ell]$, push at most
$|a_i|$ cells to the top so that the $|a_i|$ top-ranked cells are the same
as the codeword $a_i$.
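The rewriting rule can be sketched in Python for a small example. The code table below (three cells, four symbols) and the probabilities are hypothetical and chosen only for illustration; they are not the code of Fig. 4(b). The sketch checks that, from every starting state, the expected number of pushes does not exceed the average codeword length.

```python
from itertools import permutations


def make_prefix_free_code(prefixes):
    """prefixes maps each symbol to a tuple of distinct cells."""
    for x, a in prefixes.items():
        for y, b in prefixes.items():
            if x != y and b[:len(a)] == a:
                raise ValueError("one prefix sequence is a prefix of another")

    def interpret(state):
        for symbol, a in prefixes.items():
            if tuple(state[:len(a)]) == a:
                return symbol
        raise KeyError("state is not covered by the code")

    def update(state, symbol):
        """Return (new_state, pushes); no pushes if the prefix is in place."""
        a = prefixes[symbol]
        if tuple(state[:len(a)]) == a:
            return tuple(state), 0
        cells = list(state)
        for cell in reversed(a):
            cells.remove(cell)
            cells.insert(0, cell)
        return tuple(cells), len(a)

    return interpret, update


# Hypothetical code for three cells and four symbols.
code = {1: (1,), 2: (2,), 3: (3, 1), 4: (3, 2)}
probs = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
interpret, update = make_prefix_free_code(code)
average_length = sum(probs[i] * len(code[i]) for i in code)
expected_cost = {
    s: sum(probs[i] * update(s, i)[1] for i in code)
    for s in permutations((1, 2, 3))
}
```

Assigning the shorter prefix sequences to the more probable symbols is what keeps `average_length`, and hence the bound on the expected rewrite cost, small.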
B. Performance Analysis
We now analyze the average rewrite cost of the prefix-free
code. We obviously have $\ell \le n!$. When $\ell = n!$, the code design
becomes trivial: each permutation is assigned a distinct input
symbol. In this subsection, we prove that the prefix-free code
has good approximation ratios under mild conditions: when
$\ell \le n!/2$, the average rewrite cost of a prefix-free code (that was
built to minimize its average codeword length) is at most three
times the average rewrite cost of an optimal code (i.e., a code
that minimizes the average rewrite cost), and when $\ell \le n!/6$,
the approximation ratio is further reduced to $2$.
Loosely speaking, our strategy for proving this approxima-
tion ratio involves an initial simple bound on the rewrite cost of
any code when considering a rewrite operation starting with a
stored symbol . We then proceed to define a prefix-free
code which locally optimizes (up to the approximation ratio)
rewrite operations starting with stored symbol . Finally, we in-
troduce the globally optimal prefix-free code of the previous
section, which optimizes the average rewrite cost, and show that
it is still within the correct approximation ratio.
We start by bounding from below the average rewrite cost
of any code, depending on the currently stored information
symbol. Suppose we are using some code with an interpreta-
tion function and an update function . Furthermore, let
us assume the currently stored information symbol is
in some state , i.e., . We want to consider
rewrite operations which are meant to store the value
instead of , for all . Without loss of generality, assume
that the probabilities of information symbols are monotonically
decreasing
Let us denote by the closest
permutations to ordered by increasing distance, i.e.,
and denote for every . We note
that are independent of the choice of , and fur-
thermore, that while .
The average rewrite cost of a stored symbol using a
code is the weighted sum
This sum is minimized when are assigned
the closest permutations to with higher probability in-
formation symbols mapped to closer permutations. For conve-
nience, let us define the functions
.
Thus, the average rewrite cost of a stored symbol , under
any code, is lower-bounded by
We continue by considering a specific intermediary prefix-
free code that we denote by . Let it be induced by the prefix
sequences . We require the following two proper-
ties:
P.1 For every , , we require .
P.2 .
We also note that is not necessarily a prefix-free code with
minimal average codeword length.
Finally, let be a prefix-free code that minimizes its average
codeword length. Let be induced by the prefix sequences
, and let be any state such that .
Denote by the average rewrite cost of a rewrite operation
under starting from state .
By the definition of and we have
Since it follows that
Since the same argument works for every , we can
say that
(2)
It is evident that the success of this proof strategy hinges on
the existence of for every , which we now turn to
consider.
The following lemma is an application of the well-known
Kraft–McMillan inequality [29].
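Lemma 23: Let $m_1, m_2, \ldots, m_{n-1}$ be nonnegative integers.
There exists a set of prefix sequences with exactly $m_i$ prefix
sequences of length $i$, for $i = 1, 2, \ldots, n-1$ (i.e., there are $m_i$
leaves in layer $i$ of the code tree), if and only if
$$\sum_{i=1}^{n-1} m_i \frac{(n-i)!}{n!} \le 1.$$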
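The condition of Lemma 23 can be explored with a short Python sketch. It assumes the Kraft-type form that follows from the shape of the full permutation tree: a leaf in layer $i$ blocks $(n-i)!$ of the $n!$ permutations, so the leaf counts must satisfy $\sum_i m_i (n-i)!/n! \le 1$. The sketch compares this sum with a direct layer-by-layer feasibility check; the names `kraft_sum` and `fits_in_tree` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def kraft_sum(n, counts):
    """counts[i-1] is the number of prefix sequences of length i."""
    return sum(Fraction(m * factorial(n - i), factorial(n))
               for i, m in enumerate(counts, start=1))


def fits_in_tree(n, counts):
    """Greedy check: can the leaves be placed layer by layer?"""
    free = 1                        # the root is the only vertex in layer 0
    for i, m in enumerate(counts, start=1):
        free *= n - i + 1           # each vertex in layer i-1 has n-i+1 children
        if m > free:
            return False
        free -= m                   # vertices used as leaves have no children
    return True


tight = kraft_sum(4, [2, 5, 2])     # equals 1: the tree is completely used
feasible = fits_in_tree(4, [2, 5, 2])
```

The leaf counts $2, 5, 2$ used here match the layer counts discussed in Example 22; adding one more leaf in any layer violates the condition.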
Let us define the following sequence of integers:
,
,
.
We first contend that they are all nonnegative. We only need to
check and indeed
It is also clear that
In fact, in the following analysis, represent a
partition of the alphabet letters.
Lemma 24: When , there exists a set of prefix se-
quences that contains exactly prefix sequences of length ,
for .
Proof: Let us denote
(3)
When
respectively. Thus, for all . We now
show that when , monotonically decreases in .
Substituting into (3) we get
After some tedious rearrangement, for any integer
Hence, monotonically decreases for all which im-
mediately gives us for all . By Lemma 23, the
proof is complete.
We are now in a position to show the existence of ,
, for . By Lemma 24, let be a list
of prefix sequences, where exactly of the sequences are of
length . Without loss of generality, assume
Remember we also assume
We now define to be the prefix-free code induced by the
prefix sequences
that is, for all
, .
Note that for all , the prefix sequence represents the
information symbol , which is associated with the probability
in rewriting.
Lemma 25: The properties P.1 and P.2 hold for , .
Proof: Property P.2 holds by definition, since
whose length is set to . To prove property P.1 holds,
we first note that when , for all there are
exactly indices for which . On the other hand,
when , among the prefix sequences we have
of them of length when , and the rest of
them are of length . Intuitively speaking, we can map the
indices for which to distinct prefix sequences of
length , the indices for which to distinct prefix
sequences of length , and so on.
Since the prefix sequences are arranged in ascending length
order
it follows that for every ,
Hence, property P.1 holds.
We can now state the main theorem.
Theorem 26: Fix some and let be a prefix-free
code over which minimizes its average codeword length.
For any rewrite operation with initial stored information symbol
i.e., the average cost of rewriting under is at most three times
the lower bound.
Proof: Define and consider the input alphabet
with input symbols being i.i.d. random variables where
symbol appears with probability . We set
.
Let be a prefix-free code over which minimizes its av-
erage codeword length.
A crucial observation is the following: , the lower bound
on the average rewrite cost of symbol , does depend on the
probability distribution of the input symbols. Let us therefore
distinguish between over , and over . However,
by definition, and by our choice of probability distribution over
for every . Since is a more restricted version of , it
obviously follows that
for every . By applying inequality (2), and since by
Lemma 25, the code exists over , we get that
for all .
Corollary 27: When , the average rewrite cost of
a prefix-free code minimizing its average codeword length is at
most three times that of an optimal code.
Proof: Since the approximation ratio of holds for every
rewrite operation (regardless of the initial state and its interpre-
tation), it also holds for any average case.
With a similar analysis, we can prove the following result:
Theorem 28: Fix some , , and let be a
prefix-free code over which minimizes its average codeword
length. For any rewrite operation with initial stored information
symbol
i.e., the average cost of rewriting under is at most twice the
lower bound.
Proof: See the Appendix.
Corollary 29: When , , the average rewrite
cost of a prefix-free code minimizing its average codeword
length is at most twice that of an optimal code.
VI. CONCLUSION
In this paper, we present a new data storage scheme, rank
modulation, for flash memories. We show several Gray code
constructions for rank modulation, as well as data rewriting
schemes. One important application of the Gray codes is the
realization of logic multilevel cells. For data rewriting, an op-
timal code for the worst case performance is presented. It is also
shown that to optimize the average rewrite cost, a prefix-free
code can be constructed in polynomial time that approximates
an optimal solution well under mild conditions. There are many
open problems concerning rank modulation, such as the con-
struction of error-correcting rank-modulation codes and codes
for rewriting that are robust to uncertainties in the information
symbol’s probability distribution. Some of these problems have
been addressed in some recent work [24].
APPENDIX
In this appendix, we prove Theorem 28. The general approach
is similar to the proof of Theorem 26, so we only specify some
details that are relatively important here.
We define the following sequence of numbers:
,
,
.
As before, we contend that they are all nonnegative. We only
need to check and indeed, for
We now prove the equivalent of Lemma 24.
Lemma 30: When , , there exists a set of
prefix sequences that contains exactly prefix sequences of
length , for .
Proof: Let us denote
(4)
When
respectively. Thus, for all . We now
show that when , monotonically decreases in .
Substituting into (4) we get
After some tedious rearrangement, for any integer
Hence, monotonically decreases for all which
immediately gives us for all . By Lemma 23,
the proof is complete.
The remaining lemmas comprising the rest of the proof pro-
cedure are similar to those of Section V-B.
ACKNOWLEDGMENT
The authors would like to thank the anonymous reviewers,
whose comments helped improve the presentation of the paper.
REFERENCES
[1] M. B. Baer, “Optimal prefix codes for infinite alphabets with nonlinear
costs,” IEEE Trans. Inf. Theory, vol. 54, no. 3, pp. 1273–1286, Mar.
2008.
[2] A. Bandyopadhyay, G. Serrano, and P. Hasler, “Programming analog
computational memory elements to 0.2% accuracy over 3.5 decades
using a predictive method,” in Proc. IEEE Int. Symp. Circuits and Sys-
tems, Kobe, Japan, May 2005, pp. 2148–2151.
[3] T. Berger, F. Jelinek, and J. K. Wolf, “Permutation codes for sources,”
IEEE Trans. Inf. Theory, vol. IT-18, no. 1, pp. 160–169, Jan. 1972.
[4] V. Bohossian, A. Jiang, and J. Bruck, “Buffer coding for asymmetric
multi-level memory,” in Proc. IEEE Int. Symp. Information Theory
(ISIT2007), Nice, France, Jun. 2007, pp. 1186–1190.
[5] P. Cappelletti and A. Modelli, “Flash memory reliability,” in Flash
Memories, P. Cappelletti, C. Golla, P. Olivo, and E. Zanoni, Eds.
Amsterdam, The Netherlands: Kluwer, 1999, pp. 399–441.
[6] G. D. Cohen, P. Godlewski, and F. Merkx, “Linear binary code for
write-once memories,” IEEE Trans. Inf. Theory, vol. IT-32, no. 5, pp.
697–700, Sep. 1986.
[7] P. Dietz, Optimal Algorithms for List Indexing and Subset Rank.
London, U.K.: Springer-Verlag, 1989.
[8] B. Eitan and A. Roy, “Binary and multilevel flash cells,” in Flash Mem-
ories, P. Cappelletti, C. Golla, P. Olivo, and E. Zanoni, Eds. Ams-
terdam, The Netherlands: Kluwer, 1999, pp. 91–152.
[9] T. Etzion, Oct. 2007, personal communication.
[10] A. Fiat and A. Shamir, “Generalized write-once memories,” IEEE
Trans. Inf. Theory, vol. IT-30, no. 3, pp. 470–480, May 1984.
[11] F.-W. Fu and A. J. H. Vinck, “On the capacity of generalized write-once
memory with state transitions described by an arbitrary directed acyclic
graph,” IEEE Trans. Inf. Theory, vol. 45, no. 1, pp. 308–313, Jan. 1999.
[12] M. J. Golin and G. Rote, “A dynamic programming algorithm for con-
structing optimal prefix-free codes with unequal letter costs,” IEEE
Trans. Inf. Theory, vol. 44, no. 5, pp. 1770–1781, Sep. 1998.
[13] F. Gray, “Pulse Code Communication,” U.S. Patent 2632058, Mar.
1953.
[14] M. Grossi, M. Lanzoni, and B. Ricco, “Program schemes for multilevel
flash memories,” Proc. IEEE, vol. 91, no. 4, pp. 594–601, Apr. 2003.
[15] M. Hall, Jr. and D. E. Knuth, “Combinatorial analysis and computers,”
Amer. Math. Monthly, vol. 72, no. 2, pp. 21–28, 1965.
[16] A. J. H. Vinck and A. V. Kuznetsov, “On the general defective channel
with informed encoder and capacities of some constrained memories,”
IEEE Trans. Inf. Theory, vol. 40, no. 6, pp. 1866–1871, Nov. 1994.
[17] C. D. Heegard, “On the capacity of permanent memory,” IEEE Trans.
Inf. Theory, vol. IT-31, no. 1, pp. 34–42, Jan. 1985.
[18] C. D. Heegard and A. A. El Gamal, “On the capacity of computer
memory with defects,” IEEE Trans. Inf. Theory, vol. IT-29, no. 5, pp.
731–739, Sep. 1983.
[19] D. A. Huffman, “A method for the construction of minimum-redundancy
codes,” Proc. IRE, vol. 40, no. 9, pp. 1098–1101, Sep. 1952.
[20] F. K. Hwang, “Generalized Huffman trees,” SIAM J. Appl. Math., vol.
37, no. 1, pp. 124–127, 1979.
[21] A. Jiang, “On the generalization of error-correcting WOM codes,” in
Proc. IEEE Int. Symp. Information Theory (ISIT2007), Nice, France,
Jun. 2007, pp. 1391–1395.
[22] A. Jiang, V. Bohossian, and J. Bruck, “Floating codes for joint in-
formation storage in write asymmetric memories,” in Proc. IEEE Int.
Symp. Information Theory (ISIT2007), Nice, France, Jun. 2007, pp.
1166–1170.
[23] A. Jiang and J. Bruck, “Joint coding for flash memory storage,” in Proc.
IEEE Int. Symp. Information Theory (ISIT2008), Toronto, ON, Canada,
Jul. 2008, pp. 1741–1745.
[24] A. Jiang, M. Schwartz, and J. Bruck, “Error-correcting codes for rank
modulation,” in Proc. IEEE Int. Symp. Information Theory (ISIT2008),
Toronto, ON, Canada, Jul. 2008, pp. 1736–1740.
[25] D. E. Knuth, The Art of Computer Programming Volume 3: Sorting and
Searching, 2nd ed. Reading, MA: Addison-Wesley, 1998.
[26] C. A. Laisant, “Sur la numération factorielle, application aux permuta-
tions,” Bull. Société Mathématique de France, vol. 16, pp. 176–183.
[27] D. H. Lehmer, “Teaching combinatorial tricks to a computer,” in Proc.
Symp. Applied Mathematics and Combinatorial Analysis, 1960, vol.
10, pp. 179–193.
[28] M. Mares and M. Straka, “Linear-time ranking of permutations,” Algo-
rithms-ESA, pp. 187–193, 2007.
[29] B. McMillan, “Two inequalities implied by unique decipherability,”
IEEE Trans. Inf. Theory, vol. 2, no. 4, pp. 115–116, 1956.
[30] W. Myrvold and F. Ruskey, “Ranking and unranking permutations in
linear time,” Inf. Process. Lett., vol. 79, no. 6, pp. 281–284, 2001.
[31] H. Nobukata, S. Takagi, K. Hiraga, T. Ohgishi, M. Miyashita, K.
Kamimura, S. Hiramatsu, K. Sakai, T. Ishida, H. Arakawa, M. Itoh, I.
Naiki, and M. Noda, “A 144-Mb, eight-level nand flash memory with
optimized pulsewidth programming,” IEEE J. Solid-State Circuits,
vol. 35, no. 5, pp. 682–690, May 2000.
[32] R. L. Rivest and A. Shamir, “How to reuse a write-once memory,” Inf.
Contr., vol. 55, pp. 1–19, 1982.
[33] C. D. Savage, “A survey of combinatorial Gray codes,” SIAM Rev., vol.
39, no. 4, pp. 605–629, Dec. 1997.
[34] R. Sedgewick, “Permutation generation methods,” Comput. Surv., vol.
9, no. 2, pp. 137–164, Jun. 1977.
[35] D. Slepian, “Permutation modulation,” Proc. IEEE, vol. 53, no. 3, pp.
228–236, Mar. 1965.
[36] J. K. Wolf, A. D. Wyner, J. Ziv, and J. Körner, “Coding for a write-once
memory,” AT&T Bell Labs. Tech. J., vol. 63, no. 6, pp. 1089–1112,
1984.
[37] E. Yaakobi, P. H. Siegel, and J. K. Wolf, “Buffer codes for multi-level
flash memory,” presented at the Poster Session of the IEEE Int. Symp.
Information Theory, Toronto, ON, Canada, Jul. 2008.
Anxiao (Andrew) Jiang (S’00–M’04) received the B.S. degree in electronic
engineering from Tsinghua University, Beijing, China, in 1999 and the M.S.
and Ph.D. degrees in electrical engineering from the California Institute of Tech-
nology, Pasadena, in 2000 and 2004, respectively.
He is currently an Assistant Professor in the Computer Science and Engi-
neering Department at Texas A&M University, College Station. His research
interests include information theory, data storage, networks, and algorithm de-
sign.
Prof. Jiang is a recipient of the NSF CAREER Award in 2008 for his research
on information theory for flash memories.
Robert Mateescu (M’08) received the B.S. degree in computer science and
mathematics from the University of Bucharest, Bucharest, Romania, in 1997
and the M.S. and Ph.D. degrees in information and computer science from the
University of California, Irvine, in 2003 and 2007, respectively.
He is currently a Postdoctoral Scholar in Electrical Engineering at the Cal-
ifornia Institute of Technology, Pasadena. His research interests include algo-
rithms for the representation and inference of information.
Moshe Schwartz (M’03) was born in Israel in 1975. He received the B.A.,
M.Sc., and Ph.D. degrees from the Technion–Israel Institute of Technology,
Haifa, Israel, in 1997, 1998, and 2004, respectively, all from the Computer Sci-
ence Department.
He was a Fulbright Postdoctoral Researcher in the Department of Electrical
and Computer Engineering, University of California, San Diego, La Jolla, and a
Postdoctoral Researcher in the Department of Electrical Engineering, California
Institute of Technology, Pasadena. He now holds a position with the Department
of Electrical and Computer Engineering, Ben-Gurion University, Beer-Sheva,
Israel. His research interests include algebraic coding, combinatorial structures,
and digital sequences.
Jehoshua Bruck (S’86–M’89–SM’93–F’01) received the B.Sc. and M.Sc. de-
grees in electrical engineering from the Technion–Israel Institute of Technology,
Haifa, Israel, in 1982 and 1985, respectively, and the Ph.D. degree in electrical
engineering from Stanford University, Stanford, CA, in 1989.
He is the Gordon and Betty Moore Professor of Computation and Neural Sys-
tems and Electrical Engineering at the California Institute of Technology (Cal-
tech), Pasadena. His research focuses on information theory and systems and
the theory of biological networks. He has extensive industrial experience. He
worked at IBM Research where he participated in the design and implementa-
tion of the first IBM parallel computer. He was a cofounder and Chairman of
Rainfinity, a spinoff company from Caltech that focused on software products
for management of network information storage systems.
Dr. Bruck received the National Science Foundation Young Investigator
award, the Sloan fellowship, and the 2005 S. A. Schelkunoff Transactions prize
paper award from the IEEE Antennas and Propagation society.