# Mark Manasse · I2chain.com · Research

Mark Manasse

PhD, University of Wisconsin, Madison

## About

76

Publications

16,768

Reads

7,849

Citations

Introduction

Mark Manasse worked at Infrastructure Security, Salesforce.com, and is currently at i2chain, working on certifications to outlive the one-way hash functions they use. Mark does research in Algorithms, Computer Architecture, Applied Cryptography, and Data Mining. Recent patents are 'File system level key rotation via see-through directory unions', and a related patent at the volume level.
Despite this, my current fear is the impending loss of security for RSA and Diffie-Hellman key exchange as quantum computers work at scale in a decade or two, and the inevitable decline of Blockchain certification. How long can immutability last?
Working on Blockchain in the interim.

Additional affiliations

December 2014 - January 2016

October 2001 - November 2014

October 2001 - November 2014

## Publications

Reviving a blockchain when a hash function breaks [blinded] [blinded] The work described in this paper is currently under review by NSF for an SBIR grant. This paper is under submission to a conference.

A generator matrix is provided to generate codewords from messages of write operations. Rather than generate a codeword using the entire generator matrix, some number of bits of the codeword are determined to be, or designated as, stuck bits. One or more submatrices of the generator matrix are determined based on the columns of the generator matrix...

Document sketching using Jaccard similarity has been a workable, effective
technique in reducing near-duplicates in Web page and image search results, and
has also proven useful in file system synchronization, compression and learning
applications.
Min-wise sampling can be used to derive an unbiased estimator for Jaccard
similarity and taking a few...
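As an illustration of the min-wise sampling idea mentioned above, here is a minimal Python sketch of a MinHash estimator for Jaccard similarity. It is not taken from the paper: the function names, the use of SHA-256 with a seed prefix as a stand-in for random permutations, and the default of 256 hash functions are all assumptions made for the example.

```python
import hashlib

def shingles(text, k=3):
    """Return the set of k-word shingles of a text (hypothetical helper)."""
    words = text.split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def _hash(seed, item):
    # Assumption: a seeded SHA-256 prefix approximates a random permutation.
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def minhash_sketch(items, num_hashes=256):
    """One minimum per seeded hash function; items must be non-empty."""
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]

def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of positions where the two sketches agree."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)

def exact_jaccard(a, b):
    """|A intersect B| / |A union B| for two non-empty sets."""
    return len(a & b) / len(a | b)
```

For each hash function, the two minima coincide with probability equal to the Jaccard similarity of the sets, so the fraction of matching positions is an unbiased estimate whose variance shrinks as `num_hashes` grows.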

In this paper, we investigate near-duplicate detection, particularly looking at the detection of evolving news stories. These stories often consist primarily of syndicated information, with local replacement of headlines, captions, and the addition of locally-relevant content. By detecting near-duplicates, we can offer users only those stories with...

Zombie is an endurance management framework that enables a variety of error correction mechanisms to extend the lifetimes of memories that suffer from bit failures caused by wearout, such as phase-change memory (PCM). Zombie supports both single-level cell (SLC) and multi-level cell (MLC) variants. It extends the lifetime of blocks in working memor...

The time-worn aphorism "close only counts in horseshoes and hand-grenades" is clearly inadequate. Close also counts in golf, shuffleboard, archery, darts, curling, and other games in which hitting the precise center of the target isn't to be expected every time, or in which we can expect to be driven from the target by skilled opponents.
This lectu...

Solid-state disks (SSDs) have the potential to revolutionize the storage system landscape. However, there is little published work about their internal organization or the design choices that SSD manufacturers face in pursuit of optimal performance. This paper presents a taxonomy of such design choices and analyzes the likely performance of vario...

Index Terms: Ski-rental problem, Competitive algorithms, Deterministic and randomized algorithms, On-line algorithms. Keywords and Synonyms: Oblivious adversaries, Worst-case approximation, Metrical task systems. Problem Definition: The ski rental problem was developed as a pedagogical tool for understanding the basic concepts in some early results in on-lin...
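The entry above is truncated, so the following is only a sketch of the standard textbook setting, not of the entry's own text: a skier either rents each day or buys once, without knowing how many days remain. The break-even rule (rent until the next rental would bring total rent to the purchase price, then buy) is the usual deterministic strategy. The function names and the integer-price assumption are made up for the example.

```python
def break_even_cost(days_skied, buy_price, rent_price=1):
    """Cost paid by the break-even rule (assumes positive integer prices)."""
    # Rent for ceil(buy_price / rent_price) - 1 days, then buy on the next day.
    rent_days = max(0, -(-buy_price // rent_price) - 1)
    if days_skied <= rent_days:
        return days_skied * rent_price
    return rent_days * rent_price + buy_price

def offline_cost(days_skied, buy_price, rent_price=1):
    """Cost of the optimal choice when the number of days is known in advance."""
    return min(days_skied * rent_price, buy_price)

if __name__ == "__main__":
    # With buy_price=10 and rent_price=1 the rule rents 9 days and then buys.
    print(break_even_cost(3, 10), offline_cost(3, 10))    # 3 3
    print(break_even_cost(20, 10), offline_cost(20, 10))  # 19 10
```

Under these assumptions the on-line cost never exceeds twice the off-line cost, which is the sense in which the rule is 2-competitive.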

We describe an efficient procedure for sampling representatives from a weighted set such that, for any weightings S and T, the probability that the two choose the same sample is the Jaccard similarity: $\Pr[\mathrm{sample}(S) = \mathrm{sample}(T)] = \frac{\sum_x \min(S(x), T(x))}{\sum_x \max(S(x), T(x))}$. The sampling process takes expected time linear in the...
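To make the target quantity concrete, the sketch below computes the weighted Jaccard similarity directly as the sum of element-wise minima over the sum of element-wise maxima. It does not implement the sampling procedure the abstract describes; the dictionary representation, the function name, and the convention for two empty weightings are assumptions for illustration.

```python
def weighted_jaccard(s, t):
    """Sum of min weights over sum of max weights for two weightings.

    s and t map elements to non-negative weights; missing keys count as 0.
    """
    keys = set(s) | set(t)
    numerator = sum(min(s.get(x, 0.0), t.get(x, 0.0)) for x in keys)
    denominator = sum(max(s.get(x, 0.0), t.get(x, 0.0)) for x in keys)
    # Assumed convention: two all-zero weightings are treated as identical.
    return numerator / denominator if denominator > 0 else 1.0

if __name__ == "__main__":
    S = {"a": 2, "b": 1}
    T = {"a": 1, "c": 1}
    # min-sum = 1 (only "a" overlaps), max-sum = 2 + 1 + 1 = 4
    print(weighted_jaccard(S, T))  # 0.25
```

With weights restricted to 0 and 1 this reduces to the ordinary set Jaccard similarity.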

The number field sieve is an algorithm to factor integers of the form $r^e - s$ for small positive r and |s|. The algorithm depends on arithmetic in an algebraic number field. We describe the algorithm, discuss several aspects of its implementation, and present some of the factorizations obtained. A heuristic run time analysis indicates that the numb...

We have integrated digital video into Trestle, an object-oriented user interface toolkit written in Modula-3. The display
of video frames is managed within the application process using, where possible, shared memory to transmit images to the window
system. We took advantage of Modula-3's type system, lightweight threads and garbage collection to d...

The seventh Workshop on Distributed Data and Structures (WDAS) took place on the campus of Santa Clara University on January 4 and 5, 2006 and drew participants actively working in research on distributed data, structures, and their applications. WDAS aims to stimulate the exchange of ideas and to be a forum for work in progress. The electronic ver...

In this paper, we continue our investigations of "web spam": the injection of artificially-created pages into the web in order to influence the results from search engines, to drive traffic to certain pages for fun or profit. This paper considers some previously-undescribed techniques for automatically detecting spam pages, examines the effectivene...

As storage sites grow in size to thousands of disks, and as the need to predict availability and reliability increases, researchers and designers need a better quantitative understanding of the ways that disks fail or lose data. Unfortunately, these numbers are hard to come by. Disk manufacturers have some approximations to these numbers from their...

Two years ago, we conducted a study on the evolution of web pages over time. In the course of that study, we discovered a large number of machine-generated "spam" web pages emanating from a handful of web servers in Germany. These spam web pages were dynamically assembled by stitching together grammatically well-formed German sentences drawn from a...

A resource may be abused if its users incur little or no cost. For example, e-mail abuse is rampant because sending an e-mail has negligible cost for the sender. It has been suggested that such abuse may be discouraged by introducing an artificial cost in the form of a moderately expensive computation. Thus, the sender of an e-mail might be require...
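The idea of attaching a moderately expensive computation to a message can be illustrated with a generic hash-based puzzle. This is a sketch only: the abstract is truncated and does not say which function the paper proposes, so the SHA-256 puzzle, the `difficulty_bits` parameter, and the function names below are assumptions, not the paper's construction.

```python
import hashlib
from itertools import count

def _digest_value(message, nonce):
    """Interpret SHA-256(message:nonce) as a 256-bit integer."""
    data = f"{message}:{nonce}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def solve(message, difficulty_bits):
    """Search for a nonce whose digest has difficulty_bits leading zero bits.

    Expected work is about 2**difficulty_bits hash evaluations.
    """
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(message, nonce) < target:
            return nonce

def verify(message, nonce, difficulty_bits):
    """Checking a claimed solution costs a single hash evaluation."""
    return _digest_value(message, nonce) < (1 << (256 - difficulty_bits))

if __name__ == "__main__":
    nonce = solve("to:alice@example.org", 12)
    print(verify("to:alice@example.org", nonce, 12))  # True
```

The asymmetry is the point of the design: the sender pays for many hash evaluations, while the recipient verifies with one.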

The increasing importance of search engines to commercial web sites has given rise to a phenomenon we call "web spam", that is, web pages that exist only to mislead search engines into (mis)leading users to certain web sites. Web spam is a nuisance to users as well as search engines: users have a harder time finding the information they need, and s...

How fast does the web change? Does most of the content remain unchanged once it has been authored, or are the documents continuously updated? Do pages change a little or a lot? Is the extent of change correlated to any other property of the page? All of these questions are of interest to those who mine the web, including all the popular search engi...

We expand on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million Web pages on a weekly basis over the span of 11 weeks. We then determined which of these pages are near-duplicates of one another, and tracked how clusters of near-duplicate documents evolved over time. We found...

This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis over the span of 11 weeks. We then determined which of these pages are near-duplicates of one another, and tracked how clusters of near-duplicate documents evolved over time....

The number field sieve is an algorithm to factor integers of the form $r^e - s$ for small positive r and |s|. The algorithm depends on arithmetic in an algebraic number field. We describe the algorithm, discuss several aspects of its implementation, and present some of the factorizations obtained. A heuristic run time analysis indicates that the numbe...

Let $\{A_1, A_2, \ldots, A_m\}$ be a set of on-line algorithms for a problem P with input set I. We assume that P can be represented as a metrical task system. Each $A_i$ has a competitive ratio $a_i$ with respect to the optimum off-line algorithm, but only for a subset of the possible inputs such that the union of these subsets covers I. Given this setup, we construct a generic...

This is a tutorial introduction to programming with Trestle, a Modula-3 window system toolkit currently implemented over the X window system. We assume that you have some experience as a user of window systems, but no previous experience programming with X or other window systems. To run Trestle, you need a copy of SRC Modula-3 and an X server. The...

We have developed an efficient way to determine the syntactic similarity of files and have applied it to every document on the World Wide Web. Using this mechanism, we built a clustering of all the documents that are syntactically similar. Possible applications include a “Lost and Found” service, filtering the results of Web searches, updating wide...

We present data concerning the factorization of the 120-digit number RSA-120, which we factored on July 9, 1993, using the quadratic sieve method. The factorization took approximately 825 MIPS years and was completed within three months real time. At the time of writing RSA-120 is the largest integer ever factored by a general purpose factoring a...

Members of geographically distributed work groups often complain of a feeling of isolation and of not knowing “who is around”. Argohalls attempt to solve this problem by integrating video icons, clustered into groups representing physical hallways, into the Argo telecollaboration system. Argo users can “hang out” in hallways in order to keep track...

We describe side conversations, a new facility we have added to the Argo telecollaboration system. Side conversations allow subgroups of teleconference participants to whisper to each other. The other participants can see who is whispering to whom, but cannot hear what is being said.

We present data concerning the factorization of the 120-digit number RSA-120, which we factored on July 9, 1993, using the
quadratic sieve method. The factorization took approximately 825 MIPS years and was completed within three months real time.
At the time of writing RSA-120 is the largest integer ever factored by a general purpose factoring alg...

This is all interesting and necessary, but it fails to address the problems of a market segment that I expect to grow rapidly, once it becomes at all possible: extremely low-priced walk-up transactions. The proposed payment schemes all come with fee schedules that limit transactions to be fairly valuable. In practical terms, for today's systems, an...

Full text at http://www.w3.org/Conferences/WWW4/Papers/246/
Abstract:
Millicent is a lightweight and secure protocol for electronic commerce over the Internet. It is designed to support purchases costing less than a cent. It is based on decentralized validation of electronic cash at the vendor's server without any additional communication, expensi...

We describe a modification to the well-known large prime variant of the multiple polynomial quadratic sieve factoring algorithm [Eurocrypt ’90, Lect. Notes Comput. Sci. 473, 72–82 (1991; Zbl 0779.11061)]. In practice this leads to a speed-up factor of 2 to 2.5. We discuss several implementation-related aspects, and we include some examples. Our new...

One important facet of common-sense reasoning is the ability to draw default conclusions about the state of the world, so that one can, for example, assume that a given bird flies in the absence of information to the contrary. A deficiency in the circumscriptive approach to common-sense reasoning has been its difficulties in producing default that...

Let $\{A_1, A_2, \ldots, A_m\}$ be a set of on-line algorithms for a problem P with input set I. We assume that P can be represented as a metrical task system. Each $A_i$ has a competitive ratio $a_i$ with respect to the optimum offline algorithm, but only for a subset of the possible inputs such that the union of these subsets covers I. Given this setup, w...

Competitive analysis is concerned with comparing the performance of on-line algorithms with that of optimal off-line algorithms. In some cases randomization can lead to algorithms with improved performance ratios on worst-case sequences. In this paper we present new randomized on-line algorithms for snoopy caching and the spin-block problem. These...

The goal of the Argo system is to allow medium-sized groups of users to collaborate remotely from their desktops in a way that approaches as closely as possible the effectiveness of face-to-face meetings. In support of this goal, Argo combines high quality multi-party digital video and full-duplex audio with telepointers, shared applications, and w...

In an on-demand video server environment, clients make requests for movies to a centralized video server. Due to the stringent response time requirements, continuous delivery of a video stream to the client has to be guaranteed by reserving sufficient ...

In this paper we exhibit the full prime factorization of the ninth Fermat number $F_9 = 2^{512} + 1$. It is the product of three prime factors that have 7, 49, and 99 decimal digits. We found the two largest prime factors by means of the number field sieve, which is a factoring algorithm that depends on arithmetic in an algebraic number field. In the p...
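The digit counts quoted above can be sanity-checked with Python's arbitrary-precision integers. The abstract does not print the factors themselves; the 7-digit value used below, 2424833 (which equals 37 · 2^16 + 1), is supplied here as an assumption from the historical record, and the variable names are illustrative.

```python
# Ninth Fermat number, as defined in the abstract.
F9 = 2**512 + 1

# Assumption: the 7-digit prime factor, not stated in the abstract itself.
SMALL_FACTOR = 37 * 2**16 + 1  # 2424833

# The cofactor is the product of the 49-digit and 99-digit primes.
cofactor, remainder = divmod(F9, SMALL_FACTOR)

if __name__ == "__main__":
    print(remainder == 0)            # True if SMALL_FACTOR divides F9
    print(len(str(F9)))              # decimal digits of F9
    print(len(str(SMALL_FACTOR)))    # decimal digits of the small factor
    print(len(str(cofactor)))        # decimal digits of the remaining cofactor
```

Splitting the cofactor into its two large primes is the hard part that the paper attributes to the number field sieve; this check only confirms the small factor and the overall sizes.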

In this paper we exhibit the full prime factorization of the ninth Fermat number $F_9 = 2^{512} + 1$. It is the product of three prime factors that have 7, 49, and 99 decimal digits. We found the two largest prime factors by means of the number field sieve, which is a factoring algorithm that depends on arithmetic in an algebraic number field. In the pre...

A common operation in multiprocessor programs is acquiring a lock to protect access to shared data. Typically, the requesting thread is blocked if the lock it needs is held by another thread. The cost of blocking one thread and activating another can be a substantial part of program execution time. Alternatively, the thread could spin until the loc...

We present a divertible zero-knowledge proof (argument) for SAT under the assumption that probabilistic encryption homomorphisms exist. Our protocol uses a simple 'swapping' technique which can be applied to many zero knowledge proofs (arguments). In ...

Extensive experience with X11 has convinced us that it represents a true advance in window systems, but that there are areas in which the X protocol is seriously deficient. The problems we describe fall into seven categories: coordinate system pitfalls, unavoidable race conditions, incomplete support for window managers, insufficient window viewabi...

The k-server problem is that of planning the motion of k mobile servers on the vertices of a graph under a sequence of requests for service. Each request consists of the name of a vertex, and is satisfied by placing a server at the requested vertex. The requests must be satisfied in their order of occurrence. The cost of satisfying a sequence of re...

The study of integer factoring algorithms and the design of faster factoring algorithms is a subject of great importance in cryptology (cf. [1]), and a constant concern for cryptographers. In this paper we present a new technique that proved to be extremely useful, not only to achieve a considerable speed-up of an older and widely studied factoring...

The number field sieve is an algorithm to factor integers of the form $r^e \pm s$ for small positive r and s. This note is intended as a 'report on work in progress' on this algorithm. We informally describe the algorithm, discuss several implementation related aspects, and present some of the factorizations obtained so far. We also mention some solut...

In this paper we describe our distributed implementation of two factoring algorithms, the elliptic curve method (ecm) and the multiple polynomial quadratic sieve algorithm (mpqs).
Since the summer of 1987, our ecm-implementation on a network of MicroVAX processors at DEC’s Systems Research Center has factored several most and more wanted numbers fr...

An on-line problem is one in which an algorithm must handle a sequence of requests, satisfying each request without knowledge of the future requests. Examples of on-line problems include scheduling the motion of elevators, finding routes in networks, allocating cache memory, and maintaining dynamic data structures. A competitive algorithm for an on...

In a snoopy cache multiprocessor system, each processor has a cache in which it stores blocks of data. Each cache is connected to a bus used to communicate with the other caches and with main memory. Each cache monitors the activity on the bus and in its own processor and decides which blocks of data to keep and which to discard. For several of the...

In a snoopy cache multiprocessor system, each processor has a cache in which it stores blocks of data. Each cache is connected to a bus used to communicate with the other caches and with main memory. For several of the proposed models of snoopy caching, we present new on-line algorithms which decide, for each cache, which blocks to retain and which...

We are developing a model for human cognitive processing which assumes that a major component of the rules for calculating behavior are resident outside the individual, in the inherited, collective phenomena that anthropologists call 'culture'. Our model contains rules of behavior encoded in propositional structures such as frames or scripts, plus...

Reed-Solomon erasure codes provide efficient simple techniques for redundantly encoding information so that the failure of a few disks in a disk array doesn't compromise the availability of data. This paper presents a technique for constructing a code that can correct up to three errors with a simple, regular encoding, which admits very efficien...

Remote Differential Compression (RDC) protocols can efficiently update files over a limited-bandwidth network when two sites have roughly similar files; no site needs to know the content of another's files a priori. We present a heuristic approach to identify and transfer the file differences that is based on finding similar files, subdividing the...

This paper describes our contribution to the 2007 Web Spam Challenge. We computed some additional features from the data provided with the UK 2006-05 dataset, and other features from external data sources. Our contributions to the Web Spam Challenge fall into two categories. First, we used the features introduced in our earlier work ([2...

Digital rights management based on enforcement is moribund. The bits are free and they can't be put back in the bottle. Yet, content creators want to get paid and users want superior quality content. Assuming that users are willing to pay for content they like, we propose a scheme for digital rights licensing modeled after shareware licensing.