Using generalized context trees and their finite state machine closures, we show that a two-pass version of Context, a twice-universal lossless coding scheme for tree models, can be implemented in linear encoding/decoding time. As it turns out, an optimal context selection rule and the corresponding context transitions are computationally no more expensive than the various steps involved in the implementation of the Burrows-Wheeler transform (BWT) and, in fact, use similar tools. We also present a reversible transform that displays the same "context deinterleaving" feature as the BWT but is naturally based on an optimal context tree. This transform offers insight into the workings of the BWT and the nature of its sub-optimality for twice-universal coding of tree models.
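
For readers unfamiliar with the "context deinterleaving" property mentioned above, the following minimal sketch shows the plain BWT on a short string. It is not the transform or the linear-time algorithm of this work: it uses the naive sorted-rotations construction, which is far slower than linear time, and the function names and the sentinel convention are assumptions made purely for illustration. Sorting the rotations places next to each other the symbols that are followed by the same substring in the text (equivalently, symbols sharing a preceding context if the input is reversed first), which is the grouping effect the abstract refers to.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Naive BWT: sort all rotations of text + sentinel, keep the last column.

    The sentinel is an illustrative assumption; it must not occur in text
    and it sorts before every other symbol used here.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last: str, sentinel: str = "\0") -> str:
    """Naive inverse: rebuild the sorted rotation table one column at a time."""
    table = [""] * len(last)
    for _ in range(len(last)):
        # Prepend the last column to the current partial rows, then re-sort.
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original string is the row that ends with the sentinel.
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    # Symbols followed by the same context end up adjacent in the output.
    print(repr(transformed))
    print(inverse_bwt(transformed))
```

In the output for "banana", the two occurrences of "n" (both followed by "a") land next to each other, and the round trip through `inverse_bwt` recovers the input, which illustrates that the regrouping is reversible.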