The string-edit operations in PCFG SET from Hupkes et al. (2020).

Source publication
Preprint
A fundamental question in interpretability research is to what extent neural networks, particularly language models, implement reusable functions via subnetworks that can be composed to perform more complex tasks. Recent developments in mechanistic interpretability have made progress in identifying subnetworks, often referred to as circuits, which...

Contexts in source publication

Context 1
... illustrated in Table 1, the dataset (Hupkes et al., 2020) comprises ten different string-edit operations (SET) applied to sequences generated by a probabilistic context-free grammar (PCFG). All tasks resemble translation problems, where an input sequence is transformed into a corresponding output sequence through the recursive application of the string-edit operations specified within the input sequence. ...
Context 2
... all operators in PCFG SET are functionally related. For example, the repeat operator can be replicated by applying the copy operation two times in succession (see Table 1). Our objective is to identify circuits for each of the ten string-edit operations in PCFG SET. ...
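The recursive application described in Context 1 and the repeat/copy relation in Context 2 can be illustrated with a minimal sketch. It covers only the three operators named in these excerpts (copy, repeat, echo). Their definitions here are assumptions inferred from the excerpts, not taken from Table 1, and `UNARY_OPS` and `evaluate` are hypothetical names. "Applying copy two times in succession" is read as concatenating two copies of the argument.

```python
from typing import Callable, Dict, List

Seq = List[str]

# Assumed definitions for the three operators mentioned in the excerpts.
UNARY_OPS: Dict[str, Callable[[Seq], Seq]] = {
    "copy": lambda xs: list(xs),                  # x1 ... xn
    "repeat": lambda xs: list(xs) + list(xs),     # x1 ... xn x1 ... xn
    "echo": lambda xs: list(xs) + list(xs[-1:]),  # x1 ... xn xn
}


def evaluate(tokens: Seq) -> Seq:
    """Recursively apply prefix operators to the rest of the input sequence."""
    if not tokens:
        return []
    head, rest = tokens[0], tokens[1:]
    if head in UNARY_OPS:
        # The operator's argument is the (recursively evaluated) remainder.
        return UNARY_OPS[head](evaluate(rest))
    # Plain symbols are passed through unchanged.
    return [head] + evaluate(rest)


if __name__ == "__main__":
    print(evaluate("echo copy A B C".split()))
    print(evaluate("repeat A B".split()))
```

Under these assumed definitions, the output of `repeat` on a sequence equals the output of `copy` on that sequence concatenated with itself, which is one way to read the functional relation noted above.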
Context 3
... we assess faithfulness performance under two configurations: one where F_T is averaged across all output tokens (Figure 1a) and another where faithfulness is calculated only at positions where the ground truth output sequences of the circuit-task and evaluation-task diverge (Figure 1b). For example, when evaluating the copy circuit on the echo task, the evaluation focuses on measuring task faithfulness at the final token of the target output sequence: specifically, the additional x_n in echo, which differs from the end-of-sequence token in copy (see Table 1). ...
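The second configuration in Context 3 can be sketched as a position mask. This is a hedged illustration, not the authors' evaluation code: the `<eos>` symbol, the function names, and the per-position scores are all hypothetical, and positions are compared only where both target sequences have a token.

```python
from typing import List

EOS = "<eos>"  # hypothetical end-of-sequence symbol


def diverging_positions(circuit_target: List[str], eval_target: List[str]) -> List[int]:
    """Indices where the two ground-truth outputs differ (overlapping prefix only)."""
    return [
        i
        for i, (a, b) in enumerate(zip(circuit_target, eval_target))
        if a != b
    ]


def masked_mean(scores: List[float], positions: List[int]) -> float:
    """Average hypothetical per-position scores over the selected positions."""
    if not positions:
        return float("nan")
    return sum(scores[i] for i in positions) / len(positions)


if __name__ == "__main__":
    xs = ["A", "B", "C"]
    copy_target = xs + [EOS]            # assumed target for copy
    echo_target = xs + [xs[-1], EOS]    # assumed target for echo
    print(diverging_positions(copy_target, echo_target))
```

For a three-symbol input, the only diverging position in the overlap is the one where copy emits the end-of-sequence token and echo emits the additional x_n, which matches the example in the excerpt.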
