Conference Paper

Tool support for data validation by end-user programmers

DOI: 10.1145/1368088.1368226 Conference: 30th International Conference on Software Engineering (ICSE 2008), Leipzig, Germany, May 10-18, 2008
Source: DBLP


End-user programming tools for creating spreadsheets and webforms offer no data types except "string" for storing many kinds of data, such as person names and street addresses. Consequently, these tools cannot automatically validate these data. To address this problem, we have developed a new user-extensible model for string-like data. Each "tope" in this model is a user-defined abstraction that guides the interpretation of strings as a particular kind of data, such as a mailing address. Specifically, each tope implementation contains software functions for recognizing and reformatting that tope's kind of data. With our tools, end-user programmers define new topes and associate them with fields in spreadsheets, webforms, and other programs. This makes it possible at runtime to distinguish between invalid data, valid data, and questionable data that could be valid or invalid. Once identified, questionable or invalid data can be double-checked and possibly corrected, thereby increasing the overall reliability of the data.
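As a rough illustration of the idea described above, the following Python sketch pairs a recognizer with a reformatter and returns one of three validity levels. It is not the authors' implementation: the names `Tope`, `Validity`, `recognize_phone`, `reformat_phone`, and `validate_field`, as well as the choice of a 10-digit phone number as the kind of data, are assumptions made only for this example.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Validity(Enum):
    """Three-way outcome mentioned in the abstract (names are hypothetical)."""
    INVALID = "invalid"
    QUESTIONABLE = "questionable"
    VALID = "valid"


@dataclass
class Tope:
    """Hypothetical container: one recognizer and one reformatter per kind of data."""
    name: str
    recognize: Callable[[str], Validity]
    reformat: Callable[[str], str]


def recognize_phone(text: str) -> Validity:
    """Assumed rule: 10 digits are required; only NNN-NNN-NNNN is fully trusted."""
    digits = re.sub(r"\D", "", text)
    if len(digits) != 10:
        return Validity.INVALID
    if re.fullmatch(r"\d{3}-\d{3}-\d{4}", text.strip()):
        return Validity.VALID
    # Right number of digits but an unexpected layout: flag for double-checking.
    return Validity.QUESTIONABLE


def reformat_phone(text: str) -> str:
    """Rewrite any 10-digit input into the assumed canonical NNN-NNN-NNNN form."""
    digits = re.sub(r"\D", "", text)
    if len(digits) != 10:
        raise ValueError(f"cannot reformat {text!r}: expected 10 digits")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


phone_tope = Tope("phone number", recognize_phone, reformat_phone)


def validate_field(tope: Tope, values: List[str]) -> Dict[str, Validity]:
    """Classify every value of a field (e.g. a spreadsheet column) with one tope."""
    return {value: tope.recognize(value) for value in values}


if __name__ == "__main__":
    column = ["412-555-0123", "(412) 555 0123", "555-0123"]
    for value, verdict in validate_field(phone_tope, column).items():
        print(f"{value!r}: {verdict.value}")
```

In this sketch, a value flagged as questionable can be shown to the user for confirmation and then passed through `reformat_phone` to bring it into the canonical form, while an invalid value is rejected outright.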



Available from: Brad A. Myers
  • Source
    ABSTRACT: Ultra-Large Scale (ULS) systems comprise numerous software elements designed and implemented by independent stakeholders whose requirements may vary widely. Consequently, elements in a ULS system may use different data formats, which complicates integration of elements. Writing code to robustly convert data from one format to another requires time and skills that some programmers may lack. Worse, the stakeholders who control a software element may change the element's data format at any point in the future without warning, causing format incompatibility not foreseen during the ULS system's construction. To address heterogeneity of data formats, we present a new abstraction called "topes". Each tope describes one kind of data, including known formats of that data and rules for transforming values among formats. Labeling the inputs and outputs of software elements with topes raises the level of abstraction so that elements produce and consume certain kinds of data, rather than particular formats.
    Full-text · Article · Jan 2008