Carl Lagoze’s research while affiliated with the University of Michigan and other institutions


Publications (160)


Platformed Knowledge Brokerage in Education: Power and Possibilities
  • Chapter

November 2021 · 73 Reads · 2 Citations

J. W. Hammond · Carl Lagoze · [...]

To examine some of the ways platforms facilitate knowledge brokerage within the field of education, this chapter establishes the concept of platformed knowledge brokerage and critically compares four case examples: EdArXiv, Marginal Syllabus, Teachers Pay Teachers, and What Works Clearinghouse. Analytic questions focus on who is involved in the brokerage process, the nature of the knowledge objects exchanged, the ways platforms organize knowledge, and the functions available to platform users in the brokerage process. The chapter discusses the implications these platforms have for the movement and transformation of knowledge through networks in education, highlighting questions concerning whose voices are amplified by online platforms, how, and to whom. It concludes by setting the stage for future research in this area.


Metajelo: a Metadata Package for Journals to Support External Linked Objects
  • Article
  • Full-text available

October 2021 · 10 Reads

International Journal of Digital Curation

We propose a metadata package that is intended to provide academic journals with a lightweight means of registering, at the time of publication, the existence and disposition of supplementary materials. Information about the supplementary materials is, in most cases, critical for the reproducibility and replicability of scholarly results. In many instances, these materials are curated by a third party, which may or may not follow developing standards for the identification and description of those materials. As such, the vocabulary described here complements existing initiatives that specify vocabularies to describe the supplementary materials or the repositories and archives in which they have been deposited. Where possible, it reuses elements of other relevant vocabularies, facilitating coexistence with them. Furthermore, it provides an “at publication” record of the reproducibility characteristics of a particular article that has been selected for publication. The proposed metadata package documents the key characteristics that journals care about in the case of supplementary materials held by third parties: existence, accessibility, and permanence. It does so in a robust, time-invariant fashion at the time of publication, when the editorial decisions are made. It also allows for better documentation of less accessible (non-public) data by treating it symmetrically from the point of view of the journal, thereby increasing the transparency of what until now has been very opaque.
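
As a rough sketch of the kind of record such a package might contain, the following Python dictionary models a metajelo-style entry for one article. All field names here are illustrative assumptions for this sketch, not the actual metajelo vocabulary.

```python
# Hypothetical metajelo-style record for one article's supplementary
# materials. Field names are illustrative stand-ins; the real vocabulary
# is defined by the metajelo specification itself.
supplementary_record = {
    "article_doi": "10.1234/example.5678",           # the published article
    "recorded_at": "2021-10-01",                     # fixed at publication time
    "linked_objects": [
        {
            "identifier": "hdl:1902.1/00000",        # third-party object
            "repository": "Example Data Archive",
            "existence": "confirmed",                # the object exists
            "accessibility": "restricted",           # non-public, but documented
            "permanence": "preservation-committed",  # archive's retention commitment
        }
    ],
}

def documents_key_characteristics(record: dict) -> bool:
    """Check that every linked object records the three characteristics
    the abstract says journals care about: existence, accessibility,
    and permanence."""
    required = {"existence", "accessibility", "permanence"}
    return all(required <= set(obj) for obj in record["linked_objects"])

print(documents_key_characteristics(supplementary_record))  # True
```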


Figures: Illustrations of Aggregate and Collapse · Appending Datasets · Examples of Conditional Execution by Row and Dataframe in Stata
Provenance Metadata for Statistical Data: An Introduction to Structured Data Transformation Language (SDTL)

July 2020 · 89 Reads · 4 Citations

IASSIST Quarterly

Structured Data Transformation Language (SDTL) provides structured, machine actionable representations of data transformation commands found in statistical analysis software. The Continuous Capture of Metadata for Statistical Data Project (C2Metadata) created SDTL as part of an automated system that captures provenance metadata from data transformation scripts and adds variable derivations to standard metadata files. SDTL also has potential for auditing scripts and for translating scripts between languages. SDTL is expressed in a set of JSON schemas, which are machine actionable and easily serialized to other formats. Statistical software languages have a number of special features that have been carried into SDTL. We explain how SDTL handles differences among statistical languages and complex operations, such as merging files and reshaping data tables from “wide” to “long”.
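
As a purely hypothetical illustration of what a machine-actionable representation of one such command might look like (the property names below are assumptions for this sketch, not the published SDTL JSON schemas), consider a Stata-style derivation serialized from Python:

```python
import json

# Hypothetical, simplified SDTL-like rendering of a single Stata command:
#   generate income_k = income / 1000
# Property names are illustrative; the actual structure is defined by
# the SDTL JSON schemas.
transform = {
    "commandType": "Compute",
    "sourceLanguage": "Stata",
    "targetVariable": "income_k",       # the derived variable
    "expression": {
        "operator": "divide",
        "arguments": ["income", 1000],  # provenance at the variable level
    },
}

print(json.dumps(transform, indent=2))  # easily serialized to other formats
```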


Automating the Capture of Data Transformation Metadata from Statistical Analysis Software

July 2020 · 25 Reads · 1 Citation

The C2Metadata (“Continuous Capture of Metadata for Statistical Data”) Project automates one of the most burdensome aspects of documenting the provenance of research data: describing data transformations performed by statistical software. Researchers in many fields use statistical software (SPSS, Stata, SAS, R, Python) for data transformation and data management as well as analysis. The C2Metadata Project creates a metadata workflow paralleling the data management process by deriving provenance information from scripts used to manage and transform data. C2Metadata differs from most previous data provenance initiatives by documenting transformations at the variable level rather than describing a sequence of opaque programs. Scripts used with statistical software are translated into an independent Structured Data Transformation Language (SDTL), which serves as an intermediate language for describing data transformations. SDTL can be used to add variable-level provenance to data catalogs and codebooks and to create “variable lineages” for auditing software operations. Better data documentation makes research more transparent and expands the discovery and re-use of research data.
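
To make variable-level provenance concrete, here is a minimal toy sketch (not the project's actual translator) that scans simplified generate-style commands and records which source variables feed each derived variable:

```python
import re

# Names treated as functions rather than source variables in this toy.
FUNCTIONS = {"log", "exp", "sqrt"}

def variable_lineage(script_lines):
    """Toy lineage extractor: map each derived variable to the source
    variables appearing in its defining expression."""
    lineage = {}
    for line in script_lines:
        m = re.match(r"\s*generate\s+(\w+)\s*=\s*(.+)", line)
        if m:
            target, expr = m.group(1), m.group(2)
            names = set(re.findall(r"[A-Za-z_]\w*", expr))
            lineage[target] = sorted(names - FUNCTIONS)
    return lineage

script = [
    "generate bmi = weight / (height * height)",
    "generate log_bmi = log(bmi)",
]
print(variable_lineage(script))
# {'bmi': ['height', 'weight'], 'log_bmi': ['bmi']}
```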


Research Synthesis Infrastructures: Shaping Knowledge in Education

March 2020 · 93 Reads · 11 Citations

Review of Research in Education

Research syntheses provide one means of managing the proliferation of research knowledge by integrating learnings across primary research studies. What it means to appropriately synthesize research, however, remains a matter of debate: Syntheses can assume a variety of forms, each with important implications for the shape knowledge takes and the interests it serves. To help shed light on these differences and their stakes, this chapter provides a critical comparative review of six research synthesis infrastructures, entities that support research syntheses through investments they make in synthesis production and/or publication—enabling (and constraining) the ways knowledge takes shape. Identifying our critical cases through purposive selection, we examined research synthesis infrastructure variations with respect to four different kinds of investments they make: in the genres of synthesis they support, in their promotion of synthesis quality, in sponsoring stakeholder engagement, and in creating the conditions for collective work. We draw on this comparison to suggest some of the potential changes and challenges in store for education researchers in future years.


ScriptNumerate: A Data-to-Advice Pipeline using Compound Digital Objects to Increase the Interoperability of Computable Biomedical Knowledge

December 2018 · 14 Reads · 5 Citations

AMIA Annual Symposium Proceedings

Many obstacles must be overcome to generate new biomedical knowledge from real-world data and then directly apply the newly generated knowledge for decision support. Attempts to bridge the processes of data analysis and technical implementation of analytic results reveal a number of gaps. As one example, the knowledge format used to communicate results from data analysis often differs from the knowledge format required by systems to compute advice. We asked whether a shared format could be used by both processes. To address this question, we developed a data-to-advice pipeline called ScriptNumerate. ScriptNumerate analyzes historical e-prescription data and communicates its results in a compound digital object format. ScriptNumerate then uses these same compound digital objects to compute its advice about whether new e-prescriptions have common, rare, or unprecedented instructions. ScriptNumerate demonstrates that data-to-advice pipelines are feasible. In the future, data-to-advice pipelines similar to ScriptNumerate may help support Learning Health Systems.
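
A citing study quoted later on this page describes the core classification rule: an instruction is "rare" when its dose, route, and frequency combination occurs in at most 10% of historical cases. As a minimal sketch under that assumption (the data layout is invented for illustration):

```python
from collections import Counter

def classify_instruction(history, new_sig, rare_cutoff=0.10):
    """Classify a (dose, route, frequency) signature as common, rare, or
    unprecedented relative to historical e-prescription data."""
    counts = Counter(history)
    total = sum(counts.values())
    n = counts.get(new_sig, 0)
    if n == 0:
        return "unprecedented"
    return "rare" if n / total <= rare_cutoff else "common"

# Invented historical data for illustration.
history = [("500 mg", "oral", "BID")] * 90 + [("250 mg", "oral", "QID")] * 10

print(classify_instruction(history, ("500 mg", "oral", "BID")))  # common
print(classify_instruction(history, ("250 mg", "oral", "QID")))  # rare
print(classify_instruction(history, ("750 mg", "IV", "QD")))     # unprecedented
```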


Figures: Project “Code as a Research Object” · Figshare portal for PLOS · Figshare portal for Monash University · Figshare system architecture
Re-integrating scholarly infrastructure: The ambiguous role of data sharing platforms

June 2018 · 359 Reads · 74 Citations

Big Data & Society

Web-based platforms play an increasingly important role in managing and sharing research data of all types and sizes. This article presents a case study of the data storage, sharing, and management platform Figshare. We argue that such platforms are displacing and reconfiguring the infrastructure of norms, technologies, and institutions that underlies traditional scholarly communication. Using a theoretical framework that combines infrastructure studies with platform studies, we show that Figshare leverages the platform logic of core and complementary components to re-integrate a presently splintered scholarly infrastructure. By means of this logic, platforms may provide the path to bring data inside a scholarly communication system still optimized mainly for text publications. Yet the platform strategy also risks turning over critical scientific functions to private firms whose longevity, openness, and corporate goals remain uncertain. It may amplify the existing trend of splintering infrastructures, with attendant effects on equity of service.


Figures: The learning health cycle of the learning health system with 3 information flows and 8 steps · Portion of the Knowledge Object Reference Ontology (KORO) indicating the required and optional parts of a knowledge object content package · Portion of KORO indicating the whole and parts of a knowledge object with key relationships (BFO, Basic Formal Ontology; IAO, Information Artifact Ontology)
The Knowledge Object Reference Ontology (KORO): A formalism to support management and sharing of computable biomedical knowledge for learning health systems

April 2018 · 7,046 Reads · 44 Citations

Introduction: Health systems are challenged by care underutilization, overutilization, disparities, and related harms. One problem is a multiyear latency between the discovery of new best-practice knowledge and its widespread adoption. Decreasing this latency requires new capabilities to better manage and more rapidly share biomedical knowledge in computable forms. Knowledge objects package machine-executable knowledge resources in a way that easily enables knowledge as a service. To help improve knowledge management and accelerate knowledge sharing, the Knowledge Object Reference Ontology (KORO) formally defines what knowledge objects are.

Methods: Development of KORO began with the identification of terms for classes of entities and for properties. Next, we established a taxonomical hierarchy of classes for knowledge objects and their parts. Development continued by relating these parts via formally defined properties. We evaluated the logical consistency of KORO, used it to answer several competency questions about parthood, and applied it to guide knowledge object implementation.

Results: As a realist ontology, KORO defines what knowledge objects are and provides details about the parts they have and the roles they play. KORO provides sufficient logic to answer several basic but important questions about knowledge objects competently, and it directly supports creators of knowledge objects by providing a formal model for these objects.

Conclusion: KORO provides a formal, logically consistent ontology of knowledge objects and their parts. It exists to help make computable biomedical knowledge findable, accessible, interoperable, and reusable. KORO is currently being used to further develop and improve computable knowledge infrastructure for learning health systems.
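
As a toy illustration of the kind of parthood reasoning described above (the class and part names are simplified assumptions for this sketch, not KORO's actual taxonomy):

```python
# Toy parthood model inspired by the abstract. Part names and their
# required/optional status are simplified assumptions, not KORO's
# actual classes.
PARTS = {
    "KnowledgeObject": {
        "metadata": "required",
        "payload": "required",              # machine-executable knowledge
        "service_description": "optional",
    },
}

def has_required_parts(obj_class: str, present: set) -> bool:
    """Competency-style question: does an instance carry every part
    that its class requires?"""
    required = {p for p, s in PARTS[obj_class].items() if s == "required"}
    return required <= present

print(has_required_parts("KnowledgeObject", {"metadata", "payload"}))  # True
print(has_required_parts("KnowledgeObject", {"payload"}))              # False
```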


Table summarizing infrastructure and platform properties.
Infrastructure studies meet platform studies in the age of Google and Facebook

January 2018 · 1,310 Reads · 1,199 Citations

Two theoretical approaches have recently emerged to characterize new digital objects of study in the media landscape: infrastructure studies and platform studies. Despite their separate origins and different features, we demonstrate in this article how the cross-articulation of these two perspectives improves our understanding of current digital media. We use case studies of the Open Web, Facebook, and Google to demonstrate that infrastructure studies provides a valuable approach to the evolution of shared, widely accessible systems and services of the type often provided or regulated by governments in the public interest. On the other hand, platform studies captures how communication and expression are both enabled and constrained by new digital systems and new media. In these environments, platform-based services acquire characteristics of infrastructure, while both new and existing infrastructures are built or reorganized on the logic of platforms. We conclude by underlining the potential of this combined framework for future case studies.


Architecture and Initial Development of a Knowledge-as-a-Service Activator for Computable Knowledge Objects for Health

January 2018 · 32 Reads · 7 Citations

Studies in Health Technology and Informatics

The Knowledge Grid (KGrid) is a research and development program toward infrastructure capable of greatly decreasing latency between the publication of new biomedical knowledge and its widespread uptake into practice. KGrid comprises digital knowledge objects, an online Library to store them, and an Activator that uses them to provide Knowledge-as-a-Service (KaaS). KGrid's Activator enables computable biomedical knowledge, held in knowledge objects, to be rapidly deployed at Internet-scale in cloud computing environments for improved health. Here we present the Activator, its system architecture and primary functions.
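
The Knowledge-as-a-Service pattern can be sketched as a small HTTP wrapper around one knowledge object's executable payload. The Flask route and toy dosing rule below are assumptions for illustration, not KGrid's actual Activator interface:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def knowledge_payload(inputs: dict) -> dict:
    """Stand-in for a knowledge object's machine-executable payload:
    a toy 10 mg/kg dosing rule, invented for this sketch."""
    weight_kg = float(inputs["weight_kg"])
    return {"dose_mg": round(10.0 * weight_kg, 1)}

# Hypothetical endpoint exposing the payload as a service.
@app.post("/kos/example-dosing/evaluate")
def evaluate():
    return jsonify(knowledge_payload(request.get_json()))

if __name__ == "__main__":
    app.run(port=8080)  # e.g. POST {"weight_kg": 70} -> {"dose_mg": 700.0}
```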


Citations (79)


... Since then, more than 500,000 users have visited the eBird website (Sullivan et al. 2009). Kelling et al. (2012) proposed a Human/Computer Learning Network for Biodiversity Conservation incorporating VGI coming from eBird, using an active-learning feedback loop to improve the results of the AI algorithms. Fink et al. (2010, 2013) introduced a spatiotemporal exploratory model, STEM, and AdaSTEM to study species distribution models. ...

Reference:

Uncertainty-Aware Enrichment of Animal Movement Trajectories by VGI
eBird: A Human/Computer Learning Network for Biodiversity Conservation and Research
  • Citing Article
  • July 2012

Proceedings of the AAAI Conference on Artificial Intelligence

... Social media and websites can also serve as intermediary actors, providing an environment in which specific forms of knowledge can be housed and facilitating knowledge retrieval for specific audiences and purposes (Lawlor et al., 2021). The media sources identified mirrored the news in providing information surrounding the Capitol insurrection, rather than research-informed pedagogy for supporting students in making sense of such an event. ...

Platformed Knowledge Brokerage in Education: Power and Possibilities
  • Citing Chapter
  • November 2021

... A research article compiles and summarizes the research process concisely for presentation at academic conferences or in academic journals, while a review article surveys the progress of a particular subject by compiling and examining the findings of many research reports to synthesize new concepts or new knowledge (Hammond et al., 2020; Maeda et al., 2022). It also weighs the arguments on a particular subject to provoke criticism, which requires comparing them to find clarity on the matter. ...

Research Synthesis Infrastructures: Shaping Knowledge in Education
  • Citing Article
  • March 2020

Review of Research in Education

... This first element of verification was studied by Woods et al. [11], who used the statistical frequency of a combined representation of dose, route and frequency, in relation to a pharmaceutical product, to determine if a specific prescription is atypical for a selection of five drugs. Flynn et al. [12] expanded this approach in 2018 to 431 drugs prescribed to patients aged 75 and over. This study showed that, defining as "rare" a dose, route and frequency combination occurring in <= 10% of cases on a training set, only 27.3% of orders would be considered rare in their testing set. ...

ScriptNumerate: A Data-to-Advice Pipeline using Compound Digital Objects to Increase the Interoperability of Computable Biomedical Knowledge
  • Citing Article
  • December 2018

AMIA Annual Symposium Proceedings

... As portrayed in Figure 1 below, our approach uses a stack of technical components for managing and deploying KOs, which are digital packages holding CBK models. 27 In Figure 1, the two yellow-shaded areas are where we make new technical contributions. ...

Architecture and Initial Development of a Knowledge-as-a-Service Activator for Computable Knowledge Objects for Health
  • Citing Article
  • January 2018

Studies in Health Technology and Informatics

... With this growing potential for CBK comes the increased importance of sharing CBK artifacts to facilitate knowledge understanding and use at scale. We have developed an ontology-specified [9] knowledge object (KO) model that packages CBK with metadata and implementation information [10], and created over 100 KOs [11]. Along the way, we've learned how to enrich computable knowledge artifacts to facilitate sharing and reuse. ...

The Knowledge Object Reference Ontology (KORO): A formalism to support management and sharing of computable biomedical knowledge for learning health systems

... Most of this information is provided to researchers when they obtain access, but cannot easily be communicated to journal editors or readers of articles. Nevertheless, as we have argued (Lagoze & Vilhuber, 2017) and experienced in our own research (Abowd et al., 2009;Abowd & Vilhuber, 2005;McKinney et al., 2017), it is definitely feasible to do reproducible research in this environment. The difficulty consists in communicating that information, in a reliable fashion, to editors, referees, and readers. ...

O Privacy, Where Art Thou? Making Confidential Data Part of Reproducible Research
  • Citing Article
  • July 2017

CHANCE

... The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a standard protocol in the digital libraries community that is used for transferring metadata among digital repositories and metadata-based service providers (Lagoze & Van de Sompel, 2003). The protocol was developed in response to a need to aggregate metadata from multiple repositories to create high-quality discovery tools for researchers. ...

The Open Archives Initiative Protocol for Metadata Harvesting
  • Citing Article
  • June 2002
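
The excerpt above describes OAI-PMH as an HTTP-based protocol for harvesting metadata from repositories. A minimal harvest request looks like the sketch below; the verb and metadataPrefix parameters come from the protocol itself, while the base URL is a hypothetical placeholder.

```python
import requests

BASE_URL = "https://example.org/oai"  # hypothetical repository endpoint

# Issue a standard OAI-PMH ListRecords request for Dublin Core metadata.
response = requests.get(
    BASE_URL,
    params={"verb": "ListRecords", "metadataPrefix": "oai_dc"},
    timeout=30,
)
response.raise_for_status()
print(response.text[:500])  # XML envelope containing the harvested records
```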