
Michael Cohen
- Ph.D.
- Prof. Emeritus at University of Aizu
About
205 Publications
52,002 Reads
1,700 Citations
Introduction
Michael Cohen currently works at the Department of Computer and Information Systems, The University of Aizu. He does research in Human-Computer Interaction, Computer Graphics, and Distributed Computing. His current project is '“Twhirleds”: Spun and whirled affordances controlling multimodal mobile-ambient environments with reality distortion and synchronized lighting to preserve intuitive alignment'.
Publications (205)
The apparent paradoxes of multipresence, having avatars in multiple places simultaneously, are resolvable by an 'autofocus' feature, which uses reciprocity to project overlaid soundscapes and simulate the precedence effect to consolidate the display. We have developed an interface for narrowcasting functions for networked mobile devices deployed in...
First-person VR- and MR-based Action Observation research has thus far yielded both positive and negative findings in studies observing such tools’ potential to teach motor skills. Teaching drumming, particularly polyrhythms, is a challenging motor skill to learn and has remained largely unexplored in the field of Action Observation. In this contri...
This chapter presents the development of a wearable force-feedback mechanism designed to provide a free-range haptic experience within the spectrum of Extended Reality (XR). The proposed system offers untethered six degrees-of-freedom and small- to medium-scale force-feedback, enabling users to immerse themselves in haptic interactions within virtu...
Virtual Co-embodiment (vc) is a relatively new field of VR, enabling a user to share control of an avatar with other users or entities. According to a recent study, vc was shown to have the highest motor skill learning efficiency out of three VR-based methods. This contribution expands on these findings, as well as previous work relating to Action...
We present an extended prototype of a wearable force-feedback mechanism coupled with a Meta Quest 2 head-mounted display to enhance immersion in virtual environments. Our study focuses on the development of devices and virtual experiences that place significant emphasis on personal sensing capabilities, such as precise inside-out optical hand, head...
The recent rise in popularity of head-mounted displays (HMDs) for immersion into virtual reality has resulted in demand for new ways to interact with virtual objects. Most solutions utilize generic controllers for interaction within virtual environments and provide limited haptic feedback. We describe the construction and implementation of an ambul...
The goal of this project is to create an application that uses auditory, visual, and tactile elements to assist and guide musical instrument learners, using features available on the Nintendo Switch device and its paired “Joy-Con” detachable interface, to increase the novelty as well as intuitiveness of musical performance while stimulating user ex...
We propose a method to realize the experience of seeing through an occluding wall by using a 3D scanner and an AR (Augmented Reality) device. This system uses the LiDAR (Light Detection And Ranging) scanner of the Apple iPad Pro as a 3D scanner, sending scanned spatial mesh data to the AR device of the experiencer in real-time. By using a VPS (Visu...
We propose a new way to capture snapshots in virtual space. We created a demonstration of how this method could be incorporated into other VR (virtual reality) content. Using an Oculus Quest HMD (head-mounted display), we created a VR utility that runs on the Unity game engine. We used bimanual controllers to take snapshots. The area framed by the bimanu...
General approaches in computer graphics to compose visual effects (VFX) usually involve editing textual modules and parameters or constructing procedural node networks. These techniques are used in many game engine editors and digital content creation (DCC) tools. However, contemporary interfaces are not intuitive, especially for inexperienced use...
We investigate the potential of interactive user interfaces with omnidirectional horizontal selection. Three novel multimodal interfaces have been developed, exploring different ways of displaying and controlling spaces that encourage panoramic and pantophonic experience by featuring selection of objects distributed around the subjective equator. B...
When seated users of multimodal augmented reality (AR) systems attempt to navigate unfamiliar environments, they can become disoriented during their initial travel through a remote environment that is displayed for them via that AR display technology. Even when the multimodal displays provide mutually coherent visual, auditory, and vestibular cues...
We describe a method of achieving redirected walking by modulating subjective translation and rotation. In a real space, a user walks around without leaving a 5 m² area, but we have built a system that allows virtual movement around a larger area than the real space. This system is realized by translating and rotating the apparent ground in respon...
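A hedged sketch of the kind of per-frame update such redirection implies (the gain values below are illustrative placeholders; the paper itself modulates the apparent ground rather than exposing gains directly):

    def redirect(real_step_m, real_turn_deg, translation_gain=1.4, rotation_gain=1.2):
        """Amplify tracked motion so virtual travel exceeds the small real tracked area."""
        return real_step_m * translation_gain, real_turn_deg * rotation_gain

    print(redirect(0.5, 10.0))  # (0.7, 12.0): half a metre walked reads as 0.7 m virtually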
Instead of adding to the driving cacophony, actively orchestrated windshield wipers can enhance musical audition, reinforcing a beat by augmenting the rhythm, increasing the signal-to-noise ratio by aligning the cross-modal rhythmic beats and masking the noise, and providing “visual music,” the dance of the wipers. We recast the windshield wipers of an au...
We introduce a way of implementing physically-based renderers that can switch rendering methods with a raytracing library. Various physically-based rendering (PBR) methods can generate beautiful images that are close to the human view of the real world. However, comparison between corresponding pairs of pixels of image pairs generated by different renderin...
The RGB color model is framed by black, white, three additive primary colors, and three subtractive secondary colors, but there are many hues between them. We used a color cube for representing dimensional bases and intermediate hues and animated it to visualize color interpolation processes.
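The interpolation such an animation visualizes reduces to component-wise blending between colors of the cube; a minimal Python sketch (illustrative only, not taken from the paper):

    def lerp_rgb(c0, c1, t):
        """Blend two RGB triples component-wise for t in [0, 1]."""
        return tuple((1 - t) * a + t * b for a, b in zip(c0, c1))

    red, cyan = (1.0, 0.0, 0.0), (0.0, 1.0, 1.0)
    for i in range(5):
        t = i / 4
        print(t, lerp_rgb(red, cyan, t))  # intermediate hues along a diagonal of the cube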
Association between electroencephalography (EEG) and individual personal information is being explored by the scientific community. Though person identification using EEG is attractive to researchers, the complexity of sensing limits the use of such technologies in real-world applications. In this research, the challenge has been addressed by re...
Contemporary listeners are exposed to overlaid cacophonies of sonic sources, both intentional and incidental. Such soundscape superposition can be usefully characterized by where such combination actually occurs: in the air, at the ears of listeners, in the auditory imagery subjectively evoked by such events, or in whatever audio equipment is used...
We present an analytical framework for a cognitively informed organization of signals involved in computational representations of spatial soundscape superposition, defined here as “procedural superposition,” building on the accompanying article Part I, where we discussed physical (acoustical) and perceptual (subjective and psychological) framewo...
Spatial soundscape superposition occurs whenever multiple sound signals impinge upon a human listener's ears from multiple sources, as in augmented reality displays that combine natural soundscapes with reproduced soundscapes. Part I of this two-part contribution on spatial soundscape superposition regards perceptual superposition of soundscapes, a...
In this study, we made an “Aka-beko” choral ensemble. It comprises an octave of a chromatic musical scale, arranged in a helix and populated by animated oxen, instances of a regional mascot, who lift their heads and sing when triggered by events from a realtime MIDI keyboard controller. The application installation in the University of Aizu UBIC 3D Th...
The goal of this study is to visualize duplex auditory communication filtered with narrowcasting. Narrowcasting is a technology that enables media-stream filtering for privacy. Audio narrowcasting has four operations (solo & mute and attend & deafen). In this research, communication from source to sink is determined according to these four operations...
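One common way to encode the effect of those four operations on a source-to-sink connection (a hedged sketch; the attribute names are assumptions, not the paper's API): a muted object is always excluded, and if anything is soloed, only soloed objects pass, with deafen and attend playing the same roles for sinks.

    def active(obj, peers, suppress="mute", select="solo"):
        """Narrowcasting inclusion rule for one source (or sink) among its peers."""
        if obj.get(suppress):
            return False                      # mute (or deafen) always excludes
        if any(p.get(select) for p in peers):
            return bool(obj.get(select))      # any solo (or attend) excludes unselected peers
        return True

    sources = [{"id": "s1", "solo": True}, {"id": "s2"}, {"id": "s3", "mute": True}]
    print([s["id"] for s in sources if active(s, sources)])  # ['s1']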
A one-class classification (OCC) technique based on autoencoders is proposed in this study. OCC is especially useful in the user-authentication domain for filtering imposter data. Six different autoencoder configurations were tested for performance using the publicly available MNIST dataset. A maximum average accuracy of 94.51% over all classes was ach...
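The decision rule behind autoencoder-based OCC can be sketched independently of any particular network: accept a sample iff its reconstruction error stays below a threshold fitted on genuine data. The toy below substitutes a PCA-style linear "autoencoder"; the study's six configurations would slot into enc/dec (all names are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    train = rng.normal(size=(500, 8))                 # stand-in for genuine-class data
    mean = train.mean(axis=0)
    V = np.linalg.svd(train - mean, full_matrices=False)[2][:3]  # 3 leading components

    enc = lambda x: (x - mean) @ V.T                  # "encoder": project to bottleneck
    dec = lambda z: z @ V + mean                      # "decoder": reconstruct

    def is_genuine(x, threshold):
        """Accept iff reconstruction error is below the fitted threshold."""
        return np.mean((x - dec(enc(x))) ** 2) < threshold

    errors = [np.mean((x - dec(enc(x))) ** 2) for x in train]
    tau = np.percentile(errors, 95)                   # fixes false-reject rate on genuine data
    print(is_genuine(train[0], tau), is_genuine(train[0] + 5.0, tau))  # typically True False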
Musical relaxation is a common method to relieve personal stress. In particular, nature sounds, instrumental music, voice (chanting), "easy listening" songs, etc. can be played for relaxation. Nevertheless, the effectiveness of the sounds used for relaxation is idiosyncratic, depending on personal taste. In our approach, computer-guided audition for...
Live video streaming is becoming increasingly popular as a form of interaction in social applications. One of its main advantages is an ability to immediately create and connect a community of remote users on the spot. In this paper we discuss how this feature can be used for crowdsourced completion of simple visual search tasks (such as finding sp...
We present a system that exploits mobile rotational tracking and photospherical imagery to allow users to share their environment with remotely connected peers “on the go.” We surveyed related interfaces and developed a unique groupware application that shares a mixed reality space with spatially-oriented live video feeds. Users can collaborate thr...
Mixed reality telepresence is becoming an increasingly popular form of interaction in social and collaborative applications. We are interested in how created virtual spaces can be archived, mapped, shared, and reused among different applications. Therefore, we propose a decentralized blockchain-based peer-to-peer model of distribution, with virtual...
User authentication systems based on EEG (electroencephalography) are currently popular, marking an inflection point in the field. Recently, the scientific community has been making tremendous attempts toward perceiving the uniqueness of brain-signal patterns. Several types of methodical approaches have been proposed and prototyped to analyze EEG data...
The popularity of the contemporary smartphone makes it an attractive platform for new applications. We are exploring the potential of such personal devices to control networked displays. In particular, we have developed a system that can sense mobile phone orientation to support two kinds of juggling-like play styles: padiddle and poi. Padiddling i...
Mobile live video streaming is becoming an increasingly popular form of interaction both in social media and remote collaboration scenarios. However, in most cases the streamed video does not take mobile devices' spatial data into account (e.g., the viewers do not know the spatial orientation of a streamer), or use such data only in specific scenar...
This paper presents the key concepts and design of our proposed framework for a smart supermarket (SMARTKet). We briefly introduce the infrastructure, smart functions, and enabling technologies of the SMARTKet implementation. We especially focus on the basic principles, performance evaluation in terms of localization accuracy, and a proof-of-concep...
We have developed a networked phantom GUI control emulator that can click and type into otherwise stand-alone applications. In conjunction with the rapid-prototyping "Alice" desktop VR system and previously developed "Twhirleds" smartphone applications and network interfaces, all of which are freely available, anyone with a contemporary smartphone can...
We have created a mixed reality concert application using Alice, a 3d rapid prototyping programming environment, in which musical instruments are arranged around a virtual conductor (in this case the user) located at their center. A user-conductor can use a smartphone as a simplified baton, pointing at a preferred instrument and tapping a button to...
We have developed a phantom GUI emulator that can read from otherwise stand-alone applications, complementing a separate parallel program that can write to such applications. In conjunction with the "Alice" desktop VR system and previously developed "Collaborative Virtual Environment," both of which are freely available, virtual scene explor...
Authentication is a crucial consideration when securing data or any kind of information system. Though existing approaches for authentication are user-friendly, they have vulnerabilities, such as the possibility of criminally threatening a user. We propose a novel approach which uses Electroencephalogram (EEG) brain signals for an authentication pro...
Navigation is a basic feature of mobile robots. Self-navigating robots can be used in industry for moving loads from one place to another without human interaction. In this paper, we propose a novel approach to move a payload along a given path with a minimum number of motors and sensors. The navigation is based on lines on the floor, and the...
The previous chapter outlined the psychoacoustic theory behind cyberspatial sound, recapitulated in Figure 13.1, and the idea of audio augmented reality (AAR), including a review of its various form factors. Whereware was described as a class of location- and position-aware interfaces, particularly those featuring spatial sound. This chapter consider...
Time is the core of multimedia. Modern applications are synchronous: dynamic (interactive, runtime), realtime (updates reflected immediately), and online (networked). Sound and audio, including spatial sound and augmented audio, are especially leveraged by such distributed capabilities, not just modulation of location of virtual sound sources, but...
It is known that images seen by human eyes and those captured by a camera lens are evidently different. In general, a human eye can see the majority of the view field. However, a lens has a limited visual range due to various specifications and uses. We can obtain a panoramic image that has a wider view field using stitching techniques from input image...
We describe a musical cyberworld, "Folkways in Wonderland," in which avatar-represented users can find and listen to selections from the Smithsonian Folkways world music collection. When audition is disturbed by a cacophony of nearby tracks or avatar conversations, one's soundscape can be refined, since the system supports narrowcasting, a techniqu...
We have been working on "twirling" interfaces, featuring affordances spun in "padiddle" or "poi" style. The affordances, crafted out of mobile devices (smartphones and tablets) embedded into twirlable toys, sense their orientation and allow "mobile-ambient" individual control of public display, such as a large format screen. Typically one or two us...
Narrowcasting, in analogy to uni-, broad-, and multicasting, is a formalization of media control functions that can be used to adjust exposure and receptiveness. Its idioms have been deployed in spatial sound diffusion interfaces, internet telephony, immersive chatspaces, and collaborative music audition systems. Here, we consider its application t...
We describe a musical cyberworld as a virtual space for curating ethnomusicology, as well as for conducting research: the ethnomusicology of controlled musical cyberspaces. Our cyberworld differs from most online music curation in enabling immersive, social experience. Considering such cyber-exhibition of ethnomusicological research as itself a for...
The technology-enabled future of reading is broadly surveyed. Through innovations in digital typography and electronic publishing, computers enable new styles of reading. Audio and music, animation, video, multimedia, hypermedia, and live documents extend traditional literacy. Several classes of systems and instances thereof — including commercial...
To illuminate the alignment between mixed reality juggling toys and ambidextrous vactors twirling a projection of those toys, roomware lighting control is deployed to show the modeled position of a virtual camera spinning around each player, even while the affordances are whirled. "Tworlds" is a mixed reality multimodal toy using twirled juggling-s...
We have built haptic interfaces featuring mobile devices (smartphones, phablets, and tablets) that use compass-derived orientation sensing to animate virtual displays and ambient media. "Tworlds" is a mixed reality, multimodal toy using twirled juggling-style affordances crafted with mobile devices to modulate various displays, including 3D mod...
We have developed a mobile application featuring a navigation system enhanced with spatial sound delivered via headphones. We hope that this application (named Machi-beacon) can promote traffic and pedestrian safety. With Machi-beacon, a user can establish a destination from her current position, and the application renders a visual map and an earc...
Contemporary smartphones and tablets have magnetometers that can be used to detect yaw, which data can be distributed to adjust ambient media. We have built haptic interfaces featuring smartphones and tablets that use compass-derived orientation sensing to modulate virtual displays. Embedding mobile devices into pointing, swinging, and flailing aff...
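A hedged sketch of the sensing step: with the device held flat, yaw follows from the horizontal magnetometer components, and the heading can then index into ambient media such as a cylindrical panorama (function names are illustrative, not the Twhirleds API):

    import math

    def yaw_degrees(mx, my):
        """Heading in [0, 360) from horizontal magnetic-field components (device flat)."""
        return math.degrees(math.atan2(my, mx)) % 360

    def panorama_column(yaw, image_width):
        """Pixel column of a 360-degree panorama facing the given heading."""
        return int(yaw / 360 * image_width) % image_width

    print(panorama_column(yaw_degrees(0.0, 1.0), 4096))  # heading 90 deg -> column 1024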
Perceived roughness reports were collected for pairings of sinusoidal tones presented either over loudspeakers or headphones such that the sounds were collocated or spatially separated 90 degrees in front of the listener (±45 degrees). In the loudspeaker experiment, pairs of sinusoids were centered at 0.3, 1.0, and 3.3 kHz, and separated by hal...
Interactive table-top interfaces are multimedia devices that allow sharing information visually and aurally among several users. Table-top interfaces for spatial sound environments are frequently investigated in the field of human interfaces. Table-top interfaces are utilized as groupware, are suitable for collaborative work, and are co...
We have built interfaces featuring smartphones and tablets that use magnetometer-derived orientation sensing to control spatial sound, motion platforms, panoramic and turnoramic image-based renderings, virtual displays, and other programs. To leverage our Collaborative Virtual Environment (CVE), which is implemented in pure Java, against the power...
The aim of this research is to explore virtual sound environments with mobile devices, using iOS as the main platform and Pure Data (Pd) as a backend for sound processing. The underlying calculations are based on a human's natural and linear interpolation between virtual sound sources. As a result, the developed application allows a user to "walk around"...
Modern smartphones and tablets have magnetometers that can be used to detect yaw, which data can be distributed to adjust ambient media. Either static (pointing) or dynamic (twirling) modes can be used to modulate multimodal displays, including 360° imagery and virtual environments. Azimuthal tracking especially allows control of horizontal planar...
Human interaction with social networking services (SNS) is currently a very active research area. SNS posts, such as tweets, allow users to broadcast their ideas in short-form text, voice, or images, using mobile devices and computers. Text and speech enriched with emotions are one of the major ways of exchanging ideas, especially via telephony a...
We introduce students to the basics of human interface technology and the virtual reality paradigm, especially through "desktop VR" (a.k.a. "fishtank VR"), a "hands-on" approach emphasizing creation of self-designed virtual worlds. The main vehicle of expression is Alice, used to contextualize segments on color models, image capture and compositing...
We have built haptic interfaces featuring smartphones and tablets that use magnetometer-derived orientation sensing to modulate virtual displays, especially spatial sound, allowing, for instance, each side of a karaoke recording to be separately steered around a periphonic display. Embedding such devices into a spinnable affordance allows a "spinnin...
Speech is one of the most important signals that can be used to detect human emotions. Speech is modulated by different emotions by varying frequency- and energy-related acoustic parameters such as pitch, energy, and formants. In this paper, we describe research on analyzing inter- and intra-subband energy variations to differentiate five emotions. T...
We present a new software application based on a recently collected HRIR database comprising measurements at different distances. The new application, programmed in Pure Data, is capable of directionalizing sound objects at any azimuth, at elevations between −40° and 90°, and at distances of 20–160 cm. This truly 3D spatialization is done by pre-calcu...
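The underlying rendering step can be sketched as follows (an assumed workflow, not the paper's Pure Data code): pick the measured left/right impulse responses nearest the requested direction and distance, convolve, and scale for range.

    import numpy as np

    def spatialize(mono, hrir_l, hrir_r, distance_cm, ref_cm=100.0):
        """Binauralize a mono signal with one HRIR pair plus inverse-distance gain."""
        gain = ref_cm / max(distance_cm, 20.0)       # clamp to the database's nearest range
        left = np.convolve(mono, hrir_l) * gain
        right = np.convolve(mono, hrir_r) * gain
        return np.stack([left, right], axis=1)       # (samples, 2) stereo buffer

    fs = 44100
    mono = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1-second test tone
    hl = np.zeros(128); hl[0] = 1.0                        # placeholder impulse responses
    hr = np.zeros(128); hr[8] = 0.7                        # later, quieter right ear
    print(spatialize(mono, hl, hr, distance_cm=80).shape)  # (44227, 2)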
Conventional standards for bibliography styles entail a forced choice between index and name-year citations and corresponding references. We reject this false dichotomy, and describe a multibibliography, comprising alphabetic, sequenced, and also chronological orderings of references. An extended inline citation format is presented which integrates...
We have built haptic interfaces featuring smartphones that use magnetometer-derived orientation sensing to modulate virtual displays. Embedding such devices into swinging affordances allows a "poi"-style interface, whirling tethered devices, for a novel interaction technique. Dynamic twirling can be used to control multimodal displays, including po...
We have developed two versions of a media player designed for differently abled users. The media player, suitable for selecting and playing songs or videos, runs on either computer-hosted "Alice" or an iOS smartphone or tablet. Even though such users have trouble with a normal mouse and keyboard, this accessibility allows them to enjoy selected me...
Alice is an innovative 3d programming environment that makes it easy to create an animation. Many virtual environment (ve) models are available in the Alice 3d environment. We created ve scenes using the Alice 3d ide (integrated development environment). We deploy a beat detector to detect the rhythm of a song, based on pd (Pure Data, a free datafl...
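In the spirit of the beat detector described (a toy sketch in Python, not the pd patch): flag a frame as a beat candidate when its energy jumps above a multiple of the recent average energy.

    import numpy as np

    def detect_beats(signal, fs, frame=1024, history=43, k=1.5):
        """Return beat-candidate times (s) where frame energy exceeds k * recent average."""
        energies = [float(np.sum(signal[i:i + frame] ** 2))
                    for i in range(0, len(signal) - frame, frame)]
        beats = []
        for n in range(history, len(energies)):
            avg = np.mean(energies[n - history:n])   # ~1 s of history at 44.1 kHz
            if energies[n] > k * avg:
                beats.append(n * frame / fs)
        return beats

    fs = 44100
    t = np.arange(2 * fs) / fs
    clicks = np.sin(2 * np.pi * 200 * t) * (np.sin(2 * np.pi * 2 * t) > 0.99)  # ~2 pulses/s
    print(detect_beats(clicks, fs))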
With the emergence of virtual environments, even ordinary computer users can drag-and-drop 3d objects (cities, buildings, furniture, instruments, controls, animals and avatars) from galleries and arrange them to create attractive cyberworlds. We have created a virtual concert application using Alice, a 3d rapid prototyping programming environment,...
A Squid proxy server is a popular web-content caching service trusted by many network administrators. In this paper, we describe a method of managing the bandwidth of a Squid proxy cache through the World Wide Web, thus allowing Squid administrators and authorized users to allocate a percentage of bandwidth to a particular computer or group of compute...
We have built haptic interfaces featuring smartphones and tablets that use magnetometer-derived orientation sensing to modulate virtual displays, especially spatial sound, allowing, for instance, each side of a karaoke recording to be separately steered around a periphonic display. Embedding such devices into a spinnable affordance allows a "spinni...
"Poi," originally a Maori performance art featuring whirled tethered weights, combines elements of dance and juggling. It has been embraced by contemporary festival culture (especially rave-style electronic music events), including extension to "glowstringing," in which a glow stick (chemiluminescent plastic tube) is whirled at the end of a string....
In this paper we describe a musical cyberworld, a collaborative, immersive virtual environment for browsing musical databases, together with an experimental design launching a new subdiscipline: the ethnomusicology of controlled musical cyberspaces. Research in ethnomusicology, the ethnographic study of music in its socio-cultural environment,...
Speech is one of the most important signals that can be used to detect human emotions. Speech is modulated by different emotions by varying frequency- and energy-related acoustic parameters such as pitch, energy, and formants. In this paper, we describe research on analyzing inter- and intra-subband energy variations to differentiate five emotions....
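A sketch of the subband-energy feature extraction implied here (the band edges below are placeholders; the paper's actual subband layout may differ):

    import numpy as np

    def subband_energies(frame, fs, edges=(0, 300, 600, 1200, 2400, 4800)):
        """Power per frequency subband of one windowed speech frame."""
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
        freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
        return [float(spectrum[(freqs >= lo) & (freqs < hi)].sum())
                for lo, hi in zip(edges[:-1], edges[1:])]

    fs = 16000
    frame = np.sin(2 * np.pi * 440 * np.arange(400) / fs)   # 25 ms voiced-like frame
    print(subband_energies(frame, fs))  # energy concentrates in the 300-600 Hz band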
Since audition is omnidirectional, it is especially receptive to orientation modulation. Position can be defined as the combination of location and orientation information. Location-based or location-aware services do not generally require orientation information, but position-based services are explicitly parameterized by angular bearing as well as...
We introduce an enhancement to the Helical Keyboard, an interactive installation displaying three-dimensional musical scales aurally and visually. The Helical Keyboard features include tuning stretching mechanisms, spatial sound, and stereographic display. The improvement in the audio display is intended to facilitate pedagogic purposes by enhancin...
Since the earliest studies of human behavior, emotions have attracted attention of researchers in many disciplines, including psychology, neuroscience, and lately computer science. Speech is considered a salient conveyor of emotional cues, and can be used as an important source for emotional studies. Speech is modulated for different emotions by va...
Diffusion curves are a new kind of primitive in vector graphics, capable of representing smooth color transitions among boundaries. Their rendering requires solving Poisson's equation; much previous research relied on traditional solvers, which commonly require GPU acceleration to achieve real-time rasterization. This obviously restricts deployment...
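The core computation can be illustrated with a few lines of Jacobi relaxation on the homogeneous (Laplace) case, where curve pixels act as pinned Dirichlet colors; this is a pedagogical sketch, not the paper's solver, and it shows why naive CPU iteration is slow enough that prior work leaned on GPUs:

    import numpy as np

    def diffuse(color, fixed, iters=20000):
        """Jacobi-relax colors; 'fixed' marks curve pixels whose colors stay pinned."""
        u = color.copy()
        for _ in range(iters):
            avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                          np.roll(u, 1, 1) + np.roll(u, -1, 1))
            u = np.where(fixed, color, avg)
        return u

    img = np.zeros((64, 64)); mask = np.zeros((64, 64), dtype=bool)
    img[:, 0], mask[:, 0] = 0.0, True     # left "curve" pinned black
    img[:, -1], mask[:, -1] = 1.0, True   # right "curve" pinned white
    print(round(float(diffuse(img, mask)[32, 32]), 2))  # 0.51: linear ramp value (32/63)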
Interfaces featuring smartphones and tablets that use magnetometer-derived orientation sensing can be used to modulate virtual displays. Embedding such devices into a spinnable affordance allows a "spinning plate"-style interface, a novel interaction technique. Either static (pointing) or dynamic (whirled) mode can be used to control multimodal dis...
An introduction to spatial sound in the context of hypermedia, interactive multimedia, and virtual reality is presented. Basic principles of relevant physics and psychophysics are reviewed (ITDs: interaural time differences, IIDs: interaural intensity differences, and frequency-dependent attenuation capturable by transfer functions). Modeling of so...
HRIR~, a new software audio filter for Head-Related Impulse Response (HRIR) convolution is presented. The filter, implemented as a Pure-Data object, allows dynamic modification of a sound source's apparent location by modulating its virtual azimuth, elevation, and range in realtime, the last attribute being missing in surveyed similar applications....
The aim of this research is the integration of time-aware geomedia browsing with a virtual-environment MultiSound Management System. We researched and developed a web service and an application as content based on such information. We approached the construction and design of the function and API (Application Program Interface) to develop an advanced web...
This paper describes the achievement of the "Vertigo" cinematographic effect in "machinima" software named "Alice" that can make 3D animations. By composing virtual camera gestures of zooming in and out with camera motion, the "dolly zoom" effect can be realized.
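The dolly-zoom relationship being composed can be stated compactly: as the camera dollies, recompute the field of view so the subject's framed width stays constant. A hedged sketch of that geometry (not Alice code, which is authored visually):

    import math

    def fov_for_distance(subject_width, distance):
        """Horizontal FOV (degrees) keeping subject_width exactly framed at distance."""
        return math.degrees(2 * math.atan(subject_width / (2 * distance)))

    for d in (2.0, 4.0, 8.0):                        # dolly the camera backward...
        print(d, round(fov_for_distance(1.0, d), 1))  # ...as the FOV narrows to compensate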
Human interaction with mobile devices is currently a very active research area. Speech enriched with emotions is one of the major ways of exchanging ideas, especially via telephony. By analyzing a voice stream using a Hidden Markov Model (HMM) and Log Frequency Power Coefficients (LFPC) based system, different emotions can be recognized. Using a...
The purpose of this study is the improvement of real-time human-computer interfaces using augmented reality. We propose animating an avatar in real time with a motion capture system and augmented reality. The development of ubiquitous computing has been impressive in recent years. Augmented reality is an instance of ubiquitous computing technology. Augmented rea...