# Nokia Research Center (NRC)

• Helsinki, Finland
Recent publications
2D homojunctions have stimulated extensive attention because of their perfect thermal and lattice matches, as well as their tunable band structures in 2D morphology, which provide fascinating opportunities for novel electronics and optoelectronics. Recently, 2D nonlayered materials have attracted the attention of researchers owing to their superior functional applications and diverse portfolio of the 2D family. Therefore, 2D nonlayered homojunctions would open the door to a rich spectrum of exotic 2D materials. However, they are not investigated due to their extremely difficult synthesis methods. Herein, nonlayered CdSe flakes homojunctions are obtained via self‐limited growth with InCl3 as a passivation agent. Interestingly, two pieces of vertical wurtzite‐zinc blende (WZ‐ZB) homojunctions epitaxially integrate into WZ/ZB lateral junctions. These homojunctions show a divergent second‐harmonic generation intensity, strongly correlated to the multiple twinned ZB phase, as identified by aberration‐corrected scanning transmission electron microscopy and theoretical calculations. Impressively, the photodetector based on this WZ/ZB CdSe homojunction shows excellent performances, integrating a high photoswitching ratio (3.4 × 10⁵) and photoresponsivity (3.7 × 10³ A W⁻¹), suggesting promising potential for applications in electronics and optoelectronics. Nonlayered CdSe homojunction flakes are obtained via self‐limited growth with InCl3 as the passivation agent. The two pieces of homojunctions epitaxially integrated together display a giant divergence of second‐harmonic generation intensity, which is attributed to the twinned zinc blende structure in the homojunctions. It is further confirmed by scanning transmission electron microscopy and theoretical calculations.
This paper describes the development and evaluation of Undulating Covers (UnCovers), mobile interfaces that can change their surface texture to transmit information. The Pin Array UnCover incorporates sinusoidal ridges controlled by servomotors, which can change their amplitude and granularity. The Mylar UnCover is a more organic interface that exploits the buckling properties of Mylar, using muscle wires, to change the texture granularity. The prototype development process and evaluations show that very low frequency texture changes, using amplitude or granularity, can be distinguished with high levels of accuracy. Since small changes are perceptible, it is possible to incorporate such interfaces into mobile devices without drastically increasing their size or actuation requirements. Finally, ratings from participants indicate that UnCovers would be appropriate for attention-grabbing, or caring and supportive interpersonal messages.
Auditory interfaces offer a solution to the problem of effective eyes-free mobile interactions. In this article, we investigate the use of multilevel auditory displays to enable eyes-free mobile interaction with indoor location-based information in non-guided audio-augmented environments. A top-level exocentric sonification layer advertises information in a gallery-like space. A secondary interactive layer is used to evaluate three different conditions that varied in the presentation (sequential versus simultaneous) and spatialisation (non-spatialised versus egocentric/exocentric spatialisation) of multiple auditory sources. Our findings show that (1) participants spent significantly more time interacting with spatialised displays; (2) using the same design for primary and interactive secondary display (simultaneous exocentric) showed a negative impact on the user experience, an increase in workload and substantially increased participant movement; and (3) the other spatial interactive secondary display designs (simultaneous egocentric, sequential egocentric, and sequential exocentric) showed an increase in time spent stationary but no negative impact on the user experience, suggesting a more exploratory experience. A follow-up qualitative and quantitative analysis of user behaviour supports these conclusions. These results provide practical guidelines for designing effective eyes-free interactions for far richer auditory soundscapes.
Interactive image segmentation is increasingly useful for selecting objects of interest in images, facilitating spatially localized media manipulation, especially on touch screen devices. We present a robust and efficient approach for segmenting images with less, and more intuitive, user interaction. Our approach combines geodesic distance information with the flexibility of level set methods in energy minimization, leveraging the complementary strengths of each to promote accurate boundary placement and strong region connectivity while requiring less user interaction. We harness weakly supervised segment annotation to maximize the user-provided prior knowledge. This leads to a seed generation algorithm which enables image object segmentation without user-provided background seeds. We demonstrate that our approach is less sensitive to seed placement and better at edge localization, whilst requiring less user interaction, compared with the state-of-the-art methods.
This chapter presents visual quality assessment covering both opinion-aware and opinion-unaware models. Most of the approaches are based on understanding and modeling the underlying statistics of natural images and/or distortions using perceptual principles.
Elastically deformable materials can be created from rigid sheets through patterning appropriate meshes which can locally bend and flex. We demonstrate how microaccordion patterns can be fabricated across large areas using three-beam interference lithography. Our resulting mesh induces a large and robust elasticity within any rigid material film. Gold coating the microaccordion produces stretchable conducting films. Conductivity changes are negligible when the sample is stretched reversibly up to 30% and no major defects are introduced, in comparison to continuous sheets which quickly tear. Scaling analysis shows that our method is suited to further miniaturization and large-scale fabrication of stretchable functional films. It thus opens routes to stretchable interconnects in electronic, photonic, and sensing applications, as well as a wide variety of other deformable structures.
Recommender systems have been studied comprehensively in both academic and industrial fields over the past decade. As user interests can be affected by context at any time and any place in mobile scenarios, rich context information becomes more and more important for personalized context-aware recommendations. Although existing context-aware recommender systems can make context-aware recommendations to some extent, they suffer several inherent weaknesses: (1) Users' context-aware interests are not modeled realistically, which reduces the recommendation quality; (2) Current context-aware recommender systems ignore trust relations among users. Trust relations are actually context-aware and associated with certain aspects (i.e., categories of items) in mobile scenarios. In this article, we define a term role to model common context-aware interests among a group of users. We propose an efficient role mining algorithm to mine roles from a "user-context-behavior" matrix, and a role-based trust model to calculate context-aware trust value between two users. During online recommendation, given a user u in a context c, an efficient weighted set similarity query (WSSQ) algorithm is designed to build u's role-based trust network in context c. Finally, we make recommendations to u based on u's role-based trust network by considering both context-aware roles and trust relations. Extensive experiments demonstrate that our recommendation approach outperforms the state-of-the-art methods in both effectiveness and efficiency.
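The abstract above does not define the similarity measure behind its weighted set similarity query, so the following is only a minimal sketch under an assumption: it uses weighted Jaccard similarity as a stand-in, with each user represented as a hypothetical mapping from role names to weights. The function name and the example roles are illustrative and do not come from the paper.

```python
from typing import Dict


def weighted_jaccard(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Weighted Jaccard similarity between two weighted sets.

    Assumption: this is a generic stand-in measure, not necessarily the
    one used by the WSSQ algorithm in the paper.
    """
    keys = set(a) | set(b)
    # Sum of element-wise minima over sum of element-wise maxima.
    numerator = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    denominator = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical role weights for two users in the same context.
user_u = {"commuter": 0.5, "foodie": 0.5}
user_v = {"commuter": 0.5, "gamer": 0.5}
similarity = weighted_jaccard(user_u, user_v)
```

Users whose similarity exceeds a chosen threshold could then be treated as candidates for the role-based trust network; the threshold itself would be a tuning parameter.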
HTTP-based delivery for Video on Demand (VoD) has been gaining popularity in recent years. With the recently proposed Dynamic Adaptive Streaming over HTTP (DASH), video clients may dynamically adapt the requested video quality and bitrate to match their current download rate. To avoid playback interruption, DASH clients attempt to keep the buffer occupancy above a certain minimum level. This mechanism works well for single-view video streaming. In multi-view video streaming over DASH, however, the user initiates view switching, and only one view of the multi-view content is played by a DASH client at a given time. For such applications, how to exploit the buffered video data during the view switching process is an open problem. In this paper, we propose two fast and efficient view switching approaches in the paradigm of DASH systems, which fully exploit the already buffered video data. The advantages of the proposed approaches are twofold. First, the view switching delay is short. Second, the rate-distortion performance during the view switching period is high, i.e., less requested data is needed to achieve comparable video playback quality. The experimental results demonstrate the effectiveness of the proposed method.
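The buffer-aware rate adaptation mentioned in the abstract above can be illustrated with a small sketch. This is a generic heuristic, not the view switching method the paper proposes; the function name, the safety factor, and the minimum buffer level are all assumptions chosen for illustration.

```python
from typing import Sequence


def choose_bitrate(
    bitrates_kbps: Sequence[int],
    throughput_kbps: float,
    buffer_s: float,
    min_buffer_s: float = 10.0,  # assumed minimum buffer occupancy
    safety: float = 0.8,         # assumed fraction of throughput to use
) -> int:
    """Pick a representation bitrate for the next segment request."""
    ladder = sorted(bitrates_kbps)
    # Below the minimum buffer level, request the lowest bitrate so the
    # buffer can refill and playback is not interrupted.
    if buffer_s < min_buffer_s:
        return ladder[0]
    # Otherwise take the highest bitrate that fits the throughput budget.
    budget = safety * throughput_kbps
    fitting = [b for b in ladder if b <= budget]
    return fitting[-1] if fitting else ladder[0]


ladder = [500, 1000, 2500, 5000]
healthy = choose_bitrate(ladder, throughput_kbps=3500, buffer_s=20)
draining = choose_bitrate(ladder, throughput_kbps=3500, buffer_s=5)
```

With a healthy buffer the client tracks the measured download rate; once the buffer drops below the minimum, it falls back to the lowest representation regardless of throughput.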
The feedback and input of users have been an important part of product innovation in recent years. User input has been studied from different approaches and is applied through different methods in particular phases of the innovation process. However, these methods are not integrated into the whole innovation process and are used only in particular phases or on an ad hoc basis. New developments in technology, social media, and new ways of working closer with customers have opened up new possibilities for firms to gain user input throughout the whole innovation process. However, the impact that these new developments in technology offer for user input innovation in high-tech firms is unclear. Therefore, we study how high-tech firms collect and apply user feedback throughout the whole innovation process. The paper is based on a comparative case study of eight cases in the high-tech industry, in which qualitative data collection was applied. The key contribution of the paper is a conceptual framework on user data-driven innovation throughout the innovation cycle. This framework gives insight into user involvement types and approaches to collect and apply user feedback throughout the innovation process.
This paper proposes a method for binaural reconstruction of a sound scene captured with a portable-sized array consisting of several microphones. The proposed processing separates the scene into a sum of a small number of sources, and the spectrogram of each source is in turn represented as a small number of latent components. The direction of arrival (DOA) of each source is estimated, followed by binaural rendering of each source at its estimated direction. To represent the sources, the proposed method uses low-rank complex-valued non-negative matrix factorization combined with a DOA-based spatial covariance matrix model. The binaural reconstruction is achieved by applying the binaural cues (head-related transfer function) associated with the estimated source DOA to the separated source signals. The binaural rendering quality of the proposed method was evaluated using a speech intelligibility test. The test results indicated that, in a three-speaker scenario, the proposed binaural rendering improved the intelligibility of speech over stereo recordings and over separation by a minimum variance distortionless response beamformer with the same binaural synthesis. An additional listening test evaluating the subjective quality of the rendered output indicates that the proposed method adds no processing artifacts in comparison to an unprocessed stereo recording.
In this paper the authors experiment with multi-display mobile applications that can be used in an environment where multiple smart phones are co-located within the same physical space. Utilizing Remote User Interface interaction metaphor and the REST architectural style they propose a solution that follows the Remote Model-View-Controller model, in such a way that client devices do not need to have application specific software pre-installed. The authors demonstrate the system with the Panorama Bricks application, for displaying, in a multi-display expanded view, street-view style mirror-world panoramas, in a synchronized manner. The architecture proves that such enhanced application scenarios are possible to implement even today, utilizing off-the-shelf mobile smart phones. Their evaluations prove that responsiveness levels are high, even in scenarios where multiple objects are overlaid on top of the mirror-world panoramas.
Software mashups that combine content from multiple web sites into an integrated experience are a popular trend. However, methods, tools and architectures for creating mashups are still rather undeveloped, and there is little engineering support behind them. In this paper the authors present guidelines that can serve as a helpful starting point for the design of new mashups. The guidelines focus mainly on mashup creation methods. Furthermore, the authors describe a reference architecture for client-side mashup development. In addition, they provide insight into mashup development based on their practical experiences in implementing various sample client-side mashup applications and tools for creating them. The long-term goal of the authors' work is to facilitate the development of compelling, robust and maintainable mashup applications, and more generally to ease the transition towards web-based software development.
Because of losses in electricity conversion and storage only part of the energy taken from the power grid produces useful work in a battery-operated mobile device; the rest evaporates as heat. The authors analyze the recharging activity of a mobile phone to understand the efficiency of the different units involved (charger, EPM chipset, battery). Their measurements show that the efficiency is quite low; only about 15% of the electricity from the power grid ends up being used for the actual computing and communication elements of the mobile phone. It seems that there is room for improvement in the recharging efficiency. However, as the consumption of electricity of a single phone is small the incentive for improvements has been weak.
3D video is composed out of two or more, temporally synchronized, 2D video streams acquired at different camera poses and accompanied by geometrical information. In a mixed resolution 3D video stream, a subset of views is coded at reduced resolution. It has been shown in the literature that subjective quality of mixed resolution 3D video is close to that of full resolution 3D video. In order to improve the coding gain in mixed resolution coding scenario we present a new depth encoding method called View Upsampling Optimization. A novel depth distortion metric based on the performance of the Depth-Based Super Resolution is also presented. Finally, to improve the quality of the decoded video an improved Depth-Based Super Resolution method that uses View Synthesis Quality Mapping is used for upsampling of low resolution views. The simulations, performed with the recently standardized MVC+D encoder, show that the proposed solution combined with the state of the art View Synthesis Distortion outperforms the anchor MVC+D coding scheme by 14.5% of dBR on average for the total coded bitrate and by 17% of dBR on average for the synthesized views.
In this paper we present a novel street scene semantic recognition framework, which takes advantage of 3D point clouds captured by a high-definition LiDAR laser scanner. An important problem in object recognition is the need for sufficient labeled training data to learn robust classifiers. We show how to significantly reduce the need for manually labeled training data by reducing scene complexity using unsupervised ground and building segmentation. Our system first automatically segments the ground point cloud, because the ground connects almost all other objects and a connected-component-based algorithm is used to oversegment the point clouds. Next, building facades are detected using binary range image processing. The remaining point cloud is grouped into voxels, which are then transformed into super voxels. Local 3D features extracted from the super voxels are classified by trained boosted decision trees and labeled with semantic classes, e.g. tree, pedestrian, car. The proposed method is evaluated both quantitatively and qualitatively on a challenging fixed-position Terrestrial Laser Scanning (TLS) Velodyne data set and on two Mobile Laser Scanning (MLS) databases, Paris-rue-Madam and NAVTEQ True. Robust scene parsing results are reported.
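The voxel grouping step in the abstract above can be sketched as follows. This is a minimal illustration of assigning points to a regular voxel grid, assuming a uniform cubic voxel of a hypothetical size; it does not reproduce the paper's ground segmentation, facade detection, or super voxel construction.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


def voxelize(points: Sequence[Point], voxel_size: float) -> Dict[VoxelKey, List[int]]:
    """Group point indices by the cubic voxel that contains each point."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    voxels: Dict[VoxelKey, List[int]] = defaultdict(list)
    for index, (x, y, z) in enumerate(points):
        # floor() keeps negative coordinates in the correct cell.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        voxels[key].append(index)
    return dict(voxels)


# Hypothetical points in metres, with an assumed 0.5 m voxel size.
cloud = [(0.1, 0.2, 0.0), (0.4, 0.1, 0.3), (1.2, 0.0, 0.0), (-0.1, 0.0, 0.0)]
grid = voxelize(cloud, voxel_size=0.5)
```

Each voxel then holds the indices of its points, from which per-voxel features could be computed before any merging into larger segments.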
Object detection, recognition and pose estimation in 3D images have gained momentum due to the availability of 3D sensors (RGB-D) and the growth of large-scale 3D data, such as city maps. The most popular approach is to extract and match 3D shape descriptors that encode local scene structure but omit visual appearance. Visual appearance can be problematic due to imaging distortions, but the assumption that local shape structures are sufficient to recognise objects and scenes is largely invalid in practise, since objects may have similar shape but different texture (e.g., grocery packages). In this work, we propose an alternative appearance-driven approach which first extracts 2D primitives justified by Marr’s primal sketch; these are “accumulated” over multiple views, and the most stable ones are “promoted” to 3D visual primitives. The promoted 3D primitives represent both structure and appearance. For recognition, we propose a fast and effective correspondence matching using random sampling. For quantitative evaluation we construct a semi-synthetic benchmark dataset using a public 3D model dataset of 119 kitchen objects and another benchmark of challenging street-view images from 4 different cities. In the experiments, our method utilises only a stereo view for training. As a result, on the kitchen objects dataset our method achieved an almost perfect recognition rate for ±10° camera viewpoint change and nearly 80 % for ±20°, and on the street-view benchmarks it achieved 75 % accuracy for 160 street-view image pairs, 80 % for 96 street-view image pairs, and 92 % for 48 street-view image pairs.
This paper discusses quality metrics, as well as procedures for parameter optimization and model assessment, for human activity recognition based on sensor signals. We compare micro and macro performance measures in multiclass classification as well as various cross-validation techniques. The paper introduces the general concept of a Dual Leave-Group-of-Sources-Out cross-validation procedure. This technique provides a reliable way to optimize model parameters in practical applications and prevents overestimation of recognition quality in terms of generalization capability.
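The difference between micro and macro averaging, and the idea of holding out whole groups of sources, can be sketched in a few lines. This is an illustration under stated assumptions: it shows recall only, and a plain leave-one-group-out split rather than the paper's Dual Leave-Group-of-Sources-Out procedure, whose details the abstract does not give. The activity labels and subject identifiers are hypothetical.

```python
from collections import Counter
from typing import Iterator, List, Sequence, Tuple


def micro_macro_recall(y_true: Sequence[str], y_pred: Sequence[str]) -> Tuple[float, float]:
    """Return (micro, macro) recall for single-label multiclass data."""
    support = Counter(y_true)
    true_positives = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Micro: pool all samples, so frequent classes dominate.
    micro = sum(true_positives.values()) / len(y_true)
    # Macro: average per-class recall, so every class counts equally.
    macro = sum(true_positives[c] / support[c] for c in support) / len(support)
    return micro, macro


def leave_one_group_out(groups: Sequence[str]) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices) per group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Imbalanced toy labels: one of the two "run" samples is misclassified.
truth = ["walk"] * 8 + ["run"] * 2
preds = ["walk"] * 8 + ["walk", "run"]
micro, macro = micro_macro_recall(truth, preds)  # 0.9 and 0.75

splits = list(leave_one_group_out(["s1", "s1", "s2", "s3"]))
```

On the imbalanced toy data the micro score looks better than the macro score because the majority class dominates it, and splitting by group keeps samples from the same source out of both the training and the test set at once.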
35 members