This paper considers neurological, formational and functional similarities between gestures and signed verb predicates. From analysis of verb sign movement, we offer suggestions for analyzing gestural movement (motion capture, kinematic analysis, trajectory internal structure). From analysis of verb sign distinctions, we offer suggestions for analyzing co-speech gesture functions.
Retrieving information about the occurrences of persons in a video is an important task in many video indexing and retrieval applications. The problem is to answer the question "In which shots and scenes does person X appear?". In this paper, we present an automatic video annotation system with respect to a person's appearance, based on state-of-the-art algorithms for face detection, tracking and recognition. In contrast to many related approaches, knowledge about the persons in a given video is not assumed in advance. Adaboost is employed after an initial clustering of faces to select the best features describing a person's face. These features are then used to train new classifiers based only on the faces extracted from the video under consideration. Several possibilities for training Adaboost and support vector machine (ensemble) classifiers directly on a video are compared. Finally, experimental results demonstrate the effectiveness of correcting in-plane face rotation and of the employed self-supervised learning method.
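The AdaBoost-based feature selection mentioned in the abstract above can be sketched with decision stumps over toy numeric features; this is a generic sketch of boosting-as-feature-selection, not the paper's actual face descriptors or training setup (all data and names here are illustrative):

```python
import numpy as np

def adaboost_stumps(X, y, rounds=5):
    """Minimal AdaBoost over threshold stumps: each round picks the
    (feature, threshold, sign) that best classifies the re-weighted
    samples, so the chosen features are the most discriminative ones."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    model = []
    for _ in range(rounds):
        best = None
        for j in range(d):
            for t in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = np.where(sign * (X[:, j] - t) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, sign, pred)
        err, j, t, sign, pred = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        model.append((alpha, j, t, sign))
        w *= np.exp(-alpha * y * pred)   # up-weight the mistakes
        w /= w.sum()
    return model

def predict(model, X):
    votes = sum(a * np.where(s * (X[:, j] - t) >= 0, 1, -1)
                for a, j, t, s in model)
    return np.sign(votes)

# Toy "face features": only the second column separates the two identities,
# so boosting selects it.
X = np.array([[0.1, 0.0], [0.9, 0.2], [0.2, 0.8], [0.8, 1.0]])
y = np.array([-1, -1, 1, 1])
model = adaboost_stumps(X, y)
```

The indices of the selected stumps (`j` in each model entry) act as the "best features" for the person in question.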
This paper describes a new approach for estimating term weights in a document, and shows how the new weighting scheme can be used to improve the accuracy of a text classifier. The method uses term co-occurrence as a measure of dependency between word features. A random-walk model is applied to a graph encoding words and co-occurrence dependencies, resulting in scores that represent a quantification of how a particular word feature contributes to a given context. Experiments performed on three standard classification datasets show that the new random-walk based approach outperforms the traditional term frequency approach to feature weighting.
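The random-walk scoring the abstract describes is closely related to PageRank over a word co-occurrence graph; a minimal sketch of that idea follows (the window size, damping factor and toy sentence are illustrative assumptions, not the paper's exact scheme):

```python
from collections import defaultdict

def random_walk_weights(tokens, window=2, damping=0.85, iters=50):
    """Score terms by a PageRank-style random walk over a co-occurrence
    graph: terms within `window` positions of each other are linked, and
    repeated power iteration yields weights reflecting how central each
    term is to the context."""
    # Build an undirected co-occurrence graph.
    neighbors = defaultdict(set)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != w:
                neighbors[w].add(tokens[j])
                neighbors[tokens[j]].add(w)
    vocab = list(neighbors)
    score = {w: 1.0 / len(vocab) for w in vocab}
    for _ in range(iters):
        new = {}
        for w in vocab:
            # Each neighbor shares its score equally among its own links.
            rank = sum(score[u] / len(neighbors[u]) for u in neighbors[w])
            new[w] = (1 - damping) / len(vocab) + damping * rank
        score = new
    return score

weights = random_walk_weights("the cat sat on the mat the cat ran".split())
# Well-connected terms ("the", "cat") receive higher weights than
# peripheral ones ("ran").
```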
Automatic sentence segmentation of spoken language is an important precursor to downstream natural language processing. Previous studies combine lexical and prosodic features, but can impose significant computational challenges because of the large size of feature sets. Little is understood about which features most benefit performance, particularly for speech data from different speaking styles. We compare sentence segmentation for speech from broadcast news versus natural multi-party meetings, using identical lexical and prosodic feature sets across genres. Results based on boosting and forward selection for this task show that (1) feature sets can be reduced with little or no loss in performance, and (2) the contribution of different feature types differs significantly by genre. We conclude that more efficient approaches to sentence segmentation and similar tasks can be achieved, especially if genre differences are taken into account.
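The forward-selection loop referred to above can be sketched generically; the scoring function here is a toy stand-in for held-out segmentation performance (the feature names are illustrative):

```python
def forward_select(features, score, max_keep=None):
    """Greedy forward selection: repeatedly add the feature whose
    inclusion most improves score(subset); stop when no feature helps.
    The caller supplies the scoring function (e.g. held-out F1 of a
    boosted sentence segmenter)."""
    selected, best = [], score([])
    remaining = list(features)
    while remaining and (max_keep is None or len(selected) < max_keep):
        gains = [(score(selected + [f]), f) for f in remaining]
        top_score, top_f = max(gains)
        if top_score <= best:
            break  # no remaining feature improves performance
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected, best

# Toy score: only "pause" and "f0" carry signal, so selection stops
# before adding "energy".
useful = {"pause": 0.5, "f0": 0.25}
subset, s = forward_select(["pause", "f0", "energy"],
                           lambda fs: sum(useful.get(f, 0.0) for f in fs))
# → (['pause', 'f0'], 0.75)
```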
Interpreting the semantics of an image is a hard problem. However, for storing and indexing large multimedia collections, it is essential to build systems that can automatically extract semantics from images. In this research we show how content and context can be fused to extract semantics from digital photographs. Our experiments show that if we properly model the context associated with media, we can interpret semantics using only part of the high-dimensional content data.
The OntoNotes project is creating a corpus of large-scale, accurate, and integrated annotation of multiple levels of the shallow semantic structure in text. Such rich, integrated annotation covering many levels will allow for richer, cross-level models enabling significantly better automatic semantic analysis. At the same time, it demands a robust, efficient, scalable mechanism for storing and accessing these complex inter-dependent annotations. We describe a relational database representation that captures both the inter- and intra-layer dependencies and provide details of an object-oriented API for efficient, multi-tiered access to this data.
Existing methods in the semantic computer vision community seem unable to deal with the explosion and richness of modern, open-source and social video content. Although sophisticated methods such as object detection or bag-of-words models have been well studied, they typically operate on low-level features and ultimately suffer from either scalability issues or a lack of semantic meaning. On the other hand, video supervoxel segmentation has recently been established and applied to large-scale data processing, and potentially serves as an intermediate representation for high-level video semantic extraction. Supervoxels are rich decompositions of the video content: they capture object shape and motion well. However, it is not yet known whether supervoxel segmentation retains the semantics of the underlying video content. In this paper, we conduct a systematic study of how well actor and action semantics are retained in video supervoxel segmentation. Our study has human observers watching supervoxel segmentation videos and trying to discriminate both actor (human or animal) and action (one of eight everyday actions). We gather and analyze a large set of 640 human perceptions over 96 videos at 3 different supervoxel scales. Furthermore, we conduct machine recognition experiments on a feature defined on the supervoxel segmentation, called supervoxel shape context, which is inspired by the higher-order processes in human perception. Our ultimate findings suggest that a significant amount of semantics is well retained in the video supervoxel segmentation and can be used for further video analysis.
Research on audio-based music retrieval has primarily concentrated on refining audio features to improve search quality. However, much less work has been done on improving the time efficiency of music audio searches. Representing music audio documents in an indexable format provides a mechanism for achieving efficiency. To address this issue, in this work Exact Locality Sensitive Mapping (ELSM) is proposed to join concatenated feature sets and soft hash values. On this basis we propose two audio-based music indexing techniques, ELSM and Soft Locality Sensitive Hash (SoftLSH), using an optimized Feature Union (FU) set of extracted audio features. Two contributions are made here. First, the principle of similarity-invariance is applied in summarizing audio feature sequences and utilized in training semantic audio representations based on regression. Second, soft hash values are pre-calculated to help locate the search range more accurately and improve the collision probability among similar features. Our algorithms are implemented in a demonstration system to show how multi-version audio documents are retrieved and evaluated. Experimental evaluation over a real "multi-version" audio dataset confirms the practicality of ELSM and SoftLSH with FU and proves that our algorithms are effective for both multi-version detection (online query, one-query vs. multi-object) and same-content detection (batch queries, multi-queries vs. one-object).
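Locality-sensitive hashing, the family of techniques that ELSM and SoftLSH build on, can be illustrated with a sign-random-projection sketch; this is generic LSH over toy feature vectors, not the paper's ELSM/SoftLSH construction (the dimensions and data are illustrative):

```python
import numpy as np

def lsh_hash(x, planes):
    """Sign-random-projection LSH: hash a feature vector to a bit tuple.
    Vectors separated by a small angle collide with high probability,
    so similar audio features land in the same bucket."""
    return tuple(bool(v > 0) for v in planes @ x)

rng = np.random.default_rng(0)
planes = rng.standard_normal((8, 16))  # 8 random hyperplanes over 16-d features

a = rng.standard_normal(16)            # an audio feature vector
b = 3.0 * a                            # same direction: guaranteed collision
# A slightly perturbed copy of `a` (a "multi-version" recording) would
# collide with probability roughly (1 - angle/pi) ** 8.
```

Soft hashing, as described in the abstract, additionally probes neighboring buckets rather than relying on a single exact hash value.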
Human-computer interaction systems have been developed in recent years. These systems use multimedia techniques to create Mixed-Reality environments where users can train themselves. Although most of these systems rely strongly on interactivity with users, taking users' states into account, they still lack the ability to consider users' preferences when helping them. In this paper, we introduce an Action Support System for Interactive Self-Training (ASSIST) in cooking. ASSIST focuses on recognizing users' cooking actions as well as the real objects related to these actions, in order to provide accurate and useful assistance. Before the recognition and instruction processes, it takes users' cooking preferences and, by collaborative filtering, suggests one or more recipes that are likely to satisfy those preferences. When the cooking process starts, ASSIST recognizes users' hand movements using a similarity measure algorithm called AMSS. When the recognized cooking action is correct, ASSIST instructs the user on the next cooking procedure through virtual objects. When a cooking action is incorrect, the cause of the failure is analyzed and ASSIST provides the user with support information according to that cause, to help the user correct the action. Furthermore, we construct parallel transition models from cooking recipes for more flexible instruction. This enables users to perform the necessary cooking actions in any order they want, allowing more flexible learning.
In this paper, a framework for automatic human action segmentation and recognition in continuous action sequences is proposed. A star figure enclosed by a bounding convex polygon is used to effectively represent the extremities of the silhouette of a human body. The human action is thus recorded as a sequence of the star figure's parameters, which is used for action modeling. To model human actions in a compact manner while characterizing their spatio-temporal distributions, the star figure's parameters are represented by Gaussian mixture models (GMM). In addition, to address the intrinsic nature of temporal variations in a continuous action sequence, we transform the time sequence of star-figure parameters into the frequency domain by the discrete cosine transform (DCT) and use only the first few coefficients to represent different temporal patterns with significant discriminating power. The performance evaluation shows that the proposed framework can recognize continuous human actions in an efficient way.
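Keeping only the first few DCT coefficients of a parameter sequence, as the abstract describes, can be sketched as follows; the star-figure parameters are replaced here by toy sinusoids, and k=4 is an illustrative choice:

```python
import numpy as np

def dct_descriptor(seq, k=4):
    """Compress a time sequence of one star-figure parameter into its
    first k DCT-II coefficients, giving a fixed-length temporal
    descriptor that is insensitive to fine temporal variation."""
    n = len(seq)
    t = (np.arange(n) + 0.5) * np.pi / n
    basis = np.cos(np.outer(np.arange(k), t))  # k x n DCT-II basis
    return basis @ np.asarray(seq, dtype=float)

# A slow motion and a fast motion of the same length yield clearly
# different 4-dimensional descriptors.
slow = dct_descriptor(np.sin(np.linspace(0, np.pi, 32)))
fast = dct_descriptor(np.sin(np.linspace(0, 4 * np.pi, 32)))
```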
This paper proposes a method for automatically extracting principal video objects that appear in TV program segments and their actions using linguistic analysis of closed captions. We focus on features based on the text style of the closed captions by using Quinlan's C4.5 decision-tree learning algorithm. We extract a noun describing a video object and a verb describing an action for each video shot. To show the effectiveness of the method, we conducted experiments on the extraction of video segments in which animals appear and perform actions in twenty episodes of a Nature program. We obtained F-values of 0.609 on the extraction of video segments in which animals appear and 0.699 on extracting the action of "eating." We applied our method to a further 20 episodes, and generated a multimedia encyclopedia of animals. This provided a total of 387 video clips of 105 kinds of animals and 261 video clips of 56 kinds of actions.
A modular knowledge representation framework for conversational agents is presented. The approach has been designed to suit the situation awareness paradigm. The modularity of the framework makes possible the composition of specific modules that deal with particular features, simplifying the chatbot design process and improving the chatbot's smartness. As a proof of concept we have developed a modular, situation-awareness-oriented KB for a conversational agent, which plays the role of an advisor aimed at helping a user take charge of a virtual town, inspired by the SimCity series of games. The agent makes extensive use of semantic computing techniques and is able to perceive, comprehend and project the consequences of actions in order to handle strategic decisions under uncertainty.
In this paper, we propose an approach for affective representation of movie scenes based on the emotions that are actually felt by spectators. Such a representation can be used to characterize the emotional content of video clips for applications such as affective video indexing and retrieval and neuromarketing studies. A dataset of 64 different scenes from eight movies was shown to eight participants. While watching these clips, their physiological responses were recorded. The participants were also asked to self-assess their felt emotional arousal and valence for each scene. In addition, content-based audio and video features were extracted from the movie scenes in order to characterize each one. Degrees of arousal and valence were estimated by a linear combination of features from physiological signals, as well as by a linear combination of content-based features. We show that a significant correlation exists between the arousal/valence provided by the spectators' self-assessments and the affective grades obtained automatically from either physiological responses or audio-video features. This demonstrates the feasibility of using multimedia features and physiological responses to predict the expected affect of the user in response to emotional video content.
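Estimating arousal as a linear combination of content-based features can be sketched with ordinary least squares; the features and grades below are toy stand-ins, not the paper's dataset:

```python
import numpy as np

# Toy stand-in: per-scene content-based features (e.g. audio energy,
# shot-cut rate) and self-assessed arousal grades.
features = np.array([[0.2, 0.1], [0.8, 0.7], [0.5, 0.4], [0.9, 0.9]])
arousal = np.array([1.0, 4.0, 2.5, 4.5])

# Fit the linear combination by least squares (with an intercept term).
X = np.hstack([features, np.ones((len(features), 1))])
w, *_ = np.linalg.lstsq(X, arousal, rcond=None)
pred = X @ w

# Correlating the model's estimates with the self-assessments is the
# kind of check the abstract reports.
r = np.corrcoef(pred, arousal)[0, 1]
```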
Intelligent agents help to automate time- and resource-consuming tasks such as anomaly detection, pattern recognition, monitoring and decision-making. One of the major issues in the automation of cyberspace is the discordance between the concepts people use and the interpretation of the corresponding data by existing algorithms. Moreover, the measurement and computation of relevance, referred to as the degree of match-making, is a crucial task and presents one of the most important challenges in the unknown and uncertain environments of multi-agent systems. Optimal algorithms that generate the best matches for a user input are desired. This paper addresses these challenges by proposing an agent-based semantic match-making algorithm that tackles the problem of heterogeneous ontologies at the user end and semantically enhances the user input. A degree-of-match-making evaluation scheme based on fuzzy logic is proposed and evaluated using synthetic data from the web. The results are found to be consistent with the scale provided by existing algorithms.
This paper describes a method to partially align the ontologies of dialoguing agents that use different ontologies. The method aims at aligning, at execution time, only the concepts the agents need to fulfill the current dialog. Reducing the number of concepts to be searched in the target ontology is thus a very important requirement for the agents' mutual understanding. The proposed method, named POAM (Partial Ontology Alignment Method), uses syntactical and linguistic techniques to group concepts together. The underlying rationale of POAM is that a person perceives an object and immediately identifies some of its properties. Even never-before-seen objects can be interpreted independently of any class, because properties in the real world exist independently of any class. Hence, similarity between a pair of concepts is calculated based on the similarity of their properties. A set of measures, including syntactical, structural and semantic ones, is used to calculate the similarity between the properties associated with the concepts. A property signature vector is created for each concept, and the similarity between two concepts is given by the distance between the corresponding vectors in a high-dimensional space. We demonstrate that POAM reduces the number of candidate mappings when aligning concepts in a dialog of agents, by means of an evaluation using ontologies from the bibliographic domain of the Ontology Alignment Evaluation Initiative (OAEI). We also show that POAM performs satisfactorily with respect to the quality of results, measured with the precision and recall metrics.
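The property-signature-vector idea can be sketched with binary vectors and Euclidean distance; POAM's actual measures combine syntactical, structural and semantic similarity, so this is only a simplified illustration (the property names are made up):

```python
import math

def property_vector(concept_props, all_props):
    """Build a binary property-signature vector for a concept: one slot
    per known property, 1 if the concept has it. A simplified stand-in
    for POAM's combined similarity measures."""
    return [1.0 if p in concept_props else 0.0 for p in all_props]

def distance(u, v):
    """Euclidean distance between two signature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

props = ["title", "author", "year", "publisher"]
book    = property_vector({"title", "author", "year", "publisher"}, props)
article = property_vector({"title", "author", "year"}, props)
person  = property_vector({"author"}, props)

# Book lies closer to Article than to Person in signature space, so
# Article is the better alignment candidate.
```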
As a growing number of applications represent data as semantic graphs such as RDF (Resource Description Framework) and the many entity-attribute-value formats, query languages for such data are required to support operations beyond graph pattern matching and inference queries. Specifically, the ability to express aggregate queries is an important feature which is either lacking or is implemented with little attention to the peculiarities of the data model. In this paper, we study the meaning and implementation of grouping and aggregate queries over RDF graphs. We first define grouping and aggregate operators algebraically and then show how the SPARQL query language can be extended to express grouping and aggregate queries.
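Algebraic grouping and aggregation over solution mappings, as proposed for SPARQL above, can be sketched in a few lines (the triples, predicate and variable names are toy examples):

```python
from collections import defaultdict

# Solution mappings, as produced by matching a graph pattern such as
# "?paper :year ?year" over an RDF graph (toy data, hypothetical predicate).
solutions = [
    {"paper": "p1", "year": 2006},
    {"paper": "p2", "year": 2006},
    {"paper": "p3", "year": 2007},
]

def group_aggregate(sols, key, agg):
    """Algebraic sketch of grouping/aggregation over solution mappings:
    partition the solutions by the grouping variable, then fold each
    group with the aggregate function."""
    groups = defaultdict(list)
    for s in sols:
        groups[s[key]].append(s)
    return {k: agg(v) for k, v in groups.items()}

counts = group_aggregate(solutions, "year", len)
# → {2006: 2, 2007: 1}
```

This mirrors what a SPARQL extension with GROUP BY and COUNT would compute over the same pattern.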
We investigate the automatic generation of topic pages as an alternative to the current Web search paradigm. Topic pages explicitly aggregate information across documents, filter redundancy, and promote diversity of topical aspects. We propose a novel framework for building rich topical aspect models and selecting diverse information from the Web. In particular, we use Web search logs to build aspect models with various degrees of specificity, and then employ these aspect models as input to a sentence selection method that identifies relevant and non-redundant sentences from the Web. Automatic and manual evaluations on biographical topics show that topic pages built by our system compare favorably to regular Web search results and to MDS-style summaries of the Web results on all metrics employed.
Computing with words (CWW) is an intelligent computing methodology for processing words, linguistic variables, and their semantics, which mimics the natural-language-based reasoning mechanisms of human beings in soft computing, semantic computing, and cognitive computing. The central objects in CWW techniques are words and linguistic variables, which may be formally modeled by abstract concepts: the basic cognitive units used to identify and model a concrete entity in the real world or an abstract object in the perceived world. Therefore, concepts are the most fundamental linguistic entities that carry meaning in expression, thinking, reasoning, and system modeling, and they may be formally modeled as an abstract and dynamic mathematical structure in denotational mathematics. This paper presents a formal theory for concept and knowledge manipulation in CWW known as concept algebra. The mathematical models of abstract and concrete concepts are developed based on the object-attribute-relation (OAR) theory. The formal methodology for manipulating knowledge as a concept network is described. Case studies demonstrate that concept algebra provides a generic and formal means of knowledge manipulation, capable of dealing with complex knowledge and its algebraic operations in CWW.
Active learning has been demonstrated to be an effective approach to reducing human labeling effort in multimedia annotation tasks. However, most of the existing active learning methods for video annotation have been studied in a relatively simple context, where concepts are sequentially annotated with fixed effort and only a single modality is applied. In practice, we usually have to deal with multiple modalities, and sequentially annotating concepts without preference cannot suitably allocate annotation effort. To address these two issues, in this paper we propose a multi-concept multi-modality active learning method for video annotation in which multiple concepts and multiple modalities are simultaneously taken into consideration. In each round of active learning, this method selects the concept that is expected to achieve the highest performance gain and a batch of suitable samples to be annotated for this concept. Then, graph-based semi-supervised learning is conducted on each modality for the selected concept. The proposed method is able to make full use of human effort by considering both the learnability of different concepts and the potential of different modalities. Experimental results on the TRECVID 2005 benchmark demonstrate its effectiveness and efficiency.
In this paper, we categorize "semantics" into "taxonomical semantics", "syntactical semantics" and "formal semantics". We propose a declarative meta-language SCDL-NL as the foundation of a general annotation language in which "taxonomical and syntactical semantic" information of a sentence can be clearly defined. Since pure natural language is too complicated to be used as a general annotation language, the general annotation language imposes some restrictions on the English grammar so that it can be easily translated into SCDL-NL to facilitate information retrieval.
In this paper, we first show the importance of face-voice correlation for audio-visual person recognition. We propose a simple multimodal fusion technique which preserves the correlation between audio-visual features during speech and evaluates the performance of such a system against audio-only, video-only, and audio-visual systems which use audio and visual features neglecting the interdependency of a person’s spoken utterance and the associated facial movements. Experiments performed on the VidTIMIT dataset show that the proposed multimodal fusion scheme has a lower error rate than all other comparison conditions and is more robust against replay attacks. The simplicity of the fusion technique allows for low-complexity designs for a simple low-cost real-time DSP implementation. We discuss some problems associated with the previously proposed design and, as a solution to those problems, propose two novel classifier designs which provide more flexibility and a convenient way to represent multimodal data where each modality has different characteristics. We also show that these novel classifier designs offer superior performance in terms of both accuracy and robustness.
This work concerns non-parametric approaches to statistical learning applied to the standard knowledge representation languages adopted in the Semantic Web context. We present methods based on epistemic inference that are able to elicit and exploit the semantic similarity of individuals in OWL knowledge bases. Specifically, a totally semantic and language-independent semi-distance function is introduced, from which an epistemic kernel function for Semantic Web representations is also derived. Both the measure and the kernel function are embedded in non-parametric statistical learning algorithms customized for coping with Semantic Web representations. In particular, the measure is embedded in a k-Nearest Neighbor algorithm and the kernel function in a Support Vector Machine. The implemented algorithms are used to perform inductive concept retrieval and query answering. Experimentation on real ontologies shows that the methods can be effectively employed for performing the target tasks, and moreover that it is possible to induce new assertions that are not logically derivable.
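A semi-distance that compares individuals by their (dis)agreement on a committee of probe concepts, in the spirit of the measure described above, might be sketched as follows; the paper's exact definition differs, and the knowledge base here is a toy (all names are illustrative):

```python
def committee_distance(ind_a, ind_b, committee, member):
    """Sketch of a concept-committee semi-distance: two individuals are
    far apart when they disagree on membership in many probe concepts.
    member(x, C) returns True/False for asserted (non-)membership and
    None when membership is unknown -- the epistemic flavor the
    abstract mentions."""
    diffs = 0
    for c in committee:
        ma, mb = member(ind_a, c), member(ind_b, c)
        if ma is not None and mb is not None and ma != mb:
            diffs += 1
    return diffs / len(committee)

# Toy knowledge base: which individuals belong to which probe concepts.
kb = {
    ("alice", "Person"): True,  ("alice", "Student"): True,
    ("bob", "Person"): True,    ("bob", "Student"): False,
    ("conf", "Person"): False,  ("conf", "Student"): False,
}
member = lambda x, c: kb.get((x, c))  # None when unknown

d = committee_distance("alice", "bob", ["Person", "Student"], member)
# → 0.5  (they differ on one of the two probe concepts)
```

Such a measure can then drive a k-Nearest Neighbor classifier directly, since it only needs pairwise distances between individuals.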
Recent advances in wireless sensor network technology have completely changed the way physicians and other health professionals monitor and access patients' health status records in real time, interact with each other, and access the past and present medical records of patients. However, the sensor nodes used in a wireless sensor network to monitor patients' health are resource-constrained in nature, with limited processing and communication capability. In the future, increased use of wireless sensor networks to monitor and analyze patients' health records is envisioned, and the resource-constrained nature of wireless sensor networks therefore needs to be addressed. In this paper, an architecture to overcome the limitations of wireless sensor networks is introduced using Grid computing technology. Sensor Grid technology combines these two technologies by extending the Grid computing paradigm to the sensor resources in wireless sensor networks. This paper outlines how Sensor Grid technology provides a solution for remote patient monitoring that addresses the resource-constrained nature of the sensor devices in a wireless sensor network.
We describe an experiment mapping semantic role preferences for transitive verbs to their deverbal nominal forms. The preferences are learned by data mining large parsed corpora. Preferences are modeled for deverbal/argument pairs, falling back to a model for the deverbal alone when sufficient data is not available. Errors in role assignment are reduced by 35%.
The development of wireless sensor networks enables sensors to be embedded in everyday artifacts to create smart artifacts. Smart artifacts can deliver a variety of context-aware human-centric services. However, current systems mainly rely on ad-hoc definitions of context information, which makes it difficult to achieve knowledge sharing, reuse and reasoning. Moreover, smart-artifact applications developed by experts sometimes cannot meet end users' needs, but current systems do not allow end users to exert control over their smart homes. To avoid having to start from scratch when building new smart-artifact systems, and to empower experienced computer users to participate in the control of their smart environments, we developed a new knowledge infrastructure called Sixth-Sense. Unlike previous systems, Sixth-Sense builds on semantic web technologies. It defines a normalized ontology (called SS-ONT) using OWL. SS-ONT is focused on modeling general human-artifact interactions, and reflects several vital aspects of these interactions, such as artifact property and status description, and artifact-artifact and artifact-human relationships. Using this ontology model as a basis, we address some of the principles involved in performing context querying and context reasoning. An initial user study with 14 experienced computer users was conducted to determine the usability of our system. We also evaluate the runtime performance of our system and discuss some of the lessons learnt from the evaluation.
This paper presents a novel model for social network analysis in which, rather than analyzing the quantity of relationships (co-authorships, business relations, friendship, etc.), we analyze their communicative content. Text mining and clustering techniques ...
Two important approaches in multimedia information retrieval are classification and the ranking of the retrieved results. The technique of performing classification using Association Rule Mining (ARM) has been utilized to detect high-level features from video, taking advantage of its high efficiency and accuracy. Motivated by the fact that users are mainly interested in the top-ranked relevant results, ranking strategies have been adopted to sort the retrieved results. In this paper, an effective and efficient video high-level semantic retrieval framework that utilizes associations and correlations to retrieve and rank the high-level features is developed. The n-feature-value pair rules are generated using a combined measure based on (1) the existence of the (n - 1)-feature-value pairs, where n is larger than 1, (2) the correlation between different n-feature-value pairs and the concept classes through Multiple Correspondence Analysis (MCA), and (3) the similarity representing the harmonic mean of the inter-similarity and intra-similarity. The final association classification rules are selected by using the calculated similarity values. Then our proposed ranking process uses scores that integrate the correlation and similarity values to rank the retrieved results. To show the robustness of the proposed framework, experiments with 15 high-level features (concepts) and benchmark data sets from TRECVID, together with comparisons against 6 other well-known classifiers, are presented. Our proposed framework achieves promising performance and outperforms all the other classifiers. Moreover, the final ranked retrieved results are evaluated by the mean average precision measure, which is commonly used for performance evaluation in the TRECVID community.
Author identification algorithms attempt to ascribe documents to authors, with an eye towards diverse application areas including forensic evidence, authenticating communications, and intelligence gathering. We view author identification as a single-label classification problem, where 2000 authors would imply 2000 possible categories to assign to a post. Experiments with a naive Bayes classifier on a blog author identification task demonstrate a remarkable tendency to over-predict the most prolific authors. A literature search confirms that the class imbalance phenomenon is a challenge for author identification as well as for other machine learning tasks. We develop a vector projection method to remove this hazard, and achieve a 63% improvement in accuracy over the baseline on the same task. Our method adds no additional asymptotic computational complexity to naive Bayes, and has no free parameters to set. The projection technique will likely prove useful for other natural language tasks exhibiting class imbalance.
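The over-prediction of prolific authors can be reproduced on toy data with a small multinomial naive Bayes; note that dropping the class prior below is only a crude illustration of prior dominance, not the paper's vector projection method:

```python
import math
from collections import Counter

# Toy corpus: one prolific author, one rare author, identical word usage.
posts = [("a", "great game today")] * 9 + [("b", "great game today")]

# Multinomial naive Bayes with add-one smoothing.
class_count = Counter(author for author, _ in posts)
word_count = {a: Counter() for a in class_count}
for author, text in posts:
    word_count[author].update(text.split())

def log_score(author, text, use_prior=True):
    vocab = {w for c in word_count.values() for w in c}
    total = sum(word_count[author].values())
    s = math.log(class_count[author] / len(posts)) if use_prior else 0.0
    for w in text.split():
        s += math.log((word_count[author][w] + 1) / (total + len(vocab)))
    return s

# With the class prior, the prolific author "a" always wins even though
# both authors use words identically -- the imbalance hazard the
# abstract describes. Without the prior the scores are equal here.
```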
In this paper, we address the problem of Blind Audio Separation (BAS) by content evaluation of audio signals in the time-scale domain. Most of the proposed techniques rely on an assumption of independence, or at least uncorrelatedness, of the source signals, exploiting mutual information or second/higher-order statistics. Here, we present a new algorithm, for instantaneous mixtures, that considers only the different time-scale signature properties of the sources. Our approach builds on the advantages of the wavelet transform and proposes a new representation, the Spatial Time-Scale Distribution (STSD), to characterize the energy and interference of the observed data. Separation is achieved by joint diagonalization, without a prior orthogonality constraint, of a set of selected diagonal STSD matrices. Several criteria are proposed, in the transformed time-scale space, to assess the separated audio signal contents. We describe the logistics of the separation and the content rating, and an exemplary implementation on synthetic signals and real audio recordings shows the high efficiency of the proposed technique in restoring the audio signal contents.
The development of sophisticated technologies for service-oriented architectures (SOA) is a grand challenge. A promising approach is the employment of semantic technologies to better support the service usage cycle. Most existing solutions show significant deficits in the computational performance, which hampers the applicability in large-scale SOA systems. We present an optimization technique for automated service discovery -- one of the central operations in semantically enabled SOA environments -- that can ensure a sophisticated performance while maintaining a high retrieval accuracy. The approach is based on goals that formally describe client objectives, and it employs a caching mechanism for enhancing the computational performance of a two-phased discovery framework. At design time, the suitable services for generic and reusable goal descriptions are determined by semantic matchmaking. The result is captured in a continuously updated graph structure that organizes goals and services with respect to the requested and provided functionalities. This is exploited at runtime in order to detect the suitable services for concrete client requests with minimal effort. We formalize the approach within a first-order logic framework, and define the graph structure along with
Most of the work on automated web service composition has focused so far on the composition of stateful web services. This level of composition, the so-called "process level", considers web services with their internal and complex behaviors. At the process level, formal models such as State Transition Systems (STS) or Interface Automata are the most appropriate models to represent the internal behavior of stateful web services. However, such models focus only on the semantics of behaviors and unfortunately not on the semantics of actions and their parameters. In this paper, we suggest extending the STS model by following the WSMO-based annotation for Abstract State Machines. This semantic enhancement of STS, called S2TS, makes it possible to model the semantics of internal behaviors and the semantics of their actions together with their input and output parameters. Secondly, we focus on the automated generation of data flow (i.e., the process of performing automated assignments between parameters of services involved in a composition). Thus we do not restrict ourselves to assignments of exactly matching parameters (which is practically never the case in industrial scenarios) but extend assignments to semantically close parameters (e.g., through subsumption matching) in the same ontology. Our system is implemented and interacts with web services dedicated to telecommunication scenarios. The preliminary evaluation results showed the high efficiency and effectiveness of the proposed approach.
In this article, we propose a novel method for generating engaging multi-modal content automatically from text. Rhetorical Structure Theory (RST) is used to decompose text into discourse units and to identify rhetorical discourse relations between them. Rhetorical relations are then mapped to question–answer pairs in an information preserving way, i.e., the original text and the resulting dialogue convey essentially the same meaning. Finally, the dialogue is "acted out" by two virtual agents. The network of dialogue structures automatically built up during this process, called DialogueNet, can be reused for other purposes, such as personalization or question–answering.
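The relation-to-question mapping can be pictured with a small sketch; the templates and relation names below are illustrative placeholders, not the paper's actual mapping rules. The satellite text is kept verbatim as the answer, so the transformation is information preserving in the sense described above.

```python
# Hypothetical templates keyed by RST relation name (our invention).
QUESTION_TEMPLATES = {
    "cause":       "Why is it that {nucleus}?",
    "elaboration": "Could you elaborate on that?",
    "condition":   "Under what condition does that hold?",
}

def to_dialogue(relations):
    """relations: list of (relation, nucleus, satellite) triples from
    RST analysis. Returns (question, answer) pairs; the answer is the
    satellite text, so the original content is preserved."""
    dialogue = []
    for rel, nucleus, satellite in relations:
        template = QUESTION_TEMPLATES.get(rel)
        if template is None:
            continue  # unmapped relations remain monologue
        dialogue.append((template.format(nucleus=nucleus), satellite))
    return dialogue
```

A cause relation such as ("the flight was cancelled", "a storm closed the airport") then becomes a why-question answered by the satellite.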
Word Sense Disambiguation (WSD) is an important problem in Natural Language Processing. Supervised WSD involves assigning a sense from some sense inventory to each occurrence of an ambiguous word. Verb sense distinctions often depend on distinctions in the semantics of the target verb's arguments; therefore, some method of capturing argument semantics is crucial to the success of a verb sense disambiguation (VSD) system. In this paper we propose a novel approach to encoding the semantics of the noun arguments of a verb. This approach involves extracting various semantic properties of that verb from a large text corpus. We contrast our approach with traditional methods and show that it performs better, while the only resources it requires are a large corpus and a dependency parser.
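One hedged way to picture corpus-derived argument semantics (our simplification, not necessarily the paper's exact feature set) is to profile each noun by the verbs that take it as a direct object in parsed text; only a corpus and a dependency parser are needed, matching the resource claim above.

```python
from collections import Counter, defaultdict

def argument_profiles(triples):
    """triples: (verb, relation, noun) dependency triples emitted by a
    parser. Returns noun -> Counter of verbs that take the noun as a
    direct object; the Counter serves as a distributional stand-in for
    the noun's semantics when disambiguating a verb's sense."""
    profiles = defaultdict(Counter)
    for verb, rel, noun in triples:
        if rel == "dobj":          # keep only direct-object arcs
            profiles[noun][verb] += 1
    return profiles
```

A noun such as "apple" then ends up characterized by verbs like "eat" and "buy", which separates it from abstract nouns and can inform the sense choice for an ambiguous verb governing it.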
Extracting temporal information from raw text is fundamental for deep language understanding, and key to many applications such as question answering, information extraction, and document summarization. Our long-term goal is to build the complete temporal structure of documents and to use this structure in applications such as textual entailment, question answering, and visualization. In this paper, we present a first step: a system for extracting events, event features, main events, temporal expressions, and their normalized values from raw text. Our system combines deep semantic parsing with extraction rules, Markov Logic Network classifiers, and Conditional Random Field classifiers. To compare with existing systems, we evaluated our system on the TempEval-1 and TempEval-2 corpora. Our system outperforms or performs competitively with existing systems evaluated on the TimeBank, TempEval-1, and TempEval-2 corpora, and our performance is very close to the inter-annotator agreement of the TimeBank annotators.
The evolution of mobile communication systems to 3G and beyond introduces requirements for flexible, customized, and ubiquitous multimedia service provision to mobile users. One must be able to know at any given time the network status, the user location, the profiles of the various entities involved (users, terminals, network equipment, services), and the policies employed within the system. In other words, the system must be able to cope with a large amount of context information. The present paper focuses on location and context awareness in service provisioning and proposes a flexible and innovative model for user profiling. The innovation lies in enriching common user profiling architectures with location and other contextual attributes, so that enhanced adaptability and personalization can be achieved. For each location and context instance, an associated User Profile instance is created, and service provisioning is adapted to the User Profile instance that best applies to the current context. The generic model, the structure, and the content of this location- and context-sensitive User Profile, along with some related implementation issues, are discussed.
In peer-to-peer (P2P) networks, computers with equal rights form a logical (overlay) network in order to provide a common service that lies beyond the capacity of any single participant. Efficient similarity search is generally recognized as a research frontier in P2P systems. One way to address this issue is through data source selection based approaches, where peers summarize the data they contribute to the network, typically generating one summary per peer. When processing queries, these summaries are used to choose the peers (data sources) that are most likely to contribute to the query result; only those data sources are contacted.
This article makes several contributions. We extend earlier work by adding a data source selection method for high-dimensional vector data and by comparing different peer ranking schemes. Furthermore, we present two methods that use progressive, stepwise data exchange between peers to improve each peer's summary and thereby the system's performance. Finally, we examine the effect of these data exchange methods with respect to load balancing.
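A minimal sketch of summary-based peer selection, assuming the simplest possible design of one centroid vector per peer (the article's summaries and ranking schemes are more elaborate): peers are ranked by the distance of their summary to the query vector, and only the top-ranked peers would be contacted.

```python
import math

def rank_peers(summaries, query):
    """summaries: peer_id -> centroid vector of the peer's data.
    Returns peer ids ordered by ascending Euclidean distance between
    the peer's summary and the query vector; the head of the list holds
    the peers most likely to contribute to the query result."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sorted(summaries, key=lambda p: dist(summaries[p], query))
```

With `rank_peers({"p1": [0, 0], "p2": [5, 5]}, [1, 1])`, peer `p1` ranks first, so a query for vectors near the origin would be routed there.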
Efficient resource retrieval is a crucial issue, particularly when semantic resource descriptions are considered, since these enable the exploitation of reasoning services during the retrieval process. In this context, resources are commonly retrieved by checking whether each available resource description satisfies the given query, an approach that becomes inefficient as the number of available resources grows. We propose a method for improving the retrieval process by constructing a tree index through a new conceptual clustering method for resources expressed as class definitions or as instances of classes in ontology languages. The available resource descriptions are located at the leaf nodes of the index, while inner nodes represent intensional descriptions (generalizations) of their child nodes. Retrieval is performed by following only the tree branches whose nodes satisfy the query. Query answering time may thus be improved, as the number of retrieval steps can be O(log n) in the best case.
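The pruned tree descent can be sketched as follows; here the intensional description of an inner node is simplified to the union of the feature sets below it, a stand-in for the ontology-level generalizations used in the paper. Subtrees whose description cannot satisfy the query are skipped entirely, which is what yields the logarithmic best case.

```python
class Node:
    def __init__(self, features, children=(), resource=None):
        self.features = frozenset(features)  # intensional description
        self.children = list(children)
        self.resource = resource             # set on leaf nodes only

def retrieve(node, query):
    """Descend only into branches whose description can still satisfy
    the query; leaves reached this way are the answers."""
    if not query <= node.features:
        return []                # prune: this subtree cannot match
    if node.resource is not None:
        return [node.resource]   # leaf: an actual resource description
    results = []
    for child in node.children:
        results.extend(retrieve(child, query))
    return results
```

Querying for `{"printer", "color"}` visits only the branch whose generalization contains both features, never the duplex-only leaf.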
Domain-Specific Modeling Languages (DSMLs) play a fundamental role in the model-based design of embedded software and systems. While abstract syntax metamodeling enables the rapid and inexpensive development of DSMLs, the specification of DSML semantics is still a hard problem. In previous work, we have developed methods and tools for the semantic anchoring of DSMLs. Semantic anchoring introduces a set of reusable "semantic units" that provide reference semantics for basic behavioral categories using the Abstract State Machine framework. In this paper, we extend the semantic anchoring framework to heterogeneous behaviors by exploring methods for the composition of semantic units. Semantic unit composition reduces the required effort from DSML designers and improves the quality of the specification. The proposed method is demonstrated through a case study. Formal notions of compositionality are discussed as well as a brief comparison with similar research tools.
Nowadays, software development and maintenance are highly distributed processes that involve a multitude of supporting tools and resources. Knowledge relevant for a particular software maintenance task is typically dispersed over a wide range of artifacts in different representational formats and at different abstraction levels, resulting in isolated 'information silos'. An increasing number of task-specific software tools aim to support developers, but this often results in additional challenges, as not every project member can be familiar with every tool and its applicability for a given problem. Furthermore, historical knowledge about successfully performed modifications is lost, since only the result is recorded in versioning systems, but not how a developer arrived at the solution. In this research, we introduce conceptual models for the software domain that go beyond existing program and tool models, by including maintenance processes and their constituents. The models are supported by a pro-active, ambient, knowledge-based environment that integrates users, tasks, tools, and resources, as well as processes and history-specific information. Given this ambient environment, we demonstrate how maintainers can be supported with contextual guidance during typical maintenance tasks through the use of ontology queries and reasoning services.
Human-centric computing has grown to be the major influence in today's computing research. Due to demand from industry and even lawmakers for easy-to-use computer systems, the user is now regarded as the center of a ubiquitously available environment that supports the execution of tasks and anticipates user actions. This environment allows for the establishment of completely new ways of delivering legacy services and represents an opportunity for introducing a new type of service addressing user-focused service consumption. This shift is driven by the increasing saturation of everyday environments with computing devices. This saturation implies a numerical growth of computing systems and entails increasing complexity, which negatively influences maintainability and manageability. Moreover, the shortcomings caused by the mobility of system elements, a common trait of human-centric environments, require attention to the reliability of cooperative actions. In this paper, we present an approach that copes with complexity and dynamics while making service-oriented systems autonomous through the use of bio-inspired concepts. In particular, the aim is to make service architectures environment-aware: service architectures are supposed to adapt autonomously to their current environment, much as biological species do to survive. This requires services to obtain knowledge about the characteristics and state of the environment by gathering semantically enhanced information about the context of the computing environment, which helps in forming a virtual counterpart of the real world as a reference for service adaptation. For this purpose, we illustrate an architecture for context provisioning in highly dynamic computing environments. As the basis for this architecture, a middleware utilizing a loosely coupled interaction model is introduced.
Moreover, a pheromone-based concept is outlined to optimize the dissemination of context data in the absence of adequate context sources.
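The pheromone concept can be illustrated with a toy table (our simplification, not the outlined design): links over which context data proved useful are reinforced, all levels evaporate over time so stale routes fade, and dissemination favors the currently strongest link.

```python
class PheromoneTable:
    """Toy pheromone bookkeeping for context dissemination links."""

    def __init__(self, evaporation=0.1, deposit=1.0):
        self.levels = {}                 # link id -> pheromone level
        self.evaporation = evaporation   # fraction lost per round
        self.deposit = deposit           # amount added per success

    def reinforce(self, link):
        # a successful context delivery over this link deposits pheromone
        self.levels[link] = self.levels.get(link, 0.0) + self.deposit

    def evaporate(self):
        # periodic decay lets obsolete links lose influence
        for link in self.levels:
            self.levels[link] *= (1.0 - self.evaporation)

    def best_link(self):
        # disseminate along the strongest trail, if any exists
        return max(self.levels, key=self.levels.get) if self.levels else None
```

Reinforcement and evaporation together let the table track a changing environment: a link that stops being useful is gradually forgotten rather than explicitly removed.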
Microscopic imaging is one of the most common techniques for investigating biological systems. In recent years there has been a tremendous growth in the volume of biological imaging data owing to rapid advances in optical instrumentation, high-speed cameras and fluorescent probes. Powerful semantic analysis tools are required to exploit the full potential of the information content of these data. Semantic analysis of multi-modality imaging data, however, poses unique challenges. In this paper we outline the state-of-the-art in this area along with the challenges facing this domain. Information extraction from biological imaging data requires modeling at multiple levels of detail. While some applications require only quantitative analysis at the level of cells and subcellular objects, others require modeling of spatial and temporal changes associated with dynamic biological processes. Modeling of biological data at different levels of detail allows not only quantitative analysis but also the extraction of high-level semantics. Development of powerful image interpretation and semantic analysis tools has the potential to significantly help in understanding biological processes, which in turn will result in improvements in drug development and healthcare.
Current bioinformatics tools and databases are very heterogeneous in terms of data formats, database schemas, and terminologies. Additionally, most biomedical databases and analysis tools are scattered across different web sites, making interoperability across such different services difficult. It is desirable that these diverse databases and analysis tools be normalized, integrated, and wrapped in a semantic interface, so that users of biological data and tools can communicate with the system in natural language, and a workflow can be automatically generated and distributed to the appropriate tools. In this paper, the BioSemantic System is presented to bridge complex biological/biomedical research problems and computational solutions via semantic computing. Due to the diversity of problems in various research fields, the semantic capability description language (SCDL) plays an important role as a common language and generic form for problem formalization. Several queries, together with their corresponding SCDL descriptions, are provided as examples. For complex applications, multiple SCDL queries may be connected via control structures. For these cases, we present an algorithm that maps a user request to one or more existing services, if such services exist.
Cut detection is part of the video segmentation problem and consists in identifying the boundary between two consecutive shots; two consecutive frames that are similar are considered to belong to the same shot. This work presents an approach to cut detection using a new, simple, and efficient dissimilarity measure (which is also invariant to rotation and translation) based on the size of a bipartite graph matching. A machine learning approach is used to set some parameter values. Experimental results provide a comparison between the new approach and other popular algorithms from the literature, showing that the new algorithm is robust and performs well compared to other cut detection methods.
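A sketch of the matching-based dissimilarity, with placeholder scalar features and threshold (the paper's actual feature choice and its rotation/translation invariance construction are not reproduced here): features of two frames are connected when sufficiently similar, and the dissimilarity is one minus the normalized size of a maximum bipartite matching.

```python
def max_matching(adj, n_right):
    """Kuhn's augmenting-path algorithm. adj[u] lists the right-side
    nodes reachable from left node u; returns the matching size."""
    match_right = [-1] * n_right

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_right[v] == -1 or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    return sum(1 for u in range(len(adj)) if try_augment(u, set()))

def frame_dissimilarity(feats_a, feats_b, threshold):
    """Connect features of the two frames when they differ by at most
    `threshold`; identical frames yield 0.0, unrelated frames 1.0.
    A cut is declared when consecutive frames exceed a learned cutoff."""
    adj = [[j for j, fb in enumerate(feats_b) if abs(fa - fb) <= threshold]
           for fa in feats_a]
    matched = max_matching(adj, len(feats_b))
    return 1.0 - matched / max(len(feats_a), 1)
```

In the full method, the similarity threshold and the decision cutoff on the dissimilarity are the parameters tuned by machine learning.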
This paper describes an automatic process for checking the semantic consistency of a segmentation. The process is made possible through a graph-based formalism. We propose to apply it to checking the relevancy of the merging criteria used in an adaptive pyramid, by matching the obtained segmentation against a conceptual graph describing the objects under consideration. This matching is performed using a discrete relaxation method that checks arc-consistency with bilevel constraints of the chosen semantic graph. The efficiency of this approach is illustrated on synthetic and real images.
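The relaxation step can be approximated by classic arc-consistency filtering, shown here as an AC-3-style sketch with invented region names and a single compatibility predicate per arc; the paper's bilevel constraints are richer than this. Region labels with no compatible label in a constrained neighbor are discarded, and an emptied domain signals an inconsistent segmentation.

```python
def arc_consistency(domains, constraints):
    """domains: region -> set of candidate labels.
    constraints: (x, y) -> predicate over label pairs (a from x, b from y).
    Prunes domains in place; returns False if any domain empties,
    i.e. the segmentation cannot match the conceptual graph."""
    queue = list(constraints)
    while queue:
        x, y = queue.pop()
        pred = constraints[(x, y)]
        pruned = {a for a in domains[x]
                  if not any(pred(a, b) for b in domains[y])}
        if pruned:
            domains[x] -= pruned
            if not domains[x]:
                return False     # inconsistency detected
            # revisit arcs pointing at x, since its domain shrank
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True
```

For instance, if two adjacent regions are constrained to carry different labels and one region is already fixed, the shared label is pruned from the other region's domain.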
Semantics is the meaning of symbols, notations, concepts, functions, and behaviors, as well as their relations, expressed in terms of a set of predefined entities and/or known concepts. Semantic computing is an emerging computational methodology that models and implements computational structures and behaviors at the semantic or knowledge level, beyond that of symbolic data. In semantic computing, formal semantics can be classified into the categories of "to be", "to have", and "to do" semantics. This paper presents a comprehensive survey of formal and cognitive semantics for semantic computing in the fields of computational linguistics, software science, computational intelligence, cognitive computing, and denotational mathematics. A set of novel formal semantics, such as deductive semantics, concept-algebra-based semantics, and visual semantics, is introduced that forms a theoretical and cognitive foundation for semantic computing. Applications of formal semantics in semantic computing are presented in case studies on semantic cognition of natural languages, semantic analyses of computing behaviors, behavioral semantics of human cognitive processes, and visual semantic algebra for image and visual object manipulations.
This paper explains classical social network analysis and discusses how computer networks effect a shift in constructing social networks. The paper then concentrates on analyzing cognitive aspects of a social network, explaining a simple but scalable approach for modeling a socio-cognitive network. Novel measures based on such a socio-cognitive network model are defined, and applications of these measures to extract useful information are illustrated on the Enron email dataset. The paper then describes a Dempster-Shafer-theory-based approach to modeling a cognitive knowledge network and uses the Enron email dataset to illustrate how the proposed model can capture actors' perceptions in a knowledge network. The paper concludes with a summary of the proposed models and a discussion of new research directions that can arise from such cognitive analyses of electronic communication data.
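Dempster's rule of combination, which underlies Dempster-Shafer modeling, can be sketched directly; the mass assignments in the example are invented and merely illustrate how two actors' perceptions of a colleague's expertise might be fused.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose
    focal elements are frozensets. Masses on intersecting focal
    elements are multiplied and accumulated; mass lost to conflicting
    (disjoint) pairs is redistributed by normalization."""
    fused, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict; masses cannot be combined")
    norm = 1.0 - conflict
    return {k: v / norm for k, v in fused.items()}
```

Two mildly confident perceptions that an actor is an expert combine into a substantially stronger belief, which is the behavior that makes the rule attractive for aggregating perceptions across a knowledge network.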