Article

Extensible markup language keywords search based on security access control


Abstract

As users store and share more information in the cloud, data storage poses new challenges for Extensible Markup Language (XML) databases in big data environments. Efficient retrieval that also preserves protection and privacy when accessing mass data in the cloud is increasingly important. Most existing research on XML data query and retrieval focuses on efficiency, index construction, and similar concerns. However, these methods and algorithms do not take into account the security of the data and of the data structure themselves. Furthermore, traditional access control rules are applied when XML document nodes are read in a dynamic environment, and few studies address the data security and privacy protection requirements of dynamic keyword-based queries. To improve search efficiency under security conditions, this paper examines how to generate the sub-tree of matching keywords that the user can access under the access control rules for the user's role. A corresponding algorithm is proposed to achieve safe and efficient keyword search.
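The idea in the abstract, restricting keyword matches to the part of the tree a role may read, can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details are not available here. The policy table `DENIED_TAGS`, the sample document `DOC`, and the functions `accessible_view` and `search` are all hypothetical names introduced for illustration, and the policy is assumed to be a simple per-role set of hidden tag names.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical policy: for each role, the tags whose subtrees are hidden.
DENIED_TAGS = {
    "agent": {"history"},
    "doctor": set(),
}

# Hypothetical sample document.
DOC = """<record>
  <billing><item>insulin invoice</item></billing>
  <history><note>insulin therapy since 2010</note></history>
</record>"""


def accessible_view(root, role):
    """Return a pruned copy of the tree without subtrees hidden from the role.

    The root element itself is always kept; only descendants are pruned.
    """
    denied = DENIED_TAGS[role]  # unknown roles raise KeyError
    view = copy.deepcopy(root)  # never mutate the stored document

    def prune(node):
        for child in list(node):
            if child.tag in denied:
                node.remove(child)
            else:
                prune(child)

    prune(view)
    return view


def search(root, role, keyword):
    """Return tag paths of accessible elements whose own text contains keyword."""
    view = accessible_view(root, role)
    needle = keyword.lower()
    hits = []

    def walk(node, path):
        here = f"{path}/{node.tag}"
        if needle in (node.text or "").lower():
            hits.append(here)
        for child in node:
            walk(child, here)

    walk(view, "")
    return hits
```

With this sample, `search(ET.fromstring(DOC), "agent", "insulin")` reports only the billing item, while the `"doctor"` role also sees the match under `history`. Pruning before searching keeps hidden nodes out of the result entirely, at the cost of copying the tree on every query.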


... A system is composed of objects like databases [2] and subjects like transactions and users. A transaction is allowed to read and write an object only if the transaction is granted access rights [11] to read and write data in the object, respectively. Suppose a transaction T1 reads data d in an object o1 and writes the data d in another object o2. ...
Chapter
Full-text available
In access control models that make a system secure, a transaction is allowed to read and write an object such as a file only if access rights on the object are granted. Suppose a transaction T1 reads data d from a file f1 and then writes the data d to another file f2. Here, another transaction T2 can get the data d by reading the file f2 even if T2 is not granted a read right on the file f1. In that case, the read operation issued by the transaction T2 is illegal. In our previous studies, a condition to detect an illegal read operation is defined based on the role-based access control (RBAC) model. Once a transaction issues an illegal read operation, the transaction is aborted. However, even if the illegal condition is satisfied for a transaction issuing a read operation, illegal information flow may not occur. In this paper, we newly propose a modified read abortion (MRA) protocol which uses a new condition on maximal roles of role sets. In addition, we consider only maximal roles which include a read right on an object which a transaction can read. In the evaluation, we show that the number of aborted transactions can be reduced.
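The T1/T2 scenario above can be made concrete with a small sketch. It is a simplified taint-tracking monitor, not the MRA protocol and not the authors' role-based condition: rights are assumed to be a flat map from transaction name to readable objects, write rights are not checked, and the class `FlowMonitor` and its methods are hypothetical.

```python
class FlowMonitor:
    """Rejects a read when the object may hold data the reader cannot access."""

    def __init__(self, read_rights):
        # read_rights: transaction name -> set of objects it may read
        # (assumed flat form; the source derives rights from roles).
        self.read_rights = read_rights
        self.sources = {}  # object -> objects whose data may be stored in it
        self.holding = {}  # transaction -> objects whose data it has read

    def read(self, txn, obj):
        """Return True if the read is legal, False if it must be aborted."""
        origin = {obj} | self.sources.get(obj, set())
        if origin - self.read_rights.get(txn, set()):
            return False  # data from an unreadable object would flow to txn
        self.holding.setdefault(txn, set()).update(origin)
        return True

    def write(self, txn, obj):
        """Record that everything txn has read may now be stored in obj."""
        self.sources.setdefault(obj, set()).update(self.holding.get(txn, set()))
```

If T1 may read f1 and f2 while T2 may read only f2, then after T1 reads f1 and writes f2, `read("T2", "f2")` returns `False`, because data from f1 may now be stored in f2. This conservative check also rejects reads where no data from f1 actually reached f2, which is the kind of unnecessary abort the abstract aims to reduce.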
... A system is composed of objects like databases [2] and subjects like transactions and users. A transaction is allowed to read and write an object only if the transaction is granted access rights [10] to read and write data in the object, respectively. Suppose a transaction T1 reads data d in an object o1 and writes the data d in another object o2. ...
Chapter
Full-text available
In access control models, a transaction is allowed to read and write an object only if access rights to read and write the object are granted, respectively. Suppose a transaction T1 reads data d from a file object f1 and then writes the data d to another file object f2. Here, another transaction T2 can get the data d by reading the file object f2 even if T2 is not granted a read right on the file object f1. Thus, information in the file object f1 flows to the transaction T2 via the file object f2. We have to prevent illegal information flow from occurring when transactions manipulate objects. The role-based access control (RBAC) model is widely used in various applications like database systems. In our previous studies, the legally precedent relation from a role to a role is defined. However, even if the legal condition is satisfied, there are cases in which illegal information flow occurs. In this paper, we redefine legal and illegal precedent relations among roles. In order to check if a collection A of roles illegally precedes a collection B of roles, we introduce a new condition which uses maximal roles of A and B.
Chapter
The Semantic Web supports a set of technologies that exploit the standardization of the semantic representation of informational resources available on the web, representing the evolution of the current web. It provides a mechanism for formatting data in a machine-readable manner. It helps people with activities that are otherwise done manually and consume a lot of time in daily life, linking individual properties of these data with globally accessible schemes. With so much information available, the evolution of digital search is inevitable, and this technology eases it and supports inferences about states in scalable activities and modes. Therefore, this chapter aims to provide an overview of the semantic web and the technology behind semantic web search engines, showing and discussing how their success is related, with a concise bibliographic background, categorizing and synthesizing the potential of both technologies.
Article
Full-text available
Resources and services are accessible in pervasive computing environments from anywhere and at any time. Also, due to the ever-changing nature of such environments, the identity of users is unknown. However, users must be able to access the required resources based on their contexts. These and other similar complexities necessitate dynamic and context-aware access control models for such environments. In other words, an efficient access control model for pervasive computing environments should be aware of context information. Changes in context information imply some changes in the users' authorities. Accordingly, an access control model for a pervasive computing environment should control all accesses of unknown users to the resources based upon the participating context information, i.e., contexts of the users, resources and the environment. In this paper, a new context-aware access control model is proposed for pervasive computing environments. Contexts are classified into long-term contexts (which do not change during a session) and short-term contexts (whose steady-state period is shorter than the average duration of a session). The model assigns roles to a user dynamically at the beginning of their sessions considering the long-term contexts. However, during a session the active permission set of the assigned roles is determined based on the short-term context conditions. A formal specification of the proposed model as well as the proposed architecture are presented in this paper. Furthermore, by presenting a real case study, it is shown that the model is applicable, decidable, and dynamic. The expressiveness and complexity of the model are also evaluated.
Conference Paper
Full-text available
In this paper, we study the problem of effective keyword search over XML documents. We begin by introducing the notion of Valuable Lowest Common Ancestor (VLCA) to accurately and effectively answer keyword queries over XML documents. We then propose the concept of Compact VLCA (CVLCA) and compute the meaningful compact connected trees rooted as CVLCAs as the answers of keyword queries. To efficiently compute CVLCAs, we devise an effective optimization strategy for speeding up the computation, and exploit the key properties of CVLCA in the design of the stack-based algorithm for answering keyword queries. We have conducted an extensive experimental study and the experimental results show that our proposed approach achieves both high efficiency and effectiveness when compared with existing proposals.
Conference Paper
Full-text available
Access control for XML documents is a non-trivial topic, as can be witnessed from the number of approaches presented in the literature. Trying to compare these, we discovered the need for a simple, clear and unambiguous language to state the declarative semantics of an access control policy. All current approaches state the semantics in natural language, which has none of the above properties. This makes it hard to assess whether the proposed algorithms are correct (i.e., really implement the described semantics). It is also hard to assess the proposed policy on its merits, and to compare it to others (for file systems for instance). This paper shows how XPath can be used to specify the semantics of an access control policy for XML documents. Using XPath has great advantages: it is standard technology, widely used and it has clear and easy syntax and semantics. We use the developed framework to give a formal specification of the five most prominent approaches of access control for XML documents from the literature.
Conference Paper
Full-text available
In this paper, we address issues related to sharing information in a distributed system consisting of autonomous entities, each of which holds a private database. Semi-honest behavior has been widely adopted as the model for adversarial threats. However, it substantially underestimates the capability of adversaries in reality. In this paper, we consider a threat space containing more powerful adversaries, which includes not only semi-honest adversaries but also malicious ones. In particular, we classify malicious adversaries into two widely existing subclasses, called weakly malicious and strongly malicious adversaries, respectively. We define a measure of privacy leakage for information sharing systems and propose protocols that can effectively and efficiently protect privacy against different kinds of malicious adversaries.
Conference Paper
Full-text available
Keyword search enables web users to easily access XML data without the need to learn a structured query language and to study possibly complex data schemas. Existing work has addressed the problem of selecting qualified data nodes that match keywords and connecting them in a meaningful way, in the spirit of inferring a where clause in XQuery. However, how to infer the return clause for keyword search is an open problem. To address this challenge, we present an XML keyword search engine, XSeek, to infer the semantics of the search and identify return nodes effectively. XSeek recognizes possible entities and attributes inherently represented in the data. It also distinguishes between search predicates and return specifications in the keywords. Then based on the analysis of both XML data structures and keyword match patterns, XSeek generates return nodes. Extensive experimental studies show the effectiveness of XSeek.
Conference Paper
Full-text available
We study generalization for preserving privacy in publication of sensitive data. The existing methods focus on a universal approach that exerts the same amount of preservation for all persons, without catering for their concrete needs. The consequence is that we may be offering insufficient protection to a subset of people, while applying excessive privacy control to another subset. Motivated by this, we present a new generalization framework based on the concept of personalized anonymity. Our technique performs the minimum generalization for satisfying everybody's requirements, and thus, retains the largest amount of information from the microdata. We carry out a careful theoretical study that leads to valuable insight into the behavior of alternative solutions. In particular, our analysis mathematically reveals the circumstances where the previous work fails to protect privacy, and establishes the superiority of the proposed solutions. The theoretical findings are verified with extensive experiments.
Conference Paper
Full-text available
The prevalent use of XML highlights the need for a generic, flexible access-control mechanism for XML documents that supports efficient and secure query access, without revealing sensitive information to unauthorized users. This paper introduces a novel paradigm for specifying XML security constraints and investigates the enforcement of such constraints during XML query evaluation. Our approach is based on the novel concept of security views, which provide for each user group (a) an XML view consisting of all and only the information that the users are authorized to access, and (b) a view DTD that the XML view conforms to. Security views effectively protect sensitive data from access and potential inferences by unauthorized users, and provide authorized users with necessary schema information to facilitate effective query formulation and optimization. We propose an efficient algorithm for deriving security view definitions from security policies (defined on the original document DTD) for different user groups. We also develop novel algorithms for XPath query rewriting and optimization such that queries over security views can be efficiently answered without materializing the views. Our algorithms transform a query over a security view to an equivalent query over the original document, and effectively prune query nodes by exploiting the structural properties of the document DTD in conjunction with approximate XPath containment tests. Our work is the first to study a flexible, DTD-based access-control model for XML and its implications on the XML query-execution engine. Furthermore, it is among the first efforts for query rewriting and optimization in the presence of general DTDs for a rich class of XPath queries. An empirical study based on real-life DTDs verifies the effectiveness of our approach.
Conference Paper
Full-text available
Keyword search for smallest lowest common ancestors (SLCAs) in XML data has recently been proposed as a meaningful way to identify interesting data nodes in XML data where their subtrees contain an input set of keywords. In this paper, we generalize this useful search paradigm to support keyword search beyond the traditional AND semantics to include both AND and OR boolean operators as well. We first analyze properties of the LCA computation and propose improved algorithms to solve the traditional keyword search problem (with only AND semantics). We then extend our approach to handle general keyword search involving combinations of AND and OR boolean operators. The effectiveness of our new algorithms is demonstrated with a comprehensive experimental performance study.
Conference Paper
Full-text available
Queries navigate semistructured data via path expressions, and can be accelerated using an index. Our solution encodes paths as strings, and inserts those strings into a special index that is highly optimized for long and complex keys. We describe the Index Fabric, an indexing structure that provides the efficiency and flexibility we need. We discuss how "raw paths" are used to optimize ad hoc queries over semistructured data, and how "refined paths" optimize specific access paths. Although we can use knowledge about the queries and structure of the data to create refined paths, no such knowledge is needed for raw paths. A performance study shows that our techniques, when implemented on top of a commercial relational database system, outperform the more traditional approach of using the commercial system's indexing mechanisms to query the XML.
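The core idea of encoding paths as strings and answering path queries with string lookups can be sketched as follows. This is not the Index Fabric structure itself: a sorted Python list with binary search stands in for the specialized index, and the key format `/tag/tag=value`, the sample `DOC`, and the helpers `path_keys` and `prefix_lookup` are assumptions made for illustration.

```python
import bisect
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = (
    "<invoice>"
    "<buyer><name>ABC Corp</name></buyer>"
    "<seller><name>Acme Inc</name></seller>"
    "</invoice>"
)


def path_keys(root):
    """Encode every text-bearing element as a 'path=value' string, sorted."""
    keys = []

    def walk(node, prefix):
        here = prefix + "/" + node.tag
        text = (node.text or "").strip()
        if text:
            keys.append(here + "=" + text)
        for child in node:
            walk(child, here)

    walk(root, "")
    keys.sort()
    return keys


def prefix_lookup(keys, prefix):
    """Return all keys starting with prefix, using binary search on the sorted list."""
    start = bisect.bisect_left(keys, prefix)
    matches = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break  # keys sharing a prefix are contiguous in sorted order
        matches.append(key)
    return matches
```

For example, `prefix_lookup(path_keys(ET.fromstring(DOC)), "/invoice/buyer/")` returns the single key for the buyer's name, so a path expression becomes a range scan over strings.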
Article
Keyword search provides an easy way for users to pose queries against XML documents, and it is important to support queries with arbitrary combinations of AND, OR, and NOT operators. The previous RELMN algorithm processed such queries by extending the original SLCA definition in a straightforward way, but it did not work correctly in some cases. In this paper, we propose the concept of valid SLCAs as query results. Basically, nodes in an XML document are classified according to their usages, and this classification is further used to define the scope affected by a negative keyword. Only valid nodes, which are not affected by any negative keyword, are qualified to identify valid SLCAs. The experimental results show that the proposed algorithm achieves higher precision and recall, and is more efficient than the previous work.
Conference Paper
XML documents are frequently used in applications such as business transactions and medical records involving sensitive information. Typically, parts of documents should be visible to users depending on their roles. For instance, an insurance agent may see the billing information part of a medical document but not the details of the patient's medical history. Access control on the basis of data location or value in an XML document is therefore essential. In practice, the number of access control rules is on the order of millions, which is a product of the number of document types (in 1000's) and the number of user roles (in 100's). Therefore, the solution requires high scalability and performance. Current approaches to access control over XML documents have suffered from scalability problems because they tend to work on individual documents. In this paper, we propose a novel approach to XML access control through rule functions that are managed separately from the documents. A rule function is an executable code fragment that encapsulates the access rules (paths and predicates), and is shared by all documents of the same document type. At runtime, the rule functions corresponding to the access request are executed to determine the accessibility of document fragments. Using synthetic and real data, we show the scalability of the scheme by comparing the accessibility evaluation cost of two rule function models. We show that rule functions generated on a per-user basis are more efficient for XML databases.
Conference Paper
The extensible markup language (XML) is a promising standard for describing semi-structured information and contents on the Internet. When XML comes to be a widespread data encoding format for Web applications, safeguarding the accuracy of the information represented in XML documents will be indispensable. In this paper, we propose a provisional authorization model that provides XML with a sophisticated access control mechanism. The well-recognized need for such a system has only recently been addressed. Based on this authorization model, we present an XML access control language (XACL) that integrates security features such as authorization, non-repudiation, confidentiality, and an audit trail for XML documents. We describe our implementation, which can be used as an extension of a Web server for e-Business applications.
Conference Paper
Keyword search is a proven, user-friendly way to query HTML documents in the World Wide Web. We propose keyword search in XML documents, modeled as labeled trees, and describe corresponding efficient algorithms. The proposed keyword search returns the set of smallest trees containing all keywords, where a tree is designated as "smallest" if it contains no tree that also contains all keywords. Our core contribution, the Indexed Lookup Eager algorithm, exploits key properties of smallest trees in order to outperform prior algorithms by orders of magnitude when the query contains keywords with significantly different frequencies. The Scan Eager variant is tuned for the case where the keywords have similar frequencies. We analytically and experimentally evaluate two variants of the Eager algorithm, along with the Stack algorithm [13]. We also present the XKSearch system, which utilizes the Indexed Lookup Eager, Scan Eager and Stack algorithms and a demo of which on DBLP data is available at http://www.db.ucsd.edu/projects/xksearch. Finally, we extend the Indexed Lookup Eager algorithm to answer Lowest Common Ancestor (LCA) queries.
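The "smallest tree" semantics described above can be illustrated with a brute-force sketch: a node is reported when its subtree contains every keyword and none of its child subtrees does. This is a plain post-order traversal, not the Indexed Lookup Eager, Scan Eager, or Stack algorithms. The function `slca`, the sample `DOC`, and the choice to match keywords only against whitespace-separated words in element text (ignoring tag names and attributes) are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<school>
  <class><title>databases</title><teacher>john</teacher><student>ben</student></class>
  <class><title>networks</title><teacher>mary</teacher><student>john</student></class>
  <class><title>compilers</title><teacher>ben</teacher></class>
</school>"""


def slca(root, keywords):
    """Return elements whose subtree holds all keywords while no child subtree does."""
    wanted = {k.lower() for k in keywords}
    if not wanted:
        raise ValueError("at least one keyword is required")
    results = []

    def visit(node):
        # Keywords found directly in this element's own text.
        found = wanted & set((node.text or "").lower().split())
        child_complete = False
        for child in node:
            child_found, complete = visit(child)
            found |= child_found
            child_complete = child_complete or complete
        complete = found == wanted
        if complete and not child_complete:
            results.append(node)  # smallest subtree containing all keywords
        return found, complete

    visit(root)
    return results
```

On the sample, `slca(ET.fromstring(DOC), ["john", "ben"])` returns only the first `class` element. The root also contains both keywords but is excluded because one of its children already does. The traversal visits every node once, whereas the algorithms named in the abstract use indexes over keyword lists to avoid a full scan.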
Conference Paper
"Cloud" computing - a relatively recent term, builds on decades of research in virtualization, distributed computing, utility computing, and more recently networking, web and software services. It implies a service oriented architecture, reduced information technology overhead for the end-user, great flexibility, reduced total cost of ownership, on-demand services and many other things. This paper discusses the concept of "cloud" computing, issues it tries to address, related research topics, and a "cloud" implementation available today.