Mohammad Farhan Husain

University of Texas at Dallas, Richardson, TX, United States

Publications (7) · 1.89 Total Impact

  • ABSTRACT: The Semantic Web is an emerging area for augmenting human reasoning. Various technologies are being developed in this arena, many of which have been standardized by the World Wide Web Consortium (W3C). One such standard is the Resource Description Framework (RDF). Semantic web technologies can be utilized to build efficient and scalable systems for cloud computing. With the explosion of semantic web technologies, large RDF graphs are commonplace, which poses significant challenges for their storage and retrieval. Current frameworks do not scale to large RDF graphs and as a result do not address these challenges. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples by exploiting the cloud computing paradigm. We describe a scheme to store RDF data in the Hadoop Distributed File System. More than one Hadoop job (the smallest unit of execution in Hadoop) may be needed to answer a query, because a single triple pattern in a query cannot simultaneously take part in more than one join in a single Hadoop job. To determine the jobs, we present a greedy algorithm that generates a query plan with bounded worst-case cost to answer a SPARQL Protocol and RDF Query Language (SPARQL) query. We use Hadoop's MapReduce framework to answer the queries. Our results show that we can store large RDF graphs in Hadoop clusters built with cheap commodity-class hardware. Furthermore, we show that our framework is scalable and efficient and can handle large amounts of RDF data, unlike traditional approaches.
    IEEE Transactions on Knowledge and Data Engineering, 10/2011; 23:1312-1327. · 1.89 Impact Factor
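    A minimal sketch, assuming a simplified N-Triples input, of the kind of predicate-based partitioning a storage scheme for RDF in HDFS can use: triples are split into one file per predicate so that a SPARQL triple pattern only needs to read the files for its predicate. The file naming and directory layout here are illustrative, not the paper's exact scheme.
    ```python
    import os
    from collections import defaultdict

    def split_by_predicate(ntriples_path, out_dir):
        """Group (subject, predicate, object) triples into per-predicate files."""
        buckets = defaultdict(list)
        with open(ntriples_path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                # Simplified N-Triples parsing: "<subject> <predicate> <object> ."
                subject, predicate, obj = line.rstrip(" .").split(" ", 2)
                buckets[predicate].append((subject, obj))
        os.makedirs(out_dir, exist_ok=True)
        for predicate, pairs in buckets.items():
            # File name derived from the predicate URI; purely illustrative.
            fname = predicate.strip("<>").replace("/", "_").replace(":", "_")
            with open(os.path.join(out_dir, fname), "w") as out:
                for subject, obj in pairs:
                    out.write(f"{subject}\t{obj}\n")
    ```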
  • Mohammad Farhan Husain, Tahseen Al-Khateeb, Mohmmad Alam, Latifur Khan
    ABSTRACT: With advances in digital technology, two or more organizations may form a federation for a temporary purpose, and an employee of one organization may need to access resources of another. While languages like XACML are very good for access control within a single organization, they are not able to address the interoperability issues of such federations; this lack of interoperability among access control policies prevents users from accessing remote resources. In this paper we propose an approach that augments access control policies with a knowledge base (an ontology) to facilitate access to the resources of remote organizations. In addition, we introduce a new rule effect, partial permit, for our application domain (geospatial).
    Computer Standards & Interfaces. 01/2011; 33:214-219.
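    A minimal sketch of the partial-permit idea for geospatial access control: when a requested region only partly overlaps the region a policy permits, the decision is neither a full Permit nor a Deny. The rectangular region model and the three-valued decision below are assumptions made for illustration, not the paper's policy language.
    ```python
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Region:
        xmin: float
        ymin: float
        xmax: float
        ymax: float

        def intersect(self, other):
            xmin, ymin = max(self.xmin, other.xmin), max(self.ymin, other.ymin)
            xmax, ymax = min(self.xmax, other.xmax), min(self.ymax, other.ymax)
            if xmin >= xmax or ymin >= ymax:
                return None
            return Region(xmin, ymin, xmax, ymax)

    def evaluate(requested: Region, permitted: Region) -> str:
        """Return 'Permit', 'PartialPermit', or 'Deny' for the requested region."""
        overlap = requested.intersect(permitted)
        if overlap is None:
            return "Deny"
        if overlap == requested:
            return "Permit"        # the whole request lies inside the permitted area
        return "PartialPermit"     # only part of the request is accessible

    # Example: request partially overlaps the permitted area -> 'PartialPermit'
    print(evaluate(Region(0, 0, 10, 10), Region(5, 5, 20, 20)))
    ```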
  • ABSTRACT: Cloud computing solutions continue to grow increasingly popular in both research and the commercial IT industry. With this popularity come ever-increasing challenges for cloud computing service providers. The Semantic Web is another domain of rapid growth in both research and industry. RDF datasets are becoming increasingly large and complex, and existing solutions do not scale adequately. In this paper, we detail a scalable semantic web framework built using cloud computing technologies. We define solutions for generating and executing optimal query plans. We handle not only queries with Basic Graph Patterns (BGP) but also complex queries with optional blocks, and we have devised a novel algorithm to handle them. Our algorithm minimizes the triple patterns to bind and the joins between them by identifying common blocks with subgraph isomorphism algorithms and building a query plan from that information. We utilize Hadoop's MapReduce framework to process our query plan. We show that our framework is extremely scalable and efficiently answers complex queries.
    IEEE International Conference on Cloud Computing, CLOUD 2011, Washington, DC, USA, 4-9 July, 2011; 01/2011
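    A minimal sketch, under simplifying assumptions, of identifying triple patterns that a required basic graph pattern and an OPTIONAL block share, so a plan can compute the shared part once. Patterns are compared only up to variable renaming here rather than by full subgraph isomorphism, and the tuple representation of triple patterns is an assumption for this example.
    ```python
    def signature(pattern):
        """Replace variables (terms starting with '?') with a placeholder."""
        return tuple("?" if term.startswith("?") else term for term in pattern)

    def common_blocks(bgp, optional):
        """Return patterns appearing in both groups, compared by signature."""
        optional_sigs = {signature(p) for p in optional}
        return [p for p in bgp if signature(p) in optional_sigs]

    bgp = [("?x", "rdf:type", "ub:Student"), ("?x", "ub:memberOf", "?dept")]
    optional = [("?x", "ub:memberOf", "?dept"), ("?x", "ub:email", "?mail")]
    # The shared membership pattern is identified as a common block.
    print(common_blocks(bgp, optional))
    ```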
  • ABSTRACT: The Semantic Web is gaining immense popularity, and with it the Resource Description Framework (RDF), which is broadly used to model Semantic Web content. However, access control on RDF stores for single machines has seldom been discussed in the literature, and one significant obstacle to using such stores is their limited scalability. Cloud computing systems, on the other hand, have proven useful for storing large RDF stores, but to our knowledge these systems lack access control on RDF data. This work proposes a token-based access control system that is being implemented in Hadoop (an open source cloud computing framework). It defines six types of access levels and an enforcement strategy for the resulting access control policies. The enforcement strategy is implemented at three levels: Query Rewriting, Embedded Enforcement, and Post-processing Enforcement. In Embedded Enforcement, policies are enforced during data selection using MapReduce, whereas in Post-processing Enforcement they are enforced during the presentation of data to users. Experiments show that Embedded Enforcement consistently outperforms Post-processing Enforcement due to the reduced number of jobs required.
    Cloud Computing, Second International Conference, CloudCom 2010, November 30 - December 3, 2010, Indianapolis, Indiana, USA, Proceedings; 01/2010
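    A minimal sketch of enforcing access tokens while data is being selected, in the spirit of Embedded Enforcement: triples whose predicate the user holds no token for are dropped before results are produced. The predicate-level token model here is an assumption; the paper's six access levels are richer.
    ```python
    def filter_triples(triples, user_tokens):
        """Yield only triples whose predicate the user holds a token for."""
        allowed_predicates = {tok["predicate"] for tok in user_tokens}
        for s, p, o in triples:
            if p in allowed_predicates:
                yield (s, p, o)

    triples = [
        ("ub:Student1", "ub:takesCourse", "ub:Course7"),
        ("ub:Student1", "ub:ssn", "123-45-6789"),
    ]
    tokens = [{"predicate": "ub:takesCourse"}]
    # The sensitive ssn triple is filtered out during selection.
    print(list(filter_triples(triples, tokens)))
    ```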
  • ABSTRACT: Cloud computing is the newest paradigm in the IT world and hence the focus of new research. Companies hosting cloud computing services face the challenge of handling data-intensive applications. Semantic web technologies can be an ideal candidate to use together with cloud computing tools to provide a solution. These technologies have been standardized by the World Wide Web Consortium (W3C); one such standard is the Resource Description Framework (RDF). With the explosion of semantic web technologies, large RDF graphs are commonplace, and current frameworks do not scale to them. In this paper, we describe a framework that we built using Hadoop, a popular open source framework for cloud computing, to store and retrieve large numbers of RDF triples. We describe a scheme to store RDF data in the Hadoop Distributed File System. We present an algorithm based on a cost model to generate the best possible query plan to answer a SPARQL Protocol and RDF Query Language (SPARQL) query. We use Hadoop's MapReduce framework to answer the queries. Our results show that we can store large RDF graphs in Hadoop clusters built with cheap commodity-class hardware. Furthermore, we show that our framework is scalable and efficient and can easily handle billions of RDF triples, unlike traditional approaches.
    IEEE International Conference on Cloud Computing, CLOUD 2010, Miami, FL, USA, 5-10 July, 2010; 01/2010
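    A minimal sketch of a greedy query planner in the spirit described above: each round, corresponding to one Hadoop job, joins on the variable shared by the most remaining triple patterns and merges them into one intermediate result. The heuristic and the variable-set representation are assumptions for illustration; the paper's planner works from a cost model.
    ```python
    from collections import Counter

    def greedy_plan(patterns):
        """patterns: list of sets of variable names. Returns the join variable chosen per job."""
        patterns = [set(p) for p in patterns]
        plan = []
        while len(patterns) > 1:
            counts = Counter(v for p in patterns for v in p)
            join_var, hits = counts.most_common(1)[0]
            if hits < 2:
                break  # no shared variable left to join on
            joined = [p for p in patterns if join_var in p]
            rest = [p for p in patterns if join_var not in p]
            patterns = rest + [set().union(*joined)]  # one job merges these patterns
            plan.append(join_var)
        return plan

    # Example: three patterns share ?x; the merged result then joins on ?y.
    print(greedy_plan([{"x"}, {"x", "y"}, {"x"}, {"y", "z"}]))  # ['x', 'y']
    ```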
  • ABSTRACT: Handling huge amounts of data scalably has long been a concern, and the same is true for semantic web data; current semantic web frameworks lack this ability. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples. We describe our schema for storing RDF data in the Hadoop Distributed File System, and we present our algorithms for answering a SPARQL query. We make use of Hadoop's MapReduce framework to actually answer the queries. Our results reveal that we can store huge amounts of semantic web data in Hadoop clusters built mostly from cheap commodity-class hardware and still answer queries quickly. We conclude that ours is a scalable framework, able to handle large amounts of RDF data efficiently.
    Cloud Computing, First International Conference, CloudCom 2009, Beijing, China, December 1-4, 2009. Proceedings; 01/2009
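    A minimal sketch of the reduce-side join idea behind answering a two-pattern SPARQL query with MapReduce: the map step keys each triple by its binding for the shared variable, and the reduce step pairs up bindings from the two patterns. Plain Python stands in for Hadoop here, and the tagging scheme is an assumption, not the paper's code.
    ```python
    from collections import defaultdict

    def map_phase(tagged_triples, join_pos):
        """Emit (join_key, (tag, triple)); join_pos says which term of each pattern is the key."""
        for tag, triple in tagged_triples:
            yield triple[join_pos[tag]], (tag, triple)

    def reduce_phase(pairs):
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        for key, values in groups.items():
            left = [t for tag, t in values if tag == "left"]
            right = [t for tag, t in values if tag == "right"]
            for l in left:
                for r in right:
                    yield key, l, r  # one joined result per matching pair

    # ?x takesCourse ?c  JOIN  ?x memberOf ?d  (join on ?x, the subject of both)
    data = [("left", ("Student1", "takesCourse", "Course7")),
            ("right", ("Student1", "memberOf", "Dept3"))]
    print(list(reduce_phase(map_phase(data, {"left": 0, "right": 0}))))
    ```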
  • ABSTRACT: Land cover classification for the evaluation of land cover changes over certain areas or time periods is crucial for geospatial modeling, environmental crisis evaluation, and urban open space planning. Remotely sensed images of various spatial and spectral resolutions make it possible to classify land cover at the pixel level. The semantic meaning of large regions consisting of hundreds of thousands of pixels cannot be revealed by discrete, individual pixel classes, but it can be derived by integrating various groups of pixels using ontologies. This paper combines data of different resolutions for pixel classification by support vector classifiers, and proposes an efficient algorithm to group pixels based on the classes of neighboring pixels. The algorithm is linear in the number of pixels of the target area and is scalable to very large regions. It also re-evaluates imprecise classifications according to neighboring classes for region-level semantic interpretation. Experiments on Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data of more than six million pixels show that the proposed approach achieves up to 99.8% cross-validation accuracy and 89.25% test accuracy for pixel classification, and can effectively and efficiently group pixels to generate high-level semantic concepts.
    IEEE International Conference on Intelligence and Security Informatics, ISI 2007, New Brunswick, New Jersey, USA, May 23-24, 2007, Proceedings; 01/2007
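    A minimal sketch of grouping classified pixels into regions with a flood fill over the per-pixel class map, assuming 4-connectivity: each pixel is visited a constant number of times, so one pass is linear in the number of pixels, consistent with the scalability claim above. The re-evaluation of imprecise classifications is not shown.
    ```python
    from collections import deque

    def group_pixels(class_map):
        """class_map: 2D list of class labels. Returns a 2D list of region ids."""
        rows, cols = len(class_map), len(class_map[0])
        region = [[-1] * cols for _ in range(rows)]
        next_id = 0
        for r in range(rows):
            for c in range(cols):
                if region[r][c] != -1:
                    continue
                # Breadth-first flood fill of one connected same-class region.
                label = class_map[r][c]
                queue = deque([(r, c)])
                region[r][c] = next_id
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and region[ny][nx] == -1
                                and class_map[ny][nx] == label):
                            region[ny][nx] = next_id
                            queue.append((ny, nx))
                next_id += 1
        return region

    print(group_pixels([["water", "water", "urban"],
                        ["soil",  "water", "urban"]]))  # [[0, 0, 1], [2, 0, 1]]
    ```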

Publication Stats

59 Citations
1.89 Total Impact Points

Institutions

  • 2009–2011
    • University of Texas at Dallas
      • Department of Computer Science
      Richardson, TX, United States
  • 2010
    • Mississippi State University
      • Department of Computer Science and Engineering
      Starkville, MS, United States