Josep Maria Mateo-Sanz

Universitat Rovira i Virgili, Tarragona, Catalonia, Spain

Publications (35) · 33.92 Total impact

  • ABSTRACT: Web search engines store and analyse the queries made by their users in order to build user profiles and offer personalized search results (i.e., results ranked according to each user’s preferences). Even though this service benefits users, their profiles may contain sensitive information that can threaten their privacy. The literature in this field stresses the existence of a trade-off between the level of privacy achieved and the quality of service received. Current proposals try to optimize this trade-off, yet we argue that there is still room for improvement. In this paper, we present a new approach which improves the trade-off by grouping users who share similar profiles. Moreover, our new proposal is based on a hybrid P2P architecture that outperforms current schemes in terms of scalability. The proposed system has been tested in terms of performance and privacy achieved, using real data from the AOL dataset. Simulation results show that: (i) the system protects users’ privacy when they behave honestly, and penalizes selfish users; (ii) it supports a large number of users; (iii) the runtime required to group users is affordable; and (iv) groups are dynamic and their topology is unpredictable, which we argue is positive from the privacy point of view.
    Computer Communications 09/2014; In press. · 1.35 Impact Factor
  • ABSTRACT: This work analyses the feasibility of producing biodiesel by transesterification of vegetable oil extracted from Cynara cardunculus. The performance of a plant with a biodiesel production capacity of 5000 t/year is assessed through rigorous simulation in AspenHysys®. A modular automated evaluation tool programmed in Matlab® retrieves the inventory of energy and material inputs/outputs and the environmental releases from the simulation results. The simulation performs a combined acid-catalyzed pretreatment and alkali-catalyzed transesterification based on the cardoon oil characterization and on kinetic data extracted from published experimental studies. The performance of the plant is optimized considering four potential alternatives. The Eco-indicator 99 methodology is used for the impact assessment, yielding results similar to those of other vegetable sources of biodiesel. With respect to the economic study, a profitability analysis yields better results than those reported in previous works for oils from other agricultural crops, and identifies two critical factors: the biodiesel sale price and the plant capacity.
    Fuel 09/2013; 111:535–542. · 3.41 Impact Factor
  • ABSTRACT: Microalgae-based biodiesel has several benefits over other resources, such as lower land use, potential cultivation in non-fertile locations, faster growth and, especially, a high lipid-to-biodiesel yield. Nevertheless, the environmental and economic behavior of large-scale production depends on several variables that must be addressed in the scale-up procedure. To this end, rigorous modeling and multicriteria evaluation are performed in order to achieve an optimal topology for third-generation biodiesel production. Different scenarios and the most promising technologies tested at pilot scale are assessed. In addition, the sensitivity analysis allows the detection of key operating variables and assumptions that have a direct effect on the lipid content. Deviations in these variables may lead to an erroneous estimation of the scale-up performance of the technology reviewed in the microalgae-based biodiesel process. The modeling and evaluation of different scenarios for harvesting, oil extraction and transesterification help to identify greener and cheaper alternatives.
    Bioresource Technology 08/2013; 147C:7-16. · 5.04 Impact Factor
  • Agusti Solanas, Antoni Martínez-Ballesté, Josep Maria Mateo-Sanz
    ABSTRACT: In this paper, we present the concept of double-phase microaggregation as an improvement of classical microaggregation for the protection of privacy in distributed scenarios without fully trusted parties. We apply this new concept in the context of mobile health and we show that a distributed architecture consisting of patients and several intermediate entities can apply it to protect the privacy of patients, whose data are released to third parties for secondary use. After recalling some fundamental concepts of statistical disclosure control and microaggregation, we detail the distributed architecture that allows the private gathering, storage, and sharing of biomedical data. We show that double-phase multivariate microaggregation properly fits the needs for privacy preservation of biomedical data in the distributed context of mobile health. Moreover, we show that double-phase microaggregation performs similarly to classical microaggregation in terms of information loss, disclosure risk, and correlation preservation, while avoiding the limitations of a centralized approach.
    IEEE Transactions on Information Forensics and Security 06/2013; 8(6):901-910. · 2.07 Impact Factor
  • Journal of Cleaner Production 04/2013; 44:56-68. · 3.59 Impact Factor
  • ABSTRACT: Microalgae oil has been identified as a reliable resource for biodiesel production due to its high lipid productivity and potential cultivation in non-fertile locations. However, large-scale production of microalgae-based biodiesel depends on the optimization of the entire process to be economically feasible. The selected strain, medium, harvesting methods, etc., strongly affect the ash content of the dry biomass, which has a direct effect on the lipid content. Moreover, the lipids suitable for biodiesel production (some of the neutral/saponifiable ones) are only a fraction of the total (around 30% of the dry-basis biomass in the best case). The present work uses computational tools to model different scenarios for harvesting, oil extraction and transesterification. This rigorous modeling approach detects process bottlenecks that could have led to an overestimation of the potential of microalgae lipids as a resource for biodiesel production.
    Bioresource Technology 03/2013; 136C:617-625. · 5.04 Impact Factor
  • ABSTRACT: Web search engines profile their users by storing and analyzing their past searches. Profiles reflect the interests of the users and enable web search engines to offer a better service. In this way, search results are personalized to fulfill the expectations of each individual user. Nevertheless, this service is not provided without cost. User profiles contain information that can be considered private and personal. This represents a serious privacy threat which must be addressed. Several privacy-preserving techniques which try to prevent this situation can be found in the literature. In this paper, we focus on those that work directly on the users' computers without requiring any external entity. More specifically, we propose a new single-party scheme that addresses the trade-off between privacy and quality of service without requiring any change at the server side. The performance of this new method has been evaluated using real search queries extracted from the AOL files. The results show that our proposal works as expected and can be considered a suitable option for users who are concerned about their privacy.
    Tenth Annual Conference on Privacy, Security and Trust (PST’12); 01/2012
  • C M Torres, M Gadalla, J M Mateo-Sanz, L Jiménez
    Computer Aided Chemical Engineering 01/2012; 30:162-166.
  • ABSTRACT: As environmental awareness increases and regulations become more restrictive, chemical industries are forced to adopt measures to minimize environmental impact and to include these techniques in process design. This work proposes a methodology in which environmental and human health considerations are coupled with the process design of new and existing plants. With this aim, a new assessment tool for the environmental evaluation of chemical processes is presented. It includes the development of a new environmental indicator (Material Balance Environmental Index, MBEI) based on the toxicities of the chemicals involved and the material flows between the process and the environment. Moreover, a total index is computed following three levels of aggregation using the geometric mean of the ratios of several environmental impact categories. The environmental evaluation tool is tested using two case studies (formaldehyde and styrene production), where data are obtained from rigorous process simulation validated with industrial data.
    Industrial & Engineering Chemistry Research 10/2011; 50(23). · 2.24 Impact Factor
  • ABSTRACT: Highlights ► We propose a protocol that protects the privacy of the users of web search engines. ► The proposed scheme addresses the presence of users who do not follow the protocol. ► We have designed a new measure to evaluate the privacy of the users. ► We have used real queries from real users to simulate the behavior of our scheme.
    Journal of Systems and Software 10/2011; 84:1734-1745. · 1.25 Impact Factor
  • ABSTRACT: Protecting personal data is essential to guarantee the rule of law. Due to the new Information and Communication Technologies (ICTs), unprecedented amounts of personal data can be stored and analysed. Thus, if the proper measures are not taken, individual privacy could be in jeopardy. With the aim of protecting individual privacy, a great variety of statistical disclosure control (SDC) techniques have been proposed. Amongst many others, k-anonymity is a promising property that, if properly achieved, can help protect individual privacy. In this paper, we propose a new post-processing method that can be applied after a k-anonymity algorithm, with the aim of lessening the errors resulting from the aggregation of data. We show that our method can be extended to work with many other SDC techniques and we provide some experimental results which emphasise the usefulness of our proposal.
    Proceedings of the Third International Conference on Availability, Reliability and Security, ARES 2008, March 4-7, 2008, Technical University of Catalonia, Barcelona, Spain; 01/2008
  • ABSTRACT: Microaggregation is a technique used to protect privacy in databases and location-based services. We propose a new hybrid technique for multivariate microaggregation. Our technique combines a heuristic yielding fixed-size groups and a genetic algorithm yielding variable-sized groups. Fixed-size heuristics are fast and able to deal with large data sets, but they are sometimes far from optimal in terms of the information loss inflicted. On the other hand, the genetic algorithm obtains very good results (i.e. optimal or near optimal), but it can only cope with very small data sets. Our technique leverages the advantages of both types of heuristics and avoids their shortcomings. First, it partitions the data set into a number of groups by using a fixed-size heuristic. Then, it optimizes the partitions by means of the genetic algorithm. As an outcome of this mixture of heuristics, we obtain a technique that improves the results of the fixed-size heuristic in large data sets.
    Proceedings of the 23rd International Conference on Data Engineering Workshops, ICDE 2007, 15-20 April 2007, Istanbul, Turkey; 01/2007
  • ABSTRACT: Microaggregation is a family of methods for statistical disclosure control (SDC) of microdata (records on individuals and/or companies), that is, for masking microdata so that they can be released while preserving the privacy of the underlying individuals. The principle of microaggregation is to aggregate original database records into small groups prior to publication. Each group should contain at least k records to prevent disclosure of individual information, where k is a constant value preset by the data protector. Recently, microaggregation has been shown to be useful to achieve k-anonymity, in addition to being a good masking method. Optimal microaggregation (with minimum within-groups variability loss) can be computed in polynomial time for univariate data. Unfortunately, for multivariate data it is an NP-hard problem. Several heuristic approaches to microaggregation have been proposed in the literature. Heuristics yielding groups with fixed size k tend to be more efficient, whereas data-oriented heuristics yielding variable group size tend to result in lower information loss. This paper presents new data-oriented heuristics which improve on the trade-off between computational complexity and information loss and are thus usable for large datasets.
    The VLDB Journal 10/2006; 15(4):355-369. · 1.70 Impact Factor
  • Josep Domingo-Ferrer, Josep Maria Mateo-Sanz, Francesc Sebé
    ABSTRACT: The goal of privacy protection in statistical databases is to balance the social right to know and the individual right to privacy. When microdata (i.e. data on individual respondents) are released, they should stay analytically useful but should be protected so that it cannot be decided whether a published record matches a specific individual. However, there is some uncertainty in the assessment of data utility, since the specific data uses of the released data cannot always be anticipated by the data protector. Also, there is uncertainty in assessing disclosure risk, because the data protector cannot foresee what will be the information context of potential intruders. Generating synthetic microdata is an alternative to the usual approach based on distorting the original data. The main advantage is that no original data are released, so no disclosure can happen. However, subdomains (i.e. subsets of records) of synthetic datasets do not resemble the corresponding subdomains of the original dataset. Hybrid microdata mixing original and synthetic microdata overcome this lack of analytical validity. We present a fast method for generating numerical hybrid microdata in a way that preserves attribute means, variances and covariances, as well as (to some extent) record similarity and subdomain analyses. We also overcome the uncertainty in assessing data utility by using newly defined probabilistic information loss measures.
    07/2006: pages 287-298;
  • ABSTRACT: Several techniques exist nowadays for continuous (i.e. numerical) data analysis and modeling. However, although part of the information gathered by companies, statistical offices and other institutions is numerical, a large part of it is represented using categorical variables in ordinal or nominal scales. Techniques for model building on categorical data are required to take advantage of such a wealth of information. In this paper, current approaches to regression for ordinal data are reviewed and a new proposal is described which has the advantage of not assuming any latent continuous variable underlying the dependent ordinal variable. Estimation in the new approach can be implemented using genetic algorithms. An artificial example is presented to illustrate the feasibility of the proposal.
    Information Sciences 02/2006; · 3.89 Impact Factor
  • ABSTRACT: Statistical disclosure control is a data protection discipline whose goal is to prevent the identity of respondents from being revealed when third parties have access to large statistical files. In this paper, we present a set of methods for k-anonymity in statistical databases using microaggregation. The methods are analysed and compared, and a computational analysis of each one is carried out.
    01/2006;
  • Vicenç Torra, Josep Domingo-Ferrer, Josep Maria Mateo-Sanz, Michael Ng
    Inf. Sci. 01/2006; 176:465-474.
  • Conference Paper: A 2
    Proceedings of the First International Conference on Availability, Reliability and Security, ARES 2006, The International Dependability Conference - Bridging Theory and Practice, April 20-22 2006, Vienna University of Technology, Austria; 01/2006
  • Josep M. Mateo-Sanz, Josep Domingo-Ferrer, Francesc Sebé
    ABSTRACT: Inference control for protecting the privacy of microdata (individual data) should try to optimize the tradeoff between data utility (low information loss) and protection against disclosure (low disclosure risk). Whereas risk measures are bounded between 0 and 1, information loss measures proposed in the literature for continuous data are unbounded, which makes it awkward to trade off information loss for disclosure risk. We propose in this paper to use probabilities to define bounded information loss measures for continuous microdata.
    Data Mining and Knowledge Discovery 08/2005; 11(2):181-193. · 1.74 Impact Factor
  • Josep Domingo-Ferrer, Josep Maria Mateo-Sanz
    ABSTRACT: Statistical data protection can be viewed as a heir of the work that was started on statistical database protection in the 70s and 80s. Massive production of computerized statistics by government agencies combined with an increasing social importance of individual privacy has led to a renewed interest in this topic. This paper summarizes recent activity in statistical data protection, then outlines the current areas of work and finally lists some issues which are likely to deserve the attention of researchers and practitioners in the near future.
    10/2004;