Feng Zhang

  • Sun Yat-sen University

About

Publications: 49
Reads: 6,678
Citations: 652
Current institution: Sun Yat-sen University

Publications

Publications (49)
Article
This paper introduces a novel approach to improving the accuracy of satellite orbit prediction (OP) by combining XGBoost with the extended Kalman filter (EKF). While EKF is a well-established method in orbit determination, it falls short when dealing with OP due to the absence of observation data during the prediction period. To our knowledge, this...
Article
Earth observation (EO) data provenance is vital for facilitating data sharing and cooperative processing. However, existing techniques for managing EO data provenance still have various weaknesses, including decentralization, traceability, transparency, tamper-proofing, and security protection. Despite being a transformative solution in various dom...
Article
Full-text available
The most frequent and noticeable natural calamity in the Karakoram region is landslides. Extreme landslides have occurred frequently along Karakoram Highway, particularly during monsoons, causing a major loss of life and property. Therefore, it is necessary to look for a solution to increase growth and vigilance in order to lessen losses related to...
Article
Full-text available
Algorithms for machine learning have found extensive use in numerous fields and applications. One important aspect of effectively utilizing these algorithms is tuning the hyperparameters to match the specific task at hand. The selection and configuration of hyperparameters directly impact the performance of machine learning models. Achieving optima...
Preprint
Full-text available
The most frequent and noticeable natural calamity in the Karakoram region is landslides. Extreme landslides have occurred frequently along the Karakoram Highway, particularly during the monsoon, causing a major loss of life and property. Therefore, it was necessary to look for a solution to increase growth and vigilance in order to lessen lo...
Preprint
Full-text available
In this study, our primary objective was to analyze the tradeoff between accuracy and complexity in machine learning models, with a specific focus on the impact of reducing complexity and entropy on the production of landslide susceptibility maps. We aimed to investigate how simplifying the model and reducing entropy can affect the capture of compl...
Preprint
Full-text available
Algorithms for machine learning have found extensive use in numerous fields and applications. One important aspect of effectively utilizing these algorithms is tuning the hyperparameters to match the specific task at hand. The selection and configuration of hyperparameters directly impact the performance of machine learning models. Achieving optima...
Article
Full-text available
Landslides are a major natural hazard causing loss of human lives and property. It is therefore important to assess landslide susceptibility. This paper proposes an assessment model for landslide susceptibility based on deep learning to avoid landslide hazards and reduce losses. We combined the multilayer perceptron and the frequency ratio to...
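The frequency ratio mentioned above is a standard susceptibility statistic; a toy illustration of how it is computed for one conditioning-factor class (all numbers below are invented, not from the paper):

```python
def frequency_ratio(landslide_in_class, total_landslides,
                    pixels_in_class, total_pixels):
    """FR > 1 means the class is more landslide-prone than average:
    the class's share of landslides divided by its share of area."""
    return (landslide_in_class / total_landslides) / (pixels_in_class / total_pixels)

# Hypothetical slope class covering 20% of the map but 40% of landslides:
fr = frequency_ratio(landslide_in_class=40, total_landslides=100,
                     pixels_in_class=2000, total_pixels=10000)
print(fr)  # 2.0 — twice as landslide-prone as an average class
```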
Article
With the growing use of multi-core processors in the market, efficient and effective task parallelization strategies are in huge demand, as are task scheduling algorithms. The scalability and efficiency of existing multi-core task scheduling algorithms need to be improved. To schedule real-time tasks on a multi-core processor, any pair o...
Article
Matrix factorization is a powerful method to implement collaborative filtering recommender systems. This paper addresses two major challenges, privacy and efficiency, which matrix factorization is facing. We based our work on DS-ADMM, a distributed matrix factorization algorithm with decent efficiency, to achieve the following two pieces of work: (...
Article
Full-text available
A count-min sketch is a probabilistic data structure, which serves as a frequency table of events to process a stream of big data. It uses hash functions to map events to frequencies. Querying a count-min sketch returns the targeted event along with an estimated frequency, which is not less than the actual frequency. The estimated error, i.e., the...
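The data structure described above can be sketched in a few lines; the width, depth, and salted-hash scheme below are illustrative choices, not taken from the paper:

```python
import hashlib

class CountMinSketch:
    """Minimal count-min sketch: depth hash rows of width counters each.
    Estimates are never below the true count (one-sided error)."""

    def __init__(self, width=1000, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash function per row by salting with the row number.
        h = hashlib.blake2b(f"{row}:{item}".encode()).hexdigest()
        return int(h, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def query(self, item):
        # Taking the minimum over rows limits overestimation from collisions.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))

cms = CountMinSketch()
for word in ["a", "b", "a", "c", "a"]:
    cms.add(word)
print(cms.query("a"))  # >= 3; equal to 3 unless another item collides in every row
```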
Article
A straightforward approach to accelerating matrix factorization of big data is to parallelize it. A commonly used method is to divide the matrix into multiple non-intersecting blocks and calculate them concurrently. This operation causes the load balancing problem, which significantly impacts parallel performance and is a big concern. A general belief is th...
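The non-intersecting-block scheme described above can be illustrated with a stratified-SGD toy in the style of distributed SGD for matrix factorization; the block layout, learning rate, and sequential inner loop below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def strata(n_blocks):
    """Yield strata; each stratum is a diagonal of the block grid, so its
    (row_block, col_block) pairs share no rows or columns and can be
    updated by different workers without conflicts."""
    for shift in range(n_blocks):
        yield [(i, (i + shift) % n_blocks) for i in range(n_blocks)]

def sgd_on_block(R, P, Q, rows, cols, lr=0.01, reg=0.05):
    """One SGD pass over the observed entries of a single block."""
    for i in rows:
        for j in cols:
            if R[i, j] > 0:                 # treat zeros as unobserved
                pi = P[i].copy()
                err = R[i, j] - pi @ Q[j]
                P[i] += lr * (err * Q[j] - reg * pi)
                Q[j] += lr * (err * pi - reg * Q[j])

rng = np.random.default_rng(0)
n, m, k, nb = 6, 6, 2, 3
R = rng.integers(0, 6, size=(n, m)).astype(float)
P = rng.normal(0, 0.1, (n, k))
Q = rng.normal(0, 0.1, (m, k))
row_blocks = np.array_split(np.arange(n), nb)
col_blocks = np.array_split(np.arange(m), nb)

for epoch in range(50):
    for stratum in strata(nb):
        # Blocks within one stratum are independent: this inner loop is
        # the part a parallel implementation would distribute.
        for bi, bj in stratum:
            sgd_on_block(R, P, Q, row_blocks[bi], col_blocks[bj])
```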
Article
The latest implementation of the fully homomorphic encryption algorithm FHEW, FHEW-V2, takes about 0.12 seconds for a bootstrapping on a single-node computer. This seems much faster than previous implementations. However, a 30-bit homomorphic addition requires 270 bootstrapping operations; adding the time spent on key generation, the total elapsed tim...
Article
Sea surface monitoring plays an important role in obtaining ocean big data, especially for businesses requiring complete coverage (e.g., marine search and rescue). Observation consumes a great deal of energy, which makes the energy supply more precious than in traditional observations. This study considers the...
Article
As an important kind of complex network model, bipartite networks are widely used in many applications, such as access control. The process of finding a set of structural communities in a bipartite network is called role mining, which has been extensively used to automatically generate roles for structural communities. Role minimization, aiming to ge...
Article
Ensuring privacy in recommender systems for smart cities remains a research challenge, and in this paper we study collaborative filtering recommender systems for privacy-aware smart cities. Specifically, we use the rating matrix to establish connections between a privacy-aware smart city and k-coRating, a novel privacy-preserving rating data publis...
Article
Full-text available
Earth observation (EO) big data is playing an increasingly important role in the spatial sciences. To obtain adequate EO data, virtual constellations have been proposed to overcome the limitations of traditional EO facilities by combining existing space and ground segment capabilities. However, the current configuration pattern of virtual constellation is...
Article
The unstructured nature of clinical narratives makes them complex for automatically extracting information. Feature learning is an important precursor to document classification, a sub-discipline of natural language processing (NLP). In NLP, word and document embeddings are an effective approach for generating word and document representations (vec...
Article
Cloud computing offers a cheap and efficient solution for the deployment of web applications, resulting in a large increase in the number of service providers. Users hold multiple identities for using services from different domains. The openness of public clouds requires the authentication system to accept user identities from various domains and to...
Article
Before deploying a recommender system, its performance must be measured and understood, so evaluation is an integral part of the process of designing and implementing recommender systems. In collaborative filtering, there are many metrics for evaluating recommender systems. Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are among the most im...
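For reference, MAE and RMSE have standard definitions that can be stated in a few lines (the ratings below are made-up example data):

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: squaring penalizes large errors more
    heavily than MAE before taking the root."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

ratings    = [4.0, 3.0, 5.0, 2.0]
prediction = [3.5, 3.0, 4.0, 4.0]
print(mae(ratings, prediction))   # 0.875
print(rmse(ratings, prediction))  # ~1.1456 — the single large error (2.0) dominates
```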
Chapter
Data has become a valuable asset. Extensive work has been put into how to make the best use of data. One of the trends is to open and share data, and to integrate multiple data sources for specific usage, such as searching over multiple sources of data. Integrating multiple sources of data incurs the issue of data security, where different sources of...
Conference Paper
Matrix factorization has high computational complexity. It is unrealistic to directly apply such techniques to online recommendation, where users, items, and ratings grow constantly. Therefore, implementing an online version of recommendation based on incremental matrix factorization is a significant task. Though some results have been achieved in thi...
Conference Paper
Full-text available
For datasets in Collaborative Filtering (CF) recommendations, even if the identifier is deleted and some trivial perturbation operations are applied to ratings before they are released, there are research results claiming that the adversary could discriminate the individual's identity with a little bit of information. In this paper, we propose k...
Article
For datasets in Collaborative Filtering (CF) recommendations, even if the identifier is deleted and some trivial perturbation operations are applied to ratings before they are released, there are research results claiming that the adversary could discriminate the individual's identity with a little bit of information. In this paper, we propose $k$-...
Conference Paper
In many applications, data mining has to be done in distributed data scenarios. In such situations, data owners may be concerned with the misuse of data, hence, they do not want their data to be mined, especially when these contain sensitive information. Privacy-preserving Data Mining (PPDM) aims to protect data privacy in the course of data mining...
Conference Paper
Full-text available
Cloud computing enables rapid deployment of services on an on-demand basis in large scale. Public clouds are open to all users with identities from various domains, requiring the support of multiple identity providers and hybrid authentication protocols. The complexity of the authentication scenario brings a great burden to service providers. Servi...
Conference Paper
Cloud computing has been acknowledged as one of the prevailing models for providing IT capacity. The off-premises computing paradigm that comes with cloud computing has raised great concerns about the security of data, especially the integrity and confidentiality of data, as cloud service providers may have complete control over the computing infrast...
Conference Paper
Full-text available
Dynamic Time Warping (DTW) has been widely used for measuring the distance between two time series, but its computational complexity is too high for it to be directly applied to similarity search in large databases. In this paper, we propose a new approach to deal with this problem. It builds the filtering process based on histogram distance, using me...
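The quadratic-cost DTW computation that motivates a cheaper filtering step can be sketched as the classic dynamic program:

```python
def dtw_distance(x, y):
    """Classic O(len(x)*len(y)) dynamic-programming DTW distance.
    This quadratic cost is what makes pre-filtering candidates
    attractive for similarity search in large databases."""
    n, m = len(x), len(y)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0 — warping absorbs the repeated 2
```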
Article
From users' point of view, cloud computing delivers elastic computing services based on their needs. Cloud computing service providers must make sure enough resources are provided to meet users' demand, by provisioning either more than enough or just enough resources. This paper investigates the growth of the...
Article
Internet users consume a large volume of rich media, and this consumption is increasing dramatically with the pervasiveness of networks. BitTorrent has been well accepted as a solution to large-scale content distribution without requiring major infrastructure investment. However, it has not taken into account link optimization across networks, ca...
Conference Paper
Users see cloud computing as delivering elastic computing services based on their needs. Cloud computing service providers must make sure enough resources are provided to meet users' demand, by provisioning either more than enough or just enough resources. This paper investigates the growth of the user set of...
Article
Privacy-preserving data mining aims at discovering beneficial information from a large amount of data without violating the privacy policy. Privacy-preserving association rules mining research has already generated many interesting results. Based on commutative encryptions and the Secure Multi-party Computation (SMC) theory, Kantarcioglu and Clifto...
Conference Paper
k-Nearest Neighbor (k-NN) mining aims to retrieve the k most similar objects to the query objects. It can be incorporated into many data mining algorithms, such as outlier detection, clustering, and k-NN classification. Privacy-preserving distributed k-NN is developed to address the issue while preserving the participants' privacy. Several two-p...
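As a plaintext baseline (privacy-preserving protocols aim to compute this same result without any party revealing its raw points), brute-force k-NN under Euclidean distance looks like:

```python
import heapq
import math

def knn(query, points, k):
    """Brute-force k-nearest-neighbor search under Euclidean distance.
    O(n log k) via a bounded heap; privacy-preserving distributed
    variants compute the same answer over data split across parties."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(query, p))

points = [(0, 0), (1, 1), (5, 5), (2, 2)]
print(knn((0, 0), points, 2))  # [(0, 0), (1, 1)]
```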
Conference Paper
Collaborative filtering technologies are facing two major challenges: scalability and recommendation quality, which are two goals in conflict. Nowadays more studies are focusing on the quality issue but less on the scalability one. We introduce a genetic clustering algorithm to partition the source data, guaranteeing that the intra-similarity is hi...
Conference Paper
Compared with relatively recently reported counterparts, a novel recommender system prototype is implemented. Its efforts focus on the three essential issues, taken as a whole, that recommender systems have to handle: data source, data modeling, and recommendation strategy. It is based on a common data format and introduces a hybrid-rule model with a strategy...
Conference Paper
Collaborative filtering technologies are facing two major challenges: scalability and recommendation quality. Sparsity of source data sets is one major reason for poor recommendation quality. To reduce sparsity, we design a collaborative filtering algorithm that first selects users whose non-null ratings intersect the most as candidates of...
Conference Paper
Web mining can be classified into three domains: Web structure mining, Web content mining, and Web usage mining. There are generally three tasks in Web usage mining: preprocessing, knowledge discovery, and pattern analysis. Though Web usage mining still falls within the application of traditional data mining techniques, in view of changes in applicat...
