Peking University
  • Beijing, Beijing, China
Recent publications
Epidemiological studies show that the association between short-term exposure to PM2.5 (particulate matter ≤ 2.5 µm) and its health effects is non-linear. This nonlinearity has been suggested to relate to the heterogeneity of PM2.5, but the underlying biological mechanism remains unclear. Here, a total of 38 PM2.5 filters were collected continuously over three weeks in Beijing during winter, with ambient PM2.5 varying between 10 and 270 µg/m³. Human monocyte-derived macrophages (THP-1) were treated with PM2.5 water-soluble elutes at 10 µg/mL to investigate the short-term exposure effect of PM2.5 from a proinflammatory perspective. The levels of the proinflammatory cytokine tumor necrosis factor (TNF) induced by the PM2.5 elutes at equal concentrations were unequal, demonstrating the heterogeneity of PM2.5 proinflammatory potentials. Of the various chemical and biological components, lipopolysaccharide (LPS) showed a strong positive association with the TNF heterogeneity. However, some outliers were observed in the TNF-LPS association: for PM2.5 from relatively clean air episodes, higher LPS amounts corresponded to relatively low TNF levels. This phenomenon was also observed in promotion tests in which macrophages were treated with PM2.5 elutes dosed with additional trace LPS. Gene expression analysis indicated the involvement of oxidative-stress-related genes in the LPS signaling pathway. Therefore, a potential oxidative-stress-mediated suppression of the proinflammatory effect of PM2.5-borne LPS was proposed to account for the outliers. Overall, the results reveal the differential role of LPS in the heterogeneity of PM2.5 proinflammatory effects from a component-based perspective. Future experimental studies are needed to elucidate the signaling pathway of LPS attached to PM2.5 from different air quality episodes.
As an essential way to enhance farmers’ self-development ability, off-farm employment plays an indispensable role in reducing farmers’ multidimensional poverty in many countries. Drawing on a survey of 1926 farmers in five provinces of the Yellow River Basin in China, this paper examines the multidimensional poverty reduction effect of off-farm employment and the heterogeneous influence of its different dimensions (modes, levels, distances, and frequency). The results show that (1) although absolute poverty in the income dimension has largely been eliminated in the Yellow River Basin, poverty in social resources, transportation facilities, and employment security remains the key bottleneck restricting farmers’ self-development; (2) the province with the strongest multidimensional poverty reduction effect of off-farm employment is Shaanxi, where the largest contribution comes from employment security; and (3) improving off-farm employment level, distance, and time can significantly alleviate farmers’ multidimensional poverty. Therefore, to lessen the multidimensional poverty of farmers in the Yellow River Basin, it is necessary to focus on governing key multidimensionally poverty-stricken areas, such as the middle and upper reaches of the Yellow River, to adopt off-farm employment poverty alleviation strategies suited to local conditions, to address farmers’ deficiencies in social resources, mobility, and employment security, and to deepen the effect of off-farm employment in benefiting farmers and helping the poor.
As the basic units of urban areas, urban functional zones (UFZs) are fundamental to urban planning, management, and renewal. UFZs are mainly determined by human activities, economic behaviors, and geographical factors, but existing mapping methods 1) do not make full use of multimodal geographic data, owing to a lack of semantic modeling and feature fusion of geographic objects, and 2) consist of multiple stages, which accumulates errors across stages and increases mapping complexity. Accordingly, this study designs a multimodal data fusion framework (MDFF) to map fine-grained UFZs end-to-end, effectively integrating very-high-resolution remote sensing images and social sensing data. The MDFF extracts physical attributes from remote sensing images, models the socioeconomic semantics of geographic objects from social sensing data, and then fuses the multimodal information to classify UFZs, with object semantics guiding the fine-grained classification. Experimental results in Beijing and Shanghai, two major cities in China, show that the MDFF greatly improves the quality of UFZ mapping, with accuracy about 5% higher than that of state-of-the-art methods. The proposed method significantly reduces the complexity of UFZ mapping, making urban structure analysis more convenient.
The question of whether environmental regulation fosters technological innovation and green development, as a nuanced extension of the Porter hypothesis, constitutes a focal point in contemporary research. Despite this attention, the literature often lacks a multifaceted evaluation framework for green development and fails to consider the multiple facets of environmental regulation and technological innovation. This study develops a comprehensive model of green total factor productivity (GTFP), situating the Chinese economy within an economy–environment–health nexus. The extended Crépon–Duguet–Mairesse model is employed to revisit the “strong”, “weak”, and “narrow” Porter hypotheses. The analysis reveals that formal environmental regulation exerts a crowding-out effect on research and development (R&D), whereas informal environmental regulation exhibits a facilitating effect, corroborating the narrow version of the Porter hypothesis. Both categories of regulation contribute to substantial innovation. Following the incorporation of R&D factors, heterogeneity in the “weak” Porter hypothesis emerges in the Chinese context, contingent upon the specific types of environmental regulation and technological innovation. Environmental regulation positively influences GTFP, affirming the “strong” Porter hypothesis, primarily through the channel of technical progress change. A developmental trajectory to enhance GTFP is thus articulated: judicious environmental regulation stimulates R&D, which in turn fosters innovation quality, subsequently affecting the technical progress change index and ultimately GTFP. Correspondingly, policy recommendations are delineated across three dimensions: judicious environmental regulation, targeted innovation support, and regional coordination.
Widespread degradation of natural ecosystems around the globe has resulted in numerous ecological problems. Ecological restoration is considered a global priority as an important means of mitigating ecosystem degradation and enhancing the provision of ecosystem services. Determining the ecosystem reference state is a prerequisite for ecological restoration. However, few studies have focused on how to define the reference state for ecological restoration, especially under a changing climate. Taking Guizhou Province, a typical karst region in China, as the case study area, this study first assessed ecosystem services under homogeneous climate conditions. Second, we defined the optimal ecosystem services as the ecosystem reference state and then evaluated restoration suitability under a comprehensive framework. Finally, ecological restoration priority areas (ERPAs), which include ecological reconstruction areas, assisted regeneration areas, and conservation priority areas needing restoration, were identified by integrating restoration suitability with conservation priority areas. The results showed that the water conservation and habitat maintenance services increased by less than 10% from 2001 to 2018. The identified ecological reconstruction areas and assisted regeneration areas covered 1078 km² and 1159 km², respectively. Additionally, 15 conservation priority areas with a total area of 18,507 km² were identified as needing restoration. Accounting for 11.78% of the total area, ERPAs were mostly located in the eastern part of Guizhou, including Qiandongnan, Tongren, and Zunyi. The approach proposed here for determining the ecosystem reference state after controlling for climate variables, together with the framework for identifying ERPAs, can provide a scientific reference for large-scale ecological restoration planning.
Path tracking of robotic fish is a research hotspot owing to their high maneuverability and environmental friendliness. However, the periodic oscillation generated by the bionic fish-like propulsion mode may lead to unstable control. To this end, this paper proposes a novel framework involving a newly designed platform and a multi-agent reinforcement learning (MARL) method. First, a bionic robotic fish equipped with a reaction wheel is developed to enhance stability. Second, a MARL-based control framework is proposed for the cooperative control of the tail-beating and the reaction wheel. Correspondingly, a hierarchical training method comprising initial training and iterative training is designed to handle the control coupling and frequency difference between the two agents. Finally, extensive simulations and experiments indicate that the developed robotic fish and the proposed MARL-based control framework effectively improve the accuracy and stability of path tracking. Remarkably, head-shaking is reduced by about 40%. This work provides a promising reference for the stability optimization and cooperative control of bionic swimming robots featuring oscillatory motions.
Graph classification is a fundamental problem with diverse applications in bioinformatics and chemistry. Owing to the intricate procedures of manual annotation in graphical domains, abundant noisy graph labels may exist in practice, resulting in poor performance of existing supervised methods. It is therefore necessary and urgent to study graph classification with label noise. This problem is challenging, however, because of overfitting to noisy data and the complicated relational structures of graphs. To handle this problem, we present a simple but effective approach called cOupled Mix for Graph Contrast (OMG), which combines coupled Mixup with graph contrastive learning in the feature space. On the one hand, to improve model generalization, we take convex combinations of sample pairs in the feature space to construct positive pairs. On the other hand, to achieve effective optimization, we provide challenging negatives through multi-sample Mixup with different emphases. To further reduce the impact of noisy data, we develop a neighbour-aware noise removal strategy, which promotes smoothness in the neighbourhood of samples following the principle of curriculum learning. Extensive experiments on a range of benchmark datasets demonstrate the superiority of the proposed OMG.
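For readers unfamiliar with feature-space Mixup for positive-pair construction, the snippet below gives a minimal, hedged sketch of the general idea (convex combinations of embeddings fed into an InfoNCE-style contrastive loss). It is illustrative only, not the authors' OMG implementation, and the function names are hypothetical.

```python
# Minimal sketch of feature-space Mixup for contrastive positive pairs
# (illustrative only; not the authors' OMG code). Assumes PyTorch.
import torch
import torch.nn.functional as F

def mixup_positive(z1, z2, alpha=1.0):
    """Convex combination of two embedding batches to build positive views."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * z1 + (1.0 - lam) * z2

def info_nce(anchor, positive, temperature=0.5):
    """Standard InfoNCE loss; other in-batch samples serve as negatives."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.t() / temperature  # (B, B) similarity matrix
    labels = torch.arange(anchor.size(0))         # diagonal entries are the positives
    return F.cross_entropy(logits, labels)

# Usage: z_a and z_b would be graph-level embeddings of two views from an encoder.
z_a, z_b = torch.randn(32, 128), torch.randn(32, 128)
loss = info_nce(z_a, mixup_positive(z_a, z_b))
```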
Discovering causal relationships among observed variables is an emerging research focus in data mining. Methods based on the additive noise model have proven effective for identifying cause-effect pairs. However, when determining many-to-one causality, additive noise models often fail to identify the causal direction because of complex interrelationships and interactions, even when each causal relation is generated according to the additive noise model, making them unreliable in practical applications. In this work, to identify the causal direction, we propose a Hierarchical Additive Noise Model (HANM) that converts many-to-one causality into approximate one-to-one causality by generalizing multiple factors into an intermediate variable with a variational approach, and we use the asymmetry between the forward and backward models of HANM to identify the causal direction. Experiments on synthetic data show that many-to-one causality can be effectively identified through this asymmetry and that the accuracy of HANM is higher than that of the best existing models. Applying the model to real-world data shows that HANM can greatly broaden the application scope of functional causal models for causal discovery.
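As background, the snippet below sketches the standard pairwise additive-noise-model direction test that HANM generalizes: fit a nonparametric regression in each direction and prefer the direction whose residuals look more independent of the input. It uses a simple correlation score as a crude stand-in for a proper independence test (e.g., HSIC) and is not the authors' code.

```python
# Pairwise additive-noise-model (ANM) direction test, the building block that
# HANM extends to the many-to-one setting; illustrative sketch only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from scipy.stats import spearmanr

def anm_score(x, y):
    """Regress y on x; low dependence between x and the residuals supports
    the causal direction x -> y (correlation is used here as a crude
    stand-in for an independence test such as HSIC)."""
    gp = GaussianProcessRegressor().fit(x.reshape(-1, 1), y)
    residuals = y - gp.predict(x.reshape(-1, 1))
    return abs(spearmanr(x, residuals)[0])

rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = x ** 3 + rng.normal(scale=0.3, size=300)  # ground truth: x causes y
direction = "x -> y" if anm_score(x, y) < anm_score(y, x) else "y -> x"
print(direction)
```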
In this paper, we propose a generic sketch algorithm that achieves higher accuracy in the following five tasks: finding top-k frequent items, finding heavy hitters, per-item frequency estimation, and detecting heavy changes in the time and spatial dimensions. The state-of-the-art (SOTA) sketch solution for multiple measurement tasks is ElasticSketch (ES). However, the accuracy of its frequency estimation has room for improvement: ES suffers from overestimation errors in its light part, which introduces errors when querying both frequent and infrequent items. To address these problems, we propose a generic sketch, OneSketch, designed to minimize overestimation errors. To achieve this goal, we propose four key techniques, which embrace hash collisions and minimize possible errors by handling highly recurrent item replacements well. Experimental results show that OneSketch clearly outperforms 12 SOTA schemes. For example, compared with ES, OneSketch achieves more than 10× lower Average Absolute Error on finding top-k frequent items and heavy hitters, as well as 48.3% and 38.4% higher F1 scores on the two heavy-change tasks under 200 KB of memory, respectively.
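To make the overestimation issue discussed above concrete, the following minimal count-min-style sketch shows why hash collisions can only inflate the estimates that OneSketch is designed to keep small; it is a textbook baseline, not OneSketch or ElasticSketch.

```python
# Minimal count-min sketch illustrating overestimation from hash collisions;
# a textbook baseline, not OneSketch or ElasticSketch.
import hashlib

class CountMinSketch:
    def __init__(self, depth=3, width=1024):
        self.depth, self.width = depth, width
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def insert(self, item):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += 1

    def query(self, item):
        # The minimum over rows limits, but never eliminates, overestimation:
        # counters shared with colliding items can only make estimates too large.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))

cms = CountMinSketch()
for flow in ["a"] * 100 + ["b"] * 3:
    cms.insert(flow)
print(cms.query("a"), cms.query("b"))  # estimates are never below the true counts
```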
This work proposes a statistical modeling approach for artificial neural network (ANN) based compact models (CMs). Retaining part of the network features of the nominal device and further fine-tuning the network parameters (variational neurons) is found to accurately reproduce the static variation. A mapping from process variation to network parameters is derived by combining the proposed variational neuron selection algorithm with the backward propagation of variance (BPV) method. In addition, a secondary classification of the selected variational neurons is applied to model the fabrication-induced correlation between n- and p-type devices. The NN-based statistical modeling approach has been implemented and verified on GAA simulation data and a foundry FinFET at the 16 nm node, indicating its great potential for modeling emerging and advanced device technologies.
The General Data Protection Regulation (GDPR) is a European Union (EU) data protection and privacy law. Under the GDPR, data on a hosting platform must meet semantic consistency and data integrity requirements. Semantic consistency means that data operations comply with the GDPR, while data integrity ensures that the outsourced data remain intact. The two terms are not interchangeable: for example, if a cloud service provider migrates data to foreign storage nodes without the data owner's authorization, the data integrity requirement of the GDPR is met but the semantic consistency requirement is not. Ensuring both data integrity and compliance is the main challenge for a GDPR-compliant data supervision platform. To achieve this aim, we leverage a blockchain-based data management framework to check data compliance, which opens the black box of the data hosting platform and exposes its logic to data owners for inspection. We propose a new provable data possession (PDP) scheme for this framework that checks semantic consistency and data integrity simultaneously. The verifier does not need to hold any audited data, which reduces bandwidth usage, and the verification result can serve as proof for subsequent data recovery and accountability. Experimental results demonstrate the high efficiency of the proposed PDP scheme.
Static timing analysis (STA) is an essential yet time-consuming task in the circuit design flow to ensure the correctness and performance of the design. Thanks to the advancement of general-purpose computing on graphics processing units (GPUs), new possibilities and challenges have arisen for boosting the performance of STA. In this work, we present an efficient and holistic GPU-accelerated STA engine. We accelerate major STA tasks, including levelization, delay computation, graph propagation, and multi-corner analysis, by developing high-performance GPU kernels and data structures. By dividing the STA workloads into CPU-GPU concurrent tasks with managed dependencies, our acceleration framework supports versatile incremental updates. Furthermore, we extend our approach to multi-corner analysis by exploiting a large amount of corner-level data parallelism through GPU computing. Our implementation, based on the open-source STA engine OpenTimer, achieves up to 4.07× speed-up on single-corner analysis and up to 25.67× speed-up on multi-corner analysis on the TAU 2015 contest designs and a 14 nm technology.
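As a rough illustration of the levelization step mentioned above (grouping nodes of the timing graph so that all nodes in a level can be processed in parallel), here is a small CPU-side sketch using a Kahn-style topological traversal; it is not OpenTimer's or the authors' implementation.

```python
# Kahn-style levelization of a timing graph: nodes in the same level have no
# dependencies on each other and can be processed in parallel (e.g., one GPU
# kernel launch per level). Illustrative sketch, not the engine's code.
from collections import defaultdict, deque

def levelize(nodes, edges):
    """nodes: iterable of node names; edges: (src, dst) pairs of a DAG."""
    indegree = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        indegree[v] += 1
    level_of = {n: 0 for n in nodes}
    levels = defaultdict(list)
    queue = deque(n for n in nodes if indegree[n] == 0)
    while queue:
        u = queue.popleft()
        levels[level_of[u]].append(u)
        for v in succ[u]:
            level_of[v] = max(level_of[v], level_of[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return dict(levels)

print(levelize(["in", "g1", "g2", "out"],
               [("in", "g1"), ("in", "g2"), ("g1", "out"), ("g2", "out")]))
# {0: ['in'], 1: ['g1', 'g2'], 2: ['out']}
```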
The design automation community has been actively exploring machine learning for VLSI CAD. Many studies have explored learning-based techniques for cross-stage prediction tasks in the design flow. Although building machine learning models usually requires a large amount of data, most studies can only generate small internal datasets for validation because of the lack of large public datasets. This situation challenges research in the field and raises potential issues such as difficulty in benchmarking and reproducing results, a research scope limited to small internal datasets, and a high bar for new researchers. Therefore, in this paper, we present an open-source dataset called “CircuitNet” for machine learning tasks in VLSI CAD. The dataset consists of more than 10K samples extracted from versatile runs of commercial design tools based on six open-source RISC-V designs, supports typical cross-stage prediction tasks such as routability and IR drop prediction, and includes extensive benchmarking of recent models. With the dataset prepared, we identify two practical challenges for machine learning applications in CAD: data imbalance and model transferability. To overcome data imbalance, we propose a loss function, the biased loss, that gives more weight to the minority class, leading to a 2% congestion reduction in routability-driven placement. We test model transferability from the RISC-V designs to the ISPD 2015 contest designs on congestion prediction with several transfer learning methods, and further propose a knowledge-distillation-based transfer learning framework with up to 20% accuracy improvement. We believe this dataset can open up new opportunities for machine learning research in CAD and beyond.
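The biased loss described above weights the minority (congested) cells more heavily than the majority; the sketch below shows one plausible weighted-BCE formulation of that idea, assuming PyTorch. The exact form used in the CircuitNet work may differ.

```python
# A plausible class-weighted ("biased") BCE loss that up-weights minority
# (congested) cells; the exact loss in the CircuitNet paper may differ.
import torch
import torch.nn.functional as F

def biased_bce(pred, target, minority_weight=5.0):
    """pred: raw logits, target: binary congestion map, both (N, 1, H, W)."""
    weight = 1.0 + (minority_weight - 1.0) * target  # congested cells weigh more
    return F.binary_cross_entropy_with_logits(pred, target, weight=weight)

pred = torch.randn(2, 1, 64, 64)                   # model logits
target = (torch.rand(2, 1, 64, 64) > 0.9).float()  # sparse congested cells
print(biased_bce(pred, target).item())
```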
Along with the ever-increasing amount of data generated by industrial devices, cross-domain (i.e., across Autonomous Systems, ASes) data transmission has attracted increasing attention in the Industrial Internet of Things (IIoT). As mature and widely used inter-domain routing protocols, BGP-based solutions often take the number of domains (i.e., AS hops) of each path as the criterion for routing decisions, which is simple and effective. However, such protocols can only meet reachability requirements while ignoring performance requirements: the path with the minimum number of AS hops is selected to carry flows even if its actual performance does not meet the transmission requirements, because intra-domain information on that path is unavailable. Yet directly accessing intra-domain information to make better routing decisions is impractical, given data privacy concerns. In this paper, we propose M-DIT, which makes inter-domain routing decisions with the assistance of desensitized intra-domain information for transmissions with multiple requirements. To do so, we design a homomorphic-encryption-based private number comparison scheme to export intra-domain information securely and thereby assist routing decisions. Experiments based on 5 real topologies (ATMnet, Claranet, Compuserve, NSFnet, and Peer1) with thousands of inter-domain flows demonstrate that M-DIT reduces flow completion time by about 60% or flexibly selects high-bandwidth paths for inter-domain routing in IIoT scenarios.
In-memory computing is an emerging computing paradigm for breaking through the von Neumann bottleneck. SRAM-based in-memory computing (SRAM-IMC) has attracted great interest from industry and academia because SRAM is technologically compatible with widely used MOS devices. Compared with analog SRAM-IMC schemes, the digital SRAM-IMC scheme has advantages in the stability and accuracy of computing results. However, current digital SRAM-IMC architectures can implement only a few logic operations, so designers have to insert special logic modules to facilitate complex computation. To address this issue, this work proposes an area-efficient method for implementing arbitrary Boolean functions in an SRAM array. First, a two-input SRAM LUT is designed to realize arbitrary two-input Boolean functions. Then, logic merging and spatial merging techniques are proposed to reduce the area consumption of the SRAM-IMC scheme. Finally, an SOP-based SRAM-IMC architecture is proposed, into which the merged SOPs are mapped and computed. Evaluation results on the LGsynth’91, IWLS’93, and EPFL benchmarks show that the area of the synthesis results from the ABC tool is on average 3.69, 5.72, and 1.86 times the circuit area of the proposed SRAM-IMC scheme, respectively. Furthermore, the circuit area of the original SOP-based SRAM-IMC scheme is on average 2.07, 1.99, and 1.86 times that of the proposed SRAM-IMC scheme, respectively. The performance evaluation results show that the cycle count of the proposed SRAM-IMC scheme is independent of the scale of the input Boolean functions.
Three-dimensional (3D) understanding or inference has received increasing attention, where 3D convolutional neural networks (3D-CNNs) have demonstrated superior performance compared to two-dimensional CNNs (2D-CNNs), since 3D-CNNs learn features from all three dimensions. However, 3D-CNNs suffer from intensive computation and data movement. In this paper, Sagitta, an energy-efficient low-latency on-chip 3D-CNN accelerator, is proposed for edge devices. Locality and small differential value dropout are leveraged to increase the sparsity of activations. A full-zero-skipping convolutional microarchitecture is proposed to fully utilize the sparsity of weights and activations. A hierarchical load-balancing scheme is also introduced to increase the hardware utilization. Specialized architecture and computation flow are proposed to enhance the effectiveness of the proposed techniques. Fabricated in a 55-nm CMOS technology, Sagitta achieves 3.8 TOPS/W for C3D at a latency of 0.1 s and 4.5 TOPS/W for 3D U-Net at a latency of 0.9 s at 100 MHz and 0.91 V supply voltage. Compared to the state-of-the-art 3D-CNN and 2D-CNN accelerators, Sagitta enhances the energy efficiency by up to 379.6× and 11×, respectively.
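The full-zero-skipping idea above can be pictured in software as issuing a multiply-accumulate only when both operands are non-zero; the toy function below illustrates the principle only (the accelerator realizes it in hardware).

```python
# Software analogue of zero skipping in a multiply-accumulate loop;
# illustrative of the principle only, the accelerator implements it in hardware.
def sparse_dot(weights, activations):
    total = 0
    for w, a in zip(weights, activations):
        if w != 0 and a != 0:  # skip any MAC whose operands contribute nothing
            total += w * a
    return total

print(sparse_dot([0, 2, 0, 1], [3, 0, 5, 4]))  # only the last pair contributes
```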
Activity pattern prediction is a critical part of urban computing, urban planning, and intelligent transportation. Based on a dataset of more than 10 million GPS trajectory records collected by mobile sensors, this research proposes a CNN-BiLSTM-VAE-ATT-based encoder-decoder model for fine-grained individual activity sequence prediction. The model combines long-term and short-term dependencies crosswise and also accounts for the randomness, diversity, and uncertainty of individual activity patterns. The proposed model achieves higher accuracy than ten baselines and can generate highly diverse results while approximating the original distribution of activity patterns. Moreover, the model is interpretable, revealing the importance of time dependencies in activity pattern prediction.
Graph classification, which aims to learn graph-level representations for effective class assignment, has achieved outstanding results, but these results rely heavily on high-quality datasets with balanced class distributions. In fact, most real-world graph data naturally follow a long-tailed form in which the head classes contain many more samples than the tail classes; it is thus essential to study graph-level classification over long-tailed data, yet this setting remains largely unexplored. Most existing long-tailed learning methods in computer vision fail to jointly optimize representation learning and classifier training, and they neglect the mining of hard-to-classify classes. Directly applying these methods to graphs may lead to sub-optimal performance, since models trained on graphs are more sensitive to the long-tailed distribution because of the complex topological characteristics of graphs. Hence, in this paper, we propose a novel long-tailed graph-level classification framework via Collaborative Multi-expert Learning (CoMe) to tackle this problem. To balance the contributions of head and tail classes, we first develop balanced contrastive learning from the perspective of representation learning and then design individual-expert classifier training based on hard class mining. In addition, we perform gated fusion and disentangled knowledge distillation among the multiple experts to promote collaboration in the multi-expert framework. Comprehensive experiments on seven widely used benchmark datasets demonstrate the superiority of CoMe over state-of-the-art baselines.
This paper presents a frequency-domain-based modified impedance tuning analysis (ITA) method and its demonstration in the design of a Class-Φ2 DC-DC converter with load-independent zero-voltage-switching (ZVS) characteristics. By adjusting the 2nd-order harmonic voltage, a near-trapezoidal drain-to-source voltage (VDS) is achieved with beneficially reduced voltage stresses across the main power devices. A complete non-iterative design procedure is established. Compared with the traditional ITA design theory, the modified ITA method leads to low dv/dt and low VDS stress, even at a 50% switching duty cycle. In addition, thanks to the finite input resistance of the rectifier, the output resonant network and rectifier designed following this frequency-domain method feature an extensive load-independent ZVS characteristic. A 5 MHz prototype targeting true short-circuit (SC) to open-circuit (OC) load-independent ZVS is designed using the modified ITA method, assembled, and measured. Under a 16 V input, the measured conversion efficiency remains between 76.8% and 89.5% over the testable load range of 7.4 W to 16.8 W.
33,951 members
Jie Huang
  • School of Life Sciences
Jing Guo
  • School of Public Health
Donghe Zhang
  • School of Earth and Space Sciences
Bei Liu
  • College of Future Technology
Jufen Liu
  • Institute of Reproductive and Child Health, Department of Epidemiology and Biostatistics, School of Public Health
Information
Address
No. 5 Yiheyuan Road, Haidian District, Beijing 100871, China
Head of institution
Hao Ping
Website
http://www.pku.edu.cn/
Phone
010-62751812
Fax
010-62751812