Publications (7)

  • Jun Li · Wei Wang · Yongheng Zhao
    Article · Jan 2013
  • Yan-Rong Zhao · Wei-Ping Wang · Dan Meng · [...] · Jun Li
    ABSTRACT: This paper proposes a join query processing algorithm, CoLocationHashMapJoin (CHMJ). First, the study designs a multi-copy consistency hash algorithm. The algorithm distributes the data of tables over the cluster according to the hash values of the join property, which improves data locality while ensuring data availability. Second, based on the multi-copy consistency hash algorithm, the study proposes a parallel join query processing algorithm called HashMapJoin. HashMapJoin improves the efficiency of join queries significantly. CHMJ has been used in Tencent's data warehouse system, and plays an important role in Tencent's daily analysis tasks. The results show that CHMJ improves the efficiency of join query processing by five times compared to Hive.
    Article · Aug 2012 · Journal of Software
  • ABSTRACT: Data-intensive applications are increasingly designed to execute on large computing clusters. Our previous observations of Tencent production systems indicate that join queries are among the most important queries in large-scale data processing. When a join query runs on the Hive system, its job is divided into a map phase and a reduce phase and requires transferring large amounts of intermediate results over the network, which is inefficient. In this paper, we propose an algorithm called CHMJ; its general idea is to take advantage of data locality to accelerate computation. It includes four parts: a data distribution strategy, the parallel HashMapJoin algorithm, CoLocation scheduling, and a delay scheduling strategy. CHMJ has been adopted in the Tencent data warehouse, and plays an important role in Tencent's daily operations. Our experiments demonstrate the feasibility and efficiency of our solution.
    Conference Paper · Jul 2012
  • Yanrong Zhao · Ruan Yuan · Weiping Wang · [...] · Jun Li
    ABSTRACT: There is growing interest in designing high-speed network devices that perform packet processing at the stream layer. However, TCP processing for 10G backbone traffic must not only address performance problems but also cope with abnormal conditions. Some characteristics of real traffic, especially the lack of a finish tag for many streams and the complexity of packet reordering, result in memory exhaustion in a hardware-based TCP subsystem, which is less flexible for exception processing. In this paper, we present a hardware design for backbone traffic that is capable of processing 10G with TCP reassembly and tracking the states of millions of parallel TCP streams. The solution has several features: (1) an effective stream replacement algorithm for a massive stream table that is easy to implement in hardware; (2) fast one-round access to the global stream table, which enables 10 MPPS processing; (3) an active release policy for managing out-of-order data buffers; (4) a linkless data structure design that ensures a time limit for worst-case processing. The simulation results show that the system can process over 99% of the 10G backbone traffic using reasonable storage resources. An FPGA-based prototype is also implemented for evaluation.
    Conference Paper · Jun 2012
  • Yanrong Zhao · Weiping Wang · Dan Meng · [...] · Jun Li
    ABSTRACT: As organizations start to use data-intensive cluster computing systems like Hadoop MapReduce to handle large-scale data, job scheduling becomes very important for achieving efficiency. In the default implementation of Hadoop MapReduce, jobs are scheduled in FIFO order. This easily causes starvation of small jobs when resources are occupied by large jobs, while the Fair Scheduler is inefficient when handling large jobs and leads to the sticky slots problem. In this paper, we propose a new job scheduling algorithm, TDWS. The scheduling algorithm takes into account the characteristics of different applications to meet their different needs. In addition, it is highly robust to heterogeneity and can easily achieve optimal data locality. The experiments demonstrate the feasibility and efficiency of our solution.
    Conference Paper · Jun 2012
  • Jun Li · Wei Wang · Yongheng Zhao
    ABSTRACT: We present both timing and spectral analysis of the outburst of 4U 0115+63 in April -- May 2008 with INTEGRAL and RXTE observations. We have determined the spin period of the neutron star at $\sim 3.61430 \pm 0.00003$ s, and a spin-up rate during the outburst of $\dot{P}=(-7.24 \pm 0.03)\times10^{-6} {\rm s\,d^{-1}}$, the angle of periapsis $\omega=48.67^\circ \pm 0.04^\circ$ in 2008 and its variation (apsidal motion) $\dot{\omega} = 0.048^\circ \pm 0.003^\circ {\rm yr}^{-1}$. We also confirm the relation of spin-up torque versus luminosity in this source during the giant outburst. The hard X-ray spectral properties of 4U 0115+63 during the outburst are studied with INTEGRAL and RXTE. Four cyclotron absorption lines are detected using the spectra from combined data of IBIS and JEM-X aboard INTEGRAL in the energy range of 3 -- 100 keV. The 5 -- 50 keV luminosities at an assumed distance of 7 kpc are determined to be in the range of $(1.5-12)\times 10^{37} {\rm erg\,s^{-1}}$ during the outburst. The fundamental absorption line energy varies during the outburst: it is around 15 keV during the rising phase, shifts to $\sim 10$ keV during the peak of the outburst, and returns to $\sim 15$ keV during the decreasing phase. The variations of the photon index are correlated with the changes in the fundamental line energy: the source becomes harder around the peak of the outburst and softer in both the rising and decreasing phases. This correlation and the transition processes during the outburst need further study in both observations and theoretical work. The known relation between the fundamental line energy and X-ray luminosity is confirmed by our results; however, our discoveries suggest that factors other than luminosity play an important role in fundamental line energy variations and spectral transitions.
    Full-text Article · Apr 2012 · Monthly Notices of the Royal Astronomical Society
  • ABSTRACT: ZFPs (Zinc Finger Proteins) play important roles in various cellular functions, including transcriptional activation, transcriptional repression, cell proliferation, and development. C(2)H(2) (Cys-Cys-His-His motif) ZFPs are the most abundant proteins among the founding members of the ZFP superfamily in eukaryotes. In this study, we isolate a novel C(2)H(2) ZNF (Zinc Finger) gene, ZNFD. It contains an ORF (Open Reading Frame) with a length of 990 bp, encoding 329 amino acids. The predicted protein contains a C(2)H(2) zinc finger. RT-PCR analysis in 18 human adult tissues indicated that it was expressed in five human adult tissues. Green fluorescent protein localization analysis showed that human ZNFD was located in the nucleus of HeLa cells. Overexpression of ZNFD in COS7 cells activates the transcriptional activity of AP1(PMA) (activator protein 1, which responds specifically to phorbol ester). Together, the data indicate that ZNFD is probably a new type of C(2)H(2) ZFP and that the ZNFD protein may act as a transcriptional activator in the PKC (protein kinase C) signaling pathway to mediate cellular functions.
    Article · Feb 2010 · Molecular and Cellular Biochemistry