Table 1 - uploaded by Wen-Ling Hsu
Source publication
Using heterogeneous data sources collected from one of the largest 3G cellular networks in the US over three months, in this paper we investigate the usage patterns of mobile data users. We observe that data usage across mobile users is highly uneven. Most of the users access data services occasionally, while a small number of heavy users contribu...
Context in source publication
Context 1
... on the functionality of mobile devices, we further categorize them into 6 classes (1st column in Table 1). We show the proportion of heavy users out of all data users with each particular device class in the second column of Table 1. The description of each device class is in the 3rd column. Not surprisingly, COMPUTER related devices (laptop cards and netbooks) and SMARTPHONE users, due to their capability of running many applications, have a much higher chance of becoming heavy users. ...
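The per-class proportion described above (heavy users out of all data users in each device class) can be illustrated with a minimal sketch. The excerpt does not say how a heavy user is defined, so the byte threshold, the sample records, and the `OTHER` class name below are hypothetical assumptions chosen only for illustration.

```python
from collections import defaultdict


def heavy_user_share_by_class(users, heavy_threshold_bytes):
    """Return {device_class: fraction of that class's data users who are heavy}.

    `users` is an iterable of (device_class, usage_bytes) pairs.
    The threshold is an assumption; the source excerpt does not define it.
    """
    totals = defaultdict(int)
    heavy = defaultdict(int)
    for device_class, usage_bytes in users:
        totals[device_class] += 1
        if usage_bytes >= heavy_threshold_bytes:
            heavy[device_class] += 1
    return {cls: heavy[cls] / totals[cls] for cls in totals}


# Hypothetical records: (device class, total bytes over the observation period).
sample_users = [
    ("COMPUTER", 5_000_000_000),
    ("COMPUTER", 20_000_000),
    ("SMARTPHONE", 2_000_000_000),
    ("SMARTPHONE", 1_500_000_000),
    ("SMARTPHONE", 3_000_000),
    ("OTHER", 1_000_000),
]

shares = heavy_user_share_by_class(sample_users, heavy_threshold_bytes=1_000_000_000)
for device_class, share in sorted(shares.items()):
    print(f"{device_class}: {share:.2f}")
```

Each value corresponds to one cell of the second column of a table like Table 1; the actual figures in the publication are not reproduced here.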
Similar publications
Game Theory (GT) has been used with excellent results to model and optimize the operation of a huge number of real-world systems, including in communications and networking. Using a tutorial style, this paper surveys and updates the literature contributions that have applied a diverse set of theoretical games to solve a variety of challenging probl...
States, or even large urban cities, may experience different crash rates in different regions or parts of the city. This can be attributed to differences in terrain, population, weather, and other unobserved characteristics. Hence, it can impact the calibration procedure and consequently the calibration factor when it is used for a very large area....
Knee kinematic data consist of a small sample of high-dimensional vectors recording repeated measurements of the temporal variation of each of the three fundamental angles of knee three-dimensional rotation during a walking cycle. In applications such as knee pathology classification, the notorious problems of high-dimensionality (the curse of dime...
Objective:
Multi-organizational research requires a multi-organizational data quality assessment (DQA) process that combines and compares data across participating organizations. We demonstrate how such a DQA approach complements traditional checks of internal reliability and validity by allowing for assessments of data consistency and the evaluat...
Citations
... The Autoregressive Integrated Moving Average (ARIMA) model, one of the most used time series forecasting techniques, was adopted in [10] for Wi-Fi traffic prediction. However, given their simplicity, classical models like ARIMA fail to generalize and do not provide accurate predictions when the time series is complex, which is typically the case for network traffic embedding activities from a large number of users [11]. ...
... The detection architecture is designed to face the small sample size and the complex change in the current real network. Deep Packet Inspection (DPI) data is recognized as the most effective data for traffic classification, traffic control and anomaly detection by ISPs [6], [7]. The human-in-the-loop anomaly detection architecture in this paper uses DPI data as the training data and application scenario. ...
Monitoring network traffic to identify malicious applications is an active research topic in network security. In the era of mobile big data, smart mobile devices have become an integral part of our daily life, which brings many benefits to the digital society. However, the popularity and relatively lax security make them vulnerable to various cyber threats. Traditional network traffic analysis techniques utilizing pattern matching and regular expression matching algorithms are becoming insufficient for mobile big data. Network traffic anomaly detection is an effective method to replace traditional methods. Meanwhile, network traffic mirroring is the foundation of digital twin for Mobile Network (DTMN). With the ongoing development of 6G and space Internet in the future, mobile networks will become more and more dense and require more rapid response. Network traffic anomaly detection can solve many new challenges brought by future networks and protect the security of DTMN. In this article, we propose a streaming network framework for mobile big data, referred to as SNMDF, which provides massive data traffic collection, processing, analysis and updating functions, to cope with the tremendous amount of data traffic. In particular, by analyzing the specific characteristics of anomaly traffic data from flow and user behavior, our proposed SNMDF demonstrates its capability to offer real data-based advice to address new challenges for future wireless networks from the viewpoints of operators. Tested by real mobile big data, SNMDF has proven its efficiency and reliability. Furthermore, SNMDF is accessed for the digital twin of the space Internet, which validates that it can be generalized to other environments with massive data traffic or big data.
... Undoubtedly, the energy consumption of cellular systems is highly dependent on the mobile users' characteristics. The user activities and data usage patterns in a large cellular network can be modelled statistically using a Markov model with tri-non-negative matrix factorisation [3]. This work has demonstrated that data usage across mobile users is severely uneven for a cellular network. ...
With the rapid proliferation of wireless traffic and the surge of various data-intensive applications, the energy consumption of wireless networks has tremendously increased in the last decade, which not only leads to more CO2 emission, but also results in higher operating expenditure. Consequently, energy efficiency (EE) has been regarded as an essential design criterion for future wireless networks. This paper investigates the problem of EE maximisation for a cooperative heterogeneous network (HetNet) powered by hybrid energy sources via joint base station (BS) switching (BS-Sw) and power allocation using combinatorial optimisation. The cooperation among the BSs is achieved through a coordinated multi-point (CoMP) technique. Next, to overcome the complexity of combinatorial optimisation, Lagrange dual decomposition is applied to solve the power allocation problem and a sub-optimal distance-based BS-Sw scheme is proposed. The main advantage of the distance-based BS-Sw is that the algorithm is tuning-free as it exploits two dynamic thresholds, which can automatically adapt to various user distributions and network deployment scenarios. The optimal binomial and random BS-Sw schemes are also studied to serve as benchmarks. Further, to solve the non-fractional programming component of the EE maximisation problem, a low-complexity and fast converging Dinkelbach’s method is proposed. Extensive simulations under various scenarios reveal that in terms of EE, the proposed joint distance-based BS-Sw and power allocation technique applied to the cooperative and harvesting BSs performs around 15–20% better than the non-cooperative and non-harvesting BSs and can achieve near-optimal performance compared to the optimal binomial method.
... Thus, the improvement of 5G and future 6G networks' energy efficiency (EE) is of high importance [3]. Research has shown that many of the currently deployed BSs are underutilized over long periods of time [4]. The problem would arise even more while considering the dense deployment of pico BSs. ...
The Massive Multiple-Input Multiple-Output (MMIMO) technique together with Heterogeneous Network (Het-Net) deployment enables high throughput of 5G and beyond networks. However, a high number of antennas and a high number of Base Stations (BSs) can result in significant power consumption. Previous studies have shown that the energy efficiency (EE) of such a network can be effectively increased by turning off some BSs depending on User Equipments (UEs) positions. Such mapping is obtained by using Reinforcement Learning. Its results are stored in a so-called Radio Environment Map (REM). However, in a real network, the number of UEs' positions patterns would go to infinity. This paper aims to determine how to match the current set of UEs' positions to the most similar pattern, i.e., providing the same optimal active BSs set, saved in REM. We compare several state-of-the-art distance metrics using a computer simulator: an accurate 3D-Ray-Tracing model of the radio channel and an advanced system-level simulator of MMIMO Het-Net. The results have shown that the so-called Sum of Minimums Distance provides the best matching between REM data and UEs' positions, enabling up to 56% EE improvement over the scenario without EE optimization.
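The matching step described above (finding the stored pattern of UE positions closest to the current one) can be sketched as follows. The abstract does not give the exact definition of the Sum of Minimums Distance, so this sketch assumes one common symmetric formulation: for each point in one set take the distance to its nearest point in the other set, sum these in both directions, and halve the total. The pattern names and coordinates are hypothetical.

```python
import math


def sum_of_minimums_distance(set_a, set_b):
    """Symmetric sum of nearest-neighbour distances between two point sets.

    Assumed formulation, not necessarily the one used in the paper.
    """
    if not set_a or not set_b:
        raise ValueError("both point sets must be non-empty")
    a_to_b = sum(min(math.dist(a, b) for b in set_b) for a in set_a)
    b_to_a = sum(min(math.dist(a, b) for a in set_a) for b in set_b)
    return 0.5 * (a_to_b + b_to_a)


def closest_stored_pattern(current_positions, stored_patterns):
    """Return the key of the stored pattern nearest to the current positions."""
    return min(
        stored_patterns,
        key=lambda name: sum_of_minimums_distance(
            current_positions, stored_patterns[name]
        ),
    )


# Hypothetical UE coordinates (metres) and stored patterns.
current = [(0.0, 0.0), (10.0, 0.0)]
patterns = {
    "pattern_a": [(0.0, 1.0), (10.0, 1.0)],
    "pattern_b": [(50.0, 50.0), (60.0, 60.0)],
}

distance_to_a = sum_of_minimums_distance(current, patterns["pattern_a"])
best = closest_stored_pattern(current, patterns)
print(distance_to_a, best)
```

In a setting like the one described, the matched pattern would then be used to look up the stored set of active BSs; that lookup is outside the scope of this sketch.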
... network varies over time [5,6]. As such, many of the BSs are underutilized within some time periods. ...
... • State ∈ is defined as the set of UEs coordinates rounded to fit the square grid of size . States are recognized using the HD metric defined by (5). is a set of all historical UEs positions stored in REM. ...
Energy Efficiency (EE) is of high importance while considering Massive Multiple-Input Multiple-Output (M-MIMO) networks where base stations (BSs) are equipped with an antenna array composed of up to hundreds of elements. M-MIMO transmission, although highly spectrally efficient, results in high energy consumption growing with the number of antennas. This paper investigates EE improvement through switching on/off underutilized BSs. It is proposed to use the location-aware approach, where data about an optimal active BSs set is stored in a Radio Environment Map (REM). For efficient acquisition, processing and utilization of the REM data, reinforcement learning (RL) algorithms are used. State-of-the-art exploration/exploitation methods including ϵ-greedy, Upper Confidence Bound (UCB), and Gradient Bandit are evaluated. Then analytical action filtering, and an REM-based Exploration Algorithm (REM-EA) are proposed to improve the RL convergence time. Algorithms are evaluated using an advanced, system-level simulator of an M-MIMO Heterogeneous Network (HetNet) utilizing an accurate 3D-ray-tracing radio channel model. The proposed RL-based BSs switching algorithm is proven to provide 70% gains in EE over a state-of-the-art algorithm using an analytical heuristic. Moreover, the proposed action filtering and REM-EA can reduce RL convergence time in relation to the best-performing state-of-the-art exploration method by 60% and 83%, respectively.
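Of the exploration/exploitation methods named above, ϵ-greedy is the simplest to illustrate. The following is a minimal, generic sketch rather than the paper's implementation: action indices stand in for candidate sets of active BSs, the reward stands in for a measured EE value, and all numbers are hypothetical.

```python
import random


def epsilon_greedy_choice(estimated_rewards, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-known one."""
    if not estimated_rewards:
        raise ValueError("at least one action is required")
    if rng.random() < epsilon:
        return rng.randrange(len(estimated_rewards))
    return max(range(len(estimated_rewards)), key=lambda i: estimated_rewards[i])


def update_estimate(estimates, counts, action, reward):
    """Update the running mean reward of the chosen action in place."""
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]


rng = random.Random(0)

# With epsilon = 0 the choice is purely greedy.
greedy_action = epsilon_greedy_choice([0.2, 0.9, 0.5], epsilon=0.0, rng=rng)

# Running-mean update for one action after two hypothetical rewards.
estimates = [0.0, 0.0, 0.0]
counts = [0, 0, 0]
update_estimate(estimates, counts, action=2, reward=4.0)
update_estimate(estimates, counts, action=2, reward=2.0)
print(greedy_action, estimates, counts)
```

A larger `epsilon` explores more candidate actions at the cost of slower exploitation, which is the trade-off that convergence-time improvements such as those described above aim to address.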
... However, Fig. 8 shows that the patterns are not similar between the connected user time series and traffic time series for the Patternless group. The works in [55] and [56] have observed similar behavior, the reason for which can be explained as follows. In Fig. 8, a small number of users with heavy data usage contribute the majority of traffic in APs, so data usage patterns across users are highly uneven. ...
Wireless traffic usage forecasting methods can help to facilitate proactive resource allocation solutions in cloud managed wireless networks. In this paper, we present temporal and spatial analysis of network traffic using real traffic data of an enterprise network comprising 470 access points (APs). We classify and separate APs into different groups according to their traffic usage patterns. We study various statistical properties of traffic data, such as auto-correlations and cross-correlations within and across different groups of APs. Our analysis shows that the group of APs with high traffic utilization have strong seasonality patterns. However, there are also APs with no such seasonal patterns. We also study the relation between number of connected users and traffic generated, and show that more connected users do not always mean more traffic data, and vice versa. We use Holt-Winters, seasonal auto-regressive integrated moving average (SARIMA), long short-term memory (LSTM), gated recurrent unit (GRU) and convolutional neural network (CNN) methods for forecasting traffic usage. Our results show that there is no single universal best method that can forecast traffic usage of every AP in an enterprise wireless network. The combined models such as CNN-LSTM and CNN-GRU are also used for spatio-temporal forecasting of a single AP traffic usage. The results show that considering spatial dependencies of neighboring APs can improve the forecasting performance of a single AP if it has significant spatial correlations.
(Dataset can be downloaded at - https://ieee-dataport.org/open-access/wireless-network-traffic-time-series-enterprise-network.)
Keywords - 5G, CNN, CNN-GRU, CNN-LSTM, Forecasting, GRU, Holt-Winters, LSTM, Neural Network, Real Network Data, SARIMA, Spatio-temporal, Temporal, Time Series Analysis, WLAN.
... Bursty Sampling: The generation of MFRs used in this paper is triggered by users' Internet access behaviors. Based on some previous studies, people usually access Internet resources in a bursty manner [16,38]. When people use their phones intensively, many records are generated in a short time, which causes redundancy in the location data. ...
Understanding citizens' main transportation modes at urban scale is beneficial to a range of applications, such as urban planning, user profiling, transportation management, and precision marketing. Previous methods on mode inference are mostly focused on utilizing GPS data with high spatiotemporal granularity. However, due to high costs of GPS data collection, previous work is typically small in scale. In contrast, the cellular data logging interactions between cellphone users and cell towers cover a much larger population given the ubiquity of cellphones. Nevertheless, utilizing cellular data introduces new challenges given their low spatiotemporal granularity compared to GPS data. In this paper, we design CellTrans, a novel framework to survey users' main transportation modes (public transportation or private car) at urban scale with cellular data. CellTrans extracts various mobility features that are pertinent to users' main transportation modes and presents solutions for different application scenarios including when there are no labeled users in the studied cities. We evaluate CellTrans on two real-world large-scale cellular datasets covering 3 million users, among which 2,589 users are with labels. We assess our method not only quantitatively with labeled users, but also qualitatively with the whole population. The experiments show that CellTrans infers users' main transportation modes with accuracy over 80% (with a performance gain of 20% compared to state-of-the-art), and CellTrans remains effective when applied at urban scale to the whole population.
... So the insights obtained from such network traces are not applicable for monitoring the large-scale mobile network. On the other hand, the traces obtained from network operators provide a bundle of information such as user mobility, service quality, network performance, etc. for an entire cellular network [14,15]. The insights obtained from these traces are very useful for monitoring the network as well as the behavior of the users [16,17]. ...
The information contained within Call Detail Records (CDRs) of mobile networks can be used to study the operational efficacy of cellular networks and behavioural patterns of mobile subscribers. In this study, we extract actionable insights from the CDR data and show that there exists a strong spatiotemporal predictability in real network traffic patterns. This knowledge can be leveraged by the mobile operators for effective network planning such as resource management and optimization. Motivated by this, we perform the spatiotemporal analysis of CDR data publicly available from Telecom Italia. Thus, on the basis of spatiotemporal insights, we propose a framework for mobile traffic classification. Experimental results show that the proposed model based on a machine learning technique is able to accurately model and classify the network traffic patterns. Furthermore, we demonstrate the application of such insights for resource optimisation.
... We can observe that the total traffic consumption of 70% of users is lower than 10,000 KB (around 10 MB). In order to obtain the bandwidth usage intuitively, we refer to work [13] and plot the CDF of the total traffic usage in Figure 2b, where the x-axis indicates the proportion of top users ranked by their traffic usage in descending order, and the y-axis represents the percentage of total traffic data. It indicates that the top 9% of users occupy half of the bandwidth of application services. ...
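The curve described above (share of total traffic against the proportion of top users, ranked by usage in descending order) can be computed with a short sketch. The per-user usage values below are hypothetical and only illustrate the calculation; they are not taken from the cited dataset.

```python
def top_user_traffic_share(usages, top_fraction):
    """Fraction of total traffic generated by the top `top_fraction` of users."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ranked = sorted(usages, reverse=True)  # heaviest users first
    total = sum(ranked)
    if total == 0:
        return 0.0
    # Always count at least one user so small fractions stay meaningful.
    top_count = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_count]) / total


# Hypothetical per-user traffic totals in KB.
sample_usage_kb = [50_000, 20_000, 9_000, 8_000, 5_000,
                   3_000, 2_000, 1_500, 1_000, 500]

share_top_10pct = top_user_traffic_share(sample_usage_kb, 0.10)
share_top_30pct = top_user_traffic_share(sample_usage_kb, 0.30)
print(share_top_10pct, share_top_30pct)
```

Evaluating the function over a range of `top_fraction` values yields the points of a plot like the one described for Figure 2b.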
The proliferation of smart devices prompts the explosive usage of mobile applications, which increases network traffic load. Characterizing the application level traffic patterns from an individual perspective is valuable for operators and content providers to make technical and business strategies. In this paper, we identify several typical traffic patterns and predict per-user traffic demand using an application usage dataset from a cellular network. Our primary contributions are twofold: First, we design a novel three-stage model combining factor analysis and machine learning to extract the traffic patterns of individuals. By detecting the latent temporal structure of their application usage, users in the network are grouped into six typical patterns. Second, we implement a Wavelet-ARMA based model to forecast per-user application level traffic demand. The evaluation on a real-world dataset indicates the model improves the prediction accuracy by 7 to 8 times compared with the benchmark solutions.