5G Qos [PDF]


Smart Concurrent Learning Scheme for 5G Network: QoS-Aware Radio Resource Allocation

Evgeni Bikov

Dmitri Botvich

Moscow Institute of Physics and Technology Moscow, Russia [email protected]

University Paris-Est, LIGM Lab Marne-la-Vallee, France [email protected]

Abstract: The continuous performance race has brought the wireless industry to a ubiquitous adoption of heterogeneous architectures with small cells. Extreme densification offers the largest gain in network capacity but challenges important metrics related to quality of service (QoS) for users with mixed traffic types. To address this problem, traditional radio resource management schemes need to be refocused from boosting total network capacity to addressing the requirements of quality-sensitive applications. In this paper, we propose a novel way to adopt Q-learning for planning resource usage. It is based on a smart power profile construction framework and is tailored for scenarios with multiple traffic types. To handle the emerging convergence challenge, we present a way to enhance the introduced algorithm with a smart model fitting stage. Taking advantage of these concepts, we have managed to both use flexibility and meet the stringent requirements of machine learning algorithms. System level simulations show that it achieves a considerable performance improvement for heterogeneous deployments without compromising the quality of service of the overall network. The performance metrics are tested in realistic LTE-Advanced scenarios, proving the efficiency and practicality of the proposed method.

Index Terms: Long Term Evolution, Inter-Cell Interference (ICI), Resource management, Reinforcement Learning, Resource sharing, Signal to noise ratio, QoS, Resource allocation, Dense 5G networks

I. INTRODUCTION

Small cells are considered to be the most practical way to improve indoor cellular coverage and increase network capacity through the spatial reuse of spectrum. In [1], extreme densification (more active nodes per unit area and Hz) is identified as one of the key methods to improve area spectral efficiency for 5G technologies. Although extreme densification offers the largest increase in network capacity, it can degrade more valuable system metrics such as quality of service (QoS) for voice or video users. The QoS degradation is especially large when the number of users increases or when they simultaneously run delay-critical and bandwidth-hungry applications.

A comprehensive review of recent advances related to the quality of user experience in cellular networks is presented in [2]. As applied to wireless heterogeneous networks, the problem appears to be a highly complex multilevel task. Traditionally, it is studied mostly at the media access level by elaborating scheduling schemes, as in [3], [4]. The problem was also studied from the transport perspective: several studies have been undertaken on backhaul [5] and transport infrastructure optimization [6]. In [7], the authors propose to define a power profile with two parameters, the subband center power factor and the edge-to-center boundary. They then use Q-learning [8] to find the best configuration. A flexible interference management technique was also studied in [9], which separates macro and small cell users in frequency by varying allocation probabilities on different parts of the spectrum. In [5], the authors presented a refunding framework for small cells, where small cell holders receive refunding from the network operator when admitting macrocell users. In [3], the problem is defined as traffic-aware utility-based scheduling, where power constraints are posed as an optimization objective, and an optimal algorithm for the scheduling problem is presented. The above subset of mechanisms is efficient for backhaul-limited scenarios but does not take into account the issues arising at higher levels. In this paper, we propose to consider the QoS problem from a different perspective, at the radio resource management level. A complete overview of the main interference coordination techniques for modern heterogeneous architectures can be found in [10].

While many authors have provided approaches to control resource allocation, these solutions fall short with respect to flexibility in scenarios with multiple traffic types. We propose a novel radio resource management technique for heterogeneous networks with QoS users. We aim to use only local information and restrict any information exchange with the neighboring cells to eliminate incompatibility with proprietary interfaces.

To take full advantage of the machine learning techniques introduced in our earlier paper [11], we propose a novel power profile construction framework tailored for scenarios with multiple traffic types. For the same purpose, we also adapt the utility function and rewarding mechanics of the reinforcement learning algorithm, relying on the Q-learning formulation [8]. To address the arising convergence challenge, we propose to enhance the algorithm with a smart model fitting stage. Taking advantage of these ideas, we were able to use flexibility and meet the strict requirements of the machine learning algorithm for QoS scenarios.

The main contribution of this paper is as follows. First, we propose a novel power profile construction mechanism with


a smart multi-class utility function that provisions QoS for the small cell users. Second, we present a smart QoS-oriented definition of the action-state spaces and of the power profile as a learning object for the reinforcement learning formulation. Third, we present an enhancement that improves convergence for a high number of QoS classes through periodic model fitting. Finally, performance evaluation and extensive modeling are conducted in realistic scenario settings, which prove the proposed approach to be practical.

The rest of this paper is organized as follows. Section II presents the system setup and requirements. Section III describes the proposed algorithm and provides intuition on the key design decisions. In Section IV, we present the simulation environment and performance evaluation results. Finally, we conclude with general comments in Section V.

II. SYSTEM SETUP AND REQUIREMENTS

A. System architecture

In this paper, we consider an LTE/LTE-A network with a heterogeneous deployment of base stations such as macrocells and small cells, as shown in Fig. 1. The latter are usually deployed to improve the macrocell indoor coverage and to boost capacity in highly populated areas. The focus of this paper is on the coexistence of a small cell network with the macro environment and with non-cooperative small cells of different vendors.









In this paper, we consider a multi-tier LTE network composed of a set of small cells coexisting with surrounding macro base stations. Both macro base stations and small cells operate in the same frequency band to increase the spatial frequency reuse. Fig. 1 illustrates a HetNet deployment scenario [12] that is typically encountered in practice. It comprises the following elements:

• Macro base stations and transport infrastructure, which are used to offer the basic cellular coverage. Base stations are commonly pre-configured with radio resource plans obtained at a preliminary radio-planning stage.
• Small cells - short-range base stations, which are mainly deployed in the indoor environment. Uncoordinated deployment is one of the most attractive features of small cells.
• Non-cooperative cells - a separate set of network elements. They are statically pre-configured by the cellular operator or operate under a different set of rules than the small cells.
• Voice and Data users - a set of users having delay-sensitive and bandwidth-sensitive types of traffic, respectively.

Fig. 1. Overall system setup: macro layer and small cell layers (cooperative and non-cooperative)

In this paper, we do not distinguish between the macrocell and small cell users. We assume that the user is always served by the BS with the strongest signal (which is true in most cases in practice). By design, we define two classes of users within the system: Voice users and Data users. Voice users primarily have latency requirements, whereas Data users primarily have throughput requirements.

B. Problem statement

Base station densification offers a large increase in the cellular network capacity but challenges quality of service metrics. Most of the traditional radio resource management approaches fall short with respect to performance and flexibility in scenarios with multiple traffic types. In this paper, we aim to develop and evaluate a machine learning algorithm that manages radio resource usage on the base station and meets the following requirements:

1. The algorithm should be sensitive to the user's QoS for various traffic types (latency-critical and latency-tolerant types);
2. The algorithm must adapt to various deployment schemes and conditions of the macro environment [13];
3. No preparatory configuration should be required, in order to fit self-organizing requirements [14];
4. No communication with foreign agents is allowed, in order to eliminate a possible incompatibility between proprietary interfaces;
5. The algorithm's convergence rate should be comparable to the typical rate of changes in the system;
6. The algorithm should not conflict with the underlying media access level scheduling schemes.

III. ALGORITHM DESCRIPTION

A. Definitions

• eNB, eNodeB, smallcell - LTE base station;
• Learning agent - base station executing the learning algorithm;
• RB, Resource Block - the smallest part of the radio resource available for scheduling. It has a size of 180 kHz in the frequency domain and 0.5 ms in the time domain.

B. General principle

The general principle of the algorithm is as follows. Based on locally available performance statistics [13] (packet loss, block error rate, spectral efficiency, delay), each small cell autonomously takes resource block allocation decisions for high-level QoS access classes. Power profile parameter values are calculated individually for each QoS class and each base station. The resulting power profile is designed to be multilevel with respect to the served QoS classes. In Fig. 3, two parts of the power profile are marked 'Voice' and 'Data' to illustrate the subsequent step of per-user scheduling.

After the resource allocation decision is made at each cell, all the permitted resource blocks are divided between the users at the lower layer by the MAC scheduler. In this paper, we mainly use an opportunistic scheduler [15].

C. Power profile

The whole available frequency bandwidth is divided into a number of subbands. The size of the subband is chosen to match the granularity of channel quality reporting. In this paper, we suppose it to be equal to the size of the resource block. These subbands are denoted b_j, where j = 1, .., N and N is the total number of subbands. The output of the algorithm is a multilevel power profile P_j characterizing the maximum allowed transmission power for each b_j. To increase the efficiency of learning and optimization techniques, we describe the profile in a smart way, with as few variables as possible. We propose to construct a power profile as a weighted sum of sigmoid functions (see Eq. (1) and Eq. (2), with Fig. 2 for reference).

$$\sigma(x, a, c) = \frac{1}{1 + e^{-a(x - c)}} \tag{1}$$

$$P(x) = \sum_{i \in \eta} p_i \, \sigma(x, a_i, c_i) \tag{2}$$
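The construction of the profile as a weighted sum of sigmoids can be illustrated with a short sketch. It is a minimal example rather than the authors' implementation: the function names are hypothetical, a single steepness a is assumed to be shared by all sigmoids, and the parameter values p = [2, −2, 1, −1], a = 10, c = [5, 10, 10, 25] are the example values quoted in this section.

```python
import math


def sigmoid(x, a, c):
    """Eq. (1): logistic function with steepness a, centred at position c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def power_profile(x, p, a, c):
    """Eq. (2): weighted sum of sigmoids at resource block index x.

    Assumption: one steepness value a is shared by all sigmoids.
    """
    return sum(p_i * sigmoid(x, a, c_i) for p_i, c_i in zip(p, c))


# Example parameter values quoted in the text.
P_WEIGHTS = [2, -2, 1, -1]
STEEPNESS = 10
POSITIONS = [5, 10, 10, 25]

if __name__ == "__main__":
    # Evaluate the profile on 50 resource blocks, as in the figures.
    for rb in range(1, 51):
        level = power_profile(rb, P_WEIGHTS, STEEPNESS, POSITIONS)
        print(f"RB {rb:2d}: {level:.3f}")
```

With these values the profile has two plateaus: a higher level between resource blocks 5 and 10 and a lower one between 10 and 25, which is how four sigmoids can encode a two-class profile. The levels are in the relative units of the weights and would still need to be scaled to the maximum transmit power.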

Fig. 3. Complex power profile example for n = 2 traffic types (power profile with two classes; x-axis: resource blocks; y-axis: maximum TX power, mW)

Fig. 2. Sigmoid power profile example (x-axis: resource blocks; y-axis: max power ratio)

The sigmoid function σ(x, a, c) is an S-shaped mathematical function that is mainly used in neural networks to model the neuron activation function. We have chosen this approach due to its flexible nature and the possibility to adopt some fruitful theoretical results of the universal approximation theorem [16]. One of the obvious benefits of the proposed scheme lies in the smoothness of the resulting complex profile. A complex power profile is defined by the parameters p̄ (individual sigmoid weights), a (sigmoid steepness) and c̄ (sigmoid positions). An example power profile for the parameter values p̄ = [2, −2, 1, −1], a = 10, c̄ = [5, 10, 10, 25] is presented in Fig. 3.

D. Radio resource allocation scheme, power profile update rule

To coordinate intercell interference, each small cell autonomously takes frequency allocation decisions on a per-millisecond basis. At time t, each autonomous agent updates one of the power profile parameters. The updated values then set the power constraints for downlink transmission during the following time period t + 1.

Fig. 4. The structure of the proposed algorithm (flowchart steps: initialize power profile model; fit PP model; choose update action from A; perform PP update; perceive L3 reward; update p(a,s))

Following the Q-learning terminology, we propose to choose the update rule and action-state space as follows:


1) State: s ∈ {S} represents the number of satisfied Data and Voice users on the base station, S = (n_d, n_v). To keep the state space compact we only store the multiplicative order of these values modulo 2.

2) Action: a ∈ {A} is one of 27 possible modifications of the power profile parameters (p̄, a, c̄). Each of these parameters (e.g. a) can be left unchanged, increased or decreased:
• a_{t+1} = a_t
• a_{t+1} = (1 + β) a_t
• a_{t+1} = (1 − β) a_t

3) Reward function: R_t (or utility function) equals the aggregate number of users with satisfied requirements, weighted per QoS class:

$$R_t = \sum_{q \in \{Q\}} w_q n_q \tag{3}$$

where n_q is the number of users of class q with satisfied requirements and w_q is the corresponding class weight. User QoS requirements are considered to be satisfied if the user's packet error rate, delay and throughput metrics are in an admissible range defined by the LTE specification [13].

4) Q-value: Q_t(a, s) is the probability of choosing action a when perceiving state s. Each action-state pair is initially assigned a random probability Q_0(a, s) ∈ (0, 1). For a system in state s*, a higher value of Q(a, s*) means a higher probability of choosing action a from the action space {A}.

5) Update rule: the state-action probability Q_t(a, s) is updated at each iteration of the algorithm according to Eq. (4). The update rule is chosen to encourage the system to preferably perform actions leading to higher rewards.

$$Q_{t+1}(a, s) = Q_t(a, s) + \Delta Q, \qquad \Delta Q = \alpha \left( R_{t+1} + \gamma \max_{a \in \{A\}} Q(a, s_{t+1}) - Q_t(a, s) \right) \tag{4}$$
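The definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary-based Q-table, the function names, the class weights and the rule of sampling actions proportionally to Q(a, s) are all hypothetical choices made for the example. The default values α = 0.2 and γ = 0.01 are the learning rate and discount factor listed in Table I.

```python
import itertools
import random

# Per-parameter moves from the Action definition: keep, increase, decrease.
MOVES = ("keep", "up", "down")
# One move for each of (p, a, c) gives 3**3 = 27 joint actions.
ACTIONS = list(itertools.product(MOVES, repeat=3))


def init_q(states, rng):
    """Assign every (action, state) pair a random initial value in [0, 1)."""
    return {(a, s): rng.random() for s in states for a in ACTIONS}


def reward(satisfied, weights):
    """Eq. (3): satisfied users per QoS class, weighted by class weight."""
    return sum(weights[q] * n for q, n in satisfied.items())


def choose_action(q, state, rng):
    """Sample an action with probability proportional to Q(a, s).

    Assumption: the sampling rule is not specified in the text.
    """
    w = [max(q[(a, state)], 1e-9) for a in ACTIONS]
    return rng.choices(ACTIONS, weights=w, k=1)[0]


def q_update(q, state, action, r_next, next_state, alpha=0.2, gamma=0.01):
    """Eq. (4): move Q(a, s) toward r_next + gamma * max over next actions."""
    best_next = max(q[(a, next_state)] for a in ACTIONS)
    q[(action, state)] += alpha * (r_next + gamma * best_next - q[(action, state)])
    return q[(action, state)]


if __name__ == "__main__":
    rng = random.Random(0)
    states = [(nd, nv) for nd in range(2) for nv in range(2)]  # toy state set
    q = init_q(states, rng)
    s = (0, 0)
    a = choose_action(q, s, rng)
    # Hypothetical class weights: voice users count twice as much as data users.
    r = reward({"voice": 3, "data": 2}, {"voice": 2.0, "data": 1.0})
    print(a, r, q_update(q, s, a, r, (1, 1)))
```

Keeping the table as a plain dictionary keyed by (action, state) makes it easy to drop actions later, for example when a parameter is removed from the action space.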

In Eq. (4), α is the learning rate, γ is the discount factor and R_{t+1} is the reward observed after performing a_t in s_t. Note that, due to the smart parameterization, the power profile does not change dramatically between steps t and t + 1. This allows the final solution to be optimized smoothly, without abrupt system changes, and to converge quicker.

E. Power profile model fitting

A significant limitation of the Q-learning algorithm lies in the fact that knowledge gained for a single action-state pair is not utilized to update the neighboring ones. As a result, for a wide action-state space the algorithm may take thousands of iterations to converge. In order to improve the convergence rate, we propose to reduce the action space by adding another procedure to the algorithm: model fitting (see Fig. 4). The idea behind this is simple. If we see that two power profile parameters behave similarly, we replace them by a single parameter.

1) Pearson check: Each time the number of users served by the small cell (state) changes, the following procedure is initiated to fit the power profile model. First, we conduct a pairwise check of the Pearson coefficient for the power profile parameters:

$$\rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{5}$$

2) Model fitting: If the measure of linear correlation between any two parameters (e.g. p_1 and p_2) is high (ρ(p_1, p_2) > ρ_max), we replace one parameter (e.g. p_2) by a linear function of the other, p_2 = a + b p_1, where a and b are obtained by linear regression. The replaced parameters are excluded from the action space. Simulation results show that this method reduces convergence time, and it looks promising enough to continue developing this idea in further studies.

F. Limitations

The multi-user scheduling of resources within each base station is carried out by an opportunistic scheduler. We deliberately limit the scope of this paper to radio resource management decisions, ignoring MAC layer decisions. This is done to highlight the effect of radio resource decisions on the resulting QoS.

IV. PERFORMANCE EVALUATION

For performance analysis, we conduct simulations with a simulator based on the Vienna LTE-A Downlink System Level Simulator [17]. The overall network is composed of a 2-tier cell layout where a group of indoor small cells is co-located with a macro base station. The small cells are placed according to the 5x5 Grid layout specified in the 3GPP recommendation [18]. The users are located indoors and are uniformly distributed. It is assumed that each user is served by the base station with the strongest reference signal. The considered channel propagation and fading models are based on [18]. The default simulation details are summarized in Table I.

Fig. 5. Power profile snapshot (3 base stations), logarithmic scale (curves: eNodeB 1, eNodeB 2, eNodeB 3; x-axis: resource blocks; y-axis: TX power, dBm)
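Fig. 5 plots transmit power in dBm, whereas Table I and Fig. 3 quote it in mW, so a short conversion sketch may help when comparing them. The helper names below are hypothetical. The inputs are taken from the paper: the 250 mW base station TX power and the -174 dBm/Hz thermal noise density from Table I, and the 180 kHz resource block width from the definitions in Section III.

```python
import math


def mw_to_dbm(p_mw):
    """Convert power from milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)


def dbm_to_mw(p_dbm):
    """Convert power from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def thermal_noise_dbm(bandwidth_hz, density_dbm_per_hz=-174.0):
    """Thermal noise power in dBm over a bandwidth, from a density in dBm/Hz."""
    return density_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)


if __name__ == "__main__":
    print(f"250 mW = {mw_to_dbm(250.0):.2f} dBm")
    print(f"Noise per 180 kHz RB = {thermal_noise_dbm(180e3):.2f} dBm")
```

The 250 mW maximum corresponds to roughly 24 dBm, which matches the upper range of the vertical axis in Fig. 5.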

Fig. 6 presents the average number of satisfied users for indoor Voice and Data users, respectively. The proposed QoS RRM algorithm is compared with the basic QoS-ignorant version of radio resource allocation. In other words, the same algorithm is used, but the same transmit power is used for each user.

TABLE I
SIMULATION PARAMETERS

Scenario details
  Base scenario              5x5 Grid [18]
  Base station TX power      250 mW
  Number of base stations    3–28
  Number of users            4–10 per BS
  Scheduler type             Proportional Fair, QoS Token
  Traffic type               VoIP + Best effort [19]
  Delay tolerance            100 ms
  Data rate                  5–10 Mbps
  Antenna type               Omnidirectional
  RB Bandwidth               20 MHz
  TTI duration               1 ms
  Simulation time            300 ms, 1 hour, 1 day
Channel model
  Carrier frequency          2.6 GHz (Band 7)
  Thermal noise density      -174 dBm/Hz
  Shadowing fading           Log-normal, 8 dB
  Fast fading                Flat fading
Algorithm parameters
  Learning rate α            0.2
  Change factor β            1.05
  Discount factor γ          0.01
  Pearson's threshold ρmax   0.7

As can be seen from Fig. 6, Voice users greatly benefit from the proposed algorithm without compromising the performance of Data users. The highest performance is achieved with the Proportional Fair type of the underlying MAC scheduler. Fig. 7 shows the performance in the form of the cumulative distribution function (CDF) of the average number of satisfied users per base station. We first consider the performance of the 2-class scheme compared to the traditional 1-class scheme. We conclude that the QoS-sensitive version of the algorithm is able to find the best choices in an autonomous manner and outperforms the simple scheme. The proposed algorithm with two classes leads to the best performance in terms of satisfied requirements for both Data and Voice users.

Fig. 6. Performance comparison of 1-class and 2-class scheme, on top of commonly used MAC scheduler types: Round Robin, Proportional Fair, Max SINR (legend: Voice, Data; x-axis: MAC scheduler type / class number, RR-1, RR-2, PF-1, PF-2, SINR-1, SINR-2; y-axis: average satisfied requirements)

Fig. 7. Performance comparison for 1-class and 2-class setups (legend: 1-class RRM, 2-class RRM; x-axis: average satisfied requirements number; y-axis: CDF)

V. CONCLUSION

In this paper, we have presented a distributed mechanism for handling QoS radio resource management in a dense heterogeneous network. It divides resources within the system based on a multi-agent Q-learning algorithm. The underlying algorithm relies on a smart way of describing the power profile for QoS scenarios. We also propose a cross-layered traffic-aware utility function that provisions QoS for the users of small cells. The focus is on the coexistence of users with various types of traffic served by small cells in a non-cooperative macro environment. Taking advantage of the flexibility of the proposed approach, we have shown a way to increase both overall learning efficiency and system performance. As inputs for the algorithm, we use standard 3GPP-based metrics available locally at every commercial small cell, which makes it easily implementable in practice. The complexity of the algorithm is low due to its reinforcement learning nature. System level simulations demonstrate the efficiency of the proposed solution for the realistic setup recommended by the 3GPP. In our study, we have shown that the proposed algorithm works properly in various heterogeneous scenarios and outperforms the reference algorithms without negative impact on convergence. Further studies aim to develop an enhanced algorithm capable of dynamically adjusting to a varying number of traffic types.

REFERENCES

[1] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K. Soong, and J. C. Zhang, “What Will 5G Be?” Selected Areas in Communications, IEEE Journal on, vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[2] S. Jelassi, G. Rubino, H. Melvin, H. Youssef, and G. Pujolle, “Quality of Experience of VoIP Service: A Survey of Assessment Approaches and Open Issues,” Communications Surveys & Tutorials, IEEE, vol. 14, no. 2, pp. 491–513, 2012.
[3] R. Balakrishnan, B. Canberk, and I. F. Akyildiz, “Traffic-aware utility based QoS provisioning in OFDMA hybrid smallcells,” in Communications (ICC), 2013 IEEE International Conference on, 2013, pp. 6464–6468.
[4] M. Simsek and A. Czylwik, “Improved Decentralized Fuzzy Q-learning for Interference Reduction in Heterogeneous LTE-Networks,” IEEE Xplore, pp. 1–6, Aug. 2012.
[5] Y. Yang, T. Q. S. Quek, and L. Duan, “Backhaul-Constrained Small Cell Networks: Refunding and QoS Provisioning,” Wireless Communications, IEEE Transactions on, vol. 13, no. 9, pp. 5148–5161, Sep. 2014.
[6] K. Liu and J. Y. B. Lee, “Impact of TCP protocol efficiency on mobile network capacity loss,” in Modeling & Optimization in Mobile, Ad Hoc & Wireless Networks (WiOpt), 2013 11th International Symposium on, 2013, pp. 1–6.
[7] U. Sallakh, S. S. Mwanje, and A. Mitschele-Thiel, “Multi-parameter Q-Learning for downlink Inter-Cell Interference Coordination in LTE SON,” in Computers and Communication (ISCC), 2014 IEEE Symposium on, 2014, pp. 1–6.
[8] J. G. Carbonell, Ed., Machine Learning: Paradigms and Methods. New York, NY, USA: Elsevier North-Holland, Inc., 1990.
[9] Z. Zheng, J. Hamalainen, and Y. Yang, “On Uplink Power Control Optimization and Distributed Resource Allocation in Femtocell Networks,” in Vehicular Technology Conference (VTC Spring), 2011 IEEE 73rd, 2011, pp. 1–5.
[10] Y. L. Lee, T. C. Chuah, J. Loo, and A. Vinel, “Recent Advances in Radio Resource Management for Heterogeneous LTE/LTE-A Networks,” Communications Surveys & Tutorials, IEEE, vol. PP, no. 99, pp. 1–39, 2014.
[11] E. Bikov, Y. Ghamri-Doudane, and D. Botvich, “Smart Resource Allocation with Concurrent Learning Scheme for Heterogeneous LTE Smallcell Networks,” in 2015 IEEE Global Communications Conference (GLOBECOM), 2015, pp. 1–6.
[12] Y. L. Lee, T. C. Chuah, J. Loo, and A. Vinel, “Recent Advances in Radio Resource Management for Heterogeneous LTE/LTE-A Networks,” Communications Surveys & Tutorials, IEEE, vol. 16, no. 4, pp. 2142–2180, 2014.
[13] “Evolved Universal Terrestrial Radio Access (E-UTRA) and Evolved Universal Terrestrial Radio Access Network (E-UTRAN); Overall description; stage 2,” 3GPP, Recommendation TS36.300, 2014.
[14] “E-UTRAN Self-configuring and self-optimizing network (SON) use cases and solutions,” 3GPP, Recommendation TS36.902, 2011.
[15] A. Asadi and V. Mancuso, “A Survey on Opportunistic Scheduling in Wireless Communications,” IEEE Communications Surveys & Tutorials, vol. 15, no. 4, pp. 1671–1688, 2013.
[16] G. Cybenko, “Approximation by superpositions of a sigmoidal function,” Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303–314, 1989.
[17] J. C. Ikuno, M. Wrulich, and M. Rupp, “System level simulation of LTE networks,” in Proc. 2010 IEEE 71st Vehicular Technology Conference, 2010.
[18] “3GPP TSG RAN WG4 (Radio) Meeting 51: Simulation assumptions and parameters for FDD HeNB RF requirements,” 3GPP, Recommendation R4-092042, 2009.
