Traffic within data centers is bursty and unpredictable. Rapid growth in network traffic has two consequences: it exceeds the capacity of the network's link bandwidth, and it creates an imbalanced network load. Persistent overload then results in network congestion. Software-Defined Networking (SDN) is employed in data centers as a network architecture to enhance performance. This paper introduces an adaptive congestion control strategy, named DA-DCTCP, for SDN-based data centers. It incorporates Explicit Congestion Notification (ECN) and Round-Trip Time (RTT) to establish congestion awareness and an ECN marking model. To mitigate false congestion signals caused by abrupt flows, an appropriate ECN marking is selected based on the queue length and its growth slope, and the congestion window (CWND) is adjusted by calculating RTT. Simultaneously, the marking threshold for queue length is continuously adapted, using the current queue length of the switch as a parameter, to accommodate changes in data centers. Evaluation through Mininet simulations demonstrates that DA-DCTCP yields advantages in throughput, flow completion time (FCT), latency, and resistance to packet loss. These benefits help reduce data center congestion, stabilize data transmission, and improve throughput.
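The idea of marking on both queue length and growth slope, as described in the DA-DCTCP abstract, can be illustrated with a minimal sketch. The function name `should_mark`, the two thresholds `low_k` and `high_k`, and the exact decision rule are assumptions made for illustration; the abstract does not state the paper's actual marking model.

```python
def should_mark(queue_len, prev_queue_len, low_k, high_k):
    """Decide whether to ECN-mark an arriving packet (hypothetical rule).

    queue_len / prev_queue_len: queue occupancy at the current and the
    previous sampling instant, in packets.
    low_k / high_k: assumed lower and upper marking thresholds.
    """
    slope = queue_len - prev_queue_len  # growth per sampling interval
    if queue_len >= high_k:
        return True   # queue is long regardless of its trend
    if queue_len >= low_k and slope > 0:
        return True   # moderately long and still growing
    return False      # short queue, or a burst that is already draining


if __name__ == "__main__":
    # A queue of 30 packets that is draining is treated as a passing burst.
    print(should_mark(30, 35, low_k=20, high_k=40))
    # The same occupancy while growing is treated as real congestion.
    print(should_mark(30, 25, low_k=20, high_k=40))
```

Using the slope as a second signal lets a draining burst pass unmarked at an occupancy that would be marked if the queue were still growing, which is the behavior the abstract attributes to the scheme.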
The 6th generation mobile network (6G) is a multi-network interconnection and multi-scenario coexistence network, where multiple network domains break the original fixed boundaries to form connections and convergence. With the optimization objective of maximizing network utility while ensuring performance-centric weighted fairness among flows, this paper designs a reinforcement learning-based cloud-edge autonomous multi-domain data center network architecture that achieves single-domain autonomy and multi-domain collaboration. Because the utilities of different flows conflict, the bandwidth fairness allocation problem for various types of flows is formulated by considering different defined reward functions. Regarding the tradeoff between fairness and utility, this paper deals with the corresponding reward functions for the cases where the flows change abruptly and where they change smoothly. In addition, to accommodate the Quality of Service (QoS) requirements of multiple types of flows, this paper proposes a multi-domain autonomous routing algorithm called LSTM+MADDPG. Introducing a Long Short-Term Memory (LSTM) layer in the actor and critic networks adds more information about temporal continuity, further enhancing the ability to adapt to changes in the dynamic network environment. The LSTM+MADDPG algorithm is compared with the latest reinforcement learning algorithm through experiments on real network topology and traffic traces, and the experimental results show that LSTM+MADDPG improves the delay convergence speed by 14.6% and delays the start moment of packet loss by 18.2% compared with other algorithms.
With the continuous expansion of data center network scale, changing network requirements, and increasing pressure on network bandwidth, the traditional network architecture can no longer meet people's needs. The development of software-defined networks (SDN) has brought new opportunities and challenges to future networks. The separation of data and control in SDN improves the performance of the entire network. Researchers have integrated the SDN architecture into data centers to improve network resource utilization and performance. This paper first introduces the basic concepts of SDN and data center networks. It then discusses SDN-based load balancing mechanisms for data centers from different perspectives. Finally, it summarizes the research on SDN-based load balancing mechanisms and looks ahead to its development trends.
Data center networks may comprise tens or hundreds of thousands of nodes and, naturally, suffer from frequent software and hardware failures as well as link congestion. Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays. In such dynamic networks, links frequently fail or become congested, making the recalculation of the shortest paths a computationally intensive problem. Various routing protocols have been proposed to overcome this problem by focusing on network utilization rather than speed. Surprisingly, the design of fast shortest-path algorithms for data centers has been largely neglected, even though they are universal components of routing protocols. Moreover, parallelization techniques have mostly been deployed for random network topologies, not for the regular topologies that are often found in data centers. The aim of this paper is to improve scalability and reduce the time required for shortest-path calculation in data center networks through parallelization on general-purpose hardware. We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
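To make "parallelizing edge relaxations" concrete, the sketch below shows a round-synchronous Bellman-Ford variant. It is not the algorithm proposed in the paper, whose details the abstract does not give. In each round every edge is relaxed against the same snapshot of distances, so the relaxations within a round are independent of one another and could be distributed across workers. Here they run sequentially for simplicity, and the names `relax_round` and `shortest_paths` are illustrative.

```python
import math


def relax_round(dist, edges):
    """Relax every edge against one snapshot of `dist`.

    Because each relaxation reads only the old snapshot, the loop body
    has no cross-iteration dependency other than a min-reduction on
    new_dist[v], which is what makes the round parallelizable.
    """
    new_dist = list(dist)
    for u, v, w in edges:
        candidate = dist[u] + w
        if candidate < new_dist[v]:
            new_dist[v] = candidate
    return new_dist


def shortest_paths(n, edges, source):
    """Single-source shortest paths on a directed graph with n nodes."""
    dist = [math.inf] * n
    dist[source] = 0
    for _ in range(n - 1):          # at most n-1 rounds are needed
        new_dist = relax_round(dist, edges)
        if new_dist == dist:        # converged early
            break
        dist = new_dist
    return dist


if __name__ == "__main__":
    edges = [(0, 1, 1), (1, 2, 2), (0, 2, 5), (2, 3, 1)]
    print(shortest_paths(4, edges, source=0))
```

The trade-off is that a snapshot-based round may need more rounds than an in-place sequential sweep, in exchange for relaxations that do not depend on one another within a round.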
According to Cisco's Internet Report 2020 white paper, there will be 29.3 billion connected devices worldwide by 2023, up from 18.4 billion in 2018. 5G connections will generate nearly three times more traffic than 4G connections. While this brings a boom to the network, it also presents unprecedented challenges for flow forwarding decisions. The path assignment mechanism used in traditional traffic scheduling methods tends to cause local network congestion through the concentration of elephant flows, resulting in unbalanced network load and degraded quality of service. Using the centralized control of software-defined networks, this study proposes a data center traffic scheduling strategy for minimizing congestion and guaranteeing quality of service (MCQG). The ideal transmission path is selected for data flows while considering the network congestion rate and quality of service. Different traffic scheduling strategies are used according to the characteristics of different service types in data centers. Elephant flows, which tend to cause local congestion, are rerouted: a path evaluation function is formed from the maximum link utilization on the path, the number of elephant flows, and the time delay, and the fast optimum-seeking capability of the sparrow search algorithm is used to find the path with the lowest actual link overhead as the rerouting path for the elephant flows. This reduces the likelihood of local network congestion. Equal-cost multi-path (ECMP) protocols, which have a faster response time, are used to schedule mouse flows of shorter duration. This guarantees the quality of service of the network and achieves isolated transmission of the various types of data streams. The experimental results show that the proposed strategy has higher throughput, better network load balancing, and better robustness than ECMP under different traffic models. In addition, because it can fully utilize the resources in the network, MCQG also outperforms another traffic scheduling strategy that reroutes elephant flows (namely Hedera). Compared with ECMP and Hedera, MCQG improves average throughput by 11.73% and 4.29%, normalized total throughput by 6.74% and 2.64%, and link utilization by 23.25% and 15.07%, respectively; in addition, the average round-trip delay and packet loss rate fluctuate significantly less than with the two compared strategies.
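The MCQG abstract names three inputs to its path evaluation function (maximum link utilization, number of elephant flows, and delay) but not how they are combined. The sketch below assumes a simple weighted sum with hypothetical weights, counts elephant flows by summing over links, and picks the cheapest candidate by exhaustive comparison instead of the sparrow search algorithm used in the paper. All names and values are illustrative.

```python
def path_cost(path, w_util=0.5, w_flows=0.3, w_delay=0.2):
    """Score a candidate path; lower is better (assumed weighted sum).

    `path` is a list of links, each a dict with:
      util      - link utilization in [0, 1]
      elephants - number of elephant flows currently on the link
      delay_ms  - link delay in milliseconds
    The inputs are assumed to be on comparable scales already.
    """
    max_util = max(link["util"] for link in path)
    elephants = sum(link["elephants"] for link in path)
    delay = sum(link["delay_ms"] for link in path)
    return w_util * max_util + w_flows * elephants + w_delay * delay


def best_path(candidates):
    """Return the candidate path with the lowest cost (exhaustive search)."""
    return min(candidates, key=path_cost)


if __name__ == "__main__":
    path_a = [{"util": 0.9, "elephants": 2, "delay_ms": 1.0},
              {"util": 0.4, "elephants": 1, "delay_ms": 1.0}]
    path_b = [{"util": 0.5, "elephants": 0, "delay_ms": 1.5},
              {"util": 0.6, "elephants": 1, "delay_ms": 1.5}]
    print(best_path([path_a, path_b]) is path_b)
```

Exhaustive comparison is only practical for a handful of candidate paths; a metaheuristic such as the one the paper uses becomes relevant when the candidate set is large.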
Data centers are being distributed worldwide by cloud service providers (CSPs) to save energy costs through efficient workload allocation strategies. Many CSPs are challenged by the significant rise in user demands because of the extensive energy consumed during workload processing. Numerous research studies have examined distinct operating cost mitigation techniques for geo-distributed data centers (DCs). However, operating cost savings during workload processing that also consider string-matching techniques in geo-distributed DCs remain unexplored. In this research, we propose a novel string matching-based geographical load balancing (SMGLB) technique to mitigate the operating cost of geo-distributed DCs. The primary goal of this study is to use a string-matching algorithm (i.e., Boyer-Moore) to compare the contents of incoming workloads with those of documents that have already been processed in a data center. On a successful match, the global load balancer does not send the user's request to a data center for processing; instead, it returns the results of the previously processed workload to the user, saving energy. If no match is found, the global load balancer allocates the incoming workload to a specific DC for processing, considering variable energy prices, the number of active servers, on-site green energy, and traces of incoming workload. The results of numerical evaluations show that SMGLB reduces the operating expenses of geo-distributed data centers more than existing workload distribution techniques.
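The match-then-dispatch flow described in the SMGLB abstract can be sketched as follows. Two simplifications are assumed and should not be read as the paper's method: the matcher is the Boyer-Moore-Horspool variant (a simplified relative of Boyer-Moore that keeps only the bad-character shift), and the fallback picks a data center by energy price alone, whereas the abstract also lists active servers, green energy, and workload traces. The names `horspool_find` and `dispatch` are illustrative.

```python
def horspool_find(text, pattern):
    """Return the first index of `pattern` in `text`, or -1 if absent."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Shift for each character except the last one in the pattern:
    # distance from its last occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Skip ahead based on the text character under the pattern's end.
        pos += shift.get(text[pos + m - 1], m)
    return -1


def dispatch(workload, processed, energy_price):
    """Serve from earlier results on a match, else choose a data center.

    processed    - dict mapping processed document text to its result
    energy_price - dict mapping data center name to price (assumed metric)
    """
    for document, result in processed.items():
        if horspool_find(document, workload) != -1:
            return ("cache", result)
    cheapest = min(energy_price, key=energy_price.get)
    return ("dc", cheapest)


if __name__ == "__main__":
    processed = {"quarterly sales report 2023": "result-1"}
    prices = {"dc-a": 0.12, "dc-b": 0.08}
    print(dispatch("sales report", processed, prices))
    print(dispatch("weather", processed, prices))
```

Treating a match as "the workload text occurs inside a processed document" is itself an assumption; the abstract says only that contents are compared.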
As the amount of data continues to grow rapidly, the variety of data produced by applications is becoming richer than ever. Cloud computing is the leading technology evolving today to provide multiple services for massive and varied data. Cloud computing features are capable of processing, managing, and storing all sorts of data. Although data is stored in many high-end nodes, either in the same data center or across many data centers in the cloud, performance issues are still inevitable. A cloud replication strategy is one of the best solutions for addressing the risk of performance degradation in the cloud environment. The real challenge is developing the right data replication strategy with minimal data movement that guarantees efficient network usage, low fault tolerance, and minimal replication frequency. The key problem discussed in this research is the inefficient network usage that arises when selecting a suitable data center to store replica copies, caused by inadequate data center selection criteria. To mitigate the issue, we propose a Replication Strategy with a comprehensive Data Center Selection Method (RS-DCSM), which determines the appropriate data center in which to place replicas by considering three key factors: popularity, space availability, and centrality. The proposed RS-DCSM was simulated using CloudSim, and the results showed that data movement between data centers is significantly reduced, with a 14% reduction in overall replication frequency and a 20% decrease in network usage, outperforming the current replication strategy, the Dynamic Popularity aware Replication Strategy (DPRS) algorithm.
As a critical infrastructure of cloud computing, data center networks (DCNs) directly determine the service performance of data centers, which provide computing services for various applications such as big data processing and artificial intelligence. However, current data center network architectures suffer from long routing paths and low fault tolerance between source and destination servers, which makes it hard to satisfy the requirements of high-performance data center networks. Based on dual-port servers and the Clos network structure, this paper proposes a novel architecture, RClos, for constructing high-performance data center networks. Logically, the proposed architecture is constructed by inserting a dual-port server between each pair of adjacent switches in the switch fabric, where the switches are connected in the form of a ring Clos structure. We describe the structural properties of RClos in terms of network scale, bisection bandwidth, and network diameter. The RClos architecture inherits the characteristics of its embedded Clos network and can accommodate a large number of servers with a small average path length. The proposed architecture offers high fault tolerance and is suited to the construction of various data center networks. For example, the average path length between servers is 3.44, and the standardized bisection bandwidth is 0.8 in RClos(32,5). The results of numerical experiments show that RClos has a small average path length and high network fault tolerance, which are essential in the construction of high-performance data center networks.
Globally, digital technology and the digital economy have propelled technological revolution and industrial change, and they have become one of the main grounds of international industrial competition. It was estimated that the scale of China's digital economy would reach 50 trillion yuan in 2022, accounting for more than 40% of GDP, presenting great market potential and room for the growth of the digital economy. With the rapid development of the digital economy, the state attaches great importance to the construction of digital infrastructure and has introduced a series of policies to promote the systematic development and large-scale deployment of digital infrastructure. In 2022, the Chinese government planned to build 8 computing power hubs and 10 national data center clusters nationwide. To proactively address the future demand for AI across various scenarios, there is a need for a well-structured computing power infrastructure. The data center, serving as the pivotal hub for computing power, has evolved from the conventional cloud center to a more intelligent computing center, allowing for a diversified convergence of computing power supply. In addition, the data center accommodates a diverse array of computing power business forms from customers, reflecting the multi-industry developmental trend. The computing power service platform is consistently broadening its scope, with ongoing optimization and innovation in the design scheme of machine room processes. The widespread application of submerged phase-change liquid cooling technology and cold plate cooling technology introduces a series of new challenges to the construction of digital infrastructure. This paper delves into the design objectives, industry considerations, layout, and other dimensions of a smart computing center and proposes a new-generation data center solution that is "flexible, resilient, green, and low-carbon."
Interest in selecting an appropriate cloud data center is increasing rapidly due to the popularity and continuous growth of the cloud computing sector. Cloud data center selection challenges are compounded by ever-increasing user requests and the number of data centers required to execute these requests. The cloud service broker policy defines cloud data center selection, which is an NP-hard problem that requires an efficient, high-quality solution. The differential evolution algorithm is a metaheuristic algorithm characterized by its speed and robustness, and it is well suited for selecting an appropriate cloud data center. This paper presents a modified differential evolution algorithm-based cloud service broker policy for selecting the most appropriate data center in the cloud computing environment. The differential evolution algorithm is modified using a proposed new mutation technique that enhances performance and provides an appropriate selection of data centers. The proposed policy's superiority in selecting the most suitable data center is evaluated using the CloudAnalyst simulator. The results are compared with state-of-the-art cloud service broker policies.
In recent years, dual-homed topologies have appeared in data centers in order to offer higher aggregate bandwidth by using multiple paths simultaneously. Multipath TCP (MPTCP) has been proposed as a replacement for TCP in those topologies, as it can efficiently offer improved throughput and better fairness. However, we have found that MPTCP suffers from incast collapse, where the receiver experiences a drastic goodput drop when it simultaneously requests data from multiple servers. In this paper, we investigate why the goodput collapses even though MPTCP is able to actively relieve hot spots. To address the problem, we propose an equally-weighted congestion control algorithm for MPTCP, namely EW-MPTCP, which requires no centralized control, additional infrastructure, or hardware upgrade. In our scheme, in addition to the coupled congestion control performed on each subflow of an MPTCP connection, we allow each subflow to perform an additional congestion control operation by weighting the congestion window in inverse proportion to the number of servers. The goal is to mitigate incast collapse by allowing multiple MPTCP subflows to compete fairly with a single TCP flow at the shared bottleneck. The simulation results show that our solution mitigates the incast problem and noticeably improves goodput in data centers.
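The weighting step in the EW-MPTCP abstract (scaling each subflow's congestion window in inverse proportion to the number of servers) amounts to a one-line calculation. The sketch below is illustrative only: the function name `weighted_cwnd` and the one-segment floor `min_cwnd` are assumptions, and the abstract does not say how the weighting interacts with the coupled congestion control it is applied alongside.

```python
def weighted_cwnd(coupled_cwnd, num_servers, min_cwnd=1.0):
    """Scale a subflow's window by 1/num_servers (hypothetical helper).

    coupled_cwnd - window (in segments) produced by coupled congestion control
    num_servers  - number of servers the receiver is fetching from at once
    min_cwnd     - assumed floor so that a subflow can always send a segment
    """
    if num_servers < 1:
        raise ValueError("num_servers must be at least 1")
    return max(min_cwnd, coupled_cwnd / num_servers)


if __name__ == "__main__":
    # With 4 concurrent servers, a 40-segment window is scaled to 10.
    print(weighted_cwnd(40, 4))
```

The intent, per the abstract, is that the subflows converging on one receiver together behave more like a single TCP flow at the shared bottleneck.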
With the rapid development of technologies such as big data and cloud computing, the exponential growth of data communication and data computing has led to a large amount of energy consumption in data centers. Globally, data centers will become the world's largest energy consumers, with their share rising from 3% in 2017 to 4.5% in 2025. Due to its unique climate and energy-saving advantages, the high-latitude area in the Pan-Arctic region has gradually become a hotspot for data center site selection in recent years. In order to predict and analyze the future energy consumption and carbon emissions of global data centers, this paper presents a new method for energy consumption prediction based on global data center traffic and power usage effectiveness (PUE). First, global data center traffic growth is predicted based on Cisco's research. Second, the dynamic global average PUE and the high-latitude PUE are obtained from the Romonet simulation model, and global data center energy consumption under two scenarios, decentralized and centralized, is analyzed quantitatively via polynomial fitting. The simulation results show that, in 2030, global data center energy consumption and carbon emissions are reduced by about 301 billion kWh and 720 million tons of CO2 in the centralized scenario compared with the decentralized scenario, which confirms that establishing data centers in the Pan-Arctic region in the future can effectively relieve climate change and energy problems. This study provides support for global energy consumption prediction and guidance for the layout of future global data centers from the perspective of energy consumption. Moreover, it supports the feasibility of integrating energy and information networks under the Global Energy Interconnection concept.
How to effectively reduce the energy consumption of large-scale data centers is a key issue in cloud computing. This paper presents a novel low-power task scheduling algorithm (L3SA) for large-scale cloud data centers. A winner tree is introduced in which the data nodes serve as the leaf nodes, and the final winner is selected with the aim of reducing energy consumption. The complexity of large-scale cloud data centers is fully considered, and a task comparison coefficient is defined to make the task scheduling strategy more reasonable. Experiments and performance analysis show that the proposed algorithm can effectively improve node utilization and reduce the overall power consumption of the cloud data center.
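A winner tree (tournament tree) with data nodes as leaves, as mentioned in the L3SA abstract, can be sketched with an array layout. The per-node `costs` values stand in for whatever energy metric the paper uses and are hypothetical; the task comparison coefficient defined in the paper is not modeled here.

```python
def build_winner_tree(costs):
    """Build a winner tree over data nodes; tree[1] is the overall winner.

    Leaves occupy tree[n..2n-1] and hold node indices; each internal
    entry tree[i] holds the index of the cheaper of its two children.
    """
    n = len(costs)
    tree = [None] * n + list(range(n))
    for i in range(n - 1, 0, -1):
        left, right = tree[2 * i], tree[2 * i + 1]
        tree[i] = left if costs[left] <= costs[right] else right
    return tree


def update(tree, costs, node, new_cost):
    """Change one node's cost and replay only the matches on its path."""
    costs[node] = new_cost
    i = (len(costs) + node) // 2
    while i >= 1:
        left, right = tree[2 * i], tree[2 * i + 1]
        tree[i] = left if costs[left] <= costs[right] else right
        i //= 2


if __name__ == "__main__":
    costs = [5.0, 3.0, 8.0, 4.0]        # assumed energy cost per data node
    tree = build_winner_tree(costs)
    print(tree[1])                       # index of the cheapest node
    update(tree, costs, 1, 9.0)          # node 1 becomes expensive after a task
    print(tree[1])
```

The benefit over rescanning all nodes is that after a node's cost changes, only the matches on its leaf-to-root path are replayed, which takes logarithmic time in the number of nodes.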
With the emergence of diverse applications in data centers, the demands on quality of service in data centers have also become diverse, such as high throughput for elephant flows and low latency for deadline-sensitive flows. However, traditional TCPs are ill-suited to such situations and often result in inefficient data transfers (e.g., missed flow deadlines and throughput collapse). This further degrades the user-perceived quality of service (QoS) in data centers. To reduce the flow completion time of mice and deadline-sensitive flows while promoting the throughput of elephant flows, this paper proposes an efficient, deadline-aware, priority-driven congestion control (PCC) protocol, which grants mice and deadline-sensitive flows the highest priority. Specifically, PCC computes the priority of different flows according to the size of the transmitted data, the remaining data volume, and the flows' deadlines. PCC then adjusts the congestion window according to the flow priority and the degree of network congestion. Furthermore, switches in data centers control the input and output of packets based on the flow priority and the queue length. Unlike existing TCPs, PCC provides an effective method to compute and encode the flow priority explicitly in order to speed up the data transfers of mice and deadline-sensitive flows. According to the flow priority, switches can manage packets efficiently and ensure the data transfers of high-priority flows through weighted priority scheduling with minor modification. The experimental results show that PCC can improve the data transfer performance of mice and deadline-sensitive flows while guaranteeing the throughput of elephant flows.
Global data traffic is growing rapidly, and the demand for optoelectronic transceivers used in data centers (DCs) is increasing correspondingly. In this review, we first briefly introduce the development of optoelectronic transceivers in DCs, as well as the advantages of silicon photonic chips fabricated by the complementary metal-oxide-semiconductor (CMOS) process. We also summarize the research on the main components of silicon photonic transceivers. In particular, quantum dot lasers have shown great potential as light sources for silicon photonic integration, whether through bonding or monolithic integration, thanks to their unique advantages over conventional quantum-well counterparts. Some of the solutions for high-speed optical interconnection in DCs are then discussed. Among them, wavelength division multiplexing and four-level pulse-amplitude modulation have been widely studied and applied. At present, the application of coherent optical communication technology has moved from the backbone network to the metro network, and then to DCs.
New and emerging use cases, such as the interconnection of geographically distributed data centers (DCs), are drawing attention to the requirement for dynamic end-to-end service provisioning spanning multiple heterogeneous optical network domains. This heterogeneity is due not only to the diverse data transmission and switching technologies, but also to the different options for control plane techniques. In light of this, the problem of heterogeneous control plane interworking needs to be solved, and in particular, the solution must address the specific issues of multi-domain networks, such as limited domain topology visibility, given the scalability and confidentiality constraints. In this article, some of the recent activities regarding Software-Defined Networking (SDN) orchestration are reviewed to address this multi-domain control plane interworking problem. Specifically, three different models, namely the single SDN controller model, multiple SDN controllers in a mesh, and multiple SDN controllers in a hierarchical setting, are presented for the DC interconnection network with multiple SDN/OpenFlow domains or multiple OpenFlow/Generalized Multi-Protocol Label Switching (GMPLS) heterogeneous domains. In addition, two concrete implementations of the orchestration architectures are detailed, showing the overall feasibility and procedures of SDN orchestration for end-to-end service provisioning in multi-domain data center optical networks.
An 8×10 GHz receiver optical sub-assembly (ROSA) consisting of an 8-channel arrayed waveguide grating (AWG) and an 8-channel PIN photodetector (PD) array is designed and fabricated based on silica hybrid integration technology. Multimode output waveguides in the silica AWG with a 2% refractive index difference are used to obtain flat-top spectra. The output waveguide facet is polished to a 45° bevel to redirect the light into the mesa-type PIN PD, which simplifies the packaging process. The experimental results show that the single-channel 1 dB bandwidth of the AWG ranges from 2.12 nm to 3.06 nm, the ROSA responsivity ranges from 0.097 A/W to 0.158 A/W, and the 3 dB bandwidth is up to 11 GHz. The device is promising for application in eight-lane WDM transmission systems for data center interconnection.
Virtualization is a common technology for resource sharing in data centers. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as a virtual data center, VDC) to the physical data center effectively. In this paper, we focus on this problem. Distinct from previous works, our study of the VDC embedding problem assumes that switch resources are the bottleneck of data center networks (DCNs). To this end, we propose relative cost to evaluate embedding strategies, and we decouple the embedding problem, based on the properties of the fat-tree, into VM placement with marginal resource assignment and virtual link mapping with a decided source and destination. We also design the traffic aware embedding algorithm (TAE) and first fit virtual link mapping (FFLM) to map virtual data center requests to a physical data center. Simulation results show that TAE+FFLM can increase the acceptance rate and reduce network cost (about 49% in the case studied) at the same time. The traffic aware embedding algorithm reduces the load of core-link traffic and creates an optimization opportunity for data center network energy conservation.
We consider differentiated time-critical task scheduling in an N×N input-queued optical packet switch to ensure 100% throughput and meet the different delay requirements among various modules of a data center. Existing schemes either consider slot-by-slot scheduling with queue depth serving as the delay metric, or assume that each input-output connection has the same delay bound in the batch scheduling mode. The former neglects the effect of reconfiguration overhead, which may cripple system performance, while the latter cannot satisfy users' differentiated Quality of Service (QoS) requirements. To remedy these deficiencies, we propose a new batch scheduling scheme to meet the various port-to-port delay requirements in a best-effort manner. Moreover, a speedup is considered to compensate for both the reconfiguration overhead and the unavoidable slot wastage in the switch fabric. Given the traffic matrix and the delay constraint matrix, this paper proposes two heuristic algorithms, Stringent Delay First (SDF) and m-order SDF (m-SDF), to realize 100% packet switching while maximizing the delay constraint satisfaction ratio. The performance of our scheme is verified by extensive numerical simulations.
The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services. Data centers typically contain a large number of compute and storage nodes, which may fail and affect the quality of service. Failure prediction is an important means of ensuring service availability. Predicting node failure in cloud-based data centers is challenging because the failure symptoms have complex characteristics and the imbalance between failure samples and normal samples is widespread, resulting in inaccurate failure prediction. Targeting these challenges, this paper proposes a novel failure prediction method, FP-STE (Failure Prediction based on Spatio-temporal Feature Extraction). First, an improved recurrent neural network, HW-GRU (Improved GRU based on HighWay network), and a convolutional neural network (CNN) are used to extract the temporal and spatial features of multivariate data, respectively, to better discriminate between different types of failure symptoms and thereby improve prediction accuracy. Then the intermediate results of the two models are added as features to SCS-XGBoost to predict the possibility and the precise type of future node failure. SCS-XGBoost is an ensemble learning model improved by an integrated strategy of oversampling and cost-sensitive learning. Experimental results on real data sets confirm the effectiveness and superiority of FP-STE.
Funding: Supported by the National Key R&D Program of China (No. 2021YFB2700800) and the GHfund B (No. 202302024490).
Abstract: The traffic within data centers exhibits bursts and unpredictable patterns. This rapid growth in network traffic has two consequences: it surpasses the inherent capacity of the network’s link bandwidth and creates an imbalanced network load. Consequently, persistent overload eventually results in network congestion. Software Defined Network (SDN) technology is employed in data centers as a network architecture to enhance performance. This paper introduces an adaptive congestion control strategy, named DA-DCTCP, for SDN-based data centers. It incorporates Explicit Congestion Notification (ECN) and Round-Trip Time (RTT) to establish congestion awareness and an ECN marking model. To mitigate incorrect congestion detection caused by abrupt flows, an appropriate ECN marking is selected based on the queue length and its growth slope, and the congestion window (CWND) is adjusted by calculating RTT. Simultaneously, the marking threshold for queue length is continuously adapted using the current queue length of the switch as a parameter to accommodate changes in data centers. The evaluation conducted through Mininet simulations demonstrates that DA-DCTCP yields advantages in terms of throughput, flow completion time (FCT), latency, and resistance against packet loss. These benefits contribute to reducing data center congestion, enhancing the stability of data transmission, and improving throughput.
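For orientation, the sketch below shows the sender-side window update of baseline DCTCP, which the name DA-DCTCP suggests the proposed scheme builds on. It is not the paper's algorithm: the abstract does not give the DA-DCTCP marking or window formulas, so the queue-slope marking and RTT-based adjustment are not modeled here. The function name, the per-window bookkeeping, and the floor of one segment are illustrative assumptions; the gain g = 1/16 is the commonly recommended value.

```python
def dctcp_update(cwnd, alpha, marked_acks, total_acks, g=1 / 16):
    """One per-window update of baseline DCTCP (not DA-DCTCP).

    cwnd        -- congestion window in segments
    alpha       -- running estimate of the fraction of ECN-marked packets
    marked_acks -- ACKs in the last window that echoed an ECN mark
    total_acks  -- all ACKs received in the last window
    g           -- estimation gain (assumed default of 1/16)
    """
    # Fraction of packets that were marked in the last window.
    frac = marked_acks / total_acks if total_acks else 0.0
    # Exponentially weighted moving average of the marking fraction.
    alpha = (1 - g) * alpha + g * frac
    # Shrink the window in proportion to the congestion estimate,
    # but only if congestion was actually signalled in this window.
    if marked_acks:
        cwnd = max(1.0, cwnd * (1 - alpha / 2))
    return cwnd, alpha


if __name__ == "__main__":
    cwnd, alpha = 100.0, 0.0
    for marked in (0, 2, 10, 10):  # hypothetical marks per 10 ACKs
        cwnd, alpha = dctcp_update(cwnd, alpha, marked, 10)
        print(f"marked={marked:2d} alpha={alpha:.4f} cwnd={cwnd:.2f}")
```

Because the reduction scales with alpha, light marking trims the window only slightly, while sustained marking approaches the halving of standard TCP.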
Abstract: The 6th generation mobile network (6G) is a multi-network interconnection and multi-scenario coexistence network, in which multiple network domains break their original fixed boundaries to form connections and convergence. With the optimization objective of maximizing network utility while ensuring performance-centric weighted fairness among flows, this paper designs a reinforcement learning-based cloud-edge autonomous multi-domain data center network architecture that achieves single-domain autonomy and multi-domain collaboration. Because the utilities of different flows conflict, the fair bandwidth allocation problem for various types of flows is formulated by considering differently defined reward functions. Regarding the tradeoff between fairness and utility, this paper addresses the corresponding reward functions for the cases where flows change abruptly and where they change smoothly. In addition, to accommodate the Quality of Service (QoS) requirements of multiple types of flows, this paper proposes a multi-domain autonomous routing algorithm called LSTM+MADDPG. Introducing a Long Short-Term Memory (LSTM) layer in the actor and critic networks adds more information about temporal continuity, further enhancing the ability to adapt to changes in the dynamic network environment. The LSTM+MADDPG algorithm is compared with the latest reinforcement learning algorithm in experiments on a real network topology and traffic traces, and the experimental results show that LSTM+MADDPG improves the delay convergence speed by 14.6% and delays the onset of packet loss by 18.2% compared with other algorithms.
Abstract: With the continuous expansion of data center network scale, changing network requirements, and increasing pressure on network bandwidth, the traditional network architecture can no longer meet people’s needs. The development of software defined networks (SDN) has brought new opportunities and challenges to future networks. The separation of the data and control planes in SDN improves the performance of the entire network. Researchers have integrated the SDN architecture into data centers to improve network resource utilization and performance. This paper first introduces the basic concepts of SDN and data center networks. It then discusses SDN-based load balancing mechanisms for data centers from different perspectives. Finally, it summarizes the research on SDN-based load balancing mechanisms and looks ahead to their development trends.
Funding: This work was supported by the Serbian Ministry of Science and Education (project TR-32022) and by the companies Telekom Srbija and Informatika.
Abstract: Data center networks may comprise tens or hundreds of thousands of nodes and, naturally, suffer from frequent software and hardware failures as well as link congestion. Packets are routed along the shortest paths with sufficient resources to facilitate efficient network utilization and minimize delays. In such dynamic networks, links frequently fail or become congested, making the recalculation of the shortest paths a computationally intensive problem. Various routing protocols were proposed to overcome this problem by focusing on network utilization rather than speed. Surprisingly, the design of fast shortest-path algorithms for data centers was largely neglected, though they are universal components of routing protocols. Moreover, parallelization techniques were mostly deployed for random network topologies, and not for the regular topologies that are often found in data centers. The aim of this paper is to improve scalability and reduce the time required for the shortest-path calculation in data center networks by parallelization on general-purpose hardware. We propose a novel algorithm that parallelizes edge relaxations as a faster and more scalable solution for popular data center topologies.
Funding: This work is funded by the National Natural Science Foundation of China under Grant No. 61772180 and the Key R&D Plan of Hubei Province (2020BHB004, 2020BAB012).
Abstract: According to Cisco’s Internet Report 2020 white paper, there will be 29.3 billion connected devices worldwide by 2023, up from 18.4 billion in 2018. 5G connections will generate nearly three times more traffic than 4G connections. While this brings a boom to the network, it also presents unprecedented challenges for flow forwarding decisions. The path assignment mechanism used in traditional traffic scheduling methods tends to cause local network congestion through the concentration of elephant flows, resulting in unbalanced network load and degraded quality of service. Using the centralized control of software-defined networks, this study proposes a data center traffic scheduling strategy for minimizing congestion and guaranteeing quality of service (MCQG). The ideal transmission path is selected for data flows while considering the network congestion rate and quality of service. Different traffic scheduling strategies are used according to the characteristics of the different service types in data centers. Elephant flows, which tend to cause local congestion, are rerouted. The path evaluation function is formed from the maximum link utilization on the path, the number of elephant flows, and the time delay, and the fast optimum-seeking capability of the sparrow search algorithm is used to find the path with the lowest actual link overhead as the rerouting path for the elephant flows. This reduces the likelihood of local network congestion. Equal-cost multi-path (ECMP) protocols, which have a faster response time, are used to schedule mouse flows of shorter duration and thereby guarantee the quality of service of the network. In this way, the various types of data streams are transmitted in isolation. The experimental results show that the proposed strategy has higher throughput, better network load balancing, and better robustness than ECMP under different traffic models. In addition, because it can fully utilize the resources in the network, MCQG also outperforms another traffic scheduling strategy that reroutes elephant flows (namely Hedera). Compared with ECMP and Hedera, MCQG improves average throughput by 11.73% and 4.29%, and normalized total throughput by 6.74% and 2.64%, respectively; MCQG improves link utilization by 23.25% and 15.07%; in addition, the average round-trip delay and packet loss rate fluctuate significantly less than with the two compared strategies.
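As a rough illustration of the ECMP baseline mentioned above, the following sketch picks one of several equal-cost paths by hashing a flow's five-tuple, so every packet of a flow follows the same path. It is a generic sketch, not the MCQG implementation: the function name and the use of SHA-256 are assumptions made for a deterministic, self-contained example, and real switches use their own hardware hash functions.

```python
import hashlib


def ecmp_path_index(src_ip, dst_ip, src_port, dst_port, proto, num_paths):
    """Map a flow five-tuple to one of num_paths equal-cost paths."""
    if num_paths < 1:
        raise ValueError("num_paths must be at least 1")
    # Serialize the five-tuple; all packets of one flow produce the same key.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    # Reduce the first 8 bytes of the digest to a path index.
    return int.from_bytes(digest[:8], "big") % num_paths


if __name__ == "__main__":
    # Hypothetical flows between two hosts, spread over 4 equal-cost paths.
    for port in range(5000, 5005):
        idx = ecmp_path_index("10.0.0.1", "10.0.1.2", port, 80, "tcp", 4)
        print(port, "->", idx)
```

Because the choice ignores flow size, two elephant flows can hash onto the same link, which is the local congestion that the abstract's rerouting step is meant to avoid.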
Abstract: Data centers are being distributed worldwide by cloud service providers (CSPs) to save energy costs through efficient workload allocation strategies. Many CSPs are challenged by the significant rise in user demands due to their extensive energy consumption during workload processing. Numerous research studies have examined distinct operating cost mitigation techniques for geo-distributed data centers (DCs). However, operating cost savings during workload processing that also consider string-matching techniques in geo-distributed DCs remain unexplored. In this research, we propose a novel string matching-based geographical load balancing (SMGLB) technique to mitigate the operating cost of geo-distributed DCs. The primary goal of this study is to use a string-matching algorithm (i.e., Boyer-Moore) to compare the contents of incoming workloads to those of documents that have already been processed in a data center. On a successful match, the global load balancer does not send the user’s request to a data center for processing; instead, the results of the previously processed workload are displayed to the user to save energy. Conversely, if no match is found, the global load balancer allocates the incoming workload to a specific DC for processing, considering variable energy prices, the number of active servers, on-site green energy, and traces of incoming workload. The results of numerical evaluations show that SMGLB can reduce the operating expenses of geo-distributed data centers more than the existing workload distribution techniques.
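The matching step described above can be sketched as follows. This is a simplified Boyer-Moore search that uses only the bad-character rule (the full algorithm also applies a good-suffix rule), followed by a hypothetical cache lookup; `find_cached_result` and the dictionary of processed documents are illustrative assumptions, not part of SMGLB as published.

```python
def boyer_moore_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Simplified Boyer-Moore: only the bad-character rule is used.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Rightmost position of each character in the pattern.
    last = {ch: i for i, ch in enumerate(pattern)}
    s = 0  # current alignment of the pattern against the text
    while s <= n - m:
        j = m - 1
        # Compare right to left.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            return s
        # Shift so the mismatched text character lines up with its
        # rightmost occurrence in the pattern (always at least 1).
        s += max(1, j - last.get(text[s + j], -1))
    return -1


def find_cached_result(workload, processed):
    """Hypothetical lookup: return a stored result if the workload text
    occurs in an already processed document, otherwise None."""
    for document, result in processed.items():
        if boyer_moore_search(document, workload) != -1:
            return result
    return None


if __name__ == "__main__":
    cache = {"resize image 42 to 128px": "result-A"}
    print(find_cached_result("image 42", cache))  # served from the cache
    print(find_cached_result("image 43", cache))  # must go to a data center
```

Only a miss (a `None` return here) would reach the load balancer's energy-aware placement step.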
Funding: Supported by Universiti Putra Malaysia and the Ministry of Education (MOE).
Abstract: As the amount of data continues to grow rapidly, the variety of data produced by applications is becoming richer than ever. Cloud computing is the leading technology evolving today to provide multiple services for this mass and variety of data. Cloud computing features are capable of processing, managing, and storing all sorts of data. Although data is stored in many high-end nodes, either in the same data center or across many data centers in the cloud, performance issues are still inevitable. A cloud replication strategy is one of the best solutions for addressing the risk of performance degradation in the cloud environment. The real challenge is developing the right data replication strategy with minimal data movement that guarantees efficient network usage, low fault tolerance, and minimal replication frequency. The key problem discussed in this research is the inefficient network usage that arises when selecting a suitable data center to store replica copies, which is caused by inadequate data center selection criteria. Hence, to mitigate the issue, we propose the Replication Strategy with a comprehensive Data Center Selection Method (RS-DCSM), which determines the appropriate data center in which to place replicas by considering three key factors: popularity, space availability, and centrality. The proposed RS-DCSM was simulated using CloudSim, and the results show that data movement between data centers is significantly reduced, with a 14% reduction in overall replication frequency and a 20% decrease in network usage, outperforming the current replication strategy, the Dynamic Popularity aware Replication Strategy (DPRS) algorithm.
Funding: This work was supported by the Hainan Provincial Natural Science Foundation of China (620RC560, 2019RC096, 620RC562), the Scientific Research Setup Fund of Hainan University (KYQD(ZR)1877), the National Natural Science Foundation of China (62162021, 82160345, 61802092), the Key Research and Development Program of Hainan Province (ZDYF2020199, ZDYF2021GXJS017), and the Key Science and Technology Plan Project of Haikou (2011-016).
Abstract: As a critical infrastructure of cloud computing, data center networks (DCNs) directly determine the service performance of data centers, which provide computing services for various applications such as big data processing and artificial intelligence. However, current data center network architectures suffer from long routing paths and low fault tolerance between source and destination servers, which makes it hard to satisfy the requirements of high-performance data center networks. Based on dual-port servers and the Clos network structure, this paper proposes a novel architecture, RClos, for constructing high-performance data center networks. Logically, the proposed architecture is constructed by inserting a dual-port server between each pair of adjacent switches in the switch fabric, where the switches are connected in the form of a ring Clos structure. We describe the structural properties of RClos in terms of network scale, bisection bandwidth, and network diameter. The RClos architecture inherits the characteristics of its embedded Clos network, which can accommodate a large number of servers with a small average path length. The proposed architecture offers high fault tolerance, which suits the construction of various data center networks. For example, the average path length between servers is 3.44 and the standardized bisection bandwidth is 0.8 in RClos(32,5). The results of numerical experiments show that RClos has a small average path length and high network fault tolerance, which are essential in the construction of high-performance data center networks.
Abstract: Globally, digital technology and the digital economy have propelled technological revolution and industrial change, and they have become one of the main arenas of international industrial competition. It was estimated that the scale of China’s digital economy would reach 50 trillion yuan in 2022, accounting for more than 40% of GDP, which presents great market potential and room for growth. With the rapid development of the digital economy, the state attaches great importance to the construction of digital infrastructure and has introduced a series of policies to promote its systematic development and large-scale deployment. In 2022, the Chinese government planned to build 8 computing power hubs and 10 national data center clusters nationwide. To proactively address the future demand for AI across various scenarios, a well-structured computing power infrastructure is needed. The data center, serving as the pivotal hub for computing power, has evolved from the conventional cloud center into a more intelligent computing center, allowing for a diversified convergence of computing power supply. In addition, the data center accommodates a diverse array of computing power business forms from customers, reflecting the multi-industry development trend. The computing power service platform is steadily broadening its scope, with ongoing optimization and innovation in the design of machine room processes. The widespread application of immersion phase-change liquid cooling and cold plate cooling technologies introduces a series of new challenges to the construction of digital infrastructure. This paper examines the design objectives, industry considerations, layout, and other dimensions of a smart computing center and proposes a new-generation data center solution that is “flexible, resilient, green, and low-carbon.”
Funding: This work was supported by Universiti Sains Malaysia under an external grant (Grant Number 304/PNAV/650958/U154).
Abstract: Interest in selecting an appropriate cloud data center is increasing rapidly due to the popularity and continuous growth of the cloud computing sector. Cloud data center selection challenges are compounded by ever-increasing user requests and by the number of data centers required to execute these requests. The cloud service broker policy governs cloud data center selection, which is an NP-hard problem that calls for an efficient, high-quality solution. The differential evolution algorithm is a metaheuristic algorithm characterized by its speed and robustness, and it is well suited to selecting an appropriate cloud data center. This paper presents a modified differential evolution algorithm-based cloud service broker policy for selecting the most appropriate data center in the cloud computing environment. The differential evolution algorithm is modified using a proposed new mutation technique that enhances performance and provides an appropriate selection of data centers. The proposed policy’s superiority in selecting the most suitable data center is evaluated using the CloudAnalyst simulator. The results are compared with state-of-the-art cloud service broker policies.
Funding: Supported in part by the HUT Distributed and Mobile Cloud Systems research project and Tekes within the ITEA2 project 10014 EASI-CLOUDS.
Abstract: In recent years, dual-homed topologies have appeared in data centers in order to offer higher aggregate bandwidth by using multiple paths simultaneously. Multipath TCP (MPTCP) has been proposed as a replacement for TCP in those topologies, as it can efficiently offer improved throughput and better fairness. However, we have found that MPTCP has a problem with incast collapse, where the receiver suffers a drastic goodput drop when it simultaneously requests data from multiple servers. In this paper, we investigate why the goodput collapses even though MPTCP is able to actively relieve hot spots. To address the problem, we propose an equally-weighted congestion control algorithm for MPTCP, namely EW-MPTCP, which needs no centralized control, additional infrastructure, or hardware upgrade. In our scheme, in addition to the coupled congestion control performed on each subflow of an MPTCP connection, we allow each subflow to perform an additional congestion control operation by weighting the congestion window in inverse proportion to the number of servers. The goal is to mitigate incast collapse by allowing multiple MPTCP subflows to compete fairly with a single TCP flow at the shared bottleneck. The simulation results show that our solution mitigates the incast problem and noticeably improves goodput in data centers.
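The weighting idea can be illustrated with a very small sketch. The abstract states only that each subflow's congestion window is weighted in inverse proportion to the number of servers; the function name, the one-segment floor, and the choice to apply the weight as a simple division are assumptions for illustration, and the coupled congestion control that runs alongside it is not modeled.

```python
def weighted_subflow_cwnd(cwnd_segments, num_servers, min_cwnd=1.0):
    """Scale a subflow's congestion window by 1 / num_servers.

    cwnd_segments -- window computed by the regular (coupled) controller
    num_servers   -- servers the receiver is requesting data from at once
    min_cwnd      -- assumed floor so that a subflow can still send
    """
    if num_servers < 1:
        raise ValueError("num_servers must be at least 1")
    return max(min_cwnd, cwnd_segments / num_servers)


if __name__ == "__main__":
    # Hypothetical incast: the same 40-segment window with more servers.
    for n in (1, 4, 8, 64):
        print(n, "servers ->", weighted_subflow_cwnd(40.0, n), "segments")
```

With many concurrent senders, each subflow offers less load to the shared bottleneck queue, which is the effect the abstract relies on to soften incast collapse.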
Funding: Supported by the National Natural Science Foundation of China (61472042) and the Corporation Science and Technology Program of Global Energy Interconnection Group Ltd. (GEIGC-D-[2018]024).
Abstract: With the rapid development of technologies such as big data and cloud computing, the exponential growth of data communication and data computing has led to a large amount of energy consumption in data centers. Globally, data centers will become the world’s largest energy consumers, with their share rising from 3% in 2017 to 4.5% in 2025. Due to its unique climate and energy-saving advantages, the high-latitude area in the Pan-Arctic region has gradually become a hotspot for data center site selection in recent years. In order to predict and analyze the future energy consumption and carbon emissions of global data centers, this paper presents a new energy consumption prediction method based on global data center traffic and power usage effectiveness (PUE). First, global data center traffic growth is predicted based on Cisco’s research. Second, the dynamic global average PUE and the high-latitude PUE are obtained from the Romonet simulation model, and global data center energy consumption is then analyzed quantitatively via polynomial fitting under two scenarios, the decentralized scenario and the centralized scenario. The simulation results show that, in 2030, global data center energy consumption and carbon emissions are reduced by about 301 billion kWh and 720 million tons of CO2 in the centralized scenario compared with the decentralized scenario, which confirms that establishing data centers in the Pan-Arctic region in the future can effectively relieve climate change and energy problems. This study provides support for global energy consumption prediction and guidance for the layout of future global data centers from the perspective of energy consumption. Moreover, it supports the feasibility of integrating energy and information networks under the Global Energy Interconnection concept.
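Since PUE is defined as total facility energy divided by IT equipment energy, the effect of siting on consumption can be sketched with simple arithmetic. The functions and all numbers below are hypothetical placeholders chosen for illustration; they are not the paper's traffic forecast, its Romonet-derived PUE values, or its polynomial fit.

```python
def facility_energy_kwh(it_energy_kwh, pue):
    """Total facility energy implied by an IT load and a PUE value.

    PUE = total facility energy / IT equipment energy, so
    total = IT energy * PUE. A PUE below 1.0 is not physically meaningful.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_energy_kwh * pue


def scenario_saving_kwh(it_energy_kwh, pue_decentralized, pue_centralized):
    """Energy saved by running the same IT load at a lower PUE."""
    return (facility_energy_kwh(it_energy_kwh, pue_decentralized)
            - facility_energy_kwh(it_energy_kwh, pue_centralized))


if __name__ == "__main__":
    # Hypothetical values: 100 kWh of IT load, PUE 1.6 versus PUE 1.2.
    print(facility_energy_kwh(100.0, 1.6))
    print(scenario_saving_kwh(100.0, 1.6, 1.2))
```

The saving grows linearly with the IT load, so the projected traffic growth and the PUE gap between sites jointly drive the scenario difference reported in the abstract.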
Funding: Supported by the National Natural Science Foundation of China (61202004, 61272084), the National Key Basic Research Program of China (973 Program) (2011CB302903), the Specialized Research Fund for the Doctoral Program of Higher Education (20093223120001, 20113223110003), the China Postdoctoral Science Foundation Funded Project (2011M500095, 2012T50514), the Natural Science Foundation of Jiangsu Province (BK2011754, BK2009426), the Jiangsu Postdoctoral Science Foundation Funded Project (1102103C), the Natural Science Fund of Higher Education of Jiangsu Province (12KJB520007), and the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (yx002001).
Abstract: How to effectively reduce the energy consumption of large-scale data centers is a key issue in cloud computing. This paper presents a novel low-power task scheduling algorithm (L3SA) for large-scale cloud data centers. A winner tree is introduced in which the data nodes are the leaf nodes, and the final winner is selected with the aim of reducing energy consumption. The complexity of large-scale cloud data centers is fully considered, and a task comparison coefficient is defined to make the task scheduling strategy more reasonable. Experiments and performance analysis show that the proposed algorithm can effectively improve node utilization and reduce the overall power consumption of the cloud data center.
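A winner (tournament) tree of the kind mentioned above can be sketched as follows: data nodes sit at the leaves, each internal node stores the winner of its two children, and the root holds the overall winner. Here the winner is simply the node with the lowest cost; the per-node cost values and function names are assumptions, and the abstract's task comparison coefficient is not specified, so it is not modeled.

```python
def build_winner_tree(costs):
    """Build an array-based winner tree over the given per-node costs.

    Leaves occupy tree[n:2n] and hold node indices; tree[k] for
    1 <= k < n holds the index of the cheaper of its two children.
    tree[1] is the overall winner (lowest-cost node).
    """
    n = len(costs)
    if n == 0:
        raise ValueError("at least one data node is required")
    tree = [0] * (2 * n)
    for i in range(n):
        tree[n + i] = i
    for k in range(n - 1, 0, -1):
        a, b = tree[2 * k], tree[2 * k + 1]
        tree[k] = a if costs[a] <= costs[b] else b
    return tree


def update_cost(tree, costs, node, new_cost):
    """Change one node's cost and replay matches on the path to the root."""
    n = len(costs)
    costs[node] = new_cost
    k = (n + node) // 2
    while k >= 1:
        a, b = tree[2 * k], tree[2 * k + 1]
        tree[k] = a if costs[a] <= costs[b] else b
        k //= 2


if __name__ == "__main__":
    costs = [5.0, 3.0, 8.0, 3.5]          # hypothetical energy costs
    tree = build_winner_tree(costs)
    print("winner:", tree[1])
    update_cost(tree, costs, tree[1], 9.0)  # the winner took a task
    print("next winner:", tree[1])
```

After a task is assigned, only the matches on one leaf-to-root path are replayed, so picking the next node costs O(log n) rather than a full scan.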
Funding: Supported in part by the National Natural Science Foundation of China (61601252, 61801254), the Public Technology Projects of Zhejiang Province (LG-G18F020007), the Zhejiang Provincial Natural Science Foundation of China (LY20F020008, LY18F020011, LY20F010004), and the K.C. Wong Magna Fund in Ningbo University.
Abstract: With the emergence of diverse applications in data centers, the demands on quality of service in data centers have also become diverse, such as high throughput for elephant flows and low latency for deadline-sensitive flows. However, traditional TCPs are ill-suited to such situations and often result in inefficient data transfers (e.g., missed flow deadlines and throughput collapse). This further degrades the user-perceived quality of service (QoS) in data centers. To reduce the flow completion time of mice and deadline-sensitive flows while promoting the throughput of elephant flows, this paper proposes an efficient and deadline-aware priority-driven congestion control (PCC) protocol, which grants mice and deadline-sensitive flows the highest priority. Specifically, PCC computes the priority of different flows according to the size of the transmitted data, the remaining data volume, and the flows’ deadlines. PCC then adjusts the congestion window according to the flow priority and the degree of network congestion. Furthermore, switches in data centers control the input and output of packets based on the flow priority and the queue length. Unlike existing TCPs, PCC provides an effective method to compute and encode the flow priority explicitly in order to speed up the data transfers of mice and deadline-sensitive flows. According to the flow priority, switches can manage packets efficiently and ensure the data transfers of high-priority flows through weighted priority scheduling with minor modification. The experimental results show that PCC can improve the data transfer performance of mice and deadline-sensitive flows while guaranteeing the throughput of elephant flows.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2016YFB0402302 and the National Natural Science Foundation of China under Grant No. 91433206.
Abstract: Global data traffic is growing rapidly, and the demand for optoelectronic transceivers used in data centers (DCs) is increasing correspondingly. In this review, we first briefly introduce the development of optoelectronic transceivers in DCs, as well as the advantages of silicon photonic chips fabricated by the complementary metal oxide semiconductor process. We also summarize the research on the main components of silicon photonic transceivers. In particular, quantum dot lasers have shown great potential as light sources for silicon photonic integration, whether by bonding or by monolithic integration, thanks to their unique advantages over conventional quantum-well counterparts. Some of the solutions for high-speed optical interconnection in DCs are then discussed. Among them, wavelength division multiplexing and four-level pulse-amplitude modulation have been widely studied and applied. At present, the application of coherent optical communication technology has moved from the backbone network to the metro network, and then to DCs.
Abstract: New and emerging use cases, such as the interconnection of geographically distributed data centers (DCs), are drawing attention to the requirement for dynamic end-to-end service provisioning spanning multiple, heterogeneous optical network domains. This heterogeneity is due not only to the diverse data transmission and switching technologies, but also to the different options for control plane techniques. In light of this, the problem of heterogeneous control plane interworking needs to be solved, and in particular, the solution must address the specific issues of multi-domain networks, such as limited domain topology visibility, given the scalability and confidentiality constraints. In this article, some recent activities regarding Software-Defined Networking (SDN) orchestration are reviewed to address this multi-domain control plane interworking problem. Specifically, three different models, namely the single SDN controller model, multiple SDN controllers in a mesh, and multiple SDN controllers in a hierarchical setting, are presented for the DC interconnection network with multiple SDN/OpenFlow domains or multiple OpenFlow/Generalized Multi-Protocol Label Switching (GMPLS) heterogeneous domains. In addition, two concrete implementations of the orchestration architectures are detailed, showing the overall feasibility and procedures of SDN orchestration for end-to-end service provisioning in multi-domain data center optical networks.
Funding: Supported by the National High Technology Research and Development Program of China under Grant No. 2015AA016902, the National Natural Science Foundation of China under Grant Nos. 61435013 and 61405188, and the K.C. Wong Education Foundation.
Abstract: An 8×10 GHz receiver optical sub-assembly (ROSA) consisting of an 8-channel arrayed waveguide grating (AWG) and an 8-channel PIN photodetector (PD) array is designed and fabricated based on silica hybrid integration technology. Multimode output waveguides in the silica AWG with a 2% refractive index difference are used to obtain flat-top spectra. The output waveguide facet is polished to a 45° bevel to redirect the light into the mesa-type PIN PD, which simplifies the packaging process. The experimental results show that the single-channel 1 dB bandwidth of the AWG ranges from 2.12 nm to 3.06 nm, the ROSA responsivity ranges from 0.097 A/W to 0.158 A/W, and the 3 dB bandwidth is up to 11 GHz. The device is promising for application in eight-lane WDM transmission systems for data center interconnection.
Funding: This research was partially supported by the National Grand Fundamental Research 973 Program of China (No. 2013CB329103), the Natural Science Foundation of China (No. 61271171), the Fundamental Research Funds for the Central Universities (ZYGX2013J002, ZYGX2012J004, ZYGX2010J002, ZYGX2010J009), and the Guangdong Science and Technology Project (2012B090500003, 2012B091000163, 2012556031).
Abstract: Virtualization is a common technology for resource sharing in data centers. To make efficient use of data center resources, the key challenge is to map customer demands (modeled as a virtual data center, VDC) onto the physical data center effectively. In this paper, we focus on this problem. Unlike previous works, our study of the VDC embedding problem assumes that switch resources are the bottleneck of data center networks (DCNs). To this end, we propose a relative cost to evaluate embedding strategies and, based on the properties of the fat-tree, decouple the embedding problem into VM placement with marginal resource assignment and virtual link mapping with decided source-destination pairs. We also design the traffic aware embedding algorithm (TAE) and first fit virtual link mapping (FFLM) to map virtual data center requests onto a physical data center. Simulation results show that TAE+FFLM can increase the acceptance rate and reduce the network cost (by about 49% in the case studied) at the same time. The traffic aware embedding algorithm reduces the load of core-link traffic and creates an optimization opportunity for data center network energy conservation.
Funding: Supported by the Major State Basic Research Program of China (973 project Nos. 2013CB329301 and 2010CB327806), the Natural Science Fund of China (NSFC project Nos. 61372085, 61032003, 61271165 and 61202379), the Research Fund for the Doctoral Program of Higher Education of China (RFDP project Nos. 20120185110025, 20120185110030 and 20120032120041), and the Tianjin Key Laboratory of Cognitive Computing and Application, School of Computer Science and Technology, Tianjin University, Tianjin, P. R. China.
Abstract: We consider differentiated time-critical task scheduling in an N×N input-queued optical packet switch to ensure 100% throughput and meet the different delay requirements among various modules of a data center. Existing schemes either consider slot-by-slot scheduling with queue depth serving as the delay metric or assume that each input-output connection has the same delay bound in the batch scheduling mode. The former neglects the effect of reconfiguration overhead, which may cripple system performance, while the latter cannot satisfy users' differentiated Quality of Service (QoS) requirements. To remedy these deficiencies, we propose a new batch scheduling scheme to meet the various port-to-port delay requirements in a best-effort manner. Moreover, a speedup is considered to compensate for both the reconfiguration overhead and the unavoidable slot wastage in the switch fabric. Given the traffic matrix and the delay constraint matrix, this paper proposes two heuristic algorithms, Stringent Delay First (SDF) and m-order SDF (m-SDF), to realize 100% packet switching while maximizing the delay constraint satisfaction ratio. The performance of our scheme is verified by extensive numerical simulations.
Funding: Supported in part by the National Key Research and Development Program of China (2019YFB2103200), NSFC (61672108), the Open Subject Funds of the Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory (SKX182010049), the Fundamental Research Funds for the Central Universities (5004193192019PTB-019), and the Industrial Internet Innovation and Development Project 2018 of China.
Abstract: The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services. Data centers typically contain a large number of compute and storage nodes, which may fail and affect the quality of service. Failure prediction is an important means of ensuring service availability. Predicting node failure in cloud-based data centers is challenging because the failure symptoms have complex characteristics and the distribution imbalance between failure samples and normal samples is widespread, resulting in inaccurate failure prediction. Targeting these challenges, this paper proposes a novel failure prediction method, FP-STE (Failure Prediction based on Spatio-temporal Feature Extraction). First, an improved recurrent neural network, HW-GRU (Improved GRU based on HighWay network), and a convolutional neural network (CNN) are used to extract the temporal and spatial features of multivariate data, respectively, to increase the discrimination between different types of failure symptoms, which improves the accuracy of prediction. Then the intermediate results of the two models are added as features into SCS-XGBoost to predict the possibility and the precise type of future node failure. SCS-XGBoost is an ensemble learning model improved by an integrated strategy of oversampling and cost-sensitive learning. Experimental results based on real data sets confirm the effectiveness and superiority of FP-STE.