Funding: Supported by the Open Project of Xiangjiang Laboratory (22XJ02003), the Scientific Project of the National University of Defense Technology (NUDT) (ZK21-07, 23-ZZCX-JDZ-28), the National Science Fund for Outstanding Young Scholars (62122093), and the National Natural Science Foundation of China (72071205).
Abstract: Traditional expert-designed branching rules in branch-and-bound (B&B) are static and often fail to adapt to diverse and evolving problem instances. Crafting these rules is labor-intensive and may not scale well to complex problems. Given the frequent need to solve varied combinatorial optimization problems, leveraging statistical learning to auto-tune B&B algorithms for specific problem classes becomes attractive. This paper proposes a graph pointer network model to learn branching rules. Graph features, global features, and historical features are designed to represent the solver state. The graph neural network processes the graph features, while the pointer mechanism assimilates the global and historical features to determine the variable on which to branch. The model is trained to imitate the expert strong branching rule using a tailored top-k Kullback-Leibler divergence loss function. Experiments on a series of benchmark problems demonstrate that the proposed approach significantly outperforms the widely used expert-designed branching rules. It also outperforms state-of-the-art machine-learning-based branch-and-bound methods in terms of solving speed and search tree size on all the test instances. In addition, the model can generalize to unseen instances and scale to larger instances.
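The abstract does not define the top-k Kullback-Leibler divergence loss, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes the loss keeps the k candidate variables that the expert scores highest, turns the expert scores and the model logits for those candidates into distributions with a softmax, and measures KL(expert || model). The names `top_k_kl_loss`, `expert_scores`, and `model_logits` are hypothetical.

```python
import math


def log_softmax(values):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(values)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in values))
    return [v - log_norm for v in values]


def top_k_kl_loss(expert_scores, model_logits, k):
    """KL(expert || model) restricted to the expert's k best candidates.

    Assumption (not from the source): both score vectors are indexed by
    candidate variable, and a softmax converts scores to probabilities.
    """
    if len(expert_scores) != len(model_logits):
        raise ValueError("score vectors must have the same length")
    if not 1 <= k <= len(expert_scores):
        raise ValueError("k must be between 1 and the number of candidates")

    # Indices of the k candidates the expert ranks highest.
    top = sorted(range(len(expert_scores)),
                 key=lambda i: expert_scores[i], reverse=True)[:k]

    log_p = log_softmax([expert_scores[i] for i in top])  # expert target
    log_q = log_softmax([model_logits[i] for i in top])   # model prediction

    # KL(p || q) = sum_i p_i * (log p_i - log q_i)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
```

Under this reading the loss is zero when the model reproduces the expert's distribution over the top-k candidates and positive otherwise; candidates outside the top k do not contribute.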
Funding: Supported by the Open Project of Xiangjiang Laboratory (No. 22XJ02003) and the National Natural Science Foundation of China (No. 62122093).
Abstract: Time series clustering is a challenging problem due to the large-volume, high-dimensional, and warping characteristics of time series data. Traditional clustering methods often use a single criterion or distance measure, which may not capture all the features of the data. This paper proposes a novel method for time series clustering based on evolutionary multi-tasking optimization, termed i-MFEA, which uses an improved multifactorial evolutionary algorithm to optimize multiple clustering tasks simultaneously, each with a different validity index or distance measure. As a result, i-MFEA can produce diverse and robust clustering solutions that satisfy various preferences of decision-makers. Experiments on two artificial datasets show that i-MFEA outperforms single-objective evolutionary algorithms and traditional clustering methods in terms of convergence speed and clustering quality. The paper also discusses how i-MFEA can address two long-standing issues in time series clustering: the choice of an appropriate similarity measure and the number of clusters.
Funding and acknowledgements: This work was partially supported by the National High-tech R&D Program of China (863 Program) (2012AA01A301) and the National Natural Science Foundation of China (Grant No. 61120106005). The MilkyWay-2 project is a great team effort and benefits from the cooperation of many individuals at NUDT. We thank all the people who have contributed to the system in a variety of ways.
Abstract: On June 17, 2013, the MilkyWay-2 (Tianhe-2) supercomputer was crowned the fastest supercomputer in the world on the 41st TOP500 list. This paper provides an overview of the MilkyWay-2 project and describes the design of its hardware and software systems. The key architectural features of MilkyWay-2 are highlighted, including neo-heterogeneous compute nodes integrating commodity off-the-shelf processors and accelerators that share a similar instruction set architecture, powerful networks that employ proprietary interconnection chips to support massively parallel message-passing communications, a proprietary 16-core processor designed for scientific computing, and efficient software stacks that provide a high-performance file system, an emerging programming model for heterogeneous systems, and intelligent system administration. We perform an extensive evaluation with wide-ranging applications, from the LINPACK and Graph500 benchmarks to massively parallel software deployed in the system.
Abstract: Grid computing is the combination of computer resources in a loosely coupled, heterogeneous, and geographically dispersed environment. Grid data are the data used in grid computing, which consists of large-scale data-intensive applications that produce and consume huge amounts of data distributed across a large number of machines. Data grid computing comprises sets of independent tasks, each of which requires massive distributed data sets that may each be replicated on different resources. To reduce the completion time of the application and improve the performance of the grid, appropriate computing resources should be selected to execute the tasks and appropriate storage resources selected to serve the files required by the tasks. The problem can therefore be broken into two sub-problems: selection of storage resources and assignment of tasks to computing resources. This paper proposes a scheduler that is broken into three parts that can run in parallel and uses both parallel tabu search and a parallel genetic algorithm. Finally, the proposed algorithm is evaluated by comparing it with other related algorithms that target minimizing makespan. Simulation results show that the proposed approach can be a good choice for scheduling large data grid applications.
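The makespan objective mentioned above can be illustrated with a short sketch. It is not the paper's scheduler: it only evaluates a given assignment of independent tasks to computing resources, and it deliberately ignores the file-transfer cost from storage resources that the abstract treats as a separate sub-problem. The function name `makespan` and its parameters are hypothetical.

```python
def makespan(task_durations, assignment, num_machines):
    """Completion time of the busiest machine for a given assignment.

    task_durations[i] is the run time of task i and assignment[i] is the
    index of the machine that executes it. Simplifying assumption: tasks
    are independent and data-transfer time is not modelled.
    """
    if len(task_durations) != len(assignment):
        raise ValueError("every task needs exactly one assigned machine")

    loads = [0.0] * num_machines
    for duration, machine in zip(task_durations, assignment):
        loads[machine] += duration  # tasks on one machine run back to back

    return max(loads)
```

A search procedure such as tabu search or a genetic algorithm would call an evaluation like this on each candidate assignment and prefer assignments with a smaller value.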
Abstract: On the 41st TOP500 list, announced in June 2013, the MilkyWay-2 system produced by the National University of Defense Technology (NUDT) in China won first place with a LINPACK test result of 33.86 PFLOPS. It has been one and a half years since its predecessor, MilkyWay-1 (TH-1), reached the same place for the first time. On the newest TOP500 list, published in November 2013, MilkyWay-2 retained first place.
Funding: Project supported by the National Natural Science Foundation of China (No. 61806216).
Abstract: A script is a structured knowledge representation of prototypical real-life event sequences. Learning the commonsense knowledge inside scripts can help machines understand natural language and draw commonsense inferences. Script learning is an interesting and promising research direction, in which a trained script learning system can process narrative texts to capture script knowledge and draw inferences. However, there are currently no survey articles on script learning, so we provide this comprehensive survey to investigate in depth the standard framework and the major research topics in script learning. This research field contains three main topics: event representations, script learning models, and evaluation approaches. For each topic, we systematically summarize and categorize the existing script learning systems, and carefully analyze and compare the advantages and disadvantages of the representative systems. We also discuss the current state of the research and possible future directions.
Abstract: In the second half of 2014, the Manchester Institute of Biotechnology, based in Manchester (UK), hosted the first SupraBiology congress, an event attended by representatives of different academic institutions and industry based in both the UK and China. The congress aimed to serve as a platform to discuss and promote potential collaborations between the UK and China on the subject of Systems Biology and High Performance Computing. The event, sponsored by the "BBSRC China Partnering Awards" and ISBE, was organised as a sequence of talks addressing the different aspects of Systems Biology that can benefit from High Performance Computing. A general discussion session followed, in which the scientific, technical, and logistic aspects of the prospective UK-China collaborations were examined.