Journal literature
10,903 articles found
Research on Value Evaluation Method of Investment Project Based on Fuzzy Composite Real Options
1
Author: Huanyu Li. 《Economics World》, 2024, Issue 1, pp. 24-34 (11 pages)
Venture capital investments are characterized by high input, high yield, and high risk. Due to the complexity of the market environment, stage-by-stage investment is becoming increasingly important. Traditional evaluation methods like comparison, proportion, maturity, internal rate of return, scenario analysis, decision trees, and net present value cannot fully consider the uncertainty and stage characteristics of the project. The fuzzy real options method addresses this by combining real option theory, fuzzy number theory, and composite option theory to provide a more accurate and objective evaluation of Public-Private Partnership (PPP) projects. It effectively considers the interaction of options and the ambiguity of project parameters, making it a valuable tool for project evaluation in the context of venture capital investment.
Keywords: real option fuzzy method Geske composite option
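Option-Critic Algorithm with Mutual Information Optimization
2
Authors: 栗军伟, 刘全, 徐亚鹏. 《计算机科学》 (Computer Science), CSCD, PKU Core, 2024, Issue 2, pp. 252-258 (7 pages)
Temporal abstraction, an important research topic in hierarchical reinforcement learning, allows a hierarchical reinforcement learning agent to learn policies at different time scales and can effectively address the sparse-reward problem that deep reinforcement learning struggles with. How to learn good temporal abstraction policies end to end has long been a challenge for hierarchical reinforcement learning research. Building on the Option framework, the Option-Critic (OC) framework can effectively address this problem through policy gradient theory. However, during policy learning, the OC framework suffers from a degeneration problem in which the action distributions of the Options' internal policies become very similar. This degeneration harms the experimental performance of the OC framework and makes the Options less interpretable. To solve these problems, mutual information is introduced as an intrinsic reward, and the Option-Critic Algorithm with Mutual Information Optimization (MIOOC) is proposed. The MIOOC algorithm is combined with the Proximal Policy Option-Critic (PPOC) algorithm and can ensure the diversity of the lower-level policies. To verify the effectiveness of the algorithm, MIOOC is compared with several common reinforcement learning methods in continuous experimental environments. The experimental results show that MIOOC speeds up model learning, achieves better experimental performance, and yields more distinguishable internal Option policies.
Keywords: deep reinforcement learning; temporal abstraction; hierarchical reinforcement learning; mutual information; intrinsic reward; option diversity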
Modeling the Spatio-Temporal Dynamics of Local Context for a Contextualized Diffusion of Agroecological Intensification Options in Niger
3
Authors: Nouhou Salifou Jangorzo, Maud Loireau, Abou-Soufianou Sadda, Ousmane Sami Mari, Abdoul-Aziz Saïdou, Hassane Bil-Assanou Issoufou. 《International Journal of Geosciences》, CAS, 2024, Issue 3, pp. 270-301 (32 pages)
Spatio-temporal variability and dynamics in Sahelian agro-pastoral zones make each local situation a special case. These specificities must be considered to guide the dissemination of agricultural options with a view to sustainable development. The territorial scale of municipalities is not sufficient for this necessary contextualization; the scale of the “village terroir” seems to be a better option. This is the hypothesis we put forward in the framework of the Global Collaboration for Resilient Food Systems program (CRFS), i.e. local context is spatially defined by village terroir. The study is based on data collected through participatory mapping and surveys in “village terroirs” in three regions of Niger (Maradi, Dosso and Tillabéri). Then the links between farm managers and their cultivated land, as well as the spatio-temporal dynamics of local context, are analyzed. This study provides evidence of the existence and functional usefulness of the village terroir for farmers, their land management and their activities. It demonstrates the usefulness of contextualizing agricultural options at this scale. Their analysis elucidates the links between “village terroirs” and the specific functioning of the agro-socio-ecosystems acting on each of them, thus laying the systemic and geographical foundations for a model of the spatio-temporal dynamics of “village terroirs”. This initial work has opened up new perspectives in modeling and sustainable development.
Keywords: NIGER option by Context Local Condition Complex System Multiscale Conceptual Modeling
Upside and downside correlated jump risk premia of currency options and expected returns
4
Authors: Jie-Cao He, Hsing-Hua Chang, Ting-Fu Chen, Shih-Kuei Lin. 《Financial Innovation》, 2023, Issue 1, pp. 2267-2324 (58 pages)
This research explores upside and downside jumps in the dynamic processes of three rates: domestic interest rates, foreign interest rates, and exchange rates. To fill the gap between the asymmetric jump in the currency market and the current models, a correlated asymmetric jump model is proposed to capture the co-movement of the correlated jump risks for the three rates and identify the correlated jump risk premia. The likelihood ratio test results show that the new model performs best in 1-, 3-, 6-, and 12-month maturities. The in- and out-of-sample test results indicate that the new model can capture more risk factors with relatively small pricing errors. Finally, the risk factors captured by the new model can explain the exchange rate fluctuations for various economic events.
Keywords: Jump-diffusion process Currency option Risk premia Correlated jumps
Intelligent option portfolio model with perspective of shadow price and risk‑free profit
5
Authors: Fengmin Xu, Jieao Ma. 《Financial Innovation》, 2023, Issue 1, pp. 2137-2164 (28 pages)
Since Markowitz proposed modern portfolio theory, portfolio optimization has been a classic topic in financial engineering. Although it is generally accepted that options help to improve the market, there is still room for improvement in the portrayal of their unique properties in portfolio problems. In this paper, an intelligent option portfolio model is developed that allows selling options contracts to earn option fees and considers the high leverage of options in the market. Deep learning methods are used to predict the forward price of the underlying asset, making the model smarter. It can find an optimal option portfolio that maximizes the final wealth among the call and put options with multiple strike prices. We use the duality theory to analyze the marginal contribution of initial assets, risk tolerance limit, and portfolio leverage limit for the final wealth. The leverage limit of the option portfolio has a significant impact on the return. To satisfy the investors with different risk preferences, we also give the conditions for the option portfolio to gain a risk-free return and replace the Conditional Value-at-Risk. Numerical experiments demonstrate that the intelligent option portfolio model obtains a satisfactory out-of-sample return, which is significantly positively correlated with the volatility of the underlying asset and negatively correlated with the forecast error of the forward price. The risk-free option model is effective in achieving the goal of no drawdown and gaining satisfactory returns. Investors can adjust the balance point between returns and risks according to their risk preference.
Keywords: option portfolio Linear programming Deep learning Risk appetite
Decarbonization options of the iron and steelmaking industry based on a three-dimensional analysis
6
Authors: Xin Lu, Weijian Tian, Hui Li, Xinjian Li, Kui Quan, Hao Bai. 《International Journal of Minerals, Metallurgy and Materials》, SCIE, EI, CAS, CSCD, 2023, Issue 2, pp. 388-400 (13 pages)
Decarbonization is a critical issue for peaking CO₂ emissions of energy-intensive industries, such as the iron and steel industry. The decarbonization options of China’s ironmaking and steelmaking sector were discussed based on a systematic three-dimensional low-carbon analysis from the aspects of resource utilization (Y), energy utilization (Q), and energy cleanliness, which is evaluated by a process general emission factor (PGEF), on all the related processes, including the current blast furnace (BF)-basic oxygen furnace (BOF) integrated process and the specific sub-processes, as well as the electric arc furnace (EAF) process, typical direct reduction (DR) process, and smelting reduction (SR) process. The study indicates that the three-dimensional aspects, particularly the energy structure, should be comprehensively considered to quantitatively evaluate the decarbonization road map based on novel technologies or processes. Promoting scrap utilization (improvement of Y) and the substitution of carbon-based energy (improvement of PGEF) in particular is critical. In terms of process scale, promoting the development of the scrap-based EAF or DR-EAF process is highly encouraged because of their lower PGEF. The three-dimensional method is expected to extend to other processes or industries, such as the cement production and thermal electricity generation industries.
Keywords: peak CO₂ emission low carbon management decarbonization option energy-intensity industry ironmaking and steelmaking
A novel stochastic modeling framework for coal production and logistics through options pricing analysis
7
Authors: Mesias Alfeus, James Collins. 《Financial Innovation》, 2023, Issue 1, pp. 1430-1448 (19 pages)
We propose a novel stochastic modeling framework for coal production and logistics using option pricing theory. The problem of valuing the inherent real optionality a coal producer has when mining and processing thermal coal is modelled as pricing spread options of three assets under the stochastic volatility model. We derive a three-dimensional Fast Fourier Transform (“FFT”) lower bound approximation to value the inherent real optionality and, as a robustness check, we compare the semi-analytical pricing accuracy with the Monte Carlo simulation. Model parameters are estimated from the historical monthly data, and stochastic volatility parameters are obtained by matching the kurtosis of the low-ash diff data to the kurtosis of the stochastic volatility process, which is assumed to follow the Cox–Ingersoll–Ross (“CIR”) model.
Keywords: Stochastic volatility Real option analysis Fast Fourier transform method COAL Monte-Carlo Closed-form solution
Valuing options to renew at future market value:the case of commercial property leases
8
Authors: Jenny Jing Wang, Jianfu Shen, Frederik Pretorius. 《Financial Innovation》, 2023, Issue 1, pp. 1932-1966 (35 pages)
In this study, we develop and empirically test a valuation model for a commonly encountered option in office leases: a tenant’s option to renew at future market rent (a fair market value) with lease termination as the maturity date. The model integrates decision analysis with real options analysis and market risk with private risks. “Option value” is defined as the private value of the option to either party pre-contract, while “option price” assumes a fair agreement between transacting parties and can be positive (rental premium paid) or negative (rental discount offered). Without manifest expectations, an analysis of a sample of office leases supports the model’s logic with price estimates in a practical range. The tenants’ option price/value is shown to have a negative relationship with the original/renewal lease term; conversely, the landlords’ option value is positively related to the original/renewal term. Comparative analyses show that transaction costs have a positive effect on tenants’ option value and on prices, while vacancy costs and the vacancy period are both positively related to the landlords’ option value and negatively related to price. Market rent is found to have a negative relationship with option price. Overall, this study provides a theoretical analysis and empirical tests of the value of a real option that allows option holders to renew/extend their contracts at a fair market value.
Keywords: Fair market value renewal Commercial property leases Real option VALUATION Integrated method
Investment Promotion for Development Zones in China:Underlying Rationale and Policy Options
9
Authors: Chen Qiangyuan, Zhao Haoyun, Ye Yang. 《China Economist》, 2023, Issue 5, pp. 98-123 (26 pages)
Development zones (DZs) have emerged as a significant policy initiative for promoting regional coordination and facilitating resource allocation. They serve as an organizational framework for fostering industrial agglomeration and driving high-quality development. DZs attract and accommodate resource factors, firms, and projects, thereby functioning as a central catalyst for economic growth. This study utilizes data collected at the “DZ, City and Country” levels through manual compilation, textual analysis, and innovation measurement. It aims to empirically examine the theoretical rationale and practical preferences for promoting business and investment in China’s DZs. This study considers several factors such as industry attribute, firm attribute, agglomeration theory, and industrial chain layout. Based on our research findings, DZs exhibit distinct preferences. First, industry attribute: DZs align with both national and regional strategic planning and adhere to the industrial endowments of the respective areas. Second, firm attribute: DZs prioritize attracting firms that are productive and innovative, and have an international presence, rather than those that primarily contribute to taxes and job creation. Third, DZs are guided by the agglomeration theory, which suggests that they prefer firms that generate strong agglomeration externalities. Lastly, DZs also consider the industrial chain layout, aiming to attract firms that not only align with their existing industrial strengths but also extend to the upstream and downstream supply chain links. These conclusions are substantiated by robustness tests. The success of DZs in China can be attributed to five key principles: adherence to national and regional strategic planning, prioritizing the actual industrial foundation, incorporating the theory of agglomeration externalities, strengthening corporate competitiveness, and expanding industrial chains.
Keywords: Investment promotion by development zones basic rationale policy options agglomeration externalities spatial allocation of resources
Pricing European Options Based on a Logarithmic Truncated t-Distribution
10
Authors: Yingying Cao, Xueping Liu, Yiqian Zhao, Xuege Han. 《Journal of Applied Mathematics and Physics》, 2023, Issue 5, pp. 1349-1358 (10 pages)
The t-distribution has a “fat tail” feature, which is more suitable than the normal probability density function to describe the distribution characteristics of return on assets. The difficulty of using t-distribution to price European options is that a fat tail can lead to a deviation in one integral required for option pricing. We use a distribution called logarithmic truncated t-distribution to price European options. A risk neutral valuation method was used to obtain a European option pricing model with logarithmic truncated t-distribution.
Keywords: option Pricing Logarithmic Truncated t-Distribution Asset Returns Risk-Neutral Valuation Approach
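For comparison with the abstract above, here is a minimal sketch of risk-neutral valuation in the standard Black-Scholes setting, where log-returns are assumed normal. It is not the paper's logarithmic truncated t-distribution model, which the abstract does not specify in enough detail to reproduce. The function names `norm_cdf` and `bs_price` and the sample parameters are illustrative assumptions.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(spot: float, strike: float, rate: float, vol: float,
             maturity: float, kind: str = "call") -> float:
    """Black-Scholes price of a European option (normal log-returns assumed).

    This is the discounted risk-neutral expectation of the payoff when the
    terminal price is lognormal; it is a baseline, not the truncated-t model.
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * maturity)
    if kind == "call":
        return spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    if kind == "put":
        return strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    raise ValueError("kind must be 'call' or 'put'")


if __name__ == "__main__":
    # Hypothetical inputs: spot 100, strike 100, 5% rate, 20% volatility, 1 year.
    print(round(bs_price(100.0, 100.0, 0.05, 0.20, 1.0, "call"), 4))
```

With these hypothetical inputs the call is worth roughly 10.45. A fat-tailed return model would replace the normal CDF terms with integrals against the chosen density, which is where the integrability issue mentioned in the abstract arises.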
The Contribution Margin due to a Limiting Factor in the Presence of Several Sales Options: Actuality Is Not Always As It Appears at the Beginning of the Analysis
11
Author: Maria Silvia Avi. 《Journal of Modern Accounting and Auditing》, 2023, Issue 1, pp. 1-22 (22 pages)
The analysis of company data useful for economic decisions, if not interpreted in an overall view of the company situation, can lead to wrong conclusions. This is the case when a company has to choose between several sales options for one or more products in the presence of a limiting factor. The continuation of the investigation often contradicts the initial analysis. Not everything is as it appears, therefore, at the beginning of the deepening of the data useful for economic decisions. As is well known, the choices of profitability concerning the planning of the sale of company products take place, at least in the majority of cases, through the determination of the contribution margin, i.e. the profitability margin connected to the individual goods/services sold by the companies (selling price net of variable costs). The contribution margin can be determined with four objectives: (1) Determination of the yield of the single product, net of variable costs only. In this case, the margin defines unitary, from net product yield to unitary contribution margin. (2) Determination of the return on total sales of an individual product, net of variable costs. In this hypothesis, reference is made to the first level (or gross) contribution margin. (3) Determination of the ability of the individual product to contribute to the coverage of fixed costs common to the company. This margin is determined net of special product variable and fixed costs. This aggregate is defined as a Level II (or semi-gross) margin. (4) Determination of the useful value in the planning choices in case of presence of scarce productive factors. In this case, it must identify the so-called unitary margin for low factor. Here we will only deal with the problem of the use of the contribution margin in the presence of rare factors. To complete the analysis, below are some very brief considerations regarding, respectively, the unitary, level I, and level II contribution margin in order to better understand where the problem of the most convenient choice of income is located in the event of the presence of rare production factors, especially in an environment characterized by a plurality of sales options.
Keywords: contribution margin unit contribution margin first level contribution margin second level contribution margin Unit Scarce factor contribution margin Unit Scarce factor contribution margin in the presence of a plurality of sales options profit
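The point of the abstract above, that the option with the highest unit contribution margin is not necessarily the best one when a production factor is scarce, can be illustrated with a small worked example. This is a sketch under stated assumptions, not the article's own procedure: the `SalesOption` class, the `plan` function, and all figures are hypothetical, and the greedy ranking by margin per unit of scarce factor is only optimal when there is a single limiting factor and quantities are divisible.

```python
from dataclasses import dataclass


@dataclass
class SalesOption:
    name: str
    price: float            # selling price per unit
    variable_cost: float    # variable cost per unit
    factor_per_unit: float  # units of the scarce factor used per unit sold
    max_demand: float       # maximum quantity the market absorbs


def unit_margin(option: SalesOption) -> float:
    """Unit contribution margin: selling price net of variable costs."""
    return option.price - option.variable_cost


def margin_per_factor(option: SalesOption) -> float:
    """Contribution margin per unit of the scarce factor."""
    return unit_margin(option) / option.factor_per_unit


def plan(options: list[SalesOption], capacity: float) -> dict[str, float]:
    """Allocate the scarce factor to options ranked by margin per factor unit."""
    remaining = capacity
    quantities: dict[str, float] = {}
    for option in sorted(options, key=margin_per_factor, reverse=True):
        quantity = min(option.max_demand, remaining / option.factor_per_unit)
        if quantity > 0:
            quantities[option.name] = quantity
            remaining -= quantity * option.factor_per_unit
    return quantities


if __name__ == "__main__":
    # Hypothetical data: A has the higher unit margin (40 vs 20),
    # but B earns more per machine hour (20 vs 10).
    a = SalesOption("A", price=100.0, variable_cost=60.0, factor_per_unit=4.0, max_demand=100.0)
    b = SalesOption("B", price=50.0, variable_cost=30.0, factor_per_unit=1.0, max_demand=200.0)
    print(plan([a, b], capacity=400.0))
```

In this hypothetical case, filling demand for A first would use all 400 hours and yield a total margin of 4,000, while ranking by margin per hour sells 200 units of B and 50 of A for a total of 6,000.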
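An Option-Oriented k-Clustering Algorithm for Subgoal Discovery (cited by: 8)
12
Authors: 王本年, 高阳, 陈兆乾, 谢俊元, 陈世福. 《计算机研究与发展》 (Journal of Computer Research and Development), EI, CSCD, PKU Core, 2006, Issue 5, pp. 851-855 (5 pages)
Automatically discovering useful Subgoals and creating Options during learning is of great importance for improving the performance of reinforcement learning. A k-clustering-based algorithm for automatic Subgoal discovery is proposed; it extracts Subgoals by clustering a small amount of trajectory data collected online. Experiments show that the algorithm effectively finds all Subgoals that meet the requirements and that, compared with Q-learning and a diversity-density-based reinforcement learning algorithm, a reinforcement learning algorithm that uses it to discover Subgoals and create Options effectively increases the Agent's learning speed.
Keywords: hierarchical reinforcement learning; option; subgoal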
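Online Hierarchical Reinforcement Learning Method Based on Interruptible Options (cited by: 4)
13
Authors: 朱斐, 许志鹏, 刘全, 伏玉琛, 王辉. 《通信学报》 (Journal on Communications), EI, CSCD, PKU Core, 2016, Issue 6, pp. 65-74 (10 pages)
To address the large volume of big data, an online-updating Macro-Q algorithm (MQIU) is proposed on the basis of the Macro-Q algorithm; it updates the value functions of abstract actions and of primitive actions simultaneously, improving the utilization of data samples. Because both the traditional Markov process model and abstract actions have difficulty coping with variability, an interruption mechanism is introduced and a model-free Macro-Q learning algorithm with interruptible abstract actions (IMQ) is proposed, which can learn and improve control policies in dynamic environments. Simulation results verify that MQIU accelerates convergence and can therefore solve larger-scale problems, and that IMQ speeds up task solving while keeping learning performance stable.
Keywords: big data; reinforcement learning; hierarchical reinforcement learning; option; online learning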
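As background for the abstract above, the following sketch shows the textbook SMDP Q-learning update commonly associated with Macro-Q, in which an option that ran for k steps is credited with its discounted cumulative reward and bootstraps with a discount of gamma to the power k. It is not the MQIU or IMQ algorithm from the article; the function name `smdp_q_update`, the table layout, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict


def smdp_q_update(q, state, option, rewards, next_state, options,
                  alpha=0.1, gamma=0.9):
    """One SMDP Q-learning update for an option that ran len(rewards) steps.

    q          : mapping from (state, option) to estimated value
    rewards    : per-step rewards collected while the option executed
    next_state : state in which the option terminated
    options    : options available in next_state
    """
    k = len(rewards)
    # Discounted return accumulated during the option's execution.
    cumulative = sum((gamma ** i) * r for i, r in enumerate(rewards))
    # Bootstrap from the best option value in the termination state.
    best_next = max(q[(next_state, o)] for o in options)
    target = cumulative + (gamma ** k) * best_next
    q[(state, option)] += alpha * (target - q[(state, option)])
    return q[(state, option)]


if __name__ == "__main__":
    q = defaultdict(float)
    q[("s1", "a")] = 1.0
    # Hypothetical transition: option "go" ran 3 steps from s0 and ended in s1.
    print(smdp_q_update(q, "s0", "go", [0.0, 0.0, 1.0], "s1", ["go", "a"],
                        alpha=0.5, gamma=0.9))
```

A primitive action is the special case k = 1, so the same function reduces to ordinary one-step Q-learning when `rewards` holds a single value.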
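The Impact of Delayed Retirement on the Pension Income of Workers in China: Predictions Based on the Option Value Model (cited by: 24)
14
Authors: 林熙, 林义. 《人口与经济》 (Population & Economics), CSSCI, PKU Core, 2015, Issue 6, pp. 12-21 (10 pages)
The actuarial fairness of the pension insurance system is the economic foundation of delayed retirement. According to predictions from the Option Value model, under the current pension benefit formula, delayed retirement may cause significant economic losses for male workers and low-income workers. Raising the retirement age of female workers may also cause them economic losses under certain assumptions. China's pension insurance system therefore urgently needs adjustment to achieve actuarial fairness and lay the groundwork for a gradual increase in the retirement age.
Keywords: delayed retirement; pension insurance; Option Value model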
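An Automatic Option Generation Algorithm in Hierarchical Reinforcement Learning (cited by: 5)
15
Authors: 沈晶, 顾国昌, 刘海波. 《计算机工程与应用》 (Computer Engineering and Applications), CSCD, PKU Core, 2005, Issue 34, pp. 4-6, 15 (4 pages)
Hierarchical reinforcement learning currently has three main approaches, Option, HAM, and MAXQ, and none of them has an effective solution to the automatic hierarchy construction problem. For the first approach, this paper proposes an automatic Option generation algorithm. The algorithm takes as input the state space explored by the Agent in the initial stage of learning, clusters it using artificial immune network techniques, and learns a set of internal policies on each resulting state subset through experience replay, thereby generating Options. Simulation experiments verify the effectiveness of the algorithm.
Keywords: hierarchical reinforcement learning; option; artificial immune network; experience replay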
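A Multi-Agent-Based Automatic Option Generation Algorithm (cited by: 2)
16
Authors: 沈晶, 顾国昌, 刘海波. 《智能系统学报》 (CAAI Transactions on Intelligent Systems), 2006, Issue 1, pp. 84-87 (4 pages)
Automatic task decomposition in hierarchical reinforcement learning currently relies on serial learning algorithms based on a single agent. To address the slow learning speed of serial algorithms, a multi-agent-based automatic Option generation algorithm is proposed, using Sutton's Option method for hierarchical reinforcement learning as the basic framework. In this algorithm, multiple agents cooperate to explore the state space in parallel, aiNet is applied centrally to perform immune clustering and produce state subspaces, the internal policies on each subspace are then learned in parallel, and Options are finally generated. The algorithm is presented with shortest-path planning between two points in a two-dimensional grid space with obstacles as the task, and simulation experiments and analysis are carried out. The results show that the multi-agent-based automatic Option generation algorithm is clearly faster than the single-agent-based algorithm.
Keywords: hierarchical reinforcement learning; automatic hierarchy construction; multi-agent system; option; aiNet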
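Optimization of a Bilateral Multi-Issue Negotiation Model Based on Concurrent Options (cited by: 2)
17
Authors: 彭志平, 彭宏. 《华南理工大学学报(自然科学版)》 (Journal of South China University of Technology, Natural Science Edition), EI, CAS, CSCD, PKU Core, 2007, Issue 9, pp. 95-100 (6 pages)
To address deadlock in bilateral multi-issue negotiation, a method that uses concurrent Options to optimize the negotiation model is proposed. Without reducing the utility of the bilateral negotiation, the method dynamically optimizes, in parallel, the reservation values of multiple issues related to the deadlocked issue. Experimental results in e-commerce show that the concurrent-Options-based optimization method is effective, and that it clearly outperforms optimization methods based on standard Options and on Q-learning in learning speed as well as in the degree of optimization and the generalization ability of the best strategy.
Keywords: negotiation model; negotiation deadlock; optimization; concurrent options; reinforcement learning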
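Implementation of DHCP Based on Option82 Technology in Large-Scale Networks (cited by: 1)
18
Authors: 肖阳, 李阳, 段辉良. 《中南林业科技大学学报》 (Journal of Central South University of Forestry & Technology), CAS, CSCD, PKU Core, 2008, Issue 5, pp. 140-142 (3 pages)
The Dynamic Host Configuration Protocol (DHCP) is currently the most widely used protocol in large-scale networks. It is mainly used to dynamically provide configuration parameters to hosts on the Internet: it delivers host-specific protocol configuration parameters from the DHCP server to the host and also automatically assigns network addresses to hosts. The paper discusses in detail the application of DHCP combined with the Option82 mechanism in large-scale networks.
Keywords: computer network; large-scale network; DHCP; option82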
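To make the Option82 mechanism mentioned above concrete, the sketch below decodes the payload of a DHCP Relay Agent Information option into its sub-options. In RFC 3046 each sub-option is a code byte, a length byte, and a value, with code 1 for the Agent Circuit ID and code 2 for the Agent Remote ID. The function name `parse_option82` and the sample bytes are hypothetical, and the article's own deployment details are not known from the abstract.

```python
def parse_option82(payload: bytes) -> dict[str, bytes]:
    """Decode the sub-options carried inside a DHCP option 82 payload.

    `payload` is the option's value only (without the outer option code
    and length bytes). Unknown sub-option codes are kept under a generic key.
    """
    names = {1: "circuit_id", 2: "remote_id"}  # sub-option codes from RFC 3046
    result: dict[str, bytes] = {}
    i = 0
    while i < len(payload):
        if i + 2 > len(payload):
            raise ValueError("truncated sub-option header")
        code, length = payload[i], payload[i + 1]
        value = payload[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated sub-option value")
        result[names.get(code, f"suboption_{code}")] = value
        i += 2 + length
    return result


if __name__ == "__main__":
    # Hypothetical payload: a 4-byte circuit ID and a 6-byte remote ID.
    sample = bytes([1, 4, 0, 1, 0, 5]) + bytes([2, 6]) + bytes.fromhex("001122334455")
    for key, value in parse_option82(sample).items():
        print(key, value.hex())
```

A DHCP server can then key its address pools or policies on the decoded circuit ID and remote ID, which is how the relay's port and device identity typically influence address assignment.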
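Design of Campus Network User Permission Control Based on Option 82 and 802.1x (cited by: 1)
19
Authors: 吴江, 姜少杰, 冯雯, 廖蓉. 《微计算机信息》 (Microcomputer Information), 2010, Issue 6, pp. 102-104, 121 (4 pages)
This paper combines an existing authentication environment based on the IEEE 802.1x protocol with DHCP option 82 technology to control the network access permissions of campus network users, so that different categories of users hold IP addresses with different permissions before and after authentication. This allows campus network resources to be used most reasonably, avoids large-scale consumption of limited network resources, and gives administrators a simple and effective means of management.
Keywords: DHCP option 82; 802.1x; permission; campus network; authentication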
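Automatic Option Construction Based on Tabu Search
20
Authors: 徐明亮, 苏晓萍, 须文波. 《系统仿真学报》 (Journal of System Simulation), CAS, CSCD, PKU Core, 2009, Issue 23, pp. 7479-7482 (4 pages)
By setting tabu states in the environment, the agent can discover bottleneck states, and the adjacency relations among them, while interacting with the environment. Based on these adjacency relations, the agent automatically selects suitable bottleneck states from adjacent ones as option subgoals. The initiation set of each Option is obtained during the same interaction, so options are constructed automatically. Navigation experiments in a grid environment verify that the method constructs useful options automatically without human intervention, which both speeds up the agent's learning and facilitates knowledge transfer, accelerating the learning of related tasks.
Keywords: hierarchical reinforcement learning; option; subgoal; tabu search; Q-learning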