Mobile ad hoc networks have grown in prominence in recent years and are now used in a broader range of applications. The main challenges relate to the routing techniques generally employed in them. Mobile ad hoc system management, on the other hand, requires further testing and improvement in terms of security. Traditional routing protocols, such as Ad hoc On-Demand Distance Vector (AODV) and Dynamic Source Routing (DSR), use the hop count to calculate the distance between two nodes. The main aim of this research is to determine the optimum method for sending packets while also extending the lifetime of the network. This is achieved by changing the residual energy of each network node. In addition, this paper proposes various algorithms for optimal routing based on parameters such as energy, distance, mobility, and the pheromone value. Moreover, an approach based on a reward and penalty system is presented to evaluate the efficiency of the proposed algorithms under the influence of these parameters. The simulation results show that the reward-penalty-based approach is effective for selecting an optimal routing path when the algorithms are run under the parameters of interest, which helps reduce packet drop and the energy consumption of the nodes while enhancing network efficiency.
Goal-conditioned reinforcement learning (RL) is an interesting extension of the traditional RL framework, where the dynamic environment and reward sparsity can cause conventional learning algorithms to fail. Reward shaping is a practical approach to improving sample efficiency by embedding human domain knowledge into the learning process. Existing reward shaping methods for goal-conditioned RL are typically built on distance metrics with a linear and isotropic distribution, which may fail to provide sufficient information about the ever-changing environment with high complexity. This paper proposes a novel magnetic field-based reward shaping (MFRS) method for goal-conditioned RL tasks with a dynamic target and obstacles. Inspired by the physical properties of magnets, we consider the target and obstacles as permanent magnets and establish the reward function according to the intensity values of the magnetic field generated by these magnets. The nonlinear and anisotropic distribution of the magnetic field intensity can provide more accessible and conducive information about the optimization landscape, thus introducing a more sophisticated magnetic reward compared to the distance-based setting. Further, we transform our magnetic reward to the form of potential-based reward shaping by learning a secondary potential function concurrently to ensure the optimal policy invariance of our method. Experimental results in both simulated and real-world robotic manipulation tasks demonstrate that MFRS outperforms relevant existing methods and effectively improves the sample efficiency of RL algorithms in goal-conditioned tasks with various dynamics of the target and obstacles.
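The abstract above relies on potential-based reward shaping to keep the optimal policy unchanged. A minimal sketch of why that form is safe, assuming the standard shaping term F(s, s') = γΦ(s') − Φ(s): along any trajectory the discounted shaping terms telescope to γ^T Φ(s_T) − Φ(s_0), which does not depend on the actions taken in between. The function names and the potential values are hypothetical illustrations, not the magnetic potential learned in the paper.

```python
from typing import List


def shaping_term(phi_s: float, phi_next: float, gamma: float) -> float:
    """Standard potential-based shaping term: gamma * Phi(s') - Phi(s)."""
    return gamma * phi_next - phi_s


def shaped_rewards(rewards: List[float], potentials: List[float], gamma: float) -> List[float]:
    """Add the shaping term to each step reward.

    `potentials` holds Phi(s_0), ..., Phi(s_T), so it needs one more
    entry than `rewards` (hypothetical values chosen by the caller).
    """
    if len(potentials) != len(rewards) + 1:
        raise ValueError("potentials must have len(rewards) + 1 entries")
    return [
        r + shaping_term(potentials[t], potentials[t + 1], gamma)
        for t, r in enumerate(rewards)
    ]


def discounted_sum(values: List[float], gamma: float) -> float:
    """Discounted return: sum over t of gamma**t * values[t]."""
    return sum((gamma ** t) * v for t, v in enumerate(values))


if __name__ == "__main__":
    gamma = 0.9
    rewards = [0.0, 0.0, 1.0]           # sparse task reward (illustrative)
    potentials = [0.2, 0.5, 0.9, 0.0]   # arbitrary Phi values (illustrative)
    shaped = shaped_rewards(rewards, potentials, gamma)
    gap = discounted_sum(shaped, gamma) - discounted_sum(rewards, gamma)
    telescoped = gamma ** len(rewards) * potentials[-1] - potentials[0]
    print(gap, telescoped)  # the two numbers agree up to rounding
```

Because the gap between shaped and unshaped returns depends only on the start and end states, every policy's return from a given start state is shifted by the same amount when the terminal potential is zero, so the ranking of policies is preserved.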
As assessment outcomes provide students with a sense of accomplishment that is boosted by the reward system, learning becomes more effective. This research aims to determine the effects of a reward system prior to assessment in Mathematics. A quasi-experimental research design was used to examine whether there was a significant difference between the use of a reward system and students’ level of performance in Mathematics. Through purposive sampling, the respondents of the study were 80 Grade 9 students belonging to two sections from Gaudencio B. Lontok Memorial Integrated School. Based on similar demographics and pre-test results, a control group and a study group were formed as participants of the study. Data were treated and analyzed using statistical treatments such as the mean and the t-test for independent variables. There was a significant finding revealing the advantage of using the reward system compared to the non-reward system in increasing students’ level of performance in Mathematics. It is concluded that the use of a reward system is effective in improving assessment outcomes in Mathematics. It is recommended to use a reward system prior to assessment for persistent assessment outcomes that reflect the intended outcomes in Mathematics.
There is emerging evidence implicating glucagon-like peptide-1 (GLP-1) in reward, including palatable food reinforcement and alcohol-based reward circuitry. While recent findings suggest that mesolimbic structures, such as the ventral tegmental area (VTA) and the nucleus accumbens (NAc), are critical anatomical sites mediating GLP-1’s inhibitory actions, the present study focused on the potential novel impact of GLP-1 within the habenula, a region of the forebrain expressing GLP-1 receptors. Given that the habenula has also been implicated in the neural control of reward and reinforcement, we hypothesized that this brain region, like the VTA and NAc, might mediate the anhedonic effects of GLP-1. Rats were stereotaxically implanted with guide cannula targeting the habenula and trained on a progressive ratio 3 (PR3) schedule of reinforcement. Separate rats were trained on an alcohol two-bottle choice paradigm with intermittent access. The GLP-1 agonist exendin-4 (Ex-4) was administered directly into the habenula to determine the effects on operant responding for palatable food as well as alcohol intake. Our results indicated that Ex-4 reliably suppressed PR3 responding and that this effect was dose-dependent. A similar suppressive effect on alcohol consumption was observed. These findings provide initial and compelling evidence that the habenula may mediate the inhibitory action of GLP-1 on reward, including operant and drug reward. Our findings further suggest that GLP-1 receptor mechanisms outside of the midbrain and ventral striatum are critically involved in brain reward neurotransmission.
There is no question that learning a foreign language like English is different from learning other subjects, mainly because it is new to us Chinese and there is not a sufficient language environment. But that does not mean we have no way to learn it and learn it well. If asked to identify the most powerful influences on learning, motivation would probably be high on most teachers’ and learners’ lists. It seems only sensible to assume that English learning is most likely to occur when the learners want to learn. That is, when motivation such as interest, curiosity, or a desire to achieve is present, the learners will be engaged in learning. However, how do we teachers motivate our students to enjoy learning and learn well? Here, rewards, both extrinsic and intrinsic, are of great value and play a vital role in English learning.
The psychological mechanism of reward is to form an operant conditioned reflex through positive reinforcement and negative reinforcement. The positive effect of reward is to strengthen external learning motivation, and reward can sometimes improve creativity. The negative effects are weakening students' creativity, weakening the internal motivation for learning, and hindering the development of autonomy. Teachers should apply educational rewards scientifically, take students' age into account, consider the difficulty of tasks, pay attention to stimulating students' internal motivation, and give priority to spiritual rewards, supplemented by material rewards.
Most successes in the world are the result of long-term effort. The reward of success is extremely high, but before that, a long-term investment process is required. People who are “myopic” only value short-term rewards and are unwilling to make early-stage investments, so they rarely achieve ultimate success and the corresponding high rewards. Similarly, for a reinforcement learning (RL) model with long-delay rewards, the discount rate determines the strength of the agent’s “farsightedness”. To enable the trained agent to make a chain of correct choices and finally succeed, this paper first obtains the feasible region of the discount rate through mathematical derivation. This region satisfies the “farsightedness” requirement of the agent. Afterwards, to avoid the complicated problem of solving implicit equations when choosing feasible solutions, a simple method is explored and verified by theoretical demonstration and mathematical experiments. Then, a series of RL experiments are designed and implemented to verify the validity of the theory. Finally, the model is extended from the finite process to the infinite process. The validity of the extended model is verified by theory and experiments. The research not only reveals the significance of the discount rate, but also provides a theoretical basis as well as a practical method for the choice of discount rate in future research.
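The link between the discount rate and “farsightedness” described above can be illustrated with a deliberately simplified two-option model. This is an assumption made for illustration, not the feasible-region derivation from the paper: the agent either takes a small reward immediately or waits `delay` steps for a large reward. Waiting is preferred only when γ^delay · large > small, which gives the threshold γ > (small / large)^(1 / delay). All names and numbers below are hypothetical.

```python
def min_discount_rate(small_reward: float, large_reward: float, delay: int) -> float:
    """Smallest gamma at which the delayed reward ties with the immediate one.

    Assumes 0 < small_reward < large_reward and delay >= 1
    (a toy two-option model, not the paper's setting).
    """
    if not (0 < small_reward < large_reward) or delay < 1:
        raise ValueError("need 0 < small_reward < large_reward and delay >= 1")
    return (small_reward / large_reward) ** (1.0 / delay)


def prefers_delayed(gamma: float, small_reward: float, large_reward: float, delay: int) -> bool:
    """True if the discounted delayed reward beats the immediate reward."""
    return (gamma ** delay) * large_reward > small_reward


if __name__ == "__main__":
    small, large, delay = 1.0, 100.0, 10
    threshold = min_discount_rate(small, large, delay)
    print(f"threshold gamma: {threshold:.4f}")
    for gamma in (0.6, 0.7, 0.99):
        choice = "waits" if prefers_delayed(gamma, small, large, delay) else "is myopic"
        print(f"gamma={gamma}: agent {choice}")
```

Even in this toy case the threshold rises toward 1 as the delay grows, which is consistent with the abstract's point that long-delay rewards constrain the admissible discount rates.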
OBJECTIVE Glutamatergic projections from the prefrontal cortex (PFc) to the nucleus accumbens (NAc) regulate dopamine (DA) release in the NAc. However, it is not clear whether this circuit affects the reward and motivation of heroin addiction. Our study investigates the effects of metabotropic glutamate receptor 2/3 (mGluR2/3) and the projections from the ventromedial prefrontal cortex (vmPFc) to the NAc shell on the reward and motivation of heroin-addicted rats. METHODS First, rats were trained in heroin self-administration for 14 d. On the 15th day, some rats were injected systemically with the mGluR2/3 agonist LY379268 (0.1, 0.3 and 1.0 mg·kg^(-1), ip), and other rats were bilaterally microinjected with LY379268 (0.3 and 1.0 g·L^(-1)) at a volume of 0.5 μL into the ventral tegmental area (VTA), NAc core or NAc shell, respectively. All rats then underwent heroin self-administration testing under a fixed ratio 1 (FR1) schedule or a progressive ratio (PR) schedule to observe the effect of LY379268 on heroin reward or motivation. Second, rats were injected with chemogenetic glutamatergic virus (pAAV-CaMKIIa-hM3D(Gq)-mCherry or pAOV-CaMKIIa-hM4D(Gi)-mCherry-3Flag) or negative control virus in the vmPFc and trained in heroin self-administration for 14 d. On the 15th day, rats were bilaterally microinjected with clozapine-N-oxide (CNO, 1 mmol·L^(-1), 0.5 μL) into the NAc shell and tested for the effect on heroin reward or motivation. Finally, rats were injected with optogenetic glutamatergic virus (AAV2/9-CaMKⅡ-hChR2-EYFP) or negative control virus in the vmPFc, implanted with a 16-channel photoelectrode in the ipsilateral NAc shell, and trained in heroin self-administration for 14 d. On the 15th day, rats were tested for heroin reward under the FR1 procedure with blue light stimulation at a wavelength of 470 nm, a frequency of 25 Hz and a power of 5 mW. Each stimulation lasted 1 h, with an interval of 1 h. The spike changes in NAc shell neurons before and after stimulation were recorded. RESULTS LY379268 dose-dependently attenuated heroin reward and motivation, and the local effective site was mainly in the NAc shell. Chemogenetic results showed that activation or inactivation of the projection from the vmPFc to the NAc shell enhanced or attenuated heroin reward and motivation, respectively. Optogenetic stimulation of the same projection also enhanced heroin reward, and tonic neuronal firing in the NAc shell was observed during the light stimulation session. CONCLUSION mGluR2/3 activation in the NAc shell is involved in the inhibition of heroin reward and motivation. Activation of the projection from the PFc to the NAc shell can enhance heroin reward and motivation.
We hypothesize that individuals with a genetic predisposition to Substance Use Disorder (SUD) may have a greater likelihood of experiencing work-related accidents. We further hypothesize that high-risk populations will carry single or multiple polymorphisms associated with brain reward circuitry and/or the brain reward cascade, including: Dopaminergic (i.e. DRD2 receptor genes); Serotonergic (i.e. 5-HTT2 receptor genes); Endorphinergic (i.e. pre-enkephalin genes); Gabergic (i.e. GABAA receptor genes); and Neurotransmitter Metabolizing genes (i.e. MAO and COMT genes), among others (GARSRX™). Analgesic addiction as well as “pseudoaddiction” must be treated to improve pain control and its management. We propose that non-pharmacological alternatives for pain relief in high-risk, addiction-prone individuals are Electrotherapeutic Device(s) and Programs. We further propose that patented KB220Z, a nutraceutical designed to release dopamine at the nucleus accumbens, will reduce craving behavior in genetically programmed individuals. By utilizing both alternatives in DNA-analyzed injured workers, a reduction in analgesic addiction (genuine or pseudo) is expected to lead to improved health and a quicker return to work. We also hypothesize that this novel approach will impact costs related to injuries in the workforce. Effective management of chronic pain, especially in highly addiction-prone workforce populations, is possible in spite of being particularly elusive. A series of factors encumber pain assessment and management, including analgesia addiction, pharmacogenomic response to pain medications, and genetically inherited factors involving gene polymorphisms. Additional research is required to test these hypotheses related to genetic proneness to addiction, as well as proneness to accidents in the workplace and reduction of craving behavior. Our hypothesis is that genotyping coupled with both KB220Z™ and pharmaceutical-free Electrotherapy will reduce iatrogenically induced analgesia addiction. This approach is intended to achieve effective pain management and a quicker return to work. We propose that approaches such as the Reward Deficiency System Solution™ may become an adjunct in the war against iatrogenic pain medication addiction.
To study the incentive mechanisms of cooperation, we propose a preference rewarding mechanism in the spatial prisoner’s dilemma game, which simultaneously considers reputational preference, other-regarding preference and the dynamic adjustment of vertex weight. The vertex weight of a player is adaptively adjusted according to the comparison between his own reputation and the average reputation of his immediate neighbors. Players are inclined to pay a personal cost to reward the cooperative neighbor with the greatest vertex weight. The vertex weight of a player is proportional to the preference rewards he can obtain from direct neighbors. We find that the preference rewarding mechanism significantly facilitates the evolution of cooperation, and the dynamic adjustment of vertex weight has a powerful effect on the emergence of cooperative behavior. To validate these multiple effects, the strategy distribution and the average payoff and fitness of players are discussed from a microscopic view.
This study aimed to determine the effect of amygdala inactivation on the sexual motivation of male rats during a T-maze task with a sexual reward. Subjects were chronically implanted with two stainless-steel cannulae that enabled the infusion of tetrodotoxin, a sodium channel blocker, into the left and right basolateral amygdala (BLA). Animals were divided into 3 groups: saline (SS); TTX1 (tetrodotoxin at 2.5 ng); and TTX2 (tetrodotoxin at 5.0 ng). To induce a sexually motivated state, all male rats were allowed to have an intromission with a receptive female before performing the T-maze task, after which their sexual motivation was evaluated during seven trials in which a receptive female was placed in one goal-box of the T-maze, and a non-receptive one in the other. Subjects were allowed an intromission as a sexual reward whenever they reached the goal-box containing the receptive female, but were returned to the start-box if they did not. At the end of the experiment, copulation until ejaculation was permitted. Both doses of TTX increased the time rats required to cross the maze stem during the final trials. In terms of sexual interaction, the high dose of TTX more markedly increased mount, intromission and ejaculation latencies and the number of mounts and intromissions. Overall, these results indicate that the BLA may play an important role in modulating sexual behavior, particularly in maintaining sexual motivation in successive trials in a T-maze task and during sexual interaction per se.
In this paper, we present a stochastic reward net (SRN) approach to analyse the performance of IEEE 802.16 MAC with multiple traffic classes. The SRN model captures the quality of service requirements of the traffic classes. The model also takes into account pre-emption, priority and timeout characteristics associated with the traffic classes under consideration. The performance of the system is evaluated in terms of mean delay and normalized throughput considering the on-off traffic model. Our analytical model is validated by simulations.
Anhedonia can be defined as a condition in which the hedonic capacity is totally or partially lost. From a psychobiological perspective, several researchers have proposed that anhedonia has a putative neural substrate, the dopaminergic mesolimbic and mesocortical reward circuit, which involves the ventral tegmental area, the ventral striatum and part of the prefrontal cortex. Anhedonia is, besides depressed mood, one of the two core symptoms of depression; furthermore, it is one of the most important negative symptoms in schizophrenia. Anhedonia is also present in substance use disorders as part of the abstinence symptomatology, and interrelations between hedonic capability, craving and protracted withdrawal have been found, particularly in opiate-dependent subjects. Although anhedonia is regarded as an important symptom in psychopathology, so far it has received relatively little attention. In general, two main approaches have been used to investigate and assess anhedonia or hedonic capacity: laboratory-based measures and questionnaires. Among measurement scales, the most commonly used are the Snaith-Hamilton Pleasure Scale (SHAPS), the Fawcett-Clark Pleasure Scale (FCPS), and the Revised Chapman Physical Anhedonia Scale (CPAS). Other measurement scales, used particularly within broader psychopathological dimensions, are the Anhedonia-Asociality subscale (SANSanh) of the Scale for the Assessment of Negative Symptoms (SANS) and the Bech-Rafaelsen Melancholia Scale (BRMS). In this paper we analyze these different scales, identifying their strengths and limits and their current clinical applications.
Mental health symptoms secondary to trauma exposure and substance use disorders (SUDs) co-occur frequently in both clinical and community samples. The possibility of a shared aetiology remains an important question in translational neuroscience. Although advancements in genetics, basic science, and neuroimaging have led to an improved understanding of the neural basis of these disorders, their frequent comorbidity and high rates of relapse remain a clinical challenge. This project aimed to review the field’s current understanding of the neural circuitry underlying posttraumatic stress disorder and SUD. A comprehensive review was conducted of the available published literature on the shared neurobiology of these disorders, and it is summarized in detail, including evidence from both animal and clinical studies. After summarizing the relevant literature, this review puts forth a hypothesis about their shared neurobiology within the context of fear processing and reward cues. It provides an overview of brain reward circuitry and its relation to the neurobiology, symptomology, and phenomenology of trauma and substance use. This review provides clinical insights and implications of the proposed theory, including the potential development of novel pharmacological and therapeutic treatments to address this shared neurobiology. Limitations and extensions of this theory are discussed to provide future directions and insights into these shared phenomena.
Pediatric autoimmune neuropsychiatric disorders associated with group A streptococcal infections (PANDAS) is a concept that is used to characterize a subset of children with neuropsychiatric symptoms, tic disorders, or obsessive-compulsive disorder (OCD), whose symptoms are exacerbated by group A streptococcal (GAS) infection. PANDAS has been known to cause a sudden onset of reward deficiency syndrome (RDS). RDS includes multiple disorders that are characterized by dopaminergic signaling dysfunction in the brain reward cascade (BRC), which may result in addiction, depression, avoidant behaviors, anxiety, tic disorders, and/or OCD. According to research by Blum et al., the dopamine receptor D2 (DRD2) gene polymorphisms are important prevalent genetic determinants of RDS. The literature demonstrates that infections like Borrelia and Lyme, as well as other infections like group A beta-hemolytic streptococcal (GABHS), can cause an autoimmune reaction and associated antibodies target dopaminergic loci in the mesolimbic region of the brain, which interferes with brain function and potentially causes RDS-like symptoms/behaviors. The treatment of PANDAS remains controversial, especially since there have been limited efficacy studies to date. We propose an innovative potential treatment for PANDAS based on previous clinical trials using a pro-dopamine regulator known as KB220 variants. Our ongoing research suggests that achieving “dopamine homeostasis” by precision-guided DNA testing and pro-dopamine modulation could result in improved therapeutic outcomes.
This work aims to identify a method for the coordinator of the operational unit (OU) to train gratified personnel through the use of a reward system. The continuous transformations affecting the Italian healthcare scene lead operators to face ever new needs and problems. Professionals cannot be considered merely as workers, but as bearers of qualified intellectual, professional and cultural skills. Individual coordinators are required to be real leaders within their operational units and to use their managerial skills in achieving company objectives and in evaluating the personnel they manage. The main factor behind difficulties in staff management is motivation, defined as a state of mind, together with aspirations, needs and orientations, that pushes people to act and to behave with commitment, perseverance and determination. The need to better rationalize the available resources and to promote high-quality health care, improving safety, efficiency and appropriateness, has led general management and the coordinator of the OU to use reward systems. With the introduction of this procedure, aimed at recognizing merit and encouraging virtuous behavior during the provision of health services, the public employment reform takes part in the evolution of the regulatory framework and engages with the change that is taking place in the world of work.
This paper aims to explore the impact of the policy of giving rewards and subsidies (GRS) for grassland ecological conservation on the Tibetan Plateau, implemented by the Chinese government since 2009. Taking Gerze County in Ngari Prefecture in the Tibetan Autonomous Region (TAR) as an example, it discusses the objective, implementation and outcome of that policy with regard to ecological reconstruction and the problems that have ensued. Located in the northern part of the Qiangtang Plateau, Gerze is the largest county in Ngari Prefecture. It covers more than 7.8 million acres of pastureland, of which 6.2 million acres are usable for pastoralism; 3.4 million acres, however, lack a water source. In recent decades, due to increased population and other reasons, pastures in the area have shown signs of overgrazing, leading to serious degradation, desertification and salinization of the grassland. Since 2009, when neighboring Coqin County was chosen as a pilot site for the national ecological incentive and subsidy policy (or: ecological compensation policy), Gerze has also started to adopt this policy and brought it into full implementation in 2010. Its purpose is to solve the problem of overgrazing. But like other policies carried out in Gerze, its implementation faces many challenges. First, it is difficult to define the types and scopes of the incentives and subsidies, which have become a major source of complaints among local herdsmen. Second, the local herdsmen are also concerned about the fairness of assigning rewards and subsidies. Third, the high cost of the policy's implementation and supervision reduces its effects. Fourth, the fact that the herdsmen are not willing to reduce the livestock population makes it difficult for the policy to achieve actual results. The author thinks it is necessary to revise and improve the current ecological incentive and subsidy policy.
This essay aims to illustrate the important role of reward and punishment in education from a psychological viewpoint. In line with stimulus-response theory, reward and punishment are now commonly used by teachers to encourage both cognitive activities and appropriate behaviour in the classroom. Either can be used to encourage or supervise students in learning, and rewarding is favoured. However, a reward mechanism must be used properly and under control. It should not be overused. There is also a place for punishment in education, because errors need to be pointed out and antisocial behaviour should be corrected. It should be applied only when the intensity, duration and timing are carefully considered. In a word, a reward system undoubtedly has a positive effect, while punishment has been shown to cause unpredictable results. The specifics are discussed in the essay that follows.
In this work, for a controlled consumption-investment process with the discounted reward optimization criterion, a numerical estimate of the stability index is made. Using explicit formulas for the optimal stationary policies and for the value functions, the stability index is explicitly calculated, and its asymptotic behavior as the discount coefficient approaches 1 is investigated through statistical techniques (using numerical experiments). The results obtained define the conditions under which an approximate optimal stationary policy can be used to control the original process.
Funding: supported in part by the National Natural Science Foundation of China (62006111, 62073160) and the Natural Science Foundation of Jiangsu Province of China (BK20200330).
文摘Goal-conditioned reinforcement learning(RL)is an interesting extension of the traditional RL framework,where the dynamic environment and reward sparsity can cause conventional learning algorithms to fail.Reward shaping is a practical approach to improving sample efficiency by embedding human domain knowledge into the learning process.Existing reward shaping methods for goal-conditioned RL are typically built on distance metrics with a linear and isotropic distribution,which may fail to provide sufficient information about the ever-changing environment with high complexity.This paper proposes a novel magnetic field-based reward shaping(MFRS)method for goal-conditioned RL tasks with dynamic target and obstacles.Inspired by the physical properties of magnets,we consider the target and obstacles as permanent magnets and establish the reward function according to the intensity values of the magnetic field generated by these magnets.The nonlinear and anisotropic distribution of the magnetic field intensity can provide more accessible and conducive information about the optimization landscape,thus introducing a more sophisticated magnetic reward compared to the distance-based setting.Further,we transform our magnetic reward to the form of potential-based reward shaping by learning a secondary potential function concurrently to ensure the optimal policy invariance of our method.Experiments results in both simulated and real-world robotic manipulation tasks demonstrate that MFRS outperforms relevant existing methods and effectively improves the sample efficiency of RL algorithms in goal-conditioned tasks with various dynamics of the target and obstacles.
Abstract: As assessment outcomes provide students with a sense of accomplishment that is boosted by the reward system, learning becomes more effective. This research aims to determine the effects of a reward system applied prior to assessment in Mathematics. A quasi-experimental research design was used to examine whether there was a significant difference between the use of a reward system and students' level of performance in Mathematics. Through purposive sampling, the respondents of the study comprised 80 Grade 9 students from two sections of Gaudencio B. Lontok Memorial Integrated School. Based on similar demographics and pre-test results, a control group and a study group took part in the study. Data were treated and analyzed using statistical treatments such as the mean and the t-test for independent samples. There was a significant finding revealing the advantage of the reward system over the non-reward system in increasing students' level of performance in Mathematics. It is concluded that the use of a reward system is effective in improving assessment outcomes in Mathematics. It is recommended to use a reward system prior to assessment to obtain persistent assessment outcomes that reflect the intended outcomes in Mathematics.
Abstract: There is emerging evidence implicating glucagon-like peptide-1 (GLP-1) in reward, including palatable food reinforcement and alcohol-based reward circuitry. While recent findings suggest that mesolimbic structures, such as the ventral tegmental area (VTA) and the nucleus accumbens (NAc), are critical anatomical sites mediating GLP-1's inhibitory actions, the present study focused on the potential novel impact of GLP-1 within the habenula, a region of the forebrain expressing GLP-1 receptors. Given that the habenula has also been implicated in the neural control of reward and reinforcement, we hypothesized that this brain region, like the VTA and NAc, might mediate the anhedonic effects of GLP-1. Rats were stereotaxically implanted with guide cannulae targeting the habenula and trained on a progressive ratio 3 (PR3) schedule of reinforcement. Separate rats were trained on an alcohol two-bottle choice paradigm with intermittent access. The GLP-1 agonist exendin-4 (Ex-4) was administered directly into the habenula to determine the effects on operant responding for palatable food as well as alcohol intake. Our results indicated that Ex-4 reliably suppressed PR3 responding and that this effect was dose-dependent. A similar suppressive effect on alcohol consumption was observed. These findings provide initial and compelling evidence that the habenula may mediate the inhibitory action of GLP-1 on reward, including operant and drug reward. Our findings further suggest that GLP-1 receptor mechanisms outside of the midbrain and ventral striatum are critically involved in brain reward neurotransmission.
Abstract: There is no question that learning a foreign language like English is different from learning other subjects, mainly because it is new to us Chinese learners and there is not a sufficient language environment. But that does not mean we have no way to learn it and learn it well. If asked to identify the most powerful influences on learning, most teachers and learners would probably place motivation high on their lists. It seems only sensible to assume that English learning is most likely to occur when learners want to learn. That is, when motivation such as interest, curiosity, or a desire to achieve is present, learners become engaged in learning. However, how do we teachers motivate our students to enjoy learning and learn well? Here, rewards, both extrinsic and intrinsic, are of great value and play a vital role in English learning.
Abstract: The psychological mechanism of reward is to form an operant conditioned reflex through positive and negative reinforcement. The positive effect of reward is to strengthen external learning motivation, and reward can sometimes improve creativity. The negative effects are weakening students' creativity, weakening the internal motivation for learning, and hindering the development of autonomy. Teachers should apply educational rewards scientifically, take students' age into account, consider the difficulty of tasks, pay attention to stimulating students' internal motivation, and give priority to spiritual rewards, supplemented by material rewards.
Funding: Supported by the National Natural Science Foundation of China (71771216, 71701209, 72001214).
Abstract: Most successes in the world are the result of long-term effort. The reward of success is extremely high, but a long-term investment process is required before it arrives. People who are "myopic" value only short-term rewards and are unwilling to make early-stage investments, so they rarely achieve the ultimate success and the corresponding high rewards. Similarly, for a reinforcement learning (RL) model with long-delay rewards, the discount rate determines the strength of the agent's "farsightedness". To enable the trained agent to make a chain of correct choices and finally succeed, this paper first obtains, through mathematical derivation, the feasible region of the discount rate that satisfies the agent's "farsightedness" requirement. Afterwards, to avoid the complicated problem of solving implicit equations when choosing feasible solutions, a simple method is explored and verified by theoretical demonstration and mathematical experiments. Then, a series of RL experiments are designed and implemented to verify the validity of the theory. Finally, the model is extended from the finite process to the infinite process, and the validity of the extended model is verified by theory and experiments. The research not only reveals the significance of the discount rate, but also provides a theoretical basis and a practical method for choosing the discount rate in future research.
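The link between the discount rate and "farsightedness" can be made concrete with a deliberately simplified sketch. It assumes a two-option setting that is not the paper's model: the agent either takes a small reward immediately or waits a fixed number of steps for a larger one. Under that assumption the delayed option wins whenever γ^n · R > r, which gives the threshold γ > (r/R)^(1/n). The function names and the numbers in the usage lines are hypothetical.

```python
def min_farsighted_discount(immediate_reward: float, delayed_reward: float, delay_steps: int) -> float:
    """Threshold gamma above which the delayed reward has the higher discounted value.

    Simplified two-option assumption (not the paper's derivation):
    gamma ** delay_steps * delayed_reward > immediate_reward.
    """
    if not 0 < immediate_reward < delayed_reward:
        raise ValueError("expected 0 < immediate_reward < delayed_reward")
    if delay_steps <= 0:
        raise ValueError("delay_steps must be positive")
    return (immediate_reward / delayed_reward) ** (1.0 / delay_steps)


def prefers_delayed(gamma: float, immediate_reward: float, delayed_reward: float, delay_steps: int) -> bool:
    """True if the discounted delayed reward beats the immediate reward."""
    return gamma ** delay_steps * delayed_reward > immediate_reward


if __name__ == "__main__":
    # Illustrative values only: reward 1 now versus reward 100 after 50 steps.
    threshold = min_farsighted_discount(1.0, 100.0, 50)
    print(f"threshold gamma: {threshold:.4f}")
    print(prefers_delayed(0.95, 1.0, 100.0, 50))
    print(prefers_delayed(0.90, 1.0, 100.0, 50))
```

The longer the delay, the closer the threshold moves to 1, which is one way to see why a discount rate that works for short-horizon tasks can make an agent "myopic" on long-delay ones.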
Funding: National Basic Research Program of China (2015CB553504); National Natural Science Foundation of China (81471350, 81671321); Natural Science Foundation of Ningbo Municipality, Zhejiang Province, China (2017A610214).
Abstract: OBJECTIVE Glutamatergic projections from the prefrontal cortex (PFc) to the nucleus accumbens (NAc) regulate dopamine (DA) release in the NAc. However, it is not clear whether this circuit affects the reward and motivation of heroin addiction. Our study investigates the effects of metabotropic glutamate receptor 2/3 (mGluR2/3) and of the projections from the ventromedial prefrontal cortex (vmPFc) to the NAc shell on reward and motivation in heroin-addicted rats. METHODS First, rats were trained in heroin self-administration for 14 d. On the 15th day, some rats were injected systemically with the mGluR2/3 agonist LY379268 (0.1, 0.3 and 1.0 mg·kg^(-1), i.p.), and others were bilaterally microinjected with LY379268 (0.3 and 1.0 g·L^(-1)) at a volume of 0.5 μL into the ventral tegmental area (VTA), NAc core or NAc shell, respectively. All rats then underwent heroin self-administration testing under a fixed ratio 1 (FR1) schedule or a progressive ratio (PR) schedule to observe the effect of LY379268 on heroin reward or motivation. Second, rats were injected with a chemogenetic glutamatergic virus (pAAV-CaMKIIa-hM3D(Gq)-mCherry or pAOV-CaMKIIa-hM4D(Gi)-mCherry-3Flag) or a negative control virus in the vmPFc and trained in heroin self-administration for 14 d. On the 15th day, rats were bilaterally microinjected with clozapine-N-oxide (CNO, 1 mmol·L^(-1), 0.5 μL) into the NAc shell, and the effect on heroin reward or motivation was tested. Finally, rats were injected with an optogenetic glutamatergic virus (AAV2/9-CaMKII-hChR2-EYFP) or a negative control virus in the vmPFc, implanted with a 16-channel photoelectrode in the ipsilateral NAc shell, and trained in heroin self-administration for 14 d. On the 15th day, heroin reward was tested under the FR1 procedure with blue light stimulation at a wavelength of 470 nm, a frequency of 25 Hz and a power of 5 mW. Each stimulation lasted 1 h, with a 1 h interval between stimulations. Spike changes in NAc shell neurons before and after stimulation were recorded. RESULTS LY379268 dose-dependently attenuated heroin reward and motivation, and the local effective site was mainly in the NAc shell. Chemogenetic results showed that activation or inactivation of the projection from the vmPFc to the NAc shell enhanced or attenuated heroin reward and motivation, respectively. Optogenetic stimulation of the same projection also enhanced heroin reward, and tonic neuronal firing in the NAc shell was observed during the light stimulation session. CONCLUSION mGluR2/3 activation in the NAc shell is involved in the inhibition of heroin reward and motivation. Activation of the projection from the PFc to the NAc shell can enhance heroin reward and motivation.
Abstract: We hypothesize that individuals with a genetic predisposition to Substance Use Disorder (SUD) may have a greater likelihood of experiencing work-related accidents. We further hypothesize that high-risk populations will carry single or multiple polymorphisms associated with brain reward circuitry and/or the brain reward cascade, including: Dopaminergic (i.e. DRD2 receptor genes); Serotonergic (i.e. 5-HTT2 receptor genes); Endorphinergic (i.e. pre-enkephalin genes); Gabergic (i.e. GABAA receptor genes); and Neurotransmitter Metabolizing genes (i.e. MAO and COMT genes), among others (GARSRX™). Analgesic addiction as well as "pseudoaddiction" must be treated to improve pain control and its management. We propose that Electrotherapeutic Device(s) and Programs are non-pharmacological alternatives for pain relief in high-risk, addiction-prone individuals. We further propose that the patented KB220Z, a nutraceutical designed to release dopamine at the nucleus accumbens, will reduce craving behavior in genetically programmed individuals. By utilizing both alternatives in DNA-analyzed injured workers, a reduction in analgesic addiction (genuine or pseudo) would lead to improved health and a quicker return to work. We also hypothesize that this novel approach will impact costs related to injuries in the workforce. Effective management of chronic pain, especially in highly addiction-prone workforce populations, is possible in spite of being particularly elusive. A series of factors encumber pain assessment and management, including analgesia addiction, pharmacogenomic response to pain medications, and genetically inherited factors involving gene polymorphisms. Additional research is required to test these hypotheses related to genetic proneness to addiction, and also to proneness to accidents in the workplace and reduction of craving behavior. Our hypothesis is that genotyping coupled with both KB220Z™ and pharmaceutical-free Electrotherapy will reduce iatrogenically induced analgesia addiction. This approach will achieve effective pain management and a quicker return to work. We propose that approaches such as the Reward Deficiency System Solution™ may become an adjunct in the war against iatrogenic pain medication addiction.
Funding: National Natural Science Foundation of China (Grant No. 62062049); Social Science Project of the Ministry of Education of China (Grant No. 20YJCZH212); Natural Science Foundation of Gansu Province, China (Grant No. 20JR5RA390).
Abstract: To study the incentive mechanisms of cooperation, we propose a preference rewarding mechanism in the spatial prisoner's dilemma game, which simultaneously considers reputational preference, other-regarding preference and the dynamic adjustment of vertex weight. The vertex weight of a player is adaptively adjusted by comparing his own reputation with the average reputation of his immediate neighbors. Players are inclined to pay a personal cost to reward the cooperative neighbor with the greatest vertex weight. The vertex weight of a player is proportional to the preference rewards he can obtain from direct neighbors. We find that the preference rewarding mechanism significantly facilitates the evolution of cooperation, and that the dynamic adjustment of vertex weight has a powerful effect on the emergence of cooperative behavior. To validate these effects, the strategy distribution and the average payoff and fitness of players are discussed from a microscopic perspective.
Abstract: This study aimed to determine the effect of amygdala inactivation on the sexual motivation of male rats during a T-maze task with a sexual reward. Subjects were chronically implanted with two stainless-steel cannulae that enabled the infusion of tetrodotoxin, a sodium channel blocker, into the left and right basolateral amygdala (BLA). Animals were divided into 3 groups: saline (SS); TTX1 (tetrodotoxin at 2.5 ng); and TTX2 (tetrodotoxin at 5.0 ng). To induce a sexually motivated state, all male rats were allowed to have an intromission with a receptive female before performing the T-maze task, after which their sexual motivation was evaluated during seven trials in which a receptive female was placed in one goal-box of the T-maze and a non-receptive one in the other. Subjects were allowed an intromission as a sexual reward whenever they reached the goal-box containing the receptive female, but were returned to the start-box if they did not. At the end of the experiment, copulation until ejaculation was permitted. Both doses of TTX increased the time rats required to cross the maze stem during the final trials. In terms of sexual interaction, the high dose of TTX more markedly increased mount, intromission and ejaculation latencies and the number of mounts and intromissions. Overall, these results indicate that the BLA may play an important role in modulating sexual behavior, particularly in maintaining sexual motivation in successive trials in a T-maze task and during sexual interaction per se.
Abstract: In this paper, we present a stochastic reward net (SRN) approach to analyse the performance of IEEE 802.16 MAC with multiple traffic classes. The SRN model captures the quality of service requirements of the traffic classes. The model also takes into account the pre-emption, priority and timeout characteristics associated with the traffic classes under consideration. The performance of the system is evaluated in terms of mean delay and normalized throughput under the on-off traffic model. Our analytical model is validated by simulations.
Abstract: Anhedonia can be defined as a condition in which the hedonic capacity is totally or partially lost. From a psychobiological perspective, several researchers have proposed that anhedonia has a putative neural substrate, the dopaminergic mesolimbic and mesocortical reward circuit, which involves the ventral tegmental area, the ventral striatum and part of the prefrontal cortex. Anhedonia is, besides depressed mood, one of the two core symptoms of depression; furthermore, it is one of the most important negative symptoms in schizophrenia. Anhedonia is also present in substance use disorders as part of the abstinence symptomatology, and interrelations between hedonic capability, craving and protracted withdrawal have been found, particularly in opiate-dependent subjects. Although anhedonia is regarded as an important symptom in psychopathology, so far it has received relatively little attention. In general, two main approaches have been used to investigate and assess anhedonia or hedonic capacity: laboratory-based measures and questionnaires. Among measurement scales, the most commonly used are the Snaith-Hamilton Pleasure Scale (SHAPS), the Fawcett-Clark Pleasure Scale (FCPS), and the Revised Chapman Physical Anhedonia Scale (CPAS). Other measurement scales, used particularly within broader psychopathological dimensions, are the Anhedonia-Asociality subscale (SANSanh) of the Scale for the Assessment of Negative Symptoms (SANS) and the Bech-Rafaelsen Melancholia Scale (BRMS). In this paper we analyze these different scales, identifying their strengths and limitations and their current clinical applications.
Abstract: Mental health symptoms secondary to trauma exposure and substance use disorders (SUDs) co-occur frequently in both clinical and community samples. The possibility of a shared aetiology remains an important question in translational neuroscience. Although advancements in genetics, basic science, and neuroimaging have led to an improved understanding of the neural basis of these disorders, their frequent comorbidity and high rates of relapse remain a clinical challenge. This project aimed to review the field's current understanding of the neural circuitry underlying posttraumatic stress disorder and SUD. A comprehensive review was conducted of the available published literature on the shared neurobiology of these disorders, which is summarized in detail, including evidence from both animal and clinical studies. Having summarized the relevant literature, this review puts forth a hypothesis about their shared neurobiology within the context of fear processing and reward cues. It provides an overview of brain reward circuitry and its relation to the neurobiology, symptomology, and phenomenology of trauma and substance use. This review provides clinical insights and implications of the proposed theory, including the potential development of novel pharmacological and therapeutic treatments to address this shared neurobiology. Limitations and extensions of this theory are discussed to provide future directions and insights into these shared phenomena.
Abstract: Pediatric autoimmune neuropsychiatric disorders associated with group A streptococcal infections (PANDAS) is a concept used to characterize a subset of children with neuropsychiatric symptoms, tic disorders, or obsessive-compulsive disorder (OCD) whose symptoms are exacerbated by group A streptococcal (GAS) infection. PANDAS has been known to cause a sudden onset of reward deficiency syndrome (RDS). RDS includes multiple disorders that are characterized by dopaminergic signaling dysfunction in the brain reward cascade (BRC), which may result in addiction, depression, avoidant behaviors, anxiety, tic disorders, and/or OCD. According to research by Blum et al., dopamine receptor D2 (DRD2) gene polymorphisms are important and prevalent genetic determinants of RDS. The literature demonstrates that infections like Borrelia and Lyme, as well as other infections like group A beta-hemolytic streptococcal (GABHS) infection, can cause an autoimmune reaction in which the associated antibodies target dopaminergic loci in the mesolimbic region of the brain, which interferes with brain function and potentially causes RDS-like symptoms and behaviors. The treatment of PANDAS remains controversial, especially since there have been limited efficacy studies to date. We propose an innovative potential treatment for PANDAS based on previous clinical trials using a pro-dopamine regulator known as KB220 variants. Our ongoing research suggests that achieving "dopamine homeostasis" by precision-guided DNA testing and pro-dopamine modulation could result in improved therapeutic outcomes.
Abstract: This work aims to identify a method by which the coordinator of the OU (operational unit) can train gratified personnel through the use of a rewarding system. The continuous transformations affecting the Italian healthcare scene lead operators to face ever new needs and problems. Professionals cannot be considered only as workers, but as bearers of qualified intellectual, professional and cultural skills. Individual coordinators are required to be real leaders within their operational units and to use their managerial skills in achieving company objectives and in evaluating the personnel they manage. The main factor to which difficulties in staff management are related is motivation, defined as a state of mind, together with aspirations, needs and orientations, that pushes people to act and to behave with commitment, perseverance and determination. The need to better rationalize the available resources and to promote high-quality health care, improving safety, efficiency and appropriateness, has led the general management and the coordinator of the OU to use reward systems. With the introduction of this procedure, aimed at enhancing merit and encouraging virtuous behavior during the provision of health services, the public employment reform participates in the evolution of the regulatory framework and engages with the change that is taking place in the world of work.
Funding: Sponsored by the National Natural Science Fund of China (Grant No. 71273183) and the National Project 985 of Sichuan University.
Abstract: This paper aims to explore the impact of the policy of giving rewards and subsidies (GRS) for grassland ecological conservation on the Tibetan Plateau, implemented by the Chinese government since 2009. Taking Gerze County in Ngari Prefecture in the Tibetan Autonomous Region (TAR) as an example, it discusses the objective, implementation and outcome of that policy with regard to ecological reconstruction and the problems that have ensued. Located in the northern part of the Qiangtang Plateau, Gerze is the largest county in Ngari Prefecture. It covers more than 7.8 million acres of pastureland, of which 6.2 million acres are usable for pastoralism; 3.4 million acres, however, lack a water source. In recent decades, due to population growth and other reasons, pastures in the area have shown signs of overgrazing, leading to serious degradation, desertification and salinization of the grassland. Since 2009, when neighboring Coqin County was chosen as a pilot site for the national ecological incentive and subsidy policy (or: ecological compensation policy), Gerze has also started to adopt this policy, and brought it into full implementation in 2010. Its purpose is to solve the problem of overgrazing. But like other policies carried out in Gerze, its implementation faces many challenges. First, it is difficult to define the types and scopes of the incentives and subsidies, which has become a major source of complaints among the local herdsmen. Second, the local herdsmen are also concerned about the fairness with which rewards and subsidies are assigned. Third, the high cost of implementing and supervising the policy reduces its effects. Fourth, the fact that the herdsmen are not willing to reduce their livestock population makes it difficult for the policy to achieve actual results. The author considers it necessary to revise and improve the current ecological incentive and subsidy policy.
Abstract: This essay aims to illustrate the important role of reward and punishment in education from a psychological viewpoint. In line with stimulus-response theory, reward and punishment are now commonly used by teachers to encourage both cognitive activities and appropriate behaviour in the classroom. Either of them can be used to encourage or supervise students in learning, and rewarding is favoured. However, the reward mechanism must be used properly and under control; it should not be overused. There is also a place for punishment in education, because errors need to be pointed out and antisocial behaviour should be corrected. It should be applied only when its intensity, duration and timing are carefully considered. In a word, a reward system undoubtedly has a positive effect, while punishment has been shown to cause unpredictable results. These points are discussed in detail in the essay that follows.
Abstract: In this work, a numerical estimate of the stability index is made for a controlled consumption-investment process with the discounted reward optimization criterion. Using explicit formulas for the optimal stationary policies and for the value functions, the stability index is explicitly calculated, and its asymptotic behavior as the discount coefficient approaches 1 is investigated through statistical techniques (using numerical experiments). The results obtained define the conditions under which an approximate optimal stationary policy can be used to control the original process.