Video description generates natural language sentences that describe the subject, verb, and objects of the targeted video. It has been used to help visually impaired people understand video content, and it also plays an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and overlapping events. Deep learning is changing the shape of computer vision (CV) and natural language processing (NLP) technologies. There are hundreds of deep learning models, datasets, and evaluations that can help close the gaps in current research. This article addresses these gaps by evaluating some state-of-the-art approaches, focusing especially on deep learning and machine learning for video captioning in dense environments. Some classic machine learning techniques are reviewed, and deep learning models and benchmark datasets are detailed with their respective domains. The paper reviews various evaluation metrics, including Bilingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Word Mover's Distance (WMD), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with their pros and cons. Finally, the article lists some future directions and proposes work on context enhancement using key scene extraction with object detection in a particular frame, in particular how to improve the context of video description by detecting key frames through morphological image analysis. Additionally, the paper discusses a novel approach involving sentence reconstruction and context improvement through key frame object detection, which incorporates the fusion of large language models for refining results. The final results arise from enhancing the generated text of the proposed model by improving the predicted text and isolating objects using various keyframes. These keyframes identify dense events occurring in the video sequence.
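To make the BLEU metric named in this abstract concrete, the following is a minimal sketch of sentence-level BLEU against a single reference, assuming pre-tokenized input, uniform n-gram weights, and no smoothing. The function names are illustrative and do not come from the reviewed paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Geometric mean of clipped precisions times a brevity penalty.
    Assumption: one reference, uniform weights, no smoothing."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)
```

Because the unsmoothed score collapses to zero whenever any n-gram order has no match, practical evaluations of short captions usually apply smoothing or compute the score at corpus level.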
Objective: To study the problematic use of video games among secondary school students in the city of Parakou in 2023. Methods: Descriptive cross-sectional study conducted in the commune of Parakou from December 2022 to July 2023. The study population consisted of students regularly enrolled in public and private secondary schools in the city of Parakou for the 2022-2023 academic year. A two-stage non-proportional stratified sampling technique combined with simple random sampling was adopted. The Problem Video Game Playing (PVP) scale was used to assess problematic video game use in the study population, while anxiety and depression were assessed using the Hospital Anxiety and Depression Scale (HADS). Results: A total of 1030 students were included. The mean age of the pupils surveyed was 15.06 ± 2.68 years, with extremes of 10 and 28 years. The 13-18 age group was the most represented, at 59.6% (614) of the general population. Females predominated, at 52.8% (544), with a sex ratio of 0.89. The prevalence of problematic video game use, measured using the PVP scale, was 24.9%. Associated factors were male gender (p = 0.005), pocket money under 10,000 CFA (p = 0.001) and between 20,000 and 90,000 CFA (p = 0.030), addictive family behavior (p < 0.001), monogamous family (p = 0.023), good relationship with father (p = 0.020), organization of video game competitions (p = 0.001) and definite anxiety (p …). Conclusion: Substance-free addiction is struggling to attract the attention it deserves, as it did in its infancy everywhere else. This study complements existing data and serves as a reminder of the need to focus on this group of addictions, among which problematic use of video games remains the most frequent due to its accessibility and social tolerance. Preventive action combined with curative measures remains the most effective means of combating the problem at national level.
In video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence describing the video content is generated by a decoder. However, this kind of method depends on a single video input source and few visual labels, and it suffers from poor semantic alignment between video contents and generated sentences, which makes it unsuitable for accurately comprehending and describing video contents. To address this issue, this paper proposes a video captioning method based on semantic topic-guided generation. First, a 3D convolutional neural network is used to extract the spatiotemporal features of videos during encoding. Then, the semantic topics of the video data are extracted using visual labels retrieved from similar video data. For decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of "deviation" in the semantic mapping between videos and texts by jointly decoding a baseline and the semantic topics of the video contents. During this process, the Enhance-TopK sampling algorithm alleviates the long-tail problem by dynamically adjusting the probability distribution of the predicted words. Finally, experiments are conducted on two public datasets, Microsoft Research Video Description and Microsoft Research-Video to Text. The experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches. Specifically, compared with existing video captioning methods, the proposed method improves Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and by 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset, respectively. As a result, the proposed method generates video captions that are more closely aligned with human natural language expression habits.
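The abstract does not specify how Enhance-TopK adjusts the word distribution, so the sketch below shows only plain top-k sampling with a temperature, the standard technique it presumably extends. The function name, parameters, and the temperature knob are assumptions for illustration, not the authors' algorithm.

```python
import math
import random


def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample a token index from the k highest-scoring logits.
    Plain top-k sampling; NOT the paper's Enhance-TopK variant."""
    if k < 1 or temperature <= 0:
        raise ValueError("k must be >= 1 and temperature must be > 0")
    # Keep only the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits, shifted by the max for numerical stability.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]
```

Raising the temperature flattens the distribution over the kept tokens, which is one simple way to give rarer (long-tail) words more probability mass.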
Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for users to obtain information from a large video series. The process of providing an abstract of the entire video that includes the most representative frames is known as static video summarization. It enables rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and a bisecting K-means clustering algorithm. The method recognizes relevant frames by using BRISK to extract keypoints and descriptors from video sequences. The BRISK features of the video frames are clustered using bisecting K-means, and the keyframe of each cluster is the frame nearest to the cluster center. The appropriate number of clusters is determined using the silhouette coefficient, without requiring any preset clustering parameters. Experiments were carried out on the publicly available Open Video Project (OVP) dataset, which contains videos of different genres. The proposed method's effectiveness is compared to existing methods using a variety of evaluation metrics, and the proposed method achieves a trade-off between computational cost and quality.
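The keyframe selection step described above (pick the frame nearest each cluster center) can be sketched as follows. This assumes clustering has already been done and that each frame is summarized by a real-valued feature vector compared with Euclidean distance; raw BRISK descriptors are binary and are normally compared with Hamming distance, so this is an illustration of the selection rule only, with a hypothetical function name.

```python
def select_keyframes(features, labels):
    """Return, for each cluster, the index of the frame whose feature
    vector is closest to the cluster centroid (squared Euclidean distance).
    `features` is a list of equal-length numeric vectors, one per frame;
    `labels` gives the cluster id of each frame (assumed precomputed)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    keyframes = []
    for label in sorted(clusters):
        members = clusters[label]
        dim = len(features[members[0]])
        # Centroid = per-dimension mean of the cluster's feature vectors.
        centroid = [sum(features[i][d] for i in members) / len(members)
                    for d in range(dim)]

        def sq_dist(i):
            return sum((features[i][d] - centroid[d]) ** 2 for d in range(dim))

        keyframes.append(min(members, key=sq_dist))
    return sorted(keyframes)
```

Returning the indices in temporal order keeps the resulting summary readable as a storyboard.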
Context/objectives: The fight against chronic non-communicable diseases (NCDs) is a long-term undertaking that requires available, motivated and well-managed human resources (HR). The administrative management of skills, on both qualitative and quantitative levels, is one of the essential functions of a health system. To better implement policies against high blood pressure (HBP) and other chronic diseases, it is important to establish strategies to retain health personnel. This loyalty requires favorable working conditions and consideration of the contribution-reward couple. Good working conditions are likely to reduce the phenomenon of medical nomadism; conversely, poor HR management can contribute to an exodus of personnel towards exotic "green pastures", leading to an additional crisis in the Cameroonian health system. The fight against HBP is a complex, multifaceted and multifactorial reality that requires an appropriate management model for all types of resources, mainly HR. The main objective of this research is to show the impact of poor human resources management in the Cameroonian health system on medical nomadism and on the ineffectiveness of the fight against high blood pressure. Method: A cross-sectional descriptive survey was conducted among five hundred (500) health facilities in the Centre Region of Cameroon. A stratified probabilistic technique was used, and the number of health facilities to be surveyed was determined using Depelteau's "sample size estimation table". The physical questionnaires were printed and distributed to data collectors. After collection, the data were grouped in Excel sheets for processing. The Chi-square test was used for qualitative data and the Kolmogorov-Smirnov test for quantitative data, to assess the normality and reliability of the data. Cronbach's alpha reliability test provided a summary of the means and variances and was used to search for intragroup correlations between variables. Descriptive analysis was performed with the XLSTAT 2016 software. Results: 43.60% of health facility (HF) managers were unqualified. 82.20% of HF managers have staff in a situation of professional insecurity. These staff are mainly contractual (49.00%), decision-making agents (24.40%), and casual agents (08.80%). On average, 22.00% of personnel are unstable and 12.00% are very unstable.
Video salient object detection (VSOD) aims at locating the most attractive objects in a video by exploring spatial and temporal features. VSOD is a challenging task in computer vision, as it involves processing complex spatial data that is also influenced by temporal dynamics. Despite the progress made by existing VSOD models, they still struggle in scenes with great background diversity within and between frames. Additionally, they suffer from accumulated noise and high time consumption when extracting temporal features over a long duration. We propose a multi-stream temporal enhanced network (MSTENet) to address these problems. It investigates the collaboration of saliency cues in the spatial domain with a multi-stream structure to deal with the background diversity challenge. A straightforward yet efficient approach to temporal feature extraction is developed to avoid accumulated noise and reduce time consumption. MSTENet differs from other VSOD methods in its incorporation of both foreground supervision and background supervision, which facilitates enhanced extraction of collaborative saliency cues. Another notable difference is the integration of spatial and temporal features: the temporal module is integrated into the multi-stream structure, enabling comprehensive spatial-temporal interactions within an end-to-end framework. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on five benchmark datasets while maintaining a real-time speed of 27 fps (Titan XP). Our code and models are available at https://github.com/RuJiaLe/MSTENet.
Recently, there have been significant advancements in the study of semantic communication in single-modal scenarios. However, the ability to process information in multi-modal environments remains limited. Inspired by the research and applications of natural language processing across different modalities, our goal is to accurately extract frame-level semantic information from videos and ultimately transmit high-quality videos. Specifically, we propose a deep learning-based Multi-Modal Mutual Enhancement Video Semantic Communication system, called M3E-VSC. Built upon a Vector Quantized Generative Adversarial Network (VQGAN), our system aims to leverage mutual enhancement among different modalities by using text as the main carrier of transmission. With it, semantic information is extracted from the key-frame images and audio of the video, and differential values are computed to ensure that the extracted text conveys accurate semantic information with fewer bits, thus improving the capacity of the system. Furthermore, a multi-frame semantic detection module is designed to facilitate semantic transitions during video generation. Simulation results demonstrate that the proposed model maintains high robustness in complex noise environments, particularly in low signal-to-noise ratio conditions, improving the accuracy and speed of semantic transmission in video communication by approximately 50 percent.
Video streaming applications have grown considerably in recent years and have become one of the most significant contributors to global internet traffic. According to recent studies, the telecommunications industry loses millions of dollars due to poor video Quality of Experience (QoE) for users. Among the proposals for standardizing the quality of video streaming over internet service providers (ISPs) is the Mean Opinion Score (MOS). However, determining QoE accurately by MOS is subjective and laborious, and the result varies depending on the user. A fully automated data analytics framework is required to reduce the inter-operator variability characteristic of QoE assessment. This work addresses this concern by suggesting a novel hybrid XGBStackQoE analytical model using a two-level layering technique. Level one combines multiple Machine Learning (ML) models via a layer-one Hybrid XGBStackQoE-model. Individual ML models at level one are trained using the entire training data set. The level-two Hybrid XGBStackQoE-Model is fitted using the outputs (meta-features) of the layer-one ML models. The proposed model outperformed conventional models, with an accuracy improvement of 4 to 5 percent. The proposed framework could significantly improve video QoE accuracy.
Video watermarking plays a crucial role in protecting intellectual property rights and ensuring content authenticity. This study delves into the integration of Galois Field (GF) multiplication tables, especially GF(2^4), and their interaction with distinct irreducible polynomials. The primary aim is to enhance watermarking techniques to achieve imperceptibility, robustness, and efficient execution time. The research employs scene selection and adaptive thresholding techniques to streamline the watermarking process. Scene selection is used strategically to embed watermarks in the most vital frames of the video, while adaptive thresholding ensures that the watermarking process adheres to imperceptibility criteria, maintaining the video's visual quality. Concurrently, careful consideration is given to execution time, which is crucial in real-world scenarios, to balance efficiency and efficacy. The Peak Signal-to-Noise Ratio (PSNR) serves as a pivotal metric to gauge the watermark's imperceptibility and video quality. The study explores various irreducible polynomials, navigating the trade-offs between computational efficiency and watermark imperceptibility. This comprehensive analysis provides valuable insights into the interplay of GF multiplication tables, diverse irreducible polynomials, scene selection, adaptive thresholding, imperceptibility, and execution time. The robustness of the proposed algorithm was evaluated using the PSNR and NC metrics under five distinct attack scenarios. These findings contribute to the development of watermarking strategies that balance imperceptibility, robustness, and processing efficiency, enhancing the field's practicality and effectiveness.
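To illustrate what a GF(2^4) multiplication table under a given irreducible polynomial looks like, here is a minimal sketch. The default polynomial x^4 + x + 1 (binary 10011) is one of the three irreducible degree-4 polynomials over GF(2); which polynomials the study actually evaluates is not stated in the abstract, and the function names are illustrative.

```python
def gf16_mul(a, b, poly=0b10011):
    """Multiply two GF(2^4) elements (integers 0..15) using shift-and-add,
    reducing modulo an irreducible degree-4 polynomial given as a bit mask.
    Default: x^4 + x + 1. Other irreducible choices: 0b11001, 0b11111."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a      # addition in GF(2^m) is XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if a & 0b10000:      # degree reached 4: reduce modulo poly
            a ^= poly
    return result


def gf16_table(poly=0b10011):
    """Full 16 x 16 multiplication table for the chosen polynomial."""
    return [[gf16_mul(a, b, poly) for b in range(16)] for a in range(16)]
```

Changing `poly` permutes the entries of the table while preserving the field structure, which is why different irreducible polynomials can yield different embedding patterns at the same arithmetic cost.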
Much like humans focus solely on object movement to understand actions, directing a deep learning model's attention to the core contexts within videos is crucial for improving video comprehension. In a recent study, the Video Masked Auto-Encoder (VideoMAE) employs a pre-training approach with a high ratio of tube masking and reconstruction, effectively mitigating the spatial bias caused by temporal redundancy in full video frames. This steers the model's focus toward detailed temporal contexts. However, as VideoMAE still relies on full video frames during the action recognition stage, it may exhibit a progressive shift in attention towards spatial contexts, deteriorating its ability to capture the main spatio-temporal contexts. To address this issue, we propose an attention-directing module named Transformer Encoder Attention Module (TEAM). The module directs the model's attention to the core characteristics within each video, inherently mitigating spatial bias. TEAM first identifies the core features among the overall features extracted from each video. It then discerns the specific parts of the video where those features are located, encouraging the model to focus more on these informative parts. Consequently, during the action recognition stage, TEAM shifts VideoMAE's attention from spatial contexts towards the core spatio-temporal contexts. This attention shift alleviates the spatial bias in the model and simultaneously enhances its ability to capture precise video contexts. We conduct extensive experiments to explore the optimal configuration that enables TEAM to fulfill its intended design purpose and facilitates its seamless integration with the VideoMAE framework. The integrated model, i.e., VideoMAE+TEAM, outperforms the existing VideoMAE on Something-Something-V2 (71.3% vs. 70.3%). Moreover, qualitative comparisons demonstrate that TEAM encourages the model to disregard insignificant features and focus more on the essential video features, capturing more detailed spatio-temporal contexts within the video.
The Tibetan Plateau is one of the regions with the richest solar energy resources in the world. In the process of achieving carbon neutrality in China, the development and utilization of solar energy resources in the region will play an important role. In this study, gridded solar resource data with 1 km resolution in Tibet were obtained by spatial correction and downscaling of the SMARTS model. On this basis, the spatial and temporal distribution characteristics of solar energy resources in the region over the past 30 years (1991–2020) are evaluated in detail, and the annual global horizontal radiation resource is calculated. The results show that: 1) The average annual global horizontal radiation in Tibet is 1816 kWh/m². More than 60% of the area belongs to the "Most abundant" (GHI ≥ 1750 kWh/m²) area of China's solar energy resource category A, and nearly 40% belongs to the "Quite abundant" (1400 ≤ GHI < 1750) area of category B. 2) Spatially, the solar energy resources in Tibet increase gradually from north to south and from east to west. Lhasa, central and eastern Shigatse, Shannan, and southwestern Ali are the most abundant areas, with a maximum annual radiation level of 2189 kWh/m². 3) Temporally, the total horizontal radiation in Tibet is highest in May and lowest in December. 74% of the total area belongs to the "Very stable" (R_w ≥ 0.47) area of solar resource stability category A, and 26% belongs to the "Stable" (0.36 ≤ R_w < 0.47) area of category B. Solar energy resources in the region are thus both strong and stable. Average solar energy resources in the region have shown a fluctuating downward trend over the past 30 years, with an average decline of about 12.86 kWh/m² per decade. 4) In terms of solar radiation resources reaching the earth's surface, the theoretical total amount of annual horizontal radiation in Tibet is about 240.07 billion tons of standard coal or 222.91 billion kilowatts on average.
What causes object detection in video to be less accurate than in still images? Some video frames are degraded in appearance by fast movement, out-of-focus camera shots, and changes in posture. These difficulties have made video object detection (VID) a growing area of research in recent years. Video object detection can be used for various healthcare applications, such as detecting and tracking tumors in medical imaging, monitoring the movement of patients in hospitals and long-term care facilities, and analyzing videos of surgeries to improve technique and training. Additionally, it can be used in telemedicine to help diagnose and monitor patients remotely. Existing VID techniques rely on recurrent neural networks or optical flow for feature aggregation to produce reliable features for detection. Some of these methods aggregate features at the full-sequence level or from nearby frames. To create feature maps, existing VID techniques frequently use Convolutional Neural Networks (CNNs) as the backbone network. On the other hand, Vision Transformers have outperformed CNNs in various vision tasks, including object detection in still images and image classification. In this research, we propose to use Swin-Transformer, a state-of-the-art Vision Transformer, as an alternative to CNN-based backbone networks for object detection in videos. The proposed architecture enhances the accuracy of existing VID methods. The ImageNet VID and EPIC KITCHENS datasets are used to evaluate the suggested methodology. Our method achieves 84.3% mean average precision (mAP) on ImageNet VID while using less memory than other leading VID techniques. The source code is available at https://github.com/amaharek/SwinVid.
Pulse rate is one of the important characteristics of traditional Chinese medicine pulse diagnosis, and it is of great significance for determining the nature of cold and heat in diseases. The prediction of pulse rate based on facial video is an exciting research field for obtaining palpation information through observation diagnosis. However, most studies focus on optimizing the algorithm based on a small sample of participants without systematically investigating multiple influencing factors. A total of 209 participants and 2,435 facial videos, drawn from our self-constructed Multi-Scene Sign Dataset and public datasets, were used to perform a multi-level and multi-factor comprehensive comparison. The effects of different datasets, blood volume pulse signal extraction algorithms, regions of interest, time windows, color spaces, pulse rate calculation methods, and video recording scenes were analyzed. Furthermore, we proposed a blood volume pulse signal quality optimization strategy based on the inverse Fourier transform and an improvement strategy for pulse rate estimation based on signal-to-noise ratio threshold sliding. We found that video-based pulse rate estimation performed better on the Multi-Scene Sign Dataset and the Pulse Rate Detection Dataset than on other datasets. Compared with the Fast independent component analysis and Single Channel algorithms, the chrominance-based method and the plane-orthogonal-to-skin algorithm have stronger anti-interference ability and higher robustness. The five-organs fusion area and the full-face area performed better than single sub-regions, and fewer motion artifacts and better lighting improve the precision of pulse rate estimation.
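One common pulse rate calculation method, taking the dominant spectral peak of the blood volume pulse (BVP) signal within a plausible heart-rate band, can be sketched as below. This is a generic illustration and not the paper's SNR-threshold sliding strategy; the function name and the 0.7 to 4.0 Hz band (42 to 240 beats per minute) are assumptions.

```python
import cmath
import math


def estimate_pulse_rate(signal, fps, low_hz=0.7, high_hz=4.0):
    """Estimate pulse rate (beats per minute) as the strongest frequency
    of a BVP signal inside [low_hz, high_hz], using a direct DFT.
    `signal` is a list of samples taken at `fps` frames per second."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC component

    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n  # frequency of DFT bin k in Hz
        if not (low_hz <= freq <= high_hz):
            continue
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0
```

The frequency resolution is `fps / n`, so a 10-second window resolves the rate only to the nearest 6 beats per minute; this is one reason the choice of time window matters in the comparison above.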
Tourism resources that span provincial boundaries in China play a pivotal role in regional development, yet effective governance poses persistent challenges. This study addresses this issue by constructing a comprehensive database of transboundary natural tourism resources (TNTR) through the amalgamation of diverse data sources. Using the Getis-Ord Gi* statistic, kernel density estimation, and geographical detectors, we examine the spatial patterns of TNTR, focusing on both named and unnamed entities, and explore the influencing factors. We identify 7883 TNTR in China, with mountain tourism resources the predominant type. Among provinces, Hunan has the highest count and Shanghai the lowest. Southern China shows a pronounced clustering trend in TNTR distribution, and the spatial arrangement of biological landscapes appears more random than that of geological and water landscapes. Western China, characterized by intricate terrain, has fewer TNTR but a significant presence of unnamed natural tourism resources. Crucially, administrative segmentation influences TNTR development, generating disparities in regional goals, developmental stages and intensities, and management approaches. In response to these variations, we advocate naming the unnamed transboundary tourism resources, constructing a geographic database of TNTR for government use, and establishing a collaborative management mechanism based on the TNTR database. Our research helps to elucidate the intricate landscape of TNTR, offering insights for tailored governance strategies in cross-provincial tourism resource management.
To improve the performance of video compression for machine vision analysis tasks, a video coding for machines (VCM) standard working group was established to promote standardization procedures. In this paper, recent advances in the VCM standard are presented, with comprehensive introductions to its use cases, requirements, evaluation frameworks and corresponding metrics. The existing methods are then presented, introducing the existing proposals by category and the research progress reported at the latest VCM conference. Finally, we give conclusions.
Wuyi Mountain, located in the north of Fujian Province, China, is renowned for its abundant medicinal plant resources. In July 2014, the 8th (second team) of Shenyang Pharmaceutical University's Chinese Medicine Resources Scientific Expedition Team conducted a field investigation in the area. Through specimen collection and extensive literature review, the team identified and analyzed 223 vascular plant species from 175 genera and 85 families. The most dominant families were Compositae and Rosaceae, and perennial herbs were the predominant life form, accounting for 44.39% of the total species identified. Notably, we documented five precious and rare medicinal plants unique to Wuyi Mountain. This study updates the database of plant resources and diversity in the region, providing a valuable reference for future studies. Finally, we put forward some suggestions to enhance the conservation and sustainable utilization of Wuyi Mountain's plant resources.
Text perception is crucial for understanding the semantics of outdoor scenes, making it a key requirement for building intelligent systems for driver assistance or autonomous driving. Text information in car-mounted videos can assist drivers in making decisions. However, car-mounted video text images pose challenges such as complex backgrounds, small fonts, and the need for real-time detection. We propose a robust Car-mounted Video Text Detector (CVTD), a lightweight text detection model that uses ResNet18 for feature extraction and can detect text of arbitrary shape. The model efficiently extracts global text positions through Coordinate Attention Threshold Activation (CATA) and stacks two Feature Pyramid Enhancement Fusion Modules (FPEFM) to integrate local text features with global position information, strengthening the representation capability of the CVTD model. The enhanced feature maps, when acted upon by Text Activation Maps (TAM), effectively distinguish text foreground from non-text regions. Additionally, we collected and annotated a dataset of 2200 Car-mounted Video Text (CVT) images under various road conditions for training and evaluating our model. We further tested our model on four other challenging public natural scene text detection benchmark datasets, demonstrating its strong generalization ability and real-time detection speed. This model holds potential for practical applications in real-world scenarios.
This study investigates how cognitive psychology principles can be integrated into the information architecture design of short-form video platforms,like TikTok,to enhance user experience,engagement,and sharing.Using ...This study investigates how cognitive psychology principles can be integrated into the information architecture design of short-form video platforms,like TikTok,to enhance user experience,engagement,and sharing.Using a questionnaire,it explores TikTok users’habits and preferences,highlighting how social media fatigue(SMF)impacts their interaction with the platform.The paper offers strategies to optimize TikTok’s design.It suggests refining the organizational system using principles like chunking,schema theory,and working memory capacity.Additionally,it proposes incorporating shopping features within TikTok’s interface to personalize product suggestions and enable monetization for influencers and content creators.Furthermore,the study underlines the need to consider gender differences and user preferences in improving TikTok’s sharing features,recommending streamlined and customizable sharing options,collaborative sharing,and a system to acknowledge sharing milestones.Aiming to strengthen social connections and increase sharing likelihood,this research provides insights into enhancing information architecture for short-form video platforms,contributing to their growth and success.展开更多
[Objectives] To count and reorganize the Tibetan medicines recorded in Yu Tuo Ben Cao, in order to facilitate the rational use and timely protection of Tibetan medicinal plant resources. [Methods] Based on literature research and data analysis, this paper analyzed the plant genera, their habitat characteristics, and the main types of diseases treated. [Results] Yu Tuo Ben Cao contains 191 kinds of botanicals, of which Ranunculaceae accounts for the largest number, with 11 genera and 25 species, a wide distribution across 5 habitat categories, and main therapeutic efficacy covering 16 fields. [Conclusions] As a part of Yu Tuo Ben Cao, the Tibetan medicines of Ranunculaceae have great research value because of their variety, large number, wide distribution, and diverse uses.
As part of its efforts to promote sustainable and high-quality development, China has pledged to reduce water consumption and create a water-efficient society. After identifying the institutional root causes of excessive capital allocation and excessive water consumption in China's water-intensive industrial sectors, this study elaborates how the national water-efficient cities assessment contributes to optimized capital allocation. Our research shows that the national water-efficient cities assessment has motivated local governments to compete on water efficiency. To conserve water, local governments regulated the entry and exit of water-intensive enterprises, discouraged excessive investment in water-intensive sectors, and phased out obsolete water-intensive capacity within their jurisdictions. This approach has resulted in mutually beneficial outcomes, including improved allocation of capital, enhanced water efficiency, and reduced emissions. This paper offers policy recommendations for establishing a water-efficient society throughout the 14th Five-Year Plan (2021-2025) period by presenting empirical evidence on the policy effects of resource efficiency evaluation.
Abstract: Video description generates natural language sentences that describe the subject, verb, and objects of the targeted video. Video description has been used to help visually impaired people understand video content. It also plays an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and overlapping events. Deep learning is changing the shape of computer vision (CV) technologies and natural language processing (NLP). There are hundreds of deep learning models, datasets, and evaluations that can help close gaps in current research. This article addresses this gap by evaluating some state-of-the-art approaches, focusing especially on deep learning and machine learning for video captioning in a dense environment. It reviews some classic techniques from existing machine learning and presents deep learning models together with details of benchmark datasets and their respective domains. The paper reviews various evaluation metrics, including Bilingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Word Mover's Distance (WMD), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with their pros and cons. Finally, the article lists some future directions and proposes work on context enhancement using key scene extraction with object detection in a particular frame, in particular how to improve the context of video description by detecting key frames through morphological image analysis. Additionally, the paper discusses a novel approach involving sentence reconstruction and context improvement through key frame object detection, which incorporates the fusion of large language models for refining results. The final results arise from enhancing the generated text of the proposed model by improving the predicted text and isolating objects using various keyframes. These keyframes identify dense events occurring in the video sequence.
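As a small illustration of one of the metrics named above, the following sketch computes the clipped (modified) n-gram precision that BLEU is built on. It is a minimal example, not the survey's evaluation code: the function names are hypothetical, tokenization is plain whitespace splitting, and the brevity penalty and geometric averaging over n-gram orders that complete BLEU are omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in any single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# A degenerate caption that repeats one word is penalized by clipping.
candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(),
              "there is a cat on the mat".split()]
print(modified_precision(candidate, references, 1))  # 2 of 7 unigrams are credited
```

Clipping is what stops a caption from scoring well simply by repeating a frequent reference word.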
Abstract: Objective: To study the problematic use of video games among secondary school students in the city of Parakou in 2023. Methods: Descriptive cross-sectional study conducted in the commune of Parakou from December 2022 to July 2023. The study population consisted of students regularly enrolled in public and private secondary schools in the city of Parakou for the 2022-2023 academic year. A two-stage non-proportional stratified sampling technique combined with simple random sampling was adopted. The Problem Video Game Playing (PVP) scale was used to assess problematic gaming in the study population, while anxiety and depression were assessed using the Hospital Anxiety and Depression Scale (HADS). Results: A total of 1030 students were included. The mean age of the pupils surveyed was 15.06 ± 2.68 years, with extremes of 10 and 28 years. The [13 - 18] age group was the most represented, with a proportion of 59.6% (614) in the general population. Females predominated, at 52.8% (544), with a sex ratio of 0.89. The prevalence of problematic video game use, measured using the PVP scale, was 24.9%. Associated factors were male gender (p = 0.005), pocket money under 10,000 CFA (p = 0.001) and between 20,000 - 90,000 CFA (p = 0.030), addictive family behavior (p < 0.001), monogamous family (p = 0.023), good relationship with father (p = 0.020), organization of video game competitions (p = 0.001) and definite anxiety (p …). Conclusion: Substance-free addiction is struggling to attract the attention it deserves, as it did in its infancy everywhere else. This study complements existing data and serves as a reminder of the need to focus on this group of addictions, among which problematic use of video games remains the most frequent due to its accessibility and social tolerance. Preventive action combined with curative measures remains the most effective means of combating the problem at national level.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61873277, in part by the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2020JQ-758, and in part by the Chinese Postdoctoral Science Foundation under Grant 2020M673446.
Abstract: In video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence describing the video content is generated by a decoder. However, this kind of method depends on a single video input source and few visual labels, and it suffers from a semantic alignment problem between video contents and generated natural sentences, which makes it unsuitable for accurately comprehending and describing video contents. To address this issue, this paper proposes a video captioning method based on semantic topic-guided generation. First, a 3D convolutional neural network is used to extract the spatiotemporal features of videos during encoding. Then, the semantic topics of the video data are extracted using visual labels retrieved from similar video data. In decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of "deviation" in the semantic mapping process between videos and texts by jointly decoding a baseline and the semantic topics of video contents. During this process, the designed Enhance-TopK sampling algorithm can alleviate a long-tail problem by dynamically adjusting the probability distribution of the predicted words. Finally, experiments are conducted on two public datasets, Microsoft Research Video Description and Microsoft Research-Video to Text. The experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches. Specifically, the performance indicators Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation of the proposed method are improved by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and by 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset, respectively, compared with existing video captioning methods. As a result, the proposed method can generate video captions that are more closely aligned with human natural language expression habits.
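For readers unfamiliar with the family of decoding strategies this abstract builds on, the sketch below shows plain top-k sampling: keep the k highest-scoring tokens, renormalize with a softmax, and sample among them. This is the standard baseline only; the abstract does not specify how Enhance-TopK adjusts the distribution, so nothing here should be read as the authors' algorithm. The function name and the toy logits are assumptions made for illustration.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample a token index from the k highest-scoring logits.

    Standard top-k sampling (hypothetical helper, not Enhance-TopK):
    all tokens outside the top k receive zero probability.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)
    # Softmax over the retained logits, shifted by the peak for numerical stability.
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 0.5, -1.0, 3.0, 0.1]  # toy scores for a 5-token vocabulary
samples = [top_k_sample(logits, 2, rng) for _ in range(200)]
print(sorted(set(samples)))  # only indices from the top 2 can appear
```

Truncating to the top k removes the low-probability tail entirely, which is one reason methods in this family are discussed in connection with long-tail word distributions.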
Funding: The authors would like to thank Research Supporting Project Number (RSP2024R444), King Saud University, Riyadh, Saudi Arabia.
Abstract: Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for users to obtain information from a large video series. The process of providing an abstract of the entire video that includes the most representative frames is known as static video summarization. This method enables rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and a bisecting K-means clustering algorithm. The method effectively recognizes relevant frames using BRISK by extracting keypoints and descriptors from video sequences. The video frames' BRISK features are clustered using bisecting K-means, and the keyframe of each cluster is determined by selecting the frame nearest to the cluster center. Without any user-supplied clustering parameters, the appropriate number of clusters is determined using the silhouette coefficient. Experiments were carried out on the publicly available Open Video Project (OVP) dataset, which contains videos of different genres. The proposed method's effectiveness is compared with existing methods using a variety of evaluation metrics, and the proposed method achieves a trade-off between computational cost and quality.
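The keyframe selection step described above (one frame per cluster, chosen as the frame nearest to the cluster center) can be sketched as follows. This is a minimal illustration under stated assumptions: each frame is already represented by a fixed-length feature vector and already has a cluster label, so BRISK extraction, bisecting K-means, and the silhouette-based choice of cluster count are not shown. The function names and the toy data are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_keyframes(features, labels):
    """Return, for each cluster, the index of the frame closest to its centroid."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    keyframes = []
    for label in sorted(clusters):
        members = clusters[label]
        center = centroid([features[i] for i in members])
        keyframes.append(min(members, key=lambda i: sq_dist(features[i], center)))
    return sorted(keyframes)


# Toy per-frame features (assumed precomputed) and cluster assignments.
features = [[0, 0], [1, 0], [2, 0], [10, 10], [10, 12], [10, 11]]
labels = [0, 0, 0, 1, 1, 1]
print(select_keyframes(features, labels))  # frames 1 and 5 sit on their centroids
```

Returning frame indices in temporal order keeps the summary readable as a storyboard of the original video.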
Abstract: Context/objectives: The fight against chronic Non-Communicable Diseases (NCDs) is a long-term undertaking that requires available, motivated and well-managed human resources (HR). The administrative management of skills, at both the qualitative and quantitative levels, is one of the essential functions of a health system. To better implement policies against High Blood Pressure (HBP) and other chronic diseases, it is important to establish strategies to retain health personnel. This retention requires favorable working conditions and consideration of the balance between contribution and reward. Good working conditions are likely to reduce the phenomenon of medical nomadism; conversely, poor HR management can contribute to the exodus of personnel towards exotic "green pastures", thus leading to an additional crisis in the Cameroonian health system. The fight against HBP is a complex, multifaceted and multifactorial reality that requires an appropriate management model for all types of resources, mainly HR. The main objective of this research is to show the impact of poor management of human resources in the Cameroonian health system on medical nomadism and on the ineffectiveness of the fight against High Blood Pressure. Method: A cross-sectional descriptive survey was conducted among five hundred (500) health facilities in the Centre region of Cameroon. A stratified probabilistic technique was used, and the number of health facilities to be surveyed was determined using the "sample size estimation table" of Depelteau. The paper questionnaires were printed and then distributed to data collectors. After collection, the data were grouped in Excel sheets for processing. The Chi-square test was used for qualitative data and the Kolmogorov-Smirnov test for quantitative data, to assess the normality and reliability of the data. The Cronbach's alpha reliability test provided a summary of the means and variances and allowed us to search for intragroup correlations between variables. Descriptive analysis was carried out with the XLSTAT 2016 software. Results: 43.60% of Health Facility (HF) managers were unqualified. 82.20% of HF managers have staff in a situation of professional insecurity. These staff are mainly contractual (49.00%), decision-making agents (24.40%), and casual agents (08.80%). The proportion of unstable personnel is on average 22.00%, and of very unstable personnel 12.00%.
Funding: Funded by the Natural Science Foundation of China (NSFC) under Grant No. 62203192.
Abstract: Video salient object detection (VSOD) aims at locating the most attractive objects in a video by exploring spatial and temporal features. VSOD is a challenging task in computer vision, as it involves processing complex spatial data that is also influenced by temporal dynamics. Despite the progress made by existing VSOD models, they still struggle in scenes with great background diversity within and between frames. Additionally, they encounter difficulties related to accumulated noise and high time consumption when extracting temporal features over a long duration. We propose a multi-stream temporal enhanced network (MSTENet) to address these problems. It investigates the collaboration of saliency cues in the spatial domain with a multi-stream structure to deal with the challenge of great background diversity. A straightforward yet efficient approach to temporal feature extraction is developed to avoid accumulated noise and reduce time consumption. MSTENet differs from other VSOD methods in incorporating both foreground supervision and background supervision, which facilitates enhanced extraction of collaborative saliency cues. Another notable difference is the integration of spatial and temporal features: the temporal module is integrated into the multi-stream structure, enabling comprehensive spatial-temporal interactions within an end-to-end framework. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on five benchmark datasets while maintaining a real-time speed of 27 fps (Titan XP). Our code and models are available at https://github.com/RuJiaLe/MSTENet.
Funding: Supported by the National Key Research and Development Project under Grant 2020YFB1807602, the Key Program of Marine Economy Development Special Foundation of Department of Natural Resources of Guangdong Province (GDNRC[2023]24), and the National Natural Science Foundation of China under Grant 62271267.
Abstract: Recently, there have been significant advancements in the study of semantic communication in single-modal scenarios. However, the ability to process information in multi-modal environments remains limited. Inspired by the research and applications of natural language processing across different modalities, our goal is to accurately extract frame-level semantic information from videos and ultimately transmit high-quality videos. Specifically, we propose a deep learning-based Multi-Modal Mutual Enhancement Video Semantic Communication system, called M3E-VSC. Built upon a Vector Quantized Generative Adversarial Network (VQGAN), our system aims to leverage mutual enhancement among different modalities by using text as the main carrier of transmission. With it, semantic information can be extracted from the key-frame images and audio of the video, and a differential-value operation is performed to ensure that the extracted text conveys accurate semantic information with fewer bits, thus improving the capacity of the system. Furthermore, a multi-frame semantic detection module is designed to facilitate semantic transitions during video generation. Simulation results demonstrate that our proposed model maintains high robustness in complex noise environments, particularly in low signal-to-noise ratio conditions, significantly improving the accuracy and speed of semantic transmission in video communication by approximately 50 percent.
Abstract: Video streaming applications have grown considerably in recent years and have become one of the most significant contributors to global internet traffic. According to recent studies, the telecommunications industry loses millions of dollars due to poor video Quality of Experience (QoE) for users. Among the proposals for standardizing the quality of video streaming over internet service providers (ISPs) is the Mean Opinion Score (MOS). However, accurately determining QoE by MOS is subjective and laborious, and it varies depending on the user. A fully automated data analytics framework is required to reduce the inter-operator variability characteristic of QoE assessment. This work addresses this concern by suggesting a novel hybrid XGBStackQoE analytical model using a two-level layering technique. Level one combines multiple Machine Learning (ML) models via a layer one Hybrid XGBStackQoE-model. Individual ML models at level one are trained using the entire training data set. The level two Hybrid XGBStackQoE-Model is fitted using the outputs (meta-features) of the layer one ML models. The proposed model outperformed the conventional models, with an accuracy improvement of 4 to 5 percent. The proposed framework could significantly improve video QoE accuracy.
Abstract: Video watermarking plays a crucial role in protecting intellectual property rights and ensuring content authenticity. This study delves into the integration of Galois Field (GF) multiplication tables, especially GF(2^4), and their interaction with distinct irreducible polynomials. The primary aim is to enhance watermarking techniques to achieve imperceptibility, robustness, and efficient execution time. The research employs scene selection and adaptive thresholding techniques to streamline the watermarking process. Scene selection is used strategically to embed watermarks in the most vital frames of the video, while adaptive thresholding methods ensure that the watermarking process adheres to imperceptibility criteria, maintaining the video's visual quality. Concurrently, careful consideration is given to execution time, which is crucial in real-world scenarios, to balance efficiency and efficacy. The Peak Signal-to-Noise Ratio (PSNR) serves as a pivotal metric to gauge the watermark's imperceptibility and video quality. The study explores various irreducible polynomials, navigating the trade-offs between computational efficiency and watermark imperceptibility. This comprehensive analysis provides valuable insights into the interplay of GF multiplication tables, diverse irreducible polynomials, scene selection, adaptive thresholding, imperceptibility, and execution time. The robustness of the proposed algorithm was evaluated using PSNR and NC metrics under five distinct attack scenarios. These findings contribute to the development of watermarking strategies that balance imperceptibility, robustness, and processing efficiency, enhancing the field's practicality and effectiveness.
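To make the notion of a GF(2^4) multiplication table concrete, the sketch below builds one for a chosen irreducible polynomial. It only illustrates the arithmetic; how the study actually uses such tables for embedding is not described in the abstract and is not shown here. The function names are hypothetical, and the two polynomials used, x^4 + x + 1 (0b10011) and x^4 + x^3 + 1 (0b11001), are simply two well-known irreducible polynomials of degree 4 over GF(2), not necessarily the ones the authors evaluated.

```python
def gf16_mul(a, b, poly=0b10011):
    """Multiply two GF(2^4) elements (integers 0..15) modulo an irreducible
    degree-4 polynomial given as a 5-bit mask (default: x^4 + x + 1)."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a        # addition in GF(2^n) is XOR
        b >>= 1
        a <<= 1                # multiply a by x
        if a & 0b10000:
            a ^= poly          # reduce when the degree reaches 4
    return result


def gf16_table(poly=0b10011):
    """Full 16 x 16 multiplication table for the given polynomial."""
    return [[gf16_mul(a, b, poly) for b in range(16)] for a in range(16)]


table = gf16_table()
print(table[2][8])                  # x * x^3 = x^4, which reduces to x + 1
print(gf16_mul(2, 8, 0b11001))      # same product under x^4 + x^3 + 1
```

Changing the irreducible polynomial changes the entries of the table while preserving the field structure, which is why different polynomials can be compared as alternative configurations.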
Funding: This work was supported by the National Research Foundation of Korea (NRF) Grant (Nos. 2018R1A5A7059549, 2020R1A2C1014037) and by the Institute of Information & Communications Technology Planning & Evaluation (IITP) Grant (No. 2020-0-01373) funded by the Korea government (MSIT, Ministry of Science and Information & Communication Technology).
Abstract: Much as humans focus solely on object movement to understand actions, directing a deep learning model's attention to the core contexts within videos is crucial for improving video comprehension. In a recent study, the Video Masked Auto-Encoder (VideoMAE) employs a pre-training approach with a high ratio of tube masking and reconstruction, effectively mitigating the spatial bias caused by temporal redundancy in full video frames. This steers the model's focus toward detailed temporal contexts. However, as VideoMAE still relies on full video frames during the action recognition stage, it may exhibit a progressive shift in attention towards spatial contexts, degrading its ability to capture the main spatio-temporal contexts. To address this issue, we propose an attention-directing module named Transformer Encoder Attention Module (TEAM). The proposed module effectively directs the model's attention to the core characteristics within each video, inherently mitigating spatial bias. The TEAM first identifies the core features among the overall features extracted from each video. It then discerns the specific parts of the video where those features are located, encouraging the model to focus more on these informative parts. Consequently, during the action recognition stage, the proposed TEAM effectively shifts VideoMAE's attention from spatial contexts towards the core spatio-temporal contexts. This attention shift alleviates the spatial bias in the model and simultaneously enhances its ability to capture precise video contexts. We conduct extensive experiments to explore the optimal configuration that enables the TEAM to fulfill its intended design purpose and facilitates its seamless integration with the VideoMAE framework. The integrated model, i.e., VideoMAE+TEAM, outperforms the existing VideoMAE on Something-Something-V2 (71.3% vs. 70.3%). Moreover, qualitative comparisons demonstrate that the TEAM encourages the model to disregard insignificant features and focus more on the essential video features, capturing more detailed spatio-temporal contexts within the video.
Funding: This work was supported by the Major Science and Technology Project of the Science and Technology Department of Tibet under Grant Number XZ202101ZD0015G and the Second Tibet Plateau Scientific Expedition and Research Program (STEP) under Grant Number 2019QZKK0804.
Abstract: The Tibet Plateau is one of the regions with the richest solar energy resources in the world. In the process of achieving carbon neutrality in China, the development and utilization of solar energy resources in the region will play an important role. In this study, gridded solar resource data with 1 km resolution in Tibet were obtained by spatial correction and downscaling of the SMARTS model. On this basis, the spatial and temporal distribution characteristics of solar energy resources in the region over the past 30 years (1991–2020) are evaluated in detail, and the annual global horizontal radiation resource is calculated. The results show that: 1) The average annual global horizontal radiation amount in Tibet is 1816 kWh/m². More than 60% of the area belongs to the "Most abundant" (GHI ≥ 1750 kWh/m²) area of China's solar energy resource category A, and nearly 40% belongs to the "Quite abundant" (1400 ≤ GHI < 1750) area of China's solar energy resource category B. 2) Spatially, the solar energy resources in Tibet increase gradually from north to south and from east to west. Lhasa, Central and Eastern Shigatse, Shannan, and Southwestern Ali are the most abundant areas, with a maximum annual radiation level of 2189 kWh/m². 3) Temporally, the total horizontal radiation in Tibet is highest in May and lowest in December. 74% of the total area belongs to the "Very stable" (R_w ≥ 0.47) area of solar resource stability category A, and 26% belongs to the "Stable" (0.36 ≤ R_w < 0.47) area of solar resource stability category B. Solar energy resources in the region are therefore both strong and stable. Average solar energy resources in the region have shown a fluctuating downward trend over the past 30 years, with an average decline of about 12.86 kWh/m² per decade. 4) In terms of solar radiation resources reaching the earth's surface, the theoretical total amount of annual horizontal radiation in Tibet is about 240.07 billion tons of standard coal, or 222.91 billion kilowatts on average.
Abstract: Object detection in video is less accurate than in still images because some video frames are degraded in appearance by fast movement, out-of-focus camera shots, and changes in posture. These challenges have made video object detection (VID) a growing area of research in recent years. Video object detection can be used for various healthcare applications, such as detecting and tracking tumors in medical imaging, monitoring the movement of patients in hospitals and long-term care facilities, and analyzing videos of surgeries to improve technique and training. Additionally, it can be used in telemedicine to help diagnose and monitor patients remotely. Existing VID techniques rely on recurrent neural networks or optical flow for feature aggregation to produce reliable features that can be used for detection. Some of these methods aggregate features at the full-sequence level or from nearby frames. To create feature maps, existing VID techniques frequently use Convolutional Neural Networks (CNNs) as the backbone network. On the other hand, Vision Transformers have outperformed CNNs in various vision tasks, including object detection in still images and image classification. In this research, we propose using Swin-Transformer, a state-of-the-art Vision Transformer, as an alternative to CNN-based backbone networks for object detection in videos. The proposed architecture enhances the accuracy of existing VID methods. The ImageNet VID and EPIC KITCHENS datasets are used to evaluate the proposed methodology. We demonstrate that the proposed method is efficient, achieving 84.3% mean average precision (mAP) on ImageNet VID while using less memory than other leading VID techniques. The source code is available at https://github.com/amaharek/SwinVid.
Funding: Supported by the Key Research Program of the Chinese Academy of Sciences (grant number ZDRW-ZS-2021-1-2).
Abstract: Pulse rate is one of the important characteristics of traditional Chinese medicine pulse diagnosis, and it is of great significance for determining the cold or heat nature of diseases. Predicting pulse rate from facial video is an exciting research field for obtaining palpation information through observation diagnosis. However, most studies focus on optimizing the algorithm based on a small sample of participants without systematically investigating multiple influencing factors. A total of 209 participants and 2,435 facial videos, drawn from our self-constructed Multi-Scene Sign Dataset and public datasets, were used to perform a multi-level and multi-factor comprehensive comparison. The effects of different datasets, blood volume pulse signal extraction algorithms, regions of interest, time windows, color spaces, pulse rate calculation methods, and video recording scenes were analyzed. Furthermore, we proposed a blood volume pulse signal quality optimization strategy based on the inverse Fourier transform and an improvement strategy for pulse rate estimation based on signal-to-noise ratio threshold sliding. We found that video-based pulse rate estimation performed better on the Multi-Scene Sign Dataset and the Pulse Rate Detection Dataset than on other datasets. Compared with the Fast independent component analysis and Single Channel algorithms, the chrominance-based method and the plane-orthogonal-to-skin algorithm have stronger anti-interference ability and higher robustness. The five-organs fusion area and the full-face area performed better than single sub-regions, and fewer motion artifacts and better lighting improve the precision of pulse rate estimation.
Funding: Funded by the Youth Program of the National Natural Science Foundation of China (Grants No. 42001243 and 42201311); the Humanities and Social Science Project of the Ministry of Education, China (Grants No. 20YJC630212 and 22YJCZH071); the Youth Program of the Natural Science Foundation of Shandong Province, China (Grant No. ZR2020QD008); and the Frontier Science Research Support Program, Management College, OUC (Grants No. MCQYZD2305 and MCQYYB2309).
Abstract: Tourism resources that span provincial boundaries in China play a pivotal role in regional development, yet effective governance poses persistent challenges. This study addresses this issue by constructing a comprehensive database of transboundary natural tourism resources (TNTR) through the amalgamation of diverse data sources. Using the Getis-Ord Gi* statistic, kernel density estimation, and geographical detectors, we scrutinize the spatial patterns of TNTR, focusing on both named and unnamed entities, while exploring the influencing factors. Our findings reveal 7883 identified TNTR in China, with mountain tourism resources emerging as the predominant type. Among provinces, Hunan has the highest count, while Shanghai has the lowest. Southern China demonstrates a pronounced clustering trend in TNTR distribution, with the spatial arrangement of biological landscapes appearing more random than that of geological and water landscapes. Western China, characterized by intricate terrain, has fewer TNTR but a significant presence of unnamed natural tourism resources. Crucially, administrative segmentation influences TNTR development, generating disparities in regional goals, developmental stages and intensities, and management approaches. In response to these variations, we advocate naming the unnamed transboundary tourism resources, constructing a geographic database of TNTR for government use, and establishing a collaborative management mechanism based on the TNTR database. Our research contributes to elucidating the intricate landscape of TNTR, offering insights for tailored governance strategies in cross-provincial tourism resource management.
Funding: Supported by ZTE Industry-University-Institute Cooperation Funds.
Abstract: To improve the performance of video compression for machine vision analysis tasks, a video coding for machines (VCM) standard working group was established to promote standardization procedures. This paper presents recent advances in the VCM standard and gives comprehensive introductions to its use cases, requirements, evaluation frameworks and corresponding metrics. The existing methods are then presented, introducing the existing proposals by category and the research progress reported at the latest VCM conference. Finally, we give conclusions.
Abstract: Wuyi Mountain, located in the north of Fujian Province, China, is renowned for its abundant medicinal plant resources. In July 2014, the 8th (second team) of Shenyang Pharmaceutical University's Chinese Medicine Resources Scientific Expedition Team conducted a field investigation in the area. Through specimen collection and extensive literature review, the team identified and analyzed 223 vascular plant species from 175 genera and 85 families. The most dominant families were Compositae and Rosaceae, and perennial herbs were the predominant life form, accounting for 44.39% of the total species identified. Notably, we documented five precious and rare medicinal plants unique to Wuyi Mountain. This study updates the database of plant resources and diversity in the region, providing a valuable reference for future studies. Finally, we put forward some suggestions to enhance the conservation and sustainable utilization of Wuyi Mountain's plant resources.
Funding: This work is supported in part by the National Natural Science Foundation of China (Grant Number 61971078), which provided domain expertise and computational power that greatly assisted the activity. This work was financially supported by Chongqing Municipal Education Commission Grants for Major Science and Technology Project (KJZD-M202301901) and the Science and Technology Research Project of Jiangxi Department of Education (GJJ2201049).
Abstract: Text perception is crucial for understanding the semantics of outdoor scenes, making it a key requirement for building intelligent systems for driver assistance or autonomous driving. Text information in car-mounted videos can assist drivers in making decisions. However, car-mounted video text images pose challenges such as complex backgrounds, small fonts, and the need for real-time detection. We propose a robust Car-mounted Video Text Detector (CVTD), a lightweight text detection model that uses ResNet18 for feature extraction and can detect text of arbitrary shapes. The model efficiently extracts global text positions through Coordinate Attention Threshold Activation (CATA) and strengthens its feature representation by stacking two Feature Pyramid Enhancement Fusion Modules (FPEFM), which integrate local text features with global position information. The enhanced feature maps, when acted upon by Text Activation Maps (TAM), effectively distinguish text foreground from non-text regions. Additionally, we collected and annotated a dataset of 2200 Car-mounted Video Text (CVT) images under various road conditions for training and evaluating the model. We further tested the model on four other challenging public benchmark datasets for natural scene text detection, demonstrating its strong generalization ability and real-time detection speed. The model holds potential for practical applications in real-world scenarios.
Abstract: This study investigates how cognitive psychology principles can be integrated into the information architecture design of short-form video platforms, like TikTok, to enhance user experience, engagement, and sharing. Using a questionnaire, it explores TikTok users’ habits and preferences, highlighting how social media fatigue (SMF) impacts their interaction with the platform. The paper offers strategies to optimize TikTok’s design. It suggests refining the organizational system using principles like chunking, schema theory, and working memory capacity. Additionally, it proposes incorporating shopping features within TikTok’s interface to personalize product suggestions and enable monetization for influencers and content creators. Furthermore, the study underlines the need to consider gender differences and user preferences in improving TikTok’s sharing features, recommending streamlined and customizable sharing options, collaborative sharing, and a system to acknowledge sharing milestones. Aiming to strengthen social connections and increase sharing likelihood, this research provides insights into enhancing information architecture for short-form video platforms, contributing to their growth and success.
Funding: Supported by Key Laboratory of Medicinal Animal and Plant Resources of Qinghai-Tibetan Plateau in Qinghai Province (2020-ZJ-Y40).
Abstract: [Objectives] To count and reorganize the Tibetan medicines recorded in Yu Tuo Ben Cao, in order to facilitate the rational use and timely protection of Tibetan medicinal plant resources. [Methods] Based on literature research and data analysis, this paper analyzed the plant genera, their habitat characteristics, and the main types of diseases. [Results] Yu Tuo Ben Cao contains 191 kinds of botanicals, of which Ranunculaceae has the largest number, 11 genera and 25 species, with a wide distribution of habitats and 5 categories, and the main therapeutic efficacy covers 16 fields. [Conclusions] As a part of Yu Tuo Ben Cao, Tibetan medicines of Ranunculaceae have great research value because of their variety, large number, wide distribution, and diverse uses.
Funding: Sponsorship of the Outstanding Youth Innovation Team Development Program for Institutes of Higher Learning in Shandong Province (2021RW008) and the Youth Program of the Natural Science Foundation of Shandong Province (ZR2021QG048).
Abstract: As part of its efforts to promote sustainable and high-quality development, China has pledged to reduce water consumption and create a water-efficient society. After identifying the institutional root causes of excessive capital allocation and excessive water consumption in China’s water-intensive industrial sectors, this study elaborates how the national water-efficient cities assessment contributes to optimized capital allocation. Our research shows that the assessment has motivated local governments to compete on water efficiency. To conserve water, local governments regulated the entry and exit of water-intensive enterprises, discouraged excessive investments in water-intensive sectors, and phased out obsolete water-intensive capacities within their jurisdictions. This approach has resulted in mutually beneficial outcomes, including improved allocation of capital, enhanced water efficiency, and reduced emissions. By presenting empirical evidence on the policy effects of resource efficiency evaluation, this paper offers policy recommendations for establishing a water-efficient society throughout the 14th Five-Year Plan (2021-2025) period.