This paper introduces the current practice of building a network of institutional repositories (IRs) at the Chinese Academy of Sciences (CAS), named the CAS IR Grid. The National Science Library (NSL) of CAS plays a leading role in the construction, promotion and implementation of the CAS IR Grid. The project aims to encourage each CAS institute to build its own IR, and ultimately to form an IR network of CAS institutes. The paper describes NSL's experience in coordinating and supporting the institutes as they build their respective IRs, and in promoting IR services through collaborative and progressive development strategies. Achievements made during the development of the CAS IR Grid are described and challenges for its future development are discussed. The authors aim to provide best practices for developing a network of institutional repositories in research institute settings, which can serve as a practical reference for other institutions engaged in similar tasks.
This paper presents a team member ranking technique for software bug repositories. Members are ranked using a number of attributes available in software bug repositories, producing a ranked list of the developers participating in the development of a software project. The ranking is derived from each developer's contribution in terms of bugs fixed, the severity and priority of those bugs, new problems reported and comments made. The top-ranked developers are the best contributors to the software project. The proposed algorithm can also be used for classifying and rating software bugs using the ratings of the members participating in the software bug repository.
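The ranking idea can be illustrated with a minimal sketch. The abstract names the attributes (bugs fixed, severity, priority, problems reported, comments) but not how they are combined, so the weighted sum, the weights, the field names and the sample developers below are all assumptions made for illustration, not the paper's algorithm.

```python
from dataclasses import dataclass

# Hypothetical weights: the abstract lists the attributes but gives no formula.
WEIGHTS = {
    "bugs_fixed": 3.0,
    "severity_points": 2.0,
    "priority_points": 2.0,
    "bugs_reported": 1.0,
    "comments": 0.5,
}


@dataclass
class Developer:
    name: str
    bugs_fixed: int
    severity_points: int   # summed severity of the bugs this developer fixed
    priority_points: int   # summed priority of the bugs this developer fixed
    bugs_reported: int
    comments: int


def score(dev: Developer) -> float:
    """Weighted sum of a developer's contribution attributes."""
    return sum(weight * getattr(dev, attr) for attr, weight in WEIGHTS.items())


def rank(devs: list[Developer]) -> list[Developer]:
    """Return developers ordered from highest to lowest score."""
    return sorted(devs, key=score, reverse=True)


# Invented sample data, for illustration only.
team = [
    Developer("alice", 10, 12, 8, 4, 30),
    Developer("bob", 3, 2, 1, 15, 60),
    Developer("carol", 7, 15, 10, 2, 5),
]
ranked_names = [dev.name for dev in rank(team)]
print(ranked_names)
```

With these invented weights a developer who mostly comments and reports ranks below one who fixes severe bugs; a different weighting would reorder the list, which is why the weights are the main design decision in any scheme of this kind.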
Environmental fungi can damage the documentary heritage conserved in archives and affect the health of personnel if fungal concentrations, thermo-hygrometric parameters and ventilation conditions are not adequate, problems that can be accentuated by climate change. The aims of this work were to identify and characterize the airborne fungal pollution of naturally ventilated repositories in the Provincial Historical Archive of Santiago de Cuba and to predict the risk that these fungi pose to the staff's health. The indoor air of three repositories of this archive and the outdoor air were sampled on one occasion in each of 2015, 2016 and 2017 using a SAS sampler. The fungal concentrations obtained varied from 135.6 CFU/m³ to 421.1 CFU/m³ and the indoor/outdoor ratios fluctuated from 0.7 to 4.2, evidencing a variable environmental quality over time, although in the third sampling the repository environments showed good quality. Aspergillus and Cladosporium were the predominant genera in these environments. A. flavus was a prevalent species in indoor air, while A. niger and Cl. cladosporioides were the species that showed the greatest similarities with the outdoor air. The genera Coremiella and Talaromyces, as well as the species Aspergillus uvarum, Alternaria ricini and Cladosporium staurophorum, were first findings for environments of Cuban archives. Xerophilic species (A. flavus, A. niger, A. ochraceus, A. ustus), which are indicators of moisture problems in the repositories, were detected; they are also opportunistic pathogens and toxigenic species, and their concentrations were higher than recommended, demonstrating the potential risk to which the archive personnel are circumstantially exposed.
Purpose: China Academic Library & Information System (CALIS) planned to launch an institutional repository (IR) project to promote IR development and open access at colleges and universities in China. To learn the current state of IRs in academic institutions, the CALIS Administrative Center conducted this survey with the help of Peking University Library. Design/methodology/approach: We conducted an online survey of CALIS member libraries. Findings: Firstly, the development of IRs at China's colleges and universities is still in its infancy. Secondly, Chinese colleges and universities have reached a consensus on the objective of having an IR. Thirdly, they have high expectations of IR functions. Fourthly, they prefer to establish a centralized IR system at minimum cost. Finally, there are both similarities and differences between Chinese academic institutions and their counterparts in other countries in the state of IR development. Research limitations: The questionnaire needs to be improved because it lacks sufficient questions for those who do not plan to build an IR. The comparatively low return rate of valid questionnaires can affect the accuracy of the results. It is hard to conduct an in-depth discussion based only on the data collected from this questionnaire survey, and consequently the findings can hardly present an accurate and comprehensive picture of the current state of IR development in the academic sector in China. Practical implications: The survey results provide an essential foundation for the CALIS IR project, and the research can serve as a reference source for future studies of the development of IRs at China's colleges and universities. Originality/value: It is the first national survey focused on the development of IRs in academic institutions in China.
A computer code for simulation of groundwater flow and transport is described. The code handles both porous and fractured media. The main intended application is the analysis of a deep repository for nuclear waste, and for this reason flow and transport in sparsely fractured rock are in focus. The mathematical and numerical models are described in some detail. In short, the code is based on the traditional conservation and state laws, but also embodies a number of submodels (subgrid processes, permafrost, etc.). An unstructured Cartesian grid and a finite volume approach are the key elements in the discretization of the basic equations. A multigrid solver is part of the code, as is a parallelization option based on the SPMD (Single-Program Multiple-Data) method. The main application areas are summarized and an application to a deep repository is discussed in more detail.
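To make the finite volume idea concrete, here is a deliberately reduced sketch: steady one-dimensional Darcy flow on a uniform grid with fixed heads at both ends. It is not the code described in the abstract (which is three-dimensional, uses an unstructured Cartesian grid and a multigrid solver); the function name, the harmonic averaging of conductivity at cell faces and the direct tridiagonal solve are assumptions chosen to keep the example small.

```python
def solve_heads(K, dx, h_left, h_right):
    """Cell-centred finite volume solution of steady 1D groundwater flow.

    K        -- hydraulic conductivity of each cell
    dx       -- uniform cell width
    h_left   -- fixed head at the left boundary face
    h_right  -- fixed head at the right boundary face
    Returns the head at each cell centre.
    """
    n = len(K)
    # Conductance of every face: boundary faces sit half a cell from the
    # cell centre; interior faces use the harmonic mean of the neighbours.
    c = [2.0 * K[0] / dx]
    c += [2.0 * K[i] * K[i + 1] / ((K[i] + K[i + 1]) * dx) for i in range(n - 1)]
    c += [2.0 * K[-1] / dx]

    # Mass balance per cell: sum over faces of c * (h_neighbour - h_cell) = 0.
    diag = [c[i] + c[i + 1] for i in range(n)]
    lower = [-c[i] for i in range(n)]       # coupling to cell i-1
    upper = [-c[i + 1] for i in range(n)]   # coupling to cell i+1
    rhs = [0.0] * n
    rhs[0] += c[0] * h_left
    rhs[-1] += c[n] * h_right

    # Thomas algorithm (forward elimination, back substitution).
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    h = [0.0] * n
    h[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        h[i] = (rhs[i] - upper[i] * h[i + 1]) / diag[i]
    return h


# Homogeneous column: heads fall linearly between the two boundaries.
heads = solve_heads([1.0, 1.0, 1.0, 1.0], dx=1.0, h_left=10.0, h_right=2.0)
print(heads)
```

The harmonic mean is used at interior faces because it keeps the flux continuous across a jump in conductivity, which matters when a low-conductivity rock cell sits next to a conductive fracture cell.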
The open science movement has gained significant momentum within the last few years. This comes along with the need to store and share research artefacts, such as publications and research data. For this purpose, research repositories need to be established. A variety of solutions exist for implementing such repositories, covering diverse features, ranging from custom depositing workflows to social media-like functions. In this article, we introduce the FAIREST principles, a framework inspired by the well-known FAIR principles, but designed to provide a set of metrics for assessing and selecting solutions for creating digital repositories for research artefacts. The goal is to support decision makers in choosing such a solution when planning for a repository, especially at an institutional level. The metrics included are therefore based on two pillars: (1) an analysis of established features and functionalities, drawn from existing dedicated, general-purpose and commonly used solutions, and (2) a literature review on general requirements for digital repositories for research artefacts and related systems. We further describe an assessment of 11 widespread solutions, with the goal of providing an overview of the current landscape of research data repository solutions and identifying gaps and research challenges to be addressed.
In the context of repositories for nuclear waste, understanding the behavior of gas migration through clayey rocks with inherent anisotropy is crucial for assessing the safety of geological disposal facilities. The primary mechanism for gas breakthrough is the opening of micro-fractures due to high gas pressure. This occurs at gas pressures lower than the combined strength of the rock and its minimum principal stress under external loading conditions. To investigate the mechanism of microscale mode-I ruptures, it is essential to incorporate a multiscale approach that includes subcritical microcracks in the modeling framework. In this contribution, we derive the model from microstructures that contain periodically distributed microcracks within a porous material. The damage evolution law is coupled with the macroscopic poroelastic system by employing the asymptotic homogenization method and considering the inherent hydro-mechanical (HM) anisotropy at the microscale. The resulting permeability change induced by fracture opening is implicitly integrated into the gas flow equation. Verification examples are presented to validate the developed model step by step. An analysis of the local macroscopic response is undertaken to underscore the influence of factors such as strain rate, initial damage and applied stress on the gas migration process. Numerical examples of direct tension tests are used to demonstrate the model's efficacy in describing localized failure characteristics. Finally, the simulation results for preferential gas flow reveal the robustness of the two-scale model in explicitly depicting gas-induced fracturing in anisotropic clayey rocks. The model successfully captures the common behaviors observed in laboratory experiments, such as a sudden drop in gas injection pressure, rapid build-up of downstream gas pressure, and steady-state gas flow following gas breakthrough.
Data repository infrastructures for academics have appeared in waves since the dawn of Web technology. These waves are driven by changes in societal needs, archiving needs and the development of cloud computing resources. As such, the data repository landscape has many flavors when it comes to sustainability models, target audiences and feature sets. One thing that links all data repositories is a desire to make the content they host reusable, building on the core principles of cataloging content for economical and research speed efficiency. The FAIR principles are a common goal for all repository infrastructures to aim for. No matter what discipline or infrastructure, the goal of reusable content, for both humans and machines, is a common one. This is the first time that repositories can work toward a common goal that ultimately lends itself to interoperability. The idea that research can move further and faster as we un-silo these fantastic resources is an achievable one. This paper investigates the steps that existing repositories need to take in order to remain useful and relevant in a FAIR research world.
Thousands of community-developed (meta)data guidelines, models, ontologies, schemas and formats have been created and implemented by several thousand data repositories and knowledge bases, across all disciplines. These resources are necessary to meet government, funder and publisher expectations of greater transparency and of access to and preservation of data related to research publications. This obligates researchers to ensure their data is FAIR, to share their data using the appropriate standards, to store their data in sustainable and community-adopted repositories, and to conform to funder and publisher data policies. FAIR data sharing also plays a key role in enabling researchers to evaluate, re-analyse and reproduce each other's work. We can map the landscape of relationships between community-adopted standards and repositories, and the journal publisher and funder data policies that recommend their use. In this paper, we show how the work of the GO-FAIR FAIR Standards, Repositories and Policies (StRePo) Implementation Network serves as a central integration and cross-fertilisation point for the reuse of FAIR standards, repositories and data policies in general. Pivotal to this effort is FAIRsharing, an endorsed flagship resource of the Research Data Alliance that maps the landscape of relationships between community-adopted standards and repositories, and the journal publisher and funder data policies that recommend their use. Lastly, we highlight a number of activities around FAIR tools, services and educational efforts to raise awareness and encourage participation.
In a deep geological repository for high-level radioactive waste (HLW), bentonite is compacted uniaxially and then arranged vertically in the engineered barriers. This assembly scheme induces an initial anisotropy which, with hydration, develops more markedly under chemical conditions. To investigate the anisotropic swelling of compacted Gaomiaozi (GMZ) bentonite and its further response to saline effects, a series of constant-volume swelling pressure tests were performed. The results showed that dry density enhanced the bentonite swelling and raised the final anisotropy, whereas saline solution inhibited the bentonite swelling but still promoted the final anisotropy. The final anisotropy coefficient (the ratio of radial to axial pressure) obeyed the Boltzmann sigmoid attenuation function, decreasing with concentration and dry density and converging to a minimum value of 0.76. A staged evolution of the anisotropy coefficient was discovered: saline solution inhibited the rise of the anisotropy coefficient (Dd) in the isotropic process more than the valley (d1) in the anisotropic process, so that the final anisotropy increased. The isotropic stage amplified the impact of soil structure rearrangement on the macroscopic swelling pressure values. Thus, a new method for predicting the swelling pressures of compacted bentonite was proposed by expanding the equations of Gouy-Chapman theory with a dissipative wedge term. An evolutionary function was constructed, revealing the correlation between the occurrence time and the pressure value due to the structure rearrangement and the preceding crystalline swelling. Accordingly, a design reference for dry density was given, based on the chemical conditions around the preselected site in Beishan, China. The anisotropy promoted by saline solution would cause a greater drop in radial pressure, making the previous threshold based on axial swelling fail.
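The shape of a Boltzmann sigmoid attenuation can be sketched as follows. The abstract gives only the limiting value of 0.76, so the sketch assumes the common four-parameter form y = a2 + (a1 - a2) / (1 + exp((x - x0) / dx)); the upper plateau a1, the midpoint x0 and the slope width dx are invented placeholders, not fitted values from the study.

```python
import math


def boltzmann_sigmoid(x, a1, a2, x0, dx):
    """Four-parameter Boltzmann sigmoid.

    a1 -- plateau approached for small x
    a2 -- plateau approached for large x
    x0 -- midpoint, where the value is (a1 + a2) / 2
    dx -- width of the transition (positive gives a decreasing curve if a1 > a2)
    """
    return a2 + (a1 - a2) / (1.0 + math.exp((x - x0) / dx))


# a2 = 0.76 is the minimum reported in the abstract; the rest is hypothetical.
A1, A2, X0, DX = 1.0, 0.76, 0.5, 0.2

for concentration in (0.0, 0.5, 1.0, 2.0, 4.0):
    value = boltzmann_sigmoid(concentration, A1, A2, X0, DX)
    print(f"concentration={concentration:.1f}  anisotropy coefficient={value:.3f}")
```

Whatever the fitted parameters are, the function decreases monotonically and flattens out at a2, which matches the reported convergence of the anisotropy coefficient to a minimum.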
The state of in situ stress is a crucial parameter in subsurface engineering, especially for critical projects such as nuclear waste repositories. As one of the two ISRM suggested methods, the overcoring (OC) method is widely used to estimate the full stress tensor in rock by independent regression analysis of the data from each OC test. However, this customary independent analysis of individual OC tests, known as no pooling, is liable to yield unreliable test-specific stress estimates because of the various sources of uncertainty involved in the OC method. To address this problem, a practical and no-cost solution is considered: incorporating into the OC data analysis the additional information implied by adjacent OC tests, which are usually available in OC measurement campaigns. Hence, this paper presents a Bayesian partial pooling (hierarchical) model for the combined analysis of adjacent OC tests. We performed five case studies using OC test data obtained at a nuclear waste repository research site in Sweden. The results demonstrate that partial pooling of adjacent OC tests does allow information to be borrowed across adjacent tests, and simultaneously yields improved stress tensor estimates with reduced uncertainties for all individual tests compared with analysing them independently (no pooling), particularly for tests whose no-pooling stress estimates are unreliable. A further model comparison shows that the partial pooling model also gives better predictive performance, and thus confirms that the information borrowed across adjacent OC tests is relevant and effective.
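The effect of partial pooling can be shown with a much simpler stand-in than the paper's hierarchical model of full stress tensors: a scalar normal-normal model in which each test has one estimate and one standard error. The function name, the assumption that the between-test spread tau is known, and the sample numbers are all hypothetical.

```python
def partial_pool(estimates, std_errors, tau):
    """Shrink per-test estimates toward a shared mean (normal-normal model).

    estimates   -- independent (no pooling) estimate from each test
    std_errors  -- standard error of each estimate
    tau         -- assumed between-test standard deviation
    Returns (group_mean, pooled_estimates).
    """
    # Group mean: each test weighted by its total precision.
    weights = [1.0 / (s * s + tau * tau) for s in std_errors]
    group_mean = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    pooled = []
    for y, s in zip(estimates, std_errors):
        precision_data = 1.0 / (s * s)
        precision_group = 1.0 / (tau * tau)
        # Posterior mean: precision-weighted blend of the test and the group.
        pooled.append(
            (precision_data * y + precision_group * group_mean)
            / (precision_data + precision_group)
        )
    return group_mean, pooled


# Invented stress magnitudes (MPa): the third test is noisy and looks like an outlier.
no_pooling = [10.0, 12.0, 20.0]
errors = [1.0, 1.0, 5.0]
mean, pooled = partial_pool(no_pooling, errors, tau=2.0)
print(round(mean, 3), [round(p, 3) for p in pooled])
```

The noisy test is pulled strongly toward its neighbours while the precise tests barely move, which is the "borrowing of information" the abstract refers to. A full Bayesian treatment would also estimate tau and report posterior uncertainty rather than a point value.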
Docker has recently become the mainstream technology for providing reusable software artifacts. Developers can easily build and deploy their applications using Docker. Currently, a large number of reusable Docker images are publicly shared in online communities, and semantic tags can be created to help developers reuse the images effectively. However, the communities do not provide tagging services, and manual tagging is exhausting and time-consuming. This paper addresses the problem through a semi-supervised learning-based approach named SemiTagRec. SemiTagRec contains four components: (1) the predictor, which calculates the probability of assigning a specific tag to a given Docker repository; (2) the extender, which introduces new tags as candidates based on tag correlation analysis; (3) the evaluator, which measures the candidate tags based on a logistic regression model; (4) the integrator, which calculates a final score by combining the results of the predictor and the evaluator, and then assigns the tags with high scores to the given Docker repositories. SemiTagRec adds the newly tagged repositories to the training data for the next round of training. In this way, SemiTagRec iteratively trains the predictor with the cumulative tagged repositories and the extended tag vocabulary to achieve high tag recommendation accuracy. Finally, the experimental results show that SemiTagRec outperforms the other approaches, and that its accuracy in terms of Recall@5 and Recall@10 is 0.688 and 0.781 respectively.
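Recall@k, the metric reported above, can be computed as in the sketch below. It uses one common definition (the share of a repository's true tags that appear among the top k recommendations); the abstract does not state which variant the authors used, and the tag lists are invented.

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant tags found in the top-k recommendations.

    recommended -- tags ordered from highest to lowest score
    relevant    -- ground-truth tags for the repository
    k           -- how many recommendations to consider
    """
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = len(set(recommended[:k]) & relevant_set)
    return hits / len(relevant_set)


# Hypothetical ranked output for one Docker repository.
recommended_tags = ["python", "web", "nginx", "database", "linux", "redis"]
true_tags = ["nginx", "redis", "web"]

print(recall_at_k(recommended_tags, true_tags, k=5))
print(recall_at_k(recommended_tags, true_tags, k=6))
```

Averaging this value over all test repositories gives a dataset-level figure comparable in kind to the reported Recall@5 and Recall@10.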
Background: The number of biological knowledge bases and databases storing metabolic pathway information and models has been growing rapidly. These resources are diverse in the type of information and data they hold, in their analytical tools and in their objectives. Here we present a review of the most popular metabolic pathway databases and model repositories, focusing on their scope, their content (including reactions, enzymes, compounds and genes) and their applicability. The review aims to help researchers choose a suitable database or model repository according to the information and data required, by providing an insightful look at each pathway resource. Results: Four pathway databases and three model repositories were selected on the basis of popularity and diversity. Our review showed that the pathway resources vary in many aspects, such as their scope, content, access to data and tools. In addition, inconsistencies were observed in the nomenclature and representation of database entities. The three model repositories reviewed do not offer a brief description of the models' characteristics, such as simulation conditions. Conclusions: The inconsistencies among the databases in representing their contents may hamper the maximal use of the knowledge accumulated in these databases in particular and in the area of systems biology at large. Therefore, it is strongly recommended that database creators and developers of metabolic network models follow international standards for the nomenclature of reactions and metabolites. In addition, computationally generated models obtained from model repositories should be used together with manual curation, as they lack some important components that are necessary for the full functionality of the models.
A growing interest in producing and sharing computable biomedical knowledge artifacts (CBKs) is increasing the demand for repositories that validate, catalog, and provide shared access to CBKs. However, there is a lack of evidence on how best to manage and sustain CBK repositories. In this paper, we present the results of interviews with several pioneering CBK repository owners. These interviews were informed by the Trusted Repositories Audit and Certification (TRAC) framework. Insights gained from these interviews suggest that the organizations operating CBK repositories are somewhat new, that their initial approaches to repository governance are informal, and that achieving economic sustainability for their CBK repositories is a major challenge. To enable a learning health system to make better use of its data intelligence, future approaches to CBK repository management will require enhanced governance and closer adherence to best practice frameworks to meet the needs of myriad biomedical science and health communities. More effort is needed to find sustainable funding models for accessible CBK artifact collections.
GitHub repository recommendation is a research hotspot in the field of open-source software. Current repository recommendation systems suffer from two problems: they make insufficient use of open-source community information, and the scoring metrics used to calculate the matching degree between developers and repositories are designed manually and rely too much on human experience, leading to poor recommendation results. To address these problems, we design a questionnaire to investigate which repository information developers focus on and propose a graph convolutional network-based repository recommendation system (GCNRec). First, to address the insufficient use of information in open-source communities, we construct a Developer-Repository network using four types of behavioral data that best reflect developers' programming preferences, and extract features of developers and repositories from the repository content that developers focus on. Then, we design a repository recommendation model based on a multi-layer graph convolutional network to avoid the manual formulation of scoring metrics. This model takes the Developer-Repository network, developer features and repository features as inputs, and recommends the top-k repositories that developers are most likely to be interested in by learning their preferences. We have verified the proposed GCNRec on the dataset; compared with other open-source repository recommendation methods, GCNRec achieves higher precision and hit rate.
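A single graph convolution step over a small Developer-Repository network can be sketched in plain Python. This uses the widely used symmetric-normalization propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W); whether GCNRec uses exactly this rule is not stated in the abstract, and the graph, features and weights below are invented.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph convolution: aggregate neighbours, transform, apply ReLU."""
    propagated = matmul(matmul(normalize_adjacency(adj), features), weights)
    return [[max(0.0, value) for value in row] for row in propagated]


# Hypothetical graph: nodes 0-1 are developers, nodes 2-3 are repositories.
# An edge means the developer interacted with the repository.
adjacency = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
layer_weights = [[1.0, 0.0], [0.0, 1.0]]  # identity, to expose the aggregation

embeddings = gcn_layer(adjacency, node_features, layer_weights)
for node, vector in enumerate(embeddings):
    print(node, [round(v, 3) for v in vector])
```

Stacking several such layers lets each developer's embedding absorb information from repositories two or more hops away; a recommender can then score developer-repository pairs from the learned embeddings instead of a hand-written formula.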
In the contemporary workplace, organizations emphasize diversity and inclusion initiatives in order to reinforce managerial adaptability, increase competitive advantage and decrease legal risks. Nonetheless, a debate has recently arisen over whether diversity has a direct effect on success. This study examines whether diversity in terms of ethnicity, gender, age, etc., affects success by investigating two data sets. The first is a massive repository of movie and actor data, used to determine whether there is a correlation between multiple movie-related variables and box office earnings. The second covers the Fortune top 500 companies in the United States (US) versus 500 less profitable US companies. Moreover, the study explores how diversity among the Board of Directors (BOD) of Fortune 500 companies affects net sales and gross profits. The movie data set was collected from two main websites, Internet Movie Database (IMDB) and Rotten Tomatoes (RT); the IMDB data set contained 107,645 records, while the Rotten Tomatoes data set contained 13,904 records. In addition, information about the Fortune 500 companies was gathered manually from various websites, as ready-made data sets were hard to find, this being the first study to focus on diversity and the success of Fortune companies. The resulting data set contained about 5358 records covering all BOD members of the Fortune top 500 companies, and 4434 records for the less profitable companies. These data sets were chosen in order to study the ethnic diversity factor and its impact on success rate, and because IMDB and Rotten Tomatoes are the most recognized websites providing access to a massive repository of movie data. The Fortune company data set was chosen to add variety to the data, so that one data set was movie-based and the other enterprise-based. Furthermore, the data was analyzed in Python to establish the relationships between the various variables. In all of the correlation analyses, Pearson's coefficient was less than 0.1. Therefore, it was concluded that ethnic diversity has an insignificant effect on the success of movies and of the Fortune 500 companies.
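The correlation check described above can be reproduced in outline with a hand-written Pearson coefficient. The study's actual scripts and variables are not given, so the function, the 0.1 cut-off applied to the absolute value, and the toy numbers are assumptions for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented toy data: a diversity index per film and a centred earnings figure.
diversity_index = [1.0, 2.0, 3.0, 4.0]
earnings_centred = [1.0, -1.0, -1.0, 1.0]

r = pearson(diversity_index, earnings_centred)
print(f"r = {r:.3f}, negligible = {abs(r) < 0.1}")
```

A coefficient near zero only rules out a linear relationship between the two variables; it says nothing about non-linear effects or about variables that were not measured.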
An ontology and metadata for online learning resource repository management are constructed. First, based on the analysis of the use-case diagram, the upper ontology, which includes a resource library ontology and a user ontology, is illustrated and evaluated in terms of its function and implementation; the corresponding class diagram, resource description framework (RDF) schema and extensible markup language (XML) schema are then given. Secondly, the metadata for online learning resource repository management is proposed, based on the Dublin Core Metadata Initiative and the IEEE Learning Technologies Standards Committee Learning Object Metadata Working Group. Finally, an inference instance is shown, which demonstrates the validity of the ontology and metadata in online learning resource repository management.
Libraries at large academic medical centers in the United States are undergoing a transformation from their traditional role as knowledge repositories to a new role as connectors to knowledge. This transformation is fueled by the move away from library-held print resources as the primary source of information used by researchers, clinicians and students. Knowledge resources critical to the missions of academic medical centers now include online books and journals, very large data sets, software tools, and expertise far beyond the walls of the library. This article illustrates how Bernard Becker Medical Library at Washington University in St. Louis has seized the opportunity to recast itself as a connector to knowledge beyond literature and strengthen its vital role within the university as a catalyst for learning and discovery.
Funding (CAS IR Grid paper): Supported by the Knowledge Innovation Program of the Chinese Academy of Sciences (CAS) and the West Light Foundation of CAS.
Funding (Santiago de Cuba archive paper): This research project was financially supported by the Ministry of Science, Technology and Environment (CITMA) of Cuba (Grant number I-2118025001).
文摘Environmental fungi can damage the documentary heritage conserved in archives and affect the personnel’s health if their concentrations,thermo-hygrometric parameters and ventilation conditions are not adequate,problems that can be accentuated by Climate Change.The aims of this work were to identify and to characterize the airborne fungal pollution of naturally ventilated repositories in the Provincial Historical Archive of Santiago de Cuba and predict the risk that these fungi pose to the staff’s health.Indoor air of three repositories of this archive and the outdoor air were sampled in an occasion every time in 2015,2016 and 2017 using a SAS sampler.The obtained fungal concentrations varied from 135.6 CFU/m^(3) to 421.1 CFU/m^(3) and the indoor/outdoor ratios fluctuated from 0.7 to 4.2,evidencing a variable environmental quality over time,but in the third sampling the repositories environments showed good quality.Aspergillus and Cladosporium were the predominant genera in these environments.A.flavus was a prevailed species in indoor air,while A.niger and Cl.cladosporioides were the species that showed the greatest similarities with the outdoor air.Coremiella and Talaromyces genera as well as the species Aspergillus uvarum,Alternaria ricini and Cladosporium staurophorum were the first findings for environments of Cuban archives.Xerophilic species(A.flavus,A.niger,A.ochraceus,A.ustus)indicators of moisture problems in the repositories were detected;they are also opportunistic pathogens and toxigenic species but their concentrations were higher than the recommended,demonstrating the potential risk to which the archive personnel is exposed in a circumstantial way.
Abstract: Purpose: The China Academic Library & Information System (CALIS) planned to launch an institutional repository (IR) project to promote IR development and open access at colleges and universities in China. To understand the current state of IRs in academic institutions, the CALIS Administrative Center conducted this survey with the help of Peking University Library. Design/methodology/approach: We conducted an online survey of CALIS member libraries. Findings: Firstly, the development of IRs at China's colleges and universities is still in its infancy. Secondly, Chinese colleges and universities have reached a consensus on the objective of having an IR. Thirdly, they have high expectations of IR functions. Fourthly, they prefer to establish a centralized IR system at minimum cost. Finally, there are both similarities and differences between Chinese academic institutions and their counterparts in other countries in the state of IR development. Research limitations: The questionnaire needs to be improved because it lacks sufficient questions for respondents who do not plan to build an IR. The comparatively low rate of valid questionnaire returns can affect the accuracy of the results. It is hard to conduct an in-depth discussion based only on the data collected from this questionnaire survey; consequently, the findings can hardly present an accurate and comprehensive picture of the current state of IR development in the academic sector in China. Practical implications: The survey results provide an essential foundation for the CALIS IR project, and the research can serve as a reference for future studies of the development of IRs at China's colleges and universities. Originality/value: It is the first national survey focused on the development of IRs in academic institutions in China.
Funding: The Swedish Nuclear Fuel and Waste Management Company (SKB) supported the writing of the paper.
Abstract: A computer code for the simulation of groundwater flow and transport is described. The code handles both porous and fractured media. The main intended application is the analysis of a deep repository for nuclear waste, and for this reason flow and transport in sparsely fractured rock is in focus. The mathematical and numerical models are described in some detail. In short, the code is based on the traditional conservation and state laws, but also embodies a number of submodels (subgrid processes, permafrost, etc.). An unstructured Cartesian grid and a finite volume approach are the key elements in the discretization of the basic equations. A multigrid solver is part of the code, as is a parallelization option based on the SPMD (Single-Program Multiple-Data) method. The main application areas are summarized and an application to a deep repository is discussed in more detail.
Funding: Supported by the Fundacao para a Ciencia e a Tecnologia through the LASIGE Research DB/00408/2020, UIDP/00408/2020; supported by the Federal Ministry of Education and Research of Germany (BMBF) under no. 16Dll128 ("Deutsches Internet-Institut").
Abstract: The open science movement has gained significant momentum within the last few years. This comes along with the need to store and share research artefacts, such as publications and research data. For this purpose, research repositories need to be established. A variety of solutions exist for implementing such repositories, covering diverse features ranging from custom depositing workflows to social media-like functions. In this article, we introduce the FAIREST principles, a framework inspired by the well-known FAIR principles but designed to provide a set of metrics for assessing and selecting solutions for creating digital repositories for research artefacts. The goal is to support decision makers in choosing such a solution when planning a repository, especially at an institutional level. The metrics included are therefore based on two pillars: (1) an analysis of established features and functionalities, drawn from existing dedicated, general-purpose and commonly used solutions, and (2) a literature review on general requirements for digital repositories for research artefacts and related systems. We further describe an assessment of 11 widespread solutions, with the goal of providing an overview of the current landscape of research data repository solutions and identifying gaps and research challenges to be addressed.
Funding: Financially supported by the National Natural Science Foundation of China (Grant Nos. 12302503 and U20A20266) and the Scientific and Technological Research Projects in Sichuan Province, China (Grant No. 2023ZYD0154).
Abstract: In the context of repositories for nuclear waste, understanding the behavior of gas migration through clayey rocks with inherent anisotropy is crucial for assessing the safety of geological disposal facilities. The primary mechanism for gas breakthrough is the opening of micro-fractures due to high gas pressure. This occurs at gas pressures lower than the combined strength of the rock and its minimum principal stress under external loading conditions. To investigate the mechanism of microscale mode-I ruptures, it is essential to adopt a multiscale approach that includes subcritical microcracks in the modeling framework. In this contribution, we derive the model from microstructures that contain periodically distributed microcracks within a porous material. The damage evolution law is coupled with the macroscopic poroelastic system by employing the asymptotic homogenization method and considering the inherent hydro-mechanical (HM) anisotropy at the microscale. The resulting permeability change induced by fracture opening is implicitly integrated into the gas flow equation. Verification examples are presented to validate the developed model step by step. An analysis of the local macroscopic response is undertaken to underscore the influence of factors such as strain rate, initial damage and applied stress on the gas migration process. Numerical examples of direct tension tests are used to demonstrate the model's efficacy in describing localized failure characteristics. Finally, the simulation results for preferential gas flow reveal the robustness of the two-scale model in explicitly depicting gas-induced fracturing in anisotropic clayey rocks. The model successfully captures the common behaviors observed in laboratory experiments, such as a sudden drop in gas injection pressure, a rapid build-up of downstream gas pressure, and steady-state gas flow following gas breakthrough.
Abstract: Data repository infrastructures for academics have appeared in waves since the dawn of Web technology. These waves are driven by changes in societal needs, archiving needs and the development of cloud computing resources. As such, the data repository landscape has many flavors when it comes to sustainability models, target audiences and feature sets. One thing that links all data repositories is a desire to make the content they host reusable, building on the core principles of cataloging content for economic and research-speed efficiency. The FAIR principles are a common goal for all repository infrastructures to aim for. No matter the discipline or infrastructure, the goal of reusable content, for both humans and machines, is a common one. This is the first time that repositories can work toward a common goal that ultimately lends itself to interoperability. The idea that research can move further and faster as we un-silo these fantastic resources is an achievable one. This paper investigates the steps that existing repositories need to take in order to remain useful and relevant in a FAIR research world.
Funding: Some of the discussion points in this article and the call for action were developed as part of the joint RDA and Force11 working group and the GO-FAIR StRePo IN. We therefore gratefully acknowledge the support provided by the RDA, Force11 and GO-FAIR communities and structures. FAIRsharing is funded by grants awarded to S.-A.S. that include elements of this work, specifically grants from the UK BBSRC and Research Councils (BB/L024101/1, BB/L005069/1), European Union (H2020-EU.3.1, 634107, H2020-EU.1.4.1.3, 654241, H2020-EU.1.4.1.1, 676559), IMI (116060) and NIH (U54 AI117925, 1U24AI117966-01, 1OT3OD025459-01, 1OT3OD025467-01, 1OT3OD025462-01), and the new FAIRsharing award from the Wellcome Trust (212930/Z/18/Z), as well as a related award (208381/A/17/Z). S.-A.S. is also funded by the Oxford e-Research Centre, Department of Engineering Science of the University of Oxford.
Abstract: Thousands of community-developed (meta)data guidelines, models, ontologies, schemas and formats have been created and implemented by several thousand data repositories and knowledge-bases across all disciplines. These resources are necessary to meet government, funder and publisher expectations of greater transparency and of access to and preservation of data related to research publications. This obligates researchers to ensure their data is FAIR, share their data using the appropriate standards, store their data in sustainable and community-adopted repositories, and conform to funder and publisher data policies. FAIR data sharing also plays a key role in enabling researchers to evaluate, re-analyse and reproduce each other's work. We can map the landscape of relationships between community-adopted standards and repositories, and the journal publisher and funder data policies that recommend their use. In this paper, we show how the work of the GO-FAIR FAIR Standards, Repositories and Policies (StRePo) Implementation Network serves as a central integration and cross-fertilisation point for the reuse of FAIR standards, repositories and data policies in general. Pivotal to this effort is FAIRsharing, an endorsed flagship resource of the Research Data Alliance that maps the landscape of relationships between community-adopted standards and repositories, and the journal publisher and funder data policies that recommend their use. Lastly, we highlight a number of activities around FAIR tools, services and educational efforts to raise awareness and encourage participation.
Funding: Supported by the National Science Fund for Distinguished Young Scholars of China (Grant No. 42125701), the Innovation Program of Shanghai Municipal Education Commission (Grant No. 2023ZKZD26), the Fundamental Research Funds for the Central Universities, and the Top Discipline Plan of Shanghai Universities-Class I.
Abstract: In a deep geological repository for high-level radioactive waste (HLW), bentonite is compacted uniaxially and then arranged vertically in engineered barriers. This assembly scheme induces an initial anisotropy which, with hydration, develops more evidently under chemical conditions. To investigate the anisotropic swelling of compacted Gaomiaozi (GMZ) bentonite and its further response to saline effects, a series of constant-volume swelling pressure tests were performed. Results showed that dry density enhanced the bentonite swelling and raised the final anisotropy, whereas salinity inhibited the bentonite swelling but still promoted the final anisotropy. The final anisotropy coefficient (ratio of radial to axial pressure) obeyed the Boltzmann sigmoid attenuation function, decreasing with concentration and dry density and converging to a minimum value of 0.76. A staged evolution of the anisotropy coefficient was discovered: salinity inhibited the rise of the anisotropy coefficient (Dd) in the isotropic process more than the valley (d1) in the anisotropic process, leading to an increase in the final anisotropy. The isotropic stage amplified the impact of soil structure rearrangement on the macro-swelling pressure values. Thus, a new method for predicting the swelling pressures of compacted bentonite was proposed by expanding the equations of Gouy-Chapman theory with a dissipative wedge term. An evolutionary function was constructed, revealing the correlation between the occurrence time and the pressure value due to the structure rearrangement and the former crystalline swelling. Accordingly, a design reference for dry density was given, based on the chemical conditions around the preselected site in Beishan, China. The anisotropy promoted by salinity would cause a greater drop in radial pressure, making the previous threshold on axial swelling fail.
Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation (2023A1515011244).
Abstract: The state of in situ stress is a crucial parameter in subsurface engineering, especially for critical projects such as nuclear waste repositories. As one of the two ISRM suggested methods, the overcoring (OC) method is widely used to estimate the full stress tensors in rocks by independent regression analysis of the data from each OC test. However, such customary independent analysis of individual OC tests, known as no pooling, is liable to yield unreliable test-specific stress estimates because of the various uncertainty sources involved in the OC method. To address this problem, a practical and no-cost solution is to incorporate into the OC data analysis the additional information implied by adjacent OC tests, which are usually available in OC measurement campaigns. Hence, this paper presents a Bayesian partial pooling (hierarchical) model for the combined analysis of adjacent OC tests. We performed five case studies using OC test data measured at a nuclear waste repository research site in Sweden. The results demonstrate that partial pooling of adjacent OC tests indeed allows information to be borrowed across adjacent tests, and simultaneously yields improved stress tensor estimates with reduced uncertainties for all individual tests compared with independent no-pooling analysis, particularly for tests whose no-pooling stress estimates are unreliable. A further model comparison shows that the partial pooling model also gives better predictive performance, and thus confirms that the information borrowed across adjacent OC tests is relevant and effective.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000803 and the National Natural Science Foundation of China under Grant Nos. 61732019 and 61572480.
Abstract: Docker has recently become the mainstream technology for providing reusable software artifacts. Developers can easily build and deploy their applications using Docker. Currently, a large number of reusable Docker images are publicly shared in online communities, and semantic tags can be created to help developers effectively reuse the images. However, the communities do not provide tagging services, and manual tagging is exhausting and time-consuming. This paper addresses the problem through a semi-supervised learning-based approach named SemiTagRec. SemiTagRec contains four components: (1) the predictor, which calculates the probability of assigning a specific tag to a given Docker repository; (2) the extender, which introduces new tags as candidates based on tag correlation analysis; (3) the evaluator, which measures the candidate tags based on a logistic regression model; (4) the integrator, which calculates a final score by combining the results of the predictor and the evaluator, and then assigns the tags with high scores to the given Docker repositories. SemiTagRec includes the newly tagged repositories in the training data for the next round of training. In this way, SemiTagRec iteratively trains the predictor with the cumulative tagged repositories and the extended tag vocabulary to achieve high tag recommendation accuracy. Finally, the experimental results show that SemiTagRec outperforms the other approaches, and that its accuracy in terms of Recall@5 and Recall@10 is 0.688 and 0.781, respectively.
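The integrator step and the Recall@k metric mentioned above can be sketched as follows. This is only an illustration: the weighted average with `alpha`, the helper names and the toy scores are assumptions, and the Recall@k definition used here (fraction of a repository's true tags found among the top k recommendations) is one common variant that may differ from the paper's.

```python
from typing import Dict, Iterable, List


def integrate(pred: Dict[str, float], evalr: Dict[str, float],
              alpha: float = 0.5) -> Dict[str, float]:
    """Combine predictor and evaluator scores per tag (assumed weighted mean)."""
    tags = set(pred) | set(evalr)
    return {t: alpha * pred.get(t, 0.0) + (1 - alpha) * evalr.get(t, 0.0)
            for t in tags}


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring tags; ties are broken alphabetically."""
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ordered[:k]]


def recall_at_k(recommended: List[str], true_tags: Iterable[str], k: int) -> float:
    """Fraction of the true tags that appear in the top-k recommendations."""
    truth = set(true_tags)
    if not truth:
        return 0.0
    return len(set(recommended[:k]) & truth) / len(truth)


# Toy scores for a single hypothetical repository.
predictor_scores = {"python": 0.9, "web": 0.4, "db": 0.2}
evaluator_scores = {"web": 0.8, "db": 0.1, "cache": 0.6}
final_scores = integrate(predictor_scores, evaluator_scores)
recommended = top_k(final_scores, 2)
recall = recall_at_k(recommended, {"web", "cache"}, 2)
```

In an evaluation, the per-repository values would typically be averaged over a test set to give a single Recall@k figure.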
Abstract: Background: The number of biological knowledge bases/databases storing metabolic pathway information and models has been growing rapidly. These resources are diverse in the type of information/data, the analytical tools, and objectives. Here we present a review of the most popular metabolic pathway databases and model repositories, focusing on their scope, their content (including reactions, enzymes, compounds, and genes), and their applicability. The review aims to help researchers choose a suitable database or model repository according to the information and data required by providing an insightful look at each pathway resource. Results: Four pathway databases and three model repositories were selected on the basis of popularity and diversity. Our review showed that the pathway resources vary in many aspects, such as their scope, content, access to data and tools. In addition, inconsistencies have been observed in the nomenclature and representation of database entities. The three model repositories reviewed do not offer a brief description of the models' characteristics, such as simulation conditions. Conclusions: The inconsistencies among the databases in representing their contents may hamper the maximal use of the knowledge accumulated in these databases in particular and in the area of systems biology at large. Therefore, it is strongly recommended that database creators and metabolic network model developers follow international standards for the nomenclature of reactions and metabolites. Besides, computationally generated models obtained from model repositories should be used with manual curation, as they lack some important components that are necessary for the full functionality of the models.
Abstract: A growing interest in producing and sharing computable biomedical knowledge artifacts (CBKs) is increasing the demand for repositories that validate, catalog, and provide shared access to CBKs. However, there is a lack of evidence on how best to manage and sustain CBK repositories. In this paper, we present the results of interviews with several pioneering CBK repository owners. These interviews were informed by the Trusted Repositories Audit and Certification (TRAC) framework. Insights gained from these interviews suggest that the organizations operating CBK repositories are somewhat new, that their initial approaches to repository governance are informal, and that achieving economic sustainability for their CBK repositories is a major challenge. To enable a learning health system to make better use of its data intelligence, future approaches to CBK repository management will require enhanced governance and closer adherence to best practice frameworks to meet the needs of myriad biomedical science and health communities. More effort is needed to find sustainable funding models for accessible CBK artifact collections.
Funding: Supported by Special Funds for the Construction of an Innovative Province of Hunan, No. 2020GK2028.
Abstract: GitHub repository recommendation is a research hotspot in the field of open-source software. The current problems with repository recommendation systems are the insufficient utilization of open-source community information and the fact that the scoring metrics used to calculate the matching degree between developers and repositories are developed manually and rely too much on human experience, leading to poor recommendation results. To address these problems, we design a questionnaire to investigate which repository information developers focus on and propose a graph convolutional network-based repository recommendation system (GCNRec). First, to address insufficient information utilization in open-source communities, we construct a Developer-Repository network using four types of behavioral data that best reflect developers' programming preferences and extract features of developers and repositories from the repository content that developers focus on. Then, we design a repository recommendation model based on a multi-layer graph convolutional network to avoid the manual formulation of scoring metrics. This model takes the Developer-Repository network, developer features and repository features as inputs, and recommends the top-k repositories that developers are most likely to be interested in by learning their preferences. We have verified the proposed GCNRec on the dataset, and in comparison with other open-source repository recommendation methods, GCNRec achieves higher precision and hit rate.
Abstract: In the contemporary workplace, organizations are emphasizing diversity and inclusion initiatives in order to reinforce managerial adaptability, increase competitive advantage and decrease legal risks. Nonetheless, a debate has recently arisen over whether diversity has an immediate effect on success. This study set out to determine whether diversity in terms of ethnicity, gender, age, etc., affects success by investigating two different data sets. The first is a massive repository of movie and actor data, used to determine whether multiple movie-related variables correlate with box office earnings. The second covers the Fortune top 500 companies in the United States (US) versus 500 less profitable US companies. Moreover, the study explores how diversity among the Board of Directors (BOD) of Fortune 500 companies affects net sales and gross profits. The movie data set was collected from two main websites, Internet Movie Database (IMDB) and Rotten Tomatoes (RT); the IMDB data set contained 107,645 records, while the Rotten Tomatoes data set contained 13,904 records. In addition, information about Fortune 500 companies was obtained manually from various websites, as ready-made data sets were hard to find because this is the first study to focus on diversity and the success of Fortune companies. That data set contained about 5358 BOD records for the Fortune top 500 companies and 4434 records for the less profitable companies. These data sets were chosen to study the ethnic diversity factor and its impact on success rate, and because IMDB and Rotten Tomatoes are the most recognized websites providing access to a massive repository of movie data. The Fortune company data set was chosen so that the selected data were themselves diverse, with one data set movie-based and the other enterprise-based.
Furthermore, the data were analyzed in Python to establish the relationships between the various variables. In all of the correlation analyses, Pearson's coefficient was less than 0.1. Therefore, it was concluded that ethnic diversity has an insignificant effect on the success of movies and of the Fortune 500 companies.
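For readers unfamiliar with the statistic, the following is a minimal sketch of how a Pearson correlation coefficient can be computed in Python. It is not the study's code: the function name `pearson` and the toy numbers are assumptions, and the study's own analysis may well have relied on a library routine instead.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Toy data (hypothetical): a diversity score and an outcome with no linear relation.
diversity = [1.0, 2.0, 3.0, 4.0]
outcome = [1.0, -1.0, -1.0, 1.0]
r = pearson(diversity, outcome)
```

A coefficient near 0, as in this toy case, indicates no linear relationship, while values near +1 or -1 indicate a strong one; an absolute value below 0.1 is conventionally read as negligible.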
Funding: The Advanced University Action Plan of the Ministry of Education of China (2004XD-03).
Abstract: An ontology and metadata for online learning resource repository management are constructed. First, based on an analysis of the use-case diagram, the upper ontology, which includes a resource library ontology and a user ontology, is illustrated and evaluated in terms of its function and implementation; then the corresponding class diagram, resource description framework (RDF) schema and extensible markup language (XML) schema are given. Secondly, the metadata for online learning resource repository management is proposed based on the Dublin Core Metadata Initiative and the IEEE Learning Technologies Standards Committee Learning Object Metadata Working Group. Finally, an inference instance is shown, which proves the validity of the ontology and metadata in online learning resource repository management.
Abstract: Libraries at large academic medical centers in the United States are undergoing a transformation from their traditional role as knowledge repositories to a new role as connectors to knowledge. This transformation is fueled by the move away from library-held print resources as the primary source of information used by researchers, clinicians and students. Knowledge resources critical to the missions of academic medical centers now include online books and journals, very large data sets, software tools, and expertise far beyond the walls of the library. This article illustrates how Bernard Becker Medical Library at Washington University in St. Louis has seized the opportunity to recast itself as a connector to knowledge beyond literature and to strengthen its vital role within the university as a catalyst for learning and discovery.