Authors
Pasquale Pagano Leonardo Candela Donatella Castelli
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.12, pp.GRDI19-GRDI25, 2013 (Released:2013-07-23)
References
33
Cited by
17 2
Authors
Terry Eastwood
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.3, pp.202-208, 2004 (Released:2006-01-05)
References
3
Cited by
8

This paper aims to extract lessons from archivists' experience of appraising electronic records that are likely to have wider application in the preservation of other digital materials, including scientific data. It relies mainly on the work of the Appraisal Task Force of the InterPARES project on long-term preservation of authentic electronic records to develop a picture of the process of appraisal. It concludes that the aspects of assessment of authenticity, determination of the feasibility of preservation, and monitoring electronic records as they are maintained in the live environment are likely to find counterparts in attempts to appraise digital objects for long-term preservation in the scientific community. It also argues that the activities performed during appraisal constitute the first vital step in the process of preservation of digital materials.
Authors
E. Wulff-Barreiro
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.10, pp.13-23, 2011-11-27 (Released:2011-11-27)
References
17
Cited by
1

As observational data have attained a new legal status that allows their integration into open Internet systems, and experimental data continue to be assembled on common, free platforms, state-of-the-art, easy-to-access data repositories have been designed in Spain. These repositories have removed many obstacles to the reuse of GIS and other data. European legislation has also made advances in opening biodiversity data, including a European space in the Latin-American grid infrastructure. Open access biomedical repositories attract commercial attention while astronomical, meteorological, and oncological institutions promote data quality and access. This paper describes recent approaches to open access data for science in Spain.
Authors
Eric Reiter
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.6, pp.7-18, 2007 (Released:2007-01-30)
References
11
Cited by
2

Chemical analyses of volcanic gases consist of: location of sampling, date of sampling, identification of the sampling, etc. Nowadays, these data are generally represented in different formats. All of these formats are inflexible and machine dependent. XML has become the most important method of transferring data between computers. VolcanoGasML is a new format, based on XML, for the chemical analyses of volcanic gases. Its definition is divided into several layers: the first one describes the general information concerning the sample, the second, which is organized in several sublayers, contains the chemical data.
Authors
Leonardo Candela Donatella Castelli Pasquale Pagano
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.12, pp.GRDI75-GRDI81, 2013 (Released:2013-08-10)
References
28
Cited by
65

Virtual Research Environments are innovative, web-based, community-oriented, comprehensive, flexible, and secure working environments conceived to serve the needs of modern science. We overview the existing initiatives developing these environments by highlighting the major distinguishing features. We envisage a future where regardless of geographical location, scientists will be able to use their Web browsers to seamlessly access data, software, and processing resources that are managed by diverse systems in separate administration domains via Virtual Research Environments. We identify and discuss the major challenges that should be resolved to fully achieve the proposed vision, i.e., large-scale integration and interoperability, sustainability, and adoption.
Authors
Stuart E Middleton Zoheir A Sabeur Peter Löwe Martin Hammitzsch Siamak Tavakoli Stefan Poslad
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.12, pp.WDS109-WDS113, 2013 (Released:2013-03-05)
References
19
Cited by
8

We describe our knowledge-based service architecture for multi-risk environmental decision support, capable of handling geo-distributed, heterogeneous, real-time data sources. Data sources include tide gauges, buoys, seismic sensors, satellites, earthquake alerts, Web 2.0 feeds used to crowdsource 'unconventional' measurements, and simulations of tsunami wave propagation. Our system-of-systems multi-bus architecture provides a scalable, high-performance messaging backbone. We are overcoming semantic interoperability barriers between heterogeneous datasets by using a self-describing 'plug-in' data source approach. As crises develop, we can agilely steer the processing server and adapt data fusion and mining algorithm configurations in real time.
Authors
Jacek Becla Kian-Tat Lim Daniel Liwei Wang
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.8, pp.MR1-MR16, 2010-02-24 (Released:2010-02-24)
Cited by
3

Academic and industrial users are increasingly facing the challenge of petabytes of data, but managing and analyzing such large data sets still remains a daunting task. Both the database and the map/reduce communities worldwide are working on addressing these issues. The 3rd Extremely Large Databases workshop was organized to examine the needs of scientific communities beginning to face these issues, to reach out to European communities working on extremely large scale data challenges, and to brainstorm possible solutions. The science benchmark that emerged from the 2nd workshop in this series was also debated. This paper is the final report of the discussions and activities at this workshop.
Authors
S. Levitus
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.11, pp.46-71, 2012 (Released:2012-08-11)
References
71
Cited by
15 1

We document the history and progress of two international ocean data management projects. The "Global Oceanographic Data Archaeology and Rescue" project was initiated in 1993 under the auspices of the UNESCO Intergovernmental Oceanographic Commission (IOC). The project has the goal of locating (archaeology) and digitizing or copying to modern electronic media (rescuing) historical (pre-1992) oceanographic data that exist in manuscript or electronic media form that are at risk of loss due to media decay. The IOC "World Ocean Database" project initiated in 2001 focuses on encouraging international data exchange for the post-1991 period and the development of regional atlases.
Authors
A. Asserson K. Jeffery
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.9, pp.CRIS14-CRIS23, 2010-07-24 (Released:2010-07-24)
References
64
Cited by
2 3

CRIS (Current Research Information Systems) provide researchers, research managers, innovators, and others with a view over the research activity of a domain. IRs (institutional repositories) provide a mechanism for an organisation to showcase its intellectual property through OA (open access). Increasingly, organisations are mandating that their employed researchers deposit peer-reviewed published material in the IR. Research funders are increasingly mandating that publications be deposited in an open access repository: some mandate a central (or subject-based) repository, some an IR. In parallel, publishers are offering OA but replacing subscription-based access with author (or author institution) payment for publishing. However, many OA repositories have metadata based on DC (Dublin Core), which is inadequate; a CERIF (Common European Research Information Format) CRIS provides metadata describing publications with formal syntax and declared semantics, thus facilitating interoperation or homogeneous access over heterogeneous sources. The formality is essential for research output metrics, which are increasingly being used to determine future funding for research organisations.
Authors
Ge Peng Jeffrey L Privette Edward J Kearns Nancy A Ritchey Steve Ansari
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.13, pp.231-253, 2015 (Released:2015-02-02)
References
55
Cited by
30

This paper presents a stewardship maturity assessment model in the form of a matrix for digital environmental datasets. Nine key components are identified based on requirements imposed on digital environmental data and information that are cared for and disseminated by U.S. Federal agencies. These requirements stem from U.S. law (i.e., the Information Quality Act of 2001), agencies’ guidance, expert bodies’ recommendations, and users. The components are: preservability, accessibility, usability, production sustainability, data quality assurance, data quality control/monitoring, data quality assessment, transparency/traceability, and data integrity. A five-level progressive maturity scale is then defined for each component, associated with measurable practices applied to individual datasets and representing Ad Hoc, Minimal, Intermediate, Advanced, and Optimal stages. The rationale for each key component and its maturity levels is described. This maturity model, leveraging community best practices and standards, provides a unified framework for assessing scientific data stewardship. It can be used to create a stewardship maturity scoreboard of dataset(s) and a roadmap for scientific data stewardship improvement, or to provide data quality and usability information to users, stakeholders, and decision makers.
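The scoreboard idea in this abstract can be illustrated with a minimal Python sketch. The component and level names come from the abstract; the function names `scoreboard` and `roadmap`, the numeric 1 to 5 encoding, and the sample ratings are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical sketch of a stewardship maturity scoreboard.
# Component and level names follow the abstract; everything else is assumed.
LEVELS = ["Ad Hoc", "Minimal", "Intermediate", "Advanced", "Optimal"]
COMPONENTS = [
    "preservability",
    "accessibility",
    "usability",
    "production sustainability",
    "data quality assurance",
    "data quality control/monitoring",
    "data quality assessment",
    "transparency/traceability",
    "data integrity",
]


def scoreboard(ratings):
    """Map each component's level name to a numeric score from 1 to 5."""
    missing = [c for c in COMPONENTS if c not in ratings]
    if missing:
        raise ValueError(f"unrated components: {missing}")
    # LEVELS.index raises ValueError if a level name is not recognised.
    return {c: LEVELS.index(ratings[c]) + 1 for c in COMPONENTS}


def roadmap(scores, target=3):
    """List the components whose score is below the target level."""
    return [c for c in COMPONENTS if scores[c] < target]


# Made-up ratings for one dataset, for illustration only.
example_ratings = {c: "Intermediate" for c in COMPONENTS}
example_ratings["preservability"] = "Minimal"
example_ratings["data integrity"] = "Advanced"

example_scores = scoreboard(example_ratings)
example_roadmap = roadmap(example_scores, target=3)
```

Here `example_roadmap` lists only the components rated below Intermediate, which is one simple way to read an improvement roadmap off the matrix.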
Authors
N Moles
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.13, pp.192-202, 2015 (Released:2015-01-27)
References
15
Cited by
2

With the growing importance of data to the scholarly record and the critical role journals play in facilitating data sharing, the complex landscape of scholarly journal data publication policies has become an obstacle for research. This paper outlines Data-PE, a framework for evaluating these policies. It takes the form of a conceptual foundation, comprising twelve criteria for evaluation, operationalized through an evaluation tool. Its objective is to function as a flexible means for a variety of stakeholders to appraise individual policies. Examples of the use of the framework are provided and means for the validation of the tool are discussed.
Authors
Siri Krishan Wasan Vasudha Bhatnagar Harleen Kaur
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.5, pp.119-126, 2006 (Released:2006-11-28)
References
10
Cited by
44

Medical data mining has great potential for exploring the hidden patterns in the data sets of the medical domain. These patterns can be utilized for clinical diagnosis. However, the available raw medical data are widely distributed, heterogeneous in nature, and voluminous. These data need to be collected in an organized form. The collected data can then be integrated to form a hospital information system. Data mining technology provides a user-oriented approach to novel and hidden patterns in the data. Data mining and statistics both strive towards discovering patterns and structures in data. Statistics deals with heterogeneous numbers only, whereas data mining deals with heterogeneous fields. We identify a few areas of healthcare where these techniques can be applied to healthcare databases for knowledge discovery. In this paper we briefly examine the impact of data mining techniques, including artificial neural networks, on medical diagnostics.
Authors
J. H. Westbrook
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.2, pp.198-212, 2003 (Released:2006-01-05)
References
35
Cited by
6

The availability of the Internet has provided unprecedented opportunities for both data compilers and users. With respect to materials data, this paper explores the following questions: How do we know what is available? How can data be accessed, interpreted, and exchanged? What novel modes of presentation are now available? What organizations are active in this field and what are their programs? What improvements are needed? Where do we go from here, and how? Examples are given of specific materials databases available on the Internet from a variety of materials data fields, e.g., fundamental data, engineering design properties, environmental data, and materials safety data. While there is no question that large and widely varied bodies of data are accessible on the Internet, significant improvements are needed promptly. The paper concludes by summarizing these problems and possible means for their alleviation.

Data Discovery

Authors
Gerhard Weikum
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.12, pp.GRDI26-GRDI31, 2013 (Released:2013-07-23)
References
24
Authors
Guan-Hua Xu
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.6, pp.OD21-OD25, 2007 (Released:2007-06-29)
Cited by
5

As an important part of the science and technology infrastructure platform of China, the Ministry of Science and Technology launched the Scientific Data Sharing Program in 2002. Twenty-four government agencies now participate in the Program. After five years of hard work, great progress has been achieved in the policy and legal framework, data standards, pilot projects, and international cooperation. By the end of 2005, one-third of the existing public-interest and basic scientific databases in China had been integrated and upgraded. By 2020, China is expected to build a more user-friendly scientific data management and sharing system, with 80 percent of scientific data available to the general public. In order to realize this objective, the emphases of the project are to perfect the policy and legislation system, improve the quality of data resources, expand and establish national scientific data centers, and strengthen international cooperation. It is believed that with the opening up of access to scientific data in China, the Program will play a bigger role in promoting science and national innovation.
Authors
Nidhi Kushwaha O P Vyas
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.13, pp.119-126, 2014 (Released:2014-11-14)
References
10
Cited by
2

The Semantic Web (Web 3.0) has been proposed as an efficient way to access the increasingly large amounts of data on the internet. The Linked Open Data Cloud project is at present the major effort to implement the concepts of the Semantic Web, addressing the problems of inhomogeneity and large data volumes. RKBExplorer is one of many repositories implementing Open Data and contains considerable bibliographic information. This paper discusses bibliographic data, an important part of cloud data. Effective searching of bibliographic datasets can be a challenge, as many of the papers residing in these databases do not have sufficient or comprehensive keyword information. In these cases, a search engine based on RKBExplorer can retrieve papers only by author name and paper title, not by keyword. In this paper we attempt to address this problem by using the data mining algorithm Association Rule Mining (ARM) to develop keywords based on features retrieved from Resource Description Framework (RDF) data within a bibliographic citation. We have demonstrated the applicability of this method for predicting missing keywords for bibliographic entries in several typical databases.
¹ Paper presented at the 1st International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2014), March 27-28, 2014. Organized by VIT University, Chennai, India. Sponsored by BRNS.
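As a rough illustration of how association rules could suggest missing keywords, the following Python sketch mines simple one-feature-to-one-keyword rules by support and confidence. It is not the authors' implementation: the feature encoding (`author:...`, `term:...`), the thresholds, the function names `mine_rules` and `predict_keywords`, and the sample records are all hypothetical, and real ARM algorithms such as Apriori also consider multi-item rule antecedents.

```python
from collections import Counter
from itertools import product


def mine_rules(records, min_support=2, min_confidence=0.6):
    """Mine feature -> keyword rules from (features, keywords) records.

    Support is the number of records containing both the feature and the
    keyword; confidence is that count divided by the feature's count.
    """
    feature_counts = Counter()
    pair_counts = Counter()
    for features, keywords in records:
        feature_counts.update(set(features))
        pair_counts.update(product(set(features), set(keywords)))
    rules = {}
    for (feature, keyword), count in pair_counts.items():
        confidence = count / feature_counts[feature]
        if count >= min_support and confidence >= min_confidence:
            rules.setdefault(feature, {})[keyword] = confidence
    return rules


def predict_keywords(features, rules):
    """Suggest keywords for an entry, ordered by confidence and then name."""
    predicted = {}
    for feature in set(features):
        for keyword, confidence in rules.get(feature, {}).items():
            predicted[keyword] = max(confidence, predicted.get(keyword, 0.0))
    return sorted(predicted, key=lambda k: (-predicted[k], k))


# Made-up bibliographic entries: (features from the citation, known keywords).
records = [
    ({"author:smith", "term:ontology"}, {"semantic web"}),
    ({"author:smith", "term:rdf"}, {"semantic web", "linked data"}),
    ({"author:jones", "term:rdf"}, {"linked data"}),
    ({"author:jones", "term:clustering"}, {"data mining"}),
]

rules = mine_rules(records)
suggested = predict_keywords({"author:smith", "term:rdf"}, rules)
```

With these thresholds only two rules survive (`author:smith` implies `semantic web`, and `term:rdf` implies `linked data`), so an entry with both features and no keywords receives both suggestions.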
Authors
Dijk E. van Meel M.
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.9, pp.CRIS59-CRIS65, 2010
Cited by
2

Scholarly publications are a major part of the research infrastructure. One way to make output available is to store the publications in Open Access Repositories (OAR). A Current Research Information System (CRIS) that conforms to the standard CERIF (Common European Research Information Format) could be a key component in the e-infrastructure. A CRIS provides the structure and makes it possible to interoperate the CRIS metadata at every stage of the research cycle. The international DRIVER projects are creating a European repository infrastructure. Knowledge Exchange has launched a project to develop a metadata exchange format for publications between CRIS and OAR systems.
Authors
Xudong Wang
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.6, pp.S698-S714, 2007 (Released:2007-10-23)
References
9
Cited by
1

Digital World History is a new expression of world history (or maybe "a new method for world history expression") and a paradigm of world history description, study, and application by virtual informatization and recovery. It is also a comprehensive systematic study through dynamic marks, integrated description, and retrieval of human society evolution and its causality, dependent on the theory and methodology of digitization information. It aims at breaking the limitation of diachronic language attributed to the process of history cognition, summation, and recovery, addressing a possible scheme to fuse historical factors in relation to changing history, and dynamically applying a multiplicity of results so that the discipline of world history can meet the needs of the information-equipped society of the 21st century. In this article, the author uses theoretical modelling methods, resulting in a blueprint of the quality issue, namely the Digital World History premise, and a paradigm for setting the foundation and scientific data strategy as a basis for its necessity.
Authors
D Haglin R Roiger J Hakkila T Giblin
Publisher
CODATA
Journal
Data Science Journal (ISSN:16831470)
Volume, pages, publication date
vol.4, pp.39-53, 2005 (Released:2006-01-05)
References
22
Cited by
6

The scientific method encourages sharing data with other researchers to independently verify conclusions. Currently, technical barriers impede such public scrutiny. A strategy for offering scientific data for public analysis is described. With this strategy, effectively no requirements of software installation (other than a web browser) or data manipulation are imposed on other researchers to prepare for perusing the scientific data. A prototype showcasing this strategy is described.