Term Definitions Version 1.0 "Montreal"

Data Foundation and Terminology Interest Group (DFT IG) of the Research Data Alliance (RDA)

This file is an export of the TeD-T term collection of the DFT IG, generated on Tue Sep 05 16:44:37 CEST 2017.

http://smw-rda.esc.rzg.mpg.de

A

API Consumer LayerDefinition21.11109/dft/1.0/API_Consumer_Layer
AccessDefinition21.11109/dft/1.0/Access
Access ControlDefinition21.11109/dft/1.0/Access_Control
Access PolicyDefinition21.11109/dft/1.0/Access_Policy
Access RoleDefinition21.11109/dft/1.0/Access_Role
Access WorkflowDefinition21.11109/dft/1.0/Access_Workflow
Access a repositoryDefinition21.11109/dft/1.0/Access_a_repository
Access control listDefinition21.11109/dft/1.0/Access_control_list
AccessibleDefinition21.11109/dft/1.0/Accessible
Active CollectionDefinition 1Definition 221.11109/dft/1.0/Active_Collection
Active DataDefinition21.11109/dft/1.0/Active_Data
Adaptive IndexingDefinition21.11109/dft/1.0/Adaptive_Indexing
Add a retention periodDefinition21.11109/dft/1.0/Add_a_retention_period
Addition of access controlsDefinition21.11109/dft/1.0/Addition_of_access_controls
Administrative metadataDefinition21.11109/dft/1.0/Administrative_metadata
AggregationDefinition21.11109/dft/1.0/Aggregation
AnalyticsDefinition21.11109/dft/1.0/Analytics
AnnotationDefinition21.11109/dft/1.0/Annotation
ArchitectureDefinition21.11109/dft/1.0/Architecture
Archival DescriptionDefinition21.11109/dft/1.0/Archival_Description
ArchiveDefinition21.11109/dft/1.0/Archive
ArchivingDefinition21.11109/dft/1.0/Archiving
Arrangement:Definition21.11109/dft/1.0/Arrangement:
AttributeDefinition21.11109/dft/1.0/Attribute
AuthenticationDefinition21.11109/dft/1.0/Authentication
Authentication SystemDefinition21.11109/dft/1.0/Authentication_System
Authenticity metadataDefinition21.11109/dft/1.0/Authenticity_metadata
AuthorizationDefinition21.11109/dft/1.0/Authorization
Authorization SystemDefinition21.11109/dft/1.0/Authorization_System

B

Big DataDefinition21.11109/dft/1.0/Big_Data
Bit SequenceDefinition21.11109/dft/1.0/Bit_Sequence
Bit StreamDefinition21.11109/dft/1.0/Bit_Stream
BlockchainDefinition21.11109/dft/1.0/Blockchain
BlueprintDefinition21.11109/dft/1.0/Blueprint

C

Canonical Data CollectionDefinition21.11109/dft/1.0/Canonical_Data_Collection
Canonical Metadata PackagesDefinition21.11109/dft/1.0/Canonical_Metadata_Packages
CatalogDefinition21.11109/dft/1.0/Catalog
CataloguingDefinition21.11109/dft/1.0/Cataloguing
Category RegistryDefinition21.11109/dft/1.0/Category_Registry
CertificateDefinition21.11109/dft/1.0/Certificate
Certification ProcessDefinition21.11109/dft/1.0/Certification_Process
ChecksumDefinition 1Definition 221.11109/dft/1.0/Checksum
Choosing a storage locationDefinition21.11109/dft/1.0/Choosing_a_storage_location
Citable DataDefinition21.11109/dft/1.0/Citable_Data
CitationDefinition21.11109/dft/1.0/Citation
Citation MetadataDefinition21.11109/dft/1.0/Citation_Metadata
Cited referenceDefinition21.11109/dft/1.0/Cited_reference
CollectionDefinition 1Definition 2Definition 321.11109/dft/1.0/Collection
Collection DevelopmentDefinition21.11109/dft/1.0/Collection_Development
Collection ManagementDefinition21.11109/dft/1.0/Collection_Management
Collection Management IdentificationDefinition21.11109/dft/1.0/Collection_Management_Identification
Collection RegistryDefinition21.11109/dft/1.0/Collection_Registry
Collection of DataDefinition21.11109/dft/1.0/Collection_of_Data
Collection virtualizationDefinition21.11109/dft/1.0/Collection_virtualization
CommunicationDefinition21.11109/dft/1.0/Communication
ComponentsDefinition21.11109/dft/1.0/Components
ConceptDefinition21.11109/dft/1.0/Concept
Conceptual ObjectDefinition21.11109/dft/1.0/Conceptual_Object
ContainerDefinition21.11109/dft/1.0/Container
Content ReplicationDefinition21.11109/dft/1.0/Content_Replication
Contextual MetadataDefinition21.11109/dft/1.0/Contextual_Metadata
Contextual metadata extractionDefinition21.11109/dft/1.0/Contextual_metadata_extraction
Controlled VocabularyDefinition21.11109/dft/1.0/Controlled_Vocabulary
CopyrightDefinition21.11109/dft/1.0/Copyright
Copyright InfringementDefinition21.11109/dft/1.0/Copyright_Infringement
CorpuseDefinition21.11109/dft/1.0/Corpuse
Create derived data productsDefinition21.11109/dft/1.0/Create_derived_data_products
Crowdsourcing CurationDefinition21.11109/dft/1.0/Crowdsourcing_Curation
CurationDefinition21.11109/dft/1.0/Curation
Curation MetadataDefinition21.11109/dft/1.0/Curation_Metadata
Curation WorkflowDefinition21.11109/dft/1.0/Curation_Workflow
CyberinfrastructureDefinition21.11109/dft/1.0/Cyberinfrastructure

D

Darwin CoreDefinition21.11109/dft/1.0/Darwin_Core
DataDefinition 1Definition 2Definition 3Definition 4Definition 5Definition 621.11109/dft/1.0/Data
Data AccessDefinition21.11109/dft/1.0/Data_Access
Data AggregateDefinition21.11109/dft/1.0/Data_Aggregate
Data AnalysisDefinition21.11109/dft/1.0/Data_Analysis
Data AnalyticsDefinition21.11109/dft/1.0/Data_Analytics
Data ArchitectDefinition21.11109/dft/1.0/Data_Architect
Data ArchivingDefinition21.11109/dft/1.0/Data_Archiving
Data ArrangementDefinition21.11109/dft/1.0/Data_Arrangement
Data BrokerDefinition21.11109/dft/1.0/Data_Broker
Data CatalogDefinition21.11109/dft/1.0/Data_Catalog
Data CitationDefinition21.11109/dft/1.0/Data_Citation
Data Citation MetadataDefinition21.11109/dft/1.0/Data_Citation_Metadata
Data CleaningDefinition21.11109/dft/1.0/Data_Cleaning
Data CollectionDefinition21.11109/dft/1.0/Data_Collection
Data ConsumerDefinition21.11109/dft/1.0/Data_Consumer
Data ContainerDefinition21.11109/dft/1.0/Data_Container
Data CurationDefinition21.11109/dft/1.0/Data_Curation
Data DepositDefinition21.11109/dft/1.0/Data_Deposit
Data DictionaryDefinition21.11109/dft/1.0/Data_Dictionary
Data DiscoveryDefinition21.11109/dft/1.0/Data_Discovery
Data EcosystemDefinition21.11109/dft/1.0/Data_Ecosystem
Data ElementDefinition21.11109/dft/1.0/Data_Element
Data EntityDefinition21.11109/dft/1.0/Data_Entity
Data Flow VirtualizationDefinition21.11109/dft/1.0/Data_Flow_Virtualization
Data FormatDefinition21.11109/dft/1.0/Data_Format
Data IdentifierDefinition21.11109/dft/1.0/Data_Identifier
Data IntegrationDefinition21.11109/dft/1.0/Data_Integration
Data ItemDefinition21.11109/dft/1.0/Data_Item
Data LakeDefinition21.11109/dft/1.0/Data_Lake
Data LibrarianDefinition21.11109/dft/1.0/Data_Librarian
Data LifecycleDefinition 1Definition 221.11109/dft/1.0/Data_Lifecycle
Data Management InfrastructureDefinition21.11109/dft/1.0/Data_Management_Infrastructure
Data ManagerDefinition21.11109/dft/1.0/Data_Manager
Data ModelDefinition21.11109/dft/1.0/Data_Model
Data ObjectDefinition21.11109/dft/1.0/Data_Object
Data OrganizationDefinition21.11109/dft/1.0/Data_Organization
Data PolicyDefinition21.11109/dft/1.0/Data_Policy
Data PreservationDefinition21.11109/dft/1.0/Data_Preservation
Data ProcessingDefinition21.11109/dft/1.0/Data_Processing
Data ProducerDefinition21.11109/dft/1.0/Data_Producer
Data ProfessionalDefinition21.11109/dft/1.0/Data_Professional
Data ProviderDefinition21.11109/dft/1.0/Data_Provider
Data Provider LayerDefinition21.11109/dft/1.0/Data_Provider_Layer
Data PublishingDefinition 1Definition 221.11109/dft/1.0/Data_Publishing
Data QualityDefinition21.11109/dft/1.0/Data_Quality
Data RegistrationDefinition21.11109/dft/1.0/Data_Registration
Data RegistryDefinition21.11109/dft/1.0/Data_Registry
Data RepositoryDefinition21.11109/dft/1.0/Data_Repository
Data Repository managementDefinition21.11109/dft/1.0/Data_Repository_management
Data RepresentationDefinition21.11109/dft/1.0/Data_Representation
Data SetDefinition 1Definition 2Definition 3Definition 4Definition 521.11109/dft/1.0/Data_Set
Data StewardshipDefinition21.11109/dft/1.0/Data_Stewardship
Data StreamDefinition21.11109/dft/1.0/Data_Stream
Data TransformationDefinition21.11109/dft/1.0/Data_Transformation
Data TransparencyDefinition21.11109/dft/1.0/Data_Transparency
Data Type RegistryDefinition21.11109/dft/1.0/Data_Type_Registry
Data TypingDefinition21.11109/dft/1.0/Data_Typing
Data UploadDefinition21.11109/dft/1.0/Data_Upload
Data VersioningDefinition21.11109/dft/1.0/Data_Versioning
Data articleDefinition21.11109/dft/1.0/Data_article
Data journalDefinition21.11109/dft/1.0/Data_journal
Data managementDefinition21.11109/dft/1.0/Data_management
Data objectDefinition21.11109/dft/1.0/Data_object
Data packetDefinition21.11109/dft/1.0/Data_packet
Data paperDefinition21.11109/dft/1.0/Data_paper
Data policyDefinition21.11109/dft/1.0/Data_policy
Data practiceDefinition21.11109/dft/1.0/Data_practice
Data privacyDefinition21.11109/dft/1.0/Data_privacy
Data publishing workflowDefinition21.11109/dft/1.0/Data_publishing_workflow
Data repository entryDefinition21.11109/dft/1.0/Data_repository_entry
Data reviewDefinition21.11109/dft/1.0/Data_review
Data sharingDefinition21.11109/dft/1.0/Data_sharing
Data sourceDefinition21.11109/dft/1.0/Data_source
Data typeDefinition21.11109/dft/1.0/Data_type
Data type registryDefinition21.11109/dft/1.0/Data_type_registry
DatabaseDefinition21.11109/dft/1.0/Database
Database CrackingDefinition21.11109/dft/1.0/Database_Cracking
Database RightsDefinition21.11109/dft/1.0/Database_Rights
DatapointDefinition21.11109/dft/1.0/Datapoint
DatasetDefinition 1Definition 221.11109/dft/1.0/Dataset
Dataset seriesDefinition21.11109/dft/1.0/Dataset_series
DatumDefinition21.11109/dft/1.0/Datum
Derived Data ProductsDefinition21.11109/dft/1.0/Derived_Data_Products
Description ObjectDefinition21.11109/dft/1.0/Description_Object
DescriptionsDefinition21.11109/dft/1.0/Descriptions
Descriptive metadataDefinition21.11109/dft/1.0/Descriptive_metadata
Detailed MetadataDefinition21.11109/dft/1.0/Detailed_Metadata
DictionaryDefinition21.11109/dft/1.0/Dictionary
Digital ArchiveDefinition21.11109/dft/1.0/Digital_Archive
Digital CollectionDefinition21.11109/dft/1.0/Digital_Collection
Digital Curation LifecycleDefinition21.11109/dft/1.0/Digital_Curation_Lifecycle
Digital DataDefinition21.11109/dft/1.0/Digital_Data
Digital Data ObjectDefinition21.11109/dft/1.0/Digital_Data_Object
Digital ObjectDefinition 1Definition 221.11109/dft/1.0/Digital_Object
Digital Object IdentifierDefinition21.11109/dft/1.0/Digital_Object_Identifier
Digital RecordDefinition21.11109/dft/1.0/Digital_Record
Digital RepositoryDefinition21.11109/dft/1.0/Digital_Repository
Digital repositoryDefinition21.11109/dft/1.0/Digital_repository
Discovery MetadataDefinition21.11109/dft/1.0/Discovery_Metadata
DisposalDefinition21.11109/dft/1.0/Disposal
Domain MetadataDefinition21.11109/dft/1.0/Domain_Metadata
Dynamic DataDefinition21.11109/dft/1.0/Dynamic_Data

E

EcosystemDefinition21.11109/dft/1.0/Ecosystem
EntityDefinition21.11109/dft/1.0/Entity
EquityDefinition21.11109/dft/1.0/Equity
Event LoggingDefinition21.11109/dft/1.0/Event_Logging
External PropertyDefinition21.11109/dft/1.0/External_Property
Extract descriptive metadataDefinition21.11109/dft/1.0/Extract_descriptive_metadata

F

FAIR Data PrinciplesDefinition21.11109/dft/1.0/FAIR_Data_Principles
Facility / equipmentDefinition21.11109/dft/1.0/Facility_/_equipment
Fair UseDefinition21.11109/dft/1.0/Fair_Use
Federated ArchitectureDefinition21.11109/dft/1.0/Federated_Architecture
Federation RegistryDefinition21.11109/dft/1.0/Federation_Registry
FileDefinition21.11109/dft/1.0/File
FindableDefinition21.11109/dft/1.0/Findable
Fixed SchemaDefinition21.11109/dft/1.0/Fixed_Schema
Flexible SchemaDefinition21.11109/dft/1.0/Flexible_Schema
FrameworkDefinition21.11109/dft/1.0/Framework

G

GlossaryDefinition21.11109/dft/1.0/Glossary
Graph Creation LayerDefinition21.11109/dft/1.0/Graph_Creation_Layer

H

I

IdentifierDefinition21.11109/dft/1.0/Identifier
IdentityDefinition21.11109/dft/1.0/Identity
Immutable classDefinition21.11109/dft/1.0/Immutable_class
IndexDefinition21.11109/dft/1.0/Index
IndexingDefinition21.11109/dft/1.0/Indexing
InformationDefinition21.11109/dft/1.0/Information
Information ContentDefinition21.11109/dft/1.0/Information_Content
Information ObjectDefinition21.11109/dft/1.0/Information_Object
Information ScienceDefinition21.11109/dft/1.0/Information_Science
InfrastructureDefinition21.11109/dft/1.0/Infrastructure
Instances of Bit StreamDefinition21.11109/dft/1.0/Instances_of_Bit_Stream
IntegrityDefinition21.11109/dft/1.0/Integrity
Intellectual Property RightsDefinition21.11109/dft/1.0/Intellectual_Property_Rights
InterfacesDefinition21.11109/dft/1.0/Interfaces
Internal PropertyDefinition21.11109/dft/1.0/Internal_Property
InteroperabilityDefinition21.11109/dft/1.0/Interoperability

J

K

Key MetadataDefinition21.11109/dft/1.0/Key_Metadata
KeywordDefinition21.11109/dft/1.0/Keyword

L

Landing PageDefinition21.11109/dft/1.0/Landing_Page
LandscapeDefinition21.11109/dft/1.0/Landscape
Legacy dataDefinition21.11109/dft/1.0/Legacy_data
Legal InteroperabilityDefinition21.11109/dft/1.0/Legal_Interoperability
Legal RightDefinition21.11109/dft/1.0/Legal_Right
LexiconDefinition21.11109/dft/1.0/Lexicon
LifecycleDefinition21.11109/dft/1.0/Lifecycle
Linked DataDefinition21.11109/dft/1.0/Linked_Data

M

Machine ActionableDefinition21.11109/dft/1.0/Machine_Actionable
Manage data sets in a repositoryDefinition21.11109/dft/1.0/Manage_data_sets_in_a_repository
Manage metadata catalogDefinition21.11109/dft/1.0/Manage_metadata_catalog
MashupDefinition21.11109/dft/1.0/Mashup
Math TestDefinition21.11109/dft/1.0/Math_Test
MediumDefinition21.11109/dft/1.0/Medium
MetadataDefinition 1Definition 2Definition 3Definition 421.11109/dft/1.0/Metadata
Metadata AttributeDefinition21.11109/dft/1.0/Metadata_Attribute
Metadata CatalogueDefinition21.11109/dft/1.0/Metadata_Catalogue
Metadata ComponentDefinition21.11109/dft/1.0/Metadata_Component
Metadata ElementDefinition21.11109/dft/1.0/Metadata_Element
Metadata ManagementDefinition21.11109/dft/1.0/Metadata_Management
Metadata ObjectDefinition21.11109/dft/1.0/Metadata_Object
Metadata ProfileDefinition21.11109/dft/1.0/Metadata_Profile
Metadata RecordDefinition21.11109/dft/1.0/Metadata_Record
Metadata RegistryDefinition21.11109/dft/1.0/Metadata_Registry
Metadata StandardsDefinition21.11109/dft/1.0/Metadata_Standards
Metadata tagDefinition21.11109/dft/1.0/Metadata_tag
MetamodelDefinition21.11109/dft/1.0/Metamodel
Minimal MetadataDefinition21.11109/dft/1.0/Minimal_Metadata

N

NanopublicationDefinition21.11109/dft/1.0/Nanopublication

O

OAI RepositoryDefinition21.11109/dft/1.0/OAI_Repository
ObjectDefinition21.11109/dft/1.0/Object
Object AttributeDefinition21.11109/dft/1.0/Object_Attribute
Object ModelDefinition21.11109/dft/1.0/Object_Model
Object PropertyDefinition21.11109/dft/1.0/Object_Property
Objective MetadataDefinition21.11109/dft/1.0/Objective_Metadata
Open AccessDefinition21.11109/dft/1.0/Open_Access
Open dataDefinition21.11109/dft/1.0/Open_data
OperationDefinition21.11109/dft/1.0/Operation
Original RepositoryDefinition21.11109/dft/1.0/Original_Repository
OriginatorDefinition21.11109/dft/1.0/Originator

P

PID AttributeDefinition21.11109/dft/1.0/PID_Attribute
PID DomainDefinition21.11109/dft/1.0/PID_Domain
PID RecordDefinition 1Definition 221.11109/dft/1.0/PID_Record
PID ResolutionDefinition21.11109/dft/1.0/PID_Resolution
PID ServiceDefinition21.11109/dft/1.0/PID_Service
PID SystemDefinition21.11109/dft/1.0/PID_System
PID TypeDefinition21.11109/dft/1.0/PID_Type
PatentDefinition21.11109/dft/1.0/Patent
Payload MetadataDefinition21.11109/dft/1.0/Payload_Metadata
Persistent IdentifierDefinition21.11109/dft/1.0/Persistent_Identifier
Physical coordintesDefinition21.11109/dft/1.0/Physical_coordintes
PolicyDefinition21.11109/dft/1.0/Policy
Policy ConstraintsDefinition21.11109/dft/1.0/Policy_Constraints
Presentation VersionDefinition21.11109/dft/1.0/Presentation_Version
PreservationDefinition21.11109/dft/1.0/Preservation
ProcedureDefinition21.11109/dft/1.0/Procedure
Processing WorkflowDefinition21.11109/dft/1.0/Processing_Workflow
ProjectDefinition21.11109/dft/1.0/Project
PropertyDefinition21.11109/dft/1.0/Property
Property FeaturesDefinition21.11109/dft/1.0/Property_Features
Property RecordDefinition21.11109/dft/1.0/Property_Record
PropositionDefinition21.11109/dft/1.0/Proposition
ProtocolsDefinition21.11109/dft/1.0/Protocols
ProvenanceDefinition21.11109/dft/1.0/Provenance
Provenance metadataDefinition 1Definition 221.11109/dft/1.0/Provenance_metadata
PublicationDefinition21.11109/dft/1.0/Publication
PublisherDefinition21.11109/dft/1.0/Publisher

Q

QualityDefinition21.11109/dft/1.0/Quality
Query IDDefinition21.11109/dft/1.0/Query_ID
Query StoreDefinition21.11109/dft/1.0/Query_Store
Query TimestampingDefinition21.11109/dft/1.0/Query_Timestamping

R

Raw DataDefinition21.11109/dft/1.0/Raw_Data
Real-Time DataDefinition 1Definition 221.11109/dft/1.0/Real-Time_Data
RecordDefinition21.11109/dft/1.0/Record
Record provenance informationDefinition21.11109/dft/1.0/Record_provenance_information
Records ManagementDefinition21.11109/dft/1.0/Records_Management
Referable dataDefinition21.11109/dft/1.0/Referable_data
Reference ResolutionDefinition21.11109/dft/1.0/Reference_Resolution
Reference dataDefinition21.11109/dft/1.0/Reference_data
Reference modelDefinition21.11109/dft/1.0/Reference_model
Register ManagerDefinition21.11109/dft/1.0/Register_Manager
Register MetadataDefinition21.11109/dft/1.0/Register_Metadata
Registered DataDefinition21.11109/dft/1.0/Registered_Data
Registered digital dataDefinition21.11109/dft/1.0/Registered_digital_data
RegistrationDefinition21.11109/dft/1.0/Registration
RegistryDefinition 1Definition 221.11109/dft/1.0/Registry
Related DataDefinition21.11109/dft/1.0/Related_Data
RelationsDefinition21.11109/dft/1.0/Relations
Replica numberDefinition21.11109/dft/1.0/Replica_number
ReplicationDefinition21.11109/dft/1.0/Replication
RepositoryDefinition 1Definition 221.11109/dft/1.0/Repository
Repository RegistryDefinition21.11109/dft/1.0/Repository_Registry
RepresentationDefinition21.11109/dft/1.0/Representation
Representation objectDefinition21.11109/dft/1.0/Representation_object
Research DataDefinition21.11109/dft/1.0/Research_Data
Research ObjectDefinition21.11109/dft/1.0/Research_Object
Research StakeholderDefinition21.11109/dft/1.0/Research_Stakeholder
Researcher operationsDefinition21.11109/dft/1.0/Researcher_operations
ResourceDefinition21.11109/dft/1.0/Resource
Resource Description FrameworkDefinition21.11109/dft/1.0/Resource_Description_Framework
Resource DestinationDefinition21.11109/dft/1.0/Resource_Destination
Resource SourceDefinition21.11109/dft/1.0/Resource_Source
Reusable DataDefinition21.11109/dft/1.0/Reusable_Data
Rich MetadataDefinition21.11109/dft/1.0/Rich_Metadata
Right HolderDefinition21.11109/dft/1.0/Right_Holder

S

SchemaDefinition21.11109/dft/1.0/Schema
Semantic InteroperabilityDefinition21.11109/dft/1.0/Semantic_Interoperability
Service ObjectDefinition21.11109/dft/1.0/Service_Object
ServicesDefinition21.11109/dft/1.0/Services
Source DataDefinition21.11109/dft/1.0/Source_Data
Standard protocolDefinition21.11109/dft/1.0/Standard_protocol
State InformationDefinition21.11109/dft/1.0/State_Information
Sticky BitsDefinition21.11109/dft/1.0/Sticky_Bits
Structural metadataDefinition21.11109/dft/1.0/Structural_metadata
Structured DataDefinition21.11109/dft/1.0/Structured_Data
Study-Level MetadataDefinition21.11109/dft/1.0/Study-Level_Metadata
Subjective MetadataDefinition21.11109/dft/1.0/Subjective_Metadata
Support ServiceDefinition21.11109/dft/1.0/Support_Service
SystemDefinition21.11109/dft/1.0/System
System MetadataDefinition21.11109/dft/1.0/System_Metadata

T

TaxonomyDefinition21.11109/dft/1.0/Taxonomy
TeD-TDefinition21.11109/dft/1.0/TeD-T
Technology MigrationDefinition21.11109/dft/1.0/Technology_Migration
Temporal coordinatesDefinition21.11109/dft/1.0/Temporal_coordinates
Temporary VersionDefinition21.11109/dft/1.0/Temporary_Version
TermDefinition21.11109/dft/1.0/Term
TimestampingDefinition21.11109/dft/1.0/Timestamping
Topical metadataDefinition21.11109/dft/1.0/Topical_metadata
Transaction RecordDefinition21.11109/dft/1.0/Transaction_Record
TransparencyDefinition21.11109/dft/1.0/Transparency
Trusted RepositoryDefinition21.11109/dft/1.0/Trusted_Repository
Trusted userDefinition21.11109/dft/1.0/Trusted_user
TypeDefinition21.11109/dft/1.0/Type

U

URI - Uniform resource identifierDefinition21.11109/dft/1.0/URI_-_Uniform_resource_identifier
UUIDDefinition21.11109/dft/1.0/UUID
Unique IdentifierDefinition21.11109/dft/1.0/Unique_Identifier
User NameDefinition21.11109/dft/1.0/User_Name

V

Verify checksumDefinition21.11109/dft/1.0/Verify_checksum
VersioningDefinition21.11109/dft/1.0/Versioning
VirtualizationDefinition21.11109/dft/1.0/Virtualization
VocabularyDefinition21.11109/dft/1.0/Vocabulary

W

Web resourceDefinition21.11109/dft/1.0/Web_resource
WorkDefinition21.11109/dft/1.0/Work
WorkflowDefinition21.11109/dft/1.0/Workflow
Workflow VirtualizationDefinition21.11109/dft/1.0/Workflow_Virtualization

API Consumer Layer

Definition

API Consumer Layer is a Digital Infrastructure layer that enables e-Infrastructure providers and university librarians to find connections across research data registries.

Explanation: See also Data Provider Layer and Graph Creation Layer.

Scope: Data Description Registry Interoperability

Status: New

Access

Definition

Data access typically refers to software and activities related to storing, retrieving, or acting on data housed in some stored form.

Explanation: In the context of data, access refers to a user's ability to access or retrieve data, digital objects, and digital resources. The idea of access means that data content is available for use. Users who have data access can store, retrieve, move, or manipulate data. Data can be stored on a wide range of devices and technologies, such as a database or repository. Related terms: Access Control and Access control list.

References: After http://www.techopedia.com/definition/26929/data-access

Scope: Practical Policy WG

Status: In discussion

Access Control

Definition

Access control refers to controlling access to information and systems.

Explanation: An access control policy provides details on the nature of the controls placed on access, for example identification via passwords. See Access control list.

Examples: Locking the place where the system is stored is a physical access control; restricting access to files or software code is a digital access control.

Scope: Practical Policy WG

Status: New

Access Policy

Definition

Access Policy is a type of policy that expresses authorization rules.

Explanation: The granularity of access policies is typically coarse, which makes them suitable for matching against a broad range of services.

Examples: An example of a policy rule is one that states which User Domains may access a particular service Endpoint and thus gain use of that service.

Scope: Practical Policy WG

Status: New
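
The policy rule example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `AccessPolicyRule`, the function `may_access`, and the domain and endpoint strings are hypothetical and are not part of the DFT definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicyRule:
    """One coarse-grained rule: which user domains may use one endpoint."""
    endpoint: str
    allowed_domains: frozenset


def may_access(rules, user_domain, endpoint):
    """Return True if at least one rule authorizes the domain for the endpoint."""
    return any(
        rule.endpoint == endpoint and user_domain in rule.allowed_domains
        for rule in rules
    )


# Hypothetical rule set: two user domains may use a search endpoint.
RULES = [
    AccessPolicyRule(
        endpoint="https://repo.example.org/api/search",
        allowed_domains=frozenset({"example.org", "partner.example.net"}),
    ),
]
```

The rule is matched on the whole domain and the whole endpoint, which reflects the coarse granularity mentioned in the explanation; finer-grained decisions (per object, per operation) would be left to an access control list.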

Access Role

Definition

Access role is a type of specified role for a person or group that grants the ability to access system functions and facilities.

Examples: Examples include the means of finding, using, adding, changing or retrieving data and information.

Scope: Practical Policy WG

Status: New

Access Workflow

Definition

A type of access entity that contains the services and functions which make the data object holdings and their information content and related services visible to data consumers.

References: After OAIS

Scope: Practical Policy WG

Status: In discussion

Access a repository

Definition

In accessing a repository, one uses a client (application) to discover relevant digital objects within the repository.

Explanation: A repository manages the organization of digital objects into collections and provides a context for understanding the relevance of the digital objects. The organization of the digital objects is based on a Logical Name that is independent of the physical path name on the storage system. Accessing a repository is equivalent to exploring the Logical Name space to find a file.

Examples: The repository provides a persistent location for issuing queries, a query mechanism for searching the contents, and returns a logical name or persistent identifier for referencing a digital object.

References: RDA PP WG

Scope: RDA Term Collection Core

Status: New
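
A minimal sketch of the Logical Name idea described above, assuming an in-memory catalog. The `Repository` class, its method names, and the example paths are hypothetical; a real repository would persist the catalog and offer a richer query language.

```python
class Repository:
    """Toy repository: a catalog keyed by logical name, not by physical path."""

    def __init__(self):
        # logical name -> {"physical_path": str, "metadata": dict}
        self._catalog = {}

    def register(self, logical_name, physical_path, **metadata):
        """Add a digital object under a logical name."""
        self._catalog[logical_name] = {
            "physical_path": physical_path,
            "metadata": metadata,
        }

    def query(self, **criteria):
        """Return the sorted logical names whose metadata match all criteria."""
        return sorted(
            name
            for name, entry in self._catalog.items()
            if all(entry["metadata"].get(k) == v for k, v in criteria.items())
        )

    def move(self, logical_name, new_physical_path):
        """Relocate the stored file; the logical name stays unchanged."""
        self._catalog[logical_name]["physical_path"] = new_physical_path

    def resolve(self, logical_name):
        """Return the current physical path for a logical name."""
        return self._catalog[logical_name]["physical_path"]
```

Because clients only ever hold logical names, a storage migration via `move` does not invalidate the results of earlier queries.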

Access control list

Definition

An access control list is a relationship established between a user account or user group or user role and a digital object.

Explanation: An Access Control List is the usual means by which to control access to, and denial of, services. It is simply a list of the services available, each with a list of the hosts permitted to use the service. The value of the access control denotes the allowed operations that may be performed by the user. For each access, the user credential is authenticated, the access control list is checked for the digital object, and then the operation access control value is compared with the requested operation. If the permission allows the requested operation, the access is permitted.

Scope: RDA Term Collection Core

Status: New
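
The three-step check described in the explanation (authenticate, look up the list for the digital object, compare the permission with the requested operation) could look roughly like the following sketch. The ordered permission levels, the dictionary layout, and the plain secret comparison are assumptions made for illustration; real systems use proper credential verification.

```python
# Assumed ordering: a higher level implies all lower ones.
PERMISSION_LEVELS = {"none": 0, "read": 1, "write": 2, "own": 3}


def is_permitted(acl, credentials, user, secret, object_name, requested):
    """Decide whether `user` may perform `requested` on `object_name`."""
    # Step 1: authenticate the user credential (placeholder check only).
    if credentials.get(user) != secret:
        return False
    # Step 2: look up the access control list entry for the digital object.
    granted = acl.get(object_name, {}).get(user, "none")
    # Step 3: compare the granted value with the requested operation.
    return PERMISSION_LEVELS[granted] >= PERMISSION_LEVELS[requested]


# Hypothetical data: one object with two users.
CREDENTIALS = {"alice": "s3cret", "bob": "hunter2"}
ACL = {"/climate/2017/run1.nc": {"alice": "write", "bob": "read"}}
```

Users without an entry for the object fall back to "none", so access is denied by default.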

Accessible

Definition

Data (and metadata) are Accessible, according to the FAIR principles, if: (meta)data are retrievable by their identifier using a standardized communications protocol; the protocol is open, free, and universally implementable; the protocol allows for an authentication and authorization procedure, where necessary; and metadata are accessible, even when the data are no longer available.

References: The FAIR Guiding Principles for scientific data management and stewardship, http://www.nature.com/articles/sdata201618

Scope: RDA Data Fabric Interest Group

Status: New

Active Collection

Definition 1

Record collections that continue to be used with sufficient frequency to justify keeping them in the office of creation; current records.

References: http://www2.archivists.org/glossary/terms/a/active-records

Scope: RDA DFT Interest Group

Status: In discussion

Definition 2

An Active Collection is a collection that is being generated dynamically by executable code.

Scope: RDA DFT Interest Group

Status: In discussion

Active Data

Definition

Active data denotes virtual units of data objects that are created dynamically by executable code.

Examples: Here several possibilities can be imagined, such as generating data tables from a relational database with the help of SQL scripts. Nevertheless, we want to ensure that exactly the same "bit stream" can be generated after some years. So active data denotes a "data object" which is generated dynamically but that can be referred to and thus be cited.

Scope: DFT Term Definition Prototype

Status: In discussion
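
The SQL example above can be illustrated with a small sketch that regenerates a table on demand and fingerprints the resulting bit stream. The table, the query, and the function names are invented for illustration. The sketch assumes that a fixed row order (`ORDER BY`) and a fixed serialization are enough to make the output reproducible as long as the underlying rows do not change; handling changing data would need versioning or timestamping on top.

```python
import csv
import hashlib
import io
import sqlite3


def build_database():
    """Create a small in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (station TEXT, day TEXT, temp_c REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [("A", "2017-09-02", 14.5), ("B", "2017-09-01", 11.0), ("A", "2017-09-01", 13.0)],
    )
    return conn


# The executable code that defines the active data object.
QUERY = "SELECT station, day, temp_c FROM readings WHERE station = ? ORDER BY day"


def generate_bit_stream(conn, station):
    """Run the query and serialize the result to bytes in a fixed format."""
    rows = conn.execute(QUERY, (station,)).fetchall()
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["station", "day", "temp_c"])
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def fingerprint(bit_stream):
    """Return a SHA-256 digest that can be stored alongside a reference."""
    return hashlib.sha256(bit_stream).hexdigest()
```

Storing the query text together with the fingerprint gives a citable reference: anyone who re-runs the query later can verify that the regenerated bit stream matches.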

Adaptive Indexing

Definition

Adaptive indexing is characterized by the partial creation and refinement of preliminary or fixed DB indexes as side effects to support efficient query execution.

References: After http://www.vldb.org/pvldb/vol4/p586-idreos.pdf

Scope: RDA Data Fabric Interest Group

Status: New
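
One well-known form of adaptive indexing is database cracking, in which each range query partially reorders a column so that later queries touch less data. The sketch below is a simplified, single-column illustration of that idea, not the algorithm of any particular system; the class `CrackedColumn` and its methods are hypothetical names.

```python
import bisect


class CrackedColumn:
    """Column that is partitioned a little further by every range query."""

    def __init__(self, values):
        self.values = list(values)  # physically reorganized copy of the column
        self.pivots = []            # sorted pivot values seen so far
        self.positions = []         # positions[i]: index of first value >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece containing `pivot`; return the split position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked on this value
        # Boundaries of the only piece that can contain the pivot.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.values)
        piece = self.values[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.values[lo:hi] = smaller + larger
        split = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return all values v with low <= v < high (in unspecified order)."""
        start = self._crack(low)
        end = self._crack(high)
        return self.values[start:end]
```

No index is built up front. The first query partitions the whole column, while later queries only repartition the piece that contains their bounds, so the index is refined as a side effect of query execution.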

Add a retention period

Definition

This is a metadata operation to create state information for a data object that defines the date when retention of the data object should be evaluated.

Explanation: Retention periods may be used to define when a property of a digital object should be reviewed. Properties could include physical retention of the digital object, updates to an access approval flag, updates to the file format, updates to the type of checksum, etc. An assumption is that the retention period will have an associated disposition policy for deciding what to do when the retention period expires. The disposition policy specifies the property that will be reviewed and updated.

Examples: A retention period is set on a digital object, with an associated disposition policy for migration of the digital object to an archive. The data management system periodically checks whether the retention period has expired, and then applies the disposition policy.

References: PP WG

Scope: Practical Policy WG

Status: New
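
The example above (a retention period with an associated disposition policy, checked periodically) might be sketched as follows. The `DataObject` class, the state keys, and the `migrate_to_archive` policy are assumptions for illustration, not names taken from any data management system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataObject:
    """Minimal data object with a location and free-form state information."""
    name: str
    location: str = "online"
    state: dict = field(default_factory=dict)


def add_retention_period(obj, review_date, disposition):
    """Metadata operation: record when retention should be evaluated, and how."""
    obj.state["retention_review_date"] = review_date
    obj.state["disposition_policy"] = disposition


def migrate_to_archive(obj):
    """Hypothetical disposition policy: move the object to an archive."""
    obj.location = "archive"


def apply_expired_retention(objects, today):
    """Periodic check: apply the disposition policy where the period expired."""
    processed = []
    for obj in objects:
        review_date = obj.state.get("retention_review_date")
        if review_date is not None and review_date <= today:
            obj.state.pop("disposition_policy")(obj)
            del obj.state["retention_review_date"]
            processed.append(obj.name)
    return processed
```

Keeping the disposition policy as a separate callable mirrors the explanation: the retention period only says when to review, and the policy decides what is reviewed and updated.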

Addition of access controls

Definition

Given a data object name, define access relationships between the following metadata: a data object name, a user name (or user group, or user role), and an access permission.

Explanation: The information can be stored as metadata associated with each data object. Alternatively, the information can be generated dynamically by applying the access controls of the collection that organizes the data objects (if a collection sticky bit is turned on). Related term: Sticky Bits.

Examples: A data management system provides a method for adding, updating, or removing access controls. The access controls may be set interactively, inherited from the collection into which the digital object is deposited, or applied retroactively in bulk. A significant example is the automated generation of access controls on a replica of a file when the replica is created.

References: RDA PP WG

Scope: Practical Policy WG

Status: New
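
A rough sketch of the two storage options named in the explanation (per-object metadata versus controls derived from the collection when its sticky bit is on), plus the replica case from the examples. The `Collection` class and its methods are hypothetical, and the sketch assumes that an enabled sticky bit makes the collection's controls apply in place of the per-object ones.

```python
class Collection:
    """Toy collection that organizes data objects and their access controls."""

    def __init__(self, name, sticky_bit=False):
        self.name = name
        self.sticky_bit = sticky_bit
        self.acl = {}          # user name -> permission on the collection
        self.object_acls = {}  # object name -> {user name: permission}

    def deposit(self, object_name):
        """Register a data object with no access controls of its own yet."""
        self.object_acls.setdefault(object_name, {})

    def add_access_control(self, object_name, user, permission):
        """Store the (object name, user name, permission) relationship."""
        self.object_acls[object_name][user] = permission

    def replicate(self, object_name, replica_name):
        """Create a replica entry and copy the access controls automatically."""
        self.object_acls[replica_name] = dict(self.object_acls[object_name])

    def effective_acl(self, object_name):
        """Return the controls in force for an object in this collection."""
        if object_name not in self.object_acls:
            raise KeyError(object_name)
        if self.sticky_bit:
            # Generated dynamically from the collection's controls.
            return dict(self.acl)
        return dict(self.object_acls[object_name])
```

With the sticky bit off, each object carries its own controls; with it on, a change to the collection's controls takes effect for every object it organizes without touching per-object metadata.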

Administrative metadata

Definition

Administrative metadata is a type of Metadata that provides information to help manage a resource, such as when and how data was created, file type and other technical information, and who can access it. Administrative metadata is related to the interaction or use of metadata within a specific system.

Explanation: Changes in administrative metadata do not change the meaning of the metadata content describing data. There are several subsets of administrative data. Representation described in a Representation Object is one. Two others that are sometimes listed as separate metadata types are: Rights management metadata, which deals with intellectual property rights, and Preservation metadata, which contains information needed to archive and preserve a resource. (See also retention period.)

Examples: Examples include:
- Acquisition information
- Rights and reproduction tracking (e.g. a user's URN)
- Documentation of legal access requirements
- Location information
- Selection criteria for digitization
- Version control and differentiation between similar information objects
- Audit trails created by record keeping systems

References: NISO. (2004) Understanding Metadata. Bethesda, MD: NISO Press, p. 1

Scope: RDA Metadata WG

Status: New

Aggregation

Definition

An aggregation is, in general, the bringing together of elements.

Explanation: Types of aggregations differ by the nature of the processes by which elements are brought together and by the reason for aggregating them or treating them as a unit. Aggregations also differ in the nature of the relations between their member parts. On the Semantic Web, Resources may be aggregated.

Examples: A baseball card collection is one type of aggregation, where each card may be considered a member of the aggregated collection. A digital file is an aggregation of data elements.

References: OAI-ORE http://www.openarchives.org/ore/1.0/datamodel#Aggregation

Scope: RDA Term Collection Core

Status: New

Analytics

Definition

Analytics is the discovery of meaningful patterns in data. The resulting synthesis may be used for description and prediction. The synthesis often affords a new understanding or a form of improved knowledge derived from the analysis.

Explanation: There is often an extensive, systematic use of mathematics and statistics underlying the analytic methods used.

References: NIST Big Data Definitions and http://en.wikipedia.org/wiki/Analytics.

Scope: DFT Term Definition Prototype

Status: New

Annotation

Definition

An annotation is a type of documentation or metadata (e.g. a comment, explanation, markup) attached to Data Objects, text, image, or other data.

References: https://en.wikipedia.org/wiki/Annotation

Scope: RDA Metadata WG

Status: New

Architecture

Definition

Fundamental organization of a system embodied in its components, their relationships to each other, and to the environment, and the principles guiding its design and evolution.

Explanation: Elaboration 1: Due to the flexibility required within the DFIG, it is necessary to stress that the architecture needs to be flexible enough, and thus is more of an open framework (see below), and that it cannot be interpreted as a fixed set of components and services that is the same for everyone. Elaboration 2: The term "architecture" may not be used in normative or prescriptive ways in the realm of the DF discussions.

References: Systems Engineering

Scope: RDA Data Fabric Interest Group

Status: New

Archival Description

Definition

Archival Description is a type of high-level Metadata that describes the large scale organization of data and how things fit together.

Explanation: Archival Descriptions are particularly important for Collections

References: After RDA 2015 BoF on Convergences in Archives, Records Management, and Research Data Curation https://rd-alliance.org/convergences-archives-records-management-and-research-data-curation-p6-bof-session.html

Scope: RDA Metadata WG

Status: New

Archive

Definition

An archive is a place or collection containing and managing records, documents, or other materials of interest which is the result of archiving processes and which is intended for long-term preservation.

Explanation: In the RDA context the focus is a digital sub-type of archive/data archive and its management processes such as data security. It is desirable that a digital archive follow open standards. See Archiving. See also Digital Archive and data arrangement, since arrangements of data play a special role in archives.

Examples: Digital repositories are one type of archive with repository services.

References: After http://www.thefreedictionary.com/

Scope: RDA Data Fabric Interest Group

Status: New

Archiving

Definition

Archiving is one type of curation activity which ensures that data is properly selected and stored, that it can be accessed, and that its logical and physical integrity is maintained over time, including security and authenticity.

References: Ohsawa, Yukio, and Akinori Abe. Advances in Chance Discovery. Springer, 2014.

Scope: RDA Data Fabric Interest Group

Status: New

Arrangement:

Definition

Arrangement is a type of process which analyses the nature and scope of groups of data/digital objects and associates materials. In this process their provenance and original arranging order are understood and the data elements or objects are organized into groups, series, sets etc. according to a structuring approach that preserves and reflects their nature, taking into account their provenance and original acquisition ordering.

References: After http://www.irmt.org/documents/educ_training/term%20modules/IRMT%20TERM%20Glossary%20of%20Terms.pdf

Scope: Practical Policy WG

Status: New

Attribute

Definition

An Attribute, short for a physical data attribute, is a single data element related to a data object, such as in a database. The database schema associates one or more attributes with each database entity.

Explanation: Attribute is used here as short for "Data Attribute". In a DB an attribute is also known as a field or column. A data attribute may also be used as a term for a logical or conceptual attribute such as in an EAR data model. An attribute is a characteristic of data that sets it apart from other data, such as location, length or type. The term attribute is sometimes used synonymously with "data element" or "property."

References: http://www.krollontrack.com/resource-library/glossary/legal/#

Scope: RDA Term Collection Core

Status: In discussion

Authentication

Definition

Authentication is a process within the field of access control to verify the identity of the user.

Explanation: The Authentication function verifies the identity of the user accessing the system, meaning that the user is the same as claimed, with the associated rights as a user. Authentication takes place by means of credentials (user name, password, fingerprints, etc.) and an authoritative source for user names, and is carried out by an authentication system.
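
Examples: As a minimal sketch of credential-based verification, assuming password credentials only, the following Python code checks a claimed identity against a stored, salted password hash. The names USER_STORE, make_credential and verify_user are hypothetical and stand in for the "authoritative source for user names"; they are not part of any RDA specification.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen for illustration only


def make_credential(password: str) -> tuple[bytes, bytes]:
    """Derive a salted hash to store instead of the plain password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


# Hypothetical authoritative source for user names and their credentials.
USER_STORE = {"alice": make_credential("correct horse")}


def verify_user(name: str, password: str) -> bool:
    """Return True only if the claimed user exists and the password matches."""
    record = USER_STORE.get(name)
    if record is None:
        return False
    salt, stored = record
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored)
```

Calling verify_user("alice", "correct horse") succeeds, while a wrong password or an unknown user name is rejected.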

Scope: Practical Policy WG

Status: New

Authentication System

Definition

Part of the IT infrastructure that supports the registrations needed to track and identify/verify authenticated resources for system access.

Scope: RDA Data Fabric Interest Group

Status: New

Authenticity metadata

Definition

A type of metadata that conveys information needed to link a data object to its original source.

Explanation: Authenticity is provided by appropriate metadata. Within an archive and digital retention and preservation context, it results from verifying that a digital object and its state information have not changed. See Authenticity.

References: Source is RDA Practical Policy WG

Scope: DFT Term Definition Prototype

Status: In discussion

Authorization

Definition

Authorization is a security mechanism, process, or result of a process, for deciding if an agent (person/user, program, device, group, role, etc.) is allowed to have access to or take an action employing a particular resource.

Explanation: Authorization is normally preceded by authentication for user identity verification. System administrators (SA) are typically assigned permission levels covering all system and user resources.

Examples: For example, ASP.NET works with Internet Information Server (IIS) and Microsoft Windows to provide authentication and authorization services for Web-based .NET applications. Windows uses New Technology File System (NTFS) to maintain Access Control Lists (ACL) for all resources. The ACL serves as the ultimate authority on resource access.
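
The idea of an access control list acting as the authority on resource access can also be sketched independently of any product. The following Python code is an illustrative assumption, not the NTFS or IIS mechanism: the ACL and GROUPS tables and the is_authorized function are hypothetical names.

```python
# Hypothetical ACL: resource -> principal (user or group) -> allowed actions.
ACL = {
    "dataset-42": {"alice": {"read", "write"}, "curators": {"read"}},
}

# Hypothetical group membership: user -> groups the user belongs to.
GROUPS = {"bob": {"curators"}}


def is_authorized(agent: str, action: str, resource: str) -> bool:
    """Decide whether an already authenticated agent may take an action on a resource."""
    entries = ACL.get(resource, {})
    # An agent acts as itself and as every group it belongs to.
    principals = {agent} | GROUPS.get(agent, set())
    return any(action in entries.get(p, set()) for p in principals)
```

Here alice may read and write dataset-42, bob may only read it through the curators group, and any agent or resource missing from the tables is denied by default.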

References: https://www.techopedia.com/definition/10237/authorization

Scope: DFT Term Definition Prototype

Status: New

Authorization System

Definition

An Authorization System supports the process of giving some resource permission to effect an action and/or perform some tasks.

Explanation: Authorization is used in combination with authentication to secure access to data and services.

References: http://csrc.nist.gov/groups/SMA/fisma/CnA.html https://en.wikipedia.org/wiki/Authorization

Scope: Practical Policy WG

Status: New

Big Data

Definition

Big Data consists of extensive datasets/collections/linked data primarily characterized by big volume, extensive variety, high velocity (creation and use), and/or variability that together require a scalable architecture for efficient data storage, manipulation, and analysis.

References: After NIST Big Data Definitions. http://bigdatawg.nist.gov/_uploadfiles/M0392_v1_3022325181.pdf

Scope: DFT Term Definition Prototype

Status: New

Bit Sequence

Definition

A representation of digital content as an assembly of bits, the fundamental units of digital content.

Examples: 1001
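
As a small illustration, the bit sequence 1001 can be read as the unsigned integer 9. The Python sketch below converts in both directions; the helper names to_bits and from_bits are hypothetical, and reading the bits as an unsigned integer is only one of many possible interpretations.

```python
def to_bits(value: int, width: int) -> str:
    """Render a non-negative integer as a fixed-width bit sequence."""
    return format(value, f"0{width}b")


def from_bits(bits: str) -> int:
    """Interpret a bit sequence as an unsigned integer."""
    return int(bits, 2)


nine_as_bits = to_bits(9, 4)        # "1001"
bits_as_number = from_bits("1001")  # 9
```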

Scope: RDA DFT Interest Group

Status: New

Bit Stream

Definition

Bit Stream denotes an unstructured sequence of bits that is identified as a unit.

Explanation: It may be stored as a unit or may exist as a pattern and be generated. A digital object may be represented as a bit stream of finite length that encodes its informational content (see information content). See http://smw-rda.esc.rzg.mpg.de/index.php/File:Bit_stream.png for an example.

Examples: Bits in a communication transmission are often used as an example.

References: Figure from INLS 525: Managing Electronic Records http://ils.unc.edu/courses/2013_spring/inls525_001/slides/wk04-slides.html#slide7

Scope: RDA Term Collection Core

Status: New

Blockchain

Definition

Blockchain refers to a distributed database that is used to maintain a continuously growing list of units or records, which are called blocks. Each block contains a timestamp and a link to a previous block.

Explanation: A blockchain is typically managed by a peer-to-peer network collectively adhering to a protocol for validating new blocks. By design, blockchains are inherently resistant to modification of the data. Once recorded, the data in any given block cannot be altered retroactively without the alteration of all subsequent blocks and the collusion of the network. Functionally, a blockchain can serve as "an open, distributed ledger that can record transactions between two parties efficiently and in a verifiable and permanent way." As a new technical approach there may be issues about how to handle blockchain metadata and IDs.
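
Examples: The linking of blocks can be sketched in a few lines of Python. This is a simplified, single-machine illustration under stated assumptions: each block stores a SHA-256 hash of the previous block as its link, and there is no peer-to-peer network or validation protocol. The function names block_hash, add_block and is_valid are hypothetical.

```python
import hashlib
import json
import time


def block_hash(block: dict) -> str:
    """Hash a block's full content (timestamp, data and link)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain: list, data: str) -> None:
    """Append a block holding a timestamp and a link to the previous block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"timestamp": time.time(), "data": data, "prev_hash": prev})


def is_valid(chain: list) -> bool:
    """Check that every block still links to the unmodified previous block."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )
```

Changing the data of an earlier block changes its hash, so the link stored in the following block no longer matches and is_valid returns False unless all subsequent blocks are rewritten as well.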

References: https://en.wikipedia.org/wiki/Blockchain

Scope: RDA DFT Interest Group

Status: New

Blueprint

Definition

Definition (first attempt): a design for a framework that can be re-used and re-purposed by applying minor changes that do not require changing the underlying design principles.

Explanation: Elaboration1: This term is a very good approach to what some stated: different groups will want to do various selections of components/services to create a functioning system. The question is then whether there are essential components without which the system will not function.

Scope: RDA Data Fabric Interest Group

Status: New

Canonical Data Collection

Definition

A canonical data collection is a collection normalized by some established criteria.

Explanation: Mostly canonical collections are formed to allow effective data management.

Examples: Data files that belong to a certain experiment, all files that are created by one specific simulation, all files that belong to a specific observation (same day, same place, etc.) etc.

Scope: RDA Term Collection Core

Status: New

Canonical Metadata Packages

Definition

Canonical Metadata Packages are collections of metadata defined for specific purposes.

References: MIG briefing at RDA P6.

Scope: RDA Metadata WG

Status: New

Catalog

Definition

A catalog is a type of collection which describes and points to features of the collection.

Explanation: Data catalogs are special types of catalogs. As defined in DCAT "data catalog is a curated collection of metadata about datasets." Also noted, typically, a web-based data catalog is represented as a single instance of this class.

References: https://www.w3.org/TR/vocab-dcat/#class-catalog

Scope: RDA DFT Interest Group

Status: In discussion

Cataloguing

Definition

An intellectual process of describing objects in accordance with accepted library principles, particularly those of subject and classification order.

References: Source: http://www.alliancepermanentaccess.org/index.php/knowledge-base/dpglossary/#B DPGlossary Working Group

Scope: DFT Term Definition Prototype

Status: In discussion

Category Registry

Definition

A special type of registry to store and manage a "data concept", which is "an elementary descriptor in a linguistic structure or an annotation scheme."

Explanation: May use ISOcat (ISO standard 12620) data categories.

Examples: Component MetaData Infrastructure (CMDI) and ISO TC37 Data Category Registry (DCR)

References: ISOcat, a Data Category Registry http://media.dwds.de/clarin/userguide/text/concepts_ISOcat.xhtml

Scope: RDA Data Fabric Interest Group

Status: New

Certificate

Definition

A Certificate is information (verifiable by an independent third party) which a recipient needs in order to decide whether he or she can trust the sender of electronically signed material.

Explanation: A certificate is produced as a result of a certification process.

Scope: Practical Policy WG

Status: New

Certification Process

Definition

A type of process that confirms and asserts that a set of properties are correctly enforced for some resource.

Scope: RDA DFT Interest Group

Status: New

Checksum

Definition 1

Checksum is a type of metadata and an important property of a data object that allows verifying identity and integrity.

Explanation: This is associated with PIDs but can be found and tested independently of PID systems.

Scope: RDA Term Collection Core

Status: In discussion

Definition 2

Also called a hash, a checksum is a piece of data computed from the content of a digital object that is used to verify the fixity or stability of that object. It is most commonly used to detect whether some representation of a digital object has changed over time.
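
Examples: A fixity check can be sketched as follows, assuming SHA-256 as the checksum algorithm (other algorithms are equally possible). The sample content and the variable names are illustrative only.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Compute a SHA-256 checksum; identical bytes always give the same value."""
    return hashlib.sha256(data).hexdigest()


original = b"sea surface temperature: 18.4"
recorded = checksum(original)  # stored as metadata at ingest time

# Later fixity checks compare a fresh checksum with the recorded one.
unchanged = checksum(b"sea surface temperature: 18.4") == recorded
changed = checksum(b"sea surface temperature: 18.5") == recorded
```

The first comparison is True because the bytes are identical; the second is False because a single altered character produces a different checksum.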

References: After https://wiki.duraspace.org/display/DPNC/Glossary

Scope: RDA Term Collection Core

Status: New

Choosing a storage location

Definition

Choosing a storage location has the following sequence: 1) Identify a physical storage location where a data object will be stored on ingestion into a data repository. The identity should include an IP address. 2) Generate the physical path name within the storage location where a data object will be stored. 3) Register the physical path name as an attribute associated with the logical name. For retrieval, the data object location is specified by the storage location and the physical path name. Related terms: storage location, Data repository, Logical name.
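
Examples: The sequence can be sketched in Python as below. Everything concrete here is an assumption made for illustration: the two storage locations and their IP addresses (taken from a documentation address range), the hash-based choice of location, the /vault path layout, and the in-memory CATALOG that stands in for a repository's metadata catalog.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLocation:
    name: str
    ip_address: str  # the identity of a location includes an IP address


# Hypothetical storage locations and catalog (logical name -> attributes).
LOCATIONS = [
    StorageLocation("disk-a", "192.0.2.10"),
    StorageLocation("disk-b", "192.0.2.11"),
]
CATALOG = {}


def _digest(logical_name: str) -> str:
    return hashlib.sha256(logical_name.encode("utf-8")).hexdigest()


def ingest(logical_name: str) -> None:
    digest = _digest(logical_name)
    # Identify a physical storage location (here: chosen by hash).
    location = LOCATIONS[int(digest, 16) % len(LOCATIONS)]
    # Generate the physical path name within that location.
    path = f"/vault/{digest[:2]}/{digest[2:4]}/{digest}"
    # Register location and path as attributes of the logical name.
    CATALOG[logical_name] = {"storage_location": location, "physical_path": path}


def locate(logical_name: str) -> tuple[str, str]:
    """For retrieval: return the storage location's IP address and the physical path."""
    entry = CATALOG[logical_name]
    return entry["storage_location"].ip_address, entry["physical_path"]
```

After ingest("/zone/home/alice/run1.dat"), a call to locate with the same logical name returns the pair needed to retrieve the data object.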

References: RDA PP WG

Scope: Practical Policy WG

Status: New

Citable Data

Definition

Citable Data (aka citable content) is a type of referable data that has undergone registration and quality assessment and can be referred to as citations in publications and as part of Research Objects.

Explanation: Aggregated data is citable (has a citation). Related term: citation, raw data, data registration, research object.

References: Peter Wittenburg's collection scenario for RDA P3.

Scope: RDA Term Collection Core

Status: New

Citation

Definition

A Citation is a reference in an academic or research communication that documents any sources used in a research output, for the two-fold purpose of: (a) giving credit to existing sources of ideas, data, and information, and (b) enabling others to identify and locate those sources used in the research.

Explanation: Citations acknowledge a source for material and may be either direct and explicit (as in the reference list of a journal article or a link to a particular set of data), indirect (e.g. a citation to a more recent paper by the same research group on the same topic), or implicit (e.g. as in artistic quotations or parodies, or in cases of plagiarism). See also Cited reference.

References: CiTO, the Citation Typing Ontology http://www.essepuntato.it/lode/http://purl.org/spar/cito

Scope: Data Citation WG

Status: In discussion

Citation Metadata

Definition

Citation metadata/identifier should provide an unambiguous identifier to the data cited, its location, and means of access.

References: Borgman, C. (2012) Why are the attribution and citation of scientific data important? In P. F. Uhlir, (Ed.), For attribution: Developing scientific data attribution and citation practices and standards: Summary of an international workshop (pp. 1-10). Washington, D.C.: National Academies Press. Retrieved July 30, 2013 from the WWW: http://www.nap.edu/catalog.php?record_id=13564

Scope: DFT Term Definition Prototype

Status: In discussion

Cited reference

Definition

Cited references are the articles, books or other materials listed in a bibliography or as works cited in a particular publication.

Explanation: It has been pointed out that the word "reference" is ambiguous. It can alternatively mean either what is found in the text, what is found in the reference list, as a verb the act of citation, or the object of the citation itself. Related term is "Citation."

References: https://library.missouri.edu/guides/citedrefsearch/

Scope: RDA DFT Interest Group

Status: New

Collection

Definition 1

A collection is an organized, systematic form of purposeful aggregation, grouping or arrangement of elements, that has an identity of its own separate from the identity of the elements.

Explanation: A collection's metadata should provide one or more reasons why this particular group of elements belongs together as part of a collection process. Collections are often associated with archives and repositories and their services.

Examples: A collection of books in a library. In this case the library serves as a repository for selected books and other media. Collections usually serve multiple functions, such as selection and collocation of related materials, narrowing of search scope, and clarification of information needs.

References: Lee, H. L. "The concept of collection from the user's perspective." Library Quarterly, 75 (1), 67-85. 2005.

Scope: RDA Term Collection Core

Status: New

Definition 2

Collection is defined as "a group of objects gathered together for some intellectual, artistic, or curatorial purpose."

References: "Representing Cultural Collections in Digital Aggregation and Exchange Environments"http://www.dlib.org/dlib/may14/wickett/05wickett.html Europeana Data Model (EDM)

Scope: RDA Term Collection Core

Status: New

Definition 3

A collection is a digital object which is identified by a PID and consists of a set or a list of PIDs/Ids and a set of additional pointers/links and metadata together with each PID/Id. A collection can be given explicitly by naming each PID/Id directly as well as implicitly by a generating rule. A collection is called finite, if the set of PIDs/Ids, generated by iteratively resolving its "sub-"collections, is finite.

Explanation: By definition a collection can contain other "sub-"collections. A collection and its sub-collections define a graph and this way a finite collection becomes a finite graph. Suggestion: only finite collections should be under investigation of the Research Data Collections WG. Otherwise one has to guarantee self-consistency of the definition, and also the proof of finiteness for processes becomes in general much harder.

Examples: 1) Given a digital object together with its earlier versions: a) the collection PID points to a set/list of all the PIDs/Ids pointing to an earlier version. In a set the previous relation would be lost, in a list it can be contained in the order of the PIDs/Ids. b) the collection PID points to a set/list of one PID/Id pointing to the digital object and one PID/Id representing the previous version, which again points to a set/list of one PID/Id pointing to the digital object and one PID/Id representing the now previous version, and so on. 2) Try to interpret the OAI-ORE example (URI http://arxiv.org/abs/astro-ph/0601007) of the Primer User Guide in the context of this definition.
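
Iteratively resolving sub-collections can be sketched in Python as follows. The sketch assumes explicitly listed members only (no generating rules), and the REGISTRY mapping and the PID strings are hypothetical placeholders for a real PID resolution service.

```python
# Hypothetical registry: collection PID -> list of member PIDs/Ids.
REGISTRY = {
    "pid:coll/root": ["pid:obj/1", "pid:coll/versions"],
    "pid:coll/versions": ["pid:obj/1-v2", "pid:obj/1-v1"],
}


def resolve(collection_pid: str) -> set:
    """Return every PID/Id reachable by iteratively resolving sub-collections."""
    seen = set()
    stack = [collection_pid]
    while stack:
        pid = stack.pop()
        # PIDs absent from the registry are treated as plain digital objects.
        for member in REGISTRY.get(pid, []):
            if member not in seen:  # visit each PID once, so cycles terminate
                seen.add(member)
                stack.append(member)
    return seen
```

Because each PID is visited at most once, the traversal ends for any explicitly listed collection graph, including one whose sub-collections refer back to a parent; a collection given by a generating rule would need a separate finiteness argument.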

References: OAI-ORE example: http://www.openarchives.org/ore/1.0/primer

Scope: BOF PID Collections

Status: In discussion

Collection Development

Definition

Collection Development is the process of managing the creation and growth of a Collection consistent with practice and with the theme and purpose of the Collection.

Explanation: See also Collection

Scope: RDA Data Fabric Interest Group

Status: New

Collection Management

Definition

Collection Management (aka Collection Building and Management) is the end-to-end process used to manage collected information stored in a repository or archive.

Explanation: The intent is to avoid data duplications, to make data available to researchers on a timely and organized basis, and to retain digital objects in an authentic form. Management activities include various metadata-related tasks for the collection: representation and administrative metadata; PIDs for digital collection objects; naming and descriptive metadata, including composition and arrangement; and provenance metadata, including the data source for the collection. They also include managing access controls for the collection.

References: http://www.tameyourassets.com/what-is-a-collections-management-system/

Scope: Practical Policy WG

Status: New

Collection Management Identification

Definition

A type of data provenance that adds metadata to identify data collections.

Explanation: The organization doing the collection management is stated in the metadata along with the provenance of Collection management events such as source of data acquisition, conservation, movement. See Collection Management.

Scope: Practical Policy WG

Status: New

Collection Registry

Definition

Collection Registry is a type of registry where major domain repositories can describe their core collections.

Explanation: These registries are mainly intended for human consumption. They provide information about typical data collections that can be readily accessed. It was agreed as part of RDA DF discussion that such a registry would be a useful appetizer at the front page of the RDA web-site.

Scope: RDA Data Fabric Interest Group

Status: New

Collection of Data

Definition

A collection of data (also referred to as a "dataset") means a "collected, selected, coordinated, or arranged set of data elements in electronic form consisting often of observed, discovered, or derived values".

References: White Paper: Mechanisms to Share Data as Part of the GEOSS Data-CORE, Group on Earth Observations. Available at: https://www.earthobservations.org/documents/dswg/Annex%20VI%20-%20%20Mechanisms%20to%20share%20data%20as%20part%20of%20GEOSS%20Data_CORE.pdf. (DSWG 2014a)

Scope: Legal Interoperability

Status: In discussion

Collection virtualization

Definition

Collection virtualization is a type of virtualization for managing the naming, arrangement, description, access control, integrity, authenticity, retention, disposition, distribution etc. of data.

References: Reagan Moore presentation at RDA P6 DFT IG session as well as the Data Fabric session.

Scope: RDA Data Fabric Interest Group

Status: New

Communication

Definition

Communication is a process by which information is exchanged between individuals or systems using a common system of symbols, signs, or behavior.

Examples: Bits in a communication transmission are an example of one type of communication.

References: Cognitive Atlas Concept, CAO_00210

Scope: DFT Term Definition Prototype

Status: New

Components

Definition

Aka, a system component. An entity with discrete structure, such as an assembly or software module, within a system considered at a particular level of analysis.

Explanation: Elaboration 1: in DFIG components are characterized by the functional services they offer and the internal structures that are required to offer these services.

References: Systems Engineering

Scope: RDA Data Fabric Interest Group

Status: New

Concept

Definition

A concept is the smallest, unambiguous unit of thought.

Explanation: A concept is uniquely identifiable.

Examples: Anything you can write a Wikipedia article about is a concept.

References: http://conceptwiki.org/index.php/Concept

Scope: RDA Metadata WG

Status: New

Conceptual Object

Definition

An object which is intangible and, because it is intangible, does not fit into a digital archive.

Explanation: In OAIS this is neither a digital nor a physical object. When a description is provided it is called a "tagged non-digital object."

Examples: An example is the Cassini mission and NASA's strategic plan for solar system exploration.

References: GLOSSARY OF PDS4 TERMS Version of 2011-10-28 (v111028) https://pds.nasa.gov/pds4/doc/glossary/PDS4_Glossary_v111028.pdf

Scope: RDA Data Fabric Interest Group

Status: New

Container

Definition

Something able to hold objects.

Examples: A data repository can hold data objects and collections. In this case the data repository may be considered a type of data container. Containers come in all sizes. A library may hold books and manuscripts.

Scope: DFT Term Definition Prototype

Status: In discussion

Content Replication

Definition

Content Replication duplicates Content Information - the set of information that is the original target object that has been registered and is under preservation. In OAIS it is a type of Digital Migration where there is no change to the Packaging Information, the Content Information, and the PDI. The bits used to represent these Information Objects are preserved in the transfer to the same or new media instance.

Examples: page image files

Scope: DFT Term Definition Prototype

Status: In discussion

Contextual Metadata

Definition

Contextual Metadata is a type of metadata needed for interpreting the relevance of data such as files in a collection. The context includes (contains) provenance information (identifying the source of the data), descriptive metadata (defining the attributes of the data), structural data (defining data formats).

Explanation: Some such metadata may be extracted from associated documents or mined from headers within the data. As noted by the RDA MIG there is a "need for documentation of the evolution of the data asset behind each element through re-usable contextual 'profiles' applicable upon various datasets..." See also Descriptive Metadata and Detailed Metadata.

Examples: An example of context is the medical imaging format DICOM, which provides a context to understand an image, how it was generated etc. Context may be things such as rights and license terms, the organization that generated the data, data quality, data access methods, the update schedule of datasets and the intention of the research that produced the data. As noted, one example is provenance information, which identifies the source of data.

References: Outcomes Policy Templates: Practical Policy Working Group, September 2014 https://rd-alliance.org/groups/data-context-ig.html

Scope: RDA Metadata WG

Status: In discussion

Contextual metadata extraction

Definition

Contextual metadata extraction is a process for creating metadata associated with files and collections.

Explanation: The creation of provenance and descriptive metadata defines a context for interpreting the relevance of files in a collection. Depending upon the data source, there are multiple ways to provide metadata; some can be automated.

Examples: An example of extracting metadata from an associated document is taking metadata from the DICOM medical imaging format.

References: Practical Policy RDA report

Scope: Practical Policy WG

Status: New

Controlled Vocabulary

Definition

A controlled vocabulary is a formally maintained collection of terms agreed upon and used in a specific community for communication purposes.

Explanation: Controlled vocabularies are accepted and maintained by a community with some degree of assigned, rigorous and understood definitions which may evolve and expand (or shrink and consolidate) over time as part of a management and review process. Terms that are part of controlled vocabularies are often intended to provide systematic values for populating structured metadata elements.

Examples: Taxonomies are one type of controlled vocabulary. SNOMED is an example of a controlled vocabulary for computer-based patient records.

References: After Currier, Sarah, Lorna M. Campbell, Helen Beetham (2005). Pedagogical Vocabularies Review, JISC Pedagogical Vocabularies Project, Final Draft, 23rd December 2005, and https://marinemetadata.org/guides/vocabs/vocdef

Scope: RDA DFT Interest Group

Status: New

Copyright

Definition

Copyright is a type of right used to monopolize a creative work such as research, art and literature.

Explanation: Copyright refers not to the content of a work, but to the form of presentation (the "expression") of this content. Copyright applies to individual works, but not to facts, ideas, or concepts. It is implemented through individual national legislation that is consistent with the Berne Convention for the Protection of Literary and Artistic Works treaty. There are two types of rights under copyright: economic rights, which allow the rights owner to derive financial reward from the use of his works by others; and moral rights, which protect the non-economic interests of the author.

References: Its most important legal basis at the international level is the Berne Convention for the Protection of Literary and Artistic Works (http://www.wipo.int/treaties/en/ip/berne/) http://www.wipo.int/copyright/en/

Scope: Legal Interoperability

Status: In discussion

Copyright Infringement

Definition

Infringement of copyright is a violation of any of the exclusive rights of the copyright owner, as provided by legislation.

Explanation: See also Copyright

References: (See, e.g., the copyright infringement section of the 1976 U.S. Copyright Act at: http://www.copyright.gov/title17/92chap5.html, and the "What to Do If You're Accused of Copyright Infringement" section of the UN's World Intellectual Property Organization (WIPO) web site at: http://www.wipo.int/sme/en/documents/copyright_infringement_fulltext.html.)

Scope: Legal Interoperability

Status: In discussion

Corpus

Definition

A corpus is a set of documents that has a scientific meaning. A corpus can be produced by an individual researcher's activity (including its archival materials), or from a laboratory research, field campaign or science and culture heritage project, a survey, etc.

References: Research Data Alliance Europe Report - Community Data Analysis: Huma-Num Section.

Scope: DFT Term Definition Prototype

Status: In discussion

Create derived data products

Definition

Given the data type and a desired data product, the operation identifies a procedure that can apply the required transformation on the data object to create a derived data object, then store the derived data object in the repository, along with provenance and descriptive metadata.

Explanation: Related terms include data type, repository, provenance and descriptive metadata.
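
Examples: The operation can be sketched in Python as a lookup of a transformation procedure followed by storage of the result with its metadata. The PROCEDURES table, the in-memory REPOSITORY, the ".derived" identifier suffix and the CSV-to-JSON transformation are all hypothetical choices made for illustration.

```python
import json


def csv_to_json(payload: str) -> str:
    """Example transformation: turn simple CSV text into a JSON list of records."""
    header, *rows = [line.split(",") for line in payload.strip().splitlines()]
    return json.dumps([dict(zip(header, row)) for row in rows])


# Hypothetical lookup: (data type, desired data product) -> procedure.
PROCEDURES = {("text/csv", "application/json"): csv_to_json}
REPOSITORY = {}  # stands in for the data repository


def create_derived(source_id: str, data_type: str, payload: str, product: str) -> str:
    # Identify a procedure that applies the required transformation.
    procedure = PROCEDURES[(data_type, product)]
    derived_id = f"{source_id}.derived"
    # Store the derived object with provenance and descriptive metadata.
    REPOSITORY[derived_id] = {
        "content": procedure(payload),
        "provenance": {"derived_from": source_id, "procedure": procedure.__name__},
        "descriptive": {"data_type": product},
    }
    return derived_id
```

A request for a product with no registered procedure raises a KeyError in this sketch; a real repository would need a policy for that case.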

Scope: RDA Term Collection Core

Status: New

Crowdsourcing Curation

Definition

Crowdsourcing Curation or Crowdsourced Curation is a type of Curation which is not based on the activity of expert curators or algorithms but is provided by soliciting users or interested parties for their opinion.

Examples: One common crowdsourced curation mechanism for documents is the use of the Internet to rank each document according to the number of upvotes (approvals) and downvotes (disapprovals) that each document has received.
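
Such a ranking can be sketched in Python. The scoring rule used here, upvotes minus downvotes, is an assumption chosen for simplicity; real systems may weight votes differently. The document identifiers and vote counts are made up.

```python
def rank_documents(votes: dict) -> list:
    """Order document ids by net score (upvotes minus downvotes), best first."""
    return sorted(votes, key=lambda doc: votes[doc][0] - votes[doc][1], reverse=True)


# Hypothetical vote counts: document id -> (upvotes, downvotes).
votes = {"doc-a": (12, 3), "doc-b": (40, 35), "doc-c": (7, 0)}
ranking = rank_documents(votes)  # net scores: doc-a 9, doc-c 7, doc-b 5
```

Note that doc-b received the most votes overall but ranks last, because its downvotes nearly cancel its upvotes.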

References: A Theoretical Analysis of Crowdsourced Content Curation http://yiling.seas.harvard.edu/sc2013/Askalidis.pdf

Scope: RDA Data Citation WG

Status: New

Curation

Definition

Curation is a process of maintaining, preserving and adding value to data throughout its lifecycle.

Explanation: Curation may involve the assignment of administrative, descriptive, structural and technical archival metadata. The RDA focus is on curation of digital research objects. See Digital Curation Lifecycle.

References: See: http://www.dcc.ac.uk/digital-curation/what-digital-curation

Scope: DFT Term Definition Prototype

Status: New

Curation Metadata

Definition

Curation metadata describe who supports a curated resource and its availability.

Explanation: See Administrative metadata.

Examples: Examples include version, release date.

References: Resource Metadata for the Virtual Observatory Version 1.12 http://www.ivoa.net/documents/REC/ResMetadata/RM-20070302.html

Scope: RDA DFT Interest Group

Status: New

Curation Workflow

Definition

A type of workflow including active steps to curate data as an aid to on-going management of data through its lifecycle.

Scope: DFT Term Definition Prototype

Status: In discussion

Cyberinfrastructure

Definition

Cyberinfrastructure refers to the set of organizational practices, technical infrastructures, and social norms that collectively provide for the smooth operation of scientific work at a distance.

Explanation: Includes the knowledge infrastructure as well as the hardware and software and communications components.

References: Geoffrey C. Bowker, University of California Irvine, "Effective Communication and Scientific Cyberinfrastructure" http://inspire.ec.europa.eu/events/conferences/inspire_2013/pdfs/socioeconomic/Geoffrey_C._Bowker.pdf

Scope: RDA DFT Interest Group

Status: New

Darwin Core

Definition

The Darwin Core is a body of standards. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing reference definitions, examples, and commentaries. The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, samples, and related information. Included are documents describing how these terms are managed, how the set of terms can be extended for new purposes, and how the terms can be used. The normative document for the terms [RDF-NORMATIVE] is written in the Resource Description Framework (RDF) and is the definitive resource to understand the term definitions and their relationships to each other. The Simple Darwin Core [SIMPLEDWC] is a specification for one particular way to use the terms - to share data about taxa and their occurrences in a simply structured way - and is probably what is meant if someone suggests to "format your data according to the Darwin Core". (see: http://rs.tdwg.org/dwc/)

Scope: DFT Term Definition Prototype

Status: In discussion

Data

Definition 1

Data is a collection of datum. Data is a potential information bearer to a cognitive agent. (Prototype Wiki version)

Explanation: Data is the medium used to communicate and store information. It gives a concrete and persistent status to some information, but by itself (as a string of bits, for example), without context such as the assumed formatting/representation information or reference systems, it has no single meaning. Thus, to extract the information intended in collections of data, an interpretation is necessary that assigns meaning to it using elements of this context, such as the assumed representation.

Examples: Data, like a datum, has a quantitative or qualitative value. Common types of data include content for sea surface temperature measurements, readings from monitoring equipment, user actions on a website, science funding projections, and demographic information (after http://data-informed.com/glossary-of-big-data-terms/). The capital letter A is ASCII character 65, but the bits that express it may represent a pixel or some sound if the context is different. Likewise, the same information (content) may be expressed by very different-looking data, since it may be encoded in many different ways. A simple example is that the temperature of some object or region may be coded in degrees C or F (different reference systems). In this sense data may be considered analogous to syntax, which helps communicate the semantics, which is information.
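The point that the same bits need context to carry meaning can be illustrated with a minimal Python sketch. It reads one byte in two ways and expresses one temperature in two reference systems. All variable and function names are illustrative assumptions, not part of the definition.

```python
# One byte, two interpretations: the context decides what the bits mean.
raw = bytes([65])

as_text = raw.decode("ascii")   # read as an ASCII character
as_pixel = raw[0]               # read as a grey-level pixel intensity (0-255)


def celsius_to_fahrenheit(celsius: float) -> float:
    """Re-express a temperature in a different reference system."""
    return celsius * 9 / 5 + 32


print(as_text, as_pixel)             # A 65
print(celsius_to_fahrenheit(20.0))   # 68.0
```

The information (a temperature, a letter) stays the same; only the data that encodes it changes.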

References: After Quentin L. Burrell, Isle of Man International Business School.

Scope: RDA Term Collection Core

Status: In discussion

Definition 2

Data is a collection of propositions representing an agent's understanding of some entity or state. (Peters Document)

Explanation: Note this is a general view and means that non-digital things are included such as analog data.

References: Formulated based on: Data is a set of values of qualitative or quantitative variables http://en.wikipedia.org/wiki/Data Data are individual pieces of information. http://en.wikipedia.org/wiki/Data Data are typically the results of measurements and can be visualized using graphs or images. http://en.wikipedia.org/wiki/Data Data is a collection of datum. http://smw-rda.esc.rzg.mpg.de/index.php/Data (datum: a role played by a unitary proposition, which provides the content of the datum. http://smw-rda.esc.rzg.mpg.de/index.php/Datum) See discussion under http://smw-rda.esc.rzg.mpg.de/index.php/Data. Data is a potential information bearer to a cognitive agent. http://smw-rda.esc.rzg.mpg.de/index.php/Data [Data is] factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation http://www.merriam-webster.com/dictionary/data (meaning 1) [Data are] facts or information used usually to calculate, analyze, or plan something http://www.merriam-webster.com/dictionary/data

Scope: RDA Term Collection Core

Status: In discussion

Definition 3

Data is a set of values of qualitative or quantitative variables and is a product of "sensory data" detected within the framework of perception.

Explanation: What constitutes data is determined by the activities and objectives of the users.

References: http://en.wikipedia.org/wiki/Data

Scope: RDA Term Collection Core

Status: In discussion

Definition 4

Data are individual pieces of information.

References: http://en.wikipedia.org/wiki/Data

Scope: RDA Term Collection Core

Status: In discussion

Definition 5

Data are typically the results of measurements and can be visualized using graphs or images. http://en.wikipedia.org/wiki/Data Data is a collection of datum. http://smw-rda.esc.rzg.mpg.de/index.php/Data (datum: a role played by a unitary proposition, which provides the content of the datum. http://smw-rda.esc.rzg.mpg.de/index.php/Datum) See discussion under http://smw-rda.esc.rzg.mpg.de/index.php/Data. Data is a potential information bearer to a cognitive agent. http://smw-rda.esc.rzg.mpg.de/index.php/Data Data is factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation http://www.merriam-webster.com/dictionary/data (meaning 1) Data are facts or information used usually to calculate, analyze, or plan something http://www.merriam-webster.com/dictionary/data

Definition 6

The term "data" as used here is meant to be broadly inclusive of something derived from observation or thought and communicated in some form.

Explanation: In addition to digital manifestations of literature (including text, sound, still images, moving images, models, games, and simulations), digital data refers as well to forms of data and databases that generally require the assistance of computational machinery and software in order to be useful, such as various types of laboratory data including spectrographic, genomic sequencing, and electron microscopy data; observational data, such as remote sensing, geospatial, and socio-economic data; and other forms of data either generated or compiled by humans or machines.

References: (Uhlir & Cohen, 2011,3 as reported in Borgman, 2012, p. 1061).

Scope: RDA Term Collection Core

Status: New

Data Access

Definition

Data Access is a process that enables users to retrieve or read published data.

Explanation: Data Access steps implemented within the data management system include: Authentication of the user; Verification of access permission through checking access controls; Identification of the physical location of the data; Selection of the transmission protocol; Retrieval of a copy of the data. Within the data management infrastructure, data access may also entail: Generation of an event that records the access; Archiving of the event; Indexing of the event; Generation of usage summary reports.
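The access steps listed above can be sketched in Python as follows. This is only a minimal in-memory illustration: the `Repository` class, its fields, and the sample user, logical name and location key are all hypothetical and do not describe any particular data management system.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical in-memory stand-in for a data management system."""
    users: dict       # user name -> password (plain text only for this sketch)
    acl: dict         # logical name -> set of user names allowed to read
    locations: dict   # logical name -> physical location key
    storage: dict     # physical location key -> bytes
    events: list = field(default_factory=list)

    def access(self, user: str, password: str, logical_name: str) -> bytes:
        # 1. Authentication of the user
        if self.users.get(user) != password:
            raise PermissionError("authentication failed")
        # 2. Verification of access permission through the access controls
        if user not in self.acl.get(logical_name, set()):
            raise PermissionError("access denied")
        # 3. Identification of the physical location of the data
        location = self.locations[logical_name]
        # 4. Selection of a transmission protocol is skipped: everything is local.
        # 5. Retrieval of a copy of the data
        copy = bytes(self.storage[location])
        # Optional: generation of an event that records the access
        self.events.append((user, logical_name))
        return copy


repo = Repository(
    users={"alice": "secret"},
    acl={"/zone/survey.csv": {"alice"}},
    locations={"/zone/survey.csv": "disk1:0042"},
    storage={"disk1:0042": b"id,value\n1,3.2\n"},
)
data = repo.access("alice", "secret", "/zone/survey.csv")
```

Archiving, indexing and summarizing the recorded events would operate on `repo.events` and are left out of the sketch.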

Examples: From a web browser, access a repository and identify a relevant file. Invoke retrieval of the file using the Logical Name implemented by the repository for arranging distributed data into a logical collection.

References: RDA Practical Policy

Scope: Practical Policy WG

Status: New

Data Aggregate

Definition

A Data aggregate is a grouping of data elements that describe a particular entity.

Scope: DFT Term Definition Prototype

Status: In discussion

Data Analysis

Definition

Data Analysis, also called Analysis, is a data lifecycle stage that involves the techniques used to satisfy analyst goals of producing informative knowledge from organized data.

Explanation: "Analytic processes are often characterized as: 1. discovery for the initial hypothesis formulation, 2. development for establishing the analytics process for a specific hypothesis, and 3. applied for the encapsulation of the analysis into an operational system." (after NIST reference) Note: Traditional statistical analytic techniques/processes may downsize, sample, skim or summarize the data before analysis.

Examples: Application example:

References: NIST Big Data definitions: http://bigdatawg.nist.gov/_uploadfiles/M0392_v1_3022325181.pdf

Scope: DFT Term Definition Prototype

Status: New

Data Analytics

Definition

Data Analytics refers to all investigator-directed tasks that are aimed at extracting scientific information which may be the basis for useful inferences & conclusions from data.

Explanation: Analysis may be of raw, refined, or combined dataset products. Analysis is usually distinguished from data mining. Analysis can be characterized by the procedure that will be applied. Data Mining is the active process of applying the analysis procedure.

Examples: A procedure can be defined that identifies whether a feature is present within a data set. Data Analytics can refer to the algorithm used by the procedure, or the relationship between different algorithms.

Scope: RDA Data Fabric Interest Group

Status: New

Data Architect

Definition

A Data Architect is a practitioner of data architecture, an information technology discipline concerned with designing, creating, deploying and managing data and information architecture.

References: http://pubs.opengroup.org/architecture/togaf9-doc/arch/chap10.html https://en.wikipedia.org/wiki/Data_architect

Scope: RDA Data Fabric Interest Group

Status: New

Data Archiving

Definition

Data archiving is a preservation process that moves data into a managed form of storage for long-term retention.

Explanation: Data archiving in the RDA context focuses on digital archives and their processes, although archiving of physical material is an important type and has some different processes. Long-term preservation of digital objects, for example, involves different processes than preserving paper manuscripts in an archive.

Examples: Curation is one data archiving process.

Scope: DFT Term Definition Prototype

Status: New

Data Arrangement

Definition

Data Arrangement is the process of grouping digital objects by one or more well-defined properties. Digital objects with the same property may be grouped into a logical collection. By browsing the collection, related or similar digital objects can be identified.

Explanation: There are at least two types of arrangement to note. One derives temporally from the original arrangement (in archival terms "respect du fonds") of the data and could be described by provenance metadata. For a record series, this corresponds to the data's position in the original series. The intent is to be able to track the order in which records are deposited into an archive over time. A second kind is less of a temporal arrangement and is an organization as in a digital library, analogous to how a library organizes books on a shelf by subject and author. In a digital library, files are organized by subject, title or author properties, etc. Each collection represents some unifying property associated with all of the collection files. The unifying property is quite arbitrary. Related terms: collection, property. Note that an arrangement can be the result of a query on descriptive metadata. The result of the query can be listed along with other query results, creating a virtual collection that can be browsed.

Examples: In the SCEC project, files were organized by the simulation (computational analysis that was executed). For each simulation, there were input files, output files, visualizations, movies, ...
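Both kinds of grouping, by a shared property and by a query on descriptive metadata, can be sketched in a few lines of Python. The file names, metadata keys and helper functions below are hypothetical and only loosely modelled on the simulation example.

```python
from collections import defaultdict


def arrange(objects, prop):
    """Group digital objects (dicts of metadata) into logical collections by one property."""
    collections = defaultdict(list)
    for obj in objects:
        collections[obj[prop]].append(obj["name"])
    return dict(collections)


def virtual_collection(objects, predicate):
    """Arrangement as the result of a query on descriptive metadata."""
    return [obj["name"] for obj in objects if predicate(obj)]


objects = [
    {"name": "run1_input.dat", "simulation": "run1", "kind": "input"},
    {"name": "run1_output.dat", "simulation": "run1", "kind": "output"},
    {"name": "run2_output.dat", "simulation": "run2", "kind": "output"},
]
by_simulation = arrange(objects, "simulation")
outputs = virtual_collection(objects, lambda o: o["kind"] == "output")
```

Here `by_simulation` holds one logical collection per simulation, while `outputs` is a virtual collection that exists only as a query result.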

Scope: Practical Policy WG

Status: New

Data Broker

Definition

A Data Broker is part of the data infrastructure that provides various low-barrier mediation & interconnection services to support cross-domain and project activities.

Explanation: A key service is to support various Levels of Conceptual Interoperability. Brokers and their associated brokering approach support system "autonomy" & "flexibility", i.e., leaving existing disciplinary/project infrastructures as autonomous and as flexible as possible.

Examples: Services such as discovery and access, workflow assistance, translation and reformatting as well as semantic mediation and vocabulary services.

References: https://rd-alliance.org/system/files/filedepot/97/06506981%20.pdf

Scope: Brokering Governance

Status: New

Data Catalog

Definition

A data catalog is a type of collection. It is a curated collection of metadata about datasets and their data elements.

Explanation: Data catalogs are special types of catalogs. Also note that, typically, a web-based data catalog is represented as a single instance of this class.

References: http://www.w3.org/TR/vocab-dcat/#class-catalog

Scope: RDA DFT Interest Group

Status: In discussion

Data Citation

Definition

Data Citation is the practice of providing an identifying reference to data in a similar way that researchers routinely include a bibliographic reference to published resources.

Explanation: Data Citation uses "data citation metadata" (which see).

References: Australian National Data Service [ANDS], [2011], Van Leunen, [1992] Other sources: quoted from http://www.force11.org/node/4770, cited there as 'adapted from https://www.jstage.jst.go.jp/article/dsj/12/0/12_OSOM13-043/_pdf'

Scope: Data Citation WG

Status: In discussion

Data Citation Metadata

Definition

Data Citation Metadata is a type of metadata/administrative metadata that plays the role of citing a dataset in a way analogous to how books or journal articles are referenced in research publications.

Explanation: Data citation follows good open science practice and allows other researchers to more readily locate and access a fellow researcher's dataset for the purposes of replicating, verifying or building on their results.

Examples: The main components of a dataset citation are the author(s), year, title, archive/distributor, access date, version number, and a persistent identifier or locator. Another example is metadata that maps to the DataCite schema or Dublin Core Terms, etc.
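The citation components named above could be held in a small record and rendered as a citation string, as in this Python sketch. The class, the rendering order and every value (including the identifier) are hypothetical placeholders, not a prescribed citation style.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    authors: list
    year: int
    title: str
    distributor: str
    version: str
    identifier: str    # persistent identifier or locator
    access_date: str

    def format(self) -> str:
        """Render the components as one citation string (order is an assumption)."""
        names = "; ".join(self.authors)
        return (f"{names} ({self.year}). {self.title}. Version {self.version}. "
                f"{self.distributor}. {self.identifier}. Accessed {self.access_date}.")


citation = DatasetCitation(
    authors=["Doe, J.", "Roe, R."],
    year=2017,
    title="Example Sea Surface Temperature Series",
    distributor="Example Data Archive",
    version="1.0",
    identifier="https://doi.org/10.1234/example",   # placeholder, not a real DOI
    access_date="2017-09-05",
)
print(citation.format())
```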

References: USGS Data Management, http://www.usgs.gov/datamanagement/describe/citation.php; FAIR principles

Scope: Data Citation WG

Status: New

Data Cleaning

Definition

Data cleaning or data cleansing/scrubbing is a process used to improve data quality by detecting and correcting (or removing) defects & errors in data.

Explanation: Corrupt or inaccurate records may exist in a record set, table, or database. See data quality.
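A minimal Python sketch of detecting, correcting and removing defects in a record set follows. The record layout, the age bounds and the rule that an empty name is uncorrectable are assumptions chosen for illustration.

```python
def clean_records(records):
    """Return (cleaned, rejected) lists from an iterable of dict records."""
    cleaned, rejected, seen = [], [], set()
    for record in records:
        name = (record.get("name") or "").strip()
        try:
            age = int(record.get("age"))
        except (TypeError, ValueError):
            age = None
        # Remove records whose defects cannot be corrected
        if not name or age is None or not 0 <= age <= 150:
            rejected.append(record)
            continue
        # Correct formatting defects, then drop duplicates of corrected records
        key = (name.title(), age)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": key[0], "age": age})
    return cleaned, rejected


raw = [
    {"name": "  jane doe ", "age": "36"},
    {"name": "Jane Doe", "age": 36},       # duplicate once corrected
    {"name": "", "age": "41"},             # missing name
    {"name": "John Roe", "age": "n/a"},    # unparseable age
]
cleaned, rejected = clean_records(raw)
```

Keeping the rejected records, rather than silently discarding them, leaves a trail that can be reviewed as part of data quality work.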

Scope: DFT Term Definition Prototype

Status: New

Data Collection

Definition

Data Collection is a type of collection formed by some agent-driven aggregation or grouping process whose parts/elements are made of data/datum. A collection is identified by a PID and described, like other types of DOs, by metadata.

Explanation: A collection is a form of aggregation of elements that has an identity of its own separate from the identity of the elements. There are many types of collection, based on the agent's activity-driven purposes for the collection and the nature of the data components. Published data may be considered a collection, in which case the digital object (publication) is then static, invariant, and a persistent identifier has meaning. But in a research collaboration, the entities change over time and are tracked by versions. A soft link can reference an entity that changes over time. A query issued to a database can be invariant, but the result set may change each time. A sensor data stream always has new data from the most recent observation. The stream itself may be identified, but the contents are not static. If we change the input parameters for a workflow, the result will change when the workflow is executed. Thus a workflow structured object has to associate the workflow with the input and the output. Researchers who collect data of some particular type often create their own software frameworks to make the data accessible. See Data Access.
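The remark that a query can be invariant while its result set changes can be shown with a short Python sketch using the standard `sqlite3` module. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (sensor TEXT, value REAL)")
conn.execute("INSERT INTO observations VALUES ('s1', 18.5)")

# The query text never changes ...
QUERY = "SELECT value FROM observations WHERE sensor = 's1' ORDER BY rowid"

first = conn.execute(QUERY).fetchall()
conn.execute("INSERT INTO observations VALUES ('s1', 18.7)")   # a new observation arrives
# ... but the collection it denotes does.
second = conn.execute(QUERY).fetchall()
```

An identifier assigned to `QUERY` therefore names a dynamic collection, whereas an identifier assigned to `first` names a static snapshot.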

Examples: Examples of collection types include: sub-collections; files; soft links to other collections (micro-service structured objects); soft links to objects in external repositories (micro-service structured objects); database queries (micro-service invocation); workflows (workflow structured objects); sensor data streams (micro-service structured objects).

References: Reagan Moore provided examples of data collections as well as the explanation of published data as a static object as opposed to dynamic data which requires a different view of links to it.

Scope: RDA Term Collection Core

Status: In discussion

Data Consumer

Definition

A Data consumer is a type of user, such as a person or group, accessing, using, and potentially performing post-processing steps on data.

References: Strong, Diane M., Yang W. Lee, and Richard Y. Wang. "Data quality in context." Communications of the ACM 40.5 (1997): 103-110.

Scope: RDA Data Fabric Interest Group

Status: New

Data Container

Definition

A data container is a software stack that chunks digital objects at a physical layer.

Examples: Typical containers are file systems, database management systems, content management systems, clouds etc. The software stack implies some form of encapsulation of the digital object.

Scope: DFT Term Definition Prototype

Status: In discussion

Data Curation

Definition

Data curation is a managed process, throughout the data lifecycle, by which data/data collections are cleansed, documented, standardized, formatted and inter-related.

Explanation: The goal of curation is to manage and promote the use of data from its point of creation, to ensure it is fit for contemporary purpose, and available for discovery and re-use. For dynamic datasets this may mean continuous enrichment or updating to keep it fit for purpose. Special forms of curation may be available in data repositories. The data curation process itself should be documented as part of curation, thus curation and provenance are highly related.

Examples: Versioning data or forming a new collection from several data sources. Annotating with metadata. Adding codes to raw data, for example classifying a galaxy image with a galaxy type such as "spiral." Higher levels of curation will also involve maintaining links with annotation and with other published materials. Thus a data set may include a citation link to a publication whose analysis was based on the data.

Scope: RDA Term Collection Core

Status: New

Data Deposit

Definition

Data Deposit (aka ingest, deposition or data archiving) is a process by which data is stored in a data archive.

Explanation: In archives like ICPSR, data collections are deposited by a data archivist who reviews the data and documentation, builds a study description, enhances the documentation, approves the data collection for distribution and archives the data for long-term preservation. To some, archive deposition is a process which starts a preservation process by removing from operational databases selected records that are not expected to be referenced again and storing them in an archive data store where they can be retrieved if needed. Related terms: archive, Data Repository management, add a retention period, data management.

References: http://www.icpsr.umich.edu/icpsrweb/deposit/ See also http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/database-archiving#sthash.KjjhMMwQ.dpuf

Scope: DFT Term Definition Prototype

Status: New

Data Dictionary

Definition

A Data Dictionary is a type of Dictionary that defines "data" by providing (metadata) details about data elements.

Explanation: Data Elements are often described by metadata of Attribute/Field Name, Description, Data Type, and Constraints.
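A data dictionary with these four kinds of detail can be sketched in Python as a mapping from field name to description, data type and constraint. The two fields, their value range and the `validate` helper are hypothetical.

```python
DATA_DICTIONARY = {
    "station_id": {
        "description": "Identifier of the measuring station",
        "data_type": str,
        "constraint": lambda v: len(v) > 0,
    },
    "temperature_c": {
        "description": "Sea surface temperature in degrees Celsius",
        "data_type": float,
        "constraint": lambda v: -5.0 <= v <= 50.0,   # assumed plausible range
    },
}


def validate(record: dict) -> list:
    """Return a list of problems found when checking a record against the dictionary."""
    problems = []
    for name, entry in DATA_DICTIONARY.items():
        if name not in record:
            problems.append(f"{name}: missing")
        elif not isinstance(record[name], entry["data_type"]):
            problems.append(f"{name}: wrong type")
        elif not entry["constraint"](record[name]):
            problems.append(f"{name}: constraint violated")
    return problems
```

For instance, `validate({"station_id": "S1", "temperature_c": 99.0})` reports a violated constraint for `temperature_c`.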

Scope: RDA DFT Interest Group

Status: New

Data Discovery

Definition

Data discovery is a process of query and/or search to find (research) data of interest.

Explanation: Data discovery uses Discovery Metadata.

Scope: RDA Metadata WG

Status: New

Data Ecosystem

Definition

A Data (Digital) Processing Ecosystem is a type of ecosystem made up of technical components that allows one to store, process, analyse and visualize data.

Explanation: Infrastructural technologies are the core of a data processing ecosystem as a dynamic system. An ecosystem view allows a better view of the complexities involved in data processing in the Big Data era. It highlights such ecological ideas as the circular flow of resources, system openness to the environment, sustainability and adaptation of components over time, as well as component dependency in context. In contrast to a biological ecosystem, a data processing ecosystem's units include technical components and digital information.

Scope: RDA Data Fabric Interest Group

Status: New

Data Element

Definition

A unit of data for which the definition, identification, representation (term used to represent it), and permissible values are specified by means of a set of attributes.

Explanation: The general idea of an Element is of a single 'stand-alone' (physical) object or conceptual item. Elements are often expressed as constituents and play roles in Configurations. From a metadata perspective, the term data element is an atomic unit of data that has precise meaning or precise semantics. In traditional data practice a data element has: an identification such as a data element name; a clear data element definition; one or more representation terms; optional enumerated values (code (metadata)); a list of synonyms to data elements in other metadata registries (synonym ring).

Examples: EXAMPLE 1 The data element "age of a person" with values consisting of all combinations of 3 decimal digits. EXAMPLE 2 A personnel record that includes the data elements "name" and "address". In the context of the personnel record, "name" and "address" function as an indivisible unit, e.g., the data element "name" and the data element "address" each can be stored and retrieved as an indivisible unit. However, in a different context, "address" itself may be considered a record that contains its own data elements "street address", "city", "postal code", "country". In a database an example of a data element is a data field. One also says that a data element is an attribute of a data entity.
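The first example, a data element whose permissible values are all combinations of three decimal digits, can be written down as a small Python sketch. The `DataElement` class, the use of a regular expression for permissible values and the wording of the definition string are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    name: str             # identification
    definition: str       # what the element means
    representation: str   # representation term
    permissible: str      # regular expression describing permissible values

    def permits(self, value: str) -> bool:
        return re.fullmatch(self.permissible, value) is not None


age = DataElement(
    name="age of a person",
    definition="Time elapsed since birth, in completed years",   # assumed wording
    representation="three decimal digits",
    permissible=r"[0-9]{3}",
)
```

With this element, `age.permits("042")` holds while `age.permits("42")` does not, because the representation fixes the number of digits.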

References: ISO 11179-1; Bateman, Henschel, and Rinaldi 1995, p. 13.

Scope: RDA DFT Interest Group

Status: In discussion

Data Entity

Definition

An Object, event, or phenomenon about which data is stored in a database and which has intermediate representation in a Data Model.

Scope: RDA Term Collection Core

Status: In discussion

Data Flow Virtualization

Definition

Data Flow Virtualization is a type of virtualization for managing the selection of end points, disjoint parallel I/O paths, caching, distribution, access controls, naming, etc. of data flows.

Scope: RDA Data Fabric Interest Group

Status: New

Data Format

Definition

Data Format refers to the way that data is encoded and stored for use in a computer system and/or for comprehension by humans.

Explanation: Useful data formats are well specified and constrained by a formal data type and/or set of applicable standards.

References: http://guide.dhcuration.org/contents/data-representation/

Scope: RDA Term Collection Core

Status: New

Data Identifier

Definition

A type of identifier that uniquely distinguishes one set of data from all others.

Examples: Types might be: Archival Resource Key (ARK); Digital Object Identifiers (DOI); Extensible Resource Identifier (XRI); HANDLE; Life Science ID (LSID); Object Identifiers (OID); Persistent Uniform Resource Locators (PURL); URI/URN/URL; UUID

Scope: DFT Term Definition Prototype

Status: In discussion

Data Integration

Definition

Data integration is the systematic combining of data from different independent & potentially heterogeneous sources, to create a more compatible, unified view of these data for research purposes.

Explanation: Since data is understood in terms of data schemas, unified views of data may be provided by means of a global schema using a reconciled view of all data. One result of data integration is the ability of a user to meaningfully query the data.
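As a minimal sketch of a global schema, the Python below maps records from two sources with different field names and units onto one unified view that can then be queried. The source names, field names and mappings are hypothetical.

```python
# Hypothetical per-source mappings from local field names to the global schema.
MAPPINGS = {
    "station_db": {"sid": "station", "temp_c": "temperature_c"},
    "buoy_feed": {"buoy": "station", "temp_f": "temperature_c"},
}
# Value conversions needed to reconcile units (Fahrenheit to Celsius here).
CONVERTERS = {("buoy_feed", "temp_f"): lambda f: round((f - 32) * 5 / 9, 2)}


def to_global(source: str, record: dict) -> dict:
    """Translate one source record into the global schema."""
    unified = {}
    for local_name, value in record.items():
        convert = CONVERTERS.get((source, local_name), lambda v: v)
        unified[MAPPINGS[source][local_name]] = convert(value)
    return unified


unified_view = [
    to_global("station_db", {"sid": "S1", "temp_c": 18.5}),
    to_global("buoy_feed", {"buoy": "B7", "temp_f": 68.0}),
]
# A single query now works across both sources.
warm = [r["station"] for r in unified_view if r["temperature_c"] >= 19.0]
```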

References: Lenzerini, Maurizio. "Data integration: A theoretical perspective." Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM, 2002.

Scope: DFT Term Definition Prototype

Status: New

Data Item

Definition

Data item is a type of data element which expresses a proposition that binds one or more property values to some data entity.

Scope: DFT Term Definition Prototype

Status: In discussion

Data Lake

Definition

A data lake is a type of storage repository that holds a vast amount of raw data in its original (native) format until it is needed.

Explanation: While many repositories, such as hierarchical data warehouses, store data in files or folders, a data lake uses a flat architecture to store data. Each data element in a lake is assigned a unique identifier and tagged with a set of extended metadata tags. A data lake can be queried for relevant data, using the ID.
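The flat architecture described above can be sketched in Python as a single mapping from identifier to raw content and tags. The `DataLake` class and its method names are hypothetical, and a real data lake would sit on distributed storage rather than a dictionary.

```python
import uuid


class DataLake:
    """Hypothetical flat store: no folders, only IDs, raw bytes and tags."""

    def __init__(self):
        self._objects = {}   # id -> (raw bytes, set of tags)

    def put(self, raw: bytes, tags) -> str:
        object_id = str(uuid.uuid4())   # unique identifier for the element
        self._objects[object_id] = (raw, set(tags))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve the data in its original format, using the ID."""
        return self._objects[object_id][0]

    def find(self, tag: str) -> list:
        """Return the IDs of all elements carrying the given metadata tag."""
        return [oid for oid, (_, tags) in self._objects.items() if tag in tags]


lake = DataLake()
csv_id = lake.put(b"t,temp\n0,18.5\n", {"csv", "temperature"})
img_id = lake.put(b"\x89PNG...", {"image"})
```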

Scope: RDA Data Fabric Interest Group

Status: New

Data Librarian

Definition

Data Librarians are all experts who have a librarian background.

Explanation: Data Librarians often carry out curation and metadata related work. There is much overlap with activities of data managers and data stewards.

Scope: RDA Term Collection Core

Status: New

Data Lifecycle

Definition 1

A data lifecycle, also known as a data curation lifecycle, represents the stages in the existence of digital information from creation to destruction. A lifecycle view is used to enable active management of the data objects & resources over time, thus maintaining accessibility and usability.

Explanation: A Data Lifecycle defines all stages in the existence of digital data from creation to destruction, as well as chained operations (workflows) within the lifecycle.

References: Source: Pennock, M: "Digital Curation: A Life-Cycle Approach to Managing and Preserving

Scope: RDA DFT Interest Group

Status: In discussion

Definition 2

The term "Data Lifecycle" refers to all stages in the existence of digital information from creation to destruction.

Scope: DFT Term Definition Prototype

Status: In discussion

Data Management Infrastructure

Definition

Data management infrastructure consists of resources such as a data repository and information catalog.

Explanation: This infrastructure is used to provide data management and enforce data management policies.

Examples: Registries, catalogs and repositories for the storage of data and metadata are examples of data infrastructure resource components.

Scope: RDA DFT Interest Group

Status: New

Data Manager

Definition

A data manager (aka data steward) is a person responsible for the management of data elements/digital object content and associated metadata.

Explanation: The broad skills of data managers/stewards cover processes, policies, guidelines and responsibilities for administering research data over its lifecycle in compliance with policy and/or technical & regulatory obligations. A data steward may share some responsibilities with a data custodian, whose responsibilities are usually more focused, and with a data administrator, who may have more technical roles in installation, configuration, upgrades, monitoring, maintenance, and security.

References: https://en.wikipedia.org/wiki/Data_steward

Scope: Practical Policy WG

Status: New

Data Model

Definition

A data model is an abstract model that specifies the structure or schema of a data set. The model provides some documented description of data and thus is an instance of metadata.

Examples: A logical, relational data model showing an organized dataset as a collection of tables with entities, attributes and relations.

Scope: DFT Term Definition Prototype

Status: In discussion

Data Object

Definition

Data Object is a general data concept of which digital objects are particular representational forms. Thus a Data Object may be expressed as a physical or a digital object. The Representation Information accompanying a digital object is used to understand the information carried by the digital object. A DO holds attributes for bit-level objects and pointer(s) to the actual data.

Explanation: A data object may include the named bits of a digital object, but also has a representation object allowing processing of its information content. Representation information, "information that maps a Data Object into more meaningful concepts" (OAIS), makes humanly-perceptible properties happen. The representation information itself is an information object that can be in digital form and itself needs representation information to be understood.

Examples: file format, encoding scheme, data format, data type.

References: OAIS documentation - Holdsworth, David, and Derek M. Sergeant. "A Blueprint for Representation Information in the OAIS model." IEEE Symposium on Mass Storage Systems. 2000. Also input from Reagan Moore that data objects are bits and bytes that have been named.

Scope: RDA Term Collection Core

Status: New

Data Organization

Definition

The term Data Organization denotes the complex of measures that is used by a repository to form aggregations of data objects (incl. collections and metadata), to describe the properties of data objects, to register PIDs and to build the PID records, to link between all components and to set up the containers (software stack) that are used to store all components.

Examples: Arrangement of names assigned to digital entities.

Scope: RDA DFT Interest Group

Status: In discussion

Data Policy

Definition

An assertion that is enforced about digital objects within a repository.

Explanation: A set of high-level principles that establish a guiding framework for data management. A data policy can be used to address strategic aspects such as data access, relevant legal matters, data stewardship issues and custodial duties, data acquisition and other issues. Each data policy defines an assertion that the data managers enforce for the digital objects within a repository.

Examples: Policy to ensure all files are replicated; Policy to ensure all files have a checksum; Policy to define the descriptive metadata that will be associated with each file; Policy to specify the set of permissible access controls.
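One of these policies, that all files have a checksum, can be sketched in Python as an assertion checked over a set of files. The in-memory file layout and function names are assumptions, and SHA-256 is chosen only for illustration.

```python
import hashlib


def checksum(content: bytes) -> str:
    """SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def violations(files: dict) -> list:
    """Return names of files breaking the policy 'all files have a valid checksum'.

    `files` maps a file name to a dict with 'content' (bytes) and an
    optional 'checksum' (hex string); this layout is an assumption.
    """
    bad = []
    for name, info in files.items():
        recorded = info.get("checksum")
        if recorded is None or recorded != checksum(info["content"]):
            bad.append(name)
    return bad


files = {
    "a.dat": {"content": b"alpha", "checksum": checksum(b"alpha")},
    "b.dat": {"content": b"beta"},                            # no checksum recorded
    "c.dat": {"content": b"gamma", "checksum": "0" * 64},     # checksum does not match
}
print(violations(files))   # ['b.dat', 'c.dat']
```

Enforcement would then mean computing missing checksums or flagging mismatched files for repair from a replica.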

References: Mapping the Data Landscape 2011 Summit; Rajasekar, R., M. Wan, R. Moore, W. Schroeder, S.-Y. Chen, L. Gilbert, C.-Y. Hou, C. Lee, R. Marciano, P. Tooby, A. de Torcy, B. Zhu, "iRODS Primer: Integrated Rule-Oriented Data System", Morgan & Claypool, 2010.

Scope: Practical Policy WG

Data Preservation

Definition

Preservation is a set of information management activities/services within archiving & curation in which specific items of data/collections are maintained over time so that they can still be accessed and understood through changes in technology.

Explanation: Preservation safeguards information assets for subsequent analysis & discovery, litigation evidence, security, and regulatory compliance. Preservation services protect data, provide availability, integrity and authenticity controls, and may include security and confidentiality safeguards, such as via an audit trail and audit log, & metadata management (related term, see also archive). Part of preservation is the determination of a retention period.

References: Anderson, William L. "Some challenges and issues in managing, and preserving access to, long-lived collections of digital scientific and technical data." Data Science Journal 3 (2004): 191-201.

Scope: DFT Term Definition Prototype

Status: New

Data Processing

Definition

Data processing is a type of processing that includes a series of operations that are carried out on data, especially by digital computers, in order to acquire, present, transform or interpret data and its informational content. When data is in a digital form, data processing is digital processing.

Explanation: This term relates to a generic concept referring to all kinds of procedures being executed on data, which can range from management to curation and analytics tasks. A synonym is "Data Handling".

References: OAIS

Scope: DFT Term Definition Prototype

Status: In discussion

Data Producer

Definition

A Data Producer is an agent/group responsible for generating data at the beginning of the Data Lifecycle.

Explanation: Producers often also have the task of maintaining data.

References: After Strong, Diane M., Yang W. Lee, and Richard Y. Wang. "Data quality in context." Communications of the ACM 40.5 (1997): 103-110.

Scope: RDA Data Fabric Interest Group

Status: New

Data Professional

Definition

Data Professional(s) are any & all types of experts who deal in some form with (research) data.

Explanation: Synonym is "Data Practitioner(s)".

References: From "Data Management Trends, Principles and Components - What Needs to be Done Next?"

Data Provider

Definition

A Data Provider, or Research Data Provider, is a type of Agent responsible for the creation and/or dissemination and accessibility of data to a consumer.

Explanation: "A provider or disseminator of research data in many cases may not be the same entity as the holder of rights in those data. The rights holder of a dataset also is not necessarily synonymous with its original producer."

References: Implementation Guidelines for the Principles on the Legal Interoperability of Research Data (Summer, 2016)

Scope: Legal Interoperability

Status: New

Data Provider Layer

Definition

The Data Provider Layer is an infrastructure layer that enables data providers to import metadata records into their research platform using OAI-PMH or RESTful services.

Explanation: See also Graph Creation and API Consumer Layer
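As a rough illustration of the OAI-PMH route named in the definition, the sketch below builds a ListRecords harvesting request. The endpoint URL and the helper name build_list_records_url are hypothetical; "ListRecords", "metadataPrefix", "set", and "from" are standard OAI-PMH request arguments, and "oai_dc" is the Dublin Core metadata prefix that OAI-PMH repositories are required to support.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for a provider endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional: restrict harvesting to one set
    if from_date:
        params["from"] = from_date    # optional: only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical provider endpoint, used for illustration only.
url = build_list_records_url("https://repository.example.org/oai", from_date="2017-01-01")
print(url)
```

A harvester would fetch this URL, parse the returned XML records, and import them into the platform; that part is omitted here.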

References: https://rd-alliance.org/sites/default/files/attachment/RDA_Outputs_May2015_web.pdf

Scope: Data Description Registry Interoperability

Status: New

Data Publishing

Definition 1

The process whereby data are subjected to an assessment process to determine whether they should be acquired by a repository; followed by a rigorous acquisition and ingest process that results in products being publicly made available and supported for the long-term by that repository.

Explanation: For a domain repository, data publication is a laborious and generally quite rigorous process that may take a considerable period of time and may even involve assessment by an external panel of scientists. What organizations such as figshare and Dryad do does not count as data publication.

Examples: A typical data publication process:
- Repository is requested to ingest some data
- An assessment process is used to determine whether accession is appropriate
-- May be many criteria
- If appropriate, repository follows internal processes to:
-- Negotiate an acquisition agreement with the provider
-- Reformat, document, review, etc., the data (may involve external science review)
-- Create additional data products, etc. needed (as determined by an assessment of potential user communities and their needs)
-- Generate outreach collateral for designated user communities
- Repository "pushes the publish button"
-- This can all take months (years)
See http://nsidc.org/daac/daac-data-accpetance-plan.pdf for a somewhat out-of-date example.
NOTES: Coming up with a definition of data publication for the Earth Science community is the topic of a session at the ESIP winter meeting.

Scope: DFT Term Definition Prototype

Status: In discussion

Definition 2

Research data publishing is the release of research data, associated metadata, accompanying documentation, and software code (in cases where the raw data have been processed or manipulated) for re-use and analysis in such a manner that they can be discovered on the Web and referred to in a unique and persistent way.

Explanation: Data publishing occurs via dedicated data repositories and/or (data) journals which ensure that the published research objects are well documented, curated, archived for the long term, interoperable, citable, quality assured and discoverable – all aspects of data publishing that are important for future reuse of data by third party end-users. This definition applies also to the publication of confidential and sensitive data with the appropriate safeguards and accessible metadata.

Examples: Zenodo (general repository): https://zenodo.org/; Scientific Data (data journal): http://www.nature.com/sdata/; ICPSR (disciplinary repository): http://www.icpsr.umich.edu/icpsrweb/landing.jsp; Academic Commons, Columbia University (institutional repository): http://academiccommons.columbia.edu/

References: Austin, Claire C. et al. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542

Scope: RDA Data Publishing Workflow Interest Group

Status: New

Data Quality

Definition

Data quality (DQ) is a multi-dimensional construct: a perception and/or judgment of data's fitness or trustworthiness to serve intended research uses in a given context.

Explanation: DQ is often expressed along a continuum from low to high based on a number of perceived attributes of data. These include: relevance to research issues and timeliness; accuracy (the degree of congruity between a data object and real-world phenomena); precision/accuracy (the limit of all practical analytic and rational interpretations of a data object); completeness (no gaps in coverage); consistency (internal and external); and understandability (informativeness, including via associated documentation and captured provenance of changes). Some data may lack the multiple levels of detail or the conciseness needed to be as useful as other data. DQ is improved via data cleaning. Data quality is a proposed element of the Metadata profile.

Examples: Medical data may have low quality with incorrect code values or times or have fields not filled in.

References: Wand, Yair, and Richard Y. Wang. "Anchoring data quality dimensions in ontological foundations." Communications of the ACM 39.11 (1996): 86-95. http://ssm-vm030.mit.edu/Documents/Publications/TDQMpub/WandWangCACMNov96.pdf

Scope: RDA Metadata WG

Status: New

Data Registration

Definition

A process following data acquisition by which data is identified as a unit for subsequent access and processing. The result of data registration is a form of processed data that may be called registered data.

Explanation: Data resources are registered in well-kept repositories with content that never changes and that can therefore be referenced and cited. Related terms: repository, persistent identifier (PID), raw data, citable data, state information.

References: EPIC

Scope: RDA Term Collection Core

Status: New

Data Registry

Definition

A data registry is a storage device supporting the data registration process.

Explanation: A data registry is used following data acquisition to identify data as a unit for subsequent access and processing. Like the usually more comprehensive services of a metadata repository, some registries may provide information on the definition, origin, source, and location of data. Standards relevant to metadata registries include ISO/IEC 11179, Specification and Standardization of Data Elements, and ANSI X3.285, Metamodel for the Management of Shareable Data. See Data Type Registry.

Examples: The U.S. Environmental Protection Agency's Environmental Data Registry provides information about many of the data elements used in current and legacy EPA databases.

References: http://www.niso.org/publications/press/UnderstandingMetadata.pdf

Scope: Data Type Registries WG

Status: New

Data Repository

Definition

A Data Repository is a type of repository where data, data objects and data collections are permanently stored, managed and made accessible.

Explanation: A data repository can be a place where multiple databases, sets, collections, series or files are located and can be found and accessed for distribution via a network or locally. Data repositories may accept a wide range of data types in a wide variety of formats, and generally do not attempt to integrate or harmonize deposited data. Some are open and place few restrictions (or requirements) on the metadata descriptors of the deposited data, while others have a minimal metadata requirement and standard, such as the PRoteomics IDEntifications (PRIDE) database.

Examples: Includes open globally-scoped repositories such as Dataverse, FigShare (http://figshare.com), Dryad, Mendeley Data (https://data.mendeley.com/), Zenodo (http://zenodo.org/), DataHub (http://datahub.io), and DANS (http://www.dans.knaw.nl/).

References: Crosas, M. "The Dataverse Network®: An Open-Source Application for Sharing, Discovering and Preserving Data". D-Lib Mag 17 (1), p2 (2011). White, H. C., Carrier, S., Thompson, A., Greenberg, J. & Scherle, R. The Dryad data repository: A Singapore framework metadata architecture in a DSpace environment. Univ. Göttingen, p157 (2008).

Scope: RDA Data Fabric Interest Group

Status: In discussion

Data Repository management

Definition

A type of data management using repositories. Data repository management is the set of policies that govern organization, control, and properties of the repository.

Examples: Examples could include required file formats, access control restrictions, integrity, replication, retention, disposition, etc.

References: From the RDA PP WG.

Scope: Practical Policy WG

Status: New

Data Representation

Definition

A representation object is a context containing provenance, description, structural, and administrative information.

Explanation: A data context defines the source of the data and the steps that were applied to create the data (provenance), information about the use of the data (description), information about the format of the data and the applications that can parse the format (structural), and information about the management of the data such as storage location, checksum, creation date, size (administrative).

References: RDA PP WG

Scope: RDA Term Collection Core

Status: New

Data Set

Definition 1

A Data Set is a type of managed data aggregation from multiple data elements which are considered as an aggregated unit for processing purposes.

Explanation: It is the basic unit of managed data and may be represented in its entirety as a digital object; in that case it has a persistent identifier and metadata. Most commonly a data set corresponds to the contents of a single database table, or a single statistical data matrix, where every column of the table represents a particular variable, and each row corresponds to a given member of the data set in question.

Examples: Time series are good examples of a data set. The 1790-1960 Decennial Censuses are described as Data Sets by such repositories as the CISER Data Archive: Online Catalog.
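The table view described in the explanation can be sketched in a few lines. This is only an illustration under stated assumptions: the column names and values are invented, and the helper names column and member are hypothetical.

```python
# Hypothetical data set: every column is a variable, every row is one member.
columns = ("station", "year", "mean_temp_c")
rows = [
    ("A", 2015, 9.8),
    ("A", 2016, 10.1),
    ("B", 2015, 12.4),
]


def column(name):
    """Return all values of one variable (one column of the table)."""
    idx = columns.index(name)
    return [row[idx] for row in rows]


def member(i):
    """Return one member of the data set (one row) as a variable -> value mapping."""
    return dict(zip(columns, rows[i]))


print(column("year"))
print(member(2))
```

Reading by column gives a variable across all members; reading by row gives all variables for one member.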

References: After http://www.ontotext.com/factforge/dataset-definition

Scope: RDA DFT Interest Group

Status: New

Definition 2

A collection of data, published or curated by a single agent, and available for access or download in one or more formats.

Scope: RDA Term Collection Core

Status: New

Definition 3

A dataset in RDF represents a body of knowledge, technically a collection of RDF statements which can be interpreted as an RDF graph.

Explanation: An RDF dataset is formally defined in the specification of the SPARQL query language: a dataset is a collection of RDF graphs against which the query is evaluated. A SPARQL dataset consists of one default graph and zero or more named graphs, i.e. RDF graphs identified by URIs.
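The structure described above (one default graph plus named graphs identified by URIs) can be sketched with plain Python containers. This is a simplified model, not an RDF library: the class name RdfDataset, its methods, and all "ex:" identifiers are assumptions made for illustration.

```python
class RdfDataset:
    """One unnamed default graph plus named graphs keyed by graph URI."""

    def __init__(self):
        self.default_graph = set()   # set of (subject, predicate, object) triples
        self.named_graphs = {}       # graph URI -> set of triples

    def add(self, subject, predicate, obj, graph=None):
        """Add a triple to the default graph, or to the named graph `graph`."""
        triple = (subject, predicate, obj)
        if graph is None:
            self.default_graph.add(triple)
        else:
            self.named_graphs.setdefault(graph, set()).add(triple)

    def triples(self, graph=None):
        """Return the triples of the default graph or of one named graph, sorted."""
        if graph is None:
            return sorted(self.default_graph)
        return sorted(self.named_graphs.get(graph, set()))


ds = RdfDataset()
ds.add("ex:g1", "dc:publisher", "ex:Alice")                      # default graph
ds.add("ex:sensor7", "ex:measured", "15", graph="ex:g1")         # named graph ex:g1
ds.add("ex:sensor7", "ex:locatedIn", "ex:Montreal", graph="ex:g1")
ds.add("ex:sensor9", "ex:measured", "17", graph="ex:g2")         # named graph ex:g2

print(ds.triples())
print(sorted(ds.named_graphs))
```

In this sketch the default graph holds a statement about a named graph, which is one common use of the default graph.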

References: http://www.ontotext.com/factforge/dataset-definition

Scope: RDA Term Collection Core

Status: New

Definition 4

A data set is an aggregation of data products with a common origin, history, or application.

Explanation: A data set includes primary (observational) data plus the ancillary data, software, and documentation (metadata) needed to understand and use the observations. Files in a data set share a unique data set name, share a unique data set identifier, and are described by a single DATA_SET catalog object (or equivalent).

References: Planetary Data System Standards Reference, Ch. 6, Data Set / Data Set Collection Contents and Naming, https://pds.jpl.nasa.gov/documents/sr/Chapter06.pdf; see also http://pds.jpl.nasa.gov

Definition 5

A Dataset is a set of RDF triples that are published, maintained or aggregated by a single provider.

Explanation: Unlike RDF graphs, which are purely mathematical constructs, the term Dataset has a social dimension: it is a meaningful collection of triples, that deal with a certain topic, originate from a certain source or process, are hosted on a certain server, or are aggregated by a certain custodian. Also, typically a dataset is accessible on the Web, for example through resolvable HTTP URIs or through a SPARQL endpoint, and it contains sufficiently many triples that there is benefit in providing a concise summary. Since most datasets describe a well-defined set of entities, datasets can also be seen as a set of descriptions of certain entities, which often share a common URI prefix (such as http://dbpedia.org/resource/). In VoID, a dataset is modelled as an instance of the void:Dataset class. Such a void:Dataset instance is a single RDF resource that represents the entire dataset, and thus allows to easily make statements about the entire dataset and all its triples. The relationship between a void:Dataset instance and the concrete triples contained in the dataset is established through access information, such as the address of a SPARQL endpoint where the triples can be accessed.

Examples: DBpedia a void:Dataset . The resource is intended as a proxy for the well-known DBpedia dataset. A good next step would be to make this unambiguously clear by adding general metadata and access metadata to the resource.

References: http://www.w3.org/TR/void/ and http://vocab.deri.ie/void#Dataset

Scope: RDA Term Collection Core

Status: New

Data Stewardship

Definition

Data stewardship is the formalized management and oversight of an organization's data assets/resources (by a data steward) to help provide business users with high-quality data that is easily accessible in a consistent manner.

Explanation: While data governance generally focuses on high-level policies and procedures, data stewardship focuses on accountability and tactical coordination and implementation. A data steward is responsible for carrying out data usage and security policies as determined through enterprise data governance initiatives, acting as a liaison between the IT department and the business side of an organization.

References: http://searchdatamanagement.techtarget.com/definition/data-stewardship

Scope: RDA Data Fabric Interest Group

Status: New

Data Stream

Definition

A data stream is a sequence of digitally encoded, coherent signals used to send or receive a representation of information content as transmitted.

Scope: DFT Term Definition Prototype

Status: In discussion

Data Transformation

Definition

Data Transformation is a process which creates new data from an original source.

Examples: Examples include migrating data into a different format, or creating a subset by selection or query, to produce newly derived results, such as for publication.

Scope: RDA Term Collection Core

Status: New

Data Transparency

Definition

Data transparency is the ease of access to, and the understandability and transparency of, digital objects and data workflows, which allows them to be used no matter where they are stored or what application created them.

Explanation: Transparent data is often assumed to be accurate, well documented, and traceable to its original source. See also Transparency. Data Transparency is a provision or activity to make data available.

References: RDA Legal Interoperability Guidelines. This concerns the "holder of rights and the status of the rights, if any, in a collection of data to the extent that is feasible, provided with reasonable effort and cost by the person or organization making the data available."

Scope: RDA Data Publishing Workflow Interest Group

Status: In discussion

Data Type Registry

Definition

A Data Type Registry (DTR) is a registry that records the implicit details, such as structure, of data in the form of Data Types and associates those Types (and links various data types) with the executable data processing functions that can be useful for working with a specific data type.

Explanation: Data types range from complex digital objects to simple categories that occur in digital objects and are identified with different instances of datasets. A DTR allows a user or machine to submit an unknown type (e.g. a file or a term) and returns information about an available service. This allows the user or machine to continue processing the content, such as visualizing an image, without requiring prior knowledge from the user. This makes cross-disciplinary and cross-border work much more efficient and enables data-driven science even for those who are not data experts. See Data Registry.

Examples: Examples include complex file types in biology (diagnosis) or registering categories that appear in PID records to describe data properties.

References: From "Data Management Trends, Principles and Components – What Needs to be Done Next?"
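The lookup behaviour described above can be sketched as a small in-memory registry that maps a type identifier to a description and to the functions able to process data of that type. Everything named here (DataTypeRegistry, register, resolve, the type identifier "example:TemperatureSeries", and the service "mean") is hypothetical and does not reflect any actual DTR implementation or API.

```python
class DataTypeRegistry:
    """Maps a type identifier to its description and associated processing functions."""

    def __init__(self):
        self._entries = {}

    def register(self, type_id, description, services):
        """Record a data type together with named functions that can process it."""
        self._entries[type_id] = {"description": description, "services": dict(services)}

    def resolve(self, type_id):
        """Return the registered entry for a type, or raise KeyError if it is unknown."""
        entry = self._entries.get(type_id)
        if entry is None:
            raise KeyError(f"unknown data type: {type_id}")
        return entry


def mean_temperature(values):
    """Example processing function associated with the hypothetical type."""
    return sum(values) / len(values)


registry = DataTypeRegistry()
registry.register(
    "example:TemperatureSeries",
    "Sequence of temperature readings in degrees Celsius",
    {"mean": mean_temperature},
)

# A client that only knows the type identifier can discover and call a service.
entry = registry.resolve("example:TemperatureSeries")
result = entry["services"]["mean"]([14.0, 15.0, 16.0])
print(entry["description"], result)
```

The point of the sketch is the indirection: the client needs no prior knowledge of the type, only its identifier.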

Scope: Data Type Registries WG

Status: New

Data Typing

Definition

Data Typing is the process of associating a data type with a digital entity.

Examples: DO 131 is of type Physical Sample.

Scope: Data Type Registries WG

Status: New

Data Upload

Definition

To Do

Scope: DFT Term Definition Prototype

Status: In discussion

Data Versioning

Definition

Data versioning is the process by which new copies of Digital Objects (DOs) are saved when changes are made to the DO.

Explanation: Versioning allows users to go back and retrieve specific versions of DOs. Each version of a DO should have metadata and associated IDs to allow easy access.
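A minimal sketch of this idea follows: each save stores a new copy with its own identifier and metadata, and earlier versions stay retrievable. The class name VersionedObject, the "/vN" identifier scheme, and the metadata fields are assumptions chosen for illustration, not a prescribed format.

```python
import hashlib
from datetime import datetime, timezone


class VersionedObject:
    """Keeps every saved copy of a digital object, each with its own ID and metadata."""

    def __init__(self, object_id):
        self.object_id = object_id
        self.versions = []

    def save(self, content, note=""):
        """Store a new copy of the content and return its version identifier."""
        number = len(self.versions) + 1
        record = {
            "version_id": f"{self.object_id}/v{number}",          # hypothetical ID scheme
            "checksum": hashlib.sha256(content).hexdigest(),      # integrity metadata
            "saved_at": datetime.now(timezone.utc).isoformat(),   # creation time
            "note": note,
            "content": content,
        }
        self.versions.append(record)
        return record["version_id"]

    def get(self, number):
        """Retrieve the content of a specific version (1-based)."""
        return self.versions[number - 1]["content"]


do = VersionedObject("do-131")
first_id = do.save(b"temperature=15", note="initial deposit")
second_id = do.save(b"temperature=15.2", note="calibration corrected")

print(first_id, second_id)
print(do.get(1))
```

Nothing is overwritten: the second save leaves version 1 intact, which is what makes going back possible.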

Scope: Practical Policy WG

Status: New

Data article

Definition

A data article is a "data publishing" product, also known as a data paper or "data descriptor", that may appear in a data journal or any other journal. According to Candela et al. (2015) the most commonly used name is data paper.

Explanation: When publishers refer to "data publishing" they usually mean a data article rather than the underlying dataset. Data articles focus on making data discoverable, interpretable and reusable rather than testing hypotheses or presenting new interpretations (by contrast with traditional journal articles). Whether linked to a dataset in a separate repository, or submitted in tandem with the data, the aim of the data article is to provide a formal route to data-sharing. The parent journal may choose whether or how standards of curation, formatting, availability, persistence or peer review of the dataset are described. By definition, the data article provides a vehicle to describe these qualities, as well as some incentive to do so. The length of such articles can vary from micro papers (focused on one table or plot) to very detailed presentation of complex datasets.

Examples: http://openhealthdata.metajnl.com/articles/10.5334/ohd.ap/ and http://www.nature.com/articles/sdata20162

References: Austin, Claire C. et al. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542 Candela, L.; Castelli, D.; Manghi, P.; Tani, A. Data Journals: A Survey. Journal of the Association for Information Science and Technology, http://dx.doi.org/10.1002/asi.23358

Scope: RDA Data Publishing Workflow Interest Group

Status: New

Data journal

Definition

A data journal is a journal (invariably Open Access) that publishes data articles.

Explanation: The data journal usually provides templates for data description and offers researchers guidance on where to deposit and how to describe and present their data. Depending on the journal, such templates can be generic or discipline focused. Some journals or their publishers maintain their own repositories. As well as supporting bi-directional linking between a data article and its corresponding dataset(s), and facilitating persistent identification practices, data journals provide workflows for quality assurance (i.e., data peer review), and should also provide editorial guidelines on data quality assessment.

Examples: Scientific Data: http://www.nature.com/sdata/; Journal of Open Health Data: http://openhealthdata.metajnl.com/

References: Austin, Claire C. et al. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542 Candela, L.; Castelli, D.; Manghi, P.; Tani, A. Data Journals: A Survey. Journal of the Association for Information Science and Technology, http://dx.doi.org/10.1002/asi.23358

Scope: RDA Data Publishing Workflow Interest Group

Status: New

Data management

Definition

Data management is a process of controlling and managing data and its associated information acquired during observation and research.

Explanation: Data management processes are based on the development and execution of architectures, policies, practices and procedures in order to effectively manage the information lifecycle needs (for research purposes in the RDA context).

References: After http://dama-dach.org/dama-dmbok-functional-framework/

Scope: Practical Policy WG

Status: New

Data object

Definition

A data object is a type of digital object that includes the named bits of a digital object and also has a representation object allowing processing of its information content.

Explanation: Representation information is information that maps a Data Object into more meaningful concepts and makes its properties perceptible to humans. A related term from OAIS is representation object.

Examples: file format, encoding scheme, data format, data type

References: DFT WG file repository: 10 Category DFT working defintions.docx (OAIS)

Scope: RDA Term Collection Core

Status: In discussion

Data packet

Definition

A data packet is a unit of data made into a single package that travels along a given network path.

Explanation: Data packets are used in Internet Protocol (IP) transmissions for data that navigates the Web, and in other kinds of networks.

References: https://www.techopedia.com/definition/6751/data-packet

Scope: RDA DFT Interest Group

Status: New

Data paper

Definition

A 'data paper' is an artifact homologous with articles in traditional journals yet dedicated to describing a dataset. This is a synonym of data article.

Explanation: The concept of a data paper has at least two elements that have to be materialized into concrete and identifiable information objects: the dataset (the subject of the data paper) and the data paper itself (the artifact produced to describe the data set). The term data paper is used to refer to the artefact only. This artefact is homologous with articles for traditional journals; it is expected to have an identifier and a content with title, authors, abstract, number of sections, and references.

References: Candela, L.; Castelli, D.; Manghi, P.; Tani, A. (2015) Data Journals: A Survey. Journal of the Association for Information Science and Technology, 66: 1747–1762 http://dx.doi.org/10.1002/asi.23358

Data policy

Definition

A data policy is an organization's stated data/information management processes designed to assist and protect company data and research assets.

Scope: RDA Term Collection Core

Status: New

Data practice

Definition

Data practice is the actual application or use of ideas and methods about how data are collected, created, stored (maintained), used, shared and released (disseminated).

Explanation: Practice is distinguished from theories about how data should be managed.

Scope: Practical Policy WG

Status: New

Data privacy

Definition

Data privacy, also known as "information privacy", is that aspect of information technology dealing with an agent's ability to determine what data in a computer system can be shared with third parties.

References: http://searchcio.techtarget.com/definition/data-privacy-information-privacy

Scope: RDA DFT Interest Group

Status: New

Data publishing workflow

Definition

Research data publishing workflows are activities and processes that lead to the publication of research data, associated metadata and accompanying documentation and software code on the Web.

Explanation: In contrast to interim or final published products, workflows are the means to curate, document, and review, and thus ensure and enhance the value of the published product. Workflows can involve both humans and machines and often humans are supported by technology as they perform steps in the workflow. Similar workflows may vary in their details, depending on the research discipline, data publishing product and/or the host institution of the workflow (e.g. individual publisher/journal, institutional repository, discipline-specific repository).

Examples: Murphy F, Bloom T, Dallmeier-Tiessen S, Austin CC, Whyte A, Tedds J, Nurnberger A, Raymond L, Stockhause M, Vardigan M (2015). WDS-RDA-F11 Publishing Data Workflows WG Synthesis FINAL CORRECTED. Zenodo. http://dx.doi.org/10.5281/zenodo.33899

References: Austin, Claire C. et al. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542

Scope: RDA Data Publishing Workflow Interest Group

Status: New

Data repository entry

Definition

A data repository entry is the basic component of data publishing consisting of a persistent, unique identifier pointing to a landing page that contains a data description and details regarding data availability and the means to access the actual data.

Explanation: See also Data Registration and Registry.

References: Austin, Claire C. et al. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542

Scope: RDA Data Publishing Workflow Interest Group

Status: New

Data review

Definition

Data review comprises a broad range of quality assessment workflows, which may extend from a technical review of metadata accuracy to a double-blind peer review of the adequacy of data files and documentation and accuracy of calculations and analyses.

Explanation: Multiple variations of review processes exist and are dependent upon factors such as publisher requirements, researcher expectations, or data sensitivity. Some workflows may be similar to traditional journal workflows, in which specific roles and responsibilities are assigned to editors and reviewers to assess and ensure the quality of a data publication. The data review process may therefore encompass a peer review that is conducted by invited domain experts external to the data journal or the repository, a technical data review conducted by repository curation experts to ensure data are suitable for preservation, and/or a content review by repository subject domain experts.

References: Austin, Claire C. et al. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542

Scope: RDA Data Publishing Workflow Interest Group

Status: New

Data sharing

Definition

Data sharing is the principle-based practice of making data available and accessible for use by others.

Explanation: Among other things, data sharing allows replication and critical testing, which have a long history as part of research.

Examples: Investigative use by scholars for research is one example. Use as part of a collection is another.

References: https://en.wikipedia.org/wiki/Data_sharing

Scope: Practical Policy WG

Status: New

Data source

Definition

A data source is simply the source of data.

Explanation: It can be a file, a particular database on a DBMS, or even a live data feed. The data might be located on the same computer as the program, or on another computer somewhere on a network.

Examples: A data source might be an Oracle DBMS running on an OS/2® operating system, accessed by Novell® NetWare; an IBM DB2 DBMS accessed through a gateway; a collection of digital files in a repository; or a local DB file.

References: https://docs.microsoft.com/en-us/sql/odbc/reference/data-sources

Scope: RDA DFT Interest Group

Status: New

Data type

Definition

A data type characterizes data structures.

Explanation: The structure associated with a digital entity defines the context and the operations that can be applied to that entity. These may be at multiple levels of granularity that apply to a particular data type. See Data Type Registry (DTR).

Examples: There are basic types like strings and Booleans but also high level types like document or image. In between are types like jpg which can be used for images or images of documents. Physical Sample is a data type, as is Trace Gas.

References: DTR WG slides for P3.

Scope: Data Type Registries WG

Status: New

Data type registry

Definition

A type of registry for data types supporting their standardization, uniqueness and discoverability.

Explanation: Definition requested from the Data Type Registry WG.

Scope: RDA Term Collection Core

Status: New

Database

Definition

A collection of inter-related data, often with controlled redundancy, organized according to a scheme to serve one or more applications; the data are stored so that they can be used by several programs without concern for data structures or organization. (ANSI X3-172/ISO 11179-1)

Explanation: A database has also been defined as a collection of data organized according to a conceptual structure/model describing the characteristics of these data and the relationships among their corresponding entities, supporting one or more application areas.

Scope: DFT Term Definition Prototype

Status: In discussion

Database Cracking

Definition

Database cracking features incremental partial indexing and/or sorting of the data. Database cracking combines features of automatic index selection and partial indexes.

Explanation: Database cracking reorganizes data within the query operators, integrating the re-organization effort (occasionally invoking creation or removal of indexes on tables and views based on use) into query execution. Database cracking shifts the cost of index maintenance from updates to query processing.
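The idea of reorganizing data as a side effect of queries can be illustrated with a simplified, single-column sketch. It is an assumption-laden toy, not the implementation from the referenced paper: the class CrackedColumn and its methods are hypothetical names, and the partitioning uses list slicing rather than the in-place swaps a real system would use.

```python
import bisect


class CrackedColumn:
    """A column copy that is partitioned a little more by every range query."""

    def __init__(self, values):
        self.values = list(values)   # cracker column: reorganized by queries
        self.pivots = []             # sorted pivot values the column was cracked on
        self.positions = []          # positions[i]: first index holding a value >= pivots[i]

    def _crack(self, pivot):
        """Partition the one piece that may contain `pivot`; return the split position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]             # already cracked on this value
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.values)
        piece = self.values[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.values[lo:hi] = smaller + larger    # only this piece is touched
        split = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return all values v with low <= v < high, cracking the column on the way."""
        start = self._crack(low)
        end = self._crack(high)
        return self.values[start:end]


original = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3]
col = CrackedColumn(original)
first = col.range_query(5, 13)    # cracks on 5 and 13
second = col.range_query(2, 8)    # reuses earlier boundaries, cracks on 2 and 8

print(sorted(first), sorted(second), col.pivots)
```

Each query pays for a small amount of reorganization, and later queries touch only the pieces between already known boundaries, so the column becomes progressively closer to sorted in the ranges that are actually queried.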

References: http://www.vldb.org/pvldb/vol4/p586-idreos.pdf

Scope: RDA Data Fabric Interest Group

Status: New

Database Rights

Definition

Database rights (sui generis database rights) are a form of intellectual property rights that apply to collections of data.

Explanation: Database protection in the legal form of sui generis ("of its own kind," or unique) rights exists mostly in the European Union (with a few similar applications in other countries) (European Parliament, 1996). It applies to databases that show an investment in the verification and presentation of the contents. Database protection refers to the entire or a "substantial part" of a database, not to the single datum or an "insubstantial" part of a database. It prevents unauthorized persons from extracting and reusing substantial parts of the protected database, or even repeated extractions of insubstantial amounts of data. In most non-E.U. countries, databases are only protected if they (or certain portions or characteristics) qualify as "works" within the meaning of copyright. For an analysis of the effects of the E.U. Database Protection legislation, see the references.

References: See, e.g., NautaDutilh 2001; and on research, Guibault and Wiebe, eds. 2013, and Reichman and Uhlir 1999.

Scope: Legal Interoperability

Status: In discussion

Datapoint

Definition

In statistics, a data point or observation is a set of one or more measurements on a single member of a statistical population.

Explanation: See also data element.

References: https://en.wikipedia.org/wiki/Data_point

Scope: RDA DFT Interest Group

Status: New

Dataset

Definition 1

A Data Set is a type of managed data collection. It is the basic unit of managed data and has a persistent identifier and metadata.

References: Provided by Peter Wittenburg.

Scope: RDA Term Collection Core

Status: New

Definition 2

Dataset: A collection of data, published or curated by a single agent, and available for access or download in one or more formats.

Explanation: Part of the DCAT vocabulary - "an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web."Datasets are containers or aggregations and the scale of aggregation may vary.

References: See http://www.w3.org/TR/vocab-dcat/ See also https://jcheminf.springeropen.com/articles/10.1186/s13321-016-0168-9 for an extended discussion of datasets: Dataset is used as a descriptor here to indicate that it is a generic container for data that can logically be reported as a set. The level and scope of the aggregation for a "dataset" can be at any scale (and is at the discretion of the researcher) and thus it can be used to report a single piece of data or all of the data from a large research study. Within "dataset", data can be organized/reported in multiple ways. Individual pieces of data are added to the "datapoint" section and it is implied that there is no relationship between the values included. Data that is logically related to other data, either as a time or property series or as correlated data such as a spectrum (multiple correlated arrays), are stored in the "dataseries" section, either directly under "dataset" or as part of a "datagroup".

Scope: RDA DFT Interest Group

Status: In discussion

Dataset series

Definition

A dataset series is a collection of datasets sharing the same product specification.

Explanation: This is yet another type of aggregation or collection with the (product) unit being a dataset about some "logical grouping" such as by a topic (specification).

Examples: An example would be a series of earth observations. Each year, month or week (depending on the volume) might be a dataset and the series could run from 1998 to now.

References: according to ISO 19115, ISO 19113 and ISO 19114

Scope: RDA Term Collection Core

Status: New

Datum

Definition

A datum is a role played by a unitary proposition, which provides the content of the datum.

Explanation: Datum is a quantifiable fact that can be repeatedly measured and is in a form suitable for communication.

Examples: An observation produces a datum whose proposition can be described as a set of values represented in some structure, such as a table with headings: Location (20:25); Date-Time (12/11/2013 01:34); Temperature Value (15 C).

References: Chaim Zins, Conceptual approaches for defining data, information, and knowledge. Journal of the American Society for Information Science and Technology 58(4):479-493, February 2007. https://www.researchgate.net/publication/220432993_Conceptual_approaches_for_defining_data_information_and_knowledge

Scope: DFT Term Definition Prototype

Status: In discussion

Derived Data Products

Definition

A derived data product is a result of further processing of primary data input.

Explanation: Also called "Derived datasets" in

Examples: Examples include simple unit conversions (lbs to kg), averaging and smoothing, and quality-based reprocessing, as well as adding data objects to form an aggregation of interest.
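As a minimal sketch of two of the derivations named above (a unit conversion and a smoothing step), the following Python fragment derives new series from a primary one. The function names, the sample values and the choice of a trailing moving average are assumptions made for illustration; they are not prescribed by the Practical Policy WG.

```python
from statistics import mean

LB_TO_KG = 0.45359237  # definition of the international avoirdupois pound in kg


def pounds_to_kilograms(values_lb):
    """Derive a kilogram series from primary measurements in pounds."""
    return [v * LB_TO_KG for v in values_lb]


def moving_average(values, window=3):
    """Smooth a series with a simple trailing moving average."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)  # shorter window at the start of the series
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


primary_lb = [10.0, 12.0, 11.0, 13.0]  # hypothetical primary data
derived_kg = pounds_to_kilograms(primary_lb)
derived_smooth = moving_average(derived_kg, window=2)
```

Both derived series keep the length of the primary input, so each derived value can still be traced back to the observation it came from.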

References: https://www.dataone.org/best-practices/describe-method-create-derived-data-products

Scope: Practical Policy WG

Status: New

Description Object

Definition

Something that describes an object. As appropriate, it will have structural and descriptive components.

Explanation: Technically speaking, a "description object" (in NASA's PDS4) can have a "digital object" form, that is, a string of bits. But we assume that there is a form for human reading and, on that basis, give it a special name.

References: https://pds.nasa.gov/pds4/doc/glossary/PDS4_Glossary_v111028.pdf

Scope: RDA Data Fabric Interest Group

Status: New

Descriptions

Definition

Descriptions provide information about a resource for purposes such as discovery and identification.

Explanation: See also Descriptive Metadata

Examples: Examples include definitions, explanatory titles, or summaries. Examples of descriptions also include things from Dublin Core (DC); contextual (project, person, organisation, funding, facility, equipment, publications...): CERIF, ISA; detailed/specific: schema level, connecting dataset to software, etc.

Scope: Metadata Interest Group

Status: New

Descriptive metadata

Definition

Descriptive metadata is a type of metadata that describes a resource for purposes such as discovery and identification, using elements such as creator, title, and subject.

Explanation: This metadata is created by a process, usually carried out by users, of describing and naming data/DOs. The template used operationally may specify the structures within the data object that contain descriptive metadata and associated context, such as metadata values, the schema name that organizes the metadata names, the creation date for the metadata, and the person who created the metadata. For reproducible research, descriptions of the methods & protocols used may need to be detailed. A second operation needs to associate the descriptive metadata with the data object. This can be in the form of metadata stored in a relational database, or as an Archival Information Package, or in an XML file.

Examples: Examples include elements such as title, abstract, author, type of data, data domain, acquisition method or study methodology, and keywords. Examples of descriptive metadata also include things from Dublin Core (DC); contextual (project, person, organisation, funding, facility, equipment, publications...): CERIF, ISA; detailed/specific: schema level, connecting dataset to software, etc. File-level metadata may describe tabular data files, for example as column-level metadata.

References: NISO. (2004) Understanding Metadata. Bethesda, MD: NISO Press, p.1 http://marciazeng.slis.kent.edu/metadatabasics/types.htm Additional explanations from PP WG. Journal on Semantic Web and Information Systems 5 (3) (2009) 1-22. J.P. Mesirov, Accessible reproducible research, Science 327 (5964) (2010) 415-416

Scope: Metadata Interest Group

Status: In discussion

Detailed Metadata

Definition

Detailed metadata is defined in distinction to simpler or lighter forms of metadata, such as Dublin Core, that provide some basic information about data; detailed metadata can supplement this basic information.

Explanation: There are obvious tradeoffs between simple metadata that provides much of what is needed and more detailed metadata. Dublin Core is on the simple side and may be used for metadata about a date, a description, and a format. But the format may just be noted as a Picture without detail as to its media type. More detail may improve searching precision but requires a higher investment in the creation of metadata. Unless there is proper guidance and, in some cases, standard vocabularies, more detail may make it more difficult to promote consistency in the creation of metadata.

Examples: Dublin Core defines a simple item for Subject, which may be broad and may not include sub-type information that helps users find the data.

References: http://www.dlib.org/dlib/april02/weibel/04weibel.html

Scope: RDA Metadata WG

Status: New

Dictionary

Definition

The term Dictionary is often used interchangeably with Glossary (which see), but a dictionary often implies more handling at the lexical level, such as providing some information about phonetics, derivations, history, etc.

References: After Frank Guerino, "What the difference re., Data Dictionary, Ontology," http://ontolog.cim3.net/forum/ontolog-forum/2014-02/msg00173.html

Scope: RDA DFT Interest Group

Status: New

Digital Archive

Definition

A digital or data archive is an archive that seeks to preserve the full digital/record set, not simply the data; that is, all supporting metadata, documentation and other related material that details provenance and context with respect to how the data were generated and should be treated, preserved, analysed and interpreted.

Explanation: See also Archive

References: http://dri.ie/sites/default/files/files/funding-models-open-access-repositories.pdf

Scope: RDA Data Fabric Interest Group

Status: New

Digital Collection

Definition

A digital collection is a type of aggregation formed by a collection process on existing Digital Objects and/or digital data sets, where the collected data is in a digital form.

Explanation: Data collections are a more general form, but we focus on digital aggregations here. Digital collections are often collections of digital sets of data and/or digital objects. Collections are often built from repository data (which have PIDs). When collections are formed, a new collection PID is created for reference, and metadata describing its aggregation properties and ID links to sources should be provided at a minimum. Collections have (or should have) metadata indicating something about when the data within them was collected.

Examples: A complex Digital Object such as a Digital Research Object may be an example of a digital collection.

References: P3 and use case discussion.

Scope: RDA DFT Interest Group

Status: New

Digital Curation Lifecycle

Definition

The digital curation lifecycle describes the path from data creation to data disposal, with various transformations adding or preserving the value of digital content and/or format along the way.

Explanation: The digital curation lifecycle involves the following steps, starting with conceptualization and creation:
* Conceptualise: conceive and plan the creation of digital objects, including data capture methods and storage options.
* Create: produce digital objects and assign administrative, descriptive, structural and technical archival metadata.
* Access and use: ensure that designated users can easily access digital objects on a day-to-day basis. Some digital objects may be publicly available, whilst others may be password protected.
* Appraise and select: evaluate digital objects and select those requiring long-term curation and preservation. Adhere to documented guidance, policies and legal requirements.
* Dispose: rid systems of digital objects not selected for long-term curation and preservation. Documented guidance, policies and legal requirements may require the secure destruction of these objects.
* Ingest: transfer digital objects to an archive, trusted digital repository, data centre or similar, again adhering to documented guidance, policies and legal requirements.
* Preservation action: undertake actions to ensure the long-term preservation and retention of the authoritative nature of digital objects.
* Reappraise: return digital objects that fail validation procedures for further appraisal and reselection.
* Store: keep the data in a secure manner as outlined by relevant standards.
* Access and reuse: ensure that data are accessible to designated users for first time use and reuse. Some material may be publicly available, whilst other data may be password protected.
* Transform: enrich the current object, say by adding annotations, and/or create new digital objects from the original, for example, by migration into a different form.

References: After DCC. See: http://www.dcc.ac.uk/digital-curation/what-digital-curation#sthash.SZcvpDHA.dpuf

Scope: DFT Term Definition Prototype

Status: New

Digital Data

Definition

Digital Data refers to a structured sequence of bits/bytes that represents information content.

Explanation: In many contexts digital data, digital object, data object and data are used interchangeably, implying both the bits and the content. Digital data is sometimes called a bit-level object since it is a specific sequence of bits, independent of any semantic meaning. A bit-level view may also be independent of where the bits exist: in a file, on a channel in transmission, or stored in computer memory. See also registered digital data. RDA focuses on digitized data and operations on it for management and sharing, but non-digital data is also important, and it and its understood content may be cited as part of publications and background.

Examples: See bit stream for examples.

References: What is an information object anyway? http://www.netarch2009.net/slides/Netarch09_Borje_Ohlman.pdf

Scope: RDA DFT Interest Group

Status: New

Digital Data Object

Definition

A Digital Data Object contains data and its description for re-use.

Digital Object

Definition 1

A digital object is composed of a structured sequence of bits/bytes. As an object it is named. The bit sequence (see definition) realizing the object can be identified & accessed by a unique and persistent identifier or by use of referencing attributes describing its properties.

Explanation: This definition reflects discussion of previous definitions at an RDA WG meeting in Garching, Germany in Feb. 2014. The idea is to not be too abstract and to focus on the bits aspect, but Reagan Moore uses the idea that an object has a name and possibly an ID for identity. The idea that identity is a requirement for something to be an entity/object, and that we need a name for it in order to talk about it, comes from the philosopher Quine (see reference). A DO's value is based in part on digital attributes such as editability, interactivity, openness and the ability to be easily copied & distributed (A Theory of Digital Objects, Jannis Kallinikos et al. 2010).

Examples: 0010101, name = "U" (letter); 0010101, name = "21" (integer). A digital object can be a data object, a workflow, or a dataflow.

References: DFT P3 discussion. A Theory of Digital Objects, Jannis Kallinikos et al. 2010, http://firstmonday.org/ojs/index.php/fm/article/view/3033/2564 W.V.O. Quine, 1960, Word and Object, MIT, Cambridge, Mass.

Scope: RDA DFT Interest Group

Status: In discussion

Definition 2

Digital Object is also called a Digital Entity, defined as a "machine-independent data structure consisting of one or more elements in digital form that can be parsed by different information systems; the structure helps to enable interoperability among diverse information systems in the Internet."

Explanation: Presented as part of Garching WG meeting.

References: X.1255 ITU standard "Framework for discovery of identity management information"

Scope: RDA Term Collection Core

Status: Deprecated

Digital Object Identifier

Definition

A code used to permanently and stably identify (usually digital) objects. It is a type of Persistent Identifier (PID) and is sometimes used interchangeably with the term PID.

Explanation: Also known as a DOI, Digital Object Identifiers are issued by the International DOI Foundation. This permanent identifier is associated with a digital object (DO) that permits it to be referenced reliably even if its location and metadata undergo change over time. DOIs provide a standard mechanism for retrieval of metadata about the object, and generally a means to access the data object itself.
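To illustrate how a DOI lets an object be referenced independently of its location, the sketch below turns a DOI name into a URL for the public resolver at https://doi.org/, which redirects to wherever the object currently lives. The function name and the example DOI are hypothetical, and the validity check is a deliberately loose assumption rather than the full DOI syntax.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI proxy


def doi_to_url(doi: str) -> str:
    """Build a resolver URL for a DOI name such as '10.1234/abc'."""
    doi = doi.strip()
    # Accept the common "doi:" prefix but do not require it.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Loose sanity check (an assumption): DOI names start with "10." and
    # separate prefix and suffix with "/".
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI name: {doi!r}")
    # Keep "/" readable; percent-encode characters that are unsafe in URLs.
    return DOI_RESOLVER + quote(doi, safe="/")


url = doi_to_url("doi:10.1234/example.dataset")  # hypothetical DOI name
```

Because only the resolver URL is stored or cited, the object's location can change without breaking the reference.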

References: http://www.nature.com/articles/sdata201618#ref2

Scope: RDA DFT Interest Group

Status: In discussion

Digital Record

Definition

A type of record that has been created in a digital form.

Examples: A bibliographic record, which might include: Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992. -- (Library science text series).

References: http://archives.govt.nz/advice/continuum-resource-kit/glossary/definitions-full-list#Record

Scope: DFT Term Definition Prototype

Status: In discussion

Digital Repository

Definition

A type of Data Repository: a network-accessible storage system in which digital objects may be stored for possible subsequent access or retrieval.

Explanation: Repository architecture manages content as well as metadata and offers a minimum set of basic services, e.g. put, get, search, access control. To be effective a repository must be sustainable and trusted, well-supported and well-managed.
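The minimum service set named above (put, get, search, access control) can be sketched as a toy in-memory class. Everything here is an assumption made for illustration: the class and method names, the use of a UUID as a stand-in for a persistent identifier, and the per-object reader list as the access control model. It is not drawn from Kahn and Wilensky or from any particular repository system.

```python
import uuid


class InMemoryRepository:
    """Toy repository: content plus metadata, guarded by a per-object reader list."""

    def __init__(self):
        # identifier -> (content bytes, metadata dict, set of permitted readers)
        self._objects = {}

    def put(self, content: bytes, metadata: dict, readers: set) -> str:
        identifier = str(uuid.uuid4())  # stand-in for a real persistent identifier
        self._objects[identifier] = (content, dict(metadata), set(readers))
        return identifier

    def get(self, identifier: str, user: str) -> bytes:
        content, _, readers = self._objects[identifier]
        if user not in readers:  # access control
            raise PermissionError(f"{user} may not read {identifier}")
        return content

    def search(self, user: str, **criteria) -> list:
        """Return identifiers the user may read whose metadata match all criteria."""
        hits = []
        for identifier, (_, metadata, readers) in self._objects.items():
            matches = all(metadata.get(k) == v for k, v in criteria.items())
            if user in readers and matches:
                hits.append(identifier)
        return hits


repo = InMemoryRepository()
pid = repo.put(b"12.5,13.1", {"title": "Sea temperatures"}, readers={"alice"})
```

Note that `search` applies the same access rule as `get`, so a user cannot learn of objects they are not permitted to read.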

References: Kahn and Wilensky, 1995, http://www.cnri.reston.va.us/k-w.html Digital Repositories Review, Heery and Anderson, 2005

Scope: DFT Term Definition Prototype

Status: In discussion

Digital repository

Definition

A digital repository is an infrastructure component that is able to store, manage and curate digital objects and return their bitstreams when a request is being issued. (DFT Core Terms)

Explanation:

Scope: RDA Data Fabric Interest Group

Status: New

Discovery Metadata

Definition

Discovery metadata is metadata whose chief role is to support the discovery of relevant data.

Explanation: Discovery metadata was proposed as a basic concept by the RDA MIG.

Examples: Discovery metadata might include the following overall features of a dataset: The title and a description of the dataset. The keywords describing the dataset. The date of publication of the dataset. The entity responsible (publisher) for making the dataset available. The contact point of the dataset. The spatial coverage of the dataset. The temporal period that the dataset covers. The themes/categories covered by a dataset.

References: http://w3c.github.io/dwbp/bp.html#metadata

Scope: RDA Metadata WG

Status: New

Disposal

Definition

Disposal is a process within Records Management of removing material/data that has been subject to preservation.

Explanation: Disposal may be set by policy based on time or information value. Related concept Retention Schedule.

Scope: Practical Policy WG

Status: New

Domain Metadata

Definition

Domain Metadata, or domain-specific metadata, is non-general metadata used to capture domain information as reflected in domain vocabularies and models.

Explanation: When possible, domain-specific metadata should map to metadata standards used within a scientific domain.

Examples: The AGROVOC thesaurus, which contains more than 16,000 concepts, or the General Multilingual Environmental Thesaurus (GEMET).

References: B. Lauser, M. Sini, G. Salokhe, J. Keizer, S. Katz, Agrovoc Web Services: Improved, real-time access to an agricultural thesaurus, Quarterly Bulletin of the International Association of Agricultural Information Specialists (IAALD) 1019-9926 (2) (2006) 79-81. European Environment Agency, GEneral Multilingual Environmental Thesaurus (GEMET). Version 2.0, European Topic Centre on Catalogue of Data Sources (ETC/CDS), http://www.eionet.europa.eu/gemet (2004).

Scope: RDA DFT Interest Group

Status: New

Dynamic Data

Definition

Dynamic Data is data the content of which is changing frequently and at asynchronous moments.

Explanation: Dynamic data can have various flavors. It can be data streams that are generated by sensors when it is unpredictable when data segments will appear in time, i.e. data streams have gaps. It can be data streams that are generated by humans in crowd sourcing scenarios where it is not clear when which cell in a database will be filled.

References: Peter Wittenburg Draft Document on Core Vocabulary

Scope: RDA Term Collection Core

Status: In discussion

Ecosystem

Definition

In general use, an ecosystem is a complex network or community of interconnected and interacting components.

Explanation: Originally ecosystem applied to a biological community of interacting organisms (made up of plants, animals, and microorganisms) together with the surrounding physical environment. In that sense an ecosystem is a community or combination of living and non-living things/sub-systems that work together. There is also the idea that an ecosystem functions in some way as a whole, with rich interactions and feedbacks from its sub-parts. All the parts of an ecosystem work together to make a balanced system. Ecology is a term used to refer to the study of "ecological systems".

Examples: A pond or a rain forest are each examples of complex ecosystems. An ecosystem can be as large as a desert or a lake or as small as a tree. Other bio-examples of ecosystems include a forest, an estuary, and a grassland. An ecosystem includes abiotic components such as sunlight, temperature, precipitation, water or moisture, and soil, and biotic components such as primary producers, herbivores, carnivores, omnivores, and detritivores.

Scope: RDA Data Fabric Interest Group

Status: New

Entity

Definition

Entity is used here as a top-level concept for some independent, enduring, unified and identifiable object.

Explanation: An entity may change parts over time but keep its identity as it plays various roles. Thus a data set may be updated but still be identified as the same data set. Most entities considered in research are physical and have material, temporal and spatial components. But things like Information Content are abstract entities without spatial and temporal components.

Examples: A data set is an entity as is a user or a data record or a repository.

References: DOLCE Ontology. Borgo, Stefano, and Claudio Masolo. "Foundational choices in DOLCE." Handbook on ontologies. Springer Berlin Heidelberg, 2009. 361-381.

Scope: DFT Term Definition Prototype

Status: In discussion

Equity

Definition

Equity is a base socio-legal concept defined as "the quality of being fair and impartial."

References: Oxford English Dictionary.

Scope: Legal Interoperability

Status: New

Event Logging

Definition

An operation that records information about actions performed on digital objects.

Explanation: An operation that generates the following types of log entry metadata in a file or event repository. These identify:
* the type of operation performed upon a data object,
* the time when the operation was performed,
* the name of the person who executed the operation, and
* parameters used by the operation.
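A log entry carrying the four items listed above might be sketched as follows. The field names, the JSON serialisation and the use of a plain list as the event repository are assumptions for illustration; the PP WG definition does not prescribe a format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    """Current time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class LogEntry:
    operation: str            # the type of operation performed upon the data object
    object_id: str            # which data object the operation acted on
    executed_by: str          # the name of the person who executed the operation
    parameters: dict = field(default_factory=dict)   # parameters used by the operation
    timestamp: str = field(default_factory=_utc_now) # when the operation was performed


def log_event(event_repository: list, entry: LogEntry) -> str:
    """Serialise the entry as one JSON line and append it to the event repository."""
    line = json.dumps(asdict(entry), sort_keys=True)
    event_repository.append(line)
    return line


events = []  # stand-in for a log file or event repository
line = log_event(events, LogEntry("replicate", "obj-1", "alice", {"copies": 2}))
```

Writing one self-describing line per event keeps the log append-only, which is what makes it usable later as an audit trail.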

Examples: Related terms include: log entry file, event repository, audit trail.

References: RDA PP WG

Scope: Practical Policy WG

Status: New

External Property

Definition

External properties are those properties that allow management and access of a DO's state information.

Explanation: See also Internal property

Scope: RDA DFT Interest Group

Status: New

Extract descriptive metadata

Definition

Given the data type, access a data type registry to identify a procedure that can be used to parse the data object and then apply a template to extract desired information from the contents of the data object.

Explanation: The template may specify the structures within the data object that contain descriptive metadata and associated context, such as metadata values, the schema name that organizes the metadata names, the creation date for the metadata, and the person who created the metadata. A second operation needs to associate the descriptive metadata with the data object. This can be in the form of metadata stored in a relational database, or as an Archival Information Package, or in an XML file. Related terms include data type registry, Data type and Descriptive metadata.
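The operation defined above has three steps: look up a parsing procedure for the data type, parse the data object, and apply a template to pull out the desired fields. A minimal sketch follows, in which the registry is a plain dictionary and the template maps metadata names to keys in the parsed content. The registry contents, function names and template shape are all hypothetical; a real data type registry would be an external service.

```python
import csv
import io
import json


def parse_json(raw: bytes) -> dict:
    """Parsing procedure for JSON data objects."""
    return json.loads(raw.decode("utf-8"))


def parse_csv_header(raw: bytes) -> dict:
    """Parsing procedure for CSV data objects: expose the header row only."""
    reader = csv.reader(io.StringIO(raw.decode("utf-8")))
    return {"columns": next(reader)}


# Hypothetical data type registry: data type -> parsing procedure.
DATA_TYPE_REGISTRY = {
    "application/json": parse_json,
    "text/csv": parse_csv_header,
}


def extract_descriptive_metadata(raw: bytes, data_type: str, template: dict) -> dict:
    """Look up the parser for data_type, parse raw, then apply the template.

    template maps a metadata name to the key holding its value in the parsed object.
    """
    parser = DATA_TYPE_REGISTRY[data_type]
    parsed = parser(raw)
    return {name: parsed.get(source_key) for name, source_key in template.items()}


raw_object = b'{"title": "Sea temperatures", "creator": "A. Researcher", "values": [1, 2]}'
metadata = extract_descriptive_metadata(
    raw_object, "application/json", {"title": "title", "author": "creator"}
)
```

The second operation mentioned in the explanation, associating the extracted metadata with the data object, is deliberately left out of this sketch.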

Scope: RDA Term Collection Core

FAIR Data Principles

Definition

FAIR Data Principles are a set of guiding principles to make data Findable, Accessible, Interoperable, and Re-usable.

Explanation: In the eScience ecosystem, the challenge of enabling optimal use of research data and methods is a complex one with multiple stakeholders involved. The minimal [FAIR Guiding Principles] are meant to guide implementers of FAIR data environments as a gauge of whether particular implementation choices result in making data FAIR.

References: https://www.force11.org/group/fairgroup/fairprinciples

Scope: RDA DFT Interest Group

Status: New

Facility / equipment

Definition

Facilities and equipment are artifacts, designed, built, operating or installed to serve a specific function affording a convenience or service.

Explanation: A Facility provides a capability via the provision of services to serve a specific function. Facilities can be physical or virtual

Examples: Data management facility, data repository, sensor.

Scope: Metadata Interest Group

Status: In discussion

Fair Use

Definition

Fair Use is any copying of copyrighted material done for a limited and "transformative" purpose, such as to comment upon, criticize, or parody a copyrighted work.

Explanation: Such uses can be done without permission from the copyright owner.

References: http://fairuse.stanford.edu/overview/fair-use/what-is-fair-use/ Association of Research Libraries, Center for Social Media-School of Communication at American University, and Program on Information Justice and Intellectual Property-Washington College of Law at American University (ARL et al), 2012, Code of Best Practices in Fair Use for Academic and Research Libraries. Available online: http://www.cmsimpact.org/sites/default/files/documents/code_of_best_practices_in_fair_use_for_arl_final.pdf.

Scope: Legal Interoperability

Status: In discussion

Federated Architecture

Definition

Federated Architecture is a type of architecture that allows loose coupling of system elements in support of system and data integration.

Explanation: Federated Architecture is often a pragmatic solution supporting "loosely coupled" integration and interoperability at the expense of local optimization. Federated data architectures support the cooperative use of multiple, disparate data sources within an ecosystem view of logically integrated resources.

References: https://en.wikipedia.org/wiki/Federated_architecture

Scope: RDA Data Fabric Interest Group

Status: New

Federation Registry

Definition

A Federation Registry is a type of Register used to identify federated systems.

References: After Reagan Moore

Scope: RDA Data Fabric Interest Group

Status: New

File

Definition

In contrast to a bit stream, a digital (computer processable) file is a representation partitioned into chunks that are conveniently laid out to be managed on a computer processing system.

References: After Gladney, Henry M. "Long-term preservation of digital records: Trustworthy digital objects." American Archivist 72.2 (2009): 401-435.

Scope: RDA Term Collection Core

Status: New

Findable

Definition

Findable in the FAIR approach means that:
1. (meta)data are assigned a globally unique and persistent identifier
2. data are described with "rich metadata"
3. metadata clearly and explicitly include the identifier of the data it describes
4. (meta)data are registered or indexed in a searchable resource
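The four criteria above can be read as a checklist. The sketch below shows one rough way such a checklist could be applied to a record. The record layout, the function name and especially the threshold used to approximate "rich metadata" are assumptions; the FAIR principles do not define a numeric test, and the presence of an identifier says nothing about whether it is actually globally unique or persistent.

```python
MIN_METADATA_FIELDS = 3  # arbitrary stand-in for "rich metadata" (an assumption)


def findability_gaps(record: dict, searchable_index: set) -> list:
    """Return the numbers (1-4) of the Findable criteria that the record fails."""
    identifier = record.get("identifier")
    metadata = record.get("metadata", {})
    gaps = []
    # 1. An identifier is assigned (uniqueness/persistence cannot be checked locally).
    if not identifier:
        gaps.append(1)
    # 2. The data are described with "rich" metadata, approximated by a field count.
    if len(metadata) < MIN_METADATA_FIELDS:
        gaps.append(2)
    # 3. The metadata explicitly include the identifier of the data they describe.
    if not identifier or metadata.get("identifier") != identifier:
        gaps.append(3)
    # 4. The record is registered or indexed in a searchable resource.
    if identifier not in searchable_index:
        gaps.append(4)
    return gaps


good_record = {
    "identifier": "hdl:example/1",  # hypothetical identifier
    "metadata": {"identifier": "hdl:example/1", "title": "T", "creator": "C"},
}
gaps = findability_gaps(good_record, searchable_index={"hdl:example/1"})
```

A check like this can only flag obvious gaps; whether metadata are genuinely rich remains a matter of judgment.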

Explanation: In modern parlance, something is findable if it is capable of being identified through the Internet and its services.

References: The FAIR Guiding Principles for scientific data management and stewardship, http://www.nature.com/articles/sdata201618#ref2

Scope: Metadata Standards Directory Working Group

Status: New

Fixed Schema

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Flexible Schema

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Framework

Definition

A reusable design (models and/or code) that can be refined (specialized) and extended to provide some portion of the overall functionality of many applications.

Explanation: Elaboration 1: A framework is more a kind of meta-design that covers a high degree of flexibility of fit. Elaboration 2: The ability to make refinements may require that the design is fully known, which is not the case in DFIG at the very beginning, which is why some like to use the term "framework". Thus it seems that we would like to use the term "framework" as a much more fuzzy term to denote that we are not sure at this moment about many of the characteristics.

References: Systems Engineering

Scope: RDA Data Fabric Interest Group

Status: New

Glossary

Definition

A glossary is an alphabetical list of terms or words found in or related to a specific (specialized) topic or text. It may or may not include explanations, and its vocabulary may be monolingual, bilingual or multilingual.

Explanation: Unlike a Vocabulary, which only provides a list or grouping of words or terms that are common to a context, a Glossary usually provides the long name, short name or acronym, and a description/definition. It rarely gets into things like synonyms and antonyms. A glossary may be specialized, such as terms for Big Data.

Examples: An example of a glossary is the FAO Fisheries Glossary. http://www.datascienceglossary.org/ has a data science glossary.

References: http://www.fao.org/fi/glossary/default.asp S. Wright and G. Budin, editors. Handbook of terminology management, Basic aspects of terminology management. John Benjamins Publishing Company, 1997.

Scope: RDA DFT Interest Group

Status: New

Graph Creation Layer

Definition

Graph Creation Layer is a Digital Infrastructure layer that aggregates information, and uses Google API and other services to identify missing connections.

Explanation: See also Data Provider and API Consumer Layer.

References: https://rd-alliance.org/sites/default/files/attachment/RDA_Outputs_May2015_web.pdf

Scope: Data Description Registry Interoperability

Status: New

Identifier

Definition

An identifier (aka digital identifier) is a bit string that is used to provide Object Identity.

Explanation: For many, a digital identifier is associated with a registry for the identifier and a repository for the data that is identified.

Examples: The DOI system is used to identify electronic documents such as journal articles.

References: ISO 1117

Scope: RDA DFT Interest Group

Status: New

Identity

Definition

Identity is that property of an object, such as a Digital Object or Resource, which distinguishes each object from all others. Identity is established by some process that connects a set of attributes to some object.

Explanation: The process of assigning a unique identifier to authors of journal articles and other published work so that each author may be uniquely identified illustrates the definition.

Examples: Author, user or resource identity.

Scope: RDA DFT Interest Group

Status: In discussion

Immutable class

Definition

In programming, an immutable class is one whose objects do not change state after creation, so that their hash code remains the same.

Explanation: Immutability is shown by means of an immutable flag which has a type of PID

Examples: In Java, String, Integer and the other wrapper classes are immutable.
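The same idea can be sketched in Python with a frozen dataclass: attempts to change the object's state are rejected, so its hash stays the same for its whole lifetime. The class name, its fields and the identifier value are hypothetical and are not taken from the PID Information Types WG.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class PidRecord:
    """Immutable record: fields cannot be reassigned after construction."""
    pid: str
    checksum: str


record = PidRecord(pid="example-prefix/abc", checksum="d41d8c")  # hypothetical values
hash_before = hash(record)

try:
    record.checksum = "changed"   # rejected because the dataclass is frozen
    mutated = True
except FrozenInstanceError:
    mutated = False

hash_after = hash(record)
```

Because the hash is derived from the field values and those cannot change, such objects are safe to use as dictionary keys or set members.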

References: PID Info Type WG

Scope: PID Information Types WG

Status: New

Index

Definition

An index is a type of metadata (see Descriptive metadata) used in support of the indexing operation.

References: RDA PP WG

Scope: Practical Policy WG

Status: New

Indexing

Definition

For all data objects in a collection, this operation generates a searchable representation of the contents of a data object (full text search), or the contents of descriptive metadata associated with data objects, or the events that track operations on the data object.

Explanation: The index may be stored in a triple-store as RDF, or may be stored in a relational database. Related terms: index (a type of metadata), Descriptive metadata, Event tracking.
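One common searchable representation for full text search is an inverted index, which maps each term to the objects that contain it. The sketch below builds such an index in memory for a small collection. The tokenisation rule, the function names and the sample collection are assumptions; the definition above does not prescribe an index structure, and a production system would use a triple-store or database as noted.

```python
import re
from collections import defaultdict

TOKEN_PATTERN = re.compile(r"[a-z0-9]+")  # simplistic tokeniser (an assumption)


def build_index(collection: dict) -> dict:
    """Map each lower-cased token to the set of object ids whose text contains it."""
    index = defaultdict(set)
    for object_id, text in collection.items():
        for token in TOKEN_PATTERN.findall(text.lower()):
            index[token].add(object_id)
    return index


def search(index: dict, query: str) -> set:
    """Return ids of objects that contain every token in the query."""
    tokens = TOKEN_PATTERN.findall(query.lower())
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


collection = {  # hypothetical data objects, or their descriptive metadata
    "obj-1": "Sea surface temperature 2013",
    "obj-2": "Air temperature series",
}
index = build_index(collection)
hits = search(index, "sea temperature")
```

The same function could index descriptive metadata or event records instead of object contents by passing those strings in place of the full text.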

Scope: RDA Term Collection Core

Status: New

Information

Definition

Information is data that has been processed and/or communicated into a form, such as records, that is meaningful to the recipient.

Explanation: Information is passed to a recipient by means of an information object, aka a message, used by a sender to represent one or more concepts within a communication process, intended to increase knowledge in recipients.

References: Davis \& Olson, Management Information Systems, 1985

Scope: DFT Term Definition Prototype

Status: New

Information Content

Definition

Information content is a state of affairs. A signal contains information about X to just that extent to which a suitably placed observer could learn something about X (change their knowledge) by consulting this signal.

Explanation: Information is created by or associated with a state of affairs among a set of possibilities of a situation. This information is carried by a signal. This follows Dretske's paradigm shift from the engineering aspect to the semantic aspect of information. There is a relationship between information and knowledge. There is prior knowledge about a specific information source, and additional knowledge is added by the content of information, which may be conveyed by data representing that information. One then says that "data bears information".

Examples: That there is smoke carries the information that there is a fire. That a thermometer reads 101 in a body of water tells us something about the heat characteristics of that body. We thus acquire new knowledge. Examples of information content types: novels, legal documents, charts, symbols, traffic directions, recipes, computer programs, XML files, file formats, ontologies, class descriptions, sentences, URIs, simulation models.

References: Dretske, F. I. Knowledge and the Flow of Information, Basil Blackwell, Oxford, 1981. Xu, H. and Feng, J., "Towards a Definition of the 'Information Bearing Capability' of a Conceptual Data Schema", In Systems Theory and Practice in the Knowledge Age, (E. Ragsdell et al.), Kluwer Academic/Plenum Publishers, New York. ISBN 0-306-47247-3, 2002. Xu, Kaibo, Junkang Feng, and Malcolm Crowe. "Defining the notion of 'Information Content' and reasoning about it in a database." Knowledge and Information Systems 18.1 (2009): 29-59. Information Artifact Ontology.

Scope: RDA Term Collection Core

Status: New

Information Object

Definition

A set of attributes defining the semantics (meaning) of a data object's content. An Information Object consists of a Data Object which has explanatory metadata (e.g. format representation) as part of the Object.

Explanation: Also called a "tagged digital object", which is "a digital object paired with its companion description object."

Examples: In the OAIS example, a digital image in TIFF format can only be rendered as an image using software which has been designed to interpret the bitstream in accordance with the TIFF format specification. In other words, the logical Information Object (the image) can only be derived from the physical Data Object (the bitstream) via a process of interpretation. OAIS uses the term Representation Information to describe the knowledge base required for this interpretation.

References: OAIS documentation: Holdsworth, David, and Derek M. Sergeant. "A Blueprint for Representation Information in the OAIS model." IEEE Symposium on Mass Storage Systems. 2000. "What is an information object anyway?" http://www.netarch2009.net/slides/Netarch09_Borje_Ohlman.pdf See also https://pds.nasa.gov/pds4/doc/glossary/PDS4_Glossary_v111028.pdf

Scope: RDA Data Fabric Interest Group

Status: In discussion

Information Science

Definition

Information Science is the study of the mediating perspectives of universal human knowledge.

Explanation: These mediating perspectives include: cognitive, social, and technological aspects and conditions, which facilitate the dissemination of human knowledge from the originator to the user. Computer Scientists tend to view information through a technological perspective.

References: Zins, C. (2006). Redefining information science: From "information science" to "knowledge science". Journal of Documentation, 62(4), 447-461.

Scope: DFT Term Definition Prototype

Status: New

Infrastructure

Definition

Infrastructure is the basic physical and organizational structure needed for the operation of various necessary applications so that they can function on top of this structure. In SE terms it is a system consisting of interconnected elements organized to achieve one or more stated purposes.

Explanation: This was a first attempt by DF but hasn't been followed up yet. A data infrastructure is a digital infrastructure promoting data sharing and consumption. Similarly to other infrastructures, it is a structure needed for the operation of a society as well as the services and facilities necessary for an economy to function, the data economy in this case.

Examples: Data infrastructure is one important type of infrastructure.

References: DF meeting notes; https://en.wikipedia.org/wiki/Data_infrastructure

Scope: RDA Data Fabric Interest Group

Status: New

Instances of Bit Stream

Definition

An instance of data has a bit stream or bit sequence.

Scope: DFT Term Definition Prototype

Status: In discussion

Integrity

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Intellectual Property Rights

Definition

Intellectual Property Rights are a type of right that includes copyright, sui generis database rights, patents, and other similar laws that establish a regime for access, use, or reuse of information, including research data or metadata.

Scope: Legal Interoperability

Status: New

Interfaces

Definition

Definition (first attempt): a specification that defines the way components within systems interact.

Scope: RDA Data Fabric Interest Group

Status: New

Internal Property

Definition

Internal property refers to the properties, making up an internal structure, that allow one to interpret the content of a DO.

Explanation: This view of a Digital Object is one in which there are things external to the object, such as its name and ID, that identify it and may provide information about its creation in context. In contrast, internal properties are types of metadata that describe the elements of the DO independent of its creation context.

Examples: Internal properties may include data uncertainty, data accuracy, and precision, which are also discipline-dependent.

References: After DFT core term definitions.

Scope: DFT Term Definition Prototype

Status: New

Interoperability

Definition

Interoperability describes the extent to which systems and devices can work together, exchange data, and interpret that shared data. For two systems to be interoperable, they must be able to exchange data and subsequently present that data such that it can be understood by a user. Interoperability generally means the ability to exchange data routinely and freely between systems, because each system has at least knowledge of the other systems' formats in which data is exchanged. A stronger type of data exchange can include knowledge of the meaning of some of the data content.

Explanation: At the system level, interoperability is the ability of a system to accept and send services and to use the services so exchanged to operate usefully. We sometimes speak of a data quality of interoperability, which means that the data can function with other data as part of some system operation such as analysis or querying. To be interoperable, the FAIR principles propose that (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation; that (meta)data use vocabularies that follow FAIR principles; and that (meta)data include qualified references to other (meta)data.

Examples: http://www.himss.org/library/interoperability-standards/what-is?navItemNumber=17333 en.wikipedia.org/wiki/Interoperability Institute of Electrical and Electronics Engineers, IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries, New York, NY: 1990. National Committee on Vital and Health Statistics (NCVHS) Report on Uniform Data Standards for Patient Medical Record Information, July 6, 2000, pp. 21-22.

References: ISO TC204, document N271. The FAIR Guiding Principles for scientific data management and stewardship, http://www.nature.com/articles/sdata201618

Scope: RDA Data Fabric Interest Group

Status: In discussion

Key Metadata

Definition

Key metadata is information associated with a digital object (or entity) that is required for discovery.

Explanation: A distinction is proposed: minimal metadata does not include information that helps discovery, whereas key metadata does.

References: Proposed by Reagan Moore

Scope: RDA Metadata WG

Status: New

Keyword

Definition

A keyword is a word associated with a concept of importance or significance.

Explanation: A keyword may act to provide access to information. In this role it may be used as part of an information retrieval system to suggest the likely content of a document.

References: http://www.dictionary.com/browse/keyword

Scope: Metadata Interest Group

Status: New

Landing Page

Definition

A Web page providing access to metadata, data files, dataset terms, waivers or licenses, and version information that a user arrives at after clicking a hyperlink.

Explanation: A landing page usually has a stable URL. Landing pages can have query-based links to other things (papers which cite this one, etc.). To be useful, information on a landing page is indexed and searchable. In the RDA context a PID resolves to a human-readable landing page. This may be a page that contains the essential metadata state information about a Digital Object. For a data set, a landing page provides metadata including a link to the superset (PID of the data source) and a citation text snippet. Things like an ID/DOI can be found on the database landing page for a published article. See also Machine Actionable.

Examples: dx.doi.org redirects to a landing page URL.

References: http://www.apastyle.org/learn/faqs/what-is-doi.aspx https://rd-alliance.org/system/files/documents/RDA-DC-Recommendations_150609.pdf http://www.nature.com/articles/sdata201618#bx2

Scope: Data Citation WG

Status: In discussion

Landscape

Definition

Definition (first attempt): an area defined by elements and their interaction (interfaces, protocols) where many specifications are unclear, but where we can indicate some essential functions already now, and where not all elements (components, services) are yet known.

Explanation: Elaboration 1: Some see "landscape" and "framework" obviously as very similar terms. Elaboration 2: A landscape may contain multiple frameworks at different stages of development and sophistication.

Scope: RDA Data Fabric Interest Group

Status: New

Legacy data

Definition

Legacy data refers to data that was previously generated.

Explanation: Legacy data may be associated with a legacy computer system and/or application program which continues to be used because of the cost of replacing or redesigning it. Often older systems and legacy data are large and monolithic enough to make them difficult to modify and use. Legacy software to exploit the data may run on antiquated hardware with high maintenance costs.

Scope: RDA DFT Interest Group

Status: New

Legal Interoperability

Definition

Legal interoperability occurs among two or more datasets when: the legal use conditions are clearly and readily determinable for each of the datasets, typically through automated means; the legal use conditions imposed on each dataset allow creation and use of combined or derivative products; and users may legally access and use each dataset without seeking authorization from data rights holders on a case-by-case basis, assuming that the accumulated conditions of use for each and all of the datasets are met.

References: Draft report of RDA-CODATA Interest Group on the Legal Interoperability of Research Data

Scope: Legal Interoperability

Status: In discussion

Legal Right

Definition

A legal right, or just a Right, as used here is some ability bestowed onto an agent (a Right(s) Holder/Owner) by a given legal system.

Explanation: Rights can be modified, repealed, and restrained by laws. See also Right Holder

Scope: Legal Interoperability

Status: New

Lexicon

Definition

A lexicon is the terminology or vocabulary used in a language, including its words and expressions.

Explanation: Lexicon is a collective concept.

References: Cognitive Atlas Concept - CAO_00381

Scope: DFT Term Definition Prototype

Status: New

Lifecycle

Definition

Lifecycle (or data lifecycle) is the sequence of processing that data undergoes from its creation and documentation, through its storage in a repository, to its eventual disposal.

Scope: RDA Data Citation WG

Status: New

Linked Data

Definition

Linked data, also called Linked Open Data, is data where relationships/connections among data are made available. This allows easy data access.

Explanation: This collection of interrelated datasets is stored on the Web and available via a common format, RDF.

Examples: A typical case of a large Linked Dataset is DBPedia (http://dbpedia.org/), which, essentially, makes the content of Wikipedia available in RDF.

References: http://www.w3.org/standards/semanticweb/data#summary

Scope: DFT Term Definition Prototype

Status: New

Machine Actionable

Definition

Machine Actionable means that something (e.g. a Digital Object) is in a form that a computing system can process in some automated fashion.

Explanation: Data policies may be made machine/computer actionable.

Examples: One may make a landing page machine-actionable so that the data set can be retrieved by re-executing a timestamped query that is provided.
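
The idea of re-executing a timestamped query can be sketched as follows. This is only an illustration under stated assumptions: the versioned `ROWS` store, the `execute_query` helper, and the stored query parameters are all invented here and do not come from the RDA Data Citation recommendations.

```python
from datetime import datetime

# Hypothetical versioned store: each row records when it became valid and,
# if it was later superseded or deleted, when it stopped being valid.
ROWS = [
    {"id": 1, "value": 10, "valid_from": datetime(2017, 1, 1), "valid_to": None},
    {"id": 2, "value": 20, "valid_from": datetime(2017, 3, 1), "valid_to": datetime(2017, 6, 1)},
    {"id": 3, "value": 30, "valid_from": datetime(2017, 7, 1), "valid_to": None},
]

def execute_query(min_value, as_of):
    """Return ids of rows matching the query as the store stood at `as_of`."""
    return [
        row["id"]
        for row in ROWS
        if row["valid_from"] <= as_of
        and (row["valid_to"] is None or as_of < row["valid_to"])
        and row["value"] >= min_value
    ]

# The landing page would carry the query and its timestamp (both invented here).
stored_query = {"min_value": 10, "as_of": datetime(2017, 4, 1)}
print(execute_query(**stored_query))  # [1, 2]
```

Because the timestamp is stored with the query, re-executing it later returns the same subset even after rows have been added or retired.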

References: https://rd-alliance.org/system/files/documents/RDA-DC-Recommendations_150609.pdf

Scope: Practical Policy WG

Status: New

Manage data sets in a repository

Definition

Enforce desired properties for each digital object in a repository

Explanation: Manage data sets in a repository means defining the policies that govern the arrangement, naming, descriptive metadata, provenance metadata, representation metadata, administrative metadata, access controls, retention, disposition, integrity, and replication of digital objects. The desired properties may include required data format, or automated full text indexing, or generation of derived data products, or distribution across multiple storage locations.

References: RDA PP WG

Scope: Practical Policy WG

Manage metadata catalog

Definition

Manage information about each digital object in a repository

Explanation: Manage metadata catalog involves defining the policies that govern the choice of metadata schema, reserved vocabularies, metadata organization in tables, and metadata properties (creation date, access control, ownership). An implication is that the metadata will be consistently updated after each action applied to the repository.

Examples: A management example is the automated updating of the storage location for a digital object after it is moved to a new storage location.

References: RDA PP WG

Scope: Practical Policy WG

Mashup

Definition

Mashups are lightweight composite applications that source all of their content from existing systems and data sources; they have no native data store or content repository. To access the resources that they leverage, mashups employ the technologies of the Web, including representational state transfer (REST) APIs, RSS and ATOM feeds and widgets.

References: http://www.gartner.com/it-glossary/mashups

Scope: DFT Term Definition Prototype

Status: In discussion

Math Test

Definition

A formula: <math>b \in \langle 0,1 \rangle</math>

Medium

Definition

The material with changeable characteristics in or on which information can be represented and thus can be used to support the storage and/or transmission of data.

Explanation: See also Data Format.

Examples: This can be chips, films, compact optical disks, cards, magnetic disks, magnetic drums, and paper but also copper wire, coaxial cable, optical fiber, or electromagnetic wave as in microwave. A tape with a magnetizable surface layer on which data can be stored by magnetic recording.

Scope: Metadata Interest Group

Status: New

Metadata

Definition 1

Metadata is data that plays the role of (is used for) documentation for data/resource discovery, description, and contextualisation.

Explanation: Metadata provides contextual information on data collections in order to retest, reuse and repurpose data. Data/resource discovery allows resources to be found by relevant criteria: identifying resources; bringing similar resources together; distinguishing dissimilar resources; giving location information.

Examples: Examples include data about related datasets (including provenance metadata), software, publications, organisations, persons (such as organizer of the data). Typically descriptive metadata includes such things as source and time of creation. For data and report publication it may include administrative metadata such as authors and date of submission. A PID is an example of metadata used to reference data. An example is retention period metadata which defines the date when retention of the data object should be evaluated.

References: RDA MD and PP WG discussions

Scope: RDA Metadata WG

Status: New

Definition 2

Metadata is a type of data object that contains attributes describing properties of an associated data or digital object. "'Metadata' represent the set of instructions or documentation that describe the content, context, quality, structure, and accessibility of a data set." Michener (2006)

Explanation: MD can be used for discovery, access, selection, licensing, authorization, quality, suitability, provenance, and reproducibility. It may contain as key the persistent identifier of the associated object. The association between a data object and metadata is that the content of the metadata describes the data object. Metadata may serve different purposes, such as helping people to find data of relevance (discovery) or to bring data together (federation).

Examples: Data properties, both internal and external, are types of metadata as is transactional information about data.

References: Michener, William K. "Meta-information concepts for ecological data management." Ecological informatics 1.1 (2006): 3-7. More: http://www.esajournals.org/doi/abs/10.1890/1051-0761(1997)007%5B0330%3ANMFTES%5D2.0.CO;2

Scope: RDA Term Collection Core

Status: In discussion

Definition 3

Metadata can be Representation Information which is information that maps a Data Object into more meaningful concepts.

Explanation: Representation Information about a piece of data is added to understand it. For example, a format for data is added.

Examples: Examples include Preservation Description Information (PDI) and Packaging Information. Format info is descriptive structural metadata or Representation Information. This info should be adequate for things like rendering a digital media object, but without additional information it is not adequate for understanding.

References: "Understanding a Digital Object-Basic Representation Information", Ch. 7, Advanced Digital Preservation by David Giaretta, http://www.scribd.com/doc/102252110/Chapter-7-Understanding-a-Digital-Object-Basic-Representation-Information

Scope: RDA Metadata WG

Status: New

Definition 4

The PIT API can be used to create, edit and read metadata assigned to a PID. In this context, metadata describes a PID itself and the content which the PID resolves to.

Scope: PID Information Types WG

Status: New

Metadata Attribute

Definition

Metadata Attribute is analogous to a data attribute.

Examples: how and when and by whom a particular set of data was collected, and how the data is formatted.

Scope: DFT Term Definition Prototype

Status: In discussion

Metadata Catalogue

Definition

A type of data catalog (catalogue) used to access information about data.

Explanation: Metadata is data about data (or a service). It is the result of documenting data. MD catalogues include mechanisms for storing and accessing descriptive metadata, and allow users to query the catalogue service, which stores descriptive information (metadata) about logical data items, for data items based on desired attributes. When a catalog has metadata in human-readable form, it has primarily been used as information to enable the manager or user to understand, compare and interchange the content of the described data set. See also Data Registry, Data Repository. In the Web Services context, XML-encoded (machine-readable and human-readable) metadata stored in catalogues and registries enables services to use those catalogues and registries to find data and services. A Metadata dataset (after ISO 19101) is the set of metadata describing a specific dataset.

Examples: ISO 19115: describes all aspects of geospatial metadata and provides a comprehensive set of metadata elements.

References: http://www.marbef.org/wiki/Metadata_and_metadata_catalogues

Scope: DFT Term Definition Prototype

Status: In discussion

Metadata Component

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Metadata Element

Definition

Metadata Element is a special role of a Data Element defined as attribute or category of description in a metadata set.

Explanation: Metadata Elements are often represented as an attribute-value pair (that is, an element = a "string-value"), but values may have additional structure (where element = structured-value). This latter, more complex view is the concept used in the Metadata Elements approach of the RDA MIG.
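
The two views of a metadata element can be sketched as follows. The `MetadataElement` class, the element names, and the sub-fields of the structured value are assumptions made for illustration; they are not taken from the RDA MIG material.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class MetadataElement:
    """One attribute (or category of description) and its value."""
    name: str
    value: Union[str, dict]

    def is_structured(self):
        """True when the value carries additional structure."""
        return isinstance(self.value, dict)

# Simple view: element = "string-value".
title = MetadataElement("title", "Ocean temperature 2016")

# Structured view: element = structured-value (sub-fields are invented here).
creator = MetadataElement("creator", {"name": "A. Researcher", "affiliation": "Example Institute"})

print(title.is_structured())    # False
print(creator.is_structured())  # True
```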

References: Presentations by RDA MIG at P6. Metadata Principles and Practicalities, D-Lib Magazine April 2002, Volume 8 Number 4.

Scope: RDA Metadata WG

Status: In discussion

Metadata Management

Definition

Metadata Management is an end-to-end process of administering metadata within a policy framework for creating, controlling, enhancing, attributing, defining and managing a metadata schema, model or other structured aggregation system, either independently or within a repository and the associated supporting processes.

Explanation: It includes the population of properties (e.g. creation date, access control, ownership) within a metadata schema, its organization in tables and use of reserved vocabularies such as defined in metadata standards.

References: After Meta-data management - Wikipedia, the free encyclopedia, https://en.wikipedia.org/wiki/Meta-data_management, and PP WG.

Scope: Practical Policy WG

Status: New

Metadata Object

Definition

A type of Data object with a metadata role of describing some targeted data or data collection.

Scope: DFT Term Definition Prototype

Status: In discussion

Metadata Profile

Definition

Metadata Profile is an organizing concept to define the core elements of metadata needed within and across different domains.

Explanation: Detail to be added by the MIG briefings.

Scope: RDA Metadata WG

Status: New

Metadata Record

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Metadata Registry

Definition

Metadata Registries provide a means to manage and disclose metadata schema declarations, application profile declarations, and value space declarations.

Explanation: Metadata registries support Metadata Registration (see also) but also its maintenance. Because a given metadata schema or application profile may evolve, MD registries are used to maintain relationships among a particular schema's various versions in order to promote semantic and machine interoperability over time.

References: Heery, Rachel and Manjula Patel, Application Profiles: Mixing and Matching Metadata Schemas, Ariadne, Issue 25 (September 2000) http://www.ariadne.ac.uk/issue25/app-profiles/intro.html

Scope: RDA Metadata WG

Status: New

Metadata Standards

Definition

Metadata Standards are standards used to describe research data and data collections from efforts including the humanities, physical, social, behavioral, and economic sciences.

Explanation: These standards may include a glossary of terms to facilitate the sharing of information about data and may be expressed in various formats including XML.

Examples: The DDI metadata specification is one example that supports the entire research data life cycle.

References: http://rd-alliance.github.io/metadata-directory/

Scope: RDA Metadata WG

Status: New

Metadata tag

Definition

In the context of information systems, a metadata tag is a kind of keyword or term assigned to a resource or a piece of information.

Explanation: Tagged metadata helps describe an item and allows it to be found again by browsing or searching.

Examples: Examples of tagged resources include Internet bookmarks, digital images, database records, and computer files.

References: https://en.wikipedia.org/wiki/Tag_(metadata)

Scope: Metadata Interest Group

Status: New

Metamodel

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Minimal Metadata

Definition

Minimal metadata is a description with very little curation that includes a name and PID of a data object.

Explanation: Minimal metadata is only marginally targeted at discovery since there is much better infrastructure to accomplish this.

References: After Peter Wittenburg, Tobias Weigel and Tim DiLauro.

Scope: RDA Term Collection Core

Status: New

Nanopublication

Definition

A nanopublication is a set of annotations that refer to the same statement and contain a minimum set of (community) agreed-upon annotations.

Explanation: Nanopublications are structured digital objects that associate a statement composed of one or more triples with its evidence/provenance and digital object metadata. Nanopublications have been proposed to make it easier to find, connect and curate core scientific statements and to determine their attribution, quality and provenance. Small RDF-based data snippets, i.e. nanopublications, rather than classical narrative articles should be at the center of general scholarly communication. In contrast to narrative articles, nanopublications support data sharing and mining, allow for fine-grained citation metrics on the level of individual claims, and give incentives for crowdsourced community efforts (from "Broadening the Scope of Nanopublications" by Kuhn et al. (2013); see http://www.tkuhn.ch/pub/kuhn2013eswc.pdf).

Examples:
@prefix swan: <http://swan.mindinformatics.org/ontologies/1.2/pav.owl> .
@prefix cw: <http://conceptwiki.org/index.php/Concept> .
@prefix swp: <http://www.w3.org/2004/03/trix/swp-1/> .
@prefix : <http://www.example.org/thisDocument#> .
:G1 = { cw:malaria cw:isTransmittedBy cw:mosquitoes }
:G2 = { :G1 swan:importedBy cw:TextExtractor, :G1 swan:createdOn "2009-09-03"^^xsd:date, :G1 swan:authoredBy cw:BobSmith }
:G3 = { :G2 ann:assertedBy cw:SomeOrganization }

References: P. Groth, A. Gibson, and J. Velterop. The anatomy of a nano-publication. Information Services and Use, 30(1):51-56, 2010. http://nanopub.org/guidelines/working_draft/ A Nanopublication Framework for Biological Networks using Cytoscape.js, http://ceur-ws.org/Vol-1327/icbo2014_paper_57.pdf

Status: New

OAI Repository

Definition

A type of repository with a network accessible server that can process the 6 OAI-PMH requests in the manner described in the OAI Implementation Guide.
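
For orientation, the six request types (verbs) of OAI-PMH 2.0 are Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and GetRecord. The sketch below only builds request URLs for them; the base URL and the `build_request` helper are hypothetical, and no network call is made.

```python
from urllib.parse import urlencode

# The six request types (verbs) defined by OAI-PMH 2.0.
OAI_PMH_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)

def build_request(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL for one of the six verbs."""
    if verb not in OAI_PMH_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    return base_url + "?" + urlencode({"verb": verb, **arguments})

# Hypothetical repository endpoint, used for illustration only.
url = build_request("https://repo.example.org/oai", "ListRecords", metadataPrefix="oai_dc")
print(url)  # https://repo.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```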

References: http://www.openarchives.org/OAI/2.0/guidelines-static-repository.htm

Scope: DFT Term Definition Prototype

Status: In discussion

Object

Definition

An Object is any part of the perceivable or conceivable world. A type of entity.

References: (ISO 1087)

Scope: DFT Term Definition Prototype

Status: In discussion

Object Attribute

Definition

In an object model, an object attribute is one of the logical attributes or properties associated with a particular object. In a data object, these are the associated properties.

Scope: DFT Term Definition Prototype

Status: In discussion

Object Model

Definition

An object model is a collection of descriptions of classes or interfaces, together with their member data, member functions, and class-static operations.

References: http://www.w3.org/TR/WD-DOM/glossary.html

Scope: DFT Term Definition Prototype

Status: In discussion

Object Property

Definition

The characteristics of any digital object can be described by a number of properties which are typically stored in metadata and/or PID records.

Scope: DFT Term Definition Prototype

Status: In discussion

Objective Metadata

Definition

Objective Metadata is based on assertions of fact about such things as authorship, date of creation, and version. Broadly, it includes attributes that can be assigned by what is considered an objective and reproducible (perhaps automated) process.

Explanation: In some instances objective metadata can be machine/sensor generated such as the "properties" metadata generated when creating a file in a word processor or spreadsheet application.

Examples: Date-time of an observation may be considered objective if done by a calibrated machine or defined human process. It may be subjective if done retrospectively. Other objective metadata "properties" may be generated as part of an automated data management process employing defined practical policies.

References: http://www.dlib.org/dlib/april02/weibel/04weibel.html

Scope: RDA Metadata WG

Status: New

Open Access

Definition

Open Access (to data) essentially means that online and stored digital information is free of charge, as well as free of most copyright and licensing restrictions that prevent its open and free use.

Explanation: Open access means unrestricted access to and use of scientific information and data. Open access exists to facilitate reuse and legal interoperability and is an important component of this process.

References: After "Funding models for Open Access Repositories" http://dri.ie/sites/default/files/files/funding-models-open-access-repositories.pdf Budapest Open Access Initiative (BOAI), 2002, "Read the Budapest Open Access Initiative" (web page), http://www.budapestopenaccessinitiative.org/. Bethesda Statement on Open Access Publishing, (Bethesda Statement), 2003 (web page), http://dash.harvard.edu/bitstream/handle/1/4725199/suber_bethesda.htm?sequence=1.

Scope: RDA Data Fabric Interest Group

Status: New

Open data

Definition

Open data is data available/visible to others and that can be freely used, re-used, shared, re-published and redistributed by anyone.

Explanation: Data openness is subject only, at most, to the requirement to attribute the source, and implies a willingness to, in turn, share the use of this data with others. The idea and value of transparency is sometimes used when discussing open data. Open data in pursuit of transparency should make it easier to access public data, and should encourage data publishers to release data in standardised, open formats, ingraining a "presumption to publish" unless there are clear, specific reasons (such as privacy) not to do so. Open data is part of the Open Science effort. The formats in which data appear can help to make them open. See also data transparency.

Examples: Data available from many government sites and funded efforts such as the Human Genome Project or Dataverse Networks, for example the Odum Institute Dataverse Network (http://arc.irss.unc.edu/dvn/) or the UC San Diego Dataverse Network (http://dataverse.ucsd.edu/dvn/).

References: The full Open Definition gives precise details as to what this means. See http://opendefinition.org/od/index.html

Scope: RDA Data Publishing Workflow Interest Group

Status: New

Operation

Definition

A (digital) operation is a type of process that applies a data manipulation function.

Explanation: Data operations involve the following attributes:
- EntityID: the identifier of the digital entity requesting invocation of the operation;
- TargetEntityID: the identifier of the digital entity to be operated upon;
- OperationID: the identifier that specifies the operation to be performed;
- Input: a sequence of bits containing the input to the operation, including any parameters, content or other information; and
- Output: a sequence of bits containing the output of the operation, including any content or other information.
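
The attributes above can be sketched as a small request structure plus a dispatcher. This is a hedged illustration only: the `OperationRequest` class, the `OPERATIONS` registry, and the two toy operations are invented here and are not defined by the Practical Policy WG.

```python
from dataclasses import dataclass

@dataclass
class OperationRequest:
    entity_id: str          # EntityID: who requests the operation
    target_entity_id: str   # TargetEntityID: what is operated upon
    operation_id: str       # OperationID: which operation to perform
    input_bits: bytes       # Input: parameters or content as a bit sequence

# Hypothetical registry of operations: OperationID -> function on bytes.
OPERATIONS = {
    "uppercase": lambda data: data.upper(),
    "reverse": lambda data: data[::-1],
}

def invoke(request):
    """Look up the requested operation and return its Output as bytes."""
    operation = OPERATIONS[request.operation_id]
    return operation(request.input_bits)

request = OperationRequest("entity/requester", "entity/target", "uppercase", b"abc")
output = invoke(request)
print(output)  # b'ABC'
```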

Examples: Authentication and authorization, modifying state information, registering, publishing, logging of information, storage duplication.

References: Provided by RDA Practical Policy WG

Scope: Practical Policy WG

Status: New

Original Repository

Definition

A type of Repository where the original copy of data was stored and probably a data identifier registered.

Scope: DFT Term Definition Prototype

Status: In discussion

Originator

Definition

Originator refers to the person or organization that is the source of data.

Explanation: Originators are responsible for providing data and information to providers.

Scope: Metadata Interest Group

Status: New

PID Attribute

Definition

A single data element related to a PID and part of its record content.

Scope: DFT Term Definition Prototype

Status: In discussion

PID Domain

Definition

For a single identifier, the class of entity it refers to. For a PID system, the typical class of entities it is intended to be used for.

Examples: APARSEN D22.3 offers: - PI for digital objects - PI for physical objects - PI for bodies - PI for actors

References: APARSEN WP22 Deliverable 22.3

Scope: DFT Term Definition Prototype

Status: In discussion

PID Record

Definition 1

A PID record is a type of record (and organization) that stores an instance of an executable/understandable PID. The content of a PID record distinguishes a registered digital or data object from other DOs.

References: A PID record has a lifecycle including creation, publication, curation and destruction.

Scope: DFT Term Definition Prototype

Status: In discussion

Definition 2

A PID record is a type of record that includes property information that characterizes the DO it is identifying.

Explanation: Important parts of a PID record are location and checksum. However, there is a large variation in usage. In some data models the PID is just used as a unique label with an empty record.

Scope: DFT Term Definition Prototype

Status: In discussion

PID Resolution

Definition

PID Resolution is the process of resolving a PID to useful state information about a DO by using a globally available system.

Explanation: This is an operation that links the identifier to the digital object.

References: Weigel et al., 2013. "A Framework for Extended Persistent Identification of Scientific Assets". http://dx.doi.org/10.2481/dsj.12-036

Scope: RDA DFT Interest Group

Status: In discussion

PID Service

Definition

A PID service is a service that provides a connection between a PID and its target object.

Explanation: In functional effect, the identifier is a proxy for the resolving operation/service.

References: Documentation Note by Tobias on PID. Discussion on collections contributed by Reagan Moore.

Scope: RDA DFT Interest Group

Status: In discussion

PID System

Definition

A PID System consists of at least one PID Resolver, a name schema and a defined mechanism for issuing PIDs that conform to the name schema.

Explanation: Sometimes called digital identifier systems. Uniform Resource Name (URN), for example, includes a namespace registration process.

Examples: The major PI systems are, in chronological order: Handle, 1994 (also DOI, which is an implementation of the Handle idea); Persistent URL (PURL), 1995; Uniform Resource Name (URN), 1997; Archival Resource Keys (ARK), 2001; Extensible Resource Identifier (XRI), 2005...

References: Hakala, J. Persistent identifiers - an overview. Accessed June 11, 2012.

Scope: RDA DFT Interest Group

Status: In discussion
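The three parts named in the PID System definition above (a resolver, a name schema, and a mechanism for issuing conforming PIDs) can be sketched in a few lines. This is a toy, in-memory illustration only: the "toy:" prefix, the class name and the record fields are assumptions for this example and do not correspond to Handle, DOI or any other real system.

```python
import re
import uuid


class ToyPidSystem:
    # Name schema (assumed for illustration): "toy:" plus 32 lowercase hex digits.
    NAME_SCHEMA = re.compile(r"^toy:[0-9a-f]{32}$")

    def __init__(self):
        self._records = {}  # PID -> PID record

    def issue(self, location: str, checksum: str) -> str:
        """Issuing mechanism: mint a PID that conforms to the name schema."""
        pid = f"toy:{uuid.uuid4().hex}"
        # The PID record stores location and checksum, the parts the
        # PID Record entry calls important.
        self._records[pid] = {"location": location, "checksum": checksum}
        return pid

    def resolve(self, pid: str) -> dict:
        """Resolver: map a PID to state information about the object."""
        if not self.NAME_SCHEMA.match(pid):
            raise ValueError(f"not a valid PID in this schema: {pid!r}")
        return self._records[pid]


system = ToyPidSystem()
pid = system.issue("https://example.org/data/1", "abc123")
record = system.resolve(pid)
```

Resolution here returns the whole record; a real resolver is globally available and typically redirects to the location or returns selected attributes.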

PID Type

Definition

A PID type (or PID Information Type) is a category of identifier that distinguishes different types of Digital Objects (see).

Explanation: Different types of objects, digital and non-digital, need different types of metadata records to describe them and give them context. ID information, as one type of metadata, also varies by data type and needs to conform to what the data type requires. PIDs and the records pointed to also need to handle activities on data, such as replication, so that such things as provenance information can be made available.

Examples: A publication date is of date type while an author is of string type, and simple identifiers can be used. A new composite material or a citation data type may need more complex metadata reflecting more properties, and a different type of ID may be needed.

References: PIT final report https://rd-alliance.org/pit-final-report-draft-20140920.html

Scope: PID Information Types WG

Status: New

Patent

Definition

A patent is a form of intellectual property.

Explanation: The World Intellectual Property Organization defines a patent as "an exclusive right granted for an invention, which is a product or a process that provides a new way of doing something, or offers a new technical solution to a problem". "A patent provides protection for the invention to the owner of a patent. The protection is granted for a limited period, generally 20 years." Patents are granted by a national (or regional) authority as the right to monopolize the commercialization of an invention, but they do not prohibit the exchange or distribution of knowledge on which the invention is based. Patents therefore should not hamper the access to research data, although they may impede certain commercial reuses of these data for a given time period.

References: http://www.wipo.int/edocs/pubdocs/en/patents/450/wipo_pub_l450pa.pdf

Scope: Legal Interoperability

Status: In discussion

Payload Metadata

Definition

Payload Metadata is all metadata NOT defined as Administrative Metadata.

Explanation: Payload metadata's content relates to the meaning or application of data/Digital Objects. Depending on policy, updates to Payload Metadata can trigger Provenance Metadata generation and a change in the version number of published Data such as versioned documents.

References: After DDI Technical documentation

Scope: RDA Metadata WG

Status: New

Persistent Identifier

Definition

A persistent identifier (PID) is a string (functioning as a symbol) that identifies a digital object. The identifier can be persistently resolved (digitally actionable) to meaningful metadata state information about the identified digital object.

Explanation: An identifier should have an unlimited lifetime, even if the existence of the identified entity ceases. This aspect of an identifier is called "persistency".

References: Paskin, N. (1999). Toward unique identifiers. In: Proceedings of the IEEE 87 (7) 1208-1227. Khedmatgozar, Hamid Reza, and Mehdi Alipour-Hafezi. "A Basic Comparative Framework for Evaluation of Digital Identifier Systems." Journal of Digital Information Management 13.3 (2015): 191.

Scope: RDA Term Collection Core

Status: In discussion

Physical coordinates

Definition

Within a space-time continuum of four-dimensions there are three spatial coordinates and one temporal coordinate, in which all physical quantities may be located.

Explanation: The range of spatial information of a dataset, which could include a spatial region like a bounding box or a named place. Reference systems are needed to understand the numerical values used, such as latitude and longitude.

Scope: Metadata Interest Group

Status: New

Policy

Definition

Specification of how a property of a digital entity will be controlled.

Explanation: A policy makes assertions that can be enforced about a system such as a data collection, or a workflow, or a dataflow. The policy details (and may control or guide) when and where assertions are enforced.

Examples: * Automated replication: on ingest of a file into a collection, a replica will be created at a separate location. * Periodic integrity validation: checksums for files will be periodically validated to verify integrity, with replacement of corrupted files from a valid replica. * Derived product creation: based on status flags associated with a digital entity, a procedure is invoked to create a derived data product, which is then stored in the collection.

References: Rajasekar, R., M. Wan, R. Moore, W. Schroeder, S.-Y. Chen, L. Gilbert, C.-Y. Hou, C. Lee, R. Marciano, P. Tooby, A. de Torcy, B. Zhu, "iRODS Primer: Integrated Rule-Oriented Data System", Morgan & Claypool, 2010.

Scope: Practical Policy WG

Status: In discussion
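The periodic integrity validation example in the Policy entry above can be sketched as follows. This is a minimal illustration, not iRODS code: the in-memory dictionaries standing in for a collection and its replica, the function names, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
from typing import Dict, List


def sha256(data: bytes) -> str:
    """Checksum used for validation (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def validate_and_repair(primary: Dict[str, bytes],
                        replica: Dict[str, bytes],
                        checksums: Dict[str, str]) -> List[str]:
    """Verify each file against its recorded checksum; replace corrupted
    files from the replica when the replica itself is valid."""
    repaired = []
    for name, expected in checksums.items():
        if sha256(primary[name]) == expected:
            continue  # file is intact
        if sha256(replica[name]) == expected:
            primary[name] = replica[name]  # restore from the valid replica
            repaired.append(name)
    return repaired


# Toy collection: file "b" has been corrupted in the primary copy.
replica = {"a": b"alpha", "b": b"beta"}
checksums = {name: sha256(data) for name, data in replica.items()}
primary = {"a": b"alpha", "b": b"corrupted"}
repaired = validate_and_repair(primary, replica, checksums)
```

A policy in the sense of the entry would additionally state when this check runs (for example, on a schedule) and on which collections.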

Policy Constraints

Definition

Policy constraints are values in a Policy Template that control the application of that policy.

Explanation: State information is needed to evaluate the constraint.

Examples: A constraint may limit action to one particular file or collection or one user, specify which metadata is to be registered etc.

References: Practical Policy briefing at RDA P6

Scope: Practical Policy WG

Status: New

Presentation Version

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Preservation

Definition

Preservation is the process of storing data and digital material, such as Digital Objects, in a Repository for a long period of time.

Scope: RDA Data Fabric Interest Group

Status: New

Procedure

Definition

Specification of the processing steps that are used to change digital entities or properties of digital entities.

Examples: * Replication procedure has the processing steps: selection of storage location, selection of physical file name, storage of a copy of a file, verification of integrity of the replica. * Distribution procedure has the processing steps: identification of type of data, selection of appropriate storage location, movement of the digital entity, verification of integrity. * Retention procedure has the processing steps: retrieval of the retention period, test whether the retention period has elapsed, invocation of a disposition policy. Computer algorithms may be used as a procedure or as one or more steps of a procedure.

References: Rajasekar, R., M. Wan, R. Moore, W. Schroeder, S.-Y. Chen, L. Gilbert, C.-Y. Hou, C. Lee, R. Marciano, P. Tooby, A. de Torcy, B. Zhu, "iRODS Primer: Integrated Rule-Oriented Data System", Morgan & Claypool, 2010.

Scope: DFT Term Definition Prototype

Status: In discussion
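The retention procedure from the Procedure entry above (retrieve the retention period, test whether it has elapsed, invoke a disposition policy) might look like the sketch below. The dictionary layout of the digital entity and the callback used as disposition policy are assumptions made for this example.

```python
from datetime import datetime, timedelta
from typing import Callable


def retention_procedure(entity: dict, now: datetime,
                        dispose: Callable[[dict], None]) -> bool:
    """Run the three processing steps and report whether disposition ran."""
    # Step 1: retrieval of the retention period.
    period = entity["retention_period"]
    # Step 2: test whether the retention period has elapsed.
    elapsed = now >= entity["created"] + period
    # Step 3: invocation of a disposition policy.
    if elapsed:
        dispose(entity)
    return elapsed


disposed = []  # records which entities the (hypothetical) policy handled
entity = {
    "name": "dataset-1",
    "created": datetime(2010, 1, 1),
    "retention_period": timedelta(days=365),
}
still_retained = retention_procedure(entity, datetime(2010, 6, 1), disposed.append)
now_expired = retention_procedure(entity, datetime(2017, 9, 5), disposed.append)
```

Passing the disposition policy in as a function keeps the procedure (the steps) separate from the policy (what happens on expiry), matching the distinction the two entries draw.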

Processing Workflow

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Project

Definition

A project is an organized effort, either by individual or collaborative enterprise, that is carefully planned and designed to achieve a particular aim within a particular timeframe.

Scope: Metadata Interest Group

Status: New

Property

Definition

A Property is the smallest, atomic part of metadata written and read by the PIT API. It is defined in the Type registry. It consists of the elements: * identifier: a PID * name: the name of the property, for example "license", "checksum" * range: data type, for example String, Date, ... * namespace: the content related scope of the property. Could be a project like RDA, EUDAT * description: a human readable description in English

Explanation: See Property Features

References: PID Information Type work

Scope: PID Information Types WG

Status: New
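The five elements listed in the Property entry above map naturally onto a small record type. The sketch below is illustrative only and is not the PIT API: the class, the use of Python types for the range, and the identifier value are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    identifier: str   # a PID (the value used below is hypothetical)
    name: str         # e.g. "license", "checksum"
    range: type       # data type, e.g. str or a date type
    namespace: str    # content-related scope, e.g. "RDA", "EUDAT"
    description: str  # human-readable description in English

    def accepts(self, value) -> bool:
        """Check that a value conforms to the property's range."""
        return isinstance(value, self.range)


checksum = Property(
    identifier="hypothetical/0001",
    name="checksum",
    range=str,
    namespace="RDA",
    description="Checksum of the object's bit stream.",
)
ok = checksum.accepts("d41d8cd98f00b204e9800998ecf8427e")
bad = checksum.accepts(12345)
```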

Property Features

Definition

Desirable Property Features to ensure data integrity, authenticity and proper access control include the data qualities of completeness, correctness and consistency, as well as having been validated and approved by consensus.

Explanation: See data quality.

Scope: Practical Policy WG

Status: New

Property Record

Definition

A Property Record is a structure that contains property descriptions.

References: Peter Wittenburg

Scope: DFT Term Definition Prototype

Status: In discussion

Proposition

Definition

An expression in language or signs that some abstraction corresponds to some aspect or configuration of an entity or group of entities.

Examples: The sky is blue (entity: sky, abstraction: blue, aspect: has color); the mass of the sample is 200 g (entity: sample, abstraction: 200 g, aspect: mass).

References: http://www.merriam-webster.com/dictionary/proposition, meaning 2a; Sowa, 2000, p. 501; usage of abstract and entity per Sowa.

Scope: DFT Term Definition Prototype

Status: In discussion

Protocols

Definition

AKA System Protocols. Definition (first attempt): A protocol is the special set of rules that regulates how components within a system interact (or can interact).

Explanation: Elaboration 1: Protocols are crucial parts of interface specifications. They specify not only message content but also procedural aspects.

Scope: RDA Data Fabric Interest Group

Status: New

Provenance

Definition

Provenance is a type of historical information or metadata about the origin, location or the source of something, or the history of the ownership or location of an object or resource including digital objects.

Explanation: Provenance captures the meaningful history of an object, where it originated and how it developed from its early/raw existence. As part of Archiving, archivists often have to look back to pre-existing processes to trace provenance.

Examples: Examples of Provenance Information are the principal investigator who recorded the data, and the information concerning its storage, handling, and migration.

References: Partially based on the NCI Thesaurus.

Scope: RDA Data Fabric Interest Group

Status: In discussion

Provenance metadata

Definition 1

Provenance metadata is metadata concerning the creation, attribution, or version history of managed data.

Explanation: Provenance information is gathered along the data lifecycle as part of curation processes. Some make the distinction between a coarse-level or workflow view of provenance for data within the overall lifecycle, as defined above, and a finer level that is concerned only with data flowing between various stores such as curated DBs and managed repositories.

Examples:

Scope: RDA Term Collection Core

Status: New

Definition 2

Provenance metadata is metadata that indicates the relationship between two versions of data objects and is generated whenever a new version of a dataset is created.

Explanation: This metadata is designed to allow queries over the relationship between versions, and includes either or both fine-grained and coarse-grained provenance data. Different applications may store different provenance data.

Examples: Examples include: (i) the name of the program that generated the new version, (ii) the commit id of the program in a code version control system like GitHub, (iii) the identifiers of any other datasets or data objects that may have been used in creating the new version.

References: After DataHub: Collaborative Data Science & Dataset Version Management at Scale. http://arxiv.org/pdf/1409.0798v1.pdf

Scope: RDA Metadata WG

Status: New
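Definition 2 of Provenance metadata above describes records that link two versions and can be queried. The sketch below shows one possible shape for such a record, using the three example fields from the entry, plus a simple lineage query. The class, field names and version labels are assumptions for this example and are not taken from DataHub.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class VersionProvenance:
    new_version: str             # identifier of the newly created version
    previous_version: str        # identifier of the version it derives from
    program: str                 # (i) name of the program that generated it
    commit_id: str               # (ii) commit id of that program
    inputs: Tuple[str, ...] = ()  # (iii) other datasets used


def lineage(version: str, records: Iterable[VersionProvenance]) -> List[str]:
    """Follow previous_version links back to the earliest known version."""
    by_new = {record.new_version: record for record in records}
    chain = [version]
    while chain[-1] in by_new:
        chain.append(by_new[chain[-1]].previous_version)
    return chain


# Hypothetical version history: v1 -> v2 -> v3.
records = [
    VersionProvenance("v2", "v1", "clean.py", "a1b2c3"),
    VersionProvenance("v3", "v2", "merge.py", "d4e5f6", inputs=("stations-v7",)),
]
history = lineage("v3", records)
```

The query assumes the version links contain no cycles, which holds when each record is created only at the moment a new version is generated.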

Publication

Definition

The process of making a digital object (re)usable by others.

Explanation: Once data is collected it can be exposed in a way that is then (re)usable by others. Data publication is a process which involves the exposure of this data. Publication could include aspects of "traditional" publication channels but need not be limited to this. To be effective, published (research) data has documentation requirements beyond a pointer or link. To be useful, some documentation of provenance, quality, credit, attribution and methods is desired to provide the reproducibility that enables validation of results in a publication. Prior research is identified in Publication with a citable reference. Report summaries cite the data reference and the original publication.

References: Bechhofer, Sean, et al. "Why linked data is not enough for scientists." Future Generation Computer Systems 29.2 (2013): 599-611.

Scope: RDA Term Collection Core

Status: New

Publisher

Definition

An active entity responsible for making a resource with information content such as a publication available.

Scope: DFT Term Definition Prototype

Status: In discussion

Quality

Definition

A quality is some aspect of an Entity (but not a part of it), which cannot exist without, or independently of, that Entity.

Explanation: Object qualities can be given measured values within particular dimensions such as an acidity dimension or an electromagnetic frequency dimension. As physical things they can have specific spatial-temporal coordinates.

Examples: An example is the way the surface of a specific Physical Object like a rock or planet looks, or the specific weight of that object.

References: https://www.w3.org/2005/Incubator/ssn/wiki/DUL_ssn

Scope: Metadata Interest Group

Status: New

Query ID

Definition

A Query ID is identification metadata attached to a query to give it a unique identity.

Explanation: Queries are assigned a new ID if either the query is newly created or if the result set returned from an earlier identical query is different due to changes in the data. Persistent IDs (PIDs) are preferred.

Scope: Data Citation WG

Status: New

Query Store

Definition

A Query Store is a type of storage infrastructure used to store queries used to select data and associated metadata.

Explanation: Queries can be treated and stored like digital objects.

References: https://rd-alliance.org/system/files/documents/RDA-DC-Recommendations_150609.pdf

Scope: Data Citation WG

Status: New

Query Timestamping

Definition

Query Timestamping is a type of Timestamping applied to a query based on the last update to the entire database (or the last update to the selection of data affected by the query or the query execution time).

Explanation: This allows retrieving the data as it existed at query time. See Provenance Metadata.

References: https://rd-alliance.org/system/files/documents/RDA-DC-Recommendations_150609.pdf

Scope: Data Citation WG

Status: New
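The Query ID, Query Store and Query Timestamping entries above fit together as follows: a stored query keeps its ID as long as an identical query returns the same result set, and receives a new ID otherwise. The sketch below illustrates that rule only. The class, the "query:N" identifiers (PIDs would be preferred in practice), the whitespace normalization and the hashing of results are assumptions made for this example.

```python
import hashlib
from typing import Dict, Iterable, Tuple


class QueryStore:
    def __init__(self):
        # query_id -> (query text, result hash, timestamp)
        self._entries: Dict[str, Tuple[str, str, str]] = {}
        # normalized query text -> most recent query_id
        self._latest: Dict[str, str] = {}

    def register(self, query: str, results: Iterable[tuple], timestamp: str) -> str:
        """Return an existing Query ID when query and result set are
        unchanged; otherwise store the query under a new ID."""
        text = " ".join(query.split())  # normalize whitespace only
        result_hash = hashlib.sha256(repr(sorted(results)).encode()).hexdigest()
        previous = self._latest.get(text)
        if previous is not None and self._entries[previous][1] == result_hash:
            return previous
        query_id = f"query:{len(self._entries) + 1}"  # placeholder, not a PID
        self._entries[query_id] = (text, result_hash, timestamp)
        self._latest[text] = query_id
        return query_id

    def timestamp_of(self, query_id: str) -> str:
        return self._entries[query_id][2]


store = QueryStore()
first = store.register("SELECT * FROM t", [(1,), (2,)], "2017-09-05T16:44:37")
same = store.register("SELECT *  FROM t", [(2,), (1,)], "2017-09-06T09:00:00")
changed = store.register("SELECT * FROM t", [(1,), (2,), (3,)], "2017-09-07T09:00:00")
```

Sorting the result set before hashing makes the comparison independent of row order, which is one possible design choice rather than a requirement stated in the entries.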

Raw Data

Definition

Raw Data is data in its original, acquired, direct form from its source before subsequent processing.

Explanation: See Data Lifecycle for context for raw data. Also called primary, source or atomic data. Although raw data has the potential to become "information," it requires selective processing such as registration, extraction, organization, documentation etc.

Examples: Any data as encoded at a source, including human-generated raw data.

References: http://searchdatamanagement.techtarget.com/definition/raw-data

Real-Time Data

Definition 1

Real-time data, often referred to as RTD, is data that updates on its own schedule, so that it is delivered immediately after collection. There is no delay in the timeliness of the information provided. (Prototype Wiki)

Examples: For example, stock quotes, manufacturing statistics, Web server loads, data warehouse activity and sensor feeds to data collectors. Real-time data is often used for navigation or tracking.

References: http://msdn.microsoft.com/en-us/library/office/aa140060(v=office.10).aspx#odc_xlrtdfaq_whatisrtd Wade, T. and Sommer, S. eds. A to Z GIS in http://en.wikipedia.org/wiki/Real-time_data

Scope: RDA Term Collection Core

Status: In discussion

Definition 2

Real-Time Data is data being received, processed and stored at the time of its occurrence with only small delays. (Peters Document)

Explanation: Real-Time Data are data streams that are typically generated by sensors and received via direct networking connections. Real-Time Data can also be generated by many users that interact with a database system and expect immediate actions. One characteristic of real-time data can be that it is not well defined how to delimit identifiable units that can be referred to.

Scope: RDA Term Collection Core

Status: In discussion

Record

Definition

A record is short for a data record which is the data structure that consists of several, uniquely named components known as data elements.

Explanation: A data record represents information created, received and maintained as evidence along with metadata information provided in pursuance of legal obligations or standard research operations.

Examples: A property record and a transaction record are examples of data record types. Other examples are a document, a signature, a seal, text, images, sound, speech, or data compiled, recorded, or stored, as the case may be.

Scope: RDA Data Fabric Interest Group

Status: In discussion

Record provenance information

Definition

The process that associates provenance information with a digital object

Explanation: Record provenance information for a data object. Such provenance information includes: * the person who deposited the data object in the repository, * the source of the data object, * the date when the object was deposited, and * authenticity information needed to link the data object to its original source.

References: RDA PP WG

Scope: Practical Policy WG

Records Management

Definition

Records management is a comprehensive process involving policies, procedures, systems, and behaviors that ensures that appropriate attention and protection is given to all data records, and that the evidence and information the records contain can be retrieved more efficiently and effectively, using standard practices and procedures.

Explanation: Together the record management process is intended to ensure that reliable evidence of actions and decisions is kept and remains available for reference and use when needed. As part of research the community benefits from effective management of its key data assets stored as records. See also Record. This term is related to Archiving and the related idea of Archive Management.

References: After the ISO INFORMATION AND DOCUMENTATION - RECORDS MANAGEMENT. http://www.taoiseach.gov.ie/attached_files/Pdf%20files/30%20ISO.15489-1%20-%20IRISH%20VERSION.pdf

Scope: RDA Data Fabric Interest Group

Status: New

Referable data

Definition

A type of data (digital or not) that is persistently stored and which is referred to by a persistent identifier.

Explanation: Digital data may be accessed by the identifier. Some data object references may access a service on the object (OAI-OR

References: DFT WG file repository: 10 Category DFT working defintions.docx

Scope: RDA Term Collection Core

Status: In discussion

Reference Resolution

Definition

Reference Resolution is the process of resolving a reference to useful information by using a globally available system.

References: Weigel et al., 2013. "A Framework for Extended Persistent Identification of Scientific Assets". http://dx.doi.org/10.2481/dsj.12-036

Scope: DFT Term Definition Prototype

Status: In discussion

Reference data

Definition

Reference data, in the context of research data management, are domain and community standardized data objects that define the set of permissible values to be used to populate other data objects.

Examples: Examples include code values consisting of sets of values, statuses, controlled vocabularies, taxonomies or classification schemas. Other examples of reference data are: * Units of measure * Country codes * Researcher codes * Fixed conversion rates (e.g., weight, temperature, and length) * Geonames

References: https://en.wikipedia.org/wiki/Reference_data

Scope: DFT Term Definition Prototype

Status: New
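The role of reference data described in the entry above, defining the permissible values used to populate other data objects, can be illustrated with a simple check. The tiny country-code set and the record layout below are assumptions for this example; a real system would load a maintained code list.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical excerpt of a country-code reference list.
COUNTRY_CODES: FrozenSet[str] = frozenset({"CA", "DE", "FR", "US"})


def invalid_values(records: Iterable[Dict[str, str]], field: str,
                   permitted: FrozenSet[str]) -> List[str]:
    """Return the values of `field` that are not in the reference data."""
    return [record[field] for record in records if record[field] not in permitted]


records = [
    {"station": "S1", "country": "DE"},
    {"station": "S2", "country": "Germany"},  # not a permitted code
    {"station": "S3", "country": "CA"},
]
rejected = invalid_values(records, "country", COUNTRY_CODES)
```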

Reference model

Definition

Definition (first attempt): A reference model is a design covering a class of frameworks with the following characteristics: (1) it can be used to generate more specific models that still belong to the class and (2) it can be used to compare a concrete framework design to identify whether it belongs to the same class.

Explanation: Elaboration1: a generic model of a DF certainly should have the potential to be viewed as a reference model to which all concrete instantiations and specializations belong as derivatives.

Scope: RDA Data Fabric Interest Group

Status: New

Register Manager

Definition

A registry manager is a resource (person/organization) responsible for the day-to-day management of a registry.

Explanation: A register manager may engage a third-party service provider to perform this service. A registry manager ensures the integrity of any register held in the registry, and provides means for electronic access to the registry for register managers, control bodies, and register users. There is a similar role of Repository Manager.

Examples: The US National Geospatial-Intelligence Agency (NGA) serves as a registry manager for the Defence Geospatial Information Working Group (DGIWG), providing electronic access to the registers controlled by the DGIWG.

References: ISO 19135 spec. https://www.dgiwg.org/Terminology/faq-other.php#faq19135-10

Scope: RDA Data Fabric Interest Group

Status: New

Register Metadata

Definition

The process for recording information about a digital object

Explanation: A metadata registry may be used as part of the registration process to help provide "consistent definitions of data across time, between databases, between organizations or between processes." The following minimal metadata is suggested by Practical Policy as part of the registration of metadata: * Attribute_name * Attribute_value * Attribute_unit or comment * Digital object to which the metadata is applied * Digital collection that holds the digital object * Metadata_creation_time * Metadata_modification_time

References: Practical Policy work and https://en.wikipedia.org/wiki/Metadata_registry

Scope: Practical Policy WG

Status: New

Registered Data

Definition

Registered Data is data that has gone through a registration process and as part of this has an identifier and usually metadata to aid in its search and retrieval.

Scope: DFT Term Definition Prototype

Status: In discussion

Registered digital data

Definition

Registered digital data is a type of digital data that has undergone a registration process and thus has some type of ID.

Explanation: Much digital data is legacy data and may not have been registered in a repository or in some way identified to help search and retrieval.

Examples: Data in repositories and managed portals such as https://www.nasa.gov/open/researchaccess/nasa-data-portal

Scope: RDA DFT Interest Group

Status: New

Registration

Definition

Registration or data registration is a curation process on a data object by which the DO receives a persistent object identifier (PID) from a trusted registration authority.

Explanation: Registration should be accompanied by uploading the data object to a persistent repository.

Registry

Definition 1

In the context of DFIG, where we discuss registries of trusted repositories, a registry is a database containing information about trusted repositories that is provided by the repository managers and is useful for human and machine users.

Explanation: These registries do not contain information about all metadata descriptions of DOs, nor do they offer a list of PIDs of all DOs stored. However, they offer information, based on standardized types, on how to retrieve such information, such as the port under which OAI-PMH can be accessed to offer metadata. The above description is not a real definition yet; however, it overlaps with the basic ISO 19135 definition, which is at a more abstract level.

Scope: RDA Data Fabric Interest Group

Status: New

Definition 2

ISO 19135: registry: information system on which a register is maintained.

Explanation: ISO 19135: register: set of files containing identifiers assigned to items with descriptions of the associated items. ISO 19135: registration: assignment of a permanent, unique and unambiguous identifier to an item.

Related Data

Definition

In the metadata context related data is some data connected to particular data for a variety of reasons including thematic, topic, geospatial or logical similarity, context, history, acquisition method, publication etc.

Explanation: Related data may be part of the same collection or by the same collector or something useful for data research and identified to support access.

Scope: Metadata Interest Group

Status: New

Relations

Definition

Definition (first attempt): In the context of DFIG the term relation was used to indicate how the different components within a system are "linked" to fulfill the tasks. "Relations" are thus defined by the services they are making use of and by the interface specifications.

Scope: RDA Data Fabric Interest Group

Status: New

Replica number

Definition

A type of metadata used as part of a replication process or access.

Scope: RDA Term Collection Core

Status: New

Replication

Definition

Generate a copy of a data object that is referenced by the same name, but with a different replica number. When changes are made to the data object, the replica can be updated to track the changes.

Explanation: AKA data replication or duplication. Replication is guided by a replication policy. The act of replication generates a replica, a copy of a digital object with the same logical name. The replica may have different administrative information such as replica number, replica location, creation date. All replicas of a digital object are associated with the same PID. Thus any of the replicas may be returned as a valid digital object for the requested PID. Digital objects that are copied into multiple data management systems can be considered replicas if a mechanism is available to update all copies when changes are made to a digital object. The concept of replication implies automated consistency guarantees across all replicas. A PID should allow copies of a digital object from different communities to be identified as such. If there is no guarantee of consistency on changes to the digital object, the copies should be given separate PIDs. Related terms: replica number, PID, repository.

References: RDA PP WG, Peter's replication scenario.

Scope: Practical Policy WG

Status: New
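The Replication entry above states that all replicas share one logical name and PID, differ in administrative information such as replica number and location, and are kept consistent when the object changes. A minimal in-memory sketch of those three properties follows. The class, the numbering from 0 and the PID value are assumptions made for this example.

```python
class ReplicatedObject:
    def __init__(self, pid: str, content: bytes, location: str):
        self.pid = pid  # one PID shared by all replicas
        # replica number -> administrative information plus content
        self.replicas = {0: {"location": location, "content": content}}

    def replicate(self, location: str) -> int:
        """Create a copy under the same PID with a new replica number."""
        number = max(self.replicas) + 1
        self.replicas[number] = {
            "location": location,
            "content": self.replicas[0]["content"],
        }
        return number

    def update(self, content: bytes) -> None:
        """Propagate a change to every replica (the consistency guarantee)."""
        for replica in self.replicas.values():
            replica["content"] = content

    def resolve(self, replica_number: int = 0) -> bytes:
        """Any replica is a valid answer for the PID."""
        return self.replicas[replica_number]["content"]


obj = ReplicatedObject("hypothetical:0001", b"v1", "site-a")
number = obj.replicate("site-b")
obj.update(b"v2")
```

If the update step could not be guaranteed, the entry's rule would apply instead: the copies should be given separate PIDs rather than replica numbers.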

Repository

Definition 1

Repository (aka Data Repository or Digital Data Repository) is a searchable and queryable interfacing entity that is able to store, manage, maintain and curate Data/Digital Objects.

Explanation: A data repository returns data sets with appropriate features and/or the bit stream/dynamic data object instantiating a data/digital object if a persistent identifier is being issued. A repository should have a globally unique identifier that refers to it and a URL allowing access to the repository. Repositories store data and can also store its associated metadata. Some repositories may be specialized to store metadata. New Collections (aggregations) are, or can be, built from repository data for analysis purposes. New PIDs are required for such collections.

Examples: There are many types of data repositories, such as institutional and domain repositories. One example is the Gene Expression Omnibus (GEO), an open data repository which provides access to microarray, next-generation sequencing, and other forms of functional genomic data submitted by the scientific community. Another is the Global Change Master Directory, maintained by the Earth Sciences Directorate at the National Aeronautics and Space Administration (NASA), which provides access to more than 25,000 earth and environmental science data sets relevant to global change and Earth science research.

References: Vocabulary for the Registration and Description of Research Data Repositories. http://gfzpublic.gfz-potsdam.de/pubman/item/escidoc:76875/component/escidoc:76874/re3data_vocabulary_v2-0.pdf https://library.uoregon.edu/datamanagement/repositories.html#three

Scope: RDA Term Collection Core

Status: New

Definition 2

Repository is a managed location (destination, directory or "bucket") where digital data objects are registered, permanently stored, made accessible and retrievable, and curated.

Repository Registry

Definition

A Repository Registry is a type of registry that collects useful information about repositories for human consumption, so that depositors and users can easily find where to go for their data needs.

Examples: re3data is an example.

References: https://rd-alliance.org/group/data-fabric-ig/wiki/data-fabric-ig-repository-registries.html

Scope: RDA Data Fabric Interest Group

Status: New

Representation

Definition

A resource that conveys either the content of a resource (if it is a digital object instance), or provides a digital object that conveys the intention of the resource in a useful form for some user (machine or human...).

Scope: DFT Term Definition Prototype

Status: In discussion

Representation object

Definition

A Representation Object (or Representation Information) is information that maps a Data Object into more meaningful concepts.

Explanation: A representation object provides some context for a data object. It contains provenance, description (e.g. format, encoding scheme, algorithm; Brown, 2008), structural, and administrative information about the object. This is a form of metadata and is sometimes managed as part of Administrative Metadata efforts.

Examples: An example is the ASCII definition that describes how a sequence of bits (i.e., a Data Object) is mapped into a symbol.

References: See: http://www.dcc.ac.uk/news/representation-information-what-it-and-why-it-important#sthash.5jiiqzXF.dpuf; DFT WG file repository: 10 Category DFT working defintions.docx; Brown, A. (2008). White paper: Representation information registries. Retrieved June 19, 2009, from http://www.planets-project.eu/docs/reports/Planets_PC3-D7_RepInformationRegistries.pdf

Scope: RDA Term Collection Core

Status: In discussion

Research Data

Definition

Research data is data collected, observed, or created, usually in digital form, for purposes of analysis to produce original research information and results.

Explanation: "Research data is defined as recorded factual material commonly retained by and accepted in the scientific community as necessary to validate research findings; although the majority of such data is created in digital format, all research data is included irrespective of the format in which it is created." (Engineering and Physical Sciences Research Council, EPSRC)

Examples: Scientific data is an important sub-type of research data.

References: See http://www2.le.ac.uk/services/research-data/rdm/what-is-rdm/research-data

Scope: RDA DFT Interest Group

Status: New

Research Object

Definition

A Research Object (RO) or Research Data Object is or provides a container for a principled aggregation of resources, produced and consumed by common services and shareable within and across organisational boundaries.

Explanation: Research Objects are semantically rich aggregations of resources that provide the "units of knowledge". An RO bundles together essential information relating to experiments and investigations. This includes not only the data used, and methods employed to produce and analyse that data, but also the people involved in the investigation. An RO may be thought of as an archive for the variety of digital objects that are produced during the course of a scientific investigation. Such objects would likely contain (or link to) not only data, software, and documents, but their provenance metadata as well.

Examples: A package is a basic aggregation of resources that can be annotated or shared. Such a package may be assembled to support more complex forms of reuse, for example, to rerun an investigation with new data, or to validate that the results being presented are indeed the results expected.

References: Bechhofer, S., De Roure, D., Gamble, M., Goble, C., & Buchan, I. (2010). Research objects: Towards exchange and reuse of digital knowledge. Bechhofer, Sean, et al. "Why linked data is not enough for scientists." Future Generation Computer Systems 29.2 (2013): 599-611.

Scope: RDA Data Publishing Workflow Interest Group

Status: In discussion

Research Stakeholder

Definition

Research Stakeholder: Groups and people involved in research including the roles of producers, users, funders and policymakers.

References: http://www.dcc.ac.uk/how-discover-requirements

Scope: RDA Term Collection Core

Status: New

Researcher operations

Definition

Researcher operations are user-initiated tasks to control, manipulate, and process (analyze) a researcher's data sets.

Examples: Analysis is an example of a researcher operation.

References: RDA PP WG

Scope: Practical Policy WG

Resource

Definition

Also known as Web resource. In the context of the web, resources are addressable units of information that are addressed through Uniform Resource Identifiers (URIs), making them identified, described, and discoverable components of the Web's distributed data environment.

Explanation: This definition is consistent with its use in the general Web community as "anything that has an identity".

Examples: Just about anything can be a resource: it can be an abstract idea, such as sustainability or a coding system. Alternatively it can be fairly concrete, like a physical object, an organisation, a contact person or a data collection. Typical examples: a data collection, an archive or repository, an on-line database, an organization of people (university, lab, agency, institute, or research project), a web site, an on-line analysis tool.

References: Berners-Lee 1998, IETF RFC2396

Scope: RDA DFT Interest Group

Status: In discussion

Resource Description Framework

Definition

Resource Description Framework, or RDF for short, is a globally accepted framework for representing data and knowledge in a simple graph-based form with the intention of making it readable and somewhat interpretable by data processing systems in support of data interchange on the Web.

Explanation: RDF is a family of World Wide Web Consortium (W3C) specifications, including RDF Schema, originally designed as a hybrid metadata & data model.

Examples: An RDF triple encodes a statement as a subject-predicate-object relation in a web-processable form: ex:patient319 v:fullName "Mary Higgs"
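
The triple in the example above can be sketched in a few lines of Python. This is only an illustration using the standard library: the namespace URIs bound to the prefixes ex and v are assumptions (the source does not give them), and the serializer does not escape special characters in literals.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # prefixed name, e.g. "ex:patient319"
    predicate: str  # prefixed name, e.g. "v:fullName"
    literal: str    # plain string literal in this sketch


# Hypothetical namespace bindings; the glossary does not define them.
PREFIXES = {
    "ex": "http://example.org/",
    "v": "http://example.org/vocab#",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'ex:patient319' to a full URI."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriples(triple: Triple) -> str:
    """Serialize the triple as one N-Triples-style line (no escaping)."""
    return (
        f"<{expand(triple.subject)}> "
        f"<{expand(triple.predicate)}> "
        f'"{triple.literal}" .'
    )


triple = Triple("ex:patient319", "v:fullName", "Mary Higgs")
line = to_ntriples(triple)
print(line)
```

A real application would normally use an RDF library rather than hand-written serialization; the point here is only the subject-predicate-object shape.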

References: https://www.w3.org/RDF/

Scope: RDA DFT Interest Group

Status: New

Resource Destination

Definition

Resource Destination - A system that retrieves resources from a Source in order to synchronize itself with that Source.

References: Klein et al. (2013). A Technical Framework for Resource Synchronization

Scope: RDA Term Collection Core

Status: New

Resource Source

Definition

Resource Source - A server that hosts resources to be synchronized.

References: Klein et al. (2013). A Technical Framework for Resource Synchronization

Scope: RDA Term Collection Core

Status: New

Reusable Data

Definition

Reusable data, according to the FAIR principles, are (meta)data that are richly described with a plurality of accurate and relevant attributes, released with a clear and accessible data usage license, associated with detailed provenance, and compliant with domain-relevant community standards.

References: The FAIR Guiding Principles for scientific data management and stewardship, http://www.nature.com/articles/sdata201618#bx2

Scope: RDA Data Fabric Interest Group

Status: New

Rich Metadata

Definition

Rich Metadata describes data with enough accurate and relevant attributes to make it easily findable.

References: The FAIR Guiding Principles for scientific data management and stewardship, http://www.nature.com/articles/sdata201618#bx2

Scope: RDA Metadata WG

Status: New

Right Holder

Definition

A Right Holder (or Rights Holder) refers to a legal agent/organization which holds a degree of ownership of intellectual property recognized by a legal system. This may include ownership of protected copyright, trademark or patent, and the related rights of performers, producers and broadcasters.

Explanation: A right holder may license part or all of an owned protected work through international legal and licensing provisions.

Scope: Legal Interoperability

Status: New

Schema

Definition

A schema is a type of structure or blueprint that defines how data is organized and how the relations among parts or elements of the data are associated. Schemas may be abstract and represent a logical, constraining view of the entire database or dataset.

Explanation: Relational DB schemas use tables, rows and columns as organizing concepts, or their logical equivalents of entity relations (ER). XML Schemas are another type, written not in ER form but in a language suitable for expressing constraints about XML documents.

Examples: Many examples of XML schemas exist. OGC's schemas are typical. The XML encoding requirements of OGC GWML2.0, as specified in the requirements class, are one example. See: http://schemas.opengis.net/gwml/2.2/gwml2-aquifertest.sch and http://schemas.opengis.net/gwml/2.2/gwml2-aquifertest.xsd

References: https://en.wikipedia.org/wiki/Database_schema
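
As a rough illustration of a relational schema acting as a constraining blueprint, the sketch below uses Python's built-in sqlite3 module. The table and column names (dataset, data_file) are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces relations only on request

# The schema: two tables and the relation between them.
conn.executescript(
    """
    CREATE TABLE dataset (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE data_file (
        id         INTEGER PRIMARY KEY,
        dataset_id INTEGER NOT NULL REFERENCES dataset(id),
        name       TEXT NOT NULL
    );
    """
)

conn.execute("INSERT INTO dataset (id, title) VALUES (1, 'Aquifer tests')")
conn.execute("INSERT INTO data_file (dataset_id, name) VALUES (1, 'run1.csv')")

# A row that violates the schema (it points at a dataset that does not exist)
# is rejected by the database.
try:
    conn.execute("INSERT INTO data_file (dataset_id, name) VALUES (99, 'orphan.csv')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

file_count = conn.execute("SELECT COUNT(*) FROM data_file").fetchone()[0]
print(orphan_rejected, file_count)
```

The schema describes both the organization of the data (tables and columns) and the relation among its parts (each file belongs to a dataset).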

Scope: Metadata Interest Group

Status: New

Semantic Interoperability

Definition

Semantic Interoperability is usually defined as the ability of services and systems to exchange data in a meaningful/useful way.

Explanation: Semantic Interoperability is a stronger type of data exchange than typical Interoperability because it includes some knowledge of the meaning of the data content, system structure and operation, usage constraints, and the underlying assumptions. From a systems perspective, semantic interoperability can be 'defined as the enablement of software systems to interoperate at a level in which the exchange of information is at the enterprise level. This means each system (or object of a system) can map from its own conceptual model to the conceptual model of other systems, thereby ensuring that the meaning of their information is transmitted, accepted, understood, and used across the enterprise.' Knowledge of systems and data can be based on interpretable rules as well as semantic representation of meaning in metadata.

Examples: Databases may contain a text definition of a road, as "An open way maintained for vehicular use" (DIGEST, 2000). Most topographic databases know this general feature class "Road", but at lower level, sub classes or categorization of roads, many differences occur. By defining the semantic meaning of different kind of roads in an ontology, a reasoner will be able to find relationships and equivalences between different categorizations of roads. Quoted from Aerts, Koen, Karel Maesen, and Anton Van Rompaey. "A practical example of semantic interoperability of large-scale topographic databases using semantic web technologies." Proceedings of the AGILE. Vol. 6. 2006.

References: F. Harvey, W. Kuhn, H. Pundt, Y. Bisher, C. Riedemann. Semantic Interoperability A Central Issue for Sharing Geographic Information. in The Annals of Regional Science, Vol. 33 (1999), pp. 213-232. Goodchild, M.F., Egenhofer, M.J., Fegeas, R., and Kottman, C.A. (eds.) Interoperating Geographic Information Systems. New York, Kluwer, 1999. Obrst, L., G. Whittaker, A. Meng. 1999. Semantic Context for Object Exchange, AAAI Workshop on Context in AI Applications, Orlando, FL, July 19, 1999. Hitzler, P., Janowicz, K., Berg-Cross, G., Obrst, L., Sheth, A. P., Finin, T., & Cruz, I. F. (2012). Semantic Aspects of EarthCube.

Scope: RDA Data Fabric Interest Group

Status: New

Service Object

Definition

A service object is a type of digital object containing executable code, considered as a unit.

References: Provided by Peter Wittenburg - "This definition comes from those of us who want to identify executable code to create reproducible science."

Scope: DFT Term Definition Prototype

Status: In discussion

Services

Definition

Definition (first attempt): In the DFIG context, a service is a function that is executed on request and delivers certain expected results.

Scope: RDA Data Fabric Interest Group

Status: New

Source Data

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Standard protocol

Definition

Standard protocol - defined rules and conventions for fixed procedures for completing a data management task.

Examples: Data communication such as message protocols.

Scope: RDA Term Collection Core

Status: New

State Information

Definition

State Information is "metadata" information that describes those current properties of a DO that are relevant for proper management and access.

Explanation: State Information can be made persistent to provide relevant and current actionable information about a DO, such as its current location(s), public key(s) and other validation information. Note that a data location may change, so the value of the data state changes; in this case the attribute of data location persists, but not its value.

Examples: Current location(s), checksum, data replication number, public key(s) and other validation information are examples of state information attributes.
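
The point that the attribute persists while its value changes can be sketched as a small record type. The class name, field names and example URLs below are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StateInformation:
    """Current, management-relevant properties of one digital object (DO)."""
    locations: List[str]
    checksum: str
    replica_count: int

    def relocate(self, old: str, new: str) -> None:
        """Replace one location value; the 'locations' attribute itself stays."""
        self.locations = [new if loc == old else loc for loc in self.locations]


state = StateInformation(
    locations=["https://a.example.org/do/1"],  # made-up location
    checksum="abc123",                          # made-up checksum value
    replica_count=1,
)

attributes_before = sorted(vars(state))
state.relocate("https://a.example.org/do/1", "https://b.example.org/do/1")
attributes_after = sorted(vars(state))

print(attributes_before == attributes_after, state.locations)
```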

References: Kahn et al.

Scope: RDA Term Collection Core

Status: In discussion

Sticky Bits

Definition

A sticky bit is a user ownership access-right flag that can be assigned to digital objects such as directories.

Explanation: When the sticky bit flag is set, files added to the directory will inherit the access permissions associated with the directory.

Scope: RDA Term Collection Core

Status: New

Structural metadata

Definition

A type of metadata that indicates how compound objects are put together.

Explanation: Also refers to the underlying structural metadata of digital objects that tells computers how to assemble them.

Examples: Examples are how pages are ordered to form chapters, how data is organized in a table, how datasets are arranged in a collection, or the components & structural organization of a research data object, such as chapters in a book or sentences in a chapter, which allow us to figure out how an object should be put together.

References: NISO. (2004) Understanding Metadata. Bethesda, MD: NISO Press, p.1 http://marciazeng.slis.kent.edu/metadatabasics/types.htm

Scope: RDA Term Collection Core

Status: New

Structured Data

Definition

Structured Data, in distinction to unstructured data, is data that conforms to a defined, fixed schema.

Examples: Relational databases, spreadsheets and RDF triples are examples of structured data.

References: After http://w3c.github.io/dwbp/bp.html#bib-Lexvo

Scope: RDA DFT Interest Group

Status: New

Study-Level Metadata

Definition

Metadata may exist at several levels. Study-Level Metadata describes the research study level for which data is gathered.

Scope: RDA Data Fabric Interest Group

Status: New

Subjective Metadata

Definition

Subjective Metadata, or non-Objective Metadata, is metadata whose elements are subject to differing personal, group or cultural points of view.

Explanation: Some metadata may be specifically intended to represent a subjective evaluation of content, such as what to name a picture.

Examples: An example is assignment of keywords or some textual summarization of content as in a picture or a data set description.

References: http://www.dlib.org/dlib/april02/weibel/04weibel.html

Scope: RDA Metadata WG

Status: New

Support Service

Definition

A support service is a type of service whose functionality provides technical support and assistance to help solve problems related to technical products.

Explanation: Support services may be provided at each phase of the data lifecycle to help manage digital objects used as part of research. A support service may entail the application of multiple operations that are chained together in a procedure or workflow. Operations performed upon the digital object modify administrative state information maintained about the digital object.

Examples: Integrity management through replication, checksum generation, and application of access controls. Chain of custody management through generation of audit trails that track all operations applied to the digital object. Ingestion management through verification of representation information, generation of an Archival Information Package, and storage. Data Discovery through generation of a query to a metadata catalog and the paging of query results. Data Access through identification of an uncorrupted replica, and transport of the digital object.

References: http://en.wikipedia.org/wiki/Data_center_services#Support_services

Scope: Practical Policy WG

Status: New

System

Definition

A whole made from some combination of interacting elements organized to function to achieve one or more stated purposes.

Explanation: Elaboration 1: This is the aspect that the scientific researcher will interact with; it must be well defined and directly relevant to the research needs, just as with any other scientific instrument. Elaboration 2: In DFIG it is important to note that the "systems" will undergo continuous extensions and that their elements (components, services) will be subject to innovation. Also, in the case of DFIG we cannot expect to have a fully understood landscape (see that definition), as is normally expected in software engineering, for example, in order to build a "system" that works. The wholeness of a system is defined by its function in a larger system of which it is a part.

Examples: A computer infrastructure, a data repository.

References: Systems Engineering, see Ackoff's definition at: http://environment-ecology.com/general-systems-theory/380-systems-thinking-with-dr-russell-ackoff.html

Scope: RDA Data Fabric Interest Group

Status: New

System Metadata

Definition

Digital entity properties that are generated by the data management system.

Examples: Creation time: A data management system records when a digital entity was created. Owner: A data management system records the owner of a digital entity. Storage location: A data management system records where a digital entity is stored. Data retention period: A data management system may record the length of time a digital entity will be retained.
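
A file system is one simple case of a data management system that generates such properties by itself. The sketch below reads a few of them with Python's standard library; the dictionary keys are our own naming, and modification time stands in for creation time, which is not available portably.

```python
import os
import tempfile
from datetime import datetime, timezone


def system_metadata(path: str) -> dict:
    """Collect properties that the file system itself records for a file."""
    st = os.stat(path)
    return {
        "storage_location": os.path.abspath(path),
        "size_bytes": st.st_size,
        # Modification time is used because a creation time is not
        # available in a portable way across operating systems.
        "modified_utc": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat(),
        "owner_id": st.st_uid,  # numeric owner id as reported by the OS
    }


# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"example payload")
    example_path = handle.name

metadata = system_metadata(example_path)
os.remove(example_path)
print(metadata)
```

None of these values were supplied by the user who wrote the file; they were recorded by the system, which is what distinguishes system metadata from user-provided descriptions.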

Scope: DFT Term Definition Prototype

Status: In discussion

Taxonomy

Definition

A Taxonomy is a categorization of "something." This may be a lexical organization of Words, Glossaries, Types, etc.

Explanation: Taxonomies serve as "knowledge organization systems" developed to support communication and often to control the use of terms used in the data realm to facilitate the storing and retrieving of data items from a data repository. Taxonomies can take on multiple forms, such as lists, hierarchies, interactive facets, etc.

References: http://www.kmworld.com/Articles/Editorial/What-Is-.../What-is-a-Taxonomy-81159.aspx

Scope: RDA DFT Interest Group

Status: New

TeD-T

Definition

The RDA Term Collection tool, named by Gary

Explanation: TeD-T comes from Te(rm) D(efinition) Tool.

References: Offered by Gary Berg-Cross of DFT in brainstorming session.

Scope: RDA Term Collection Core

Status: New

Technology Migration

Definition

Technology Migration is a process by which something technical in nature, like data or an automated system, is migrated to a new form or representation.

Explanation: When data is migrated, the queries and metadata, such as associated checksums, are migrated as well.

Examples: Examples include migration to a new database system, a new schema or a completely different technology.

References: https://rd-alliance.org/system/files/documents/RDA-DC-Recommendations_150609.pdf

Scope: Practical Policy WG

Status: New

Temporal coordinates

Definition

A time measurement about some physical entity using units defined as a specified duration or point in time.

Explanation: Temporal coordinates are understood within a space-time continuum of four dimensions, which has three spatial coordinates and one temporal coordinate. It is within these 4 coordinates that all physical quantities may be located. "Time is that physical quantity perceived as the continued progress of existence measured by an observer as events which are relatively ordered as 'before' or 'after' and which, at a given point in time, give rise to the notions of past, present and future. Time and location are often used together by an application to describe when a given condition exists or when an object was present at a given location."

Examples: A start and end date applicable to data is one example and release date of data is another.

References: www.dictionary.com/browse/space-time; http://standards.sedris.org/18026/text/ISOIEC_18026E_TEMPORAL_CS.HTM

Scope: Metadata Interest Group

Status: New

Temporary Version

Definition

A temporary version is a copy of a data object, such as a file, made during the course of routine operations.

Scope: DFT Term Definition Prototype

Status: In discussion

Term

Definition

A Term is a word that is accompanied by a Definition.

Explanation: Words usually serve a labeling role.

References: http://ontolog.cim3.net/forum/ontolog-forum/2014-02/msg00173.html

Scope: RDA DFT Interest Group

Status: New

Timestamping

Definition

Timestamping is a process applied to data to ensure that there is a record of when data operations took place.

Examples: Additions to data and deletions of data are marked with a timestamp.
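
A minimal sketch of this idea, assuming a simple in-memory key-value table: every addition and deletion is appended to a log together with a UTC timestamp. The class and method names are hypothetical and are not taken from the Data Citation WG recommendation.

```python
from datetime import datetime, timezone


class TimestampedTable:
    """Toy data store that records when each operation took place."""

    def __init__(self):
        self.rows = {}
        self.log = []  # entries are (timestamp, operation, key)

    def _record(self, operation: str, key: str) -> None:
        self.log.append((datetime.now(timezone.utc), operation, key))

    def add(self, key: str, value) -> None:
        self.rows[key] = value
        self._record("add", key)

    def delete(self, key: str) -> None:
        del self.rows[key]
        self._record("delete", key)


table = TimestampedTable()
table.add("obs-1", 12.5)
table.add("obs-2", 13.1)
table.delete("obs-1")

operations = [(op, key) for _, op, key in table.log]
print(operations)
```

With such a log, the state of the data at an earlier point in time can in principle be reconstructed by replaying the operations up to that timestamp.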

References: Data Citation WG recommendation

Scope: Data Citation WG

Status: New

Topical metadata

Definition

Topical metadata describes the topic or "aboutness" of an information/data object, that is, what the data is about. To make sense to an agent or system, this may include a variety of vocabularies for describing subjects, topics, categories, etc.

Scope: DFT Term Definition Prototype

Status: In discussion

Transaction Record

Definition

Transactional data, in the context of data management, is the metadata information recorded from transactions.

Explanation: In this context, a transaction is a sequence of information exchange and related work (e.g. digital object or database updating) that is considered as a unit for the purposes of satisfying a requested activity such as updates.

References: After http://whatis.techtarget.com/

Scope: Practical Policy WG

Status: New

Transparency

Definition

Transparency as a process consists of activities that make something publicly visible (not private) and easily available.

Explanation: Transparency may also be a property of something which has undergone a process to make it available.

Examples: Government information is often published online in electronic formats that make it easy to search, sort and download.

Scope: RDA Data Publishing Workflow Interest Group

Status: New

Trusted Repository

Definition

Trusted repositories are those repositories that undergo regular assessments according to a set of rules such as those defined by the Data Seal of Approval (DSA) or TRAC (ISO 16363).

Explanation: It is well understood that such an assessment has the potential of increasing trust from its depositors and users, but it will not be the only criterion for users. Repositories can be at different stages of assessment; however, it is evident that certain quality criteria need to be met to distinguish trusted repositories from all types of other entities that store data, such as notebooks or lab servers. The term repository is well defined by DFT as: a digital repository is an infrastructure component that is able to store, manage and curate digital objects and return their bitstreams when a request is being issued.

Scope: RDA Data Fabric Interest Group

Status: New

Trusted user

Definition

A trusted user is an approved user that has met the criteria set by the data owner to handle the data.

Explanation: See also "authentication"

Scope: RDA DFT Interest Group

Status: New

Type

Definition

For the PIT API, a Type is a collection of properties. It is stored in the Type Registry.

Scope: PID Information Types WG

Status: New

URI - Uniform resource identifier

Definition

URIs can be classified as locators (URLs), as names (URNs), or as both. A uniform resource name (URN) functions like a person's name, while a uniform resource locator (URL) resembles that person's street address. In other words: the URN defines an item's identity, while the URL provides a method for finding it.
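
The locator/name distinction can be roughly illustrated by looking at the URI scheme. This is a deliberate simplification: the sketch treats the urn scheme as a name and every other scheme as a locator, which is an assumption of this example rather than a rule from the definition. The function name classify_uri is hypothetical.

```python
from urllib.parse import urlsplit


def classify_uri(uri: str) -> str:
    """Very rough classification of a URI by its scheme."""
    scheme = urlsplit(uri).scheme.lower()
    if not scheme:
        return "no scheme"
    if scheme == "urn":
        return "name (URN)"
    return "locator (URL)"


examples = {
    uri: classify_uri(uri)
    for uri in ("urn:isbn:0451450523", "https://www.w3.org/RDF/", "just-some-text")
}
for uri, kind in examples.items():
    print(f"{uri} -> {kind}")
```

The URN says which item is meant without saying where to get it, while the URL includes an access method (here https) and a location.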

References: http://en.wikipedia.org/wiki/Uniform_resource_identifier; better: IETF RFC 2396

Scope: DFT Term Definition Prototype

Status: In discussion

UUID

Definition

A UUID (Universally Unique Identifier) is a 128-bit number used to guarantee unique identity for different objects on the internet over time.
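
For illustration, Python's standard uuid module can generate such identifiers. The sketch below creates a random (version 4) UUID and shows its 128-bit size; note that for randomly generated UUIDs uniqueness is a matter of extremely high probability rather than a strict guarantee.

```python
import uuid

identifier = uuid.uuid4()       # random (version 4) UUID

as_bytes = identifier.bytes     # the 128 bits as 16 bytes
as_text = str(identifier)       # canonical 36-character text form
round_trip = uuid.UUID(as_text) # parse the text form back into a UUID

print(as_text, len(as_bytes) * 8, "bits")
```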

Examples: filesystem partitions

Scope: DFT Term Definition Prototype

Status: In discussion

Unique Identifier

Definition

A unique identifier (UID) is a numeric or alphanumeric string that is associated with a single entity within a given system. UIDs make it possible to address that entity, so that it can be accessed and interacted with. A UID is one of the RDA metadata set elements.

Explanation: In the context of a collection of digital objects, a unique identifier (UID) is any identifier which is guaranteed to be unique among all identifiers used for those DOs and for a specific purpose. There are various types of unique identifiers, some corresponding to a different generation strategy or use strategy.

Examples: 1. serial numbers, assigned incrementally or sequentially. 2. random numbers, selected from a number space much larger than the maximum (or expected) number of objects to be identified. 3. names or codes allocated by choice which are forced to be unique for the purposes of a central registry.
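
The three generation strategies listed above can be sketched as follows. All function and class names are hypothetical, and the registry is a toy in-memory stand-in for a central registry.

```python
import itertools
import secrets

# 1. Serial numbers, assigned sequentially.
_serial_counter = itertools.count(1)


def next_serial() -> int:
    return next(_serial_counter)


# 2. Random numbers from a space much larger than the number of objects
#    (16 random bytes give 2**128 possible values).
def random_id(n_bytes: int = 16) -> str:
    return secrets.token_hex(n_bytes)


# 3. Names allocated by choice, forced to be unique by a central registry.
class NameRegistry:
    def __init__(self):
        self._taken = set()

    def register(self, name: str) -> str:
        if name in self._taken:
            raise ValueError(f"name already registered: {name}")
        self._taken.add(name)
        return name


first, second = next_serial(), next_serial()
token = random_id()
registry = NameRegistry()
chosen = registry.register("ocean-temperature-2017")
print(first, second, token, chosen)
```

Serial numbers and registry names are unique by construction within one system; random identifiers rely on the size of the number space to make collisions very unlikely.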

References: http://internetofthingsagenda.techtarget.com/definition/unique-identifier-UID

Scope: Metadata Interest Group

Status: New

User Name

Definition

User Name is a unique sequence of symbols (e.g. characters) employed as part of information system activities to uniquely identify a user and allow access to such things as computer systems, computer networks, data or online accounts.

Explanation: Also called login name, logon name, sign-in name, sign-on name. For access purposes a unique name as an ID is assumed or required: "each file has a unique name, that each collection has a unique name, that each access role has a unique name, and that each access permission has a unique name."

Examples: "In many data management systems, multiple naming conventions may be used. For example, a user may be identified by: User_ID, a unique number assigned to the user; User_name, an ascii string assigned to the user"

References: RDA Practical Policy "Outcomes Policy Templates: Practical Policy Working Group", September 2014 https://www.rd-alliance.org/filedepot?cid=104&fid=556

Scope: Practical Policy WG

Status: New

Verify checksum

Definition

Generate a unique reduced representation for a data object by applying a procedure and compare the result to the original reduced representation that has been stored as provenance information.

Explanation: Related terms: data representation, provenance & administrative metadata, checksum

Examples: Examples include a checksum, a hash, a digital signature.
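
The verification step described in the definition can be sketched with Python's hashlib. SHA-256 is used here as one possible choice of procedure; the glossary does not prescribe a particular algorithm, and the function names are hypothetical.

```python
import hashlib


def make_checksum(data: bytes) -> str:
    """Generate the reduced representation (here: a SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


def verify_checksum(data: bytes, stored: str) -> bool:
    """Recompute the checksum and compare it to the stored original."""
    return make_checksum(data) == stored


original = b"temperature,12.5\n"
stored_checksum = make_checksum(original)  # kept as provenance information

intact = verify_checksum(original, stored_checksum)
corrupted = verify_checksum(b"temperature,12.6\n", stored_checksum)
print(intact, corrupted)
```

A mismatch indicates that the data object differs from the one for which the checksum was originally stored, for example because of corruption during transfer or storage.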

Scope: RDA Term Collection Core

Status: New

Versioning

Definition

Generate a (changed) copy of a data object that is uniquely labeled with a version number. The intent is to enable access to prior versions.

Explanation: Note that a version is different from a backup copy, which is typically a copy made at a specific point in time, or a replica, which is a copy of a data object that can be periodically updated. Related terms: version, replication
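
A minimal in-memory sketch of versioning, under the assumption that version numbers are consecutive integers starting at 1. The class name VersionedObject and its methods are hypothetical.

```python
import copy


class VersionedObject:
    """Keeps every version of a data object under its own version number."""

    def __init__(self, content):
        self._versions = {1: copy.deepcopy(content)}

    @property
    def latest(self) -> int:
        return max(self._versions)

    def new_version(self, content) -> int:
        """Store a changed copy under the next version number."""
        number = self.latest + 1
        self._versions[number] = copy.deepcopy(content)
        return number

    def get(self, number=None):
        """Return a copy of the requested version (latest by default)."""
        chosen = self.latest if number is None else number
        return copy.deepcopy(self._versions[chosen])


obj = VersionedObject({"rows": [1, 2, 3]})
changed = obj.get()
changed["rows"].append(4)
v2 = obj.new_version(changed)

print(v2, obj.get(1), obj.get())
```

Because each version is stored as its own copy, version 1 stays accessible and unchanged after version 2 is created.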

Scope: RDA Term Collection Core

Status: New

Virtualization

Definition

In computing, virtualization refers to the act of creating a virtual or abstract (rather than actual or physical) version of something.

Explanation: Virtualization may be model based, that is, a model of actual data management services is built and used to represent a family of things that may be implemented in various ways faithful to the abstract model. By means of a virtual model, a collection, workflow, or data flow can be managed independently of the choice of technology for implementation.

Examples: Examples include virtual computer hardware platforms, operating systems, storage devices, computer network resources and data management processes.

References: After https://en.wikipedia.org/wiki/Virtualization and Data Foundation position paper discussed at RDA P6.

Scope: RDA Data Fabric Interest Group

Status: New

Vocabulary

Definition

Vocabulary is a body or listing of a group of "terms" used by a community for particular domain purposes.

Explanation: To be useful, vocabularies are usually annotated with explanatory definitions as to meaning and purpose and with links to related terms. For convenience, vocabularies are usually arranged alphabetically within a lexicon or glossary where definitions are provided, although as part of communication definitions may not have been explicitly provided.

Scope: RDA Term Collection Core

Status: New

Web resource

Definition

Every 'thing' or entity that can be identified (has an identity), named, addressed or handled, in any way whatsoever, in the web at large, or in any networked information system.

Explanation: The concept has evolved from the early notion of static addressable documents or files to a more generic and abstract definition as above. Each resource must have a URI at which it is available. Resources also have a Resource Type, such as the media type of a resource. This is useful to support an understanding of the nature of resource representations, say as an audio file.

Examples: Familiar examples include an electronic document or data stored on the Web, an image, a service (e.g., "a weather report for DC"), as well as a collection of other resources.

References: A Short History of "Resource" in web architecture, by Tim Berners-Lee

Scope: RDA Term Collection Core

Status: New

Work

Definition

ToDo

Scope: DFT Term Definition Prototype

Status: In discussion

Workflow

Definition

A set of chained operations carried out in a defined sequence.

Explanation: Aka scientific workflows. The simplest computerized scientific workflows are scripts that can involve several ingredients such as data, programs, models and other inputs such as human or sensor observations. Workflows produce outputs such as analyses that can include visualizations and analytical results. Preserved workflows are important for reproducible science. They simplify complex sequences of activities and enable researchers to automate and track the provenance of the work in workflow execution. Workflow scripts are digital objects.
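
As a sketch of a scripted workflow, the example below chains three operations in a defined sequence and records which steps were run as simple provenance. The step functions and the sample values are invented for illustration.

```python
from typing import Callable, List, Tuple

Step = Callable[[list], list]


def run_workflow(steps: List[Step], data: list) -> Tuple[list, List[str]]:
    """Apply each step to the output of the previous one, in order."""
    provenance = []
    for step in steps:
        data = step(data)
        provenance.append(step.__name__)  # track what was executed
    return data, provenance


def drop_missing(values: list) -> list:
    return [v for v in values if v is not None]


def to_celsius(values: list) -> list:
    return [round((v - 32) * 5 / 9, 1) for v in values]


def summarize(values: list) -> list:
    return [min(values), max(values)]


# Hypothetical sensor readings in Fahrenheit, one of them missing.
result, trace = run_workflow(
    [drop_missing, to_celsius, summarize],
    [32, None, 212, 50],
)
print(result, trace)
```

Keeping the script and the recorded trace together is one simple way to make the analysis repeatable with new input data.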

Scope: Practical Policy WG

Status: New

Workflow Virtualization

Definition

Workflow virtualization is a type of virtualization for managing the naming, arrangement, sharing, provenance, output file production and re-execution of workflows.

References: Reagan Moore presentation at RDA P6 DFT IG session

Scope: RDA Data Fabric Interest Group

Status: New