High-quality, consistent metadata is essential to the successful discovery, exchange, and querying of a Linked Dataset. The Protein Ontology Linked Dataset is accompanied by a full VoID metadata description that complies with the W3C HCLS specification. The full VoID description is at void.ttl.
Summary Level Description
This level provides a description of a dataset that is independent of a specific version or format.
@prefix : <#> .
@prefix void: <http://rdfs.org/ns/void#> . # Describing Linked Datasets with the VoID Vocabulary
@prefix void-ext: <http://ldf.fi/void-ext#> . # Extensions to the Vocabulary of Interlinked Datasets (VoID)
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . # RDF Syntax
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . # RDF Schema
@prefix owl: <http://www.w3.org/2002/07/owl#> . # OWL ontology
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> . # XML Schema
@prefix dcterms: <http://purl.org/dc/terms/> . # Dublin Core Metadata Terms
@prefix dct: <http://purl.org/dc/terms/> . # Dublin Core Metadata Terms (short form used in the examples below)
@prefix dctypes: <http://purl.org/dc/dcmitype/> . # Dublin Core Metadata Types
@prefix foaf: <http://xmlns.com/foaf/0.1/> . # Friend-of-a-Friend
@prefix cito: <http://purl.org/spar/cito/> . # Citation Typing Ontology
@prefix dcat: <http://www.w3.org/ns/dcat#> . # Data Catalog
@prefix freq: <http://purl.org/cld/freq/> . # Collection Description Frequency Vocabulary
@prefix idot: <http://identifiers.org/idot/> . # Identifiers.org vocabulary
@prefix lexvo: <http://lexvo.org/ontology#> . # Lexical Vocabulary
@prefix pav: <http://purl.org/pav/> . # Provenance Authoring and Versioning ontology
@prefix prov: <http://www.w3.org/ns/prov#> . # PROV Ontology
@prefix schemaorg: <http://schema.org/> . # schema.org vocabulary
@prefix sd: <http://www.w3.org/ns/sparql-service-description#> . # SPARQL 1.1 Service Description
@prefix sio: <http://semanticscience.org/resource/> . # Semanticscience Integrated Ontology (SIO)
@prefix skos: <http://www.w3.org/2004/02/skos/core#> . # SKOS (link predicate used by the linkset below)
Dataset Identification and Declaration of Type
All summary and version level descriptions are typed as "dctypes:Dataset". The distribution level description can also be typed as "dctypes:Dataset" but needs to be additionally typed as a "dcat:Distribution". RDF formatted datasets are typed as an instance of a "void:Dataset".
:pro rdf:type dctypes:Dataset .
The "dct:title" defines a human-readable title for the dataset. Alternative or older titles can be specified using "dct:alternative".
:pro dct:title "PRO"@en ; dct:alternative "Protein Ontology"@en .
The "dct:description" describes the contents of the dataset.
:pro dct:description """PRO describes the relationships of proteins and protein evolutionary classes, delineates the multiple protein forms of a gene locus (ontology for protein forms), protein complexes, and interconnects existing ontologies. Further information is available at http://www.proteininformationresource.org/pro/."""@en .
The "dct:publisher" states the person, organisation, or service that is responsible for publishing the dataset.
:pro dct:publisher [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] .
Change Frequency
The "dct:accrualPeriodicity" specifies the update frequency of the dataset, using a value from the Dublin Core Collection Description Frequency vocabulary (DCFreq).
:pro dct:accrualPeriodicity freq:quarterly .
Webpage and Logo
The "foaf:page" links to a human-readable web page. The "schemaorg:logo" links to an image file containing the logo for the dataset.
:pro foaf:page <https://proconsortium.org/> ; schemaorg:logo <https://proconsortium.org/PROlogofinalEDGclear.png> .
The "dcat:keyword" provides the keywords and topics of coverage for the dataset.
:pro dcat:keyword "Biomedical ontology"^^xsd:string, "Protein ontology"^^xsd:string, "Community annotation"^^xsd:string, "Protein"^^xsd:string .
Licensing and Rights
The "dct:license" states the license under which the dataset is published.
:pro dct:license <https://creativecommons.org/licenses/by/4.0/> ; dct:rights """The PRotein Ontology is licensed under CC BY 4.0. You are free to share (copy and redistribute the material in any medium or format) and adapt (remix, transform, and build upon the material) for any purpose, even commercially. You must give appropriate credit (by using the original ontology IRI for the whole ontology or original term IRIs for individual terms), provide a link to the license, and indicate if any changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.""" .
The "dct:language" declares the languages in which the dataset is published.
:pro dct:language <http://lexvo.org/id/iso639-3/eng> .
The "cito:citesAsAuthority" from the Citation Typing Ontology (CiTO) links to publications about the dataset.
:pro cito:citesAsAuthority <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210558/> .
Preferred and Alternative Prefixes
The "idot:preferredPrefix" specifies the preferred short name for the dataset. The "idot:alternatePrefix" specifies alternate short names for the dataset.
:pro idot:preferredPrefix "pro" ; idot:alternatePrefix "prodb" .
Version Level Description
This level captures version-specific characteristics of a dataset.
The "dct:isVersionOf" relates the version level description to the summary level description.
:pro61_0 dct:isVersionOf :pro .
The "pav:version" declares the version identifier.
:pro61_0 pav:version "61_0"^^xsd:string .
The "pav:previousVersion" links to the previous version of the dataset.
:pro61_0 pav:previousVersion :pro60_0 .
The "pav:hasCurrentVersion" declares the current version of the dataset.
:pro pav:hasCurrentVersion :pro61_0 .
Dates of Creation and Issuance
The "dct:created" states the date the dataset was generated. "dct:issued" states the date the dataset was made public.
:pro61_0 dct:created "2020-08-18"^^xsd:date ; dct:issued "2020-08-18"^^xsd:date .
Authorship, Creation, Curation
The "dct:creator" states the individual or organization responsible for creating the dataset. The value should be an IRI that can be resolved for more information. The PAV ontology can be used for fine-grained attribution of a creation event. For example, "pav:authoredBy" states the IRIs of the authors and "pav:authoredOn" the date of authorship; "pav:createdBy" states the IRIs of the creators and "pav:createdOn" the date of creation; "pav:curatedBy" states the IRIs of the curators and "pav:curatedOn" the date of curation.
:pro61_0 dct:creator [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] ;
    pav:authoredBy [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] ;
    pav:authoredOn "2020-08-18"^^xsd:date ;
    pav:createdBy [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] ;
    pav:createdOn "2020-08-18"^^xsd:date ;
    pav:curatedBy [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] ;
    pav:curatedOn "2020-08-18"^^xsd:date .
Distribution Level Description
This level captures metadata about a specific form and version of a dataset.
Distributions and Formats
The version level description should link to the distribution level descriptions that represent the files in different formats.
:pro61_0 dcat:distribution :pro61_0rdf .
:pro61_0rdf a dctypes:Dataset, dcat:Distribution, void:Dataset ;
    dct:format "application/rdf+xml" .
The "void:vocabulary" from VoID describes the RDFS vocabularies or OWL ontologies that represent the data.
:pro61_0rdf void:vocabulary <http://www.w3.org/ns/dcat#>, <http://purl.org/dc/terms/> .
The "dct:conformsTo" indicates a particular format or standard the dataset conforms to.
:pro61_0rdf dct:conformsTo <http://www.w3.org/2001/sw/hcls/notes/hcls-dataset/> .
The "dct:hasPart" indicates the parts of the dataset.
:prolod61_0rdf dct:hasPart :pro61_0rdf, :paf61_0rdf ;
    void:subset :pro61_0rdf, :paf61_0rdf .
File Locations
The "dcat:downloadURL" declares the distribution file. The "dcat:byteSize" declares the size of the distribution file. For RDF resources, the "void:dataDump" declares the distribution file. The "dcat:accessURL" specifies a directory containing the files of interest.
# Summary level declaration
:pro dcat:accessURL <https://lod.proconsortium.org/releases/latest/> .
# Version level declaration
:pro61_0 dcat:accessURL <https://lod.proconsortium.org/releases/release_61.0/> ;
    dcat:downloadURL <https://lod.proconsortium.org/releases/release_61.0/pro.owl> ;
    dcat:landingPage <https://lod.proconsortium.org/release.html> .
# Distribution level declaration
:pro61_0rdf dcat:accessURL <https://lod.proconsortium.org/releases/release_61.0/> ;
    dcat:downloadURL <https://lod.proconsortium.org/releases/release_61.0/pro.owl> ;
    void:dataDump <https://lod.proconsortium.org/releases/release_61.0/pro.owl> .
SPARQL Query Endpoint
The "void:sparqlEndpoint" specifies a SPARQL endpoint at the summary level.
:pro void:sparqlEndpoint <https://sparql.proconsortium.org/virtuoso/sparql> .
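As a rough illustration of how a client might use the declared endpoint, the following Python sketch builds a SPARQL 1.1 Protocol GET request without sending it. The query text and the helper name `build_sparql_request` are assumptions made for this example; only the endpoint IRI and the term IRI come from this description.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Endpoint IRI taken from the void:sparqlEndpoint statement above.
ENDPOINT = "https://sparql.proconsortium.org/virtuoso/sparql"

# Illustrative query (an assumption, not part of the VoID description):
# list a few properties of one PRO term.
QUERY = """
SELECT ?p ?o
WHERE { <http://purl.obolibrary.org/obo/PR_000000002> ?p ?o }
LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    # The protocol passes the query text in the "query" URL parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would execute the query; whether the endpoint honours the requested result format is not something this description states.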
Dataset Documentation
The documentation for the dataset is made available by using the "dcat:landingPage".
:prolod61_0 dcat:landingPage <https://lod.proconsortium.org/dataset.html> .
Identifier, Resource, and Access Patterns
The "idot:identifierPattern" gives a regular expression pattern that matches the identifiers of items or records in the dataset, for distribution level descriptions.
:pro61_0rdf idot:identifierPattern "PR_\\d+"^^xsd:string .
The "void:uriRegexPattern" denotes a superset of data item URIs in the dataset using a regular expression pattern for distribution level description.
:pro61_0rdf void:uriRegexPattern "http://purl.obolibrary.org/obo/PR_\\d+" .
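The two patterns can be checked against candidate values with a few lines of Python. This is a minimal sketch: the helper `matches` is hypothetical, and treating the patterns as full-string matches is an assumption of the example, since the statements above do not say whether they are anchored. Note that the doubled backslash in the Turtle literals is Turtle escaping, so the regular expression itself contains a single `\d`.

```python
import re

# Patterns as they read once Turtle unescapes "\\d" to "\d".
IDENTIFIER_PATTERN = r"PR_\d+"
URI_REGEX_PATTERN = r"http://purl.obolibrary.org/obo/PR_\d+"


def matches(pattern: str, value: str) -> bool:
    """Return True when the whole value matches the pattern.

    Full-string matching is an assumption of this sketch.
    """
    return re.fullmatch(pattern, value) is not None


print(matches(IDENTIFIER_PATTERN, "PR_000000002"))  # True
print(matches(URI_REGEX_PATTERN, "http://purl.obolibrary.org/obo/PR_000000002"))  # True
# Letters follow the prefix here, so the digit-only pattern rejects it.
print(matches(IDENTIFIER_PATTERN, "PR_Q96PN6"))  # False
```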
Example Identifier and Resource
The "idot:exampleIdentifier" provides an example identifier for distribution level descriptions.
:pro61_0rdf idot:exampleIdentifier "PR_000000002"^^xsd:string .
The "void:exampleResource" provides an example resource for distribution level descriptions.
:pro61_0rdf void:exampleResource <https://sparql.proconsortium.org/virtuoso/describe/?uri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FPR_000000002> .
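The example resource above is the term IRI percent-encoded into the query string of the describe service. A small Python sketch of that construction follows; the function name `describe_url` is hypothetical, and the base URL is copied from the example rather than from any documented API.

```python
from urllib.parse import quote

# Base of the describe service, copied from the void:exampleResource value above.
DESCRIBE_BASE = "https://sparql.proconsortium.org/virtuoso/describe/?uri="


def describe_url(term_iri: str) -> str:
    """Percent-encode a term IRI and append it to the describe base URL."""
    # safe="" also encodes ":" and "/", which quote() leaves alone by default.
    return DESCRIBE_BASE + quote(term_iri, safe="")


print(describe_url("http://purl.obolibrary.org/obo/PR_000000002"))
# https://sparql.proconsortium.org/virtuoso/describe/?uri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FPR_000000002
```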
Linked Dataset
Linksets identify the content that links instances in one dataset with instances in another dataset. A separate linkset is created for each link predicate relating a particular pair of datasets. A linkset should be declared to be a subset of the dataset that publishes it. The linkset itself MUST be declared to be of type void:Linkset and provide the same metadata as an RDF distribution. Some metadata properties are specific to a linkset: the source and target of each link (that is, the datasets that are linked) and the predicate used in the links. The relevant statistic for a linkset is the number of triples it contains, which can be reported using the void:triples property.
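To make the void:triples statistic concrete, the sketch below serializes a list of link pairs as N-Triples and derives the count that would be reported. The link pairs are hypothetical and for illustration only, and the helper names are not part of any PRO tooling; the linkset name is the one declared in the example that follows.

```python
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Hypothetical (subject, object) pairs for illustration only; the real
# linkset contents are not reproduced here.
LINKS = [
    ("http://purl.obolibrary.org/obo/PR_Q96PN6", "http://purl.uniprot.org/uniprot/Q96PN6"),
    ("http://purl.obolibrary.org/obo/PR_P05067", "http://purl.uniprot.org/uniprot/P05067"),
]


def linkset_ntriples(pairs, predicate):
    """Serialize each distinct (subject, object) pair as one N-Triples line."""
    # An RDF graph is a set of triples, so duplicate pairs are dropped.
    return [f"<{s}> <{predicate}> <{o}> ." for s, o in sorted(set(pairs))]


def void_triples_statement(linkset_name, triples):
    """Build the Turtle statement reporting the size of the linkset."""
    return f"{linkset_name} void:triples {len(triples)} ."


triples = linkset_ntriples(LINKS, SKOS_EXACT_MATCH)
print(void_triples_statement(":pro61_0-uniprotkb-exactMatch-linkset", triples))
# :pro61_0-uniprotkb-exactMatch-linkset void:triples 2 .
```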
PRO-UniProt Linkset
:pro61_0-uniprotkb-exactMatch-linkset
    # Linkset-specific metadata
    a void:Linkset ;
    void:subjectsTarget :pro61_0 ;
    void:objectsTarget <http://purl.uniprot.org/void#UniProtDataset_2019_11> ;
    void:linkPredicate skos:exactMatch ;
    # Metadata for an RDF distribution
    a dctypes:Dataset, dcat:Distribution, void:Dataset ;
    dcterms:format "text/turtle" ;
    dcterms:title "PRO Version 61_0 to UniProtKB ExactMatch Linkset"@en ;
    dcterms:description """A linkset connecting PRO Version 61_0 to UniProtKB with skos:exactMatch linkPredicate"""@en ;
    dcterms:created "2020-08-18"^^xsd:date ;
    dcterms:issued "2020-08-18"^^xsd:date ;
    dcterms:creator [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] ;
    pav:authoredBy [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] ;
    pav:authoredOn "2020-08-18"^^xsd:date ;
    pav:createdBy [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] ;
    pav:createdOn "2020-08-18"^^xsd:date ;
    pav:curatedBy [ foaf:page <https://proconsortium.org/pro_cst.shtml> ] ;
    pav:curatedOn "2020-08-18"^^xsd:date ;
    dcterms:license <https://creativecommons.org/licenses/by/4.0/> ;
    dcterms:rights """The PRotein Ontology is licensed under CC BY 4.0. You are free to share (copy and redistribute the material in any medium or format) and adapt (remix, transform, and build upon the material) for any purpose, even commercially. You must give appropriate credit (by using the original ontology IRI for the whole ontology or original term IRIs for individual terms), provide a link to the license, and indicate if any changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.""" ;
    dcterms:language <http://lexvo.org/id/iso639-3/eng> ;
    void:vocabulary <http://www.w3.org/ns/dcat#>, <http://purl.org/dc/terms/> ;
    dcterms:conformsTo <http://www.w3.org/2001/sw/hcls/notes/hcls-dataset/> ;
    # Identifiers
    idot:identifierPattern "PR_\\w"^^xsd:string ;
    void:uriRegexPattern "http://purl.obolibrary.org/obo/PR_\\w" ;
    idot:accessIdentifierPattern "https://sparql.proconsortium.org/virtuoso/describe/?url=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FPR_\\w" ;
    idot:exampleIdentifier "PR_Q96PN6"^^xsd:string ;
    # Provenance and change
    pav:version "61_0"^^xsd:string ;
    pav:previousVersion :pro60_0-uniprotkb-exactMatch-linkset ;
    dcterms:isVersionOf :pro-uniprotkb-exactMatch-linkset ;
    # Availability and distributions
    dcat:distribution :paf61_0rdf-uniprotkb-exactMatch-linkset ;
    dcat:downloadURL <https://lod.proconsortium.org/releases/release_61.0/pro-uniprotkb-exactMatch-linkset.ttl> ;
    dcat:landingPage <https://lod.proconsortium.org/release.html> ;
    void:sparqlEndpoint <https://sparql.proconsortium.org/virtuoso/sparql> .