Species Occurrence Records Represented in RDF

Species Occurrence Records.png

I thought it might be helpful to explain how the example TaxonConcept occurrence records are structured. The diagram above shows the different sections in the RDF document. The different sections make statements about the various related entities. Here is a link to one of the RDF records. The occurrence section gives the what, when, where and who for the occurrence itself. The occurrence section also allows for the inclusion of an image of the original occurrence label.

The occurrence happened within a given geographic area documented by a WGS84 latitude, longitude, and radius in meters. To allow "areas" that overlap or are identical to be linked, it is important that the precision (the number of digits after the decimal point) of the latitude and longitude be standardized. For these examples, I simply added one digit to the numbers from a typical GPS. This should allow submeter precision for those projects that need it. Remember that the actual precision is recorded in the radius in meters. Those who are exporting data sets with only two or three decimal places can output those latitudes and longitudes by setting the number of decimal places to 8 in their export software. This would mean that data recorded as 44.86, -87.23 would be exported as 44.86000000, -87.23000000. Again, remember that the actual precision is given in the radius. One feature of these "areas" is that I only need to define an "area" once, so if I have 10,000 mosquito records I can simply refer to the area "geo:44.86528100,-87.23147800;u=10" in the other 9,999 records. Another feature is that if a different LOD entity has soil or weather data associated with this area, they can add that and it will be visible to those users consuming my occurrence records.
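As a rough illustration of the padding described above, here is a minimal Python sketch that builds an area identifier with a fixed number of decimal places. The function name geo_uri and its parameters are my own assumptions for this example and are not part of the TaxonConcept records or any export tool.

```python
def geo_uri(lat: float, lon: float, radius_m: int, places: int = 8) -> str:
    """Build a 'geo:' area identifier with a fixed number of decimal places.

    geo_uri is a hypothetical helper name used only for this sketch.
    The radius (u=) carries the actual precision in meters.
    """
    return f"geo:{lat:.{places}f},{lon:.{places}f};u={radius_m}"


# Coordinates recorded with only two decimal places are zero-padded.
print(geo_uri(44.86, -87.23, 10))

# Coordinates from a typical GPS get one extra trailing digit.
print(geo_uri(44.865281, -87.231478, 10))
```

Because every record formats coordinates the same way, two records for the same place produce the same string, which is what lets the area be defined once and referenced elsewhere.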

The species concept section documents that an instance of the species concept had a particular occurrence in a particular area. It also adds that the species concept can then be "expected" in the particular continent, state/province and county.

The occurrence was of an individual, which now could be a specimen in a collection or an individual that might be encountered in other studies.

The occurrence record has an associated identification. This section contains information about who identified the specimen, how they identified it, and what concept and scientific name were assigned to the specimen. The identification section also allows for the inclusion of an image of the original identification label.

The last three sections make assertions about the geographic areas of continent, state/province and county. Since the species concept was observed within these geographical areas, the species concept can be said to be "expected" in these areas.
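The inference in the paragraph above can be sketched in Python as plain (subject, predicate, object) tuples. This is only an illustration: the predicate name isExpectedIn and the placeholder identifiers are hypothetical, and the actual records may use different terms.

```python
# Hypothetical predicate name; the real RDF records may use a different term.
EXPECTED_IN = "isExpectedIn"


def expected_triples(species_concept, areas):
    """Return one 'expected in' triple per geographic area of an occurrence."""
    return [(species_concept, EXPECTED_IN, area) for area in areas]


# Placeholder identifiers, not real TaxonConcept URIs.
triples = expected_triples(
    "ExampleSpeciesConcept",
    ["ExampleContinent", "ExampleStateOrProvince", "ExampleCounty"],
)
for triple in triples:
    print(triple)
```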

This method creates the triples that allow LOD consumers to see what species concepts are "expected" in a given geographical area. It is also an emergent phenomenon that individual occurrence records create geographic "species lists."
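To show how such lists could emerge, here is a minimal Python sketch that groups "expected" triples by area. The triples are plain tuples, and the predicate name isExpectedIn and all identifiers are hypothetical stand-ins; a real consumer would more likely query the RDF data directly.

```python
from collections import defaultdict


def species_lists(triples, predicate="isExpectedIn"):
    """Group species concepts by area from (subject, predicate, object) triples."""
    lists = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:
            lists[obj].add(subject)  # a set ignores repeated occurrences
    return {area: sorted(species) for area, species in lists.items()}


# Placeholder data: three occurrence records contributing to two areas.
example = [
    ("SpeciesA", "isExpectedIn", "CountyX"),
    ("SpeciesB", "isExpectedIn", "CountyX"),
    ("SpeciesA", "isExpectedIn", "CountyX"),
    ("SpeciesA", "isExpectedIn", "StateY"),
]
print(species_lists(example))
```

No one has to author the list for CountyX; it falls out of the individual records, and repeated occurrences of the same species concept do not create duplicates.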

These examples include more information than they need to, but I thought it might be best to include everything that was part of the DarwinCore. Individual data providers might choose not to include all of these attributes in their own data.