Publishing Tree Census Data as a Darwin Core Archive

Hiroki Itô

2023-02-16

Introduction

Note: these slides are available at
https://ito4303.github.io/DarwinCoreArchive.html

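Background

  • Tomakomai canopy-destruction study site (the Jumonji, or cross-shaped, plot)
    • Forest regeneration after canopy destruction by the Toyamaru Typhoon (Typhoon Marie, 1954) has been monitored since 1957
  • 2016 census
    • The results up to that year were compiled in Darwin Core Archive format and published as a data paper in Ecological Research (Itô et al. 2018)
  • 2022 census
    • The data were updated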

Darwin Core / Darwin Core Archive

Darwin Core

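  • A data format for describing biodiversity information
  • Adopted by GBIF (Global Biodiversity Information Facility) and others
  • Originally designed for specimen data
    • occurrence data (records of presence)
  • Comma- or tab-delimited text files (XML and RDF representations also exist)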
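
As a minimal sketch of the delimited-text form, the following R code writes two occurrence records to a tab-delimited file and reads them back. The column names are Darwin Core terms; the record values and the output file name are made up for illustration.

```r
# Hypothetical occurrence records; the column names are Darwin Core terms
occ <- data.frame(
  occurrenceID   = c("example:0001", "example:0002"),
  basisOfRecord  = "HumanObservation",
  eventDate      = c("2022-10-26", "2022-10-26"),
  scientificName = c("Abies sachalinensis", "Picea jezoensis"),
  stringsAsFactors = FALSE
)

# Write as tab-delimited text with a header row and no quoting
out_file <- file.path(tempdir(), "occurrence_example.txt")
write.table(occ, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, na = "")

# Read the file back to confirm the round trip
occ2 <- read.delim(out_file, stringsAsFactors = FALSE)
```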

JBIF: Terms in the latest version of Darwin Core

Darwin Core Archive

  • Bundles biodiversity data and metadata together (zipped into a single file)
    • Core data + metadata (+ extension data)
  • Makes it possible to handle data other than occurrences

GBIF: What is Darwin Core Archive (DwC-A)?

Core data types (three kinds)

  1. Occurrence data
    • Occurrence (presence) records
    • One record per individual
  2. Checklist data
    • Regional checklists (fauna, flora)
  3. Sampling-event data

GBIF: DwC-A Components

Extension types (examples)

GBIF: Registered Extensions

Data from the Tomakomai canopy-destruction study site

FileMaker Pro database

Data processing workflow

Structure of the Darwin Core Archive

Metadata

  • eml.xml
    • Metadata written in Ecological Metadata Language (EML)
      • Dataset overview, license, contact information, etc.
  • meta.xml
    • Definition of the data structure

Metadata contents

eml.xml

<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" xmlns:dc="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 http://rs.gbif.org/schema/eml-gbif-profile/1.1/eml.xsd" packageId="doi:10.1007/s11284-018-1559-3/v2.0" system="http://gbif.org" scope="system" xml:lang="eng">
<dataset>
<title xml:lang="eng">
Long-term data on forest regeneration after catastrophic windthrow in Tomakomai, Hokkaido, northern Japan
</title>
<creator>
<individualName>
<givenName>Hiroki</givenName>
<surName>Itô</surName>
</individualName>
<organizationName>Forestry and Forest Products Research Institute</organizationName>
<positionName>Leader of the silviculture research group, Hokkaido Research Center</positionName>
<address>
<deliveryPoint>Hitsujigaoka 7</deliveryPoint>
<city>Sapporo</city>
<administrativeArea>Hokkaido</administrativeArea>
<postalCode>062-8516</postalCode>
<country>JP</country>
</address>
<electronicMailAddress>abies.firma@gmail.com</electronicMailAddress>
<userId directory="http://orcid.org/">0000-0003-2243-1007</userId>
</creator>
<metadataProvider>
<individualName>
<givenName>Hiroki</givenName>
<surName>Itô</surName>
</individualName>
<organizationName>Forestry and Forest Products Research Institute</organizationName>
<positionName>Leader of the silviculture research group, Hokkaido Research Center</positionName>
<address>
<deliveryPoint>Hitsujigaoka 7</deliveryPoint>
<city>Sapporo</city>
<administrativeArea>Hokkaido</administrativeArea>
<postalCode>062-8516</postalCode>
<country>JP</country>
</address>
<electronicMailAddress>abies.firma@gmail.com</electronicMailAddress>
<userId directory="http://orcid.org/">0000-0003-2243-1007</userId>
</metadataProvider>
<pubDate>2023-02-08</pubDate>
<language>eng</language>
<abstract>
<para>
Typhoon No. 15 in 1954 (Marie) caused catastrophic windthrow in Hokkaido, northern Japan. The Tomakomai District of the National Forest was one of the forests severely damaged. A study site was established in a stand of the National Forest within the jurisdiction of the Iburi East District Forest Office. The stand was located on the eastern slope of Mt. Tarumae at an elevation of approximately 300–310 m a.s.l. at an angle of approximately 5°. Two belts sized 4 m × 40 m, crossing at right angles at the center (total area: 304 m^2), were established within the site in 1957, and censuses on regeneration were conducted in 1957, 1970, 1973, 1978, 1983, 1990, 1996, 2001, 2006, 2011, 2016, and 2022. All stems of coniferous tree species (height ≥ 10 cm) that regenerated in the belts were marked. For broadleaved tree species, all stems with height ≥ 1.3 m were marked in 1957–1990, but stems with height ≥ 10 cm were marked after 1996. Height was measured for all marked stems, and the diameter at breast height was measured for stems with height ≥ 1.3 m. During the censuses, 27 coniferous and broadleaved tree species were identified and three more species were identified to the genus level. There are 3,431 records for the occurrence data and 12,558 records for the measurement data, including missing values.
</para>
</abstract>
<keywordSet>
<keyword>Occurrence</keyword>
<keywordThesaurus>
GBIF Dataset Type Vocabulary: http://rs.gbif.org/vocabulary/gbif/dataset_type.xml
</keywordThesaurus>
</keywordSet>
<intellectualRights>
<para>
This work is licensed under a
<ulink url="http://creativecommons.org/licenses/by/4.0/legalcode">
<citetitle>Creative Commons Attribution (CC-BY) 4.0 License</citetitle>
</ulink>
.
</para>
</intellectualRights>
<coverage>
<geographicCoverage>
<geographicDescription>Tomakomai, Hokkaido, Japan</geographicDescription>
<boundingCoordinates>
<westBoundingCoordinate>141.432</westBoundingCoordinate>
<eastBoundingCoordinate>141.432</eastBoundingCoordinate>
<northBoundingCoordinate>42.69</northBoundingCoordinate>
<southBoundingCoordinate>42.69</southBoundingCoordinate>
</boundingCoordinates>
</geographicCoverage>
<temporalCoverage>
<rangeOfDates>
<beginDate>
<calendarDate>1957</calendarDate>
</beginDate>
<endDate>
<calendarDate>2022</calendarDate>
</endDate>
</rangeOfDates>
</temporalCoverage>
</coverage>
<maintenance>
<description>
<para/>
</description>
<maintenanceUpdateFrequency>unkown</maintenanceUpdateFrequency>
</maintenance>
<contact>
<individualName>
<givenName>Hiroki</givenName>
<surName>Itô</surName>
</individualName>
<organizationName>Forestry and Forest Products Research Institute</organizationName>
<positionName>Leader of the silviculture research group, Hokkaido Research Center</positionName>
<address>
<deliveryPoint>Hitsujigaoka 7</deliveryPoint>
<city>Sapporo</city>
<administrativeArea>Hokkaido</administrativeArea>
<postalCode>062-8516</postalCode>
<country>JP</country>
</address>
<electronicMailAddress>abies.firma@gmail.com</electronicMailAddress>
<userId directory="http://orcid.org/">0000-0003-2243-1007</userId>
</contact>
<methods>
<methodStep>
<description>
<para/>
</description>
</methodStep>
</methods>
</dataset>
<additionalMetadata>
<metadata>
<gbif>
<dateStamp>2023-02-08T02:00:00+00:00</dateStamp>
<hierarchyLevel>dataset</hierarchyLevel>
</gbif>
</metadata>
</additionalMetadata>
</eml:eml>

meta.xml

<?xml version="1.0"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
    <core encoding="UTF-8" linesTerminatedBy="\n" fieldsTerminatedBy="\t" fieldsEnclosedBy="" ignoreHeaderLines="1" rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
        <files>
            <location>Tomakomai1463_occ.txt</location>
        </files>
        <id index="0"/>
        <field default="HumanObservation" term="http://rs.tdwg.org/dwc/terms/basisOfRecord"/>
        <field default="2023-02-08T02:00:00Z" term="http://purl.org/dc/terms/modified"/>
        <field default="Forestry and Forest Products Research Institute, Japan" term="http://purl.org/dc/terms/rightsHolder"/>
        <field default="FFPRI" term="http://rs.tdwg.org/dwc/terms/institutionCode"/>
        <field default="Tomakomai1463" term="http://rs.tdwg.org/dwc/terms/collectionCode"/>
        <field default="Hokkaido" term="http://rs.tdwg.org/dwc/terms/island"/>
        <field default="Japan" term="http://rs.tdwg.org/dwc/terms/country"/>
        <field default="JP" term="http://rs.tdwg.org/dwc/terms/countryCode"/>
        <field default="Hokkaido" term="http://rs.tdwg.org/dwc/terms/stateProvince"/>
        <field default="Tomakomai" term="http://rs.tdwg.org/dwc/terms/municipality"/>
        <field default="300" term="http://rs.tdwg.org/dwc/terms/minimumElevationInMeters"/>
        <field default="310" term="http://rs.tdwg.org/dwc/terms/maximumElevationInMeters"/>
        <field default="42.690269" term="http://rs.tdwg.org/dwc/terms/decimalLatitude"/>
        <field default="141.432472" term="http://rs.tdwg.org/dwc/terms/decimalLongitude"/>
        <field default="WGS84" term="http://rs.tdwg.org/dwc/terms/geodeticDatum"/>
        <field default="20" term="http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters"/>
        <field index="1" term="http://rs.tdwg.org/dwc/terms/catalogNumber"/>
        <field index="2" term="http://rs.tdwg.org/dwc/terms/eventDate"/>
        <field index="3" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
        <field index="4" term="http://rs.tdwg.org/dwc/terms/kingdom"/>
        <field index="5" term="http://rs.tdwg.org/dwc/terms/phylum"/>
        <field index="6" term="http://rs.tdwg.org/dwc/terms/class"/>
        <field index="7" term="http://rs.tdwg.org/dwc/terms/order"/>
        <field index="8" term="http://rs.tdwg.org/dwc/terms/family"/>
        <field index="9" term="http://rs.tdwg.org/dwc/terms/genus"/>
        <field index="10" term="http://rs.tdwg.org/dwc/terms/specificEpithet"/>
        <field index="11" term="http://rs.tdwg.org/dwc/terms/infraspecificEpithet"/>
        <field index="12" term="http://rs.tdwg.org/dwc/terms/taxonRank"/>
    </core>
    <extension encoding="UTF-8" linesTerminatedBy="\n" fieldsTerminatedBy="\t" fieldsEnclosedBy="" ignoreHeaderLines="1" rowType="http://rs.tdwg.org/dwc/terms/MeasurementOrFact">
        <files>
            <location>Tomakomai1463_mea.txt</location>
        </files>
        <coreid index="0"/>
        <field index="1" term="http://rs.tdwg.org/dwc/terms/measurementID"/>
        <field index="2" term="http://rs.tdwg.org/dwc/terms/measurementType"/>
        <field index="3" term="http://rs.tdwg.org/dwc/terms/measurementValue"/>
        <field index="4" term="http://rs.tdwg.org/dwc/terms/measurementAccuracy"/>
        <field index="5" term="http://rs.tdwg.org/dwc/terms/measurementUnit"/>
        <field index="6" term="http://rs.tdwg.org/dwc/terms/measurementDeterminedDate"/>
    </extension>
</archive>
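
The descriptor maps each column position in a data file to a Darwin Core term. As a rough sketch of how it could be read programmatically, the following uses the xml2 package (an assumption; the slides do not parse meta.xml) on a shortened copy of the extension element.

```r
library(xml2)

# A shortened copy of the <extension> element from meta.xml
meta_txt <- '<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <extension encoding="UTF-8" fieldsTerminatedBy="\\t" ignoreHeaderLines="1"
             rowType="http://rs.tdwg.org/dwc/terms/MeasurementOrFact">
    <files><location>Tomakomai1463_mea.txt</location></files>
    <coreid index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/measurementID"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/measurementType"/>
    <field index="3" term="http://rs.tdwg.org/dwc/terms/measurementValue"/>
  </extension>
</archive>'

meta <- read_xml(meta_txt)
xml_ns_strip(meta)  # drop the default namespace so plain XPath works

# Column index and short term name for each <field>
fields <- xml_find_all(meta, "//extension/field")
field_map <- data.frame(
  index = as.integer(xml_attr(fields, "index")),
  term  = basename(xml_attr(fields, "term")),  # last segment of the term URI
  stringsAsFactors = FALSE
)

# Name of the data file that the extension describes
data_file <- xml_text(xml_find_first(meta, "//extension/files/location"))
```

Here field_map lists which column holds which term, and data_file gives the file to read. Index 0 is the coreid column that links each measurement back to a core record.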

Data

  • Core: Occurrence data
    • Chosen because measurements are made per individual (stem)
  • Extension: Measurement or Facts
    • Tree height or diameter at breast height

Data contents

occurrenceID catalogNumber eventDate scientificName kingdom phylum class order family genus specificEpithet infraspecificEpithet taxonRank
urn:catalog:FFPRI:Tomakomai1463:00001 1 2006-06-21/2011-09-21 Fraxinus lanuginosa Koidz. f. serrata (Nakai) Murata Plantae Euphyllophytes Magnoliopsida Lamiales Oleaceae Fraxinus lanuginosa serrata forma
urn:catalog:FFPRI:Tomakomai1463:00002 2 2006-06-20/2011-09-21 Fraxinus lanuginosa Koidz. f. serrata (Nakai) Murata Plantae Euphyllophytes Magnoliopsida Lamiales Oleaceae Fraxinus lanuginosa serrata forma
urn:catalog:FFPRI:Tomakomai1463:00003 3 2006-06-21/2011-09-21 Fraxinus lanuginosa Koidz. f. serrata (Nakai) Murata Plantae Euphyllophytes Magnoliopsida Lamiales Oleaceae Fraxinus lanuginosa serrata forma
urn:catalog:FFPRI:Tomakomai1463:00004 4 2006-06-21/2016-11-02 Fraxinus lanuginosa Koidz. f. serrata (Nakai) Murata Plantae Euphyllophytes Magnoliopsida Lamiales Oleaceae Fraxinus lanuginosa serrata forma
(rows omitted)
urn:catalog:FFPRI:Tomakomai1463:03510 3510 2022-10-26/2022-10-26 Abies sachalinensis (F.Schmidt) Mast. Plantae Euphyllophytes Pinopsida Pinales Pinaceae Abies sachalinensis species
urn:catalog:FFPRI:Tomakomai1463:03511 3511 2022-10-26/2022-10-26 Abies sachalinensis (F.Schmidt) Mast. Plantae Euphyllophytes Pinopsida Pinales Pinaceae Abies sachalinensis species
CoreID measurementID measurementType measurementValue measurementAccuracy measurementUnit measurementDeterminedDate
urn:catalog:FFPRI:Tomakomai1463:00001 8556 Height 18 1 cm 2006-06-21
urn:catalog:FFPRI:Tomakomai1463:00001 9116 Height 31 1 cm 2011-09-21
(rows omitted)
urn:catalog:FFPRI:Tomakomai1463:00333 1012 Height 130 1 cm 1970-07
urn:catalog:FFPRI:Tomakomai1463:00333 1013 Diameter at Breast Height 0.5 0.1 cm 1970-07
(rows omitted)
urn:catalog:FFPRI:Tomakomai1463:03510 12571 Height 10 1 cm 2022-10-26
urn:catalog:FFPRI:Tomakomai1463:03511 12572 Height 11 1 cm 2022-10-26
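
The identifiers in both tables appear to follow the pattern urn:catalog:institutionCode:collectionCode:catalogNumber, with the catalog number zero-padded to five digits, and one occurrence record can have several measurement rows. A small sketch of both points follows; the helper name make_occurrence_id is hypothetical, and the toy rows are copied from the listing above.

```r
# Build an occurrenceID from a catalog number (hypothetical helper)
make_occurrence_id <- function(catalog_number,
                               institution = "FFPRI",
                               collection = "Tomakomai1463") {
  sprintf("urn:catalog:%s:%s:%05d", institution, collection,
          as.integer(catalog_number))
}

ids <- make_occurrence_id(c(1, 333, 3511))

# Toy measurement rows: each stem (CoreID) has more than one measurement
mea <- data.frame(
  CoreID = make_occurrence_id(c(1, 1, 333, 333)),
  measurementType = c("Height", "Height",
                      "Height", "Diameter at Breast Height"),
  measurementValue = c(18, 31, 130, 0.5),
  stringsAsFactors = FALSE
)

# Number of measurement rows per stem
n_per_stem <- table(mea$CoreID)
```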

Using the data

Use from R

  1. Read the occurrence data and the measurement data separately, then join them.

  2. Extract the required data and draw a graph.
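
The variables occ_file and mea_file used in the following code are not defined on the slides. One way to obtain them, assuming the archive has been downloaded as a zip file, is sketched here. The helper names and the archive file name are hypothetical; the data file names are those listed in meta.xml.

```r
# Paths of the two data files inside an extracted archive
dwca_files <- function(dir) {
  list(occ = file.path(dir, "Tomakomai1463_occ.txt"),
       mea = file.path(dir, "Tomakomai1463_mea.txt"))
}

# Extract a Darwin Core Archive (zip) and return the data file paths
extract_dwca <- function(zip_path, exdir = tempfile("dwca_")) {
  utils::unzip(zip_path, exdir = exdir)  # exdir is created if necessary
  dwca_files(exdir)
}

# Usage (not run; "dwca.zip" is a hypothetical file name):
# paths <- extract_dwca("dwca.zip")
# occ_file <- paths$occ
# mea_file <- paths$mea
```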

Reading the occurrence data

(occ_data <- readr::read_tsv(occ_file))
# A tibble: 3,430 × 13
   occurre…¹ catal…² event…³ scien…⁴ kingdom phylum class order family genus
   <chr>       <dbl> <chr>   <chr>   <chr>   <chr>  <chr> <chr> <chr>  <chr>
 1 urn:cata…       1 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
 2 urn:cata…       2 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
 3 urn:cata…       3 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
 4 urn:cata…       4 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
 5 urn:cata…       5 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
 6 urn:cata…       6 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
 7 urn:cata…       7 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
 8 urn:cata…       8 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
 9 urn:cata…       9 2001-1… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
10 urn:cata…      10 2006-0… Fraxin… Plantae Euphy… Magn… Lami… Oleac… Frax…
# … with 3,420 more rows, 3 more variables: specificEpithet <chr>,
#   infraspecificEpithet <chr>, taxonRank <chr>, and abbreviated variable
#   names ¹​occurrenceID, ²​catalogNumber, ³​eventDate, ⁴​scientificName
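
The eventDate column holds an interval written as two dates separated by a slash, which appears to span the first and last census in which the stem was recorded. A minimal base R sketch of splitting it follows; the variable names are made up for this example, and the sample values are taken from the data listing.

```r
# Sample eventDate values in "start/end" form
event_date <- c("2006-06-21/2011-09-21", "2022-10-26/2022-10-26")

# Split each interval at the slash
parts <- strsplit(event_date, "/", fixed = TRUE)
first_obs <- vapply(parts, `[`, character(1), 1)
last_obs  <- vapply(parts, `[`, character(1), 2)

# Years of the first and last record for each stem
first_year <- as.integer(substr(first_obs, 1, 4))
last_year  <- as.integer(substr(last_obs, 1, 4))
```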

Reading the measurement data

(mea_data <- readr::read_tsv(mea_file))
# A tibble: 12,558 × 7
   CoreID                    measu…¹ measu…² measu…³ measu…⁴ measu…⁵ measu…⁶
   <chr>                       <dbl> <chr>     <dbl>   <dbl> <chr>   <chr>  
 1 urn:catalog:FFPRI:Tomako…    8556 Height       18       1 cm      2006-0…
 2 urn:catalog:FFPRI:Tomako…    9116 Height       31       1 cm      2011-0…
 3 urn:catalog:FFPRI:Tomako…    8557 Height       18       1 cm      2006-0…
 4 urn:catalog:FFPRI:Tomako…    9117 Height       29       1 cm      2011-0…
 5 urn:catalog:FFPRI:Tomako…    8558 Height       32       1 cm      2006-0…
 6 urn:catalog:FFPRI:Tomako…    9118 Height       26       1 cm      2011-0…
 7 urn:catalog:FFPRI:Tomako…    8559 Height       46       1 cm      2006-0…
 8 urn:catalog:FFPRI:Tomako…    9119 Height       25       1 cm      2011-0…
 9 urn:catalog:FFPRI:Tomako…   10195 Height       22       1 cm      2016-1…
10 urn:catalog:FFPRI:Tomako…    8560 Height       24       1 cm      2006-0…
# … with 12,548 more rows, and abbreviated variable names ¹​measurementID,
#   ²​measurementType, ³​measurementValue, ⁴​measurementAccuracy,
#   ⁵​measurementUnit, ⁶​measurementDeterminedDate

Joining and tidying the data

Extract the height data for each stem in each census year.

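library(dplyr)    # %>%, left_join(), transmute(), filter(), if_else()
library(stringr)  # str_c(), str_sub()

height_data <-
  # Join the occurrence data onto the measurement data
  dplyr::left_join(mea_data, occ_data,
                   by = c("CoreID" = "occurrenceID")) %>%
  # Keep only the columns needed
  dplyr::transmute(catalogNumber,
                   # str_c() returns NA if genus or specificEpithet is NA
                   # (presumably stems identified only to genus)
                   Species = str_c(genus, " ", specificEpithet),
                   measurementType, measurementValue,
                   Year = str_sub(measurementDeterminedDate, 1, 4) %>%
                          as.integer()) %>%
  # Drop stems of unidentified species and keep the height records
  dplyr::filter(!is.na(Species),
                measurementType == "Height") %>%
  # Part of the 2016 census was measured in 2017, so merge it into 2016
  dplyr::mutate(Year = if_else(Year == 2017L, 2016L, Year))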
print(height_data)
# A tibble: 8,893 × 5
   catalogNumber Species             measurementType measurementValue  Year
           <dbl> <chr>               <chr>                      <dbl> <int>
 1             1 Fraxinus lanuginosa Height                        18  2006
 2             1 Fraxinus lanuginosa Height                        31  2011
 3             2 Fraxinus lanuginosa Height                        18  2006
 4             2 Fraxinus lanuginosa Height                        29  2011
 5             3 Fraxinus lanuginosa Height                        32  2006
 6             3 Fraxinus lanuginosa Height                        26  2011
 7             4 Fraxinus lanuginosa Height                        46  2006
 8             4 Fraxinus lanuginosa Height                        25  2011
 9             4 Fraxinus lanuginosa Height                        22  2016
10             5 Fraxinus lanuginosa Height                        24  2006
# … with 8,883 more rows

Plotting

Plot the maximum height of Picea jezoensis (Yezo spruce) and Abies sachalinensis (Sakhalin fir)

library(ggplot2)

plot_max_height <- height_data %>%
  # Keep Picea jezoensis (Yezo spruce) and Abies sachalinensis (Sakhalin fir)
  dplyr::filter(Species %in% c("Picea jezoensis",
                               "Abies sachalinensis")) %>%
  # Group by species and census year
  dplyr::group_by(Species, Year) %>%
  # Maximum height per species and year, converted from cm to m
  dplyr::summarise(Max_height = max(measurementValue, na.rm = TRUE) / 100,
                   .groups = "drop") %>%
  ggplot(aes(x = Year, y = Max_height, colour = Species)) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_line(linewidth = 2) +
  ylim(0, NA) + labs(y = "Max height (m)") +
  theme_grey(base_size = 14)
print(plot_max_height)

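Conclusion

  • Ecological data (tree censuses, count data, and so on) are entered and stored in many different formats from one lab to another
  • Exporting them as a Darwin Core Archive makes them usable in a common format

References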