CIEE/ICEE

LIVING DATA PROJECT STORIES

Data cleaning and standardization of ground cover data for rangelands in BC

9/3/2024

 
Data rescue intern: Jawad Sakarchi

For six weeks in the summer of 2023, I had the opportunity to participate in the LDP Data Rescue Internship. The goals of my internship centered on a very large database of Range Reference Area data collected and monitored by rangeland ecologists of the Ministry of Forests in British Columbia. The data focus on differences inside, outside, and between fenced exclosures. These exclosures act as tools to explore how disturbance in the form of grazing (e.g., by livestock or wildlife) affects ecological communities. The dataset is large, with up to 70 variables, including other disturbances, such as fire, and many biogeographic variables, such as moisture, nutrient, or ecosystem classifications.


The benefit of this dataset is that it provides both an exceptionally extensive account of disturbances across 370 sites throughout BC and observations (data, photographs, notes) from the same site separated by decades or, in the most extreme cases, even 100 years.
The Range Branch saw this dataset as extremely valuable and wanted to know what was available, both to support future data management and to provide an inventory for future ecologists who may wish to use and cite the data.

The richness of the dataset presented a challenge, because its full extent was not clear at first, particularly to an academic from outside the field. Over 100 years, as one may imagine, notes take different forms, sites are given different names, place names change, and digital data are organized in different ways, often in programs that are hard to access (e.g., Microsoft Access).

At the start of the internship, many notes had recently been scanned and digitized, alongside many Excel files, though the data and their organization had been handled by separate people. The goals for this six-week internship were to: take stock of what is available and create metadata; restructure and reorganize the data according to an efficient entity relationship diagram for future processing in a relational database management system; revise naming conventions to be more consistent; identify unnamed or ambiguous variables to create a data dictionary; identify deprecated variables; create a DOI; and build lookup tables from non-tidy data, all while documenting the process for accessibility and future directions.
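To illustrate two of these steps, consistent naming and lookup tables built from non-tidy data, here is a minimal Python sketch. It is not the project's code, which was written in R. The column names, the example values, and the helper functions `to_snake_case` and `build_lookup` are all hypothetical and chosen only for illustration.

```python
import re


def to_snake_case(name: str) -> str:
    """Normalize a column name to lower snake_case (hypothetical convention)."""
    # Replace runs of spaces, dots, and other punctuation with one underscore.
    name = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    # Split camelCase boundaries such as "pctCover" into "pct_Cover".
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return name.strip("_").lower()


def build_lookup(rows, column, id_name):
    """Move the repeated text values in `column` into a separate lookup table.

    Returns (tidy_rows, lookup_table). Each tidy row carries an integer key
    in `id_name` instead of the original text value.
    """
    keys = {}  # cleaned label -> integer key
    tidy_rows = []
    for row in rows:
        # Treat "Livestock " and "livestock" as the same category.
        label = row[column].strip().lower()
        key = keys.setdefault(label, len(keys) + 1)
        new_row = {k: v for k, v in row.items() if k != column}
        new_row[id_name] = key
        tidy_rows.append(new_row)
    lookup_table = [{id_name: key, column: label} for label, key in keys.items()]
    return tidy_rows, lookup_table


# Hypothetical raw records with inconsistent column names and values.
raw = [
    {"Site Name": "Site A", "Grazing.Type": "Livestock "},
    {"Site Name": "Site B", "Grazing.Type": "wildlife"},
    {"Site Name": "Site C", "Grazing.Type": "livestock"},
]

# Step 1: apply one naming convention to every column.
clean = [{to_snake_case(k): v for k, v in row.items()} for row in raw]

# Step 2: split the repeated category out into its own lookup table.
observations, grazing_types = build_lookup(clean, "grazing_type", "grazing_type_id")
```

After this runs, `observations` refers to each grazing type by an integer key and `grazing_types` holds one row per distinct category. This is the same separation an entity relationship diagram expresses as two linked tables.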

To achieve this, I worked primarily with Nancy Elliot, a rangeland analyst at the British Columbia Ministry of Forests Range Branch (Range, Invasive Plants, and Ecosystem Restoration). Many of the files associated with the project, such as the data dictionary, entity relationship diagram, background information, and R scripts, are now publicly available on the Open Science Framework. The metadata will give future researchers an opportunity to easily understand and use the data. Together with its documentation of decisions and next analysis steps, this project allows for reproducible management and living continuation of this rich dataset.
