
Predicted distributions of 65 groundfish species in Canadian Pacific waters

Description:

This dataset contains layers of predicted occurrence for 65 groundfish species in Canadian Pacific waters, together with overall species richness (i.e., the total number of species present) and the median standard error per grid cell across all species. The layers cover all seafloor habitat at depths between 10 and 1400 m with a mean summer salinity above 28 PSU. Two layers are provided for each species: 1) the predicted probability of occurrence (prob_occur) and 2) the probability that a grid cell is an occurrence hotspot for that species (hotspot_prob). A hotspot is a grid cell whose predicted probability of occurrence exceeds a threshold, defined as the lower of: a) 0.8, or b) the 80th percentile of the predicted probability of occurrence values across all grid cells with a probability of occurrence greater than 0.05. The first layer provides an overall prediction of the distribution of the species, while the second identifies areas where that species is most likely to be found, accounting for uncertainty in the model. All layers are provided at a 1 km resolution.

Methods:

These layers were developed using a species distribution model described in Thompson et al. 2023. This model integrates data from three fisheries-independent surveys: the Fisheries and Oceans Canada (DFO) Groundfish Synoptic Bottom Trawl Surveys (Sinclair et al. 2003; Anderson et al. 2019), the DFO Groundfish Hard Bottom Longline Surveys (Lochead and Yamanaka 2006, 2007; Doherty et al. 2019), and the International Pacific Halibut Commission Fisheries Independent Setline Survey (IPHC 2021). Further details on the methods are found in the metadata PDF available with the dataset.

Abstract from Thompson et al. 2023:

Predictions of the distribution of groundfish species are needed to support ongoing marine spatial planning initiatives in Canadian Pacific waters. Data to inform species distribution models are available from several fisheries-independent surveys. However, no single survey covers the entire region and different gear types are required to survey the range of habitats that are occupied by groundfish. Bottom trawl gear is used to sample soft bottom habitat, predominantly on the continental shelf and slope, whereas longline gear often focuses on nearshore and hardbottom habitats where trawling is not possible. Because data from these two gear types are not directly comparable, previous species distribution models in this region have been limited to using data from one survey at a time, restricting their spatial extent and usefulness at a regional scale. Here we demonstrate a method for integrating presence-absence data across surveys and gear types that allows us to predict the coastwide distributions of 66 groundfish species in British Columbia. Our model leverages the use of available data from multiple surveys to estimate how species respond to environmental gradients while accounting for differences in catchability by the different surveys. Overall, we find that this integrated method has two main benefits: 1) it increases the accuracy of predictions in data-limited surveys and regions while having negligible impacts on the accuracy when data are already sufficient to make predictions, 2) it reduces uncertainty, resulting in tighter confidence intervals on predicted species occurrences. These benefits are particularly relevant in areas of our coast where our understanding of habitat suitability is limited due to a lack of spatially comprehensive long-term groundfish research surveys.

Data Sources:

Research data were provided by Pacific Science’s Groundfish Data Unit from the GFBio database for research surveys conducted between 2003 and 2020, for all species with at least 150 observations across all available gear types and survey datasets.

Uncertainties:

These are modeled results based on species observations at sea and on predictions of the related environmental covariates, so they may not always accurately reflect real-world groundfish distributions. However, methods that integrate different data types and sources have been demonstrated to improve model inference by increasing the accuracy of predictions and reducing uncertainty.

Date ( RI_366 )
2022-10-28
Date ( RI_367 )
2023-02-27
Date ( RI_368 )
2024-09-27
RI_415
  Government of Canada; Fisheries and Oceans Canada; Pacific Science - Patrick Thompson ( Research Biologist )
Institute of Ocean Sciences 9860 West Saanich Road P.O. Box 6000 , Sidney , British Columbia , V8L 5T5 , Canada
604-999-3490
Status
completed; complété RI_593
Maintenance and update frequency
notPlanned; nonPlanifié RI_542
Keywords ( RI_528 )
  • ecosystem-based management
  • marine spatial planning
  • groundfish
  • species distribution models
  • species distribution
Government of Canada Core Subject Thesaurus Thésaurus des sujets de base du gouvernement du Canada ( RI_528 )
  • Ecosystems
  • Temperature
  • Marine biology
  • Environmental sciences
Classification
unclassified; nonClassifié RI_484
Use limitation
Open Government Licence - Canada (http://open.canada.ca/en/open-government-licence-canada)
Access constraints
license; licence RI_606
Use constraints
license; licence RI_606
Spatial representation type
vector; vecteur RI_635
Metadata language
eng; CAN
Character set
utf8; utf8 RI_458
Topic category
  • Biota
Begin date
2003-01-01
End date
2020-12-31

Supplemental Information

A full description of the methods used to develop these layers is available in Thompson et al. (in review).

Survey data - The species distribution models are based on data from three fisheries-independent scientific surveys conducted within Canadian Pacific waters: the Fisheries and Oceans Canada (DFO) Groundfish Synoptic Bottom Trawl Surveys (Trawl; Sinclair et al. 2003; Anderson et al. 2019), the DFO Groundfish Hardbottom Longline Surveys (HBLL; Lochead and Yamanaka 2006, 2007; Doherty et al. 2019), and the International Pacific Halibut Commission Fisheries Independent Setline Survey (IPHC; IPHC 2021). In all cases, we based our analyses on data from 2003 to 2020 and used presence-absence records to facilitate integration across surveys and gear types. Our analysis included all species that had at least 150 observations across all the survey datasets (to exclude species that are rarely sampled by the surveys), resulting in a final set of 66 species.

Environmental data - As environmental predictors in the models, we used log seafloor depth, bathymetric position index (BPI), substrate indices for muddiness and rockiness, log tidal speed (m s^(-1)), mean summer ocean current speed (m s^(-1)), mean summer salinity (practical salinity units; PSU), and mean summer salinity range (PSU). In our species distribution model (described below), we included all environmental predictors other than depth and BPI as linear fixed effects. We included depth and BPI as second-order polynomials to allow species occurrences to peak at intermediate values. We also included year as a penalized spline (i.e., generalized additive model; Wood 2017) to account for year-to-year variation in species’ occurrences.
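The role of the second-order polynomial terms can be illustrated with a minimal sketch. The coefficients below are hypothetical and are not estimates from the actual model; the point is only that a negative quadratic coefficient on the logit scale produces a hump-shaped response that peaks at an intermediate value of the predictor.

```python
import numpy as np

def inv_logit(eta):
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + np.exp(-eta))

def quadratic_logit(x, b0, b1, b2):
    """Linear predictor with a second-order polynomial in x."""
    return b0 + b1 * x + b2 * x ** 2

# Hypothetical coefficients (assumption for illustration only).
b0, b1, b2 = -8.0, 4.0, -0.4

# Log depth over the 10 to 1400 m range covered by the dataset.
log_depth = np.linspace(np.log(10), np.log(1400), 200)
prob = inv_logit(quadratic_logit(log_depth, b0, b1, b2))

# The vertex of the parabola marks the predictor value where occurrence peaks.
peak_log_depth = -b1 / (2 * b2)
peak_depth_m = float(np.exp(peak_log_depth))
```

With a positive quadratic coefficient the curve would instead have a minimum at the vertex, so the sign of that term determines whether a mid-range peak appears.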

Species distribution model - We fit species distribution models using Generalized Linear Mixed Effects Models with the sdmTMB package (Anderson et al. 2021). This model included the environmental variables described above as fixed effects as well as spatiotemporal random effects to account for variation in species occurrences that is not associated with the environmental predictors. In models that integrated multiple surveys (see below), we included survey type (i.e., Trawl, HBLL, IPHC) as a fixed effect to account for differences in catchability across surveys. See Thompson et al. (in review) for a full specification and details on the model.

Candidate models - For each species, we compared the predictive power of three classes of models: 1) models fit to the individual surveys separately, 2) models that integrated surveys sharing the same gear type (integrated gear models), and 3) a model that integrated both gear types and all surveys (fully integrated model). We excluded surveys with fewer than 10 occurrences of the focal species because including these surveys resulted in poor model convergence. Thus, for some species, the fully integrated model or one or both of the integrated gear models (i.e., trawl and longline) were not run. The Trawl survey was conducted in the Strait of Georgia (SoG) in 2012 and 2015, and these data are typically excluded from analyses of the broader dataset (Anderson et al. 2019) because of the differences in gear and the limited temporal coverage. Thus, we considered the SoG Trawl survey as a separate survey in our single survey models.

All models included the fixed and spatiotemporal random effects described previously, except for the SoG Trawl survey on its own, where fixed effects other than depth were dropped because more complex models failed to converge due to the low number of observations in that dataset. We integrated the surveys using a categorical fixed effect (i.e., HBLL vs. IPHC vs. Trawl) to account for methodological differences between the surveys. Thus, species responses to the environmental conditions were shared across surveys in the integrated models, but the overall probability of occurrence could vary depending on the survey. Although there are some methodological differences between the trawl surveys in the SoG and those in the other regions, we assume that these differences would have minimal effect on presence-absence data. Furthermore, it would not be possible to distinguish these effects from the spatial and environmental differences that exist between the SoG and the outer shelf, which we assume have a larger impact on species occurrences. Thus, we elected to consider all trawl surveys as a single survey in the integrated models.
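A small sketch may help show what a categorical survey effect does on the logit scale. All numbers here are hypothetical and chosen for illustration; they are not estimates from the fitted models. The environmental response is shared, and each survey only shifts the overall probability of occurrence up or down.

```python
import numpy as np

def inv_logit(eta):
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical survey offsets on the logit scale (assumption, not model output).
survey_offset = {"Trawl": 0.0, "HBLL": -1.0, "IPHC": 0.5}

# Shared environmental response at three hypothetical sites (logit scale).
env_effect = np.array([-2.0, 0.0, 1.5])

# Same environmental response, different overall level for each survey.
prob_by_survey = {
    survey: inv_logit(env_effect + offset)
    for survey, offset in survey_offset.items()
}
```

Because the offset is constant within a survey, the ranking of sites by predicted occurrence is the same for every survey; only the level changes.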

Model comparison - We compared models based on their predictive accuracy estimated using spatial block cross-validation. We used three-fold cross-validation with folds arranged sequentially over 24 blocks that divided the coastline equally from north to south. We assessed predictive accuracy using the area under the curve (AUC; Pearce and Ferrier 2000) of the withheld data in the cross-validation.
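The cross-validation layout and the AUC score can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: blocks are approximated here as equal-width latitude bands, and "arranged sequentially" is read as cycling the fold labels over consecutive blocks. The function names are hypothetical.

```python
import numpy as np

def assign_folds(latitude, n_blocks=24, n_folds=3):
    """Assign each observation to a latitude block and a cross-validation fold.

    Assumption: blocks are equal-width latitude bands numbered from north (0)
    to south (n_blocks - 1), and folds cycle over consecutive blocks.
    """
    lat = np.asarray(latitude, dtype=float)
    edges = np.linspace(lat.min(), lat.max(), n_blocks + 1)
    # Interior edges give band indices 0..n_blocks-1 counted from the south.
    band_from_south = np.digitize(lat, edges[1:-1])
    block = (n_blocks - 1) - band_from_south
    return block, block % n_folds

def auc(y_true, y_score):
    """Rank-based AUC (Mann-Whitney U statistic), using average ranks for ties."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    order = np.argsort(y_score, kind="mergesort")
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    for value in np.unique(y_score):
        tied = y_score == value
        ranks[tied] = ranks[tied].mean()
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

In use, a model would be fit to the observations in two folds, predictions made for the withheld fold, and `auc` computed on the withheld presence-absence records.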

Selection of the final model for each species was made by applying a series of decisions:

1) Exclude SoG Trawl single survey models because of their spatial coverage constraints.

2) Exclude all models with a cross-validation AUC less than 0.7 because of low predictive accuracy.

3) Select the highest level of integration within the remaining models.

4) When a fully integrated model is selected, select the survey used for making predictions (i.e., the level of the survey fixed effect) as the survey with the highest proportion of occurrences for that species, except when that proportion is greater than 98% of all sets. This exception applies to the IPHC survey for Pacific Halibut and North Pacific Spiny Dogfish, species that target the longline hooks, which results in an extremely high predicted probability of occurrence in a presence-absence model.
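The four decisions above can be expressed as a short sketch. The data structure, field names, and function names are hypothetical. Breaking ties between models at the same integration level by cross-validation AUC is an assumption, since the text does not say how such ties are resolved.

```python
from dataclasses import dataclass

# Higher value means a higher level of integration.
INTEGRATION_LEVEL = {"single": 0, "integrated_gear": 1, "fully_integrated": 2}

@dataclass
class Candidate:
    name: str
    kind: str               # one of the INTEGRATION_LEVEL keys
    cv_auc: float           # cross-validation AUC
    sog_only: bool = False  # True for the SoG Trawl single survey model

def select_model(candidates, min_auc=0.7):
    """Apply decisions 1 to 3; return None if no candidate remains."""
    kept = [c for c in candidates if not c.sog_only and c.cv_auc >= min_auc]
    if not kept:
        return None
    # Assumption: ties within an integration level are broken by AUC.
    return max(kept, key=lambda c: (INTEGRATION_LEVEL[c.kind], c.cv_auc))

def prediction_survey(occurrence_proportion, max_proportion=0.98):
    """Decision 4: choose the survey with the highest proportion of occurrences,
    skipping surveys where the species occurs in more than 98% of sets."""
    eligible = {s: p for s, p in occurrence_proportion.items() if p <= max_proportion}
    return max(eligible, key=eligible.get) if eligible else None
```

For example, `prediction_survey({"Trawl": 0.30, "HBLL": 0.55, "IPHC": 0.99})` skips IPHC because of the 98% rule and returns `"HBLL"`.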

Spatial species occurrence predictions - We made predictions of occurrence using the final selected model for each species. These predictions were made at a 1 km resolution across the full British Columbia coast for areas where seafloor depth was between 10 and 1400 meters and mean summer salinity was above 28 PSU. These values correspond to the approximate range of environmental conditions sampled in the surveys. Environmental values were obtained by averaging the 100 m resolution environmental data described previously within each 1 km grid cell. We chose a 1 km resolution because it corresponds to the average distance between the start and end coordinates of the HBLL surveys and is of the same order of magnitude as the roughly 2 km average length of a trawl. Finer scale predictions could be made if necessary (e.g., Nephin et al. 2020). However, this would require using the downscaled environmental values and assuming that a model trained on trawl data, which integrate multiple kilometers of habitat, can make predictions at finer spatial resolutions.

All species distribution predictions were made using models fit to the full data set rather than the cross-validation models. Predictions were made for 2012–2015 and then averaged across years in each grid cell. We used this time span because the SoG Trawl survey was only conducted in 2012 and 2015, so our predictions do not require extrapolation beyond the years for which we have data. Using the mean value across multiple years ensured that our predictions are not heavily influenced by year-to-year variation in predicted species occurrences. Predictions should be interpreted as the probability of catching a species in a given location if sampling using the survey identified as most appropriate in the model selection process.
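The two averaging steps described for the predictions, aggregating 100 m environmental values to 1 km cells and averaging yearly predictions per cell, can be sketched with NumPy. This is a simplified illustration that assumes the fine grid is perfectly aligned with the coarse grid and contains no missing values; the function names are hypothetical.

```python
import numpy as np

def block_mean(fine, factor=10):
    """Average a fine-resolution grid into coarser cells.

    With factor=10, each output cell is the mean of a 10 x 10 block,
    e.g. 100 m cells aggregated to 1 km cells.
    """
    fine = np.asarray(fine, dtype=float)
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

def average_years(yearly_predictions):
    """Average predictions with shape (n_years, n_rows, n_cols) across years."""
    return np.asarray(yearly_predictions, dtype=float).mean(axis=0)
```

Real rasters would also need handling for land or out-of-range cells (for example with `np.nanmean`), which is omitted here for brevity.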

We estimated model uncertainty by running 500 simulations based on draws from the joint precision matrix of our model, assuming multivariate normal parameter covariance (sensu Gelman and Hill 2007). We estimated the median standard error for each 1 km grid cell by taking the median standard error across all species. To identify species occurrence hotspots, we calculated the proportion of simulated values in each grid cell that exceeded a high occurrence threshold. This threshold was the lower of: 1) 0.8, or 2) the 80th percentile of the predicted probability of occurrence values across all grid cells that had a probability of occurrence greater than 0.05. Excluding grid cells with a probability of occurrence lower than 0.05 ensured that unsuitable habitats were not included. Using an upper bound of 0.8 for the hotspot threshold ensured that thresholds were not unreasonably high for species with very high occurrence probabilities where they are predicted to be found (e.g., > 0.999 for Longspine Thornyhead).
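The hotspot calculation described above can be sketched as follows. The function names and array layout are hypothetical, and the percentile uses NumPy's default linear interpolation, which may differ slightly from the method used to produce the published layers.

```python
import numpy as np

def hotspot_threshold(prob_occur, min_prob=0.05, percentile=80, cap=0.8):
    """Return the lower of the cap and the given percentile of predicted
    occurrence across cells with occurrence greater than min_prob."""
    prob_occur = np.asarray(prob_occur, dtype=float)
    suitable = prob_occur[prob_occur > min_prob]
    return min(cap, float(np.percentile(suitable, percentile)))

def hotspot_probability(simulations, threshold):
    """Proportion of simulated values in each cell that exceed the threshold.

    simulations has shape (n_simulations, n_cells).
    """
    return (np.asarray(simulations, dtype=float) > threshold).mean(axis=0)
```

For a species whose predicted occurrence is close to 1 wherever it is found, the 80th percentile would exceed 0.8, so the cap of 0.8 becomes the threshold.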

Species richness predictions - We predicted species richness by summing the predicted species occurrences within each grid cell across all the selected final models (sensu Ovaskainen and Abrego 2020). This value should be interpreted as the number of species that would be expected to be caught if that location were surveyed using the most appropriate survey method for each species. This assumes, for example, that if one species has a probability of occurrence of 0.6 and another species has a probability of 0.8 in a given location, then a survey would catch an average of 1.4 species at that location if it were sampled repeatedly.
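The richness calculation is a per-cell sum of occurrence probabilities. The short sketch below reproduces the two-species example from the text in its first grid cell; the other cell values are hypothetical.

```python
import numpy as np

def expected_richness(prob_occur):
    """Sum occurrence probabilities across species for each grid cell.

    prob_occur has shape (n_species, n_cells).
    """
    return np.asarray(prob_occur, dtype=float).sum(axis=0)

# Two species, three grid cells; the first cell matches the example in the text.
prob_occur = np.array([
    [0.6, 0.1, 0.0],
    [0.8, 0.3, 0.0],
])
richness = expected_richness(prob_occur)
```

Here the first cell has an expected richness of 0.6 + 0.8 = 1.4 species.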

Reference system identifier
EPSG:4326 (https://epsg.io)
RI_412
  Government of Canada; Fisheries and Oceans Canada; Pacific Science - Patrick Thompson ( Research Biologist )
Institute of Ocean Sciences 9860 West Saanich Road P.O. Box 6000 , Sidney , British Columbia , V8L 5T5 , Canada
604-999-3490
OnLine resource
Data Dictionary ( HTTPS )

Supporting Document;CSV;eng

OnLine resource
Data Dictionary ( HTTPS )

Supporting Document;CSV;fra

OnLine resource
Predicted Distributions Of 65 Groundfish Species In Canadian Pacific Waters – TIFF ( HTTPS )

Dataset;TIFF;eng

OnLine resource
References ( HTTPS )

Supporting Document;PDF;eng,fra

OnLine resource
Predicted Distributions Of 65 Groundfish Species In Canadian Pacific Waters-GIS Hub metadata ( HTTPS )

Supporting Document;PDF;eng

OnLine resource
Predicted Distributions Of 65 Groundfish Species In Canadian Pacific Waters-GIS Hub metadata ( HTTPS )

Supporting Document;PDF;fra

OnLine resource
Predicted Distributions Of 65 Groundfish Species In Canadian Pacific Waters ( ESRI REST: Map Server )

Web Service;ESRI REST;eng

OnLine resource
Predicted Distributions Of 65 Groundfish Species In Canadian Pacific Waters ( ESRI REST: Map Server )

Web Service;ESRI REST;fra

File identifier
51c60d88-c6ac-4e1c-9724-83b6048aeccd XML
Metadata language
eng; CAN
Character set
utf8; utf8 RI_458
Hierarchy level
dataset; jeuDonnées RI_622
Date stamp
2025-02-04T20:39:05.97Z
Metadata standard name
North American Profile of ISO 19115:2003 - Geographic information - Metadata
Metadata standard version
CAN/CGSB-171.100-2009
RI_415
  Government of Canada; Fisheries and Oceans Canada; Pacific Science - Emily Rubidge ( Research Scientist )
Institute of Ocean Sciences 9860 West Saanich Road P.O. Box 6000 , Sidney , British Columbia , V8L 5T5 , Canada
604-822-8419
 
 
