Estuarine microbial networks and relationships vary between environmentally distinct communities
- Published
- Accepted
- Received
- Academic Editor
- Craig Moyer
- Subject Areas
- Ecology, Microbiology, Molecular Biology, Freshwater Biology, Aquatic and Marine Chemistry
- Keywords
- Microbes, Relationships, Networks, Estuaries, Temperature, Nutrients, Amplicon metabarcoding, Prokaryotes, Protists
- Copyright
- © 2022 Anderson and Harvey
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
- Cite this article
- 2022. Estuarine microbial networks and relationships vary between environmentally distinct communities. PeerJ 10:e14005 https://doi.org/10.7717/peerj.14005
Abstract
Microbial interactions have profound impacts on biodiversity, biogeochemistry, and ecosystem functioning, and yet, they remain poorly understood in the ocean and with respect to changing environmental conditions. We applied hierarchical clustering of an annual 16S and 18S amplicon dataset in the Skidaway River Estuary, which revealed two similar clusters for prokaryotes (Bacteria and Archaea) and protists: Cluster 1 (March-May and November-February) and Cluster 2 (June-October). We constructed co-occurrence networks from each cluster to explore how microbial networks and relationships vary between environmentally distinct periods in the estuary. Cluster 1 communities were exposed to significantly lower temperature, sunlight, NO3, and SiO4; only NH4 was higher at this time. Several network properties (e.g., edge number, degree, and centrality) were elevated for networks constructed with Cluster 1 vs. 2 samples. There was also evidence that microbial nodes in Cluster 1 were more connected (e.g., higher edge density and lower path length) compared to Cluster 2, though opposite trends were observed when networks considered Prokaryote-Protist edges only. The number of Prokaryote-Prokaryote and Prokaryote-Protist edges increased by >100% in the Cluster 1 network, mainly involving Flavobacteriales, Rhodobacterales, Peridiniales, and Cryptomonadales associated with each other and other microbial groups (e.g., SAR11, Bacillariophyta, and Strombidiida). Several Protist-Protist associations, including Bacillariophyta correlated with Syndiniales (Dino-Groups I and II) and an Unassigned Dinophyceae group, were more prevalent in Cluster 2. Based on the type and sign of associations that increased in Cluster 1, our findings indicate that mutualistic, competitive, or predatory relationships may have been more representative among microbes when conditions were less favorable in the estuary; however, such relationships require further exploration and validation in the field and lab. Coastal networks may also be driven by shifts in the abundance of certain taxonomic or functional groups. Sustained monitoring of microbial communities over environmental gradients, both spatial and temporal, is critical to predict microbial dynamics and biogeochemistry in future marine ecosystems.
Introduction
Over the last two decades, the expansion of high-throughput environmental sequencing, or amplicon metabarcoding, has improved our ability to monitor Bacteria, Archaea, and protists in the ocean (Sogin et al., 2006; Xia, Guo & Liu, 2017; Caron & Hu, 2019; Santoferrara et al., 2020; Burki, Sandin & Jamy, 2021). Amplicon surveys have informed marine microbial dynamics, revealing spatial trends in biodiversity (Ibarbalz et al., 2019), recurrent seasonal patterns in composition (Ward et al., 2017; Chafee et al., 2018; Giner et al., 2019), and more accurate representation of rare or cryptic organisms, especially among protists (Burki, Sandin & Jamy, 2021). These monitoring efforts have also reinforced the importance of environmental and biological factors in structuring microbial communities (Fuhrman, Cram & Needham, 2015; Logares et al., 2020). For instance, several environmental factors, like temperature, nutrients, sunlight, and water depth, consistently influence microbes on regional to global scales (Gilbert et al., 2012; Sunagawa et al., 2015). Microbes are also driven by interactions that they have with each other, with such relationships being subject to environmental stressors (Piccardi, Vessman & Mitri, 2019; Hernandez et al., 2021).
Microbial interactions underpin ocean ecosystem functioning, influencing carbon transfer, nutrient cycling, and organic matter transformation (Azam et al., 1983; Worden et al., 2015). Microbes are highly interconnected, exhibiting a range of interactions, from parasitism and predation to symbiosis and mutualism, all differentially influencing community diversity and biogeochemical cycles (Worden et al., 2015). Despite their importance, many interactions remain unresolved. A recent literature review of ∼2500 microbial interactions found that 14% were ambiguous across aquatic environments (Bjorbækmo et al., 2020). Realistically, the number of unresolved interactions in nature is likely much higher, owing to the large functional diversity of microbes, the challenges in culturing them in the lab, and the lack of tools available to examine such interactions at scale and under high taxonomic resolution (Krabberød et al., 2017). Given the important role of microbial interactions in shaping biodiversity and ecosystem functioning, it remains critical to better resolve these interactions and how they may shift over time and space and under different environmental conditions.
Amplicon surveys coupled with co-occurrence network analysis represent a powerful method to infer microbial relationships, establishing significant correlations between amplicon sequence variants (ASVs) based on relative abundance data (Röttjers & Faust, 2018; Faust, 2021). Co-occurrence networks have been conducted across marine systems, revealing globally important microbial associations that may reflect interactions, like parasitism or symbiosis (Lima-Mendez et al., 2015), and providing context for phytoplankton bloom dynamics, species succession, and biogeochemistry (Needham & Fuhrman, 2016; Needham, Sachdeva & Fuhrman, 2017; Bolaños et al., 2021). More recently, a reanalysis of Tara Ocean data revealed that global networks were structured by environmental (poleward) niches and influenced by factors like temperature, salinity, and nutrients (Chaffron et al., 2021). Together, these studies emphasize the importance of microbial networks to ecosystem functioning and food web dynamics, as well as their potential susceptibility to environmental changes.
A universal challenge in constructing microbial networks is how to deal with environmental factors that are known to strongly influence communities (Faust, 2021). The most common strategy to assess environmental impact has been to aggregate samples into a single network, including environmental variables as additional nodes that can be correlated to ASVs (Fuhrman, Cram & Needham, 2015; Needham, Sachdeva & Fuhrman, 2017). However, this approach often results in fewer environmental network correlations compared to those between species (Gilbert et al., 2012; Chow et al., 2014). Another strategy involves separating ASV tables into groups, for instance based on temporal (seasons or years) or spatial scales (water depth), and comparing resulting networks (Lima-Mendez et al., 2015; Milici et al., 2016; Kellogg et al., 2019; Lambert et al., 2021). Networks can also be separated by binning nodes (Röttjers & Faust, 2018; Chaffron et al., 2021) but tools are limited (Faust, 2021) and have not been well applied to marine samples.
Another potential method is to separate samples for network analysis a priori based on community composition (beta diversity). Separating samples in this manner may be relevant for estuarine or other coastal microbial communities that are seasonally structured and sensitive to anthropogenic impacts (Fuhrman et al., 2006; Chafee et al., 2018). Many estuaries along the southeastern U.S., including the Skidaway River Estuary (GA, U.S.), were once considered pristine and are now threatened by habitat transformation and nutrient loading (Verity, Alber & Bricker, 2006). Though well mixed from semidiurnal tides, long-term monitoring in the Skidaway River has revealed increased nutrient input correlated to increased abundance of heterotrophic bacteria and most plankton groups (Verity & Borkman, 2010). Continued anthropogenic input of nutrients in this region (and others) may enhance warming, eutrophication, and habitat loss, with implications for fisheries and ecosystem management (Verity, Alber & Bricker, 2006). Identifying and characterizing microbial relationships that occur over environmentally distinct periods may provide insight into their long-term dynamics in changing coastal habitats.
The aim of our study was to assess how different environmental conditions influence relationships among microbes (Bacteria, Archaea, and protists) in the Skidaway River Estuary. We performed 16S and 18S amplicon surveys on a weekly-monthly basis (33 days) and constructed microbial networks with co-occurrence analysis. Instead of binning samples categorically, we separated ASV tables for network analysis based on hierarchical clustering of beta diversity. This approach revealed two distinct communities, similar for prokaryotes and protists, representative of contrasting environmental conditions, mainly temperature and nutrients. Despite the same initial number of ASVs used in each network, we observed differences in network properties (e.g., degree, centrality, and number of edges) between clusters, as well as changes in the types of associations that occurred. The response of microbial relationships to anthropogenic threats and changing conditions remains unclear (Caron & Hutchins, 2013), and so, it is critical to better understand how current networks vary between environmentally distinct periods.
Materials & Methods
Sample collection
Surface water samples (1 m) were collected weekly to monthly from March 2017 to February 2018 in the Skidaway River Estuary (latitude, 31°59′25.7′N; longitude, 81°01′19.7′W), encompassing 33 sampling days. For consistency between weeks, sampling always occurred at high tide. Water samples were collected with a 5-L Niskin bottle, filtered on site through 200-µm mesh (to exclude zooplankton) into a 20-L carboy, and transferred to a nearby lab for processing. Samples (250–1000 ml) were filtered in triplicate from the 20-L carboy through 47-mm, 0.22-µm polycarbonate filters (Millipore) using vacuum filtration. Filters were stored at −80 °C. On three days (8/30, 10/11, and 11/21), only two biological replicates were filtered.
Surface temperature, salinity, and dissolved oxygen were measured using a YSI (600S sonde). Solar radiation data was collected from a nearby land-based site on Skidaway Island. Triplicate chlorophyll samples (50–100 ml) were filtered onto 0.7-µm GF/F filters, extracted in 91% ethanol, and measured on a Turner AU10 fluorometer (Graff & Rynearson, 2011). Dissolved nutrients (NO3, NH4, PO4, and SiO4) were measured via a Technicon AutoAnalyzer (SEAL Analytical), while particulate organic carbon (POC) and nitrogen (PON) were measured using a Thermo Flash elemental analyzer (Bittar et al., 2016; Anderson & Harvey, 2019). Dissolved nutrients and POC/PON were not measured on 9/6; these samples were not considered in the constrained ordination.
PCR conditions and DNA sequencing
The DNeasy PowerSoil kit (Qiagen) was used to extract DNA following manufacturer’s protocols. DNA samples were eluted in 10 mM Tris–HCl (pH = 8.5). DNA concentrations were estimated with the Qubit dsDNA HS kit (Thermo Scientific) and ranged from 2–5 ng µl−1 per sample. A two-step PCR approach was employed with two different primer sets, targeting prokaryotes (16S rRNA) or protists (18S rRNA). We used the following primers to target the V4 region of the 18S rRNA gene (Stoeck et al., 2010): forward (5′-CCAGCASCYGCGGTAATTCC-3 ′) and reverse (5′-ACTTTCGTTCTTGATYRA-3′). For 16S, primers targeted the V4–V5 region (Parada, Needham & Fuhrman, 2016): forward (5′-GTGYCAGCMGCCGCGGTAA-3′) and reverse (5′-CCGYCAATTYMTTTRAGTTT-3′). Illumina adapters were attached to each target-specific primer region. 18S PCR conditions involved an initial denaturation step at 98 °C for 2 min, 10 cycles of 98 °C for 10 s, 53 °C for 30 s, and 72 °C for 30 s, followed by 15 cycles of 98 °C for 10 s, 48 °C for 30 s, and 72 °C for 30 s, and a final extension of 72 °C for 2 min (Stoeck et al., 2010; Hu et al., 2015). 16S PCR conditions consisted of an initial denaturation of 95 °C for 2 min, 25 cycles of 95 °C for 45 s, 50 °C for 45 s, and 68 °C for 90 s, followed by a final elongation step of 68 °C for 5 min (Parada, Needham & Fuhrman, 2016). PCR products were purified and size-selected using AMPure XP Beads (A63881; Beckman Coulter). A second PCR step was carried out by attaching dual Illumina indices (P5 and P7) and adapters to template DNA using the Nextera XT Index Kit. Two separate sequencing runs were performed using an Illumina MiSeq (2 ×250 bp for 18S; 2 ×300 bp for 16S) at the Georgia Genomics and Bioinformatics Core at the University of Georgia.
Bioinformatics
Demultiplexed 16S and 18S FASTQ files were imported and processed separately in QIIME 2 (Bolyen et al., 2019). Amplicon sequence variants (ASVs) were inferred with paired-end DADA2 (Callahan et al., 2016). Truncation lengths of the forward and reverse reads were defined based on read-quality profiles; otherwise, default DADA2 parameters were used. Protist taxonomy was inferred using QIIME 2-compatible files from the Protist Ribosomal Reference (PR2) database (Version 4.12.0; Guillou et al., 2013), while prokaryotic taxonomy was assigned using the SILVA database (Version 138; Pruesse et al., 2007). For both gene regions, a Naïve Bayes Classifier was used to train the sequences against reference databases using the feature-classifier plugin in QIIME 2 (Bokulich et al., 2018). QIIME 2 taxonomy and count table artifact files (.qza files) were imported into R (Version 3.6.3; R Core Team, 2020) using the read_qza function from the qiime2R package (https://github.com/jbisanz/qiime2R).
Sequences were deposited at the Sequence Read Archive of the National Center for Biotechnology Information (NCBI) and made publicly available under accession numbers PRJNA575563 (18S) and PRJNA680039 (16S). R code used for data analysis, including a full list of R packages, is on GitHub (https://github.com/sra34/SkIO-network). This project has been archived on Zenodo (https://doi.org/10.5281/zenodo.6549350).
Statistical analyses
Community dynamics were investigated separately for 16S and 18S datasets in R, using packages including phyloseq (McMurdie & Holmes, 2013), vegan (Oksanen et al., 2018), and tidyverse (Wickham et al., 2019). Average values presented in the text refer to the mean. To focus on protists, we removed eukaryotes within Metazoa and Streptophyta that were amplified with the 18S primers. Unassigned reads at the supergroup level for protists (PR2 Rank 2) or domain level for prokaryotes (SILVA Rank 1) were also removed, as well as prokaryotic reads assigned to chloroplasts or mitochondria. After filtering, there were 51,770 sequence reads on average across samples for protists (16,857–84,369), assigned to 8,700 ASVs. There were 81,165 sequences on average for prokaryotes (47,250–177,521), assigned to 15,716 ASVs. Two samples in the 16S dataset were removed (3/16 B and 9/20 C) due to low sequence read numbers (94 total samples for 16S vs. 96 for 18S). Rarefaction curves were generated using the R package ranacapa (Kandlikar et al., 2018). Unfiltered taxonomic assignments and read counts for microbial ASVs are provided in Table S1.
To assess community dynamics, singletons were removed (except for alpha diversity) and samples were rarefied to the minimum read count for 16S (47,249) and 18S tables (16,849), respectively (Weiss et al., 2017). Community composition was correlated to environmental variables with distance-based redundancy analysis (dbRDA; Oksanen et al., 2018). Constrained ordinations were run with unweighted UniFrac distance matrices and log-transformed environmental factors. Environmental variables that were significant to the ordination (ANOVA, p-values <0.05) were identified using the ordistep function (both directions) in vegan with 999 permutations (Oksanen et al., 2018). After a final dbRDA run, significant variables were added to the ordination as arrows. Hierarchical clustering (Ward’s method) of UniFrac distances was performed using the hclust function in vegan. The optimal number of clusters was evaluated based on average silhouette widths, a measure of the similarity between each sample and its cluster compared to its similarity to other clusters (Rousseeuw, 1987).
Group-specific 16S and 18S relative abundance was assessed over the year and local regression (loess) curves were applied to visualize temporal trends using the geom_smooth function in ggplot2 (Wickham et al., 2019). Relative abundances were correlated with environmental variables using Spearman rank correlations (R), considering only variables that were significant to the dbRDA. We focused on the most relatively abundant prokaryotes and protists in the dataset at the order level (>2% on average). Observed richness and Shannon diversity were estimated with the estimate_richness function in phyloseq (McMurdie & Holmes, 2013). Singletons were considered for diversity estimates to account for rare microbes. Shapiro–Wilks’s normality tests were applied to the diversity data, whereafter either paired t-tests (richness) or Wilcoxon tests (Shannon) were used to compare mean diversity or richness between clusters. Similar comparative tests (Wilcoxon or t-test) were performed for environmental variables, after checking how each variable was distributed (Shapiro–Wilks).
Covariance networks
Microbial association networks were constructed for separate clusters using the SParse Inverse Covariance estimation for Ecological Association and Statistical Inference (SPIEC-EASI; Version 1.1.0) package in R (Kurtz et al., 2015). SPIEC-EASI uses ASV count tables as input and computes an inverse covariance matrix, using conditional independence to infer direct associations (Kurtz et al., 2015; Röttjers & Faust, 2018). The program also supports merging ASV tables across gene marker regions, an approach that has been tested previously to investigate cross-domain associations (Tipton et al., 2018). SPIEC-EASI aims to be robust to the compositional nature of amplicon data and aims to infer sparse networks that are more conservative against false-positive or indirect edges (Kurtz et al., 2015).
Networks constructed with too many edges can result in “hairball” networks that are difficult to interpret and may yield ambiguous relationships (Röttjers & Faust, 2018). Therefore, ASV tables were filtered for network analysis to include the top 150 most abundant (based on sequence reads) 16S and 18S ASVs (300 total ASVs) per cluster. Covariance networks were constructed with the spiec.easi function using ASV count tables as input (with matching sample IDs) and the “mb” (Meinshausen–Buhlmann) neighborhood selection setting. Two samples that were excluded from the 16S dataset due to low read numbers (3/16 B and 9/20 C) were also filtered from the 18S set at this stage to support merging of ASV tables. The Stability Approach to Regularization Selection (StARS) was used to select the optimal sparsity parameter with a threshold set to 0.05 (Liu, Roeder & Wasserman, 2010). The spiec.easi function performs centered log-ratio (clr) transformation of ASV count tables, eliminating the need to pre-transform abundance data (Tipton et al., 2018). SPIEC-EASI outputs a correlation matrix of positive and negative values (weights) for all significantly paired edges.
Networks were visualized in Cytoscape (Shannon et al., 2003). The total number of positive and negative edges were compared between networks for domain level associations (e.g., Protist-Protist, Prokaryote-Protist, and Prokaryote-Prokaryote), as well as the most prevalent types of order level (SILVA Rank 4; PR2 Rank 5) relationships within each broader category. Topological features were measured using the NetworkAnalyzer plugin (Assenov et al., 2008), analyzing networks overall or based on the major domain level pairings. Edge density and average path length were measured for each network, which in this case, referred to the fraction of realized to potential microbial edges and the average distance between any two microbial nodes (Faust & Raes, 2012). Degree and closeness centrality were estimated for each node (or ASV). Degree refers to the number of edges connected to a given microbe (represented as nodes), whereas closeness centrality indicates the proximity of a given microbe to all other microbes in the network; higher centrality indicates a greater contribution to network connectivity (Röttjers & Faust, 2018). Centrality and degree data were non-normal (Shapiro–Wilks) and compared at the domain level and for the most relatively abundant class level groups (SILVA Rank 3; PR2 Rank 4) over the year (Wilcoxon tests).
Results
Several environmental variables fluctuated over the year (Table S2), most notably temperature, SiO4, and NO3, all peaking in June-October. Temperature and SiO4 were strong covariates in the dataset (Spearman R = 0.86, p-value <0.001). Temperature also covaried with NO3 and PO4 (both with Spearman R = 0.48, p-values <0.01), as well as NH4 (Spearman R = − 0.73, p-value <0.001). Other factors like PO4, salinity, POC, and PON were less variable, while NH4 peaked in December-February (Table S2). Chlorophyll (<200 µm) ranged from 1.32–6.39 µg L−1, varying more greatly between sampling intervals from March-August (Table S2).
Environmental impact on prokaryotes and protists
For both prokaryotes and protists, the number of read counts vs. ASVs was saturated across samples (Fig. S1). Several dominant prokaryotic and protist groups were temporally variable and significantly correlated with environmental variables, particularly temperature, NO3, and SiO4 (Figs. 1; 2). Temporally variable prokaryotes included Actinomarinales, which peaked in abundance in June-October and was strongly correlated to temperature (Spearman R = 0.8, p-value <0.001), as well as Flavobacteriales, Rhodobacterales, and Oceanospirillales that contributed most to relative abundance in March-May or November-February and were negatively correlated to temperature (Spearman R = − 0.59 to −0.8, p-values <0.05; Figs. 1A–1B). SAR11 Clade, the most abundant prokaryotic group on average over the year, became most relatively abundant in November-February (Fig. 1A); however, SAR11 abundance was not significantly correlated to any environmental factor tested (Fig. 1B).
Figure 1: Group-specific 16S relative abundance over the year and correlation to environmental variables.
(A) Relative abundance (%) of the top ten most relatively abundant order level prokaryotes over the year (>2% relative abundance). Local regression (loess) curves represent smoothed trends (black lines) and 95% confidence intervals (shaded gray). Abundance data is shown in triplicate or duplicate (8/30, 10/11, and 11/21) and samples are grouped by month on the x-axis. (B) Spearman correlations between group-specific prokaryotic relative abundance and environmental variables. Only significant correlations are shown (Spearman R, p-values < 0.05), with the sign of correlation indicated by a red (negative) to blue (positive) color gradient. Stronger correlations are represented by darker colors. White boxes indicate no significant correlation.Several protist groups, like Unassigned Dinophyceae and Dino-Groups I and II (Syndiniales), were most relatively abundant in June-October and positively correlated to temperature (Spearman R = 0.68−0.83, p-values <0.001; Figs. 2A–2B). Other groups, like Peridiniales, Gymnodiniales, and Cryptomonadales, peaked in either March-May or November-February and were negatively correlated to temperature (Spearman R = − 0.42 to −0.56, p-values <0.05; Figs. 2A–2B). Bacillariophyta and Mamiellales were the most abundant protist groups on average over the year (Fig. 2A). Bacillariophyta relative abundance was lowest in March-May and positively correlated with SiO4 (Spearman R = 0.36, p-values <0.05; Figs. 2A–2B). Group-specific abundances of Mamiellales, Strombidiida, and Choreotrichida were consistent and not correlated with any factors (Figs. 2A–2B). Microbial groups that were positively (or negatively) correlated with temperature were inversely correlated with NH4 (Figs. 1B; 2B).
Figure 2: Group-specific 18S relative abundance over the year and correlation to environmental variables.
(A) Relative abundance (%) of the top ten most relatively abundant order level protists. (B) Spearman correlations (Spearman R, p-values < 0.05) between group-specific protist relative abundance and environmental variables. Other details are identical to Fig. 1.Hierarchical clustering to distinguish separate communities
Two clusters were determined to be optimal for each dataset based on plots of average silhouette widths, here referred to as Clusters 1 and 2 (Fig. S2). Hierarchical clustering revealed a more separated cluster of samples from March-May and November-February (Cluster 1) and a tightly grouped set of samples from June-October (Cluster 2; Figs. S3–S4). 16S and 18S samples were clustered in the same manner (Fig. 3). Distance-based redundancy analysis (dbRDA) revealed temporal variability in microbial composition, with significant environmental variables explaining 58% (18S) and 63% (16S) of the variance from the sum of the first two axes (Figs. 3A–3B). Temperature was among the strongest variables significantly influencing the ordination (ANOVA, p-value = 0.01), distinguishing Cluster 2 samples from Cluster 1 (Fig. 3). Other factors like SiO4 and NO3 were also significant constraints on the dbRDA (ANOVA, p-values <0.01) and distinguished Cluster 2 samples (Fig. 3). In contrast, lower NH4 was a significant explanatory factor (ANOVA, p-value = 0.01) for the change in composition from Cluster 1 to 2 samples (Fig. 3).
Figure 3: Microbial beta diversity in the estuary.
Ordination via distance-based redundancy analysis (dbRDA) for prokaryotes (A) and protists (B). Samples are shown in triplicate (or duplicate) and colored by month using a viridis color gradient. Environmental factors are represented as biplot arrows. Temp = temperature (°C); SiO4 = silicate (µM); NO3 = nitrate (µM); PO4 = phosphate (µM); NH4 = ammonium (µM); Chl = chlorophyll (µg L −1); Sal = salinity (psu); Solar = solar radiation (MJ m −2); POC/PON = particulate organic carbon or nitrogen (µg C or N L−1). Sample shapes indicate separate clusters identified via hierarchical clustering (Cluster 1 = circles; Cluster 2 = triangles).Cluster 1 and 2 microbial communities experienced different environmental conditions (Table 1). For example, temperature, sunlight, and nutrients (except for NH4) were significantly higher (Wilcoxon or paired t-test, p-values <0.05) in Cluster 2 vs. 1 (Table 1). Mean observed protist richness and Shannon diversity were not significantly different between clusters (Wilcoxon or paired t-test, p-values >0.2; Table 1), though for prokaryotes, both richness and diversity were significantly higher (Wilcoxon or paired t-test, p-values <0.01) in Cluster 2 compared to 1 (Table 1).
| Variables | Cluster 1 | Cluster 2 | 
|---|---|---|
| Temp** | 16.43 (5.04) | 28.39 (1.99) | 
| SiO4** | 44.83 (13.97) | 105.79 (20.90) | 
| NO3** | 0.44 (0.29) | 0.98 (0.46) | 
| PO4** | 0.56 (0.12) | 0.67 (0.11) | 
| NH4** | 2.4 (1.4) | 1 (0.93) | 
| Chl | 3.15 (1.26) | 3.79 (1.43) | 
| Sal | 29.22 (1.44) | 29.36 (1.55) | 
| Solar* | 9.57 (5.69) | 13.83 (3.66) | 
| POC | 60.27 (30.12) | 62.83 (14.67) | 
| PON | 10.8 (4.41) | 11.41 (3.4) | 
| 18S richness | 488.15 (124.71) | 500.78 (77.22) | 
| 18S Shannon | 4.3 (0.48) | 4.47 (0.28) | 
| 16S richness** | 221.15 (67.07) | 271.73 (47.31) | 
| 16S Shannon** | 4.07 (0.52) | 4.46 (0.18) | 
Network analysis of sample clusters
SPIEC-EASI was used to identify significant statistical correlations between ASVs (based on their abundance), with such associations (or edges) indicating potential relationships between microbes. These putative relationships may or may not reflect true biological interactions. Cluster 1 samples formed a network with 1086 edges (66% positive) between 300 ASVs, while the Cluster 2 network revealed 707 edges (59% positive) between 297 ASVs (Table 2; Fig. S5). Three ASVs were not connected to any other ASV in the Cluster 2 network analysis. Network nodes were represented by similar order level microbial groups (56% shared), while nodes were often different at the ASV level (20% shared) between clusters (Table S3). For both networks, only a handful of archaeal ASVs (five and six) were considered as nodes in the network analysis (Table S3; Fig. S5), reflecting their lower abundance in the 16S dataset (<2% on average) compared to Bacteria (Table S1). Protist-Protist associations contributed most to the overall number of edges in each network, followed by Prokaryote-Prokaryote and Prokaryote-Protist associations (Table 2). The number of Prokaryote-Prokaryote and Prokaryote-Protist associations increased by >100% in Cluster 1 (Table 2). For both networks, nearly half of Protist-Protist and Prokaryote-Protist edges were positive (47–58%); however, Prokaryote-Prokaryote edges were 84% and 87% positive in Cluster 1 and 2 networks, respectively (Table 2).
| Relationship type | Network | Number of edges | Number of nodes | Edge density | Average path length | 
|---|---|---|---|---|---|
| Total | Cluster 1 | 1086 (66%) | 300 | 0.02 | 3.3 | 
| Cluster 2 | 707 (59%) | 297 | 0.02 | 4.3 | |
| Protist-Protist | Cluster 1 | 501 (58%) | 150 | 0.05 | 3 | 
| Cluster 2 | 433 (52%) | 150 | 0.04 | 3.2 | |
| Prokaryote-Prokaryote | Cluster 1 | 340 (84%) | 150 | 0.03 | 3.6 | 
| Cluster 2 | 167 (87%) | 147 | 0.02 | 6.5 | |
| Prokaryote-Protist | Cluster 1 | 245 (58%) | 228 | 0.01 | 8.4 | 
| Cluster 2 | 107 (47%) | 150 | 0.17 | 3 | 
Edge density was slightly higher (and average path length lower) in Cluster 1 for the overall network or when networks were analyzed for each domain level pairing (Table 2). The exception were Prokaryote-Protist edges, which exhibited higher edge density and lower average path length in Cluster 2 (Table 2). Mean degree and closeness centrality were significantly higher (Wilcoxon tests, p-values <0.001) in Cluster 1 vs. 2 networks across all 16S or 18S ASVs used in the analysis (Figs. 4A–4B; Table S3). This pattern was conserved among the most relatively abundant microbial groups at the class level (Figs. S6– S7). Mean degree was not significantly different between network clusters for Mamiellophyceae and Syndiniales (Fig. S7).
Figure 4: Network properties between clusters.
Faceted box plots displaying mean network degree (A) and closeness centrality (B) for prokaryotes and protists included in Cluster 1 (blue) and 2 (orange) networks. Degree and centrality were significantly different (Wilcoxon tests, **** = p-values < 0.0001) between clusters for prokaryotes and protists. Microbial ASVs (or nodes) within each network are shown as individual points (n = 150 per box plot). See Table S3 for raw data.Though represented by similar order level microbes, the most prominent types of edges (both positive and negative) varied between networks (Fig. 5; Table S4), often becoming more prevalent in Cluster 1. Common Prokaryote-Prokaryote edges (mostly positive) that were higher in Cluster 1 vs. 2 included Flavobacteriales-SAR11, Flavobacteriales-Rhodobacterales, and Flavobacteriales-Flavobacteriales (Fig. 5A). Several Protist-Protist edges were higher in Cluster 2, including Bacillariophyta associated with Unassigned Dinophyceae, Dino-Groups I and II, Gonyaulacales, and Tintinnida (Fig. 5B). In contrast, associations between Bacillariophyta and Mamiellales, Cryptomonadales, and Peridiniales were elevated in Cluster 1 (Fig. 5B). Flavobacteriales-Bacillariophyta was the most prevalent cross-domain relationship in both networks and increased in Cluster 1 (Fig. 5C). Several cross-domain edges that were common in Cluster 1, including Rhodobacterales-Gymnodiniales, Rhodobacterales-Bacillariophyta, and Flavobacteriales-Strombidiida were not observed in the Cluster 2 network (Fig. 5C).
Figure 5: Differences in the number of order level microbial associations between clusters.
(A) Prokaryote-Prokaryote, (B) Protist-Protist, (C) and Prokaryote-Protist relationships (faceted at order level) that were most prevalent (top ten) across both networks. Number of positive (green) and negative (purple) network edges for each relationship are compared between clusters. Relationships that are absent from a given network were not detected. 16S abbreviations: Actino = Actinomarinales; Betapro = Betaproteobacteriales; Flavo = Flavobacteriales; Micro = Micrococcales; Puni = Puniceispiralles; Rhodo = Rhodobacterales; SAR11 = SAR11 Clade; SAR86 = SAR86 Clade. 18S abbreviations: Bacill = Bacillariophyta; Crypto = Cryptomonadales; DGI = Dino-Group I; DGII = Dino-Group II; Mamiell = Mamiellales; Gonyau = Gonyaulacales; Gymno = Gymnodiniales; Peridin = Peridiniales; Strom = Strombidiida; Tintinn = Tintinnida; Udino = Unassigned Dinophyceae. See Table S4 for a full list of network correlations.Discussion
Marine microbes interact with each other in a multitude of ways, influencing food web dynamics and the flow of energy in the ocean (Worden et al., 2015). Co-occurrence network analysis of amplicon metabarcoding data has become a widely used and powerful approach to characterize associations between microorganisms (Faust & Raes, 2012; Faust, 2021). Though providing ecological insight, network associations do not necessarily translate to causal interactions (Blanchet, Cazelles & Gravel, 2020). For instance, a positive (or negative) association between microbes may reflect an overlapping (or separate) niche (Deutschmann et al., 2021). Relationships generated from amplicon data should be verified by experimental testing and reinforced by searching for such interactions in primary literature and interaction databases (Bjorbækmo et al., 2020). Despite these drawbacks, network findings can be used to form (or build) hypotheses (Santoferrara et al., 2020) that can be further tested in culture or in the field, advancing our knowledge of microbial interactions.
In our study, samples for network analysis were partitioned based on hierarchical clustering of 18S and 16S amplicon data, resulting in two environmentally distinct sample sets: Cluster 1 = March-May/November-February vs. Cluster 2 = June-October. Several network properties, like network centrality, degree, and edge number, were higher in Cluster 1, despite networks having the same initial number of 16S (150) and 18S (150) ASVs (300 total) and being mainly (56%) represented by similar order level groups. Sample clusters reflected different environmental conditions, namely temperature, sunlight, NO3, and SiO4 (all lower in Cluster 1) that may have influenced network structure and microbial relationships. Temperature, sunlight, and nutrients are universal determinants of microbial metabolism, growth (or grazing), and community structure (Hutchins & Fu, 2017; Logares et al., 2020). Microbial physiology is often thought to scale with temperature in particular, though responses are species-specific and depend on other available resources, like nutrients (Sarmento et al., 2010; Barton & Yvon-Durocher, 2019). Applying these principles may also be challenging for microbial relationships, which are highly dynamic and vary over time and space in the ocean (Chaffron et al., 2021). Nevertheless, it remains important to better understand marine microbial networks and relationships, especially when considering the response of microbes to changing ecosystems.
Microbial relationships vary between environmentally distinct periods
Prokaryote-Prokaryote and Prokaryote-Protist edges increased in number by >100% in Cluster 1, indicating these types of relationships may have been preferable in the estuary when conditions were less favorable for microbial growth and production (lower temperature, sunlight, and nutrients). Interestingly, ∼85% of Prokaryote-Prokaryote edges were positive, which may represent mutualism among bacteria. This would be consistent with the stress gradient hypothesis, where facilitative associations among organisms increase under stressful conditions (Bertness & Callaway, 1994; He & Bertness, 2014). Prokaryote-Prokaryote associations often involved SAR11 and either Rhodobacterales or Flavobacteriales, the latter two being important for processing phytoplankton-derived organic carbon (Buchan et al., 2014). Oligotrophic (and genetically streamlined) microorganisms like SAR11 may rely on copiotrophs to assimilate sources of carbon or other metabolites needed for growth (Buchan et al., 2014; Giovannoni, Thrash & Temperton, 2014). In addition to lower temperature or sunlight, bacteria may have been impacted by depleted carbon sources. Diatoms were least relatively abundant in March-May (Cluster 1), and while winter-spring blooms can occur, phytoplankton biomass is typically lower at this time in the estuary (Verity & Borkman, 2010; Anderson & Harvey, 2019). Therefore, increased mutualism among heterotrophic bacteria may represent strategies to exchange carbon, vitamins, or other metabolites and maximize growth when resources are otherwise limited and thermal conditions are less favorable.
The number of Prokaryote-Protist edges (both positive and negative) also increased in Cluster 1, especially heterotrophic bacteria (Flavobacteriales and Rhodobacterales) associated with diatoms, dinoflagellates, and ciliates. Diatom-bacteria associations are well known in the marine environment, spanning antagonistic, competitive, mutualistic, and symbiotic relationships (Amin, Parker & Armbrust, 2012; Moran et al., 2012; Amin et al., 2015; Seymour et al., 2017). Diatoms in our study were largely represented by chain-forming genera, like Chaetoceros and Thalassiosira, which form episodic blooms in the Skidaway River Estuary (Anderson & Harvey, 2019). Flavobacteriales and Rhodobacterales are known to closely associate with diatoms and rapidly exploit diatom-produced organic matter (Buchan et al., 2014). As observed among bacteria, positive diatom-bacteria relationships may reflect mutualism, with microbes exchanging metabolites and other compounds (e.g., nutrients, vitamins, and hormones) required for growth (Amin et al., 2015). Negative diatom-bacteria associations may indicate resource competition or algicidal effects (Amin, Parker & Armbrust, 2012; Meyer et al., 2017), as well as other activities (e.g., bacterial growth following consumption of plankton organic matter) that may have promoted niche separation. These potential relationships may have been more common among these groups during less favorable conditions in Cluster 1.
Other cross-domain edges were only present in Cluster 1, including Flavobacteriales and Rhodobacterales associated with heterotrophic (or mixotrophic) dinoflagellates, like Gymnodiniales (mainly Gymnodinium) and Peridiniales (mainly Heterocapsa), as well as the ciliate group Strombidiida (mainly Strombidinium). Negative associations between these taxa may indicate predation. Protist grazers are capable of ingesting bacteria (Jeong et al., 2008), utilizing alternative food sources when temperatures are lower (Aberle et al., 2012) or when their preferred algal prey is less abundant (Paffenhöfer, Sherr & Sherr, 2007). Measurable bacteria ingestion rates have been recorded for Heterocapsa and Strombidinium (2–34 cells protist−1 hr−1), though in general, ingestion by large protists is low compared to smaller flagellates (Ichinotsuka, Ueno & Nakano, 2006; Kyeong et al., 2006). Positive dinoflagellate-bacteria associations may represent symbiotic relationships, including photosymbiosis, nutrient fixation, or vitamin exchange (Decelle, Colin & Foster, 2015; Bjorbækmo et al., 2020). Yet, Prokaryote-Protist relationships remain largely unresolved in the ocean, especially at the species level (Bjorbækmo et al., 2020), which warrants further network studies inclusive of multiple domains of life and validation of such interactions in controlled lab (or field) experiments.
Shifts in group-specific abundance between networks
Microbial networks may also vary over time and space due to changes in taxonomy and functional groups (Stoecker & Lavrentyev, 2018; Vincent & Bowler, 2020; Chaffron et al., 2021). While our networks were represented by similar taxa, the abundance of certain taxonomic groups changed, likely influencing network structure and relationships. Several groups, like Cryptomonadales, Peridiniales, Flavobacteriales, and Rhodobacterales, were more abundant and accounted for more network edges in Cluster 1. While impacted by temperature, these groups were also correlated with ammonium (NH4), which may have influenced their population dynamics and associations. Ammonium is a key nitrogen source in estuaries (Damashek & Francis, 2018) and evidence suggests that smaller phytoplankton, including cryptophytes and dinoflagellates, may better utilize NH4 compared to larger phytoplankton, like diatoms, that prefer NO3 (Glibert et al., 2016). Heterotrophic bacteria can also assimilate a substantial portion of NH4 (>20%) in coastal environments (Kirchman, 1994). In the Skidaway River Estuary, increased levels of NH4 (and other nutrients) have been recorded over decadal scales, resulting from increased anthropogenic activities (Verity, 2002; Verity, Alber & Bricker, 2006). As such, the role of dissolved nutrients in microbial networks should be considered, particularly in coastal areas with high nutrient runoff and potential for eutrophication or habitat loss.
Differences in network properties between clusters may have also been influenced by functional changes among dominant microbes. Certain protists that became more prevalent in Cluster 1, namely dinoflagellates and cryptophytes, are known to exhibit mixotrophy (Stoecker et al., 2017). Groups that can exploit both autotrophic and heterotrophic lifestyles may interact with a wider diversity of organisms, facilitating higher network connectivity (Chaffron et al., 2021). In general, we observed higher edge density and lower path length between nodes in Cluster 1 networks, which suggests that microbes were more connected to each other at this time. However, opposite trends were observed for Prokaryote-Protist edges, likely because cross-domain edges were spread out over more nodes in the Cluster 1 network. Despite fewer overall connections in Cluster 2, certain edges increased, including associations between diatoms and Syndiniales (Dino-Groups I and II), which are ubiquitous protist parasites in marine and estuarine ecosystems (Guillou et al., 2008). Increased parasite abundance and prevalence in Cluster 2 networks may have been driven by increased temperature, as parasitic infection is thought to be thermally influenced (Park, Yih & Coats, 2004). Though putative relationships, like infection (positive) or deterrence (negative), have not been empirically verified for Syndiniales and diatoms, these groups are often correlated in co-occurrence networks (Sassenhagen et al., 2020; Vincent & Bowler, 2020). Diatoms were also the most relatively abundant 18S group in our study (∼25% on average), which may have explained their large contribution to Protist-Protist associations, including with Syndiniales.
Considerations for co-occurrence analysis
An important consideration with co-occurrence network analysis is accounting for the presence of indirect edges that can contribute to dense (“hairball”) networks and may cloud interpretation (Röttjers & Faust, 2018; Faust, 2021). Computational methods like SPIEC-EASI (used here) aim to account for indirect dependencies during network construction between microbes and promote sparser networks (Kurtz et al., 2015). Other recently developed programs, like EnDED (environmentally driven edge detection), are designed to reduce environmentally-driven associations after network construction, filtering for indirect dependency due to environmental factors (Deutschmann et al., 2021). To further improve network sparsity, we included only the most abundant 16S and 18S ASVs, a common strategy among co-occurrence network studies (Faust, 2021). This approach, however, may limit detection of microbial correlations that are common in the community but involve less abundant taxa. For instance, though Archaea were present in our study, they exhibited low relative abundance in the 16S dataset (<2% on average) and contributed to <3% of network connections. It will be important to employ approaches to better characterize rare microbial relationships, while limiting the detection of indirect edges that may bias network analysis.
To assess environmental effects on networks, environmental variables are typically included as nodes in a single merged network (Needham & Fuhrman, 2016). Networks can also be constructed categorically (e.g., by season) to reflect temporally variable conditions (Kellogg et al., 2019). With our approach, we separated samples for network analysis a priori based on beta diversity clustering, resulting in two environmentally distinct communities. Similar clustering of prokaryotes and protists allowed for merging of ASV tables in our study, though merging may not be possible in cases where different marker regions are distinctly clustered from each other. Even with our approach, collapsing weekly samples into one or two networks will undoubtedly mask network variability, underestimating ephemeral interactions that occur at hourly to weekly scales in dynamic estuaries (Cloern, Foster & Kleckner, 2014) and making it difficult to estimate network properties over a range of environmental factors. One idea would be to compare networks at a reasonable scale (e.g., between months) in coastal areas that experience seasonal gradients in environmental factors. However, this would require sustaining ∼daily sampling frequency (as in Needham & Fuhrman, 2016; Martin-Platero et al., 2018) to maintain high sample to ASV ratios and avoid spurious correlations (Faust, 2021). Higher sampling frequency and binning of samples into more coarse time points would support correlations between network properties (e.g., centrality and degree) and environmental variables, a strategy that has been employed with spatial samples (Chaffron et al., 2021).
Conclusions
We explored correlation networks between two environmentally distinct microbial communities in a subtropical estuary. Instead of binning amplicon data arbitrarily, we used hierarchical cluster analysis to separate samples. With this approach, we observed higher network centrality, degree, and edge number for microbes in Cluster 1, even though environmental conditions were less favorable at this time (lower temperature, sunlight, and nutrients). Prokaryote-Prokaryote and Prokaryote-Protist edges increased the most in Cluster 1, while Protist-Protist associations were more stable. Though correlation networks present inferred associations (and not ecological interactions), differences in network properties may reflect changes in the types of relationships (e.g., mutualism or competition) or shifts in taxonomic (or functional) prevalence in response to separate environmental periods. These findings represent an important step towards predicting microbial networks under varying conditions. Applying these clustering methods to new and existing amplicon surveys will help to broaden the scope of the analysis presented here (e.g., single site and year), allowing for a deeper understanding of marine microbial networks, their relation to environmental factors, and potential sensitivity to anthropogenic ocean change.
Supplemental Information
Microbial ASV taxonomy and counts
Unfiltered taxonomic assignments and raw sequence counts per sampling day for all assigned prokaryotic and eukaryotic ASVs (on separate sheets). Multicellular eukaryotes (Metazoa and Streptophyta) were subsequently filtered out of the 18S dataset to focus on protists. Prokaryotes were classified with the SILVA database (Ranks 1-7 = Domain, Phylum, Class, Order, Family, Genus, and Species), while protists were assigned with the Protistan Ribosomal Reference (PR2) database (Ranks 1-8 = Domain, Supergroup, Division, Class, Order, Family, Genus, and Species). Read counts have not been rarefied. Prokaryote ASVs are labeled as “bASV” to denote prokaryotes from protists.
Raw environmental data
Environmental parameters measured over the year on each sampling day. Samples are distinguished by month and cluster. Dissolved nutrients and particulate organic carbon and nitrogen were not measured on 9/6.
Network properties for each ASV
Degree and closeness centrality for each 16S and 18S ASV (Domain, Class-Species) present in Cluster 1 or 2 networks. A “Shared” column indicates whether or not an ASV was included in both networks. A separate sheet (Network Comp) describes the number of ASVs (nodes) within each order level microbial group that were present in Cluster 1 and 2 networks, as well as the overlap in these major order level groups (Shared vs. Cluster 1 or Cluster 2 only).
List of all network associations
Information for all ASV-ASV edges inferred via Cluster 1 and 2 networks (separate sheet per cluster). Edge information for each pairing (ASV 1 and ASV 2) includes taxonomy (Domain, Class-Species), edge weight, relationship sign (negative or positive), and type (e.g., Prokaryote-Protist or Protist-Protist). All edges were significant. Taxonomic levels that were similar between each marker region are shown, though we acknowledge different ranks between databases (e.g., SILVA Class = Rank 3 vs. PR2 Class = Rank 4). See Table S1 for full taxonomic assignments and ranks of all ASVs, including those used in the network.
Microbial rarefaction curves
Rarefaction curves of species richness vs. sequence read counts across all sampling days for prokaryotes (left) and protists (right). Curves were estimated using a step size of 100.
Silhouette method to identify sample clusters
Average silhouette width vs. number of clusters based on unweighted UniFrac distances of prokaryotes (top) and protists (bottom). Optimal number of clusters is represented by the dotted lines.
Sample clustering for prokaryotes
Dendrogram displaying results of hierarchical clustering (Ward’s method) for 16S samples collected in the estuary. Samples were clustered into two groups: Cluster 1 = March-May/November-February (blue; n = 46) and Cluster 2 = June-October (orange; n = 48). Replicate samples are shown for each day and are represented as nodes on the dendrogram.
Sample clustering for protists
Dendrogram displaying results of hierarchical clustering (Ward’s method) for 18S samples collected in the estuary. Samples were clustered into two groups: Cluster 1 = March-May/November-February (blue; n = 47) and Cluster 2 = June-October (orange; n = 49). Replicate samples are shown for each day and are represented as nodes on the dendrogram.
SPIEC-EASI networks for Cluster 1 and 2 samples
Co-occurrence networks for Cluster 1 (top) and Cluster 2 (bottom). Positive (green) and negative (purple) edges are shown for each network, inclusive of prokaryotes (Bacteria = gold and Archaea = pink) and protists (blue) ASVs. See Table S4 for a full list of ASV-ASV relationships observed in each network.
Network properties for class level prokaryotes
Box plots displaying mean degree (top) and closeness centrality (bottom) of prokaryotic groups (faceted by class) included in Cluster 1 (blue) and 2 (orange). Groups that were most prominent (< 2% relative abundance) and had < 10 ASVs were included (comparison unreliable at low sample size). Network properties were significantly different (Wilcoxon tests, *** = p-values < 0.001; **** = p-values < 0.0001) between clusters for these groups. Individual points represent specific class level ASVs within each network. Prokaryotic class level taxonomy refers to Rank 3 in the SILVA database.
Network properties for class level protists
Box plots displaying mean degree (top) and closeness centrality (bottom) of protists (faceted by class) included in Cluster 1 (blue) and 2 (orange). Groups that were most prominent (< 2% relative abundance) and had < 10 ASVs were included (comparison unreliable at low sample size). Significant differences (Wilcoxon tests, ns = not significant; * = p-values < 0.05; ** = p-values < 0.01; *** = p-values < 0.001; **** = p-values < 0.0001) of network properties between clusters is indicated for each group. Individual points represent specific class level ASVs within each network. Protist class level taxonomy refers to Rank 4 in the PR2 database.
 
                



