Merge pull request #87 from steeleb/new_adds

Checking labels for outliers, class differences

steeleb authored May 30, 2024
2 parents 0d44900 + 04b552b commit 4a235cb

Showing 57 changed files with 12,009 additions and 4,523 deletions.

154 changes: 87 additions & 67 deletions Methods_Results_Summary.Rmd
completed using moderate resolution (e.g., Landsat, Sentinel, MODIS) satellite
images, focusing on mapping the distribution and types of wetlands throughout
the region ([@mohseni2023; @v.l.valenti2020]), as well as SAV distribution
throughout the system [@wolter2005]. Most of these analyses focus on a
relatively short temporal period (months to years), while some span the entire
Landsat archive from the mid '80s through the recent past (e.g., [@amani2022]).

In the recent past, much attention has been paid to the apparent proliferation
of algal blooms in some of the clearest lakes, including Lake Superior (cite).
While detecting algal blooms from moderate-resolution satellite imagery is
difficult due to low temporal frequency, time of day of acquisition, pixel size,
and spectral band metrics (cite), as well as the lack of observed,
spatially-explicit bloom observations to validate presence and absence,
detecting sediment plumes (which often precede algal blooms) is relatively easy
with just the red, green, and blue bands common on nearly all
western extent of Lake Superior, the Apostle Islands, and Chequamegon Bay."
## eePlumB

Using the overarching architecture presented in the Global Rivers Obstruction
Database (GROD) [@yang2022] to engage volunteer observers, we crowdsourced class
labels for Landsat and Sentinel-2 images for the following classes: 'cloud',
'open water', 'light near shore sediment', 'dark near shore sediment', 'offshore
sediment', 'shoreline contamination', 'other', and 'algae bloom' using our Earth
Engine Plume and Bloom labeling interface ("eePlumB"). Dates for labeling were
limited to the months of April through November to avoid ice-on.

In order to eliminate outlier band information and reduce noise in the input for
our models, the second and ninety-eighth percentiles were calculated for each
mission-band combination, and label data associated with values outside of those
cutoffs were dropped from the analysis. [[Could add the
`02_label_class_summaries.Rmd` as supplemental.]]
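
As an illustration, the percentile filter could be implemented with dplyr
roughly as follows; the `labels` data frame and its `mission`, `band`, and
`value` columns are assumed names, not the project's actual objects.

```{r, eval = FALSE}
# Hypothetical sketch: drop labels whose band values fall outside the 2nd-98th
# percentile range of their mission-band combination (column names assumed)
library(dplyr)

filtered_labels <- labels %>%
  group_by(mission, band) %>%
  mutate(p02 = quantile(value, 0.02, na.rm = TRUE),
         p98 = quantile(value, 0.98, na.rm = TRUE)) %>%
  ungroup() %>%
  filter(value >= p02, value <= p98) %>%
  select(-p02, -p98)
```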

## Model development

We used the built-in gradient tree boost ("GTB") ee.Classifier() method within
Google Earth Engine to create classification models from the crowd-sourced label
data. Label data were randomly split into training (70%) and test (30%) data
sets, with no special handling procedures for classes or satellite missions.
Data were examined to assure that all classes and missions were present in both
the training and testing data sets.

GTB models for each mission were trained independently on the rescaled band data
from red, green, blue, near infrared, and both shortwave infrared bands for
Landsat missions to classify 5 categories: cloud, open water, light near shore
sediment, dark near shore sediment, and offshore sediment. For Sentinel-2, the
bands used to develop the classifier were red, green, blue, red edge 1-3, near
infrared, and both shortwave infrared bands. We did not tune the
hyperparameters for the GTB model, as performance was already acceptable for
discerning open water from sediment plume using 10 trees.
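
A hypothetical rgee sketch of the split-and-train step is below; the label
asset path, class property name, and band list are placeholders, not the
project's actual names.

```{r, eval = FALSE}
# Hypothetical sketch of the GTB training step via rgee (mirrors the ee JS API);
# asset path, class property, and band names are placeholders
library(rgee)
ee_Initialize()

bands <- c("SR_B2", "SR_B3", "SR_B4", "SR_B5", "SR_B6", "SR_B7")  # example Landsat bands
labels_fc <- ee$FeatureCollection("users/example/eePlumB_labels")

# 70/30 random train/test split
labels_fc <- labels_fc$randomColumn("random")
training <- labels_fc$filter(ee$Filter$lt("random", 0.7))
testing  <- labels_fc$filter(ee$Filter$gte("random", 0.7))

# gradient tree boost with 10 trees (untuned, per the text above)
gtb <- ee$Classifier$smileGradientTreeBoost(numberOfTrees = 10)$
  train(features = training, classProperty = "class", inputProperties = bands)
```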

## Image classification

resolution greater than 10m x 10m were reprojected (downsampled) to 10m x 10m
pixel sizes so that the GTB model could be applied to the composite images more
efficiently. No further pre-processing was completed on the Sentinel-2 data.
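
In Earth Engine, that resampling step might look like the following rgee
sketch; the collection ID is the public Sentinel-2 surface-reflectance set,
and the rest is illustrative rather than the project's actual code.

```{r, eval = FALSE}
# Hypothetical sketch: force the coarser Sentinel-2 bands onto the native
# 10 m grid of band B2 before classification
library(rgee)
ee_Initialize()

s2 <- ee$ImageCollection("COPERNICUS/S2_SR")$first()
proj_10m <- s2$select("B2")$projection()   # B2 is natively 10 m
s2_10m <- s2$reproject(proj_10m)           # all bands now sampled at 10 m
```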

Three areas of interest (AOIs) were used in this analysis: the complete AOI, the
AOI without shoreline contamination, and the AOI with shoreline contamination.
The area of shoreline contamination was defined as any area within 60 meters of
a volunteer-identified pixel with shoreline contamination. We assumed that
shoreline contamination was consistent throughout the analysis and was not
specific to any particular satellite or time period.
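
A minimal sf sketch of the 60 m contamination buffer; the file paths and the
`aoi` object are placeholders for the project's actual shapefiles.

```{r, eval = FALSE}
# Hypothetical sketch of the 60 m shoreline-contamination buffer
library(sf)

contam_pts <- st_read("data/labels/shoreline_contamination_points.shp")  # placeholder path
aoi <- st_read("data/aoi/Superior_AOI.shp")                              # placeholder path

contam_zone <- contam_pts |>
  st_transform(32615) |>   # UTM 15N: meter units, covers western Lake Superior
  st_buffer(dist = 60) |>  # 60 m radius around each labeled pixel
  st_union()               # dissolve into a single contamination polygon

aoi_minus_contam <- st_difference(st_transform(aoi, 32615), contam_zone)
```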

### Model application and summaries

Each GTB model was applied to the corresponding satellite image stack and two
data types were output: a tabular data summary of the area classified and the
total area of each class for all three AOIs, as well as a .tif raster at the
resolution at which the GTB was applied (10m for Sentinel-2 and 30m for Landsat)
for each classified mission-date image. The .tif rasters were labeled by pixel
with the following values: 0 = out of area/masked for saturated pixels; 1 =
cloud; 2 = open water; 3 = light, near shore sediment; 4 = offshore sediment;
5 = dark, near shore sediment.
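
For example, per-class areas can be tallied from one of these rasters with
terra; the file name below is a placeholder, and the class codes follow the
pixel values listed above.

```{r, eval = FALSE}
# Hypothetical sketch: per-class area from one classified GeoTIFF
library(terra)

classified <- rast("out/S2_2019-08-01_classified.tif")  # placeholder file name

class_names <- c("masked", "cloud", "open water", "light near shore sediment",
                 "offshore sediment", "dark near shore sediment")

counts <- freq(classified)                        # pixel count per class value
counts$class <- class_names[counts$value + 1]     # map codes 0-5 to names
counts$area_km2 <- counts$count * prod(res(classified)) / 1e6  # m^2 -> km^2
counts
```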

## Model evaluation metrics

Models were evaluated through error matrices, kappa statistics, and F1
statistics for each class.

- error matrix - testing: given the test data, does the model assign the
  correct class? These are tibble-style summaries where the model-assigned
  class and label class are compared.
- kappa statistic: an indicator of how much better or worse a model performs
  than random chance. Scores range from -1 to 1, where 0 is equivalent to
  random chance, positive values are better than random chance, and negative
  values are poorer than random chance.
- F1 score: the harmonic mean of precision and recall per class (beta = 1,
  hence F1, where precision and recall are evenly weighted). A score of 0
  means the model cannot predict the correct class; a score of 1 means the
  model perfectly predicts the correct class. (A worked sketch of these two
  statistics follows this list.)
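
A minimal base-R sketch of how these two statistics fall out of an error
matrix; `pred_class` and `true_class` are stand-ins for the test-set
predictions and labels.

```{r, eval = FALSE}
# Hypothetical sketch: kappa and per-class F1 from an error (confusion) matrix
conf_mat <- table(predicted = pred_class, actual = true_class)

n <- sum(conf_mat)
p_obs <- sum(diag(conf_mat)) / n                           # observed agreement
p_exp <- sum(rowSums(conf_mat) * colSums(conf_mat)) / n^2  # chance agreement
kappa <- (p_obs - p_exp) / (1 - p_exp)

precision <- diag(conf_mat) / rowSums(conf_mat)  # correct / predicted, by class
recall    <- diag(conf_mat) / colSums(conf_mat)  # correct / actual, by class
f1 <- 2 * precision * recall / (precision + recall)
```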

Models were evaluated as 5-class categories and as 3-class categories, where
all sediment categories were compiled into a single class.
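
A dplyr sketch of that collapse, assuming a `test_results` table with
`predicted` and `actual` columns (names hypothetical):

```{r, eval = FALSE}
# Hypothetical sketch: collapse the three sediment classes into one before
# rebuilding the 3-class error matrix (column names are placeholders)
library(dplyr)

sediment <- c("light near shore sediment", "dark near shore sediment",
              "offshore sediment")

three_class <- test_results %>%
  mutate(across(c(predicted, actual),
                ~ if_else(.x %in% sediment, "sediment", .x)))
```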
label_table_join <- full_join(label_table, filtered_label_table)
The collated crowdsourced label dataset consisted of `r nrow(labels)` labels
across all classes. There were `r nrow(ml_labels)` labels that were part of the
classes of interest (cloud, open water, sediment). After filtering for outliers
from each subset of mission-specific labels, there were
`r nrow(filtered_labels)` labels with complete band information. Table 1
presents a breakdown of the labels.

```{r, echo = F}
gt(label_table_join) %>%
summary_table_join <- full_join(md_summ_table, md_summ_table_filt)
```

Labels were present from `r nrow(mission_date_summary)` individual mission-date
combinations spanning the dates of `r min(mission_date_summary$date)` to
`r max(mission_date_summary$date)`. Labels in the filtered dataset were present
from `r nrow(mission_date_summary_filtered)` mission-date combinations spanning
the dates `r min(mission_date_summary_filtered$date)` to
`r max(mission_date_summary_filtered$date)`. See Table 2 for a complete
breakdown of labels by mission-date combination.

```{r, echo = F}
gt(summary_table_join) %>%
gt(summary_table_join) %>%

Model performance was acceptable across open water, cloud, and discrete
sediment categories. All statistics presented in this section represent summary
statistics for classes from the testing set. The kappa statistic across all
missions was always greater than 0.8, indicating much better performance than
random assignment (Table 4). The F1 score, balanced equally between precision
and recall, was reasonable across all categories and missions, with the minimum
F1 score being 0.62 for "dark near-shore sediment" for Landsat 7 (Table 4). Cloud
and open water classification F1 scores were always greater than 0.86 (Table 4).

```{r, echo = F}
# get a list of the performance metrics list
gt(summary_simple) %>%
The GTB model was applied to all images in the Landsat and Sentinel 2 stacks,
regardless of time of year and presence/absence of ice. Classified images
should only be used during ice-free periods, as no attempt was made to mask ice
or to classify ice. It is important to note that evaluation of the GTB model was
only done on the available by-pixel labels and that accuracy at classification
edges may not be precise.

In many cases, cirrus clouds are incorrectly classified as offshore sediment.
Caution should be used when clouds characterize a large proportion of the AOI.

The following links are Google Earth Engine scripts that allow for manual
examination of the true color image, the eePlumB classification (version
2024-01-08), and a measure of atmospheric opacity (Landsat 5/7) or cirrus cloud
confidence level (Landsat 8 & 9). For Sentinel 2, the cirrus cloud indication is
a binary value: 0 (no cirrus detected) or 1 (cirrus detected).

[Landsat
5](https://code.earthengine.google.com/3dc621a541fefa6db53e874646f93b13)

[Landsat
7](https://code.earthengine.google.com/83427128adac071d119edbd3a86f1127)

[Landsat
8](https://code.earthengine.google.com/f4aad47222c53d6cf7510ef3e3344119)

[Landsat
9](https://code.earthengine.google.com/f2983f2a2196a2c033afacd22471e398)

[Sentinel
2A](https://code.earthengine.google.com/c8ae30202ad9e2549f009babe736497c)

[Sentinel
2B](https://code.earthengine.google.com/31d9b913a8421091447f213ef6d1db6d)

Note that the Sentinel viewers may hang briefly before displaying the date
list. The Landsat 5 and 7 opacity measure does not appear robust for detecting
cirrus clouds; more investigation is needed to determine cirrus cloud
contamination in those instances.

# References

Binary file modified data/aoi/Superior_AOI_minus_shoreline_contamination.dbf
Binary file modified data/aoi/Superior_shoreline_contamination.dbf
Binary file added data/labels/LS5_labels_for_tvt_2024-04-25.RDS
Binary file added data/labels/LS7_labels_for_tvt_2024-04-25.RDS
Binary file added data/labels/LS8_labels_for_tvt_2024-04-25.RDS
Binary file added data/labels/LS9_labels_for_tvt_2024-04-25.RDS
Binary file added data/labels/S2_labels_for_tvt_2024-04-25.RDS
drive_auth()

# Purpose

This script processes the file created in GEE using the script
`1_createMissionDateList.js` to create a list of unique mission-date pairs for
eePlumB users.

## Download and load raw mission-date list

miss_date = read.csv(file.path(temp_dir, file$name))

## Summarize mission-date list

The mission-date list includes multiple scenes per mission-date pair, so we want
to summarize by mission and date.

```{r}
miss_date_unique = miss_date %>%
  # assumed completion (the diff is truncated here): reduce to one row per
  # mission-date pair; the `mission` and `date` column names are assumed
  distinct(mission, date) %>%
  arrange(mission, date)
```
7 changes: 3 additions & 4 deletions eePlumB/README.MD
The purpose of this directory and the associated workflow is to label Plumes and Blooms from satellite imagery in freshwater lakes. We use the methodology and workflow established in the [Global Rivers Obstruction Database](https://github.com/GlobalHydrologyLab/GROD) (GROD) to create a training dataset of labeled pixels for image segmentation of Lake Superior's western basin as an example use case.

## Lake Superior - Why label plumes and blooms?
Cyanobacteria blooms are one of the most significant management challenges in the Great Lakes today. Recurring blooms of varying toxicity are commonly observed in four of the Great Lakes, and the fifth, Lake Superior, has experienced intermittent nearshore blooms since 2012. The recent advent of cyanobacterial blooms in Lake Superior is disconcerting, given the highly valued, pristine water quality of the large lake. Many fear the appearance of blooms portends a very different future for Lake Superior. As a public resource, the coastal water quality of Lake Superior has tremendous economic, public health, and environmental value, and therefore, preventing cyanobacterial blooms in Lake Superior is a high-priority management challenge.

Lake Superior is a large lake, and relying on human observations of blooms restricts observations to near-shore locations. Remote sensing has the potential to catalog the spatial and temporal extent of surface blooms. In this project, we are attempting to use optical imagery from Lake Superior to delineate surface plumes (sediment) and blooms (algae). It is likely that these two surface features occur at the same time (i.e., a rainstorm may lead to a sediment plume from a river and subsequently an algal bloom).

To train computer algorithms to detect these features in satellite images we need a training dataset. That's where we need your help! In this exercise, we ask you to participate in identifying changes in surface conditions in the western arm of Lake Superior. All you need is a computer and your eyes.