From bbbf44b63afe5527f4858410374167d66ff475be Mon Sep 17 00:00:00 2001 From: B Steele Date: Mon, 19 Feb 2024 10:25:38 -0700 Subject: [PATCH 1/3] add classification export to ee image collection --- modeling/04_Landsat_5_GTB.Rmd | 36 ++++++++++++++++++++++++++++++---- modeling/05_Landsat_7_GTB.Rmd | 35 ++++++++++++++++++++++++++++++--- modeling/06_Landsat_8_GTB.Rmd | 37 ++++++++++++++++++++++++++++++++--- modeling/07_Landsat_9_GTB.Rmd | 37 +++++++++++++++++++++++++++++++---- modeling/08_Sentinel2_GTB.Rmd | 37 +++++++++++++++++++++++++++++++---- 5 files changed, 164 insertions(+), 18 deletions(-) diff --git a/modeling/04_Landsat_5_GTB.Rmd b/modeling/04_Landsat_5_GTB.Rmd index 2d7733e..1c35788 100644 --- a/modeling/04_Landsat_5_GTB.Rmd +++ b/modeling/04_Landsat_5_GTB.Rmd @@ -100,12 +100,16 @@ trainedGTB_ls5 = (ee.Classifier.smileGradientTreeBoost(10).train( classProperty = 'byte_property', inputProperties = ls_input_feat )) -``` -Unfortunately, there is no current mechanism to save the GTB object, so all models -may be *slightly* different if re-created, but will likely have very similar -outcomes. +print(trainedGTB_ls5.getInfo()) +``` +Unfortunately, there is no current mechanism to save the GTB object. This is a +bummer because you can't really set a seed for these either. However, GEE is a bit +more rudimentary: it recognizes the inputs and therefore creates the same output +objects. I did a quick check of this by running the model here and then again +in the browser. Both have identical versions, so I feel confident that GEE is +making the 'same' model. ## Evaluate the models @@ -551,6 +555,30 @@ for d in range(date_length_5): #Send next task.
export_image.start() +for d in range(date_length_5): + md = uniqueMissDate_l5.get(d) + print(md.getInfo()) + print(str(d+1) + ' of ' + str(date_length_5)) + image = (newStack_l5 + .filter(ee.Filter.eq('missDate', md)) + .first() + .clip(aoi_ee.geometry())) + image_new_class = (classifications_to_one_band(image) + .select('reclass')) + export_image = ee.batch.Export.image.toAsset( + image = image_new_class, + region = aoi_ee.geometry(), + description = 'GTB_v' + v_date + '_' + str(md.getInfo()), + assetId = 'projects/ee-ross-superior/assets/LS5/'+'GTB_LS5_'+str(md.getInfo())+'_v'+v_date, + scale = 30, + crs = img_crs, + maxPixels = 1e13) + + #Check how many existing tasks are running and take a break of 5 mins if it's >10 + maximum_no_of_tasks(10, 5*60) + #Send next task. + export_image.start() + ``` diff --git a/modeling/05_Landsat_7_GTB.Rmd b/modeling/05_Landsat_7_GTB.Rmd index e4816d0..e3a5bca 100644 --- a/modeling/05_Landsat_7_GTB.Rmd +++ b/modeling/05_Landsat_7_GTB.Rmd @@ -99,11 +99,16 @@ trainedGTB_ls7 = (ee.Classifier.smileGradientTreeBoost(10).train( classProperty = 'byte_property', inputProperties = ls_input_feat )) + +print(trainedGTB_ls7.getInfo()) ``` -Unfortunately, there is no current mechanism to save the GTB object, so all models -may be *slightly* different if re-created, but will likely have very similar -outcomes. +Unfortunately, there is no current mechanism to save the GTB object. This is a +bummer because you can't really set a seed for these either. However, GEE is a bit +more rudimentary: it recognizes the inputs and therefore creates the same output +objects. I did a quick check of this by running the model here and then again +in the browser. Both have identical versions, so I feel confident that GEE is +making the 'same' model. ## Evaluate the models @@ -543,6 +548,30 @@ for d in range(date_length_7): maximum_no_of_tasks(10, 5*60) #Send next task.
export_image.start() + +for d in range(date_length_7): + md = uniqueMissDate_l7.get(d) + print(md.getInfo()) + print(str(d+1) + ' of ' + str(date_length_7)) + image = (newStack_l7 + .filter(ee.Filter.eq('missDate', md)) + .first() + .clip(aoi_ee.geometry())) + image_new_class = (classifications_to_one_band(image) + .select('reclass')) + export_image = ee.batch.Export.image.toAsset( + image = image_new_class, + region = aoi_ee.geometry(), + description = 'GTB_v' + v_date + '_' + str(md.getInfo()), + assetId = 'projects/ee-ross-superior/assets/LS7/'+'GTB_LS7_'+str(md.getInfo())+'_v'+v_date, + scale = 30, + crs = img_crs, + maxPixels = 1e13) + + #Check how many existing tasks are running and take a break of 5 mins if it's >10 + maximum_no_of_tasks(10, 5*60) + #Send next task. + export_image.start() ``` diff --git a/modeling/06_Landsat_8_GTB.Rmd b/modeling/06_Landsat_8_GTB.Rmd index 6b7227f..7f02cd3 100644 --- a/modeling/06_Landsat_8_GTB.Rmd +++ b/modeling/06_Landsat_8_GTB.Rmd @@ -99,11 +99,16 @@ trainedGTB_ls8 = (ee.Classifier.smileGradientTreeBoost(10).train( classProperty = 'byte_property', inputProperties = ls_input_feat )) + +print(trainedGTB_ls8.getInfo()) ``` -Unfortunately, there is no current mechanism to save the GTB object, so all models -may be *slightly* different if re-created, but will likely have very similar -outcomes. +Unfortunately, there is no current mechanism to save the GTB object. This is a +bummer because you can't really set a seed for these either. However, GEE is a bit +more rudimentary: it recognizes the inputs and therefore creates the same output +objects. I did a quick check of this by running the model here and then again +in the browser. Both have identical versions, so I feel confident that GEE is +making the 'same' model. ## Evaluate the models @@ -544,6 +549,32 @@ for d in range(date_length_8): maximum_no_of_tasks(10, 5*60) #Send next task.
export_image.start() + + +for d in range(date_length_8): + md = uniqueMissDate_l8.get(d) + print(md.getInfo()) + print(str(d+1) + ' of ' + str(date_length_8)) + image = (newStack_l8 + .filter(ee.Filter.eq('missDate', md)) + .first() + .clip(aoi_ee.geometry())) + image_new_class = (classifications_to_one_band(image) + .select('reclass')) + export_image = ee.batch.Export.image.toAsset( + image = image_new_class, + region = aoi_ee.geometry(), + description = 'GTB_v' + v_date + '_' + str(md.getInfo()), + assetId = 'projects/ee-ross-superior/assets/LS8/'+'GTB_LS8_'+str(md.getInfo())+'_v'+v_date, + scale = 30, + crs = img_crs, + maxPixels = 1e13) + + #Check how many existing tasks are running and take a break of 5 mins if it's >10 + maximum_no_of_tasks(10, 5*60) + #Send next task. + export_image.start() + ``` diff --git a/modeling/07_Landsat_9_GTB.Rmd b/modeling/07_Landsat_9_GTB.Rmd index ae5576d..7919131 100644 --- a/modeling/07_Landsat_9_GTB.Rmd +++ b/modeling/07_Landsat_9_GTB.Rmd @@ -99,12 +99,16 @@ trainedGTB_ls9 = (ee.Classifier.smileGradientTreeBoost(10).train( classProperty = 'byte_property', inputProperties = ls_input_feat )) -``` -Unfortunately, there is no current mechanism to save the GTB object, so all models -may be *slightly* different if re-created, but will likely have very similar -outcomes. +print(trainedGTB_ls9.getInfo()) +``` +Unfortunately, there is no current mechanism to save the GTB object. This is a +bummer because you can't really set a seed for these either. However, GEE is a bit +more rudimentary: it recognizes the inputs and therefore creates the same output +objects. I did a quick check of this by running the model here and then again +in the browser. Both have identical versions, so I feel confident that GEE is +making the 'same' model. ## Evaluate the models @@ -546,6 +550,31 @@ for d in range(date_length_9): #Send next task.
export_image.start() + +for d in range(date_length_9): + md = uniqueMissDate_l9.get(d) + print(md.getInfo()) + print(str(d+1) + ' of ' + str(date_length_9)) + image = (newStack_l9 + .filter(ee.Filter.eq('missDate', md)) + .first() + .clip(aoi_ee.geometry())) + image_new_class = (classifications_to_one_band(image) + .select('reclass')) + export_image = ee.batch.Export.image.toAsset( + image = image_new_class, + region = aoi_ee.geometry(), + description = 'GTB_v' + v_date + '_' + str(md.getInfo()), + assetId = 'projects/ee-ross-superior/assets/LS9/'+'GTB_LS9_'+str(md.getInfo())+'_v'+v_date, + scale = 30, + crs = img_crs, + maxPixels = 1e13) + + #Check how many existing tasks are running and take a break of 5 mins if it's >10 + maximum_no_of_tasks(10, 5*60) + #Send next task. + export_image.start() + ``` diff --git a/modeling/08_Sentinel2_GTB.Rmd b/modeling/08_Sentinel2_GTB.Rmd index 76a340f..5376eaa 100644 --- a/modeling/08_Sentinel2_GTB.Rmd +++ b/modeling/08_Sentinel2_GTB.Rmd @@ -100,12 +100,16 @@ trainedGTB_sen = (ee.Classifier.smileGradientTreeBoost(10).train( classProperty = 'byte_property', inputProperties = sen_input_feat )) -``` -Unfortunately, there is no current mechanism to save the GTB object, so all models -may be *slightly* different if re-created, but will likely have very similar -outcomes. +print(trainedGTB_sen.getInfo()) +``` +Unfortunately, there is no current mechanism to save the GTB object. This is a +bummer because you can't really set a seed for these either. However, GEE is a bit +more rudimentary: it recognizes the inputs and therefore creates the same output +objects. I did a quick check of this by running the model here and then again +in the browser. Both have identical versions, so I feel confident that GEE is +making the 'same' model. ## Evaluate the models @@ -551,4 +555,29 @@ for d in range(date_length_sen): #Send next task.
export_image.start() + +for d in range(date_length_sen): + md = uniqueMissDate_sen.get(d) + print(md.getInfo()) + print(str(d+1) + ' of ' + str(date_length_sen)) + image = (newStack_sen + .filter(ee.Filter.eq('missDate', md)) + .first() + .clip(aoi_ee.geometry())) + image_new_class = (classifications_to_one_band(image) + .select('reclass')) + export_image = ee.batch.Export.image.toAsset( + image = image_new_class, + region = aoi_ee.geometry(), + description = 'GTB_v' + v_date + '_' + str(md.getInfo()), + assetId = 'projects/ee-ross-superior/assets/sen/'+'GTB_sen_'+str(md.getInfo())+'_v'+v_date, + scale = 30, + crs = img_crs, + maxPixels = 1e13) + + #Check how many existing tasks are running and take a break of 5 mins if it's >10 + maximum_no_of_tasks(10, 5*60) + #Send next task. + export_image.start() + ``` \ No newline at end of file From e447587777926caa17eb81f477c42f1ad447a772 Mon Sep 17 00:00:00 2001 From: B Steele Date: Mon, 19 Feb 2024 10:28:58 -0700 Subject: [PATCH 2/3] fix classification values in methods --- Methods_Results_Summary.Rmd | 15 +- Methods_Results_Summary.html | 481 ++++++++++++++++------------------- 2 files changed, 227 insertions(+), 269 deletions(-) diff --git a/Methods_Results_Summary.Rmd b/Methods_Results_Summary.Rmd index e1eb417..834ee60 100644 --- a/Methods_Results_Summary.Rmd +++ b/Methods_Results_Summary.Rmd @@ -139,14 +139,9 @@ data types were output: a tabular data summary of the area classified and the total area of each class for all three AOIs, as well as a .tif raster at the resolution the GTB was applied (10m for Sentinel-2 and 30m for Landsat) for each classified mission-date image. 
The .tif rasters were labeled by pixel with -the following values: - -| Pixel Value | Pixel Description | -|-------------|-----------------------------------------| | 1 | cloud | | 2 | -open water | | 3 | light, near-shore sediment | | 4 | dark, near-shore sediment -| | 5 | offshore sediment | | 0 | out of area/masked for saturated pixels | - -: Pixel values and description for the GTB tif model output. +the following values: 0 = out of area/masked for saturated pixels; 1 = cloud; 2 += open water; 3 = light, near shore sediment; 4 = offshore sediment; 5 = dark, +near shore sediment. ## Model evaluation metrics @@ -418,8 +413,8 @@ was only done on the available by-pixel labels and that accuracy at classification edges may not be precise. In some cases, hazy dispersed clouds are incorrectly classified as off-shore -sediment. Caution should be used clouds characterize a large proportion of the -AOI. +sediment. Caution should be used when clouds characterize a large proportion of +the AOI. # References diff --git a/Methods_Results_Summary.html b/Methods_Results_Summary.html index c18d14a..3fceb6e 100644 --- a/Methods_Results_Summary.html +++ b/Methods_Results_Summary.html @@ -11,7 +11,7 @@ - + Methods and Results Summary @@ -5600,7 +5600,7 @@

Methods and Results Summary

ROSSyndicate

-

2024-01-31

+

2024-02-19

@@ -5644,8 +5644,8 @@

Introduction

create image classification models to create a time series of rasters that enumerate sediment plumes across the western arm of Lake Superior.

-
- +
+

:“Area of interest for this analysis in purple, consisting of a portion of the western extent of Lake Superior, the Apostle Islands, and Chequamegon Bay.”

@@ -5716,43 +5716,10 @@

Model application and summaries

classified and the total area of each class for all three AOIs, as well as a .tif raster at the resolution the GTB was applied (10m for Sentinel-2 and 30m for Landsat) for each classified mission-date image. -The .tif rasters were labeled by pixel with the following values:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Pixel values and description for the GTB tif model -output.
Pixel ValuePixel Description
1cloud
2open water
3light, near-shore sediment
4dark, near-shore sediment
5offshore sediment
0out of area/masked for saturated pixels
+The .tif rasters were labeled by pixel with the following values: 0 = +out of area/masked for saturated pixels; 1 = cloud; 2 = open water; 3 = +light, near shore sediment; 4 = offshore sediment; 5 = dark, near shore +sediment.

@@ -5786,20 +5753,20 @@

Label dataset

from each subset of mission-specific labels, there were 5255 labels with complete band information. Table 1 presents a break down of the labels.

-
- -
class @@ -6303,20 +6269,20 @@

Label dataset

dataset were present from 102 mission-date combinations spanning the dates 1984-07-07 to 2023-04-11. See Table 2 for a complete breakdown of labels by mission-date combination.

-
- - @@ -6738,20 +6703,20 @@

Model evaluation

categories and missions with the minimum F1 score being 0.62 for “dark near-shore sediment” for Landsat 7 (Table 4). Cloud and open water classification F1 scores were always greater than 0.86 (Table 4).

-
-
mission unique mission-dates (all)
-
satellite @@ -7192,20 +7156,20 @@

Model evaluation

When the sediment classes were aggregated to a single sediment class, F1 scores and the kappa statistic increased dramatically (Table 5).

-
- -
mission @@ -7643,8 +7606,8 @@

Discussion

that evaluation of the GTB model was only done on the available by-pixel labels and that accuracy at classification edges may not be precise.

In some cases, hazy dispersed clouds are incorrectly classified as -off-shore sediment. Caution should be used clouds characterize a large -proportion of the AOI.

+off-shore sediment. Caution should be used when clouds characterize a +large proportion of the AOI.

References

From 146d7c2cf61fa2859d91390026f66d0e9f5459f2 Mon Sep 17 00:00:00 2001 From: B Steele Date: Thu, 22 Feb 2024 14:06:56 -0700 Subject: [PATCH 3/3] add links to image viewers --- Methods_Results_Summary.Rmd | 27 ++- Methods_Results_Summary.html | 455 ++++++++++++++++++----------------- 2 files changed, 255 insertions(+), 227 deletions(-) diff --git a/Methods_Results_Summary.Rmd b/Methods_Results_Summary.Rmd index 7e0a011..9b3c5c3 100644 --- a/Methods_Results_Summary.Rmd +++ b/Methods_Results_Summary.Rmd @@ -190,9 +190,9 @@ label_table_join <- full_join(label_table, filtered_label_table) ``` The collated crowdsourced label dataset consisted of `r nrow(labels)` labels -across all classes. There were `r nrow(ml_labels)` labels that were part of -the classes of interest (cloud, open water, sediment). After filtering for -outliers from each subset of mission-specific labels, there were `r +across all classes. There were `r nrow(ml_labels)` labels that were part of the +classes of interest (cloud, open water, sediment). After filtering for outliers +from each subset of mission-specific labels, there were `r nrow(filtered_labels)` labels with complete band information. Table 1 presents a break down of the labels. @@ -412,9 +412,24 @@ or to classify ice. It is important to note that evaluation of the GTB model was only done on the available by-pixel labels and that accuracy at classification edges may not be precise. -In some cases, hazy dispersed clouds are incorrectly classified as off-shore -sediment. Caution should be used when clouds characterize a large proportion of -the AOI. +In many cases, cirrus clouds are incorrectly classified as off-shore sediment. +Caution should be used when clouds characterize a large proportion of the AOI. 
+ +# Image Viewer Links + +The following links are Google Earth Engine scripts that allow for manual +examination of the true color image, the eePlumB classification (version +2024-01-08), and a measure of atmospheric opacity (Landsat 7) or cirrus cloud +confidence level (Landsat 8 & 9). + +[Landsat +7](https://code.earthengine.google.com/cd2fc7baeb0dcb2a1d30e065b419bb9e?hideCode=true) + +[Landsat +8](https://code.earthengine.google.com/1c790dcabc46ff9f170a81223928df11?hideCode=true) + +[Landsat +9](https://code.earthengine.google.com/4f8526227eeb88e67ee2e0db84ce77d1?hideCode=true) # References diff --git a/Methods_Results_Summary.html b/Methods_Results_Summary.html index 3fceb6e..bc706d7 100644 --- a/Methods_Results_Summary.html +++ b/Methods_Results_Summary.html @@ -11,7 +11,7 @@ - + Methods and Results Summary @@ -5600,7 +5600,7 @@

Methods and Results Summary

ROSSyndicate

-

2024-02-19

+

2024-02-22

@@ -5644,8 +5644,8 @@

Introduction

create image classification models to create a time series of rasters that enumerate sediment plumes across the western arm of Lake Superior.

-
- +
+

:“Area of interest for this analysis in purple, consisting of a portion of the western extent of Lake Superior, the Apostle Islands, and Chequamegon Bay.”

@@ -5656,7 +5656,7 @@

Methods

eePlumB

Using the overarching architecture presented in the Global Rivers Obstruction Database (GROD) (Yang et al. -2022) to engage volunteer observers, we croudsourced class labels +2022) to engage volunteer observers, we crowdsourced class labels for Landsat and Sentinel-2 images for the following classes: ‘cloud’, ‘open water’, ‘light near shore sediment’, ‘dark near shore sediment’, ‘offshore sediment’, ‘shoreline contamination’, ‘other’, and ‘algae @@ -5747,26 +5747,26 @@

Model evaluation metrics

Results

Label dataset

-

The collated crowdsourced label dataset consisted of 7862 labels for +

The collated crowdsourced label dataset consisted of 7862 labels across all classes. There were labels that were part of the classes of interest (cloud, open water, sediment). After filtering for outliers from each subset of mission-specific labels, there were 5255 labels with complete band information. Table 1 presents a break down of the labels.

-
- @@ -6269,20 +6269,20 @@

Label dataset

dataset were present from 102 mission-date combinations spanning the dates 1984-07-07 to 2023-04-11. See Table 2 for a complete breakdown of labels by mission-date combination.

-
- @@ -6703,20 +6703,20 @@

Model evaluation

categories and missions with the minimum F1 score being 0.62 for “dark near-shore sediment” for Landsat 7 (Table 4). Cloud and open water classification F1 scores were always greater than 0.86 (Table 4).

-
- @@ -7156,20 +7156,20 @@

Model evaluation

When the sediment classes were aggregated to a single sediment class, F1 scores and the kappa statistic increased dramatically (Table 5).

-
- @@ -7605,9 +7605,22 @@

Discussion

attempt was made to mask ice or to classify ice. It is important to note that evaluation of the GTB model was only done on the available by-pixel labels and that accuracy at classification edges may not be precise.

-

In some cases, hazy dispersed clouds are incorrectly classified as -off-shore sediment. Caution should be used when clouds characterize a -large proportion of the AOI.

+

In many cases, cirrus clouds are incorrectly classified as off-shore +sediment. Caution should be used when clouds characterize a large +proportion of the AOI.

+
+

References