Implemented Glue ETL spreadsheet Google LIMS processing #19
+317
−0
Following the warehouse framework methodology that is being built, this
PR adds the Google LIMS sheet as the second spreadsheet import use case
for Glue. The target is a table in the tsa schema of the OrcaVault
database staging data area.

Since this is for data warehouse purposes, the ETL approach retains all
factual information. It does not reshape much or drop anything; it only
renames and harmonises columns and performs light-weight data clean-up
tasks. All columns and values are retained as-is. Any further processing
is left to the downstream warehouse layers in the psa and vault schemas.
This builds on Implemented Glue ETL spreadsheet processing #13 and
Implemented Glue ETL job script deployment using terraform #14. Hence,
this Glue data import job becomes a fairly straightforward task: the
cookiecutter template code is reused, so the work only needs to focus on
the transformation.
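
For reviewers, here is a minimal sketch of the kind of transformation described above: column renaming and harmonisation plus light clean-up, with every column and value retained. It is not the code in this PR. The function names and the sample headers (`Library ID`, `Sample Name`, `Run#`) are hypothetical, and plain Python is used only to keep the sketch self-contained.

```python
import re


def harmonise_column_name(name: str) -> str:
    """Lower-case a header and collapse non-alphanumeric runs into underscores."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def clean_value(value):
    """Trim surrounding whitespace and turn empty strings into None; nothing else changes."""
    if isinstance(value, str):
        value = value.strip()
        return value if value else None
    return value


def transform_rows(rows):
    """Rename every column and lightly clean every value, keeping all rows and columns."""
    return [
        {harmonise_column_name(key): clean_value(val) for key, val in row.items()}
        for row in rows
    ]


# Hypothetical sample rows; the header names are made up for illustration.
sample = [
    {"Library ID": " L2400001 ", "Sample Name": "abc", "Run#": ""},
    {"Library ID": "L2400002", "Sample Name": " def", "Run#": "12"},
]
result = transform_rows(sample)
```

In this sketch, two headers that harmonise to the same name would overwrite each other, so a real job would need a collision check to honour the "drop nothing" rule.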