Flatten path to mirror naip-analytic bucket
yellowcap committed Nov 5, 2024
1 parent 9146bdd commit b093ec5
Showing 2 changed files with 3 additions and 3 deletions.
2 changes: 2 additions & 0 deletions embeddings/README.md
@@ -34,6 +34,8 @@ For NAIP, we use the `naip-analytic` bucket. We leverage the manifest file that
 lists all files in the bucket. This list is parsed at the beginning, and each
 job processes a section of the NAIP scenes.
+
+At the time of processing there were 1,231,441 NAIP scenes.
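The manifest-splitting step described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `job_slice` and all variable names are hypothetical.

```python
# Hypothetical sketch: split the parsed manifest into contiguous
# sections so that each job processes one slice of the NAIP scenes.
# All names here are illustrative, not taken from the codebase.
def job_slice(manifest_lines, job_id, total_jobs):
    """Return the contiguous section of scene keys for one job."""
    chunk = -(-len(manifest_lines) // total_jobs)  # ceiling division
    return manifest_lines[job_id * chunk : (job_id + 1) * chunk]

scenes = [f"scene_{i}.tif" for i in range(10)]
print(job_slice(scenes, 0, 3))  # first job takes the first section
```

Ceiling division ensures every scene is covered even when the manifest length is not an exact multiple of the job count; the last job simply gets a shorter slice.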

### Sentinel-2

For Sentinel-2 we use the `sentinel-cogs` bucket. Also here we use the manifest
4 changes: 1 addition & 3 deletions embeddings/utils.py
@@ -147,14 +147,12 @@ def write_to_table(embeddings, bboxs, datestr, gsd, destination_bucket, path):
     if len(embeddings.shape) == EMBEDDING_SHAPE_CLASS:
         # Handle class embeddings
         index["embeddings"] = [np.ascontiguousarray(dat) for dat in np_embeddings]
-        embedding_level = "class"
     elif len(embeddings.shape) == EMBEDDING_SHAPE_PATCH:
         # Handle patch embeddings
         for i in range(embeddings.shape[1]):
             index[f"patch_embeddings_{i}"] = [
                 np.ascontiguousarray(dat) for dat in np_embeddings[:, i, :]
             ]
-        embedding_level = "patch"

     table = pa.table(
         index,
@@ -172,5 +170,5 @@ def write_to_table(embeddings, bboxs, datestr, gsd, destination_bucket, path):
     s3_bucket = s3_resource.Bucket(name=destination_bucket)
     s3_bucket.put_object(
         Body=body,
-        Key=f"{embedding_level}/{path.parent}/{path.stem}.parquet",
+        Key=f"{path.parent}/{path.stem}.parquet",
     )
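The effect of the flattened key can be shown in isolation with a short sketch; the source path below is hypothetical, chosen only to illustrate how `path.parent` and `path.stem` make the output key mirror the source bucket layout instead of nesting under an embedding-level prefix.

```python
from pathlib import Path

# Hypothetical source path; illustrates how the flattened key
# mirrors the bucket layout rather than nesting the output under
# a "class/" or "patch/" prefix.
path = Path("mt/2021/mt_060cm_2021/example_scene.tif")
key = f"{path.parent}/{path.stem}.parquet"
print(key)  # mt/2021/mt_060cm_2021/example_scene.parquet
```

Dropping the `embedding_level` prefix means a single source scene always maps to exactly one predictable output key, which is what lets the destination bucket mirror `naip-analytic`.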
