-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
1 changed file
with
44 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
--- | ||
layout: post | ||
title: "Ag3 cohorts analysis version 20240924" | ||
tags: data | ||
--- | ||
|
||
A new cohorts analysis version `20240924` has been released for the | ||
Ag3 data resource. This is now the default cohorts analysis version | ||
when using the `malariagen_data` [Ag3 | ||
API](https://malariagen.github.io/malariagen-data-python/latest/Ag3.html). This | ||
cohorts analysis is available for datasets up to and including Ag3.11. | ||
|
||
Please note that the new cohorts analysis may change the values of | ||
sample metadata columns including `taxon`, `admin1_iso`, | ||
`admin1_name`, `admin2_name`, and derived columns beginning `cohorts_` | ||
relative to previous cohorts analysis versions. | ||
|
||
To pin this cohorts analysis when accessing data: | ||
|
||
{% highlight python %} | ||
import malariagen_data | ||
|
||
ag3 = malariagen_data.Ag3( | ||
cohorts_analysis="20240924", | ||
) | ||
{% endhighlight %} | ||
|
||
This new version introduces some key changes: | ||
|
||
- Samples that were previously assigned as `gcx1`, have been renamed as `bissau`: | ||
- `gcx` stands for `genetic cryptic species`, we use these labels as name placeholders for groups that fall outside our usual taxonomic assignment | ||
- In line with [Caputo et al. (2024)](https://malariagen.github.io/vobs-updates/2024/09/10/caputo.html), which characterises the `gcx1` group, we have updated its proposed name to `Bissau molecular form` | ||
- 291 samples that were previously assigned to the `gcx1` group, are now relabeled as `bissau`. | ||
- 5 samples samples that were previously `unassigned`, are now relabeled as `bissau`. | ||
- these changes also affect cohort names, e.g. `GM-M_gcx1_2019` has now been relabeled to `GM-M_biss_2019` | ||
|
||
- 36 samples that were previously `unassigned`, have been renamed as (32) `melas`, (2)`gambiae`, (1) `fontenillei`, (1) `arabiensis`. | ||
|
||
- An error on the administrative region 1 metadata has been fixed, affecting 119 samples. Tor these: | ||
- `admin1_iso` has been relabeled from `UG-E` to `KE-04` | ||
- `admin1_name` has been relabeled from `Eastern Region` to `Busia` | ||
- these changes also affect cohort names, e.g. `UG-E_arab_2013` has now been relabeled to `KE-04_arab_2013` | ||
|
||
If you need to access the previous version of the cohorts analysis, you can use pin it using the code in [here](https://malariagen.github.io/vobs-updates/2024/07/24/ag3-cohorts-v20240717.html). |