Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check_unique_id() only used in target_sda(), not in target_market_share() #489

Closed
jacobvjk opened this issue Apr 19, 2024 · 1 comment · Fixed by #492
Closed

check_unique_id() only used in target_sda(), not in target_market_share() #489

jacobvjk opened this issue Apr 19, 2024 · 1 comment · Fixed by #492
Labels
ADO Maintenance Day! priority

Comments

@jacobvjk
Copy link
Member

jacobvjk commented Apr 19, 2024

check_unique_id(data, "id_loan")

currently, check_unique_id() only used in target_sda(), not in target_market_share()
is this desired? in workflow.aggregate.loanbook, it does not seem to cause issues to have multiple identical loan_ids in the target_market_share calculation...

The reprex highlights that this can be an issue in target_market_share when two loans have the same id but otherwise differ.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(r2dii.data)
library(r2dii.match)
library(r2dii.analysis)

loanbook <- loanbook_demo[1:3, ]

matched <- loanbook |> 
  match_name(abcd_demo) |> 
  prioritize()

# duplicate loans with diff values
matched_duplicate_diff <- matched
matched_duplicate_diff$loan_size_outstanding <- 1
matched_duplicate_diff <- rbind(matched, matched_duplicate_diff)

# duplicate the exact same loans
matched_duplicate_same <- rbind(matched, matched)

matched$id_loan
#> [1] "L1" "L3"
matched_duplicate_diff$id_loan
#> [1] "L1" "L3" "L1" "L3"
matched_duplicate_same$id_loan
#> [1] "L1" "L3" "L1" "L3"

result_tms <- matched |> 
  target_market_share(abcd_demo, scenario_demo_2020, region_isos_demo)

result_duplicate_diff_tms <- matched_duplicate_diff |> 
  target_market_share(abcd_demo, scenario_demo_2020, region_isos_demo)

result_duplicate_same_tms <- matched_duplicate_same |> 
  target_market_share(abcd_demo, scenario_demo_2020, region_isos_demo)

all.equal(result_tms, result_duplicate_diff_tms)
#> [1] "Component \"production\": Mean relative difference: 7.422493e-07"                               
#> [2] "Component \"technology_share\": Mean relative difference: 9.133179e-07"                         
#> [3] "Component \"percentage_of_initial_production_by_scope\": Mean relative difference: 2.605609e-08"
all.equal(result_tms, result_duplicate_same_tms)
#> [1] TRUE
all.equal(result_duplicate_same_tms, result_duplicate_diff_tms)
#> [1] "Component \"production\": Mean relative difference: 7.422493e-07"                               
#> [2] "Component \"technology_share\": Mean relative difference: 9.133179e-07"                         
#> [3] "Component \"percentage_of_initial_production_by_scope\": Mean relative difference: 2.605609e-08"

result_sda <- matched |> 
  target_sda(abcd_demo, co2_intensity_scenario_demo, region_isos = region_isos_demo)
#> Warning: Removing rows in abcd where `emission_factor` is NA
#> Warning: Found no match between loanbook and abcd.

result_duplicate_sda <- matched_duplicate |> 
  target_sda(abcd_demo, co2_intensity_scenario_demo, region_isos = region_isos_demo)
#> Error in eval(expr, envir, enclos): object 'matched_duplicate' not found

Created on 2024-04-19 with reprex v2.1.0

AB#10681

@jacobvjk
Copy link
Member Author

depending on whether we expected that this check should be available, this could be marked a bug. in any case I think this should be a priority, because prcessing multiple loan books makes us more prone to such situations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ADO Maintenance Day! priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant