Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change default representation of edge attributes from vectors to lists & other miscellaneous adjustments #274

Open
wants to merge 13 commits into
base: dev
Choose a base branch
from

Conversation

maxloeffler
Copy link
Contributor

@maxloeffler maxloeffler commented Dec 5, 2024

Prerequisites

  • I adhere to the coding conventions (described here) in my code.
  • I have updated the copyright headers of the files I have modified.
  • I have written appropriate commit messages, i.e., I have recorded the goal, the need, the needed changes, and the location of my code modifications for each commit. This includes also, e.g., referencing to relevant issues.
  • I have put signed-off tags in all commits.
  • I have updated the changelog file NEWS.md appropriately.
  • I have checked whether I need to adjust the showcase file showcase.R with respect to my changes.
  • The pull request is opened against the branch dev.

Description

  • Replace deprecated igraph functions by their more recent counterparts.
  • Handle reading empty commit interaction data explicitly, to avoid crashing in that case.
  • To make splitting networks by ranges more consistent with splitting time-based, also add range information to the return value.
  • Convert certain edge attributes of networks to list type. This change is motivated by a more stricter type enforcement in igraph version 2.0 and up. When merging networks through igraph::disjoint_union, edge attributes that exist in both networks are now assumed to have identical types. But, when we simplify a network, its edge attributes convert to list type as they now need to store data that a vector in R cannot store anymore, i.e., one edge attribute now has to hold several lists of data of all the edges that got merged into one. Therefore, we need to convert edge attributes to list by default to be able to merge simplified and unsimplified networks.

Changelog

Added

  • Introduce convert.edge.attributes.to.list function.

Fixed

  • Fix reading empty commit interaction data.

Changed / Improved

  • Replace deprecated igraph functions.
  • Add range information to the return value in split.network.time.based.by.ranges, to be consistent with split.data.time.based.
  • Convert edge attributes to a list format to ensure compatibility with igraph.

Copy link

codecov bot commented Dec 5, 2024

Codecov Report

Attention: Patch coverage is 93.44262% with 4 lines in your changes missing coverage. Please review.

Project coverage is 81.09%. Comparing base (1d1fe7f) to head (3427e14).
Report is 1 commits behind head on dev.

Files with missing lines Patch % Lines
util-networks-metrics.R 0.00% 2 Missing ⚠️
util-networks.R 97.82% 1 Missing ⚠️
util-read.R 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #274      +/-   ##
==========================================
+ Coverage   80.95%   81.09%   +0.14%     
==========================================
  Files          16       16              
  Lines        5051     5095      +44     
==========================================
+ Hits         4089     4132      +43     
- Misses        962      963       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

util-split.R Outdated Show resolved Hide resolved
maxloeffler and others added 3 commits December 7, 2024 17:06
Signed-off-by: Maximilian Löffler <[email protected]>
The interaction reading breaks if an interaction is empty in the data.
This commit adds a check for empty interactions.

Signed-off-by: Christian Hechtl <[email protected]>
Since version 2.0.0 of package `igraph`, the function `igraph::power_law_fit`
(and its deprecated alias `igraph::fit.power.law`) does not automatically
compute the p-value any more, which is needed by `metrics.scale.freeness`.
In version 2.1.1 of package `igraph`, they have added a new parameter
`p.value` to provide the possibility to enable p-value computation
again. In addition, they have added a parameter `p.precision` which
defaults to a precision of 0.01.

To make `metrics.scale.freeness` work again with the most recent version
of package `igraph`, the deprecated function calls to
`igraph::fit.power.law` are replaced by `igraph::power_law_fit` and are
adjusted to use the new parameter `p.value` to automatically compute the
p-value again. However, we do not make use of `p.precision` and rely on
the default precision instead.

Signed-off-by: Thomas Bock <[email protected]>
util-networks.R Outdated Show resolved Hide resolved
Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While looking over the rest of this PR, I only found a couple of minor issues, please find the individual comments below.

Regarding the uncovered lines reported by codecov: I suggest to ignore them. 2 out of 4 missing lines are part of an untested module anyway. The other two lines refer to very special cases that have been previously untested too.

util-networks.R Outdated Show resolved Hide resolved
util-split.R Show resolved Hide resolved
tests/test-networks-equal-constructions.R Show resolved Hide resolved
util-networks.R Show resolved Hide resolved
tests/test-networks.R Outdated Show resolved Hide resolved
tests/test-networks.R Outdated Show resolved Hide resolved
This change makes splitting by ranges more consistent with time-based
splitting.

This works towards se-sic#273.

Signed-off-by: Maximilian Löffler <[email protected]>
Since igraph version 2.1, when joining networks using
'igraph::disjoint_union', edge attributes of the joining networks
require identical types. As simplifiying networks necessarily converts
types of edge attributes to list when merging edges, attributes now have
to be of type list by default.

Edge attributes that are explicitly considered during simplification
and, therefore, are not converted to lists are excluded from this rule.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <[email protected]>
Adjust the tests in accordance to converting edge attributes to list
type in the implementation.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <[email protected]>
This is necessary since igraph falsely fills in non-present edge
attributes with NULLs instead of NAs in certain cases when using
'igraph::disjoint_union' and 'igraph::add_edges'.

Signed-off-by: Maximilian Löffler <[email protected]>
'plyr::rbind.fill' uses NULL to fill missing values in lists. As we now
use lists for most edge attributes, we need to handle this case
separately to ensure missing values are filled with NAs instead.

To fix this issue, we need to instantiate missing columns in dataframes
with NAs before calling 'plyr::rbind.fill'. This operation is constant
with respect to the amount of rows and should not impact performance too
much.

This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <[email protected]>
This works towards fixing se-sic#271.

Signed-off-by: Maximilian Löffler <[email protected]>
@maxloeffler maxloeffler changed the title Miscellaneous fixes and adjustments Change default representation of edge attributes from vectors to lists & other miscellaneous adjustments Dec 16, 2024
Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've a few comments regarding the news. Please find them below.

NEWS.md Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
NEWS.md Outdated

### Fixed

- Fix the creation of edgelists for issue-based artifact-networks by correctly iterating over the issue data (PR #264, 321d85043112971c04998249c14a0677a32c9004)
- Fix a bug in `extract.timestamps` that occurs when the first `data.source` contains empty data and that leads to a return value of type numeric which should be POSIXct (PR #270, 10696e4cf4ae92371917ed8ccaec2b0183da145c, 646c01a42ad8decfbc9040030e790e51cb65cffd)
- Fix `read.commit.interactions` by explicitly considering empty commit interaction data (PR #274, f591528a0f1f11b1a4390949ab770f3f74a766f9)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty commit interaction data ➡️ non-existent commit interactions

(for explanation: the commit-interaction data per se are present, but specific interactions between specific commits are empty, that is, non-existent)

util-split.R Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
@bockthom bockthom added this to the v5.0 milestone Dec 17, 2024
Signed-off-by: Maximilian Löffler <[email protected]>
The igraph maintainers decided that they will not re-install consistency
with their older versions (in terms of filling in non-existent values in
'igraph::disjoint_union' and 'igraph::add_edges' with NAs) yet
(igraph/rigraph#1587). The previously
temporary fix is therefore now semi-permanent.

Signed-off-by: Maximilian Löffler <[email protected]>
Signed-off-by: Maximilian Löffler <[email protected]>
@maxloeffler
Copy link
Contributor Author

Between the last and the second to last push I forgot to update the copyright headers of util-networks.R 😅.

Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found a few minor inconsistencies in the README, please find them below.
If also @hechtlC is fine with the new announcement, this PR is ready to be merged then.

@@ -4,6 +4,10 @@

## unversioned

### Announcement

- Coronet is not compatible anymore with igraph versions below 2.1.0. This is due to the simultaneous deprecation of `subgraph.edges` and the introduction of the replacement for it, `subgraph_from_edges`, in igraph version 2.1.0.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please use back ticks for the occurrences of igraph, as done in previous announcements?

NEWS.md Outdated
@@ -18,8 +22,9 @@

### Changed/Improved

- **Breaking Change**: Change the default representation of edge attributes from vectors to lists. This change is necessary for the interplay of coronet networks with certain igraph functionality since igraph version 2.1. (PR #274, 1c35d1fa2548deb297dbfa5e2b07fce31962c5b7, eda30b838369ec46376812298a3ea8159eec5789, 44c7b72e3234cb332bb2713fb408c124e67255d9, 7303eabef6a78198575fe5bdfc02813fde3d3974)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, please use back ticks for igraph also here. And my be we should also use version 2.1.0 here, to be consistent with the announcement.

@@ -28,6 +33,9 @@

- Fix the creation of edgelists for issue-based artifact-networks by correctly iterating over the issue data (PR #264, 321d85043112971c04998249c14a0677a32c9004)
- Fix a bug in `extract.timestamps` that occurs when the first `data.source` contains empty data and that leads to a return value of type numeric which should be POSIXct (PR #270, 10696e4cf4ae92371917ed8ccaec2b0183da145c, 646c01a42ad8decfbc9040030e790e51cb65cffd)
- Fix `read.commit.interactions` by explicitly considering non-existent commit interaction data (PR #274, f591528a0f1f11b1a4390949ab770f3f74a766f9)
- Add range information to network-splits when splitting a network using `split.network.time.based.by.ranges`. This effect also propagates into `split.networks.time.based` (PR #274, 87911ade231c44b93be194a1d6734f7de043a4af)
- Adjust `metrics.scale.freeness` and `metrics.is.scale.free` functions be compatible with both older and newer igraph versions (PR #274, 4b0d5221dd56bb3c9ddf196f67719d4f503d9b61)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't it be "to be compatible with" instead of just "be compatible with" ?

@@ -28,6 +33,9 @@

- Fix the creation of edgelists for issue-based artifact-networks by correctly iterating over the issue data (PR #264, 321d85043112971c04998249c14a0677a32c9004)
- Fix a bug in `extract.timestamps` that occurs when the first `data.source` contains empty data and that leads to a return value of type numeric which should be POSIXct (PR #270, 10696e4cf4ae92371917ed8ccaec2b0183da145c, 646c01a42ad8decfbc9040030e790e51cb65cffd)
- Fix `read.commit.interactions` by explicitly considering non-existent commit interaction data (PR #274, f591528a0f1f11b1a4390949ab770f3f74a766f9)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it seems to me that you have only fixed the first part (replaced empty be non-existent), but this only makes sense if you also replace the following words as indicated in my previous review: commit interaction data ➡️ commit interactions.

(the commit interaction data are present, but no commit interactions are present).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants