Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic Resource Identifiers with ResourceIdTrait #29

Open
tareknaser opened this issue Apr 18, 2024 · 1 comment
Open

Generic Resource Identifiers with ResourceIdTrait #29

tareknaser opened this issue Apr 18, 2024 · 1 comment
Assignees
Labels
enhancement New feature or request

Comments

@tareknaser
Copy link
Collaborator

Description

Merge changes from ARK-Builders/arklib#90 into ark-rust, with the cryptographic and non-cryptographic methods of operation pushed behind feature flags in relevant crates.

Motivation

Cryptographic hash functions ensure unique IDs for each resource, so there will be no need for collision handling in fs-index. To support non-cryptographic hash functions as well, we're proposing placing them behind feature flags.

Implementation proposal

Changes to ResourceIdTrait from ARK-Builders/arklib#90

  • Simplified to include only methods to get ResourceId from path or data bytes.
pub trait ResourceIdTrait {
    /// Associated type representing the hash used by this resource identifier.
    type HashType: Debug; // + Other traits we want to enforce

    /// Computes the resource identifier from the given file path
    fn from_path<P: AsRef<Path>>(file_path: P) -> Result<Self::HashType>;

    /// Computes the resource identifier from the given bytes
    fn from_bytes(data: &[u8]) -> Result<Self::HashType>;
}
  • ResourceId doesn't need a field data-size

The 2 features that should be available:

  • cryptographic-hash: Uses cryptographic hash functions (e.g., blake3) to define ResourceId, enabled by default.
  • non-cryptographic-hash: Uses non-cryptographic hash functions (e.g., crc32fast) for ResourceId.

Commands to build

  • Building for cryptographic hash: cargo build --release
  • Building for non-cryptographic hash: cargo build --release --workspace --features non-cryptographic-hash --no-default-features

Relevant crates:

  • data-resource: Main crate exporting different ResourceId types based on enabled features. Implementation should cleanly export ResourceId with enforced HashType traits, avoiding impact on fs-index.
  • fs-index: Handles collisions and index management if non-cryptographic hash ResourceId is exported by data-resource.
  • ark-cli: Includes ark collisions command and other relevant commands if non-cryptographic hash ResourceId is used in data-resource.

Follow up work:

  • Update the CI to include sanity checks for build, clippy, formatting, and testing in both cryptographic and non-cryptographic hash modes of operation.

Side note:

If we plan to include the ResourceIndex::collisions() method even for cryptographic hash functions (since some resources may have the same ID in case of identical content), we may not need feature-based compilation.

@tareknaser tareknaser added the enhancement New feature or request label Apr 18, 2024
@tareknaser tareknaser self-assigned this Apr 18, 2024
@kirillt kirillt moved this to In Progress in Development Apr 20, 2024
@tareknaser
Copy link
Collaborator Author

This issue needs to be divided into a couple of milestones to facilitate clean PRs.

Milestone 1:

  • Remove the data-size field from ResourceId struct.
  • Define a trait ResourceIdTrait.
  • Write blake3::ResourceId and crc32::ResourceId, both implementing ResourceIdTrait.
  • Define a compilation feature named "non-cryptographic-hash" in data-resource crate.
  • For crates relying on resource ID from data-resource, manually select the "non-cryptographic-hash" feature for now, as they previously used crc32fast hash directly.

Milestone 2:

This milestone should be doable once changes in fs-index and ark-cli are completed.

  • Implement the "non-cryptographic-hash" compilation feature in fs-index and handle collisions only if a non-cryptographic hash function was used in data-resource.
  • Implement the "non-cryptographic-hash" compilation feature in ark-cli and output collision-related information only if a non-cryptographic hash function was utilized in data-resource.
  • Update the CI to include sanity checks for build, clippy, formatting, and testing in both cryptographic and non-cryptographic hash modes of operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: In Progress
Development

No branches or pull requests

1 participant