Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it possible to choose the index type #502

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

jeltsch
Copy link
Collaborator

@jeltsch jeltsch commented Dec 18, 2024

This pull requests enables choice of index types via table configurations.

@jeltsch jeltsch added the enhancement New feature or request label Dec 18, 2024
@jeltsch jeltsch self-assigned this Dec 18, 2024
Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is a draft, but I thought I'd write down my thoughts anyway.

Type parameter i or j

We parameterise over j (index accumulator) in a lot of places, where I think i (the index itself) would be nicer conceptually. I don't think it actually complicates the code to have an additional type family like

type family Acc i
type instance Acc IndexOrdinary = IndexOrdinaryAcc
type instance Acc IndexCompact = IndexCompactAcc

Existentially quantifying at the bottom

We can still give this a try to see if it works and whether we think it's nicer to have. I'd be happy to try if you want me to

Comment on lines +105 to 110
{-# SPECIALISE addKeyOp
:: IndexAcc j
=> RunBuilder IO h j
-> SerialisedKey
-> Entry SerialisedValue (RawBlobRef IO h)
-> IO () #-}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is not something we have to address directly in the current PR, we can do it in a separate one)

This is a general comment, but I use addKeyOp as an example. With the inclusion of new index class constraints, we're probably missing specialisations. We should probably add specialisation pragmas for IO and IndexCompactAcc, but also for IO and IndexOrdinaryAcc. https://gitlab.haskell.org/ghc/ghc/-/issues/24359 would allow us to write both specialisations without much reptition as follows:

{-# SPECIALISE addKeyOp @IO @IndexCompactAcc #-}
{-# SPECIALISE addKeyOp @IO @IndexOrdinaryAcc #-}

But this feature is not available yet, so we'd have to repeat multiple lines of specialisation pragmas. We can use CPP to make our lives a little bit easier:

#define spec_addKeyOp(m, j) \
{-# SPECIALISE addKeyOp :: \
     RunBuilder m h j \
  -> SerialisedKey \
  -> Entry SerialisedValue (RawBlobRef m h) \
  -> m () #-}

spec_addKeyOp(IO, IndexOrdinaryAcc)
spec_addKeyOp(IO, IndexCompactAcc)

For small type signatures, it might not be necessary to use CPP, but for large signatures it's probably useful.

Internal.SnapFullTable
override
snap
const
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops! I accidentall changed the last argument, which I should not have

Suggested change
const
(resolve (Proxy @v))

This means the ResolveValue constraints also have to be reinstated

Comment on lines +117 to +118
-- | Create a new index accumulator with a default configuration.
newWithDefaults :: ST s (j s)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect something like the following to also work:

Suggested change
-- | Create a new index accumulator with a default configuration.
newWithDefaults :: ST s (j s)
type IndexAccArgs j :: Type
new :: IndexAccArgs j -> ST s (j s)

Yes, the args may be different for different indexes, but we can construct these args in a context where we know the concrete index type. For example, we could have something like:

data FencePointerIndex =
    CompactIndex (Maybe (... {- some compact index configuration stuff -})
  | OrdinaryIndex  (Maybe (... {- some ordinary index configuration stuff -})

data SomeFencePointerIndex where
  SomeFencePointerIndex :: forall j. IndexAcc j => Proxy# j -> IndexAccArgs j -> SomeFencePointerIndex

someFencePointerIndex :: TableConfig -> SomeFencePointerIndex
someFencePointerIndex conf = confFencePointerIndex conf of
    CompactIndex mstuff -> 
      let args = case mstuff of
            Nothing -> ... {- use defaults -}
            Just stuff -> ... {- construct args from stuff -}
       in SomeFencePointerIndex (proxy# @IndexCompactAcc) args
    OrdinaryIndex mstuff -> 
      let args = case mstuff of
            Nothing -> ... {- use defaults -}
            Just stuff -> ... {- construct args from stuff -}
       in SomeFencePointerIndex (proxy# @OrdinaryIndex) args

Comment on lines +320 to +325
data SomeFencePointerIndex where
SomeFencePointerIndex :: forall j. IndexAcc j => Proxy# j -> SomeFencePointerIndex

someFencePointerIndex :: FencePointerIndex -> SomeFencePointerIndex
someFencePointerIndex CompactIndex = SomeFencePointerIndex (proxy# @IndexCompactAcc)
someFencePointerIndex OrdinaryIndex = SomeFencePointerIndex (proxy# @IndexOrdinaryAcc)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we have to have explicit constructors for the compact/ordinary cases for the code to use specialised code paths.

Suggested change
data SomeFencePointerIndex where
SomeFencePointerIndex :: forall j. IndexAcc j => Proxy# j -> SomeFencePointerIndex
someFencePointerIndex :: FencePointerIndex -> SomeFencePointerIndex
someFencePointerIndex CompactIndex = SomeFencePointerIndex (proxy# @IndexCompactAcc)
someFencePointerIndex OrdinaryIndex = SomeFencePointerIndex (proxy# @IndexOrdinaryAcc)
data SomeFencePointerIndex where
SomeFencePointerIndex :: forall j. IndexAcc j => Proxy# j -> SomeFencePointerIndex
SomeCompactIndex :: Proxy# CompactIndex -> SomeFencePointerIndex
SomeOrdinaryIndex :: Proxy# OrdinaryIndex -> SomeOrdinaryIndex
someFencePointerIndex :: FencePointerIndex -> SomeFencePointerIndex
someFencePointerIndex CompactIndex = SomeCompactIndex (proxy# @IndexCompactAcc)
someFencePointerIndex OrdinaryIndex = SomeOrdinaryIndex (proxy# @IndexOrdinaryAcc)

Take this example, where the SomeFencePointerIndex case leads to calls to unspecialised functions

{-# SPECIALISE f :: IndexCompactAcc s -> ... #-}
{-# SPECIALISE f :: IndexOrdinaryAcc s -> ... #-}
f :: IndexAcc j => ... {- elided -}

example x = case x of
  SomeFencePointerIndex (_ :: Proxy# j) -> f @j ... -- this is a non-specialised call
  SomeCompactIndex (_ :: Proxy# IndexCompactAcc) -> f @IndexCompactAcc ... -- this is a specialised call
  SomeOrdinaryIndex (_ :: Proxy# IndexOrdinaryAcc) -> f @IndexOrdinaryAcc ... -- this is a specialised call

-> m (RunBuilder m h)
new hfs hbio runBuilderFsPaths numEntries alloc = do
runBuilderAcc <- ST.stToIO $ RunAcc.new numEntries alloc
-> ST (PrimState m) (j (PrimState m))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add my IndexAccArgs suggestion, then we could do this:

Suggested change
-> ST (PrimState m) (j (PrimState m))
-> IndexAccArgs j

Comment on lines 154 to +155
addKeyOp
:: RunAcc s
:: IndexAcc j
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already mentioned it through other channels that some type signatures like this one already had the :: on a new line.

grep -E "^( )*::" **.hs tells me there are 54 such signatures. We can change them at some point in a dedicated PR

Comment on lines +116 to +117
unsafeFinalise
:: IndexAcc j
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the :: to be on a newline, let's keep it as it was before. I haven't checked whether there are any other similar changes in this PR, but it would be good to revert those too

Comment on lines 180 to +182
addSmallKeyOp
:: RunAcc s
:: IndexAcc j
=> RunAcc j s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functions like this one that previously had no class constraints now need SPECIALISE pragmas for IndexCompactAcc and IndexOrdinaryAcc. It would be good to go through all the affected type signatures and add specialisations if necessary

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the changes in this module can be reverted, I think

@jeltsch
Copy link
Collaborator Author

jeltsch commented Dec 19, 2024

Existentially quantifying at the bottom

We can still give this a try to see if it works and whether we think it's nicer to have. I’d be happy to try if you want me to.

Actually, I’m meanwhile eager to try it myself. 🙂 I think that I have some rather clear idea of how to do it and would be happy to try it out.

Thank you for all your other comments. I’ve read through all of them but would like to address them only after having explored that alternative approach of existentially quantifying at the bottom, as we might decide to follow this approach and this would make many of those comments obsolete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants