-
-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tree-building: port #835
base: main
Are you sure you want to change the base?
tree-building: port #835
Conversation
So, I think the gross solution that is supposed to be shipped with the exercise for the student to refactor got too many linting suggestions and made the Travis build fail. |
Actually a bunch of the linting warnings came from my @sshine do you know if that is why it failed? |
Okay, so I got rid of the linting errors for the solution in This appears to be why it is failing unless I am misreading the report from Travis. Is there someway to exempt files that are for refactoring exercises? |
I see that: hlint ${TRAVIS_BUILD_DIR} # Run `hlint` on the entire repository. returns That seems like desirable behavior in general. A number of the warnings I put in there intentionally, because they were redundant (to give some easy fixes during refactoring). I suppose I could fix all of those problems. It would still all be in one giant nasty function. Anyway, I will wait until I hear back from one of the core maintainers, before I decide on my next course of action here ;) |
So I just discovered that I can disable Hlint for that single file. If you all like the grotesqueness of the file that will be sent for refactoring, then I suppose that is one way to solve the issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice job.
I thought of another solution that builds a Data.Map
and then constructs the tree from that.
I think the stub that needs to be refactored looks fine. I don't have any ideas at this moment as to which kind of pitfalls to place in it beyond ones you've already placed. We can always reiterate on it later, if we like.
|
||
data Record = Record Id (Maybe Id) deriving (Eq, Show) | ||
|
||
data Tree = Leaf Id | Branch Id Children deriving (Eq, Show) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go with Data.Tree
from containers
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. I'll look into that.
cycles = any checkRecord | ||
where checkRecord r = | ||
case r of | ||
(Record i Nothing) -> if' (i /= 0) True False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Record i Nothing) -> if' (i /= 0) True False | |
Record i Nothing -> i /= 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, that should be pretty obvious 😂
where checkRecord r = | ||
case r of | ||
(Record i Nothing) -> if' (i /= 0) True False | ||
(Record i (Just p)) -> if' (p >= i) True False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Record i (Just p)) -> if' (p >= i) True False | |
Record i (Just p) -> p >= i |
. map (\xs -> (recordParent (head xs), map recordId xs)) | ||
. groupBy parentsEq | ||
. sortOn recordParent | ||
. sortOn recordId |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will have the possibly unintuitive consequence that it'll primarily sort on recordParent
and secondarily on recordId
. An alternative is to sort with a combined function using the Monoid
instance of Ordering
described here:
sortBy $ \x y -> comparing recordParent x y `mappend` comparing recordId x y
or something similar...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool
-- General Tools | ||
|
||
if' :: Bool -> a -> a -> a | ||
if' p a b = if p then a else b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer necessary.
build :: [ParentGrouping] -> Maybe Tree | ||
build [] = Nothing | ||
build (x:xs) | ||
| not . validRoot $ x = Nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| not . validRoot $ x = Nothing | |
| not (validRoot x) = Nothing |
in if not (null children) | ||
then Branch yid (build' children) | ||
else Leaf yid | ||
) (snd y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not exhaustive.
I might write it like:
build' :: [ParentGrouping] -> [Tree]
build' [(_, yids)] = map Leaf yids
build' ((_, yids):pgs) = map buildWithChildren yids
where
buildWithChildren yid =
let children = [ child | child <- pgs, fst child == Just yid ]
in if null children
then Leaf yid
else Branch yid (build' children)
And I might address that it's not exhaustive by concluding that it shouldn't be empty.
|
||
groupByParent :: [Record] -> [ParentGrouping] | ||
groupByParent = sortOn fst | ||
. map (\xs -> (recordParent (head xs), map recordId xs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import Control.Arrow ((&&&))
(\x -> (f x, g x)) = f &&& g
. map (\xs -> (recordParent (head xs), map recordId xs)) | |
. map (recordParent . head &&& map recordId)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. I haven't used the Control.Arrow library before. It is exciting to see it in action here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only really use &&&
, ***
, first
and second
from Control.Arrow
, and the only arrow I ever think about is (,)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may like to know that first
and second
actually live in two places nowadays: There is a better version in Data.Bifunctor
. A bifunctor is considerably less scary than an arrow, I think.
-- verification | ||
succIdCheck :: [Record] -> Bool | ||
succIdCheck = all (\(x,y) -> succ x == y) . pairwise . sort . map recordId | ||
where pairwise = zip <*> tail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why this check is necessary.
Can't a database skip IDs as long as they're monotonically increasing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. Taking all of the tests from the golang version of this exercise has the concequence of those tests incorporated simply by virtue of their being there.
I have not had to deal with the premis of this exercise in the real world. I suppose if users delete comments, then we could definitely end up with trees that have gaps, although a lit of forums seem to keep a marker that it is deleted.
I suppose that is what we have to think about here.
is this tree building for a system where users have the right to completely delete their comments? or is it creepy like facebook.?
|
||
import Data.List | ||
import Data.Ord (comparing) | ||
--import Data.Maybe (fromMaybe) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure an unused import is necessary in the stub.
--import Data.Maybe (fromMaybe) |
You're quite right, with refactoring exercises we don't want - ignore: { within: [ TreeBuilding ] } But I'm not sure how to distinguish the |
It appears like we can put the ignore rules in the stub file as The only issue I see is that is would introduce a lot of clutter in the file, and we would probably need to add nformation in the |
I think I responded in some way to each thing 😆 I have to move a 1400 lb millingachine today, so I won't get any more work up until this evening in US/EST. @sshine as always thank you very much for your constructive guidance 😁 |
@@ -0,0 +1,21 @@ | |||
name: tree-building | |||
version: 1.0.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since there is no canonical-data.json, by what's written in https://github.com/exercism/haskell/blob/master/README.md#exercise-versioning please use 0.1.0.s
instead of 1.0.0.s
version numbers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, cool. I wasn't sure of the protocol.
@@ -0,0 +1,21 @@ | |||
name: tree-building | |||
version: 1.0.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since there is no canonical-data.json, by what's written in https://github.com/exercism/haskell/blob/master/README.md#exercise-versioning please use 0.1.0.s
instead of 1.0.0.s
version numbers
Just an update. I've been super busy recently, and I will be through Saturday. |
I'm sorry that I didn't address this in my research: Yes, we could do that, but then we would distribute the pragma to the student, which would cause HLint to be disabled on their end, which we really don't want. So ideally we should find a workaround HLint not having an The long-term benefit we get from working out this part is that we make way for handing out stubs with warnings in them for other refactoring exercises. I think having obvious errors that HLint catches is a good basis for refactoring, since it encourages the use of HLint. |
Okay, those are some compelling reasons to get this figured out the right way regarding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's another implementation and some thoughts regarding the stub and test files:
- How about naming the stub file
ForumThread.hs
? - Subsequently,
module ForumThread where ...
. - How about renaming
newTree
intofromList
? - How about renaming
Tree
intoForumThread
?
Your tests include a lot of bad input cases also present in the F# version in which no trees are produced. One could imagine they're a product of some bad SQL JOIN (in case of duplicates), SELECT (in case of wrong order), DELETE (in case of missing parents) or some wicked UPDATE (in case of cycles).
We assume that records are sorted and hope that duplicates and cycles don't occur. But if these promises are broken, it's really a problem in the producer of these records. I thought it might be interesting to deal a little differently with the various errors:
- "Duplicate", "Duplicate root": Instead of failing to produce a tree, what we'd rather do in a production environment is discard the duplicate and log a warning. This could be an excellent opportunity to try the
validation
package ("A data-type like Either but with an accumulating Applicative") or theco-log
package (production-grade logging framework): Accept threads with duplicates by discarding them, and encourage in a later iteration to add either cumulative error handling or logging of these. - "Non-continuous": We assume that records are sorted, but test what happens if the assumption is broken? How about either (1) not assume this and require sorting, or (2) not test this and assume that the bug is located outside the function we're writing? In case of (1) it's just as easy to sort as it is to handle the error, and in case of (2) it's even easier.
- "Cycle Directly", "Cycle indirectly": We can disregard cycles by dealing with records ordered by their ID:
Record x (Just y)
is always invalid ifx > y
(y
is at this point an unknown parent). Similarly to duplicates, we could simply discard records that can't be inserted because of an unknown parent and encourage error handling / logging in a later iteration. - "Higher id parent of lower id": Same as "Non-continuous": How about either (1) not assume records are sorted and require sorting, in which case this is not an error, or (2) not test this and assume that the bug is located outside the function we're writing. Maybe a bug like this could occur if a record was "moved" from one thread to another. Maybe this is an acceptable anomaly as long as it doesn't produce bad trees.
TL;DR:
- Proposal: Discard duplicates and pseudo-cyclical records, change tests so they're positive edge cases rather than negative cases.
- Proposal: Either assume sorting and remove testing of non-sorted records, or require sorting and change negative tests of non-sorted records to positive tests.
module TreeBuilding (fromList, Record(..), Thread) where
import Debug.Trace (trace)
import Data.Function ((&))
import Data.Maybe (fromMaybe)
import Data.Ord (comparing)
import Data.List (sortBy, foldl')
import qualified Data.Map as Map
import Data.Map (Map)
import Data.Tree (Tree(..))
type Id = Int
type Thread = Tree Id
data Record = Record
{ recordId :: Id
, recordParentId :: Maybe Id
} deriving (Eq, Show)
type ParentId = Id
type ChildId = Id
type ParentChildMap = Map ParentId [ChildId]
fromList :: [Record] -> Maybe Thread
fromList = threadWithRoot 0
. foldl' (flip insertRecord) Map.empty
. sortBy (comparing recordParentId <> comparing recordId)
where
insertRecord (Record recordId Nothing) m
| Map.member recordId m = trace ("Duplicate parent " ++ show recordId ++ "!") m
| otherwise = Map.insertWith (++) recordId [] m
insertRecord (Record recordId (Just parentId)) m
| Map.member recordId m = trace ("Duplicate child " ++ show recordId ++ "!") m
| Map.member parentId m =
Map.insertWith (++) recordId []
(Map.adjust (++ [recordId]) parentId m)
| otherwise = trace ("Unknown parent " ++ show parentId ++ "!") m
threadWithRoot :: ParentId -> ParentChildMap -> Maybe Thread
threadWithRoot rootId parentChildMap = buildMaybe rootId
where
buildMaybe parentId = parentChildMap
& Map.lookup parentId
& fmap (Node parentId . map build)
build parentId = fromMaybe (pure parentId) (buildMaybe parentId)
|
||
groupByParent :: [Record] -> [ParentGrouping] | ||
groupByParent = sortOn fst | ||
. map (\xs -> (recordParent (head xs), map recordId xs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only really use &&&
, ***
, first
and second
from Control.Arrow
, and the only arrow I ever think about is (,)
.
I'll finally have some time for this tonight ! I looked at |
A last thought -- sorry for spamming you -- if we're giving a data Thread = Leaf Id | Branch Id Children in the stub, but we'd like people to use import Data.Tree (Tree)
type Thread = Tree Id then how can we express test cases abstractly? We would need to hide the concrete constructors ( module ForumThread (Record, Thread, fromList, thread) where
...
-- In the stub
thread :: Id -> [Thread] -> Thread
thread recordId [] = Leaf recordId
thread recordId children = Branch recordId children
-- In the example
thread :: Id -> [Thread] -> Thread
thread = Node Then we can express Just (thread 0 [ thread 1 [], thread 2 [] ]) regardless of internal representation. Then it can become an objective to change the internal representation to |
@sshine that is amazing! Thank you for sharing that strategy with me 😍 |
@@ -0,0 +1,21 @@ | |||
name: tree-building | |||
version: 1.0.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
upon further review of other example package.yaml, I find that we just don't have version
in example package.yaml
, likely so we don't need to update them in multiple places. So this one should be removed
exposed-modules: TreeBuilding | ||
source-dirs: src | ||
ghc-options: -Wall | ||
# dependencies: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these lines can (and should) be removed from example's package.yaml
@guygastineau: I feel that I've left you a bit of a mess wrt. feedback. I'll try and summarize my proposal of what we need to do from here:
In this gist I've updated the stub file and the tests so they conform to these proposals, and the only thing I've left commented out are the negative cases. As for your example solution: Feel free to stop refactoring whenever you feel like it. The important part here is that it's a learning experience for you and that it behaves the same as the stub file. We can always take iterations on improving it further for your sake after merging the important parts. |
@sshine thank you for your detailed work. I will see how far I can take this with your guidance ;) I'm sorry I've left this PR up without progress for so long. A new project for a startup is taking a lot of time from my days. |
Hey @sshine That is a really good Gist! I appreciate all of your guidance here ;) I am still swamped with real work projects for a start up right now. I could probably implement these changes over the next day or so, but if you are willing simply to change this to the better version I would be grateful. I am sorry that I have left this PR hanging for so long on your track. |
@guygastineau: We've had PRs open for a year, it really is no inconvenience. I'll look at it if you prefer to not continue with it. Thanks for all the effort so far! |
@sshine your generosity of spirit is only matched by your expertise. Though it might take me til Monday to have it in good shape I suppose I should do it (it will help help me be a better Haskeller). Thank you for dissuading my anxieties about longer running PR's. |
Thank you for the review @petertseng I can try to do some work on this this week, but I am still utterly swamped. I have taken a huge interest in I have done most of the work for the Even though I said I should finish this PR for my benefit, I honestly would be grateful to anyone who wanted to finish this tree building port up. |
You already did the heavy lifting. I'm playing with stateful tree generators for this one, so it may take a couple of days. |
@sshine cool. I am excited to see your work. I imagine I will learn a lot from reading it 🤓 In the mean time I will get the |
I've gotten seriously frustrated at
I ditched the thing when I couldn't easily call on |
Well, this sure has been open for along time ;) I have learned a lot since I tried porting this exercise. I think I will give it another go now. |
@guygastineau: Cool :) I think I learned something about how to not give feedback doing this one, haha. I think if you regard all of the comments above as one big brainstorm and do the exercise without constraints, maybe it will be less critical. Feel free to decide whether you want to close this one and open a new one, or just work from here and not let the comments made so far disturb your creative process too much. |
@guygastineau: Upon revisiting the comment thread in this PR, I see this comment containing a link to a proposed copy of the handed-out solution and a test suite that is compatible with it. You would only need to make an example solution that you like and conforms to this test suite (which was made flexible wrt. internal representation, so that this could become a part of the task). |
@sshine hahaha. Well, I think your feedback was useful even though some of it lost me at the time I got interested in this again, because I thought I might be able to use recursion schemes in the example. I'll see if it is practical with the internal representation. Still, my point is that your advice was good, but it was over my head at the time. The above brainstorming session makes much more sense now that I have gone on to learn and build more Haskell projects in the wild that do actual work. My ongoing quest to learn intuitionistic type theory is also working out (though it is hard), that and some explorations in compilers has made a lot of things clearer for me I will need to see if I can get tools here to work for me. Last time (for some reason I can't understand) I did not reach out about the testing tools not working. Is stack still set up to use some sandbox directory to run all of the tests? IDK, I will re-familiarize myself with the process of testing and contributing to this project. If I encounter similar issues using the repo's utilities I will reach out for help with that this time. |
This doesn't point to a
canonical-data.json
file, because none exists: problem-specificationsI stole all of the tests from the
go
track. I think it would be good to have tests that verify it can handle more levels of recursion. go testsI think I probably should have used
Either
instead ofMaybe
, but now we can have a discussion about what would be best.This is a refactoring exercise. I might have made the gross version too gross, but it definitely can be refactored 🤣
I think that my
correct
solution could probably do with some refactoring too ;)PS. I meant to send this up here last week, but I've been distracted by life itself. It feels good to get moving on this again ;)