Nearest Neighbor Model #158
base: master
Conversation
LGTM, but it would be great if we could unify it with the Oracle model, in this PR if feasible.
for (const auto &pair : indexer) {
  assert(preds.at(pair.first).size() == pair.second.size());
  set_subset(preds.at(pair.first).mean, pair.second, &mean);
  set_subset(preds.at(pair.first).covariance.diagonal(), pair.second,
             &variance);
  if (preds.at(pair.first).has_covariance()) {
Do we want to do this, or just add 1e6 as variances where we currently have none?
In this method we're taking a pair of MarginalDistributions and concatenating them, so if neither of them has a defined covariance then we want to preserve that in the concatenation. Somewhere on my list of want-to-dos is to remove the optional behavior for covariances in favor of a third distribution type, something like:
using MeanOnlyDistribution = Distribution<Empty>;
using MarginalDistribution = Distribution<DiagonalMatrixXd>;
using JointDistribution = Distribution<Eigen::MatrixXd>;
or something along those lines, but that's out of scope here.
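For illustration, here's a rough standalone sketch of the concatenation behavior described above, with hypothetical names rather than the actual albatross types; the point is just that the result only carries a covariance when both inputs do:

#include <Eigen/Dense>

// Hypothetical stand-in for a marginal distribution whose covariance may
// be absent; not the actual albatross definition.
struct SketchMarginal {
  Eigen::VectorXd mean;
  bool has_covariance;
  Eigen::VectorXd variance; // diagonal of the covariance, when present
};

// Concatenate two marginals, preserving "no covariance" when either input
// lacks one.
SketchMarginal concatenate(const SketchMarginal &a, const SketchMarginal &b) {
  SketchMarginal out;
  out.mean.resize(a.mean.size() + b.mean.size());
  out.mean << a.mean, b.mean;
  out.has_covariance = a.has_covariance && b.has_covariance;
  if (out.has_covariance) {
    out.variance.resize(a.variance.size() + b.variance.size());
    out.variance << a.variance, b.variance;
  }
  return out;
}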
const JointDistribution &prediction) const {
  const NearestNeighborModel<DistanceMetric> m(*this);
  MarginalDistribution marginal_pred(
      prediction.mean, prediction.covariance.diagonal().asDiagonal());
Wouldn't prediction.covariance work here? Or are you looking to zero the non-diagonal elements?
Yeah exactly, I need to zero the non-diagonal elements since the NearestNeighbor model can never actually predict off-diagonal terms.
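As a side note, a minimal Eigen snippet (standalone, not albatross code) showing what diagonal().asDiagonal() does to the off-diagonal terms:

#include <Eigen/Dense>
#include <iostream>

int main() {
  // A small joint covariance with non-zero off-diagonal terms.
  Eigen::MatrixXd joint(2, 2);
  joint << 1.0, 0.5,
           0.5, 2.0;
  // Taking the diagonal and re-expanding it keeps only the variances,
  // zeroing the off-diagonal elements.
  Eigen::MatrixXd marginal = joint.diagonal().asDiagonal();
  std::cout << marginal << std::endl; // 1 0
                                      // 0 2
}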
std::size_t min_index = 0;
double min_distance = distance_metric(ref, features[0]);

for (std::size_t i = 1; i < features.size(); ++i) {
Could we turn min_distance into an optional so the loop can start at 0? The only difference in the loop would be a !min_distance check at the start of the if.
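A rough sketch of what that could look like, using C++17's std::optional purely for illustration (a hypothetical free function, not the PR's actual code):

#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: an empty optional stands in for "no distance seen
// yet", so the loop can start at index 0.
template <typename DistanceMetric, typename FeatureType>
std::size_t nearest_index(const DistanceMetric &distance_metric,
                          const FeatureType &ref,
                          const std::vector<FeatureType> &features) {
  std::size_t min_index = 0;
  std::optional<double> min_distance;
  for (std::size_t i = 0; i < features.size(); ++i) {
    const double distance = distance_metric(ref, features[i]);
    if (!min_distance || distance < *min_distance) {
      min_distance = distance;
      min_index = i;
    }
  }
  return min_index;
}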
I like that pattern better too ... but so far albatross doesn't use any optionals! So we'd have to add a third-party lib for it, which I've been avoiding (though perhaps the time has come).
An alternative is to initialize min_distance to DBL_MAX or some such, so it will always be replaced by the first distance.
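For comparison, the same hypothetical helper with that initialization (again a sketch, not the PR's actual code):

#include <cfloat>
#include <cstddef>
#include <vector>

// Hypothetical helper: seeding min_distance with DBL_MAX guarantees the
// first real distance replaces it, so this loop can also start at index 0.
template <typename DistanceMetric, typename FeatureType>
std::size_t nearest_index(const DistanceMetric &distance_metric,
                          const FeatureType &ref,
                          const std::vector<FeatureType> &features) {
  std::size_t min_index = 0;
  double min_distance = DBL_MAX;
  for (std::size_t i = 0; i < features.size(); ++i) {
    const double distance = distance_metric(ref, features[i]);
    if (distance < min_distance) {
      min_distance = distance;
      min_index = i;
    }
  }
  return min_index;
}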
Adds a model which (given a distance metric) will produce predictions for the nearest neighbor.
https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation
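In rough terms, and independent of the albatross API, nearest-neighbor interpolation just returns the training target whose feature is closest to the query under the chosen metric; a minimal hypothetical sketch:

#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical illustration with scalar features and absolute distance:
// the prediction for a query is the target of its closest training point.
double nearest_neighbor_predict(const std::vector<double> &train_features,
                                const std::vector<double> &train_targets,
                                double query) {
  std::size_t min_index = 0;
  double min_distance = std::fabs(query - train_features[0]);
  for (std::size_t i = 1; i < train_features.size(); ++i) {
    const double distance = std::fabs(query - train_features[i]);
    if (distance < min_distance) {
      min_distance = distance;
      min_index = i;
    }
  }
  return train_targets[min_index];
}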