
Added GLM #5004

Closed
wants to merge 13 commits into from
80 changes: 80 additions & 0 deletions src/shogun/regression/GLM.cpp
@@ -0,0 +1,80 @@
#include <shogun/lib/config.h>

#include <shogun/labels/RegressionLabels.h>
#include <shogun/lib/SGVector.h>
#include <shogun/mathematics/Math.h>
#include <shogun/mathematics/linalg/LinalgNamespace.h>
#include <shogun/regression/GLM.h>
#include <utility>
using namespace shogun;

GLM::GLM() : LinearMachine()
{
    init();
}
SGVector<float64_t> log_likelihood(
    const std::shared_ptr<DenseFeatures<float64_t>>& features,
    const std::shared_ptr<Labels>& label)
{
    auto vector_count = features->get_num_vectors();
    auto feature_count = features->get_num_features();
    ASSERT(vector_count > 0 && label->get_num_labels() == vector_count)
    // Array of Lambdas
    SGVector<float64_t> lambda(vector_count);
    for (auto i = 0; i < vector_count; i++)

Member: you don't need to loop here if using dense features. You can do a matrix-vector product.

Contributor Author: there is no log operation in linalg, so I don't think it can be done without a loop.
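
A minimal sketch of the vectorized form being suggested, assuming linalg::matrix_prod accepts an SGMatrix and an SGVector with a transpose flag (the weight vector w here is a stand-in, not defined in this diff):

// Hypothetical: one matrix-vector product instead of per-row dot products.
SGMatrix<float64_t> X = features->get_feature_matrix(); // feature_count x vector_count
SGVector<float64_t> z = linalg::matrix_prod(X, w, true); // z[i] = x_i . w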

    {
        SGVector<float64_t> feature_vector = features->get_feature_vector(i);

Member: auto

Contributor Author: Done.

        // Assume beta is the same as the feature vector
        SGVector<float64_t> beta = feature_vector.clone();
        // Assume beta0 is the same as the first element in the feature vector

Member: we don't do that; this is a trick to make the math look easier, but please treat bias and weights separately.

Contributor Author: you mean use bias and m_w for weights?

Contributor: yes, for example; the point is, you should not assume the bias is the first element of the feature vector.

Contributor Author: @lgoetz yes, I have changed that in my last commit, thanks :)

        float64_t b0 = feature_vector.get_element(0);
        float64_t res = linalg::dot(feature_vector, beta);

Member: don't do a loop here; use a matrix-vector product.

Contributor Author: Done.

        lambda.set_element(log(1 + std::exp(b0 + res)), i);
    }
    SGVector<float64_t> likelihood(vector_count);
    SGVector<float64_t> labels = label->get_values();
    SGVector<float64_t> log_lambda(vector_count);

    for (auto i = 0; i < vector_count; i++)
        log_lambda.set_element(log(lambda.get_element(i)), i);

Member: don't we have element-wise log in linalg? If not, we should have it.

Contributor Author: I don't think there is. I checked the documentation and didn't find anything for it.

Contributor: see src/shogun/distributions/LinearHMM.cpp for example

Member: If it is not in linalg, add it there. We have something for element-wise ops in there iirc. I don't think LinearHMM is a good place to look .. it is way too old school ;)

Contributor Author: @karlnapf do I start another PR and add it?

Contributor (re the LinearHMM suggestion above): sorry ;)
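
For reference, a minimal element-wise log helper along the lines discussed could look like the sketch below; it assumes SGVector exposes begin()/end() iterators and a vlen length field, and it does not claim linalg already ships such an op:

#include <algorithm>
#include <cmath>

// Hypothetical helper: element-wise natural log over an SGVector.
// If linalg gains this op, the explicit loop above collapses into one call.
SGVector<float64_t> elementwise_log(SGVector<float64_t> v)
{
    SGVector<float64_t> result(v.vlen);
    std::transform(v.begin(), v.end(), result.begin(),
        [](float64_t x) { return std::log(x); });
    return result;
}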


    likelihood = linalg::add(
        linalg::element_prod(labels, log_lambda), lambda, 1.0, -1.0);
    SGVector<float64_t> likelihood_clone = likelihood.clone();
    for (auto i = 0; i < vector_count; i++)
        likelihood.set_element(
            SGVector<float64_t>::sum(likelihood_clone, i), i);
    return likelihood;
}
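
For reference, a hedged reading of what this function computes: each lambda entry is a softplus rate, and the returned vector matches the Poisson log-likelihood contributions with the constant term dropped, i.e.

\ell_i = y_i \log \lambda_i - \lambda_i,
\qquad \lambda_i = \log\left(1 + e^{\beta_0 + x_i \cdot \beta}\right)

(the constant -\log y_i! does not depend on the parameters and is omitted).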
void GLM::init()
{
    SG_ADD(
        &m_alpha, "alpha", "Weighting parameter between L1 and L2 Penalty",
        ParameterProperties::HYPER);
    SG_ADD(
        &m_lambda, "lambda", "Regularization parameter lambda",
        ParameterProperties::HYPER);
    SG_ADD(
        (std::shared_ptr<SGObject>*)&m_descend_updater, "descend_updater",
        "Descend Updater used for updating weights",
        ParameterProperties::SETTING);

Comment on lines +85 to +88

Contributor Author: as for those lines, I think that's what's causing the build to fail

Contributor Author: any thoughts? @karlnapf

Member: Not sure what you mean

Contributor Author: I get an error when I test, as you can tell in the CI, in the SGObject test, so I have a feeling this is what's causing it. What do you think?

Member: when you report such a thing, could you please make sure that you copy here the link to the CI line where you think the error is....

Member: The issue is probably that m_descend_updater (DescendUpdater) is never initialised, so you are serialising a nullptr, and I think that might be causing issues... @vigsterkr ?

    SG_ADD(
        &m_family, "family", "Distribution Family used",
        ParameterProperties::SETTING);
    SG_ADD(
        &m_link_fn, "link_fn", "Link function used",
        ParameterProperties::SETTING);

Member: you need to initialize all values to defaults here. This brings up an interesting question: what is the default descend updater? I think it would be good to have one set so that users are not forced to pass one (tedious).

Contributor Author: I think since these are just linear models at the end of the day, there aren't really a lot of parameters to learn like in neural networks, so SGD would work best here, which I believe is GradientDescendUpdater.

Member: The loss is convex (for Poisson regression with log link function), so a second-order method like Newton will be better. But we can change that later; best is to start with something simple, then work it up from there.

}
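
Re the default-updater question above, one possibility is a sketch like the following (GradientDescendUpdater lives in shogun/optimization; whether it is the right default is exactly what the thread leaves open):

// Hypothetical default in GLM::init(), so callers need not pass an updater
// (needs #include <shogun/optimization/GradientDescendUpdater.h>).
m_descend_updater = std::make_shared<GradientDescendUpdater>();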
GLM::GLM(
    const std::shared_ptr<DescendUpdater>& descend_updater,
    DistributionFamily family, LinkFunction link_fn, float64_t alpha,
    float64_t lambda)
    : LinearMachine()

Member: call GLM() instead of LinearMachine()... this way you don't need the redundant call to init()....

{
    m_alpha = alpha;
    m_lambda = lambda;
    m_link_fn = link_fn;
    m_descend_updater = descend_updater;
    m_family = family;
    init();

Member: remove it once you've fixed the ctor I mentioned above.

}
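
The delegating-constructor form being requested above, as a sketch (C++11 delegation; GLM() already calls init(), so the explicit init() call disappears):

GLM::GLM(
    const std::shared_ptr<DescendUpdater>& descend_updater,
    DistributionFamily family, LinkFunction link_fn, float64_t alpha,
    float64_t lambda)
    : GLM() // delegate: init() runs exactly once
{
    m_alpha = alpha;
    m_lambda = lambda;
    m_link_fn = link_fn;
    m_descend_updater = descend_updater;
    m_family = family;
}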
91 changes: 91 additions & 0 deletions src/shogun/regression/GLM.h
@@ -0,0 +1,91 @@
/*
* This software is distributed under BSD 3-clause license (see LICENSE file).
*
* Authors: Ahmed Khalifa
*/

#ifndef _GENERALIZEDLINEARMODEL_H__
#define _GENERALIZEDLINEARMODEL_H__

#include <shogun/lib/config.h>

#include <shogun/features/Features.h>
#include <shogun/machine/FeatureDispatchCRTP.h>
#include <shogun/machine/LinearMachine.h>
#include <shogun/mathematics/linalg/LinalgNamespace.h>
#include <shogun/optimization/DescendUpdater.h>
#include <shogun/regression/Regression.h>

namespace shogun
{
    enum DistributionFamily
    {
        NORMAL_DISTRIBUTION,
        EXPONENTIAL_DISTRIBUTION,
        GAMMA_DISTRIBUTION,
        BINOMIAL_DISTRIBUTION,

Member: As I said before, I am not very sure about representing these things as enums; this will lead to spaghetti code with that many. BUT you can leave it for now and just focus on the Poisson regression. Once that is done we can think about this again.

Contributor: We could make the GLM class accept a GLMDistribution object instead. It would contain the log-likelihood and gradients of the given distribution (e.g., GLMDistributionPoisson). Then the GLM class would only call the methods of the GLMDistribution to train itself. However, this is obviously out of the scope of this PR.

Member: that, or have locally defined helper classes that are then instantiated based on an enum. But the code for single cases (e.g. Poisson) should be in a single place (likelihood contribution, gradients, etc.)

        GAUSS_DISTRIBUTION,
        POISSON_DISTRIBUTION
    };
    enum LinkFunction
    {
        LOG,
        LOGIT,
        IDENTITY,
        INVERSE
    };
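
Re the GLMDistribution idea in the thread above, a minimal interface sketch (hypothetical names, assuming <cmath> and shogun's float64_t; out of scope for this PR, as noted):

// Hypothetical: one class per distribution keeps the log-likelihood and its
// gradient in a single place, instead of switching on an enum.
class GLMDistribution
{
public:
    virtual ~GLMDistribution() = default;
    virtual float64_t log_likelihood(float64_t y, float64_t lambda) const = 0;
    virtual float64_t grad_log_likelihood(float64_t y, float64_t lambda) const = 0;
};

class GLMDistributionPoisson : public GLMDistribution
{
public:
    // Poisson contribution y*log(lambda) - lambda, constant -log(y!) dropped.
    float64_t log_likelihood(float64_t y, float64_t lambda) const override
    {
        return y * std::log(lambda) - lambda;
    }
    // Derivative of the above with respect to lambda.
    float64_t grad_log_likelihood(float64_t y, float64_t lambda) const override
    {
        return y / lambda - 1.0;
    }
};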
    /** @brief Class GLM implements Generalized Linear Models, such as poisson,
     * gamma, binomial
     */
    class GLM : public LinearMachine
    {
    public:
        /** problem type */
        MACHINE_PROBLEM_TYPE(PT_REGRESSION);

        GLM();
        SGVector<float64_t> log_likelihood(
            const std::shared_ptr<DenseFeatures<float64_t>>& features,
            const std::shared_ptr<Labels>& label);
        /** Constructor
         *
         * @param descend_updater chosen descend updater algorithm
         * @param family the distribution family
         * @param link_fn the link function
         * @param alpha weighting parameter between L1 and L2 penalty
         * @param lambda regularization parameter lambda
         */
        GLM(const std::shared_ptr<DescendUpdater>& descend_updater,
            DistributionFamily family, LinkFunction link_fn, float64_t alpha,
            float64_t lambda);

        virtual ~GLM(){};

Member: I have explicitly stated that these should be override instead of virtual. You have marked this resolved, but it has not been changed at all....


        /** train model
         * @param data training data
         * @return whether training was successful
         */
        virtual bool train_machine(std::shared_ptr<Features> data = NULL)

Member: it's still protected... and as said before, if you were using override then you would actually get an error

        {
            return true;
        };

        /** @return object name */
        virtual const char* get_name() const

Member: override...

        {
            return "GLM";
        }
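
The style being asked for in the two comments above, as a sketch (override makes the compiler check the base-class signature, so a mismatch becomes a compile error instead of a silent new virtual):

bool train_machine(std::shared_ptr<Features> data = NULL) override;
const char* get_name() const override
{
    return "GLM";
}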

    protected:
        std::shared_ptr<DescendUpdater>
            m_descend_updater; // TODO: choose default value
        float64_t m_alpha = 0.5;
        float64_t m_lambda = 0.1;
        DistributionFamily m_family = POISSON_DISTRIBUTION;
        LinkFunction m_link_fn = LOG;

    private:
        void init();
    };
} // namespace shogun
#endif /* _GENERALIZEDLINEARMODEL_H__ */
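
A hypothetical usage sketch once training is implemented (train_machine is still a stub in this diff; set_labels/train/apply_regression are assumed to come from the Machine base class):

auto updater = std::make_shared<GradientDescendUpdater>();
auto glm = std::make_shared<GLM>(
    updater, POISSON_DISTRIBUTION, LOG, /*alpha=*/0.5, /*lambda=*/0.1);
glm->set_labels(labels);
glm->train(features);
auto predictions = glm->apply_regression(test_features);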