This rule raises an issue when an attribute ending with _ is set in the __init__ method of a class inheriting from
Scikit-Learn's BaseEstimator.
On a Scikit-Learn estimator, attributes that have a trailing underscore represent attributes that are estimated. These attributes have to be set in the fit method. Their presence is used to verify if an estimator has been fitted.
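As a rough illustration of how a presence-based check can work, the sketch below scans an object's instance attributes for names with a trailing underscore. It is a simplified stand-in written in plain Python, not Scikit-Learn's actual implementation (the library provides check_is_fitted for this purpose); MeanEstimator and looks_fitted are hypothetical names introduced only for this example.

```python
class MeanEstimator:
    """Toy estimator (hypothetical, not a Scikit-Learn class)."""

    def __init__(self, scale=1.0):
        # Only hyperparameters are stored in __init__.
        self.scale = scale

    def fit(self, X):
        # Estimated attribute: created only when fit runs.
        self.mean_ = self.scale * sum(X) / len(X)
        return self


def looks_fitted(estimator):
    """Hypothetical helper: True if any instance attribute ends with '_'."""
    return any(
        name.endswith("_") and not name.startswith("__")
        for name in vars(estimator)
    )


est = MeanEstimator()
before = looks_fitted(est)                      # no trailing-underscore attribute yet
after = looks_fitted(est.fit([1.0, 2.0, 3.0]))  # fit has created mean_
print(before, after)
```

Because the check relies only on attribute names, it gives a meaningful answer only if such attributes are created in fit and nowhere else.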
from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X, y)
knn.n_samples_fit_
In the example above, the n_samples_fit_ attribute of the KNeighborsClassifier is set only after the estimator’s
fit method is called. Accessing n_samples_fit_ before the estimator is fitted would raise an AttributeError
exception.
When implementing a custom estimator by subclassing Scikit-Learn’s BaseEstimator, it is important to follow the above convention and
not set attributes with a trailing underscore inside the __init__ method.
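To make the consequence concrete, here is a minimal plain-Python sketch of what can go wrong when the convention is broken. EagerEstimator is a hypothetical class that does not inherit from BaseEstimator, and hasattr stands in for a presence-based fitted check; the point is only that an attribute created in __init__ exists before any fitting has happened.

```python
class EagerEstimator:
    """Hypothetical estimator that breaks the trailing-underscore convention."""

    def __init__(self):
        # Noncompliant pattern: the estimated attribute exists before fitting.
        self.mean_ = None

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        return self


unfitted = EagerEstimator()

# A presence-based check finds mean_ and would treat the object as fitted.
appears_fitted = hasattr(unfitted, "mean_")

# Code that relies on the attribute then fails on the placeholder value
# instead of reporting that the estimator was never fitted.
try:
    prediction = unfitted.mean_ + 1.0
    failure = None
except TypeError as exc:
    failure = type(exc).__name__

print(appears_fitted, failure)
```

The error surfaces late and says nothing about the missing fit call, which is why estimated attributes are best left undefined until fit runs.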
To fix this issue, move the attributes with a trailing underscore from the __init__ method to the fit method.
from sklearn.base import BaseEstimator

class MyEstimator(BaseEstimator):
    def __init__(self):
        self.estimated_attribute_ = None  # Noncompliant: an estimated attribute is set in the __init__ method.
from sklearn.base import BaseEstimator

class MyEstimator(BaseEstimator):
    def fit(self, X, y):
        self.estimated_attribute_ = some_estimation(X)  # Compliant
        return self  # By Scikit-Learn convention, fit returns the estimator itself.