Description
Hi,
I am building an EBM regressor with only main effects, using a mix of nominal and continuous features. To make the plots easier to read, I tried ordering the nominal features by the mean of the response. For example, if a feature has levels named ["cat", "dog", "fish"], the transformed levels would be ["2-cat", "0-dog", "1-fish"], where the number at the start indicates the level's rank with respect to the mean of the response.
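To make the renaming concrete, here is a minimal sketch of the kind of transformation described above. It assumes ranks are assigned by ascending mean response; the helper name `rank_prefix_levels` and the toy data are hypothetical and only for illustration.

```python
from collections import defaultdict


def rank_prefix_levels(levels, response):
    """Map each level to '<rank>-<level>', ranked by ascending mean response.

    Hypothetical helper for illustration; ties are broken by level name.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, y in zip(levels, response):
        sums[level] += y
        counts[level] += 1

    # Mean of the response within each level.
    means = {level: sums[level] / counts[level] for level in sums}

    # Sort levels by mean response, then by name to keep the order stable.
    ordered = sorted(means, key=lambda level: (means[level], level))
    return {level: f"{rank}-{level}" for rank, level in enumerate(ordered)}


# Toy data (made up): "dog" has the lowest mean response, "cat" the highest.
pets = ["cat", "dog", "fish", "cat", "dog", "fish"]
y = [9.0, 1.0, 4.0, 11.0, 3.0, 6.0]

mapping = rank_prefix_levels(pets, y)
transformed = [mapping[p] for p in pets]
print(mapping)
```

Only the level names change; each row still belongs to the same group of observations as before.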
After this transformation, the scores of the renamed feature levels change a lot. One categorical feature whose importance was quite low before the transformation became the most important feature afterwards.
I didn't specify the feature_types parameter, and the categorical features have type "nominal" when I inspect feature_types_in_.
As far as I know, EBM uses the Fisher method for nominal categoricals, so the order shouldn't matter.
Can you please clarify the underlying cause of this discrepancy in the model results?
Take care,