Order of nominal feature levels changes the results #629

Hi,

I am building an EBM regressor with a mix of nominal and continuous features, using main effects only. To get a better visualization, I tried ordering the levels of each nominal feature by the mean of the response. For example, if a feature has levels named ["cat", "dog", "fish"], the transformed levels would be ["2-cat", "0-dog", "1-fish"], where the number at the start indicates the level's rank with respect to the mean of the response.
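The transformation described above can be sketched roughly as follows. This is an illustration rather than the exact code used: `rank_prefix_levels` and the toy data are hypothetical, and ranking in ascending order of the mean is an assumption chosen to match the ["2-cat", "0-dog", "1-fish"] example.

```python
from collections import defaultdict


def rank_prefix_levels(levels, response):
    """Map each level to '<rank>-<level>', ranked by ascending mean response.

    Hypothetical helper for illustration; ties are broken by level name.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for level, y in zip(levels, response):
        totals[level] += y
        counts[level] += 1
    means = {level: totals[level] / counts[level] for level in totals}
    ordered = sorted(means, key=lambda level: (means[level], level))
    return {level: f"{rank}-{level}" for rank, level in enumerate(ordered)}


# Toy data (made up): dog has the lowest mean response, cat the highest.
pets = ["cat", "dog", "fish", "cat", "dog", "fish"]
y = [9.0, 1.0, 4.0, 11.0, 3.0, 6.0]

mapping = rank_prefix_levels(pets, y)
transformed = [mapping[p] for p in pets]
```

Note that the only thing this changes from the model's point of view is the level names, and therefore their alphabetical order; the grouping of rows into levels is untouched.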

With this transformation, the scores of the transformed feature levels change substantially. One categorical feature whose importance was quite low before the transformation became the most important feature after it.

I didn't specify the feature_types parameter, and the categorical features have type "nominal" when I inspect feature_types_in_.

As far as I know, EBM uses the Fisher method for nominal categoricals, so the order of the levels shouldn't matter.
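To make that expectation concrete, here is a minimal sketch of a Fisher-style grouping: levels are sorted by their mean response and only contiguous splits of that sorted order are considered. It is not the interpret library's implementation, and whether EBM actually behaves this way is exactly what is being asked; `best_binary_grouping` and the toy data are hypothetical.

```python
from collections import defaultdict


def best_binary_grouping(levels, response):
    """Sort levels by mean response, then pick the contiguous split that
    minimizes the within-group sum of squared errors (SSE).

    Returns (cost, left_levels, right_levels).
    """
    groups = defaultdict(list)
    for level, y in zip(levels, response):
        groups[level].append(y)
    # The ordering depends only on the response, never on the level names.
    ordered = sorted(groups, key=lambda lv: sum(groups[lv]) / len(groups[lv]))

    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for cut in range(1, len(ordered)):
        left = [y for lv in ordered[:cut] for y in groups[lv]]
        right = [y for lv in ordered[cut:] for y in groups[lv]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, frozenset(ordered[:cut]), frozenset(ordered[cut:]))
    return best


# Toy data (made up) and the renaming from the example in this issue.
pets = ["cat", "dog", "fish", "cat", "dog", "fish"]
y = [9.0, 1.0, 4.0, 11.0, 3.0, 6.0]
rename = {"cat": "2-cat", "dog": "0-dog", "fish": "1-fish"}

cost_a, left_a, right_a = best_binary_grouping(pets, y)
cost_b, left_b, right_b = best_binary_grouping([rename[p] for p in pets], y)
```

Under this scheme the renamed levels produce the same partition and the same cost as the original ones, which is why a change in the results after renaming is surprising.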

Can you please clarify the underlying cause of this discrepancy in the model results?

Take care,
