
Why does the tree change when non-splitting variables are dropped?


A28. A variable that never enters the tree as a primary node splitter may still play an important role as a surrogate splitter. If you have turned off the display of surrogate splitters, you will not see how these variables affect the tree, but CART still uses them internally when applying the tree to data. The Variable Importance Table produced by CART ranks the variables in the tree by their importance, a statistic measuring how strongly a variable acts as a primary or surrogate splitter. Suppose a variable enters the tree as the top surrogate splitter in many nodes, but never as the primary splitter. If this variable is removed from the list of potential predictor variables and the tree is rebuilt, the result will probably be a very different tree, and certainly will be if the primary node-splitting variables have missing values in the data.

Steinberg, Dan and Colla, Phillip. CART–Classification and Regression Trees. San Diego, CA: Salford Systems, 1997.
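The effect of a surrogate on cases with missing primary values can be illustrated with a small sketch. This is not CART's actual implementation: the function names, the toy `income` and `age` columns, the thresholds, and the fallback of sending a case in the node's majority direction when no surrogate is available are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

Value = Optional[float]  # None stands for a missing value


def surrogate_agreement(primary: Sequence[Value], surrogate: Sequence[Value],
                        p_thr: float, s_thr: float) -> float:
    """Fraction of rows with both values present that the surrogate
    sends in the same direction as the primary split."""
    pairs = [(p, s) for p, s in zip(primary, surrogate)
             if p is not None and s is not None]
    if not pairs:
        return 0.0
    same = sum((p <= p_thr) == (s <= s_thr) for p, s in pairs)
    return same / len(pairs)


def goes_left(p: Value, s: Value, p_thr: float, s_thr: float,
              majority_left: bool) -> bool:
    """Route one case: primary split if available, else the surrogate,
    else the (assumed) majority direction of the node."""
    if p is not None:
        return p <= p_thr
    if s is not None:
        return s <= s_thr
    return majority_left


# Hypothetical data: income is the primary splitter, age the top surrogate.
income: List[Value] = [20, 35, None, 60, 80, None]
age: List[Value] = [22, 30, 25, 50, 61, 58]
P_THR, S_THR = 40, 40

agreement = surrogate_agreement(income, age, P_THR, S_THR)

# With the surrogate available, cases missing income are routed by age.
with_surrogate = [goes_left(p, s, P_THR, S_THR, majority_left=True)
                  for p, s in zip(income, age)]

# With age dropped from the predictors, those cases fall back to the default.
without_surrogate = [goes_left(p, None, P_THR, S_THR, majority_left=True)
                     for p in income]

print(agreement)
print(with_surrogate)
print(without_surrogate)
```

In this toy data the last case has a missing income and is sent right when age is available but left once age is dropped. Every node below the split then sees different cases, which is one way the rebuilt tree can diverge.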

