chaid regression tree to table conversion in r

0 votes

I used the CHAID package from this link ..It gives me a chaid object which can be plotted..I want a decision table with each decision rule in a column instead of a decision tree. .But i dont understand how to access nodes and paths in this chaid object..Kindly help me.. I followed the procedure given in this link

i cant post my data here since it is too long.So i am posting a code which takes the sample dataset provided with chaid to perform the task.

copied from help manual of chaid:

library("CHAID")

  ### fit tree to subsample
  set.seed(290875)
  USvoteS <- USvote[sample(1:nrow(USvote), 1000),]

  ctrl <- chaid_control(minsplit = 200, minprob = 0.1)
  chaidUS <- chaid(vote3 ~ ., data = USvoteS, control = ctrl)

  print(chaidUS)
  plot(chaidUS)

Output:

Model formula:
vote3 ~ gender + ager + empstat + educr + marstat

Fitted party:
[1] root
|   [2] marstat in married
|   |   [3] educr <HS, HS, >HS: Gore (n = 311, err = 49.5%)
|   |   [4] educr in College, Post Coll: Bush (n = 249, err = 35.3%)
|   [5] marstat in widowed, divorced, never married
|   |   [6] gender in male: Gore (n = 159, err = 47.8%)
|   |   [7] gender in female
|   |   |   [8] ager in 18-24, 25-34, 35-44, 45-54: Gore (n = 127, err = 22.0%)
|   |   |   [9] ager in 55-64, 65+: Gore (n = 115, err = 40.9%)

Number of inner nodes:    4
Number of terminal nodes: 5

So my question is how to get this tree data in a decision table with each decision rule(branch/path) in a column..I dont understand how to access different tree paths from this chaid object..

Apr 11 in Machine Learning by Nandini
• 5,480 points
20 views

1 answer to this question.

0 votes

Partykit (recursive partitioning) tree structures are used by the CHAID programme. Party nodes can be used to walk the tree; a node can be terminal or have a list of nodes with information about the decision rule (split) and fitted data.

The following code traverses the tree and generates the decision table. It's only been tested on one example tree and was written for demonstration reasons.

tree_table <- function(party_tree) {

  df_list <- list()
  var_names <-  attr( party_tree$terms, "term.labels")
  var_levels <- lapply( party_tree$data, levels)

  walk_the_tree <- function(node, rule_branch = NULL) {
    # depth-first walk on partynode structure (recursive function)
    # decision rules are extracted for every branch
    if(missing(rule_branch)) {
      rule_branch <- setNames(data.frame(t(replicate(length(var_names), NA))), var_names)
      rule_branch <- cbind(rule_branch, nodeId = NA)
      rule_branch <- cbind(rule_branch, predict = NA)
    }
    if(is.terminal(node)) {
      rule_branch[["nodeId"]] <- node$id
      rule_branch[["predict"]] <- predict_party(party_tree, node$id) 
      df_list[[as.character(node$id)]] <<- rule_branch
    } else {
      for(i in 1:length(node)) {
        rule_branch1 <- rule_branch
        val1 <- decision_rule(node,i)
        rule_branch1[[names(val1)[1]]] <- val1
        walk_the_tree(node[i], rule_branch1)
      }
    }
  }

  decision_rule <- function(node, i) {
    # returns split decision rule in data.frame with variable name an values
    var_name <- var_names[node$split$varid[[1]]]
    values_vec <- var_levels[[var_name]][ node$split$index == i]
    values_txt <- paste(values_vec, collapse = ", ")
    return( setNames(values_txt, var_name))
  }
  # compile data frame list
  walk_the_tree(party_tree$node)
  # merge all dataframes
  res_table <- Reduce(rbind, df_list)
  return(res_table)
}

now we call function with the CHAID tree object:

table <- tree_table(chaidUS)

the result looks something like this:

gender   ager                       empstat   educr              marstat                          nodeId   predict  
-------- -------------------------- --------- ------------------ -------------------------------- -------- ---------
NA       NA                         NA        <HS, HS, >HS       married                          3        Gore     
NA       NA                         NA        College, Post Coll married                          4        Bush     
male     NA                         NA        NA                 widowed, divorced, never married 6        Gore     
female   18-24, 25-34, 35-44, 45-54 NA        NA                 widowed, divorced, never married 8        Gore     
female   55-64, 65+                 NA        NA                 widowed, divorced, never married 9        Gore
answered Apr 12 by Dev
• 6,000 points

Related Questions In Machine Learning

0 votes
1 answer
0 votes
1 answer

How to use ICD10 Code in a regression model in R?

Using the concept of comorbidities is a ...READ MORE

answered Apr 12 in Machine Learning by Dev
• 6,000 points
21 views
0 votes
0 answers

How to add random and/or fixed effects into cloglog regression in R

Update question on treatment of one variable ...READ MORE

Apr 11 in Machine Learning by Dev
• 6,000 points
20 views
0 votes
1 answer

How to add regression line equation and R2 on graph?

Below is one solution: # GET EQUATION AND ...READ MORE

answered Jun 1, 2018 in Data Analytics by DataKing99
• 8,240 points
5,847 views
0 votes
1 answer

Data Imputation Packages

These are some packages in R which ...READ MORE

answered Jul 28, 2018 in Data Analytics by Sahiti
• 6,360 points
134 views
0 votes
1 answer

How to export regression equations for grouped data?

First, you'll need a linear model with ...READ MORE

answered Mar 14 in Machine Learning by Dev
• 6,000 points
23 views
0 votes
1 answer
0 votes
1 answer
webinar REGISTER FOR FREE WEBINAR X
Send OTP
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP