Review: LinkedIn Learning short course “Machine Learning and AI Foundations: Decision Trees”

This is the second machine learning course that I have attempted, and the first through LinkedIn Learning. I must admit that I have yet to write a review of the first course because I am unsure whether I took away any deep understanding of what it was trying to teach. One day I will explore why. This course, however, proved a lot better.

A little background rambling first. LinkedIn Learning is great. It is basically a giant online training facility with a billion (slight exaggeration) courses on a whole range of subjects. I’ve explored writing courses, leadership courses, application courses, game design courses and here I am doing a machine learning course. If you are not currently a member then purchase membership. Alternatively, there are corporate memberships, so it may be worthwhile asking your training manager whether your organisation has one because then you can use their account for training. Great stuff!

Secondly, the course is by Keith McCormick, a data mining consultant, trainer, speaker, and author. According to his website he has been using stats software tools since the early 90s, and has been training since 1997. Throughout the course I really got a sense of his mastery of the topic.

OK – into the meat of this post. The course “Machine Learning and AI Foundations: Decision Trees” claims to be an hour and a quarter course with the following learning outcomes:

  • Using the SPSS Modeler
  • Building a CHAID model
  • Adding a second model with C&RT
  • Analysis notes
  • Using a lift and gains chart
  • Exploring algorithms
  • Building a tree interactively
  • The Bonferroni adjustment
  • Handling nominal, ordinal, and continuous variables
  • Examining the CHAID tree
  • The Gini coefficient
  • Weighing purity and balance
  • Understanding pruning
  • Examining the C&RT tree
  • Applying stopping rules
  • Using the Auto Classifier tuning trick

In plain English, decision trees are a way of predicting outcomes. So the example data in this course revolves around passengers on the Titanic and how well we can predict who will survive the iceberg and who will not.
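To make "a way of predicting outcomes" a little more concrete, here is a minimal Python sketch of what a finished tree boils down to: a chain of if/else questions. It is not taken from the course, which builds its trees in SPSS Modeler rather than in code. The rows, the split rules and the names `ROWS`, `predict` and `accuracy` are all invented for illustration and are not real Titanic records or the course's actual model.

```python
# Toy rows invented for illustration; these are NOT real Titanic records.
# Each row is (sex, passenger_class, survived).
ROWS = [
    ("female", 1, True),
    ("female", 2, True),
    ("female", 3, False),
    ("male", 1, True),
    ("male", 2, False),
    ("male", 3, False),
    ("female", 3, True),
    ("male", 3, False),
]


def predict(sex, passenger_class):
    """A hand-written two-level tree (hypothetical rules, not a fitted model)."""
    # First split: sex.
    if sex == "female":
        return True
    # Second split, on the male branch only: passenger class.
    return passenger_class == 1


def accuracy(rows):
    """Fraction of rows where the tree's prediction matches the outcome."""
    hits = sum(predict(sex, cls) == survived for sex, cls, survived in rows)
    return hits / len(rows)


if __name__ == "__main__":
    print(f"accuracy on the toy rows: {accuracy(ROWS):.3f}")
```

Algorithms such as CHAID and C&RT differ in how they choose which question to ask at each split; once the tree exists, using it to predict looks much like the function above.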

The entire hour and a quarter bit bends the truth a little because first you have to download and install the software (Watson Studio which contains SPSS) from IBM. It is essentially a 30 day free trial, so you need to make sure you download it and complete the course in that time. With my connection it took me about an hour to download and install. I then found that the interface was different to that in the course (newer version of the software) so I had to figure some stuff out for myself on how to use it. I was also unable to load the sample streams and so reconstructed them from the data (not very hard to do).

Now at this point it may be tempting to just sit back and say “I’ll simply watch him doing this on the videos. No need to follow along because it all sounds too hard.”

No! No! No! I got so much more from this excellent course by following along myself in the software. In fact, because the software in the course was a slightly older version I think I learned a lot more because I had to use my brain from time to time instead of simply duplicating keystrokes etc.

Ah! Did I give something away? Yes, I did. The fact that I believe this is an excellent course. I really did get a better understanding of a few things from it. Going through the course, building the stream you can see in the picture below, I really became quite adept at adding the various modules to see the impact they would have.

The course is split into four parts:

  • Decision trees in SPSS
  • Understanding CHAID
  • Understanding C&RT
  • Improving your model

I personally found the explanations to be very clear. Having studied probability and statistics (albeit a while ago at university), I know of approaches such as Chi-squared, but I found this explanation to be one of the easiest to understand ever. I also enjoyed how interesting the tutor made some of his explanations. Here is a screenshot of him discussing the Gini coefficient.
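For anyone who has not met the Gini measure before, the arithmetic is small enough to sketch in a few lines of Python. This is not from the course (which does everything inside SPSS Modeler); it assumes the usual Gini impurity definition used by C&RT-style trees, one minus the sum of squared class proportions, and the function names `gini_impurity` and `weighted_gini` are made up for this example.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a group: 0.0 means perfectly pure (one class only)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def weighted_gini(left, right):
    """Impurity of a two-way split, each branch weighted by its share of rows."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


if __name__ == "__main__":
    # Invented labels purely to show the arithmetic.
    parent = ["yes"] * 4 + ["no"] * 4
    left = ["yes"] * 3 + ["no"]
    right = ["yes"] + ["no"] * 3
    print("parent impurity:", gini_impurity(parent))
    print("split impurity: ", weighted_gini(left, right))
```

A split is attractive when the weighted impurity of the branches is lower than the impurity of the parent group, meaning the branches are purer than what you started with.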

Conclusion

I enjoyed this course and felt that I learned a lot from it. While the videos come to around an hour and a quarter, I would actually dedicate a day. Part of that goes on downloading and installing, but it is also because you want to duplicate everything that he does in the videos AND you want to give yourself time to absorb it. I stopped every so often to just think about what he was saying because the content was textually dense (i.e. every sentence significant and requiring understanding).

Definitely a 10/10.