Basic Machine Learning Algorithms Implementation

Implementations of a few basic machine learning algorithms: frequent pattern mining with Apriori, supervised classification with K-Nearest Neighbours, and unsupervised clustering with K-means. The purpose was to apply the implemented algorithms to messy, unstructured data gathered through a survey that placed minimal constraints on valid answers. That is, format, length and type were deliberately left unconstrained when collecting the data, to make pre-processing and using the data more challenging and closer to real life.

One question was: what computer games have you played?
With Apriori I found the rule {Angry Birds} => {Rocket League, Minecraft} with 100% confidence, i.e. every respondent in the survey who has played Angry Birds has also played Rocket League and Minecraft. Other closely related patterns can be observed below.
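As a rough illustration of what the confidence of such a rule means, here is a minimal sketch that computes support and confidence directly from a list of answers. It is not the Apriori implementation from the repository; the function names and the toy answers are assumptions made up for this example and are not the survey data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of antecedent => consequent: support(A and C) / support(A)."""
    ante = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0


# Hypothetical toy answers, NOT the real survey data.
answers = [
    frozenset({"Angry Birds", "Rocket League", "Minecraft"}),
    frozenset({"Angry Birds", "Rocket League", "Minecraft", "FIFA"}),
    frozenset({"Minecraft", "FIFA"}),
    frozenset({"Rocket League"}),
]

rule_conf = confidence(answers, {"Angry Birds"}, {"Rocket League", "Minecraft"})
print(rule_conf)
```

In this toy set both respondents who played Angry Birds also played the other two games, so the rule holds for every case where its left-hand side applies. Apriori itself adds the step of pruning candidate itemsets whose subsets are already infrequent.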

Subjects also answered how many programming languages they know, what their preferred OS is, and how many computer games they have played. Looking at the histograms of the first and last of these, we might be able to use them to predict which programme the subjects are studying.

With these three attributes, 11-NN predicted the degree correctly 70% of the time.
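The idea behind k-NN can be sketched in a few lines: find the k training examples closest to the query and take a majority vote over their labels. The sketch below is an assumption-laden illustration, not the repository code: it assumes Euclidean distance, a numeric encoding of the preferred OS, and made-up degree names and data points.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=11):
    """Predict the label of query by majority vote among its k nearest neighbours."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical examples: (languages known, OS encoded as a number, games played).
train_X = [(1, 0, 2), (2, 0, 3), (1, 1, 4), (6, 1, 10), (7, 2, 12), (5, 2, 9)]
train_y = ["design", "design", "design", "software", "software", "software"]

# k=3 here only because the toy set is tiny; the text above uses k=11.
prediction = knn_predict(train_X, train_y, (6, 1, 11), k=3)
print(prediction)
```

Encoding a categorical attribute such as the OS as a number imposes an artificial ordering, and attributes on larger scales dominate the distance, so in practice some normalisation or a different distance measure may be preferable.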
With K-means I tried to cluster for shoe size, using age, gender and height as the features. The set consisted of only 84 examples, which is not a lot for clustering, yet it still reaches a purity score of around 70%, where the majority class of each cluster is taken as the cluster's true class. Implementations can be seen below; contact me for the data.
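The purity score described above can be computed as follows. This is a minimal sketch under the stated definition (each cluster is credited with its majority class); the function name, the cluster assignments and the shoe sizes are hypothetical and are not taken from the survey.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Share of examples whose true label equals the majority label of their cluster."""
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        members[cluster].append(label)
    # For each cluster, count how many members carry its most common label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


# Hypothetical cluster assignments and shoe sizes, NOT the survey data.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
shoe_sizes = [38, 38, 38, 40, 42, 42, 42, 42, 40, 38]

score = purity(cluster_ids, shoe_sizes)
print(score)
```

Purity always rises as the number of clusters grows (one cluster per example gives 100%), so it is best compared across runs that use the same number of clusters.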

https://github.com/andbis/basicAlgorithms