added decision tree article

Rohan Sircar 2020-05-28 17:41:36 +05:30
parent 8f6ea55114
commit 3f7657896b
4 changed files with 60 additions and 4 deletions


@ -0,0 +1,60 @@
+++
title = "A tale of Decision Trees, Java and OpenCL"
date = "2019-10-02T17:53:52+05:30"
hide_authorbox = true
disable_comments = true
draft = true
categories = [
"Development"
]
tags = [
"opencl",
"java",
"ai",
"Machine Learning",
"id3",
"Decision Tree",
"gpgpu"
]
opacity = false
sidebar = { "disable" = true }
+++
I worked on a decision tree program in Java for my final year MSc project. In a series of posts such as this one, I'll be highlighting some aspects of its creation and implementation.
A bit of background - this project is of great significance to me, one reason being that it had many firsts: my first proper Java project, my first foray into machine learning and GPGPU programming, and my first GUI program, to mention a few. It was also when I realized that I could actually enjoy programming.
I also had to face significant adversities while working on it, none of which I'd disclose publicly. But yeah, I was going through a bad time, and yet I still managed to finish my project, clear all my exams and get my master's degree - a fact that I'm proud of.
Now for the project itself.
### Decision Tree
A decision tree is a machine learning algorithm that creates a model of the given data set in the form of a tree. This tree can then be used for prediction.
![Decision Tree](/img/decisiontree.png)
#### <p align="center"> Figure - A Decision Tree</p>
A major challenge in implementing this form of decision tree was that the number of children per node is not fixed, so a simple binary tree wouldn't suffice. I had to use an n-ary tree data structure, which I also had to write myself.
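To give a rough idea of what such an n-ary node can look like, here is a minimal sketch. It is not the project's actual code: the class name `DecisionNode`, its methods, and the weather-style attribute names in the demo are all made up for illustration. Each node keeps its children in a map keyed by attribute value, so a node can have as many branches as the attribute has values.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical n-ary decision tree node (illustrative only, not the project's class).
class DecisionNode {
    // Attribute name for internal nodes, predicted class for leaves.
    private final String label;
    // One child per attribute value; the number of children is not fixed.
    private final Map<String, DecisionNode> children = new LinkedHashMap<>();

    DecisionNode(String label) {
        this.label = label;
    }

    // Attach a child under the given attribute value; returns this for chaining.
    DecisionNode addChild(String attributeValue, DecisionNode child) {
        children.put(attributeValue, child);
        return this;
    }

    boolean isLeaf() {
        return children.isEmpty();
    }

    int childCount() {
        return children.size();
    }

    // Follow the branch matching the row's value at each node until a leaf is reached.
    // Returns null if the row has a value the tree has no branch for.
    String predict(Map<String, String> row) {
        DecisionNode node = this;
        while (!node.isLeaf()) {
            DecisionNode next = node.children.get(row.get(node.label));
            if (next == null) {
                return null;
            }
            node = next;
        }
        return node.label;
    }
}

class DecisionNodeDemo {
    public static void main(String[] args) {
        // Hand-built toy tree; the attributes and values are invented for the example.
        DecisionNode root = new DecisionNode("outlook")
            .addChild("sunny", new DecisionNode("humidity")
                .addChild("high", new DecisionNode("no"))
                .addChild("normal", new DecisionNode("yes")))
            .addChild("overcast", new DecisionNode("yes"))
            .addChild("rainy", new DecisionNode("no"));

        Map<String, String> row = new HashMap<>();
        row.put("outlook", "sunny");
        row.put("humidity", "normal");
        System.out.println(root.predict(row));
    }
}
```

Keying the children by attribute value (rather than keeping a plain list) makes prediction a simple map lookup per level, at the cost of assuming categorical attributes.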
### ID3
I used ID3 as the algorithm for partitioning the data set and creating the decision tree. ID3 is a greedy algorithm that uses the concept of entropy to decide where to split the data: at each step it picks the attribute whose split reduces entropy the most (the highest information gain). ID3 splits the data set recursively until it reaches the terminating condition(s), and each split point becomes a node of the tree.
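The entropy-based choice can be sketched in a few lines. This is only an illustration of the standard ID3 criterion, not the project's code: the class `Id3Sketch`, its method names, and the row layout (categorical values as `String[]`, with the class label in the last column) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of ID3's split selection (hypothetical names and data layout).
class Id3Sketch {
    // Shannon entropy, in bits, of a list of class labels.
    static double entropy(List<String> labels) {
        if (labels.isEmpty()) {
            return 0.0;
        }
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.size();
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    // Information gain of splitting on column attr.
    // Assumes rows is non-empty and the class label is the last column of every row.
    static double informationGain(List<String[]> rows, int attr) {
        int labelCol = rows.get(0).length - 1;
        List<String> allLabels = new ArrayList<>();
        Map<String, List<String>> partitions = new HashMap<>();
        for (String[] row : rows) {
            allLabels.add(row[labelCol]);
            partitions.computeIfAbsent(row[attr], k -> new ArrayList<>()).add(row[labelCol]);
        }
        // Weighted average entropy of the partitions produced by the split.
        double remainder = 0.0;
        for (List<String> part : partitions.values()) {
            remainder += ((double) part.size() / rows.size()) * entropy(part);
        }
        return entropy(allLabels) - remainder;
    }

    // Greedy step: index of the attribute with the highest information gain.
    static int bestAttribute(List<String[]> rows) {
        int labelCol = rows.get(0).length - 1;
        int best = -1;
        double bestGain = -1.0;
        for (int attr = 0; attr < labelCol; attr++) {
            double gain = informationGain(rows, attr);
            if (gain > bestGain) {
                bestGain = gain;
                best = attr;
            }
        }
        return best;
    }
}
```

A full ID3 implementation would call `bestAttribute` on the current subset, create a node for that attribute, partition the rows by its values, and recurse on each partition until the labels are pure or no attributes remain.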
### OpenCL
I also worked on incorporating GPU acceleration into this project. Although I had an Nvidia GPU, I chose OpenCL over CUDA. The major benefit was that I could prototype on my laptop, which had no GPU (the code would just run on the CPU), and then run it on my desktop, which did.
#### Other stuff - MySQL, Swing etc
I also used MySQL for storing the data sets (I didn't know about Hibernate at this point, so I 'simply' used JDBC) and Swing for creating the GUI.
That's it for now. In the next post I'll talk about the system overview and architecture. Stay tuned!
![System Architecture](/img/sysarch.png)


@ -1,10 +1,6 @@
---
date: 2014-03-10
linktitle: Migrating from Jekyll
menu:
main:
parent: tutorials
prev: /tutorials/mathjax
title: Migrate to Hugo from Jekyll
weight: 10
---

static/img/decisiontree.png (new binary file, not shown; 16 KiB)

static/img/sysarch.png (new binary file, not shown; 198 KiB)