I presented two papers at the IEEE SoutheastCON 2016 conference in Norfolk, VA, on
April 1, 2016:
- April, 2016: An Empirical Analysis of Feature Engineering for Predictive Modeling [PDF][Slides]
- April, 2016: Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms [PDF][Slides]
The first paper is closely related to my PhD dissertation topic. For this paper,
I generated datasets and trained neural networks, support vector machines, random
forests, and gradient boosting machines on them to see what types of equations each
model could learn and, more importantly, what types it could not. My dissertation
topic is in the area of feature engineering, so I am very interested in which
equation-based features can be added to a model's feature vector to enhance its
predictive power. This conference paper is based on research I did for my PhD while
I was exploring dissertation topics.
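As a rough illustration of that kind of experiment, the sketch below generates data from a single known equation (a ratio of two inputs) and fits each model family to it with scikit-learn. The equation, model settings, and error metric are illustrative assumptions on my part, not the setup used in the paper.

```python
# Minimal sketch of the kind of experiment described above (not the paper's code):
# generate data from a known equation and see how well each model family can
# learn it from the raw inputs alone.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
X = rng.uniform(1.0, 10.0, size=(5000, 2))
y = X[:, 0] / X[:, 1]  # target defined by a simple ratio equation

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = {
    "neural network": MLPRegressor(hidden_layer_sizes=(50, 25), max_iter=2000,
                                   random_state=42),
    "SVM": SVR(),
    "random forest": RandomForestRegressor(random_state=42),
    "gradient boosting": GradientBoostingRegressor(random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
    print(f"{name:>18}: RMSE = {rmse:.4f}")
```

A model that struggles with the ratio can often be helped by appending the engineered value `X[:, 0] / X[:, 1]` as an extra column, which is the kind of feature-vector augmentation described above.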
The second paper is on frequent itemset mining. For this paper, I examined several
common frequent itemset mining algorithms to see what effect the characteristics of
the underlying dataset had on each algorithm's runtime. Frequent itemset mining is
outside my research area; this paper is based on a paper that I wrote near the
beginning of my PhD program.
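The sketch below shows the general shape of such a runtime comparison: time Apriori and FP-Growth on synthetic one-hot transaction data while varying how dense the transactions are. It assumes the mlxtend library and made-up parameters rather than the paper's actual setup, and it omits Eclat because mlxtend does not ship an implementation of it.

```python
# Minimal sketch of the kind of runtime comparison described above (not the
# paper's code): time two frequent itemset mining algorithms on synthetic
# one-hot transaction data while varying item density.
import time
import numpy as np
import pandas as pd
from mlxtend.frequent_patterns import apriori, fpgrowth

rng = np.random.default_rng(42)
n_transactions, n_items = 2000, 50

for density in (0.05, 0.15, 0.30):  # fraction of items present per transaction
    data = rng.random((n_transactions, n_items)) < density
    df = pd.DataFrame(data, columns=[f"item_{i}" for i in range(n_items)])

    for name, algo in (("Apriori", apriori), ("FP-Growth", fpgrowth)):
        start = time.perf_counter()
        itemsets = algo(df, min_support=0.02, use_colnames=True)
        elapsed = time.perf_counter() - start
        print(f"density={density:.2f} {name:>9}: "
              f"{len(itemsets)} frequent itemsets in {elapsed:.3f}s")
```

Denser transactions produce many more frequent itemsets at the same support threshold, which is the kind of dataset characteristic whose effect on runtime the paper compares across algorithms.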
Both of these papers relied heavily on experimentation, and my code is available on
my GitHub site.