I had a paper accepted for the International Joint Conference on Neural Networks (IJCNN 2017) in Anchorage, Alaska. The paper made use of an algorithm that I created while working on my dissertation, though it did not end up in the final dissertation. I refined it a bit with my advisor and a few others from my university and submitted the paper to IJCNN. The complete list of authors is: Jeff Heaton, Steven McElwee, James Fraley, and James Cannady. The paper is titled *Early stabilizing feature importance for TensorFlow deep neural networks*. The abstract follows:
Feature importance is the process where the individual elements of a machine learning model’s feature vector are ranked on their relative importance to the accuracy of that model. Some feature ranking algorithms are specific to a single model type, such as Garson and Goh’s neural network weight-based feature ranking algorithm. Other feature ranking algorithms are model agnostic, such as Breiman’s perturbation feature ranking algorithm. This paper provides implementations for both the neural network weight-based and perturbation feature ranking algorithms for Google’s TensorFlow deep learning framework. Additionally, this paper introduces a novel hybrid approach of these two feature ranking algorithms that produces a stable ranking of features earlier in the training epochs of a deep neural network. Earlier stabilization of feature rank can save considerable compute cycles during model searches and feature engineering, where many representations of the feature vector must be compared. This paper demonstrates all three algorithms by empirically showing that the introduced hybrid weight perturbation algorithm achieves earlier stability than the established algorithms.
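To give a feel for the weight-based approach mentioned in the abstract, here is a minimal NumPy sketch of a Garson-style ranking for a network with a single hidden layer. This is only an illustration under my own assumptions (the function name `garson_importance` and its arguments are hypothetical), not the TensorFlow implementation from the paper.

```python
import numpy as np

def garson_importance(w_input_hidden, w_hidden_output):
    """Garson/Goh-style weight-based feature ranking for one hidden layer.

    w_input_hidden:  weights of shape (n_inputs, n_hidden)
    w_hidden_output: weights of shape (n_hidden, n_outputs)
    Returns a vector of relative importances that sums to 1.
    """
    w_ih = np.abs(w_input_hidden)
    # Total absolute connection strength from each hidden neuron to the outputs.
    w_ho = np.abs(w_hidden_output).sum(axis=1)            # (n_hidden,)
    # Each input's share of the weight flowing into every hidden neuron.
    contrib = w_ih / w_ih.sum(axis=0, keepdims=True)       # (n_inputs, n_hidden)
    # Weight each hidden neuron's contribution by its pull on the output layer.
    importance = (contrib * w_ho).sum(axis=1)              # (n_inputs,)
    return importance / importance.sum()
```

In practice the two weight matrices would be read out of the trained network (for example, from the TensorFlow variables of the first and last layers) and passed to a routine like this.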
This paper fit nicely with my research interests in feature engineering and feature ranking. Feature ranking uses an algorithm to determine which features (input columns) are most important to a model (such as a neural network). Automatic feature engineering was the topic of my dissertation, so I was looking for ways to train a neural network as little as possible before the feature ranking could be determined. The less time spent training a neural network that will only be used to rank feature importance, the faster my automated feature engineering algorithm can run. In addition to submitting the paper, I gave a 20-minute talk about the algorithm and presented TensorFlow Python code for several common feature ranking algorithms.
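For readers unfamiliar with perturbation ranking, the basic idea can be sketched in a few lines of Python. This is a simplified, model-agnostic illustration (the names `perturbation_importance`, `model.predict`, and `metric` are my own placeholders), not the code presented in the paper.

```python
import numpy as np

def perturbation_importance(model, x, y, metric):
    """Breiman-style perturbation ranking.

    Shuffle one feature column at a time and measure how much the model's
    error increases; features whose shuffling hurts the most rank highest.
    metric(y_true, y_pred) is assumed to return an error (lower is better).
    """
    baseline = metric(y, model.predict(x))
    importances = np.zeros(x.shape[1])
    for col in range(x.shape[1]):
        x_perturbed = x.copy()
        np.random.shuffle(x_perturbed[:, col])  # destroy this column's signal
        importances[col] = metric(y, model.predict(x_perturbed)) - baseline
    return importances  # larger increase in error => more important feature
```

Because only predictions are needed, a sketch like this works with any trained model, which is what makes the perturbation approach model agnostic.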
- Code for the Paper
- [Slides](</dload/phd/IJCNN 2017-v2-final.pdf>)