Dec 12, 2012

Feature selection – An essential step in machine learning

A machine learning model maps input variables (features) to an output. The output can be binary, such as whether a car is defective or not, or numerical, such as next year's inflation rate. A good machine learning model uses the minimum number of features needed to get the most accurate prediction. In many cases, a smaller set of features gives better performance than a larger one, because additional features add noise and lead to overfitting: the model then performs poorly on new test cases.

There is another reason for feature selection: computational speed. A model with a large number of features takes a long time to build, and may be too slow to apply in real time.

The third reason is sample size relative to the number of features. When the number of features is large compared to the number of samples, the model overfits and cannot make good predictions. In gene expression studies, for example, there can be more than 100,000 features but only a few hundred data points. To increase the precision of the model, we need to increase the sample size. For supervised learning, this means more human labeling of additional data, which is sometimes not possible.

How do we go about selecting the best and smallest set of features? If we try every subset of n features, we need 2^n rounds of model building and evaluation. This is clearly infeasible.
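To put a number on "infeasible": with just 50 features there are already more than a quadrillion subsets to evaluate (a quick check, not a figure from the original post):

```python
# Exhaustive feature selection evaluates every subset of n features: 2**n model fits.
n = 50
print(2 ** n)  # 1125899906842624 -> about 1.1e15 candidate models
```

Even at a million model fits per second, that search would take over 35 years.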

Feature selection has been studied in the machine learning field for the last two decades. The following four methods have been proposed:
  1. Forward search: add features one at a time. This is a greedy algorithm, so it does not always find the best solution.
  2. Backward search: start with all features and remove them one at a time until performance no longer improves. This is also a greedy algorithm.
  3. Adaptive Lasso
  4. L1-regularized logistic regression
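A minimal sketch of greedy forward search, assuming a caller-supplied score function (the scoring criterion, e.g. cross-validated accuracy, is left abstract here; the helper names are mine, not from the post):

```python
def forward_select(features, score):
    """Greedy forward search: repeatedly add the single feature that most
    improves the score; stop when no addition helps (may miss the optimum)."""
    selected, remaining = [], list(features)
    best = score(selected)
    while remaining:
        # Try extending the current subset by each remaining feature.
        cand_score, cand = max((score(selected + [f]), f) for f in remaining)
        if cand_score <= best:
            break  # no single feature improves the score -> stop
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected

# Toy score: features "a" and "b" are informative; every extra feature
# costs a small penalty (a stand-in for the noise extra features add).
useful = {"a": 2.0, "b": 1.0}
def toy_score(subset):
    return sum(useful.get(f, 0.0) for f in subset) - 0.1 * len(subset)

print(forward_select(["a", "b", "c", "d"], toy_score))  # ['a', 'b']
```

Backward search is the mirror image: start from the full set and repeatedly drop the feature whose removal hurts the score least.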
I favor the last method, L1-regularized logistic regression, as it has consistently given us the best performance among these feature selection methods. I hope to expand on this discussion in future posts.
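A sketch of how L1-regularized logistic regression performs feature selection, using scikit-learn (the library choice, the synthetic data, and all parameter values are my assumptions; the post does not specify an implementation):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                          # 20 features
y = (X[:, 0] + 2 * X[:, 1] - X[:, 2] > 0).astype(int)   # only 3 are informative

# The L1 penalty drives the coefficients of uninformative features to exactly
# zero, so feature selection falls out of the fitted model for free.
# Smaller C means a stronger penalty and hence fewer features kept.
model = LogisticRegression(penalty="l1", C=0.1, solver="liblinear")
model.fit(X, y)

kept = np.flatnonzero(model.coef_[0])
print("features kept:", kept)
```

Sweeping C (e.g. by cross-validation) trades sparsity against accuracy, which is how the size of the selected feature set is tuned in practice.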


  1. In my opinion, everything here is especially interesting and ambiguous. The interpretation of machine learning is not as simple as it seems at first glance. If we are going to use automated methods to build models at the interface, then developing suitable ways to simplify and understand those models behind the scenes becomes paramount.

  2. If I want to build a mathematical model that does feature selection for various types of datasets and apply it to real-world datasets of large dimension, is there any resource to help me code the model? I am unfamiliar with the coding part. In which programming languages could I implement it? Please guide me.