Feb 11, 2013

Data Mining vs. Machine Learning

Data mining and machine learning used to be two cousins with different parents. Now they are growing increasingly alike, almost like twins. People often even refer to data mining as machine learning.

The field of machine learning grew out of the effort to build artificial intelligence. Its major concern is making a machine learn and adapt to new information. The origin of machine learning can be traced back to 1957, when the perceptron model was invented. The perceptron is modeled after neurons in the human brain. It prompted the development of neural network models, which flourished in the late 1980s. From the 1980s to the 1990s, the decision tree method became very popular, owing to the efficient C4.5 package. The SVM (support vector machine) was invented in the mid-1990s and has since been widely used in industry. Logistic regression, an old method in statistics, has seen growing adoption in machine learning since 2001, when the book The Elements of Statistical Learning was published.
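To make the idea of "learning and adapting to new information" concrete, here is a minimal sketch of the classic perceptron update rule in plain Python. The function names, the learning rate, and the toy AND data are my own choices for illustration; this is not code from any particular package.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Train a single perceptron with the classic error-driven update.

    samples: list of equal-length numeric tuples
    labels:  list of 0/1 targets
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = y - prediction  # -1, 0, or +1
            if error != 0:
                # Nudge the weights toward the correct answer.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the weighted sum crosses the threshold, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Toy example (illustrative data): learn the logical AND function.
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]
weights, bias = train_perceptron(and_inputs, and_labels)
learned = [predict(weights, bias, x) for x in and_inputs]
```

Each time the model makes a mistake, the weights shift a little toward the right answer. A single perceptron like this can only learn linearly separable functions such as AND.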

The field of data mining grew out of knowledge discovery from databases. In 1993, a seminal paper by Rakesh Agrawal and two others proposed an efficient algorithm for mining association rules in large databases. This paper prompted many research papers on discovering frequent patterns and on more efficient mining algorithms. The early work on data mining in the 1990s was linked to creating better SQL statements and working with databases directly.
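For readers who have not seen association rules before, the two basic measures are support (how often an itemset appears) and confidence (how often the rule holds when its left side appears). The sketch below computes both by brute force over a handful of made-up shopping baskets. It is only an illustration of the definitions, not the efficient algorithm from the 1993 paper, and all function names and data are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


def frequent_pairs(transactions, min_support):
    """Brute-force scan of all item pairs; fine for toy data only."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, transactions)
        if s >= min_support:
            result[pair] = s
    return result


# Made-up baskets for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

pairs = frequent_pairs(baskets, min_support=0.6)
rule_conf = confidence({"bread"}, {"milk"}, baskets)
```

Scanning every pair like this quickly becomes too slow as the number of items grows, which is why efficient mining algorithms attracted so much research.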

Data mining has a strong focus on working with industrial problems and getting practical solutions. It is therefore concerned not only with data size (large data), but also with data processing speed (stream data). In addition, personalized recommender systems and network mining were both developed in response to business needs, outside the machine learning field.

The two major conferences for data mining are KDD (Knowledge Discovery and Data Mining) and ICDM (International Conference on Data Mining). The two major conferences for machine learning are ICML (International Conference on Machine Learning) and NIPS (Neural Information Processing Systems). Machine learning researchers attend both types of conferences. However, the data mining conferences have much stronger industrial links.

Data miners typically have a strong foundation in machine learning, but also have a keen interest in applying it to large-scale problems.

Over time, we will see a deeper connection between data mining and machine learning. Could they become twins one day? Only time will tell.
