November 10, 2011

Which methods/algorithms did you use for data analysis in 2011?

http://www.kdnuggets.com/polls/2011/algorithms-analytics-data-mining.html

Which methods/algorithms did you use for data analysis in 2011? [311 voters]
Decision Trees/Rules (186) 59.8 %
Regression (180) 57.9 %
Clustering (163) 52.4 %
Statistics (descriptive) (149) 47.9 %
Visualization (119) 38.3 %
Time series/Sequence analysis (92) 29.6 %
Support Vector (SVM) (89) 28.6 %
Association rules (89) 28.6 %
Ensemble methods (88) 28.3 %
Text Mining (86) 27.7 %
Neural Nets (84) 27.0 %
Boosting (73) 23.5 %
Bayesian (68) 21.9 %
Bagging (63) 20.3 %
Factor Analysis (58) 18.7 %
Anomaly/Deviation detection (51) 16.4 %
Social Network Analysis (44) 14.2 %
Survival Analysis (29) 9.32 %
Genetic algorithms (29) 9.32 %
Uplift modeling (15) 4.82 %


Did you use analytics in the cloud, Hadoop, EC2, etc in 2011?
Yes 14%
No 86%


Employment type                      Percent all   Avg Num Algorithms
Industry analyst/consultant (172)    55.3%         6.3
Academic researcher (85)             27.3%         5.1
Student (37)                         11.9%         4.3
Government/Other (17)                5.5%          5.0

The regional breakdown of voters is:

  1. US/Canada, 40.2%
  2. Europe, 37.6%
  3. Asia, 10.3%
  4. Latin America, 5.8%
  5. Africa/Middle East, 3.2%
  6. Australia/NZ 2.9%
We grouped Industry/Gov into one group and Academic researchers/Students into a second group, and computed the "affinity" of each algorithm to Industry/Gov as:
N(Alg,Ind_Gov) / N(Alg,Aca_Stu)
----------------------------------
N(Ind_Gov) / N(Aca_Stu)
Here N(Alg,G) is the number of voters in group G who used the algorithm and N(G) is the total number of voters in group G. Thus an algorithm with affinity 1.5 is used 50% more in Industry/Government than by academic researchers or students, and an algorithm with affinity 0.6 is used only 60% as much in Industry/Government.
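
As a rough illustration of the affinity ratio above, here is a minimal Python sketch. The function name `industry_affinity` and the per-algorithm counts in the example (93 and 40) are hypothetical and not taken from the poll. The group sizes come from the employment table (172 + 17 = 189 Industry/Gov voters, 85 + 37 = 122 Academic/Student voters). Returning infinity when an algorithm has no academic users is an assumption that mirrors the "INF" entry reported for Uplift modeling.

```python
import math


def industry_affinity(n_alg_ind_gov, n_alg_aca_stu, n_ind_gov, n_aca_stu):
    """Industry/Gov affinity of one algorithm.

    Ratio of the algorithm's Industry/Gov users to its Academic/Student
    users, normalized by the ratio of the two group sizes.
    """
    if n_alg_aca_stu == 0:
        # Assumption: no academic users is reported as INF, as in the poll.
        return math.inf
    return (n_alg_ind_gov / n_alg_aca_stu) / (n_ind_gov / n_aca_stu)


# Group sizes derived from the employment table in the poll.
N_IND_GOV = 172 + 17   # Industry analyst/consultant + Government/Other
N_ACA_STU = 85 + 37    # Academic researcher + Student

# Hypothetical counts for illustration only (not poll data):
# 93 Industry/Gov users and 40 Academic/Student users of some algorithm.
example = industry_affinity(93, 40, N_IND_GOV, N_ACA_STU)
print(f"affinity = {example:.2f}")

# An algorithm with no academic users gets infinite affinity.
no_academic = industry_affinity(15, 0, N_IND_GOV, N_ACA_STU)
print(f"affinity = {no_academic}")
```

With these made-up counts the affinity comes out close to 1.5, meaning the algorithm would be used about 50% more in Industry/Gov than the group sizes alone would predict.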

The most "industrial" algorithms (with the highest Industry/Gov "affinity") are:

  1. Uplift modeling, INF (no academic users)
  2. Survival Analysis, 2.47
  3. Regression, 2.00

The most "academic" algorithms (with the lowest Industry/Gov "affinity") are:

  1. Genetic algorithms, 0.60
  2. Support Vector (SVM), 0.66
  3. Association Rules, 0.83
The following table ranks the algorithms by Industry/Gov affinity. Academic/Student affinity is the inverse of Industry/Gov affinity.

Algorithm                        Industry/Gov Affinity
Uplift modeling                  INF
Survival Analysis                2.47
Regression                       2.00
Visualization                    1.55
Statistics                       1.54
Boosting                         1.50
Time series/Sequence analysis    1.48
Bagging                          1.39
Factor Analysis                  1.32
Anomaly/Deviation detection      1.29
Text Mining                      1.27
Decision Trees                   1.20
Neural Nets                      1.16
Clustering                       1.14
Ensemble methods                 1.08
Social Network Analysis          0.93
Bayesian                         0.92
Association rules                0.83
Support Vector (SVM)             0.66
Genetic algorithms               0.60

