The top 100+ things you'll never have working knowledge of as a faculty member or graduate in an accelerated data science program

If you are a student in, or are teaching in, a fast-track data science program, chances are you will eventually need to come to grips with how much information will never be covered during your time in the program.  From a practical standpoint, you may become quite experienced at assembling interpreted-language code (e.g., Python) that calls math libraries to solve problems in the cloud.  At the other end of the spectrum, however, you will have little if any exposure to the numerical methods behind the fundamental calculations that underlie many of the methods you know and the methods your students are learning.  

Now for the hard part.  If you have only come online in the last decade, the odds are that you have missed 40-50 years of development in the numerical methods that form the basis for a lot of what you call "analysis."  What this means is that you are likely unfamiliar with numerical methods but have become good at affixing signposts to things so that they appear analytic.  In other words, you likely won't be able to describe, for example, how Gauss-Jordan elimination and Gauss-Seidel iteration differ from the Jacobi method and singular value decomposition.  So the question remains: is there a data science program on the planet that requires faculty and students to really know numerical methods? 
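
To make that distinction concrete, here is a minimal C++ sketch (nothing beyond the language itself) of Jacobi versus Gauss-Seidel iteration for solving Ax = b; the 3x3 system, tolerance, and iteration cap are made up purely for illustration.

// Minimal illustrative sketch: Jacobi vs. Gauss-Seidel iterations for Ax = b
// on a small diagonally dominant system (values chosen only for demonstration).
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Jacobi: every component of the new iterate is computed from the previous iterate only.
Vector jacobi(const Matrix& A, const Vector& b, int maxIter, double tol) {
    const std::size_t n = b.size();
    Vector x(n, 0.0), xNew(n, 0.0);
    for (int k = 0; k < maxIter; ++k) {
        for (std::size_t i = 0; i < n; ++i) {
            double s = 0.0;
            for (std::size_t j = 0; j < n; ++j)
                if (j != i) s += A[i][j] * x[j];
            xNew[i] = (b[i] - s) / A[i][i];
        }
        double diff = 0.0;
        for (std::size_t i = 0; i < n; ++i) diff += std::fabs(xNew[i] - x[i]);
        x = xNew;
        if (diff < tol) break;
    }
    return x;
}

// Gauss-Seidel: updated components are used immediately within the same sweep,
// which is the essential difference from Jacobi.
Vector gaussSeidel(const Matrix& A, const Vector& b, int maxIter, double tol) {
    const std::size_t n = b.size();
    Vector x(n, 0.0);
    for (int k = 0; k < maxIter; ++k) {
        double diff = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            double s = 0.0;
            for (std::size_t j = 0; j < n; ++j)
                if (j != i) s += A[i][j] * x[j];   // x[j] already holds new values for j < i
            double xi = (b[i] - s) / A[i][i];
            diff += std::fabs(xi - x[i]);
            x[i] = xi;
        }
        if (diff < tol) break;
    }
    return x;
}

int main() {
    Matrix A = {{4, 1, 1}, {1, 5, 2}, {1, 2, 6}};   // diagonally dominant, so both methods converge
    Vector b = {9, 17, 23};                          // exact solution is (1, 2, 3)
    Vector xj = jacobi(A, b, 100, 1e-10);
    Vector xg = gaussSeidel(A, b, 100, 1e-10);
    std::printf("Jacobi:       %.6f %.6f %.6f\n", xj[0], xj[1], xj[2]);
    std::printf("Gauss-Seidel: %.6f %.6f %.6f\n", xg[0], xg[1], xg[2]);
    return 0;
}

The only structural difference is that Gauss-Seidel overwrites each component as soon as it is computed, which is also why it typically converges in fewer sweeps than Jacobi on the same system.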

For the programming-related items below, in order to appreciate numerical methods, assume that you have no access to libraries such as NumPy, scikit-learn, R, or MATLAB, but instead have to code everything from scratch in a compiled language such as C++, C#, or VB.NET.  Python wrapping and calls to library math functions are not allowed.  And don't worry about compiled-language coding of machine learning and AI methods, since I have had that covered for the last 20 years. 
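
As a taste of what "from scratch" means here, the sketch below builds the standard normal cdf in C++ from a classic rational approximation to the error function (Abramowitz and Stegun 7.1.26), using nothing beyond elementary arithmetic and exp; no statistics library is called.

// A small taste of "from scratch": the standard normal cdf built from the
// Abramowitz & Stegun 7.1.26 rational approximation to erf, rather than a call
// into a statistics library.  Only elementary <cmath> operations are used.
#include <cmath>
#include <cstdio>

// erf(x) approximation; absolute error on the order of 1e-7.
double erfApprox(double x) {
    const double p  = 0.3275911;
    const double a1 = 0.254829592, a2 = -0.284496736, a3 = 1.421413741,
                 a4 = -1.453152027, a5 = 1.061405429;
    double sign = (x < 0.0) ? -1.0 : 1.0;   // erf is an odd function
    x = std::fabs(x);
    double t = 1.0 / (1.0 + p * x);
    double poly = ((((a5 * t + a4) * t + a3) * t + a2) * t + a1) * t;
    return sign * (1.0 - poly * std::exp(-x * x));
}

// Standard normal cdf: Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
double normalCdf(double z) {
    return 0.5 * (1.0 + erfApprox(z / std::sqrt(2.0)));
}

int main() {
    // Phi(1.96) should come out close to 0.975.
    std::printf("Phi(1.96) = %.6f\n", normalCdf(1.96));
    return 0;
}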
 
1. Positive-definiteness vs. positive semi-definiteness of a covariance (correlation) matrix. 
2. Non-uniqueness of singular value decomposition (SVD).
3. Effect of sine wave (time series) wavelength, amplitude, and phase shift on principal component scores.
4. Meaning of values of eigenvectors whose eigenvalues are zero.
5. What Sobol sequences are and how they are used for quantile simulation.
6. How to remove pathologies from a simulated covariance (correlation) matrix.
7. How to simulate correlated mixtures of quantiles from multiple probability distributions.
8. How to program pdf and cdf and simulate quantiles of the Beta probability distribution.
9. How to program pdf and cdf and simulate quantiles of the BetaPERT probability distribution.
10. How to program pdf and cdf and simulate quantiles of the Binomial probability distribution.
11. How to program pdf and cdf and simulate quantiles of the Cauchy probability distribution.
12. How to program pdf and cdf and simulate quantiles of the Chi-squared probability distribution.
13. How to program pdf and cdf and simulate quantiles of the Erlang probability distribution.
14. How to program pdf and cdf and simulate quantiles of the Exponential probability distribution (see the sketch after this list).
15. How to program pdf and cdf and simulate quantiles of the F-ratio probability distribution.
16. How to program pdf and cdf and simulate quantiles of the Gamma (Erlang) probability distribution.
17. How to program pdf and cdf and simulate quantiles of the Gumbel probability distribution.
18. How to program pdf and cdf and simulate quantiles of the Geometric probability distribution.
19. How to program pdf and cdf and simulate quantiles of the Laplace probability distribution.
20. How to program pdf and cdf and simulate quantiles of the Negative binomial probability distribution.
21. How to program pdf and cdf and simulate quantiles of the Poisson probability distribution.
22. How to program pdf and cdf and simulate quantiles of the Power probability distribution.
23. How to program pdf and cdf and simulate quantiles of the Rayleigh probability distribution.
24. How to program pdf and cdf and simulate quantiles of the Stable (Lévy) probability distribution.
25. How to program pdf and cdf and simulate quantiles of the Student's t probability distribution.
26. How to program pdf and cdf and simulate quantiles of the Triangle probability distribution.
27. How to program pdf and cdf and simulate quantiles of the Weibull probability distribution.
28. How to program and perform a Monte Carlo cost estimate.
29. What sensitivity is in a Monte Carlo uncertainty analysis and how it's programmed.
30. What the Marčenko-Pastur eigendensity distribution is and how it's programmed.
31. How to program super-resolution root MUSIC.
32. How to program supervised random forests classifiers.
33. How to program unsupervised random forests.
34. How to program classifier diversity, and why it's important.
35. What concept drift is.
36. When parametric hypothesis tests have more statistical power than non-parametric tests.
37. What the assumptions are for performing a t-test. 
38. How to program Bartlett's test, and what it's used for.
39. What the assumptions are for multiple linear regression.
40. What leverage and jackknife residuals are used for and why they are important.
41. How to program Grizzle-Starmer-Koch regression and why it's important.
42. How to program scaled Schoenfeld residuals and why they are important. 
43. How to program binary logistic regression and why it has unattractive performance for many-class (#classes>8) classification problems.
44. How to program polytomous (polychotomous) logistic regression and how to set up its Hessian matrix and score vector. 
45. What the negative information matrix is.
46. How to program the Newton-Raphson method and why it's faster than gradient descent.
47. What gradient ascent is.
48. How to program feed-forward back-propagation neural networks with multiple hidden layers and what they need in terms of data in order to be successful.
49. How to program support vector machines (SVMs) and what they need in terms of data in order to be successful.
50. How to program particle swarm optimization.
51. How to program mixtures of experts and why they are useful.
52. How to determine the learning rate during programming of learning vector quantization (LVQ) classifiers.
53. Why linear separability is important in classification analysis.
54. What stemming and stopping are in text mining, and how they are programmed.
55. What N-grams are and how their analysis is programmed.
56. What sentiment mining is in text mining and how it's programmed.
57. What bootstrap-bias is in classification analysis and how it's programmed.
58. What an adjacency matrix is and how it's programmed.
59. What a minimum spanning tree is and how it's programmed.
60. What Bayes' rule is and how it's programmed.
61. What the Behrens-Fisher problem is.
62. What the Cauchy-Schwarz inequality is.
63. What the difference is between Chebyshev, Manhattan, and Canberra distance.
64. What the isomap algorithm is and how it's programmed.
65. What locality preserving projections is and how it's programmed.
66. How to program a genetic algorithm with adaptive mutation.
67. What kernel regression is and how it's programmed.
68. How to program self-organizing maps.
69. What the U-matrix represents from self-organizing maps (SOM).
70. What component maps are from SOM.
71. What the Davies-Bouldin index is and how it's programmed.
72. What Pitman correlation is and how it's programmed.
73. How to program fuzzy k-means classification.
74. What exchangeability is during agglomerative cluster analysis.
75. What Benjamini-Hochberg FDR is and how it's programmed.
76. What the Principal Axis Theorem is and how it's programmed.
77. What Sammon mapping is and how it's programmed.
78. What HSV color normalization is and how it's programmed.
79. How to program an inverse fast Fourier transform (IFFT).  
80. What an STFT matrix is and how it's programmed.
81. What NMF is and how it's programmed.
82. How to program percussive-melody sound separation using STFT.
83. What the multiple testing problem is and how you can program to guard against its influence.
84. What backwards stepping is during regression.
85. What the "nesting problem" is during greedy hill-climbing.
86. What sequential floating forward-reverse plus-take-away one algorithms are, and how they are programmed.
87. What the Kruskal-Wallis test is and how it's programmed.
88. How to program ant colony optimization.
89. Why Latin hypercube sampling is necessary for function approximation by neural networks and how it's programmed.
90. What Tanimoto distance is.
91. What out-of-place distance is.
92. What the Central Limit Theorem is.
93. What the Tracy-Widom Law is.
94. How to program the unsupervised neural gas algorithm.
95. How to program the supervised neural gas algorithm.
96. What direction cosines are.
97. What the Westfall-Young algorithm is used for.
98. What Widrow-Hoff learning is.
99. How to program Gray coding.
100. How to program binary-to-decimal coding.
101. How to program decimal-to-binary coding.
102. What the revolving door algorithm is.
103. What a Wishart ensemble is.
104. What the 0.632 bootstrap is.
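
As one concrete sample from the list, here is a minimal from-scratch C++ sketch for item 14: the pdf, cdf, and inverse-cdf quantile simulation of the Exponential distribution, with even the underlying uniform generator hand-rolled as a simple linear congruential generator. The LCG constants are the familiar Numerical Recipes pair, and the rate parameter, seed, and sample size are arbitrary choices for illustration.

// Illustrative sketch for item 14: pdf, cdf, and quantile (inverse-cdf) simulation
// for the Exponential distribution, with the uniform generator itself hand-rolled
// as a linear congruential generator (Numerical Recipes constants).
#include <cmath>
#include <cstdint>
#include <cstdio>

// Minimal LCG producing uniforms on (0,1); not cryptographic, purely illustrative.
struct Lcg {
    std::uint32_t state;
    explicit Lcg(std::uint32_t seed) : state(seed) {}
    double next() {
        state = 1664525u * state + 1013904223u;                       // modulo 2^32 via wraparound
        return (static_cast<double>(state) + 0.5) / 4294967296.0;     // strictly inside (0,1)
    }
};

double expPdf(double x, double lambda)      { return (x < 0.0) ? 0.0 : lambda * std::exp(-lambda * x); }
double expCdf(double x, double lambda)      { return (x < 0.0) ? 0.0 : 1.0 - std::exp(-lambda * x); }
double expQuantile(double u, double lambda) { return -std::log(1.0 - u) / lambda; }   // inverse of the cdf

int main() {
    const double lambda = 2.0;      // arbitrary rate for the demonstration
    Lcg rng(12345u);                // arbitrary seed
    double sum = 0.0;
    const int n = 100000;           // arbitrary sample size
    for (int i = 0; i < n; ++i)
        sum += expQuantile(rng.next(), lambda);   // inverse-transform sampling
    std::printf("sample mean = %.4f (theory: %.4f)\n", sum / n, 1.0 / lambda);
    std::printf("pdf(0.5) = %.4f  cdf(0.5) = %.4f\n", expPdf(0.5, lambda), expCdf(0.5, lambda));
    return 0;
}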


Leif Peterson 19 Feb 2020