The field of statistics is undergoing a drastic change in the wake of the recent explosion in machine learning and AI. Training statistics students for jobs in big pharma is no longer a cottage industry; instead, most of the new jobs are in data science, driven by the rapid growth of cloud-based analytics and data storage. Some recent trends include:
- Students are now more interested in getting a job in data science rather than statistics
- Academic campuses are now establishing Data Science centers to invest in the future of technology, analytics, and student training
- Academic statistics departments are now establishing faculty positions in data science and recruiting top-level data scientists
- Academic statistics departments are now changing their names to "Data Science and Statistics"
- New job requirements demand experience with a mixture of cloud-based programming languages and analytic technologies
Machine learning has long been a top priority at NXG Logic. We did not begin developing machine learning techniques after the Cambrian explosion in data science; rather, machine learning had been a long-term pursuit here well before that explosion. One drawback of explosive growth in a new field is that "local" and "global" avalanches occur as the field overheats and new talent is improperly trained. Eventually, so many data scientists may be trained through short courses and web blogs alone that there is no longer a demand for hypothesis-driven analysis, and data science departs from the field of research-based statistics.
Nevertheless, NXG Logic has developed a broad portfolio of non-linear manifold learning technologies for knowledge discovery, dimensionality reduction, class discovery and data clustering. The machine learning approaches available in Explorer include:
- Crisp K-means clustering (class discovery module)
- Fuzzy K-means clustering (class discovery module)
- Unsupervised neural gas (class discovery module)
- Gaussian mixture models (class discovery module)
- Unsupervised random forests (class discovery module)
- Kernel-based PCA (class discovery module)
- Kernel Gaussian radial basis function PCA (class discovery module)
- Kernel Tanimoto distance-based PCA (class discovery module)
- Diffusion maps (class discovery module)
- Locally linear embedding (class discovery module)
- Laplacian eigenmaps (class discovery module)
- Locality preserving projections (class discovery module)
- Stochastic neighbor embedding (class discovery module)
- Sammon mapping (class discovery module)
- Decision tree classification (class prediction module)
- Supervised random forests (class prediction module)
- K-nearest neighbor (class prediction module)
- Learning vector quantization (class prediction module)
- Support vector machines (class prediction module)
- Kernel regression (class prediction module)
- Supervised neural gas (class prediction module)
- Mixture of experts (class prediction module)
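Explorer's own implementations are not shown in this document, but the first method above, crisp K-means clustering, is simple enough to sketch. The following minimal NumPy version is illustrative only; all function and variable names are mine, not Explorer's API:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Crisp (hard) K-means: each point belongs to exactly one cluster.
    Alternates between assigning points to their nearest centroid and
    moving each centroid to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance from every point to every centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Demo: two well-separated 2-D blobs are recovered as two clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
               rng.normal(3.0, 0.3, (50, 2))])
labels, centroids = kmeans(X, k=2)
```

Fuzzy K-means differs from this crisp version by replacing the hard `argmin` assignment with graded membership weights, so that each point belongs to every cluster to some degree.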
With regard to AI, several artificial intelligence methods are incorporated into Explorer, including:
- Kohonen networks, or self-organizing maps (class discovery module)
- Unsupervised artificial neural networks (class discovery module)
- Supervised artificial neural networks (SANN)
- Swarm intelligence (class prediction module)
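Of these, the Kohonen network (self-organizing map) is the most self-contained to illustrate. The sketch below is a generic SOM in NumPy, written under my own naming and parameter choices, and does not reflect Explorer's internals:

```python
import numpy as np

def train_som(X, grid=(4, 4), n_iter=400, lr0=0.5, sigma0=2.0, seed=0):
    """Minimal Kohonen self-organizing map. For each training sample,
    find the best-matching unit (BMU) on a 2-D grid of codebook
    vectors, then pull the BMU and its grid neighbors toward the
    sample; the learning rate and neighborhood radius decay over time."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.normal(size=(rows * cols, X.shape[1]))  # codebook vectors
    # (row, col) position of each unit on the map grid
    coords = np.array([(r, c) for r in range(rows)
                       for c in range(cols)], dtype=float)
    for t in range(n_iter):
        x = X[rng.integers(len(X))]
        bmu = int(np.argmin(np.linalg.norm(W - x, axis=1)))
        lr = lr0 * np.exp(-t / n_iter)        # decaying learning rate
        sigma = sigma0 * np.exp(-t / n_iter)  # decaying neighborhood radius
        # Gaussian neighborhood on the grid, centered at the BMU
        h = np.exp(-((coords - coords[bmu]) ** 2).sum(axis=1)
                   / (2.0 * sigma ** 2))
        W += lr * h[:, None] * (x - W)
    return W

# Demo: two tight clusters map to different regions of the SOM grid.
X = np.vstack([np.zeros((20, 2)), np.full((20, 2), 5.0)])
W = train_som(X)
bmu_a = int(np.argmin(np.linalg.norm(W - X[0], axis=1)))
bmu_b = int(np.argmin(np.linalg.norm(W - X[-1], axis=1)))
```

Because neighboring grid units are updated together, nearby units end up representing similar inputs, which is what makes the trained map useful for class discovery.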
Activation functions for SANN include:
- Identity
- Logistic
- Softmax
- tanh
- Hermite
- Laguerre
- Exponential
- RBFN
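The first four activation functions above have standard closed forms, sketched below in NumPy (the Hermite, Laguerre, exponential, and RBFN activations are omitted here). Softmax is written with the usual max-subtraction trick for numerical stability:

```python
import numpy as np

def identity(z):
    return z

def logistic(z):
    # sigmoid: maps any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # subtract the max before exponentiating to avoid overflow
    e = np.exp(z - np.max(z))
    return e / e.sum()

def tanh(z):
    return np.tanh(z)

p = softmax(np.array([1.0, 2.0, 3.0]))  # probabilities summing to 1
```

Softmax is typically reserved for the output layer of a multi-class network, since it turns a vector of scores into a probability distribution over classes.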
SANN back-propagation learning includes:
SANN connection weight updates include:
The SANN objective function can be:
Output-side functions include:
- Identity
- Logistic
- Softmax
- tanh