Big data space: Machine Learning Terminolgy

Regularization is a technique used to solve over fitting problem. Other techniques include early stopping, cross validation.

Bias–variance trade off is the problem of simultaneously minimizing two sources of error that prevent supervised learning algorithms from generalizing beyond their training set:

i) The bias is error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).

ii) The variance is error from sensitivity to small fluctuations in the training set. High variance can cause overfitting: modeling the random noise in the training data, rather than the intended outputs.

Algorithms classification:
Discrminative Learning Algorithms:
Algorithms that try to learn p(y|x) directly (such as logistic regression),or algorithms that try to learn mappings directly from the space of inputs X to the labels{0,1}, (such as the perceptron algorithm) are called discriminative learning algorithms.
Generative Learning Algorithms:
Algorithms that instead try to model p(x|y) (and p(y)). These algorithms are called generative learning algorithms. For instance, if y indicates whether an example is a dog (0) or an elephant (1), then p(x|y= 0) models the distribution of dogs’ features, and p(x|y= 1) models the distribution of elephants’ features.

Source: http://cs229.stanford.edu/notes/cs229-notes2.pdf

What is Self Organizing Map in Neural Nets?

Teuvo Kohonen introduced a special class of ANNs called Self-Organizing feature maps. These maps are based on competitive learning.

In competitive learning neurons compete among themselves to be activated.

Brain is dominated by the cerebral cortex, a very complex structure of billions neurons and hundreds of billions synapses.

It includes areas that are responsible for different human activities (motor, visual, auditory, somatosensory, etc.), and associated with different sensory inputs.

Each sensory input is mapped into a corresponding area of the cerebral cortex.The cortex is a self -organizing computational map The cortex is a self -organizing computational map in the human brain.

Classification Model Performance Metrics

Confusion matrix

Precision: Proportion of patients diagnosed as having cancer actually had cancer. In other words how many selected items are relevant.
TP/TP+FP

Recall: Proportion of patients that actually had cancer were diagnosed as having cancer. In other words how many relevant items are selected.
TP/TP+FN

Big data space

Saturday, July 2, 2016

Machine Learning Terminolgy

No comments:

Post a Comment

Blog Archive