machine-learning

@machine-learning

Public team

Joined on Mar 18, 2020

  • In the previous post, we saw a naive implementation of a Convolutional Neural Network using NumPy. Here, we are going to implement a faster CNN in NumPy using the im2col/col2im method. To see the full implementation, please refer to my repository. Also, if you want to read some of my blog posts, feel free to check them at my blog. I) Forward propagation
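As a minimal sketch of the im2col idea the post describes (the post's actual implementation may differ in layout and naming, and this helper is hypothetical): each sliding window of the input is flattened into one column, so the whole convolution collapses into a single matrix multiplication.

```python
import numpy as np

def im2col(X, kh, kw, stride=1):
    """Rearrange every (kh, kw) patch of a (C, H, W) input into one column,
    so convolution becomes a single matrix multiplication.
    Hypothetical helper for illustration only."""
    C, H, W = X.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    cols = np.empty((C * kh * kw, out_h * out_w))
    idx = 0
    for i in range(out_h):
        for j in range(out_w):
            patch = X[:, i*stride:i*stride+kh, j*stride:j*stride+kw]
            cols[:, idx] = patch.ravel()
            idx += 1
    return cols

# Convolution as a matmul: flattened filters (rows) times patches (columns).
X = np.arange(16, dtype=float).reshape(1, 4, 4)
K = np.ones((1, 1, 3, 3))            # one all-ones 3x3 filter
cols = im2col(X, 3, 3)               # shape (9, 4): 4 patches of 9 values
out = (K.reshape(1, -1) @ cols).reshape(2, 2)
```

The matmul form is faster than naive nested loops because it hands all the arithmetic to one optimized BLAS call, at the cost of duplicating overlapping pixels in memory.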
  • In this post, we are going to see how to implement a Convolutional Neural Network using only NumPy. The main goal here is not just to give boilerplate code but to provide an in-depth explanation of the underlying mechanisms through illustrations, especially during backward propagation, where things get trickier. However, some knowledge of Convolutional Neural Network building blocks is required. To see the full implementation, please refer to my repository. For the more advanced, here is another post where we implement a faster CNN using the im2col/col2im method.
  • This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of the ImageNet Classification with Deep Convolutional Neural Networks paper, which introduces the AlexNet architecture. The implementation uses Keras as the framework. To see the full implementation, please refer to this repository. Also, if you want to read other "Summary and Implementation" posts, feel free to check them at my blog.
  • This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of the Visualizing and Understanding Convolutional Networks paper, which introduces the ZFNet and DeconvNet architectures. The implementation uses PyTorch as the framework. To see the full implementation, please refer to this repository. Also, if you want to read other "Summary and Implementation" posts, feel free to check them at my blog.
  • This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of the Very Deep Convolutional Networks for Large-Scale Image Recognition paper, which introduces the VGGNet architecture. The implementation uses PyTorch as the framework. To see the full implementation, please refer to this repository. Also, if you want to read other "Summary and Implementation" posts, feel free to check them at my blog.
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Supervised learning It is itself categorized into 2 types of problems: regression problems and classification problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function.
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Linear Regression (one variable) where ($x_i$, $y_i$) are values from the training set. Example:
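The preview above truncates the post's formulas, so as a minimal illustration of one-variable linear regression on a toy training set of ($x_i$, $y_i$) pairs (the data here is made up, and the closed-form fit shown is one standard route, not necessarily the one the notes take):

```python
import numpy as np

# Hypothetical training set (x_i, y_i), generated from y = 1 + 2x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Fit h(x) = theta_0 + theta_1 * x by least squares.
X = np.column_stack([np.ones_like(x), x])     # prepend an intercept column
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
# theta recovers [1.0, 2.0], i.e. h(x) = 1 + 2x
```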
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Introduction In classification, your algorithm produces discrete outcomes: one/zero, yes/no, do/don't, and so on. Spam versus non-spam email is a traditional example of a classification task: you want to predict whether the email fed to your program is spam or not, where usually 0 means not spam and 1 means spam. Formally, we want to predict a variable $y \in \{0,1\}$, where 0 is called the negative class and 1 the positive class. Such a task is known as binary classification.
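A minimal sketch of such a binary classifier, assuming the logistic-regression setup the course uses (the parameter values here are invented for illustration): the hypothesis squashes $\theta^Tx$ through a sigmoid and predicts the positive class when the output is at least 0.5.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    """Binary classification: return 1 (positive class, e.g. spam) when
    h(x) = sigmoid(theta^T x) >= 0.5, else 0 (negative class)."""
    return int(sigmoid(theta @ x) >= 0.5)

# Hypothetical parameters: one feature plus a bias term.
theta = np.array([-4.0, 1.0])                 # decision boundary at x = 4
print(predict(theta, np.array([1.0, 6.0])))   # 1 -> positive class
print(predict(theta, np.array([1.0, 2.0])))   # 0 -> negative class
```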
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Introduction Consider the problem of predicting $y$ from $x \in \mathbb{R}$. The leftmost picture shows the result of $h(x)=\theta_0 + \theta_1x_1$. You can notice that the line doesn't fit the plotted data well. This is called underfitting, or high bias.
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Evaluating a Learning Algorithm Suppose you have implemented regularized linear regression to predict housing prices. However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions (perhaps it overfits or underfits the training set?). Thus, you want to improve it. Errors in your predictions can be troubleshot by: Getting more training examples. Trying smaller sets of features.
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Anomaly Detection System Consider a set of points $\{x^{(1)}, x^{(2)}, \cdots, x^{(m)}\}$ in a training set (represented by blue points) representing the regular distribution of features $x_1^{(i)}$ and $x_2^{(i)}$. The aim of anomaly detection is to separate anomalies in the test set (represented by the red points) based on the distribution of features in the training set (represented by the blue points). For example, in the plot below, while point A is not an outlier, points B and C in the test set can be considered anomalous (or outliers). Formally, in anomaly detection the $m$ training examples are considered normal or non-anomalous, and the algorithm must decide whether the next example, $x_{test}$, is anomalous or not. So given the training set, it must come up with a model $p(x)$ that gives the probability of a sample being normal (high probability means normal, low probability means anomaly). The resulting decision boundary is defined by:
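A minimal sketch of that idea, assuming the course's basic model where $p(x)$ is a product of independent univariate Gaussians fitted on the normal training examples, and a sample is flagged anomalous when $p(x) < \varepsilon$ (the data and the threshold $\varepsilon$ here are invented for illustration):

```python
import numpy as np

def fit_gaussian(X):
    """Estimate per-feature mean and variance from the (normal) training set."""
    return X.mean(axis=0), X.var(axis=0)

def p(x, mu, var):
    """p(x) as a product of independent univariate Gaussian densities."""
    return np.prod(np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var))

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 2))   # normal (non-anomalous) examples
mu, var = fit_gaussian(X_train)

eps = 1e-3                                      # threshold epsilon (assumption)
x_normal = np.array([0.1, -0.2])
x_outlier = np.array([6.0, 6.0])
print(p(x_normal, mu, var) > eps)    # True  -> treated as normal
print(p(x_outlier, mu, var) > eps)   # False -> flagged as anomalous
```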
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Note that Decision Trees and Random Forests were not tackled in the course but are considered must-know concepts for everyone starting in the field. I wouldn't have been able to understand Decision Trees and Random Forests without the StatQuest videos. Feel free to check him out. Also, if you want to read my other notes, feel free to check them at my blog. I) Decision Tree Decision Trees are versatile machine learning algorithms that can perform both classification and regression tasks, and even multi-output tasks. They are very powerful algorithms, capable of fitting complex datasets.
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Intuition The goal of the SVM (Support Vector Machine) algorithm is to draw a line that divides your training data into positive and negative samples as well as possible, then classify new data by determining on which side of the hyperplane it lies. Consider the following figure, in which x's represent positive training examples, o's denote negative training examples, a decision boundary (the line given by the equation $\theta^Tx=0$, also called the separating hyperplane) is shown, and three points are labeled A, B, and C.
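The classification step above reduces to checking the sign of $\theta^Tx$. A tiny sketch of that check, with an invented hyperplane for illustration (training $\theta$ itself is the subject of the post):

```python
import numpy as np

def classify(theta, x):
    """The sign of theta^T x tells which side of the separating
    hyperplane theta^T x = 0 the point lies on."""
    return 1 if theta @ x > 0 else -1

# Hypothetical hyperplane x1 + x2 - 3 = 0; theta's first entry is the bias,
# and each x is prepended with a constant 1 to match it.
theta = np.array([-3.0, 1.0, 1.0])
print(classify(theta, np.array([1.0, 4.0, 2.0])))   # 1: positive side
print(classify(theta, np.array([1.0, 0.5, 0.5])))   # -1: negative side
```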
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Introduction In the clustering problem, we are given a training set $\{x^{(1)},...,x^{(m)}\}$, and want to group the data into a few cohesive "clusters". Here, $x^{(i)} \in \mathbb{R}^n$ as usual, but no labels $y^{(i)}$ are given. So, this is an unsupervised learning problem. The k-means clustering algorithm is as follows: $$\begin{aligned} &\text{1. Initialize cluster centroids } \mu_1,\mu_2,...,\mu_k \in \mathbb{R}^n \text{ randomly} \\ &\text{2. Repeat until convergence:}\end{aligned}$$
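The two steps above (randomly initialize centroids, then alternate assignment and centroid updates) can be sketched as follows; this is a minimal illustration, not the post's implementation, and the two-blob data is made up:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal k-means: pick k centroids randomly from X, then repeat
    the assignment step and the centroid-update step."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest centroid.
        d = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: move each centroid to the mean of its points
        # (keeping the old centroid if its cluster went empty).
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

# Two well-separated blobs; k-means recovers the grouping without labels.
rng_data = np.random.default_rng(1)
A = rng_data.normal(0.0, 0.2, size=(10, 2))
B = rng_data.normal(5.0, 0.2, size=(10, 2))
X = np.vstack([A, B])
labels, centroids = kmeans(X, 2)
```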
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Introduction PCA is a dimensionality reduction algorithm. There are 2 motivations to use PCA: Data compression. Visualization. 1) Data compression
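A minimal sketch of the data-compression motivation, using the standard SVD route on mean-centered data (the toy dataset is made up and deliberately rank-1 so one component carries all the variance; the course's own notation may differ):

```python
import numpy as np

# Toy 3-D points that actually lie on a line, so 1 component suffices.
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = t @ np.array([[2.0, 1.0, 0.5]])

# PCA via SVD of the mean-centered data.
mean = X.mean(axis=0)
Xc = X - mean
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 1                                 # keep the top principal component
Z = Xc @ Vt[:k].T                     # compressed representation: (100, 1)
X_rec = Z @ Vt[:k] + mean             # decompress back to (100, 3)
# Because the data is rank 1, the reconstruction is essentially exact.
```

Compression here means storing 1 number per point instead of 3, plus the shared direction `Vt[:1]` and the mean.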
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Introduction A recommender system, or recommendation system, is a subclass of information filtering systems that seeks to predict the "rating" or "preference" a user would give to an item. Problem Formulation (predicting movie ratings case): Let's define:
  • This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of the Deep Residual Learning for Image Recognition and Study of Residual Networks for Image Recognition papers, which introduce the ResNet architecture. The implementation uses PyTorch as the framework. To see the full implementation, please refer to this repository. Also, if you want to read other "Summary and Implementation" posts, feel free to check them at my blog. I) Summary
  • These are my personal notes taken for the course Machine Learning by Stanford. Feel free to check the assignments. Also, if you want to read my other notes, feel free to check them at my blog. I) Introduction The popularity of machine learning techniques has increased in the recent past. One of the reasons for this trend is the exponential growth in data available to learn from. Large datasets coupled with a high-variance model have the potential to perform well. But as dataset sizes increase, they pose various problems in terms of the space and time complexities of the algorithms. Datasets can often approach sizes of $m = 100{,}000{,}000$. In this case, each gradient descent step (for what is also called batch gradient descent) has to sum over all one hundred million examples. We will see some approaches to avoid this. II) Stochastic Gradient Descent Stochastic gradient descent is an alternative to classic gradient descent (batch gradient descent) and is more efficient and scalable to large datasets.
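The contrast above can be sketched in a few lines: instead of summing the gradient over all $m$ examples before each update, stochastic gradient descent updates the parameters after every single example. This is a minimal illustration on invented, noise-free linear-regression data, with learning rate and epoch count chosen for the demo rather than taken from the course:

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.05, epochs=200, seed=0):
    """Stochastic gradient descent: update theta after EACH example,
    rather than summing the gradient over all m examples (batch GD)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):       # shuffle, then one example at a time
            grad = (X[i] @ theta - y[i]) * X[i]  # gradient for example i alone
            theta -= lr * grad
    return theta

# Toy data: y = 1 + 2x, with an intercept column prepended.
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
theta = sgd_linear_regression(X, y)
# theta converges toward [1.0, 2.0]
```

Each update touches one example, so the per-step cost is independent of $m$, which is the point when $m$ is in the hundreds of millions.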
  • This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of the Gradient-Based Learning Applied to Document Recognition paper, which introduces the LeNet-5 architecture. The implementation uses Keras as the framework. For implementations in other frameworks, please refer to this repository. Also, if you want to read other "Summary and Implementation" posts, feel free to check them at my blog. I) Summary
  • This post is divided into 2 sections: Summary and Implementation. We are going to have an in-depth review of the EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks paper, which introduces the EfficientNet architecture. The implementation uses Keras as the framework. To see the full implementation, please refer to this repository. Also, if you want to read other "Summary and Implementation" posts, feel free to check them at my blog.