<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://jugalgandhi.me/feed.xml" rel="self" type="application/atom+xml" /><link href="https://jugalgandhi.me/" rel="alternate" type="text/html" /><updated>2026-03-04T16:57:44+00:00</updated><id>https://jugalgandhi.me/feed.xml</id><title type="html">Learn something everyday.</title><subtitle>Find tech blogs and articles.</subtitle><entry><title type="html">Longest Substring Unique Characters</title><link href="https://jugalgandhi.me/articles/longest-unique-characters-substring/" rel="alternate" type="text/html" title="Longest Substring Unique Characters" /><published>2019-03-24T00:00:00+00:00</published><updated>2019-03-24T00:00:00+00:00</updated><id>https://jugalgandhi.me/articles/longest-unique-characters-substring</id><content type="html" xml:base="https://jugalgandhi.me/articles/longest-unique-characters-substring/"><![CDATA[<p>Queue implementation to find the longest substring inside a string with non-repeating characters.</p>

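<p>Time Complexity: O(N) for a fixed alphabet. Each character is added to and removed from the queue at most once; the <code>contains</code> check scans the queue, whose length never exceeds the number of distinct characters.</p>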
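
<p>The queue-based listing below rescans the window on every step. As a point of comparison, here is a minimal sketch of a common variant that remembers the last index at which each character was seen, so the duplicate check becomes a table lookup. The class name <code>LastSeenWindow</code> is illustrative and not part of the original solution.</p>

```java
import java.util.HashMap;
import java.util.Map;

final class LastSeenWindow {
    // Returns the length of the longest substring of s with no repeated character.
    static int lengthOfLongestSubstring(String s) {
        Map<Character, Integer> lastSeen = new HashMap<>();
        int best = 0;
        int start = 0; // left edge of the current window (inclusive)
        for (int end = 0; end < s.length(); end++) {
            char c = s.charAt(end);
            Integer prev = lastSeen.get(c);
            // If c already occurs inside the window, move the left edge just past it.
            if (prev != null && prev >= start) {
                start = prev + 1;
            }
            lastSeen.put(c, end);
            best = Math.max(best, end - start + 1);
        }
        return best;
    }
}
```

<p>For example, <code>LastSeenWindow.lengthOfLongestSubstring("abcabcbb")</code> returns 3 (the window "abc"), and the empty string returns 0.</p>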

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kt">int</span> <span class="nf">lengthOfLongestSubstring</span><span class="o">(</span><span class="nc">String</span> <span class="n">s</span><span class="o">)</span> <span class="o">{</span>
        <span class="kt">int</span> <span class="n">maxLength</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span>


        <span class="k">if</span><span class="o">(</span><span class="n">s</span><span class="o">.</span><span class="na">length</span><span class="o">()</span> <span class="o">==</span> <span class="mi">0</span><span class="o">)</span>
            <span class="k">return</span> <span class="n">maxLength</span><span class="o">;</span>
        <span class="k">if</span><span class="o">(</span><span class="n">s</span><span class="o">.</span><span class="na">length</span><span class="o">()</span> <span class="o">==</span> <span class="mi">1</span><span class="o">)</span>
            <span class="k">return</span> <span class="mi">1</span><span class="o">;</span>

        <span class="kt">int</span> <span class="n">endIndex</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span>
        
        <span class="c1">//Keep adding distinct characters to queue</span>
        <span class="nc">Queue</span><span class="o">&lt;</span><span class="nc">Character</span><span class="o">&gt;</span> <span class="n">subLong</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">LinkedList</span><span class="o">&lt;&gt;();</span>

        <span class="k">while</span><span class="o">(</span><span class="n">endIndex</span> <span class="o">&lt;</span> <span class="n">s</span><span class="o">.</span><span class="na">length</span><span class="o">()){</span>
            <span class="nc">Character</span> <span class="n">next</span> <span class="o">=</span> <span class="n">s</span><span class="o">.</span><span class="na">charAt</span><span class="o">(</span><span class="n">endIndex</span><span class="o">);</span>
            <span class="c1">//if the character on next index is already present in queue -&gt; poll the queue</span>
            <span class="c1">//until you find the character</span>
            <span class="k">if</span><span class="o">(</span><span class="n">subLong</span><span class="o">.</span><span class="na">contains</span><span class="o">(</span><span class="n">next</span><span class="o">)){</span>
                <span class="k">while</span><span class="o">(</span><span class="kc">true</span><span class="o">){</span>
                    <span class="nc">Character</span> <span class="n">curr</span> <span class="o">=</span> <span class="n">subLong</span><span class="o">.</span><span class="na">poll</span><span class="o">();</span>
                    <span class="c1">//compare with equals(): == on boxed Characters compares references</span>
                    <span class="k">if</span><span class="o">(</span><span class="n">curr</span><span class="o">.</span><span class="na">equals</span><span class="o">(</span><span class="n">next</span><span class="o">)){</span>
                        <span class="k">break</span><span class="o">;</span>
                    <span class="o">}</span>
                <span class="o">}</span>
            <span class="o">}</span>
            <span class="c1">//every time keep adding in queue and check the size of queue with maxLength</span>
            <span class="n">subLong</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">next</span><span class="o">);</span>
            <span class="n">endIndex</span><span class="o">++;</span>
            <span class="n">maxLength</span> <span class="o">=</span> <span class="nc">Math</span><span class="o">.</span><span class="na">max</span><span class="o">(</span><span class="n">maxLength</span><span class="o">,</span> <span class="n">subLong</span><span class="o">.</span><span class="na">size</span><span class="o">());</span>
        <span class="o">}</span>
        <span class="k">return</span> <span class="n">maxLength</span><span class="o">;</span>
    <span class="o">}</span>
</code></pre></div></div>]]></content><author><name>Jugal Gandhi</name><email>jugalg05@gmail.com</email></author><category term="articles" /><category term="sliding-window" /><category term="queue" /><category term="lc-medium" /><summary type="html"><![CDATA[Queue implementation to find the longest substring inside a string with non-repeating characters.]]></summary></entry><entry><title type="html">Minimum Window Substring</title><link href="https://jugalgandhi.me/articles/minimum-window-substring/" rel="alternate" type="text/html" title="Minimum Window Substring" /><published>2019-03-24T00:00:00+00:00</published><updated>2019-03-24T00:00:00+00:00</updated><id>https://jugalgandhi.me/articles/minimum-window-substring</id><content type="html" xml:base="https://jugalgandhi.me/articles/minimum-window-substring/"><![CDATA[]]></content><author><name>Jugal Gandhi</name><email>jugalg05@gmail.com</email></author><category term="articles" /><category term="sliding-window" /><category term="lc-hard" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Machine Learning on Amazon Machine Learning</title><link href="https://jugalgandhi.me/blog/machine-learning-on-amazon-machine-learning/" rel="alternate" type="text/html" title="Machine Learning on Amazon Machine Learning" /><published>2017-07-13T12:08:50+00:00</published><updated>2017-07-13T12:08:50+00:00</updated><id>https://jugalgandhi.me/blog/machine-learning-on-amazon-machine-learning</id><content type="html" xml:base="https://jugalgandhi.me/blog/machine-learning-on-amazon-machine-learning/"><![CDATA[<p>Amazon Machine Learning is a fully automated, easy-to-use service. It automatically chooses an appropriate machine learning algorithm for you and trains the machine learning model. Because it places minimal responsibility in the user’s hands, it provides quick results without much hassle.</p>

<p>Amazon Machine Learning is a good starting point for beginners who want to learn how machine learning works. Machine learning problems fall broadly into two types: unsupervised and supervised. Amazon Machine Learning only supports supervised problems, in which each row of the dataset has one target label to predict.</p>

<p>Go <a href="https://aws.amazon.com/">here</a> and create your free AWS account. Amazon Machine Learning is one of the many services that AWS provides.</p>

<p>To get started with Amazon Machine Learning, go through this <a href="https://youtu.be/kv60azrH5LM">video</a>.</p>

<p>In this blog, we shall learn how to perform Linear Regression on the House Price Prediction dataset using the Amazon Machine Learning service.</p>

<h3 class="no_toc" id="table-of-contents">TABLE OF CONTENTS</h3>

<ul id="markdown-toc">
  <li><a href="#exploring-amazon-machine-learning-service" id="markdown-toc-exploring-amazon-machine-learning-service">Exploring Amazon Machine Learning Service</a></li>
  <li><a href="#upload-your-dataset" id="markdown-toc-upload-your-dataset">Upload your dataset</a></li>
  <li><a href="#create-data-source" id="markdown-toc-create-data-source">Create Data Source</a></li>
  <li><a href="#create-ml-model-and-train-the-model" id="markdown-toc-create-ml-model-and-train-the-model">Create ML Model and Train the model</a></li>
  <li><a href="#evaluate-the-model" id="markdown-toc-evaluate-the-model">Evaluate the model</a></li>
  <li><a href="#generate-predictions" id="markdown-toc-generate-predictions">Generate predictions</a></li>
  <li><a href="#summary" id="markdown-toc-summary">Summary</a></li>
</ul>

<h1 id="exploring-amazon-machine-learning-service">Exploring Amazon Machine Learning Service</h1>
<p>First, log in to your <a href="https://console.aws.amazon.com/console/">account</a>.</p>

<p>Once you log in, go to the top black bar, click <em>Services</em>, and type <strong>Machine Learning</strong>. In the drop-down, select <strong>Machine Learning</strong>. This shall open the AWS Machine Learning service. The window looks like the one below:</p>

<p><img src="/images/Amazon/intro_items.png" alt="intro_items" /></p>

<p>As you can see, there are four objects:</p>

<ol>
  <li><strong>Datasource:</strong> Your input data.</li>
  <li><strong>ML Model:</strong> Machine learning model chosen by Amazon. This machine learning model is trained using a training datasource.</li>
  <li><strong>Evaluation:</strong> Uses your test data to gauge the performance of your machine learning model.</li>
  <li><strong>Batch Prediction:</strong> Uses batch data to predict values based on the rules it learned using training datasource.</li>
</ol>

<p>But before we do any of this, we have to upload our dataset to our AWS account.</p>

<h1 id="upload-your-dataset">Upload your dataset</h1>
<p>As mentioned in the AWS Machine Learning documentation, Amazon Machine Learning can read your dataset from three different sources: (a) one or more files in <strong>Amazon S3</strong>, (b) the results of an <strong>Amazon Redshift</strong> query, or (c) the results of a query against a database in <strong>Amazon Relational Database Service (RDS)</strong>.</p>

<p>In our case, we will upload our dataset to our Amazon S3 bucket.</p>

<p>In this exercise, we are going to load a public data-set hosted by Kaggle. You can find and download this data-set <a href="https://www.kaggle.com/harlfoxem/housesalesprediction">here</a>. Download it and store it on your local drive. This dataset contains house sale prices for King County. It consists of 21 columns and 21613 rows. Out of these 21 columns, the ‘price’ column is the label (the one we are trying to predict), and the remaining 20 columns are the features (independent predictors). Each row in the dataset is called an observation or data-point.</p>

<p>Once you have the dataset on your local drive, go back to your AWS console and do the following steps:</p>

<ul>
  <li>
    <p><strong>STEP 1</strong>: Go to the top black bar, click <em>Services</em>, and type <strong>s3</strong>. In the drop-down, select <strong>S3</strong>. This shall open the S3 service console. S3 is a simple cloud storage service offered by Amazon.</p>

    <p><img src="/images/Amazon/s3_service.png" alt="s3_service" /></p>
  </li>
  <li>
    <p><strong>STEP 2</strong>: Click the <strong>Create button</strong>. Give your bucket a name and choose your nearest hosting region. Remember that bucket names must be globally unique, so you have to choose a different name than the one shown below. At the end click <strong>Next</strong>.</p>

    <p><img src="/images/Amazon/create_bucket.png" alt="create_bucket" /></p>
  </li>
  <li>
    <p><strong>STEP 3</strong>: Keep hitting next and at the end click <strong>Create Bucket</strong>. For now, we are not going to set any properties or permissions for the bucket; we shall keep the default settings. This shall create a new bucket with the name you specified, and you shall find it in the list of your buckets.</p>

    <p><img src="/images/Amazon/bucket_list.png" alt="bucket_list" /></p>
  </li>
  <li>
    <p><strong>STEP 4</strong>: Now click your bucket, click the <strong>Upload</strong> button, and in the <strong>Select Files</strong> window click <strong>Add Files</strong> to upload the downloaded .csv file from your local drive to Amazon S3. Don’t change any default properties; keep hitting next until the end.</p>

    <p><img src="/images/Amazon/upload_file.png" alt="upload_file" /></p>
  </li>
</ul>

<h1 id="create-data-source">Create Data Source</h1>

<p>We now have our dataset uploaded to AWS S3. Go back to your AWS Machine Learning Console. We can create a new datasource with the following steps:</p>

<ul>
  <li>
    <p><strong>STEP 1</strong>: Click the <strong>Create New…</strong> drop-down menu and select <strong>Datasource</strong>. This shall open the <strong>Create Datasource</strong> window.</p>
  </li>
  <li>
    <p><strong>STEP 2</strong>: <strong>Input Data</strong> pane. Under location select <strong>S3</strong>. Specify your bucket location and give your own Datasource name. Then click <strong>Verify</strong>.</p>

    <p><img src="/images/Amazon/input_data.png" alt="input_data" /></p>

    <p>This shall verify a few permissions. If asked to grant any permission, press <strong>Yes</strong>. The validation will be performed next. Once the validation is successful, click <strong>Continue</strong>.</p>
  </li>
  <li>
    <p><strong>STEP 3</strong>: <strong>Schema</strong> pane. At the <em>“Does the first line in your CSV contain the column names?”</em> choose “Yes”. Now you can see your columns being listed. Here, each row represents one column of your dataset, so there are 21 rows for the 21 columns. You can also view the data-type of each column along with its name. Change the data-type of the features ‘grade, waterfront, view, bedrooms, condition and zipcode’ to categorical.</p>

    <p><img src="/images/Amazon/schema.png" alt="schema" /></p>
  </li>
  <li>
    <p><strong>STEP 4</strong>: <strong>Target</strong> pane. At the <em>“Do you plan to use this dataset to create or evaluate an ML model?”</em> choose “Yes”. We do this because we are going to use the same dataset to evaluate the trained machine learning model. Next, you will be asked to choose a <strong>Target</strong>. So, under <strong>Target</strong> column select <strong>price</strong>, as price is what we are trying to predict here.</p>

    <p><img src="/images/Amazon/target.png" alt="target" /></p>

    <p>Once you select <strong>price</strong>, Amazon will automatically detect that it is a numeric attribute and show you that this is a linear regression problem. Cool! Automation! Now click <strong>Continue</strong>.</p>
  </li>
  <li>
    <p><strong>STEP 5</strong>: <strong>Row</strong> pane. At the <em>“Does your data contain an identifier?”</em> choose “No”.</p>
  </li>
  <li>
    <p><strong>STEP 6</strong>: <strong>Review</strong> pane. In this pane you review all the changes made to your datasource. Glance through them and then click <strong>Create Datasource</strong>. This shall open a new window as shown below:</p>

    <p><img src="/images/Amazon/new_datasource.png" alt="new_datasource" /></p>
  </li>
</ul>

<hr />

<h1 id="create-ml-model-and-train-the-model">Create ML Model and Train the model</h1>

<p>The next step is to create an ML (Machine Learning) model. Go back to your AWS Machine Learning Console and do the following steps:</p>

<ul>
  <li>
    <p><strong>STEP 1</strong>: Click the <strong>Create New…</strong> drop-down menu and select <strong>ML model</strong>. This shall open the <strong>Create ML model</strong> window.</p>
  </li>
  <li>
    <p><strong>STEP 2</strong>: <strong>Input Data</strong> pane. Choose the “I already created a datasource pointing to my S3 data” option; the datasource that you created in the earlier step will then be listed. Click your datasource.</p>

    <p><img src="/images/Amazon/ml_input_data.png" alt="ml_input_data" /></p>

    <p>Now click <strong>Continue</strong>.</p>
  </li>
  <li>
    <p><strong>STEP 3</strong>: <strong>ML model settings</strong> pane. Give a name to your ML model and a name to your Evaluation. Here, we are using <strong>Default Recipe</strong> to train our ML model.</p>

    <p><img src="/images/Amazon/ml_model_settings.png" alt="ml_model_settings" /></p>

    <p>Now click <strong>Review</strong>.</p>
  </li>
  <li>
    <p><strong>STEP 4</strong>: <strong>Review</strong> pane. Review your changes here and then click <strong>Create ML Model</strong>. Once you create your ML model, a new pane shall open which shows you the status of your ML model. This is shown in the figure below:</p>

    <p><img src="/images/Amazon/ml_model_status.png" alt="ml_model_status" /></p>

    <p>The status will show pending at first, but after a few minutes it should change to <strong>“Completed”</strong>. Once the ML model training is done, the evaluation will be performed automatically.</p>
  </li>
</ul>

<p>To explore the training process, select <strong>Download log</strong> under the <strong>ML model summary</strong> block. This log traces the entire process executed to train your machine learning model. For linear regression, Amazon Machine Learning uses <strong>Stochastic Gradient Descent</strong> + <strong>Squared Loss function</strong>. The log shows how the linear regression model tries different learning rates to find the weights that give the least error. The end of the log shows the parameters chosen to train the model, as shown in the figure below.</p>

<p><img src="/images/Amazon/download_log.png" alt="download_log" /></p>

<p>As seen above, the learning rate chosen is 1.0 and the RMSE on the training data set is 141915.4004.</p>
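
<p>Amazon Machine Learning’s internals are not visible beyond the log, so the following is only a generic sketch of stochastic gradient descent on the squared loss for a linear model. The class name <code>SgdLinearRegression</code>, the fixed visiting order of the samples, and the learning rate and epoch count in the note after the listing are assumptions made for illustration.</p>

```java
final class SgdLinearRegression {
    // Fits y ≈ w·x + b with one weight update per training sample.
    // Returns the weights, with the bias stored in the last slot.
    static double[] fit(double[][] x, double[] y, double learningRate, int epochs) {
        int d = x[0].length;
        double[] w = new double[d + 1];
        for (int epoch = 0; epoch < epochs; epoch++) {
            // Samples are visited in a fixed order here; practical SGD usually shuffles them.
            for (int i = 0; i < x.length; i++) {
                double residual = predict(w, x[i]) - y[i];
                // Gradient of 0.5 * residual^2 with respect to each weight.
                for (int j = 0; j < d; j++) {
                    w[j] -= learningRate * residual * x[i][j];
                }
                w[d] -= learningRate * residual;
            }
        }
        return w;
    }

    // Evaluates the linear model on a single row of features.
    static double predict(double[] w, double[] row) {
        double sum = w[w.length - 1];
        for (int j = 0; j < row.length; j++) {
            sum += w[j] * row[j];
        }
        return sum;
    }
}
```

<p>On the toy data x = 0, 1, 2, 3 with y = 2x + 1, calling <code>fit(x, y, 0.05, 2000)</code> recovers a slope close to 2 and a bias close to 1. With unscaled features such as house areas, a learning rate this large would diverge, so features are normally scaled first.</p>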

<hr />

<h1 id="evaluate-the-model">Evaluate the model</h1>

<p>Now, there should be five ML objects in your Machine Learning Dashboard, as shown below. You can check the status of all your ML objects here and watch their progress by refreshing your AWS Machine Learning Console. Wait for the status to show “Completed”.</p>

<p><img src="/images/Amazon/dashboard.png" alt="dashboard" /></p>

<p>Now, select <strong>“Evaluation: ML Model”</strong> for looking at the evaluation summary.</p>

<p><img src="/images/Amazon/ml_model_performance.png" alt="ml_model_performance" /></p>

<p>You can click “Explore model performance” to visualize the histogram of errors.</p>

<p><img src="/images/Amazon/histogram_errors.png" alt="histogram_errors" /></p>

<p>The green dotted line in the above figure marks the zero point on the x-axis. You can change the size of the bin intervals by choosing different options.</p>

<hr />

<h1 id="generate-predictions">Generate predictions</h1>

<p>Amazon machine learning allows you to do both batch predictions and real-time predictions.</p>

<hr />

<h1 id="summary">Summary</h1>

<hr />]]></content><author><name>Jugal Gandhi</name><email>jugalg05@gmail.com</email></author><category term="blog" /><summary type="html"><![CDATA[Amazon Machine Learning is a fully automated, easy-to-use service. It automatically chooses an appropriate machine learning algorithm for you and trains the machine learning model. Because it places minimal responsibility in the user’s hands, it provides quick results without much hassle.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jugalgandhi.me/%7B%22feature%22=%3E%22amazon_machine_learning.png%22%7D" /><media:content medium="image" url="https://jugalgandhi.me/%7B%22feature%22=%3E%22amazon_machine_learning.png%22%7D" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Understanding Linear Regression Evaluation Metrics</title><link href="https://jugalgandhi.me/articles/understanding-linear-regression-evaluation-metrics/" rel="alternate" type="text/html" title="Understanding Linear Regression Evaluation Metrics" /><published>2017-07-09T00:00:00+00:00</published><updated>2017-07-09T00:00:00+00:00</updated><id>https://jugalgandhi.me/articles/understanding-linear-regression-evaluation-metrics</id><content type="html" xml:base="https://jugalgandhi.me/articles/understanding-linear-regression-evaluation-metrics/"><![CDATA[<p>Linear Regression is a type of supervised machine learning technique. ‘Supervised Machine Learning’ is a machine learning technique where many independent predictors are used to predict a target variable. Using this technique, you can fit an equation that maps the independent predictors to your outcome target variable.</p>

<p>In a supervised machine learning technique, you train your machine learning model by supervising it with training data. Training data consists of many rows of independent predictors, and each row has a corresponding target outcome variable. The model learns rules or equations by finding the closest relation between the independent predictors (features) and the target outcome variable. Then, you give it some new test data and ask it to predict the outcome target variable. The trained model predicts values for the test data based on the rules it learnt by analyzing the training data.</p>

<p>You can see how to use <strong>Linear Regression</strong> to predict house prices <a href="">here</a>.</p>

<p>The most important term here is the <strong>Residual</strong> (error). A residual is the difference between the actual and predicted value of the outcome target variable. Most of the metrics below are obtained from this residual term.</p>

<p>There are many metrics to evaluate your ‘Linear Regression Model’ which are as follows:</p>

<ol id="markdown-toc">
  <li><a href="#mean-absolute-errormae" id="markdown-toc-mean-absolute-errormae">Mean Absolute Error(MAE)</a></li>
  <li><a href="#mean-squared-errormse" id="markdown-toc-mean-squared-errormse">Mean Squared Error(MSE)</a></li>
  <li><a href="#root-mean-squared-errorrmse" id="markdown-toc-root-mean-squared-errorrmse">Root Mean Squared Error(RMSE)</a></li>
  <li><a href="#relative-absolute-errorrae" id="markdown-toc-relative-absolute-errorrae">Relative Absolute Error(RAE)</a></li>
  <li><a href="#relative-squared-errorrse" id="markdown-toc-relative-squared-errorrse">Relative Squared Error(RSE)</a></li>
  <li><a href="#r-squaredco-efficent-of-determination" id="markdown-toc-r-squaredco-efficent-of-determination">R-squared(Coefficient of Determination)</a></li>
</ol>

<p>Out of all these, <strong>RMSE</strong> and <strong>R-squared</strong> are the most important metrics to understand how well your model fits your data. Depending on your problem dataset some metrics are more useful to understand how your model performs. Let us understand what these metrics mean.</p>

<h1 id="mean-absolute-errormae">Mean Absolute Error(MAE)</h1>
<p>Mean absolute error is obtained by summing the absolute values of the <strong>residuals</strong> (the differences between the predicted and the actual values) and dividing by the number of observations. It gives an idea of the magnitude of error in the predictions, ignoring its direction. It can also be described as the mean of the absolute residuals.</p>

<hr />

<h1 id="mean-squared-errormse">Mean Squared Error(MSE)</h1>
<p>Mean squared error is obtained by summing the squares of the <strong>residuals</strong> (the differences between the predicted and actual values) and dividing by the number of observations. It is a measure of how close a fitted line is to the data points. The smaller the Mean Squared Error, the closer the fit is to the data.</p>

<hr />

<h1 id="root-mean-squared-errorrmse">Root Mean Squared Error(RMSE)</h1>
<p>Root mean squared error is the square root of the mean squared error. When the <strong>residuals</strong> average to zero, it equals their standard deviation. Like MAE, it measures the typical size of an error, but because the errors are squared before averaging, it amplifies and severely punishes large errors. RMSE is easier to interpret than MSE because it has the same units as the target value (the vertical axis).</p>
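
<p>The three definitions so far can be sketched in a few lines of Java. The class name <code>ErrorMetrics</code> and the sample values in the note after the listing are illustrative only; both arrays are assumed to have the same non-zero length.</p>

```java
final class ErrorMetrics {
    // Mean of the absolute residuals.
    static double mae(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    // Mean of the squared residuals.
    static double mse(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double residual = actual[i] - predicted[i];
            sum += residual * residual;
        }
        return sum / actual.length;
    }

    // Square root of the MSE, in the same units as the target.
    static double rmse(double[] actual, double[] predicted) {
        return Math.sqrt(mse(actual, predicted));
    }
}
```

<p>For actual values 3, -0.5, 2, 7 and predictions 2.5, 0, 2, 8, the residuals are 0.5, -0.5, 0 and -1, so MAE is 0.5, MSE is 0.375 and RMSE is about 0.612. The single residual of size 1 contributes half of the MAE but two thirds of the MSE.</p>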

<hr />

<h1 id="relative-absolute-errorrae">Relative Absolute Error(RAE)</h1>
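<p>RAE is the total absolute error (the sum of the absolute <strong>residuals</strong>) expressed as a fraction of the total absolute error of a baseline that always predicts the mean of the outcome target variable. A value below 1 means the model does better than that baseline.</p>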

<hr />

<h1 id="relative-squared-errorrse">Relative Squared Error(RSE)</h1>
<p>RSE is the total squared error (the sum of <strong>residual^2</strong>) expressed as a fraction of the total squared error of a baseline that always predicts the mean of the outcome target variable.</p>
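
<p>Definitions of the relative errors vary between tools. The sketch below assumes one widely used convention, in which the model’s total error is divided by the total error of a baseline that always predicts the mean of the actual values. The class name <code>RelativeErrors</code> is illustrative.</p>

```java
final class RelativeErrors {
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Sum of |residual| divided by sum of |actual - mean(actual)|.
    // The result is not a finite number when all actual values are equal.
    static double rae(double[] actual, double[] predicted) {
        double avg = mean(actual);
        double modelError = 0.0;
        double baselineError = 0.0;
        for (int i = 0; i < actual.length; i++) {
            modelError += Math.abs(actual[i] - predicted[i]);
            baselineError += Math.abs(actual[i] - avg);
        }
        return modelError / baselineError;
    }

    // Sum of residual^2 divided by sum of (actual - mean(actual))^2.
    static double rse(double[] actual, double[] predicted) {
        double avg = mean(actual);
        double modelError = 0.0;
        double baselineError = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double residual = actual[i] - predicted[i];
            double deviation = actual[i] - avg;
            modelError += residual * residual;
            baselineError += deviation * deviation;
        }
        return modelError / baselineError;
    }
}
```

<p>For actual values 3, -0.5, 2, 7 and predictions 2.5, 0, 2, 8, the mean of the actual values is 2.875, which gives RAE = 2 / 8.5 and RSE = 1.5 / 29.1875.</p>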

<hr />

<h1 id="r-squaredco-efficent-of-determination">R-squared(Coefficient of Determination)</h1>
<p>R-squared is obtained by dividing the Sum of Squared Errors (SSE) by the Total Sum of Squares (TSS) and subtracting the result from one. SSE is the sum of <strong>residuals</strong>^2. TSS is the sum of the squared differences between the target variable values and their mean; it represents the total variation in the target variable (<strong>y</strong>). Thus, R-squared depends directly on SSE: for a given dataset, the lower the model’s SSE, the higher the model’s R-squared.</p>

<p>R-squared is also described as ‘how much of the total variation is described by the regression line’ or ‘how much of the total variation in the outcome target variable (<strong>y</strong>) is described by the independent predictors (<strong>x</strong>)’.</p>
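
<p>As a small worked sketch of the formula R-squared = 1 - SSE / TSS, the following Java method computes it directly from the actual and predicted values. The class name <code>RSquared</code> is illustrative, and the arrays are assumed to have the same non-zero length with at least two distinct actual values.</p>

```java
final class RSquared {
    static double rSquared(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (double v : actual) {
            sum += v;
        }
        double mean = sum / actual.length;

        double sse = 0.0; // sum of squared residuals
        double tss = 0.0; // total variation of the target around its mean
        for (int i = 0; i < actual.length; i++) {
            double residual = actual[i] - predicted[i];
            double deviation = actual[i] - mean;
            sse += residual * residual;
            tss += deviation * deviation;
        }
        return 1.0 - sse / tss;
    }
}
```

<p>Perfect predictions give SSE = 0 and therefore an R-squared of 1, while a model that always predicts the mean of the target has SSE = TSS and an R-squared of 0. For actual values 3, -0.5, 2, 7 and predictions 2.5, 0, 2, 8, SSE is 1.5 and TSS is 29.1875, so R-squared is about 0.949.</p>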

<p><img src="/images/r-squared.jpg" alt="r-squared" /></p>]]></content><author><name>Jugal Gandhi</name><email>jugalg05@gmail.com</email></author><category term="articles" /><summary type="html"><![CDATA[Linear Regression is a type of supervised machine learning technique. ‘Supervised Machine Learning’ is a machine learning technique where many independent predictors are used to predict a target variable. Using this technique, you can fit an equation that maps the independent predictors to your outcome target variable.]]></summary></entry><entry><title type="html">Machine Learning on Microsoft Azure Machine Learning Studio</title><link href="https://jugalgandhi.me/blog/machine-learning-on-microsoft-azure-machine-learning-studio/" rel="alternate" type="text/html" title="Machine Learning on Microsoft Azure Machine Learning Studio" /><published>2017-07-05T12:08:50+00:00</published><updated>2017-07-05T12:08:50+00:00</updated><id>https://jugalgandhi.me/blog/machine-learning-on-microsoft-azure-machine-learning-studio</id><content type="html" xml:base="https://jugalgandhi.me/blog/machine-learning-on-microsoft-azure-machine-learning-studio/"><![CDATA[<p>Microsoft Azure Machine Learning Studio is an easy drag-and-drop cloud tool that enables you to perform machine learning at scale. It allows you to upload your own custom data-set and perform supervised or unsupervised machine learning tasks on it.</p>

<p>To access Microsoft Azure Machine Learning Studio, just log into your Azure account. If you don’t have one, you can create it <a href="https://azure.microsoft.com/">here</a>. One good feature of Microsoft Azure is that it offers $200 of free credits on every new account without asking for any credit card details.</p>

<p>Here is a brief <a href="https://www.youtube.com/watch?v=kZ04LnSjWek">video</a> explaining how to use Microsoft Azure Machine Learning Studio.</p>

<p>In this blog, we shall learn how to perform <em>Linear Regression</em> on <strong>House Price Prediction</strong> Dataset using Microsoft Azure Machine Learning Studio.</p>

<h3 class="no_toc" id="table-of-contents">TABLE OF CONTENTS</h3>

<ul id="markdown-toc">
  <li><a href="#exploring-microsoft-azure-machine-learning-studio" id="markdown-toc-exploring-microsoft-azure-machine-learning-studio">Exploring Microsoft Azure Machine Learning Studio</a></li>
  <li><a href="#creating-a-new-experiment-and-loading-custom-dataset" id="markdown-toc-creating-a-new-experiment-and-loading-custom-dataset">Creating a New Experiment and Loading Custom Dataset</a></li>
  <li><a href="#visualizing-and-analyzing-the-data-set" id="markdown-toc-visualizing-and-analyzing-the-data-set">Visualizing and Analyzing the data-set</a></li>
  <li><a href="#pre-processing-the-data-set" id="markdown-toc-pre-processing-the-data-set">Pre-processing the data-set</a></li>
  <li><a href="#using-a-machine-learning-model-to-train-the-dataset" id="markdown-toc-using-a-machine-learning-model-to-train-the-dataset">Using a machine learning model to train the dataset</a></li>
  <li><a href="#scoring-the-dataset" id="markdown-toc-scoring-the-dataset">Scoring the dataset</a></li>
  <li><a href="#performing-cross-validation" id="markdown-toc-performing-cross-validation">Performing Cross-Validation</a></li>
  <li><a href="#evaluating-the-results" id="markdown-toc-evaluating-the-results">Evaluating the results</a></li>
  <li><a href="#deploying-it-as-a-web-service" id="markdown-toc-deploying-it-as-a-web-service">Deploying it as a web-service</a></li>
  <li><a href="#summary" id="markdown-toc-summary">Summary</a></li>
</ul>

<hr />

<h1 id="exploring-microsoft-azure-machine-learning-studio">Exploring Microsoft Azure Machine Learning Studio</h1>

<p>Microsoft Azure Machine Learning Studio is a powerful software tool that helps you process your data using inbuilt modules without writing any code. It also allows users to write custom R and Python scripts for more advanced processing of the data.</p>

<p>Microsoft Azure Machine Learning Studio has a well-designed user interface. Once you log in to your <a href="https://studio.azureml.net">workspace</a>, look at the left panel. It has the following options:</p>

<ol>
  <li><strong>PROJECT</strong> : Shows a list of projects you created in this studio. Each project can have different assets: experiments, web services, notebooks, datasets, and trained models.</li>
  <li><strong>EXPERIMENTS</strong> : Shows a list of experiments you created in this studio. AMLS provides sample experiments that show how to do machine learning using this studio. After clicking experiments, click the <strong>SAMPLE</strong> tab to view the sample experiments.</li>
  <li><strong>WEB SERVICES</strong> : You can deploy the experiment as either a Classic web service, or as a New web service that’s based on Azure Resource Manager. This article shows how to deploy your experiment as a web service.</li>
  <li><strong>NOTEBOOKS</strong> : You can create a new <strong>Jupyter Notebook</strong> here in the studio, hosted on the Azure cloud. The notebook opens in a new tab, and you can use it for quickly running code, visualizing data, exploring insights, and trying out ideas. You can learn more about this <a href="https://blogs.technet.microsoft.com/machinelearning/2015/07/24/introducing-jupyter-notebooks-in-azure-ml-studio/">here</a>.</li>
  <li><strong>DATASETS</strong> : Here, you can find all your uploaded datasets. You can upload datasets from local files to this studio. Microsoft Azure Machine Learning Studio gives you about 10 GB of storage for this.</li>
  <li><strong>TRAINED MODELS</strong> : Once you convert your training experiment to a predictive experiment, Azure automatically stores it as a trained model. A trained model is required for ‘Setting up a Web-Service’.</li>
  <li><strong>SETTINGS</strong> : You can view your account settings here. The most useful thing to see here is how much workspace storage has been used.
   <img src="/images/Azure/explore.png" alt="Exploring Studio" /></li>
</ol>

<hr />

<h1 id="creating-a-new-experiment-and-loading-custom-dataset">Creating a New Experiment and Loading Custom Dataset</h1>

<p>In this exercise, we are going to load a public data-set hosted by Kaggle. You can find and download this data-set <a href="https://www.kaggle.com/harlfoxem/housesalesprediction">here</a>. Download it and store it on your local drive. This dataset contains house sale prices for King County. It consists of 21 columns and 21613 rows. Of these 21 columns, the ‘price’ column is the label (the value we are trying to predict), and the remaining 20 columns are the features (independent predictors). Each row in the dataset is called an observation or data point. Once logged in to Azure Machine Learning Studio, perform the following steps:</p>

<ul>
  <li>
    <p><strong>STEP 1</strong>: Create a new experiment by pressing <strong>+NEW</strong> at the bottom left corner and selecting <strong>Blank Experiment</strong>.</p>

    <p><img src="/images/Azure/new_experiment.png" alt="Creating a new experiment" /></p>
  </li>
  <li>
    <p><strong>STEP 2</strong>: Change the default name of the experiment at the top left corner to “House Price Prediction”.</p>

    <p><img src="/images/Azure/change_name.png" alt="Changing the default name" /></p>
  </li>
  <li>
    <p><strong>STEP 3</strong>: Upload the downloaded data-set from your local drive by clicking <strong>+NEW</strong>, selecting <strong>DATASET</strong>, and then selecting <strong>FROM LOCAL FILE</strong>. Give it a proper name; you can then find your custom data-set under the <strong>Saved Datasets</strong> option in the left modules palette. In my case I have named this data-set “kc_house_data.csv”.</p>

    <p><img src="/images/Azure/upload_dataset.png" alt="Uploading custom dataset" /></p>
  </li>
  <li>
    <p><strong>STEP 4</strong>: Select your uploaded data-set under the <strong>Saved Datasets</strong> option in the left modules palette of your experiment. Now drag and drop it onto your experiment canvas.</p>

    <p><img src="/images/Azure/drag_drop.png" alt="Adding the dataset in experiment" /></p>
  </li>
</ul>
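<p>Outside the Studio, the label/feature split described above can be sketched in a few lines of Python. This is only an illustration: the three rows are made-up stand-ins for the real file, and the function name <em>split_label_and_features</em> is hypothetical.</p>

```python
import csv
import io

# Tiny stand-in for kc_house_data.csv; the rows are made-up illustrative values.
SAMPLE_CSV = """id,price,grade,sqft_living
1,221900,7,1180
2,538000,7,2570
3,180000,6,770
"""

def split_label_and_features(csv_text, label_column="price"):
    """Return (labels, features): one label and one feature dict per row."""
    labels, features = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Remove the label from the row; whatever is left are the features.
        labels.append(float(row.pop(label_column)))
        features.append(row)
    return labels, features

labels, features = split_label_and_features(SAMPLE_CSV)
print(labels)
print(features[0])
```

<p>Each row is one observation: the value popped out of it is the label, and the remaining columns are its features.</p>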

<hr />

<h1 id="visualizing-and-analyzing-the-data-set">Visualizing and Analyzing the data-set</h1>

<p>Here, our custom data-set is about predicting the prices of houses. There are two types of machine learning problems: supervised and unsupervised. House price prediction is a supervised machine learning problem and, to be precise, a regression problem rather than a classification problem, because the label is a continuous value. Now that we have our custom data-set in our experiment, the next things to do are to visualize it, analyze it and pre-process it.</p>

<h3 id="1-visualize">1. VISUALIZE</h3>

<p>Microsoft Azure Machine Learning Studio provides you the option to visualize your data-set.</p>

<p><img src="/images/Azure/visualization.png" alt="visualization" /></p>

<p>Microsoft Azure Machine Learning Studio shows only about 100 rows of your dataset rather than the complete dataset. Here, you can select any feature column and the side pane will show you the statistics and a histogram for that column. This helps you gain meaningful insight into your dataset.</p>

<h3 id="2-analyze">2. ANALYZE</h3>

<p>When doing machine learning it is important to get insights into your dataset. Analyzing our dataset here means looking carefully at the values of the feature columns. There are two types of features in machine learning: numeric and non-numeric. Numeric features are further divided into two categories: categorical numeric features and continuous numeric features. By default, Microsoft Azure Machine Learning Studio tags every feature column that has numbers in it as ‘numeric’ (representing continuous numeric). It is very important to analyze the type of each feature and tag it appropriately, because this affects the output of the machine learning model.</p>

<p>In the case of continuous features, there exists a measurable difference between possible feature values. For example: distance, time, cost, temperature, etc. With categorical features, there is a specified number of discrete, possible feature values. For example: Colors, Class, Car Model, etc.</p>

<p>After looking at the values of the feature columns in the dataset, we realize that feature columns like ‘grade, waterfront, view, bedroom, condition and zipcode’ should actually be categorical features rather than the default continuous numeric features. So, this needs to be changed.</p>
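<p>One rough way to shortlist such columns outside the Studio is to count distinct values. The sketch below is a heuristic under assumptions of my own: the function name, the <em>max_distinct</em> cut-off and the sample values are all hypothetical, and a column such as zipcode, which has many distinct values, would still need to be tagged by hand.</p>

```python
def categorical_candidates(columns, max_distinct=15):
    """Return names of columns that have at most max_distinct distinct values."""
    return [
        name
        for name, values in columns.items()
        if max_distinct >= len(set(values))
    ]

# Made-up sample values, only to show the behaviour.
columns = {
    "waterfront": [0, 0, 1, 0, 0, 0],
    "view": [0, 2, 0, 4, 0, 0],
    "sqft_living": [1180, 2570, 770, 1960, 1680, 5420],
}
print(categorical_candidates(columns, max_distinct=3))
```

<p>A low distinct-value count is only a hint. Whether a number is a category or a quantity is ultimately a judgment about what the column means.</p>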

<p>Also, the feature column ‘yr_renovated’ is mostly filled with zeroes; only a few rows have non-zero values. Here, we can replace the entire column with only two values: 0 (representing that the house was never renovated) and 1 (representing that the house was renovated).</p>

<p>All the points identified here need to be handled in the ‘Pre-Process’ task.</p>

<hr />

<h1 id="pre-processing-the-data-set">Pre-processing the data-set</h1>

<p>The goal of the pre-processing task is to make our data-set ready for training the machine learning model.</p>

<p>According to Wikipedia, data pre-processing includes cleaning, instance selection, normalization, transformation, feature extraction and selection, etc. According to <a href="http://www.cs.cssu.edu/">CCSU</a>, the tasks in data pre-processing are:</p>

<ul>
  <li>Data Discretization</li>
  <li>Data Cleaning</li>
  <li>Data Transformation</li>
  <li>Data Reduction</li>
</ul>

<h3 id="1-data-discretization">1. Data Discretization</h3>

<p>Data discretization is a part of data reduction: it replaces numerical attributes with categorical (ordinal/nominal) ones. As discussed in the ‘Analyze’ section, there are some feature columns (grade, waterfront, view, bedroom, condition and zipcode) whose data types need to be changed to categorical. In MAML, you can do this with the following steps:</p>

<ul>
  <li>
    <p><strong>STEP 1</strong>: Drag the <em>Edit Metadata</em> module onto your experiment canvas.</p>
  </li>
  <li>
    <p><strong>STEP 2</strong>: Connect the dataset to this <em>Edit Metadata</em> module.</p>

    <p><img src="/images/Azure/drag_edit_metadata.png" alt="dataset_drag_filter" /></p>
  </li>
  <li>
    <p><strong>STEP 3</strong>: Select the <em>Edit Metadata</em> module and edit the right panel to select all the columns you want to change to categorical type. Here we select the ‘grade, waterfront, view, bedroom, condition and zipcode’ feature columns. Under the <em>Data type</em> drop down menu we select “String” (here either String or Integer works), and under the <em>Categorical</em> drop down menu we select “Make Categorical”.</p>

    <p><img src="/images/Azure/edit_metadata_side_panel.png" alt="Edit metadata side panel" /></p>
  </li>
  <li>
    <p><strong>STEP 4</strong>: Run the experiment and right click the <em>Edit Metadata</em> module to verify the data-types of feature columns. This also changes the histogram and statistics values for the modified feature columns.</p>

    <p><img src="/images/Azure/em_compare.png" alt="em_compare" /></p>
  </li>
</ul>

<h3 id="2-data-cleaning">2. Data Cleaning</h3>

<p>Data cleaning includes filling in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies. MAML provides different modules for cleaning up your dataset. Let’s perform some of these steps:</p>

<p><strong>i) Handling Missing Values:</strong>
Find the missing values in the feature columns and either <em>Replace</em> or <em>Clip</em> them. ‘Replace’ means filling in the missing values, which can be done in several ways; one method is to replace them with the average value of that feature column. ‘Clip’ means removing the entire row that has missing values in it.</p>

<p>Here, our dataset has no missing values.</p>
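<p>Although this dataset needs neither treatment, the two options can be sketched in plain Python. This is a minimal illustration, assuming missing values are represented as <em>None</em>; the function names are hypothetical and are not Studio modules.</p>

```python
from statistics import mean

def replace_missing_with_mean(values):
    """'Replace': fill each missing entry with the mean of the present entries."""
    present = [v for v in values if v is not None]
    fill = mean(present)
    return [fill if v is None else v for v in values]

def drop_rows_with_missing(rows):
    """'Clip': keep only the rows in which no column is missing."""
    return [row for row in rows if all(v is not None for v in row.values())]

print(replace_missing_with_mean([1.0, None, 3.0]))
print(drop_rows_with_missing([{"a": 1, "b": 2}, {"a": None, "b": 3}]))
```

<p>Replacing keeps every row but invents values; dropping keeps only real values but shrinks the dataset. Which one is preferable depends on how many rows are affected.</p>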

<p><strong>ii) Identify or remove outliers:</strong>
  Here, when we visualize the ‘sqft_living’ column, we see that it has some outliers at the high end. The goal is to remove the rows whose value in this column is greater than the 99th percentile threshold. There is no simple way to do this in MAML, so we have to perform the following three steps:</p>

<ul>
  <li>
    <p><strong>STEP 1</strong>: Select <strong>Data Transformation</strong> -&gt; <strong>Scale and Reduce</strong> option from left panel and drag the <strong>Clip Values</strong> module in your experiment canvas. Adjust the values in the right panel as shown in the figure below.</p>

    <p><img src="/images/Azure/clip_values.png" alt="clip_values" /></p>

    <p>This shall add a new column in your dataset called ‘sqft_living_clipped’ with two values: False(Denoting the value is below the threshold) and True(Denoting the value is above the threshold).</p>
  </li>
  <li>
    <p><strong>STEP 2</strong>: Select <strong>Data Transformation</strong> -&gt; <strong>Manipulation</strong> option from left panel and drag the <strong>Edit Metadata</strong> module in your experiment canvas. Adjust the values in the right panel as shown in the figure below.</p>

    <p><img src="/images/Azure/edit_metadata.png" alt="edit_metadata" /></p>

    <p>This will convert your ‘sqft_living_clipped’ column from ‘Boolean’ type to ‘String’ type so that you can query the column to remove some rows.</p>
  </li>
  <li>
    <p><strong>STEP 3</strong>: Select <strong>Data Transformation</strong> -&gt; <strong>Sample and Split</strong> option from left panel and drag the <strong>Split Data</strong> module in your experiment canvas. Adjust the values in the right panel as shown in the figure below.</p>

    <p><img src="/images/Azure/split_data.png" alt="split_data" /></p>

    <p>This will split your data according to the regular expression written. The regular expression above selects all rows whose value in the column with index #2 is equal to <strong>‘False’</strong>.</p>

    <p>The result of these three steps is that we have removed the outliers from the ‘sqft_living’ column, as the figure below shows. The data is now closer to a normal distribution.</p>

    <p><img src="/images/Azure/before_after.png" alt="before_after" /></p>
  </li>
</ul>
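<p>The three steps above (flag, filter, drop the helper column) can be sketched in plain Python as follows. This is an approximation under stated assumptions: it uses the nearest-rank definition of a percentile, which may differ from what the <strong>Clip Values</strong> module computes, <em>percent</em> must be an integer, and the function names are hypothetical.</p>

```python
def percentile_threshold(values, percent=99):
    """Nearest-rank percentile, using integer arithmetic to avoid rounding surprises."""
    ordered = sorted(values)
    # Ceiling of percent * n / 100, computed without floating point.
    rank = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[rank - 1]

def remove_upper_outliers(rows, column="sqft_living", percent=99):
    """Flag rows above the threshold, keep the unflagged ones, drop the flag."""
    threshold = percentile_threshold([row[column] for row in rows], percent)
    # Step 1: add a boolean helper column, like 'sqft_living_clipped'.
    flagged = [dict(row, clipped=row[column] > threshold) for row in rows]
    # Steps 2 and 3: keep rows where the flag is False, then remove the helper.
    return [
        {key: value for key, value in row.items() if key != "clipped"}
        for row in flagged
        if not row["clipped"]
    ]

rows = [{"sqft_living": v} for v in range(1, 201)]  # made-up values 1..200
kept = remove_upper_outliers(rows)
print(len(kept), max(row["sqft_living"] for row in kept))
```

<p>Only the upper tail is trimmed here, matching the goal stated above; values at or below the threshold are left untouched.</p>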

<h3 id="3-data-transformation">3. Data Transformation</h3>

<p>Data transformation includes normalization and aggregation. Normalization involves converting all of your column values to the same scale. This is done when there is a significant difference between the ranges of your feature column values. The reason to normalize is that you don’t want one feature column to dominate the prediction of the target variable simply because its values are larger. Normalization essentially puts all of your feature columns on an equal footing.</p>
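<p>As an illustration, min-max scaling is one common way to bring a column into the range 0 to 1. The article does not say which normalization method to use, so treat the choice of min-max and the function name below as assumptions.</p>

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]

print(min_max_scale([1, 3, 5]))
```

<p>After scaling, a column measured in thousands of square feet and a column counting rooms both span the same range.</p>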

<p>Data transformation also includes adjusting column values. As discussed in the “Analyze” section, we have to adjust the values of the ‘yr_renovated’ feature column. To achieve this, let’s perform the following steps:</p>

<ul>
  <li>
    <p><strong>STEP 1</strong>: Select <strong>Statistical Functions</strong> option from left panel and drag the <strong>Apply Math Operation</strong> module in your experiment canvas. Adjust the values in the right panel as shown in the figure below.</p>

    <p><img src="/images/Azure/math_function.png" alt="math_function" /></p>

    <p>This module compares the selected feature column (in this case the “yr_renovated” feature column) with a constant and replaces the feature column values with either True (if the value matches the constant) or False (if the value doesn’t match the constant). Here, under the <em>Output Mode</em> drop down menu we select the <strong>Inplace</strong> option. This changes the selected feature column values in place.</p>
  </li>
  <li>
    <p><strong>STEP 2</strong>: Run the experiment and right click the <em>Apply Math Operation</em> module to view the change in the ‘yr_renovated’ feature column. You can see that the data type of the ‘yr_renovated’ feature column has now changed to ‘boolean’.</p>

    <p><img src="/images/Azure/yr_renowated.png" alt="yr_renowated" /></p>
  </li>
</ul>
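<p>The same adjustment can be sketched in plain Python. The version below follows the 0/1 encoding described in the “Analyze” section (0 for never renovated, 1 for renovated); the function name is hypothetical and the years are made-up sample values.</p>

```python
def was_renovated(yr_renovated):
    """Map each renovation year to 0 (never renovated) or 1 (renovated)."""
    return [0 if year == 0 else 1 for year in yr_renovated]

print(was_renovated([0, 1991, 0, 2014]))
```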

<h3 id="4-data-reduction">4. Data Reduction</h3>

<p>Data reduction means reducing the volume of data while producing the same or similar analytical results. In machine learning it is often associated with ‘Dimensionality Reduction’, where advanced techniques like “PCA” are used. Here, we shall reduce the data by removing unrelated and unnecessary columns.</p>

<p><strong>i) Removing unimportant feature columns:</strong>
  Some columns are not useful and should be removed manually. These columns have no relation to the target label. Sometimes, this information can be obtained from a domain expert who knows the meaning of the dataset in detail. In our case the unimportant feature columns are ‘sqft_living_clipped’ and ‘id’.</p>

<ul>
  <li>
    <p><strong>STEP 1:</strong> Select <strong>Data Transformation</strong> -&gt; <strong>Manipulation</strong> option from left panel and drag the <strong>Select Columns in Dataset</strong> module in your experiment canvas. Then, select the <strong>Select Columns in Dataset</strong> module and in the right side panel click the <em>Launch column selector</em> to exclude two feature columns: ‘sqft_living_clipped’ and ‘id’.</p>

    <p><img src="/images/Azure/exclude.png" alt="exclude" /></p>
  </li>
</ul>

<p><strong>ii) Removing unrelated feature columns:</strong>
  Not all the feature columns are useful in predicting the target variable. Thus, we first need to find the correlation of each feature with the target column. Fortunately, MAML provides various modules under the <strong>Feature Selection</strong> option. Out of the three options available here we shall select the <strong>Filter Based Feature Selection</strong> module.</p>

<p><img src="/images/Azure/feature_selection.png" alt="feature_column_selection" /></p>

<ul>
  <li>
    <p><strong>STEP 1</strong>: Drag the <em>Filter Based Feature Selection</em> module onto your experiment canvas. Connect it to the output of the preceding module.</p>

    <p><img src="/images/Azure/dataset_drag_filter.png" alt="dataset_drag_filter" /></p>
  </li>
  <li>
    <p><strong>STEP 2</strong>: Select the <em>Filter Based Feature Selection</em> module and edit the right panel as shown below.</p>

    <p><img src="/images/Azure/pearson.png" alt="Pearson" /></p>
  </li>
  <li>
    <p><strong>STEP 3</strong>: Run the experiment and right click the <em>Filter Based Feature Selection</em> module to visualize the correlations of the features with the target column. There are two circles (output ports) attached to the <em>Filter Based Feature Selection</em> module. The first circle outputs the 14 most correlated features of your dataset along with the target label column.</p>

    <p><img src="/images/Azure/correlations.png" alt="correlations" /></p>

    <p>Also, clicking the second circle will show you the correlation values between the feature columns and target label.</p>

    <p><img src="/images/Azure/correlation_value.png" alt="correlation_value" /></p>
  </li>
</ul>
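<p>To make the ranking concrete, here is a minimal sketch of filter-based selection, assuming Pearson correlation is the scoring method (as the screenshot suggests). The function names and the toy columns are hypothetical; the Studio module may handle ties and constant columns differently.</p>

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant column as uncorrelated
    return cov / sqrt(var_x * var_y)

def top_features(features, label, k):
    """Return the k feature names with the largest absolute correlation to label."""
    scores = {name: abs(pearson(column, label)) for name, column in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Made-up toy columns; 'price' stands in for the target label.
features = {"a": [1, 2, 3, 4], "b": [1, 0, 1, 0], "c": [4, 3, 2, 1]}
price = [10, 20, 30, 40]
print(top_features(features, price, k=2))
```

<p>The absolute value is used because a strongly negative correlation is just as informative for prediction as a strongly positive one.</p>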

<hr />

<h1 id="using-a-machine-learning-model-to-train-the-dataset">Using a machine learning model to train the dataset</h1>

<p>In this exercise we shall explore how to train a <strong>Linear Regression</strong> model in Microsoft Azure Machine Learning Studio.</p>

<ul>
  <li>
    <p><strong>STEP 1:</strong> Split your data-set into a training part and a test part.</p>

    <p><img src="/images/Azure/test_train_split.png" alt="test_train_split" /></p>

    <p>The usual rule of thumb for a test-train split is to keep one third of the data for testing and two thirds for training. You are free to choose the ratio of the split, but always remember to have more training data than testing data. Here, we are going to use 70% of the examples for training.</p>

    <p>Select <strong>Data Transformation</strong> -&gt; <strong>Sample and Split</strong> option from left panel and drag the <strong>Split Data</strong> module in your experiment canvas. Adjust the values in the right panel as shown in the figure below.</p>

    <p><img src="/images/Azure/split_rows.png" alt="split_rows" /></p>

    <p>This shall split your data-set into training part(represented by circle #1) and test part(represented in circle #2). We will use the training data and send it to train a <strong>Linear Regression</strong> model.</p>
  </li>
  <li>
    <p><strong>STEP 2:</strong> Select your machine learning model.</p>

    <p>Here, we know that we are dealing with a regression problem. So, let’s select <strong>Machine Learning</strong> -&gt; <strong>Initialize Model</strong> -&gt; <strong>Regression</strong> and drag the <strong>Linear Regression</strong> module in your experiment canvas. Adjust the values in the right panel as shown in the figure below.</p>

    <p><img src="/images/Azure/linear_regression.png" alt="linear_regression" /></p>

    <p>Our <strong>Linear Regression</strong> model uses the <strong>Ordinary Least Squares</strong> method to fit a line (a hyperplane when there are several features) that best approximates the data points in the training dataset. The Ordinary Least Squares (OLS) method works by minimizing the sum of the squares of the differences between the observed values and the predicted values.</p>
  </li>
  <li>
    <p><strong>STEP 3:</strong> Train your machine learning model.</p>

    <p>Select <strong>Machine Learning</strong> -&gt; <strong>Train</strong> and drag the <strong>Train Model</strong> module in your experiment canvas. In the right panel select <strong>price</strong> column which is our target column.</p>

    <p><img src="/images/Azure/train_model.png" alt="train_model" /></p>

    <p>Connect the <strong>training part</strong> of your dataset and <strong>Linear Regression</strong> module to this <strong>Train Model</strong> module.</p>
  </li>
</ul>
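<p>The split-then-fit flow above can be sketched in plain Python. To stay short, this sketch fits a single feature using the closed-form OLS solution, whereas the Studio model uses all selected features; the function names and the toy data are hypothetical.</p>

```python
def split_rows(rows, train_fraction=0.7):
    """Split rows in order: the first part for training, the rest for testing."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def fit_ols(xs, ys):
    """Closed-form OLS for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(xs, slope, intercept):
    """Apply the fitted line to new feature values."""
    return [slope * x + intercept for x in xs]

# Made-up data that lies exactly on the line y = 2x + 1.
rows = [(x, 2 * x + 1) for x in range(10)]
train, test = split_rows(rows)
slope, intercept = fit_ols([x for x, _ in train], [y for _, y in train])
print(slope, intercept)
print(predict([x for x, _ in test], slope, intercept))
```

<p>The model only ever sees the training rows; the test rows are held back so that the predictions on them give an honest picture of how the model generalizes.</p>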

<hr />

<h1 id="scoring-the-dataset">Scoring the dataset</h1>

<p>In this exercise we shall learn how to score the test data points after training the machine learning model. Scoring the dataset means taking the test data and predicting the target variable for it, based on the rules learnt from the training data.</p>

<ul>
  <li>
    <p><strong>STEP 1:</strong> Select <strong>Machine Learning</strong>-&gt;<strong>Score</strong> from the left panel and drag the <strong>Score Model</strong> module into your experiment canvas.</p>

    <p><img src="/images/Azure/score_model.png" alt="score_model" /></p>
  </li>
  <li>
    <p><strong>STEP 2:</strong> Select the <strong>Score Model</strong> module and click to visualize the newly added <em>Scored Labels</em> column. As shown in the figure below, we can see the differences between the actual values (the ‘price’ column) and the predicted values (the ‘Scored Labels’ column).</p>

    <p><img src="/images/Azure/scored_values.png" alt="scored_values" /></p>
  </li>
</ul>

<p>We can see here that the predicted values are close to the actual values. But just by looking, we cannot evaluate how well our linear regression model did. MAML provides modules for evaluating machine learning models.</p>

<hr />

<h1 id="performing-cross-validation">Performing Cross-Validation</h1>

<p>As we saw above, we used the top 70% of the dataset to train and the remaining bottom 30% for testing. This is one way of splitting. You can even take the bottom 70% of your dataset to train and the top 30% to test. There are several such possible combinations. A good machine learning model should give you good accuracy on all of them. The cross-validation technique creates several such combinations and evaluates your model’s accuracy on each one.</p>

<p>Cross-validation is an important technique used in machine learning to assess both the variability of a dataset and the reliability of any model trained using that data. Here, in MAML, the <strong>Cross-Validate Model</strong> module defaults to 10 folds.</p>

<p>To perform cross validation of your dataset, do the following steps:</p>

<ul>
  <li>
    <p><strong>STEP 1:</strong> Select <strong>Machine Learning</strong> -&gt; <strong>Evaluate</strong> and drag the <strong>Cross-Validate Model</strong> module in your experiment canvas. In the right panel select <strong>price</strong> column which is our target column.</p>

    <p><img src="/images/Azure/cross_validate.png" alt="cross_validate" /></p>

    <p>As seen from the above image, one input to this <strong>Cross-Validate Model</strong> module is the <strong>Linear Regression</strong> module, which is the type of machine learning model we want to train, and the other input is the <strong>Filter Based Feature Selection</strong> module, which provides the entire dataset that we want the model to learn from.</p>
  </li>
</ul>
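<p>The idea of folds can be sketched as a small generator that produces the train and test row indices for each combination. How the <strong>Cross-Validate Model</strong> module actually assigns rows to folds is not covered here, so the contiguous folds and the function name below are assumptions.</p>

```python
def k_fold_indices(n_rows, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    base, extra = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        # The first 'extra' folds get one additional row each.
        stop = start + base + (1 if extra > fold else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_rows))
        yield train, test
        start = stop

for train, test in k_fold_indices(10, k=5):
    print(test)
```

<p>Every row lands in the test part exactly once, so the model is trained and evaluated k times and each score comes from rows the model did not see during that round.</p>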

<hr />

<h1 id="evaluating-the-results">Evaluating the results</h1>

<p>In this exercise we shall learn how to evaluate the machine learning model’s results.</p>

<ul>
  <li>
    <p><strong>STEP 1:</strong> Select <strong>Machine Learning</strong>-&gt;<strong>Evaluate</strong> from the left panel and drag the <strong>Evaluate Model</strong> module into your experiment canvas.</p>

    <p><img src="/images/Azure/evaluate_model.png" alt="evaluate_model" /></p>
  </li>
  <li>
    <p><strong>STEP 2:</strong> Select the <strong>Evaluate Model</strong> module and click to visualize the evaluation results.</p>

    <p><img src="/images/Azure/evaluation_results.png" alt="evaluation_results" /></p>
  </li>
</ul>

<p>As shown above, our linear regression model has an R-squared (Coefficient of Determination) value of <strong>0.83</strong>. This means 83% of the variation in price can be explained by the variation in the feature columns (predictors). This is a high value in the context of house price prediction. The value of R-squared that counts as good depends on the problem; for house price prediction, any R-squared value above 0.75 is really good.</p>
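<p>For reference, R-squared can be computed from the actual and predicted values as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses made-up numbers and a hypothetical function name; it is not the Studio’s implementation.</p>

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Made-up values: two predictions are off by 1, the other two are exact.
print(r_squared([3, 5, 7, 9], [2, 5, 7, 10]))
```

<p>A model that always predicts the mean of the actual values scores 0, and a model that predicts every value exactly scores 1.</p>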

<p>To understand the meaning of the other metrics, visit <a href="http://jugalgandhi.me/articles/understanding-linear-regression-evaluation-metrics/">here</a>.</p>

<p>The final picture of our experiment canvas should look like the one below:</p>

<p><img src="/images/Azure/entire_canvas.png" alt="entire_canvas" /></p>

<hr />

<h1 id="deploying-it-as-a-web-service">Deploying it as a web-service</h1>

<hr />

<h1 id="summary">Summary</h1>]]></content><author><name>Jugal Gandhi</name><email>jugalg05@gmail.com</email></author><category term="blog" /><summary type="html"><![CDATA[Microsoft Azure Machine Learning Studio is an easy drag and drop cloud tool that enables you to perform machine learning at scale. It allows you to upload your own custom data-set and perform supervised or unsupervised machine learning tasks on it.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jugalgandhi.me/%7B%22feature%22=%3E%22azure_poster.jpg%22%7D" /><media:content medium="image" url="https://jugalgandhi.me/%7B%22feature%22=%3E%22azure_poster.jpg%22%7D" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>