SigmaWay Blog

SigmaWay Blog aggregates original and third-party content for the site's users. It covers Process Improvement, Lean Six Sigma, Analytics, Market Intelligence, Training, IT Services, and the industries that SigmaWay serves.

Garbage In is Garbage Out in Data Sciences!

Whether you are a data analyst at a firm or a developer training a machine learning model, you work with data. More to the point, you need data: it is the foundation on which everything else is built. Decisions and results depend on the output you get from that data. Data quality therefore matters, and like everything else, data follows the principle of Garbage In, Garbage Out.

Many people make mistakes when adding data to their datasets in the hope of getting better results.

Instead, they end up with a messy dataset and a greater risk of damaging their product.

The 6 most common mistakes are: Not Enough Data, Low Quality Classes, Low Quality Data, Unbalanced Classes, Unbalanced Data, No Validation or Testing.

These mistakes can be fixed, and fixing them helps produce good results.

Just remember that your dataset is as important as the model you are working on. Without a balanced dataset, getting a polished final product is next to impossible.
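As a rough illustration of how a few of these mistakes (not enough data, unbalanced classes, no validation or testing) might be caught early, here is a minimal Python sketch. The function names audit_labels and holdout_split and the thresholds min_per_class and max_ratio are assumptions made for this example. They are not taken from the linked article and should be tuned to the problem at hand.

```python
import random
from collections import Counter


def audit_labels(labels, min_per_class=50, max_ratio=3.0):
    """Return warnings about class size and class balance.

    min_per_class and max_ratio are illustrative thresholds, not standards.
    """
    counts = Counter(labels)
    warnings = []
    # Flag classes that have too few examples to learn from.
    for cls, n in sorted(counts.items()):
        if n < min_per_class:
            warnings.append(f"class {cls!r} has only {n} examples")
    # Flag a large gap between the biggest and smallest class.
    if counts and max(counts.values()) > max_ratio * min(counts.values()):
        warnings.append("classes are unbalanced")
    return warnings


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and set aside a test portion that is never trained on."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


if __name__ == "__main__":
    labels = ["cat"] * 100 + ["dog"] * 10
    for message in audit_labels(labels):
        print(message)
    train, test = holdout_split(range(10))
    print(len(train), "training rows,", len(test), "test rows")
```

A check like this only covers the mechanical side of the problem. Low-quality classes and low-quality data still need a human to look at the examples.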

To learn how to fix these mistakes, visit: https://hackernoon.com/stop-feeding-garbage-to-your-model-the-6-biggest-mistakes-with-datasets-and-how-to-avoid-them-3cb7532ad3b7


What Does 2016 Hold for Machine Learning?

The evolution of Machine Learning (ML) is shaped by how the tech giants approach it. Open source platforms and data sources also have an important impact on ML models. The tech giants have recognized the importance of ML, and it is becoming the new normal for them. They are now focusing on providing ML models as a service, built for general use and not just for data scientists. Most of the software used for ML is open source, which affects the market for other software makers. Tools like Apache Spark are expected to dominate the market. Read more at: http://www.infoworld.com/article/3017251/data-science/what-machine-learning-will-gain-in-2016.html
