Opinions about big data differ. Some say it is just a phase the tech world is going through; others say it is here for the long term. Which view proves right is a matter for the future and is out of anyone's control. What one can say today without any doubt is that data science is a sought-after field of study.


Business data warehouses hold a great deal of raw data, and someone needs to sort through it and understand it so that it can serve the strategic needs of the business. The entire journey of converting piles of raw data into usable information is data science.

Everyone is aware of smartwatches, and what an invention they are. A smartwatch can tell us our heart rate, how many calories we are burning, how healthy we are, and how many more steps we need to take to complete the daily count. But how can it tell us all this just by being strapped to our wrists? It is a neat application of data science. The watch gathers data such as heart rate and body temperature, uses sensors to detect movement, and then processes this data into meaningful insights about our health.
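To make the idea concrete, here is a minimal sketch in Python of how raw readings might be turned into a small health summary. It is only an illustration: the function name `summarize_readings`, the sample readings, and the 10,000-step goal are all assumptions made up for this example, not how any real smartwatch works.

```python
from statistics import mean


def summarize_readings(heart_rates_bpm, steps_so_far, daily_step_goal=10_000):
    """Turn raw sensor readings into a small, readable health summary.

    All names and the default step goal are hypothetical, chosen for illustration.
    """
    return {
        "avg_heart_rate_bpm": mean(heart_rates_bpm),
        "max_heart_rate_bpm": max(heart_rates_bpm),
        # Never report a negative number once the goal has been reached.
        "steps_remaining": max(daily_step_goal - steps_so_far, 0),
    }


# Made-up readings standing in for what the sensors would collect.
summary = summarize_readings([72, 75, 90, 110, 85], steps_so_far=6500)
print(summary)
```

The raw numbers mean little on their own; the summary is the "insight" the wearer actually sees.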

Today, every business needs data science to solve problems, anticipate what lies ahead, and create structured plans for it. In the past, businesses only analyzed historical data; now the goal is to forecast the future.


Data science follows an entire workflow: a step-by-step procedure for extracting substance from raw information.

  1. Data accumulation: usually done through database management (SQL) and by retrieving semi-structured data, then storing and processing it by category using tools such as Hadoop and Apache Flink.
  2. Data cleaning: removing inconsistencies and anomalies using tools such as Python, R, SAS, and Hadoop.
  3. Data analysis: understanding the data and finding useful patterns and details that can solve a particular problem, using Python and R libraries, statistical modeling, experimental design, and so on.
  4. Data modeling: feeding in various objectives and cases and using machine learning to arrive at an algorithm that fits the business need.
  5. Data interpretation: helping non-technical people understand what you have discovered so that they can gain insight from it, using data visualization tools and, most importantly, communication and presentation skills.
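The five steps above can be sketched end to end in a few lines of Python. This is a toy illustration under stated assumptions, not a production pipeline: the records (advertising spend versus sales) are invented, a plain list stands in for a real database, and a simple least-squares line stands in for the machine learning step.

```python
from statistics import mean

# 1. Accumulation: hypothetical records standing in for rows pulled from a database.
raw_records = [
    {"spend": 1, "sales": 3},
    {"spend": 2, "sales": 5},
    {"spend": None, "sales": 4},  # an incomplete row
    {"spend": 3, "sales": 7},
    {"spend": 4, "sales": 9},
]

# 2. Cleaning: drop rows with missing values.
clean = [r for r in raw_records if r["spend"] is not None and r["sales"] is not None]

# 3. Analysis: look at basic statistics of the cleaned data.
xs = [r["spend"] for r in clean]
ys = [r["sales"] for r in clean]
x_mean, y_mean = mean(xs), mean(ys)

# 4. Modeling: fit a straight line by ordinary least squares.
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = y_mean - slope * x_mean


def predict_sales(spend):
    """Predict sales for a given spend using the fitted line."""
    return intercept + slope * spend


# 5. Interpretation: state the finding in plain language for non-technical readers.
print(f"Each extra unit of spend is associated with about {slope:.1f} more units of sales.")
print(f"At a spend of 5, the model expects sales of about {predict_sales(5):.1f}.")
```

In practice each step would use the dedicated tools named in the list, but the shape of the work stays the same: collect, clean, explore, model, explain.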


The person who carries out all these stages in the pipeline and extracts a data product from raw data is a data scientist. Becoming a data scientist is not easy, but it is not impossible. With the right training and plenty of hands-on practice, one can meet this new demand in the tech world.

To be a data scientist, one needs to be curious and properly trained. Training is about learning skills in mathematics, technology, and business strategy, along with the various tools and techniques required in the field. But the most important thing is the inquisitiveness to ask the right questions, take up difficult tasks, and make new discoveries along the way.

Source by Shalini M