Who is a data scientist?

A data scientist analyzes data using statistical methods, mathematical models, data visualization techniques, and machine learning algorithms to extract knowledge and insights from both structured and unstructured data. They often work with big data.

Data scientists need domain knowledge and business understanding. They also need familiarity with analytical and modeling techniques, along with data management and machine learning skills. They address a variety of business problems and deliver recommendations that support better business decisions.

What does a data scientist do?

Using their domain knowledge and business understanding, data scientists translate business problems into analysis objectives. They use their analytical and modeling skills to detect patterns and discover relationships, and their data management skills to prepare the datasets required for analysis.

Data scientists usually follow a 5-step process:

  1. Frame the problem: Translate the business problem into a well-defined analysis objective. For example, why are customers no longer subscribing to our service, or where should we launch the new service?
  2. Collect the raw data: Search for and collect the data, acquire it from external sources if required, and get it ready for preparation.
  3. Prepare the data: Clean the data, discard wrong or redundant data, and focus only on the relevant data that can bring value.
  4. Analyze the data: Explore the data, detect patterns, build models, apply machine learning algorithms, test, and refine to generate valuable predictions and actionable insights.
  5. Communicate the results: Present the results in the context of the original business problem, with clear reasoning and identified actions.
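As a rough illustration of steps 3 to 5, here is a minimal Python sketch built around the customer churn question from step 1. The records, the field layout, and the function names (prepare, analyze, communicate) are all invented for this example; real projects typically involve far larger datasets and dedicated libraries.

```python
from statistics import mean

# Hypothetical raw records: (customer_id, monthly_support_tickets, churned).
# The values are made up purely for illustration.
RAW_RECORDS = [
    ("c1", 0, False),
    ("c2", 5, True),
    ("c2", 5, True),      # redundant duplicate row
    ("c3", None, False),  # missing value
    ("c4", 1, False),
    ("c5", 4, True),
    ("c6", 6, True),
    ("c7", 2, False),
]


def prepare(records):
    """Step 3: drop rows with missing values and repeated customer ids."""
    seen = set()
    clean = []
    for customer_id, tickets, churned in records:
        if tickets is None or customer_id in seen:
            continue
        seen.add(customer_id)
        clean.append((customer_id, tickets, churned))
    return clean


def analyze(clean):
    """Step 4: compare average ticket counts for churned vs. retained customers."""
    churned = [tickets for _, tickets, left in clean if left]
    retained = [tickets for _, tickets, left in clean if not left]
    return {
        "churned_avg_tickets": mean(churned),
        "retained_avg_tickets": mean(retained),
    }


def communicate(summary):
    """Step 5: phrase the finding in terms of the original business question."""
    return (
        f"Churned customers filed {summary['churned_avg_tickets']:.1f} support "
        f"tickets per month on average, versus "
        f"{summary['retained_avg_tickets']:.1f} for retained customers."
    )


if __name__ == "__main__":
    print(communicate(analyze(prepare(RAW_RECORDS))))
```

A gap between the two averages only suggests a pattern worth investigating; it does not by itself show that support issues cause customers to leave.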


The biggest challenge data scientists face is locating reliable, relevant data to prepare the datasets required for analysis.

Where do data scientists work?

Data scientists typically work in functions such as marketing and strategic planning, across industries including financial services, IT, retail, hospitality, entertainment, healthcare, logistics, and security. In each setting, they extract insights from data to identify patterns and deliver predictions.