Data Science is the practice of:
- Asking questions (formulating hypotheses) whose answers solve known problems or uncover previously unknown solutions, which in turn drive business value,
- Defining the data needed, or working with an existing data set, and employing computer-science-based tools to collect, store and explore that data, generally in large volume and variety (often more than 1 TB and thousands of dimensions),
- Identifying the type of analysis needed to get to the answers and performing it by implementing various statistics-based algorithms and tools, often on a distributed, parallel architecture,
- Communicating the insights gathered from the analysis as simple stories, visualizations or dashboards (the Data Product) that a non-data scientist can understand and build a conversation around. (Keep in mind that a product can also be a piece of code that is internal to a company and used by various departments. The presentation, maintenance, scalability, etc. of the code are then the product features, a view that many organizations do not put into practice.)
- Building a higher-level abstraction that performs steps 2 to 4 autonomously, analyzing and acting on new data as it is fed to the system.
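
To make the last point a bit more concrete, here is a minimal sketch in Python of what "automating steps 2 to 4" could look like. Everything in it is an assumption for illustration: the function names (`collect`, `analyze`, `communicate`, `pipeline`), the toy batches, and the threshold of 10.0 are all hypothetical, and a real system would use proper storage, real statistical models and a dashboard instead of a printed sentence.

```python
from statistics import mean

def collect(batch):
    """Step 2 (hypothetical): keep only usable records by dropping missing values."""
    return [x for x in batch if x is not None]

def analyze(values):
    """Step 3 (hypothetical): a stand-in analysis, the mean of the batch."""
    return mean(values) if values else None

def communicate(result, threshold):
    """Step 4 (hypothetical): turn the number into a plain-language message."""
    if result is None:
        return "no usable data"
    status = "above" if result > threshold else "at or below"
    return f"average {result:.1f} is {status} target {threshold}"

def pipeline(batch, threshold=10.0):
    """Last step: chain steps 2 to 4 so every new batch is handled the same way."""
    return communicate(analyze(collect(batch)), threshold)

# Each new batch fed to the system goes through the same steps automatically.
for batch in ([8, None, 12, 16], [3, 4, None], []):
    print(pipeline(batch))
```

The point is only the shape: once each step is a function, the "higher-level abstraction" is the thing that wires them together and runs them on whatever data arrives next.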
Hope this helps!!
Thank you!