Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data, and to apply that knowledge across a broad range of application domains. Data science is related to data mining, machine learning, and big data.

Data science is a "concept to unify statistics, data analysis, informatics, and their related methods" in order to "understand and analyze actual phenomena" with data. It uses techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, information science, and domain knowledge. However, data science is different from computer science and information science. Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational, and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.

Technologies and techniques

  • Linear regression
  • Logistic regression
  • Support-vector machine (SVM)
  • Cluster analysis is a technique used to group data so that items in the same group are more similar to each other than to items in other groups.
  • Dimensionality reduction is used to reduce the number of variables in a dataset so that computation on it can be performed more quickly.
  • Machine learning is a technique used to perform tasks by inferring patterns from data.
  • Naive Bayes classifiers classify data by applying Bayes' theorem. They are mainly used on large datasets and can produce accurate results.
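
As an illustration of how a naive Bayes classifier applies Bayes' theorem, the following is a minimal sketch in plain Python rather than a reference implementation. It assumes a bag-of-words model with add-one (Laplace) smoothing, which is one common variant. The function names `train_naive_bayes` and `predict`, and the toy "spam"/"ham" messages, are hypothetical and chosen only for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in zip(samples, labels):
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(model, words):
    """Return the class with the highest posterior score for the given words."""
    class_counts, word_counts, vocabulary = model
    total_samples = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # Bayes' theorem in log space:
        # log P(class) + sum of log P(word | class).
        # The "naive" part is treating the words as independent given the class.
        score = math.log(count / total_samples)
        total_words = sum(word_counts[label].values())
        for word in words:
            # Add-one smoothing (an assumption of this sketch) avoids log(0)
            # for words never seen with this class.
            numerator = word_counts[label][word] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: each sample is a list of words from a message.
samples = [
    ["win", "money", "now"],
    ["free", "money", "offer"],
    ["meeting", "at", "noon"],
    ["lunch", "meeting", "tomorrow"],
]
labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(samples, labels)
print(predict(model, ["free", "money"]))
print(predict(model, ["meeting", "tomorrow"]))
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once the number of features grows. In practice, a library implementation would normally be used instead of hand-written code like this.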

Impact

Big data is quickly becoming a vital tool for businesses of all sizes. As big data continues to have a major impact on the world, so does data science, because the two are closely related.


