This post is going to cover some very basic concepts in machine learning, from linear algebra to evaluation metrics. It serves as a nice guide to newbies looking to enter the field.
Christopher Peters is a full-stack data scientist at Zapier. He was both Zapier and Treehouse's first data scientist. Prior to his work as a data scientist, he was a research associate at LSU’s Center for Energy Studies where he was an energy economist. He has a real passion for working with, sharing, visualizing and analyzing data of all kinds using statistical, visual and machine learning techniques.
At the moment, data scientists are getting a lot of attention, and as a result, books about data science are proliferating. While searching for good books about the space, it seems to me that the majority of them focus more on the tools and techniques rather than the nuanced problem-solving nature of the data science process. That is until I encountered Brian Godsey’s “Think Like a Data Scientist.”
Data mining is the process where one structures the raw data and formulate or recognize the various patterns in the data through the mathematical and computational algorithms. This helps to generate new information and unlock various insights. In this article, I want to share the 10 mining techniques that I believe any data scientists should learn to be more effective while handling big datasets.
Computer Vision is one of the hottest research fields within Deep Learning at the moment. It sits at the intersection of many academic subjects, such as Computer Science (Graphics, Algorithms, Theory, Systems, Architecture), Mathematics (Information Retrieval, Machine Learning), Engineering (Robotics, Speech, NLP, Image Processing), Physics (Optics), Biology (Neuroscience), and Psychology (Cognitive Science).
Data Scientists at Work displays how some of the world’s top data scientists work across a dizzyingly wide variety of industries and applications — each leveraging her own blend of domain expertise, statistics, and computer science to create tremendous value and impact.
It is no doubt that the sub-field of machine learning / artificial intelligence has increasingly gained more popularity in the past couple of years. As Big Data is the hottest trend in the tech industry at the moment, machine learning is incredibly powerful to make predictions or calculated suggestions based on large amounts of data.
Data scientists live at the intersection of coding, statistics, and critical thinking. As Josh Wills put it, “data scientist is a person who is better at statistics than any programmer and better at programming than any statistician.” I personally know too many software engineers looking to transition into data scientist and blindly utilizing machine learning frameworks such as TensorFlow or Apache Spark to their data without a thorough understanding of statistical theories behind them. So comes the study of statistical learning, a theoretical framework for machine learning drawing from the fields of statistics and functional analysis.