An Introduction to Big Data: Relational Database

This semester, I’m taking a graduate course called Introduction to Big Data. It provides a broad introduction to the exploration and management of the large datasets being generated and used in the modern world. In an effort to open-source this knowledge to the wider data science community, I will recap the material I learn in the class on Medium. A solid understanding of the basic concepts, policies, and mechanisms for big data exploration and data mining is crucial if you want to build end-to-end data science projects.

Datacast Episode 7: Building Open-Source R Packages with Thomas Lin Pedersen

Thomas is a bioinformatician turned software engineer who enjoys developing tools for data scientists. His main interests are in tools that bring scientists closer to their data, whether through intuitive and powerful APIs or through visualization. He describes himself as a creative spirit who enjoys photography as well as generative art and graphic design, and he tries to combine these with his interest in programming whenever possible. Thomas lives just north of Copenhagen with his wife and two kids.

Demystifying Graph Algorithms

This is a complete guide to graph algorithms that you can rely on when practicing for your next technical interview. No matter how complicated these concepts may seem, repeatedly working through different techniques and problems will make you more competent at solving them.

Datacast Episode 6: Data Science in the Travel Industry with Ewan Nicolson

Ewan Nicolson has been working professionally with numbers and computers for the past 13 years. During the past 6 years at Skyscanner, he has seen the data team grow from 2 analysts based in Edinburgh to a global team of 25+ data scientists making a global impact.

Datacast Episode 5: Applied Statistics in Data Science with Christopher Peters

Christopher Peters is a full-stack data scientist at Zapier. He was the first data scientist at both Zapier and Treehouse. Prior to his work as a data scientist, he was a research associate at LSU’s Center for Energy Studies, where he worked as an energy economist. He has a real passion for working with, sharing, visualizing, and analyzing data of all kinds using statistical, visual, and machine learning techniques.

Greedy Algorithm and Dynamic Programming

In algorithm design, there is no single 'silver bullet' that cures all computational problems. Different problems require different kinds of techniques, and a good programmer chooses among them based on the type of problem. In this blog post, I am going to cover 2 fundamental algorithm design principles: greedy algorithms and dynamic programming.

Divide and Conquer Algorithms

Divide and Conquer is a very popular algorithmic paradigm. A typical Divide and Conquer algorithm solves a problem using the following three steps:

  • Divide: Break the given problem into subproblems of the same type.

  • Conquer: Recursively solve these subproblems.

  • Combine: Appropriately combine the answers.
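
To make the three steps concrete, here is a minimal sketch using merge sort, a standard example of the paradigm. The choice of merge sort and the function name `merge_sort` are assumptions made for illustration; the post itself may use a different example.

```python
def merge_sort(items):
    """Return a new sorted list (illustrative divide-and-conquer sketch)."""
    # Base case: a list of 0 or 1 elements is already sorted.
    if len(items) <= 1:
        return list(items)

    # Divide: split the problem into two subproblems of the same type.
    mid = len(items) // 2

    # Conquer: recursively sort each half.
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Combine: merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these still has elements left; append the remainder.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


if __name__ == "__main__":
    print(merge_sort([5, 2, 9, 1, 5, 6]))
```

Each recursive call handles a list half the size of its parent, so the recursion bottoms out at single-element lists, and all the real work happens in the combine step.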