Database

Datacast Episode 116: Distributed Databases, Open-Source Standards, and Streaming Data Lakehouse with Vinoth Chandar

Vinoth Chandar is the creator and PMC chair of the Apache Hudi project, a seasoned distributed systems/database engineer, and a dedicated entrepreneur. He has deep experience with databases, distributed systems, and data systems at planet scale, strengthened through his work at Oracle, LinkedIn, Uber, and Confluent.

During his time at Uber, he created Hudi, which pioneered transactional data lakes as we know them today, to solve unique speed and scale needs for Uber’s massive data platform. Most recently, Vinoth founded Onehouse - a cloud-native managed lakehouse to make data lakes easier, faster, and cheaper.

Datacast Episode 112: Distributed Systems Research, The Philosophy of Computational Complexity, and Modern Streaming Database with Arjun Narayan

Arjun Narayan is the co-founder and CEO of Materialize. Materialize is a streaming database for real-time applications and analytics, built on top of a next-generation stream processor, Timely Dataflow. He was previously an engineer at Cockroach Labs and holds a Ph.D. in Computer Science from the University of Pennsylvania.

Datacast Episode 84: Business Development and Customer Success for Emerging Technologies with Taimur Rashid

As Chief Business Development Officer, Taimur is responsible for developing emerging businesses at Redis and leading strategic business & corporate development. He is currently leading initiatives related to AI/ML.

Prior to Redis, Taimur led Worldwide Customer Success for Microsoft's Azure Data & AI. He jointly led the design, implementation, and landing of one of Microsoft's largest field transformations, which combined customer success, support engineering, and technical account management.

Before Microsoft, Taimur was the Managing Director for Amazon Web Services (AWS) Platform Technology and Applications, where he led business development from 2008 (near its inception) to 2018, when the business reached $25B in ARR. Taimur helped forge key partnerships and land key customers, including Airbnb, Capital One, Dropbox, Liberty Mutual, NASA JPL, Nasdaq, Netflix, Nintendo, Intuit, SAP, and Samsung.

Taimur grew up in three countries and lived in five states. Bellevue, WA is home for him, where he lives with his wife and three boys. Taimur enjoys cross-training, hiking, and biking. He is an avid reader of technology, business, and history. He enjoys art, music, coffee, and cooking on the weekends for his family.

Datacast Episode 62: Leading Organizations Through Analytics Transformations with Gordon Wong

As a data modeling fanatic, data warehouse architect, multi-hypergrowth startup veteran, and team builder, Gordon has built his career on helping people get their business questions answered. Over time, he's switched his focus from pure technology to complete solutions where people, process, and technology all play a role. At Fitbit, he established the data warehousing team and, as an early customer of Snowflake, used it to fuel petabyte-scale analytics. Later on, at both ezCater and HubSpot, he rebuilt the data warehousing teams to focus on enabling analysts, not loading more data. A constant focus on the customer and their problems has led him to realize that empathy is the most important trait a leader can have.

Datacast Episode 58: Deep Learning Meets Distributed Systems with Jim Dowling

Jim Dowling is the CEO of Logical Clocks AB, an Associate Professor at KTH Royal Institute of Technology, and a Senior Researcher at SICS RISE in Stockholm. His research concentrates on building systems support for machine learning at scale. He is the lead architect of Hops Hadoop, the world's fastest and most scalable Hadoop distribution and the only Hadoop platform with support for GPUs as a resource. He is also a regular speaker at Big Data and AI industry conferences.

Datacast Episode 52: Graph Databases In Action with Dave Bechberger

Dave Bechberger is known for his expertise in distributed data architecture and as a graph database SME. He takes a pragmatic approach to data architectures and has implemented large-scale distributed data architectures for big data analysis and data science workflows using various SQL and NoSQL data technologies. He is the author of "Graph Databases in Action" from Manning Publications and has spoken both nationally and internationally at conferences on subjects related to distributed data and graph databases.

Dave spent 20+ years developing, managing, and consulting on software projects and is currently a member of the Amazon Neptune service team. He works with both customers and engineering teams to simplify and speed the adoption of graph technologies.

An Introduction to Big Data: Distributed Data Processing

This semester, I’m taking a graduate course called Introduction to Big Data. It provides a broad introduction to the exploration and management of large datasets being generated and used in the modern world. In an effort to open-source this knowledge to the wider data science community, I will recap the material I learn in the class on Medium. Having a solid understanding of the basic concepts, policies, and mechanisms for big data exploration and data mining is crucial if you want to build end-to-end data science projects.

An Introduction to Big Data: Decision Trees

This semester, I’m taking a graduate course called Introduction to Big Data. It provides a broad introduction to the exploration and management of large datasets being generated and used in the modern world. In an effort to open-source this knowledge to the wider data science community, I will recap the material I learn in the class on Medium. Having a solid understanding of the basic concepts, policies, and mechanisms for big data exploration and data mining is crucial if you want to build end-to-end data science projects.

An Introduction to Big Data: Clustering

This semester, I’m taking a graduate course called Introduction to Big Data. It provides a broad introduction to the exploration and management of large datasets being generated and used in the modern world. In an effort to open-source this knowledge to the wider data science community, I will recap the material I learn in the class on Medium. Having a solid understanding of the basic concepts, policies, and mechanisms for big data exploration and data mining is crucial if you want to build end-to-end data science projects.