A Modern Data Stack (MDS) is a stack of technologies that makes a modern data warehouse perform 10–10000x better than a legacy data warehouse. Ultimately, an MDS saves time, money, and effort. The four pillars of an MDS are a data connector, a cloud data warehouse, a data transformer, and a BI & data exploration tool.
Analytics Engineer is a new position coined (and made possible) by dbt. If a Data Engineer (DE) marries a Data Analyst (DA) and they have a baby girl, that baby girl will be an Analytics Engineer (AE). Well, it does not work that way, but you get the point.
Back in the old days, dealing with physical infrastructure is a huge burden, which not only requires teams of experts to manage but also is time-consuming. In the modern cloud computing era, however, you can deploy hundreds of computers instantly to solve your problems with the click of a button. Well, to be realistic, most day to day problems that we are trying to solve won’t require that much computing power.
Kafka was developed and open-sourced by LinkedIn in 2011, and it has since quickly evolved from messaging queue to a full-fledged streaming platform with an exuberant ecosystem. Many tech companies, besides LinkedIn such as Airbnb, Spotify, or Twitter, use Kafka for their mission-critical applications.
Airflow has grown to be an essential part of my toolset as a data professional. After my previous post, many people reached out to me, asking about how to get started learning Airflow. As with many things else, I believe that the best way to get started learning something is by doing.