Data Ecosystem and Software As A Commodity
In economics, a commodity is an economic good, usually a resource, that has full or substantial fungibility: that is, the market treats instances of the good as equivalent or nearly so with no regard to who produced them.
2021 saw heavy investment in data startups. Poster children such as dbt, Fivetran, Airbyte, and Starburst are all flaunting new financing rounds and jaw-dropping valuations. Amid the frenzy of Silicon Valley unicorn minting, one trend that has emerged is the early shape of a Data Ecosystem.
Figure 1 captures the ecosystem as a pyramid. A few themes stand out.
- The layered offerings in the pyramid each target one step of the enterprise data problem: ingestion, storage, transformation, analytics, management, and discovery.
- As the product offering moves up the pyramid, the problem being targeted is less about the infrastructure, more about the application, and closer to the end user.
- Intuitively, moving up the pyramid also means a smaller TAM (total addressable market), because the problem being targeted is more specialized and the players lower in the pyramid are probably already trying to expand upward. One mitigating factor is that upper-pyramid products are generally more platform agnostic and can move wherever the user wants to move.
Another point worth mentioning is the interplay between Open Source and Commercial Vendors.
- Commercial vendors such as Databricks want to land and expand, so they make onboarding easy, make their newly bundled products easy to adopt, and make offboarding really hard. As a result, these technology integrations tend to be irreversible once a customer decides to onboard with a vendor.
- Competing with the vendor route is the open source data stack, where companies can assemble their data stack in-house like Lego and freely choose which cloud hosts their machines. The downside is the chore of setting up, managing, and maintaining an in-house fleet of machines. This is why Airbyte became wildly popular: its open source offering is easier to use than even most commercial vendor solutions.
- Lastly, the recent Russia-Ukraine War will accelerate the development of a censorship-resistant software stack that resembles the open source stack but is even freer from political interference.
What this means for the future:
- These data technologies will abstract to higher-level problems and move closer to users to stay relevant. They will become even more user friendly and layman friendly. As this trend continues, it is not hard to envision a no-code/low-code Data Platform that can solve practically all data plumbing problems. This is the end state in which the Data Platform becomes a commodity, where “the market treats instances of the good as equivalent or nearly so with no regard to who produced them”.
- For data platform adopters, it means making reversible technical decisions. Why? Because you don’t want your options to be limited to a single infra provider, and you always want a free ride on booming new tech that can solve your current or future problems. Being married to a single technology means doing all future development work the hard way.
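One way to picture a reversible technical decision is a thin interface between pipeline logic and the platform behind it. The sketch below is a minimal illustration, not a recommendation of any particular design: `Warehouse`, `InMemoryWarehouse`, `VendorWarehouse`, and `ingest_signups` are all hypothetical names, and both backends are stand-ins that keep data in memory rather than calling any real product.

```python
from typing import Protocol


class Warehouse(Protocol):
    """Hypothetical interface: the only surface pipeline code depends on."""

    def load(self, table: str, rows: list[dict]) -> None: ...

    def count(self, table: str) -> int: ...


class InMemoryWarehouse:
    """Stand-in for a self-hosted, open source backend (assumption)."""

    def __init__(self) -> None:
        self._tables: dict[str, list[dict]] = {}

    def load(self, table: str, rows: list[dict]) -> None:
        self._tables.setdefault(table, []).extend(rows)

    def count(self, table: str) -> int:
        return len(self._tables.get(table, []))


class VendorWarehouse:
    """Stand-in for a commercial vendor (assumption).

    A real adapter would wrap the vendor's SDK; this one only tracks row counts.
    """

    def __init__(self) -> None:
        self._row_counts: dict[str, int] = {}

    def load(self, table: str, rows: list[dict]) -> None:
        self._row_counts[table] = self._row_counts.get(table, 0) + len(rows)

    def count(self, table: str) -> int:
        return self._row_counts.get(table, 0)


def ingest_signups(warehouse: Warehouse, rows: list[dict]) -> int:
    """Pipeline logic written against the interface, not a specific backend."""
    valid = [row for row in rows if row.get("email")]  # drop rows with no email
    warehouse.load("signups", valid)
    return warehouse.count("signups")


rows = [{"email": "a@example.com"}, {"email": ""}, {"email": "b@example.com"}]

# Swapping the backend is a one-line change; ingest_signups is untouched.
for backend in (InMemoryWarehouse(), VendorWarehouse()):
    print(type(backend).__name__, ingest_signups(backend, rows))
```

The decision stays reversible only as long as pipeline code never reaches past the interface into backend-specific features, which is the trade-off the bullet above describes.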
Taking this mental model to other software verticals, we can see a similar trend.
- Mobile Development: FlutterFlow (a no-code platform for building mobile apps)
- DevOps: Porter.run (a platform to automate DevOps and PaaS management)
- Web Hosting: Replit, Vercel (you write Node.js code, they set up the rest of the web app)
Feel free to comment and nominate other verticals or companies.
So, what does it mean for entrepreneurs?
If you are an entrepreneur wannabe like me, you are probably wondering where and how you can build your company. My thoughts:
- Go where the ecosystem hasn’t yet formed, and build the ecosystem. For example, what does the Software as a Commodity world in web3 look like?
- Leverage the established ecosystem on an unsolved problem. The good news with all of this mature technology is that building a tech solution is so easy right now. The challenge is finding a worthy problem :)
- The main idea of the Data Ecosystem comes from Lei. I tried to take it further with some discussion of the interplay, implications, and extensions.