thedudeabides5 a day ago

AI doesn't replace data modeling; it makes it way more important, useful, and easy to do.

  • A4ET8a8uTh0_v2 a day ago

    Yes. My last bigger project was an eye-opener for me. All of a sudden, I can't even trust the basic info I'm given, because so many people don't even check what they send me. If you understand the underpinnings, you are a lot more useful than a 'prompt engineer' (quotation marks intended).

datadrivenangel a day ago

Nope. Data modeling is inherent to having information systems.

The reason the author found that data modeling is 'dead' is that the Modern Data Stack promised that you could transform your data later, and so many people never got around to that. Long live the data swamp!

  • Cheer2171 a day ago

    Every bucket of data is implicitly or explicitly the result of an act of data modeling, some more intentional than others.

  • drillsteps5 a day ago

    I would say that easy access to previously unthinkable amounts of storage and compute (and obviously the network throughput to tie it together) is thought to make data modeling unnecessary. Normalized/denormalized data models and the Inmon/Kimball architectures were largely dictated by limits on compute and storage that are no longer relevant.

    What gets forgotten is data governance and data quality, which results in, yes, data swamps as far as the eye can see and hordes of "data scientists" roaming around hoping to find actionable "gems".

tietjens a day ago

I’m not sure I follow, though I like the tone. What has data modeling been replaced by?

  • me_bx 13 hours ago

    Not OP, but in a similar boat. My 2 cents:

    Well-thought-out, sophisticated ways of modeling data for analytics, using established approaches, are being replaced by simply pulling data from the sources into cloud data platforms with barely any change to the source structure.

    In the past we used to model layers in a data-warehousing infrastructure, each with a purpose and a data modelling methodology. For instance, an operational data store (ODS) layer integrated data from all the sources in a normalized data structure. Then came a set of datamarts, each containing a subset of the ODS content in a denormalized format and each focused on a specific functional domain.

    We had rules and methods for structuring data to get performant reporting, and a customer orientation.

    Coming from this world, it seems like data governance principles are gone, and it feels like some organisations use the modern data stack the same way analysts would each build their own Excel files in their own corner, without any safeguards.
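To make the layering described above concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table names (`ods_customers`, `ods_orders`, `mart_sales`), columns, and sample rows are all hypothetical; a real warehouse would have many more sources and marts. It only illustrates the idea of a normalized ODS feeding a denormalized, domain-focused datamart.

```python
import sqlite3


def build_sales_mart(conn: sqlite3.Connection) -> None:
    """Derive a denormalized sales datamart from the normalized ODS tables."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS mart_sales;
        CREATE TABLE mart_sales AS
        SELECT o.order_id, c.customer_name, c.region, o.amount
        FROM ods_orders AS o
        JOIN ods_customers AS c ON c.customer_id = o.customer_id;
        """
    )


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    # ODS layer: normalized, one table per entity (hypothetical schema).
    conn.executescript(
        """
        CREATE TABLE ods_customers (
            customer_id INTEGER PRIMARY KEY,
            customer_name TEXT,
            region TEXT
        );
        CREATE TABLE ods_orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES ods_customers(customer_id),
            amount REAL
        );
        INSERT INTO ods_customers VALUES (1, 'Acme', 'EU'), (2, 'Globex', 'US');
        INSERT INTO ods_orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
        """
    )
    build_sales_mart(conn)
    # Reporting queries hit the flat mart and need no joins.
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM mart_sales GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return rows
```

The point of the split is that integration rules live in one place (the ODS), while each mart can be shaped for the reports of one functional domain.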

  • tremon a day ago

    By vibe graphing, probably.

  • icedchai 11 hours ago

    Throwing shit at the wall, mostly. "Here's an S3 bucket of line-separated .json blobs that have a consistent format sometimes! Good luck!"
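For the "consistent format sometimes" situation, one possible first step is to check every line against an explicit expectation before anyone builds on it. The sketch below assumes a hypothetical two-field schema (`id`, `event`) and takes the lines as an in-memory iterable; reading them from S3 is left out on purpose.

```python
import json

# Hypothetical expected schema: field name -> required Python type.
# Note: bool is a subclass of int in Python, so True would pass the int check.
REQUIRED_FIELDS = {"id": int, "event": str}


def split_valid_invalid(lines):
    """Return (valid_records, rejected), where rejected holds (line_number, reason)."""
    valid, rejected = [], []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines but keep counting line numbers
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            rejected.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            rejected.append((lineno, "not a JSON object"))
            continue
        problems = [
            name
            for name, expected_type in REQUIRED_FIELDS.items()
            if not isinstance(record.get(name), expected_type)
        ]
        if problems:
            rejected.append((lineno, "missing or mistyped: " + ", ".join(problems)))
        else:
            valid.append(record)
    return valid, rejected
```

Writing down `REQUIRED_FIELDS` is itself a small act of data modeling: the rejected list shows how far the blobs drift from what consumers assume.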

supercanuck 18 hours ago

The core issue is that dimensional data modeling was introduced to address hardware limitations (disk drives) and limited capacity.

With the advent of unlimited storage and the separation of compute and storage, dimensional data modeling would only be possible if there were strong data governance in a system like SAP or a COE.

  • drillsteps5 10 hours ago

    I would separate dimensional or relational modeling from data governance. You can dump all the data you have in one S3 bucket, and AWS can take care of near-real-time ingestion. So no need for staging, ODS, data warehouse, data marts, hub and spoke, all that jazz. And no data modeling required, just ingest as is and dump it there. Great.

    Now what is this dump good for? It's just a bunch of bytes that now need to be interpreted. There are different perspectives (sales vs. manufacturing vs. procurement vs. finance, etc.). There are data quality issues that need to be identified and resolved. There's PII and other compliance stuff. You have to watch out for who gets permission to see sensitive information (ever dealt with payroll data? It's fun). Your data dump isn't doing any of that by itself. And I think people tend to simply stop at the data dump stage, then give access to analysts and data scientists and tell them to go do reports and outbound data feeds.

    With obvious results.

    • OoooooooO 3 hours ago

      That's how you get 3 different values for a core KPI in 3 dashboards.

      Then you look under the hood of the dashboards, only to see that not a single one follows the business's official definition.
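One way to avoid three values for one KPI is to define it once and have every dashboard call that definition. The sketch below is purely illustrative: the `Order` type, the status values, and the rule "net revenue counts paid orders only" are assumptions standing in for whatever the business's official definition actually is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    amount: float
    status: str  # hypothetical values: "paid", "refunded", "cancelled"


def net_revenue(orders) -> float:
    """Single shared KPI definition (assumed rule: sum of paid orders only)."""
    return sum(o.amount for o in orders if o.status == "paid")


def sales_dashboard(orders) -> dict:
    # Reuses the shared definition instead of re-deriving the KPI locally.
    return {"net_revenue": net_revenue(orders), "order_count": len(orders)}


def finance_dashboard(orders) -> dict:
    # Same KPI, same function, so the two dashboards cannot disagree on it.
    return {"net_revenue": net_revenue(orders)}
```

If the definition changes, it changes in `net_revenue` and every consumer picks it up, which is the governance step the data dump skips.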