Digitalisierung

Dataverse Data Modeling: Best Practices from the Field

Most Dataverse problems in remediation projects aren't platform faults — they're modeling decisions made under time pressure and never corrected. An overly loose data type, a wrong relationship behavior: harmless individually, in sum a model nobody dares to touch anymore.

This post is the practical guide: the decisions that keep a data model sustainable in five years. The overall context is provided by our Dataverse guide — here we get into the nitty-gritty.

1. The Model First, Then the App

The most important rule is an attitude: Dataverse is data-centric. You don't build an app; you build a data model around which several apps, flows, and reports grow. Whoever starts with the app screen builds a model that understands exactly one app and blocks every second one. Concretely: before the first screen exists, the entity model is on the whiteboard — tables, relationships, keys. It feels slower on day one and is noticeably faster across the project.

2. Normalization — and When You Break It Deliberately

Normalize first. Every business concept gets its own table; repetitions are mapped through relationships — a project has a customer via a lookup, not the customer name duplicated. There is exactly one good reason to denormalize: read performance on frequent, expensive queries. For that, Dataverse has rollup columns and calculated columns. Both are controlled denormalization: redundant, but kept consistent by the platform — the difference from a manually copied field that drifts apart at the first update.

The next diagram shows the decision: when to normalize, when to denormalize in a controlled way.

Normalize — or Denormalize in a Controlled Way? Does a concept repeat? Normalize own table + lookup default case Denormalize in a controlled way rollup / calculated column only for read performance almost always performance exception Never copy manually — that drifts apart at the first update. Redundancy yes — but only if the platform maintains it. Rule of thumb: redundant only if kept consistent automatically. Source: Medienstürmer

3. Data Types: Every Compromise Is a Mortgage

Choose the narrowest data type that meets the requirement — the one rule that saves the most work in remediations and is violated most often:

  • A choice with fixed values is Choice, not free text. Free text means "yes", "Yes", "YES ", "y" — and a report that groups nothing anymore.
  • A monetary amount is Currency with a currency, not Decimal. Decimal loses the currency irretrievably.
  • A date is Date/DateTime, not text. A date as text destroys any time-based analysis and sorting.
  • Yes/no is Yes/No, not a string and not a two-value choice.
  • Email/URL get the appropriate format so that validation and linking take effect.

The pain comes delayed: while modeling you feel nothing. At the first report, the first integration, the first migration, you pay — with data cleanup on top.

4. Relationships and Their Behavior — the Underestimated Trap

Dataverse knows 1:N, N:1, and N:N — the kind is usually obvious. What's almost always overlooked is the cascade behavior: what happens to child records when the parent record is deleted, reassigned, or shared?

The platform defaults are rarely what you want from a business standpoint. "Cascade All" on delete means: whoever deletes a customer deletes every related project, every activity, every note along with it — one of the most common data losses in young environments, invisible until it happens.

The practical rule: create every relationship with deliberately set behavior, never wave the default through. Three questions — Delete? Reassign? Share? — answered on business grounds, not guessed technically.

For N:N, additionally: a native N:N relationship carries no attributes of its own. As soon as the link itself has data (date, role, status), you need an intermediate table with two lookups. Decided early it's cheap — retrofitted it's a migration.

5. Naming Conventions and the Publisher Prefix

Unsexy, but it shapes the project. Three rules we enforce:

  • One publisher prefix, set once, never changed. It sits in front of every table and column name; a later switch is effectively a re-modeling exercise.
  • Display name and schema name on purpose. Dataverse removes internal underscores from the schema name — what you design as hub_afa_methode becomes hub_afamethode in the API. Code against it without checking the schema name builds in runtime errors.
  • Consistent identifiers. Singular or plural, German or English — whichever convention, but one, consistently. Mixed conventions are a guessing game for every new developer.

6. Extend a Standard Table or Build Your Own

The temptation to stuff everything into Account and Contact is strong. The rule of thumb is simple and still gets violated constantly:

  • A standalone business thing (project, asset, contract) — its own table.
  • A property of an existing thing (a customer's industry, a contact's position) — a column on the standard table.

Every violation — a standalone concept as a loose bundle of fields on Account — makes analysis, permissioning, and reuse harder. Permissioning is no side aspect here: an own table can be permissioned at the row level; a field bundle on Account inherits the account's visibility. More on this: Security Roles in Dataverse.

The next diagram shows the decision.

Own Table or Column on the Standard Table? Is the concept a standalone thing? Own table Project · Asset · Contract permissionable at row level reusable Column on standard table industry/position as a field inherits the account's view pure property yes, standalone only a property When in doubt, an own table — merging is easier than pulling apart. Untangling a field bundle on Account later is a migration. Source: Medienstürmer

7. Global vs. Local Choices and Alternate Keys

Two building blocks with great leverage:

Choices: if a value catalog is needed in more than one place — status, priority, region — it belongs global, defined once and referenced everywhere. Otherwise you maintain the same list five times and they drift apart. A genuinely table-specific choice stays local, or the global namespace gets cluttered.

Alternate keys: for every table into which data is fed from outside (integration, import, ERP sync), you define an alternate key on the business key — customer number, SKU, document number. Without it, every integration identifies records via the GUID it doesn't know and creates duplicates. It decides between a clean or a duplicated data foundation. More on this in the Dataverse guide and in Dataverse vs. SQL vs. SharePoint.

8. The Modeling Checklist

What we go through before every go-live:

  1. Is the entity model in place before the first screen was built?
  2. Is every field typed as narrowly as the business allows?
  3. Does every relationship have a deliberately set cascade behavior?
  4. Do N:N links with their own data carry a real intermediate table?
  5. Is the publisher prefix set once and the naming consistent?
  6. Are reused value lists global, one-off ones local?
  7. Does every integration-relevant table have an alternate key?

If even one point is answered with "no," it's corrected before go-live — not "later." Almost free now, expensive with production data.

Conclusion

Good Dataverse modeling is unspectacular: narrow data types, deliberate relationship behaviors, own tables for standalone concepts, global choices for reused lists, an alternate key for every integration. None of it is difficult — all of it gets skipped under time pressure.

The honest truth: almost every "we'll clean that up later" never gets cleaned up and costs a multiple in the end. The hour of clean modeling at the start is the cheapest in the entire project.

A Data Model That Still Holds Up in Five Years?

We model it with you from the ground up — or review an existing model honestly and tell you what holds up and what becomes a mortgage. With a remediation path, if needed.

Sources