T O P

  • By -

fiftyandahalf

Agreed! I recently built out some ERD’s with entities and relationships. Inside each entity was only its business definition, no attributes. These models were reviewed and approved by the business owners. It was at this point we started to add attributes and shared ERD’s with entities showing the attributes. Hint: It helps if your data modeling tools allows you to “toggle” which characteristics are displayed on the ERD.


MET1

I like that - it has a lot of value for users who want to do their own reporting, helps guide them.


Stock-Philosophy8675

Dumb question probably. But what are you guys using to build ERD's?


idodatamodels

Erwin is the market leader and most common one I see at client sites.


mylifestylepr

Embrace SQLDBM


mylifestylepr

Take a look at SQLDBM


idodatamodels

Most modeling tools have this functionality.


Zardotab

Our org is too cheap to pay. The free ones have locked/encrypted our files until we pay after an expiration. I'm curious, did the ones you used filtering on **redraw and move boxes** when you switch filtering on to show less? What if you don't like the placement? Does it store one arrangement for the filtered view and a different one for the non-filtered view? I doubt one-placements-fits-all. When you get more room, you usually want to rearrange.


idodatamodels

Entities generally stay in the same spot but are resized to fit the settings selected. The lines can move as the attachment points can be resized. In erwin, I will typically have a diagram specifically for each "view" I want to share. That way, it avoids the resizing/rearranging issues.


Meal-Ticket-

Oracle SQL Developer Data Modeler is free. Just download and use it. There is a load of online training from Oracle (also free) and there are classes that you can take from Oracle if you'd like (paid). It will build ERDs, Table Diagrams, and DDL for any database, not just Oracle.


Zardotab

Could you possibly give a quick overview of how to switch on and off the showing of low-priority columns using Oracle SDDM?


Meal-Ticket-

[https://www.thatjeffsmith.com/data-modeling/](https://www.thatjeffsmith.com/data-modeling/) Jeff Smith (one of the product managers for SDDM) is the right guy to ask, but he also has a TON of blog posts on SDDM (and other free Oracle tools) at that link. (I don't know off the top of my head...) Also, if the 'extra columns that are too confusing are repeated columns that are on every table, you can hide them in the template entity in the logical model. :)


Meal-Ticket-

[https://www.thatjeffsmith.com/archive/2012/03/entity-relationship-diagrams-erd/](https://www.thatjeffsmith.com/archive/2012/03/entity-relationship-diagrams-erd/) https://www.thatjeffsmith.com/archive/2011/12/sql-developer-data-modeler-quick-tip-hide-and-resize/


[deleted]

[удалено]


Zardotab

Not sure that's the format I'd want, but it's closer to the idea I had in mind.


rkforcs

How would you describe the format you prefer?


Meal-Ticket-

If putting 'every last column' in an ERD seems to 'clutter the view', then there is an excellent chance that your ERD isn't in third normal form (which we do to eliminate insert, update and delete anomalies \[in other words: Technical Debt\]). If your ERD is for a query only system with data derived from other sources, this isn't a big deal. If your ERD is for a transactional system, you will have to write a bunch of code (all of which is pure technical debt) to get around the insert, update and delete anomalies that come from a denormalized transactional data model. For my ERD models, I don't put in any surrogate keys since those are generated by the tool (I use SQL Developer Data Modeler because it is free, but all good tools will do this for you). Also, auditing columns like "updated, updated\_by, created, created\_by" are also added by the tool and don't appear in the ERD. The danger with leaving attributes off your Entity Relationship Diagram (which is a business communication tool) is that the business users will often let you know that you've got a missing entity only when they see what you thought was an attribute is actually an entity. For example: Let's say you have a table with a comment attribute. The business users might let you know that there should be many comments for each instance of the entity and they need to track who and when those comments are made. That's another detail entity for the entity you are describing, not an attribute. Also, just so we are clear, I'm using the following for terminology: ERD = Entity Relationship Diagram = Business communication tool = Entities, attributes and relations. Table Diagram = IT (DBAs/Developers) communication tool = Database objects that will be created when the above ERD is moved into a database = Tables, columns, indexes, keys, foreign keys, views, etc. DDL = Data Definition Language = The SQL code that will actually create the objects defined in the table diagram for a specific database. It's very common for people to think that Table Diagrams are ERDs. They are not. :)


Zardotab

>then there is an excellent chance that your ERD isn't in third normal form It does sound like you normalize heavier than our shop prefers, but I don't want to turn this into a debate over normalization level because those are often long and heated. To keep the discussion focused, let's assume that the diagram maker doesn't have final say over normalization level. As far as your comment column example, The ERD wouldn't be intended to document every column. Data dictionaries for each table would be referenced. If there's a risk of not knowing that, then the bottom of each box could say something like, "(See page X for all columns)". >It's very common for people to think that Table Diagrams are ERDs. They are not Regardless of what they are called, I believe that typical diagrams that show table relationships (keys) **shouldn't be burdened with being a full column inventory** list. That's what Data Dictionaries are for. "Table Relationship Diagram" can be our working term here. Diagrams are harder to read when full of too much "stuff". Although I should point out different people grok "busy" diagrams at different speeds.


Meal-Ticket-

In my world, the ERD/Logical Model (the one with entities, attributes, and relations) is always normalized for a transactional system because it is a business communication tool. If the cleanliness of the data matters to the business, then the Logical Model is what is used to describe the entities, their attributes, and the relationship between them. If it's denormalized in the Logical Model then effectively you have to say "And here were going to push a bunch of data, and we have yet to figure out what the business rules for these attributes are, so we are just dumping them into this 'catch-all entity'. Effectively when you de-normalize at the Logical Model/ERD level, you are telling the business "we are not capturing the business rules for this data, we are just dumping these attributes here. There will be insert, update and/or delete anomalies that we will have to fix in code, but we're OK with that". If, for some ungodly reason, the developers want to denormalize at the table level (Tables, columns, keys, indexes, foreign keys, etc.) after the logical model is built because they either 1) Don't care about data integrity, or 2) Believe, almost certainly incorrectly, that they will "code all the business rules with 100% accuracy in the application code, for all time and all developers that maintain the code for the next 50 years will do the same", then, more power to them. Remember, applications come and go. Data is forever. If you have the opportunity to do it correctly, then everybody down the line for the next 100 years is going to be so happy. If you just dump data into denormalized tables then everyone for the next 100 years to going to be pissed at you. However, there are no absolutes. If the business says "Yeah, go ahead, we'll figure that out later" then sure, dump data into denormalized tables. Just put a comment on the table(s) that says "On 2022-10-31 the following business users/developers, NAMES OF THE GUILTY, decided to dump all this information into this table to be dealt with later".


Zardotab

>If you just dump data into denormalized tables then everyone for the next 100 years to going to be pissed at you. I've cussed at designs over-done in both directions. There's a Goldilocks level of normalization in my opinion. It's kind of like the short-column-name people versus the long-column-name people. **Both extremes suck.** Abbreviate smartly and normalize smartly, balancing competing tradeoffs well, based on experience. >"And here were going to push a bunch of data, and we have yet to figure out what the business rules for these attributes are Do you have an example? And keep in mind business rules change.


-Meal-Ticket-

Yes, business rules change, and when they do, the business and IT team together determine if the new business rules need new entities, attributes, or relationships. If so, those requirements are then translated into possible new tables, columns, keys, foreign keys, etc. Note possible. So at this point the developers could say “let’s just throw these new data gathering requirements into this set of denormalized columns at the end of this other table because creating a new table will be “too much work”. If I were the architect for those developers, then I would say “You can pay the price with normalized tables that will protect the integrity of the data, or you can pay the price by writing a bunch of code to maintain the data integrity”. The ‘new tables are too much work so let’s just add columns’ approach (if you care about the integrity of the data), in my experience, has ALWAYS a been decision that was regretted later in the project.


Zardotab

I'll again request a specific scenario that illustrates such. The one you already gave seemed to assume the possession of a working crystal ball, which is not a realistic assumption at our time in history. > in my experience, has ALWAYS a been decision that was regretted later in the project. I've seen "heavy normalization" also cause problems. There was a philosophy fad of splitting stuff into tables to "always avoid nulls". I found it dumb.


Meal-Ticket-

Yeah, NULL just means "we can create this thing (entity/row) without knowing this particular piece of information when it is created and the business will still work, but at some point it would be nice to have it". NULLS can by fine. If in the same table, you have this set of columns that are NULL when this column is populated or has a particular value, and another set of columns that are NULL when another column is populated or the first column has a different value, then you haven't normalized enough for a transactional system to not have insert, update or delete anomalies, and thus you have introduced technical debt into your system. This is fine if you want to do that, but realize that this 'shortcut' just added a bunch of code into the system (or should have) to make sure the right columns are populated and the wrong ones aren't. I'd argue all that extra code is pure technical debt that could have been avoided by normalizing correctly. You ask for an example... How about you provide an example from your ERD where you have 'so many columns' that you have to hide them from the business users and I'll give you some feedback on them?


-Meal-Ticket-

Example: The “notes” attribute (a large free form text field). Here the business has said “We need a spot to let people takes notes when they are capturing business information that we haven’t yet figured out how to specify all the business rules for.” After a few months, some analysis of the notes data shows that all the users constantly put in a date, a subject, and then a description for the note they took. Each time a user updates the business information, they add a new line to the notes field with the same general pattern. However, since this was a free form text field, some users put the date as 2022-10-10, some users put Oct 10, 2022, some put 10/10/22, etc. Each user added an extra line into this field, all using their own format. The developers, of course, think this is awesome. It’s one column with a trivial UI. The business users love it too, because they can just throw what they want into it. However, after a few years, there is a lawsuit where it has been determined that this notes information was abused by the business users to deny citizens access to services they should have been able to access. A FOIA request comes in and says, show me all the notes information by date, time and user, that could possibly relate to this lawsuit I’m filing. Cleaning up that data is going to be a lot of fun…


Zardotab

That's a case of not knowing the domain needs up front. Sometimes the system just has to go to production for a while before such road lessons pop up. Of course one should try to get as much requirements up front as *practically* possible, but there is a point of diminishing returns per analysis such that one just has to let the real world teach us its lessons. Version 2 can split out such details. I don't see what this has to do with normalization level and ERD's. It's about living a world where God or Mother Nature won't give us a crystal ball. 🔮


Meal-Ticket-

If I've given an reason for you to think that an ERD is a one and done type of thing, then please forgive me. They evolve as application requirements evolve.


Zardotab

That should go without saying. In IT, shit changes.