My question is about database modeling. I looked for this in other database design questions on SO but haven't found it, so I'm asking here:
What general guidelines and best practices should I keep in mind when designing a database for an application?
What are the best resources/books/university lectures for database design concepts?
Thank you.
Some things I have learned through experience (I'm sure some people will disagree, but I have been querying, designing and programming databases for 30 years and have seen the effects of stupid design up close and personal):
There are three critical things to consider in all database design: data integrity (without it you essentially have no data), security and performance. All other considerations come second to these three.
Never create a table without a way to uniquely identify a record.
There are very few true natural keys that are really good to use as primary keys. If it is out of your control whether a value will change, don't use it as the PK (you really don't want to change 27 child tables because a company name changed, do you?). Use a surrogate key instead. Using a surrogate key does not get you out of the need to set a unique index if there is a possible unique composite key. If you have a way to guarantee uniqueness through a composite, always set up those indexes. Duplicate records are the bane of an application's existence. It seems as if this should be obvious, but never treat a name as a key field; names are not and never will be unique.
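To make the surrogate key point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables and columns (product, order_line, supplier_sku) are made up for illustration and are not from any real schema. The surrogate product_id is what child rows point at, while the natural composite (supplier_id, supplier_sku) still gets its own unique constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,    -- surrogate key, never changes
    supplier_id  INTEGER NOT NULL,
    supplier_sku TEXT NOT NULL,
    UNIQUE (supplier_id, supplier_sku)   -- natural composite is still unique
);
CREATE TABLE order_line (
    order_line_id INTEGER PRIMARY KEY,
    product_id    INTEGER NOT NULL REFERENCES product (product_id)
);
INSERT INTO product (supplier_id, supplier_sku) VALUES (7, 'AB-100');
INSERT INTO order_line (product_id) VALUES (1);
""")

# The supplier renames the SKU: one row changes, child tables are untouched.
conn.execute("UPDATE product SET supplier_sku = 'AB-100-R2' WHERE product_id = 1")

# The unique constraint still blocks a duplicate of the natural key.
try:
    conn.execute(
        "INSERT INTO product (supplier_id, supplier_sku) VALUES (7, 'AB-100-R2')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The child row still resolves to the renamed product through the surrogate key.
linked_sku = conn.execute("""
    SELECT p.supplier_sku
    FROM order_line AS ol
    JOIN product AS p ON p.product_id = ol.product_id
""").fetchone()[0]
```

Had supplier_sku been part of the primary key, the rename would have had to cascade into every child table that references it.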
Don't use GUIDs as your primary key, because they can hurt performance. Even if you need a GUID for replication, consider an int or bigint PK as well.
Don't design as if you will want to change database backends unless you know up front that you will. Virtually all the good performance tuning techniques are database-specific, so don't harm your own ability to tune the database for the sake of a requirement that doesn't exist.
Avoid entity-attribute-value (EAV) table structures. They are a pain to query.
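As a rough illustration of why EAV tables are hard to query, the sketch below (Python's sqlite3, with made-up table names person_eav and person) asks the same question of both layouts: who lives in Oslo and is over 30? The EAV version needs one self-join per attribute and a cast, because every value is stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- EAV layout: one row per (entity, attribute) pair, every value stored as text.
CREATE TABLE person_eav (entity_id INTEGER, attribute TEXT, value TEXT);
INSERT INTO person_eav VALUES
    (1, 'city', 'Oslo'), (1, 'age', '41'),
    (2, 'city', 'Oslo'), (2, 'age', '19');

-- Conventional layout: one row per entity, typed columns.
CREATE TABLE person (person_id INTEGER PRIMARY KEY, city TEXT, age INTEGER);
INSERT INTO person VALUES (1, 'Oslo', 41), (2, 'Oslo', 19);
""")

# EAV: a self-join for the second attribute, plus a cast from text to integer.
eav_rows = conn.execute("""
    SELECT c.entity_id
    FROM person_eav AS c
    JOIN person_eav AS a ON a.entity_id = c.entity_id
    WHERE c.attribute = 'city' AND c.value = 'Oslo'
      AND a.attribute = 'age'  AND CAST(a.value AS INTEGER) > 30
""").fetchall()

# Conventional table: a plain WHERE clause on typed columns.
plain_rows = conn.execute(
    "SELECT person_id FROM person WHERE city = 'Oslo' AND age > 30"
).fetchall()
```

Both queries return the same person, but every extra attribute in the filter adds another join to the EAV version, and nothing stops someone from storing 'forty-one' as an age.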
Put everything you need to ensure data integrity into the database design. Defaults, constraints, triggers and so on are necessary to keep useless data out. Do not rely on application code to do this, or you will be sorry.
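A small sketch of what "integrity in the database" can look like, again with Python's sqlite3 and hypothetical tables (customer, customer_order). The rejected helper is only there for the demonstration. The database itself refuses bad rows, no matter which application sends them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    email       TEXT NOT NULL UNIQUE
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
    quantity    INTEGER NOT NULL CHECK (quantity > 0),
    status      TEXT NOT NULL DEFAULT 'new'
                CHECK (status IN ('new', 'shipped', 'cancelled'))
);
INSERT INTO customer (email) VALUES ('a@example.com');
""")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

results = {
    "valid order": rejected(
        "INSERT INTO customer_order (customer_id, quantity) VALUES (1, 2)"),
    "zero quantity": rejected(
        "INSERT INTO customer_order (customer_id, quantity) VALUES (1, 0)"),
    "unknown customer": rejected(
        "INSERT INTO customer_order (customer_id, quantity) VALUES (99, 1)"),
    "bad status": rejected(
        "INSERT INTO customer_order (customer_id, quantity, status) VALUES (1, 1, 'lost')"),
}

# The one accepted row picked up the default status.
default_status = conn.execute("SELECT status FROM customer_order").fetchone()[0]
```

Only the first insert gets through; the other three are stopped by the CHECK and foreign key constraints rather than by application code.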
Others have mentioned normalization, and I agree that you must understand it thoroughly, even if you later choose to denormalize.
If you want any kind of performance, do not stack views on top of views. Every database I have seen that does this has a massive performance problem.
Consider the data needed for managing the database as well as the data needed by the application. If you are serious about databases, you need to learn about database auditing, and your database should implement a way to find out who made which changes, when, and what the old data was. You'll thank me the first time someone maliciously changes data or someone accidentally deletes all the records in a table.
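One common way to get that kind of audit trail is a trigger that copies old values into an audit table. The sketch below uses Python's sqlite3 with invented names (account, account_audit, changed_by); it is only one possible approach, and a server database would usually record its own session user instead of relying on a changed_by column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    balance    INTEGER NOT NULL,
    changed_by TEXT NOT NULL           -- supplied by the caller in this sketch
);
CREATE TABLE account_audit (
    audit_id    INTEGER PRIMARY KEY,
    account_id  INTEGER NOT NULL,
    action      TEXT NOT NULL,
    old_balance INTEGER,
    new_balance INTEGER,
    changed_by  TEXT,
    changed_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Record who changed a row, plus the old and new values.
CREATE TRIGGER account_audit_update AFTER UPDATE ON account
BEGIN
    INSERT INTO account_audit (account_id, action, old_balance, new_balance, changed_by)
    VALUES (OLD.account_id, 'UPDATE', OLD.balance, NEW.balance, NEW.changed_by);
END;
-- Keep the old values of deleted rows. SQLite has no session user,
-- so the person who deleted the row is not known here.
CREATE TRIGGER account_audit_delete AFTER DELETE ON account
BEGIN
    INSERT INTO account_audit (account_id, action, old_balance)
    VALUES (OLD.account_id, 'DELETE', OLD.balance);
END;
""")

conn.execute("INSERT INTO account VALUES (1, 100, 'alice')")
conn.execute("UPDATE account SET balance = 40, changed_by = 'bob' WHERE account_id = 1")
conn.execute("DELETE FROM account")  # the "oops, deleted everything" case

audit_rows = conn.execute("""
    SELECT action, old_balance, new_balance, changed_by
    FROM account_audit ORDER BY audit_id
""").fetchall()
```

After the accidental delete, the account table is empty but account_audit still holds the old balances, which is what you need to work out what happened and put the data back.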
Really think about how the data will be queried when you design. It can make a huge difference to the design.
Don't store more than one piece of information in a field. Putting a comma-delimited list into a field instead of adding a related table may seem cool, but it is a very bad idea.
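Here is a quick sketch of one way the comma-delimited field bites you, using Python's sqlite3 and hypothetical tables (article_csv, article, article_tag). Searching the packed column for the tag 'sql' also matches 'nosql', while the related table allows an exact comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Anti-pattern: several values packed into one column.
CREATE TABLE article_csv (article_id INTEGER PRIMARY KEY, tags TEXT);
INSERT INTO article_csv VALUES (1, 'sql,design'), (2, 'nosql');

-- Related table: one row per (article, tag) pair.
CREATE TABLE article (article_id INTEGER PRIMARY KEY);
CREATE TABLE article_tag (
    article_id INTEGER NOT NULL REFERENCES article (article_id),
    tag        TEXT NOT NULL,
    PRIMARY KEY (article_id, tag)
);
INSERT INTO article VALUES (1), (2);
INSERT INTO article_tag VALUES (1, 'sql'), (1, 'design'), (2, 'nosql');
""")

# Substring matching on the packed column also picks up 'nosql'.
csv_hits = conn.execute(
    "SELECT article_id FROM article_csv WHERE tags LIKE '%sql%' ORDER BY article_id"
).fetchall()

# The related table supports an exact comparison on a single value.
table_hits = conn.execute(
    "SELECT article_id FROM article_tag WHERE tag = 'sql' ORDER BY article_id"
).fetchall()
```

The packed column returns both articles, the related table only the one actually tagged 'sql'. You can patch the LIKE pattern to handle delimiters, but counting, joining or constraining individual tags stays awkward.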
Elegance is often the enemy of performance in databases. Choose performance over elegance every time and you won't go wrong.
Avoid database keywords in your object names. Your programmers will thank you. Pick a naming convention and use it consistently. If a field appears in more than one table, make sure it has the same name, data type and length (if applicable) in all of them. The exception is when one table has two foreign keys to the same id field; then use the id field name plus a prefix to tell them apart, such as Sales_person_id and Customer_person_id. Fix spelling mistakes right away; you really don't want to have to remember for the next ten years that in that one table it is persnoid instead of personid.
Read about database refactoring (look up some good books on Amazon) and consider how you will do it in your design. Few databases are designed for refactoring, and being able to refactor is critical to fixing database problems caused by less than well-thought-out design or by changes in business requirements.
While you are reading, read about performance tuning; you will learn a lot about what to avoid when designing a database.
I'm sure there is much more, but that's enough to start with.
One extra thing I want to add. Don't design a database as if it were the layout of a data entry screen. Even in transactional databases, data is queried far more often than it is written. Really think about how easy it is to get data out of the database (ahem, this is why EAV models are so bad) and about the effect of the design on reporting. This is critical, because too often the people doing the reporting are not the people who designed the database, or the reporting tasks come late in the project, long after the data entry screens were built. Databases are not easy to refactor, so consider the entire life cycle of the data when designing the database. Consider which values need to be stored as they were at a point in time: the number of items ordered, multiplied two years later by the price in the products table, does not give what the customer paid, because that is not the price at the time of the order. If you need this kind of information for reporting and the design did not allow for it, it is usually too late by the time the report is written. Things that work fine when dealing with one record at a time can be a pain when you need to look at thousands or millions of records. Not every application gets a separate reporting database, so really think about this.
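The price-at-time-of-order problem is easy to show in a few lines. This is a sketch with Python's sqlite3 and made-up names (product, order_line, unit_price_cents), assuming the simplest fix of copying the price onto the order line when the order is placed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id  INTEGER PRIMARY KEY,
    price_cents INTEGER NOT NULL          -- current catalogue price
);
CREATE TABLE order_line (
    order_line_id    INTEGER PRIMARY KEY,
    product_id       INTEGER NOT NULL REFERENCES product (product_id),
    quantity         INTEGER NOT NULL,
    unit_price_cents INTEGER NOT NULL     -- price copied at order time
);
INSERT INTO product VALUES (1, 1000);
""")

# Place the order: copy the current price onto the order line.
conn.execute("""
    INSERT INTO order_line (product_id, quantity, unit_price_cents)
    SELECT product_id, 3, price_cents FROM product WHERE product_id = 1
""")

# Two years later the catalogue price goes up.
conn.execute("UPDATE product SET price_cents = 1500 WHERE product_id = 1")

# Report built on the product table: uses today's price, so it is wrong.
wrong_total = conn.execute("""
    SELECT SUM(ol.quantity * p.price_cents)
    FROM order_line AS ol
    JOIN product AS p ON p.product_id = ol.product_id
""").fetchone()[0]

# Report built on the stored price: what the customer was actually charged.
right_total = conn.execute(
    "SELECT SUM(quantity * unit_price_cents) FROM order_line"
).fetchone()[0]
```

Here the first report says 4500 cents and the second 3000 cents for the same order. If unit_price_cents had not been stored at order time, there would be no way to recover the correct figure later.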