Use MySQL well. Advice from a systems engineer to a developer.

Table of contents of the article:

One of the most frustrating aspects of being a system engineer is having to get your hands dirty in jobs that other professionals should do.

In fact, it can happen that after delivering a server that is well sized on the hardware and well configured at the software level the customer complains of absolutely inadequate performance of the application running on it.

Whether it can be a custom CMS, a custom CRM, or any other Web application commissioned and developed tailored to the customer's needs, the essence is that the software does not run as it should.

Often after the initial and calm complaints, as the situation worsens and continues, we get to exclaim with an annoyed tone that "Nothing worksOften with harbor bar epithets, tone and terminology.

After a normal and calm response in which it is highlighted how that configuration is more than adequate for the customer's needs, we arrive at the customer's invitation to discuss with the software developer because perhaps something has gotten out of hand and being a custom application without specific and documented cases (since the software is not known nor widespread) could present critical issues obviously unknown to us and not attributable.

Obviously for the programmer, in 99% of cases of this type, the problem is never in his software, but always attributable to the system administrator who does not know how to configure a server.

And here it is, the most feared of the battles: Systems engineer versus developer.

The systems engineer is usually a XNUMXth dan black belt from UNIX, most likely has a background as a developer and DBA. It is the potential Bruce Lee able to knock you down with a stroke that ALWAYS prefers to evade the challenge, due to lack of time but above all skills.

And it is clear, that by competences, we do not mean the theoretical and practical or academic competences of being able to brilliantly analyze the problem and consequently to solve it, but it is simply not within its competence, or that that task has been entrusted to another esteemed figure of another company who is paid handsomely to design and develop functional and well-optimized software.

Too bad that in the sad life of the computer scientist and above all of the systems engineer, the blame game is always around the corner and therefore if you do not accept the challenge to the death you will always be labeled as the one who does not know how to do his job, as well as you will lose the customer who it will go to some other smoking company until some black belt systems engineer decides to get their hands dirty and prove once and for all that all the problems are derived from a bad software implementation.

Not only will you have saved the life of your customer who would have found himself with unusable software, but also of the programmer who, thanks to your advice and your interventions, will have enjoyed an improvement to his software that he can even resell to other customers.

The truth? You don't know how to use MySQL, let alone SQL in general.

An interesting phenomenon emerges among many LAMP developers (an acronym that indicates the combination of Linux, Apache, MySQL, PHP technologies): little or no knowledge of using MySQL. Although MySQL has reached its eighth version, with a set of features and functionality that could rival Microsoft's SQL Server, PostgreSQL or Oracle, it is observed that a significant portion of developers continue to use it as if it were still at version 3.23.

This version, dating back to 2001, did not support advanced features such as referential integrity constraints, stored procedures, stored functions, triggers, views or transactions, which are available today and are powerful tools for data management and analysis. Therefore, many developers do not fully exploit the potential offered by MySQL, limiting themselves to using it in an old-fashioned and suboptimal way.

It's important to remind these developers that the time when MySQL was limited in its capabilities is long gone. The MyISAM engine, which characterized version 3.23, has been replaced by InnoDB, which is considerably more capable and suitable for professional use. In addition, forks of MySQL have been developed, such as XtraDB by Percona Server, which have further enhanced its capabilities, making it an extremely powerful relational DBMS (Database Management System).

It is therefore essential that developers update and train adequately to take full advantage of the potential offered by MySQL. This improvement in skills will not only allow you to create more efficient and robust applications, but also keep you more in line with the evolution of the field of software engineering, which is constantly evolving.

Personally we have nothing against MySQL or MariaDB, but currently we have standardized practically all our servers with Percona Server which also has accessory tools such as Percona Utilities or Percona Xtrabackup to make hot backups in a much faster, safer, more consistent and efficient way than usual trite mysqldump.

To give a small example, a programmer today is convinced that a DMBS like MySQL is a simple drawer in which to store and retrieve data. In other words, they use a very powerful (and therefore heavy) DBMS as if it were almost a NO-SQL DB and therefore light.

A DBMS like MySQL today is instead more comparable to an advanced industrial production line than to a very banal chest of drawers. In other words, those big candy production chains, where 3 main ingredients and a dye are put in, at the end of the carousel packets of gummy bears come out, cut, mixed, packed and ready to load on the delivery trucks.

For example, few LAMP developers know that MySQL is very powerful and allows you to manage most of the operations that are done at the application level, directly at the DB level since MySQL itself has an internal programming language.

Let's see some of the main shortcomings or bad practices of those who develop Web applications using MySQL.

The following reading can transform any former unemployed peracottaio recycled Web developer into a good developer and maybe even a good DBA if you continue on that path.

1. Foreign Keys and Referential Integrity

There is a common belief, especially among programmers who rely solely on the PHP programming language, that the only way to delete interconnected data in a database is through the application layer. This belief prompts them to implement a process in which they query the data they wish to delete, delete it one by one, and only then go back to the parent table to remove the original item. They also hold a similar view if you want to use a mathematical formula to perform a complex calculation: in their view, this should only be done by PHP, which will produce a result which will then be stored in the database.

However, what is often overlooked is the power of foreign keys and referential integrity features found in relational database management systems (RDBMS). For example, if you wanted to delete an artist and all of her albums from a hypothetical music archive database, you could use the ON CASCADE DELETE action. This functionality allows you to perform a deletion referenced to the deletion of the corresponding foreign key, avoiding having to remove each linked value individually.

In an RDBMS context, referential integrity represents an interrelational integrity constraint, which requires that every value of one attribute (or column) of one relationship (or table) be present as a value of another attribute in another relationship. In simpler words, in relational databases, to maintain referential integrity, each field in a table that has been declared as a foreign key can only contain values that match the primary key or a candidate key of another linked table.

To illustrate this, consider the case where you want to delete a record that contains a value referenced by a foreign key from another table. This action would violate referential integrity. However, some RDBMSs can prevent this violation, either by deleting the corresponding rows of the foreign key, or by aborting the operation and not proceeding with the deletion. This serves to preserve the integrity of the data in the database, emphasizing the importance of foreign keys and referential integrity.

2. Stored Procedures

Stored procedures are one of the key features of many Relational Database Management Systems (RDBMS). This feature, previously missing in versions before 5.0 of MySQL, was finally implemented, thus responding to the criticisms of many users of this platform.

A stored procedure can be defined as a series of SQL statements which are stored directly in the database with a unique name to identify them. This name allows the entire set of statements to be called and executed, eliminating the need to rewrite the entire code each time the same operation needs to be performed.

A stored procedure can have one or more parameters, which are used to customize the operation that the procedure performs. Each parameter has a specific name, a defined data type, and can be categorized as an input parameter, an output parameter, or both. If not specified, the parameter is considered by default as an input parameter.

Stored procedures offer a number of benefits. First of all, they can help reduce network traffic, since a single procedure call replaces the need to issue multiple queries. Also, because the code runs directly in the database, the stored procedure can be faster than an equivalent set of individually executed SQL statements. Finally, using stored procedures can help improve security, since stored procedures provide a layer of abstraction that hides database details.

3. Stored Functions

Stored functions share some similarities with stored procedures in a database, but have a more limited specific purpose. While a stored procedure is a group of SQL statements that can perform complex operations, a stored function is designed to perform a specific operation and return a single value.

In practice, a stored function works just like an ordinary function provided by MySQL or another RDBMS. These functions take one or more parameters, perform an operation, and return a result. For example, they might calculate the sum of a series of numbers, return the length of a text string, or perform a more complex calculation based on data in the database.

Unlike stored procedures, however, stored functions cannot return a resultset, which is a set of rows returned by a SELECT query. Instead, they return a single value of a specific data type, such as an integer, float, string, or date.

In versions of MySQL prior to 5.0, there were user-defined functions that were stored outside the server. These custom user functions allowed developers to create their own functions and use them in their SQL queries. However, starting with version 5.0, MySQL introduced stored functions, which are stored directly in the database.

While user-defined functions are still supported in MySQL, the use of stored functions is generally recommended. This is because stored functions can take advantage of some of MySQL's optimizations and have fewer restrictions than user-defined functions. For example, a stored function can access tables, session variables, and other database resources, while a user-defined function has tighter limits on what it can do.

In essence, stored functions provide a powerful and efficient way to encapsulate business logic within the database, allowing developers to write cleaner and more maintainable code.

4. Triggers

Triggers are special objects in relational databases, closely associated with specific tables, that activate or "trigger" in response to certain events. This feature was introduced in MySQL starting with version 5.0.2.

A trigger can be defined to respond to a variety of events affecting the table it is associated with, such as inserting new rows, modifying existing rows, or deleting rows. A trigger can be configured to fire before (BEFORE) or after (AFTER) the event occurs. So, we have six distinct trigger types:

BEFORE INSERT
AFTER INSERT
BEFORE UPDATE
AFTER UPDATE
BEFORE DELETE
AFTER DELETE

Each trigger is associated with an SQL statement or a block of statements that are automatically executed for each row affected by the event. In other words, a trigger is a mechanism that allows you to define an automatic response to a specific event. This response can consist of a variety of actions, such as updating a value in a specific column, inserting a new row into another table, or logging the event to a log.

Let's take a practical example. Let's say you want to track the total sales of a sales rep in a given month. We could create a trigger that responds to the "insert new row" event on the "orders" table. When a new order is entered, the trigger automatically calculates the total sum of the current month's orders for the specific agent and updates the value in the “total_sales_monthly” table. That way, we wouldn't need to run a SELECT query every time we wanted to know the total sales for the month for an agent, as the value would be pre-calculated and immediately available.

Triggers can greatly simplify application logic by automating tasks that would otherwise have to be performed manually. However, they should be used with caution, as they can complicate database debugging and maintenance if not handled properly.

5. Views

Views in a database represent a virtualized perspective on a subset of data contained in one or more tables. Literally translated from the English term "view", which means "view", views do not physically store data, but offer a tabular presentation of rows and columns derived from the execution of a specific SQL query. Views work as if they were "real" tables, allowing you to perform operations such as SELECT, UPDATE, INSERT or DELETE, depending on the permissions and restrictions you set.

Creating a view involves defining a query that "selects" a subset of data from one or more tables. This query is stored as part of the view definition and runs every time the view is called. In essence, a view functions as a window into a specific set of data, filtering or rearranging the information based on specific user or application needs.

Using views has several advantages. First, views can simplify query operations by hiding the complexity of join, filter, or aggregate operations that may be involved in generating the subset of data. Second, views can help improve data security by limiting access to a subset of data and hiding sensitive or irrelevant details. Finally, views can help maintain data integrity by providing a consistent interface to data even as the underlying structure of tables changes.

An example of a view might be a "CustomerOrders" view which aggregates information from an "Orders" table and a "Customers" table, showing only the order details and basic customer information for each order. This view could be used by a sales application to display a list of orders without manually performing complex filtering or joining.

Views are powerful tools that make it easier, more secure, and more manageable to interact with data in a database.

6. ACID Transactions

The use of transactions in databases is a critical function that allows you to manage a set of operations in a secure and cohesive way, ensuring that changes to data are made only at a well-defined moment. By initiating a transaction, all changes or updates to data remain in a temporary and non-final state (suspended) and are not visible to other users. These changes can then either be confirmed through an operation known as "commit", which makes the changes permanent, or canceled through an operation known as "rollback", which returns the data to its original state. This concept is often described as “all or nothing”: either all the changes foreseen by the transaction are made, or none are.

This transaction management process is critical in situations where maintaining data consistency is critical. For example, if you have 100 records to enter in a restaurant order and the entry stops at the 90th due to a Wi-Fi connection problem, the system will perform a rollback of the transaction, canceling the 90 entries because they complete compared to the 100 total expected.

This way of handling transactions follows the principles ACID, an acronym that stands for Atomicity, Coherence, Isolation and Durability:

Atomicity: This principle ensures that a transaction is regarded as an indivisible unit of work. Either all operations of the transaction are completed successfully, or none are. If an operation within a transaction fails, the entire transaction is canceled (rolled back).
Coherence: This principle ensures that a transaction takes the database from one consistent state to another. If the data in the database was consistent before the transaction, it will be consistent after the transaction, regardless of the success or failure of the transaction.
Isolation: This principle ensures that each transaction runs in isolation from other transactions. This means that no transaction can interfere with the others, and each one must be completed without knowledge of the other transactions in progress.
Durability: This principle ensures that once a transaction has completed, its data changes persist in the database, even in the event of a system failure or crash.

Together, these ACID principles provide a robust framework for managing transactions and ensuring data integrity in relational databases.

And let's not forget the performances.

If we wanted to talk about query execution speed, we could say that fewer and fewer LAMP developers know how to correctly design a database and implement a correct use of indexes which are absolutely vital if you want to achieve speed in the execution of the queries themselves.

In other words, very often slow applications that arrive at the connection timeout and even at the saturation of the maximum available connections have as their sole cause the lack of indexes on the tables.

A database index is a data structure that improves the speed of operations on a table.

Whenever a web application sends a query to a database, the database will search all rows of your table to find the one that matches your request. As your database tables grow, more rows need to be inspected each time and this can slow down the performance of the database and therefore of the application.

MySQL indexes solve this problem by taking data from a column in your table and storing it alphabetically in a separate point called the index.

Does it just work? No. It also has to be fast.

It seems that, in the context of LAMP development (Linux, Apache, MySQL, PHP), one of the main problems lies in the balance between PHP programming and database design and optimization. Looking at the general trends, one notices that developers often tend to focus more on PHP programming, neglecting the importance of a well-designed and optimized database.

One of the reasons for this trend may lie in a lack of relevant academic training, which leads the developer to limit himself to testing the software on very small datasets and in contexts of low competition. Under these conditions, it is easy to conclude that the software works correctly and that the queries return the expected results. However, this kind of test is not sufficiently representative of the operational reality in which the software will actually be used.

In fact, it's one thing to test an application on a dataset of 100 demo records with a single user logged on, and it's quite another thing to test it on a dataset of tens of millions of records with 100 operators logged on in real time. The latter scenario presents significant challenges in terms of concurrency management and data-driven operations efficiency, challenges that do not emerge in a limited test environment.

The moral of the story, then, is that "working" isn't enough to describe good software. A software must be not only functional, but also efficient. It must be able to handle large volumes of data and a high number of users connected simultaneously, returning results quickly and without errors. In other words, it's not enough that it works – it has to be fast too.

To achieve these goals, developers must view database optimization as a critical part of the development process, not as a sideline. This takes time and training, but the result will be more robust, more efficient, and ultimately more useful software for end users.

BLOG

Use MySQL well. Advice from a systems engineer to a developer.

Innovation and Evolution: The Challenge of LAMP Developers in Using MySQL.

And here it is, the most feared of the battles: Systems engineer versus developer.

The truth? You don't know how to use MySQL, let alone SQL in general.

1. Foreign Keys and Referential Integrity

2. Stored Procedures

3. Stored Functions

4. Triggers

5. Views

6. ACID Transactions

And let's not forget the performances.

Does it just work? No. It also has to be fast.

Related Posts

About the author

Marco Marcoaldi

What do we do ?

Do you have doubts? Don't know where to start? Contact us!

Chat with us

0256569681

Contact us online

INFORMATION

Solutions & Services

System assistance

Help & Utility

Company

Secure