Back to Course |
Structuring Databases in Laravel 11

Foreign Keys to "Grandparent": Duplicate Data?

Let's talk about storing additional "duplicate" foreign keys, with an example from the open-source project Krayin CRM.


In the Krayin CRM project, there is a table product_inventories for which migration looks like this:

Schema::create('product_inventories', function (Blueprint $table) {
$table->increments('id');
$table->integer('in_stock')->default(0);
$table->integer('allocated')->default(0);
 
$table->integer('product_id')->unsigned();
$table->foreign('product_id')->references('id')->on('products')->onDelete('cascade');
 
$table->integer('warehouse_id')->unsigned()->nullable();
$table->foreign('warehouse_id')->references('id')->on('warehouses')->onDelete('cascade');
 
$table->integer('warehouse_location_id')->unsigned()->nullable();
$table->foreign('warehouse_location_id')->references('id')->on('warehouse_locations')->onDelete('SET NULL');
 
$table->timestamps();
});

The warehouse_location_id field is from a grandparent table. A simplified schema looks like this:

So, we have the location_id field in both the "child" and "grandchild" tables, duplicating data.

Technically, this violates the third normalization form. Also, there could be problems like:

  • Extra space on the disk.
  • You need to keep the various redundant foreign keys synchronized. That burden is either on your application or (better) on the database, where you use triggers to update all copies.
  • As a consequence of the above, the performance of data modifications will suffer.

However, if your application frequently queries the "grandparent" table, you might gain a performance advantage.

In this example, you could use the direct column instead of joining the full warehouses table:

ProductInventory::where('warehouse_location_id', $id)->get();

Alternative: BelongsToThrough

An alternative approach would be to use the hasManyThrough() relationship from the Laravel core. But we need the opposite, belongsToThrough(), which doesn't exist in Laravel.

But a package staudenmeir/belongs-to-through can help. This package allows you to query the grandparent table through the child table.

The only difference when using this package is that instead of adding warehouse_id to the warehouse_locations table, we would add warehouse_location_id to the warehouses table.

Then, in the ProductInventory Model, we would define the BelongsToThrough relationship from the package where related would be the WarehouseLocation Model and through the Warehouse Model.

use Znck\Eloquent\Relations\BelongsToThrough;
 
class ProductInventory extends Model
{
use \Znck\Eloquent\Traits\BelongsToThrough;
 
// ...
 
public function warehouseLocation(): BelongsToThrough
{
return $this->belongsToThrough(WarehouseLocation::class, Warehouse::class);
}
}

Now, in the Controller or whenever you would use this warehouseLocation relationship as any other Eloquent relationship.


Another Example

The same CRM project has another similar example, with DB tables leads and lead_pipelines.

First, every lead belongs to a specific pipeline:

Schema::create('leads', function (Blueprint $table) {
$table->increments('id');
$table->string('title');
 
// ... more fields
 
$table->integer('lead_pipeline_id')
->unsigned()
->nullable();
$table->foreign('lead_pipeline_id')
->references('id')
->on('lead_pipelines')
->onDelete('cascade');
});

Then, every pipeline consists of stages:

Schema::create('lead_pipeline_stages', function (Blueprint $table) {
$table->increments('id');
 
// ... more fields
 
$table->integer('lead_pipeline_id')->unsigned();
$table->foreign('lead_pipeline_id')->references('id')->on('lead_pipelines')->onDelete('cascade');
});

Finally, we add the lead_pipeline_stage_id to the leads table:

Schema::table('leads', function (Blueprint $table) {
$table->integer('lead_pipeline_stage_id')
->after('lead_pipeline_id')
->unsigned()
->nullable();
$table->foreign('lead_pipeline_stage_id')
->references('id')
->on('lead_pipeline_stages')
->onDelete('cascade');
});

However, we still leave the "grandparent" lead_pipeline_id for convenient querying. So, devs can write queries on both columns, depending on the scenario.


Conclusion: while adding a foreign key from the grandparent table would technically violate the third normalization form, it should ultimately depend on your application. In some cases, it brings more benefit than harm.