There are situations when we're not sure what the columns of the DB table would be, they need to be flexible. For example, e-shop product properties: size, color, fabric, there may be more in the future. What is the best DB structure? I will show you three options and their performance.
We can define a separate DB table properties
and then a Pivot table product_property
with many-to-many relations, and an extra field for the value.
This DB structure implements the so-called EAV (Entity-Attribute-Value) Model.
database/migrations/xxx_create_product_property_table.php:
Schema::create('product_property', function (Blueprint $table) { $table->foreignId('product_id')->constrained(); $table->foreignId('property_id')->constrained(); $table->string('value');});
app/Models/Product.php:
class Product extends Model{ public function properties() { return $this->belongsToMany(Property::class) ->withPivot('value'); }}
Then, if we want to filter the products by specific few property values (like, "Size L and color red"), we can do this.
Controller:
Product::query() ->whereHas('properties', function($query) { $query->where('properties.name', 'size') ->where('product_property.value', 'L'); }) ->whereHas('properties', function($query) { $query->where('properties.name', 'color') ->where('product_property.value', 'red'); }) ->get();
Notice: you may skip the ::query()
, I add it personally to have the well-formatted following lines of code, for better readability.
This structure gives you the flexibility to add more properties in the future, without changing the query.
But there's a performance issue, we'll discuss it after we look at the other structure options.
We can put all properties just as an unstructured JSON column within the same DB table.
Schema::create('products', function (Blueprint $table) { $table->id(); $table->string('name'); $table->json('properties'); // ... $table->timestamps();});
Then, we can query the products by property values just like this:
Product::where('properties->size', 'L') ->where('properties->color', 'red') ->get()
Simple, isn't it? But wait until you see the performance results below.
Finally, the most straightforward approach: you just add the fields that you know for now, make each of them nullable
just in case, and then will add more fields to the same table as they appear.
Yes, it doesn't give you flexibility, but the question is this: do you REALLY need them to be that flexible? In my experience when working with clients, they often demand the system to edit things themselves, and then never really use the functionality.
So, in some cases, it may make sense to "hardcode" things for now, add a few fields when clients ask, and go for the flexibility only when you ACTUALLY need it at least a few times.
With that approach, the structure is just this:
Schema::create('products', function (Blueprint $table) { $table->id(); $table->string('name'); $table->string('size')->nullable(); $table->string('color')->nullable(); $table->string('fabric')->nullable(); $table->timestamps();});
The query is also as simple as this:
Product::query() ->where('size', 'L') ->where('color', 'red') ->get();
Finally, what is faster, and by how much?
In my benchmark test, I've seeded 1000 products with all three possible structures, and launched this code:
Illuminate\Support\Benchmark::dd([ fn() => Product::whereHas('properties', function($query) { $query->where('properties.name', 'size') ->where('product_property.value', 'L'); })->whereHas('properties', function($query) { $query->where('properties.name', 'color') ->where('product_property.value', 'red'); }) ->get(), fn() => Product::query() ->where('properties->size', 'L') ->where('properties->color', 'red') ->get(), fn() => Product::query() ->where('size', 'L') ->where('color', 'red') ->get(),], 5);
It measured all three queries 5 times, and here are the average results:
array:3 [ 0 => "6.688ms" 1 => "1.911ms" 2 => "1.612ms"]
I tried a few more times, but the results were pretty much the same:
whereHas
EAV approach is 3-4x slower than other structuresNot sure if you would draw the same conclusion from my benchmark, or maybe some queries may be optimized.
All in all, my suggestion would be to avoid the EAV model unless you REALLY need it. JSON columns are okay-ish in terms of performance, but depending on your (more complex?) structure may also be slower or less convenient to query.
The good old approach of just adding more fields to the same table may not sound "sexy" but if you go for simplicity and the quickest performance "for now", you should consider starting with that structure.