Is your abstraction worth it?
I would like you warn that this is somewhat lengthy post on my experience.
Context
Currently, I am working as a backend developer in our independent project.
During the development process, I observed that some queries were quite repetitive. Then, I also had to write a dedicated filtering functionality that creates chained select statements for applying all filters at once. Again, such filters turned out to be repetitive, and I was thinking about utilizable functions. Furthermore, in our DB scheme, we have tables that have a multi-language field (and thus several of it for specific languages). So, it made the filtering somehow more complicated.
As I had the mindset of "utilizable functions", or tooling, or just "as I see these queries repeat, I can create functions we can use in future too" mindset, I built a whole DB tooling.
Examples of this tooling
def query_scalars_with_attrs(
*,
model,
session: Session,
attrs_vals: dict[str, Any],
base_s: Select | None = None,
filter_type: QueryTypes = QueryTypes.FILTER_EQUAL,
validate: bool = True,
) -> ScalarResult:
"""
A helper function that applies multiple attribute
filtering and queries the resultant `Select`.
"""
...
This is a top-level function that abstracts applying filters and querying. Under the hood, well, it uses....
def query_scalars(
*, model, session: Session, base_s: Select | None = None, validate: bool = True
) -> ScalarResult:
"""
A function to apply `scalars` query on a given
base_s `Select`.
"""
...
.... another abstraction just for calling session.scalars!
Back to the first function. It only uses the second funtion at the end of its logic, after tasks like creating select statements are done. How does it make up select statements? Well, if you remember that its job is also filtering, it uses...
def select_with_filter_multi_attr(
*,
model,
attrs_vals: dict[str, Any],
base_s: Select | None = None,
filter_type: QueryTypes = QueryTypes.FILTER_EQUAL,
validate: bool = True
) -> Select:
"""
A function to apply ORM filtering
functions. An ORM filtering is
detected based on 'filter_type'.
"""
...
... just another abstraction!
Now, what about these functions below:
def orm_filter_eq_multi_attr(
*, model, base_s: Select, attrs_vals: dict[str, Any], validate: bool = True
) -> Select:
"""
A function to apply FILTER EQUAL in SQL.
...
"""
...
def orm_ilike_multi_attr(
*, model, base_s: Select, attrs_vals: dict[str, Any], validate: bool = True
) -> Select:
"""
A function to apply ILIKE in SQL, with
'%value%' format.
...
"""
...
Cringe, yeah? By the way, there's also an alternative to select_with_filter_multi_attr with support for multi-lingual query.... :)
But, wait, are not they doing some work? How did I decide this all was "cringe"? Well...
The reality
Let's compare the functions and their alternatives in SQLAlchemy.
# Ex1
# WITH the abstraction
author = db_tools.query_scalars_with_attrs(
model=User,
attrs_vals={"public_id": author_public_id},
session=self.session,
).first()
# WITHOUT the abstraction
author = session.scalar(
select(User).filter(User.public_id == author_public_id).limit(1)
)
A whole function just for literally a line of code (in fact).
# Ex2
# WITH the abstraction
query_result = db_tools.query_scalars_with_attrs(
model=User,
attrs_vals=attrs_vals,
session=self.session,
filter_type=QueryTypes.ILIKE,
).first()
# WITHOUT
query_result = session.scalar(
select(User)
.where(User.name.ilike(f"%{name}%"))
.where(User.surname.ilike(f"{surname}"))
)
Quite elegant raw SQLAlchemy, yeah? Here is another example...
# Ex3
# WITH the abstraction
orm_s = db_tools.select_with_filter_multi_attr(
model=Post,
attrs_vals={
"for_students": self.filters.for_students
},
base_s=orm_s
)
# WITHOUT
orm_s = select(Post).filter(Post.for_students == self.filters.for_students)
... This one was just nuclear, where one would become just speechless!
Consider I wanted to be consistent in my query tooling. Then, with my abstraction, I would create another layer for other developers to learn my thing. Moreover, I would restrict everyone to what my functions did, what they accepted, etc. (If we want to be consistent.)
As we see from the comparisons, the core technologies (in this case SQLAlchemy) already provide very good abstractions.
Actually, I was very enthusiastic when was developing this tooling. Then, after such real comparison, I was just shocked --- I was doing all the work just to remove it. The real reality vs. expectation experience.
The reason I looked back and questioned what I have created
Before I compared my functions and the raw SQLAlchemy abstractions, I think the reason I looked back at my tooling was that the abstraction was rigid when I moved to implement a pagination functionality. Simply, extending the functions was too much change as, crystal clear from the previous sections, the functions depended on another one by hierarchy.
So, this presented, "Wait, what I have done just? Is it worth?".
Besides the arguments I made for removing, you can note that these functions did too much validation. It's just an overhead (well, even though I made the validation possible to turn off).
But, was this really useless? I think no. For two reasons.
Multi-lingual select and basic multi-attr select
After the realization, I migrated to raw SQLAlchemy select/queries. I cut my abstractions down to the database tooling and isolated it from other modules. Afterward, I started working on the filtering system.
When I was implementing filtering for multi-lingual fields, I realized that multi-lingual select abstraction was worth it.
The reason is that manually writing raw SQLAlchemy would create boilerplate. We can do multi-lingual filter across languages and find the best matches.
The abstraction exactly does that: it creates multiple select statements. The good part is that it can automatically check specific fields using a given "root" of these fields. If you want, you can specify priority language and also if the function should return a select statement for only that language.
I admit that I could eliminate that multi-field search enforcing a rule that a client must always specify the language they intend, or otherwise set a default language. Actually, the filtering system that uses this abstraction does so when calling it. Who knows, maybe there will be use of this clever function. Anyways, the abstraction is now isolated at the DB tooling level.
As for multi-attr select, I made its function signature start with "_", implying it is not intended for an external use. It is used by multi-lingual select function. Although I can remove, I think it is for now good since it keeps the client function becoming too much lengthy.
I left another abstraction too. First, it is used by the multi-attr select function. Second, I didn't change the usage as in the Ex2. comparison in The reality section. It is just for now.
Now, the next reason is...
The lesson
I was almost devastated after realizing that I have spent at least a day working on that tooling. I had completely been driven by it and never had questioned. So, it taught a really big lesson.
Here are some insights:
- Question your abstractions. Repeating things does not always mean to create another abstraction.
- Don't be too much driven by enthusiasm. Try to keep rationality.
- If you are not sure, compare your abstraction with the alternative without it.
Happy that I had this lesson. Thanks for reading!