Functions should be small and do one thing […] you end up with a slew of tiny functions scattered around your codebase (or a single file), and you are forced to piece together the behaviour they exhibit when called together
I believe you have a wrong idea of what “one thing” is. This comes together with “functions should not mix levels of abstraction” (cited from the first blog entry you referenced). In a very low-level library, “one thing” may be sending an IP packet over a network interface. Higher up, “one thing” may be establishing a database connection. Even higher up, “one thing” may be querying a list of users from the database, and higher up yet again is responding to the GET /users http request. All of these functions do ‘one thing’, but they rely on calls to a few methods that are further down on the abstraction scheme.
By allowing each function to do ‘one thing’, you decompose the huge problem that responding to an HTTP request actually is into more manageable chunks. When you figure out what a function does, it’s way easier to see that the function connectToDb will not be responsible for why all users are suddenly called "Bob". You’ll look into the http handler first, and if that’s not responsible, into getUsersFromDb, and then check what sendQuery does. If all methods truly do one thing, you’ll be certain that checkAuthorization will not be related to the problem.
Tell me if I just didn’t get the point you were trying to make.
Edit: I just read
Martin says that functions should not be large enough to hold nested control structures (conditionals and loops); equivalently, they should not be indented to more than two levels. He says blocks should be one line long, consisting probably of a single function call. […] Most bizarrely, Martin asserts that an ideal function is two to four lines of code long.
If that’s the standard of “doing one thing”, then I agree with you. This is stupid.
The “Don’t repeat yourself” mantra is also used with documentation, this leads to documentation which you first have to read and learn unless you frequently want to step into issues of the documentation assumed you read prior parts and didn’t just searched how to do XYZ.
I’m not a fan of this kind of code fragmentation.
If all those actions were related and it could have been just one thing, retaining a lot more context, then it should be one function imo.
If by not splitting it it became massive with various disconnected code blocks, sure, but otherwise I’d much prefer being able to read everything together.
If splitting the functions required producing side effects to maintain the same functionality, then that’s even worse.
Huh, I really like code like that. Having a multi-step process split up into sections like that is amazing to reason about actual dependencies of the individual sections. Granted, that only applies if the individual steps are kinda independently meaningful
This allows you to immediately see that part1 and part2 are independently calculated, and what goes into calculating them.
There are several benefits, e.g.:
if there is a problem, you can more easily narrow down where it is (e.g. if part2 calculates as expected and part1 doesn’t, the problem is probably in function1, not function2 or function3). If you have to understand the whole do_stuff before you can effectively debug it, you waste time.
if the function needs to be optimized, you know immediately that function1 and function 2 can probably run in parallel, and even if you don’t want to do that, the slow part will show up in a flame graph.
It really depends on the context frankly. I did say it was a crappy example ;)
Try to read this snippet I stole from Clean Code and tell me if it’s readable without having to uselessly jump everywhere to understand what’s going on:
I believe you have a wrong idea of what “one thing” is. This comes together with “functions should not mix levels of abstraction” (cited from the first blog entry you referenced). In a very low-level library, “one thing” may be sending an IP packet over a network interface. Higher up, “one thing” may be establishing a database connection. Even higher up, “one thing” may be querying a list of users from the database, and higher up yet again is responding to the
GET /users
http request. All of these functions do ‘one thing’, but they rely on calls to a few methods that are further down on the abstraction scheme.By allowing each function to do ‘one thing’, you decompose the huge problem that responding to an HTTP request actually is into more manageable chunks. When you figure out what a function does, it’s way easier to see that the function
connectToDb
will not be responsible for why all users are suddenly called"Bob"
. You’ll look into the http handler first, and if that’s not responsible, intogetUsersFromDb
, and then check whatsendQuery
does. If all methods truly do one thing, you’ll be certain thatcheckAuthorization
will not be related to the problem.Tell me if I just didn’t get the point you were trying to make.
Edit: I just read
If that’s the standard of “doing one thing”, then I agree with you. This is stupid.
The “Don’t repeat yourself” mantra is also used with documentation, this leads to documentation which you first have to read and learn unless you frequently want to step into issues of the documentation assumed you read prior parts and didn’t just searched how to do XYZ.
Yeah that was essentially what I was referring to (referring to your edit).
I generally dislike stuff like (crappy example incoming):
I’m not a fan of this kind of code fragmentation.
If all those actions were related and it could have been just one thing, retaining a lot more context, then it should be one function imo.
If by not splitting it it became massive with various disconnected code blocks, sure, but otherwise I’d much prefer being able to read everything together.
If splitting the functions required producing side effects to maintain the same functionality, then that’s even worse.
Huh, I really like code like that. Having a multi-step process split up into sections like that is amazing to reason about actual dependencies of the individual sections. Granted, that only applies if the individual steps are kinda independently meaningful
To adapt your example to what I mean:
This allows you to immediately see that part1 and part2 are independently calculated, and what goes into calculating them.
There are several benefits, e.g.:
It really depends on the context frankly. I did say it was a crappy example ;)
Try to read this snippet I stole from Clean Code and tell me if it’s readable without having to uselessly jump everywhere to understand what’s going on:
That’s what I was talking about.