I’ll give an example. At my previous company there was a program where you basically select a start date, select an end date, select the system and press a button, and it reaches out to a database and pulls all the data that matches those parameters. The horrors of this were:
- The queries were hard coded.
- They were stored in a configuration file, in XML format.
- The queries were not one entry each. Each was four: a start, the part between start date and end date, the part between end date and system, and then the end part. All of these were then concatenated in the program, intermixed with variables (see the sketch after this list).
- This was then sent to the server as pure SQL, no ORM.
- Here’s my favorite part. You obviously don’t want anyone modifying the configuration file, so they encrypted it. Now I know what you’re thinking: at some point you’ll probably need to modify or add to the configuration, so you store an unencrypted version in a secure location. Nope! The program had the ability to encrypt and decrypt, but there were no visible buttons to access those functions. The program was written in WinForms. You had to open the program in Visual Studio, manually expand the size of the window (its size is locked in regular use), and that reveals the buttons. Now run the program in debug. Press the decrypt button. DO NOT EXIT THE PROGRAM! Edit the file in a text editor. Save the file. Press the encrypt button. Copy the encrypted file to any other location on your computer. Close the program. Manually email the encrypted file to anybody using the file.
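To make the concatenation horror concrete, here’s a rough reconstruction in Java (the original wasn’t Java, and the fragment, table and column names are all made up): four fragments pulled from config, glued together around raw values, and shipped to the server as-is. A single parameterized query with bind variables would have replaced all of it.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

class ConfigQueryRunner {
    // Hypothetical fragments as they might have sat in the XML config:
    //   part1 = "SELECT * FROM readings WHERE taken_at >= '"
    //   part2 = "' AND taken_at <= '"
    //   part3 = "' AND system_id = '"
    //   part4 = "'"
    static String buildQuery(String part1, String part2, String part3, String part4,
                             String startDate, String endDate, String system) {
        // Config fragments concatenated with user-supplied values: unreadable,
        // impossible to validate, and wide open to SQL injection.
        return part1 + startDate + part2 + endDate + part3 + system + part4;
    }

    static void run(Connection conn, String sql) throws Exception {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) { // the raw string goes straight to the server
            while (rs.next()) { /* consume rows */ }
        }
    }
}
```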
Not mine, but the SVN-based JDSL story is the best related one, always worth sharing.
Based on things I’ve seen I can actually believe this is real. Just goes to show that you can’t trust everyone to have a functional intuition for separating horrible ideas from good ones.
Whatever I’m working on 💪
Ok so this one is someone trying to move to “the cloud.”
They had a database they used. It was on a server in the office. We were tasked to clone the DB server to a hosted VM. Due to the order of creation, this got put on a new host without anything else on it yet.
They needed a site-to-site VPN to keep things private; that was all fine. However, after the clone and during testing, their guy said that this one part was really slow. We took a look and everything was fine with the performance of the server and of the VPN, so I had to pop on and take a look myself.
It was in an Office app and written in VB. (I forget which one.) It was indeed slower on the hosted server. So I took a look at the function (he pulled it up for me) and I could instantly tell the issue.
This part was a lookup page that searched for your input. The function retrieved the entire table, then filtered the results in the client. I explained that transferring the whole table over the internet would be slower than over the local LAN.
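The original was VB in an Office app, but the anti-pattern translates to anything. Here’s a minimal sketch of the difference in Java/JDBC, with made-up table and column names:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

class CustomerLookup {
    // What the app was doing: pull the whole table, filter on the client.
    static List<String> searchSlow(Connection conn, String term) throws Exception {
        List<String> hits = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement("SELECT name FROM customers");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {                         // every row crosses the wire
                String name = rs.getString("name");
                if (name.contains(term)) hits.add(name);
            }
        }
        return hits;
    }

    // The fix: let the database do the filtering and send back only the matches.
    static List<String> searchFast(Connection conn, String term) throws Exception {
        List<String> hits = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT name FROM customers WHERE name LIKE ?")) {
            ps.setString(1, "%" + term + "%");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) hits.add(rs.getString("name"));
            }
        }
        return hits;
    }
}
```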
This guy said he originally wrote this, but “forgot VB.”
In the end they decided not to update the app or keep the server in the office, but instead they rented some VDIs in the same data centre as the db.
I saw a talk recently (I can find the video if you like, but I’m pretty sure it was at the most recent ND conference) where they made the point that a lot of the inefficiency in modern code is because of large companies. Basically, in a lot of cases it’s more important to get a product out ASAP than to care whether it was done well. OK, a poorly written program may cost an extra $10,000 a month to run, but if it earns them a million a month and saves 6 months of development time, it pays for itself and they can eat the cost.
This seems like the case with renting VDIs instead of fixing the program.
Sounds like he didn’t have much to forget
The encryption thing is definitely weird/crazy and storing the SQL in XML is kinda janky, but sending SQL to a DB server is literally how all SQL implementations work (well, except for sqlite, heh).
ORMs are straight trash and shouldn’t be used. Developers should write SQL or something equivalent and learn how to properly use databases. eDSLs in a programming language are fine as long as you still have complete control over the queries and all queries are expressible. ORMs are how you get shit performance and developers who don’t have the first clue how databases work (because of leaky/bad abstractions trying to pretend databases don’t require a fundamentally different way of thinking from application programming).
ORMs are a way to seamlessly handle the model layer of a codebase. But I agree.
On my first big project (Symfony, with the Doctrine ORM), we had to write several SQL queries by hand due to the complexity of the databases here and there. So we were kept on our toes when it came to database knowledge haha
Except it’s not seamless, and never has been. ORMs of all kinds routinely end up with N+1 queries littered all over the place, and developers using ORMs don’t understand the queries being performed or what the optimal indexing strategy is. And even if they did know what the performance issue was, they can’t even fix it!
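For anyone who hasn’t hit it: the N+1 pattern looks roughly like this (plain JDBC with hypothetical tables here, but an ORM produces the same traffic whenever a lazily loaded collection gets touched inside a loop):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

class NPlusOneExample {
    static void loadOrders(Connection conn) throws Exception {
        // One query for the parent rows...
        try (PreparedStatement parents = conn.prepareStatement("SELECT id FROM orders");
             ResultSet rs = parents.executeQuery()) {
            while (rs.next()) {
                long orderId = rs.getLong("id");
                // ...then one more query per parent: N+1 round trips in total.
                try (PreparedStatement items = conn.prepareStatement(
                        "SELECT sku, qty FROM order_items WHERE order_id = ?")) {
                    items.setLong(1, orderId);
                    try (ResultSet irs = items.executeQuery()) { /* consume rows */ }
                }
            }
        }
        // A single JOIN does the same work in one round trip:
        //   SELECT o.id, i.sku, i.qty FROM orders o JOIN order_items i ON i.order_id = o.id
    }
}
```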
Beyond that, because of the fundamental mismatch between the relational model and the data model of application programming languages, you necessarily induce a lot of unneeded complexity with the ORM trying to overcome this impedance mismatch.
A much better way is to simply write SQL queries (sanitizing inputs, ofc), and for each query you write, deserialize the result into whatever data type you want to use in the programming language. It is not difficult, and greatly reduces complexity by allowing you to write queries suited to the task at hand. But developers seemingly want to do everything in their power to avoid properly learning SQL, resulting in a huge mess as the abstractions of the ORM inevitably fall apart.
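A minimal sketch of that approach, assuming a Java codebase and a made-up invoices table: one hand-written, parameterized query shaped for exactly the task at hand, with the result mapped straight into a small record.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

record Invoice(long id, String customer, double total) {}

class InvoiceQueries {
    static List<Invoice> openInvoices(Connection conn, String customer) throws Exception {
        List<Invoice> result = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT id, customer, total FROM invoices " +
                "WHERE customer = ? AND paid_at IS NULL")) {
            ps.setString(1, customer); // bind parameter, no string concatenation
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    result.add(new Invoice(
                            rs.getLong("id"),
                            rs.getString("customer"),
                            rs.getDouble("total")));
                }
            }
        }
        return result;
    }
}
```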
This was then sent to the server as pure SQL, no ORM.
ORMs are overrated.
Yeah, but simply using Entity Framework would have made the configuration file just a list of systems.
A (poorly written) shell script checked whether the process was able to write to the production database, and in some (not all) cases it threw this gem:
!!! SQL ERROR !!!
At least it’s descriptive
At a small company I used to work for, we agreed to take over the management system for someone trading physical resources. The guy who originally wrote it was self-taught. We did a handover with him where he took us through the code base. It was written in dotnet, but it was a huge mess: he had blended multiple different dotnet paradigms, business and UI code were mixed all over the place, large chunks of HTML were stored in the DB, and DB code was just scattered through the application. We took it over briefly, but it was a nightmare to work on and we found a SQL injection vulnerability. So, as kindly as possible, we told the client that his software was a piece of shit and the dev he hired had no idea what he was doing.
What was the final result? Did you cancel the contract or re-write the whole thing?
We finished working on what we had already agreed to do and then cancelled the contract; the client was quite understanding.
First of all, lack of an ORM isn’t bad. Using one or not using one isn’t inherently good or bad. What’s bad is not sanitizing your query inputs, and you don’t need an ORM to do that.
I think the worst thing I’ve seen is previous devs not realizing there’s a cost to opening a DB connection, especially back when DBs were on spinning rust. So the report page that ran one query to get all the items to report on, then for each row ran another individual query to get that row’s details, was probably one of the slowest reports I’ve ever seen. Every DB round trip was at minimum 0.1 seconds just to open the connection, run the query, send back the data, then close the connection. So about 10 rows per second could be returned. Thousands of rows per page had people waiting several minutes and tying up our app server. A quick refactor to run 2 queries instead of hundreds to thousands, and I was a hero for 10 minutes till everyone forgot how bad it was before I fixed it.
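The refactor was roughly this shape (a reconstructed sketch with hypothetical table names, not the actual code): one query for the report items, one query for all their details, stitched together in memory.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReportLoader {
    // Two round trips total, instead of one per report row.
    static Map<Long, List<String>> loadDetails(Connection conn) throws Exception {
        // Query 1: the items to report on.
        Set<Long> itemIds = new HashSet<>();
        try (PreparedStatement ps = conn.prepareStatement("SELECT id FROM report_items");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) itemIds.add(rs.getLong("id"));
        }

        // Query 2: all the details in one go, grouped per item in memory.
        Map<Long, List<String>> detailsByItem = new HashMap<>();
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT item_id, detail FROM report_item_details");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                long itemId = rs.getLong("item_id");
                if (itemIds.contains(itemId)) {
                    detailsByItem.computeIfAbsent(itemId, k -> new ArrayList<>())
                                 .add(rs.getString("detail"));
                }
            }
        }
        return detailsByItem;
    }
}
```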
It’s the round trips that kill you.
Oracle drivers for .NET are fun. We have a client application which uses quite a lot of data, and some queries fetch a few thousand rows. It was way too slow for any larger query; it turns out that for the batch-query kind of work we do, the default FetchSize for Oracle is just a performance killer. Just throw it up to 128 MB and it doesn’t really hurt at all.
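For comparison, the analogous knob on the JDBC side (this is Java, not the ODP.NET property from the story, and JDBC’s fetch size is a row count rather than a byte budget like the 128 MB setting above):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

class BatchFetch {
    static void scan(Connection conn) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT id, payload FROM measurements")) {
            // Oracle's JDBC driver defaults to 10 rows per round trip; for a query
            // streaming thousands of rows, that means a round trip every 10 rows.
            ps.setFetchSize(5000); // fetch in much larger batches
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) { /* process row */ }
            }
        }
    }
}
```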
The worst thing I’ve seen though, apart from the 150-line dynamic SQL stored in our database, was probably a page in our program that loaded about 150 rows from the database. Normally we do create a new connection for each query, but that’s fine since Oracle has a connection pool; whatever millisecond that costs is trumped by the round trip. But imagine a UI so badly written that it did 4 separate database queries for EACH row it loaded into the UI list. Useless things like fetching a new ID for the row in case it was changed, reading some data for the row I think, and more. The thing took a solid minute to load. There were so many bad patterns in that page that even the PR for improving the speed was just dealing with a mess, because you couldn’t rewrite the entire thing, so they had to make it work within the constraints. Horrible thing to work with.
Christ
Do you think he would have better coding standards?
This might require a bit of background knowledge about Power Query in Excel and Power BI, specifically the concept Query Folding.
Power Query is a tool to define and run queries against a host of data sources and spit out tabular data for use in Excel (as tables) or Power BI (as a Tabular Data Model). The selling point is the low-code graphical presentation: you transform the data by adding steps to the query, mostly through the menu ribbon. Change a column type? Click the column header > Data Type > select the new type. Perform a join? Click “Merge Queries”, select the second query, select the respective key column(s) to join on and the join type – no typing needed. You get a nested table column from which you can then select which columns to expand or aggregate.
Each step provides you with a preview of the results, and you can look at, edit, delete or insert earlier steps at will. You can also edit individual steps or the whole query through a code editor, but the appeal is obviously that even non-programmers can use it without needing to code.
Of course, it’s most efficient to have the transformations done by the database server as SQL already. But Power Query can do that too: “Query Folding” is the feature that automatically turns a sequence of Power Query steps into native SQL. A sequence like “Source, Select Columns, Filter Rows, Rename Columns” will quite neatly be converted into the SQL equivalent you’d expect. Merges become JOINs, appending tables becomes UNION, converting text to uppercase becomes UPPER, and so on.
If at some point there is a step it can’t fold, it will use a native query to load the data up to that point, then do the rest in memory. Even if later steps were foldable, they’ll have to be done in memory. You can guess that this creates a lot of potential for optimising longer queries by ensuring as much of it as possible is folded and that the folded result is as “small” as possible – as few rows and columns as feasible, etc.
Now, when I tell you that there is a table in one of our sources with a few large text columns you almost never need, you may be able to smell the smoke already. A colleague of mine needed help with his queries being slow to load. He had copied some code from Stackoverflow or what have you that joins a query with itself multiple times to resolve hierarchies. In theory, it was supposed to be foldable, provided the step it runs off of is. The general schema of my colleague’s query went Data Source -> non-foldable type conversion -> copied code -> filtering (ultimately keeping about 20% of rows) -> renaming columns -> removing columns. Want to guess which columns were loaded, processed with each join, explicitly renamed and only then finally understood to be useless and discarded?
“I always do the filtering last, don’t want to miss anything.”
This is your regularly scheduled reminder that MS (and our corporate BI team) can present Power Query as a self-service data transformation tool all it wants; that still doesn’t mean it’s actually designed for use by non-data techies.
I’ll consider myself lucky that the worst I’ve had to deal with was an 8K-LOC C file that implemented image processing for a cancer detection algorithm. Nothing terribly tricky, just poorly organized. Almost no documentation at all. The only test was running the code against a data set of patient images and eyeballing the output. No version control other than cloning the project onto their NAS and naming it “v2”, etc.
Research code can be really scary.
At my job there’s a class method that’s longer than that.
Get out
My current workmate unironically names his variables “cat1”, “cat2”, etc.
He also didn’t know about git, so before I arrived, he uploaded the code to production with scp.
Finally, my boss told me that he is the priority, so if he doesn’t understand git, we won’t keep using it. I would understand if this were about a different language, but it’s git vs. scp we’re talking about.
The original Doom source code. The codebase is very messy and looks like my grandpa’s scribbles on paper.
Speaking as an old person, back then they didn’t have the same concerns. Security? Ehh just don’t let bad guys access your computer.
Yeah a lot of old programs are either great programming or terrible.
Joined a new team, and one of my first tasks was a refactor of a shared code file (Java) that was littered with data validations like

```java
if ("".equals(id) || id == null) { throw new IllegalArgumentException(); }
```

The dev who wrote it was clearly trying to make sure the string values were populated, but apparently they A) didn’t think to just put the null check first so they didn’t have to write their string comparison so terribly, or else didn’t understand short-circuiting, and B) didn’t know any other null-safe way to check for an empty string, like, say, StringUtils.isEmpty().
I mean… That’s bad, but not on the same scale as some of these other issues.
Sure. There were worse problems too: SQL injection vulnerabilities, dense functions with hundreds of lines of spaghetti code, absolutely zero test coverage on any project, etc. That’s just the easiest one to show an example of, and it’s also the one that made me flinch every time I saw it.
"".equals()😨If it makes you feel better at my last company I asked the “senior validation specialist” what the validation path would be for a program which incorporated unit tests.
The answer I got was “what’s a unit test?”
🥲
There was something like
```python
# sleep for about a second on modern processors
math.factorial(10000)
```

After it was found, we left it in the code but commented out, along with a

```python
sleep(1)
```

for posterity.

That’s at least pretty creative
I saw one where the program ran a busy loop on startup to calculate how long it took. Then it used that as an iterations-to-seconds conversion for busy loops between scheduled actions.
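Roughly the idea, as a sketch rather than the original code: time a fixed number of iterations at startup, then convert seconds into loop iterations for every “sleep”.

```java
class BusyWait {
    static long iterationsPerSecond;
    static volatile long sink; // keeps the busy loops from being optimized away

    // "Calibration" at startup: time a fixed number of busy iterations.
    static void calibrate() {
        final long probe = 100_000_000L;
        long start = System.nanoTime();
        for (long i = 0; i < probe; i++) {
            sink += i;
        }
        long elapsedNanos = Math.max(System.nanoTime() - start, 1);
        iterationsPerSecond = probe * 1_000_000_000L / elapsedNanos;
    }

    // "Sleep" by spinning: burns a whole core and drifts with CPU load and clock speed.
    static void busySleep(double seconds) {
        long target = (long) (seconds * iterationsPerSecond);
        for (long i = 0; i < target; i++) {
            sink += i;
        }
    }
}
```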
In the readme: if you want this program to be usable, press the turbo button until the turbo light is OFF.





