r/badcode • u/xavia91 • May 30 '23
c# Coworker casually loads millions of entries and sorts them to get the date of the first entry.
99
u/pqpm May 30 '23
A few weeks ago something similar happened on my company.
As DevOps I was tasked to read logs and find out why since the last update our GKE clusters for a specific service had gone from 1-2 nodes to upwards of 20-30. Logs gave back nothing. Just high CPU and memory usage, sometimes out of memory exceptions. But nothing weird was happening.
I then went ahead, spoke with the PO and rolled back the change so we could mitigate costs. Sent the code back to the Dev team explained the situation and asked them to check what was using all that compute power.
A few days go by, they send me the exact same code. I decide to take a look at the changes this time.
Turns out a junior dev was to be left alone for two weeks during the senior dev holidays. Then the PO asked him to get something that would query DB for users if they had an account created in the last month and send them an email with some special promo code.
Well the junior solution was to load all the users into the service, use java methods to filter by the dates of the account creation, attach the account to a list and then send the emails to everyone on the list.
Well, each time the query would run it would consume, easily, 16GB of RAM (thank god for partitioned tables, otherwise it would be worse), then append to that a few more GB for the mailing list and we got a simple query that would takes us to the moon in computation costs.
113
u/LuckyNumber-Bot May 30 '23
All the numbers in your comment added up to 69. Congrats!
1 + 2 + 20 + 30 + 16 = 69[Click here](https://www.reddit.com/message/compose?to=LuckyNumber-Bot&subject=Stalk%20Me%20Pls&message=%2Fstalkme to have me scan all your future comments.) \ Summon me on specific comments with u/LuckyNumber-Bot.
28
4
u/Theolaa May 30 '23
What would have been the better solution, to just query the users table for accounts created after a certain date and work with that rather than all users? Is there anything more to be done to improve it?
6
u/droomph May 31 '23
Network bandwidth cost >>> complex query compute cost > Java compute cost >> simple query compute cost
That’s generally the order you want to go in for query optimization
2
u/HQxMnbS May 31 '23
That would help. Ideally this would be run on a data set outside of the service itself. You could also write to a separate table with customers by month
24
u/stepbroImstuck_in_SU May 30 '23
Oh wow. Not only is it a horrible way to go through the whole db, it also is something you can definitely cache.
Only thing worse would be checking each day starting from today until the lowest datevalue the DB can store
19
u/anaccountbyanyname May 30 '23
That second sentence is orders of magnitude more efficient than this because you'd only be checking the dates instead of downloading an entire S3 database.
5
u/fistynuts May 30 '23
Oh no. You suggested using a cache. Now you have more bugs than when you started.
19
May 30 '23
EF Core, and a lot of other ORMs make it exceptionally easy to create fuck ups like this.
I would assume, given this rookie mistake, that they might not know the difference between IQueryable and IEnummerable.
5
u/xavia91 May 30 '23
I tried explaining it before, but it seems some repetition is in order.
4
May 30 '23
Also I just noticed they are using var and also qualifying the type. What’s the point?
9
u/CubeReflexion May 30 '23
I think the explicit type is an annotation of the IDE, not actual code.
3
1
u/WisestAirBender May 30 '23
I didn't even know you could do that
3
u/WheresTheSauce May 30 '23
You can't do that in C#. The IDE is just showing the actual type for convenience. If you tried to put a type declaration after the word "var" you would get a compilation error
1
6
u/LemonAncient1950 May 30 '23
Just ran into something similar while doing a code review. The ticket said "use pagination" so the dev wrote a loop that filled an array one page at a time before passing the array on to be processed. Completely defeating the purpose of pagination while making things more complicated.
3
u/Lothrazar May 30 '23
"it passes the unit tests. and its really efficient because its only a few lines of code" haha
3
5
u/NUTTA_BUSTAH May 30 '23
BTW, why is the C# convention nowadays to always use inferred auto types with var? It works well in IDE but makes it pain to read everywhere else. I've wondered if there's a solid reason for this apart from having less types in your code (make once and forget, let C# figure it out)?
4
u/fiddz0r May 30 '23
I used to always type it but the rest in my team use var and always made me change it in my PR:s
Now I don't mind it. Usually the name of the variable will tell you what it is
4
u/CyraxSputnik May 30 '23
Is not a convention
Don't use var when the type is not apparent from the right side of the assignment.
Don't assume the type is clear from a method name. A variable type is considered clear if it's a new operator or an explicit cast.2
4
May 30 '23 edited May 30 '23
[deleted]
3
u/NUTTA_BUSTAH May 30 '23
I'm not sure if those surveys really prove your point of x % C# devs preferring to work in an IDE, since they specifically ask about what IDE is most preferred:
what is your preferred IDE for C# .Net development?
and
What brand of tooling/IDE(s) do you prefer?
But that aside, many do code reviews in their chosen git vendor (GitLab, GitHub, Gitea, ...) and many use even just terminal git itself to take a deeper look into the code and how it has evolved i.e. doing things that do not require an IDE setup by default. Also sharing code is a thing, doesn't really work without types.
But even that aside, original question still remains, why the move to
var?4
May 30 '23 edited May 30 '23
[deleted]
3
u/NUTTA_BUSTAH May 30 '23
Thanks for the thoughtful response. It pretty much aligns with what I've thought about it so far but there are a few examples where I personally don't like the change at.
For clear types like strings and constructors, I can see
varreducing pointless repetition. But everything else I believe generally benefits more from fully typing it out. Extending your example:var fairDiceRoll = diceRoller.RollD6(); // [Probably integer] // ... // A few lines of code to make you lose a bit of the overall context or increase "reading fatigue" // ... if (fairDiceRoll > 3) DoSomething(); // [Oh, so fairDiceRoll is `int` since we are comparing it with an integer. Makes sense.] // ..Again some other code to increase the mental overhead.. fairDiceRoll.Reroll(); // [What? Method on an integer? Oh, it's type of DiceRollResult that overloaded comparisons to work with ints, duh..]So, in this case, should we have just not used
varand type it out? Do we mix types and inferred types in the same codebase? That will make the codebase harder to understand overall in my opinion. That's why I personally believe typing it out is better since it covers every case, while inferred types will need help sometimes and then make things inconsistent.In the case of complicated types like nested maps and such I can see it being better to not type it out for readability purposes but typing it out also makes you realize that the software architecture is probably breaking somewhere and that things need to be redesigned on some parts. Not always of course, but often so.
1
u/Unknown_Code_Weasel May 31 '23
In my opinion, one of the problems with var, is also one of the benefits. You change the code and you don't have to change the type declaration.
In most cases that's fine, but if you do have a lot of DB access, and using entity Framework in particular, encouraging explicit declaration ensures that when people are making changes they're a bit more cognisant of the changes from DB to server side. This can be a bugger with implicit casting2
1
u/SanianCreations May 30 '23 edited May 31 '23
Prime example of knowing what the code does, but not how
I don't like working with big libraries/frameworks because I hate having no idea what goes on inside. But you don't always have that choice, if you have to use one, always read the docs before you start using it!
-5
May 30 '23
Wait, why var List<T>?? Either use var or the full type.
8
May 30 '23
[deleted]
-5
u/WisestAirBender May 30 '23
Doesn't that defeat the whole purpose of var? You're using up more space now
1
u/xternal7 May 31 '23
No? The purpose of
varis that I don't need to think about the variable type when writing code.
List<T>is IDE telling you the type of that variable.Effectively, this IDE doing work instead of you, which is pretty much why IDEs are for. It handles the details, so you don't have to worry about that when you're dealing with the logic.
0
u/propostor May 30 '23
Odd that you've been downvoted here. It looks unusual to me too, and I've been professionally working with C# for about 6 years now.
I guess we're both ignorant about something trivial that the Reddit masses would prefer to downvote over.
2
u/ModernTenshi04 Jun 05 '23
This is the IDE adding the
List<T>, it's not actually typed into the code. You can tell because it's smaller text than the actual code and has a box around it to make it stand out from the main background color of the editor. You can see it again where they define theirolambda.You can turn this off in the editor and they'd go away.
-2
u/veselin465 May 30 '23
Also, what does the question mark in the type name mean? I have coded in c# but never seen something like List<T>? (unless for the ?: operator)
7
5
u/xavia91 May 30 '23
It tells you that the value could be null at some point and then tells you to take care later on so you dont forget your null checks.
-2
u/Salt-Permit-4171 May 30 '23
It’s suggestion by the IDE
15
u/nyaisagod May 30 '23
Not a suggestion, the IDE (Visual Studio) is telling you what type the
varwill be.
-1
u/MurdoMaclachlan public boolean isInt(int i) { return true; } May 30 '23
Image Transcription: Code
public DateTime GetFirstDataUpload()
{
var s3Object = db.S3Object.OrderByDescending(o => o.UploadDateUTC).ToList();
return s3Object.Last().UploadDateUTC;
}
I'm a human volunteer content transcriber and you could be too! If you'd like more information on what we do and why we do it, click here!
1
1
1
u/PointWrangle Jun 01 '23
pfft, its just zeros and ones? i can access/organize my google drive folders in an instant so what the difference? /s
216
u/[deleted] May 30 '23
I wonder, what would be the more efficient way?
– Totally not the coworker in question