thelldev
posted this

Stupid Bits From Waterfall's Code Over The Last Eighteen Months

I'm going to be honest, I know you're meant to use title case in titles but I don't actually know which words you're not meant to capitalise

I was dealing with a bug report in the Discord earlier and it hit me - all currently known bugs are artefacts of some code I wrote over a year ago and shoved in a file, never to be touched again.

Or, well, no, that's a little bit of an exaggeration. But not much - working on Waterfall has taught me a lot over the last year, so I thought, while I'm working on the API rewrite, it might be amusing to go over some of the more stupid bits of code I've written, the bad decisions, and just things that made me laugh.

All the dumb shit here is fixed in the API rewrite.


1. Have I reblogged this?

The bug report that spawned this post is this one, actually. Someone knows they've reblogged something, but the button never turns green. Annoying, since, other than XKit, we were the first Tumblelog site to implement that feature to my knowledge. Let's look at what could be the problem, shall we?

Notes are stored in their own table in the database, and it's currently over a million entries long. If something is meant to be displayed to a user, it comes from this table. In the case of detecting a reblog, it searches a combination - has [blog ID] reblogged [post source ID]? If yes, it'll be green.

The problem here arises from the simple fact that the notes table is not expected to be 100% consistent. In fact, it's the lowest priority part of the site - while obviously it's not desired, if some notes don't get logged, that's considered fine. Avoid it if possible, but if it happens, no big deal. All the data to reconstruct the table is elsewhere anyway. We could nuke it and have every entry regenerated in less than an hour if we needed to.

And there's the problem. Searching the notes table is inefficient and stupid, especially since it's not consistent. What we SHOULD be doing is searching the Posts table, and asking whether that blog has reblogged a post with the same source post. About 5x faster, and significantly more accurate.
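For the curious, here's a rough sketch of what that Posts-table lookup could look like, in Python with an in-memory SQLite database standing in for the real one. The table name, column names, and the has_reblogged helper are all made up for illustration; the actual schema isn't shown in this post.

```python
import sqlite3

# Hypothetical schema: the real Posts table's columns are an assumption here.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER NOT NULL,
        source_post_id INTEGER  -- NULL when the post is an original
    )
    """
)
# An index on the pair being searched keeps the lookup cheap.
conn.execute("CREATE INDEX idx_posts_blog_source ON posts (blog_id, source_post_id)")

conn.executemany(
    "INSERT INTO posts (id, blog_id, source_post_id) VALUES (?, ?, ?)",
    [
        (1, 10, None),  # blog 10 makes an original post
        (2, 20, 1),     # blog 20 reblogs it
        (3, 30, 1),     # blog 30 reblogs it too
    ],
)


def has_reblogged(conn: sqlite3.Connection, blog_id: int, source_post_id: int) -> bool:
    """Return True if the blog has any post whose source is the given post."""
    row = conn.execute(
        "SELECT EXISTS("
        "SELECT 1 FROM posts WHERE blog_id = ? AND source_post_id = ?"
        ")",
        (blog_id, source_post_id),
    ).fetchone()
    return bool(row[0])
```

The answer comes straight from the posts themselves, so a missed note can't make the button lie.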

2. NSFW

NSFW tags are mandatory. Always have been. Side ramble: I've started seeing a couple blogs putting "I define NSFW as..." in their blog descriptions. No, you're wrong, go and read the rules again.

Anyway, the code, right now, has hardcoded checks for specific tags. When we overhauled the tag system a few months ago, we pre-seeded two tags - "nsfw" and "NSFW". These are tag IDs 1 and 2 respectively (others were added to the database in the order they first appeared in posts - so the first actual tag on the site has an ID of 3). If you're logged out, a minor, or just have adult content turned off, the site runs a separate query to get posts for you that deliberately excludes posts tagged with either of those. Over time, the query has been amended to also exclude tag 938, because someone on mobile did "Nsfw" at some point.

You can see where this is going. Every tag variation needs to be added to the queries manually. The same is true of DNR and DNI tags - variations in casing keep appearing and have to be added.

And that doesn't even cover people who use the tag "nsfw gif" or some such, and not the plain nsfw tag - those are completely omitted from the queries and not caught. There's a really, REALLY obvious solution I missed here, and I'm upset it took me so long to realise it.

When the post is loaded, just. Have it check the first four letters of all the tags and see if they match "nsfw" (ignoring case), and if any do, skip the post and get another one if appropriate to do so. It's so simple. I'm so dumb. Writing this I realise I haven't gotten it set up in the API rewrite to use the same tactic for DNR and DNI yet. I'm an idiot.
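A minimal sketch of that prefix check, in Python. The function names and the shape of a post (a dict with a "tags" list) are assumptions for illustration, not the site's real code. Lowercasing before comparing is also an assumption, but it's what makes "nsfw", "NSFW" and "Nsfw" all match without a hardcoded list of tag IDs.

```python
def is_nsfw_tag(tag: str) -> bool:
    """True if the tag starts with "nsfw", ignoring case and stray whitespace."""
    return tag.strip().lower().startswith("nsfw")


def is_nsfw_post(tags: list[str]) -> bool:
    """True if any tag on the post counts as an NSFW tag."""
    return any(is_nsfw_tag(tag) for tag in tags)


def visible_posts(posts: list[dict], allow_adult: bool) -> list[dict]:
    """Drop NSFW-tagged posts unless the viewer has adult content enabled.

    Each post is assumed to be a dict with a "tags" list (hypothetical shape).
    """
    if allow_adult:
        return list(posts)
    return [post for post in posts if not is_nsfw_post(post["tags"])]
```

The same pattern would cover DNR and DNI by swapping the prefix, assuming those tags also reliably start with the same letters.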

3. Upload Limits

This was just dumb on my part. Most people think of a megabyte as being 1000 kilobytes. In the binary convention that file sizes usually get measured in, it's 1024 kilobytes (strictly speaking, that unit is a mebibyte). A gigabyte is 1024 megabytes, and a kilobyte is 1024 bytes, etc. When I added the upload limits, I completely forgot about this fact, as well as, apparently, the fact that I have a degree in computer science, and used 1000 as the figure.
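To put numbers on it, here's a small Python sketch. The 10 MB limit is a made-up figure for illustration (the real limits aren't stated here), and limit_in_bytes and fits are hypothetical helpers.

```python
def limit_in_bytes(megabytes: int, kilo: int = 1024) -> int:
    """Convert a limit in megabytes to bytes, using the given size of a 'kilo'."""
    return megabytes * kilo * kilo


LIMIT_MB = 10  # hypothetical limit, not the site's actual figure

buggy_limit = limit_in_bytes(LIMIT_MB, kilo=1000)  # 10,000,000 bytes
fixed_limit = limit_in_bytes(LIMIT_MB)             # 10,485,760 bytes


def fits(size_bytes: int, limit_bytes: int) -> bool:
    """True if a file of this size is allowed under the limit."""
    return size_bytes <= limit_bytes
```

With these numbers, a file your computer reports as just under 10 MB (say 10,200,000 bytes) gets rejected by the 1000-based limit and accepted by the 1024-based one.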

This is fixed now, but there is still an issue relating to uploading stuff near the limit that's probably some weird base64 encoding issue. Oh yeah, speaking of.
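I haven't confirmed the cause, but one way base64 could bite near a limit: encoding turns every 3 bytes into 4 characters, so the encoded upload is roughly a third bigger than the file itself. If a size check ever runs against the encoded form, a file just under the limit won't fit. A sketch in Python, with hypothetical helper names:

```python
import base64


def base64_length(raw_size: int) -> int:
    """Length of standard padded base64 output for raw_size input bytes."""
    return 4 * ((raw_size + 2) // 3)


def max_raw_size(encoded_limit: int) -> int:
    """Largest raw size whose padded base64 form fits within encoded_limit."""
    return (encoded_limit // 4) * 3


# Sanity check against the real encoder on a small payload.
payload = bytes(1000)
encoded = base64.b64encode(payload)
```

So if the limit were applied to the encoded text, only about three quarters of it would be usable for the actual file.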

4. Canvases

I hate JavaScript. Seriously, it's terrible, I hate it. But I have to use it and not doing so is costing us users, which is why there's a big rewrite going on. The biggest JS component on the site right now that I wrote myself is the post editor for image/art posts.

This particular bug relates to when you're editing a post rather than making a new one. If you uploaded a GIF in the original and go to edit it, the GIF plays normally. But when you hit save, it turns into a static image at whatever frame it was on when you hit save.

When uploading a new file, it doesn't really need changing - you have the file, so it can just send the "blob" (that's a technical term, honest) to the server. But it works differently for editing - there's no guarantee you still have the file, and it wouldn't know where you saved it anyway. So, when editing, it loads the file from the server and puts it on something called an HTML Canvas. The bug comes from my misunderstanding of how canvases work - turns out, they have zero support for animated GIFs. A canvas only ever holds a single still image, so when you tell it "hey, gimme the base64 of whatever is loaded in that canvas right now", it takes only the current frame. It's so stupid and counterintuitive. Maybe I shouldn't have assumed it'd work in a way that made sense (it is JavaScript, after all), and I'm still figuring out the best way to handle this in the rewrite.

And you know what? There are at least three more I should put here. I really want to. There's another dozen I could put here that aren't as interesting. But I've held my head in my hands three times already reading stuff, so I'll leave it there.

The point is - this has been one hell of a learning experience.

