azurelunatic: LiveJournal pirate ship.  (pirate)
Azure Jane Lunatic (Azz) 🌺 ([personal profile] azurelunatic) wrote 2007-01-14 02:21 am

Syndicated account spamming

151 needs a section on synsuck-spams-friends-page. ("Synsuck", if I'm understanding it correctly, is the name of the mechanism by which LJ reads the feeds. Whimsical names rock.) It seems to be a very clever little bug. For those who aren't familiar with it, syndicated content retrieval works on an "Are we there yet?!" principle. Something checks every X amount of time to see if something's updated yet, similar to the obsessive refreshing of your friends page when you've read everything and you're bored. Except this is only once an hour, not twice in the same five seconds.

So synsuck goes around to all the syndicated account feed sources and "do we have new content yet?"-s at them. (Related: if synsuck asks this *too* often, some websites, like slashdot, will so pull this thing over and smack it.) When there is new content that synsuck hasn't seen before, synsuck grabs it and runs back with it to the feed journal and stashes all those cool new entries there. Then your friendspage reads the syndicated journal, and goes "Ooo, shiny!" and posts stuff to your friends page.

However, syndicated accounts do not store their entries forever. After two weeks, any given syndicated entry will go POOF! off the face of the planet. Imagine Momma Syn coming through and picking up very used facial tissue up off the floor and throwing it away. (This is more true in some syndicated comment pages than others. Roll with the breaking-down scary analogy here.)

Ordinarily, everything is all good and all right. But imagine what happens if the source of the syndicated feed stops updating for a while. Picture it. Over two weeks without posting. What a long time! Syn has no attention span. None whatsoever. Not that long. (Now picture a very happy retriever, very eager to please, very literal about following directions, not so smart.) All the entries in the feed journal just go away. Whoops. So when synsuck goes out and the feed has *finally* updated, it goes, "ZOMG AN UPDATE!!" and immediately checks back with the syndicated journal on LJ to see how much it needs to catch up on.

Now remember, it's been two weeks or more. All the previously-posted content is gone, gone, gone. Syn has no attention span. So synsuck sees ZOMG A BLANK JOURNAL!! If any of it were old, synsuck would see a copy of it in the journal. There are no copies, so all the stuff that synsuck sees on the feed source must be NEW!! And what do we do with new content suddenly being poured into a journal? Why, record it all, of course! Thirty fresh articles, all shoved into the syndicated account, and straight to your friends page! Because syn is not bright enough to figure, "Um, I had this before, didn't I?" Those entries were only temporarily stored, and therefore don't exist anymore.

...It's a bug.
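
For the code-minded, here is a rough simulation of the behavior described above. It is a sketch under assumptions, not LJ's actual code: the names `expire` and `synsuck`, the dictionary standing in for the feed journal, and the thirty-item feed are all made up for illustration. The only things taken from the description are the two-week expiry and the "if it's not in the journal, it's new" check.

```python
from datetime import datetime, timedelta

EXPIRY = timedelta(weeks=2)  # retention window described in the post


def expire(journal, now):
    """Drop stored entries older than the retention window."""
    return {guid: stored for guid, stored in journal.items()
            if now - stored <= EXPIRY}


def synsuck(journal, feed_guids, now):
    """Store every feed item the journal does not currently hold.

    Returns the items treated as new (and so pushed to friends pages).
    """
    new_items = [g for g in feed_guids if g not in journal]
    for g in new_items:
        journal[g] = now
    return new_items


start = datetime(2007, 1, 1)
feed = [f"post-{i}" for i in range(1, 31)]  # thirty items sitting in the feed

journal = {}
first = synsuck(journal, feed, start)  # first fetch: all thirty are new

# Three quiet weeks go by, and every stored copy expires.
later = start + timedelta(weeks=3)
journal = expire(journal, later)

# The source finally posts one new item; the old thirty are still in the feed.
feed_after = feed + ["post-31"]
reposted = synsuck(journal, feed_after, later)
print(len(reposted))  # 31: one real update plus thirty reruns
```

The expired journal is empty, so the membership check has nothing to compare against, and the whole feed gets recorded again.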


What they're doing to fix it will depend on implementation and things like actually being practically able to figure things out. But it is a Known Thing, and that's how it does it.
wibbble: A manipulated picture of my eye, with a blue swirling background. (Default)

[personal profile] wibbble 2007-01-14 11:49 am (UTC)
I know how I'd fix it: persistently store at least the last RSS ID retrieved. Not the whole post, just the ID. That's all you need for determining what posts you've seen already.

However, given how obvious a fix that is, they must be either a) already implementing it, or b) unable to implement it for some obscure reason.
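
A minimal sketch of that idea, with everything hypothetical: the `FeedJournal` class, its field names, and the two-week constant are illustration only. It also keeps a set of every ID seen rather than only the last one, which is a variation on the suggestion that avoids depending on feed order.

```python
from datetime import datetime, timedelta

EXPIRY = timedelta(weeks=2)  # retention window for stored entries


class FeedJournal:
    def __init__(self):
        self.entries = {}      # guid -> time stored; these expire
        self.seen_ids = set()  # guids only; kept after the entries expire

    def expire(self, now):
        """Drop old entries but leave the record of their IDs alone."""
        self.entries = {g: t for g, t in self.entries.items()
                        if now - t <= EXPIRY}

    def fetch(self, feed_guids, now):
        """Store and return only the items whose IDs were never seen."""
        new_items = []
        for g in feed_guids:
            if g not in self.seen_ids:
                self.seen_ids.add(g)
                self.entries[g] = now
                new_items.append(g)
        return new_items


start = datetime(2007, 1, 1)
feed = [f"post-{i}" for i in range(1, 31)]

fj = FeedJournal()
fj.fetch(feed, start)

later = start + timedelta(weeks=3)
fj.expire(later)  # all thirty entries are gone, their IDs are not

print(fj.fetch(feed + ["post-31"], later))  # ['post-31']
```

The stored entries still vanish on schedule; only the short list of IDs sticks around.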

[identity profile] the-cynic.livejournal.com 2007-01-14 02:11 pm (UTC)
It actually has to do with the fact that the length of the RSS ID they store is less than the actual length of the real RSS ID, so the new ID parsed doesn't match. Imagine storing only 5 digits of your friends' phone numbers as a more practical example.
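
To make the phone-number analogy concrete, here is a small sketch. The 20-character limit, the function names, and the example URL are all assumptions for illustration; the real stored length is not given here.

```python
STORED_ID_LENGTH = 20  # hypothetical storage width, not LJ's real value


def store(guid):
    """Keep only as much of the ID as the storage field can hold."""
    return guid[:STORED_ID_LENGTH]


def is_known(guid, stored_ids):
    """Compare the full ID from the feed against the truncated copies."""
    return guid in stored_ids


guid = "http://example.com/blog/2007/01/14/a-long-post-title"
stored = {store(guid)}

print(is_known(guid, stored))  # False: the full ID never equals its stored stub
print(store(guid) in stored)   # True: truncating before comparing would match
```

Truncating both sides before comparing would make the match succeed, at the cost of treating two different IDs with the same first 20 characters as the same post.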

[personal profile] wibbble 2007-01-14 02:18 pm (UTC)
That sounds like exactly the sort of obscure reason I would have expected. :o)

[personal profile] wibbble 2007-01-14 04:55 pm (UTC)
I shouldn't think so. The people writing that code know RSS a lot better than I do. :o)

[identity profile] pyrogenic.livejournal.com 2007-01-14 11:20 pm (UTC)
it is very, very simple. The cleanup routine should:

MAX_ARTICLES = 30
MAX_AGE = two weeks

if (articles.count > MAX_ARTICLES)
-> for each article:
-> -> if (article.age > MAX_AGE)
-> -> -> delete article


That way, if a feed has thirty or fewer articles, they'll stick around until new ones arrive, but if it's a heavy feed, articles will still fall off after two weeks.
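
A runnable take on that cleanup rule, read literally. The `cleanup` function and the `(guid, stored_at)` tuple layout are assumptions for the sake of the example.

```python
from datetime import datetime, timedelta

MAX_ARTICLES = 30
MAX_AGE = timedelta(weeks=2)


def cleanup(articles, now):
    """Return the articles to keep.

    articles is a list of (guid, stored_at) tuples. Nothing is deleted
    unless the journal holds more than MAX_ARTICLES; past that point,
    anything older than MAX_AGE is dropped.
    """
    if len(articles) <= MAX_ARTICLES:
        return list(articles)
    return [(g, t) for g, t in articles if now - t <= MAX_AGE]


now = datetime(2007, 1, 14)
old = now - timedelta(weeks=3)

quiet_feed = [(f"post-{i}", old) for i in range(30)]
print(len(cleanup(quiet_feed, now)))  # 30: a quiet feed keeps its old entries
```

Read this literally and a journal holding 31 articles can lose every old one in a single pass. A variant that always keeps the newest `MAX_ARTICLES` entries regardless of age would avoid that, but that goes beyond what the pseudocode says.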

[personal profile] wibbble 2007-01-15 08:47 pm (UTC)
I think there are non-technical reasons for making it expire after a fixed time. IIRC, LJ didn't want to be open to claims of archiving syndicated material.

[identity profile] hotarunokokoro.livejournal.com 2007-01-15 02:41 am (UTC)
thanks for making that understandable for the regular l.j. reader, i.e., me.
^_^
I like metaphors, and i like your icon too. it should be in the l.j. t-shirt shop!