We are developing the social individualist meta-context for the future. From the very serious to the extremely frivolous... let’s see what is on the mind of the Samizdata people.
Samizdata, derived from Samizdat /n. - a system of clandestine publication of banned literature in the USSR [Russ.,= self-publishing house]
We need to assemble a lynch mob… …an angry digital lynch mob. Many fellow bloggers have been attacked by waves of trackback spam by some thieving vermin peddling online ‘texas holdem’ to idiots stupid enough to click those links and part with their money. We have been hit by over 450 trackbacks (which we de-spam swiftly via MT Blacklist every time they change their payload URLs).
What is to be done about this? If left unchecked this will simply destroy the trackback system and the beneficial network effect it brings. Presumably the spammers are being directed by companies to drive traffic to target sites, so if a digital lynch mob was to attack those target sites (who are presumably owned by the ones at the end of the chain who pay the spamhaus to do the dirty work), it might impose some cost on their actions, which at the moment involve stealing bandwidth and defacing private property with impunity. As the people involved in this are criminals, it seems to me that the best way to discourage them would be to hurt their ability to make their money.
Any ideas?
The main aim isn’t to get people to click the links; the mere existence of the links is enough to get Google to increase the PageRank of the linked-to pages, so that if someone types “poker” or “texas hold’em” into the search box the perpetrator’s site will come up top.
I think I read somewhere that Google are addressing the problem.
Perry: Funny you should ask…
Here is a link to Google’s solution.
I’m all for the digital lynch mob. Much underused I think. I’ve been getting trackback pings for three days now and have had to rename my trackback script and change configuration to prevent the entries appearing on the blog. Now I am getting hundreds of entries in my error log which just shows that the evil spam bots are still sending requests to my site.
I would quite happily slap the little buggers around. Or charge them a fee. No, forget that, I’ll go with the slapping.
I’ve turned off trackbacks. I think their importance is overhyped anyway. If somebody is commenting on something I wrote, a quick comment on my blog letting me know serves the same purpose. Technorati will also find new inbound links within a day or two.
It’s sort of like the Windows / Unix debate. Windows likes everything very integrated and connected – much like the blog approach. However, that integration presents security risks in both Windows and blogs. I think the many small, separate applications approach of Unix is also a better model for the web. Too much interconnectedness (if that is a word) is dangerous, IMHO. It’s naturally less secure, and it’s a more inviting target for the bad guys.
Firstly, I’m not a user of Movable Type, so I wasn’t totally clear on how the whole trackback business works. These links were helpful:
spec
use
Essentially, the person requesting the trackback invokes a particular URL (which is unique to each article) and passes it the URL to be tracked back to.
For this article, the link is: http://www.samizdata.net/mt/mt-tb.cgi/7105
so I could do:
http://www.samizdata.net/mt/mt-tb.cgi/7105?url=foo.com
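As far as I can tell from the spec, current versions expect the ping as an HTTP POST of form-encoded fields rather than a query string: `url` is required, while `title`, `excerpt` and `blog_name` are optional. Here is a rough Python sketch that only builds such a request without sending it. The function name is mine, and `http://foo.com/` is just the placeholder from above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_trackback_ping(ping_url, source_url, title="", excerpt="", blog_name=""):
    """Build (but do not send) a TrackBack ping as a form-encoded POST."""
    fields = {"url": source_url}
    # Optional fields are only included when a value was supplied.
    for key, value in (("title", title), ("excerpt", excerpt), ("blog_name", blog_name)):
        if value:
            fields[key] = value
    body = urlencode(fields).encode("utf-8")
    return Request(
        ping_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values: the ping URL is the one quoted for this article.
req = build_trackback_ping("http://www.samizdata.net/mt/mt-tb.cgi/7105", "http://foo.com/")
```

Nothing here needs a secret or a login, which is exactly why a silly script can blast pings at every blog it finds.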
Surely the trackback spammers just knock up a silly script and blast their URL at all the blogs they can find that support trackback. So, some ideas:
1) Make it so that the trackback script is slow – maybe takes 2-3 minutes to run. Once the spammers figure out that samizdata does this, they’ll possibly omit y’all from their list.
2) Put all trackbacks into a “holding bin” where they aren’t released until a human OKs them. When a particular URL passes muster it’s added to a white list and subsequent trackbacks skip the holding bin.
3) The trackback links appear to always be blah-blah-blah/{number}. You could rig samizdata up to obscure the url and explain how to un-obscure it. If the real url is: http://www.samizdata.net/mt/mt-tb.cgi/7
you could post the link as:
http://www.samizdata.net/mt/mt-tb.cgi/3+4
http://www.samizdata.net/mt/mt-tb.cgi/7xxx
http://www.samizdata.net/mt/mt-tb.cgi/booger/7
and put a mild burden on the track backer to undo the foolishness before invoking the URL.
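To make idea 2 concrete, here is a minimal Python sketch of a holding bin with a white list. It is not how Movable Type works. The class and method names are invented for illustration, and I have assumed the white list is keyed on the host name rather than the full URL.

```python
from urllib.parse import urlparse


class HoldingBin:
    """Hold trackbacks for human review; approved hosts skip the queue later."""

    def __init__(self):
        self.whitelist = set()  # hosts a human has approved (assumption: per host)
        self.pending = []       # trackback URLs awaiting review
        self.published = []     # trackback URLs shown on the blog

    @staticmethod
    def _host(url):
        # Empty string when the URL has no recognisable host.
        return (urlparse(url).hostname or "").lower()

    def submit(self, url):
        host = self._host(url)
        if host and host in self.whitelist:
            self.published.append(url)
            return "published"
        self.pending.append(url)
        return "held"

    def approve(self, url):
        """Called by a human moderator for a URL currently in the bin."""
        self.pending.remove(url)
        host = self._host(url)
        if host:
            self.whitelist.add(host)
        self.published.append(url)
```

The first trackback from any site waits for a person, and after one approval that site goes straight through. The obvious weakness is that a white-listed site that later turns bad is trusted until somebody removes it.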
I do realize google’s approach will probably ultimately solve the problem but spammers are slow learners and it might be some time before 100% of them realize this type of comment spamming gains them nothing.
They tried again today and most of us caught the buggers. Mike over at Coldfury has several readers suggesting punishment for the cretins doing this to us.
Chris, I don’t agree with you about trackbacks. I find them valuable to be able to keep track of who is linking to what on your site.
What is most impressive is that people like Kathy and others have reacted so quickly with coding solutions & hacks.
Follow Google’s advice as linked above: hack the blog software to add rel="nofollow" to all URLs in comments and trackbacks.
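For anyone wondering what such a hack amounts to, here is a rough Python sketch that rewrites the anchor tags in a comment’s HTML so each carries rel="nofollow". It is regex-based and deliberately naive, and it is not the code any blog package ships. The function name is made up, and real software would do better with a proper HTML parser.

```python
import re

# Matches an opening <a ...> tag and captures its attribute text.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Matches an existing rel attribute (quoted or bare) so it can be dropped.
REL_ATTR = re.compile(r"""\s+rel\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)


def add_nofollow(html):
    """Force rel="nofollow" on every <a> tag in user-submitted HTML."""
    def fix(match):
        # Remove whatever rel the submitter supplied, then add our own.
        attrs = REL_ATTR.sub("", match.group(1))
        return f'<a{attrs} rel="nofollow">'
    return A_TAG.sub(fix, html)
```

Any rel value the commenter supplied is overwritten rather than kept, so a spammer cannot dodge the change by sending their own rel attribute.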
I’ve put a stop to this kind of spam – in comments and trackbacks – via a few simple steps in my server:
1) I disallow comments or trackbacks to any post that’s fallen out of the RSS feed
2) I disallow any comment or trackback that has more than 3 links in it
3) I have a throttle that disallows comments/trackbacks from the same IP address more than once every 3 minutes
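Those three rules are simple enough to sketch in a few lines of Python. This is only an illustration, not the commenter’s server code. The names are invented, links are counted naively by looking for `http://` or `https://`, and I have assumed that only accepted submissions reset the per-IP timer.

```python
import re
import time

MAX_LINKS = 3                # rule 2: more than 3 links is rejected
THROTTLE_SECONDS = 3 * 60    # rule 3: one submission per IP every 3 minutes
LINK = re.compile(r"https?://", re.IGNORECASE)


class SubmissionFilter:
    def __init__(self, posts_in_feed, clock=time.monotonic):
        self.posts_in_feed = set(posts_in_feed)  # rule 1: posts still in the RSS feed
        self.clock = clock                       # injectable so it can be tested
        self.last_seen = {}                      # ip -> time of last accepted submission

    def allow(self, post_id, ip, text):
        """Return True if a comment or trackback passes all three rules."""
        if post_id not in self.posts_in_feed:
            return False
        if len(LINK.findall(text)) > MAX_LINKS:
            return False
        now = self.clock()
        last = self.last_seen.get(ip)
        if last is not None and now - last < THROTTLE_SECONDS:
            return False
        self.last_seen[ip] = now
        return True
```

The first rule probably does most of the work, since spam scripts tend to hit old posts where nobody is watching.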
I’ve had one successful spam – in a visible post – in 3 months. That was eliminated within minutes. I don’t think that the blacklist is a particularly useful approach, quite frankly.
How about a system whereby bloggers write on their site:
“By sending us a trackback with the words ‘texas holdem’ you agree to receive an automatic ‘denial of service attack’ against your server.”
and some software that then does exactly that?
Good … but a better target of the DoS attack would be the merchant who paid for the ad.
Maybe re-word it to warn that a DoS attack will be initiated against the spammer’s customer?
There’s an interview with one of these bottom-feeding turds in The Register. It appears there’s big money (and I would guess organised crime involvement) involved in link spamming.
It’s not just some schmuck in a trailer park anymore, and it would not surprise me if there are individuals prepared to respond to DDOS attacks with physical violence.
All these workarounds are excellent and they just go to show how much talent there is within the blogosphere. However, not much of it satisfies my dark side.
Two can play at that game.
When they start hitting tiny little blogs like mine (over 70 trackbacks on Wednesday night), they must be desperate. I get perhaps 50 hits per day (as the BOFH said “baby seals get more hits”): there can’t be a useful market for them in trying to pollute my blog.
James
Blacklist has blocked 784 comment spams since Jan 9th when I installed it.
It has let 18 through.
I was able to delete and update the blacklist directory manually for these.
I would call that useful.
I’d recommend against the retaliatory route of shooting back at originating IP addresses – I did a bit of research on the 600+ TB spams we’ve gotten at SR, and the IPs mapped to all over the world – from Croatia, to India, to Iran, to Brazil – all over the place. I suspect they aren’t hacking or zombie-ing these machines, probably just spoofing with bogus IPs generated at random.
I’m currently trying to get the attention of someone in law enforcement over this – it’s basic theft, and by all appearances, fraud – with all the steps involved, I’ve got to think that some sort of statute is being violated, even given the rudimentary nature of most cyber-law.
Oh, and by the way, the domain registrations for the domains that hit our site the other day list some pretty posh Upper West side NYC addresses…
Deep pockets?
Thank you WR for all your efforts on our behalf.
Before you light up the angry-mob torches… I haven’t seen examples of the spam, but I play texas holdem online so I know a bit about how the system works. It’s completely possible that the sites are the ones doing the spamming, but another possibility is that an affiliate is doing it. You can register as an affiliate with one of these sites, and then for every new player you bring to the site, who registers under your referral code, you get a chunk of the money raked from that player when they play.
Obviously in such a case, the site is not to blame for the activities of the affiliate, so be careful about blaming them. If the URL leads to a site which merely contains ads for other sites, that’ll be an affiliate. If the link is directly to a site, but also contains a referral code, that’s an affiliate too.
In such a case, if you can’t identify the person doing the spamming, you could try emailing the site and informing them of the activities of the affiliate. If they take no action, then you could act against the site itself.
If they have an affiliate system that doesn’t kick people off for spam, they are to blame.
No, this has nothing to do with affiliates. For one thing, the bogus refers contain no affiliate info.
For another, the page they’re trying to advertise doesn’t actually have anything to do with “Texas Holdem”. It’s down, in fact, but when it was up it was porn links.
I have been getting incest porn adverts at the same time. We got hit (and filtered out) a couple more hundred today. About 7 got through.
Our “staff engineer”, Sparkey, added some plugins to block the comment spamming, and I added a number of the particular words that were incorporated into the spam, which automatically dump comments containing them into a holding file. I get e-mail notification of any that are blocked or in holding, and just now I dumped over 700 of them, all posted since last night! None of them got actually added to the “Daily Brief”, so I wonder why they still have us on their list. We’ve had them blocked for a month or two, but every Thursday or Friday, there’s a huge uptick. I can’t see why they still bother with us, there’s no future in it.
About there being “no future” in spamming sites that are blocking – you have to look at how it’s happening. The spammers run a set of scripts against a set of websites. The cost of issuing all the http requests is near nil, especially if you can farm the job out to zombie systems. It would take actual work for the spammers to check the success rate – they don’t actually care. So long as a sufficient amount gets through (and so long as enough click throughs pay them), they have no reason to care.
Yes … which is why the only way to deal with spam effectively is to punish the merchants who pay for it.
When the cash for spammers dries up, they will go away.
Expression Engine adds the code equivalent of a captcha to the end of the trackback link someone sends–no code, no trackback. It doesn’t defeat trackback spammers completely, but they have to get a new link for every trackback they wish to send. Doesn’t MT have the equivalent?
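I don’t know how Expression Engine implements this internally, so the following is only a guess at the general idea in Python: the blog appends a random one-time code to each trackback link it displays and accepts a ping only if the code is still unused. All names are hypothetical.

```python
import secrets


class TrackbackCodes:
    """Issue one-time codes appended to trackback links (illustrative only)."""

    def __init__(self):
        self._valid = set()  # (post_id, code) pairs not yet redeemed

    def issue(self, post_id):
        """Return the path to display, e.g. '7105/<random code>'."""
        code = secrets.token_urlsafe(8)
        self._valid.add((post_id, code))
        return f"{post_id}/{code}"

    def redeem(self, path):
        """Accept a ping only once per issued code."""
        post_id, _, code = path.partition("/")
        key = (post_id, code)
        if key in self._valid:
            self._valid.remove(key)
            return True
        return False
```

A spammer then has to fetch the page again for every ping, which is the “new link for every trackback” cost described above. A real version would also need to expire codes that are never used.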
I agree, however, with the bigger point (my $.02): I think trackbacks are a general nuisance anyway. Never read them, never follow them, and never accept them. It’s link whoring for far too many. We have referral logs if we want to know if someone has linked to something, or they can comment or e-mail. The same problem, driven by the same hunger for Google hits, happens in referrer logs. I spend at least 20 minutes a week clearing/blocking those suckers.
Beware also of links to “dog information”, which are creeping on to several politically orientated British bloggers’ comments.
I recently switched from MT to WordPress. I added a plugin to require entering a code when posting a comment (similar to what is used here) and I also configured the system to require approval of all trackbacks. I’ve been getting a few poker trackbacks a week, but not a single one has been successful. Of course, I have the advantage of no one actually reading my blog or creating real trackbacks, so it’s not like I’m inconveniencing a legitimate person.
I don’t know their business models but it might work for a cyber-posse to actually click on their links. If the scammers are paid per thousand clicks, or whatever, then their clients will lose money.
Even if they’re not compensated this way they and their clients will get fed up with bogus responses.
Done right, this won’t be illegal and won’t even be a DoS.
JC
When I first started writing my blog, it was to be a simple blog, telling about my everyday life and all. But after about 5 posts, it turned into a showoff of my average Apache setup skills.
I’m not blacklisting websites. I’m blacklisting keywords. And I explain to any visitor (though in French, as my blog is in that language) how to set up Apache to simply redirect the spammers to a 403 page which just adds their IP to a text file, which I sometimes browse/sort/… into a list of “Deny from [ip]”.
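The sorting step can be automated. Here is a small Python sketch, assuming the text file holds one logged IP address per line. The function name is invented, and the output is simply the “Deny from” lines to paste into the Apache config.

```python
import ipaddress


def deny_lines(logged_ips):
    """Turn raw logged IPs into sorted, de-duplicated 'Deny from' lines."""
    unique = set()
    for raw in logged_ips:
        try:
            unique.add(ipaddress.ip_address(raw.strip()))
        except ValueError:
            continue  # skip blank or malformed lines
    # Sort IPv4 before IPv6, then numerically within each family.
    ordered = sorted(unique, key=lambda ip: (ip.version, int(ip)))
    return [f"Deny from {ip}" for ip in ordered]
```

For example, `deny_lines(open("spam_ips.txt"))` would read the log directly, where `spam_ips.txt` is a made-up file name.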
There is no viable technique. Even Turing tests are weak. Those people can pay a 12-year-old Chinese kid to copy numbers or words into a textbox…
What are they advertising? Viagra, mortgages and pr0n sites. I could block approximately 30,000 referer-spamming attacks with just 5 lines in my .htaccess.
But some go through… I can’t help it.