Yahoo spam protection needs tweaking

Yahoo likes to talk a lot about its oh-so-cool tools to fight spam, but just how
effective are they? I have a Yahoo ID that I use only for instant messenger
chatting. Of course, it has its own mail account too, which I stopped using a few
years ago. Of late, however, Yahoo has started letting a lot of obvious spam
through its "protection".

Don’t believe me? Here’s a screenshot of my in-box the last time I checked
mail on the Yahoo account.

yahoo screenshot

Apart from the first one which was legit mail (I’ve blurred it), the other
messages are clearly spam. They aren’t even the more deceptive ones that have
benign subject lines like "hello", "how are you doing", or
"I forgot to tell you".

Yet Yahoo fails to classify them as spam. They need to tweak their algorithm
a bit, I suppose.

(And before somebody asks, yes, the "SpamGuard" option is
enabled on my account. I checked twice.)

BSNL’s new mystery broadband service

BSNL, our wonderful government telecom provider, has of late been putting out
big newspaper advertisements touting its great new broadband service called
DataOne. These ads are remarkable in that they contain absolutely no information
except that they’re introducing a new service. That, and their web site URL, is
all.

A couple of weeks ago, there was absolutely no information on their web site
either – all very perplexing at the time. Today, there was yet another ad about
DataOne, but once more no information about plans or cost. After today’s ad, I
checked their site again. Now they even have an FAQ about the service.

So does it help? Nope, that’s basically a page about what ADSL is. The two most
important things you need to know are completely missing: available throughput
and pricing. Here’s what the page has to say about it:

How about Broadband Plans and Pricing?

These will be declared very soon. Meanwhile you can fill up the
registration form so that we can work out the logistics for your connection.

They even have a form you can fill up to get a connection even though you don’t
know what you’ll get and for how much. (The form is an image. Wow!)

It contains the helpful clarification:

* Tariff will be most competitive in the industry

** Data rate of 256 kbps and above

That sure tells you a lot, doesn’t it?

According to Kingsly, who has spoken to some
BSNL folks, they have yet to decide on download speeds and tariff, and the
service is supposed to be launched only around 15 January. So why on earth are they
wasting money on advertisements if they haven’t worked out any of the important
details yet? Who knows? Somebody on the india-gii list thought that it was
because they wanted to use their ad budget in this financial year. Ah, that
explains it all, doesn’t it?

I’m waiting to be pleasantly surprised by BSNL giving us a 512 kbps unlimited
connection for about 2000 bucks a month. I’m currently on the Touchtel Airtel
Broadband 128 kbps DSL plan, which costs approximately Rs. 1450 per month with
all the fine print charges included. It’s already saving me about 5000 rupees
every month, but I’d love to have some higher throughput. That 128 kbps is
called "broadband" in this country is downright shameful.

The future of matrimonial classifieds

I noticed something interesting the other day as I walked into the offices of
the Times of India to place a classifieds ad for some waiters for my restaurant.
As I was handing in the form to the lady at the counter, a sign on the side
caught my eye (mostly because I saw a "10% off" screaming from it).
Unfortunately, I wasn’t carrying my digicam with me, so I’ll have to paraphrase
what the sign said.

Here’s how it essentially read: "If you place a matrimonial classifieds
ad and do not specify any religion, caste, or regional criteria, we will give
you a 10% discount on the ad. We are doing our bit for the betterment of society
and eliminating bigotry."

I was quite pleasantly surprised to see an initiative like this from an
otherwise morally bankrupt publishing group. We "educated"
middle-class and upper-middle-class Indians love to tell others that the caste
system does not have a strong grip in this country, yet we are unbelievably
hypocritical when it comes to marrying off our own sons and daughters. "We
don’t believe in the caste system, but my Sanjay can get married only to a [some
language] girl from [some caste] caste."

Speaking of matrimonial classifieds ads, I think that the growing popularity
of online wedding sites like shaadi.com will
mean the slow death of the print classified ads, at least in English newspapers.
They will do what the spread of cellular phones did to the pager industry (yeah,
remember pagers in India?). It won’t happen immediately, but I give it about 3
years. The print medium has severe limitations: you can’t write more than a few
lines, which have to be as short and sweet as possible, giving you only enough
space to write a bunch of numbers and abbreviations. Here’s an example:
"Smart beautiful homely [caste] girl 25/157/6000 seeks [caste] qualified
well-settled boy. Contact Box no…" Now this could fit almost anyone,
giving you very little info to go on. Online, however, you don’t have any space
restrictions, and adding more fields isn’t that complicated. It also allows you
to easily build databases that can be searched on various criteria. What’s more,
it has that most important bit of information – a photo!

When I mentioned the above Times of India signboard to one of my friends, he
pointed out to me that one of the matrimonial sites, instead of leading the way,
was actually being regressive in its approach. BharatMatrimony.com,
which advertises heavily on many sites, has an annoying "feature"
that’s also a bug. If you want to search for a bride or groom, one of the
parameters required is "language". This isn’t a multiple choice thing;
you can only select one language. It then redirects you to one of its
language-specific sub-sites where you can search away. However, if you are
slightly more modern and don’t particularly care that your prospective partner
come from a particular state, you’re out of luck. There is absolutely no way to
specify "any" as an option or even to search through more than one
language. If you want to check out women from all over the country, you just
have to conduct 29 different searches. Isn’t that amusing?

Lastly, I must mention an interesting conversation I had about our
matrimonial ads with John Rhodes (he runs webword.com
– a usability site) who was visiting Bangalore for some business. He pointed out
some differences between personal ads in the USA and over here. He found that
the most important criteria here seemed to be the person’s caste, religion and
family, while in the USA, people would put their interests and partner requirements.
He was amused to note the classifieds were divided by language. I just shrugged
my shoulders and said, "well, it will take another couple of generations to
get rid of our deep-rooted prejudices."

More comment spam prevention

These spammers are a relentless bunch. The more spam-prevention measures you add, the smarter their bots get. For instance, Movable Type’s spam prevention for email addresses merely changed the “@” to its encoded form, “%40”, and has long been circumvented by spambots. Even installing MT-Blacklist only reduces your burden; it doesn’t eliminate it. (Though checking my MT Activity log tells me that it catches a LOT of comment spam.) The fuckwits have now started comment spamming with legitimate URLs like “www.fda.gov” to get you to accidentally blacklist non-spamming sites.
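To see why encoding the “@” buys so little, here is a minimal Python sketch of what a harvesting bot might do. It is purely illustrative and not taken from any real spambot; the function name and the example.com addresses are made up for the example.

```python
from html import unescape
from urllib.parse import unquote


def decode_address(obfuscated: str) -> str:
    """Undo percent-encoding first, then HTML entity encoding.

    Hypothetical helper: it shows that both common "hide the @" tricks
    are reversed by one standard-library call each.
    """
    return unescape(unquote(obfuscated))


# Both obfuscated forms come back as a plain, harvestable address.
print(decode_address("someone%40example.com"))
print(decode_address("someone&#64;example.com"))
```

Two library calls are all it takes, which is why this kind of obfuscation only ever stops the laziest bots.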
The only feasible solution is to put in what’s popularly called a “captcha” – a security code verification that needs a real human to manually enter a random number into a box before posting a comment. So that’s what you will now see on this site. To make it easy, I have used only a 4-digit code. This will change each time you load the page.
The captcha system is easy enough to install if you’re a techie, though it involves some mucking around in the MT code itself. The only thing it can’t stop is manual comment spam, but most spammers don’t bother with that. Also, it doesn’t work well with MT-Blacklist, and you’ll have to disable MT-Blacklist if you want the captcha to work. (Yes, I found this out the hard way after about 30 minutes of cursing.) Lastly, a captcha means that blind readers won’t be able to comment on your site, but I’m not particularly worried about that since this is a personal site.
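For the curious, the core logic of a 4-digit code check can be sketched in a few lines of Python. This is only an illustration of the idea, not the Movable Type hack used on this site: the function names are hypothetical, and a real captcha also has to render the code as an image and remember it on the server between page load and comment submission.

```python
import hmac
import secrets


def new_code(digits: int = 4) -> str:
    """Return a fresh random numeric code, zero-padded to `digits` characters."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


def is_valid(expected: str, submitted: str) -> bool:
    """Compare the stored code with what the commenter typed.

    Surrounding whitespace is ignored; the comparison is constant-time.
    """
    return hmac.compare_digest(expected.encode(), submitted.strip().encode())


# A new code is generated on every page load and kept server-side (assumed).
code = new_code()
print("Code shown to the visitor:", code)
print("Correct entry accepted:", is_valid(code, code))
```

Using `secrets` rather than `random` keeps the code unpredictable, though with only four digits a determined bot could still brute-force it unless failed attempts are limited.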
Update (15 November): To all the people who’ve mailed me asking me to install it on their web sites, please go RTFM and do it yourself. I have neither the time nor the inclination.
Personal note: The Hindu has done a full-page story on three people who have made a career shift to the food business and yours truly is one of them. (The full-length version of that photo is here.) Of course, they mangled some of my words. For instance, the lady asked me if I cook in the kitchen and I told her that like most executive chefs, I am not into hands-on cooking regular food every day. This got twisted to make it sound as if I’m not involved in the kitchen. Also, I’ve been cooking for 15 years and I didn’t learn it from just one dude. Oh well, you take what you get…
I’ve also written a two-part article for Rediff.com on how to start a restaurant. (Part 1 and Part 2). Actually, I wrote the article way back in August. Then the person handling the new career section left Rediff and the section resurfaced only 3 months later. Unfortunately, the editor saw fit to inject some of her own editorial “style” into my article, which pissed me off royally, especially since I edit myself ruthlessly. Rediff also does the “follow Jakob Nielsen blindly” dance and chops all paragraphs, regardless of continuity, into no more than two sentences each to “improve readability”. Whackos!

Fixing comment spam problems

So both Anita and Yazad were facing serious comment spam problems. Depending on your level of nerdiness, you can implement many solutions.

What I did for them is described by Shelley Powers over here. Give it a shot. It’s not that hard. The only downside is that if you’re running multiple blogs on your MT installation, you’ll have to edit the templates for all of them or comments will not work for the other blogs.
(And before you ask, no, I will not implement it for your blog too. I already provide too much tech support without any credit.)

A simpler, but slightly less effective, solution is to have a script that turns off comments on all entries older than X days, where X should be a reasonable number like 14 or 30. A script for that can be found on Geeksblog.
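
The logic of such a script is simple enough to sketch in Python. This is not the Geeksblog script and it does not touch Movable Type’s database; the `Entry` class and the `close_old_comments` function are hypothetical stand-ins that only show the age check.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Entry:
    """Hypothetical stand-in for a blog entry; not MT's real data model."""
    title: str
    posted: datetime
    comments_open: bool = True


def close_old_comments(entries, max_age_days=14, now=None):
    """Turn off comments on entries older than `max_age_days`.

    Returns the number of entries that were closed by this call.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)
    closed = 0
    for entry in entries:
        if entry.comments_open and entry.posted < cutoff:
            entry.comments_open = False
            closed += 1
    return closed


entries = [
    Entry("Old post", datetime.now() - timedelta(days=60)),
    Entry("Fresh post", datetime.now() - timedelta(days=2)),
]
print("Entries closed:", close_old_comments(entries, max_age_days=14))
```

Run something like this from a daily scheduled job and the window in which spambots can hit any given entry shrinks to X days.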

A still simpler solution is to make the “Comments (X)” link go to the individual entry instead of the comments page. People will be able to leave comments, but spambots will probably not figure it out. I leave it to the reader to look through the MT documentation on how to do this. It’s just changing one tag, that’s all.

Incidentally, my restaurant web site (complete with its own chef’s blog) has just launched in its Beta version. The URL is http://www.shiokfood.com.
Some more content will also be added shortly. Sign up for the updates if you want to be informed of new articles (the plan is about two a week).
Your comments are welcome. But if anyone says, “page doesn’t validate” or some such crap, I will feed you a knuckle sandwich.
(Tested with Firefox and IE. I don’t use Opera, and I don’t have a Mac. I don’t care if you use either.)

Update: Head on over to Kiruba’s site and tell me if you spot the speed difference in loading. Something has definitely changed. I’ll write more about it later.