
August 05, 2007

Spam and whitelists

I'm way too obsessed with spam. I've been playing with an auto-whitelist setup for my email: mail from any address that I've seen before (to, from, cc, whatever) is automatically accepted and put in my inbox. All other email gets sent through a spam filter. The filter-approved mail goes to a "greybox." The rest gets put into a spam folder that I never check.

So I check my inbox mail on my Treo and never see spam. The greybox I can check once a week. Any mail that lands in the greybox and that I later file gets its sender automatically added to the whitelist. I don't have to worry about mail from people I know being tagged as spam because the senders are all in my whitelist. All is well, right?

Unfortunately, no. People change email addresses a lot more than I would have thought. If someone switches from Yahoo to Gmail, they're off the whitelist and suddenly my spam filter (which I've configured to be as paranoid as HAL 9000) gets to decide whether I see their mail. Uh oh.
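For the curious, the routing described above boils down to something like the following. This is only a rough sketch, not my actual setup: the function names, the 0-to-1 spam score, and the 0.5 threshold are all made up for illustration, and the spam filter itself is assumed to exist elsewhere.

```python
from email.utils import parseaddr


def normalize(address: str) -> str:
    """Reduce 'Name <User@Example.com>' to 'user@example.com'."""
    return parseaddr(address)[1].strip().lower()


def route(sender: str, whitelist: set, spam_score: float,
          threshold: float = 0.5) -> str:
    """Pick a folder for one message.

    spam_score is assumed to come from some external filter
    (0.0 = clean, 1.0 = certain spam); threshold is a hypothetical cutoff.
    """
    if normalize(sender) in whitelist:
        return "inbox"  # known address: the filter is never consulted
    return "greybox" if spam_score < threshold else "spam"


def file_from_greybox(sender: str, whitelist: set) -> None:
    """Filing a greybox message adds its sender to the whitelist."""
    whitelist.add(normalize(sender))
```

The weak spot shows up right away: `route("alice@yahoo.example", ...)` goes to the inbox once that address is whitelisted, but the same person writing from a new address is a stranger again and is at the mercy of `spam_score`.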

So if I haven't yet responded to your email, well, now you know why. Sorry...

February 04, 2007

Spam AI

In the future, spammers will build a profile of me from my blog, bookmarks, etc., and use that to compose relevant-seeming mail that will make it through all my spam filters. Bummer. The traditional email model of "accept anything from anybody and assume it's all of equal importance" has got to go.

Also, if I could figure out why dreamhost's mail servers only run my spam filter on some incoming mail (not even that much of it), then I'd be a happier person.

September 09, 2006

Accepted into wikipedia!

Finally, wikipedia has accepted an entry about me. I'm really very flattered, even if there are some inaccuracies in the entry (e.g. they got my birthday wrong).

May 09, 2006

Paul Kedrosky deleted my comment

In this article, Paul comments on how great it is for VCs to be jerks. I left a comment, it didn't make the cut. Dunno, maybe he didn't like my grammar. I'll just say this: I've noticed a very strong correlation between cluelessness and arrogance. That was really the gist of my comment, you see.

December 03, 2005

I didn't send you email

It seems that some spammer has adopted "zubairissa@dullroar.org" as a fake return address. If you get mail from that address, it's not me or anyone at all related to dullroar.org. It's just another joe-job.

October 23, 2005

More on the WebOS

A while back, Kottke wrote an article about making web-based apps work even when a user is disconnected from the network. I thought that it'd be a lot easier to get more net connectivity than it would be to make online apps work offline. However, Kottke has several good reasons for believing that offline is still important. I think his best point is that as we move more to online apps, our tolerance for disconnection will decrease.

Ok, so let's assume we want to build this offline web app support infrastructure. What's the minimum that a client machine has to provide to make it possible? I'm thinking just storage. Let's put a thing that looks like a web cache between a user's browser and the network. A web app can check whether the cache is present and then ask it to pre-fetch, store, and retrieve records. Apps would also need to put a lot more application logic into the piece of the app that runs in the browser. I'm going to need to do some research on javascript/AJAX/etc. to figure out just how much pain this would be.
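To make the idea a little more concrete, here's a minimal sketch of the cache piece. Everything in it is hypothetical: the class name, the `fetch` callable standing in for the network, and the convention that `fetch` raises `ConnectionError` when the network is down. A real version would live next to the browser and speak HTTP; this just shows the pre-fetch, store, and retrieve behavior.

```python
class OfflineCache:
    """Sits between the app and the network; serves stored records when offline."""

    def __init__(self, fetch):
        # fetch: callable(key) -> record; assumed to raise ConnectionError offline
        self._fetch = fetch
        self._store = {}

    def prefetch(self, keys):
        """Pull records into local storage while the network is still up."""
        for key in keys:
            try:
                self._store[key] = self._fetch(key)
            except ConnectionError:
                break  # went offline mid-way; keep what we already have

    def get(self, key):
        """Prefer the network; fall back to the stored copy when disconnected."""
        try:
            record = self._fetch(key)
        except ConnectionError:
            if key in self._store:
                return self._store[key]
            raise  # never fetched this record, so there is nothing to serve
        self._store[key] = record
        return record
```

Note what the sketch leaves out: writes. Reading stale records while offline is the easy half; changing them offline and reconciling later is where the pain is.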

September 11, 2005

Wife in paper

The wife said "I don't want to toot my own horn," but that's not going to stop me: Her blog was featured in City Paper's Pittsburgh N'@ this week.

We found out from a friend last night, and this led us to run around Squirrel Hill at 1:30 am trying to find a copy. We didn't succeed, but having wound up at Mineo's 10 minutes before closing, we decided to grab a couple slices and call it a night.

September 02, 2005

Why does Telerama want me to hate them?

I buy wireless Internet service from Telerama so I can surf the web from coffee shops. I get the student rate, and of course, they want to see my student ID to verify that I'm still enrolled. Here's where things get ugly. My ID expires every 6 months, and CMU gives me another little sticker to put on to renew it. Around the same time, Telerama sends me an email. It says, to paraphrase: "Your ID has expired, we've suspended your account, and you're a deadbeat. If you have a new ID, scan it in and email it to us, or show up at our offices in person with it." I don't have a scanner, so I usually take a picture of my ID and send it in. That's what I did yesterday evening. This morning, I received a reply saying "we couldn't find a scan of your ID attached to your last email."

Sigh. Is dealing with customers really such a subtle art? First, it would be helpful if the email that said "we couldn't find your ID" would explain why it couldn't be found. Did the attachment not get through? Or was the picture blurry? Or are digicam pictures not acceptable? Second, why can't Telerama send out a reminder email a week before the expiration date? Student IDs expire right before the new semester starts, just when life is most hectic. This is not the week to expect me to remember I need to renew my wireless.

Argh. Just received yet another email from Telerama support: They still can't find the attachment with my id in it. Perhaps I'll just send them to this page.

August 24, 2005

Kottke on where the web is going

Kottke predicts that the future will be about delivering web apps that can work even when the user's computer is disconnected from the network. The major building block he sees for this is the presence of a local web server on the user's computer that somehow holds application state and syncs up with the real world when there's a net connection present. For instance, imagine using Gmail on your laptop even when your net connection is down.

I got really riled up over this post because I've thought along these lines (pdf, sorry) in the past. Yesterday I almost broke out the J2EE and started building the infrastructure, except I had a few concerns. First, I'm lazy. Second, the number of locations where you can't get net access is dwindling rapidly. I'm not just talking about wifi (which is already being rolled out on airliners), but also the big cellular providers, who are offering ubiquitous, moderate-speed (hundreds of Kbps) data service on top of their phone networks. So where are we going to need disconnected operation?

But the major challenge in converting an app to work in this world is supporting disconnected operation, and that's going to require an application rewrite. It's a huge leap for a programmer to go from today's fully-connected model of all data being available to a world where, during disconnected operation, most data is unavailable. And then there's the complexity of syncing up with the server when the client becomes connected again. For a multi-user application (or even an app that allows a single user to log in multiple times), synchronization is going to involve sorting out snafus where two clients have altered the same data. It's an old problem, and the best solution we have today is for the application to throw up its hands and ask the user to sort out the mess. "I'm sorry," says the application, "but something's gone horribly wrong. Would you mind telling me which of these 5 versions of your dissertation is the most recent?" (And the system isn't even prepared to cope with the truth, which is that you want chapter 2 from one version and chapter 3 from another.) Universal net access seems a lot easier than trying to get hairy distributed systems issues right, even if we'll still have to buy our net access from the phone company.
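Here's a toy illustration of why the sync step gets ugly. It's a sketch of plain version checking (optimistic concurrency), not anything Kottke proposed; the `Server`, `Record`, and `Conflict` names are invented for the example. Each client remembers the version its copy was based on, and the server refuses a write based on a stale version.

```python
from dataclasses import dataclass


@dataclass
class Record:
    value: str
    version: int  # server version this copy was based on


class Conflict(Exception):
    """Raised when a client writes on top of a version that has since changed."""


class Server:
    def __init__(self):
        self.records = {}

    def read(self, key):
        rec = self.records[key]
        return Record(rec.value, rec.version)  # hand out a copy

    def write(self, key, value, base_version):
        current = self.records.get(key)
        current_version = current.version if current else 0
        if base_version != current_version:
            # Someone else synced first; the server can't tell which edit wins.
            raise Conflict(f"{key}: based on v{base_version}, server has v{current_version}")
        self.records[key] = Record(value, current_version + 1)
        return current_version + 1
```

Two laptops that both read version 1 of the dissertation while offline will each try to write on top of version 1. The first sync succeeds and the second raises `Conflict`, at which point all this code can do is what the paragraph above complains about: hand the mess to the user.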

I think Google's purpose in creating Desktop, chat, Picasa, etc. is pretty straightforward: it gives Google more places to put advertisements in the future, and more detailed user behavior data to build better ad placement algorithms. Pretty mundane, but it seems to make a lot of sense. Especially when you think about how many ads Google needs to sell to maintain their revenue growth.

August 05, 2005

Change changing places (and colors)

Just tweaked the blog's color scheme to be a little less boring. Then there's the new logo that showed up recently - that was originally from a Swiss fire exit sign. And I dumped my blogroll, due to general crankiness over how hackneyed and useless it had become.

What do all these changes mean? Will I start spell-checking my posts? Will I sell out and start running Google ads? Nah. People with my page rank who run ads on their blogs are either paying too much for their hosting or are too self-important.

Finally, some random links: this, and this, which is the best blog post I've read in a week.

June 01, 2005

Blogkvelling

My sister has been blogging for months and I just found out today. RosieBlogs is a knitting blog (not entirely surprising, since my sister runs a yarn store). This is all well and good, but sibling rivalry requires me to be jealous that her blog is better than mine (a lot better, actually).

May 22, 2005

NY Times site change

The NY Times has changed their web site so that each article has a link to the next article in its section. Has anybody else noticed what a difference this makes? It's huge: suddenly, the online version feels much more like flipping through the paper version.

April 08, 2005

Internet Radio Streaming

In his review of the "Who Owns the Culture?" event, A VC mentioned that the costs of Internet radio are daunting, even for the likes of David Byrne. I have no idea how to fix the copyright fees, but I think a good way to decrease bandwidth costs would be a commercial version of End System Multicast, which distributes content via p2p. The technology is a bit like BitTorrent's, but ESM has additional intelligence to support live streaming. It's an open question whether users would be willing to donate bandwidth, but there's a good chance since the bandwidths are low and people are often passionate about music.

Oddly enough, years ago there was a startup, ChainCast, that was trying to do this, but it looks like they've abandoned the p2p approach (also, it looks like they're out of business). Why did they fail? Maybe because they were trying to sell this stuff years ago, way before broadband was widely available. Regardless, I have to believe that p2p streaming is going to happen one day. We just need to find a killer app.

March 19, 2005

Grease(monkey) + Fire(fox)

Probably you've heard of Greasemonkey, the Firefox browser extension that makes it easy for people to write little bits of code that transform web sites. What? Greasemonkey makes it possible for users to fix badly designed web pages. Without Greasemonkey, if a site's broken, the author needs to fix it. But usually, the author doesn't think the site is broken (or isn't willing to fix it).

So far, Greasemonkey scripts are being used to filter out junk from various web sites. Ads are usually the first thing to go, followed by all the junk formatting that gets between the user and the information he wanted. These scripts seem to be straightforward to write, and another 10 or so are popping up every day. What they all have in common is that they try to make the browsing experience better for users, and more often than not, they succeed.

Some people think this is bad. Hell, for every idea, there's someone who disagrees, even for unmitigated goods like donuts, puppies, and Radiohead. The outcome that these people seem to worry about is a bunch of scripts interacting badly, or one script going crazy after a site redesign, and rendering a page unreadable. In other words, they're worried about bugs. Bugs. The latest argument is based on HTML being a poor foundation for these kinds of hacks. As the current crop of scripts demonstrates, HTML works well enough. If the hacks persist, maybe that'll encourage publishers to move to something better to reduce confusion. Meanwhile, Microsoft's Scoble, whom I ordinarily think of as reasonable, was anti-script, then decided it was ok provided the script conformed to his own extensive guidelines. In other words, he hates Greasemonkey, but he can't come up with a convincing argument against it.

I think they're really worried that the publishers are going to pack up and go home. If I can reformat CNN's pages to be 80% content and 20% ads (rather than the other way around, as they currently are), maybe CNN is going to stop giving away its content. It's probably way too late for CNN to give up on the web. What it might do is make publishers realize that users aren't going to stand for the ad-laden, "user-hostile" designs that are so popular today. Publishers will lose control of their content. The question is only whether the users taking over will do so out of love or anger.

As always, this isn't going to transform the world overnight, and the process is going to be painful. We'll know Greasemonkey is going mainstream when publishers start attempting counter-measures. Possibly they'll try to obfuscate their HTML in an attempt to confuse the reformatting scripts. It'll be an interesting fight to watch.

In the meantime, here's a collection of Greasemonkey scripts. Enjoy!

Update, March 21, 2005: Glassdog has covered the greasemonkey/autolink issue optimally.

Also, Scoble left a comment (see below) saying that he doesn't hate Greasemonkey. Maybe "hate" is the wrong word, but I think there's at least a whole lot of ambivalence there. Scoble's guidelines for proper use of linking technologies are here and then here. By my reading, a browser that ships with the default behavior of blocking popups violates Scoble's first, second, third, and sixth guidelines (there are 8 in total). But blocking popups is exactly what I'd want the default behavior to be, for me or grandma. Admittedly, I'm not sure I'm applying the guidelines properly. But the impression I'm left with is that they're an obstacle to widespread adoption of greasemonkey. You want grandma to download a plugin pack and click through a bunch of screens to enable/disable various features? Forget it.

This whole situation is starting to remind me of the debate about whether to accept or reject non-well-formed XML. Eww.

January 30, 2005

Today's Google search gone awry

What happens when you click on this link and then mouse over the guy's picture? Something like what's shown below:

[Image: before]
[Image: after]

In fact, there's a whole company full of people with pages like this. Here's their staff directory. Oh, and don't forget winky.

January 11, 2005

What I learned this morning

The blogosphere cannot resist giant rocks.

January 05, 2005

NYTimes interstitials

It seems like the New York Times is doing interstitial ads more often these days. If you want to avoid the flash-based monstrosity of the ad without watching it all the way through, just hit reload and you'll get the article right away.

Movable Type's Useless Guide to Comment Spam

Movable Type has announced a document all about fighting spam on blogs. They're chasing their customers away, and it's just dumb.

They're writing a whole long guide about installing this plugin and that plugin and whatever. Don't tell me to go install a plugin, ship your damn software in the recommended configuration. MT has lost touch with how real people use their software. That's made more clear every time I have to go clear out comment spam. MT's crap comment interface makes removing the comment and then banning the IP a multi-step process. Does anyone at MT admin their own blog?

But anyway, MT has learned at least one lesson: they've turned off trackbacks on their announcements blog. So no more embarrassing trackbacks from dissatisfied customers like me. Way to go, MT! Why solve the problem when you can hide it?

December 28, 2004

Civic Zeitgeist

As many have already remarked, Google's end of year zeitgeist is out. In the image search results, the 7th most popular car to search for is the Honda Civic. The Civic edged out cars like the Mitsubishi Eclipse (#10) and Mustang (#8), and slotted in right after Mercedes (#6) and Porsche (#5). OK, great, but: There are currently about 500 billion Honda Civics out there, so if people want to see a Civic, why don't they just look out their freaking window?

September 13, 2004

Thus spaketh Cordozar Broadus

Step over on to the alternate universe. For sure!

Thanks to Joe for finding this, and to Snoop Dogg (nee Cordozar Broadus).

June 14, 2004

Floods

So here it is raining pretty heavily, and I'm sitting around reading blogs, and I saw Greg's latest blog entry, A new flood?. I was wondering how Greg posted about the storm so quickly (it isn't even done raining yet), but it turns out he's writing about the huge volume of spam he's receiving.

About the spam: I'm seeing the volume of spam rise as well, though I've also gotten my address posted on the web in a few places recently (while trying to rent my apartment). I used to get 10-15 spams per day, but these days I'm in the 30-40 range. Whatever happened to that CAN-SPAM act? Didn't the government solve the spam problem last year?

About the storm: 'tis the season for thunderstorms, I guess. The storm seems to be done now, and the temperature has fallen from 81 degrees to 68. Ahh. Now to go downstairs and see whether the basement has flooded...

April 23, 2004

More news from around the web

From my parents: Pneumatic Vacuum Elevators - Home Elevator Sales, Service, and Installation. Residential Elevators for new or existing homes. You know those pneumatic tubes they used to use in stores and offices? Well, now they do that for people. It's like Futurama! Key quote from the web site: "on electricity cut-off the vacuum elevator automatically descends to ground floor." Yeah.

And from my sister: a knitted elvis wig pattern. Face it, you've always wanted Elvis hair. If you can't (or don't want to) grow it, now you can knit it.

March 19, 2004

LOAF: social networking over email

Loaf is a piece of software that adds social networking to your email. Why? Because a lot of interesting things can be done (e.g. spam filtering) when you know where in your social network an email came from (or whether the sender is in your network at all). The beautiful thing is that it doesn't compromise privacy and it doesn't require you to send a huge amount of data with each email. Check it out, as they say.

Many2many also has a short post about LOAF here.
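As I understand it, the trick behind LOAF is a Bloom filter: a compact bit array built from your address book that can answer "is this address probably in here?" without listing the addresses themselves. What follows is a generic Bloom filter sketch to show the idea, not LOAF's actual code; the filter size, the number of hashes, and the use of SHA-256 are my own assumptions.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4):
        # size_bits should be a multiple of 8 so it maps cleanly onto bytes
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive several bit positions per item by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

You'd `add()` every address in your address book and ship the resulting bits along with your mail. A recipient can then test whether a stranger's address is (probably) known to one of their correspondents, but can't read the address book back out of the filter.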

February 17, 2004

Information Wants to be Free (of Attribution)

Dare to compare my December '03 weblog entry covering an essay by Reed Hundt about broadband in the U.S. with a Slashdot story today about the same essay. The person who submitted the article to Slashdot seems to have magically arrived at almost precisely the same blurb about the essay that I used a couple months earlier. Mostly, I'm just happy to have a little validation of my blurb-writing skills, but I would have liked some credit. The copyright on this site allows anyone to use any of my material, but requires attribution.

What makes this incident particularly precious (well, coincidental anyway) is that this afternoon, I led a discussion in a seminar using a set of powerpoint slides that someone mistakenly left out on a public web server. I found them through a Google search, and they were just the thing for the seminar, but I should have asked the author for permission to use them. So yeah, I'm a big hypocrite, what can you do?

February 11, 2004

Comcast wants to buy Disney

Cause, you know, that AOL/TimeWarner thing worked out so well. Here's the NY Times article.

February 05, 2004

Douglas Adams' How to Stop Worrying and Learn to Love the Internet

Thanks to Doc Searls for linking to this essay by Douglas Adams, which is absolutely sublime. I quote:

1) everything that’s already in the world when you’re born is just normal;

2) anything that gets invented between then and before you turn thirty is incredibly exciting and creative and with any luck you can make a career out of it;

3) anything that gets invented after you’re thirty is against the natural order of things and the beginning of the end of civilisation as we know it until it’s been around for about ten years when it gradually turns out to be alright really.

And he goes on to put the Internet in context as something we don't understand, and perhaps won't understand until some people grow up with it as a fact of life rather than the cool new thing. I'm only 29, but that just makes me feel old.

January 23, 2004

Maltese dog lovers of the world, untie!

Maltese dog breeders of quality pet and show Maltese puppies is a creepy web site about a kind of tiny, harmless dog. Tonight when I go to sleep, I'm going to see a tiny maltese marching inexorably toward me. (Shudder).

Also, scroll down for such notable quotables as "Many thanks to David Fitzpatrick for his gentle and loving handling."

January 01, 2004

the art of friendster pictures and authentication


So here we are in the age of social networking, and we're all publishing profiles of ourselves (say, on Friendster or wherever) that give complete strangers a pretty good idea of our likes, dislikes, habits, friends, etc. Talk about stalker-bait. So people use pseudonyms and obfuscation to keep their Internet stalkers and their real-world stalkers separate.

But people still want their friends to be able to recognize them. Some include pictures in their profile that only show a bit of themselves. The pics give enough detail that friends know who it is, but everyone else just sees an almost random image. It's a sort of authentication, but targeted at a very specific set of people: those who know us well in real life.

As we contemplate publishing even more information about ourselves (e.g. location awareness), we're either going to have to give up on keeping our various identities (real and Internet, business and personal) separate, or else social networking services are going to have to provide a more general, effective way of managing our identities than authentication-by-picture.

December 14, 2003

Reed Hundt on "big broadband"

In this essay, Reed Hundt talks about building a 10 to 100 Mbps network for every household in the U.S. He makes a great case for why it should be done and how we can pay for it.

What's interesting about this piece is that Hundt advocates a new approach to universal service. Instead of giving away broadcast spectrum (for HDTV) and maintaining (ancient, inflexible) phone lines, we should spend money on building out a next-generation fiber network to every household, and run both HDTV and phone over that network. Then we can stop funding the phone network (which is pretty much maxed out anyway) and sell off the HDTV spectrum for tens of billions of dollars.

I think it's a great vision, because a single, public network is a huge win for the same reasons that cities decided to only keep a single set of electric lines or a single set of gas lines. After all, you know something's wrong when Verizon is able to actively screw its competitors because it owns the phone lines. Last I checked, Verizon was selling DSL to end-users for $35 per month while it was selling DSL wholesale to its competitors for $40 per month. A publicly-owned network could fix that.

December 13, 2003

Presidential Candidate Blogs

A round-up of candidate blogs:
Clark
Dean
Edwards
Gephardt
Kerry
Kucinich
Lieberman
Moseley Braun (no blog found, so here's the campaign web site)
Sharpton (no blog found, so here's the campaign web site)

I'll leave it up to you whether a blog written just for the publicity is still a blog.

December 11, 2003

What would you do with 100 Mbps?

The 100x100 Network is a new project I'm part of at CMU. The goal is to build a network capable of delivering 100 times more bandwidth than most DSL or cable modems currently deliver. Among the challenges is figuring out what people would use all that bandwidth for.

Friendly wiki?

TikiWiki seems like one of the best open source wikis. I do wish the docs and ui were a little cleaner, though.