Categories
Brain Buster Productivity Booster SEO and Paid Search The Marketeer

What is Scraping and how to stop it?

I’m at the Search Engine Strategies conference and we just had lunch with a team from Google who showed off some of the new webmaster tools (and I managed to get in a vote for a crawl error referral report to Vanessa Fox, but that’s another post). The topic of scraping was raised and Danny Sullivan mentioned that there will be a full session on it later in the week. My general rule is not to blog during business hours but since we’ve been fighting this battle at work it’s relevant (and remember that AccuRev has the Ultimate Source Control Tool).

In our Web 2.0 world you can make money just by generating traffic and putting up Google AdSense ads. For the Ronin Marketeer, you post quality content, get the traffic and are regarded as a hero by all. Another approach for those of more flexible business ethics is to copy someone else’s content and show it as your own. This is happening more and more in the blogosphere, and is already an issue for corporate sites.

The practice of grabbing content from another website and posting it as your own is called scraping. I’ve never played with scripting this myself but there are varying degrees of automating this process. Most people come across it when they are googling themselves or their company and they get some results outside of their own domains (often blogs using a default template) that copy their content verbatim. More recently these pages often include copy from multiple websites.
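If you'd rather not eyeball a suspect page, here is a rough sketch of how you might measure how much of your post shows up on it word for word. Everything in it is my own assumption for illustration: the function names `shingles` and `copied_fraction`, the five-word window, and the idea that you already have both texts as plain strings. It is not how Google or anyone else does it.

```python
def shingles(text, size=5):
    """Return the set of overlapping word runs ("shingles") in text."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one window: treat it as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in suspect.

    1.0 means every word run of the original shows up verbatim;
    0.0 means none do. The window size is an arbitrary choice.
    """
    own = shingles(original, size)
    if not own:
        return 0.0
    return len(own & shingles(suspect, size)) / len(own)
```

A score near 1.0 on a domain that isn't yours suggests a verbatim copy, even when the scraper has stitched your post together with copy from other sites.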

So, what to do about the thieves in our midst? Adam Lasnik of Google discussed this during the panel today, and here’s a summary of the answer as I heard it:

  1. Overall, “Don’t Panic”. It’s fairly easy for Google to verify who the source is: your site published it first and your domain has been established with Google. The scraper is not established; their URL is newer and probably registered for a year or less.
  2. You can file a DMCA takedown request with Google.
  3. The takedown request is good but Adam referred to it as “swatting flies”. Your time is better spent staying the course – make sure you remain the source for your content by continuing to crank it out.

Keep in mind that in the grand scheme the majority of scraping is garbage and clutter, and anyone providing search results will continue to screen it. But then again, it’s yet another cat and mouse game for us to follow.

I’m learning some good stuff, more to follow.

Categories
Daily Life The Marketeer

It’s Sunday – that means there’s an M Show

boots

Nothing better than drinking Mulled Wine on public property at the German Christmas Market. Back to business tomorrow!

Check out your Monday M Show

Categories
Daily Life Podcasting

Why Radio Sucks

Not long ago I was travelling for business with a rental car and I had no way to get my iPod output into the stereo. After 15 minutes I was actually considering the risk of driving with headphones (and decided against it for my personal safety, I already felt at risk in my ultra-sub-mini-compact vehicle).

Reason #1 – Ads. There are tons of them. This is the golden age of podcasting, and podcasts have very few.

Reason #2 – Finite number of stations, finite choices. When’s the last time you heard a show about the best way to raise pigs or knit? Radio: every town in America has the morning zoo, talk from the left and right, country, rap, classic rock, current rock, oldies and that’s it. With podcasts, no matter what your hobby, you can find something that you really want to learn about. And even though I am 100% podcast there’s lots of Hollywood quality stuff through Audible (John Federico, thank me later).

Reason #3 – 60 GB, 9000 songs, my choice.

Reason #4 – If you are stuck waiting somewhere you can watch video on your iPod, kind of cool coming through 6 car speakers.

Bonus Reason: You may say “But John, in your simple-minded rant you forgot one thing – variety”, and to you I say “See #3 and add Smart Playlist – Random” and check out my channels on gigadial – it’s great, subscribe to the feed and I’ll throw you random new podcasts to check out. The CAPOW channel covers marketing and business stuff, New England Podcasters is everything else for the general public, and the John Wall channel has everything that’s too nerdy, radical, edgy, or adult for the (somewhat) family friendly NE Casters channel.

My channel has some All-Star geek recordings right now that include Apple Founder Steve Wozniak, Joel from Joel on Software, and Business Gurus Clayton Christensen (the Innovator’s Dilemma) and Malcolm Gladwell (The Tipping Point and Blink).
If you want just music then check out some folks doing some great stuff: Rock from Accident Hash, Hip-Hop from Julien, Chill with Anji Bee, and get your pirate Barry White style groove on with Suzy Chase.

By the way, anyone can add to those channels so if you have anything you’d like to share please add it.

Holy web bluntman, that’s a lot of links. Have a good weekend, The M Show will be out Sunday Night with some special guests…

Categories
Brain Buster Geek Stuff

Web 3.0

I mentioned yesterday that I had a chance to speak with Mike Kowalchik of Grazr and that started me thinking about the changing face of the web. This goes right along with a post of Steve Rubel’s regarding Yahoo no longer putting feeds on major pages. RSS is a way to get through content faster – it removes some of the friction in an already nearly frictionless environment.

The only problem is that we are now drowning in information – the web is being crushed under its own weight. A tool like Grazr allows readers to skip unchanged page views that would normally bear advertising messages. Once you are hooked on RSS feeds your surfing time decreases. This is a disruptive force.

I’m beginning to think that the missing link is an RSS killer app. With a program that folks on the far side of the chasm would adopt (something beyond a propeller-head newsreader), a program that makes RSS completely seamless, we will see something completely new. While Grazr may look like a widget on the surface, I think it may be the first look at something completely different.

Categories
Lead Generation Productivity Booster SalesForce.com SEO and Paid Search The Marketeer

Integrating Salesforce.com with Google Adwords

A stumbling block on the path to the holy land today: the code snippet we need to integrate SF.com with our Google Adwords campaign conflicts with some existing javascript we have on our custom web-to-lead forms. As I have no Perl skills to speak of beyond the “cut and paste somebody else’s stuff and pray it works” level, I’ve had to call in some bigger guns, i.e. Salesforce support level two and our own Ronin Coder. Perhaps there will be more luck tomorrow…

On the plus side, Joel delivered the web traffic today…

Categories
Daily Life The Marketeer

Joel Digs AccuRev

A great surprise for me this morning, forwarded by Chicago Mike: AccuRev has made the homepage of Joel on Software. This should be an interesting day as far as web traffic.

Categories
Brain Buster Geek Stuff

What a 21st Century Record Label Looks Like

I attended the WebInno event tonight over at the Royal Sonesta just across the river from Boston in Cambridge. Besides getting a chance to catch up with Andrew Bourland and Christopher Carleton, I got to see some interesting new web apps. I geeked out on RSS and OPML with Mike Kowalchik of Grazr, but that stretched my brain too far and now I have to process that for a day or two before talking about it.

The other main course was Calabash Music which was demoed by Brad Powell. They focus on World Music, and the interesting thing was that they have a mini player that bands can host on their own site which has both a playlist and integrated purchasing mechanism. You can listen to the tunes, and click to purchase the track. He mentioned that they already have a deal going with National Geographic (who hosts their podcasts with LibSyn). Very cool stuff, sort of a CD Baby without the CDs. Is this the record label of tomorrow?

Categories
Daily Life Geek Stuff

Where to check out new Web Stuff in Boston

I found out about the Web Innovators Group through Brian Owen of Masthead Venture Partners at the Nantucket Conference back in the spring. The latest meeting is tonight so I’ll fill you in on anything cool. If you are in the Boston area just go over to their Wiki and sign up.

I may even grab some audio for The M Show.

Categories
Email Marketing Lead Generation Productivity Booster The Marketeer

Email is as dead as direct mail

That is – not dead at all. Today was a big email day for me, sending out two blasts. I’m currently using ConstantContact, which is the best value for the price – free to start and not expensive after that. I’ve used ExactTarget, which is a great product (and perhaps in my future due to integration with Salesforce.com), and in fact Chris Baggott from over there is coming out with a book next year and if some of my pieces make the editorial cut I’ll be published there.

Contrary to what you may hear, email is very much alive, and so is direct mail, as I can tell from the 35 catalogs that have come in through the mailslot at home in the past week. Perhaps no longer the silver bullet, these tactics still deliver.

ConstantContact has some benchmark figures across the service that are interesting: Global Bounces are at 18.3% (although probably understated since I get some Out of Office messages direct to me), opens at 37%, and clicks at 8.9%. I do better on bounces, lower on opens, and much better on clicks. My personal mailing list (for The M Show, listen now!) has under 1,000 names but performs far better (9x cleaner, 4x the clicks). This is quite normal for smaller lists. I have more stats on that but I’m not going to dig them up now, so leave me a comment if you want more.
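For the curious, here is a back-of-the-envelope sketch of what those multipliers would work out to if you measure them against the global benchmarks. That baseline is an assumption on my part, as is reading “9x cleaner” as one ninth of the bounce rate; the function name `implied_rates` is made up for this example and these are not figures pulled from a real report.

```python
# Service-wide benchmark figures quoted above, in percent.
GLOBAL_BOUNCE_RATE = 18.3
GLOBAL_CLICK_RATE = 8.9


def implied_rates(bounce_improvement=9, click_multiplier=4):
    """Rates implied by the multipliers, measured against the benchmarks.

    Assumes "9x cleaner" means one ninth of the benchmark bounce rate
    and "4x clicks" means four times the benchmark click rate.
    """
    return {
        "bounce_pct": round(GLOBAL_BOUNCE_RATE / bounce_improvement, 1),
        "click_pct": round(GLOBAL_CLICK_RATE * click_multiplier, 1),
    }
```

Under those assumptions the small list would bounce at roughly 2% and click at roughly 36%.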

Categories
Geek Stuff Graphic Design Productivity Booster The Marketeer

Browse Fonts, Preview Fonts, Manage Fonts, Ahhhh

I’ve spent the past 5 years looking for a font browser. Like any other Ronin Marketeer, I have a portable hard drive of digital tools that I’ve gathered. It includes an insane number of fonts, now over 10,000. Yesterday I again hit a point where I was so frustrated that I decided to take a time out to see if there were any tools out there.

I found Suitcase by Extensis. This gift from the gods allows you to grab a folder and it will build a library, complete with samples to view of all the fonts in the subfolders. I’m doing the 30-day trial but unless it does something ridiculous I’ll be a new customer in no time.