I Want

One feature I’d really like for my blog is a micro web.archive.org that just caches the pages I link to (along with any graphics, frames, embedded junk, stylesheets and whatever else they might contain that affects how they display), so that I can have blosxom automagically redirect the link to my cached copy if and when the link goes stale.

Only problem is there doesn’t seem to be any software around that can spider just a single page and all the gumph that’s on it, but not anything it links to. Worse, writing one myself is a Hard Problem, since telling a page’s own bits apart from the things it merely links to requires a real HTML parser. Oh well.
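To give a sense of what that parser has to do, here is a minimal sketch in Python using the standard library's `html.parser`. It pulls out the URLs a page needs in order to display (images, scripts, frames, stylesheets) and skips ordinary `<a>` links. The names `RequisiteFinder`, `find_requisites` and `REQUISITE_ATTRS` are made up for illustration, and the tag list is deliberately incomplete: it ignores things like CSS `@import` and `url(...)` references, which is a good part of why the real job is hard.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tag -> attribute pairs treated as "part of the page".
# Illustrative and incomplete; a real tool would need a longer list.
REQUISITE_ATTRS = {
    "img": "src",
    "script": "src",
    "frame": "src",
    "iframe": "src",
    "embed": "src",
}


class RequisiteFinder(HTMLParser):
    """Collect URLs a page needs to display, ignoring ordinary links."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.requisites = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            # Only stylesheets count; other <link> relations are just links.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel and attrs.get("href"):
                self.requisites.append(urljoin(self.base_url, attrs["href"]))
        elif tag in REQUISITE_ATTRS:
            value = attrs.get(REQUISITE_ATTRS[tag])
            if value:
                self.requisites.append(urljoin(self.base_url, value))


def find_requisites(html, base_url):
    """Return absolute URLs of the page's requisites, in document order."""
    finder = RequisiteFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.requisites
```

Feeding it a page's HTML and the page's own URL gives back a list of absolute URLs to fetch; anything reached only through an `<a href>` is left alone.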

UPDATE 2003/11/14:

So Clinton pointed me at wget’s -p (--page-requisites) option, which does what I want: it fetches a page along with everything needed to display it. How cool! A modicum of futzing around is required but this is actually doable. Sweet.
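A rough sketch of how the caching step might be driven from a script, in Python. Only `-p` comes from the tip above; the other wget options (`--convert-links`, `--span-hosts`, `--directory-prefix`) are real wget flags but are my guess at the sort of futzing involved, not a tested recipe. The function names and the cache directory are hypothetical.

```python
import subprocess


def wget_page_command(url, cache_dir):
    """Build a wget command that fetches one page plus what it needs to display."""
    return [
        "wget",
        "--page-requisites",  # -p: also fetch images, stylesheets, and so on
        "--convert-links",    # -k: rewrite links so the local copy works on its own
        "--span-hosts",       # -H: allow requisites that live on other hosts
        "--directory-prefix=" + cache_dir,  # -P: where the cached copy goes
        url,
    ]


def cache_page(url, cache_dir):
    """Run wget on the URL; return True if wget exited successfully."""
    result = subprocess.run(wget_page_command(url, cache_dir))
    return result.returncode == 0
```

Calling `cache_page("http://example.com/some/post.html", "cache")` would leave a copy under `cache/`, assuming wget is installed and on the path. Because no recursion option is passed, wget should not follow the page's ordinary links.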
