Wednesday, October 28, 2020

What to do with 6 MIL + pages

I’m an in-house SEO working for a really old company that’s been around for decades and has lots of different facets to it.

There’s a real legacy issue with the website. Hundreds of people have had access to and autonomy over the site, and there’s so much crap. They’ve also used the site as an intranet and, though they’re pretty hard to find, there are notes from meetings, HR docs and much more live on the site.

I’m running a crawl now, after noticing that www pages are linking to non-www pages, so I need to get everything onto one hostname (either www or non-www). I can’t do that until I know the extent of the issue. Historically I’ve not been able to crawl the full site because of time constraints.

So I’ve always crawled specific subfolders and tackled deadweight in stages. Now I want to bite the bullet and do a full crawl because I want the full picture. But it’s up to 6 million pages now (and counting), excluding images.

When this is done, how do I even go about exporting it? Surely Excel and Google Sheets can’t handle that much data? Any advice on this would be amazing.
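For context on the size problem: an Excel worksheet tops out at 1,048,576 rows, so a 6 million row export would not open in full. One possible route is to skip the spreadsheet and stream the exported file with a script. The snippet below is only a minimal sketch, assuming the crawl is exported as a UTF-8 CSV with a header row and a URL column named `Address` (the column name, the file name and the `count_hosts` helper are all assumptions for illustration, not something confirmed by the tool). It reads one row at a time, so memory use stays small, and it counts URLs per hostname to show how the www and non-www versions are split.

```python
import csv
from collections import Counter
from urllib.parse import urlsplit


def count_hosts(csv_path, url_column="Address"):
    """Stream a large CSV row by row and count URLs per hostname.

    `url_column` is an assumed column name; change it to match
    whatever header the real export uses.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader yields one row at a time, so the whole file
        # is never loaded into memory.
        for row in csv.DictReader(handle):
            host = urlsplit(row[url_column]).hostname or ""
            counts[host] += 1
    return counts
```

Calling `count_hosts("crawl_export.csv")` (a hypothetical file name) returns a `Counter`, and `.most_common()` on the result lists hostnames by number of URLs. The same loop could write the non-www rows out to a smaller CSV that a spreadsheet can open.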

Thank you!

PS: I’m using Screaming Frog.

submitted by /u/Sick_Turtle

from Search Engine Optimization: The Latest SEO News https://www.reddit.com/r/SEO/comments/jjm0l0/what_to_do_with_6_mil_pages/
