⏱ Heads Up, CodePlex Archive Going Down Soon! <420 Hours

Can you break the file up to fit Mega’s max size? All 750GB in, say, 50GB chunks?

EDIT

LOL nvm, it is max 50GB for free accounts…

Wish there was some other way…

I don’t think my torrent works anyway. Probably should have fixed it by now, but what I think I’ll do is make a website that hosts all the data once CodePlex goes down.

I’ll also curate the data so there’s a downloadable version with all projects (750 GB) and a version with some of the biggest files (the ones that are also on GitHub) removed. That should cut down the size to less than 30 GB.

I’ll have it up before CodePlex goes down, but for right now I’ll be taking a break from this.

@MrValentine Let me know if you want me to put it all up on Mega in the meantime, that way CodePlex isn’t lost even if my house burns down or I die or get amnesia or something. I can do that pretty quickly.

Happy to store it, currently sitting on roughly 30TB empty… I think removing any projects is not important and would be an insane amount of work anyway.

Thanks for your skillset and efforts :heavy_check_mark::rice_ball:

Sure, the only reason I’m good at this is that I’ve made many web scrapers before, I think at least 5, for some reason. Every time I get a little better. I first started by downloading HTML as a string and using string.IndexOf() to look for data, then I realized APIs are a thing. I will say, I’m not sure if a Selenium web scraper has the same limitations as an API-based scraper when it comes to API call restrictions. I’m thinking a Selenium scraper may bypass call restrictions by virtue of imitating a user, but that’s some techno jargon.
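For anyone curious what the jump from string searching to something sturdier looks like, here is a minimal sketch in Python rather than C#. The page fragment, the `title` class, and the function names are all made up for illustration; `str.find` stands in for `string.IndexOf()`, and the standard-library `html.parser` stands in for a real parser or API.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, invented for this example.
PAGE = '<html><body><h1 class="title">My Project</h1><p>Downloads: 42</p></body></html>'


def title_by_index(html: str) -> str:
    """String-search approach: take the text between two exact markers."""
    start_marker = '<h1 class="title">'
    start = html.find(start_marker)  # Python's analogue of string.IndexOf()
    if start == -1:
        return ""
    start += len(start_marker)
    end = html.find("</h1>", start)
    return html[start:end] if end != -1 else ""


class TitleParser(HTMLParser):
    """Parser approach: react to tags instead of exact marker strings."""

    def __init__(self) -> None:
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, so attribute order is irrelevant.
        if tag == "h1" and ("class", "title") in attrs:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def title_by_parser(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    return parser.title
```

Both functions return the same title for `PAGE`. The string search stops matching as soon as the site adds another attribute to the `<h1>` tag, while the parser version keeps working, which is roughly why the IndexOf approach tends to get abandoned.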

If you want to do a really accurate site rip, the most accurate you can, you have to use Selenium. It basically hooks into a modern web browser (Chrome) and gives you control over it through code. One of my favorite things to use. I tried to use it to automatically buy Bitcoin on Robinhood based off my machine learning models. XD That didn’t turn out well.

Now I’m moving on to scraping Stocktwits comments to find the most accurate investors and storing all the data in a real SQL database as opposed to one bloated XML file that has to be recreated entirely to modify at all.
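To illustrate the difference from a single XML file, here is a rough sketch using Python’s built-in `sqlite3` module. The `comments` table, its columns, and the helper names are assumptions for the example, not the actual schema of the scraper; the point is only that one row can be changed in place without rewriting everything else.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments ("
        " id INTEGER PRIMARY KEY,"
        " author TEXT NOT NULL,"
        " body TEXT NOT NULL,"
        " score REAL NOT NULL DEFAULT 0)"
    )
    return conn


def add_comment(conn: sqlite3.Connection, author: str, body: str) -> int:
    """Insert one comment and return its row id."""
    cur = conn.execute(
        "INSERT INTO comments (author, body) VALUES (?, ?)", (author, body)
    )
    conn.commit()
    return cur.lastrowid


def set_score(conn: sqlite3.Connection, comment_id: int, score: float) -> None:
    """Update a single row in place; no other data is rewritten."""
    conn.execute("UPDATE comments SET score = ? WHERE id = ?", (score, comment_id))
    conn.commit()


def top_authors(conn: sqlite3.Connection, limit: int = 5) -> list:
    """Rank authors by the average score of their comments."""
    return conn.execute(
        "SELECT author, AVG(score) AS avg_score FROM comments"
        " GROUP BY author ORDER BY avg_score DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Passing a file path instead of `":memory:"` keeps the data between runs, and the `?` placeholders keep scraped text from being interpreted as SQL.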

Did you hear that the reason for GTA Online’s long loading screen is one single extremely poorly done routine that parses one roughly 10 MB JSON file really inefficiently every time the game starts up? It makes me hurt inside.

Anyway, hopefully 7-Zip and Mega will be done compressing/uploading the data… tomorrow.

Thanks for the insight :tea:

That company, oh boy, yes, they managed to make something somewhat impressive, but on the fundamentals, and now the whole gambling system… nah, avoiding…

@jamie_yello Another one?

EDIT

Found something useful for you @jamie_yello ?

It’s on Wayback

https://web.archive.org/web/20210212000649/https://shawnhargreaves.com/blogindex.html

That scraper may work but I prefer to make my own. A lot of the scraping tools I’ve run into often just don’t work and there’s no way for me to fix them. It might be a personal preference, but when something’s stopping me from getting a project done I prefer it be myself.

And BTW… I’ll have that upload completed in 7 days. You’ll have to pay €10 for Mega’s “standard” plan if you want to download it. XD

Here’s to hoping Windows doesn’t forcefully restart my computer before then. Ugh, I would switch back to PopOS, but as a developer you kind of need Windows.

Really prefer local, wayback is a ball ache to navigate :stuck_out_tongue:

DM me with your PayPal when ready :slight_smile:

EDIT

WBM is refusing to connect for me :thinking:

EDIT

Seems to be an issue with my landline fibre connection

EDIT

The website refuses to connect :frowning: will try the laptop to see if that helps

EDIT

Nope, so weird… it’s as though my router is blocking archive.org, but pretty sure I used it not long ago?

EDIT

It is my content blocker, but wtf, PHub works !?

EDIT

It appears that archive.org is considered a malicious site lol:

EDIT

Added to safe list and changed my settings as so:

Wondering if it blocks ads from certain social sites now lol

I’ll share that site as well then just so we can be safe

XNA | Catalin ZZ (catalinzima.com)

:joy: just found this gem

Wow, I’ll read that myself.

39% uploaded btw

Oh no, after 5 days of uploading the Mega page is lagging like hell.

Good thing I split it up into multiple files. :sunglasses: I may not be smart enough to set up a torrent but I managed to see that one coming.

Here’s 39% of the files.

https://mega.nz/folder/qhgUlTDR#TsBjqs3t6-PKeUZ6ummZlQ

OH NO, you used 7z… I hope I can open those… but I will just archive them, anyway, downloading one at a time…

Any luck with that site?

Also Mega is a bit broken for me but I think I am making headway, not sure where it is downloading to though…

EDIT

Mega crashed my browser

Literally cannot access it now :frowning:

It asked for permission to use local storage and no idea where that is and then my browser derped

Why wouldn’t you be able to open a 7z file? It should work if you manually open 001 when they’re all in the same folder.
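If 7-Zip itself is not an option, the numbered parts can usually be stitched back together by hand. As far as I know, plain split volumes (`.001`, `.002`, …) are just consecutive byte slices of the original archive, so concatenating them in order should reproduce it. The sketch below assumes that naming scheme and that all parts sit in one folder; `find_volumes` and `join_volumes` are names made up for this example.

```python
import shutil
from pathlib import Path


def find_volumes(first_part: Path) -> list[Path]:
    """Return archive.7z.001, archive.7z.002, ... in numeric order."""
    base = first_part.with_suffix("")  # "archive.7z.001" -> "archive.7z"
    parts = [
        p
        for p in first_part.parent.glob(base.name + ".*")
        if p.suffix[1:].isdigit()  # keep only numbered volumes
    ]
    return sorted(parts, key=lambda p: int(p.suffix[1:]))


def join_volumes(first_part: Path, output: Path) -> int:
    """Concatenate all volumes into `output`; return the bytes written."""
    total = 0
    with output.open("wb") as out:
        for part in find_volumes(first_part):
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)  # streams, so large parts are fine
            total += part.stat().st_size
    return total
```

Something like `join_volumes(Path("codeplex.7z.001"), Path("codeplex.7z"))` would then give a single archive (the file name here is only a placeholder). It needs as much free disk space again as the parts take up, and the result still has to be opened with a tool that understands the 7z format.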

Have you tried Chrome or Firefox?

Unfortunately, I cannot install them for reasons…

I cannot access that site anymore…

image

I am not installing that… I have 4.5TB of space locally, it’s annoying…

I don’t know where this space it mentions is

What file sharing method would you suggest then?

I’ll keep them up on Mega as well

I don’t understand why I cannot just download, and they gated their download options, would pay, but I restrict what apps I use for a reason.

I can pass you a licence for Office which gives 1TB? [please use .zip :stuck_out_tongue: ]

EDIT

I can also paypal you for a one year office personal licence

EDIT

Two large zip files I have on OneDrive are 4.68-4.69GB in size which work fine, so, maybe do 5GB chunks to be safe?

Very mysterious.

I hope you aren’t paying too much for that, but if you want I’ll accept. There are other options, but if you are very selective you might not want to use some obscure thing off the internet.

Before you do that, I will check my school’s site to see if I already have an account.

So Office costs either $70 for one year, or $7 a month. I can get the $7 plan and just cancel once you’re done downloading. If you were to send me a key, you would have to pay the full $70?

I can pay the $7 myself, that’s less than what I spent on my lunch today, that’s no big deal.

I’ll split it up into 5 GB zip files as well
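One way to get ordinary .zip files that each open on their own (instead of numbered volumes) is to group whole projects into batches that stay under the limit before zipping. Here is a rough sketch of that grouping step, using a greedy first-fit on uncompressed sizes. The function name and the idea of one entry per project are assumptions for illustration, and raw sizes are only an estimate of the final zip sizes.

```python
def plan_batches(sizes: dict[str, int], limit: int) -> list[list[str]]:
    """Group items into batches whose total size stays within `limit`.

    `sizes` maps an item name (e.g. a project folder) to its size in bytes.
    """
    batches: list[list[str]] = []
    remaining: list[int] = []  # free space left in each batch

    # Placing the largest items first tends to pack tighter (first-fit decreasing).
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > limit:
            raise ValueError(f"{name} is larger than the batch limit")
        for i, free in enumerate(remaining):
            if size <= free:
                batches[i].append(name)
                remaining[i] -= size
                break
        else:
            # No existing batch has room, so start a new one.
            batches.append([name])
            remaining.append(limit - size)
    return batches
```

With a limit of `5 * 1024**3` bytes, each returned list can be written to its own zip. A single project bigger than the limit raises an error rather than being cut in half, so those would still need the split-volume route.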

If you are ok with that, :bowing_woman::pray:

Who knew replicating an archive could be so challenging in 2021, despite my negligence in questionable software…