Hi there
Intro to Archive Depots
I'm Jase, a solutions engineer at Perforce, and today I want to talk about an exciting new feature in Helix Core 2023.2: support for archive depots backed by S3 storage. This is something that's been requested for quite a while, and I'm really excited about it. Archive depots have existed in Helix Core for a long time; they let you take the actual file data for files stored on your server and move it somewhere else. The idea is that this could be some kind of slower storage, like a spinning disk, or even a volume that you detach, snapshot, and put away for safekeeping. All the metadata stays on your server.
So you'll still see the history of those files, but you won't be able to access the file data itself, and it's no longer taking up space on your main, fast, directly-mounted storage, which, especially in the cloud, also costs more money.
Instead, you can set up your archive depot so that it stores the data directly on S3 storage. That could be AWS's S3, but it could also be any other S3-compatible storage, as long as it follows the standard S3 API, which most of them do.
Object storage like S3 is much cheaper than block storage (standard disk storage), often on the order of one fifth to one tenth of the price. And because it's in S3, it will often have extra redundancy and backups built in, though that depends on your cloud provider.
I'm not going to get into all the details on that; you'll have to check with your provider and see what kind of durability guarantees they offer for that data. But by putting it in S3, rather than having to put it on a drive, zip it up, detach it, store it somewhere, and then bring it back, you can just keep that S3 storage attached all the time. If you ever need those files back, you can run the p4 restore command to restore files that you've previously archived. This is a really exciting feature.
So I just want to quickly walk you through DigitalOcean, and then we'll continue on with how to use it, which is the same regardless of provider once you have the storage set up.
DigitalOcean: Create Spaces Bucket
For DigitalOcean, the setup is quite a bit simpler than AWS, for better and for worse. It's nice because it's simpler to set up; the downside is that you have less granular control over your security.
First, within your project, on the left side, go to Spaces Object Storage. This is DigitalOcean's S3-compatible storage, and if you don't have a subscription yet, you'll need to sign up for one to get access to Spaces. You'll see I already have a bucket, but I'm going to create a new one.
I'm going to click Create Spaces Bucket right here, and then choose which data center I want. There are others you can choose around the world, but I'll just stick with this San Francisco one for now. Then I choose a unique Spaces bucket name; Jase S3 archive demo will be the name of my bucket. The name has to be unique across all buckets, so put your name or something in there to make sure it will be. Put it inside of a project (I'm putting this in my demo project here), and then click Create.
My bucket has now been created, and you'll see I don't have any files in it at the moment. A couple of important things to note. The first is this Endpoint up here; this is the URL we're going to need later.
DigitalOcean: Create Access Key
Then we need to get our access key and our secret key. We get those by going down to API in the left menu, and then, in our APIs, going to Spaces Keys at the top. I'm just going to generate a new key, and this one I will call S3 Demo access key, then click Create Access Key. This name does not have to be globally unique; it just has to be unique on your own account. Now we get an access key, which I'm going to copy and paste off screen so I have it for later, and then my secret key, which I will also copy. It's important to note that you only get one chance to copy the secret key: if you miss it here, it will not be shown again. So I'm also going to paste this into my notepad, and I'm going to delete the key after I'm done here, so it's okay that you've seen it, but normally don't let anyone else see this. If you did forget to copy it, you can come back here and regenerate the key; that will invalidate any previous keys you're already using, but you can just swap in the new one if you ever lose access to a key or fear it's been compromised, like showing it in a video. And that's it. The setup here is quite simple.
P4: Archive Depot Creation
Now we're going to go to P4 to create our archive depot, which we will then link to whatever S3 storage we're using.
If you go to the Helix Core Administrator Guide at perforce.com, you'll find a page about reclaiming disk space by archiving files. You can search for that up here, or you can find it in the navigation on the left, under Manage server and its resources. This page talks about archive depots generally, but toward the bottom there's a new section, S3 storage for archive depots, which covers the different keys we're going to need and how to set this up.
So I'm going to go through these steps and show you, but just so you know, you can come back and reference this here.
So, what I will do first is go to P4V, right-click, and open a terminal window from there. This injects all of the connection environment variables for me, so I know I'm connected to the right server. If you only ever connect to one server, you might not need to do that, but I find it helpful.
Now the first thing we need to do is create an archive depot. You can create an archive depot through P4Admin, but as of the time of this recording, it does not yet have the capability to set one up for S3.
So we're going to do this through the command line. I'm going to create a depot with the p4 depot command, and I'm going to add -t archive, which means this will be an archive-type depot. Then I give it a name; this time I'll say s3-archive-demo is the name of my depot. I hit Enter, and this brings up the default text editor for my system.
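For reference, that command looks like this (s3-archive-demo is just the name I picked for this demo; yours can be whatever you like):

    # Create a new depot of type "archive" and open its spec in the editor
    p4 depot -t archive s3-archive-demo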
When we open this up, you can see it has the name we gave it in the command and its type, archive. The Map field is normally what determines where the files are actually stored in the file system, but we're going to set this one up using the Address field. If I go back to the documentation, I can see an example there, and we'll just go through it together.
The specifics of how we configure this archive depot will be a little different depending on the S3 storage provider. I'll go through how you would set this up for DigitalOcean, and then we'll move on to actually using the archive depot.
DigitalOcean: Archive Depot Setup
The first thing is s3, then a comma. Next is the region field. This is not necessary for DigitalOcean; I'm actually not sure it's necessary for anything except AWS, so I'm going to remove it. What we do need to add is the url field. So I type a comma (no spaces), then url, then a colon, and this is where I paste the origin endpoint we got from the bucket page. This is my archive demo bucket that I made; I copy the endpoint, come back to my text editor, and paste it in here for url, and you'll see it's a digitaloceanspaces.com address. Next I need my bucket name, which is just the first part of that same URL.
So I grab it from there and paste it in (it wraps around the line here). Then for my access key, that's the first value I copied earlier when I created the key; again, no spaces. Then my secret key, that second, longer value I only got to see once; I paste that there, make sure there are no stray spaces, and then hit Ctrl+S to save and Ctrl+Q to quit. In my terminal I should see confirmation that the depot was saved.
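Putting all of that together, the spec I saved looked roughly like this, trimmed to the relevant fields. Everything below is a placeholder sketch: the endpoint, bucket name, and keys stand in for the ones you copied from DigitalOcean, and you should double-check the exact comma-separated key names against the S3 storage section of the Administrator Guide.

    Depot:   s3-archive-demo
    Type:    archive
    Address: s3,url:https://my-bucket.sfo3.digitaloceanspaces.com,bucket:my-bucket,accessKey:DO_ACCESS_KEY,secretKey:DO_SECRET_KEY
    Map:     s3-archive-demo/...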
P4: Archiving Files
Now if I type p4 depots, I can see all of my depots, including this new s3-archive-demo.
I'm going to clear my screen, and let's look at the documentation real quick. Further up on the same page we were on, there are some instructions about creating an archive depot and restoring files from it, but I just want to grab this link to go to the command line reference for p4 archive.
The p4 archive command is the one we're going to use to actually send files over to the archive depot. I definitely recommend reading through this page so you understand all of the options, but here are some important ones: the -z flag, which makes it also store files that have been branched to or from another revision.
The -h flag, which says do not archive head revisions. This is very useful if you want to archive all the history but still have easy online access to the latest revision of everything.
One other flag worth noting is -t, which will also archive text files. By default, the command only archives binary files (or text files that are stored as binary data). Normally text files are small enough that you're not going to gain much by moving them off to an archive depot, but if you want to, you can use the -t flag for that as well.
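As a quick sketch of how those flags combine: if you wanted to push all of a project's history into S3 but keep the head revisions online and fast, something like this would do it (the depot path here is hypothetical):

    # Archive every revision except the head revisions under this path,
    # including branched revisions (-z) and text files (-t)
    p4 archive -D s3-archive-demo -h -z -t "//MyProject/Assets/..."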
So I could archive this entire depot, but for this example I'm just going to archive one particular folder inside it. If I navigate to it in my depot view here on the left, I get the depot path up here, and I'm going to copy that. The command will be p4 archive -D (uppercase D), and then I have to give the name of the archive depot I want to send it to, which is s3-archive-demo.
Then I can add some of those flags if I want. In my case, I'll add -t and -z to say I want branched files and text files too, just to be thorough. Then I'm going to paste in that depot path we copied.
So I copy that, come here, and paste it in. I do need to put it in quotes because I have a space in my path, which is generally a bad idea for exactly this reason. Then I add ... at the end and hit Enter.
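So the full command I ran was along these lines (the path below is a stand-in for the one I copied out of P4V; note the quotes because of the space):

    # Send all revisions under this folder, including branched revisions
    # and text files, to the S3-backed archive depot
    p4 archive -D s3-archive-demo -t -z "//old game/video1/..."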
Now it's running through and archiving those files off to S3 storage. Depending on the size of the files, and on whether your storage is on the same service as your server (for example, if your server is on AWS and your S3 bucket is also on AWS), this can be very fast. In this case it's having to upload them to store them. You can see it came back, and these were some large-ish video and audio files. If I come back now, these files are gone from here, because I've archived them and they're no longer taking up space on my server. I find this especially useful for archiving an entire project, or for archiving the entire history of a particularly large set of files so that you just keep the head revision.
P4: Restoring Archived Files
And now I can continue working. If I ever decide I need these files back, I can restore them.
Let's go back one more time to the documentation. This time I'm going to click the link for the p4 restore command. You can also get to this directly from the Helix Core command line reference, which is one of my main bookmarks that I use all the time. Again, you can read through the options; this command is generally a lot simpler. You just specify the depot from which you want to restore, and then you give the same kind of path for which files you want to put back.
You don't have to get everything back out of the archive if you don't want to. Let's go through that here real quick.
Now, when you want to restore, it's generally a best practice to first use the p4 verify command, like this: p4 verify -A, to tell it you want to verify an archive depot, and then the archive depot name, //s3-archive-demo/..., or we could give a more specific path to verify.
What that does is make sure no corruption has happened to those files while they've been in storage. Honestly, this is less of a concern, in my opinion, with cloud storage. But if you had this on a physical disk that you stuck in a closet for several years, it's very possible that data would degrade over time. Verify checks for that and makes sure you're not going to bring back files that have some kind of data corruption in them.
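As a concrete sketch, verifying everything in the archive depot would look like this (you could narrow the path down to just the files you plan to restore):

    # Check the archived revisions for corruption before restoring
    p4 verify -A //s3-archive-demo/...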
Now, to get the files back, we're going to run a very similar command: p4 restore -D (again uppercase D), then the name of our archive depot, and then the path of where those files are going to go back to.
In this case, that's the same path I used up above, so I'm just going to copy the same path we had before. If this were a whole project, we could also say we just want everything back. Here we only archived this one folder, but let's say we had gradually archived various folders over time; what I want to say now is: everything that used to be inside this old game depot, get me back all of that. In this case it will just be the same files. And of course I have a typo here, which is why I like to copy and paste: the depot is called s3-archive-demo, not depo. I hit Enter, and you can see it restoring the files. That went much faster than it did when it was uploading them. Now if I come back to P4V and hit refresh, I will see that this video1 folder is back.
All those files are back here, and if I were to sync my workspace, I would get them all back locally.
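To recap, the restore command mirrors the archive one, something along these lines (again, the path is a stand-in for whatever you archived earlier):

    # Pull the archived revisions back out of the S3-backed archive depot
    p4 restore -D s3-archive-demo "//old game/..."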
Conclusion
I hope this detailed tutorial on S3 storage for archive depots in Helix Core was helpful for you. Please let us know in the comments if there are more things you'd like to see or if you have questions. And of course, check out our YouTube channel for more videos, and go to perforce.com for the latest updates and software downloads, as well as all the documentation I mentioned in this video.