Tape still beats SSDs and hard drives when it comes to price per byte (techradar.com)
230 points by jaytaylor on Sept 24, 2019 | 277 comments


If folks are curious about the benefits/drawbacks/etc of the medium, I would highly recommend checking out the media presentations given at the Library of Congress Designing Storage Architectures meeting (most slides are online -- scroll down to the "Designing Storage Architectures Meetings" section and click through to each year's list of presentations): http://www.digitalpreservation.gov/meetings/index.html

The 2019 meeting just happened and had a lot of good info about current use of tape, slides should be online in another couple weeks. (I presented there this year and work at the Internet Archive, though we do not currently use tape in our storage infrastructure.)


For the impatient, could you summarize it for the tape?


Basically, tape has very high latency for random access (on the order of minutes to hours) and supports a very low number of writes (usually a few dozen) before you need to copy it to a new medium.

The drives are also quite expensive, and if you use them for backups you need to keep a spare ready in case one drive fails while you still need the tape's data.

Lastly you need specialized tools for access since most average tools will mishandle the tape.


Not sure where you've got the 'few dozen' number from - that doesn't match my experience. Tandberg quotes "1,000,000 passes on any area of tape, equates to over 20,000 end to end passes/260 full tape backup" http://www.tandbergdata.com/default/assets/File/Data_Sheets/...


At least from what I've been told and from experience: if you rotate master backup tapes every few months, a tape will be almost 6 years old by the time it reaches its first two dozen end-to-end writes. By then there are usually newer LTO standards, newer drives and newer tapes on the market to replace them with.


> 1,000,000 passes on any area of tape, equates to over 20,000 end to end passes

What in the world does that first number mean?

My best guess is that it's some nonsense math based on the fact that the tape head contacts 32 tracks at once. So writing a full tape is 1 "full pass", which equals 208 "end to end passes" as the tape feeds back and forth, which equals 6656 """passes on any area""".
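That guessed arithmetic can be sanity-checked; the 32-track head and 208 end-to-end wraps per full tape come from the guess above, not from the datasheet:

```python
# Sanity-checking the guessed arithmetic above; the 32-track head and
# 208 end-to-end wraps per full tape are assumptions, not datasheet facts.
tracks_per_pass = 32
wraps_per_full_tape = 208

# "Passes on any area" consumed by one full tape backup, under this reading:
area_passes_per_full_tape = wraps_per_full_tape * tracks_per_pass
print(area_passes_per_full_tape)  # 6656

# Rated life divided by that per-backup wear:
rated_area_passes = 1_000_000
print(rated_area_passes // area_passes_per_full_tape)  # 150
```

Note this yields roughly 150 full backups, not the 260 in the quote, so even under this reading the vendor's numbers don't quite line up.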


My LTO-7 drive needs about 2 minutes to go from one end of the tape to the other, so that is the maximum latency when the tape is already loaded. Adding another minute to manually find and load a tape brings it to about a 3 minute latency.

So a few minutes of latency in the worst case, yes, but certainly not hours of latency, unless you must drive a car to an off-site location, but that would be the same for a HDD or optical disc stored in another place.


A few hours of latency can be expected when a backup stretches across multiple tapes and the tapes must be recataloged (because the original backup server is dead). Having to do it manually with a single reader because your tape library is dead slows the operation down even more.


That is a problem of the backup software, not of the tapes. I have a custom program that generates an index with all the files on all my tapes and which updates the index after any tape write. I can determine in a second the location of any file, so I know immediately which tape cartridge must be loaded to retrieve anything.

The file index is actually a relational database table with a lot of metadata about the files, including: position in the tape collection as tape number + file number, name, path name, modification time, length, content hash (to identify duplicates) and a set of keywords and tags, to be able to search for a file even when the name is not known, or to be able to retrieve sets of files.

The files written on the tape are grouped in compressed archive files, but the database with metadata is about the files inside of those archive files.

Obviously, the file index itself is also backed up regularly.
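A minimal sketch of such an index, using SQLite; the table layout, column names, and sample row are my assumptions modeled on the metadata listed above, not the commenter's actual schema:

```python
import sqlite3

# Hypothetical tape file index; schema modeled on the metadata above.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE tape_files (
        tape_no  INTEGER,  -- cartridge number in the collection
        file_no  INTEGER,  -- archive file number on that cartridge
        name     TEXT,
        path     TEXT,
        mtime    TEXT,
        length   INTEGER,
        hash     TEXT,     -- content hash, to identify duplicates
        tags     TEXT      -- keywords, to find files whose name is unknown
    )
""")
db.execute(
    "INSERT INTO tape_files VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (7, 3, "trip.jpg", "/photos/2019/trip.jpg",
     "2019-08-01T10:00:00", 123456, "ab12cd", "photos vacation"),
)

# "Which cartridge must be loaded?" becomes a one-row lookup:
row = db.execute(
    "SELECT tape_no, file_no FROM tape_files WHERE name = ?", ("trip.jpg",)
).fetchone()
print(row)  # (7, 3)
```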


That's essentially what most backup software is doing. Restoring the index database from the tape and matching the physical tapes in the set to their logical names in the index gets complex and time consuming when you have many tapes in a set.

It gets worse when you have to use more than one set because there is too much data to do a full backup daily, and you also need to index a few incremental sets.


Restoring the index database from a tape would be needed only if the SSD where my database is normally located became defective, which should be a rare event, especially if it is replicated on a few SSDs.

In my custom program & file database, I only use physical tape names, i.e. the names that I write on the labels of the cartridges. Either the program tells me which cartridge to load in the drive, or I inform it which cartridge I have loaded (e.g. for a new cartridge). I see no reason for a logical name. If I ever copied a tape onto another and discarded the original, it would be trivial to substitute the name in the database.

Maybe for a large enterprise maintaining a comprehensive file database could be time consuming even in the best implementation, but I doubt it.

In my experience, keeping an up-to-date file database for about 120 TB of content that increases by about 10 to 20 GB per day still takes a negligible time.

While I have worked in a few large companies where a lot of data existed, in all the small- and medium-sized companies (fewer than 100 computers) that I have seen there was less backup data than that (not counting things like standard disk images with the OS and software, which do not belong in periodic backups and which were shared, being identical across various classes of employees).


Usually when someone gives latency numbers, they're not including the "server died and everything has to be rebuilt" case. Even SSDs might take days to return data in that case!

Realistically it's 3 minutes.


Plugging an SSD into a new server and restoring a file should take a few minutes, since the filesystem table is accessible instantly. With tapes you either have the catalog saved outside the media or you need to read it from the media. Depending on the type of backups taken and the amount of data, this process can take much longer than naively expected.


The difference is that you typically have many more tapes than readers.

Gmail had an event with data loss in their early days and restoring from tapes took (from memory) many weeks, if not months.


The post above was about the latency of random access to a file, not about the duration of a restoring operation.

Tapes have the fastest sequential transfer speed of any alternative, so the restore duration should be better than from anything else, if the restore software correctly plans the sequence of retrievals.

It is true that if the backup software is not clever, you could waste a lot of time determining what you must restore and where it is. However, this has nothing to do with the tapes; the same can happen with backup on HDDs if the backup software is equally bad.

I am not sure what you mean by having many more tapes than readers, as that is the point of tape storage and the cause of its low cost per terabyte.

The only way I could understand that phrase as a disadvantage for tapes is this: suppose you needed to read, say, 300 TB as fast as possible with one reader; that would take 10 days. If the 300 TB were instead stored on 25 HDDs, and you also had 25 spare computers, technicians who could open the computers and connect the HDDs inside them, and a 100 Gb/s Ethernet network from those computers to the destination of the 300 TB, then yes, you could read the 300 TB about 12 times faster than with a single tape reader.

However, such circumstances seem implausible. For such a huge organization it would be more realistic to immediately buy e.g. 3 more readers to reduce the reading time from 10 days to 2.5 days.

On the other hand, if the 25 HDDs had been located from the beginning inside a backup system, that system would have had a price with at least 5 digits, far higher than the cost of a tape backup system, and unless it were connected by 100 Gb/s or at least 25 Gb/s Ethernet, its speed could not have been much higher than that of a single tape drive.
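The 10-day figure works out if one assumes a sustained drive rate of roughly 350 MB/s (an assumption; actual rates vary by LTO generation and compression):

```python
# Rough restore-time arithmetic for the 300 TB scenario above.
# 350 MB/s sustained per tape drive is an assumed rate.
total_bytes = 300e12        # 300 TB to restore
drive_rate = 350e6          # assumed sustained bytes/s per tape drive

days_one_drive = total_bytes / drive_rate / 86400
print(round(days_one_drive, 1))      # 9.9  (about 10 days with one drive)

print(round(days_one_drive / 4, 1))  # 2.5  (with four drives)
```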


Well, they are tightly coupled. If you only have one tape to consider then fine - whatever works and the discussion would be pointless.

Also, why on earth would you need 25 spare computers for 25 drives??? 25 drives fit into a single 4U chassis. A single operation and everything is done.

Seeking a file on a hard drive takes about 7 ms. Seeking across 25 drives is also about 7 ms. But it will be quite the operation for a tape system.

You don't just "buy three more readers"; you are likely making use of tape robots, and fitting them into your infrastructure probably isn't an afternoon of work. The whole cost benefit of tapes relies greatly on the fact that you don't have to. Which is also why the latency skyrockets. You still want to perform your backups of course, maybe doubly so because whatever caused the restore might have unforeseen secondary effects, and those backups will compete for precious drive time.

I'd love to use tape, but the cost is prohibitive and you really want to verify your backups so for home use it is quite the hassle. In my opinion. But if I stumble on a great deal, sure! It'll be fun.


> why on earth would you need 25 spare computers for 25 drives??? 25 drives fit into a single 4U chassis

I have already mentioned that in other words at the end of my reply: in that case you will not get a speed 25 times higher, but maybe at most 4 times higher than a single HDD, i.e. twice as high as a single tape drive.

Maybe I misunderstood, but the initial posting mentioned a restoring operation from tapes that was slow and my point was that using HDDs is unlikely to make it faster. For a restoring operation the duration will be dominated by sequential transfers, not by seeking, so tape drives would be faster than HDDs, unless you use ridiculous resources, like in my example with a computer per HDD and with a network fast enough to aggregate the data from all computers.

If we are talking about general-purpose data storage on tapes, then obviously, what is frequently accessed should be on SSDs, because the average seek time for a file stored on tape will be between 1 minute and 2 minutes.

I suppose that there might exist applications where there exists a quantity of data so large that it would be too expensive to be stored on SSDs and with worst case latency requirements so severe that it cannot be stored on tapes, so you have to use HDDs. However, in some 40 years of using all kinds of computer systems in many companies, I have never encountered such an application. Everything I have seen could be better handled by a combination of SSDs and tapes, without any HDDs. The only problem with tapes is the prohibitive cost of drives, which limits their use to data sizes of at least 50 to 100 TB, unless long term storage reliability is more important than the cost.


All fine and dandy if you have more drives than concurrent retrieve jobs or requests...

Hours come into play when you have a queue of retrievals and many tapes...


If you are impatient, tapes are not for you.


> slides should be online in another couple weeks.

Why does it take so long?


They have to be loaded from tape.


Because someone probably has to take time out of their day job to do so. Also, if it’s like any other conference I have ever attended, half the presenters didn’t send in slides and have to be hounded to do so—and many still won’t get around to it.


The tape has to rewind.


People have something better to do instead of providing information to you for free?


To be fair, the job of the Library of Congress is providing information for free.


how do they pay for it? because if it's taxes...


Our forefathers put their tooth fairy money in a trust to maintain the sum total of human knowledge. Ever notice how people back in the day all had missing teeth? Like Washington and his wooden dentures? We owe them mightily.


Honest question:

When you ask that (particularly in places like HN), are you expecting someone to have a revelation, as if they didn't understand that taxes pay for government services? "Free" here is shorthand for "no cost at time of use".


I think this is a bit misleading. Tape is not just the tape cartridge itself. Quality drives are very expensive and so is tape management. Managing access to tapes at scale is also not trivial and complicates a lot of your processes.

Another important factor everybody seems to forget is that you don't want the tape to be your only backup. They are so slow to restore that you typically only use tape as last resort.

For my personal backups I calculated that a bunch of regular 4TB drives, an external USB 3.1 bay which takes 2 of them at once, and off-site storage (a place where I can keep drives) would be the best solution.


Tape drives are consumable. LTO tape is abrasive, this cleans the heads but also wears them down. I used to manage a scientific computing facility with a PB scale rape archive. We ingested maybe 50 TB a day to tape, and would chew through one of the 8 drives at least twice a week.


I know this is a typo, and I don't usually point out such things, but maybe fix "rape archive"


It certainly made me stop and reread the statement.


I didn't even notice until I read the comment below. My brain just saw tape, I had to go back and reread it to make sure it hadn't been corrected yet.


Was this same generation drive and tape? I've only seen this be this bad when doing MP (LTO4/LTO5) reads/writes in BaFe-targeting drives (LTO6/LTO7).

These old MP tapes are like sandpaper to new drives.


At work we also refer to LTO5 tapes as sandpaper tapes - glad we're not the only ones with the realization. If we have a significant restore involving archived lto5s, we immediately plan for additional wear and drive repairs. :(

That being said, this does apply within generation to a lesser extent as well.


I might have worked at your company/project. :)


Actually, yes, a bit of account stalking says you have. m( Hey there! Folks on the team still miss you.


> Tape drives are consumable. LTO tape is abrasive, this cleans the heads but also wears them down.

Is it possible to replace the RW head on a tape drive? I guess the manufacturers generally don't want this to be easily replaceable, so they can sell more drives...


Almost universally the business model for significant deployments would be that you have maintenance contracts anyway. That's kind of how the tape business works, certainly if you have libraries as well. In that case you get those drives RMA'd.


We had a service contract. The drive basically contacted IBM and I had a new drive on my desk in 4 hours. I made enough friends at IBM that they left a stock of spares.


Are you using green media? Green media is very abrasive and it's recommended to burn them in before usage. This can literally save you thousands upon thousands in tape drives.


I'm curious for more details. Are these full-height or half-height drives?


Was this "rape" archive at the Vatican perhaps?


Ya, this is an odd article. It seems aimed at the consumer. Maybe it's a lack of imagination on my part, but I can't think of a consumer use case that calls for tape. I think the article could have benefited from spelling one out.


I was considering mirroring my cold storage backup (12 bay NAS with a few hours uptime per week) to tape for long term archival of private data (e.g. photos) or stuff on the internet I think is worth having a backup of (e.g. the http://unrealarchive.org).

But the tape storage requirements and danger of running into trouble with the reader (especially considering I could get a misaligned one when buying used, and if that one breaks => tough luck) convinced me this isn't a good choice. [edit] For the same money I can just get a bunch of additional disks, use two for simple, redundant, offsite backups, and use the remainder to swap out old disks in the NAS pre-failure.


For what it's worth, the tape drive head has to align to the servo tracks on the tape. Both on read and write.


Do manufacturers even offer consumer-grade tape backup solutions nowadays? Do tape drives even use USB, or is the tech still using SCSI? I can't seem to get a straight answer from Google, but I doubt it.


The interesting thing is that for long term, highly reliable cold storage (e.g. of research data), using a bunch of tape robots still seems to be quite a viable idea. (The robots read, check and replace broken tapes automatically. Getting data from tapes can take long, so you normally put some form of SSD cache in front of it.)

Then you place the tape robots into a fireproofed room with extensive protection measures (e.g. filling the whole room with CO2 or similar on fire alarm, extra fireproofed doors comparable to bomb shelter doors, etc.).


Exactly what I was thinking. Tapes may be efficient while you're not accessing the data on them, but data you're not accessing is ultimately useless. You need the infrastructure to restore from backup, and if you've got so much data backed up that the cost savings of tape really matter, you might have to backup (and restore) so much data that the lower speed also really matters.


Out of curiosity, where are you keeping them off-site? I have about 15T of used storage right now. I have it in a RAID and then backed up to another RAID in my home, but I am missing the third piece of the backup puzzle. Are you just putting them in a safety deposit box? Or do you put them in a colo of some kind?


For personal backups (unless you're the king of personal Big Data) tape doesn't make sense in most cases.

Where tape shines is for offsite copies, DR copies, or duplication for safe keeping of any kind, not backups.

Tape will work for backups, but you have to structure your working processes around tape (serial access) to the backed up data. If for you "backups" includes rapid access to an old version of a file you edited or deleted by mistake and pulling in data that's too old for you to want to keep on expensive disks so you can process it, tapes aren't a good option, for reasons of access time.

Most modern IT professionals would use NAS or SAN snapshotting to handle the first case above, where you did an oops and need a copy of the file as it was before that. A snapshot directory on a NAS usually gets implemented using a journaling file system or some other mechanism for maintaining a point in time view of the system, so you're not keeping another complete copy of your data, only journal entries listing what changed when.
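As a crude illustration of the space-saving idea (using rsync-style hardlink snapshots, a simpler mechanism than the journaling approach real NAS filesystems use):

```python
import os

def snapshot(live_dir, snap_dir):
    """Create a point-in-time view of live_dir in snap_dir.
    Unchanged files are hardlinked, so they share storage with the
    live copy; only later changes cost extra space (assuming the live
    side replaces files rather than editing them in place)."""
    for root, _dirs, files in os.walk(live_dir):
        rel = os.path.relpath(root, live_dir)
        target = os.path.join(snap_dir, rel)
        os.makedirs(target, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name),
                    os.path.join(target, name))
```

Each daily snapshot directory then looks like a full copy while storing unchanged files only once; rsync's --link-dest option uses the same trick.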

For the second case where you have too much data to fit on your expensive SSDs and you might need to use some of it e.g. next week or next month, what you want isn't tape, it's a cheaper hard drive. IT types usually call this "near line storage". State of the art spinning hard drives are a good option, but if those are still too expensive you can go a couple generations of technology back and get lower price/TB. The problem you run into doing this is that you need more physical space and hardware and power/cooling for the lower capacity drives.

Tape is great for when you want to keep data but may not need to access it ever or might need to access it months from now, and when you can accept a delay in access of a day or two. DR copies, offsite vaulting of data for legal purposes, logs from systems you want to keep in case you ever need to check back on what happened 5 years ago, bulk data from experiments that may never be accessed again but which could be critical for someone if it's needed (and may therefore turn out to be extremely valuable)... those are the scenarios for tape, and the price per TB stored is lower than anything else.

The company where I work off sites data for long term storage regularly, including a copy of our main ERP database. We meet legal compliance requirements the cheap way by copying stuff to a tape we can access and paying someone to store it for us. Backups of production systems are done to disk, to a NAS or SAN block storage device that is then copied to tape if needed.

Edit: Forgot to mention the ERP database's size... it's in the Petabyte range, it runs the entire company. If we tried to transmit it to a cloud backup provider, without spending a giant amount of money on a high speed internet connection we'd never complete a daily backup before it was time for the next one. On top of that, if we need 2 copies of it the storage price gets very high very fast - cloud storage is very convenient, but costs more than even local spinning disk. Tapes, on the other hand, have a small incremental cost if we already have the drives and robot, and we can make as many copies as we like by paying only for the tapes and the time.


IMO one of the worst parts about LTO is that it's another one of those formats with "open" in its name, but you can't just download a PDF of the standard or even buy it from something like ISO/IEC. There's some convoluted licensing process involved. Only the first generation was actually open, but I guess they decided to close it after that:

https://www.ecma-international.org/publications/standards/Ec...


What would you do with a totally open specification?


It's easier to recover corrupted media if you know everything about the underlying format and protocols. As an example, I was able to recover a partition that was partially overwritten by just a few sectors by finding a backup copy of the superblock and writing it into the correct place armed only with hexdump and dd. That would have been impossible without detailed info about ext2/3/4 available.

Also look at the floppy imagers people have made, that operate at the magnetic flux level.
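To illustrate why format knowledge matters: the ext2/3/4 superblock carries the magic number 0xEF53 at byte offset 56 within the superblock, which is enough to scan a raw image for superblock copies. A rough sketch on a synthetic image (real backup superblocks sit at fixed block-group offsets; this simplified scan just checks block boundaries):

```python
import struct

# The ext2/3/4 superblock magic 0xEF53 sits at offset 56 within the
# superblock (a little-endian 16-bit value on disk).
EXT_MAGIC = 0xEF53
MAGIC_OFFSET = 56

def find_superblocks(image, block_size=4096):
    """Scan a raw image for blocks that look like ext superblocks."""
    hits = []
    for i in range(0, len(image) - 1024, block_size):
        candidate = image[i:i + 1024]
        if len(candidate) < MAGIC_OFFSET + 2:
            continue
        (magic,) = struct.unpack_from("<H", candidate, MAGIC_OFFSET)
        if magic == EXT_MAGIC:
            hits.append(i)
    return hits

# Demo on a synthetic 32 KiB image with one fake superblock at offset 8192:
image = bytearray(32768)
struct.pack_into("<H", image, 8192 + MAGIC_OFFSET, EXT_MAGIC)
print(find_superblocks(bytes(image)))  # [8192]
```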


> It's easier to recover corrupted media if you know everything about the underlying format and protocols.

Sorry, I'm still not sure how knowing the LTO spec would be helpful.

You are basically asking for the "Layer 1" documentation of how the tape head writes to the tape media? Given that you can only access the media via a drive, wouldn't it be sufficient to simply know the SCSI commands (T-10 SSC-x) to send? You'd then have to know the file format of the backup software (e.g., tar).


The parent commentor is probably talking about, in this case, wanting to know what would be required to write their own “external” tape drive firmware/driver, to post-decode an analogue flux reading (ala https://applesaucefdc.com/) of an LTO tape into the same digital data-stream that the drive itself would output. I.e., what would be required to write an “LTO tape drive emulator” to interface with hardware/drivers that expect to be talking to an LTO drive (perhaps in the context of emulating that software itself.)


Let's pray we have the drives in 40 years. I think we're already seeing media lost with the Enhanced CD format.


Why? You're not expected to find a tape in the attic 40 years from now and insert it into a drive you just bought at Fry's and restore the data from it.

If you do care about the data on any given medium a few years from now, you copy it then to a current medium. And so on.

This works fine as long as new media are cheap, easily available and higher density than the old, as has been true for the last sixty years. If this progress ever halts or reverses (due to some man-made or natural disaster, including a major economic recession), we're screwed.


Doesn't matter what is expected to happen. Some time in the future someone will need to recover data from some old medium, and a proprietary format hinders that effort significantly and unnecessarily.

Not to mention the closed format causes reduced competition in making better or cheaper devices, which is another downside.


"That would have been impossible without detailed info about ext2/3/4 available."

Try telling that to IKARI+TALENT, Legend, Skid Row, PaRaDoX and all the other cracking groups from Commodore64, Amiga, ATARI ST, PlayStations, Nintendos, groups which competed on who is going to one-file a program or convert a secret disk coding format to a regular filesystem, with all the protections disabled and the programs bug and NTSC+PAL fixed while they were at it...


"Easier". Yes, you could reverse engineer all that information yourself. It just takes ages.


They competed on both quality and speed, as in who will be the first to 0-day the program which had to work 100%. Whoever got the 100% crack first won, anybody else's crack would get nuked. Them's the rules.

So, either we have gross incompetence across the entire information technology industry in general if it's impossible or slow, or the people disassembling custom filesystems and protections, bug fixing binaries and adding custom speed-loaders in record time are from another planet. Which one is it?


This is like asserting that because Usain Bolt exists nobody needs to drive a car.


I've no idea who Usain Bolt is (or what it is, if it is a thing), so no clue what you are attempting to tell me with that allegory. No rows returned from the database. To me, it looks like

"cats are furry. Your argument is invalid."

Now, is "Hacker News" supposed to be where the information technology elite gathers, or is that a gross misnomer these days?


I don't think it's grossly unfair to assume people know the name of the fastest human being in the world.

It's also fine not to know, but your comment assumes we know who or what, for example, "allegory", "database", "cats" and "misnomer" are. My point is that every discussion assumes some knowledge, and if you lack it you can just look it up, or ask, without making allegories about cats and databases and huffing and puffing about information technology elites.


Is this a site for IT elite or isn't it?


> It just takes ages.

It doesn't have to be slow, sometimes it can be quick, but the precondition is that it takes the most talented engineers in the field to do it...


The point is to be capable of doing this without being some sort of superstar you can name who's dedicated themselves to the craft.


There is no replacement for dedication to the craft.


I mean, there is hella replacement for craft: industrial precision. Like you can write fucking fast code now without having to ASM that shit. And no one is building a revolutionary microprocessor with his hands. That shit is going in the machine.


The machine cannot replace craftsmanship. Soldering transistors and writing small, fast code are two completely unrelated activities.


You're deliberately missing the point just to name-drop some scene names?


No, I'm making a point: if "Hacker News" is supposed to be where the information technology elite is, then what kind of "elite" is it if editing a filesystem in situ is considered impossible without the source code? Or is anyone who can turn on a computer these days and boot some kind of an OS program to dabble with the computer a hacker now, so the bar has been dropped to the ground altogether? Are we now all inclusive here too, is that how bad things have gotten? Maybe I'm completely wrong, and hacking has been changed so nowadays it's completely disassociated from being competent?


Read it?


Just noticed your username. Any connection to the TV decoder chip of the same name?


Cookie consent, adblock warning, autoplaying video, some kind of newsletter form that I just glanced at before closing the tab.

What a miserable website.


Got JavaScript disabled (thanks firefox and noscript), and I have none of that bullshit.

Give it a try :)


Even just the "Reader" view of Firefox fixes that: https://support.mozilla.org/en-US/kb/firefox-reader-view-clu...


Yes, I am an IB Computer Science teacher and we still teach tape drives as a form of storage so I was excited to share this story with my students based on the title. Then I went to the website. No way I'm sending a link to a site like that to them.


Try it this way: https://outline.com/hDqg78

Generally, you can add outline.com before any news or blog domain, and you get a reader view to share.


I'm well aware that tape is still the king of datacenter-level storage, and I'd love to have a tape drive for my ever-increasing personal backups.

Its advantages to me are obvious: the inexpensive media and great long-term reliability beat any alternative: hard drives, optical drives, solid-state drives, and sometimes cloud. Also, it's cool to put my "tape archive" (.tar) files on a real tape! But I found that for personal use, tapes aren't cost-effective: the entry cost and initial investment for the tape drive is extremely high, and it's all enterprise-grade SCSI or SAN. Unless you already have tens of terabytes of data (video studio?), buying a tape drive is unjustified. HDDs are still sometimes the only cost-effective choice for personal use.

Now I mainly use HDDs for large backups (> 300 GiB), multiple Blu-ray discs for medium-size backups (< 100 GiB), and cloud for the smallest archives. I don't know if Amazon's "mail your drive to the cloud" service is available at my location, but it's also an attractive solution.

I'd like to hear about your experience.


Did you try backblaze.com yet?


I think any cloud solution is limited by the broadband uplink. It takes 24 hours to upload "only" 200 GiB of data over a 20 Mbps uplink. That's fine for long-term backup that literally nobody touches unless the computer breaks, but one still has to be patient enough to back up a RAID array...
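That arithmetic checks out (ignoring protocol overhead):

```python
# 200 GiB over a 20 Mbps uplink, protocol overhead ignored.
size_bits = 200 * 2**30 * 8   # 200 GiB expressed in bits
uplink_bps = 20e6             # 20 Mbps

hours = size_bits / uplink_bps / 3600
print(round(hours, 1))  # 23.9
```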


Backblaze has a service where they ship you a disk pack, you back up to the unit, and then ship it back:

* https://www.backblaze.com/b2/solutions/datatransfer/fireball...


Note that's for B2, their S3-like service that is paid for per GB. It doesn't apply to the unlimited tier they are better known for.


Great to hear that, thanks for the comment.


You could also change your internet service to something reasonable for a month or two until your initial backup is complete.


It's the consumer upload pipe that's the issue, and it's very likely there's no higher upload available. I've literally tried to throw money at the local ISPs and the best I can get is ~35 Mbps up despite having about half a gig down.


Oh wow. At my house in Austin I have AT&T fiber and get 1 Gbps up and down with no caps for $60 / month. I use BackBlaze to back up my NAS and it works great.


AT&T is technically an option, but since their "pay for privacy" debacle[1] and terrible support that somehow makes Spectrum appear caring, I'd rather burn my money for warmth than give it to them.

[1] - https://www.fool.com/investing/2016/10/03/att-drops-its-cont...


Did you ask if you can go onsite and leave something overnight to finish the upload?


I'd be curious to hear from any folks who actually run tape drive setups for personal use.

Where did you get the gear?

How cost-effective is the overall setup compared to spinning rust?

How is driver support in popular operating systems?

Is there more you'd recommend taking into consideration or learning before jumping in? Especially beyond the obvious realities of extremely high latency, of course ;)


> Where did you get the gear?

I never ended up using it personally, but when I was considering it, used gear on eBay was top of the list.

> How cost-effective is the overall setup compared to spinning rust?

For personal use, it is not even remotely cost-effective unless you are a data hoarder. The cost of the drive is simply too high. Also keep in mind that the drive can become damaged or need servicing.

It only makes sense if you are dealing with e.g. large quantities of video. The article cites a minimum 140TB where tape becomes cheaper but this involves some dubious decision-making.

> How is driver support in popular operating systems?

I have only ever used tape with Linux + custom software so I can’t speak of software support in general.

> Is there more you'd recommend taking into consideration or learning before jumping in? Especially beyond the obvious realities of extremely high latency, of course ;)

The obvious reality to me here is that it just doesn’t make sense unless you’re in spitting distance of a petabyte. The only thing I can imagine someone doing personally is video, and it would have to be a lot of video like some kind of long-running documentary project that you are producing in your spare time.

The other factor is to be more paranoid about mixing tape generations than the vendors suggest. Ignore the specs and stick to same-generation media and drives.

Efficiently writing to a tape means filling e.g. a 20 MB/s write channel for LTO-6, or 750 MB/s for LTO-8. Can you sustain 750 MB/s? It’s not so easy in practice. You throw away tape capacity if you get buffer underruns.


> Can you sustain 750 MB/s? It’s not so easy in practice. You throw away tape capacity if you get buffer underruns.

The drive doesn't pause while the buffer fills again? Considering the price of the drives, I find it very surprising that they don't do that.


> The drive doesn't pause while the buffer fills again?

The drive does stop, and it runs at different speeds depending on how fast you are feeding it data. However, this process wastes tape capacity as you switch speeds.


Thus you should pick one speed that's fast enough and stick to it.


I don't understand how stopping or switching speeds wastes tape capacity. Are you not talking about storage capacity but some other type of capacity?


No, they stop, rewind, and accelerate again, causing massive mechanical wear.


The higher-end tape devices are capable of running at slower speeds to better match the rate you're feeding data to them, to avoid this.


Yes. But it was argued that they could just stop and wait for the buffer. As said, they do, but you don't want them doing so.


> Can you sustain 750 MB/s?

Do 10% of a tape at a time, buffered on a $200-$300 NVMe SSD?


That wastes tape capacity when the drive inevitably has to stop and restart.

It's cheaper to run the tape slower (you can adjust it so that it is very close to what you can sustainably deliver).


> That wastes tape capacity when the drive inevitably has to stop and restart.

Not meaningfully, unless there's something ridiculous about the format I'm unaware of. A full tape write covers 160-200 linear kilometers. Stopping 10 times, which is once per hour, is barely anything. I haven't found anything listing the exact size of gaps, but I did find a post from someone having thousands of buffer underruns that was only losing half a second of capacity per underrun.


As Dylan16807 also said, you waste very little. I am using 6 TB LTO-7 tapes and I group the data written to the tapes in files of approximately 50 GB, which are read from a NVMe SSD.

This means that each 50 GB file is written at 3 Gb/s, the maximum tape speed, but after each 50 GB the tape is stopped then restarted. So I write on each tape about 120 files with an aggregate size of up to 45 905 860 blocks of 131 072 bytes each. That is more than the 6.0 TB advertised for the tapes, so I do not waste any measurable tape capacity. The 50 GB file size is chosen because it fits inside the 64 GB RAM of the computer with the tape drive and certain algorithms that I am using to process the tape files (lrzip & par2, for compression & redundancy) are faster when the entire file fits in the RAM.


LTO is used in the film industry for long term archival. I own a LTO8 drive. Sourced from Maxx Digital in Orange County, CA. Talk to Matt Stone and he'll spec one for you. Tell him Mark Maunder sent you. :-)

The drives are bloody expensive. Mine was around $5k.

LTO8 tapes are still hard to come by even though the patent dispute is settled. This should change soon.

The software to archive to tape is clunky and frustrating but it works. I use something free that integrates with Finder on Mac. I'm away from my office so I don't remember what it's called.

The main benefit is that the tapes last for half a century and are very cheap per TB once you have made the initial investment in the drive. Very good for long term archival of films and source footage which is what I use it for.


Exactly what we used it for (when I was day-to-day in that business as well). Daily shot material stored in three copies and sent to three different physical locations, plus one additional copy for on-site archiving + online on hard drives.

When you do a lot of those, drive price isn't all that expensive. My major gripe with it is that you couldn't hook it up to USB3. Rare ones had Thunderbolt, but most were that bloody external SAS thing. Not really all that great when you don't have those controllers around. Defeats the purpose of external drives.


I use TB3 and I'm quite happy with this. It also has a drive bay that'll fit the 12TB drives I have.

https://magstor.com/collections/magstor-thunderbolt-3-tape-d...


Got it on ebay. There are great deals to be had if one knows the technology, partly because there are very few of us left now who know it, so extremely expensive equipment goes for pennies on the dollar.

I use self-system engineered (read: self OS packaged) Bacula software for backups with a custom developed shell program which interfaces with Oracle RMAN to do backups.

I use Solaris 10 on sparc, which also goes for pennies on the dollar nowadays since there are few of us left who understand how to run it and even fewer who understand how to squeeze the most out of that hardware + software combination.

The tape library shows up as character and block files under /dev/rmt/ and the robotic arm with the barcode scanner under /dev/changer/, nothing special or fancy about it. There is another piece of freeware I compiled and packaged (whose name I forgot) which controls the robot arm and the scanner, also by issuing standard SCSI commands to /dev/changer/ files, nothing fancy or special about it. The tape library has enough capacity for full backups with three month retention policy, so the only thing I have to do is replace a worn out tape cartridge roughly once in three years. Backups are daily. The entire solution ran me around $800 (server + differential SCSI controller + tape library) and the tapes ran me $500, over the span of 12 years.


I set up a tape drive for my lab backup, and on paper it was straightforward: we got a tape drive from Newegg for a few thousand bucks, put it in a machine, realized we needed some PCI card, got it, realized we needed some cable, got it, realized we needed some custom drivers for some reason, got them, and then finally it was working. Started getting some tapes and backing up. Overall I would not recommend it unless you have hundreds of terabytes that you want to back up yourself.


> Where did you get the gear?

I have an old Quantum DLT S4 drive and something like 18 tapes. Each tape is 800GB uncompressed and I get about 1.8x compression on average. It's an external unit connected via SCSI ultra 320. The ultra 320 SCSI card is something I picked up from ebay for like $20. The drive and tapes I got from a former consulting client who was going to trash it.

It was the last of the DLT line and is similar to LTO4. Get an LTO5 if possible; it has some cool features like partitions.

I only store my most critical stuff on tapes. Each weekend I do a tape swap and take the old tape to work on Monday and bring the oldest one back. I keep the tapes at work in case my house burns down or something (I lived through two house fires as a kid so I'm super neurotic about it, don't ask). The tapes are encrypted so nobody can recover from them if they get one. If my house did burn down I'd have to somehow procure another tape drive, but that would be the least of my problems.

I just use a custom tar script to do it. I tried both Bacula and Amanda in the past but they were too complicated or had other issues. My own tar script did what I wanted.
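A minimal version of such a tar-to-tape script might look like this (a sketch, not the commenter's actual script; on real hardware `TAPE` would be `/dev/nst0`, the Linux no-rewind tape device, but here it defaults to a plain file so the script can be exercised without a drive):

```shell
#!/bin/sh
# Sketch of a simple tar-to-tape backup. Set TAPE=/dev/nst0 on real hardware.
set -e
TAPE=${TAPE:-/tmp/fake-tape.tar}
SRC=$(mktemp -d)                  # demo data; use your real directory instead
echo "important" > "$SRC/notes.txt"

# -b 256 writes 128 KiB records, a reasonable block size for streaming tape.
tar -cf "$TAPE" -b 256 -C "$SRC" .

# Verify the archive is listable before trusting it.
tar -tf "$TAPE" -b 256 | grep -q notes.txt && echo "backup OK"
```

The verification pass at the end matters more than the backup itself, for the reasons discussed further down the thread.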

> How cost-effective is the overall setup compared to spinning rust?

Since I got most of it for free and I already had experience with tape, it was pretty good. I've had it since 2008 so it's over ten years old and still going. I also have some USB3 backup disks that I use for daily backups. In all this time, only one tape has died and I do a full validation of the previous backups every time I bring a tape back and re-use it.

For new installs, go with some USB3 disks from Seagate or WD -- both are great, I have both. Avoid Toshiba due to some weaknesses in handling bumps and vibration during runtime due to firmware (Toshiba internal drives are great though) -- Don't ask me how I know this. Seal them into some kind of fire-proof/flood-proof safe and store somewhere that a fire/flood won't destroy it.

> How is driver support in popular operating systems?

I'm all linux. It works. All that matters is the SCSI card being supported. Make sure NOT to get a RAID card, regular SCSI only. The RAID cards often won't work with tape drives because the firmware is made for RAID drives, not tape.

> Is there more you'd recommend taking into consideration or learning before jumping in? Especially beyond the obvious realities of extremely high latency, of course ;)

So the only real advantage for me using tapes is that I got most of it for free, the tapes are more portable and lighter than a USB hard drive for carrying around. Almost everything else is a negative compared to USB hard drives.

I use some USB3 hard drives for daily backups via an rsync --link-dest script that de-duplicates everything and that works great. For $100 you can get a 4-6TB USB hard drive from WD or Seagate and for my important stuff, that works fine.

I'm about to fill up my older 5TB drive -- it's down to about 200GB of free space left. I already have a new 6TB drive I bought recently and when it gets a little lower, I'll just replace it, do full backups, and put the old drive into storage for permanent archival.

Data backup is a full occupational specialty, like being a DBA, network engineer, security specialist, or web monkey. I did TSM for a while. There's a ton of issues in the hardware, strategy, security, performance, and more.

Do yourself a favor and just set up something simple that JUST WORKS and don't fuck with it too much.

Do a practice recovery once; that'll teach you how good your setup is and where to improve.


As a backup specialist, that last line of yours is something everyone should pay attention to. It sets expectations on recovery time and proves it works.

We (work) have a massive pile of DLT, mainly S4s... these were written 2005 to 2009 ish and they still restore just great. I mean, I wish they wouldn't, as I'd love to be shot of them, but they are too reliable.


I think it's printed on some TSM manual somewhere and I never forgot it: "Backups are irrelevant. Only the ability to restore matters." I had that in mind when I said what I did.

The casual observer won't understand the difference between "ability to restore" and "doing backups", but it matters when the building burns down.


I am incredibly impressed with your level of s/paranoia/detail. What do you do for a living that requires that much rigorous static data retention?


I'm just a sysadmin/net-eng. These days they call it dev ops, but the job never really changed.


May I ask what kind of encryption you use with the tapes?


openssl. I wasn't kidding when I said I use tar! I just pipe to/from openssl enc.
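That pipeline is roughly the following (a sketch, not the commenter's exact script; `-pbkdf2` requires OpenSSL 1.1.1+, and on real hardware the `dd` target would be `/dev/nst0` instead of a file):

```shell
#!/bin/sh
# Sketch: encrypted tar stream to tape via openssl enc. Demo paths;
# on real hardware, write to /dev/nst0 instead of a file.
set -e
SRC=$(mktemp -d); TAPE=/tmp/fake-tape.enc; PASS=$(mktemp)
echo "secret data" > "$SRC/doc.txt"
echo "correct horse battery staple" > "$PASS"   # demo passphrase file

# Backup: tar stream -> symmetric encryption -> fixed-size blocks.
tar -cf - -C "$SRC" . \
  | openssl enc -aes-256-cbc -pbkdf2 -salt -pass "file:$PASS" \
  | dd of="$TAPE" bs=131072 2>/dev/null

# Restore: reverse the pipeline.
RESTORE=$(mktemp -d)
dd if="$TAPE" bs=131072 2>/dev/null \
  | openssl enc -d -aes-256-cbc -pbkdf2 -pass "file:$PASS" \
  | tar -xf - -C "$RESTORE"
cmp "$SRC/doc.txt" "$RESTORE/doc.txt" && echo "round trip OK"
```

Practicing the restore half, as suggested above, is the part that actually proves the tapes are useful.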


For personal archival I think Blu-ray is still best. I think of it as a supplement to S3 or Dropbox. Even if you end up re-writing discs it is very cheap per GB, and 25 or 50 GB discs are comfortable for “normal” personal sizes (my entire file set that I care to keep around is around 100 GB — mostly because I’m not careful about deleting old node_modules from ancient projects I’ve done)


Note: there are 100GB Blu-ray discs now that can be recorded using consumer burners. There are also M-Disc Blu-ray discs which are specified to last about 1000 years (with some science and testing to back it up, even though no one has tested it for 1000 real years, obviously), because they use a harder, rock-like layer to store the data instead of the organic dye used on regular burnable discs (which last for about 10 years according to some sources).


My photos/videos alone are over 300GB and growing.

node_modules is shit anyway, and shouldn't be duplicated for every single project.


I also have a few hundred GB of personal videos and photos and have historically managed backing up the collection on optical discs. Partitioning the backup by timeframes seems to work well for me, and tends to mirror the basic structure of historical photo albums. As disc density increased, my older CDs got grouped together onto new DVD tech, and DVDs are now being grouped onto BDs. 300GB of data is 6 dual-layer BDs, or 3 100GB BD-R XL. To me, storing a dozen or so discs as a backup for all my family's usual photos/home videos is not onerous. All of this is on top of storing this data "hot" on a storage array and various cloud services for normal browsing and usage, so it's not like we're constantly juggling these discs around. They tend to stay put, tucked away with all the other important family heirloom kind of artifacts.


How do you store and view this?

I have yet to find a good solution. My phone is crippled by the library and none of the Macs like it much either. I’d guess I was at 600gb.


Very good question. I have most of it in directory folders "yyyy-mm-dd NAME OF THE EVENT" on Dropbox, which I only sync on my Mac mini due to the amount of storage.

I'm trying to use iCloud Photos, because then I have them where they belong, on my iPhone, but I just don't like the way it works. It's slow on my Mac, but it's nice to have thumbnails locally and originals available when needed. Although that also doesn't work nicely when dragging. It's cool that Photos groups things by event, but I like to have my "camera roll" and manually make albums. That's possible, but it's just easier to drag files around. Also, I'm looking for a solution to 'archive' photos (hide them, because I took 10 pictures of 1 thing).

There was a time I wanted to write an app/service for this. I might still do that if there's enough demand.

I also haven't figured out how to archive some video material. Recently I started shooting small clips, with the intention to make small 5 minute videos of my trips. As I don't want to delete the original material, I also need a place to store them, but I don't want to "browse" them.


What's Blu-ray durability like with commodity drives?


To re-iterate my other comment, there are M-disc bluray discs now (up to 100GB per disc), which seem to last for 1000 years, if kept away from direct light and in good room conditions (low temperature and humidity).


eBay. 1:1 HDD backup is probably close in price even on the used tape market. I use LTO-4 with FC. Works fine in Windows with Retrospect. I'd expect it to work in Linux as well. The hard part is backup software. I guess with LTO-5+ you can just directly access the tapes with LTFS, but I haven't tried that yet.


Anything that supports NDMP should be good. Both Bacula and Amanda, among other software, support NDMP. And of course enterprise solutions support it too.


LTO6 here, I bought it at Newegg. It’s pricy but having a 2.5TB and fairly durable cartridge is nice for my needs. I just dropped one earlier this week, no big deal though. I use LTFS on it, it has latency but it’s faster than you expect and once you’re streaming on or off of it, it’s downright quick. Historically, I burned optical media for this but the size of tape makes it much easier. It’s on a SAS card in a linux server, I literally plugged it in and it worked with openSUSE. I forget if the tape tools were already there or not but the old school Unix stuff just worked. I installed the ltfs tools and it all worked without any tricks. It’s a Quantum drive. I did experiment with tar initially but ltfs makes it radically more usable, radically.

Pre-tape, I used disks, rotated them, kept them in a safe, and burned optical discs of my most important stuff. I had two experiences that caused me to change. First, I had a failure and went to do a recovery, and in either haste or anxiety (I put this on me, I should have paused and taken more time, but I was rushing) I knocked the most recent drive off my desk and effectively killed it. The other was a bit more subtle (oddly, I have only had 3 backup recovery situations and I messed up 2 of them): my “backup” was rsync to disk on cron, and a backup started during restore and began to “remove” my good backup because the master didn’t have the restored files yet.

There are a lot of lessons in there. Most significantly, “backup” is a different thing than live access to data; you want it versioned and somewhat fixed. Rsync isn’t it. You don’t want to accidentally delete your backup. Tape switches your mindset on that stuff. I’m not going to pretend it’s cost effective for my needs (although I have a lot of HD+ video of my kids) but it’s durable, it’s purpose built, and cold storage can’t really be hacked. There are no monthly fees, no risk of a cloud provider sunsetting my tier of service, and 25TB of tape is like $250.

Drive MTBF numbers are big enough to be abstract, and yet you will likely encounter failure at some point; tape has these practical numbers that are ultra conservative and come from decades of experience, and you sort of keep them in mind once you’ve used a tape a dozen times or so. That being said, when the drive flakes, I’m in for an expensive upgrade/replacement, and when I do upgrade, I should copy my tapes to new ones in the new format, which will take time and money. With SMR, it’s almost like disks are taking on tape-like qualities, and it’s not hard to find stories about people getting crap performance on their new disks. It’s not sexy but I feel very confident that I’ll be able to get my data if I need it.

It’s really easy to store a tape in your safe deposit box or at your parents’ house too; they won’t accidentally kill it by knocking it off a shelf.


I have been using in my home for several years a Quantum external LTO-7 drive, which is connected via a Serial-Attached-SCSI cable, so I also have a SAS HBA card in a computer. I live in the European Union and I bought the required hardware from an online shop in Germany, where it was cheaper. Now you can also buy it easily e.g. from Amazon. The cost was around 3000 EUR.

The tape cost is about 10 EUR per terabyte. Previously I was using HDDs for archiving. I have not followed closely the evolution of the HDD price, but looking now on Amazon I see about 30 EUR per terabyte.

I have a data archive of about 120 TB, but I am paranoid because of past data losses due to defective HDDs/CDs, so I store 3 copies of each tape (in different places).

That means that by using tape instead of HDDs, I have already saved 20 * 360 - 3000 = 4200 EUR.

Someone who would store just 2 copies of each tape (the minimum rational value), would need to have a data archive of at least 75 TB to recover the initial cost.
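The break-even arithmetic above generalizes neatly (same assumed prices: ~10 EUR/TB tape, ~30 EUR/TB HDD, ~3000 EUR drive):

```python
# Savings from tape vs. HDD archiving, per the figures above (EUR).
def tape_savings(archive_tb, copies, tape_eur_tb=10, hdd_eur_tb=30, drive_eur=3000):
    return (hdd_eur_tb - tape_eur_tb) * archive_tb * copies - drive_eur

print(tape_savings(120, 3))  # 4200 -- the 3-copy, 120 TB case above
print(tape_savings(75, 2))   # 0 -- break-even at 75 TB with 2 copies
```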

Nevertheless, saving money is not everything; you also gain peace of mind, because the chances of the tape becoming defective just from storage are negligible compared to a HDD, which if not used for a very long time might never start spinning again.

With tapes you must transfer the data only when you can no longer buy drives that can read them. With HDDs you had better replace them before their warranty expires, as I have seen too many that died immediately after that.

The tapes work OK both in Linux and in FreeBSD, but the FreeBSD utilities are more convenient, so I have connected the tape drive to a FreeBSD server and I transfer the data to be archived through 10 Gb/s Ethernet, which is faster than the 6 Gb/s SAS, so it is not a bottleneck. The writing speed of LTO-7 is 3 Gb/s, so that is the bottleneck anyway, not the interfaces.

That speed is much faster than that of any HDD of up to 8 TB that I have ever used, but I have not checked the latest HDD models to see if they have improved their speed to levels comparable to tape.

Using tape is very simple and retrieving any file (I have a tape index on my SSDs) takes at most 3 minutes, including the time to manually start the drive and inserting a tape in the drive. The maximum seek time is about 2 minutes. When I had a HDD archive, there was still the same time for manually selecting a HDD and inserting it in the HDD rack. For retrieving very large files, the tape can even be faster, due to the faster sequential speed.

The conclusion is that if you have data of at least several tens of terabytes and/or you value that data enough to want to avoid any chance of losing it, then magnetic tape remains the best solution for now.


I also run my own tape setup for personal backups. On where to source etc - I’d agree with others that eBay is a good option for both drives and media (but as always check what exactly you’re buying and make sure you’re comfortable with the seller). You’ll almost certainly need a SCSI or FC interface too so don’t forget that. I’d recommend going with external standalone drives rather than either internal or sled mounted for this type of setup unless you’ve got a good reason to do otherwise as that means you only need to power the drive when you’re using it which can cut down on wear etc.

I’ve only ever read bad things about mixing generations, despite the fact that the drives are capable of it (eg LTO5 tape in LTO6/7 drives); I’ve not done it personally, but based on that I wouldn’t recommend it. Don’t be tempted to buy really old generations: if you ever need to restore in a disaster where you only have the tapes, you then have the nightmare of sourcing a new old drive too, which by that point might be a lot harder.

You’ll want to think carefully about software setup. You’re probably going to wind up using some form of custom scripts unless you can put in the time to set up bacula properly and are sure it works for you (bare metal restore can be a tricky aspect of any system to verify properly).

If you script your own then make sure you also understand the drawbacks of tar. tar as a tool is highly compatible, which is good, but it is poor at recording incremental backups (there are some custom extensions which gnu tar has, but these reduce compatibility) if you want a more complex schedule than a full backup every time. Tar also makes it difficult to seek to just the part of the archive that you need to get a single file; again there are gnu extensions for this, but this isn’t the best solution. Finally, tar then gzip means that if your gzip stream gets corrupted you might lose quite a bit of the tar archive, so it’s not very resilient. You could rely on the drive compression, but you might also consider other solutions, especially ones which also include checksumming.

Consider either scripting around these issues or, better, what other tools could take tar’s place (eg I have used dar in parts of my setup as it handles this better, but then you need to think about keeping a working binary around which can unpack your archives - I put a statically compiled binary in my archive sets just in case).
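For reference, GNU tar’s incremental mode works via a snapshot file, like so (a GNU extension, so archives lose some portability; throwaway demo paths below, not a recommended layout):

```shell
#!/bin/sh
# Sketch of GNU tar incremental backups via --listed-incremental.
# The snapshot file records what has already been backed up.
set -e
SRC=$(mktemp -d); SNAP=$(mktemp -u); OUT=$(mktemp -d)
echo one > "$SRC/a.txt"

# Level 0 (full): the snapshot file is created from scratch.
tar --listed-incremental="$SNAP" -cf "$OUT/full.tar" -C "$SRC" .

sleep 1                      # avoid same-second timestamp ambiguity
echo two > "$SRC/b.txt"
# Level 1 (incremental): only changes since the snapshot are dumped.
tar --listed-incremental="$SNAP" -cf "$OUT/incr.tar" -C "$SRC" .

tar -tf "$OUT/incr.tar"      # the new b.txt appears in the incremental
```

Restoring requires replaying the full archive and then each incremental in order, which is exactly the kind of procedure worth rehearsing before you need it.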

Finally if you go down the tape route then consider strongly spooling your stream which will be written out to tape onto disk first, to give yourself maximum chance of getting the more consistent throughput onto tape which has benefits in terms of tape reliability (and may indeed wind up being quicker overall). That also allows you to easily write multiple copies of the same tape if you intend to send one or more offsite and keep one locally.

Where this setup works well for me is that when you start wanting many copies of your backups (offsite, multiple point in time full backups etc) then the cost effectiveness becomes a lot better than the equivalent with disk. I also trust tape to survive longer term on a shelf than disk.


I have, back in the day: both floppy-controller and IDE based, with a whopping range of storage from 220 MiB to 4 GiB.

Since I don't have enough karma to write another comment anonymously:

Restoring from backups is a last resort for very important files deleted more than a few days ago (beyond what hourly/daily snapshots contain) or for business continuity (BCP) / disaster recovery (DR) of an entire system that cannot be recovered via automated configuration-management rebuild. Snapshots to nearline storage reduce RTO for the most common use-cases, and most users and systems don't generate much churn of data in a day. For recovery of important files, sending a courier out to retrieve a backup tape set from the vaulting provider implies an RTO of several hours is acceptable. An RTO of under an hour would imply backups need to be stored on nearline media (commodity spinning rust) or at a backup cloud vendor like AWS Glacier or Tarsnap, which is more expensive than tape.

The deciding factor of how much to rationally spend on DR/BCP solutions follows from a Business Impact Analysis (BIA) to quantify the loss of a service over a certain time, which drives RTO and RPO goals. (A common factoid of DR/BCP is "about half of businesses who lose their data go out of business within six months.")

RTO - Recovery Time Objective - desire/need to recover within a certain time window

RPO - Recovery Point Objective - how much recent data you can afford to lose, i.e. how close the recovery point is to the moment of failure


Beware of two things with tape. Tapes often have a VERY low maximum storage temperature. I lost $5k in tapes with a minor AC problem when the room only got up to 115F for a bit.

Also keep in mind that capacity and bandwidth often assume 2.5x compression of pure text, which doesn't seem like a common scenario.

The article's comparison of SSDs to tape is silly: tape access times are in the minutes when a robot is involved, and many more minutes if you are swapping trays. Why compare that to SSDs that can do 50k IOPS and up?


Did anyone ever figure out what Amazon Glacier uses? They only publicly talk about using custom software to "optimize the sequence of inputs and outputs"...[1]

[1] https://aws.amazon.com/glacier/faqs/


Note cloud vendors get the benefit of mixing high IOPs workloads with low IOPs workloads.

Say a cloud vendor needs a bunch of HDD for a database workload with high IOPs. Using modern disks (4+ TiB) this leaves a lot of stranded disk storage that can only be used for low IOPs workloads.

Archive storage uniquely meets the criteria of lots of storage, yet low IOPs.

Disclaimer work for Google but have no knowledge of how AWS glacier works.


I could believe that they have a lot of unused space which is amenable to a different overlay use, but the potential for mis-design is huge here. If I pay the high-IOPS price and you have to replace my high-IOPS data backend at my price point, migrating the low price-point secondary customer is non-trivial if not designed in from day one (I am sure you did; cheap players might not understand the risk exposure here).

The telco equivalent was ISDN, where a single linecard with two sub-rate ports might be sold to a bank, and a domestic user. The bank could (and would) demand card reset 24/7 to restore service to their ATM, the domestic user simply lost carrier, and game state.


Rumor has it Blu-Ray (specifically BDXL): https://storagemojo.com/2014/04/25/amazons-glacier-secret-bd...

Though a comment on that page mentions this HN comment claiming custom low-speed HDDs: https://news.ycombinator.com/item?id=4416065


I certainly recall a Facebook video showcasing their bluray archival system. I was seriously impressed by the whole setup. And here we are archiving to LTO like primitives :)


Even with BDXL, which seems to be able to store up to 128GB per disc, you'd have to stack 100 of them to store the 12TB that an LTO-8 tape cartridge can store raw/natively. Tape cartridges also seem to be more physically durable than optical discs.
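The disc count is easy to check (decimal units, 12 TB native LTO-8 vs. 128 GB BDXL):

```python
# How many 128 GB BDXL discs match one 12 TB (native) LTO-8 cartridge?
import math
discs = math.ceil(12_000 / 128)  # 12 TB = 12,000 GB, decimal units
print(discs)  # 94 -- call it "about a hundred"
```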

This doesn't mean that they seem simply superior, though. I imagine optical discs have way better seek times, for example, even if read/write speeds are better with tape.


Rumor consensus (from this site) seems to be custom low-RPM hard drives optimized for long life.


:) I know :)


For me there is only one factor that matters in backup. Not restore/backup speed, not price per byte, not fashion - only reliability. Will the backup medium perform without corruption when the time comes to read it? That's the only thing that matters.

I had this problem - store and forget, for a long time, as in decades. I tried a lot of stuff; personal conclusion - offline backup sucks, regardless of medium. Tape, hard disks, zip drives - demagnetization of the medium. CD, Blu-ray - one scratch and you're done. Thumb stick, SSD, non-volatile memory in general - needs a plug-in once in a while. So what I use is encrypted volumes uploaded to the cloud... and a dedicated SSD that I keep on me at all times if I need offline access (which gets plugged in daily).


> For me there is only one factor that matters in backup. Not restore/backup speed, not price per byte, not fashion - only reliability. Will the backup medium perform without corruption when the time comes to read it? That's the only thing that matters.

Only if the other parameters are within certain bounds, otherwise your backup medium would be macro-scale physical engraving on a highly resistant substrate, at least some sort of stone, ideally a refractory or noble metal. Rhodium would probably be your ideal standard.


How are you gonna scratch your optical discs? Do you move them around all the time? Also, with something like RAR recovery records or other redundancy measures, a scratch does not mean that you are "done", because the data can be recovered anyway (with a storage size penalty, but setting aside 10% of storage to recover from random scratches is well worth it).


> How are you gonna scratch your optical discs?

IME the problem of optical discs is that unless they're molded (ROM) they degrade over time.


Not M-Disc, which is rated to last for 1000 years and can be burned with a consumer grade burner.


That's just the company's claims.

When the Laboratoire National de Métrologie et d’Essais performed an extreme conditions accelerated aging test (90°C and 85% RH) they found the M-Disc to be no better than most quality DVD+R (<250h), with MPO's Gold and Northern Star's DataTresorDisc reaching 250h (but not 500) and Syylex's GlassMasterDisc surviving almost unscathed to 1000h.

Sadly it looks like Syylex is as dead as Millenniata, Inc (though it looks like Verbatim still sells m-discs).


To be objective you should have mentioned that the US Department of Defense did a similar test, and they ended up concluding that M-Discs lasted longer than regular discs.

It looks like someone took over the production of M-Disc (wiki: "The debt holders subsequently started a new company, Yours.co, to sell M-DISCs and related services."), not sure about Syylex. Syylex seems like the perfect solution (making discs in glass, the material routinely used in chemical labs due to its inertness), but the problem always was that it cannot be burned at home...


> Thumb stick, SSD, non-volatile memory in general - needs a plug-in once in a while.

What do you mean by this? It is non-volatile after all.


Any non-volatile memory is characterized by a "retention time" after which it may lose the information stored in it.

The retention time depends on the storage conditions, especially on temperature & humidity. For ancient flash memories, with large cells and 1 bit per cell, the retention time can be 10 or even 20 years. After that the cell's charge will leak away, because the insulation is not perfect.

For modern flash memories, with much smaller cells and with 3 or even 4 bits per cell, the electrical charge variation that will lead to wrong bits is much less, so the retention time can vary from a few years for a new SSD stored at low temperature to a few months for a used SSD stored at a higher temperature. At a high enough temperature (not impossible to reach, e.g. when forgotten in sunlight), the retention time can be reduced to days.

When a SSD is used, i.e. it is powered, the microcontroller inside it rewrites its cells from time to time to avoid losing data. If you store it without using it, after the guaranteed retention time you may lose data. Consumer SSDs and memory sticks are guaranteed for a retention time of 1 year, so you should not expect more than that.


I did not know they had to maintain data that way, but I guess it makes sense.

That is good to know! Thanks for the explanation! Now I know to never use an SSD as backup o.O


Another crucial metric is bytes per square centimetre. When archiving something, there's no point in archiving hundreds of storage controllers with it, especially if you need to store them in a safe (which you probably should). MicroSD cards may beat tape but they are a lot slower and harder to label.


Which one wins on long term storage durability, shock proof, and water damage?


LTO turns out to be garbage on the key measure of "can I read my backups a little while down the road", because you'd be surprised how quickly a new LTO drive can't read old tapes.

It's unpleasant having to buy old drives to keep with your tape archives because you won't otherwise be able to read them any more.


I don’t know why this is being downvoted, because it’s absolutely true. Any data you want to read years down the road should be retaped to newer media generations. Don’t rely on the promise that you can read tapes two generations back—test it yourself, if it’s important.


Can't you just keep the old drives you wrote the old tapes with?


Yeah, true, but mechanisms degrade, and each generation is a pretty long time as electronics go. Losing the ability to read tapes from two generations back should be an indication that you need to retape the old data. Getting old working drives is a pain, and the rental prices aren't cheap because you often need an expansion card to interface with the older drives.


Until they break. Then you can't get a new one.

Source: have done this at scale, have spent time scouring the second-hand market to cobble together early-generation LTO drives.


Okay, so by 'a little while' you mean a pretty significant amount of time. If you plan up front to migrate to new tapes every two generations (shrinking your collection 4x in the process!) then you'll never need a particularly old machine.


Always buy two I guess. That messes with the affordability calculus though.


You should have at least two to have critical redundancy. Not to mention making backup testing easier.


LTO doesn't suffer from those things. Shock can destroy the case and dislodge the leader pin of an LTO cartridge but AFAIK it can't damage the data on the tape. Water is bad for the transport but as long as you dry your tapes out they should still work. The operating specs for LTO are crazy, you can read and write them under all practical atmospheric conditions.


Smoke haze will cause all kinds of trouble for tape drives though.


As will very dry conditions. The IceCube neutrino telescope at South Pole used LTO-3 tapes to store data until they switched to only hard drives around 5 years ago. At that point, the newer tape drives had been tried (IIRC LTO-5 was the industry standard by then) but they never seemed to work, presumably because the humidity on station is incredibly low. It was considered more reasonable to keep the old LTO-3 equipment in use, versus building a humidor for the tapes and drives.


So don't run the backups when the DC is on fire, got it.


Mr. Robot was a liar!


I had a room full of servers, many with 16-36 disks per server and had an AC failure. Things turned off as planned, but it hit 115F for a bit. None of the disks died, majority of the tapes did.


That is really odd. LTO shouldn't do that. I would actually suspect the actual drive, but you probably got a replacement drive to check. That is well outside my experience. If I remember correctly they should be good to 120F for a little while.


Not sure if the tapes were permanently damaged or just had lost data. We ended up switching to a 16 disk server running backuppc. Not nearly the same capacity, but incrementals, deduplication, and compression made up the difference.

Also sadly the great price/performance brand with great service (overland) got bought up by the radically more expensive larger company and all the prices doubled 8-(.


That sounds unusual. Tape shouldn't degrade that easily. 115F is 46C, which is hot for a human and not a good temperature for long-term storage, but not anywhere near immediately hazardous to the media. Was it LTO or some other type of tape?


LTO, the archival temp is REALLY narrow, only 61-77F. Just google "lto storage temperature" and google will pop up a card based on the IBM data.


That is the range of temperatures if you are going to be storing the tape for over 6 months.


Also keep in mind the monitoring for the room hit 115F, so the temp inside the tape drive/robot might have been over 115F.

Similar with the drives, but the drives were fine.


Yes, note the operational temperature is below 115F as well.


That's if you are actively trying to do backup.


True, I don't remember if the backup software was running. However we lost tapes in the cartridge in the drive (7 tapes I think) and also tapes in other cartridges (banks of 7).

I think we had 4 cartridges of 7, then upgraded to a newer robot that had 8 cartridges of 11.

At the time I was pretty shocked, and asked around and apparently it wasn't a particularly unusual event. Fortunately we always kept a cartridge rotated out to a different room.



Tape vs SD cards? Tape.

NAND flash can leak charge over time if not powered on which makes it unsuitable as an archival medium. Depending on P/E cycles each cell has undergone and the drive type (SLC/MLC/TLC/QLC) it can be from 3 months to 10 years.


Disappointing article. Sure $/GB is a factor. But what about RPO and RTO? Or $/IOPS? When those factors are taken into consideration, disk often wins. Which is why the backup to disk market has gone bonkers in recent years and tape hasn't.


In terms of longevity MDisc is king (1k years estimate). How does tape compare to drives and ssds without power? and what kind of environment is required? Someone mentioned losing 5k$ worth of tapes to temperature hike.


MDisc is not king. Their DVDs were outperformed by competitors in a test done by the French National Archives [0].

Further, DVDs are effectively obsolete these days and MDisc's Blu-Rays have no scientific studies backing their claims to longevity compared to competitors like the DVDs did. One of their largest advantages relative to DVDs is completely gone in Blu-Rays as well: The inorganic dyes and recording layer (these are now standard for essentially all BD-Rs).

[0]: https://www.lne.fr/sites/default/files/inline-files/syylex-g... and full test and results in French: https://francearchives.fr/file/de7f8ea96ceb4ce38eb6d9278b3df...


I can confirm that they are extremely unreliable compared with other hardware. There is also no support from vendors.


Every week without fail I see articles shilling tape storage. Do people realize only really Big Data benefits from tape? Unless you write multiple petabytes of data that you rarely need each day, tape is inferior in every way to hard disks. A huge corporation that counts every cent and needs lots of cold storage might benefit from tape, and even then some companies prefer other cold-storage media ( https://engineering.fb.com/core-data/under-the-hood-facebook... ).

"Good enough for Google, good enough for you"(a lie that neglects economies of scale that force Google to use tape). https://www.overlandstorage.com/blog/?p=323


I can’t agree with your assertion that “tape is inferior in every way to hard disks”. Tape is cheaper per TB than HDD; why put data on more expensive media if you don’t have to? Tape is also more reliable by orders of magnitude than disk. You are more likely to be eaten by a shark than suffer an unrecoverable error using LTO-8. Tape is more durable and portable than disk. It also requires no power at rest and can be used to create an air gap to thwart cyber attack and malware. Finally, areal densities for tape are increasing far more quickly than for disk, which is running into superparamagnetic limits; this makes tape far better suited to storing archive data in the IoT era of Big Data. Tape is less ideal if you have very stringent RTO or an application that needs fast random access to first byte. But in most cases, for its core strength of longer term data archiving, tape is better than disk or cloud (although as you note, the irony is that tape underpins major cloud vendor archive tiers). Don’t be absolutist; use the right storage, for the right purpose, at the right time.


No, tapes become preferable to HDDs at less than 100 TB written, not at multiple petabytes. For many applications and for many people, including for myself in my home, it makes absolutely no sense to use HDDs for anything, but only SSDs & tapes. For long term storage tape is superior in every way to HDDs, while for fast access SSDs are superior in every way to HDDs.

Obviously, because the drives are extremely expensive, whoever needs no more than 20 or 30 TB of storage cannot afford tapes and must use a few HDDs. The best storage technology changes in time. 20 years ago I was using optical disks and 10 years ago I was using HDDs, but since I went back to tapes, like I was previously using 30 years ago, I have saved a lot of money and I have much less worries about the HDDs becoming defective.


How does it compare with 8TB USB3 drives(140$)? https://pcpartpicker.com/product/YpJtt6/western-digital-elem...


LTO-7 tapes are $10 per TB at Amazon (of course uncompressed), so they are still cheaper, but this $17.50 USD per TB is indeed the cheapest HDD that I have ever seen.

Nevertheless, these HDDs are much more expensive than they look at the first glance. The real cost of archival storage is given by the price per TB and per warranty year, because after the warranty time you must replace the storage medium.

I have seen countless HDDs that have died just a few months after their warranty time expired.

So these drives with 2-year warranty cost $8.75 per TB and per year.

The warranty time of the tapes is 30 years, but in fact you must replace them much faster because you will no longer find tape drives able to read them.

For LTO tapes, you should expect to replace them after 6 to 10 years.

Even with only a 6 year lifetime, the price for tape would be $1.67 per TB per year, i.e. more than 5 times cheaper than those WD HDDs.

If you want to just have some backups against accidents, which you expect to erase after a few weeks, then those HDDs could be a good choice.

For archival purposes they are still much more expensive than tapes.
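
A minimal sketch of that arithmetic (the prices and lifetimes are the 2019 figures quoted in this thread, not current market data):

```python
# Amortized archival cost in $/TB per usable year, using the figures
# quoted in this thread (assumptions, not live prices).

def cost_per_tb_year(price_usd, capacity_tb, usable_years):
    """Price divided by capacity and by the years you can rely on the medium."""
    return price_usd / capacity_tb / usable_years

hdd = cost_per_tb_year(140, 8, 2)   # $140 8 TB USB drive, 2-year warranty
tape = cost_per_tb_year(60, 6, 6)   # ~$60 LTO-7 cartridge, ~6-year practical life

print(f"HDD:  ${hdd:.2f}/TB/year")   # $8.75/TB/year
print(f"Tape: ${tape:.2f}/TB/year")  # $1.67/TB/year
```

Note this deliberately ignores the tape drive's purchase price, which only amortizes at larger archive sizes (see the break-even discussion below in the thread).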


Note that the article is straight up lying about the price of HDD capacity, 12TB drives generally cost more per TB than lower capacity drives, the best capacity deals are generally in the range 3 to 8 TB. Comparing cheapest to cheapest, tapes only go for around half the price of hard disks.


Dunno, if shucking is an option, external consumer WD drives are regularly available under EUR 200 for 10 TB, while bulk drives of the same capacity or higher for dedicated storage use are at least 50% more expensive. In my experience this is a relatively new development, that those two markets diverged that much. In 2015 and before, bulk HDDs were usually cheaper than external ones.

If you do care about warranty you can just keep them in the external usb casing and enjoy having an abundance of power supplies and USB 3 cables.

If it's 24/7 hot storage, it also uses about 50 kWh per year at 6 W. That's roughly EUR 15/year minimum energy cost per drive.

My feeling is that all HDD manufacturers are extracting as much value from the market as they can get away with (it smells more of informal price fixing than of a free market) while they still can, before they are made obsolete by solid-state storage in 5 years at the latest.


If we compare them to tapes, it doesn't seem fair that the hard disks should be kept spinning, booting them up on demand still gives you vastly lower latency than tapes.

My feeling is that competition in the tape market is also a bit rusty: $3000 for a fancier VCR, what a steal! ;-)


Normal HDDs are at least 3 times the price of tapes.

To get HDDs at only double the price of tapes you must choose some super-ancient junk, e.g. 1 TB 3.5" HDDs, which are much slower than tapes and which require a huge size and weight for the same capacity as tapes.

12 TB HDDs have about the same volume as LTO-7 tapes (i.e. about the same volume as two 6 TB tapes), but they are much heavier. LTO-8 tapes have double capacity compared to LTO-7.


What about tape gives it superior compression? Why can’t that same compression be applied to any storage?


Nothing. They estimate an average compression based on an average dataset with uncompressed text data. That wouldn't work for things like already compressed video, audio, encrypted data, and so on.
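
A quick way to see why the quoted "2.5:1" figure is dataset-dependent: stdlib zlib (an LZ77-family scheme, loosely comparable to the drives' built-in compression) shrinks repetitive text dramatically but does nothing for random bytes, which stand in here for already-compressed video or encrypted data:

```python
import os
import zlib

text = b"the quick brown fox jumps over the lazy dog\n" * 1000
random_bytes = os.urandom(len(text))  # proxy for compressed/encrypted data

for name, data in [("text", text), ("random", random_bytes)]:
    ratio = len(data) / len(zlib.compress(data))
    print(f"{name}: {ratio:.2f}:1")
# Repetitive text compresses many times over; random data stays at ~1:1
# (or even expands slightly due to framing overhead).
```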


I really wish they wouldn't add stuff like compression and so forth, because I feel like it's an "abstraction layer violation" and it's misleading in a way that the majority of other storage media isn't; for example, a 1TB hard drive is going to store (at least) 10^12 bytes regardless of what those bytes contain. A 60 minute audio tape will always record 60 minutes of pure silence, white noise, or anything in between. Yet somehow data tape is marketed with a theoretical compressed capacity.


It's super annoying, and it's annoying to (attempt to) disable the hardware compression on the tape drives as well. We've got a tape library with two LTO-7 tape drives, and even though we're accessing it through the Linux device file that should access the tapes in uncompressed mode, diagnostic tools indicate that compression is still being used when writing to tape. Since we're pre-compressing our data with gzip, having the tape drive try to apply compression is wasted effort, and I wonder if we're losing write speed to it.


works great on giant uncompressed database backups though


Sure, or just datadump | gzip > /dev/tape or similar. Tape loves to brag about bandwidth/capacity and then hide the assumed 2.5x compression in a footnote.


Please consider zstd instead. It is superior in every way except memory usage and compatibility with old systems.


zstd looks good but it's only three years old, so use cases need to fit in the lifetime of today's platform. gzip is about to turn 27 and everyone has heard of it, which makes me pretty confident in finding decoders after another decade or two.

(Actually this problem is why I haven't owned a tape drive for a while.)


Sure. Anything, not like text compression is hard.


pigz!


Zstd also has a multithreaded mode. On the CLI `zstd -T0` tells zstd to use all available cores.


Please be aware that this can significantly impact compression ratio. So much, actually, that I don't dare use it by default. I'd often use the non-multithreaded mode, because for some data (perf.data files with large stack captures, most recently), at reasonable compression speeds of >=50 MB/s of input, the multi-threaded modes are not Pareto-optimal.

I'll get a small benchmark script for that done over lunch. Come back here later.


> Please be aware that this will significantly impact compression ratio.

Is this a continuation from a previous message? Because the multithreading mode of zstd doesn't impact its compression ratio. It's actually an interesting property of zstd: whatever the number of threads used, the compression ratio remains the same (i.e. reproducible).


Before reading up on LTO, I thought maybe it was some kind of physical compression where it would write the data closer together at the cost of read/write speed or something (doesn't really make sense in retrospect). I'm kind of disappointed that it was just regular data compression with an assumed ratio. :/


I spend most of my time these days with a database that seems to compress to 4% of its size...


It's not about compression, but total storage capacity.

Imagine a disk. It has two sides and a certain surface area, as well as a certain density (bits per square in/cm). A standard 3.5" HDD platter has a bit more than 4200mm^2 of surface area (having been unable to quickly find the dimensions of an HDD platter, I grabbed the dimensions of a floppy disk from the standard) per side, or a little less than 10000mm^2 per platter; I'm being deliberately rough with the numbers since all we need is an order of magnitude.

Imagine a tape. It's long and thin, and can be rolled up. An LTO-8 is 960m long and 12.65mm wide, for a total surface area of 12,144,000mm^2. Tape has a lower recording density than HDDs (LTO-8: ~20Kbit/mm * 6656 tracks, or ~128MB/mm^2 compared to ~1Tbit/in^2 = ~185MB/mm^2 for a HDD), but more than makes up for it with its length.


That's all well and good, but tapes are smaller than disks. LTO-8 is 12TB and the 14TB disks have been out for a good while.

Also with inefficient tapes you end up spending many more tapes for inefficient incrementals, incredibly long seeks, and lack of efficient deduplication.

Then there's other nasty surprises like a relatively narrow temperature range for tapes and an improper alignment of the head can result in tapes you can read and write... but when the drive dies you can't read the tapes anywhere else.

Tapes still make sense above a certain scale, but I think it's around 10PB before the real-world costs of tapes, drives, robots, support agreements, extra capacity for inefficient incrementals/dedupe, etc. come out ahead.

The big win with tapes is it's much more feasible to rotate tapes offsite.


> but tapes are smaller than disks

Tapes are also physically smaller, therefore their capacity per volume is better than for HDDs. Two LTO Ultrium cartridges have about the same volume as one 3.5" HDD, so you have 24 TB in the same space as a 14 TB HDD.

Moreover, the HDDs are much heavier. You can easily carry a suitcase with 20 tape cartridges, i.e. with 240 TB.

If you would carry a suitcase with 17 HDDs (238 TB), that would be around 14 kg (around 31 lbs). Possible but not at all comfortable.


After multiplying the single track linear density by the number of tracks, you also need to divide by the tape width to convert to a recording area density number. So you end up with about 10.9Mbit/mm².


The compression ratio seems to actually be much worse than gzip, but it's also faster.


On a Core i5 you can use LZ4 and compress at a ratio around 2 at 370 MB/s (and decompress it faster) - for one core.


> you can buy cheap LTO-7, reformat them to M8 and get 9TB of native storage

Note: You can't reformat them this way. This is a permanent choice you make with a blank tape.


Also IIRC you can't do this yourself with a regular LTO-7 tape and drive. I think you need some kind of drive shuttle, or you need to buy the tapes preformatted by someone who has the right gear. I have an LTO-7/8 drive and I can't do this.


We had a Ditto drive and a series of tapes, bought circa 1996. We used the software it came with on Windows to perform full backups of the machine, cycling between a set of tapes (maybe 6). Backup would complete overnight. Never had to restore from it. It was a consumer friendly, easy to use product and cheap. My father was an accountant that ran his own practise, so needed to know his files were safe. That went from carbon copies into a second filing cabinet, to 5 1/4" floppies for groups of clients, to 3.5" floppies, to Ditto tapes to USB sticks to Microsoft Cloud storage (which syncs between laptop and PC). The oldest days of filing cabinets and floppies were easy, the middle days of Ditto and USB storage were more worrying, and the final days of cloud storage went back to being simple again. Now he is retired so none of it matters.

This was all donkeys years ago, so it is interesting to hear how things have changed.


Are there any cheap tape drives? If not, what's the issue? Is it the fact the prices are kept high because it's used mostly by business/enterprises customers? If so, are there any hopes of LTO becoming mainstream for home desktop usage? Couldn't find much info on Google about this.


DAT (Well DDS) used to fill that niche. Used to be pretty reasonably priced - for tape. Went for a few hundred a drive, an order of magnitude less than LTO. Smaller capacity each generation though, but is an older standard.

Disappeared for no apparent reason about a decade or so ago.


As someone totally unfamiliar with tape aside from knowing some places still use it for archival and recording audio to tape - something totally different - is incredibly expensive and most (if not all?) recording tape is out of production, I’m surprised storage tape is still a thing.

How does storage tape even work? Is this literal tape in a cartridge much like tape in a cassette tape? I assume we’re not talking about data stored in 1s and 0s? Is this technically an analog medium!?

Going to do some googling now to quell my curiosity.

The article does seem targeted at consumers. I’m going to guess I’d need some expensive hardware to even use one of these $60 tape drives.


"As for transfer speeds, they can reach 300MBps"

I am getting over 500MBps on my setup


Impressive.

Will you share more information about the specific details (i.e. hardware and configuration) for the set up you're running with?


Ok now I am embarrassed.. I was out by a factor of 10...

i am getting 50Mbps.. copying NAS to tape

For what its worth..

Network Router: TG789vas v2

NAS: Synology DS1815+

Switch: (for Server, Tape Drive, Tape Library & Their Associated UPS ) Airpho Gigabit AR-GS108

Server: X3650 M3 Running Windows 2012 R2

Tape Library: IBM TS3100 3573 L2U

Tape Drive: IBM LTO6 8Gib Fibre Channel 35P1624 Feature Code 8348

Tape Drive Fibre Channel Card: Avago AFBR-57F5PZ

Server Fibre Card: Emulex LPE12002

Using NO compression

I understand the new IBM LTO 8 Drives can get up around 750 to 800Mbps


well probably with compression


I wonder if the 300MBps figure might also be with compression.


No, the LTO-7 transfer speed is 300 MB/s (about 2.4 Gb/s) without compression. The compression in the drive may be useful for automated backups, but if you use the tape for an archive of files selected by yourself, it is better to disable the drive compression. If you have compressible files, you can compress them with better algorithms on the computer, before storing them on the tape.


> you can compress them with better algorithms on the computer

is that always true? What if I have dozens of PDFs which are similar and compressed (most PDFs are)? Wouldn't de-duplication also reduce the size? I mean with millions/billions/trillions of files of the same format.


For example on a computer you can use algorithms like lrzip, which search for very long distance redundancies (i.e. not in small data windows like traditional algorithms). This usually achieves a decent compression even on normally incompressible files, e.g. on collections of movies or on collections of PDF files, because it may find repetitions at least e.g. in the headers of the files or of the file sections. If there are similar files, then there are chances that lrzip will find repetitions between files, even if a traditional compression would not find repetitions inside a single file. If de-duplication is possible, then that would also be much more efficient than enabling the compression option of the tape drive.
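
The long-window effect is easy to demonstrate with the Python stdlib: zlib's 32 KB window can't see a repeat 100 KB away, while lzma's much larger dictionary can. This illustrates the principle lrzip exploits, not lrzip itself:

```python
import os
import zlib
import lzma

block = os.urandom(100_000)   # one "incompressible" file
doubled = block + block       # two identical files stored back to back

# zlib: the second copy is outside the 32 KB sliding window, so no gain.
zlib_size = len(zlib.compress(doubled))   # ~200 KB
# lzma: default dictionary is megabytes, so the second copy is one long match.
lzma_size = len(lzma.compress(doubled))   # ~100 KB

print(zlib_size, lzma_size)
```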


Mind sharing the details of your setup? Interested in doing something similar.


I have been using since 2016 the following setup:

1. QUANTUM LTO-7 Tape Drive, Half Height, Tabletop
2. Delock Cable Mini SAS HD SFF-8644 > Mini SAS SFF-8088 1 m
3. LSI LSI00343 SAS 9300-8E Host Bus Adapter (8-Port, PCI-e 3.0 8x, SAS 3.0)

Also, there is a couple of useful tape accessories:

4. Cleaning Cartridge (I use one from Quantum)
5. Cases for storing LTO Ultrium cartridges (I use some from Turtle Case, for 20 cartridges each)


I forgot to mention that the files that are written to tape should come from an SSD, because an HDD is not fast enough. The files can come from an SSD of a remote computer, if you use 10 Gb/s Ethernet, as that is fast enough.


If you overshoot demand on one performance metric, buyers shift to another as yet unsatisfied criterion.

e.g. 8" HDDs were also cheaper per byte than 5.25", which were cheaper than 3.5", which are cheaper than SSDs.


Is Tape worth it for a movie collection? I don’t think I will see any benefits of the compression and the prices seem comparable to just buying a 8TB HD


> Is Tape worth it for a movie collection?

I doubt your movie collection is large enough that tape would be cheaper. A single tape has about ~50x the capacity of a Blu-ray Disc, but you need to buy a lot of tapes before the whole setup becomes cheaper than HDD.

> I don’t think I will see any benefits of the compression…

Ignore the compression, it’s garbage.

> …the prices seem comparable to just buying a 8TB HD…

By “comparable”… tape is clearly cheaper per byte when you compare prices, it’s just not so radically cheaper that it changes your life.


The most common tape is LTO-7, which is 6 TB (non-compressed). The most common Blu-ray disks store 50 GB, so a tape equals 120 discs, not 50.

I have mentioned in another message that the break-even point is around 75 TB or around 50 TB, depending on how many copies you store, i.e. around 1000 to 1500 Blu-ray movies.

I agree that many people, probably most people, do not need to store so much data to make tape storage economical, but for those who have a large enough data archive, like myself, tape really is radically cheaper, enough to have changed my life.

Obviously, there is a vicious circle, few people buy tape drives because they are too expensive, so that tape drives are expensive because few of them are sold.

If they had been mass-produced, tape drives would have been 5 to 10 times cheaper, and then tape archives would have made sense even for the smallest movie collections.


> The most common tape is LTO-7, which is 6 TB (non-compressed). The most common Blu-ray disks store 50 GB, so a tape equals 120 discs, not 50.

I was being conservative and using BD XL.

Seems like you came to the same conclusion about the break-even point—around 150TB before you consider replication.


Tape (LTO-7) costs 3 times less than the cheapest HDDs, so the prices are not comparable at all.

To recover the drive + SAS HBA card costs, the movie collection must be larger than about 75 TB if you store double copies or larger than about 50 TB if you store triple copies.

If you store only a single copy on any digital medium, the chances of not losing anything after many years are negligible.


I forgot to mention that I referred to the prices for non-compressed tape storage, i.e. counting LTO-7 tapes as 6 TB tapes. Compression does not work for movies and even for compressible data I disable the tape compression and I use better compression algorithms on the computer.


Here the LTO-7 advantage is only about 1.5x over HDD, not 3x: ~110 USD for a 6 TB tape vs ~230 USD for an 8 TB HDD (cheaper per GB than 6 TB models).

Cheapest new LTO-7 drive is over 3000 USD though, which means you'd need more than 300TB of data for a new LTO-7 tape solution to be cheaper here.
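
The ~300 TB figure follows directly from those prices; a sketch using the numbers in this comment (local 2019 prices, assumptions):

```python
# Archive size at which tape (drive + cartridges) undercuts HDD,
# using the local prices quoted in this comment.
drive_cost = 3000        # cheapest new LTO-7 drive
tape_per_tb = 110 / 6    # $110 for a 6 TB LTO-7 cartridge
hdd_per_tb = 230 / 8     # $230 for an 8 TB HDD

breakeven_tb = drive_cost / (hdd_per_tb - tape_per_tb)
print(f"break-even: ~{breakeven_tb:.0f} TB")  # ~288 TB, i.e. roughly 300 TB
```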


110 USD is extremely expensive. In Europe, at Amazon, the price is 60 EUR, including sales taxes, for a 6 TB tape from IBM, Quantum, HP or Fuji.

Looking at Amazon USA (just search LTO-7), I see prices between USD 63 and USD 69, or even just $60 per cartridge if you buy a 10-pack. That is still more expensive than where I buy, but it is nevertheless far closer to 3 times less than HDDs than to 2 times less than HDDs.


Could be there are cheaper ones; this was a single pack, and maybe the 20-pack is cheaper still.

Though again, with the cost of the drive you need quite a lot of data for it to make sense.


> Even if LTO-8 tapes are now in stock, you can buy cheap LTO-7,

I wonder if this was a typo. Did they mean that LTO-8 tapes are out of stock? Otherwise, the sentence structure "even if ... are now in stock, you can buy ..." is kind of weird.

> reformat them to M8 and get 9TB of native storage (22.5TB compressed).

What does this mean? Doesn't how much it compresses depend on the data not the storage device?


Until very recently, LTO-8 tapes were not available for purchase due to ongoing patent litigation. LTO-8 drives were readily available, but you couldn't buy the tapes to make the most of those drives.

LTO-8 drives can also use blank LTO-7 media to get density halfway between LTO-7 and LTO-8; this is referred to as LTO-7 Type M, or Type M-8. This applies to the raw storage capacity, not just the compressed capacity (which is conventionally referred to with the assumption of a 2.5x compression ratio, even though not all real-world data is compressible).


There was a dispute involving LTO-8 that just recently was resolved, so LTO-8 tape should now be available in the US. I wasn't following it closely, so I can't give any real insight, but it explains the wording.



I think it's 12TB

Google says

"HP LTO-8 Ultrium, 30 TB, RW Data Cartridge Tape. Exhaustively tested, HPE LTO Ultrium cartridges meet all your demands for maximum reliability when restoring data, offering high storage density, ease of management and scalable storage and backup performance."

TapeandMedia.com says

"1 - HPE LTO Ultrium 8 Tape with a capacity of 12 TB and up to 30 TB compressed capacity."


That says 30TB.

Here's a 15TB compressed for $63:

https://www.amazon.com/HP-C7977A-Cartridge-Compressed-Capaci...


The tapes and robots are cheap. Drives on the other hand...


Yeah, they all seem to be between $2,000 and $3,000, and I can't imagine how these things could be much more complicated than a dvd drive, if they aren't simpler.


Open a DVD player, then open a VHS player: the DVD does more complicated things on a tech level, but the VHS player is harder and more expensive to make. Early gen drives are cheap on the second-hand market; you can pick up a crummy one for under $100, tape usually included.


I was told once that certain types of storage medium will demagnetize over time and that you need to spin them up to prevent data loss


If you only save your data as a single copy on a single tape, then the price is quite low (around 1/3 of Glacier). However, you can reach prices below Glacier's using online storage with GB/s performance and millisecond access times, using Ceph with the right hardware selection and counting every cost. You can learn more about how from croit.io.


Price per byte maybe, but in value you are sacrificing a lot. I prefer continuous backups and versioned storage.


You're not sacrificing anything. Or specifically having a long term, offline tape storage does not prevent you from having short term, online, versioned file/block storage. Different use cases.


Well, my online versioned storage via Google is unlimited for $12/mo and syncs my files locally, and then I have another backup provider that does the same thing via CrashPlan Small Business, which is also unlimited for $10/mo. So that is $22/mo for unlimited continuous backup and versioning. I don't see how tapes would offer much value, and the storage location of those tapes, especially for a home use case, is likely going to be a single point of failure.


Ok, but how fast can you restore if something fails? And how fast can you add 200 GB of new data to that service? It's not for home use, but there are valid use cases.


As fast as a local CrashPlan backup can restore based on the storage medium.


The tape story is really killing me. I even bought one old Tandberg for 200 euros, used, and run it on ancient hardware just for backups. I just don't get why, in all those years, not one single vendor decided to make less expensive tape devices for the consumer market.


There used to be less expensive consumer-oriented tape drives (floppy-based QIC, various parallel port drives, VXA, and DDS/DAT). That segment of the market has been subsumed by backup services, USB-based solid state or hard disk-based storage, and writable optical media.


> not one single vendor decided to make less expensive tape devices

https://www.youtube.com/watch?v=TUS0Zv2APjU


"But there's something else that tape offers that no other storage medium currently offers and that's on-the-fly, transparent compression which can go up to 2.5:1"

I don't understand this point. Why would tape allow compression that other forms of media don't?


Lots of tape drives offer built-in compression. The drive firmware does it. On AIT3 drives you can enable or disable it.

Compression may have been a big deal in the age of sub-1GHz single-core systems, but not anymore. Especially since you ought to be encrypting before writing anyway.

Why does it continue? Probably so they can double the capacity printed on the tape label, which is always the compressed capacity and not the native or real capacity. Plus it keeps backward compatibility with tapes encoded that way.


I think what is unique here could be the "on-the-fly, transparent".

But this kind of compression can, I think, be achieved just as well with one core of your CPU; 3x compression of plain text shouldn't need lzma (and if you really need to compress 22 TB of plain text with no binary or multimedia files, you could compress it with lz4 or simple zlib before sending it).
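To illustrate the point, here's a minimal sketch of doing that compression host-side with the stdlib's zlib at its fastest setting (lz4 would be faster still, but isn't in the stdlib); the sample payload is made up for demonstration:

```python
import zlib

def compress_stream(chunks, level=1):
    """Compress an iterable of byte chunks with a fast zlib setting,
    yielding compressed pieces suitable for streaming to a tape device."""
    co = zlib.compressobj(level)
    for chunk in chunks:
        out = co.compress(chunk)
        if out:
            yield out
    yield co.flush()  # emit whatever is still buffered

data = [b"plain text payload " * 10000] * 5  # repetitive sample input
compressed = b"".join(compress_stream(data))
ratio = sum(len(c) for c in data) / len(compressed)
print(f"ratio: {ratio:.1f}:1")
```

Real text won't compress anywhere near as well as this repetitive sample, but comfortably exceeding the drive's 2.5:1 on compressible data with one core is plausible.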


People are very conservative about backup, and some backup solutions don’t compress in software.


Not an expert here, but could it be because you can actually slow down the tape by 2.5x and have it still be readable?


Does anyone have any experience with encrypted tapes and could give a few words about their setup? Mainly options other than the manufacturer's encryption.


I do not trust any kind of encryption inside devices, because I cannot verify how the keys are managed. The essence of encryption is that the key must not be stored together with the encrypted data, but if a key is ever used inside a device you cannot be sure that it can no longer be retrieved from it in the future.

The data files that I am writing to the tapes have their metadata written into a database to be able to retrieve them in the future, then I group them in chunks of an approximately fixed size (50 GB in my case), then I compress them, then I encrypt them and then I add some redundancy (with par2) for error recovery. A simple solution for compression with encryption is the lrzip program, but for more flexibility in encryption you could use e.g. the openssl program invoked with the desired parameters. There are many encryption programs and any of them is better than using the encryption option of a tape drive.

So I queue the files to be written on the tape in a directory and when the size of the directory exceeds a threshold (50 GB in my case) a script is invoked that does the processing mentioned above, including the encryption, then it writes the result on the tape. A similar script does the inverse operations when I retrieve a file or group of files from the tape, giving its position as a tape number + a tape file number as obtained from the file database.
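A sketch of the chunk-grouping step described above. The 50 GB threshold is from the comment; the file names and greedy grouping strategy are illustrative assumptions (the real pipeline would then compress, encrypt, and par2 each chunk before writing it to tape):

```python
def group_into_chunks(files, threshold_gb):
    """Greedily group (name, size_gb) pairs into chunks whose total size
    reaches roughly the threshold, preserving queue order."""
    chunks, current, total = [], [], 0
    for name, size in files:
        current.append(name)
        total += size
        if total >= threshold_gb:
            chunks.append(current)      # chunk is full; ship it to tape
            current, total = [], 0
    if current:
        chunks.append(current)          # leftover files wait for the next run
    return chunks

queue = [("a", 30), ("b", 25), ("c", 10), ("d", 45), ("e", 5)]
print(group_into_chunks(queue, 50))
# [['a', 'b'], ['c', 'd'], ['e']]
```

Keeping the metadata-to-chunk mapping in a database, as described, is what makes later retrieval by tape number + tape file number workable.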


Thank you for the detailed answer.


It never ceases to amaze me how much tape I still see imported; mind you, I see way more SSDs than HDDs, but still a fantastical amount of tape.


It's crucially important to take the management and lifetime of such storage into account. I doubt it can compete with SSD or HDD.


If you want to talk about MBps, imagine a truck full of tape cassettes driving from the Google data center to off-site storage.


From the sound of your comment, you haven't heard of this:

https://aws.amazon.com/snowmobile/

...putting stuff on a big truck is a very viable alternative to using a series of tubes.


This is not true, at least in my experience. Price doesn't tell you about reliability. I have 10% broken cartridges and 66% broken tape drives. These devices have mechanical parts and are sensitive to magnetic fields, so they can be damaged easily. If you don't believe me, I can send all this garbage to you.


That's if you don't count the extra time required to deal with tape backups.


Assuming that you already own the tape recorder, WTF


This is the principle behind AWS Glacier


sure, but they are really slow. It depends on your use cases.


Tapes are slow only at the latency of random access. Their sequential transfer speed is lower than that of good SSDs, but higher than that of HDDs.


I worked in IT for a large university. We had an HP/Compaq SSL2020 with dual AIT drives made by Sony. The issue was never determined (perhaps micro debris in the air), but the drives had to be replaced nine (9) times due to failure. Furthermore, backup and verification is a tedious process, and combined with local and offsite vaulting and tape management, having proper backups is a royal PITA, but there's no substitute. Replication isn't a backup because it propagates errors, and you can't do disaster recovery without proper software, offsite vaulting and practice.

Backups aren't valid until they're tested.


And why wouldn't it? Tapes, as far as I know, cost pennies to manufacture. A tape is easily over 90% plastic in most cases. The low-quality coating on a low-end tape is probably still ferric oxide, not exactly an expensive chemical if I know anything about chemistry. Perhaps cobalt is still heavily used as part of the coating as well.



