Let's make storage work: 2007

Wednesday, July 4, 2007

The next physical wave of the Internet

The first physical wave was all about decentralized connectivity. IMPs and later routers permitted a file to be broken into many fixed-size blocks or packets and then sent independently to a distant machine. The blocks took different routes, arrived out of sequence and sometimes with duplication or retries, but no matter: they were all stitched together perfectly at the receiving end. The distinguishing feature of this architecture was that there was no one machine controlling the journeys of these blocks. It didn’t matter if a router broke down or T1 communication lines were severed; the emergent behavior of all those routers was to find a way to get all those packets delivered. It worked in the face of disaster or misconfiguration, and the process of delivery was abstracted completely from the many applications that depended upon it.
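
The stitching-together step can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how TCP is implemented: the function names `to_packets` and `reassemble` are made up for this example, each packet carries its byte offset as a stand-in for a sequence number, and loss and acknowledgements are ignored.

```python
import random


def to_packets(data: bytes, size: int):
    """Split data into (offset, payload) pairs of at most `size` bytes."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]


def reassemble(packets):
    """Rebuild the original bytes from packets in any order, with duplicates."""
    unique = dict(packets)  # packets sharing an offset collapse to one entry
    return b"".join(unique[offset] for offset in sorted(unique))


message = b"The first physical wave was all about decentralized connectivity."
packets = to_packets(message, size=8)
in_flight = packets + random.sample(packets, 3)  # simulate retransmitted duplicates
random.shuffle(in_flight)                        # simulate out-of-order arrival
assert reassemble(in_flight) == message
```

The receiver needs nothing but the packets themselves to put the file back together, which is the property the rest of this post borrows for storage.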

The next physical wave is about decentralized storage. We have the cheap hard-drives and servers that hold them and we have lots of data to protect. The problem is that we manage the data in an old-fashioned centralized way. Napster and its progeny were on the right track, but they were all about sharing; providing access through thousands of copies of a (music) file. But that’s no good here. Today’s user demands privacy but wants the same convenience of machine independence.

Visualize this: In the same way that the TCP/IP protocols split up a file into blocks for transfer, let’s do it for storage. We’ll compress the data for efficiency, encrypt it, split it into thousands of anonymous blocks and store them on many different ‘block servers’. The block servers will be like stripped-down web servers; only smart enough to accept a block for storage based on a 64-bit number and give it back in the future when presented with that same 64-bit number. If you break into one of these servers, what will you see? There will be hundreds of millions of encrypted blocks of exactly the same size addressed by a set of these numbers.
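
To make the idea concrete, here is a minimal sketch of such a block server in Python. It assumes an in-memory dictionary in place of a disk; the class name `BlockServer`, the method names and the 4096-byte block size are all assumptions made for illustration, since nothing above fixes them.

```python
BLOCK_SIZE = 4096  # assumed fixed block size; the post does not specify one


class BlockServer:
    """Toy block server: stores opaque fixed-size blocks under 64-bit numbers."""

    def __init__(self):
        self._blocks = {}  # 64-bit address -> encrypted block bytes

    def put(self, address: int, block: bytes) -> None:
        if not 0 <= address < 2**64:
            raise ValueError("address must fit in 64 bits")
        if len(block) != BLOCK_SIZE:
            raise ValueError("every block must be exactly BLOCK_SIZE bytes")
        self._blocks[address] = block

    def get(self, address: int) -> bytes:
        # The server knows nothing about files, owners or block ordering.
        return self._blocks[address]


server = BlockServer()
server.put(0x1234_5678_9ABC_DEF0, b"\x00" * BLOCK_SIZE)
assert server.get(0x1234_5678_9ABC_DEF0) == b"\x00" * BLOCK_SIZE
```

Note what is missing: no filenames, no user accounts, no directory structure. Anyone who gets inside sees only numbers and same-sized blobs.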

Next, place some intelligence on the client computers that use this space. When it is time to store a file, software will create those blocks and then send them to the block servers. But how will it decide which block server should be used, and where (the 64-bit number) on that server each block should be placed? I’m sure that you can dream up strategies to place blocks based on an ascending sequence of addresses on the next available server, but I suspect most of these ideas will require some central authority that regulates where everyone’s blocks must go to prevent conflicts. That’s no good for the next wave. We cannot efficiently grow a centralized storage system without technological (scaling) or political problems. We’ve tried that. It’s not working.

Here’s how we do it: Create a ‘storage schedule’ on the fly at the instant of storage for a file that is based on (1) a user’s privately-held encryption key and (2) the complete pathname of the file to be stored. This schedule will be created in a 64-bit number space using ‘one-way’ functions developed over the last 30 years by encryption theorists. Store the blocks. Discard the schedule. At some time in the future when you want to retrieve the file, recreate the schedule from (1) the encryption key and (2) the file’s pathname and use it to go to each block server in the list and ask for the particular block.
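
A rough sketch of how such a schedule could be derived, in Python. The choice of HMAC-SHA256 as the one-way function is my assumption for illustration, as are the function name `storage_schedule` and the way the digest is carved into a server index and an address. The sketch also assumes the client already knows how many blocks the file has and how many servers exist, which is left open above.

```python
import hashlib
import hmac


def storage_schedule(key: bytes, pathname: str, n_blocks: int, n_servers: int):
    """Return a list of (server_index, 64-bit address), one entry per block."""
    schedule = []
    for i in range(n_blocks):
        # Key the hash with the private key; mix in the pathname and block index.
        message = pathname.encode("utf-8") + b"\x00" + i.to_bytes(8, "big")
        digest = hmac.new(key, message, hashlib.sha256).digest()
        server_index = int.from_bytes(digest[:8], "big") % n_servers
        address = int.from_bytes(digest[8:16], "big")  # a value in [0, 2**64)
        schedule.append((server_index, address))
    return schedule


key = b"user's private key (example only)"
at_store = storage_schedule(key, "/home/alice/notes.txt", n_blocks=5, n_servers=100)
at_fetch = storage_schedule(key, "/home/alice/notes.txt", n_blocks=5, n_servers=100)
assert at_store == at_fetch  # same key and pathname -> same schedule
```

Because the schedule can be recomputed from the key and the pathname, nothing about it has to be written down anywhere between storing and fetching.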

Let’s think about the ramifications of this technique. First, the ‘one-way’ functions spread blocks statistically evenly, so the servers all receive very nearly the same number of blocks. Our hardware people will love us because all the equipment is used to peak efficiency. Secondly, the blocks of the file are retrieved through a direct numerical calculation – unlike conventional solutions that require two or three database lookups. This eliminates the requirement for expensive IT staff to manage complicated mission-critical database servers. Thirdly, we have a storage algorithm with two variables. If we hold the encryption key constant and vary the file's pathname, we have a hierarchical file system that can grow to any size (as long as we add more block servers when they fill up) that depends only on that one encryption key. If we vary the encryption key but use the same file pathname, the schedule is still unique, so we can have any number of independent file systems co-existing on the same block servers. That gives unbounded scalability.
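
The load-balancing claim is easy to check empirically. The Python sketch below again uses HMAC-SHA256 as a stand-in one-way function (my assumption, along with the name `server_for_block`), places 100,000 blocks on 10 servers and reports how far the busiest or idlest server strays from an even share. The spread is statistical, so expect near-equal counts rather than exactly equal ones.

```python
import hashlib
import hmac
from collections import Counter


def server_for_block(key: bytes, pathname: str, block_index: int, n_servers: int) -> int:
    """Pick a server for one block from the key, pathname and block index."""
    message = pathname.encode("utf-8") + b"\x00" + block_index.to_bytes(8, "big")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


N_SERVERS = 10
# 1,000 hypothetical files of 100 blocks each, all under one key.
counts = Counter(
    server_for_block(b"example key", f"/docs/file{f}.txt", b, N_SERVERS)
    for f in range(1000)
    for b in range(100)
)
mean = 100_000 / N_SERVERS
worst = max(abs(c - mean) / mean for c in counts.values())
print(f"largest deviation from an even share: {worst:.1%}")
```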

The real issue in everyone’s mind, however, is privacy. Why would I place my personal data on someone else’s server? Why would I trust someone to hold my data? Let’s analyze the security of this architecture. With conventional technology, looking for customer data is a bit like breaking into a bank. Once the thief gets through the ‘door’ by hacking through the security or bribing the sys admin, the file system is laid out before them and they can easily get to the specific ‘safety-deposit box’ or file of a customer. Hackers can then use their formidable skills and resources (e.g. botnets) to try to ‘break open the box’ or crack the encryption of that file. Now consider our system. We let them into the safe. A hacker can ask any block server for a block by specifying a 64-bit number. The problem they face is that they don’t know where to look. To rebuild a file without the encryption key that created the schedule, the hacker has to make, on average, 2⁶³ ≈ 9 billion billion guesses at a hundred different servers, decrypt each block and reassemble them in the correct order. This is like searching 100 haystacks to find a specific blade of grass – unlike a needle, the right block does not look any different from its neighbors. Even at a billion guesses per second, working through 2⁶³ addresses takes roughly 290 years for a single block on a single server; rebuilding a whole file that way is hopeless.
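
To put rough numbers on that search, here is a back-of-the-envelope calculation in Python. The guess rates are assumptions picked for illustration, not measurements of any real server, and the figure covers one block on one server only; a file of many blocks spread over a hundred servers multiplies the work.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_search(guesses: int, guesses_per_second: float) -> float:
    """Time in years to make `guesses` attempts at a constant rate."""
    return guesses / guesses_per_second / SECONDS_PER_YEAR


expected_guesses = 2**63  # half of the 64-bit address space, on average
for rate in (1e3, 1e6, 1e9):  # assumed requests per second
    years = years_to_search(expected_guesses, rate)
    print(f"{rate:>13,.0f} guesses/s -> {years:,.0f} years")
```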

With this technology it is finally possible to create the user-centric Internet of storage. It is possible to place all data into a distributed and homogeneous store of anonymous blocks with complete privacy for all participants. The protection of data will no longer require the machine-centric point of view of the past and users will comfortably store their data ‘on the net’ in complete confidence. It only makes sense.

Tuesday, April 24, 2007

Just what is a file system, anyway?

What is a file system? Look at the C: drive on your PC, or the F: drive that shows up when you're in the office. In the simplest terms they are both just a database managing a bunch of blocks that reside on something that has persistent magnetic, electrostatic (USB drive) or optical properties (CD-ROM, DVD). If you peel back the covers on the 'device driver' that controls your hard-drive, you see that it is just an optimized database that maps filenames to specific blocks on the drive. Some of those filenames represent directories, so the contents of those directory blocks are just another layer of this database; more filenames pointing to other blocks of data. The end result is a hierarchical file system that abstracts variable-length files into nested directories by hiding that database implementation.
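
The 'database of names pointing at blocks' view fits in a few lines of Python. This is a toy model, not how any real file system lays out its data: directories are dictionaries, files are lists of block numbers, and the names and block numbers are invented for the example.

```python
def resolve(root: dict, path: str):
    """Walk a '/'-separated path through nested directory tables."""
    node = root
    for name in path.strip("/").split("/"):
        node = node[name]  # each directory is just another name -> entry table
    return node


# Directories are dicts; files are lists of block numbers on the drive.
root = {
    "docs": {
        "letter.txt": [17, 18, 42],
        "photos": {"cat.jpg": [100, 101, 102, 103]},
    },
    "boot.ini": [3],
}

assert resolve(root, "/docs/photos/cat.jpg") == [100, 101, 102, 103]
```

Every lookup is a chain of table reads, one per directory level, and that chain is the database the hierarchy hides.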

Now take another step back and look at your mapped drive on F:. What is that? Another database. A little different, perhaps. It's a mapping of your username, password and group memberships with a smattering of security settings to a specific directory on a certain hard-drive housed in a file server machine in your office. It looks like a part of the directory system on a hard-drive because it is just that. The only difference is that the hard-drive is in a different computer and there is a conversation between your computer and the file server to deliver the files when you want them. Again, the principles of abstraction insist that the details of this database are hidden to simplify the maintenance of this remote file system.

Let's take one more step. Let's look at a corporate-wide file system or perhaps the storage system of an on-line provider. How are they built? They're too large to be formed by simple clusters of Windows or Linux file servers. They're made from an alphabet soup of SAN, DAS and NAS and they run a complex application called File Virtualization. What's that? Well, it's another database. This one has the much richer content of implementation detail for each of the component storage pieces. Things like capacity, speed, physical location, maintenance history. Like the layers below, it will map the credentials of a user to entry points in the storage devices where the user's files will go. It's smart enough to shift user files around to ensure all of the storage devices are used efficiently, manage disk-to-disk and disk-to-tape backup procedures and presumably it can react when a sub-system fails.

This all makes good sense. What we have here is a set of layers of storage that have evolved over time to compensate for the weaknesses of the earlier layers. A single hard-drive is not big enough to meet the needs of an office. A file server does not have enough of its own drives to provide space for a large corporation. It is too much work for an IT staff to make sure that all the storage devices are used at capacity and are properly protected.

But there's a complexity problem here. Each of the layers is progressively more complicated. Each layer's 'database' is a single point of failure that must be protected through strategies of multiple redundant copies. A breakdown in any point in this chain may lead to an interruption of service while a particular database is restored from a backup. Highly qualified IT professionals who understand the complicated software must be on hand to monitor these processes and deal with conflicting circumstances that might pollute one of these databases as they are re-integrated with the real-time data. In short, it is expensive and it is fragile.

What about a fresh look at file storage? By changing the fundamental architecture of file data storage it is possible to replace the complexity of all those layers with one simple layer. This permits a massive file system that supports unlimited numbers of private collections of files across any number of cheap server appliances and does not require any databases at all. Would you as a consumer of storage space be interested in a file system that intrinsically self-balances, grows organically as need permits and does not require a backup procedure to protect data?

Impossible? No, it's not. Keep reading and you'll see how one simple algorithm can make all file storage simple. It's all about thinking outside the box.

Sunday, April 22, 2007

Mental Models

How many discrete objects are in your home? Try to estimate the pieces of furniture, appliances, entertainment devices, books, DVDs, ornaments. Then drill down even deeper. How many dishes, pictures on walls, pens in drawers, toothbrushes, paper documents and sewing needles can you add to that number? I bet there are tens of thousands of distinct things in your home that you could put your hands on within 60 seconds of searching.

Now turn to your computer. How many separate files are on it? If you're a developer with ten years of history, it might number in the high end of the thousands. The average user will have fewer. How long would it take you (or them) to find a specific file? Even with Vista's improved searching, it could take tens of minutes. In fact, it might prove impossible to find that file at all.

Why is that? It's because your brain has been wired through eons of evolution to work in three-dimensional space. You remember things in a 3D context, and you learn the geometry of your own home because you spend so much time navigating through it. In contrast, a computer directory is a two-dimensional hierarchy of words. It's completely alien to your evolutionary past and to work with it you have to develop new skills. Up until you saw your first files-view, you had never spoken that way or thought that way.

In medieval times, before paper was prevalent and when most people couldn't read, scholars used a technique called Memory Theatre to remember lots of unrelated pieces of information. People would imagine themselves walking through a large cathedral and they would make associations between objects encountered in such a walk with what they wanted to remember. Later they would retrace the walk to recall all of the items. James Burke does a brilliant recounting of this in the "A Matter of Fact" episode of his "The Day the Universe Changed" series.

Wouldn't it be cool if we could walk through our own home and find all of our computer files on the walls and in the drawers?

Wednesday, April 4, 2007

Wireless doesn't mean tiny

I was a big proponent of the J2ME environment from Sun for wireless devices. Seemed wise. Tailor the run-time environment to fit in the small memory spaces of cellphones and Personal Digital Assistants or PDAs (does anyone still call them that anymore?). Then you can write code that is sort of like the code you'd write for the workstations and laptops and it would work.

Well, it did, within reason, but ultimately it was just another fork in the development of applications. You see, technology relentlessly advances. Processors are still getting smaller; memories continue to get bigger. When you look at a handheld device you have to ask what it is that is being constrained by that physical size and what will continue to be constrained in the near future.

Sure, you say, I know: power. The processor can only run so fast without draining the battery. OK, I agree. So why write applications that fit in small memory spaces? That's what J2ME is. A stripped-down version of the run-time that fits in a few megabytes. You can buy a 128 GByte USB stick for 10 dollars. Assume the new handhelds will have this space and skate to where the action will be.

The biggest problem with J2ME is that it tries to replace the underlying operating system. Here's a better idea. Put a real OS in there. Linux will fit in the memory space of those devices nicely because you can create a distribution that only has what's needed for that device without compromising the pieces that remain. That isn't possible with a monolithic operating system. Sure, you can create a smaller version of a big OS for these devices, but that's just another development fork just like J2ME was. Guess what. That's exactly what Apple did with their iPhone.

It will happen that way. The economics will dictate that Linux will ultimately run most of your handheld devices. The scary thing is that once you get used to these devices and the applications that run on them, you won't need the bloated operating systems that run the workstations. You will start using those machines just as dumb terminals for the devices in your hands. You'll have figured out how to do your word processing on these gadgets and you'll think "if I just had a bigger screen to do this, I'd be OK. Easier than learning a new package...". There's a very smart man called Clayton Christensen who sees this happening. Check out his podcast.

Shhh. Does anybody hear a dynasty crumbling?

Wednesday, March 28, 2007

Do you like the way computers make us do storage?

Hierarchical storage. What a great idea! My hat's off to the guy that came up with the idea of infinitely-nested layers of directories (or folders as we like to call them now) that can have as many files as needed. What a brilliant way to organize digital data. Modern operating systems or application suites could not exist without this concept. Seems so blindingly obvious now that you wonder where I am going with this.

If you think about it, a hierarchical model is ideal for organizing lots of separate pieces. The names of the folders can mean something or they may not. It doesn't matter because you can still express in code that a given piece of data is in a place that is defined by this precise syntax of sub-directories and it will be there. Unless you deliberately move it. It's deterministic, it can be permanent and it can accommodate new files and new directories later on when the need exists.

But if you use this for your personal data it gets nasty. Can you remember where you stored that text file containing the street address of that excellent restaurant in Seattle where you had those great mussels last November? No, so you've got to hunt for it. Now the sheer number of directories fights against you. There are too many paths. You work down them randomly. You try to do a file search down a few folders. But was it end of November or first week of December? Did you even spell it correctly in your haste? You give up because the amount of work required to find that information outstrips your desire to have it.

Sad thing is, you've got computer skills. You knew how to do a file search, right? What about the new computer user who doesn't understand that a picture of a word in an image file is different than a text file containing that word? What chance have they got to find that file? Yeah, maybe they'll grit their teeth and slog through a download and installation of Google's latest desktop search utility. But will some conceptual misunderstanding of how to use that tool defeat them in the end?

To remember a personal thing, you need a real-world reference. That's the way the brain has evolved over the past eons. You operate with visual or aural clues to other things that have some sort of abstract relationship. Why is it that you can remember how to drive to that restaurant in Seattle the next time you're down there even though you gave up on finding its address at home? You only had a vague idea where it was in the city, but you got there. What's happening here? Hey, a city is a city and it has roads and some roads have buildings that look like large boxes, but this one was weird because it had all of its windows in that interesting geometric pattern, and wasn't that just before I turned left, and ..... Try and capture that in a text file.

The point is that we've got the technology to store lots of things, but now we've got to build some new tools that use this technology wastefully, perhaps, but get the job done in new ways. I've got some ideas. Want to hear them?

Tuesday, March 27, 2007

Visions of storage...

Twenty-five years with your hands on technology gives you a pretty good feel for where things are going. You can extrapolate.

I was there when the microprocessor was born. A forty-pin wonder that you could wire into a circuit (I was a digital electronics design engineer back in those days) and you could write assembly language to make things happen. You need to appreciate why this was so special to us. We knew hardware. We knew what registers were because we had burned our fingers connecting the chips together, and so "LD AL, 20H" made perfect sense to us. It meant we could eliminate fifty wires on our circuit board just by getting that two-byte instruction burned into an EPROM. Ask the 'Woz', he'll tell you how exciting that was.

But I stopped writing assembly code less than ten years later. Why? Too much effort. Let's do it in 'C'. Sure we lost sight of the details of how it worked. We had to trust the compiler writers. We had to build bigger memories because the code size quadrupled. We had to get floppies working to hold all this code because we were writing it faster than our microcomputers could hold it. Complexity was increasing as fast as Moore's Law could build better chips. It seemed like a simple linear progression. More code, more features, bigger programs.

Then something happened that changed the way we thought about systems. The hard-drive appeared. This was permanent storage. Yeah, so were floppies if they didn't fail, but this was something esoterically different. Up until then, your program resided on a disk, was inserted when needed and kept all its state on that disk; that program was separate from all others. It was logically independent. The hard drive changed that because all programs and their settings were now on the same slab of magnetics. Now the first program could know about the second program. The 'application' was born: lots of programs cooperating to do a much bigger task. Most importantly, the hard drive became a universal place to store anything related to you. Your entire computing history was intertwined through settings on that drive. It was a good thing: you could tune your desktop to your own preferences. The sad thing was that when that hard drive crashed, you would waste weeks re-tuning to almost get back, and after the fourth crash (or computer upgrade), you stopped tuning. It was too disheartening to start again.

Where are we going next? In very abstract terms, it is clear to me that everyone is going to store everything on the Web. That's it. Simple as that. Today's hard drive will become just a caching device to speed things up. No one will be willing to be locked into having one computer hold their life, because computers are just silicon and there's so much of it out there. Why aren't we there yet? Trust. There are online companies that offer to mirror your computer up to their secure data centers, but you have to take that on faith. They say it uses XYZ encryption and that their techs are bonded and equipment is protected by guards and dogs and walls of concrete. Trouble is, nobody wants to fly out to Colorado to see if that is true. And yet, people want the convenience of letting somebody else deal with the storage thing. Just like paying for TV, electricity or the phone bill, it comes naturally.

Something is going to come along that solves the trust issue. Then it's all going to flip. We'll all have our user-centric view of our data and we'll look at it from different devices depending on what's convenient. We'll use one of the company's big-screen workstations with the ergonomic mouse when that's handy. We'll use a laptop when we're on the road. We'll use a Treo or a BlackBerry when we're in a cab. We won't think about whether our data is safe from theft or loss because, duh, it's on the Web.

We'll get there. You want to be there.
