Blockchain Made Simple

Blockchain! Blockchain! Blockchain! That ought to get some attention. If anyone hasn’t noticed, blockchain is somewhere near the hysteria phase of the hype cycle. Everyone associates blockchain with digital currency. The apparent bubble around Bitcoin and other digital or crypto currencies adds to the intensity of the discussion. But very few people have a firm grasp of what blockchain is and its potential. In this blog, I will differentiate blockchain from digital currency and discuss blockchain’s potential for profoundly disrupting commerce and society.

Not a single technology

First, blockchain is not a single technology. Like email, or a packet network, blockchain is a method of communication with certain qualities that many different technologies can be used to implement. The most popular implementations of blockchain (Bitcoin and its ilk) use specific technology that relies on cryptographic signatures for verifying a sequence of transactions. However, tying blockchain to a specific technology is unnecessarily limiting. The term “secure distributed ledger” is more descriptive, but secure distributed ledger is also pedantic and long-winded, so most people will continue to call it blockchain.

Essential blockchain

In essence, blockchain is a definitive statement of the result of a series of transactions that does not rely on a central authority for its validity. Putting it another way, contributions to a blockchain may come from many different sources. Although the contributions are authenticated, they are not controlled by a central authority.

Think about real estate and deeds. Possession of real estate relies on courts and official records that validate ownership of physical lumps of earth. A central authority, in the form of a county records office, has the official record of transactions that prove the ownership of parcels of physical land. Courts, lawyers, surveyors, title companies, and individuals all contribute to the record. If you want to research the ownership of a parcel in another county, state, or country, you either travel or hope that the local authority has put the records online and keeps them current. If you want to record a survey, deed, or sale, you must present the paperwork to the controlling local authority.

If you want to put together a real estate transaction that depends on interlocking simultaneous ownership by a complex structure of partners in parcels spread over several jurisdictions, the bureaucratic maneuvering and paperwork to complete the transaction is daunting.

Blockchain could be used to build a fully validated and universally accessible land records office that does not rely on a local authority. Instead of interacting with local jurisdictions, the blockchain would validate land information and contracts, simplifying and reducing the cost of land ownership deals.

This type of problem repeats itself in many different realms. Ownership of intellectual property, particularly digital intellectual property that travels from person to person and location to location at the speed of light presents similar problems.

Blockchains and distributed transactional databases

Although blockchain is implemented quite differently from a traditional distributed transactional database, it performs many of the same functions. Distributed transactional databases are nothing new. In preparing to write this blog, I pulled a thirty-year-old reference book from my shelf for a quick review. A distributed database is a database in which data is stored on several computers. The data may be distinct (disjoint) or it may overlap, but the data on each computer is not a simple replica of the data on other computers in the distribution.

A transactional database is one in which each query or change to the database is a clearly defined transaction whose outcome is totally predictable based on itself and the sequence of prior transactions. For example, the amount of cash in your bank account depends on the sequence of prior deposits and withdrawals. By keeping close control of the order and amounts of transactions, your bank’s account database always accurately shows how much money you have in your account. If some transactions were not recorded, or were recorded out of order, at some instant your account balance would be incorrect. You rely on the integrity of your bank for assurance that all transactions are correctly recorded.
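
The bank account example can be sketched in a few lines of Python. This is only an illustration, not how any real bank stores accounts: the names `Transaction` and `replay`, the numbered log, and the no-overdraft rule are all assumptions made up for this sketch. It shows that the balance is nothing more than the result of applying every transaction in order, and that a missing entry is detectable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sequence: int  # position in the ordered log, starting at 1
    amount: int    # cents; positive is a deposit, negative a withdrawal


def replay(transactions, opening_balance=0):
    """Apply transactions in the order given and return the final balance.

    Raises ValueError if an entry is missing or out of order, or if a
    withdrawal would overdraw the account (an assumed rule for this sketch).
    """
    balance = opening_balance
    expected = 1
    for tx in transactions:
        if tx.sequence != expected:
            raise ValueError(f"expected transaction {expected}, got {tx.sequence}")
        if balance + tx.amount < 0:
            raise ValueError(f"transaction {tx.sequence} would overdraw the account")
        balance += tx.amount
        expected += 1
    return balance


log = [Transaction(1, 10000), Transaction(2, -4000), Transaction(3, 2500)]
balance = replay(log)  # 10000 - 4000 + 2500 = 8500 cents
```

Dropping the second entry from `log` makes `replay` raise an error instead of quietly reporting a wrong balance.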

The value added when a database is distributed is that individual databases can be smaller and maintained close to the physical origin of the data or by experts familiar with the data in the contributing database. Consequently, distributed data can be of higher quality and quicker to maintain than the enormous data lakes in central databases.

Transactional and distributed databases are both common and easy to implement, or at least they have been implemented so many times that the techniques are well known. But when databases are supposed to be both distributed and keep transactional discipline, troubles pop up like a flush of weeds after a spring rain.

Distributed transactional databases in practice

The nasty secret is that although distributed transactional databases using something called “two-phase commit” are sound in theory and desirable to the point of sometimes being called a holy grail, in practice they don’t work that well. If networks were reliable, distributed transactional database systems would work well and likely would have become common twenty years ago, but computer networks are not reliable. They are designed to work very well most of the time, but the inevitable trade-off is that two computers connected to the network cannot be guaranteed to be able to communicate successfully in any designated time interval. A mathematical theorem proves the inevitable trade-off.
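
For readers who have not met two-phase commit, here is a minimal sketch in Python. It is a toy under stated assumptions, not a production protocol: the `Participant` class, its `reachable` flag, and the `two_phase_commit` function are hypothetical names invented for illustration, and a real implementation also needs durable logs, timeouts, and recovery. The coordinator first asks every participant to vote, then commits only if all of them answered yes.

```python
class Participant:
    """A hypothetical database node taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable  # False simulates a network failure
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether this node can commit.
        if not self.reachable:
            raise TimeoutError(f"{self.name} did not answer")
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        # In this toy the rollback always arrives; on a real network it may not.
        self.state = "rolled back"


def two_phase_commit(participants):
    # Phase 1: collect votes; an unreachable node counts as a "no".
    try:
        votes = [p.prepare() for p in participants]
    except TimeoutError:
        votes = [False]
    # Phase 2: commit everywhere or roll back everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled back"
```

One unreachable participant is enough to roll back the whole transaction, which is why the reliability of the network matters so much.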

Look at a scaled-up system, for example Amazon’s ordering system. If records were distributed between the distribution centers from which goods are shipped, the sources of the goods, and Amazon’s accounting system, a single order might require thousands of nearly simultaneous connections to start the goods on their way to the customer. These connections might reach over the entire globe. The probability that any one of those connections will be blocked long enough to stall the transaction is intolerably high.
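
A quick back-of-the-envelope calculation shows why. The sketch below assumes that connections fail independently and with the same probability, which real networks do not strictly obey, and the figures of one failure in a thousand and one thousand connections are invented for illustration. The function name `stall_probability` is likewise hypothetical.

```python
def stall_probability(per_connection_failure, connections):
    """Probability that at least one of the connections fails.

    Assumes independent failures with identical probability.
    """
    all_succeed = (1 - per_connection_failure) ** connections
    return 1 - all_succeed


# With a 0.1% chance of failure per connection and 1,000 connections,
# the transaction stalls roughly 63% of the time.
p = stall_probability(0.001, 1000)
```

Even very reliable individual connections add up to an unreliable transaction once enough of them must all succeed at once.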

Therefore, practical systems are designed to be as resilient as possible to network interruptions. In general, this means making sure that all the critical connections are within the same datacenter, which is designed with ample redundancy to guarantee that business will not halt when a daydreaming employee trips on a stray cable.

Reliable, performant transactional systems avoid widely distributed transactions. Accounts are kept in central databases and most transactions are between a local machine running a browser and a central system running in a cloud datacenter. The central database may be continuously backed up to one or more remote datacenters and the system may be prepared to flip over to an alternative backup system whenever needed, but there is still a definitive central database at all times. When you buy a book on Amazon, you don’t have to wait for computers in different corners of the country and the world to chime in with assent before the transaction will complete.

That is all well and good for Amazon, where sales transactions in an enterprise resource planning database (accounting system to non-enterprise system wonks) make sense, but not for problems that involve the coordination of many sources of truth, like the jurisdictions in our land records example above. In that case, each records office is the definitive source of truth for its share of the transaction, but there is no central authority that rules them all, and we spend days and weeks wading through a snake pit of physical contacts and possibly erroneous paper when we cross jurisdictional boundaries.

Practical blockchain

This is where blockchain comes in. Imagine a land record system in which rights, records, and surveys for a parcel of real estate are all recorded. Anyone could record information and transfer rights by presenting a credential that proves they hold a given right, then transferring the right to a properly credentialed recipient, all within a single computer interface. A chain of transactions stretching back to the beginning of the blockchain verifies the participants’ data and rights. The identities of the rights holders could be required to be public or not depending on the needs of the system, but the participants can always present verifiable credentials to prove they are agents holding the rights they assert or are an authority for the data they contribute. And all this authenticated data would be spread over many different computer systems.

Digital currency

This is basically the way crypto currencies work. When a Bitcoin is acquired, its value is backed by a verifiable record tracing the transfers of the coin back to its moment of origin. Maintaining the record is the work of many computers distributed all over the world. Bitcoin solves the problem of the cost of maintaining the record by offering freelance maintainers, called miners, bitcoins in return for performing the maintenance. The integrity of the record is maintained through elaborate cryptographic protocols and controls. Bitcoin and similar currencies are designed to be pseudonymous: purchasing or selling Bitcoins requires credentials, but not the identities of the participants. This apparent anonymity makes Bitcoins attractive for secret or illegal transactions.

A downside of the Bitcoin-type blockchain is the tremendous amount of expensive computing required to maintain the blockchain. In addition, some critics have charged that Bitcoin-type currencies are enormously complex schemes that promise profits based on future investments rather than true increases in value, which amounts to a Ponzi scheme. Whether Bitcoin is a Ponzi scheme is open to argument.

The red herring

Unfortunately, the crypto-currency craze is a red herring that distracts attention from blockchain’s profound disruptive potential. Stock, bond, and commodity exchanges, land record offices, supply chain accounting and ordering, tracking sales of digital assets like eBooks, and many other types of exchange rely on central clearing houses that charge for each transaction. These clearing houses facilitate exchange and collect fees without contributing substantial value. Blockchain has the potential to simplify and speed transactions and eliminate these middlemen, like Uber eliminates taxi fleets and online travel apps eliminate travel agents. Amazon eliminated publishers and literary agents for eBooks and blockchain could eliminate Amazon from eBook sales.

How can this work without a central authority holding the records of the transactions? The underlying concept is that as long as the blockchain is maintained according to rules known to a group of maintainers, no one maintainer need have central control, because the other maintainers can detect when somebody fudges something. A properly designed blockchain makes it impractical to alter the record in ways contrary to the rules without the other maintainers noticing. There is more than one way to do it. The Bitcoin model is not the only way to implement a blockchain. Paid miners and anonymity are not inherent in blockchains.
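
One common way to make fudging detectable is to chain records together with cryptographic hashes. The Python sketch below is a bare illustration of that idea, not Bitcoin’s actual block format or any particular product’s: the functions `block_hash`, `append_block`, and `is_valid` and the dictionary layout are assumptions made up for this example, and it leaves out signatures, consensus, and mining entirely.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, previous_hash):
    """SHA-256 over a block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Add a record whose hash depends on everything recorded before it."""
    index = len(chain)
    previous_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    })


def is_valid(chain):
    """Recompute every hash; any edited block breaks the chain."""
    previous_hash = GENESIS
    for index, block in enumerate(chain):
        if block["index"] != index or block["previous_hash"] != previous_hash:
            return False
        if block["hash"] != block_hash(index, block["data"], previous_hash):
            return False
        previous_hash = block["hash"]
    return True
```

A cheater who rewrites an old record and recomputes every later hash would still end up with a different final hash than the copies held by the other maintainers, which is how the group catches the change.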

For example, an author could register an electronic document with an electronic book blockchain registry. Readers could enter into a contract with the author for the right to access the book. These contracts could be registered in the blockchain. Readers could relinquish their contracts with the author or transfer their contract to another reader and lose the right to access the book. The rules depend on the contracts represented in the blockchain, not the technology. Additional technology such as cryptographic keys might be used to enforce contracted book access. Unlike current DRM schemes, no authority like a publisher or an Amazon is involved, only the author, the reader, and the transaction record in the blockchain.
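
To make the book example concrete, here is a small Python sketch of the contract rules alone, with the blockchain itself reduced to an append-only list. Everything in it is hypothetical: the `AccessRegistry` class and its method names are invented for illustration, and a real system would also need credentials and a tamper-evident record.

```python
class AccessRegistry:
    """Append-only record of access contracts for one book (a toy model)."""

    def __init__(self, author, title):
        self.author = author
        self.title = title
        self.log = []  # entries are only ever appended, never edited

    def holders(self):
        """Replay the log to find who currently holds access."""
        current = set()
        for action, source, target in self.log:
            if action == "grant":
                current.add(target)
            elif action == "transfer":
                current.discard(source)
                current.add(target)
            elif action == "relinquish":
                current.discard(source)
        return current

    def has_access(self, reader):
        return reader in self.holders()

    def grant(self, reader):
        # The author enters into a contract with a reader.
        self.log.append(("grant", self.author, reader))

    def transfer(self, from_reader, to_reader):
        # Only a current holder can pass the contract on, and loses access.
        if not self.has_access(from_reader):
            raise PermissionError(f"{from_reader} holds no contract to transfer")
        self.log.append(("transfer", from_reader, to_reader))

    def relinquish(self, reader):
        if not self.has_access(reader):
            raise PermissionError(f"{reader} holds no contract to relinquish")
        self.log.append(("relinquish", reader, None))
```

Access is never stored directly. It is always derived by replaying the recorded contracts, so the rules live in the record rather than with a publisher or bookseller.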

I’ll get more into how it’s done in another blog.

The Task List Reveals a Computer’s Beating Heart

Windows 10 Task List

Like an echocardiogram that shows the blood flowing through a beating heart, the task list shows the flow of activity on a computer. On Apple products, the task list is called the activity list. In Unix and its derivatives, such as Linux and Android, the task list is usually called the process list. In all of these operating systems, a task, activity, or process, whichever you want to call it, is an executing program. Most of the time, there are a lot of them.

Processors

A processor can only run one process at a time, but it switches between processes so rapidly that it looks like the processor is running many processes at the same time. All the processes on the task list have been started but have not finished. Some are waiting for input, others are waiting for a chance to use some busy resource like a hard drive, but all are entitled to some time on the processor when their turn comes up.

Many computers today have more than one processor, which increases the number of processes that can run at one time and the amount of time a computer can give to each process. Different operating systems have different strategies for switching between processes, but all the strategies are like plate spinning acts. The plate spinner hurries from plate to plate, giving plates a spin when they begin to slow down and need attention. (If you don’t know what plate spinning is, see it here.) The processor does the same thing, executing a few instructions for a program, then rushing to the next process.
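
The plate spinning act can be simulated in a few lines of Python. This sketch uses round-robin scheduling, the simplest strategy of this kind; real operating systems use more elaborate schemes with priorities and waiting states. The function name `round_robin`, the process names, and the work units are all made up for illustration.

```python
from collections import deque


def round_robin(work, time_slice):
    """Simulate one processor sharing time among several processes.

    work maps a process name to the units of work it still needs.
    Returns the timeline of (name, units run) turns and the finish order.
    """
    queue = deque(work.items())
    timeline = []
    finished = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)  # spin this plate for one turn
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            queue.append((name, remaining))  # back of the line for another turn
        else:
            finished.append(name)
    return timeline, finished


timeline, finished = round_robin({"browser": 5, "editor": 2, "backup": 4}, time_slice=2)
# finished is ["editor", "backup", "browser"]
```

No process runs to completion before the others get a turn, which is why all of them appear to be running at once.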

Processes

All the processes, both active and waiting, show up on the task list. That includes malware as well as legitimate processes. If you can spot a bad guy on the process list, you can kill it. The kill may not be permanent, because processes can regenerate themselves, but it is usually worth a try. The challenge is to sort the good from the bad. Unless you know what you are shooting at, you might crash the entire computer or lose data, so be careful. You could find yourself restoring your entire system from a backup. Nevertheless, this is one area where you can strap on your weapons and wage open warfare against malware.

When I see an unfamiliar process on the task list, I usually run to Google. Most of the time, Google results tell me that the process is something innocuous that I hadn’t noticed before, but not always. By the way, be a little careful when Googling. There are questionable companies out there with sites that will appear in the search result and try to take advantage of you by offering unnecessary clean up services or dubious downloads. Microsoft will give you trustworthy advice, as will the established antivirus companies, but avoid sending money or installing programs from places you have never heard of. Some may be legitimate, but not all. Above all, don’t let anyone log into your computer remotely without rock solid credentials.

CPU Time

The task list tells you more than just the names of the running processes. There are a number of readouts on the state of each process and the resources it is using. The ones I usually look at first are the percentage of CPU time being taken and the accumulated CPU time. (Click at the top of the column to sort the processes by the metric.) Both of these metrics show the amount of time a process consumes on the processor, that is, the amount of time the plate spinner has spent spinning the plate. A program that consumes more CPU time is using an extra share of the system’s most critical resource. Shutting down a high CPU consumer will do more to improve your computer’s performance than halting a low CPU consumer.
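
Sorting by a column is easy to picture in code. The Python sketch below works on made-up readings rather than a live system: the `ProcessInfo` record, the `top_consumers` function, and every number in the sample are invented for illustration and do not come from any real task list.

```python
from dataclasses import dataclass


@dataclass
class ProcessInfo:
    name: str
    cpu_percent: float  # share of the processor in use right now
    cpu_seconds: float  # accumulated CPU time since the process started


def top_consumers(processes, metric="cpu_percent", limit=3):
    """Return the heaviest processes by the chosen metric, largest first."""
    if metric not in ("cpu_percent", "cpu_seconds"):
        raise ValueError(f"unknown metric: {metric}")
    ranked = sorted(processes, key=lambda p: getattr(p, metric), reverse=True)
    return ranked[:limit]


# Invented sample readings.
sample = [
    ProcessInfo("browser", 12.5, 840.0),
    ProcessInfo("editor", 0.4, 35.0),
    ProcessInfo("backup", 3.1, 1920.0),
    ProcessInfo("mystery", 48.0, 60.0),
]
busiest_now = [p.name for p in top_consumers(sample, "cpu_percent", limit=2)]
busiest_overall = [p.name for p in top_consumers(sample, "cpu_seconds", limit=2)]
```

The two metrics can tell different stories: a process that has just started hogging the processor ranks high on percentage but low on accumulated time.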

Some high-consumer processes are legitimate. For example, you will often see a browser using a lot of CPU time. That is because a browser does a lot of work. Any computing that takes place in web page interactions is chalked up to the browser. Some internal system processes, such as “system interrupts,” also do a lot of work and rank high on CPU consumption. If you see an installed application that is hogging CPU, you might check its configuration. There may be adjustments that will reduce consumption. Google will help you find what to do, but keep track of your changes so you can change them back if they don’t work. If you don’t use the program much, perhaps turning it off would be a good idea. When a high consumer happens to be malware, put a high priority on scrubbing it out. High CPU-consuming malware is like a blockage in a coronary artery. You’ll feel much better without it.

Next Time

There’s more to the task list. Next time.

Service in Business, Computing, and the Cloud

I am a longtime enthusiast of the IT Infrastructure Library (ITIL). I was first introduced to ITIL in the mid-nineties by a support architect from the Netherlands. The ITIL practices seemed overly complex, but I was pleased to see that the ITIL approach to service desk management was similar to the methodology built into the Network Management Forum trouble ticketing standard that I had worked on a few years before.

Business Services

ITIL places a heavy emphasis on managing IT as a system of services. Service is an important concept in both business and computing architecture. In business, a service comes into being when a buyer purchases an action, often contrasted with a transaction in which the buyer purchases goods, i.e., things. Hence the phrase “goods and services.” An important part of the notion of “service” in business is that services always have a buyer and a seller. There are always two participants in the transaction. I can wash my clothes myself, or I can purchase laundry service from a laundry. When I wash them myself, it is just me grabbing a box of detergent. When I subscribe to a laundry service, it is me and the laundry.

In an IT department, I can acquire help desk software, hire support analysts and managers, and run my own help desk, or I can subscribe to a help desk service. This is exactly like choosing to subscribe to a laundry service. In both cases, I have decided to have something done for me instead of doing it myself. This is a business decision, not a technical decision, although deciding between building a help desk or subscribing to a service involves a choice between technologies.

Software Services

I can do the same thing in software. Suppose I want to perform arithmetic calculations in a program. I could write a function to do it. Or I might find a library with a function to link into my program. In both cases, my program would be doing the calculation on my machine. Alternatively, I could use a different software architecture and call the Google calculator web service. If I chose the service, my program would become a consumer of a Google cloud service and the calculation would be done in one of Google’s warehouse-scale computers. I could choose to call Google’s REST API or SOAP API, but no matter which I chose, my program would still be a consumer of Google’s service.
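
The difference between the two architectures can be seen in a small Python sketch. It does not call any Google API. Instead, it starts a tiny stand-in provider on the local machine so the example is self-contained, and the `/add` endpoint, the `AddHandler` class, and the function names are all hypothetical. The point is only the contrast: `add_local` computes in my own program, while `add_via_service` makes my program a consumer of someone else’s computation.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse
from urllib.request import ProxyHandler, build_opener


def add_local(a, b):
    """Do it myself: the calculation runs inside my own program."""
    return a + b


class AddHandler(BaseHTTPRequestHandler):
    """Stand-in provider: answers GET /add?a=..&b=.. with a JSON result."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        total = float(query["a"][0]) + float(query["b"][0])
        body = json.dumps({"result": total}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def add_via_service(base_url, a, b):
    """Have it done for me: the provider performs the calculation."""
    opener = build_opener(ProxyHandler({}))  # talk to the provider directly
    with opener.open(f"{base_url}/add?a={a}&b={b}", timeout=5) as response:
        return json.loads(response.read())["result"]


def demo():
    server = HTTPServer(("127.0.0.1", 0), AddHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base_url = f"http://127.0.0.1:{server.server_port}"
        return add_local(2, 3), add_via_service(base_url, 2, 3)
    finally:
        server.shutdown()
        server.server_close()
```

Both calls produce the same sum, but only the second depends on a network connection and on a provider being up, which is the trade I accept when I become a consumer of a service.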

Cloud Services

Cloud computing is based on a great divide between consumers and providers, both in a business and a programmatic sense. A great divide between customers and vendors can lead to conflicts and severed relations, but the divide between consumers and providers is a division by choice for more efficient allocation of resources and may well result in improved relations.