Blockchain Made Simple

Blockchain! Blockchain! Blockchain! That ought to get some attention. If anyone hasn’t noticed, blockchain is somewhere near the hysteria phase of the hype cycle. Everyone associates blockchain with digital currency, and the apparent bubble around Bitcoin and other digital or crypto currencies adds to the intensity of the discussion. But very few people have a firm grasp of what blockchain is and its potential. In this blog, I will differentiate blockchain from digital currency and discuss blockchain’s potential for profoundly disrupting commerce and society.

Not a single technology

First, blockchain is not a single technology. Like email or a packet network, blockchain is a method of communication with certain qualities that many different technologies can be used to implement. The most popular implementations of blockchain (Bitcoin and its ilk) use specific technology that relies on cryptographic signatures to verify a sequence of transactions. However, tying blockchain to a specific technology is unnecessarily limiting. The term “secure distributed ledger” is more descriptive, but it is also pedantic and long-winded, so most people will continue to call it blockchain.

Essential blockchain

In essence, blockchain is a definitive statement of the result of a series of transactions that does not rely on a central authority for its validity. Put another way, contributions to a blockchain may come from many different sources; although the contributions are authenticated, they are not controlled by a central authority.
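To make that concrete, here is a minimal sketch of the idea in Python (my own illustration, not the format of any real blockchain): each block commits to the hash of the block before it, so any participant can check the whole history without trusting a central record keeper.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash the canonical JSON form of a block."""
    data = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(data).hexdigest()

def append_block(chain: list, transaction: str) -> None:
    """Append a block that commits to the entire chain before it."""
    prev = chain[-1]["hash"] if chain else "genesis"
    block = {"prev_hash": prev, "transaction": transaction}
    block["hash"] = block_hash(block)
    chain.append(block)

def verify(chain: list) -> bool:
    """Any edit to an earlier block breaks every hash after it."""
    prev = "genesis"
    for block in chain:
        body = {"prev_hash": block["prev_hash"], "transaction": block["transaction"]}
        if block["prev_hash"] != prev or block["hash"] != block_hash(body):
            return False
        prev = block["hash"]
    return True

chain: list = []
append_block(chain, "parcel 12 deeded to Alice")
append_block(chain, "parcel 12 deeded to Bob")
assert verify(chain)
chain[0]["transaction"] = "parcel 12 deeded to Mallory"  # tampering...
assert not verify(chain)                                 # ...is detected
```

Real blockchains add consensus rules that decide who may append and how disputes are resolved, but this tamper-evident chain of hashes is the core of the data structure.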

Think about real estate and deeds. Possession of real estate relies on courts and official records that validate ownership of physical lumps of earth. A central authority, in the form of a county records office, holds the official record of transactions that prove the ownership of parcels of physical land. Courts, lawyers, surveyors, title companies, and individuals all contribute to the record. If you want to research the ownership of a parcel in another county, state, or country, you either travel or hope that the local authority has put the records online and keeps them current. If you want to record a survey, deed, or sale, you must present the paperwork to the controlling local authority.

If you want to put together a real estate transaction that depends on interlocking simultaneous ownership by a complex structure of partners in parcels spread over several jurisdictions, the bureaucratic maneuvering and paperwork needed to complete the transaction are daunting.

Blockchain could be used to build a fully validated and universally accessible land records office that does not rely on a local authority. Instead of interacting with local jurisdictions, the blockchain would validate land information and contracts, simplifying and reducing the cost of land ownership deals.

This type of problem repeats itself in many different realms. Ownership of intellectual property, particularly digital intellectual property, which travels from person to person and location to location at the speed of light, presents similar problems.

Blockchains and distributed transactional databases

Although blockchain is implemented quite differently from a traditional distributed transactional database, it performs many of the same functions. Distributed transactional databases are nothing new. In preparing to write this blog, I pulled a thirty-year-old reference book from my shelf for a quick review. A distributed database is a database in which data is stored on several computers. The data may be distinct (disjoint) or it may overlap, but the data on each computer is not a simple replica of the data on the other computers in the distribution.

A transactional database is one in which each query or change to the database is a clearly defined transaction whose outcome is totally predictable based on itself and the sequence of prior transactions. For example, the amount of cash in your bank account depends on the sequence of prior deposits and withdrawals. By keeping close control of the order and amounts of transactions, your bank’s account database always accurately shows how much money you have in your account. If some transactions were not recorded, or were recorded out of order, at some point your account balance would be incorrect. You rely on the integrity of your bank for assurance that all transactions are correctly recorded.
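A toy example shows how tightly the outcome is bound to transaction order; the account logic and the numbers here are invented:

```python
# A toy transactional ledger: the balance is fully determined by the
# sequence of prior transactions.
def apply(balance: int, txn: tuple) -> int:
    kind, amount = txn
    if kind == "withdraw" and amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount if kind == "withdraw" else balance + amount

txns = [("deposit", 100), ("withdraw", 80), ("deposit", 50)]
balance = 0
for t in txns:
    balance = apply(balance, t)
print(balance)  # 70

# Replay the same transactions out of order: the withdrawal now
# precedes the deposit that funds it, and the sequence is invalid.
try:
    balance = 0
    for t in [txns[1], txns[0], txns[2]]:
        balance = apply(balance, t)
except ValueError as e:
    print("out-of-order replay fails:", e)
```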

The value added when a database is distributed is that individual databases can be smaller and maintained close to the physical origin of the data or by experts familiar with the data in the contributing database. Consequently, distributed data can be of higher quality and quicker to maintain than the enormous data lakes in central databases.

Transactional and distributed databases are both common and easy to implement, or at least they have been implemented so many times that the techniques are well known. But when a database must be both distributed and keep transactional discipline, troubles pop up like a flush of weeds after a spring rain.

Distributed transactional databases in practice

The nasty secret is that although distributed transactional databases using something called “two-phase commit” are sound in theory and desirable to the point of sometimes being called a holy grail, in practice they don’t work that well. If networks were reliable, distributed transactional database systems would work well and likely would have become common twenty years ago, but computer networks are not reliable. They are designed to work very well most of the time, but the inevitable trade-off is that two computers connected to the network cannot be guaranteed to communicate successfully in any designated time interval. A mathematical theorem proves the trade-off is inevitable.
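For the curious, here is a deliberately bare-bones sketch of the two-phase commit idea in Python, with a network timeout modeled as a silent participant. Real implementations are far more elaborate, but the failure mode is the same: one unreachable participant stalls or aborts everyone.

```python
# Bare-bones two-phase commit (illustration only; the Participant
# class and the timeout model are invented for this sketch).
from typing import Optional

class Participant:
    def __init__(self, name: str, reachable: bool = True):
        self.name, self.reachable = name, reachable
        self.committed = False

    def prepare(self) -> Optional[bool]:
        # Phase 1: vote on whether we can commit. None models a
        # network timeout: the coordinator never hears back.
        return True if self.reachable else None

    def commit(self) -> None:
        # Phase 2: make the change durable.
        self.committed = True

def two_phase_commit(participants) -> bool:
    votes = [p.prepare() for p in participants]   # phase 1: collect votes
    if all(v is True for v in votes):
        for p in participants:
            p.commit()                            # phase 2: commit everywhere
        return True
    return False  # any "no" vote, or any silence, aborts the whole transaction

print(two_phase_commit([Participant("A"), Participant("B")]))         # True
print(two_phase_commit([Participant("A"), Participant("B", False)]))  # False
```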

Look at a scaled-up system, for example Amazon’s ordering system. If records were distributed among the distribution centers from which goods are shipped, the sources of the goods, and Amazon’s accounting system, a single order might require thousands of nearly simultaneous connections to start the goods on their way to the customer. These connections might reach over the entire globe, and the probability that any one of them will be blocked long enough to stall the transaction is intolerably high.

Therefore, practical systems are designed to be as resilient as possible to network interruptions. In general, this means making sure that all the critical connections are within the same datacenter, which is designed with ample redundancy to guarantee that business will not halt when a daydreaming employee trips on a stray cable.

Reliable, performant transactional systems avoid widely distributed transactions. Accounts are kept in central databases, and most transactions are between a local machine running a browser and a central system running in a cloud datacenter. The central database may be continuously backed up to one or more remote datacenters, and the system may be prepared to flip over to an alternative backup system whenever needed, but there is still a definitive central database at all times. When you buy a book on Amazon, you don’t have to wait for computers in different corners of the country and the world to chime in with assent before the transaction completes.

That is all well and good for Amazon, where keeping sales transactions in an enterprise resource planning database (an accounting system, to non-enterprise system wonks) makes sense, but not for problems that involve coordinating many sources of truth, like the jurisdictions in our land records example above. In that case, each records office is the definitive source of truth for its share of the transaction, but there is no central authority that rules them all, and we spend days and weeks wading through a snake pit of physical contacts and possibly erroneous paper when we cross jurisdictional boundaries.

Practical blockchain

This is where blockchain comes in. Imagine a land record system where rights, records, and surveys for a parcel of real estate are all recorded, and anyone can record information and transfer rights by presenting a credential that proves they hold a given right, then transfer the right to a properly credentialed recipient, all within a single computer interface. A chain of transactions stretching back to the beginning of the blockchain verifies the participants’ data and rights. The identities of the rights holders could be required to be public or not, depending on the needs of the system, but the participants can always present verifiable credentials to prove they are agents holding the rights they assert or are an authority for the data they contribute. And all of this authenticated data is spread over many different computer systems.

Digital currency

This is basically the way cryptocurrencies work. When a bitcoin is acquired, its value is backed by a verifiable record tracing the transfers of the coin back to its moment of origin. Maintaining the record is the work of many computers distributed all over the world. Bitcoin solves the problem of the cost of maintaining the record by offering freelance maintainers, called miners, bitcoins in return for performing the maintenance. The integrity of the record is maintained through elaborate cryptographic protocols and controls. Bitcoin and similar currencies are designed to be pseudonymous: purchasing or selling bitcoins requires credentials, but not the identities of the participants. That anonymity makes Bitcoin attractive for secret or illegal transactions.
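Here is a sketch of that transfer-chain idea in Python. The record format is invented for this illustration, and it uses the third-party cryptography package for real Ed25519 signatures; actual Bitcoin transactions are structured quite differently. Each transfer hands the coin to a new public key and is signed by the current owner, so anyone holding only public keys can audit the history back to the coin’s origin.

```python
# Invented transfer-chain format; not Bitcoin's actual transaction layout.
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey,
)
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat
from cryptography.exceptions import InvalidSignature

def pub_bytes(key: Ed25519PrivateKey) -> bytes:
    return key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)

def transfer(chain: list, owner: Ed25519PrivateKey, new_owner_pub: bytes) -> None:
    """The current owner signs the hand-off to the new owner's key."""
    prev_sig = chain[-1]["sig"] if chain else b"origin"
    chain.append({"to": new_owner_pub, "sig": owner.sign(prev_sig + new_owner_pub)})

def verify_chain(chain: list, minter_pub: bytes) -> bool:
    """Anyone can audit the whole history using public keys alone."""
    signer, prev_sig = Ed25519PublicKey.from_public_bytes(minter_pub), b"origin"
    for link in chain:
        try:
            signer.verify(link["sig"], prev_sig + link["to"])
        except InvalidSignature:
            return False
        signer = Ed25519PublicKey.from_public_bytes(link["to"])
        prev_sig = link["sig"]
    return True

mint, alice, bob = (Ed25519PrivateKey.generate() for _ in range(3))
chain: list = []
transfer(chain, mint, pub_bytes(alice))      # the coin is issued to Alice
transfer(chain, alice, pub_bytes(bob))       # Alice passes it to Bob
print(verify_chain(chain, pub_bytes(mint)))  # True
```

Note that nothing in the chain names Alice or Bob; the keys are the credentials, which is the pseudonymity described above.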

A downside of the Bitcoin-type blockchain is the tremendous amount of expensive computing required to maintain the blockchain. In addition, some critics have charged that Bitcoin-type currencies are enormously complex schemes that promise profits based on future investments rather than true increases in value, which amounts to a Ponzi scheme. Whether Bitcoin is a Ponzi scheme is open to argument.

The red herring

Unfortunately, the crypto-currency craze is a red herring that distracts attention from blockchain’s profound disruptive potential. Stock, bond, and commodity exchanges, land record offices, supply chain accounting and ordering, tracking sales of digital assets like eBooks, and many other types of exchange rely on central clearing houses that charge for each transaction. These clearing houses facilitate exchange and collect fees without contributing substantial value. Blockchain has the potential to simplify and speed transactions and eliminate these middlemen, much as Uber eliminates taxi fleets and online travel apps eliminate travel agents. Amazon eliminated publishers and literary agents for eBooks, and blockchain could eliminate Amazon from eBook sales.

How can this work without a central authority holding the records of the transactions? The underlying concept is that as long as the blockchain is maintained according to rules known to a group of maintainers, no one maintainer needs central control, because the other maintainers can detect when somebody fudges something. A proper blockchain algorithm makes maintaining the chain in ways contrary to the rules computationally infeasible. And there is more than one way to do it: the Bitcoin model is not the only way to implement a blockchain. Paid miners and anonymity are not inherent in blockchains.

For example, an author could register an electronic document with an electronic book blockchain registry. Readers could enter into a contract with the author for the right to access the book, and these contracts could be registered in the blockchain. Readers could relinquish their contracts with the author, or transfer their contracts to other readers and lose the right to access the book. The rules depend on the contracts represented in the blockchain, not the technology. Additional technology, such as cryptographic keys, might be used to enforce contracted book access. Unlike current DRM schemes, no authority like a publisher or an Amazon is involved, only the author, the reader, and the transaction record in the blockchain.

I’ll get more into how it’s done in another blog.

Spectre, Meltdown, and Virtual Systems

In June of 2017 I wrote a blog for InfoWorld, “How to handle the risks of hypervisor hacking.” In it, I described the theoretical points where virtual machines (VMs) and hypervisors could be hacked. My crystal ball must have been well polished: Spectre and Meltdown prey on one of the points I described there.

What I did not predict is where the vulnerability would come from. As a software engineer, I always think about software vulnerabilities, and I tend to assume that the hardware is seldom at fault. I took one class in computer hardware design thirty years ago. Since then, my working approach has been to look first for software flaws and only consider hardware when I am forced, kicking and screaming, to check for hardware failure. This is usually a good plan for a software engineer. As a rule, when hardware fails, the device bricks (dies completely); it seldom continues to function. And there is usually not much beyond rewriting drivers that a coder can do about a hardware issue. Even rewriting a driver is usually beyond me, because writing a correct driver takes more hardware expertise than I have.

In my previous blog here, I wrote that Spectre and Meltdown probably will not affect individual users much. So far, that is still true, but the real impact of these vulnerabilities is being felt by service providers, businesses, and organizations that make extensive use of virtual systems. Although the performance degradation after temporary fixes have been applied is not as serious as previously estimated, some loads are seeing serious hits, and even single-digit degradation can be significant in scaled-up systems. Already, we’ve seen some botched fixes, which never help anyone.

Hardware flaws are more serious than software flaws for several reasons. A software flaw is usually limited to a single piece of software, often an application. A vulnerability limited to a single application is relatively easy to defend against: just disable or uninstall the application until it is fixed. That is inconvenient, but less of a problem than an operating system vulnerability, which may force you to shut down many applications and halt work until the operating system supplier offers a fix. A flaw in a basic software library can be worse: it may affect many applications and operating systems. The bright side is that software patches can be written and applied quickly, even installed automatically without user intervention (sometimes too quickly, when the fix is rushed and inadequately tested before deployment), but the interval from discovery of a vulnerability to patch deployment is usually weeks or months, not years.

Hardware chip-level flaws cut a wider and longer swathe. A hardware flaw typically affects every application, operating system, and embedded system running on the hardware. In some cases, new microcode can correct hardware flaws, but in the most serious cases, new chips must be installed, and sometimes new sets of chips and new circuit boards are required. If installing microcode will not fix the problem, then at the very least someone has to physically open a case and replace a component. That is not a trivial task with more than one or two boxes to fix, and a major project in a data center with hundreds or thousands of devices. Often, a fix requires replacing an entire unit, either because that is the only way to fix the problem, or because replacing the entire unit is easier and ultimately cheaper.

Both Intel and AMD have announced hardware fixes to the Spectre and Meltdown vulnerabilities. The replacement chips will probably roll out within the year. The fix may only entail a single chip replacement, but it is a solid prediction that many computers will be replaced outright. The Spectre and Meltdown vulnerabilities exist in processors deployed as long as ten years ago. Many of the computers using these processors are obsolete, considering that a processor over eighteen months old is often no longer supported by the manufacturer, and they are probably cheaper to replace than upgrade, even if an upgrade is available. More recent upgradable machines will frequently be replaced anyway, because upgrading a machine near the end of its lifecycle is a poor investment. Some sites will put off costly replacements. In other words, the computing industry will struggle with the issues raised by Spectre and Meltdown for years to come.

There is yet another reason vulnerabilities in hardware are worse than software vulnerabilities. The software industry is still coping with the aftermath of a period when computer security was given inadequate attention. At the turn of the 21st century, most developers had no idea that losses due to insecure computing would soon be measured in billions of dollars per year. The industry has changed: software engineers no longer dismiss security as an optional afterthought. But a decade after the problems became unmistakable, we are still learning to build secure software. I discuss this at length in my book, Personal Cybersecurity.

Spectre and Meltdown suggest that the hardware side may not have taken security as seriously as the software side. Now that criminal and state-sponsored hackers are aware that hardware has vulnerabilities, they will look hard for new hardware flaws that can be used to subvert systems. A whole new world of hacking possibilities awaits.

We know from the software experience that it takes time for engineers to develop and internalize methodologies for creating secure systems. We can hope that hardware engineers will take advantage of software security lessons, but secure processor design methodologies are unlikely to appear overnight, and a backlog of insecure hardware surprises may be waiting for us.

The next year or two promises to be interesting.

Spectre and Meltdown

Will Spectre and Meltdown be the flagship computer security crisis of 2018? There is a good chance that they will be, although I doubt that many personal computer users will be directly affected.

Good news

These flaws are hard to understand, and exploits take advanced engineering skills to implement; once implemented, they are hard to profit from, and I struggle to imagine results that would be worth a hacker’s trouble. Also, exploiting these flaws on a computer you do not already have access to is close to impossible. Consequently, good basic computer hygiene will protect you from these attacks as well as from everything else thrown at you. In addition, the exploits are read-only; they do not corrupt data or processes.

The patches are going out this week to all the major operating systems, and so far the bruited predictions of devastating, across-the-board 30% performance degradation have not proven out. Early testing reports suggest something closer to 10% degradation, and only in limited circumstances.

Less good news

Nevertheless, the fallout from Spectre and Meltdown is likely to cause migraines and insomnia among computer security experts for months, even years, to come. And the picture is not as rosy for businesses, especially those that rely on virtual computing in various forms, as it is for individuals.

Scope

These are not your garden-variety zero-day exploits. When I wrote about KRACK a few months ago, I explained that that flaw was particularly bad because it was in the standard itself, so every correct implementation was vulnerable. The Spectre and Meltdown flaws are in the processor chip design. Intel processors have the worst problems, and they perform the vast majority of computer processing in the world today, but AMD and ARM processors are also affected. That covers most of the rest of computing, including phones and tablets. For reasons I will elaborate on later, I suspect other processors have not been cited only because no one has looked hard enough yet.

The patches that have been applied are crack sealers; they do not repair the broken foundation that caused the cracks. Fixing the source of the cracks will require new processor designs and new chips. To explain just what Spectre and Meltdown are, I have to introduce several unfamiliar concepts.

Protection rings

One of the pillars of computer security is the “protection ring.” Protection rings are what prevent one computer process from interfering with another. Without them, for example, the login gate a user must pass through before using a computer would be far easier to circumvent. Protection rings have been built right into the silicon of most processors since the eighties, and the concept goes back to the beginnings of multi-user computing in the 1960s.

To science fiction readers, I liken protection rings to Asimov’s laws of robotics: they are intended to be intrinsic to all computers. In theory, protection rings, when properly used, make it impossible to break into a well-written operating system without physically altering the processor. When a computer is hacked, the break-in usually stems from a flaw in the operating system’s use of protection rings, not from the physical processor chip.

The Spectre and Meltdown flaws are special because they are gaps in the integrity of protection rings that were inadvertently built into the processor chips themselves. To see how these gaps were opened, we have to look at a couple of concepts from modern processor design.

Multi-core processors

One of these concepts is the “multi-core processor.” Before the advent of multi-core chips, the capacity of processors was beginning to run into the great physical speed limit: the speed of light. When a processor reaches a certain number of instructions per second, it is limited by the time a signal takes to travel across the chip; the processor can’t move on to the next instruction in less time than it takes to read the previous instruction’s results. At 3 GHz, for example, a signal moving at the speed of light covers only about 10 centimeters in a single clock cycle, and real on-chip signals are slower still.

Processor designers got around that by putting multiple processors, called cores, on a single chip. In theory, putting two cores on a chip doubles the speed. But that does not really solve the problem, because taking advantage of the doubled speed requires complex and expensive changes in program design.

Speculative execution

The designers hit on a solution: speculative execution. Most computer programs are long chains of “if-thens”: if condition X is met, do Y; if it is not, do Z. A traditional processor first evaluates X, then decides whether to perform Y or Z. With speculative execution, the processor does not wait: while X is still being evaluated, spare execution resources race ahead and compute Y, and possibly Z as well. Depending on how X comes out, the work that is not needed is discarded. This is a gross simplification, but in the time the processor used to spend just evaluating X, it already has the Y and Z results in hand. Thus the processor executes a conventionally written program in much less time, and the speed of computing doubles in 18 months again. Nifty, huh?
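Here is a conceptual sketch of that race in Python, with threads standing in for the processor’s execution resources. Everything here is invented for illustration; real speculation happens inside the chip, far below anything software controls.

```python
# Conceptual model of speculative execution (illustration only).
from concurrent.futures import ThreadPoolExecutor

def expensive_condition(x: int) -> bool:
    return x % 2 == 0           # imagine this takes a long time to evaluate

def branch_y(x: int) -> int: return x + 1   # the "then" work
def branch_z(x: int) -> int: return x - 1   # the "else" work

def speculative_if(x: int) -> int:
    with ThreadPoolExecutor() as pool:
        cond = pool.submit(expensive_condition, x)
        y = pool.submit(branch_y, x)   # start both branches before
        z = pool.submit(branch_z, x)   # the condition is known
        # The losing branch's result is simply discarded -- but it ran,
        # and on real hardware its memory accesses leave traces (e.g. in
        # the cache) that Spectre and Meltdown turn into a side channel.
        return y.result() if cond.result() else z.result()

print(speculative_if(4))  # 5
```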

Not so nifty. Those discarded chunks of speculative execution can be manipulated in such a way that protection rings are violated. I won’t go into how it’s done; a Google researcher explains it here.

Migraines and insomnia

I am not optimistic when I think about what these defects reveal about processor design. Software development underwent a revolution in the early part of this century, when security rose in priority. You can read about it in my book, Personal Cybersecurity. Security was a neglected stepchild in the pioneering days of software development in the last century, but around 2000 the industry realized that computing would die if software were not built with more secure methodologies. The revolution is still going on, but the slap-dash attitude toward security that characterized the software cowboys of the 90s is gone.

Spectre and Meltdown tell me that the security revolution did not make it into processor design. It makes you think about why the CEO of Intel sold a big block of Intel stock after the flaws in Intel chips were discovered.

I am afraid we have not heard the last of chip-level security flaws. I hope processor designs are not easy pickings for hackers, but the fact that these flaws were present for at least a decade is daunting. Also, completely eradicating these flaws will require replacing processor chips or entire computers, which suggests that heads will ache on for years.

Coming soon

I wrote a blog on hypervisor hacking and one on virtual machine security for Network World last year; both topics are affected by the Spectre and Meltdown flaws, but I’ll save comments on the safety of virtual computing for another blog.

Privacy and Online Ads

Without ads monetizing the content of public computer networks, a service that is now low cost would be much more expensive. I’m willing to accept that. But there is something sinister in the online ad business.

Today, “monetize” usually means to change something that is popular in the digital world into a money-maker for someone. Online ads monetize most of what we think of as the internet. Google makes most of its money from online ads, as does Facebook. Amazon makes its money from selling things, but online ads are a crucial part of its business plan.

The ad business has changed

Remember “banner ads”? A seller like Rolex will be glad to pay a premium for a banner ad on a site like the New Yorker’s, which has wide circulation and a good reputation among people with money to spend on luxury watches.

But the banner ad is an endangered species left over from the age of paper advertising, based on the high-end, intelligent marketing that made many careers in the 20th century. No longer.

21st-century digital advertisers have facts. Traditional marketers knew that New Yorker readers were affluent and well-educated, but they were short on specifics about who was buying and why. Digital marketers today can tell you who sees an ad, how often viewers click on it, and, for digital sales, how often they spend money. And they know the age, location, income bracket, and browsing habits of most potential customers. They can target ads to the most likely customers and know exactly how the ads perform.

How do online ads work?

Traditionally, a big-city daily newspaper could charge more for its ads than a community weekly because a seller could expect more people to see an ad in the big-city daily and act on it. Sellers measure the effectiveness of ads by “return on investment” (ROI). If a seller invests $50 in an ad in a community fish wrapper and sees a $100 increase in sales, they get a 200% gross return ($100 return / $50 investment = 200%); by the stricter convention that subtracts the cost of the ad, the net ROI is 100%. Sometimes a low-cost ad has better ROI; usually not.
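The arithmetic, spelled out with the numbers from the example above:

```python
# Both ways of stating the return on the $50 fish-wrapper ad.
ad_cost = 50        # dollars spent on the ad
sales_lift = 100    # extra sales attributed to the ad

gross_return = sales_lift / ad_cost         # 2.0 -> the "200% return"
net_roi = (sales_lift - ad_cost) / ad_cost  # 1.0 -> 100% net ROI
print(f"{gross_return:.0%} gross, {net_roi:.0%} net")  # 200% gross, 100% net
```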

Some businesses occasionally use advertising to improve their image or convey information, but the everyday advertising goal is ROI, using ads to make more sales. The lure of digital advertising is that digital advertising can be fine-tuned to increase ROI by reducing costs and increasing returns.

Digital advertisers can count how many times an ad was seen (impressions) and followed (clicks). If the transaction is digital, they can also count the number of times the ad resulted in a sale (conversions). Traditional paper advertising only knows how many copies of the ad were circulated, not how often the ad was seen, and only generalities about the readers.

The network collects information on buyers that can be used to target advertising toward the people most likely to buy. For example, people who don’t have cars are unlikely to buy car polish, so car polish sellers can improve their advertising ROI by directing their ads to car owners and ignoring people without cars.
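A toy calculation shows why targeting pays; every number and profile field here is invented:

```python
# The same ad budget, spent only on likely buyers, yields a higher ROI.
audience = [
    {"owns_car": True},  {"owns_car": False}, {"owns_car": True},
    {"owns_car": False}, {"owns_car": True},
]
cost_per_impression = 0.01
revenue_per_sale = 5.00
buy_rate_if_car = 0.02      # car owners sometimes buy car polish
buy_rate_if_no_car = 0.0    # non-owners never do

def expected_roi(targets: list) -> float:
    cost = len(targets) * cost_per_impression
    sales = sum(buy_rate_if_car if p["owns_car"] else buy_rate_if_no_car
                for p in targets)
    return (sales * revenue_per_sale - cost) / cost

print(expected_roi(audience))                                # 5.0, untargeted
print(expected_roi([p for p in audience if p["owns_car"]]))  # 9.0, targeted
```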

Who are the players in the online ad biz?

  • Customers. That’s you.
  • The ad publishers. Google, Facebook, Amazon, etc. Ad publishers put the ads in front of potential customers.
  • Ad networks and exchanges. The folks in the background who match likely buyers to sellers and maximize the vigorish. When you open a web page with slots for ads, the slots are often auctioned off to the highest bidder in milliseconds, and the bidders use information about you to decide how much to bid (see the sketch after this list). You may be familiar with some of these players, like “DoubleClick,” whose addresses flash by as you enter a site.
  • Ad agencies. Those waggish artists who think up cunning ads for the advertisers. These companies usually have bland names like “WPP Group.”
  • Data brokers. The vacuum cleaners that suck up data and sort it into a commodity they can sell to advertisers, ad agencies, networks, and exchanges. These are companies like BlueKai or LiveRamp, whom you may not have heard of.
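Here is a toy sketch of that page-load auction in Python. The bidders and bid logic are invented; real exchanges run variants of this idea, commonly second-price auctions, in a few milliseconds while the page loads.

```python
# Toy real-time bidding auction (invented bidders; illustration only).
def run_auction(user_profile: dict, bidders: list):
    bids = sorted(
        ((bid_fn(user_profile), name) for name, bid_fn in bidders),
        reverse=True,
    )
    if not bids or bids[0][0] <= 0:
        return None, 0.0
    winner = bids[0][1]
    price = bids[1][0] if len(bids) > 1 else bids[0][0]
    return winner, price   # second-price: winner pays the runner-up's bid

bidders = [
    ("car-polish-seller", lambda u: 0.40 if u.get("owns_car") else 0.0),
    ("watch-seller",      lambda u: 0.90 if u.get("income") == "high" else 0.10),
    ("generic-retailer",  lambda u: 0.25),
]
print(run_auction({"owns_car": True, "income": "high"}, bidders))
# ('watch-seller', 0.4)
```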

Except for customers, the players are often combined. There are one-stop shops that combine all the functions and boutiques that specialize in a narrow aspect of the process.

The network never forgets

The data collected on buying habits has grown rapidly in the last few years. If you do something on the network, someone, somewhere, has taken a note. The more we use computer networks, the more data is amassed on us. “Big data” arose to process the mountains of accumulated data.

Today, electronic payment is common, and many customers get discounts by identifying themselves when they purchase. Consequently, grocery store managers may know more about your food buying habits than you do. They can use that information to offer the items you want, but they also use it to find and persuade you to buy more profitable items. They can appeal to habits you may not even know you have. Online sales are even more effective at collecting data on customers.

Although you may not enjoy being manipulated in this way, most people still choose payment methods that identify them, and trade their phone numbers at the point of sale for reduced prices. A lot of people feel that the convenience of electronic payment and a lower price are a reasonable tradeoff for subjecting themselves to manipulation by their sellers.

Why do online ads make me feel uneasy?

Using network habits to target ads is occasionally annoying. My grandfather died of colon cancer after a colostomy fifty years ago. Recently I wondered how those ugly colostomy bags had changed. I searched online. What a mistake! I still occasionally get an ad for disposable bags in cheery prints.

Creepy, yes, but not threatening. According to my gastroenterologist, I am, thank heavens, not remotely likely to purchase a colostomy bag. The sellers have made a mistake, but it costs them only a few cents, and they certainly get a worthwhile ROI on their ads by winning the numbers game. And I get annoying ads. Nothing to lose sleep over.

Misuse of personal profiles

But let’s change the story a little. Suppose you looked up alcoholism treatment out of curiosity, and the user of your profile was not an alcoholism treatment center selling its services but an investigative agency running a check for a potential employer to whom you had sent an application. Maybe the job was important to you and you were well qualified, but your application was tossed in the first round because you were flagged as an alcoholic.

Do you see how the situation changed? A seller looking at ROI doesn’t grudge a fried fig for a few ads sent to the wrong place; a loss of a few cents on misdirected ads is nothing compared to all those colostomy bag sales. But you lost a job that you may have wanted, even needed, badly. And the potential employer lost a brilliant prospect. This is what happens when a personal profile is used in a scenario where great harm can result from inferences that are perfectly reasonable in other circumstances.

The danger is that profiles that are harmless and useful in most circumstances will be applied wrongly in the circumstances where they do damage. That is sinister.