Blockchain Made Simple

Blockchain! Blockchain! Blockchain! That ought to get some attention. If anyone hasn’t noticed, blockchain is somewhere near the hysteria phase of the hype cycle. Everyone associates blockchain with digital currency. The apparent bubble around Bitcoin and other digital or crypto currencies adds to the intensity of the discussion. But very few people have a firm grasp of what blockchain is and its potential. In this blog, I will differentiate blockchain from digital currency and discuss blockchain’s potential for profoundly disrupting commerce and society.

Not a single technology

First, blockchain is not a single technology. Like email, or a packet network, blockchain is a method of communication with certain qualities that many different technologies can be used to implement. The most popular implementations of blockchain (Bitcoin and its ilk) use specific technology that relies on cryptographic signatures for verifying a sequence of transactions. However, tying blockchain to a specific technology is unnecessarily limiting. The term “secure distributed ledger” is more descriptive, but secure distributed ledger is also pedantic and long-winded, so most people will continue to call it blockchain.

Essential blockchain

In essence, blockchain is a definitive statement of the result of a series of transactions that does not rely on a central authority for its validity. Putting it another way, contributions to a blockchain may come from many different sources. Although the contributions are authenticated, they are not controlled by a central authority.

Think about real estate and deeds. Possession of real estate relies on courts and official records that validate ownership of physical lumps of earth. A central authority, in the form of a county records office, has the official record of transactions that prove the ownership of parcels of physical land. Courts, lawyers, surveyors, title companies and individuals all contribute to the record. If you want to research the ownership of a parcel in another county, state, or country, you either travel or hope that the local authority has put the records online and keeps them current. If you want to record a survey, deed or a sale, you must present the paperwork to the controlling local authority.

If you want to put together a real estate transaction that depends on interlocking simultaneous ownership by a complex structure of partners in parcels spread over several jurisdictions, the bureaucratic maneuvering and paperwork to complete the transaction is daunting.

Blockchain could be used to build a fully validated and universally accessible land records office that does not rely on a local authority. Instead of interacting with local jurisdictions, the blockchain would validate land information and contracts, simplifying and reducing the cost of land ownership deals.

This type of problem repeats itself in many different realms. Ownership of intellectual property, particularly digital intellectual property that travels from person to person and location to location at the speed of light presents similar problems.

Blockchains and distributed transactional databases

Although blockchain is implemented quite differently from a traditional distributed transactional database, it performs many of the same functions. Distributed transactional databases are nothing new. In preparing to write this blog, I pulled a thirty-year-old reference book from my shelf for a quick review. A distributed database is a database in which data is stored on several computers. The data may be distinct (disjoint) or it may overlap, but the data on each computer is not a simple replica of the data on other computers in the distribution.

A transactional database is one in which each query or change to the database is a clearly defined transaction whose outcome is totally predictable based on itself and the sequence of prior transactions. For example, the amount of cash in your bank account depends on the sequence of prior deposits and withdrawals. By keeping close control of the order and amounts of transactions, your bank’s account database always accurately shows how much money you have in your account. If some transactions were not recorded, or were recorded out of order, at some instant your account balance would be incorrect. You rely on the integrity of your bank for assurance that all transactions are correctly recorded.

The value added when a database is distributed is that individual databases can be smaller and maintained close to the physical origin of the data or by experts familiar with the data in the contributing database. Consequently, distributed data can be of higher quality and quicker to maintain than the enormous data lakes in central databases.

Transactional and distributed databases are both common and easy to implement, or at least they have been implemented so many times that the techniques are well known. But when databases are supposed to be both distributed and keep transactional discipline, troubles pop up like a flush of weeds after a spring rain.

Distributed transactional databases in practice

The nasty secret is that although distributed transactional databases using something called “two-phase commit” are sound in theory and desirable to the point of sometimes being called a holy grail, in practice, they don’t work that well. If networks were reliable, distributed transactional database systems would work well and likely would have become common twenty years ago, but computer networks are not reliable. They are designed to work very well most of the time, but the inevitable trade-off is that two computers connected to the network cannot be guaranteed to be able to communicate successfully in any designated time interval. A mathematical theorem proves that the trade-off is inevitable.

Look at a scaled up system, for example Amazon’s ordering system; if records were distributed between the distribution centers from which goods are shipped, the sources of the goods, and Amazon’s accounting system, a single order might require thousands of nearly simultaneous connections to start the goods on their way to the customer. These connections might reach over the entire globe. The probability that any one of those connections will be blocked long enough to stall the transaction is intolerably high.
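The arithmetic behind that claim can be sketched in a few lines of Python. The failure rate used here (one chance in ten thousand per connection) is an assumed figure for illustration, not a measurement, and the sketch assumes that connections fail independently of each other.

```python
def chance_of_stall(connections: int, p_fail: float) -> float:
    """Probability that at least one of `connections` independent links fails."""
    return 1.0 - (1.0 - p_fail) ** connections

# Assumed figure: each connection has a 1-in-10,000 chance of being blocked.
for n in (1, 100, 1000, 5000):
    print(n, round(chance_of_stall(n, 0.0001), 4))
```

With these assumed numbers, a single connection almost never stalls, but an order that needs a thousand connections stalls roughly one time in ten.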

Therefore, practical systems are designed to be as resilient as possible to network interruptions. In general, this means making sure that all the critical connections are within the same datacenter, which is designed with ample redundancy to guarantee that business will not halt when a daydreaming employee trips on a stray cable.

Reliable, performant transactional systems avoid widely distributed transactions. Accounts are kept in central databases and most transactions are between a local machine running a browser and a central system running in a cloud datacenter. The central database may be continuously backed up to one or more remote datacenters and the system may be prepared to flip over to an alternative backup system whenever needed, but there is still a definitive central database at all times. When you buy a book on Amazon, you don’t have to wait for computers in different corners of the country and the world to chime in with assent before the transaction will complete.

That is all well and good for Amazon, where sales transactions in an enterprise resource planning database (accounting system to non-enterprise system wonks) make sense, but not for problems that involve the coordination of many sources of truth, like the jurisdictions in our land records example above. In that case, each records office is the definitive source of truth for its share of the transaction, but there is no central authority that rules them all, and we spend days and weeks wading through a snake pit of physical contacts and possibly erroneous paper when we cross jurisdictional boundaries.

Practical blockchain

This is where blockchain comes in. Imagine a land record system where rights, records, and surveys for a parcel of real estate were all recorded and anyone could record information and transfer rights by presenting a credential that proves they hold a given right, then transfer the right to a properly credentialed recipient, all within a single computer interface. A chain of transactions stretching back to the beginning of the blockchain verifies the participants’ data and rights. The identities of the rights holders could be required to be public or not depending on the needs of the system, but the participants can always present verifiable credentials to prove they are agents holding the rights they assert or are an authority for the data they contribute. And all this authenticated data is spread over many different computer systems.

Digital currency

This is basically the way crypto currencies work. When a Bitcoin is acquired, its value is backed by a verifiable record tracing the transfers of the coin back to its moment of origin. Maintaining the record is the work of many computers distributed all over the world. Bitcoin solves the problem of the cost of maintaining the record by offering freelance maintainers, called miners, bitcoins in return for performing the maintenance. The integrity of the record is maintained through elaborate cryptographic protocols and controls. Bitcoin and similar currencies are designed to be untraceable: purchasing or selling Bitcoins requires credentials, but not the identities of the participants. Anonymity makes Bitcoins attractive for secret or illegal transactions.

A downside of the Bitcoin-type blockchain is the tremendous amount of expensive computing required to maintain the blockchain. In addition, some critics have charged that Bitcoin-type currencies are enormously complex schemes that promise profits based on future investments rather than true increases in value, which amounts to a Ponzi scheme. Whether Bitcoin is a Ponzi scheme is open to argument.

The red herring

Unfortunately, the crypto-currency craze is a red herring that distracts attention from blockchain’s profound disruptive potential. Stock, bond, and commodity exchanges, land record offices, supply chain accounting and ordering, tracking sales of digital assets like eBooks, and many other types of exchange rely on central clearing houses that charge for each transaction. These clearing houses facilitate exchange and collect fees without contributing substantial value. Blockchain has the potential to simplify and speed transactions and eliminate these middlemen, like Uber eliminates taxi fleets and online travel apps eliminate travel agents. Amazon eliminated publishers and literary agents for eBooks and blockchain could eliminate Amazon from eBook sales.

How can this work without a central authority holding the records of the transactions? The underlying concept is that as long as the blockchain is maintained according to rules known to a group of maintainers, no one maintainer need have central control as long as the other maintainers can detect when somebody fudges something. A proper blockchain algorithm makes it impossible in practice to maintain the chain in ways contrary to the rules. There is more than one way to do it. The Bitcoin model is not the only way to implement a blockchain. Paid miners and anonymity are not inherent in blockchains.
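One common way to let maintainers detect fudging is to chain each record to a hash of the record before it. The Python sketch below illustrates that idea only. It is not Bitcoin's actual protocol, and the function names and sample land records are made up for the example.

```python
import hashlib
import json

def block_hash(record: str, prev_hash: str) -> str:
    """Hash a record together with the hash of the block before it."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def build_chain(records):
    """Return a list of blocks, each linked to its predecessor by hash."""
    chain, prev = [], "0" * 64  # placeholder hash for the first block
    for record in records:
        h = block_hash(record, prev)
        chain.append({"record": record, "prev": prev, "hash": h})
        prev = h
    return chain

def is_valid(chain) -> bool:
    """Any maintainer can rerun the hashes and spot an altered record."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(block["record"], prev):
            return False
        prev = block["hash"]
    return True

# Hypothetical records for illustration.
chain = build_chain(["Alice sells parcel 7 to Bob", "Bob sells parcel 7 to Carol"])
print(is_valid(chain))   # True
chain[0]["record"] = "Alice sells parcel 7 to Mallory"
print(is_valid(chain))   # False
```

Because every block's hash depends on everything before it, quietly changing an old record breaks the chain for anyone who checks. A real system also needs rules for who may add blocks and how copies are reconciled, which this sketch leaves out.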

For example, an author could register an electronic document with an electronic book blockchain registry. Readers could enter into a contract with the author for the right to access the book. These contracts could be registered in the blockchain. Readers could relinquish their contracts with the author or transfer their contract to another reader and lose the right to access the book. The rules depend on the contracts represented in the blockchain, not the technology. Additional technology such as cryptographic keys might be used to enforce contracted book access. Unlike current DRM schemes, no authority like a publisher or an Amazon is involved, only the author, the reader, and the transaction record in the blockchain.

I’ll get more into how it’s done in another blog.

Privacy and Online Ads

Without ads monetizing the content of public computer networks, a service that is now low cost would be much more expensive. I’m willing to accept that. But there is something sinister in the online ad business.

Today, “monetize” usually means to change something that is popular in the digital world into a money-maker for someone. Online ads monetize most of what we think of as the internet. Google makes most of their money from online ads as does Facebook. Amazon makes their money from selling things, but their online ads are a crucial part of their business plan.

The ad business has changed

Remember “banner ads”? A seller like Rolex will be glad to pay a premium for a banner ad on a site like the New Yorker that has wide circulation and a good reputation among people with money to spend on luxury watches.

But the banner ad is an endangered species from the age of paper advertising. Banner ads are based on the high-end, intelligent marketing that made many careers in the 20th Century. But no longer.

21st Century digital advertisers have facts. Traditional marketers knew that New Yorker readers were affluent and well-educated, but they were short on specifics on who was buying and why. Digital marketers today can tell you who sees an ad, how often viewers click on an ad, and, for digital sales, how often they spend money. And they know the age, location, income bracket, and browsing habits of most potential customers. They can target ads to the most likely customers and know exactly how the ads perform.

How do online ads work?

Traditionally, a big city daily newspaper could charge more for their ads than a community weekly because a seller could expect more people to see an ad in the big city daily and act on the ad. Sellers measure the effectiveness of ads by “return on investment” (ROI). If a seller invests $50 in an ad in a community fish wrapper and sees a $100 increase in sales, they get a 200% return ($100 return/$50 investment = 200%). Sometimes a low-cost ad has better ROI, usually not.
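The same calculation as a small Python sketch. It follows the simple ratio used above (sales increase divided by ad cost); some definitions of ROI subtract the ad cost first. The function name and the big city daily figures are hypothetical.

```python
def roi_percent(sales_increase: float, ad_cost: float) -> float:
    """ROI as used in this post: sales increase divided by ad cost, in percent."""
    return sales_increase / ad_cost * 100

print(roi_percent(100, 50))     # community weekly from the example: 200.0
print(roi_percent(1500, 500))   # hypothetical big city daily: 300.0
```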

Some businesses occasionally use advertising to improve their image or convey information, but the everyday advertising goal is ROI, using ads to make more sales. The lure of digital advertising is that digital advertising can be fine-tuned to increase ROI by reducing costs and increasing returns.

Digital advertisers can count how many times the ad was seen (impressions) and was followed (clicks). If the transaction is digital, they can count the number of times the ad resulted in a sale. Traditional paper advertising only knows how many copies of the ad were circulated, not how often the ad was seen and only generalities about readers.

The network collects information on buyers that can be used to target advertising toward people likely to buy. For example, people who don’t have cars are unlikely to buy car polish. Therefore, car polish sellers can improve their advertising ROI by directing their ads to car owners and ignoring people without cars.

Who are the players in the online ad biz?

  • Customers. That’s you.
  • The ad publishers. Google, Facebook, Amazon, etc. Ad publishers put the ads in front of potential customers.
  • Ad networks and exchanges. The folks in the background who match likely buyers to sellers and maximize the vigorish. When you open a web page with slots for ads, the slots are often auctioned off to the highest bidder in milliseconds. The bidders use information about you to decide how much to bid. You may be familiar with some of these players like “DoubleClick” whose addresses flash by as you enter a site.
  • Ad agencies. Those waggish artists who think up cunning ads for the advertisers. These companies usually have bland names like “WPP Group.”
  • Data brokers. The vacuum cleaners that suck up data and sort it into a commodity they can sell to advertisers, ad agencies, networks, and exchanges. These are companies like BlueKai or LiveRamp, whom you may not have heard of.

Except for customers, the players are often combined. There are one-stop shops that combine all the functions and boutiques that specialize in a narrow aspect of the process.

The network never forgets

The data collected on buying habits has grown rapidly in the last few years. If you do something on the network, someone, somewhere, has taken a note. The more we use computer networks, the more data is amassed on us. “Big data” arose to process the mountains of accumulated data.

Today, electronic payment is common, and many customers get discounts by identifying themselves when they purchase. Consequently, grocery store managers may know more about your food buying habits than you do. They can use that information to offer the items you want, but they also use it to find and persuade you to buy more profitable items. They can appeal to habits you may not even know you have. Online sales are even more effective at collecting data on customers.

Although you may not enjoy being manipulated in this way, most people still choose to use payment methods that identify themselves and trade their phone number at the point-of-sale for reduced prices. A lot of people feel that the convenience of electronic payment and a reduced price are a reasonable tradeoff for subjecting themselves to manipulation by their sellers.

Why do online ads make me feel uneasy?

Using network habits to target ads is occasionally annoying. My grandfather died of colon cancer after a colostomy fifty years ago. Recently I wondered how those ugly colostomy bags had changed. I searched online. What a mistake! I still occasionally get an ad for disposable bags in cheery prints.

Creepy, yes, but not threatening. I, thank Heavens, am not remotely likely to purchase a colostomy bag according to my gastroenterologist. The sellers have made a mistake, but it only costs them a few cents and they certainly get a worthwhile ROI on their ads, winning the numbers game. And I get annoying ads. Nothing to lose sleep over.

Misuse of personal profiles

But let’s change the story some. Suppose you looked up alcoholism treatment out of curiosity. And the user of your profile was not an alcoholism treatment center selling their services, but an investigative agency running a check for a potential employer to whom you sent an application. Maybe the job was important to you and you were well-qualified, but your application was tossed on the first round because you were flagged as an alcoholic.

Do you see how the situation changed? A seller looking at ROI doesn’t grudge a fried fig for a few ads sent to the wrong place. A loss of a few cents to misdirected ads is nothing compared to all those colostomy bag sales. But you lost a job that you may have wanted, even needed, badly. And the potential employer lost a brilliant prospect. This happens when a personal profile is used in a scenario where much harm can result from inferences that are perfectly valid in other circumstances.

The danger is that the profiles will be applied wrongly when they are harmless and useful in most circumstances. That is sinister.

Cyber Defense Skill: URL Reading

Want to quickly sort out real emails from spam? Spot bad links on web pages? Identify sham web sites? I have a suggestion: learn to read URLs.

Learning to read URLs is like taking a class in street self-defense or carrying a can of mace. Actually, much better because reading URLs can’t be turned against you. You might end up in the hospital or worse if you resist a street thug with your self-defense skills, but you will never be injured spotting a bad URL.

Uniform Resource Locators (URLs), more properly called Uniform Resource Identifiers (URIs), direct all the traffic on the World Wide Web. Almost every cyber-attack directs traffic to or from an illegitimate URL at some point in the assault. If you can distinguish a good address from a bad address and develop the habit of examining internet addresses, you will be orders of magnitude more difficult to hack.

Addresses are constructed according to simple rules. You can master the rules you need to know in order to distinguish legitimate addresses from scams in a few minutes. And be much safer.

If you want to dig deep into URLs, take a look at RFC 3986. There is much more to URLs than I cover here.

Here is a typical simple URL:

https://www.marvinwaschke.com

HTTP

The first part, called the scheme, “http:” tells you that it is a HyperText Transfer Protocol (HTTP) address. You need to know two things about the HTTP scheme. First, almost all data on the web travels to and from your desktop, laptop, tablet, or phone over HTTP. In fact, if an address does not begin with “http”, it’s not a web address. Second, there are other schemes; the most important of these is “mailto:”, which designates an email address. More on this below.

Secure HTTP

There is an important variant of HTTP called HTTPS. The “S” stands for “secure.” Data shipped via HTTPS is encrypted and the source and destination are verified with a security organization. HTTPS used to be reserved for financial transactions, but now, with all the dangers of the network, HTTPS is encouraged for all traffic. When you see “https” in a web address, hackers have a hard time snooping on your data or faking a web site. HTTPS is especially important if you are on open public WiFi at a coffee shop or other public place.

Not too long ago, security experts used to say HTTPS guaranteed that a site was legitimate. That is no longer good advice. HTTPS is not a guarantee that a site is legit. Smart scamming hackers can set up fake sites with HTTPS security. You have to check the rest of the address for signs of bogosity. However, setting up a fake site with a legitimate address is still hard, so a good address with HTTPS is still a strong bet.

HTTP address “authority”

The part of the address following the “//” is the “authority.” Most of the time, the authority is a registered domain name. The authority section of a URL ends with a “/”. Notice that the slash leans forward, not backward. A backward slash is completely different. The “path” follows the forward slash, and a “query” may follow the path after a “?”. The query usually contains search criteria that narrow down the data you want retrieved and is often hard to interpret without specific information about the domain. You can ignore it, although sometimes hackers can learn secrets about a web site from information inadvertently placed in the query.
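If you would like to see the parts pulled apart, Python's standard urllib.parse module splits an address into the pieces RFC 3986 names. The path and query in this example address are made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical address; the path and query are made up for illustration.
parts = urlsplit("https://www.marvinwaschke.com/blog/search?q=blockchain")

print(parts.scheme)   # https
print(parts.netloc)   # www.marvinwaschke.com  (the authority)
print(parts.path)     # /blog/search
print(parts.query)    # q=blockchain
```

Python calls the authority “netloc”, but it is the same part of the address: everything between the “//” and the next forward slash.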

Domain extensions

In the above address, “marvinwaschke.com” is a domain name that I have registered with the Internet Assigned Numbers Authority (IANA). “.com” is the extension. In the old days, there were only a few extensions allowed: “.gov”, “.edu”, “.net”, “.com”, and “.mil”. They are still the most common, although many others, such as “.tv”, “.partners”, “.rocks” and country abbreviations, have been added.

You can use extensions as a clue. For instance, most established firms and organizations still use the old standbys. A web site with an “amex.rocks” domain is likely not the American Express you think it is. We all know that some countries harbor more hackers than others. If an address has an extension that is an abbreviation for a cyber rogue state, be careful.

Remember, these are clues, not rules. A street lined with wrecked cars and broken windows may be crime free, but more often than not, it is a dangerous neighborhood. The same applies to incongruous domain names. They could be safe, but there is a good chance they are not.

Authority subsections

The authority section is divided by periods (“.”s) and reads in reverse. The extension that immediately precedes the first forward slash is the most important. “.com” in “marvinwaschke.com” indicates that the marvinwaschke.com domain is in the vast segment of the internet made up of commercial ventures. “marvinwaschke” determines which commercial venture the address refers to. “www” indicates that the address points to the “www” part of the “marvinwaschke” venture. I could set up my website to have a “public.marvinwaschke.com” section or a “public.security.marvinwaschke.com” section if I cared to. The “www” is historically so common that most browsers will strip it off or add it on as needed to make a connection.

“Microsoft.marvinwaschke.com” only indicates that my web site has a section devoted to Microsoft. “Microsoft.marvinwaschke.com” has nothing to do with Microsoft Corporation. Hackers make use of this to try to fool you that “Microsoft.pirates-r-us.ru” is a Microsoft site. It’s not! Hackers are creative. Make sure that the right end of the domain name makes sense.
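Checking the right end of a domain name can be sketched in Python. The helper name is hypothetical, and taking the last two labels is a simplification: some country domains, such as those ending in “.co.uk”, need three. The address must include its scheme for the parsing to work.

```python
from urllib.parse import urlsplit

def right_end(url: str, labels: int = 2) -> str:
    """Return the last `labels` dot-separated parts of the host name."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-labels:])

print(right_end("https://www.marvinwaschke.com/"))            # marvinwaschke.com
print(right_end("https://Microsoft.pirates-r-us.ru/login"))   # pirates-r-us.ru
```

The second address starts with “Microsoft”, but the part that counts belongs to someone else entirely.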

Email URIs

Email addresses are URIs that follow a different scheme but use the same domain name rules. Usually, email addresses drop the “mailto” scheme but they can always be fully written out like mailto:boss@example.com. If you see an address like captain@microsoft.pirates-r-us.ru you can be fairly certain that the mail did not come from Bill Gates.

Near miss URIs

A favorite hacking trick is to register a domain that looks real, but is just a little off. For example, micrasoft.com instead of microsoft.com. Keep an eye out for those little tricks.
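A program can keep an eye out for near misses too. This Python sketch compares a domain against a short list of real ones using the standard difflib module. The trusted list and the 0.85 similarity threshold are assumptions chosen for the example, not recommended settings.

```python
from difflib import SequenceMatcher

# Hypothetical list of domains the reader actually uses.
TRUSTED = ["microsoft.com", "google.com", "amazon.com"]

def looks_like(domain: str, threshold: float = 0.85):
    """Return the trusted domain that `domain` imitates, or None."""
    domain = domain.lower()
    if domain in TRUSTED:
        return None  # exact match: it is the real thing
    for real in TRUSTED:
        if SequenceMatcher(None, domain, real).ratio() >= threshold:
            return real
    return None

print(looks_like("micrasoft.com"))   # microsoft.com
print(looks_like("microsoft.com"))   # None
```

A similarity score is only a rough signal. It will miss some tricks and flag some innocent names, so treat a hit as a reason to look closer rather than as proof.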

When in doubt, Google it

When you see a link or address with a suspicious domain name, Google the domain name before you use the address. Most of the time, Google will pick up information on dangerous domains.

Look at every link with caution

The internet is all about grabbing your attention. Absurd promises abound that few people would take seriously after they took a moment to think. Losing weight is hard, wealth management is useless if you aren’t already accumulating wealth the hard way, and no miracle food will prevent cancer or make you a genius. Not all ads are scams, but don’t tempt fate by clicking on links that prey on impossible hopes.

Finally

Make a habit of looking at internet addresses. Often, a link on a webpage or in an email is text like “here”. Hackers hide bogus URLs under innocuous text. They also sometimes use a legitimate URL for the text and stick in a dubious URL for the real target. Like this: https://marvinwaschke.com If you place the cursor over a link or address, most browsers and email tools will display the working address in the lower left-hand corner of the window. Look at the address remembering all the cautions in this post. Does something look wrong? If so, use care. Try the two links in this paragraph to see what I mean. The habit of looking at addresses will make you much harder to hack than unsavvy computer users.
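The mismatch between what a link shows and where it goes can also be checked by a program. The following Python sketch uses only the standard library. The class name and the sample link are made up for illustration, and it only catches links whose visible text is itself a full address.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class LinkChecker(HTMLParser):
    """Collect links whose visible text names a different host than the target."""
    def __init__(self):
        super().__init__()
        self.href = None
        self.text = ""
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = ""

    def handle_data(self, data):
        if self.href is not None:
            self.text += data

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            shown = urlsplit(self.text.strip()).hostname
            actual = urlsplit(self.href).hostname
            if shown and actual and shown != actual:
                self.mismatches.append((shown, actual))
            self.href = None

# Made-up snippet: the text shows one site, the link goes to another.
checker = LinkChecker()
checker.feed('<a href="https://pirates-r-us.example/login">https://marvinwaschke.com</a>')
print(checker.mismatches)   # [('marvinwaschke.com', 'pirates-r-us.example')]
```

A link whose text is just “here” passes through unflagged, so hovering over the link and reading the real address is still the habit to build.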

Tax Refund Cyber Fraud

I’ve been thinking about tax refund fraud a lot this month. I was resolved that we would get our tax return in early this year so it would be harder for a scammer to rip off our refund, but not all the required documents have wandered in yet and so I sit and fret.

The FBI and the IRS are expecting more fraud than last year, and last year set records. I thought maybe folks would be interested in how the tax refund fraud business works. It is simple: a scammer sends in a fraudulent tax return in your name that nets a big tax refund. The scammer arranges to have the refund wired to his account instead of yours. Then the money vanishes and so does the scammer. When you file your genuine return, the IRS shows its unpleasant side until you can prove that you are the real Clem Kadiddlehopper.

How can the hackers do this? Tax refund fraud is big business. Like all big business, the work is divided up among specialists. Before the tax fraud can occur, the criminals have to steal your identity and steal or manufacture the documents to substantiate a refund that is worth the scammer’s effort and risk. Gathering the documents is the most difficult because it requires the most special knowledge and skill. If scammers have a genuine W-2 form for a victim, they are set. Those W-2s have everything they need.

But how do they get a person’s W-2? The old-school method was to steal them from mailboxes. Modern crooks reject stealing paper mail as risky and inefficient. Stealing W-2s electronically requires more skills, but the risk is lower and the take is higher. This year, there have been a number of exploits recorded in which an employee in the financial or human resources department gets an emergency email request from what appears to be the CEO or other higher-up in the organization. The request is for the electronic copy of all the W-2s for a department or the entire company. The employee complies and sends the files. Then they discover that the CEO’s email account has been hacked, or on close examination, the email was actually sent by an outside impostor who now has hundreds of juicy W-2s. This outside impostor could be operating from anywhere: onshore, offshore, makes no difference.

What happens then? The impostor might be a tax fraudster, although chances are good that the impostor is an accomplished social engineer who does not dirty his hands with tax fraud. Instead, the impostor goes to a dark net criminal sales site and sells the W-2s for prices that vary based on the amount earned. More money can be extracted from high-earning W-2s, so they sell for more.

The tax fraudster purchases W-2s that suit his fancy on the dark net, then fabricates deductions to extract a large refund from the IRS and files the return electronically. The fraudster’s job is to put together a return that is plausible enough to trick the IRS into believing it is genuine. Although there is word that the IRS has taken steps to clamp down on refund fraud this year, the service is also under pressure to get refunds out speedily, which limits the intensity of the vetting before a check is cut. The growing fraud numbers suggest it is not too hard for a fraudster to fool the IRS.

Good luck! And get those returns in early.