Spectre, Meltdown, and Virtual Systems

In June of 2017 I wrote a blog for InfoWorld on How to handle the risks of hypervisor hacking. In it, I described the theoretical points where Virtual Machines (VMs) and hypervisors could be hacked. My crystal ball must have been well polished. Spectre and Meltdown prey on one of the points I described there.

What I did not predict is where the vulnerability would come from. As a software engineer, I always think about software vulnerabilities, but I tend to assume that the hardware is seldom at fault. I took one class in computer hardware design thirty years ago. Since then, my working approach has been to look first for software flaws and to consider hardware only when I am forced, kicking and screaming, to examine it for failure. This is usually a good plan for a software engineer. As a rule, when hardware fails, the device bricks (is completely dead); seldom does it continue to function. There is usually not much beyond rewriting drivers that a coder can do to fix a hardware issue. Even rewriting a driver is usually beyond me because it takes more hardware expertise than I have to write a correct driver.

In my previous blog here, I wrote that Spectre and Meltdown probably will not affect individual users much. So far, that is still true, but the real impact of these vulnerabilities is being felt by service providers, businesses, and organizations that make extensive use of virtual systems. Although the performance degradation after temporary fixes have been applied is not as serious as previously estimated, some loads are seeing serious hits, and even single-digit degradation can be significant in scaled-up systems. Already, we’ve seen some botched fixes, which never help anyone.

Hardware flaws are more serious than software flaws for several reasons. A software flaw is usually limited to a single piece of software, often an application. A vulnerability limited to a single application is relatively easy to defend against. Just disable or uninstall the application until it is fixed. Inconvenient, but less of a problem than an operating system vulnerability that may force you to shut down many applications and halt work until the operating system supplier offers a fix to the OS. A flaw in a basic software library can be worse: it may affect many applications and operating systems. The bright side is that software patches can be written and applied quickly, and even installed automatically without user intervention. Sometimes they arrive too quickly, when a fix is rushed and inadequately tested before deployment, but the interval from discovery of a vulnerability to patch deployment is usually weeks or months, not years.

Hardware chip level flaws cut a wider and longer swathe. A hardware flaw typically affects every application, operating system, and embedded system running on the hardware. In some cases, new microcode can correct hardware flaws, but in the most serious cases, new chips must be installed, and sometimes new sets of chips and new circuit boards are required. If installing microcode will not fix the problem, at the very least, someone has to physically open a case and replace a component. Not a trivial task with more than one or two boxes to fix and a major project in a data center with hundreds or thousands of devices. Often, a fix requires replacing an entire unit, either because that is the only way to fix the problem, or because replacing the entire unit is easier and ultimately cheaper.

Both Intel and AMD have announced hardware fixes to the Spectre and Meltdown vulnerabilities. The replacement chips will probably roll out within the year. The fix may only entail a single chip replacement, but it is a solid prediction that many computers will be replaced. The Spectre and Meltdown vulnerabilities exist in processors deployed ten years ago. Many of the computers using these processors are obsolete, considering that a processor over eighteen months old is often no longer supported by the manufacturer. These machines are probably cheaper to replace than upgrade, even if an upgrade is available. More recent upgradable machines will frequently be replaced anyway because upgrading a machine near the end of its lifecycle is a poor investment. Some sites will put off costly replacements. In other words, the computing industry will struggle with the issues raised by Spectre and Meltdown for years to come.

There is yet another reason vulnerabilities in hardware are worse than software vulnerabilities. The software industry is still coping with the aftermath of a period when computer security was given inadequate attention. At the turn of the 21st century, most developers had no idea that losses due to insecure computing would soon be measured in billions of dollars per year. The industry has changed: software engineers no longer dismiss security as an optional afterthought, but a decade after the problems became unmistakable, we are still learning to build secure software. I discuss this at length in my book, Personal Cybersecurity.

Spectre and Meltdown suggest that the hardware side may not have taken security as seriously as the software side. Now that criminal and state-sponsored hackers are aware that hardware has vulnerabilities, they will begin to look hard to find new flaws in hardware for subverting systems. A whole new world of hacking possibilities awaits.

We know from the software experience that it takes time for engineers to develop and internalize methodologies for creating secure systems. We can hope that hardware engineers will take advantage of software security lessons, but secure processor design methodologies are unlikely to appear overnight, and a backlog of insecure hardware surprises may be waiting for us.

The next year or two promises to be interesting.

Memory On the Task List

Memory usage is another column on the task list that can help you understand what is happening under the hood of your computer. In my last blog, I wrote about CPU usage. Memory is similar to CPU in that it is a critical resource that affects computer performance and it can help evaluate malware on your system.

The Role of Memory

Without memory, often called RAM, your computer has Alzheimer’s. It may have the fastest processor in the world and the coolest programs, but it won’t do anything unless it can keep track of where it is. The processor pulls an instruction from memory, executes it, and puts the result back into memory to use later. Without memory, a processor doesn’t know what to do next or what it has already done; it is nearly useless.

Memory vs Storage

Memory has to be as fast as the processor, or the processor has to wait for data and instructions to be fetched from memory and results to be stored in memory for later use. Using present technology, the fastest memory is volatile. By volatile, I don’t mean memory is liable to fly off the handle and jet to Maui without provocation. Instead, data stored in volatile memory flies to Maui, as far as I know, when the electricity is switched off. In any case, it disappears.

Speed and volatility make memory different from storage. Data that stays around between computing sessions resides in storage, which is useful, but not when speed is the main consideration. Usually storage is on a hard disk. Hard disks are much slower than memory chips, but they store more data at less expense and they are not volatile. In other words, powering down does not affect data stored on a disk.

As processors get faster, memory must also get faster, and speed is expensive. This makes memory a scarce and expensive commodity on computers. A laptop with 4 gigabytes of memory and a terabyte of storage has about 250 times as much storage as memory. At today’s prices, 1 gigabyte of memory costs about the same as 200 gigabytes of storage. Speed costs.
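The ratio is simple arithmetic. Here is a quick sketch in Python, assuming decimal units (1 terabyte = 1,000 gigabytes) and reusing the rough price comparison from the paragraph above rather than any current market data. The variable names are my own.

```python
# Storage-to-memory ratio for the example laptop.
# Assumption: decimal units, 1 terabyte = 1,000 gigabytes.
memory_gb = 4
storage_gb = 1000

ratio = storage_gb / memory_gb
print(f"Storage is {ratio:.0f} times the size of memory")

# Rough price comparison quoted in the text: 1 GB of memory costs
# about as much as 200 GB of storage.
storage_gb_per_memory_gb_price = 200
equivalent_storage_gb = memory_gb * storage_gb_per_memory_gb_price
print(f"{memory_gb} GB of memory costs about as much as "
      f"{equivalent_storage_gb} GB of storage")
```

With binary units (1,024 gigabytes to the terabyte) the ratio comes out to 256 instead, which does not change the point.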

Performance and Memory

Memory is precious, but it performs. When developers have to make a process run faster, one way is to change the code to use memory instead of disk storage. If the developers go overboard and use more memory than the system has available, their optimization backfires. When the system starts to run out of memory, it moves data from memory to slower disk storage and the system begins to bog down as the processor waits for the slow moving data. The same thing happens when several heavy memory consuming processes run at the same time.

Memory Hogging

There are many reasons for heavy memory consumption. One I already mentioned: a process has been designed to consume more memory in order to perform well. Processes running above their designed capacity can also use extra memory. For example, a process designed to support ten simultaneous users might use much more memory if it is supporting a hundred users. Sometimes excess memory usage comes from defective code. A “memory leak” is a classic defect that causes processes to consume more and more memory the longer the process stays running.

Whatever the reason, when memory consumption reaches beyond the optimal level for your computing device, performance will slooooow. The cursor may get jerky. The keyboard will seem to hang, then spit out a clump of characters. When you attempt to start something new, there is a long pause. Nothing works right. Not pleasant. Not pleasant at all.

Memory Shortage Diagnostics

The task list is the first tool I use to determine if I have a memory shortage and what is causing it.

On Windows 10, a convenient way to get to the task list is to right click on the Windows icon in the lower left-hand corner of the screen. The task list will be below the line not too far from the center of the menu. Click on it.

You will get something like this.

In this snapshot, 55% of available fast memory is in use. That is a good number. When the percentage gets above 60%, into the 70s and 80s, your system will begin to suffer. Here, I’ve clicked on the memory column header to sort the processes by memory usage. In this case, I had Firefox up when I took this screen shot and it is the biggest memory consumer. Firefox uses a lot of memory so popping up a new screen is snappy. Therefore, I don’t mind that it is a big consumer. If one of the heavy hitters was an application that I was not using, I would shut it down to free up memory for a performance boost.
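Sorting by the memory column is easy to picture in code. The following is a minimal sketch in Python that works on a made-up list of process records. The process names and numbers are illustrative only, the total counts only the listed processes (a real task list also accounts for the operating system itself), and the 60% threshold is just my rule of thumb from above, not a figure Windows uses.

```python
from typing import NamedTuple

class ProcessRecord(NamedTuple):
    name: str
    memory_mb: float

def memory_report(processes, total_memory_mb, warn_percent=60.0):
    """Rank processes by memory use and compute the percentage in use."""
    ranked = sorted(processes, key=lambda p: p.memory_mb, reverse=True)
    used_mb = sum(p.memory_mb for p in processes)
    percent_used = 100.0 * used_mb / total_memory_mb
    return ranked, percent_used, percent_used > warn_percent

# Illustrative sample data, not a real snapshot.
sample = [
    ProcessRecord("explorer.exe", 120.0),
    ProcessRecord("firefox.exe", 1400.0),
    ProcessRecord("mail-client.exe", 450.0),
    ProcessRecord("search-indexer.exe", 90.0),
]

ranked, percent_used, warn = memory_report(sample, total_memory_mb=4096)
for proc in ranked:
    print(f"{proc.name:<20} {proc.memory_mb:>8.0f} MB")
print(f"{percent_used:.1f}% in use, warning: {warn}")
```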

Memory Hogging Malware

If a memory-hogging process happens to be malware, it’s bad. You seldom know what the malware is doing. It could be generating spam or sending large quantities of messages to a server that the hacker is trying to overwhelm. It could, perish the thought, be encrypting your files, preparing to demand ransom for their return. Hogging memory is not the only way malware can slow your computer, but it is one way.

As I mentioned in my previous blog, I Google a process name if I am not familiar with it. Usually it is a Windows internal process I don’t know about, but sometimes it will show up as malware.

Emergency Measures

Now we get into some risky stuff that could force you to restore your system, but could also avoid restoring the system. You will have to decide for yourself how much risk you are willing to take, and own the results.

Removing the executable file of the malware can stop the malware’s damage. If you want to remove the file from the system, right click on the process name in the task list, then click on “open file location.” From there, you can delete the executable, but you should think about that before jumping in.

It is always better to remove an application through “Uninstall or change a program” in the Control Panel if you can. Removal is often more complicated than removing a single file. Sometimes configuration files and registry entries have to be modified and several files deleted. The uninstall in Control Panel is supposed to clean up everything, and, unless the author of the uninstall was sloppy, it always does.

For malware, there usually is no uninstall. If an anti-virus tool detects malware, it will do a better job of uninstalling than you can do manually. So try an anti-virus scan of the malware executable file. If the scan finds and eradicates the malware, you win!

Manual Kill

However, if the scan fails and there is no uninstall, I delete any malware files I can find. Deleting the wrong file by mistake will not harm your hardware, but it could require reinstalling your operating system and restoring from a backup. (Highly unpleasant.) However, in my opinion, if your system is already damaged by malware, deleting will probably do no more damage than has already been done and may stop the damage. Therefore, when all else fails, I usually choose to delete immediately to limit the damage. This is a risk I am willing to take, but it is a risk.

If the malware is clever (bad!) it may regenerate the file you deleted. Also, deleting a file out from under a running process may not kill the process, so you will have to hit the end task button to kill it.

Manual Kill Checklist
  • Verify that the process is malware
  • Run a virus scan on the file and let the anti-virus take care of it
  • Check “Uninstall or change a program” in the Control Panel on the off chance you can uninstall it there
  • If all else fails, try killing it with the “End Task” button and deleting the file

Good luck! You could save the day for yourself. Or ruin it. I’ve seen it both ways.

The Task List Reveals a Computer’s Beating Heart

Windows 10 Task List

Like an echocardiogram that shows the blood flowing through a beating heart, the task list shows the flow of activity on a computer. On Apple products, the task list is called the activity list. In Unix and its derivatives, such as Linux and Android, the task list is usually called the process list. In all of these operating systems, a task, activity, or process, whichever you want to call it, is an executing program. Most of the time, there are a lot of them.

Processors

A processor can only run one process at a time, but it switches between processes so rapidly that it looks like the processor is running many processes at the same time. All the processes on the task list have been started but have not finished. Some are waiting for input, others are waiting for a chance to use some busy resource like a hard drive, but all are entitled to some time on the processor when their turn comes up.

Many computers today have more than one processor, which increases the number of processes that can run at one time and the amount of time a computer can give to each process. Different operating systems have different strategies for switching between processes, but all the strategies are like plate spinning acts. The plate spinner hurries from plate to plate, giving plates a spin when they begin to slow down and need attention. (If you don’t know what plate spinning is, see it here.) The processor does the same thing, executing a few instructions for a program, then rushing to the next process.
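The plate spinning act can be sketched in a few lines of Python. This is a toy round-robin simulation, one of the simplest switching strategies, and not how Windows or any other real operating system schedules processes. Real schedulers also weigh priorities, waiting on input, and multiple processors. The process names and time units are made up for illustration.

```python
from collections import deque

def round_robin(work, time_slice=2):
    """Simulate one processor sharing time among several processes.

    work maps a process name to the number of time units it still needs.
    Returns the sequence of (name, units_run) turns the processor took.
    """
    queue = deque(work.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # Not finished: back of the line to wait for another spin.
            queue.append((name, remaining))
    return timeline

timeline = round_robin({"browser": 5, "editor": 2, "backup": 3})
for name, ran in timeline:
    print(f"{name} ran for {ran} unit(s)")
```

Each process gets a short turn and then goes to the back of the line, so all three appear to make progress at once even though only one runs at any instant.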

Processes

All the processes, both active and waiting, show up on the task list. That includes malware as well as legitimate processes. If you can spot a bad guy on the process list, you can kill it. The kill may not be permanent, because processes can regenerate themselves, but it is usually worth a try. The challenge is to sort the good from the bad. Unless you know what you are shooting at, you might crash the entire computer or lose data, so be careful. You could find yourself restoring your entire system from a backup. Nevertheless, this is one area where you can strap on your weapons and wage open warfare against malware.

When I see an unfamiliar process on the task list, I usually run to Google. Most of the time, Google results tell me that the process is something innocuous that I hadn’t noticed before, but not always. By the way, be a little careful when Googling. There are questionable companies out there with sites that will appear in the search result and try to take advantage of you by offering unnecessary clean up services or dubious downloads. Microsoft will give you trustworthy advice, as will the established antivirus companies, but avoid sending money or installing programs from places you have never heard of. Some may be legitimate, but not all. Above all, don’t let anyone log into your computer remotely without rock solid credentials.

CPU Time

The task list tells you more than just the names of the running processes. There are a number of readouts on the state of each process and the resources it is using. The one I usually look at first is the percentage of CPU time being taken and the accumulated CPU time. (Click at the top of the column to sort the processes by the metric.) Both of these metrics show the amount of time a process consumes on the processor: the amount of time the plate spinner has spent spinning the plate. A program that consumes more CPU time is using an extra share of the system’s most critical resource. Shutting down a high CPU consumer will do more to improve your computer’s performance than halting a low CPU consumer.
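The difference between CPU time and time on the clock can be seen from inside a program. The sketch below uses two functions from Python's standard time module: process_time, which reports the accumulated CPU time of the current process, and perf_counter, which reports elapsed wall-clock time. The busy_work function is a made-up stand-in for real computation, and the percentage is my own rough calculation, not the exact formula the task list uses.

```python
import time

def busy_work(n):
    """Burn some processor time with plain arithmetic."""
    total = 0
    for i in range(n):
        total += i * i
    return total

wall_start = time.perf_counter()
cpu_start = time.process_time()

busy_work(2_000_000)  # keeps the processor busy
time.sleep(0.2)       # waits without using the processor

cpu_used = time.process_time() - cpu_start
wall_elapsed = time.perf_counter() - wall_start
cpu_percent = 100.0 * cpu_used / wall_elapsed

print(f"Accumulated CPU time: {cpu_used:.3f} s")
print(f"Elapsed wall-clock time: {wall_elapsed:.3f} s")
print(f"Share of the processor used: {cpu_percent:.0f}%")
```

The time spent sleeping counts toward the clock but not toward CPU time, which is why a process that mostly waits shows a low CPU percentage no matter how long it has been running.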

Some high-consumer processes are legitimate. For example, you will often see a browser using a lot of CPU time. That is because a browser does a lot of work. Any computing that takes place in web page interactions is chalked up to the browser. Some internal system processes, such as “system interrupts,” do a lot of work also and rank high on CPU consumption. If you see an installed application that is hogging CPU, you might check its configuration. There may be adjustments that will reduce consumption. Google will help you find what to do, but keep track of your changes so you can change them back if they don’t work. If you don’t use the program much, perhaps turning it off would be a good idea. When a high consumer happens to be malware, put a high priority on scrubbing it out. A high CPU-consuming malware is like a blockage in a coronary artery. You’ll feel much better without it.

Next Time

There’s more to the task list. Next time.