Loading Code in a Multicellular World

Code transfer should be Taboo because computers are becoming far more integrated into our lives and their vulnerabilities far more dangerous.  How then can we safely load new code into our machines?  Or can we?

The Risk in Loading Code

In most embedded computers, all code comes pre-loaded in ROM; it cannot be added or modified unless one can re-flash the ROM. That is much like the situation with DNA in multicellular organisms, where every cell contains all the DNA it will ever use. But embedded processors are the exception in computing. Most programmers have learned their craft in a computing culture where installing and modifying code is taken for granted, and most computer users assume such practices are just a part of computing. That world is rapidly becoming archaic.

We now live in a world swarming with code seeking entry into our computers: viruses, worms, trojans, rootkits, and relentless botnets searching out weaknesses in our defenses. No longer can we assume that loading code is a matter of our choice or that the code does what, and only what, we desire. Within a very well-managed IT organization such trusting assumptions may still apply (until a new hacking exploit is discovered). And perhaps some skilled amateurs can maintain their own PCs carefully enough. But the typical computer user, let alone an unsophisticated smartphone user, is at the mercy of stealthy and dangerous code.

In this dangerous digital world, we still must distribute and install new versions or patches to operating systems and existing applications (apps in modern parlance) as well as new apps. Every one of those practices has proven to be an avenue for malware entry. In the not-too-distant future, many computers may be so cheap and ubiquitous that maintenance, patching, and enhancing their code is neither practical nor worthwhile. That is not the case today but it may be the only effective long-term defense against malware. In some cases it is already easier and cheaper to throw away and replace a severely infected Windows PC rather than to attempt to rid it of malware!

What Must Be Protected?

A computer's file system used to be the most important subsystem to be protected. Now critical security exposures occur in communication between networked computers and communication between the computers and their user(s).

Compromised Internet access to computers facilitates spamming and DDoS attacks by botnets. And "sniffing" of wireless signals in public places can reveal information useful for identity theft. 

Paths of communication between the machine and humans are also susceptible to malware attacks. Intercepting user keystrokes provides password and account information for identity theft. Manipulating the screen display, together with intercepting mouse or touch pointer inputs, can fool the user into divulging passwords. The video camera can unobtrusively reveal the user and even be used to read the screen via reflections from the user's glasses. Modern face recognition could determine who else is nearby. Research from Penn State University and IBM shows that "...motion-sensor data from smartphones can be used to effectively guess what keys a user is tapping and steal sensitive data such as PINs and bank details." Accelerometer, gyroscope, and compass sensors together can act as an inertial navigation device independent of GPS. Such motion data, if sufficiently accurate, "always on," and occasionally recalibrated with GPS location data, could track the path and location of a smartphone even within large and supposedly secure buildings (e.g., the Pentagon), malls, and hospitals. Such data could reveal the user's location, speed, and direction at all times. A compromised smartphone, especially one with a separate motion co-processor, could be tracked anywhere in the world from anywhere in the world. And the FBI and other savvy hackers can install malware to remotely turn on a computer's microphone to listen in on conversations or other ambient sounds. Fingerprint readers in iPhones are perhaps too new to be misused by malware yet, but once the data is captured, it will eventually be susceptible to exploitation. In short, all the sensors in the latest smartphones can be a boon to the user if the data they provide remains under the complete control of its owner. Otherwise users are at the mercy of those who can steal the data, or those in increasingly aggressive surveillance states who can "legally" commandeer the data.
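
The inertial-navigation point above rests on dead reckoning: each new position is computed from the previous one plus a heading and a distance travelled. The following is a minimal sketch, assuming that a heading (from compass and gyroscope) and a distance per interval (from the accelerometer) have already been estimated. The function name dead_reckon and the coordinate convention are hypothetical, chosen only for illustration; real sensor fusion is considerably more involved.

```python
import math

def dead_reckon(start, steps):
    """Return the path [(x, y), ...] traced from `start`.

    `steps` is a sequence of (heading_deg, distance_m) pairs.
    Heading is compass-style: 0 = north (+y), 90 = east (+x).
    """
    x, y = start
    path = [(x, y)]
    for heading_deg, distance_m in steps:
        theta = math.radians(heading_deg)
        # Compass headings are measured clockwise from north,
        # so east is the sine component and north the cosine.
        x += distance_m * math.sin(theta)
        y += distance_m * math.cos(theta)
        path.append((x, y))
    return path

# Hypothetical walk: 10 m north, then 5 m east.
path = dead_reckon((0.0, 0.0), [(0, 10), (90, 5)])
```

Because every step adds its own measurement error to the running total, the estimate drifts over time. That is why the occasional recalibration against GPS mentioned above matters to anyone relying on such a track.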

All of these invasive exploits begin with an illicit ability to load code into an unsuspecting user's computer/smartphone. Loading code, while essential for the maintenance or customization of a single computer, is fundamentally threatening to any multicellular computing system of which it is a part.  Traditional practices rely upon the IT priesthood to determine what code should be installed on important corporate machines.  The rest of us were largely on our own until the IT people found that the networks and data could be compromised by end-user machines. They thus began to recognize the risks of loading code onto any networked computer or mobile device that is ever allowed within the firewall. Lots of stopgap measures have been tried including individual firewalls for each machine, encrypted wireless nets, anti-virus software, rules, rules, and more rules that have proven to be easily circumvented by "social engineering". Yet little has been done in the way of fundamentally rethinking the implications of allowing code to move from machine to machine. Meanwhile, each month brings discoveries of new security holes not covered by existing band-aids.

Some Approaches for Solutions

New practices are now evolving for mobile devices. Loading code into an iPod/iPhone/iPad is designed to be under the complete control of Apple's App Store. The App Store quite carefully vets all code and places restrictions on app developers before allowing them to submit apps. Apple deals severely with app developers who break the rules. However, old notions of developers' freedom of action die hard. It didn't take long for programmers to seek ways around Apple's restrictions. The results became known as "jailbreaking." Jailbreaking tools are usually available within a few weeks or months after the release of each new iOS device. One estimate claims that over 14 million iOS 6.x.x iPhones have been jailbroken. And jailbreaking isn't the only risk: iOS 6.1.3 had a zero-day flaw, details of which reportedly sold for $500,000. Neither Google's Android nor Microsoft's smartphone protection is as strict as Apple's, so their mobile device ecologies are even less secure. Apple should certainly be commended for its efforts to prevent malware. But the underlying problem is not insufficient attention to security; it is out-of-control complexity.

Some have proposed biological metaphors such as "immune systems" for computers. Digital "immune systems" already have played some role in detecting viral code. However, immune system analogies must be thought out with care. Cells in multicellular organisms come with a full complement of DNA. Any attempt to inject DNA or RNA into such cells is therefore, by definition, an attack that is to be blocked. So cells have mechanisms to trigger their own suicide if such an attempt is detected.

In contrast, most computing systems need to accept and execute many sorts of code on a regular basis, e.g., JavaScript or Flash code in Web pages or macros in Microsoft Office documents. And the digital situation is further complicated by the fact that executable computing code can masquerade as data inside email attachments, HTML documents, images, etc. or can piggyback to and from USB devices.
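
One simple case of code masquerading as data is a file whose name claims to be an image or text but whose first bytes are an executable header. The following is a minimal sketch of that check, assuming only two well-known signatures: "MZ" for Windows PE/DOS executables and 0x7F "ELF" for ELF binaries. The function names and the list of "data" extensions are hypothetical, and the check does nothing about scripts, macros, or code embedded deeper inside a file.

```python
# Leading-byte signatures of two common native executable formats.
EXECUTABLE_MAGIC = {
    b"MZ": "Windows PE/DOS executable",
    b"\x7fELF": "ELF executable",
}

# Extensions a user would normally take to mean "just data" (illustrative list).
DATA_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".txt")

def looks_executable(data: bytes):
    """Return a label if `data` starts with a known executable signature, else None."""
    for magic, label in EXECUTABLE_MAGIC.items():
        if data.startswith(magic):
            return label
    return None

def masquerades_as_data(filename: str, data: bytes) -> bool:
    """True when the name suggests data but the content starts like an executable."""
    claims_data = filename.lower().endswith(DATA_EXTENSIONS)
    return claims_data and looks_executable(data) is not None

# A file named like a photo but starting with a PE header is flagged.
suspicious = masquerades_as_data("holiday.jpg", b"MZ\x90\x00\x03")
```

A check this crude is easy to evade and can misfire (a text file may legitimately begin with the letters "MZ"), which is part of why the problem described above is hard.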

One approach that may help is to improve our ability to discriminate code (in all its varieties) from data. Executable code, whether machine language, Java bytecodes, or JavaScript, among many others, will almost certainly have statistical or structural regularities that differ from non-executable data. The structure of the CPUs, compilers, and code interpreters that must run the code necessarily imposes regularities upon the code itself. Whether such regularities can be reliably detected in chunks of code of the size found in malware exploits is an open question, as is the question of whether the malware can be obfuscated sufficiently to prevent detection yet still execute properly. Ongoing research into such topics may well bear fruit. However, malware writers don't stand still. Code detection and code obfuscation are in an arms race in which malware writers have a sort of "first mover" advantage.
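
To give a feel for what "statistical regularities" might mean in practice, the sketch below computes a few crude byte-level features of a chunk of data: Shannon entropy, the fraction of printable ASCII bytes, and the fraction of zero bytes. This is an illustration only, not the method used in the research alluded to above. The function names and the choice of features are assumptions, and no thresholds are offered, since whether such features separate code from data reliably is exactly the open question.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for empty or constant input, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def byte_profile(data: bytes) -> dict:
    """Crude statistical features of a chunk of bytes."""
    total = len(data) or 1  # avoid division by zero on empty input
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return {
        "entropy": shannon_entropy(data),
        "printable_ratio": printable / total,
        "zero_ratio": data.count(0) / total,
    }

# Plain English text is almost entirely printable and has no zero bytes;
# compiled machine code typically scores very differently on both counts.
text_profile = byte_profile(b"The quick brown fox jumps over the lazy dog.")
```

A real classifier would compare profiles like these across many labeled samples. An obfuscator, for its part, would try to shape its output so that the profile resembles ordinary data, which is the arms race described above.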

Two other possibilities are hardware-assisted security and prevention of software and/or hardware monocultures. Monocultures are large populations of genetically identical organisms. Hybrid wheat, rice and corn fields and banana plantations are common examples. Since they are genetically identical, any organism that defeats the defenses of one plant can spread to all the plants. Intel processors and Windows operating systems form a software monoculture. The iOS smartphone has become another. Ways to inject diversity into either the hardware processors or the software (ideally both) would complicate the task of malware writers. However, the monocultures exist because they provide economies of scale. How to create diversity in an economically scalable way is the difficult task. Such ideas are in the early conceptual stage at best.

What, then, can we say about computing paradigms that fundamentally require the transmission of code? Robots, whether on the surface of Mars, in orbit around Jupiter, or in a battlefield situation, may require code updates on the fly. Computer scientists have spent a decade or more exploring mobile agents that are based upon moving code from one machine to another. Robots that accept code are simply not going to be connected to the open Internet (except perhaps as toy examples for the enjoyment of netizens). Still, insufficient security protection in communication with battlefield robots could allow them to be hijacked by the enemy, as Iran claimed to have done to a US drone, even though a great deal of effort undoubtedly has been expended to prevent such occurrences. Mobile agent systems can perhaps be sufficiently guarded if they are used with specialized hardware and operating systems designed specifically for the task.

To summarize the situation, loading code remains dangerous despite all present efforts to make it safe. And the threat of cyber warfare disabling the world banking system is not as far-fetched as many suppose. Nonetheless, the major cultures of computing still rely on loading code and are not inclined to retreat to the biological solution of making it Taboo.

Last revised 9/19/2013