Computer Hard-Drive Analysis 101

A couple of weeks ago, I began a series of posts about some problems I perceive in the way computer forensics and scientific evidence are presented to juries. As an example, I chose the computer-forensics testimony of Lydell Wall in the Scott Peterson murder trial. Two people who followed the trial closely supplied me with access to transcripts and other information, and a computer forensics expert gave me some additional insights. I have unpublished two of my posts on the subject and plan to take an entirely new approach to the issue, one that incorporates a great deal of this information.

Sunflowers, Cookies, and Umbrella Stands–Who Was Online on Christmas Eve?

An interesting aspect of the investigation of computer searches these days is that the search engines use searches as a way of making money off of advertising. Whenever you search (on Yahoo, Google, or MSN, etc.), the search engine not only tries to figure out what web sites most closely match your needs but also which advertisers (who are paying them to display their ads) are most likely to be of interest to you.

Not only does a long list of websites display, but at various places on the screen paid ads also display.

How do the search engines decide which ads to display and which to list near the top of the screen?

The techniques are becoming very sophisticated, but in 2002 they must have been cruder than now. For instance, I imagine (guess, speculate) that cookies were a primary tool then. A cookie is a small file that a website deposits on your hard drive when you visit it. So, if you like house accessories featuring floral motifs and you visit a flowery website, it deposits a cookie with this fact about you on your computer. The cookie file contains all sorts of information, including, for example, the search words you typed into the search engine which led you to their site. 
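To make the idea of a cookie concrete, here is a minimal sketch in Python of what a site's cookie instruction might look like and how a program could read it. The cookie name last_search and its value are invented for illustration; nothing here reflects what Yahoo or any real site actually stored in 2002.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header of the kind a shopping site might send.
# The cookie name and value are invented for illustration only.
raw_header = "last_search=sunflower+umbrella+stand; Path=/; Max-Age=2592000"


def parse_cookie(header: str) -> dict:
    """Return each cookie name mapped to its value and two of its attributes."""
    jar = SimpleCookie()
    jar.load(header)
    result = {}
    for name, morsel in jar.items():
        result[name] = {
            "value": morsel.value,         # what the site chose to remember
            "path": morsel["path"],        # which part of the site it applies to
            "max_age": morsel["max-age"],  # lifetime in seconds, kept as text
        }
    return result


cookies = parse_cookie(raw_header)
print(cookies)
```

The point of the sketch is only that a cookie is plain, readable text. Whether a given cookie really holds search words depends entirely on what the site decided to put in it.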

Another way the search engine finds the best sites for you is to examine the “keywords” in the “header” of the web page. A website featuring gardening tools and accessories might have “hidden” keywords to attract web surfers to its site. A person who likes floral-motif yard and garden implements and home items, such as a sunflower-motif umbrella stand, might type in a keyword such as “home weather vanes,” and the search engine might display the site’s description (something such as this): “Garden and home accessories featuring daisies, sunflowers, roses …”
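The “hidden” keywords described above usually live in meta tags in the page header. The following is a rough sketch, in Python, of how a program might pull them out. The page text, the shop name, and the keyword list are all made up for the example; it is not a reconstruction of any real shopping page.

```python
from html.parser import HTMLParser

# Invented page header; the keywords and description are illustrative only.
PAGE = """
<html><head>
<title>Garden Shop</title>
<meta name="keywords" content="home weather vanes, sunflower umbrella stand, garden tools">
<meta name="description" content="Garden and home accessories featuring daisies, sunflowers, roses">
</head><body></body></html>
"""


class MetaTagReader(HTMLParser):
    """Collect the name/content pairs of <meta> tags as the page is parsed."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            self.meta[name.lower()] = attributes.get("content", "")


reader = MetaTagReader()
reader.feed(PAGE)

# Split the comma-separated keyword string into individual phrases.
keywords = [k.strip() for k in reader.meta["keywords"].split(",")]
print(keywords)
print(reader.meta["description"])
```

A search for “home weather vanes” would match one of the phrases in this invented list, which is how a page about sunflower umbrella stands could turn up for a weather-related search.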

This is important to understand with regard to the issue that was raised in the Scott Peterson trial.

I’m trying–in vain, so far–to find the actual transcript text of the testimony concerning the computer forensic investigation of the Peterson computer on Dec. 24. The clearest report I can find is the AP story at this link:

Nothing I have found yet confirms my memory that someone claimed to be able to tell who was actually at the keyboard when the shopping site involving a sunflower-decorated umbrella stand was accessed.

The interpretations of this computer forensics data are interesting, but obviously were unclear to the jury.

So, I’m going to try to recreate what I think might actually have been recorded to the hard drive. First, someone booted up the computer and may or may not have had to enter a password to log on to a personal “account” on the computer. Then, someone was running some kind of searches on the web. From the AP report, a Yahoo ad must have popped up on the search screen. Someone clicked on the ad, which displayed a Yahoo shopping page with the tell-tale sunflower-decorated umbrella stand. All of this happened within about fifteen minutes, from 8:30 to about 8:45. This timeline comes from an examination of the hard drive by a police investigator named Lydell Wall.

The key information isn’t who logged on or who might have been at the keyboard, in my opinion. The key is this Yahoo ad.

Someone needs to “interrogate” someone at Yahoo who would know the algorithms for display ads for the Yahoo search engine in 2002-2003. What made that ad pop up? What cookies were on the hard drive that led Yahoo to display a sunflower-motif umbrella stand? What search words were typed in, which brought up that “impression” (view) of the shopping site link?

To me, this suggests that Laci (who liked sunflowers) had previously searched on keywords involving sunflowers. She used that computer, whether or not it required a password and whether or not it also accessed Scott’s sexual email messages. The only other explanation is that at some time Scott himself searched for sunflower motif objects on that computer.

But you won’t know for sure unless you talk to Yahoo. It is also possible that Scott was a weather freak and his previous searches and the cookies on the computer led him to click, quite by coincidence, on a shopping site involving weather vanes and a sunflower umbrella stand. Only Yahoo can say–and another investigation of the hard drive that takes into consideration the cookies.

The fact that someone actually clicked on an ad that led to a Laci-oriented shopping site is also important, of course. Why? Because you have to ask why someone would click on that site on Christmas Eve morning.

Attorney Geragos is right–the umbrella stand tends to indicate that Laci was still alive at 8:45 on that morning. I’m no expert on the crime timeline. Maybe Scott killed her at 8:46 or even later. I’m not saying I’m convinced he’s innocent.

The prosecution is wrong–the odds that anyone could manipulate the Yahoo ad algorithms and search engine to establish an alibi are beyond astronomical. Unless, of course, someone at Yahoo can explain it.

Someone at Yahoo needs to explain what their search engine was up to on Christmas Eve 2002.


Where in the world are all your secrets stored?

If you’re a criminal, your secrets are stored all over the place. A clever computer forensics investigator can find them, but not necessarily easily and certainly not always where you might expect.

Because I’m a fiction writer, and I’m currently writing about these issues, I’m not going to tell all the criminals out there–or all the other mystery writers out there–every detail of what I know about this topic. Let’s just say that any programmable electronic device you use stores data that can be retrieved and studied.

Think a minute about everything in your home besides your computer that is programmable: your alarm clock, your coffee pot . . . .

Now think a minute about everything in your yard, your workplace, your car, . . . .

Add to that everything in the public spaces through which you move, and don’t forget all the surveillance cameras, ATMs, and locked doors through which you pass by means of keycards or other electronic-access devices.

Oh, don’t forget the toll booths, cell phones, GPS devices . . . .

Credit card swipers, bank accounts . . . .

Now, think about how many of these electronic devices are on a network (LANs, WANs, and the Internet). Every network has its own logs of activity, and all these logs are stored “out there” somewhere.

Your garbage is permanent, too. Anything that goes into a landfill is probably there for longer than you will be around. Ever wonder what happens to all your recycled stuff? Who was that guy sitting in that white van down the street when you put that recycle bin out on the curb last night?

The only thing that protects us from having all our secrets exposed is that we are only one among hundreds of millions of people in this country. If anyone wants to learn our secrets, he has to dig through a whole lot of data and other garbage.

The vastness of all this data is the biggest barrier to criminal investigation–not paper shredders and hard-drive shredders. Unless you use a confetti-type paper shredder and then burn the confetti, the cops can still piece together that draft of the ransom note you wrote. Unless you completely, physically crush your hard drives and thumb drives and all the other data discs, the cops can retrieve a substantial portion of those email messages to your crooked accountant about where to launder your money. And even if you melt your entire PC to a gooey lump, the cops can still retrieve the archived email messages on your accountant’s server at work and probably also some further evidence from the ISPs you both use.

The problem I have with computer forensics being used as evidence in a trial isn’t any of this. It’s that alarm clock, coffee pot, cable box, and even your Google searches. The records that these electronic devices store are records of everyday life. They prove nothing about crimes. So what if you set your clock to wake you up at 6:30 every morning, but on the morning of the murder you get out of bed before it goes off? Maybe you often get up before the alarm goes off. So what if the coffee pot automatically started perking away as usual at 6:30 on the morning of the murder, but you forgot to put coffee in the basket the night before? Maybe you forget to do this all the time. So what if you rented on-demand sex movies for the first time in your life after your wife disappeared? Maybe you were trying to distract yourself from the horror of your situation.

So what if you searched for pornography online before or after a crime occurred? Millions of people do this every day. They just don’t happen to be caught up in a crime.

I have heard far too much testimony in murder trials about web search activity as if it were evidence of a crime. This is not evidence. At most, it helps investigators to piece together a picture of your behavior at certain times. This picture may or may not be suggestive of aberrant psychology. For instance, as far as I can tell, Neil Entwistle often searched the web for sex-related sites months, if not years, before the shooting of his wife and daughter. I’m not sure whether there was testimony at his trial that he sat down at the computer and searched for an escort service while his wife and daughter’s bodies were lying in his bed. If so, this is definitely evidence. But if these sorts of searches were dated before the tragedy, then they are not evidence, just clues for investigators.

Jurors should never be asked to evaluate Internet search behavior patterns as evidence, unless the searches can be shown to be illegal in themselves. (At least, that’s the opinion of this juror.)

Computer Forensics–Part I, Valid or not?

In a criminal trial, the lawyers submit all items of evidence, including anticipated testimony, to the judge for approval, before the jury gets to hear it and “find the facts.” A good judge excludes some items, because they are of questionable veracity, not probative, unduly prejudicial, and so on.

All sorts of activities are recorded on all sorts of electronic equipment, all of which seems to fall under the rubric of computer-forensics evidence. I’m not talking only about PCs; I’m talking about all sorts of electronic records. I’m convinced that much of the computer-forensics evidence that makes it into a trial should be excluded, because it is not probative (to-the-point, if you will, and incriminating) and/or it’s prejudicial. In other words, I feel that judges should exclude a lot of computer-related evidence. (I’ll supply specific examples in a minute.)

Some electronic records are–obviously–evidence of crime, though. Such records include actual communications which are themselves crimes, such as bank fraud, wire fraud, mail fraud, and conspiracy. If a murderer emails his accomplice about when they will meet after the murder–that’s evidence. If a murderer tells a friend she wants to buy a gun out of state so that she doesn’t have to wait as long for the background check as she would if she were buying it in her home state, that’s evidence, too.

That sort of electronic communication may, in and of itself, be criminal activity. It certainly should be presented to a jury. One reason it should be presented is that a jury can decide the facts for themselves based on the evidence. They don’t need an expert witness to read the email message to them or to tell them what to think about it.

A jury can also decide whether such a communication is evidence of some other crime. A jury may decide it is proof that certain facts alleged by the prosecution are true. The problem is, some evidence is so technical that only experts can evaluate it. A good deal of computer forensics falls under this heading, and so the jury is unable to evaluate it.

(I hate how wordy you have to get when you’re talking about the law. Let’s see if I can say this in human terms.)

A jury is called “the finder of fact.” That means they weigh the evidence and evaluate it. They listen to witnesses and decide if they’re telling the truth. They don’t have to believe every witness simply because the judge permitted him to testify. The judge’s admission of the testimony isn’t “vouching” for the witness.

A jury observes the visual evidence, too, and makes up their own minds about what it means. They don’t have to believe that a certain knife is the murder weapon, for instance, simply because the judge admitted it into evidence.

They also read the email messages or bank statements or word-processed documents or spreadsheets or whatever other electronic records there are and decide what they mean. For example, the jury can decide whether or not the gun the defendant ordered off of eBay is the murder weapon.

But some of the computer-forensics data presented to a jury is meaningless to jurors unless an expert interprets it. I have a problem with this sort of “evidence,” because an expert’s interpretation of something is only as good as the expert’s expertise. And as far as I’m concerned, current computer-forensics experts are not experts in interpretation. They are only experts in retrieving obscure data from obscure and damaged media. (I hope I’ve expressed this distinction clearly, because it is a critical distinction.)

Another problem with computer-forensic experts’ testimony is that defense attorneys have no expertise in cross-examining them. Defense attorneys rarely ask the right questions about the integrity of the retrieved data and the methodologies used.

(I’ve heard a few pointed questions, but only a few. For example, I heard one defense attorney ask the expert whether he was using the latest version available of his analytical software. The expert admitted he was not. But the attorney left it at that–as if a layperson juror would understand how much that compromises the results of the computer-forensic investigation. The lawyer should have hammered him on this and then brought on a defense expert to list exactly the sorts of flaws the old software might produce. Better yet, the defense ought to have sought to have the judge exclude the data from the trial on this basis.)

Some of the computer-forensics data that many, many judges admit into trials aren’t evidence of crime–they are simply clues to the crime, which the criminal investigators used in the investigation. And we all know that not everything that comes out in an investigation is admitted into court, because much of it isn’t relevant. This is my biggest complaint with computer-forensics data as evidence.

The following sorts of records may give investigators clues to follow, but they should not be treated as actual evidence of crime–just of suspicious behavior:

1) Searching the web for crime-related information (or everyone who reads this blog would be under suspicion)
2) Searching the web for sexual information
3) Searching the web for last-minute travel plans
4) Searching the web or other online directories for legal assistance or information
5) Records of GPS locations (from GPS monitors or from cell phones, etc.)–unless, of course, the records show that the suspect was at the scene of the crime precisely when the crime occurred
6) Security camera videos showing a suspect in a public space in the general vicinity of the crime, but doing nothing criminal or suspicious

Yes, these are great clues that can lead investigators to real, solid evidence that can be presented in court. But, since there is also an innocent explanation for all of these records, I don’t think they should make it past the judge and into court.

Unfortunately, far too much of this sort of information (which is subject to widely varying interpretations) makes it into court. For example, in almost every recent murder trial the prosecution brought on a computer-forensics expert to testify to the defendant’s web surfing habits, which almost always included porn searches, crime information, and travel plans. Except for the porn, which appalls me because it is a form of violence against women and children and sometimes animals, I am constantly surfing for crime information and travel information. If a person does this often and then a crime occurs in her circle of friends or neighbors, suddenly the surf pattern looks suspicious. It’s taken out of context. I am convinced that this is the case for Neil Entwistle–guilty or not, his porn and escort searches are completely irrelevant and prejudicial and should never have made it into the trial.

Another sort of computer-forensics testimony occasionally makes it into a trial. It’s rare. But it’s terrifically incriminating–if you can believe it. This testimony is related to forensic linguistics (which is actually something I approve of, as long as it is interpreted by a wise expert), but what I’m complaining about is actually a sort of pseudo-linguistics, a junk linguistics, if you will.

I’m referring to the analysis of computer keystrokes and “individual web-search preferences” when more than one person has access to a computer, whether at home or in the office or in public.

The most egregious use of this crystal-ball-gazing, computer voodoo was in the Scott Peterson trial. A computer expert testified that he could tell that only Peterson was on the family computer on the morning of the murder and that the victim, Laci Peterson, was not on the computer at that time, even though there was a click-through to an ad for an umbrella stand with a sunflower pattern on it (the sunflower being a favorite emblem of Laci’s). The expert claimed he could tell that the click-through was not Laci’s. That meant it had to be Scott who was clicking on the ad.

An aside: What if the expert was right? What if Scott did click on the umbrella stand? It was Christmas Eve. He might have been thinking about buying a present for his wife. This would tend to exonerate Peterson, wouldn’t it?

However, the prosecution used this testimony as evidence of premeditation. They even claimed Peterson clicked on the ad just to make everyone think that Laci was still alive at the time. Now, if that is true, Scott Peterson very nearly devised an incredibly clever crime. How did he know that a Google ad would pop up next to a search for the currents in San Francisco Bay? Or sturgeon? Or whatever he was searching for at the time? Google claims that they display ads based on the keywords in the search. Hmm.

This kind of testimony prejudices a jury against a defendant unfairly. The judge doesn’t understand it, so he admits it into the trial rather than going out on a limb and doubting an expert. The defense attorney doesn’t know how to cross-examine an expert who makes such claims. Unless the jury includes someone who really understands computers, the jury won’t question it, either.

To be continued . . . .



Computer Forensics–What Can They Really Find?

You Can Run, But Not Hide

In the next few days, I’ll be exploring the domain of computer forensics. Here are a few of the topics I intend to cover (based on my professional work in computing, beginning in 1980):

  • Where in the world are the records stored? (It’s not only on PC hard drives.)

  • Why can’t you simply delete everything incriminating?

  • What electronic devices in addition to computers can store records about their users?

  • Why do I say that the data recovered by computer forensics are clues but not evidence?

Forensic Linguistics, Anthrax, and Steven Hatfill

The June 30, 2008 Wall Street Journal editorial, “The Anthrax Fiasco,” vilifies the FBI and Justice Department for incorrectly targeting Steven Hatfill (a former military scientist) as the 2001 anthrax-terrorist. I’ve known all along that Hatfill was one of the least likely culprits. And I was not alone. During the anthrax panic I subscribed to a forensic linguistics mailing list on which several academic linguists expressed opinions that the letters were not written by a native English speaker, were possibly written by a native Arabic speaker, and showed signs of having originated in Great Britain. An FBI document expert on the list posted several responses to these linguists’ notes in which he indicated he would make sure that these comments were passed along to the agents investigating the case. I also emailed the FBI expert privately, off list, in order to suggest that certain aspects of the letters looked to me as if at some point they had been prepared for and then transmitted on a TTY (teletype) device. His response to my email indicated that he thought my insights had merit.

It looks as if the government settled Hatfill’s lawsuit before it reached court, probably to protect the investigating agents who ignored the linguists’ advice.

The issue of forensic linguistics in crime detection and the courtroom deserves a post of its own. Recently, I heard a legal commentator call forensic linguistics “junk science.” I disagree. It isn’t junk, even though it isn’t science either. It’s an analytical methodology.

Neil Entwistle’s Tell-Tale Hard Drive: Computer Forensics and Junk Science

I have a very difficult time understanding why judges allow evidence of most computer-hard-drive searches in murder trials–but, then, I also have a hard time understanding why library records and video-rental records are allowed in evidence.

Today’s so-called computer forensics evidence in the Neil Entwistle murder trial is junk. What you look at online and what you read do not prove you are a criminal.

About a decade ago I got into an argument on a listserv about whether or not a defendant’s fingerprints on a library book were ever used as evidence in a trial. All the librarians on the list insisted it was against the librarian code of ethics to permit police to access library records, and a lawyer on the list said that this wouldn’t be permitted by a judge as a violation of privacy rights. So, I had to give them a URL with the story. (I recently searched for this and couldn’t find it, or I would provide it here.) As I recall, the book in question had information on poisons. A woman was convicted of murder because her fingerprints were on the book. Since then, every time I check out a book with potentially fatal facts, I think about this. It makes me want to wear surgical gloves to the library.

Now, of course, the Patriot Act lets the government obtain an order requiring librarians to turn over their records. I’m not surprised that the government wants access to such things, but I am surprised that judges really think the books you touch, as well as the books you read, are probative in a criminal investigation.

Judges permit all sorts of junk evidence into trials. The most recent fad seems to be computer search records.

I do believe that computer forensics is a powerful investigative tool, but such information ought not to be used in trials. Computer forensics is an art, not a science. The development of the “clues” depends entirely on the quality of the software used to extract them from the hard drive. Everyone in the software industry knows the saying “garbage in, garbage out.” What that means is that the output of any program depends not only on the quality of the data fed in but also on the algorithms that do the processing. Currently, several companies supply hard-drive analysis software to law enforcement, but there is no “standards body” to vet the software for accuracy or thoroughness.

Then, the analysis of the meaning of the output is entirely an art. A list of web searches or page downloads is meaningless in and of itself. A human brain has to interpret it, and interpretation is . . . well, given to interpretation.

I heard some of the testimony concerning Neil Entwistle’s searches around the time of the crime. The commentators seem to think some of the searches were incredibly incriminating and “probative” of his state of mind. Now, I suppose I would be shocked to hear that my husband was searching for escort services, whether or not I was recently delivered of a child. And I suppose Entwistle’s searches for ways to stab someone in the neck indicate that at some time and for some reason he had some “dark” thoughts. But the other searches were completely innocuous. Why didn’t everyone stipulate to the fact that he was looking for a job online and that he was often thinking of hopping on a plane and going home to England? These searches are not evidence of a crime.

Re: Escort Services–I don’t believe what the commentators say about how the jury will react to this evidence. Young people these days have different attitudes toward sex than those of us who are a bit older. Europeans, including the English, are much more relaxed than we are about sex. Young men who have strong sex drives often look for “a bit on the side” during their wives’ pregnancies and afterwards. (Did anyone see “The Tudors” episode in which Anne Boleyn “gives” Henry one of her ladies-in-waiting while she’s pregnant?)

Re: Web Surfing with the Bodies in the House–Consider “time stamps.” This is one of the problems with computer forensics. When the precise time of a crime is unknown, it is foolish to assert that anything on a computer’s hard drive proves that a search took place before or after the crime. Time stamps are recorded to the drive when certain loggable events occur on a computer. The time is taken from the computer’s system clock, which a small battery-backed clock chip keeps running while the machine is switched off. The battery can die, which leaves the clock wrong. The user can also change the time and date to any time and date he wants. An electrical engineer would know how to do this. Even I know how to do this.
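A related illustration, offered as a sketch and not as a description of anything done in the Entwistle case: besides changing the system clock, ordinary software can also write whatever time it likes onto a file. The short Python example below creates a scratch file and stamps it with an arbitrary past date. The file contents and the date are invented for the example.

```python
import os
import tempfile
from datetime import datetime, timezone


def backdate_file(path: str, when: datetime) -> datetime:
    """Set a file's access and modified times to `when`, then read the modified time back."""
    stamp = when.timestamp()
    os.utime(path, (stamp, stamp))
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)


# Create a scratch file that stands in for any record on a drive.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"placeholder contents")
    temp_path = handle.name

# An arbitrary date chosen for illustration.
claimed = datetime(2002, 12, 24, 8, 30, tzinfo=timezone.utc)
recorded = backdate_file(temp_path, claimed)
os.remove(temp_path)

print("Time stamp now recorded on the file:", recorded.isoformat())
```

The time stamp the file reports is simply the number that was last written there. By itself it says nothing about when the file was really touched.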

In fact, I hope they have verified that the Entwistle computer (a laptop) wasn’t running on UK time. (They are seven hours ahead of us.) I have traveled internationally with a laptop, and I have changed my clock to correspond to local time at my destination long in advance of the trip. When I returned home, I sometimes forgot to change the time back. Alternatively, I know of people who retain U.S. time when they travel abroad. And then there’s the problem of Windows XP correcting the clock automatically when you get on the internet (because Windows periodically checks an internet time server, the “atomic” clock), while leaving the time-zone setting to the user. A knowledgeable user watches out for this and tries to reset the time to a time he wants to use.
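The arithmetic behind this worry can be sketched in a few lines of Python. The example assumes a logged time with no zone attached and a home zone seven hours behind UK winter time, matching the difference mentioned above. The logged date and time are hypothetical and are not taken from any trial record.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used as assumptions for this sketch.
UK_WINTER = timezone(timedelta(hours=0), "UK winter time")
HOME_ZONE = timezone(timedelta(hours=-7), "home, seven hours behind the UK")


def interpret(naive_stamp: datetime, assumed_zone: timezone) -> datetime:
    """Attach an assumed zone to a zone-less log time and convert it to UTC."""
    return naive_stamp.replace(tzinfo=assumed_zone).astimezone(timezone.utc)


# Hypothetical wall-clock time as it might appear in a log, with no zone recorded.
logged = datetime(2006, 1, 20, 13, 0)

as_uk = interpret(logged, UK_WINTER)
as_home = interpret(logged, HOME_ZONE)
gap = as_home - as_uk

print("If the clock was on UK time:  ", as_uk.isoformat())
print("If the clock was on home time:", as_home.isoformat())
print("Difference between the two readings:", gap)
```

The same entry in a log lands seven hours apart depending on which clock setting the examiner assumes, which is more than enough to move a search from “before” to “after” an event.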

I hope the jury doesn’t fall for the emotional impact of this computer forensics “evidence.” Whether or not Entwistle is guilty, nothing he did on the computer does anything but paint a portrait of an unemployed, new father who for some reason was interested in ways to do himself in (or maybe was just curious about suicide–I have also searched for such information).

In my opinion, the worst abuse of computer forensics was in the Scott Peterson trial when an expert testified that he could tell the difference between the defendant’s keystrokes and the victim’s keystrokes when using the computer. I hope I misheard this, or misremembered it. This is such hogwash, it doesn’t even bear commenting on.

I wish someone would challenge the use of this kind of evidence in court, all the way to the Supreme Court, if necessary. And I definitely hope there’s an engineer on the Entwistle jury.