A huge battle was waged in the courtroom during the Rittenhouse trial over the admission of an “enhanced” screenshot from a drone video. The prosecution used a still frame that the police spent 20 hours “enhancing” to get a grainy image that they claim shows Rittenhouse pointing his weapon at a party who was not involved in the shootings that took place a short time later.

The disagreement centers on what is meant by “enhancement.” One side (the prosecution) claims that this is no different from zooming in on a picture from your cell phone. The other side (the defense) claims that the software adds information to the picture. They are both correct, but the prosecution is being misleading about it.

The expert that the prosecutors seated to testify on this matter was a clueless moron who doesn’t know anything beyond “I push the button and this is the result I get.”

PIXELS

Let’s start with a discussion of what a pixel is. I’m going to do this in simple terms, so forgive me if I oversimplify. Light is made up of different wavelengths, each one corresponding to a color. The colors that we see are combinations of those wavelengths. There are three “primary colors” which, when combined, can make up every color that we see: red, green, and blue (RGB).

When a digital picture is taken, the light passes through the lens and strikes a computer chip. That chip is made up of a number of tiny light sensors, each one converting the light into a digital signal. Each of those tiny sensors is known as a “pixel.” (Not, as the judge called them, “pickles.”) That signal is represented by a number from 0 to 255 for each color. For example, a pure red pixel would be represented as 255,0,0.

Since there are 256 values for each of the three colors, that gives us 256^3 possible combinations, or about 16.7 million possible colors. This means each pixel takes up 24 bits of storage (three colors at 8 bits each). This is important later.
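To make that concrete, here is a minimal sketch of the arithmetic in Python (the names are mine, purely for illustration):

```python
# Each channel (red, green, blue) is one byte: an integer from 0 to 255.
VALUES_PER_CHANNEL = 256
CHANNELS = 3

total_colors = VALUES_PER_CHANNEL ** CHANNELS
bits_per_pixel = CHANNELS * 8

print(f"{total_colors:,} possible colors")  # 16,777,216
print(f"{bits_per_pixel} bits per pixel")   # 24

# A pure red pixel, as described above:
red_pixel = (255, 0, 0)
```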

When shooting a video, all the camera does is capture a series of pictures. The number of pictures it takes per second is called the frame rate. Like the old flip books, a display plays the pictures back to you, and your brain interprets that series of pictures as if it were smooth motion, filling in what it thinks is the missing information.

Here is a great example of video: [embedded video not reproduced here]

RESOLUTION

Resolution is simply the number of pixels in a given image. The more pixels, the sharper the image and the more it can be “zoomed” or expanded without losing definition or detail. Different cameras and displays have different numbers of pixels, depending on the model of the device and the format it is using.

For example, an HD display with a resolution of 720p is 1280 pixels wide by 720 pixels high. That adds up to 921,600 pixels, which gets rounded up to 1 Megapixel. Similarly, 1080p is 1920 by 1080 pixels, which works out to 2,073,600 pixels, or 2 Megapixels. This means that we have fit more than twice the number of pixels into the same image size, so, all other things being equal, the picture will be sharper. It also allows us to double the size of the display without increasing the size of the pixels, which would cause a loss of sharpness.

Compression

You can’t have more information than you started with, so whatever the camera doesn’t capture just isn’t there. The resolution of the image is therefore limited by the resolution of the camera. That isn’t the only thing that limits your picture. Another is the particular protocol being used to encode the picture. A picture stored as raw data takes up a large amount of memory. As discussed before, a 1080p picture in its raw form has 2,073,600 pixels, with each one using up 24 bits (3 bytes) of space. This means that each picture takes up about 6.2 Megabytes (there are 8 bits to a byte). That is very large, and raw pictures would quickly eat up your memory. The answer to this is compression.
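Here is that size arithmetic worked out in Python, covering both of the resolutions above (illustrative only):

```python
def raw_frame_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size of one RGB frame: 24 bits = 3 bytes per pixel."""
    return width * height * bytes_per_pixel

for name, (w, h) in {"720p": (1280, 720), "1080p": (1920, 1080)}.items():
    pixels = w * h
    megabytes = raw_frame_bytes(w, h) / 1_000_000
    print(f"{name}: {pixels:,} pixels, {megabytes:.1f} MB per raw frame")

# 720p: 921,600 pixels, 2.8 MB per raw frame
# 1080p: 2,073,600 pixels, 6.2 MB per raw frame
```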

Many of the pixels in an image are identical, or very much alike. Electronic devices take advantage of this by compressing the data. They replace the identical parts with a notation that basically says “this large area here is all one color.” How this is done is particular to each compression method: JPEG, GIF, and so on.
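The simplest version of that idea is run-length encoding. Real formats like JPEG and GIF use more sophisticated schemes (frequency transforms, dictionary coding), so treat this as a sketch of the principle, not of any particular format:

```python
def rle_compress(pixels):
    """Collapse runs of identical pixels into [pixel, count] pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return runs

# A scanline that is mostly one color compresses very well:
row = [(0, 0, 255)] * 90 + [(255, 255, 255)] * 10  # 100 pixels
print(rle_compress(row))
# [[(0, 0, 255), 90], [(255, 255, 255), 10]] -- 2 runs instead of 100 pixels
```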

Video is compressed in much the same way, using something called a codec. When frames are close to each other, the codec essentially says “this frame is very similar to the previous frame, with the following exceptions,” then proceeds to describe the differences. Each codec handles this in its own way. Some are more accurate in describing differences than others, and each codec has its own advantages and disadvantages. Some are used because they require less processing power, others may be chosen because they use less memory, and still others because they tend to look the best. The point is that each codec handles the frames differently.
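Here is a toy illustration of that “store only the differences” idea. Real codecs such as H.264 do far more (motion estimation, lossy transforms), so this is only the concept:

```python
def frame_delta(prev, curr):
    """Record only the pixels that changed since the previous frame."""
    return {i: p for i, (q, p) in enumerate(zip(prev, curr)) if q != p}

def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    out = list(prev)
    for i, p in delta.items():
        out[i] = p
    return out

frame1 = [(0, 0, 0)] * 8   # eight black pixels
frame2 = list(frame1)
frame2[3] = (255, 0, 0)    # one pixel changed between frames

delta = frame_delta(frame1, frame2)
print(delta)               # {3: (255, 0, 0)} -- 1 entry instead of 8 pixels
assert apply_delta(frame1, delta) == frame2
```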

The Display Device

The last thing that affects an image is the device being used to display it. The device, which can be a monitor, a printer, or even a projector, will have its own resolution, its own capabilities for frame rate, and pixels of a certain physical size. A 4K television has 4096 by 2160 pixels (strictly speaking, that is the cinema 4K standard; consumer UHD sets are 3840 by 2160, but we will use the rounder figure). If that TV is 48 inches wide, then each pixel will be 48 inches divided by 4096, or about 0.0117 inches wide. If you get very close to your television, you can probably see the individual pixels on your screen.
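The pixel-size arithmetic, for anyone who wants to plug in their own screen (illustrative Python):

```python
width_pixels = 4096   # cinema-4K horizontal resolution
width_inches = 48     # physical width of the screen

pixel_pitch = width_inches / width_pixels
print(f"{pixel_pitch:.4f} inches per pixel")  # 0.0117
```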

Changing resolution

So what happens when we want to zoom in on one particular area of a picture? Remember that the limiting factor is the lowest resolution in the entire chain: the camera, the compression method, and the display device. This is why they don’t film studio-quality movies on cheap point-and-shoot cameras.

Unfortunately, people watch too much television. On TV cop shows, some cop says “enhance,” the computer technician pushes a button, and the computer picks a reflection off an eyeball in a photograph and pulls a license plate number from a car that was passing by. That just isn’t how this works. Here is a woman from a frame of a video: [image not reproduced here]

We want to see what that person looks like. So we take a look, see that there is an area of interest, and want to zoom in on it. Let’s say that the area of interest is one tenth of the original in each dimension, making the area we are interested in 512 by 288 pixels. So let’s make that larger.

Now I am trying to make a picture that is 512 pixels wide by 288 high fill my 4K television, which is a display that is 4096 by 2160 pixels. There are only two ways to do that. You can make the pixels themselves larger, to get an image that looks like this: [image not reproduced here]

That doesn’t help. We have a larger picture, but we have lost a lot of detail. The other way is to leave the pixels the same size, but have more of them. That means adding pixels. This process is referred to as upsampling. Many photo software packages will do this automatically, and each one has its own method for deciding what the added pixels will look like.

Some software packages will do averaging. Averaging is where you want to add a pixel between two existing ones. Let’s say that one pixel is a bright red (200,0,0) and the adjacent one is a duller red (100,0,0). The added pixel will wind up being halfway in between the two: (150,0,0).
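A minimal sketch of both approaches in Python: pixel replication (“making the pixels larger”) and averaging. Real photo software uses fancier methods such as bicubic interpolation, but the principle is the same:

```python
def replicate(row, factor=2):
    """Nearest-neighbor upsampling: repeat each pixel. Blocky, but honest."""
    return [p for p in row for _ in range(factor)]

def average(a, b):
    """Invent a pixel halfway between two neighbors, channel by channel."""
    return tuple((x + y) // 2 for x, y in zip(a, b))

def interpolate(row):
    """Linear upsampling: insert an averaged pixel between each pair."""
    out = [row[0]]
    for prev, curr in zip(row, row[1:]):
        out.append(average(prev, curr))  # this pixel never existed
        out.append(curr)
    return out

row = [(200, 0, 0), (100, 0, 0)]
print(replicate(row))    # [(200, 0, 0), (200, 0, 0), (100, 0, 0), (100, 0, 0)]
print(interpolate(row))  # [(200, 0, 0), (150, 0, 0), (100, 0, 0)]
```

That (150,0,0) pixel was never captured by any camera; the software computed it. That is the sense in which enhancement “adds information.”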

So what does this mean to the Rittenhouse case?

Even though I paid attention, no one mentioned the drone or the camera, so I don’t know the resolution of the camera on the drone that provided the so-called “unicorn” video in the Rittenhouse case. Let’s just assume, for the sake of argument, that the drone was a DJI Mavic 3, a popular consumer drone that costs in the neighborhood of $2,000. That drone has a camera that shoots in 5K resolution, which is 5120 by 2880 pixels, or 14.7 Megapixels.

When the software expands the selected part of the picture to show what Rittenhouse was doing, pixels will be added. Using our numbers from above: a 512 by 288 crop contains 147,456 real pixels, but a 4096 by 2160 display shows 8,847,360, so 59 out of every 60 pixels on the screen had to be generated by the software. What was added? What remained the same? Normally, the person seeking to admit that photo into evidence has the burden of proving that the enhanced picture is an accurate depiction of what actually happened. They do this by presenting an expert who can attest to how, and to what extent, the photo was changed.

In this case, the cop who did the photo enhancement was using software, but had no idea how that software worked, nor did he have any idea how the enhanced picture differed from the original. All he knew was that he pushed a button and, *poof*, the picture was enhanced to show something.

That is why this is an issue. The judge, and apparently every leftist who is a CSI fan, doesn’t understand that.


5 Comments

E M Johnson · November 12, 2021 at 2:22 pm

a fleeting image that has an unknown number of variables massaged that maybe possibly kinda sorta might indicate Kyle maybe possibly kinda sorta flagged someone with his muzzle still doesn’t rise to the level he “pointed” the weapon soooooo no. that one doesn’t fly either mister prosecutor

Chris · November 12, 2021 at 2:26 pm

People actually believe these crime entertainment shows.

Reason 1 why you should have to take a test to vote or be on a jury

    anonymous coward · November 12, 2021 at 4:21 pm

    Sounds to me like, there is enough drone video footage floating around, that law enforcement could be pulling in a ton of rioters or peaceful protestors (if they wished).

e · November 12, 2021 at 3:07 pm

Not exactly the same point but close enough maybe to defend against the prosecutor taking a poor quality 2D still image and claiming it means you pointed a weapon (aggravated assault) at someone else.
You just can’t tell angles in most cases from a 2D image. You can illustrate that point to people by playing back the “Larry does war movies” (Larry Vickers reviewing war movies) on youtube. He did Saving Private Ryan and he brought this up in a scene where German soldiers at the end look like they’re running up to US soldiers on the ground, and shooting them with their rifles. Larry points out and you can see it, once he does, that even though they were using blanks in the rifles for the scene they didn’t want to risk an accident, so the “Germans” are actually pointing their rifles off to one side when they shoot. It’s done well and it’s hard to tell, but you can definitely see it once it’s mentioned and you’re looking for it.
So arguing this kind of thing on one photo, should be easy for the defense to show it’s not credible.

In a similar vein, I learned a week or two ago the difference between a legal tactic and a legal trick, in trials. A “tactic” is what the other side does when they know you’re lying and will question you in a way to make you reveal the lie or that you’re not telling the truth. A “trick” is where the other lawyer knows you’re telling the truth so he’ll question you in a way to make you appear to not be, when you are. I thought this made more legal stuff make sense suddenly, once I’d heard this.

    Divemedic · November 12, 2021 at 4:31 pm

    The big takeaway here is that the rules of evidence say that, in order for a picture (or video) to be brought into evidence, the party presenting it has to establish that the photo is a fair and accurate depiction of what actually happened. They didn’t, as their witness did in fact testify that he didn’t ensure that it was accurate.
    Since they didn’t do that, the opposing party needs to get an objection on the record, which the defense did. If there is such an objection, this is an issue that could overturn any potential conviction on appeal.

Comments are closed.