A huge battle was waged in the courtroom during the Rittenhouse trial over the admission of an “enhanced” screenshot from a drone video. The prosecution used a still frame of Rittenhouse that the police spent 20 hours “enhancing” to get a grainy image that they claim shows Rittenhouse pointing his weapon at a party that was not involved in the shootings that took place a short time later.
The disagreement centers around what is meant by enhancement. Those on one side claim that this is no different than zooming in on a picture from your cell phone. The other side (the defense) claims that the software adds information to the picture. They are both correct, but the prosecution is being misleading about it.
The expert whom the prosecutors called to testify on this matter was a clueless moron who doesn’t know anything beyond “I push the button and this is the result I get.”
Let’s start with a discussion of what a pixel is. I’m going to do this in simple terms, so forgive me if I oversimplify. Light is actually made up of different wavelengths, each one corresponding to a color. The colors that we see are made up of combinations of those wavelengths. There are three “primary colors” which, when combined, make up every color that we see. Those three primary colors are red, green, and blue (RGB).
When a digital picture is taken, the light passes through the lens and strikes a computer chip. That chip is made up of a number of tiny light sensors, each one converting the light into a digital signal. Each of those tiny sensors is known as a “pixel.” (Not, as the judge called them, “pickles.”) That signal is represented by a number from 0 to 255 for each color. For example, a pure red pixel would be represented by 255,0,0.
Since there are 256 values for each of the three colors, that gives us 256^3 possible combinations, or 16.7 million possible color combinations. This results in each pixel taking up 24 bits of storage. This is important later.
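The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name pack_rgb is made up for illustration, and it simply shows how three 0 to 255 values fit into one 24-bit number.

```python
# Each color channel is one byte: 256 possible values (0 through 255).
VALUES_PER_CHANNEL = 256
CHANNELS = 3  # red, green, blue

color_combinations = VALUES_PER_CHANNEL ** CHANNELS
bits_per_pixel = 8 * CHANNELS


def pack_rgb(red, green, blue):
    """Pack three 0-255 channel values into a single 24-bit integer.

    Hypothetical helper, used here only to illustrate the 24-bit layout.
    """
    return (red << 16) | (green << 8) | blue


pure_red = pack_rgb(255, 0, 0)

print(color_combinations)  # 16777216, the "16.7 million" figure
print(bits_per_pixel)      # 24
print(hex(pure_red))       # 0xff0000
```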
When shooting a video, all the camera does is capture a series of pictures. The number of pictures that it takes per second is called the frame rate. Like the old flip cartoons, a display shows them back to you and your brain interprets that series of pictures as if it were smooth motion, filling in what it thinks is the missing information.
Here is a great example of video:
Resolution is simply the number of pixels in a given image. The more pixels there are, the sharper the display and the more the image can be “zoomed” or expanded without losing definition or detail. Different cameras and displays have different numbers of pixels, depending on the model of the device and the format that it is using.
For example, an HD display that has a resolution of 720p is 1280 pixels wide by 720 pixels high. This adds up to 921,600 pixels, which is rounded up to 1 Megapixel. Similarly, 1080p is 1920 by 1080 pixels, which works out to 2,073,600 pixels, or 2 Megapixels. This means that we have fit more than twice the number of pixels (2.25 times, to be exact) into the same image size, so all other things being equal, the picture would be sharper. Alternatively, we could make the display one and a half times as wide and as tall without making the pixels any bigger, so the larger picture would lose no sharpness.
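Those pixel counts are easy to verify. A minimal sketch, with a function name chosen just for this example:

```python
def pixel_count(width, height):
    """Total number of pixels in an image that is width by height."""
    return width * height


hd_720 = pixel_count(1280, 720)
hd_1080 = pixel_count(1920, 1080)

print(hd_720)             # 921600, roughly 1 Megapixel
print(hd_1080)            # 2073600, roughly 2 Megapixels
print(hd_1080 / hd_720)   # 2.25 times as many pixels
print(1920 / 1280)        # 1.5 times as wide (and 1080 / 720 is 1.5 times as tall)
```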
You can’t have more information than you started with, so whatever the camera doesn’t capture just isn’t there. So the resolution of the image is limited by the resolution of the camera. That isn’t the only thing that limits your picture. Another is the particular format that is being used to encode the picture. A picture that is stored as raw data takes up a large amount of memory space. As discussed before, a 1080p picture in its raw form has about 2 million pixels, with each one using 24 bits, or 3 bytes (there are 8 bits to a byte). This means that each picture takes up about 6.2 Megabytes. This is very large, and raw pictures would quickly eat up your memory. The answer to this is compression.
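The raw size of one uncompressed 1080p picture works out as follows. The 30 frames per second used for the video estimate is my own assumption for illustration, not a figure from the trial.

```python
WIDTH, HEIGHT = 1920, 1080
BITS_PER_PIXEL = 24
BITS_PER_BYTE = 8

raw_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // BITS_PER_BYTE
raw_megabytes = raw_bytes / 1_000_000

# Assumed frame rate, purely for illustration.
ASSUMED_FRAMES_PER_SECOND = 30
one_second_megabytes = raw_megabytes * ASSUMED_FRAMES_PER_SECOND

print(raw_bytes)                       # 6220800
print(round(raw_megabytes, 1))         # 6.2
print(round(one_second_megabytes, 1))  # 186.6
```

At that rate, a single minute of uncompressed video would run past 11 Gigabytes, which is why nobody stores it that way.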
Many of the pixels in an image are identical, or very much alike. Electronic devices take advantage of this by compressing the data. They replace the identical parts with a notation that basically says “this large area here is all one color.” How this is done is particular to each compression method, for example JPEG or GIF.
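The “this large area is all one color” idea can be sketched with the simplest compression scheme there is, run-length encoding. To be clear, this is a toy illustration: JPEG and GIF use more elaborate methods, and the function names here are made up for the example.

```python
def run_length_encode(pixels):
    """Collapse runs of identical pixels into (pixel, count) pairs."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # same color as the last run: extend it
        else:
            runs.append([pixel, 1])   # new color: start a new run
    return [(pixel, count) for pixel, count in runs]


def run_length_decode(runs):
    """Rebuild the original list of pixels from (pixel, count) pairs."""
    pixels = []
    for pixel, count in runs:
        pixels.extend([pixel] * count)
    return pixels


BLUE = (0, 0, 255)
WHITE = (255, 255, 255)

row = [BLUE] * 6 + [WHITE] * 2       # eight pixels of sky and cloud
encoded = run_length_encode(row)

print(encoded)  # [((0, 0, 255), 6), ((255, 255, 255), 2)]
print(run_length_decode(encoded) == row)  # True
```

Eight pixels are stored as two entries, and in this toy case nothing is lost when the row is rebuilt.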
Video is compressed in much the same way, using something called a codec. When frames are close to each other, the codec essentially will say “this frame is very similar to the previous frame, with the following exceptions” and then proceeds to describe the differences. Each codec handles this in a different way. Some are more accurate in describing differences than others, and each codec has its own advantages and disadvantages. Some are used because they require less processing power, others may be chosen because they are less memory intensive, and still others are chosen because they tend to look the best. The point is that each codec handles the frames in a different way.
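Here is a bare-bones sketch of the “same as the previous frame, with the following exceptions” idea. Real codecs are far more sophisticated than this, and the function names are hypothetical; each frame is reduced to a short list of brightness values to keep it readable.

```python
def encode_delta(previous_frame, current_frame):
    """List only the pixels that changed, as (position, new_value) pairs."""
    return [
        (position, new)
        for position, (old, new) in enumerate(zip(previous_frame, current_frame))
        if old != new
    ]


def apply_delta(previous_frame, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous_frame)
    for position, new_value in delta:
        frame[position] = new_value
    return frame


frame_1 = [10, 10, 10, 200, 10, 10]   # a bright object at position 3
frame_2 = [10, 10, 10, 10, 200, 10]   # the object has moved to position 4

delta = encode_delta(frame_1, frame_2)

print(delta)                                   # [(3, 10), (4, 200)]
print(apply_delta(frame_1, delta) == frame_2)  # True
```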
The Display Device
The last thing that affects an image is the device being used to display it. The device, which can be a monitor, a printer, or even a projector, will have its own resolution, its own capabilities for frame rate, and pixels of a certain size. A 4K television has 4096 by 2160 pixels (that is the cinema version of 4K; most consumer sets are 3840 by 2160). If that TV is 48 inches wide, then each pixel will be 48 inches divided by 4096, or about 0.0117 inches across. If you get very close to your television, you can probably see the individual pixels on your screen.
So what happens when we want to zoom in on one particular area of a picture? Remember that the limiting factor to resolution is the lowest resolution in the entire process: the camera, the compression method, and the display device. This is why they don’t film studio-quality movies on low-priced point-and-shoot cameras.
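The pixel size on that display is a one-line division. The 48 inch width is the example figure from above, not a measurement of any particular television.

```python
SCREEN_WIDTH_INCHES = 48      # example width from the text
PIXELS_ACROSS = 4096

pixel_width_inches = SCREEN_WIDTH_INCHES / PIXELS_ACROSS
pixels_per_inch = PIXELS_ACROSS / SCREEN_WIDTH_INCHES

print(round(pixel_width_inches, 4))  # 0.0117
print(round(pixels_per_inch, 1))     # 85.3
```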
Unfortunately, people watch too much television. On TV cop shows, some cop says “enhance” and the computer technician pushes a button, then the computer picks out a reflection on an eyeball in a photograph and is able to get a license plate number from a car that was passing by. That just isn’t how this works. Here is a woman from a frame of a video:
We want to see what that person looks like. So we take a look and see that there is an area of interest, and we want to zoom in on that. Let’s say that the original frame was 5120 by 2880 pixels and the area of interest was one tenth of its width and height, making the area we are interested in 512 by 288 pixels. So let’s make that larger.
Now I am trying to make a picture that is 512 pixels wide by 288 high fit onto my 4K television, which is a display that is 4096 by 2160 pixels. There are only two ways to do that: You can make the pixels themselves larger, to get an image that looks like this:
That doesn’t help. We have a larger picture, but we have lost a lot of detail. The other way is to leave the pixels the same size, but have more of them. That means adding pixels. This process is referred to as upsampling. Many photo software packages will do this automatically, and each one has its own method for deciding what the added pixels will look like.
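The crudest method simply copies the nearest existing pixel, which gives the same blocky result as making the pixels larger. A minimal sketch, using single numbers in place of colors and a made-up function name:

```python
def enlarge_by_repeating(image, factor):
    """Blow up an image by repeating every pixel 'factor' times each way.

    No new values are invented: every pixel in the result is a copy
    of a pixel from the original.
    """
    enlarged = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        for _ in range(factor):
            enlarged.append(list(wide_row))
    return enlarged


tiny = [[1, 2],
        [3, 4]]

big = enlarge_by_repeating(tiny, 2)
for row in big:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

The picture is four times the size, but it contains exactly the same four values as before.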
Some software packages will do averaging. Averaging is where you want to add a pixel between two existing ones. Let’s say that one pixel is a bright red (200,0,0) and the adjacent one is a duller red (100,0,0). The added pixel will wind up being in between the two (150,0,0).
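Averaging can be sketched in a few lines. This is a simplified illustration of the idea for a single row of pixels, not the method used by any particular software package, and the function names are my own.

```python
def average_pixels(left, right):
    """Return the color halfway between two RGB pixels."""
    return tuple((a + b) // 2 for a, b in zip(left, right))


def upsample_row(row):
    """Insert one averaged pixel between each pair of neighbors."""
    if not row:
        return []
    result = []
    for left, right in zip(row, row[1:]):
        result.append(left)                         # pixel the camera captured
        result.append(average_pixels(left, right))  # pixel the software made up
    result.append(row[-1])
    return result


bright_red = (200, 0, 0)
dull_red = (100, 0, 0)

print(upsample_row([bright_red, dull_red]))
# [(200, 0, 0), (150, 0, 0), (100, 0, 0)]
```

The middle pixel is a reasonable guess, but it is still a guess: the camera never recorded it.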
So what does this mean to the Rittenhouse case?
Even though I paid attention, no one mentioned the drone or the camera, so I don’t know the resolution of the camera on the drone that provided the so-called “unicorn” video in the Rittenhouse case. Let’s just assume for the sake of argument that the drone was a DJI Mavic 3, a popular consumer drone that costs in the neighborhood of $2,000, and that its camera shoots in 5K resolution, which we will take to be 5120 by 2880 pixels, or 14.7 Megapixels.
When that software expands the selected part of the picture to show what Rittenhouse was doing, there will be pixels that are added. What was added? What remained the same? Normally, the person seeking to admit that photo into evidence has the burden of proving that the enhanced picture is an accurate depiction of what actually happened. They do this by having an expert present who can attest to how and to what extent that photo was changed.
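To get a feel for how much is added, take the hypothetical numbers from earlier: a 512 by 288 area of interest blown up to fill a 4096 by 2160 screen. These are my assumed figures for the sake of argument, not the actual dimensions of the trial exhibit.

```python
# Hypothetical figures from the example above, not from the trial evidence.
CROP_WIDTH, CROP_HEIGHT = 512, 288
DISPLAY_WIDTH, DISPLAY_HEIGHT = 4096, 2160

captured_pixels = CROP_WIDTH * CROP_HEIGHT
displayed_pixels = DISPLAY_WIDTH * DISPLAY_HEIGHT
added_pixels = displayed_pixels - captured_pixels
added_share = added_pixels / displayed_pixels

print(captured_pixels)        # 147456
print(displayed_pixels)       # 8847360
print(added_pixels)           # 8699904
print(f"{added_share:.1%}")   # 98.3%
```

Under those assumptions, only about one pixel in sixty on the screen could correspond to something the camera captured. The rest came from the software.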
In this case, the cop who did the photo enhancement was using software, but had no idea how that software worked, nor did he have any idea how the enhanced picture differed from the original. All he knew was that he pushed a button, and *poof* the picture was enhanced to show something.
That is why this is an issue. The judge, and apparently every leftist who is a CSI fan, don’t understand that.