Editor's note: This article was first published on the official blog of the photography app Halide under the title "iPhone SE: The One-Eyed King?". The author is Halide developer Ben Sandofsky; it is republished here as an authorized translation.
After four years, we finally have a budget iPhone again. Like its predecessor, the new iPhone SE keeps its price down by mixing old and new hardware; iFixit even found that its rear camera module is interchangeable with the iPhone 8's.

iPhone 8 and iPhone SE teardown comparison (Image: iFixit)
If you judge the iPhone SE's camera purely on its hardware specs, you may as well go read an iPhone 8 review. In short: even though the hardware is three years old, we think it is still excellent. Sebastiaan recently shot a batch of RAW photos with it to see how it holds up in practice.
But the real story of the iPhone SE's camera is the software. It uses "Single Image Monocular Depth Estimation" to do something no other iPhone has done: put plainly, this is the first iPhone that can produce a portrait-mode effect from a single, flat 2D image.
Shot with the latest Halide update. (The built-in camera app only enables Portrait Mode when it detects people.)
You might say, "Isn't that what the iPhone XR does? It also has only one camera!" But as we mentioned earlier, although the iPhone XR also has a single rear camera, it still uses hardware to obtain depth information. The iPhone XR pulls data from the sensor's phase-detection autofocus pixels, which you can think of as tiny pairs of eyes that help with focusing; it exploits the slight differences between what each "eye" sees to build a rough depth map.
That method isn't available on the new iPhone SE: its sensor is smaller and doesn't have enough focus pixels to work with, so the iPhone SE generates its depth data entirely through machine learning. You can test this yourself by taking a picture of another picture.
(We recommend using Halide for this, not just because we make the app, but because the built-in camera only enables Portrait Mode when shooting people. We'll come back to that later.)
Bring the image below up full screen on a display and take a photo of it with an iPhone SE:
Lux Optics' mascot June
An iPhone XR sees a nearly flat surface, which is exactly what a computer display is. It appears to shade its depth estimate based on the colors in the picture, guessing that the subject in the middle sits slightly in front of everything else.
Depth data generated by the iPhone XR
The iPhone SE generates a completely different depth map. It's wrong, but it's amazing!
Depth data generated by the iPhone SE
Here's a fifty-year-old photo of my grandmother, shot on film, that I found while cleaning out my dad's house...
Amazing, right? The trick is genuinely impressive, but these are contrived test scenes. What happens when I try it in the real world?
<h2 class="ss-hId-1" > combat test: occasional flaws</h2>
Why does Apple only let the built-in camera use Portrait Mode on people? Because Apple's pipeline includes an extra processing step, and it only works well on people: when the subject is a person, the results are quite good; when it isn't, small things occasionally go wrong.
In this example, the neural network failed to separate June's head from the tree behind her. Maybe it thinks the tree is a horn growing out of her head?
It also seems that these depth maps are excellent at segmentation (splitting a picture into its different parts) but hit-or-miss at judging actual distance. Take this photo of a frightened-looking June as an example:
"2020 won't be good for anyone"
I shot the same scene with both an iPhone 11 Pro and the iPhone SE, and it's obvious that more cameras produce better data. The iPhone 11 Pro captured the full depth of the hallway, while the iPhone SE misjudged the floor receding into the distance.
Left: iPhone 11 Pro; right: iPhone SE
How does this play out in a real photo? Let's use this succulent as the subject, because the scene has nicely distinct layers of depth.
In the iPhone 11 Pro's depth map you can see clear, crisp edges, while the iPhone SE only captures a rough outline. Maybe the neural network was trained on lots of pictures of dogs and never learned to recognize plants?
What about the final result? Normally, objects in the foreground and in the background should be blurred by different amounts, but on the iPhone SE the blur is all mashed together.
To be clear, I turned up the strength of the portrait effect to make the difference more obvious. If you prefer the iPhone SE's look, you can reproduce it on an iPhone 11 Pro, but not the other way around: you can't reproduce the 11 Pro's layered look on the SE.
The second machine-learning step in Apple's pipeline is what makes the real difference in separating those layers. Alongside the iPhone XR, Apple introduced an API called the Portrait Effects Matte, which detects people in a photo and generates a richly detailed mask around them. This mask actually has a higher resolution than the depth data stored in the photo.
Shot on an iPhone XS in 2018, scaled here to a quarter of the original image's resolution
As long as the subject is in focus and sharp, most people will never notice that you haven't put much effort into the background blur.
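If you're curious what that matte looks like under the hood, it's stored inside the photo file as auxiliary image data. Below is a minimal, illustrative Swift sketch, not Halide's actual code, that reads the Portrait Effects Matte back out of a saved portrait photo; the file path is a placeholder and the photo is assumed to contain a matte.

```swift
import AVFoundation
import ImageIO

// Minimal sketch: read the Portrait Effects Matte that iOS embeds in a
// Portrait Mode photo as auxiliary image data. The URL is assumed to point
// at a HEIC/JPEG that actually contains a matte.
func portraitEffectsMatte(in url: URL) -> AVPortraitEffectsMatte? {
    guard let source = CGImageSourceCreateWithURL(url as CFURL, nil),
          let matteInfo = CGImageSourceCopyAuxiliaryDataInfoAtIndex(
              source, 0, kCGImageAuxiliaryDataTypePortraitEffectsMatte
          ) as? [AnyHashable: Any]
    else { return nil }
    return try? AVPortraitEffectsMatte(fromDictionaryRepresentation: matteInfo)
}

// The matte is a single-channel person mask, typically higher resolution than
// the depth map stored in the same file.
if let matte = portraitEffectsMatte(in: URL(fileURLWithPath: "/path/to/portrait.heic")) {
    let mask = matte.mattingImage // CVPixelBuffer
    print("Matte size:", CVPixelBufferGetWidth(mask), "x", CVPixelBufferGetHeight(mask))
}
```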
So, back to the question: why does Apple only allow Portrait Mode on people?
Apple tends to under-promise and over-deliver. It could let you shoot anything in Portrait Mode, but it would rather have users believe "Portrait Mode is only for people" than ship an effect that is hit-or-miss. As developers, though, we're glad Apple gives us the freedom to access this depth data ourselves.
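As a rough illustration of that freedom, here is a sketch, written under assumptions and not Halide's implementation, of how a third-party app asks AVFoundation to deliver the depth map and the Portrait Effects Matte with a capture. It assumes a capture session with a compatible camera input is already configured and running.

```swift
import AVFoundation

// Sketch only: a delegate that receives the depth map and Portrait Effects
// Matte alongside the processed photo.
final class DepthCaptureDelegate: NSObject, AVCapturePhotoCaptureDelegate {
    func photoOutput(_ output: AVCapturePhotoOutput,
                     didFinishProcessingPhoto photo: AVCapturePhoto,
                     error: Error?) {
        if let depth = photo.depthData {
            // Per-pixel depth/disparity, however the device produced it.
            print("Depth map width:", CVPixelBufferGetWidth(depth.depthDataMap))
        }
        if let matte = photo.portraitEffectsMatte {
            // The detailed person mask described above.
            print("Matte width:", CVPixelBufferGetWidth(matte.mattingImage))
        }
    }
}

// Assumes `output` is already attached to a running AVCaptureSession whose
// camera input supports depth and matte delivery.
func captureWithDepth(using output: AVCapturePhotoOutput,
                      delegate: DepthCaptureDelegate) {
    output.isDepthDataDeliveryEnabled = output.isDepthDataDeliverySupported
    output.isPortraitEffectsMatteDeliveryEnabled = output.isPortraitEffectsMatteDeliverySupported

    let settings = AVCapturePhotoSettings()
    settings.isDepthDataDeliveryEnabled = output.isDepthDataDeliveryEnabled
    settings.isPortraitEffectsMatteDeliveryEnabled = output.isPortraitEffectsMatteDeliveryEnabled
    output.capturePhoto(with: settings, delegate: delegate)
}
```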
Next question: will machine learning get good enough that multi-camera hardware becomes unnecessary?
< h2 class="ss-hId-1" > the challenge of machine learning to calculate depth of field</h2>
Our brains are incredible. Unlike the new iPad Pro, we don't have LiDAR feeding us depth information; our brains derive it from a whole range of cues.
The best cue is comparing the two slightly different images our eyes see. The brain matches up the two views, and the greater the shift of an object between them, the closer that object is. iPhones with two cameras gather depth information on essentially the same principle.
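That relationship is simple enough to write down. The sketch below is purely illustrative, with made-up numbers rather than Apple's actual pipeline: it's the textbook pinhole-stereo formula, where depth equals focal length times the distance between the two cameras divided by the disparity, so a bigger shift means a closer object.

```swift
// Textbook stereo relation (illustrative only, not Apple's pipeline):
// depth = focal length × baseline / disparity. A larger shift of a feature
// between the two views means the object is closer.
func stereoDepthMeters(focalLengthPixels f: Double,
                       baselineMeters b: Double,
                       disparityPixels d: Double) -> Double? {
    guard d > 0 else { return nil } // (near-)zero disparity reads as "very far away"
    return (f * b) / d
}

// Made-up numbers: with a ~2600 px focal length and a 12 mm baseline,
// a feature that shifts 30 px between the two cameras is roughly 1 m away.
let roughDepth = stereoDepthMeters(focalLengthPixels: 2600,
                                   baselineMeters: 0.012,
                                   disparityPixels: 30)
```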
Another cue is motion. When you walk, objects far away appear to move more slowly than objects close by.
Scene from the game Sonic Dodger
This is similar to how AR apps work out where you are in the real world. For photographers, though, it isn't ideal: nobody wants to wave their phone around in the air before taking a picture.
So we're left with the last resort: extracting depth from a single still image taken with a single lens. If you know someone who can only see with one eye, you know their life is hardly different from anyone else's; driving might just take a little more effort. That's because they have to rely on other cues to judge distance, such as the relative size of familiar objects.
For example, you know roughly how big a phone is. If you see a phone that looks enormous, your brain concludes it must be close to you, even if you've never seen that particular phone before. Most of the time, that judgment is right.
Top Secret is an underrated film
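The known-size cue can also be written as a tiny formula. Here's a toy sketch with invented numbers, not anything a real camera pipeline ships: in pinhole-camera terms, the more of the frame a known-size object fills, the closer it has to be.

```swift
// Toy version of the "known size" cue (invented numbers, not a real pipeline):
// with a pinhole camera, distance ≈ real size × focal length / size in the image.
func distanceFromKnownSize(realHeightMeters h: Double,
                           focalLengthPixels f: Double,
                           apparentHeightPixels p: Double) -> Double {
    return h * f / p // the more pixels something covers, the closer it is
}

// A ~15 cm phone spanning 400 px at a ~2600 px focal length is about 1 m away.
let phoneDistance = distanceFromKnownSize(realHeightMeters: 0.15,
                                          focalLengthPixels: 2600,
                                          apparentHeightPixels: 400)
```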
Simple, then: just train a neural network to pick up on these cues, right? Unfortunately, this turns out to be an "ill-posed problem." A "well-posed problem" has a solution, and that solution is unique, like "one train leaves A at 60 mph and another leaves B at 80 mph..."
But when it comes to guessing depth, a single image can support several different conclusions. Take the viral TikTok video below: is the person facing the camera, or facing away from it?
Some pictures simply can't be resolved, either because there isn't enough supporting information or because the image itself has no consistent answer.
"Waterfall" M. C. Escher
So, in general: neural networks can look like magic, but they can't break through limits that even human intelligence runs into. In some situations, one picture simply isn't enough. A machine-learning model may generate a plausible depth map, but that doesn't mean it matches reality.
If you just want to take a nice portrait shot without any fuss, that's not a problem. But if you want to record a scene accurately and preserve as much information as possible for later editing, you need an extra viewpoint (like a dual-camera system) or an extra sensor (like LiDAR).
Will machine learning surpass multi-camera phones? We think not: whatever the system, more information is almost always better, which is why we evolved two eyes.
The real question is whether, one day, estimated depth gets close enough to reality, with few enough failure cases, that even professional photographers start to wonder whether the extra hardware is worth paying for. We don't think that will happen this year, but at the pace machine learning is moving, it may be a matter of a few years rather than decades.
This is an exciting time for photography.
<h2 class="ss-hId-1" > try it yourself</h2>
If this article has left you curious about the current state of depth capture, try our newly updated Halide camera.
With any iPhone that supports Portrait Mode, you can shoot in Halide's depth mode and inspect the depth data you capture. It's genuinely fun to see how your iPhone looks at this beautiful three-dimensional world.
Finally, thank you for reading. If you have more questions about the iPhone SE, feel free to reach out to us on Twitter. Happy shooting!