Here are a few problem domains that I thought would've been solved before the advent of "generative AI that can turn text prompts into videos,"
but it turns out I was wrong and they are deceptively hard:
Automated long-haul freight trucks
Via DALLE 2: "a freight truck on a snowy highway, in the style of a watercolor painting"
As a layperson, I assumed that self-driving freight trucks on highways would be a light version of
full "level-5 autonomy passenger vehicles in cities." The logic seems simple: it's mostly following well-marked lanes,
and there are no intersections.
But it turns out that highway driving is actually MUCH harder than autonomous city driving
(which - to be clear - is already an extremely difficult problem). Engineers seem to inevitably discover that when they create startups trying
to disrupt long-haul trucking:
- The available sensor technologies don't reliably have enough range under all conditions. This can lead to unacceptable safety problems:
at normal highway speeds it can take a long time to bring a truck to an emergency stop, and if your stopping distance is longer
than the range at which your sensors can detect obstacles, then by the time you've detected an obstruction
it might be too late to avoid hitting it.
- A truck is a complex object - the cab and the trailer are joined but pivot independently at the hitch, so the truck cannot be physically
modeled as a single rigid body like a simple passenger vehicle. This complicates planning and operations.
- In city driving, novel situations like unplanned obstructions or road closures can usually be handled with a "pull over safety stop"
and rerouting to a different path. In highway driving, not only is pulling over usually not an option, "rerouting" around
an unexpected obstacle usually isn't an option either. Consider the case of encountering highway construction and police officers
waving traffic toward an alternate route - the vehicle doesn't have the option of saying "I don't know what's going on, so I'm going to avoid that
street" - it's on a highway and the only way off is through it!
- The data on unique events (like construction detours) is extremely sparse compared to city driving, and it's therefore extremely expensive to collect enough of it.
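To make the sensor-range point concrete, here is a rough back-of-the-envelope sketch. It assumes a simple model that is mine, not any trucking company's: the truck keeps moving at full speed for some detection-plus-actuation latency, then brakes at a constant deceleration. The function names, the 1-second latency, and the deceleration values are all illustrative assumptions, not measured figures.

```python
MPH_TO_MPS = 0.44704  # exact conversion factor, miles per hour to meters per second


def stopping_distance_m(speed_mps, latency_s, decel_mps2):
    """Distance traveled during the latency period plus constant-deceleration braking.

    Simplified model (an assumption): d = v * t + v^2 / (2 * a).
    """
    if decel_mps2 <= 0:
        raise ValueError("deceleration must be positive")
    return speed_mps * latency_s + speed_mps ** 2 / (2.0 * decel_mps2)


def can_stop_in_time(sensor_range_m, speed_mps, latency_s, decel_mps2):
    """True if the truck can stop within the distance its sensors can see."""
    return stopping_distance_m(speed_mps, latency_s, decel_mps2) <= sensor_range_m


if __name__ == "__main__":
    speed = 65 * MPH_TO_MPS  # a typical US highway speed
    # Hypothetical deceleration values, e.g. slick road vs. dry road.
    for decel in (2.0, 3.0, 5.0):
        d = stopping_distance_m(speed, latency_s=1.0, decel_mps2=decel)
        print(f"decel {decel} m/s^2 -> stopping distance {d:.0f} m")
```

The thing to notice is that stopping distance grows with the square of speed, so whatever range the sensors reliably deliver in bad weather puts a hard cap on how fast the truck can safely go.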
Automated food assembly
Every few years I see a demo for some robotic machine for making burgers, pizza, pasta, or
sandwiches, and I end up...underwhelmed? Unfortunately the most common pattern is a test deployment in some novel concept restaurant somewhere,
and then eventually the whole operation folds up shop.
Having prepared a lot of food myself, I think I was surprised by how difficult it is to automate at scale? Let's consider:
- Assembling a burger actually requires very precise movements and manipulation of up to a dozen different types of
objects (of heterogeneous shapes) and with complex behavior (burger patties will deform or break up
if picked up but not properly supported, tomatoes are both hard-to-grip and fragile, etc.).
- So manipulating these ingredients requires not just advanced robotics, but advanced computer vision as well. In practice, real
deployments of "robot burger maker" get around this problem by mostly squirting prepared ingredients out of chutes or tubes, and
moving things around on belts instead of more complicated lifting motions.
- So then you still have the problem of "how to fill the chopped lettuce chute with chopped lettuce," and this is trivial to
solve with automation if you have a 2 million sqft lettuce processing factory, but in a more constrained environment, you apparently
need a human to come by every once in a while.
- "Scoop the peanut butter out of the jar and spread it on the bread" - an open research problem, it seems.
- Hilariously, we learned you can't put things like robot pizza makers on pizza delivery trucks, because the actual forces of the truck in motion
will destroy an uncooked pizza in an oven (but not a cooked one in a pizza box).
Automated brick laying
Via DALLE 2: "a complicated brick laying robotic contraption at a construction site, watercolor painting"
Brick laying seems like it'd be straightforward-ish, because it's just laying a set pattern over and over again. But in practice:
- Mortar and brick require complicated handling because mortar is a fluid, and it requires a probably-surprising amount of
precision to apply the right amount in the right place. Also, since a machine would pump it instead of scooping, the
mortar consistency would have to be in a very specific, very narrow range, despite unpredictable ambient temperatures.
- Any robot would have to move along the entire wall to be built, and this is hard because - by definition - things are
under construction; there probably isn't a perfectly level surface to move along, so it has to be able to traverse uneven
ground while still placing bricks accurately. Not impossible, but not trivial.
- It's rare for there to be large sections of uninterrupted brick wall without a corner, window,
door, etc. to build around - those parts mostly have to be laid by hand since they break the usual simple-overlapping-brick pattern.
- Fixing mortar joints to make them presentable is nominally part of brick-laying, but is a totally different problem as far
as physics is concerned. One of the few
commercially deployed systems requires a mason to follow it around and clean the masonry joints / level the
occasionally misplaced brick, as well as a field technician to build the "wall map" and troubleshoot the machine.
Laying bricks is just deceptively complicated [1].
...and automating building construction in general
Unfortunately one component of high housing prices is high construction costs, and productivity in the construction industry
has stagnated for decades (and is maybe even moving backwards in some advanced countries). But making progress with
automation seems to be pretty hard because:
- We've managed to automate much of car manufacturing with very precise stationary robots
that can repeat the same complicated motion over and over again - it's the cars that move on an assembly line.
In contrast, construction sites are very large because buildings are very large, so the robots need to be mobile (or on huge cranes?).
And making complicated robots that move under their own power and
can fit through doorways is...very challenging, apparently.
- Most buildings are bespoke, with an architectural layout specific to the building's requirements, and many individual components
are tailor-made and fitted on-site. Having enough sensors to detect whether something was placed/secured/inserted correctly, and
adjusting as needed, is a challenging problem, and there might be hundreds or thousands of these types of unique tasks in the
construction of a building. The result is that very little of "construction" is repeating the same identical
motion over and over again (even in our brick example, there are literally unsolved edge cases), which reduces the gain from automation while
raising the cost of automating it.
- It's usually not cost effective to pre-fabricate large components and then ship them in. There's a limit to how big components can be,
since they still have to fit on trucks that will go on roads. And the savings can be small, because most
building components are cheap but heavy,
so even large savings on labor will be eaten up by the high cost of transporting finished components over even
medium distances. Because of this, it's
hard for pre-fab building component manufacturers to reach a volume where economies of scale start to really kick in, and
yet - scale is simultaneously limited by how close their raw materials and potential customers are. On top of
this, at least in the US,
pre-fab buildings have a reputation for low quality, which dampens the demand that would be required to reach scale. Most
buildings are still mostly constructed onsite because of this, which limits the amount of factory-quality machinery that can be used.
- Architect drawings are usually not detailed enough to be converted into a plan a machine can follow, since drawing at that level of detail would take
forever and the result would be unreadable.
The drawings have to be interpreted by expert tradesmen familiar with the local building codes, and even the building
codes themselves often have unspoken rules or reference implicit tribal knowledge that roughly ends up being "do whatever is customary
in this situation, given the totality of the circumstances."
This means there's actually a lot of work that goes into "automating" any building construction process, and it's hard to reuse.
[1] I have, however, seen some pretty cool videos of automated brick-laying for brick roads, which seems easier (they lie flat against prepared ground, and no windows or doors!).