The Science of 3D Machine Vision with Svorad Stolc's Photoneo

September 01, 2021 00:24:14
The Robot Industry Podcast

Hosted By

Jim Beretta

Show Notes

Do you know what machines dream of? To finally see the world in motion!

Svorad Štolc is the sensor division CTO at Photoneo. He is an expert in 3D imaging, machine vision, artificial intelligence, and parallel computing. In 2001 he gained a master's degree at Comenius University in Bratislava, Faculty of Mathematics, Physics and Informatics, and he earned a PhD in 2009 jointly at the Slovak Academy of Sciences in Bratislava and the Technical University of Košice. For several years, he worked at AIT Austrian Institute of Technology in Vienna, where he led a research group focused on computational photography and 3D sensing. During this time he published a number of internationally awarded scientific articles. At Photoneo, Svorad is responsible for research and development of the company's 3D sensing technology.

In the podcast we talk about innovation in 3D machine vision, as well as:

Standard Structured Light - Spatial Multiplexing of Light

Parallel Structured Light - Temporal Multiplexing of Shutter

Photoneo has picked up some impressive hardware for their innovations, including the VISION Award 2018, the Vision Innovators Award Bronze 2018 and Platinum 2019, the inVISION Top Innovation 2019 and 2021, as well as the IEEE IERA Award 2020.

Thanks to Michal and to Zuzana at Photoneo for putting this interview together.

If you are building a system and would like to get in touch with Photoneo you can find them at https://www.photoneo.com/

Enjoy the podcast!

Jim B

Customer Attraction & The Robot Industry Podcast

If you would like to get involved with The Robot Industry Podcast, would like to become a guest or nominate someone, you can find me, Jim Beretta on LinkedIn or send me an email to therobotindustry at gmail dot com, no spaces.

Our sponsor for this episode is Ehrhardt Automation Systems. Ehrhardt Automation builds and commissions turnkey automated solutions for their worldwide clients. With over 80 years of precision manufacturing experience, they understand the complex world of automated manufacturing, project management, supply chain management, and delivering world-class custom automation on time and on budget. Contact one of their sales engineers to see what Ehrhardt can build for you at [email protected]

Keywords and terms for this podcast: Photoneo, machine vision, 3D, A3 Association for Advancing Automation, Ehrhardt Automation Systems, #therobotindustrypodcast.


Episode Transcript

Speaker 0 00:00:00 Do you know what machines dream of? To finally see the world in motion!

Speaker 1 00:00:05 <inaudible> Hello everyone.

Speaker 2 00:00:10 And welcome to the A3 The Robot Industry Podcast. We're glad you're here and thank you for subscribing. My name is Jim Beretta and I am your host. We're broadcasting from London, Ontario, and from Bratislava in Slovakia. Today I'd like to introduce our guest from Photoneo. Svorad Štolc is the sensor division CTO at Photoneo. He's an expert in 3D imaging, machine vision, artificial intelligence and parallel computing. In 2001 he gained a master's degree at Comenius University in Bratislava, Faculty of Mathematics, Physics and Informatics, and a PhD degree in 2009 jointly at the Slovak Academy of Sciences in Bratislava and the Technical University of Košice. For several years he worked at AIT, the Austrian Institute of Technology in Vienna, where he led a research group focused on computational photography and 3D sensing. During this time he published a number of internationally awarded scientific articles, and at Photoneo, Svorad is responsible for research and development of the company's 3D sensing technology. So welcome to the podcast, Svorad.

Speaker 0 00:01:16 Hi Jim, thanks for having me here at the podcast.

Speaker 2 00:01:20 And I wanted to mention to the audience out there that Photoneo designs and manufactures state-of-the-art vision sensing technology used in robotics, logistics and automation. So my first question to you, Svorad, is: can you explain your view of the 3D sensor sector, and what are 3D sensors good for?

Speaker 0 00:01:42 Well, it's a complex question. I will rather start with something that has been here for many years, which is 2D imaging. For most of the people in the audience, 2D imaging is a well-known technology. It is very powerful for solving many industrial problems where the object or the scene needs to be understood only in the lateral dimensions. One is basically able to distinguish between objects and how they are arranged laterally, which object is to the left or to the right. But one thing cannot be done by 2D imaging, which is to understand how far away the objects are, or what the composition of the scene is in the third dimension, in depth. What we see nowadays is a need for more complex sensing approaches, because the problems that we, and many industries, are facing these days also require an understanding of the third dimension.

Speaker 0 00:02:52 Examples of these domains are, for instance, vision-guided robotics or automated inspection, and problems from these domains cannot be tackled by good old 2D. So what do 3D sensors bring over standard 2D imaging? 3D sensors give us information about how far away the objects in front of the sensor are, which means we can not only see which object is sideways from another object, we can also see how far one object is from the sensor, or how distant two objects are from each other in the third dimension, in depth. It can be illustrated with a very simple example: for a 2D camera, there is no difference between the real world and a two-dimensional poster put in front of the camera with a three-dimensional scene pictured on it.

Speaker 0 00:03:58 For a 3D camera, there is a huge difference between a 2D poster of a 3D scene and the 3D scene itself. If a 3D camera is looking at the poster, we see a nice image of a 3D scene, but we know that all of these objects are basically painted in one plane, at one distance from the camera. However, if there is a true 3D scene, we see that one object is closer to the sensor and another one is farther away. Moreover, 3D sensors are very capable of understanding the shapes of the objects that need to be handled, or, for example, of recognizing the smallest anomalies on the surfaces of these objects. That means that if one has a 3D sensor incorporated in the system, it is possible to better understand what the best picking or grasping points are, and one can see the smallest deviation from the template, rather than just noticing that there is a two-dimensional stain or, for example, that an entire part is missing. This is the good part about 3D sensors.
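To make the poster example concrete, here is a minimal illustrative sketch in Python (not Photoneo code; the scene, sizes and depth values are invented for the example). A 2D camera would return the same pixels for a poster of a scene and for the scene itself; a depth map tells them apart, for instance by checking how well a single plane explains it.

```python
# Illustrative sketch (not Photoneo code): a 2D image of a poster and of the
# real scene it depicts can be pixel-identical, but their depth maps differ.
# A flat poster's depth map fits a single plane almost perfectly; a real
# scene with protruding objects does not.
import numpy as np

def plane_fit_residual(depth):
    """Fit z = a*x + b*y + c to a depth map and return the RMS residual."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(A, depth.ravel(), rcond=None)
    residual = depth.ravel() - A @ coeffs
    return np.sqrt(np.mean(residual ** 2))

h, w = 120, 160
ys, xs = np.mgrid[0:h, 0:w]

# A poster: every pixel lies on one slightly tilted plane about 1 m away.
poster_depth = 1.0 + 0.0005 * xs + 0.0003 * ys

# A real scene: same "picture", but a box sticks out 30 cm toward the sensor.
scene_depth = poster_depth.copy()
scene_depth[40:80, 60:110] -= 0.30

print("poster residual [m]:", round(plane_fit_residual(poster_depth), 4))  # ~0.0
print("scene residual  [m]:", round(plane_fit_residual(scene_depth), 4))   # clearly > 0
```

The same depth channel is what lets a system judge grasping points or spot a missing part: deviations from the expected 3D shape show up directly as residuals against the template.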
Speaker 2 00:05:28 Okay, thank you. What is structured light? And remember, we're on a podcast, so your explanation has to be very visual. And what are some of the limitations of 3D sensors?

Speaker 0 00:05:40 With this question we are jumping right into the center of today's 3D sensing technology. Structured light is a 3D sensing approach based on the stereoscopic principle. From the technological point of view, it is a stereo system, just like having two eyes as humans have, or two cameras in a stereo setup. But with structured light approaches, instead of one eye we have a pattern projector that sends special light codes into the scene. That means we are not only looking at the scene in front of the sensor from two perspectives and then using correspondence analysis and the triangulation principle to get the depth of certain distinctive points; with structured light, we are deliberately sending information-rich light codes onto the scene from one eye.

Speaker 0 00:06:50 And the other eye is just interpreting these codes to get the information that is needed for the triangulation, in order to detect depth in many points, basically in all points of the scene that are visible to the camera, to the sensor. Structured light is a very popular method because of its accuracy and robustness, and that is the reason why most top brands use this approach. However, it is limited in one fundamental way: it works only when the scene itself, or the objects scanned by the sensor, are stationary. They must not move during the scanning process, because multiple light codes, multiple patterns, are projected onto the scene over time.

Speaker 0 00:07:58 And these light codes are sequentially captured by the camera and reinterpreted in order to get the 3D information. Naturally, if the scene moves during this sequential acquisition process, the light codes get disturbed, the information is inconsistent, and the 3D information cannot be captured correctly, or it cannot be recovered to get the correct 3D information. That is basically the most crucial limitation of these methods. And as I have hinted, Photoneo has solved this long-standing problem with its brand-new technology called Parallel Structured Light.
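As a rough sketch of the sequential structured-light idea described above (a simplified toy, not any vendor's implementation): the projector shows a series of binary stripe patterns, the camera records one image per pattern, and the per-pixel sequence of bright/dark observations forms a code identifying the projector column, which calibrated triangulation then turns into depth. The toy below models a single camera pixel and also shows why motion between frames corrupts the code.

```python
# A simplified sketch of sequential (binary-coded) structured light: the
# projector shows log2(W) stripe patterns one after another; decoding the
# per-pixel on/off sequence recovers which projector column lit that pixel.
# Real systems use Gray codes and calibrated triangulation; this toy version
# only illustrates the principle and its weakness to motion.
import numpy as np

PROJ_COLUMNS = 256              # projector width in columns
NUM_PATTERNS = 8                # log2(256) binary stripe patterns

def pattern(bit, columns):
    """Binary stripe pattern: 1 where the given bit of the column index is set."""
    return (np.arange(columns) >> bit) & 1

def capture_sequence(true_column, motion_per_frame=0):
    """Simulate what one camera pixel sees across the pattern sequence.
    If the scene moves, the observed projector column drifts between frames."""
    observations = []
    for k in range(NUM_PATTERNS):
        col = int(true_column + k * motion_per_frame) % PROJ_COLUMNS
        observations.append(int(pattern(k, PROJ_COLUMNS)[col]))
    return observations

def decode(observations):
    """Reassemble the observed bits into a projector column index."""
    return sum(bit << k for k, bit in enumerate(observations))

static_scene = decode(capture_sequence(true_column=137, motion_per_frame=0))
moving_scene = decode(capture_sequence(true_column=137, motion_per_frame=3))
print("static scene decodes to column:", static_scene)   # 137, the correct code
print("moving scene decodes to column:", moving_scene)   # corrupted code
```

In a real system the decoded column index feeds a projector-to-camera triangulation to produce depth; a corrupted code therefore yields a wrong 3D point, which is exactly the stationary-scene limitation discussed above.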
Speaker 2 00:08:45 Yeah, so let's talk about that, because this is one of these recent innovations. How did it happen, this innovation?

Speaker 0 00:08:56 Well, as with many fundamental discoveries, this one also happened organically, not scheduled or planned by any manager. Photoneo was established in 2013 around this one brilliant idea, which came from our CEO and CTO, and which was fundamentally a very simple idea. The main trick was that when they looked at the state-of-the-art structured light cameras, they realized there is this limitation: we cannot capture moving objects with this. Already back then they realized there would be a need for capturing 3D information from moving objects, like 3D videos. And they realized that one actually cannot send multiple light patterns onto the scene in parallel; but what one can do is shift the structured light trick onto the camera side, send only a single pattern onto the scene, and then do the structured light trick in a specially designed CMOS sensor.

Speaker 0 00:10:19 It sounds complicated, but in fancy technological words, the idea is to swap spatial multiplexing of pattern projection for temporal shutter multiplexing within the pixels of a custom multi-tap CMOS sensor. This is basically implemented by special exposure blinking inside the camera. Again, it implements the same principle that structured light sensors employ, but it is entirely implemented on the camera side, not on the projection side, where many or all of the patterns can be captured simultaneously, which means extremely fast.

Speaker 2 00:11:18 So you just switched it around, right? You're now letting the camera do the work, or the pixels in the CMOS do the work?

Speaker 0 00:11:25 Exactly. Because yet again, if one wants to project multiple spatial light patterns onto the scene at the same time, if you overlap multiple patterns, you get nonsense. But if you have multiple eyes and each eye implements a different pattern, the eyes can do things in parallel, while multiple projectors cannot project different patterns in parallel. So this is the trick: we swapped the roles of the camera and the projector, and essentially do the same thing as one does in standard structured light, but with the use of a custom-made CMOS sensor.
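The following toy model is only an illustration of the "swap" described above, under my own simplifying assumptions; it is not Photoneo's actual sensor design. The idea it mimics: light reaches a given pixel at some instant during a single exposure (for example, when a swept illumination pattern crosses it), and several per-pixel taps, each gated by a different temporal on/off code, record enough information to reconstruct that instant, so the coding happens in time inside the camera instead of as a sequence of projected patterns.

```python
# A toy numerical model (an illustration only, not Photoneo's design) of
# temporal shutter multiplexing: light reaches a pixel in exactly one time
# slot of the exposure. Each of the K taps in the pixel is gated by a
# different binary on/off code over the slots, so the set of tap charges
# encodes WHEN the light arrived, all within a single exposure and with no
# sequence of projected patterns.
import numpy as np

TIME_SLOTS = 64                     # discretised positions of the sweep
NUM_TAPS = 6                        # log2(64) gated taps per pixel (toy value)

# Gating code for tap k: the tap is open during slots whose k-th bit is 1.
gates = np.array([[(t >> k) & 1 for t in range(TIME_SLOTS)]
                  for k in range(NUM_TAPS)])

def expose_pixel(arrival_slot, intensity=1.0):
    """Integrate the incoming light into the taps during one exposure."""
    light = np.zeros(TIME_SLOTS)
    light[arrival_slot] = intensity          # impulse when the sweep hits the pixel
    return gates @ light                     # charge accumulated per tap

def decode_arrival(tap_charges):
    """Recover the arrival slot from which taps collected charge."""
    bits = (tap_charges > 0).astype(int)
    return int(sum(b << k for k, b in enumerate(bits)))

taps = expose_pixel(arrival_slot=37)
print("tap charges:", taps)
print("decoded arrival slot:", decode_arrival(taps))   # recovers 37
```

Because all taps integrate within the same exposure, the code for every pixel is gathered simultaneously, which is why a moving scene no longer scrambles it the way a sequential pattern series does.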
Speaker 2 00:12:12 You know what, that's very brilliant. And what do you call that again?

Speaker 0 00:12:16 It is called Parallel Structured Light, because we are doing structured light patterning in parallel, which is not possible with a conventional CMOS sensor. And here I would like to point out that designing a custom CMOS sensor is not an easy job. It illustrates the mindset of the people working at Photoneo, and of Photoneo as a whole: the expertise of the entire team of engineers, as well as of our founders and core people, is so broad and so deep that we were able to design our own CMOS sensor ourselves, as well as build entire robotic solutions around it and deliver complete solutions capable of solving fundamental problems.

Speaker 2 00:13:21 Svorad, people are talking a lot about artificial intelligence. What's your perspective on AI and machine vision?

Speaker 0 00:13:30 Well, coming back to the Photoneo company, I would like to say that the mission of Photoneo is to deliver not only eyes but also intelligence to machines. With 3D sensors, machines can see what is around them, what is in front of the sensor, but without intelligence they do not understand what they are looking at. So obviously AI is one of our core technologies, and we are investing a lot of effort into taming this technology and making use of it. And speaking of 3D sensors, because that's the focus of this podcast, we see the greatest potential in the capability of generating huge amounts of data. Our 3D sensor is not only capable of capturing a nice point cloud or a nice 3D data set, it is also able to do it very fast.

Speaker 0 00:14:35 So it can really provide data sets that are much more information-rich than other technologies may provide. Therefore we see our sensor as a technology that greatly supports AI, because what we see in current AI technologies and trends is that it is no longer about the AI model; it is more about access to the right data, data that describes the given problem domain well. If one has access to the data, it is very likely that an AI system trained on this data will work fine. If one doesn't have access to the right data, it is very likely that it won't work, regardless of the complexity or the specific details of the AI model.

Speaker 0 00:15:44 So that's the reason why we strongly believe in our sensor as a support tool for functional and well-performing AI systems.

Speaker 2 Svorad, what makes it so fast?

Speaker 0 Again, it is our Parallel Structured Light technology that makes it very fast. Instead of taking multiple images over time, which takes a certain amount of time, and then analyzing and interpreting them and turning them into 3D information, we can capture the entire 3D information in a single shot, with a single pattern, and the rest of the structured light patterning is done on the sensor side, in parallel.
Speaker 2 00:16:42 Thank you for that. Svorad, how do you work with customers?

Speaker 0 00:16:48 Well, we focus on this new technology, on Parallel Structured Light. Recently, a couple of months ago, we released a brand-new product called MotionCam-3D that brings this very nice and very powerful technology to our customers. It seems that our technology has been recognized by the broad expert community, because we won all major honors, such as the VISION Award in 2018, the Vision Innovators Award Bronze and Platinum in 2018 and 2019, the inVISION Top Innovation award in 2019 and 2021, as well as, recently, the IEEE IERA Award 2020. We believe it has been recognized because it basically erases the old dilemma in which an engineer trying to deploy a 3D system needs to choose either the accuracy or level of detail of the 3D sensor, or the speed; which means either I want a lot of detail and admit that my system is very slow, or I have a quite fast system but admit that I do not get much detail from the sensor. With our new MotionCam-3D, we deliver both at the same time.

Speaker 0 00:18:16 And we like to make the analogy with dinosaur eyes that developed over thousands of years. Those huge animals back then thought their eyes were good enough, but compared to the human eye, which delivers both the detail and the speed at the same time, those old eyes were, you know, just too simple. We believe we are approximately at that point right now: we have brought to the broad expert community, to the broad engineering community, a new sensor that doesn't put the engineer into an awkward situation where a certain compromise needs to be made. And we are obviously opening up a new market with this new technology. Those people who actually understood the technology fell in love with it.

Speaker 0 00:19:21 They really wanted to get their hands on it very fast, and our production is still fully occupied building and providing these sensors to all sorts of customers interested in the technology. But there is still a huge part of the expert community which, I think, just hasn't comprehended yet what is possible with this technology. We are working very hard on educating these people and on providing them with a sufficient number of examples, so that they broaden the range of applications that are now solvable with this new sensor. We understand that it is normal for every breakthrough to need a certain adoption time, and that is exactly where we are right now with our Parallel Structured Light technology.

Speaker 2 00:20:25 I was going to ask you as well if you have some examples of applications that you can talk about.

Speaker 0 00:20:34 Yes. I am not allowed to speak very openly about our applications, but as I said, as we speak it is opening up a whole range of applications. For example, in the automotive sector we see a big interest in real-time 3D inspection, where huge point clouds need to be fused together for inspection purposes in an environment with a lot of vibration and a lot of motion, and where basically the entire configuration prevents the integrator from using standard technologies that are readily available on the market. Our MotionCam is the answer to that, because we suffer no motion blur; we deliver crisp, detailed point clouds at high speed that can be fused together in real time in order to do the 3D inspection task sufficiently and to the satisfaction of the customer.
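As a hedged sketch of the fusion step mentioned above, assuming each scan comes with a known sensor pose (from calibration or robot kinematics); a production pipeline would typically refine the alignment further with a registration method such as ICP, which is omitted here.

```python
# Simplified sketch of multi-view point-cloud fusion (illustrative only):
# each scan comes with the 4x4 pose of the sensor in the world frame, so the
# points are transformed into that common frame and concatenated. A voxel
# grid keeps the merged cloud from growing without bound.
import numpy as np

def transform(points, pose):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
    return (homogeneous @ pose.T)[:, :3]

def voxel_downsample(points, voxel=0.005):
    """Keep one point per occupied voxel (5 mm by default)."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(idx)]

def fuse(scans_with_poses, voxel=0.005):
    """Merge (points, pose) pairs into one cloud in the world frame."""
    world = np.vstack([transform(p, T) for p, T in scans_with_poses])
    return voxel_downsample(world, voxel)

# Two toy scans of the same part seen from sensors 10 cm apart along x.
rng = np.random.default_rng(0)
part = rng.uniform(-0.05, 0.05, size=(2000, 3))
pose_a = np.eye(4)
pose_b = np.eye(4)
pose_b[0, 3] = 0.10
scan_a = part                                     # sensor A coincides with world frame
scan_b = transform(part, np.linalg.inv(pose_b))   # same part as seen from sensor B

fused = fuse([(scan_a, pose_a), (scan_b, pose_b)])
print("fused cloud size:", fused.shape)
```

The practical point from the conversation is that this kind of real-time fusion only works if the individual scans are free of motion blur; blurred clouds give the merging step nothing sharp to align.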
Speaker 2 00:21:46 That's great. Well, listen, Svorad, thank you very much for being on the podcast. How can people get in touch with you?

Speaker 0 00:21:55 I'd encourage everyone who is interested in our technology to visit our website, www.photoneo.com, where they can find all the important contact information for our sales support and local offices. So everyone who is interested, feel free to write us a question, and we will gladly answer it.

Speaker 2 00:22:17 Well, thanks again. Our sponsor for this episode is Ehrhardt Automation Systems. Ehrhardt builds and commissions turnkey automation solutions for their worldwide clients. With over 80 years of precision manufacturing, they understand the complex world of robotics, automated manufacturing and project management, delivering world-class custom automation on time and on budget. Contact one of their sales engineers to see what Ehrhardt can build for you at [email protected], and Ehrhardt is E-H-R-H-A-R-D-T. I'd like to thank and acknowledge our partner A3, the Association for Advancing Automation. They are the leading automation trade association in the world for robotics, vision and imaging, motion control and motors, and industrial artificial intelligence technologies; visit automate.org for more. I'd also like to thank our partner Painted Robot. Painted Robot builds and integrates digital solutions. They're a web development firm that offers SEO and digital social marketing, and can set up and connect CRM or ERP tools to unify marketing, sales and operations, and you can find Painted Robot at [email protected]. And if you'd like to get in touch with us at The Robot Industry Podcast, you can find me, Jim Beretta, on LinkedIn. We'll see you next time. Thanks for listening. Be safe out there. Today's podcast was produced by Customer Attraction Industrial Marketing, and I would like to thank my nephew Chris Gray for the music, Chris Colvin for audio production, my partner Janet, and our partners <inaudible>, Painted Robot, and our sponsor, Ehrhardt Automation. <inaudible>
