The emerging Industry 4.0 paradigm facilitates intelligent factories where machines, robots and human workers work side by side. Collaboration between these entities in a factory environment enables higher throughput and the ability to perform complex processes that require skills from both robots and humans. For the machines, properly understanding the surrounding factory environment is crucial for ensuring safety and keeping efficiency at a maximum.
Currently, in most cases robots operate in specific confined areas called “robot cells” and have limited perception of the environment. Robot cells are usually physically separated from their surroundings, with specific entry points that allow human operators or other robots, such as AGVs, to interact. In most implementations seen today, these interactions are kept minimal and simple. Safety sensors detect when something or someone crosses the boundaries of the robot cell; typically, such a detection slows down or completely stops the robot's operation.
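The zone-based safety behaviour described above can be sketched as a simple mapping from boundary events to a speed override. This is an illustrative sketch only; the zone names and scaling values are assumptions, and real cell safety logic runs on certified safety hardware, not application code.

```python
from enum import Enum

class ZoneEvent(Enum):
    CLEAR = "clear"            # nothing inside the monitored boundaries
    WARNING_ZONE = "warning"   # outer boundary crossed
    SAFETY_ZONE = "safety"     # inner boundary crossed

def speed_override(event: ZoneEvent) -> float:
    """Return a speed scaling factor for the robot (1.0 = full speed).

    Illustrative only: the zones and factors here are invented for the
    sketch; certified safety controllers implement the real behaviour.
    """
    if event is ZoneEvent.SAFETY_ZONE:
        return 0.0   # protective stop
    if event is ZoneEvent.WARNING_ZONE:
        return 0.25  # reduced speed while the outer zone is occupied
    return 1.0       # normal operation

print(speed_override(ZoneEvent.WARNING_ZONE))  # 0.25
```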
Figure 1: Human-robot collaboration 
However, in a collaborative robotic workspace, different robots and human operators need to share the same work area, and sometimes the same workpieces, to achieve a common goal. Diverging from individual robot cells, collaborative robots work together with other robots or humans, complementing each other in completing given tasks. Human operators and collaborative robots, also called “cobots”, can work together in different time-sharing modes such as synchronized, continuous or alternating operation to achieve a common task. This calls for an open-boundary environment where the machines are seamlessly connected and have an advanced perception of their surroundings. These machines should be capable of not only sensing in 3D but also detecting, identifying and understanding the 3D space.
Figure 2: Ouster OS0-128 LIDAR 
Light Detection and Ranging (LIDAR) sensors have been recognized as a prime means of 3D perception thanks to their decreasing cost and increasing output quality. Semantic segmentation directly on the point cloud data from LIDAR sensors has become feasible with the introduction of faster hardware and the advancement of deep learning. Semantic segmentation offers finer detail about the surrounding environment than object detection and can be extended to instance segmentation and part segmentation. This information can be used to calculate the space occupied by different objects and to predict their movements and actions. Ultimately, this allows for building a spatially aware robotic system that can work collaboratively with human workers safely and efficiently.
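In point-wise semantic segmentation, the network assigns one class label to every point in the cloud. The sketch below shows only the data shapes involved, with random logits standing in for a trained network; the class list is an invented example for a factory scene, not the classes used in the demonstrator.

```python
import numpy as np

rng = np.random.default_rng(0)

# One LIDAR frame: N points with (x, y, z) coordinates.
points = rng.uniform(-5.0, 5.0, size=(2048, 3))

# Hypothetical class set for a factory scene (assumption for the sketch).
classes = ["floor", "robot", "human", "agv", "workpiece", "background"]

# A segmentation network maps the frame to per-point class logits of
# shape (N, C); random logits stand in for a trained model here.
logits = rng.normal(size=(points.shape[0], len(classes)))

# Argmax over the class axis yields one label per point.
labels = logits.argmax(axis=1)
print(labels.shape)  # (2048,)
```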
Figure 3: Semantic segmentation of a road scene in a 2D image 
The Institute of Mechatronic Systems (IMS) at the Zurich University of Applied Sciences has implemented such an intelligent collaborative workspace in our Smart Factory Demonstrator. In the demonstrator, a robot manipulator is responsible for assembling a pen from parts arriving on a conveyor carrier. When needed, a second robot manipulator standing by can be added to the system as a secondary assembly robot: an AGV carries it to the assembly station where the primary robot is working. Furthermore, when the assembly of the pen is completed, the robot hands it over to a human operator for inspection. The human operator can also intervene when a part is missing or the robots need help.
Figure 4: Robots and AGV in the collaborative workspace
In this use case, multiple LIDARs are used to obtain a complete perception of the entire environment in real time. Their data are fed through a perception pipeline for preprocessing steps such as alignment and filtering before being passed to the deep neural network responsible for semantically segmenting the 3D space. Semantic segmentation is usually performed with supervised learning, using neural networks trained on point-wise annotated 3D point cloud data.
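The alignment and filtering steps can be sketched as follows: each sensor's cloud is rigidly transformed into a common world frame using known extrinsics, the clouds are merged, and a voxel-grid filter thins the result before inference. The extrinsic values below are made up for illustration; the actual pipeline and its parameters are not specified in the article.

```python
import numpy as np

def transform(points, R, t):
    """Rigidly transform an (N, 3) cloud into a common world frame."""
    return points @ R.T + t

def voxel_downsample(points, voxel=0.05):
    """Keep one point per occupied voxel to reduce density before inference."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(idx)]

# Two sensors observing the same scene; extrinsics would come from an
# offline calibration (the values here are invented for the sketch).
rng = np.random.default_rng(1)
cloud_a = rng.uniform(0.0, 1.0, size=(1000, 3))
R_b, t_b = np.eye(3), np.array([2.0, 0.0, 0.0])
cloud_b = transform(rng.uniform(0.0, 1.0, size=(1000, 3)), R_b, t_b)

merged = np.vstack([cloud_a, cloud_b])          # aligned, combined cloud
filtered = voxel_downsample(merged, voxel=0.1)  # thinned for the network
print(merged.shape, filtered.shape)
```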
Figure 5: Semantic Segmentation of the collaborative workspace
Multiple state-of-the-art deep neural network models were trained on a dataset obtained from the environment and later deployed in the real-time system. The system was able to detect and segment multiple classes of objects in the scene with very high accuracy. Semantically segmented high-density point clouds can be consumed by different systems to stay aware of their environment at all times, with fine-grained information on surrounding entities, spatial occupancy and even behaviours. The IMS is carrying out further research on semantic understanding and event prediction to build a fully spatially aware, intelligent collaborative factory environment.
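One way a downstream system can turn a labeled point cloud into spatial-occupancy information is to count the distinct voxels each class occupies; multiplying a count by the voxel volume gives a rough per-class volume estimate. This is a minimal sketch of the idea, not the method used in the demonstrator.

```python
import numpy as np

def occupied_voxels(points, labels, voxel=0.1):
    """Map each semantic class to the number of distinct occupied voxels.

    count * voxel**3 approximates the volume occupied by that class,
    which is useful for workspace and clearance checks.
    """
    counts = {}
    for cls in np.unique(labels):
        keys = np.floor(points[labels == cls] / voxel).astype(np.int64)
        counts[int(cls)] = len(np.unique(keys, axis=0))
    return counts

# Toy labeled cloud: class 0 sits in one voxel, class 1 spans two.
points = np.array([
    [0.01, 0.01, 0.01], [0.02, 0.03, 0.04],   # class 0, same 0.1 m voxel
    [1.00, 0.00, 0.00], [1.20, 0.00, 0.00],   # class 1, two voxels
])
labels = np.array([0, 0, 1, 1])
print(occupied_voxels(points, labels))  # {0: 1, 1: 2}
```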
“Collaborative Robots Working In Manufacturing | ManufacturingTomorrow.” https://www.manufacturingtomorrow.com/article/2016/02/collaborative-robots-working-in-manufacturing/7672/ (accessed May 04, 2021).
S. Bragança, E. Costa, I. Castellucci, and P. M. Arezes, “A brief overview of the use of collaborative robots in industry 4.0: Human role and safety,” in Studies in Systems, Decision and Control, vol. 202, Springer International Publishing, 2019, pp. 641–650.
M. Olender and W. Banas, “Cobots – future in production,” International Journal of Modern Manufacturing Technologies, vol. 11, no. 3 Special Issue, 2019.
F. Vicentini, “Collaborative Robotics: A Survey,” Journal of Mechanical Design, Transactions of the ASME, vol. 143, no. 4, 2021, doi: 10.1115/1.4046238.
“High-performance digital lidar: autonomous vehicles, robotics, industrial a... | Ouster.” https://ouster.com/ (accessed Apr. 28, 2021).
S. Ghosh, N. Das, I. Das, and U. Maulik, “Understanding Deep Learning Techniques for Image Segmentation,” 2019, doi: 10.1145/3329784.