I am an IT engineer and computer vision scientist, experienced in software development and in leading university laboratory practice sessions. My interests include computer vision and graphics, digital image processing, scene understanding, 2D/3D object detection and recognition, machine learning, neural networks, deep learning, feature extraction, artificial intelligence, sensor fusion, robotics, LiDAR technologies, and coding. Currently I am a PhD student in deep learning and computer vision at Pázmány Péter Catholic University, and a developer and research scientist at the Institute for Computer Science and Control (SZTAKI) of the Hungarian Academy of Sciences.
Demo page:
Contact:
  • balazs.nagy.it@gmail.com
  • balazs.nagy@sztaki.mta.hu
Publications

Papers

B. Nagy, L. Kovács and Cs. Benedek: "SfM and Semantic Information Based Online Targetless Camera-LIDAR Self-calibration", 26th IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, September 22-25, 2019 download

In this paper we propose an end-to-end, automatic, online camera-LIDAR calibration approach for application in self-driving vehicle navigation. The main idea is to connect the image domain and the 3D space by generating point clouds from camera data while driving, using a structure from motion (SfM) pipeline, and to use these as the basis for registration. As a core step of the algorithm we introduce an object-level alignment to transform the generated and captured point clouds into a common coordinate system. Finally, we calculate the correspondences between the 2D image domain and the 3D LIDAR point clouds to produce the registration. We evaluated the method in various real-life traffic scenarios.
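The final correspondence step amounts to projecting the aligned LIDAR points through the camera model. A minimal sketch of that projection, assuming a standard pinhole model with hypothetical extrinsics (R, t) coming from the object-level alignment step and an intrinsic matrix K (illustrative only, not the paper's actual implementation):

```python
import numpy as np

def project_points(points_lidar, R, t, K):
    """Project Nx3 LIDAR points into the image plane.
    R, t: assumed extrinsics from the alignment step; K: 3x3 intrinsics."""
    cam = points_lidar @ R.T + t          # LIDAR frame -> camera frame
    cam = cam[cam[:, 2] > 0]              # keep only points in front of the camera
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]         # perspective division -> pixel coordinates
```

Points behind the camera are discarded before the perspective division; a real calibration pipeline would additionally model lens distortion.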

Y. Ibrahim, B. Nagy and Cs. Benedek: "CNN-based Watershed Marker Extraction for Brick Segmentation in Masonry Walls", International Conference on Image Analysis and Recognition (ICIAR), Waterloo, Canada, August 27-29, 2019, Lecture Notes in Computer Science, Springer, 2019 download

Nowadays there is an increasing need for using artificial intelligence techniques in image-based documentation and survey in archeology, architecture or civil engineering applications. Brick segmentation is an important initial step in the documentation and analysis of masonry wall images. However, due to the heterogeneous material, size, shape and arrangement of the bricks, it is highly challenging to develop a widely adoptable solution for the problem via conventional geometric and radiometry-based approaches. In this paper, we propose a new technique which combines the strength of deep learning for brick seed localization, and the Watershed algorithm for accurate instance segmentation. More specifically, we adopt a U-Net-based delineation algorithm for robust marker generation in the Watershed process, which provides as output the accurate contours of the individual bricks, and also separates them from the mortar regions. For training the network and evaluating our results, we created a new test dataset which consists of 162 hand-labeled images of various wall categories. Quantitative evaluation is provided both at instance and at pixel level, and the results are compared to two reference methods proposed for wall delineation, and to a morphology-based brick segmentation approach. The experimental results showed the advantages of the proposed U-Net-markered Watershed method, providing average F1-scores above 80%.
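The marker-then-flood idea can be illustrated with a toy stand-in: instead of the true Watershed transform, the sketch below simply grows seed labels (as a U-Net marker head might produce them after thresholding) over the whole grid by multi-source BFS. Labels, grid size and the flooding rule are illustrative assumptions, not the paper's method:

```python
import numpy as np
from collections import deque

def grow_markers(markers):
    """Grow integer seed labels (0 = unlabeled) over the full grid by
    multi-source BFS -- a simplified stand-in for Watershed flooding."""
    labels = np.array(markers, dtype=int)
    h, w = labels.shape
    q = deque((y, x) for y in range(h) for x in range(w) if labels[y, x] > 0)
    while q:
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == 0:
                labels[ny, nx] = labels[y, x]   # claim pixel for this seed
                q.append((ny, nx))
    return labels
```

The real method floods along an elevation function so region boundaries follow mortar edges; here every pixel is simply assigned to the seed that reaches it first.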

B. Nagy, L. Kovács and Cs. Benedek: "Online Targetless End-to-End Camera-LIDAR Self-calibration", International Conference on Machine Vision Applications, Tokyo, Japan, May 27-31 2019 download

In this paper we propose an end-to-end, automatic, online camera-LIDAR calibration approach for application in self-driving vehicle navigation. The main idea is to connect the image domain and the 3D space by generating point clouds from camera data while driving, using a structure from motion (SfM) pipeline, and to use these as the basis for registration. As a core step of the algorithm we introduce an object-level alignment to transform the generated and captured point clouds into a common coordinate system. Finally, we calculate the correspondences between the 2D image domain and the 3D LIDAR point clouds to produce the registration. We evaluated the method in various real-life traffic scenarios.

B. Nagy, and Cs. Benedek: "Real-time point cloud alignment for vehicle localization in a high resolution 3D map", 6th Workshop on Computer vision for road scene understanding and autonomous driving at ECCV, Lecture Notes in Computer Science, Munich, Germany, September 8-14 2018 download

In this paper we introduce a Lidar-based, real-time and accurate self-localization approach for self-driving vehicles (SDV) in high resolution 3D point cloud maps of the environment obtained through Mobile Laser Scanning (MLS). Our solution is able to robustly register the sparse point clouds of the SDVs to the dense MLS point cloud data, starting from a GPS-based initial position estimate of the vehicle. The main steps of the method are robust object extraction and transformation estimation based on multiple keypoints extracted from the objects, and additional semantic information derived from the MLS-based map. We tested our approach on roads with heavy traffic in the downtown of a large city with large GPS positioning errors, and showed that the proposed method improves the matching accuracy by an order of magnitude. Comparative tests are provided with various keypoint selection strategies, and against a state-of-the-art technique.
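The transformation-estimation step from matched keypoints can be sketched with the classic Kabsch/SVD least-squares rigid alignment, a generic textbook method and not necessarily the exact estimator used in the paper:

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src -> dst keypoint
    sets (Nx3, in corresponding order) via the Kabsch/SVD method."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)          # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T)) # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t
```

In practice the correspondences coming from object-level matching are noisy, so such an estimator is typically wrapped in RANSAC or followed by a point-level refinement such as ICP, as the paper's pipeline suggests.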

B. Nagy, and Cs. Benedek: "3D CNN Based Phantom Object Removing from Mobile Laser Scanning Data", International Joint Conference on Neural Networks (IJCNN), pp. 4429-4435, Anchorage, Alaska, USA, 14-19 May, 2017 download

In this paper we introduce a new deep learning based approach to detect and remove phantom objects from point clouds produced by mobile laser scanning (MLS) systems. The phantoms are caused by the presence of scene objects moving concurrently with the MLS platform, and appear as long, sparse but irregular point cloud segments in the measurements. We propose a new 3D CNN framework working on a voxelized column-grid to identify the phantom regions. We quantitatively evaluate the proposed model on real MLS test data, and compare it to two different reference approaches.
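The voxelized column-grid input can be sketched as a sparse map from ground-plane cells to binary vertical occupancy vectors. The cell size, height range and number of height bins below are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

def voxelize_columns(points, cell=0.25, height_bins=32, z_min=0.0, z_max=8.0):
    """Rasterize an Nx3 point cloud into a sparse dict mapping each
    occupied ground-plane cell to a binary vertical occupancy vector."""
    grid = {}
    dz = (z_max - z_min) / height_bins
    for x, y, z in points:
        if not (z_min <= z < z_max):
            continue                                    # outside height range
        key = (int(x // cell), int(y // cell))          # ground-plane cell index
        col = grid.setdefault(key, np.zeros(height_bins, dtype=np.uint8))
        col[int((z - z_min) // dz)] = 1                 # mark occupied height bin
    return grid
```

A 3D CNN then consumes such columns (densified over a local window) to decide which regions are phantom artifacts; the sparse dict keeps memory proportional to the occupied cells only.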

B. Gálai, B. Nagy and Cs. Benedek: "Crossmodal Point Cloud Registration in the Hough Space for Mobile Laser Scanning Data", International Conference on Pattern Recognition (ICPR), pp. 3363-3368, Cancun, Mexico, 4-8 December 2016 download

In this paper we propose a general approach for registration of point clouds obtained by various mobile laser scanning technologies. Our method is able to robustly match measurements with significantly different density characteristics, including the sparse and inhomogeneous instant 3D (I3D) data taken by self-driving cars, and the dense and regular point clouds captured by mobile mapping systems (MMS) for virtual city generation. The core steps of the algorithm are robust scan segmentation, abstract street object extraction, object-based coarse transformation estimation in the Hough accumulator space, and point-level registration refinement. Experimental results are provided using three different sensors: Velodyne HDL64 and VLP16 I3D scanners, and a Riegl VMX450 MMS. Application examples are shown regarding self-localization of autonomous cars through crossmodal I3D and MMS frame registration, IMU-less SLAM, and change detection based on I3D data.
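Object-based coarse estimation in a Hough accumulator can be illustrated in its simplest form: every pair of object centroids from the two scans votes for a discretized 2D translation bin, and the accumulator peak gives the coarse alignment. The actual method also estimates rotation; bin sizes and ranges here are illustrative:

```python
import numpy as np

def hough_translation(objs_a, objs_b, step=0.5, max_t=10.0):
    """Coarse 2D translation between two scans from object centroid pairs,
    via peak detection in a discretized Hough accumulator."""
    bins = int(2 * max_t / step)
    acc = np.zeros((bins, bins))
    for a in objs_a:
        for b in objs_b:
            d = np.asarray(b) - np.asarray(a)           # candidate translation
            ix = int((d[0] + max_t) // step)
            iy = int((d[1] + max_t) // step)
            if 0 <= ix < bins and 0 <= iy < bins:
                acc[ix, iy] += 1                        # vote for this bin
    ix, iy = np.unravel_index(acc.argmax(), acc.shape)
    # return the center of the winning bin
    return (ix + 0.5) * step - max_t, (iy + 0.5) * step - max_t
```

Correctly corresponding object pairs all vote for the same bin, so the true translation dominates even though most pairs are wrong; this robustness to outliers is the appeal of accumulator-space estimation.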

Cs. Benedek, B. Nagy, B. Gálai and Z. Jankó: "Lidar-based Gait Analysis in People Tracking and 4D Visualization", European Signal Processing Conference (EUSIPCO), Nice, France, August 31-September 4, 2015 download

In this paper we introduce a new approach to gait analysis based on data streams of a Rotating Multi-Beam (RMB) Lidar sensor. The gait descriptors for training and recognition are observed and extracted in realistic outdoor surveillance scenarios, where multiple pedestrians walk concurrently in the field of interest, while occlusions or background noise may affect the observation. The proposed algorithms are embedded into an integrated 4D vision and visualization system, which is able to analyze and display from a free viewpoint real dynamic scenarios in natural outdoor environments. Gait features are exploited in two different components of the workflow. First, in the tracking step the collected characteristic gait parameters serve as biometric descriptors for the re-identification of people who temporarily leave the field of interest and reappear later. Second, in the visualization module, we display moving avatar models which follow in real time the trajectories of the observed pedestrians with synchronized leg movements. The proposed approach is experimentally demonstrated in seven multi-target scenes.

A. Börcs, B. Nagy, M. Baticz and Cs. Benedek: "A Model-based Approach for Fast Vehicle Detection in Continuously Streamed Urban LIDAR Point Clouds," Workshop on Scene Understanding for Autonomous Systems at ACCV, Lecture Notes in Computer Science, Singapore, November 1-5 2014 download

Detection of vehicles in crowded 3-D urban scenes is a challenging problem in many computer vision related research fields, such as robot perception, autonomous driving, self-localization, and mapping. In this paper we present a model-based approach to solve the recognition problem from 3-D range data. In particular, we aim to detect and recognize vehicles from continuously streamed LIDAR point cloud sequences of a rotating multi-beam laser scanner. The end-to-end pipeline of our framework working on the raw streams of 3-D urban laser data consists of three steps: 1) producing distinct groups of points which represent different urban objects; 2) extracting reliable 3-D shape descriptors specifically designed for vehicles, considering the need for fast processing speed; 3) executing binary classification on the extracted descriptors in order to perform vehicle detection. The extraction of our efficient shape descriptors provides a significant speedup and increased detection accuracy compared to a PCA-based 3-D bounding box fitting method used as baseline.
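The flavor of a fast, vehicle-oriented shape descriptor can be sketched as follows: an oriented 2D bounding box from PCA in the ground plane plus the object height. This is only an illustrative stand-in; the paper's actual feature set differs:

```python
import numpy as np

def box_descriptor(points):
    """Lightweight shape features for one object's point group:
    oriented ground-plane box extents (via PCA), height, point count."""
    pts = np.asarray(points, dtype=float)
    xy = pts[:, :2] - pts[:, :2].mean(0)      # center ground-plane coordinates
    _, _, Vt = np.linalg.svd(xy, full_matrices=False)
    proj = xy @ Vt.T                          # rotate into principal axes
    length, width = proj.max(0) - proj.min(0)
    height = pts[:, 2].max() - pts[:, 2].min()
    return np.array([length, width, height, len(pts)])
```

Such a descriptor is a few matrix operations per object, which is why it can keep up with streaming data where heavier per-point features cannot.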

A. Börcs, B. Nagy and Cs. Benedek: "Fast 3-D Urban Object Detection on Streaming Point Clouds", 2nd Workshop on Computer vision for road scene understanding and autonomous driving at ECCV, Lecture Notes in Computer Science, Zurich, Switzerland, September 6-12, 2014 download

Efficient and fast object detection from continuously streamed 3-D point clouds could have a major impact on many related research tasks, such as autonomous driving, self-localization and mapping, and large-scale environment understanding. This paper presents a LIDAR-based framework, which provides fast detection of 3-D urban objects from point cloud sequences of a Velodyne HDL-64E terrestrial LIDAR scanner installed on a moving platform. The pipeline of our framework receives raw streams of 3-D data, and produces distinct groups of points which belong to different urban objects. In the proposed framework we present a simple, yet efficient hierarchical grid data structure and corresponding algorithms that significantly improve the processing speed of the object detection task. Furthermore, we show that this approach confidently handles streaming data, and provides a speedup of two orders of magnitude, with increased detection accuracy compared to a baseline connected component analysis algorithm.
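The speedup idea behind the grid structure can be illustrated by its simplest variant: bin the (assumed ground-removed) points into a coarse 2D grid and merge occupied, 8-connected cells, so connected component analysis runs over cells rather than individual points. Cell size and the single-level structure are illustrative simplifications of the hierarchical grid:

```python
from collections import defaultdict, deque

def grid_cluster(points, cell=0.5):
    """Cluster (x, y, z) points into objects by merging occupied,
    8-connected ground-plane grid cells; returns lists of point indices."""
    cells = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        cells[(int(x // cell), int(y // cell))].append(i)   # bin points into cells
    clusters, seen = [], set()
    for start in cells:
        if start in seen:
            continue
        comp, q = [], deque([start])
        seen.add(start)
        while q:                                            # BFS over occupied cells
            cx, cy = q.popleft()
            comp.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in cells and nb not in seen:
                        seen.add(nb)
                        q.append(nb)
        clusters.append(sorted(comp))
    return clusters
```

Because the number of occupied cells is far smaller than the number of points, the merge step becomes nearly independent of cloud density, which is where an order-of-magnitude class speedup over point-wise connected components comes from.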

A. Börcs, B. Nagy and Cs. Benedek: "On board 3D Object Perception in Dynamic Urban Scenes," IEEE International Conference on Cognitive Infocommunications, pp. 515-520, Budapest, Hungary, 2013 download

In urban environments, object recognition and road monitoring are key issues for driving assistance systems or autonomous vehicles. This paper presents a LIDAR-based perception system which provides reliable detection of 3D urban objects from point cloud sequences of a Velodyne HDL-64E terrestrial LIDAR scanner installed on a moving platform. As for the output of the system, we perform real-time localization and identification of typical urban objects, such as traffic signs, vehicles or crosswalks. In contrast to most existing works, the proposed algorithm does not use hand-labeled training datasets to perform object classification. Experimental results are carried out on real LIDAR measurements in the streets of Budapest, Hungary.

Journals

B. Nagy and C. Benedek, "3D CNN-Based Semantic Labeling Approach for Mobile Laser Scanning Data," in IEEE Sensors Journal, vol. 19, no. 21, pp. 10034-10045, Nov. 1, 2019. doi: 10.1109/JSEN.2019.2927269 download

In this paper, we introduce a 3D convolutional neural network (CNN)-based method to segment point clouds obtained by mobile laser scanning (MLS) sensors into nine different semantic classes, which can be used for high definition city map generation. The main purpose of semantic point labeling is to provide a detailed and reliable background map for self-driving vehicles (SDV), which indicates the roads and various landmark objects for navigation and decision support of SDVs. Our approach considers several practical aspects of raw MLS sensor data processing, including the presence of diverse urban objects, varying point density, and strong measurement noise from phantom effects caused by objects moving concurrently with the scanning platform. We also provide a new manually annotated MLS benchmark set called SZTAKI CityMLS, which is used to evaluate the proposed approach, and to compare our solution to various reference techniques proposed for semantic point cloud segmentation. Apart from point-level validation, we also present a case study on Lidar-based accurate self-localization of SDVs in the segmented MLS map.

Cs. Benedek, B. Gálai, B. Nagy and Z. Jankó: "Lidar-based Gait Analysis and Activity Recognition in a 4D Surveillance System," IEEE Trans. on Circuits and Systems for Video Technology, to appear, 2016, IF: 2.254 download

This paper presents new approaches for gait and activity analysis based on data streams of a Rotating Multi-Beam (RMB) Lidar sensor. The proposed algorithms are embedded into an integrated 4D vision and visualization system, which is able to analyze and interactively display real scenarios in natural outdoor environments with walking pedestrians. The main focus of the investigation is gait-based person re-identification during tracking, and recognition of specific activity patterns such as bending, waving, making phone calls and checking the time by looking at wristwatches. The descriptors for training and recognition are observed and extracted from realistic outdoor surveillance scenarios, where multiple pedestrians are walking in the field of interest following possibly intersecting trajectories, thus the observations might often be affected by occlusions or background noise. Since there is no public database available for such scenarios, we created and published a new Lidar-based outdoor gait and activity dataset on our website, which contains point cloud sequences of 28 different persons extracted and aggregated from 35-minute-long measurements. The presented results confirm that both efficient gait-based identification and activity recognition are achievable in the sparse point clouds of a single RMB Lidar sensor. After extracting the people trajectories, we synthesized a free-viewpoint video, where moving avatar models follow the trajectories of the observed pedestrians in real time, ensuring that the leg movements of the animated avatars are synchronized with the real gait cycles observed in the Lidar stream.

A. Börcs, B. Nagy and Cs. Benedek: "Instant Object Detection in Lidar Point Clouds", IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 7, pp. 992 - 996, 2017, IF: 2.761 download

In this paper we present a new approach for object classification in continuously streamed Lidar point clouds collected from urban areas. The input of our framework is raw 3-D point cloud sequences captured by a Velodyne HDL-64 Lidar, and we aim to extract all vehicles and pedestrians in the neighborhood of the moving sensor. We propose a complete pipeline developed especially for distinguishing outdoor 3-D urban objects. Firstly, we segment the point cloud into regions of ground, short objects (i.e., low foreground) and tall objects (high foreground). Then, using our novel two-layer grid structure, we perform efficient connected component analysis on the foreground regions, producing distinct groups of points which represent different urban objects. Next, we create depth images from the object candidates, and apply an appearance-based preliminary classification by a Convolutional Neural Network (CNN). Finally, we refine the classification with contextual features considering the possible expected scene topologies. We tested our algorithm on real Lidar measurements, containing 1159 objects captured from different urban scenarios.
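The depth-image generation step can be sketched as an orthographic side-view rasterization of an object's points, keeping the nearest depth per pixel. The choice of axes, resolution and normalization below are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def depth_image(points, size=64):
    """Render an object's Nx3 points into a size x size depth image:
    x maps to image columns, z (height) to rows, y is the stored depth."""
    pts = np.asarray(points, dtype=float)
    mins, maxs = pts.min(0), pts.max(0)
    span = np.maximum(maxs - mins, 1e-6)                   # avoid division by zero
    u = ((pts[:, 0] - mins[0]) / span[0] * (size - 1)).astype(int)
    v = ((pts[:, 2] - mins[2]) / span[2] * (size - 1)).astype(int)
    img = np.full((size, size), np.inf)
    for ui, vi, d in zip(u, v, pts[:, 1]):
        img[vi, ui] = min(img[vi, ui], d)                  # keep nearest depth per pixel
    img[np.isinf(img)] = 0.0                               # empty pixels -> background
    return img
```

Converting sparse 3-D clusters into fixed-size 2-D images like this is what lets a standard image CNN serve as the preliminary classifier.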

Book chapters

A. Börcs, B. Nagy and Cs. Benedek: "Dynamic Environment Perception and 4D reconstruction using a Mobile Rotating Multi-beam Lidar sensor," Handling Uncertainty and Networked Structure in Robot Control, Springer, to appear, 2015

BSc/OTDK (2014) and MSc (2016) theses

BSc and OTDK thesis: Dinamikus utcai környezet háromdimenziós analízise mobil lézerszkenner mérései alapján (Three-dimensional analysis of a dynamic street environment based on mobile laser scanner measurements) download

MSc thesis: 3D scene understanding based on mobile laser scanning systems download