FORM YL-2
H.Ü. GRADUATE SCHOOL OF SCIENCE AND ENGINEERING
MASTER'S THESIS TOPIC PROPOSAL FORM

Thesis Title (Turkish): Python'da Klasik Görüntü İşleme Yöntemleriyle Karşılaştırıldığında Drone Görüntülerinden Bina Çıkarımı İçin CNN Tabanlı Algoritmaların Performansının Değerlendirilmesi
Thesis Title (English): Evaluating the Performance of CNN-Based Algorithms for Building Extraction from Drone Images Compared to Classical Image Processing Methods in Python
Student's Name and Surname: Zeinab Bayat
Student Number: N21237886
Department in Which the Thesis Is Conducted:
Thesis Advisor:
Co-Advisor (if any):
Does the planned study require permission from any Ethics Committee? Yes / No
Is the planned study supported by an institution or the H.Ü. Research Fund? Yes / No
If YES, name of the supporting institution:
Type of support: Financial / Equipment / Other (please specify)
Has similar research on the thesis topic been carried out in your department? Yes / No
If so, the title(s) of the (Master's or Doctoral) thesis/theses and the name(s) of the author(s):
Student's Signature:
Advisor's Signature:

The thesis proposal was discussed at the meeting of the Departmental Academic Board dated ...................... and was found worthy of being proposed to the Executive Board of the Graduate School of Science and Engineering.
Head of Department:
Date:
Signature:
Attachment: The decision of the Departmental Academic Board must be sent to the institute.

(1) The thesis proposal must be submitted by the end of the second semester at the latest. Students who have not submitted a thesis proposal cannot add the special topics course to their program. (Art. 26)
(2) The thesis proposal is prepared by the student and the advisor. The proposal is discussed by the departmental academic board or by the academic board of a division approved by YÖK. The board's decision is reported to the institute by the head of the department. The proposal becomes final with the decision of the institute's executive board. The same process is followed for any changes later requested to the thesis topic. (Art. 26)

1. THESIS SUMMARY
This thesis aims to evaluate the performance of convolutional neural networks (CNNs) as a building extraction method and to compare them with classical methods, such as edge detection (for example, Canny) and classical image segmentation (for example, watershed), with all methods implemented in Python. To accomplish this, I intend to gather a collection of 2D drone images that contain buildings and annotate them with ground-truth data indicating the precise location and extent of each building. After preprocessing the dataset and splitting it into training, validation, and test sets, I will design and train a CNN architecture for building extraction, either using a popular model such as U-Net or Mask R-CNN or developing my own architecture. Next, I will evaluate the performance of the CNN-based system on the validation set using precision (the ratio of correctly identified positive instances to all instances identified as positive), recall (the ratio of correctly identified positive instances to all positive instances in the dataset), F1-score (the harmonic mean of precision and recall), and mean intersection over union (IoU, a measure of the overlap between the predicted and ground-truth regions). Finally, I will compare the results with those of classical methods and of other deep learning methods such as fully convolutional networks (FCNs) or Mask R-CNN, analyze the findings, and draw conclusions about the effectiveness of CNN-based building extraction compared to traditional methods, while also identifying possible areas for future research. Future research in this field could explore alternative CNN architectures or evaluate the proposed CNN-based method on a larger and more diverse dataset of urban areas.

Keywords: object identification, building extraction, UAVs, convolutional neural networks

2. AIMS AND OBJECTIVES
- Drone images were chosen because they have become increasingly popular for property evaluation. Drones carry high-definition cameras that can capture detailed images of residential, commercial, and industrial properties. They can also capture images from angles that were previously inaccessible, allowing for a more comprehensive analysis of properties. In disaster-prone areas, collecting data by drone significantly reduces risk, because drones can image or monitor areas that are inaccessible or difficult for humans to reach. Given these benefits, there is a pressing need for an automated method of analyzing building areas from drone imagery. Such a method could take into account factors such as property value or potential hazards and could greatly improve the efficiency and accuracy of property evaluation. [1]
- This thesis provides a comprehensive guide to solving image processing problems using popular Python image processing libraries such as NumPy, Matplotlib, PIL, scikit-image, python-opencv, SciPy ndimage, and SimpleITK, machine learning libraries such as scikit-learn, and deep learning libraries such as TensorFlow and Keras. It covers a wide range of image processing problems, including denoising, segmentation, classification, and object detection. By following the code snippets provided, readers will be able to implement complex image processing algorithms, since these libraries and frameworks greatly simplify the implementation of such algorithms and models. The thesis is written for readers with a basic understanding of Python and image processing concepts and is intended as a resource for beginners and more advanced users alike. [2]

3. TOPIC, SCOPE, AND LITERATURE REVIEW
Human visual perception allows for the efficient identification of familiar objects amidst complex environmental scenes, as well as the discernment of unfamiliar objects. These recognition abilities, exercised daily with minimal cognitive effort, cope with variations in color, texture, form, scale, viewing angle, and occlusion. Object identification systems that aim to mimic human visual perception must therefore take inspiration from its underlying mechanisms, which rely on identifying distinctive patterns that reflect the arrangement of salient features or attributes. [3]

Automatic and intelligent building footprint extraction using object identification technologies has become a crucial application with a wide range of practical uses. It supports domains such as cartography, emergency response, and urban planning, where accurate and up-to-date geographic data is essential. Manual building footprint extraction is labor-intensive and expensive, which is why automating it is an active research topic. Inaccurate or incomplete building footprint data can lead to errors in navigation systems, incomplete risk assessments in emergency response scenarios, and hindered city planning and development. Automatic, reliable, and accurate building footprint extraction is therefore of great practical value for collecting and updating basic geographic data. The method is used extensively in fields such as catastrophe assessment, military reconnaissance, and digital cities. Implementing this technology can increase efficiency and accuracy, reduce costs, and improve decision-making. [4]

Unmanned Aerial Vehicles (UAVs) have revolutionized the field of cadastral mapping by providing a low-cost and accessible platform for acquiring high-resolution data.
UAVs can fly below clouds and capture sub-meter-level imagery in a cost-effective and timely manner, making them an ideal platform for acquiring aerial imagery for land-use classification, identifying property boundaries, or monitoring land-use changes over time. By reducing the need for ground surveys, UAVs have made cadastral data collection more efficient and more accurate. Their use is not without limitations, however, including limited flight time, limited data storage, and challenges in data processing. Nonetheless, UAVs have opened up new opportunities for the efficient and accurate acquisition of high-resolution data in a variety of domains. [5]

The field of UAVs has witnessed significant growth in research and industry, particularly over the last decade, with aerial images offering new and exciting research directions. Combining drones and computer vision is a novel and challenging idea that could enable unmanned aerial vehicles to understand the area being surveyed. This could significantly enhance applications such as Automatic Feature Matching Recognition and Image-based Control, leading to improved real-time localization, mapping, and navigation in the absence of GPS. [3]

Over the years, many traditional computer vision approaches have been proposed for building extraction. They rely on empirical knowledge of buildings to extract features such as color, texture, edge, shape, shadow, and context, combined with one or more knowledge-based methods such as template matching, active contour models, mathematical morphology, graph-based analysis, and dynamic programming. However, the performance of these approaches depends heavily on the quality of manually designed features.
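To illustrate what a hand-designed edge feature looks like in practice, the following is a minimal sketch that thresholds the Sobel gradient magnitude of a grayscale image using NumPy only. It is a simplified stand-in for the edge detectors mentioned in this proposal, not the Canny algorithm itself and not the pipeline the thesis will use. The function name `sobel_edge_mask`, the threshold of 0.25, and the synthetic test image are assumptions made purely for illustration.

```python
import numpy as np

def sobel_edge_mask(gray, threshold=0.25):
    """Return a boolean edge mask for a 2D grayscale image.

    Hypothetical helper: the threshold is hand-tuned, which is exactly
    the kind of manual design choice classical methods depend on.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = gray.shape
    padded = np.pad(gray.astype(float), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Correlate the image with each 3x3 kernel by summing shifted copies.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    magnitude = np.hypot(gx, gy)
    if magnitude.max() > 0:
        magnitude /= magnitude.max()  # normalize to [0, 1]
    return magnitude > threshold

# Synthetic 12x12 "roof": a bright square on a dark background (assumed data).
image = np.zeros((12, 12))
image[3:9, 3:9] = 1.0
edges = sobel_edge_mask(image)
```

On this synthetic image the mask marks the outline of the square and leaves its flat interior and the background unmarked. Real drone imagery would need grayscale conversion and a threshold chosen per scene, which shows how strongly such methods depend on manual design.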
Recently, deep learning techniques, including convolutional neural networks (CNNs), have shown improved performance in computer vision tasks such as image classification, object detection, and semantic segmentation. With the development of deep learning techniques and the increasing availability of high-resolution UAV images, significant advances have been made in automatic building extraction. [6]

4. ORIGINAL VALUE
This study focuses on the advantages of deep learning methods and their impact on object detection in various applications. Deep learning methods are significantly transforming the field of object detection, and the study aims to explore why. It discusses the key advantages of deep learning over traditional machine learning techniques in the context of object detection, examines applications of deep learning in object detection, and explores how these methods have improved detection accuracy and efficiency. By analyzing the latest research in the field, the study aims to provide insight into the benefits of deep learning methods for object detection and their potential impact on various domains.

The most significant advantage of deep learning algorithms lies in their ability to learn low-level and high-level features from training images incrementally. Handcrafted feature extraction or engineering is therefore unnecessary, which simplifies the preprocessing stage. Deep learning techniques also take an end-to-end approach to problem-solving. In a classical ML pipeline, by contrast, a classifier such as a Support Vector Machine (SVM) first needs a separate bounding-box detection step to identify candidate objects, and then needs handcrafted features such as HOG as input in order to recognize them correctly. The YOLO network, a deep learning method, demonstrates the end-to-end approach: it takes an image as input and produces each object's name and location as output.
However, the large number of parameters in deep learning models and the large datasets they require result in a lengthy training process. It is therefore essential to train deep learning models on high-end hardware such as GPUs for an adequate period. Transfer learning contributes significantly to deep learning's adaptability, allowing pre-trained deep networks to be reused for different applications within the same domain. As a result, deep learning techniques are highly versatile and can be applied to a wide range of domains and applications.

5. METHOD
The ultimate aim of machine learning (ML) is generalization: the ability of an algorithm, after being trained on a specific training dataset, to perform with high accuracy on unseen data. In complex image processing tasks such as image classification, more training data often leads to better generalization, provided that overfitting is avoided through regularization techniques. Deep learning is a subset of ML that uses artificial neural networks (ANNs) consisting of multiple layers of processing units. These layers process data in a non-linear way and are inspired by the structure and function of neurons in the human brain. Each layer takes the output of the previous layer as input, and the network as a whole is used for feature extraction, transformation, pattern recognition, and the development of abstractions. Deep learning can be supervised, as in classification tasks, or unsupervised, as in pattern analysis tasks. It employs gradient descent algorithms to learn multiple levels of representation that correspond to different levels of abstraction. These levels form a hierarchy of concepts, with each concept defined in relation to simpler concepts and more abstract representations computed in terms of less abstract ones.
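The idea of stacked non-linear layers trained by gradient descent can be sketched with a very small fully connected network in NumPy. This is an illustrative toy under stated assumptions, not the CNN architecture proposed for the thesis: the XOR data, the layer sizes, the tanh activation, the learning rate, and all variable names are chosen here only for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed): XOR, which a single linear layer cannot fit.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# Two layers: 2 inputs -> 4 hidden units (tanh) -> 1 linear output.
W1 = rng.normal(scale=0.5, size=(2, 4))
b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(4, 1))
b2 = np.zeros(1)

def forward(inputs):
    hidden = np.tanh(inputs @ W1 + b1)  # layer 1: non-linear features
    output = hidden @ W2 + b2           # layer 2: consumes layer 1's output
    return hidden, output

learning_rate = 0.05
losses = []
for step in range(5000):
    hidden, output = forward(X)
    error = output - y
    losses.append(float(np.mean(error ** 2)))  # mean squared error

    # Backpropagation: gradients of the loss for each layer's parameters.
    grad_output = 2.0 * error / len(X)
    grad_W2 = hidden.T @ grad_output
    grad_b2 = grad_output.sum(axis=0)
    grad_hidden = (grad_output @ W2.T) * (1.0 - hidden ** 2)
    grad_W1 = X.T @ grad_hidden
    grad_b1 = grad_hidden.sum(axis=0)

    # Gradient descent update: step against the gradient.
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2
```

Comparing `losses[0]` with `losses[-1]` shows how far the error fell during training. A CNN for building extraction follows the same loop in principle, with convolutional layers and a framework such as TensorFlow or Keras computing the gradients automatically.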
Through this hierarchical representation of data, deep learning achieves great power and flexibility, enabling it to learn complex relationships and dependencies among the features of the data. This makes it well suited to tasks such as image and speech recognition, natural language processing, and many others. [2]

6. BROADER IMPACT
In image processing problems addressed with traditional machine learning (ML) techniques, the crucial preprocessing step is often the extraction of handcrafted features such as HOG (Histograms of Oriented Gradients) and SIFT (Scale-Invariant Feature Transform). This step aims to reduce the complexity of the image and to make the patterns relevant to the learning algorithm more visible. In contrast to deep learning approaches, classical ML techniques decompose the problem into components, solve each part independently, and then combine the results to produce the final output. Nonetheless, classical ML techniques are typically limited in their scope of application. [2]

7. REFERENCES
[1] Jun Chen, Ganbei Wang, Linbo Luo, Wenping Gong, and Zhan Cheng, "Building Area Estimation in Drone Aerial Images Based on Mask R-CNN," IEEE Geoscience and Remote Sensing Letters, pp. 1-4, 2020.
[2] Sandipan Dey, Hands-On Image Processing with Python: Expert Techniques for Advanced Image Analysis and Effective Interpretation of Image Data.
[3] Alina-Elena Marcu, "A Local-Global Approach to Semantic Segmentation in Aerial Images," Master's thesis. Scientific advisor: Assoc. Prof. Marius Leordeanu.
[4] Wufan Zhao, "Building Footprint Extraction from High Resolution Optical Remote Sensing Imagery Using Deep Learning," Master's thesis, Faculty of Natural Sciences, Paris-Lodron-Universität Salzburg. Supervisor: Univ.-Dr. Dirk Tiede.
[5] Xue Xia, "Extracting Cadastral Boundaries from UAV Images Using Fully Convolutional Networks," February 2019. Supervisors: Dr. M.N. Koeva and Dr. C. Persello.
[6] Yifan Wu, "Deep Learning Based Building Extraction from High-Resolution Remote Sensing Images," thesis for the degree of Master of Applied Science in Systems Design Engineering, University of Waterloo.