Uploaded by Zeynab Bayat


FORM YL-2
H.Ü. GRADUATE SCHOOL OF SCIENCE AND ENGINEERING
MASTER'S THESIS TOPIC PROPOSAL FORM
Thesis Title (Turkish): Python'da Klasik Görüntü İşleme Yöntemleriyle Karşılaştırıldığında Drone
Görüntülerinden Bina Çıkarımı İçin CNN Tabanlı Algoritmaların Performansının Değerlendirilmesi
Thesis Title (English): Evaluating the Performance of CNN-Based Algorithms for Building Extraction from
Drone Images Compared to Classical Image Processing Methods in Python
Student's Name and Surname: Zeinab Bayat
Student Number: N21237886
Department in Which the Thesis Is Conducted:
Thesis Advisor:
Second Advisor (if any):
Is approval from an Ethics Committee required for this study?
Yes
No
Is this study supported by an institution or by the H.Ü. Research Fund?
Yes
No
If YES, name of the supporting institution
Type of Support
Financial
Equipment
Other (please specify)
Whether similar studies on the thesis topic have been carried out in your department.
Carried out
Not carried out
If so, the title(s) of the thesis/theses (master's or doctoral) and the name(s) of the author(s).
Student's Signature
Advisor's Signature
The Thesis Proposal was discussed at the Academic Department Board meeting dated ......................
and was found worthy of being proposed to the Executive Board of the Graduate School of Science and
Engineering.
Head of Department
Date:
Signature:
Attachment: The Academic Department Board Decision must be sent to the graduate school.
(1) The thesis proposal must be submitted by the end of the second semester at the latest. Students
who do not submit a thesis proposal cannot add the special topics course to their program. (Art. 26)
(2) The thesis proposal is prepared by the student and the advisor. The proposal is discussed in the
academic board of the department or of a discipline approved by YÖK. The board's decision is reported
to the graduate school by the department head's office. The proposal becomes final by decision of the
graduate school's executive board. The same process is followed for any later changes requested to the
thesis topic. (Art. 26)
1. THESIS SUMMARY
This thesis aims to evaluate the performance of convolutional neural networks (CNNs) as a
building extraction method and to compare them, in Python, with classical methods such as
Canny edge detection, classical image segmentation, and watershed segmentation. To this
end, I intend to gather a collection of 2D drone images that contain buildings and annotate
them with ground-truth data indicating the precise location and extent of each building.
After preprocessing the dataset and splitting it into training, validation, and testing sets,
I will design and train a CNN architecture for building extraction, either using popular
models such as U-Net or Mask R-CNN or developing my own architecture. Next, I will evaluate
the performance of the CNN-based system on the validation set using precision (the ratio of
correctly identified positive instances to all instances identified as positive), recall (the
ratio of correctly identified positive instances to all positive instances in the dataset),
F1-score (the harmonic mean of precision and recall), and mean intersection over union (IoU,
a measure of the overlap between the predicted and ground-truth regions). Finally, I will
compare the results with those of classical methods and of other deep learning methods such
as fully convolutional networks (FCNs) or Mask R-CNN, analyze the findings, and draw
conclusions about the effectiveness of CNN-based building extraction compared to traditional
methods, while also identifying possible areas for future research. Future research in this
field could explore alternative CNN architectures or evaluate the proposed CNN-based method
on a larger and more diverse dataset of urban areas.
Keywords: Object identification, building extraction, UAVs, convolutional neural
networks
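The four evaluation metrics named in the summary can all be derived from pixel-wise counts of
true positives, false positives, and false negatives. The following is a minimal sketch,
assuming binary building masks stored as NumPy arrays; the function name
`segmentation_metrics` and the tiny 3x3 example masks are hypothetical and serve only as an
illustration. It reports the IoU of the building class for one image; a mean IoU would average
this value over images or classes.

```python
import numpy as np


def segmentation_metrics(pred, truth):
    """Pixel-wise precision, recall, F1 and IoU for two binary masks.

    Hypothetical helper for illustration; 1/True marks a building pixel.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.logical_and(pred, truth).sum())    # building predicted and present
    fp = int(np.logical_and(pred, ~truth).sum())   # building predicted but absent
    fn = int(np.logical_and(~pred, truth).sum())   # building present but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Made-up example: 4 ground-truth building pixels, 4 predicted, 3 in common.
truth_mask = [[1, 1, 0],
              [1, 1, 0],
              [0, 0, 0]]
pred_mask = [[1, 1, 1],
             [1, 0, 0],
             [0, 0, 0]]

metrics = segmentation_metrics(pred_mask, truth_mask)
# tp = 3, fp = 1, fn = 1, so precision = recall = F1 = 0.75 and IoU = 3/5 = 0.6
```

The same function can be applied unchanged to the output of a classical method and of a CNN,
which keeps the comparison between the two on an equal footing.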
2. AIMS AND OBJECTIVES
- Drone-based images are chosen because they have become increasingly popular for property
evaluation. Drones are equipped with high-definition cameras that can capture detailed images
of residential, commercial, and industrial properties. Moreover, drones can capture images
from angles that were previously inaccessible, allowing for a more comprehensive analysis of
properties. Collecting data with drones can also reduce risk in disaster-prone areas, because
drones can image or monitor areas that are inaccessible or difficult for humans to reach.
Given these benefits, there is a pressing need for an automated method that analyzes building
areas using drone imagery and image processing techniques. Such a method could take into
account factors such as property value or potential hazards and could greatly enhance the
efficiency and accuracy of property evaluation. [1]
- This thesis provides a comprehensive guide to solving image processing problems using
popular Python image processing libraries such as numpy, matplotlib, PIL, scikit-image,
python-opencv, scipy ndimage, and SimpleITK, as well as machine learning libraries such as
scikit-learn and deep learning libraries such as TensorFlow and Keras. The thesis covers a
wide range of image processing problems, including denoising, segmentation, classification,
and object detection. By following the code snippets provided, readers will be able to
implement complex image processing algorithms. The use of Python libraries and frameworks
greatly simplifies the implementation of these algorithms and models. The thesis is designed
for readers with a basic understanding of Python and image processing concepts, and it is
intended as a resource for beginners and more advanced users alike. [2]
3. TOPIC, SCOPE, AND LITERATURE REVIEW
Human visual perception allows for the efficient identification of familiar objects amidst
complex environmental scenes, as well as the discernment of unfamiliar objects. Such
recognition abilities, which are executed with minimal cognitive effort on a daily basis,
account for variations in color, texture, form, scale, viewing angles, and occlusions. Object
identification systems that aim to mimic human visual perception must thus take inspiration
from its underlying mechanisms, which rely on the identification of distinctive patterns that
reflect the arrangement of salient features or attributes.[3]
Automatic and intelligent building footprint extraction using object identification
technologies has become a crucial application with a wide range of practical uses. It
supports domains such as cartography, emergency response, and urban planning, where accurate
and up-to-date geographic data is essential. Manual building footprint extraction is
labor-intensive and expensive, which is why its automation remains an active research topic.
Inaccurate or incomplete building footprint data can lead to errors in navigation systems,
incomplete risk assessments in emergency response scenarios, and hindered city planning and
development. Automatic, intelligent, reliable, and accurate building footprint extraction is
therefore of great practical value for the collection and updating of basic geographic data.
The method is widely used in fields such as catastrophe assessment, military reconnaissance,
and digital cities. Implementing this technology can increase efficiency and accuracy, reduce
costs, and improve decision-making. [4]
Unmanned Aerial Vehicles (UAVs) have revolutionized the field of cadastral mapping by
providing a low-cost and accessible platform for acquiring high-resolution data. UAVs can
fly below clouds and capture sub-meter-level imagery in a cost-effective and timely manner,
making them an ideal platform for capturing aerial imagery for land-use classification,
identifying property boundaries, or monitoring land-use changes over time. The use of UAVs
has significantly impacted the accuracy and efficiency of cadastral mapping by reducing the
need for ground surveys, resulting in more efficient data collection and improved accuracy.
However, the use of UAVs for cadastral mapping is not without limitations, such as limited
flight time, data storage limitations, and challenges in data processing. Nonetheless, the use
of UAVs in cadastral mapping has opened up new opportunities for the efficient and accurate
acquisition of high-resolution data in a variety of domains. [5]
The field of UAVs has witnessed significant growth in research and industry, particularly
over the last decade, with aerial images offering new and exciting research directions.
Combining drones and computer vision is a novel and challenging idea that could enable
unmanned aerial vehicles to understand the area being surveyed. This could significantly
enhance various applications, such as Automatic Feature Matching Recognition and
Image-based Control, leading to improved real-time localization, mapping, and navigation in
the absence of GPS. [3]
Over the years, many traditional computer vision approaches have been proposed for building
extraction, relying on empirical knowledge of buildings to extract features such as color,
texture, edge, shape, shadow, and context, combined with one or more knowledge-based
methods like template matching, active contour model, mathematical morphology,
graph-based analysis, and dynamic programming. However, the performance of these approaches
depends heavily on the quality of manually designed features. Recently, deep learning
techniques, including convolutional neural networks (CNNs), have shown improved
performance in computer vision tasks such as image classification, object detection, and
semantic segmentation. With the development of deep learning techniques and the increasing
availability of high-resolution UAV images, significant advancements have been made in the
field of automatic building extraction. [6]
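As a rough illustration of the edge cue that the traditional approaches above rely on, the
following sketch computes a Sobel gradient magnitude with NumPy and thresholds it. It is a
simplified stand-in written for this example, not Canny edge detection and not the pipeline
this thesis will use; the function name `sobel_edges`, the threshold value, and the synthetic
8x8 "roof" image are all assumptions.

```python
import numpy as np


def sobel_edges(gray, threshold=1.0):
    """Return a boolean edge map from the Sobel gradient magnitude.

    Hypothetical helper: `gray` is a 2D grayscale image and `threshold`
    is a hand-picked value, the kind of manual tuning classical methods need.
    """
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")  # replicate the border pixels

    kx = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])      # responds to horizontal change
    ky = kx.T                              # responds to vertical change

    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    # Accumulate the 3x3 neighbourhood sums one kernel tap at a time.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window

    magnitude = np.hypot(gx, gy)
    return magnitude > threshold


# Synthetic image: a bright 4x4 "roof" on a dark background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0
edges = sobel_edges(image)
# The roof outline is marked as edge; its flat interior and the far
# background are not.
```

The hand-chosen kernel and threshold are exactly the kind of manually designed features whose
quality limits the classical methods.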
4. ORIGINAL VALUE
This study focuses on the advantages of deep learning methods and their impact on object
detection in various applications. Deep learning methods are significantly transforming the
field of object detection, and this study aims to explore why. I will discuss the key
advantages of deep learning methods over traditional machine learning techniques,
particularly in the context of object detection. The study will also examine the applications
of deep learning in object detection and how these methods have improved detection accuracy
and efficiency. By analyzing the latest research in this field, the study aims to provide
insights into the benefits of deep learning methods for object detection and their potential
impact on various domains.
The most significant advantage of deep learning algorithms lies in their ability to learn
low-level and high-level features from training images incrementally. Handcrafted feature
extraction or engineering is therefore not necessary, which simplifies the preprocessing
stage. Deep learning techniques take an end-to-end approach to problem-solving. Classical ML
algorithms such as Support Vector Machines (SVMs), by contrast, typically need a separate
bounding-box detection step to identify candidate objects, and handcrafted features such as
HOG as input, before they can recognize objects correctly. For instance, the YOLO network, a
deep learning method, takes an image as input and produces each object's name and location as
output, demonstrating the end-to-end approach.
However, the large number of parameters in deep learning models and the vast datasets they
require result in a lengthy training process. It is therefore essential to train deep
learning models on high-end hardware such as GPUs for an adequate period. Transfer learning
plays a significant role in the adaptability of deep learning, allowing pre-trained deep
networks to be reused for various applications within the same domain. As a result, deep
learning techniques are highly versatile and can be applied to a wide range of domains and
applications.
5. METHOD
The ultimate aim of Machine Learning (ML) is to achieve generalization, which refers to the
ability of an algorithm to perform with high accuracy on unseen datasets after being trained on
a specific training dataset. In complex image processing tasks such as image classification,
having more training data often leads to better generalization, as long as overfitting is avoided
through regularization techniques.
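Generalization is estimated on data the model did not see during training, which is why the
summary divides the dataset into training, validation, and testing sets. A minimal sketch of
such a split follows, assuming a 70/15/15 ratio and a fixed random seed; neither value is
specified in this proposal, and `split_indices` is a hypothetical helper.

```python
import numpy as np


def split_indices(n_samples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts.

    The fractions and the seed are assumptions made for this example.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)       # every index exactly once

    n_test = int(round(n_samples * test_fraction))
    n_val = int(round(n_samples * val_fraction))

    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]        # everything that is left
    return train_idx, val_idx, test_idx


# Example with 100 annotated images: 70 for training, 15 each for
# validation and testing, with no image shared between the sets.
train_idx, val_idx, test_idx = split_indices(100)
```

Because the three index sets are disjoint, scores measured on the test indices are not
inflated by images the model has already seen.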
Deep Learning is a subset of ML that uses artificial neural networks (ANNs) consisting of
multiple layers of processing units. These layers process data in a non-linear way and are
inspired by the structure and function of neurons in the human brain. Each layer takes the
output from the previous layer as input, and the ANNs are used for feature extraction,
transformation, pattern recognition, and abstraction development. Deep Learning can be
supervised, as in classification tasks, or unsupervised, as in pattern analysis tasks. It employs
gradient descent algorithms to learn multiple levels of representations that correspond to
different levels of abstraction. These levels form a hierarchical structure of concepts, with
each concept defined in relation to simpler concepts, and more abstract representations
computed in terms of less abstract ones.
Through this hierarchical representation of data, Deep Learning achieves great power and
flexibility, enabling it to learn complex relationships and dependencies among the features of
the data. This makes it well-suited for tasks such as image and speech recognition, natural
language processing, and many others.[2]
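The layered structure and gradient-descent learning described above can be sketched with a
very small network in NumPy. This is a toy example under stated assumptions: two layers, a
tanh hidden layer, a sigmoid output, the XOR truth table as data, and an arbitrary learning
rate. None of these choices comes from this proposal, and a CNN for building extraction is far
larger.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, params):
    """Each layer takes the previous layer's output as its input."""
    hidden = np.tanh(x @ params["W1"] + params["b1"])         # non-linear layer 1
    output = sigmoid(hidden @ params["W2"] + params["b2"])    # non-linear layer 2
    return hidden, output


def bce_loss(y, y_hat, eps=1e-12):
    """Mean binary cross-entropy between labels and predictions."""
    return float(-np.mean(y * np.log(y_hat + eps)
                          + (1.0 - y) * np.log(1.0 - y_hat + eps)))


def gradient_step(x, y, params, lr=0.1):
    """One gradient-descent update of all weights (backpropagation)."""
    hidden, y_hat = forward(x, params)
    n = x.shape[0]
    d_out = (y_hat - y) / n                                   # loss gradient at the output
    d_hidden = (d_out @ params["W2"].T) * (1.0 - hidden ** 2)  # back through tanh
    grads = {
        "W2": hidden.T @ d_out, "b2": d_out.sum(axis=0),
        "W1": x.T @ d_hidden, "b1": d_hidden.sum(axis=0),
    }
    return {name: params[name] - lr * grads[name] for name in params}


rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(scale=0.5, size=(2, 4)), "b1": np.zeros(4),
    "W2": rng.normal(scale=0.5, size=(4, 1)), "b2": np.zeros(1),
}
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

for _ in range(200):
    params = gradient_step(x, y, params)

hidden, y_hat = forward(x, params)
final_loss = bce_loss(y, y_hat)
```

The hidden activations are the learned intermediate representation; stacking more such layers
is what yields the hierarchy of increasingly abstract features.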
6. BROADER IMPACT
In the realm of image processing problems addressed by traditional machine learning (ML)
techniques, the crucial step in preprocessing is often the extraction of handcrafted features,
such as HOG (Histograms of Oriented Gradients) and SIFT (Scale-Invariant Feature
Transform). This step aims to reduce the complexity of the image and enhance the visibility
of patterns that are relevant for the learning algorithms to function effectively. In contrast to
deep learning approaches, classical ML techniques address the problem statement by
decomposing it into various components, resolving each part independently, and then
integrating the results to generate the final output. Nonetheless, classical ML techniques are
typically limited in their scope of application.[2]
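To make the idea of a handcrafted feature concrete, the sketch below computes a
magnitude-weighted histogram of gradient orientations for a single image patch, which is the
core step of HOG. It is a simplified illustration that omits the cell and block normalization
of the full descriptor; the function name `orientation_histogram`, the choice of 9 bins, and
the ramp test images are assumptions made for the example.

```python
import numpy as np


def orientation_histogram(gray, n_bins=9):
    """Histogram of unsigned gradient orientations, weighted by magnitude.

    Hypothetical helper showing the central idea of HOG on one patch.
    """
    gray = np.asarray(gray, dtype=float)
    gy, gx = np.gradient(gray)                 # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)

    # Unsigned orientation in [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / n_bins
    bins = np.minimum((angle / bin_width).astype(int), n_bins - 1)

    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


# Two synthetic patches: brightness rising left-to-right and top-to-bottom.
rows, cols = np.mgrid[0:8, 0:8]
hist_horizontal = orientation_histogram(cols)   # gradient at 0 degrees, first bin
hist_vertical = orientation_histogram(rows)     # gradient at 90 degrees, middle bin
```

A classical pipeline would pass such a vector to a separate classifier, which is the
decomposition into independent components described above.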
7. REFERENCES
[1] J. Chen, G. Wang, L. Luo, W. Gong, and Z. Cheng, "Building Area Estimation in Drone Aerial
Images Based on Mask R-CNN," IEEE Geoscience and Remote Sensing Letters, pp. 1-4, 2020.
[2] S. Dey, Hands-On Image Processing with Python: Expert Techniques for Advanced Image Analysis
and Effective Interpretation of Image Data.
[3] A.-E. Marcu, "A Local-Global Approach to Semantic Segmentation in Aerial Images," Master's
thesis. Scientific advisor: Assoc. Prof. Marius Leordeanu.
[4] W. Zhao, "Building Footprint Extraction from High Resolution Optical Remote Sensing Imagery
Using Deep Learning," Master's thesis, Faculty of Natural Sciences, Paris-Lodron-Universität
Salzburg. Supervisor: Univ.-Dr. Dirk Tiede.
[5] X. Xia, "Extracting Cadastral Boundaries from UAV Images Using Fully Convolutional Networks,"
February 2019. Supervisors: Dr. M.N. Koeva, Dr. C. Persello.
[6] Y. Wu, "Deep Learning Based Building Extraction from High-Resolution Remote Sensing Images,"
Master of Applied Science thesis, Systems Design Engineering, University of Waterloo.