One Stage Object Detection Architectures¶
One Stage Architectures¶
Now we will focus on model architectures that directly predict object bounding boxes for an image in a single step. In other words, there is no intermediate task (such as the region proposals we saw in the last lecture) that must be performed to produce a result. This leads to simpler and faster models, although they can be less flexible when adapting to additional tasks (like mask prediction).
SSD - Single Shot Detection¶
SSD is designed for real-time object detection. Faster R-CNN uses a region proposal network to create bounding boxes and uses these boxes to classify objects. While considered state-of-the-art in accuracy, the entire process runs at 7 frames per second, far below what real-time processing requires. SSD speeds up the process by eliminating the region proposal network. To recover from the resulting drop in accuracy, SSD applies a few improvements, including multi-scale features and default boxes. These improvements allow SSD to match the accuracy of Faster R-CNN using lower-resolution images, which further increases speed.
SSD has two components: a backbone model and an SSD head. The backbone model is usually a pre-trained image classification network used as a feature extractor, typically a network like ResNet trained on ImageNet from which the final fully connected classification layer has been removed. We are therefore left with a deep neural network that is able to extract semantic meaning from the input image while preserving its spatial structure, albeit at a lower resolution. For ResNet34, the backbone outputs 256 7x7 feature maps for an input image. We will explain what features and feature maps are later. The SSD head is just one or more convolutional layers added to this backbone, and its outputs are interpreted as bounding boxes and object classes at the spatial locations of the final layers' activations.
Grid cell¶
Instead of using a sliding window, SSD divides the image into a grid and makes each grid cell responsible for detecting objects in that region of the image. Detecting an object simply means predicting the class and location of an object within that region. If no object is present, we consider it the background class and the location is ignored. For example, we could use a 4x4 grid as in the example below. Each grid cell predicts the position and shape of the object it contains.
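As a sketch of this grid-cell responsibility (a hypothetical helper, not SSD's actual implementation), the cell an object is assigned to can be computed from its normalized center coordinates:

```python
# Sketch: assigning an object's center to a grid cell.
# Box center coordinates are assumed normalized to [0, 1].

def responsible_cell(cx, cy, grid_size=4):
    """Return the (row, col) of the grid cell containing the box center."""
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    return row, col

# A box centered at (0.6, 0.3) on a 4x4 grid falls in row 1, column 2.
print(responsible_cell(0.6, 0.3))  # -> (1, 2)
```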
Now you might be wondering what if there are multiple objects in a grid cell or what if we need to detect multiple objects of different shapes. This is where the anchor box and receptive field come into play.
Anchor box¶
Each grid cell in SSD can be assigned multiple anchor (prior) boxes. These anchor boxes are predefined, and each one is responsible for a particular size and shape within a grid cell. For example, the swimming pool in the image below corresponds to the taller anchor box while the building corresponds to the wider box.
What are Anchor Boxes?¶
To predict and localize many different objects in an image, most state-of-the-art object detection models, such as EfficientDet and the YOLO models, start from anchor boxes as a prior and refine from there.
State-of-the-art models typically use anchor boxes in the following way:
- Form thousands of candidate anchor boxes around the image
- For each anchor box, predict some offset from that box as a candidate box
- Compute a loss function based on the ground truth example
- Compute the probability that a given offset box overlaps with a real object
- If this probability is greater than 0.5, factor the prediction into the loss function
- By rewarding and penalizing the predicted boxes, slowly pull the model towards only finding true objects
This is why, when a model has only been lightly trained, you’ll see predicted boxes popping up everywhere.
SSD uses a matching phase during training to match the appropriate anchor box to the bounding boxes of each ground-truth object in an image. Essentially, the anchor box with the highest degree of overlap with an object is responsible for predicting that object’s class and location. This property is used to train the network and to predict the detected objects and their locations once the network has been trained. In practice, each anchor box is specified by an aspect ratio and a zoom level.
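The matching phase described above can be sketched as follows; `iou` and `match_anchor` are illustrative names, and real SSD implementations vectorize this computation over thousands of anchors:

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_anchor(ground_truth, anchors):
    """Return the index of the anchor with the highest overlap."""
    return max(range(len(anchors)), key=lambda i: iou(ground_truth, anchors[i]))

anchors = [(0, 0, 2, 2), (1, 1, 3, 3), (0, 0, 4, 4)]
gt = (1, 1, 3, 3)
print(match_anchor(gt, anchors))  # -> 1 (the anchor that overlaps exactly)
```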
Aspect ratio¶
Not all objects are square. Some are longer and some are wider, to varying degrees. The SSD architecture allows for predefined aspect ratios of anchor boxes to account for this. The aspect ratio parameter can be used to specify different aspect ratios of the anchor boxes associated with each grid cell at each zoom/scale level.
Zoom level¶
The anchor boxes do not need to be the same size as the grid cell. We may be interested in finding smaller or larger objects within a grid cell. The zooms parameter is used to specify how much the anchor boxes need to be scaled up or down relative to each grid cell. As we saw in the anchor box example, the building is usually larger than the swimming pool.
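Combining aspect ratios and zoom levels, the anchor shapes for one grid cell can be sketched as below. The parameter names are illustrative (not SSD's actual API); the square-root convention keeps the anchor area constant for a given zoom while varying its shape:

```python
# Sketch: generate anchor (w, h) shapes for one grid cell from
# aspect ratios and zoom levels (illustrative, not SSD's exact API).

def anchor_shapes(cell_size, aspect_ratios, zooms):
    shapes = []
    for zoom in zooms:
        for ratio in aspect_ratios:  # ratio = width / height
            w = cell_size * zoom * ratio ** 0.5
            h = cell_size * zoom / ratio ** 0.5
            shapes.append((round(w, 2), round(h, 2)))
    return shapes

# 3 aspect ratios x 2 zoom levels = 6 anchors per grid cell
print(anchor_shapes(1.0, aspect_ratios=[0.5, 1.0, 2.0], zooms=[0.7, 1.0]))
```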
Receptive Field¶
The receptive field is defined as the region of the input space that a particular CNN feature "looks at" (i.e., is affected by). We will use "feature" and "activation" interchangeably here and treat them as the linear combination (sometimes followed by an activation function to add non-linearity) of the previous layer at the corresponding location. Because of the convolution operation, features in different layers represent regions of different sizes in the input image. As we go deeper, the size represented by a feature increases. In the example below, we start with the bottom layer (5x5) and apply a convolution, which results in the middle layer (3x3), where each feature (green pixel) represents a 3x3 region of the input layer (bottom layer). We then apply a convolution to the middle layer to get the top layer (2x2), where each feature corresponds to a 7x7 region of the input image. These green and orange 2D arrays are called feature maps: sets of features created by applying the same feature extractor to different locations of the input in a sliding-window fashion. Features in the same feature map have the same receptive field size and look for the same pattern at different locations. This gives ConvNets their spatial invariance.
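The receptive-field sizes in this example (3x3 after the first convolution, 7x7 after the second) can be checked with the standard recurrence, assuming 3x3 kernels with stride 2 as in the figure:

```python
# Receptive field of a stack of convolutions via the recurrence
# r <- r + (k - 1) * jump, where jump is the product of strides so far.

def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, from input to output."""
    r, jump = 1, 1
    for k, s in layers:
        r += (k - 1) * jump
        jump *= s
    return r

print(receptive_field([(3, 2)]))          # -> 3 (middle layer)
print(receptive_field([(3, 2), (3, 2)]))  # -> 7 (top layer)
```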
The receptive field is the core premise of the SSD architecture, as it allows us to detect objects at different scales and produce a tighter bounding box. Why? As you may recall, the ResNet34 backbone generates 256 7x7 feature maps for an input image. If we specify a 4x4 grid, the simplest approach is to just apply a convolution to this feature map and convert it to 4x4. This approach can actually work to some extent and is exactly the idea of YOLO (You Only Look Once). The extra step that SSD takes is that it applies more convolution layers to the backbone feature map and makes each of these convolution layers produce object detection results. Since the previous layers with smaller receptive field can represent smaller sized objects, the predictions from the previous layers help to deal with smaller sized objects. Because of this, SSD allows us to define a hierarchy of grid cells in different layers. For example, we could use a 4x4 grid to find smaller objects, a 2x2 grid to find medium-sized objects, and a 1x1 grid to find objects that cover the entire image.
YOLO Family¶
In 2015, a family of neural networks was proposed under the acronym YOLO, short for “You Only Look Once” (a play on the phrase “you only live once”). The name comes from the simple fact that the network takes only “one look”, one pass through the network, before producing the final detections. This allows for real-time object detection, which is highly desirable for surveillance-related applications. Because of its exceptional speed, the accuracy of the detected objects is lower than that of the previously mentioned models, but it still manages to be a top competitor among them.
YOLO v1¶
Prior to the invention of YOLO, object detector CNNs such as R-CNN used Region Proposal Networks (RPNs) to first generate bounding box proposals on the input image, then ran a classifier on the bounding boxes, and finally applied post-processing to eliminate duplicate detections as well as refine the bounding boxes. The individual stages of the R-CNN network had to be trained separately. The R-CNN network was difficult to optimize and also slow.
The creators of YOLO were motivated to design a single-stage CNN that could be trained end-to-end, was easy to optimize, and worked in real-time.
As shown, YOLO divides the input image into S x S grid cells. As shown in the middle image of the figure, each grid cell predicts B bounding boxes and an “objectness” score P(Object) indicating whether the grid cell contains an object or not. Each grid cell also predicts the conditional probability P(Class | Object) of the class to which the object contained in the grid cell belongs.
For each bounding box, YOLO predicts five parameters – x, y, w, h and a confidence score. The center of the bounding box relative to the grid cell is indicated by the coordinates (x,y). The values of x and y are bounded between 0 and 1. The width w and height h of the bounding box are predicted as a fraction of the width and height of the entire image. Therefore, their values are between 0 and 1. The confidence score indicates whether the bounding box contains an object and how accurate the bounding box is. If the bounding box does not contain an object, the confidence score is zero. If the bounding box contains an object, the confidence score is equal to the Intersection over Union (IoU) of the predicted bounding box and the ground truth. Thus, for each grid cell, YOLO predicts B x 5 parameters.
For each grid cell, YOLO predicts C class probabilities. These class probabilities are conditional based on whether an object exists in the grid cell. YOLO predicts only one set of C class probabilities per grid cell, even if the grid cell has B bounding boxes. Thus, for each grid cell, YOLO predicts C + B x 5 parameters.
Total prediction tensor for an image = S x S x (C + B x 5). For the PASCAL VOC dataset, YOLO uses S = 7, B = 2, and C = 20. Thus, the final YOLO prediction for PASCAL VOC is a tensor 7 x 7 x (20 + 5 x 2) = 7 x 7 x 30.
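The arithmetic can be checked directly:

```python
# YOLO v1 output size for PASCAL VOC: S x S x (C + B * 5).
S, B, C = 7, 2, 20            # grid size, boxes per cell, classes
per_cell = C + B * 5          # channels predicted per grid cell
total = S * S * per_cell      # total prediction tensor size
print(per_cell, total)        # -> 30 1470 (i.e. a 7 x 7 x 30 tensor)
```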
Finally, YOLO version 1 applies non-maximum suppression (NMS) and thresholding to report the final predictions, as shown in the figure, right image.
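A minimal sketch of greedy non-maximum suppression (assuming corner-format boxes; real implementations are vectorized):

```python
def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression. boxes are (x1, y1, x2, y2) tuples."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest-scoring remaining box
        keep.append(best)
        # drop every remaining box that overlaps it too much
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # -> [0, 2]: box 1 is suppressed by box 0
```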
CNN Design¶
YOLO CNN version 1 is shown in the figure above. It has 24 convolutional layers that act as a feature extractor. These are followed by 2 fully connected layers that are responsible for object classification and bounding box regression. The final output is a 7 x 7 x 30 tensor. YOLO CNN is a simple single-path CNN similar to VGG19. YOLO uses 1x1 convolutions followed by 3x3 convolutions inspired by Google’s Inception CNN version 1. Leaky ReLU activation is used for all layers except the final layer. The final layer uses a linear activation function.
Results¶
The results of YOLO v1 on the PASCAL VOC 2007 dataset are listed above. YOLO achieves 45 FPS and 63.4% mAP, which are significantly higher compared to DPM - another real-time object detector. While Faster R-CNN VGG-16 has a much higher mAP at 73.2%, its speed is considerably slower at 7 FPS.
Limitations¶
- YOLO has difficulty detecting small objects that appear in clusters.
- YOLO has difficulty detecting objects with unusual proportions.
- YOLO makes more localization errors compared to Fast R-CNN.
YOLO v2¶
There are many modifications introduced by the authors, but familiarity with YOLO v1 will help you understand YOLO v2 much faster (and better, and stronger).
Better¶
The authors note that YOLO v1 makes significantly more localization errors than Fast R-CNN and also has relatively low recall. To address these issues, they introduce the following modifications:
Batch Normalization:¶
BN layers are added after each convolutional layer of YOLO v1, which gave the authors around a 2% improvement in mAP.
High resolution classifier¶
YOLO v1 trains the classifier at 224×224 image resolution and upscales to 448 for detection. YOLO v2 instead first fine-tunes its classifier at 448×448 resolution for 10 epochs on ImageNet before training the network for detection. This resulted in a 4% improvement in mAP.
Convolutional with anchor boxes¶
The authors removed the fully connected layers from YOLO v1 and used anchor boxes to predict the bounding boxes. In addition, they removed a pooling layer and changed the input resolution to 416×416 instead of 448×448, because an odd number of locations is needed in the feature map so that there is a single central cell. As a result, they saw a small decrease in mAP, but a good improvement in recall, of approximately 7%.
Dimension clusters¶
There are two problems with anchor boxes when using them with YOLO. The first is that we need to choose good priors, i.e. anchor boxes, for the network to start from so that learning is easier. Therefore, the authors employ K-Means clustering on the bounding boxes of the training set. (The second problem, training instability, is addressed by YOLO v2's direct location prediction.)
a) They chose the distance function as follows: d(box, centroid) = 1 − IOU(box, centroid).
b) They ran K-Means with various values of k and found that k = 5 offers a good tradeoff between model complexity and high recall.
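A sketch of this clustering on box shapes (w, h), assuming boxes are compared center-aligned so that the 1 − IOU distance depends only on their dimensions; the function names and toy data are illustrative:

```python
import random

def shape_iou(a, b):
    """IoU of two boxes (w, h) compared as if they shared a center."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeans_anchors(boxes, k, iters=20, seed=0):
    random.seed(seed)
    centroids = random.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # d(box, centroid) = 1 - IOU(box, centroid)
            nearest = min(range(k), key=lambda i: 1 - shape_iou(b, centroids[i]))
            clusters[nearest].append(b)
        centroids = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

# Toy data: three small square boxes and three wide boxes.
boxes = [(10, 10), (12, 11), (11, 12), (50, 20), (52, 18), (48, 22)]
print(sorted(kmeans_anchors(boxes, k=2)))
```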
Multiscale Training¶
Instead of fixing the input image size, the network randomly chooses a different input resolution every 10 batches from the following multiples of 32: {320, 352, …, 608}. This scheme encourages the network to perform well across a variety of input dimensions and offers an easy tradeoff between speed and accuracy.
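The set of candidate resolutions and the sampling step can be sketched as:

```python
import random

# YOLO v2 multi-scale training: input sizes are multiples of 32 from 320 to 608.
sizes = list(range(320, 609, 32))
print(sizes)  # -> [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]

random.seed(0)
new_size = random.choice(sizes)  # drawn anew every 10 batches during training
print(new_size % 32)  # -> 0
```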
Faster¶
Darknet-19¶
The authors propose a new backbone, Darknet-19, which has 19 convolutional layers and 5 maxpooling layers. It takes 5.58 billion operations to process an image; however, it achieves 72.9% for top-1 accuracy and 91.2% for top-5 accuracy on ImageNet.
Training for classification¶
The authors use standard augmentations. First, they train their proposed backbone with 224×224 input resolution and fine-tune it to a larger size, 448, for 10 epochs. Please refer to the original paper for more details.
Stronger¶
There are several datasets for classification and detection. Can they be combined? This is exactly why the authors propose YOLO9000 in addition to YOLO v2: they combined two datasets to get over 9,000 classes, so this part is about how YOLO9000 was trained.
Microsoft COCO contains about 100k images with detection labels for 80 relatively general classes, such as “dog” or “boat”.
ImageNet has 13 million images with classification labels for 22k more specific classes, such as “Norfolk terrier”, “Yorkshire terrier”, or “Bedlington terrier”.
As shown above, the authors build a hierarchical tree of visual concepts using WordTree. Thus, “Norfolk terrier” is also labeled as “dog” and “mammal.” There are 9,418 classes in total.
Classification and joint detection¶
- The authors use 3 priors instead of 5 to limit the output size.
- For detection images, the loss is backpropagated normally.
- For classification images, only the classification loss is backpropagated, at or above the corresponding label level.
YOLO v3¶
The official title of the YOLO v2 paper made it sound like YOLO was a healthy milk-based drink for kids, rather than an object detection algorithm. It was called “YOLO9000: Better, Faster, Stronger.”
At the time, YOLO9000 was the fastest algorithm and also one of the most accurate. However, a few years later, it was no longer the most accurate, with algorithms like RetinaNet and SSD surpassing it in accuracy. It remained one of the fastest, however.
But that speed has been traded for accuracy in YOLO v3. While the previous variant ran at 45 FPS on a Titan X, the current version runs at around 30 FPS. This has to do with the increased complexity of the underlying architecture, called Darknet.
Darknet-53¶
YOLO v2 used a custom deep architecture, Darknet-19, an originally 19-layer network supplemented with 11 more layers for object detection. With its 30-layer architecture, YOLO v2 often struggled to detect small objects. This was attributed to the loss of fine-grained features as the layers downsampled the input. To remedy this, YOLO v2 used an identity mapping, concatenating feature maps from a previous layer to capture low-level features.
However, the YOLO v2 architecture still lacked some of the most important elements that are now staples of most state-of-the-art algorithms. No residual blocks, no skip connections, and no upsampling. YOLO v3 incorporates all of these.
First, YOLO v3 uses a variant of Darknet, which originally has 53 layers and is trained on ImageNet. For the detection task, 53 more layers are stacked on top, giving a 106-layer fully convolutional architecture for YOLO v3. This is the reason behind YOLO v3's slowness compared to YOLO v2. Here’s what the YOLO architecture looks like now.
Three-scale detection¶
The latest architecture features residual skip connections and upsampling. The most notable feature of v3 is that it performs detections at three different scales. YOLO is a fully convolutional network and its final output is generated by applying a 1 x 1 kernel to a feature map. In YOLO v3, detection is done by applying 1 x 1 detection kernels to feature maps of three different sizes at three different locations in the network.
The shape of the detection kernel is 1 x 1 x (B x (5 + C) ). Here B is the number of bounding boxes that a cell in the feature map can predict, “5” is for the 4 attributes of the bounding box and the confidence of an object, and C is the number of classes. In YOLO v3 trained on COCO, B = 3 and C = 80, so the kernel size is 1 x 1 x 255. The feature map produced by this kernel has identical height and width as the previous feature map, and has detection features along the depth as described above.
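Checking the kernel depth for COCO:

```python
# Depth of the 1 x 1 detection kernel: B * (5 + C).
B, C = 3, 80            # boxes per cell and COCO classes
depth = B * (5 + C)
print(depth)            # -> 255, so the kernel is 1 x 1 x 255
```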
Before we move on, note that the stride of the network, or of a layer, is defined as the ratio by which it downsamples the input. In the following examples, I will assume that we have an input image of size 416 x 416.
YOLO v3 makes predictions at three scales, which are precisely provided by downsampling the input image dimensions by 32, 16, and 8, respectively.
The first detection is made by the 82nd layer. For the first 81 layers, the image is downsampled by the network such that the 81st layer has a stride of 32. If we have a 416 x 416 image, the resulting feature map would be 13 x 13 in size. A detection is made here using the 1 x 1 detection kernel, giving a detection feature map of 13 x 13 x 255.
The feature map from layer 79 is then subjected to a few convolutional layers before being 2x upsampled to dimensions of 26 x 26. This feature map is then depthwise concatenated with the feature map from layer 61. The combined feature maps are then again subjected to a few 1 x 1 convolutional layers to fuse the features from the previous layer (61). Then, the second detection is done by the 94th layer, producing a detection feature map of 26 x 26 x 255.
A similar procedure is followed again, where the feature map from layer 91 is subjected to a few convolutional layers before being concatenated depthwise with a feature map from layer 36. As before, a few 1 x 1 convolutional layers follow to fuse the information from the previous layer (36). The third and final detection is made by the 106th layer, producing a feature map of size 52 x 52 x 255.
Better at detecting smaller objects¶
Detections at different layers help solve the problem of detecting small objects, a common complaint about YOLO v2. The upsampled layers concatenated with earlier layers help preserve the fine-grained features that aid in detecting small objects.
The 13 x 13 layer is responsible for detecting large objects, while the 52 x 52 layer detects the smaller objects, and the 26 x 26 layer detects medium-sized objects. Here is a comparative analysis of different objects picked up in the same image by different layers.
Choosing anchor boxes¶
YOLO v3 uses 9 anchor boxes in total. Three for each scale. If you are training YOLO on your own dataset, you should use K-Means clustering to generate 9 anchors.
Next, arrange the anchors in descending order of one dimension. Assign the three largest anchors to the first scale, the next three to the second scale, and the last three to the third scale.
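This assignment can be sketched with the nine anchors commonly cited for YOLOv3 trained on COCO (treat them as illustrative defaults obtained via K-Means on COCO boxes):

```python
# Nine (w, h) anchors, three per detection scale.
anchors = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
           (59, 119), (116, 90), (156, 198), (373, 326)]
by_size = sorted(anchors, key=lambda wh: wh[0] * wh[1], reverse=True)
scales = {
    "13x13": by_size[:3],   # largest anchors -> coarse scale, large objects
    "26x26": by_size[3:6],  # medium anchors -> medium objects
    "52x52": by_size[6:],   # smallest anchors -> fine scale, small objects
}
print(scales["13x13"])  # -> [(373, 326), (156, 198), (116, 90)]
```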
More bounding boxes per image¶
For an input image of the same size, YOLO v3 predicts more bounding boxes than YOLO v2. For example, at its native resolution of 416 x 416, YOLO v2 predicted 13 x 13 x 5 = 845 boxes. In each grid cell, 5 boxes were detected using 5 anchors.
On the other hand, YOLO v3 predicts boxes at 3 different scales. For the same 416 x 416 image, the number of predicted boxes is 10,647, more than 10x the number predicted by YOLO v2. You can easily imagine why it is slower than YOLO v2. At each scale, each grid cell predicts 3 boxes using 3 anchors, so with three scales, 9 anchor boxes are used in total, 3 per scale.
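These counts follow directly from the grid sizes at each stride:

```python
# Boxes predicted on a 416 x 416 input: 3 anchors at every cell of the
# 13x13, 26x26 and 52x52 detection maps (strides 32, 16 and 8).
input_size = 416
v3_total = sum((input_size // stride) ** 2 * 3 for stride in (32, 16, 8))
v2_total = (input_size // 32) ** 2 * 5  # YOLO v2: one 13x13 grid, 5 anchors
print(v3_total, v2_total)  # -> 10647 845
```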
Benchmarking¶
YOLO v3 performs on par with other state-of-the-art detectors such as RetinaNet on the COCO mAP 50 benchmark, while being considerably faster. It is also better than SSD and its variants. Here is a performance comparison straight from the paper.
YOLO v4¶
YOLOv4 is a real-time object detection model, published in April 2020, that achieved state-of-the-art performance on the COCO dataset. It works by splitting the object detection task into two parts: regression to identify object positioning via bounding boxes, and classification to determine the object's class. This implementation of YOLOv4 uses the Darknet framework.
YOLOv4 builds on many of the previous research contributions in the YOLO family, along with a number of new contributions unique to YOLOv4, including new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss. In short, with YOLOv4 you get a better object detection network architecture and new data augmentation techniques.
YOLOv4 Architecture¶
YOLO v4 Results:¶
As can be seen in the results below, YOLOv4 performs incredibly well at very high FPS; this was a huge improvement over previous object detection models, which offered either high accuracy or high inference speed, but not both.
YOLO v5¶
Origin of YOLOv5: A YOLOv3 PyTorch Extension¶
The YOLOv5 repository is a natural extension of Glenn Jocher’s YOLOv3 PyTorch repository. The YOLOv3 PyTorch repository was a popular destination for developers to port YOLOv3 Darknet weights to PyTorch and then push them into production. Many (including our vision team at Roboflow) liked the ease of use of the PyTorch branch and would use this output for deployment.
After fully replicating the model architecture and training procedure of YOLOv3, Ultralytics began making improvements to the research along with changes to the repository design with the goal of empowering thousands of developers to train and deploy their own custom object detectors to detect any object in the world. This is a goal we share here at Roboflow.
These advancements were originally going to be named YOLOv4, but due to the recent release of YOLOv4 in the Darknet framework, the model was renamed YOLOv5 to avoid version collisions.
An Overview of the YOLOv5 Architecture¶
The YOLO model was the first object detector to connect the bounding box prediction procedure with class labels in an end-to-end differentiable network.
The YOLO network consists of three main parts.
Backbone: A convolutional neural network that aggregates and shapes image features at different granularities.
Neck: A series of layers to mix and match image features to pass them for prediction.
Head: Consumes features from Neck and performs box and class prediction steps.
That said, there are many approaches that can be taken to combine different architectures into each core component. The contributions of YOLOv4 and YOLOv5 are primarily to integrate advances in other areas of computer vision and prove that, as a collection, they improve YOLO object detection.
Automatic learning of Bounding Box Anchors¶
In the YOLOv3 PyTorch repository, Glenn Jocher introduced the idea of learning anchor boxes based on the distribution of bounding boxes in a custom dataset using K-Means and genetic learning algorithms. This is very important for custom tasks because the distribution of bounding box sizes and locations can be drastically different from the predefined bounding box anchors in the COCO dataset.
To make box predictions, the YOLOv5 network predicts bounding boxes as deviations from a list of anchor box dimensions.
The most extreme difference in anchor boxes can occur if we are trying to detect something like giraffes, which are very tall and skinny, or manta rays, which are very wide and flat. All YOLO anchor boxes are automatically learned in YOLOv5 when you input your custom data.
CSP Backbone¶
Both YOLOv4 and YOLOv5 implement the CSP Bottleneck to formulate image features. Research credit for this architecture goes to WongKinYiu and his recent paper on Cross Stage Partial Networks for Convolutional Neural Network Backbones.
CSP addresses the problem of duplicated gradients in other larger ConvNet backbones, resulting in fewer parameters and fewer FLOPS of comparable importance. This is extremely important for the YOLO family, where inference speed and small model size are of utmost importance.
CSP models are based on DenseNet. DenseNet was designed to connect layers in convolutional neural networks with the following motivations:
- to alleviate the vanishing gradient problem (it is difficult to backprop loss signals through a very deep network);
- to enforce feature propagation;
- to encourage the network to reuse features; and
- to reduce the number of network parameters.
In CSPResNext50 and CSPDarknet53, DenseNet was edited to separate the base layer feature map, copying it and sending one copy through the dense block and sending another directly to the next stage. The idea with CSPResNext50 and CSPDarknet53 is to remove computational bottlenecks in DenseNet and improve learning by passing in an unedited version of the feature map.
PA-Net Neck¶
Both YOLOv4 and YOLOv5 implement the PA-NET neck for feature aggregation.
Each of the P_i above represents a feature layer in the CSP backbone.
The above image comes from a Google Brain paper on the EfficientDet object detection architecture. The authors of EfficientDet found BiFPN to be the best choice for the detection neck, so this may be an area for YOLOv4 and YOLOv5 to explore further with other implementations.
It is certainly worth noting here that YOLOv5 borrows from YOLOv4’s research inquiry to decide on the best neck for its architecture. YOLOv4 investigated several possibilities for the best YOLO Neck, including:
- FPN
- PAN
- NAS-FPN
- BiFPN
- ASFF
- SFAM
YOLOv5 Preliminary Evaluation Metrics¶
The evaluation metrics presented in this section are preliminary; we can expect a formal research paper on YOLOv5 to be published once the research work is complete and more new contributions have been made to the YOLO model family.
That said, it is useful to provide these metrics for a developer deciding which framework to use today, before the YOLOv5 research papers are published.
The evaluation metrics below are based on performance on the COCO dataset, which contains a wide variety of images spanning 80 object classes. For more details on the performance metric, see this post on what mAP is.
The official YOLOv4 paper reports the following evaluation metrics, obtained by running its trained network on the COCO dataset on a V100 GPU:
Use Case: Orange Tree Detection with YOLO v5¶
Dataset preparation¶
To get your object detector up and running, you must first collect training images. You should think carefully about the activity you are trying to complete and plan ahead for the components of the task that your model might find difficult. To improve the accuracy of your final model, I recommend reducing the domain that your model must handle as much as possible.
For custom YOLOv5 training, we need to build a dataset. If you don’t have any data, you can use the openimages database.
Annotating the dataset¶
Use LabelImg or any other annotation tool to annotate the dataset. Create an annotation text file with the same name as the image.
Prepare matching pairs, for example:
- images_0.jpg
- images_0.txt
YOLOv5 accepts label data in text files (.txt) in the following format:
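Each line describes one object as `class x_center y_center width height`, with coordinates normalized by the image dimensions. A small conversion helper (the function name is hypothetical) illustrates the format:

```python
def to_yolo_label(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-coordinate corner box to a YOLO label line:
    class x_center y_center width height, all normalized to [0, 1]."""
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# A 100x50 box centered in a 640x480 image, class 0
print(to_yolo_label(0, 270, 215, 370, 265, 640, 480))
# -> 0 0.500000 0.500000 0.156250 0.104167
```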
Implementing:¶
In this example, we will use a portion of the total area of a drone image to create the dataset. In this part, we will use QGIS to create a shapefile (.shp) of polygons, drawing each object in the image as shown in the example below. It is possible to draw fairly simple polygons, because here we are only interested in the bounding box of each tree, not in segmenting each tree perfectly. After collecting all the individual trees in the image, we will write the code to produce the format mentioned above to feed YOLOv5.
Let's use a ready-made implementation of YOLO v5 to detect orange trees in a UAV image.
First, we will install rasterio and geopandas to manipulate the UAV image and shapefile with the delimitation of the training samples:
!pip install rasterio geopandas
We will import the packages and functions that we will use here:
import os
import shutil
import json
import ast
import glob

import numpy as np
import pandas as pd
import cv2
import seaborn as sns
import rasterio
import geopandas as gpd
from matplotlib import pyplot as plt
from rasterio.features import rasterize
from rasterio.windows import Window
from shapely.geometry import box
from skimage.io import imsave
from sklearn import model_selection
from tqdm import tqdm
import fastai.vision as vision
The next steps are to mount Google Drive, define the file paths, and load the image and the vector labels:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
path_img = '/content/drive/MyDrive/Datasets/orange_trees/Orange_trees.tif'
path_shp = '/content/drive/MyDrive/Datasets/orange_trees/orange_trees.shp'
label = gpd.read_file(path_shp)
src = rasterio.open(path_img)
img = src.read()
img.shape
(3, 5106, 15360)
img = img.transpose([1,2,0])
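The transpose is needed because rasterio reads rasters band-first, as (bands, rows, cols), while matplotlib's imshow expects channel-last, (rows, cols, bands). A minimal sketch with a small dummy array (the sizes here are made up; the real image is (3, 5106, 15360)):

```python
import numpy as np

# rasterio's src.read() returns (bands, rows, cols); imshow wants
# (rows, cols, bands), so we move the band axis to the end.
fake_raster = np.zeros((3, 4, 6), dtype=np.uint8)
display_ready = fake_raster.transpose([1, 2, 0])
print(display_ready.shape)  # (4, 6, 3)
```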
Now we can plot the original image:
plt.figure(figsize=[16,16])
plt.imshow(img)
plt.axis('off')
(-0.5, 15359.5, 5105.5, -0.5)
So, let's split this image into several 1024x1024 pixel patches:
qtd = 0
out_meta = src.meta.copy()
for n in range(src.meta['width'] // 1024):
    for m in range(src.meta['height'] // 1024):
        x = 512 + (n * 1024)
        y = 512 + (m * 1024)
        window = Window(x, y, 1024, 1024)
        win_transform = src.window_transform(window)
        arr_win = src.read(window=window)
        arr_win = arr_win[0:3, :, :]  # keep only the three RGB bands
        qtd = qtd + 1
        path_exp = '/content/drive/MyDrive/Datasets/orange_trees/data/img_' + str(qtd) + '.tif'
        out_meta.update({"driver": "GTiff", "height": 1024, "width": 1024,
                         "transform": win_transform})
        with rasterio.open(path_exp, 'w', **out_meta) as dst:
            for i, layer in enumerate(arr_win, start=1):
                dst.write_band(i, layer)
        del arr_win
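As a sanity check on the number of patches, integer division of the image dimensions by the window size gives the patch grid. A quick sketch using the dimensions reported by img.shape above:

```python
# Image dimensions from img.shape, 1024-pixel non-overlapping windows.
width, height = 15360, 5106
patch = 1024
n_cols = width // patch   # 15 windows across
n_rows = height // patch  # 4 windows down
print(n_cols * n_rows)    # 60 patches in total
```

This matches the 54 training + 6 validation images produced by the split below.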
We now have a data folder with the patches resulting from the split. Let's separate these images into training and validation sets:
path_data = '/content/drive/MyDrive/Datasets/orange_trees/data/'
images_files = [f for f in os.listdir(path_data)]
images_files_train, images_files_valid = model_selection.train_test_split(
    images_files,
    test_size=0.1,
    random_state=42,
    shuffle=True,
)
print(len(images_files_train))
print(len(images_files_valid))
54
6
The patches are in .tif format, but the YOLOv5 implementation expects .jpg images. Let's create two folders in our Google Colab workspace to store the training and validation images.
destination_1 = 'train'
destination_2 = 'validation'
if not os.path.isdir(destination_1):
    os.mkdir(destination_1)
if not os.path.isdir(destination_2):
    os.mkdir(destination_2)
path_data_new = '/content/train'
for images in images_files_train:
    src = rasterio.open(os.path.join(path_data, images))
    raster = src.read()
    raster = raster.transpose([1, 2, 0])
    imsave(os.path.join(path_data_new, images.split('.')[0] + '.jpg'), raster)
path_data_new = '/content/validation'
for images in images_files_valid:
    src = rasterio.open(os.path.join(path_data, images))
    raster = src.read()
    raster = raster.transpose([1, 2, 0])
    imsave(os.path.join(path_data_new, images.split('.')[0] + '.jpg'), raster)
Now let's work with the labels. First we check whether the image and the vector layer use the same coordinate reference system:
src.crs
CRS.from_epsg(31982)
label = label.to_crs(31982)
label.crs
<Projected CRS: EPSG:31982>
Name: SIRGAS 2000 / UTM zone 22S
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Brazil - between 54°W and 48°W, northern and southern hemispheres, onshore and offshore. In remainder of South America - between 54°W and 48°W, southern hemisphere, onshore and offshore.
- bounds: (-54.0, -54.18, -47.99, 7.04)
Coordinate Operation:
- name: UTM zone 22S
- method: Transverse Mercator
Datum: Sistema de Referencia Geocentrico para las AmericaS 2000
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich
Let's plot the labels:
fig, ax = plt.subplots(1, 1, figsize=(15, 15))
label.plot(ax = ax)
<Axes: >
The annotations are polygons, but we will only use the bounding box of each orange tree.
Bound_list = []
for i, p in label.iterrows():
    bound = label['geometry'][i].bounds  # (xmin, ymin, xmax, ymax)
    geom = box(*bound)                   # rectangle from that extent
    Bound_list.append(geom)
gdf_bounds = gpd.GeoDataFrame(geometry=Bound_list)
fig, ax = plt.subplots(1, 1, figsize=(15, 15))
gdf_bounds.plot(ax = ax)
<Axes: >
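The .bounds call above reduces each polygon to its axis-aligned extent (xmin, ymin, xmax, ymax), which box() then turns back into a rectangle. A pure-Python sketch of the same quantity, with made-up vertices standing in for one tree crown:

```python
# Axis-aligned bounding box of a polygon's vertices -- the same tuple
# shapely exposes as .bounds (vertices here are invented for illustration).
crown = [(0, 0), (4, 1), (5, 5), (1, 4)]
xs = [p[0] for p in crown]
ys = [p[1] for p in crown]
bounds = (min(xs), min(ys), max(xs), max(ys))
print(bounds)  # (0, 0, 5, 5)
```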
Now let's create the training and validation datasets with the bboxes that intersect each of the patches:
poly_geometry_train = []
img_id_train = []
for fp1 in images_files_train:
    src1 = rasterio.open(os.path.join(path_data, fp1))
    bounds1 = src1.bounds
    df1 = gpd.GeoDataFrame({"id": 1, "geometry": [box(*bounds1)]})
    df1 = df1.set_crs(epsg=31982)
    for i, row in gdf_bounds.iterrows():
        intersects = df1['geometry'][0].intersection(row['geometry'])
        if not intersects.is_empty:
            poly_geometry_train.append(intersects)
            img_id_train.append(fp1)
/usr/local/lib/python3.10/dist-packages/shapely/set_operations.py:133: RuntimeWarning: invalid value encountered in intersection
  return lib.intersection(a, b, **kwargs)
poly_geometry_val = []
img_id_val = []
for fp2 in images_files_valid:
    src2 = rasterio.open(os.path.join(path_data, fp2))
    bounds2 = src2.bounds
    df2 = gpd.GeoDataFrame({"id": 1, "geometry": [box(*bounds2)]})
    df2 = df2.set_crs(epsg=31982)
    for i, row in gdf_bounds.iterrows():
        intersects = df2['geometry'][0].intersection(row['geometry'])
        if not intersects.is_empty:
            poly_geometry_val.append(intersects)
            img_id_val.append(fp2)
/usr/local/lib/python3.10/dist-packages/shapely/set_operations.py:133: RuntimeWarning: invalid value encountered in intersection
  return lib.intersection(a, b, **kwargs)
dataset_train = gpd.GeoDataFrame(geometry=poly_geometry_train)
dataset_val = gpd.GeoDataFrame(geometry=poly_geometry_val)
dataset_train['ImageId'] = img_id_train
dataset_val['ImageId'] = img_id_val
So we have the dataframe with the bbox geometry and the id of the image to which it belongs:
dataset_val
|    | geometry | ImageId |
|---|---|---|
| 0 | POLYGON ((621368.878 7741229.644, 621368.878 7... | img_1.tif |
| 1 | POLYGON ((621372.923 7741229.644, 621372.923 7... | img_1.tif |
| 2 | POLYGON ((621377.033 7741229.644, 621377.033 7... | img_1.tif |
| 3 | POLYGON ((621368.472 7741239.415, 621370.851 7... | img_1.tif |
| 4 | POLYGON ((621371.908 7741235.714, 621371.908 7... | img_1.tif |
| 5 | POLYGON ((621375.582 7741235.227, 621375.582 7... | img_1.tif |
| 6 | POLYGON ((621380.025 7741234.772, 621380.025 7... | img_1.tif |
| 7 | POLYGON ((621371.470 7741244.855, 621371.470 7... | img_1.tif |
| 8 | POLYGON ((621375.872 7741244.855, 621375.872 7... | img_1.tif |
| 9 | POLYGON ((621380.335 7741244.855, 621380.335 7... | img_1.tif |
| 10 | POLYGON ((621383.683 7741242.222, 621381.281 7... | img_1.tif |
| 11 | POLYGON ((621383.683 7741222.060, 621384.048 7... | img_6.tif |
| 12 | POLYGON ((621384.972 7741218.533, 621384.972 7... | img_6.tif |
| 13 | POLYGON ((621388.691 7741217.173, 621388.691 7... | img_6.tif |
| 14 | POLYGON ((621393.000 7741216.096, 621393.000 7... | img_6.tif |
| 15 | POLYGON ((621398.894 7741216.122, 621397.091 7... | img_6.tif |
| 16 | POLYGON ((621383.683 7741229.616, 621384.016 7... | img_6.tif |
| 17 | POLYGON ((621384.773 7741225.786, 621384.773 7... | img_6.tif |
| 18 | POLYGON ((621389.122 7741225.274, 621389.122 7... | img_6.tif |
| 19 | POLYGON ((621392.462 7741223.859, 621392.462 7... | img_6.tif |
| 20 | POLYGON ((621398.894 7741223.428, 621396.850 7... | img_6.tif |
| 21 | POLYGON ((621516.571 7741229.644, 621516.571 7... | img_37.tif |
| 22 | POLYGON ((621513.216 7741229.644, 621513.216 7... | img_37.tif |
| 23 | POLYGON ((621509.029 7741229.916, 621509.029 7... | img_37.tif |
| 24 | POLYGON ((621505.602 7741231.842, 621505.602 7... | img_37.tif |
| 25 | POLYGON ((621516.912 7741235.170, 621516.912 7... | img_37.tif |
| 26 | POLYGON ((621513.161 7741236.718, 621513.161 7... | img_37.tif |
| 27 | POLYGON ((621509.278 7741237.937, 621509.278 7... | img_37.tif |
| 28 | POLYGON ((621505.767 7741239.289, 621505.767 7... | img_37.tif |
| 29 | POLYGON ((621520.584 7741244.855, 621520.584 7... | img_37.tif |
| 30 | POLYGON ((621517.431 7741244.855, 621517.431 7... | img_37.tif |
| 31 | POLYGON ((621551.007 7741229.644, 621550.923 7... | img_45.tif |
| 32 | POLYGON ((621535.795 7741229.644, 621535.795 7... | img_45.tif |
| 33 | POLYGON ((621547.436 7741229.910, 621547.436 7... | img_45.tif |
| 34 | POLYGON ((621543.854 7741231.684, 621543.854 7... | img_45.tif |
| 35 | POLYGON ((621539.975 7741233.141, 621539.975 7... | img_45.tif |
| 36 | POLYGON ((621536.431 7741234.865, 621536.431 7... | img_45.tif |
| 37 | POLYGON ((621551.007 7741236.928, 621550.905 7... | img_45.tif |
| 38 | POLYGON ((621547.335 7741238.765, 621547.335 7... | img_45.tif |
| 39 | POLYGON ((621543.350 7741240.539, 621543.350 7... | img_45.tif |
| 40 | POLYGON ((621539.994 7741241.726, 621539.994 7... | img_45.tif |
| 41 | POLYGON ((621538.921 7741244.855, 621538.921 7... | img_45.tif |
| 42 | POLYGON ((621398.894 7741187.132, 621401.562 7... | img_12.tif |
| 43 | POLYGON ((621402.872 7741184.010, 621402.872 7... | img_12.tif |
| 44 | POLYGON ((621407.294 7741184.010, 621407.294 7... | img_12.tif |
| 45 | POLYGON ((621411.092 7741184.010, 621411.092 7... | img_12.tif |
| 46 | POLYGON ((621399.021 7741192.163, 621399.021 7... | img_12.tif |
| 47 | POLYGON ((621402.882 7741191.316, 621402.882 7... | img_12.tif |
| 48 | POLYGON ((621406.586 7741190.292, 621406.586 7... | img_12.tif |
| 49 | POLYGON ((621410.723 7741189.701, 621410.723 7... | img_12.tif |
| 50 | POLYGON ((621412.068 7741199.221, 621412.068 7... | img_12.tif |
| 51 | POLYGON ((621414.105 7741197.017, 621413.173 7... | img_12.tif |
| 52 | POLYGON ((621569.284 7741200.821, 621569.284 7... | img_55.tif |
| 53 | POLYGON ((621566.218 7741205.280, 621568.438 7... | img_55.tif |
| 54 | POLYGON ((621581.429 7741203.564, 621579.128 7... | img_55.tif |
| 55 | POLYGON ((621575.931 7741206.380, 621575.931 7... | img_55.tif |
| 56 | POLYGON ((621572.519 7741208.495, 621572.519 7... | img_55.tif |
| 57 | POLYGON ((621569.176 7741210.360, 621569.176 7... | img_55.tif |
| 58 | POLYGON ((621566.218 7741214.432, 621568.187 7... | img_55.tif |
| 59 | POLYGON ((621581.429 7741214.432, 621581.429 7... | img_55.tif |
The next step is to convert the projected coordinates into pixel (row, column) values.
df_train = []
Id_train = []
for i, row in dataset_train.iterrows():
    ImageID = row['ImageId'].split('.')[0] + '.jpg'
    src1 = rasterio.open(os.path.join(path_data, row['ImageId']))
    poly = []
    for point in list(row.geometry.exterior.coords):
        x = point[0]
        y = point[1]
        # use r, c to avoid shadowing the loop variable `row`
        # and the built-in `tuple`
        r, c = src1.index(x, y)
        poly.append((r, c))
    Id_train.append(ImageID)
    df_train.append(poly)
df_val = []
Id_val = []
for i, row in dataset_val.iterrows():
    ImageID = row['ImageId'].split('.')[0] + '.jpg'
    src2 = rasterio.open(os.path.join(path_data, row['ImageId']))
    poly = []
    for point in list(row.geometry.exterior.coords):
        x = point[0]
        y = point[1]
        r, c = src2.index(x, y)
        poly.append((r, c))
    Id_val.append(ImageID)
    df_val.append(poly)
train_set = pd.DataFrame([])
valid_set = pd.DataFrame([])
train_set['ImageId'] = Id_train
valid_set['ImageId'] = Id_val
train_set['geometry'] = df_train
valid_set['geometry'] = df_val
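The src.index() calls above map projected (x, y) coordinates to (row, col) pixel indices using the raster's affine transform. A simplified sketch of that mapping for a north-up raster with square pixels; the origin and resolution below are invented example values, not those of the real file:

```python
def world_to_pixel(x, y, left, top, res):
    """Map projected coordinates to (row, col), mirroring what
    rasterio's src.index() does for a north-up raster with square
    pixels. `left`, `top` and `res` are hypothetical example values."""
    col = int((x - left) // res)   # columns grow eastwards
    row = int((top - y) // res)    # rows grow southwards from the top edge
    return row, col

# Hypothetical patch: upper-left corner at (621000, 7741500), 0.5 m pixels.
print(world_to_pixel(621000.0, 7741500.0, 621000.0, 7741500.0, 0.5))  # (0, 0)
print(world_to_pixel(621256.0, 7741244.0, 621000.0, 7741500.0, 0.5))  # (512, 512)
```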
Since we only have the orange tree class, let's configure it.
train_set['class'] = 0
train_set['class_name'] = 'orange_tree'
valid_set['class'] = 0
valid_set['class_name'] = 'orange_tree'
train_set
|    | ImageId | geometry | class | class_name |
|---|---|---|---|---|
| 0 | img_32.jpg | [(677, 0), (677, 70), (850, 70), (850, 0), (67... | 0 | orange_tree |
| 1 | img_32.jpg | [(905, 114), (764, 114), (764, 300), (905, 300... | 0 | orange_tree |
| 2 | img_32.jpg | [(1002, 389), (831, 389), (831, 579), (1002, 5... | 0 | orange_tree |
| 3 | img_32.jpg | [(1024, 629), (924, 629), (924, 844), (1024, 8... | 0 | orange_tree |
| 4 | img_32.jpg | [(391, 138), (174, 138), (174, 346), (391, 346... | 0 | orange_tree |
| ... | ... | ... | ... | ... |
| 489 | img_38.jpg | [(483, 101), (307, 101), (307, 300), (483, 300... | 0 | orange_tree |
| 490 | img_38.jpg | [(211, 0), (211, 52), (402, 52), (402, 0), (21... | 0 | orange_tree |
| 491 | img_38.jpg | [(304, 1024), (304, 950), (151, 950), (151, 10... | 0 | orange_tree |
| 492 | img_38.jpg | [(0, 926), (204, 926), (204, 753), (0, 753), (... | 0 | orange_tree |
| 493 | img_38.jpg | [(0, 663), (53, 663), (53, 528), (0, 528), (0,... | 0 | orange_tree |
494 rows × 4 columns
We just need to get the xmin, ymin, xmax and ymax coordinates for each annotation. To do this we will use the getBounds function:
def getBounds(geometry):
    # Note: each vertex is stored as (row, col), so the "x" values here are
    # actually row indices; process_data() below swaps them back accordingly.
    try:
        arr = np.array(geometry).T
        xmin = np.min(arr[0])
        ymin = np.min(arr[1])
        xmax = np.max(arr[0])
        ymax = np.max(arr[1])
        return (xmin, ymin, xmax, ymax)
    except Exception:
        return np.nan

def getWidth(bounds):
    try:
        (xmin, ymin, xmax, ymax) = bounds
        return np.abs(xmax - xmin)
    except Exception:
        return np.nan

def getHeight(bounds):
    try:
        (xmin, ymin, xmax, ymax) = bounds
        return np.abs(ymax - ymin)
    except Exception:
        return np.nan
train_set.loc[:,'bounds'] = train_set.loc[:,'geometry'].apply(getBounds)
train_set.loc[:,'width'] = train_set.loc[:,'bounds'].apply(getWidth)
train_set.loc[:,'height'] = train_set.loc[:,'bounds'].apply(getHeight)
train_set.head(10)
|    | ImageId | geometry | class | class_name | bounds | width | height |
|---|---|---|---|---|---|---|---|
| 0 | img_32.jpg | [(677, 0), (677, 70), (850, 70), (850, 0), (67... | 0 | orange_tree | (677, 0, 850, 70) | 173 | 70 |
| 1 | img_32.jpg | [(905, 114), (764, 114), (764, 300), (905, 300... | 0 | orange_tree | (764, 114, 905, 300) | 141 | 186 |
| 2 | img_32.jpg | [(1002, 389), (831, 389), (831, 579), (1002, 5... | 0 | orange_tree | (831, 389, 1002, 579) | 171 | 190 |
| 3 | img_32.jpg | [(1024, 629), (924, 629), (924, 844), (1024, 8... | 0 | orange_tree | (924, 629, 1024, 844) | 100 | 215 |
| 4 | img_32.jpg | [(391, 138), (174, 138), (174, 346), (391, 346... | 0 | orange_tree | (174, 138, 391, 346) | 217 | 208 |
| 5 | img_32.jpg | [(453, 374), (271, 374), (271, 580), (453, 580... | 0 | orange_tree | (271, 374, 453, 580) | 182 | 206 |
| 6 | img_32.jpg | [(553, 649), (392, 649), (392, 819), (553, 819... | 0 | orange_tree | (392, 649, 553, 819) | 161 | 170 |
| 7 | img_32.jpg | [(642, 1024), (642, 918), (469, 918), (469, 10... | 0 | orange_tree | (469, 918, 642, 1024) | 173 | 106 |
| 8 | img_32.jpg | [(0, 736), (0, 736), (0, 538), (0, 538), (0, 7... | 0 | orange_tree | (0, 538, 0, 736) | 0 | 198 |
| 9 | img_32.jpg | [(0, 986), (76, 986), (76, 792), (0, 792), (0,... | 0 | orange_tree | (0, 792, 76, 986) | 76 | 194 |
valid_set.loc[:,'bounds'] = valid_set.loc[:,'geometry'].apply(getBounds)
valid_set.loc[:,'width'] = valid_set.loc[:,'bounds'].apply(getWidth)
valid_set.loc[:,'height'] = valid_set.loc[:,'bounds'].apply(getHeight)
valid_set.head(10)
|    | ImageId | geometry | class | class_name | bounds | width | height |
|---|---|---|---|---|---|---|---|
| 0 | img_1.jpg | [(1024, 27), (927, 27), (927, 242), (1024, 242... | 0 | orange_tree | (927, 27, 1024, 242) | 97 | 215 |
| 1 | img_1.jpg | [(1024, 299), (947, 299), (947, 504), (1024, 5... | 0 | orange_tree | (947, 299, 1024, 504) | 77 | 205 |
| 2 | img_1.jpg | [(1024, 576), (1000, 576), (1000, 761), (1024,... | 0 | orange_tree | (1000, 576, 1024, 761) | 24 | 185 |
| 3 | img_1.jpg | [(366, 0), (366, 160), (559, 160), (559, 0), (... | 0 | orange_tree | (366, 0, 559, 160) | 193 | 160 |
| 4 | img_1.jpg | [(615, 231), (410, 231), (410, 438), (615, 438... | 0 | orange_tree | (410, 231, 615, 438) | 205 | 207 |
| 5 | img_1.jpg | [(648, 478), (448, 478), (448, 695), (648, 695... | 0 | orange_tree | (448, 478, 648, 695) | 200 | 217 |
| 6 | img_1.jpg | [(678, 777), (488, 777), (488, 964), (678, 964... | 0 | orange_tree | (488, 777, 678, 964) | 190 | 187 |
| 7 | img_1.jpg | [(0, 201), (41, 201), (41, 53), (0, 53), (0, 2... | 0 | orange_tree | (0, 53, 41, 201) | 41 | 148 |
| 8 | img_1.jpg | [(0, 498), (134, 498), (134, 292), (0, 292), (... | 0 | orange_tree | (0, 292, 134, 498) | 134 | 206 |
| 9 | img_1.jpg | [(0, 798), (168, 798), (168, 616), (0, 616), (... | 0 | orange_tree | (0, 616, 168, 798) | 168 | 182 |
After that we create the .csv files to use in YOLOv5:
def convert(data, data_type):
    # One row per image, with the lists of its boxes and classes.
    df = data.groupby('ImageId')['bounds'].apply(list).reset_index(name='bboxes')
    df['classes'] = data.groupby('ImageId')['class'].apply(list).reset_index(drop=True)
    df.to_csv(data_type + '.csv', index=False)
    print(data_type)
    print(df.shape)
    print(df.head())

convert(train_set, '/content/train')
convert(valid_set, '/content/validation')
/content/train
(52, 3)
ImageId bboxes \
0 img_10.jpg [(782, 0, 910, 62), (776, 160, 994, 335), (874...
1 img_11.jpg [(781, 0, 935, 79), (804, 193, 992, 344), (873...
2 img_13.jpg [(955, 0, 1024, 72), (981, 121, 1024, 330), (4...
3 img_14.jpg [(991, 0, 1023, 148), (456, 0, 640, 114), (515...
4 img_15.jpg [(517, 0, 704, 193), (587, 222, 763, 404), (67...
classes
0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
1 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
2 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
3 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
4 [0, 0, 0, 0, 0, 0, 0, 0, 0]
/content/validation
(6, 3)
ImageId bboxes \
0 img_1.jpg [(927, 27, 1024, 242), (947, 299, 1024, 504), ...
1 img_12.jpg [(813, 0, 1000, 179), (865, 267, 1024, 459), (...
2 img_37.jpg [(1021, 753, 1024, 926), (927, 528, 1024, 663)...
3 img_45.jpg [(960, 1018, 1024, 1024), (997, 0, 1024, 189),...
4 img_55.jpg [(734, 206, 916, 398), (616, 0, 793, 149), (51...
classes
0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
1 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
2 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
3 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
4 [0, 0, 0, 0, 0, 0, 0, 0]
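The one-row-per-image structure above comes from the groupby pattern inside convert. A toy illustration of the same pattern with a hypothetical three-annotation dataframe:

```python
import pandas as pd

# Hypothetical annotations: one row per bounding box, as in train_set.
data = pd.DataFrame({
    "ImageId": ["img_1.jpg", "img_1.jpg", "img_2.jpg"],
    "bounds":  [(0, 0, 10, 10), (5, 5, 20, 20), (1, 1, 8, 8)],
    "class":   [0, 0, 0],
})

# Collapse to one row per image with lists of boxes and classes,
# mirroring the groupby calls inside convert().
df = data.groupby("ImageId")["bounds"].apply(list).reset_index(name="bboxes")
df["classes"] = data.groupby("ImageId")["class"].apply(list).reset_index(drop=True)
print(df.shape)  # (2, 3): two images, three columns
```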
Now it's time to prepare the environment for YOLOv5. We will clone the GitHub repository and install its dependencies:
!git clone https://github.com/ultralytics/yolov5.git # clone repo
!pip install -qr yolov5/requirements.txt # install dependencies
#!pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
Cloning into 'yolov5'...
remote: Enumerating objects: 15994, done.
remote: Counting objects: 100% (27/27), done.
remote: Compressing objects: 100% (15/15), done.
remote: Total 15994 (delta 18), reused 18 (delta 12), pack-reused 15967
Receiving objects: 100% (15994/15994), 14.64 MiB | 28.03 MiB/s, done.
Resolving deltas: 100% (10980/10980), done.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 190.0/190.0 kB 4.3 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 618.0/618.0 kB 12.4 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.7/62.7 kB 6.9 MB/s eta 0:00:00
import torch
from IPython.display import Image # for displaying images
print('Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))
Using torch 2.0.1+cu118 _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', major=7, minor=0, total_memory=16150MB, multi_processor_count=80)
%cd yolov5
!ls
/content/yolov5 benchmarks.py data LICENSE requirements.txt tutorial.ipynb CITATION.cff detect.py models segment utils classify export.py README.md setup.cfg val.py CONTRIBUTING.md hubconf.py README.zh-CN.md train.py
Inside the yolov5 folder, we create a plant_data folder to store our data.
!mkdir plant_data
%cd plant_data
/content/yolov5/plant_data
!mkdir images
!mkdir labels
%cd images
!mkdir train
!mkdir validation
%cd ..
%cd labels
!mkdir train
!mkdir validation
%cd ..
%cd ..
%cd ..
/content/yolov5/plant_data/images
/content/yolov5/plant_data
/content/yolov5/plant_data/labels
/content/yolov5/plant_data
/content/yolov5
/content
for root, dirs, _ in os.walk('yolov5/plant_data'):
    print(root)
    print(dirs)
yolov5/plant_data
['labels', 'images']
yolov5/plant_data/labels
['train', 'validation']
yolov5/plant_data/labels/train
[]
yolov5/plant_data/labels/validation
[]
yolov5/plant_data/images
['train', 'validation']
yolov5/plant_data/images/train
[]
yolov5/plant_data/images/validation
[]
Let's copy the images and create a .txt file with the bounding boxes for each image inside the project folders we just created:
INPUT_PATH = '/content/'
OUTPUT_PATH = '/content/yolov5/plant_data'
def process_data(data, data_type='train'):
    for _, row in tqdm(data.iterrows(), total=len(data)):
        image_name = row['ImageId'].split('.')[0]
        bounding_boxes = row['bboxes']
        classes = row['classes']
        yolo_data = []
        for bbox, Class in zip(bounding_boxes, classes):
            # Boxes are stored as [y_min, x_min, y_max, x_max] in pixels;
            # YOLO expects [x_center, y_center, width, height], normalized
            # by the image size (1024x1024 here).
            x_min = bbox[1]
            y_min = bbox[0]
            x_max = bbox[3]
            y_max = bbox[2]
            x_center = (x_min + x_max) / 2.0 / 1024
            y_center = (y_min + y_max) / 2.0 / 1024
            x_extend = (x_max - x_min) / 1024
            y_extend = (y_max - y_min) / 1024
            yolo_data.append([Class, x_center, y_center, x_extend, y_extend])
        yolo_data = np.array(yolo_data)
        np.savetxt(
            os.path.join(OUTPUT_PATH, f"labels/{data_type}/{image_name}.txt"),
            yolo_data,
            fmt=["%d", "%f", "%f", "%f", "%f"]
        )
        shutil.copyfile(
            os.path.join(INPUT_PATH, f"{data_type}/{image_name}.jpg"),
            os.path.join(OUTPUT_PATH, f"images/{data_type}/{image_name}.jpg")
        )
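The coordinate conversion is easy to verify in isolation. The sketch below repeats it for a single box, assuming (as the annotations here do) boxes stored as `[y_min, x_min, y_max, x_max]` in pixels on a 1024x1024 image; `to_yolo` is a hypothetical helper name, not part of the notebook's code:

```python
IMG_SIZE = 1024  # assumed image resolution, matching this dataset

def to_yolo(bbox, img_size=IMG_SIZE):
    """Convert [y_min, x_min, y_max, x_max] pixels to normalized YOLO format."""
    y_min, x_min, y_max, x_max = bbox
    x_center = (x_min + x_max) / 2.0 / img_size
    y_center = (y_min + y_max) / 2.0 / img_size
    width = (x_max - x_min) / img_size
    height = (y_max - y_min) / img_size
    return [x_center, y_center, width, height]

# A 256x256 box with its top-left corner at pixel (128, 128):
print(to_yolo([128, 128, 384, 384]))  # [0.25, 0.25, 0.25, 0.25]
```

A full-image box maps to `[0.5, 0.5, 1.0, 1.0]`, which is a quick sanity check worth running on any new dataset.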
df_train = pd.read_csv('/content/train.csv')
df_train.bboxes = df_train.bboxes.apply(ast.literal_eval)
df_train.classes = df_train.classes.apply(ast.literal_eval)
df_valid = pd.read_csv('/content/validation.csv')
df_valid.bboxes = df_valid.bboxes.apply(ast.literal_eval)
df_valid.classes = df_valid.classes.apply(ast.literal_eval)
process_data(df_train, data_type='train')
process_data(df_valid, data_type='validation')
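The `ast.literal_eval` calls are needed because a CSV cell holding a list comes back from `pd.read_csv` as a plain string. A small illustration of the round-trip (the literal value is made up):

```python
import ast

# pandas reads a list-valued CSV cell back as its string representation;
# ast.literal_eval safely turns it back into a Python list.
cell = "[[128, 128, 384, 384], [0, 0, 64, 64]]"
bboxes = ast.literal_eval(cell)
print(type(cell).__name__, "->", type(bboxes).__name__)  # str -> list
```

Unlike `eval`, `literal_eval` only accepts Python literals, so it cannot execute arbitrary code hidden in a data file.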
100%|██████████| 52/52 [00:00<00:00, 1415.87it/s]
100%|██████████| 6/6 [00:00<00:00, 1161.59it/s]
Here we can check if the .txt was created correctly:
label_dir = '/content/yolov5/plant_data/labels/train/'
with open(os.path.join(label_dir, os.listdir(label_dir)[0])) as f:
    print(f.name)
    for line in f:
        print(line, end='')
/content/yolov5/plant_data/labels/train/img_58.txt
0 0.005371 0.951172 0.010742 0.095703
0 0.358887 0.668945 0.053711 0.140625
0 0.200684 0.568848 0.215820 0.213867
0 0.022949 0.423828 0.045898 0.150391
0 0.322754 0.097168 0.131836 0.180664
0 0.122070 0.027344 0.130859 0.054688
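Beyond eyeballing one file, the format can be asserted programmatically. A minimal sketch (`check_labels` is a hypothetical helper): every row must be `class x_center y_center width height`, with an integer class id and all four coordinates in [0, 1].

```python
import os

def check_labels(label_dir):
    """Assert every .txt file in label_dir follows the YOLO label format."""
    for name in os.listdir(label_dir):
        with open(os.path.join(label_dir, name)) as f:
            for line in f:
                cls, *coords = line.split()
                assert cls.isdigit(), f"non-integer class id in {name}"
                assert len(coords) == 4, f"expected 4 coordinates in {name}"
                assert all(0.0 <= float(c) <= 1.0 for c in coords), \
                    f"coordinate out of [0, 1] in {name}"
    print("all label files look valid")
```

Running this before training catches the most common labeling mistake: forgetting to normalize pixel coordinates by the image size.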
Finally, we will create the YAML file with our project information and run the train.py script. You may need to run this command twice for the training to start:
%cd yolov5
/content/yolov5
%%writefile orange_tree.yaml
train: plant_data/images/train
val: plant_data/images/validation
nc: 1
names: ['Orange Tree']
Writing orange_tree.yaml
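It is worth confirming that this file parses the way YOLOv5 will read it: `train` and `val` are paths relative to the yolov5 directory, `nc` is the class count, and `names` must have exactly `nc` entries. A quick round-trip sketch (uses PyYAML, which yolov5's requirements.txt already installs):

```python
import yaml  # PyYAML

# The exact contents of orange_tree.yaml written above.
cfg_text = """\
train: plant_data/images/train
val: plant_data/images/validation
nc: 1
names: ['Orange Tree']
"""
cfg = yaml.safe_load(cfg_text)
assert cfg["nc"] == len(cfg["names"])  # class count must match the names list
print(cfg["nc"], cfg["names"])  # 1 ['Orange Tree']
```

A mismatch between `nc` and `names` is one of the more common reasons a YOLOv5 run fails at startup.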
!python train.py --img 1024 --batch 8 --epochs 200 --data orange_tree.yaml --cfg models/yolov5l.yaml --name orangetree
train: weights=yolov5s.pt, cfg=models/yolov5l.yaml, data=orange_tree.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=200, batch_size=8, imgsz=1024, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=orangetree, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: up to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v7.0-218-g9e97ac3 Python-3.10.12 torch-2.0.1+cu118 CUDA:0 (Tesla V100-SXM2-16GB, 16151MiB)
hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
Comet: run 'pip install comet_ml' to automatically track and visualize YOLOv5 🚀 runs in Comet
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
Downloading https://ultralytics.com/assets/Arial.ttf to /root/.config/Ultralytics/Arial.ttf...
100% 755k/755k [00:00<00:00, 25.5MB/s]
Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt...
100% 14.1M/14.1M [00:00<00:00, 83.9MB/s]
Overriding model.yaml nc=80 with nc=1

                 from  n    params   module                                  arguments
  0                -1  1      7040   models.common.Conv                      [3, 64, 6, 2, 2]
  1                -1  1     73984   models.common.Conv                      [64, 128, 3, 2]
  2                -1  3    156928   models.common.C3                        [128, 128, 3]
  3                -1  1    295424   models.common.Conv                      [128, 256, 3, 2]
  4                -1  6   1118208   models.common.C3                        [256, 256, 6]
  5                -1  1   1180672   models.common.Conv                      [256, 512, 3, 2]
  6                -1  9   6433792   models.common.C3                        [512, 512, 9]
  7                -1  1   4720640   models.common.Conv                      [512, 1024, 3, 2]
  8                -1  3   9971712   models.common.C3                        [1024, 1024, 3]
  9                -1  1   2624512   models.common.SPPF                      [1024, 1024, 5]
 10                -1  1    525312   models.common.Conv                      [1024, 512, 1, 1]
 11                -1  1         0   torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0   models.common.Concat                    [1]
 13                -1  3   2757632   models.common.C3                        [1024, 512, 3, False]
 14                -1  1    131584   models.common.Conv                      [512, 256, 1, 1]
 15                -1  1         0   torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0   models.common.Concat                    [1]
 17                -1  3    690688   models.common.C3                        [512, 256, 3, False]
 18                -1  1    590336   models.common.Conv                      [256, 256, 3, 2]
 19          [-1, 14]  1         0   models.common.Concat                    [1]
 20                -1  3   2495488   models.common.C3                        [512, 512, 3, False]
 21                -1  1   2360320   models.common.Conv                      [512, 512, 3, 2]
 22          [-1, 10]  1         0   models.common.Concat                    [1]
 23                -1  3   9971712   models.common.C3                        [1024, 1024, 3, False]
 24      [17, 20, 23]  1     32310   models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [256, 512, 1024]]
YOLOv5l summary: 368 layers, 46138294 parameters, 46138294 gradients, 108.2 GFLOPs

Transferred 57/613 items from yolov5s.pt
AMP: checks passed ✅
optimizer: SGD(lr=0.01) with parameter groups 101 weight(decay=0.0), 104 weight(decay=0.0005), 104 bias
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
train: Scanning /content/yolov5/plant_data/labels/train... 52 images, 0 backgrounds, 0 corrupt: 100% 52/52 [00:00<00:00, 2544.67it/s]
train: New cache created: /content/yolov5/plant_data/labels/train.cache
val: Scanning /content/yolov5/plant_data/labels/validation... 6 images, 0 backgrounds, 0 corrupt: 100% 6/6 [00:00<00:00, 240.22it/s]
val: New cache created: /content/yolov5/plant_data/labels/validation.cache
AutoAnchor: 4.18 anchors/target, 0.968 Best Possible Recall (BPR). Anchors are a poor fit to dataset ⚠️, attempting to improve...
AutoAnchor: WARNING ⚠️ Extremely small objects found: 3 of 499 labels are <3 pixels in size
AutoAnchor: Running kmeans for 9 anchors on 499 points...
AutoAnchor: Evolving anchors with Genetic Algorithm: fitness = 0.8210: 100% 1000/1000 [00:02<00:00, 336.44it/s]
AutoAnchor: thr=0.25: 0.9619 best possible recall, 6.55 anchors past thr
AutoAnchor: n=9, img_size=1024, metric_all=0.471/0.827-mean/best, past_thr=0.589-mean: 50,67, 27,186, 187,42, 83,172, 175,91, 150,153, 197,163, 177,197, 211,202
AutoAnchor: Done ⚠️ (original anchors better than new anchors, proceeding with original anchors)
Plotting labels to runs/train/orangetree/labels.jpg...
Image sizes 1024 train, 1024 val
Using 8 dataloader workers
Logging results to runs/train/orangetree
Starting training for 200 epochs...
      Epoch    GPU_mem   box_loss   obj_loss   cls_loss  Instances       Size
      0/199      11.9G     0.1128     0.1307          0         69       1024: 100% 7/7 [00:08<00:00, 1.20s/it]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100% 1/1 [00:01<00:00, 1.86s/it]
                   all          6         55    0.00111     0.0364   0.000595   0.000267

      Epoch    GPU_mem   box_loss   obj_loss   cls_loss  Instances       Size
      1/199      11.9G     0.1121      0.121          0         49       1024: 100% 7/7 [00:01<00:00, 4.06it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100% 1/1 [00:00<00:00, 1.41it/s]
                   all          6         55    0.00222     0.0727    0.00122    0.00033

[... epochs 2-57 omitted: the losses fall slowly while mAP50 stays below 0.01 for roughly the first 30 epochs, then begins to climb ...]

      Epoch    GPU_mem   box_loss   obj_loss   cls_loss  Instances       Size
     58/199        12G     0.0716     0.1027          0         37       1024: 100% 7/7 [00:01<00:00, 4.35it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100% 1/1 [00:00<00:00, 5.41it/s]
                   all          6         55      0.622      0.691      0.644      0.229

[... epochs 59-123 omitted: the metrics fluctuate but trend upward as precision and recall improve ...]

      Epoch    GPU_mem   box_loss   obj_loss   cls_loss  Instances       Size
    124/199        12G    0.05301    0.08784          0         43       1024: 100% 7/7 [00:01<00:00, 4.39it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100% 1/1 [00:00<00:00, 10.08it/s]
                   all          6         55      0.978      0.802      0.886      0.551

[...]
Size 126/199 12G 0.05197 0.08528 0 73 1024: 100% 7/7 [00:01<00:00, 4.35it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.75it/s] all 6 55 0.961 0.818 0.885 0.551 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 127/199 12G 0.0513 0.07462 0 34 1024: 100% 7/7 [00:01<00:00, 4.36it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.08it/s] all 6 55 0.958 0.818 0.873 0.557 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 128/199 12G 0.05156 0.08571 0 45 1024: 100% 7/7 [00:01<00:00, 4.39it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.29it/s] all 6 55 0.938 0.824 0.875 0.558 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 129/199 12G 0.04943 0.08221 0 65 1024: 100% 7/7 [00:01<00:00, 4.42it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.35it/s] all 6 55 0.958 0.782 0.891 0.579 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 130/199 12G 0.04848 0.07787 0 66 1024: 100% 7/7 [00:01<00:00, 4.41it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.01it/s] all 6 55 0.916 0.818 0.88 0.538 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 131/199 12G 0.05077 0.09038 0 70 1024: 100% 7/7 [00:01<00:00, 4.42it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.26it/s] all 6 55 0.972 0.818 0.887 0.576 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 132/199 12G 0.05078 0.08039 0 58 1024: 100% 7/7 [00:01<00:00, 4.39it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.46it/s] all 6 55 0.958 0.821 0.876 0.439 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 133/199 12G 0.04995 0.07814 0 38 1024: 100% 7/7 [00:01<00:00, 4.51it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.56it/s] all 6 55 0.958 0.821 0.876 0.439 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 134/199 12G 0.05251 0.07241 0 49 1024: 100% 7/7 [00:01<00:00, 4.32it/s] Class Images Instances 
P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.35it/s] all 6 55 0.958 0.823 0.872 0.417 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 135/199 12G 0.05141 0.08356 0 45 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.05it/s] all 6 55 0.959 0.818 0.889 0.501 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 136/199 12G 0.04994 0.07336 0 49 1024: 100% 7/7 [00:01<00:00, 4.40it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.37it/s] all 6 55 0.958 0.829 0.88 0.553 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 137/199 12G 0.0512 0.07677 0 38 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.15it/s] all 6 55 0.851 0.833 0.863 0.586 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 138/199 12G 0.04985 0.07277 0 38 1024: 100% 7/7 [00:01<00:00, 4.41it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.18it/s] all 6 55 0.856 0.836 0.877 0.583 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 139/199 12G 0.04947 0.08709 0 72 1024: 100% 7/7 [00:01<00:00, 4.37it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.01it/s] all 6 55 0.908 0.836 0.875 0.555 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 140/199 12G 0.04825 0.08276 0 49 1024: 100% 7/7 [00:01<00:00, 4.32it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.63it/s] all 6 55 0.903 0.873 0.886 0.571 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 141/199 12G 0.04814 0.08451 0 59 1024: 100% 7/7 [00:01<00:00, 4.44it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.09it/s] all 6 55 0.903 0.873 0.886 0.571 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 142/199 12G 0.04695 0.08721 0 39 1024: 100% 7/7 [00:01<00:00, 4.34it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.02it/s] all 6 55 0.941 0.863 0.9 0.572 Epoch GPU_mem 
box_loss obj_loss cls_loss Instances Size 143/199 12G 0.04521 0.07544 0 34 1024: 100% 7/7 [00:01<00:00, 4.45it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.15it/s] all 6 55 0.94 0.861 0.902 0.588 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 144/199 12G 0.04781 0.08883 0 68 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.01it/s] all 6 55 0.959 0.859 0.901 0.613 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 145/199 12G 0.05175 0.07183 0 38 1024: 100% 7/7 [00:01<00:00, 4.35it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.26it/s] all 6 55 0.94 0.849 0.901 0.606 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 146/199 12G 0.0473 0.07425 0 50 1024: 100% 7/7 [00:01<00:00, 4.33it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.42it/s] all 6 55 0.922 0.873 0.893 0.607 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 147/199 12G 0.04626 0.07804 0 61 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.11it/s] all 6 55 0.886 0.855 0.887 0.587 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 148/199 12G 0.04969 0.07934 0 68 1024: 100% 7/7 [00:01<00:00, 4.34it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.05it/s] all 6 55 0.872 0.855 0.888 0.596 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 149/199 12G 0.04646 0.06747 0 32 1024: 100% 7/7 [00:01<00:00, 4.45it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.03it/s] all 6 55 0.872 0.855 0.888 0.596 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 150/199 12G 0.04838 0.08761 0 73 1024: 100% 7/7 [00:01<00:00, 4.41it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.15it/s] all 6 55 0.886 0.852 0.89 0.587 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 151/199 12G 0.05406 0.06794 0 47 1024: 100% 7/7 
[00:01<00:00, 4.43it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.25it/s] all 6 55 0.948 0.836 0.891 0.6 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 152/199 12G 0.04958 0.08157 0 89 1024: 100% 7/7 [00:01<00:00, 4.41it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.04it/s] all 6 55 0.958 0.834 0.884 0.592 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 153/199 12G 0.04447 0.07747 0 68 1024: 100% 7/7 [00:01<00:00, 4.35it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.30it/s] all 6 55 0.979 0.828 0.899 0.597 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 154/199 12G 0.04759 0.09214 0 75 1024: 100% 7/7 [00:01<00:00, 4.33it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.21it/s] all 6 55 0.897 0.836 0.88 0.583 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 155/199 12G 0.04492 0.08703 0 61 1024: 100% 7/7 [00:01<00:00, 4.39it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.07it/s] all 6 55 0.974 0.836 0.889 0.599 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 156/199 12G 0.04688 0.08433 0 56 1024: 100% 7/7 [00:01<00:00, 4.37it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.75it/s] all 6 55 0.919 0.855 0.899 0.575 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 157/199 12G 0.04771 0.08473 0 69 1024: 100% 7/7 [00:01<00:00, 4.48it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.48it/s] all 6 55 0.919 0.855 0.899 0.575 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 158/199 12G 0.04436 0.08163 0 57 1024: 100% 7/7 [00:01<00:00, 4.30it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.23it/s] all 6 55 0.924 0.855 0.893 0.56 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 159/199 12G 0.04582 0.07747 0 57 1024: 100% 7/7 [00:01<00:00, 4.31it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.09it/s] 
all 6 55 0.966 0.818 0.883 0.582 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 160/199 12G 0.04602 0.07762 0 62 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.08it/s] all 6 55 0.98 0.782 0.885 0.594 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 161/199 12G 0.04573 0.08105 0 57 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.55it/s] all 6 55 0.982 0.8 0.882 0.6 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 162/199 12G 0.04475 0.07896 0 59 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.02it/s] all 6 55 0.943 0.855 0.893 0.611 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 163/199 12G 0.04718 0.07811 0 56 1024: 100% 7/7 [00:01<00:00, 4.37it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.25it/s] all 6 55 0.939 0.855 0.907 0.577 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 164/199 12G 0.04579 0.07475 0 48 1024: 100% 7/7 [00:01<00:00, 4.34it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.25it/s] all 6 55 0.929 0.855 0.902 0.575 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 165/199 12G 0.04803 0.09168 0 58 1024: 100% 7/7 [00:01<00:00, 4.48it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.21it/s] all 6 55 0.929 0.855 0.902 0.575 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 166/199 12G 0.04399 0.08842 0 64 1024: 100% 7/7 [00:01<00:00, 4.39it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.03it/s] all 6 55 0.927 0.855 0.903 0.552 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 167/199 12G 0.04617 0.08419 0 55 1024: 100% 7/7 [00:01<00:00, 4.35it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.10it/s] all 6 55 0.938 0.873 0.906 0.585 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 168/199 12G 
0.04926 0.07266 0 25 1024: 100% 7/7 [00:01<00:00, 4.32it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.12it/s] all 6 55 0.923 0.873 0.904 0.615 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 169/199 12G 0.04453 0.08439 0 46 1024: 100% 7/7 [00:01<00:00, 4.37it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.81it/s] all 6 55 0.96 0.87 0.917 0.611 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 170/199 12G 0.04618 0.07767 0 63 1024: 100% 7/7 [00:01<00:00, 4.39it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.24it/s] all 6 55 0.98 0.87 0.919 0.625 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 171/199 12G 0.047 0.08232 0 61 1024: 100% 7/7 [00:01<00:00, 4.40it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.26it/s] all 6 55 0.94 0.873 0.911 0.599 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 172/199 12G 0.04452 0.07685 0 32 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.93it/s] all 6 55 0.937 0.891 0.922 0.615 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 173/199 12G 0.0467 0.08482 0 56 1024: 100% 7/7 [00:01<00:00, 4.42it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.04it/s] all 6 55 0.937 0.891 0.922 0.615 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 174/199 12G 0.04488 0.09137 0 74 1024: 100% 7/7 [00:01<00:00, 4.42it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.09it/s] all 6 55 0.929 0.891 0.924 0.602 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 175/199 12G 0.04734 0.07431 0 40 1024: 100% 7/7 [00:01<00:00, 4.43it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.36it/s] all 6 55 0.929 0.891 0.924 0.613 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 176/199 12G 0.04372 0.08589 0 47 1024: 100% 7/7 [00:01<00:00, 4.41it/s] Class Images Instances P R mAP50 mAP50-95: 100% 
1/1 [00:00<00:00, 9.50it/s] all 6 55 0.937 0.891 0.908 0.613 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 177/199 12G 0.04526 0.0829 0 72 1024: 100% 7/7 [00:01<00:00, 4.39it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.98it/s] all 6 55 0.922 0.863 0.906 0.615 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 178/199 12G 0.04661 0.07014 0 53 1024: 100% 7/7 [00:01<00:00, 4.34it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.33it/s] all 6 55 0.922 0.855 0.898 0.619 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 179/199 12G 0.04371 0.0806 0 74 1024: 100% 7/7 [00:01<00:00, 4.36it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.80it/s] all 6 55 0.939 0.855 0.902 0.613 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 180/199 12G 0.04476 0.07644 0 75 1024: 100% 7/7 [00:01<00:00, 4.37it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.24it/s] all 6 55 0.941 0.866 0.917 0.628 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 181/199 12G 0.0446 0.08319 0 66 1024: 100% 7/7 [00:01<00:00, 4.45it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.07it/s] all 6 55 0.941 0.866 0.917 0.628 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 182/199 12G 0.04419 0.08137 0 51 1024: 100% 7/7 [00:01<00:00, 4.36it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.89it/s] all 6 55 0.941 0.866 0.914 0.621 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 183/199 12G 0.04428 0.0847 0 69 1024: 100% 7/7 [00:01<00:00, 4.37it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.28it/s] all 6 55 0.923 0.87 0.907 0.609 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 184/199 12G 0.04487 0.08205 0 52 1024: 100% 7/7 [00:01<00:00, 4.40it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.93it/s] all 6 55 0.919 0.891 0.915 0.62 Epoch GPU_mem box_loss obj_loss cls_loss 
Instances Size 185/199 12G 0.04219 0.07219 0 28 1024: 100% 7/7 [00:01<00:00, 4.40it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.37it/s] all 6 55 0.923 0.869 0.912 0.625 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 186/199 12G 0.04337 0.07978 0 37 1024: 100% 7/7 [00:01<00:00, 4.42it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.43it/s] all 6 55 0.923 0.872 0.909 0.62 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 187/199 12G 0.04522 0.07773 0 39 1024: 100% 7/7 [00:01<00:00, 4.41it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.78it/s] all 6 55 0.923 0.872 0.91 0.614 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 188/199 12G 0.04356 0.07879 0 59 1024: 100% 7/7 [00:01<00:00, 4.35it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.82it/s] all 6 55 0.923 0.872 0.912 0.606 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 189/199 12G 0.04754 0.07628 0 52 1024: 100% 7/7 [00:01<00:00, 4.48it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 9.89it/s] all 6 55 0.923 0.872 0.912 0.606 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 190/199 12G 0.0495 0.07351 0 47 1024: 100% 7/7 [00:01<00:00, 4.38it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.09it/s] all 6 55 0.921 0.873 0.913 0.611 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 191/199 12G 0.04432 0.07214 0 50 1024: 100% 7/7 [00:01<00:00, 4.40it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.09it/s] all 6 55 0.94 0.891 0.919 0.614 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 192/199 12G 0.04265 0.06821 0 33 1024: 100% 7/7 [00:01<00:00, 4.31it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 7.75it/s] all 6 55 0.939 0.891 0.908 0.61 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 193/199 12G 0.04582 0.06743 0 42 1024: 100% 7/7 [00:01<00:00, 4.36it/s] Class Images 
Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.35it/s] all 6 55 0.939 0.891 0.906 0.617 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 194/199 12G 0.04633 0.07468 0 55 1024: 100% 7/7 [00:01<00:00, 4.37it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.08it/s] all 6 55 0.92 0.873 0.915 0.623 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 195/199 12G 0.0452 0.0808 0 70 1024: 100% 7/7 [00:01<00:00, 4.40it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.13it/s] all 6 55 0.921 0.873 0.915 0.615 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 196/199 12G 0.04279 0.07906 0 44 1024: 100% 7/7 [00:01<00:00, 4.43it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.25it/s] all 6 55 0.923 0.873 0.915 0.623 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 197/199 12G 0.04192 0.07727 0 55 1024: 100% 7/7 [00:01<00:00, 4.47it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.14it/s] all 6 55 0.923 0.873 0.915 0.623 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 198/199 12G 0.04797 0.06895 0 44 1024: 100% 7/7 [00:01<00:00, 4.36it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.42it/s] all 6 55 0.923 0.872 0.915 0.617 Epoch GPU_mem box_loss obj_loss cls_loss Instances Size 199/199 12G 0.04277 0.08332 0 89 1024: 100% 7/7 [00:01<00:00, 4.39it/s] Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 10.15it/s] all 6 55 0.923 0.87 0.916 0.62 200 epochs completed in 0.181 hours. Optimizer stripped from runs/train/orangetree/weights/last.pt, 93.2MB Optimizer stripped from runs/train/orangetree/weights/best.pt, 93.2MB Validating runs/train/orangetree/weights/best.pt... Fusing layers... YOLOv5l summary: 267 layers, 46108278 parameters, 0 gradients, 107.6 GFLOPs Class Images Instances P R mAP50 mAP50-95: 100% 1/1 [00:00<00:00, 7.36it/s] all 6 55 0.941 0.866 0.916 0.628 Results saved to runs/train/orangetree
After training finishes, we can inspect the statistics and plots saved to the run directory:
print(os.listdir('/content/yolov5/runs/train/orangetree'))
['results.csv', 'results.png', 'labels.jpg', 'hyp.yaml', 'train_batch0.jpg', 'val_batch0_labels.jpg', 'labels_correlogram.jpg', 'confusion_matrix.png', 'PR_curve.png', 'weights', 'F1_curve.png', 'events.out.tfevents.1694872662.c29cdf705d45.1808.0', 'train_batch1.jpg', 'val_batch0_pred.jpg', 'opt.yaml', 'P_curve.png', 'train_batch2.jpg', 'R_curve.png']
Image(filename='/content/yolov5/runs/train/orangetree/results.png', width=900)
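The per-epoch metrics plotted in `results.png` are also stored in `results.csv`, which is convenient to load with pandas. One quirk to be aware of: YOLOv5 pads the CSV column headers with spaces, so we strip them first. The helper name `load_results` below is our own, not part of YOLOv5:

```python
import pandas as pd

def load_results(csv_path):
    """Load a YOLOv5 results.csv into a DataFrame.

    YOLOv5 writes column headers padded with spaces
    (e.g. '     metrics/mAP_0.5'), so strip them for easy access.
    """
    df = pd.read_csv(csv_path)
    df.columns = df.columns.str.strip()
    return df

# df = load_results('/content/yolov5/runs/train/orangetree/results.csv')
# df[['epoch', 'metrics/mAP_0.5', 'metrics/mAP_0.5:0.95']].tail()
```

This gives the same numbers as the console log above, but in a form that is easy to plot or filter (e.g. to find the epoch with the best mAP50-95).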
We can also inspect a training batch together with the annotations used during training:
Image(filename='/content/yolov5/runs/train/orangetree/train_batch1.jpg', width=900)
So, let's detect orange trees in our validation images:
!python detect.py --source /content/validation --weights runs/train/orangetree/weights/best.pt
detect: weights=['runs/train/orangetree/weights/best.pt'], source=/content/validation, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-218-g9e97ac3 Python-3.10.12 torch-2.0.1+cu118 CUDA:0 (Tesla V100-SXM2-16GB, 16151MiB)

Fusing layers...
YOLOv5l summary: 267 layers, 46108278 parameters, 0 gradients, 107.6 GFLOPs
image 1/6 /content/validation/img_1.jpg: 640x640 10 Orange Trees, 12.5ms
image 2/6 /content/validation/img_14.jpg: 640x640 9 Orange Trees, 12.6ms
image 3/6 /content/validation/img_37.jpg: 640x640 9 Orange Trees, 12.6ms
image 4/6 /content/validation/img_47.jpg: 640x640 8 Orange Trees, 12.6ms
image 5/6 /content/validation/img_5.jpg: 640x640 9 Orange Trees, 12.6ms
image 6/6 /content/validation/img_57.jpg: 640x640 4 Orange Trees, 12.6ms
Speed: 0.5ms pre-process, 12.6ms inference, 21.9ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp
print(sorted(os.listdir('/content/yolov5/runs/detect/exp')))
['img_1.jpg', 'img_14.jpg', 'img_37.jpg', 'img_47.jpg', 'img_5.jpg', 'img_57.jpg']
for images in glob.glob('/content/yolov5/runs/detect/exp/*.jpg')[0:5]:
    display(Image(filename=images))
We can also split the original orthomosaic into 4096x4096 patches, save them as JPEGs, and run detection (and count trees) on this much larger image:
img.shape
(5106, 15360, 3)
img1 = img[0:4096,0:4096,0:3]
img2 = img[0:4096,4096:8192,0:3]
img3 = img[0:4096,8192:12228,0:3]
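The three slices above cover only the first row of patches. A small generator (a sketch of our own, not part of YOLOv5) shows how the same tiling generalizes to an image of any size, with patches at the right and bottom edges clipped to the image bounds:

```python
def patch_coords(height, width, size=4096):
    """Yield (row0, row1, col0, col1) bounds for non-overlapping
    size x size patches; edge patches are clipped to the image."""
    for r in range(0, height, size):
        for c in range(0, width, size):
            yield r, min(r + size, height), c, min(c + size, width)

# For our 5106 x 15360 orthomosaic this yields 2 rows x 4 columns = 8 patches;
# each patch is then img[r0:r1, c0:c1, :].
```

In practice a small overlap between patches is often added so that trees cut by a patch boundary are not missed, at the cost of having to deduplicate detections in the overlap.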
%cd ..
/content
!ls
drive sample_data train train.csv validation validation.csv yolov5
!mkdir predict
imsave('/content/predict/img1.jpg', img1)
imsave('/content/predict/img2.jpg', img2)
imsave('/content/predict/img3.jpg', img3)
%cd yolov5
/content/yolov5
!python detect.py --source /content/predict --img-size 4096 --weights runs/train/orangetree/weights/best.pt --save-txt
detect: weights=['runs/train/orangetree/weights/best.pt'], source=/content/predict, data=data/coco128.yaml, imgsz=[4096, 4096], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=True, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-218-g9e97ac3 Python-3.10.12 torch-2.0.1+cu118 CUDA:0 (Tesla V100-SXM2-16GB, 16151MiB)

Fusing layers...
YOLOv5l summary: 267 layers, 46108278 parameters, 0 gradients, 107.6 GFLOPs
image 1/3 /content/predict/img1.jpg: 4096x4096 120 Orange Trees, 301.8ms
image 2/3 /content/predict/img2.jpg: 4096x4096 125 Orange Trees, 302.1ms
image 3/3 /content/predict/img3.jpg: 4096x4064 130 Orange Trees, 307.1ms
Speed: 11.7ms pre-process, 303.6ms inference, 28.6ms NMS per image at shape (1, 3, 4096, 4096)
Results saved to runs/detect/exp3
3 labels saved to runs/detect/exp3/labels
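Because we passed `--save-txt`, detect.py writes one `.txt` file per image with one line per predicted box (class, normalized center x/y, width, height). A small helper (the name `count_detections` is our own) can tally the tree count per patch and overall:

```python
import glob
import os

def count_detections(labels_dir):
    """Count YOLO-format detections per label file.

    Returns a dict mapping label filename -> number of boxes,
    plus the grand total across all files.
    """
    counts = {}
    for path in sorted(glob.glob(os.path.join(labels_dir, '*.txt'))):
        with open(path) as f:
            # one non-empty line per detected box
            counts[os.path.basename(path)] = sum(1 for line in f if line.strip())
    return counts, sum(counts.values())

# counts, total = count_detections('runs/detect/exp3/labels')
# With the run above, the total should match 120 + 125 + 130 = 375 trees.
```

Summing label lines like this is a simple way to turn the detector into a crop-counting tool, though boxes duplicated across patch boundaries would be double-counted without an overlap-aware merge.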
for images in glob.glob('/content/yolov5/runs/detect/exp3/*.jpg')[0:3]:
    display(Image(filename=images))