Is there a way to upload images with annotations (labeled images) to Custom Vision?

I have hundreds of labeled images and do not want to redo that work in the Custom Vision labeling tool. Is there a way to upload labeled images to Custom Vision? Or to Azure ML or Azure ML Studio? Do any vision services in Azure provide for uploading annotated images? Thanks

I built a proof of concept using a package that I developed, called PyLabel, to upload annotations to Azure Custom Vision. You can see it here: https://github.com/pylabel-project/samples/blob/main/pylabel2azure_custom_vision.ipynb.
PyLabel can read annotations in COCO, YOLO, or VOC format into a dataframe. Once they are in the dataframe, you can loop through the rows and use the Custom Vision APIs to upload each image and its annotations.
The annotation format used by Custom Vision is similar to the YOLO format, because they both use coordinates normalized between 0 and 1.
Here is a snippet of the code from the notebook mentioned above:
from decimal import Decimal
from os.path import exists
from pathlib import PurePath

from azure.cognitiveservices.vision.customvision.training.models import (
    ImageFileCreateBatch,
    ImageFileCreateEntry,
    Region,
)

# Iterate the rows for each image in the dataframe
for img_filename, img_df in dataset.df.groupby('img_filename'):
    img_path = str(PurePath(dataset.path_to_annotations, str(img_df.iloc[0].img_folder), img_filename))
    assert exists(img_path), f"File does not exist: {img_path}"

    # Create a region object for each bounding box in the dataset
    regions = []
    for index, row in img_df.iterrows():
        # Normalize the bounding box coordinates between 0 and 1
        # (min() clamps values at the image boundary)
        x = Decimal(row.ann_bbox_xmin / row.img_width).min(1)
        y = Decimal(row.ann_bbox_ymin / row.img_height).min(1)
        w = Decimal(row.ann_bbox_width / row.img_width).min(1 - x)
        h = Decimal(row.ann_bbox_height / row.img_height).min(1 - y)
        regions.append(Region(
            tag_id=tags[row.cat_name].id,
            left=x,
            top=y,
            width=w,
            height=h
        ))

    # Create an object with the image and all of the annotations for that image
    with open(img_path, mode="rb") as image_contents:
        image_and_annotations = [ImageFileCreateEntry(name=img_filename, contents=image_contents.read(), regions=regions)]

    # Upload the image and all annotations for that image
    upload_result = trainer.create_images_from_files(
        project.id,
        ImageFileCreateBatch(images=image_and_annotations)
    )

    # If the upload is not successful, print details about that image for debugging
    if not upload_result.is_batch_successful:
        print("Image upload failed.")
        for image in upload_result.images:
            print(img_path)
            print("Image status: ", image.status)
        print(regions)

# This will take a few minutes
print("Upload complete")

Related

Uploading large dataset from FiftyOne to CVAT

I'm trying to upload around 15 GB of data from FiftyOne to CVAT using the annotate() function in order to fix annotations. The task is divided into jobs of 50 samples. During the sample upload, I get an 'Error 504 Gateway Time-Out'. I can see the images in CVAT, but they do not have the current annotations.
I tried uploading the annotations separately using the task_id and changing the cvat.py file in FiftyOne, but I wasn't able to load the changed annotations.
I can't break this down into multiple tasks, since all the tasks would have the same name, which makes it inconvenient.
In order to be able to use load_annotations() to update the dataset, I understand that I have to upload it using the annotate() function (unless there is another way).
Update: This seems to be a limitation of CVAT on the maximum size of requests to its API. To circumvent this for the time being, we are adding a task_size parameter to the annotate() method of FiftyOne, which automatically breaks an annotation run into multiple tasks of at most task_size samples each, to avoid large data or annotation uploads; see the sketch below.
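For illustration, usage should look roughly like this once the parameter ships; the exact name and signature are assumptions based on the description above:

# Hypothetical usage of the forthcoming task_size parameter
dataset.annotate(
    "fix_annotations",           # the annotation run key
    label_field="ground_truth",
    task_size=50,                # split the run into CVAT tasks of at most 50 samples
)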
Previous Answer:
The best way to manage this workflow now would be to break down your annotations into multiple tasks but then upload them to one CVAT project to be able to group and manage them nicely.
For example:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart").clone()

# The label schema is automatically inferred from the existing labels
# Alternatively, it can be specified with the `label_schema` kwarg
# when calling `annotate()`
label_field = "ground_truth"

# Upload batches of your dataset to different tasks,
# all stored in the same project
project_name = "multiple_task_example"
anno_keys = []
for i in range(int(len(dataset) / 50)):
    anno_key = "example_%d" % i
    view = dataset.skip(i * 50).limit(50)
    view.annotate(
        anno_key,
        label_field=label_field,
        project_name=project_name,
    )
    anno_keys.append(anno_key)

# Annotate in CVAT...

# Load all annotations and clean up tasks/project when complete
anno_keys = dataset.list_annotation_runs()
for anno_key in anno_keys:
    dataset.load_annotations(anno_key, cleanup=True)
    dataset.delete_annotation_run(anno_key)
Uploading to existing tasks and the project_name argument will be available in the next release. If you want to use this immediately, you can install FiftyOne from source: https://github.com/voxel51/fiftyone#installing-from-source
We are working on further optimizations and stability improvements for large CVAT annotation jobs like yours.

How to use s2_mask() function in R to mask clouds in Sentinel 2 image?

I want to use the sen2r toolbox to process Sentinel-2 L2A data in R. I have already manually downloaded the images in .SAFE format.
I have used s2_translate() to convert the .SAFE format to GeoTIFF:
in_dir <- "D:/data/s2"
out_dir <- "D:/s2_geotifs"

## Translate .SAFE to GeoTIFF
s2_example <- file.path(
  in_dir,
  "S2B_MSIL2A_20200525T104619_N0214_R051_T31UFT_20200525T133932.SAFE"
)
s2_raster_dir <- s2_translate(
  s2_example,
  format = "GTiff",
  outdir = out_dir
)
This results in a raster brick with 11 layers, all corresponding to the optical bands of Sentinel-2 as far as I can see.
Now I want to apply the s2_mask() function (specifically to bands 4 and 8, because I want to compute NDVI), but the documentation says you need the SCL product as input. The SCL product consists of bands with the classified cloud pixels used for masking. If I load the .SAFE image into SNAP, for example, I can see the SCL products. However, I cannot find the SCL in my s2_translate() output, or in the original .SAFE for that matter.
So the issue is that I cannot find the SCL product anywhere, even though I have applied s2_translate() as required.
By default, s2_translate() only generates the BOA output. I think you also need to explicitly generate the SCL file from the .SAFE archive with another call to s2_translate(), along these lines:
s2_translate(
  s2_example,
  prod_type = "SCL",
  format = "GTiff",
  outdir = out_dir
)
See the documentation here:
http://sen2r.ranghetti.info/reference/s2_translate.html
http://sen2r.ranghetti.info/reference/safe_shortname.html
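From there, the masking step might look like the minimal sketch below. The infiles/maskfiles argument names and the mask_type value are taken from the s2_mask() documentation linked above, but treat them as assumptions to verify against your sen2r version:

## Generate both the BOA and SCL GeoTIFFs, then mask cloudy pixels in the BOA
boa_file <- s2_translate(s2_example, format = "GTiff", outdir = out_dir)
scl_file <- s2_translate(s2_example, prod_type = "SCL", format = "GTiff", outdir = out_dir)

masked <- s2_mask(
  infiles   = boa_file,
  maskfiles = scl_file,
  mask_type = "cloud_and_shadow",  # assumed preset; see ?s2_mask for the full list
  outdir    = out_dir
)

NDVI can then be computed from bands 4 and 8 of the masked output.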

How to batch edit a field in a jpg EXIF header?

I am busy with some drone mapping. However, the altitude values in the images are very inconsistent between repeated flight missions (varying by up to 120 m). The program I use to stitch my drone images into an orthomosaic thinks the drone is flying underground, as the image altitude is lower than the actual ground elevation.
To rectify this, I want to batch edit the altitude values of all my images by adding the difference between the actual ground elevation and the drone altitude directly into the EXIF of the images.
e.g.
Original image altitude = 250 m. Edited image altitude = 250 m + x
I have found the exiftoolr R package, which allows you to read and write EXIF data through the standalone ExifTool (Perl) program (see here: https://github.com/JoshOBrien/exiftoolr)
This is my code so far:
library(exiftoolr)

# Object containing images in directory
image_files <- dir("D:/....../R/EXIF_Header_Editing/Imagery", full.names = TRUE)

# Reading info -- only interested in "filename" and "AbsoluteAltitude"
exif_read(image_files, tags = c("filename", "AbsoluteAltitude"))

# Saving to a new variable
altitude <- list(exif_read(image_files, tags = c("filename", "AbsoluteAltitude")))
This is what some of the output looks like:
FileName AbsoluteAltitude
1 DJI_0331.JPG +262.67
2 DJI_0332.JPG +262.37
3 DJI_0333.JPG +262.47
4 DJI_0334.JPG +262.57
5 DJI_0335.JPG +262.47
6 DJI_0336.JPG +262.57
etc.
I now need to add x to every "AbsoluteAltitude" entry in the list, and then overwrite the existing image altitude value with this new adjusted altitude value, without editing any other important EXIF information.
Any ideas?
I have a program that allows me to batch edit EXIF altitude, but it makes all the values the same, and I need to keep the variation between the values.
Thanks in advance
Just a follow-up to @StarGeek's answer. I managed to figure out the R equivalent. Here is my solution:
# Installing the package from GitHub
if (!require(devtools)) {install.packages("devtools")}
devtools::install_github("JoshOBrien/exiftoolr", force = TRUE)

# Installing/updating the ExifTool program into the exiftoolr directory
exiftoolr::install_exiftool()

# Loading packages
library(exiftoolr)

# Set working directory
setwd("D:/..../R/EXIF_Header_Editing")

# Object containing images
image_files <- dir("D:/..../R/EXIF_Header_Editing/Imagery", full.names = TRUE)

# Editing "GPSAltitude" by adding 500 m to the altitude value
exif_call(args = "-GPSAltitude+=500", path = image_files)
And when opening the .jpg properties, the adjusted Altitude shows.
Thanks StarGeek
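To double-check the result, you can re-read the edited tag with the same exif_read() approach as above and confirm that each file kept its own (shifted) value:

# Quick sanity check: each altitude should be its original value + 500
exif_read(image_files, tags = c("FileName", "GPSAltitude"))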
If you're willing to just use exiftool directly, you could try this command:
exiftool -AbsoluteAltitude+=250 <DIRECTORY>
The += operator shifts each file's existing value rather than overwriting it, so the variation between images is preserved. I'd first test it on a few copies of your files to see if it works for your needs.

Streaming data visualization in R

It looks simple, but I am struggling with it. Basically, I am trying to visualize (plot) simulated streaming data that comes in every two seconds, i.e. for the plot to update continuously in real time.
I have looked at Microsoft Azure (but I don't have the expensive subscription). I have also looked at the animation package, but it has to read in all the data before rendering the display. Is there a way to achieve this in R (or perhaps Python)?
This is what I have done so far in R:
# Simulate some sensor data
time <- seq(1, 200, 2)
sensor <- runif(100, 1, 100)
timedf <- data.frame(cbind(time, sensor))
timedf$time <- as.POSIXct(timedf$time, origin = "2016-05-05")
plot(timedf$time, timedf$sensor, type = "l")

# I want to visualize or plot the data continuously, every 2 seconds

# Sleep function
mysleep <- function(x) {
  p1 <- proc.time()
  Sys.sleep(x)
  proc.time() - p1
}

for (i in 1:nrow(timedf)) {
  mysleep(2)
  print(timedf[i, ])
  plot(timedf[i, ]$time, timedf[i, ]$sensor, type = "l")
  par(new = FALSE)
}
Here's a demonstration using Node.js:
# Clone the plotly-nodejs GitHub repository
git clone https://github.com/plotly/plotly-nodejs.git
Edit the examples/config.json file with your API credentials, then simply run one of the provided demos (in this example, signal-stream.js):
node signal-stream.js
This will output a JSON string of the form:
{ streamstatus: 'All Streams Go!',
url: 'https://plot.ly/~voxnonecho/194',
message: '',
warning: '',
filename: 'streamSimpleSensor',
error: '' }
Then simply follow the URL provided to watch the streaming plot.
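If you would rather stay in base R, a minimal sketch of the same idea is to redraw the accumulated series on each tick (this reuses timedf from the question; fixing the axis limits keeps the plot stable as it grows):

for (i in 1:nrow(timedf)) {
  Sys.sleep(2)
  # Redraw the series up to the current sample so the line grows in real time
  plot(timedf$time[1:i], timedf$sensor[1:i], type = "l",
       xlim = range(timedf$time), ylim = range(timedf$sensor),
       xlab = "time", ylab = "sensor")
}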

Plotting images but holding visualization until instruction in R

I want to load two JPEG images in R consecutively, but they are quite large (4000x3000 pixels).
So simply doing
library(biOps)
x <- readJpeg("image.jpg")
plot(x)
takes a while. When the first image is displayed, the user has to fill in some observations about it. I want to know if it is possible to plot the image but hold back the actual visualization, so as to take advantage of the time the user spends filling in that data, and only display the image later, perhaps upon an instruction from the user such as pressing the Enter key.
Can this be done?
One option is to use the animation package. saveHTML() will create an HTML file animated by the SciAnimator library. You can show the first plot and stop the visualisation, go to the next, use a timer, ...
ll.imgs <- list.files(Imgs_folder, patt = 'jpg', full = TRUE)
saveHTML({
  for (i in seq_along(ll.imgs)) {
    x <- readJpeg(ll.imgs[i])  # read each image in the folder, not a fixed file
    plot(x)  ## maybe you should try grid.raster(x) from the grid package
  }
}, img.name = "plots", imgdir = "plots_dir",
   htmlfile = "random.html", autobrowse = FALSE,
   title = "Plotting images but holding visualization until instruction",
   outdir = getwd())
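If all you need is to block between images until the user presses Enter, a minimal base-R sketch (assuming Imgs_folder as above; readline() only pauses in interactive sessions):

library(biOps)

ll.imgs <- list.files(Imgs_folder, patt = 'jpg', full = TRUE)
for (f in ll.imgs) {
  x <- readJpeg(f)  # do the slow read first, while the user is busy with data entry
  readline(prompt = "Press Enter to display the next image...")
  plot(x)
}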
